Chatbots are in. They have been for a while. They can be controversial, as Tay’s plight in 2016 suggested, or they can confuse and spark panic in the general public, as Facebook found out last year when rumors of its chatbots ‘talking to each other’ circulated widely in the media. They can also be scary. But mostly, chatbots are just utilitarian devices that facilitate various aspects of customer interaction by automating tasks that don’t strictly need a human on the other end of the line. In certain cases, such as looking up information from databases, chatbots are likely to be much faster than their human customer-service counterparts. The literature on chatbots online is vast and ever-increasing, and one of the more interesting aspects, which I’m looking forward to following, is how human and chatbot interactions online are gradually becoming indistinguishable. I even think, given the time, that I will write myself a ‘Kierke-bot’ that will question its own existence, and that of the human querying it.

However, before setting out on such lofty goals, I had to learn how to build a basic chatbot, and this post outlines the steps I took to make that happen. I used the Rasa Stack framework to build the chatbot, querying the Spotify API and using Slack as the front-end. The end result is an extremely basic chatbot that doesn’t do much (yet), but getting all the parts to work together was a bit of a challenge, and I want to use this as a baseline to create a more advanced chatbot in the future.
I started out by reading this very well-written blog by Nathaniel Kohn, and then absorbing Justina Petraityte’s wonderful introductory video to the Rasa Stack framework. Her code, which formed the basis for mine, lives here. I also used Nahid Alam’s neatly written post on Medium as a reference. The biggest hurdle was getting all the dependencies to install, and I eventually settled on the packages listed in the requirements files in my code repository for this project. Of course, the installation experience will vary wildly depending on your operating system, Python version and so on, so using a virtual environment is highly recommended. In the end I got everything to work, though you may have to lean on Stack Overflow and the questions asked on the Rasa community forum. It might take a while, but trust me, the feeling of that first functioning chatbot (no matter if it has just the one badly trained function) is magical.
2. The Rasa Stack
Why reinvent the wheel? This description follows straight from the Rasa documentation, and is a very succinct introduction to the Rasa stack:
“The Rasa Stack is a pair of open source libraries (Rasa NLU and Rasa Core) that allow developers to expand chatbots and voice assistants beyond answering simple questions. Using state-of-the-art machine learning, your bots can hold contextual conversations with users.”
The following image, also from the Rasa website, outlines the basic process:
Rasa NLU (Natural Language Understanding) takes the input text and breaks it down into structured data, recognizing entities and intent, while the Rasa Core is in charge of dialogue and decides how to proceed with the user-bot conversation. The documentation is pretty extensive, and is highly recommended reading. A high-level overview of the architecture is also useful as a reference, especially once you’re deep in the model training, and it’s important to take a step back and recall what exactly each part of the system is tasked with doing.
3. Model Training via Rasa NLU
You need to specify a config file, and you need some training data, based on what you want your chatbot to accomplish. Before starting on the training data, I find it useful to look at the domain file (screenshot below) to assess what I need from my chatbot.
We need to define our actions, entities and intents. The actions are pretty self-explanatory: the bot says hello, asks about which artist the user is interested in, queries the Spotify API (more on this later) for the top tracks for the relevant artist, and says goodbye. In our very simple chatbot, we just need to recognize the entity artist, i.e. whichever artist the user specifies. As for intent, we have three simple interpretations of each user-inputted statement. The bot wants to know: given something that the user has typed, should it greet the user? Is the user providing information, i.e. seeking to inform? Or perhaps, the situation calls for the bot to say goodbye to the user. In addition, the bot might also just be required to listen, which is when more user-provided information is required. Slots are the bot’s memory, storing, in this case, the value of the artist entered by the user. Based on all this information, we can begin generating our training data (./data/data.json in our case). This can be somewhat tedious to do, but luckily there is a web interface that one can call (provided one has npm and node.js) using:
“rasa-nlu-trainer -v data/data.json”
and you get a very simple way of building your dataset, specifying the entities and intent for each line to feed into the NLU model.
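If you would rather script a few examples than click through the web interface, the structure of the file is simple enough to generate by hand. The following is only a minimal sketch of the Rasa NLU JSON training format, not the data from my repository: the helper name make_example, the intent labels (greet, inform, goodbye) and the example sentences are assumptions made for illustration.

```python
import json


def make_example(text, intent, artist=None):
    """Build one training example in the Rasa NLU JSON format.

    If an artist is given, its character offsets in the text are
    computed so the entity annotation lines up with the sentence.
    """
    entities = []
    if artist is not None:
        start = text.lower().find(artist.lower())
        if start == -1:
            raise ValueError("artist %r not found in %r" % (artist, text))
        end = start + len(artist)
        entities.append({
            "start": start,
            "end": end,
            "value": text[start:end],
            "entity": "artist",
        })
    return {"text": text, "intent": intent, "entities": entities}


# Hypothetical sentences and intent names, one per intent described above.
examples = [
    make_example("hello there", "greet"),
    make_example("top tracks for The Doors please", "inform", artist="The Doors"),
    make_example("bye for now", "goodbye"),
]

# The top-level keys below are what the trainer expects to find in data.json.
training_data = {"rasa_nlu_data": {"common_examples": examples}}
print(json.dumps(training_data, indent=2))
```

Computing the offsets in code rather than typing them avoids the most common annotation mistake, an entity span that is off by one character.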
After that, you run the nlu_model.py code, which builds and stores the model, to be fed into the Rasa Core for the dialogue management part of the chatbot construction. Once the model has been trained, you can check its performance on a sentence of your own creation, and that brings the job of Rasa NLU to a close. As you can see from the screenshot below, the model does a decent job of identifying the principal intent and entity from the test sentence “Could you tell me the top tracks for bob dylan please?”.
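The parse result that comes back from the trained model is a plain dictionary, so pulling out the pieces the bot cares about takes only a few lines. Below is a small sketch, assuming the usual result layout with an "intent" entry (name and confidence) and a list of "entities". The helper summarize_parse, the 0.5 threshold and the sample dictionary are mine for illustration; the sample is hand-written and its confidence value is made up, so it is not the model's actual output.

```python
def summarize_parse(result, min_confidence=0.5):
    """Return (intent_name, artist) from an NLU parse result.

    The intent is dropped (None) when its confidence falls below
    min_confidence; the artist is the first 'artist' entity, if any.
    """
    intent = result.get("intent") or {}
    name = intent.get("name")
    if name is None or intent.get("confidence", 0.0) < min_confidence:
        name = None
    artists = [
        entity["value"]
        for entity in result.get("entities", [])
        if entity.get("entity") == "artist"
    ]
    return name, (artists[0] if artists else None)


# Hand-written stand-in for a parse result; the numbers are illustrative only.
sample = {
    "text": "Could you tell me the top tracks for bob dylan please?",
    "intent": {"name": "inform", "confidence": 0.9},
    "entities": [
        {"start": 37, "end": 46, "value": "bob dylan", "entity": "artist"},
    ],
}

print(summarize_parse(sample))
```

A confidence threshold like this is one simple way to decide when the bot should ask the user to rephrase instead of acting on a shaky guess.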
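A note on what sits under the hood, as far as I understand it from the documentation. The spaCy-based NLU pipeline used here carries out NER (Named Entity Recognition) with CRFs (Conditional Random Fields), and classifies intent with a scikit-learn classifier on top of spaCy’s pre-trained word vectors. The familiar Bag-of-Words representation is used for intent recognition in the alternative TensorFlow embedding pipeline, and the LSTM architecture implemented in Keras belongs to Rasa Core’s dialogue policy rather than to Rasa NLU. There are several subtleties in choosing which NLU pipeline is most appropriate, but in general, the spaCy pipeline (i.e. the one we have used here) is more suited to situations with a limited number of training examples.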
4. Dialogue Management via Rasa Core
After training the model, we need to build the dialogue for the chatbot, and for this, we need some more training data. This data comes in the form of a Markdown file, traditionally called stories.md, an excerpt from which is shown below:
This file basically defines your standard expected conversation flow, and if you go over some conversations in your head (or go online and pretend to be a bot perhaps?), this flow will start to make sense. As a simple example, the user can ask “What are the top tracks”, prompting the chatbot to ask for the artist entity, or the user can directly enter “What are the top tracks for The Doors”, in which case the chatbot would directly query the Spotify API. The two situations, however, require separate conversation flows. If all you wanted your bot to output was static responses, you could get by with just the templates defined in the domain.yml file. Life, and chatbots, should never be that simple, however, so in order to define custom actions, we need another Python script (actions.py), which does whatever we need the bot to do: query Spotify, in this particular case. In the dialogue_management_model.py script, the training parameters for stories.md are defined (batch size, epochs etc.), and, assuming the custom actions server is running:
“python -m rasa_core_sdk.endpoint --actions actions”
running the dialogue management Python script will start the chatbot in the command line interface. There is also the interactive learning mode, started by running the train_online.py script instead of the dialogue management code (all of this while the custom actions server is running). This online learning ability is extremely cool, and you can rectify your chatbot’s foibles in real time. A little snippet of the online learning for our spotify_chatbot is shown below:
After every user input, the program checks whether the bot has understood the entity and the intent accurately. If it has not, the user can correct the mistake. After the interactive learning session has ended, the new data can be added to stories.md.
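Coming back to the custom action in actions.py for a moment, the logic inside it is easier to see when stripped of the framework around it. The sketch below is not the code from my repository. It assumes only that the tracker offers get_slot and the dispatcher offers utter_message, as they do in the Rasa Core SDK; the function name respond_with_top_tracks, the fetch_tracks parameter and the reply wording are hypothetical. In a real actions.py this body would sit inside the run(self, dispatcher, tracker, domain) method of an Action subclass.

```python
def respond_with_top_tracks(dispatcher, tracker, fetch_tracks):
    """Reply with the top tracks for the artist stored in the 'artist' slot.

    fetch_tracks is any callable that maps an artist name to a list of
    track names (for example, a thin wrapper around a Spotify query).
    Returns a list of events, which is empty in this sketch.
    """
    artist = tracker.get_slot("artist")
    if not artist:
        # No artist remembered yet, so ask for one instead of querying.
        dispatcher.utter_message("Which artist are you interested in?")
        return []

    tracks = fetch_tracks(artist)
    if not tracks:
        dispatcher.utter_message(
            "Sorry, I could not find any tracks for {}.".format(artist)
        )
        return []

    lines = ["{}. {}".format(i, name) for i, name in enumerate(tracks, start=1)]
    dispatcher.utter_message(
        "Top tracks for {}:\n{}".format(artist, "\n".join(lines))
    )
    return []
```

Passing fetch_tracks in as an argument keeps the Spotify call out of the reply logic, which makes the action easy to try out with a fake lookup before any API credentials are involved.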
5. Querying Spotify, and connecting to Slack as a frontend
I was looking for a basic custom action that I could test out, and I had always been curious about the Spotify API, so this felt like a decent opportunity to satisfy both needs. Of course, the application I have built is fairly trivial (getting the top tracks for an artist), but putting the complete pipeline together was fun. In order to query the Spotify API I used Spotipy, with the Client Credentials Flow to authenticate my requests. The process is pretty straightforward. Once you register your app (provided you have a Spotify account), you get a client_id and a client_secret, which can be used (cf. actions.py) to authorize calls to the API. Just as a first attempt, I have hard-coded the URIs (Uniform Resource Identifiers) of five artists for the Spotify queries. It shouldn’t be too much effort to dynamically query the URI for each artist specified by the user.
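As a rough idea of what that dynamic lookup could look like, here is a sketch using Spotipy’s search and artist_top_tracks calls. It is not what actions.py currently does (the URIs there are hard-coded). The function names make_client, find_artist_uri and top_track_names are hypothetical, and the sketch assumes the first search hit is the artist the user meant, which will not always hold.

```python
def make_client(client_id, client_secret):
    """Build a Spotipy client using the Client Credentials Flow."""
    # Imported here so the helpers below can be exercised with a fake
    # client on a machine where Spotipy is not installed.
    import spotipy
    from spotipy.oauth2 import SpotifyClientCredentials

    manager = SpotifyClientCredentials(
        client_id=client_id, client_secret=client_secret
    )
    return spotipy.Spotify(client_credentials_manager=manager)


def find_artist_uri(sp, artist_name):
    """Return the Spotify URI of the best search match, or None."""
    results = sp.search(q="artist:" + artist_name, type="artist", limit=1)
    items = results["artists"]["items"]
    return items[0]["uri"] if items else None


def top_track_names(sp, artist_uri, limit=5):
    """Return up to `limit` top-track names for the given artist URI."""
    results = sp.artist_top_tracks(artist_uri)
    return [track["name"] for track in results["tracks"][:limit]]
```

With real credentials, usage would be along the lines of sp = make_client(client_id, client_secret), then uri = find_artist_uri(sp, "The Doors"), then top_track_names(sp, uri) whenever uri is not None.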
For the Slack interfacing I almost entirely followed Justina Petraityte’s video tutorial, and the steps, briefly, involved registering an app with the Slack API, going through the permissions and authorizations (specifically the Slack Bot User OAuth Access Token), and connecting the app to a Slack workspace. This process also involved the use of ngrok, which I don’t know much about beyond being able to use it to connect the Slack frontend to my local backend. In order to do this part right, please watch the final 30 minutes or so of the video referenced above. I’d be happy to answer any questions you might have after that.
So we have constructed a chatbot in Python using the Rasa stack, and used it to query the Spotify API, with Slack as a frontend for our bot. I have tried to outline most of the process above, and all of the code from this project is on my GitHub. In case anything in this post isn’t clear, or perhaps something I’ve said can be corrected, please feel free to get in touch with me, or leave a comment below. I am only starting to dig into Rasa NLU and Rasa Core, and there are many things I haven’t understood completely yet. But the hope is that, with a basic running implementation, it will be easier to add levels of complexity as my understanding deepens. I don’t think my chatbot obsession will be going away anytime soon, so I anticipate more posts in the future. The image below shows a screen-grab of a full conversation between the chatbot and me on my Slack workspace. It’s a bit crude as of now, but I foresee improvements in its future! Till the next post, see you soon!