Alice on Kotlin: turning code into a Yandex.Station

In June, Yandex held an online hackathon for voice skill developers. At Just AI, we happened to be updating our open source Kotlin framework to support Alice's cool new features, and we needed a simple example for the README...

This is the story of how a couple of hundred lines of Kotlin code turned into a Yandex.Station.

Alice + Kotlin = JAICF

Just AI maintains JAICF, an open source and completely free framework for developing voice applications and text chatbots. It is written in Kotlin, the programming language from JetBrains that is well known to Android developers and to backend developers who write "bloody enterprise" systems (or rewrite them from Java). The framework is designed specifically to simplify building conversational applications for various voice, text and even telephone assistants.

Yandex has Alice, a voice assistant with a pleasant voice and an open API for third-party developers. Any developer can extend Alice's functionality for millions of users and even get paid by Yandex for it.

Naturally, we officially made JAICF friends with Alice, so now you can write skills in Kotlin. Here is what it looks like.

Script -> Webhook -> Dialogue

Any Alice skill is a voice dialogue between a user and a digital assistant. In JAICF the dialogue is described as scripts (scenarios), which run on a webhook server registered in Yandex.Dialogs.

Scenario

Let's take the skill we came up with for the hackathon. It helps you save money when shopping. First, see how it works.

Here the user asks Alice: "Tell me what is more profitable - this many rubles for this amount, or that many for that amount?"

Alice immediately launches our skill (because it is called "What is more profitable") and passes it all the necessary information: the user's intent and the data from the request.

The skill, in turn, reacts to the intent, processes the data, and returns a useful response. Alice speaks the answer and exits, because the skill ends the session (this is called a "one-pass skill").

It is a simple scenario, yet it lets you quickly work out how much more profitable one product is than another. And win a smart speaker from Yandex along the way.

What does it look like in Kotlin?

object MainScenario: Scenario() {
    init {
        state("profit") {
            activators {
                intent("CALCULATE.PROFIT")
            }

            action {
                activator.alice?.run {
                    val a1 = slots["first_amount"]
                    val a2 = slots["second_amount"]
                    val p1 = slots["first_price"]
                    val p2 = slots["second_price"]
                    val u1 = slots["first_unit"]
                    val u2 = slots["second_unit"] ?: u1 // fall back to the first product's unit

                    context.session["first"] = Product(a1?.value?.double ?: 1.0, p1!!.value.int, u1!!.value.content)
                    context.session["second"] = p2?.let {
                        Product(a2?.value?.double ?: 1.0, p2.value.int, u2!!.value.content)
                    }

                    reactions.go("calculate")
                }
            }

            state("calculate") {
                action {
                    // The reply texts below are English approximations: the original
                    // Russian phrases did not survive in this copy of the article.
                    val first = context.session["first"] as? Product
                    val second = context.session["second"] as? Product

                    if (second == null) {
                        reactions.say("And how much is the second product?")
                    } else {
                        val profit = try {
                            ProfitCalculator.calculateProfit(first!!, second)
                        } catch (e: Exception) {
                            reactions.say("Sorry, I could not calculate that. Please try again.")
                            return@action
                        }

                        if (profit == null || profit.percent == 0) {
                            reactions.say("There is no difference.")
                        } else {
                            val variant = when {
                                profit.product === first -> "The first"
                                else -> "The second"
                            }

                            var reply = "$variant product is more profitable "

                            reply += when {
                                profit.percent < 10 -> "by only ${profit.percent}%."
                                profit.percent < 100 -> "by ${profit.percent}%."
                                else -> "by as much as ${profit.percent}%."
                            }

                            context.client["last_reply"] = reply
                            reactions.say(reply)
                            reactions.alice?.endSession()
                        }
                    }
                }
            }

            state("second") {
                activators {
                    intent("SECOND.PRODUCT")
                }

                action {
                    activator.alice?.run {
                        val a2 = slots["second_amount"]
                        val p2 = slots["second_price"]
                        val u2 = slots["second_unit"]

                        val first = context.session["first"] as Product
                        context.session["second"] = Product(
                            a2?.value?.double ?: 1.0,
                            p2!!.value.int,
                            u2?.value?.content ?: first.unit
                        )

                        reactions.go("../calculate")
                    }
                }
            }
        }

        fallback {
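            // English approximation: only the numbers of the original Russian hint survived.
            reactions.say("Sorry, I did not get that. " +
                    "Try this: what is more profitable, 2 liters for 230 rubles or 3 liters for 400.")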
        }
    }
}

The full script is available on GitHub.

As you can see, this is a regular object that extends the Scenario class from the JAICF library. Essentially, the script is a state machine in which each node is a possible state of the conversation. This is how we handle context, since the context of the dialogue is a very important part of any voice application.

For example, the same phrase can be interpreted differently depending on the context of the dialogue. This, by the way, is one of the reasons we chose Kotlin for our framework: it lets you build a concise DSL in which such nested contexts and the transitions between them are convenient to manage.
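To make the idea of nested contexts more concrete, here is a minimal sketch of how such a DSL can be built with Kotlin lambdas with receivers. It is not the JAICF implementation: StateNode, scenario and resolve are hypothetical names invented for this illustration. It only shows that an intent declared inside a nested state is visible from within its parent context but not from outside it.

```kotlin
// Hypothetical mini-DSL for illustration only; this is NOT the JAICF API.
class StateNode(val name: String, val parent: StateNode? = null) {
    val children = mutableListOf<StateNode>()

    // Intent that activates this state (null means "not activated by an intent").
    var intent: String? = null

    // Full path such as "/profit/second", built from the chain of parents.
    val path: String
        get() = if (parent == null) "" else "${parent.path}/$name"

    // Declares a nested state; the lambda runs with the new child as receiver.
    fun state(name: String, init: StateNode.() -> Unit = {}): StateNode {
        val child = StateNode(name, this)
        child.init()
        children += child
        return child
    }

    // Resolves an intent relative to this context: nested states are tried
    // first, then the states of each enclosing context.
    fun resolve(wanted: String): StateNode? {
        var scope: StateNode? = this
        while (scope != null) {
            val match = scope.children.firstOrNull { it.intent == wanted }
            if (match != null) return match
            scope = scope.parent
        }
        return null
    }
}

// Entry point of the DSL: builds an unnamed root and fills it from the lambda.
fun scenario(init: StateNode.() -> Unit): StateNode = StateNode("").apply(init)

fun main() {
    val root = scenario {
        state("profit") {
            intent = "CALCULATE.PROFIT"
            state("second") { intent = "SECOND.PRODUCT" }
        }
    }
    val profit = root.children.first()
    println(profit.resolve("SECOND.PRODUCT")?.path) // /profit/second
    println(root.resolve("SECOND.PRODUCT")?.path)   // null
}
```

The lookup order (children first, then outer contexts) is a design choice of this sketch; a real framework may rank candidates differently.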

A state is activated by an activator (for example, an intent) and executes its nested code block, the action. Inside the action you can do whatever you want, but the main thing is to return a useful answer to the user or to ask them something. This is done through reactions. The JAICF documentation describes each of these entities in detail.
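The script calls ProfitCalculator.calculateProfit and builds Product objects, but neither is shown in this article. The following is a rough sketch of what such a calculation could look like, not the code from the repository. The Profit type, the percent formula (how much lower the unit price is relative to the pricier option) and the refusal to compare different units are all assumptions made for illustration.

```kotlin
import kotlin.math.roundToInt

// Field order mirrors the Product(...) calls in the script: amount, price, unit.
data class Product(val amount: Double, val price: Int, val unit: String)

// Hypothetical result type: the better deal and the saving in percent.
data class Profit(val product: Product, val percent: Int)

object ProfitCalculator {
    // Returns null when both products cost the same per unit.
    // Throws IllegalArgumentException for inputs this sketch cannot compare.
    fun calculateProfit(first: Product, second: Product): Profit? {
        require(first.unit == second.unit) { "Unit conversion is not covered by this sketch" }
        require(first.amount > 0 && second.amount > 0) { "Amounts must be positive" }

        val firstUnitPrice = first.price / first.amount
        val secondUnitPrice = second.price / second.amount
        if (firstUnitPrice == secondUnitPrice) return null

        // Return the very same instance so that a reference check (===) works.
        val cheaper = if (firstUnitPrice < secondUnitPrice) first else second
        val low = minOf(firstUnitPrice, secondUnitPrice)
        val high = maxOf(firstUnitPrice, secondUnitPrice)

        // Assumed definition: saving per unit relative to the pricier option.
        val percent = ((high - low) / high * 100).roundToInt()
        return Profit(cheaper, percent)
    }
}

fun main() {
    val first = Product(2.0, 230, "liter")
    val second = Product(3.0, 400, "liter")
    val profit = ProfitCalculator.calculateProfit(first, second)
    println(profit?.percent) // 14
}
```

With 2 liters for 230 rubles against 3 liters for 400, the unit prices are 115 and about 133.3 rubles, so the first product comes out roughly 14% cheaper per liter under this definition.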

Intents and slots

An intent is a language-independent representation of a user request. In effect, it is an identifier of what the user wants to get from your conversational application.

Alice recently learned to detect intents for your skill automatically, provided you first describe a special grammar. Moreover, she can extract the necessary data from the phrase in the form of slots - for example, the price and amount of a product, as in our example.

To make it all work, you need to describe this grammar and the slots. Our skill has its own grammar and its own set of slots used in it. As a result, the skill receives as input not just the raw text of a user request in Russian, but a language-independent intent identifier plus the parsed slots (the price of each product and its amount).
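As a rough illustration of what "an identifier plus parsed slots" means for the skill code, the sketch below models a recognized request with plain Kotlin types. Slot, RecognizedIntent and the accessor functions are hypothetical and are not the classes used by Alice or JAICF; the slot type names are assumptions too. Only the intent and slot names come from the script above.

```kotlin
// Hypothetical types for illustration; Alice and JAICF define their own.
data class Slot(val type: String, val value: Any)
data class RecognizedIntent(val name: String, val slots: Map<String, Slot>)

// Typed accessors that return null when a slot is missing or has another type.
fun RecognizedIntent.number(slot: String): Double? =
    (slots[slot]?.value as? Number)?.toDouble()

fun RecognizedIntent.text(slot: String): String? =
    slots[slot]?.value as? String

// What a recognized request like "2 liters for 230 rubles or 3 for 400" might carry.
fun sampleIntent() = RecognizedIntent(
    name = "CALCULATE.PROFIT",
    slots = mapOf(
        "first_amount" to Slot("YANDEX.NUMBER", 2),
        "first_price" to Slot("YANDEX.NUMBER", 230),
        "first_unit" to Slot("YANDEX.STRING", "liter"),
        "second_amount" to Slot("YANDEX.NUMBER", 3),
        "second_price" to Slot("YANDEX.NUMBER", 400)
        // "second_unit" was not spoken, so the slot is simply absent
    )
)

fun main() {
    val intent = sampleIntent()
    // Same defaulting idea as in the script: reuse the first unit if needed.
    val secondUnit = intent.text("second_unit") ?: intent.text("first_unit")
    println("${intent.number("second_amount")} $secondUnit for ${intent.number("second_price")}")
}
```

Because slots may be absent, every accessor returns a nullable value, which is why the script leans on the ?: and !! operators when it builds its Product objects.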

JAICF, of course, supports other NLU engines as well (for example, Caila or Dialogflow), but in our example we wanted to use this particular Alice feature to show how it works.

Webhook

Okay, we have the script. How do we check that it works?

Fans of test-driven development will appreciate JAICF's built-in mechanism for automated testing of dialog scripts. We use it all the time ourselves, since we work on large projects where checking every change by hand is hard. But our example is quite small, so we'd rather start the server right away and try talking to Alice.

To run the script, you need a webhook - a server that accepts incoming requests from Yandex when a user starts talking to your skill. Starting the server is not difficult at all: you just configure your bot and expose an endpoint for it.

val skill = BotEngine(
    model = MainScenario.model,
    activators = arrayOf(
        AliceIntentActivator,
        BaseEventActivator,
        CatchAllActivator
    )
)

This is how the bot is configured: here we describe which scripts it uses, where to store user data, and which activators the script needs (there may be several).

fun main() {
    embeddedServer(Netty, System.getenv("PORT")?.toInt() ?: 8080) {
        routing {
            httpBotRouting("/" to AliceChannel(skill, useDataStorage = true))
        }
    }.start(wait = true)
}

And this is all it takes to start a server with a webhook: you just specify which channel should serve which endpoint. We run the JetBrains Ktor server here, but you can use any other server with JAICF.

Here we used one more Alice feature: storing user data in her internal database (the useDataStorage option). JAICF automatically saves the context, along with everything our script writes to it, and restores it from there. Serialization is transparent.
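For reference, the channel hides a plain JSON exchange: Yandex posts a request to the webhook and expects a JSON reply. The sketch below builds the kind of reply a "one-pass skill" would send. It assumes the reply carries response.text, response.end_session and version fields, which is my understanding of the Alice skill protocol; check the Yandex.Dialogs documentation before relying on it. The aliceResponse and jsonString helpers are hypothetical, and JAICF produces this for you through reactions.say and endSession.

```kotlin
// Minimal JSON string escaping: enough for this sketch, not a full encoder.
fun jsonString(s: String): String = buildString {
    append('"')
    for (c in s) when (c) {
        '"' -> append("\\\"")
        '\\' -> append("\\\\")
        '\n' -> append("\\n")
        else -> append(c)
    }
    append('"')
}

// Hypothetical helper: field names follow the Alice protocol as assumed above.
fun aliceResponse(text: String, endSession: Boolean): String =
    """{"response":{"text":${jsonString(text)},"end_session":$endSession},"version":"1.0"}"""

fun main() {
    // endSession = true is what makes Alice exit right after speaking the answer.
    println(aliceResponse("The first product is more profitable by 14%.", endSession = true))
}
```

In a real project a JSON library would replace the hand-written escaping; it is spelled out here only to keep the example free of dependencies.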

Dialog

We can finally test it all! The server runs locally, so we need a temporary public URL for Alice's requests to reach our webhook from the Internet. The free ngrok tool is convenient for this: just run a command like ngrok http 8080 in the terminal.

All requests arrive on your PC in real time, so you can debug and edit the code.

Now you can take the resulting https URL and specify it when creating a new Alice dialog in Yandex.Dialogs. There you can also test the dialog with text. And if you want to talk to the skill by voice, Alice can now quickly publish private skills, which are available only to you during development. So, without going through Yandex's lengthy moderation, you can start talking to your skill right from the Alice app or from a smart speaker.

Publication

We have tested everything and are ready to publish the skill for all Alice users! To do this, our webhook must be hosted on a public server with a permanent URL. In principle, JAICF applications can run anywhere Java is supported (even on an Android smartphone).

We ran our example on Heroku. We simply created a new application and specified the address of the GitHub repository where the skill code is stored. Heroku builds and runs everything from source on its own. All that remains is to register the resulting public URL in Yandex.Dialogs and submit everything for moderation.

Summary

This little tutorial follows in the footsteps of the Yandex hackathon, where the scenario above, "What is more profitable", won one of three Yandex.Stations! By the way, you can see how it went here.

The JAICF framework in Kotlin helped me quickly implement and debug the dialog script without bothering with Alice's API, contexts and databases, and without limiting what I could do (as is often the case with similar libraries).

Useful links

The full JAICF documentation is here.

Instructions for creating Alice skills with it are here.

The source code of the skill itself can be found there.

And if you liked it

Feel free to contribute to JAICF, as colleagues from Yandex are already doing, or just leave a star on GitHub.

And if you have any questions, we answer them promptly in our cozy Slack.
