Recording conversations in Asterisk and recognizing them with Yandex.Speech

Small project. Simple implementation. A note on the Asterisk dialplan, console commands, and the Yandex recognition API. You can read it and avoid stepping on the same rakes I did; I will reread it in six months or a year and remember what I did.





Objective: get a text transcript of conversations recorded in Asterisk.





Record the conversation first

MixMonitor records the conversation. By default it mixes both parties into a single channel. We need each party in a separate file, which is what the r and t options are for: each takes a file name, r for the receive audio feed and t for the transmit audio feed.





The b option is also used: it records only while the channel is bridged, so recording starts at the moment the conversation starts.





Asterisk 16 introduced the S option to synchronize the t and r files (silence is added to the beginning of whichever file started recording later). In Asterisk 18 the S option was removed because this became the default behavior, and the counter-option n was added to turn it off. But I use b, so I did not need these extra dances.





MixMonitor(record-o.wav,br(record-r.wav)t(record-t.wav),command)
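
The option string is easy to mistype, so here is a minimal Python sketch that assembles the application call from a base file name. The helper name `mixmonitor_app` and its parameters are hypothetical; it only formats text and does not talk to Asterisk.

```python
def mixmonitor_app(base: str, post_command: str = "") -> str:
    """Format a MixMonitor() call that records both sides to separate files.

    Hypothetical helper for illustration: it only builds the dialplan text.
    """
    # b: record only while the channel is bridged.
    # r(file) / t(file): separate files for the receive and transmit audio.
    options = f"br({base}-r.wav)t({base}-t.wav)"
    args = [f"{base}-o.wav", options]
    if post_command:
        # Application arguments are comma-separated, so the command passed
        # here is assumed to contain no unescaped commas.
        args.append(post_command)
    return f"MixMonitor({','.join(args)})"


if __name__ == "__main__":
    print(mixmonitor_app("record", "command"))
```

With the base name `record` this reproduces the line above; in a real dialplan the base name would normally come from a channel variable.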





The last argument of MixMonitor is a command to run after recording finishes. In that command we normalize each recording (even out its level) and then merge the two recordings into one two-channel file.





sox --norm record-t.wav record-t-norm.wav   # normalize the recording of one side of the conversation









sox --norm record-r.wav record-r-norm.wav   # normalize the recording of the other side of the conversation









sox record-r-norm.wav record-t-norm.wav --channels 2 --combine merge record.wav   # merge both sides into one two-channel file
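
The three sox calls can be wrapped in one small script and given to MixMonitor as the post-recording command. Below is a rough Python sketch under that assumption. The function names `sox_commands` and `postprocess` are made up for illustration, and sox is assumed to be on the PATH.

```python
import subprocess


def sox_commands(base: str) -> list:
    """Return the three sox invocations for recordings <base>-t.wav and <base>-r.wav."""
    # Normalize each side separately so both parties end up at a similar level.
    cmds = [
        ["sox", "--norm", f"{base}-{side}.wav", f"{base}-{side}-norm.wav"]
        for side in ("t", "r")
    ]
    # Merge the two mono files into one two-channel file:
    # the r side becomes channel 1, the t side becomes channel 2.
    cmds.append([
        "sox", f"{base}-r-norm.wav", f"{base}-t-norm.wav",
        "--channels", "2", "--combine", "merge", f"{base}.wav",
    ])
    return cmds


def postprocess(base: str) -> None:
    """Run the commands in order and stop at the first failure."""
    for cmd in sox_commands(base):
        subprocess.run(cmd, check=True)


if __name__ == "__main__":
    for cmd in sox_commands("record"):
        print(" ".join(cmd))
```

Keeping the command list separate from the code that runs it makes it easy to print and check the exact commands before pointing the script at real recordings.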





- , - . , , .





record-o.wav - MixMonitor', , .





wav . mp3 .













https://howto.a17.su/asterisk/call-recording.html





https://voxlink.ru/kb/asterisk-configuration/integraciya-asterisk-so-speech-analytics/





.





API : , . - 30 , API .





- wav ogg . wav , API wav-, ogg. , ogg





/usr/bin/ffmpeg -i record.wav -acodec libopus record.ogg   # convert to ogg
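
If the conversion is run from a script rather than by hand, the same ffmpeg call can be built from the wav path. This is only a sketch: `ogg_command` and `convert_to_ogg` are hypothetical names, and the ffmpeg path is the one used above.

```python
import subprocess
from pathlib import Path


def ogg_command(wav_path: str, ffmpeg: str = "/usr/bin/ffmpeg") -> list:
    """Build the ffmpeg call that encodes a wav file as Opus in an ogg container."""
    # Same directory and base name, only the extension changes.
    ogg_path = str(Path(wav_path).with_suffix(".ogg"))
    return [ffmpeg, "-i", wav_path, "-acodec", "libopus", ogg_path]


def convert_to_ogg(wav_path: str) -> str:
    """Run the conversion and return the path of the resulting ogg file."""
    cmd = ogg_command(wav_path)
    subprocess.run(cmd, check=True)
    return cmd[-1]


if __name__ == "__main__":
    print(" ".join(ogg_command("record.wav")))
```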





-, ( ) , .





S3-, S3- . buckets.





.Storage





, , id. id ( , , , ).





.





.





, 2020, - . - 2 .





.. . , , . . , . , . . .





: - . - . . .





, .. ( ).





( SpeechKit)





Access keys. The main thing here is not to get confused: you will have keys for both the recognition service (an API key) and the S3 storage (a static key). Both kinds of key belong to the service account.
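
To keep the two keys apart, here is a rough Python sketch of where the API key goes when a recognition request is prepared. The endpoint URL, the JSON field names and the `Api-Key` header scheme are written as I understand the SpeechKit v2 long-audio API and should be checked against the current documentation. The function name and the example bucket URL are made up. The static key does not appear here at all: it belongs only in the S3 client that uploads the file.

```python
import json
import urllib.request

# Assumed endpoint of the SpeechKit v2 long-audio API; verify against the docs.
RECOGNIZE_URL = (
    "https://transcribe.api.cloud.yandex.net/speech/stt/v2/longRunningRecognize"
)


def build_recognition_request(api_key: str, audio_uri: str,
                              language: str = "ru-RU") -> urllib.request.Request:
    """Prepare (but do not send) a request to recognize a file already in the bucket."""
    body = {
        "config": {
            "specification": {
                "languageCode": language,
                "audioEncoding": "OGG_OPUS",  # matches the ogg file made by ffmpeg
            }
        },
        "audio": {"uri": audio_uri},
    }
    return urllib.request.Request(
        RECOGNIZE_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            # The API key of the service account, not the S3 static key.
            "Authorization": f"Api-Key {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    # Hypothetical key and bucket URL, for illustration only.
    req = build_recognition_request(
        "my-api-key", "https://storage.yandexcloud.net/my-bucket/record.ogg"
    )
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
```

Sending the request with `urllib.request.urlopen(req)` should return a JSON document with the operation id, which is then used to fetch the result once recognition has finished.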









Hopefully this post saves you a few minutes and helps you implement your own project quickly.







