Recognizing emotions in telephone recordings

Speech emotion recognition has many applications. In particular, it makes it possible to automate quality monitoring of customer service in call centers.





Recognizing a person's emotions from their speech is already a relatively saturated market. I reviewed several solutions from Russian and international companies. Let's look at their advantages and disadvantages.





1) Empath





2017 Empath. Web Empath, , Smartmedical. , .





, , . . , , , , , . , , .





. [1]. , , .





2)





Smart Logger II QM Analyzer, , , . QM Analyzer : , , , [2]. , , , , [3].





, . .





3) Neurodata Lab





Neurodata Lab , , , , , . Neurodata Lab RAMAS — , 12 : , , , — . , [4].





RAMAS Neurodata Lab -, . , , . : , .





, . , , [1].





, . .









Empath









Neurodata Lab













-





+





+





+





-





-









+





+





-





+





-





+





- IT- .





.





Figure: Call processing flowchart

:





  1. RNNoise_Wrapper





  2. pyAudioAnalysis





  3. vosk-api





  4. dostoevsky
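The order of the four tools above suggests a staged pipeline. Below is a minimal sketch of how such stages could be chained, not the author's implementation. The stage roles are assumptions based on what each library is generally used for: noise suppression (RNNoise_Wrapper), speaker segmentation (pyAudioAnalysis), speech-to-text (vosk-api), and text sentiment (dostoevsky). The names `process_call` and `Utterance` and all four callables are hypothetical placeholders, so no real library API is called here.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Tuple


@dataclass
class Utterance:
    speaker: str    # e.g. "operator" or "client"
    text: str       # recognized speech for this segment
    sentiment: str  # label returned by the text classifier


def process_call(
    raw_audio: bytes,
    denoise: Callable[[bytes], bytes],
    split_speakers: Callable[[bytes], Iterable[Tuple[str, bytes]]],
    transcribe: Callable[[bytes], str],
    classify_text: Callable[[str], str],
) -> List[Utterance]:
    """Run one recording through the four stages in order.

    Each stage is injected as a plain callable (hypothetical wrappers
    around the real libraries), so stages can be swapped or stubbed.
    """
    clean = denoise(raw_audio)                    # stage 1: noise suppression
    utterances = []
    for speaker, chunk in split_speakers(clean):  # stage 2: who spoke when
        text = transcribe(chunk)                  # stage 3: speech to text
        if not text:
            continue                              # skip silent segments
        utterances.append(Utterance(speaker, text, classify_text(text)))  # stage 4
    return utterances
```

Passing the stages in as callables keeps the sketch independent of any particular library version and makes each stage easy to test in isolation.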





, .





. , , . , . , . Librosa.





:





  • - (MFCC)









  • -









  • (Tonnetz)





3 - . Emo-DB, .





scikit-learn:





  • SVC





  • RandomForestClassifier





  • GradientBoostingClassifier





  • KNeighborsClassifier





  • MLPClassifier





  • BaggingClassifier
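The comparison of the six scikit-learn classifiers listed above can be sketched as follows. This is an illustration under stated assumptions: the data is synthetic (`make_classification` stands in for real audio features), and the default hyperparameters, the 80/20 split, and the feature scaling step are choices made for this example rather than the author's settings.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    BaggingClassifier,
    GradientBoostingClassifier,
    RandomForestClassifier,
)
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic 3-class data in place of extracted audio features (assumption)
X, y = make_classification(
    n_samples=300, n_features=20, n_informative=8, n_classes=3, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

classifiers = {
    "SVC": SVC(random_state=0),
    "RandomForestClassifier": RandomForestClassifier(random_state=0),
    "GradientBoostingClassifier": GradientBoostingClassifier(random_state=0),
    "KNeighborsClassifier": KNeighborsClassifier(),
    "MLPClassifier": MLPClassifier(max_iter=500, random_state=0),
    "BaggingClassifier": BaggingClassifier(random_state=0),
}

results = {}
for name, clf in classifiers.items():
    # Scale features first; SVC, k-NN and MLP are sensitive to feature ranges
    model = make_pipeline(StandardScaler(), clf)
    model.fit(X_train, y_train)
    results[name] = model.score(X_test, y_test)  # accuracy on held-out data

best = max(results, key=results.get)
print(best, round(results[best], 3))
```

Scoring on a held-out split rather than on the training data is what makes the accuracies comparable across models.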





- Emo-DB 79%. , 23%. , .





- 55%.

























Emo-DB           | 4 | 408 | MLPClassifier              | 79.268%/22.983%
MCartEmo-admntlf | 7 | 324 | KNeighborsClassifier       | 49.231%
MCartEmo-asnef   | 5 | 373 | GradientBoostingClassifier | 49.333%
MCartEmo-pnn     | 3 | 421 | BaggingClassifier          | 55.294%





. .





- MCartEmo-pnn. .





62.352%.





-, 566. . 66.666%. , .





Figure: Training history and confusion matrix obtained with the CNN

, , . , , .





Gateway API JSON Web Token -, , .
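To make the JSON Web Token step more concrete, here is a minimal standard-library sketch of issuing and verifying an HS256-signed token. It illustrates the general JWT mechanism only. The function names, the claim names in the usage lines, and the choice of HS256 are assumptions; the source does not say which algorithm or library the gateway uses.

```python
import base64
import hashlib
import hmac
import json
import time


def _b64url(data: bytes) -> str:
    # JWT uses URL-safe base64 without padding
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def issue_token(claims: dict, secret: bytes) -> str:
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, claims)
    )
    signature = hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256).digest()
    return f"{signing_input}.{_b64url(signature)}"


def verify_token(token: str, secret: bytes, now: float = None) -> dict:
    """Return the claims if the token is valid, otherwise raise ValueError."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("malformed token")
    header_b64, payload_b64, signature_b64 = parts
    if json.loads(_b64url_decode(header_b64)).get("alg") != "HS256":
        raise ValueError("unsupported algorithm")
    expected = hmac.new(
        secret, f"{header_b64}.{payload_b64}".encode("ascii"), hashlib.sha256
    ).digest()
    # Constant-time comparison avoids leaking signature bytes via timing
    if not hmac.compare_digest(expected, _b64url_decode(signature_b64)):
        raise ValueError("bad signature")
    claims = json.loads(_b64url_decode(payload_b64))
    current = time.time() if now is None else now
    if "exp" in claims and current >= claims["exp"]:
        raise ValueError("token expired")
    return claims


# Hypothetical usage: a service identity valid for one hour
token = issue_token({"sub": "emotion-service", "exp": int(time.time()) + 3600}, b"demo-secret")
claims = verify_token(token, b"demo-secret")
```

In production a maintained JWT library is usually preferable to hand-rolled code; the sketch only shows what the gateway checks on each request.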





24. . 24 . REST API 24, OAuth 2.0, . ( ), ( ) OnVoximplantCallEnd, CRM-. .
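As a rough illustration of reacting to the OnVoximplantCallEnd event, the sketch below parses a form-encoded event notification into a nested dictionary and checks it before further processing. Everything about the payload is an assumption made for this example: the bracketed key layout, the field names `CALL_ID` and `CALL_DURATION`, the `application_token` check, and the helper names `parse_event` and `is_call_end` should all be verified against the CRM's own REST documentation.

```python
import hmac
import re
from urllib.parse import parse_qsl


def parse_event(body: str) -> dict:
    """Turn 'data[CALL_ID]=x&auth[token]=y' style pairs into nested dicts."""
    result = {}
    for key, value in parse_qsl(body, keep_blank_values=True):
        parts = re.findall(r"[^\[\]]+", key)  # "data[CALL_ID]" -> ["data", "CALL_ID"]
        if not parts:
            continue
        node = result
        for part in parts[:-1]:
            node = node.setdefault(part, {})
        node[parts[-1]] = value
    return result


def is_call_end(event: dict, expected_token: str) -> bool:
    """Accept only call-end events that carry the expected application token."""
    token = event.get("auth", {}).get("application_token", "")
    same_token = hmac.compare_digest(token.encode(), expected_token.encode())
    return same_token and event.get("event", "").upper() == "ONVOXIMPLANTCALLEND"


# Hypothetical request body; field names are illustrative assumptions
sample_body = (
    "event=ONVOXIMPLANTCALLEND"
    "&data%5BCALL_ID%5D=call-123"
    "&data%5BCALL_DURATION%5D=42"
    "&auth%5Bapplication_token%5D=secret"
)
event = parse_event(sample_body)
if is_call_end(event, "secret"):
    call_id = event["data"]["CALL_ID"]  # would be used to fetch the recording
```

Checking the token before acting on the event keeps arbitrary callers from triggering analysis of recordings.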









, - , CNN. 66.66%.

-, , , .

, 24.



" " . X " " .



, , .









  1. , . : / . , . , . // " 2011.". – 2011. – . 178–185.





  2. Smart Logger II. . [ ]. — : http://www.myshared.ru/slide/312083/.





  3. Smart logger-2 is awake. Emotions of call-center operators and clients under control [Electronic resource]. - Access mode: https://piter.tv/event/_Smart_logger_2_ne_drem/.





  4. Perepelkina, O. RAMAS: Russian Multimodal Corpus of Dyadic Interaction for Studying Emotion Recognition / O. Perepelkina, E. Kazimirova, M. Konstantinova // PeerJ Preprints 6: e26688v1. - 2018.







