Speech emotion recognition can be applied to a wide range of tasks. In particular, it makes it possible to automate quality monitoring of customer service in call centers.
Recognizing a person's emotions from their speech is already a relatively saturated market. I reviewed several solutions from companies on the Russian and international markets. Let's look at their advantages and disadvantages.
1) Empath
2017 Empath. Web Empath, , Smartmedical. , .
, , . . , , , , , . , , .
. [1]. , , .
Smart Logger II QM Analyzer, , , . QM Analyzer : , , , [2]. , , , , [3].
, . .
Neurodata Lab , , , , , . Neurodata Lab RAMAS — , 12 : , , , — . , [4].
RAMAS Neurodata Lab -, . , , . : , .
, . , , [1].
, . .
|
Empath |
|
Neurodata Lab |
|
|
- |
+ |
+ |
+ |
- |
- |
|
+ |
+ |
- |
+ |
- |
+ |
.
:
, .
. , , . , . , . Librosa.
:
Mel-frequency cepstral coefficients (MFCC)
-
Tonal centroid features (Tonnetz)
3 - . Emo-DB, .
SVC
RandomForestClassifier
GradientBoostingClassifier
KNeighborsClassifier
MLPClassifier
BaggingClassifier
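A comparison of the classifiers listed above could be sketched as follows. This is only an illustration under assumptions: the helper `compare_classifiers`, the default hyperparameters, the feature scaling step, and the cross-validation setup are not taken from the original experiments, and the demo uses synthetic data instead of Emo-DB.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    BaggingClassifier,
    GradientBoostingClassifier,
    RandomForestClassifier,
)
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def compare_classifiers(X, y, cv=5, seed=0):
    """Return the mean cross-validated accuracy for each classifier."""
    models = {
        "SVC": SVC(),
        "RandomForestClassifier": RandomForestClassifier(random_state=seed),
        "GradientBoostingClassifier": GradientBoostingClassifier(random_state=seed),
        "KNeighborsClassifier": KNeighborsClassifier(),
        "MLPClassifier": MLPClassifier(max_iter=1000, random_state=seed),
        "BaggingClassifier": BaggingClassifier(random_state=seed),
    }
    scores = {}
    for name, model in models.items():
        # Scaling matters for SVC, k-NN and the MLP; it is harmless for trees.
        pipeline = make_pipeline(StandardScaler(), model)
        scores[name] = cross_val_score(pipeline, X, y, cv=cv).mean()
    return scores


if __name__ == "__main__":
    # Synthetic stand-in for a matrix of audio features and emotion labels.
    X, y = make_classification(
        n_samples=300, n_features=20, n_informative=8, n_classes=3, random_state=0
    )
    for name, score in compare_classifiers(X, y).items():
        print(f"{name}: {score:.3f}")
```

Picking the model with the highest mean score per dataset would give one row per dataset, similar to the results table.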
- Emo-DB 79%. , 23%. , .
- 55%.
| Dataset | Classes | | Classifier | Accuracy |
|---|---|---|---|---|
| Emo-DB | 4 | 408 | MLPClassifier | 79.268%/22.983% |
| MCartEmo-admntlf | 7 | 324 | KNeighborsClassifier | 49.231% |
| MCartEmo-asnef | 5 | 373 | GradientBoostingClassifier | 49.333% |
| MCartEmo-pnn | 3 | 421 | BaggingClassifier | 55.294% |
. .
- MCartEmo-pnn. .
62.352%.
-, 566. . 66.666%. , .
, , . , , .
Gateway API JSON Web Token -, , .
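The mention of JSON Web Tokens for the Gateway API can be illustrated with a small sketch of how such a token is signed and checked. It assumes an HS256 token with a shared secret; `sign_token`, `verify_token`, and the claim names are hypothetical, and a production service would more likely rely on a maintained JWT library than on hand-written code like this.

```python
import base64
import hashlib
import hmac
import json
import time


def _b64url(data: bytes) -> str:
    # JWT uses URL-safe base64 without padding.
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def sign_token(payload: dict, secret: bytes) -> str:
    """Build a compact HS256 token: header.payload.signature."""
    header = {"alg": "HS256", "typ": "JWT"}
    parts = [
        _b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, payload)
    ]
    signing_input = ".".join(parts)
    signature = hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256).digest()
    return f"{signing_input}.{_b64url(signature)}"


def verify_token(token: str, secret: bytes, now=None) -> dict:
    """Return the payload if the signature and expiry are valid, else raise ValueError."""
    pieces = token.split(".")
    if len(pieces) != 3:
        raise ValueError("malformed token")
    header_b64, payload_b64, signature_b64 = pieces
    if json.loads(_b64url_decode(header_b64)).get("alg") != "HS256":
        raise ValueError("unsupported algorithm")
    expected = hmac.new(
        secret, f"{header_b64}.{payload_b64}".encode("ascii"), hashlib.sha256
    ).digest()
    # Constant-time comparison avoids leaking signature bytes through timing.
    if not hmac.compare_digest(expected, _b64url_decode(signature_b64)):
        raise ValueError("bad signature")
    payload = json.loads(_b64url_decode(payload_b64))
    current = time.time() if now is None else now
    if "exp" in payload and current >= payload["exp"]:
        raise ValueError("token expired")
    return payload


if __name__ == "__main__":
    secret = b"demo-secret"  # hypothetical shared secret
    token = sign_token({"sub": "crm-client", "exp": int(time.time()) + 60}, secret)
    print(verify_token(token, secret)["sub"])
```

A gateway would run a check like `verify_token` on every incoming request before forwarding it to the recognition service.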
24. . 24 . REST API 24, OAuth 2.0, . ( ), ( ) OnVoximplantCallEnd, CRM-. .
, - , CNN. 66.66%.
-, , , .
, 24.
" " . X " " .
, , .
, . : / . , . , . // " 2011.". – 2011. – . 178–185.
Smart Logger II. [Electronic resource]. - Access mode: http://www.myshared.ru/slide/312083/.
Smart logger-2 is awake. Emotions of call-center operators and clients under control [Electronic resource]. - Access mode: https://piter.tv/event/_Smart_logger_2_ne_drem/.
Perepelkina, O. RAMAS: Russian Multimodal Corpus of Dyadic Interaction for Studying Emotion Recognition / O. Perepelkina, E. Kazimirova, M. Konstantinova // PeerJ Preprints 6: e26688v1. - 2018.