Sentiment analysis of text is one of the tasks that data science specialists work with. It lets you examine a body of messages and other data and determine their emotional coloring: positive, negative, or neutral.
Let's see how it works by analyzing a number of articles using the Linis Crowd dataset. We want to determine which models are most promising, for example, for building monitoring services. As the subject area, we choose technical articles (for example, on Habr), which can be useful for implementing automatic collection of opinions.
Sentiment (tonality) analysis identifies emotionally colored vocabulary in texts and assesses the emotional attitude expressed by their authors.
, , “” . , ( ). – , , .
, -. , , .
: . – 70-80%, , .
:
. () . , .
, . . , , .
, . – . , IT- , . :
-1 –
0 – ;
1 – ;
2 :
. , , Linis Crowd.
, . python, .
– , .
1)
, , , , . , . .
, :
alpha = 0.3; fit_prior = True; class_prior = None
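The settings listed above (a smoothing value of 0.3, with class priors learned from the data rather than supplied) match the parameter names of the naive Bayes classifiers in scikit-learn. The sketch below is not the article's code: it is a minimal pure-Python multinomial naive Bayes, written only to show what the smoothing parameter does. The function names `train_nb` and `predict_nb`, the toy documents, and the reading of the labels -1, 0, 1 as negative, neutral, positive are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


def train_nb(docs, labels, alpha=0.3):
    """Fit a multinomial naive Bayes model with additive smoothing `alpha`."""
    vocab = {tok for doc in docs for tok in doc}
    class_docs = Counter(labels)          # number of documents per class
    token_counts = defaultdict(Counter)   # class -> token -> count
    for doc, label in zip(docs, labels):
        token_counts[label].update(doc)

    n_docs = len(docs)
    # Priors come from class frequencies (the "learn priors from data" case).
    log_prior = {c: math.log(n / n_docs) for c, n in class_docs.items()}
    log_lik = {}
    for c in class_docs:
        # alpha is added to every token count, so unseen tokens keep
        # a small non-zero probability instead of zeroing the product.
        total = sum(token_counts[c].values()) + alpha * len(vocab)
        log_lik[c] = {
            t: math.log((token_counts[c][t] + alpha) / total) for t in vocab
        }
    return log_prior, log_lik


def predict_nb(model, doc):
    """Return the class with the highest log-posterior for a token list."""
    log_prior, log_lik = model
    scores = {
        c: log_prior[c] + sum(log_lik[c][t] for t in doc if t in log_lik[c])
        for c in log_prior
    }
    return max(scores, key=scores.get)


# Toy data invented for this sketch; labels assumed to mean
# -1 = negative, 0 = neutral, 1 = positive.
docs = [
    "great fast reliable library".split(),
    "love this clean api".split(),
    "slow buggy awful release".split(),
    "hate this broken build".split(),
    "release notes for version two".split(),
    "the build runs on linux".split(),
]
labels = [1, 1, -1, -1, 0, 0]
model = train_nb(docs, labels, alpha=0.3)
```

Calling `predict_nb(model, "fast clean library".split())` scores each class and returns the most likely label. In practice a library implementation with a proper vectorizer would replace this hand-rolled version.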
2)
, .
:
samplesize, . , .
max_features ( — ).
( ). , , ( ). , , , .
, :
n_estimators = 200; max_depth = 3; random_state = 0
3)
Embedding, 64- , LSTM (128 ) Dense (10 ).
Linis Crowd, 29 – . .
.
, . fasttext , 5 - 10 CPU. GPU 0.7 – 1.
:
– 60%.
– 20%
– 20% .
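A 60% / 20% / 20% partition like the one above can be sketched as follows. The roles of the three parts (training, validation, test, in that order) are an assumption, since the surviving text gives only the percentages; the helper name `split_dataset` and the fixed seed are likewise hypothetical.

```python
import random


def split_dataset(items, shares=(0.6, 0.2, 0.2), seed=0):
    """Shuffle `items` and cut them into three consecutive parts by `shares`."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split repeatable
    first_end = round(len(shuffled) * shares[0])
    second_end = first_end + round(len(shuffled) * shares[1])
    # The last part takes whatever remains, so no item is dropped.
    return (
        shuffled[:first_end],
        shuffled[first_end:second_end],
        shuffled[second_end:],
    )


# Assumed roles for the three parts: train, validation, test.
train_set, val_set, test_set = split_dataset(range(100))
```

Shuffling before cutting avoids any ordering in the source data leaking into one of the parts.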
, 70-80%.
:
(88.32%)
(78.91%)
. RNN (83.26%)
, – 88.32% – . , 80%, .
! , .