Text Moderation: Etiquette Lessons from a Data Scientist

Hello, Habr!



With this article, we are starting a series of publications about Data Science and the tasks we solve at the Center for the Development of Financial Technologies of the Russian Agricultural Bank.



Last year, Rosselkhozbank announced the creation and development of an ecosystem for enterprises in the agro-industrial complex. For Svoe Fermerstvo (Our Farming), one of the ecosystem's core platforms, we tackled a couple of useful tasks, which we will discuss below.







The Svoe Fermerstvo site is a marketplace of goods for agricultural producers from the largest suppliers across Russia. It covers the product categories of highest priority for farmers: plant protection products, fertilizers, seeds, agricultural machinery, and so on. Thousands of suppliers upload information about their products in order to sell them, so a process is needed to check the quality of the uploaded content. For this reason, we decided to create our own tool for pre-moderating text and graphic information.



What were we doing?



In this article, we will tell you how, in cooperation with the MIPT laboratory created specifically for the Bank's tasks, we developed a tool that pre-moderates text content with high accuracy.



Our goal sounded quite simple: the tool should automatically classify a text as acceptable for placement on the site (class 0) or unacceptable (class 1). If the algorithm cannot confidently determine which class the text belongs to, we send the text for manual moderation.



We have a text processing task: we want to filter out texts that are “toxic” in every sense, namely curses, insults, content prohibited by law, and any other text unacceptable for placement on the site.



We expect the algorithm we develop to take a set of texts as input and output a number from 0 to 1 for each: the degree, or probability, of the text being “toxic”. The closer this number is to one, the more toxic the text.
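As an illustration of how such a score could be routed, here is a minimal sketch; the 0.3/0.7 thresholds are hypothetical placeholders, not the values we actually use:

```python
def route(text: str, toxicity: float) -> str:
    """Route a text by its toxicity score.

    The thresholds here are hypothetical; in practice they
    would be tuned on a validation set.
    """
    if toxicity < 0.3:
        return "publish"        # class 0: acceptable
    if toxicity > 0.7:
        return "reject"         # class 1: unacceptable
    return "manual_moderation"  # the model "doubts" its decision
```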



It should be noted that the problem of detecting toxic texts is not new and is quite popular in the English-speaking community. A few years ago, a similar problem was solved in the Toxic Comment Classification Challenge on Kaggle. For Russian, the solution can be obtained in a similar way, but the model quality may be lower, since Russian is structurally more complex than English.



There is only one labelled Russian-language dataset in the public domain for detecting toxicity in text. We also managed to find a dataset for detecting insults (a special case of toxicity). In addition, we collected examples of listings from agricultural portals and labelled them as acceptable (class 0).



The task we set turned out to be quite unique because of the agricultural theme. Its specificity lies in the fact that phrases which count as insults in everyday life are not always insults when it comes to agriculture. Common examples: “Don't stick your snout in” is clearly unacceptable, while a text mentioning a “pig's snout” can easily be placed on the site (although it depends on the context). The same applies to certain subspecies of farm animals and plants.



When it comes to text classification problems like this, even the simplest (linear) models already give good results. But, as always, to achieve higher quality we will use neural networks. The most popular architecture (at the time of writing) for such problems is BERT. At the time of the Kaggle competition mentioned above, this architecture did not yet exist, so other approaches were used; later, however, the task was successfully solved with BERT.



How did we do it?



Let's move on to the most interesting part: solving the problem. After thinking a bit about the “architecture” of the tool, we decided to use three models: a dictionary search (as a filter for obscene vocabulary), logistic regression (as a baseline solution), and BERT (as a more advanced solution).



General scheme







The general scheme of the solution looks like this: inside the “black box”, the text first goes to the naive classifier, which is based on a dictionary of obscene words (expletives); texts containing “bad” words are cut off immediately, and their toxicity is always one (1). Texts that pass the first stage go to the more complex neural network model, which outputs their degree of toxicity. If the neural network model fails, it is replaced by a simpler one, logistic regression. That way, we get a non-naive result in any case.
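As a sketch, the cascade might look like this in code (the helper function names are ours, for illustration; each of them is sketched in the sections below):

```python
def moderate(text: str) -> float:
    """Return the degree of toxicity of a text, from 0 to 1."""
    # Stage 1: the naive classifier based on a dictionary of
    # obscene words. Such texts are always maximally toxic.
    if contains_bad_word(text):
        return 1.0
    # Stage 2: the neural network model (BERT).
    try:
        return bert_predict(text)
    except Exception:
        # Stage 3: if BERT fails, fall back to logistic regression.
        return logreg_toxicity(text)
```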



Now let's look at each component in detail.



Naive classifier



Everything here is quite simple: using the dictionary of obscene vocabulary, it is easy to check whether the text contains a “bad” word or not.
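A minimal sketch of such a check (the word list here is a placeholder; the real dictionary is much larger, and in practice the tokens would also be lemmatized to catch inflected forms):

```python
import re

# Placeholder dictionary; the real list of obscene words is much larger.
BAD_WORDS = {"badword1", "badword2"}

def contains_bad_word(text: str) -> bool:
    """Naive classifier: does the text contain a dictionary word?"""
    tokens = re.findall(r"\w+", text.lower())
    return any(token in BAD_WORDS for token in tokens)
```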



That is, at this stage you can do without an ML model as such and immediately weed out texts that contain “bad” words. But what if no dictionary words are used in the text, yet the text is still unacceptable for posting on the portal? Let's try to solve this problem using logistic regression and BERT.



Logistic regression



This simplest of models predicts a value from the available data. Text vectors for it are obtained using TF-IDF and the TweetTokenizer from nltk; logistic regression then estimates the probability of text toxicity via the logistic function. In our architecture, logistic regression “insures” the neural network.
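A sketch of how such a model can be assembled with scikit-learn and nltk (hyperparameters are defaults, and the training data below is a toy placeholder, not our labelled datasets):

```python
from nltk.tokenize import TweetTokenizer
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# TF-IDF vectors over TweetTokenizer tokens, then logistic regression.
logreg = make_pipeline(
    TfidfVectorizer(tokenizer=TweetTokenizer().tokenize, token_pattern=None),
    LogisticRegression(max_iter=1000),
)

# Toy training data; the real model is trained on the labelled
# datasets described above (0 = acceptable, 1 = toxic).
texts = ["great seeds, fast delivery", "some insulting comment"]
labels = [0, 1]
logreg.fit(texts, labels)

def logreg_toxicity(text: str) -> float:
    """Probability of toxicity according to logistic regression."""
    return logreg.predict_proba([text])[0, 1]
```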



Great and terrible BERT



We used the pre-trained RuBERT model from DeepPavlov, which we then fine-tuned on our labelled texts. Without going into details, prediction boils down to tokenizing the text, passing it through the fine-tuned network, and taking the probability of the “toxic” class.
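As a rough illustration of that prediction step, here is how inference could look with the Hugging Face transformers library; this is a sketch under our assumptions (a hypothetical local path to a fine-tuned checkpoint, with DeepPavlov/rubert-base-cased as the base model), not necessarily our exact stack:

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Hypothetical path to a RuBERT fine-tuned for binary toxicity
# classification; the base checkpoint is DeepPavlov/rubert-base-cased.
MODEL_PATH = "path/to/fine-tuned-rubert"

tokenizer = AutoTokenizer.from_pretrained(MODEL_PATH)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_PATH)
model.eval()

def bert_predict(text: str) -> float:
    """Probability that the text is toxic (class 1)."""
    inputs = tokenizer(text, truncation=True, max_length=512,
                       return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits
    return torch.softmax(logits, dim=-1)[0, 1].item()
```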





We built, built and finally built!



We assessed quality using our favorite metrics: Accuracy, ROC-AUC, and F1-measure. The final quality metrics on the held-out sample are as follows:



| Algorithm / Metric | Naive | BERT | LR | Naive → BERT | Naive → LR |
|---|---|---|---|---|---|
| Accuracy | 0.854 | 0.901 | 0.865 | 0.909 | 0.879 |
| ROC-AUC | 0.782 | 0.960 | 0.921 | 0.963 | 0.939 |
| F1-measure | 0.722 | 0.840 | 0.800 | 0.855 | 0.824 |
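For reference, these metrics can be computed with scikit-learn along the following lines (a sketch with toy data and a hypothetical 0.5 binarization threshold):

```python
from sklearn.metrics import accuracy_score, f1_score, roc_auc_score

# y_true: true labels, y_score: model toxicity scores in [0, 1].
y_true = [0, 0, 1, 1]
y_score = [0.1, 0.4, 0.35, 0.8]
y_pred = [int(s > 0.5) for s in y_score]  # 0.5 is a hypothetical threshold

print("Accuracy:", accuracy_score(y_true, y_pred))
print("ROC-AUC: ", roc_auc_score(y_true, y_score))  # uses raw scores
print("F1:      ", f1_score(y_true, y_pred))        # uses binarized predictions
```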



Processing speed: about 2,800 texts per minute on a GPU (GeForce 1080Ti) when running BERT, the slowest of the presented algorithms.



As expected, the metrics with BERT turned out better, although not by much.



What conclusions did we draw?



In conclusion, I would like to note several important aspects without which, in our opinion, it is impossible to launch such solutions in production.



  1. You always need to take into account the specifics of the task when labelling texts.
  2. Provide for manual moderation of a text when the model “doubts” its decision. You don't want inappropriate content to end up in your product.
  3. Hand-labelled texts from the previous point should also be fed back for additional training. This way, you can improve the model in small steps and reduce the amount of manual moderation over time.
  4. It is better to take an integrated approach to the problem. Sometimes even the simplest “models”, such as dictionaries, already give good results.
  5. Choose the best model based on your task. In our case, we chose BERT because it responds to context better than logistic regression.


Thank you for your attention!



In the next article, we will share our experience with pre-moderation of images on the same platform of our ecosystem, Svoe Fermerstvo.


