Where does the application of ML at the state level in Russia come from?

Foreword



Hello!



The question in the title is not rhetorical; I really want to know. If anyone happens to know the answer, please write in the comments. Maybe I approached it from the wrong side.



I should also clarify that I am not out to complain about anyone, so I removed all stamps, names and signatures of the responsible officials from the ministries' responses. I am simply interested in understanding how this system works.



How did it all start?



It all started at the end of 2019, when I realized that I had reached a certain ceiling in my previous profession and that further growth in that field was, for me:



  1. not interesting;
  2. possible, but squarely a Pareto-principle case: I would have to spend a lot of effort for the sake of illusory prospects, which were not even as desirable as they had seemed at the start of the journey.


Thanks to one memorable get-together with friends, I realized that a long-standing desire of mine, suppressed for various reasons, was to “go into IT”, specifically into ML, and that I could actually succeed at it. I won't describe how I studied, but it was truly intense, productive and, most importantly, exciting, so much so that I had to force myself to rest. In the end, I went all in: I quit my old job and devoted almost all my free time to studying.



Over time, I started thinking about finding a job in my new favorite field, but then the COVID story began. It is also no secret that machine learning has recently become a very fashionable topic, and as a result a significant number of applicants for junior and trainee DS positions have appeared on the labor market. Taken together, these two factors meant that, even with good results in several competencies, to an employer I was just one of many newcomers to the field, and I had little chance of even reaching the interview stage.



After thinking it over, I realized that the only way to stand out from the rest was to have good projects, so I started looking for a topic for my first one. When I was just starting to study ML I was bursting with ideas, but by the time I had to choose a topic there were fewer of them, because, having dug into the specifics a little, I had begun to think differently: “yes, this is a good idea, but OpenCV only copes with it because it is pretrained, and to train my own model I won't have enough resources or data, unless I take a GAN and generate images with it. And then there are request limits on the free version of the API, and downloading a decent dataset takes either a lot of time or a lot of money”, and so on.



Deciding to approach it from the other end, I went back to Kaggle, opened the datasets, sorted them by "hotness", and then it dawned on me. COVID is all around! What could be better than doing not just a project, but a project on a hyped topic? Then I would surely get noticed and snapped up. That's what I thought at the time. Yeah, right.



Looking ahead, I can't help noting that, despite all my modest but real advantages, my applications on xx were either plainly ignored or politely declined, even for vacancies where an employee of the company had recommended me. In just one month of job hunting I sent about 70 applications and had, I think, 3 interviews, after which I myself decided against going further. I don't know what exactly the reason was in each case, but I suspect it comes down to age (30+), the lack of relevant education or experience, and clumsy projects.



But this particular job-hunting story has a happy ending: my current manager found me on xx, I quickly went through the interview stages, and now I do analytics, including with ML, and I really like it. Moreover, I even get paid for it! I probably won't say this to my manager's face because of my introversion, but thank you very much if you happen to read this)



Well, okay, I got too carried away with the lyrical digression. Let's get



Down to business



Having found a dataset on that same Kaggle (https://www.kaggle.com/parthachakraborty/pneumonia-chest-x-ray), I wrote a small sequential network with an accuracy of about 85%. In the end I took a dataset on pneumonia in general rather than on pneumonia caused by COVID, because I could not find sets with a significant number of COVID-19 images, and at that time I knew little about augmentation methods.
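
For readers who have not met augmentation: the idea is to stretch a small image set by adding slightly modified copies of each image. The following is only a minimal pure-Python sketch of that idea, not the code from this project; the function names, the shift size and the brightness range are illustrative assumptions.

```python
import random


def shift_horizontal(img, pixels, fill=0):
    """Shift a grayscale image (list of pixel rows) right by `pixels`;
    a negative value shifts left. Vacated pixels are set to `fill`."""
    width = len(img[0])
    k = min(abs(pixels), width)  # never shift by more than the image width
    if pixels >= 0:
        return [[fill] * k + row[: width - k] for row in img]
    return [row[k:] + [fill] * k for row in img]


def adjust_brightness(img, factor):
    """Scale every pixel by `factor`, clamping the result to the 0..255 range."""
    return [[min(255, max(0, round(p * factor))) for p in row] for row in img]


def augment(img, rng, max_shift=2, brightness=(0.8, 1.2)):
    """Return one randomly shifted and re-lit copy of `img` (same shape)."""
    shifted = shift_horizontal(img, rng.randint(-max_shift, max_shift))
    return adjust_brightness(shifted, rng.uniform(*brightness))


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is repeatable
    xray = [[10, 50, 90], [20, 60, 100]]  # made-up 2x3 "image"
    copies = [augment(xray, rng) for _ in range(3)]
    print(len(copies), "augmented copies, each", len(copies[0]), "x", len(copies[0][0]))
```

In a real project the same role is usually played by a library's image transforms. Which transforms are safe for chest X-rays (mirroring, for example, swaps the side of the heart) is a question for a radiologist rather than for the code.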



Fortunately, I remembered that I have a friend who is a radiologist, and with his help I learned some details about how diagnosing pneumonia from CT scans differs from diagnosing it from X-rays. I also sent him images classified by the model, which I had pulled from the depths of the web with the query "X-rays of lungs infected with pneumonia". The results were slightly worse than I expected. In several images that the network took for bacterial pneumonia there was in fact tuberculosis, which simply was not in the training sample, but otherwise the error rate matched model.score(X_valid, y_valid).
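
The post does not show how model.score is computed. Assuming it reports plain accuracy, as the score method of scikit-learn classifiers does, the tuberculosis mistakes can be illustrated with the small sketch below. The class names and probabilities are made up for the example. The point is that a classifier trained on a fixed set of classes has no way to answer "something else", so an image of an unseen disease is always assigned to one of the known labels.

```python
def accuracy(y_true, y_pred):
    """Share of predictions that match the true labels."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("label lists must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def predict_label(probabilities):
    """Closed-set prediction: always return the most probable *known* class."""
    return max(probabilities, key=probabilities.get)


if __name__ == "__main__":
    # Made-up model output for a tuberculosis image. The classifier only knows
    # two classes (hypothetical names), so it has to pick one of them.
    tb_output = {"normal": 0.2, "bacterial_pneumonia": 0.8}
    print(predict_label(tb_output))  # bacterial_pneumonia

    y_true = ["normal", "bacterial_pneumonia", "tuberculosis", "normal"]
    y_pred = ["normal", "bacterial_pneumonia", predict_label(tb_output), "normal"]
    print(accuracy(y_true, y_pred))  # 0.75
```

A validation split drawn from the same dataset contains no such images, so its score says nothing about how the model behaves on diseases outside the training sample.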



I was elated. After all, I was doing real data science, not making my 70th submission in an attempt to break into the top 1% at predicting house prices in Melbourne. I wouldn't be surprised to learn that the local realtors get the hiccups every time they try to value a house. Sorry, I couldn't resist.



In short, I was inspired, sent out another dozen applications and ... again nothing.



When this idea came to me, I can't say I was guided purely by good intentions like world peace, saving the poor and so on. No, my goal was to find a job as quickly as possible, and for that I needed to stand out favorably from the crowd of fellow “I want to be in ML” hopefuls.



But at the same time, having gone through the teenage crisis of “why are we here?” as an atheist, I settled on my credo: I want to make the world a better place, because everything else, as I see it, has no real value in the grand scheme of things. Idealistic and naive? Yes, that's true, and both what I did next and the reason I am writing this post at all follow from these qualities of mine.



I decided to submit a proposal to the reception office of the President of the Russian Federation, roughly along the following lines (I did not keep the exact text, since it is entered into a special form on the reception office's website): “I, so-and-so, in line with the key areas of state development outlined by the President of the Russian Federation, namely the application of ML in various spheres of the state, propose the following: organize the collection and storage of X-ray images, and make this storage accessible for processing with ML methods, with the possibility of giving feedback.” I then briefly described my model and pointed out that even I, with my modest store of knowledge, had been able to build an advisory model that could work in tandem with a radiologist and be useful. And Russia has a significant number of DS enthusiasts with a high level of knowledge and skills who could do a lot, not only in medicine but also in other areas where ML can in principle be applied.



Unfortunately, I don't remember the exact text of the appeal, since it was in March or April, but the gist is exactly that.



Developing this idea, I would now add that, in principle, as much open data as possible should be collected, aggregated and published on an analogue of Kaggle, where it would also be possible to pose tasks, discuss solutions and find the best ones. Rosstat is already doing something similar in terms of publishing data, and I have even managed to analyze some of it, but this needs to be taken further.



The appeal was registered, and I received a separate letter confirming it, but I was still quite surprised when an actual answer arrived. The first response came from the Ministry of Health. It was short and to the point.







The essence of the answer, as I see it, is “OK, thanks, no need.”



I thought that was the end of the story, but another answer came, this time from the Ministry of Industry and Trade. It is very detailed, but I had the feeling that either my appeal had been passed on to them in distorted form or they simply misunderstood me.



















I did not ask for any financial support for this project; moreover, I did not write a word about wanting to take part in it (although naturally I would not refuse). Well, they answered, and thanks for that.



I would have happily forgotten about this story if I didn't regularly come across news like this, or this (very fresh), or especially this one. After reading it, I laughed a little, because this is exactly what I wrote about.



Outcome



Phew, that turned out to be quite a post.



These are the key questions I wanted to ask.



Who is involved in implementing ML in the "business processes" of the state? Who leads these people?



Is it centralized, or does each ministry have its own data scientists? Are there any in the state apparatus at all?



I have seen the text of the national strategy for the development of artificial intelligence for the period up to 2030, but I still have dozens of questions. Who can I ask, so as to get a sensible answer, of course?



Considering the answers I received, I have some doubts that this strategy is a real plan rather than just a declaration of intent, and some fear that the whole undertaking will end up funding a few showcase pet projects of "insiders", which will then be cited as proof of the strategy's success.



And in general, is anyone reading this post implementing any programs from this strategy?



Thank you all for taking n minutes of your time!


