Maturity levels of ML processes (Machine Learning related processes)

Machine learning has left the hype zone. Whether that is good or bad is hard to say, but one thing is absolutely clear: more and more people are asking "where is the money?". There are fewer futuristic articles about the total victory of machine over human, and more talks and discussions about automating and systematizing the work on ML projects. This article is no exception: the hype is over, and it is time to work.





When it comes to building any process, I personally like the term "maturity level". With a clear rating scale in front of you, you can always see where you are and what lies ahead, set priorities, and start fixing what you need here and now, rather than jumping over a couple of levels and staging a revolution, reinventing the wheel along the way, only to find that it may not come in handy later. All in all, a useful exercise from every point of view.





Purely in theory, you can take the standard process maturity levels described in CMMI or ISO/IEC 33001, or their more down-to-earth IT description in ITIL, and map them onto ML. I tried this exercise in practice several times, but each time the result was a spherical horse in a vacuum: it could answer the as-is and to-be questions, but it was hard to draw a clear path from it. So, looking around, I decided to recall and systematize my own difficult path through various ML projects, because CMMI is good, but real work needs something more specific and down-to-earth. The result is a description of some basic stages in the development of work on ML projects. Whether they can be called "maturity levels" is a secondary question; what matters most is that they answer the questions above.





So, let's get down to specifics. Probably the only level where I invented nothing new and simply borrowed from ITIL is level "0", or "absent". As they say, if there is no process, then there is no process in any form, whether in IT or in ML. But seriously, my 5 levels (steps) are as follows:





  • Level 1. "Enthusiasts"
  • Level 2. "R&D"
  • Level 3. "Analyze It"
  • Level 4. "Specialty"
  • Level 5. "Automate it"
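The scale above, together with the ITIL-style level 0, can be written down as an ordered type. The following is only a minimal sketch for illustration: the names `MLMaturity`, `next_level` and `levels_skipped` are hypothetical and do not come from the article or from any framework. It shows the idea of locating yourself on the scale and moving one level at a time instead of jumping over several.

```python
from enum import IntEnum


class MLMaturity(IntEnum):
    """Levels as named in the article; level 0 ("absent") is borrowed from ITIL."""
    ABSENT = 0
    ENTHUSIASTS = 1
    R_AND_D = 2
    ANALYZE_IT = 3
    SPECIALTY = 4
    AUTOMATE_IT = 5


def next_level(current: MLMaturity) -> MLMaturity:
    """Return the level directly above `current`, or `current` itself at the top."""
    if current is MLMaturity.AUTOMATE_IT:
        return current
    return MLMaturity(current + 1)


def levels_skipped(as_is: MLMaturity, to_be: MLMaturity) -> int:
    """Count the intermediate levels a plan would jump over (0 means step by step)."""
    return max(0, to_be - as_is - 1)
```

For example, a plan to go straight from `R_AND_D` to `AUTOMATE_IT` skips two intermediate levels, which is exactly the kind of "revolution" the scale is meant to make visible.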





Level 1 "Enthusiasts"

- ( ) . , ( ), , , , «Big Data» . Python ( R Matlab … ) -- - . .





— , . , .





Level 2 "R&D"

- , , , . , :





  • – -, .., .





  • – . - 80% , .





  • ML . — , . .





Level 3 "Analyze It"

, , , . :





  • . ( , ), . , .





  • , .. -. , . . , .





  • . , — , ?





Level 4 "Specialty"

, ( , ..), - . , :





  • - PROD- DEV/TEST/PROD





  • , .. – , .





  • – - , - . Run the business/Change the business.





Level 5 "Automate it"

, 4- . DataOps&MLops . DVC Feature Store . .





, :











1





2





3





4





5









.





.















.

























.





















100%









PC/ ( +GPU)





PC ( on-prem)





ML- .









PROD DEV/TEST .





.





PROD DEV/TEST .





.









1-2 (Data Scientists - DS)





3-5 :2-3 DS





1-2 DE (Data Engineers)





5-10 :





3-6 DS





2-4 DE











10+ .





«ML-ops».









25+









-





1-2





2-4





5+





10+





50+









( , ). , , . , , .





, , :





  • , ML? ?





  • ?





  • , ?





, ML - , — .





— ( ) . , ML — Python, R, Azure ML Studio, SPSS SAS, . 1- 5- . . . , ( ), , .





But in fact, this is a separate topic for discussion (or for another article, if inspiration strikes): choosing the optimal path from one level to the next and what that choice depends on, when and what software is needed, what expertise is needed, where outsourcing fits, and, of course, how to determine the point at which to stop.







