The Ultimate AI Career Guide: How to Choose a Major, Level Up, and Find a Cool Job





On August 3, Sergey Shirkin, a specialist in ML and artificial intelligence, spoke on our social networks.



Sergey worked on automating financial technologies and databases at Sberbank and Rosbank, and on building machine-learning-based financial models and doing analytical work at Equifax. He now predicts TV viewership using artificial intelligence methods at Dentsu Aegis Network Russia, and he is a visiting lecturer at the Higher School of Economics (Master's program in Data-Driven Communication).



Sergey also explores quantum computing as applied to AI and machine learning. He heads the faculties of Artificial Intelligence, Big Data Analytics, and Data Engineering at Geek University, where he works as dean and lecturer.



We are sharing the transcript of the broadcast along with the recording.



***



My name is Sergey Shirkin, and today we will talk about artificial intelligence. We will discuss how to get started: how to get into artificial intelligence, how to learn the necessary subjects, what courses to take, what literature to read, and how to begin a career. We will also touch on the various directions within the field.



Today's topics can be of interest not only to beginners but also to experienced programmers - for example, how to move from general programming into machine learning, artificial intelligence, and neural networks. Depending on which technologies a person works with and which languages they know, the practical transition into this area can happen in different ways. There are many specialties within AI.



Can you recommend materials for self-study immersion in AI?



If you're a complete beginner, it's best to start by learning Python. A quick way to do this, as I've seen with other newcomers, is PythonTutor.ru. There you need to study the theory and solve problems - at least 70 percent of them. The problems can seem difficult if you have never programmed before.



The next step is the SQL query language, and the site SQL-EX.ru will help here: it offers SQL exercises organized in stages - a training stage and a rating stage, where you can earn a place in the ranking. Here you will learn how to work with databases. In parallel, there are training materials by the author Moiseenko, and they are quite easy to learn from.
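
To give a flavor of the kind of query practiced on such exercise sites, here is a minimal sketch using Python's built-in sqlite3 module; the table and rows are invented purely for illustration:

```python
import sqlite3

# A toy in-memory database: schema and data are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE product (maker TEXT, model TEXT, price REAL)")
conn.executemany(
    "INSERT INTO product VALUES (?, ?, ?)",
    [("A", "1232", 600.0), ("A", "1233", 400.0), ("B", "1121", 850.0)],
)

# A typical exercise: average price per maker, cheapest first.
query = """
SELECT maker, AVG(price) AS avg_price
FROM product
GROUP BY maker
ORDER BY avg_price
"""
for maker, avg_price in conn.execute(query):
    print(maker, avg_price)
```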



Then you need to study machine learning itself: various algorithms, from linear regression and logistic regression all the way to gradient boosting. There are plenty of materials here. After that you can move on to neural networks - for computer vision and for NLP; you will learn about convolutional and recurrent neural networks, and the most advanced ones - transformers, BERT, and so on.
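
As a sketch of the first rung on that ladder, here is a minimal logistic regression example with scikit-learn on one of its bundled datasets:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# A small classification dataset that ships with scikit-learn.
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Logistic regression is usually the first classifier people learn.
model = LogisticRegression(max_iter=5000)
model.fit(X_train, y_train)
print("test accuracy:", model.score(X_test, y_test))
```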



I'll say a few words about the development of AI. If you look at the history of the field before 2010, it is rather meager: there were, of course, great achievements in AI and in related fields - in big data, for example - and many of the mathematical algorithms were already in place. But there wasn't enough computing power or data. Since 2010 - or rather 2012 - AI has exploded. In 2012, at one of the competitions, a neural network defeated classical machine vision algorithms and learned to recognize about 1,000 image classes.



After this achievement, a large gap opened up over competitors using classical computer vision, and the development of artificial neural networks took off. Various convolutional network architectures emerged, and a breakthrough occurred in computer vision. Previously, it was considered very difficult for a neural network to distinguish an image of a cat from a dog, but since 2012 neural networks have learned to recognize and classify images much faster and, in many settings, more accurately than humans.



Nowadays, computer vision has made great strides. In parallel, natural language processing - NLP - is developing. With the advent of the GPT-3 model, created by OpenAI a couple of months before this broadcast, neural networks have learned to generate text (as well as music and other sequences). This is one of the major steps in NLP - most likely, this decade will see it flourish. Chatbots will appear that can fully sustain a dialogue with a person.



I know a bit of SQL and Python. After data science courses, without experience, can I get a job as a data scientist right away, or do I need to work as a database analyst first?



Getting into data science is harder now than it was 5 years ago. Back then, you could take part in a Kaggle competition and place well - not necessarily first; the top 10%, say - in an interesting competition, not a training-level one. After that, you could go to companies, answer simple questions about machine learning, and get hired. There were few specialists then.



Now everything is much more complicated, so it does not always work out to land the job of your dreams - AI specialist or data scientist - right after you have studied machine learning and math.



A good path is to first work as a database analyst or data analyst. The point is that you have to learn how to preprocess and cleanse data and apply statistics. This can involve database technologies as well as Python. Once you have gained experience and built a background, you can use your knowledge of the Python data science libraries - Pandas, NumPy, scikit-learn - to apply for a job related to AI or data science.
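
As a sketch of the kind of cleaning an analyst learns early on - with invented data, not any particular company's - typical preprocessing steps in Pandas look like this:

```python
import numpy as np
import pandas as pd

# Invented example data with the usual problems: duplicates and missing values.
df = pd.DataFrame({
    "client_id": [1, 1, 2, 3, 4],
    "income": [50_000, 50_000, np.nan, 72_000, 61_000],
    "region": ["msk", "msk", "spb", None, "msk"],
})

df = df.drop_duplicates()                                   # remove exact duplicates
df["income"] = df["income"].fillna(df["income"].median())   # impute missing income
df["region"] = df["region"].fillna("unknown")               # label missing categories
print(df.describe(include="all"))
```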



What is the difference between AI specialists and data scientists?



Does AI need C++? What would you advise studying in order to become a computer vision expert?



There is now a division in the vacancies of Western companies: in addition to the data scientist, there are separate vacancies for AI specialists. Previously, it was understood that a data scientist handled everything - analysis of tabular data, computer vision tasks, and NLP tasks. There was also the data analyst vacancy - it paid less, although it was also quite prestigious; such a person analyzed data but did not go deep into AI involving speech, text, and images, working mainly with tabular data. Then the vacancies blended: in the Valley, all data analysts came to be called data scientists, both those who work only with tabular data and those who work with NLP and computer vision. A little later, the separate AI specialist role began to be carved out. In Russian companies there is usually no such division, although specialized vacancies do appear - for example, "NLP / computer vision engineer". It is desirable for a data scientist to be able to do a little of everything.



About C++: the most essential language is Python. That is, if you are an AI specialist, you should use TensorFlow, Keras, or PyTorch - Python comes first now. But if you write more low-level programs - for example, if the job is related to robotic vehicles - then you will often need C++ code. Python is not always fast. Machine learning libraries are usually written in C++, but sometimes you need to write the entire program in C++: besides the models themselves there is logic (if-else and so on), and in C++ it runs faster. Of course, it is difficult to land such a vacancy right away, and it is better to first work somewhere Python is enough - for example, social media analytics with image analysis, without the need for fast processing.
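
To show why Python comes first, here is a minimal sketch of a small dense network in Keras; the data and shapes are invented placeholders, not a real task:

```python
import numpy as np
import tensorflow as tf

# Invented toy data: 100 samples, 20 features, binary labels.
X = np.random.rand(100, 20).astype("float32")
y = np.random.randint(0, 2, size=(100,))

# A minimal dense network - a few lines of Python.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(32, activation="relu", input_shape=(20,)),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(X, y, epochs=3, batch_size=16, verbose=0)
```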



To become a computer vision specialist, you need to learn how to work with neural network libraries and study the OpenCV library for Python (it is also available for C++). This gives you the toolbox. It is also desirable to be able to work with the NumPy library and to understand the mathematics of image analysis itself - that is, linear algebra and calculus - as well as to know neural network architectures.
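
A minimal sketch of that toolbox, assuming an arbitrary image file (the path "cat.jpg" is a placeholder):

```python
import cv2

# "cat.jpg" is a placeholder path - substitute any image you have.
img = cv2.imread("cat.jpg")
if img is None:
    raise FileNotFoundError("cat.jpg not found")

img = cv2.resize(img, (224, 224))             # common input size for CNNs
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)  # OpenCV loads images as BGR
edges = cv2.Canny(gray, 100, 200)             # classical edge detection
cv2.imwrite("edges.jpg", edges)
```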



Why do ML interviews ask how to resolve collisions in a hash table?



Why is this a marker for hiring when you can Google it along the way?



Not every vacancy asks this. If you are going into tabular data analytics, you are unlikely to be asked. They will definitely ask if you are applying for an ML engineer position: there you do not just build ML models, you also put them into production, and you need to know algorithms and data structures. And if you are developing something like a robotic car, then even more so: there you have to write both high- and low-level code, and this knowledge is a must. Sometimes it is even required in tabular data analysis - say, when you write a module for it in C++.

If you are not yet ready for such vacancies, just go through more interviews. For example, if you apply for a data scientist job at a bank, there will be fewer questions like this.
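
For reference, a minimal sketch of the classic interview answer - collision resolution by chaining, where each bucket holds a list of key-value pairs (an illustration, not production code):

```python
class ChainedHashTable:
    """Minimal hash table resolving collisions by chaining (a list per bucket)."""

    def __init__(self, n_buckets=8):
        self.buckets = [[] for _ in range(n_buckets)]

    def _bucket(self, key):
        # Two different keys can hash to the same bucket - that is a collision.
        return self.buckets[hash(key) % len(self.buckets)]

    def put(self, key, value):
        bucket = self._bucket(key)
        for i, (k, _) in enumerate(bucket):
            if k == key:                  # key already present: overwrite
                bucket[i] = (key, value)
                return
        bucket.append((key, value))       # collision or new key: append to the chain

    def get(self, key):
        for k, v in self._bucket(key):
            if k == key:
                return v
        raise KeyError(key)

table = ChainedHashTable()
table.put("model", "catboost")
print(table.get("model"))
```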



I've been writing in Python for 10 years, but I have no higher education. How difficult is it to get into the field of AI without one?



You will need higher mathematics. You will have to take courses or study the literature, and it will be a long process: you need a grounding in linear algebra, calculus, probability theory, and mathematical statistics. The usual school curriculum is clearly not enough for studying AI. Of course, programs differ - some schools cover university-level topics in the 10th grade - but this is rare.



I know Pandas, SKLearn, Catboost, and Seaborn; in Kaggle competitions I placed in the top 3% and 13%. Do I need to dive into DL, or can I already look for a job?



You are doing well with libraries: you already have Pandas for working with tabular data, SKLearn for machine learning models, Catboost for gradient boosting, and Seaborn for visualization. As for the 3% and 13% results - if these were not training competitions, then with such results you should already have some kind of medal.



Deep learning is not always needed. I think you can already try looking for a job. But if you need to work with DL, then you still have to learn neural networks.



What is the basic set of books to read?



I'm going to show my books at the end of the stream. I picked a basic set - nothing particularly advanced.



To what extent are these professions in demand now? Will there be many vacancies in 2 years?



If you remember 2015-16, there were no more than 5-10 data scientist vacancies on HeadHunter. The market was almost empty. Of course, some analysts were then simply renamed data scientists, but even so there were few.



Now there are several hundred at a time on that same site. And there are vacancies that never appear there: for example, ODS - OpenDataScience - has a separate vacancies section. In general, vacancies show no sign of running out; I think in 2 years there will only be more of them. It is not only large companies that are hiring: there are startups and small companies, and data scientists are now needed in government agencies - for example, in various municipal departments, the tax service, and so on.



In which industry is AI most in demand?



The simplest application of AI - where its use can automate a large amount of specialists' mental work - is in the financial sector. There are a huge number of banks, and each of them needs, for example, to assess the creditworthiness of borrowers: to determine by various criteria whether it is worth issuing a loan, whether the person is overestimating their means, and whether they will be able to repay it. This is the most obvious use of AI.



Then there is marketing and building advertising campaigns: predicting whether a person will watch an advertisement (on the Internet, on TV, and so on). This is also a well-developed direction, and it practically demands automation with AI. Robotization is advancing too: there are not only industrial robots but also household ones - robot vacuum cleaners and other home devices, which someone has to develop. Or various smartphone applications. In general, there are many industries, from manufacturing, medicine, retail, finance, and marketing all the way to entertainment - AI can be used in games, for example.



What is valued more when applying for data science jobs: knowledge of mathematics, understanding of specific algorithms, or work experience?



The questioner has a technical master's degree and a year of work as a data analyst in consulting.



You have a good background - a technical university and a year of work as a data analyst. If you have studied the technologies and know how to program, then getting into data science is easy. If you have worked in database analysis and know SQL, that is a big plus, and if you add programming and machine learning to it, that is a very good set.



I'll talk about how I build machine learning models at work. The company I work for, Dentsu Aegis, is very well known, especially among those who work in marketing. It is a top-5 communications group in the world, headquartered in Tokyo, with offices in 145 countries. The Russian branch is Dentsu Aegis Network Russia. It has been operating in Russia for 25 years and is a pioneer of media innovations.



I will tell you about the area I am responsible for as a data scientist. This is exactly the application I called the most obvious in practice: AI in marketing helps automate many specialists' tasks, and one of them is predicting how different types of content will be viewed by different target audiences. I will talk in more detail about one of my immediate tasks - predicting TV viewership.



There can be several hundred target audiences, and forecasting them manually would require the work of dozens of specialists - it's overwhelming. The amount of data is very large - up to billions of rows in tables. You need to make sure not only that the machine learning model is built, but also that it works quickly. For such work you need to know relational and non-relational databases well, work with Linux, have DevOps skills, generally understand the application architecture and the company's IT infrastructure, and know Python and possibly C++ well.

When we build a forecast of TV viewership, we use modern machine learning methods. For tabular data these are gradient boosting and random forest. If text is analyzed, we use neural networks, and besides them, topic modeling, TF-IDF, and other common NLP methods.



We use gradient boosting because, when predicting on tabular data, gradient boosting is ahead of all known algorithms. On Kaggle, starting in 2018, all the main achievements in competitions with tabular data were made precisely with gradient boosting. Most Kagglers first switched to XGBoost - the first well-known gradient boosting library - and later many mastered LightGBM from Microsoft or CatBoost from Yandex. Time series methods are also well suited to forecasting TV viewership, but they do not always work well: unexpected events periodically appear that have to be reacted to or anticipated in time. Sometimes there are large anomalous periods, from several days to months. For example, the 2018 FIFA World Cup had a big impact on viewing. Quarantine also became an anomalous period: people began to spend more time at home and watch more TV. This, too, must be taken into account and anticipated. In general, such periods are a challenge for machine learning and AI, because you need to constantly monitor models and keep them under control so that they work correctly. Besides anomalous periods, the forecast is influenced by holidays, weather conditions, and changes in viewing trends for specific programs and channels. As a result, the models turn out to be quite complex, because it is necessary to account for all possible scenarios and to take into account or anticipate anomalies and deviations.
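
As a rough sketch of this kind of tabular forecasting - using scikit-learn's HistGradientBoostingRegressor as a stand-in for XGBoost/LightGBM/CatBoost, with entirely invented features and data:

```python
import numpy as np
from sklearn.ensemble import HistGradientBoostingRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 5000

# Invented stand-in features: hour of day, day of week, channel id, holiday flag.
X = np.column_stack([
    rng.integers(0, 24, n),
    rng.integers(0, 7, n),
    rng.integers(0, 50, n),
    rng.integers(0, 2, n),
])
# Synthetic target: higher viewing in the evening and on weekends, plus noise.
y = 100 + 30 * (X[:, 0] >= 19) + 20 * (X[:, 1] >= 5) + rng.normal(0, 10, n)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = HistGradientBoostingRegressor()  # same family as XGBoost/LightGBM/CatBoost
model.fit(X_train, y_train)
print("R^2 on held-out data:", model.score(X_test, y_test))
```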



Naturally, the models are not left to themselves - testing, fine-tuning, and monitoring are ongoing. But it's not only the models that matter: another important step is feature creation. First, there are features related to the air time of the show: time of day, day of the week, season, and so on. Second, there are content-related features. At the same time, one must understand that if a program airs at night, then no matter how interesting the content, it will not get more views than in primetime. The importance of features can vary, and different audiences will choose different content; it may depend on gender, age, and social status.
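
A minimal Pandas sketch of deriving such air-time features (the timestamps and the primetime window are invented assumptions):

```python
import pandas as pd

# Invented broadcast timestamps.
shows = pd.DataFrame({
    "start": pd.to_datetime(["2020-07-01 09:30", "2020-07-04 21:00", "2020-12-31 23:00"]),
})

shows["hour"] = shows["start"].dt.hour
shows["day_of_week"] = shows["start"].dt.dayofweek      # 0 = Monday
shows["is_weekend"] = shows["day_of_week"] >= 5
shows["month"] = shows["start"].dt.month
shows["is_primetime"] = shows["hour"].between(19, 22)   # assumed primetime window
print(shows)
```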



One of the most time-consuming stages of working with data is feature engineering: processing existing features or creating new ones. This part of data science requires a lot of experience: either there are no known recipes in advance, or they are too simple, and you have to invent ways to prepare features on the fly.



Sometimes there are oddities in the data: say, a viewer turns on the TV in the evening and falls asleep, and it looks as if they watched programs all night. This is one example of noise - the data appears accurate but is not, and you have to learn to detect such cases, although it is difficult. Besides, very few advertisements are usually shown at night anyway.



When we build a model, we need not only to make it work but also to provide testing and monitoring, and for this we need metrics. Since ours is a regression problem, our set of metrics differs from the set used for classification: the root mean square error and the coefficient of determination, among others - they are all very important. There are also metrics you have to create yourself to solve a specific business problem - for example, optimizing the costs of an advertising campaign. In that case we need to predict not only the TV rating but also the reach of the advertising campaign, and we work not only with machine learning but also with complex statistical and econometric methods. This is a case where knowledge of machine learning alone is not enough: you need calculus, linear algebra, and mathematical optimization methods. Unlike common machine learning tasks - regression, classification, clustering - here you have to invent your own methods, and programming alone will not suffice.
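
A minimal sketch of computing those two regression metrics with scikit-learn, on invented numbers:

```python
import numpy as np
from sklearn.metrics import mean_squared_error, r2_score

# Invented true vs. predicted ratings.
y_true = np.array([2.0, 3.5, 4.0, 5.5, 7.0])
y_pred = np.array([2.2, 3.1, 4.4, 5.0, 6.5])

rmse = np.sqrt(mean_squared_error(y_true, y_pred))  # root mean square error
r2 = r2_score(y_true, y_pred)                       # coefficient of determination
print(f"RMSE = {rmse:.3f}, R^2 = {r2:.3f}")
```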



I would like to mention the Higher School of Economics program Data-Driven Communications. I have helped students on this program along the way; they study marketing and subjects related to machine learning. So what are machine learning and data science to a marketer? Previously, specialists in this field were not expected to program or build complex models, but now it is a skill that gives an advantage in the labor market. If a specialist has mastered data science on top of their own profession, they get the option either to change jobs and become a data scientist, or to keep developing in their subject area with a strong competitive edge. A marketer who knows machine learning will be able to make more accurate predictions, but it takes a lot of learning.



Is it worth paying attention to the MIPT / Yandex Data Science course, or perhaps looking towards Udacity?



As I understand it, you mean the MIPT / Yandex course on Coursera. Udacity is a standalone learning platform; it does not cover only data science, although quite a large share of its courses is devoted to AI and data science. I recommend not settling on a single resource but trying several courses. Courses never coincide 100%; you can always find something new that you did not know before, and a new course can also serve as review. For example, there are the GeekBrains courses in our faculties of AI, data engineering, and big data analytics. Since I am their dean and a teacher there, I can tell you more about them.



Courses are combined into faculties - for example, the faculty of artificial intelligence has 17 courses plus 8 additional ones. Almost every course ends with practical work as a final project, so students get real practice. I recommend not just studying theory but doing projects: solid practical skills will bring you closer to interviews and the start of a career.



I myself studied at Udacity some time ago - I took a course on robotic vehicles. It was very long: it was planned for 9 months, but in the end it lasted about a year. I learned a lot, and my impressions of the platform are positive. But, of course, all the courses there are taught in English.



How do you take anomalies in time series into account - and can they simply be cut out?



It is an unpleasant process. There is no ready-made recipe - you need a huge number of tests. More precisely, there are ready-made models, but they are designed only to detect anomalies in the past, whereas anomalies need not only to be detected but also anticipated.



There are various approaches for such cases, but you have to develop them yourself. The most important thing is to determine what will happen in the future - for example, a rise in viewing on certain channels and programs. As time passes, this data flows back into the training set, and it needs to be processed in the right way.



Even if there are no anomalies in the future, past anomalies can affect your forecast. There are many methods here; the simplest is to delete the anomalous data, but if there is a lot of it, this can remove a whole period of the time series from consideration, so this method does not always work.
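
One simple member of that family of methods is flagging points that deviate too far from a rolling baseline. A hedged sketch, with an invented series, window, and threshold:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)

# Invented daily viewing series with an injected anomalous spike.
views = pd.Series(1000 + rng.normal(0, 30, 120))
views.iloc[60:67] += 400   # e.g. a big sports event

# Baseline from the preceding days only (shift so a point doesn't mask itself).
baseline = views.shift(1).rolling(window=14, min_periods=7)
z = (views - baseline.mean()) / baseline.std()

is_anomaly = z.abs() > 3   # the threshold is a judgment call, not a universal constant
print(views[is_anomaly])
```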



How to get a job without proven experience?



Your own projects are good experience. That is, if you do not just study theory but immediately build a project - preferably under the guidance of a mentor (someone with experience in data science and AI) - then you know what you are doing. You learn not just to apply theory or fit a model to data found on the Internet, but to solve practical problems. Working on such projects gives you knowledge that cannot be gained from books and courses, and a mentor's help is invaluable here.



Let's talk about books - I've prepared a small stack.



If you work in data science, you will most likely work in a Linux environment. You will not be an administrator - that is, you will not need very deep knowledge - but confident command of the platform for simple administrative tasks (scheduling script runs or managing OS resources) will be required. Here the book "Linux Pocket Guide" by Scott Granneman will help; it can be studied in a couple of days.



On probability theory, I would recommend G. G. Bitner's "Theory of Probabilities" - it contains both theory and problems. Probability theory will come in handy both for interviews and for work.

Anyone working in IT needs a minimum set of knowledge and skills. Accordingly, the book "Theoretical Minimum in Computer Science - Everything a Programmer and Developer Needs to Know" by Wladston Ferreira Filho is a primer in computer science.



If you dive into programming and low-level development, you will need algorithms. In "Algorithms for Beginners: Theory and Practice for the Developer" by Panos Louridas, algorithms are presented without being tied to a specific language. There is a longer book for C++ - "Algorithms in C++" by Robert Sedgewick; it is useful when you want to drop some of the high-level operations Python gives you and build algorithms from scratch.



If you want a general idea of the high-level work of a data scientist, the book "Working with Data in Any Field: How to Reach a New Level Using Analytics" by Kirill Eremenko is for you. There is no programming in it; if you are already a specialist, it will only be useful if you have not worked with data before.

Next: "Data Science from Scratch" by Joel Grus is also a useful book. From the same publisher there is "Practical Statistics for Data Scientists: 50 Essential Concepts" by Peter Bruce and Andrew Bruce - you can study statistics with it.



If you are going to work with data in Python and use the Pandas library, you definitely need "Python for Data Analysis" by Wes McKinney, the original author of Pandas.

On machine learning, I recommend two books: "Machine Learning" by Peter Flach and "Python Machine Learning" by Sebastian Raschka.



For deep learning there is "Deep Learning with Python" by François Chollet, where you can study neural networks for NLP and computer vision problems. Specifically on NLP there is "Applied Text Analysis with Python" by Benjamin Bengfort, Rebecca Bilbro, and Tony Ojeda.



If you want to learn TensorFlow for deep learning, there is the book of the same name - "TensorFlow for Deep Learning" by Bharath Ramsundar and Reza Bosagh Zadeh.



There is also a book that explains the principles of neural networks simply and clearly - "Grokking Deep Learning" by Andrew Trask. And there is "Grokking Algorithms", which gives a good explanation of algorithms that can be useful both in interviews and in practice.



What do you ask in interviews?



I have a small collection of questions. There are questions on classical machine learning: a specialist getting a job in data science and AI should know how classical models work - linear and logistic regression, gradient descent, L1/L2 regularization. The person should be able to explain how decision trees work and what the information criterion is for classification and regression problems, and should know how random forest and gradient boosting work. It is very good if they know the differences between the gradient boosting libraries - CatBoost, LightGBM, XGBoost - that is, how gradient boosting is implemented in each. You also want the person to know the core libraries - Pandas, NumPy, SKLearn. If the specialist will work with neural networks, computer vision, or NLP, there will be questions on those topics as well.
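
As an illustration of the L1 vs. L2 point that often comes up: on synthetic data, Lasso (L1) drives irrelevant coefficients to exactly zero, while Ridge (L2) only shrinks them. A minimal sketch:

```python
import numpy as np
from sklearn.linear_model import Lasso, Ridge

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
# Only the first two features actually matter; the rest are noise.
y = 3 * X[:, 0] - 2 * X[:, 1] + rng.normal(0, 0.5, 200)

lasso = Lasso(alpha=0.1).fit(X, y)   # L1: zeroes out irrelevant weights
ridge = Ridge(alpha=1.0).fit(X, y)   # L2: shrinks weights but keeps them nonzero

print("L1 coefficients:", np.round(lasso.coef_, 2))
print("L2 coefficients:", np.round(ridge.coef_, 2))
```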

There can be a lot of questions. If a person answers well, it is interesting to ask about their projects - once someone has built something, the interviewer immediately has many questions about those projects specifically. If you have personal projects on GitHub, or educational projects from courses, it helps a lot if you can explain in detail the technologies and algorithms you used.



In addition, you can ask various basic questions during the interview. Usually, if a person answers them well, they are most likely a good specialist. Of course, it is also important that they can complete the test task: theory is one thing, but how a person solves a practical problem and what code they write also matters. If someone knows all the theory but submits code that does not use OOP where it is needed, they do not know how to apply the theory correctly. And, of course, the code itself should be readable and commented.



I also wanted to talk about quantum computing - quantum machine learning is another area of my interest - but today there is no time left.



What should be written on the resume to receive an invitation to an interview?



A resume is a crucial element. First, it should not be bloated: it should contain only relevant experience. If you worked in a field unrelated to IT, that does not need to be included. Briefly list your achievements, projects, and completed courses relevant to the vacancy - everything that shows you are a specialist capable of doing the job. And, of course, the resume must be readable.




What happened before



  1. Ilona Papava, Senior Software Engineer at Facebook - how to get an internship, receive an offer, and what it's like working at the company
  2. Boris Yangel, ML engineer at Yandex - how not to join the ranks of dumb specialists if you are a Data Scientist
  3. Alexander Kaloshin, CEO of LastBackend - how to launch a startup, enter the Chinese market, and raise 15 million in investment








