"During quarantine, the load increased 5 times, but we were ready." How Lingualeo moved to PostgreSQL with 23 million users

image



The Lingualeo project is already 10 years old. More than 23 million people from Russia, Turkey, Spain and Latin America are learning English using our service.



Lingualeo was created in the late 2000s and early 2010s using what were then advanced technologies and methods. Over time they became badly outdated, so we decided it was time to update the system.



We asked our backend lead, Oleg Pravdin, to explain how he and his team, while continuing to support the main product, built a new modular service architecture on PostgreSQL, moved the business logic into the database, and migrated millions of users.



Mature product issues



“I came to Lingualeo in August 2018 to lead backend development. Back then, the backend was handled by a team of 8 developers and 2 admins, who maintained a monolith of 1 million lines of code, mostly in PHP. Even a small new feature took 2 months to implement, and infrastructure costs exceeded $1,000 per year for every 10,000 active users.



How did this happen? Over 10 years, several development teams came and went. New people arrived, as I did, and added modules and features in their own way. Newcomers did not always understand how the old parts of the system worked. As a result, the Lingualeo code gradually turned into a black box: opaque backend logic, an overloaded front end, an abundance of workarounds, and large gaps in the documentation.



In total, we had 20 developers on staff, but it was impossible to develop the product: whenever something was added, unexpected problems emerged, and it took the team 2-3 weeks to fix everything. The developers were maintaining code from 2013, and there were no resources left to update the functionality.



Many companies with mature IT products built on decade-old technologies face these challenges. Trends change, but the old architecture makes it impossible to adopt everything new.



The product grows and accumulates functions, but there is no time to document them in detail. There are different ways to solve these problems. We decided to build a new system from scratch while preserving as much of the product logic as possible, so that the user experience in Lingualeo would not change.



Step 1. Assembled a prototype of the new architecture



We had to figure out how to update the technical side of the service and rebuild Lingualeo on modern technologies. I proposed to management that we completely change the backend philosophy: move the business logic into the database and replace MySQL with PostgreSQL.



I started with a prototype on paper: I drew the new architecture and explained how it would increase performance and how many resources it would take to build. The project was hard to defend because there were no clear success stories to point to: no one writes about how to migrate a service with 20 million users without stopping the business. But Lingualeo decided to take the risk and approved the plan.



Pre-migration architecture diagram



image



Pre-migration stack: PHP, MySQL, JSON.







image



Post-migration stack: SQL in PostgreSQL, JSON.



Step 2. Re-assembled the development team



When we shared our plans with the developers, it became clear that the team was not ready for change. Most people left the company; only those who had joined recently stayed. To carry out the migration, we decided to re-assemble the development team.



We were looking for people who were ambitious and open to change, professional and responsible. We paid attention not only to the quality of their code but also to their soft skills. We were rebuilding the architecture of the service, so we needed people who would not be afraid of complex projects and would be ready to solve problems they had not faced before.



Some people were found by chance. I, for example, met Lingualeo CEO Vladimir Sirotinsky on a plane, and Vladimir met the future front-end lead at a consultation with another startup. But we recruited most of the new developers from the market. To fill 8 vacancies, we reviewed 1,118 applications and conducted 124 interviews.



image

Funnel of candidates for new developer vacancies at Lingualeo.



Step 3. Simplified the organizational structure



We have three areas of development: web, backend and mobile applications. We also have a testing department. It is very difficult to quickly find someone who understands all of these areas at once. So we decided to do without a technical director and make the organizational structure of the new team as flat as possible. Only one management level remains in the company: one lead in each area.



We hold regular meetings and communicate directly, so development has become more predictable and timelines have shortened. A CTO can make irrational decisions, such as misallocating responsibilities between teams. In a system where leads communicate without an additional layer of managers, irrational decisions are less likely: we can always discuss any problem in person.



For example, if I see that in the new architecture a function would be more logically implemented in the database than on the front end, I write in the chat and discuss the idea with the front-end lead. There is no need to book a meeting with the CTO or prepare a presentation to back up the idea.



image



Organizational structure before and after the changes: we removed the CTO role in development and simplified the structure of the product department. Now a product designer doesn't need to go through two levels of managers to get an idea to the front end.



Step 4. Transferred business logic to databases



Previously, Lingualeo's business logic lived on the front end and in the applications. Product functions were handled by JavaScript or PHP code, which is not designed for data processing. So we moved the Lingualeo business logic into PostgreSQL databases.



Jungle



One of the 4 main sections of the Lingualeo service is Jungle. It is a collection of materials in a foreign language (texts, audio and video) in which you can look up the translation of any word. Users study real English content, and if something is unclear, they can click on a word in the text or in the video subtitles and see its translation.



Text in the Jungle



image



Video in the Jungle



image



For click-to-translate to work, the text must first be divided into words, expressions and phrases. Then the service looks each unit up in the dictionary and shows the translation in a pop-up over the text. Splitting the text is rather difficult: there are fixed expressions and phrasal verbs that make no sense when divided into separate words. For example, "take off" and "take" are different units of content, although they share a word.



All the logic for this function, along with the exceptions and complex text-splitting rules, used to be written in JavaScript on the front end. The function was very cumbersome, and translation could take a long time.



We implemented this feature in the database. The backend sends the front end a ready-made JSON in which the text is already split into words and expressions. Each word and expression in the database has an ID, which makes it easy to find the translation. The JSON also records which words the user already has in their dictionary and which they do not. All the front end has to do is display the information and highlight words according to those flags.
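
To make the idea concrete, here is a minimal Python sketch of longest-match segmentation followed by JSON assembly. It is not Lingualeo's implementation, which runs inside PostgreSQL and handles far more rules; the dictionary contents, the IDs, the JSON field names and the `segment` and `build_payload` functions are all hypothetical.

```python
import json

# Hypothetical mini-dictionary: unit text -> ID. Multi-word entries
# (phrasal verbs, fixed expressions) sit next to single words.
UNIT_IDS = {
    "take off": 101,
    "take": 102,
    "off": 103,
    "the": 104,
    "planes": 105,
    "at": 106,
    "dawn": 107,
}
MAX_UNIT_WORDS = max(len(unit.split()) for unit in UNIT_IDS)


def segment(text, user_unit_ids):
    """Split text into dictionary units, preferring the longest match."""
    words = text.lower().split()
    units = []
    i = 0
    while i < len(words):
        # Try the longest candidate first so "take off" wins over "take".
        for size in range(min(MAX_UNIT_WORDS, len(words) - i), 0, -1):
            candidate = " ".join(words[i:i + size])
            unit_id = UNIT_IDS.get(candidate)
            # A single word is always emitted, even if it is unknown.
            if unit_id is not None or size == 1:
                units.append({
                    "text": candidate,
                    "id": unit_id,  # None when the word is not in the dictionary
                    "in_user_dictionary": unit_id in user_unit_ids,
                })
                i += size
                break
    return units


def build_payload(text, user_unit_ids):
    """Return the ready-made JSON that a front end would only need to render."""
    return json.dumps({"units": segment(text, user_unit_ids)})
```

Calling `build_payload("The planes take off at dawn", {101})` yields five units, with "take off" kept as one unit and flagged as already present in the user's dictionary. Trying the longest candidate first is what keeps phrasal verbs from being split into separate words.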



Dictionaries



We did the same with the Dictionaries section: all the work now happens in the database. Some users have more than 100,000 words and expressions in their dictionary. The dictionary has to offer convenient search, grouping of words, and a wide range of filters.



Previously, the logic of dictionaries was on the front side or in the PHP layer, but now the system has a full-fledged API between the front and backend. You can send one request with a large number of parameters to the database, and a ready-made JSON will come from there:



image



Dictionary filters: search by words, choice of words by type of training, choice between words and phrases, filter by learned and new words
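
The "one request with many parameters" pattern can be sketched as follows. This is an illustration under stated assumptions, not the production API: SQLite from the Python standard library stands in for PostgreSQL so the example runs anywhere, the `user_words` table and the `search_dictionary` function are hypothetical, and the JSON is serialized in Python here, whereas the article describes the database itself returning ready-made JSON.

```python
import json
import sqlite3


def make_demo_db():
    """Create an in-memory database with a hypothetical user_words table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE user_words ("
        " user_id INTEGER, text TEXT, kind TEXT, learned INTEGER)"
    )
    conn.executemany(
        "INSERT INTO user_words VALUES (?, ?, ?, ?)",
        [
            (1, "take", "word", 1),
            (1, "take off", "phrase", 0),
            (1, "dawn", "word", 0),
            (2, "take", "word", 0),
        ],
    )
    return conn


def search_dictionary(conn, user_id, search=None, kind=None, learned=None):
    """Run one query for any combination of filters and return JSON."""
    # A NULL parameter switches its filter off, so the SQL text never changes.
    rows = conn.execute(
        "SELECT text, kind, learned FROM user_words"
        " WHERE user_id = :user_id"
        " AND (:search IS NULL OR text LIKE '%' || :search || '%')"
        " AND (:kind IS NULL OR kind = :kind)"
        " AND (:learned IS NULL OR learned = :learned)"
        " ORDER BY text",
        {
            "user_id": user_id,
            "search": search,
            "kind": kind,
            "learned": None if learned is None else int(learned),
        },
    ).fetchall()
    return json.dumps(
        [{"text": t, "kind": k, "learned": bool(l)} for t, k, l in rows]
    )
```

For example, `search_dictionary(conn, 1, search="take", learned=False)` returns only the unlearned phrase "take off" for user 1. Because every filter is a bound parameter rather than a fragment of SQL, adding a filter does not add a round trip or a new query shape.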



Courses



Moving the business logic into the database significantly reduced the amount of code and sped up the service. Take the backend code of the Courses page, which changed after the migration. Registered users see this page, and the system selects the courses on it according to ten criteria. Previously, the page took 600 ms to build and sent 12 requests to the database; now it sends only one:



image

image
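
A simplified sketch of why collapsing requests helps: the same page can be assembled with one query per course or with a single join. The schema, the data and both functions below are hypothetical, and SQLite stands in for PostgreSQL; the real page uses ten selection criteria, which are not reproduced here.

```python
import sqlite3


class CountingConnection:
    """Thin wrapper that counts how many queries are sent to the database."""

    def __init__(self, conn):
        self.conn = conn
        self.queries = 0

    def query(self, sql, params=()):
        self.queries += 1
        return self.conn.execute(sql, params).fetchall()


def make_demo_db():
    """Create hypothetical courses and progress tables in memory."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE courses (id INTEGER PRIMARY KEY, title TEXT, level INTEGER);
        CREATE TABLE progress (user_id INTEGER, course_id INTEGER, done INTEGER);
        INSERT INTO courses VALUES (1, 'Grammar basics', 1), (2, 'Travel', 1),
                                   (3, 'Business', 2);
        INSERT INTO progress VALUES (7, 1, 80), (7, 3, 10);
        """
    )
    return CountingConnection(conn)


def courses_page_chatty(db, user_id, level):
    """Application-side assembly: one query for the list, one per course."""
    page = []
    for course_id, title in db.query(
        "SELECT id, title FROM courses WHERE level = ? ORDER BY id", (level,)
    ):
        rows = db.query(
            "SELECT done FROM progress WHERE user_id = ? AND course_id = ?",
            (user_id, course_id),
        )
        page.append((title, rows[0][0] if rows else 0))
    return page


def courses_page_single(db, user_id, level):
    """Database-side assembly: the join does the same work in one query."""
    return db.query(
        "SELECT c.title, COALESCE(p.done, 0) FROM courses c"
        " LEFT JOIN progress p ON p.course_id = c.id AND p.user_id = ?"
        " WHERE c.level = ? ORDER BY c.id",
        (user_id, level),
    )
```

Both functions return the same rows, but the first sends one query plus one per course, while the second always sends exactly one. The gap widens as the number of courses and criteria grows.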



Step 5. Took user feedback into account after the release



Development took about six months: we started the update at the end of 2018, and the release took place in May 2019. Most users felt that the service had become much faster. Previously, Lingualeo could serve no more than 2,000 simultaneous learners without slowing down; now the system withstands peaks of more than 100,000 users.



Some people also noticed negative consequences. When you migrate from a black box, it is difficult to guarantee that one hundred percent of the data survives, so some users lost words from their dictionaries and others saw their course progress displayed incorrectly.



Gradually, my team and I fixed all the problems. The main result of the changes is that we no longer work with a black box but with a simple and transparent system, so handling the feedback was much easier.



Lingualeo



In April 2020, during self-isolation, the load on Lingualeo was five times higher than in the same period a year earlier. This did not cause any problems: the speed of the service did not drop, and users did not notice anything. I'm sure that if we hadn't updated the system, the service would simply have failed.



The product is not only faster for users but also much easier to work with: the team can introduce and test new features more easily. We have tidied up the documentation, and the codebase is now about 40 times smaller, so new developers can quickly figure out how the service works.



The product has also become cheaper to run, because we need to rent less computing power. The cost per active user has fallen more than 50-fold, even though the number of active users has doubled since the update.



Finally, the product is safer. Earlier, when all the business logic was in the PHP layer, many different functions there sent queries to the database. A database that accepts arbitrary SQL queries is a problem: an attacker can use SQL injection to make it execute dangerous statements, such as ones that delete data. Now not a single SQL query comes from outside, because we moved all the logic inside.”
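
The injection risk described in the interview, and the effect of exposing only predefined operations, can be illustrated with a short sketch. Everything here is an assumption for demonstration: SQLite stands in for PostgreSQL, and the `words` table, the `OPERATIONS` registry and the `call` function are hypothetical stand-ins for database-side functions.

```python
import sqlite3


def make_demo_db():
    """Create a hypothetical words table with rows for two users."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE words (user_id INTEGER, text TEXT)")
    conn.executemany(
        "INSERT INTO words VALUES (?, ?)",
        [(1, "take"), (1, "dawn"), (2, "secret")],
    )
    return conn


def find_words_unsafe(conn, user_id, prefix):
    """Anti-pattern: caller input is pasted into the SQL text."""
    sql = (
        f"SELECT text FROM words WHERE user_id = {user_id}"
        f" AND text LIKE '{prefix}%' ORDER BY text"
    )
    return [row[0] for row in conn.execute(sql)]


# Hypothetical fixed API: callers choose an operation by name and pass values.
OPERATIONS = {
    "find_words": "SELECT text FROM words WHERE user_id = ?"
                  " AND text LIKE ? || '%' ORDER BY text",
}


def call(conn, operation, *args):
    """Run a predefined statement; arguments are bound, never interpolated."""
    return [row[0] for row in conn.execute(OPERATIONS[operation], args)]
```

With the input `x' OR 1=1 --`, `find_words_unsafe` returns every user's words, including another user's "secret", because the input rewrites the query. The same input passed to `call(conn, "find_words", 1, ...)` is treated as a plain search string and matches nothing.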



We want to keep blogging regularly about how development works in the updated Lingualeo. Tell us in the comments what we should cover first: the team, the management structure and the technology have all changed. We will be happy to answer all your questions.


