To be or not to be: discussions about testing in mobile development

At our Android meetup we held short 10-15 minute discussions where, together with experts from Avito, Citymobil and Revolut, we shared different views on whether testing is needed in different projects, talked about regression and about testing on users.



Watch the video, read the transcript and share your opinion on the questions raised in the comments. Let's figure it out together: to be or not to be?



Discussion one





Timecodes



0:56 - Is testing always necessary?

2:00 - Why test if you need to roll out features to the market as soon as possible?

3:24 - Is it possible to test only the basic functionality or one part of the application?

5:29 - Criteria for testing functionality

7:28 - At what point should a startup or outsourcing company stop releasing features and start spending time on testing?


The meetup began with a heated discussion right away, so let's first introduce the participants of our online discussion:



  • Dmitry Voronin, lead engineer in the Speed team (Avito)
  • Yuri Chechetkin, mobile developer (Revolut)
  • Dmitry Manko, Android developer (Citymobil)
  • Nina Semkina, senior programmer (Yandex.Money)
  • Vladimir Genovich, lead programmer (Yandex.Money)
  • Dmitry Zhakov, tester (Yandex.Money)


Nina Semkina: Many people say that testing must be introduced into projects ... But everyone understands that it is very expensive: covering all the code with tests takes a lot of developers' time and resources. Does testing always have to be done?



Dmitry Manko: What are you going to test?



Nina Semkina: An Android application. At what point do I need to realize that it has to be tested 100%?



Dmitry Manko: If the market is not waiting for some particular functionality, then from a development or coverage point of view testing can be simplified or skipped entirely. It all depends on the product. If we are making a calculator with two functions, then we have most likely walked through its test cases more than once already. In that case tests can be omitted and the application can reach the market faster, reducing the time to market.



Nina Semkina: We are always in a time-to-market race: the market keeps demanding a pile of features, and we cannot afford to slow down. Especially when it comes to a project that is just starting and needs to launch right now: we have no name yet and we are competing with other companies. Why would we care about testing at that moment? We need to pack the application with features and ship them as fast as possible, otherwise they will be obsolete in three weeks. Why test them?



Dmitry Voronin: In fact, at some point development speed starts to depend on test coverage. If, for example, the business is architecturally tied to one main screen where all the main features live, and changes are made there over and over again, then without tests you may start moving more slowly there even with a large feature pipeline. As soon as there are signals that things keep coming back from regression, that is a reason to think something is wrong with your coverage and you are only reacting instead of acting. For example, if something breaks often, it means that place is not being tested the way it should be.



Nina Semkina: Can I conclude that you only need to keep testing the basic functionality, and plug features in at a single point? Let those individual features have bugs; they exist today and may be gone tomorrow. Can I test only one part of the application?



Dmitry Manko: I would rather say that it is the key parts that are worth testing: whatever is important for this product from a business point of view. As for features with bugs, if they live somewhere separately and do not affect anything shared, then ideally put them behind a remote toggle. Then, if you see from analytics that there really are bugs and users are complaining, you switch that feature off.
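
A minimal Kotlin sketch of the remote toggle idea Dmitry describes: a risky feature is wrapped in a flag that the backend can flip if analytics show problems. All names here are illustrative, not anyone's actual code.

```kotlin
// Hypothetical toggle abstraction: the map of flags is assumed to be fetched
// from the backend at startup (or periodically) by some config client.
interface FeatureToggles {
    fun isEnabled(key: String, default: Boolean = false): Boolean
}

class RemoteFeatureToggles(private val remoteFlags: Map<String, Boolean>) : FeatureToggles {
    override fun isEnabled(key: String, default: Boolean): Boolean =
        remoteFlags[key] ?: default
}

class CheckoutScreen(private val toggles: FeatureToggles) {
    fun render() {
        if (toggles.isEnabled("new_promo_banner")) {
            showPromoBanner()   // risky feature: can be switched off remotely
        } else {
            showClassicLayout() // stable fallback that stays covered by tests
        }
    }

    private fun showPromoBanner() { println("promo banner") }
    private fun showClassicLayout() { println("classic layout") }
}

fun main() {
    // The backend turns the feature off after complaints: no new release needed.
    CheckoutScreen(RemoteFeatureToggles(mapOf("new_promo_banner" to false))).render()
}
```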



Dmitry Voronin: Dima touched on a good point: tests are not the only quality control tool a developer has. You should always remember that besides unit and integration tests and manual testing there is, and should be, monitoring. And there are ways to roll your changes out and roll them back. If you have all these engineering practices in place, you can in principle neglect some tools in favor of others if speed benefits from it. But in general it is considered good form for a developer to have a testing culture: it is simply more comfortable and more reliable to deliver changes when you have verified the functionality they are supposed to provide. In our company it is customary to leave behind code that is easy to change later, where you can quickly run the tests and make sure you have not broken what was already working.



Nina Semkina: People in the chat write that you should test what brings in revenue, and test when the cost of testing is lower than the potential losses from broken functionality. That is actually a good criterion for deciding what exactly needs to be tested. Could you name any other criteria for testing specific functionality?



Dmitry Manko: Criteria can be derived from analytics. For example, if you track which functions are used most often, it makes sense to test those, so that users run into bugs as rarely as possible. If bugs sit in rarely visited places, that is a minor problem. But if a bug is on the authorization screen, it may not be critical, yet it is a bug everyone will see, and that is already a reputational risk.



Dmitry Zhakov: Testing as such is simply needed. We must not forget that testing is verification of the requirements we place on a product. If something does not meet the requirements, that is a problem, a bug, an issue. All test cases and all testing can be automated, and if there is not enough time, it is worth checking the critical things first and everything else afterwards. For example, if the release is tomorrow and the business wants the feature tomorrow, you check the most critical functionality. And if you have enough time, you can afford to check the medium- and low-priority cases as well. It is more a question of whether your testing will be manual or automated, whether it is metrics or UI testing that a robot can verify end to end.



Nina Semkina: We are all speaking on behalf of large companies with plenty of resources and opportunities. But consider the point of view of small firms and startups with limited time and resources. I think at first everyone there will sacrifice testing. At what point do we realize we have reached the critical milestone where we should stop churning out features and spend resources on testing instead?



Dmitry Manko: I can share my opinion, since I come from an outsourcing company. Outsourcing is first of all about selling man-hours, and testing there is genuinely expensive; sometimes it costs more than developing the functionality itself. Outsourcing companies whose customer is waiting for the application and keeps prodding them are not famous for testing. In our team we ran into the following situation. The product was a menu for bars, where promotions were applied in a huge number of cases (birthday, two for one, student discount, and so on). We noticed that this promotion functionality had been breaking every month for a year. Then we described all the cases with unit tests, ideally: we finally understood how everything worked (there were about 70 test cases). We beat that problem, but of course it would not be possible to do this everywhere.
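
As a rough illustration of covering promo rules like these with plain unit tests, here is a Kotlin/JUnit sketch. The promo engine, its rules and the numbers are invented for the example; they are not the actual bar-menu product.

```kotlin
import org.junit.Assert.assertEquals
import org.junit.Test
import java.time.LocalDate

// Hypothetical promo rules standing in for the bar-menu example from the discussion.
data class Order(val items: Int, val pricePerItem: Int, val customerBirthday: LocalDate?)

class PromoEngine {
    // "2 for 1": every second item is free.
    fun twoForOnePrice(order: Order): Int =
        (order.items - order.items / 2) * order.pricePerItem

    // Birthday promo: 20% off if the order is placed on the customer's birthday.
    fun birthdayPrice(order: Order, today: LocalDate): Int {
        val total = order.items * order.pricePerItem
        val birthday = order.customerBirthday ?: return total
        val isBirthday = birthday.month == today.month && birthday.dayOfMonth == today.dayOfMonth
        return if (isBirthday) total * 80 / 100 else total
    }
}

class PromoEngineTest {
    private val engine = PromoEngine()

    @Test
    fun `two for one charges only half of an even order`() {
        val order = Order(items = 4, pricePerItem = 100, customerBirthday = null)
        assertEquals(200, engine.twoForOnePrice(order))
    }

    @Test
    fun `birthday discount applies only on the matching date`() {
        val order = Order(items = 2, pricePerItem = 100, customerBirthday = LocalDate.of(1990, 5, 20))
        assertEquals(160, engine.birthdayPrice(order, today = LocalDate.of(2020, 5, 20)))
        assertEquals(200, engine.birthdayPrice(order, today = LocalDate.of(2020, 5, 21)))
    }
}
```

Once rules like these are pinned down by tests, the "promotions break every month" cycle becomes much harder to reintroduce unnoticed.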



Yuri Chechetkin: My experience is with large companies - Yandex, Alfa-Bank, Revolut - in fintech, where the criticality of any bug is simply off the scale. That said, I also have startup experience, and even there testing was absolutely necessary. I think it does not matter whether it is a startup or not, because the developer must be responsible for their code, and tests are a guarantee that the code works. A developer is first of all an engineer who is responsible for the product being built. So you should write tests not because you are told to, but because they help you. If tests are written just for show, they can slow development down. But if you need tests and you understand that yourself, they must be written. If a developer writes code and is confident it works without tests, that is a risk, and it is their choice. Still, I believe a developer should not take that risk and should cover themselves, covering everything with tests.



Nina Semkina: So, we have decided that we do need to test our code somehow, and we will look into the topic in more detail. I now give the floor to Vladimir Genovich and his talk.



Discussion two





Timecodes



0:09 - How to remove the regression load from QA before releases? Do companies have a strategy to improve application stability?

4:25 - Does it make sense to use mocks or fake objects in UI tests?

8:05 - Testing on users: is it acceptable or not?


Nina Semkina: During the talk we received a question in the chat, and I would like to continue our discussion with it. How do we remove the regression load from QA before releases? What practices do our speakers use to choose what to test? Do companies have a strategy for improving application stability and offloading QA specialists?



Dmitry Zhakov: Our strategy is that we test everything, because we are the last line of defense before the users. So we only hand the client an application that works stably, always and everywhere. The only question is speed. Initially the manual run took us a long time - up to a week. Thanks to automation we have got the release down to about a day on average. So if you are developing some functionality, you need to agree that either you or the testers will write autotests right away. The mobile-specific cases that you cannot automate are the only ones left to check during regression, and the robot checks the rest. That way you offload the testers, they get to do the more interesting exploratory work, and you hand the whole routine of clicking through scripts to the robot.



Yuri Chechetkin: Most large companies are moving away from QA and manual testing. It is not exactly a revolutionary path; rather, manual testing is a bit of a relic of the past. In the company where I work now, for example, the word "regression" is not even uttered. We do not have a QA department at all.



Vladimir Genovich: You probably automated it?



Yuri Chechetkin: Not exactly. It only existed in the early stages of the project, and then it was gone altogether.



Vladimir Genovich: You run UI tests, right?



Yuri Chechetkin: There are UI tests, of course.



Vladimir Genovich: And Unit tests? So running these tests on release is not a regression?



Yuri Chechetkin: Yes, that is a regression, but there is no manual testing in the sense we are used to talking about. And it is a rather interesting approach. It is sobering, and it transforms the developer from a "child" who writes code and hands it to testers into a more mature, independent engineer who is responsible for their own code. As for visual things, the review can be done by a designer or a PO. And there are things like screenshot tests, the way Facebook does it. So it seems that product companies can now do without QA, and testers themselves can do more interesting work. In outsourcing the story is a bit different, of course: they sell man-hours, and QA can be sold as an additional service.



Dmitry Zhakov: So it turns out you do have regression, it has simply been handed over to automation, and you have people doing exploratory work on your application. And testing can be more than just UI testing.



Yuri Chechetkin: Yes, for example, testing on users.



Nina Semkina: Before we touch on this very dangerous topic, I would like to read the next question from our listeners. Does it make sense to use mocks or fake objects in UI tests?



Dmitry Voronin: It makes sense; you cannot get anywhere without them. UI tests with full integration are very unreliable, and you can never rely on a test that runs on a pull request and involves 30 systems, each with a bunch of points of failure. Such tests are not viable, and no company has ever managed to make such things work. That is why UI tests are the bane of mobile development. If possible, it is better to test without the UI. But we are forced to live with the framework, the only alternative is some Robolectric-style substitute, and on iOS there is not even that. So to check the interaction with at least some of the systems that matter to us, we run everything on a device. The UI is involved only insofar as, out of the immaturity of our development practices, we want to capture as much as possible and check exactly how the user clicks - it makes us feel calmer.

It seems to me that in time this will become a thing of the past: we will stop being afraid of mocks and clicks and stop fighting the system trying to check absolutely everything, because we will not manage to check everything anyway. There can be visual bugs that no UI test will catch. So I believe mocking in UI tests can and should be done, and its main goal is to increase the stability of the tool, to bring it to a state where it is actually useful. The real benefit here is making sure there are no regressions. Any tool that is flaky turns into the second "D" from Vladimir Genovich's earlier talk, the point where we stop trusting it. That happens when a huge number of random values start flowing into our test. Such a test gives no confidence at all, only the false hope that something has been tested.
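
To make the idea concrete, here is a sketch of an instrumentation test where the screen's data source is a deterministic fake instead of the real backend. ProfileActivity, the view id and the dependency-injection wiring are hypothetical; the point is only that the Espresso assertions stay the same while the data underneath becomes fully controlled.

```kotlin
import androidx.test.espresso.Espresso.onView
import androidx.test.espresso.assertion.ViewAssertions.matches
import androidx.test.espresso.matcher.ViewMatchers.withId
import androidx.test.espresso.matcher.ViewMatchers.withText
import androidx.test.ext.junit.rules.ActivityScenarioRule
import org.junit.Rule
import org.junit.Test

// Production interface that is normally backed by the network.
interface ProfileRepository {
    fun loadUserName(): String
}

// Fake used only in tests: deterministic data, no backend, no flaky failure points.
class FakeProfileRepository : ProfileRepository {
    override fun loadUserName() = "Test User"
}

class ProfileScreenTest {

    // Assumes the test build wires FakeProfileRepository into ProfileActivity
    // (via a DI override, a test application class, or similar).
    @get:Rule
    val activityRule = ActivityScenarioRule(ProfileActivity::class.java)

    @Test
    fun userNameFromFakeRepositoryIsShown() {
        onView(withId(R.id.user_name)).check(matches(withText("Test User")))
    }
}
```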



Dmitry Zhakov: About 70% of our cases are automated, and they do not use a single mock inside the application. It may be easier to move the mocking to the backend. For example, if a request goes out for a particular card number, you can expect that 3DS will not be requested from you. That is, the application does not even know it is talking to a mock. I think this is an infrastructure question.



Nina Semkina: Before moving on to the next talk, I would like us to touch on one slippery topic: testing on users. Many are guilty of this; they always want to, and they do ... What do you think about it? Can you afford to quietly roll out to users, collect crashes from them and fix things for yourself, and only after testing on them roll out good, full-fledged versions? Or is that completely unacceptable? Or are there reasonable boundaries?



Yuri Chechetkin: We at Revolut practice this a little, in the sense that it is not straight into production, but it is on real users. A demo is also a kind of testing on users: during the demo questions come up about the flow and so on, and at that stage there can be questions about design and general mechanics. Besides that, there is internal rollout - the company is large, more than 1000 people, and we can roll out to colleagues. That is user testing too, but not external, and it seems safe. After that a feature can be rolled out to a small percentage of real users outside, but with the ability to shut it off with a toggle. What do you think could go wrong at these stages?



Dmitry Manko: In our reality things do go wrong. No matter how well we try to carry out these stages, cases still pop up where we have to watch the crash analytics. A release does not end once we have sent the build to the store, all the stages have passed and everything looks OK. You need to keep watching how the application behaves.



Yuri Chechetkin: Definitely. In our case we have the demo, internal rollout and testing on 5% of users instead of manual testing. Of course, after a feature is released you need to keep watching it. The rollout should not go to 100% straight away - that is the main defense mechanism.
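
A toy sketch of the percentage rollout Yuri describes: the backend sends a kill switch plus a rollout percentage, and the client derives a stable bucket from the user id. The names and mechanism are illustrative, not Revolut's actual implementation.

```kotlin
import kotlin.math.absoluteValue

// Hypothetical rollout config delivered from the backend.
data class FeatureRollout(val enabled: Boolean, val rolloutPercent: Int)

fun isFeatureOnFor(userId: String, rollout: FeatureRollout): Boolean {
    if (!rollout.enabled) return false                  // kill switch always wins
    val bucket = userId.hashCode().absoluteValue % 100  // stable per-user bucket 0..99
    return bucket < rollout.rolloutPercent
}

fun main() {
    val fivePercent = FeatureRollout(enabled = true, rolloutPercent = 5)
    println(isFeatureOnFor("user-42", fivePercent))                       // same user, same answer every time
    println(isFeatureOnFor("user-42", fivePercent.copy(enabled = false))) // feature shut off remotely
}
```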



Dmitry Voronin: The ethical side of user testing is handled for us by Google; Apple does not seem to have this. As you know, there are dedicated distribution channels (alpha, beta ... production). Anyone can join beta testing, and they accept a clear form that says they knowingly agree they may receive an unstable version of the product - they volunteer to help the company make the product better. Once we tell a person about this openly, I think the ethical issue goes away and we should not be afraid to roll out a version there that we are not 100% sure about. It is even better when we get feedback from that channel and improve the process with every unstable release. If a company has processes that track quality trends in beta, things should only get better. And it is also a plus for the users: they get the features first. These are mostly motivated users who are loyal to your product; they themselves want to try the new things that appear in the application, and they are even ready to sacrifice something for it.



Nina Semkina: We understand that when we talk about a loyal audience, it stays loyal only as long as its own interests are not affected. So we can roll out features that add small perks, and even if those features fail, these users will not be very upset. But even if a person has confirmed they are ready to take the test version, once something serious goes wrong for them (for example, extra money gets debited), they will no longer be loyal. And the larger the company, the more harshly users will speak about the product.



Vladimir Genovich: But what about the early adopter who loves you no matter how badly the company messes up? Most likely the company will be able to compensate the losses. You must agree: if we roll something out, we tell the user, "Listen, we are very nervous about this. You could lose 1000 rubles, but we will reimburse you." Most likely such a user will proceed at their own risk, and if the money is lost we will not tell them afterwards, "Well, it's your own fault." So I think that even in the case of a banking application we can take care of our users.



Dmitry Zhakov: And if you have too few beta testers, you can use A/B testing driven by configs to enable or disable a feature, so that in case of crashes you can immediately switch something off and test it properly. As we remember, rolling back is very hard in mobile, so it is better to check everything as thoroughly as possible before the release.



Vladimir Genovich: Or write in React Native))



Nina Semkina: I will pause our conversation here, as it is time for the next talk. Dima, the floor is yours.



Discussion three





Timecodes



0:05 - How to improve regression testing? How and when is testing introduced in the development of features (Avito's experience)

10:43 - Where do you run unit tests: on CI or locally before pushing?




Nina Semkina: I would like to turn to Dima Voronin to hear his opinion and his experience of how they improved regression testing at their company and when testing is introduced while developing features.



Dmitry Voronin: I really do have something to share: a five-year history of dealing with manual regression. It is partly a continuation of the answer to the question we discussed between the first two talks - the question of what to do if you have manual regression. Not everyone will be able to repeat the Revolut experience. The guys did great, they cut boldly, and they managed to do it reliably. That takes a lot of courage, a good development culture and, most importantly, understanding engineering leaders who are not horrified by this approach. There is inertia in our work, and changing the foundations can be hard, especially in large companies. The Revolut example proves that if it works, it is at least faster than manual regression, and every developer starts asking themselves the right questions. That is, the developer becomes responsible for most of the release cycle - not just up to the moment the changes are committed, but, like any grown-up engineer, also shepherding the product through the release stage.



What happened in our case? We were at the point where manual regression was done by 5 people over 12 working days, and without it the mobile app would not ship. That was 2015, and at that moment we did not have a single automated UI test. We had been writing unit tests almost from the very beginning, and quite actively. Vladimir in his talk mentioned 10 seconds and 1000 tests; it is scary for me to remember when we passed that point, back in 2014. Now we have 12,000 unit tests, and they take far more than 10 seconds - this is not free either. But even though all engineers understand tests and write them, there was a tricky moment: all those unit tests prove nothing at all about bugs in production and about how the application actually behaves. Testing captures behavior, makes changes easier and gives feedback on whether you are doing things right. The problem is that there is a QA department. Of course, the department itself is not the problem. The problem is that its task is to provide a certain level of quality; they are used to reaching that level and they take on that responsibility. And it is hard to turn this around if it has not been that way from the very start of your product. What recipes are there? The most correct one is not to switch on hard mode where we fire everyone and automation takes over everything. That is probably the scariest and most immature approach I have seen. What is wrong with it? First, the quality of testing will drop for a while. Second, all the processes get destroyed, and new ones are not built quickly.



So what did we do? We started our optimization by writing UI tests that replace regression - full-fledged infrastructure tests that hit the backend with test users. The by-products of this work are, as you may know, various popular frameworks, for example Kaspresso; that is exactly what we laid down when we started. We left behind a lot of artifacts that help developers, which is why it is easier to get into testing now. We also open-sourced various test runners, and anyone can see how we work with them. But we did not forget about manual testing, about optimizing it, and about how these two departments gradually merge into one effective process. Point B is probably the Revolut state, but our road from point A to point B, as for many other companies, is taking a long time. Right now we are at the stage where QA plays the role of researchers: they immerse themselves more deeply in the product, work on functional requirements and write autotests.
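
For readers who have not seen it, a scenario test in the spirit of the step-based DSL that Kaspresso popularized might look roughly like this. MainActivity and the view ids are hypothetical, and the Kaspresso import path is given from memory and may differ between versions; treat it as a sketch rather than a reference.

```kotlin
import androidx.test.espresso.Espresso.onView
import androidx.test.espresso.action.ViewActions.click
import androidx.test.espresso.assertion.ViewAssertions.matches
import androidx.test.espresso.matcher.ViewMatchers.isDisplayed
import androidx.test.espresso.matcher.ViewMatchers.withId
import androidx.test.ext.junit.rules.ActivityScenarioRule
import com.kaspersky.kaspresso.testcases.api.testcase.TestCase
import org.junit.Rule
import org.junit.Test

// Each step gets its own name in the report, which is what makes such tests
// usable as a replacement for a manual regression checklist.
class SearchFlowTest : TestCase() {

    @get:Rule
    val activityRule = ActivityScenarioRule(MainActivity::class.java)

    @Test
    fun searchResultsAreShown() = run {
        step("Open search from the main screen") {
            onView(withId(R.id.search_button)).perform(click())
        }
        step("Results list is displayed") {
            onView(withId(R.id.results_list)).check(matches(isDisplayed()))
        }
    }
}
```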



The most interesting practice for improving manual regression is impact analysis, that is, an attempt to answer the questions: "What has changed in this release? What should we test, and what can we pass on to the next stages with peace of mind?" Impact analysis is a hard problem, because with a long release cycle - say you release every 2-3 months - it will always give you the same answer: over that much time hardly any part of the application is left untouched. But if you shorten the release cycle to a week, or better still to a day, impact analysis starts showing quite adequate results and leaves traces that help optimize manual regression. We applied this practice quite successfully; there were mistakes at the beginning, but we did cut down the amount of manual testing.
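
A toy sketch of the impact-analysis idea: map the modules changed in a release to the test suites that actually need to run, and fall back to full regression for anything unknown. The module and suite names are invented for illustration.

```kotlin
// Which suites cover which modules; in a real setup this could be derived
// from the build graph or from code-to-test-case traceability data.
val suitesByModule = mapOf(
    "feature-payments" to setOf("payments-regression", "checkout-smoke"),
    "feature-profile" to setOf("profile-regression"),
    "core-network" to setOf("payments-regression", "profile-regression", "api-contract"),
)

fun suitesToRun(changedModules: Set<String>): Set<String> =
    changedModules.flatMap { suitesByModule[it] ?: setOf("full-regression") }.toSet()

fun main() {
    // Short release cycle: one module changed, so only its suites run.
    println(suitesToRun(setOf("feature-profile")))
    // Long release cycle: everything changed, and impact analysis degenerates to "test it all".
    println(suitesToRun(setOf("feature-payments", "feature-profile", "core-network")))
}
```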



The next practice is optimizing the test model. Oddly enough, tests have legacy of their own: tests get written, they may not be very optimal, then something else gets bolted on, and the test cases are never reworked to reflect it ... A detailed analysis showed that the number of test scenarios could be cut several times over.



These three directions allowed us to reach the point where we release to beta once a day, a build reaches 100% of users once a week, and there is no manual regression. I hope this story motivates companies that are unhappy with the state of their releases to act, so that in the future you only press the release button, everything goes out to users, and everyone just watches the charts.



Yuri Chechetkin: These are, of course, not only Revolut practices but worldwide ones; Google, Facebook and others use them. I agree that the transition should be smooth. As many testers become POs or move into automated QA, it all blurs a little, evolves and turns into what was just described. In Russia this trend is only beginning, and as you rightly said, it should be as healthy as possible.



Nina Semkina: There was another question: where does everyone run unit tests - on CI or locally before pushing?



Yuri Chechetkin: Running them locally seems to be the developer's own business; you should not force it. To me it is obvious that on CI they must run, one hundred percent.
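
One common way to keep the local and CI gates identical is a single Gradle task that both developers and the pipeline invoke. A sketch in Gradle's Kotlin DSL; the task names assume the Android Gradle plugin's default debug variant.

```kotlin
// build.gradle.kts (sketch): one convenience task developers can run before
// pushing, and the exact same task the CI pipeline runs on every pull request.
tasks.register("prePushCheck") {
    group = "verification"
    description = "Fast local gate: unit tests plus lint before pushing."
    dependsOn("testDebugUnitTest", "lintDebug")
}

// Locally and on CI:
//   ./gradlew prePushCheck
```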



Nina Semkina: Thanks to all the participants for the discussion! I give the floor to our speaker Dima Manko and his talk.


