How we "dispersed" the QA team, and what came of it

Or how to get non-obvious consequences by giving up a dedicated testing team.

A year and a half ago, we disbanded the testing team: we gave up manual regression, handed the Selenium E2E autotests over to developers to maintain, and joined the teams that build features in order to nip bugs in the bud. In our rosy dreams, this looked like a clear win: QA works on quality, testing starts early, and developers write autotests themselves with no one getting in their way.

But it ~~didn't work out~~ didn't turn out quite like that. The rosy dreams took on additional shades: no one thinks about overall quality, the autotests are getting worse, and developers without a QA team (~~suddenly~~) have more work. These were second-order consequences, and we were not ready for them. Now we are fixing them, and we can tell you what these consequences are, how they arise, what damage they cause, and how to try to predict them so that it hurts less.

What first- and second-order consequences are

Ray Dalio's Principles has the concept of "second-order consequences": non-obvious consequences of our decisions that we often fail to predict. For example, in the late 1950s and early 1960s China waged a war on sparrows. The sparrows ate grain, so to stop them the country began hunting the birds en masse. During the campaign, nearly two billion birds were reportedly killed.

As a result of the extermination of the sparrows, the harvest increased a year later. That was a first-order consequence. But there was no one left to eat the insects: locusts and caterpillars multiplied and began to destroy even more crops, which led to a massive famine in China in the following years. Those were second-order consequences.

First-order consequences are the direct results of our decisions and always lie on the surface. Second-order consequences are subtle and often long-term. To see them, you need to think and play the situation through. For instance:

If you pay developers by the number of lines of code, there will be more code, but its quality will be worse. Over time, people will start to cheat and produce more and more bad code to earn more money. These are second-order consequences.
If you start exercising, at first it will hurt and eat up your time. But after a while (from a week to several months) a habit will form, and your health and appearance will improve. These are second-order consequences.
If you get drunk as a pig every Friday, Friday night will be good. But Saturday morning will be bad: these are second-order consequences. And if you do this regularly for years, it may well develop into alcoholism and cirrhosis of the liver. But those are already third-order consequences and "a completely different story."

We had a QA team and we "dispersed" it

Now I will tell you how we felt both first- and second-order consequences ourselves. We had a cozy dedicated team of 7 testers: 4 of them wrote autotests, and 3 tested manually. At some point, we decided to split up and disperse into the development teams. Why?

Because the developers received feedback too late.

Bugs were found at the stage when all the development was “finished”, everything was “integrated”, and all that remained was to check that the product was ready for release. There was no real acceptance testing: it was performed by product analysts who did not have testing skills. In addition, testers and developers lived in different worlds and interacted little.

The obvious (at the time) solution was to spread out across the teams that work on particular features (parts) of the system and prevent bugs in the bud. We didn't want to simply drop our existing duties, so we decided to transfer them to the developers. As for the autotests, we thought we would hand them over to the developers, who would then test everything themselves without problems.

First, we decided to test the hypothesis on ourselves with an "experiment": cover the critical regression scenarios with autotests and give up manual regression. If the number of hotfixes and release rollbacks did not increase dramatically as a result, the experiment could be considered a success. And so it happened: there were no more hotfixes than before. Decision made: we disperse.
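The success criterion of such an experiment can be written down as a simple check. Below is a minimal sketch in Python, assuming incidents are counted as hotfixes plus rollbacks per release. The `ReleaseStats` type, the `max_growth` threshold of 1.5, and all the numbers are hypothetical and made up for illustration; "dramatically" has to be defined by each team for itself.

```python
from dataclasses import dataclass


@dataclass
class ReleaseStats:
    """Counts for one observation period (hypothetical structure)."""
    releases: int
    hotfixes: int
    rollbacks: int

    @property
    def incident_rate(self) -> float:
        """Hotfixes plus rollbacks per release."""
        if self.releases == 0:
            raise ValueError("no releases in this period")
        return (self.hotfixes + self.rollbacks) / self.releases


def experiment_succeeded(before: ReleaseStats, after: ReleaseStats,
                         max_growth: float = 1.5) -> bool:
    """True if the incident rate grew by no more than max_growth times.

    max_growth is an assumed threshold, not a value from the article.
    """
    return after.incident_rate <= before.incident_rate * max_growth


# Made-up numbers: with manual regression vs. with autotests only.
with_regression = ReleaseStats(releases=20, hotfixes=4, rollbacks=1)
autotests_only = ReleaseStats(releases=20, hotfixes=5, rollbacks=1)
print(experiment_succeeded(with_regression, autotests_only))
```

Normalizing by the number of releases matters here: if releases become more frequent after the change, raw hotfix counts alone would make the experiment look worse than it is.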

Note. The company has a product called Restaurant. It includes all the services and our monolith. The goal of the product is to automate and optimize the work of all restaurant employees as much as possible. Our work is now more focused on preventing errors. We are now QA in the "Restaurant" product: we build quality into the product and take part in every stage of work on a task.

First and second order consequences

Direct consequences. As expected, we began to get involved in tasks from the very beginning: we take part in PBR, planning, and workshops and bring testing expertise to them. We became closer to the development teams, or rather a part of them, and our problems became the teams' problems too. Expertise in testing and quality assurance and broad knowledge of the system began to grow in the teams. We, in turn, began to immerse ourselves in the developers' work and understand their pains.

Now for what we didn't plan: the second-order consequences.

Nobody drives the quality of the product. There are 2 sides to this problem:

quality in terms of processes;
quality of autotests and pipeline.

In our dedicated QA team, we drove quality. We were the last ones to see the product before the users did, and we understood how they see it. We discussed changes and improvements at the team retro and came to the development teams with proposals to decide together whether to adopt them. We monitored the autotests and worked on their stability.

After we dispersed into the teams, all of this disappeared somewhere. In a development team, we are part of the team, ~~part of the ship~~: we immersed ourselves completely in its work, our eyes glazed over, and the overall quality of the product became something distant.

All ideas were aimed only at improving the state of our own team: we did everything to release a quality feature, not a quality product. As a result, fundamental, strong solutions that could raise product quality to a new level stopped appearing.

The competence in writing autotests disappeared: the autotests began to wither and failed more often with no changes to the code. By the time the team was disbanded, the manual testers were only beginning to grasp the basics of automation. It turned out that neither the testers nor the developers had the expertise. On top of that, the remaining grains of expertise were scattered when the people who had written these tests moved into development or product management, and some quit.

We did not reliably know which autotests we had or what they covered, and we did not know how they were evolving or which were being added or removed: everything was left to the developers. As a result, whenever we needed to find some information in the autotests, it was a quest we could not complete without a developer.

Extra work for developers. It's hard being a developer. Earlier they wrote product code that was “magically” verified and went to production; now they have to write, fix, and stabilize tests themselves. At PBR we determine which scenarios should be covered by tests, and the developers choose the level of the autotests themselves.

The developers went through several stages of accepting the ~~death of the~~ pipeline.

Denial. All Dodo IS releases are rolled out by developers. They organize the process, communicate with the load testing team, and watch the logs and monitoring during the release. When the developers rolling out a release ran into a red test, they did not try to find the cause; they simply restarted the pipeline 5, 7, or 10 times until it turned green. This was because there was no trust in the autotests.

The maximum number of restarts I have found is 44! It seems to me that what helped was the rule we adopted at one of the retros: “Don't release with red tests. If a test is red, figure out what the problem is. If the problem is in the test, fix it, or mark it as ignored and add a card to the backlog to re-enable it.”
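A rule like this can be backed by a small check in the pipeline. The sketch below is a hypothetical illustration in Python, not a description of our actual tooling: `RunResult`, `find_flaky_tests`, and `may_release` are made-up names, and the test names are invented. It treats a test as flaky if it both passed and failed on the same commit (exactly what blind restarts hide) and allows a release only when every red test has been explicitly ignored.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class RunResult:
    """One test outcome from one pipeline run (hypothetical structure)."""
    commit: str      # revision the pipeline ran against
    test_name: str   # invented test identifier
    passed: bool


def find_flaky_tests(runs):
    """Return sorted names of tests that both passed and failed on one commit."""
    outcomes = defaultdict(set)
    for run in runs:
        outcomes[(run.commit, run.test_name)].add(run.passed)
    # Two distinct outcomes for the same code means the test itself is unstable.
    flaky = {name for (_, name), seen in outcomes.items() if len(seen) == 2}
    return sorted(flaky)


def may_release(latest_results, ignored):
    """Allow a release only if every red test is explicitly ignored."""
    red = {name for name, passed in latest_results.items() if not passed}
    return red <= set(ignored)


runs = [
    RunResult("abc123", "test_order", False),
    RunResult("abc123", "test_order", True),   # green after a restart
    RunResult("abc123", "test_payment", False),
    RunResult("abc123", "test_payment", False),
]
print(find_flaky_tests(runs))  # ['test_order']
print(may_release({"test_order": True, "test_payment": False}, ignored=set()))  # False
```

In this sketch `test_payment` fails consistently, so it looks like a real problem rather than flakiness, and the release stays blocked until someone fixes it or consciously ignores it with a backlog card.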

Anger: the developers swore at our tests and said they were ~~shit~~ unstable and poorly written and needed to be redone, thrown out, and rewritten (in that order).

There was no bargaining or depression; acceptance came right away: developers can now write E2E UI and API tests themselves, stabilize them, and improve them.

The number of bugs in production began to grow. Non-critical bugs began to seep into production. There are several reasons for this:

Our autotests do not cover all the functionality, only the critical parts, and there is no more manual regression testing.
There weren't enough QA engineers for all the teams. The teams lacked testing competencies, so they did not pay due attention to testing.

As a result, we started stumbling on bugs in production by accident. They are not critical, but we had no idea how many of them there were overall.

How we solve these problems

Perhaps another team could have predicted all the non-obvious consequences, but we could not. We made a decision, after a few months saw the consequences, and began to eliminate them.

We created a Restaurant QA guild, or community of practice, which includes all the Restaurant QAs. The goal of the community is to drive the quality of the entire product and spread good testing practices to all product teams. It is a structure that combines the advantages of a dedicated QA team with the benefits of being QA inside a development team.

We meet once a week: we share results and discoveries and plan joint work on quality. We also set aside several slots a week for guild tasks. For example, we are finishing our assistant bot for the people who roll out releases.

Pipeline duty. The guild partially solves the problem of quality and the autotests having no owner. But the guild does not have strong competencies in development and automation, so our CTO made an executive decision and set up a duty rotation on the pipeline.

Now developers can systematically improve the pipeline: stabilize it, find the problems that delay releases, and fix them. One developer from a development team becomes the owner of the pipeline for a month and systematically improves it. This person does not roll out releases but improves the process, making releasing and maintaining tests easy and effortless. Once the product metrics improved, we dropped this duty, but we can bring it back at any time. (While this article was being written, we did bring it back, because we noticed stability starting to degrade.)

Courses. We are closing the competence gap with courses for manual testers and pair work with developers who have experience in automation.

Extra work for developers. There is nothing to be done about it; the developers have simply reached the stage of accepting autotests. Now they write E2E tests themselves when lower-level tests cannot cover a feature, and they stabilize the pipeline. As the smart books say, it is good practice when the whole team, both developers and testers, can write tests. Our move toward carving the monolith into microservices also helps: there are fewer tests in the monolith and more and more in separate repositories, so the pipeline is becoming more stable.

We investigate the product. We tackle bugs in production by exploring the product for deviations from expected behavior. We hold scheduled weekly exploratory testing sessions and bring the bugs we find to the product owner for the backlog.

What would we do now?

Failing to consider second- and third-order consequences led us to poor decisions. It is especially dangerous when the first option, which is not the best one, reinforces an already existing bias. But now, with all the experience gained, we would act differently.

For example, the loss of competencies could have been addressed by asking the people with automation expertise, a few months before the transition, to share it with all QA engineers in the product or with developers from the teams. Better yet, with everyone at once.

There is no way to fully compensate for the extra work for developers, but the pain of writing tests could have been reduced by not presenting them with a fait accompli and instead:

showing the value of tests explicitly;
teaching developers to write, improve, and maintain these tests.

When we went our separate ways, we didn't even think about these problems. In hindsight it seems elementary: how could we not have thought of it? But we are all wise in hindsight; try predicting the future.

What are second- or third-order consequences for me may be first-order consequences for more experienced people who have made such decisions many times and seen their results.

There are too many uncertainties and variables affecting the results.

What matters is not so much predicting the consequences exactly as at least knowing what they might be. Before making any decision, think about what the consequences are likely to be and read about similar cases in other companies, so that you at least have an idea of the scale of the possible non-obvious consequences.

Anyone who learns to predict the second-order (and even third-order) consequences of any decision will be able to save or destroy humanity. Or make more money than Scrooge McDuck, at least on stock price fluctuations.

How I am going to try to predict consequences now

I read articles on this topic and derived a few rules for myself that, according to the authors, help to predict such consequences. I'll try to use them:

Before making a decision, ask yourself "What will happen next?" and add time frames to the question. What will happen in 10 minutes, 10 months, or 10 years?
Train yourself to think about such consequences by reflecting on different situations. For example, what would the first-, second-, or even third-order consequences be if the whole world switched to electric cars or introduced an unconditional basic income? There are no correct answers in this exercise, but it will help you think more broadly.
Remember that the first thought in your head is first order. Always.

If you ran into other problems when reorganizing a testing team or other teams, write in the comments: it will be interesting to know what problems you faced and how you solved them.

