Synthetic data: improving perception algorithms and optimizing the search for edge cases

image



Critical sensing systems require huge amounts of data in order to cover all the edge cases encountered in the real world. One of the most common approaches to training algorithms for self-driving cars is collecting and labeling real driving data. At CVPR 2020, Andrej Karpathy said that Tesla also uses this approach, with their cars adapting object labels online. "Variation and control" is very important here: engineers constantly adapt the ontology and methodology for labeling data, since self-driving cars are constantly faced with new scenarios that need to be analyzed.



However, this data-driven approach has limitations due to scalability, data-collection costs, and the sheer amount of effort required to label datasets accurately. In this post, the Applied team discusses an approach based on synthetic labeled data, which makes training and developing critical algorithms for autonomous vehicles faster and more cost-effective.



image



An example of synthetic camera data with ground-truth annotations: original RGB image (top left), 2D boxes (top right), semantic segmentation (bottom left), and 3D boxes (bottom right).



The current approach to data labeling and its problems



Figure 2 shows a typical approach to creating labeled datasets. This is a very time-consuming process: test drivers drive vehicles equipped with multiple sensors in manual or autonomous mode. During these trips, special software built into the vehicle records raw sensor data and the output of the perception, control, and planning modules. Special vehicles may need to be built during development, since production vehicles may lack the accurate sensors required for data collection. After the data is collected, the difficult task of selecting the subset to be labeled arises. This requires careful selection of specific and interesting events, after which the datasets are sent to companies that do the labeling (it is advisable to minimize the size of the dataset to save on labeling costs). Sometimes this includes searching the logs for specific edge cases (like a plastic bag flying across the freeway). Re-collection and re-labeling may also be required whenever the configuration of any of the sensors changes.



image



Figure 2: A typical pipeline for collecting and labeling real-world driving data



While labeling may be the only way to prepare the raw data needed to train autonomous driving algorithms, the main disadvantage of this approach is the investment required to scale it. Test drivers may need to travel hundreds or thousands of kilometers to encounter any given edge case. Tesla, for example, has a fleet of more than a million production cars that collect huge amounts of data on the company's behalf: stop signs in different languages, different locations, data validation, and more. Most OEMs do not have enough vehicles to collect such datasets. Even if huge amounts of driving data were available, there is still no guarantee that the needed events would appear in the datasets. In that case, special collection campaigns have to be run, which increases development cost and stretches the timeline.



Another aspect is the availability of specific conditions. At the time of this writing, the US is experiencing extreme weather: the sky has turned orange, sometimes even red (Fig. 3). If no vehicles are in an area when such conditions occur, it may take years for the extreme conditions to recur so the data can be collected. Otherwise the dataset will be skewed, because it contains no samples of such conditions.



image



Figure 3: Extreme conditions are difficult to predict and capture in self-driving vehicle datasets. Source: CBS News.



In addition, self-driving vehicle developers are always exploring new designs, and significant infrastructure is required to process data efficiently. Many queries over this data assume the data already has tags or labels; the problem is that if a given label type has not been used before, those labels may simply not exist. Finally, the cost of labeling data is quite high, and data is often labeled manually, so there is a high probability of errors and inaccuracies (for example, when one car partially occludes another in the image).



Using synthetic data and its benefits



Synthetic data provides an alternative approach that is more scalable and accurate. Because synthetic data is generated from simulation, the ground truth (semantic vehicle labels or the text on road signs) is known exactly. Simulations can also provide accurate albedo, depth, reflectance, and roughness for every object in the scene (Figure 4). In addition, objects come with pixel masks and semantic labels. All of this allows annotations to be created automatically, with no need to manually tag sensor data. While dedicated extraction software may be required to generate individual annotation types, it is a one-time investment that then lets you create and use new label classes.
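To make the idea of automatic annotation concrete, here is a minimal sketch of how pixel-perfect 2D boxes can be derived from a simulator's per-pixel instance-ID mask. The function name and the toy mask are illustrative assumptions, not part of any particular simulator's API:

```python
import numpy as np

def boxes_from_instance_mask(mask: np.ndarray) -> dict[int, tuple[int, int, int, int]]:
    """Derive pixel-perfect 2D boxes (x_min, y_min, x_max, y_max) from an
    instance-ID mask, where 0 means background."""
    boxes = {}
    for obj_id in np.unique(mask):
        if obj_id == 0:  # skip background
            continue
        ys, xs = np.nonzero(mask == obj_id)
        boxes[int(obj_id)] = (int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max()))
    return boxes

# Toy 6x6 mask with two objects (instance IDs 1 and 2)
mask = np.zeros((6, 6), dtype=np.int32)
mask[1:3, 1:4] = 1
mask[4:6, 3:5] = 2
print(boxes_from_instance_mask(mask))
# {1: (1, 1, 3, 2), 2: (3, 4, 4, 5)}
```

Because the mask comes from the renderer rather than from a human annotator, the boxes are exact even for partially occluded objects.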



image



Figure 4: Ground-truth data available from simulation: albedo, depth, reflectance, and surface roughness for each object in the scene



Another notable advantage of synthetic data is that it lets you create many variations of the same scene without having to travel the world and rely on luck. Synthetic data also allows you to focus on the specific objects developers care about. With the right generation pipeline, millions of road-sign variants can be simulated in a matter of hours. These variants may include different lighting conditions, object placement, various occlusions, and damage (rust, oil stains, graffiti). Synthetic data can thus complement data collected from the real world: complex real-world events can serve as a starting point from which thousands of variations of the original scene are created.
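The variation process described above is often called domain randomization. The sketch below samples randomized scene configurations for a single base road-sign scene; the parameter names and value lists are hypothetical stand-ins for whatever a real simulator would accept:

```python
import random

# Hypothetical parameter space for one road-sign scene; a real simulator
# would consume configs like these to render each variant.
LIGHTING = ["dawn", "noon", "dusk", "night", "fog", "rain"]
OCCLUSION = ["none", "tree branch", "parked truck", "pedestrian"]
DAMAGE = ["none", "rust", "oil stains", "graffiti", "dents"]

def sample_scene_configs(n: int, seed: int = 0) -> list[dict]:
    """Sample n randomized variations of the same base scene,
    reproducibly via a fixed seed."""
    rng = random.Random(seed)
    return [
        {
            "lighting": rng.choice(LIGHTING),
            "occlusion": rng.choice(OCCLUSION),
            "damage": rng.choice(DAMAGE),
            "sign_yaw_deg": rng.uniform(-15.0, 15.0),  # placement jitter
        }
        for _ in range(n)
    ]

configs = sample_scene_configs(1000)
print(len(configs))  # 1000
```

Seeding the generator makes every batch reproducible, which matters when a failing variant needs to be regenerated exactly for debugging.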



Diversity is also important geographically. To encounter foreign road signs with the specific modifications used in individual countries, test vehicles would need to travel to those countries. Likewise, a test car can travel hundreds of kilometers to find a specific road sign, only for it to turn out half-blocked by a school bus. All of these difficulties can be avoided by instantly creating the necessary scenes with synthetic datasets (Figure 5). Because a wide range of scenarios can be built on synthetic data, algorithms can be tested against many edge cases (Fig. 6). This post describes how Kodiak Robotics (a self-driving truck company) uses synthetic simulations for training and testing: they verify that their Kodiak Driver system adequately handles various edge cases.



image



image



Figure 5: Examples of different road signs in Europe and the US



image



Figure 6: Modification of road conditions and markings in synthetic data



Another important use case is obtaining ground-truth samples that cannot be collected from sensors or added manually. A typical example is exact per-pixel depth from a monocular or stereo camera. Real-world data does not tell us the depth of each individual pixel, and it is impossible to compute or hand-label it exactly.
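A simulator, by contrast, can emit an exact depth value for every pixel. Once you have such a depth map, a standard pinhole-camera back-projection turns it into a camera-frame point cloud. This is a generic sketch (the intrinsics and the 2x2 depth map are toy values, not from any real sensor):

```python
import numpy as np

def depth_to_points(depth: np.ndarray, fx: float, fy: float,
                    cx: float, cy: float) -> np.ndarray:
    """Back-project a per-pixel depth map (meters, pinhole model)
    into an (H*W, 3) array of camera-frame 3D points."""
    h, w = depth.shape
    us, vs = np.meshgrid(np.arange(w), np.arange(h))
    x = (us - cx) * depth / fx  # lateral offset from the optical axis
    y = (vs - cy) * depth / fy  # vertical offset
    return np.stack([x, y, depth], axis=-1).reshape(-1, 3)

# Toy 2x2 ground-truth depth map and toy intrinsics
depth = np.array([[1.0, 2.0], [1.5, 3.0]])
pts = depth_to_points(depth, fx=100.0, fy=100.0, cx=0.5, cy=0.5)
print(pts.shape)  # (4, 3)
```

With real camera data, the `depth` array in this sketch is exactly what cannot be measured per pixel; with synthetic data, it comes for free from the renderer's z-buffer.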



Synthetic data requirements



Sensor data



For synthetic data to be useful for testing and training autonomous-vehicle algorithms, the simulated sensor data and annotations must meet certain criteria. As we wrote earlier in the post on sensor modeling, the large synthetic sensor datasets used to develop autonomous vehicles should be generated cheaply and quickly (in a few days). Simulated sensors should also be modeled according to the basic physical principles of each sensor type. The most important factor is the fidelity of the models: there is a trade-off between the plausibility gap (how differently algorithms perceive real versus synthetic data) and the speed of data generation. This gap can vary with the type of sensor being simulated, the surrounding objects, and the environmental conditions. It is also very important to be able to quantify this gap and use the resulting estimate to shape the strategy for using synthetic data. As an example, Figure 7 shows how a lidar model responds to a wet road: you can see returns at ground level and spray from the vehicles nearby.
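One simple way to quantify a plausibility gap, sketched below under the assumption that you compare matching low-level statistics (here, hypothetical lidar return-intensity samples), is a divergence between real and simulated distributions. The data here is randomly generated purely for illustration:

```python
import numpy as np

def jensen_shannon(p: np.ndarray, q: np.ndarray) -> float:
    """Jensen-Shannon divergence (in bits, range [0, 1])
    between two histograms; 0 means identical distributions."""
    p = p / p.sum()
    q = q / q.sum()
    m = 0.5 * (p + q)

    def kl(a, b):
        nz = a > 0  # wherever a > 0, m > 0 as well, so the ratio is safe
        return float(np.sum(a[nz] * np.log2(a[nz] / b[nz])))

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

# Stand-in samples: e.g. lidar return intensities from real vs. simulated
# drives (synthetic random numbers here, for illustration only).
rng = np.random.default_rng(0)
real_intensity = rng.normal(0.55, 0.10, 10_000)
sim_intensity = rng.normal(0.50, 0.12, 10_000)

bins = np.linspace(0.0, 1.0, 51)
h_real, _ = np.histogram(real_intensity, bins=bins)
h_sim, _ = np.histogram(sim_intensity, bins=bins)
gap = jensen_shannon(h_real.astype(float), h_sim.astype(float))
print(f"plausibility gap (JSD, bits): {gap:.3f}")
```

Tracking a metric like this per sensor and per environmental condition is one way to decide where a sensor model is good enough and where it still needs work; task-level metrics (e.g. detector accuracy on real vs. synthetic data) are a complementary measure.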



image



Figure 7: Response of a lidar model to a wet road surface







Another important aspect of working with synthetic data is the variety of environments and of the materials found in them. Environments must be generated quickly from real maps and data, as shown in Figure 8. The ability to create such environments quickly depends on procedural generation techniques. Being able to model any geographic region in the world is another major advantage of synthetic data over real data. However, while different locations are easy to create, misconfigured generation methods can duplicate areas and data. A key question in this area is finding the right balance between repetition in the data and reflecting the diversity of the real world. Diversity must be considered both at the macro level (how much the road surface changes over a kilometer of route) and at the micro level (for example, how individual materials in the environment differ).
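The macro/micro split and the duplication risk can be sketched as seeded procedural generation plus a deduplication pass. The parameter names and value ranges below are invented for illustration; a real pipeline would drive an actual world-building tool:

```python
import hashlib
import random

def generate_segment(seed: int) -> dict:
    """Procedurally generate one 1 km road segment: macro parameters
    (surface type, lane count) plus micro material parameters."""
    rng = random.Random(seed)
    return {
        "surface": rng.choice(["asphalt", "concrete", "cobblestone", "gravel"]),
        "lanes": rng.randint(1, 4),
        "material": {  # micro-level variation
            "roughness": round(rng.uniform(0.3, 0.9), 3),
            "albedo": round(rng.uniform(0.05, 0.25), 3),
            "wear": round(rng.uniform(0.0, 1.0), 3),
        },
    }

def dedup(segments: list[dict]) -> list[dict]:
    """Drop exact duplicates, which misconfigured generation can produce."""
    seen, unique = set(), []
    for seg in segments:
        key = hashlib.sha1(repr(sorted(seg.items())).encode()).hexdigest()
        if key not in seen:
            seen.add(key)
            unique.append(seg)
    return unique

segments = [generate_segment(s) for s in range(100)]
print(len(dedup(segments)))
```

Hashing a canonical representation of each segment is a cheap way to catch exact repeats; detecting near-duplicates (segments that are technically distinct but visually indistinguishable) requires perceptual metrics and is the harder part of the problem.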



The importance of materials for rendering physically plausible environments was discussed in a previous post; usually the textures that make up these materials are scans of real surfaces. Creating combinations and variations of these materials to add variety to the generated data can be critical both for training algorithms and for testing them.



image



Figure 8: Procedurally generated high quality urban environment.



Annotations



The requirements for data annotations depend both on the use cases and on the algorithms. The standard annotation types for real-world data are presented in Table 1.



Type       Details
Semantic   Semantic segmentation (pixel or point)
Cuboid     For images, lidar points, or radar reflections
Frame      Pixel annotation for 2D boxes




Table 1: Types of annotations for real-world data



In the case of synthetic data, much richer ground truth is available for generating the same kinds of annotations as those captured in collected data. The reference data is also reproduced with point/pixel precision. Finally, both the sensor data and the annotations can be expressed in any frame of reference (the world, the ego vehicle, an individual sensor, etc.).
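Re-expressing annotations in different frames of reference is a matter of composing rigid-body transforms. The sketch below moves a cuboid center from the world frame into the ego frame using 4x4 homogeneous matrices; the ego pose and the point are toy numbers for illustration:

```python
import numpy as np

def make_transform(yaw_rad: float, translation: np.ndarray) -> np.ndarray:
    """Build a 4x4 homogeneous transform: rotation about Z plus translation."""
    c, s = np.cos(yaw_rad), np.sin(yaw_rad)
    T = np.eye(4)
    T[:3, :3] = [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]
    T[:3, 3] = translation
    return T

def to_frame(points: np.ndarray, transform: np.ndarray) -> np.ndarray:
    """Re-express (N, 3) points in another frame via a 4x4 transform."""
    homo = np.hstack([points, np.ones((len(points), 1))])
    return (transform @ homo.T).T[:, :3]

# A cuboid center annotated in the world frame, re-expressed in the ego
# frame. Toy ego pose: 90-degree yaw, positioned at x = 10 m.
ego_to_world = make_transform(np.pi / 2, np.array([10.0, 0.0, 0.0]))
world_to_ego = np.linalg.inv(ego_to_world)
center_world = np.array([[10.0, 5.0, 0.0]])
print(to_frame(center_world, world_to_ego))  # ~[[5, 0, 0]]: 5 m ahead of ego
```

The same composition extends to any sensor mounted on the vehicle: chain `world_to_ego` with an `ego_to_sensor` extrinsic to land annotations directly in that sensor's frame.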



Table 2 shows the standard annotation types for data generated by simulation. In addition, many data formats and types can be further customized.



Type                 Details
Semantic             Semantic segmentation (pixel or point)
Cuboid               For images, lidar points, or radar reflections
Physical properties  Albedo, surface normals, depth, surface roughness, reflections, metallicity, reflective surfaces, optical properties




Table 2: Annotation Types for Synthetic Data



Using all of these additional ground-truth data types dramatically speeds up algorithm development. The sheer scale, quality, and volume of this data allow engineers to make decisions faster.



image



Figure 9: Annotated synthetic data showing pixel-perfect 2D boxes









