Voxels have powered lighting in the Roblox world for the past four years. But sooner or later everything comes due for a change, so the developers asked themselves what should come next.
Lighting is tricky, so many factors have to be weighed when choosing a new technology. To inform the decision, Roblox prototyped two candidate systems: voxels and shadow maps. To understand the limitations of each, it is first important to understand how they work.
Note: The screenshots in the article are arranged so that voxels are always shown on the left, and shadow maps on the right.
Implementation: voxels
Although a voxel system has been in the game for a long time, the variant considered here has undergone many improvements.
The world data is converted into a set of voxel grids: each grid is centered on the character, and the grids use voxel sizes from 1 to 16 (5 grids in total). Each voxel contains occupancy information ranging from 0 to 100%. Lighting data is then computed for each voxel in each grid based on this occupancy and on light source / direction information. All of the above happens on the GPU, as the CPU is not fast enough to update that many voxels at such a high density.
The system stores all data in voxels. For each available voxel it keeps:
- Fullness (multiple values describing how full each voxel is);
- Skylight (how much of the sky is visible from the voxel);
- Sun shadow (how much of the sun is obscured as seen from the voxel);
- Light color and cone (an approximation of the color and cone of influence of local light sources on the voxel).
This information is later used to calculate the color of each pixel at a given screen resolution. Screen and voxel resolutions can be adjusted independently of each other. Parts of the voxel grid can be updated frame by frame as the lights / objects move.
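The cascade layout described above can be sketched as follows. This is a minimal illustration, not the engine's implementation: the names `Voxel`, `pick_cascade` and `voxel_coords` are hypothetical, the 128x64x128 grid dimensions are borrowed from the memory section of this article, and the rule "use the finest cascade that contains the point" is an assumption.

```python
from dataclasses import dataclass

# Assumed layout: five cascades with voxel sizes 1..16, all centered on the
# character. The grid dimensions are taken from the memory section.
VOXEL_SIZES = (1, 2, 4, 8, 16)
GRID_DIMS = (128, 64, 128)


@dataclass
class Voxel:
    """Per-voxel data listed in the article (field names are hypothetical)."""
    fullness: float = 0.0                  # 0.0 (empty) .. 1.0 (full)
    skylight: float = 1.0                  # fraction of the sky that is visible
    sun_shadow: float = 0.0                # fraction of the sun that is obscured
    light_color: tuple = (0.0, 0.0, 0.0)   # approximated local light color


def pick_cascade(point, center):
    """Return the index of the finest cascade whose grid contains `point`,
    or None if the point lies outside every cascade."""
    for index, size in enumerate(VOXEL_SIZES):
        half_extents = [dim * size / 2 for dim in GRID_DIMS]
        if all(-h <= p - c < h for p, c, h in zip(point, center, half_extents)):
            return index
    return None


def voxel_coords(point, center, cascade):
    """Integer voxel coordinates of `point` inside the given cascade."""
    size = VOXEL_SIZES[cascade]
    return tuple(
        int((p - c) // size + dim // 2)
        for p, c, dim in zip(point, center, GRID_DIMS)
    )
```

With the character at the origin, a point 10 units away falls into the finest cascade, while a point 100 units away only fits into the second one, where each voxel is twice as large.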
Implementation: shadow maps
This method uses rasterization to compute most shadow effects. It works in three stages. First, for each shadow-casting light, we update its shadow map, which in effect casts many rays from the light source into the scene and stores the intersection results. We then build a spatial acceleration structure, essentially a grid of voxels shaped to the view frustum (also known as a froxel grid), into which we insert each visible light object.
The grid covers the entire part of the game world visible to the camera. In each froxel, we store a list of all light objects that intersect it. Finally, when rendering the scene, for each pixel we find the froxel that contains it, iterate over the lights listed there, and compute the contribution of each light separately using the shadow maps built in the first stage.
The system stores data in two structures:
- Shadow atlases (all visible light shadow maps packed into one large texture);
- Light grid (a froxel grid that efficiently maps a point in camera space to a list of lights).
The color of each pixel is calculated dynamically and is not stored explicitly. Parts of the shadow atlas can be updated frame by frame as the lights / objects move.
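The light grid lookup can be sketched roughly as below. Everything here is an assumption made for illustration: the grid resolution, the use of normalized frustum coordinates in [0, 1)^3, the approximation of each light by an axis-aligned bounding box, and the names `froxel_of`, `build_light_grid` and `shade`. In a real renderer the per-light contribution would also sample the shadow atlas.

```python
from collections import defaultdict

GRID = (16, 8, 24)  # froxels along x, y and depth (hypothetical resolution)


def froxel_of(point):
    """Map a point in normalized frustum coordinates [0, 1)^3 to a froxel index."""
    return tuple(min(int(p * n), n - 1) for p, n in zip(point, GRID))


def build_light_grid(lights):
    """Bin lights into froxels.

    `lights` maps a light name to (min_corner, max_corner), a conservative
    bounding box in normalized frustum coordinates. Lights are assumed to be
    already culled against the frustum. Returns froxel index -> list of names.
    """
    grid = defaultdict(list)
    for name, (lo, hi) in lights.items():
        lo_idx = froxel_of([max(v, 0.0) for v in lo])
        hi_idx = froxel_of([min(v, 1.0) for v in hi])
        for x in range(lo_idx[0], hi_idx[0] + 1):
            for y in range(lo_idx[1], hi_idx[1] + 1):
                for z in range(lo_idx[2], hi_idx[2] + 1):
                    grid[(x, y, z)].append(name)
    return grid


def shade(point, grid, contribution):
    """Sum the contribution of every light listed in the froxel containing `point`.

    `contribution(name, point)` stands in for the per-light evaluation
    (BRDF plus shadow map lookup), which is not modeled here.
    """
    return sum(contribution(name, point) for name in grid.get(froxel_of(point), []))
```

The point of the structure is that the per-pixel loop only visits the lights that can actually reach the pixel's froxel, rather than every light in the scene.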
Performance: voxels
The voxel technique lends itself better to scaling: to degrade the output quality, we can reduce the number of voxel grids, or update fewer voxels per frame (which results in a "light delay" - a slower update of the light exposure of objects compared to updating the objects themselves).
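The "light delay" trade-off can be made concrete with a small calculation. This is only a sketch: the function names and the per-frame budget in the example are hypothetical, and a real updater would prioritize some voxels over others.

```python
import math


def frames_to_refresh(dirty_voxels, voxels_per_frame):
    """Number of frames needed to refresh all dirty voxels under a fixed budget."""
    return math.ceil(dirty_voxels / voxels_per_frame)


def light_delay_ms(dirty_voxels, voxels_per_frame, frame_time_ms):
    """Worst-case delay before the last dirty voxel is refreshed."""
    return frames_to_refresh(dirty_voxels, voxels_per_frame) * frame_time_ms
```

For example, refreshing one full 128x64x128 grid (1,048,576 voxels) at an assumed budget of 65,536 voxels per frame takes 16 frames; halving the budget halves the per-frame cost but doubles the delay.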
Voxels have three complexity characteristics: geometric complexity, light complexity, and pixel count. Geometric complexity only affects the cost of voxelization, so adding more objects does not slow down the other stages. Light complexity only affects the cost of computing the lighting, which does not depend on the geometric complexity or the number of pixels. Finally, the final per-pixel color computation does not depend on the number of voxels / lights / objects, so we can scale the resolution without affecting the lighting computation.
Voxel cost scales as O(G) + O(L) + O(P), where G is the number of triangles (geometric complexity), L is the number of lights, and P is the number of pixels.
Unfortunately, the baseline performance of voxels is not great: the voxel count scales as N^3, and GPUs are not well suited to the required update methods, which makes it hard to keep performance predictable. With enough research into GPU computing this cost can probably be brought down, but at the moment the base cost remains quite high.
Performance: shadow maps
Shadow maps are more GPU friendly, as they are built on rasterization. The cost of updating the shadow atlas can be partially reduced by caching / delaying updates (which naturally leads to additional latency). Optimizing the geometry representation (including mesh level of detail) also reduces the cost of the method.
However, updating shadows in complex scenes is still costly, as the cost depends on both the amount of geometric detail and the number of shadow-casting lights. Inside a building, a single moving light forces the entire building to be re-rendered every frame to update the shadow information for that light. Many moving shadow-casting lights in the building degrade performance: we cannot update all of them in a single frame, which leads to visual artifacts.
In addition, this method does not allow the resolution to be decoupled from the number of lights: for each pixel, we have to recompute the contribution of every light source that covers it. This step cannot be cached either, which leads to performance issues at high resolutions in heavily lit scenes: 20 overlapping lights in a room rendered at 4K might require about 160 million light evaluations.
Shadow map cost scales as O(GL) + O(LP), where G is the number of triangles (geometric complexity), L is the number of lights, and P is the number of pixels.
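The two cost models and the 4K figure can be checked with a few lines of arithmetic. This is an illustrative sketch: the function names and the unit constants are hypothetical (real constants differ per stage and per GPU), and 4K is assumed to mean 3840x2160.

```python
def voxel_cost(g, l, p, a=1.0, b=1.0, c=1.0):
    """O(G) + O(L) + O(P): the three terms are independent of each other."""
    return a * g + b * l + c * p


def shadow_map_cost(g, l, p, a=1.0, b=1.0):
    """O(GL) + O(LP): both terms are multiplied by the number of lights."""
    return a * g * l + b * l * p


def light_evaluations(width, height, overlapping_lights):
    """Per-pixel light evaluations when every pixel is covered by all the lights."""
    return width * height * overlapping_lights
```

`light_evaluations(3840, 2160, 20)` gives 165,888,000, which matches the "160 million" estimate above. With unit constants, doubling the number of lights doubles `shadow_map_cost` but only adds a small increment to `voxel_cost`, which is the scaling difference the two formulas express.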
Performance: evaluation
For clarity, both methods were run on a selection of game levels. Note that these are pre-existing levels, not ones designed specifically for performance evaluation.
Paris (sun shadows, very few non-shadowing light sources)
- Voxels: shadow update - 6 ms, scene rendering - 1.5 ms;
- Shadow maps: shadow update - 1 ms, scene rendering - 2.4 ms;
- The baseline cost of computing voxel shadows is higher because the work is harder for the GPU to process.
Caves (many sources that cast shadows)
- Voxels: shadow update - 7 ms, scene rendering - 0.9 ms;
- Shadow maps: shadow update - 10 ms, scene rendering - 2.1 ms;
- Due to the large amount of geometry and moving lights, updating shadow maps is expensive.
Western (many shadow casting sources)
- Voxels: shadow update - 8 ms, scene rendering - 1 ms;
- Shadow maps: shadow update - 15 ms, scene rendering - 2.5 ms;
- With moving lights and lots of triangles, updating the shadow map is expensive.
1000 light sources without shadow
- Voxels: light update - 20 ms, scene rendering - 0.5 ms;
- Shadow maps: light update - 0.5 ms, scene rendering - 5 ms;
- The sheer number of lights and their overlap across voxels slows down the voxel update in this case. In addition, you can see that in the nearest cascade the “one light source per voxel” approximation breaks down.
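To compare the four scenes at a glance, the measurements above can be summed per method. This sketch assumes the update and rendering costs are simply additive and that the update runs every frame, which the article does not state; the names `MEASUREMENTS`, `totals` and `faster_method` are hypothetical.

```python
# (update ms, scene rendering ms), copied from the lists above.
MEASUREMENTS = {
    "Paris": {"voxels": (6.0, 1.5), "shadow maps": (1.0, 2.4)},
    "Caves": {"voxels": (7.0, 0.9), "shadow maps": (10.0, 2.1)},
    "Western": {"voxels": (8.0, 1.0), "shadow maps": (15.0, 2.5)},
    "1000 lights": {"voxels": (20.0, 0.5), "shadow maps": (0.5, 5.0)},
}


def totals(measurements):
    """Total milliseconds per scene and method (update + rendering)."""
    return {
        scene: {method: round(sum(parts), 1) for method, parts in methods.items()}
        for scene, methods in measurements.items()
    }


def faster_method(measurements):
    """Name of the method with the lower total for each scene."""
    return {
        scene: min(per_method, key=per_method.get)
        for scene, per_method in totals(measurements).items()
    }
```

Under these assumptions shadow maps come out ahead in Paris (3.4 ms vs 7.5 ms) and in the 1000-light scene (5.5 ms vs 20.5 ms), while voxels come out ahead in Caves (7.9 ms vs 12.1 ms) and Western (9.0 ms vs 17.5 ms).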
Performance: conclusion
Shadow maps scale reasonably well with the workload, but there are two things to consider:
- The cost per pixel rises as the resolution increases, making this a practical solution only at medium resolution (1080p); going beyond 1080p requires a very good GPU.
- The cost of rendering shadows grows very quickly with complex geometry and many dynamic lights. This can be compensated for by better culling, but at this stage it remains a fundamental problem.
In contrast, voxel performance depends much less on the content of the level, but has a much higher base cost. This can be offset by better GPU algorithms and by using fewer voxels.
Memory requirements
Memory requirements for shadow maps and voxels depend on the required quality.
In the case of voxels, several textures are stored in memory for each cascade, so their total size depends on the number of cascades and the size of each. Currently, 4 cascades (with voxel sizes 1..8) of 128x64x128 voxels each are used, which adds up to 128 MB of VRAM. It would be possible to add another 2 cascades (voxel sizes 0.5 and 16) or reconfigure the existing ones, which would increase this value to 192 MB. Conversely, on systems with limited memory you can reduce the number of cascades (removing some of the close ones), bringing the footprint down to about 64 MB with two cascades (4..8) or about 96 MB with three (4..16).
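The cascade figures above are consistent with a fixed cost per voxel, which can be verified with a short calculation. The 32 bytes per voxel is derived from the stated numbers (128 MB across 4 cascades of 128x64x128 voxels), not given in the article, and 1 MB is assumed to mean 2^20 bytes; the function names are hypothetical.

```python
GRID_DIMS = (128, 64, 128)
MIB = 1024 * 1024


def voxels_per_cascade(dims=GRID_DIMS):
    """Number of voxels in one cascade grid."""
    x, y, z = dims
    return x * y * z


# Derived assumption: 128 MB / 4 cascades / 1,048,576 voxels = 32 bytes per voxel.
BYTES_PER_VOXEL = 128 * MIB // 4 // voxels_per_cascade()


def cascade_memory_mb(cascades):
    """Approximate VRAM in MB for the given number of cascades."""
    return cascades * voxels_per_cascade() * BYTES_PER_VOXEL / MIB
```

With these assumptions, 2, 3, 4 and 6 cascades come out at 64, 96, 128 and 192 MB respectively, matching the numbers quoted in the paragraph.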
In the case of shadow maps, a shadow atlas and a froxel grid are used. The size of the latter partly depends on the resolution. The size of the shadow atlas, in turn, can be reduced if shadow quality needs to be lowered to improve performance / memory use. The current system uses 73 MB of video memory, most of which (64 MB) is occupied by the shadow atlas. You can shrink the atlas and thus limit the number of shadow-casting lights or the quality of shadows. There are also shadow map variants that support translucency, which take up more memory (up to 130 MB or more). The minimum memory footprint is likely to be achieved by reducing the size of the shadow atlas and using a simpler version of it, which would take about 25 MB.
By comparison, the current lighting system has two modes: high quality (PC) and low quality (mobile). The PC version uses ~40 MB (24 MB RAM, 16 MB VRAM); the mobile version uses ~11 MB (6 MB RAM, 5 MB VRAM).
Overall, both methods are fairly close in terms of memory footprint, but shadow maps are somewhat more scalable for the same light / shadow range.
Mobile compatibility
The game's audience is becoming more and more mobile, which means that different gaming devices must also be taken into account when comparing the two implementations. High-end devices should expose enough API capabilities to implement both methods, but the methods may of course be less practical there in terms of memory and performance.
The existing voxel lighting system is great for mobile devices: it supports many complex lighting features (light shadows, skylight, etc.) and performs most of the complex calculations on the CPU, thus placing minimal requirements on GPU performance and feature set. Since this system will still need to be supported on low-cost mobile devices and PCs for the foreseeable future, several options have emerged for supporting a large pool of devices:
- Keep the existing system on mobile; the new one would be PC / console only. This means that a large segment of the user base would not have access to the new system.
- ( , ), /.
- low-end , , .
In all of these cases, the question of content compatibility has to be answered, because one of the main promises of the platform is “download content once and run it anywhere”. We still need to work on this. At first glance the new voxel solution looks better in that it keeps quality / behavior consistent between the old and the new system, while shadow maps represent a more abrupt change in quality; on the other hand, shadow maps are more likely to cope with the limited capabilities of mobile devices.
Quality: light sources
The shadow map solution simulates light sources faithfully: in the screenshot below with 1000 lights, the shadow map version shows accurately reproduced specular highlights, modeled with a BRDF, which gives us the light reflections we want.
The voxel solution is fundamentally worse because it approximates the effect of light on each voxel as if it were coming from only one source. From this you can see that the quality of specular reflection deteriorates:
So, in the case of voxels, the colors in the area of the specular highlights merge in a 1:1 ratio, creating yellow light instead of green and red, even though there are no yellow lights in the scene. In contrast, the shadow map solution combines the colors accurately.
In some cases the results are not convincing at all, although they can be improved in the future:
In the example above, you can see curved, elongated, distorted specular highlights, and several voxels under one of the objects have no light information at all. The same screenshot for shadow maps gives a much better result.
Quality: shadow
The defining quality of shadow maps is accuracy, while voxel shadows are soft. Shadow maps provide fairly crisp shadows that capture even small details and are precise enough to create a convincing shadow for the character. The voxel algorithm, on the other hand, is very good at creating really soft shadows, but shadows from small details are either not captured at all or have an irregular shape.
For this reason, a variant of shadow maps is currently used to render character shadows. However, it is more of a workaround: it applies only to character shadows cast by the sun, and other light sources are not taken into account.
Also, a key technique for accelerating voxels is the use of cascades. However, this means that the fullness data becomes coarser with distance from the point of interest in the scene. As a result, the quality of the shadow also deteriorates as the distance between the caster and the receiver of the shadow increases:
In the screenshot above, the smallest voxel size would be enough to render a high-quality shadow of the bridge, but the bridge is too far from the water surface, so its voxels are too coarse even though the voxels near the water surface are fine enough.
Quality: skylight
An important feature of the voxel pipeline is the calculation of the skylight factor, which determines how much of the sky is visible from the current voxel. It is used to blend indoor and outdoor lighting and is very effective at improving quality. In the screenshot below, the outside of the house should be much brighter than the inside, even in areas that are in shadow. The voxel solution calculates and reproduces this factor well, but the shadow map implementation lacks it, so it has no way to adjust the darkness of shadows accordingly.
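One plausible way to use such a factor is a linear blend between an indoor and an outdoor ambient color. The sketch below is an assumption for illustration only: the article does not say how the engine combines the values, and the names `lerp` and `ambient_color` are hypothetical.

```python
def lerp(a, b, t):
    """Linear interpolation between a and b for t in [0, 1]."""
    return a * (1.0 - t) + b * t


def ambient_color(indoor_ambient, outdoor_ambient, skylight):
    """Blend two RGB ambient colors by the skylight factor.

    skylight = 0.0 means no sky is visible (fully indoors),
    skylight = 1.0 means the whole sky is visible (fully outdoors).
    """
    return tuple(lerp(i, o, skylight) for i, o in zip(indoor_ambient, outdoor_ambient))
```

A shadowed point outside the house (skylight near 1) then receives the brighter outdoor ambient, while a shadowed point inside (skylight near 0) stays dark, which is the difference visible in the screenshot.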
Quality: geometric precision
It is worth noting the fundamental differences in geometry representation between voxels and shadow maps.
Voxels assume that all objects supported by the light engine can be "voxelized" - that is, for every voxel in the game world, there is a quick way to calculate how much of the voxel the object occupies. This works well for primitive shapes, but complex objects such as CSGs and MeshParts are a serious problem. Currently, coarse decomposition and a set of other hacks make voxelization of such objects workable, but they often lead to visible artifacts. Shadow maps, in contrast, use the same polygonal representation that is used for rendering, so they can represent the shapes of all objects perfectly:
Quality: light leaks
While the shape of the shadow is extremely important, it is perhaps even more important that pixels hidden from the light are handled properly. When approximations violate this, we get so-called light leaks - visible strips of light that cause the most problems in high-contrast environments, for example inside a building with bright sun outside. Here's an example of a light leak:
This is a thin, illuminated section of the floor right against the wall. Shadow maps preserve light occlusion much better and fight this problem.
There are several sources of leaks for voxels. Some of them can be mitigated by storing anisotropic fullness: for example, 3 values are currently stored per voxel, indicating “how much matter is in the projection of the voxel along the axis” for each of the three axes. Unfortunately, while this helps thin details cast shadows regardless of their thickness, it won't fix all leaks. The only way to guarantee that light is blocked completely is to make the part twice as thick as the voxel. In addition, leakage grows with the size of the voxel, which means that it becomes more noticeable at lower quality levels and / or farther away.
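The benefit of the three per-axis values can be seen on a thin wall inside a single voxel. The sketch below rests on one interpretation of the quoted description: each value is taken to be the fraction of the voxel's cross-section, perpendicular to that axis, that is covered by matter. The function names are hypothetical, and only an axis-aligned box inside a unit voxel is modeled.

```python
def isotropic_fullness(extents):
    """Single value per voxel: the fraction of the voxel's volume that is filled."""
    ex, ey, ez = extents
    return ex * ey * ez


def anisotropic_fullness(extents):
    """Three values per voxel: the covered fraction of the cross-section seen
    when looking along x, y and z (assumed interpretation)."""
    ex, ey, ez = extents
    return (ey * ez, ex * ez, ex * ey)


def transmitted(fullness):
    """Fraction of light that passes through, given a fullness value."""
    return 1.0 - fullness
```

For a wall 0.2 voxels thick along x that spans the whole voxel in y and z, the isotropic value is 0.2, so 80% of the light would leak through. The anisotropic values are (1.0, 0.2, 0.2): light travelling along x is blocked completely, while light travelling along the wall is still mostly transmitted.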
Shadow maps are not completely leak-free either, but leaks are much less of a problem for them - for example, in this implementation an object 0.4 voxels thick does not let visible light through (an object 0.2 voxels thick can let some light through, but this too can be fixed in the future).
Quality: conclusion
Shadow maps are excellent in most aspects of quality. The only area where they fall significantly short is the skylight factor. This may require a hybrid method that uses voxelization for the skylight, which ties the system to the voxel pipeline - or there may be alternative solutions to this problem. It would also be nice to support soft shadows, which can be done with some extensions to the shadow map algorithm.
Voxels provide acceptable quality, but they lose a lot compared to shadow maps, especially in the fidelity of shadows and specular highlights. These problems would have to be solved somehow before voxel lighting could provide good-looking character shadows, because the current workaround produces them only for the sun, which seems incompatible with the future vision for the game world.
Visibility: translucency
Rendering shadows is a fairly well-researched problem for opaque objects, but translucency is another matter entirely. Since in a voxel system light travels through voxel cells based on their fullness, it is not too difficult to support the semi-transparent, low-frequency (soft) shadows cast by particle effects and other semi-transparent objects in the scene, including self-shadowing of the particle effects themselves:
Below is a video of this effect in motion:
There is currently no support for translucency for shadow maps. This means that if we want to support particles or other transparent objects that cast shadows, another solution must be found. There is some research for alternative shadow map representations that might serve this use case, but it remains to be seen how effective this is.
Visibility: vegetation
While shadow maps don't do a particularly good job with translucency, they can show fine detail on objects (such as vegetation), whether modeled with geometry or textures. The voxels are not small enough to serve this use case. In addition, in this case, it is not easy to access the information about the texture, because it requires accurate modeling of the mesh surface, not the volume. It is doubtful that it will ever be possible to get beautiful shadows from vegetation using voxels, whereas shadow maps can provide this even with existing content, as shown in this screenshot:
Visibility: self-illumination
Due to the way voxels are implemented, it is relatively easy to inject lights of arbitrary shape and number into the grid without affecting the performance of other parts of the pipeline. With shadow maps, in contrast, you can add a lot of lights, but creating lights with irregular shapes raises architectural and performance issues. In particular, true self-illumination is much easier to implement with voxels: neon material is currently used to "emit light", but it does not actually cast light onto nearby objects.
Yes, you can add additional lights, but it would be nice to do it all automatically. Shadow maps are not very useful for this, but voxels support voxelization of any shape by necessity, and thus supporting light emission from self-luminous objects is quite simple with them.
Visibility: global illumination
Global illumination (GI) refers to the computation of secondary lighting effects, such as light from a lamp bouncing twice off walls to provide additional illumination in areas where direct light beams cannot reach.
GI in Roblox is an extremely hard problem, and most existing solutions sacrifice something: dynamic lighting, dynamic geometry, performance, or large-scale scenes. None of these sacrifices is acceptable.
It is unclear which GI solutions would be practical given the tight content constraints. So far, voxel-based GI seems to be more promising than other approaches.
Of course, using voxel-based GI does not mean that direct illumination has to be computed with voxels too: most voxel-based GI research today uses shadow maps to compute direct light and voxels to improve the result.
Summary
Based on the above analysis, we will compile a summary table of the effectiveness of both solutions. Cells in italics indicate areas that further research can improve. The grading in the table is as follows: Terrible < Poor < Normal < Good < Excellent.
So voxels are great for modeling indirect lighting, but not so great when it comes to direct lighting. They are also quite resource-intensive, which fits poorly with the goal of supporting a wide range of devices.
This led to the decision to create a shadow map system for direct lighting. The solution to the problem of skylight and global illumination has not yet been unequivocally found, but, most likely, it will turn out to be a kind of hybrid of both systems.