Here is a brief overview of what happens in this article:
- I'll show you how to use the PyTorch C++ API to integrate a neural network into a project on the Unity engine;
- I will not describe the project itself in detail; it does not matter for this article;
- I use a ready-made neural network model, turning its trace into a binary that is loaded at runtime;
- I will show that this approach greatly simplifies the deployment of complex projects (for example, there are no problems with synchronizing the Unity and Python environments).
Welcome to the real world
Machine learning techniques, including neural networks, still feel most at home in experimental environments, and launching such projects in the real world is often difficult. I will talk a little about these difficulties, describe the limitations and ways around them, and give a step-by-step solution to the problem of integrating a neural network into a Unity project.
In other words, I need to turn a research project written in PyTorch into a ready-made solution that can work with the Unity engine in production.
There are several ways to integrate a neural network into Unity. I suggest using the C++ API for PyTorch (called libtorch) to build a native shared library, which can then be plugged into Unity as a plugin. There are other approaches (for example, using ML-Agents), which in certain cases can be simpler and more effective, but the advantage of my approach is that it provides more flexibility and more power.
Let's say you have some exotic model and just want to use existing PyTorch code (written with no intent of talking to Unity); or your team is developing its own model and doesn't want to be distracted by thoughts of Unity. In both cases, the model code can be as complex as you like and use all the features of PyTorch. And when it comes time to integrate, the C++ API comes into play and wraps everything in a library without the slightest change to the model's original PyTorch code.
So my approach boils down to four key steps:
- Setting up the environment.
- Preparing the native library (C++).
- Importing functions from the library / connecting the plugin (Unity / C#).
- Saving / deploying the model.
IMPORTANT: since I did this project on Linux, some commands and settings assume that OS; but I don't think anything here depends too heavily on it, so preparing the library for Windows is unlikely to cause difficulties.
Setting up the environment
Before installing libtorch, make sure you have:
- CMake
And if you want to use a GPU, you need:
- the CUDA toolkit (version 10.1 was current at the time of writing);
- the cuDNN library.
Difficulties can arise with CUDA, because the driver, the libraries, and the other moving parts must all be compatible with each other. And you have to ship these libraries with your Unity project so that everything works out of the box. So this is the most uncomfortable part for me. If you do not plan to use a GPU and CUDA, you should know: the calculations will run 50-100 times slower. And even if the user has a rather weak GPU, it is better with it than without it. Even if your neural network runs quite rarely, those rare runs will cause a delay that annoys the user. It may be different in your case, but... do you need that risk?
Once you've installed the above software, it's time to download and (locally) install libtorch. You don't need to install it system-wide: you can simply place it in your project directory and point CMake at it.
Preparing a native library
The next step is configuring CMake. I took the example from the PyTorch documentation as a basis and changed it so that the build produces a library rather than an executable. Place this file in the root directory of your native library project.
CMakeLists.txt
cmake_minimum_required(VERSION 3.0 FATAL_ERROR)
project(networks)
find_package(Torch REQUIRED)
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} ${TORCH_CXX_FLAGS}")
add_library(networks SHARED networks.cpp)
target_link_libraries(networks "${TORCH_LIBRARIES}")
set_property(TARGET networks PROPERTY CXX_STANDARD 14)
if (MSVC)
  file(GLOB TORCH_DLLS "${TORCH_INSTALL_PREFIX}/lib/*.dll")
  add_custom_command(TARGET networks
                     POST_BUILD
                     COMMAND ${CMAKE_COMMAND} -E copy_if_different
                     ${TORCH_DLLS}
                     $<TARGET_FILE_DIR:networks>)
endif (MSVC)
The library's source code will live in networks.cpp.
This approach has another nice feature: we don't yet have to decide which neural network we want to use with Unity. The reason (getting a little ahead of myself) is that at any moment we can run the network in Python, obtain its trace, and simply tell libtorch to "apply this trace to these inputs". So our native library really just serves a kind of black box that handles the I/O.
But if you want to complicate the task and, for example, train the network directly while the Unity environment is running, you will have to write the network architecture and the training algorithm in C++. That is outside the scope of this article, so for more information I refer you to the relevant section of the PyTorch documentation and the code examples repository.
Either way, in networks.cpp we need to define an external function that initializes the network (loads it from disk) and an external function that runs the network on input data and returns the results.
networks.cpp
#include <torch/script.h>
#include <vector>
#include <memory>
extern "C"
{
// This is going to store the loaded network
torch::jit::script::Module network;
So, let's start with the function that loads the neural network trace from disk. Note that the file is found by a relative path: it must sit in the root of the Unity project, not in Assets/. So be careful. Alternatively, you can simply pass the file name in from Unity.
To call our library functions directly from Unity, you also need to expose their entry points. On Linux, I use __attribute__((visibility("default"))) for this. On Windows, the __declspec(dllexport) specifier exists for the same purpose, but to be honest, I haven't tested whether it works there.
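If you want one source file to build on both platforms, a macro can hide the difference. A minimal sketch (the EXPORT_API name is my own invention; the listings below stick to the plain Linux attribute):
// Sketch: pick the right export attribute per platform
#if defined(_WIN32)
#define EXPORT_API __declspec(dllexport)
#else
#define EXPORT_API __attribute__((visibility("default")))
#endif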
extern __attribute__((visibility("default"))) void InitNetwork()
{
    network = torch::jit::load("network_trace.pt");
    network.to(at::kCUDA); // If we're doing this on GPU
}
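The code above unconditionally moves the module to the GPU. A defensive variant (my own addition, not in the original code) would check whether CUDA is actually available and fall back to the CPU; the tensors in ApplyNetwork below would then have to be sent to the same device instead of the hard-coded .cuda():
// Sketch: fall back to the CPU when no CUDA device is present
network = torch::jit::load("network_trace.pt");
if (torch::cuda::is_available())
    network.to(at::kCUDA);
else
    network.to(at::kCPU);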
Now let's move on to the function that feeds input data to the network. We'll write C++ code that uses pointers (managed by Unity) to ferry the data back and forth. In this example, I assume my network has fixed input and output sizes, and I prevent Unity from changing that. Here, for example, I take a Tensor {1,3,64,64} as input and a Tensor {1,5,64,64} as output (such a network could be used to segment the pixels of RGB images into 5 classes).
In the general case, you will need to pass the dimensions and the amount of data along with the pointers to avoid buffer overflows; a sketch of such a checked variant follows the complete listing below.
To convert the data into the format libtorch works with, we use the torch::from_blob function. It takes an array of floating-point numbers and a tensor description (with dimensions) and returns the resulting tensor.
A neural network can take multiple input arguments (for example, a forward() call that takes x, y, and z). To handle this, all input tensors are wrapped in a std::vector of torch::jit::IValue (even if there is only one argument).
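For illustration, if a (hypothetical) module's forward(x, y) took two tensors, both would be pushed into the same vector, in order:
// Sketch: two inputs for a hypothetical forward(x, y); order matters
std::vector<torch::jit::IValue> inputs;
inputs.push_back(x);
inputs.push_back(y);
torch::Tensor z = network.forward(inputs).toTensor();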
The easiest way to get data out of a tensor is to process it element by element, but if this slows things down, you can use a Tensor::accessor to optimize reading (a sketch appears after the full listing below). Personally, I did not need it.
As a result, the following simple code is obtained for my neural network:
extern __attribute__((visibility("default"))) void ApplyNetwork(float *data, float *output)
{
    // Wrap Unity's float buffer in a tensor (no copy) and move it to the GPU
    torch::Tensor x = torch::from_blob(data, {1,3,64,64}).cuda();
    std::vector<torch::jit::IValue> inputs;
    inputs.push_back(x);

    // Run the network; flatten the result so it can be read element by element
    torch::Tensor z = network.forward(inputs).toTensor().view({-1});
    for (int i = 0; i < 1*5*64*64; i++)
        output[i] = z[i].item<float>();
}
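As promised above, here is a sketch of a checked variant (the name ApplyNetworkChecked and the extra size parameters are my own illustration, not part of the original project): Unity would pass the element counts along with the pointers, so the library can reject mismatched buffers instead of overflowing them.
// Hypothetical checked entry point: validates buffer sizes before running
extern __attribute__((visibility("default"))) int ApplyNetworkChecked(float *data, int dataLen, float *output, int outputLen)
{
    if (dataLen != 1*3*64*64 || outputLen != 1*5*64*64)
        return -1; // size mismatch: refuse to read or write out of bounds
    ApplyNetwork(data, output);
    return 0;
}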
}
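And, as mentioned earlier, if the element-by-element item<float>() loop ever becomes a bottleneck, the read-out inside ApplyNetwork could be rewritten with an accessor. A sketch, assuming the result is first moved to the CPU (accessors work on CPU memory):
// Sketch: replace the item<float>() loop with a typed accessor
torch::Tensor zc = network.forward(inputs).toTensor().cpu();
auto a = zc.accessor<float, 4>(); // float data, 4 dims: {1,5,64,64}
int i = 0;
for (int c = 0; c < 5; c++)
    for (int y = 0; y < 64; y++)
        for (int x = 0; x < 64; x++)
            output[i++] = a[0][c][y][x];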
To compile the code, follow the directions in the documentation: create a build/ subdirectory, enter it, and run the following commands:
cmake -DCMAKE_PREFIX_PATH=/absolute/path/to/libtorch ..
cmake --build . --config Release
If all goes well, a libnetworks.so or networks.dll file will be produced, which you can place in Assets/Plugins/ of your Unity project.
Connecting the plugin to Unity
To import functions from the library, we use DllImport. The first function we need is InitNetwork(); Unity will call it when the plugin is loaded:
using System.Runtime.InteropServices;

public class Startup : MonoBehaviour
{
    ...

    [DllImport("networks")]
    private static extern void InitNetwork();

    void Start()
    {
        ...
        InitNetwork();
        ...
    }
}
So that the Unity engine (C#) can communicate with the library (C++), I'll hand all the memory management work to Unity:
- I allocate memory for arrays of the required size on the Unity side;
- I pass the address of the first element of each array to the ApplyNetwork function (which also needs to be imported first);
- I let the C++ side use pointer arithmetic to access that memory when receiving or sending data.
In the library code (C++), I have to avoid any allocation or deallocation of memory. On the Unity side, once I pass the address of the first element of an array to the ApplyNetwork function, I have to keep that pointer (and the corresponding chunk of memory) alive until the neural network finishes processing the data.
Luckily, my native library does the simple job of shuttling data back and forth, so this was easy enough to track. But if you want to parallelize things, for example so that the neural network trains and serves the user at the same time, you will have to find some kind of solution (one small building block is sketched after the code below).
[DllImport("networks")]
private static extern void ApplyNetwork(ref float data, ref float output);

void SomeFunction()
{
    float[] input = new float[1*3*64*64];
    float[] output = new float[1*5*64*64];

    // Load input with whatever data you want
    ...

    ApplyNetwork(ref input[0], ref output[0]);

    // Do whatever you want with the output
    ...
}
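If you do parallelize things, as mentioned above, one small building block (my own sketch, not from the project) is to serialize access to the shared module on the C++ side with a mutex:
// Sketch: guard the shared module if Unity may call the plugin from
// several threads at once; this would live inside the extern "C" block
// of networks.cpp and wrap the ApplyNetwork defined earlier
#include <mutex>

static std::mutex network_mutex;

extern __attribute__((visibility("default"))) void ApplyNetworkLocked(float *data, float *output)
{
    std::lock_guard<std::mutex> lock(network_mutex);
    ApplyNetwork(data, output);
}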
Saving the model
The article is drawing to a close, and we still haven't discussed which neural network I chose for my project. It is a simple convolutional neural network that can be used to segment images. I did not include data collection and training in this article: my task is to talk about integration with Unity, not about the troubles of tracing complex neural networks. Don't blame me.
If you're curious, there is a good, more complex example here that covers some special cases and potential problems. One of the main problems is that tracing does not work correctly for all data types; the documentation explains how to solve this using annotations and explicit compilation.
This is what the Python code for our simple model might look like:
import torch
import torch.nn as nn
import torch.nn.functional as F
class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.c1 = nn.Conv2d(3,64,5,padding=2)
        self.c2 = nn.Conv2d(64,5,5,padding=2)

    def forward(self, x):
        z = F.leaky_relu(self.c1(x))
        z = F.log_softmax(self.c2(z), dim=1)
        return z
Tracing runs the network once on an example input and records the operations performed into a serializable module, which we then save to disk. The tracing code looks like this:
network = Net().cuda()
example = torch.rand(1, 3, 64, 64).cuda()
traced_network = torch.jit.trace(network, example)
traced_network.save("network_trace.pt")
Deploying the model
We built a shared library, but that's not enough for deployment: additional libraries need to ship with the project. Unfortunately, I am not 100% sure exactly which ones are required. I chose libtorch, libc10, libc10_cuda, libnvToolsExt, and libcudart; in total, they add about 2 GB to the original project size.
LibTorch vs ML-Agents
I believe that for many projects, especially research and prototyping, ML-Agents, a plugin built specifically for Unity, is really worth choosing. But when projects get more complex, you need a fallback in case something goes wrong. And that happens quite often...
A couple of weeks ago I was using ML-Agents to communicate between a demo game in Unity and a couple of neural networks written in Python. Depending on the game logic, Unity would call one of these networks with different sets of data.
I had to dig deep into the Python API for ML-Agents. Some of the operations I used in my networks, such as 1D convolution and transpose, were not supported in Barracuda (the inference library currently used by ML-Agents).
The problem I ran into was that ML-Agents collects "requests" from agents over a certain time interval and then sends them for evaluation, for example, to a Jupyter notebook. However, some of my networks depended on the outputs of other networks. So to evaluate the whole chain, I would have to wait, get a result, issue another request, wait again, get a result, and so on for every request. Moreover, the order in which the networks had to run depended nontrivially on user input, which meant I could not simply run them in a fixed sequence.
Also, in some cases the amount of data I needed to send had to vary, whereas ML-Agents is designed around a fixed dimension for each agent (it seems this can be changed on the fly, but I am skeptical about that).
I could have done something like computing the sequence of network calls on demand and sending the appropriate inputs to the Python API. But that would have made my code, on both the Unity side and the Python side, too complex or even redundant. So I decided to explore the libtorch approach, and it did not let me down.
If someone had asked me earlier to build a GPT-2 or MAML predictive model into a Unity project, I would have advised them to try to do without it: implementing such a task with ML-Agents is too complicated. But now I can find or develop any model with PyTorch and then wrap it in a native library that connects to Unity like a regular plugin.