Genre-based movie poster generation and image style transfer: student projects from the Technosphere "Neural Networks" course





We have already written about the graduation projects of students in the semester-long mobile development courses at Technopark (Bauman Moscow State Technical University); see the previous publications "Application Development for iOS" and "Development of Applications for Android". Today we are sharing interesting projects by second-semester students of Technosphere, a joint educational project with Moscow State University that offers a year-long program in data analysis and work with large volumes of data. Students take courses in machine learning, information retrieval, neural networks and other disciplines. Teaching is project-based, so we wrap up each semester with student project defenses.



Experiments play an important role in the learning process, and student projects cannot do without them: students try different approaches, methods, architectures and tools. Often an experiment leads them to abandon their initial choice of technology or algorithm in favor of a new approach. This is a big part of the experience and of learning. Below we describe these stages in the development of two student projects.



  • Movie posters GAN / embed project.
  • CycleGAN project on cartoon series.


CycleGAN on cartoon series



The authors of this project decided to use the CycleGAN technique to translate pictures from one animated film into the style of another. A GAN (generative adversarial network) is a method for training generative models, including image-to-image models. Two neural networks are trained, a discriminator and a generator, which are adversaries: the generator tries to increase the discriminator's classification error, and the discriminator tries to decrease it. CycleGAN, in turn, is a method for learning image-to-image translation in unsupervised mode, that is, without paired examples.



Let's say there are two picture domains, A and B. Two generators and two discriminators are created: gen_A2B, gen_B2A, disc_A, disc_B. The gen_A2B generator should take a picture from A and produce the same picture, but in domain B. To achieve this, a cycle consistency loss is introduced:



l1loss(gen_B2A(gen_A2B(a)), a)


In this case, the generator will generate images that deceive the discriminator, but at the same time retain the original content.
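To make the cycle consistency idea concrete, here is a minimal sketch in plain Python. The "generators" are hypothetical per-pixel functions standing in for real networks, and `gen_B2A_bad` is an invented counterexample that ignores its input; none of this is taken from the project code.

```python
def l1_loss(xs, ys):
    """Mean absolute difference between two equal-length sequences."""
    return sum(abs(x - y) for x, y in zip(xs, ys)) / len(xs)


# Hypothetical stand-ins for the two generators: simple per-pixel maps.
def gen_A2B(image):
    return [2.0 * p + 1.0 for p in image]


def gen_B2A(image):
    # Exact inverse of gen_A2B, so the cycle A -> B -> A restores the input.
    return [(p - 1.0) / 2.0 for p in image]


def gen_B2A_bad(image):
    # Ignores its input and always returns the same gray picture.
    return [0.5 for _ in image]


a = [0.0, 0.25, 0.5, 1.0]  # a toy four-pixel "picture" from domain A

cycle_ok = l1_loss(gen_B2A(gen_A2B(a)), a)
cycle_bad = l1_loss(gen_B2A_bad(gen_A2B(a)), a)
print(cycle_ok, cycle_bad)
```

A generator pair that preserves content gets a zero cycle loss, while a generator that throws the content away is penalized, even if its output might still fool a discriminator.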



Solution architecture:



Generator:





ResNet blocks help the network preserve the original picture. The authors also used instance normalization instead of batch normalization, because the latter mixes in noise from the other pictures in the batch.
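The difference between the two normalizations can be illustrated with a small sketch. It is a simplification under stated assumptions: each "image" is a flat list of values with a single channel, there are no learned scale and shift parameters, and all names and numbers are made up for the illustration.

```python
from math import sqrt
from statistics import fmean, pvariance

EPS = 1e-5  # small constant for numerical stability


def normalize(values, mean, var):
    return [(v - mean) / sqrt(var + EPS) for v in values]


def instance_norm(batch):
    """Normalize each image with its own mean and variance."""
    return [normalize(img, fmean(img), pvariance(img)) for img in batch]


def batch_norm(batch):
    """Normalize every image with statistics pooled over the whole batch."""
    pooled = [v for img in batch for v in img]
    mean, var = fmean(pooled), pvariance(pooled)
    return [normalize(img, mean, var) for img in batch]


image = [0.1, 0.2, 0.3, 0.4]
dark_neighbour = [0.0, 0.0, 0.1, 0.1]
bright_neighbour = [0.8, 0.9, 1.0, 1.0]

# The same image, placed in two different batches.
in_dark = instance_norm([image, dark_neighbour])[0]
in_bright = instance_norm([image, bright_neighbour])[0]
bn_dark = batch_norm([image, dark_neighbour])[0]
bn_bright = batch_norm([image, bright_neighbour])[0]

print(in_dark == in_bright)  # instance norm does not depend on the neighbour
print(bn_dark == bn_bright)  # batch norm does
```

With instance normalization the output for a picture is the same whatever else is in the batch, which is the property that matters when the style of each picture should be handled independently.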



Discriminator:





There was no ready-made dataset, so the authors collected key frames from full-length anime films: "Spirited Away" by Hayao Miyazaki for domain A and "Your Name" by Makoto Shinkai for domain B. Because CycleGAN copes poorly with drastic changes (in shape, for example), both domains were taken from anime.



The LSGAN loss was used at first and was then replaced with WGAN-GP, because LSGAN produced strange artifacts and lost colors during training.
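For reference, the two discriminator objectives can be sketched as follows. This is the textbook form of the losses, not the project's code: the scores are made-up numbers, and the gradient norms that WGAN-GP needs are passed in precomputed, because obtaining them requires an autodiff framework. The penalty weight of 10 is the commonly used default and an assumption here.

```python
from statistics import fmean


def lsgan_d_loss(d_real, d_fake):
    """LSGAN discriminator loss: push scores to 1 on real and to 0 on fake."""
    real_term = fmean([(s - 1.0) ** 2 for s in d_real])
    fake_term = fmean([s ** 2 for s in d_fake])
    return 0.5 * real_term + 0.5 * fake_term


def wgan_gp_critic_loss(c_real, c_fake, grad_norms, gp_weight=10.0):
    """WGAN-GP critic loss.

    grad_norms are the norms of the critic's gradient with respect to its
    input, evaluated at points interpolated between real and fake samples.
    """
    wasserstein = fmean(c_fake) - fmean(c_real)
    penalty = fmean([(g - 1.0) ** 2 for g in grad_norms])
    return wasserstein + gp_weight * penalty


# Made-up scores, for illustration only.
ls = lsgan_d_loss(d_real=[0.9, 1.1], d_fake=[0.1, -0.1])
wg = wgan_gp_critic_loss(c_real=[2.0, 4.0], c_fake=[-1.0, 1.0],
                         grad_norms=[1.0, 1.5])
print(ls, wg)
```

The LSGAN loss regresses scores toward fixed targets, while the WGAN-GP critic is unbounded and is kept in check only by the penalty that pulls gradient norms toward 1.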



Training started from a model pre-trained on horse2zebra (weights were available only for the generators). Such a model already knows that it has to preserve the content of the picture, so from the very beginning the authors had a good autoencoder that only needed to learn how to deceive the discriminator.



At the very beginning of training, the authors set high values for the cycle loss, the identity loss, and gradient clipping. After enough epochs had passed, they gradually decreased these values so that the generator would try a little harder to fool the discriminator.
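One simple way to express such a schedule is to hold a value for a number of epochs and then decay it linearly. The sketch below is only an illustration: the article does not say which schedule or which numbers were used, so the function name, the linear shape and all values are assumptions.

```python
def decayed_weight(epoch, start, end, hold_epochs, decay_epochs):
    """Keep `start` for `hold_epochs`, then move linearly to `end`."""
    if epoch < hold_epochs:
        return start
    progress = min(1.0, (epoch - hold_epochs) / decay_epochs)
    return start + (end - start) * progress


# Hypothetical settings: hold the cycle-loss weight at 10 for 20 epochs,
# then lower it to 5 over the next 20 epochs.
schedule = [decayed_weight(e, start=10.0, end=5.0,
                           hold_epochs=20, decay_epochs=20)
            for e in (0, 19, 30, 40, 100)]
print(schedule)
```

The same helper could drive the identity-loss weight or a clipping threshold, each with its own start and end values.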



The authors also tried large pretrained networks (VGG, ResNet, Inception) as the discriminator, but they are very large and slowed training down considerably.



As a result, the style is transferred close to the original while the general color scheme is preserved (originals on top, generated pictures below):











Project code: https://github.com/IlyasKharunov/Cyclegan_project



Project team: Ilyas Kharunov, Oleg Verbin.



Video of the project defense.



Movie posters GAN / embed



The next project is interesting because of the path its author took. Unlike the other students, Dmitry worked on his project alone. This path turned out to be harder than the others', but the results and conclusions are interesting.



The author decided to create a network that would generate film posters for given genres: for example, posters in dark colors for horror films, in light colors for comedies, and so on.



The author collected 41 thousand film posters in twenty genres from the IMDb website, covering 1970-2020. It later turned out that some genres had too few images, so Dmitry balanced the set by genre, which left 32 thousand posters.
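The article does not describe how the balancing was done. One common approach, shown here purely as an assumption, is to cap the number of posters per genre by random downsampling. The sketch also assumes a single genre label per poster; the function name, the cap and the toy data are hypothetical.

```python
import random
from collections import Counter, defaultdict


def balance_by_genre(posters, cap, seed=0):
    """Keep at most `cap` randomly chosen posters per genre.

    `posters` is a list of (poster_id, genre) pairs.
    """
    rng = random.Random(seed)
    by_genre = defaultdict(list)
    for poster_id, genre in posters:
        by_genre[genre].append(poster_id)

    balanced = []
    for genre, ids in sorted(by_genre.items()):
        kept = ids if len(ids) <= cap else rng.sample(ids, cap)
        balanced.extend((poster_id, genre) for poster_id in kept)
    return balanced


# Toy data: one over-represented genre and one rare genre.
posters = ([(f"drama_{i}", "drama") for i in range(50)]
           + [(f"horror_{i}", "horror") for i in range(8)])

balanced = balance_by_genre(posters, cap=10)
counts = Counter(genre for _, genre in balanced)
print(counts)
```

Capping reduces the dominance of large genres at the cost of discarding data, which is consistent with the set shrinking after balancing.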



The student then used a neural network with the DCGAN architecture to generate images without genre conditioning. It worked with posters of size 64x128.





The results turned out to be creepy:





Then the author tried the CVAE + DCGAN architecture:





The author also tried a VAE without a GAN and a GAN with a classifier, and came to the conclusion that the collected set of posters is too complex for these methods. He then applied a conditional GAN: it is the same as DCGAN, except that genres are now fed into both the generator and the discriminator. The latent vector z had length 100 and the genres were one-hot encoded with length 20, giving a vector of length 120. The genre vector was also appended to the network's output, which was then passed through one additional linear layer.
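The construction of the 120-element generator input can be sketched in a few lines. The dimensions come from the text; sampling z from a standard normal distribution is the usual DCGAN convention and an assumption here, as are the helper names.

```python
import random

LATENT_DIM = 100  # length of the latent vector z
NUM_GENRES = 20   # length of the one-hot genre vector


def one_hot(index, size):
    """Vector of zeros with a single 1.0 at position `index`."""
    vec = [0.0] * size
    vec[index] = 1.0
    return vec


def generator_input(genre_index, rng):
    """Concatenate random noise with the one-hot genre code."""
    z = [rng.gauss(0.0, 1.0) for _ in range(LATENT_DIM)]
    return z + one_hot(genre_index, NUM_GENRES)


rng = random.Random(0)
x = generator_input(genre_index=3, rng=rng)
print(len(x), x[LATENT_DIM:])
```

Changing only `genre_index` while keeping the same noise is the usual way to check whether a conditional generator has actually learned to use the genre.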



He managed to achieve the following result:





As you can see, the student was keen to try different approaches, and the results were interesting. The author gained a lot of new experience and concluded that an idea like this calls for a more complex neural network, such as StyleGAN, from the start.



Network training process:





Project team: Dmitry Piterkin.



Video of the project defense.






Soon we will tell you about the most interesting graduation projects in C++ and Go development, front-end development and interface design. You can read more about our educational projects at this link. And visit the Technostream channel more often: new training videos about programming, development and other disciplines appear there regularly.


