Rapid detection of supernovae with neural networks

Hello, Habr! I would like to introduce you to a translation (slightly adapted) of the excellent article "Fast Supernovae Detection using Neural Networks" by Rodrigo Carrasco-Davis from the Institute of Astrophysics in Chile.



A bit of background



Astronomy is the study of celestial objects such as stars, galaxies and black holes. Studying them is like working in a natural physics laboratory: the Universe hosts the most extreme processes, most of which cannot be reproduced on Earth. Observing these processes lets us understand the world more deeply and test what we know about physics by comparing established ideas with what we see in the Universe.



One type of event is of special interest to astronomers. It happens at the end of the life of massive stars. Stars are made mostly of hydrogen, which gravity pulls toward the center. When the density gets high enough, hydrogen nuclei start to fuse. Fusion makes the star shine and, stage by stage, produces new chemical elements: helium, carbon, oxygen, neon and so on. Fusion generates outward pressure while gravity pulls inward, and the balance between the two keeps the star stable as it burns. The more massive the star, the hotter its core and the faster it burns its nuclear fuel.



Gradually, fusion moves on to heavier elements: magnesium, silicon, sulfur, and eventually iron, cobalt and nickel. Fusing elements beyond these would require more energy than the reaction releases, so the core collapses and a supernova explosion occurs.





Crab Nebula, a supernova remnant



This process is very important for astronomers. Thanks to the extreme conditions during the explosion, they can observe the synthesis of heavy elements, test how matter behaves under intense pressure and temperature, and observe what the explosion leaves behind, which can be a neutron star or a black hole.



Supernovae can also be used as standard candles. Measuring distances to celestial objects is a common problem in astronomy. Since stars are so far from Earth, it is difficult to tell whether a star is faint and close to us, or very bright and very distant. Supernovae of one class, Type Ia, explode in nearly the same way and reach a similar peak brightness, which is why astronomers use them to measure distances. This matters when studying, for example, the expansion of the Universe and dark energy.
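
The standard-candle idea can be made concrete with the distance modulus relation, m - M = 5 log10(d / 10 pc), which links apparent magnitude m, absolute magnitude M and distance d. The sketch below is only an illustration: the function name is hypothetical, the peak absolute magnitude of -19.3 is a commonly quoted approximate value for Type Ia supernovae rather than a number from the article, and dust extinction and cosmological corrections are ignored.

```python
# Assumed, approximate peak absolute magnitude of a Type Ia supernova
# (illustrative value, not taken from the article).
ASSUMED_PEAK_ABS_MAG = -19.3


def distance_parsecs(apparent_mag, absolute_mag=ASSUMED_PEAK_ABS_MAG):
    """Invert the distance modulus m - M = 5 * log10(d / 10 pc) for d."""
    return 10.0 ** ((apparent_mag - absolute_mag) / 5.0 + 1.0)


if __name__ == "__main__":
    # A hypothetical supernova observed at peak apparent magnitude 15.7.
    d = distance_parsecs(15.7)
    print(f"Estimated distance: {d / 1e6:.0f} Mpc")
```

The fainter the supernova appears relative to its known intrinsic brightness, the farther away it must be: every 5 magnitudes of difference corresponds to a factor of 100 in distance.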



Although supernova explosions are very bright, they are hard to spot: they are far from Earth, they are rare (about one supernova per galaxy per century), and they are short-lived, lasting from several days to a couple of weeks. In addition, getting useful information from a supernova requires a spectrograph, an instrument that measures the energy emitted by the explosion at different frequencies. It is also desirable to start observing as early as possible, since many interesting physical processes take place in the first hours of the explosion. Now ask yourself: how can we quickly find these explosions among all the other observable astronomical objects in the Universe?



Astronomy today



Several decades ago, an astronomer had to select a specific object and point a telescope at it to get the information they needed. Modern survey instruments such as the Zwicky Transient Facility (ZTF) or the Vera Rubin Observatory capture high-quality images of the sky very quickly, covering the visible sky every three days. ZTF generates 1.4 TB of data per night and, in real time, identifies and reports interesting changing objects in the sky.



When something changes its brightness, these "smart" telescopes notice it and send an alert. Alerts arrive as a data stream in which each message contains three cropped images of 63 by 63 pixels, called the science, reference and difference images.



The science image is the most recent observation of a given patch of sky. The reference image shows what was there at the start of observations. Everything that changed between the two appears in the third image, the difference image. The telescope transmits up to one million alerts per night, though more often several thousand. Checking every alert from a single night by hand would take a person about 3.5 days.
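
The text does not say how long one alert takes to inspect or which nightly volume the 3.5 days refers to. As a rough check under an assumed inspection time of 3 seconds per alert (an assumption, not a figure from the article), the small calculation below shows how review time scales with the number of alerts. The function name is hypothetical.

```python
SECONDS_PER_DAY = 24 * 60 * 60

# Assumption for illustration: a person spends 3 seconds on each alert.
ASSUMED_SECONDS_PER_ALERT = 3.0


def review_days(alerts_per_night, seconds_per_alert=ASSUMED_SECONDS_PER_ALERT):
    """Days of non-stop work needed to inspect one night of alerts."""
    return alerts_per_night * seconds_per_alert / SECONDS_PER_DAY


for n_alerts in (100_000, 1_000_000):
    print(f"{n_alerts:>9,} alerts -> {review_days(n_alerts):6.2f} days")
```

Under that assumption, about 3.5 days of uninterrupted work corresponds to roughly 100,000 alerts, and a peak night of one million alerts would take more than a month.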





Science, reference and difference images, supplemented with other important data such as observing conditions and information about the object. The fourth image is a color version from PanSTARRS, shown using the Aladin Sky Atlas. The complete evolution of the supernova's brightness over time can be seen in the ALeRCE interface.
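
The relationship between the three images can be sketched with synthetic data. This is a deliberately simplified toy: the images are made-up Gaussian blobs, the helper name is hypothetical, and the difference is a plain pixel-wise subtraction, whereas real survey pipelines must first align the images and match their quality before subtracting.

```python
import numpy as np

STAMP_SIZE = 63  # alert cutouts are 63 x 63 pixels


def gaussian_blob(center_row, center_col, amplitude, sigma, size=STAMP_SIZE):
    """Return a size x size image containing one 2-D Gaussian blob."""
    rows, cols = np.mgrid[0:size, 0:size]
    r2 = (rows - center_row) ** 2 + (cols - center_col) ** 2
    return amplitude * np.exp(-r2 / (2.0 * sigma ** 2))


# Reference: a synthetic "host galaxy" in the middle of the stamp.
reference = gaussian_blob(31, 31, amplitude=100.0, sigma=6.0)

# Science: the same galaxy plus a new point-like source off-center.
science = reference + gaussian_blob(25, 38, amplitude=40.0, sigma=1.5)

# Difference: only what changed between the two epochs remains.
difference = science - reference

peak = tuple(int(i) for i in np.unravel_index(np.argmax(difference),
                                               difference.shape))
print("Brightest pixel in the difference image:", peak)
```

The galaxy cancels out in the subtraction, so the brightest pixel of the difference image sits at the position of the new source rather than at the galaxy's center.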



Since these alerts report everything that changes in the sky, it is important to be able to pick out supernovae from the entire stream generated by the telescope. The problem is that other astronomical objects can also trigger an alert: variable stars changing their brightness, active galactic nuclei, asteroids. False alerts also happen. Fortunately, the science, reference and difference images have a number of distinctive features that help determine whether an alert corresponds to a supernova or to another kind of object. It would be very useful to learn how to distinguish effectively between the main classes of alerts.





Five classes of astronomical objects



Active galactic nuclei are usually located at the center of galaxies. Supernovae usually appear near a host galaxy. Asteroids are observed near the plane of the solar system and do not appear in the reference image. Variable stars show up in images crowded with other stars, since they are found mostly within the Milky Way. False alerts arise for various reasons: bad pixels in the telescope camera, poor subtraction when creating the difference image, cosmic rays, and so on. As noted above, a human cannot check every alert manually. An automatic way of classifying alerts was therefore needed, so that astronomers could focus on the most interesting data, the alerts most likely to contain supernovae.



Search for supernovae using neural networks



Since we roughly understand how the images of these five classes differ, we could try to compute specific features by hand in order to classify them. However, designing such features manually is difficult and requires a long period of trial and error. It was therefore decided to train a convolutional neural network (CNN) to solve the classification problem and detect supernovae quickly.



Rotation invariance is achieved by creating copies of each image rotated in steps of 90°, passing each copy through the network, and then averaging the outputs obtained for the rotated versions. This invariance matters because the structures in alert images have no preferred orientation.
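
The effect of averaging over rotated copies can be shown with a toy example. In the sketch below, `toy_features` is a hypothetical, deliberately orientation-sensitive stand-in for the convolutional network (it is not the authors' model), and the averaging is done directly on its outputs.

```python
import numpy as np


def toy_features(image):
    """Stand-in for a CNN: a fixed, orientation-sensitive function.

    Hypothetical; it returns the brightness-weighted row and column sums.
    """
    rows, cols = image.shape
    row_weights = np.arange(rows, dtype=float)
    col_weights = np.arange(cols, dtype=float)
    return np.array([
        row_weights @ image.sum(axis=1),
        col_weights @ image.sum(axis=0),
    ])


def rotation_averaged_features(image):
    """Average the features of the 0, 90, 180 and 270 degree rotations."""
    rotated = [np.rot90(image, k) for k in range(4)]
    return np.mean([toy_features(r) for r in rotated], axis=0)


# A 63 x 63 stamp with a single bright pixel, and the same stamp rotated.
stamp = np.zeros((63, 63))
stamp[10, 20] = 1.0
turned = np.rot90(stamp)

same_raw = np.allclose(toy_features(stamp), toy_features(turned))
same_avg = np.allclose(rotation_averaged_features(stamp),
                       rotation_averaged_features(turned))
print("raw features match after rotation:     ", same_raw)
print("averaged features match after rotation:", same_avg)
```

Rotating the input by 90° only permutes the set of four rotated copies, so their average is unchanged, even though the underlying function itself depends on orientation.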



The scientists also fed the network some of the metadata contained in the alert, such as sky coordinates, distance to other known objects and metrics of atmospheric conditions. After the model was trained with a cross-entropy loss, the predicted probability that an alert contained a supernova was concentrated around 0 or 1, even though the classifier sometimes predicted the wrong class. This is inconvenient for a researcher who still has to filter the supernova candidates after the model has made its prediction, because saturated values say little about how likely a prediction is to be wrong.



To maximize the entropy of the prediction and spread out the output probabilities, the scientists added an extra term to the network's loss function. This made the predictions more fine-grained, with probabilities covering the entire range from 0 to 1 rather than only the extreme values. The result is predictions that are much easier to interpret, which helps the astronomer select worthwhile supernova candidates.
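
One common way to encourage higher-entropy predictions is to subtract an entropy term from the cross-entropy loss. The sketch below assumes that form; the weight `beta`, the function names and the example probabilities are all hypothetical, and the exact formulation used by the authors may differ.

```python
import numpy as np

EPS = 1e-12  # avoids log(0)


def cross_entropy(probs, labels):
    """Mean cross-entropy; labels are integer class indices."""
    picked = probs[np.arange(len(labels)), labels]
    return -np.mean(np.log(picked + EPS))


def prediction_entropy(probs):
    """Mean Shannon entropy of the predicted class distributions."""
    return -np.mean(np.sum(probs * np.log(probs + EPS), axis=1))


def regularized_loss(probs, labels, beta=0.5):
    """Cross-entropy minus beta times entropy (beta is an assumed weight).

    Subtracting the entropy means that minimizing the loss rewards
    predictions that are less saturated.
    """
    return cross_entropy(probs, labels) - beta * prediction_entropy(probs)


# Two made-up batches of predictions over five classes, both correct.
labels = np.array([0, 1])
saturated = np.array([[0.999, 0.001, 0.0, 0.0, 0.0],
                      [0.001, 0.999, 0.0, 0.0, 0.0]])
softer = np.array([[0.90, 0.04, 0.02, 0.02, 0.02],
                   [0.04, 0.90, 0.02, 0.02, 0.02]])

for name, probs in (("saturated", saturated), ("softer", softer)):
    print(name,
          "CE=%.3f" % cross_entropy(probs, labels),
          "H=%.3f" % prediction_entropy(probs),
          "loss=%.3f" % regularized_loss(probs, labels))
```

With plain cross-entropy the saturated predictions score better; once the entropy term is included with this weight, the softer predictions give the lower loss, which is the pressure that spreads probabilities away from 0 and 1.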





Convolutional neural network with enhanced rotation invariance. Rotated copies of each image are created and passed through the same network architecture; their outputs are then combined by average pooling at a dense layer before being concatenated with the metadata.



To check the model's predictions, the scientists ran about 400,000 unlabeled objects through the neural network, distributed evenly over the full sky coverage of ZTF. Each predicted class turned out to have its own characteristic spatial distribution, which makes sense given the nature of each kind of object. For example, active galactic nuclei and supernovae are extragalactic objects and are found mostly away from the plane of the Milky Way, since more distant objects are unlikely to be visible through that plane due to occlusion. The model correctly predicts fewer of these objects near the plane of the Milky Way (galactic latitudes closer to 0). Variable stars are correctly found at a higher density in the galactic plane. Asteroids lie near the plane of the solar system, also called the ecliptic (marked with a yellow line). False alerts occur everywhere.
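
A sanity check of this kind can be sketched as a per-class count of objects near the galactic plane. The function name, the 15-degree threshold and the tiny sample below are all made up for illustration (they are not ZTF data), and the galactic latitudes are assumed to be already available for each object.

```python
import numpy as np


def fraction_near_plane(latitudes_deg, classes, max_abs_lat=15.0):
    """For each class, the share of objects with |b| below max_abs_lat."""
    latitudes_deg = np.asarray(latitudes_deg, dtype=float)
    classes = np.asarray(classes)
    near = np.abs(latitudes_deg) < max_abs_lat
    return {str(c): float(near[classes == c].mean())
            for c in np.unique(classes)}


# Tiny made-up sample, purely to show the call; not real survey data.
lat = [-60.0, 45.0, 3.0, -5.0, 8.0, 70.0]
cls = ["SN", "SN", "VS", "VS", "VS", "AGN"]
result = fraction_near_plane(lat, cls)
print(result)
```

If the classifier behaves sensibly, extragalactic classes should show a low fraction near the plane and variable stars a high one.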



The information in the images (science, reference and difference) is sufficient to obtain a good classification on the training set, but integrating the metadata was critical for obtaining the correct spatial distribution of the predictions.





Spatial distribution of an unlabeled set of astronomical objects. Each plot is in galactic coordinates. Galactic latitude is measured from the plane of the Milky Way, so latitudes close to 0 lie close to that plane. Galactic longitude indicates which part of the disk we are looking at within the plane of the Milky Way. The yellow line marks the plane of the solar system (the ecliptic).



Supernova hunters



The resulting Supernova Hunter web interface lets astronomers examine the objects that the neural network considers most likely to be supernovae. They can also report misclassifications made by the model, so that these examples can be added to the training set to improve the network later.





Supernova Hunter: user interface for examining supernova candidates. It shows a list of alerts with a high probability of being supernovae. For each alert, the images, the object's position and the metadata are displayed.



With the neural network classifier and Supernova Hunter, 394 supernovae were confirmed and 3,060 supernova candidates were reported to the Transient Name Server between June 26, 2019 and July 21, 2020, an average of 9.2 candidates per day. This rate dramatically increases the number of supernovae that can be studied in the early stages of the explosion.



Perspectives



The scientists behind Supernova Hunter are now working to improve the model's classification performance so that it identifies supernova candidates more accurately and requires less human attention. Ideally, the system should be able to report every possible supernova candidate automatically and with a high degree of confidence.



Another area of the scientists' work is the search for rare objects using outlier detection methods. This is a challenging but realistic task, as new telescopes could discover new types of astronomical objects thanks to the unprecedented sampling rate and scale of each observation.



A new way to analyze huge amounts of astronomical data will be not only useful, but also necessary, since organizing the classification and redistribution of data is an important part of science. The use of today's powerful telescopes is fundamentally changing the way astronomers study celestial objects, and scientists must be prepared to work with new technologies.



Thank you for your attention! Original article.


