What threat hunting is and how to hunt cybercriminals properly





Threat hunting (TH) is a proactive search for traces of compromise or malware activity that standard security tools have not detected. In this article we look at how the process works, which tools can be used to search for threats, and what to keep in mind when forming and testing hypotheses.



What is threat hunting and why is it needed



In threat hunting, the analyst does not wait for the sensors of the protection systems to trigger, but deliberately looks for signs of compromise. To do this, the analyst develops and tests assumptions about how attackers could have penetrated the network. Such checks should be systematic and regular.



A correct implementation of the process rests on the following principles:



  • Assume that the system has already been compromised. The main goal is to find traces of the penetration.
  • To search, you need a hypothesis about exactly how the system was compromised.
  • The search is iterative: after testing one hypothesis, the analyst puts forward a new one and continues the search.


Traditional automated defenses often miss sophisticated targeted attacks. The reason is that such attacks are usually stretched out over time, so security tools cannot correlate the separate phases of an attack. Attackers also plan their penetration vectors carefully and prepare scenarios for their actions inside the infrastructure. This lets them avoid actions that would give them away and pass off their activity as legitimate. Attackers are constantly improving their knowledge and buying or developing new tools.



Detecting targeted attacks is especially relevant for organizations that have already been hacked. According to the FireEye M-Trends report, 64% of previously compromised organizations were attacked again. In other words, more than half of the hacked companies remain at risk. This means that measures for early detection of compromise are needed, and TH is one way to provide them.



Threat hunting helps information security specialists reduce the time it takes to detect a breach and refresh their knowledge of the protected infrastructure. TH also combines well with threat intelligence (TI), especially when TI indicators are used to formulate a hypothesis.



How to form hypotheses for testing



Since TH assumes from the outset that the attacker has already penetrated the infrastructure, the first thing to do is to narrow down where to look for traces of the compromise. This is done by putting forward a hypothesis about how the penetration occurred and what evidence of it can be found in the infrastructure. Having formulated a hypothesis, the analyst checks whether the assumption is true. If the hypothesis is not confirmed, the analyst develops and tests a new one. If testing reveals traces of a compromise or the presence of malware, an investigation begins.







Figure 2. Scheme of threat hunting
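
The cycle in the scheme can be sketched in a few lines of Python. This is only an illustration: `Hypothesis`, `hunt`, and the two lambda checks are hypothetical names invented for the example, and in a real hunt each check would be a query against logs or traffic.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Hypothesis:
    """A testable assumption about how the network was compromised."""
    statement: str
    # The check returns a list of findings; an empty list means "not confirmed".
    check: Callable[[], List[str]]


def hunt(hypotheses: List[Hypothesis]) -> Optional[dict]:
    """Test hypotheses one by one and stop at the first confirmed one."""
    for hypothesis in hypotheses:
        findings = hypothesis.check()
        if findings:
            # Confirmed: this is the point where an investigation starts.
            return {"hypothesis": hypothesis.statement, "findings": findings}
    # Nothing confirmed: the analyst formulates new hypotheses and repeats.
    return None


# Hypothetical checks standing in for real log and traffic queries.
result = hunt([
    Hypothesis("Lateral movement over WMI", lambda: []),
    Hypothesis("Data exfiltration through a DNS tunnel",
               lambda: ["burst of long DNS names from one host"]),
])
print(result)
```

The loop returns on the first confirmed hypothesis because, as described above, a confirmation switches the work from hunting to investigation.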



The idea for a hypothesis may come from the analyst's personal experience, but there are other sources as well, for example:



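  • Threat intelligence (TI) indicators, for example a file X with the MD5 hash Y.
  • Attacker tactics, techniques, and procedures (TTPs), which are catalogued in MITRE ATT&CK.
  • Knowledge of the infrastructure itself, for example data from asset management.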


Tools for threat hunting



After formulating a hypothesis, you need to identify the data sources that may contain the information needed to test it. Such sources often hold far too much data, and the relevant part has to be picked out. The TH process therefore comes down to exploring, filtering, and analyzing a huge amount of data about what is happening in the infrastructure. Consider the sources where information for testing a hypothesis can be found:







Figure 3. Classification of information sources for conducting TH



Most of the relevant information is contained in logs and network traffic. Products of the SIEM (security information and event management) and NTA (network traffic analysis) classes help to analyze it. External sources, such as TI feeds, should also be included in the analysis.
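
As a rough illustration of bringing an external TI feed into the analysis, the sketch below checks event fields against a set of indicators. The function name, the field names `md5` and `dst_ip`, and all sample values are assumptions made for the example; real SIEM events and feeds have their own schemas.

```python
from typing import Dict, Iterable, List, Set


def match_indicators(events: Iterable[Dict[str, str]],
                     indicators: Set[str],
                     fields: Iterable[str] = ("md5", "dst_ip")) -> List[Dict[str, str]]:
    """Return the events in which any chosen field equals a known indicator."""
    normalized = {value.lower() for value in indicators}
    fields = tuple(fields)
    return [
        event for event in events
        if any(event.get(field, "").lower() in normalized for field in fields)
    ]


# Hypothetical feed: one file hash and one address from a documentation range.
feed = {"D41D8CD98F00B204E9800998ECF8427E", "203.0.113.7"}
events = [
    {"host": "w-user-01", "md5": "d41d8cd98f00b204e9800998ecf8427e"},
    {"host": "w-user-02", "dst_ip": "10.125.2.36"},
    {"host": "w-user-03", "dst_ip": "203.0.113.7"},
]
hits = match_indicators(events, feed)
print([event["host"] for event in hits])
```

Hashes are compared case-insensitively because feeds and logs do not always agree on upper or lower case.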



How it works in practice



The main goal of TH is to detect a breach that automated security tools have missed.



As an example, let's test two hypotheses and show in practice how traffic analysis and log analysis systems complement each other in the process.



Hypothesis No. 1: an attacker has entered the network through a workstation and is trying to gain control of other nodes, using WMI command execution to move through the network.


Suppose the attackers have obtained the credentials of a privileged user. They now try to take control of other nodes in order to reach a host with valuable data. One way to launch programs on a remote system is Windows Management Instrumentation (WMI). This technology is responsible for centralized management and monitoring of the various parts of the computer infrastructure. Its developers made it applicable not only to the components and resources of a single host but also to a remote computer. For this purpose, commands and responses are transmitted over the DCERPC protocol.



Therefore, to test the hypothesis, you need to examine DCERPC requests. Let's show how this can be done using traffic analysis and a SIEM system. Fig. 4 shows all DCERPC network interactions after filtering. As an example, we chose the time interval from 06:58 to 12:58.

Figure 4. Filtered DCERPC sessions

In Fig. 4 we see two dashboards. On the left are the nodes that initiated DCERPC connections. On the right are the nodes to which the clients connected. As the figure shows, all clients on the network access only the domain controller. This is legitimate activity, since hosts joined to an Active Directory domain use the DCERPC protocol to contact a domain controller for synchronization. The same kind of communication between user hosts would be considered suspicious.
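
The rule of thumb above (DCERPC to a domain controller is normal, DCERPC between other hosts deserves a look) can be expressed as a simple filter. This is a minimal sketch that assumes sessions have already been exported as source and destination address pairs; the function name and the address 10.125.3.21 are hypothetical.

```python
from typing import Iterable, List, Set, Tuple

Session = Tuple[str, str]  # (source IP, destination IP)


def suspicious_dcerpc(sessions: Iterable[Session],
                      domain_controllers: Set[str]) -> List[Session]:
    """Keep DCERPC sessions in which neither side is a domain controller."""
    return [
        (src, dst) for src, dst in sessions
        if src not in domain_controllers and dst not in domain_controllers
    ]


DOMAIN_CONTROLLERS = {"10.125.2.36"}
sessions = [
    ("10.125.3.21", "10.125.2.36"),  # workstation -> domain controller
    ("10.125.2.36", "10.125.4.16"),  # domain controller -> another node
    ("10.125.3.21", "10.125.3.10"),  # workstation -> workstation
]
flagged = suspicious_dcerpc(sessions, DOMAIN_CONTROLLERS)
print(flagged)
```

A flagged pair is not proof of an attack. It only tells the analyst which sessions to open first.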















Since nothing suspicious was found in the selected period, we move along the timeline and select the next interval, from 12:59 to 16:46. Here we noticed a strange change in the list of destination hosts (see Fig. 5): two new nodes appeared.

Figure 5. After changing the time interval, two new nodes appeared in the server list

Consider first the one without a DNS name (10.125.4.16).

Figure 6. Refinement of the filter to find out who connected to 10.125.4.16

As Fig. 6 shows, it is accessed by the domain controller 10.125.2.36 (see Fig. 4), which means that this interaction is legitimate.
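
Spotting destinations that appear only in the later interval is a set difference. The sketch below assumes the destination lists of the two time windows have been exported from the traffic analysis tool; the function name is hypothetical.

```python
from typing import Iterable, List


def new_destinations(previous_window: Iterable[str],
                     current_window: Iterable[str]) -> List[str]:
    """Return destinations seen in the current window but not in the previous one."""
    return sorted(set(current_window) - set(previous_window))


morning = ["10.125.2.36"]
afternoon = ["10.125.2.36", "10.125.4.16", "10.125.3.10"]
print(new_destinations(morning, afternoon))
```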



























Next, we need to find out who connected to the second new node. In Fig. 5 it is win-admin-01.ptlab.ru (10.125.3.10). The name suggests that this is the administrator's computer. After the filter is refined, only two session source nodes remain.

Figure 7. Refine the filter to find out who connected to win-admin-01

As in the previous case, one of the initiators is the domain controller. These sessions are not suspicious, since they are common in an Active Directory environment. The second node (w-user-01.ptlab.ru), however, is a user's computer judging by its name, and such connections are anomalous. If you go to the Sessions tab with this filter applied, you can download the traffic and examine the details in Wireshark.

Figure 8. Downloading relevant sessions























In the traffic you can see a call to the IWbemServices interface, which indicates that a WMI connection is in use.

Figure 9. Call to the IWbemServices interface (Wireshark)

The transmitted calls are encrypted, however, so the specific commands are unknown.

Figure 10. DCERPC traffic is encrypted, so the transmitted command is not visible (Wireshark)

To finally confirm the hypothesis that this communication is illegitimate, you need to check the host logs. You can log in to the host and view the system logs locally, but it is more convenient to use a SIEM system. In the SIEM interface we added a filter condition that left only the logs of the target node at the time the DCERPC connection was established, and saw the following picture:



































Figure 11. System logs of win-admin-01 at the moment the DCERPC connection was established
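
The SIEM filter used here, host logs restricted to the moment of the network session, amounts to a time-window match. The sketch below shows the idea on plain dictionaries. The event fields, the 30-second tolerance, and all timestamps are assumptions for illustration and are not taken from the figures.

```python
from datetime import datetime, timedelta
from typing import Dict, List


def events_near(events: List[Dict], host: str, moment: datetime,
                tolerance: timedelta = timedelta(seconds=30)) -> List[Dict]:
    """Return the events of one host that lie within `tolerance` of `moment`."""
    return [
        event for event in events
        if event["host"] == host and abs(event["time"] - moment) <= tolerance
    ]


# Hypothetical session start time and log records.
session_start = datetime(2020, 1, 1, 15, 30, 0)
log = [
    {"host": "win-admin-01", "time": datetime(2020, 1, 1, 15, 30, 2), "message": "network logon"},
    {"host": "win-admin-01", "time": datetime(2020, 1, 1, 11, 0, 0), "message": "service started"},
    {"host": "w-user-01", "time": datetime(2020, 1, 1, 15, 30, 1), "message": "process started"},
]
matched = events_near(log, "win-admin-01", session_start)
print([event["message"] for event in matched])
```

The tolerance absorbs small clock differences between the traffic sensor and the host. In practice its value depends on how well the clocks are synchronized.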



In the logs we saw an exact match with the start time of the first session (see Fig. 9), and the connection initiator is the host w-user-01. Further analysis of the logs shows that the connection was made under the PTLAB\Admin account and that a command was run (see Fig. 12) to create the user john with the password password!!!: net user john password!!! /add.

Figure 12. Executed command during connection
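
Account creation through `net user ... /add` is easy to search for once process command lines are collected. Below is a hedged sketch: the regular expression and the function name are my own, they cover only the simple form of the command shown above, and they would miss other ways of creating accounts.

```python
import re
from typing import Optional

# Matches "net user <name> [password] /add", also when launched via net.exe or net1.
NET_USER_ADD = re.compile(
    r"\bnet1?(?:\.exe)?\s+user\s+(\S+)\s+.*?/add\b",
    re.IGNORECASE,
)


def created_account(command_line: str) -> Optional[str]:
    """Return the account name if the command line creates a local user."""
    match = NET_USER_ADD.search(command_line)
    return match.group(1) if match else None


print(created_account("net user john password!!! /add"))
print(created_account("cmd.exe /c NET.EXE user backup /add"))
print(created_account("net user"))
```

The function returns the account name rather than a boolean so that the result can be compared against the list of accounts that administrators are known to have created.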











We found out that someone on the node w-user-01.ptlab.ru used WMI under the PTLAB\Admin account to add a new user on the host win-admin-01.ptlab.ru (10.125.3.10). In a real TH, the next step is to find out whether this was administrative activity. To do this, contact the owner of the PTLAB\Admin account and ask whether they performed the described actions. Since our example is synthetic, we will assume that the activity is illegitimate. In a real TH, if illegitimate use of the account is confirmed, you need to open an incident and conduct a detailed investigation.



Hypothesis No. 2: an attacker has penetrated the network and is at the data exfiltration stage, using traffic tunneling to get the data out.


Traffic tunneling means organizing a channel so that packets of one network protocol (possibly in modified form) are carried inside the fields of another network protocol. A common example is an encrypted channel such as SSH. Encrypted channels keep the transmitted information confidential and are common in modern corporate networks. There are also exotic variants, such as ICMP or DNS tunnels, which cybercriminals use to disguise their activity as legitimate.



Let's start with the most common way to tunnel traffic, the SSH protocol. To do this, we filter all sessions that use SSH:

Figure 13. Searching for SSH sessions in traffic











The figure shows that there is no SSH traffic in the infrastructure, so we need to pick the next protocol that could be used for tunneling. Since DNS traffic is almost always allowed in corporate networks, we will consider it next.



If you filter traffic by DNS, you can see that one of the nodes has an abnormally large number of DNS queries.







Figure 14. Widget with DNS client session statistics
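
An "abnormally large" number of queries needs a baseline. One simple option, shown here as an assumption rather than as the method of the tool in the figure, is to compare each client with the median across all clients. The factor of 10 and the sample addresses other than 10.125.3.10 are arbitrary.

```python
from collections import Counter
from statistics import median
from typing import Dict, Iterable


def noisy_dns_clients(query_sources: Iterable[str], factor: float = 10.0) -> Dict[str, int]:
    """Return clients whose query count exceeds `factor` times the median count."""
    counts = Counter(query_sources)
    if not counts:
        return {}
    typical = median(counts.values())
    return {client: n for client, n in counts.items() if n > factor * typical}


# One source address per observed DNS query (hypothetical sample).
sources = (["10.125.3.10"] * 500 + ["10.125.3.21"] * 12
           + ["10.125.3.22"] * 9 + ["10.125.3.23"] * 15)
print(noisy_dns_clients(sources))
```

The median is used instead of the mean so that a single very noisy host does not shift the baseline it is compared against.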



After filtering sessions by the source of the requests, we see where this anomalous amount of traffic is sent and how it is distributed between destination nodes. Fig. 15 shows that some of the traffic goes to the domain controller, which acts as the local DNS server. However, a large share of the requests goes to an unknown host. In a corporate network built on Active Directory, user computers should not resolve DNS names through an external DNS server, bypassing the corporate one. If such activity is detected, you need to find out what is being transmitted in the traffic and where all these requests are going.

Figure 15. Searching for DNS sessions in traffic
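
The check described above, clients talking DNS to something other than the corporate resolver, can be sketched as a counter over client and server pairs. The function name and the external address 203.0.113.53 are placeholders. Only the domain controller address comes from the example.

```python
from collections import Counter
from typing import Iterable, Set, Tuple


def queries_bypassing_resolver(queries: Iterable[Tuple[str, str]],
                               approved_resolvers: Set[str]) -> Counter:
    """Count DNS queries per (client, server) pair sent to non-approved servers."""
    totals: Counter = Counter()
    for client, server in queries:
        if server not in approved_resolvers:
            totals[(client, server)] += 1
    return totals


APPROVED = {"10.125.2.36"}  # the domain controller acting as the local DNS server
queries = ([("10.125.3.10", "10.125.2.36")] * 40
           + [("10.125.3.10", "203.0.113.53")] * 460)
bypass = queries_bypassing_resolver(queries, APPROVED)
print(bypass.most_common(1))
```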











If you go to the "Sessions" tab, you can see what is transmitted in the requests to the suspicious server. The interval between requests is very short, and there are many sessions. Such parameters are unusual for legitimate DNS traffic.







Figure 16. DNS traffic parameters
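
To put a number on "the interval between requests is very short", one can look at the median gap between consecutive requests from a host. The sketch below assumes timestamps in milliseconds and uses invented values. What counts as "too short" is left to the analyst.

```python
from statistics import median
from typing import List, Optional


def median_gap(timestamps_ms: List[int]) -> Optional[float]:
    """Return the median gap between consecutive timestamps, or None if fewer than two."""
    ordered = sorted(timestamps_ms)
    gaps = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    return median(gaps) if gaps else None


# Hypothetical request times of one client, in milliseconds.
print(median_gap([0, 200, 400, 700, 900]))
```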



Opening any session card shows a detailed description of the requests and responses. The server's replies contain no errors, but the requested records look very suspicious, since nodes usually have shorter and more meaningful DNS names.

Figure 17. Suspicious DNS record request

Traffic analysis showed that the suspicious DNS requests originate from the win-admin-01 host. It is time to analyze the logs of this node, the source of the activity. To do this, we go to the SIEM.
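
The observation that tunneled names are longer and less meaningful than ordinary host names can be turned into a rough score: the length of the longest label combined with its character entropy. This is a common heuristic, not the method used by the tools in this article. The thresholds and the sample names are arbitrary assumptions.

```python
import math
from collections import Counter


def shannon_entropy(text: str) -> float:
    """Shannon entropy of the character distribution, in bits per character."""
    if not text:
        return 0.0
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in Counter(text).values())


def looks_like_tunnel(name: str, max_label: int = 30, min_entropy: float = 3.5) -> bool:
    """Flag names whose longest label is both long and random-looking."""
    labels = name.rstrip(".").split(".")
    longest = max(labels, key=len)
    return len(longest) > max_label and shannon_entropy(longest) >= min_entropy


print(looks_like_tunnel("www.example.com"))
print(looks_like_tunnel("a1b2c3d4e5f6g7h8i9j0k1l2m3n4o5p6q7r8.tunnel.example"))
```

Both conditions are required because long labels alone also occur in legitimate traffic, for example in names generated by content delivery networks, so any hit still needs a human look.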















We need to find the system logs of win-admin-01 and see what happened around 17:06. They show that a suspicious PowerShell script was running at that time.

Figure 18. PowerShell execution at the same time as sending suspicious requests

The logs also record which script was executed.

Figure 19. Fixing the name of the running script in the logs

The name of the script, admin_script.ps1, hints at legitimacy, but administrators usually name scripts after their specific function, and this name is generic. Moreover, the script is located in the folder for temporary files. An important administrative script is unlikely to be kept in a folder that can be emptied at any time.
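
The "script in a temporary folder" sign is simple to check automatically once script paths are available from the logs. A minimal sketch follows. The folder names treated as temporary and both sample paths are assumptions, since the exact path from the figure is not reproduced here.

```python
from pathlib import PureWindowsPath

# Directory names commonly used for temporary files (assumed list).
TEMP_MARKERS = ("temp", "tmp")


def runs_from_temp(script_path: str) -> bool:
    """Return True if any parent directory of the script is a temp folder."""
    parents = [part.lower() for part in PureWindowsPath(script_path).parts[:-1]]
    return any(part in TEMP_MARKERS for part in parents)


print(runs_from_temp(r"C:\Users\admin\AppData\Local\Temp\admin_script.ps1"))
print(runs_from_temp(r"C:\Scripts\Maintenance\rotate_logs.ps1"))
```

`PureWindowsPath` is used so that the check parses Windows paths correctly even when the logs are analyzed on a non-Windows machine.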



























Among the events we found was the creation of an unusual cryptographic class from the Logos.Utility library. This library is rare and is no longer supported by its developer, so seeing its classes created is unusual. Let's try to find projects that use it.

Figure 20. Creating a custom cryptographic class

A search turns up, at the second link, a utility that sets up a DNS tunnel and uses this class.

Figure 21. Searching for information about a script by class name

To make sure that this is the utility we are looking for, let's look for additional signs in the logs. Two pieces of evidence turn up. The first is that the script runs the nslookup utility.

Figure 22. Running the nslookup utility by the script



































The nslookup.exe utility is used for network diagnostics and is rarely run by regular users. The same launch is visible in the source code of the tunneling utility.

Figure 23. Code for launching the nslookup utility (GitHub)

The second piece of evidence is a rather distinctive string used to generate random values.

Figure 24. Generating random values by the script

Searching the source code turns up this very line.

Figure 25. Code for generating a random value

The tunnel hypothesis is confirmed, but what the attackers were actually doing remains unclear. Further analysis of the logs revealed two process launches.

Figure 26. Search for office documents for further exfiltration















































The command lines of these processes show a search for documents to exfiltrate. The hypothesis is thus fully confirmed: the attackers really did use traffic tunneling to get data out.



Conclusions



As recent research reports show, the average time attackers spend inside an infrastructure remains long. So do not wait for signals from automated defenses: act proactively. Study your infrastructure and modern attack methods, and use the research published by TI teams (FireEye, Cisco, PT Expert Security Center).



I am not calling for automated protections to be abandoned. But installing and correctly configuring such a system should not be treated as the finish line. It is only the first necessary step. After that, you need to keep track of how the monitored network environment evolves and operates, and keep your finger on the pulse.



The following tips will help you:



  1. . . , .
  2. . , .
  3. . , . . , TH , .
  4. Automate routine tasks so that you have more time for creative work and for trying out new approaches.
  5. Simplify the analysis of large amounts of data. Tools that let the analyst see what is happening on the network and on its nodes as a single picture are useful here. They include a platform for exchanging TI indicators, a traffic analysis system, and a SIEM system.


Posted by Anton Kutepov, PT Expert Security Center at Positive Technologies.



The entire analysis was carried out in the PT Network Attack Discovery traffic analysis system and the MaxPatrol SIEM security event management system.


