How do hackers steal keys and passwords?

I look for vulnerabilities in security systems for a living. At some point it became clear to me that my clients were not familiar enough (if at all) with the basic techniques of "hacking". API keys, passwords, SSH keys and certificates are all great security mechanisms, but only as long as they are kept secret. Once such data reaches someone who should not have access to it, the complexity of the passwords and the sophistication of the hashing algorithms no longer matter. In this post, I want to talk about the concepts, methods and tools that security researchers use to find secret data, which is then used to break into systems. I will also describe some simple measures that can help reduce the risk of a successful attack.

It is important to note that the "game" of attack and defense played by hackers and the owners of computer systems is an unfair one. The attacker needs to get into the system, to win, only once. The defenders can only win by winning every time. The main difficulty is knowing what to pay attention to. But once defenders know which virtual "doors" a hacker can use to get into their system, those "doors" can be protected with fairly simple mechanisms. I believe that the very simplicity of these mechanisms sometimes makes them seem unimportant, which is why many defenders overlook them.

Here are the basic rules for protecting systems that I am going to cover in this article. They are simple, but that does not mean you can safely forget about them:

Use multi-factor authentication (MFA) wherever possible: Google, GitHub, VPN services.
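Rotate keys and passwords regularly.
Scan your code for secrets regularly, ideally as part of the CI process.
Manage accounts and access through one centralized system.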

When it comes to preventing leaks of secret data and closing "holes" in security systems, the Pareto principle applies: 20% of the effort yields 80% of the result.

How do hackers work when they find passwords and secret keys? What tools do they use?

Hackers find secret data in JavaScript files

API keys are scattered all over the internet, and anyone can use them. That is a fact. Often there is no particular reason for the keys being publicly available. Developers simply leave them lying around. For example, keys end up in code for the following reasons:

For debugging purposes.
For local development purposes.
In the form of comments intended for those who will support the project later.

Blocks of code resembling the following can be found quite often on the Internet:

// DEBUG ONLY
// TODO: remove -->
API_KEY=t0psecr3tkey00237948

Although many hackers read JavaScript files themselves, they mostly collect such files using tools like meg, and then check what they find for matching patterns.

How do they do it? After running a scanner such as meg, they search the files it found for strings that match various patterns. The author of meg wrote another excellent program designed for exactly this. It is called gf, and it is an improved version of grep. Running gf with the truffleHog option (also written trufflehog) lets the tool find high-entropy strings that may be API keys. The same goes for searching for the string API_KEY. Searches for such a string succeed often (too often).
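To make the idea of "high-entropy strings" more concrete, here is a minimal sketch in Python. It is not how gf or truffleHog are implemented. The 16-character minimum token length and the 3.5 bits-per-character threshold are assumptions chosen for illustration, and the function names are hypothetical.

```python
import math
import re
from collections import Counter

# Candidate tokens: runs of 16 or more characters commonly seen in keys.
# The minimum length of 16 is an assumption made for this sketch.
TOKEN_RE = re.compile(r"[A-Za-z0-9+/_\-]{16,}")


def shannon_entropy(s: str) -> float:
    """Return the Shannon entropy of s in bits per character."""
    if not s:
        return 0.0
    counts = Counter(s)
    n = len(s)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def find_suspicious(text: str, min_entropy: float = 3.5) -> list[str]:
    """Return the tokens that look random enough to be secrets."""
    return [t for t in TOKEN_RE.findall(text) if shannon_entropy(t) >= min_entropy]
```

A string made of one repeated character has zero entropy and is ignored, while something like the key from the snippet above scores well over the threshold. Real tools combine this kind of check with pattern matching to cut down on false positives.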

Often there are perfectly good reasons for keys to appear in code, but such keys are not protected from outsiders. Let me give you an example. One client I worked with used an external map service, as many projects do. To download and work with map data, the application had to call the service's API with a key. But my client forgot to configure the service to restrict the sources from which it would accept requests made with that key. It is not hard to imagine a simple attack that exhausts the map service's usage quota by flooding it with requests. This can cost the user of such a service a lot of money. Or, even "better" (from the attacker's point of view), such an attack can bring down the parts of the client's project that depend on maps.

JS files are useful to hackers for more than just finding secret data. After all, these files are your application's code, visible to anyone who cares to look. A good hacker can read the code carefully, work out the naming conventions used in it, discover API paths, and find valuable comments. Findings like these are compiled into word lists that are fed to automated scanners. This is what is called an "intelligent automated scan": the hacker combines automated tools with the information gathered about a specific project.
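As a rough illustration of how such a word list might be put together, here is a small Python sketch. The two regular expressions and the build_wordlist name are assumptions made for this example. A real workflow would also filter out language keywords and other noise.

```python
import re

# Quoted strings that look like URL paths, e.g. "/api/v3/users".
PATH_RE = re.compile(r"""["'](/[A-Za-z0-9_\-./]+)["']""")
# Identifiers of at least 3 characters (camelCase, snake_case and so on).
IDENT_RE = re.compile(r"\b[A-Za-z_][A-Za-z0-9_]{2,}\b")


def build_wordlist(js_source: str) -> list[str]:
    """Collect path segments and identifiers as a sorted, de-duplicated list."""
    words = set()
    for path in PATH_RE.findall(js_source):
        words.update(segment for segment in path.split("/") if segment)
    words.update(IDENT_RE.findall(js_source))
    return sorted(words)
```

The output is deliberately crude. The point is that names taken from a project's own code tend to make a far better scanner input than a generic dictionary.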

Here is a real comment from the home page of one project, which speaks in plain text about insecure APIs from which anyone can get data:

/* Debug ->
domain.com/api/v3 not yet in production 
and therefore not using auth guards yet 
use only for debugging purposes until approved */

▍What to do?

Restrict what each API key is allowed to do and where it can be used from.
Scan your own code with the same tools hackers use, such as grep or gf.

Hackers use the Internet Archive

The Internet Archive (also known as the "Wayback Machine") stores periodic snapshots of websites. This project lets you see what the Internet looked like many years ago. The archive is of considerable interest to hackers who need to collect information about a project. Files from old versions of a website can be retrieved using tools like waybackurls (based on waybackurls.py). This means that even if you found a key in your site's code and removed it, but did not rotate the key, hackers can find it in an old version of the site and use it to break into the system.
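To give a feel for what such tools do, here is a minimal Python sketch that asks the Wayback Machine's CDX API for the URLs archived under a domain. It is not the code of waybackurls. The function names, the chosen fields and the default limit are assumptions made for this example.

```python
import json
import urllib.parse
import urllib.request

CDX_ENDPOINT = "https://web.archive.org/cdx/search/cdx"


def build_cdx_url(domain: str, limit: int = 50) -> str:
    """Build a CDX query that lists captures for everything under domain."""
    params = {
        "url": f"{domain}/*",         # match every path under the domain
        "output": "json",
        "fl": "timestamp,original",   # fields to return for each capture
        "collapse": "urlkey",         # one row per distinct URL
        "limit": str(limit),
    }
    return f"{CDX_ENDPOINT}?{urllib.parse.urlencode(params)}"


def list_archived_urls(domain: str, limit: int = 50) -> list[tuple[str, str]]:
    """Fetch (timestamp, original_url) pairs. This performs a network request."""
    with urllib.request.urlopen(build_cdx_url(domain, limit), timeout=30) as resp:
        rows = json.load(resp)
    # With output=json the first row holds the field names, so skip it.
    return [(timestamp, url) for timestamp, url in rows[1:]]
```

Running list_archived_urls against your own domain is a quick way to check which old JavaScript files are still publicly retrievable.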

Here's what to do if you find a key where it shouldn't be:

Create a key designed to replace the compromised key.
Release a new version of the code that uses the new key. This code should be rewritten so that there are no lines in it that make it easy to identify the key.
Remove or deactivate the old key.

▍ The Internet Archive is not the only place to find keys

The old code gives attackers a wide variety of information that interests them.

Secret API paths. These are unsecured API endpoints that the developer assumed would never be made public. Even if the paths a hacker discovers are not directly useful, they help in understanding how a project's API is designed and which conventions it follows. Once the site's code goes into production, the developer has no way to hide it from prying eyes. It is very important to remember this.

GitHub

GitHub is a goldmine for hackers. If you know where to look, you can find a lot of interesting things using simple search tools. If your organization's GitHub account is not protected by multi-factor authentication, then every employee of the organization, without exception, is a walking security hole. It is quite possible that some employees use the same password everywhere, and that this password has already been stolen from them through some other system. A hacker who is interested in a particular organization can easily automate the search for compromised passwords, and can just as well find them by hand.

An organization's employee roster can be created using open source intelligence (OSINT) techniques. LinkedIn or a public list of company employees from GitHub can help the attacker with this.

If, for example, someone decided to hack Tesla, then he may well start studying the company from this page:

https://api.github.com/orgs/teslamotors/members
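Here is a short Python sketch of how the response from this endpoint could be turned into a list of account names. It assumes the endpoint returns a JSON array of objects that each carry a "login" field. The function names are hypothetical, and unauthenticated requests to the GitHub API are rate-limited.

```python
import json
import urllib.request


def members_url(org: str) -> str:
    """Return the public members endpoint for an organization."""
    return f"https://api.github.com/orgs/{org}/members"


def extract_logins(payload: str) -> list[str]:
    """Pull the account names out of the JSON array the endpoint returns."""
    return [member["login"] for member in json.loads(payload)]


def fetch_public_members(org: str) -> list[str]:
    """Fetch the public members of org. This performs a network request."""
    request = urllib.request.Request(
        members_url(org),
        headers={"Accept": "application/vnd.github+json"},
    )
    with urllib.request.urlopen(request, timeout=30) as resp:
        return extract_logins(resp.read().decode("utf-8"))
```

Only members who have made their membership public show up in this list, which is still often enough for a starting roster.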

And even if a company doesn't use GitHub as a git platform, there is still something valuable to find on GitHub. It is enough that this platform is used by at least one of the company's employees, for example, for a home project. If something secret about a company appears in the code of this project (or in the git history), this will be enough to penetrate the systems of this company.

Keeping a complete history of the changes made to each project is the nature of git. From a security standpoint, this fact plays a huge role. In other words, every change made to the code by anyone who has access to any of an organization's systems puts that organization at risk.

▍Why is this happening?

Companies do not check their systems for vulnerabilities.

▍GitHub dorks

There is such a thing as "dorks": special search queries that use various search engine features to find data of a particular kind. Here is an interesting list of such queries for Google, compiled by exploit-db.com.

If you want to delve deeper into this topic, and I recommend that you do, then before giving you a short list of strings used to find keys and passwords on GitHub, I suggest that you read this valuable material written by a talented system security researcher. He talks about how, what and where to search on GitHub, how to use dorks, and outlines in detail the manual process of finding secret data.

The dorks used on GitHub are not as complex as those used on Google. The point is that GitHub simply does not offer the same advanced search capabilities that Google does. Even so, searching GitHub repositories the right way can work wonders. Try searching the repository you are interested in for the following strings:

password
dbpassword
dbuser
access_key
secret_access_key
bucket_password
redis_password
root_password

And if you search for specific files using queries like filename:.npmrc _auth or filename:.htpasswd, you can filter the search results by type of data leak. Here's another good piece on this topic.
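The searches above are easy to generate in bulk. Here is a small Python sketch that builds the query strings for one organization and turns each into a search URL. The org: qualifier, the URL layout and the function names are assumptions made for this example, and GitHub's search syntax has changed over time, so treat the exact qualifiers as illustrative.

```python
import urllib.parse

# Keywords taken from the list above.
KEYWORDS = [
    "password", "dbpassword", "dbuser", "access_key",
    "secret_access_key", "bucket_password", "redis_password", "root_password",
]
# File-oriented dorks; _auth is the setting under which .npmrc keeps credentials.
FILE_DORKS = ["filename:.npmrc _auth", "filename:.htpasswd"]


def build_queries(org: str) -> list[str]:
    """Return one search query per keyword or file dork, scoped to org."""
    return [f"org:{org} {term}" for term in KEYWORDS + FILE_DORKS]


def search_url(query: str) -> str:
    """Return a GitHub code search URL for the given query."""
    return "https://github.com/search?" + urllib.parse.urlencode(
        {"q": query, "type": "code"}
    )
```

Running these queries against your own organization, before someone else does, is the defensive use of the same technique.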

▍Risk mitigation measures associated with GitHub

Make scanning your code for vulnerabilities part of the CI process. The excellent GitRob tool can help you with this.
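Scan your employees' repositories as well. GitRob does this unless it is run with the no-expand-orgs flag.
Check the commit history. GitRob looks at the last 500 commits by default, and the depth can be changed with -commit-depth <#number>.
Enable multi-factor authentication on GitHub!
Manage accounts through a centralized system such as G Suite or Active Directory.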

After this material was published, some of its readers made valuable comments regarding the complexity of passwords and their rotation, as well as the use of hardware protection of information.

Here are @codemouse92's comments:

Use complex and unique passwords wherever password logon is used. But keep in mind that a complex password is not necessarily one that is a mysterious jumble of letters, numbers and special characters. The best strategy now is to use long phrases as passwords. I would like to make one note about password managers. While it is definitely worth using such programs, it is still better to use passwords, which are phrases that users remember and can enter on their own.

Here's what user @corymcdonald says:

Where I work, everyone is given multi-factor authentication hardware: each person has 2 YubiKey devices. In addition, each team uses the 1Password password manager, and each team has its own password vault. When an employee leaves the company, the support team rotates the passwords in every vault that employee had access to. I personally once made the unforgivable mistake of publishing AWS access keys on GitHub. It is a good idea to check your changes with git-secrets before committing. This will prevent anything that looks like secret data from being shared.
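The kind of check that runs before a commit can be sketched in a few lines of Python. This is not the implementation of git-secrets. It only looks at lines added in a unified diff and applies two assumed patterns: AWS access key IDs that start with AKIA, and private key headers. The function name is hypothetical.

```python
import re

# Long-term AWS access key IDs: "AKIA" followed by 16 uppercase letters or digits.
AWS_KEY_ID_RE = re.compile(r"\bAKIA[0-9A-Z]{16}\b")
# PEM headers of private keys.
PRIVATE_KEY_RE = re.compile(r"-----BEGIN (?:RSA |EC |OPENSSH )?PRIVATE KEY-----")


def find_secrets(diff_text: str) -> list[tuple[int, str]]:
    """Return (line_number, reason) for each added line that looks like a secret."""
    hits = []
    for lineno, line in enumerate(diff_text.splitlines(), start=1):
        # Only inspect lines the commit adds; skip the "+++ b/file" header.
        if not line.startswith("+") or line.startswith("+++"):
            continue
        if AWS_KEY_ID_RE.search(line):
            hits.append((lineno, "AWS access key ID"))
        if PRIVATE_KEY_RE.search(line):
            hits.append((lineno, "private key header"))
    return hits
```

A pre-commit hook could feed the output of git diff --cached to a function like this and refuse the commit when the returned list is not empty.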

Hackers use Google

Now that we have a basic understanding of dorks, we can talk about specific Google search queries. You can find incredible things with them. Google is a powerful search engine that lets you build queries describing strings that must and must not be present in the results. Among other things, Google lets you search for files with certain extensions and restrict the search to specified domains or URLs. Take a look at the following search string:

"MySQL_ROOT_PASSWORD:" "docker-compose" ext:yml

This string searches for files with the yml extension, specifically docker-compose files, in which developers often store passwords. Not particularly unique passwords, either. Try running a Google search for this string. You will be surprised by what you find.
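You can run the same kind of check on your own files before a search engine does it for you. Here is a rough Python sketch that flags password values written directly into a docker-compose file. It works line by line with a regular expression rather than parsing YAML, and the pattern and the function name are assumptions made for this example.

```python
import re

# Matches "SOMETHING_PASSWORD: value" and "- SOMETHING_PASSWORD=value" lines.
PASSWORD_RE = re.compile(
    r"""^\s*-?\s*([A-Za-z_]*PASSWORD)\s*[:=]\s*["']?([^"'\s#]+)""",
    re.IGNORECASE,
)


def hardcoded_passwords(compose_text: str) -> list[tuple[str, str]]:
    """Return (variable, value) pairs whose value is written directly in the file."""
    found = []
    for line in compose_text.splitlines():
        match = PASSWORD_RE.match(line)
        # Values like ${DB_PASSWORD} are filled in from the environment, so skip them.
        if match and not match.group(2).startswith("${"):
            found.append((match.group(1), match.group(2)))
    return found
```

Anything this function reports is a candidate for moving into an environment variable or a secrets store.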

Other interesting search strings might be looking for RSA keys or AWS credentials. Here's another example:

"-----BEGIN RSA PRIVATE KEY-----" ext:key

Here, endless possibilities open up before us. The quality of the search depends only on the level of creativity of the researcher and on how well he is familiar with various systems. Here's a big list of Google Dorks if you want to experiment.

Hackers scrutinize the systems they are interested in

When a security researcher (or a motivated hacker) takes a serious interest in a system, he begins to study it in depth and gets to know it closely. He is interested in API endpoints, entity naming conventions, how the internal parts of the system interact, and whether different versions of the system are accessible when several are in use at once.

Making API paths hard to guess, for instance by generating them with a random character generator, is a poor way to secure an API. It is no substitute for real security mechanisms. Security researchers look for unsecured entry points into systems, such as API endpoints, using fuzzing tools. Such tools take word lists, build paths from them, and test those paths by analyzing the responses they get when requesting them. A scanner like this will not find an endpoint whose path is a completely random set of characters. But such tools are great at identifying patterns and at finding endpoints that the system's owners either forgot about or never knew existed.
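The core loop of such a tool is simple, as this Python sketch shows. It is a toy, not a description of how any real fuzzer works. The function names are hypothetical, and the probe argument is an assumed callable that takes a URL and returns an HTTP status code, so the logic can be tried without sending any traffic.

```python
from itertools import product
from typing import Callable, Iterable


def candidate_paths(prefixes: Iterable[str], words: Iterable[str]) -> list[str]:
    """Combine prefixes such as /v1 with words such as payments."""
    return [f"{prefix.rstrip('/')}/{word}" for prefix, word in product(prefixes, words)]


def fuzz(base_url: str, paths: Iterable[str], probe: Callable[[str], int]) -> list[str]:
    """Return the URLs whose status code suggests that the endpoint exists."""
    found = []
    for path in paths:
        url = base_url.rstrip("/") + path
        # 404 means "nothing here"; 401 and 403 still reveal that a route exists.
        if probe(url) != 404:
            found.append(url)
    return found
```

With a word list built from a project's own naming conventions, even this naive loop will find /v1/payments long before it would ever guess a random path.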

Remember that "security through obscurity" is not the best way to protect systems (although you shouldn't completely ignore it).

This is where the GitHub dorks, which we talked about above, come to the aid of cybercriminals. Knowing what rules are used when constructing paths to the system's endpoints (for example, something like api.mydomain.com/v1/payments/...) can be of immense help to a hacker. Searching the company's GitHub repository (and its employees' repositories) for API-related strings will often find paths that include random characters.

But "random strings" nevertheless have their place in systems. Using them is always better than using sequential resource identifiers or words like users and orders in the API paths.

Here is the awesome SecLists repository, which contains many strings used when naming entities. It is used by almost everyone in the security industry. These lists are often adapted to a specific system. Another tool that can be used to find hidden paths is ffuf, an extremely fast fuzzer written in Go.

Summary

Security issues are often overlooked in startups. Programmers and managers usually prioritize development speed and release frequency, sacrificing quality and security. This leads to secrets being committed to repositories, the same keys being used in different parts of a system, and access keys being used where something else would do. Sometimes it may seem that shortcuts like these speed up work on the project, but over time they can lead to very bad consequences.

In this post, I tried to show you how strings that seem to be protected by living in a private repository can easily become public. The same goes for a clone of a repository made by a well-meaning employee that was never meant for prying eyes but turned out to be public. But you can build a foundation for secure operation by using a secure password sharing tool, using a centralized repository of secrets, configuring password security policies, and enabling multi-factor authentication. This lets you take security seriously without slowing down work on the project.

When it comes to protecting information, the idea that speed matters most does not work very well.

Gaining knowledge of how hackers work is usually a very good first step towards understanding what information security is. This is the first step towards securing systems. When securing systems, consider the above methods of penetrating them, and the fact that hackers use a fairly limited set of such methods. It is recommended to consider from the point of view of security absolutely everything that in one way or another is related to a certain system, regardless of whether it is about external or internal mechanisms.

Securing systems can sometimes be perceived as not very important, but time consuming and hectic. But rest assured, the simple steps you take to secure your systems can save you a lot of trouble.

How do you protect your systems?
