How Google search analysis can detect COVID-19 pockets earlier than authorities can

Anosmia, the loss of the sense of smell, is a symptom of COVID-19.



According to data from 2.5 million users of King's College London's COVID-19 symptom app, two-thirds of users diagnosed with the disease reported anosmia, while only a fifth of those who tested negative reported the same symptom. Meanwhile, tens of thousands of people ask Google every day why they have suddenly lost their sense of smell. Is there a correlation between searches for “I can't smell” and the number of COVID-19 infections? Yes, there is.















This study shows that anosmia-related searches matched the outbreaks in New York, New Jersey, Louisiana and Michigan almost perfectly.



A model built by Bill Lampos and a team of scientists at UCL shows that Google searches can predict an increase in detected COVID-19 cases up to two weeks in advance. Queries about anosmia are among the most revealing.
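To make the idea of searches "leading" cases concrete, here is a minimal sketch of a lagged correlation check. It is not the UCL model, only an illustration of the general technique; the function names and the data are invented for the example.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # correlation is undefined for a constant series
    return cov / sqrt(var_x * var_y)


def best_lag(searches, cases, max_lag=14):
    """Return (lag, r): the shift in days at which searches best match later cases."""
    results = []
    for lag in range(max_lag + 1):
        x = searches[: len(searches) - lag]  # searches on day t
        y = cases[lag:]                      # cases on day t + lag
        if len(x) >= 3:
            results.append((pearson(x, y), lag))
    r, lag = max(results)
    return lag, r


# Synthetic data, invented for illustration only: cases echo searches 3 days later.
searches = [5, 7, 12, 20, 33, 50, 62, 58, 45, 30, 22, 15, 10, 8, 6, 5, 4, 4, 3, 3]
cases = [0, 0, 0] + [4 * s for s in searches[:-3]]

lag, r = best_lag(searches, cases)
print(f"best lag: {lag} days, r = {r:.2f}")
```

With real data the correlation would be far noisier, and a high correlation at some lag is a hint worth investigating, not proof of predictive power.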



So, anosmia-related searches can help predict COVID-19 outbreaks. But can the data obtained from the analysis of these queries prevent such outbreaks?



It depends on how quickly you can get the data you want. Real-time data is needed if this information is to be used to proactively respond to future outbreaks.



On June 5, Houston overtook New York in anosmia-related searches for the first time.



According to the CDC, symptoms of COVID-19 appear between two days and two weeks after infection. This means you have at most 14 days to act on what the searches tell you. You also need to know exactly where the people who typed “I can't smell” into Google live, and you need to know it at the moment those queries are made.



In addition, you need to know how many people turn to Google with such a query. And this has to be precise data, not the rough, aggregated figures found in Google Trends.



One way to get accurate data of this kind in real time is to buy the keywords "I can't smell" in Google Ads, Google's online advertising platform.



Next, you need to create a simple ad about anosmia (or, better yet, one that points to a reputable source of information about anosmia). Finally, all that remains is to choose the locations on the map for which you want data on the query "I can't smell".



The ad will then appear on the search results page for anyone who searches for “I can't smell” from within the locations the ad is targeted at.



Whether or not users click on the ad, Google Ads records the number of ad impressions. This data becomes available one hour after the search session.



Here is a graph showing searches for "I can't smell" in the 250 most populous cities in the United States, starting April 23rd. The y-axis shows the number of search sessions.





Number of searches for the words "I can't smell"



I have this data because, starting April 23, I bought the keywords "I can't smell" in Google Ads and targeted ads at the 250 most populous US cities.



This graph is perhaps rather hard to read. Let's display the same data on a map of the US.





Number of searches for “I can't smell” visualized using a US map



Here you can see that in late April and early May, searches for “I can't smell” came mostly from New York and Chicago, the two cities hit hardest by COVID-19 in that period.



You can also see that in June the numbers rise in Houston and Dallas, Texas. On June 5, Houston overtook New York in anosmia-related searches for the first time. And since June 13, Houston has ranked first for such searches among the 250 most populous US cities.
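A check like "when did one city first overtake another" is easy to automate once daily impression counts are exported. The sketch below assumes a hypothetical CSV export with `date`, `city` and `impressions` columns; the column names and every number in it are invented for illustration and are not the campaign's real data.

```python
import csv
import io
from collections import defaultdict

# Hypothetical export: one row per city per day (columns and numbers are invented).
REPORT_CSV = """date,city,impressions
2020-06-03,New York,140
2020-06-03,Houston,95
2020-06-04,New York,131
2020-06-04,Houston,120
2020-06-05,New York,118
2020-06-05,Houston,126
2020-06-06,New York,122
2020-06-06,Houston,119
"""


def daily_impressions(csv_text):
    """Sum impressions per (date, city) from CSV text."""
    totals = defaultdict(int)
    for row in csv.DictReader(io.StringIO(csv_text)):
        totals[(row["date"], row["city"])] += int(row["impressions"])
    return totals


def first_overtake(totals, challenger, leader):
    """Return the earliest ISO date on which challenger has more impressions than leader."""
    dates = sorted({date for date, _ in totals})  # ISO dates sort chronologically
    for date in dates:
        if totals.get((date, challenger), 0) > totals.get((date, leader), 0):
            return date
    return None


totals = daily_impressions(REPORT_CSV)
print(first_overtake(totals, "Houston", "New York"))
```

Summing per (date, city) means hourly rows, if the export has them, are folded into daily totals before the comparison.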



The graphs below compare the number of anosmia searches in Houston with the number of positive COVID-19 tests in the first three weeks of June.





Positive COVID-19 Tests and Anosmia Searches



I would like to point out that anyone willing to spend a few hours learning Google Ads can reproduce these experiments.



I started buying anosmia-related keywords because I wanted to learn more about people living in quarantined places.



But a couple of weeks into the experiment, I realized that this method can also be used to gather information about regions where the data itself is "under quarantine".



As a result, buying keywords and targeting ads at residents of particular countries can help reveal which governments are lying to their citizens (or to the whole world). This applies not only to COVID-19 but to any other topic. Take a look at this study, for example.



That the government is hiding the number of deaths is 100% true. How much it is hiding is harder to say. For a long time it had complete control over the data, so we had no access to independent information about what was happening.



Zitto Kabwe, leader of the opposition ACT-Wazalendo party, Tanzania





Cases of COVID-19 in Tanzania



Tanzania, a country in East Africa, had reported 509 cases of COVID-19 as of May 8. Since then, no new cases have been reported.



The number of anosmia-related searches correlates with the number of detected COVID-19 cases and can even predict it. Anosmia is the most common symptom of COVID-19. So if there really have been no new cases in Tanzania since May 8, we should expect very few searches for information about anosmia there.



However, in the same week that the Tanzanian government stopped reporting new cases of COVID-19, the country ranked second in the world for searches related to anosmia.



Reports from the field soon appeared, describing crowded hospitals and night burials.



Critics accuse the Tanzanian government of failing to inform the public about the real extent of the spread of the disease and how many lives it claimed.



To see the real picture, based on data from Tanzanians themselves, I bought the keywords "I can't smell" and targeted the ads at the whole of Tanzania from the day the Tanzanian government fell silent.



Here is a heat map for all regions of Tanzania.





Analysis of queries for “I can't smell” in Tanzania



It turned out that from 8 to 31 May 2020, English-speaking residents of Tanzania made an average of 93 queries every day.



One limitation of Google Ads is that you cannot serve ads to users whose browser language is set to Swahili, and there are approximately 12.15 Swahili speakers in Tanzania for every English speaker. Bear in mind, too, that Google has data from only about 5.1% of devices in the country.



As a result, the actual number of anosmia searches in Tanzania is probably close to 1,824 per day. Because at least 94.9% of the data is not visible through an ad campaign, I multiplied the observed number of search sessions by 19.61 (1 / 0.051) to roughly estimate what is really happening in the country.
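The arithmetic behind this estimate can be written out as a short sketch. It reproduces only the device-coverage scaling; as far as the stated figures go, the Swahili-to-English speaker ratio is not part of the 19.61 multiplier. The function and constant names are illustrative.

```python
OBSERVED_DAILY_SEARCHES = 93   # average daily queries seen, May 8-31, 2020
DEVICE_COVERAGE = 0.051        # share of devices Google has data from


def scale_to_population(observed, coverage):
    """Scale an observed count up by the inverse of the coverage share."""
    if not 0 < coverage <= 1:
        raise ValueError("coverage must be in (0, 1]")
    return observed / coverage


multiplier = 1 / DEVICE_COVERAGE
estimate = scale_to_population(OBSERVED_DAILY_SEARCHES, DEVICE_COVERAGE)
print(f"multiplier: {multiplier:.2f}")             # 19.61
print(f"estimated daily searches: {estimate:.0f}")  # 1824
```

This assumes that the devices Google can see behave like the ones it cannot, which is a strong assumption and one reason the result is only a rough estimate.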



For comparison, 3,251 anosmia searches were recorded in New York between May 8 and May 31. Over the same period, 18,143 infections were reported, a ratio of search sessions to infections of about 1:5.5.



In Chicago, the ratio over the same period was 1:4.



In the District of Columbia, it was 1:1.96.



In most of the US cities I targeted, the number of confirmed COVID-19 cases was 1.75 to 6 times the number of searches.
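The ratios above are simple divisions of confirmed cases by recorded search sessions. A minimal sketch using the New York figures quoted in the text (the function name is illustrative):

```python
def cases_per_search(cases, searches):
    """Number of confirmed cases per recorded search session."""
    if searches <= 0:
        raise ValueError("searches must be positive")
    return cases / searches


# New York, May 8 to May 31, figures as quoted in the article.
ny_ratio = cases_per_search(18_143, 3_251)
print(f"New York: 1:{ny_ratio:.2f}")

# Does it fall inside the 1.75-6 band observed for most targeted cities?
in_range = 1.75 <= ny_ratio <= 6
print(in_range)
```

The exact quotient is slightly above the rounded 1:5.5 given in the text, and it sits inside the 1.75 to 6 band.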



And in Tanzania, by my estimate, approximately 1,824 anosmia searches have been made every day since May 8.



Exact results are out of the question. For one thing, my US figures do not include vaguer anosmia-related queries such as "loss of smell". For another, I cannot know exactly how much data Google has on users, as opposed to devices, in a particular region.



But in any case, I estimate that in May the real number of daily COVID-19 cases in Tanzania could have been in the low four digits.



Maybe the number is lower. But it is certainly not zero.



Here's how Google data can help fight COVID-19.



Tracking the spread of a disease through Google searches, which here amounts to a short-term estimate of the number of people with COVID-19, is known as "nowcasting". Bill Lampos's model shows that the technique works.



But this technique can fail. Google Flu Trends, the first and best-known nowcasting tool, stopped working three years after its launch. It failed to predict the peak of the 2013 flu epidemic.



“But the most useful conclusion to draw is not that the analysis of search data is unreliable,” writes Sam Gilbert. “It is a complement to other methods, not a replacement,” he adds.



Another model I am watching is maintained by Imperial College London. It estimates the true number of infections in Tanzania in the four weeks between April 29 and May 26, 2020, at 24,689.



Analyzing Google search data can provide a valuable clue for those who monitor a situation and do not want to rely on official figures alone.



Even if it turns out that analyzing anosmia-related searches does not help predict the spread of COVID-19, I don’t think we should give in to the pessimism that followed the failure of Google Flu Trends.



Now is not the time to be pessimistic about nowcasting. More than ever, people turn to Google and tell the search engine things they tell no one else. And more than ever, we need the best available tools to get past hidden information and, by capturing people's thoughts, fears, hopes (or symptoms), to understand what is not being said.



If the authorities are trying to hide data, trying to hide the truth from the citizens of their countries or from the whole world, then in order to prevent what we talked about here, they will have to completely block Google. And not because people can use Google to find objective information, but because the analysis of Google search queries can point the direction of research for those who are not content with official data.



"Advertising ceases to be an advertisement if it helps to find answers to some questions." This is a slogan that would help my colleagues to better perceive what they are doing. Although they resented the fact that they were essentially merchants, they used Google Ads for commercial purposes (in order to sell people goods and services that they did not need).



When you ask Google for reviews of new sneakers, about the current quarantine situation, or about strange symptoms you suddenly have, the first thing that appears on the results page is, technically speaking, an ad.



But it is also the answer to a question. And, in fact, much more.



Are you planning to learn anything using the search query analysis technique presented in this article?





