Psychological pitfall in data analytics
The human brain has an amazing ability to find patterns in everything... but these patterns often have little in common with reality. We can see a rabbit in a cloud or the face of Elvis in a potato chip.
See the rabbit and Elvis profile?
Think about the Rorschach test: people are shown various inkblots and asked what they see. You won't believe how readily our minds find false interpretations in random data.
Bat? Butterfly? An ordinary blot? This is one of 10 Rorschach test cards created in 1921.
Psychologists have a beautiful name for this phenomenon: apophenia. Give people even a slight incentive and they will find not only faces and butterflies, but also a reason to allocate a budget for your favorite project or launch an artificial intelligence system.
About the Author: Cassie Kozyrkov is a South African data and statistics specialist. She founded Decision Intelligence at Google, where she is Chief Decision Scientist.
Most datasets contain a lot of random noise. How likely is it that your analyst is immune to apophenia? Can you trust your interpretation of the data?
Our mind does the same with data as it does with blots.
The more ways there are to slice a dataset, and the more complex those ways are, the vaguer a stimulus it becomes. Such datasets practically beg you to recognize false images in them.
Complex datasets practically beg you to see in them what is not actually there.
Are you sure the latest finding from your dataset is not apophenia in disguise?
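A rough way to see this effect is a small simulation. The Python sketch below is only an illustration and not part of the original article: it generates an outcome column of pure noise, slices it by a number of equally random yes/no attributes, and reports the largest gap between the two group averages. The function name largest_segment_gap and all of its parameters are made up for this example.

```python
import random


def largest_segment_gap(n_rows=200, n_attributes=50, seed=0):
    """Largest absolute gap between group means over random yes/no slices.

    Hypothetical helper: both the outcome and every attribute are noise,
    so any gap it reports is an artifact of searching many slices.
    """
    rng = random.Random(seed)
    # The "metric" we pretend to care about: pure Gaussian noise.
    outcome = [rng.gauss(0, 1) for _ in range(n_rows)]

    largest = 0.0
    for _ in range(n_attributes):
        # A random yes/no attribute that has nothing to do with the outcome.
        flags = [rng.random() < 0.5 for _ in range(n_rows)]
        group_yes = [y for y, flag in zip(outcome, flags) if flag]
        group_no = [y for y, flag in zip(outcome, flags) if not flag]
        if not group_yes or not group_no:
            continue  # cannot compare against an empty group
        gap = sum(group_yes) / len(group_yes) - sum(group_no) / len(group_no)
        largest = max(largest, abs(gap))
    return largest


one_slice = largest_segment_gap(n_attributes=1)
many_slices = largest_segment_gap(n_attributes=50)
print(f"largest gap with 1 way to slice:   {one_slice:.3f}")
print(f"largest gap with 50 ways to slice: {many_slices:.3f}")
```

Because both calls use the same seed, the 50-attribute search includes the single attribute from the first call, so its largest gap can never be smaller. No attribute here carries any signal; the gap grows only because there are more slices to choose from.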
There is another wonderful word: pareidolia, a kind of apophenia in which we find familiar things in vague sensory stimuli. In Japan, there is even a museum of stones that look like faces. We live in amazing times.
Lies, damned lies, and analytics
I know it sounds grim, but I'm not done yet. Taking courses in data analysis can add fuel to the fire. Students get used to expecting a real finding in every data exploration: each exploratory assignment has a hidden treasure. Only a few professors dare to send you on a wild goose chase (for your own good!). Assignments with no correct answer are harder to grade, so students usually do not pay much attention to them.
Students are used to there being a truth behind every dataset.
Data storytelling is just one step away from outright lying with data. Let's leave aside the question of whether the patterns are real. Let's talk about multiple interpretations. If you see a bat in an inkblot, this does not mean that there are no butterflies, pelvic bones, or a pair of foxes. If I hadn't mentioned the foxes, would you have seen them? Probably not. The psychological mechanisms responsible for motivation and attention are working against you. It takes a special skill to stop seeing the bat and see only a superposition of meanings.
Once people cling to their favorite image, it becomes difficult for them to unsee it.
The problem is that once people cling to their favorite image, it becomes difficult for them to unsee it and see other images. People tend to believe most strongly in the interpretation that caught their attention first. Each new meaning found reduces the motivation to keep searching. Juggling multiple potential stories without overrating your favorite one is a lot of mental work. Alas, not every analyst is disciplined enough to do this. In fact, many analysts have an incentive to “prove” only one side of the story with data. Why develop skills that keep your wallet from filling up?
What color is your lightsaber?
There are ways to prove a story with data honestly and rigorously. My article on data splitting will tell you more about them. Exploratory data analysis is not one of these methods. Exploring data without a known true meaning is like fishing. The color of your lightsaber depends on the bait you use.
If you join the dark side, you will fish for evidence to support your theory. You already “know” that it is true (so you can sell it to some naive victim). You may not even be aware that your lightsaber is red if you sincerely believe in the objectivity of data and in your own impartiality.
Exploring data without a known true meaning is like fishing.
If you have a sufficiently complicated (vague) dataset, you will find a pattern that you can fit to prove your favorite story. This is the beauty of the Rorschach test. Unfortunately, data is worse than inkblots: the more mathematical your method, the more convincing it sounds to those who do not understand it.
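To make the fishing image concrete, here is a minimal Python sketch (an illustration under stated assumptions, not the author's method). On one half of the rows it searches 200 columns of pure noise for the one most correlated with an equally random target, then measures that same column on the rows held back from the search. The names pearson and explore_then_confirm, the sample sizes, and the seed are all hypothetical choices for this example.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def explore_then_confirm(n_rows=60, n_features=200, seed=0):
    """Pick the best-looking noise column on one half, re-check on the other."""
    rng = random.Random(seed)
    # Target and features are independent noise: there is nothing to find.
    target = [rng.gauss(0, 1) for _ in range(n_rows)]
    features = [
        [rng.gauss(0, 1) for _ in range(n_rows)] for _ in range(n_features)
    ]
    half = n_rows // 2

    def explore_r(i):
        # Correlation of column i with the target on the exploration half only.
        return pearson(features[i][:half], target[:half])

    # "Fishing": keep whichever column looks most impressive.
    best = max(range(n_features), key=lambda i: abs(explore_r(i)))
    r_explore = explore_r(best)
    # Same column, measured on rows that played no part in the search.
    r_confirm = pearson(features[best][half:], target[half:])
    return r_explore, r_confirm


r_explore, r_confirm = explore_then_confirm()
print(f"correlation found while fishing: {r_explore:+.2f}")
print(f"same column on held-out rows:    {r_confirm:+.2f}")
```

The correlation found during the search is typically far from zero even though every column is noise, because it is the most extreme of many tries. The held-out value comes from a single column chosen in advance, so the search does not inflate it.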
A satellite image of the “face on Mars”, which many people perceive as evidence of the existence of aliens.
Those who refuse to embrace the dark side also fish. But they are fishing for something else: inspiration. They look for patterns that might be interesting and compelling, but they are smart enough not to take them for evidence. Instead, they practice unbiased analytics and try to keep as many different interpretations in mind as possible.
The best analysts try to find as many interpretations as possible.
This requires a keen eye and a humble, unbiased mind. Good analysts don't try to get stakeholders to see only one side of the story. Instead, they think creatively to turn the same data into multiple stories. They present their findings in a way that inspires everyone to follow up, without provoking their leaders into moving mountains out of overconfidence.
Impartiality gives data analysis a chance to carry some meaning.
The discipline of searching for multiple interpretations is the analyst’s secret weapon. It lets you keep the real treasures hidden in the data in sight. If you are distracted by a false story that you believe because of bias, it is difficult to pay attention to evidence pointing in a different direction. Why analyze anything at all if the conclusions are predetermined? Impartiality gives you a chance to make sure that all your effort was not in vain.
This grilled cheese sandwich was sold for $28,000 at auction because it depicts the Virgin Mary. What do you see here?
Hire a great analyst
Traits you probably want to look for in good analysts:
- They do not draw conclusions that go beyond the data that they examine.
- They handle data processing tools with ease and can quickly look through huge amounts of data.
- They have the necessary domain knowledge so they are less likely to waste stakeholders' time on the little things.
- They understand that their job is to find inspiration.
- They visualize data in a way that is convenient and understandable for the brain, so inspiration comes quickly.
- They know what it takes to follow up rigorously on any potential insight they find (and whom to turn to for help).
In addition to all of the above, this article invites you to pay attention to these features:
- They know that the mind finds meaning where there is none, so they try not to give in to false interpretations and do not rush to conclusions.
Finally, if you are a leader, make sure you give your employees the right incentives. Are you looking for a data analyst or a data manipulator? They have different mindsets and skills. Choose your analyst wisely and reward the right behavior.
Forget about potato chips! This Japanese museum with face-like stones has surpassed everyone.
Scientific Publication: The Potato Chip Really Does Look Like Elvis! Neural Hallmarks of Conceptual Processing Associated with Finding Novel Shapes Subjectively Meaningful