Chasing Qualitative Signal In Quantitative Big Data Noise


Joey Votto, who plays for the Cincinnati Reds, is one of the best hitters in MLB. Lately he has received a lot of criticism for not swinging at strikes when there are runners on base. FiveThirtyEight decided to test this criticism against the data. They found it to be true: his swings at pitches in the strike zone, especially fastballs, have declined significantly. Still, everyone agrees that Votto is a great player. This is how I see many Big Data stories go: you can explain the "what" but you can't explain the "why." In this story, no one (that I know of) actually went and asked Votto, "Hey, why are you not swinging at all those fastballs in the strike zone?"

This is not just about sports. I see it every day in my work in enterprise software, where I help customers with Big Data scenarios such as optimizing promotion forecasts in retail, predicting customer churn in telco, or managing risk exposure in banks.

What I find is that adding more data creates a lot more noise in these quantitative analyses rather than bringing you closer to a signal. On top of this noise, people expect that there will be a perfect model to optimize and predict. Quantitative analysis alone doesn't find the needle in the haystack, but it does help identify which part of the haystack the needle could be hiding in.
"In many walks of life, expressions of uncertainty are mistaken for admissions of weakness." - Nate Silver
I subscribe to and strongly advocate Nate Silver's philosophy of thinking of "predictions" as a series of scenarios with probabilities attached to them, as opposed to a deterministic model. If you are looking for a precise binary prediction, you're most likely not going to get one. Fixating on a model and perfecting it makes you focus on over-fitting that model to past data. In other words, you spend too much time on a signal or knowledge that already exists instead of using it as a (Bayesian) starting point and staying open to running as many experiments as you can to refine your models as you go. The context that turns your quantitative information into knowledge (signal) is your qualitative aptitude and attitude towards that analysis. If you are willing to ask a lot of "why"s once your model tells you "what," you are more likely to get closer to the signal you're chasing.
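To make the "starting point plus experiments" idea concrete, here is a minimal sketch of a Bayesian update, assuming a simple Beta-Binomial model of a churn rate. The `BetaBelief` class, the prior, and the experiment counts are all hypothetical and made up for illustration; they are not taken from any real customer data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BetaBelief:
    """Belief about an unknown rate, stored as a Beta(alpha, beta) distribution."""
    alpha: float
    beta: float

    def update(self, successes: int, failures: int) -> "BetaBelief":
        # Conjugate Beta-Binomial update: add the observed counts to the prior counts.
        return BetaBelief(self.alpha + successes, self.beta + failures)

    @property
    def mean(self) -> float:
        # Expected value of the rate under the current belief.
        return self.alpha / (self.alpha + self.beta)


# Hypothetical prior: what past data suggests, roughly "2 churners in 10 customers".
prior = BetaBelief(alpha=2.0, beta=8.0)

# Hypothetical experiment batches: (customers who churned, customers who stayed).
batches = [(3, 17), (1, 19), (6, 14)]

belief = prior
history = [belief.mean]
for churned, stayed in batches:
    belief = belief.update(churned, stayed)
    history.append(belief.mean)

for step, estimate in enumerate(history):
    print(f"after {step} experiments: estimated churn probability = {estimate:.3f}")
```

The estimate never settles into a yes-or-no answer; it stays a probability that shifts with every batch of evidence, which is the point of treating the existing model as a starting point rather than something to perfect.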

Not all quantitative analyses have to follow a qualitative exercise to look for a signal. Validating an existing hypothesis is one of the biggest Big Data weapons developers use, since SaaS has made it relatively easy for them not only to instrument their applications to gather and analyze all kinds of usage data but also to trigger changes that influence users' behavior. Facebook's recent psychology experiment to test whether emotions are contagious has attracted a lot of criticism. Setting aside the ethical and legal issues, including the accusation that Facebook manipulated 689,003 users' emotions for science, this quantitative analysis is a validation of an existing phenomenon in a different world. Priming is a well-understood and proven concept in psychology, but we didn't know of a published test proving the same in a large online social network. The objective here was not to chase a specific signal but to validate a hypothesis (a "what") for which the "why" was already well understood in a different domain.

About the photo: The Laplace transform is one of my favorite mathematical tools, since it recasts complex problems (exponential equations) in a simpler form that is relatively easy to solve. It helps you reframe problems in your endeavor to get to the signal.
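As a small worked illustration of that reframing (a standard textbook example, not the specific equations in the photo), the transform turns an exponential function into a simple fraction, and a differential equation into plain algebra:

```latex
\mathcal{L}\{f(t)\}(s) = \int_0^{\infty} f(t)\, e^{-st}\, dt,
\qquad
\mathcal{L}\{e^{at}\}(s) = \frac{1}{s - a} \quad (s > a).

% Example: solve y'(t) = a\,y(t) with y(0) = y_0.
% Using \mathcal{L}\{y'\}(s) = sY(s) - y(0):
sY(s) - y_0 = a\,Y(s)
\;\Longrightarrow\;
Y(s) = \frac{y_0}{s - a}
\;\Longrightarrow\;
y(t) = y_0\, e^{at}.
```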
