
Strive For Precision, Not Accuracy


Jake Porway, who was a data scientist at the New York Times R&D labs, has a great perspective on why multi-disciplinary teams are important for avoiding bias and bringing different perspectives into data analysis. He discusses a story where data gathered by Uber in Oakland suggested that prostitution arrests increased in Oakland on Wednesdays, but increased arrests didn't necessarily imply increased crime. He also outlines the data analysis done by Grameen Foundation, where the analysis of Ugandan farm workers could result in the farmers being judged "good" or "bad" depending on which perspective you consider. This story validates one more attribute of my point of view regarding data scientists: data scientists should be design thinkers. Working in a multi-disciplinary team to let people champion their perspectives is one of the core tenets of design thinking.

One of Jake's viewpoints that I don't agree with:

"Any data scientist worth their salary will tell you that you should start with a question, NOT the data."

In many cases you don't even know what question to ask. Sometimes an anomaly or a pattern in the data tells a story, and that story informs what questions we might ask. I do see that many data scientists start with a question in mind and then pull in the data they need, but I advocate the other approach: bring in the sources and let the data tell you a story. Referring to design, Henry Ford once said, "Every object tells a story if you know how to read it." Listen to the data, a story, without any preconceived bias and see where it leads you.

You can only ask what you know to ask, and that limits your ability to unearth groundbreaking insights. Chasing a perfect answer to a perfect question is a trap that many data scientists fall into. In reality, what the business wants is a good-enough answer or insight that is actionable. In most cases, getting to an answer that is 95% accurate requires little effort, but getting the remaining 5% requires a disproportionate amount of time for a disproportionately low return.

Strive for precision, not accuracy. The first answer could well be of low precision. That's perfectly acceptable as long as you know what the precision is and can continuously refine it until it is good enough. Being able to rapidly iterate and reframe the question is far more important than knowing upfront what question to ask; data analysis is a journey, not a step in the process.
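
To make the idea of a rough first answer with known precision a bit more concrete, here is a minimal Python sketch. It rests on assumptions that are mine and not part of the original argument: "precision" is read as the margin of error of an approximate 95% confidence interval, the data is a simulated stream of numbers, and the names `estimate_with_margin`, `refine_until_good_enough`, and `draw` are hypothetical. The loop reports an estimate together with its margin, then doubles the sample until the margin is good enough.

```python
import math
import random


def estimate_with_margin(sample, z=1.96):
    """Return (mean, margin of error) for a list of numbers.

    The margin is the half-width of an approximate 95% confidence
    interval (normal approximation): a measure of how precise the
    current estimate is.
    """
    n = len(sample)
    mean = sum(sample) / n
    variance = sum((x - mean) ** 2 for x in sample) / (n - 1)
    margin = z * math.sqrt(variance / n)
    return mean, margin


def refine_until_good_enough(draw, target_margin, start=50, max_n=100_000):
    """Grow the sample until the margin of error is at most target_margin.

    `draw` is a hypothetical callable that returns one new observation.
    Returns a list of (sample size, mean, margin) tuples, one per iteration.
    """
    sample = [draw() for _ in range(start)]
    history = []
    while True:
        mean, margin = estimate_with_margin(sample)
        history.append((len(sample), mean, margin))
        if margin <= target_margin or len(sample) >= max_n:
            return history
        # Double the sample; the margin shrinks by roughly 1/sqrt(2) each time.
        sample.extend([draw() for _ in range(len(sample))])


# Simulated data (an assumption for illustration): mean 100, std dev 15.
rng = random.Random(42)  # fixed seed so the run is repeatable
history = refine_until_good_enough(lambda: rng.gauss(100, 15), target_margin=1.0)
for n, mean, margin in history:
    print(f"n={n:6d}  estimate={mean:7.2f} +/- {margin:.2f}")
```

Because the margin shrinks only with the square root of the sample size, each halving of the margin costs about four times the data. That is one way to see why the early iterations are cheap and the last bit of refinement is expensive.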

Photo credit: Mario Klingemann
