Skip to main content

A Data Scientist's View On Skills, Tools, And Attitude



I recently came across this interview (thanks Dharini for the link!) with Nick Chamandy, a statistician a.k.a a data scientist at Google. I would encourage you to read it; it does have some great points. I found the following snippets interesting:

Recruiting data scientists:
When posting job opportunities, we are cognizant that people from different academic fields tend to use different language, and we don’t want to miss out on a great candidate because he or she comes from a non-statistics background and doesn’t search for the right keyword. On my team alone, we have had successful “statisticians” with degrees in statistics, electrical engineering, econometrics, mathematics, computer science, and even physics. All are passionate about data and about tackling challenging inference problems.
I share the same view. The best scientists I have met are not statisticians by academic training. They are domain experts and design thinkers and they all share one common trait: they love data! When asked how they might build a team of data scientists I highly recommend people to look beyond traditional wisdom. You will be in good shape as long as you don't end up in a situation like this :-)

Skills:
The engineers at Google have also developed a truly impressive package for massive parallelization of R computations on hundreds or thousands of machines. I typically use shell or python scripts for chaining together data aggregation and analysis steps into “pipelines.”
Most companies won't have the kind of highly skilled development army that Google has but then not all companies would have Google scale problem to deal with. Though I suggest two things: a) build a very strong community of data scientists using social tools so that they can collaborate on challenges and tools they use b) make sure that the chief data scientist (if you have one) has very high level of management buy-in to make things happen otherwise he/she would be spending all the time in "alignment" meetings as opposed to doing the real work.

Data preparation:
There is a strong belief that without becoming intimate with the raw data structure, and the many considerations involved in filtering, cleaning, and aggregating the data, the statistician can never truly hope to have a complete understanding of the data.
I disagree. I do strongly believe the tools need to involve to do some of these things and the data scientists should not be spending their time to compensate for the inefficiencies of the tools. Becoming intimate with the data—have empathy for the problem—is certainly a necessity but spending time on pulling, fixing, and aggregating data is not the best use of their time.

Attitude:
To me, it is less about what skills one must brush up on, and much more about a willingness to adaptively learn new skills and adjust one’s attitude to be in tune with the statistical nuances and tradeoffs relevant to this New Frontier of statistics.
As I would say bring tools and knowledge but leave bias and expectations aside. The best data scientists are the ones who are passionate about data, can quickly learn a new domain, and are willing to make and fail and fail and make.

Image courtesy: xkcd

Comments

Popular posts from this blog

Focus On Your Customers And Not Competitors

A lorry is a symbol of Indian logistics and the person who is posing against it is about to rethink infrastructure and logistics in India. Jeff Bezos is enjoying his trip to India charting Amazon’s growth plan where competitors like Flipkart have been aggressively growing and have satisfied customer base. This is not the first time Bezos has been to India and he seems to understand Indian market far better than many CEOs of American companies. His interview with a leading Indian publication didn’t get much attention in the US where he discusses Amazon’s growth strategy in India. When asked whether he is in panic mode: For 19 years we have succeeded by staying heads down, focused on our customers. For better or for worse, we spend very little time looking at our competitors. It is better to stay focused on customers as they are the ones paying for your services. Competitors are never going to give you any money. I always believe in focusing on customers, especially on their latent unme...

Reminder: Apple to Preview "The Future of iOS and OS X" Tomorrow

Apple's World Wide Developer Conference ( WWDC ) starts Monday June 10th. At the event Apple will detail "the future of iOS and OS X." Apple is a leader in integrated accessibility with their VoiceOver screen reader and numerous other accessibility features such as Guided Access , Speak Selection , Zoom , and Assistive Touch . That being said Apple still needs to improve their accessibility features to remain a leader. New accessibility features have routinely been added to new versions of iOS. Last year, Apple added Guided Access along with other accessibility improvements. A new version of iOS is exciting not only because of the cool mainstream features but the lesser known, but no less important accessibility features that make the devices usable for so many. Make sure to visit The Assistive Technology Blog after the Keynote for all the iOS 7 and OS X accessibility news. In the meantime check out my iOS 7 Wish List video below.