Skip to main content

A Data Scientist's View On Skills, Tools, And Attitude



I recently came across this interview (thanks Dharini for the link!) with Nick Chamandy, a statistician a.k.a a data scientist at Google. I would encourage you to read it; it does have some great points. I found the following snippets interesting:

Recruiting data scientists:
When posting job opportunities, we are cognizant that people from different academic fields tend to use different language, and we don’t want to miss out on a great candidate because he or she comes from a non-statistics background and doesn’t search for the right keyword. On my team alone, we have had successful “statisticians” with degrees in statistics, electrical engineering, econometrics, mathematics, computer science, and even physics. All are passionate about data and about tackling challenging inference problems.
I share the same view. The best scientists I have met are not statisticians by academic training. They are domain experts and design thinkers and they all share one common trait: they love data! When asked how they might build a team of data scientists I highly recommend people to look beyond traditional wisdom. You will be in good shape as long as you don't end up in a situation like this :-)

Skills:
The engineers at Google have also developed a truly impressive package for massive parallelization of R computations on hundreds or thousands of machines. I typically use shell or python scripts for chaining together data aggregation and analysis steps into “pipelines.”
Most companies won't have the kind of highly skilled development army that Google has but then not all companies would have Google scale problem to deal with. Though I suggest two things: a) build a very strong community of data scientists using social tools so that they can collaborate on challenges and tools they use b) make sure that the chief data scientist (if you have one) has very high level of management buy-in to make things happen otherwise he/she would be spending all the time in "alignment" meetings as opposed to doing the real work.

Data preparation:
There is a strong belief that without becoming intimate with the raw data structure, and the many considerations involved in filtering, cleaning, and aggregating the data, the statistician can never truly hope to have a complete understanding of the data.
I disagree. I do strongly believe the tools need to involve to do some of these things and the data scientists should not be spending their time to compensate for the inefficiencies of the tools. Becoming intimate with the data—have empathy for the problem—is certainly a necessity but spending time on pulling, fixing, and aggregating data is not the best use of their time.

Attitude:
To me, it is less about what skills one must brush up on, and much more about a willingness to adaptively learn new skills and adjust one’s attitude to be in tune with the statistical nuances and tradeoffs relevant to this New Frontier of statistics.
As I would say bring tools and knowledge but leave bias and expectations aside. The best data scientists are the ones who are passionate about data, can quickly learn a new domain, and are willing to make and fail and fail and make.

Image courtesy: xkcd

Comments

Popular posts from this blog

15 YEARS OLD GIRL IMPREGNATED AND MAN RESPONSIBLE FOR IT TOOK FOR AN ABORTION THAT FAILED

BBI FACILITATE ARREST OF 35 YEARS OLD FOR DEFILEMENT, IMPREGNATING 15 YEARS OLD GIRL AND ABORTING FIVE MONTHS PREGNANCY IN ANAMBRA STATE. Today, at 1:26pm, We received a complaint from a concerned citizen who informed us of a 15yrs old girl brought into a hospital for medical treatment. Our intelligence team led by Director General Gwamnishu Emefiena Harrison Kenneth Nwaobi Ezika Kene and others left Asaba and arrived Ogidi Anambra state for investigation. 35yrs Chris Azuoma took the victim to hospital where she was injected and given abortion pills. She bled heavily and had complications and so decided to take her to a specialist hospital to evacuate the foetus. Getting to the hospital, we met the management and identified ourselves as Human rights group and they granted us permission to interview the victim. She confirmed the story and the perpetrator confessed forcefully having unprotected sexual intercourse with the victim. 2015 Administration of Criminal Justice permit private per

Hacking Into The Indian Education System Reveals Score Tampering

Debarghya Das has a fascinating story on how he managed to bypass a silly web security layer to get access to the results of 150,000 ISCE (10th grade) and 65,000 ISC (12th grade) students in India. While lack of security and total ignorance to safeguard sensitive information is an interesting topic what is more fascinating about this episode is the analysis of the results that unearthed score tampering. The school boards changed the scores of the students to give them "grace" points to bump them up to the passing level. The boards also seem to have tampered some other scores but the motive for that tampering remains unclear (at least to me). I would encourage you to read the entire analysis and the comments , but a tl;dr version is: 32, 33 and 34 were visibly absent. This chain of 3 consecutive numbers is the longest chain of absent numbers. Coincidentally, 35 happens to be the pass mark. Here's a complete list of unattained marks - 36, 37, 39, 41, 43, 45, 47, 49, 51, 53,

Reveiw: Celluon Epic Laser Keyboard

The Celluon Epic is a Bluetooth laser keyboard. The compact device projects a QWERTY keyboard onto most flat surfaces. (Glass tabletops being the exception) You can connect the Epic to vertically any device that supports Bluetooth keyboards including devices running iOS , Android , Windows Phone, and Blackberry 10. On the back of the device there is a charging port and pairing button. Once you have the Epic paired with your device it acts the same as any other keyboard. For any keyboard the most important consideration is the typing experience that it provides. The virtual keyboard brightness is adjustable and is easy to see in most lighting conditions. Unfortunately the brightness does not automatically adjust based on ambient light. With each keystroke a beeping sound is played which can be turned down. The typing experience on the Epic is mediocre at best. Inadvertently activating the wrong key can make typing frustrating and tiring. Even if you are a touch typist you'll still