Wednesday, January 23, 2013

Data Science as a Fad?

In many ways, I didn't want to link to this Forbes article, since it derides the use of data in much the same way that seemed to create the (admittedly somewhat imaginary) "scouts-vs.-stats" divide.  Data science is, of course, highly relevant to management.  I think the article is a bit unfair to the self-created discipline, so please keep that in mind.

That said, I also think the article makes some important points worth remembering.  Data scientists already immersed in the management and operations of a given industry are invaluable.  Data scientists with little understanding of that industry's problems and practical solutions, however, can be dangerous.  I think this is a nice passage:

"Davenport and Patil declare that “Data scientists’ most basic, universal skill is the ability to write code.” With this pronouncement, data science fails the smell test at the very outset. For how many legitimate scientific fields is coding the most fundamental skill? The most fundamental skill for any scientist is of course mastery of a canonical body of knowledge that includes laws, definitions, postulates, theorems, proofs, and descriptions of unsolved problems. Scientists are therefore characterized by mastery of a body of knowledge, not a collection of methods. What is this body of knowledge for data science? Davenport and Patil admit there is none.

The job of scientists is to conduct independent research, contribute to a body of knowledge, and improve professional practice, while adhering to a recognized standard of conduct. Coding is a tool that facilitates some of these objectives, but is a substitute for none of them."

This point rings true in many cases.  I sometimes find myself falling into a "methods trap" in my academic work (though I try to get out of it as quickly as possible).  I know how to use R, though I am not a programmer or database manager.  I know a number of methods from statistics and econometrics.  I can turn these tools into something pretty neat.  But I sometimes make the mistake of thinking that this is enough to research some phenomenon.

...Then I try to write the Intro and Discussion for my paper.  Ouch.  This is amazingly difficult without first going back to the body of knowledge about the problem at hand.  Methods and very cool visuals communicate answers.  But they are the tools for doing so, not the answers themselves.  A point well taken from the article.

