Growing the Next Generation of Data Scientists
Note: Today’s guest post comes to us courtesy of Sriram Mohan, an associate professor of computer science and software engineering at Rose-Hulman Institute of Technology, Terre Haute, Ind.
Data, more than ever before, is the lifeblood of every organization. From media companies to retail stores, data allows organizations to differentiate themselves from the competition. Whether used in market research or in cost reduction efforts, organizations must leverage data to be competitive, and for the most part, they now enjoy access to all the data anyone could ever need.
What most organizations today don’t have ready access to are data scientists who are trained to turn all of their data into actionable insights. Gartner predicts that by 2015, about 4.4 million data science jobs will be available, and only a third of them will be filled. Is there a shortage of people with the requisite combination of programming, data management skills and statistical and mathematical ability?
Yes. Can we in academia develop more of those people? Absolutely. But we are going to need the help of industry to accomplish that.
As universities nationwide define and create new undergraduate and graduate data science programs, industry can help to make our programs more interesting and relevant for students. We will also benefit from greater access to industry data sets and the latest Big Data technologies.
Data science, by its nature, is abstract, making it difficult to attract initial student interest. But in the real world, data science is applied in many concrete, exciting ways that we can bring into our classrooms. The perceived gap between academia and industry is why I took a sabbatical this past year to work with Avalon Consulting, a Big Data consultancy headquartered near Dallas.
The experience has proved invaluable; however, most, if not all, of what I learned came through hands-on experience in developing solutions. To make data science concepts more interesting and tangible for students, we should provide them with similar hands-on opportunities. To that end, companies should organize contests for students that allow them to work on actual problems with real industry data sets (similar to the Netflix Prize). My university offers students a capstone project that challenges them to solve real-world problems provided through organizations such as Avalon and the U.S. Armed Forces. Such partnerships are a good start, but to educate data scientists, we need to do more.
Academia needs to stay up to date with the latest Big Data tool chains. Those ecosystems are evolving rapidly, but most of the development happens on the industry side – so much so that academia often finds itself left behind. If we want our students to be aware of the latest changes when they graduate, we must foster a better exchange of technological knowledge between academics and industry.
Industry typically uses summer internships to expose students to the latest technologies. Such experiences could be extended to faculty as well. My colleagues at Rose-Hulman have used industry experiences in the past to stay current in technologies such as Google web frameworks, Android development and various programming languages.
Longer-term experiences, such as my training sabbatical, should also be considered. My experience led directly to the development of new courses in Hadoop and modern database paradigms. The time I spent working in the ever-changing world of Hadoop and NoSQL databases at Avalon was critical to those courses’ development. Given the rapid pace at which Big Data technologies are now advancing, it is imperative that academia and industry find additional ways to collaborate to better prepare our students for the workplace.
Once our students enjoy greater access to industry data sets and the latest Big Data technologies, we can bring more real-world problems, solutions and stories into the classroom and generate greater interest in data science. The future for these students is bright, but industry collaboration is required to fully meet its demands.