Hi Michael! What motivated you to start a bootcamp?
To me, The Data Incubator isn’t just a fellowship or bootcamp. It’s a way to help companies identify top data science talent and a way for skilled individuals to transition into data science positions. I was motivated by observing first-hand the inefficiencies in the market, both from the employer and candidate perspectives.
First, as a candidate, I was a PhD looking to transition into industry. Employers were looking for certain key skills (machine learning, MapReduce), which are extremely difficult to self-teach because the topics are broad and it’s hard to know what is important to learn and what is not. During interviews, companies kept asking me the same questions, and it seemed inefficient to do this individual by individual. Having a central organization ask these questions and record my answers, to show I knew how to answer them, would be much more streamlined and efficient.
Second, as a hiring manager, I quickly realized that many people looked good on a resume, but when you really prodded them about what they knew, they didn’t know much more than a few buzzwords. Resume screening is an unfortunate but widespread practice in hiring (you can read my critique of it in FastCompany). It ultimately results in lots of wasted time interviewing candidates who look good on paper but can’t deliver in the long term. We want The Data Incubator to be a place where employers can go to find really top-notch talent.
What challenges have you faced so far with starting your coding bootcamp?
We’ve worked really hard to make our curriculum top-notch and to keep it up to date with the latest technologies. For example, when we started, we did not have a module on Spark, but we’ve developed one since then. Pandas has had twelve releases since we were founded, and we have been upgrading the version we teach alongside those releases.
Now, our next challenge is taking the same curriculum and offering it as corporate training to many of our Fortune 500 clients, so that their employees can access the same curriculum as our fellows, in smaller chunks and without quitting their jobs.
What successes have you had with your first few cohorts?
We continue to be incredibly impressed by the quality of our applicants and the fellows that we ultimately admit into the program. It’s amazing how much our fellows contribute to the community, whether through their own experience and knowledge in data science (for instance, one cohort self-organized a lecture series on advanced topics in Natural Language Processing) or by enriching The Data Incubator’s culture and day-to-day life with different personalities and perspectives (different groups organize various social activities for themselves and the team). But the most fulfilling part of this job is matching our fellows with great jobs at great companies. We’ve had alumni go on to The New York Times, Palantir, eBay, Yelp, and Capital One, just to name a few employers.
Any advice for students looking to join a bootcamp?
The strongest applicants come in with a good background in statistics and programming. We’ve written some blog entries that delve into the specifics:
1. Getting started in data science
2. Displaying and visualizing your data science projects
3. A new way to think about Numerical Computation
4. Finding great data sources for starter data science projects (and part 2)
Do you see bootcamps replacing college for parts of the population?
Colleges and universities are slow-moving institutions with highly siloed departments where instructors are rewarded for research, not teaching. Consequently, curricula are often outdated and theoretical. Our subject, data science, is practical, cutting-edge, and interdisciplinary -- exactly the kind of subject matter that universities have a hard time teaching.
It’s also an interesting space because of the huge industry demand for data scientists and employers’ willingness to pay for candidates’ tuition, which allows us to make the program free for admitted fellows. Qualified students are able to attend without incurring any debt, and are all but guaranteed jobs after the training. I think that gives bootcamps like ours another major advantage over universities.
That said, I do think universities are good at teaching fundamental concepts like statistics or core CS algorithms. We benefit greatly from US universities and the National Science Foundation, which help train researchers in many of the core skills of data science, like statistics and programming. For example, the United States graduates 40,000 STEM PhDs annually, but only 20,000 continue in academia. We are able to draw from a very talented national (and international) pool that has benefited from a strong university education.
Tell us about your curriculum.
We think of our curriculum as being broken up into four major topics:
1. Software development and the ability to manage data pipelines, handle structured and unstructured data, and work with databases.
2. Machine learning, statistics, and specialized applications like time-series and natural language processing.
3. Distributed computing topics like Hadoop, Hive, MapReduce, and Spark.
4. Visualization tools and techniques, including libraries like matplotlib, Bokeh, and D3.js.
What is the job market like for data science?
We’re based in SF, DC, and NYC, and the market for data science is exploding in all of them. Obviously, the technology sector is the primary employer for data scientists in the Bay Area. There’s a huge amount of government and government-contractor demand in Washington, DC. In NYC, the financial services sector is aggressively hiring data scientists and quants. Medicine is another huge employer of data scientists -- and that demand is growing with the introduction of the Affordable Care Act and the change in medical funding.
Michael Li is the founder of The Data Incubator. Previously, he worked as a data scientist (Foursquare), a quant (D.E. Shaw, J.P. Morgan), and a rocket scientist (NASA). He did his PhD at Princeton as a Hertz Fellow and read Part III Maths at Cambridge as a Marshall Scholar.
At Foursquare, Michael discovered that his favorite part of the job was teaching and mentoring smart people about data science. He decided to build a startup that lets him focus on what he really loves.