As a data scientist in Silicon Valley, I am humbled by the amount of attention our field has received in the past few years. Harvard Business Review has called data science the sexiest job of the 21st century and Forbes released a report explaining why data scientist is the best job to pursue in 2016.
These headlines pique the interest of many industry professionals who enjoy crunching data and are looking for the next exciting phase in their careers. With so much attention on the data science field, I often get questions such as: How did you fall into data science as a career? What inspired you to become a data scientist?
Ten years ago, very few students, myself included, told their high school teachers and parents, “When I grow up, I want to be a data scientist,” mainly because the fame and glory of becoming an NBA player sounded much more appealing to a teenager and the concept of data science was foreign to most people at the time.
Even today, data science is a nascent field. The day-to-day life of a data scientist is fuzzy to many, which causes confusion about what it actually means to pursue this sexy career. For those curious, the profession at a high level involves a broad range of expertise. Becoming a data scientist means you also become versed in statistics, programming, business, and design.
To be a data scientist, you need to think like a researcher. But instead of working with lab mice, you are creating experiments with your company’s data, thinking through interpretations of the data, and implementing a solution that creates the most value for your business.
That means testing hypotheses, computing confidence intervals, controlling your variables, and evaluating the results, so you’ll need to have the basics of descriptive and inferential statistics down pat.
For example, you may be tasked with understanding why sales increased by 10 percent in a given month. Using statistics, you can understand the potential sources of an increase in sales – was it a given promotion or time of the year? Is this a trend seen on an annual basis? These puzzles are the types of complex business problems you’ll be tackling.
Your job will consist of gathering large volumes of data and using statistical techniques to show trends in the data. The only way you are going to get comfortable is by getting your hands dirty. You’ll need to familiarize yourself with different statistical distributions and their assumptions, and in some cases, understand how they are formulated.
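To make the sales example concrete, here is a minimal sketch of one way to check whether a promotion is associated with higher sales: a permutation test on the difference in average daily sales. The numbers and the names `promo_days`, `regular_days`, and `permutation_test` are made up for illustration, and a permutation test is just one of several reasonable choices here.

```python
import random

def permutation_test(group_a, group_b, n_permutations=10_000, seed=0):
    """Return the observed difference in means (a - b) and a one-sided p-value.

    The p-value estimates how often a random relabeling of the pooled data
    produces a difference at least as large as the one observed.
    """
    rng = random.Random(seed)
    n_a, n_b = len(group_a), len(group_b)
    observed = sum(group_a) / n_a - sum(group_b) / n_b
    pooled = list(group_a) + list(group_b)

    at_least_as_large = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        diff = sum(pooled[:n_a]) / n_a - sum(pooled[n_a:]) / n_b
        if diff >= observed:
            at_least_as_large += 1

    # Add one to numerator and denominator so the estimate is never exactly zero.
    p_value = (at_least_as_large + 1) / (n_permutations + 1)
    return observed, p_value

# Hypothetical daily unit sales, invented for this example.
promo_days = [112, 120, 118, 125, 130, 121, 127]
regular_days = [100, 104, 98, 110, 105, 102, 107]

observed_diff, p_value = permutation_test(promo_days, regular_days)
print(f"Observed lift: {observed_diff:.1f} units/day, p-value: {p_value:.4f}")
```

A small p-value suggests the lift is unlikely to be a fluke of which days happened to fall in each group. It does not by itself rule out other explanations such as seasonality, which is why controlling your variables still matters.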
As a data scientist, your main product is data – sales numbers, user figures, engagement – which is generated from a tech product, so you’ll need to develop programs that are able to process large volumes of data as quickly as possible and translate the data into actionable insights.
Contrary to what most people think, very few data science jobs will be purely in data, unless you are working in research. These days, being a data scientist is an “end to end” job, which means you are tasked with gathering data, modeling, and building out apps to display the data. You’ll need to pull data on your own, often from multiple sources of different types, rather than relying on an engineer to retrieve it, before you can work your magic with data analysis, predictive models, and the like.
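As a rough illustration of pulling data from sources of different types, the sketch below joins a CSV export of orders with a JSON list of customers and totals revenue by region. The data, the field names, and the `revenue_by_region` helper are all hypothetical; real sources would more likely be databases, APIs, or log files.

```python
import csv
import io
import json

# Hypothetical exports: orders as CSV text, customers as JSON text.
ORDERS_CSV = """customer_id,amount
1,20.00
2,35.50
1,14.50
3,9.99
"""
CUSTOMERS_JSON = (
    '[{"id": 1, "region": "West"},'
    ' {"id": 2, "region": "East"},'
    ' {"id": 3, "region": "West"}]'
)

def revenue_by_region(orders_csv, customers_json):
    """Join orders to customers on customer id and sum order amounts per region."""
    region_of = {c["id"]: c["region"] for c in json.loads(customers_json)}
    totals = {}
    for row in csv.DictReader(io.StringIO(orders_csv)):
        # Orders from customers missing in the JSON land in an "Unknown" bucket.
        region = region_of.get(int(row["customer_id"]), "Unknown")
        totals[region] = totals.get(region, 0.0) + float(row["amount"])
    return totals

totals = revenue_by_region(ORDERS_CSV, CUSTOMERS_JSON)
for region, total in sorted(totals.items()):
    print(f"{region}: {total:.2f}")
```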
There’s absolutely no point in writing amazing programs if they aren’t going to help your business colleagues make better decisions and improve their product or service. Data science’s real power is affecting the business in quantifiable ways, usually by identifying and improving upon the business’s key performance indicators (KPIs).
From there, you can answer questions from your business decision makers framed using these KPIs. The line of questioning usually goes:
– Descriptive questions. What has happened?
– Predictive questions. Where can we use data science to predict what will happen?
– Prescriptive questions. What should we do next?
Lastly, as a data scientist you should always “close the loop” with your stakeholders and review whether your work meets their expectations and maps to ROI.
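The first two kinds of questions can be sketched on a single KPI. In the example below, month-over-month growth answers the descriptive question, and a straight-line trend fitted by ordinary least squares gives a naive answer to the predictive one. The `monthly_sales` figures and both function names are invented for illustration, and a linear trend is a deliberately simple stand-in for whatever model actually fits your data.

```python
def month_over_month_growth(values):
    """Descriptive: fractional change from each month to the next."""
    return [(curr - prev) / prev for prev, curr in zip(values, values[1:])]

def linear_forecast(values):
    """Predictive: fit y = intercept + slope * t by least squares, predict the next period."""
    n = len(values)
    t_mean = (n - 1) / 2
    y_mean = sum(values) / n
    s_ty = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(values))
    s_tt = sum((t - t_mean) ** 2 for t in range(n))
    slope = s_ty / s_tt
    intercept = y_mean - slope * t_mean
    return intercept + slope * n  # t = n is the first period after the data

# Hypothetical monthly sales for one KPI, invented for this example.
monthly_sales = [100, 104, 110, 113, 121]

growth = month_over_month_growth(monthly_sales)
forecast = linear_forecast(monthly_sales)

print("Growth by month:", [f"{g:.1%}" for g in growth])
print(f"Naive forecast for next month: {forecast:.1f}")
```

The prescriptive question is the one code alone cannot answer: deciding what to do about the forecast is a conversation with your business decision makers.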
A critical component of data science is to be able to communicate your findings or your work in a manner that is understandable and easily accessible to your audience. Since your results will drive business decisions, it’s important to create visualizations that your sales, marketing, and finance teams can get behind.
This is where you can get creative with different formats:
– PowerPoint presentations
– Graphs and charts (learn matplotlib for Python and ggplot2 for R)
– Dashboards or out-of-the-box tools (Datadog, Tableau)
– Sophisticated visualization tools such as D3 or Highcharts
– Or even a blog post.
Specializing in statistics, programming, business, and design sounds like a daunting task. But building a background in all these skills is incredibly rewarding because you can tackle the business questions that are racking the brains of your company’s senior leaders. You get to play the hero and use your creativity to answer their most pressing questions: What are our customers going to buy? When are they going to purchase, and how often? What factors cause sales to drop, and how can we set up pre-emptive actions to keep sales consistent? What factors are associated with customer retention and development?
These are just some of the questions a data scientist can answer, making these professionals extremely valuable for the company.
Chalenge Masekera is a data scientist at Salesforce, where he builds products that empower customers to be smarter. His passion is turning data into actionable insights that create the best possible value and strategic direction for businesses, product design and improvement. Chalenge has a Master’s in Information Management and Systems from the University of California, Berkeley.