What IS Data Science? 3 Basic (and Random) Things Product Managers Should Know

Everyone is talking about data science, but how many product managers actually know what data science is, and how it can help us build awesome products?

I work with data scientists in my current role, as I did in my previous one, and I have some (admittedly quite random) thoughts on what product people should know about this relatively new field of tech. My goal is to give PMs who have no experience working with data science some idea of what data scientists do, and how PMs interact with data science teams. (I plan to write in more detail on this topic in the future!)

Here are 3 random but basic things for product managers to understand about data scientists.


1 - Some, but not all, data scientists are machine-learning focused.

Some data scientists work as machine-learning engineers, but plenty of others don't focus on ML at all.

Here’s a Venn diagram to explain how data science fits into machine learning. I couldn’t explain it better with words.

[Image: Venn diagram of data science and machine learning]

2 - The outcome of a data science project is typically a set of insights, not necessarily a piece of software.

Data scientists take raw or structured data sets and try to find patterns and uncover other insights.

Example #1: A data scientist takes a data set like the Twitter Firehose (all tweets from a specified date range), and builds an ML model to determine how often people are talking about Norwegian Forest cats, and what the sentiment is around Norwegian Forest cats.

The outcome is a set of insights, for example:

  • 0.3% of people on Twitter are talking about Norwegian Forest cats. The sentiment of these messages is 80% positive.

  • People tweeting about Norwegian Forest cats are also talking about vegan diets, so there may be a correlation between the two.
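To make that kind of insight a little more concrete, here is a minimal Python sketch of how "share of messages about a topic" and "share that are positive" might be computed. Everything in it is an assumption for illustration: the tweets are invented, the function names are hypothetical, and the tiny word lists stand in for the trained ML model a data scientist would actually build. It also counts messages rather than people.

```python
# Hypothetical tweets, invented for illustration (not real Firehose data).
tweets = [
    "I love my Norwegian Forest cat, she is wonderful",
    "Traffic is terrible today",
    "Norwegian Forest cats are great with kids",
    "My Norwegian Forest cat shed on everything, awful",
    "Trying a new vegan recipe tonight",
]

# Tiny hand-made word lists standing in for a trained sentiment model.
POSITIVE = {"love", "wonderful", "great"}
NEGATIVE = {"terrible", "awful"}


def mentions_topic(text, topic="norwegian forest cat"):
    """Return True if the topic phrase appears anywhere in the text."""
    return topic in text.lower()


def sentiment(text):
    """Label text by counting positive words minus negative words."""
    words = {w.strip(".,!?").lower() for w in text.split()}
    score = len(words & POSITIVE) - len(words & NEGATIVE)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


# Keep only the tweets about the topic, then summarize them.
on_topic = [t for t in tweets if mentions_topic(t)]
share = len(on_topic) / len(tweets)
positive_share = sum(sentiment(t) == "positive" for t in on_topic) / len(on_topic)

print(f"{share:.0%} of tweets mention the topic")
print(f"{positive_share:.0%} of those are positive")
```

Notice that the result is a pair of numbers to discuss with your team, not a feature you can ship.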

Data science outcomes are one piece of the product puzzle, not necessarily lines of code that will become a functional piece of your product (as in the example above). However, they could also become a central part of the product (as in the example below).

Example #2: A data scientist takes thousands of emails and creates a supervised learning model that puts emails into two groups: spam, not spam.

The outcome of this type of supervised model may be a central piece of an email spam product that Hooli builds and sells.
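For the curious, here is a rough Python sketch of what "supervised" means in the spam example: the model learns from emails that a human has already labeled. It uses a very small naive Bayes classifier, which is one common choice for text and an assumption on my part, not necessarily what any real spam product uses. The six training emails and the function names are invented for illustration.

```python
import math
from collections import Counter

# Hypothetical labeled training emails, invented for illustration.
training = [
    ("win a free prize now", "spam"),
    ("free money click now", "spam"),
    ("claim your free prize", "spam"),
    ("meeting notes for monday", "not spam"),
    ("lunch on monday", "not spam"),
    ("project notes attached", "not spam"),
]


def train(examples):
    """Count how often each word appears under each label."""
    word_counts = {"spam": Counter(), "not spam": Counter()}
    label_counts = Counter()
    for text, label in examples:
        label_counts[label] += 1
        word_counts[label].update(text.lower().split())
    vocab = set(word_counts["spam"]) | set(word_counts["not spam"])
    return word_counts, label_counts, vocab


def classify(text, model):
    """Return the label whose word statistics best match the text."""
    word_counts, label_counts, vocab = model
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in label_counts:
        # Start from how common the label is overall.
        score = math.log(label_counts[label] / total)
        n_words = sum(word_counts[label].values())
        for word in text.lower().split():
            # Add-one smoothing so unseen words do not zero out the score.
            count = word_counts[label][word] + 1
            score += math.log(count / (n_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


model = train(training)
print(classify("free prize inside", model))
print(classify("notes from the monday meeting", model))
```

Unlike the Twitter example, the `classify` function here is something a product could call on every incoming email, which is why this kind of model can end up at the center of the product.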

3 - What do data scientists do all day?

Most of the time, the goal of a data scientist is to predict future behavior based on past behavior.

But how do they do that?

  1. They gather data - e.g. downloading an open data set for a genomics project, like 1000 Genomes.

  2. They prepare data - e.g. cleaning it up by eliminating duplicates, misspellings, etc., and probably transforming the data as well.

  3. They do data modeling - e.g. training a visual recognition model to recognize hotdogs in photos.
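The "prepare data" step is the least glamorous of the three, so here is a small Python sketch of what it can look like. The records, the `CORRECTIONS` table, and the function names are all hypothetical and invented for illustration; real projects usually lean on dedicated tools and far messier data.

```python
# Hypothetical raw records, invented for illustration.
raw_records = [
    {"name": "Red Panda ", "weight_kg": "5.4"},
    {"name": "red panda", "weight_kg": "5.4"},   # duplicate once normalized
    {"name": "Red Pnada", "weight_kg": "6.1"},   # misspelling
    {"name": "Giant Panda", "weight_kg": "98"},
]

# Hand-written fixes for known misspellings (an assumption for this sketch).
CORRECTIONS = {"red pnada": "red panda"}


def clean(record):
    """Normalize the name and transform the weight from text to a number."""
    name = record["name"].strip().lower()
    name = CORRECTIONS.get(name, name)
    return {"name": name, "weight_kg": float(record["weight_kg"])}


def prepare(records):
    """Clean every record and drop exact duplicates, keeping first seen."""
    seen = set()
    cleaned = []
    for record in records:
        row = clean(record)
        key = (row["name"], row["weight_kg"])
        if key not in seen:
            seen.add(key)
            cleaned.append(row)
    return cleaned


prepared = prepare(raw_records)
for row in prepared:
    print(row)
```

Only after this kind of cleanup does it make sense to hand the data to a model.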

Data scientists do lots of other stuff, but these are the three main things that all the data scientists I’ve worked with do.

It’s been a while, so I present you with this adorable red panda.

Side note: A data scientist could most certainly build a model to identify all red pandas in a data set containing thousands of animal photos.