Spot the difference? How to hire a data scientist

19 September 2017 By Joyeeta Das

How do you hire a data scientist? Tough question, eh?

It’s a new field. Data scientists are not being trained at the rate at which they are needed; most claim to be one, and many do not know they are one. A shortage of 1.5 million data scientists is predicted by the end of 2018 alone.

In the last three years, I have helped perform due diligence (DD) for many data science companies. I continue to hire for my own company and speak at many forums on this exact subject. Here are my tips on how to hire a data scientist.

But first, let me begin by explaining the roles.

Data engineer

Someone who does the ETL, cleans and munges the data, and keeps it prepped and ready for the data scientists to work on. Basically, they have an end goal to work towards: plugging the data warehouse or lake into an infrastructure that will work on it further. Knowledge of statistics, databases, and scripting tools is required. The potion stirrer. Someone who learns and knows and breathes data.

Data scientist

Someone who takes prepped data and applies a creative, interdisciplinary approach to solve business problems. They may or may not have an end goal such as a given, fixed problem to solve. Machine learning, creative problem solving, Bayesian knowledge, and statistical knowledge are required. The one who breathes life.

Someone who makes data a living, meaningful, organic mass that breathes on its own.

In an ideal world, these would be two different people, but we do not live in an ideal world, so they are usually one and the same.

Think of it this way: when the first movies were made, actors would do their own makeup, their own stunts, and sometimes even write their own dialogue.

Today, specializations and separations have evolved, right?

Similarly, for now, the protagonist of the data science movie does all of the above and even brings their own parasols and outfits to the set.

One recurring problem I notice is that people fall in love with personalities when they find a really cool, awesome scientist. They build plans around them. Great! It is indeed a seller’s market, given that data science is repeatedly ranked among the highest-paid professions in the world. But be wary: this can be very detrimental. At the end of the day, if you are in an organization with a goal, or building a business, you most likely have a business target to achieve.

So, do not create roles around people. People should fit the role you need filled.

In the end, though, it all depends on the unique business you are building, and I suppose there are no golden rules.

Here are some tips I follow.

Make the hiring manager fill out the questionnaire below beforehand. Some of the points are common sense and some are not. Usually, CTOs and hiring managers get swayed by personalities, positively or negatively, and sometimes they lose sight of the objective for which the person was hired. Especially for a nascent discipline such as data science, most comparables end up being “soft points”. Performance quantification is tough.

So being clear about what you need out of the role is important.


Have the manager fill this out before even advertising the role. Managers will oblige, as they are managing one of the most expensive roles in your organization, in terms of both cost and impact.

1. Who is this new hire?

  • Interdisciplinary, problem solver, someone who achieves exactly x and y.

2. What will they do?

  • In the first week, three weeks, six weeks, twelve weeks… and then at steady state.

3. Once hired, what exact deliverables will you evaluate?

  • At one month, two months, three months, and steady state.

4. What KPIs will you use to evaluate the work?

  • Speed, efficiency, creativity, customer feedback, peer applause, etc.

5. Given that data science is evolving, what skills should this person pick up?

  • In three to six months, or within the year.

6. What other ways can this role spin, given business alternatives or pivots?

Help brainstorm and anticipate what other skills they may have. Use these clues to create adverts and let people know about the role.

Finally, when interviewing to see who fits the puzzle, the response to the last question (point 6, on the other ways the role can spin) can help you check for specific kinds of flexible or “nice-to-have” skills in interviewees. It may even serve as a tie-breaker.

Then, once the role is filled, have HR or a manager set up monthly or fortnightly meetings to check that the boss is updating the above framework regularly, so that all evaluations stay objective. This is in the best interest of both the employee and the manager: each gets an unbiased understanding of their own contribution to data science.

Startups need to be agile, which means the above questionnaire should change as the business evolves. If an employee is already hired, bring them into the discussion and incentivize them by sharing transparent goals.

Many companies have HR procedures of a good standard to deal with these, but when dealing with data scientists it is important to be flexible and change the questions to reflect the scarcity of the data science market, as well as the business objective they are meant to fulfil.

In general, ask your interviewers to look for interdisciplinary people, flexibility, and an interest in solving challenges. Very picky people who say “I do not clean data, I only do machine learning” normally go straight to my NO list. You need someone who will own it all and deliver the results you need (and they need) to move together in a fast-changing environment.

Good luck! If you have any follow-up questions on further technicalities, hit me up!

Let’s connect on LinkedIn, or follow me on Twitter. To explore opportunities with Gyana, visit the careers page.

Joyeeta Das

Joyeeta Das is the co-founder and CEO of Gyana.