Building a Data Science Team is Not an Exact Science

As a Marketing Manager, I find building a team fairly simple. It really comes down to online skills (e.g., SEM, SEO, social media) versus offline marketing skills (tradeshows, print, etc.), and then I look at whether the candidate has successfully applied those skills to a specific industry or business type (B2B or B2C). I generally put less weight on a candidate’s background with particular marketing applications and technology because, in my experience, most marketing applications are alike enough that you can learn one if you already know another.

However, this is completely different in the IT world, especially given the number of technology stack options available. In a VB article called “Building data science teams: The power of the technology stack,” author Rodrigo Rivera offers great insight into the complexities of putting together an IT team and how the choice of technology stack affects it. Rivera breaks the discussion into four categories: hiring, team culture, know-how, and projects.

Hiring:

To attract talented candidates, it has been common practice to add exotic programming languages to the job description. The problem is that, in many cases, using an exotic programming language is not the best decision for the IT department. Rivera states, “Be careful not to overuse the strategy of picking popular choices just to get good candidates. The company should ask itself which technologies and programming languages it can and wants to support.” It’s good practice to build a project road map before the hiring process begins and to come up with a persona for each individual on your team, so that you have some overlapping skills but still remain dynamic.

Team Culture:

It’s worth considering that the type of person you hire should reflect the type of stack you plan on using. Cutting-edge technology attracts a completely different profile than traditional technology does, and depending on your team’s dynamic, it may not be a good fit. Rivera says, “This can be very frustrating and toxic for your team culture if the team hits a wall and cannot go into production due to poor technology choices. Hence, if you are on a tight deadline, adopting a new technology can be detrimental for team performance.”

Know-how:

When it comes to know-how, the risk with technology stacks is becoming dependent on specific contributors. People come and go, but your technical debt stays. Rivera states, “one team let their first data science hire freely choose his technology stack. The person decided to use Haskell, a relatively obscure programming language, as their main tool. One year later, the person left the company, and now they have a code base that cannot be maintained because they cannot find appropriate talent.”

Projects:

Rivera believes that the type of projects and the scope of the team will be a major influence on the choice of technology stack. “For example, a data science team with a focus on analytics and ad-hoc reporting works perfectly under an R-centric or Python stack. On the other hand, a team requiring robust recommender systems or fraud detection might be better served with the JVM or even with C++.” It’s important to map out what type of projects your team may be working on to get an understanding of which programming languages might be vital. If the project details are unclear, it’s safe to go with a general-purpose technology.

In IT, the technology stack is a significant factor to consider when building a team, and Rivera offers this rule of thumb: “If your data qualifies as big data, then go for JVM-related technologies. If it does not, go for the Python or R ecosystem. These technology choices have robust libraries for the whole value chain (ETL, middleware, analytics, visualization, etc.), most of them are well documented, there is talent available, and the ecosystems are solid enough to offer peace of mind to your CTO yet modern enough to attract top talent.”

This is a discussion with no single right answer, and I’d really like to get your feedback on how to build a data science team.
