How to Be a Data Scientist

Author: Roger Woodard

What’s exciting about becoming a data scientist is also what makes it challenging.

The excitement comes from the fact that data science is an emerging field and data scientists are in high demand.

However, an emerging field means there’s no defined roadmap that shows aspiring data scientists exactly how to secure “the best job in America.”

Those considering careers in this fast-growing field should first consider what a data scientist actually does.

Data scientists use technical skills and training to generate critical insights and make discoveries in the world of Big Data. These insights are used by companies and organizations to shape important decisions, solve problems, and drive significant change.

Since data science is now influencing almost everything, from product development to sales and marketing to governmental operations, data scientists are emerging as important new players within businesses and organizations across the world.

Data Science Uses and Applications

Industries as diverse as healthcare, sports, government, and finance are discovering the powerful impact of data science. Here are just a few examples of how data science is driving important change:

  • Healthcare: Notre Dame’s Data Science faculty developer, Steven Buechler, applied data science to the field of molecular biology to improve the accuracy of diagnoses and to identify targeted therapies to help breast cancer patients.
  • Sports: In an article in Forbes magazine, Leigh Steinberg describes how the popular sports movie Moneyball demonstrates the power of statistics. It shows how the advanced statistical analysis of Oakland A’s general manager Billy Beane changed player evaluation for his own team, for Major League Baseball, and for every professional sports team today.
  • Payment Cards: Kathryn O’Donnell, Director of Data Science at Capital One, uses her experience leading data science and analytic projects to develop new approaches to combat credit card fraud.

Data Scientist Jobs and Qualifications

As you contemplate a career in data science, it’s also important to consider the various job opportunities available and the skills and competencies employers are seeking in a data scientist.

When LinkedIn released its list of the most in-demand skills of 2018, statistical analysis, data mining, and data presentation, all critical skills for a data scientist, were in the top ten.

However, job descriptions for data scientists will vary from company to company.

In a recent KDNuggets article, Alex Castrounis described a data scientist’s process as similar to the scientific method. Data scientists ask questions or define a problem, collect and leverage data to come up with answers or solutions, test the solution to see if the problem is solved, and iterate as needed to improve or finalize the solution.

Typical job responsibilities include:

  • Data acquisition, collection, and storage
  • Discovery and goal identification (ask the right questions)
  • Accessing, ingesting, and integrating data
  • Processing and cleaning data (munging/wrangling)
  • Initial data investigation and exploratory data analysis (EDA)
  • Choosing one or more potential models and algorithms
  • Applying data science methods and techniques (e.g., machine learning, statistical modeling, artificial intelligence)
  • Measuring and improving results (understanding the data and its quirks through validation and tuning)

Meanwhile, according to the 2016 O’Reilly Data Scientist Salary Survey, in addition to the strong technical and quantitative skills reflected in the responsibilities above, top-performing data scientists need well-honed soft skills in communication and leadership, such as:

  • Teaching and training others
  • Organizing and guiding team projects
  • Identifying business or organization problems to be solved with analytics
  • Communicating findings to non-technical decision-makers and people outside the company or organization
  • Knowing their business and guiding their leaders

Notre Dame alumna Laura Godlewski, a data scientist for Facebook who participated in a corporate panel discussion at our recent immersion in Palo Alto, shared with our data science master’s students the skills and qualifications she cares about in data scientists:

“We can’t be successful without data communication. Otherwise it’s just ‘so what?’ You can come up with all numbers, but nobody understands it unless someone can translate it. That is a key part of your role as a data scientist.

You can’t go on gut instinct. You have to turn to your evidence. Why do you want evidence? Because it is going to support your actual conclusion. Evidence supports your decisions.

You need to understand how important qualitative interpretation is to data, as well as the quantitative. You can’t have one without the other, especially at Facebook. Why? We look at all these behavioral metrics. I understand the ones from the U.S., but I may not understand the nuance of what’s going on in Europe, India, Indonesia, and you are called upon to draw that conclusion and form that hypothesis. It’s really important to understand that qualitative learning is as important as quantitative learning.”

How can you become a top-performing data scientist?

As we designed Notre Dame’s master’s in data science, we collaborated with top-tier industry leaders to ensure our program prepares data scientists with the technical and soft skills that industry needs.

Industry leaders confirmed they are seeking data scientists who are agile thinkers: people who can go beyond the techniques, understand the processes, and apply critical thinking. They need data scientists who know how to effectively communicate with data and who are leaders in the field.

At Notre Dame, we are focused on building three-dimensional data scientists who can execute the technical data science job responsibilities and bring leadership skills to their roles.

Industry leader Maria Lupetini, Director of Engineering, QCT Machine Learning Group at Qualcomm, confirmed, “Data scientists need more stats and mathematical sophistication. They have to display deep statistics and mathematical thinking. And they need to present to the non-technical executive and explain their work.”

In our recent data science information session, our data science faculty member Dr. Alan Huebner reinforced Lupetini’s point:

“Take the analogy of driving a car. For most of us, we just drive it. If something goes wrong, we can’t fix it. Likewise, to become a top performer in data science, you need to get ‘under the hood’ of the algorithms and statistical models. This will help you diagnose problems, explain counterintuitive results, and compare competing results. And you need to be able to explain the problem and the solution to customers in a way they can understand.”

To become a data science leader, you need the right training, and Notre Dame’s online master’s in data science program will get you there. Our program trains students to become three-dimensional data scientists who have the specific qualifications employers are eagerly seeking.

I encourage you to download our student guide to learn more about becoming a three-dimensional data scientist and a leader in the field with our online master’s in data science program.