If you had 2 years to start fresh in data science, what skills would you learn first?

If you were starting over and had two years to focus on learning data science, which skills would you prioritize, and why?

Two years is plenty of time. Break it into 52 two-week sprints and aim to create something useful every sprint or two. Building projects is the best way to learn and helps create a portfolio for potential employers.

Start with SQL and Pandas. These are critical for data manipulation and cleaning, which make up a significant part of data science work. Learn to build pipelines that pull data from a source, clean it, and load it into a warehouse like BigQuery. Automate repetitive tasks and create dashboards. Focus on building things rather than getting stuck in tutorials.
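
For anyone who wants a concrete starting point, here is a minimal sketch of a pull, clean, and load pipeline. It assumes pandas is installed and uses an in-memory SQLite database as a stand-in for a warehouse like BigQuery. The sample rows, the table name `orders_clean`, and the function names are all made up for illustration.

```python
import sqlite3

import pandas as pd


def extract() -> pd.DataFrame:
    # Stand-in for pulling from an API or a CSV export (hypothetical data).
    return pd.DataFrame(
        {
            "order_id": [1, 2, 2, 3],
            "customer": [" Ann", "bob ", "bob ", None],
            "amount": ["10.5", "7", "7", "3.25"],
        }
    )


def clean(raw: pd.DataFrame) -> pd.DataFrame:
    # Drop exact duplicate rows and rows with no customer name.
    df = raw.drop_duplicates().dropna(subset=["customer"]).copy()
    # Normalize whitespace and capitalization in names.
    df["customer"] = df["customer"].str.strip().str.title()
    # Amounts arrived as strings; convert them to numbers.
    df["amount"] = pd.to_numeric(df["amount"])
    return df


def load(df: pd.DataFrame, conn: sqlite3.Connection, table: str = "orders_clean") -> None:
    # Replace the table on each run so the pipeline can be re-run safely.
    df.to_sql(table, conn, if_exists="replace", index=False)


conn = sqlite3.connect(":memory:")  # SQLite here, not BigQuery (assumption)
load(clean(extract()), conn)

# Query the loaded table with plain SQL to check the result.
totals = pd.read_sql_query(
    "SELECT customer, SUM(amount) AS total "
    "FROM orders_clean GROUP BY customer ORDER BY customer",
    conn,
)
print(totals)
```

Once something like this works locally, swapping the load step for a real warehouse client and scheduling the script is a reasonable next sprint.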

@Leighton
Completely agree.

@Leighton
Great advice! Building projects forces you to apply what you learn and exposes gaps in your knowledge.

It’s surprising no one mentioned statistics yet. A strong foundation in statistics is essential for understanding and implementing machine learning models.

Cade said:
It’s surprising no one mentioned statistics yet. A strong foundation in statistics is essential for understanding and implementing machine learning models.

I always assumed statistics was a prerequisite for entering the field.

SQL is a must-have skill. If you don’t have at least an intermediate level of SQL, you’ll be limited in what you can do.
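
As a rough yardstick for "intermediate," I'd say being comfortable combining a CTE, an aggregate, and an outer join in one query. Here is a small sketch you can run with only Python's built-in `sqlite3` module. The `customers` and `orders` tables and their rows are invented for the example, and other databases may differ slightly in syntax.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
    INSERT INTO customers VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
    INSERT INTO orders VALUES (1, 1, 20.0), (2, 1, 30.0), (3, 2, 5.0);
    """
)

query = """
WITH totals AS (                      -- aggregate orders per customer
    SELECT customer_id,
           COUNT(*)    AS n_orders,
           SUM(amount) AS spent
    FROM orders
    GROUP BY customer_id
)
SELECT c.name,
       COALESCE(t.n_orders, 0) AS order_count,  -- customers with no orders get 0
       COALESCE(t.spent, 0.0)  AS total_spent
FROM customers AS c
LEFT JOIN totals AS t ON t.customer_id = c.id   -- keep customers without orders
ORDER BY total_spent DESC, c.name;
"""

rows = conn.execute(query).fetchall()
for name, order_count, total_spent in rows:
    print(name, order_count, total_spent)
```

The `LEFT JOIN` plus `COALESCE` is the part people tend to miss: an inner join would silently drop the customer with no orders.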

Zola said:
SQL is a must-have skill. If you don’t have at least an intermediate level of SQL, you’ll be limited in what you can do.

I’d add Excel to that list. Many business leaders prefer initial analysis in Excel before diving into code.

Start with SQL and Pandas. These two tools cover a huge portion of data analysis and manipulation tasks. Once you’re good at them, you’ll find it easier to transition to more advanced tools.

Mars said:
Start with SQL and Pandas. These two tools cover a huge portion of data analysis and manipulation tasks. Once you’re good at them, you’ll find it easier to transition to more advanced tools.

If you master SQL and Pandas, you can accomplish a lot. They’re worth investing time in.

It depends on your goals. For a career in machine learning, focus on Python, linear algebra, and calculus. For data analysis roles, prioritize SQL and tools like Tableau or Power BI.

Will said:
It depends on your goals. For a career in machine learning, focus on Python, linear algebra, and calculus. For data analysis roles, prioritize SQL and tools like Tableau or Power BI.

Also, Python is a universal skill in this field, so make it a priority regardless of your specific path.

Here’s how I’d allocate my time:

  • Math and statistics: 25%
  • SQL, Pandas, and data cleaning: 15%
  • Machine learning fundamentals: 25%
  • Building projects and learning MLOps: 35%

The fundamentals are critical, so don’t skip them.
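
To make those percentages concrete, here is a quick sketch that converts them into weeks. It assumes the two years are counted as 104 weeks; the dictionary keys are just shortened labels for the four areas above.

```python
TOTAL_WEEKS = 104  # two years at 52 weeks each (assumption)

# Percent of total time per area, as listed above.
allocation = {
    "math_and_statistics": 25,
    "sql_pandas_cleaning": 15,
    "ml_fundamentals": 25,
    "projects_and_mlops": 35,
}

# Sanity check: the shares should cover all of the time.
assert sum(allocation.values()) == 100

# Convert each share into weeks, rounded to one decimal place.
weeks = {area: round(TOTAL_WEEKS * pct / 100, 1) for area, pct in allocation.items()}

for area, n in weeks.items():
    print(f"{area}: {n} weeks")
```

That works out to about half a year each on math and ML fundamentals, roughly 16 weeks on SQL, Pandas, and cleaning, and around 36 weeks on projects and MLOps.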

I’d focus on foundational skills like SQL, Python, and Tableau for visualization. Once you’re confident in those, move on to machine learning and cloud-based tools.

Get a job in any industry and start applying data skills to real problems. That experience will be more valuable than just learning tools.