I currently work as a Senior Data Scientist at a Fortune 50 company. Before this, I worked at multiple startups as a Data Analyst and a Data Scientist. I have interviewed for data roles more than 30 times in the last 4 years and have also conducted technical rounds for entry-level data scientists.

I spent an enormous amount of time learning everything listed in popular curriculums, but some of those topics are not expected for entry-level jobs, and the time they take can make you lose interest in studying. A sizeable portion of what those curriculums cover only makes sense to learn once you are working as a Data Scientist or preparing for senior-role interviews.

With that in mind, I want to create a **simple and easy-to-follow curriculum** that can help freshers venture into the Data Scientist field as soon as possible.

The curriculum:

- Can help you crack 95% of interviews for entry-level jobs in the Data Science field
- Can help you start a career in a non-FAANG company for a base salary above $95k
- Can be completed in 3 months
- Can help you switch from a non-technical role as soon as possible

We will approach the learning process through 4 stages: *Crawl, Walk, Run, and Fly*

## Crawl

**Python Basics**

Start by learning the fundamentals of the Python programming language, focusing on concepts relevant to data science.

Understand data types, variables, conditional statements, loops, and functions in Python.
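To make these basics concrete, here is a minimal sketch that touches each one: variables of different types, a conditional, a loop, and a function. The names (`describe`, `labels`) are made up for illustration.

```python
# Variables and basic data types
name = "Ada"          # str
age = 36              # int
height_m = 1.68       # float
is_analyst = True     # bool


def describe(value):
    """Return a short label for a number using a conditional."""
    if value < 0:
        return "negative"
    elif value == 0:
        return "zero"
    return "positive"


# A loop that calls the function on each element of a list
labels = []
for number in [-2, 0, 5]:
    labels.append(describe(number))

print(labels)
```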

**Data Structures**

Dive into data structures in Python, with a specific focus on **lists and dictionaries**. Explore their properties, learn how to manipulate and access elements, and understand when to use each data structure.
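A rough rule of thumb: reach for a list when order and position matter, and for a dictionary when you look things up by a key. The short sketch below uses made-up values to show the most common operations on each.

```python
# A list keeps items in order and is accessed by position.
scores = [72, 88, 95]
scores.append(60)                            # add to the end
first_score = scores[0]                      # positions start at 0
top_two = sorted(scores, reverse=True)[:2]   # sort, then slice

# A dictionary maps keys to values and is accessed by key.
salaries = {"analyst": 70000, "scientist": 95000}
salaries["engineer"] = 90000                 # add or overwrite a key
intern_pay = salaries.get("intern", 0)       # default when the key is missing

# Iterate over key-value pairs with a list comprehension
high_paying = [role for role, pay in salaries.items() if pay >= 90000]
```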

**Pandas and NumPy**

Learn to load data into Pandas DataFrames, perform basic data manipulation tasks, and handle missing values using Pandas.

Utilize NumPy for numerical computations, working with arrays, and performing mathematical operations.
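As a small illustration of both libraries, the sketch below builds a DataFrame from made-up data, fills a missing value, filters and aggregates, and then standardizes a NumPy array. The column names and numbers are assumptions for the example; with a real file you would typically start from `pd.read_csv`.

```python
import numpy as np
import pandas as pd

# Hypothetical data; with a real file you would start from pd.read_csv(...)
df = pd.DataFrame(
    {
        "city": ["Austin", "Boston", "Austin", "Denver"],
        "sales": [120.0, np.nan, 80.0, 100.0],
    }
)

# Handle missing values: count them, then fill with the column mean
missing_count = int(df["sales"].isna().sum())
df["sales"] = df["sales"].fillna(df["sales"].mean())

# Basic manipulation: filter rows and aggregate by group
big_sales = df[df["sales"] >= 100]
sales_by_city = df.groupby("city")["sales"].sum()

# NumPy: vectorized math on arrays, no explicit loop needed
values = np.array([1.0, 2.0, 3.0, 4.0])
scaled = (values - values.mean()) / values.std()
```

Filling with the mean is only one option; dropping rows or filling with the median can be a better fit depending on the data.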

**Data Visualization (Matplotlib)**

Explore different plot types, such as line plots, bar plots, scatter plots, and histograms.

Learn to visualize data using the Matplotlib library in Python.

Learn how to customize plot aesthetics, add labels, titles, and legends to enhance visual communication.

Practice creating visually appealing plots using Matplotlib to effectively present insights from data.
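Here is one possible starting point: a line plot and a histogram side by side, each with labels and a title, plus a legend on the line plot. The data is invented, and the `Agg` backend line is only there so the script runs without a display.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend; omit this line inside Jupyter
import matplotlib.pyplot as plt

# Made-up data for illustration
months = [1, 2, 3, 4, 5]
revenue = [10, 12, 9, 15, 18]
costs = [8, 9, 9, 11, 12]

fig, (ax_line, ax_hist) = plt.subplots(1, 2, figsize=(10, 4))

# Line plot with axis labels, a title, and a legend
ax_line.plot(months, revenue, marker="o", label="Revenue")
ax_line.plot(months, costs, linestyle="--", label="Costs")
ax_line.set_xlabel("Month")
ax_line.set_ylabel("Amount (thousand USD)")
ax_line.set_title("Revenue vs. costs")
ax_line.legend()

# Histogram of the revenue values
ax_hist.hist(revenue, bins=3, color="tab:green", edgecolor="black")
ax_hist.set_xlabel("Revenue (thousand USD)")
ax_hist.set_title("Revenue distribution")

fig.tight_layout()
```

In a notebook the figure renders inline; in a script you can call `fig.savefig` with a file name of your choice.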

**Explore Jupyter Notebook and Jupyter Lab**

- Jupyter lets you write and execute code interactively. This interactivity makes it easier to experiment, test code snippets, and see immediate results.

**SQL Essentials**

Basic SQL Querying

Master the fundamental SQL operations for data retrieval from relational databases.

Understand SELECT, FROM, WHERE, GROUP BY, ORDER BY, HAVING, and WITH clauses.

Essential SQL Functions

Understand COUNT, MIN, MAX, SUM, IF, and IFNULL, along with the DISTINCT keyword.

Utilize these functions to perform data aggregations and transformations.
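You can practice most of these clauses and functions without installing a database, because Python ships with SQLite. The sketch below uses a made-up `orders` table and combines WITH, WHERE, GROUP BY, HAVING, and ORDER BY with COUNT, SUM, MIN, MAX, DISTINCT, and IFNULL. IF is dialect-specific (MySQL has it, for instance), so it is left out here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE orders (id INTEGER, customer TEXT, amount REAL, coupon TEXT);
    INSERT INTO orders VALUES
        (1, 'ana', 50.0, 'SAVE10'),
        (2, 'ana', 30.0, NULL),
        (3, 'bo',  20.0, NULL),
        (4, 'cy',  90.0, 'VIP'),
        (5, 'cy',  10.0, NULL);
    """
)

query = """
WITH labeled AS (
    -- Keep orders of at least 20 and replace missing coupons with 'none'
    SELECT customer, amount, IFNULL(coupon, 'none') AS coupon
    FROM orders
    WHERE amount >= 20
)
SELECT customer,
       COUNT(*)               AS n_orders,
       SUM(amount)            AS total,
       MIN(amount)            AS smallest,
       MAX(amount)            AS largest,
       COUNT(DISTINCT coupon) AS n_coupons
FROM labeled
GROUP BY customer
HAVING SUM(amount) > 25
ORDER BY total DESC;
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
conn.close()
```

Note the order of evaluation: WHERE filters individual rows before grouping, while HAVING filters the groups after aggregation.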

**Math Fundamentals** (Hands-on using Python Libraries)

Learn how to measure the central tendency of a variable using mean, median, and mode.

Learn the concepts behind variance and standard deviation as measures of dispersion.

Learn how to measure the relationship between variables using correlation and covariance.
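A quick way to get hands-on with these measures is Python's built-in `statistics` module. In the sketch below the data is invented, and covariance and correlation are written out by hand so the formulas stay visible (NumPy and Pandas offer equivalents).

```python
import statistics

hours = [2, 4, 4, 6, 9]        # hypothetical hours studied
scores = [50, 60, 65, 70, 90]  # hypothetical exam scores

# Central tendency
mean_hours = statistics.mean(hours)
median_hours = statistics.median(hours)
mode_hours = statistics.mode(hours)

# Dispersion (sample versions, which divide by n - 1)
var_hours = statistics.variance(hours)
std_hours = statistics.stdev(hours)

# Sample covariance: average product of deviations from the means
mean_scores = statistics.mean(scores)
n = len(hours)
covariance = sum(
    (h - mean_hours) * (s - mean_scores) for h, s in zip(hours, scores)
) / (n - 1)

# Pearson correlation: covariance rescaled by both standard deviations
correlation = covariance / (std_hours * statistics.stdev(scores))
```

Covariance depends on the units of both variables, while correlation always falls between -1 and 1, which makes it easier to compare across datasets.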

## Walk

**Machine Learning 101 with Python**

Linear Regression

Logistic Regression

Feature Engineering: Learn techniques for data preprocessing, handling missing values, scaling data and feature selection.

Learn how to measure the success of the ML model using performance metrics: Mean Squared Error (MSE), Mean Absolute Error (MAE), R-squared (R^2), Precision, Recall, Confusion Matrix, F1 Score, ROC-AUC.
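Interviewers often ask you to define these metrics, so it helps to compute them by hand at least once. The sketch below does that in plain Python on made-up predictions; in practice you would more likely use the functions in `sklearn.metrics`. ROC-AUC is left out because it needs predicted probabilities rather than hard labels.

```python
# Regression metrics on made-up values
y_true = [3.0, 5.0, 7.0, 9.0]
y_pred = [2.5, 5.0, 8.0, 9.5]
n = len(y_true)

mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n
mae = sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n

# R^2 = 1 - (residual sum of squares / total sum of squares)
mean_true = sum(y_true) / n
ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
ss_tot = sum((t - mean_true) ** 2 for t in y_true)
r2 = 1 - ss_res / ss_tot

# Classification metrics from the confusion matrix (1 = positive class)
labels = [1, 0, 1, 1, 0, 0, 1, 0]
preds = [1, 0, 0, 1, 0, 1, 1, 0]
pairs = list(zip(labels, preds))
tp = sum(1 for a, p in pairs if a == 1 and p == 1)  # true positives
fp = sum(1 for a, p in pairs if a == 0 and p == 1)  # false positives
fn = sum(1 for a, p in pairs if a == 1 and p == 0)  # false negatives
tn = sum(1 for a, p in pairs if a == 0 and p == 0)  # true negatives

precision = tp / (tp + fp)  # of predicted positives, how many were right
recall = tp / (tp + fn)     # of actual positives, how many were found
f1 = 2 * precision * recall / (precision + recall)
```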

**Advanced Math**

Hypothesis Testing

Learn hypothesis testing techniques: t-tests, z-tests, ANOVA, and Chi-Square tests.

Understand when and how to apply these tests using Python for data analysis and making inferences.

Grasp concepts like null and alternative hypotheses, p-values, and confidence intervals.
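To see how the null hypothesis, p-value, and confidence interval fit together, here is a sketch of a one-sample, two-sided z-test using only the standard library. The sample, the hypothesized mean `mu_0`, and the "known" population standard deviation `sigma` are all assumptions made up for the example. For t-tests, ANOVA, and Chi-Square tests you would normally turn to `scipy.stats`.

```python
import math
from statistics import NormalDist, mean

# Hypothetical sample: page load times in seconds
sample = [2.9, 3.4, 3.1, 3.6, 3.3, 3.5, 3.2, 3.4, 3.0, 3.6]
mu_0 = 3.0     # null hypothesis: the true mean is 3.0
sigma = 0.25   # population standard deviation, assumed known for a z-test
alpha = 0.05   # significance level

n = len(sample)
sample_mean = mean(sample)
standard_error = sigma / math.sqrt(n)

# Test statistic and two-sided p-value
z = (sample_mean - mu_0) / standard_error
p_value = 2 * (1 - NormalDist().cdf(abs(z)))

# 95% confidence interval for the mean
z_crit = NormalDist().inv_cdf(1 - alpha / 2)
ci = (sample_mean - z_crit * standard_error,
      sample_mean + z_crit * standard_error)

# Reject the null hypothesis when the p-value falls below alpha
reject_null = p_value < alpha
```

The two views agree: the null hypothesis is rejected at level `alpha` exactly when `mu_0` falls outside the confidence interval.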

Statistics

Explore key statistics concepts: random variables, probability distributions, and measures of variance.

Study various probability distributions (normal, binomial, exponential) and work with them in Python.

Gain knowledge in inferential statistics, including sampling distributions and confidence intervals.

Probability (central limit theorem, Bayes's theorem)

Understand the central limit theorem's significance and its connection to sampling distributions.

Dive into Bayes's theorem and its applications in conditional probability and Bayesian statistics.
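Both ideas are easier to remember once you have run the numbers yourself. The sketch below first applies Bayes's theorem to a classic screening-test setup (all rates are made up), then simulates the central limit theorem by averaging samples drawn from a skewed exponential distribution.

```python
import random
import statistics

# Bayes's theorem: P(disease | positive test), with made-up rates
p_disease = 0.01             # prior probability of the disease
p_pos_given_disease = 0.95   # sensitivity
p_pos_given_healthy = 0.05   # false positive rate

# Total probability of a positive test, then the posterior
p_pos = p_pos_given_disease * p_disease + p_pos_given_healthy * (1 - p_disease)
p_disease_given_pos = p_pos_given_disease * p_disease / p_pos

# Central limit theorem: means of samples from an exponential distribution
# (population mean 1.0, standard deviation 1.0) cluster around 1.0
rng = random.Random(42)
sample_size = 50
sample_means = [
    statistics.mean([rng.expovariate(1.0) for _ in range(sample_size)])
    for _ in range(2000)
]
mean_of_means = statistics.mean(sample_means)
# Theory predicts a spread of about 1 / sqrt(50), roughly 0.141
spread_of_means = statistics.stdev(sample_means)
```

Even with a 95% sensitive test, the posterior stays low (around 16% here) because the disease is rare, which is exactly the kind of result interviewers like to probe.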

**Advanced SQL**

Joins

Master the different types of SQL joins (LEFT JOIN, RIGHT JOIN, CROSS JOIN, and INNER JOIN), along with UNION ALL for stacking result sets.

Understand when and how to use each type of join to combine data from multiple tables.
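The differences between these operations are easiest to see on tiny tables. The sketch below uses SQLite through Python with two made-up tables, `customers` and `orders`. RIGHT JOIN is skipped because older SQLite versions do not support it; it behaves like a LEFT JOIN with the tables swapped.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customers (id INTEGER, name TEXT);
    CREATE TABLE orders (customer_id INTEGER, amount REAL);
    INSERT INTO customers VALUES (1, 'ana'), (2, 'bo'), (3, 'cy');
    INSERT INTO orders VALUES (1, 50.0), (1, 30.0), (3, 90.0), (4, 15.0);
    """
)

# INNER JOIN keeps only customers that have a matching order
inner_rows = conn.execute(
    """
    SELECT c.name, o.amount
    FROM customers AS c
    INNER JOIN orders AS o ON o.customer_id = c.id
    ORDER BY c.name, o.amount
    """
).fetchall()

# LEFT JOIN keeps every customer; a missing order shows up as NULL (None)
left_rows = conn.execute(
    """
    SELECT c.name, o.amount
    FROM customers AS c
    LEFT JOIN orders AS o ON o.customer_id = c.id
    ORDER BY c.name, o.amount
    """
).fetchall()

# CROSS JOIN pairs every customer with every order (3 x 4 rows)
cross_count = conn.execute(
    "SELECT COUNT(*) FROM customers CROSS JOIN orders"
).fetchone()[0]

# UNION ALL stacks two result sets and keeps duplicates
union_rows = conn.execute(
    "SELECT id FROM customers UNION ALL SELECT customer_id FROM orders"
).fetchall()
conn.close()
```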

Window Functions

Dive into window functions in SQL, including RANK, DENSE_RANK, and ROW_NUMBER.

Learn how to use window functions to perform calculations and analyze data within specified partitions or windows.
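A common interview question is how RANK, DENSE_RANK, and ROW_NUMBER differ when there are ties. The sketch below shows all three on a made-up `scores` table, partitioned by team. It assumes the SQLite bundled with your Python is version 3.25 or newer, which is when window functions were added.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE scores (team TEXT, player TEXT, points INTEGER);
    INSERT INTO scores VALUES
        ('red', 'ana', 30), ('red', 'bo', 30), ('red', 'cy', 20),
        ('blue', 'dee', 25), ('blue', 'eli', 10);
    """
)

rows = conn.execute(
    """
    SELECT team, player, points,
           RANK()       OVER (PARTITION BY team ORDER BY points DESC) AS rnk,
           DENSE_RANK() OVER (PARTITION BY team ORDER BY points DESC) AS dense_rnk,
           ROW_NUMBER() OVER (PARTITION BY team ORDER BY points DESC, player) AS row_num
    FROM scores
    ORDER BY team, row_num
    """
).fetchall()
for row in rows:
    print(row)
conn.close()
```

In the red team, the two players tied at 30 points share rank 1. RANK then skips to 3 for the next player, DENSE_RANK continues with 2, and ROW_NUMBER never repeats a value.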

## Run

**Advanced Machine Learning Concepts**

**Algorithms**

Supervised Learning Algorithms:

- Ridge and Lasso
- Decision Trees
- Random Forest
- XGBoost
- K Nearest Neighbors
- Support Vector Machines

Unsupervised Learning Algorithms:

- K Means Clustering
- Principal Component Analysis
- DBSCAN (Optional)

Time Series Forecasting Algorithms:

- Autoregressive Integrated Moving Average (ARIMA)
- Exponential Smoothing Methods
- Prophet (developed by Facebook)

**Handling Advanced ML Models**

Learn techniques to handle overfitting and underfitting, which are common challenges in machine learning.

Understand strategies for dealing with unbalanced data, where the classes are not evenly distributed.

Learn how to implement Principal Component Analysis (PCA) for dimensionality reduction and feature extraction.
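For unbalanced data, one simple strategy among several (resampling is another) is to weight each class inversely to its frequency. The sketch below computes such weights by hand on made-up labels, using the common "balanced" heuristic of `n_samples / (n_classes * class_count)`. Many scikit-learn estimators accept a dictionary like this through their `class_weight` parameter.

```python
from collections import Counter

# Hypothetical labels: 90 negatives and 10 positives
labels = [0] * 90 + [1] * 10

counts = Counter(labels)
n_samples = len(labels)
n_classes = len(counts)

# "Balanced" heuristic: weight = n_samples / (n_classes * class_count),
# so the rare class gets a proportionally larger weight
class_weights = {
    cls: n_samples / (n_classes * count) for cls, count in counts.items()
}
print(class_weights)
```

With these weights, a mistake on the rare class costs the model about nine times as much as a mistake on the common class.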

**Portfolio Project**

Work on a portfolio project that demonstrates your understanding of advanced machine learning concepts.

Use Kaggle projects as references or find real-world datasets to solve interesting machine-learning problems.

Create reusable machine learning boilerplate code in a Jupyter Notebook that you can adapt for take-home assignments and ML coding rounds.
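What goes into the boilerplate is up to you; the sketch below is one possible skeleton using scikit-learn, not a definitive template. It assumes a binary classification task and uses a synthetic dataset as a stand-in, so swap in your own data loading, model, and metrics.

```python
from sklearn.datasets import make_classification
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for a take-home dataset
X, y = make_classification(n_samples=500, n_features=10, random_state=0)

# Hold out a stratified test set so class proportions stay the same
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Preprocessing and model live in one pipeline, so statistics from the
# test set never leak into the imputer or the scaler
model = Pipeline(
    steps=[
        ("impute", SimpleImputer(strategy="median")),
        ("scale", StandardScaler()),
        ("clf", LogisticRegression(max_iter=1000)),
    ]
)
model.fit(X_train, y_train)

# Evaluate with both a label-based and a probability-based metric
pred = model.predict(X_test)
proba = model.predict_proba(X_test)[:, 1]
f1 = f1_score(y_test, pred)
auc = roc_auc_score(y_test, proba)
print(f"F1: {f1:.3f}  ROC-AUC: {auc:.3f}")
```

Because every step sits inside the pipeline, trying a different model in a coding round usually means changing only the `"clf"` entry.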

## Fly

**Cloud Services for ML Model Deployment**

Familiarize yourself with any one cloud service provider, such as Google Cloud, AWS, or Azure, for deploying ML models.

Gain practical experience by deploying models, and study real-world examples and case studies of ML model deployment.

Learn about **model versioning and monitoring** in production to ensure smooth and efficient deployment.

**Practicing Coding Skills**

Practice solving Easy and Medium category Python and SQL questions on platforms like LeetCode.

Strengthen your coding abilities and problem-solving skills by tackling a variety of coding challenges.

**Creating a Compelling Resume using STAR Method**

Utilize the STAR method to structure your resume and effectively communicate your experiences and achievements.

Provide concise and compelling examples using the Situation, Task, Action, and Result framework to showcase your skills and accomplishments.

**Reaching Out and Applying on LinkedIn**

Leverage the power of LinkedIn to network and explore job opportunities in the data science field.

Connect with professionals, join relevant groups, and actively engage with the data science community.

Throughout your journey, focus on continuous learning and improvement. Stay up to date with the latest advancements in the field and leverage online resources, tutorials, and courses to enhance your skills. Best of luck in your data science endeavors!