Description
This course builds on foundational knowledge of data science and Python programming. Participants learn more advanced techniques for data manipulation, analysis, and visualization, and delve into machine learning algorithms and model evaluation. The class also offers an optional two-day final data science project built around the practical application of the course's core concepts.
Audience
This class is for programmers familiar with the basics of data science, machine learning, and Python who want to expand their skills into more advanced machine learning algorithms and more complex data analysis techniques.
Prerequisites
 Six months or more of Python programming experience is recommended
 Familiarity with data science concepts and libraries (pandas, NumPy, Matplotlib, scikit-learn)
 Completion of an introductory data science course or equivalent experience
Objectives
The course has five core objectives, spanning data manipulation, visualization with Seaborn, and advanced machine learning.

 Apply advanced data manipulation techniques with pandas.
 Utilize advanced data visualization methods with Matplotlib and Seaborn.
 Implement regression and regularization techniques in machine learning.
 Develop classification models and evaluate their performance.
 Apply unsupervised learning and dimensionality reduction techniques.
Outline
Chapter 1: Advanced Data Manipulation with Pandas

 Multi-indexing
 Reshaping data
 Merging and joining data
 Time series data manipulation
 Handling missing data
Chapter 2: Advanced Data Visualization with Matplotlib and Seaborn
 Customizing plots with Matplotlib
 Plotting geographic data
 Visualizing distributions and relationships with Seaborn
 Interactive visualizations with Plotly
Chapter 3: Machine Learning: Regression and Regularization

 Linear regression
 Polynomial regression
 Regularization techniques (L1 and L2 regularization)
 Model evaluation metrics for regression (R-squared, RMSE)
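
As an illustration of the two regression metrics named in this chapter, here is a minimal sketch in plain Python using their standard definitions. The function names `rmse` and `r_squared` and the sample values are assumptions made for this example, not course material. In practice a library such as scikit-learn would normally be used instead.

```python
import math

def rmse(y_true, y_pred):
    """Root mean squared error: square root of the mean squared residual."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - (residual SS / total SS)."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Hypothetical targets and predictions, chosen only for illustration.
y_true = [3.0, 5.0, 7.0, 9.0]
y_pred = [2.0, 5.0, 8.0, 9.0]

print(f"RMSE: {rmse(y_true, y_pred):.4f}")
print(f"R-squared: {r_squared(y_true, y_pred):.4f}")
```

RMSE is expressed in the units of the target, while R-squared is unitless, which is one reason the two are often reported together.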
Chapter 4: Machine Learning: Classification and Model Evaluation

 Logistic regression
 Decision trees and ensemble methods (Random Forest, Gradient Boosting)
 Model evaluation metrics for classification (accuracy, precision, recall, F1-score)
 Cross-validation and hyperparameter tuning
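
The classification metrics listed in this chapter can be sketched from confusion-matrix counts for a binary problem. This is a plain-Python illustration under stated assumptions: the helper name `binary_metrics` and the labels are hypothetical, and the positive class is assumed to be `1`.

```python
def binary_metrics(y_true, y_pred):
    """Return accuracy, precision, recall, and F1 for binary labels (positive = 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    accuracy = (tp + tn) / len(y_true)
    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}

# Hypothetical ground truth and predictions, for illustration only.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 1, 0, 1, 0]

metrics = binary_metrics(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

Returning `0.0` for an undefined precision or recall is a convention chosen for this sketch. Other conventions exist, so the behavior of any library used should be checked.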
Chapter 5: Unsupervised Learning and Dimensionality Reduction

 Principal Component Analysis (PCA)
 Clustering algorithms (K-Means, DBSCAN)
 Evaluating clustering performance
 Applications of unsupervised learning
Chapter 6: Text Processing and Natural Language Processing (NLP)
 Text preprocessing (tokenization, stemming, lemmatization)
 Bag-of-words model
 Sentiment analysis
 Topic modeling with Latent Dirichlet Allocation (LDA)
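
To make the bag-of-words model from this chapter concrete, the following sketch counts word occurrences per document over a shared vocabulary, using only the Python standard library. The tokenizer (lowercasing plus a simple regular expression), the function names, and the two sample sentences are assumptions for illustration. Real pipelines typically add steps such as stemming or lemmatization.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())

def bag_of_words(docs):
    """Return a sorted vocabulary and one count vector per document."""
    counts = [Counter(tokenize(doc)) for doc in docs]
    vocabulary = sorted(set().union(*counts))
    # A Counter returns 0 for missing keys, so absent words get a zero count.
    vectors = [[c[word] for word in vocabulary] for c in counts]
    return vocabulary, vectors

# Hypothetical two-document corpus.
docs = ["The cat sat on the mat", "The dog sat"]
vocabulary, vectors = bag_of_words(docs)

print(vocabulary)
for vec in vectors:
    print(vec)
```

Word order is discarded in this representation. Each document becomes a fixed-length vector of counts, which is what makes it usable as input to the classifiers and topic models listed above.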
Chapter 7: Introduction to Deep Learning with TensorFlow

 Basics of neural networks
 Building and training a neural network using TensorFlow
 Convolutional Neural Networks (CNNs) for image classification
 Recurrent Neural Networks (RNNs) for sequence data