Introduction to Data Science for Everyone
About This Course
By the end of the presentation, students will have covered two main topics:
A. Data Wrangling with the Python Programming Language
- Be able to import tabular data into Python using Pandas.
- Be able to explore and modify tabular data through various data wrangling approaches, including retrieving dimensions, subsetting, obtaining column statistics, replacing column names, performing mathematical operations, and filtering.
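The wrangling steps listed above can be sketched in a few lines of Pandas. This is only an illustrative example, not course material: the inline CSV text, the column names, and the derived "bmi" column are all made up for the sketch, and in practice pd.read_csv would be pointed at a file path or URL.

```python
import io

import pandas as pd

# Hypothetical inline CSV standing in for a real file or URL.
csv_text = """name,height_cm,weight_kg,group
Ann,160,55,A
Ben,175,80,B
Cara,168,62,A
Dev,182,90,B
"""

# Import tabular data into a DataFrame.
df = pd.read_csv(io.StringIO(csv_text))

# Retrieve dimensions as (rows, columns).
n_rows, n_cols = df.shape

# Subset: keep only two columns.
subset = df[["name", "height_cm"]]

# Obtain a column statistic.
mean_height = df["height_cm"].mean()

# Replace column names.
df = df.rename(columns={"height_cm": "height", "weight_kg": "weight"})

# Perform a mathematical operation across columns.
df["bmi"] = df["weight"] / (df["height"] / 100) ** 2

# Filter rows on a condition.
tall = df[df["height"] > 170]
```

Each step returns a new object or adds a column, so the intermediate results (subset, tall) can be inspected on their own in a notebook cell.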
B. Data Visualization with the Python Programming Language
- Perform data visualization in Python using modules such as Matplotlib, Seaborn, and Plotly.
- Create summary statistics for a single group and for different groups.
- Generate graphical displays of data: histograms, scatter plots, box plots, bar plots, QQ-plots, and pie charts.
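As a minimal sketch of the visualization objectives above, the example below computes summary statistics for one group and by group, then draws a histogram and a box plot with Matplotlib. The scores table is invented for illustration, and the "Agg" backend line is an assumption for running outside a notebook; it can be dropped in Google Colaboratory.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend; not needed in a notebook

import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical data: one numeric column and one grouping column.
scores = pd.DataFrame({
    "group": ["A", "A", "A", "B", "B", "B"],
    "score": [70, 75, 80, 60, 65, 85],
})

# Summary statistics for all rows together.
overall = scores["score"].describe()

# Summary statistic (mean) for each group separately.
by_group = scores.groupby("group")["score"].mean()

fig, (ax_hist, ax_box) = plt.subplots(1, 2, figsize=(8, 3))

# Histogram of the numeric column.
ax_hist.hist(scores["score"], bins=5)
ax_hist.set_title("Histogram of score")
ax_hist.set_xlabel("score")
ax_hist.set_ylabel("count")

# Box plot with one box per group (groupby sorts keys: A, then B).
groups = [g["score"].to_numpy() for _, g in scores.groupby("group")]
ax_box.boxplot(groups)
ax_box.set_xticklabels(["A", "B"])
ax_box.set_title("Score by group")

fig.tight_layout()
```

In a notebook, calling plt.show() after the last line displays the figure. Seaborn and Plotly offer comparable plots through their own interfaces.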
Learning Objectives
This course is an introduction to data wrangling and data visualization, and it applies three main frameworks for data visualization in Python.
Material Includes
- A laptop, stable internet access, Zoom installed for teleconferencing, and Google Colaboratory, Anaconda Navigator, or any other Python programming environment on your computer.
Requirements
- The user must have access to Google Colaboratory on their computer and a reliable internet connection.
Target Audience
- Professionals who would like to learn how to use the Python programming language to explore any dataset.
Curriculum
2h
Load external packages into Google Colaboratory
Load a dataset into Google Colaboratory
Identify the different data types in a dataset
Create basic summary statistics of continuous variables from data
Generate histograms, QQ-plots and empirical cumulative distribution functions from data
Generate bar graphs from data
Generate box plots from data
Generate pie charts from data
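Of the curriculum items above, the empirical cumulative distribution function and the QQ-plot are the least familiar to most newcomers, so here is a rough sketch of the numbers behind them. The sample values are hypothetical, and the plotting positions (i - 0.5) / n are one common convention among several, chosen here as an assumption.

```python
from statistics import NormalDist

import numpy as np

# Hypothetical sample of a continuous variable.
values = np.array([2.1, 3.4, 1.9, 4.8, 3.0])
n = len(values)

# Empirical cumulative distribution: for each sorted value,
# the fraction of observations less than or equal to it.
x = np.sort(values)
ecdf = np.arange(1, n + 1) / n

# QQ-plot points: sorted sample values paired with the standard
# normal quantiles at the plotting positions (i - 0.5) / n.
probs = (np.arange(1, n + 1) - 0.5) / n
theoretical = np.array([NormalDist().inv_cdf(p) for p in probs])
```

Plotting x against ecdf as a step plot gives the empirical cumulative distribution, and a scatter plot of theoretical against x gives the QQ-plot; points close to a straight line suggest the sample is roughly normal.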
Your Instructors
Prof. Kimitei
Professor of Mathematics
- Over 16 years as a university professor.
- Bachelor of Science in Math & Computer Science.
- MSc in Applied Math.
- PhD Candidate in Data Science & Analytics.