Duration:
4 days
Next Date:
Contact us
Location:
Lisbon and Porto, Online
Description
This course focuses on the extensive features of Pandas, the workhorse library for data analysis in Python, and its visualisation counterpart, Matplotlib. It covers reading, preparing and manipulating tabular data from various sources and in various common formats, including most common wrangling and manipulation processes. Time series data processing and practical linear regression are also covered. The programming environment is JupyterLab on the Anaconda platform, one of the most popular data science platforms.
*List price (PVP) per participant. Delivery of the course on the dates shown is subject to a minimum number of enrolments.
Target Audience
This course is designed for anyone with Python programming experience who wants a solid foundation in Python’s data analysis libraries. It is a must for aspiring data analysts and data scientists. Existing data analysts who want a systematic introduction to Python’s data analysis tools will also find the course very useful.
Area: Software & Development
Programme:
Module 1: INTRODUCTION TO DATAFRAMES
- What is a DataFrame?
- Loading DataFrames
- Accessing contents
- Useful functions
- Adding and dropping columns and rows
- Filtering and assigning data
- Missing values and duplicates
- Arithmetic basics
- Applymap and apply
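To give a flavour of the Module 1 topics, here is a minimal sketch that creates a DataFrame, removes duplicates, fills a missing value, adds a column with `apply`, and filters rows. The data, the column names and the 10% uplift are hypothetical and chosen only for illustration; they are not course material.

```python
import numpy as np
import pandas as pd

# Hypothetical data, invented for illustration only.
df = pd.DataFrame(
    {
        "city": ["Lisboa", "Porto", "Faro", "Faro"],
        "sales": [120.0, np.nan, 80.0, 80.0],
    }
)

# Missing values and duplicates: drop the repeated Faro row.
deduped = df.drop_duplicates().reset_index(drop=True)

# Fill the missing sales figure with the mean of the known values.
cleaned = deduped.assign(sales=deduped["sales"].fillna(deduped["sales"].mean()))

# Adding a column with apply (the 10% uplift is an arbitrary example).
cleaned["sales_uplift"] = cleaned["sales"].apply(lambda x: round(x * 1.1, 2))

# Filtering: keep rows with sales of at least 100.
large = cleaned[cleaned["sales"] >= 100]
print(large)
```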
Module 2: COMBINING DATAFRAMES
- Concatenate
- Merge
- Keys to merge on and suffixes for duplicate columns
- Merge methods
- Append
- Join
- Combine_first: For missing values
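The Module 2 topics can be sketched as follows: a merge on a key with suffixes for clashing column names, two merge methods (`inner` and `outer`), a row-wise `concat`, and `combine_first` for patching missing values. All frames and column names here are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical frames that share a key column and a clashing "value" column.
left = pd.DataFrame({"key": ["a", "b", "c"], "value": [1, 2, 3]})
right = pd.DataFrame({"key": ["b", "c", "d"], "value": [20, 30, 40]})

# Inner merge keeps only keys present in both frames;
# the clashing "value" columns receive the given suffixes.
inner = left.merge(right, on="key", how="inner", suffixes=("_left", "_right"))

# Outer merge keeps keys from either frame and fills the gaps with NaN.
outer = left.merge(right, on="key", how="outer", suffixes=("_left", "_right"))

# concat stacks the frames row-wise; ignore_index renumbers the rows.
stacked = pd.concat([left, right], ignore_index=True)

# combine_first fills missing values in one frame from another, aligned on index.
primary = pd.DataFrame({"x": [1.0, np.nan, 3.0]})
backup = pd.DataFrame({"x": [10.0, 20.0, 30.0]})
patched = primary.combine_first(backup)
print(patched)
```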
Module 3: RESHAPING DATAFRAMES
- Unstacking and Stacking
- Pivoting
- Melting
- Concatenating files from disk
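As a rough illustration of the reshaping operations in Module 3, the sketch below melts a wide table into long form, pivots it back, and then stacks and unstacks it. The table contents are hypothetical.

```python
import pandas as pd

# Hypothetical wide table: one row per city, one column per year.
wide = pd.DataFrame(
    {"city": ["Lisboa", "Porto"], "2022": [10, 20], "2023": [11, 21]}
)

# Melting: wide to long, one row per (city, year) pair.
long_df = wide.melt(id_vars="city", var_name="year", value_name="sales")

# Pivoting: long back to wide, with cities as the index and years as columns.
back = long_df.pivot(index="city", columns="year", values="sales")

# Stacking moves the column labels into an inner index level;
# unstacking moves that level back out into columns.
stacked = back.stack()
unstacked = stacked.unstack()
print(long_df)
```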
Module 4: GROUPBY AND AGGREGATION: SPLIT-APPLY-COMBINE
- Basic GroupBy
- Hierarchical GroupBy
- Group by function of Index
- Aggregate by mapping on Index and Columns
- Aggregate by user-defined functions
- Aggregate using multiple functions
- Aggregate using separate function for each column
- Transform
- Apply function
- Pivoting with Aggregation
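The split-apply-combine pattern in Module 4 might look like the sketch below, which uses a separate aggregation for each column, a `transform` that keeps the original row alignment, and a pivot table with aggregation. The data and column names are hypothetical.

```python
import pandas as pd

# Hypothetical sales records.
df = pd.DataFrame(
    {
        "region": ["North", "North", "South", "South"],
        "product": ["A", "B", "A", "B"],
        "units": [10, 20, 30, 40],
        "price": [1.0, 2.0, 3.0, 4.0],
    }
)

# A separate aggregation function for each column (named aggregation).
summary = df.groupby("region").agg(
    total_units=("units", "sum"),
    mean_price=("price", "mean"),
)

# transform returns one value per original row, so it can be assigned back:
# here, each row's share of its region's total units.
df["share"] = df["units"] / df.groupby("region")["units"].transform("sum")

# Pivoting with aggregation: regions as rows, products as columns.
table = df.pivot_table(index="region", columns="product", values="units", aggfunc="sum")
print(summary)
```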
Module 5: PLOTTING WITH MATPLOTLIB
- Pie chart
- Bar chart
- Histogram
- Scatter plot
- Line plot
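For Module 5, a minimal sketch of a bar chart and a line plot drawn side by side with Matplotlib. The figures are hypothetical, and the non-interactive `Agg` backend is an assumption made so the snippet runs without a display; in JupyterLab the default backend can be used instead.

```python
import matplotlib

matplotlib.use("Agg")  # assumption: render off-screen, no display needed
import matplotlib.pyplot as plt

# Hypothetical values, for illustration only.
categories = ["A", "B", "C"]
counts = [5, 3, 8]
days = [1, 2, 3, 4]
temperatures = [14.0, 15.5, 13.0, 16.0]

fig, (ax_bar, ax_line) = plt.subplots(1, 2, figsize=(8, 3))

# Bar chart: one bar per category.
ax_bar.bar(categories, counts)
ax_bar.set_title("Bar chart")

# Line plot: values joined in order, with point markers.
ax_line.plot(days, temperatures, marker="o")
ax_line.set_title("Line plot")

fig.tight_layout()
```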
Module 6: TIME SERIES DATA
- Basic concepts: Datetime, Timestamp, Timedelta, Timezones
- Pandas to_datetime() function
- Date Range
- What is time series data
- Reading time series data
- Missing Dates
- Partial indexing, Slicing and Selecting
- Resampling
- Moving Window functions
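Several of the Module 6 topics fit into one short sketch: parsing dates with `pd.to_datetime`, exposing a missing date with `date_range` and `reindex`, slicing by date strings, resampling, and a moving-window mean. The series values are hypothetical.

```python
import pandas as pd

# Hypothetical daily series with 2024-01-03 missing.
dates = pd.to_datetime(["2024-01-01", "2024-01-02", "2024-01-04", "2024-01-05"])
ts = pd.Series([1.0, 2.0, 4.0, 5.0], index=dates)

# Missing dates: reindex on a complete daily range, then interpolate linearly.
full = ts.reindex(pd.date_range("2024-01-01", "2024-01-05", freq="D"))
filled = full.interpolate()

# Slicing with date strings; both endpoints are included.
window = ts.loc["2024-01-02":"2024-01-04"]

# Resampling: sum the filled series in two-day bins.
two_day = filled.resample("2D").sum()

# Moving window: mean over the current and previous observation.
rolling_mean = filled.rolling(window=2).mean()
print(two_day)
```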
Module 7: LINEAR REGRESSION
- What is linear regression?
- Simple Linear regression
- Multiple Regression
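The listing does not say which library the course uses for regression. As one possible sketch, assuming plain NumPy, simple and multiple linear regression can both be fitted by least squares. The data are synthetic and noise-free, so the fitted coefficients recover the values used to generate them.

```python
import numpy as np

# Simple linear regression on synthetic data generated from y = 2x + 1.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 2.0 * x + 1.0
slope, intercept = np.polyfit(x, y, deg=1)

# Multiple regression on synthetic data generated from y = 1 + 2*x1 + 3*x2.
x1 = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
x2 = np.array([1.0, 0.0, 2.0, 1.0, 3.0])
y_multi = 1.0 + 2.0 * x1 + 3.0 * x2

# Design matrix with a leading column of ones for the intercept.
X = np.column_stack([np.ones_like(x1), x1, x2])
coefficients, *_ = np.linalg.lstsq(X, y_multi, rcond=None)
print(slope, intercept, coefficients)
```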
Prerequisites:
Delegates are expected to have Python programming experience. They should be able to use Python containers (lists, tuples, dictionaries and sets) effectively, construct loops and conditional statements, write functions, and create and use classes and objects.