Data is the lifeblood of any business, but it only brings value to the organization if you know how to make sense of it. This training on data science with Python will equip you to build automated methods for sorting and analyzing data in order to extract useful information. You'll be able to get the most out of your data and communicate your findings to influence decision-making. You will also develop your skills in computer science, mathematics and management.
Objectives
At the end of this training, participants will be able to extract value from their data with Python.
Is it for you?
Any professional (manager or not) whose role involves data analysis, financial or otherwise.
Prerequisite
None
Your benefits
Content
The more data you have, the more reliable information you can derive. This information can be used to anticipate asset or user/customer behavior and gain a competitive advantage.
Segment 1: What are Python, Jupyter Notebook and Anaconda
- What is Jupyter Notebook
- What are the basics of programming with Python:
- What are the variables in data science
- What data processing can be done: Indexing, extraction, replacement, modification, addition, conversion, cleaning, membership test, sorting, structures (sets, dictionaries, etc.), mathematical operators, comparison operators, logical operators, etc.
- How to use the debugger
- What are the rules for using if, if-else and if-elif conditions for flow control
- When to use while and for loops
- How to create your own functions (def syntax, inputs or parameters, function body and outputs: return)
- When and how to use lambda functions
- How to load or install libraries and modules or python packages
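As an illustration of the basics listed above, here is a minimal sketch that combines a dictionary, an if / elif / else condition, for and while loops, a user-defined function and a lambda. The sales figures, the `classify` function and the thresholds are hypothetical and chosen only for the example.

```python
# Hypothetical sample data: monthly sales figures, with one missing value.
sales = {"Jan": 1200.0, "Feb": 950.5, "Mar": None, "Apr": 1430.25}


def classify(amount, target=1000.0):
    """Return a label for one amount using if / elif / else."""
    if amount is None:
        return "missing"
    elif amount >= target:
        return "on target"
    else:
        return "below target"


# A for loop over the dictionary builds a new dictionary of labels.
labels = {}
for month, amount in sales.items():
    labels[month] = classify(amount)

# Filtering out missing values, then sorting with a lambda as the sort key.
known = {m: a for m, a in sales.items() if a is not None}
ranked = sorted(known.items(), key=lambda item: item[1], reverse=True)

# A while loop: add up the ranked amounts until a threshold is reached.
total, count = 0.0, 0
while count < len(ranked) and total < 2000.0:
    total += ranked[count][1]
    count += 1

print(labels)
print(ranked)
print("Mar" in known, total, count)  # membership test, running total, items used
```

The same steps can be run cell by cell in a Jupyter Notebook to inspect each intermediate variable.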
Segment 2: What libraries are essential for data science
ETL (extract/transform/load) process
How to conduct a process that will extract, transform and load data, from a raw data source, for business needs.
Taking advantage of the Pandas library (Panel Data or Python Data Analysis):
- How to extract data from various sources (Excel, CSV, HTML, JSON, etc.) and manipulate it (clean, filter and transform)
- How to identify, remove and replace missing data
- How to deal with duplicate data
- How to manage data aggregations (groupby)
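The Pandas steps above might look like the following sketch. The CSV content, column names and the choice to replace missing units with 0 are assumptions made for the example; a real project would read from an actual file with `pd.read_csv`, `pd.read_excel` or `pd.read_json` and pick a replacement rule suited to the data.

```python
import io

import pandas as pd

# Extract: a hypothetical raw CSV held in memory instead of a file on disk.
raw = io.StringIO(
    "region,product,units,price\n"
    "East,A,10,2.5\n"
    "East,A,10,2.5\n"  # exact duplicate of the previous row
    "West,B,,4.0\n"    # missing value in the units column
    "West,A,7,2.5\n"
    "East,B,3,4.0\n"
)
df = pd.read_csv(raw)

# Identify missing data before changing anything.
n_missing = int(df["units"].isna().sum())

# Transform: drop duplicates, replace missing units (assumed to mean 0 here),
# and derive a revenue column.
clean = (
    df.drop_duplicates()
    .fillna({"units": 0})
    .assign(revenue=lambda d: d["units"] * d["price"])
)

# Aggregate with groupby: total revenue per region.
summary = clean.groupby("region")["revenue"].sum()

# Load: write the result to a destination (an in-memory buffer in this sketch).
out = io.StringIO()
summary.to_csv(out)

print(n_missing)
print(summary)
```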
Using the NumPy library to create or generate data (simulations). Introduction to Monte Carlo simulation.
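A Monte Carlo simulation with NumPy can be sketched as follows. The model is an assumption made for illustration: monthly revenue is drawn from a normal distribution with a hypothetical mean of 100 and standard deviation of 20, and the simulation estimates the distribution of the annual total.

```python
import numpy as np

# A fixed seed makes the simulated draws reproducible.
rng = np.random.default_rng(seed=42)

n_simulations = 100_000

# Each row is one simulated year of 12 monthly revenues (hypothetical model).
monthly = rng.normal(loc=100.0, scale=20.0, size=(n_simulations, 12))
annual = monthly.sum(axis=1)

# Summaries estimated from the simulated years.
mean_annual = annual.mean()
p05 = np.quantile(annual, 0.05)               # pessimistic scenario (5th percentile)
prob_below_1100 = (annual < 1100.0).mean()    # share of years under 1100

print(round(mean_annual, 1), round(p05, 1), round(prob_below_1100, 3))
```

With this model the estimated mean should land close to 1200, since it is the sum of twelve draws centered on 100; increasing `n_simulations` tightens the estimates.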
Modeling
How to model data to conceptualize the relationships between different types of information, with the Pandas library:
- How to combine data tables (add and merge tables)
- How to transform data and how to create data tables
- How to optimize and forecast data with the statsmodels.api and scipy.stats libraries.
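Combining tables with Pandas could look like the sketch below, which covers only the add (concatenate) and merge steps. The tables, column names and the "unknown" label for unmatched customers are hypothetical.

```python
import pandas as pd

# Hypothetical order tables for two quarters, plus a customer reference table.
orders_q1 = pd.DataFrame(
    {"order_id": [1, 2], "customer_id": [10, 11], "amount": [250.0, 90.0]}
)
orders_q2 = pd.DataFrame(
    {"order_id": [3, 4], "customer_id": [10, 12], "amount": [120.0, 300.0]}
)
customers = pd.DataFrame(
    {"customer_id": [10, 11], "segment": ["retail", "wholesale"]}
)

# Add tables: stack the two quarters on top of each other.
orders = pd.concat([orders_q1, orders_q2], ignore_index=True)

# Merge tables: a left join keeps every order, even without a matching customer.
merged = orders.merge(customers, on="customer_id", how="left")

# Customer 12 has no match, so its segment is missing; label it explicitly.
merged["segment"] = merged["segment"].fillna("unknown")

# Create a new summary table from the merged data.
by_segment = merged.groupby("segment")["amount"].sum()
print(by_segment)
```

A left join is used here so that unmatched orders remain visible; an inner join (`how="inner"`) would silently drop them.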
Visualization
Once the data has been extracted and modeled, the remaining step is to visualize it in graphical form (diagram, graph, map, animation, etc.), which is easier to interpret and use.
How to use the Matplotlib and Seaborn libraries to:
- Visualize, combine and customize data: line graphs, scatterplots, boxplots, heat maps, etc.
- Save one or more graphs (pdf, jpeg, etc.)
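A minimal Matplotlib sketch of these steps follows; it draws a line graph and a boxplot side by side and saves the figure in two formats. The data is randomly generated, the file names are hypothetical, and Seaborn is left out to keep the example short. PNG is used instead of JPEG here because JPEG output needs the optional Pillow package.

```python
import os
import tempfile

import matplotlib

matplotlib.use("Agg")  # non-interactive backend, so the script runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical data: 12 months of sales with an upward trend plus noise.
rng = np.random.default_rng(0)
months = np.arange(1, 13)
sales = 100 + 5 * months + rng.normal(0, 8, size=12)

# Combine two graphs in one figure.
fig, (ax_line, ax_box) = plt.subplots(1, 2, figsize=(9, 3.5))

# Line graph with customized marker, labels and legend.
ax_line.plot(months, sales, marker="o", label="monthly sales")
ax_line.set_xlabel("Month")
ax_line.set_ylabel("Sales")
ax_line.set_title("Sales over the year")
ax_line.legend()

# Boxplot comparing the first and second half of the year.
ax_box.boxplot([sales[:6], sales[6:]])
ax_box.set_xticklabels(["H1", "H2"])
ax_box.set_title("Distribution by half-year")

fig.tight_layout()

# Save the same figure in two formats, inferred from the file extension.
out_dir = tempfile.mkdtemp()
pdf_path = os.path.join(out_dir, "sales.pdf")
png_path = os.path.join(out_dir, "sales.png")
fig.savefig(pdf_path)
fig.savefig(png_path, dpi=150)
plt.close(fig)

print(pdf_path, png_path)
```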
💡 Useful information
Our training sessions are offered in Montreal or Quebec City, in person or in virtual format. Dates and locations are provided when you select your session below. If you have any questions regarding registration, schedules, the language of instruction, or cancellation policies, please consult our FAQ.
Trainers
Private or personalized training
Do you have several employees interested in the same training course? Whether in person at your offices or remotely in virtual mode, we offer private training courses tailored to your team's needs. Group rates are available. Contact us for more details or request a quote online.