SRE Practitioner – System Reliability and Service Assurance

Build SRE skills: SLOs, observability, incident management, and chaos engineering for reliable systems.

This intensive three-day workshop covers the practical application of SRE principles to the scalability and reliability of large-scale services, with a focus on modern IT leadership and organizational change. It is aimed at practitioners and managers seeking to improve the reliability and resilience of platforms and services.

Is it for you?

Personnel involved in improving the capacity, reliability and resilience of platforms and services.

Prerequisites

• SRE Foundation certification
• Three to six months of prior IT experience

What You'll Walk Away With

✓ Define and leverage SLOs to drive reliability and user satisfaction
✓ Design secure, reliable, and scalable systems
✓ Implement full-stack observability to monitor services effectively
✓ Manage incidents and structure efficient operational response
✓ Apply chaos engineering to test and strengthen system resilience

Training content

Day 1:

SRE anti-patterns
Service level objectives (SLOs), the proxy for customer happiness
Building secure, reliable systems
Non-abstract capacity planning for large-scale design

Day 2:

Full-stack observability
Using platform engineering and AIOps
SRE management and incident response
Gremlin instrumentation

Day 3:

Chaos engineering
SRE is a form of DevOps
Review and exam preparation

📌 Practical information

Our training sessions are offered in Montreal or Quebec City, in person or in a virtual classroom. Dates and locations are specified when you select your session below. If you have any questions, check out our FAQ.