AI-Assisted SRE / AIOps Lead Engineer - peterborough, Peterborough

Description
Job Title: AI-Assisted SRE / AIOps Lead Engineer
Location: Remote
Employment Type: Contract

Role Overview

We are seeking a highly skilled and hands-on AI-Assisted SRE / AIOps Lead Engineer to lead the operationalization and scaling of an SRE agent-driven operations model. This role combines Site Reliability Engineering, automation, production operations, and AI-assisted workflow enablement to modernize operational practices and improve system reliability.

This is not a traditional support or coordination role. The ideal candidate will be a technical builder and operator who can independently assess risk, validate AI-driven recommendations, and apply sound operational judgment in high-impact production environments. You should be comfortable leading a small team while actively contributing at the technical level.

Key Responsibilities

- Lead the adoption, onboarding, and operationalization of SRE agent-driven workflows across reliability and support functions
- Translate existing scripts, runbooks, SOPs, and operational procedures into scalable, agent-compatible workflows
- Evaluate and determine which operational activities should remain manual, become semi-automated, or be fully automated
- Validate AI-generated recommendations, remediation actions, and workflow outputs before production implementation
- Support production releases, release validation, smoke testing, and post-deployment system health checks
- Drive troubleshooting efforts during production incidents and ensure timely resolution with thorough root cause analysis
- Improve alert management, event correlation, and incident response effectiveness
- Partner with engineering, platform, and operations teams to onboard new workflows and drive process improvements
- Develop and maintain operational documentation, standards, and reusable runbooks
- Mentor junior engineers and provide technical guidance on workflow design, operational execution, and validation practices
- Continuously identify opportunities to modernize legacy operational processes and improve efficiency

Required Experience

- 5–10 years of hands-on experience in Site Reliability Engineering, cloud operations, production engineering, platform operations, or IT operations
- Solid experience supporting and troubleshooting production environments
- Demonstrated experience with automation, incident management, and operational process improvement
- Experience working with release support processes and production validation activities
- Exposure to AI-assisted operations, AIOps platforms, or automation-led support models is highly preferred
- Experience leading initiatives while remaining deeply involved in hands-on execution

Required Technical Skills

- Strong scripting expertise in:
  - Python
  - PowerShell
  - Shell/Bash
- Hands-on experience with:
  - Monitoring and observability platforms
  - Logging systems and dashboards
  - Alerting and incident workflows
  - Production support and release validation processes
  - Cloud platforms, preferably Azure
  - ITSM/ticketing platforms such as ServiceNow, Jira, or equivalent
  - APIs, integrations, and automation pipelines
- Working knowledge or exposure to:
  - Kubernetes / AKS
  - AI productivity and operational tools such as ChatGPT and Copilot
  - Modern automation and orchestration practices

Critical Soft Skills

- Strong analytical and structured problem-solving skills
- Ability to operate effectively in ambiguous environments with incomplete documentation
- Strong ownership mindset with the ability to independently drive outcomes
- Excellent judgment during high-pressure production incidents
- Ability to challenge assumptions and validate AI-assisted recommendations rather than relying on them blindly
- Creative approach toward transforming and modernizing legacy operational workflows
- Strong communication and collaboration skills across technical and non-technical teams

Ideal Candidate Profile

- Hands-on builder/operator rather than a pure coordinator or process manager
- Comfortable balancing automation with operational governance and control
- Able to independently assess:
  - Risk impact
  - Blast radius
  - Rollback strategies
  - Safe execution practices
- Capable of leading a small team while continuing to contribute technically on a day-to-day basis
- Practical mindset with a strong focus on operational excellence and reliability engineering

This role is ideal for someone who enjoys combining AI-assisted operations, automation, and modern SRE practices to build scalable and reliable operational systems.

Apply on Kit Job: kitjob.ca/job/2oxonl

AI-Assisted SRE / AIOps Lead Engineer - peterborough has been posted in the Peterborough Engineering category on Locanto.
