Principal Engineer, Operational Excellence – Resilience

Remote Full-time
Job Description:
• Facilitate coordination between stakeholders across IT, Product, Engineering, and business units, serving as the central point for technology resilience initiatives and ensuring alignment with business objectives
• Own and maintain enterprise-wide technology resilience standards, ensuring consistent implementation and reducing organizational drift from established frameworks across infrastructure, application, and product domains
• Drive comprehensive technical resilience architecture including infrastructure redundancy and fault tolerance, application resilience and graceful degradation strategies, and chaos engineering frameworks for continuous resilience validation
• Lead enterprise technical recovery strategy development and implementation, including backup and redundancy systems, recovery time/point objectives (RTO/RPO) for technical systems, and data recovery/restoration procedures
• Partner to define and implement resilience standards, including feature flagging, release, testing, multi-tenancy frameworks, and scalability frameworks to manage growth
• Provide technical oversight and aggregation of technology resilience risks across the enterprise, establishing and monitoring key performance indicators including system uptime
• Drive chaos engineering and resilience testing programs, establishing enterprise-wide practices for proactive resilience validation and continuous improvement
• Own shared resilience tooling strategy, evaluation, and implementation to support enterprise-wide capabilities including monitoring, testing, and recovery automation
• Build and maintain formal networks with key constituents across business units, engineering teams, and external partners
• Serve as senior technical advisor during major incident response, providing expertise on technical recovery strategies and coordinating cross-functional recovery efforts
• Drive innovation in resilience practices, identifying emerging technologies and methodologies to advance CrowdStrike's competitive resilience advantage
• Provide strategic guidance and expertise to junior team members and cross-functional partners on resilience engineering best practices

Requirements:
• 10+ years of direct experience in technology resilience, disaster recovery, site reliability engineering, or related technical disciplines, with demonstrated expertise in enterprise-scale cloud-native environments
• Deep understanding of infrastructure redundancy patterns, application resilience design, chaos engineering principles, and enterprise disaster recovery strategies across hybrid cloud architectures
• Proven experience with feature management systems, progressive deployment strategies, multi-tenant architecture resilience, and scalability engineering practices
• Proven ability to drive strategic initiatives across large technology organizations, with experience influencing senior stakeholders and leading complex, cross-functional resilience programs
• Experience establishing and monitoring resilience KPIs, including system uptime, MTTR, RTO/RPO objectives, and deployment success metrics
• Advanced certifications in disaster recovery, cloud architecture, or site reliability disciplines (e.g., DRCS, CISSP, AWS/Azure/GCP architecture certifications)
• Exceptional written and oral communication skills, including experience developing and delivering strategic briefings to executive leadership and technical teams
• Advanced analytical and conceptual thinking abilities, with a proven track record of solving complex, ambiguous resilience challenges with enterprise-wide impact
• Demonstrated ability to build formal networks and influence stakeholders across engineering, product, and business organizations
• Bachelor's degree in Computer Science, Information Systems, Engineering, Risk/Resilience, or equivalent practical experience

Benefits:
• Remote-friendly and flexible work culture
• Market leader in compensation and equity awards
• Comprehensive physical and mental wellness programs
• Competitive vacation and holidays for recharge
• Paid parental and adoption leave
• Professional development opportunities for all employees regardless of level or role
• Employee Networks, geographic neighborhood groups, and volunteer opportunities to build connections
• Vibrant office culture with world-class amenities
• Great Place to Work Certified™ across the globe
