Cloud Engineer – Clustering & Pacemaker Expertise

Are you a seasoned Cloud Engineer with a passion for architecting and implementing robust, high-availability solutions? We're seeking a Lead Cloud Engineer with deep-seated expertise in clustering technologies, particularly Pacemaker, and a comprehensive understanding of hyperscaler platforms (Azure, AWS, GCP, IBM Cloud). In this pivotal role, you'll leverage your extensive experience with RHEL and SLES to design and deploy mission-critical, fault-tolerant infrastructures. You will be instrumental in ensuring the stability and availability of cloud-hosted applications, especially for complex SAP workloads, leading proof of concept (PoC) initiatives and collaborating with development teams to shape technical requirements and documentation.

Job offer information
  • Clustering & High Availability Leadership:
    • Architect and implement advanced Pacemaker clustering solutions across diverse operating systems (RHEL, SLES) and platforms, guaranteeing unparalleled high availability and fault tolerance for critical applications, databases, and services.
    • Drive clustering architecture decisions, optimizing for scalability, resilience, and the stringent reliability demands of business-critical workloads.
    • Expertly troubleshoot and resolve intricate issues within Pacemaker clusters, providing authoritative guidance on configuration, rigorous failover testing strategies, and sophisticated cluster architecture.
  • Hyperscaler Architectural Expertise:
    • Strategically utilize your profound knowledge of hyperscaler platforms (Azure, AWS, GCP, IBM Cloud) to architect, deploy, and maintain cutting-edge cloud high availability solutions.
    • Engineer and manage resilient cloud-based high availability (HA) solutions across multi-cloud environments, seamlessly integrating cloud-native services with Pacemaker clustering for mission-critical applications demanding continuous uptime.
  • Strategic Proof of Concepts & Comprehensive Architecture Documentation:
    • Spearhead and provide expert support for proof of concept (PoC) engagements, evaluating innovative clustering architectures and solutions, meticulously documenting setups, configurations, and key learnings for future implementations.
    • Develop comprehensive technical documentation encompassing intricate clustering configurations, detailed architecture diagrams, and streamlined deployment processes, ensuring seamless knowledge transfer, robust reproducibility, and laying the groundwork for advanced automation.
  • SAP Solution Basis & Integration Mastery:
    • Apply your deep expertise in SAP Solution Basis, particularly SAP S/4HANA, NetWeaver, and HANA databases, to architect and optimize sophisticated clustering solutions for demanding SAP workloads in the cloud.
    • Ensure the flawless integration of SAP systems with Pacemaker clustering to consistently exceed stringent high availability requirements.
  • Advanced Database Knowledge (Optional):
    • Provide expert support for DB2 and other complex database solutions within the context of clustering and high availability, ensuring peak database performance and unwavering uptime.
    • Collaborate strategically with database teams to implement highly available clustered databases that are tightly integrated with the underlying cloud infrastructure and sophisticated clustering technologies.
  • Collaborative Leadership & Strategic Requirement Definition:
    • Partner closely with development and business leadership to define critical infrastructure requirements for complex clustered solutions.
    • Lead and actively participate in in-depth requirement analysis sessions with key internal and external stakeholders, ensuring that clustering and HA solutions precisely meet the requirements of SAP ECS.
  • Executive Communication & Strategic Reporting:
    • Communicate fluently in business English, articulating complex technical concepts with clarity and authority to both technical peers and non-technical executive stakeholders.
    • Prepare and deliver impactful technical presentations, comprehensive status reports, and strategic documentation related to advanced clustering solutions and cloud architectures.
Other information
  • Start of the engagement: ASAP
  • Duration of the engagement: until 31.12.2025 with a possibility to extend (quarterly extension)
  • Location of the engagement: remote, no travel
Personality assumptions and skills
Education:
  • Bachelor’s degree in Computer Science, Information Technology, or a related field. Advanced certifications in leading cloud platforms (Azure, AWS, GCP) and expert-level clustering technologies (Pacemaker) are highly valued.
Experience:
  • Extensive experience (ideally 7+ years) in architecting and implementing complex clustering and high availability solutions, with a significant focus on Pacemaker and related mission-critical technologies.
  • Proven track record of hands-on experience in enterprise-grade cloud environments (AWS, Azure, GCP, IBM Cloud), with a strong emphasis on designing and deploying resilient HA architectures and leveraging advanced cloud-native services.
  • Deep and broad experience with Linux-based operating systems (RHEL, SLES), including expert-level knowledge of system administration, performance tuning, and security hardening in complex clustered environments.

Technical Skills:
  • Expert-level knowledge and extensive hands-on experience with Pacemaker clustering, including advanced configuration of resource agents, sophisticated fencing mechanisms, and intricate quorum management strategies.
  • Comprehensive understanding of enterprise-grade cloud platforms (AWS, Azure, GCP, IBM Cloud) with a proven ability to architect and implement highly available and scalable solutions.
  • Significant experience in deploying and managing critical SAP systems (S/4HANA, NetWeaver, HANA) within high-availability clustered environments.
  • Familiarity with enterprise-level databases like HANA and DB2 (optional) in complex clustered and HA setups.
  • A strong plus (optional): experience with advanced automation tools and infrastructure-as-code (e.g., Ansible, Terraform).

Language Skills: Fluent English

Preferred Skills and Qualifications:

  • Advanced, expert-level certifications in Pacemaker, Red Hat, SLES, and SAP systems.
  • Extensive experience in architecting and implementing highly resilient multi-cloud high availability architectures.