HPC Cluster Engineer / Linux Specialist

☞ to.scale

Views: 100

Updated: 11-11-2025

Location: Zaventem, Flemish Brabant

Category: High Technology

Industry: IT Services and IT Consulting

Position: Mid-Senior level

Job type: Full-time

Job Description

Role Overview

We are hiring a Linux infrastructure specialist to join a core High-Performance Computing (HPC) team. You will support and evolve simulation and R&D compute platforms used by internal Engineering teams.

This is a technically broad role where you’ll touch on Linux infrastructure, cluster orchestration, automation, monitoring, and some hands-on hardware support. You’ll join a small, senior team with high autonomy and end-to-end responsibility over the HPC platform.

Key Responsibilities

Administer and optimize Linux-based HPC clusters (Ubuntu, CentOS, RHEL-family)
Manage workload scheduling with Slurm
Support containerized workloads using Docker and Singularity
Implement and manage infrastructure-as-code via Ansible and Terraform
Support GPU-accelerated workloads (NVIDIA, CUDA)
Monitor system health and performance using Grafana, Prometheus, and related tools
Troubleshoot hardware and perform physical support tasks (rack/stack, diagnostics, cabling)
Collaborate with internal researchers and engineers to support and improve workload performance
Contribute to documentation and help mature internal platform standards and practices

Requirements

Operating Systems: Ubuntu, CentOS, RHEL derivatives (Rocky, Alma)
Schedulers and portals: Slurm (primary), Open OnDemand (optional)
Containers: Docker, Singularity
Automation: Ansible, Terraform, Bash, Python
Monitoring: Grafana, Prometheus, custom metrics
HPC Filesystems: Lustre (required), GPFS, Ceph (optional)
Hardware: Server maintenance, rack/stack, troubleshooting
Collaboration: Git, Jira, CI/CD pipelines

Ideal Candidate Profile

5+ years of Linux system administration experience, including in performance-sensitive environments
Experience supporting or operating HPC clusters (Slurm, Lustre)
Scripting ability in Bash and Python
Hands-on automation experience with Ansible and Terraform (or equivalents)
Familiarity with containerization and job isolation (Docker/Singularity)
Comfortable with infrastructure observability tools and performance tuning
Proactive, autonomous, and able to collaborate across teams and functions
Fluent in English (spoken and written)

Nice to Have

Experience with Bright Cluster Manager or other cluster deployment tools
Exposure to distributed file systems (e.g., Ceph)
Familiarity with Open OnDemand or other HPC frontend tools
Understanding of GPU scheduling (CUDA/NVIDIA)
Cloud exposure (AWS, Azure, or GCP)

Benefits

We are primarily looking for a full-time employee (FTE), but exceptions can be made for the right candidate who would rather work as a freelancer (contractor).

Here is a list of benefits:

Meal vouchers
Pension scheme (2%)
Hospitalization Insurance
Remote work allowance (€60/month)

Deadline: 26-12-2025
