Site Reliability Engineer
Ready to get busy with agency and campaign partnerships at ACTUM Digital?
Don’t miss the opportunity to join our dynamic team!
Experience modernizing legacy applications or migrating monoliths to microservices
About the job
We are ACTUM Digital, a software house delivering digital solutions for one of the world’s largest auction houses. We are expanding our reliability capabilities and seeking a dedicated Site Reliability Engineer to strengthen the resiliency, performance, and operational predictability of mission-critical applications.
This is a hybrid role (remote + on-site days) and requires strong engineering judgement, operational discipline, and the ability to work across multiple teams and stakeholders.
Critical to this role are a well-developed sense of urgency and the ability to communicate priorities clearly to interested stakeholders. Work outside regular business hours is not routine, but it is expected when pressing issues endanger business objectives. The person in this role is expected to step in without hesitation and without being asked.
What will be your key responsibilities:
Monitoring & Observability
- Design, standardize, and maintain monitoring, alerting, and dashboards that reflect real user impact.
Incident Investigation & Postmortems
- Lead blameless postmortems and convert findings into prioritized reliability improvements.
Reliability Engineering
- Implement proactive reliability measures such as rate limiting, graceful degradation, health checks, retries, and chaos testing.
- Create automation for runbooks, remediation steps, and routine operational tasks.
Capacity & Performance
- Forecast capacity needs, validate load limits, and guide scaling decisions to prevent saturation and outages.
What experience should you have:
Cloud Expertise
- Hands-on experience with Azure cloud services (compute, networking, storage, App Services, scaling, monitoring).
- Familiarity with designing and supporting scalable, distributed systems.
CDN Expertise
- Experience with major CDN providers (e.g., Akamai, Cloudflare).
- Strong understanding of bot management, caching strategies, and CDN security controls.
Web Architecture Experience
- Exposure to headless architectures, content delivery pipelines, and API-driven applications.
- Experience with micro front-ends.
Reliability & Operations
- Solid experience with monitoring/observability tools (Azure Monitor, Grafana dashboards).
- Strong troubleshooting skills across the end-to-end request path.
- Knowledge of resilience patterns (graceful degradation, retries, rate limiting, health checks).
Certifications (nice to have)
- Certifications in Azure, Akamai, or other relevant cloud/CDN tooling demonstrating proven expertise.
What's in it for you:
- Inspirational environment: Work on complex international projects using agile methodologies. An informal working environment with innovative colleagues
- Flexible work environment: Hybrid working to blend home working for focus and office working for collaboration and co-creation
- Vacation and time off: Guaranteed 5 weeks of vacation
- Education: Personal growth and challenging work with endless possibilities. Training, conference attendance, e-learning programs, mentoring
- Remuneration: A salary tailored to your qualities and experience
- Additional employee perks: Discounts with business partners, participation in team building, diversity food days, meetups, and knowledge snacks, and arrangement of a MultiSport card
- Seamless mobile communication: Discounted T-Mobile Family tariffs for family members
Are you interested?
Please drop us a line and let us see whether there is a role that fits your skills and talent!