Observability Architect
Capgemini
Troy, NY, US
Hybrid
2026-07-03
Announced salary
$118,560 - $187,200
Estimated net pay
$7,180 - $10,734
/month · 27% withheld
after tax & contributions · Single, no dependents
Job description
Troy, NY, United States (On\-site)
Contract (5 months 23 days)
Published 5 hours ago
incident management
Power BI
root cause analysis
observability
documentation skills
sql database
AWS cloud platform
production support
Senior Observability Architect responsible for platform modernization, Grafana Cloud migration, operational excellence, production support readiness, and enterprise observability strategy across hybrid cloud environments.
**Core Responsibilities:**
* Lead observability strategy, architecture, and operational excellence initiatives.
* Provide servant leadership to Operations and Production Support teams.
* Own end\-to\-end design across all layers (infrastructure to end user); define standards, SLOs, and the signal model.
* Drive continuous improvement programs focused on reliability, system health, and MTTR reduction.
* Ensure operational readiness and supportability of all platform changes before production deployment.
**Observability Platform Modernization:**
* Mature the enterprise observability platform and lead migration from New Relic to Grafana Cloud at scale.
* Establish SLO/SLI design, anomaly detection (Sift), and predictive alerting.
* Design observability standards for AWS and on\-premises hybrid environments.
* Standardize telemetry collection using Grafana Alloy and OpenTelemetry.
* Develop enterprise monitoring, logging, tracing, dashboarding, and alerting frameworks.
**Grafana \& OpenTelemetry Architecture:**
* Architect solutions leveraging Grafana Mimir, Loki, Tempo, IRM, and Grafana Cloud.
* Define instrumentation patterns and validated templates for infrastructure, applications, and distributed services.
* Design dashboards, alerts, service health views, SLIs, SLOs, and error\-budget monitoring.
* Ensure telemetry quality, scalability, and governance across the organization.
**Production Support \& Operational Readiness:**
* Own production support intake, triage, escalation, and stakeholder communications.
* Validate monitoring, alerting, loggi