Manager, Reliability Engineering

Johnson & Johnson
4 hours ago
Remote friendly (Raritan, NJ)
United States
Operations
Key Responsibilities:
- Own the reliability, performance, and operability of websites and the infrastructure serving AI features (inference endpoints, feature stores, model-serving pipelines).
- Design, implement, and maintain observability (metrics, logs, traces, RUM) and synthetic monitoring to meet target SLOs.
- Drive automation: CI/CD, progressive rollout patterns, self-healing ops, and toil reduction.
- Lead the transition from search engine optimization (SEO) to generative engine optimization (GEO) to improve user experience, visibility, and adoption.
- Harden production systems: capacity planning, incident response, runbooks, post-incident reviews, and remediation tracking.
- Instrument and operationalize AI features: deploy/monitor models, track performance drift, and add observability for model inputs/outputs and latency.
- Build automation for repetitive ops tasks (scaling, cache invalidation, log management, backups) and self-healing workflows.
- Maintain security-related reliability controls for web delivery (CDN, WAF, TLS, DDoS mitigations) in partnership with security teams.
- Mentor engineers and integrate reliability into the SDLC.

Required Qualifications:
- Bachelor’s degree (or equivalent).
- 6+ years in SRE, platform engineering, or DevOps focused on web/digital properties.
- Strong web architecture/delivery knowledge (HTTP, CDNs, caching, edge delivery, browsers, rendering).
- Full-stack web development experience (HTML/CSS/JavaScript; Python/PHP/MySQL/C#).
- Experience using generative AI (GenAI) tools to support GEO strategy.
- Observability experience (e.g., Prometheus, Grafana, ELK/OpenSearch, New Relic).
- CI/CD and infrastructure as code (e.g., Terraform, CloudFormation).
- Scripting/programming skills (Python, Go, or similar).
- Incident response and postmortem practice with measurable remediation.

Benefits (as stated): medical, dental, vision; life insurance; short/long-term disability; retirement/pension and 401(k); vacation, sick time, and other time-off programs listed in the posting.