Senior Cloud Engineer, Observability
- Be the hands-on SME for our observability toolchain (e.g., Datadog, CloudWatch, OpenSearch), including log pipelines, tracing/telemetry standards, and platform templates.
- Run office hours, produce exemplars, and pair with teams to implement “known-good” instrumentation and alerting.
- Triage and resolve observability-related platform requests (new service onboarding, log/metric gaps, noisy alerts, dashboard standards) with clear ownership and measurable outcomes.
- Establish and operationalize SLIs/SLOs for key platform components and enable teams to define service SLOs without reinventing the wheel.
- Maintain opinionated “golden paths” for:
- Logging (standard fields/tags, retention, routing, searchability)
- Metrics (naming conventions, cardinality guardrails, standard RED/USE views)
- Tracing (service maps, critical spans, propagation standards)
- Dashboards (starter dashboards by service type + curated views for platform reliability)
- Provide reusable templates for alerting patterns (latency, error-rate, saturation, dependency failures), tuned for actionable paging vs. noise.
- Reduce MTTR by improving detection, triage paths, runbooks, and “what changed” visibility.
- Drive reliability reviews focused on observability gaps: missing signals, unclear ownership, bad alerts, and uninstrumented failure modes.
- Partner with delivery teams to turn recurring incidents into durable fixes (instrumentation + alerting + automation + documentation).
- Embed observability checks into CI/CD and platform workflows (e.g., telemetry guardrails, dashboard/monitor templates, logging standards checks).
- Partner with Security/Compliance to ensure telemetry supports auditability and incident investigation without ad-hoc effort.
- Define and report platform observability KPIs: alert noise rate, % actionable alerts, MTTA/MTTR trends, onboarding time to “fully observable,” runbook coverage, incident recurrence.
- Run lightweight experiments to improve signal quality (threshold tuning, monitor redesign, dashboard UX), and ship improvements like a product owner.
- Create cost-aware telemetry standards (log volume controls, metric cardinality guidance, sampling strategies, retention tiers).
- Help teams optimize spend while improving reliability outcomes (“cheaper + better” logging/metrics patterns).
- Serve as a trusted partner to delivery units, Security, and Data, turning pain points into paved-road improvements.
- Mentor engineers and uplift organizational practices for incident response, reliability signals, and operational excellence.
- Bachelor’s in computer science/engineering or equivalent experience.
- 5+ years hands-on AWS experience operating production workloads.
- Deep practical experience with observability in production, including:
- Datadog and/or CloudWatch (dashboards, monitors/alerts, log search, correlation)
- Designing actionable alerts (noise reduction, ownership, runbook-first alerts)
- Defining/using SLIs/SLOs and reliability metrics to drive behavior
- Strong proficiency with Infrastructure as Code (Terraform; CloudFormation a plus).
- Strong programming for automation/tooling (Python, Go, or similar).
- Solid grasp of cloud architecture, networking, and security fundamentals.
- Experience productizing observability enablement (templates, golden paths, standards, onboarding workflows).
- CI/CD at scale (GitLab pipelines), including integrating reliability/telemetry guardrails into delivery workflows.
- Logging/telemetry platforms beyond CloudWatch/Datadog (e.g., ELK/OpenSearch) and experience managing scale concerns (volume, retention, cardinality).
- Container platforms (ECS/EKS) and common AWS data services (RDS/Aurora, S3/lake patterns, MSK/Kinesis).
- FinOps experience related to observability (tagging, allocation, optimizing telemetry cost).
- Relevant AWS certifications and excellent communication skills.
Bayer offers a wide variety of competitive compensation and benefits programs. If you meet the requirements of this unique opportunity and want to impact our mission, “Science for a better life,” we encourage you to apply now. Be part of something bigger. Be you. Be Bayer.
To all recruitment agencies: Bayer does not accept unsolicited third party resumes.
Bayer is an Equal Opportunity Employer/Disabled/Veterans.
Bayer is committed to providing access and reasonable accommodations in its application process for individuals with disabilities and encourages applicants with disabilities to request any needed accommodation(s) using the contact information below.
Bayer is an E-Verify Employer.
Location: United States : Residence Based : Residence Based || United States : Illinois : Chicago || United States : Missouri : Creve Coeur || United States : Washington : Seattle
Division: Crop Science
Reference Code: 859535
Contact Us Email: [email protected]