Software Engineer – Observability, Co-op
The company is a human-centered leader in family history that connects people with their past. It is seeking a Software Engineer – Observability, Co-op to help develop the observability tools and technologies used to monitor the site and resolve customer-impacting incidents.
Responsibilities
- Contribute to the development of AI-powered code transformation tools leveraging foundation models (LLMs) and agentic frameworks to migrate application instrumentation to OpenTelemetry standards across multiple languages (Java, Python, Node.js, .NET) and browser-based Real User Monitoring (RUM)
- Build intelligent parsing and analysis engines using AI agents and Model Context Protocol (MCP) tools to transform existing APM instrumentation code with proper error handling, validation, and comprehensive migration reports
- Design and implement LLM-driven dashboard migration tools that extract, transform, and recreate observability dashboards across platforms with functional equivalence validation
- Develop agentic alert migration automation using AI frameworks to export, transform, and migrate alerting policies, conditions, notification channels, and escalation workflows between observability platforms
- Create comprehensive validation and testing frameworks to compare metrics and traces, build migration dashboards, and define rollback procedures
- Build CLI tools, scripts, and self-service migration utilities, with detailed guides and best-practices documentation, to support application teams
- Contribute to observability operations, including log PII detection with near real-time streaming, log pipeline integration across multiple sources (including CloudWatch), cost optimization, and infrastructure monitoring (EKS, EC2)
- Support deployment pipeline integration with observability alerts and contribute to operational excellence through code reviews and documentation
- Partner with application teams, security and compliance teams, and platform engineering teams to understand their requirements and pain points and deliver impactful solutions
- Present work at team demos and showcase events while contributing to team documentation, runbooks, and best practices
Skills
- Master's degree in Computer Science, Software Engineering, or a related field
- Strong development experience in Python with demonstrated ability to write high-quality code
- AI-assisted development experience: hands-on experience using AI coding assistants such as Cline, Kiro, Claude Code, Copilot, or similar tools for software development
- Full-stack development exposure with JavaScript/TypeScript and modern front-end frameworks (React preferred; Vue or Angular also acceptable)
- Multi-language exposure: familiarity with at least two or three of Java, Python, Node.js, and .NET
- Solid understanding of software engineering fundamentals: data structures, algorithms, testing, and design patterns
- Experience with or strong interest in observability concepts: logs, metrics, traces, and distributed systems
- Experience with database design and SQL (MySQL, PostgreSQL)
- Strong analytical and problem-solving skills, with the ability to explore solutions independently before escalating
- Excellent communication and collaboration abilities, with a willingness to seek feedback early and often
- Exposure to Application Performance Monitoring (APM) tools or similar platforms
- Experience with AWS Cloud services (CloudWatch, EKS, EC2, Fargate, SQS, SNS) and containerization technologies (such as Kubernetes)
- Nice to have: Experience with AI agents and agentic workflows (LangGraph, AutoGen, CrewAI, Strands), Model Context Protocol (MCP), foundation models (GPT-4, Claude), or prompt engineering for code-generation tasks
- Nice to have: Knowledge of OpenTelemetry (OTEL) standards, Real User Monitoring (RUM), browser-based observability, and web application instrumentation
- Nice to have: React development skills with DOM manipulation, browser APIs, and web performance monitoring
- Nice to have: Basic understanding of code parsing, AST manipulation, Infrastructure as Code (Terraform, CloudFormation), or building developer tools and CLI applications
- Nice to have: Contributions to open-source projects, especially observability or AI agent frameworks