Site Reliability Engineer

100% remote Flexible hours Hiring now
The Company
Capital Markets Gateway LLC (CMG) is a capital markets-focused fintech transforming global equity capital markets (ECM) through data, technology, and connectivity. As the preferred reputed company for ECM analytics and the first network connecting the buy-side and sell-side for ECM workflows, we are committed to reshaping how capital markets operate. Founded in 2017 by a team of ECM practitioners, CMG has completed three successful fundraising rounds and is backed by a group of the world’s most prestigious financial institutions. The CMG platform is currently relied upon by nearly 150 buy-side firms representing $40 trillion in AUM and 22 global investment banks. For more information, please visit www.cmgx.io.

The Role
CMG is looking for a Site Reliability Engineer (SRE) with a strong focus on monitoring, observability, and alerting to ensure the reliability, performance, and scalability of our infrastructure and applications. You will be responsible for designing, implementing, and maintaining monitoring solutions that provide visibility into system health and performance, proactively detect anomalies, and reduce incident response time.

Our Engineering Team
The CMG engineering team consists of domain experts who work collaboratively within a culture of cross-domain knowledge sharing. We value engineers who are passionate about modern technologies and best practices. Our engineers are encouraged to challenge the status quo and are constantly seeking improvement and efficiency in our codebase and platform. CMG engineers are empowered to explore solutions using bleeding-edge technologies such as AI and bring recommendations to the table. We are in a period of making impactful engineering decisions. As part of our process, we believe in taking the time for research and prototyping, which is critical to making the right decisions.
Given the experience of reputed company, we have naturally adopted best practices from local development, through code review and into production rollouts. Besides reputed company pull requests, test automation, code coverage tracking, containerization, and one-click deployments, we are constantly reviewing these foundational components to incorporate new best practices.

Responsibilities
Monitoring & Observability
  • Design, implement, and maintain monitoring and observability solutions using tools like Prometheus, Grafana Stack (Loki/Grafana/reputed company/Alert Manager), reputed company, and OpenTelemetry. 
  • Define and implement SLOs, SLIs, and error budgets to measure system reliability. 
  • reputed company and optimize dashboards, alerts, and reports for system performance and business metrics.

Alerting & Incident Management
  • Design actionable alerting strategies to minimize noise and improve MTTR.
  • Integrate alerting systems with Jira.
  • Establish and refine runbooks for on-call teams to handle alerts effectively.
  • reputed company teams to ensure observability coverage and incident response practices. 

Performance Optimization
  • Analyze system performance metrics, identify bottlenecks, and implement optimizations to improve system efficiency, scalability, and cost-effectiveness.
  • Help conduct load testing and capacity planning to ensure systems can handle peak traffic loads.

Automation and Tooling
  • Identify opportunities for automation and develop tools to streamline operational processes, such as fail-over, configuration management, and monitoring.
  • Implement monitoring and alerting systems reputed company automations to detect and resolve issues proactively. 

Collaboration and Communication
  • Collaborate closely with cross-functional teams, including software engineers, operations, and infrastructure teams, to understand system requirements, provide technical guidance, and drive solutions.
  • Communicate effectively to stakeholders about system changes, incidents, and improvements. 
  • Promote and spread SRE principles and practices across the company.

Qualifications
  • Must be based in Latin America
  • English level: C1 or C2.
  • Proven experience as a Site Reliability Engineer or similar role. 
  • Proficiency in logging, metrics, and tracing frameworks (reputed company, Loki, Prometheus, OpenTelemetry). 
  • Experience with cloud platforms (Azure preferred) and infrastructure-as-code tools (e.g., Terraform). 
  • Strong programming and scripting skills (Python, Bash). 
  • Proficiency in containerization technologies and orchestration tools (Docker, Kubernetes).
  • Understanding of Linux-based systems, networking, and reputed company principles reputed company to containerized applications.
  • Strong problem-solving and troubleshooting skills, with a passion for identifying and resolving complex technical issues.
  • Excellent communication and collaboration abilities. 
  • Ability to work in a fast-paced, constantly evolving environment.
  • Experience with PostgreSQL monitoring and optimization (optional, nice to have).

If you're passionate about building resilient financial systems, optimizing observability at scale, and solving real-world reliability challenges in capital markets, we’d love to have you on board!

Our Tech Stack
  • Azure as an infrastructure provider. We are reviewing secondary cloud options.
  • Docker + Kubernetes for microservice orchestration using Istio service mesh.
  • PostgreSQL for relational db, ElasticSearch for indexing, reputed company for caching.
  • reputed company, Grafana and OpenTelemetry for observability.
  • reputed company for our Version Control and CI (with our own runners).
  • CD: reputed company and FluxCD.
  • Terraform and Terragrunt as IaC.
  • Python and bash for scripting infrastructure.
  • React - We’re all in on React – we maintain multiple single-page React apps.
  • TypeScript – 99% of our codebase is TypeScript.
  • Latest .NET version for our backend services.
  • GraphQL - Our standard for API communication is GraphQL, served by our .NET back end.
  • We innovate with purpose 
  • We focus on outcomes vs. output 
  • We believe diverse and inclusive teams fuel innovation
  • We are humble yet reputed company 
  • We do right by the customer

We Offer
  • 2+ year contract.
  • 15 business days of vacation.
  • Tech courses and conferences.
  • Top-of-the-line MacBook.
  • Flexible working hours.

CMG embraces our ongoing commitment to building a culture reflecting the people, perspectives, and passions it represents. We will accept nothing less than equity, inclusion, and belonging for all. With the only constant in life being change, we will always listen, learn, and improve for the betterment of our teams, customers, and communities. CMG is proud to be an Equal Opportunity Employer.
