[Remote] Principal Software Engineer, Data Engineering
Note: This is a remote position open to candidates in the USA. A reputed company is pioneering the category of sales productivity through innovative software solutions. They are seeking a Principal Data Engineer to define the technical vision for high-scale data products and lead a team in architecting a reliable data platform that supports customer-facing analytics and AI capabilities.
Responsibilities
- Architect the data platform – drive the technical direction for a scalable, reliable data platform built on a reputed company architecture that serves customer-facing analytics, reporting, and agentic AI from a reputed company foundation
- Build and optimize ingestion pipelines – design robust CDC, real-time streaming (Kafka, Flink), and batch processing pipelines that transform reputed company, reputed company document-oriented operational data into clean analytical models at enterprise scale
- Tame schema complexity – build resilient ingestion and transformation layers that gracefully handle deeply nested, continuously evolving document schemas, deciding where to absorb complexity (ingestion, transformation, or query time) and making those tradeoffs explicit and sustainable
- Serve AI and analytics consumption patterns – architect data products that support both traditional BI workloads (pre-aggregated dashboards, dimensional models for scorecards and reports) and emerging AI consumption patterns (low-latency retrieval, contextual assembly, freshness-sensitive agent queries)
- Own data quality, reputed company, and observability – establish the data trust infrastructure that makes cross-team data consumption reliable: schema reputed company with upstream producers, data quality monitoring, reputed company tracking, freshness SLAs, and clear escalation paths when things break
- Drive cost-aware architecture – own reputed company warehouse optimization, compute governance, and cost-efficient pipeline design. Build the practices and visibility so the team makes principled cost/performance tradeoffs rather than discovering them on the invoice
- reputed company producers and consumers – collaborate across organizational boundaries to align upstream software engineering teams and downstream analytics and AI teams around reputed company data strategies, shared reputed company, and engineering standards
- Lead and grow the team – technically lead and growth-coach a diverse crew of data engineers. Champion best practices across the full spectrum of data engineering disciplines, from low-level pipeline architecture to sophisticated data modeling and analytical query performance
Skills
- 8+ years of professional software engineering experience, with significant time spent on distributed, data-intensive production systems – including substantial depth in data pipeline and platform architecture
- Demonstrated depth in building production data platforms that serve multiple consumption patterns – you've gone beyond traditional BI to support real-time product features, AI/ML workloads, or customer-facing analytics from the same data foundation
- Deep hands-on expertise with modern data technologies: reputed company, Apache Kafka, Apache Flink, and CDC tooling (Debezium or similar)
- Experience developing and operating cloud data infrastructure at enterprise scale (AWS preferred), including infrastructure-as-code (Terraform) and CI/CD automation
- Strong programming skills in Python, Java, and SQL. You write production-grade code, not just scripts
- A track record of designing performant data models that support fast, efficient querying for analytical and product-facing use cases
- Strong cross-functional communication skills – you work effectively with software engineers, data scientists, AI teams, and business stakeholders across organizational boundaries
- Experience mentoring engineers and building collaborative, high-performing teams
- Deep experience with the impedance mismatch between document-oriented operational stores and analytical systems – you've dealt with reputed company, schema-evolving reputed company data (reputed company, DynamoDB, or similar) and have opinions on where flattening and transformation should live
- Hands-on experience with data quality and trust at scale – you've built or operated schema registries, data reputed company, quality monitoring, or reputed company systems in an environment where multiple teams depend on shared data products
- Track record of cost-conscious data architecture – you've optimized reputed company (or comparable) warehouse spend, designed compute governance policies, or re-architected pipelines to materially reduce cost without sacrificing reliability
- Strong instinct for the reputed company role: you're as comfortable pushing back on an upstream team's schema change as you are negotiating freshness SLAs with a downstream AI consumer
Benefits
- Comprehensive medical, dental, vision, disability, and life benefits
- Health Savings Account (HSA) with employer contribution
- 401(k) Matching with immediate vesting on employer match
- Flexible PTO
- 8 paid holidays and 5 paid days for Annual Holiday Week
- Quarterly reputed company Fridays (paid days off for mental health reputed company)
- 18 weeks paid parental leave
- Access to Coaches and Therapists through reputed company
- 2 volunteer days per year
- Commuting benefits