Databricks Engineer

100% remote Flexible hours Hiring now

We’re looking for a hands-on Databricks Engineer to help design, build, and scale a modern data platform running on Apache Spark and Delta Lake. This role sits at the intersection of data engineering, platform architecture, and performance optimization. You’ll work closely with data scientists, analysts, and backend teams to deliver reliable, high-performance data pipelines and well-governed datasets.

Responsibilities

  • Design and implement end-to-end data pipelines using Databricks (Jobs, Workflows, Delta Live Tables)
  • Build and maintain scalable ETL/ELT processes leveraging Apache Spark (PySpark / Scala)
  • Design data models using Delta Lake, including schema design, partitioning strategies, Z-ordering, and optimization techniques
  • Manage and optimize Databricks clusters (autoscaling, spot instances, instance pools, cluster policies)
  • Implement CI/CD pipelines for Databricks deployments (e.g., using Databricks Repos, Terraform, Azure DevOps / GitHub Actions)
  • Work with structured and semi-structured data (JSON, Parquet, Avro) at scale
  • Ensure data quality and reliability through validation frameworks, unit/integration testing, and monitoring
  • Implement data governance practices (Unity Catalog, access controls, lineage tracking, auditing)
  • Troubleshoot performance issues (job failures, skew, shuffle bottlenecks, memory pressure) and optimize Spark workloads
  • Integrate Databricks with cloud-native services (AWS S3, Azure Data Lake Storage, GCP BigQuery)
  • Collaborate with data consumers to define SLAs, data contracts, and service interfaces