Sr. Data Engineer I
-
Building the Data Lakehouse: In this role, you will design, build, and operate robust data lakehouse solutions using open table formats such as Apache Iceberg. Your focus will be on delivering a scalable, reliable data lakehouse with optimized query performance for a wide range of analytical workloads and emerging data applications.
-
Pipeline and Transformation: Integrate with diverse systems and construct scalable data pipelines. Implement efficient data transformation logic for both batch and streaming data, accommodating various data formats and structures.
-
Data Modeling: Analyze business requirements and profile the data to design and implement robust data models and curated data products that power reporting, analytics, and machine learning applications.
-
Data Infrastructure: Build and manage a scalable AWS cloud infrastructure for the data platform, employing Infrastructure as Code (IaC) to reliably support diverse data workloads. Implement CI/CD pipelines for automated, consistent, and scalable infrastructure deployments across environments, adhering to best practices and company standards.
-
Monitoring and Maintenance: Monitor data workloads for performance and errors, and troubleshoot issues to maintain high levels of data quality and freshness and to meet defined SLAs.
-
Collaboration: Collaborate closely with Data Services and Data Science colleagues to drive the evolution of our data platform, focusing on delivering solutions that support data users and satisfy stakeholder needs throughout the organization.
-
Bachelor's degree in Computer Science, Engineering, or a related technical field (Master's degree is a plus).
-
4+ years of hands-on engineering experience (software or data), including at least 2 years in data-focused roles.
-
Experience implementing data lake and data warehousing platforms.
-
Strong Python and SQL skills applied to data engineering tasks.
-
Proficiency with the AWS data ecosystem, including services like S3, Glue Catalog, IAM, and Secrets Manager.
-
Experience with Terraform and Kubernetes.
-
Track record of successfully building and operationalizing data pipelines.
-
Experience working with diverse data stores, particularly relational databases.
-
Experience with Airflow and dbt.
-
Certification in relevant technologies or methodologies.
-
Experience with stream processing technologies, e.g., Flink or Spark Streaming.
-
Familiarity with Domain-Driven Design principles and event-driven architectures.