Remote SME Data Engineer
Job Description: Each day U.S. Customs and Border Protection (CBP) oversees the massive flow of people, capital, and products that enter and depart the United States by air, land, sea, and cyberspace. The volume and complexity of both physical and virtual border crossings require the application of solutions to promote efficient trade and travel. Further, effective solutions help CBP ensure the movement of people, capital, and products is legal, safe, and secure. CBP seeks capable, qualified, and versatile SME Data Engineers to help develop data analytical solutions that law enforcement personnel use to assess the risk of potential threats entering the country.
Responsibilities include, but are not limited to:
- Design, develop, and maintain scalable data pipelines and architectures to support data extraction, transformation, and loading (ETL/ELT) processes. Utilize strong SQL skills to write complex data transformations and optimize database queries, ensuring high performance and efficiency.
- Build comprehensive datasets by aggregating data sourced from various relational databases, enabling data analysts and data scientists to create machine learning models, reports, and dashboards.
- Collaborate with cross-functional teams (data analysts, data scientists, and business stakeholders) to understand business requirements and translate them into technical solutions.
- Assist with the implementation of data migration/pipelines from on-prem to cloud/non-relational storage platforms.
- Use distributed computing frameworks like Apache Spark to process large volumes of data efficiently.
- Apply data analysis, problem-solving, investigative, and creative thinking skills to handle extremely large datasets, transforming them into various formats for diverse analytical products.
- Respond to data queries/analysis requests from various groups within an organization. Create and publish regularly scheduled and/or ad hoc reports as needed.
- Troubleshoot data-related issues, identify root causes, and implement solutions to ensure data integrity and accuracy.
- Implement best practices for data governance, security, and quality supporting the core business applications.
- Responsible for source code control of data engineering work using a version control system.
Basic Qualifications:
- Experience with relational databases and knowledge of query tools and/or BI tools like Power BI or OBIEE and data analysis tools.
- Extensive experience with SQL and proficiency in writing complex queries.
- Solid understanding of data warehousing concepts and platforms, including cloud-based solutions.
- Strong experience in automating ETL jobs via UNIX/Linux shell scripts and cron jobs.
- Demonstrate a strong practical understanding of data warehousing from a production relational database environment.
- Strong experience using analytic functions in relational database tools, and experience with non-relational (Cassandra etc.) database systems.
- Strong understanding of distributed computing principles and experience with frameworks like Apache Spark.
- Hands-on experience with data lake architectures and technologies in a cloud environment.
- Experience with the Atlassian suite of tools such as Jira and Confluence.
- Knowledge of Continuous Integration & Continuous Development tools (CI/CD).
- Must be able to multitask efficiently and progressively and work comfortably in an ever-changing data environment.
- Must work well in a team environment as well as independently.
- Excellent verbal/written communication and problem-solving skills; ability to communicate information to a variety of groups at different technical skill levels.
Preferred Qualifications:
- 5+ years of experience in developing, maintaining, and optimizing complex Oracle PL/SQL packages to aggregate transactional data for consumption by data science/machine learning applications.
- 10+ years of experience in data engineering, with a focus on building and optimizing data pipelines and architectures. Must have full life cycle experience in design, development, deployment, and monitoring.
- Experience with one or more relational database systems such as Oracle, MySQL, and SQL Server, with heavy emphasis on Oracle.
- Extensive experience with cloud platforms (e.g. AWS, Google Cloud, etc.) and cloud-based ETL/ELT tools.
- Experience with AWS services such as S3, Redshift, and EMR.
- Experience with migrating on-prem legacy database objects and data to the AWS S3 cloud environment.
- Experience or familiarity with data science/machine learning, including development experience with supervised and unsupervised learning on structured and unstructured datasets.
- Certifications in relevant technologies (e.g. AWS Certified Big Data, Google Professional Data Engineer) are a plus.