[Remote] Principal Research Data Engineer
Note: This is a remote position open to candidates in the USA. The company is a visionary organization dedicated to solving the world’s toughest challenges. It is seeking a Principal Research Data Engineer to lead the development and implementation of research data pipelines, ensure the integration of geospatial data into machine learning models, and collaborate with diverse research partners.
Responsibilities
- Lead the development & implementation of research data pipelines for producing data layers and storing research data
- Implement & maintain scalable, data-intensive processing pipelines that apply geospatial data to ML/DL models
- Architect, build & launch new data models to provide analytics to business users
- Build infrastructure to report on key metrics, recommend changes & predict future results
- Develop POCs for new pipelines for integration into the science data pipeline through collaboration with diverse research partners
Skills
- Master's in Information Science, Computer Science, Data Science, Data Analytics, or a closely related field
- 5 years of experience designing, developing, testing, and implementing scalable geospatial data integration pipelines that encompass statistical yield analysis and interactive report and visualization development
- Working with raster & vector geospatial datasets applied to machine learning model development and deployment in big data environments
- Packaging & deploying models and data pipelines using CI/CD practices, including production readiness and performance tuning activities, using Python and/or Conda, Docker, Airflow, and Git CI/CD
- Using Google Cloud Platform, Google Cloud Functions, Google BigQuery, and Dataproc to process data at scale and deliver robust data pipelines
- Using Avro, Parquet, CSV, GeoTIFF, and GeoJSON file formats
- Programming in SQL
- Conducting query optimization & Online Analytical Processing (OLAP) on RDBMS and NoSQL databases
- Using QGIS, ArcGIS & PostGIS to ingest and process geospatial data in Avro, CSV, and GeoJSON formats
Benefits
- Additional compensation may include a bonus or commission (if relevant).
- Additional benefits include health care, vision, dental, retirement, PTO, sick leave, etc.
Company Overview