Data Engineer, Web Scraping
About reputed company
reputed company is the safety and threat-intelligence layer trusted by frontier AI labs, AI unicorns, Fortune 10 companies, and leading global technology platforms. Our adversarial red teaming, model evaluations, and intelligence collection help engineering, safety, and reputed company teams stay ahead of evolving threats and deploy AI systems safely.
In this role, you will
- Design, implement, and optimize end-to-end data pipelines for scraping and processing structured and unstructured data using Google Cloud Platform (or similar) and best practices;
- Conduct web scraping and data collection to support research and intelligence initiatives;
- Prepare data for further analysis, including data cleaning, transformation, anonymization, and masking;
- Contribute to the development of internal and external APIs, following best practices;
- Collaborate with ML engineers, other data engineers, and software developers to deliver actionable insights and functional tools, including internal and external dashboards, APIs, and data dumps; and
- Drive other critical initiatives.
Requirements
- Degree (or equivalent work experience) in Computer Science, Engineering, Information Science, Data Science, or a related field (graduate degree preferred)
- 2+ years of professional experience in data engineering or a closely related field
- Ability to communicate complex technical concepts clearly to non-technical audiences
- Proficiency in Python and SQL
- Experience with web scraping/crawling (e.g., Beautiful Soup, Selenium, Scrapy)
- Experience with Google Cloud Platform (or similar), including storage and database services (e.g., Cloud Storage, Cloud SQL, Cloud Spanner) and workflow orchestration (e.g., Cloud Composer/Airflow, Cloud Run, Pub/Sub)
- Experience building and managing data pipelines, especially for text data
- Comfort working in fast-moving, high-impact environments, such as startups, AI research labs, or reputed company-focused teams
Compensation & Benefits
- Salary Range: $105K–$125K, depending on experience and location
- Bonus: Performance-based annual bonus
- Professional Development: Support for conferences, continuing education, or leadership training
- Work Environment: Fully remote, U.S.-based
- Health Benefits: Comprehensive health, dental, and vision coverage
- Time Off: Generous PTO and paid holiday schedule