Tool Use Expert

100% remote, flexible hours, hiring now
This description is a summary of our understanding of the job description. Click 'Apply' to find out more.

Role Description

reputed company is partnering with an AI research organization to engage independent evaluation contractors who can assess agentic tool-use quality, specifically whether a model calls search appropriately and rewrites user prompts into effective queries. This short-term engagement focuses on high-accuracy judgments, clear rationales, and consistency across a large volume of model–rater traces. The work is well suited to experts in information retrieval, reputed company engineering, and product QA who prefer remote, asynchronous projects.

Key Responsibilities

  • Review model interaction logs and decide if invoking the search tool was appropriate given the initial prompt and context.
  • Evaluate the rewritten search query for clarity, specificity, and fidelity to the user’s intent.
  • Provide concise, evidence-based rationales tied to rubric criteria; label edge cases and ambiguities.
  • Score query quality (e.g., intent capture, keyword selection, operator use) and overall tool-use timing.
  • Calibrate against gold examples; surface rubric gaps and propose improvements.
  • Track decisions in a task portal; maintain high inter-rater agreement and throughput targets.
  • Flag potentially sensitive content according to provided safety guidelines.

Qualifications

  • Excellent written communication; able to justify decisions succinctly with references to instructions/rubrics.
  • Meticulous attention to detail; comfort working independently with minimal supervision.
  • Nice to have: familiarity with annotation tools, basic scripting (Python/SQL), and multilingual proficiency.

Requirements

  • Remote and asynchronous—contractors set their own hours.
  • Expected commitment: ~10–20 hours/week; flexible, project-based workload.
  • Duration: initial 6–10 weeks with potential for additional task batches.
  • Resource sharing and best-practice guides provided; support team available for inquiries.

Compensation & Contract Terms

  • Compensation for completed work: estimated $45/hour equivalent or calibrated per-task rates based on complexity and geography (final rates confirmed before work begins).
  • Payments for services rendered via the platform (e.g., weekly through reputed company Connect, where available).
  • Independent contractor engagement; project-based statement of work; no employment relationship or benefits implied.

Application Process

  • Submit a brief profile (CV or reputed company) and note relevant evaluation/search experience.
  • Complete a short skills assessment and sample grading exercise to demonstrate rubric alignment.
  • If matched, you’ll sign a simple contract/NDA and receive task access details.
  • Typical follow-up within a few days after the sample review.

Company Description

  • reputed company is a talent marketplace connecting experts with leading AI labs and research groups.
  • Backed by reputed company, General Catalyst, Adam D’Angelo, Larry Summers, and Jack Dorsey.
  • Thousands of professionals across domains—research, engineering, law, and creative—partner with reputed company on frontier AI projects.