NLP Engineer & Computer Vision – Rebuild OCR→LLM Comic Translation Pipeline (Convex + Python) - Contract to Hire
We’re hiring a Computer Vision + NLP Engineer to rebuild our entire Korean → English comic translation tool from scratch. The current system works, but we need a clean, modular, much faster, and more accurate version built on top of Convex instead of our current backend for real-time updates. You will replicate the existing workflow exactly and improve it across accuracy, performance, and architecture. This is a rebuild from scratch.

---

# Current Validated Workflow (What You Will Rebuild and Improve)

Our existing tool processes full chapters with this pipeline:

1. Upload – Chapter images are uploaded.
2. Text Detection – CRAFT generates bounding boxes around text.
3. Text Extraction (OCR) – Gemini 2.5 Pro extracts the Korean text inside each bounding box.
4. Panel Detection – OpenCV identifies comic panels in each image.
5. Panel Filtering – Gemini 2.5 Pro removes inaccurate/outlier panels.
6. Alignment – Remaining text boxes are matched to their correct panels.
7. Translation – Gemini 2.5 Pro produces English translations using panel and chapter context.

This workflow is already validated and must behave the same, just faster, cleaner, more accurate, and modular.

---

# Your Job in This Project

Rebuild this entire system from scratch with a modern, maintainable architecture that gives us:

Higher accuracy
- More precise bounding boxes
- Higher OCR accuracy (including stylized Korean fonts)
- Better panel detection and filtering
- More consistent, human-like translations
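
The alignment step of the workflow (matching text boxes to panels) depends directly on box and panel quality. Below is a minimal sketch of one possible approach, not the existing tool's method: each text box goes to the panel that covers the largest fraction of its area. The `Box` type, the `assign_to_panels` name, and the `min_fraction` threshold are illustrative assumptions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Box:
    """Axis-aligned rectangle in pixel coordinates (hypothetical type)."""
    x1: float
    y1: float
    x2: float
    y2: float

    def area(self) -> float:
        return max(self.x2 - self.x1, 0.0) * max(self.y2 - self.y1, 0.0)


def overlap_area(a: Box, b: Box) -> float:
    """Area of the intersection of two boxes, 0.0 if they do not overlap."""
    width = min(a.x2, b.x2) - max(a.x1, b.x1)
    height = min(a.y2, b.y2) - max(a.y1, b.y1)
    return max(width, 0.0) * max(height, 0.0)


def assign_to_panels(
    text_boxes: list[Box],
    panels: list[Box],
    min_fraction: float = 0.5,  # assumed threshold, tune on real chapters
) -> list[Optional[int]]:
    """Return, for each text box, the index of its best panel or None."""
    assignments: list[Optional[int]] = []
    for text_box in text_boxes:
        best_index: Optional[int] = None
        best_fraction = 0.0
        box_area = text_box.area()
        for index, panel in enumerate(panels):
            # Fraction of the text box that lies inside this panel.
            fraction = overlap_area(text_box, panel) / box_area if box_area > 0 else 0.0
            if fraction > best_fraction:
                best_index, best_fraction = index, fraction
        assignments.append(best_index if best_fraction >= min_fraction else None)
    return assignments
```

Text boxes that fall outside every panel come back as `None`, so they can be flagged for review instead of being silently attached to the wrong panel.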
Much faster overall performance
- Dramatically reduced processing time per chapter
- Efficient batching and async operations
- Minimal latency from upload to final results
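
As a rough illustration of the batching and async goals above, the sketch below processes pages concurrently with a cap on in-flight work, using only Python's standard `asyncio`. The `process_pages` and `fake_process` names and the `max_concurrency` default are assumptions for illustration; a real per-page worker would call the detection, OCR, and translation stages.

```python
import asyncio
from typing import Awaitable, Callable, Sequence


async def process_pages(
    pages: Sequence[str],
    process_one: Callable[[str], Awaitable[str]],
    max_concurrency: int = 4,  # assumed default, tune to API rate limits
) -> list[str]:
    """Run process_one on every page, at most max_concurrency at a time."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def guarded(page: str) -> str:
        # The semaphore caps how many pages are processed at once.
        async with semaphore:
            return await process_one(page)

    # gather returns results in the same order as the input pages.
    return list(await asyncio.gather(*(guarded(page) for page in pages)))


async def fake_process(page: str) -> str:
    """Stand-in for a real per-page worker (hypothetical)."""
    await asyncio.sleep(0.01)
    return f"{page}:done"


if __name__ == "__main__":
    print(asyncio.run(process_pages(["p1", "p2", "p3"], fake_process, max_concurrency=2)))
```

Capping concurrency keeps throughput high without exceeding model or API rate limits.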
A modular, replaceable architecture
Every stage must be isolated behind a clear interface so we can easily swap components:
- Replace CRAFT → PaddleOCR / Donut / YOLOv8 detector
- Replace Gemini → GPT or another LLM
- Replace panel detector without touching text logic
- Swap OCR engines freely (e.g. Donut, TrOCR, GPT fallback)
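
One way to express this kind of swappability in Python is with `typing.Protocol`, sketched below. Every name here (`TextDetector`, `OcrEngine`, `Translator`, `Pipeline`, and the stub classes) is hypothetical, and panel detection is left out for brevity; it would follow the same pattern.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class TextBox:
    x1: int
    y1: int
    x2: int
    y2: int


class TextDetector(Protocol):
    def detect(self, image: bytes) -> list[TextBox]: ...


class OcrEngine(Protocol):
    def read(self, image: bytes, box: TextBox) -> str: ...


class Translator(Protocol):
    def translate(self, korean: str, context: str) -> str: ...


@dataclass
class Pipeline:
    """Depends only on the interfaces, never on a concrete model."""
    detector: TextDetector
    ocr: OcrEngine
    translator: Translator

    def run(self, image: bytes, context: str = "") -> list[tuple[TextBox, str, str]]:
        results = []
        for box in self.detector.detect(image):
            korean = self.ocr.read(image, box)
            results.append((box, korean, self.translator.translate(korean, context)))
        return results


# Stub implementations standing in for real models (illustrative only).
class StubDetector:
    def detect(self, image: bytes) -> list[TextBox]:
        return [TextBox(0, 0, 10, 10)]


class StubOcr:
    def read(self, image: bytes, box: TextBox) -> str:
        return "안녕"


class StubTranslator:
    def translate(self, korean: str, context: str) -> str:
        return f"[en] {korean}"


if __name__ == "__main__":
    print(Pipeline(StubDetector(), StubOcr(), StubTranslator()).run(b""))
```

Swapping a detector or LLM then means writing one new class that satisfies the protocol, with no change to `Pipeline` itself.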
Modular means no rewrites when upgrading models.

Convex-based backend
- Real-time updates streamed to the frontend
- Job orchestration in Convex
- Stable state management
- Partial outputs instead of waiting for entire chapter completion
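
To illustrate partial outputs on the Python side, the sketch below publishes one update per page as soon as that page finishes, instead of returning a single result for the whole chapter. The `run_chapter` name and the shape of the update dictionaries are assumptions; `publish` is a placeholder callback that, in a real system, might write to the backend (for example through a Convex mutation) so the frontend sees progress in real time.

```python
from typing import Callable, Sequence


def run_chapter(
    pages: Sequence[bytes],
    process_page: Callable[[bytes], dict],
    publish: Callable[[dict], None],
) -> int:
    """Process pages in order, publishing each result immediately.

    Returns the number of pages that failed.
    """
    failed = 0
    for index, page in enumerate(pages):
        try:
            update = {"page": index, "status": "done", **process_page(page)}
        except Exception as exc:
            # A single bad page should not block the rest of the chapter.
            failed += 1
            update = {"page": index, "status": "failed", "error": str(exc)}
        publish(update)
    return failed


if __name__ == "__main__":
    # print stands in for a real publisher here.
    run_chapter([b"page-1", b"page-2"], lambda page: {"bytes": len(page)}, print)
```

Recording a per-page status also gives the job state a natural place for retries and error reporting.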
---

# What You Must Deliver For The $2,000 Milestone

1. Fully working pipeline implementing all steps (upload → detection → OCR → panels → alignment → translation).
2. Modular architecture where detection, OCR, panel logic, and translation can be swapped independently.
3. Convex integration for real-time syncing, job orchestration, and results.
4. Significant accuracy improvements over the current system.
5. Significant performance improvements (faster processing end-to-end).
6. Clean project structure with documentation for all modules and interfaces.

---

# Tech Stack You Will Use
- Python – OCR, detection, panel processing, AI orchestration
- TypeScript – Convex + frontend integration
- Convex – backend database, jobs, and real-time sync
- OCR Tools – CRAFT for text detection
- LLMs – Gemini, GPT

---

# Required Skills

Must have
- Strong OCR experience
- Experience with LLM-based translation/localization
- Python + TypeScript proficiency
- Ability to design clean, modular system architectures
- Experience rebuilding/refactoring existing pipelines

---

# To Apply

Please include:
- Relevant projects (OCR, CV, LLM translation, or system rebuilds)
- Examples where you improved accuracy, performance, or architecture
- A short explanation of how you would:
   1. Design a modular detection → OCR → panel → translation pipeline
   2. Improve bounding boxes and OCR for stylized Korean fonts
   3. Integrate Convex for real-time streaming to the frontend