About the Role
AI Research Jobs in the United States (Remote, Full-Time)
You will run applied AI research projects for US-based customers via Rex.zone, translating open-ended research questions into measurable experiments across LLM evaluation, RLHF data design, prompt evaluation, and model performance improvement.
What You Will Do
- Own end-to-end applied research cycles: problem framing, baselines, ablations, and reporting
- Build and evaluate LLM systems using offline metrics and human-in-the-loop evaluation
- Design RLHF workflows: preference data specs, rater instructions, prompt sets, and rubric-based grading
- Create evaluation datasets and test suites: prompt evaluation, red-teaming prompts, and content safety labeling protocols
- Collaborate with data labeling teams on taxonomy, edge-case coverage, and training data quality
- Perform error analysis and model debugging to improve robustness, safety, and helpfulness
- Document methodology and results for reproducibility and auditability
Required Qualifications
- Mid-Senior experience delivering applied ML research or productionized ML evaluation
- Strong Python skills; experience with PyTorch (or similar)
- Hands-on LLM evaluation, prompt evaluation, or RLHF experience
- Experiment design, metrics selection, and statistically sound interpretation
- Familiarity with dataset development: data labeling, QA evaluation, and guideline compliance checks
- Strong written communication for research artifacts and cross-functional alignment
Preferred Qualifications
- RAG/NER/structured output evaluation experience
- Exposure to computer vision or multimodal evaluation
- Content safety labeling taxonomies and policy-aligned rubrics
- MLOps for evaluation pipelines, dataset versioning, and reproducible runs
Remote Work and Collaboration
Remote, full-time role supporting United States-based projects with distributed teams across research, engineering, and data operations.
Compensation
Hourly base pay range: $30–$50/hr.