AI Evaluation – Safety Specialist

Remote, Full-time
This description is a summary of our understanding of the job description.

Role Description

At Mercor, we believe the foundation of AI safety is high-quality human data. Models can’t evaluate themselves; they need humans who can apply structured judgment to complex, nuanced outputs. We’re building a flexible pod of Safety specialists: contributors from both technical and non-technical backgrounds who will serve as expert data annotators. This pod will annotate and evaluate AI behaviors to ensure the systems are safe.

No prior annotation experience is required. Instead, we’re looking for people who can make careful, consistent decisions in ambiguous situations.

This role may include reviewing AI outputs that touch on sensitive topics such as bias, misinformation, or harmful behaviors. All work is text-based, and participation in higher-sensitivity projects is optional and supported by clear guidelines and wellness resources.

Qualifications

You bring experience in model evaluation, structured annotation, or applied research. You are skilled at spotting biases, inconsistencies, or subtle unsafe behaviors that automated systems may miss. You can explain and defend your reasoning with clarity. You thrive in a fast-moving, experimental environment where evaluation methods evolve quickly.

Examples of past titles: Machine Learning Research Assistant, AI Evaluator, Data Scientist, Applied Scientist, Research Engineer, AI Safety Fellow, Annotation Specialist, Data Labeling Analyst, AI Ethics Researcher.

Requirements

Produce high-quality human data by annotating AI outputs against safety criteria (e.g., bias, misinformation, disallowed content, unsafe reasoning). Apply harm taxonomies and guidelines consistently, even when tasks are ambiguous. Document your reasoning to improve guidelines. Collaborate to provide the human data that powers AI safety research, model improvements, and risk audits.

Benefits

Work at the frontier of AI safety, providing the human data that shapes how advanced systems behave. Gain experience in a rapidly growing field with direct impact on how labs deploy frontier AI responsibly. Be part of a team committed to making AI systems safer, more trustworthy, and aligned with human values.

Company Description

Mercor is a talent marketplace that connects top experts with leading AI labs and research organizations. Our investors include Benchmark, General Catalyst, Adam D’Angelo, Larry Summers, and Jack Dorsey. Thousands of professionals across law, engineering, research, and creative fields collaborate with Mercor on frontier AI projects shaping the future.

The pay rate for this role may vary by project, customer, and content category. Compensation will be aligned with the level of expertise required, the sensitivity of the material, and the scope of work for each engagement.
