About TrustLab
Online misinformation, hate speech, child endangerment, and extreme violence are some of the world's most critical and complex problems. TrustLab is a fast-growing, VC-backed startup founded by ex-Google, TikTok, and Reddit executives determined to use software engineering, ML, and data science to tackle these challenges and make the internet healthier and safer for everyone. If you're interested in working with the world's largest social media companies and online platforms, and in building technologies to mitigate these issues, you've come to the right place.
About the role
We are seeking an AI Engineer with expertise in Large Language Models (LLMs) to enhance the precision and recall of classification systems that detect content abuse, including hate speech, sexual content, misinformation, and other policy-violating material. You will work with cutting-edge AI models to refine detection mechanisms, improve accuracy, and minimize false positives and negatives.
Responsibilities
- Design, develop, and optimize automated content moderation workflows, focusing on precision and recall improvements.
- Leverage prompt engineering, fine-tuning, RAG, and other approaches to optimize cost and quality.
- Deploy and monitor content moderation models in production, iterating based on real-world performance metrics and feedback loops.
- Collaborate with policy, trust & safety, and engineering teams to align AI models with customer needs.
- Stay up-to-date with advancements in AI and AI safety to ensure best-in-class content moderation capabilities.
- Develop a medium- to long-term vision for content moderation R&D in collaboration with management, product, policy & operations, and engineering teams.
- Take ownership of results delivered to customers, pushing for change in approach where needed and taking the lead on cross-functional execution.
Minimum Qualifications
- Bachelor's or Master's degree in Computer Science, Artificial Intelligence, Machine Learning, or a related field. A Ph.D. is a plus.
- Proficiency in Python. Experience with AWS and CI/CD processes & tools is a strong plus.
- Experience with prompt-engineering techniques and familiarity with multiple LLM providers.
- Strong familiarity with evaluation metrics for classification tasks (e.g., F1-score, precision-recall curves) and best practices for handling imbalanced datasets.
- 5+ years of industry experience in NLP, computer vision, and LLMs.
- 2+ years of experience making LLMs work in production for non-trivial use cases.
- Experience with fine-tuning and RAG approaches, and LLM cost optimization in production use cases.
- Experience automating, orchestrating, and monitoring LLM workflows.
- Hands-on experience with debugging issues in production environments, especially on AWS.
- Ability to work cross-functionally and enable non-technical stakeholders to leverage AI approaches and tools.
Opportunities and perks
- Work on cutting-edge AI technologies shaping the future of online safety.
- Collaborate with a multidisciplinary team tackling some of the most challenging problems in content moderation.
- Competitive compensation, comprehensive benefits, and opportunities for professional growth.
This pay range is specific to the U.S. Bay Area; compensation may differ by region.
The pay range for this role is:
150,000–180,000 USD per year (Palo Alto)