Applied Scientist, Oncology Foundation Model (Intern)

Drug development shouldn’t be guesswork, not when patients are waiting.

Pathos is building a next-generation biotech with AI at the core. Not as a feature, but as the operating system for how medicines get developed. We believe most drugs don’t fail because the science was wrong. They fail because they were tested in the wrong patients, with the wrong assumptions, in trials that couldn’t answer the real question: who benefits, and why?

Pathos exists to change that. We’re building the largest foundation model in oncology and pairing it with proprietary AI systems, deep oncology expertise, and 200+ petabytes of multimodal data linked to patient outcomes, so we can make development decisions with more precision, much earlier.

This is not theoretical. We’re well-capitalized and have the leadership to build a generational company. We invest in and advance our own clinical-stage programs, using our AI platform to sharpen trial design, patient selection, and biomarker strategy so that therapies reach the patients most likely to benefit, sooner.

How We Build
Pathos does not operate like a traditional biotech. There is no middle management. There are no layers of approval. The company is designed, from the ground up, around small teams of 2–4 subject-matter experts who each command hundreds of AI agents to do the work that used to require dozens of people.

Everyone builds. Everyone ships. Every function at Pathos — from clinical execution to asset selection to the foundation model itself — runs on this model. Our product velocity delivers meaningful outcomes in hours instead of weeks. This is not a future aspiration. It is how we operate today.

The people who thrive here are operators: deep experts who can specify what needs to happen, orchestrate AI agents to execute at scale, and make high-judgment calls that compound over time. If you have spent your career building and shipping AI systems at scale, this is the environment where that experience becomes a superpower.

About the Role

The Oncology Foundation Model is the scientific core of Pathos, and this role sits at its center. We're hiring an Applied Scientist to lead the pretraining and post-training of large language models purpose-built for oncology, trained on a dataset unlike anything available in the public domain.

This isn't a role where you fine-tune general-purpose models and call it done. You'll be making architectural decisions, designing evaluation frameworks, and working directly with oncologists and clinical researchers to ensure the model reflects real-world medical reasoning. Your work will propagate through every AI system we build.

If you want to do the most scientifically meaningful work of your career, at the intersection of frontier ML and cancer biology, this is where it happens.

What You'll Do

Lead the design, pretraining, and post-training of large language models for oncology applications.
Develop strategies for curating, processing, and governing oncology-specific datasets at scale.
Implement alignment techniques including RLHF, supervised fine-tuning, and domain adaptation.
Design rigorous evaluation frameworks to assess model performance, safety, and clinical relevance.
Conduct novel research in LLM architectures and training methodologies for biomedical domains.
Publish findings at top-tier conferences and journals; communicate work to internal and external stakeholders.
Partner with oncologists, clinical researchers, and cross-functional teams throughout the model lifecycle.
Mentor junior scientists and help build a culture of scientific rigor.

Who You Are

PhD in Computer Science, Machine Learning, AI, or a related field, or an MS with equivalent experience.
Hands-on experience with deep learning and neural network architectures.
Proven expertise in both pretraining and post-training of large language models (e.g., LLaMA, Qwen, DeepSeek, or similar).
Strong publication record at top-tier venues: NeurIPS, ICML, ICLR, ACL, or EMNLP.
Deep understanding of transformer architectures, attention mechanisms, and optimization.
Proficient in Python and deep learning frameworks (PyTorch or TensorFlow/JAX).
Experience with distributed training and large-scale model infrastructure.
Strong communicator, able to translate technical work for clinical and non-technical audiences.

Preferred Qualifications

Experience applying LLMs to biomedical, healthcare, or life sciences domains.
Background in computational biology, bioinformatics, or medical informatics.
Knowledge of oncology terminology, clinical workflows, or cancer biology.
Experience with retrieval-augmented generation (RAG) or knowledge grounding techniques.
Familiarity with model safety, alignment, and responsible AI practices.
Track record of translating research into production systems.
Experience with prompt engineering and instruction tuning.
Contributions to open-source ML projects.
First-author publications demonstrating research leadership.

Location

This is a hybrid role, requiring 3 to 4 days per week onsite at our NYC headquarters.

The pay range for this role is:

60–80 USD per hour (New York Office)

Engineering

New York City, NY
