Language Model Research Intern (Summer 2025)

At Knexus, we believe AI has the power to transform national security. We are a team of experienced technologists, many with deep roots in US government service, who are passionate about applying cutting-edge AI and machine learning to solve critical challenges facing our nation.


We work closely with government agencies to develop and deploy innovative AI solutions that enhance national security. Our work spans a range of areas, including:

  • Data analysis and intelligence: Developing AI-powered systems to analyze vast amounts of data to identify threats, predict future events, and improve situational awareness.
  • Automation and efficiency: Building machine learning models to automate complex tasks, freeing up human analysts to focus on higher-level decision-making.

Our core values are:

  • Be collaborative
  • Understand the why
  • Seek proactive solutions
  • Embody the "state of the art" mindset

Duration: 8-10 Weeks, Stipend: No, Remote: Yes, Start Date: Late Spring/Early Summer 2025 



The Opportunity:  We are looking for an energetic intern to conduct R&D on Generative AI and evaluate the performance of open-source Language Models (LMs).


The intern will work with their mentor and the internship team to research and implement an orchestration pipeline and evaluation software. They will develop and curate test data and run experiments to evaluate various models on specified computational tasks, then analyze and report their findings to the internship team.


Location 

Knexus can support 100% remote work. Our clients and most of the team are on East Coast hours, so we need someone able to support normal working hours in Eastern Time. If you are in the DC area, our office is in Tysons and you would be welcome to work out of that office, but it's not required.

Responsibilities 

  • Conduct literature search on language models and their evaluation techniques 
  • Implement performance evaluation pipelines for language models 
  • Create and curate test data and evaluation metrics  
  • Conduct comparative evaluation, analyze and report findings 

Qualifications and Experience   

  • Undergraduate or Master’s degree in Computer Science 
  • Course(s) in AI and ML 
  • Experience conducting algorithmic performance evaluations  
  • Proficient in Python, associated tools and libraries 
  • Experience with Git and model repositories 

Bonus Experience  

  • Experience using foundation language model APIs 
  • Experience with Google Vertex AI or Hugging Face Ecosystem 
  • Experience with LangChain or related prompting techniques 

Eligibility Requirements 

  • US Citizen

Applicants should be prepared to complete remote interviews. We will also ask for academic transcripts (college/university and/or military training) and professional references.

Don’t feel like you meet every single requirement? Studies have shown that women and people of color are less likely to apply to jobs unless they meet every single qualification. At Knexus, we are dedicated to building a diverse, inclusive, and authentic workplace, so if you’re excited about this role but your past experience doesn’t align perfectly with every qualification in the job description, we encourage you to apply anyway. You may be just the right candidate for this or other roles. Knexus is committed to hiring and retaining a diverse workforce. We are proud to make decisions without regard to race, color, religion, creed, sex, sexual orientation, gender identity, marital status, national origin, age, veteran status, disability, or any other protected class.

Engineering

Knexus Headquarters
