An undergraduate class · CSCI 455 / 555 · Spring 2026

Fifteen undergraduate scholars spent twelve weeks at the College of William & Mary learning how code-aware language models actually behave: where they hallucinate, how to measure their failures, and how to build workflows that don't collapse in production. Eleven projects shipped this spring.

Modules VIII
Labs III
Cohort 15
Shipped 11
§ I The Cohort

The 2026 class.

§ II The Playbook · An overview of the course structure

Eight chapters of theory.

An overview of how the term is built — each chapter is read end-to-end, then drilled in a notebook. Lectures hand off to lab handouts; lab handouts hand off to seminar discussions. The notebook stays open the entire semester.

I · Wk 01-02

Mining repositories

Collecting, cleaning, and tokenizing source-code data from public repositories. PyDriller, BPE, deduplication with MinHash, and the ethics of training on copyleft code.

repositories · BPE · ASTs · licensing
Lab A · shipped
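
As an illustration of the MinHash deduplication named in this module, here is a minimal sketch using only the Python standard library. The helper names (`shingles`, `minhash_signature`, `estimated_jaccard`), the whitespace tokenization, and the 64-hash signature size are assumptions made for the example, not the lab's actual code.

```python
import hashlib

def shingles(code: str, k: int = 3) -> set[str]:
    """Return the set of k-token shingles from whitespace-split code."""
    tokens = code.split()
    if len(tokens) < k:
        return {" ".join(tokens)}
    return {" ".join(tokens[i:i + k]) for i in range(len(tokens) - k + 1)}

def minhash_signature(items: set[str], num_hashes: int = 64) -> list[int]:
    """For each seeded hash function, keep the minimum hash over all items."""
    signature = []
    for seed in range(num_hashes):
        salt = seed.to_bytes(4, "big")  # a different salt acts as a different hash function
        signature.append(min(
            int.from_bytes(
                hashlib.blake2b(salt + item.encode(), digest_size=8).digest(), "big"
            )
            for item in items
        ))
    return signature

def estimated_jaccard(sig_a: list[int], sig_b: list[int]) -> float:
    """The fraction of matching signature slots estimates Jaccard similarity."""
    matches = sum(a == b for a, b in zip(sig_a, sig_b))
    return matches / len(sig_a)

original = "def add(a, b): return a + b"
reformatted = "def  add(a, b):   return a + b"  # same tokens, different spacing
unrelated = "class Foo: pass # unrelated snippet here"

sig = minhash_signature(shingles(original))
near_duplicate_score = estimated_jaccard(sig, minhash_signature(shingles(reformatted)))
unrelated_score = estimated_jaccard(sig, minhash_signature(shingles(unrelated)))
```

In a deduplication pass, pairs whose estimated similarity exceeds a chosen threshold would be treated as near-duplicates; the threshold itself is a tuning decision this sketch leaves open.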
II · Wk 03

Modeling code

From n-grams to the naturalness hypothesis. Probability refresher, MLE, perplexity, smoothing, sampling temperature — and why code is more predictable than English.

MLE · perplexity · smoothing · sampling
Pre-lab · spam classifier
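
To make perplexity and smoothing concrete, the following sketch trains a bigram model with add-one (Laplace) smoothing and scores two token sequences. The function names and the toy token stream are hypothetical, and the sketch assumes every scored token also appears in the training data.

```python
import math
from collections import Counter

def train_bigram(tokens):
    """Count contexts and bigrams; return them with the vocabulary size."""
    context_counts = Counter(tokens[:-1])           # how often each token starts a bigram
    bigram_counts = Counter(zip(tokens, tokens[1:]))
    return context_counts, bigram_counts, len(set(tokens))

def smoothed_prob(prev, cur, context_counts, bigram_counts, vocab_size):
    """Add-one smoothed estimate of P(cur | prev)."""
    return (bigram_counts[(prev, cur)] + 1) / (context_counts[prev] + vocab_size)

def perplexity(tokens, model):
    """exp of the average negative log-probability per bigram."""
    context_counts, bigram_counts, vocab_size = model
    pairs = list(zip(tokens, tokens[1:]))
    if not pairs:
        raise ValueError("need at least two tokens to score")
    log_prob = sum(
        math.log(smoothed_prob(p, c, context_counts, bigram_counts, vocab_size))
        for p, c in pairs
    )
    return math.exp(-log_prob / len(pairs))

model = train_bigram("a b a b a b a b".split())
regular = perplexity("a b a b".split(), model)   # follows the training pattern
irregular = perplexity("a a b b".split(), model) # breaks the training pattern
```

The sequence that follows the training pattern gets the lower perplexity, which is the sense in which a more predictable corpus is "more natural" to the model.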
III · Wk 04

Evaluating rigorously

Classification metrics, BLEU and its discontents, CodeBLEU, pass@k, embeddings, SIDE, and the unglamorous human-evaluation rubric the best papers include without making a show of it.

BLEU · pass@k · CodeBLEU · SIDE
Lab B · shipped
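
For pass@k, a widely used unbiased estimator draws n samples per problem, counts the c that pass the tests, and computes 1 - C(n-c, k) / C(n, k). The sketch below implements that formula; the function name and argument checks are choices made for this example, and the lab may use a different formulation.

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate the chance that at least one of k samples passes.

    n: samples generated for the problem
    c: samples among the n that passed the tests
    k: sample budget being evaluated (1 <= k <= n)
    """
    if not 0 <= c <= n:
        raise ValueError("c must be between 0 and n")
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and n")
    # comb(n - c, k) counts the k-subsets made only of failing samples;
    # it is 0 when fewer than k samples failed.
    return 1.0 - comb(n - c, k) / comb(n, k)

three_of_ten_at_1 = pass_at_k(n=10, c=3, k=1)
three_of_ten_at_5 = pass_at_k(n=10, c=3, k=5)
```

A benchmark-level score would average this value over all problems.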
IV · Wk 05-06

Deep learning

Neural networks, backpropagation, embeddings, LSTM/GRU, attention, transformers, autoregressive generation, pre-training, and fine-tuning — the engine room.

neural nets · LSTM · transformers · CodeBERT
Lab · code completion
V · Wk 07-08

Prompting LLMs

In-context learning, few-shot, chain-of-thought, prompt engineering, RAG, tool use, context-window management, prompt chaining, self-consistency, and prompt evaluation.

ICL · CoT · RAG · tool use
Lab C · shipped
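
Self-consistency, one of the techniques listed in this module, samples several answers to the same prompt and keeps the most common one. The sketch below shows only the voting step; the sampled answers are hard-coded stand-ins for model outputs, and `self_consistent_answer` is a hypothetical helper rather than part of any library.

```python
from collections import Counter

def self_consistent_answer(samples: list[str]) -> tuple[str, float]:
    """Return the majority answer and the share of samples that agree with it."""
    if not samples:
        raise ValueError("need at least one sampled answer")
    normalized = [s.strip().lower() for s in samples]  # ignore case and stray spaces
    answer, votes = Counter(normalized).most_common(1)[0]
    return answer, votes / len(normalized)

# Stand-ins for final answers extracted from five sampled reasoning chains.
sampled = ["O(n log n)", "o(n log n) ", "O(n^2)", "O(n log n)", "O(n)"]
answer, agreement = self_consistent_answer(sampled)
```

The agreement share can double as a rough confidence signal: a low value suggests the prompt is eliciting unstable answers.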
VI · Wk 09

Hallucinations in code

How LLMs fabricate, the CodeHalu taxonomy, RAG mitigation, prompt defenses, tool-augmented generation, production case studies, and hallucination-resistant workflows.

CodeHalu · RAG mitigation · production cases
Workshop · red-team
VII · Wk 10

NP-completeness

Reductions, hardness, and what LLMs do when the underlying problem isn't tractable. Where statistical pattern-matching collides with the unforgiving floor of computational complexity.

reductions · SAT · hardness · complexity
Theory companion
VIII · Wk 11-12

Genetic algorithms

Population search, fitness landscapes, crossover, fitness approximation with LLM predictors, the GA+LLM architecture, and the honest limits of evolutionary search over code.

population · selection · fitness approx · GA + LLM
Capstone-adjacent
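
The population search, selection, and crossover named in this module can be sketched as a small genetic algorithm over bit strings. Everything here is an assumption for illustration: the `evolve` function, tournament selection of size 3, single-point crossover, elitism, and the OneMax fitness (count of 1 bits). In a GA+LLM setup, the `fitness` argument is where an approximate, model-based predictor would plug in.

```python
import random

def evolve(fitness, genome_len=20, pop_size=30, generations=40,
           mutation_rate=0.05, seed=0):
    """Run a simple elitist GA; return the best genome and per-generation best scores."""
    rng = random.Random(seed)  # seeded for reproducible runs
    population = [[rng.randint(0, 1) for _ in range(genome_len)]
                  for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        history.append(fitness(population[0]))
        next_gen = [population[0][:]]  # elitism: carry the best genome over unchanged
        while len(next_gen) < pop_size:
            # Tournament selection: the fitter of three random genomes becomes a parent.
            parent_a = max(rng.sample(population, 3), key=fitness)
            parent_b = max(rng.sample(population, 3), key=fitness)
            cut = rng.randint(1, genome_len - 1)  # single-point crossover
            child = parent_a[:cut] + parent_b[cut:]
            # Flip each bit independently with probability mutation_rate.
            child = [1 - g if rng.random() < mutation_rate else g for g in child]
            next_gen.append(child)
        population = next_gen
    best = max(population, key=fitness)
    history.append(fitness(best))
    return best, history

best, history = evolve(fitness=sum)  # OneMax: fitness is the number of 1 bits
```

Because the elite genome is copied forward untouched, the best score never decreases from one generation to the next; whether it reaches the optimum depends on the landscape and the budget.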

Grading scheme

Assignments avg · midterm · capstone split · participation
Deliverable | What it is | Weight
Assignments I-III (avg) | Average of three coding assignments: mining, modeling, evaluation. | 40%
Midterm | Mid-semester written examination of theory and methods. | 10%
Final project | Capstone block · 5–7 page write-up paired with a ten-minute in-class demo of the shipped product. Graded on five rubric criteria. | 45%
Participation | Office-hour engagement, seminar discussion, peer review. | 5%
Bonus | Additive, not weighted; for exceptional contributions. | +0–5
Total | Weighted components sum to 100%; bonus remains additive. | 100%
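
The weighting above can be worked through as a short calculation. This is a sketch under stated assumptions: every component is scored out of 100, the bonus is added as raw points after weighting, and no cap is applied to the total (the scheme does not say whether one exists). The function name and the sample scores are hypothetical.

```python
# Weights in percent, taken from the grading table.
WEIGHTS = {"assignments": 40, "midterm": 10, "final_project": 45, "participation": 5}

def course_grade(assignment_scores, midterm, final_project, participation, bonus=0.0):
    """Combine component scores (each 0-100) into a weighted total plus bonus."""
    if not 0 <= bonus <= 5:
        raise ValueError("bonus is defined as 0 to 5 points")
    assignments_avg = sum(assignment_scores) / len(assignment_scores)
    weighted = (
        WEIGHTS["assignments"] * assignments_avg
        + WEIGHTS["midterm"] * midterm
        + WEIGHTS["final_project"] * final_project
        + WEIGHTS["participation"] * participation
    ) / 100
    return weighted + bonus  # assumption: bonus points are added after weighting

example = course_grade([90, 80, 100], midterm=70, final_project=85, participation=100)
with_bonus = course_grade([90, 80, 100], 70, 85, 100, bonus=2)
```

With these sample scores the assignments average is 90, so the total is 36 + 7 + 38.25 + 5 = 86.25 before any bonus.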
§ III The Labs

Five labs, one notebook each.

Each lab pairs a chapter of theory with a hands-on notebook — the artifact a future student inherits. Run them locally, modify them, break them. Lab handouts and source notebooks are linked from each module.

§ IV Final Projects

Real shipped products.

Each group chose a real problem, scoped a system, and built something that runs. Projects were scored on market analysis, differentiation, and technical framework. All eleven groups shipped on schedule; five cleared the bar and six fell short. The results below are sorted by group number.

Group 01
Finance · markets

Stock Investment AI

An algorithmic stock-prediction interface with explainable retrieval-grounded recommendations. Pairs price-signal modeling with LLM-generated reasoning over filings.

Aidan Basloe · Jeff Lin
Group 02
Search · multimedia

Multimodal Video Indexing

Natural-language search across video archives, replacing brittle metadata-only retrieval with vision-and-language embeddings indexed at scene granularity.

Nathaniel Callabresi · Lily Walker
Group 03
Sports · risk

Sports Betting Arbitrage

Real-time cross-sportsbook arbitrage detection with risk-aware position sizing. Surfaces price disagreements before they close.

Alan Gonzalez Osorio
Group 04
Tooling · data

PlotForge

A plotting interface for data analysis aimed at students, educators, and lightweight analysts. Natural-language to charting with iterative refinement.

James He · Jack Stawasz
Group 05
Dev tools · QA

BURT++

A bug-report assistant that translates non-technical user complaints into actionable engineering tickets — clarifying reproduction steps as it goes.

Sam Bennett
Group 06
Civic · verification

GenAI Claim Verification

Retrieval-augmented evidence pipeline for verifying factual claims, attaching source citations with calibrated confidence.

Alice Ji · Camly Tran
Group 07
Education · planning

W&M Degree Map

A planning tool for liberal-arts students navigating complex general-education requirements. Goal-aware course recommendations with clear-eyed prerequisite traversal.

Abby Schwall
Group 08
Sports · rules

RAG Rules · Ultimate Frisbee

A retrieval-augmented rules interpreter for self-officiated Ultimate Frisbee. Answers in-game questions by grounding in the official rulebook.

Krishna Swaminathan
Group 09
Education · code

CodeCaster

A coding assistant for social-science students learning to program for data analysis. Designed for the first hundred lines, not the next thousand.

Yibarek Tadesse
Group 10
Civic · sports

Youth Sports Registration

A multilingual youth-sports registration platform with serious accessibility focus — built to reach families current platforms exclude.

Carter Williamson
Group 11
Career · jobs

AI-Powered Job Search

A unified career platform consolidating fragmented job-seeker tooling into one assistant — resume, search, outreach, and prep in a single workflow.

Walker Hyman
CSCI 455 / 555 · Spring 2026

Generative AI for Software Development

An undergraduate class at the College of William & Mary