An agentic RAG experiment across 20 years of professional history
WTF is this
A personal agent I built for my freelance work. It ingests twenty years of pitches, project plans, research and retrospectives into a vector store, then lets me ask questions about my own back catalogue. Two modes: explore the corpus to find what I already did and have inevitably forgotten, or generate a new pitch or plan structured against past work, so I am not starting from a blank page at 11pm the night before.
Every answer comes with citations back to the source passage, which matters because I do not entirely trust myself, never mind an LLM, to remember what actually happened on a project from 2009.
How it works
A question is matched against the corpus by both meaning and keyword simultaneously. The semantic search finds passages that are conceptually related; the keyword search finds ones that contain the actual words. Both results are fused into a single ranked list using Reciprocal Rank Fusion, which is a fancy way of saying the passages that show up near the top of both lists get promoted. This matters because pure vector search tends to miss exact terminology, and pure keyword search misses meaning. Combining them gets more of the right stuff to the top.
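The fusion step can be sketched in a few lines. This is a minimal illustration, not the production code: the passage IDs are made up, and k = 60 is just the conventional RRF constant.

```typescript
// Reciprocal Rank Fusion: merge ranked lists of passage IDs into one list.
// Each list contributes 1 / (k + rank) per passage; passages that rank
// highly in several lists accumulate the biggest fused scores.
function rrfFuse(rankings: string[][], k = 60): string[] {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((id, i) => {
      // ranks are 1-based: the top hit in a list contributes 1 / (k + 1)
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + i + 1));
    });
  }
  // highest fused score first
  return [...scores.entries()]
    .sort((a, b) => b[1] - a[1])
    .map(([id]) => id);
}

// Illustrative inputs: p2 appears near the top of both lists, so it beats
// p7, which tops only the semantic list.
const semantic = ["p7", "p2", "p9"];
const keyword = ["p2", "p4", "p7"];
```

The k constant damps the advantage of being rank 1 versus rank 2, which is what lets broad agreement between the two retrievers outweigh a single strong placement.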
That ranked list goes to the agent, which reads it and makes a decision: enough context to answer, or not. If not, it rewrites the query and searches again, up to three times before it gives up and admits it does not know. When it does have enough, it writes the answer directly from the retrieved passages and attaches an inline citation to each claim so you can see exactly where it came from. This is less a trust mechanism than a sanity check, given that the source material is my own work and I should probably know what is in it.
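The loop itself is simple enough to sketch. The function names, shapes and stubbed-out steps below are illustrative, assuming a search step and an LLM assessment step with roughly these signatures:

```typescript
// Retrieve → assess → rewrite, up to maxAttempts times, then give up.
type Assessment = { sufficient: boolean; rewrittenQuery?: string };

async function answerWithRetries(
  question: string,
  search: (q: string) => Promise<string[]>,
  assess: (q: string, passages: string[]) => Promise<Assessment>,
  maxAttempts = 3,
): Promise<{ answer: string; passages: string[] } | null> {
  let query = question;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const passages = await search(query);
    const verdict = await assess(question, passages);
    if (verdict.sufficient) {
      // In the real system this step is grounded generation with
      // inline citations back to each retrieved passage.
      return { answer: `Answered from ${passages.length} passages`, passages };
    }
    // Not enough context: take the sharper query and go around again.
    query = verdict.rewrittenQuery ?? query;
  }
  return null; // out of attempts: admit we don't know
}
```

Note the assessment always sees the original question, not the rewritten query, so the loop stays anchored to what was actually asked.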
Document ingestion and vector indexing
Every PDF, article and case study is ingested, broken into passages, and encoded as a high-dimensional vector, building a searchable memory of everything I have produced.
Hybrid retrieval with RRF re-ranking
Questions are matched against the corpus by meaning and by keyword simultaneously, then fused into a single ranked list of the most relevant passages.
Agentic reasoning loop
The agent reads what it found and decides whether to search again with a sharper query, looping until it has enough to give a confident answer.
Grounded synthesis with citations
The response is written directly from the retrieved passages, with every claim linked back to the source document it came from.
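The ingestion step above, splitting each document into overlapping passages before embedding, can be sketched roughly like this. The window and overlap sizes are illustrative defaults, not the values the real pipeline uses:

```typescript
// Split a document into overlapping character-window passages.
// Overlap keeps a sentence that straddles a boundary retrievable
// from at least one passage.
function chunk(text: string, size = 1000, overlap = 200): string[] {
  const passages: string[] = [];
  for (let start = 0; start < text.length; start += size - overlap) {
    passages.push(text.slice(start, start + size));
    if (start + size >= text.length) break; // last window reached the end
  }
  return passages;
}

// Each passage is then embedded (e.g. into a 1024-dim vector) and stored
// alongside its source document ID, which is what makes citations possible.
```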
What the agent knows
The corpus is structured, not just a pile of PDFs scraped off a hard drive. At ingestion an LLM extracts metadata for each document (client, year, type, sector, outcome) and stores it alongside the embeddings. That is what makes the thing useful rather than a fancy search box. The agent can reason about what worked, what shape of problem this resembles, and which old project is worth pulling forward.
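The metadata makes retrieval scopeable before any vector math happens. The field names below are illustrative (the real schema lives in the structured-output prompt), but the shape follows the fields listed above:

```typescript
// Per-document metadata extracted by an LLM at ingestion time.
interface DocMeta {
  client: string;
  year: number;
  type: "pitch" | "plan" | "research" | "retrospective";
  sector: string;
  outcome: string;
}

// With metadata stored alongside the embeddings, a question like
// "what retrospectives have I written since 2015?" becomes a filter,
// not a similarity search.
function scope(docs: DocMeta[], pred: (d: DocMeta) => boolean): DocMeta[] {
  return docs.filter(pred);
}
```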

Tech stack
The interesting part is the AI stack, not the web stack. Everything below sits behind a thin Next.js app with the usual frontend bits, which I will spare you.
- 1024-dim vectors for semantic search
- Vector store on Supabase Postgres with an IVFFlat index
- Keyword retrieval fused with semantic search via RRF
- Explicit retrieve → assess → generate loop
- Structured outputs for assessment and metadata extraction
- LangGraph state streamed to the client
- Default model: fast enough for an agentic loop
- Swappable via the AI_PROVIDER env var
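The provider swap is roughly a lookup keyed on the env var. The provider names and the 1536-dim alternative below are illustrative, not the real config; only the 1024-dim default matches what the post describes:

```typescript
// Resolve the model provider from AI_PROVIDER, falling back to a default.
type Provider = { name: string; embeddingDims: number };

const providers: Record<string, Provider> = {
  default: { name: "default", embeddingDims: 1024 },
  alt: { name: "alt", embeddingDims: 1536 }, // hypothetical alternative
};

function resolveProvider(env: Record<string, string | undefined>): Provider {
  const key = env.AI_PROVIDER ?? "default";
  const p = providers[key];
  if (!p) throw new Error(`Unknown AI_PROVIDER: ${key}`);
  return p; // callers never hardcode a provider, so swapping is one env var
}
```

Failing loudly on an unknown value beats silently falling back, since a typo in an env var would otherwise re-embed the corpus against the wrong dimensionality.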