AI Chatbot — RAG · Agents · Tools
A production retrieval-augmented chatbot over internal documents. Metadata-filtered retrieval feeds a tool-using agent loop; FastAPI exposes the API; the Python service runs on Render.
What needed solving
Internal teams kept hitting the same long handbooks and policy docs for the same questions. Plain keyword search missed context, and single-shot RAG over the whole corpus returned irrelevant chunks when a question touched only one section.
The solution
Built a metadata-aware RAG pipeline: each chunk carries source / section / tag metadata used to pre-filter the candidate set before semantic search. On top, an agent loop (LangChain tools + structured tool-calling) decides when to retrieve, when to summarize, and when to call structured-extraction tools. Exposed via a FastAPI service deployed on Render with streaming responses.
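The pre-filter-then-rank step described above can be sketched roughly as follows. This is a minimal illustration in plain Python, not the project's actual code: the `Chunk` class, the `retrieve` function, the two-dimensional embeddings, and the sample handbook sentences are all hypothetical, and a real deployment would presumably delegate both steps to a vector store.

```python
import math
from dataclasses import dataclass


@dataclass
class Chunk:
    text: str
    embedding: list[float]
    metadata: dict[str, str]  # e.g. {"source": ..., "section": ..., "tag": ...}


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(chunks: list[Chunk], query_embedding: list[float],
             filters: dict[str, str], k: int = 3) -> list[Chunk]:
    # Step 1: metadata pre-filter. Keep only chunks matching every filter.
    candidates = [
        c for c in chunks
        if all(c.metadata.get(key) == value for key, value in filters.items())
    ]
    # Step 2: semantic ranking over the narrowed candidate set.
    ranked = sorted(candidates,
                    key=lambda c: cosine(c.embedding, query_embedding),
                    reverse=True)
    return ranked[:k]


# Hypothetical toy corpus; embeddings are hand-picked for illustration only.
CHUNKS = [
    Chunk("Employees accrue 20 vacation days per year.", [0.9, 0.1],
          {"source": "handbook", "section": "leave"}),
    Chunk("Expense reports are due within 30 days.", [0.8, 0.2],
          {"source": "handbook", "section": "expenses"}),
    Chunk("Sick leave requires a doctor's note after 3 days.", [0.2, 0.9],
          {"source": "handbook", "section": "leave"}),
]

results = retrieve(CHUNKS, query_embedding=[1.0, 0.0],
                   filters={"section": "leave"}, k=2)
for chunk in results:
    print(chunk.metadata["section"], "|", chunk.text)
```

In this toy corpus the expenses chunk is the second-closest vector to the query, so without the `section` filter it would reach the LLM ahead of the sick-leave chunk. Filtering first is what keeps it out.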
What changed
- Metadata filtering narrows the candidate set before vector ranking, so fewer off-topic chunks reach the LLM
- Agent with tools handles structured questions ("extract X from section Y") beyond plain semantic search
- Streaming responses + per-request tracing make latency and quality issues observable in production
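To make the tool-calling and tracing points concrete, here is a small sketch of dispatching structured tool calls while recording a per-request trace. It uses only the standard library and is not LangChain's API: the `tool` decorator, the `TOOLS` registry, the two stub tools, and the hard-coded call list (standing in for decisions the LLM would make) are all assumptions for illustration.

```python
import time
import uuid
from typing import Callable

# Hypothetical registry mapping tool names to plain Python callables.
TOOLS: dict[str, Callable[..., str]] = {}


def tool(name: str):
    """Register a function under a tool name."""
    def register(fn: Callable[..., str]) -> Callable[..., str]:
        TOOLS[name] = fn
        return fn
    return register


@tool("retrieve")
def retrieve_tool(query: str) -> str:
    # Stub: a real tool would run the metadata-filtered retrieval.
    return f"chunks matching {query!r}"


@tool("extract")
def extract_tool(field: str, section: str) -> str:
    # Stub: a real tool would run structured extraction on one section.
    return f"{field} from section {section}"


def run_tool_call(call: dict, trace: list[dict]) -> str:
    """Execute one structured tool call and append its timing to the trace."""
    fn = TOOLS.get(call["tool"])
    if fn is None:
        raise ValueError(f"unknown tool: {call['tool']}")
    start = time.perf_counter()
    result = fn(**call["args"])
    trace.append({"tool": call["tool"],
                  "seconds": time.perf_counter() - start})
    return result


def handle_request(calls: list[dict]) -> dict:
    """Run a sequence of tool calls under a single request id."""
    request_id = str(uuid.uuid4())
    trace: list[dict] = []
    outputs = [run_tool_call(call, trace) for call in calls]
    return {"request_id": request_id, "outputs": outputs, "trace": trace}


# The calls below stand in for what the model would emit for
# "extract the notice period from the termination section".
response = handle_request([
    {"tool": "retrieve", "args": {"query": "notice period"}},
    {"tool": "extract", "args": {"field": "notice period",
                                 "section": "termination"}},
])
print(response["request_id"], [step["tool"] for step in response["trace"]])
```

Keeping the request id and per-step timings together means a slow or low-quality answer can be traced back to the specific tool call that caused it.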
Technical highlights