News Desk — Autonomous Story-Discovery Agent
An automated newsroom agent that reproduces an editorial team's morning routine: it searches the web and ~52 curated sources across 11 countries for textile/apparel news, scores and de-dupes the best stories with an LLM, confirms each against its primary source, then drafts publish-ready articles. The pipeline is two-phase with a human in the loop, and progress is streamed live over SSE.

- Year: 2026
- Type: Day job
- Stack: 11
- Outcomes: 4
What needed solving
Fibre2Fashion's news desk ran a manual morning routine — Google for textile news, visit dozens of competitor and source sites across many countries, judge what's relevant, confirm it against a primary source, then rewrite it. It was hours of repetitive triage every day, with no de-duplication across days and no consistent ranking of what actually mattered.
The solution
Built a two-phase, human-in-the-loop agent in Python + FastAPI. A fast discovery stage turns the beat plus the team's 45 editorial keywords into ~20 date-aware queries and fans out across SerpAPI/DuckDuckGo, Google News RSS, and an RSS-first crawl of the curated sources. An LLM then judges relevance (direct vs. indirect), collapses cross-outlet duplicates, and ranks by newsworthiness + source authority + recency. The editor ticks the stories worth pursuing; only those get the expensive second phase: full-text fetch (httpx with a Playwright fallback), clustering into distinct candidates, and primary-source confirmation. One click then drafts a house-style, publish-ready article from the sources, with every step streamed to a live activity log.
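To make the ranking step concrete, here is a minimal sketch of how a shortlist could be scored on newsworthiness, source authority, and recency. It is an illustration under stated assumptions, not the agent's actual code: the `Story` fields, the weights, the 24-hour half-life, and the title-normalisation de-dupe are all hypothetical (the real pipeline collapses duplicates with an LLM, and a normalised-title key merely stands in for that here).

```python
import re
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class Story:
    title: str
    url: str
    newsworthiness: float    # 0..1, assumed to come from the LLM judge
    source_authority: float  # 0..1, assumed per-source weight from config
    published: datetime


def normalise_title(title: str) -> str:
    """Lowercase and drop punctuation so near-identical headlines collide."""
    return " ".join(re.sub(r"[^a-z0-9 ]+", " ", title.lower()).split())


def recency_score(published: datetime, now: datetime,
                  half_life_hours: float = 24.0) -> float:
    """Exponential decay: 1.0 for brand-new, 0.5 after one half-life."""
    age_hours = max((now - published).total_seconds() / 3600.0, 0.0)
    return 0.5 ** (age_hours / half_life_hours)


def score(story: Story, now: datetime,
          weights: tuple = (0.5, 0.3, 0.2)) -> float:
    """Weighted sum of the three ranking signals (weights are assumptions)."""
    w_news, w_auth, w_rec = weights
    return (w_news * story.newsworthiness
            + w_auth * story.source_authority
            + w_rec * recency_score(story.published, now))


def shortlist(stories: list, now: datetime, limit: int = 30) -> list:
    """Keep the best-scoring copy of each headline, then rank descending."""
    best = {}
    for story in stories:
        key = normalise_title(story.title)
        if key not in best or score(story, now) > score(best[key], now):
            best[key] = story
    ranked = sorted(best.values(), key=lambda s: score(s, now), reverse=True)
    return ranked[:limit]


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    demo = [Story("Cotton prices surge", "https://example.com/a", 0.9, 0.8, now)]
    for story in shortlist(demo, now):
        print(f"{score(story, now):.2f}  {story.title}")
```

Keeping the score a plain weighted sum makes the ranking easy to explain to editors and to tune from config; the cap of 30 mirrors the shortlist size described in the outcomes.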
What changed
- Turns the desk's manual morning scan into one click — a deduped shortlist of ~20–30 ready-to-draft story candidates
- Two-phase pipeline keeps cost low: a scan is ~4 LLM calls / a few thousand tokens on gpt-4o-mini, and only editor-picked stories get the expensive full fetch
- One config-driven UI client serves two hosts — the FastAPI SPA and the company's .NET admin portal — so a fix or feature lands once
- Production-hardened: SSRF-guarded fetching, robots.txt + rate limits, retries with backoff, per-job token/URL budgets, structured JSON logs + Prometheus metrics, Docker + CI, durable across restarts
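As an illustration of the SSRF guard mentioned in the last item, the sketch below shows one common way to vet a URL before fetching it: allow only http/https, resolve the host, and reject the URL if any resolved address is private, loopback, link-local, or otherwise non-public. The function names and the injectable `resolver` parameter are hypothetical and chosen for this example; the project's real guard may work differently.

```python
import ipaddress
import socket
from urllib.parse import urlsplit

ALLOWED_SCHEMES = {"http", "https"}


def is_public_ip(ip: str) -> bool:
    """True only for addresses that are safe to reach from a crawler."""
    addr = ipaddress.ip_address(ip)
    return not (addr.is_private or addr.is_loopback or addr.is_link_local
                or addr.is_multicast or addr.is_reserved
                or addr.is_unspecified)


def resolve_host(host: str) -> list:
    """Resolve a hostname to its unique IP strings (IPv6 scope ids dropped)."""
    infos = socket.getaddrinfo(host, None)
    return sorted({info[4][0].split("%")[0] for info in infos})


def is_safe_url(url: str, resolver=resolve_host) -> bool:
    """Reject non-HTTP schemes and hosts that resolve to internal addresses."""
    try:
        parts = urlsplit(url)
    except ValueError:
        return False
    if parts.scheme not in ALLOWED_SCHEMES or not parts.hostname:
        return False
    try:
        addresses = resolver(parts.hostname)
    except OSError:  # includes socket.gaierror for unresolvable hosts
        return False
    # Every resolved address must be public; one internal hit fails the URL.
    return bool(addresses) and all(is_public_ip(ip) for ip in addresses)


if __name__ == "__main__":
    # No DNS lookup needed for these: one is a literal IP, one a bad scheme.
    print(is_safe_url("http://127.0.0.1:8000/admin"))
    print(is_safe_url("file:///etc/passwd"))
```

A check like this has to be repeated on every redirect hop, and a hardened fetcher would also connect to the vetted IP rather than re-resolving the hostname, since DNS can change between the check and the request.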
Technical highlights