RSS Link Audit (FastAPI)
A FastAPI app that accepts an RSS/Atom feed URL, fetches each post’s full HTML, extracts outbound links, groups them by hostname, hunts for each host’s RSS feed (common endpoints + homepage discovery), and renders a stylish report using the Royal Armory palette.
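As a rough illustration of the extract-and-group step, the sketch below collects `<a href>` values from one post's HTML and counts them per hostname using only the standard library. The names `LinkCollector` and `group_outbound_links` are hypothetical and not taken from the app's source. Treating "outbound" as "any http(s) link whose host differs from the post's host" is also an assumption.

```python
from collections import Counter, defaultdict
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag in a page (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def group_outbound_links(html, page_url):
    """Return {hostname: Counter(url -> occurrences)} for links leaving page_url's host."""
    collector = LinkCollector()
    collector.feed(html)
    own_host = urlparse(page_url).hostname
    groups = defaultdict(Counter)
    for href in collector.hrefs:
        absolute = urljoin(page_url, href)  # resolve relative links
        parts = urlparse(absolute)
        # Skip mailto:, javascript:, and anything without a hostname.
        if parts.scheme not in ("http", "https") or not parts.hostname:
            continue
        # Assumption: links back to the post's own host are not "outbound".
        if parts.hostname == own_host:
            continue
        groups[parts.hostname][absolute] += 1
    return dict(groups)
```

Summing each host's counter gives the per-host link count, and `Counter.most_common()` gives a ranking that could back a "top links" view.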
Features
- Input a feed URL via UI or JSON.
- Concurrent fetching (httpx + asyncio).
- Extract links from each post page.
- Group by hostname; count occurrences.
- Heuristic RSS discovery:
  - Probe common feed endpoints (e.g. /feed, /rss.xml, /atom.xml, etc.).
  - Parse homepage <link rel="alternate" ...> for RSS/Atom.
  - Scan homepage <a> tags for rss|atom|feed.
  - Validate candidates with feedparser.
- Report UI:
- Per-host card with counts.
- Bar visual for how many links a host has.
- Top links (those mentioned more than once).
- Links list truncated with a More button.
- RSS/Atom badge if found.
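The heuristic RSS discovery listed among the features can be sketched as a function that builds an ordered list of candidate feed URLs for one host. This is a minimal sketch, not the app's actual code: `FeedHintParser` and `feed_candidates` are hypothetical names, the endpoint list only contains the three paths named above, and matching `rss|atom|feed` against the `href` (rather than the link text) is an assumption. Fetching each candidate and validating it with feedparser is left out.

```python
import re
from html.parser import HTMLParser
from urllib.parse import urljoin

# Assumed list: only the endpoints the feature list names explicitly.
COMMON_FEED_PATHS = ["/feed", "/rss.xml", "/atom.xml"]
FEED_TYPES = {"application/rss+xml", "application/atom+xml"}
FEED_HINT = re.compile(r"rss|atom|feed", re.IGNORECASE)


class FeedHintParser(HTMLParser):
    """Collect feed hints from <link rel="alternate"> and <a> tags."""

    def __init__(self):
        super().__init__()
        self.alternates = []
        self.anchors = []

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        href = attr.get("href")
        if not href:
            return
        if tag == "link":
            rels = (attr.get("rel") or "").lower().split()
            mime = (attr.get("type") or "").lower()
            if "alternate" in rels and mime in FEED_TYPES:
                self.alternates.append(href)
        elif tag == "a" and FEED_HINT.search(href):
            # Assumption: the pattern is matched against the href.
            self.anchors.append(href)


def feed_candidates(homepage_url, homepage_html):
    """Return candidate feed URLs: common endpoints, then <link>, then <a> hints."""
    parser = FeedHintParser()
    parser.feed(homepage_html)
    ordered = (
        [urljoin(homepage_url, path) for path in COMMON_FEED_PATHS]
        + [urljoin(homepage_url, href) for href in parser.alternates]
        + [urljoin(homepage_url, href) for href in parser.anchors]
    )
    # Drop duplicates while keeping the first occurrence of each URL.
    return list(dict.fromkeys(ordered))
```

Each candidate would then be fetched and checked, and the first one that parses as a real feed kept for the host.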
Run locally
python -m venv .venv
source .venv/bin/activate # Windows: .venv\Scripts\activate
pip install -r requirements.txt
uvicorn main:app --reload
Open: http://127.0.0.1:8000
API
POST /api/analyze
Content-Type: application/json
{"feed_url": "https://example.com/feed.xml"}
Returns JSON with the summarized data.
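A minimal client sketch for the endpoint above, using only the standard library. It assumes the app is running locally on the address shown under "Run locally". The helper names `build_analyze_request` and `analyze` are hypothetical, and the shape of the returned JSON is not documented here, so the sketch simply returns whatever the server sends.

```python
import json
import urllib.request

# Assumption: the server was started locally with uvicorn on the default port.
API_URL = "http://127.0.0.1:8000/api/analyze"


def build_analyze_request(feed_url, api_url=API_URL):
    """Build the POST request with the JSON body the endpoint expects."""
    body = json.dumps({"feed_url": feed_url}).encode("utf-8")
    return urllib.request.Request(
        api_url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def analyze(feed_url, timeout=60):
    """Send the request and decode the JSON response (needs a running server)."""
    request = build_analyze_request(feed_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

With the server running, `analyze("https://example.com/feed.xml")` returns the decoded report. The generous timeout is a guess, since the server fetches every post before it responds.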
Notes / Caveats
- Only static HTML is parsed (no JS rendering).
- Some sites block bots; results may vary.
- For large feeds, you may wish to trim the number of posts (e.g., slice post_urls in analyze_feed).
- Consider adding caching (e.g., aiocache, Redis) if you’ll run this frequently.
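The last two notes can be sketched as follows. This is an illustration under stated assumptions, not the app's code: `MAX_POSTS`, `CACHE_TTL_SECONDS`, `cap_posts`, and `cached` are hypothetical names and values, and the in-process dictionary is a simple stand-in for a real cache backend; it does not use the aiocache or Redis APIs.

```python
import asyncio
import time

MAX_POSTS = 25           # hypothetical cap on posts analyzed per feed
CACHE_TTL_SECONDS = 900  # hypothetical lifetime of a cached report

# key -> (time stored, value); lives only as long as the process.
_cache = {}


def cap_posts(post_urls, limit=MAX_POSTS):
    """Keep only the first `limit` post URLs."""
    return list(post_urls)[:limit]


async def cached(key, compute, ttl=CACHE_TTL_SECONDS):
    """Return a fresh cached value for `key`, else await compute() and store it."""
    now = time.monotonic()
    hit = _cache.get(key)
    if hit is not None and now - hit[0] < ttl:
        return hit[1]
    value = await compute()
    _cache[key] = (time.monotonic(), value)
    return value
```

Keying the cache on the feed URL would let repeated requests for the same feed skip the fetches until the entry expires. A shared backend such as Redis becomes relevant once more than one worker process is involved.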