rss-link-app/README.md
2025-09-03 20:22:39 -05:00

# RSS Link Audit (FastAPI)
A FastAPI app that accepts an RSS/Atom feed URL, fetches each post's full HTML, extracts outbound links, groups them by hostname, **hunts for each host's RSS feed** (common endpoints + homepage discovery), and renders a stylish report using the **Royal Armory** palette.
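
As a rough illustration of the "extract outbound links, group by hostname" step, here is a minimal standard-library sketch. The names `LinkExtractor` and `outbound_links_by_host` are hypothetical and not taken from this app's code, and skipping links that point back to the post's own host is an assumption about what "outbound" means here.

```python
from collections import Counter
from html.parser import HTMLParser
from urllib.parse import urljoin, urlsplit


class LinkExtractor(HTMLParser):
    """Collect absolute http(s) URLs from <a href="..."> tags."""

    def __init__(self, base_url: str) -> None:
        super().__init__()
        self.base_url = base_url
        self.links: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if not href:
            return
        # Resolve relative hrefs against the post URL.
        absolute = urljoin(self.base_url, href)
        # Drop mailto:, javascript:, and other non-web schemes.
        if urlsplit(absolute).scheme in ("http", "https"):
            self.links.append(absolute)


def outbound_links_by_host(html: str, page_url: str) -> Counter:
    """Count links per hostname, skipping links to the post's own host (assumption)."""
    parser = LinkExtractor(page_url)
    parser.feed(html)
    own_host = urlsplit(page_url).hostname
    counts: Counter = Counter()
    for link in parser.links:
        host = urlsplit(link).hostname
        if host and host != own_host:
            counts[host] += 1
    return counts
```

Calling `outbound_links_by_host(html, post_url)` once per post and summing the resulting counters would give the per-host totals shown in the report.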
## Features
- Input a feed URL via UI or JSON.
- Concurrent fetching (httpx + asyncio).
- Extract links from each post page.
- Group by hostname; count occurrences.
- Heuristic RSS discovery:
  - Probe common feed endpoints (e.g. `/feed`, `/rss.xml`, `/atom.xml`, etc.).
  - Parse homepage `<link rel="alternate" ...>` for RSS/Atom.
  - Scan homepage `<a>` tags for `rss|atom|feed`.
  - Validate candidates with `feedparser`.
- Report UI:
  - Per-host card with counts.
  - **Bar** visual for how many links a host has.
  - **Top links** (if mentioned > 1).
  - Links list truncated with a **More** button.
  - RSS/Atom badge if found.
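
The discovery heuristics above could be sketched roughly as follows. This only builds the list of candidate feed URLs from a homepage; it does no network I/O. `FeedHintParser`, `candidate_feed_urls`, and the exact path list are hypothetical (the README names only `/feed`, `/rss.xml`, and `/atom.xml`), and matching `rss|atom|feed` against the `href` rather than the link text is an assumption.

```python
import re
from html.parser import HTMLParser
from urllib.parse import urljoin

# Assumed probe list; the app's real list may be longer.
COMMON_FEED_PATHS = ["/feed", "/rss.xml", "/atom.xml"]
FEED_MIME_TYPES = {"application/rss+xml", "application/atom+xml"}
FEED_HINT = re.compile(r"rss|atom|feed", re.IGNORECASE)


class FeedHintParser(HTMLParser):
    """Collect hrefs from feed <link> tags and feed-looking <a> tags."""

    def __init__(self) -> None:
        super().__init__()
        self.hrefs: list[str] = []

    def handle_starttag(self, tag, attrs):
        attr = {name: (value or "") for name, value in attrs}
        href = attr.get("href")
        if not href:
            return
        if tag == "link":
            rels = attr.get("rel", "").lower().split()
            if "alternate" in rels and attr.get("type", "").lower() in FEED_MIME_TYPES:
                self.hrefs.append(href)
        elif tag == "a" and FEED_HINT.search(href):
            self.hrefs.append(href)


def candidate_feed_urls(homepage_url: str, homepage_html: str) -> list[str]:
    """Return de-duplicated candidate feed URLs, common endpoints first."""
    parser = FeedHintParser()
    parser.feed(homepage_html)
    candidates = [urljoin(homepage_url, path) for path in COMMON_FEED_PATHS]
    candidates += [urljoin(homepage_url, href) for href in parser.hrefs]
    # dict.fromkeys drops duplicates while keeping first-seen order.
    return list(dict.fromkeys(candidates))
```

Each candidate would then still need to be fetched and validated. With `feedparser`, one plausible check is to call `feedparser.parse(...)` on the response body and accept the candidate only if the result has a non-empty `entries` list.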
## Run locally
```bash
python -m venv .venv
source .venv/bin/activate # Windows: .venv\Scripts\activate
pip install -r requirements.txt
uvicorn main:app --reload
```
Open: http://127.0.0.1:8000
## API
```
POST /api/analyze
Content-Type: application/json

{"feed_url": "https://example.com/feed.xml"}
```
Returns JSON with the summarized data.
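
For scripted use, a minimal client sketch using only the standard library might look like this. It assumes the app is running locally on the default `uvicorn` address; `build_analyze_request` and `analyze` are hypothetical helper names, and the shape of the returned JSON is not assumed.

```python
import json
import urllib.request


def build_analyze_request(feed_url: str, base_url: str = "http://127.0.0.1:8000"):
    """Build the POST /api/analyze request without sending it."""
    body = json.dumps({"feed_url": feed_url}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/api/analyze",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def analyze(feed_url: str, base_url: str = "http://127.0.0.1:8000", timeout: float = 60.0):
    """Send the request and return the decoded JSON response."""
    request = build_analyze_request(feed_url, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


# Example (requires the server to be running):
#   report = analyze("https://example.com/feed.xml")
#   print(json.dumps(report, indent=2))
```

The generous timeout is a guess: the server fetches every post in the feed before responding, so requests can take a while.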
## Notes / Caveats
- Only static HTML is parsed (no JS rendering).
- Some sites block bots; results may vary.
- For large feeds, you may wish to trim the number of posts (e.g., slice `post_urls` in `analyze_feed`).
- Consider adding caching (e.g., `aiocache`, Redis) if you'll run this frequently.
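
One way to keep large feeds manageable is to cap the number of posts and bound concurrency at the same time. The sketch below is not the app's actual `analyze_feed`; `fetch_limited`, the `fetch` callable, and both limits are hypothetical, and `fetch` stands in for whatever coroutine downloads a single post (for example, one built on `httpx`).

```python
import asyncio

MAX_POSTS = 20        # assumed cap on posts per feed
MAX_CONCURRENCY = 5   # assumed limit on simultaneous requests


async def fetch_limited(post_urls, fetch, max_posts=MAX_POSTS, max_concurrency=MAX_CONCURRENCY):
    """Fetch at most max_posts URLs, with at most max_concurrency in flight."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def guarded(url):
        # Wait for a free slot before starting the download.
        async with semaphore:
            return await fetch(url)

    # Slicing trims the feed; gather returns results in input order.
    return await asyncio.gather(*(guarded(url) for url in post_urls[:max_posts]))
```

A lower concurrency limit also tends to be gentler on sites that rate-limit or block aggressive clients.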