Initial Commit for rss-link-app

Analyze links from rss feeds
Waylon Walker 2025-09-03 20:22:39 -05:00
commit 060f998c59
8 changed files with 1837 additions and 0 deletions

README.md Normal file

@@ -0,0 +1,50 @@
# RSS Link Audit (FastAPI)
A FastAPI app that accepts an RSS/Atom feed URL, fetches each post's full HTML, extracts outbound links, groups them by hostname, **hunts for each host's RSS feed** (common endpoints + homepage discovery), and renders a stylish report using the **Royal Armory** palette.
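As a rough illustration of the extract-and-group step described above, here is a minimal sketch using only the standard library. The names `LinkCollector` and `group_outbound_links` are hypothetical and not taken from this repository, and the app itself may use a different HTML parser; treating "outbound" as "any http(s) link to a host other than the post's own" is also an assumption.

```python
from collections import Counter, defaultdict
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect absolute http(s) links from <a href="..."> tags."""

    def __init__(self, base_url: str) -> None:
        super().__init__()
        self.base_url = base_url
        self.links: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if not href:
            return
        # Resolve relative hrefs against the page URL.
        absolute = urljoin(self.base_url, href)
        # Drop mailto:, javascript:, and other non-web schemes.
        if urlparse(absolute).scheme in ("http", "https"):
            self.links.append(absolute)


def group_outbound_links(html: str, page_url: str) -> dict[str, Counter]:
    """Map hostname -> Counter of link URLs, skipping the page's own host."""
    collector = LinkCollector(page_url)
    collector.feed(html)
    own_host = urlparse(page_url).hostname
    grouped: dict[str, Counter] = defaultdict(Counter)
    for link in collector.links:
        host = urlparse(link).hostname
        if host and host != own_host:
            grouped[host][link] += 1
    return dict(grouped)
```

Summing a host's counter gives the per-host link count, and `Counter.most_common()` gives the links mentioned most often.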
## Features
- Input a feed URL via UI or JSON.
- Concurrent fetching (httpx + asyncio).
- Extract links from each post page.
- Group by hostname; count occurrences.
- Heuristic RSS discovery:
  - Probe common feed endpoints (e.g. `/feed`, `/rss.xml`, `/atom.xml`).
  - Parse homepage `<link rel="alternate" ...>` for RSS/Atom.
  - Scan homepage `<a>` tags for `rss|atom|feed`.
  - Validate candidates with `feedparser`.
- Report UI:
  - Per-host card with counts.
  - **Bar** visual for how many links a host has.
  - **Top links** (links mentioned more than once).
  - Links list truncated with a **More** button.
  - RSS/Atom badge if found.
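The first two discovery heuristics listed above can be sketched as a pure function that builds a list of candidate feed URLs for a host. This is an illustration under assumptions, not the app's actual code: `candidate_feed_urls` and `AlternateLinkFinder` are hypothetical names, the probe list holds only the three endpoints named in this README, and the ordering (guessed endpoints first, declared links second) simply follows the order of the list. Each candidate would still need to be fetched and validated, for example with `feedparser`.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# Endpoints named in this README; the app's real probe list may be longer.
COMMON_FEED_PATHS = ("/feed", "/rss.xml", "/atom.xml")
FEED_MIME_TYPES = {"application/rss+xml", "application/atom+xml"}


class AlternateLinkFinder(HTMLParser):
    """Collect hrefs of <link rel="alternate"> tags that advertise RSS/Atom."""

    def __init__(self) -> None:
        super().__init__()
        self.hrefs: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        # Attribute values can be None (e.g. a bare attribute), so normalize.
        attr = {name: (value or "") for name, value in attrs}
        rels = attr.get("rel", "").lower().split()
        mime = attr.get("type", "").lower().strip()
        if "alternate" in rels and mime in FEED_MIME_TYPES and attr.get("href"):
            self.hrefs.append(attr["href"])


def candidate_feed_urls(homepage_url: str, homepage_html: str) -> list[str]:
    """Return de-duplicated candidate feed URLs for one host."""
    finder = AlternateLinkFinder()
    finder.feed(homepage_html)
    guessed = [urljoin(homepage_url, path) for path in COMMON_FEED_PATHS]
    declared = [urljoin(homepage_url, href) for href in finder.hrefs]
    # dict.fromkeys keeps first-seen order while dropping duplicates.
    return list(dict.fromkeys(guessed + declared))
```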
## Run locally
```bash
python -m venv .venv
source .venv/bin/activate # Windows: .venv\Scripts\activate
pip install -r requirements.txt
uvicorn main:app --reload
```
Open: http://127.0.0.1:8000
## API
```
POST /api/analyze
Content-Type: application/json
{"feed_url": "https://example.com/feed.xml"}
```
Returns JSON with the summarized data.
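For calling the endpoint from a script, a minimal standard-library client might look like the sketch below. The helper names `build_analyze_request` and `analyze` are hypothetical, the base URL assumes the local `uvicorn` server from the section above, and the shape of the returned JSON is not documented here, so the sketch returns it as-is.

```python
import json
import urllib.request


def build_analyze_request(
    feed_url: str, base_url: str = "http://127.0.0.1:8000"
) -> urllib.request.Request:
    """Build the POST /api/analyze request without sending it."""
    body = json.dumps({"feed_url": feed_url}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/api/analyze",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def analyze(feed_url: str):
    """Send the request to a running server and return the decoded JSON."""
    request = build_analyze_request(feed_url)
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

With the server running, `analyze("https://example.com/feed.xml")` returns the parsed response body.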
## Notes / Caveats
- Only static HTML is parsed (no JS rendering).
- Some sites block bots; results may vary.
- For large feeds, you may wish to trim the number of posts (e.g., slice `post_urls` in `analyze_feed`).
- Consider adding caching (e.g., `aiocache`, Redis) if you'll run this frequently.
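One possible way to apply the trimming advice above is to slice the post list and cap the number of requests in flight with a semaphore. This is a sketch only: `fetch_limited`, its `fetch` callback, and the default limits are hypothetical, and how `analyze_feed` actually fetches posts may differ.

```python
import asyncio
from typing import Awaitable, Callable


async def fetch_limited(
    post_urls: list[str],
    fetch: Callable[[str], Awaitable[str]],
    max_posts: int = 20,
    max_concurrency: int = 5,
) -> dict[str, str]:
    """Fetch at most max_posts URLs, with at most max_concurrency in flight."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def guarded(url: str) -> tuple[str, str]:
        # The semaphore blocks here once max_concurrency fetches are running.
        async with semaphore:
            return url, await fetch(url)

    trimmed = post_urls[:max_posts]  # keep only the first max_posts entries
    results = await asyncio.gather(*(guarded(url) for url in trimmed))
    return dict(results)
```

In the app, `fetch` would be a coroutine that wraps an `httpx.AsyncClient` GET and returns the response text.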