scraper.py — Summary Map

UK News Scraper · Internal Training Reference · Python Beginners

📄 scraper.py
Section 1: Imports
Section 2: Configuration Constants
Section 3: Article Data Model
Section 4: Utility / Helper Functions
Sections 5–7: Per-Source Scrapers (BBC News · The Guardian · Independent · Sky News)
Section 8: Output Writers (.csv · .html · .json · .odt)
Section 9: Email HTML Builder
Section 10: main() & Entry Point
Execution Flow
main(): entry point
scrape_bbc(): RSS + per-page
scrape_guardian(): JSON API
scrape_independent(): RSS + fallback
scrape_sky_news(): RSS + fallback
save outputs: CSV · HTML · JSON · ODT
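The flow above can be sketched roughly as follows. This is a minimal illustration, not the actual contents of scraper.py: the four scraper names come from the map, but their bodies here are placeholders returning made-up data, and the run_scrapers() helper and its try/except isolation of each source are assumptions added for the example.

```python
import logging
from typing import Callable

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("scraper")


# Placeholder scrapers (hypothetical bodies). The real functions would
# fetch RSS feeds or call a JSON API instead of returning fixed data.
def scrape_bbc() -> list[dict]:
    return [{"source": "BBC News", "title": "Example headline"}]


def scrape_guardian() -> list[dict]:
    return [{"source": "The Guardian", "title": "Example headline"}]


def scrape_independent() -> list[dict]:
    # Simulate a source that is unreachable on this run.
    raise RuntimeError("simulated network failure")


def scrape_sky_news() -> list[dict]:
    return []


def run_scrapers(scrapers: list[Callable[[], list[dict]]]) -> list[dict]:
    """Call each scraper in turn; one failing source does not stop the rest."""
    articles: list[dict] = []
    for scraper in scrapers:  # functions stored in a list, called in a loop
        try:
            found = scraper()
        except Exception as exc:
            log.warning("%s failed: %s", scraper.__name__, exc)
            continue
        log.info("%s returned %d article(s)", scraper.__name__, len(found))
        articles.extend(found)
    return articles


def main() -> list[dict]:
    articles = run_scrapers(
        [scrape_bbc, scrape_guardian, scrape_independent, scrape_sky_news]
    )
    # The real script would save the CSV, HTML, JSON and ODT outputs here.
    return articles


if __name__ == "__main__":
    main()
```

Keeping the scrapers in a list means a new source can be added by writing one function and appending its name, without touching the loop.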
Key Python Concepts Used
@dataclass · try/except · list comprehension · dict.get(key, default) · defaultdict · f-strings · with open() as f: · nested functions · functions in variables · tuple unpacking · list slicing [:10] · ternary expression · Optional[T] · if __name__ == "__main__" · logging · CSS selectors · RSS / feedparser · polite scraping
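Several of these concepts fit into one short example. The sketch below is illustrative only: the Article fields, the describe() helper and the sample headlines are made up for this page and are not taken from scraper.py.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass
class Article:
    # Hypothetical fields; the real data model may differ.
    source: str
    title: str
    summary: Optional[str] = None  # Optional[T]: either a str or None


# Made-up sample data standing in for parsed feed entries.
raw_items = [
    {"source": "BBC News", "title": "Rail fares to rise"},
    {"source": "Sky News", "title": "Storm warning issued", "summary": "Gales expected."},
    {"source": "BBC News", "title": "Budget date announced"},
]

# List comprehension + dict.get(): a missing "summary" key becomes None.
articles = [
    Article(item["source"], item["title"], item.get("summary"))
    for item in raw_items
]

# defaultdict(list) creates an empty list the first time a source is seen.
by_source = defaultdict(list)
for article in articles:
    by_source[article.source].append(article.title)


def describe(article: Article) -> str:
    # Ternary expression picks the text; the f-string formats the line.
    detail = article.summary if article.summary else "no summary"
    return f"[{article.source}] {article.title} ({detail})"


# List slicing: keep at most the first 10 articles.
lines = [describe(a) for a in articles[:10]]

# Tuple unpacking: each item is a (key, value) pair.
for source, titles in by_source.items():
    print(f"{source}: {len(titles)} article(s)")
```

Here by_source["BBC News"] ends up holding two titles and by_source["Sky News"] one, without any "does this key exist yet?" check.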