BH_Scrubber is a specialized Python automation tool designed to scrape, aggregate, and classify global public holiday data from OfficeHolidays.com into a generated .ods spreadsheet. It supports both Solar and Fiscal calendars.
This project was built to solve a critical Workforce Optimization (WFO) challenge: efficiently tracking bank holidays across a multinational organization.
Historically, scheduling conflicts arose because all holidays were treated equally. Colleagues were often unaware that a holiday in a specific country might be Regional (affecting only a subset of the workforce) rather than National (affecting everyone).
BH_Scrubber addresses this by:
- Distinguishing Holiday Types: It clearly identifies and visually isolates National holidays (Amber) from Regional ones (Pink).
- Visualizing Support: It highlights supported countries (Green) and greys out unsupported ones (Grey), providing an instant visual status map for planners.
- Automating Updates: It eliminates manual data entry errors by scraping the latest data directly from a trusted source.
- 🛡️ Robust Data Estimation (NEW): If future year data (e.g., 2027) is missing, the tool intelligently attempts to estimate dates based on the previous year's patterns (marked clearly as
(Est)). - 📅 Dynamic Calendar Generation: Automatically builds a 365/366-day calendar grid based on your needs.
- Solar Calendar: Standard Jan 1 – Dec 31.
- Fiscal Calendar: Supports custom start offsets (e.g.,
+5or-5weeks) while maintaining a full year duration. - Multi-Year Scraping: Intelligent scraper automatically fetches data from adjacent years if a Fiscal year straddles two calendar years (e.g., 2026/2027).
- Intelligent Scraping: Parses
officeholidays.comto extract dates, names, and holiday types. - Smart Classification:
- National Holidays: Flagged and styled in Amber (#ffbf00).
- Regional Holidays: Detected and styled in Pink (#ffb6c1).
- Visual Status Indicators:
- ✅ Supported Countries: Rows colored Light Green.
- ❌ Unsupported/Excluded: Rows colored Medium Grey (Visual Blackout).
- ODS Integration: Directly modifies OpenDocument Spreadsheet (ODS) XML for high-performance updates without breaking existing formulas or structures.
- Timestamped Output: Generates unique output files (e.g.,
BH_List_20260128.ods) to preserve history.
-
Prepare the Input File:
- Ensure
country_List.odsis present (Columns: Supported, Region, SubRegion, Country). - To support a country, mark Column A as
Yes. To exclude it, markNo. - Ensure the country name in Column D has a hyperlink to its
officeholidays.compage.
- Ensure
-
Run the Scraper:
python3 bh_Scrubber.py
-
Follow Prompts:
- Enter the target year (e.g.,
2026). - Fiscal Calendar: Answer
yif needed, then specify the offset in weeks (e.g.,+4for forward,-2for backward). - Review the configuration and confirm to start.
- Enter the target year (e.g.,
-
View Results:
- The tool saves a new file:
BH_List_YYYYMMDD.ods(based on execution date). - Open it in LibreOffice Calc or Excel to view the populated and styled calendar.
- The tool saves a new file:
bh_Scrubber.py: Main script (v2.1).country_List.ods: Template spreadsheet.BH_Scrubber.md: Detailed technical documentation.
Gil Alowaifi
Disclaimer: This tool is for internal workforce planning purposes. Data is sourced from public websites.