Skip to content

Migrate docs from monarch-ingest to monarch-app#1258

Merged
kevinschaper merged 9 commits intomainfrom
consolidate-ingest-docs
Mar 13, 2026
Merged

Migrate docs from monarch-ingest to monarch-app#1258
kevinschaper merged 9 commits intomainfrom
consolidate-ingest-docs

Conversation

@kevinschaper
Copy link
Copy Markdown
Member

@kevinschaper kevinschaper commented Feb 13, 2026

Summary

Migrates KG documentation from monarch-ingest into monarch-app and adds infrastructure for automatically fetching source documentation from KozaHub modular ingest repos.

What this PR does

  1. Migrates existing docs from monarch-ingest — KG build process, modeling principles, creating an ingest guide
  2. Adds new doc pages — Neo4J, MonarchR, Licensing
  3. Adds a fetch-docs system — a manifest (docs/sources-manifest.yaml) + script (scripts/fetch-source-docs.py) that pulls README files from each KozaHub ingest repo into the Sources section at doc build time
  4. Rewrites the KG build process doc to reflect the current pipeline (koza graph operations, DuckDB, modular ingests, closurizer, etc.) and replaces the static architecture.png with a mermaid diagram
  5. Reorganizes mkdocs.yaml nav to foreground KG data sources and build process
  6. Updates frontend doc links in PageDocumentation.vue to point to the new structure

What needs human review

New code — please review carefully

File What it does
scripts/fetch-source-docs.py Fetches README.md from each repo listed in the manifest. Runs as part of make docs.
docs/sources-manifest.yaml Declares which repos to fetch from, which branch, and what filename to write.
Makefile (fetch-docs target) Wires fetch-docs into the docs build. This means make docs now requires network access to GitHub.
mkdocs.yaml Nav restructure + mermaid support added.
frontend/.../PageDocumentation.vue Updated link structure for the documentation page in the UI.

Rewritten content — worth a read-through

File Notes
docs/KG-Build-Process/kg-build-process.md Substantially rewritten to match current pipeline. Replaces architecture.png with mermaid.
docs/index.md Landing page rewrite.
docs/Neo4J/index.md New page.
docs/MonarchR/index.md New page.
docs/Licensing/index.md New page.

Migrated as-is from monarch-ingest — low review priority

File Notes
docs/Principles/modeling-principles.md Lifted from monarch-ingest, typos fixed.
docs/Creating-an-Ingest.md Lifted from monarch-ingest.
docs/Sources/index.md Simple index page for the sources section.
docs/KG-Build-Process/architecture.png Old diagram, kept for reference but no longer rendered (replaced by mermaid).

@netlify
Copy link
Copy Markdown

netlify Bot commented Feb 13, 2026

Deploy Preview for monarch-app canceled.

Name Link
🔨 Latest commit 2403845
🔍 Latest deploy log https://app.netlify.com/projects/monarch-app/deploys/69af4e7cb7b99d0008f23bd9

@claude
Copy link
Copy Markdown

claude Bot commented Feb 13, 2026

claude review

Details ## Pull Request Review: Migrate docs from monarch-ingest to monarch-app

Summary

This PR migrates documentation from monarch-ingest to monarch-app, adding KG build process documentation, modeling principles, and a framework for automatically fetching source documentation from ingest repositories.

Code Quality & Best Practices

Strengths:

  • Well-structured documentation with clear organization
  • Good use of MkDocs navigation structure in mkdocs.yaml
  • Appropriate .gitignore entries for generated files
  • YAML manifest (sources-manifest.yaml) follows good declarative configuration practices

Issues Found:

1. CRITICAL: Missing Python Script 🔴

The Makefile references scripts/fetch-source-docs.py at line 93:

fetch-docs:
	$(RUN) python $(ROOTDIR)/scripts/fetch-source-docs.py

However, this script does not exist in the PR. This will cause the make fetch-docs and make docs targets to fail.

Recommendation: Add the scripts/fetch-source-docs.py script that reads docs/sources-manifest.yaml and fetches README files from the specified repositories.

2. Spelling Error 📝

In docs/Principles/modeling-principles.md:11-12:

  • "Authoratative" should be "Authoritative" (appears twice)
  • "seperate" should be "separate" (line 13)

3. Spelling Error in KG Build Process 📝

In docs/KG-Build-Process/kg-build-process.md:6:

  • "indepent" should be "independent"
  • "then" should be "them"

Line 67: "danging" should be "dangling"

4. Inconsistent Indentation in mkdocs.yaml

The mkdocs.yaml file mixes spaces and maintains consistent 2-space indentation, which is good. However, verify that all navigation entries are properly aligned.

Potential Bugs & Issues

  1. Broken Build Process: The docs target depends on fetch-docs, but without the Python script, documentation builds will fail in CI/CD pipelines.

  2. Generated Files Not Ignored Properly: The .gitignore pattern docs/Sources/*.md with exception !docs/Sources/index.md is correct, but the index.md file is included in the PR, which is good. Ensure the fetch script respects this pattern.

  3. Dead Links Risk: The documentation references multiple external GitHub repositories and URLs. While most appear valid, the following should be verified:

    • Architecture image path in kg-build-process.md (line 3: architecture.png)
    • All external repository links in sources-manifest.yaml

Performance Considerations

  1. Network I/O: The fetch-docs script (when implemented) will make 25+ GitHub API calls. Consider:

    • Adding rate limiting/retry logic
    • Implementing caching to avoid unnecessary fetches
    • Using GitHub API tokens to increase rate limits
    • Making requests in parallel where possible
  2. Build Time: The docs target now depends on fetch-docs, which will add network latency to the build. Consider:

    • Making fetch-docs optional or separate from local doc builds
    • Caching fetched documentation with timestamps

Security Concerns

  1. Untrusted Content: The fetch script will pull README content from external repositories. Recommendations:

    • Validate that fetched content is actually markdown
    • Sanitize any HTML in the markdown if processed
    • Verify repository ownership before fetching (all should be monarch-initiative org)
    • Consider pinning to specific commits rather than branch heads for reproducibility
  2. Credentials: If the script uses GitHub API, ensure tokens are:

    • Loaded from environment variables, not hardcoded
    • Have minimal required permissions (read-only public repos)
    • Properly documented in setup instructions

Test Coverage

Missing:

  • No tests for the fetch-source-docs.py script (which doesn't exist yet)
  • No validation that the YAML manifest is well-formed
  • No checks that generated markdown files are valid

Recommendations:

  1. Add unit tests for the fetch script when implemented
  2. Add a test that validates sources-manifest.yaml schema
  3. Consider adding a CI check that runs make fetch-docs and verifies output
  4. Add tests to ensure all links in migrated docs are valid

Documentation Quality

Positive:

  • Clear explanation of the KG build process
  • Good architectural diagram inclusion
  • Well-organized sources manifest
  • Helpful inline comments in YAML files

Suggestions:

  1. Add README or comment explaining how to add new sources to the manifest
  2. Document the fetch-docs process in a contributing guide
  3. Add troubleshooting section for common fetch failures

Additional Recommendations

  1. Implementation Priority: Before merging, implement the missing scripts/fetch-source-docs.py. A basic implementation should:

    # Read sources-manifest.yaml
    # For each source:
    #   - Fetch README from repo/branch
    #   - Save to docs/Sources/{filename}
    # Handle errors gracefully
  2. CI/CD: Add a GitHub Actions workflow to:

    • Run make fetch-docs periodically to keep docs fresh
    • Verify the fetch script works on PR builds (or mock it)
  3. Consistency: Ensure the new doc structure aligns with existing monarch-app documentation conventions

  4. Changelog: Consider adding a CHANGELOG entry for this documentation migration

Verdict

Do Not Merge until the missing scripts/fetch-source-docs.py is added. The PR is otherwise well-structured with good documentation content, but the critical missing script will break the build process.

After adding the script and fixing the spelling errors, this will be a solid addition to the documentation infrastructure.


Files to Fix:

  • Add scripts/fetch-source-docs.py
  • Fix spelling: "Authoratative" → "Authoritative" in modeling-principles.md
  • Fix spelling: "seperate" → "separate" in modeling-principles.md
  • Fix spelling: "indepent" → "independent" in kg-build-process.md
  • Fix spelling: "then" → "them" in kg-build-process.md
  • Fix spelling: "danging" → "dangling" in kg-build-process.md
  • Add tests for fetch script
  • Consider adding CI workflow for periodic doc updates

@codecov
Copy link
Copy Markdown

codecov Bot commented Feb 13, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 73.52%. Comparing base (3453d6a) to head (2403845).
⚠️ Report is 29 commits behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #1258   +/-   ##
=======================================
  Coverage   73.51%   73.52%           
=======================================
  Files          95       95           
  Lines        3417     3418    +1     
=======================================
+ Hits         2512     2513    +1     
  Misses        905      905           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@claude
Copy link
Copy Markdown

claude Bot commented Feb 17, 2026

claude review

Details ## Follow-up Review (after latest commit: rework mkdocs.yaml top level order)

The mkdocs.yaml navigation reordering looks reasonable — putting KG Data Sources and KG Build sections first makes sense given this PR's focus.

Remaining Blocker

The scripts/fetch-source-docs.py script is still missing from the PR. The Makefile references it in the fetch-docs target. Without this script, both make fetch-docs and make docs will fail. This needs to be added before merging.

Minor Typos (still unaddressed)

  • docs/Principles/modeling-principles.md: 'Authoratative' should be 'Authoritative', 'seperate' should be 'separate'
  • docs/KG-Build-Process/kg-build-process.md: 'indepent' should be 'independent', 'store then' should be 'store them', 'danging edges' should be 'dangling edges' (line 67)

mkdocs.yaml Navigation Note

The new ordering places FastAPI before monarch-py. If monarch-py is the primary library documentation, keeping it higher in the nav may be worth considering — though this is a judgment call for the team.

The documentation content itself (KG build process, modeling principles, sources manifest) is well-written and a useful addition. Once the missing script is added, this PR will be in good shape to merge.

@claude
Copy link
Copy Markdown

claude Bot commented Feb 19, 2026

claude review

Details ## PR Review: Migrate docs from monarch-ingest to monarch-app

This PR is a solid step toward consolidating documentation. The overall structure is well-thought-out, and the manifest-driven approach for fetching ingest READMEs is clean. A few issues to address before merging:


Critical: Missing scripts/fetch-source-docs.py

The PR adds a make fetch-docs target in the Makefile that calls the script at scripts/fetch-source-docs.py, but this script is not included in the PR. The diff contains 9 changed files and the script is not among them. The local scripts/ directory only has generate_category_enums.py, generate_fixtures.py, and get_publications.py.

This means both make fetch-docs and make docs will fail immediately, since docs now depends on fetch-docs:

docs: install-backend docs/Data-Model fetch-docs  # fetch-docs calls the missing script

The script needs to be added to the PR. Based on the manifest structure (docs/sources-manifest.yaml), it should read each entry and fetch the corresponding README.md from GitHub raw URLs, writing them into docs/Sources/.


Bug: MMRRC missing from mkdocs.yaml nav

The manifest (docs/sources-manifest.yaml) includes mmrrc-ingest -> mmrrc.md, but mmrrc.md is not listed in the mkdocs.yaml nav. The file will be fetched and written to docs/Sources/mmrrc.md but won't be reachable through the documentation site navigation.


make docs now requires network access

Adding fetch-docs as a dependency of docs means building documentation now makes network calls to GitHub. This could be problematic in offline environments, CI pipelines subject to GitHub API rate limits, or restricted build environments.

Consider whether fetch-docs should remain a separate, manually-run step rather than an automatic dependency of docs. The .gitignore entries already treat the fetched files as optional generated artifacts, which suggests this intent. Removing the fetch-docs dependency from docs would restore the original offline-capable build while still allowing make fetch-docs to be run explicitly.


Typos in migrated content

A few typos carried over from the source docs worth fixing during migration:

docs/KG-Build-Process/kg-build-process.md:

  • "A weekly job indepent" -> "independent"
  • "store then on a cloud bucket" -> "store them on a cloud bucket"
  • "danging and denormalized edges" -> "dangling and denormalized edges"

docs/Principles/modeling-principles.md:

  • "Authoratative Source" (heading and body) -> "Authoritative Source"
  • "seperate from edge ingests" -> "separate from edge ingests"

Suggestion: Reduce manifest/nav sync burden

Currently, adding a new source requires updating two files: docs/sources-manifest.yaml (for fetching) and mkdocs.yaml (for navigation). This is already evident from the MMRRC omission above. Consider having the fetch script auto-generate the nav section for KG Data Sources in mkdocs.yaml, or at least add a comment to the manifest reminding contributors to also update the nav.


Minor: docs/sources-manifest.yaml location

This file is config for the fetch script rather than documentation content. scripts/sources-manifest.yaml or a top-level location might be clearer, though this is a minor preference.


Overall the approach is good and the new content is a welcome addition. The missing script is a blocker — once that is added and the MMRRC nav entry is included, this should be in good shape.

…rontend doc links

- Rewrite docs/index.md as site-wide overview with links to all sections
- Add Neo4J link-out page with example Cypher queries
- Add Monarch R link-out page with docs/repo links
- Add Licensing page with BSD 3-Clause / CC0 recommendations
- Update mkdocs.yaml nav and site_name to 'Monarch KG Documentation'
- Update PageDocumentation.vue: replace old monarch-ingest link with consolidated doc links
- Fix typos in kg-build-process.md and modeling-principles.md

Co-Authored-By: Claude Opus 4.6 <[email protected]>
@claude
Copy link
Copy Markdown

claude Bot commented Feb 27, 2026

claude review

Details ## PR Review: Migrate docs from monarch-ingest to monarch-app

This is a solid docs consolidation effort — pulling ingest source docs into this repo and updating the documentation site structure. A few issues to address before merging:


Critical: Missing script

The Makefile references scripts/fetch-source-docs.py in the new fetch-docs target, but this script is not included in the PR. Neither make fetch-docs nor make docs will work until this script is added.

The docs/sources-manifest.yaml exists, but the script that reads it and fetches READMEs from each repo is absent. This should be added before the PR is merged.


MMRRC missing from mkdocs.yaml navigation

docs/sources-manifest.yaml includes an entry for mmrrc-ingest mapped to mmrrc.md, but mkdocs.yaml has no corresponding entry in the KG Data Sources nav section. The file will be fetched but unreachable via site navigation. Either add it to the nav or remove it from the manifest.


docs target now requires network access

Making docs unconditionally depend on fetch-docs means every documentation build fetches READMEs from ~24 external GitHub repositories. This could slow down CI builds, cause failures in offline environments, or hit GitHub rate limits without authentication. Consider keeping fetch-docs as a standalone target rather than a dependency of docs, or at minimum document the network requirement.


Minor: Typo in docs/Principles/modeling-principles.md

Line 12: "authoratative" should be "authoritative".


Minor: architecture.png is a committed binary

The image is committed directly with no pointer to the source. A comment in kg-build-process.md noting where the source diagram lives would help future maintainers keep it in sync.


Positive notes

  • The manifest-driven approach (fetching upstream READMEs at doc-build time) is clean and maintainable.
  • The .gitignore change correctly excludes auto-fetched Sources/*.md files while preserving index.md.
  • The PageDocumentation.vue updates are clear and the new link structure is logical.
  • The new docs pages (Neo4J, MonarchR, Licensing, KG-Build-Process, Modeling Principles) are well-written.
  • The mkdocs.yaml reorganization makes the broader KG context more visible.

The main blocker is the missing fetch-source-docs.py script. Once that is added and the MMRRC nav entry is addressed, this looks ready to go.

The script was missing from the repo, causing the build-docs CI job
to fail. Also fixes "authoratative" and "danging" typos in migrated docs.

Co-Authored-By: Claude Opus 4.6 <[email protected]>
@claude
Copy link
Copy Markdown

claude Bot commented Mar 7, 2026

claude review

Details ## PR Review: Migrate docs from monarch-ingest to monarch-app

Overall this is a well-structured documentation migration. The approach of auto-fetching upstream READMEs via a manifest + script is clean, and the .gitignore correctly excludes the generated files. A few things worth addressing:


Issues

1. MMRRC inconsistency between manifest and nav

docs/sources-manifest.yaml still includes mmrrc-ingest, but mkdocs.yaml has no MMRRC entry in the navigation (one commit message explicitly says "remove mmrrc, adding it will be its own thing"). The script will fetch and write docs/Sources/mmrrc.md but it won't appear in the nav. This will likely produce an mkdocs warning about an unlinked file. Either remove it from the manifest now, or add a nav entry — whichever matches the intended state.

2. make docs now requires network access

docs: install-backend docs/Data-Model fetch-docs

Chaining fetch-docs into the main docs target means offline or air-gapped builds will fail. Consider whether this is intentional. A common pattern is to keep the fetch step optional/separate and let the CI pipeline call it explicitly (e.g. make fetch-docs && make docs), so local docs builds aren't broken by network unavailability.

3. No GitHub token support — potential CI rate-limiting

scripts/fetch-source-docs.py makes ~23 unauthenticated requests to raw.githubusercontent.com. GitHub's unauthenticated rate limit is 60 req/hr per IP; on a shared CI runner that IP may already be partially used. Adding optional GITHUB_TOKEN support is straightforward and future-proofs against adding more sources:

token = os.environ.get("GITHUB_TOKEN")
req = urllib.request.Request(url)
if token:
    req.add_header("Authorization", f"token {token}")

4. urllib.error.HTTPError catch is redundant

HTTPError is a subclass of URLError, so listing both in the except clause is redundant. The HTTPError branch will never be reached independently:

# Current (redundant)
except (urllib.error.URLError, urllib.error.HTTPError, OSError) as exc:

# Simpler
except (urllib.error.URLError, OSError) as exc:

More importantly, HTTPError carries an HTTP status code (e.g. 404 for a missing README). Catching it separately and logging exc.code / exc.reason would give much more actionable error messages than just printing the exception string.

5. No manifest entry validation

If a manifest entry is missing repo, branch, or filename, the script will abort with an unhelpful KeyError traceback. A small guard would make failures easier to diagnose:

for entry in sources:
    repo = entry.get("repo")
    branch = entry.get("branch")
    filename = entry.get("filename")
    if not all([repo, branch, filename]):
        print(f"ERROR: malformed manifest entry: {entry}", file=sys.stderr)
        failed += 1
        continue

Minor / Nits

  • Hard-coded README.md: The RAW_URL_TEMPLATE only tries README.md. A handful of repos use readme.md (case-sensitive on Linux) or README.rst. Not an issue today but worth a note in the manifest format comment.
  • docs/index.md still refers to #monarch-py anchor in the table of contents (- **[monarch-py](#monarch-py)**). The heading is now ## monarch-py under a new top-level structure — confirm the anchor renders correctly in mkdocs.
  • The kg-build-process.md file contains several links to the old monarch-ingest Jenkinsfile and scripts. That's fine as cross-repo links, but worth confirming those files still exist at those paths if the monarch-ingest repo has been refactored.

What looks good

  • The .gitignore pattern (docs/Sources/*.md + !docs/Sources/index.md) is exactly right for generated files.
  • The manifest YAML format is self-documenting with good inline comments.
  • Script exit code behavior (exit 1 on any failure) is correct for CI.
  • The PageDocumentation.vue update cleanly replaces the old monarch-ingest link with the consolidated doc site.
  • Navigation restructuring in mkdocs.yaml is much more organized than before.

@kevinschaper kevinschaper marked this pull request as ready for review March 7, 2026 00:47
kevinschaper and others added 2 commits March 6, 2026 20:07
Update to reflect koza graph operations merge (join → deduplicate →
normalize → prune), modular ingests, DuckDB as central artifact,
closurizer for denormalization, Neo4j CSV export, SQLite via DuckDB
extension, and Solr loading from DuckDB with 4 cores. Replace static
architecture.png with mermaid diagram. Enable mermaid in mkdocs config.

Co-Authored-By: Claude Opus 4.6 <[email protected]>
@claude
Copy link
Copy Markdown

claude Bot commented Mar 7, 2026

claude review

Details PR Review: Migrate docs from monarch-ingest to monarch-app

Overall this is a solid, well-structured PR. The mermaid diagram is a great addition and fetch-source-docs.py is clean. A few things worth addressing:

Issue 1: mmrrc-ingest inconsistency (manifest vs nav)
mmrrc-ingest appears in docs/sources-manifest.yaml (so make fetch-docs writes docs/Sources/mmrrc.md) but is missing from mkdocs.yaml nav. The file will be fetched but never linked in the built site. Either remove it from the manifest or add it to the nav.

Issue 2: Orphaned binary file
docs/KG-Build-Process/architecture.png is added as a new binary, but kg-build-process.md uses a mermaid diagram and has no PNG reference. The file appears unused. Recommend removing it to avoid unnecessary binary weight in the repo.

Issue 3: fetch-docs as unconditional docs prerequisite
Every make docs run now triggers network calls to GitHub for 23+ repos. This risks CI flakiness on transient failures and breaks offline builds. Consider keeping fetch-docs as a separate optional step rather than wiring it into the docs target.

Minor / Non-blocking

  • HTTPError is a subclass of URLError in fetch-source-docs.py so listing both in the except clause is redundant (not a bug).
  • load_manifest calls sys.exit(1) directly; prefer raising an exception and handling it in main() for testability.
  • No manifest entry validation: a malformed entry raises a bare KeyError with no context about which entry failed.
  • kg-phenio uses branch: master in the manifest; will silently fail if that repo migrates to main.

Summary

Severity Item
Should fix mmrrc fetched by manifest but absent from nav
Should fix Orphaned architecture.png binary
Consider fetch-docs as mandatory docs dep risks CI flakiness
Nit Redundant HTTPError catch; sys.exit in helper

The documentation content (KG build process, modeling principles, Neo4J, MonarchR, licensing) is accurate and well-written. PageDocumentation.vue updates are clean. Great work consolidating these docs from monarch-ingest.

Generated with Claude Code

@claude
Copy link
Copy Markdown

claude Bot commented Mar 7, 2026

claude review

Details PR Review: Migrate docs from monarch-ingest to monarch-app

Overall this is a well-structured PR. The doc consolidation makes good sense, the manifest-driven fetch approach is clean, and the new pages are informative. A few things worth addressing before merging:


BUG: mmrrc is in the manifest but absent from mkdocs.yaml nav

docs/sources-manifest.yaml includes an entry for monarch-initiative/mmrrc-ingest (producing mmrrc.md), but mkdocs.yaml has no corresponding nav entry under KG Data Sources. The file will be fetched and written to disk but will never appear in the rendered docs navigation. An 'MMRRC: Sources/mmrrc.md' line needs to be added to the nav.


ISSUE: fetch-docs as a mandatory docs dependency breaks offline builds

Running 'make docs' now unconditionally requires network access to GitHub. Anyone building locally without internet, or a CI job from a cached checkout, will fail. Consider keeping fetch-docs standalone and removing it from the docs dependency chain. Alternatively, add a 'make docs-full' target that chains both.


ISSUE: mkdocs.yaml nav references gitignored files

The .gitignore correctly excludes docs/Sources/*.md (except index.md), but mkdocs.yaml explicitly lists all 23 source files in the nav. Running 'mkdocs build' without first running 'make fetch-docs' will abort with missing-file errors. The docs/Sources/index.md already notes this requirement; a matching comment in mkdocs.yaml would also help contributors.


ISSUE: fetch-source-docs.py has no GitHub auth or retry logic

The script fetches 23 files from raw.githubusercontent.com with no authentication. GitHub unauthenticated rate limit is 60 req/hr, which can be exceeded in shared CI. There is also no retry for transient timeouts, so a single flaky request fails the whole build. Suggest supporting an optional GITHUB_TOKEN env var as an Authorization Bearer header and adding a simple single-retry.


MINOR: HTTPError is redundant in the except tuple

urllib.error.HTTPError is a subclass of urllib.error.URLError, so listing both is redundant. (urllib.error.URLError, OSError) is sufficient.


MINOR: trailing blank line in docs/Principles/modeling-principles.md

The file ends with an extra blank line. Not a blocker.


Positive notes:

  • YAML manifest approach is clean and easy to extend.
  • yaml.safe_load is used correctly (not yaml.load).
  • .gitignore correctly excludes generated files while keeping index.md tracked.
  • The Mermaid diagram in kg-build-process.md is excellent, and the pymdownx.superfences config to render it is correct.
  • The frontend PageDocumentation.vue update is clean and the new links are accurate.
  • Exit code 1 on any fetch failure is the right behavior for CI.

@kevinschaper kevinschaper requested a review from AO33 March 9, 2026 19:08
@AO33 AO33 marked this pull request as draft March 9, 2026 22:28
Comment thread mkdocs.yaml
- Fix bug: add MMRRC to mkdocs.yaml nav (was in manifest but missing from nav)
- Decouple fetch-docs from docs target; add docs-full target that chains both
- Add comment in mkdocs.yaml warning that Sources/*.md files require make fetch-docs
- Remove trailing blank line in docs/Principles/modeling-principles.md

Co-Authored-By: Claude Sonnet 4.6 <[email protected]>
@claude
Copy link
Copy Markdown

claude Bot commented Mar 9, 2026

PR Review: Migrate docs from monarch-ingest to monarch-app

Overall this is a well-structured PR. The mermaid diagram is a nice upgrade over the static PNG, and the fetch-docs system is a smart approach to keeping source documentation in sync. A few things worth addressing before merge:

Potential CI breakage (most important): The source files under docs/Sources/ are gitignored but mkdocs.yaml lists them explicitly in the nav. A plain make docs without running fetch-docs first will fail with missing-file errors. Does CI call make docs or make docs-full? If any existing workflow calls make docs directly it will break after merge. Worth auditing .github/workflows/ and updating CI to use docs-full.

Nav / manifest sync risk: Both mkdocs.yaml nav and sources-manifest.yaml must list every source. Nothing enforces they stay in sync — a contributor who adds to the manifest but forgets the nav will get a silently broken nav. Consider mkdocs-gen-files + mkdocs-literate-nav to auto-generate the Sources nav from the manifest, or a CI check that diffs the filename lists.

fetch-source-docs.py issues:

  • urllib.error.HTTPError is a subclass of URLError and is already caught by it — listing both is redundant.
  • No retry logic: a transient GitHub network hiccup will fail the entire build. 2-3 retries with short backoff would improve CI robustness.
  • README.md filename is hardcoded. If an upstream repo uses different casing it will silently 404. A comment noting this assumption would help, or add an optional readme_path field to the manifest.

architecture.png: Committed but not referenced in any markdown. Either add a link with a note explaining it is the old diagram (superseded by the mermaid), or remove it to avoid confusion.

PageDocumentation.vue Sources link: Will 404 if docs were built without running fetch-docs. Fine as long as production always uses docs-full — worth confirming in the deployment workflow.

Minor: docs/sources-manifest.yaml is a build tool input, not a doc page — placing it in scripts/ alongside fetch-source-docs.py might be more intuitive. Not a blocker.

Content quality: The mermaid diagram accurately reflects the pipeline, the Neo4j Cypher examples are useful, the KG build process doc is a genuine improvement over what existed before, and the Licensing page is a welcome addition.

Summary: The main concern is the make docs vs make docs-full CI distinction. Confirming CI is updated would make this ready to merge.

@AO33 AO33 marked this pull request as ready for review March 9, 2026 22:54
@kevinschaper kevinschaper merged commit fa75d33 into main Mar 13, 2026
14 checks passed
@kevinschaper kevinschaper deleted the consolidate-ingest-docs branch March 13, 2026 20:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants