[DO NOT MERGE - EXPERIMENT] Import markdown to AsciiDoc #136
[DO NOT MERGE - EXPERIMENT] Import markdown to AsciiDoc #136
Conversation
Tactical bridge tool for importing collaboratively-edited markdown (from Google Docs) back into AsciiDoc modules. Matches sections to existing modules via docs.redhat.com URL fragment IDs, produces a match manifest, and feeds into the existing writing/commit/MR pipeline. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Adds a new workflow step that splits an edited markdown file by headings, matches sections to existing AsciiDoc modules via URL fragment IDs, and produces a match manifest for the writing step to apply updates. - New skill: docs-workflow-import-markdown with Python matching script - New workflow: docs-import.yaml (import → prepare-branch → writing → commit → create-mr) - Extended docs-workflow-writing with import mode prompt template - Extended build_writing_args.sh with --import-from flag - Updated orchestrator with --markdown flag and import step arg mapping - Handles both single-quote and double-quote ID anchors in .adoc files Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The docs-writer agent was rewriting prose instead of just converting markdown syntax to AsciiDoc. Added explicit instructions to preserve the author's original wording and a concrete syntax conversion table. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add wire_assemblies.py as a post-processing step after the writing agent in import mode. The script reads the match manifest, parses original assembly files to learn which sections were inlined vs included, and rebuilds each assembly preserving the same structure. This replaces the unreliable agent-driven approach where the writing agent was responsible for adding include:: directives — it failed to do so in the custom-models-3.4 run (0 of 16 new modules wired). The script handles: - Inline vs include detection from original assembly structure - Title-based module lookup when heading-derived filenames differ - Leveloffset preservation from original assembly includes - New assembly creation and master.adoc wiring - Missing module detection and reporting Also updates the import mode prompt to: - Skip upstream modules (never modify files under upstream/) - Forbid the agent from touching include:: directives - Restrict new module creation to modules/ only Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…tching
Headings like "[5.5.](url) Training Hub in OpenShift AI" only used the
link text ("5.5.") for matching, discarding the actual title after the
link. This caused the section to match with an empty clean heading,
preventing correct module resolution.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Resolve merge conflict in docs-orchestrator/SKILL.md argument docs — keep the branch's --create-jira wording and --markdown flag, restore main's --create-merge-request flag. Bump plugin version to 0.0.56. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
|
Note Currently processing new changes in this PR. This may take a few minutes, please wait... ⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Enterprise Run ID: 📒 Files selected for processing (10)
✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Review rate limit: 0/1 reviews remaining, refill in 60 minutes.Comment Tip CodeRabbit can use Trivy to scan for security misconfigurations and secrets in Infrastructure as Code files.Add a .trivyignore file to your project to customize which findings Trivy reports. |
|
Caution Failed to replace (edit) comment. This is likely due to insufficient permissions or the comment being deleted. Error details |
There was a problem hiding this comment.
Actionable comments posted: 9
🧹 Nitpick comments (1)
specs/2026-04-22-import-markdown-design.md (1)
148-169: ⚡ Quick winUse fully qualified skill names in the workflow example.
The snippet currently uses bare names (
docs-workflow-*). Preferdocs-tools:docs-workflow-*so copied examples resolve reliably via plugin registry.Based on learnings: “Always use fully qualified
plugin:skillnames … in docs … inline text references, and cross-references between skills.”Also applies to: 172-172
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@specs/2026-04-22-import-markdown-design.md` around lines 148 - 169, Replace bare skill identifiers with fully qualified plugin:skill names in the workflow snippet so examples resolve via the plugin registry; update skill entries like the top-level skill for the job (`docs-workflow-import-markdown`) and the per-step `skill:` values for prepare-branch, writing, commit, and create-mr to use the `docs-tools:docs-workflow-*` form (and apply the same change noted for the other occurrence at the later reference). Ensure the skill keys are the only change so inputs and descriptions remain untouched.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In
`@plugins/docs-tools/skills/docs-workflow-import-markdown/scripts/import_markdown.py`:
- Line 16: Remove the unused import by deleting the top-level "import os"
statement, and fix the three Ruff line-length failures by breaking the overlong
expressions into shorter pieces: for long string literals use implicit
concatenation or assign parts to intermediate variables/constants, for long
function calls or parameter lists wrap arguments across multiple lines with
parentheses, and for long comprehensions or chained calls split into helper
variables or smaller functions; re-run Ruff/CI to confirm all violations are
cleared.
- Around line 122-164: In grep_for_id_stem and search_by_heading_text, stop
relying on prefix-only grep patterns and instead (1) use your existing grep
calls only to narrow candidate files, (2) open each candidate file and extract
the full id attributes with a regex like r'\[id=["\']([^"\']+)["\']\]' and
confirm the left-hand stem by splitting on '_' and comparing exactly to stem
(i.e., extracted_id.split('_',1)[0] == stem) in grep_for_id_stem, and (3) in
search_by_heading_text read each candidate file and parse the title line that
begins with "= " and compare the full extracted title string exactly to the
provided heading_text (replace the current escaped/pattern usage); update
functions grep_for_id_stem and search_by_heading_text to return only files that
pass these exact equality checks.
In
`@plugins/docs-tools/skills/docs-workflow-import-markdown/scripts/wire_assemblies.py`:
- Around line 120-127: The variable footer_start is assigned but never used;
remove the dead assignment and any related unused initialization (delete the
line that sets footer_start = None and any subsequent redundant references)
inside the loop that iterates over lines (the for i, line in enumerate(lines):
block) so the unused-variable warning from Ruff is resolved while leaving
current_inline_heading and current_inline_lines intact.
- Around line 756-767: The code appends new top-level assemblies at the end when
after_path is None because it falls back to known_assembly_paths[-1]; instead,
preserve document order by falling back to the first known assembly
(known_assembly_paths[0]) so inserts that occur before the first matched
assembly go before it. In the loop that iterates new_assemblies_to_insert (using
variables new_assemblies_to_insert, known_assembly_paths, master, master_rel,
args.dry_run), replace the fallback after = item["after_path"] or
known_assembly_paths[-1] with using known_assembly_paths[0] (or otherwise
compute an appropriate first-known fallback) so new includes retain their
original ordering rather than being appended to the end.
- Around line 611-631: find_master_adoc can return a non-root master.adoc but
insert_assembly_in_master always writes a hardcoded
"include::assemblies/<file>[...]" which breaks relative paths; change
insert_assembly_in_master to compute the include path relative to
master_path.parent (use Path(master_path).parent and generate a relative path
from that directory to new_assembly_path), normalize separators to forward
slashes and strip any leading "./", then use that relative path in the include
directive instead of the hardcoded assemblies/<file>; update references to
new_filename/new_include accordingly and keep insertion logic around
after_assembly_path unchanged.
In `@plugins/docs-tools/skills/docs-workflow-import-markdown/SKILL.md`:
- Around line 46-64: The SKILL.md must document that this step emits a sidecar
JSON at <base-path>/import-markdown/step-result.json in addition to verifying
manifest_path and writing match-manifest.json; update the "Verify output" /
"Report summary" sections to state that the step writes a step-result.json
containing the fields schema_version, step (use "import-markdown"), ticket
(ticket or run identifier), and completed_at (ISO8601 timestamp), and show a
short example JSON payload and placement next to the primary output so consumers
can find it easily.
In
`@plugins/docs-tools/skills/docs-workflow-writing/scripts/build_writing_args.sh`:
- Around line 115-118: The script currently prefers import mode when both
IMPORT_FROM and FIX_FROM are set, silently ignoring --fix-from; modify the logic
in the build_writing_args.sh branch that sets MODE (the block referencing
IMPORT_FROM, FIX_FROM, and MODE) to first detect if both IMPORT_FROM and
FIX_FROM are non-empty and, if so, print a clear error message to stderr and
exit with a non-zero status; otherwise continue to set MODE="import" when
IMPORT_FROM is set and MODE="fix" when FIX_FROM is set.
In `@plugins/docs-tools/skills/docs-workflow-writing/SKILL.md`:
- Around line 271-289: The post-processing step description must instruct the
workflow to include the outputs of wire_assemblies.py in the step-result.json
`files` list when mode == "import" and repo_path is set: after running python3
${CLAUDE_PLUGIN_ROOT}/skills/docs-workflow-import-markdown/scripts/wire_assemblies.py,
capture its JSON keys (`assemblies_updated`, `assemblies_created`,
`master_updated`, and report `missing_modules`) and add those file paths to the
step-result.json `files` array (not just the generated _index.md). Update the
SKILL.md section around the post-processing instructions and the parsing logic
referenced at the existing 304-320 area to explicitly mention collecting and
merging these keys into the step result so created/modified assemblies and
master.adoc updates are reported to the user and missing module targets are
surfaced.
In `@specs/2026-04-22-import-markdown-design.md`:
- Line 131: The design doc instructs editors to add include:: statements
directly into assemblies, which conflicts with the delegated wiring implemented
in wire_assemblies.py; update the spec text (the guidance around "For new
sections" and the assembly include rule) to state that new AsciiDoc modules
should be created and that include wiring is handled automatically by
wire_assemblies.py (do not edit assembly files directly), and reference the
wire_assemblies.py wiring flow and any functions like build_assembly_index or
add_module_include so readers know which script performs the assembly include
wiring.
---
Nitpick comments:
In `@specs/2026-04-22-import-markdown-design.md`:
- Around line 148-169: Replace bare skill identifiers with fully qualified
plugin:skill names in the workflow snippet so examples resolve via the plugin
registry; update skill entries like the top-level skill for the job
(`docs-workflow-import-markdown`) and the per-step `skill:` values for
prepare-branch, writing, commit, and create-mr to use the
`docs-tools:docs-workflow-*` form (and apply the same change noted for the other
occurrence at the later reference). Ensure the skill keys are the only change so
inputs and descriptions remain untouched.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 96f09fe5-e84d-4861-bfa8-976adcba6d27
📒 Files selected for processing (10)
.claude/skills/docs-workflow-import-markdownplugins/docs-tools/.claude-plugin/plugin.jsonplugins/docs-tools/skills/docs-orchestrator/SKILL.mdplugins/docs-tools/skills/docs-orchestrator/defaults/docs-import.yamlplugins/docs-tools/skills/docs-workflow-import-markdown/SKILL.mdplugins/docs-tools/skills/docs-workflow-import-markdown/scripts/import_markdown.pyplugins/docs-tools/skills/docs-workflow-import-markdown/scripts/wire_assemblies.pyplugins/docs-tools/skills/docs-workflow-writing/SKILL.mdplugins/docs-tools/skills/docs-workflow-writing/scripts/build_writing_args.shspecs/2026-04-22-import-markdown-design.md
|
|
||
| import argparse | ||
| import json | ||
| import os |
There was a problem hiding this comment.
Clear the remaining Ruff failures.
CI is still red here for the unused os import and the three overlong lines at Lines 54, 282, and 316.
Also applies to: 54-54, 282-282, 316-316
🧰 Tools
🪛 GitHub Actions: Ruff
[error] 16-16: Ruff F401: os imported but unused. Remove unused import: os
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In
`@plugins/docs-tools/skills/docs-workflow-import-markdown/scripts/import_markdown.py`
at line 16, Remove the unused import by deleting the top-level "import os"
statement, and fix the three Ruff line-length failures by breaking the overlong
expressions into shorter pieces: for long string literals use implicit
concatenation or assign parts to intermediate variables/constants, for long
function calls or parameter lists wrap arguments across multiple lines with
parentheses, and for long comprehensions or chained calls split into helper
variables or smaller functions; re-run Ruff/CI to confirm all violations are
cleared.
| def grep_for_id_stem(stem, repo_path): | ||
| """Search .adoc files for [id="<stem>_" or [id='<stem>_' and return matching file paths.""" | ||
| paths = set() | ||
| for quote in ('"', "'"): | ||
| pattern = f"[id={quote}{stem}_" | ||
| try: | ||
| result = subprocess.run( | ||
| ["grep", "-rl", "--include=*.adoc", "-F", pattern, "."], | ||
| capture_output=True, | ||
| text=True, | ||
| cwd=repo_path, | ||
| timeout=30, | ||
| ) | ||
| except subprocess.TimeoutExpired: | ||
| continue | ||
| if result.returncode == 0: | ||
| for line in result.stdout.strip().splitlines(): | ||
| if line.strip(): | ||
| paths.add(line.strip().lstrip("./")) | ||
| return sorted(paths) | ||
|
|
||
|
|
||
| def search_by_heading_text(heading_text, repo_path): | ||
| """Search .adoc files for a matching module title (= <title>).""" | ||
| # Escape special grep characters | ||
| escaped = re.sub(r'([\[\]{}()*+?.\\^$|])', r'\\\1', heading_text) | ||
| pattern = f"^= {escaped}" | ||
| try: | ||
| result = subprocess.run( | ||
| ["grep", "-rl", "--include=*.adoc", "-E", pattern, "."], | ||
| capture_output=True, | ||
| text=True, | ||
| cwd=repo_path, | ||
| timeout=30, | ||
| ) | ||
| except subprocess.TimeoutExpired: | ||
| return [] | ||
|
|
||
| if result.returncode != 0: | ||
| return [] | ||
|
|
||
| paths = [line.strip() for line in result.stdout.strip().splitlines() if line.strip()] | ||
| return [p.lstrip("./") for p in paths] |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
# First, let's confirm the file exists and examine the matching helpers
cat -n plugins/docs-tools/skills/docs-workflow-import-markdown/scripts/import_markdown.py | sed -n '122,164p'Repository: redhat-documentation/redhat-docs-agent-tools
Length of output: 1874
🏁 Script executed:
# Now examine how wire_assemblies.py uses these results around line 691
cat -n plugins/docs-tools/skills/docs-workflow-import-markdown/scripts/wire_assemblies.py | sed -n '680,710p'Repository: redhat-documentation/redhat-docs-agent-tools
Length of output: 1577
🏁 Script executed:
# Search for S603/S607 errors in the code
rg -n "S603|S607" plugins/docs-tools/skills/docs-workflow-import-markdown/Repository: redhat-documentation/redhat-docs-agent-tools
Length of output: 70
🏁 Script executed:
# Let's see the broader context of grep_for_id_stem to understand exact matching behavior
cat -n plugins/docs-tools/skills/docs-workflow-import-markdown/scripts/import_markdown.py | sed -n '115,175p'Repository: redhat-documentation/redhat-docs-agent-tools
Length of output: 2477
🏁 Script executed:
# Let's see where grep_for_id_stem and search_by_heading_text are called
rg -n "grep_for_id_stem|search_by_heading_text" plugins/docs-tools/skills/docs-workflow-import-markdown/scripts/Repository: redhat-documentation/redhat-docs-agent-tools
Length of output: 659
🏁 Script executed:
# Check what match_sections returns and how it's used
cat -n plugins/docs-tools/skills/docs-workflow-import-markdown/scripts/import_markdown.py | sed -n '167,250p'Repository: redhat-documentation/redhat-docs-agent-tools
Length of output: 3683
🏁 Script executed:
# Let's see if there's any test coverage or documentation about expected behavior
fd -t f -e py plugins/docs-tools/skills/docs-workflow-import-markdown/ | head -20Repository: redhat-documentation/redhat-docs-agent-tools
Length of output: 618
🏁 Script executed:
# Check for actual Ruff output or lint configuration
cat -n plugins/docs-tools/skills/docs-workflow-import-markdown/scripts/import_markdown.py | head -30Repository: redhat-documentation/redhat-docs-agent-tools
Length of output: 1183
🏁 Script executed:
# Search for how module_path is determined and used
rg -n "module_path" plugins/docs-tools/skills/docs-workflow-import-markdown/scripts/import_markdown.py -A 2 -B 2Repository: redhat-documentation/redhat-docs-agent-tools
Length of output: 1588
🏁 Script executed:
# Check for S603/S607 Ruff errors - these are security warnings about subprocess
# Let's look at the full grep_for_id_stem and search_by_heading_text to understand subprocess usage
rg -n "S603|S607" --type py plugins/docs-tools/skills/docs-workflow-import-markdown/Repository: redhat-documentation/redhat-docs-agent-tools
Length of output: 70
🏁 Script executed:
# Look at extract_id_from_heading to understand what stem is
cat -n plugins/docs-tools/skills/docs-workflow-import-markdown/scripts/import_markdown.py | sed -n '75,120p'Repository: redhat-documentation/redhat-docs-agent-tools
Length of output: 1777
🏁 Script executed:
# Check if there are any test files to understand expected behavior
fd . plugins/docs-tools/skills/docs-workflow-import-markdown/ -type fRepository: redhat-documentation/redhat-docs-agent-tools
Length of output: 259
🏁 Script executed:
# Understand the full context: search for where grep patterns are built and what could cause false positives
cat -n plugins/docs-tools/skills/docs-workflow-import-markdown/scripts/import_markdown.py | sed -n '50,100p'Repository: redhat-documentation/redhat-docs-agent-tools
Length of output: 2017
🏁 Script executed:
# Let's check if there's a pyproject.toml or ruff config that would show S603/S607 configuration
fd -name "pyproject.toml" -o -name "ruff.toml" -o -name ".ruff.toml"Repository: redhat-documentation/redhat-docs-agent-tools
Length of output: 319
🏁 Script executed:
# Check what S603/S607 actually are - they're subprocess.run security warnings
# Let's trace through the actual grep patterns more carefully to understand false positive risk
cat -n plugins/docs-tools/skills/docs-workflow-import-markdown/scripts/import_markdown.py | sed -n '85,115p'Repository: redhat-documentation/redhat-docs-agent-tools
Length of output: 1305
🏁 Script executed:
# Now let's understand: if stem="about" from "about-the-python-index_custom-models",
# would grep -F "[id=\"about_" match [id="about-the-python-index_something"? Let's test the logic
python3 << 'EOF'
import re
# Simulate extract_id_from_heading
fragment = "about-the-python-index_custom-models"
stem = re.sub(r"_[^_]+$", "", fragment) # Remove last _<word>
print(f"Fragment: {fragment}")
print(f"Stem: {stem}")
# The grep pattern would be
pattern = f'[id="{stem}_'
print(f"Grep -F pattern: {pattern}")
# Test: would this match these strings?
test_cases = [
'[id="about-the-python-index_custom-models"]', # exact match (expected)
'[id="about-the-python-index_something-else"]', # different suffix (would match pattern)
'[id="about_first"]', # different middle (would NOT match pattern due to "-" in stem)
]
for test in test_cases:
if pattern in test:
print(f"MATCH: {test}")
else:
print(f"NO MATCH: {test}")
EOFRepository: redhat-documentation/redhat-docs-agent-tools
Length of output: 342
🏁 Script executed:
# Let's check: does the code enforce that multiple matches are skipped?
# Looking at the match_sections flow again
cat -n plugins/docs-tools/skills/docs-workflow-import-markdown/scripts/import_markdown.py | sed -n '190,220p'Repository: redhat-documentation/redhat-docs-agent-tools
Length of output: 1501
🏁 Script executed:
# Critical question: when module_path is set from grep_for_id_stem or search_by_heading_text,
# does wire_assemblies.py blindly trust it or validate further?
cat -n plugins/docs-tools/skills/docs-workflow-import-markdown/scripts/wire_assemblies.py | sed -n '688,720p'Repository: redhat-documentation/redhat-docs-agent-tools
Length of output: 1907
🏁 Script executed:
# Check if S603/S607 warnings apply - these are subprocess security warnings
# Let's check the actual subprocess.run calls more carefully
cat -n plugins/docs-tools/skills/docs-workflow-import-markdown/scripts/import_markdown.py | sed -n '122,165p' | grep -A 5 -B 5 "subprocess.run"Repository: redhat-documentation/redhat-docs-agent-tools
Length of output: 1099
🏁 Script executed:
# Verify: search_by_heading_text - does it anchor to end of line or allow partial match?
python3 << 'EOF'
import re
# Test the search_by_heading_text pattern behavior
heading_text = "About the Python index"
escaped = re.sub(r'([\[\]{}()*+?.\\^$|])', r'\\\1', heading_text)
pattern = f"^= {escaped}"
# What would this regex match?
test_lines = [
"= About the Python index", # exact match
"= About the Python index (extended)", # with extra text
"= About the Python", # partial
"About the Python index", # no = prefix
]
print(f"Pattern: {pattern}")
print(f"Pattern is anchored at start (^) but not at end")
print()
for line in test_lines:
if re.match(pattern, line):
print(f"MATCH: {line}")
else:
print(f"NO MATCH: {line}")
EOFRepository: redhat-documentation/redhat-docs-agent-tools
Length of output: 309
🏁 Script executed:
# Final check: in a real scenario, how likely is a one-file false positive?
# Example: if stem="foo" and there are multiple [id="foo_..."] patterns in different files,
# grep_for_id_stem would return >1 matches and trigger "skipped" status
# But if only ONE file has [id="foo_..."], it would be accepted
# Is this realistic? Let's check if there's documentation on ID naming conventions
find plugins/docs-tools/skills/docs-workflow-import-markdown -name "*.md" -o -name "README*" -o -name "*.txt"Repository: redhat-documentation/redhat-docs-agent-tools
Length of output: 153
🏁 Script executed:
# Let's read the SKILL.md to understand the intended behavior
cat -n plugins/docs-tools/skills/docs-workflow-import-markdown/SKILL.mdRepository: redhat-documentation/redhat-docs-agent-tools
Length of output: 2738
🏁 Script executed:
# Also check if there are any test files that demonstrate expected matching behavior
fd -type f -name "*test*" plugins/docs-tools/skills/docs-workflow-import-markdown/Repository: redhat-documentation/redhat-docs-agent-tools
Length of output: 259
🏁 Script executed:
# Let's see the overall structure and flow
ls -la plugins/docs-tools/skills/docs-workflow-import-markdown/Repository: redhat-documentation/redhat-docs-agent-tools
Length of output: 318
🏁 Script executed:
# Let's check if there's any mechanism that validates module_path beyond the matching helpers
# Check how module_path is used when status="matched"
rg -n "status.*matched" plugins/docs-tools/skills/docs-workflow-import-markdown/scripts/ -A 3 -B 3Repository: redhat-documentation/redhat-docs-agent-tools
Length of output: 3701
🏁 Script executed:
# Look for any validation of the module_path before it's used in wire_assemblies
cat -n plugins/docs-tools/skills/docs-workflow-import-markdown/scripts/wire_assemblies.py | sed -n '685,705p'Repository: redhat-documentation/redhat-docs-agent-tools
Length of output: 1177
🏁 Script executed:
# Let's check if there's any way to test or understand the realistic impact
# Look at what happens if a module_path doesn't actually exist or is wrong
cat -n plugins/docs-tools/skills/docs-workflow-import-markdown/scripts/wire_assemblies.py | sed -n '690,700p'Repository: redhat-documentation/redhat-docs-agent-tools
Length of output: 657
🏁 Script executed:
# Check parse_assembly to see if it validates the file exists
rg -n "def parse_assembly" plugins/docs-tools/skills/docs-workflow-import-markdown/scripts/wire_assemblies.py -A 10Repository: redhat-documentation/redhat-docs-agent-tools
Length of output: 550
🏁 Script executed:
# Let's verify: if a wrong module_path is passed, does the code fail gracefully or silently?
cat -n plugins/docs-tools/skills/docs-workflow-import-markdown/scripts/wire_assemblies.py | sed -n '1,50p'Repository: redhat-documentation/redhat-docs-agent-tools
Length of output: 1951
🏁 Script executed:
# Check what happens when parse_assembly returns None
cat -n plugins/docs-tools/skills/docs-workflow-import-markdown/scripts/wire_assemblies.py | sed -n '102,125p'Repository: redhat-documentation/redhat-docs-agent-tools
Length of output: 957
🏁 Script executed:
# Check what happens when assembly_info is None - line 698 uses it
cat -n plugins/docs-tools/skills/docs-workflow-import-markdown/scripts/wire_assemblies.py | sed -n '693,750p'Repository: redhat-documentation/redhat-docs-agent-tools
Length of output: 3348
🏁 Script executed:
# Check should_inline to see how it handles None assembly_info
rg -n "def should_inline" plugins/docs-tools/skills/docs-workflow-import-markdown/scripts/wire_assemblies.py -A 10Repository: redhat-documentation/redhat-docs-agent-tools
Length of output: 489
🏁 Script executed:
# Search for build_assembly_file to see the full error path
rg -n "def build_assembly_file" plugins/docs-tools/skills/docs-workflow-import-markdown/scripts/wire_assemblies.py -A 15Repository: redhat-documentation/redhat-docs-agent-tools
Length of output: 777
🏁 Script executed:
# Check extract_assembly_header error handling
rg -n "def extract_assembly_header" plugins/docs-tools/skills/docs-workflow-import-markdown/scripts/wire_assemblies.py -A 12Repository: redhat-documentation/redhat-docs-agent-tools
Length of output: 560
🏁 Script executed:
# Final check: does the code safely handle a wrong module_path?
# What happens if module_path points to a file that doesn't exist?
cat -n plugins/docs-tools/skills/docs-workflow-import-markdown/scripts/wire_assemblies.py | sed -n '522,545p'Repository: redhat-documentation/redhat-docs-agent-tools
Length of output: 1122
🏁 Script executed:
# One final verification: check if there's any path validation that could catch wrong module_path
rg -n "module_path" plugins/docs-tools/skills/docs-workflow-import-markdown/scripts/wire_assemblies.py | head -20Repository: redhat-documentation/redhat-docs-agent-tools
Length of output: 1214
Tighten the ID stem and heading text matching patterns.
Both grep_for_id_stem() and search_by_heading_text() use prefix-based searches: Line 126 matches any [id="<stem>_ prefix (e.g., [id="about-the-python-index_custom"] and [id="about-the-python-index_other" both match stem about-the-python-index), and Line 148 matches any heading starting with the given text without anchoring to the end. While the code correctly skips sections when multiple matches are found (Lines 204–215 and 230–241, setting module_path=None), a single false positive would be treated as a correct match and could assign a section to the wrong module. Extract the full IDs and titles directly in Python and compare for exact equality instead of relying on prefix-matching grep patterns. This also improves code clarity and maintainability.
🧰 Tools
🪛 GitHub Actions: Ruff
[error] 128-129: Ruff S603: subprocess call: check for execution of untrusted input.
[error] 129-129: Ruff S607: Starting a process with a partial executable path.
[error] 150-151: Ruff S603: subprocess call: check for execution of untrusted input.
[error] 151-151: Ruff S607: Starting a process with a partial executable path.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In
`@plugins/docs-tools/skills/docs-workflow-import-markdown/scripts/import_markdown.py`
around lines 122 - 164, In grep_for_id_stem and search_by_heading_text, stop
relying on prefix-only grep patterns and instead (1) use your existing grep
calls only to narrow candidate files, (2) open each candidate file and extract
the full id attributes with a regex like r'\[id=["\']([^"\']+)["\']\]' and
confirm the left-hand stem by splitting on '_' and comparing exactly to stem
(i.e., extracted_id.split('_',1)[0] == stem) in grep_for_id_stem, and (3) in
search_by_heading_text read each candidate file and parse the title line that
begins with "= " and compare the full extracted title string exactly to the
provided heading_text (replace the current escaped/pattern usage); update
functions grep_for_id_stem and search_by_heading_text to return only files that
pass these exact equality checks.
| footer_start = None | ||
|
|
||
| current_inline_heading = None | ||
| current_inline_lines = [] | ||
|
|
||
| for i, line in enumerate(lines): | ||
| if line.strip().startswith("ifdef::parent-context"): | ||
| footer_start = i |
There was a problem hiding this comment.
Remove the dead footer_start assignment.
Ruff is already failing because this variable is never read.
🧰 Tools
🪛 GitHub Actions: Ruff
[error] 127-127: Ruff F841: Local variable footer_start is assigned to but never used. Remove assignment or use the variable.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In
`@plugins/docs-tools/skills/docs-workflow-import-markdown/scripts/wire_assemblies.py`
around lines 120 - 127, The variable footer_start is assigned but never used;
remove the dead assignment and any related unused initialization (delete the
line that sets footer_start = None and any subsequent redundant references)
inside the loop that iterates over lines (the for i, line in enumerate(lines):
block) so the unused-variable warning from Ruff is resolved while leaving
current_inline_heading and current_inline_lines intact.
| def find_master_adoc(repo_path, known_assembly_paths): | ||
| """Find the master.adoc that includes one of the known assemblies.""" | ||
| for master in repo_path.rglob("master.adoc"): | ||
| text = master.read_text(encoding="utf-8") | ||
| for asm_path in known_assembly_paths: | ||
| filename = Path(asm_path).name | ||
| if filename in text: | ||
| return master | ||
| return None | ||
|
|
||
|
|
||
| def insert_assembly_in_master(master_path, new_assembly_path, after_assembly_path): | ||
| """Insert an include directive for a new assembly in master.adoc. | ||
|
|
||
| Places it after the include for after_assembly_path. | ||
| """ | ||
| text = master_path.read_text(encoding="utf-8") | ||
| lines = text.splitlines() | ||
|
|
||
| new_filename = Path(new_assembly_path).name | ||
| new_include = f"include::assemblies/{new_filename}[leveloffset=+1]" |
There was a problem hiding this comment.
Build the master.adoc include path relative to the master file.
find_master_adoc() can return a nested master.adoc, but insert_assembly_in_master() always writes include::assemblies/<file>[leveloffset=+1]. For any non-root master this produces a broken include path. Compute the include relative to master_path.parent instead.
Possible fix
-def insert_assembly_in_master(master_path, new_assembly_path, after_assembly_path):
+def insert_assembly_in_master(master_path, repo_path, new_assembly_path, after_assembly_path):
"""Insert an include directive for a new assembly in master.adoc.
Places it after the include for after_assembly_path.
"""
text = master_path.read_text(encoding="utf-8")
lines = text.splitlines()
new_filename = Path(new_assembly_path).name
- new_include = f"include::assemblies/{new_filename}[leveloffset=+1]"
+ include_path = Path(
+ os.path.relpath(repo_path / new_assembly_path, master_path.parent)
+ ).as_posix()
+ new_include = f"include::{include_path}[leveloffset=+1]"🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In
`@plugins/docs-tools/skills/docs-workflow-import-markdown/scripts/wire_assemblies.py`
around lines 611 - 631, find_master_adoc can return a non-root master.adoc but
insert_assembly_in_master always writes a hardcoded
"include::assemblies/<file>[...]" which breaks relative paths; change
insert_assembly_in_master to compute the include path relative to
master_path.parent (use Path(master_path).parent and generate a relative path
from that directory to new_assembly_path), normalize separators to forward
slashes and strip any leading "./", then use that relative path in the include
directive instead of the hardcoded assemblies/<file>; update references to
new_filename/new_include accordingly and keep insertion logic around
after_assembly_path unchanged.
| new_assemblies_to_insert.append({ | ||
| "new_path": new_path, | ||
| "after_path": last_matched_assembly_path, | ||
| }) | ||
|
|
||
| if new_assemblies_to_insert and known_assembly_paths: | ||
| master = find_master_adoc(repo_path, known_assembly_paths) | ||
| if master: | ||
| master_rel = str(master.relative_to(repo_path)) | ||
| for item in new_assemblies_to_insert: | ||
| after = item["after_path"] or known_assembly_paths[-1] | ||
| if args.dry_run: |
There was a problem hiding this comment.
New top-level assemblies can be inserted in the wrong order.
after_path is captured from the last matched assembly seen so far. When a new level-2 section appears before the first matched assembly, it stays None, and Line 766 falls back to known_assembly_paths[-1], which appends that include at the end of master.adoc instead of preserving document order.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In
`@plugins/docs-tools/skills/docs-workflow-import-markdown/scripts/wire_assemblies.py`
around lines 756 - 767, The code appends new top-level assemblies at the end
when after_path is None because it falls back to known_assembly_paths[-1];
instead, preserve document order by falling back to the first known assembly
(known_assembly_paths[0]) so inserts that occur before the first matched
assembly go before it. In the loop that iterates new_assemblies_to_insert (using
variables new_assemblies_to_insert, known_assembly_paths, master, master_rel,
args.dry_run), replace the fallback after = item["after_path"] or
known_assembly_paths[-1] with using known_assembly_paths[0] (or otherwise
compute an appropriate first-known fallback) so new includes retain their
original ordering rather than being appended to the end.
| ### 2. Verify output | ||
|
|
||
| Check that the `manifest_path` from the JSON output exists. | ||
|
|
||
| ### 3. Report summary | ||
|
|
||
| Print the summary to the user: | ||
|
|
||
| ``` | ||
| Import matching complete: <matched> matched, <new> new, <skipped> skipped | ||
| ``` | ||
|
|
||
| If there are skipped sections, add: | ||
|
|
||
| ``` | ||
| Skipped sections require manual attention — see match-manifest.json for details. | ||
| ``` | ||
|
|
||
| ## Known limitations |
There was a problem hiding this comment.
Add required step-result.json sidecar instructions for this step skill.
This step currently documents verify/report only; it should also emit <base-path>/import-markdown/step-result.json with schema_version, step, ticket, and completed_at.
Based on learnings: “All docs-workflow-* step skills must write a step-result.json sidecar alongside their primary output with schema_version, step, ticket, and completed_at fields.”
🧰 Tools
🪛 markdownlint-cli2 (0.22.1)
[warning] 54-54: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
[warning] 60-60: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@plugins/docs-tools/skills/docs-workflow-import-markdown/SKILL.md` around
lines 46 - 64, The SKILL.md must document that this step emits a sidecar JSON at
<base-path>/import-markdown/step-result.json in addition to verifying
manifest_path and writing match-manifest.json; update the "Verify output" /
"Report summary" sections to state that the step writes a step-result.json
containing the fields schema_version, step (use "import-markdown"), ticket
(ticket or run identifier), and completed_at (ISO8601 timestamp), and show a
short example JSON payload and placement next to the primary output so consumers
can find it easily.
| if [[ -n "$IMPORT_FROM" ]]; then | ||
| MODE="import" | ||
| elif [[ -n "$FIX_FROM" ]]; then | ||
| MODE="fix" |
There was a problem hiding this comment.
Reject simultaneous --import-from and --fix-from flags.
Right now, both can be passed and Line 115 silently forces import mode. That makes --fix-from unexpectedly ignored.
Suggested fix
# --- Determine mode ---
MODE=""
+if [[ -n "$IMPORT_FROM" && -n "$FIX_FROM" ]]; then
+ echo "ERROR: --import-from and --fix-from are mutually exclusive." >&2
+ exit 1
+fi
if [[ -n "$IMPORT_FROM" ]]; then
MODE="import"
elif [[ -n "$FIX_FROM" ]]; then📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| if [[ -n "$IMPORT_FROM" ]]; then | |
| MODE="import" | |
| elif [[ -n "$FIX_FROM" ]]; then | |
| MODE="fix" | |
| # --- Determine mode --- | |
| MODE="" | |
| if [[ -n "$IMPORT_FROM" && -n "$FIX_FROM" ]]; then | |
| echo "ERROR: --import-from and --fix-from are mutually exclusive." >&2 | |
| exit 1 | |
| fi | |
| if [[ -n "$IMPORT_FROM" ]]; then | |
| MODE="import" | |
| elif [[ -n "$FIX_FROM" ]]; then | |
| MODE="fix" |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In
`@plugins/docs-tools/skills/docs-workflow-writing/scripts/build_writing_args.sh`
around lines 115 - 118, The script currently prefers import mode when both
IMPORT_FROM and FIX_FROM are set, silently ignoring --fix-from; modify the logic
in the build_writing_args.sh branch that sets MODE (the block referencing
IMPORT_FROM, FIX_FROM, and MODE) to first detect if both IMPORT_FROM and
FIX_FROM are non-empty and, if so, print a clear error message to stderr and
exit with a non-zero status; otherwise continue to set MODE="import" when
IMPORT_FROM is set and MODE="fix" when FIX_FROM is set.
| ### 3. Post-processing: Wire assemblies (import mode only) | ||
|
|
||
| **Only run this step when `mode == "import"` AND `repo_path` is not null.** | ||
|
|
||
| After the writing agent finishes, run the assembly wiring script to add `include::` directives deterministically: | ||
|
|
||
| ```bash | ||
| python3 ${CLAUDE_PLUGIN_ROOT}/skills/docs-workflow-import-markdown/scripts/wire_assemblies.py \ | ||
| --manifest <INPUT_FILE> --repo-path <REPO_PATH> | ||
| ``` | ||
|
|
||
| The script reads the match manifest, parses original assembly files to learn which sections were previously inlined vs included, and rebuilds each assembly preserving the same structure. It also creates new assemblies and updates `master.adoc` when needed. | ||
|
|
||
| The script emits JSON on stdout with: | ||
| - `assemblies_updated`: list of assembly files modified | ||
| - `assemblies_created`: list of new assembly files created | ||
| - `master_updated`: path to master.adoc if updated | ||
| - `missing_modules`: include targets where the module file does not exist on disk — report these to the user | ||
|
|
There was a problem hiding this comment.
Include wire_assemblies.py outputs in step-result.json files list.
In import mode, post-processing can modify/create assemblies after _index.md is generated. If Line 304 only parses _index.md, sidecar files can underreport actual changes.
Suggested update to the step instructions
-Read the manifest at `<OUTPUT_FILE>` (`_index.md`). Extract every absolute file path from the table rows. These become the `files` array.
+Read the manifest at `<OUTPUT_FILE>` (`_index.md`). Extract every absolute file path from the table rows.
+
+If import post-processing was executed, also read the wiring script JSON output and append:
+- `assemblies_updated[]`
+- `assemblies_created[]`
+- `master_updated` (if non-null)
+
+Deduplicate and write the merged absolute paths as the final `files` array.Also applies to: 304-320
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@plugins/docs-tools/skills/docs-workflow-writing/SKILL.md` around lines 271 -
289, The post-processing step description must instruct the workflow to include
the outputs of wire_assemblies.py in the step-result.json `files` list when mode
== "import" and repo_path is set: after running python3
${CLAUDE_PLUGIN_ROOT}/skills/docs-workflow-import-markdown/scripts/wire_assemblies.py,
capture its JSON keys (`assemblies_updated`, `assemblies_created`,
`master_updated`, and report `missing_modules`) and add those file paths to the
step-result.json `files` array (not just the generated _index.md). Update the
SKILL.md section around the post-processing instructions and the parsing logic
referenced at the existing 304-320 area to explicitly mention collecting and
merging these keys into the step result so created/modified assemblies and
master.adoc updates are reported to the user and missing module targets are
surfaced.
| - Any `include::` directives not covered by the new content | ||
| - Existing AsciiDoc attributes and conditionals | ||
|
|
||
| **For new sections:** Create a new AsciiDoc module. Determine the content type (CONCEPT, PROCEDURE, REFERENCE) from the content structure. Follow the repo's existing file naming and directory conventions. Add an `include::` statement in the appropriate assembly file. |
There was a problem hiding this comment.
Design doc conflicts with import-mode assembly wiring rules.
Line 131 says new sections should add include:: directly in assemblies, but the implemented flow delegates include wiring to wire_assemblies.py. Please align this to avoid reintroducing forbidden direct include edits.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@specs/2026-04-22-import-markdown-design.md` at line 131, The design doc
instructs editors to add include:: statements directly into assemblies, which
conflicts with the delegated wiring implemented in wire_assemblies.py; update
the spec text (the guidance around "For new sections" and the assembly include
rule) to state that new AsciiDoc modules should be created and that include
wiring is handled automatically by wire_assemblies.py (do not edit assembly
files directly), and reference the wire_assemblies.py wiring flow and any
functions like build_assembly_index or add_module_include so readers know which
script performs the assembly include wiring.
he import-markdown feature is a round-trip pipeline for collaborating with
SMEs via Google Docs. It takes an edited markdown document (typically exported
from Google Docs), splits it by headings, matches each section back to
existing AsciiDoc modules in the docs repo using URL fragment IDs embedded in
headings, and then converts the updated content back into AsciiDoc —
preserving the existing module scaffolding. A deterministic assembly-wiring
step updates include:: directives so new or reordered modules are reflected in
the assembly files. The whole flow is integrated into the docs-orchestrator
as --workflow import.
Summary by CodeRabbit
Release Notes
New Features
docs-importworkflow orchestrating markdown splitting, module matching, content updates, branch creation, and merge request generation--markdownand--import-fromCLI options for import operationsDocumentation