Skip to content

[DO NOT MERGE - EXPERIMENT] Import markdown to AsciiDoc #136

Open
DonaghBr wants to merge 6 commits intomainfrom
import-markdown
Open

[DO NOT MERGE - EXPERIMENT] Import markdown to AsciiDoc #136
DonaghBr wants to merge 6 commits intomainfrom
import-markdown

Conversation

@DonaghBr
Copy link
Copy Markdown
Collaborator

@DonaghBr DonaghBr commented Apr 30, 2026

he import-markdown feature is a round-trip pipeline for collaborating with
SMEs via Google Docs. It takes an edited markdown document (typically exported
from Google Docs), splits it by headings, matches each section back to
existing AsciiDoc modules in the docs repo using URL fragment IDs embedded in
headings, and then converts the updated content back into AsciiDoc —
preserving the existing module scaffolding. A deterministic assembly-wiring
step updates include:: directives so new or reordered modules are reflected in
the assembly files. The whole flow is integrated into the docs-orchestrator
as --workflow import.

Summary by CodeRabbit

Release Notes

  • New Features

    • Import edited markdown back into AsciiDoc documentation modules with automatic module matching and content conversion
    • New docs-import workflow orchestrating markdown splitting, module matching, content updates, branch creation, and merge request generation
    • New --markdown and --import-from CLI options for import operations
  • Documentation

    • Added design specification for the markdown import workflow

DonaghBr and others added 6 commits April 29, 2026 10:06
Tactical bridge tool for importing collaboratively-edited markdown
(from Google Docs) back into AsciiDoc modules. Matches sections to
existing modules via docs.redhat.com URL fragment IDs, produces a
match manifest, and feeds into the existing writing/commit/MR pipeline.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Adds a new workflow step that splits an edited markdown file by headings,
matches sections to existing AsciiDoc modules via URL fragment IDs, and
produces a match manifest for the writing step to apply updates.

- New skill: docs-workflow-import-markdown with Python matching script
- New workflow: docs-import.yaml (import → prepare-branch → writing → commit → create-mr)
- Extended docs-workflow-writing with import mode prompt template
- Extended build_writing_args.sh with --import-from flag
- Updated orchestrator with --markdown flag and import step arg mapping
- Handles both single-quote and double-quote ID anchors in .adoc files

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The docs-writer agent was rewriting prose instead of just converting
markdown syntax to AsciiDoc. Added explicit instructions to preserve
the author's original wording and a concrete syntax conversion table.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add wire_assemblies.py as a post-processing step after the writing
agent in import mode. The script reads the match manifest, parses
original assembly files to learn which sections were inlined vs
included, and rebuilds each assembly preserving the same structure.

This replaces the unreliable agent-driven approach where the writing
agent was responsible for adding include:: directives — it failed to
do so in the custom-models-3.4 run (0 of 16 new modules wired).

The script handles:
- Inline vs include detection from original assembly structure
- Title-based module lookup when heading-derived filenames differ
- Leveloffset preservation from original assembly includes
- New assembly creation and master.adoc wiring
- Missing module detection and reporting

Also updates the import mode prompt to:
- Skip upstream modules (never modify files under upstream/)
- Forbid the agent from touching include:: directives
- Restrict new module creation to modules/ only

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…tching

Headings like "[5.5.](url) Training Hub in OpenShift AI" only used the
link text ("5.5.") for matching, discarding the actual title after the
link. This caused the section to match with an empty clean heading,
preventing correct module resolution.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Resolve merge conflict in docs-orchestrator/SKILL.md argument docs —
keep the branch's --create-jira wording and --markdown flag, restore
main's --create-merge-request flag. Bump plugin version to 0.0.56.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Apr 30, 2026

Note

Currently processing new changes in this PR. This may take a few minutes, please wait...

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 96f09fe5-e84d-4861-bfa8-976adcba6d27

📥 Commits

Reviewing files that changed from the base of the PR and between e2ac6b0 and 3a4a61e.

📒 Files selected for processing (10)
  • .claude/skills/docs-workflow-import-markdown
  • plugins/docs-tools/.claude-plugin/plugin.json
  • plugins/docs-tools/skills/docs-orchestrator/SKILL.md
  • plugins/docs-tools/skills/docs-orchestrator/defaults/docs-import.yaml
  • plugins/docs-tools/skills/docs-workflow-import-markdown/SKILL.md
  • plugins/docs-tools/skills/docs-workflow-import-markdown/scripts/import_markdown.py
  • plugins/docs-tools/skills/docs-workflow-import-markdown/scripts/wire_assemblies.py
  • plugins/docs-tools/skills/docs-workflow-writing/SKILL.md
  • plugins/docs-tools/skills/docs-workflow-writing/scripts/build_writing_args.sh
  • specs/2026-04-22-import-markdown-design.md
 ________________________________________________________
< Rust's borrow checker and I are on a first-name basis. >
 --------------------------------------------------------
  \
   \   (\__/)
       (•ㅅ•)
       /   づ
✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch import-markdown

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share
Review rate limit: 0/1 reviews remaining, refill in 60 minutes.

Comment @coderabbitai help to get the list of available commands and usage tips.

Tip

CodeRabbit can use Trivy to scan for security misconfigurations and secrets in Infrastructure as Code files.

Add a .trivyignore file to your project to customize which findings Trivy reports.

@github-actions
Copy link
Copy Markdown

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Apr 30, 2026

Caution

Failed to replace (edit) comment. This is likely due to insufficient permissions or the comment being deleted.

Error details
{}

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 9

🧹 Nitpick comments (1)
specs/2026-04-22-import-markdown-design.md (1)

148-169: ⚡ Quick win

Use fully qualified skill names in the workflow example.

The snippet currently uses bare names (docs-workflow-*). Prefer docs-tools:docs-workflow-* so copied examples resolve reliably via plugin registry.

Based on learnings: “Always use fully qualified plugin:skill names … in docs … inline text references, and cross-references between skills.”

Also applies to: 172-172

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@specs/2026-04-22-import-markdown-design.md` around lines 148 - 169, Replace
bare skill identifiers with fully qualified plugin:skill names in the workflow
snippet so examples resolve via the plugin registry; update skill entries like
the top-level skill for the job (`docs-workflow-import-markdown`) and the
per-step `skill:` values for prepare-branch, writing, commit, and create-mr to
use the `docs-tools:docs-workflow-*` form (and apply the same change noted for
the other occurrence at the later reference). Ensure the skill keys are the only
change so inputs and descriptions remain untouched.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In
`@plugins/docs-tools/skills/docs-workflow-import-markdown/scripts/import_markdown.py`:
- Line 16: Remove the unused import by deleting the top-level "import os"
statement, and fix the three Ruff line-length failures by breaking the overlong
expressions into shorter pieces: for long string literals use implicit
concatenation or assign parts to intermediate variables/constants, for long
function calls or parameter lists wrap arguments across multiple lines with
parentheses, and for long comprehensions or chained calls split into helper
variables or smaller functions; re-run Ruff/CI to confirm all violations are
cleared.
- Around line 122-164: In grep_for_id_stem and search_by_heading_text, stop
relying on prefix-only grep patterns and instead (1) use your existing grep
calls only to narrow candidate files, (2) open each candidate file and extract
the full id attributes with a regex like r'\[id=["\']([^"\']+)["\']\]' and
confirm the left-hand stem by splitting on '_' and comparing exactly to stem
(i.e., extracted_id.split('_',1)[0] == stem) in grep_for_id_stem, and (3) in
search_by_heading_text read each candidate file and parse the title line that
begins with "= " and compare the full extracted title string exactly to the
provided heading_text (replace the current escaped/pattern usage); update
functions grep_for_id_stem and search_by_heading_text to return only files that
pass these exact equality checks.

In
`@plugins/docs-tools/skills/docs-workflow-import-markdown/scripts/wire_assemblies.py`:
- Around line 120-127: The variable footer_start is assigned but never used;
remove the dead assignment and any related unused initialization (delete the
line that sets footer_start = None and any subsequent redundant references)
inside the loop that iterates over lines (the for i, line in enumerate(lines):
block) so the unused-variable warning from Ruff is resolved while leaving
current_inline_heading and current_inline_lines intact.
- Around line 756-767: The code appends new top-level assemblies at the end when
after_path is None because it falls back to known_assembly_paths[-1]; instead,
preserve document order by falling back to the first known assembly
(known_assembly_paths[0]) so inserts that occur before the first matched
assembly go before it. In the loop that iterates new_assemblies_to_insert (using
variables new_assemblies_to_insert, known_assembly_paths, master, master_rel,
args.dry_run), replace the fallback after = item["after_path"] or
known_assembly_paths[-1] with using known_assembly_paths[0] (or otherwise
compute an appropriate first-known fallback) so new includes retain their
original ordering rather than being appended to the end.
- Around line 611-631: find_master_adoc can return a non-root master.adoc but
insert_assembly_in_master always writes a hardcoded
"include::assemblies/<file>[...]" which breaks relative paths; change
insert_assembly_in_master to compute the include path relative to
master_path.parent (use Path(master_path).parent and generate a relative path
from that directory to new_assembly_path), normalize separators to forward
slashes and strip any leading "./", then use that relative path in the include
directive instead of the hardcoded assemblies/<file>; update references to
new_filename/new_include accordingly and keep insertion logic around
after_assembly_path unchanged.

In `@plugins/docs-tools/skills/docs-workflow-import-markdown/SKILL.md`:
- Around line 46-64: The SKILL.md must document that this step emits a sidecar
JSON at <base-path>/import-markdown/step-result.json in addition to verifying
manifest_path and writing match-manifest.json; update the "Verify output" /
"Report summary" sections to state that the step writes a step-result.json
containing the fields schema_version, step (use "import-markdown"), ticket
(ticket or run identifier), and completed_at (ISO8601 timestamp), and show a
short example JSON payload and placement next to the primary output so consumers
can find it easily.

In
`@plugins/docs-tools/skills/docs-workflow-writing/scripts/build_writing_args.sh`:
- Around line 115-118: The script currently prefers import mode when both
IMPORT_FROM and FIX_FROM are set, silently ignoring --fix-from; modify the logic
in the build_writing_args.sh branch that sets MODE (the block referencing
IMPORT_FROM, FIX_FROM, and MODE) to first detect if both IMPORT_FROM and
FIX_FROM are non-empty and, if so, print a clear error message to stderr and
exit with a non-zero status; otherwise continue to set MODE="import" when
IMPORT_FROM is set and MODE="fix" when FIX_FROM is set.

In `@plugins/docs-tools/skills/docs-workflow-writing/SKILL.md`:
- Around line 271-289: The post-processing step description must instruct the
workflow to include the outputs of wire_assemblies.py in the step-result.json
`files` list when mode == "import" and repo_path is set: after running python3
${CLAUDE_PLUGIN_ROOT}/skills/docs-workflow-import-markdown/scripts/wire_assemblies.py,
capture its JSON keys (`assemblies_updated`, `assemblies_created`,
`master_updated`, and report `missing_modules`) and add those file paths to the
step-result.json `files` array (not just the generated _index.md). Update the
SKILL.md section around the post-processing instructions and the parsing logic
referenced at the existing 304-320 area to explicitly mention collecting and
merging these keys into the step result so created/modified assemblies and
master.adoc updates are reported to the user and missing module targets are
surfaced.

In `@specs/2026-04-22-import-markdown-design.md`:
- Line 131: The design doc instructs editors to add include:: statements
directly into assemblies, which conflicts with the delegated wiring implemented
in wire_assemblies.py; update the spec text (the guidance around "For new
sections" and the assembly include rule) to state that new AsciiDoc modules
should be created and that include wiring is handled automatically by
wire_assemblies.py (do not edit assembly files directly), and reference the
wire_assemblies.py wiring flow and any functions like build_assembly_index or
add_module_include so readers know which script performs the assembly include
wiring.

---

Nitpick comments:
In `@specs/2026-04-22-import-markdown-design.md`:
- Around line 148-169: Replace bare skill identifiers with fully qualified
plugin:skill names in the workflow snippet so examples resolve via the plugin
registry; update skill entries like the top-level skill for the job
(`docs-workflow-import-markdown`) and the per-step `skill:` values for
prepare-branch, writing, commit, and create-mr to use the
`docs-tools:docs-workflow-*` form (and apply the same change noted for the other
occurrence at the later reference). Ensure the skill keys are the only change so
inputs and descriptions remain untouched.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 96f09fe5-e84d-4861-bfa8-976adcba6d27

📥 Commits

Reviewing files that changed from the base of the PR and between e2ac6b0 and 3a4a61e.

📒 Files selected for processing (10)
  • .claude/skills/docs-workflow-import-markdown
  • plugins/docs-tools/.claude-plugin/plugin.json
  • plugins/docs-tools/skills/docs-orchestrator/SKILL.md
  • plugins/docs-tools/skills/docs-orchestrator/defaults/docs-import.yaml
  • plugins/docs-tools/skills/docs-workflow-import-markdown/SKILL.md
  • plugins/docs-tools/skills/docs-workflow-import-markdown/scripts/import_markdown.py
  • plugins/docs-tools/skills/docs-workflow-import-markdown/scripts/wire_assemblies.py
  • plugins/docs-tools/skills/docs-workflow-writing/SKILL.md
  • plugins/docs-tools/skills/docs-workflow-writing/scripts/build_writing_args.sh
  • specs/2026-04-22-import-markdown-design.md


import argparse
import json
import os
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Clear the remaining Ruff failures.

CI is still red here for the unused os import and the three overlong lines at Lines 54, 282, and 316.

Also applies to: 54-54, 282-282, 316-316

🧰 Tools
🪛 GitHub Actions: Ruff

[error] 16-16: Ruff F401: os imported but unused. Remove unused import: os

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@plugins/docs-tools/skills/docs-workflow-import-markdown/scripts/import_markdown.py`
at line 16, Remove the unused import by deleting the top-level "import os"
statement, and fix the three Ruff line-length failures by breaking the overlong
expressions into shorter pieces: for long string literals use implicit
concatenation or assign parts to intermediate variables/constants, for long
function calls or parameter lists wrap arguments across multiple lines with
parentheses, and for long comprehensions or chained calls split into helper
variables or smaller functions; re-run Ruff/CI to confirm all violations are
cleared.

Comment on lines +122 to +164
def grep_for_id_stem(stem, repo_path):
"""Search .adoc files for [id="<stem>_" or [id='<stem>_' and return matching file paths."""
paths = set()
for quote in ('"', "'"):
pattern = f"[id={quote}{stem}_"
try:
result = subprocess.run(
["grep", "-rl", "--include=*.adoc", "-F", pattern, "."],
capture_output=True,
text=True,
cwd=repo_path,
timeout=30,
)
except subprocess.TimeoutExpired:
continue
if result.returncode == 0:
for line in result.stdout.strip().splitlines():
if line.strip():
paths.add(line.strip().lstrip("./"))
return sorted(paths)


def search_by_heading_text(heading_text, repo_path):
"""Search .adoc files for a matching module title (= <title>)."""
# Escape special grep characters
escaped = re.sub(r'([\[\]{}()*+?.\\^$|])', r'\\\1', heading_text)
pattern = f"^= {escaped}"
try:
result = subprocess.run(
["grep", "-rl", "--include=*.adoc", "-E", pattern, "."],
capture_output=True,
text=True,
cwd=repo_path,
timeout=30,
)
except subprocess.TimeoutExpired:
return []

if result.returncode != 0:
return []

paths = [line.strip() for line in result.stdout.strip().splitlines() if line.strip()]
return [p.lstrip("./") for p in paths]
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# First, let's confirm the file exists and examine the matching helpers
cat -n plugins/docs-tools/skills/docs-workflow-import-markdown/scripts/import_markdown.py | sed -n '122,164p'

Repository: redhat-documentation/redhat-docs-agent-tools

Length of output: 1874


🏁 Script executed:

# Now examine how wire_assemblies.py uses these results around line 691
cat -n plugins/docs-tools/skills/docs-workflow-import-markdown/scripts/wire_assemblies.py | sed -n '680,710p'

Repository: redhat-documentation/redhat-docs-agent-tools

Length of output: 1577


🏁 Script executed:

# Search for S603/S607 errors in the code
rg -n "S603|S607" plugins/docs-tools/skills/docs-workflow-import-markdown/

Repository: redhat-documentation/redhat-docs-agent-tools

Length of output: 70


🏁 Script executed:

# Let's see the broader context of grep_for_id_stem to understand exact matching behavior
cat -n plugins/docs-tools/skills/docs-workflow-import-markdown/scripts/import_markdown.py | sed -n '115,175p'

Repository: redhat-documentation/redhat-docs-agent-tools

Length of output: 2477


🏁 Script executed:

# Let's see where grep_for_id_stem and search_by_heading_text are called
rg -n "grep_for_id_stem|search_by_heading_text" plugins/docs-tools/skills/docs-workflow-import-markdown/scripts/

Repository: redhat-documentation/redhat-docs-agent-tools

Length of output: 659


🏁 Script executed:

# Check what match_sections returns and how it's used
cat -n plugins/docs-tools/skills/docs-workflow-import-markdown/scripts/import_markdown.py | sed -n '167,250p'

Repository: redhat-documentation/redhat-docs-agent-tools

Length of output: 3683


🏁 Script executed:

# Let's see if there's any test coverage or documentation about expected behavior
fd -t f -e py plugins/docs-tools/skills/docs-workflow-import-markdown/ | head -20

Repository: redhat-documentation/redhat-docs-agent-tools

Length of output: 618


🏁 Script executed:

# Check for actual Ruff output or lint configuration
cat -n plugins/docs-tools/skills/docs-workflow-import-markdown/scripts/import_markdown.py | head -30

Repository: redhat-documentation/redhat-docs-agent-tools

Length of output: 1183


🏁 Script executed:

# Search for how module_path is determined and used
rg -n "module_path" plugins/docs-tools/skills/docs-workflow-import-markdown/scripts/import_markdown.py -A 2 -B 2

Repository: redhat-documentation/redhat-docs-agent-tools

Length of output: 1588


🏁 Script executed:

# Check for S603/S607 Ruff errors - these are security warnings about subprocess
# Let's look at the full grep_for_id_stem and search_by_heading_text to understand subprocess usage
rg -n "S603|S607" --type py plugins/docs-tools/skills/docs-workflow-import-markdown/

Repository: redhat-documentation/redhat-docs-agent-tools

Length of output: 70


🏁 Script executed:

# Look at extract_id_from_heading to understand what stem is
cat -n plugins/docs-tools/skills/docs-workflow-import-markdown/scripts/import_markdown.py | sed -n '75,120p'

Repository: redhat-documentation/redhat-docs-agent-tools

Length of output: 1777


🏁 Script executed:

# Check if there are any test files to understand expected behavior
fd . plugins/docs-tools/skills/docs-workflow-import-markdown/ -type f

Repository: redhat-documentation/redhat-docs-agent-tools

Length of output: 259


🏁 Script executed:

# Understand the full context: search for where grep patterns are built and what could cause false positives
cat -n plugins/docs-tools/skills/docs-workflow-import-markdown/scripts/import_markdown.py | sed -n '50,100p'

Repository: redhat-documentation/redhat-docs-agent-tools

Length of output: 2017


🏁 Script executed:

# Let's check if there's a pyproject.toml or ruff config that would show S603/S607 configuration
fd -name "pyproject.toml" -o -name "ruff.toml" -o -name ".ruff.toml"

Repository: redhat-documentation/redhat-docs-agent-tools

Length of output: 319


🏁 Script executed:

# Check what S603/S607 actually are - they're subprocess.run security warnings
# Let's trace through the actual grep patterns more carefully to understand false positive risk
cat -n plugins/docs-tools/skills/docs-workflow-import-markdown/scripts/import_markdown.py | sed -n '85,115p'

Repository: redhat-documentation/redhat-docs-agent-tools

Length of output: 1305


🏁 Script executed:

# Now let's understand: if stem="about" from "about-the-python-index_custom-models", 
# would grep -F "[id=\"about_" match [id="about-the-python-index_something"? Let's test the logic
python3 << 'EOF'
import re

# Simulate extract_id_from_heading
fragment = "about-the-python-index_custom-models"
stem = re.sub(r"_[^_]+$", "", fragment)  # Remove last _<word>
print(f"Fragment: {fragment}")
print(f"Stem: {stem}")

# The grep pattern would be
pattern = f'[id="{stem}_'
print(f"Grep -F pattern: {pattern}")

# Test: would this match these strings?
test_cases = [
    '[id="about-the-python-index_custom-models"]',  # exact match (expected)
    '[id="about-the-python-index_something-else"]',  # different suffix (would match pattern)
    '[id="about_first"]',  # different middle (would NOT match pattern due to "-" in stem)
]

for test in test_cases:
    if pattern in test:
        print(f"MATCH: {test}")
    else:
        print(f"NO MATCH: {test}")
EOF

Repository: redhat-documentation/redhat-docs-agent-tools

Length of output: 342


🏁 Script executed:

# Let's check: does the code enforce that multiple matches are skipped?
# Looking at the match_sections flow again
cat -n plugins/docs-tools/skills/docs-workflow-import-markdown/scripts/import_markdown.py | sed -n '190,220p'

Repository: redhat-documentation/redhat-docs-agent-tools

Length of output: 1501


🏁 Script executed:

# Critical question: when module_path is set from grep_for_id_stem or search_by_heading_text,
# does wire_assemblies.py blindly trust it or validate further?
cat -n plugins/docs-tools/skills/docs-workflow-import-markdown/scripts/wire_assemblies.py | sed -n '688,720p'

Repository: redhat-documentation/redhat-docs-agent-tools

Length of output: 1907


🏁 Script executed:

# Check if S603/S607 warnings apply - these are subprocess security warnings
# Let's check the actual subprocess.run calls more carefully
cat -n plugins/docs-tools/skills/docs-workflow-import-markdown/scripts/import_markdown.py | sed -n '122,165p' | grep -A 5 -B 5 "subprocess.run"

Repository: redhat-documentation/redhat-docs-agent-tools

Length of output: 1099


🏁 Script executed:

# Verify: search_by_heading_text - does it anchor to end of line or allow partial match?
python3 << 'EOF'
import re

# Test the search_by_heading_text pattern behavior
heading_text = "About the Python index"
escaped = re.sub(r'([\[\]{}()*+?.\\^$|])', r'\\\1', heading_text)
pattern = f"^= {escaped}"

# What would this regex match?
test_lines = [
    "= About the Python index",           # exact match
    "= About the Python index (extended)",  # with extra text
    "= About the Python",                  # partial
    "About the Python index",              # no = prefix
]

print(f"Pattern: {pattern}")
print(f"Pattern is anchored at start (^) but not at end")
print()

for line in test_lines:
    if re.match(pattern, line):
        print(f"MATCH: {line}")
    else:
        print(f"NO MATCH: {line}")
EOF

Repository: redhat-documentation/redhat-docs-agent-tools

Length of output: 309


🏁 Script executed:

# Final check: in a real scenario, how likely is a one-file false positive?
# Example: if stem="foo" and there are multiple [id="foo_..."] patterns in different files,
# grep_for_id_stem would return >1 matches and trigger "skipped" status
# But if only ONE file has [id="foo_..."], it would be accepted
# Is this realistic? Let's check if there's documentation on ID naming conventions

find plugins/docs-tools/skills/docs-workflow-import-markdown -name "*.md" -o -name "README*" -o -name "*.txt"

Repository: redhat-documentation/redhat-docs-agent-tools

Length of output: 153


🏁 Script executed:

# Let's read the SKILL.md to understand the intended behavior
cat -n plugins/docs-tools/skills/docs-workflow-import-markdown/SKILL.md

Repository: redhat-documentation/redhat-docs-agent-tools

Length of output: 2738


🏁 Script executed:

# Also check if there are any test files that demonstrate expected matching behavior
fd -type f -name "*test*" plugins/docs-tools/skills/docs-workflow-import-markdown/

Repository: redhat-documentation/redhat-docs-agent-tools

Length of output: 259


🏁 Script executed:

# Let's see the overall structure and flow
ls -la plugins/docs-tools/skills/docs-workflow-import-markdown/

Repository: redhat-documentation/redhat-docs-agent-tools

Length of output: 318


🏁 Script executed:

# Let's check if there's any mechanism that validates module_path beyond the matching helpers
# Check how module_path is used when status="matched"
rg -n "status.*matched" plugins/docs-tools/skills/docs-workflow-import-markdown/scripts/ -A 3 -B 3

Repository: redhat-documentation/redhat-docs-agent-tools

Length of output: 3701


🏁 Script executed:

# Look for any validation of the module_path before it's used in wire_assemblies
cat -n plugins/docs-tools/skills/docs-workflow-import-markdown/scripts/wire_assemblies.py | sed -n '685,705p'

Repository: redhat-documentation/redhat-docs-agent-tools

Length of output: 1177


🏁 Script executed:

# Let's check if there's any way to test or understand the realistic impact
# Look at what happens if a module_path doesn't actually exist or is wrong
cat -n plugins/docs-tools/skills/docs-workflow-import-markdown/scripts/wire_assemblies.py | sed -n '690,700p'

Repository: redhat-documentation/redhat-docs-agent-tools

Length of output: 657


🏁 Script executed:

# Check parse_assembly to see if it validates the file exists
rg -n "def parse_assembly" plugins/docs-tools/skills/docs-workflow-import-markdown/scripts/wire_assemblies.py -A 10

Repository: redhat-documentation/redhat-docs-agent-tools

Length of output: 550


🏁 Script executed:

# Let's verify: if a wrong module_path is passed, does the code fail gracefully or silently?
cat -n plugins/docs-tools/skills/docs-workflow-import-markdown/scripts/wire_assemblies.py | sed -n '1,50p'

Repository: redhat-documentation/redhat-docs-agent-tools

Length of output: 1951


🏁 Script executed:

# Check what happens when parse_assembly returns None
cat -n plugins/docs-tools/skills/docs-workflow-import-markdown/scripts/wire_assemblies.py | sed -n '102,125p'

Repository: redhat-documentation/redhat-docs-agent-tools

Length of output: 957


🏁 Script executed:

# Check what happens when assembly_info is None - line 698 uses it
cat -n plugins/docs-tools/skills/docs-workflow-import-markdown/scripts/wire_assemblies.py | sed -n '693,750p'

Repository: redhat-documentation/redhat-docs-agent-tools

Length of output: 3348


🏁 Script executed:

# Check should_inline to see how it handles None assembly_info
rg -n "def should_inline" plugins/docs-tools/skills/docs-workflow-import-markdown/scripts/wire_assemblies.py -A 10

Repository: redhat-documentation/redhat-docs-agent-tools

Length of output: 489


🏁 Script executed:

# Search for build_assembly_file to see the full error path
rg -n "def build_assembly_file" plugins/docs-tools/skills/docs-workflow-import-markdown/scripts/wire_assemblies.py -A 15

Repository: redhat-documentation/redhat-docs-agent-tools

Length of output: 777


🏁 Script executed:

# Check extract_assembly_header error handling
rg -n "def extract_assembly_header" plugins/docs-tools/skills/docs-workflow-import-markdown/scripts/wire_assemblies.py -A 12

Repository: redhat-documentation/redhat-docs-agent-tools

Length of output: 560


🏁 Script executed:

# Final check: does the code safely handle a wrong module_path?
# What happens if module_path points to a file that doesn't exist?
cat -n plugins/docs-tools/skills/docs-workflow-import-markdown/scripts/wire_assemblies.py | sed -n '522,545p'

Repository: redhat-documentation/redhat-docs-agent-tools

Length of output: 1122


🏁 Script executed:

# One final verification: check if there's any path validation that could catch wrong module_path
rg -n "module_path" plugins/docs-tools/skills/docs-workflow-import-markdown/scripts/wire_assemblies.py | head -20

Repository: redhat-documentation/redhat-docs-agent-tools

Length of output: 1214


Tighten the ID stem and heading text matching patterns.

Both grep_for_id_stem() and search_by_heading_text() use prefix-based searches: Line 126 matches any [id="<stem>_ prefix (e.g., [id="about-the-python-index_custom"] and [id="about-the-python-index_other" both match stem about-the-python-index), and Line 148 matches any heading starting with the given text without anchoring to the end. While the code correctly skips sections when multiple matches are found (Lines 204–215 and 230–241, setting module_path=None), a single false positive would be treated as a correct match and could assign a section to the wrong module. Extract the full IDs and titles directly in Python and compare for exact equality instead of relying on prefix-matching grep patterns. This also improves code clarity and maintainability.

🧰 Tools
🪛 GitHub Actions: Ruff

[error] 128-129: Ruff S603: subprocess call: check for execution of untrusted input.


[error] 129-129: Ruff S607: Starting a process with a partial executable path.


[error] 150-151: Ruff S603: subprocess call: check for execution of untrusted input.


[error] 151-151: Ruff S607: Starting a process with a partial executable path.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@plugins/docs-tools/skills/docs-workflow-import-markdown/scripts/import_markdown.py`
around lines 122 - 164, In grep_for_id_stem and search_by_heading_text, stop
relying on prefix-only grep patterns and instead (1) use your existing grep
calls only to narrow candidate files, (2) open each candidate file and extract
the full id attributes with a regex like r'\[id=["\']([^"\']+)["\']\]' and
confirm the left-hand stem by splitting on '_' and comparing exactly to stem
(i.e., extracted_id.split('_',1)[0] == stem) in grep_for_id_stem, and (3) in
search_by_heading_text read each candidate file and parse the title line that
begins with "= " and compare the full extracted title string exactly to the
provided heading_text (replace the current escaped/pattern usage); update
functions grep_for_id_stem and search_by_heading_text to return only files that
pass these exact equality checks.

Comment on lines +120 to +127
footer_start = None

current_inline_heading = None
current_inline_lines = []

for i, line in enumerate(lines):
if line.strip().startswith("ifdef::parent-context"):
footer_start = i
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Remove the dead footer_start assignment.

Ruff is already failing because this variable is never read.

🧰 Tools
🪛 GitHub Actions: Ruff

[error] 127-127: Ruff F841: Local variable footer_start is assigned to but never used. Remove assignment or use the variable.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@plugins/docs-tools/skills/docs-workflow-import-markdown/scripts/wire_assemblies.py`
around lines 120 - 127, The variable footer_start is assigned but never used;
remove the dead assignment and any related unused initialization (delete the
line that sets footer_start = None and any subsequent redundant references)
inside the loop that iterates over lines (the for i, line in enumerate(lines):
block) so the unused-variable warning from Ruff is resolved while leaving
current_inline_heading and current_inline_lines intact.

Comment on lines +611 to +631
def find_master_adoc(repo_path, known_assembly_paths):
"""Find the master.adoc that includes one of the known assemblies."""
for master in repo_path.rglob("master.adoc"):
text = master.read_text(encoding="utf-8")
for asm_path in known_assembly_paths:
filename = Path(asm_path).name
if filename in text:
return master
return None


def insert_assembly_in_master(master_path, new_assembly_path, after_assembly_path):
"""Insert an include directive for a new assembly in master.adoc.

Places it after the include for after_assembly_path.
"""
text = master_path.read_text(encoding="utf-8")
lines = text.splitlines()

new_filename = Path(new_assembly_path).name
new_include = f"include::assemblies/{new_filename}[leveloffset=+1]"
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Build the master.adoc include path relative to the master file.

find_master_adoc() can return a nested master.adoc, but insert_assembly_in_master() always writes include::assemblies/<file>[leveloffset=+1]. For any non-root master this produces a broken include path. Compute the include relative to master_path.parent instead.

Possible fix
-def insert_assembly_in_master(master_path, new_assembly_path, after_assembly_path):
+def insert_assembly_in_master(master_path, repo_path, new_assembly_path, after_assembly_path):
     """Insert an include directive for a new assembly in master.adoc.
 
     Places it after the include for after_assembly_path.
     """
     text = master_path.read_text(encoding="utf-8")
     lines = text.splitlines()
 
     new_filename = Path(new_assembly_path).name
-    new_include = f"include::assemblies/{new_filename}[leveloffset=+1]"
+    include_path = Path(
+        os.path.relpath(repo_path / new_assembly_path, master_path.parent)
+    ).as_posix()
+    new_include = f"include::{include_path}[leveloffset=+1]"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@plugins/docs-tools/skills/docs-workflow-import-markdown/scripts/wire_assemblies.py`
around lines 611 - 631, find_master_adoc can return a non-root master.adoc but
insert_assembly_in_master always writes a hardcoded
"include::assemblies/<file>[...]" which breaks relative paths; change
insert_assembly_in_master to compute the include path relative to
master_path.parent (use Path(master_path).parent and generate a relative path
from that directory to new_assembly_path), normalize separators to forward
slashes and strip any leading "./", then use that relative path in the include
directive instead of the hardcoded assemblies/<file>; update references to
new_filename/new_include accordingly and keep insertion logic around
after_assembly_path unchanged.

Comment on lines +756 to +767
new_assemblies_to_insert.append({
"new_path": new_path,
"after_path": last_matched_assembly_path,
})

if new_assemblies_to_insert and known_assembly_paths:
master = find_master_adoc(repo_path, known_assembly_paths)
if master:
master_rel = str(master.relative_to(repo_path))
for item in new_assemblies_to_insert:
after = item["after_path"] or known_assembly_paths[-1]
if args.dry_run:
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

New top-level assemblies can be inserted in the wrong order.

after_path is captured from the last matched assembly seen so far. When a new level-2 section appears before the first matched assembly, it stays None, and Line 766 falls back to known_assembly_paths[-1], which appends that include at the end of master.adoc instead of preserving document order.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@plugins/docs-tools/skills/docs-workflow-import-markdown/scripts/wire_assemblies.py`
around lines 756 - 767, The code appends new top-level assemblies at the end
when after_path is None because it falls back to known_assembly_paths[-1];
instead, preserve document order by falling back to the first known assembly
(known_assembly_paths[0]) so inserts that occur before the first matched
assembly go before it. In the loop that iterates new_assemblies_to_insert (using
variables new_assemblies_to_insert, known_assembly_paths, master, master_rel,
args.dry_run), replace the fallback after = item["after_path"] or
known_assembly_paths[-1] with using known_assembly_paths[0] (or otherwise
compute an appropriate first-known fallback) so new includes retain their
original ordering rather than being appended to the end.

Comment on lines +46 to +64
### 2. Verify output

Check that the `manifest_path` from the JSON output exists.

### 3. Report summary

Print the summary to the user:

```
Import matching complete: <matched> matched, <new> new, <skipped> skipped
```

If there are skipped sections, add:

```
Skipped sections require manual attention — see match-manifest.json for details.
```

## Known limitations
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Add required step-result.json sidecar instructions for this step skill.

This step currently documents verify/report only; it should also emit <base-path>/import-markdown/step-result.json with schema_version, step, ticket, and completed_at.

Based on learnings: “All docs-workflow-* step skills must write a step-result.json sidecar alongside their primary output with schema_version, step, ticket, and completed_at fields.”

🧰 Tools
🪛 markdownlint-cli2 (0.22.1)

[warning] 54-54: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


[warning] 60-60: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@plugins/docs-tools/skills/docs-workflow-import-markdown/SKILL.md` around
lines 46 - 64, The SKILL.md must document that this step emits a sidecar JSON at
<base-path>/import-markdown/step-result.json in addition to verifying
manifest_path and writing match-manifest.json; update the "Verify output" /
"Report summary" sections to state that the step writes a step-result.json
containing the fields schema_version, step (use "import-markdown"), ticket
(ticket or run identifier), and completed_at (ISO8601 timestamp), and show a
short example JSON payload and placement next to the primary output so consumers
can find it easily.

Comment on lines +115 to 118
if [[ -n "$IMPORT_FROM" ]]; then
MODE="import"
elif [[ -n "$FIX_FROM" ]]; then
MODE="fix"
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Reject simultaneous --import-from and --fix-from flags.

Right now, both can be passed and Line 115 silently forces import mode. That makes --fix-from unexpectedly ignored.

Suggested fix
 # --- Determine mode ---
 MODE=""
+if [[ -n "$IMPORT_FROM" && -n "$FIX_FROM" ]]; then
+  echo "ERROR: --import-from and --fix-from are mutually exclusive." >&2
+  exit 1
+fi
 if [[ -n "$IMPORT_FROM" ]]; then
   MODE="import"
 elif [[ -n "$FIX_FROM" ]]; then
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
if [[ -n "$IMPORT_FROM" ]]; then
MODE="import"
elif [[ -n "$FIX_FROM" ]]; then
MODE="fix"
# --- Determine mode ---
MODE=""
if [[ -n "$IMPORT_FROM" && -n "$FIX_FROM" ]]; then
echo "ERROR: --import-from and --fix-from are mutually exclusive." >&2
exit 1
fi
if [[ -n "$IMPORT_FROM" ]]; then
MODE="import"
elif [[ -n "$FIX_FROM" ]]; then
MODE="fix"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@plugins/docs-tools/skills/docs-workflow-writing/scripts/build_writing_args.sh`
around lines 115 - 118, The script currently prefers import mode when both
IMPORT_FROM and FIX_FROM are set, silently ignoring --fix-from; modify the logic
in the build_writing_args.sh branch that sets MODE (the block referencing
IMPORT_FROM, FIX_FROM, and MODE) to first detect if both IMPORT_FROM and
FIX_FROM are non-empty and, if so, print a clear error message to stderr and
exit with a non-zero status; otherwise continue to set MODE="import" when
IMPORT_FROM is set and MODE="fix" when FIX_FROM is set.

Comment on lines +271 to +289
### 3. Post-processing: Wire assemblies (import mode only)

**Only run this step when `mode == "import"` AND `repo_path` is not null.**

After the writing agent finishes, run the assembly wiring script to add `include::` directives deterministically:

```bash
python3 ${CLAUDE_PLUGIN_ROOT}/skills/docs-workflow-import-markdown/scripts/wire_assemblies.py \
--manifest <INPUT_FILE> --repo-path <REPO_PATH>
```

The script reads the match manifest, parses original assembly files to learn which sections were previously inlined vs included, and rebuilds each assembly preserving the same structure. It also creates new assemblies and updates `master.adoc` when needed.

The script emits JSON on stdout with:
- `assemblies_updated`: list of assembly files modified
- `assemblies_created`: list of new assembly files created
- `master_updated`: path to master.adoc if updated
- `missing_modules`: include targets where the module file does not exist on disk — report these to the user

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Include wire_assemblies.py outputs in step-result.json files list.

In import mode, post-processing can modify/create assemblies after _index.md is generated. If Line 304 only parses _index.md, sidecar files can underreport actual changes.

Suggested update to the step instructions
-Read the manifest at `<OUTPUT_FILE>` (`_index.md`). Extract every absolute file path from the table rows. These become the `files` array.
+Read the manifest at `<OUTPUT_FILE>` (`_index.md`). Extract every absolute file path from the table rows.
+
+If import post-processing was executed, also read the wiring script JSON output and append:
+- `assemblies_updated[]`
+- `assemblies_created[]`
+- `master_updated` (if non-null)
+
+Deduplicate and write the merged absolute paths as the final `files` array.

Also applies to: 304-320

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@plugins/docs-tools/skills/docs-workflow-writing/SKILL.md` around lines 271 -
289, The post-processing step description must instruct the workflow to include
the outputs of wire_assemblies.py in the step-result.json `files` list when mode
== "import" and repo_path is set: after running python3
${CLAUDE_PLUGIN_ROOT}/skills/docs-workflow-import-markdown/scripts/wire_assemblies.py,
capture its JSON keys (`assemblies_updated`, `assemblies_created`,
`master_updated`, and report `missing_modules`) and add those file paths to the
step-result.json `files` array (not just the generated _index.md). Update the
SKILL.md section around the post-processing instructions and the parsing logic
referenced at the existing 304-320 area to explicitly mention collecting and
merging these keys into the step result so created/modified assemblies and
master.adoc updates are reported to the user and missing module targets are
surfaced.

- Any `include::` directives not covered by the new content
- Existing AsciiDoc attributes and conditionals

**For new sections:** Create a new AsciiDoc module. Determine the content type (CONCEPT, PROCEDURE, REFERENCE) from the content structure. Follow the repo's existing file naming and directory conventions. Add an `include::` statement in the appropriate assembly file.
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Design doc conflicts with import-mode assembly wiring rules.

Line 131 says new sections should add include:: directly in assemblies, but the implemented flow delegates include wiring to wire_assemblies.py. Please align this to avoid reintroducing forbidden direct include edits.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@specs/2026-04-22-import-markdown-design.md` at line 131, The design doc
instructs editors to add include:: statements directly into assemblies, which
conflicts with the delegated wiring implemented in wire_assemblies.py; update
the spec text (the guidance around "For new sections" and the assembly include
rule) to state that new AsciiDoc modules should be created and that include
wiring is handled automatically by wire_assemblies.py (do not edit assembly
files directly), and reference the wire_assemblies.py wiring flow and any
functions like build_assembly_index or add_module_include so readers know which
script performs the assembly include wiring.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant