@kukoyio kukoyio commented Feb 10, 2026

📝 Summary

This PR introduces the initial Source Scanner Plugin for MCP Gateway, providing pre-deployment static analysis of MCP server source code using Semgrep.
Included: plugin registration, Semgrep runner, normalized finding model, policy evaluation, configuration parsing, and initial tests/documentation.
Incomplete: Bandit runner, output normalizer, language detection, and full test coverage. These will be implemented in follow-up PRs.
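
As a rough illustration of what a normalized finding and a threshold-style policy check could look like (this is not the code in this PR): the `Finding` fields, `findings_from_semgrep`, `violates_policy`, and the fail-on-severity rule are all assumptions made for the sketch. Only the `results`, `check_id`, `path`, `start.line`, and `extra` keys are taken from Semgrep's JSON output.

```python
import json
from dataclasses import dataclass


# Hypothetical normalized finding; field names are illustrative only.
@dataclass(frozen=True)
class Finding:
    rule_id: str
    path: str
    line: int
    severity: str
    message: str


# Classic Semgrep severities, ordered from least to most severe.
SEVERITY_RANK = {"INFO": 0, "WARNING": 1, "ERROR": 2}


def findings_from_semgrep(raw_json: str) -> list[Finding]:
    """Convert the text of a Semgrep JSON report into Finding objects."""
    report = json.loads(raw_json)
    findings = []
    for result in report.get("results", []):
        extra = result.get("extra", {})
        findings.append(
            Finding(
                rule_id=result.get("check_id", ""),
                path=result.get("path", ""),
                line=result.get("start", {}).get("line", 0),
                severity=extra.get("severity", "INFO"),
                message=extra.get("message", ""),
            )
        )
    return findings


def violates_policy(findings: list[Finding], fail_on: str = "ERROR") -> bool:
    """Return True if any finding is at or above the fail_on severity.

    Unknown severities are ranked highest so the check fails closed
    (an assumption of this sketch, not a documented policy).
    """
    threshold = SEVERITY_RANK[fail_on]
    highest = max(SEVERITY_RANK.values())
    return any(SEVERITY_RANK.get(f.severity, highest) >= threshold for f in findings)
```

With a model like this, the policy layer only ever sees `Finding` objects, so a later Bandit runner could plug in by producing the same type.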


🏷️ Type of Change

  • Bug fix
  • Feature / Enhancement
  • Documentation
  • Refactor
  • Chore (deps, CI, tooling)
  • Other (describe below)

🧪 Verification

| Check | Command | Status |
| --- | --- | --- |
| Lint suite | make lint | passes |
| Unit tests | make test | passing for implemented files |
| Coverage ≥ 80% | make coverage | 27% |

✅ Checklist

  • Code formatted (make black isort pre-commit)
  • Tests added/updated for changes
  • Documentation updated (if applicable)
  • No secrets or credentials committed

📓 Notes (optional)

This is a scaffold / incremental PR; remaining functionality (Bandit, normalizer, language detection, full coverage) will be added in follow-up PRs.
Some linting and coverage issues exist on Windows; CI runs will be verified on GitHub.
Only relevant plugin files are included to avoid pushing unrelated GitLab files.

- Plugin skeleton and registration
- Semgrep CLI invocation and finding model
- Policy evaluation and config normalization
- Initial tests and documentation

NOTE:
- Bandit, normalizer, and language detection are intentionally incomplete
- exec.run_command integration and severity normalization will follow

Signed-off-by: Ayo Kukoyi <[email protected]>
@crivetimihai
Member

Thanks for this, @kukoyio! The plugin architecture is well-structured - separation between scanners, policy, storage, and API layers is clean, and the Interface Alignment doc is a great idea for coordinating multi-contributor work.

A few things I noticed:

Architecture / Consistency:

  1. Sync vs Async mismatch - semgrep_runner.py uses subprocess.run() (synchronous), but utils/exec.py provides an async run_command() wrapper. The Interface Alignment doc says all external commands should use utils.exec.run_command(). Worth aligning, or noting as a follow-up.

  2. print() → logger - semgrep_runner.py uses print() for output. The project convention is to use LoggingService. Switching to logger.info() / logger.warning() would be more consistent.

  3. Plugin class missing - The manifest references plugins.source_scanner.source_scanner.SourceScannerPlugin, but no source_scanner.py with that class exists in this PR. Is this expected as a follow-up?

  4. repo_fetcher.py is empty - Just a header. Fine for scaffolding, but worth noting it's a placeholder.
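
For points 1 and 2, something along these lines might work. It is only a sketch: I have not reproduced the actual signature of `utils.exec.run_command`, so the `run_command` below is a hypothetical stand-in built on `asyncio.create_subprocess_exec`, and the standard `logging` module stands in for LoggingService. The `timeout` parameter and the `run_tool` helper are likewise assumptions.

```python
import asyncio
import logging

# Stand-in for the project's LoggingService (assumption).
logger = logging.getLogger(__name__)


async def run_command(argv: list[str], timeout: float = 60.0) -> tuple[int, str, str]:
    """Hypothetical async wrapper; the real utils.exec.run_command may differ."""
    proc = await asyncio.create_subprocess_exec(
        *argv,
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.PIPE,
    )
    try:
        stdout, stderr = await asyncio.wait_for(proc.communicate(), timeout=timeout)
    except asyncio.TimeoutError:
        # Do not leave a stuck scanner process behind.
        proc.kill()
        await proc.wait()
        raise
    return proc.returncode, stdout.decode(errors="replace"), stderr.decode(errors="replace")


async def run_tool(argv: list[str]) -> str:
    """Run an external tool without blocking the event loop; log instead of print."""
    logger.info("Running %s", argv[0])
    returncode, stdout, stderr = await run_command(argv)
    if stderr:
        logger.warning("%s wrote to stderr: %s", argv[0], stderr.strip())
    logger.info("%s exited with code %d", argv[0], returncode)
    return stdout
```

The Semgrep runner could then `await run_tool(["semgrep", "scan", "--json", target_dir])` rather than calling `subprocess.run()` directly, which keeps the event loop free while a scan is in progress.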

Security:

  5. subprocess.run(["rm", "-rf", temp_folder]) - Consider using shutil.rmtree() instead of shelling out. It is safer, cross-platform, and avoids potential issues with unexpected characters in the path.

  6. _clone_repo takes user-provided URLs - The command-as-list form helps, but a "URL" that starts with a dash, such as --upload-pack=malicious, could still be interpreted by git as an option. Consider adding -- before the URL argument.
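
A minimal sketch of both suggestions, assuming helper names (`build_clone_command`, `remove_temp_folder`) that are not in the PR. The explicit rejection of dash-leading URLs and the `--depth 1` flag are extras I added for illustration; the `--` separator alone is the core of the suggestion.

```python
import shutil


def build_clone_command(repo_url: str, dest: str) -> list[str]:
    """Build a git clone argv that cannot treat the URL as an option.

    Hypothetical helper; the name and the shallow-clone flag are assumptions.
    """
    if repo_url.startswith("-"):
        # Extra guard: refuse option-like input outright.
        raise ValueError(f"Refusing option-like repository URL: {repo_url!r}")
    # "--" ends option parsing, so everything after it is positional.
    return ["git", "clone", "--depth", "1", "--", repo_url, dest]


def remove_temp_folder(temp_folder: str) -> None:
    """Delete a temporary checkout without shelling out to rm -rf."""
    # ignore_errors=True makes cleanup best-effort, e.g. if the folder is already gone.
    shutil.rmtree(temp_folder, ignore_errors=True)
```

Because `shutil.rmtree()` receives the path as a Python string, no shell or argument parsing is involved, so spaces or leading dashes in `temp_folder` are not a concern.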

Minor:

  7. REPOSITORY_AVAILABLE = True is hardcoded in the router - should this be dynamically determined?

  8. Copyright dates show 2025 in several files - should be 2026.

  9. Storage layer - The separate Base = declarative_base() creates its own metadata, separate from mcpgateway.db. For full integration this will need to share the same Base or use Alembic migrations.
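
On point 7, one possible approach is to derive the flag from whether the storage layer's dependency can be imported. This is a sketch under assumptions: I am guessing that "available" means "the storage backend's package is importable", and both `module_available` and the choice of `sqlalchemy` as the probed module are illustrative.

```python
import importlib.util


def module_available(name: str) -> bool:
    """Return True if the named module can be located without importing it fully."""
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        # find_spec raises if a parent package is missing or a spec is malformed.
        return False


# Assumption: the repository layer is usable whenever SQLAlchemy is installed.
REPOSITORY_AVAILABLE = module_available("sqlalchemy")
```

If availability really depends on configuration or a live database connection rather than an import, a startup health check would be a better fit than an import probe.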

Test fixtures in conftest.py are thorough. Looking forward to seeing the Bandit runner and full integration in follow-ups.

@crivetimihai added the tcd (SwEng Projects) and sweng-group-12 (SwEng Group 12 - AI-Powered Security Scanner MCP Server for Pre-Deployment Validation) labels on Feb 10, 2026
@crivetimihai
Member

Could you also please link this PR to the specific GitHub issue it closes or partially addresses? E.g. "closes #..."
