Identify working alternative instructions through semantic variation testing by shailja-thakur · Pull Request #16 · generative-computing/mellea-contribs

shailja-thakur · 2025-12-13T08:05:59Z

This PR enables discovering validated alternative instructions by testing Mellea m-programs against semantically equivalent variations of a problem. Users can find alternative phrasings that work reliably with their m-programs.

What This Enables

Run m-programs through semantic variations generated by BenchDrift
Identify which variations the m-program answers correctly
Extract validated alternative instructions (semantic variations that work)
Build libraries of working alternative prompts

Key Components

run_benchdrift_pipeline(): Orchestrates semantic variation generation and m-program testing
extract_replacement_instructions(): Extracts the variations on which the m-program succeeded
Configurable variation strategies to focus discovery
Comprehensive test workflow for instruction alternatives
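
The extraction step described above can be illustrated with a minimal sketch. This is not the PR's implementation: the `VariationResult` record, its field names, and the `extract_working_variations` helper are hypothetical, and the sketch only assumes that each tested variation carries a pass/fail outcome.

```python
from dataclasses import dataclass


@dataclass
class VariationResult:
    """Hypothetical record for one tested variation (not the PR's actual type)."""
    original: str        # the original instruction
    variation: str       # a semantically equivalent rephrasing
    variation_type: str  # e.g. "generic" or "persona"
    passed: bool         # whether the m-program answered this variation correctly


def extract_working_variations(results):
    """Group the variations that passed under their original instruction."""
    working = {}
    for r in results:
        if r.passed:
            working.setdefault(r.original, []).append(r.variation)
    return working


# Made-up example data, for illustration only.
results = [
    VariationResult("Add 2 and 3.", "What is the sum of 2 and 3?", "generic", True),
    VariationResult("Add 2 and 3.", "As a teacher, compute 2 plus 3.", "persona", False),
]
alternatives = extract_working_variations(results)
```

Keying the result by the original instruction keeps each validated alternative next to the phrasing it can replace, which is one plausible shape for a library of working prompts.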

Benefits

Discover semantically equivalent alternatives that work
Build tested collections of working alternative instructions
Find different ways to phrase instructions with the same meaning
Automatically validate alternatives through testing

…placement

- Add variation_types parameter to run_benchdrift_pipeline() to allow users to customize which semantic variation types to generate (generic, cluster_variations, persona, long_context)
- Update test/2_test_instruction_replacement.py to demonstrate variation_types usage
- Add docs/INSTRUCTION_REPLACEMENT.md with comprehensive documentation for instruction replacement workflow
- Enables discovery of validated alternative instructions with customizable variation strategies

🤖 Generated with Claude Code

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
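
As a rough illustration of how a variation_types argument might be checked before generation, here is a small sketch. The four type names come from the commit message; the `select_variation_types` helper, its default of "all types when nothing is requested", and its error handling are assumptions and may not match what run_benchdrift_pipeline() actually does.

```python
# Type names as listed in the commit message.
SUPPORTED_VARIATION_TYPES = ("generic", "cluster_variations", "persona", "long_context")


def select_variation_types(requested=None):
    """Return the variation types to generate (hypothetical helper).

    Assumption: None means "use every supported type".
    """
    if requested is None:
        return list(SUPPORTED_VARIATION_TYPES)
    unknown = [t for t in requested if t not in SUPPORTED_VARIATION_TYPES]
    if unknown:
        raise ValueError(f"unknown variation types: {unknown}")
    # Preserve the caller's order while dropping duplicates.
    return list(dict.fromkeys(requested))


selected = select_variation_types(["persona", "generic", "persona"])
```

Rejecting unknown names early would surface a typo in the type list before any variations are generated or tested.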

nrfulton · 2026-01-09T14:21:52Z

Marking as draft; please mark ready for review when benchdrift is open-sourced.

mergify · 2026-01-23T21:45:15Z

Merge Protections

Your pull request matches the following merge protections and will not be merged until they are valid.

🔴 Enforce conventional commit

This rule is failing.

Make sure that we follow https://www.conventionalcommits.org/en/v1.0.0/

title ~= ^(fix|feat|docs|style|refactor|perf|test|build|ci|chore|revert|release)(?:\(.+\))?:
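
To see why the rule is failing, the title can be checked locally against the same pattern. This sketch assumes the `~=` operator behaves like a regular-expression search; the `is_conventional` helper and the `feat:` title shown are illustrations, not a title proposed by the author.

```python
import re

# Pattern copied from the merge protection rule above.
PATTERN = re.compile(
    r"^(fix|feat|docs|style|refactor|perf|test|build|ci|chore|revert|release)(?:\(.+\))?:"
)


def is_conventional(title: str) -> bool:
    """Return True if the title starts with a conventional-commit prefix."""
    return PATTERN.search(title) is not None


current = "Identify working alternative instructions through semantic variation testing"
# Hypothetical retitle, for illustration only.
candidate = "feat: identify working alternative instructions through semantic variation testing"

current_ok = is_conventional(current)
candidate_ok = is_conventional(candidate)
```

The current title has no type prefix, so it does not match; a title with a type and an optional scope, followed by a colon, would.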

nrfulton requested a review from avinash2692 January 9, 2026 14:16

nrfulton marked this pull request as draft January 9, 2026 14:21

shailja-thakur wants to merge 1 commit into generative-computing:main from shailja-thakur:feature/instruction-replacement-variation-types

2 participants
