
feat: Content Provenance experiment (C2PA 2.3 §A.7 text authentication)#294

Draft
erik-sv wants to merge 6 commits into WordPress:develop from erik-sv:feature/content-provenance-experiment

Conversation

erik-sv commented Mar 10, 2026

Summary

Adds a new Content Provenance experiment that embeds cryptographic C2PA 2.3 §A.7 manifests into post content as invisible Unicode variation selectors. Publishers can prove authorship, detect tampering, and participate in the emerging content authenticity ecosystem (the same standard used by Google, BBC, Adobe, OpenAI, and Microsoft).

The latest commit extends this with AI fragment provenance — output filter hooks on all five AI Ability classes so that individually generated titles, excerpts, summaries, review notes, and alt text can each carry their own embedded manifest.

What this adds

Content Provenance experiment

  • Auto-signs posts on publish/update with c2pa.created / c2pa.edited actions and provenance-chain ingredient references
  • Three signing tiers — Local (zero setup, self-signed), Connected (delegated to an HTTP signing service), BYOK (publisher's own certificate)
  • c2pa/sign and c2pa/verify Abilities — any plugin can call wp_do_ability('c2pa/sign', ['text' => …])
  • Gutenberg sidebar panel — 5-state shield badge (verified / local-signed / modified / tampered / unsigned) with one-click sign/verify
  • /.well-known/c2pa discovery endpoint — C2PA §6.4 compliant JSON document
  • Verification badge — optional frontend badge on public posts

AI fragment provenance (latest commit)

  • Output filter hook on each Ability result: wp_ai_experiment_title_generation_result, wp_ai_experiment_excerpt_generation_result, wp_ai_experiment_summarization_result, wp_ai_experiment_review_notes_result, wp_ai_experiment_alt_text_result
  • New sign_ai_fragments setting — when enabled, Content Provenance intercepts these filters and embeds a C2PA manifest into each AI-generated fragment before it reaches the editor
  • Fails open — signing errors return the original result unchanged, never blocking output

Post signing flow

flowchart TD
    A[Post Published or Updated] --> B{Content Provenance enabled?}
    B -->|No| Z[Skip]
    B -->|Yes| C[Strip HTML to plain text]
    C --> D[Build C2PA Manifest]
    D --> D1[c2pa.actions.v1]
    D --> D2[c2pa.hash.data.v1 SHA-256]
    D --> D3[c2pa.soft_binding.v1]
    D --> D4[c2pa.ingredient.v2 edit chain]
    D1 & D2 & D3 & D4 --> E{Signing tier}
    E -->|Local| F[RSA-2048 self-signed via OpenSSL]
    E -->|Connected| G[POST to signing service HTTP API]
    E -->|BYOK| H[Publisher cert PEM file]
    F & G & H --> I[Unicode Embedder: VS1-VS256 invisible bytes]
    I --> J[wp_update_post with embedded content]
    J --> K[Store post meta: _c2pa_manifest, _c2pa_status, _c2pa_signed_at]
    K --> L[Gutenberg sidebar shield badge]
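The first half of the flow above (strip HTML, hash the text, collect assertions) might look roughly like the following. This is a minimal sketch, not the PR's `C2PA_Manifest_Builder`: the function name `sketch_build_assertions`, the array shape, and the hex-encoded hash are all assumptions made for illustration, and the soft-binding assertion is left out.

```php
<?php
/**
 * Sketch: reduce post HTML to plain text and assemble an assertion list.
 * Function name and array shape are hypothetical, not taken from the PR.
 */
function sketch_build_assertions( string $html, string $action, ?string $previous_manifest_id = null ): array {
	// Strip markup, decode entities, and collapse whitespace so the hash covers text only.
	$plain = html_entity_decode( strip_tags( $html ), ENT_QUOTES, 'UTF-8' );
	$text  = trim( (string) preg_replace( '/\s+/u', ' ', $plain ) );

	$assertions = array(
		array(
			'label' => 'c2pa.actions.v1',
			'data'  => array( 'actions' => array( array( 'action' => $action ) ) ),
		),
		array(
			'label' => 'c2pa.hash.data.v1',
			// Hex encoding is an illustrative choice only.
			'data'  => array( 'alg' => 'sha256', 'hash' => hash( 'sha256', $text ) ),
		),
	);

	// An edit links back to the manifest it supersedes (the provenance chain).
	if ( null !== $previous_manifest_id ) {
		$assertions[] = array(
			'label' => 'c2pa.ingredient.v2',
			'data'  => array( 'relationship' => 'parentOf', 'manifest_id' => $previous_manifest_id ),
		);
	}

	return array( 'text' => $text, 'assertions' => $assertions );
}
```

Calling it with `'c2pa.created'` and no previous manifest yields two assertions; passing a previous manifest identifier with `'c2pa.edited'` adds the ingredient reference.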

AI fragment provenance flow

flowchart LR
    A[Editor triggers AI Ability] --> B[Ability executes and returns result]
    B --> C[apply_filters on wp_ai_experiment_*_result]
    C --> D{sign_ai_fragments enabled?}
    D -->|No| E[Original result returned to editor]
    D -->|Yes| F[C2PA_Manifest_Builder::build]
    F --> G[Unicode_Embedder::embed]
    G -->|Success| H[Signed fragment returned to editor]
    G -->|Error| E
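The fail-open branch in the flow above can be sketched as a small wrapper. This is an assumption-laden illustration rather than the PR's `sign_ai_fragment()` method: `sketch_sign_fragment` and the `$signer` callable are hypothetical stand-ins for the manifest-build and embed steps, and the real code may signal failure with `WP_Error` instead of exceptions.

```php
<?php
/**
 * Sketch of a fail-open signing wrapper. $signer stands in for the
 * build + embed step; any failure leaves the original text untouched.
 */
function sketch_sign_fragment( string $text, bool $enabled, callable $signer ): string {
	if ( ! $enabled || '' === $text ) {
		return $text;
	}
	try {
		$signed = $signer( $text );
		// Treat a non-string or empty result as a failure too.
		return ( is_string( $signed ) && '' !== $signed ) ? $signed : $text;
	} catch ( \Throwable $e ) {
		// Fail open: never block AI output on a signing error.
		return $text;
	}
}
```

The design choice is that provenance is additive: an editor who asked for a generated title still gets one even when the signing service is down or misconfigured.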

Signing tiers

| Tier | Trust model | Setup required | On WordPress trust list |
| --- | --- | --- | --- |
| Local | Self-signed RSA-2048, stored in site options | None | No (yellow badge) |
| Connected | Delegated to HTTP signing service | Service URL + API key | Yes (green badge) |
| BYOK | Publisher's own certificate | PEM file path | Yes (green badge) |
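For a feel of what the Local tier involves, here is a minimal sketch of RSA-2048 key generation and SHA-256 signing with PHP's OpenSSL extension. It is not the PR's `Local_Signer`: the `sketch_*` function names are hypothetical, no self-signed certificate is issued, and a plain PKCS#1 signature is produced rather than the COSE structure a conforming C2PA manifest would carry.

```php
<?php
/**
 * Sketch of local signing: generate an RSA-2048 keypair, then sign and
 * verify a payload. Requires the OpenSSL extension.
 */
function sketch_generate_keypair(): ?array {
	$key = openssl_pkey_new(
		array(
			'private_key_bits' => 2048,
			'private_key_type' => OPENSSL_KEYTYPE_RSA,
		)
	);
	// openssl_pkey_new() returns false on failure; check before using the key.
	if ( false === $key ) {
		return null;
	}
	$private_pem = '';
	if ( ! openssl_pkey_export( $key, $private_pem ) ) {
		return null;
	}
	$details = openssl_pkey_get_details( $key );
	if ( false === $details ) {
		return null;
	}
	return array( 'private' => $private_pem, 'public' => $details['key'] );
}

function sketch_sign( string $payload, string $private_pem ): ?string {
	$signature = '';
	$ok        = openssl_sign( $payload, $signature, $private_pem, OPENSSL_ALGO_SHA256 );
	return $ok ? base64_encode( $signature ) : null;
}

function sketch_verify( string $payload, string $signature_b64, string $public_pem ): bool {
	$raw = (string) base64_decode( $signature_b64 );
	// openssl_verify() returns 1 for a valid signature, 0 for invalid, -1 or false on error.
	return 1 === openssl_verify( $payload, $raw, $public_pem, OPENSSL_ALGO_SHA256 );
}
```

Because nothing outside the site vouches for a key generated this way, a verifier can confirm the text was not altered but cannot confirm who signed it, which is presumably why the table keeps this tier off the trust list.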

WordPress Abilities API

// Sign any text
$result = wp_do_ability( 'c2pa/sign', [
    'text'   => 'The content to sign',
    'action' => 'c2pa.created',  // or c2pa.edited
] );
// $result['signed_text'] — Unicode-embedded provenance
// $result['manifest']    — full C2PA JSON manifest
// $result['signer_tier'] — local | connected | byok

// Verify any text
$result = wp_do_ability( 'c2pa/verify', [
    'text' => $post->post_content,
] );
// $result['verified'] — bool
// $result['status']   — verified | unsigned | tampered | modified
// $result['manifest'] — parsed manifest array if present

Fragment hook usage for third-party plugins

add_filter( 'wp_ai_experiment_title_generation_result', function( $result, $context ) {
    // $result['titles'] — array of generated title strings, each may be signed
    // $context['post_id'] — the post being edited
    return $result;
}, 10, 2 );

Files changed

| File | Type | Description |
| --- | --- | --- |
| includes/Experiments/Content_Provenance/Content_Provenance.php | New | Main experiment class + fragment hooks |
| includes/Experiments/Content_Provenance/C2PA_Manifest_Builder.php | New | Manifest construction + verification |
| includes/Experiments/Content_Provenance/Unicode_Embedder.php | New | VS1–VS256 embed/extract/strip |
| includes/Experiments/Content_Provenance/Well_Known_Handler.php | New | /.well-known/c2pa endpoint |
| includes/Experiments/Content_Provenance/Verification_Badge.php | New | Frontend badge |
| includes/Experiments/Content_Provenance/Signing/Signing_Interface.php | New | Signer contract |
| includes/Experiments/Content_Provenance/Signing/Local_Signer.php | New | Self-signed tier |
| includes/Experiments/Content_Provenance/Signing/Connected_Signer.php | New | HTTP service tier |
| includes/Experiments/Content_Provenance/Signing/BYOK_Signer.php | New | Cert-based tier |
| includes/Abilities/Content_Provenance/C2PA_Sign.php | New | c2pa/sign Ability |
| includes/Abilities/Content_Provenance/C2PA_Verify.php | New | c2pa/verify Ability |
| includes/Abilities/Title_Generation/Title_Generation.php | Modified | Add wp_ai_experiment_title_generation_result filter |
| includes/Abilities/Excerpt_Generation/Excerpt_Generation.php | Modified | Add wp_ai_experiment_excerpt_generation_result filter |
| includes/Abilities/Summarization/Summarization.php | Modified | Add wp_ai_experiment_summarization_result filter |
| includes/Abilities/Review_Notes/Review_Notes.php | Modified | Add wp_ai_experiment_review_notes_result filter |
| includes/Abilities/Image/Alt_Text_Generation.php | Modified | Add wp_ai_experiment_alt_text_result filter |
| src/experiments/content-provenance/index.js | New | Gutenberg sidebar panel |
| includes/Experiment_Loader.php | Modified | Register experiment |
| webpack.config.js | Modified | Add JS entry point |
| tests/Integration/…/Content_ProvenanceTest.php | New | 64 integration tests |
| tests/Integration/…/C2PA_Sign_Test.php | New | Ability tests |
| docs/experiments/content-provenance.md | New | User guide |
| docs/experiments/content-provenance-developer.md | New | Developer reference |

Test plan

  • Run composer test -- --filter Content_Provenance — all 64 tests pass
  • Activate experiment → publish a post → verify _c2pa_manifest meta is set
  • Edit a signed post → confirm c2pa.edited action + ingredient reference to previous manifest
  • Call wp_do_ability('c2pa/sign', ['text' => 'hello']) in wp shell — returns signed text
  • Call wp_do_ability('c2pa/verify', ['text' => $signed]) — returns verified: true
  • Visit /.well-known/c2pa — returns valid JSON discovery document
  • Tamper with post content in DB → verify badge shows tampered status
  • Enable "Sign AI fragments" → generate a title → inspect title text for invisible Unicode variation selectors
  • Hook wp_ai_experiment_title_generation_result in a test plugin → confirm callback receives correct args


Implements the C2PA Content Provenance experiment per the WordPress/ai
experiment framework pattern. Zero external dependencies in default
configuration. Fully air-gap compatible with local signing.

## What's included

**Experiment class** (includes/Experiments/Content_Provenance/)
- Content_Provenance extends Abstract_Experiment (id: content-provenance,
  category: EDITOR)
- Auto-signs posts on publish_post (c2pa.created) and post_updated
  (c2pa.edited) with provenance chain ingredient references
- 7 settings: signing_tier, connected_service_url/api_key, byok_certificate,
  auto_sign, show_badge, badge_position
- REST: POST /c2pa-provenance/v1/verify (public),
  GET /c2pa-provenance/v1/status (editor)
- /.well-known/c2pa discovery endpoint per C2PA 2.x §6.4
- Optional verification badge on published posts

**Signing engine** (Signing/)
- Signing_Interface contract
- Local_Signer: RSA-2048 keypair, zero setup
- Connected_Signer: delegates to any C2PA-compliant HTTP service
- BYOK_Signer: publisher's own PEM certificate

**C2PA core**
- C2PA_Manifest_Builder: full C2PA 2.3 manifest with provenance chain
- Unicode_Embedder: VS1-VS256 per C2PA §A.7

**WordPress Abilities API**
- c2pa/sign and c2pa/verify registered as first-class abilities

**Gutenberg sidebar**
- 5-state trust-tier-aware shield badge
- Sign Now, Verify, trust tier notice for local signing

**Tests**: 11 integration tests + 2 ability tests
**Docs**: user guide + developer reference

Out of scope: sentence-level signing, coalition, image provenance, AI output
signing (separate PRs per the three-part contribution plan).

Ref: C2PA 2.3 §A.7
@erik-sv erik-sv force-pushed the feature/content-provenance-experiment branch from 5aadfe6 to 6f950d1 Compare March 10, 2026 17:49
Erik Svilich added 2 commits March 10, 2026 18:19
- Fix Unicode_Embedder VS17-VS256 encoding: 3rd byte must cycle through
  0x84-0x87 (not a fixed 0x84) so continuation bytes stay in 0x80-0xBF.
  Fixes preg_replace /u returning null on invalid UTF-8 in strip()
- Replace str_starts_with() (PHP 8.0) with strncmp() for PHP 7.4 compat
- Remove native `mixed` return type and `array|WP_Error` union type
  (PHP 8.0 native types) from Content_Provenance methods; keep in PHPDoc
- Add public const visibility to MAGIC, VERSION, PREFIX, QUERY_VAR constants
- Remove unused Signing_Hooks_Trait (logic already inlined in main class)
- Add get_public_signer() so the c2pa/sign Ability can access the signer
- Replace all short ternaries (?:) with explicit ternaries per WPCS
- Remove error_log() calls (not allowed in production per WPCS)
- Fix openssl_pkey_new() false-check before calling export/get_details
- Drop Connected_Signer timeout from 15s to 3s per VIP performance standard
- Fix REST /status permission_callback to use $request->get_param()
  instead of $_GET (removes nonce-verification PHPCS warning)
- Fix JS: add curly braces after if-conditions, fix i18n flanking whitespace,
  fix Prettier formatting
- All 11 Content_Provenance integration tests + 2 ability tests pass
- PHPStan level 8: [OK] No errors; PHPCS: 0 errors 0 warnings
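
The VS17-VS256 fix in the first bullet comes down to UTF-8 arithmetic. The sketch below shows one way to map a byte value to the raw UTF-8 bytes of a variation selector; the function name `sketch_byte_to_vs` and the mapping of byte `b` to selector number `b + 1` are assumptions for illustration, not code from the PR.

```php
<?php
/**
 * Sketch: encode one byte (0-255) as the UTF-8 bytes of a variation selector.
 * Assumes byte b maps to VS(b + 1).
 */
function sketch_byte_to_vs( int $byte ): string {
	if ( $byte < 0 || $byte > 255 ) {
		throw new \InvalidArgumentException( 'Byte out of range.' );
	}
	if ( $byte < 16 ) {
		// VS1-VS16 are U+FE00..U+FE0F, which encode as EF B8 80..8F.
		return "\xEF\xB8" . chr( 0x80 + $byte );
	}
	// VS17-VS256 are U+E0100..U+E01EF, which encode as F3 A0 84..87 80..BF.
	$offset = $byte - 16;
	// The third byte advances every 64 code points; the fourth holds the low 6 bits.
	return "\xF3\xA0" . chr( 0x84 + ( $offset >> 6 ) ) . chr( 0x80 + ( $offset & 0x3F ) );
}
```

With a fixed third byte of 0x84, any offset of 64 or more would push the fourth byte past 0xBF, producing invalid UTF-8; that is consistent with `preg_replace()` under the `/u` flag returning null as described above.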
Unicode_Embedder now implements the C2PA 2.3 §A.7 C2PATextManifestWrapper
format exactly as specified:
- Wrapper is APPENDED to NFC-normalized text (not prepended)
- Binary header: 8-byte magic "C2PATXT\0" + 1-byte version + 4-byte
  big-endian manifest length, encoded as variation selectors
- extract() scans for U+FEFF anywhere in the string, validates the
  binary header (magic + version + length), and reads the manifest bytes
- strip() removes U+FEFF and VS1-VS256 code points from anywhere in text
  using a Unicode-aware regex
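
The wrapper layout described in this commit can be sketched end to end as follows. This is an illustration under stated assumptions, not the PR's `Unicode_Embedder`: the `sketch_*` names are hypothetical, the version value 1 is assumed, byte `b` is assumed to map to VS(b + 1), zero-length manifests are rejected, and NFC normalization is skipped (input is assumed to be normalized already).

```php
<?php
const SKETCH_MAGIC   = "C2PATXT\0"; // 8-byte magic quoted in the commit message.
const SKETCH_VERSION = 1;           // Assumed version value.

// Encode raw bytes as variation selectors.
function sketch_vs_encode( string $bytes ): string {
	$out = '';
	$len = strlen( $bytes );
	for ( $i = 0; $i < $len; $i++ ) {
		$b    = ord( $bytes[ $i ] );
		$out .= $b < 16
			? "\xEF\xB8" . chr( 0x80 + $b )
			: "\xF3\xA0" . chr( 0x84 + ( ( $b - 16 ) >> 6 ) ) . chr( 0x80 + ( ( $b - 16 ) & 0x3F ) );
	}
	return $out;
}

// Decode a run that contains only variation selectors (guaranteed by the regex in extract).
function sketch_vs_decode( string $run ): string {
	$out = '';
	$len = strlen( $run );
	for ( $i = 0; $i < $len; ) {
		if ( "\xEF" === $run[ $i ] ) {
			$out .= chr( ord( $run[ $i + 2 ] ) - 0x80 );
			$i   += 3;
		} else {
			$out .= chr( 16 + ( ( ord( $run[ $i + 2 ] ) - 0x84 ) << 6 ) + ( ord( $run[ $i + 3 ] ) - 0x80 ) );
			$i   += 4;
		}
	}
	return $out;
}

// Append U+FEFF followed by header + manifest, all as invisible selectors.
function sketch_embed( string $text, string $manifest ): string {
	$header = SKETCH_MAGIC . chr( SKETCH_VERSION ) . pack( 'N', strlen( $manifest ) );
	return $text . "\xEF\xBB\xBF" . sketch_vs_encode( $header . $manifest );
}

// Find a U+FEFF-prefixed run anywhere in the text and validate its 13-byte header.
function sketch_extract( string $text ): ?string {
	$pattern = '/\x{FEFF}([\x{FE00}-\x{FE0F}\x{E0100}-\x{E01EF}]+)/u';
	if ( ! preg_match_all( $pattern, $text, $matches ) ) {
		return null;
	}
	foreach ( $matches[1] as $run ) {
		$bytes = sketch_vs_decode( $run );
		if ( strlen( $bytes ) < 13 || 0 !== strncmp( $bytes, SKETCH_MAGIC, 8 ) || SKETCH_VERSION !== ord( $bytes[8] ) ) {
			continue;
		}
		$length = unpack( 'N', substr( $bytes, 9, 4 ) )[1];
		if ( $length > 0 && strlen( $bytes ) >= 13 + $length ) {
			return substr( $bytes, 13, $length );
		}
	}
	return null;
}

// Remove U+FEFF and every variation selector from the text.
function sketch_strip( string $text ): string {
	return (string) preg_replace( '/[\x{FEFF}\x{FE00}-\x{FE0F}\x{E0100}-\x{E01EF}]/u', '', $text );
}
```

A round trip would be `sketch_extract( sketch_embed( $text, $json ) )` returning `$json`, while `sketch_strip()` on the same string returns the visible text. Note that a strip of this kind also removes variation selectors the author typed deliberately, such as emoji presentation selectors, so callers may want to restrict it to the wrapper span.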

Also fixes pre-existing REST route test failures in Example_ExperimentTest
by resetting the REST server after experiment initialization in setUp(),
ensuring rest_api_init hooks registered during initialize_experiments()
are captured before test methods run.
codecov bot commented Mar 10, 2026

Codecov Report

❌ Patch coverage is 84.61538% with 164 lines in your changes missing coverage. Please review.
✅ Project coverage is 65.54%. Comparing base (078c9f0) to head (b96f9e2).
⚠️ Report is 12 commits behind head on develop.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| ...eriments/Content_Provenance/Content_Provenance.php | 88.59% | 61 Missing ⚠️ |
| ...eriments/Content_Provenance/Well_Known_Handler.php | 0.00% | 31 Missing ⚠️ |
| ...ncludes/Abilities/Content_Provenance/C2PA_Sign.php | 85.26% | 14 Missing ⚠️ |
| ...riments/Content_Provenance/Signing/BYOK_Signer.php | 72.34% | 13 Missing ⚠️ |
| ...es/Abilities/Title_Generation/Title_Generation.php | 0.00% | 10 Missing ⚠️ |
| ...iments/Content_Provenance/Signing/Local_Signer.php | 72.97% | 10 Missing ⚠️ |
| ...bilities/Excerpt_Generation/Excerpt_Generation.php | 0.00% | 4 Missing ⚠️ |
| includes/Abilities/Image/Alt_Text_Generation.php | 0.00% | 4 Missing ⚠️ |
| includes/Abilities/Summarization/Summarization.php | 0.00% | 4 Missing ⚠️ |
| ...ments/Content_Provenance/C2PA_Manifest_Builder.php | 94.73% | 4 Missing ⚠️ |

... and 5 more
Additional details and impacted files
@@              Coverage Diff              @@
##             develop     #294      +/-   ##
=============================================
+ Coverage      57.72%   65.54%   +7.82%     
- Complexity       567      763     +196     
=============================================
  Files             36       48      +12     
  Lines           2933     4011    +1078     
=============================================
+ Hits            1693     2629     +936     
- Misses          1240     1382     +142     
Flag Coverage Δ
unit 65.54% <84.61%> (+7.82%) ⬆️


Erik Svilich added 2 commits March 10, 2026 19:33
Add comprehensive integration tests for all previously-uncovered code paths
to improve patch coverage from 26% toward the project target.

New test files:
- C2PA_Verify_Test.php: 8 tests for verify ability (empty text, unsigned, verified, schema)
- Local_SignerTest.php: 4 tests for local RSA signing and key embedding
- Connected_SignerTest.php: 6 tests for remote signer with HTTP mock via pre_http_request filter
- BYOK_SignerTest.php: 5 tests for bring-your-own-key PEM file signing
- Verification_BadgeTest.php: 8 tests for badge rendering, archive exclusion, and tier labels

Expanded test files:
- Content_ProvenanceTest.php: +18 tests for sign_post, publish/update hooks, keypair lifecycle,
  get_public_signer tier selection, REST endpoints, and settings registration
- C2PA_Sign_Test.php: +6 tests for permissions, non-array input, schema validation
Add Unicode_EmbedderTest.php with 9 tests covering all extract() boundary
cases: insufficient bytes, wrong magic, wrong version, zero-length manifest,
FEFF without VS bytes, and strip() edge cases.

Expand Content_ProvenanceTest.php with 11 more tests covering:
render_settings_fields() HTML output (default, connected-tier, byok-tier),
add_well_known_rewrite() query var registration,
handle_well_known_request() early-return path,
enqueue_assets() all three branches (no screen, non-post screen, post screen),
C2PA_Manifest_Builder::build() signer-error path, and
C2PA_Manifest_Builder::extract_and_verify() invalid-JSON path.

Expand C2PA_Sign_Test.php with 2 more tests covering:
get_experiment() filter branch (experiment provided via filter) and
the is_wp_error() path when the signer fails.
@erik-sv erik-sv force-pushed the feature/content-provenance-experiment branch from 235f898 to 83e9dea Compare March 10, 2026 19:34
@erik-sv
Author

erik-sv commented Mar 10, 2026

Plugin Check failure appears to be pre-existing on trunk, happy to investigate if needed.

@jeffpaul
Member

@erik-sv mind updating to branch from develop?

Separately, @dkotter and I have been exploring this sort of work for some time (see 10up/classifai#652) and am curious how your work here might overlap/relate to media content (and whether you'd consider also helping on that front in this plugin)?

@erik-sv erik-sv changed the base branch from trunk to develop March 11, 2026 17:50
@erik-sv
Author

erik-sv commented Mar 11, 2026

> @erik-sv mind updating to branch from develop?
>
> Separately, @dkotter and I have been exploring this sort of work for some time (see 10up/classifai#652) and am curious how your work here might overlap/relate to media content (and whether you'd consider also helping on that front in this plugin)?

Hi @jeffpaul, I just moved this to the develop branch (thanks for the tip; James LePage just directed me here, so I'm new). I'm the co-chair of C2PA's text task force and wrote their spec, so I'm very familiar with content provenance technology. I'm also the CEO of Encypher. Happy to help with integration efforts.

I actually have two other PRs for this repo that I have in mind but I didn't want to overwhelm you all with code. Happy to put them up for your review:

  1. AI Content Provenance: hooks into the existing experiment output filters (Title Generation, Excerpt Generation, etc.) so that when WordPress AI generates content, provenance metadata records that fact. This implements a "digital nutrition label" for content, as touched on in 10up/classifai#652
  2. Image Provenance with CDN Continuity: directly addresses the problem where CDNs strip metadata during image transforms. We built a provenance sidecar indexed by perceptual hash that makes the original C2PA manifest retrievable even after aggressive CDN transforms, with edge worker implementations for Cloudflare, Fastly, and CloudFront. For text, we've already solved this CDN survival problem.

Let me know if you have any feedback on this PR or would like me to submit the other two PRs. Regarding the 10up repo, we have developed ways to do exactly what you require for both image and text content. One caveat is that to display the CR logo overlay, you need to go through the C2PA compliance program.

@jeffpaul
Member

> Let me know if you have any feedback on this PR or would like me to submit the other two PRs.

I'll defer to @dkotter for code review on this PR, once you pull it out of Draft state.

Otherwise, additional PRs would be amazing, thanks!

> One caveat is that to display the CR logo overlay, you need to go through the C2PA compliance program.

Is that required per site leveraging this WordPress AI plugin or could "we" (either the WordPress AI team, or the WordPress.org project itself) go through that on behalf of every WordPress site leveraging this plugin?

Adds apply_filters() hooks to the final return of each Ability class so
third-party code (including the Content_Provenance experiment) can embed
C2PA provenance into AI-generated text without touching core Ability logic.

Filter names:
- wp_ai_experiment_title_generation_result (array with titles key)
- wp_ai_experiment_excerpt_generation_result (string)
- wp_ai_experiment_summarization_result (string)
- wp_ai_experiment_review_notes_result (array with suggestions key)
- wp_ai_experiment_alt_text_result (array with alt_text key)

Signature: apply_filters( hook, result, context ) where context carries
post_id (0 when unavailable) for downstream provenance metadata.

Also adds sign_ai_fragments setting and sign_ai_fragment() method to
Content_Provenance, wiring all five filters to embed C2PA Unicode
variation selectors into AI outputs when the setting is enabled.
@Jameswlepage
Contributor

> Is that required per site leveraging this WordPress AI plugin or could "we" (either the WordPress AI team, or the WordPress.org project itself) go through that on behalf of every WordPress site leveraging this plugin?

Interested in this one as well. If the project were to get closer to the protocol, it feels like an audited plugin could work for the universe of sites that the CMS enables.

@erik-sv
Author

erik-sv commented Mar 17, 2026

@jeffpaul @Jameswlepage Great questions. Let me break this down into two separate pieces: conformance and signing identity/trust.

To directly answer your question, @jeffpaul: yes, the WordPress AI team or the WordPress.org project can absolutely go through conformance and serve as the signing identity on behalf of every WordPress site. That's the lowest-friction path and follows the same model as Adobe, Microsoft, and the camera manufacturers. The BYOK option remains available for users who need their own organizational identity on the manifest, and the two can be combined.

Conformance Program

The C2PA conformance program operates at the implementation level, not per-site. So the WordPress AI plugin (or the WordPress.org project itself) would go through conformance once on behalf of every site using the plugin. Happy to help with that process once the implementation is substantially complete.

Signing Identity & Trust

This is the more interesting question. For signatures to show as trusted in C2PA-aware applications (browsers, social platforms, search engines), the signing certificate needs to chain to the C2PA Trust List. There are a few options here, and they're not mutually exclusive:

Option 1: WordPress as the signing identity (recommended starting point)

WordPress operates a centralized signing service and holds a trusted certificate, similar to how Adobe signs content from Photoshop and camera manufacturers (Nikon, Sony, Leica) sign photos under their brand. Every site using the plugin would sign through this service via the Connected tier already in this PR.

  • Pros: Zero setup for users, single conformance + trust list application, broadest coverage
  • Cons: The signature asserts "published via WordPress", not "published by example.com"; WordPress.org takes on trust responsibility for everything signed through the service

Option 2: Publisher BYOK (organizational identity)

Individual users obtain their own certificate from a CA on the trust list and configure it via the BYOK tier in this PR. The manifest would say "published by XYZ Press" or "published by example.com."

  • Pros: User-level attribution, each org controls their own identity
  • Cons: Higher friction, requires each publisher to independently obtain a trusted cert

Option 3: Hybrid (Option 1 + 2 recommended)

WordPress serves as the default signing identity out of the box, while users who want organizational attribution can override it with BYOK. This is probably the right long-term answer: it gives every WordPress site provenance by default while letting orgs that care about brand-level attestation bring their own identity.


Let me know which direction feels right and I can adjust the implementation accordingly.

@jeffpaul
Member

Option 2 is almost certainly a non-starter for the majority (or at least a statistically significant share) of WordPress installs. Going with Option 3, which gives sites (especially enterprise installs or publishers) the flexibility to use BYOK, seems best.

@jeffpaul jeffpaul modified the milestones: 0.6.0, 0.7.0 Mar 20, 2026

Labels

[Type] Enhancement New feature or request

Projects

Status: In progress


3 participants