feat: Content Provenance experiment (C2PA 2.3 §A.7 text authentication) #294
erik-sv wants to merge 6 commits into WordPress:develop
Implements the C2PA Content Provenance experiment per the WordPress/ai experiment framework pattern. Zero external dependencies in default configuration. Fully air-gap compatible with local signing.

## What's included

**Experiment class** (includes/Experiments/Content_Provenance/)
- Content_Provenance extends Abstract_Experiment (id: content-provenance, category: EDITOR)
- Auto-signs posts on publish_post (c2pa.created) and post_updated (c2pa.edited) with provenance chain ingredient references
- 7 settings: signing_tier, connected_service_url/api_key, byok_certificate, auto_sign, show_badge, badge_position
- REST: POST /c2pa-provenance/v1/verify (public), GET /c2pa-provenance/v1/status (editor)
- /.well-known/c2pa discovery endpoint per C2PA 2.x §6.4
- Optional verification badge on published posts

**Signing engine** (Signing/)
- Signing_Interface contract
- Local_Signer: RSA-2048 keypair, zero setup
- Connected_Signer: delegates to any C2PA-compliant HTTP service
- BYOK_Signer: publisher's own PEM certificate

**C2PA core**
- C2PA_Manifest_Builder: full C2PA 2.3 manifest with provenance chain
- Unicode_Embedder: VS1-VS256 per C2PA §A.7

**WordPress Abilities API**
- c2pa/sign and c2pa/verify registered as first-class abilities

**Gutenberg sidebar**
- 5-state trust-tier-aware shield badge
- Sign Now, Verify, trust tier notice for local signing

**Tests**: 11 integration tests + 2 ability tests

**Docs**: user guide + developer reference

Out of scope: sentence-level signing, coalition, image provenance, AI output signing (separate PRs per the three-part contribution plan).

Ref: C2PA 2.3 §A.7
5aadfe6 to 6f950d1
- Fix Unicode_Embedder VS17-VS256 encoding: 3rd byte must cycle through 0x84-0x87 (not a fixed 0x84) so continuation bytes stay in 0x80-0xBF. Fixes preg_replace /u returning null on invalid UTF-8 in strip()
- Replace str_starts_with() (PHP 8.0) with strncmp() for PHP 7.4 compat
- Remove native `mixed` return type and `array|WP_Error` union type (PHP 8.0 native types) from Content_Provenance methods; keep in PHPDoc
- Add public const visibility to MAGIC, VERSION, PREFIX, QUERY_VAR constants
- Remove unused Signing_Hooks_Trait (logic already inlined in main class)
- Add get_public_signer() so the c2pa/sign Ability can access the signer
- Replace all short ternaries (?:) with explicit ternaries per WPCS
- Remove error_log() calls (not allowed in production per WPCS)
- Fix openssl_pkey_new() false-check before calling export/get_details
- Drop Connected_Signer timeout from 15s to 3s per VIP performance standard
- Fix REST /status permission_callback to use $request->get_param() instead of $_GET (removes nonce-verification PHPCS warning)
- Fix JS: add curly braces after if-conditions, fix i18n flanking whitespace, fix Prettier formatting
- All 11 Content_Provenance integration tests + 2 ability tests pass
- PHPStan level 8: [OK] No errors; PHPCS: 0 errors 0 warnings
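
The VS17-VS256 encoding fix above can be illustrated with a small standalone sketch. It is not the PR's actual Unicode_Embedder code: the function name `sketch_vs_utf8_for_byte()` is hypothetical, and the mapping of byte value n to variation selector VS(n+1) is an assumption about how the embedder assigns bytes to selectors. The sketch derives the UTF-8 bytes from the code point, which is why the third byte moves through 0x84-0x87 instead of staying fixed.

```php
<?php
/**
 * Hypothetical helper: return the UTF-8 encoding of the variation selector
 * assumed to represent one payload byte (byte n -> VS(n+1)).
 */
function sketch_vs_utf8_for_byte( int $byte ): string {
	if ( $byte < 0 || $byte > 255 ) {
		throw new InvalidArgumentException( 'Byte value must be 0-255.' );
	}

	if ( $byte < 16 ) {
		// VS1-VS16 are U+FE00..U+FE0F: three-byte UTF-8, EF B8 80..8F.
		return "\xEF\xB8" . chr( 0x80 + $byte );
	}

	// VS17-VS256 are U+E0100..U+E01EF: four-byte UTF-8.
	$code_point = 0xE0100 + ( $byte - 16 );

	return chr( 0xF0 | ( $code_point >> 18 ) )            // Always 0xF3 in this range.
		. chr( 0x80 | ( ( $code_point >> 12 ) & 0x3F ) ) // Always 0xA0 in this range.
		. chr( 0x80 | ( ( $code_point >> 6 ) & 0x3F ) )  // Cycles through 0x84-0x87.
		. chr( 0x80 | ( $code_point & 0x3F ) );          // Stays within 0x80-0xBF.
}
```

Under this mapping, byte 16 encodes as F3 A0 84 80 and byte 255 as F3 A0 87 AF. Holding the third byte at 0x84 would push the fourth byte past 0xBF for larger values, producing invalid UTF-8, which is the failure mode the commit describes for the /u regex in strip().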
Unicode_Embedder now implements the C2PA 2.3 §A.7 C2PATextManifestWrapper format exactly as specified:
- Wrapper is APPENDED to NFC-normalized text (not prepended)
- Binary header: 8-byte magic "C2PATXT\0" + 1-byte version + 4-byte big-endian manifest length, encoded as variation selectors
- extract() scans for U+FEFF anywhere in the string, validates the binary header (magic + version + length), and reads the manifest bytes
- strip() removes U+FEFF and VS1-VS256 code points from anywhere in text using a Unicode-aware regex

Also fixes pre-existing REST route test failures in Example_ExperimentTest by resetting the REST server after experiment initialization in setUp(), ensuring rest_api_init hooks registered during initialize_experiments() are captured before test methods run.
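
The wrapper layout described above can be sketched as follows. This is a simplified illustration, not the PR's Unicode_Embedder: all `sketch_*` names are hypothetical, the version value of 1 is assumed, byte n is assumed to map to VS(n+1), NFC normalization is left to the caller, and rejecting a zero-length manifest is a guess at the intended behavior.

```php
<?php
const SKETCH_MAGIC   = "C2PATXT\0";    // 8-byte magic from the commit message.
const SKETCH_VERSION = 1;              // Assumed version value.
const SKETCH_MARKER  = "\xEF\xBB\xBF"; // U+FEFF in UTF-8.
const SKETCH_VS      = '[\x{FE00}-\x{FE0F}\x{E0100}-\x{E01EF}]';

// Encode raw bytes as variation selectors (assumed mapping: byte n -> VS(n+1)).
function sketch_encode_bytes( string $bytes ): string {
	$out = '';
	for ( $i = 0, $n = strlen( $bytes ); $i < $n; $i++ ) {
		$b = ord( $bytes[ $i ] );
		if ( $b < 16 ) {
			$out .= "\xEF\xB8" . chr( 0x80 + $b );
		} else {
			$low  = 0x100 + $b - 16; // Low 12 bits of U+E0100..U+E01EF.
			$out .= "\xF3\xA0" . chr( 0x80 | ( $low >> 6 ) ) . chr( 0x80 | ( $low & 0x3F ) );
		}
	}
	return $out;
}

// Decode a run of variation selectors back into raw bytes.
function sketch_decode_vs( string $selectors ): string {
	preg_match_all( '/./us', $selectors, $matches );
	$bytes = '';
	foreach ( $matches[0] as $char ) {
		if ( 3 === strlen( $char ) ) {
			$bytes .= chr( ord( $char[2] ) - 0x80 );
		} else {
			$low    = ( ( ord( $char[2] ) & 0x3F ) << 6 ) | ( ord( $char[3] ) & 0x3F );
			$bytes .= chr( $low - 0x100 + 16 );
		}
	}
	return $bytes;
}

// Append U+FEFF plus the encoded header and manifest to already-normalized text.
function sketch_embed( string $nfc_text, string $manifest ): string {
	$header = SKETCH_MAGIC . chr( SKETCH_VERSION ) . pack( 'N', strlen( $manifest ) );
	return $nfc_text . SKETCH_MARKER . sketch_encode_bytes( $header . $manifest );
}

// Scan every U+FEFF, validate magic, version, and length, then return the manifest.
function sketch_extract( string $text ): ?string {
	$offset = 0;
	while ( false !== ( $pos = strpos( $text, SKETCH_MARKER, $offset ) ) ) {
		$offset = $pos + strlen( SKETCH_MARKER );
		if ( 1 !== preg_match( '/\G' . SKETCH_VS . '+/u', $text, $m, 0, $offset ) ) {
			continue; // Marker without selector bytes.
		}
		$bytes = sketch_decode_vs( $m[0] );
		if ( strlen( $bytes ) < 13 || 0 !== strncmp( $bytes, SKETCH_MAGIC, 8 ) ) {
			continue; // Too short for the 13-byte header, or wrong magic.
		}
		if ( SKETCH_VERSION !== ord( $bytes[8] ) ) {
			continue; // Unknown version.
		}
		$length = unpack( 'N', substr( $bytes, 9, 4 ) )[1];
		if ( $length < 1 || strlen( $bytes ) - 13 < $length ) {
			continue; // Empty (assumed invalid) or truncated manifest.
		}
		return substr( $bytes, 13, $length );
	}
	return null;
}

// Remove the marker and all variation selectors from anywhere in the text.
function sketch_strip( string $text ): string {
	return (string) preg_replace( '/\x{FEFF}|' . SKETCH_VS . '/u', '', $text );
}
```

With these helpers, `sketch_extract( sketch_embed( $text, $manifest ) )` round-trips the manifest bytes and `sketch_strip()` recovers the visible text. A real implementation would also verify the manifest's signature, which this sketch does not attempt.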
Codecov Report ❌

| Metric | develop | #294 | +/- |
| --- | --- | --- | --- |
| Coverage | 57.72% | 65.54% | +7.82% |
| Complexity | 567 | 763 | +196 |
| Files | 36 | 48 | +12 |
| Lines | 2933 | 4011 | +1078 |
| Hits | 1693 | 2629 | +936 |
| Misses | 1240 | 1382 | +142 |
Add comprehensive integration tests for all previously-uncovered code paths to improve patch coverage from 26% toward the project target.

New test files:
- C2PA_Verify_Test.php: 8 tests for verify ability (empty text, unsigned, verified, schema)
- Local_SignerTest.php: 4 tests for local RSA signing and key embedding
- Connected_SignerTest.php: 6 tests for remote signer with HTTP mock via pre_http_request filter
- BYOK_SignerTest.php: 5 tests for bring-your-own-key PEM file signing
- Verification_BadgeTest.php: 8 tests for badge rendering, archive exclusion, and tier labels

Expanded test files:
- Content_ProvenanceTest.php: +18 tests for sign_post, publish/update hooks, keypair lifecycle, get_public_signer tier selection, REST endpoints, and settings registration
- C2PA_Sign_Test.php: +6 tests for permissions, non-array input, schema validation
Add Unicode_EmbedderTest.php with 9 tests covering all extract() boundary cases: insufficient bytes, wrong magic, wrong version, zero-length manifest, FEFF without VS bytes, and strip() edge cases.

Expand Content_ProvenanceTest.php with 11 more tests covering:
- render_settings_fields() HTML output (default, connected-tier, byok-tier)
- add_well_known_rewrite() query var registration
- handle_well_known_request() early-return path
- enqueue_assets() all three branches (no screen, non-post screen, post screen)
- C2PA_Manifest_Builder::build() signer-error path
- C2PA_Manifest_Builder::extract_and_verify() invalid-JSON path

Expand C2PA_Sign_Test.php with 2 more tests covering the get_experiment() filter branch (experiment provided via filter) and the is_wp_error() path when the signer fails.
235f898 to 83e9dea
The Plugin Check failure appears to be pre-existing on trunk; happy to investigate if needed.
@erik-sv mind updating to branch from

Separately, @dkotter and I have been exploring this sort of work for some time (see 10up/classifai#652), and I'm curious how your work here might overlap with or relate to media content (and whether you'd consider also helping on that front in this plugin)?
Hi @jeffpaul, I just moved to the develop branch (thanks for the tips; James LePage just directed me here, so I'm new). I'm the co-chair of C2PA's text task force and wrote their spec here, so I am very familiar with content provenance technology. I'm also the CEO of Encypher. Happy to help with integration efforts. I actually have two other PRs in mind for this repo, but I didn't want to overwhelm you all with code. Happy to put them up for your review:
Let me know if you have any feedback on this PR or would like me to submit the other two PRs. Regarding the 10up repo, we have developed ways to do exactly what you require for images and text content. One caveat is that to display the CR logo overlay, you need to go through the C2PA compliance program.
I'll defer to @dkotter for code review on this PR, once you pull it out of Draft state. Otherwise, additional PRs would be amazing, thanks!
Is that required per site leveraging this WordPress AI plugin or could "we" (either the WordPress AI team, or the WordPress.org project itself) go through that on behalf of every WordPress site leveraging this plugin?
Adds apply_filters() hooks to the final return of each Ability class so third-party code (including the Content_Provenance experiment) can embed C2PA provenance into AI-generated text without touching core Ability logic.

Filter names:
- wp_ai_experiment_title_generation_result (array with titles key)
- wp_ai_experiment_excerpt_generation_result (string)
- wp_ai_experiment_summarization_result (string)
- wp_ai_experiment_review_notes_result (array with suggestions key)
- wp_ai_experiment_alt_text_result (array with alt_text key)

Signature: apply_filters( hook, result, context ) where context carries post_id (0 when unavailable) for downstream provenance metadata.

Also adds sign_ai_fragments setting and sign_ai_fragment() method to Content_Provenance, wiring all five filters to embed C2PA Unicode variation selectors into AI outputs when the setting is enabled.
Interested in this one as well. If the project were to get closer to the protocol, it feels like an audited plugin could work for the universe of sites that the CMS enables. |
@jeffpaul @Jameswlepage Great questions. Let me break this down into two separate pieces: conformance and signing identity/trust.

To directly answer your question, @jeffpaul: yes, the WordPress AI team or WordPress.org project can absolutely go through conformance and serve as the signing identity on behalf of every WordPress site. That's the lowest-friction path and follows the same model as Adobe, Microsoft, and the camera manufacturers. The BYOK option remains available for users who need their own organizational identity on the manifest, and you can do both:

## Conformance Program

The C2PA conformance program operates at the implementation level, not per-site. So the WordPress AI plugin (or the WordPress.org project itself) would go through conformance once on behalf of every site using the plugin. Happy to help with that process once the implementation is substantially complete.

## Signing Identity & Trust

This is the more interesting question. For signatures to show as trusted in C2PA-aware applications (browsers, social platforms, search engines), the signing certificate needs to chain to the C2PA Trust List. There are a few options here, and they're not mutually exclusive:

**Option 1: WordPress as the signing identity (recommended starting point)**

WordPress operates a centralized signing service and holds a trusted certificate, similar to how Adobe signs content from Photoshop and camera manufacturers (Nikon, Sony, Leica) sign photos under their brand. Every site using the plugin would sign through this service via the Connected tier already in this PR.

**Option 2: Publisher BYOK (organizational identity)**

Individual users obtain their own certificate from a CA on the trust list and configure it via the BYOK tier in this PR. The manifest would say "published by XYZ Press" or "published by example.com."

**Option 3: Hybrid (Option 1 + 2 recommended)**

WordPress serves as the default signing identity out of the box, while users who want organizational attribution can override with BYOK. This is probably the right long-term answer: it gives every WordPress site provenance by default while letting orgs that care about brand-level attestation bring their own identity.

Let me know which direction feels right and I can adjust the implementation accordingly.
Option 2 is almost certainly a non-starter for the majority (or at least a statistically significant share) of WordPress installs. Going with Option 3, which gives sites, especially enterprise installs or publishers, the flexibility to use BYOK, seems like the best choice.
Summary
Adds a new Content Provenance experiment that embeds cryptographic C2PA 2.3 §A.7 manifests into post content as invisible Unicode variation selectors. Publishers can prove authorship, detect tampering, and participate in the emerging content authenticity ecosystem (same standard used by Google, BBC, Adobe, OpenAI, and Microsoft).
The latest commit extends this with AI fragment provenance — output filter hooks on all five AI Ability classes so that individually generated titles, excerpts, summaries, review notes, and alt text can each carry their own embedded manifest.
What this adds
Content Provenance experiment
- `c2pa.created` / `c2pa.edited` actions and provenance-chain ingredient references
- `c2pa/sign` and `c2pa/verify` Abilities: any plugin can call `wp_do_ability('c2pa/sign', ['text' => …])`
- `/.well-known/c2pa` discovery endpoint: C2PA §6.4 compliant JSON document

AI fragment provenance (latest commit)

- `wp_ai_experiment_title_generation_result`, `wp_ai_experiment_excerpt_generation_result`, `wp_ai_experiment_summarization_result`, `wp_ai_experiment_review_notes_result`, `wp_ai_experiment_alt_text_result`
- `sign_ai_fragments` setting: when enabled, Content Provenance intercepts these filters and embeds a C2PA manifest into each AI-generated fragment before it reaches the editor

Post signing flow
    flowchart TD
        A[Post Published or Updated] --> B{Content Provenance enabled?}
        B -->|No| Z[Skip]
        B -->|Yes| C[Strip HTML to plain text]
        C --> D[Build C2PA Manifest]
        D --> D1[c2pa.actions.v1]
        D --> D2[c2pa.hash.data.v1 SHA-256]
        D --> D3[c2pa.soft_binding.v1]
        D --> D4[c2pa.ingredient.v2 edit chain]
        D1 & D2 & D3 & D4 --> E{Signing tier}
        E -->|Local| F[RSA-2048 self-signed via OpenSSL]
        E -->|Connected| G[POST to signing service HTTP API]
        E -->|BYOK| H[Publisher cert PEM file]
        F & G & H --> I[Unicode Embedder: VS1-VS256 invisible bytes]
        I --> J[wp_update_post with embedded content]
        J --> K[Store post meta: _c2pa_manifest, _c2pa_status, _c2pa_signed_at]
        K --> L[Gutenberg sidebar shield badge]

AI fragment provenance flow

    flowchart LR
        A[Editor triggers AI Ability] --> B[Ability executes and returns result]
        B --> C[apply_filters on wp_ai_experiment_*_result]
        C --> D{sign_ai_fragments enabled?}
        D -->|No| E[Original result returned to editor]
        D -->|Yes| F[C2PA_Manifest_Builder::build]
        F --> G[Unicode_Embedder::embed]
        G -->|Success| H[Signed fragment returned to editor]
        G -->|Error| E

Signing tiers
WordPress Abilities API
Fragment hook usage for third-party plugins
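
A minimal sketch of how a third-party plugin might hook one of these filters, assuming the signature apply_filters( hook, result, context ) described in the commit message, a title result that is an array with a `titles` key, and a context that is an array carrying `post_id`. The callback name `myplugin_log_generated_titles()` and the `myplugin_title_log` global are hypothetical, and the small `add_filter()` / `apply_filters()` stand-ins exist only so the sketch runs outside WordPress.

```php
<?php
// Stand-ins for running this sketch outside WordPress. Inside WordPress the
// real add_filter() and apply_filters() already exist and this block is skipped.
if ( ! function_exists( 'add_filter' ) ) {
	$GLOBALS['sketch_filters'] = array();

	function add_filter( $hook, $callback, $priority = 10, $accepted_args = 1 ) {
		$GLOBALS['sketch_filters'][ $hook ][] = array( $callback, $accepted_args );
		return true;
	}

	function apply_filters( $hook, $value, ...$args ) {
		foreach ( $GLOBALS['sketch_filters'][ $hook ] ?? array() as $entry ) {
			$params = array_slice( array_merge( array( $value ), $args ), 0, $entry[1] );
			$value  = call_user_func_array( $entry[0], $params );
		}
		return $value;
	}
}

$GLOBALS['myplugin_title_log'] = array();

/**
 * Hypothetical callback: record how many titles were generated for which post,
 * then pass the result through unchanged.
 */
function myplugin_log_generated_titles( $result, $context ) {
	// Assumption: context is an array with post_id, 0 when unavailable.
	$post_id = ( is_array( $context ) && isset( $context['post_id'] ) ) ? (int) $context['post_id'] : 0;

	if ( is_array( $result ) && isset( $result['titles'] ) && is_array( $result['titles'] ) ) {
		$GLOBALS['myplugin_title_log'][] = array(
			'post_id' => $post_id,
			'count'   => count( $result['titles'] ),
		);
	}

	return $result; // A filter callback must always return the value.
}

// Request both arguments (result and context) by passing 2 as accepted_args.
add_filter( 'wp_ai_experiment_title_generation_result', 'myplugin_log_generated_titles', 10, 2 );
```

The callback leaves the result untouched so it cannot interfere with signing or with the Ability's output schema. A plugin that does modify the result would need to return the same shape the Ability declares.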
Files changed
- `includes/Experiments/Content_Provenance/Content_Provenance.php`
- `includes/Experiments/Content_Provenance/C2PA_Manifest_Builder.php`
- `includes/Experiments/Content_Provenance/Unicode_Embedder.php`
- `includes/Experiments/Content_Provenance/Well_Known_Handler.php`: `/.well-known/c2pa` endpoint
- `includes/Experiments/Content_Provenance/Verification_Badge.php`
- `includes/Experiments/Content_Provenance/Signing/Signing_Interface.php`
- `includes/Experiments/Content_Provenance/Signing/Local_Signer.php`
- `includes/Experiments/Content_Provenance/Signing/Connected_Signer.php`
- `includes/Experiments/Content_Provenance/Signing/BYOK_Signer.php`
- `includes/Abilities/Content_Provenance/C2PA_Sign.php`: `c2pa/sign` Ability
- `includes/Abilities/Content_Provenance/C2PA_Verify.php`: `c2pa/verify` Ability
- `includes/Abilities/Title_Generation/Title_Generation.php`: `wp_ai_experiment_title_generation_result` filter
- `includes/Abilities/Excerpt_Generation/Excerpt_Generation.php`: `wp_ai_experiment_excerpt_generation_result` filter
- `includes/Abilities/Summarization/Summarization.php`: `wp_ai_experiment_summarization_result` filter
- `includes/Abilities/Review_Notes/Review_Notes.php`: `wp_ai_experiment_review_notes_result` filter
- `includes/Abilities/Image/Alt_Text_Generation.php`: `wp_ai_experiment_alt_text_result` filter
- `src/experiments/content-provenance/index.js`
- `includes/Experiment_Loader.php`
- `webpack.config.js`
- `tests/Integration/…/Content_ProvenanceTest.php`
- `tests/Integration/…/C2PA_Sign_Test.php`
- `docs/experiments/content-provenance.md`
- `docs/experiments/content-provenance-developer.md`

Test plan

- `composer test -- --filter Content_Provenance`: all 64 tests pass
- `_c2pa_manifest` meta is set
- `c2pa.edited` action + ingredient reference to previous manifest
- `wp_do_ability('c2pa/sign', ['text' => 'hello'])` in `wp shell`: returns signed text
- `wp_do_ability('c2pa/verify', ['text' => $signed])`: returns `verified: true`
- `/.well-known/c2pa`: returns valid JSON discovery document
- `wp_ai_experiment_title_generation_result` in a test plugin → confirm callback receives correct args

Related