fix: use Go-compatible JSON encoding for OCI manifests#348
fix: use Go-compatible JSON encoding for OCI manifests#348sajayantony wants to merge 7 commits intooras-project:mainfrom
Conversation
52000ac to
017a00b
Compare
There was a problem hiding this comment.
Pull request overview
This PR fixes a critical JSON serialization issue where System.Text.Json's default encoder escapes + as \u002B, breaking content-addressable storage (CAS) interoperability with oras-go and other OCI tools. The solution introduces a custom JSON serialization layer with Go-compatible escaping rules.
Changes:
- Introduced internal
OciJsonSerializerwith custom converters (OciStringConverter,OciDictionaryConverter) that bypass .NET's default encoder to preserve literal+characters while maintaining proper escaping for quotes, backslashes, control characters, and HTML-sensitive characters (<,>,&) - Centralized all OCI content JSON operations through
OciJsonSerializeracross 7 source files - Moved
GenerateIndexfromIndexclass toOciJsonSerializerfor architectural consistency - Consolidated duplicate 4 MiB size limit constants into shared
OciLimits.MaxManifestBytes - Updated Copilot instructions with serialization conventions and testing best practices learned from this fix
Reviewed changes
Copilot reviewed 21 out of 21 changed files in this pull request and generated no comments.
Show a summary per file
| File | Description |
|---|---|
OciJsonSerializer.cs |
Central serialization entry point with Go-compatible EscapeJsonString method and 4 MiB size guard |
OciStringConverter.cs |
Custom JsonConverter<string> using WriteRawValue to bypass default encoder |
OciDictionaryConverter.cs |
Custom dictionary converter handling both keys and values (framework doesn't invoke string converter for dict keys) |
OciLimits.cs |
Shared 4 MiB manifest size constant |
Packer.cs |
Updated to use OciJsonSerializer.SerializeToUtf8Bytes |
ManifestStore.cs |
Updated to use OciJsonSerializer.DeserializeAsync and GenerateIndex |
Repository.cs |
Updated to use OciJsonSerializer.Deserialize for OCI content |
Registry.cs |
Updated to use OciJsonSerializer.Deserialize |
FetchableExtensions.cs |
Updated to use OciJsonSerializer.Deserialize |
ResponseException.cs |
Updated to use OciJsonSerializer.Deserialize for error responses |
Error.cs |
Updated to use OciJsonSerializer.FormatErrorDetail |
Index.cs |
Removed GenerateIndex method (moved to serializer) |
CopyGraphOptions.cs |
Uses OciLimits.MaxManifestBytes instead of duplicate constant |
RepositoryOptions.cs |
Uses OciLimits.MaxManifestBytes instead of duplicate constant |
OciStringConverterTest.cs |
Tests escaping rules, round-trip, verifies literal + not \u002B |
OciDictionaryConverterTest.cs |
Tests key/value escaping, null handling, invalid types, round-trip |
ManifestSerializationTest.cs |
Wire-format JSON fixtures, annotation fidelity, size limit enforcement |
PackerTest.cs |
Two new regression tests inspecting raw bytes for literal + |
.github/copilot-instructions.md |
Trimmed 28%, added serialization/testing conventions |
017a00b to
3f67c33
Compare
tests/OrasProject.Oras.Tests/Serialization/OciDictionaryConverterTest.cs
Outdated
Show resolved
Hide resolved
3f67c33 to
57092d1
Compare
57092d1 to
0489b9a
Compare
0489b9a to
14934ba
Compare
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #348 +/- ##
==========================================
+ Coverage 91.76% 92.09% +0.33%
==========================================
Files 64 67 +3
Lines 2755 2897 +142
Branches 364 373 +9
==========================================
+ Hits 2528 2668 +140
- Misses 138 139 +1
- Partials 89 90 +1 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
14934ba to
af9d966
Compare
tests/OrasProject.Oras.Tests/Serialization/OciStringConverterTest.cs
Outdated
Show resolved
Hide resolved
af9d966 to
cd1e45b
Compare
cd1e45b to
f16fe71
Compare
f16fe71 to
ba708a3
Compare
|
Resolved by Copilot: Added section comments throughout the |
|
Resolved by Copilot: Good idea. The serialization path is only used for manifests/indexes (bounded at 4 MiB by |
|
Resolved by Copilot: Added a comment explaining the relationship — |
|
Resolved by Copilot: Named the tuple elements — |
d150eec to
1d6f5f4
Compare
…ance Add learnings from issue oras-project#346 fix: - All JSON through OciJsonSerializer (no direct JsonSerializer) - Serialization types are internal - Domain types must not contain serialization logic - Test invalid inputs, wire format fixtures, round-trip - No dead code; no committing local-only tooling Also trim redundancy: remove duplicate Go→.NET table, collapse overlapping sections. 128→92 lines. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> Signed-off-by: Sajay Antony <sajaya@microsoft.com>
- Add section comments in EscapeJsonString switch cases - Name tuple elements in GenerateIndex return type - Add comment explaining _maxMetadataBytes default - Add tuple naming convention to copilot-instructions.md Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> Signed-off-by: Sajay Antony <sajaya@microsoft.com>
1d6f5f4 to
d7a9498
Compare
akashsinghal
left a comment
There was a problem hiding this comment.
Caution
This review was generated by Copilot using Claude Opus 4.6, mimicking the review style of shizhMSFT. This is not the real shizhMSFT.
Solid fix for a real CAS interoperability problem. The escaping logic matches Go's encoding/json.Marshal semantics correctly, and the converter architecture (separate string + dictionary converters) is the right approach given System.Text.Json's limitation that JsonConverter<string> doesn't intercept dictionary keys. A few observations below.
- Revert copilot-instructions.md to upstream (split to issue oras-project#356) - Move GenerateIndex back to Index.cs, call OciJsonSerializer internally - Add byte-level idempotency assertion in round-trip tests - Document size-guard materialization behavior in OciJsonSerializer Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> Signed-off-by: Sajay Antony <sajaya@microsoft.com>
…o agent/fix-plus-escaping-346
|
Addressed all 4 unresolved review threads in commit 6eacd7d:
|
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 18 out of 18 changed files in this pull request and generated 2 comments.
You can also share your feedback on Copilot code review. Take the survey.
| internal static (Descriptor Descriptor, byte[] Content) GenerateIndex( | ||
| IList<Descriptor> manifests) | ||
| { | ||
| var index = new Index(manifests); | ||
| var indexContent = JsonSerializer.SerializeToUtf8Bytes(index); | ||
| return (Descriptor.Create(indexContent, Oci.MediaType.ImageIndex), indexContent); | ||
| var indexContent = | ||
| OciJsonSerializer.SerializeToUtf8Bytes(index); | ||
| return ( | ||
| Descriptor.Create(indexContent, Oci.MediaType.ImageIndex), | ||
| indexContent); |
There was a problem hiding this comment.
PR description mentions Index.GenerateIndex() being removed/moved into the serializer, but this method still exists (now just using OciJsonSerializer). Either update the PR description to match the implementation, or remove/move the method if the intent is to centralize index generation in OciJsonSerializer.
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> Signed-off-by: Sajay Antony <sajaya@microsoft.com>
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 18 out of 18 changed files in this pull request and generated 1 comment.
You can also share your feedback on Copilot code review. Take the survey.
| var first = true; | ||
| foreach (var kvp in value.OrderBy(k => k.Key, StringComparer.Ordinal)) | ||
| { |
There was a problem hiding this comment.
OciDictionaryConverter.Write sorts keys with StringComparer.Ordinal, which compares UTF-16 code units. Go’s encoding/json sorts map keys by Go string order (bytewise over UTF-8), which can differ from UTF-16 ordinal ordering for non-BMP characters (surrogate pairs) and some BMP ranges, potentially reintroducing cross-runtime digest mismatches for non-ASCII annotation keys. Consider sorting by UTF-8 byte sequence (e.g., Encoding.UTF8.GetBytes(key) with lexicographic compare) to match Go’s ordering semantics.
Fix:
+escaped as\u002Bin JSON serializationFixes #346
Note
This PR was generated with GitHub Copilot CLI. The second commit trims and updates
.github/copilot-instructions.mdwith learnings from this fix.Problem
System.Text.Json's defaultJavaScriptEncoderescapes+as\u002B. OCI media types likeapplication/vnd.oci.image.manifest.v1+jsonbecome...v1\u002Bjson, producing different digests than oras-go and breaking CAS interoperability.Root cause:
JavaScriptEncoder.Defaulthardcodes+on its block list.AllowCharacter('+')doesn't work.UnsafeRelaxedJsonEscapingfixes+but stops escaping<>&— unsafe.Solution
Custom
JsonConverterclasses withUtf8JsonWriter.WriteRawValue()and a Go-compatible escape function matchingencoding/json.Marshalsemantics.Design
Why two converters?
JsonConverter<string>is NOT invoked for dictionary keys — only values.OciDictionaryConverterhandles annotation dictionaries where both keys and values may contain+.Escaping Rules (Go
encoding/json.Marshalcompatible)"\"\\\\b \f \n \r \t\b \f \n \r \t< > &\u003C \u003E \u0026\u2028 \u2029\uXXXX++(literal — the fix)Changes
New files (all
internal):Serialization/OciJsonSerializer.cs— single entry point for all JSON opsSerialization/OciStringConverter.cs— string value converterSerialization/OciDictionaryConverter.cs— annotation dictionary converterContent/OciLimits.cs— sharedMaxManifestBytesconstant (4 MiB)Modified call sites:
Packer.cs,ManifestStore.cs,Repository.cs,Registry.cs,FetchableExtensions.cs,ResponseException.cs,Error.cs— all JSON throughOciJsonSerializerIndex.cs—GenerateIndex()removed (moved to serializer)CopyGraphOptions.cs,RepositoryOptions.cs— use sharedOciLimits.MaxManifestBytesTests (36 new, 429 total passing):
OciStringConverterTest.cs— escape branch coverage + round-tripOciDictionaryConverterTest.cs— keys, values, empty, null, invalid types, round-tripManifestSerializationTest.cs— JSON fixture deserialization, annotation fidelity, index, round-trip, size limitPackerTest.cs— 2 regression tests checking raw bytes for literal+Coverage: 99.1% line, 97.5% branch, 100% method for serialization code.
Copilot Instructions Update (second commit)
Trimmed
.github/copilot-instructions.mdfrom 128→92 lines (−28%) by removing duplicate sections (Go→.NET table, Async & I/O, Copilot Prompt Hints, Non-Goals) and adding serialization/testing guidance learned from this fix.Implementation Plan
Approach: Custom JsonConverters with WriteRawValue
Register custom
JsonConverterclasses that:EscapeJsonString()WriteRawValue()— bypasses STJ's encoder entirelyJsonSerializerOptionsinsideOciJsonSerializerKey decisions
OciJsonSerializerGenerateIndex()moved out ofIndex.cs,Serializemoved out ofError.ToString()OciLimits.MaxManifestBytesreplaces duplicate private constsWhat the original plan missed (retrospective)
JsonConverter<string>— neededOciDictionaryConverter\x01bin C# is\x01Bnot\x01+b— use\u0001for control chars