[FEATURE]: Add machine-readable JSON output for -out=report #2020
I'm reverting the last commit ( |
CCExtractor CI platform finished running the test files on windows. Below is a summary of the test results, when compared to test for commit 2028754...:
Your PR breaks these cases:
Congratulations: Merging this PR would fix the following tests:
It seems that not all tests were passed completely. This is an indication that the output of some files is not as expected (but might be according to you). Check the result page for more info.
CCExtractor CI platform finished running the test files on linux. Below is a summary of the test results, when compared to test for commit 2028754...:
Your PR breaks these cases:
Congratulations: Merging this PR would fix the following tests:
It seems that not all tests were passed completely. This is an indication that the output of some files is not as expected (but might be according to you). Check the result page for more info.
In raising this pull request, I confirm the following (please check boxes):
My familiarity with the project is as follows (check one):
Summary
This PR implements machine-readable JSON output for the -out=report feature, addressing issue #1399. Users can now generate structured reports that can be parsed with tools like jq, enabling seamless integration with automated workflows.
Background
Currently, CCExtractor’s report output is human-readable text that requires custom parsing for automation. While other media analysis tools such as ffprobe and mediainfo provide JSON output, structured closed-caption reporting is not consistently available across tools or versions. This feature enables CCExtractor to expose its existing report data in a structured JSON format.
Use case: Users running CCExtractor in automated environments (e.g., CI/CD pipelines, media processing workflows) need to programmatically determine if streams contain closed captions without writing custom parsers.
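As a rough illustration of this use case, the sketch below shows how a pipeline step might decide which programs carry captions once it has the JSON report as text. It is not part of the PR. The field names (programs[].program_number, programs[].summary.has_any_captions) are taken from the sample report in this PR, the values for program 2 are hypothetical, and the helper name programs_with_captions is made up for the example.

```python
import json

# Trimmed report shaped like the sample in this PR.
# Program 2 and its values are hypothetical, added only to show a negative case.
SAMPLE_REPORT = """
{
  "schema": {"name": "ccextractor-report", "version": "1.0"},
  "programs": [
    {"program_number": 1,
     "summary": {"has_any_captions": true, "has_608": true, "has_708": true}},
    {"program_number": 2,
     "summary": {"has_any_captions": false, "has_608": false, "has_708": false}}
  ]
}
"""


def programs_with_captions(report_text):
    """Return the program numbers whose summary reports EIA-608 / CEA-708 captions."""
    report = json.loads(report_text)
    found = []
    for program in report.get("programs", []):
        summary = program.get("summary", {})
        # A missing summary or flag is treated as "no captions".
        if summary.get("has_any_captions", False):
            found.append(program["program_number"])
    return found


if __name__ == "__main__":
    print(programs_with_captions(SAMPLE_REPORT))
```

Because has_any_captions covers EIA-608 / CEA-708 only, a check for DVB subtitles or Teletext would presumably need to look at programs[].services instead.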
Changes
-out=report Option
Existing Text Output (-out=report)
JSON Output Structure (v1.0)
The output follows a versioned JSON report structure:
JSON output via --report-format json:

    {
      "schema": { "name": "ccextractor-report", "version": "1.0" },
      "input": { "source": "file", "path": "../20251206ch29FullTS.ts" },
      "stream": {
        "mode": "Transport Stream",
        "program_count": 5,
        "program_numbers": [1, 2, 3, 4, 5],
        "pids": [
          { "pid": 49, "program_number": 1, "codec": "MPEG-2 video" },
          { "pid": 52, "program_number": 1, "codec": "AC3 audio" },
          { "pid": 53, "program_number": 1, "codec": "AC3 audio" },
          { "pid": 65, "program_number": 2, "codec": "MPEG-2 video" },
          { "pid": 68, "program_number": 2, "codec": "AC3 audio" },
          { "pid": 81, "program_number": 3, "codec": "MPEG-2 video" },
          { "pid": 84, "program_number": 3, "codec": "AC3 audio" },
          { "pid": 97, "program_number": 4, "codec": "MPEG-2 video" },
          { "pid": 100, "program_number": 4, "codec": "AC3 audio" },
          { "pid": 113, "program_number": 5, "codec": "MPEG-2 video" },
          { "pid": 116, "program_number": 5, "codec": "AC3 audio" }
        ]
      },
      "programs": [
        {
          "program_number": 1,
          "summary": { "has_any_captions": true, "has_608": true, "has_708": true },
          "services": { "dvb_subtitles": false, "teletext": false, "atsc_closed_caption": true },
          "captions": {
            "present": true,
            "eia_608": {
              "present": true,
              "xds": false,
              "channels": { "cc1": true, "cc2": false, "cc3": false, "cc4": false }
            },
            "cea_708": { "present": true, "services": [1, 2, 3, 4, 5, 6, 9] }
          },
          "video": { "width": 1920, "height": 1080, "aspect_ratio": "03 - 16:9", "frame_rate": "04 - 29.97" }
        },

(More programs omitted for brevity)
Schema Notes
programs[].services indicates which captioning systems are present (DVB, Teletext, ATSC), while captions.cea_708.services[] lists active CEA-708 caption service numbers.
Program Ordering:
input.path
stream.mode
stream.program_count
stream.program_numbers[]
stream.pids[]
programs[].services.dvb_subtitles
programs[].services.teletext
programs[].services.atsc_closed_caption
programs[].captions.eia_608.present
programs[].captions.eia_608.xds
programs[].captions.eia_608.channels.*
programs[].captions.cea_708.present
programs[].captions.cea_708.services[]
programs[].video.width / height
programs[].video.aspect_ratio
programs[].video.frame_rate
container.mp4.timed_text_tracks
schema.*
programs[].summary.*
programs[].captions.present
Key Features:
-out=report
Versioned schema (v1.0) for future extensibility
(The has_any_captions summary field reflects EIA-608 / CEA-708 only.)
Technical Approach
Example Testing Commands
Field Value Formats:
aspect_ratio and frame_rate preserve CCExtractor's internal enum formatting (e.g., "03 - 16:9", "04 - 29.97")
jq '.programs[].video.aspect_ratio | split(" - ")[1]'
Benefits
has_any_captions summary field for fast EIA-608 / CEA-708 closed-caption checks
Notes
Uses strcasecmp on POSIX systems and maps to _stricmp on Windows via platform-specific preprocessor guards.
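For consumers who do not use jq, the enum-style values described under Field Value Formats can be split in a similar way in a script. The following is a minimal sketch, assuming values always take the "NN - label" shape seen in the sample ("03 - 16:9", "04 - 29.97"). The helper name split_enum_value is hypothetical and not part of CCExtractor.

```python
def split_enum_value(value):
    """Split an 'NN - label' string into (code, label).

    Returns (None, value) when the ' - ' separator is absent, so values
    that do not follow the enum-style format pass through unchanged.
    """
    code, separator, label = value.partition(" - ")
    if not separator:
        return None, value
    return code, label


if __name__ == "__main__":
    # Sample values copied from the report shown in this PR.
    for raw in ("03 - 16:9", "04 - 29.97"):
        code, label = split_enum_value(raw)
        print(code, label)
```

This roughly mirrors the split(" - ")[1] expression in the jq example, except that partition only splits on the first separator.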