Conversation

Collaborator

@pmachapman pmachapman commented Jan 14, 2026

Fixes #840




codecov-commenter commented Jan 14, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 66.17%. Comparing base (226d952) to head (40c021f).

Additional details and impacted files
@@           Coverage Diff           @@
##             main     #853   +/-   ##
=======================================
  Coverage   66.17%   66.17%           
=======================================
  Files         382      382           
  Lines       20793    20793           
  Branches     2721     2721           
=======================================
  Hits        13760    13760           
  Misses       6067     6067           
  Partials      966      966           


Collaborator

@Enkidu93 Enkidu93 left a comment


@Enkidu93 reviewed 2 files and all commit messages, and made 1 comment.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @ddaspit and @pmachapman).


src/Echo/src/EchoEngine/TranslationEngineServiceV1.cs line 135 at r1 (raw file):

                            ExecutionData = new ExecutionData
                            {
                                TrainCount = 0,

I wonder about making these values reflect the actual number of training rows/inference rows even though the Echo engine isn't really using that data per se. Do you think it makes more sense just to have them be 0?

If you did the UpdateBuildExecutionDataAsync call after the parallel corpus preprocessing, you could get those values during preprocessing like we do with the true MT engine builds and incorporate them in the execution data.

...but maybe this spirals into including other 'realistic' values like those for the quote convention analysis and build warnings which we probably do not want to do. On the other hand, getting these values would be easy since we're already running preprocessing on the corpora. What do you think?

Collaborator Author

@pmachapman pmachapman left a comment


@pmachapman made 1 comment.
Reviewable status: 0 of 2 files reviewed, 1 unresolved discussion (waiting on @ddaspit and @Enkidu93).


src/Echo/src/EchoEngine/TranslationEngineServiceV1.cs line 135 at r1 (raw file):

Previously, Enkidu93 (Eli C. Lowry) wrote…

I wonder about making these values reflect the actual number of training rows/inference rows even though the Echo engine isn't really using that data per se. Do you think it makes more sense just to have them be 0?

If you did the UpdateBuildExecutionDataAsync call after the parallel corpus preprocessing, you could get those values during preprocessing like we do with the true MT engine builds and incorporate them in the execution data.

...but maybe this spirals into including other 'realistic' values like those for the quote convention analysis and build warnings which we probably do not want to do. On the other hand, getting these values would be easy since we're already running preprocessing on the corpora. What do you think?

Done. Sounds good. Thank you!

Collaborator

@Enkidu93 Enkidu93 left a comment


@Enkidu93 reviewed 2 files and all commit messages, and made 1 comment.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @ddaspit and @pmachapman).


src/Echo/src/EchoEngine/TranslationEngineServiceV1.cs line 135 at r1 (raw file):

Previously, pmachapman (Peter Chapman) wrote…

Done. Sounds good. Thank you!

Thank you! I think we should mirror the counting of the training rows and inference rows here exactly as it is done at these two call sites:

await ParallelCorpusPreprocessingService.PreprocessAsync(
and
await ParallelCorpusPreprocessingService.PreprocessAsync(

Notice that we don't increment on every call.

Contributor

@ddaspit ddaspit left a comment


:lgtm:

@ddaspit reviewed 2 files and all commit messages, and made 1 comment.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @pmachapman).

Collaborator Author

@pmachapman pmachapman left a comment


@pmachapman made 1 comment.
Reviewable status: 0 of 2 files reviewed, 1 unresolved discussion (waiting on @ddaspit and @Enkidu93).


src/Echo/src/EchoEngine/TranslationEngineServiceV1.cs line 135 at r1 (raw file):

Previously, Enkidu93 (Eli C. Lowry) wrote…

Thank you! I think we should mirror the counting of the training rows and inference rows here exactly as it is done at these two call sites:

await ParallelCorpusPreprocessingService.PreprocessAsync(
and
await ParallelCorpusPreprocessingService.PreprocessAsync(

Notice that we don't increment on every call.

Done.

Collaborator

@Enkidu93 Enkidu93 left a comment


:lgtm:

@Enkidu93 reviewed 2 files and all commit messages, made 1 comment, and resolved 1 discussion.
Reviewable status: :shipit: complete! all files reviewed, all discussions resolved (waiting on @pmachapman).

@pmachapman pmachapman merged commit 3e5fc0c into main Jan 19, 2026
3 checks passed
@pmachapman pmachapman deleted the echo_executiondata branch January 19, 2026 18:06

Development

Successfully merging this pull request may close these issues.

Record source and target language in the build

5 participants