
fix: Optimized RemapQuantizer#48

Merged
kienvo merged 2 commits into fossasia:main from Vishveshwara:optimize_quantizer
Jun 14, 2025

Conversation

@Vishveshwara
Contributor

@Vishveshwara Vishveshwara commented Jun 5, 2025

Fixes #37

Optimizations:

  • Pre-allocated typed arrays: Store the palette's R, G, and B values in Int32List arrays instead of calling palette getters repeatedly
  • Eliminated temporary arrays: Removed the distance list that the _map() method built on every lookup
  • Reduced object allocation: Access RGB values directly without creating intermediate Color objects
  • Early termination: Return immediately on an exact color match (distance = 0)
  • Inline distance calculation: Compute squared distances directly on integer channel values
  • LRU-style color cache: Cache recently computed color mappings, keyed by the RGB value packed into one integer with bit shifts, since neighboring pixels in dithering often map to the same color
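
As a rough illustration of the bit-shift cache key mentioned in the last bullet, the sketch below packs three 8-bit channels into one integer. The packing expression matches the one used in `_getColorIndexInternal`; the names `packRgb` and `unpackRgb` are hypothetical and exist only for this example.

```dart
/// Packs 8-bit R, G, B channels into one 24-bit integer (0xRRGGBB).
/// `packRgb` is a hypothetical name; the expression mirrors the PR's cache key.
int packRgb(int r, int g, int b) =>
    ((r & 0xFF) << 16) | ((g & 0xFF) << 8) | (b & 0xFF);

/// Recovers the three channels from a packed key (illustration only).
(int, int, int) unpackRgb(int key) =>
    ((key >> 16) & 0xFF, (key >> 8) & 0xFF, key & 0xFF);
```

Because each channel is masked with 0xFF, a value outside 0..255 would share a key with a different in-range color, so this keying assumes the channels are already clamped to 8 bits.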

Real-world performance:
I noticed an improvement of about 0.5 to 0.75 seconds over the older algorithm on a low-spec phone.

Summary by Sourcery

Optimize RemapQuantizer to accelerate palette lookup and reduce memory churn.

Enhancements:

  • Preallocate Int32List arrays for palette red, green, and blue values to avoid repeated getter calls
  • Inline squared distance computation and remove temporary distance arrays for faster nearest‐neighbor search
  • Introduce an LRU‐style cache keyed by packed RGB to reuse recent color mappings with lightweight eviction
  • Short‐circuit palette search on exact matches for immediate results

@sourcery-ai
Contributor

sourcery-ai bot commented Jun 5, 2025

Reviewer's Guide

This PR refactors RemapQuantizer to improve performance by preallocating typed arrays for palette channels, inlining squared distance computations while removing temporary arrays, introducing early exits on exact matches, and leveraging an LRU-style cache.

File-Level Changes

Change: Preallocated typed arrays and direct RGB access
Details:
  • Initialized Int32List for red, green, and blue channels
  • Replaced palette.getRed/Green/Blue calls with direct array lookups
  • Removed intermediate Color object creation during lookup
Files: lib/util/image_processing/remap_quantizer.dart

Change: Inlined distance computation and removed temporary arrays
Details:
  • Introduced _fastDistanceSquared for squared distance calculations
  • Removed _map() method and ds list allocation
  • Replaced list-based min/index operations with direct comparisons
Files: lib/util/image_processing/remap_quantizer.dart

Change: Early termination on exact matches
Details:
  • Added zero-distance checks before and during palette iteration
  • Returns immediately when an exact color match is found
Files: lib/util/image_processing/remap_quantizer.dart

Change: LRU-style cache for color mappings
Details:
  • Added _colorCache map with a fixed size limit
  • Generated cacheKey via bit-shifted RGB values
  • Evicts oldest entries when exceeding maximum cache size
Files: lib/util/image_processing/remap_quantizer.dart
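
The PR's `_addToCache` helper is not shown in this conversation, so the following is only a minimal sketch of one way a fixed-size cache with oldest-entry eviction could look. The class name `BoundedColorCache`, its members, and the eviction policy (drop the oldest inserted key) are assumptions for illustration, not the merged code.

```dart
import 'dart:collection';

/// Hypothetical bounded cache: maps a packed RGB key to a palette index.
class BoundedColorCache {
  BoundedColorCache(this.maxEntries) : assert(maxEntries > 0);

  final int maxEntries;

  // LinkedHashMap iterates keys in insertion order, so keys.first is the
  // oldest inserted entry.
  final LinkedHashMap<int, int> _entries = LinkedHashMap<int, int>();

  int get length => _entries.length;

  /// Returns the cached palette index, or null on a miss.
  int? lookup(int key) => _entries[key];

  /// Stores a mapping, evicting the oldest entry when the cache is full.
  void add(int key, int paletteIndex) {
    if (!_entries.containsKey(key) && _entries.length >= maxEntries) {
      _entries.remove(_entries.keys.first);
    }
    _entries[key] = paletteIndex;
  }
}
```

Note that this evicts by insertion order and does not refresh an entry on a hit, so it is "LRU-style" only in the loose sense used in the PR description; a strict LRU would move an entry to the back whenever it is read.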

Assessment against linked issues

Issue: #37
Objective: Optimize the RemapQuantizer for improved performance with (Red, black, white) Palette images.

Contributor

@sourcery-ai sourcery-ai bot left a comment


Hey @Vishveshwara - I've reviewed your changes and they look great!

Here's what I looked at during the review
  • 🟡 General issues: 2 issues found
  • 🟢 Security: all looks good
  • 🟢 Testing: all looks good
  • 🟢 Documentation: all looks good


@Vishveshwara
Contributor Author

Previous Implementation vs New Implementation performance using 4 different images

[Screenshots: performance measurements for the four test images, two screenshots per image; images not reproduced here]

  • The new algorithm performs much better on flat colors (for example image 3, a plain red background), because it maintains a cache of color mappings.
  • In profile mode it shows an improvement of about 100 ms, but I am seeing a bigger performance gap in release mode.
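
To illustrate why flat-color images benefit most from caching, here is a small sketch that counts how often the whole palette has to be scanned. Everything in it (the `CountingQuantizer` class, the unbounded `Map` cache, the packed 0xRRGGBB pixel format) is a hypothetical stand-in and not the code from this PR.

```dart
/// Hypothetical helper for illustration: counts full palette scans.
class CountingQuantizer {
  CountingQuantizer(this.palette);

  /// Palette entries packed as 0xRRGGBB.
  final List<int> palette;

  // Unbounded cache, for simplicity; the PR uses a size-limited one.
  final Map<int, int> _cache = <int, int>{};

  /// Number of times the whole palette had to be scanned.
  int scans = 0;

  /// Returns the nearest palette index for a packed 0xRRGGBB pixel.
  int indexFor(int pixel) => _cache.putIfAbsent(pixel, () {
        scans++;
        return _scan(pixel);
      });

  int _scan(int pixel) {
    var bestIndex = 0;
    var best = _distanceSquared(pixel, palette[0]);
    for (var i = 1; i < palette.length; i++) {
      final d = _distanceSquared(pixel, palette[i]);
      if (d < best) {
        best = d;
        bestIndex = i;
      }
    }
    return bestIndex;
  }

  static int _distanceSquared(int a, int b) {
    final dr = ((a >> 16) & 0xFF) - ((b >> 16) & 0xFF);
    final dg = ((a >> 8) & 0xFF) - ((b >> 8) & 0xFF);
    final db = (a & 0xFF) - (b & 0xFF);
    return dr * dr + dg * dg + db * db;
  }
}
```

With a white/black/red palette, feeding the same pixel value 1000 times triggers a single scan, while 256 distinct pixel values trigger 256 scans, which is consistent with the larger gain reported for the flat red image.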

@Vishveshwara
Contributor Author

@AsCress @kienvo can you please review this?

Collaborator

@AsCress AsCress left a comment


Makes sense to me. @kienvo Can you take a look at this?

@kienvo
Member

kienvo commented Jun 9, 2025

It seems to me that the only improvement here is caching; other changes are just rewrites.

@AsCress
Collaborator

AsCress commented Jun 9, 2025

@kienvo Yeah. Caching is the main improvement here. What do you think about this?

@kienvo
Member

kienvo commented Jun 9, 2025

@AsCress @Vishveshwara I think we should make a minimal change for caching. The rest of the code should be kept as it is, unless there is a significant reason to change it.

@Vishveshwara
Contributor Author

Vishveshwara commented Jun 9, 2025

  • Uint8List typed arrays give direct RGB access, versus the repeated palette.getRed() method calls in the older method
  • No more final ds = [] allocation on every lookup

New method

  int _getColorIndexInternal(int r, int g, int b) {
    // Pack the 8-bit channels into a single 24-bit cache key.
    final cacheKey = ((r & 0xFF) << 16) | ((g & 0xFF) << 8) | (b & 0xFF);

    // Fast path: this color is already in the cache.
    final cachedResult = _colorCache[cacheKey];
    if (cachedResult != null) {
      return cachedResult;
    }

    final numColors = _paletteR.length;

    // Start with palette entry 0 as the best candidate.
    int bestIndex = 0;
    int minDistance = _fastDistanceSquared(r, g, b, 0);

    if (minDistance == 0) {
      _addToCache(cacheKey, 0);
      return 0;
    }

    for (int i = 1; i < numColors; i++) {
      final distance = _fastDistanceSquared(r, g, b, i);
      // An exact match cannot be beaten, so stop searching.
      if (distance == 0) {
        _addToCache(cacheKey, i);
        return i;
      }
      if (distance < minDistance) {
        minDistance = distance;
        bestIndex = i;
      }
    }

    _addToCache(cacheKey, bestIndex);
    return bestIndex;
  }

Previous method

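  (img.Color, int) _map(img.Color c) {
    // Allocates a fresh distance list on every lookup.
    final ds = <num>[];
    for (int i = 0; i < palette.numColors; i++) {
      ds.add(_distance(c, _colorLut[i]));
    }
    // Two more passes over the list: find the minimum, then its first index.
    final nearestIndex = ds.indexOf(ds.reduce(min));
    return (_colorLut[nearestIndex], nearestIndex);
  }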
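
The difference between the two lookups can be sketched in isolation. The snippet below is a simplified illustration, not the PR's code: `nearestTwoPass` mirrors the list-based `indexOf(reduce(min))` pattern, `nearestSinglePass` mirrors the running-minimum loop with an early exit at distance 0, and both function names and the `distanceAt` callback are hypothetical.

```dart
import 'dart:math';

/// Two-pass form: all distances are already in a list, then the minimum
/// and its first index are looked up.
int nearestTwoPass(List<int> distances) =>
    distances.indexOf(distances.reduce(min));

/// Single-pass form: track the best index while scanning, with no
/// temporary list, and stop as soon as an exact match (0) is found.
int nearestSinglePass(int count, int Function(int index) distanceAt) {
  var bestIndex = 0;
  var best = distanceAt(0);
  for (var i = 1; i < count && best != 0; i++) {
    final d = distanceAt(i);
    // Strict '<' keeps the earliest index on ties, like indexOf does.
    if (d < best) {
      best = d;
      bestIndex = i;
    }
  }
  return bestIndex;
}
```

Both forms return the first index holding the minimum distance, so under these assumptions the rewrite changes allocation and the number of passes but not the result.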
  • No intermediate ColorRgb8 creation in getColorIndexRgb()

New method

  @override
  int getColorIndexRgb(int r, int g, int b) {
    return _getColorIndexInternal(r, g, b);
  }

Previous method

  int getColorIndexRgb(int r, int g, int b) {
    return _map(img.ColorRgb8(r, g, b)).$2;
  }

  • No (img.Color, int) tuple allocation in _map()

New method

  @override
  img.Color getQuantizedColor(img.Color c) {
    final index = _getColorIndexInternal(c.r.toInt(), c.g.toInt(), c.b.toInt());
    return _colorLut[index];
  }

  @override
  int getColorIndex(img.Color c) {
    return _getColorIndexInternal(c.r.toInt(), c.g.toInt(), c.b.toInt());
  }

  @override
  int getColorIndexRgb(int r, int g, int b) {
    return _getColorIndexInternal(r, g, b);
  }

Previous method

  (img.Color, int) _map(img.Color c) {
    final ds = <num>[];
    for (int i = 0; i < palette.numColors; i++) {
      ds.add(_distance(c, _colorLut[i]));
    }
    final nearestIndex = ds.indexOf(ds.reduce(min));
    return (_colorLut[nearestIndex], nearestIndex);
  }

  • Direct subtraction on typed-array values instead of accessing the .r, .g, .b properties

New method

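  int _fastDistanceSquared(int r, int g, int b, int paletteIndex) {
    // Channel differences against the preallocated palette arrays.
    final dr = r - _paletteR[paletteIndex];
    final dg = g - _paletteG[paletteIndex];
    final db = b - _paletteB[paletteIndex];
    // Squared Euclidean distance; no square root is needed to compare distances.
    return dr * dr + dg * dg + db * db;
  }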

Previous method

  num _distance(img.Color a, img.Color b) {
    return (a.r - b.r) * (a.r - b.r) +
        (a.g - b.g) * (a.g - b.g) +
        (a.b - b.b) * (a.b - b.b);
  }
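
Putting the typed-array palette and the integer distance together, a self-contained sketch might look like the following. It does not depend on the image package; the `ChannelPalette` class, its constructor taking packed 0xRRGGBB values, and its member names are assumptions made for this example rather than the structure of the merged `RemapQuantizer`.

```dart
import 'dart:typed_data';

/// Hypothetical palette that stores each channel in its own typed array.
class ChannelPalette {
  /// Builds the channel arrays once from packed 0xRRGGBB colors.
  ChannelPalette(List<int> packedColors)
      : r = Int32List(packedColors.length),
        g = Int32List(packedColors.length),
        b = Int32List(packedColors.length) {
    for (var i = 0; i < packedColors.length; i++) {
      r[i] = (packedColors[i] >> 16) & 0xFF;
      g[i] = (packedColors[i] >> 8) & 0xFF;
      b[i] = packedColors[i] & 0xFF;
    }
  }

  final Int32List r;
  final Int32List g;
  final Int32List b;

  /// Squared Euclidean distance between a pixel and palette entry [i].
  int distanceSquared(int pr, int pg, int pb, int i) {
    final dr = pr - r[i];
    final dg = pg - g[i];
    final db = pb - b[i];
    return dr * dr + dg * dg + db * db;
  }

  /// Index of the nearest palette entry (first one wins on ties).
  int nearest(int pr, int pg, int pb) {
    var bestIndex = 0;
    var best = distanceSquared(pr, pg, pb, 0);
    for (var i = 1; i < r.length && best != 0; i++) {
      final d = distanceSquared(pr, pg, pb, i);
      if (d < best) {
        best = d;
        bestIndex = i;
      }
    }
    return bestIndex;
  }
}
```

For 8-bit channels the largest possible squared distance is 3 × 255² = 195075, so plain int arithmetic is sufficient and no floating-point or num values are involved.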

@kienvo
While each of these adjustments is small and does not greatly increase performance on its own,
I feel this is the better approach for larger images, or when quantizing with a bigger palette (more than 3 colors).

@Vishveshwara
Contributor Author

@kienvo can you please check the above comment?

Member

@kienvo kienvo left a comment


Looks good. I didn't think about it this way.

@kienvo kienvo merged commit c0dd7a9 into fossasia:main Jun 14, 2025
3 checks passed
Development

Successfully merging this pull request may close these issues.

Optimize Remap Quantizer used for (Red,black,white) Palette images

3 participants