Conversation

@quic-ashwshan
Contributor

  • Fix QNN-EP to support static bias tensors with mismatched quantization encodings
  • Add requantization logic to ensure bias_scale = weights_scale x activation_scale

Description

The QNN-EP currently lacks support for static bias tensors with quantization encodings that don't match the expected mathematical relationship bias_scale = weights_scale × activation_scale.

Motivation and Context

This causes the HTP backend to disregard the bias tensor's encodings and recompute its own, leading to accuracy drops.

- Fix QNN-EP to support static bias tensors with mismatched quantization encodings
- Add requantization logic to ensure bias_scale = weights_scale x activation_scale (see the sketch below)
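In essence, the fix dequantizes the static bias with its original encoding and requantizes it with the derived scale weights_scale × activation_scale (zero point 0), which is the encoding the backend expects for the bias. Below is a minimal sketch of that requantization, assuming an int32 bias; the function and parameter names are hypothetical and this is not the actual QNN-EP implementation:

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>
#include <limits>
#include <vector>

// Requantize a static int32 bias so that its effective scale becomes
// weights_scale * activation_scale with a zero point of 0.
std::vector<int32_t> RequantizeBias(const std::vector<int32_t>& bias_quant,
                                    float bias_scale,         // original bias scale
                                    int32_t bias_zero_point,  // original bias zero point
                                    float weights_scale,
                                    float activation_scale) {
  // Target encoding expected by the backend: scale = weights_scale * activation_scale.
  const double target_scale = static_cast<double>(weights_scale) * activation_scale;

  std::vector<int32_t> requantized(bias_quant.size());
  for (size_t i = 0; i < bias_quant.size(); ++i) {
    // Dequantize with the original encoding, then quantize with the target encoding.
    const double real_value =
        (static_cast<double>(bias_quant[i]) - bias_zero_point) * bias_scale;
    const double q = std::nearbyint(real_value / target_scale);
    const double clamped =
        std::clamp(q, static_cast<double>(std::numeric_limits<int32_t>::min()),
                   static_cast<double>(std::numeric_limits<int32_t>::max()));
    requantized[i] = static_cast<int32_t>(clamped);
  }
  return requantized;
}
```

For a per-channel quantized weight tensor the same idea applies per output channel, with target_scale computed from that channel's weight scale.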
@quic-ashwshan
Contributor Author

Could we start the CI job for this PR? @chilo-ms
