Enhance encoding guessing rate and decoding email body parts  #268

@pi-infected

Hi,

I am processing a huge quantity of emails, and I've noticed that there is sometimes a problem with the encoding detector and with decoding email body parts.

I've improved this part by monkey patching flanker to use charset_normalizer, which has decreased the rate of encoding problems.

I think you may be interested in updating this part with my code 👍

# Monkey patching
import charset_normalizer
from flanker.mime.message import utils
from flanker.mime.message import errors

def _guess_and_convert_with(value, detector=charset_normalizer):
  """
  Try to guess the encoding of the passed value with the provided detector
  and decode it.

  The detector is the charset_normalizer module.
  """
  # best() returns the most probable match, or None if nothing was detected.
  result = detector.from_bytes(value).best()

  if result is None:
    raise errors.DecodingError("Failed to guess encoding")

  try:
    # str() on a charset_normalizer match yields the decoded text.
    return str(result)

  except (UnicodeError, LookupError) as e:
    raise errors.DecodingError(str(e))

def _guess_and_convert(value):
  """
  Try to guess the encoding of the passed value and decode it.

  Uses charset_normalizer to guess the encoding.
  """
  return _guess_and_convert_with(value, detector=charset_normalizer)

utils._guess_and_convert_with = _guess_and_convert_with
utils._guess_and_convert      = _guess_and_convert
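
The control flow of the patched helper can be checked without real mail, and without flanker or charset_normalizer installed, by passing in a stub detector. The sketch below is only an illustration: `DecodingError`, `StubMatches`, `StubDetector` and `guess_and_convert_with` are hypothetical stand-ins defined here, not part of flanker or charset_normalizer. The stub only mimics the `from_bytes(...).best()` call chain used above.

```python
# Stand-in for flanker.mime.message.errors.DecodingError, so the sketch
# runs without flanker installed (assumption for illustration only).
class DecodingError(Exception):
    pass


class StubMatches:
    """Hypothetical stand-in for the object returned by from_bytes()."""

    def __init__(self, text):
        self._text = text

    def best(self):
        # Mirrors the real detector: None means no plausible match was found.
        return self._text


class StubDetector:
    """Hypothetical detector that only ever tries one fixed encoding."""

    def __init__(self, encoding):
        self._encoding = encoding

    def from_bytes(self, value):
        try:
            return StubMatches(value.decode(self._encoding))
        except UnicodeDecodeError:
            return StubMatches(None)


def guess_and_convert_with(value, detector):
    # Same control flow as the patched helper above.
    result = detector.from_bytes(value).best()
    if result is None:
        raise DecodingError("Failed to guess encoding")
    return str(result)


latin1_bytes = "café".encode("latin-1")

# A detector that finds a match yields the decoded text.
decoded = guess_and_convert_with(latin1_bytes, StubDetector("latin-1"))

# A detector that finds nothing makes the helper raise DecodingError.
try:
    guess_and_convert_with(latin1_bytes, StubDetector("utf-8"))
    failed = False
except DecodingError:
    failed = True
```

Because the detector is a parameter, the same approach would allow swapping in another detection library later, as long as it exposes a compatible `from_bytes()` entry point.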
