[BUG] Arabic diacritics ignoring not working correctly #498

@abdullahaldarwish

Problem description:

When an Arabic text contains diacritics and I enable the option to ignore them, the match offsets are wrong, so the wrong text gets highlighted. The shift equals the number of diacritics that precede the match.

Example:
كُتَّاَبْ بيوت نبات ثمار
This text has 4 words, and the first one, "كُتَّاَبْ", carries five diacritics.
When I search for the third word, "نبات", the second word, "بيوت", is highlighted instead, and the pattern continues.
As far as I understand, the main cause is that the offset is computed on the cleaned content (with diacritics removed), but the result is displayed on the original text, where each diacritic counts as a character.
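
To illustrate the suspected cause, here is a minimal TypeScript sketch of one possible fix: while stripping diacritics, record which original index each kept character came from, then translate match offsets back before highlighting. This is not Omnisearch's actual code. The names `stripWithMap` and `findInOriginal` are hypothetical, and the character range used (U+064B to U+065F plus U+0670) is my assumption about which marks should be ignored.

```typescript
// Hypothetical helpers for illustration only; not taken from Omnisearch.
// Assumed set of Arabic combining marks: U+064B..U+065F and U+0670.
const ARABIC_DIACRITICS = /[\u064B-\u065F\u0670]/;

interface CleanedText {
  clean: string;
  // map[i] is the index in the original string of clean[i];
  // map[clean.length] is original.length, so end offsets map too.
  map: number[];
}

function stripWithMap(original: string): CleanedText {
  let clean = "";
  const map: number[] = [];
  for (let i = 0; i < original.length; i++) {
    const ch = original[i];
    if (ARABIC_DIACRITICS.test(ch)) continue; // drop the mark, keep no index for it
    map.push(i);
    clean += ch;
  }
  map.push(original.length);
  return { clean, map };
}

// Search on the cleaned text, but return offsets valid in the original text.
function findInOriginal(
  original: string,
  query: string
): { start: number; end: number } | null {
  const { clean, map } = stripWithMap(original);
  const q = stripWithMap(query).clean;
  if (q.length === 0) return null;
  const at = clean.indexOf(q);
  if (at === -1) return null;
  return { start: map[at], end: map[at + q.length] };
}

const text = "كُتَّاَبْ بيوت نبات ثمار";
const hit = findInOriginal(text, "نبات");
if (hit !== null) {
  console.log(text.slice(hit.start, hit.end)); // prints "نبات"
}
```

Using the cleaned offset directly on the original string is what would produce the shift described above; going through `map` removes it, and the mapped end offset also covers any diacritics that trail the matched word.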

Side note: it would be appreciated if the search results were displayed right-to-left (RTL).

Example images: [two screenshots attached to the issue]

Your environment:
I tested with both options enabled, "ignore diacritics" and "ignore Arabic diacritics"; from what I tried, I think this is the configuration I should be using.

  • Omnisearch version: 1.27.3
  • Obsidian version: 1.9.14
  • Operating system: Windows 10
  • Number of indexed documents in your vault (approx.): not relevant because it happens even in file search
