Word-Final Spaces in Dictionary Entries #48

The following commands find entries in which a field ends with a space, that is, a space directly before the comma:
`grep " ," core_lex.csv` matches 62 entries
`grep " ," notcore_lex.csv` matches 16 entries

Here are the first few from `core_lex.csv`:
```
alkeran ,4785,4785,5000,Alkeran ,名詞,固有名詞,一般,*,*,*,アルケラン,アルケラン,*,A,*,*,*,*
blopress ,4785,4785,5000,Blopress ,名詞,固有名詞,一般,*,*,*,ブロプレス,ブロプレス,*,A,*,*,*,*
bosna i hercegovina ,4793,4793,5000,Bosna i Hercegovina ,名詞,固有名詞,地名,国,*,*,ボスニア・ヘルツェゴビナ,ボスニア・ヘルツェゴビナ,*,A,*,*,*,022705
engel ,4787,4787,5000,Engel ,名詞,固有名詞,人名,一般,*,*,エンゲル,エンゲル,*,A,*,*,*,017728
```
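The `grep` pattern matches a space before a comma in any column, and in the rows above both the first field and the fifth field are affected. As a rough sketch of a column-aware check, the Python snippet below reports which fields of each row end in a space. The function name `find_trailing_space_fields` is hypothetical, and the second sample row is a hand-cleaned copy added only for contrast; neither comes from the dictionary tooling.

```python
import csv
import io

def find_trailing_space_fields(csv_text):
    """Return (row_number, column_indexes) for rows with a field ending in a space."""
    hits = []
    for row_number, row in enumerate(csv.reader(io.StringIO(csv_text)), start=1):
        # Collect the zero-based indexes of every field that ends in a space.
        columns = [i for i, field in enumerate(row) if field.endswith(" ")]
        if columns:
            hits.append((row_number, columns))
    return hits

sample = (
    # Row 1 is copied from core_lex.csv; row 2 is a hand-cleaned copy (assumption).
    "engel ,4787,4787,5000,Engel ,名詞,固有名詞,人名,一般,*,*,エンゲル,エンゲル,*,A,*,*,*,017728\n"
    "engel,4787,4787,5000,Engel,名詞,固有名詞,人名,一般,*,*,エンゲル,エンゲル,*,A,*,*,*,017728\n"
)
print(find_trailing_space_fields(sample))  # [(1, [0, 4])]
```

Unlike the `grep` pattern, this check would also catch a trailing space in the last column, where no comma follows.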

These entries produce tokens that end in a space, and tokenization becomes inconsistent when the word is followed by punctuation instead of a space. For example, the string _engel engel._ yields two different tokens for the same word: "engel " (with a trailing space) and "engel" (without one).
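To illustrate how a trailing-space entry can lead to this split, here is a deliberately simplified greedy longest-match segmenter in Python. It is not the real tokenizer, which is not shown in this issue and likely works differently; the function `longest_match_tokenize` and the three-entry toy lexicon (including an "engel" entry without the space) are assumptions made purely for the demonstration.

```python
def longest_match_tokenize(text, lexicon):
    """Greedy longest-match segmentation; a toy stand-in, not the real analyzer."""
    tokens = []
    pos = 0
    while pos < len(text):
        # Pick the longest lexicon entry starting at this position,
        # falling back to a single character when nothing matches.
        match = max(
            (entry for entry in lexicon if text.startswith(entry, pos)),
            key=len,
            default=text[pos],
        )
        tokens.append(match)
        pos += len(match)
    return tokens

# Toy lexicon (assumption): one entry with a trailing space, one without.
lexicon = {"engel ", "engel", "."}
print(longest_match_tokenize("engel engel.", lexicon))
# ['engel ', 'engel', '.']
```

In this toy model the first occurrence absorbs the following space, while the second, followed by a period, does not, so the same word surfaces in two forms.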

These entries should have their word-final spaces removed.
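One possible way to apply the cleanup is sketched below in Python. It assumes that no field legitimately ends in a space and that the files are plain comma-separated text readable by the standard `csv` module; both assumptions should be checked against the real data. The function name `strip_trailing_spaces` is hypothetical.

```python
import csv
import io

def strip_trailing_spaces(csv_text):
    """Return the CSV text with trailing ASCII spaces removed from every field."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for row in csv.reader(io.StringIO(csv_text)):
        # rstrip(" ") removes only trailing spaces, leaving inner spaces intact,
        # so "bosna i hercegovina " becomes "bosna i hercegovina".
        writer.writerow([field.rstrip(" ") for field in row])
    return out.getvalue()

before = "engel ,4787,4787,5000,Engel ,名詞,固有名詞,人名,一般,*,*,エンゲル,エンゲル,*,A,*,*,*,017728\n"
print(strip_trailing_spaces(before), end="")
# engel,4787,4787,5000,Engel,名詞,固有名詞,人名,一般,*,*,エンゲル,エンゲル,*,A,*,*,*,017728
```

Note that a field consisting only of spaces would be emptied by this sketch, so any entry whose surface form is itself whitespace would need to be excluded or reviewed by hand before the result is written back.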
