(1) README says
"filename: The file you want to segment. Each line is a sentence."
It appears to work when each line is a paragraph, and it should. It should also work if a line is only part of a sentence.
(2) With UTF-8 input, it replaces each non-breaking space (U+00A0) with an ASCII space (U+0020). It should not do that.
(3) It removes blank lines. It should not do that.
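
To make the expected behavior for (2) and (3) concrete, here is a minimal sketch in Python. It is not the tool's code: segment_text is a hypothetical helper, and the regex splitter is a naive stand-in for the real segmenter. It only illustrates that blank lines and U+00A0 can pass through unchanged.

```python
import re

# Naive stand-in for the real segmenter (an assumption for illustration):
# split after '.', '!' or '?' when followed by one or more ASCII spaces.
_BOUNDARY = re.compile(r"(?<=[.!?]) +")


def segment_text(text):
    """Hypothetical helper: segment each input line into sentences."""
    out = []
    for line in text.split("\n"):
        if line == "":
            # (3) A blank input line stays a blank output line.
            out.append("")
            continue
        # (2) Only ASCII spaces act as separators, so U+00A0 is never
        # treated as a boundary and never rewritten.
        out.extend(_BOUNDARY.split(line))
    return out


if __name__ == "__main__":
    sample = "One. Two.\n\nNo.\u00a01 is here."
    for sentence in segment_text(sample):
        print(repr(sentence))
```

One thing worth checking in the real code, as a guess only: in Python, str.split() with no arguments treats U+00A0 as whitespace, so a split-and-rejoin on whitespace would produce exactly the replacement described in (2).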