improve README

2021-09-19 22:32:58 +03:00 · 2020-10-17 23:20:30 +02:00
parent 8a688b8d1f
commit 67e343dd08
1 changed file with 6 additions and 3 deletions
--- a/README.md
+++ b/README.md
@@ -33,11 +33,14 @@ You may want to abstain from GPL:
 pip install clean-text
 ```

+NB: This package is named `clean-text` and not `cleantext`.
+
 If [unidecode](https://github.com/takluyver/Unidecode) is not available, `clean-text` will resort to Python's [unicodedata.normalize](https://docs.python.org/3.7/library/unicodedata.html#unicodedata.normalize) for [transliteration](https://en.wikipedia.org/wiki/Transliteration).
-Transliteration to the closest ASCII symbols involves manual mappings, e.g., `ê` to `e`. Unidecode's mappings are superior, but unicodedata's are sufficient.
+Transliteration to the closest ASCII symbols involves manual mappings, e.g., `ê` to `e`.
+`unidecode`'s mappings are superior, but `unicodedata`'s are sufficient.
 However, you may want to disable this feature altogether depending on your data and use case.

-NB: The package is named `clean-text` and not `cleantext`.
+To be clear: there are **inconsistencies** between text processed with `unidecode` and text processed without it.

 ## Usage

@@ -67,7 +70,7 @@ clean("some input",
 )
 ```

-Carefully choose the arguments that fit your task. The default parameters are listed above. Whitespace is always normalized.
+Carefully choose the arguments that fit your task. The default parameters are listed above.

 You may also only use specific functions for cleaning. For this, take a look at the [source code](https://github.com/jfilter/clean-text/blob/master/cleantext/clean.py).
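
The inconsistency this change warns about can be illustrated with a minimal sketch of a `unicodedata`-based fallback. The helper name `ascii_fallback` is hypothetical, and the NFKD-then-drop approach is an assumption about how such a fallback typically works, not necessarily what `clean-text` does internally.

```python
import unicodedata


def ascii_fallback(text: str) -> str:
    """Hypothetical fallback: approximate ASCII transliteration without unidecode."""
    # NFKD splits accented characters into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFKD", text)
    # Encoding to ASCII with errors="ignore" drops everything non-ASCII,
    # including the combining marks and any letter with no decomposition.
    return decomposed.encode("ascii", "ignore").decode("ascii")


print(ascii_fallback("ê"))       # e
print(ascii_fallback("Straße"))  # Strae ("ß" has no decomposition and is dropped)
```

With `unidecode` installed, `ß` would instead be mapped to `ss`, so the same input can yield different output depending on the environment.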