* Add Thai stopwords from stopwordsiso
* add "th" to language_dict
* add unit test and test data files for Thai language
* add pythainlp to requirements.txt
* sort requirements.txt
* Update and sort supported language list
* sort the language list
* update language list in docs/index.rst
* Update README.rst
- "Newspaper has *seamless* language extraction and detection."
to
- "Newspaper can extract and detect languages *seamlessly*."
* Adds Swahili language support
* populate Swahili stop words
* Adds Persian language support
* fixes unit test due to missing language code
* bump build due to Travis
The corpora are required for the install of newspaper3k; without them it errors.
I tried pip installing, cloning, and downloading older versions of newspaper3k before realising this.
While the change seems minor, it may keep users from giving up because they couldn't get the library working.
As per Read the Docs' [blog post of the 27th April](https://blog.readthedocs.com/securing-subdomains/), ‘Securing subdomains’:
> Starting today, Read the Docs will start hosting projects from subdomains on the domain readthedocs.io, instead of on readthedocs.org. This change addresses some security concerns around site cookies while hosting user generated data on the same domain as our dashboard.
Test Plan: Manually visited all the links I’ve modified.
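
For anyone repeating this kind of link update elsewhere, the rewrite can be sketched in a few lines of Python. This is only an illustration, not the script used for this change (the links here were edited and checked by hand). The function name `rewrite_rtd_links` and the decision to leave `www.readthedocs.org` untouched are assumptions made for the example.

```python
import re

# Matches "<project>.readthedocs.org" but not the bare "readthedocs.org"
# dashboard domain, which stays where it is.
_RTD_SUBDOMAIN = re.compile(r"\b([A-Za-z0-9-]+)\.readthedocs\.org\b")


def rewrite_rtd_links(text):
    """Point project subdomains at readthedocs.io (hypothetical helper)."""

    def _swap(match):
        subdomain = match.group(1)
        # Assumption: "www" is the main site, not a hosted project.
        if subdomain.lower() == "www":
            return match.group(0)
        return subdomain + ".readthedocs.io"

    return _RTD_SUBDOMAIN.sub(_swap, text)


if __name__ == "__main__":
    print(rewrite_rtd_links("Docs: https://newspaper.readthedocs.org/en/latest/"))
```

A manual visit to each rewritten link, as in the test plan above, is still worthwhile, since a regex cannot tell whether the new URL actually resolves.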
Without `sudo apt-get install python3-pip` as an additional step, the `pip3` command would not be available.
```
$ pip3 install newspaper3k
The program 'pip3' is currently not installed. You can install it by typing:
sudo apt-get install python3-pip
```
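
The same check can be made before running the install step. The sketch below uses the standard library's `shutil.which` to see whether `pip3` is on the `PATH`. The function name `pip3_install_hint` is made up for this example, and the hint it returns assumes a Debian/Ubuntu system, as in the output above.

```python
import shutil


def pip3_install_hint(executable="pip3"):
    """Return an install hint if `executable` is not on PATH, else None.

    Hypothetical helper; the hint assumes an apt-based distribution.
    """
    if shutil.which(executable) is None:
        return "sudo apt-get install python3-pip"
    return None


if __name__ == "__main__":
    hint = pip3_install_hint()
    if hint is None:
        print("pip3 is available; run: pip3 install newspaper3k")
    else:
        print("pip3 not found; first run: " + hint)
```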