diff --git a/README.md b/README.md
index 34c5004d..2b3b7be0 100644
--- a/README.md
+++ b/README.md
@@ -499,17 +499,23 @@ dataset = [('Today was....', 1), ('This movie is...', 0), ...]
 You can then run attacks on samples from this dataset by adding the argument `--dataset-from-file my_dataset.py`.
 
+
+#### Dataset loading via other mechanism, see: [more details at here](https://textattack.readthedocs.io/en/latest/api/datasets.html)
+
+```python
+import textattack
+my_dataset = [("text",label),....]
+new_dataset = textattack.datasets.Dataset(my_dataset)
+```
+
+
+
 #### Dataset via AttackedText class
 
 To allow for word replacement after a sequence has been tokenized, we include an `AttackedText` object which maintains both a list of tokens and the original text, with punctuation. We use this object in favor of a list of words or just raw text.
 
-
-#### Dataset loading via other mechanism, see: [here](https://textattack.readthedocs.io/en/latest/api/datasets.html)
-
-
-
 ### Attacks and how to design a new attack
diff --git a/docs/1start/FAQ.md b/docs/1start/FAQ.md
index e36a27c8..f4a14f3b 100644
--- a/docs/1start/FAQ.md
+++ b/docs/1start/FAQ.md
@@ -110,14 +110,21 @@ You can then run attacks on samples from this dataset by adding the argument `--
 
+#### Dataset loading via other mechanism, see: [more details at here](https://textattack.readthedocs.io/en/latest/api/datasets.html)
+
+```python
+import textattack
+my_dataset = [("text",label),....]
+new_dataset = textattack.datasets.Dataset(my_dataset)
+```
+
+
 #### Custom Dataset via AttackedText class
 
 To allow for word replacement after a sequence has been tokenized, we include an `AttackedText` object which maintains both a list of tokens and the original text, with punctuation. We use this object in favor of a list of words or just raw text.
 
-#### Custome Dataset via Data Frames or other python data objects (*coming soon*)
-
 ### 4. Benchmarking Attacks
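
Review note: the snippet added in both hunks is not runnable as written, because `label` is undefined and `....` is not valid Python. Below is a minimal runnable sketch of the same idea, assuming each example is a `(text, label)` pair with an integer label. The sample sentences and the `validate_pairs` helper are hypothetical and not part of TextAttack; the only library call used is `textattack.datasets.Dataset`, as in the diff, and it is guarded so the sketch still runs when TextAttack is not installed.

```python
# Illustrative (text, label) pairs; the sentences and labels are made up.
my_dataset = [
    ("Today was a great day.", 1),
    ("This movie is terrible.", 0),
]


def validate_pairs(pairs):
    """Hypothetical helper (not a TextAttack API): check (str, int) pairs."""
    for text, label in pairs:
        if not isinstance(text, str) or not isinstance(label, int):
            raise TypeError(f"expected (str, int), got ({text!r}, {label!r})")
    return pairs


validate_pairs(my_dataset)

try:
    import textattack
except ImportError:
    # TextAttack is not installed; only the plain list is available.
    new_dataset = None
else:
    # Same call as in the diff: wrap the list of pairs in a Dataset object.
    new_dataset = textattack.datasets.Dataset(my_dataset)
```

Validating the pairs before wrapping them is a design choice of this sketch, so that a malformed entry is reported at load time rather than partway through an attack.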