mirror of https://github.com/QData/TextAttack.git synced 2021-10-13 00:05:06 +03:00

Merge pull request #338 from QData/new_docs

update the readme table of recipes and try to fix the readthedocs build error
This commit is contained in:
Yanjun Qi / Jane
2020-11-13 21:12:37 -05:00
committed by GitHub
11 changed files with 992 additions and 47 deletions

README.md

@@ -105,55 +105,177 @@ We include attack recipes which implement attacks from the literature. You can l
To run an attack recipe: `textattack attack --recipe [recipe_name]`
Attacks on classification tasks, like sentiment classification and entailment:
- **alzantot**: Genetic algorithm attack from (["Generating Natural Language Adversarial Examples" (Alzantot et al., 2018)](https://arxiv.org/abs/1804.07998)).
- **bae**: BERT masked language model transformation attack from (["BAE: BERT-based Adversarial Examples for Text Classification" (Garg & Ramakrishnan, 2019)](https://arxiv.org/abs/2004.01970)).
- **bert-attack**: BERT masked language model transformation attack with subword replacements (["BERT-ATTACK: Adversarial Attack Against BERT Using BERT" (Li et al., 2020)](https://arxiv.org/abs/2004.09984)).
- **checklist**: Invariance testing implemented in CheckList that contracts, extends, and substitutes named entities (["Beyond Accuracy: Behavioral Testing of NLP models with CheckList" (Ribeiro et al., 2020)](https://arxiv.org/abs/2005.04118)).
- **clare (*coming soon*)**: Greedy attack with word swap, insertion, and merge transformations using RoBERTa masked language model. (["Contextualized Perturbation for Textual Adversarial Attack" (Li et al., 2020)](https://arxiv.org/abs/2009.07502)).
- **faster-alzantot**: Modified, faster version of the Alzantot et al. genetic algorithm, from (["Certified Robustness to Adversarial Word Substitutions" (Jia et al., 2019)](https://arxiv.org/abs/1909.00986)).
- **deepwordbug**: Greedy replace-1 scoring and multi-transformation character-swap attack (["Black-box Generation of Adversarial Text Sequences to Evade Deep Learning Classifiers" (Gao et al., 2018)](https://arxiv.org/abs/1801.04354)).
- **hotflip**: Beam search and gradient-based word swap (["HotFlip: White-Box Adversarial Examples for Text Classification" (Ebrahimi et al., 2017)](https://arxiv.org/abs/1712.06751)).
- **iga**: Improved genetic algorithm attack from (["Natural Language Adversarial Attacks and Defenses in Word Level" (Wang et al., 2019)](https://arxiv.org/abs/1909.06723)).
- **input-reduction**: Reducing the input while maintaining the prediction through word importance ranking (["Pathologies of Neural Models Make Interpretation Difficult" (Feng et al., 2018)](https://arxiv.org/pdf/1804.07781.pdf)).
- **kuleshov**: Greedy search and counterfitted embedding swap (["Adversarial Examples for Natural Language Classification Problems" (Kuleshov et al., 2018)](https://openreview.net/pdf?id=r1QZ3zbAZ)).
- **pruthi**: Character-based attack that simulates common typos (["Combating Adversarial Misspellings with Robust Word Recognition" (Pruthi et al., 2019)](https://arxiv.org/abs/1905.11268)).
- **pso**: Particle swarm optimization and HowNet synonym swap (["Word-level Textual Adversarial Attacking as Combinatorial Optimization" (Zang et al., 2020)](https://www.aclweb.org/anthology/2020.acl-main.540/)).
- **pwws**: Greedy attack with word importance ranking based on word saliency and synonym swap scores (["Generating Natural Language Adversarial Examples through Probability Weighted Word Saliency" (Ren et al., 2019)](https://www.aclweb.org/anthology/P19-1103/)).
- **textbugger**: Greedy attack with word importance ranking and a combination of synonym and character-based swaps (["TextBugger: Generating Adversarial Text Against Real-world Applications" (Li et al., 2018)](https://arxiv.org/abs/1812.05271)).
- **textfooler**: Greedy attack with word importance ranking and counter-fitted embedding swap (["Is Bert Really Robust?" (Jin et al., 2019)](https://arxiv.org/abs/1907.11932)).
Attacks on sequence-to-sequence models:
- **morpheus**: Greedy attack that replaces words with their inflections with the goal of minimizing BLEU score (["It's Morphin' Time! Combating Linguistic Discrimination with Inflectional Perturbations" (Tan et al., 2020)](https://www.aclweb.org/anthology/2020.acl-main.263.pdf)).
- **seq2sick**: Greedy attack with goal of changing every word in the output translation. Currently implemented as black-box with plans to change to white-box as done in paper (["Seq2Sick: Evaluating the Robustness of Sequence-to-Sequence Models with Adversarial Examples" (Cheng et al., 2018)](https://arxiv.org/abs/1803.01128)).
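Many of the recipes above (e.g. textfooler, pwws, deepwordbug) share the same greedy word-importance-ranking (Greedy-WIR) search: rank words by how much deleting them changes the model's score, then swap them in that order. The sketch below is a toy, library-free illustration of that idea; the scorer and synonym table are illustrative stand-ins, not TextAttack components.

```python
# Toy sketch of the greedy word-importance-ranking (Greedy-WIR) search shared
# by several recipes. Not TextAttack's implementation.

def word_importances(words, score):
    """Importance of each word = score drop when that word is deleted."""
    base = score(words)
    return [base - score(words[:i] + words[i + 1:]) for i in range(len(words))]

def greedy_wir_attack(words, score, synonyms, threshold=0.5):
    """Swap words in importance order until the score falls below threshold."""
    imps = word_importances(words, score)
    order = sorted(range(len(words)), key=lambda i: -imps[i])
    current = list(words)
    for i in order:
        for candidate in synonyms.get(current[i], []):
            trial = current[:i] + [candidate] + current[i + 1:]
            if score(trial) < score(current):  # keep swaps that hurt the score
                current = trial
                break
        if score(current) < threshold:  # attack succeeded, stop early
            break
    return current

# Toy victim "model": positive sentiment = fraction of lexicon words present.
POSITIVE = {"great", "wonderful"}
def score(ws):
    return sum(w in POSITIVE for w in ws) / max(len(ws), 1)

synonyms = {"great": ["fine"], "wonderful": ["okay"]}
print(greedy_wir_attack("a great and wonderful film".split(), score, synonyms,
                        threshold=0.1))
# -> ['a', 'fine', 'and', 'okay', 'film']
```

Real recipes differ in how they score importance (deletion, masking, gradients) and how they generate candidates, but the outer search loop is essentially this.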
The following table compares these attack recipes.
<table>
<thead>
<tr class="header">
<th style="text-align: left;"><strong>Attack Recipe Name</strong></th>
<th style="text-align: left;"><strong>Goal Function</strong></th>
<th style="text-align: left; width:130px" ><strong>Constraints-Enforced</strong></th>
<th style="text-align: left;"><strong>Transformation</strong></th>
<th style="text-align: left;"><strong>Search Method</strong></th>
<th style="text-align: left;"><strong>Main Idea</strong></th>
</tr>
</thead>
<tbody>
<tr><td colspan="6"><strong>Attacks on classification tasks, like sentiment classification and entailment:</strong></td></tr>
<tr class="even">
<td style="text-align: left;"><code>alzantot</code> <span class="citation" data-cites="Alzantot2018GeneratingNL Jia2019CertifiedRT"></span></td>
<td style="text-align: left;"><sub>Untargeted {Classification, Entailment}</sub></td>
<td style="text-align: left;"><sub>Percentage of words perturbed, Language Model perplexity, Word embedding distance</sub></td>
<td style="text-align: left;"><sub>Counter-fitted word embedding swap</sub></td>
<td style="text-align: left;"><sub>Genetic Algorithm</sub></td>
<td ><sub>from (["Generating Natural Language Adversarial Examples" (Alzantot et al., 2018)](https://arxiv.org/abs/1804.07998))</sub></td>
</tr>
<tr class="odd">
<td style="text-align: left;"><code>bae</code> <span class="citation" data-cites="garg2020bae"></span></td>
<td style="text-align: left;"><sub>Untargeted Classification</sub></td>
<td style="text-align: left;"><sub>USE sentence encoding cosine similarity</sub></td>
<td style="text-align: left;"><sub>BERT Masked Token Prediction</sub></td>
<td style="text-align: left;"><sub>Greedy-WIR</sub></td>
<td ><sub>BERT masked language model transformation attack from (["BAE: BERT-based Adversarial Examples for Text Classification" (Garg & Ramakrishnan, 2019)](https://arxiv.org/abs/2004.01970))</sub></td>
</tr>
<tr class="even">
<td style="text-align: left;"><code>bert-attack</code> <span class="citation" data-cites="li2020bertattack"></span></td>
<td style="text-align: left;"><sub>Untargeted Classification</sub></td>
<td style="text-align: left;"><sub>USE sentence encoding cosine similarity, Maximum number of words perturbed</sub></td>
<td style="text-align: left;"><sub>BERT Masked Token Prediction (with subword expansion)</sub></td>
<td style="text-align: left;"><sub>Greedy-WIR</sub></td>
<td ><sub> (["BERT-ATTACK: Adversarial Attack Against BERT Using BERT" (Li et al., 2020)](https://arxiv.org/abs/2004.09984))</sub></td>
</tr>
<tr class="odd">
<td style="text-align: left;"><code>checklist</code> <span class="citation" data-cites="Gao2018BlackBoxGO"></span></td>
<td style="text-align: left;"><sub>{Untargeted, Targeted} Classification</sub></td>
<td style="text-align: left;"><sub>checklist distance</sub></td>
<td style="text-align: left;"><sub>contracts, extends, and substitutes named entities</sub></td>
<td style="text-align: left;"><sub>Greedy-WIR</sub></td>
<td ><sub>Invariance testing implemented in CheckList (["Beyond Accuracy: Behavioral Testing of NLP models with CheckList" (Ribeiro et al., 2020)](https://arxiv.org/abs/2005.04118))</sub></td>
</tr>
<tr class="even">
<td style="text-align: left;"> <code>clare (*coming soon*)</code> <span class="citation" data-cites="Alzantot2018GeneratingNL Jia2019CertifiedRT"></span></td>
<td style="text-align: left;"><sub>Untargeted {Classification, Entailment}</sub></td>
<td style="text-align: left;"><sub>RoBERTa masked language model</sub></td>
<td style="text-align: left;"><sub>word swap, insertion, and merge</sub></td>
<td style="text-align: left;"><sub>Greedy</sub></td>
<td ><sub>(["Contextualized Perturbation for Textual Adversarial Attack" (Li et al., 2020)](https://arxiv.org/abs/2009.07502))</sub></td>
</tr>
<tr class="odd">
<td style="text-align: left;"><code>deepwordbug</code> <span class="citation" data-cites="Gao2018BlackBoxGO"></span></td>
<td style="text-align: left;"><sub>{Untargeted, Targeted} Classification</sub></td>
<td style="text-align: left;"><sub>Levenshtein edit distance</sub></td>
<td style="text-align: left;"><sub>{Character Insertion, Character Deletion, Neighboring Character Swap, Character Substitution}</sub></td>
<td style="text-align: left;"><sub>Greedy-WIR</sub></td>
<td ><sub>Greedy replace-1 scoring and multi-transformation character-swap attack (["Black-box Generation of Adversarial Text Sequences to Evade Deep Learning Classifiers" (Gao et al., 2018)](https://arxiv.org/abs/1801.04354))</sub></td>
</tr>
<tr class="even">
<td style="text-align: left;"> <code>fast-alzantot</code> <span class="citation" data-cites="Alzantot2018GeneratingNL Jia2019CertifiedRT"></span></td>
<td style="text-align: left;"><sub>Untargeted {Classification, Entailment}</sub></td>
<td style="text-align: left;"><sub>Percentage of words perturbed, Language Model perplexity, Word embedding distance</sub></td>
<td style="text-align: left;"><sub>Counter-fitted word embedding swap</sub></td>
<td style="text-align: left;"><sub>Genetic Algorithm</sub></td>
<td ><sub>Modified, faster version of the Alzantot et al. genetic algorithm, from (["Certified Robustness to Adversarial Word Substitutions" (Jia et al., 2019)](https://arxiv.org/abs/1909.00986))</sub></td>
</tr>
<tr class="even">
<td style="text-align: left;"><code>hotflip</code> (word swap) <span class="citation" data-cites="Ebrahimi2017HotFlipWA"></span></td>
<td style="text-align: left;"><sub>Untargeted Classification</sub></td>
<td style="text-align: left;"><sub>Word Embedding Cosine Similarity, Part-of-speech match, Number of words perturbed</sub></td>
<td style="text-align: left;"><sub>Gradient-Based Word Swap</sub></td>
<td style="text-align: left;"><sub>Beam search</sub></td>
<td ><sub> (["HotFlip: White-Box Adversarial Examples for Text Classification" (Ebrahimi et al., 2017)](https://arxiv.org/abs/1712.06751))</sub></td>
</tr>
<tr class="odd">
<td style="text-align: left;"><code>iga</code> <span class="citation" data-cites="iga-wang2019natural"></span></td>
<td style="text-align: left;"><sub>Untargeted {Classification, Entailment}</sub></td>
<td style="text-align: left;"><sub>Percentage of words perturbed, Word embedding distance</sub></td>
<td style="text-align: left;"><sub>Counter-fitted word embedding swap</sub></td>
<td style="text-align: left;"><sub>Genetic Algorithm</sub></td>
<td ><sub>Improved genetic algorithm-based word substitution from (["Natural Language Adversarial Attacks and Defenses in Word Level" (Wang et al., 2019)](https://arxiv.org/abs/1909.06723))</sub></td>
</tr>
<tr class="even">
<td style="text-align: left;"><code>input-reduction</code> <span class="citation" data-cites="feng2018pathologies"></span></td>
<td style="text-align: left;"><sub>Input Reduction</sub></td>
<td style="text-align: left;"></td>
<td style="text-align: left;"><sub>Word deletion</sub></td>
<td style="text-align: left;"><sub>Greedy-WIR</sub></td>
<td ><sub>Greedy attack with word importance ranking: reducing the input while maintaining the prediction (["Pathologies of Neural Models Make Interpretation Difficult" (Feng et al., 2018)](https://arxiv.org/pdf/1804.07781.pdf))</sub></td>
</tr>
<tr class="odd">
<td style="text-align: left;"><code>kuleshov</code> <span class="citation" data-cites="Kuleshov2018AdversarialEF"></span></td>
<td style="text-align: left;"><sub>Untargeted Classification</sub></td>
<td style="text-align: left;"><sub>Thought vector encoding cosine similarity, Language model similarity probability</sub></td>
<td style="text-align: left;"><sub>Counter-fitted word embedding swap</sub></td>
<td style="text-align: left;"><sub>Greedy word swap</sub></td>
<td ><sub>(["Adversarial Examples for Natural Language Classification Problems" (Kuleshov et al., 2018)](https://openreview.net/pdf?id=r1QZ3zbAZ)) </sub></td>
</tr>
<tr class="even">
<td style="text-align: left;"><code>pruthi</code> <span class="citation" data-cites="pruthi2019combating"></span></td>
<td style="text-align: left;"><sub>Untargeted Classification</sub></td>
<td style="text-align: left;"><sub>Minimum word length, Maximum number of words perturbed</sub></td>
<td style="text-align: left;"><sub>{Neighboring Character Swap, Character Deletion, Character Insertion, Keyboard-Based Character Swap}</sub></td>
<td style="text-align: left;"><sub>Greedy search</sub></td>
<td ><sub>Simulates common typos (["Combating Adversarial Misspellings with Robust Word Recognition" (Pruthi et al., 2019)](https://arxiv.org/abs/1905.11268))</sub></td>
</tr>
<tr class="odd">
<td style="text-align: left;"><code>pso</code> <span class="citation" data-cites="pso-zang-etal-2020-word"></span></td>
<td style="text-align: left;"><sub>Untargeted Classification</sub></td>
<td style="text-align: left;"></td>
<td style="text-align: left;"><sub>HowNet Word Swap</sub></td>
<td style="text-align: left;"><sub>Particle Swarm Optimization</sub></td>
<td ><sub>(["Word-level Textual Adversarial Attacking as Combinatorial Optimization" (Zang et al., 2020)](https://www.aclweb.org/anthology/2020.acl-main.540/)) </sub></td>
</tr>
<tr class="even">
<td style="text-align: left;"><code>pwws</code> <span class="citation" data-cites="pwws-ren-etal-2019-generating"></span></td>
<td style="text-align: left;"><sub>Untargeted Classification</sub></td>
<td style="text-align: left;"></td>
<td style="text-align: left;"><sub>WordNet-based synonym swap</sub></td>
<td style="text-align: left;"><sub>Greedy-WIR (saliency)</sub></td>
<td ><sub>Greedy attack with word importance ranking based on word saliency and synonym swap scores (["Generating Natural Language Adversarial Examples through Probability Weighted Word Saliency" (Ren et al., 2019)](https://www.aclweb.org/anthology/P19-1103/))</sub> </td>
</tr>
<tr class="even">
<td style="text-align: left;"><code>textbugger</code> (black-box) <span class="citation" data-cites="Li2019TextBuggerGA"></span></td>
<td style="text-align: left;"><sub>Untargeted Classification</sub></td>
<td style="text-align: left;"><sub>USE sentence encoding cosine similarity</sub></td>
<td style="text-align: left;"><sub>{Character Insertion, Character Deletion, Neighboring Character Swap, Character Substitution}</sub></td>
<td style="text-align: left;"><sub>Greedy-WIR</sub></td>
<td ><sub>(["TextBugger: Generating Adversarial Text Against Real-world Applications" (Li et al., 2018)](https://arxiv.org/abs/1812.05271))</sub></td>
</tr>
<tr class="odd">
<td style="text-align: left;"><code>textfooler</code> <span class="citation" data-cites="Jin2019TextFooler"></span></td>
<td style="text-align: left;"><sub>Untargeted {Classification, Entailment}</sub></td>
<td style="text-align: left;"><sub>Word Embedding Distance, Part-of-speech match, USE sentence encoding cosine similarity</sub></td>
<td style="text-align: left;"><sub>Counter-fitted word embedding swap</sub></td>
<td style="text-align: left;"><sub>Greedy-WIR</sub></td>
<td ><sub>Greedy attack with word importance ranking (["Is Bert Really Robust?" (Jin et al., 2019)](https://arxiv.org/abs/1907.11932))</sub> </td>
</tr>
<tr><td colspan="6"><strong>Attacks on sequence-to-sequence models:</strong></td></tr>
<tr class="odd">
<td style="text-align: left;"><code>morpheus</code> <span class="citation" data-cites="morpheus-tan-etal-2020-morphin"></span></td>
<td style="text-align: left;"><sub>Minimum BLEU Score</sub> </td>
<td style="text-align: left;"></td>
<td style="text-align: left;"><sub>Inflection Word Swap</sub> </td>
<td style="text-align: left;"><sub>Greedy search</sub> </td>
<td ><sub>Greedily replaces words with their inflections to minimize BLEU score (["It's Morphin' Time! Combating Linguistic Discrimination with Inflectional Perturbations" (Tan et al., 2020)](https://www.aclweb.org/anthology/2020.acl-main.263.pdf))</sub></td>
</tr>
<tr class="odd">
<td style="text-align: left;"><code>seq2sick</code> (black-box) <span class="citation" data-cites="cheng2018seq2sick"></span></td>
<td style="text-align: left;"><sub>Non-overlapping output</sub> </td>
<td style="text-align: left;"></td>
<td style="text-align: left;"><sub>Counter-fitted word embedding swap</sub> </td>
<td style="text-align: left;"><sub>Greedy-WIR</sub></td>
<td ><sub>Greedy attack with goal of changing every word in the output translation. Currently implemented as black-box with plans to change to white-box as done in paper (["Seq2Sick: Evaluating the Robustness of Sequence-to-Sequence Models with Adversarial Examples" (Cheng et al., 2018)](https://arxiv.org/abs/1803.01128)) </sub> </td>
</tr>
</tbody>
</table>
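The character-level rows in the table (deepwordbug, textbugger, pruthi) all build on the same four one-edit transformations: character insertion, deletion, neighboring swap, and substitution. A minimal stand-in for that transformation step, using `*` as a placeholder replacement character (real recipes use keyboard-adjacent or homoglyph characters):

```python
# Minimal stand-ins for the four character-level transformations used by
# character-based recipes. Not the TextAttack implementations.

def char_perturbations(word):
    """Return all one-edit variants of a word."""
    variants = []
    for i in range(len(word)):                                        # deletion
        variants.append(word[:i] + word[i + 1:])
    for i in range(len(word) - 1):                                    # neighboring swap
        variants.append(word[:i] + word[i + 1] + word[i] + word[i + 2:])
    for i in range(len(word)):                                        # substitution
        variants.append(word[:i] + "*" + word[i + 1:])
    for i in range(len(word) + 1):                                    # insertion
        variants.append(word[:i] + "*" + word[i:])
    return variants

print(char_perturbations("ab"))
```

A search method (Greedy-WIR in deepwordbug and textbugger) then picks, per word, whichever variant most damages the model's score.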
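Several recipes constrain how far a perturbed text may drift from the original; deepwordbug's constraints column, for example, lists Levenshtein edit distance. A plain dynamic-programming sketch of that distance (not TextAttack's implementation):

```python
# Levenshtein edit distance between two strings: the minimum number of
# character insertions, deletions, and substitutions to turn a into b.
# Row-by-row DP sketch, not TextAttack's implementation.

def levenshtein(a, b):
    prev = list(range(len(b) + 1))          # distance from "" to each prefix of b
    for i, ca in enumerate(a, 1):
        curr = [i]                          # distance from a[:i] to ""
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                 # delete ca
                            curr[j - 1] + 1,             # insert cb
                            prev[j - 1] + (ca != cb)))   # substitute (free if equal)
        prev = curr
    return prev[-1]

print(levenshtein("kitten", "sitting"))  # -> 3
```

An attack would reject any candidate text whose distance from the original exceeds a fixed budget.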
#### Recipe Usage Examples


@@ -29,6 +29,9 @@ We provide the following broad advice to help other future developers create use
Our modular and extendable design allows us to reuse many components to offer 15+ different adversarial attack methods proposed by literature. Our model-agnostic and dataset-agnostic design allows users to easily run adversarial attacks against their own models built using any deep learning framework. We hope that our lessons from developing TextAttack will help others create user-friendly open-source NLP libraries.
## More Details in Reference
```


@@ -61,3 +61,10 @@ TextAttack has some other features that make it a pleasure to use:
- :ref:`Pre-trained Models <models>` for testing attacks and evaluating constraints
- :ref:`Visualization options <loggers>` like Weights & Biases and Visdom
- :ref:`AttackedText <attacked_text>`, a utility class for strings that includes tools for tokenizing and editing text
Indices and Glossary of TextAttack
-----------------------------------
* :ref:`modindex`


@@ -134,3 +134,8 @@ see some basic information about the dataset.
For example, use `textattack peek-dataset --dataset-from-huggingface glue^mrpc` to see
information about the MRPC dataset (from the GLUE set of datasets). This will
print statistics like the number of labels, average number of words, etc.
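The kind of statistics `peek-dataset` reports can be sketched over a hypothetical list of `(text, label)` pairs; this is an illustration of the idea, not the real implementation:

```python
# Toy sketch of dataset peeking: count examples and distinct labels, and
# compute the average number of words per example. Illustrative only.

def peek(dataset):
    texts, labels = zip(*dataset)
    return {
        "examples": len(dataset),
        "labels": len(set(labels)),
        "avg_words": sum(len(t.split()) for t in texts) / len(texts),
    }

toy = [("a good movie", 1), ("bad film", 0)]
print(peek(toy))  # -> {'examples': 2, 'labels': 2, 'avg_words': 2.5}
```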
## A summary diagram of TextAttack Ecosystem
![diagram](/_static/imgs/intro/textattack_ecosystem.png)


@@ -60,3 +60,7 @@ How to Cite TextAttack
}
```
## A summary diagram of TextAttack Ecosystem
![diagram](/_static/imgs/intro/textattack_ecosystem.png)


@@ -224,3 +224,7 @@ You can run TextAttack tests with `pytest`. Just type `make test`.
#### This guide was heavily inspired by the awesome [transformers guide to contributing](https://github.com/huggingface/transformers/blob/master/CONTRIBUTING.md)


docs/3recipes/models.md

@@ -0,0 +1,361 @@
# TextAttack Model Zoo
TextAttack includes pre-trained models for different common NLP tasks. This makes it easier for
users to get started with TextAttack. It also enables a more fair comparison of attacks from
the literature.
All evaluation results were obtained using `textattack eval` to evaluate models on their default
test dataset (test set, if labels are available, otherwise, eval/validation set). You can use
this command to verify the accuracies for yourself: for example, `textattack eval --model roberta-base-mr`.
The LSTM and wordCNN models' code is available in `textattack.models.helpers`. All other models are transformers
imported from the [`transformers`](https://github.com/huggingface/transformers/) package. To evaluate all
TextAttack pretrained models, invoke `textattack eval` without specifying a model: `textattack eval --num-examples 1000`.
All evaluations shown are on the full validation or test set up to 1000 examples.
### `LSTM`
<section>
- AG News (`lstm-ag-news`)
- `datasets` dataset `ag_news`, split `test`
- Successes: 914/1000
- Accuracy: 91.4%
- IMDB (`lstm-imdb`)
- `datasets` dataset `imdb`, split `test`
- Successes: 883/1000
- Accuracy: 88.30%
- Movie Reviews [Rotten Tomatoes] (`lstm-mr`)
- `datasets` dataset `rotten_tomatoes`, split `validation`
- Successes: 807/1000
- Accuracy: 80.70%
- `datasets` dataset `rotten_tomatoes`, split `test`
- Successes: 781/1000
- Accuracy: 78.10%
- SST-2 (`lstm-sst2`)
- `datasets` dataset `glue`, subset `sst2`, split `validation`
- Successes: 737/872
- Accuracy: 84.52%
- Yelp Polarity (`lstm-yelp`)
- `datasets` dataset `yelp_polarity`, split `test`
- Successes: 922/1000
- Accuracy: 92.20%
</section>
### `wordCNN`
<section>
- AG News (`cnn-ag-news`)
- `datasets` dataset `ag_news`, split `test`
- Successes: 910/1000
- Accuracy: 91.00%
- IMDB (`cnn-imdb`)
- `datasets` dataset `imdb`, split `test`
- Successes: 863/1000
- Accuracy: 86.30%
- Movie Reviews [Rotten Tomatoes] (`cnn-mr`)
- `datasets` dataset `rotten_tomatoes`, split `validation`
- Successes: 794/1000
- Accuracy: 79.40%
- `datasets` dataset `rotten_tomatoes`, split `test`
- Successes: 768/1000
- Accuracy: 76.80%
- SST-2 (`cnn-sst2`)
- `datasets` dataset `glue`, subset `sst2`, split `validation`
- Successes: 721/872
- Accuracy: 82.68%
- Yelp Polarity (`cnn-yelp`)
- `datasets` dataset `yelp_polarity`, split `test`
- Successes: 913/1000
- Accuracy: 91.30%
</section>
### `albert-base-v2`
<section>
- AG News (`albert-base-v2-ag-news`)
- `datasets` dataset `ag_news`, split `test`
- Successes: 943/1000
- Accuracy: 94.30%
- CoLA (`albert-base-v2-cola`)
- `datasets` dataset `glue`, subset `cola`, split `validation`
- Successes: 829/1000
- Accuracy: 82.90%
- IMDB (`albert-base-v2-imdb`)
- `datasets` dataset `imdb`, split `test`
- Successes: 913/1000
- Accuracy: 91.30%
- Movie Reviews [Rotten Tomatoes] (`albert-base-v2-mr`)
- `datasets` dataset `rotten_tomatoes`, split `validation`
- Successes: 882/1000
- Accuracy: 88.20%
- `datasets` dataset `rotten_tomatoes`, split `test`
- Successes: 851/1000
- Accuracy: 85.10%
- Quora Question Pairs (`albert-base-v2-qqp`)
- `datasets` dataset `glue`, subset `qqp`, split `validation`
- Successes: 914/1000
- Accuracy: 91.40%
- Recognizing Textual Entailment (`albert-base-v2-rte`)
- `datasets` dataset `glue`, subset `rte`, split `validation`
- Successes: 211/277
- Accuracy: 76.17%
- SNLI (`albert-base-v2-snli`)
- `datasets` dataset `snli`, split `test`
- Successes: 883/1000
- Accuracy: 88.30%
- SST-2 (`albert-base-v2-sst2`)
- `datasets` dataset `glue`, subset `sst2`, split `validation`
- Successes: 807/872
- Accuracy: 92.55%
- STS-b (`albert-base-v2-stsb`)
- `datasets` dataset `glue`, subset `stsb`, split `validation`
- Pearson correlation: 0.9041359738552746
- Spearman correlation: 0.8995912861209745
- WNLI (`albert-base-v2-wnli`)
- `datasets` dataset `glue`, subset `wnli`, split `validation`
- Successes: 42/71
- Accuracy: 59.15%
- Yelp Polarity (`albert-base-v2-yelp`)
- `datasets` dataset `yelp_polarity`, split `test`
- Successes: 963/1000
- Accuracy: 96.30%
</section>
### `bert-base-uncased`
<section>
- AG News (`bert-base-uncased-ag-news`)
- `datasets` dataset `ag_news`, split `test`
- Successes: 942/1000
- Accuracy: 94.20%
- CoLA (`bert-base-uncased-cola`)
- `datasets` dataset `glue`, subset `cola`, split `validation`
- Successes: 812/1000
- Accuracy: 81.20%
- IMDB (`bert-base-uncased-imdb`)
- `datasets` dataset `imdb`, split `test`
- Successes: 919/1000
- Accuracy: 91.90%
- MNLI matched (`bert-base-uncased-mnli`)
- `datasets` dataset `glue`, subset `mnli`, split `validation_matched`
- Successes: 840/1000
- Accuracy: 84.00%
- Movie Reviews [Rotten Tomatoes] (`bert-base-uncased-mr`)
- `datasets` dataset `rotten_tomatoes`, split `validation`
- Successes: 876/1000
- Accuracy: 87.60%
- `datasets` dataset `rotten_tomatoes`, split `test`
- Successes: 838/1000
- Accuracy: 83.80%
- MRPC (`bert-base-uncased-mrpc`)
- `datasets` dataset `glue`, subset `mrpc`, split `validation`
- Successes: 358/408
- Accuracy: 87.75%
- QNLI (`bert-base-uncased-qnli`)
- `datasets` dataset `glue`, subset `qnli`, split `validation`
- Successes: 904/1000
- Accuracy: 90.40%
- Quora Question Pairs (`bert-base-uncased-qqp`)
- `datasets` dataset `glue`, subset `qqp`, split `validation`
- Successes: 924/1000
- Accuracy: 92.40%
- Recognizing Textual Entailment (`bert-base-uncased-rte`)
- `datasets` dataset `glue`, subset `rte`, split `validation`
- Successes: 201/277
- Accuracy: 72.56%
- SNLI (`bert-base-uncased-snli`)
- `datasets` dataset `snli`, split `test`
- Successes: 894/1000
- Accuracy: 89.40%
- SST-2 (`bert-base-uncased-sst2`)
- `datasets` dataset `glue`, subset `sst2`, split `validation`
- Successes: 806/872
- Accuracy: 92.43%
- STS-b (`bert-base-uncased-stsb`)
- `datasets` dataset `glue`, subset `stsb`, split `validation`
- Pearson correlation: 0.8775458937815515
- Spearman correlation: 0.8773251339980935
- WNLI (`bert-base-uncased-wnli`)
- `datasets` dataset `glue`, subset `wnli`, split `validation`
- Successes: 40/71
- Accuracy: 56.34%
- Yelp Polarity (`bert-base-uncased-yelp`)
- `datasets` dataset `yelp_polarity`, split `test`
- Successes: 963/1000
- Accuracy: 96.30%
</section>
### `distilbert-base-cased`
<section>
- CoLA (`distilbert-base-cased-cola`)
- `datasets` dataset `glue`, subset `cola`, split `validation`
- Successes: 786/1000
- Accuracy: 78.60%
- MRPC (`distilbert-base-cased-mrpc`)
- `datasets` dataset `glue`, subset `mrpc`, split `validation`
- Successes: 320/408
- Accuracy: 78.43%
- Quora Question Pairs (`distilbert-base-cased-qqp`)
- `datasets` dataset `glue`, subset `qqp`, split `validation`
- Successes: 908/1000
- Accuracy: 90.80%
- SNLI (`distilbert-base-cased-snli`)
- `datasets` dataset `snli`, split `test`
- Successes: 861/1000
- Accuracy: 86.10%
- SST-2 (`distilbert-base-cased-sst2`)
- `datasets` dataset `glue`, subset `sst2`, split `validation`
- Successes: 785/872
- Accuracy: 90.02%
- STS-b (`distilbert-base-cased-stsb`)
- `datasets` dataset `glue`, subset `stsb`, split `validation`
- Pearson correlation: 0.8421540899520146
- Spearman correlation: 0.8407155030382939
</section>
### `distilbert-base-uncased`
<section>
- AG News (`distilbert-base-uncased-ag-news`)
- `datasets` dataset `ag_news`, split `test`
- Successes: 944/1000
- Accuracy: 94.40%
- CoLA (`distilbert-base-uncased-cola`)
- `datasets` dataset `glue`, subset `cola`, split `validation`
- Successes: 786/1000
- Accuracy: 78.60%
- IMDB (`distilbert-base-uncased-imdb`)
- `datasets` dataset `imdb`, split `test`
- Successes: 903/1000
- Accuracy: 90.30%
- MNLI matched (`distilbert-base-uncased-mnli`)
- `datasets` dataset `glue`, subset `mnli`, split `validation_matched`
- Successes: 817/1000
- Accuracy: 81.70%
- MRPC (`distilbert-base-uncased-mrpc`)
- `datasets` dataset `glue`, subset `mrpc`, split `validation`
- Successes: 350/408
- Accuracy: 85.78%
- QNLI (`distilbert-base-uncased-qnli`)
- `datasets` dataset `glue`, subset `qnli`, split `validation`
- Successes: 860/1000
- Accuracy: 86.00%
- Recognizing Textual Entailment (`distilbert-base-uncased-rte`)
- `datasets` dataset `glue`, subset `rte`, split `validation`
- Successes: 180/277
- Accuracy: 64.98%
- STS-b (`distilbert-base-uncased-stsb`)
- `datasets` dataset `glue`, subset `stsb`, split `validation`
- Pearson correlation: 0.8421540899520146
- Spearman correlation: 0.8407155030382939
- WNLI (`distilbert-base-uncased-wnli`)
- `datasets` dataset `glue`, subset `wnli`, split `validation`
- Successes: 40/71
- Accuracy: 56.34%
</section>
### `roberta-base`
<section>
- AG News (`roberta-base-ag-news`)
- `datasets` dataset `ag_news`, split `test`
- Successes: 947/1000
- Accuracy: 94.70%
- CoLA (`roberta-base-cola`)
- `datasets` dataset `glue`, subset `cola`, split `validation`
- Successes: 857/1000
- Accuracy: 85.70%
- IMDB (`roberta-base-imdb`)
- `datasets` dataset `imdb`, split `test`
- Successes: 941/1000
- Accuracy: 94.10%
- Movie Reviews [Rotten Tomatoes] (`roberta-base-mr`)
- `datasets` dataset `rotten_tomatoes`, split `validation`
- Successes: 899/1000
- Accuracy: 89.90%
- `datasets` dataset `rotten_tomatoes`, split `test`
- Successes: 883/1000
- Accuracy: 88.30%
- MRPC (`roberta-base-mrpc`)
- `datasets` dataset `glue`, subset `mrpc`, split `validation`
- Successes: 371/408
- Accuracy: 91.18%
- QNLI (`roberta-base-qnli`)
- `datasets` dataset `glue`, subset `qnli`, split `validation`
- Successes: 917/1000
- Accuracy: 91.70%
- Recognizing Textual Entailment (`roberta-base-rte`)
- `datasets` dataset `glue`, subset `rte`, split `validation`
- Successes: 217/277
- Accuracy: 78.34%
- SST-2 (`roberta-base-sst2`)
- `datasets` dataset `glue`, subset `sst2`, split `validation`
- Successes: 820/872
- Accuracy: 94.04%
- STS-b (`roberta-base-stsb`)
- `datasets` dataset `glue`, subset `stsb`, split `validation`
- Pearson correlation: 0.906067852162708
- Spearman correlation: 0.9025045272903051
- WNLI (`roberta-base-wnli`)
- `datasets` dataset `glue`, subset `wnli`, split `validation`
- Successes: 40/71
- Accuracy: 56.34%
</section>
### `xlnet-base-cased`
<section>
- CoLA (`xlnet-base-cased-cola`)
- `datasets` dataset `glue`, subset `cola`, split `validation`
- Successes: 800/1000
- Accuracy: 80.00%
- IMDB (`xlnet-base-cased-imdb`)
- `datasets` dataset `imdb`, split `test`
- Successes: 957/1000
- Accuracy: 95.70%
- Movie Reviews [Rotten Tomatoes] (`xlnet-base-cased-mr`)
- `datasets` dataset `rotten_tomatoes`, split `validation`
- Successes: 908/1000
- Accuracy: 90.80%
- `datasets` dataset `rotten_tomatoes`, split `test`
- Successes: 876/1000
- Accuracy: 87.60%
- MRPC (`xlnet-base-cased-mrpc`)
- `datasets` dataset `glue`, subset `mrpc`, split `validation`
- Successes: 363/408
- Accuracy: 88.97%
- Recognizing Textual Entailment (`xlnet-base-cased-rte`)
- `datasets` dataset `glue`, subset `rte`, split `validation`
- Successes: 196/277
- Accuracy: 70.76%
- STS-b (`xlnet-base-cased-stsb`)
- `datasets` dataset `glue`, subset `stsb`, split `validation`
- Pearson correlation: 0.883111673280641
- Spearman correlation: 0.8773439961182335
- WNLI (`xlnet-base-cased-wnli`)
- `datasets` dataset `glue`, subset `wnli`, split `validation`
- Successes: 41/71
- Accuracy: 57.75%
</section>

Binary file not shown (added image, 97 KiB).


@@ -30,7 +30,8 @@ TextAttack Documentation
Tutorial 4: Attacking scikit-learn models <2notebook/Example_1_sklearn.ipynb>
Tutorial 5: Attacking AllenNLP models <2notebook/Example_2_allennlp.ipynb>
Tutorial 6: Attacking multilingual models <2notebook/Example_4_CamemBERT.ipynb>
Tutorial 7: Explaining BERT model using Captum <2notebook/Example_5_Explain_BERT.ipynb>
.. toctree::
:maxdepth: 6
@@ -41,4 +42,7 @@ TextAttack Documentation
1start/api-design-tips.md
3recipes/attack_recipes
3recipes/augmenter_recipes
3recipes/models.md
apidoc/textattack


@@ -2,6 +2,10 @@
Word Swap by Embedding
============================================
Based on paper: `<https://arxiv.org/abs/1603.00892>`_
Paper title: Counter-fitting Word Vectors to Linguistic Constraints
"""
import os