Mirror of https://github.com/hate-alert/DE-LIMIT.git (synced 2021-05-12 18:32:23 +03:00)
Instructions for LASER+LR models
- LASER+LR baseline
  - Generate the LASER embeddings for the datasets of the target language. Refer to the LASER GitHub repository for guidelines on how to install LASER and generate the embeddings.
  - The code expects the embeddings to be present in the directory Dataset/embedding, with the train, val, and test files placed in the train, val, and test subdirectories of that folder, respectively. Each file is expected to be named {language}.csv, e.g. English.csv, German.csv, etc.
  - To run the baseline experiment, use the LASER+LR/Baselines.ipynb notebook to run the logistic regression model. You can choose the language and the sample size by passing them as parameters to the function. Refer to the notebook for further details and instructions.
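
The directory layout expected by the baseline can be sketched as a small path helper, which may be handy for checking that the embedding files are in place before opening the notebook. This is only an illustration and not code from the repository: the function names embedding_paths and missing_files are hypothetical, and the default root simply reuses the Dataset/embedding directory named in the instructions.

```python
from pathlib import Path

# Split subdirectories named in the instructions.
SPLITS = ("train", "val", "test")


def embedding_paths(language, root="Dataset/embedding"):
    """Return a {split: path} mapping for one language.

    Assumes the layout <root>/<split>/<language>.csv described in the
    instructions; this helper itself is hypothetical.
    """
    return {split: Path(root) / split / f"{language}.csv" for split in SPLITS}


def missing_files(language, root="Dataset/embedding"):
    """List the expected embedding files that do not exist on disk."""
    return [p for p in embedding_paths(language, root).values() if not p.is_file()]


if __name__ == "__main__":
    for lang in ("English", "German"):
        missing = missing_files(lang)
        status = "ok" if not missing else "missing: " + ", ".join(map(str, missing))
        print(f"{lang}: {status}")
```

Running the script from the repository root prints, for each language, either ok or the list of files that still need to be generated.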
- LASER+LR all_but_one
  - As in the above case, generate the LASER embeddings and place them in the respective directories in the Dataset/embedding folder.
  - Use the files LASER+LR/All_but_one.ipynb and LASER+LR/All.ipynb to run the all_but_one experiments. The All_but_one.ipynb notebook contains code for running the all_but_one experiment using a sample of target language points (including the zero-shot case). The All.ipynb notebook contains code for running the experiment using all the datasets available, i.e. 100% of the training dataset.
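
As a rough sketch of what an all_but_one run involves, the helper below builds a plan for one target language. It assumes that all_but_one means training on every language except the target, plus an optional sample of target language points, with a sample size of 0 corresponding to the zero-shot case. The function name all_but_one_plan and the parameter target_sample_size are hypothetical and do not come from the notebooks, which remain the reference for the actual procedure.

```python
def all_but_one_plan(languages, target, target_sample_size=0):
    """Describe an all_but_one run for one target language.

    Assumption: the training set is every language except `target`, plus
    `target_sample_size` points sampled from the target language
    (0 means zero-shot). Names here are illustrative only.
    """
    if target not in languages:
        raise ValueError(f"unknown target language: {target}")
    if target_sample_size < 0:
        raise ValueError("target_sample_size must be non-negative")

    # Keep the original order, dropping only the target language.
    sources = [lang for lang in languages if lang != target]
    return {
        "source_languages": sources,
        "target_language": target,
        "target_sample_size": target_sample_size,
        "zero_shot": target_sample_size == 0,
    }


if __name__ == "__main__":
    plan = all_but_one_plan(["English", "German"], target="German")
    print(plan["source_languages"], plan["zero_shot"])
```

Keeping the plan as a plain dictionary makes it easy to log alongside the results of each run.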