This code is adapted from https://github.com/kimiyoung/transformer-xl.
# Transformer-XL

This repository contains the code in PyTorch for the paper

> **Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context**
>
> Zihang Dai\*, Zhilin Yang\*, Yiming Yang, Jaime Carbonell, Quoc V. Le, Ruslan Salakhutdinov (\*: equal contribution)
>
> Preprint 2018
## PyTorch

- The source code is in the `pytorch/` folder, supporting single-node multi-GPU training via the `nn.DataParallel` module (see the sketch after this list).
- Please refer to `pytorch/README.md` for details.
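
As a minimal sketch of what "single-node multi-GPU training via `nn.DataParallel`" means in practice, the snippet below wraps a toy stand-in model (not the actual Transformer-XL model from `pytorch/`; see `pytorch/README.md` for the real entry points and flags):

```python
import torch
import torch.nn as nn

# Hypothetical toy model standing in for the Transformer-XL language model
# defined in the pytorch/ folder; shown only to illustrate nn.DataParallel.
model = nn.Sequential(
    nn.Embedding(1000, 64),
    nn.Linear(64, 1000),
)

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = model.to(device)

# nn.DataParallel splits each input batch across the GPUs visible on a
# single node, runs the forward pass in parallel, and gathers the outputs.
if torch.cuda.device_count() > 1:
    model = nn.DataParallel(model)

tokens = torch.randint(0, 1000, (8, 32), device=device)  # (batch, seq_len)
logits = model(tokens)
print(logits.shape)  # torch.Size([8, 32, 1000])
```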
## Results

Transformer-XL achieves new state-of-the-art results on multiple language modeling benchmarks. It is also the first model to break through the 1.0 bits-per-character barrier on character-level language modeling. Below is a summary.
| Method | enwik8 (bpc) | text8 (bpc) | One Billion Word (ppl) | WT-103 (ppl) | PTB, w/o finetuning (ppl) |
|---|---|---|---|---|---|
| Previous best | 1.06 | 1.13 | 23.7 | 20.5 | 55.5 |
| Transformer-XL | 0.99 | 1.08 | 21.8 | 18.3 | 54.5 |
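
For reference (this note is not part of the original README): the character-level columns are reported in bits per character (bpc) and the word-level columns in perplexity. A minimal sketch of how these metrics relate to the average cross-entropy training loss in nats, assuming the loss is averaged per token:

```python
import math

def nats_to_bpc(loss_nats: float) -> float:
    """Convert an average character-level cross-entropy loss (nats) to bits per character."""
    return loss_nats / math.log(2)

def nats_to_ppl(loss_nats: float) -> float:
    """Convert an average word-level cross-entropy loss (nats) to perplexity."""
    return math.exp(loss_nats)

# Example: a char-level loss of about 0.686 nats corresponds to ~0.99 bpc
# (the enwik8 figure above); a word-level loss of about 2.907 nats
# corresponds to a perplexity of ~18.3 (the WT-103 figure above).
print(round(nats_to_bpc(0.686), 2))  # 0.99
print(round(nats_to_ppl(2.907), 1))  # 18.3
```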
## Acknowledgement

A large portion of the `getdata.sh` script comes from the awd-lstm repo.

Happy Language Modeling :)