59 Commits

Author SHA1 Message Date
Pasquale Minervini
d8f1337e7e README edit -- running the service with no GPU or CUDA support (#773)
One-line addition to the README to show how to run the service on a
machine without GPUs or CUDA support (e.g., for local prototyping)

---------

Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com>
2023-08-14 15:41:13 +02:00
sawan Rawat
3ffcd9d311 Added two more features in readme.md file (#831)
# What does this PR do?

<!--
Congratulations! You've made it this far! You're not quite done yet
though.

Once merged, your PR is going to appear in the release notes with the
title you set, so make sure it's a great title that fully reflects the
extent of your awesome contribution.

Then, please replace this with a description of the change and which
issue is fixed (if applicable). Please also include relevant motivation
and context. List any dependencies (if any) that are required for this
change.

Once you're done, someone will review your PR shortly (see the section
"Who can review?" below to tag some potential reviewers). They may
suggest changes to make the code even better. If no one reviewed your PR
after a week has passed, don't hesitate to post a new comment
@-mentioning the same persons---sometimes notifications get lost.
-->

<!-- Remove if not applicable -->

Fixes # (issue)


## Before submitting
- [ ] This PR fixes a typo or improves the docs (you can dismiss the
other checks if that's the case).
- [ ] Did you read the [contributor
guideline](https://github.com/huggingface/transformers/blob/main/CONTRIBUTING.md#start-contributing-pull-requests),
      Pull Request section?
- [ ] Was this discussed/approved via a GitHub issue or the
[forum](https://discuss.huggingface.co/)? Please add a link
      to it if that's the case.
- [ ] Did you make sure to update the documentation with your changes?
Here are the
[documentation
guidelines](https://github.com/huggingface/transformers/tree/main/docs),
and
[here are tips on formatting
docstrings](https://github.com/huggingface/transformers/tree/main/docs#writing-source-documentation).
- [ ] Did you write any new necessary tests?


## Who can review?

Anyone in the community is free to review the PR once the tests have
passed. Feel free to tag
members/contributors who may be interested in your PR.

<!-- Your PR will be replied to more quickly if you can figure out the
right person to tag with @


@OlivierDehaene OR @Narsil

 -->
2023-08-14 14:09:20 +02:00
Nicolas Patry
09eca64227 Version 1.0.1 (#836)
2023-08-14 11:23:11 +02:00
Nicolas Patry
cc7bb5084d Upgrade transformers (fix protobuf==3.20 issue) (#795)
# What does this PR do?

Fixes #531

2023-08-11 16:46:08 +02:00
Nicolas Patry
16fadcec57 Merge BNB 4bit. (#770)
# What does this PR do?


See #626 

---------

Co-authored-by: krzim <zimmerk4@live.com>
2023-08-03 23:00:59 +02:00
OlivierDehaene
3ef5ffbc64 v1.0.0 (#727) 2023-07-28 17:43:46 +02:00
regisss
f848decee6 docs: Add hardware section to TOC in README (#721) 2023-07-28 11:20:03 +02:00
regisss
5a1cccbb98 Add section about TGI on other AI hardware accelerators in README (#715)
# What does this PR do?


As per title.


## Before submitting
- [x] This PR fixes a typo or improves the docs (you can dismiss the
other checks if that's the case).
2023-07-28 09:14:03 +02:00
OlivierDehaene
9f18f4c006 v0.9.4 (#713) 2023-07-27 19:25:15 +02:00
OlivierDehaene
e64a65891b docs(README): update readme 2023-07-25 19:45:25 +02:00
Nicolas Patry
5a1512c025 docs: Update README.md (#643) 2023-07-19 13:39:12 +02:00
Nicolas Patry
1c81df15cd docs: Update README.md (#639) 2023-07-19 13:38:52 +02:00
OlivierDehaene
cf83f9b66f v0.9.3 (#634) 2023-07-18 18:11:20 +02:00
Victor Muštar
c8b077be79 docs: README: Add logo + baseline (#611)
![image](https://github.com/huggingface/text-generation-inference/assets/3841370/58177321-479f-4ad1-b3bc-cec027423984)
2023-07-13 21:45:20 +02:00
OlivierDehaene
e28a809004 v0.9.0 (#525) 2023-07-01 19:25:41 +02:00
OlivierDehaene
e74bd41e0f feat(server): add paged attention to flash models (#516)
Closes #478
2023-06-30 19:09:59 +02:00
OlivierDehaene
081b926584 v0.8.0 2023-05-30 18:39:35 +02:00
OlivierDehaene
d31562f300 v0.7.0 (#353) 2023-05-23 21:20:49 +02:00
OlivierDehaene
e71471bec9 feat: add snapshot testing (#282) 2023-05-15 23:36:30 +02:00
Nicolas Patry
e86cca9723 Adding docs on how dynamic batching works. (#258)
This PR starts with the minimal amount of explanation I could think
of. It tries to explain how dynamic batching occurs and how it interacts
with past key values, and it ignores the padding problem.

Maybe some drawings could help too but I kept it to text for now.
2023-05-01 14:16:50 +02:00
Nicolas Patry
b0b97fd9a7 doc(launcher): add more docs to the launcher itself and link in the README (#257) 2023-04-29 11:53:42 +02:00
Ehsan M. Kermani
f092ba9b22 feat(server): add watermarking tests (#248) 2023-04-27 19:16:35 +02:00
OlivierDehaene
b927244eb5 feat(python-client): get list of currently deployed tgi models using the inference API (#191) 2023-04-17 18:43:24 +02:00
OlivierDehaene
f26dfd0dc1 feat(server): support OPT models (#55)
OPT models do not all have a `tokenizer.json` file on the hub at the
moment. Can't merge for now.
2023-04-11 19:16:41 +02:00
OlivierDehaene
299217c95c feat(server): add flash attention llama (#144) 2023-04-11 16:38:22 +02:00
Guspan Tanadi
9122e7bd9c docs(readme): provide link Logits Warper README (#154) 2023-04-04 13:27:46 +02:00
lewtun
5e5e9d4bbd feat: Add note about NVIDIA drivers (#64)
Co-authored-by: OlivierDehaene <olivier@huggingface.co>
2023-03-23 18:03:45 +01:00
OlivierDehaene
3fef90d50f feat(clients): Python client (#103) 2023-03-07 18:52:22 +01:00
OlivierDehaene
1c19b0934e v0.3.2 (#97) 2023-03-03 18:42:20 +01:00
OlivierDehaene
0fbc691946 feat: add safetensors conversion (#63) 2023-02-14 13:02:16 +01:00
OlivierDehaene
9af454142a feat: add distributed tracing (#62) 2023-02-13 13:02:45 +01:00
Yannic Kilcher
e520d5b349 fixed SSE naming (#61)
https://en.wikipedia.org/wiki/Server-sent_events
2023-02-08 22:30:11 +01:00
OlivierDehaene
1ad3250b89 fix(docker): increase shm size (#60) 2023-02-08 17:53:33 +01:00
OlivierDehaene
c503a639b1 feat(server): support t5 (#59) 2023-02-07 18:25:17 +01:00
lewtun
a0dca443dd feat(docs): Clarify installation steps (#54)
Adds some bits for first-time users (like me 😄 )
2023-02-03 13:07:55 +01:00
OlivierDehaene
20c3c5940c feat(router): refactor API and add openAPI schemas (#53) 2023-02-03 12:43:37 +01:00
OlivierDehaene
313194f6d7 feat(server): support repetition penalty (#47) 2023-02-01 15:58:42 +01:00
OlivierDehaene
2ad895a6cc feat(server): allow gpt-neox models with odd vocab sizes to be sharded (#48) 2023-02-01 14:43:59 +01:00
OlivierDehaene
f830706b21 feat(server): Support GPT-Neox (#39) 2023-01-31 18:53:56 +01:00
OlivierDehaene
15511edc01 feat(server): Support SantaCoder (#26) 2023-01-20 12:24:39 +01:00
OlivierDehaene
32a253063d feat: Return logprobs (#8) 2022-12-15 17:03:56 +01:00
OlivierDehaene
718096f695 feat: Support stop sequences (#7) 2022-12-12 18:25:22 +01:00
OlivierDehaene
a2985036aa feat(server): Add model tests (#6) 2022-12-08 18:49:33 +01:00
OlivierDehaene
daa1d81d5e feat(server): Support Galactica (#4) 2022-12-01 19:31:54 +01:00
OlivierDehaene
feb7806ca4 fix(readme): Typo 2022-11-14 16:22:10 +01:00
OlivierDehaene
4236e41b0d feat(server): Improved doc 2022-11-07 12:53:56 +01:00
OlivierDehaene
427d7cc444 feat(server): Support AutoModelForSeq2SeqLM 2022-11-04 18:03:04 +01:00
OlivierDehaene
c5665f5c8b feat(server): Support generic AutoModelForCausalLM 2022-11-04 14:22:47 +01:00
OlivierDehaene
755fc0e403 fix(models): Revert buggy support for AutoModel 2022-11-03 16:07:54 +01:00
OlivierDehaene
b3b7ea0d74 feat: Use json formatter by default in docker image 2022-11-02 17:29:56 +01:00