Updates on the best model ever: FinGPT v3.3

This commit is contained in:
oliverwang15
2023-10-12 22:57:56 -04:00
parent e1a1232b71
commit 792a2feeea


@@ -16,7 +16,6 @@ Let us DO NOT expect Wall Street to open-source LLMs nor open APIs, due to FinTe
**Disclaimer: We are sharing code for academic purposes under the MIT education license. Nothing herein is financial advice, nor a recommendation to trade real money. Please use common sense and always first consult a professional before trading or investing.**
[![](https://dcbadge.vercel.app/api/server/trsr8SXpW5)](https://discord.gg/trsr8SXpW5)
@@ -32,7 +31,7 @@ Let us DO NOT expect Wall Street to open-source LLMs nor open APIs, due to FinTe
## FinGPT Demos
* [FinGPT V3 (Updated on 10/12/2023)](./fingpt)
* What's new: **Best trainable and inferable FinGPT for sentiment analysis on a single RTX 3090, which even better than GPT-4 and ChatGPT Finetuning.**
* What's new: **Best trainable and inferable FinGPT for sentiment analysis on a single RTX 3090, which is even better than GPT-4 and ChatGPT Finetuning.**
* The FinGPT v3 series are LLMs finetuned with the LoRA method on news and tweets sentiment analysis datasets, and they achieve the best scores on most financial sentiment analysis datasets at low cost; a minimal inference sketch follows below.
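A minimal inference sketch for this setup, assuming the Hugging Face `transformers`, `peft`, and `bitsandbytes` stack. The adapter ID is the FinGPT v3.3 checkpoint linked in the table below; the base model name and the instruction-style prompt format are illustrative assumptions, not confirmed by this README:

```python
# Sketch: load a base LLM in 8-bit and attach the FinGPT v3.3 LoRA adapter
# for financial sentiment analysis. Base model and prompt template are assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base = "meta-llama/Llama-2-13b-hf"  # assumed base model for FinGPT v3.3
adapter = "oliverwang15/FinGPT_v33_Llama2_13B_Sentiment_Instruction_LoRA_FT_8bit"

tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(
    base, load_in_8bit=True, device_map="auto"   # 8-bit weights fit on a single RTX 3090 (24 GB)
)
model = PeftModel.from_pretrained(model, adapter)  # apply the LoRA adapter weights
model.eval()

# Hypothetical instruction-style prompt; the exact template used in training may differ.
prompt = (
    "Instruction: What is the sentiment of this news? "
    "Please choose an answer from {negative/neutral/positive}.\n"
    "Input: Tesla deliveries beat analyst expectations this quarter.\n"
    "Answer: "
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=8)
# Decode only the newly generated tokens (the model's sentiment answer).
print(tokenizer.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```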
@@ -40,20 +39,20 @@ Let us DO NOT expect Wall Street to open-source LLMs nor open APIs, due to FinTe
* Benchmark Results:
* | Weighted F1 | FPB | FiQA-SA | TFNS | NWGI | Devices | Time | Cost |
| ---------------- | --------- | --------- | --------- | --------- | ------------------ | ----------- | ------------- |
| BloombergGPT | 0.511 | 0.751 | - | - | 512 × A100 | 53 days | $2.67 million |
| Llama2-7B | 0.390 | 0.800 | 0.296 | 0.503 | 2048 × A100 | 21 days | $4.23 million |
| FinBERT | 0.880 | 0.596 | 0.733 | 0.538 | 4 × NVIDIA K80 GPU | - | - |
| GPT-4 | 0.833 | 0.630 | 0.808 | - | - | - | - |
| OpenAI Fine-tune | 0.878 | **0.887** | 0.883 | - | - | - | - |
| FinGPT v3.1 | 0.855 | 0.850 | 0.875 | 0.642 | 1 × A100 | 5.5 hours | 22.55 |
| FinGPT (8bit) | 0.855 | 0.847 | 0.879 | 0.632 | 1 × RTX 3090 | 6.47 hours | 6.47 |
| FinGPT (QLoRA) | 0.777 | 0.752 | 0.828 | 0.583 | 1 × RTX 3090 | 4.15 hours | 4.15 |
| FinGPT v3.2 | 0.850 | 0.860 | 0.894 | 0.636 | 1 × A100 | 5.5 hours | 22.55 |
| FinGPT v3.3 | **0.882** | 0.874 | **0.903** | **0.643** | 1 × RTX 3090 | 17.25 hours | 17.25 |
* | Weighted F1 | FPB | FiQA-SA | TFNS | NWGI | Devices | Time | Cost |
| ------------------------------------------------------------ | :-------: | :-------: | :-------: | :-------: | :----------------: | :---------: | :------------: |
| [FinGPT v3.3](https://huggingface.co/oliverwang15/FinGPT_v33_Llama2_13B_Sentiment_Instruction_LoRA_FT_8bit) | **0.882** | 0.874 | **0.903** | **0.643** | 1 × RTX 3090 | 17.25 hours | $17.25 |
| [FinGPT v3.2](https://huggingface.co/oliverwang15/FinGPT_v32_Llama2_Sentiment_Instruction_LoRA_FT) | 0.850 | 0.860 | 0.894 | 0.636 | 1 × A100 | 5.5 hours | $22.55 |
| [FinGPT v3.1](https://huggingface.co/oliverwang15/FinGPT_v31_ChatGLM2_Sentiment_Instruction_LoRA_FT) | 0.855 | 0.850 | 0.875 | 0.642 | 1 × A100 | 5.5 hours | $22.55 |
| FinGPT (8bit) | 0.855 | 0.847 | 0.879 | 0.632 | 1 × RTX 3090 | 6.47 hours | $6.47 |
| FinGPT (QLoRA) | 0.777 | 0.752 | 0.828 | 0.583 | 1 × RTX 3090 | 4.15 hours | $4.15 |
| OpenAI Fine-tune | 0.878 | **0.887** | 0.883 | - | - | - | - |
| GPT-4 | 0.833 | 0.630 | 0.808 | - | - | - | - |
| FinBERT | 0.880 | 0.596 | 0.733 | 0.538 | 4 × NVIDIA K80 GPU | - | - |
| Llama2-7B | 0.390 | 0.800 | 0.296 | 0.503 | 2048 × A100 | 21 days | $4.23 million |
| BloombergGPT | 0.511 | 0.751 | - | - | 512 × A100 | 53 days | $2.67 million |
**Cost per GPU hour.** For A100 GPUs, the AWS p4d.24xlarge instance, equipped with 8 A100 GPUs is used as a benchmark to estimate the costs. Note that BloombergGPT also used p4d.24xlarge As of July 11, 2023, the hourly rate for this instance stands at $32.773. Consequently, the estimated cost per GPU hour comes to $32.77 divided by 8, resulting in approximately **$4.10**. With this value as the reference unit price (1 GPU hour). **BloombergGPT estimated cost= 512 x 53 x 24 = 651,264 GPU hours x $4.10 = $2,670,182.40**. For RTX 3090, we assume its cost per hour is approximately $1.0, which is actually much higher than available GPUs from platforms like vast.ai.
**Cost per GPU hour.** For **A100 GPUs**, the AWS p4d.24xlarge instance, equipped with 8 A100 GPUs, is used as the benchmark to estimate costs; note that BloombergGPT also used p4d.24xlarge. As of July 11, 2023, the hourly rate for this instance is $32.773, so the estimated cost per GPU hour is $32.773 divided by 8, approximately **$4.10**. Using this value as the reference unit price (1 GPU hour), the **estimated BloombergGPT cost is 512 GPUs × 53 days × 24 hours = 651,264 GPU hours × $4.10 = $2,670,182.40**. For **RTX 3090**, we assume its cost per hour is approximately **$1.0**, which is actually much higher than the going rates for such GPUs on platforms like vast.ai. A small worked example of this arithmetic follows below.
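For clarity, a tiny worked version of how the Cost column is derived; the per-GPU-hour prices are the assumptions stated above, not measured rates:

```python
# Sketch of the cost estimates: GPU-hours multiplied by an assumed price per GPU-hour
# ($32.773 / 8 ~= $4.10 for an A100 on p4d.24xlarge, $1.00 for an RTX 3090).
A100_PER_GPU_HOUR = 4.10       # 32.773 / 8, rounded as in the note above
RTX3090_PER_GPU_HOUR = 1.00    # assumed flat rate

# BloombergGPT: 512 A100s running for 53 days
bloomberg_gpu_hours = 512 * 53 * 24              # 651,264 GPU-hours
print(bloomberg_gpu_hours * A100_PER_GPU_HOUR)   # ~= 2,670,182.40 -> ~$2.67 million

# FinGPT v3.3: a single RTX 3090 for 17.25 hours, so the Time and Cost columns coincide
print(17.25 * RTX3090_PER_GPU_HOUR)              # 17.25 -> $17.25
```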
* Reproduce the results by running the [benchmarks](./fingpt/FinGPT-v3/benchmark/benchmarks.ipynb); a detailed tutorial is on the way.
* Finetune your own FinGPT v3 model with the LoRA method on a single RTX 3090 using this [notebook](./fingpt/FinGPT-v3/training_8bit/train.ipynb) in 8-bit or this [notebook](./fingpt/FinGPT-v3/training_int4/train.ipynb) in int4 (QLoRA); a minimal sketch of the 8-bit LoRA setup follows below.
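A minimal sketch of the 8-bit LoRA training setup referenced above, assuming `transformers`, `peft`, `bitsandbytes`, and `datasets`. The base model, LoRA hyperparameters, and dataset path are illustrative assumptions and may differ from the actual notebooks:

```python
# Sketch: 8-bit LoRA finetuning of a causal LM for financial sentiment on one RTX 3090.
# Hyperparameters and data fields are illustrative, not the notebooks' exact settings.
from transformers import (AutoModelForCausalLM, AutoTokenizer, TrainingArguments,
                          Trainer, DataCollatorForLanguageModeling)
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training
from datasets import load_dataset

base = "meta-llama/Llama-2-13b-hf"  # assumed base model
tokenizer = AutoTokenizer.from_pretrained(base)
tokenizer.pad_token = tokenizer.eos_token

model = AutoModelForCausalLM.from_pretrained(base, load_in_8bit=True, device_map="auto")
model = prepare_model_for_kbit_training(model)   # cast norms, enable input grads for LoRA

lora_cfg = LoraConfig(                           # assumed LoRA hyperparameters
    r=8, lora_alpha=32, lora_dropout=0.1,
    target_modules=["q_proj", "v_proj"], task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_cfg)          # only the adapter weights are trainable

# Hypothetical instruction-formatted dataset with a prepared "text" field per example.
dataset = load_dataset("json", data_files="sentiment_train.jsonl")["train"]
def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)
dataset = dataset.map(tokenize, batched=True, remove_columns=dataset.column_names)

args = TrainingArguments(
    output_dir="fingpt_lora_8bit",
    per_device_train_batch_size=1, gradient_accumulation_steps=16,  # fit in 24 GB VRAM
    num_train_epochs=2, learning_rate=1e-4, fp16=True, logging_steps=50,
)
collator = DataCollatorForLanguageModeling(tokenizer, mlm=False)    # labels = input_ids
trainer = Trainer(model=model, args=args, train_dataset=dataset, data_collator=collator)
trainer.train()
model.save_pretrained("fingpt_lora_8bit")        # saves only the small LoRA adapter
```

Because only the LoRA adapter weights are updated while the 8-bit base model stays frozen, the memory footprint stays within a single consumer GPU, which is the point of the 8-bit and int4 (QLoRA) notebooks above.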