Zafir Stojanovski | c6663cdb81 | fix(training): Prepend <think> token in format reward (#396) * prepend think token in format reward * pre commit + fix some default vals * add checkpoint config | 2025-03-28 09:45:17 +01:00
Andreas Koepf | 2802066233 | remove data/ from main .gitignore | 2025-03-07 16:16:40 +01:00
Zafir Stojanovski | 5109ed89c9 | pre-commit | 2025-02-23 13:11:31 +01:00
Zafir Stojanovski | 6bbec2ac4e | exploratory notebook | 2025-02-22 00:46:33 +01:00
tohskai | 847442ef0a | Add PolynomialMultiplicationDataset (#64) * Add PolynomialMultiplicationDataset | 2025-02-07 14:06:41 +01:00
abdulhakeem | 715102c277 | Remove .DS_Store | 2025-02-01 20:39:37 -06:00
Rich Jones | 99bf648989 | initial bf working, contrib not committed | 2025-01-30 15:38:03 +01:00
Andreas Koepf (aider) | 3f80fd7b80 | build: Initialize reasoning_gym package structure with packaging and development setup | 2025-01-23 10:50:54 +01:00
Andreas Koepf | 530cb523c8 | chore: Add .gitignore with .aider and .env files | 2025-01-23 10:50:53 +01:00