Commit Graph

  • 2c4e45d9a9 Update spiral_matrix.py (#511) main Zafir Stojanovski 2025-10-06 13:02:32 +02:00
  • bcc68c5fee Update README.md with new project (#510) Zafir Stojanovski 2025-10-02 17:43:32 +02:00
  • 15d7f027e4 add mila projects (#508) Zafir Stojanovski 2025-09-29 15:37:13 +02:00
  • dd3117bbaf bump version to v0.1.25.dev0 (#509) Andreas Köpf 2025-09-29 15:36:30 +02:00
  • 2f9eaee32a fix: Register missing coin_flip (#507) v0.1.24 Zafir Stojanovski 2025-09-15 14:23:30 +02:00
  • 3fcb8642c6 (README): add gensyn paper (#506) Zafir Stojanovski 2025-09-11 17:11:04 +02:00
  • b0815043a2 Add probability dataset (initial: Coin Flip dataset + curriculum) (#505) Kumar Anant 2025-09-06 20:29:23 +05:30
  • b399c658ca Add OptimalThinkingBench to Projects Using RG (#503) Rich Jones 2025-08-24 21:36:11 +02:00
  • 02b7fac863 fix encoding to be able to run on win (#502) Denini Gabriel 2025-08-18 05:19:45 -03:00
  • b8aa55704b add discord link (#500) Zafir Stojanovski 2025-08-05 10:57:46 +02:00
  • 678622faec add imports feat/path-star-env Oliver 2025-08-03 16:00:49 +01:00
  • 641712e7fa rm teacherless mode Oliver 2025-08-03 15:59:56 +01:00
  • 18ff9868df fix for paper spec Oliver 2025-08-03 15:42:59 +01:00
  • d795ba68c9 typos Oliver 2025-08-02 22:56:54 +01:00
  • fc02d1af5a draft path-star task Oliver 2025-08-02 22:40:00 +01:00
  • 86c4f8552f add GEM to projects using RG (#498) Zafir Stojanovski 2025-08-02 10:09:53 +02:00
  • 0e4582f83b fix(evaluation): Add instructions for running on MMLU Pro (#497) Zafir Stojanovski 2025-08-01 16:27:56 +02:00
  • a969d8ef05 feat(curriculum): Knights and Knaves configs (#488) Zafir Stojanovski 2025-07-31 10:18:05 +02:00
  • cf99528dbe Run categories in parallel (#492) Szymon Ożóg 2025-07-30 19:11:27 +02:00
  • 70af0ad699 Update load_fsdp_to_hf.py feat/curriculum-exp joesharratt1229 2025-07-28 15:57:15 +01:00
  • 62887ad1fc Update load_fsdp_to_hf.py joesharratt1229 2025-07-28 15:56:16 +01:00
  • 3393d22611 added training and evaluation curr conf joesharratt1229 2025-07-28 15:51:19 +01:00
  • d73d881073 reps feat/multi-curriculum-exp Oliver 2025-07-28 12:52:31 +01:00
  • 523d56f019 add graphs eval configs Oliver 2025-07-28 12:21:31 +01:00
  • b29093e2ee Add option to increase timeout (#493) Szymon Ożóg 2025-07-28 06:26:09 +02:00
  • 1ab6cf3a12 update noncurric Oliver 2025-07-27 12:18:05 +01:00
  • 140ff6e67f exp names Oliver 2025-07-27 11:04:44 +01:00
  • 0f5352e5cd fix: Training README.md (#491) Zafir Stojanovski 2025-07-27 11:56:00 +02:00
  • 60a4f2e3a6 specify model_dtype Oliver 2025-07-27 10:11:25 +01:00
  • 13fee716e3 incorporate tweaks Oliver 2025-07-27 10:06:43 +01:00
  • b9ab72a28d thresholds Oliver 2025-07-27 09:55:32 +01:00
  • 93b387a778 prep graphs curriculum exps Oliver 2025-07-27 09:44:37 +01:00
  • 37697e2421 Update README.md fix/update-training-readme Zafir Stojanovski 2025-07-27 05:59:58 +02:00
  • 929c17523a add graphs curriculum config Oliver 2025-07-26 21:42:59 +01:00
  • 4b60c32978 Curr exp (#487) joesharratt1229 2025-07-25 20:38:47 +01:00
  • 2d19f13e0f [fix #484] resolve basic_arithmetic fails when size is large (#485) theblackcat102 2025-07-07 16:46:23 +08:00
  • bf451d5197 Update README.md (#483) v0.1.23 Zafir Stojanovski 2025-07-05 01:57:21 +02:00
  • 1c98584f28 Feat/unsloth example (#482) joesharratt1229 2025-06-28 17:04:38 +01:00
  • c44ff8c542 updated failing hooks feat/unsloth-example joesharratt1229 2025-06-27 07:59:58 +00:00
  • 799eb51800 cleaned up examples joesharratt1229 2025-06-27 07:58:46 +00:00
  • d9cd20c174 Update README.md (RLSwarm GenRL) (#480) Rich Jones 2025-06-26 11:20:45 +02:00
  • 1c9ed2e0eb better usage demo in readme (#477) Oliver Stanley 2025-06-25 21:38:25 +01:00
  • 876e0aa440 corrected countdown issue (#479) joesharratt1229 2025-06-25 21:37:04 +01:00
  • c2ac6fae32 Update README.md (#475) Zafir Stojanovski 2025-06-24 15:19:11 +02:00
  • a4006f6d0e Update README.md - Add Synthetic-2 Miserlou-patch-1 Rich Jones 2025-06-24 14:07:55 +02:00
  • 56ce2e79a7 tutorial(training): Add a minimal example with trl (#473) Zafir Stojanovski 2025-06-21 00:01:31 +02:00
  • 49f3821098 add minimal verifiers example (#472) Oliver Stanley 2025-06-20 16:31:02 +01:00
  • 9e79fc84b6 fix: Rounding issues in score_answer and add unit tests (#462) Adefioye 2025-06-09 13:18:11 -05:00
  • 51c2afc1fc Fix/verl example (#465) joesharratt1229 2025-06-09 09:53:43 +01:00
  • 5726034a26 fix color_cubes answer strings, update gallery with latest envs (#464) Oliver Stanley 2025-06-08 12:16:54 +01:00
  • 602e4be0a2 add survo env (#461) Oliver Stanley 2025-06-08 11:56:33 +01:00
  • 0159b1b571 Update README.md - Star History (#463) Zafir Stojanovski 2025-06-08 12:51:43 +02:00
  • c2fdb11980 add kakurasu env (#460) Oliver Stanley 2025-06-08 09:20:53 +01:00
  • be2babea9c Use raw URLs for images in README.md (#459) v0.1.22 Andreas Köpf 2025-06-06 22:23:59 +02:00
  • 02e0fc1c22 pull fra main rich/genbench Rich Jones 2025-06-06 13:43:29 +02:00
  • 1232a7d1e5 simplify training setup instructions (#454) Oliver Stanley 2025-06-06 09:51:29 +01:00
  • 0ebabf709b Update README.md with Atropos (#458) Zafir Stojanovski 2025-06-06 10:24:25 +02:00
  • 0699e2f507 Update README.md (#451) Zafir Stojanovski 2025-06-04 12:45:23 +02:00
  • 1a727ecf4e support python 3.10 (#450) Oliver Stanley 2025-06-04 10:34:01 +01:00
  • 84958baa69 abs path for images (#449) Zafir Stojanovski 2025-06-04 10:33:13 +02:00
  • 2a57a95ca2 add minimal example for building training datasets (#448) v0.1.20 Oliver Stanley 2025-06-03 19:28:45 +01:00
  • b3f81a6609 fix(README): Arxiv link (#447) Zafir Stojanovski 2025-06-02 12:20:38 +02:00
  • 17a8431013 rename to easy and hard (#445) Zafir Stojanovski 2025-06-02 10:34:05 +02:00
  • af2548f8f2 Add README assets (#446) Zafir Stojanovski 2025-06-02 10:33:54 +02:00
  • 9053009dbe Fix bug in normalize_answer method (#444) Adefioye 2025-06-02 01:58:54 -05:00
  • c0e98f93b4 make task entries json serializable (#443) Oliver Stanley 2025-06-02 07:57:15 +01:00
  • 6614338ecc add numbers to performance heatmap (#442) Zafir Stojanovski 2025-05-30 18:39:13 +02:00
  • b843f33b1d fix(eval): comparison plot (#441) Zafir Stojanovski 2025-05-29 12:31:07 +02:00
  • f51769927e add .gitattributes for correct repo classification (#439) Oliver Stanley 2025-05-19 19:41:07 +01:00
  • 93e731c29c heatmap (#438) Zafir Stojanovski 2025-05-19 10:07:45 +02:00
  • add527ada1 update training dir with external eval details (#437) Oliver Stanley 2025-05-18 23:35:41 +01:00
  • 5961a10145 comparison plot (#436) Zafir Stojanovski 2025-05-18 23:57:49 +02:00
  • 0cda6b1205 qwen math training code (#435) Zafir Stojanovski 2025-05-16 13:19:19 +02:00
  • 47303211b3 update gallery (#434) Oliver Stanley 2025-05-15 21:41:43 +01:00
  • 6d9e14729a fix rubiks cube prompt (#433) Zafir Stojanovski 2025-05-15 22:31:21 +02:00
  • 85f3c6dd02 updated inter-domain generalisation eval configs (#432) Oliver Stanley 2025-05-15 08:08:16 +01:00
  • 4cab1c3e6d comparison plots (#431) Zafir Stojanovski 2025-05-12 15:41:58 +02:00
  • 3df26e0fb2 fix decimal arithmetic curriculum to respect constraints (#430) Oliver Stanley 2025-05-06 19:23:19 +01:00
  • b7d8832267 cfg external-bench-cfg Oliver 2025-04-29 19:23:16 +01:00
  • f211b8544f cfg Oliver 2025-04-29 19:15:33 +01:00
  • 374760577e cfg Oliver 2025-04-29 19:03:28 +01:00
  • 85f0675cca fix Oliver 2025-04-29 09:15:19 +01:00
  • be28d6f5f0 cfg Oliver 2025-04-29 09:03:19 +01:00
  • fe479916bf cfg Oliver 2025-04-29 09:00:09 +01:00
  • 40e3293299 cfg Oliver 2025-04-28 23:41:37 +01:00
  • e60e202db9 revert Oliver 2025-04-28 23:38:02 +01:00
  • 197a976dfd cfg Oliver 2025-04-28 23:33:21 +01:00
  • 4118a93e0a fix Oliver 2025-04-28 23:26:56 +01:00
  • fd12db4352 cfg Oliver 2025-04-28 23:21:22 +01:00
  • f404b2e603 cfg Oliver 2025-04-28 23:17:41 +01:00
  • 82ac4b27ba fix Oliver 2025-04-28 23:12:22 +01:00
  • 39b6f8f7e2 fix Oliver 2025-04-28 23:08:11 +01:00
  • a5aaa89de4 new verl Oliver 2025-04-28 23:04:14 +01:00
  • 1817984fd2 cfg Oliver 2025-04-28 22:58:07 +01:00
  • 1f92a03b6b new verl Oliver 2025-04-28 22:54:07 +01:00
  • 2dff704f0b cfg Oliver 2025-04-28 22:48:45 +01:00
  • fa52788e31 cfg Oliver 2025-04-28 22:44:08 +01:00
  • 8dd7c86368 cfg Oliver 2025-04-28 22:32:10 +01:00
  • f57b5adcb0 cfg updates Oliver 2025-04-28 22:08:26 +01:00
  • d4e19056ea cfg Oliver 2025-04-28 21:27:00 +01:00