32 Commits

Author SHA1 Message Date
Zafir Stojanovski
b843f33b1d fix(eval): comparison plot (#441)
* heatmap
* filter comparison plots
* latex style
* curriculum heatmap
* pre-commit
* update figsize
* large y-ticks
* larger font
* thinner
* include 50
2025-05-29 12:31:07 +02:00
Adefioye
5b653b346c Data collisions notebooks and data (#406)
* Add collisions data
* Fix logic issues in basic_arithmetic and gsm_symbolic data
2025-04-02 09:36:09 +02:00
Oliver Stanley
d1e505a8e9 First version of CodeI/O reasoning data (#264)
* notebook for prepping first set of raw code files
* updated codeio processing notebook for repo-level processing
* fix for edge case in codeio scoring
* Add reformat notebook
* filtering pass
* add non-determinism filtering
* Tweak CodeIODataset & include first real data
* add basic codeio test, metadata
2025-03-05 22:34:11 +01:00
Andreas Köpf
a56b3b6c5c Merge pull request #186 from zafstojano/feat/codeio
feat(env): CodeIO
2025-02-27 12:18:13 +01:00
Zafir Stojanovski
4c637c3b13 final tweaks 2025-02-27 08:38:34 +01:00
Zafir Stojanovski
1ec625cbd9 update timeout 2025-02-26 20:27:43 +01:00
Zafir Stojanovski
2ce450486d e2b testing 2025-02-26 20:19:52 +01:00
Zafir Stojanovski
b47bf882ce filtering 2025-02-25 22:21:26 +01:00
Zafir Stojanovski
5ed4395613 async 2025-02-24 22:07:35 +01:00
Zafir Stojanovski
aac7175c69 generate inputs synchronously 2025-02-24 15:58:06 +01:00
Zafir Stojanovski
96dad6c7f3 sampling code 2025-02-23 00:40:11 +01:00
Andreas Köpf
7a1e387d6e Merge pull request #176 from olliestanley/codeio-experiments
Experiments with CodeI/O techniques for synthesising reasoning data
2025-02-22 16:24:17 +01:00
Zafir Stojanovski
e04ca72809 greedy coreset sampling 2025-02-22 16:15:14 +01:00
Zafir Stojanovski
6bbec2ac4e exploratory notebook 2025-02-22 00:46:33 +01:00
Oliver
081f84dec6 Add steps to synthesize CoTs with DeepSeekV3 2025-02-21 23:36:19 +00:00
Oliver
cce6002c70 Improve prompt for better LLM adherence 2025-02-21 23:00:48 +00:00
Andreas Koepf
eeb9fa31d5 more native type hints 2025-02-21 21:23:14 +01:00
Oliver
cb1f634078 Prompt tweak 2025-02-21 18:34:13 +00:00
Oliver
a0ccfa5144 Merge branch 'main' into codeio-experiments 2025-02-21 17:25:08 +00:00
Andreas Koepf
3e7ff3b084 use native types List->list, Dict->dict, Set->set, Tuple->tuple 2025-02-21 15:15:38 +01:00
Oliver
b7ee70995e Prompt tweak for code preprocessing 2025-02-20 20:07:32 +00:00
Oliver
6f9b81b879 Add initial CodeI/O experiment notebook 2025-02-20 20:03:36 +00:00
Andreas Köpf
d2bef8d30f Merge pull request #65 from zafstojano/env/group-anagrams
Group Anagrams together
2025-02-06 13:03:27 +01:00
Zafir Stojanovski
6ec6876221 add source for words_alpha.txt 2025-02-06 10:12:38 +01:00
Andreas Koepf
40420a35b9 update gsmcross-check status 2025-02-05 21:14:19 +01:00
Andreas Koepf
afb95508ef gsm_symbolic generator changes 2025-02-05 20:58:01 +01:00
Zafir Stojanovski
76a3d4761c generate all english anagrams 2025-02-05 16:25:23 +01:00
Andreas Koepf
c8fcb6ca02 black formatting 2025-02-03 22:57:24 +01:00
abdulhakeem
5d0ad82034 Add EOL to test_generator_files 2025-02-01 20:41:31 -06:00
abdulhakeem
715102c277 Remove .DS_Store 2025-02-01 20:39:37 -06:00
Andreas Koepf
7c0509db7a add eval demo for generated script 2025-01-29 18:28:17 +01:00
Andreas Koepf
d6c9a534af first steps for automatic generation of gsm generator functions 2025-01-29 17:55:37 +01:00