reasoning-gym

mirror of https://github.com/open-thought/reasoning-gym.git synced 2025-10-09 13:40:09 +03:00

Author	SHA1	Message	Date
theblackcat102	2d19f13e0f	[fix #484 ] resolve basic_arithmetic fails when size is large (#485 ) * [fix] resolve basic_arithmetic fails when size is large by replacing zero divisor with 1	2025-07-07 09:46:23 +01:00
Adefioye	9e79fc84b6	fix: Rounding issues in score_answer and add unit tests (#462 )	2025-06-09 19:18:11 +01:00
Oliver Stanley	1a727ecf4e	support python 3.10 (#450 ) * support python 3.10 * add 3.10 to tests * new StrEnum	2025-06-04 10:34:01 +01:00
Adefioye	9053009dbe	Fix bug in normalize_answer method (#444 )	2025-06-02 08:58:54 +02:00
Oliver Stanley	c0e98f93b4	make task entries json serializable (#443 ) * make sympy-based task entries json serializable * remove datetime objs from time_intervals metadata * make adv geometry json serializable * make futoshiki metadata json serializable * fixes * futoshiki tweaks * fix adv geometry * deal with fractions in str representations * fix * restore start_time, end_time as str	2025-06-02 08:57:15 +02:00
Oliver Stanley	3df26e0fb2	fix decimal arithmetic curriculum to respect constraints (#430 ) * fix decimal arithmetic curriculum to respect constraints * update test accordingly	2025-05-06 19:23:19 +01:00
joesharratt1229	d0ef136d5b	Feat/intragen experiments (#414 ) * added curriculum * readapted readme * corrected small errors * Delete eval/eval/r1/algorithmic/word_sorting.json * removed redundant argument * added spell * removed duplicated fit * changed config * added composite changes * added composite changes * updated yaml * added spell backward * updated read me * added qwen2.5 * added * Add files via upload * updated missing trainer func * updated curr * updated spell back * updated correctness score func * updated configs * added local evals * added updates * updated datasets * added fsdp to hf utility * added algorithmic qwen 3b yaml * updated read me * updated configs * added preappend token * updated with thinking token * updated test score board * resolved comments * added evaluation scripts * removed results from pr * added config * added partial reward scoring * added evaluation composites * added training configs * added games eval * added Rubik's cube * resolved merge conflicts * added games config * added latest eval configs * updated structure * Delete training/evaluations/eval_graphs_composite.yaml --------- Co-authored-by: joesharratt1229 <joesharrat1229@gmail.com>	2025-04-16 08:04:52 +02:00
Zafir Stojanovski	dced3bfc45	fix(curriculum): Make boundaries in curriculum more sensible (#407 ) * init * fix tests * unify codeio * filtered for libraries not present in reasoning-gym * fix more bounds * puzzle24 * knight swap curriculum * fix number sorting * fix attributes * add validation of config in creation of dataset * dry run for instantiating and validating the datasets * remove unused imports * fix curriculum tests to reference newly updated attribute names	2025-04-04 20:24:14 +02:00
Adefioye	5b653b346c	Data collisions notebooks and data (#406 ) * Add collisions data * Fix logic issues in basic_arithmetic and gsm_symbolic data	2025-04-02 09:36:09 +02:00
Zafir Stojanovski	ce0a6c4878	fix(envs): Add source dataset and index to metadata (#388 ) * add source dataset and index to metadata * fix typo * fix coach class and its test	2025-03-20 11:12:14 +00:00
Oliver Stanley	7475a20700	include ranges rather than sampled values in difficulty metadata dicts (#387 ) * update difficulty metadata for logic datasets * update difficulty metadata for graph datasets * update difficulty metadata for geometry datasets * update difficulty metadata for games datasets * update difficulty metadata for cognition datasets * update difficulty metadata for arithmetic datasets * update difficulty metadata for arc datasets * update difficulty metadata for algorithmic datasets * update difficulty metadata for algebra datasets * use tuples * update tests * update tests	2025-03-20 10:27:03 +01:00
Andreas Köpf	d2c895f1d3	Refactor Curriculum Attributes (#335 ) * remove min_value from AttributeDefinition * remove type from AttributeDefinition * Add CurriculumContext * add ensure_interval option for RangeAttributes * docs: Add legend explaining curriculum indicators in dataset gallery * update GALLERY.md	2025-03-16 15:40:28 +01:00
Zafir Stojanovski	41f3ef876c	time intervals curriculum (#363 )	2025-03-14 16:11:55 +01:00
Zafir Stojanovski	ee001f38a4	fraction simplification curriculum (#349 )	2025-03-13 21:05:50 +01:00
Zafir Stojanovski	db3868150f	lcm curriculum (#348 )	2025-03-13 21:05:06 +01:00
Zafir Stojanovski	596de511f0	feat(env): Prime Factorization (#344 ) * prime factorization * lint	2025-03-13 21:02:20 +01:00
Zafir Stojanovski	7a4b8fc5a8	number formatting curriculum (#341 )	2025-03-13 20:57:43 +01:00
vncntt	c3c6cc8051	gcd curriculum (#331 )	2025-03-11 08:25:24 +01:00
Rich Jones	126eecc798	fix dice (#330 )	2025-03-11 08:24:32 +01:00
vncntt	af6120c095	add metadata for caesar cipher, graph coloring, decimal arithmetic (#304 ) * add metadata for caesar cipher, graph coloring, decimal arithmetic * delete comma * clean up variables	2025-03-09 18:08:56 +01:00
joesharratt1229	a7dd5f6680	added power function exponent (#291 ) * added power function exponent * register PowerFunctionCurriculum --------- Co-authored-by: Andreas Koepf <andreas.koepf@provisio.com>	2025-03-08 01:54:36 +01:00
vncntt	7199363339	dice curriculum (#284 ) * curriculum + unit tests * add difficulty to metadata --------- Co-authored-by: Andreas Koepf <andreas.koepf@provisio.com>	2025-03-08 01:43:45 +01:00
vncntt	8c80bf6bec	Calendar arithmetic curriculum (#283 ) * calendar arithmetic curriculum * add difficulty to metadata * register CalendarArithmeticCurriculum --------- Co-authored-by: Andreas Koepf <andreas.koepf@provisio.com>	2025-03-08 01:38:22 +01:00
vncntt	775a42e9e4	Bitwise arithmetic curriculum (#282 ) * bitwise_arithmetic curriculum * register BitwiseArithmeticCurriculum --------- Co-authored-by: Andreas Koepf <andreas.koepf@provisio.com>	2025-03-08 01:32:00 +01:00
joesharratt1229	444c793d3f	added Decimal curriculum (#280 ) * added decimal curricula * added chain sum decimal curriculum * register DecimalArithmeticCurriculum & DecimalChainSumCurriculum --------- Co-authored-by: Andreas Koepf <andreas.koepf@provisio.com>	2025-03-07 23:02:57 +01:00
joesharratt1229	1888fe2bb4	added basic arith curricula (#276 ) * added basic arith curricula * register BasicArithmeticCurriculum --------- Co-authored-by: Andreas Koepf <andreas.koepf@provisio.com>	2025-03-07 22:54:49 +01:00
Zafir Stojanovski	e560cb3c46	feat(env): Leg Counting Curriculum (#275 ) * leg counting curriculum --------- Co-authored-by: Andreas Koepf <andreas.koepf@provisio.com>	2025-03-07 19:15:18 +01:00
Zafir Stojanovski	b915565c0d	add difficulty where possible (#274 )	2025-03-07 19:01:26 +01:00
Andreas Köpf	c69bc5d4e6	Basic curriculum (#198 ) * feat: Add optional curriculum support to dataset registration and creation * docs: Add docstrings to create_curriculum() and register_dataset() * feat: Add curriculum configuration classes for CurriculumExperiment * feat: Add weight parameter to CurriculumAttributeConfig and use in DatasetSpec * refactor: Simplify CurriculumAttributeConfig with "" attribute level support test: Add unit tests for CurriculumExperiment class * feat: Add from_yaml() method to CurriculumExperimentConfig with unit test	2025-03-07 11:22:12 +01:00
joesharratt1229	d9638df79c	updated algorithmics dataset (#269 ) * updated algorithmic datasets * added changes to symbolic and power * updated power function test	2025-03-05 23:32:53 +01:00
Zafir Stojanovski	9bb6d028a3	feat(env): Count Bits Curriculum (#267 ) * add min n * count bits	2025-03-05 22:44:04 +01:00
Andreas Köpf	5d7fbac0ad	Minor question template & score_answer improvements (#261 ) * math prompt improvements * ignore brackets in complex_arithmetic results * improve additional instruction in prompt of polynomial_equations * more strict tests for score_answer in polynomial_equations * simplify special reward handling * fix test_intermediate_integration * fix sokoban dataset * add common dataset score_answer consistency test	2025-03-04 21:55:09 +01:00
Andreas Köpf	c0cf237474	Reduce precision from 28 to 6 in DecimalArithmeticDataset (#256 )	2025-03-03 21:57:08 +01:00
Zafir Stojanovski	01e1c8f9af	fix: Unify Prompts (#254 ) * remove cot * fix prompt template * fix pool matrix * spiral matrix fixed	2025-03-03 21:55:53 +01:00
Andreas Köpf	24828e1889	Remove strip from ProceduralDataset::core score_answer() (#250 ) * remove strip from ProceduralDataset::core score_answer(), strip in extract answer (optional, default=True) * test: Move test_extract_answer() from test_dataset.py to test_utils.py * refactor: Improve decimal reward computation with more flexible comparison * fix: Implement rounding for format_number when round_if_needed is True * test: Add test case for compute_decimal_reward with sign and zeros	2025-03-02 08:46:36 +01:00
Rich Jones	253e49aecf	sm fixes	2025-02-27 11:54:04 +01:00
vncntt	5f01049607	Add KnightsKnavesDataset (knights_knaves) Adapted code from https://github.com/AlphaPav/mem-kk-logic/blob/main/data_prep/lib_kk.py --------- Co-authored-by: Andreas Koepf (aider) <andreas.koepf@provisio.com>	2025-02-25 20:15:38 +01:00
Andreas Koepf	eeb9fa31d5	more native type hints	2025-02-21 21:23:14 +01:00
Andreas Köpf	1c6359f1f3	Merge pull request #181 from open-thought/rich/bitwise Add Bitwise Arithmetic	2025-02-21 17:27:45 +01:00
Andreas Koepf (aider)	bae97aa795	docs: Add comment explaining automatic base detection in int() conversion	2025-02-21 17:16:11 +01:00
Andreas Koepf (aider)	5ff957a766	docs: Add detailed comments for BitwiseArithmeticConfig and BitwiseArithmeticDataset	2025-02-21 17:14:00 +01:00
Andreas Koepf	44f4cc08eb	refactor: Update type hints and remove unused imports in bitwise_arithmetic.py	2025-02-21 17:13:36 +01:00
Andreas Koepf (aider)	c91d13bd08	feat: Add typing hints and improve difficulty parameter documentation in bitwise_arithmetic.py	2025-02-21 17:11:40 +01:00
Rich Jones	1cf6821f17	lint	2025-02-21 17:09:19 +01:00
Rich Jones	c1b26cf184	ensure arbitrary bit depth and signed values	2025-02-21 16:52:26 +01:00
Andreas Koepf	acde58a200	use Decimal class for numeric comparison e.g. +0123.100 == 123.1	2025-02-21 15:36:06 +01:00
Andreas Koepf	3e7ff3b084	use native types List->list, Dict->dict, Set->set, Tuple->tuple	2025-02-21 15:15:38 +01:00
AhmedSaif2	5d02064b5a	add a helper function to handle redundant code	2025-02-21 15:54:00 +02:00
Rich Jones	b6c7ceabb2	clean up comments	2025-02-21 12:17:21 +01:00
Rich Jones	ee9202d63d	add to init	2025-02-21 12:07:17 +01:00

187 Commits