187 Commits

Author SHA1 Message Date
theblackcat102
2d19f13e0f [fix #484] resolve basic_arithmetic fails when size is large (#485)
* [fix] resolve basic_arithmetic fails when size is large by replacing zero divisor with 1
2025-07-07 09:46:23 +01:00
Adefioye
9e79fc84b6 fix: Rounding issues in score_answer and add unit tests (#462) 2025-06-09 19:18:11 +01:00
Oliver Stanley
1a727ecf4e support python 3.10 (#450)
* support python 3.10

* add 3.10 to tests

* new StrEnum
2025-06-04 10:34:01 +01:00
Adefioye
9053009dbe Fix bug in normalize_answer method (#444) 2025-06-02 08:58:54 +02:00
Oliver Stanley
c0e98f93b4 make task entries json serializable (#443)
* make sympy-based task entries json serializable

* remove datetime objs from time_intervals metadata

* make adv geometry json serializable

* make futoshiki metadata json serializable

* fixes

* futoshiki tweaks

* fix adv geometry

* deal with fractions in str representations

* fix

* restore start_time, end_time as str
2025-06-02 08:57:15 +02:00
Oliver Stanley
3df26e0fb2 fix decimal arithmetic curriculum to respect constraints (#430)
* fix decimal arithmetic curriculum to respect constraints

* update test accordingly
2025-05-06 19:23:19 +01:00
joesharratt1229
d0ef136d5b Feat/intragen experiments (#414)
* added curriculum

* readapted readme

* corrected small errors

* Delete eval/eval/r1/algorithmic/word_sorting.json

* removed redundant argument

* added spell

* removed duplicated fit

* changed config

* added composite changes

* added composite changes

* updated yaml

* added spell backward

* updated read me

* added qwen2.5

* added

* Add files via upload

* updated missing trainer func

* updated curr

* updated spell back

* updated correctness score func

* updated configs

* added local evals

* added updates

* updated datasets

* added fsdp to hf utility

* added algorithmic qwen 3b yaml

* updated read me

* updated configs

* added preappend token

* updated with thinking token

* updated test score board

* resolved comments

* added evaluation scripts

* removed results from pr

* added config

* added partial reward scoring

* added evaluation composites

* added training configs

* added games eval

* added rubiks cube

* resolved merge conflicts

* added games config

* added latest eval configs

* updated structure

* Delete training/evaluations/eval_graphs_composite.yaml

---------

Co-authored-by: joesharratt1229 <joesharrat1229@gmail.com>
2025-04-16 08:04:52 +02:00
Zafir Stojanovski
dced3bfc45 fix(curriculum): Make boundaries in curriculum more sensible (#407)
* init

* fix tests

* unify codeio

* filtered for libraries not present in reasoning-gym

* fix more bounds

* puzzle24

* knight swap curriculum

* fix number sorting

* fix attributes

* add validation of config in creation of dataset

* dry run for instantiating and validating the datasets

* remove unused imports

* fix curriculum tests to reference newly updated attribute names
2025-04-04 20:24:14 +02:00
Adefioye
5b653b346c Data collisions notebooks and data (#406)
* Add collisions data

* Fix logic issues in basic_arithmetic and gsm_symbolic data
2025-04-02 09:36:09 +02:00
Zafir Stojanovski
ce0a6c4878 fix(envs): Add source dataset and index to metadata (#388)
* add source dataset and index to metadata

* fix typo

* fix coach class and its test
2025-03-20 11:12:14 +00:00
Oliver Stanley
7475a20700 include ranges rather than sampled values in difficulty metadata dicts (#387)
* update difficulty metadata for logic datasets

* update difficulty metadata for graph datasets

* update difficulty metadata for geometry datasets

* update difficulty metadata for games datasets

* update difficulty metadata for cognition datasets

* update difficulty metadata for arithmetic datasets

* update difficulty metadata for arc datasets

* update difficulty metadata for algorithmic datasets

* update difficulty metadata for algebra datasets

* use tuples

* update tests

* update tests
2025-03-20 10:27:03 +01:00
Andreas Köpf
d2c895f1d3 Refactor Curriculum Attributes (#335)
* remove min_value from AttributeDefinition
* remove type from AttributeDefinition
* Add CurriculumContext
* add ensure_interval option for RangeAttributes
* docs: Add legend explaining curriculum indicators in dataset gallery
* update GALLERY.md
2025-03-16 15:40:28 +01:00
Zafir Stojanovski
41f3ef876c time intervals curriculum (#363) 2025-03-14 16:11:55 +01:00
Zafir Stojanovski
ee001f38a4 fraction simplification curriculum (#349) 2025-03-13 21:05:50 +01:00
Zafir Stojanovski
db3868150f lcm curriculum (#348) 2025-03-13 21:05:06 +01:00
Zafir Stojanovski
596de511f0 feat(env): Prime Factorization (#344)
* prime factorization

* lint
2025-03-13 21:02:20 +01:00
Zafir Stojanovski
7a4b8fc5a8 number formatting curriculum (#341) 2025-03-13 20:57:43 +01:00
vncntt
c3c6cc8051 gcd curriculum (#331) 2025-03-11 08:25:24 +01:00
Rich Jones
126eecc798 fix dice (#330) 2025-03-11 08:24:32 +01:00
vncntt
af6120c095 add metadata for caesar cipher, graph coloring, decimal arithmetic (#304)
* add metadata for caesar cipher, graph coloring, decimal arithmetic

* delete comma

* clean up variables
2025-03-09 18:08:56 +01:00
joesharratt1229
a7dd5f6680 added power function exponent (#291)
* added power function exponent

* register PowerFunctionCurriculum

---------

Co-authored-by: Andreas Koepf <andreas.koepf@provisio.com>
2025-03-08 01:54:36 +01:00
vncntt
7199363339 dice curriculum (#284)
* curriculum + unit tests
* add difficulty to metadata

---------

Co-authored-by: Andreas Koepf <andreas.koepf@provisio.com>
2025-03-08 01:43:45 +01:00
vncntt
8c80bf6bec Calendar arithmetic curriculum (#283)
* calendar arithmetic curriculum
* add difficulty to metadata
* register CalendarArithmeticCurriculum

---------

Co-authored-by: Andreas Koepf <andreas.koepf@provisio.com>
2025-03-08 01:38:22 +01:00
vncntt
775a42e9e4 Bitwise arithmetic curriculum (#282)
* bitwise_arithmetic curriculum
* register BitwiseArithmeticCurriculum

---------

Co-authored-by: Andreas Koepf <andreas.koepf@provisio.com>
2025-03-08 01:32:00 +01:00
joesharratt1229
444c793d3f added Decimal curriculum (#280)
* added decimal curricula

* added chain sum decimal curriculum

* register DecimalArithmeticCurriculum & DecimalChainSumCurriculum

---------

Co-authored-by: Andreas Koepf <andreas.koepf@provisio.com>
2025-03-07 23:02:57 +01:00
joesharratt1229
1888fe2bb4 added basic arith curricula (#276)
* added basic arith curricula
* register BasicArithmeticCurriculum

---------

Co-authored-by: Andreas Koepf <andreas.koepf@provisio.com>
2025-03-07 22:54:49 +01:00
Zafir Stojanovski
e560cb3c46 feat(env): Leg Counting Curriculum (#275)
* leg counting curriculum

---------

Co-authored-by: Andreas Koepf <andreas.koepf@provisio.com>
2025-03-07 19:15:18 +01:00
Zafir Stojanovski
b915565c0d add difficulty where possible (#274) 2025-03-07 19:01:26 +01:00
Andreas Köpf
c69bc5d4e6 Basic curriculum (#198)
* feat: Add optional curriculum support to dataset registration and creation
* docs: Add docstrings to create_curriculum() and register_dataset()
* feat: Add curriculum configuration classes for CurriculumExperiment
* feat: Add weight parameter to CurriculumAttributeConfig and use in DatasetSpec
* refactor: Simplify CurriculumAttributeConfig with "*" attribute level support
* test: Add unit tests for CurriculumExperiment class
* feat: Add from_yaml() method to CurriculumExperimentConfig with unit test
2025-03-07 11:22:12 +01:00
joesharratt1229
d9638df79c updated algorithmics dataset (#269)
* updated algorithmic datasets
* added changes to symbolic and power
* updated power function test
2025-03-05 23:32:53 +01:00
Zafir Stojanovski
9bb6d028a3 feat(env): Count Bits Curriculum (#267)
* add min n

* count bits
2025-03-05 22:44:04 +01:00
Andreas Köpf
5d7fbac0ad Minor question template & score_answer improvements (#261)
* math prompt improvements
* ignore brackets in complex_arithmetic results
* improve additional instruction in prompt of polynomial_equations
* more strict tests for score_answer in polynomial_equations
* simplify special reward handling
* fix test_intermediate_integration
* fix sokoban dataset
* add common dataset score_answer consistency test
2025-03-04 21:55:09 +01:00
Andreas Köpf
c0cf237474 Reduce precision from 28 to 6 in DecimalArithmeticDataset (#256) 2025-03-03 21:57:08 +01:00
Zafir Stojanovski
01e1c8f9af fix: Unify Prompts (#254)
* remove cot
* fix prompt template
* fix pool matrix
* spiral matrix fixed
2025-03-03 21:55:53 +01:00
Andreas Köpf
24828e1889 Remove strip from ProceduralDataset::core score_answer() (#250)
* remove strip from ProceduralDataset::core score_answer(), strip in extract answer (optional, default=True)
* test: Move test_extract_answer() from test_dataset.py to test_utils.py
* refactor: Improve decimal reward computation with more flexible comparison
* fix: Implement rounding for format_number when round_if_needed is True
* test: Add test case for compute_decimal_reward with sign and zeros
2025-03-02 08:46:36 +01:00
Rich Jones
253e49aecf sm fixes 2025-02-27 11:54:04 +01:00
vncntt
5f01049607 Add KnightsKnavesDataset (knights_knaves)
Adapted code from https://github.com/AlphaPav/mem-kk-logic/blob/main/data_prep/lib_kk.py

---------

Co-authored-by: Andreas Koepf (aider) <andreas.koepf@provisio.com>
2025-02-25 20:15:38 +01:00
Andreas Koepf
eeb9fa31d5 more native type hints 2025-02-21 21:23:14 +01:00
Andreas Köpf
1c6359f1f3 Merge pull request #181 from open-thought/rich/bitwise
Add Bitwise Arithmetic
2025-02-21 17:27:45 +01:00
Andreas Koepf (aider)
bae97aa795 docs: Add comment explaining automatic base detection in int() conversion 2025-02-21 17:16:11 +01:00
Andreas Koepf (aider)
5ff957a766 docs: Add detailed comments for BitwiseArithmeticConfig and BitwiseArithmeticDataset 2025-02-21 17:14:00 +01:00
Andreas Koepf
44f4cc08eb refactor: Update type hints and remove unused imports in bitwise_arithmetic.py 2025-02-21 17:13:36 +01:00
Andreas Koepf (aider)
c91d13bd08 feat: Add typing hints and improve difficulty parameter documentation in bitwise_arithmetic.py 2025-02-21 17:11:40 +01:00
Rich Jones
1cf6821f17 lint 2025-02-21 17:09:19 +01:00
Rich Jones
c1b26cf184 ensure arbitrary bit depth and signed values 2025-02-21 16:52:26 +01:00
Andreas Koepf
acde58a200 use Decimal class for numeric comparison e.g. +0123.100 == 123.1 2025-02-21 15:36:06 +01:00
Andreas Koepf
3e7ff3b084 use native types List->list, Dict->dict, Set->set, Tuple->tuple 2025-02-21 15:15:38 +01:00
AhmedSaif2
5d02064b5a add a helper function to handle redundant code 2025-02-21 15:54:00 +02:00
Rich Jones
b6c7ceabb2 clean up comments 2025-02-21 12:17:21 +01:00
Rich Jones
ee9202d63d add to init 2025-02-21 12:07:17 +01:00