mirror of
https://github.com/open-thought/reasoning-gym.git
synced 2025-10-09 13:40:09 +03:00
* added games
* added llama 3b training conf
* update readme with details of external evals
* readme update
---------
Co-authored-by: joesharratt1229 <joesharratt1229@gmail.com>
27 lines
1.7 KiB
YAML
task: llama_math_algebra
dataset_path: EleutherAI/hendrycks_math
process_docs: !function utils.process_docs
dataset_name: algebra
output_type: generate_until
training_split: train
test_split: test
doc_to_text: "<|begin_of_text|><|start_header_id|>system<|end_header_id|>\n\nYou are a helpful AI Assistant that provides well-reasoned and detailed responses.\nYou first think about the reasoning process as an internal monologue and then provide the user with the answer.\nRespond in the following format:\n<think>\n...\n</think>\n<answer>\n...\n</answer><|eot_id|><|start_header_id|>user<|end_header_id|>\n\nSolve the following math problem efficiently and clearly:\n\n- For simple problems (2 steps or fewer):\nProvide a concise solution with minimal explanation.\n\n- For complex problems (3 steps or more):\nUse this step-by-step format:\n\n## Step 1: [Concise description]\n[Brief explanation and calculations]\n\n## Step 2: [Concise description]\n[Brief explanation and calculations]\n\n...\n\nRegardless of the approach, always conclude with:\n\nTherefore, the final answer is: $\\\\boxed{answer}$. I hope it is correct.\n\nWhere [answer] is just the final number or expression that solves the problem.\n\nProblem: {{ problem }}<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n"
process_results: !function utils.process_results
doc_to_target: "{{answer if few_shot is undefined else solution}}"
generation_kwargs:
  until:
    - "Problem:"
    - "</answer>"
  max_gen_toks: 4096
  do_sample: false
  temperature: 0
metric_list:
  - metric: exact_match
    aggregation: mean
    higher_is_better: true
num_fewshot: 0
metadata:
  version: 1.0
dataset_kwargs:
  trust_remote_code: true
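The config delegates scoring to `utils.process_results`, which is not shown in this file. As a rough illustration of how a generation produced under the prompt above could be turned into an `exact_match` score, here is a minimal Python sketch. The function names `last_boxed_content` and `score` are hypothetical and are not taken from the repository's `utils.py`. The real implementation probably normalizes answers before comparing them (an assumption), whereas this sketch only does a plain string comparison.

```python
from typing import Optional


def last_boxed_content(text: str) -> Optional[str]:
    """Return the contents of the last \\boxed{...} in text, or None if absent."""
    marker = "\\boxed{"
    start = text.rfind(marker)
    if start == -1:
        return None
    depth = 1  # we are inside the opening brace of \boxed{
    chars = []
    for ch in text[start + len(marker):]:
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                # Matching close brace found: everything collected is the answer.
                return "".join(chars).strip()
        chars.append(ch)
    return None  # braces never balanced, e.g. generation cut off by max_gen_toks


def score(generation: str, target: str) -> dict:
    """Hypothetical scorer: 1 if the boxed answer equals the target string, else 0."""
    predicted = last_boxed_content(generation)
    matched = predicted is not None and predicted == target.strip()
    return {"exact_match": int(matched)}
```

Because `</answer>` is listed under `until`, generation halts at that tag, so the sketch looks for the boxed answer rather than relying on a closing tag being present. Taking the last `\boxed{...}` means an intermediate boxed value inside the `<think>` block does not shadow the final answer.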