* Continue polling stats until all evals complete * Return evaluation changes early, before it has run * Add task for running new eval * requeue rate-limited tasks * Fix prettier