* Continue polling stats until all evals complete
* Return evaluation changes early, before the evaluation has run
* Add task for running new eval
* Requeue rate-limited tasks
* Fix Prettier formatting
- Always stream the visible scenarios, if the modelProvider supports it
- Never stream the invisible scenarios
Also runs our query tasks in a background worker, which we weren't quite doing before.
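
For reviewers, the streaming rule from the two `-` bullets above can be sketched roughly as follows. This is an illustration only, not the code in this change: the `Scenario` and `ModelProvider` shapes, the `visible` and `supportsStreaming` fields, and the `shouldStream` helper are all hypothetical names.

```typescript
// Hypothetical shapes for illustration; the real types in this codebase may differ.
interface ModelProvider {
  supportsStreaming: boolean; // assumed capability flag
}

interface Scenario {
  id: string;
  visible: boolean; // assumed: true when the scenario is currently shown in the UI
}

// Stream only when the scenario is visible AND the provider can stream.
// Invisible scenarios are never streamed, whatever the provider supports.
function shouldStream(scenario: Scenario, provider: ModelProvider): boolean {
  return scenario.visible && provider.supportsStreaming;
}

// Split a batch of scenarios into those to stream and those to run without streaming.
function partitionByStreaming(
  scenarios: Scenario[],
  provider: ModelProvider,
): { streamed: Scenario[]; unstreamed: Scenario[] } {
  const streamed: Scenario[] = [];
  const unstreamed: Scenario[] = [];
  for (const scenario of scenarios) {
    (shouldStream(scenario, provider) ? streamed : unstreamed).push(scenario);
  }
  return { streamed, unstreamed };
}
```

Keeping the decision in a single predicate means the "never stream invisible scenarios" rule cannot be bypassed by a provider that happens to support streaming.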