Maaz Karim · projects

Workflow-Arena

#Reinforcement Learning #Benchmarking #Scheduling

Workflow-Arena is a deterministic reinforcement learning benchmark designed to evaluate LLM and RL agents on workflow scheduling problems that feel operational rather than toy-like.

The benchmark models DAG-based workflows with worker limits, deadlines, task priorities, failures, and retries, so policies must make tradeoffs under limited capacity and time pressure.
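As a rough illustration of that model, the sketch below represents a workflow as tasks with dependencies, priorities, and deadlines, and computes which tasks a policy could dispatch given a worker limit. The names Task, ready_tasks, and dispatchable, and the example workflow, are hypothetical and are not taken from Workflow-Arena's actual code; failures and retries are left out for brevity.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Task:
    # Hypothetical task record; field names are illustrative assumptions.
    name: str
    duration: int
    deps: tuple = ()
    priority: int = 0
    deadline: Optional[int] = None


def ready_tasks(tasks, done, running):
    """Return tasks that are not started and whose dependencies are all done."""
    return [
        t for t in tasks.values()
        if t.name not in done
        and t.name not in running
        and all(d in done for d in t.deps)
    ]


def dispatchable(tasks, done, running, max_workers):
    """Return the ready tasks that fit in the free worker slots.

    Higher priority first, ties broken by name so the result is deterministic.
    """
    free = max(max_workers - len(running), 0)
    ready = sorted(ready_tasks(tasks, done, running),
                   key=lambda t: (-t.priority, t.name))
    return ready[:free]


# Hypothetical four-task workflow: extract -> {transform, validate} -> load
workflow = {
    t.name: t for t in [
        Task("extract", duration=2),
        Task("transform", duration=3, deps=("extract",)),
        Task("validate", duration=1, deps=("extract",), priority=1),
        Task("load", duration=1, deps=("transform", "validate"), deadline=8),
    ]
}

choice = dispatchable(workflow, done={"extract"}, running=set(), max_workers=1)
print([t.name for t in choice])
```

With one free worker and two ready tasks, the policy has to pick one, which is the kind of tradeoff the benchmark is built around.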

Key design points:

  • dispatch and wait actions with explicit scheduling consequences
  • dependency constraints and workflow ordering
  • critical-path and slack-oriented signals
  • difficulty-scaled scenarios that increase pressure without changing the basic interface
  • reward shaping with penalties for invalid actions, avoidable waiting, over-capacity dispatches, missed deadlines, and unfinished tasks
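One common way to derive critical-path and slack signals is a forward and backward pass over the DAG. The sketch below uses that standard technique with Python's graphlib; it is an assumption about how such signals could be computed, not a description of the benchmark's implementation, and the function name slack_by_task and the example durations are hypothetical.

```python
from graphlib import TopologicalSorter


def slack_by_task(durations, deps):
    """Compute per-task slack with unlimited workers.

    durations: task name -> duration
    deps: task name -> iterable of prerequisite task names
    Slack 0 means the task lies on a critical path.
    """
    graph = {t: tuple(deps.get(t, ())) for t in durations}
    order = list(TopologicalSorter(graph).static_order())

    # Forward pass: earliest finish time of each task.
    earliest_finish = {}
    for t in order:
        start = max((earliest_finish[d] for d in graph[t]), default=0)
        earliest_finish[t] = start + durations[t]
    makespan = max(earliest_finish.values())

    # Backward pass: latest finish time that does not delay the makespan.
    latest_finish = {t: makespan for t in durations}
    for t in reversed(order):
        latest_start = latest_finish[t] - durations[t]
        for d in graph[t]:
            latest_finish[d] = min(latest_finish[d], latest_start)

    return {t: latest_finish[t] - earliest_finish[t] for t in durations}


# Hypothetical workflow: extract -> {transform, validate} -> load
durations = {"extract": 2, "transform": 3, "validate": 1, "load": 1}
deps = {"transform": ["extract"], "validate": ["extract"],
        "load": ["transform", "validate"]}

slack = slack_by_task(durations, deps)
print(slack)  # {'extract': 0, 'transform': 0, 'validate': 2, 'load': 0}
```

Here validate can slip by two time units without delaying the workflow, while the other three tasks form the critical path.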

The goal is to make agent evaluation more robust by reducing easy reward-hacking paths and forcing policies to reason about constrained execution.
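To make the reward-hacking point concrete, here is a minimal sketch of how the penalty terms listed among the design points might combine. The weights, the Penalties class, and the step_reward and terminal_penalty signatures are all illustrative assumptions; the benchmark's real reward function may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Penalties:
    # Hypothetical weights, chosen only for illustration.
    invalid_action: float = 1.0
    avoidable_wait: float = 0.5
    over_capacity: float = 1.0
    missed_deadline: float = 2.0
    unfinished_task: float = 3.0


def step_reward(action, *, is_valid, ready_count, free_workers,
                completed_now, missed_deadlines_now, p=Penalties()):
    """Per-step reward: +1 per completed task, minus shaped penalties."""
    reward = float(completed_now)
    if not is_valid:
        reward -= p.invalid_action
    # Waiting is only penalized when work could have been dispatched.
    if action == "wait" and ready_count > 0 and free_workers > 0:
        reward -= p.avoidable_wait
    # Dispatching with no free worker exceeds capacity.
    if action == "dispatch" and free_workers <= 0:
        reward -= p.over_capacity
    reward -= p.missed_deadline * missed_deadlines_now
    return reward


def terminal_penalty(unfinished_count, p=Penalties()):
    """End-of-episode penalty so idling until the horizon is never free."""
    return -p.unfinished_task * unfinished_count


idle = step_reward("wait", is_valid=True, ready_count=2, free_workers=1,
                   completed_now=0, missed_deadlines_now=0)
forced = step_reward("wait", is_valid=True, ready_count=0, free_workers=1,
                     completed_now=1, missed_deadlines_now=0)
print(idle, forced, terminal_penalty(2))  # -0.5 1.0 -6.0
```

Under these assumed weights, a policy that waits while work is available loses reward each step and then pays again for every unfinished task, so doing nothing is not a cheap way to avoid invalid actions.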