1. **Inference**
   - Your code uses the ART client to perform an agentic workflow (usually executing several rollouts in parallel to gather data faster).
   - Completion requests are routed to the ART backend, which runs the model's latest LoRA in vLLM.
   - As the agent executes, each `system`, `user`, and `assistant` message is stored in a Trajectory.
   - After your rollouts finish, your code assigns a `reward` to each Trajectory, with higher rewards indicating better performance (see the sketch after this list).
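For illustration, here is a minimal sketch of the inference half of the loop. The API surface shown (`art.TrainableModel`, `model.openai_client()`, `art.Trajectory(messages_and_choices=..., reward=...)`) is assumed from the project's examples and may differ from the current ART client; the reward logic is a placeholder.

```python
import asyncio

import art


async def rollout(model: art.TrainableModel, prompt: str) -> art.Trajectory:
    # Completion requests are served by the ART backend, which runs the
    # model's latest LoRA in vLLM behind an OpenAI-compatible endpoint.
    client = model.openai_client()  # assumed accessor
    messages = [
        {"role": "system", "content": "You are a helpful agent."},
        {"role": "user", "content": prompt},
    ]
    completion = await client.chat.completions.create(
        model=model.name, messages=messages
    )
    reply = completion.choices[0].message.content
    messages.append({"role": "assistant", "content": reply})
    # Placeholder judge: any scoring works as long as higher means better.
    reward = 1.0 if reply and "sorry" not in reply.lower() else 0.0
    return art.Trajectory(messages_and_choices=messages, reward=reward)


async def gather_rollouts(
    model: art.TrainableModel, prompts: list[str]
) -> list[art.Trajectory]:
    # Execute several rollouts in parallel to gather data faster.
    return list(await asyncio.gather(*(rollout(model, p) for p in prompts)))
```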
2. **Training**
   - When all rollouts have finished, Trajectories are grouped and sent to the backend. Inference is blocked while training executes.
   - The backend trains your model using GRPO, initializing from the latest checkpoint (or an empty LoRA on the first iteration); see the sketch after this list.
   - The backend saves the newly trained LoRA to a local directory and loads it into vLLM.
   - Inference is unblocked and the loop resumes at step 1.
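A matching sketch of the training half, under the same caveats: `art.TrajectoryGroup` and `model.train(...)` are assumed from the project's examples, and grouping rollouts by scenario is one plausible scheme rather than a requirement of the API.

```python
import art


async def train_step(
    model: art.TrainableModel,
    rollouts_by_scenario: list[list[art.Trajectory]],
) -> None:
    # Group trajectories that share a scenario so GRPO can compare their
    # rewards against one another within each group.
    groups = [art.TrajectoryGroup(ts) for ts in rollouts_by_scenario]
    # Inference blocks here: the backend runs GRPO from the latest checkpoint
    # (or an empty LoRA on the first iteration), saves the new LoRA to a
    # local directory, loads it into vLLM, and then unblocks inference.
    await model.train(groups)
```

Grouping matters because GRPO estimates advantages by comparing rewards within a group rather than against a learned value function, so each group should contain rollouts that are meaningfully comparable.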