Additional histories allow you to include multiple separate conversation histories within a single trajectory. This makes it possible to train agents with non-linear conversation flows, such as agents that call sub-agents or periodically compact their message history.
## What are Additional Histories?
In ART, a trajectory typically contains a single sequence of messages representing the agent’s conversation. However, some advanced use cases require training on multiple related but separate conversations within the same trajectory context. The `additional_histories` feature addresses this need.
Each trajectory can contain:
- A primary `messages_and_choices` sequence (the main conversation)
- An optional list of `additional_histories`, where each history contains its own `messages_and_choices` and optional `tools`
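At a glance, the structure looks like this (a minimal sketch with placeholder message contents):

```python
from art.trajectories import Trajectory, History

trajectory = Trajectory(
    # Primary conversation
    messages_and_choices=[
        {"role": "user", "content": "..."},
        {"role": "assistant", "content": "..."},
    ],
    # Zero or more separate conversations trained alongside it
    additional_histories=[
        History(
            messages_and_choices=[
                {"role": "user", "content": "..."},
                {"role": "assistant", "content": "..."},
            ],
            tools=None,  # each history may carry its own tools
        )
    ],
)
```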
## Why Use Additional Histories?
### 1. Preserving Special Tokens in Multi-Turn Conversations
Some models, like Qwen 3, use chat templates that remove special tokens (such as `<think>`) from previous turns in multi-turn conversations. This can interfere with training when you want the model to learn from its thinking process across all turns.
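You can see the behavior directly by rendering a multi-turn conversation through the model's chat template (a quick check using Hugging Face `transformers`; `Qwen/Qwen3-8B` stands in for whichever Qwen 3 checkpoint you are training):

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3-8B")

messages = [
    {"role": "user", "content": "What is 2+2?"},
    {"role": "assistant", "content": "<think>I need to add 2 and 2</think>4"},
    {"role": "user", "content": "What is 3+3?"},
]

# Render the prompt as text: the earlier assistant turn is emitted
# without its <think>...</think> block.
print(tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True))
```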
By splitting each turn into a separate history, you can preserve these tokens for training:
```python
from art.trajectories import Trajectory, History

# Instead of a single multi-turn conversation that loses <think> tokens,
# train as separate histories to preserve them
trajectory = Trajectory(
    messages_and_choices=[
        # First turn with thinking
        {"role": "user", "content": "What is 2+2?"},
        {"role": "assistant", "content": "<think>I need to add 2 and 2</think>4"},
    ],
    additional_histories=[
        History(
            messages_and_choices=[
                # The Qwen 3 chat template removes <think> tokens from previous turns
                {"role": "user", "content": "What is 2+2?"},
                {"role": "assistant", "content": "4"},
                {"role": "user", "content": "What is 3+3?"},
                {"role": "assistant", "content": "<think>I need to add 3 and 3</think>6"},
            ]
        )
    ],
)
```
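In practice you would build these histories programmatically from a finished rollout. One way to do it (a sketch; `split_turns_into_histories` is a hypothetical helper, not part of ART):

```python
import re

from art.trajectories import History


def split_turns_into_histories(
    messages: list[dict],
) -> tuple[list[dict], list[History]]:
    """Split a multi-turn rollout so each assistant turn ends its own history."""

    def strip_think(content: str) -> str:
        # Mirror what the chat template does to earlier assistant turns
        return re.sub(r"<think>.*?</think>", "", content, flags=re.DOTALL)

    assistant_idxs = [i for i, m in enumerate(messages) if m["role"] == "assistant"]

    # Main history: everything up to and including the first assistant turn
    main = messages[: assistant_idxs[0] + 1]

    histories = []
    for idx in assistant_idxs[1:]:
        prefix = []
        for i, m in enumerate(messages[: idx + 1]):
            if m["role"] == "assistant" and i < idx:
                # Earlier turns appear as the template rendered them
                m = {**m, "content": strip_think(m["content"])}
            prefix.append(m)
        histories.append(History(messages_and_choices=prefix))
    return main, histories
```

Each resulting history matches what the model actually saw at that turn, with only the final turn keeping its `<think>` block.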
### 2. Training Agents That Call Sub-Agents
When an agent delegates work to sub-agents, each sub-agent conversation can be stored as an additional history:
```python
trajectory = Trajectory(
    # Main agent conversation
    messages_and_choices=[
        {"role": "user", "content": "Analyze this codebase and fix any bugs"},
        {
            "role": "assistant",
            "tool_calls": [
                {
                    "id": "...",
                    "type": "function",
                    "function": {
                        "name": "analyze_code",
                        "arguments": "Find potential bugs in main.py",
                    },
                },
            ],
        },
        {
            "role": "tool",
            "tool_call_id": "...",
            "content": "Found 3 potential issues...",
        },
        {
            "role": "assistant",
            "tool_calls": [
                {
                    "id": "...",
                    "type": "function",
                    "function": {
                        "name": "fix_issues",
                        "arguments": "Fix the null pointer issue on line 42 of main.py",
                    },
                },
            ],
        },
        {
            "role": "tool",
            "tool_call_id": "...",
            "content": "Fixed by adding null check...",
        },
    ],
    additional_histories=[
        # Sub-agent 1: Code analysis
        History(
            messages_and_choices=[
                {"role": "system", "content": "You are a code analysis expert"},
                {"role": "user", "content": "Find potential bugs in main.py"},
                {"role": "assistant", "content": "Found 3 potential issues..."},
            ]
        ),
        # Sub-agent 2: Bug fixing
        History(
            messages_and_choices=[
                {"role": "system", "content": "You are a bug fixing expert"},
                {"role": "user", "content": "Fix the null pointer issue on line 42"},
                {"role": "assistant", "content": "Fixed by adding null check..."},
            ]
        ),
    ],
)
```
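During a rollout, one natural pattern is to record each sub-agent conversation as it completes and attach the accumulated list at the end (a sketch; `call_model` is a hypothetical inference helper, not an ART API):

```python
from art.trajectories import History


async def run_subagent(
    system_prompt: str, task: str, sub_histories: list[History]
) -> str:
    """Run one sub-agent conversation and record it as a training history."""
    messages = [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": task},
    ]
    reply = await call_model(messages)  # hypothetical inference call
    messages.append({"role": "assistant", "content": reply})
    # Archive the complete sub-agent conversation for training
    sub_histories.append(History(messages_and_choices=messages))
    return reply
```

The main agent's tool results and each sub-agent's assistant turns are then all trained on, each in its own context.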
### 3. History Compaction and Summarization
For long-running agents that periodically compress their conversation history:
```python
trajectory = Trajectory(
    # Current active conversation
    messages_and_choices=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Explain quantum entanglement"},
        {"role": "assistant", "content": "Quantum entanglement is..."},
        # ... many more messages ...
    ],
    additional_histories=[
        # Previous conversation segment before compaction
        History(
            messages_and_choices=[
                {"role": "system", "content": "You are a helpful assistant."},
                {"role": "user", "content": "Compacted conversation history: the user asked about quantum entanglement, and the assistant explained..."},
                {"role": "user", "content": "Tell me more about the history of quantum entanglement"},
                {"role": "assistant", "content": "Quantum entanglement was first..."},
            ]
        )
    ],
)
```
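A compaction step can archive the segment it replaces at the moment it compacts (a sketch; `summarize` is a hypothetical helper):

```python
from art.trajectories import History


def compact(messages: list[dict], archived: list[History]) -> list[dict]:
    """Compact the active conversation, archiving the old segment for training."""
    # Keep the pre-compaction segment so it is still trained on
    archived.append(History(messages_and_choices=list(messages)))
    # Start a fresh segment seeded with a summary of what came before
    summary = summarize(messages)  # hypothetical summarization call
    return [
        messages[0],  # carry over the system prompt
        {"role": "user", "content": f"Compacted conversation history: {summary}"},
    ]
```

At the end of the rollout, the active messages become `messages_and_choices` and the archived segments become `additional_histories`.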
## How It Works

### Tokenization Process
When a trajectory with additional histories is tokenized:

- The main history (from `messages_and_choices`) is tokenized first
- Each additional history is tokenized separately
- The training weight is distributed across all tokenized results
- Each history maintains its own context and token boundaries
```python
# In the tokenization pipeline
histories = [create_history_from_trajectory(trajectory)]  # Main history
histories.extend(trajectory.additional_histories)  # Add all additional histories

# Each history is tokenized independently
for history in histories:
    tokenized_result = tokenize_history(history)
    # Weight is distributed across all results
```
### Data Structure
The `History` class structure:
```python
@dataclass
class History:
    messages_and_choices: list[dict[str, Any]]
    tools: list[Tool] | None = None
```
The `Trajectory` class with additional histories:
```python
@dataclass
class Trajectory:
    messages_and_choices: list[dict[str, Any]]
    tools: list[Tool] | None = None
    additional_histories: list[History] = field(default_factory=list)
    reward: float | None = None
    metrics: dict[str, Any] = field(default_factory=dict)
```
## Implementation Guide

### Creating a Trajectory with Additional Histories
```python
from art.trajectories import Trajectory, History

# Create the main conversation
main_messages = [
    {"role": "system", "content": "You are a helpful assistant"},
    {"role": "user", "content": "Help me with a complex task"},
    {"role": "assistant", "content": "I'll help you with that"},
]

# Create additional histories
history1 = History(
    messages_and_choices=[
        {"role": "user", "content": "First subtask"},
        {"role": "assistant", "content": "Completing first subtask..."},
    ]
)

history2 = History(
    messages_and_choices=[
        {"role": "user", "content": "Second subtask"},
        {"role": "assistant", "content": "Completing second subtask..."},
    ]
)

# Combine into a trajectory
trajectory = Trajectory(
    messages_and_choices=main_messages,
    additional_histories=[history1, history2],
    reward=0.8,
    metrics={"task_completed": True},
)
```
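The resulting trajectory is trained like any other. Assuming ART's usual training entry points (`art.TrajectoryGroup` and `model.train`), the final step might look like:

```python
import art

# Group the trajectory with others gathered for the same training step
groups = [art.TrajectoryGroup([trajectory])]
await model.train(groups)
```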
## Current Limitations
The RULER reward function does not currently support trajectories with additional histories. If you attempt to use RULER with `additional_histories`, it will raise an error. Support for this feature in RULER is planned for a future release.
## Related

- Models: model-specific considerations, including Qwen 3