Skip to content

Observations

Agentick supports multiple observation modes optimized for different agent types.

Overview

Six modes via render_mode: - ASCII: Colored text grid - Language: Natural language descriptions - Language Structured: Dictionary with semantic info - RGB Array: Isometric sprite-based pixel observations (512x512) - State Dict: Full state access - Human: Isometric display (same as rgb_array, for interactive use)

1. ASCII Mode

env = agentick.make("GoToGoal-v0", render_mode="ascii")
obs, info = env.reset(seed=42)

Output: #^..G# (colored grid with legend)

Space: gymnasium.spaces.Text

2. Language Mode

env = agentick.make("GoToGoal-v0", render_mode="language")
obs, info = env.reset()

Output: "You are at (1,1) facing north. Goal southeast at distance 5."

Options: - Perspective: first_person, third_person, omniscient - Verbosity: minimal, standard, verbose

env = agentick.make("GoToGoal-v0", render_mode="language", verbosity="minimal")

Space: gymnasium.spaces.Text

3. Language Structured Mode

env = agentick.make("GoToGoal-v0", render_mode="language_structured")
obs, info = env.reset()

Output:

{
    "description": "You are in a 5x5 room...",
    "position": {"x": 1, "y": 1},
    "orientation": "north",
    "surroundings": {"north": "wall", "south": "empty"},
    "visible_goals": [{"type": "goal", "position": {"x": 4, "y": 4}}],
    "valid_actions": ["move_up", "move_right"]
}

Space: gymnasium.spaces.Dict

4. RGB Array Mode (Isometric)

env = agentick.make("GoToGoal-v0", render_mode="rgb_array")
obs, info = env.reset()
print(obs.shape)  # (512, 512, 3), uint8

The default visual mode uses an isometric sprite renderer powered by Kenney tile assets. Produces visually rich 512x512 images with an isometric diamond perspective.

Space: gymnasium.spaces.Box shape (512, 512, 3), dtype uint8, range [0, 255]

5. State Dict Mode

env = agentick.make("GoToGoal-v0", render_mode="state_dict")
obs, info = env.reset()

Output:

{
    "grid": {
        "height": 5, "width": 5,
        "terrain": [[1,1,1],[1,0,0],...],  # 0=empty, 1=wall
        "objects": [[0,0,0],[0,0,1],...],  # 0=none, 1=goal
        "agents": [[0,1,0],[0,0,0],...]
    },
    "agent": {
        "position": (1, 1),
        "orientation": "north",
        "inventory": [],
        "energy": 1.0
    },
    "info": {
        "step_count": 0,
        "max_steps": 20,
        "valid_actions": ["move_down", "move_right"]
    }
}

Fast mode for planning:

env = agentick.make("GoToGoal-v0", render_mode="state_dict", fast_mode=True)
# Returns numpy arrays, ~10x faster

Space: gymnasium.spaces.Dict

6. Human Mode

env = agentick.make("GoToGoal-v0", render_mode="human")
obs, info = env.reset()

Controls: Arrow keys (move), Space (interact), R/L (rotate), Q (quit)

Display: Large sprites, grid overlay, HUD

Multi-Modal Access

# Access any mode programmatically
env = agentick.make("GoToGoal-v0", render_mode="ascii")
ascii_obs = env.get_text_observation()
pixel_obs = env.get_pixel_observation()
state_obs = env.get_state_dict()

Same seed produces equivalent internal state across modes:

env1 = agentick.make("GoToGoal-v0", render_mode="ascii")
env2 = agentick.make("GoToGoal-v0", render_mode="rgb_array")
obs1, _ = env1.reset(seed=42)
obs2, _ = env2.reset(seed=42)
# Same internal state, different representations

See Architecture and Tasks for more details.