Building an Agent Factory with AWS AgentCore

By Yoonsoo Park | December 20, 2025 | 6 minutes

In the rapidly evolving world of Generative AI, manually deploying agents one by one is no longer sexy. We need systems that can build systems—an Agent Factory.

Today, we will dissect the Agent Factory system I learned about at AWS re:Invent 2025. AWS engineers built a system that builds agents on AWS AgentCore, and the repo is at https://github.com/mcginnbros/agent-factory.

This platform is designed to bootstrap your personal AWS environment into a fully functional agent ecosystem using AWS AgentCore.

In this post, we’ll peel back the layers and look at the infrastructure that makes this possible: from IAM policies and ECR repositories to the Python tooling that orchestrates it all.

## 1. The Core Architecture

At the heart of this system lies the concept of a Builder Agent. Instead of writing Terraform or CloudFormation for every new agent you want to experiment with, you talk to the Builder Agent.

The Builder Agent comprises:

1. **Strands Framework**: The application logic driving the agent's reasoning.
2. **AWS AgentCore Runtime**: The serverless compute environment hosting the agent container.
3. **Tooling Access**: Permissions to call AWS APIs to create *other* resources.

When you ask the Builder Agent to “Create a weather bot,” it doesn’t just write code; it provisions real infrastructure. 🤯

## 2. Infrastructure Under the Hood

To enable a safe yet powerful Agent Factory, we lean heavily on AWS native services.

```mermaid
graph TB
    subgraph Client ["Client Layer"]
        User[User / Developer]
        CLI[Python CLI - demo_cli.py]
    end

    subgraph AWS ["AWS Cloud Account"]
        subgraph IAM ["IAM & Security"]
            Roles[IAM Roles]
        end

        subgraph Control ["Control Plane"]
            AgentCoreAPI[Bedrock AgentCore API]
            Gateway[AgentCore Gateway]
        end

        subgraph Data ["Data Plane"]
            BuilderRuntime[Builder Agent Runtime]
            AgentRuntime[Deployed Agent Runtime]
            Memory[AgentCore Memory]
        end

        subgraph Storage ["Artifacts"]
            ECR[ECR: reinvent/agents]
        end
    end

    User --> CLI
    CLI --> AgentCoreAPI
    AgentCoreAPI --> BuilderRuntime
    BuilderRuntime --> AgentCoreAPI
    AgentCoreAPI --> AgentRuntime
    BuilderRuntime --> Memory
    AgentRuntime --> Memory
    BuilderRuntime -.->|Pulls Image from ECR| ECR
    AgentRuntime -.->|Pulls Image from ECR| ECR
    style AWS fill:#f9f9f9,stroke:#333,stroke-width:2px
```

### Identity and Access Management (IAM)

Security is paramount when an AI agent has permission to deploy infrastructure. We utilize specific IAM roles with carefully scoped trust policies.

* **`AgentCoreExecutionRole`**: This is the primary identity for our agents.
  * **Trust Policy**: Allows `bedrock-agentcore.amazonaws.com` to assume this role.
  * **Permissions**:
    * `AgentCoreBedrockAccess`: Invoke models (Claude Sonnet/Haiku).
    * `AgentCoreECRAccess`: Pull Docker images from our private ECR.
    * `AgentCoreLambdaInvoke`: Call Lambda functions (used for tools).
    * `AgentCoreA2AAccess`: Communicate with other agents via the A2A protocol.
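
The trust relationship described above can be written down as a standard IAM policy document. The following is a minimal sketch, not the repo's actual setup code: the helper name `build_trust_policy` is hypothetical, and only the service principal comes from the text above.

```python
import json


def build_trust_policy(service_principal: str = "bedrock-agentcore.amazonaws.com") -> dict:
    """Return an IAM trust policy that lets one AWS service assume the role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": service_principal},
                "Action": "sts:AssumeRole",
            }
        ],
    }


if __name__ == "__main__":
    # IAM expects the policy as a JSON string, not as a Python dict.
    print(json.dumps(build_trust_policy(), indent=2))
```

With `boto3`, the serialized document would be passed as `AssumeRolePolicyDocument` when calling `create_role` on an IAM client. In a shared account you would probably also want to narrow the statement with a condition such as `aws:SourceAccount`; that is left out here to keep the sketch short.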

### Elastic Container Registry (ECR)

We use a centralized repository named **`reinvent/agents`** to store our agent images.

* **Base Image**: Contains heavy dependencies (Python, Strands, Playwright).
* **Generic Agent Image**: A standardized template that can load specific configurations at runtime.
* **Builder Image**: The specialized image for our factory orchestrator.

### AgentCore Memory

We provision a shared **AgentCore Memory** instance (`reinvent_agent_factory-shared`) that allows agents to retain context across sessions.

## 3. Deep Dive: Inside the Builder Agent

The `builder_agent` directory is the gold mine of this project. It orchestrates the creation of other agents using a "Generic Agent" strategy.

### The Generic Agent Pattern

Instead of building a unique Docker image for every new bot (which takes minutes), we use a single Generic Agent Container.

* **`templates/generic_agent.py`**: This script is the entry point for *all* deployed agents.
* **Dynamic Configuration**: Behavior is injected at runtime via environment variables (`SYSTEM_PROMPT`, `AGENT_MODE`, `ENABLE_BROWSER`).

This allows the Builder Agent to deploy a new agent in seconds by simply pointing a new Runtime to the existing ECR image.
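
To make the runtime-configuration idea concrete, here is a minimal sketch of how a generic entry point might read its behavior from environment variables. Only the variable names come from this post; the `AgentConfig` dataclass, the default values, and the accepted truthy strings are assumptions, not the contents of `templates/generic_agent.py`.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class AgentConfig:
    system_prompt: str
    mode: str
    enable_browser: bool


def load_config(env: Mapping[str, str] = os.environ) -> AgentConfig:
    """Build the agent configuration from environment variables."""
    mode = env.get("AGENT_MODE", "client").strip().lower()
    if mode not in {"server", "client"}:
        raise ValueError(f"AGENT_MODE must be 'server' or 'client', got {mode!r}")
    return AgentConfig(
        # Assumed default prompt, used only when SYSTEM_PROMPT is unset.
        system_prompt=env.get("SYSTEM_PROMPT", "You are a helpful assistant."),
        mode=mode,
        # Assumed convention: a few common truthy strings enable the browser tool.
        enable_browser=env.get("ENABLE_BROWSER", "false").strip().lower() in {"1", "true", "yes"},
    )
```

Because the image never changes, two agents with completely different personalities differ only in the environment handed to the same container, which is what makes deployment a matter of seconds rather than a rebuild.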

### Agent Modes: Server vs. Client

The platform supports two distinct modes of operation:

#### **Server Mode (`AGENT_MODE='server'`)**

* **Purpose**: To provide tools or services to other agents.
* **Mechanism**: Starts a **FastAPI server on port 9000**.
* **Protocol**: Exposed via the **Agent-to-Agent (A2A)** protocol.
* **Usage**: The AgentCore Gateway routes requests to this port.

#### **Client Mode (`AGENT_MODE='client'`)**

* **Purpose**: To consume services and orchestrate tasks.
* **Mechanism**: Runs a loop processing user input; does *not* listen on a port.
* **Capabilities**: Initialized with `KNOWN_AGENT_IDS` to call Server agents.
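
The split between the two modes can be sketched as a small dispatch step. This is an illustration under stated assumptions: `describe_startup` and `parse_known_agents` are hypothetical helpers, and treating `KNOWN_AGENT_IDS` as a comma-separated string is a guess about its format, not something specified here.

```python
from typing import Mapping

A2A_PORT = 9000  # the port given above for Server mode


def parse_known_agents(raw: str) -> list[str]:
    """Split a comma-separated ID list, dropping blanks and whitespace."""
    return [part.strip() for part in raw.split(",") if part.strip()]


def describe_startup(env: Mapping[str, str]) -> dict:
    """Decide what the container should do at startup, without doing it."""
    mode = env.get("AGENT_MODE", "client").lower()
    if mode == "server":
        # Server agents listen for A2A requests and need no peer list.
        return {"mode": "server", "listen_port": A2A_PORT, "peers": []}
    # Client agents open no port; they only need to know whom to call.
    return {
        "mode": "client",
        "listen_port": None,
        "peers": parse_known_agents(env.get("KNOWN_AGENT_IDS", "")),
    }
```

Keeping the decision separate from the side effects (starting FastAPI, entering the input loop) makes the mode logic easy to unit test.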

### Docker Layering Strategy

We use a `FROM ...:base` pattern in our Dockerfiles. We first verify that a base image (containing heavy libraries like `strands`, `boto3`, and `playwright`) exists. The Builder's Dockerfile then just copies the lightweight Python scripts on top, so a rebuild only adds small layers over the cached base and finishes quickly.

## 4. Interacting with the AgentCore SDK

`demo_cli.py` provides a practical example of how to interact with the AgentCore service using `boto3`.

### Two Distinct Clients

```python
# 1. Runtime Client: For invoking agents
self.client = boto3.client("bedrock-agentcore", region_name=REGION)

# 2. Control Plane Client: For managing resources (deploy, list, delete)
self.control_client = boto3.client("bedrock-agentcore-control", region_name=REGION)
```

### Listing and Inspecting Agents

We use the control client to see which agents are running and to inspect their configuration, for example whether the A2A protocol is enabled.

```python
# List all runtimes
response = self.control_client.list_agent_runtimes()

for runtime in response.get("agentRuntimes", []):
    agent_id = runtime["agentRuntimeId"]

    # Get full details to check protocol configuration
    details = self.control_client.get_agent_runtime(agentRuntimeId=agent_id)

    # Check if A2A is enabled
    protocol_config = details.get("protocolConfiguration", {})
    has_a2a_protocol = (protocol_config.get("serverProtocol") == "A2A")
```
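
One thing the loop above glosses over is pagination: list-style AWS APIs commonly return results in pages with a continuation token. The sketch below assumes `list_agent_runtimes` follows the usual `nextToken` convention (worth confirming against the boto3 reference). `iter_agent_runtimes` is a hypothetical helper, and it takes the client as a parameter so it can be exercised with a stub.

```python
from typing import Any, Dict, Iterator


def iter_agent_runtimes(control_client: Any) -> Iterator[Dict[str, Any]]:
    """Yield every runtime summary, following nextToken across pages."""
    kwargs: Dict[str, Any] = {}
    while True:
        page = control_client.list_agent_runtimes(**kwargs)
        yield from page.get("agentRuntimes", [])
        token = page.get("nextToken")
        if not token:
            break  # no continuation token means this was the last page
        kwargs["nextToken"] = token
```

In the CLI this would replace the single call with `for runtime in iter_agent_runtimes(self.control_client): ...`, which matters once the factory has produced more agents than fit in one page.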

### Invoking Agents: Simple vs. JSON-RPC

When sending messages to an agent using the runtime `client`, the payload format differs based on the agent's mode:

1. **Standard Agents**: Use a simple JSON payload.
2. **A2A Agents**: Use the JSON-RPC 2.0 standard.

```python
import json
import uuid

# Prepare payload based on protocol
if use_jsonrpc:
    # A2A Agents expect JSON-RPC 2.0
    payload = {
        "jsonrpc": "2.0",
        "method": "message/send",
        "params": {
            "message": {
                "kind": "message",
                "role": "user",
                "parts": [{"kind": "text", "text": message}],
                "messageId": f"msg-{uuid.uuid4().hex[:8]}"
            }
        },
        "id": request_id
    }
else:
    # Standard Agents accept simple JSON
    payload = {"prompt": message}

# Invoke the agent
response = self.client.invoke_agent_runtime(
    agentRuntimeArn=agent_arn,
    runtimeSessionId=session_id,  # Maintains conversation state
    payload=json.dumps(payload),
    qualifier="DEFAULT"
)
```
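
What comes back needs unpacking as well. Below is a hedged sketch of pulling the reply text out of a decoded response body. `extract_reply` is a hypothetical helper, and the A2A shape it handles (a JSON-RPC `result` carrying a message with text `parts`, mirroring the request above) is an assumption; an agent may instead return a task object or some other structure, so adjust it to what your agents actually emit.

```python
import json
from typing import Any


def extract_reply(body: bytes) -> str:
    """Return the text of an agent reply, or raise on a JSON-RPC error."""
    data: Any = json.loads(body)
    if isinstance(data, dict) and data.get("jsonrpc") == "2.0":
        if "error" in data:
            err = data["error"]
            raise RuntimeError(f"JSON-RPC error {err.get('code')}: {err.get('message')}")
        parts = data.get("result", {}).get("parts", [])
        # Keep only text parts; other kinds (files, data) are ignored here.
        return "".join(p.get("text", "") for p in parts if p.get("kind") == "text")
    # Standard agents: the shape is agent-defined, so hand it back unchanged.
    return data if isinstance(data, str) else json.dumps(data)
```

Assuming the runtime client exposes the body as a readable stream under a `response` key, usage would look like `extract_reply(response["response"].read())`.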

This dual-mode support allows the platform to handle both simple standalone bots and complex, interconnected agent swarms.

## 5. Developer Experience: Python, uv, and Automation

Managing dependencies in modern Python projects can be complex. One thing I added on top of the original repo is [**uv**](https://github.com/astral-sh/uv), an extremely fast Python package installer and resolver.

We use `uv` to ensure reproducible builds. We've also automated the entire bootstrapping process with `scripts/setup_aws_resources.py`, which handles:

1. Detecting AWS Account/Region.
2. Creating IAM Roles and ECR Repositories.
3. Building and pushing Docker images.
4. Generating the local `.env` configuration.

## 6. Getting Started

If you have the repository cloned, setting up your own Agent Factory is straightforward using uv:

### 1. Configure Credentials

```bash
cd agent-factory
pip install uv
uv --version   # check that uv is installed
uv init        # create a uv project (pyproject.toml) if the repo does not have one yet
aws configure
```

### 2. Initialize Environment

```bash
uv sync
```

### 3. Provision Infrastructure

```bash
uv run scripts/setup_aws_resources.py
```

This script acts as your bootstrap “Infrastructure as Code”. It uses boto3 to:

1. Detect your AWS Account ID and Region.
2. Check for and create missing IAM Roles.
3. Create the `reinvent/agents` ECR repository.
4. Authenticate Docker with ECR.
5. Build and push the necessary Docker images.
6. Generate a local `.env` file for your client configuration. (Don't forget to add it to `.gitignore` :))
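
The "check for and create missing" steps boil down to making each create call idempotent, so the script can be rerun safely. A minimal sketch of that pattern for the ECR step follows. `ensure_repository` is a hypothetical helper rather than the script's actual code, and the client is passed in so the logic can be tested with a stub.

```python
from typing import Any


def ensure_repository(ecr_client: Any, name: str = "reinvent/agents") -> bool:
    """Create the repository if missing. Return True only if it was created."""
    try:
        ecr_client.create_repository(repositoryName=name)
        return True
    except ecr_client.exceptions.RepositoryAlreadyExistsException:
        # Already there from a previous run: nothing to do.
        return False
```

With real credentials this would be called as `ensure_repository(boto3.client("ecr"))`. Attempting the create and tolerating the "already exists" error avoids the race between a separate existence check and the create call.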

### 4. Deploy Builder & Run CLI

```bash
uv run scripts/deploy_builder.py
uv run demo_cli.py
```

![agent-factory-demo](images/agent-factory-demo.png)

You can now chat with the Builder Agent and ask it to create new bots for you!