Designing and Implementing a Minimal AI Agent Framework: From the Agent Loop to Context Engineering

Updated: 2026-03-25 | Tags: Agent, Sharing

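Building an AI Agent framework means handling three engineering elements: LLM Call (reasoning), Tools Call (execution), and Context engineering. If context engineering is the core of an Agent framework, then the Agent Loop is the engine that drives context engineering.

At its heart, designing an Agent framework means deciding how context is managed inside the Agent Loop, which is essentially a while loop. This article is built around that thesis.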

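Contents

Agent framework architecture at a glance
Design of the three elements of the Agent framework
Agent framework implementation
A minimal Agent application built on the minimal Agent framework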

1. Agent Framework Architecture at a Glance

┌─────────────────────────────────────────────────────────────────────┐
│                User Interface(CLI REPL Layer)                       │
│  ┌──────────────┐   ┌──────────────┐   ┌──────────────────────────┐ │
│  │  User Input  │   │    Exit/     │   │   Message History        │ │
│  │   Handler    │   │   Clear Cmd  │   │   Management             │ │
│  └──────┬───────┘   └──────────────┘   └──────────────────────────┘ │
│         │                                                           │
│         ▼                                                           │
│  ┌──────────────────────────────────────────────────────────────┐   │
│  │                      Agent Loop Core                         │   │
│  │  ┌──────────────┐    ┌──────────────┐    ┌──────────────┐    │   │
│  │  │   LLM Call   │───▶│ Tool Call    │───▶│   Tool Exec  │    │   │
│  │  │   (DeepSeek) │    │   Parser     │    │   Engine     │    │   │
│  │  └──────────────┘    └──────────────┘    └──────────────┘    │   │
│  │         │                                              │     │   │
│  │         │◀─────────────────────────────────────────────┘     │   │
│  │         │ (Tool Results Feedback)                            │   │
│  │         ▼                                                    │   │
│  │  ┌──────────────┐    ┌──────────────┐                        │   │
│  │  │   Response   │───▶│   Context    │                        │   │
│  │  │   Formatter  │    │   Manager    │                        │   │
│  │  └──────────────┘    └──────────────┘                        │   │
│  └──────────────────────────────────────────────────────────────┘   │
│                              │                                      │
│                              ▼                                      │
│  ┌──────────────────────────────────────────────────────────────┐   │
│  │                    Tools Registry (TOOLS)                    │   │
│  │  ┌─────────┐ ┌─────────┐ ┌─────────┐ ┌─────────┐             │   │
│  │  │ shell_  │ │ file_   │ │ file_   │ │ python_ │             │   │
│  │  │ exec    │ │ read    │ │ write   │ │ exec    │             │   │
│  │  └─────────┘ └─────────┘ └─────────┘ └─────────┘             │   │
│  │      │            │            │            │                │   │
│  │      ▼            ▼            ▼            ▼                │   │
│  │  [Function]   [Function]   [Function]   [Function]           │   │
│  │  [Schema]     [Schema]     [Schema]     [Schema]             │   │
│  └──────────────────────────────────────────────────────────────┘   │
└─────────────────────────────────────────────────────────────────────┘

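Flow recap

Initial context (system prompt + user request)
    ↓
[agent loop starts]
    ↓
agent reads the context → thinks → decides on an action
    ↓
executes the tool/action → gets a result
    ↓
result is appended to the context
    ↓
[loop continues or ends]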
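
The flow above can be tried without any network call by swapping the LLM for a scripted stand-in. This is only a sketch: `fake_llm`, the `echo` tool, `TOY_TOOLS`, and the simplified message shape (a single `tool_call` field) are hypothetical and are not the OpenAI format used later in the article. It illustrates how the context grows on each pass through the loop.

```python
# A toy agent loop with a scripted stand-in for the LLM (hypothetical names).
# It only illustrates the control flow: read context, decide, act, append.

def fake_llm(messages):
    """Pretend model: request one tool call first, then answer in plain text."""
    already_used_tool = any(m["role"] == "tool" for m in messages)
    if not already_used_tool:
        return {"role": "assistant", "content": None,
                "tool_call": {"name": "echo", "args": {"text": "hello"}}}
    return {"role": "assistant", "content": "done: " + messages[-1]["content"]}

def echo(text):
    """Trivial tool used only for this illustration."""
    return text.upper()

TOY_TOOLS = {"echo": echo}

def toy_agent_loop(user_message, max_turns=5):
    # Initial context: system prompt + user request
    messages = [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": user_message},
    ]
    for _ in range(max_turns):
        reply = fake_llm(messages)          # read context, think, decide
        messages.append(reply)
        call = reply.get("tool_call")
        if call is None:                    # no action requested: finish
            return reply["content"], messages
        result = TOY_TOOLS[call["name"]](**call["args"])      # act
        messages.append({"role": "tool", "content": result})  # append result
    return "[stopped: turn limit reached]", messages

answer, history = toy_agent_loop("say hello")
print(answer)
print([m["role"] for m in history])
```

The printed role sequence shows the context accumulating one assistant message and one tool message per tool round, followed by the final assistant reply.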

2. Design of the Three Elements of the Agent Framework

1) LLM Call

LLM Provider: DeepSeek, using the deepseek-chat model
LLM Call API: the standard OpenAI SDK
To keep the project as readable as possible, calls are synchronous and non-streaming

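2) Tools

The tool set is deliberately minimal and covers files, the shell, and Python code execution.

Tools implementation: 4 tool functions in total

shell_exec: run a shell command and return its output
file_read: read the contents of a file
file_write: write content to a file (parent directories are created automatically)
python_exec: run Python code in a subprocess and return its output

Tools registration: a manually maintained dictionary maps name → {function, OpenAI function schema}. When the response of an LLM Call is parsed, the tool to execute is looked up by name.

Tool definitions follow the standard OpenAI Function Calling format (also known as the OpenAI Tools API schema).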

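3) Context

System Prompt: a minimal system prompt that tells the LLM which tools are available and asks it to reason in a ReAct style
The messages list (OpenAI chat format) is the core state; it accumulates the system prompt, user messages, assistant responses, and tool results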

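3. Agent Framework Implementation

3.1 Part 1: Agent Loop and Context

Basic flow: LLM call → parse tool_calls → execute → append results to messages → loop or exit

Safety limit: the loop is capped at 20 iterations (MAX_TURNS = 20); in the code this cap is written as a bounded for loop.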

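Context management:

The messages variable carries the context
Initialize with the System Prompt: {"role": "system", "content": system_prompt}
Append the User Message: {"role": "user", "content": user_message}
Append the Assistant Message returned by the LLM (including any tool_calls)
Append Tool Results: {"role": "tool", "tool_call_id": tool_call.id, "content": result}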
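
To make these message shapes concrete, here is a hypothetical snapshot of `messages` after one tool round trip, together with a small helper that checks every tool call has a matching tool result. The helper `unanswered_tool_calls`, the ID `call_001`, and the message contents are invented for illustration and are not part of the article's framework.

```python
# Hypothetical snapshot of the context after one tool round trip.
messages = [
    {"role": "system", "content": "You are a helpful AI assistant."},
    {"role": "user", "content": "List the files in the current directory."},
    {   # assistant turn that requests a tool call (the ID is made up)
        "role": "assistant",
        "content": None,
        "tool_calls": [
            {
                "id": "call_001",
                "type": "function",
                "function": {"name": "shell_exec",
                             "arguments": "{\"command\": \"ls\"}"},
            }
        ],
    },
    {"role": "tool", "tool_call_id": "call_001", "content": "agent.py\nREADME.md"},
]

def unanswered_tool_calls(messages):
    """Return IDs of tool calls that have no matching role='tool' message."""
    requested = [
        call["id"]
        for m in messages if m["role"] == "assistant"
        for call in (m.get("tool_calls") or [])
    ]
    answered = {m["tool_call_id"] for m in messages if m["role"] == "tool"}
    return [call_id for call_id in requested if call_id not in answered]

print(unanswered_tool_calls(messages))        # every call has a result
print(unanswered_tool_calls(messages[:-1]))   # tool result dropped
```

The `tool_call_id` field is what ties a tool result back to the assistant's request, which is why it appears in every tool message.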

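# ============================================================
# Agent Loop: the core
# ============================================================
import json

from openai import OpenAI

MAX_TURNS = 20

def agent_loop(user_message: str, messages: list, client: OpenAI) -> str:
    """
    Agent Loop: a bounded loop that drives LLM reasoning and tool calls.
    Flow:
      1. Append the user message to messages
      2. Call the LLM
      3. If the LLM returns tool_calls → execute each one → append the results to messages → continue the loop
      4. If the LLM returns plain text (no tool_calls) → exit the loop and return the text
      5. Stop after at most MAX_TURNS turns
    """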
    messages.append({"role": "user", "content": user_message})
    tool_schemas = [t["schema"] for t in TOOLS.values()]
    
    for turn in range(1, MAX_TURNS + 1):
        # --- LLM Call ---
        response = client.chat.completions.create(
            model="deepseek-chat",
            messages=messages,
            tools=tool_schemas,
        )
        choice = response.choices[0]
        assistant_msg = choice.message
        # Append the assistant message to the context
        messages.append(assistant_msg.model_dump())
        
        # --- Exit condition: no tool_calls ---
        if not assistant_msg.tool_calls:
            return assistant_msg.content or ""
        
        # --- Execute each tool_call ---
        for tool_call in assistant_msg.tool_calls:
            name = tool_call.function.name
            raw_args = tool_call.function.arguments
            print(f"  [tool] {name}({raw_args})")
            
            # Parse the arguments and call the tool. Bad arguments are
            # reported back to the model as text instead of crashing the loop.
            tool_entry = TOOLS.get(name)
            if tool_entry is None:
                result = f"[error] unknown tool: {name}"
            else:
                try:
                    args = json.loads(raw_args)
                    result = tool_entry["function"](**args)
                except (json.JSONDecodeError, TypeError) as e:
                    result = f"[error] invalid arguments for {name}: {e}"
            
            # Append the tool result to the context
            messages.append(
                {
                    "role": "tool",
                    "tool_call_id": tool_call.id,
                    "content": result,
                }
            )
    
    return "[agent] reached maximum turns, stopping."

Note: deepseek-chat is used here mainly because it supports Tool Calls and is inexpensive.
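
The loop appends every tool result to messages verbatim, so one large file or a noisy command can take up much of the context. One possible refinement, which the article's code does not include, is to clip long results before appending them. The helper name `clip_tool_result` and the default limit of 4000 characters are assumptions chosen for illustration.

```python
def clip_tool_result(text: str, limit: int = 4000) -> str:
    """Keep the head and tail of a long tool result and note what was cut.

    Hypothetical helper: the limit counts characters, not tokens.
    """
    if len(text) <= limit:
        return text
    half = max(1, limit // 2)          # avoid text[-0:], which is the whole string
    head = text[:half]
    tail = text[-half:]
    omitted = len(text) - len(head) - len(tail)
    return f"{head}\n[... {omitted} chars omitted ...]\n{tail}"

long_output = "x" * 10_000
clipped = clip_tool_result(long_output, limit=100)
print(clipped.count("x"), "of", len(long_output), "characters kept")
```

In the loop, such a helper would wrap the value stored under "content" in the tool message. Keeping both head and tail is a design choice: command output often puts the error summary at the end.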

3.2 Tools Implementation and Registration

Tool function implementations

# ============================================================
# Tools implementation: 4 tool functions
# ============================================================
import os
import subprocess
import sys
import tempfile

def shell_exec(command: str) -> str:
    """Run a shell command and return stdout + stderr."""
    try:
        result = subprocess.run(
            command,
            shell=True,
            capture_output=True,
            text=True,
            timeout=30,
        )
        output = result.stdout
        if result.stderr:
            output += "\n[stderr]\n" + result.stderr
        if result.returncode != 0:
            output += f"\n[exit code: {result.returncode}]"
        return output.strip() or "(no output)"
    except subprocess.TimeoutExpired:
        return "[error] command timed out after 30s"
    except Exception as e:
        return f"[error] {e}"

def file_read(path: str) -> str:
    """Read the contents of a file."""
    try:
        with open(path, "r", encoding="utf-8") as f:
            return f.read()
    except Exception as e:
        return f"[error] {e}"

def file_write(path: str, content: str) -> str:
    """Write content to a file (parent directories are created automatically)."""
    try:
        os.makedirs(os.path.dirname(path) or ".", exist_ok=True)
        with open(path, "w", encoding="utf-8") as f:
            f.write(content)
        return f"OK — wrote {len(content)} chars to {path}"
    except Exception as e:
        return f"[error] {e}"

def python_exec(code: str) -> str:
    """Run Python code in a subprocess and return its output."""
    tmp_path = None
    try:
        with tempfile.NamedTemporaryFile(
            mode="w", suffix=".py", delete=False, encoding="utf-8"
        ) as tmp:
            tmp.write(code)
            tmp_path = tmp.name
        
        result = subprocess.run(
            [sys.executable, tmp_path],
            capture_output=True,
            text=True,
            timeout=30,
        )
        output = result.stdout
        if result.stderr:
            output += "\n[stderr]\n" + result.stderr
        return output.strip() or "(no output)"
    except subprocess.TimeoutExpired:
        return "[error] execution timed out after 30s"
    except Exception as e:
        return f"[error] {e}"
    finally:
        try:
            if tmp_path:
                os.unlink(tmp_path)
        except OSError:
            pass

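Tool registration

Once implemented, the tools must be registered so that the Agent Loop can run the right one based on what the LLM returns. The registry is simply a dictionary mapping name → {function, OpenAI schema}.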

# ============================================================
# Tools registry: name → {function, OpenAI function schema}
# ============================================================
TOOLS = {
    "shell_exec": {
        "function": shell_exec,
        "schema": {
            "type": "function",
            "function": {
                "name": "shell_exec",
                "description": "Execute a shell command and return its output.",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "command": {
                            "type": "string",
                            "description": "The shell command to execute.",
                        }
                    },
                    "required": ["command"],
                },
            },
        },
    },
    "file_read": {
        "function": file_read,
        "schema": {
            "type": "function",
            "function": {
                "name": "file_read",
                "description": "Read the contents of a file at the given path.",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "path": {
                            "type": "string",
                            "description": "Absolute or relative file path.",
                        }
                    },
                    "required": ["path"],
                },
            },
        },
    },
    "file_write": {
        "function": file_write,
        "schema": {
            "type": "function",
            "function": {
                "name": "file_write",
                "description": "Write content to a file (creates parent directories if needed).",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "path": {
                            "type": "string",
                            "description": "Absolute or relative file path.",
                        },
                        "content": {
                            "type": "string",
                            "description": "Content to write.",
                        },
                    },
                    "required": ["path", "content"],
                },
            },
        },
    },
    "python_exec": {
        "function": python_exec,
        "schema": {
            "type": "function",
            "function": {
                "name": "python_exec",
                "description": "Execute Python code in a subprocess and return its output.",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "code": {
                            "type": "string",
                            "description": "Python source code to execute.",
                        }
                    },
                    "required": ["code"],
                },
            },
        },
    },
}

Tool definitions follow the standard OpenAI Function Calling format (the OpenAI Tools API schema). The schema field of each tool has the following structure:

json

{
    "type": "function",
    "function": {
        "name": "...",
        "description": "...",
        "parameters": {
            "type": "object",
            "properties": { ... },
            "required": [ ... ]
        }
    }
}
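
The manual dictionary is easy to read, but it repeats each tool's name and parameters. One possible alternative, sketched here as an assumption rather than the article's approach, is a decorator that derives the schema from the function signature. The names `tool` and `REGISTRY` are hypothetical, and the sketch treats every parameter as a string, which happens to match the four tools above.

```python
import inspect

REGISTRY = {}  # hypothetical registry: name -> {"function", "schema"}

def tool(description):
    """Register a function and build an OpenAI-style function schema for it.

    Simplifying assumption: every parameter is a string; parameters
    without a default value are listed as required.
    """
    def decorator(func):
        params = inspect.signature(func).parameters
        properties = {name: {"type": "string"} for name in params}
        required = [name for name, p in params.items()
                    if p.default is inspect.Parameter.empty]
        REGISTRY[func.__name__] = {
            "function": func,
            "schema": {
                "type": "function",
                "function": {
                    "name": func.__name__,
                    "description": description,
                    "parameters": {
                        "type": "object",
                        "properties": properties,
                        "required": required,
                    },
                },
            },
        }
        return func
    return decorator

@tool("Read the contents of a file at the given path.")
def file_read(path: str) -> str:
    with open(path, "r", encoding="utf-8") as f:
        return f.read()

schema = REGISTRY["file_read"]["schema"]
print(schema["function"]["name"], schema["function"]["parameters"]["required"])
```

The trade-off is that per-parameter descriptions are lost unless they are passed in separately, so the explicit dictionary remains the clearer choice for a four-tool teaching example.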

3.3 System Prompt

The System Prompt is sent with every request to the LLM, because it is the first entry in messages.

# ============================================================
# System Prompt
# ============================================================
SYSTEM_PROMPT = """You are a helpful AI assistant with access to the following tools:
1. shell_exec — run shell commands
2. file_read — read file contents
3. file_write — write content to a file
4. python_exec — execute Python code
Think step by step. Use tools when you need to interact with the file system, \
run commands, or execute code. When the task is complete, respond directly \
without calling any tool."""

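With that, the minimal Agent framework is complete: a single file, 279 lines of code in total.

4. A Minimal Agent Application Built on the Minimal Agent Framework

4.1 User Interface Design: Python CLI REPL

With the framework in place, the only thing still missing for an Agent application is a user interface. In keeping with the minimalist approach, a Python CLI REPL serves as the entry point to the Agent: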

def main():
    api_key = os.environ.get("DEEPSEEK_API_KEY")
    if not api_key:
        print("Error: please set DEEPSEEK_API_KEY environment variable.")
        sys.exit(1)
    
    client = OpenAI(api_key=api_key, base_url="https://api.deepseek.com")
    messages: list = [{"role": "system", "content": SYSTEM_PROMPT}]
    
    print("Agent ready. Type your message (or 'exit' to quit, 'clear' to reset).\n")
    
    while True:
        try:
            user_input = input("You> ").strip()
        except (EOFError, KeyboardInterrupt):
            print("\nBye.")
            break
        
        if not user_input:
            continue
        
        if user_input.lower() == "exit":
            print("Bye.")
            break
        
        if user_input.lower() == "clear":
            messages.clear()
            messages.append({"role": "system", "content": SYSTEM_PROMPT})
            print("(context cleared)\n")
            continue
        
        reply = agent_loop(user_input, messages, client)
        print(f"\nAgent> {reply}\n")


if __name__ == "__main__":
    main()

4.2 Registering with DeepSeek and Getting an API Key

Because the LLM Provider in this framework is DeepSeek, you need an API key for the DeepSeek deepseek-chat model before you can use it.

Sign up: https://platform.deepseek.com
Get API keys: https://platform.deepseek.com/api_keys

4.3 Using the Minimal Agent

Set the API key before use:

bash

export DEEPSEEK_API_KEY="sk-xxxxx"

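Usage examples

Ask for the list of files in the current directory
Run a counting task: count the lines of code and the number of tokens in the current directory

In practice, the Agent keeps calling tools, generating code and executing it as it goes. The run reported here made 17 API requests and consumed 86k tokens, for a total cost of about 3 fen (roughly 0.03 CNY).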
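
The reported figures work out to roughly 5k tokens per request. A likely reason the per-request size is that large is that every call resends the whole messages list, so input tokens add up across turns. The sketch below does the arithmetic; the function `total_input_tokens` and the 500/600 token averages are hypothetical and are not measurements from the article's run.

```python
# Figures reported in the article for one run.
api_requests = 17
total_tokens = 86_000

avg_tokens_per_request = total_tokens / api_requests
print(round(avg_tokens_per_request))

def total_input_tokens(base_tokens, tokens_added_per_turn, turns):
    """Sum of context sizes when each request resends the whole history.

    base_tokens and tokens_added_per_turn are hypothetical averages.
    """
    return sum(base_tokens + tokens_added_per_turn * i for i in range(turns))

# Hypothetical: 500-token starting context, 600 tokens appended per turn.
print(total_input_tokens(500, 600, 17))
```

Because the sum grows with the square of the number of turns, keeping each appended message small matters more as a session gets longer.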

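As you can see, the Agent application is minimal, but what it can do is not trivial. The Tools layer of Pi Agent, the underlying Agent Core of OpenClaw, likewise contains only four tool methods: read file (Read), write file (Write), edit file (Edit), and command line (Shell). All other capabilities are added through its event mechanism and Skills.

Once a file system is paired with code tools, the computer's productivity is fully unleashed.

Closing thoughts

This minimal AI Agent framework still leaves plenty of room for improvement in robustness, security, functionality (such as streaming output), and elegance (such as Tools registration). Even so, it has all the essential parts and stays simple and clear, which helps us set aside complex, sprawling component libraries and see what an Agent really is.

Why go minimal? Partly to make the key points of an Agent easy to explain, and partly for a practical reason: the codebase itself is increasingly becoming part of context engineering. The simpler the codebase, the clearer the context (the lower the noise), and the smarter the Agent.

Beyond the Agent framework and within the Agent application, context engineering is the core of intelligence and the key to commercial Agent applications. The framework supplies the basic tools, context engineering supplies the environment, and with Skills for a specific business domain added on top, an Agent can realize enormous potential.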

Link: https://fly63.com/article/detial/13505
