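LangGraph in Practice: Building Multi-Agent AI Systems That Reason, Remember, and Allow Human Intervention

Combining several smaller sub-agents into one capable AI agent has become a common pattern. It also brings challenges: reducing hallucinations, managing the conversation flow, watching closely how the agent behaves during testing, letting a human step in, and evaluating performance. Expect a lot of trial and error.

In this article we build a multi-agent system using the supervisor approach. Along the way we cover the basics, the challenges you may face when designing a complex AI agent architecture, and how to evaluate and improve it.

We will use LangGraph and LangSmith to help with this.

We start from the basics and build this multi-agent architecture step by step.

Environment Setup

LangChain and LangGraph together form a complete framework, but importing every library at once would only cause confusion.

So we import modules only when they are needed, which makes each step easier to follow.

The first step is to set the environment variables that hold sensitive information such as API keys.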

import os

# Set environment variables for the API integrations
os.environ["OPENAI_API_KEY"] = "your-openai-api-key"
os.environ["LANGSMITH_API_KEY"] = "your-langsmith-api-key"
os.environ["LANGSMITH_TRACING"] = "true"  # Enable LangSmith tracing
os.environ["LANGSMITH_PROJECT"] = "intelligent-rag-system"  # Project name used to group LangSmith traces

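LangSmith may be a new term for you. If you don't know what it is, the next section explains what it is for. If you already know it, feel free to skip that section.

To get a LangSmith API key, create an account on the LangSmith website. You will then find your API key under Settings.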
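Hard-coded placeholder keys are easy to forget to replace. As a minimal sketch, assuming you would rather fail fast when a key is missing, a small helper can read a variable and raise a clear error if it is empty. The helper name require_env and the DEMO_API_KEY variable are hypothetical and exist only for illustration.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable or fail with a clear message."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"Environment variable {name} is not set")
    return value


# Demo with a placeholder variable so the snippet runs anywhere.
os.environ["DEMO_API_KEY"] = "placeholder-value"
demo_key = require_env("DEMO_API_KEY")
print(f"DEMO_API_KEY loaded ({len(demo_key)} characters)")
```

In a real project the same call could guard OPENAI_API_KEY and LANGSMITH_API_KEY before any client is created.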

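What LangSmith Is For

When we build AI agent applications with LLMs, LangSmith helps us understand and improve them. It works like a dashboard that shows what is happening inside the application and lets you:

Debug when something goes wrong
Test your prompts and logic
Evaluate the quality of answers
Monitor your application in real time
Track usage, latency, and cost

LangSmith makes all of this easy to use, even if you are not a developer.

Let's import it.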

from langsmith import utils

# Check and print whether LangSmith tracing is currently enabled
print(f"LangSmith tracing is enabled: {utils.tracing_is_enabled()}")

### output ###
LangSmith tracing is enabled: True

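We just imported utils from LangSmith, which we will use later. Tracing reports True because we set the environment variable LANGSMITH_TRACING to "true" earlier, which lets us record and visualize the execution of the AI agent application.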

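Dataset

We will use the Chinook database, a popular sample database for learning and testing SQL. It models the data and operations of a digital music store: customer information, purchase history, and the music catalog.

It is available in several formats such as MySQL and PostgreSQL. We will use the SQLite version, because it is simple to set up and makes it easy to see how an AI agent interacts with a database, which is especially useful if you are new to AI agents.

Let's define a function that sets up the SQLite database for us.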

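import sqlite3
import requests
from langchain_community.utilities.sql_database import SQLDatabase
from sqlalchemy import create_engine
from sqlalchemy.pool import StaticPool

def get_engine_for_chinook_db():
    """
    Pull the SQL file, populate an in-memory database, and create an engine.

    Downloads the Chinook database SQL script from GitHub and creates an
    in-memory SQLite database populated with the sample data.

    Returns:
        sqlalchemy.engine.Engine: SQLAlchemy engine connected to the in-memory database
    """
    # Download the Chinook database SQL script from the official repository
    url = "https://raw.githubusercontent.com/lerocha/chinook-database/master/ChinookDatabase/DataSources/Chinook_Sqlite.sql"
    response = requests.get(url)
    response.raise_for_status()  # Fail early if the download did not succeed
    sql_script = response.text

    # Create an in-memory SQLite database connection
    # check_same_thread=False allows the connection to be used across threads
    connection = sqlite3.connect(":memory:", check_same_thread=False)

    # Execute the SQL script to populate the database with sample data
    connection.executescript(sql_script)

    # Create and return a SQLAlchemy engine that uses the populated connection
    return create_engine(
        "sqlite://",  # SQLite URL scheme
        creator=lambda: connection,  # Function that returns the database connection
        poolclass=StaticPool,  # Use StaticPool to maintain a single connection
        connect_args={"check_same_thread": False},  # Allow cross-thread use
    )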

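We just defined our first function, get_engine_for_chinook_db(), which sets up a temporary in-memory SQLite database from the Chinook sample dataset.

It downloads the SQL script from GitHub, creates the database in memory, runs the script to populate it with tables and data, and returns a SQLAlchemy engine connected to that database.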
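The core of that function is sqlite3's executescript on a ":memory:" connection. The sketch below shows the same idea without a network download, assuming a tiny hand-written script in place of the real Chinook file. The table names mirror Chinook, but the rows are made up for illustration.

```python
import sqlite3

# A tiny stand-in for the Chinook script (hypothetical rows, Chinook-like tables).
SQL_SCRIPT = """
CREATE TABLE Artist (ArtistId INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE Album (AlbumId INTEGER PRIMARY KEY, Title TEXT, ArtistId INTEGER);
INSERT INTO Artist VALUES (1, 'Demo Artist');
INSERT INTO Album VALUES (1, 'First Album', 1), (2, 'Second Album', 1);
"""

# The database lives only in memory and disappears when the connection closes.
connection = sqlite3.connect(":memory:")
connection.executescript(SQL_SCRIPT)

album_rows = connection.execute(
    "SELECT Album.Title, Artist.Name FROM Album "
    "JOIN Artist ON Album.ArtistId = Artist.ArtistId ORDER BY Album.AlbumId"
).fetchall()
print(album_rows)
```

Because the data lives in the connection itself, every consumer has to share that one connection, which is why the tutorial's engine uses StaticPool with a creator that always returns it.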

Now we call this function to create the SQLite database.

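# Initialize the database engine with the Chinook sample data
engine = get_engine_for_chinook_db()

# Wrap the engine in a LangChain SQLDatabase
# This provides convenient methods for database operations and query execution
db = SQLDatabase(engine)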

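We just called the function and initialized the engine, so that our AI agents can run queries against this database later.

Short-Term and Long-Term Memory

With the database initialized, we can look at the first benefit of this stack (LangGraph + LangSmith): two different kinds of memory. First, though, what is memory?

Memory plays an important role in any agent. Just like a human, an AI agent needs to remember past interactions to keep context and give personalized responses.

LangGraph distinguishes between short-term and long-term memory. Here is a quick comparison:

Short-term memory helps the agent keep track of the current conversation. In LangGraph this is handled by MemorySaver, which saves and restores the state of a conversation.
Long-term memory lets the agent remember information across conversations, such as user preferences. Here we use InMemoryStore for quick storage; in a real application you would use a more durable database.

Let's initialize both.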

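from langgraph.checkpoint.memory import MemorySaver
from langgraph.store.memory import InMemoryStore

# Initialize the long-term memory store for data that persists between conversations
in_memory_store = InMemoryStore()

# Initialize the checkpointer for short-term memory within a single thread/conversation
checkpointer = MemorySaver()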

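We use in_memory_store as long-term memory; it lets us keep user preferences even after a conversation has ended.

Meanwhile, MemorySaver (the checkpointer) keeps the context of the current conversation intact, which enables smooth multi-turn interaction.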
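The difference between the two scopes can be pictured with plain dictionaries. This is only a conceptual sketch, not how MemorySaver or InMemoryStore are implemented: short-term state is keyed by thread ID, while long-term facts are keyed by user ID and are visible from every thread. All names below (thread_checkpoints, user_store, record_turn) are hypothetical.

```python
from collections import defaultdict

# Short-term memory: conversation state keyed by thread ID.
thread_checkpoints: dict[str, list[str]] = defaultdict(list)
# Long-term memory: user facts keyed by user ID, shared by all threads.
user_store: dict[str, dict[str, str]] = defaultdict(dict)


def record_turn(thread_id: str, user_id: str, message: str) -> None:
    """Append a message to the thread and remember a stated preference."""
    thread_checkpoints[thread_id].append(message)
    prefix = "i like "
    if message.lower().startswith(prefix):
        user_store[user_id]["music_preference"] = message[len(prefix):].rstrip(".")


record_turn("thread-1", "customer-1", "I like the Rolling Stones.")
record_turn("thread-2", "customer-1", "What should I listen to today?")

# The second thread has no history from the first one...
thread_2_history = list(thread_checkpoints["thread-2"])
# ...but the preference stored for the user is still available.
remembered = user_store["customer-1"].get("music_preference")
```

A new thread starts with an empty history, yet the preference recorded in the earlier thread can still be looked up for the same user.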

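Multi-Agent Architecture

We start with a simple ReAct agent and add further steps to the workflow, simulating a realistic customer support scenario that demonstrates human-in-the-loop, long-term memory, and the LangGraph prebuilt library.

We build each component of the multi-agent workflow step by step. It contains two specialized ReAct (reasoning and acting) sub-agents, which are then combined into a multi-agent workflow with additional steps.

Our workflow consists of the following steps:

human_input, where the user provides account information.
Then, in verify_info, the system checks the account and clarifies the user's intent if needed.
Next, load_memory retrieves the user's music preferences.
The supervisor coordinates two sub-agents: music_catalog (for music data) and invoice_info (for billing).
Finally, create_memory updates the user's memory with new information from the interaction.

With the basics covered, let's build the first sub-agent.

Catalog Information Sub-Agent

The first sub-agent is a music catalog information agent. Its main job is to help customers with queries about our digital music catalog, such as searching for artists, albums, or songs.

How will our agent remember information, decide what to do, and carry out actions? This brings us to three fundamental LangGraph concepts: State, Tools, and Nodes.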

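Defining State, Tools, and Nodes

In LangGraph, the State holds a snapshot of the data currently flowing through the graph; it is essentially the agent's memory.

For our customer support agent, the state contains:

customer_id: identifies the customer for personalized responses and data retrieval.
messages: the list of all messages exchanged in the conversation, which gives the agent its context.
loaded_memory: long-term, user-specific information (such as preferences) loaded into the conversation.
remaining_steps: tracks the number of remaining steps to prevent infinite loops.

As the conversation progresses, every node updates this state. We define the state with TypedDict for type hints, and use Annotated (from typing) together with add_messages from LangGraph's message module so that new messages are appended rather than overwriting the list.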

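from typing_extensions import TypedDict
from typing import Annotated
from langgraph.graph.message import AnyMessage, add_messages
from langgraph.managed.is_last_step import RemainingSteps

class State(TypedDict):
    """
    State schema for the multi-agent customer support workflow.

    Defines the shared data structure that flows between nodes in the graph,
    representing the current snapshot of the conversation and agent state.
    """
    # Customer identifier retrieved during account verification
    customer_id: str

    # Conversation history with automatic message aggregation
    messages: Annotated[list[AnyMessage], add_messages]

    # User preferences and context loaded from the long-term memory store
    loaded_memory: str

    # Counter that prevents infinite recursion in the agent workflow
    remaining_steps: RemainingSteps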

This State class is the blueprint for how information is managed and passed between the different parts of our multi-agent system.
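To see why messages is annotated with a reducer, here is a toy version of the idea in plain Python. It is a sketch of the concept only, not LangGraph's implementation: append_messages, ToyState, and apply_update are hypothetical names, and the real add_messages does more (for example, matching messages by ID).

```python
from typing import Annotated, TypedDict, get_type_hints


def append_messages(existing: list, new: list) -> list:
    """Toy reducer: concatenate instead of overwrite."""
    return existing + new


class ToyState(TypedDict):
    customer_id: str
    messages: Annotated[list, append_messages]


def apply_update(state: dict, update: dict) -> dict:
    """Merge a node's partial update into the state, honoring reducers."""
    hints = get_type_hints(ToyState, include_extras=True)
    merged = dict(state)
    for key, value in update.items():
        # Annotated[...] types carry their extra arguments in __metadata__.
        metadata = getattr(hints[key], "__metadata__", ())
        if metadata and key in merged:
            merged[key] = metadata[0](merged[key], value)  # reduce
        else:
            merged[key] = value  # plain fields are overwritten
    return merged


state = {"customer_id": "1", "messages": ["hi"]}
state = apply_update(state, {"messages": ["hello, how can I help?"]})
state = apply_update(state, {"customer_id": "2"})
```

A node therefore only has to return the new messages; fields without a reducer, such as customer_id, are simply replaced.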

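Next we extend the agent's capabilities with Tools. Tools are functions that let the LLM do things it cannot do on its own, such as calling an API or accessing a database.

For our agent, the tools connect to the Chinook database to fetch music-related information.

We write them as ordinary Python functions and mark them with the @tool decorator from langchain_core.tools, so that the LLM can discover and use them when needed.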

from langchain_core.tools import tool
import ast

@tool
def get_albums_by_artist(artist: str):
    """
    Get albums by an artist from the music database.

    Args:
        artist (str): The name of the artist to search albums for.

    Returns:
        str: Database query result containing album titles and artist names.
    """
    return db.run(
        f"""
        SELECT Album.Title, Artist.Name
        FROM Album
        JOIN Artist ON Album.ArtistId = Artist.ArtistId
        WHERE Artist.Name LIKE '%{artist}%';
        """,
        include_columns=True
    )

@tool
def get_tracks_by_artist(artist: str):
    """
    Get songs/tracks by an artist (or similar artists) from the music database.

    Args:
        artist (str): The name of the artist to search tracks for.

    Returns:
        str: Database query result containing song names and artist names.
    """
    return db.run(
        f"""
        SELECT Track.Name as SongName, Artist.Name as ArtistName
        FROM Album
        LEFT JOIN Artist ON Album.ArtistId = Artist.ArtistId
        LEFT JOIN Track ON Track.AlbumId = Album.AlbumId
        WHERE Artist.Name LIKE '%{artist}%';
        """,
        include_columns=True
    )

@tool
def get_songs_by_genre(genre: str):
    """
    Fetch songs from the database that match a specific genre.

    This function first looks up the genre ID(s) for the given genre name,
    then retrieves songs that belong to those genres, limiting the result
    to 8 songs grouped by artist.

    Args:
        genre (str): The genre of the songs to fetch.

    Returns:
        list[dict] or str: A list of songs with artist information that match
                          the specified genre, or an error message if no songs are found.
    """
    # First, get the genre ID(s) for the specified genre
    genre_id_query = f"SELECT GenreId FROM Genre WHERE Name LIKE '%{genre}%'"
    genre_ids = db.run(genre_id_query)

    # Check whether any genre was found
    if not genre_ids:
        return f"No songs found for the genre: {genre}"

    # Parse the genre IDs and format them for the SQL query
    genre_ids = ast.literal_eval(genre_ids)
    genre_id_list = ", ".join(str(gid[0]) for gid in genre_ids)

    # Query songs in the specified genre(s)
    songs_query = f"""
        SELECT Track.Name as SongName, Artist.Name as ArtistName
        FROM Track
        LEFT JOIN Album ON Track.AlbumId = Album.AlbumId
        LEFT JOIN Artist ON Album.ArtistId = Artist.ArtistId
        WHERE Track.GenreId IN ({genre_id_list})
        GROUP BY Artist.Name
        LIMIT 8;
    """
    songs = db.run(songs_query, include_columns=True)

    # Check whether any songs were found
    if not songs:
        return f"No songs found for the genre: {genre}"

    # Format the result as a structured list of dictionaries
    formatted_songs = ast.literal_eval(songs)
    return [
        {"Song": song["SongName"], "Artist": song["ArtistName"]}
        for song in formatted_songs
    ]

@tool
def check_for_songs(song_title: str):
    """
    Check whether a song exists in the database by its name.

    Args:
        song_title (str): The title of the song to search for.

    Returns:
        str: Database query result containing all track information for songs matching the given title.
    """
    return db.run(
        f"""
        SELECT * FROM Track WHERE Name LIKE '%{song_title}%';
        """,
        include_columns=True
    )

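This code block defines four specific tools:

get_albums_by_artist: finds the albums of a given artist
get_tracks_by_artist: finds the individual tracks of an artist
get_songs_by_genre: retrieves songs that belong to a specific genre
check_for_songs: verifies whether a specific song exists in the catalog

Each of these tools interacts with db (the SQLDatabase wrapper we initialized earlier) by executing a SQL query, and returns the result either as the query result string or, for get_songs_by_genre, as a list of dictionaries.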
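The tools above build their SQL by interpolating the argument into an f-string, and that argument ultimately comes from the LLM and the user. One common hardening step is to pass the value as a bound parameter instead. The sketch below shows the idea with the standard sqlite3 module on a made-up two-row table; it deliberately bypasses the SQLDatabase wrapper, and find_artists is a hypothetical helper rather than part of the tutorial's code.

```python
import sqlite3

connection = sqlite3.connect(":memory:")
connection.executescript(
    """
    CREATE TABLE Artist (ArtistId INTEGER PRIMARY KEY, Name TEXT);
    INSERT INTO Artist (Name) VALUES ('The Rolling Stones'), ('U2');
    """
)


def find_artists(name_fragment: str) -> list[str]:
    """Search artists by partial name using a bound parameter."""
    rows = connection.execute(
        "SELECT Name FROM Artist WHERE Name LIKE ? ORDER BY Name",
        (f"%{name_fragment}%",),  # the driver escapes the value for us
    ).fetchall()
    return [row[0] for row in rows]


normal_result = find_artists("Rolling")
# A quote-laden input is treated as plain text, not as SQL.
hostile_result = find_artists("x' OR '1'='1")
```

With the placeholder, the hostile input matches nothing instead of altering the WHERE clause.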

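# Create a list of all music-related tools for the agent
music_tools = [get_albums_by_artist, get_tracks_by_artist, get_songs_by_genre, check_for_songs]

# Bind the music tools to the language model for use in the ReAct agent
# `llm` is assumed to be a chat model with tool-calling support that was initialized beforehand
llm_with_music_tools = llm.bind_tools(music_tools)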

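Finally, llm.bind_tools() binds these music_tools to llm.

This key step lets the LLM understand when and how to call these functions based on the user's query.

With the State defined and the Tools ready, we can now define the Nodes of the graph.

Nodes are the core processing units of a LangGraph application. They take the current state of the graph as input, run some logic, and return an updated state.

For the ReAct agent we define two key kinds of node:

music_assistant is the LLM reasoning node. It uses the current conversation history and memory to decide on the next action, which is either calling a tool or generating a response, and updates the state.
music_tool_node runs the tools chosen by music_assistant. LangGraph's ToolNode manages the tool calls and updates the state with their results.

Combining these nodes gives us dynamic reasoning and acting inside the multi-agent workflow.

First, create a ToolNode for music_tools: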

from langgraph.prebuilt import ToolNode

# Create a tool node that executes the music-related tools
# ToolNode is a prebuilt LangGraph component that handles tool execution
music_tool_node = ToolNode(music_tools)

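Next, define the music_assistant node. This node uses the LLM (with music_tools bound) to determine the next action.

It also merges any loaded_memory into its prompt, which enables personalized responses.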

from langchain_core.messages import ToolMessage, SystemMessage, HumanMessage
from langchain_core.runnables import RunnableConfig

def generate_music_assistant_prompt(memory: str = "None") -> str:
    """
    Generate the system prompt for the music assistant agent.

    Args:
        memory (str): User preferences and context from the long-term memory store

    Returns:
        str: The formatted system prompt for the music assistant
    """
    return f"""
    You are a member of the assistant team, your role is specifically to focus on helping customers discover and learn about music in our digital catalog.
    If you are unable to find playlists, songs, or albums associated with an artist, it is okay.
    Just inform the customer that the catalog does not have any playlists, songs, or albums associated with that artist.
    You also have context on any saved user preferences, helping you to tailor your response.

    CORE RESPONSIBILITIES:
    - Search and provide accurate information about songs, albums, artists, and playlists
    - Offer relevant recommendations based on customer interests
    - Handle music-related queries with attention to detail
    - Help customers discover new music they might enjoy
    - You are routed only when there are questions related to music catalog; ignore other questions.

    SEARCH GUIDELINES:
    1. Always perform thorough searches before concluding something is unavailable
    2. If exact matches aren't found, try:
       - Checking for alternative spellings
       - Looking for similar artist names
       - Searching by partial matches
       - Checking different versions/remixes
    3. When providing song lists:
       - Include the artist name with each song
       - Mention the album when relevant
       - Note if it's part of any playlists
       - Indicate if there are multiple versions

    Additional context is provided below:

    Prior saved user preferences: {memory}

    Message history is also attached.
    """

We also need the music_assistant function itself.

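def music_assistant(state: State, config: RunnableConfig):
    """
    Music assistant node that handles music catalog queries and recommendations.

    This node processes customer requests related to music discovery, album searches,
    artist information, and personalized recommendations based on stored preferences.

    Args:
        state (State): Current state containing customer_id, messages, loaded_memory, etc.
        config (RunnableConfig): Configuration for the runnable execution

    Returns:
        dict: Updated state containing the assistant's response message
    """
    # Retrieve long-term memory preferences if available
    memory = "None"
    if "loaded_memory" in state:
        memory = state["loaded_memory"]

    # Generate the instructions for the music assistant agent
    music_assistant_prompt = generate_music_assistant_prompt(memory)

    # Invoke the language model with the tools and the system prompt
    # The model decides whether to use a tool or respond directly
    response = llm_with_music_tools.invoke([SystemMessage(music_assistant_prompt)] + state["messages"])

    # Return the updated state containing the assistant's response
    return {"messages": [response]}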

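The music_assistant node builds a detailed system prompt for the LLM that includes general instructions and the loaded_memory used for personalization.

It invokes llm_with_music_tools with this system message and the current conversation messages. Depending on its reasoning, the LLM outputs either a final answer or a tool call.

The node simply returns this LLM response, and add_messages (from the state definition) automatically appends it to the messages list in the state.

With the state and nodes in place, the next step is to connect them with Edges, which define the execution flow through the graph.

Normal edges are simple: they always route from one specific node to another specific node.

Conditional edges are dynamic. They are Python functions that inspect the current state and decide which node to visit next.

For our ReAct agent we need a conditional edge that checks whether music_assistant should:

Call a tool: if the LLM decides to call a tool, route to music_tool_node to execute it.
End the flow: if the LLM produces a final response without tool calls, end the sub-agent's execution.

To handle this logic, define the should_continue function.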

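def should_continue(state: State, config: RunnableConfig):
    """
    Conditional edge function that determines the next step in the ReAct agent workflow.

    This function inspects the last message in the conversation to decide whether the agent
    should continue with tool execution or end the conversation.

    Args:
        state (State): Current state containing messages and other workflow data
        config (RunnableConfig): Configuration for the runnable execution

    Returns:
        str: "continue" to execute tools, "end" to finish the workflow
    """
    # Get all messages from the current state
    messages = state["messages"]

    # Look at the most recent message to see whether it contains tool calls
    last_message = messages[-1]

    # If the last message contains no tool calls, the agent is done
    if not last_message.tool_calls:
        return "end"
    # If there are tool calls, continue and execute them
    else:
        return "continue"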

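The should_continue function checks the last message in the state. If it contains tool_calls, the LLM wants to use a tool, so the function returns "continue".

Otherwise it returns "end", meaning the LLM has given a direct response and the sub-agent's task is complete.

We now have all the pieces: state, nodes, and edges.

We assemble them with StateGraph to build the complete ReAct agent.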

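from langgraph.graph import StateGraph, START, END
from utils import show_graph  # local helper module from the project, not a library

# Create a new StateGraph instance for the music workflow
music_workflow = StateGraph(State)

# Add nodes to the graph
# music_assistant: reasoning node that decides which tools to call or responds directly
music_workflow.add_node("music_assistant", music_assistant)
# music_tool_node: execution node that handles all music-related tool calls
music_workflow.add_node("music_tool_node", music_tool_node)

# Add edges to define the flow of the graph
# Set the entry point: every query starts at the music assistant
music_workflow.add_edge(START, "music_assistant")

# Add a conditional edge from music_assistant depending on whether tools need to be called
music_workflow.add_conditional_edges(
    "music_assistant",
    # Condition function that determines the next step
    should_continue,
    {
        # If tools need to be executed, route to the tool node
        "continue": "music_tool_node",
        # If no tools are needed, end the workflow
        "end": END,
    },
)

# After tool execution, always return to the music assistant for further processing
music_workflow.add_edge("music_tool_node", "music_assistant")

# Compile the graph with the checkpointer for short-term memory and the store for long-term memory
music_catalog_subagent = music_workflow.compile(
    name="music_catalog_subagent",
    checkpointer=checkpointer,
    store=in_memory_store
)

# Display the compiled graph structure
show_graph(music_catalog_subagent)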

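In this last step we create a StateGraph from the state we defined and add the music_assistant and music_tool_node nodes.

The graph begins at START, which leads to music_assistant. The core ReAct loop is set up by the conditional edge from music_assistant: if a tool call is detected it routes to music_tool_node, and if the response is final it routes to END.

After music_tool_node runs, an edge takes the flow back to music_assistant, which lets the LLM process the tool output and continue reasoning.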
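Stripped of the framework, the graph described above is a bounded loop: reason, check for tool calls, run them, and reason again. The following sketch mimics that control flow in plain Python with a scripted stand-in for the LLM. Every name in it (fake_assistant, run_tools, react_loop, the lookup tool) is hypothetical, and messages are simple dictionaries rather than LangChain message objects.

```python
from typing import Callable


def fake_assistant(messages: list[dict]) -> dict:
    """Stand-in for the LLM: request a tool once, then answer."""
    if not any(m["role"] == "tool" for m in messages):
        return {"role": "ai", "content": "",
                "tool_calls": [{"name": "lookup", "args": {"artist": "U2"}}]}
    return {"role": "ai", "content": "Found: " + messages[-1]["content"], "tool_calls": []}


def run_tools(message: dict, tools: dict[str, Callable[..., str]]) -> list[dict]:
    """Execute every requested tool call and wrap the results as tool messages."""
    return [
        {"role": "tool", "content": tools[call["name"]](**call["args"])}
        for call in message["tool_calls"]
    ]


def react_loop(question: str, tools: dict[str, Callable[..., str]], max_steps: int = 5) -> list[dict]:
    messages = [{"role": "human", "content": question}]
    for _ in range(max_steps):                    # bounded, similar in spirit to remaining_steps
        reply = fake_assistant(messages)          # the "music_assistant" node
        messages.append(reply)
        if not reply["tool_calls"]:               # should_continue returns "end"
            break
        messages.extend(run_tools(reply, tools))  # the "music_tool_node", then loop back
    return messages


history = react_loop("Any U2 albums?", {"lookup": lambda artist: f"{artist}: War"})
roles = [m["role"] for m in history]
```

The resulting history alternates human, AI (tool request), tool result, and final AI answer, which is the same shape as the message traces printed in the tests that follow.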

Testing the First Sub-Agent

Let's test the first sub-agent:

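import uuid

# Generate a unique thread ID for this conversation session
thread_id = uuid.uuid4()

# Define the user's question about music recommendations
question = "I like the Rolling Stones. What songs do you recommend by them or by other artists that I might like?"

# Set up the configuration with the thread ID to maintain conversation context
config = {"configurable": {"thread_id": thread_id}}

# Invoke the music catalog sub-agent with the user's question
# The agent will use its tools to search for Rolling Stones music and provide recommendations
result = music_catalog_subagent.invoke({"messages": [HumanMessage(content=question)]}, config=config)

# Display all messages from the conversation in a formatted way
for message in result["messages"]:
    message.pretty_print()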

We give the conversation a unique thread_id and ask for music recommendations similar to the Rolling Stones. Let's see which tool the AI agent uses to respond.

======= Human Message ======  

I like the Rolling Stones. What songs do you recommend by them or by  
other artists that I might like?  

======= Ai Message ======  

Tool Calls:  
  get_tracks_by_artist (chatcmpl-tool-012bac57d6af46ddaad8e8971cca2bf7)  
 Call ID: chatcmpl-tool-012bac57d6af46ddaad8e8971cca2bf7  
  Args:  
    artist: The Rolling Stones

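Given the human message as the query, the agent responds with the appropriate tool, get_tracks_by_artist, which looks up tracks for the artist named in our query so that the agent can base its recommendations on them.

Now that the first sub-agent is in place, let's create the second one.

Invoice Information Sub-Agent

Building a ReAct agent from scratch is very useful for understanding the fundamentals, but LangGraph also offers a prebuilt library for common architectures.

It lets you set up standard patterns such as ReAct quickly, without defining every node and edge by hand. The full list of prebuilt components is available in the LangGraph documentation.

As before, we first define the specific tools and the prompt for invoice_information_subagent. These tools interact with the Chinook database to retrieve invoice details.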

from langchain_core.tools import tool

@tool
def get_invoices_by_customer_sorted_by_date(customer_id: str) -> str:
    """
    Look up all invoices for a customer using their customer ID.
    The invoices are sorted by invoice date in descending order, which helps when the customer
    wants to see their most recent/oldest invoice, or invoices within a specific date range.

    Args:
        customer_id (str): customer_id, which serves as the identifier.

    Returns:
        str: Database query result with the customer's invoices.
    """
    return db.run(f"SELECT * FROM Invoice WHERE CustomerId = {customer_id} ORDER BY InvoiceDate DESC;")

@tool
def get_invoices_sorted_by_unit_price(customer_id: str) -> str:
    """
    Use this tool when the customer wants to know the details of one of their invoices
    based on the unit price/cost of the invoice.
    This tool looks up all invoices for a customer and sorts them by unit price from highest
    to lowest. To find the invoices associated with the customer, we need the customer ID.

    Args:
        customer_id (str): customer_id, which serves as the identifier.

    Returns:
        str: Database query result with the invoices sorted by unit price.
    """
    query = f"""
        SELECT Invoice.*, InvoiceLine.UnitPrice
        FROM Invoice
        JOIN InvoiceLine ON Invoice.InvoiceId = InvoiceLine.InvoiceId
        WHERE Invoice.CustomerId = {customer_id}
        ORDER BY InvoiceLine.UnitPrice DESC;
    """
    return db.run(query)

@tool
def get_employee_by_invoice_and_customer(invoice_id: str, customer_id: str) -> str:
    """
    This tool takes an invoice ID and a customer ID and returns the employee
    information associated with the invoice.

    Args:
        invoice_id (str): The ID of the specific invoice.
        customer_id (str): customer_id, which serves as the identifier.

    Returns:
        str: The employee information associated with the invoice, or a message if none is found.
    """
    query = f"""
        SELECT Employee.FirstName, Employee.Title, Employee.Email
        FROM Employee
        JOIN Customer ON Customer.SupportRepId = Employee.EmployeeId
        JOIN Invoice ON Invoice.CustomerId = Customer.CustomerId
        WHERE Invoice.InvoiceId = ({invoice_id}) AND Invoice.CustomerId = ({customer_id});
    """

    employee_info = db.run(query, include_columns=True)

    if not employee_info:
        return f"No employee found for invoice ID {invoice_id} and customer identifier {customer_id}."
    return employee_info

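For invoice handling we define three specialized tools:

get_invoices_by_customer_sorted_by_date: retrieves all invoices of a customer, sorted by date
get_invoices_sorted_by_unit_price: retrieves invoices sorted by the unit price of the items on them
get_employee_by_invoice_and_customer: finds the support employee associated with a specific invoice

As before, we collect all of these tools in a list.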

# Create a list of all invoice-related tools for the agent
invoice_tools = [get_invoices_by_customer_sorted_by_date, get_invoices_sorted_by_unit_price, get_employee_by_invoice_and_customer]

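The prompt that guides the invoice sub-agent's behavior:

invoice_subagent_prompt = """
    You are a subagent among a team of assistants. You are specialized for retrieving and processing invoice information. You are routed for invoice-related portions of the questions, so only respond to them.

    You have access to three tools. These tools enable you to retrieve and process invoice information from the database. Here are the tools:
    - get_invoices_by_customer_sorted_by_date: This tool retrieves all invoices for a customer, sorted by invoice date.
    - get_invoices_sorted_by_unit_price: This tool retrieves all invoices for a customer, sorted by unit price.
    - get_employee_by_invoice_and_customer: This tool retrieves the employee information associated with an invoice and a customer.

    If you are unable to retrieve the invoice information, inform the customer that you are unable to retrieve it, and ask if they would like to search for something else.

    CORE RESPONSIBILITIES:
    - Retrieve and process invoice information from the database
    - Provide detailed information about invoices when the customer asks for it, including customer details, invoice dates, total amounts, employees associated with the invoice, etc.
    - Always maintain a professional, friendly, and patient demeanor

    You may have additional context that you should use to help answer the customer's query. It will be provided to you below:
    """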

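This prompt outlines the sub-agent's role, its available tools, its core responsibilities, and how to handle the case where information cannot be found.

Such targeted instructions help the LLM act effectively within its area of expertise.

Instead of creating the nodes and conditional edges of the ReAct pattern by hand, as we did for the previous sub-agent, we now use LangGraph's prebuilt create_react_agent function.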

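from langgraph.prebuilt import create_react_agent

# Create the invoice information sub-agent with LangGraph's prebuilt ReAct agent
# This agent specializes in customer invoice queries and billing information
invoice_information_subagent = create_react_agent(
    llm,                           # Language model used for reasoning and responses
    tools=invoice_tools,           # Invoice-specific tools for database queries
    name="invoice_information_subagent",  # Unique identifier of the agent
    prompt=invoice_subagent_prompt,       # System instructions for invoice handling
    state_schema=State,            # State schema for data flow between nodes
    checkpointer=checkpointer,     # Short-term memory for conversation context
    store=in_memory_store          # Long-term memory store for persistent data
)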

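The create_react_agent function takes llm, invoice_tools, the agent's name (which matters for multi-agent routing), the prompt we just defined, and our custom State schema, and it wires in the checkpointer and the memory store.

With only a few lines of code we have a fully functional ReAct agent, which is one of the advantages of using LangGraph.

Testing the Second Sub-Agent

Let's test the new invoice_information_subagent to make sure it works as expected. We give it a query that requires fetching both invoice and employee information.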

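# Generate a unique thread ID for this conversation session
thread_id = uuid.uuid4()

# Define the user's question about their most recent invoice and the employee who helped
question = "My customer id is 1. What was my most recent invoice, and who was the employee that helped me with it?"

# Set up the configuration with the thread ID to maintain conversation context
config = {"configurable": {"thread_id": thread_id}}

# Invoke the invoice information sub-agent with the user's question
# The agent will use its tools to look up invoice information and employee details
result = invoice_information_subagent.invoke({"messages": [HumanMessage(content=question)]}, config=config)

# Display all messages from the conversation in a formatted way
for message in result["messages"]:
    message.pretty_print()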

In essence, we are asking about the invoices of the customer with ID 1. Let's see which tools get called.

======= Human Message ======  

My customer id is 1. What was my most recent invoice, and who  
was the employee that helped me with it?  

======= Ai Message ======  

Name: invoice_information_subagent  

Your most recent purchase was on '2025-08-07 00:00:00' and the total amount was $8.91. Unfortunately, I am unable to provide information about U2 albums as it is not related to invoice information. Would you like to search for something else?  
======= Ai Message ======
Name: invoice_information_subagent

Transferring back to supervisor
Tool Calls:
  transfer_back_to_supervisor (9f3d9fce-0f11-43c0-88c4-adcd459a30a0)
 Call ID: 9f3d9fce-0f11-43c0-88c4-adcd459a30a0
  Args:
======= Tool Message ======
Name: transfer_back_to_supervisor

Successfully transferred back to supervisor
======= Ai Message ======
Name: supervisor
Tool Calls:
  transfer_to_music_catalog_information_subagent (chatcmpl-tool-72475cf0c17f404583145912fca0b718)
 Call ID: chatcmpl-tool-72475cf0c17f404583145912fca0b718
  Args:
======= Tool Message ======
Name: transfer_to_music_catalog_information_subagent

Error: transfer_to_music_catalog_information_subagent is not a valid tool, try one of [transfer_to_music_catalog_subagent, transfer_to_invoice_information_subagent].
======= Ai Message ======
Name: supervisor
Tool Calls:
  transfer_to_music_catalog_subagent (chatcmpl-tool-71cc764428ff4efeb0ba7bf24b64a6ec)
 Call ID: chatcmpl-tool-71cc764428ff4efeb0ba7bf24b64a6ec
  Args:
======= Tool Message ======
Name: transfer_to_music_catalog_subagent

Successfully transferred to music_catalog_subagent
======= Ai Message ======

U2 has the following albums in our catalog:
1. Achtung Baby
2. All That You Can't Leave Behind
3. B-Sides 1980-1990
4. How To Dismantle An Atomic Bomb
5. Pop
6. Rattle And Hum
7. The Best Of 1980-1990
8. War
9. Zooropa
10. Instant Karma: The Amnesty International Campaign to Save Darfur

Would you like to explore more music or is there something else I can help you with?
======= Ai Message ======
Name: music_catalog_subagent

Transferring back to supervisor
Tool Calls:
  transfer_back_to_supervisor (4739ce04-dd11-47c8-b35a-9e4fca21b0c1)
 Call ID: 4739ce04-dd11-47c8-b35a-9e4fca21b0c1
  Args:
======= Tool Message ======
Name: transfer_back_to_supervisor

Successfully transferred back to supervisor
======= Ai Message ======
Name: supervisor  

I hope this information helps you with your inquiry. Is there anything else I can help you with?

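The trace above shows our multi-agent system holding a fairly detailed exchange with the user, including hand-offs between a supervisor and both sub-agents. Let's walk through it.

In this example the user asks a question that involves both invoice details and music catalog data. Here is what happens:

The supervisor receives the query.
It detects the invoice-related part ("most recent purchase") and sends it to invoice_information_subagent.
The invoice sub-agent handles that part and fetches the invoice, but it cannot answer the question about U2 albums, so it hands control back to the supervisor.
The supervisor then routes the remaining music query to music_catalog_subagent. Its first attempt uses a transfer tool name that does not exist; the error message lists the valid tools and the supervisor retries with the correct one.
The music sub-agent retrieves the U2 album information and returns control to the supervisor.
The supervisor wraps up, having coordinated both sub-agents to fully answer the user's multi-part question.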

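Adding Human-in-the-Loop

So far we have built a multi-agent system that routes customer queries to specialized sub-agents. In a real customer support scenario, however, the customer_id is not always readily available.

Before the agent is allowed to access sensitive information such as invoice history, we usually need to verify the customer's identity.

In this step we strengthen the workflow with a customer verification layer. It involves a human-in-the-loop component: if the customer's account information is missing or unverified, the system pauses and prompts the customer to provide it.

To implement this, we introduce two new nodes:

The verify_info node tries to extract a customer identifier (ID, email, or phone number) from the user's input and verify it against our database.
If verification fails, the human_input node is triggered. It pauses the graph and prompts the user for the missing information. This is easy to handle with LangGraph's interrupt() function.

We define a Pydantic schema for parsing the user input and a system prompt that lets the LLM extract this information reliably.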

from pydantic import BaseModel, Field

class UserInput(BaseModel):
    """Schema for parsing the account information provided by the user."""
    identifier: str = Field(description="Identifier, which can be a customer ID, email, or phone number.")

# Create a structured LLM whose output conforms to the UserInput schema
structured_llm = llm.with_structured_output(schema=UserInput)

# System prompt for extracting the customer identifier
structured_system_prompt = """You are a customer service representative responsible for extracting customer identifier.
Only extract the customer's account information from the message history.
If they haven't provided the information yet, return an empty string for the identifier."""

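The UserInput Pydantic model defines the expected data as a single identifier.

with_structured_output() makes the LLM return output that conforms to this schema, and the system prompt keeps the LLM focused on extracting only the identifier.

Next we need a helper function that takes the extracted identifier (a customer ID, phone number, or email) and looks it up in the Chinook database to retrieve the actual customer_id.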

from typing import Optional

# Helper function for customer identification
def get_customer_id_from_identifier(identifier: str) -> Optional[int]:
    """
    Retrieve the customer ID using an identifier, which can be a customer ID, email, or phone number.

    This function supports three types of identifier:
    1. Direct customer ID (numeric string)
    2. Phone number (starts with "+")
    3. Email address (contains "@")

    Args:
        identifier (str): The identifier, which can be a customer ID, email, or phone number.

    Returns:
        Optional[int]: The CustomerId if found, otherwise None.
    """
    # Check whether the identifier is a direct customer ID (numeric)
    if identifier.isdigit():
        return int(identifier)

    # Check whether the identifier is a phone number (starts with "+")
    elif identifier.startswith("+"):
        query = f"SELECT CustomerId FROM Customer WHERE Phone = '{identifier}';"
        result = db.run(query)
        formatted_result = ast.literal_eval(result)
        if formatted_result:
            return formatted_result[0][0]

    # Check whether the identifier is an email address (contains "@")
    elif "@" in identifier:
        query = f"SELECT CustomerId FROM Customer WHERE Email = '{identifier}';"
        result = db.run(query)
        formatted_result = ast.literal_eval(result)
        if formatted_result:
            return formatted_result[0][0]

    # Return None if no match was found
    return None

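This utility function works out which kind of identifier it was given (a direct customer ID, a phone number, or an email address) and, for phone numbers and emails, queries the database for the corresponding numeric customer ID. A purely numeric identifier is returned as the customer ID without a database lookup.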
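The branching in that helper can be isolated into a small, database-free function, which makes the rules easy to test on their own. The sketch below is an illustration under that assumption: classify_identifier is a hypothetical name, and the email address is a made-up example.

```python
def classify_identifier(identifier: str) -> str:
    """Return 'customer_id', 'phone', 'email', or 'unknown' for a raw identifier."""
    cleaned = identifier.strip()
    if cleaned.isdigit():
        return "customer_id"
    if cleaned.startswith("+"):  # safe on empty strings, unlike indexing with [0]
        return "phone"
    if "@" in cleaned:
        return "email"
    return "unknown"


samples = ["1", "+55 (12) 3923-5555", "someone@example.com", ""]
kinds = [classify_identifier(sample) for sample in samples]
```

Note that the phone branch only checks the leading "+"; a number typed without a country code would fall through to "unknown" and the user would be asked again.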

Next, define the verify_info node, which orchestrates the whole identifier extraction and verification process:

def verify_info(state: State, config: RunnableConfig):
    """
    Verify the customer's account information by parsing their input and matching it against the database.

    This node handles the first step of the customer support process: customer identity verification.
    It extracts a customer identifier (ID, email, or phone) from the user's message and validates it
    against the database.

    Args:
        state (State): Current state containing messages and possibly a customer_id
        config (RunnableConfig): Configuration for the runnable execution

    Returns:
        dict: Updated state with customer_id if verified, or a request for more information
    """
    # Only verify when customer_id has not been set yet
    if state.get("customer_id") is None:
        # System instructions used to prompt the customer for verification
        system_instructions = """You are a music store agent, where you are trying to verify the customer identity
        as the first step of the customer support process.
        Only extract the customer's account information from the message history.
        If they haven't provided the information yet, return an empty string for the identifier.
        If they have provided the identifier but cannot be found, please ask them to revise it."""

        # Get the most recent user message
        user_input = state["messages"][-1]

        # Use the structured LLM to parse the customer identifier from the message
        parsed_info = structured_llm.invoke([SystemMessage(content=structured_system_prompt)] + [user_input])

        # Extract the identifier from the parsed response
        identifier = parsed_info.identifier

        # Start with no customer ID; the lookup returns None when nothing matches
        customer_id = None

        # Try to look up the customer ID with the provided identifier
        if identifier:
            customer_id = get_customer_id_from_identifier(identifier)

        # If the customer was found, confirm verification and set customer_id in the state
        if customer_id is not None:
            intent_message = SystemMessage(
                content=f"Thank you for providing your information! I was able to verify your account with customer id {customer_id}."
            )
            return {
                "customer_id": customer_id,
                "messages": [intent_message]
            }
        else:
            # If the customer was not found, ask for the correct information
            response = llm.invoke([SystemMessage(content=system_instructions)] + state["messages"])
            return {"messages": [response]}

    else:
        # Customer is already verified, nothing to do
        pass

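The verify_info node first checks whether a customer_id already exists in the state. If not, it uses structured_llm to extract an identifier from the user's input and validates it with get_customer_id_from_identifier. When verification succeeds, it updates the state and adds a confirmation message; when it fails, it uses the main LLM with the system instructions to politely ask the user for the information.

Now create the human_input node, the key component of the human-in-the-loop mechanism: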

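from langgraph.types import interrupt

def human_input(state: State, config: RunnableConfig):
    """
    Human-in-the-loop node that requests user input by interrupting the workflow.

    This node creates an interruption point in the workflow, allowing the system to pause
    and wait for human input before continuing. It is typically used for customer
    verification or whenever additional information is needed.

    Args:
        state (State): Current state containing messages and workflow data
        config (RunnableConfig): Configuration for the runnable execution

    Returns:
        dict: Updated state containing the user's input message
    """
    # Interrupt the workflow and prompt the user for input
    user_input = interrupt("Please provide input.")

    # Return the user input as a new message in the state
    return {"messages": [user_input]}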

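The interrupt() function is a powerful LangGraph feature. When it is executed, it pauses the graph and signals that human input is required. To resume the graph, a later invocation has to handle this interrupt by supplying the new input.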
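As a mental model only, pausing for input and resuming later resembles a Python generator: yield hands a prompt to the caller and suspends, and send() delivers the answer and continues. The sketch below uses that analogy; it is not how LangGraph implements interrupt() (which relies on the checkpointer and, as far as I understand, re-runs the interrupted node from its beginning on resume), and verification_flow is a hypothetical function.

```python
def verification_flow():
    """Keep pausing for input until a digit-only customer ID is supplied."""
    customer_id = None
    while customer_id is None:
        # Execution stops here and hands a prompt to the caller.
        answer = yield "Please provide your customer ID."
        if answer.strip().isdigit():
            customer_id = int(answer)
    return customer_id


flow = verification_flow()
first_prompt = next(flow)              # run until the first pause
second_prompt = flow.send("not sure")  # resume with invalid input, pauses again
try:
    flow.send("1")                     # resume with valid input, the flow finishes
    verified_id = None
except StopIteration as finished:
    verified_id = finished.value       # the generator's return value
```

The caller decides when to resume and with what value, which is the same division of responsibility that Command(resume=...) gives the code invoking the graph.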

Now we need the conditional edge function should_interrupt, which decides whether human input is required:

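def should_interrupt(state: State, config: RunnableConfig):
    """
    Conditional function that determines whether the workflow should interrupt and ask for human input.

    If customer_id is present in the state (meaning verification is complete),
    the workflow continues. Otherwise it interrupts to get human input for verification.
    """
    if state.get("customer_id") is not None:
        return "continue"  # Customer ID verified, continue to the next step (supervisor)
    else:
        return "interrupt"  # Customer ID not verified, interrupt to get human input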

Now integrate these new nodes and edges into the overall graph:

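# Create a new StateGraph instance for the multi-agent workflow with verification
multi_agent_verify = StateGraph(State)

# Add the new nodes for customer verification and human interaction
multi_agent_verify.add_node("verify_info", verify_info)
multi_agent_verify.add_node("human_input", human_input)
# Add the existing supervisor agent as a node
multi_agent_verify.add_node("supervisor", supervisor_prebuilt)

# Define the entry point of the graph: always start with information verification
multi_agent_verify.add_edge(START, "verify_info")

# Add a conditional edge from verify_info to decide whether to continue or interrupt
multi_agent_verify.add_conditional_edges(
    "verify_info",
    should_interrupt,  # Checks whether customer_id has been verified
    {
        "continue": "supervisor",  # If verified, continue to the supervisor
        "interrupt": "human_input",  # If not verified, interrupt to get human input
    },
)
# After human input, always loop back to verify_info to retry verification
multi_agent_verify.add_edge("human_input", "verify_info")
# Once the supervisor has finished its task, the workflow ends
multi_agent_verify.add_edge("supervisor", END)

# Compile the complete graph with the checkpointer and the long-term memory store
multi_agent_verify_graph = multi_agent_verify.compile(
    name="multi_agent_verify",
    checkpointer=checkpointer,
    store=in_memory_store
)

# Display the updated graph structure
show_graph(multi_agent_verify_graph)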

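The new graph starts at verify_info. If verification succeeds, it moves on to supervisor; if it fails, it routes to human_input, which interrupts the flow and waits for user input. Once input is provided, it loops back to verify_info to retry verification. supervisor is the last processing step before END.

Testing the Human-in-the-Loop Mechanism

Let's test the human-in-the-loop feature. First we ask a question without providing any identification: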

thread_id = uuid.uuid4()  
question = "How much was my most recent purchase?"  
config = {"configurable": {"thread_id": thread_id}}  

result = multi_agent_verify_graph.invoke({"messages": [HumanMessage(content=question)]}, config=config)  
for message in result["messages"]:  
    message.pretty_print()  

### OUTPUT ###  
======== Human Message =======  

How much was my most recent purchase?  

======== Ai Message ==========  

Before I can look up your most recent purchase,  
I need to verify your identity. Could you please provide your  
customer ID, email, or phone number associated with your account?  
This will help me to access your information and assist you  
with your query.

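As expected, the agent interrupts and asks for a customer ID, email, or phone number, because customer_id is initially None in the state.

Now we use LangGraph's Command(resume=...) to resume the conversation from the interrupt and supply the requested information: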

from langgraph.types import Command  

# Resume from the interrupt, providing a phone number for verification
question = "My phone number is +55 (12) 3923-5555."  
result = multi_agent_verify_graph.invoke(Command(resume=question), config=config)  
for message in result["messages"]:  
    message.pretty_print()  

### OUTPUT ###  
======= Human Message =========  

How much was my most recent purchase?  

=========== Ai Message =======  
Before I can look up your most recent purchase, I need to verify your identity. Could you please provide your customer ID, email, or phone number associated with your account? This will help me to access your information and assist you with your query.  

========== Human Message ===========  

My phone number is +55 (12) 3923-5555.  

============ System Message =======  

Thank you for providing your information! I was able to verify your account with customer id 1.  

========== Ai Message ==========  
Name: supervisor  

{"type": "function", "function": {"name": "transfer_to_invoice_information_subagent", "parameters": {}}}

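After the user supplies a phone number, the verify_info node resolves the customer_id (in the Chinook database, this number belongs to customer_id 1). The system confirms the verification and, following the flow defined in the graph, hands control to supervisor, which then routes the original query.

This confirms that the human-in-the-loop verification works as intended.

A key advantage of LangGraph's state management is that once customer_id has been verified and saved in the state, it persists for the rest of the conversation. The agent therefore does not ask for verification again on follow-up questions in the same thread.

To test this persistence, ask a follow-up question without providing the ID again: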

question = "What albums do you have by the Rolling Stones?"  
result = multi_agent_verify_graph.invoke({"messages": [HumanMessage(content=question)]}, config=config)  
for message in result["messages"]:  
    message.pretty_print()  

### OUTPUT ###  
=== Human Message ===  
How much was my most recent purchase?  

=== Ai Message ===  
Before I can look up your most recent purchase, I need to verify your identity. Could you please provide your customer ID, email, or phone number associated with your account?  

=== Human Message ===  
My phone number is +55 (12) 3923-5555.  

=== System Message ===  
Thank you for providing your information! I was able to verify your account with customer id 1.  

=== Ai Message ===  
Name: supervisor  
{"type": "function", "function": {"name": "transfer_to_invoice_information_subagent", "parameters": {}}}  

=== Human Message ===  
What albums do you have by the Rolling Stones?  

=== Ai Message ===  
Name: supervisor  
{"type": "function", "function": {"name": "transfer_to_music_catalog_subagent", "parameters": {}}}

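The verify_info node does not prompt for identification again. Because state.get("customer_id") is already set to 1, the flow moves straight to supervisor, which routes the query to music_catalog_subagent. This shows how the state preserves context and avoids repeated steps, which improves the user experience.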
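
To see why the second question skips verification, it can help to picture the checkpointer as a store of state snapshots keyed by thread_id. The toy class below is only a mental model under that assumption; ToyCheckpointer and its methods are hypothetical and far simpler than LangGraph's real checkpointers.

```python
import uuid
from typing import Any, Dict


class ToyCheckpointer:
    """Hypothetical stand-in: keeps the latest state dict per thread_id."""

    def __init__(self) -> None:
        self._snapshots: Dict[str, Dict[str, Any]] = {}

    def load(self, thread_id: str) -> Dict[str, Any]:
        # A thread that has never run starts with an unverified state.
        return dict(self._snapshots.get(thread_id, {"customer_id": None}))

    def save(self, thread_id: str, state: Dict[str, Any]) -> None:
        self._snapshots[thread_id] = dict(state)


def needs_verification(cp: ToyCheckpointer, thread_id: str) -> bool:
    """True when the thread's saved state has no verified customer_id."""
    return cp.load(thread_id)["customer_id"] is None


toy_checkpointer = ToyCheckpointer()
thread_a = str(uuid.uuid4())
thread_b = str(uuid.uuid4())

# First turn on thread A: verification succeeds and the state is saved.
toy_state = toy_checkpointer.load(thread_a)
toy_state["customer_id"] = "1"
toy_checkpointer.save(thread_a, toy_state)

print(needs_verification(toy_checkpointer, thread_a))  # False: same thread, already verified
print(needs_verification(toy_checkpointer, thread_b))  # True: a new thread starts from scratch
```

The practical consequence is the same as in the article: reuse the thread_id to continue a verified conversation, and generate a new one to start over.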

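Integrating the Long-Term Memory System

We already initialized an InMemoryStore for long-term memory in the "Short-Term and Long-Term Memory" section. Now we integrate it fully into the multi-agent workflow. Long-term memory lets the agent recall and use information from past conversations, so interactions become more personalized and context-aware over time.

We add two new nodes to handle long-term memory: load_memory retrieves the user's existing preferences from in_memory_store at the start of the conversation (after verification), and create_memory saves any new music interests the user shares during the conversation to in_memory_store for future use.

First, define a helper function that formats the user's stored music preferences as a readable string so it can be injected into the LLM prompt: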

from langgraph.store.base import BaseStore  

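# Helper that formats the user's stored memory for use in an LLM prompt
def format_user_memory(user_data):
    """Format the user's music preferences, if available."""
    # The 'memory' key holds the UserProfile object
    profile = user_data['memory']
    result = ""
    # Check that the music_preferences attribute exists and is non-empty
    if hasattr(profile, 'music_preferences') and profile.music_preferences:
        result += f"Music Preferences: {', '.join(profile.music_preferences)}"
    return result.strip()

# Node: load_memory
def load_memory(state: State, config: RunnableConfig, store: BaseStore):
    """
    Load the given user's music preferences from the long-term memory store.

    This node fetches previously saved preferences to give the current
    conversation context, enabling personalized responses.
    """
    # Get user_id from the configurable section of the config
    # (in our evaluation setup we may pass user_id through the config).
    # Fall back to customer_id from the state; str() keeps the namespace
    # identical to the one create_memory writes to.
    user_id = str(config["configurable"].get("user_id", state["customer_id"]))

    # Define the namespace and key used to access memory in the store
    namespace = ("memory_profile", user_id)
    key = "user_memory"

    # Retrieve the user's existing memory
    existing_memory = store.get(namespace, key)
    formatted_memory = ""

    # If a memory was found and has content, format it
    if existing_memory and existing_memory.value:
        formatted_memory = format_user_memory(existing_memory.value)

    # Update the state with the loaded, formatted memory
    return {"loaded_memory": formatted_memory}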

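The load_memory node uses user_id (from the config or the state) to build the namespace key and fetches the existing user_memory from in_memory_store. It formats that memory and updates the loaded_memory field in the state. This memory is then included in the music_assistant prompt, as set up in generate_music_assistant_prompt.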
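
One detail worth checking is that the namespace is a tuple, so the type of user_id matters: ("memory_profile", 1) and ("memory_profile", "1") are different keys. The sketch below uses a plain dict as a hypothetical stand-in for the store (it is not the InMemoryStore API) to show why the node that writes memory and the node that reads it should normalize the ID the same way, for example with str().

```python
from typing import Any, Dict, Optional, Tuple

# Hypothetical stand-in for a store addressed by (namespace, key).
ToyStore = Dict[Tuple[Tuple[str, str], str], Any]


def make_namespace(user_id: Any) -> Tuple[str, str]:
    """Build the memory namespace, normalizing the ID to a string."""
    return ("memory_profile", str(user_id))


def toy_put(store: ToyStore, user_id: Any, value: Any) -> None:
    store[(make_namespace(user_id), "user_memory")] = value


def toy_get(store: ToyStore, user_id: Any) -> Optional[Any]:
    return store.get((make_namespace(user_id), "user_memory"))


toy_store: ToyStore = {}
toy_put(toy_store, 1, {"music_preferences": ["Rolling Stones"]})  # written with an int ID

# Without normalization the two namespaces would not match.
print(("memory_profile", 1) == ("memory_profile", "1"))  # False

# With normalization, reading with a str ID finds the same record.
print(toy_get(toy_store, "1"))  # {'music_preferences': ['Rolling Stones']}
```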

Next, we need a Pydantic schema that structures the user profile saved to memory:

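from typing import List
from pydantic import BaseModel, Field

# Pydantic model defining the structure of the user profile kept in the memory store
class UserProfile(BaseModel):
    customer_id: str = Field(
        description="The customer ID of the customer"
    )
    music_preferences: List[str] = Field(
        description="The music preferences of the customer"
    )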

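Now define the create_memory node. It uses an LLM-as-a-judge pattern to analyze the conversation history and the existing memory, then updates the UserProfile with any newly identified music interests: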

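# Prompt for the create_memory agent, instructing it how to update the user's memory
create_memory_prompt = """You are an expert analyst observing a conversation between a customer and a customer support assistant. The customer support assistant works for a digital music store and uses a multi-agent team to answer the customer's requests.
Your task is to analyze the conversation between the customer and the customer support assistant and update the memory profile associated with the customer. The memory profile may be empty. If it is empty, you should create a new memory profile for the customer.

You specifically care about saving any music interests the customer has shared, particularly their music preferences, to their memory profile.

To help you with this task, I have attached the conversation between the customer and the customer support assistant, as well as the existing memory profile associated with the customer, which you should update or create based on the conversation.

The customer's memory profile should contain the following fields:
- customer_id: the customer ID of the customer
- music_preferences: the music preferences of the customer

These are the fields you should track and update in the memory profile. If there is no new information, you should not update the memory profile. It is completely fine if you have no new information to update the memory profile with. In that case, just leave the existing values unchanged.

*IMPORTANT INFORMATION BELOW*

The conversation between the customer and the customer support assistant that you should analyze is as follows:
{conversation}

The existing memory profile associated with the customer, which you should update or create based on the conversation, is as follows:
{memory_profile}

Make sure your response is an object with the following fields:
- customer_id: the customer ID of the customer
- music_preferences: the music preferences of the customer

For each key in the object, if there is no new information, do not update the value; just keep the value that is already there. If there is new information, update the value.

Take a deep breath and think carefully before responding.
"""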

# Node: create_memory
def create_memory(state: State, config: RunnableConfig, store: BaseStore):
    """
    Analyze the conversation history and update the user's long-term memory profile.

    This node extracts new music preferences the customer shared during the
    conversation and persists them in the InMemoryStore for future interactions.
    """
    # Get user_id from the configurable section of the config, or from the state
    user_id = str(config["configurable"].get("user_id", state["customer_id"]))

    # Define the namespace and key for the memory profile
    namespace = ("memory_profile", user_id)
    key = "user_memory"

    # Retrieve the user's existing memory profile
    existing_memory = store.get(namespace, key)

    # Format the existing memory for the LLM prompt
    formatted_memory = ""
    if existing_memory and existing_memory.value:
        # The stored value has the shape {"memory": UserProfile} (see store.put
        # below), so reuse format_user_memory rather than looking for a
        # top-level 'music_preferences' key, which would never be present
        formatted_memory = format_user_memory(existing_memory.value)

    # Prepare the system message that asks the LLM to update the memory
    formatted_system_message = SystemMessage(content=create_memory_prompt.format(
        conversation=state["messages"],
        memory_profile=formatted_memory
    ))

    # Call the LLM with the UserProfile schema to get the updated memory as structured output
    updated_memory = llm.with_structured_output(UserProfile).invoke([formatted_system_message])

    # Store the updated memory profile
    store.put(namespace, key, {"memory": updated_memory})

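The create_memory node retrieves the current user memory from the store, formats it, and sends it to the LLM together with the full conversation (state["messages"]). The LLM extracts any new music preferences into a UserProfile object, merging them with the existing data. The updated memory is then saved back to in_memory_store with store.put().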
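
In the article, merging old and new preferences is delegated entirely to the LLM. If you wanted a deterministic safety net, so that a model response can never drop previously stored preferences, one option is to merge the lists in code before calling store.put(). The helper below is a hypothetical addition, not part of the original workflow, and it assumes case-insensitive matching is an acceptable way to detect duplicates.

```python
from typing import Iterable, List


def merge_preferences(existing: Iterable[str], new: Iterable[str]) -> List[str]:
    """Union two preference lists, keeping first-seen order and spelling."""
    merged: List[str] = []
    seen = set()
    for item in list(existing) + list(new):
        cleaned = item.strip()
        key = cleaned.casefold()
        # Skip blanks and anything already recorded (ignoring case).
        if cleaned and key not in seen:
            seen.add(key)
            merged.append(cleaned)
    return merged


print(merge_preferences(["Rolling Stones"], ["rolling stones", "Pink Floyd"]))
# ['Rolling Stones', 'Pink Floyd']
```

The trade-off is that a purely additive merge can never remove a preference, so it only fits if stored preferences are meant to accumulate.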

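Now wire the memory nodes into the graph: load_memory runs right after verification to load the user's preferences, and create_memory runs just before the graph ends to save any updates. This ensures memory is loaded at the start of each interaction and saved at the end: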

multi_agent_final = StateGraph(State)  

# Add all existing nodes and the new memory nodes to the graph
multi_agent_final.add_node("verify_info", verify_info)
multi_agent_final.add_node("human_input", human_input)
multi_agent_final.add_node("load_memory", load_memory)
multi_agent_final.add_node("supervisor", supervisor_prebuilt)  # our supervisor agent
multi_agent_final.add_node("create_memory", create_memory)

# Define the graph's entry point: always start with information verification
multi_agent_final.add_edge(START, "verify_info")

# Conditional routing after verification: interrupt if needed, otherwise load memory
multi_agent_final.add_conditional_edges(
    "verify_info",
    should_interrupt,  # checks whether customer_id has been verified
    {
        "continue": "load_memory",   # verified: go on to load long-term memory
        "interrupt": "human_input",  # not verified: interrupt for human input
    },
)
# After human input, always loop back to verify_info
multi_agent_final.add_edge("human_input", "verify_info")
# After loading memory, hand control to the supervisor
multi_agent_final.add_edge("load_memory", "supervisor")
# After the supervisor finishes, save any new memory
multi_agent_final.add_edge("supervisor", "create_memory")
# After creating/updating memory, the workflow ends
multi_agent_final.add_edge("create_memory", END)

# Compile the final graph with all components
multi_agent_final_graph = multi_agent_final.compile(
    name="multi_agent_verify",
    checkpointer=checkpointer,
    store=in_memory_store
)

# Display the complete graph structure
show_graph(multi_agent_final_graph)

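The rendered graph shows the complete agent architecture with long-term memory integrated: START -> verify_info (looping through human_input if needed) -> load_memory -> supervisor (which coordinates the sub-agents internally) -> create_memory -> END. This architecture combines identity verification, multi-agent routing, and long-term personalization.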

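Testing the Multi-Agent System with Long-Term Memory

To test the fully integrated graph, we send a compound query that includes an identifier for verification and a music preference to be saved: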

thread_id = uuid.uuid4()  

question = "My phone number is +55 (12) 3923-5555. How much was my most recent purchase? What albums do you have by the Rolling Stones?"  
config = {"configurable": {"thread_id": thread_id}}  

result = multi_agent_final_graph.invoke({"messages": [HumanMessage(content=question)]}, config=config)  
for message in result["messages"]:  
    message.pretty_print()

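This interaction exercises the whole flow. Verification: verify_info extracts the phone number, resolves customer_id = 1, and updates the state. Memory loading: load_memory runs next; since this is probably the first session, it finds no stored preferences. Supervisor routing: the supervisor routes the query to invoice_information_subagent and music_catalog_subagent as needed. Memory creation: after the response about the Rolling Stones, create_memory analyzes the conversation, identifies the artist as a new preference, and saves it to in_memory_store for customer_id = 1.

We can query in_memory_store directly to check that the music preference was saved: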

user_id = "1"  # assumes customer ID 1 was used in the previous interaction
namespace = ("memory_profile", user_id)
memory = in_memory_store.get(namespace, "user_memory")

# Access the UserProfile object stored under the "memory" key
saved_music_preferences = memory.value.get("memory").music_preferences

print(saved_music_preferences)  

### OUTPUT ###  
['Rolling Stones']

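The output ['Rolling Stones'] confirms that the create_memory node extracted the user's music preference and saved it to long-term memory. In future interactions, load_memory can load this information to give more personalized responses.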

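Evaluating the Multi-Agent System

Evaluation measures how well the agent performs. It is essential during development because LLM behavior can change significantly with even small prompt or model changes. Evaluation gives us a structured way to catch failures, compare versions, and improve reliability.

An evaluation has three core components: the dataset, a set of test inputs and expected outputs; the target function, the application or agent under test, which takes an input and returns an output; and the evaluators, which score the agent's output.

Common types of agent evaluation are: final response evaluation, which checks whether the agent gave the correct final answer; single-step evaluation, which assesses one step (for example, whether the right tool was chosen); and trajectory evaluation, which assesses the full path the agent took to reach its answer.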
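
The three evaluation types differ mainly in what they compare. A minimal sketch, assuming a trajectory is recorded as an ordered list of node or tool names; the function names and the example trajectory are hypothetical, chosen only to match the node names used in this article, and a real final-response check would normally use an LLM judge rather than string equality.

```python
from typing import Sequence


def final_response_match(actual: str, expected: str) -> bool:
    """Final-response check: naive normalized string equality."""
    return actual.strip().casefold() == expected.strip().casefold()


def single_step_match(trajectory: Sequence[str], step: int, expected_node: str) -> bool:
    """Single-step check: was the expected node chosen at a given position?"""
    return 0 <= step < len(trajectory) and trajectory[step] == expected_node


def trajectory_in_order(trajectory: Sequence[str], expected: Sequence[str]) -> bool:
    """Trajectory check: do the expected steps appear in order (extra steps allowed)?"""
    remaining = iter(trajectory)
    # `step in remaining` advances the iterator, so order is enforced.
    return all(step in remaining for step in expected)


run = [
    "verify_info", "load_memory", "supervisor",
    "music_catalog_subagent", "supervisor", "create_memory",
]

print(single_step_match(run, 0, "verify_info"))                                # True
print(trajectory_in_order(run, ["verify_info", "supervisor", "create_memory"]))  # True
print(trajectory_in_order(run, ["supervisor", "verify_info"]))                 # False
```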

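One of the most direct ways to evaluate an agent is to assess its overall performance on a task. This treats the agent as a black box and simply checks whether its final response resolves the user's query and meets the expected criteria. The input is the user's initial query, and the output is the agent's final response.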
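
Stripped of any tooling, a black-box evaluation is a short loop: run the target on each dataset input, score the output against the reference, and aggregate. The sketch below shows that shape with stand-in pieces; toy_target and contains_reference are hypothetical placeholders for the agent and for an LLM-based grader, and the two-item dataset is illustrative only.

```python
from typing import Callable, Dict, List

Example = Dict[str, str]

# Illustrative dataset: each item pairs a question with a reference response.
toy_dataset: List[Example] = [
    {"question": "What is 2 + 2?", "response": "4"},
    {"question": "What is the capital of France?", "response": "Paris"},
]


def toy_target(inputs: Dict[str, str]) -> Dict[str, str]:
    """Stand-in for the agent: answers one question and fails on the other."""
    canned = {"What is 2 + 2?": "2 + 2 equals 4."}
    return {"response": canned.get(inputs["question"], "I am not sure.")}


def contains_reference(outputs: Dict[str, str], reference_outputs: Dict[str, str]) -> bool:
    """Stand-in grader: passes if the reference text appears in the output."""
    return reference_outputs["response"].casefold() in outputs["response"].casefold()


def evaluate_black_box(
    dataset: List[Example],
    target: Callable[[Dict[str, str]], Dict[str, str]],
    evaluator: Callable[[Dict[str, str], Dict[str, str]], bool],
) -> float:
    """Return the fraction of examples whose final response passes the evaluator."""
    passed = 0
    for example in dataset:
        outputs = target({"question": example["question"]})
        if evaluator(outputs, {"response": example["response"]}):
            passed += 1
    return passed / len(dataset) if dataset else 0.0


print(evaluate_black_box(toy_dataset, toy_target, contains_reference))  # 0.5
```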

First we need a dataset of questions and their expected final responses; it serves as the benchmark for the evaluation. We use langsmith.Client to create and upload it:

from langsmith import Client  

client = Client()  

# Define example questions and their expected final responses for evaluation
examples = [
    {
        "question": "My name is Aaron Mitchell. My number associated with my account is +1 (204) 452-6452. I am trying to find the invoice number for my most recent song purchase. Could you help me with it?",
        "response": "The Invoice ID of your most recent purchase was 342.",
    },
    {
        "question": "I'd like a refund.",
        "response": "I need additional information to help you with the refund. Could you please provide your customer identifier so that we can fetch your purchase history?",
    },
    {
        "question": "Who recorded Wish You Were Here again?",
        "response": "Wish You Were Here is an album by Pink Floyd",  # Note: the model may return more detail, but this is the core expected fact.
    },
    {
        "question": "What albums do you have by Coldplay?",
        "response": "There are no Coldplay albums available in our catalog at the moment.",
    },
]

dataset_name = "LangGraph 101 Multi-Agent: Final Response"

# Check whether the dataset already exists to avoid a duplicate-creation error
if not client.has_dataset(dataset_name=dataset_name):
    dataset = client.create_dataset(dataset_name=dataset_name)
    client.create_examples(
        inputs=[{"question": ex["question"]} for ex in examples],
        outputs=[{"response": ex["response"]} for ex in examples],
        dataset_id=dataset.id
    )

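Here we define four example scenarios, each with a question (the agent's input) and an expected response (the final output we consider correct). We then create a dataset in LangSmith and populate it with these examples.

Next, define a target function that specifies how the agent (multi_agent_final_graph) should be run for evaluation. It takes a question from the dataset as input and returns the agent's final response: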

import uuid  
from langgraph.types import Command  

graph = multi_agent_final_graph  

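async def run_graph(inputs: dict):
    """
    Run the multi-agent graph workflow and return the final response.

    This function handles the complete workflow:
    1. Initial invocation with the user's question
    2. The human-in-the-loop interrupt for customer verification
    3. Resuming with a customer ID to complete the request

    Args:
        inputs (dict): Dictionary containing the user's question

    Returns:
        dict: Dictionary containing the agent's final response
    """
    # Create a unique thread ID for this conversation session
    thread_id = uuid.uuid4()
    # Nest the keys under "configurable", matching the configs used earlier and
    # the config["configurable"] lookups in load_memory and create_memory
    configuration = {"configurable": {"thread_id": thread_id, "user_id": "10"}}

    # Invoke the graph with the user's question
    # This triggers verification and may hit an interrupt
    result = await graph.ainvoke({
        "messages": [{"role": "user", "content": inputs['question']}]
    }, config=configuration)

    # Resume from the human-in-the-loop interrupt by providing a customer ID
    # This lets the workflow continue past the verification step
    result = await graph.ainvoke(
        Command(resume="My customer ID is 10"),
        config=configuration
    )

    # Return the content of the last message as the final response
    return {"response": result['messages'][-1].content}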

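Note that to get past interrupt(), we have to send Command(resume=...) to the graph.

Next we need an evaluator. OpenEvals provides a prebuilt LLM-as-a-judge correctness evaluator: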

from openevals.llm import create_llm_as_judge  
from openevals.prompts import CORRECTNESS_PROMPT  

# Use the prebuilt OpenEvals evaluator
correctness_evaluator = create_llm_as_judge(  
    prompt=CORRECTNESS_PROMPT,  
    feedback_key="correctness",  
    judge=llm  
)

You can also define a custom evaluator:

from typing_extensions import Annotated, TypedDict

# Custom instructions for the LLM-as-a-judge grader
grader_instructions = """You are a teacher grading a quiz.  

You will be given a QUESTION, the GROUND TRUTH (correct) RESPONSE, and the STUDENT RESPONSE.  

Here is the grade criteria to follow:  
(1) Grade the student responses based ONLY on their factual accuracy relative to the ground truth answer.  
(2) Ensure that the student response does not contain any conflicting statements.  
(3) It is OK if the student response contains more information than the ground truth response, as long as it is factually accurate relative to the ground truth response.  

Correctness:  
True means that the student's response meets all of the criteria.  
False means that the student's response does not meet all of the criteria.  

Explain your reasoning in a step-by-step manner to ensure your reasoning and conclusion are correct."""  

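# Output schema for the LLM judge
class Grade(TypedDict):
    """Compare the expected and actual answers and grade the actual answer."""
    reasoning: Annotated[str, ..., "Explain your reasoning for whether the actual response is correct or not."]
    is_correct: Annotated[bool, ..., "True if the student response is mostly or exactly correct, otherwise False."]

# Judge LLM
grader_llm = llm.with_structured_output(Grade, method="json_schema", strict=True)

# Evaluator function
async def final_answer_correct(inputs: dict, outputs: dict, reference_outputs: dict) -> bool:
    """Evaluate whether the final response is equivalent to the reference response."""
    # Note: this assumes the outputs dict has a 'response' key, so the target function must return one.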
    user = f"""QUESTION: {inputs['question']}  
    GROUND TRUTH RESPONSE: {reference_outputs['response']}  
    STUDENT RESPONSE: {outputs['response']}"""  

    grade = await grader_llm.ainvoke([{"role": "system", "content": grader_instructions}, {"role": "user", "content": user}])  
    return grade["is_correct"]

An LLM can act as the judge between the ground-truth answer and the agent's response. With all the pieces in place, let's run the evaluation:

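# Run the evaluation experiment
# This tests our multi-agent graph against the dataset using both evaluators
experiment_results = await client.aevaluate(
    run_graph,                                    # application function to evaluate
    data=dataset_name,                            # dataset with test questions and expected responses
    evaluators=[final_answer_correct, correctness_evaluator],  # evaluators used to score performance
    experiment_prefix="agent-result",             # prefix for organizing experiment results in LangSmith
    num_repetitions=1,                            # number of runs per test case
    max_concurrency=5,                            # maximum number of concurrent evaluations
)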

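When this runs and the evaluation completes, it outputs a link to the LangSmith dashboard page containing the results.

The LangSmith dashboard shows our evaluation results, including the correctness scores, the final outputs, and comparisons between them. Other evaluation techniques are also available; the relevant documentation covers them in more detail.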

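Swarm vs. Supervisor Architecture

So far we have built a multi-agent system with the supervisor approach, in which a central agent manages the flow and delegates tasks to sub-agents.

An alternative is the swarm architecture, described in the LangGraph documentation. In a swarm, agents collaborate and hand tasks to one another directly, with no central coordinator.

The supervisor architecture has a central agent that directs traffic and acts as a manager for specialized sub-agents. It follows a hierarchical, more predictable path, and control usually returns to the supervisor. The swarm architecture consists of peer agents that hand off tasks to each other directly without a central authority; this decentralized, agent-driven approach allows direct, adaptive collaboration and can be more resilient.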
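
The difference in control flow can be made concrete with a toy simulation. This is a sketch under simplifying assumptions, not LangGraph code: agents are plain strings, a task is a list of sub-agent names that must each act once, and run_supervisor and run_swarm are hypothetical helpers that only record who holds control at each step.

```python
from typing import List


def run_supervisor(subtasks: List[str]) -> List[str]:
    """Supervisor pattern: control returns to the supervisor after every sub-agent."""
    trace = ["supervisor"]
    for agent in subtasks:
        trace.append(agent)         # supervisor delegates to the sub-agent
        trace.append("supervisor")  # sub-agent reports back
    return trace


def run_swarm(subtasks: List[str]) -> List[str]:
    """Swarm pattern: each agent hands off directly to the next one."""
    return list(subtasks)


plan = ["invoice_information_subagent", "music_catalog_subagent"]

print(run_supervisor(plan))
# ['supervisor', 'invoice_information_subagent', 'supervisor', 'music_catalog_subagent', 'supervisor']
print(run_swarm(plan))
# ['invoice_information_subagent', 'music_catalog_subagent']
```

In this toy model the supervisor trace has 2n + 1 steps for n sub-agents versus n for the swarm; the extra hops are the cost of having one place that sees and controls every handoff.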

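The supervisor architecture suits scenarios that need explicit control flow and centralized decision-making, while the swarm architecture tends to fit environments that call for flexible collaboration and distributed processing. Which one to choose depends on the application's requirements, its complexity, and maintainability considerations.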

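Summary

This article walked through the complete process of building an enterprise-grade multi-agent AI system with LangGraph and LangSmith. Starting from a single ReAct agent, we incrementally built a full multi-agent architecture with identity verification, human-in-the-loop intervention, long-term memory management, and performance evaluation.

Along the way we covered the key technical elements of modern AI application development: modular agent design, state management, tool integration, conditional flow control, and monitoring and evaluation. Together they provide a solid foundation for building reliable, scalable AI systems.

LangGraph and LangSmith together provide strong tooling for multi-agent development, from rapid prototyping with prebuilt components to monitoring in production, and can meet the complex needs of enterprise applications. As AI technology continues to develop, this kind of systematic multi-agent architecture is likely to play an important role in more domains.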

https://avoid.overfit.cn/post/6d53ceee8630474bb5119ee19596028b

Author: Fareed Khan
