Agent Basics

Function calling

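Function calling gives an LLM the ability to learn about and call external functions. Based on its understanding of the conversation, the LLM returns a function call with arguments when appropriate, and then answers using the result of that call.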

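Fetching the weather is the classic use case: it needs real-time results from an external API, which the LLM cannot produce on its own. Here the official OpenAI library is used to access the Qwen models on the DashScope service: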

  import OpenAI from "openai";

  const openai = new OpenAI({
    apiKey: process.env["ALIBABA_API_KEY"],
    baseURL: `https://dashscope.aliyuncs.com/compatible-mode/v1`,
  });

Create a fake weather function:

  function getCurrentWeather({ location, unit="fahrenheit"}){
    const  weather_info = {
      "location": location,
      "temperature": "72",
      "unit": unit,
      "forecast": ["sunny", "windy"],
    }
    return JSON.stringify(weather_info);
  }

Create the function descriptions, tools:

  // The format specified by the official OpenAI API
  const tools = [
    {
      type: "function",
      function: {
        name: "getCurrentWeather",
        description: "获取指定地点的天气信息",
        parameters: {
          type: "object",
          properties: {
            location: {
              type: "string",
              description: "城市和地点",
            },
            unit: { type: "string", enum: ["celsius", "fahrenheit"] },
          },
          required: ["location"],
        },
      },
    }
  ]

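The format above is the one specified by the official OpenAI API:

type: must be specified; "function" is currently the only supported value
function: the description of the function itself
name: the function name; it should match the name of the actual function so the call can be dispatched by name later
description: what the function does; this is the main information the LLM uses to decide whether to call the function, so state the function's effect clearly
parameters: the function's parameters, described with standard JSON Schema; here the input is an object with two keys
- location: a string giving the location
- unit: the temperature unit to use
required: tells the LLM which parameters are mandatory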

Call the LLM with tools:

  const messages = [
    {
      "role": "user",
      "content": "上海的天气怎么样"
    }
  ];

  const result = await openai.chat.completions.create({
    model: 'qwen-plus',
    messages,
    tools,
  });

  console.log(result.choices[0]);

The output looks like this:

  {
    message: {
      role: "assistant",
      tool_calls: [
        {
          function: {
            name: "getCurrentWeather",
            arguments: '{"properties": {"location": {"description": "上海", "type": "string"}, "unit": {"enum": ["celsius", "f'... 76 more characters
          },
          id: "",
          type: "function"
        }
      ],
      content: ""
    },
    finish_reason: "tool_calls",
    index: 0,
    logprobs: null
  }

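Controlling how the LLM calls functions

Alongside tools there is an optional tool_choice parameter, with the following values:

none: the LLM may not use any function; whatever the user inputs, the LLM will not call a function
auto: the LLM decides for itself whether to use a function, so its reply may be a function call or a normal message
an object naming a function: forces the LLM to call that function. The object has two properties
- type: currently can only be function
- function: an object with exactly one key, name, holding the function name, for example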

  {
    "type": "function", 
    "function": {
      "name": "my_function"
    }
  }

Configuration example that forbids the LLM from calling functions:

  const result = await openai.chat.completions.create({
    model: '',
    messages,
    tools,
    tool_choice: "none"
  });

Force a call to a specific function:

  const result = await openai.chat.completions.create({
    model: '',
    messages,
    tools,
    tool_choice: {
      type: "function",
      function: {
        name: "getCurrentWeather"
      }
    }
  });
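
The three forms of tool_choice described above can be captured in a small type. This is only a sketch: `ToolChoice`, `forceTool`, and `describeToolChoice` are hypothetical names introduced for illustration and are not part of the OpenAI SDK.

```typescript
// The three tool_choice forms described in the text, as a union type.
type ToolChoice =
  | "none"
  | "auto"
  | { type: "function"; function: { name: string } };

// Build the object form that forces a call to one named function.
function forceTool(name: string): ToolChoice {
  return { type: "function", function: { name } };
}

// Spell out what each form asks of the model.
function describeToolChoice(choice: ToolChoice): string {
  if (choice === "none") return "model must not call any function";
  if (choice === "auto") return "model decides whether to call a function";
  return `model must call ${choice.function.name}`;
}
```

A value such as `forceTool("getCurrentWeather")` has the same shape as the tool_choice object passed in the request above.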

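Parallel function calls

The newer tools API introduced parallel function calling. Put simply, with the earlier function calling the LLM returned a request for only one function call at a time, whereas tools can return a series of function calls in one response to gather more information, and those functions can be executed in parallel to save the time spent waiting on external APIs.

Define an API that returns the current time: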

  function getCurrentTime({ format = "iso" } = {}) {
    let currentTime;
    switch (format) {
      case "iso":
        currentTime = new Date().toISOString();
        break;
      case "locale":
        currentTime = new Date().toLocaleString();
        break;
      default:
        currentTime = new Date().toString();
        break;
    }
    return currentTime;
  }

Add it to tools:

  const tools = [
    {
      type: "function",
      function: {
        name: "getCurrentTime",
        description: "Get the current time in a given format",
        parameters: {
          type: "object",
          properties: {
            format: {
              type: "string",
              enum: ["iso", "locale", "string"],
              description: "The format of the time, e.g. iso, locale, string",
            },
          },
          required: ["format"],
        },
      },
    },
    {
      type: "function",
      function: {
        name: "getCurrentWeather",
        description: "Get the current weather in a given location",
        parameters: {
          type: "object",
          properties: {
            location: {
              type: "string",
              description: "The city and state, e.g. San Francisco, CA",
            },
            unit: { type: "string", enum: ["celsius", "fahrenheit"] },
          },
          required: ["location", "unit"],
        },
      },
    },
  ]

Test:

  const messages = [
    {
      "role": "user",
      "content": "请同时告诉我当前的时间，上海的天气"
    }
  ]

  const result = await openai.chat.completions.create({
    model: 'qwen-plus',
    messages,
    tools,
  });

  console.log(result.choices[0]);

At the moment parallel calling is still unstable here and does not reliably produce correct results.
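
When a response does contain several entries in tool_calls, they can be executed concurrently and turned into one tool message each. The following is a minimal sketch under assumptions: the `ToolCall` interface lists only the fields used here, and `registry` and `runToolCalls` are hypothetical helpers with stubbed implementations rather than real API calls.

```typescript
// Shape of one entry in `message.tool_calls` (only the fields used here).
interface ToolCall {
  id: string;
  type: "function";
  function: { name: string; arguments: string }; // arguments is a JSON string
}

type ToolFn = (args: Record<string, unknown>) => string | Promise<string>;

// Hypothetical registry; the stubs stand in for real external API calls.
const registry: Record<string, ToolFn> = {
  getCurrentTime: () => "2024-01-01T00:00:00.000Z",
  getCurrentWeather: (args) =>
    JSON.stringify({ location: args.location, temperature: "72" }),
};

// Run every requested call concurrently and build one `tool` message per call.
async function runToolCalls(toolCalls: ToolCall[]) {
  return Promise.all(
    toolCalls.map(async (call) => {
      const fn = registry[call.function.name];
      let content: string;
      try {
        if (!fn) throw new Error(`unknown tool: ${call.function.name}`);
        content = await fn(JSON.parse(call.function.arguments || "{}"));
      } catch (err) {
        // Report the failure to the model instead of crashing the loop.
        content = `Error: ${(err as Error).message}`;
      }
      return { role: "tool" as const, tool_call_id: call.id, content };
    })
  );
}
```

Each returned message carries the tool_call_id of the call it answers, so the model can match results to requests when the messages are appended to the conversation.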

Answering based on the function result

Feed the function's result back to the LLM so that it can use it in its answer.

  const messages = [
    {
      "role": "user",
      "content": "上海的天气怎么样"
    }
  ];

  const result = await openai.chat.completions.create({
    model: 'qwen-plus',
    messages,
    tools,
  });

Extract the function call from the result, run it, and append both to messages:

  messages.push(result.choices[0].message)

  const functions = {
    "getCurrentWeather": getCurrentWeather
  }

  const cell = result.choices[0].message.tool_calls[0]
  const functionInfo = cell.function
  const functionName = functionInfo.name;
  const functionParams = JSON.parse(functionInfo.arguments); // arguments is a JSON string
  const functionResult = functions[functionName](functionParams);

  messages.push({
    tool_call_id: cell.id,
    role: "tool",
    name: functionName,
    content: functionResult,
  });

Then pass the updated messages to the LLM:

  const response = await openai.chat.completions.create({
    model: 'qwen-plus',
    messages,
  });

  console.log(response.choices[0].message);

The output looks like this:

  { role: "assistant", content: ": 上海现在的天气是晴朗并有风，温度约为22℃。" }

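Using an LLM for data tagging and information extraction

Using tools in langchain

In langchain, zod is typically used to define the JSON schema of a tool function. We can focus on describing the parameters, while the parameter types and required flags are generated from the zod definition. When defining an Agent tool, zod can also help validate parameter types.

Use zod to rewrite the schema of the weather function's parameters: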

  import { z } from "zod";

  const getCurrentWeatherSchema = z.object({
    location: z.string().describe("The city and state, e.g. San Francisco, CA"), // string type
    unit: z.enum(["celsius", "fahrenheit"]).describe("The unit of temperature"), // enum type
  });

Use zod-to-json-schema to convert the zod schema into JSON schema:

  import { zodToJsonSchema } from "zod-to-json-schema";

  const paramSchema = zodToJsonSchema(getCurrentWeatherSchema);

Use this tool with the model:

  import { ChatAlibabaTongyi } from "@langchain/community/chat_models/alibaba_tongyi";

  const model = new ChatAlibabaTongyi({
    model: "qwen-turbo",
    temperature: 0,
  });

  const modelWithTools = model.bind({
    tools: [
      {
        type: "function",
        function: {
          name: "getCurrentWeather",
          description: "Get the current weather in a given location",
          parameters: paramSchema,
        }
      }
    ]
  });

  await modelWithTools.invoke("上海的天气怎么样");

Reference: langchain.js

Because the model with tools bound is still a Runnable object, it can easily be added to an LCEL chain:

  import { ChatPromptTemplate } from "@langchain/core/prompts";

  const prompt = ChatPromptTemplate.fromMessages([
    ["system", "You are a helpful assistant"],
    ["human", "{input}"]
  ])

  const chain = prompt.pipe(modelWithTools)

  const res1 = await chain.invoke({
    input: "上海的天气怎么样",
  });

  console.log(res1);

A model with multiple tools

Rewrite the time tool:

  const getCurrentTimeSchema = z.object({
    format: z
      .enum(["iso", "locale", "string"])
      .optional()
      .describe("The format of the time, e.g. iso, locale, string"),
  });

  zodToJsonSchema(getCurrentTimeSchema);

modelWithMultiTools calls the appropriate function based on the user's input and the context:

  import { ChatOpenAI } from "@langchain/openai";

  const model = new ChatOpenAI({
    temperature: 0 
  });

  const modelWithMultiTools = model.bind({
    tools: [
      {
        type: "function",
        function: {
          name: "getCurrentWeather",
          description: "Get the current weather in a given location",
          parameters: zodToJsonSchema(getCurrentWeatherSchema),
        }
      },
      {
        type: "function",
        function: {
          name: "getCurrentTime",
          description: "Get the current time in a given format",
          parameters: zodToJsonSchema(getCurrentTimeSchema),
        }
      }
    ],
  });

Control how the model calls tools by forcing a call to a specific function:

  const modelWithForce = model.bind({
    tools: [
        ...
    ],
    tool_choice: {
      type: "function",
      function: {
        name: "getCurrentWeather"
      }
    }
  });

Tagging data with tools

First define the schema of the function that extracts the information:

  const taggingSchema = z.object({
    emotion:z.enum(["pos", "neg", "neutral"]).describe("文本的情感"),
    language: z.string().describe("文本的核心语言（应为ISO 639-1代码）"),
  });

Bind the tool to the model; the model needs to be ChatOpenAI:

  // const model = new ChatOpenAI({
  //   temperature: 0 
  // });

  const modelTagging = model.bind({
    tools: [
      {
        type: "function",
        function: {
          name: "tagging",
          description: "为特定的文本片段打上标签",
          parameters: zodToJsonSchema(taggingSchema)
        }
      }
    ],
    tool_choice: {
      type: "function",
      function: {
        name: "tagging"
      }
    }
  });

Compose it into a chain:

  import { JsonOutputToolsParser } from "@langchain/core/output_parsers/openai_tools";

  const prompt = ChatPromptTemplate.fromMessages([
    ["system", "仔细思考，你有充足的时间进行严谨的思考，然后按照指示对文本进行标记"],
    ["human", "{input}"]
  ])

  const chain = prompt.pipe(modelTagging).pipe(new JsonOutputToolsParser());

Test:

  const res = await chain.invoke({
    input: "hello world",
  });

  console.log(res);

Extracting information with tools

Define a schema that describes a person:

  const personExtractionSchema = z.object({
    name: z.string().describe("人的名字"),
    age: z.number().optional().describe("人的年龄")
  }).describe("提取关于一个人的信息");

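Build a higher-level schema to extract more complex information:

Reuse personExtractionSchema to build an array schema that extracts every person mentioned in the text, together with the relationship between them.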

  const relationExtractSchema = z.object({
    people: z.array(personExtractionSchema).describe("提取所有人"),
    relation: z.string().describe("人之间的关系, 尽量简洁")
  });

Build the chain:

  const model = new ChatOpenAI({
    temperature: 0 
  });

  const modelExtract = model.bind({
    tools: [
      {
        type: "function",
        function: {
            name: "relationExtract",
            description: "提取数据中人的信息和人的关系",
            parameters: zodToJsonSchema(relationExtractSchema)
        }
      }
    ],
    tool_choice: {
      type: "function",
      function: {
        name: "relationExtract"
      }
    }
  });

  const prompt = ChatPromptTemplate.fromMessages([
    ["system", "仔细思考，你有充足的时间进行严谨的思考，然后提取文中的相关信息，如果没有明确提供，请不要猜测，可以仅提取部分信息"],
    ["human", "{input}"]
  ]);

  const chain = prompt.pipe(modelExtract).pipe(new JsonOutputToolsParser());

Test a simple task:

  await chain.invoke({
    input: "小明现在 18 岁了，她妈妈是小丽"
  });

Test a more complex task:

  await chain.invoke({
    input: "我是小明现在 18 岁，我和小华、小美是好朋友，我们都一样大"
  });

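Inspect the output.

Agent

An agent is an autonomous decision-and-execution process. Its core idea is to use the llm as a reasoning engine that, based on its understanding of the task and environment and on the tools provided, decides on a series of actions by itself.

RunnableBranch

RunnableBranch lets you route a task, by category, to a chain that specializes in it, and then process or format that chain's result.

Example (see the tool_calls documentation):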

import { z } from "zod";
import { ChatOpenAI } from "@langchain/openai";
import { ChatAlibabaTongyi } from "@langchain/community/chat_models/alibaba_tongyi";
import { JsonOutputToolsParser } from "@langchain/core/output_parsers/openai_tools";
import { RunnableSequence, RunnableBranch, RunnablePassthrough } from "@langchain/core/runnables";
import { zodToJsonSchema } from "zod-to-json-schema";
import { ChatPromptTemplate, PromptTemplate } from "@langchain/core/prompts";
import { DynamicStructuredTool } from "@langchain/core/tools";

const classifySchema = z.object({
  type: z.enum(["科普", "编程", "一般问题"]).describe("用户提问的分类")
});

const model = new ChatOpenAI({
  temperature: 0 
})

// const model = new ChatAlibabaTongyi({
//     model: "qwen-turbo",
//     temperature: 0,
// });

// const tool = new DynamicStructuredTool({
//     name: "classifyQuestion",
//     description: "对用户的提问进行分类",
//     parameters: zodToJsonSchema(classifySchema),
// });
// const modelWithTools = model.bindTools([tool]);

const modelWithTools = model.bind({
  tools: [
    {
      type: "function",
      function: {
        name: "classifyQuestion",
        description: "对用户的提问进行分类",
        parameters: zodToJsonSchema(classifySchema),
      }
    }
  ],
  tool_choice: {
    type: "function",
    function: {
      name: "classifyQuestion"
    }
  }
});

const prompt = ChatPromptTemplate.fromMessages([
  ["system", `仔细思考，你有充足的时间进行严谨的思考，然后对用户的问题进行分类，
  当你无法分类到特定分类时，可以分类到 "一般问题"`],
  ["human", "{input}"]
]);

const classifyChain = RunnableSequence.from([
  prompt,
  modelWithTools,
  new JsonOutputToolsParser(),
  (input) => {
      const type = input[0]?.args?.type
      return type ? type : "一般问题"
  }
]);

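To build a stable llm application fit for production use, several layers of fallback are applied here:

classifySchema declares type as required
the prompt adds: when you cannot assign a specific category, classify as "一般问题" (general question)
on output, a function makes sure that an undefined type becomes "一般问题"
going further, this function can also check whether type is one of the target categories and return "一般问题" if it is not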
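
The last fallback layer, checking the returned type against the allowed categories, could look like the sketch below. `pickCategory` and `ParsedToolCall` are hypothetical names; the assumed input shape (an array with one entry per tool call, each carrying `args`) mirrors what the final function in classifyChain reads.

```typescript
const CATEGORIES = ["科普", "编程", "一般问题"] as const;
type Category = (typeof CATEGORIES)[number];
const FALLBACK: Category = "一般问题";

// Assumed shape of the parser output: one entry per tool call.
interface ParsedToolCall {
  type?: string;
  args?: { type?: unknown };
}

// Return the model's category only if it is one of the known categories.
function pickCategory(calls: ParsedToolCall[] | undefined): Category {
  const raw = calls?.[0]?.args?.type;
  return typeof raw === "string" &&
    (CATEGORIES as readonly string[]).includes(raw)
    ? (raw as Category)
    : FALLBACK;
}
```

A function like this can take the place of the last step in classifyChain, so that a missing, malformed, or unexpected category always degrades to the general branch.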

Test:

  await classifyChain.invoke({
    "input": "鲸鱼是哺乳动物么？"
  });

Build three simple expert chains, one per category:

import { StringOutputParser } from "@langchain/core/output_parsers";

const answeringModel = new ChatOpenAI({
  temperature: 0.7,
});

const sciencePrompt = PromptTemplate.fromTemplate(
  `作为一位科普专家，你需要解答以下问题，尽可能提供详细、准确和易于理解的答案：

  问题：{input}
  答案：`
);
    
const programmingPrompt = PromptTemplate.fromTemplate(
  `作为一位编程专家，你需要解答以下编程相关的问题，尽可能提供详细、准确和实用的答案：

  问题：{input}
  答案：`
);

const generalPrompt = PromptTemplate.fromTemplate(
  `请回答以下一般性问题，尽可能提供全面和有深度的答案：

  问题：{input}
  答案：`
);

const scienceChain = RunnableSequence.from([
  sciencePrompt,
  answeringModel,
  new StringOutputParser(),
  {
    output: input => input,
    role: () => "科普专家"
  }
]);

const programmingChain = RunnableSequence.from([
  programmingPrompt,
  answeringModel,
  new StringOutputParser(),
  {
    output: input => input,
    role: () => "编程大师"
  }
]);

const generalChain = RunnableSequence.from([
  generalPrompt,
  answeringModel,
  new StringOutputParser(),
  {
    output: input => input,
    role: () => "通识专家"
  }
]);

Build a RunnableBranch that routes according to the user's input:

  const branch = RunnableBranch.from([
    [
      (input => input.type.includes("科普")),
      scienceChain,
    ],
    [
      (input => input.type.includes("编程")),
      programmingChain,
    ],
    generalChain
  ]);

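RunnableBranch takes a two-dimensional array in which the first element of each inner array is that branch's condition. It selects a branch by passing its own input to each condition in turn; when a condition returns true, the corresponding Runnable is run. A Runnable passed without a condition is the default branch.

RunnableBranch selects only one Runnable: if several conditions return true, only the first one is used.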
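
The selection rule (first matching condition wins, otherwise the default) can be illustrated without langchain. This is a plain-function sketch of the semantics, not RunnableBranch's actual implementation; `selectBranch` and `routeDemo` are hypothetical names.

```typescript
type Condition<I> = (input: I) => boolean;
type Handler<I, O> = (input: I) => O;

// Try the conditions in order, run the first handler whose condition holds,
// and fall back to the default handler when none matches.
function selectBranch<I, O>(
  branches: Array<[Condition<I>, Handler<I, O>]>,
  fallback: Handler<I, O>
): Handler<I, O> {
  return (input) => {
    for (const [condition, handler] of branches) {
      if (condition(input)) return handler(input);
    }
    return fallback(input);
  };
}

const routeDemo = selectBranch<{ type: string }, string>(
  [
    [(i) => i.type.includes("科普"), () => "science"],
    [(i) => i.type.includes("编程"), () => "programming"],
    // Same condition as the first branch: never selected, the first match wins.
    [(i) => i.type.includes("科普"), () => "unreachable"],
  ],
  () => "general"
);
```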

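In fact, because any function can be used as a Runnable inside a chain (it is wrapped as a RunnableLambda), you can skip RunnableBranch and implement routing with a plain function, which gives more freedom.

An equivalent router implemented as a function: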

  const route = ({ type }) => {
    if(type.includes("科普")){
      return scienceChain
    }else if(type.includes("编程")){
      return programmingChain;
    }

    return generalChain;
  };

Compose everything into a complete chain:

  const outputTemplate = PromptTemplate.fromTemplate(
    `感谢您的提问，这是来自 {role} 的专业回答：

    {output}
    `
  )


  const finalChain = RunnableSequence.from([
    {
      type: classifyChain,
      input: input => input.input
    },
    branch,
    (input) => outputTemplate.format(input),
  ]);

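A prompt template is used here to render the role returned by the different chains; in a real project the return value can be processed in more complex ways as needed.

LangSmith

A tool for visually tracing and analyzing the internal processing of agents and llm apps, recommended by both the langchain team and the community.

Register on the official site to obtain an API_KEY.

Once approved, configure the environment variables: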

  LANGCHAIN_TRACING_V2=true  
  LANGCHAIN_API_KEY=xxx

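The ReAct framework

This agent framework combines reasoning and acting. Roughly, the llm reasons about the steps needed to complete the task, calls tools according to the environment and the tools provided, observes the results, and reasons about the next step.

The significance of ReAct is that it combines the llm's reasoning, tool-calling, and observation abilities, letting the llm adapt to more tasks and to dynamic environments, and strengthening the synergy between reasoning and acting. Because agents record their thinking and reasoning during execution, the process is transparent and explainable, which increases users' trust in the output.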
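
The reason, act, observe cycle can be sketched as a small loop. This is an illustration under assumptions, not langchain's AgentExecutor: the "model" is a scripted function standing in for an llm, and the `Step`, `runReAct`, `scriptedModel`, and `demoTools` names, as well as the rate of 7 used in the input, are hypothetical.

```typescript
type Step =
  | { kind: "action"; thought: string; tool: string; input: string }
  | { kind: "final"; thought: string; answer: string };

type Tool = (input: string) => string;
// The "model" maps the transcript so far to the next step.
type Model = (scratchpad: string) => Step;

function runReAct(
  question: string,
  model: Model,
  tools: Record<string, Tool>,
  maxSteps = 5
): string {
  let scratchpad = `Question: ${question}\n`;
  for (let i = 0; i < maxSteps; i++) {
    const step = model(scratchpad); // reason
    scratchpad += `Thought: ${step.thought}\n`;
    if (step.kind === "final") return step.answer;
    const tool = tools[step.tool]; // act
    const observation = tool ? tool(step.input) : `unknown tool ${step.tool}`;
    // observe: the result is appended so the next step can see it
    scratchpad += `Action: ${step.tool}\nAction Input: ${step.input}\nObservation: ${observation}\n`;
  }
  return "stopped: step limit reached";
}

// Scripted stand-in for an llm: call the calculator once, then answer
// with whatever the tool observed.
const scriptedModel: Model = (scratchpad) => {
  const seen = scratchpad.match(/Observation: (.*)\n/);
  return seen
    ? { kind: "final", thought: "I now know the final answer", answer: seen[1] }
    : { kind: "action", thought: "I need to multiply", tool: "calculator", input: "15*7" };
};

const demoTools: Record<string, Tool> = {
  calculator: (input) => {
    const [a, b] = input.split("*").map(Number);
    return String(a * b);
  },
};
```

Calling `runReAct("...", scriptedModel, demoTools)` walks through one Thought/Action/Observation round before returning the final answer; the step limit guards against a model that never finishes.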

Example:

First define the tool set provided to the agent:

  const tools = [new SerpAPI(process.env.SERP_KEY), new Calculator()];

Then pull the ReAct prompt from langchain hub, a site provided by langchain for sharing, managing, and using prompts:

  const prompt = await pull<PromptTemplate>("hwchase17/react");

The prompt can be viewed on smith:

  Answer the following questions as best you can. You have access to the following tools:

  {tools}

  Use the following format:

  Question: the input question you must answer
  Thought: you should always think about what to do
  Action: the action to take, should be one of [{tool_names}]
  Action Input: the input to the action
  Observation: the result of the action
  ... (this Thought/Action/Action Input/Observation can repeat N times)
  Thought: I now know the final answer
  Final Answer: the final answer to the original input question

  Begin!

  Question: {input}
  Thought:{agent_scratchpad}

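Breaking this prompt down:

The first part defines the task. Since this is a general-purpose agent, the instruction is to answer the user's question as well as possible, which gives the model a clear starting point.
Next it states which tools the model can use. langchain's built-in tools already come with complete descriptions.
Then comes the core of ReAct: a fixed format and line of thinking. This part records the llm's whole thought process and is passed in as part of the prompt on every llm call, so the model knows its earlier reasoning and information.
- Question: the user's question, which is both the final goal and the starting point of the reasoning
- Thought: guides the model to think about the next action and to build a strategy and steps for solving the problem; this is the reasoning stage. It is also recorded in the prompt, which makes the llm's reasoning easier to follow
- Action: the action the model takes, which must be one of the tools provided in tools
- Action Input: the arguments for the tool call; they connect the user's question, the model's thinking, and the actual action
- Observation: the result of the Action; it gives the model feedback for deciding the next reasoning and action steps
- Final Answer: the steps above repeat until the model judges that its reasoning and observations are enough to answer, then it summarizes the final answer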
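
Because the format is plain text, the agent has to parse each llm reply to find either the next tool call or the final answer. Below is a minimal sketch of such a parser; it is an assumption about how this could be done, not the parser langchain ships, and `parseReActOutput` is a hypothetical name.

```typescript
type ReActOutput =
  | { kind: "final"; answer: string }
  | { kind: "action"; tool: string; toolInput: string };

// Extract either "Final Answer: ..." or an "Action: ... / Action Input: ..." pair.
function parseReActOutput(text: string): ReActOutput {
  const final = text.match(/Final Answer:\s*([\s\S]*)$/);
  if (final) return { kind: "final", answer: final[1].trim() };

  const action = text.match(
    /Action:\s*(.+?)\s*\n\s*Action Input:\s*([\s\S]*)$/
  );
  if (action) {
    return { kind: "action", tool: action[1].trim(), toolInput: action[2].trim() };
  }
  // A reply that follows neither pattern cannot be acted on.
  throw new Error(`Could not parse LLM output: ${text}`);
}
```

The thrown error in the last branch is the kind of parse failure that occurs when a model does not follow the format closely.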

The complete code:

import { ChatOpenAI } from "@langchain/openai";
import { SerpAPI } from "@langchain/community/tools/serpapi";
import "dotenv/config";
import { AgentExecutor, createReactAgent } from "langchain/agents";
import { pull } from "langchain/hub";
import type { PromptTemplate } from "@langchain/core/prompts";
import { Calculator } from "@langchain/community/tools/calculator";

async function main() {
  const tools = [new SerpAPI(process.env.SERP_KEY), new Calculator()];

  const prompt = await pull<PromptTemplate>("hwchase17/react");
  const llm = new ChatOpenAI({
    temperature: 0,
  });

  const agent = await createReactAgent({
    llm,
    tools,
    prompt,
  });

  const agentExecutor = new AgentExecutor({
    agent,
    tools,
  });

  const result = await agentExecutor.invoke({
    input: "我有 15 美元，现在相当于多少人民币？",
  });

  console.log(result);
}

main();

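Run the file to see the reasoning trace in the console.

Some problems with the ReAct framework:

Complexity and overhead
The correctness of the final result depends on every step being precise, and each step is hard for us to control. The prompt only defines the basic way of thinking; everything after that is the llm reasoning from its own understanding.
Dependence on the accuracy of external data sources
ReAct assumes that external sources are true and reliable and does not guide the model to question them, so if a data source is wrong, the reasoning result will be wrong too.
Error propagation and hallucination
Errors early in the reasoning are amplified in later steps, especially when external data is noisy or incomplete. Without enough supporting data, the model may hallucinate during reasoning and tool calls.
Speed
ReAct makes the model "slow down" and think step by step, so even a simple question involves several llm calls, which makes it hard to use in real-time chat scenarios.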

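OpenAI tools Agents

At present, the agents that are stable enough for real use are those built directly on OpenAI's tools feature. What an agent does is plan tasks on its own, call external functions, and output an answer. OpenAI's tools feature does exactly that: it provides a tools interface, lets the llm decide when and how to call tools, and generates the output for the user from the tools' results.

OpenAI fine-tuned gpt-3.5-turbo and gpt-4 for tools, so they reliably generate valid call arguments in tool scenarios. This avoids the failures seen in ReAct, where a weaker llm produces output that cannot be parsed.

Example

Use SerpAPI and Calculator as the llm's tools, then pull the corresponding prompt: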

  const tools = [new SerpAPI(process.env.SERP_KEY), new Calculator()];

  const prompt = await pull<ChatPromptTemplate>("hwchase17/openai-tools-agent");

The content of the openai-tools-agent prompt:

  SYSTEM

  You are a helpful assistant

  PLACEHOLDER

  chat_history
  HUMAN

  {input}

  PLACEHOLDER

  agent_scratchpad

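Here chat_history and agent_scratchpad are MessagesPlaceholder entries. Because the OpenAI models are already capable of deciding on tool calls by themselves, the prompt does not need to supply extra information or rules.

The complete code: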

  // import { ChatOpenAI } from "@langchain/openai";
  import { ChatAlibabaTongyi } from "@langchain/community/chat_models/alibaba_tongyi";
  import { SerpAPI } from "@langchain/community/tools/serpapi";
  import "dotenv/config";
  import { AgentExecutor } from "langchain/agents";
  import { pull } from "langchain/hub";
  import { createOpenAIToolsAgent } from "langchain/agents";
  import { ChatPromptTemplate } from "@langchain/core/prompts";
  import { Calculator } from "@langchain/community/tools/calculator";

  process.env.LANGCHAIN_TRACING_V2 = "true";

  async function main() {
    const tools = [new SerpAPI(process.env.SERP_KEY), new Calculator()];

    const prompt = await pull<ChatPromptTemplate>("hwchase17/openai-tools-agent");

    const llm = new ChatAlibabaTongyi({
      model: "qwen-turbo",
      temperature: 0,
    });

    const agent = await createOpenAIToolsAgent({
      llm,
      tools,
      prompt,
    });

    const agentExecutor = new AgentExecutor({
      agent,
      tools,
    });

    const result = await agentExecutor.invoke({
      input:
        "我有 10000 人民币，可以购买多少微软股票，注意微软股票一般是以美元计价，需要考虑汇率问题",
    });

    console.log(result);
  }

  main();

Running the code prints the following:

  {
    input: '我有 10000 人民币，可以购买多少微软股票，注意微软股票一般是以美元计价，需要考虑汇率问题',
    output: '首先，我们需要知道当前的汇率。由于汇率是实时变动的，我会提供一个假设的汇率来计算，但实际购买时请以当天的实时汇率为准。\n' +
      '\n' +
      '假设当前汇率是1美元=6.5人民币（这是一个平均汇率，实际汇率可能会有所不同）。那么，10000人民币可以换算成：\n' +
      '\n' +
      '10000人民币 / 6.5 ≈ 1535.54美元\n' +
      '\n' +
      '接下来，我们查看微软（Microsoft）的股价。微软的股价也会随市场波动，这里我们假设一个价格，例如微软的股价是200美元/股。那么，你可以购买的股票数量为：\n' +
      '\n' +
      '1535.54美元 / 200美元/股 ≈ 7.68股\n' +
      '\n' +
      '由于你不能购买部分股票，所以实际能购买的是7股，因为股票通常是以整数交易的。剩下的钱将不足以再买一股。\n' +
      '\n' +
      '请注意，这只是一个估算，实际购买时请参考实时汇率和股票价格。同时，投资股市有风险，建议在投资前做好充分的研究和咨询专业人士。'
  }

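The reasoning flow can also be viewed in the LangSmith console.

Custom Tools

Whether you use ReAct, OpenAI tools, or another agent framework, the tools given to the agent determine its scope and effectiveness. Besides the tools that ship with langchain, you can define custom tools for agents to use.

There are currently two kinds of custom tool. Both take the tool's name, its description, and the function that is actually called. Note that the name and description influence when the llm calls the tool, so they must be meaningful. In the function's implementation, do not throw errors; return a string containing the error message instead, so the llm can decide what to do next.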
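
One way to follow the "return the error as a string" advice is to wrap the tool function. The sketch below is independent of langchain; `withErrorAsString`, `parsePositiveInt`, and `safeParsePositiveInt` are hypothetical names, and the wrapped function only stands in for a real tool body.

```typescript
type ToolFunc = (input: string) => Promise<string>;

// Wrap a tool function so that failures become a message the llm can read.
function withErrorAsString(func: ToolFunc): ToolFunc {
  return async (input) => {
    try {
      return await func(input);
    } catch (err) {
      const message = err instanceof Error ? err.message : String(err);
      return `Tool failed: ${message}. Check the input and try again.`;
    }
  };
}

// Hypothetical tool body that rejects bad input by throwing.
const parsePositiveInt: ToolFunc = async (input) => {
  const n = Number.parseInt(input, 10);
  if (!Number.isInteger(n) || n <= 0) {
    throw new Error(`"${input}" is not a positive integer`);
  }
  return String(n);
};

const safeParsePositiveInt = withErrorAsString(parsePositiveInt);
```

The wrapped function has the same signature as the original, so it can be passed wherever a tool's func is expected.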

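The two kinds of custom tool differ slightly:

DynamicTool: only supports a single string as the function input, because the ReAct framework does not support multi-input tools
DynamicStructuredTool: supports complex input formats defined with a zod schema, suitable for use with OpenAI tools

Use DynamicTool to create a tool with a single input: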

  const stringReverseTool = new DynamicTool({
    name: "string-reverser",
    description: "reverses a string. input should be the string you want to reverse.",
    func: async (input: string) => input.split("").reverse().join(""),
  });

The RAG chain built earlier can be offered to agents as a tool, giving them access to a wider range of knowledge and information. First create a retriever chain:

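  import path from "path";
  import { FaissStore } from "@langchain/community/vectorstores/faiss";
  import { AlibabaTongyiEmbeddings } from "@langchain/community/embeddings/alibaba_tongyi";
  import { ChatAlibabaTongyi } from "@langchain/community/chat_models/alibaba_tongyi";
  import { ChatPromptTemplate } from "@langchain/core/prompts";
  import { createStuffDocumentsChain } from "langchain/chains/combine_documents";
  import { createRetrievalChain } from "langchain/chains/retrieval";

  // Load the FAISS index that was saved to disk earlier.
  async function loadVectorStore() {
    const directory = path.join(__dirname, "../db/qiu");
    const embeddings = new AlibabaTongyiEmbeddings();
    const vectorStore = await FaissStore.load(directory, embeddings);

    return vectorStore;
  }

  // Build the retrieval chain: fetch documents, then answer from them.
  async function getRetrieverChain() {
    const prompt = ChatPromptTemplate.fromTemplate(`将以下问题仅基于提供的上下文进行回答：
    上下文：
    {context}

    问题：{input}`);
    const llm = new ChatAlibabaTongyi({
      model: "qwen-turbo",
    });

    const documentChain = await createStuffDocumentsChain({
      llm,
      prompt,
    });

    const vectorStore = await loadVectorStore();
    const retriever = vectorStore.asRetriever();

    const retrievalChain = await createRetrievalChain({
      combineDocsChain: documentChain,
      retriever,
    });

    return retrievalChain;
  }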

This uses two of langchain's built-in chain constructors, createStuffDocumentsChain and createRetrievalChain. The former handles the Documents and the llm call; the latter calls the retriever and passes its results into combineDocsChain.

Create the DynamicTool:

  const retrieverTool = new DynamicTool({
    name: "get-qiu-answer",
    func: async (input: string) => {
      const res = await retrieverChain.invoke({ input });
      return res.answer;
    },
    description: "获取小说 《球状闪电》相关问题的答案",
  });

Use DynamicStructuredTool to create a tool with a structured input:

  const dateDiffTool = new DynamicStructuredTool({
    name: "date-difference-calculator",
    description: "计算两个日期之间的天数差",
    schema: z.object({
      date1: z.string().describe("第一个日期，以YYYY-MM-DD格式表示"),
      date2: z.string().describe("第二个日期，以YYYY-MM-DD格式表示"),
    }),
    func: async ({ date1, date2 }) => {
      const d1 = new Date(date1);
      const d2 = new Date(date2);
      const difference = Math.abs(d2.getTime() - d1.getTime());
      const days = Math.ceil(difference / (1000 * 60 * 60 * 24));
      return days.toString();
    },
  });

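Here zod defines the format of the function's input. When the agent calls the tool, the arguments are validated by zod; if validation fails, the validation error message is returned to the llm, which can then adjust its input format based on it.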
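
To make the "validate, then either run or report the error" flow concrete, here is a hand-written sketch for the date-difference input. It does not use zod and is not how DynamicStructuredTool is implemented; `validateDateDiffArgs` and `callDateDiff` are hypothetical helpers, and the error wording is invented for illustration.

```typescript
interface DateDiffArgs {
  date1: string;
  date2: string;
}

// Return a list of human-readable problems; an empty list means valid input.
function validateDateDiffArgs(args: Record<string, unknown>): string[] {
  const errors: string[] = [];
  for (const key of ["date1", "date2"]) {
    const value = args[key];
    if (typeof value !== "string") {
      errors.push(`${key}: expected string, received ${typeof value}`);
    } else if (
      !/^\d{4}-\d{2}-\d{2}$/.test(value) ||
      Number.isNaN(Date.parse(value))
    ) {
      errors.push(`${key}: expected a valid YYYY-MM-DD date`);
    }
  }
  return errors;
}

// Validate first; on failure return the message instead of running the tool.
function callDateDiff(args: Record<string, unknown>): string {
  const errors = validateDateDiffArgs(args);
  if (errors.length > 0) return `Invalid input: ${errors.join("; ")}`;
  const { date1, date2 } = args as unknown as DateDiffArgs;
  // Date-only ISO strings are parsed as UTC, so the difference is whole days.
  const ms = Math.abs(Date.parse(date2) - Date.parse(date1));
  return String(Math.round(ms / 86400000));
}
```

The returned error string names the offending field, which is the kind of feedback an llm needs to correct its next call.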

Test with a ReAct agent:

  const tools = [retrieverTool, new Calculator()];
  // the code that creates the agent is omitted
  const res = await agents.invoke({
      input: "小说球状闪电中量子玫瑰的情节",
  });

The output looks like this:

  {
    input: '小说球状闪电中量子玫瑰的情节',
    output: '在刘慈欣的科幻小说《球状闪电》中，"量子玫瑰"是一个非常重要的情节元素。它是一种虚构的高科技武器，也是故事中的关键线索之一。\n' +
      '\n' +
      '"量子玫瑰"是基于量子物理理论的一种高科技产物，它利用了微观粒子的奇特性质，如量子纠缠和超导现象，制造出一种能量密度极高、威力无比的武器。这种武器的形态像一朵盛开的玫瑰，因此得名。它的存在打破了常规物理学的认知，挑战了人类对能量的理解。\n' +
      '\n' +
      '在小说中，"量子玫瑰"被描绘为一种几乎无法防御的武器，其破坏力巨大，足以改变战争的格局。主角叶文洁在小说中扮演了与之相关的重要角色，她不仅揭示了"量子玫瑰"的秘密，还因为这个秘密而卷入了一系列的政治和军事冲突。\n' +
      '\n' +
      '整个情节围绕着"量子玫瑰"的开发、使用以及对抗展开，展现了科技发展带来的伦理和道德困境，同时也探讨了人性、权力和命运等深层次的主题。'
  }

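For complex questions, agents reason step by step and apply different tools to reach the final result, so the variety and capability of the tools largely determine the ceiling of what agents can do. In a framework like ReAct, however, tools are embedded in the llm context as part of the prompt, so too many tools leave less room for the rest of the prompt. Likewise, the descriptions and input schemas of OpenAI tools count toward the context and are limited by the window size. More tools is therefore not always better; design the tool set around actual needs.

As with chains, complex agents can use a routing design: an entry agent links to several specialist agents for vertical domains through tools, classifies the user's question, and directs it to the matching specialist agent for an answer.

In this setup each sub-agent is defined through a DynamicTool with returnDirect: true, so the result of that tool call is returned to the user directly instead of being passed back to the llm to generate the output.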
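
The effect of returnDirect on the agent loop can be sketched as follows. This is a simplified model of the behavior described above, not AgentExecutor's code; `SketchTool`, `Decision`, and `runAgent` are hypothetical names, and `decide` stands in for the llm.

```typescript
interface SketchTool {
  name: string;
  returnDirect?: boolean;
  func: (input: string) => string;
}

// One planning step: either call a tool or finish with an answer.
type Decision = { tool: string; input: string } | { answer: string };

function runAgent(
  question: string,
  decide: (question: string, observations: string[]) => Decision,
  tools: SketchTool[],
  maxSteps = 3
): string {
  const observations: string[] = [];
  for (let i = 0; i < maxSteps; i++) {
    const decision = decide(question, observations);
    if ("answer" in decision) return decision.answer;
    const tool = tools.find((t) => t.name === decision.tool);
    if (!tool) {
      observations.push(`unknown tool ${decision.tool}`);
      continue;
    }
    const result = tool.func(decision.input);
    // returnDirect short-circuits: the tool result is the final output.
    if (tool.returnDirect) return result;
    observations.push(result);
  }
  return "stopped: step limit reached";
}
```

With returnDirect set, the loop returns the specialist's answer unchanged; without it, the result goes back to `decide` for one more pass before the user sees anything.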

Demonstration using the retriever example:

  const retrieverChain = await getRetrieverChain();
  const retrieverTool = new DynamicTool({
    name: "get-qiu-answer",
    func: async (input: string) => {
      const res = await retrieverChain.invoke({ input });
      return res.answer;
    },
    description: "获取小说 《球状闪电》相关问题的答案",
    returnDirect: true,
  });

  const tools = [retrieverTool, new Calculator()];
  const agents = await createReactAgentWithTool(tools);

  const res = await agents.invoke({
    input: "用一句话，介绍小说球状闪电中，跟量子玫瑰有关的情节"
  });

  console.log(res);

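The LangSmith data shows that once the get-qiu-answer tool is called, its result is returned directly as the final result of the whole agent run, instead of passing through another llm node to generate the answer as before. The returnDirect feature can therefore be used to make the entry agent act as a router that directs questions to specialist agents in different domains, which is one form of multi-agent collaboration.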