A previous post mentioned that the Tongyi Qianwen (Qwen) API does not support structured output via with_structured_output / JSON schema. So what can you do if you still want to use a Qwen model for structured output? There are two options.
Run the Qwen model with Ollama
The Ollama blog post "Structured output" shows that Ollama already supports structured output; the feature was introduced in Ollama 0.5.0. Run the qwen3 model locally with Ollama, and the code below demonstrates the effect.
from langchain.chat_models import init_chat_model
from langchain_core.prompts import ChatPromptTemplate
from pydantic import BaseModel, Field

# Point the OpenAI-compatible client at the local Ollama endpoint.
# Ollama does not check the API key, so any placeholder value works.
llm = init_chat_model(
    model_provider="openai",
    model="qwen3:8b",
    base_url="http://localhost:11434/v1",
    api_key="123456",
)

tagging_prompt = ChatPromptTemplate.from_template(
    """
Extract the desired information from the following passage.

Only extract the properties mentioned in the 'Classification' function.

Passage:
{input}
"""
)

class Classification(BaseModel):
    sentiment: str = Field(description="The sentiment of the text")
    aggressiveness: int = Field(
        description="How aggressive the text is on a scale from 1 to 10"
    )
    language: str = Field(description="The language the text is written in")

structured_llm = llm.with_structured_output(Classification)

inp = "Estoy increiblemente contento de haberte conocido! Creo que seremos muy buenos amigos!"
prompt = tagging_prompt.invoke({"input": inp})
response = structured_llm.invoke(prompt)
print(response)
The program runs successfully and prints the contents of the structured Classification object (shown below):
sentiment='positive' aggressiveness=0 language='Spanish'
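Because with_structured_output binds the Classification model, the return value is a Pydantic object rather than a raw string, so its fields can be read and serialized directly. A small usage sketch (assuming Pydantic v2, which recent LangChain releases use):

# The result is a typed Pydantic instance, not text that still needs parsing.
print(response.sentiment)        # e.g. 'positive'
print(response.aggressiveness)   # e.g. 0
print(response.model_dump())     # e.g. {'sentiment': 'positive', 'aggressiveness': 0, 'language': 'Spanish'}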
If you swap the parameters used to construct the llm object back to the Qwen API values, i.e. change it to
llm = init_chat_model(
    model_provider="openai",
    model="qwen-plus-latest",
    base_url="https://dashscope.aliyuncs.com/compatible-mode/v1",
    api_key=os.environ["QWEN_API_KEY"],
)
you will find that the error mentioned earlier shows up again. This shows that it is not the Qwen model that lacks structured-output support, but the Qwen API service.
What if you don't want to run the model locally with Ollama, but still want structured output while using the Qwen API service?
Use PydanticOutputParser to generate the format prompt automatically
On structured output, the Qwen documentation says:
"You can explicitly describe the key-value structure and data types of the desired JSON in the prompt, and provide a standard data sample; this helps the model achieve a similar effect."
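Written by hand, such a prompt might look like the sketch below; the wording and sample JSON are illustrative only, not taken from the Qwen documentation:

# A hand-maintained format description, roughly what the docs suggest.
manual_prompt = """
Extract the desired information from the following passage and reply with JSON only.
Use exactly these keys and types:
- sentiment (string): the sentiment of the text
- aggressiveness (integer): how aggressive the text is, on a scale from 1 to 10
- language (string): the language the text is written in

Example output:
{"sentiment": "positive", "aggressiveness": 1, "language": "English"}

Passage:
{input}
"""
# Note: if this string were fed to ChatPromptTemplate, the literal braces in the
# example JSON would need to be escaped as {{ and }}.

Writing and maintaining this by hand for every data structure quickly gets tedious, which is exactly what the next step automates.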
LangChain's PydanticOutputParser can generate that data-structure prompt for us automatically. See the code below.
import os

from langchain.chat_models import init_chat_model
from langchain_core.output_parsers import PydanticOutputParser
from langchain_core.prompts import ChatPromptTemplate
from pydantic import BaseModel, Field

llm = init_chat_model(
    model_provider="openai",
    model="qwen-plus-latest",
    base_url="https://dashscope.aliyuncs.com/compatible-mode/v1",
    api_key=os.environ["QWEN_API_KEY"],
)

class Classification(BaseModel):
    sentiment: str = Field(description="The sentiment of the text")
    aggressiveness: int = Field(
        description="How aggressive the text is on a scale from 1 to 10"
    )
    language: str = Field(description="The language the text is written in")

# The parser derives JSON-format instructions from the Classification schema.
parser = PydanticOutputParser(pydantic_object=Classification)

tagging_prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            "Extract the desired information into JSON format from the following passage.\n"
            "{format_instructions}",
        ),
        ("human", "{input}"),
    ]
).partial(format_instructions=parser.get_format_instructions())

inp = "Estoy increiblemente contento de haberte conocido! Creo que seremos muy buenos amigos!"

# Print the fully formatted prompt to inspect the generated instructions.
print(tagging_prompt.invoke({"input": inp}))

chain = tagging_prompt | llm | parser
response = chain.invoke({"input": inp})
print(response)
The print(tagging_prompt.invoke({"input": inp})) call prints the fully formatted prompt. Everything after the first line is the format instructions that PydanticOutputParser generated from the Classification class:
Extract the desired information into JSON format from the following passage.
The output should be formatted as a JSON instance that conforms to the JSON schema below.
As an example, for the schema {"properties": {"foo": {"title": "Foo", "description": "a list of strings", "type": "array", "items": {"type": "string"}}}, "required": ["foo"]}
the object {"foo": ["bar", "baz"]} is a well-formatted instance of the schema. The object {"properties": {"foo": ["bar", "baz"]}} is not well-formatted.
Here is the output schema:
{
    "properties": {
        "sentiment": {
            "description": "The sentiment of the text",
            "title": "Sentiment",
            "type": "string"
        },
        "aggressiveness": {
            "description": "How aggressive the text is on a scale from 1 to 10",
            "title": "Aggressiveness",
            "type": "integer"
        },
        "language": {
            "description": "The language the text is written in",
            "title": "Language",
            "type": "string"
        }
    },
    "required": [
        "sentiment",
        "aggressiveness",
        "language"
    ]
}
Very convenient: structured data can now be extracted from documents efficiently.
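One caveat: unlike Ollama's structured output, this approach only asks the model to follow the format instructions, so the reply can occasionally be malformed JSON that PydanticOutputParser fails to parse. A minimal sketch of one way to harden the chain, using LangChain's OutputFixingParser (illustrative, not part of the original example; it reuses the parser, llm, tagging_prompt and inp defined above):

from langchain.output_parsers import OutputFixingParser

# If parsing fails, the wrapped parser sends the bad output back to the LLM
# together with the format instructions and asks it to repair the JSON.
fixing_parser = OutputFixingParser.from_llm(parser=parser, llm=llm)

chain = tagging_prompt | llm | fixing_parser
response = chain.invoke({"input": inp})
print(response)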
Thoughts
Judging from the code changes in Ollama 0.5.0, it took Ollama only a small amount of code to support structured output. So when will the Tongyi Qianwen API service follow?