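Huggingface NLP Course Notes 2: The pipeline() Function

The most basic object in the Transformers library is the pipeline() function. It connects a model with the preprocessing and postprocessing steps that model needs, so we can pass in raw text directly and get a readable answer back.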

# Import pipeline
from transformers import pipeline

# Choose the NLP task
classifier = pipeline("sentiment-analysis")

# Call the pipeline on a piece of text
classifier("I've been waiting for a HuggingFace course my whole life.")

The result:

[{'label': 'POSITIVE', 'score': 0.9598047137260437}]
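
To make the three chained stages (preprocessing, model, postprocessing) concrete, here is a toy sketch in plain Python. It is not how the Transformers library is implemented: the word weights, the label order, and every function name below are hypothetical stand-ins, chosen only so the flow of data through the stages is visible.

```python
import math

# Hypothetical toy "model weights": word -> (NEGATIVE logit, POSITIVE logit).
TOY_WEIGHTS = {
    "love": (-2.0, 2.0),
    "great": (-1.5, 1.5),
    "hate": (2.5, -2.5),
    "awful": (2.0, -2.0),
}
LABELS = ["NEGATIVE", "POSITIVE"]


def preprocess(text):
    """Stage 1: turn raw text into tokens (a real tokenizer produces token IDs)."""
    return [word.strip(".,!?").lower() for word in text.split()]


def toy_model(tokens):
    """Stage 2: map the tokens to one raw score (logit) per label."""
    logits = [0.0, 0.0]
    for token in tokens:
        neg, pos = TOY_WEIGHTS.get(token, (0.0, 0.0))
        logits[0] += neg
        logits[1] += pos
    return logits


def postprocess(logits):
    """Stage 3: softmax the logits and report the most likely label."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    best = max(range(len(probs)), key=probs.__getitem__)
    return {"label": LABELS[best], "score": probs[best]}


def toy_pipeline(text):
    """Chain the three stages, mirroring what a pipeline object does in one call."""
    return postprocess(toy_model(preprocess(text)))


print(toy_pipeline("I hate this so much!"))
```

The returned dictionary deliberately has the same `label` / `score` shape as the real output above, which is the part of the design worth noticing: the caller never sees tokens or logits.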

You can also pass a list containing several texts:

classifier(
    ["I've been waiting for a HuggingFace course my whole life.", "I hate this so much!"]
)

The result is returned as a list as well:

[{'label': 'POSITIVE', 'score': 0.9598047137260437},
 {'label': 'NEGATIVE', 'score': 0.9994558095932007}]
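
Because the results come back as a plain list of dictionaries, they are easy to post-process. The sketch below reuses the output shown above and assumes the predictions are in the same order as the input texts; `pair_with_inputs` is a hypothetical helper name, not part of the library.

```python
texts = [
    "I've been waiting for a HuggingFace course my whole life.",
    "I hate this so much!",
]

# Output copied from the notes above (not recomputed here).
results = [
    {"label": "POSITIVE", "score": 0.9598047137260437},
    {"label": "NEGATIVE", "score": 0.9994558095932007},
]


def pair_with_inputs(texts, results):
    """Attach each prediction to the text that produced it."""
    return [
        {"text": text, "label": res["label"], "score": res["score"]}
        for text, res in zip(texts, results)
    ]


paired = pair_with_inputs(texts, results)

# Example: keep only the texts classified as negative.
negatives = [p["text"] for p in paired if p["label"] == "NEGATIVE"]

for p in paired:
    print(f"{p['label']:<8} {p['score']:.4f}  {p['text']}")
```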

Common NLP tasks each have a corresponding task name that can be passed to pipeline():

* feature-extraction (get the vector representation of a text)
* fill-mask
* ner (named entity recognition)
* question-answering
* sentiment-analysis
* summarization
* text-generation
* translation
* zero-shot-classification

Examples of some of these tasks:

Zero-shot classification

# Import pipeline
from transformers import pipeline

# Choose the task: zero-shot classification
classifier = pipeline("zero-shot-classification")
classifier(
    "This is a course about the Transformers library",
    candidate_labels=["education", "politics", "business"],
)

The result contains a probability for each candidate label:

{'sequence': 'This is a course about the Transformers library',
 'labels': ['education', 'business', 'politics'],
 'scores': [0.8445963859558105, 0.111976258456707, 0.043427448719739914]}
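
The `labels` and `scores` lists in this output are aligned by position, so the top prediction can be read off by zipping them. A minimal sketch using the output above as literal data; the variable names are illustrative only.

```python
# Output copied from the notes above (not recomputed here).
result = {
    "sequence": "This is a course about the Transformers library",
    "labels": ["education", "business", "politics"],
    "scores": [0.8445963859558105, 0.111976258456707, 0.043427448719739914],
}

# Pair each label with its score, then pick the highest-scoring pair.
label_scores = dict(zip(result["labels"], result["scores"]))
top_label = max(label_scores, key=label_scores.get)

print(f"Top label: {top_label} ({label_scores[top_label]:.1%})")

# In this output the scores form a distribution over the candidate labels.
print(f"Sum of scores: {sum(result['scores']):.6f}")
```

Here the three scores add up to about 1, which means the labels compete with each other: raising one candidate's probability lowers the others.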

Text generation

from transformers import pipeline

# Choose the task: text generation
generator = pipeline("text-generation")
generator("In this course, we will teach you how to")

# Generated result
[{'generated_text': 'In this course, we will teach you how to understand and use '
                    'data flow and data interchange when handling user data. We '
                    'will be working with one or more of the most commonly used '
                    'data flows — data flows of various types, as seen by the '
                    'HTTP'}]
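
Note that `generated_text` in the output above begins with the prompt itself. If only the newly generated part is wanted, the prompt can be stripped off. A small sketch, where `continuation_only` is a hypothetical helper and the sample string is a shortened copy of the output above:

```python
def continuation_only(prompt, generated_text):
    """Return the text after the prompt, or the full text if it does not start with the prompt."""
    if generated_text.startswith(prompt):
        return generated_text[len(prompt):].lstrip()
    return generated_text


prompt = "In this course, we will teach you how to"

# Shortened copy of the generated text shown above.
output = [
    {
        "generated_text": "In this course, we will teach you how to understand "
        "and use data flow and data interchange when handling user data."
    }
]

for item in output:
    print(continuation_only(prompt, item["generated_text"]))
```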

Using other models from the Hub in a pipeline

Pass a model name to use a different model from the Hub:

from transformers import pipeline

generator = pipeline("text-generation", model="distilgpt2")
generator(
    "In this course, we will teach you how to",
    max_length=30,
    num_return_sequences=2,
)

Many other tasks are available in the same way:

Mask filling

unmasker = pipeline("fill-mask")
# Predict the token for <mask> with its probability; top_k sets how many candidates to keep
unmasker("This course will teach you all about <mask> models.", top_k=2)

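Named entity recognition

# grouped_entities=True merges the sub-word pieces that belong to the same entity
ner = pipeline("ner", grouped_entities=True)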

Question answering

question_answerer = pipeline("question-answering")
# Provide a question and the context to search for the answer
question_answerer(
    question="Where do I work?",
    context="My name is Sylvain and I work at Hugging Face in Brooklyn",
)
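
This kind of question answering is extractive: the answer is a span taken from the context rather than newly written text. The sketch below does not run a model. It hand-builds a dictionary with `answer`, `start`, and `end` fields, on the assumption that the pipeline reports the answer together with character offsets into the context, and shows how those offsets relate to the context string. The answer "Hugging Face" is the expected one for this question, not an output taken from these notes.

```python
context = "My name is Sylvain and I work at Hugging Face in Brooklyn"

# Assumed answer for "Where do I work?"; a real run would get this from the model.
answer = "Hugging Face"

# Locate the answer span by character position in the context.
start = context.find(answer)
end = start + len(answer)

# Hand-built stand-in for a result (no confidence score, since no model was run).
mock_result = {"answer": answer, "start": start, "end": end}

# Slicing the context with the offsets recovers the answer text.
recovered = context[mock_result["start"]:mock_result["end"]]
print(mock_result, "->", repr(recovered))
```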

Summarization

summarizer = pipeline("summarization")
summarizer("some text to summarize")

Translation

# Specify a particular model
translator = pipeline("translation", model="Helsinki-NLP/opus-mt-fr-en")
translator("Ce cours est produit par Hugging Face.")

Reference: https://huggingface.co/learn/nlp-course/zh-CN/chapter1/3?fw=pt


bingo彬哥
