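The most basic object in the Transformers library is the pipeline() function. It connects a model with the preprocessing and postprocessing steps it needs, so we can pass in raw text directly and get back a readable answer.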
# Import pipeline
from transformers import pipeline
# Select the NLP task (a default pretrained model is loaded for it)
classifier = pipeline("sentiment-analysis")
# Call the pipeline on a text
classifier("I've been waiting for a HuggingFace course my whole life.")
The result:
[{'label': 'POSITIVE', 'score': 0.9598047137260437}]
You can also pass a list containing several texts:
classifier(
["I've been waiting for a HuggingFace course my whole life.", "I hate this so much!"]
)
The result is then a list as well, with one entry per input text:
[{'label': 'POSITIVE', 'score': 0.9598047137260437},
{'label': 'NEGATIVE', 'score': 0.9994558095932007}]
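To make the "preprocessing, model, postprocessing" chain more concrete, here is a toy sketch in plain Python. It is not how the library implements pipeline(); the word weights and the function names (`preprocess`, `model`, `postprocess`, `toy_pipeline`) are made up for illustration. It only mirrors the shape of the calls above: one string or a list of strings goes in, a list of label/score dicts comes out.

```python
import math

# Hypothetical word weights standing in for a trained model (not real parameters).
TOY_WEIGHTS = {"hate": -3.0, "love": 3.0, "waiting": 0.5, "great": 2.0}


def preprocess(text):
    """Stand-in for tokenization: lowercase, split on spaces, trim punctuation."""
    return [word.strip(".,!?'\"") for word in text.lower().split()]


def model(tokens):
    """Stand-in for the forward pass: return [negative, positive] logits."""
    score = sum(TOY_WEIGHTS.get(token, 0.0) for token in tokens)
    return [-score, score]


def postprocess(logits):
    """Turn logits into probabilities with softmax and keep the best label."""
    top = max(logits)
    exps = [math.exp(x - top) for x in logits]
    probs = [e / sum(exps) for e in exps]
    best = max(range(len(probs)), key=probs.__getitem__)
    return {"label": ["NEGATIVE", "POSITIVE"][best], "score": probs[best]}


def toy_pipeline(inputs):
    """Accept one string or a list of strings; always return a list of dicts."""
    texts = [inputs] if isinstance(inputs, str) else list(inputs)
    return [postprocess(model(preprocess(text))) for text in texts]


results = toy_pipeline(
    ["I've been waiting for a HuggingFace course my whole life.", "I hate this so much!"]
)
for item in results:
    print(item["label"], round(item["score"], 3))
```

The scores from this toy version are not comparable to the real model's scores; only the three-stage structure and the input/output shapes carry over.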
Common NLP tasks each have a corresponding task identifier that you pass to pipeline():
* feature-extraction (get the vector representation of a text)
* fill-mask
* ner (named entity recognition)
* question-answering
* sentiment-analysis
* summarization
* text-generation
* translation
* zero-shot-classification
Examples of some of these tasks:
- Zero-shot classification
# Import pipeline
from transformers import pipeline
# Select the task: zero-shot classification
classifier = pipeline("zero-shot-classification")
classifier(
"This is a course about the Transformers library",
candidate_labels=["education", "politics", "business"],
)
The result gives a probability for each candidate label, sorted from most to least likely:
{'sequence': 'This is a course about the Transformers library',
'labels': ['education', 'business', 'politics'],
'scores': [0.8445963859558105, 0.111976258456707, 0.043427448719739914]}
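Because labels and scores are matched by position, reading the result usually means zipping the two lists together. The sketch below assumes a result dict with exactly the shape shown above and copies its values; `labels_above` is a hypothetical helper name, not part of the library.

```python
# Result copied from the zero-shot example above.
result = {
    "sequence": "This is a course about the Transformers library",
    "labels": ["education", "business", "politics"],
    "scores": [0.8445963859558105, 0.111976258456707, 0.043427448719739914],
}


def labels_above(result, threshold):
    """Pair each label with its score and keep those at or above the threshold."""
    pairs = zip(result["labels"], result["scores"])
    return [(label, score) for label, score in pairs if score >= threshold]


# Pick the single most likely label without relying on the list order.
best_label, best_score = max(
    zip(result["labels"], result["scores"]), key=lambda pair: pair[1]
)

print(best_label, round(best_score, 3))
print(labels_above(result, 0.1))
```

In this example the three scores add up to roughly 1, so a threshold can be read as a share of the total probability.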
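- Text generation
from transformers import pipeline
# Select the task: text generation
generator = pipeline("text-generation")
generator("In this course, we will teach you how to")
# Generated result (generation involves randomness, so your output will likely differ)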
[{'generated_text': 'In this course, we will teach you how to understand and use '
'data flow and data interchange when handling user data. We '
'will be working with one or more of the most commonly used '
'data flows — data flows of various types, as seen by the '
'HTTP'}]
- Using other models from the Hub in a pipeline
Pass a model name to use a model from the Hub other than the default:
from transformers import pipeline
generator = pipeline("text-generation", model="distilgpt2")
generator(
"In this course, we will teach you how to",
max_length=30,
num_return_sequences=2,
)
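With num_return_sequences=2 the call should return two entries rather than one. The sketch below assumes the output keeps the same list-of-dicts shape as the earlier text-generation result, where each 'generated_text' begins with the prompt. `continuations` is a hypothetical helper, and the sample data is a shortened copy of the earlier output, not a fresh run of distilgpt2.

```python
def continuations(prompt, outputs):
    """Return only the newly generated part of each returned sequence."""
    texts = []
    for item in outputs:
        text = item["generated_text"]
        # Drop the prompt when the sequence starts with it; otherwise keep it whole.
        texts.append(text[len(prompt):] if text.startswith(prompt) else text)
    return texts


prompt = "In this course, we will teach you how to"

# Shortened copy of the generated result shown earlier in this post.
sample_outputs = [
    {
        "generated_text": "In this course, we will teach you how to understand and use "
        "data flow and data interchange when handling user data."
    }
]

for text in continuations(prompt, sample_outputs):
    print(repr(text))
```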
There are many other tasks as well:
- Mask filling
unmasker = pipeline("fill-mask")
# Predict candidates for <mask> with their probabilities; top_k sets how many to return
unmasker("This course will teach you all about <mask> models.", top_k=2)
- Named entity recognition
# grouped_entities=True regroups the tokens that belong to the same entity
ner = pipeline("ner", grouped_entities=True)
- Question answering
question_answerer = pipeline("question-answering")
# Provide a question and a context
question_answerer(
    question="Where do I work?",
    context="My name is Sylvain and I work at Hugging Face in Brooklyn",
)
- Summarization
summarizer = pipeline("summarization")
summarizer("some text to summarize")
- Translation
# Specify a particular model
translator = pipeline("translation", model="Helsinki-NLP/opus-mt-fr-en")
translator("Ce cours est produit par Hugging Face.")
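One detail about the question-answering example above: that pipeline extracts its answer from the context rather than generating new text, and the result includes character offsets into the context. The sketch below assumes a result dict with 'answer', 'start' and 'end' keys. The dict is built by hand from the example context instead of running a model, so the score field is left out, and `answer_with_window` is a hypothetical helper.

```python
context = "My name is Sylvain and I work at Hugging Face in Brooklyn"

# Hand-built stand-in for a question-answering result (no model was run here).
answer = "Hugging Face"
start = context.index(answer)
result = {"answer": answer, "start": start, "end": start + len(answer)}


def answer_with_window(context, result, width=10):
    """Return the answer span plus up to `width` characters on each side."""
    lo = max(0, result["start"] - width)
    hi = min(len(context), result["end"] + width)
    return context[lo:hi]


# The offsets index directly into the context string.
print(context[result["start"]:result["end"]])
print(answer_with_window(context, result))
```

Slicing the context with the offsets is a quick way to show the answer together with its surrounding text.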
Reference: https://huggingface.co/learn/nlp-course/zh-CN/chapter1/3?fw=pt