
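The most basic object in the Transformers library is the pipeline() function. It connects a model with its necessary preprocessing and postprocessing steps, so we can pass in any text directly and get back a readable answer.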

# Import pipeline
from transformers import pipeline

# Specify the NLP task
classifier = pipeline("sentiment-analysis")

# Pass in a text to run the classifier
classifier("I've been waiting for a HuggingFace course my whole life.")

The result:

[{'label': 'POSITIVE', 'score': 0.9598047137260437}]
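
The score in this output is a probability. As a rough illustration of what a postprocessing step can look like for a two-class sentiment model, the sketch below applies a softmax to a pair of raw model scores (logits) and picks the most likely label. The logit values and the `ID2LABEL` mapping are made up for illustration; they are not taken from the actual model, and this is not the library's internal code.

```python
import math

# Hypothetical label mapping for a two-class sentiment model (an assumption).
ID2LABEL = {0: "NEGATIVE", 1: "POSITIVE"}


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    # Subtract the maximum for numerical stability.
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def postprocess(logits):
    """Return the best label and its probability, in the pipeline's output shape."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return {"label": ID2LABEL[best], "score": probs[best]}


# Made-up logits, chosen only to illustrate the computation.
example = postprocess([-1.5, 1.6])
print(example)
```

The real pipeline also handles tokenization and batching; only the final "logits to label and score" step is sketched here.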

You can also pass a list containing several texts:

classifier(
    ["I've been waiting for a HuggingFace course my whole life.", "I hate this so much!"]
)

The results come back as a list too, one entry per input:

[{'label': 'POSITIVE', 'score': 0.9598047137260437},
 {'label': 'NEGATIVE', 'score': 0.9994558095932007}]
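
Because the pipeline returns one result per input, in the same order, the two lists can be paired with `zip`. A minimal sketch follows, reusing the texts and scores shown above as literal data so that it runs without downloading a model; the helper name `pair_results` is hypothetical.

```python
def pair_results(texts, results):
    """Attach each input text to its classification result."""
    if len(texts) != len(results):
        raise ValueError("expected one result per input text")
    return [
        {"text": text, "label": res["label"], "score": res["score"]}
        for text, res in zip(texts, results)
    ]


texts = [
    "I've been waiting for a HuggingFace course my whole life.",
    "I hate this so much!",
]
# Results copied from the output above, standing in for classifier(texts).
results = [
    {"label": "POSITIVE", "score": 0.9598047137260437},
    {"label": "NEGATIVE", "score": 0.9994558095932007},
]

paired = pair_results(texts, results)
for row in paired:
    print(f"{row['label']:<8} {row['score']:.4f}  {row['text']}")
```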

Common NLP tasks each have a corresponding task name that can be passed to pipeline():

* feature-extraction (get the vector representation of a text)
* fill-mask
* ner (named entity recognition)
* question-answering
* sentiment-analysis
* summarization
* text-generation
* translation
* zero-shot-classification

Examples of some of these tasks:

  • Zero-shot classification
# Import pipeline
from transformers import pipeline

# Specify the task: zero-shot classification
classifier = pipeline("zero-shot-classification")
classifier(
    "This is a course about the Transformers library",
    candidate_labels=["education", "politics", "business"],
)

The result contains a probability for each candidate label:

{'sequence': 'This is a course about the Transformers library',
 'labels': ['education', 'business', 'politics'],
 'scores': [0.8445963859558105, 0.111976258456707, 0.043427448719739914]}
  • Text generation
from transformers import pipeline

# Specify the task: text generation
generator = pipeline("text-generation")
generator("In this course, we will teach you how to")
# Generated result
[{'generated_text': 'In this course, we will teach you how to understand and use '
                    'data flow and data interchange when handling user data. We '
                    'will be working with one or more of the most commonly used '
                    'data flows — data flows of various types, as seen by the '
                    'HTTP'}]
  • Using other models from the Hub in a pipeline

Pass a model name to use a different model from the Hub:

from transformers import pipeline

generator = pipeline("text-generation", model="distilgpt2")
generator(
    "In this course, we will teach you how to",
    max_length=30,
    num_return_sequences=2,
)
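
The result structures shown in the examples above are plain Python dictionaries and lists, so they can be unpacked without any library support. The sketch below ranks the zero-shot labels by score and strips the prompt from a generated text. It reuses output copied from the examples above as literal data (the generated text is shortened), and the helper names `rank_labels` and `continuations` are hypothetical.

```python
def rank_labels(result):
    """Pair each candidate label with its score, highest score first."""
    pairs = zip(result["labels"], result["scores"])
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)


def continuations(prompt, outputs):
    """Return only the newly generated part of each output."""
    texts = []
    for item in outputs:
        text = item["generated_text"]
        # The generated text normally begins with the prompt itself.
        if text.startswith(prompt):
            text = text[len(prompt):]
        texts.append(text.strip())
    return texts


# Zero-shot result copied from the example above.
zero_shot_result = {
    "sequence": "This is a course about the Transformers library",
    "labels": ["education", "business", "politics"],
    "scores": [0.8445963859558105, 0.111976258456707, 0.043427448719739914],
}
ranked = rank_labels(zero_shot_result)
print(ranked[0])

# Shortened stand-in for the text-generation output shown above.
prompt = "In this course, we will teach you how to"
generated = [{"generated_text": prompt + " understand and use data flow"}]
print(continuations(prompt, generated))
```

With `num_return_sequences=2`, the output list would hold two such dictionaries, and `continuations` handles them the same way.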

Many other tasks are available as well:

  • Mask filling

    unmasker = pipeline("fill-mask")
    # Predict the most likely tokens for <mask> with their probabilities, keeping the top_k candidates
    unmasker("This course will teach you all about <mask> models.", top_k=2)
  • Named entity recognition

    ner = pipeline("ner", grouped_entities=True)
  • Question answering

    question_answerer = pipeline("question-answering")
    # Provide the question and the context
    question_answerer(
        question="Where do I work?",
        context="My name is Sylvain and I work at Hugging Face in Brooklyn",
    )
  • Summarization

    summarizer = pipeline("summarization")
    summarizer("some text to summarize")
  • Translation

    # Specify a particular model
    translator = pipeline("translation", model="Helsinki-NLP/opus-mt-fr-en")
    translator("Ce cours est produit par Hugging Face.")
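
For the question-answering example above, the pipeline returns a dictionary that contains the answer text together with `start` and `end` character offsets into the context. The sketch below shows how those offsets relate to the context string. The `result` literal is a hand-written stand-in for illustration (its score in particular is made up), so treat it as an assumption rather than real model output.

```python
context = "My name is Sylvain and I work at Hugging Face in Brooklyn"

# Hand-written stand-in for the pipeline's return value (an assumption).
answer = "Hugging Face"
start = context.find(answer)
result = {
    "score": 0.5,  # made-up confidence value
    "start": start,
    "end": start + len(answer),
    "answer": answer,
}


def answer_span(context, result):
    """Recover the answer text from the character offsets."""
    return context[result["start"]:result["end"]]


print(answer_span(context, result))
```

The answer is extracted from the context and not generated, which is why it can always be located by its offsets.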

Reference: https://huggingface.co/learn/nlp-course/zh-CN/chapter1/3?fw=pt


bingo彬哥