Abstract: This article is a preliminary reading of the ACL 2021 NER paper on a BERT-based hidden Markov model for multi-source weakly supervised named entity recognition.
This article is shared from the Huawei Cloud Community " ACL2021 NER | BERT-based Hidden Markov Model for Multi-source Weakly Supervised Named Entity Recognition ", author: JuTzungKuei.
Paper: Li Yinghao, Shetty Pranav, Liu Lucas, Zhang Chao, Song Le. BERTifying the Hidden Markov Model for Multi-Source Weakly Supervised Named Entity Recognition. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), pages 6178–6190. Online: Association for Computational Linguistics, 2021.
Link: https://aclanthology.org/2021.acl-long.482.pdf
Code: https://github.com/Yinghao-Li/CHMM-ALT
0. Summary
- Research focus: learning NER from noisy labels produced by multiple weak supervision sources
- The noisy labels are incomplete, inaccurate, and often contradict one another
Proposes a conditional hidden Markov model (CHMM)
- Uses BERT's contextual representations to enhance the classic HMM
- Learns token-level transition and emission probabilities from BERT embeddings and infers the latent true labels (a minimal sketch follows this list)
An alternate-training scheme (CHMM-ALT) further improves CHMM
- The labels inferred by CHMM are used to fine-tune a BERT-NER model
- The output of BERT-NER is then fed back as an additional weak source for training CHMM
State-of-the-art results on four benchmark datasets
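The core idea of CHMM, predicting the HMM's transition and emission probabilities from BERT embeddings instead of keeping them fixed, can be illustrated with a minimal sketch. This is not the authors' implementation; the class name, the plain linear heads, and the tensor shapes are assumptions made for illustration only.

```python
# Minimal sketch (not the authors' code) of CHMM's "conditional" idea:
# per-token transition and emission matrices predicted from BERT embeddings.
import torch
import torch.nn as nn

class ChmmHeadsSketch(nn.Module):
    def __init__(self, hidden_dim: int, num_labels: int, num_sources: int):
        super().__init__()
        self.num_labels = num_labels
        self.num_sources = num_sources
        # Maps a token's BERT embedding to a label-transition matrix P(z_t | z_{t-1}).
        self.transition_head = nn.Linear(hidden_dim, num_labels * num_labels)
        # Maps it to one emission matrix per weak source: P(x_{k,t} | z_t).
        self.emission_head = nn.Linear(hidden_dim, num_sources * num_labels * num_labels)

    def forward(self, bert_embeddings: torch.Tensor):
        # bert_embeddings: (batch, seq_len, hidden_dim)
        b, t, _ = bert_embeddings.shape
        trans = self.transition_head(bert_embeddings)
        trans = trans.view(b, t, self.num_labels, self.num_labels).softmax(dim=-1)
        emit = self.emission_head(bert_embeddings)
        emit = emit.view(b, t, self.num_sources, self.num_labels, self.num_labels).softmax(dim=-1)
        return trans, emit  # row-normalized probability matrices for every token
```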
1. Introduction
NER is a fundamental task underlying many downstream information-extraction applications: event extraction, relation extraction, question answering
- Supervised NER requires large amounts of labeled data
- Many domains already have knowledge sources: knowledge bases, domain dictionaries, labeling rules
- These sources can be matched against raw corpora to quickly generate large-scale noisy training data from multiple angles (see the toy example after these bullets)
- Distantly supervised NER uses only a knowledge base as weak supervision, ignoring the complementary information that multi-source annotation provides
- Existing HMM-based aggregation methods are limited: they rely on one-hot word vectors or do not model word context at all
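To make "incomplete, inaccurate, contradictory" concrete, here is a hypothetical example of what several weak sources might produce for one sentence; the sentence, source names, and tags are invented for illustration.

```python
# Hypothetical multi-source weak labels for one sentence (BIO scheme).
tokens = ["Barack", "Obama", "visited", "Apple", "headquarters"]

weak_labels = {
    # A person gazetteer finds the PER span but knows nothing about ORG (incomplete).
    "person_gazetteer":    ["B-PER",  "I-PER",  "O", "O",      "O"],
    # A knowledge-base matcher finds the ORG span but misses the PER span (incomplete).
    "org_knowledge_base":  ["O",      "O",      "O", "B-ORG",  "O"],
    # A capitalization rule tags every capitalized token with a generic type
    # (inaccurate, and it contradicts the other sources on entity types).
    "capitalization_rule": ["B-MISC", "I-MISC", "O", "B-MISC", "O"],
}

# A label aggregator has to recover the latent true sequence,
# e.g. ["B-PER", "I-PER", "O", "B-ORG", "O"], from these noisy views.
```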
Contributions:
- CHMM: aggregates weak labels from multiple sources
- Alternate training (CHMM-ALT): trains CHMM and BERT-NER in turn over several rounds, each consuming the other's output, to optimize multi-source weakly supervised NER
State-of-the-art results on four benchmark datasets
2. Method
CHMM-ALT trains two models, the multi-source label aggregator CHMM and a BERT-NER model, which take turns consuming each other's output (a high-level sketch of this loop follows the bullets below)
- Stage I: CHMM infers denoised labels $y^{*(1:T)}$ from the $K$ weak sources $x_{1:K}^{(1:T)}$; these labels are used to fine-tune the BERT-NER model, whose output $\widetilde{y}^{(1:T)}$ is then added to the weak-label set as an extra source: $x_{1:K+1}^{(1:T)} = \{x_{1:K}^{(1:T)}, \widetilde{y}^{(1:T)}\}$
- Stage II: CHMM and BERT-NER improve each other over several rounds; in each round CHMM is trained first, then BERT-NER is fine-tuned, and CHMM's input is updated with BERT-NER's new predictions
- CHMM mainly improves precision, while BERT-NER improves recall
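The two stages above form an alternation loop. The sketch below captures only the control flow; the helper callables (train_chmm, chmm_decode, finetune_bert_ner, bert_predict) are hypothetical placeholders supplied by the caller, not functions from the CHMM-ALT repository.

```python
from typing import Callable, List, Sequence

def chmm_alt_sketch(
    weak_sources: List[Sequence],    # the K original weak-label sequences
    corpus: Sequence,                # the unlabeled sentences
    train_chmm: Callable,            # (sources, corpus) -> CHMM model
    chmm_decode: Callable,           # (chmm, sources, corpus) -> denoised labels y*
    finetune_bert_ner: Callable,     # (corpus, labels) -> BERT-NER model
    bert_predict: Callable,          # (bert_ner, corpus) -> predicted labels ~y
    num_rounds: int = 3,
):
    # Stage I: CHMM denoises the K weak sources, its labels fine-tune BERT-NER,
    # and BERT-NER's predictions join the pool as the (K+1)-th weak source.
    chmm = train_chmm(weak_sources, corpus)
    denoised = chmm_decode(chmm, weak_sources, corpus)
    bert_ner = finetune_bert_ner(corpus, denoised)
    sources = list(weak_sources) + [bert_predict(bert_ner, corpus)]

    # Stage II: the two models alternate for several rounds, each round
    # refreshing the extra source with BERT-NER's latest predictions.
    for _ in range(num_rounds):
        chmm = train_chmm(sources, corpus)
        denoised = chmm_decode(chmm, sources, corpus)
        bert_ner = finetune_bert_ner(corpus, denoised)
        sources[-1] = bert_predict(bert_ner, corpus)

    return chmm, bert_ner
```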
Hidden Markov Model
- A classic HMM keeps fixed transition and emission matrices; CHMM instead predicts token-specific matrices from each token's BERT embedding
- The latent true label sequence is then inferred with the standard forward-backward / Viterbi machinery (a minimal forward-recursion sketch follows)
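For readers unfamiliar with HMM inference, the snippet below shows the scaled forward recursion with a different transition matrix at every token, which is the basic routine such a model uses to score an observed weak-label sequence. It is a generic textbook computation under assumed shapes, not the paper's exact training objective.

```python
import numpy as np

def forward_loglik(init: np.ndarray, trans: np.ndarray, emit_probs: np.ndarray) -> float:
    """Scaled forward algorithm with token-wise transition matrices.

    init:       (L,)      initial distribution over hidden labels
    trans:      (T, L, L) per-token transition matrices, rows sum to 1
    emit_probs: (T, L)    P(observed weak tag at t | hidden label), already looked up
    Returns the log-likelihood of the observed tag sequence.
    """
    alpha = init * emit_probs[0]
    scale = alpha.sum()
    loglik = np.log(scale)
    alpha /= scale
    for t in range(1, emit_probs.shape[0]):
        alpha = (alpha @ trans[t]) * emit_probs[t]  # propagate, then weight by emission
        scale = alpha.sum()
        loglik += np.log(scale)
        alpha /= scale                              # rescale to avoid numerical underflow
    return float(loglik)

# Tiny usage example with random, row-normalized parameters: T = 4 tokens, L = 3 labels.
rng = np.random.default_rng(0)
T, L = 4, 3
trans = rng.random((T, L, L)); trans /= trans.sum(axis=-1, keepdims=True)
emit_probs = rng.random((T, L))
init = np.full(L, 1.0 / L)
print(forward_loglik(init, trans, emit_probs))
```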
3. Results