Perplexity Explained

喔的天

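Perplexity?? Even if it is useless, you still have to use it.
I open with such a hot take for a reason. In topic model evaluation today, perplexity is still a mainstream metric. There are alternatives such as held-out log-likelihood (from "Crowdsourced Time-sync Video tagging using temporal and personalized topic modeling") and EL (empirical likelihood), but in my view none of those has a really practical, reasonable interpretation. Still, this is how academia plays the game, so when in Rome, do as the Romans do.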

Wikipedia describes three variants. Below is my brief translation; skip it if you are not interested.

Wikipedia link

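In information theory, perplexity measures how well a probability model or probability distribution predicts a sample, so it can be used to judge how good a model is.
There are three variants:

Perplexity of a probability distribution
Perplexity of a probability model
Perplexity per word (this is the one we use below)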
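To make the first two variants concrete, here is a minimal sketch using the standard information-theoretic definitions: the perplexity of a distribution p is 2 raised to its entropy in bits, and the perplexity of a model q on N samples is 2 raised to the average negative log2-probability that q gives those samples. The function names and the dice numbers are made up for illustration.

```python
import math


def distribution_perplexity(probs):
    """Perplexity of a discrete distribution: 2 ** H(p), with H in bits."""
    entropy_bits = -sum(p * math.log2(p) for p in probs if p > 0)
    return 2 ** entropy_bits


def model_perplexity(sample_probs):
    """Perplexity of a model, given the probability it assigns to each sample."""
    avg_neg_log2 = -sum(math.log2(q) for q in sample_probs) / len(sample_probs)
    return 2 ** avg_neg_log2


fair_die = [1 / 6] * 6
loaded_die = [0.5, 0.1, 0.1, 0.1, 0.1, 0.1]

print(distribution_perplexity(fair_die))    # approximately 6
print(distribution_perplexity(loaded_die))  # lower than the fair die
print(model_perplexity([0.25] * 4))         # approximately 4
```

A fair six-sided die is "as confusing" as a uniform choice among six outcomes, which is why its perplexity comes out at about 6. Per-word perplexity is the model version with the number of words as N.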

Main Text

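The perplexity covered in this post is the most basic kind. The formula is:

$$\mathrm{perplexity}(D_{\mathrm{test}}) = \exp\left\{-\frac{\sum_{d=1}^{M} \log p(\mathbf{w}_d)}{\sum_{d=1}^{M} N_d}\right\}$$

Here $M$ is the number of documents, $\mathbf{w}_d$ denotes the words of document $d$, and $N_d$ is the number of words in document $d$.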
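One way to build intuition for the per-word form (the exponential of the average negative log-probability per word) is a sanity check: a model that spreads its probability uniformly over a vocabulary of V words should score a perplexity of V. The sketch below assumes we already have one probability per word token; the helper name and the numbers are invented for illustration.

```python
import math


def per_word_perplexity(word_probs):
    """exp of the average negative log-probability over all word tokens."""
    total_neg_log = sum(-math.log(p) for p in word_probs)
    return math.exp(total_neg_log / len(word_probs))


V = 1000  # hypothetical vocabulary size

# A model that knows nothing: every token gets probability 1 / V
uniform = per_word_perplexity([1 / V] * 50)

# A model that assigns higher probability to the tokens it actually sees
better = per_word_perplexity([0.01, 0.02, 0.005, 0.05])

print(uniform)  # approximately 1000, i.e. the vocabulary size
print(better)   # much lower than 1000
```

Lower is better: the second model behaves as if it were choosing among far fewer than 1000 equally likely words.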

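The computation is simple as well: for every word that appears in the training set, look up its assigned topic in tassign, then read p(w) from the phi matrix. That is the value inside the log in the formula above.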
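The lookup described above can be sketched on a toy model held in memory. This is only an illustration under some assumptions: phi is indexed as phi[topic][word], each tassign token has the form word_id:topic_id, and the sum runs over every token occurrence. The function name and all numbers are made up.

```python
import math

import numpy as np


def perplexity_from_assignments(phi, tassign_text):
    """phi: (num_topics, vocab_size) array; tassign_text: 'word_id:topic_id' tokens."""
    tokens = tassign_text.split()
    neg_log_sum = 0.0
    for token in tokens:
        word_id, topic_id = (int(part) for part in token.split(":"))
        # p(w) for this token is the phi entry of its assigned topic
        neg_log_sum += -math.log(phi[topic_id][word_id])
    # Divide by the total number of tokens, then exponentiate
    return math.exp(neg_log_sum / len(tokens))


# Toy model (made-up numbers): 2 topics, 3 vocabulary words
toy_phi = np.array([[0.5, 0.25, 0.25],
                    [0.1, 0.1, 0.8]])
# Two toy documents, one per line
toy_tassign = "0:0 1:0 2:1\n2:1 0:0"

result = perplexity_from_assignments(toy_phi, toy_tassign)
print(result)
```

The five tokens have probabilities 0.5, 0.25, 0.8, 0.8 and 0.5, whose product is 0.04, so the result is 0.04 ** (-1/5), roughly 1.9.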

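One last note: usually you do not report a single perplexity value. Instead, you plot the results obtained with different numbers of topics and inspect the curve. My code below only computes the result for topic=10; if you need several, just wrap it in a for loop.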
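The for loop could look roughly like the sketch below. It assumes one trained model per topic count, stored in a hypothetical directory layout base_dir/k10, base_dir/k20 and so on, each containing model-final.phi and model-final.tassign. Both helper names and the layout are my own assumptions, not something any LDA tool prescribes.

```python
import math
import os

import numpy as np


def perplexity_for_model(model_dir):
    """Per-word perplexity for one trained model stored in model_dir."""
    phi = np.loadtxt(os.path.join(model_dir, "model-final.phi"), ndmin=2)
    with open(os.path.join(model_dir, "model-final.tassign")) as fh:
        tokens = fh.read().split()
    neg_log_sum = 0.0
    for token in tokens:
        word_id, topic_id = (int(part) for part in token.split(":"))
        neg_log_sum += -math.log(phi[topic_id][word_id])
    return math.exp(neg_log_sum / len(tokens))


def perplexity_curve(base_dir, topic_counts):
    """Return (topic counts, perplexities), ready to be plotted."""
    ks, values = [], []
    for k in topic_counts:
        # Hypothetical layout: base_dir/k10, base_dir/k20, ...
        model_dir = os.path.join(base_dir, f"k{k}")
        ks.append(k)
        values.append(perplexity_for_model(model_dir))
    return ks, values
```

The two returned lists can be handed to a plotting function to get the topic-count versus perplexity curve, for example `ks, values = perplexity_curve("test_data", [10, 20, 30])`.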

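A quick explanation: tassign and phi are both produced by a conventional LDA implementation and saved as files with those extensions. Everyone should be familiar with this already.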
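For readers who have not seen these files, here is a small sketch of the layout I am assuming: phi is a whitespace-separated matrix with one row per topic and one column per vocabulary word, and tassign has one document per line, each token written as word_id:topic_id. The one-document-per-line part is an assumption, so check it against the files your LDA tool writes. The helper name and the toy contents are made up.

```python
import os
import tempfile

import numpy as np


def load_model_files(phi_path, tassign_path):
    """Load phi as a 2-D array and tassign as a list of (word_id, topic_id) lists."""
    phi = np.loadtxt(phi_path, ndmin=2)
    docs = []
    with open(tassign_path) as fh:
        for line in fh:
            pairs = [tuple(int(x) for x in tok.split(":")) for tok in line.split()]
            if pairs:  # skip blank lines
                docs.append(pairs)
    return phi, docs


# Write two tiny toy files, then read them back
with tempfile.TemporaryDirectory() as tmp:
    phi_path = os.path.join(tmp, "model-final.phi")
    tassign_path = os.path.join(tmp, "model-final.tassign")
    with open(phi_path, "w") as fh:
        fh.write("0.5 0.25 0.25\n0.1 0.1 0.8\n")
    with open(tassign_path, "w") as fh:
        fh.write("0:0 1:0 2:1\n2:1 0:0\n")
    phi, docs = load_model_files(phi_path, tassign_path)

print(phi.shape)  # (2, 3)
print(docs)       # [[(0, 0), (1, 0), (2, 1)], [(2, 1), (0, 0)]]
```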

import math

import matplotlib.pyplot as plt
import numpy as np


def f_testset_word_count(testset):
    '''Return the number of words in testset, which is the denominator of the perplexity formula.'''
    return len(testset.split())


def graph_draw(topic, perplexity):
    '''Draw a line chart of perplexity against the number of topics.'''
    plt.plot(topic, perplexity, marker="*", color="red", linewidth=2)
    plt.xlabel("Number of Topics")
    plt.ylabel("Perplexity")
    plt.show()


# Topic-word matrix, indexed as phi[topic][word]
phi = np.loadtxt('test_data/model-final.phi')
with open('test_data/model-final.tassign') as f:
    content = f.read()
patterns = content.split()  # each token looks like "word_id:topic_id"
testset_word_count = f_testset_word_count(content)

# Lists used when looping over several topic counts
_topic = []
perplexity_list = []

_topic.append(10)

# duishu: running sum of -log p(w), accumulated over every token so that the
# numerator covers the same tokens that the denominator counts
duishu = 0.0
for pattern in patterns:
    word, topic = (int(x) for x in pattern.split(':'))
    duishu += -math.log(phi[topic][word])
# kuohaoli: the value inside the braces of the formula
kuohaoli = duishu / testset_word_count
perplexity = math.exp(kuohaoli)
perplexity_list.append(perplexity)

graph_draw(_topic, perplexity_list)

jasperyang
