Build a Local RAG System with DeepSeek R1 and Ollama and Ask Your PDFs Questions

🕙 February 2, 2025

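Ever wished you could put questions directly to a PDF or a technical manual? This guide shows how to build a retrieval-augmented generation (RAG) system with DeepSeek R1, an open-source reasoning model, and Ollama, a lightweight framework for running AI models locally.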

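Why DeepSeek R1?

DeepSeek R1 is reported to rival a comparable OpenAI model at roughly 95% lower cost, which makes it an attractive foundation for RAG systems. Developers favor it mainly for:

Focused retrieval: each answer draws on only 3 document chunks.
Strict prompting: when the model is unsure, it replies "I don't know" instead of hallucinating.
Local execution: no cloud API latency.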

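Tools You Need to Build a Local RAG System

1. Ollama

Ollama lets you run models such as DeepSeek R1 on your own machine.

Download: get the installer from the Ollama website.
Install and start: run the following command in a terminal: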
```
ollama run deepseek-r1
```
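Once the model is running, Ollama also serves it over a local HTTP API, which is what the Python code later in this guide talks to under the hood. The sketch below assumes the default endpoint `http://localhost:11434/api/generate`; the helper names `build_generate_request` and `ask_ollama` are made up for illustration and are not part of Ollama.

```python
import json
import urllib.request

# Assumption: Ollama is listening on its default local port.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_generate_request(model: str, prompt: str) -> urllib.request.Request:
    """Build a POST request for a single, non-streamed completion."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )


def ask_ollama(model: str, prompt: str) -> str:
    """Send the request and return the generated text (needs a running Ollama)."""
    with urllib.request.urlopen(build_generate_request(model, prompt)) as resp:
        return json.loads(resp.read())["response"]
```

With the server running, `ask_ollama("deepseek-r1", "Say hello")` should return the model's reply as a string; if the call fails with a connection error, Ollama is probably not started yet.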

2. DeepSeek R1 model variants

DeepSeek R1 comes in sizes from 1.5B to 671B parameters. For a lightweight RAG application, start with the 1.5B model by running this in a terminal:

```
ollama run deepseek-r1:1.5b
```

Pro tip: larger models (for example, the 7B variant) reason better but also need more memory.
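To get a feel for the memory trade-off, a back-of-the-envelope estimate helps. The sketch below assumes 4-bit quantized weights and counts only the weights themselves; real usage is higher because of the context cache and runtime overhead, and the actual quantization of a given Ollama tag may differ.

```python
def estimate_weight_memory_gb(num_params: float, bits_per_weight: float = 4.0) -> float:
    """Rough memory needed for the model weights alone, in GB (10^9 bytes)."""
    total_bytes = num_params * bits_per_weight / 8
    return total_bytes / 1e9


# Assumed parameter counts for two of the smaller variants.
for tag, params in [("1.5b", 1.5e9), ("7b", 7e9)]:
    gb = estimate_weight_memory_gb(params)
    print(f"deepseek-r1:{tag}: about {gb:.2f} GB of weights at 4-bit")
```

Under these assumptions the 1.5B model needs well under 1 GB for its weights, while the 7B model needs several GB before any overhead is counted.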

Building the RAG Pipeline Step by Step

Step 1: Import the libraries

We will use:

LangChain: document processing and retrieval.

Streamlit: a user-friendly web interface.

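```
import streamlit as st
from langchain_community.document_loaders import PDFPlumberLoader
from langchain_experimental.text_splitter import SemanticChunker
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.vectorstores import FAISS
from langchain_community.llms import Ollama
# Needed by steps 5 and 6 (prompt template and chain classes)
from langchain_core.prompts import PromptTemplate
from langchain.chains.llm import LLMChain
from langchain.chains.combine_documents.stuff import StuffDocumentsChain
from langchain.chains import RetrievalQA
```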

Step 2: Upload and process the PDF

Use Streamlit's file uploader to pick a local PDF. PDFPlumberLoader then extracts the text, so no manual parsing is needed.

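```
uploaded_file = st.file_uploader("Upload a PDF file", type="pdf")
if uploaded_file is None:
    st.stop()  # nothing to process yet; the later steps need `docs`

# PDFPlumberLoader reads from a path, so write the upload to disk first
with open("temp.pdf", "wb") as f:
    f.write(uploaded_file.getvalue())
loader = PDFPlumberLoader("temp.pdf")
docs = loader.load()
```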
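Writing every upload to a fixed `temp.pdf` is fine for a single-user demo, but two sessions would overwrite each other's file. One possible variation, sketched here with the standard library only, gives each upload its own temporary path; the helper name `save_upload_to_temp_pdf` is hypothetical.

```python
import os
import tempfile


def save_upload_to_temp_pdf(data: bytes) -> str:
    """Write uploaded bytes to a unique temporary .pdf file and return its path."""
    with tempfile.NamedTemporaryFile(delete=False, suffix=".pdf") as tmp:
        tmp.write(data)
        return tmp.name


# Stand-in bytes; in the app this would be uploaded_file.getvalue().
path = save_upload_to_temp_pdf(b"%PDF-1.4 example bytes")
print(path)
os.remove(path)  # clean up once the loader has finished reading the file
```

In the app, the returned path would be passed to `PDFPlumberLoader` in place of `"temp.pdf"`.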

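Step 3: Split the document strategically

SemanticChunker uses embeddings to keep semantically related sentences in the same chunk, rather than cutting the text at a fixed length.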

```
text_splitter = SemanticChunker(HuggingFaceEmbeddings())
documents = text_splitter.split_documents(docs)
```
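To illustrate the idea behind semantic chunking, here is a simplified sketch, not SemanticChunker's actual implementation. It assumes each sentence already has an embedding (the toy 2-D vectors are invented) and starts a new chunk whenever the similarity between neighboring sentences drops below a fixed threshold.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norms = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norms


def semantic_chunks(sentences, vectors, threshold=0.5):
    """Group consecutive sentences; break where neighbors are dissimilar."""
    if not sentences:
        return []
    chunks = [[sentences[0]]]
    for i in range(1, len(sentences)):
        if cosine_similarity(vectors[i - 1], vectors[i]) < threshold:
            chunks.append([])  # topic shift: start a new chunk
        chunks[-1].append(sentences[i])
    return [" ".join(chunk) for chunk in chunks]


sentences = [
    "Ollama runs models locally.",
    "It exposes a local API.",
    "FAISS stores vectors.",
    "It supports similarity search.",
]
# Toy embeddings: the first two point one way, the last two another.
vectors = [[1.0, 0.1], [0.9, 0.2], [0.1, 1.0], [0.2, 0.9]]
chunks = semantic_chunks(sentences, vectors)
print(chunks)
```

With these toy vectors the four sentences fall into two chunks, one about Ollama and one about FAISS.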

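Step 4: Create a searchable knowledge base

Generate vector embeddings for the document chunks and store them in a FAISS index. The embeddings make fast, context-aware search possible.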

```
embeddings = HuggingFaceEmbeddings()
vector_store = FAISS.from_documents(documents, embeddings)
# Return the 3 closest chunks for each query
retriever = vector_store.as_retriever(search_kwargs={"k": 3})
```
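What the retriever does with `k=3` can be sketched as a brute-force nearest-neighbor search. This is a toy version under stated assumptions: it uses Euclidean distance over invented 2-D vectors, whereas a real FAISS index works on high-dimensional embeddings and may use a different metric or index type.

```python
import math


def top_k(query_vec, doc_vecs, k=3):
    """Return the indices of the k document vectors closest to the query."""
    scored = [(math.dist(query_vec, vec), idx) for idx, vec in enumerate(doc_vecs)]
    scored.sort()  # smallest distance first
    return [idx for _, idx in scored[:k]]


# Toy chunk embeddings; index 3 is far from everything else.
doc_vecs = [[0.0, 0.0], [1.0, 1.0], [0.9, 1.1], [5.0, 5.0], [1.2, 0.8]]
nearest = top_k([1.0, 1.0], doc_vecs, k=3)
print(nearest)
```

The chunks at the returned indices are the ones that would be handed to the model as context.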

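Step 5: Configure DeepSeek R1

Set up a RetrievalQA chain with the 1.5B DeepSeek R1 model so that answers are grounded in the PDF's content rather than in the model's training data.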

```
llm = Ollama(model="deepseek-r1:1.5b")
prompt = """
1. Answer using only the context below.
2. If you are unsure, reply "I don't know".
3. Keep the answer to 4 sentences or fewer.

Context: {context}
Question: {question}
Answer:
"""
QA_CHAIN_PROMPT = PromptTemplate.from_template(prompt)
```
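The `{context}` and `{question}` placeholders are filled in at query time. The plain-Python sketch below mimics that substitution with `str.format`, which behaves like the template's default f-string style; the sample context and question are invented.

```python
template = """1. Answer using only the context below.
2. If you are unsure, reply "I don't know".
3. Keep the answer to 4 sentences or fewer.

Context: {context}
Question: {question}
Answer:"""

# Invented sample values standing in for retrieved text and user input.
filled = template.format(
    context="Ollama runs large language models on a local machine.",
    question="What does Ollama do?",
)
print(filled)
```

Seeing the filled-in text makes it easier to debug cases where the model ignores the context or the instructions.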

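Step 6: Assemble the RAG chain

Combine upload, chunking, and retrieval into one coherent pipeline that gives the model reliable context and improves answer accuracy.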

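```
llm_chain = LLMChain(llm=llm, prompt=QA_CHAIN_PROMPT)
# How each retrieved chunk is rendered before being placed in the prompt
document_prompt = PromptTemplate(
    template="Context:\ncontent:{page_content}\nsource:{source}",
    input_variables=["page_content", "source"]
)
qa = RetrievalQA(
    combine_documents_chain=StuffDocumentsChain(
        llm_chain=llm_chain,
        document_prompt=document_prompt,
        # The prompt has two variables, so name the one that receives the chunks
        document_variable_name="context"
    ),
    retriever=retriever
)
```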
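"Stuffing" simply means rendering every retrieved chunk with the document prompt and concatenating the results into the `{context}` slot. Here is a minimal sketch of that step, assuming a blank line as the separator and plain dicts in place of LangChain document objects; `stuff_documents` is a hypothetical helper.

```python
DOCUMENT_TEMPLATE = "Context:\ncontent:{page_content}\nsource:{source}"


def stuff_documents(docs, separator="\n\n"):
    """Render each chunk with the template and join them into one context string."""
    return separator.join(DOCUMENT_TEMPLATE.format(**doc) for doc in docs)


# Invented chunks standing in for the retriever's output.
retrieved = [
    {"page_content": "Ollama runs models locally.", "source": "temp.pdf"},
    {"page_content": "FAISS stores embeddings.", "source": "temp.pdf"},
]
context = stuff_documents(retrieved)
print(context)
```

Because all chunks share one prompt, the total length of the retrieved text has to fit within the model's context window, which is one reason to keep `k` small.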

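Step 7: Launch the web interface

Streamlit lets users type a question and get an answer right away. Each query retrieves the matching document chunks, passes them to the model, and displays the result.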

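```
# Streamlit UI
user_input = st.text_input("Ask your PDF a question:")
if user_input:
    with st.spinner("Thinking..."):
        # RetrievalQA expects the question under the "query" key
        response = qa.invoke({"query": user_input})["result"]
        st.write(response)
```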
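DeepSeek R1 models usually emit their reasoning inside `<think>...</think>` tags before the final answer, so the raw response can look cluttered in the UI. One optional refinement, sketched below with a hypothetical `strip_reasoning` helper, removes that block before display; it assumes the tags appear exactly in this form.

```python
import re

# Matches a reasoning block, including any newlines inside it.
THINK_BLOCK = re.compile(r"<think>.*?</think>", flags=re.DOTALL)


def strip_reasoning(text: str) -> str:
    """Remove <think>...</think> blocks and surrounding whitespace."""
    return THINK_BLOCK.sub("", text).strip()


# Invented sample response in the shape R1 models tend to produce.
raw = "<think>\nThe context mentions FAISS.\n</think>\n\nThe index is stored in FAISS."
answer = strip_reasoning(raw)
print(answer)
```

In the app, this would wrap the chain output, for example `st.write(strip_reasoning(response))`.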

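DeepSeek and the Future of RAG

DeepSeek R1 is only the beginning. Future RAG systems are expected to add capabilities such as self-verification and multi-hop reasoning, letting them reason through and refine their own answers.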


This article was published via mdnice's multi-platform publishing.
