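I need help replacing a string in a Word document while keeping the formatting of the entire document.

I'm using python-docx. After reading the documentation, I see that it works on whole paragraphs, so I lose formatting such as bold or italic words. The text to be replaced is itself in bold, and I would like to keep it that way. I'm using this code: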

from docx import Document

def replace_string2(filename):
    doc = Document(filename)
    for p in doc.paragraphs:
        if 'Text to find and replace' in p.text:
            print('SEARCH FOUND!!')
            text = p.text.replace('Text to find and replace', 'new text')
            style = p.style
            # Assigning p.text replaces the paragraph's content with a single run,
            # so run-level formatting (bold, italic) is lost here
            p.text = text
            p.style = style
    # doc.save(filename)
    doc.save('test.docx')
    return 1

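So if I implement it and want something like this (the paragraph containing the string to be replaced loses its formatting):

This is paragraph 1, and this is a text in bold.

This is paragraph 2, and I will replace old text

The current result is:

This is paragraph 1, and this is a text in bold.

This is paragraph 2, and I will replace new text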

Originally posted by Alo; translation licensed under CC BY-SA 4.0

python python-2.7 python-docx

2 answers

Community wiki

Posted on 2023-01-09

✓ Accepted

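I posted this question (even though I saw a few similar ones here) because none of them, as far as I can tell, solved the problem. One answer used the oodocx library; I tried it, but it didn't work. So I found a workaround.

The code is very similar, but the logic is: when I find a paragraph that contains the string I want to replace, I add another loop over its runs. (This only works when the whole string to be replaced sits in one run, which requires it to have the same formatting throughout.)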

from docx import Document

def replace_string(filename):
    doc = Document(filename)
    for p in doc.paragraphs:
        if 'old text' in p.text:
            # Loop added to work with runs (stretches of text sharing the same style)
            for run in p.runs:
                if 'old text' in run.text:
                    # Setting run.text keeps that run's own formatting
                    run.text = run.text.replace('old text', 'new text')
            print(p.text)

    doc.save('dest1.docx')
    return 1
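
To illustrate the limitation mentioned above, here is a minimal sketch that needs no Word file. It assumes a run can be modelled as any object with a `.text` attribute; the `SimpleNamespace` stand-ins and the helper name `replace_in_runs` are illustrative and not part of python-docx. When the search string is split over two runs, the per-run check never matches, even though the paragraph text as a whole contains it.

```python
from types import SimpleNamespace


def replace_in_runs(runs, old, new):
    """Replace `old` with `new` inside each run separately.

    Returns the number of runs that were changed.
    """
    changed = 0
    for run in runs:
        if old in run.text:
            run.text = run.text.replace(old, new)
            changed += 1
    return changed


# Stand-ins for python-docx runs (assumption: only `.text` matters here).
single_run = [SimpleNamespace(text="I will replace "),
              SimpleNamespace(text="old text")]
split_runs = [SimpleNamespace(text="I will replace old"),
              SimpleNamespace(text=" text")]

print(replace_in_runs(single_run, "old text", "new text"))  # 1
print(replace_in_runs(split_runs, "old text", "new text"))  # 0
```

In the second case nothing is replaced, which is why a search string that straddles runs needs separate handling.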

Originally posted by Alo; translation licensed under CC BY-SA 3.0

Community wiki

Posted on 2023-01-09

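This is how I keep the text styling when replacing text.

Building on Alo's answer, and on the fact that the search text can be split across several runs, this is what worked for me to replace placeholder text in a template docx file. It checks all of the document's paragraphs and the contents of any table cells for placeholders.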
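
Collecting the paragraphs from the document body and from table cells can be sketched as a small generator. This is only an illustration: the name `iter_paragraphs` is hypothetical, and it relies on the assumption that the container passed in exposes `.paragraphs` and `.tables`, as python-docx documents and table cells do. Unlike a single pass over `doc.tables`, it also descends into tables nested inside cells. The stand-in objects below replace a real document so the sketch runs on its own.

```python
from types import SimpleNamespace


def iter_paragraphs(container):
    """Yield the container's own paragraphs, then those inside its tables.

    `container` is anything with `.paragraphs` and `.tables`; table cells
    are visited recursively so nested tables are covered as well.
    """
    for paragraph in container.paragraphs:
        yield paragraph
    for table in container.tables:
        for row in table.rows:
            for cell in row.cells:
                for paragraph in iter_paragraphs(cell):
                    yield paragraph


# Stand-ins for a document with one table holding one cell (assumption:
# strings play the role of paragraph objects here).
cell = SimpleNamespace(paragraphs=["in cell"], tables=[])
table = SimpleNamespace(rows=[SimpleNamespace(cells=[cell])])
doc = SimpleNamespace(paragraphs=["body"], tables=[table])

print(list(iter_paragraphs(doc)))  # ['body', 'in cell']
```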

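Once the search text is found in a paragraph, the code loops through that paragraph's runs to identify which of them contain part of the search text. It then inserts the replacement text into the first of those runs and blanks out the remaining search-text characters in the others.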

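I hope this helps someone. If anyone wants to improve it, here is the gist

Edit: I have since discovered python-docx-template, which allows jinja2-style templating within a docx template. Here is a link to the documentation

python3 python-docx python-docx-template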

def docx_replace(doc, data):
    paragraphs = list(doc.paragraphs)
    for t in doc.tables:
        for row in t.rows:
            for cell in row.cells:
                for paragraph in cell.paragraphs:
                    paragraphs.append(paragraph)
    for p in paragraphs:
        for key, val in data.items():
            key_name = '${{{}}}'.format(key) # I'm using placeholders in the form ${PlaceholderName}
            if key_name in p.text:
                inline = p.runs
                # Replace strings and retain the same style.
                # The text to be replaced can be split over several runs so
                # search through, identify which runs need to have text replaced
                # then replace the text in those identified
                started = False
                key_index = 0
                # found_runs is a list of (inline index, index of match, length of match)
                found_runs = list()
                found_all = False
                replace_done = False
                for i in range(len(inline)):

                    # case 1: found in single run so short circuit the replace
                    if key_name in inline[i].text and not started:
                        found_runs.append((i, inline[i].text.find(key_name), len(key_name)))
                        text = inline[i].text.replace(key_name, str(val))
                        inline[i].text = text
                        replace_done = True
                        found_all = True
                        break

                    if key_name[key_index] not in inline[i].text and not started:
                        # keep looking ...
                        continue

                    # case 2: search for partial text, find first run
                    if key_name[key_index] in inline[i].text and inline[i].text[-1] in key_name and not started:
                        # check sequence
                        start_index = inline[i].text.find(key_name[key_index])
                        check_length = len(inline[i].text)
                        for text_index in range(start_index, check_length):
                            if inline[i].text[text_index] != key_name[key_index]:
                                # no match so must be false positive
                                break
                        if key_index == 0:
                            started = True
                        chars_found = check_length - start_index
                        key_index += chars_found
                        found_runs.append((i, start_index, chars_found))
                        if key_index != len(key_name):
                            continue
                        else:
                            # found all chars in key_name
                            found_all = True
                            break

                    # case 2: search for partial text, find subsequent run
                    if key_name[key_index] in inline[i].text and started and not found_all:
                        # check sequence
                        chars_found = 0
                        check_length = len(inline[i].text)
                        for text_index in range(0, check_length):
                            if inline[i].text[text_index] == key_name[key_index]:
                                key_index += 1
                                chars_found += 1
                            else:
                                break
                        # no match so must be end
                        found_runs.append((i, 0, chars_found))
                        if key_index == len(key_name):
                            found_all = True
                            break

                if found_all and not replace_done:
                    for i, item in enumerate(found_runs):
                        index, start, length = item
                        if i == 0:
                            text = inline[index].text.replace(inline[index].text[start:start + length], str(val))
                            inline[index].text = text
                        else:
                            text = inline[index].text.replace(inline[index].text[start:start + length], '')
                            inline[index].text = text
                # print(p.text)

# usage
import docx

doc = docx.Document('path/to/template.docx')
docx_replace(doc, dict(ItemOne='replacement text', ItemTwo="Some replacement text\nand some more"))
doc.save('path/to/destination.docx')
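
The same idea (put the replacement in the first affected run, blank the matched characters in the rest) can also be written in terms of character offsets. What follows is a compact sketch under stated assumptions, not the code from this answer: the name `replace_across_runs` is hypothetical, runs are modelled as objects with a writable `.text`, and only the first occurrence of the search text is handled.

```python
from types import SimpleNamespace


def replace_across_runs(runs, old, new):
    """Replace the first occurrence of `old`, even if it spans several runs.

    The replacement goes into the first run touched by the match, so it
    takes on that run's formatting; the matched characters in later runs
    are removed. Returns True if a replacement was made.
    """
    if not old:
        return False
    full_text = "".join(run.text for run in runs)
    start = full_text.find(old)
    if start == -1:
        return False
    end = start + len(old)

    pos = 0
    inserted = False
    for run in runs:
        run_start, run_end = pos, pos + len(run.text)
        pos = run_end  # offsets refer to the original, unmodified text
        if run_end <= start or run_start >= end:
            continue  # this run does not overlap the match
        lo = max(start, run_start) - run_start
        hi = min(end, run_end) - run_start
        replacement = "" if inserted else new
        run.text = run.text[:lo] + replacement + run.text[hi:]
        inserted = True
    return True


# Stand-ins for runs: the placeholder ${Name} is split over three runs.
runs = [SimpleNamespace(text="Hello ${"),
        SimpleNamespace(text="Na"),
        SimpleNamespace(text="me}!")]
replace_across_runs(runs, "${Name}", "World")
print([run.text for run in runs])  # ['Hello World', '', '!']
```

With python-docx, passing `paragraph.runs` should work the same way, since each run exposes a writable `text` property. Emptied runs are left in place rather than deleted.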

Originally posted by adejones; translation licensed under CC BY-SA 4.0
