Preface

PDF files are an everyday part of office work, and we often need to merge several of them into a single file. When that happens, many people search Baidu and end up on a paid PDF-merging website.

Are you still paying just to merge PDF files?

Today I will share a way to merge PDFs with one click. We will use Python, the language we already know, this time together with the PyPDF2 module. I won't go through the whole module here; for that, see the official documentation at http://pythonhosted.org/PyPDF2/ . I will teach only what we need today. After all, at the office you don't have time to learn things that have nothing to do with your work.

Our mission

Merge the two documents 1.pdf and 2.pdf into 3.pdf.

Clarify the workflow

Before doing anything, understand the logic of the task, that is, the workflow. This applies to every repetitive task we want to automate. For this one, the workflow is:

  • Read 1.pdf and write its pages to 3.pdf
  • Read 2.pdf and write its pages to 3.pdf

Merging looks like a single operation, but it is really a sequence of separate steps.

Let Python do it

Import the PyPDF2 module:

from PyPDF2 import PdfFileReader, PdfFileWriter

These two classes handle reading and writing PDF files: PdfFileReader reads a file and PdfFileWriter writes one.

Note: In this approach, pages are copied one at a time. You read one page from the reader and add that page to the writer, rather than passing the whole file across in one step.

Fortunately, our samples 1.pdf and 2.pdf each have only one page. This reminder is for readers whose files have more pages, so that you don't fall into that pit.
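
One more pit worth pointing out before the script: it uses Windows paths such as E:\demo\3.pdf, and in an ordinary Python string a backslash followed by a digit is treated as an octal escape. The sketch below only illustrates the effect with a shortened, made-up path (demo\3.pdf); it does not open any files.

```python
from pathlib import PureWindowsPath

# In a normal string literal, "\3" is an octal escape: it becomes the single
# control character with code 3, so the file name is silently corrupted.
plain = 'demo\3.pdf'
print(len(plain), repr(plain))

# A raw string (r'...') keeps every backslash exactly as typed.
raw = r'demo\3.pdf'
print(len(raw), repr(raw))

# Forward slashes avoid the problem entirely; for Windows paths they are
# treated the same as backslashes.
same = PureWindowsPath('E:/demo/3.pdf') == PureWindowsPath(r'E:\demo\3.pdf')
print(same)
```

So either write the paths as raw strings or use forward slashes.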

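# Note: these class and method names belong to PyPDF2 releases before 3.0.
from PyPDF2 import PdfFileReader, PdfFileWriter

# Folder that holds 1.pdf and 2.pdf; a raw string keeps the backslashes literal.
path = r'E:\demo'
pdf_writer = PdfFileWriter()

# range(1, 3) yields 1 and 2, so both input files are read.
for i in range(1, 3):
    pdf_reader = PdfFileReader(r'{}\{}.pdf'.format(path, i))
    # Copy the pages one at a time into the writer.
    for page in range(pdf_reader.getNumPages()):
        pdf_writer.addPage(pdf_reader.getPage(page))

# Write the merged pages to 3.pdf once, after every page has been added.
with open(r'{}\3.pdf'.format(path), 'wb') as out:
    pdf_writer.write(out)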

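Look closely and you will see that the pages are merged inside the loop, while the output happens outside it. This is the "read one page and write one page" rule in action: each page is added to the writer as it is read. Only after every page has been added does the with statement create 3.pdf, and pdf_writer.write(out) writes the merged result into it.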
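
If you later have more than two files, the same pattern can be wrapped in a function. What follows is only a sketch: copy_all_pages and merge_pdfs are hypothetical helper names, not part of PyPDF2, and the code assumes a PyPDF2 release that still provides PdfFileReader and PdfFileWriter (PyPDF2 3.0 removed them in favor of PdfReader and PdfWriter).

```python
def copy_all_pages(readers, writer):
    """Add every page of every reader to the writer, in order.

    Returns the number of pages copied.
    """
    copied = 0
    for reader in readers:
        # Inner loop: one page at a time, so multi-page files work too.
        for page_number in range(reader.getNumPages()):
            writer.addPage(reader.getPage(page_number))
            copied += 1
    return copied


def merge_pdfs(input_paths, output_path):
    """Merge the PDFs at input_paths into output_path (hypothetical helper)."""
    # Imported here so copy_all_pages can be tried without the library.
    # Assumption: a PyPDF2 release before 3.0, which still has these names.
    from PyPDF2 import PdfFileReader, PdfFileWriter

    writer = PdfFileWriter()
    readers = [PdfFileReader(path) for path in input_paths]
    copy_all_pages(readers, writer)

    # Write once, after all pages have been added.
    with open(output_path, 'wb') as out:
        writer.write(out)
```

For our task, the call would be merge_pdfs([r'E:\demo\1.pdf', r'E:\demo\2.pdf'], r'E:\demo\3.pdf').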

End

In the next issue, I will show you how to let Python help us split the PDF.


CoXie带你学编程