9

I'm using PdfPages from matplotlib and I can loop through each figure object and save each one as a separate page in the same PDF:

from matplotlib.backends.backend_pdf import PdfPages
pp = PdfPages('output.pdf')
for fig in figs:
    pp.savefig(fig)
pp.close()

This works great. But is there a way for me to add a page number for each page in the PDF?

Thanks.

3
  • I don't know of any direct way to do it, but here's a brief outline of another possible solution: save the plot as an image, insert the image into a Word document, add page numbers to the Word doc, then save it as a PDF. It's a little roundabout, but if you think this might be a good way to go, let me know and I can flesh it out into an actual answer. Commented Aug 6, 2014 at 15:56
  • I need to do this once a day with a lot of images, so I'd really need an automated way Commented Aug 6, 2014 at 20:27
  • 2
    It could be automated, but maybe look into this question to work with the pdf directly Commented Aug 6, 2014 at 20:40

8 Answers

8

A nice solution using reportlab and PyPDF4 (based on this):

import os

from PyPDF4.pdf import PdfFileReader, PdfFileWriter
from reportlab.lib.units import mm
from reportlab.pdfgen import canvas


def create_page_pdf(num, tmp):
    c = canvas.Canvas(tmp)
    for i in range(1, num + 1):
        c.drawString((210 // 2) * mm, (4) * mm, str(i))
        c.showPage()
    c.save()


def add_page_numbers(pdf_path):
    """
    Add page numbers to a pdf, save the result as a new pdf
    @param pdf_path: path to pdf
    """
    tmp = "__tmp.pdf"

    writer = PdfFileWriter()
    with open(pdf_path, "rb") as f:
        reader = PdfFileReader(f, strict=False)
        n = reader.getNumPages()

        # create new PDF with page numbers
        create_page_pdf(n, tmp)

        with open(tmp, "rb") as ftmp:
            number_pdf = PdfFileReader(ftmp)
            # iterate over pages
            for p in range(n):
                page = reader.getPage(p)
                numberLayer = number_pdf.getPage(p)
                # merge number page with actual page
                page.mergePage(numberLayer)
                writer.addPage(page)

            # write result
            if writer.getNumPages():
                newpath = os.path.splitext(pdf_path)[0] + "_numbered.pdf"
                with open(newpath, "wb") as out:
                    writer.write(out)
        os.remove(tmp)
4 Comments

FYI, this worked for me, but it added a lot of bulk to my PDFs, roughly 2-3x the original size.
I have an existing pdf to which I want to add pages. Can I do that?
It works great for me: on a 300 MB, 40-page file it added only about 2 MB.
I tried this. It puts the page numbers correctly, but the links in the new PDF are not functioning.
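
One caveat with the code above: the number layer is drawn on reportlab's default A4 canvas and centred at 105 mm, whereas matplotlib writes each PDF page at the figure's own size (6.4 x 4.8 in by default), so the number can land off-centre. Below is a minimal sketch of sizing the number layer per page instead, assuming the page sizes (in points) have already been read from the source PDF. The names centered_footer and create_number_layer are hypothetical and not part of the original answer.

```python
MM = 72 / 25.4  # PDF points per millimetre (same value as reportlab.lib.units.mm)


def centered_footer(page_width_pt, bottom_margin_mm=4):
    """Return (x, y) in points for a number centred near the bottom edge."""
    return page_width_pt / 2, bottom_margin_mm * MM


def create_number_layer(page_sizes_pt, out_path):
    """Write one number-only page per (width, height) entry, sized to match."""
    # Imported here so centered_footer() stays usable without reportlab installed.
    from reportlab.pdfgen import canvas

    c = canvas.Canvas(out_path)
    for number, (width, height) in enumerate(page_sizes_pt, start=1):
        c.setPageSize((width, height))         # match the source page
        x, y = centered_footer(width)
        c.drawCentredString(x, y, str(number))  # text centred on x
        c.showPage()
    c.save()
```

The sizes would typically come from each source page's media box (for example page.mediabox.width and page.mediabox.height in recent PyPDF2 releases, converted to float); that lookup is left out here because the attribute names differ between PyPDF4, PyPDF2 and pypdf.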
6

Something like this:

from matplotlib.backends.backend_pdf import PdfPages
pp = PdfPages('output.pdf')
for n, fig in enumerate(figs):
    fig.text(4.25/8.5, 0.5/11., str(n+1), ha='center', fontsize=8)
    pp.savefig(fig)
pp.close()

2 Comments

What is the variable "figures"?
figs is just a collection of figures that you've created. For example, by running figs.append(plt.figure()) in a loop.
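
For reference, fig.text takes figure-fraction coordinates by default, so 4.25/8.5 and 0.5/11. in the answer mean "horizontally centred, half an inch above the bottom edge" on a letter-sized figure. Here is a minimal sketch that derives the same position from each figure's actual size, assuming figs is a list of figures. The helper names footer_position and save_numbered and the "n / total" label are illustrative additions, not part of the answer.

```python
def footer_position(page_width_in, page_height_in, margin_in=0.5):
    """Return (x, y) as figure fractions: centred, margin_in above the bottom."""
    if page_width_in <= 0 or page_height_in <= 0:
        raise ValueError("page dimensions must be positive")
    return 0.5, margin_in / page_height_in


def save_numbered(figs, path, margin_in=0.5):
    """Save each figure as one PDF page with 'n / total' in the footer."""
    # Imported here so footer_position() can be used without matplotlib.
    from matplotlib.backends.backend_pdf import PdfPages

    total = len(figs)
    with PdfPages(path) as pp:  # closes the PDF even if savefig raises
        for n, fig in enumerate(figs, start=1):
            width_in, height_in = fig.get_size_inches()
            x, y = footer_position(width_in, height_in, margin_in)
            fig.text(x, y, f"{n} / {total}", ha="center", fontsize=8)
            pp.savefig(fig)
```

Note that this draws the label onto the figure itself, so a figure that is later shown or saved again keeps its page number.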
2

Use the numbering2pdf library.

from numbering2pdf import add_numbering_to_pdf

add_numbering_to_pdf("old_file.pdf", "new_file.pdf")

1

Either PyPDF2 or pdfrw will let you overlay two PDFs (so, for example, you can generate a PDF which is only page numbers, and use it to watermark your images). pdfrw has a watermark example that uses a single watermark page, but this could easily be modified to use a set of watermark pages, one for each page number.

If you want to get fancier, you can use reportlab to generate these pages on the fly.

pdfrw also has a facility that allows you to import a PDF page into reportlab as if it were an image. There are a couple of examples around that do this dynamically -- here is a good starting point.

Finally, rst2pdf (which is not all that well maintained but works well for simple cases) also lets you import PDFs as images -- it uses pdfrw and reportlab under the hood -- so you can easily use reStructuredText to create documents with your images embedded. AFAIK, the best reportlab version to use with the released version of rst2pdf is 2.7.

(Disclaimer: I am the pdfrw author and have made contributions to rst2pdf.)
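
The "one watermark page per page number" idea could look roughly like the sketch below. It assumes a second PDF containing only page numbers (one page per source page) already exists, and it adapts the pattern of pdfrw's watermark example. The function names pair_pages and overlay_numbers are hypothetical.

```python
def pair_pages(content_pages, number_pages):
    """Pair each content page with its number page; fail early if numbers run short."""
    if len(number_pages) < len(content_pages):
        raise ValueError("need one number page per content page")
    return list(zip(content_pages, number_pages))


def overlay_numbers(source_path, numbers_path, out_path):
    """Overlay page i of numbers_path onto page i of source_path."""
    # Imported here so pair_pages() can be used without pdfrw installed.
    from pdfrw import PageMerge, PdfReader, PdfWriter

    source = PdfReader(source_path)
    numbers = PdfReader(numbers_path)
    for page, number_page in pair_pages(source.pages, numbers.pages):
        # Draw the number page on top of the content page, in place.
        PageMerge(page).add(number_page).render()
    PdfWriter(out_path, trailer=source).write()
```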

1

Here is my answer, which uses the Matplotlib PDF backend to generate a PDF with just page numbers and PyPDF2 to merge that "footer" PDF with the desired PDF:

import os

import matplotlib.pyplot as plt
from matplotlib.backends.backend_pdf import PdfPages
from PyPDF2 import PdfReader, PdfWriter


def add_header_footer(source_path, save_path, footer_pdf_path=None, start_page=0, header_text=None):
    ''' Adds header & footer info to existing PDFs '''
    # Default to 'footer.pdf' next to the source unless the caller supplied a path
    if footer_pdf_path is None:
        footer_pdf_path = os.path.join(os.path.dirname(source_path), 'footer.pdf')
    reader = PdfReader(source_path)
    writer = PdfWriter()
    n_pages = len(reader.pages)

    # Step 1: generate header/footer PDF to be merged into source PDF
    pp = PdfPages(footer_pdf_path)
    for p in range(n_pages):
        fig = plt.figure(num=613, figsize=(8.5, 11), constrained_layout=True, facecolor='white')
        fig.patch.set_alpha(0)  # transparent background so the content stays visible
        fig.text(0.48, 0.04, f'{p + 1} | <FOOTER TEXT>',
                 horizontalalignment='center', weight='bold',
                 verticalalignment='bottom', fontsize=6, color='grey')
        if header_text is not None:
            fig.text(0.1, 0.95, header_text,
                     horizontalalignment='left', weight='bold',
                     verticalalignment='center', fontsize=6, color='grey')
        pp.savefig(fig)
        plt.close(fig)
    pp.close()
    footer_reader = PdfReader(footer_pdf_path)

    # Step 2: merge source PDF & header/footer PDF
    # (pages before start_page are not copied to the output)
    for index in range(start_page, n_pages):
        content_page = reader.pages[index]
        footer_page = footer_reader.pages[index]
        mediabox = content_page.mediabox
        content_page.merge_page(footer_page)
        content_page.mediabox = mediabox  # keep the original page size
        writer.add_page(content_page)

    # Step 3: save merged PDF
    with open(save_path, "wb") as fp:
        writer.write(fp)

1 Comment

I attempted this method as well, and unfortunately, the links are still not functioning in the new PDF.
1

PyPDF2 >= 2.10.0

Requirements:

# generate a page with a page number:
pip install reportlab --upgrade

# merge that numbered (otherwise empty) page with the original:
pip install PyPDF2 --upgrade

Using a slightly modified version of the code of ofir dubi:

import os

from PyPDF2 import PdfReader, PdfWriter
from reportlab.lib.units import mm
from reportlab.pdfgen import canvas


def create_page_pdf(num, tmp):
    c = canvas.Canvas(tmp)
    for i in range(1, num + 1):
        c.drawString((210 // 2) * mm, (4) * mm, str(i))
        c.showPage()
    c.save()


def add_page_numbers(pdf_path, newpath):
    """
    Add page numbers to a pdf, save the result as a new pdf
    @param pdf_path: path to pdf
    @param newpath: path of the numbered pdf to write
    """
    tmp = "__tmp.pdf"

    writer = PdfWriter()
    with open(pdf_path, "rb") as f:
        reader = PdfReader(f)
        n = len(reader.pages)

        # create new PDF with page numbers
        create_page_pdf(n, tmp)

        with open(tmp, "rb") as ftmp:
            number_pdf = PdfReader(ftmp)
            # iterate over pages
            for p in range(n):
                page = reader.pages[p]
                number_layer = number_pdf.pages[p]
                # merge number page with actual page
                page.merge_page(number_layer)
                writer.add_page(page)

            # write result
            if len(writer.pages) > 0:
                with open(newpath, "wb") as out:
                    writer.write(out)
        os.remove(tmp)


if __name__ == "__main__":
    add_page_numbers("input.pdf", "output.pdf")

0

You can also use fpdf2 (pip install fpdf2). If you have the images saved then you can do something like this:

from fpdf import FPDF
import glob

class MyPDF(FPDF):
    def footer(self):
        # position footer from bottom of page
        self.set_y(-0.6)
        # set the font, I=italic
        self.set_font("helvetica", style="I", size=8)
        # set page number and center it
        pageNum = f'- {self.page_no()} -'
        self.cell(0, 0.5, pageNum, align="C")

filenames = glob.iglob('*.jpg')
pdf = MyPDF(orientation='P', unit='in', format='Letter')
for fname in filenames:
    pdf.add_page(orientation='P')
    pdf.image(fname, x=1.0, h=4.8)
pdf.output('Images.pdf')

0

The pdfnumbering library worked well for me. Here's the command line code I used to add 12-point page numbers to the bottom center of a letter-sized PDF:

pdfnumbering --font-size 12 --text-align center --text-position 0 740 --text-color 000000 --output pfn_merged_numbered.pdf pfn_merged.pdf

You can also use this library within a Python script. Here's a variant of the above command line script that can be added to a Python file:


# The following code was based on:
# https://pypi.org/project/pdfnumbering/ ;
# https://github.com/mikmart/pdfnumbering/blob/main/src/pdfnumbering/core.py
# (for PdfNumberer options);
# and http://www.fpdf.org/en/doc/cell.htm
# for the text_align value of 'C' (for 'center').

from pdfnumbering import PdfNumberer 
from pypdf import PdfWriter

numberer = PdfNumberer(
    font_size=12,
    text_align='C',
    text_position=(0, 740),
    text_color=(0, 0, 0),
)
document = PdfWriter(clone_from='pfn_merged_pypdf.pdf')
numberer.add_page_numbering(document.pages)
document.write('pfn_merged_pypdf_pdf_numberer_test.pdf')
