A python program that opens a PDF file and copy a specific line from it and after close and rename the PDF with a copied line

Question

A python program that opens a PDF file and copy a specific line from it and after close and rename the PDF with a copied line

491 Views Asked by Oleh At 03 May 2023 at 23:49

I must rename a number of PDF files with the serial numbers of laptops every day as part of my routine. I need somekind of a script/software that opens a PDF file, copies a certain line (service tag) from it, then closes and renames the PDF with the copied serial number from it.

I tried to use the PyPDF2 package in Python, but my code is not working.

from PyPDF2 import PdfReader


reader = PdfReader("SNPDF.pdf")

page = reader.pages[0]
number_of_pages = len(reader.pages)

text = page.extract_text()



for x in text:
    if x.startswith("SN"):
        new_name = x.replace("SN", '')
      


print(new_name)

It worked with .txt file, but I have no clue how to run it with PDF.

import os

serial = open('text.txt')
for line in serial:
    if line.startswith("SN"):
        new_line = line.replace('SN','')

serial.close()

os.rename('text.txt', new_line)

Original Q&A

There are 3 best solutions below

**Anton Petrov** · Answer 1 · 2023-05-04T00:39:45.650000

The following code should extract the service tag from the first page of the PDF file and rename the file with the extracted service tag.

import os

from PyPDF2 import PdfReader

FILE_TO_RENAME = "SNPDF.pdf"
SERVICE_TAG_PREFIX = "SN"

with open(FILE_TO_RENAME, "rb") as f:
    reader = PdfReader(f)
    # Replace "0" with the index of the page that contains the service tag
    page = reader.pages[0]
    text = page.extract_text()

# Search for the service tag in the extracted text
for line in text.split("\n"):
    if line.startswith(SERVICE_TAG_PREFIX):
        service_tag = line.partition(SERVICE_TAG_PREFIX)[2].strip()
        break
else:
    raise Exception(f"No service tag prefix found: {SERVICE_TAG_PREFIX}")

# Rename the PDF file with the service tag
os.rename(FILE_TO_RENAME, service_tag + ".pdf")

**911** · Answer 2 · 2023-05-04T01:59:57.810000

I think PyMuPDF package is better,you need to import the "fitz" package to use it,

pdf_file = "SNPDF.pdf"
doc = fitz.open(pdf_file)
for page in doc:
    text = page.get_text()
    if text.startswith("SN"):
        new_name = text.replace("SN", '')
doc.close()

**Jorj McKie** · Answer 3 · 2023-05-04T13:30:07.447000

import fitz  # import PyMuPDF package

pdfnames = [...]  # list of pdf pathnames

for f in pdfnames:
    doc = fitz.open(f)
    page = doc[0]  # load first page
    words = page.get_text("words")  # extract text separated as strings w/o spaces
    SN = None
    for i, word in enumerate(words):
        if word[4] == "SN"  # search for SN identifier
            SN = words[i+1][4]  # next word string should be serial number
            break
    if SN == None:
        print(f"no SN in file '{f}'.")
        continue
    doc.close()  # close the file for renaming it
    os.rename(f, SN + ".pdf")

A python program that opens a PDF file and copy a specific line from it and after close and rename the PDF with a copied line

There are 3 best solutions below

Related Questions in PYTHON

Related Questions in PYPDF

Related Questions in ADMINISTRATION

Trending Questions

Popular # Hahtags

Popular Questions