Monday, November 19, 2018

Scan PDF using Python - Python module for converting PDF to text

All of you must be familiar with what PDFs are. In fact, they are one of the most important and widely used digital document formats. PDF stands for Portable Document Format and uses the .pdf extension. It is used to present and exchange documents reliably, independent of software, hardware, or operating system.
Invented by Adobe, PDF is now an open standard maintained by the International Organization for Standardization (ISO). PDFs can contain links and buttons, form fields, audio, video, and business logic.

In the example below, I extract the text of my Airtel payment receipt from a PDF with the help of the Python library "PyPDF2".

Code:


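import PyPDF2

# Raw string, so the backslashes in the Windows path are not read as escape sequences
path = r"C:\MY\Airtel Bill\PaymentReceiptforAirtelAccount.pdf"

with open(path, "rb") as f:
    # Newer PyPDF2 releases use PdfReader / extract_text;
    # older releases use PdfFileReader / extractText instead
    pdf = PyPDF2.PdfReader(f)
    for page in pdf.pages:
        print(page.extract_text())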



Output:


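If you want the text as a single string rather than printed page by page, the loop above can be wrapped in a small helper. This is only a rough sketch: the function names join_page_text and pdf_to_text are my own and not part of PyPDF2, and it assumes a PyPDF2 release that provides PdfReader and extract_text. Pages that yield no text (for example scanned images) are skipped.

```python
def join_page_text(pages, separator="\n\n"):
    """Combine the text of page-like objects into one string.

    Hypothetical helper: each item only needs an extract_text() method,
    which is what PyPDF2 page objects provide in newer releases.
    """
    chunks = []
    for page in pages:
        # extract_text() may give back an empty value for image-only pages
        text = (page.extract_text() or "").strip()
        if text:
            chunks.append(text)
    return separator.join(chunks)


def pdf_to_text(pdf_path):
    """Read every page of the PDF at pdf_path and return the combined text."""
    import PyPDF2  # imported here so join_page_text can be used without it

    with open(pdf_path, "rb") as f:
        reader = PyPDF2.PdfReader(f)
        # Extract while the file is still open, since pages are read lazily
        return join_page_text(reader.pages)
```

Calling pdf_to_text with the receipt path from the example should return the same text that the loop prints, with a blank line between pages.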
