DEV Community

Bhushan Rane
Bhushan Rane

Posted on

.py : Automating PDF Operations (Extracting Text from PDFs)

Description:

This Python script extracts text from PDF files using the PyPDF2 library. It reads each page of the PDF and compiles the extracted text into a single string.

# Python script to extract text from PDFs
import PyPDF2
def extract_text_from_pdf(file_path):
with open(file_path, 'rb') as f:
pdf_reader = PyPDF2.PdfFileReader(f)
text = ''
for page_num in range(pdf_reader.numPages):
page = pdf_reader.getPage(page_num)
text += page.extractText()
return text
Enter fullscreen mode Exit fullscreen mode

Top comments (0)

Billboard image

Create up to 10 Postgres Databases on Neon's free plan.

If you're starting a new project, Neon has got your databases covered. No credit cards. No trials. No getting in your way.

Try Neon for Free →

👋 Kindness is contagious

Please leave a ❤️ or a friendly comment on this post if you found it helpful!

Okay