DEV Community

Prahlad Yeri
Prahlad Yeri

Posted on

Trouble fetching checkbox and radio fields with pypdf2

My project involves reading text from a bunch of PDF form files for which I'm using PyPDF2 open source library. There is no issue in getting the text data as follows:

reader = PdfReader("data/test.pdf")
cnt = len(reader.pages)
print("reading pdf (%d pages)" % cnt)
page = reader.pages[cnt-1]
lines = page.extract_text().splitlines()
print("%d lines extracted..." % len(lines))
Enter fullscreen mode Exit fullscreen mode

However, this text doesn't contain the checked statuses of the radio and checkboxes. I just get normal text (like "Yes No" for example) instead of these values.

I also tried the reader.get_fields() and reader.get_form_text_fields() methods as described in their documentation but they return empty values. I also tried reading it through annotations but no "/Annots" found on the page. When I open the PDF in a notepad++ to see its meta data, this is what I get:

%PDF-1.4
%ยฒยณยดยต
%Generated by ExpertPdf v9.2.2
Enter fullscreen mode Exit fullscreen mode

It appears to me that these checkboxes aren't usual form fields used in PDF but appear similar to HTML elements. Is there any way to extract these fields using python?

Agent.ai Challenge image

Congrats to the Agent.ai Challenge Winners ๐Ÿ†

The wait is over! We are excited to announce the winners of the Agent.ai Challenge.

From meal planners to fundraising automators to comprehensive stock analysts, our team of judges hung out with a lot of agents and had a lot to deliberate over. There were so many creative and innovative submissions, it is always so difficult to select our winners.

Read more โ†’

Top comments (0)

Billboard image

The Next Generation Developer Platform

Coherence is the first Platform-as-a-Service you can control. Unlike "black-box" platforms that are opinionated about the infra you can deploy, Coherence is powered by CNC, the open-source IaC framework, which offers limitless customization.

Learn more

๐Ÿ‘‹ Kindness is contagious

Explore a sea of insights with this enlightening post, highly esteemed within the nurturing DEV Community. Coders of all stripes are invited to participate and contribute to our shared knowledge.

Expressing gratitude with a simple "thank you" can make a big impact. Leave your thanks in the comments!

On DEV, exchanging ideas smooths our way and strengthens our community bonds. Found this useful? A quick note of thanks to the author can mean a lot.

Okay