DEV Community

George Moustakas


PDF file manipulation with Python 3 (Problem)

Hi, I have a problem I would like help with. I'm working on a Python 3 script that manipulates a large PDF file (50-60+ pages). I want to find a specific keyword in the file. The keyword is repeated multiple times, and each occurrence refers to a different data set. I then want to record how many times the keyword was found and on which pages, split those pages out of the original file, and merge them into a single file.
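One possible way to approach the steps described above is sketched here. It assumes the third-party `pypdf` package is installed and that the PDF contains extractable text (scanned pages would need OCR first). The function names `find_keyword_pages` and `extract_keyword_pages` are made up for illustration, and the match is a simple case-insensitive substring count, which may or may not fit the real data.

```python
from typing import Dict, Iterable


def find_keyword_pages(page_texts: Iterable[str], keyword: str) -> Dict[int, int]:
    """Return {0-based page index: number of hits} for pages containing keyword.

    The match is a case-insensitive substring count (an assumption).
    """
    needle = keyword.lower()
    hits: Dict[int, int] = {}
    for index, text in enumerate(page_texts):
        count = (text or "").lower().count(needle)
        if count:
            hits[index] = count
    return hits


def extract_keyword_pages(src_path: str, dst_path: str, keyword: str) -> Dict[int, int]:
    """Copy every page of src_path that mentions keyword into dst_path."""
    # pypdf is a third-party package (pip install pypdf); it is imported
    # here so the pure search function above works without it.
    from pypdf import PdfReader, PdfWriter

    reader = PdfReader(src_path)
    # extract_text() can return an empty result for image-only pages.
    texts = [page.extract_text() or "" for page in reader.pages]
    hits = find_keyword_pages(texts, keyword)

    if hits:
        writer = PdfWriter()
        for index in sorted(hits):
            writer.add_page(reader.pages[index])
        with open(dst_path, "wb") as fh:
            writer.write(fh)
    return hits
```

Calling `extract_keyword_pages("report.pdf", "matches.pdf", "keyword")` (file names are placeholders) would return the per-page hit counts, so the total number of occurrences is `sum(hits.values())` and the matching pages are `sorted(hits)`. Because the search runs over every page's text, it does not matter where the keyword appears in a given file.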

I will use multithreading, of course, because this script will run alongside other scripts on a small in-house server that is already running quite a lot of them.
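As a rough sketch of the multithreading idea, a small bounded thread pool from the standard library can process several files without swamping a busy server. The helper name `process_many` and the `worker` callable are hypothetical; `worker` stands in for whatever per-file function the script ends up with. Note that in CPython, threads generally do not speed up CPU-bound pure-Python work such as text extraction, so this mostly helps with keeping the work bounded and with I/O waits; a process pool may be worth comparing.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable, Dict, Iterable


def process_many(
    paths: Iterable[str],
    worker: Callable[[str], object],
    max_workers: int = 2,
) -> Dict[str, object]:
    """Run worker(path) for each path on a small thread pool.

    Returns {path: result}. If a worker raises, the exception object is
    stored as that path's result instead of aborting the whole batch.
    """
    results: Dict[str, object] = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Map each future back to the path it was submitted for.
        futures = {pool.submit(worker, path): path for path in paths}
        for future in as_completed(futures):
            path = futures[future]
            try:
                results[path] = future.result()
            except Exception as exc:
                results[path] = exc
    return results
```

Keeping `max_workers` low is a deliberate choice here, since the script shares the server with other jobs.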

I found some things online, but nothing that addresses my problem, apart from some Python libraries that can apparently do what I'm looking for. I still have no idea how to find the keyword in the file, because it isn't on the same pages from one file to the next: it's different in every file!
