
Evan Lin

Posted on • Originally published at evanlin.com on

[TIL][Python] Python Tool for Online PDF Viewing, Comparison, and Data Import

title: [TIL][Python] Online PDF Page-by-Page Viewing and Comparison Tool for Importing Data (Python online PDF Viewer and comparison) and Python Snippets
published: false
date: 2023-08-04 00:00:00 UTC
tags: 
canonical_url: http://www.evanlin.com/til-python-tips/
---

## Small Project: Online PDF Viewer and Parsed-Data Comparison

- [https://github.com/kkdai/pdf\_online\_editor](https://github.com/kkdai/pdf_online_editor)
- Compare the files you are preparing to import, using PyPDF.
- If you are building vector embeddings from PDFs but don't know what the imported data looks like, you can use this small tool, built with Streamlit, to view it.

![image-20230805094306589](http://www.evanlin.com/images/2022/image-20230805094306589.png)
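
To give a rough idea of what "viewing the imported data" means, here is a minimal sketch, not the code from the repo. It assumes `pypdf` is installed; the function names `extract_page_texts` and `summarize_pages` and the `preview_chars` parameter are made up for illustration.

```python
from typing import Dict, List


def extract_page_texts(pdf_path: str) -> List[str]:
    """Return the extracted text of each page, in page order."""
    # Imported lazily so summarize_pages() works even without pypdf installed.
    from pypdf import PdfReader

    reader = PdfReader(pdf_path)
    # extract_text() may yield nothing for image-only pages, so fall back to "".
    return [page.extract_text() or "" for page in reader.pages]


def summarize_pages(texts: List[str], preview_chars: int = 80) -> List[Dict]:
    """Build a per-page summary: page number, text length, short preview."""
    summary = []
    for number, text in enumerate(texts, start=1):
        collapsed = " ".join(text.split())  # collapse whitespace for display
        summary.append(
            {"page": number, "chars": len(text), "preview": collapsed[:preview_chars]}
        )
    return summary
```

Calling `summarize_pages(extract_page_texts("some.pdf"))` (with a hypothetical file name) gives a quick per-page overview, which makes empty or badly extracted pages easy to spot before they go into an embedding pipeline.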

Recently I've spent a lot of time with Python while working on LangChain, and I ran into several things I hadn't paid attention to before. Here are some quick notes:

- [PyPDF2 has a CVE issue](https://nvd.nist.gov/vuln/detail/CVE-2023-36464), so it is better to switch back to [PyPDF](https://github.com/py-pdf/pypdf).
- You can use `pip freeze` and `pipreqs` together to generate `requirements.txt`.
- [Heroku](https://heroku.com) can install apt packages through an `Aptfile`:
  - Add the `"url": "heroku-community/apt"` buildpack.
  - List the packages you need in the `Aptfile` (e.g. pyimage needs `poppler-utils`).
  - Reference: [How to add apt packages to Heroku](https://www.nikitakazakov.com/heroku-apt-packages), or see the repo above.
- [Streamlit](https://streamlit.io/) is a great tool. For a front-end newbie like me, it provides some very useful things:
  - Various [data input](https://docs.streamlit.io/library/api-reference/widgets) widgets
  - [Session State](https://docs.streamlit.io/library/api-reference/session-state): a handy cookie/session-like store that keeps values across reruns
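
As an illustration of the Session State point above, here is a small sketch of page-by-page navigation in Streamlit. It is an assumption about how such a viewer could be wired up, not the repo's actual code; `clamp_page`, `render_viewer`, and the `page_index` key are hypothetical names.

```python
from typing import List


def clamp_page(index: int, total: int) -> int:
    """Keep a page index inside [0, total - 1]; return 0 when there are no pages."""
    if total <= 0:
        return 0
    return max(0, min(index, total - 1))


def render_viewer(pages: List[str]) -> None:
    """Show one page of extracted text with Previous/Next buttons."""
    # Imported lazily so clamp_page() can be used without Streamlit installed.
    import streamlit as st

    if not pages:
        st.info("No pages to show.")
        return

    # Streamlit reruns the script on every interaction, so the current
    # page has to live in session_state to survive the rerun.
    if "page_index" not in st.session_state:
        st.session_state.page_index = 0

    col_prev, col_next = st.columns(2)
    if col_prev.button("Previous"):
        st.session_state.page_index -= 1
    if col_next.button("Next"):
        st.session_state.page_index += 1
    st.session_state.page_index = clamp_page(st.session_state.page_index, len(pages))

    current = st.session_state.page_index
    st.write(f"Page {current + 1} of {len(pages)}")
    st.text_area("Extracted text", pages[current], height=300)
```

Calling `render_viewer(...)` from a script started with `streamlit run` would be enough to try it; without `st.session_state`, the page index would reset to the first page on every button click.
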
<iframe width="560" height="315" src="https://www.youtube.com/embed/92jUAXBmZyU" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" allowfullscreen=""></iframe>
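
The note about `pip freeze` and `pipreqs` does not say how the two are combined. One possible approach, sketched here as an assumption: let `pipreqs` decide which packages the project really imports, and take the exact pinned versions from `pip freeze`. The helper name `pin_requirements` is hypothetical.

```python
import re
from typing import Iterable, List


def _normalize(name: str) -> str:
    """Compare package names case-insensitively, treating -, _ and . alike."""
    return re.sub(r"[-_.]+", "-", name.strip()).lower()


def pin_requirements(pipreqs_lines: Iterable[str], freeze_lines: Iterable[str]) -> List[str]:
    """Keep only the `pip freeze` pins whose package also appears in pipreqs output."""
    wanted = set()
    for line in pipreqs_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # Drop any version specifier: "streamlit==1.25.0" -> "streamlit".
        name = re.split(r"[=<>!~]", line, maxsplit=1)[0]
        wanted.add(_normalize(name))

    pinned = []
    for line in freeze_lines:
        line = line.strip()
        if "==" not in line:
            continue  # skip editable installs, URLs, and blank lines
        name = line.split("==", 1)[0]
        if _normalize(name) in wanted:
            pinned.append(line)
    return sorted(pinned, key=str.lower)
```

The two inputs would be the lines of the file that `pipreqs` writes and the lines printed by `pip freeze`. Joining the result with newlines gives a `requirements.txt` that lists only the packages the code uses, pinned to the versions installed in the current environment.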