title: [TIL][Python] Online PDF Page-by-Page Viewing and Comparison Tool for Importing Data (Python online PDF Viewer and comparison) and Python Snippets
published: false
date: 2023-08-04 00:00:00 UTC
tags:
canonical_url: http://www.evanlin.com/til-python-tips/
---
## Small Project: Online PDF Viewer and Parsed-Data Comparison
- [https://github.com/kkdai/pdf\_online\_editor](https://github.com/kkdai/pdf_online_editor)
- Uses PyPDF to parse the files you are about to import, so you can compare them page by page.
- If you build vector embeddings from PDFs but don't know what the imported data actually looks like, this small tool, built with Streamlit, lets you inspect it.
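
To make the "what does the imported data look like" idea concrete, here is a minimal sketch of pulling text out of a PDF page by page with PyPDF and summarizing it. It is not the code from the repo above; the helper names `extract_pages` and `summarize_pages` and the 80-character preview width are assumptions for illustration only.

```python
def extract_pages(pdf_path):
    """Return the extracted text of every page, in page order."""
    # Imported lazily so summarize_pages() can be used without pypdf installed.
    from pypdf import PdfReader

    reader = PdfReader(pdf_path)
    # extract_text() can return an empty result for image-only pages.
    return [page.extract_text() or "" for page in reader.pages]


def summarize_pages(pages, preview_chars=80):
    """Build one row per page: 1-based page number, character count, short preview."""
    rows = []
    for number, text in enumerate(pages, start=1):
        flat = " ".join(text.split())  # collapse newlines and repeated spaces
        rows.append({"page": number, "chars": len(flat), "preview": flat[:preview_chars]})
    return rows
```

Calling `summarize_pages(extract_pages("some.pdf"))` (the file name is a placeholder) gives a quick table of what each page would contribute before anything is embedded; pages with a character count of zero are the ones worth checking by eye.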

While working on LangChain recently, I've spent a lot of time with Python and ran into many things I had not paid attention to before. Here are some quick notes:
- [PyPDF2 has a CVE issue](https://nvd.nist.gov/vuln/detail/CVE-2023-36464), so it is better to switch back to [PyPDF](https://github.com/py-pdf/pypdf).
- You can use `pip freeze` and `pipreqs` together to create `requirements.txt`.
- [Heroku](https://heroku.com) can use an `Aptfile` to install apt packages:
  - Add the `"url": "heroku-community/apt"` buildpack.
  - List the packages you need in `Aptfile` (e.g. pyimage needs `poppler-utils`).
  - Reference: [How to add apt packages to Heroku](https://www.nikitakazakov.com/heroku-apt-packages) or the repo above.
- [Streamlit](https://streamlit.io/) is a great tool, and it provides the following super useful things for a front-end newbie like me:
  - Various [data input](https://docs.streamlit.io/library/api-reference/widgets) widgets
  - [Session State](https://docs.streamlit.io/library/api-reference/session-state): a handy cookie/session-like store for keeping values across reruns
<iframe width="560" height="315" src="https://www.youtube.com/embed/92jUAXBmZyU" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" allowfullscreen=""></iframe>
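
One possible way to read the `pip freeze` plus `pipreqs` tip above: `pipreqs` finds the packages the project actually imports, while `pip freeze` knows the exact versions installed in the environment. The sketch below combines the two outputs, given as lists of lines. The function name `pin_imported` is hypothetical, and it assumes the import name reported by `pipreqs` matches the distribution name listed by `pip freeze`, which is not always true.

```python
import re


def _normalize(name):
    """Lowercase a package name and unify -, _ and . for comparison."""
    return re.sub(r"[-_.]+", "-", name).lower()


def pin_imported(freeze_lines, pipreqs_lines):
    """Keep only the `pip freeze` pins whose package also appears in the pipreqs output."""
    wanted = set()
    for line in pipreqs_lines:
        line = line.strip()
        if line and not line.startswith("#"):
            # Drop any version specifier such as ==, >=, ~=.
            name = re.split(r"[=<>~!]", line, maxsplit=1)[0].strip()
            wanted.add(_normalize(name))

    pinned = []
    for line in freeze_lines:
        line = line.strip()
        if "==" not in line:
            continue  # skip editable installs and direct URL references
        if _normalize(line.split("==", 1)[0]) in wanted:
            pinned.append(line)
    return pinned
```

Writing the returned lines to `requirements.txt` keeps the file limited to what the code imports while still pinning the versions that are known to work locally.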