@isslerman Please do not rely on traditional LLMs for OCR. They are not good at it. You should use a dedicated OCR provider instead, e.g. OCR API.
Thanks Ranjan, I will take a look at it. The project itself is more than just OCR. We are using LLMs for other purposes, and OCR is one step. But yes, OCR API could be a good solution too. I will check whether there is an open source option I can use, or look into the providers, because the documents in this case are too sensitive to be shared.
Best, Marcos Issler
Sure, that's understandable. There are always pros and cons with open source options. Please keep the accuracy issues with open source in mind. However, since you mentioned OCR on sensitive documents, I would highly recommend considering public cloud options such as AWS or Azure. They offer a full solution, and they are also GDPR and SOC 2 compliant. Generally, we need to keep several things in mind: cost, accuracy, security, reliability, scalability, etc. If you can do it with in-house open source and think that works best, please go ahead. Still, it's good to experiment and compare the options.