Optical Character Recognition for Table Extraction from PDF
Building and deploying a PDF-to-Excel system using PaddleOCR and FastAPI
Description
Optical Character Recognition (OCR) systems are used in diverse industries today. As better-performing deep learning models are developed, OCR solutions keep improving.
In this course, we shall take you on an amazing journey in which you'll implement and deploy a working OCR solution. To be more precise, we shall build a system in which a user inputs a PDF file and gets all the tables contained in the PDF as Excel sheets. We'll start by understanding how this system works, then build a working prototype on Google Colaboratory (Colab). From there, we shall build a simple API with the FastAPI framework. This API will let users upload a PDF file and receive a compressed file containing folders, which in turn contain Excel sheets for the different tables found in the PDF.
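To make the output format concrete, here is a minimal sketch of the packaging step only, using just the Python standard library. It assumes the tables have already been extracted and serialized to Excel bytes by the OCR step (not shown). The function name `build_tables_archive`, the one-folder-per-page layout, and the `page_<n>/table_<m>.xlsx` naming are illustrative assumptions, not the course's actual implementation.

```python
import io
import zipfile


def build_tables_archive(tables_by_page):
    """Pack extracted tables into an in-memory ZIP, one folder per page.

    tables_by_page maps a page number to a list of Excel workbooks, each
    already serialized to bytes by the (not shown) OCR and export step.
    """
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for page_number in sorted(tables_by_page):
            workbooks = tables_by_page[page_number]
            for table_index, workbook_bytes in enumerate(workbooks, start=1):
                # Hypothetical naming scheme: page_<n>/table_<m>.xlsx
                member = f"page_{page_number}/table_{table_index}.xlsx"
                archive.writestr(member, workbook_bytes)
    return buffer.getvalue()
```

In an API, the returned bytes could be sent back to the user as the response body with the `application/zip` media type.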
If you are ready to move a step further in your career, this course is for you, and we are super excited to help you achieve your goals!
This course is offered to you by Neuralearn. And just like every other course by Neuralearn, we place great emphasis on feedback. Your reviews and questions in the forum will help us improve this course. Feel free to ask as many questions as you like on the forum. We do our very best to reply in the shortest possible time.
Enjoy!!!
What You Will Learn!
- How to use PaddleOCR to build a working PDF-to-Excel system
- Basics of FastAPI
- Building an OCR API
- Taking a working solution from Google Colab to deployment
Who Should Attend!
- Python Developers curious about Machine Learning
- Software engineers wanting to deploy a working PDF-to-Excel solution
- Learners who want to make practical use of state-of-the-art OCR solutions