root
- server (backend)
- web (frontend)
Caution
Don't remove or modify the main_v3.py file; it contains the core processing script.
- Create virtual environment
python -m venv venv
Important
Python 3.8 or higher is required.
- Activate virtual environment
venv\Scripts\activate
Note
The above command is for Windows. On macOS / Linux, run source venv/bin/activate instead.
- Install deps
pip install -r requirements.txt
- Set env vars
set GOOGLE_GEMINI_API_KEY=YOUR_KEY
rem Optional for Sheets export:
set GOOGLE_APPLICATION_CREDENTIALS=path\to\service_account.json
set SHEETS_SPREADSHEET_ID=<Google_SpreadSheet_ID>
Important
Rename the .env.example file to .env and fill in your API credentials.
- Run backend
python -m server.app
- Process via API (example using curl)
curl -F "files[]=@RAJKIYA_VIDYALAYA_HIGH_SCHOOL_KHATOLA-Grade-9th.pdf" http://localhost:5001/api/process
Tip
Sample files for testing are available here: https://drive.google.com/drive/u/1/folders/160G318MN0CUz2CmPFB3vipVQNlJ78Qb0 (shared by Anand)
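For scripted use, the curl call above can be approximated from Python with only the standard library. This is a minimal sketch: the URL and the files[] field name come from the curl example, while build_multipart and upload are hypothetical helper names. The response format is not documented here, so the body is returned as raw text.

```python
import mimetypes
import sys
import urllib.request
import uuid
from pathlib import Path

# Endpoint taken from the curl example; adjust if the backend runs elsewhere.
API_URL = "http://localhost:5001/api/process"


def build_multipart(files, field="files[]"):
    """Encode (filename, bytes) pairs as a multipart/form-data body."""
    boundary = uuid.uuid4().hex
    parts = []
    for filename, data in files:
        mime = mimetypes.guess_type(filename)[0] or "application/octet-stream"
        header = (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
            f"Content-Type: {mime}\r\n\r\n"
        )
        parts.append(header.encode() + data + b"\r\n")
    parts.append(f"--{boundary}--\r\n".encode())
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def upload(paths, url=API_URL):
    """POST the given files and return the raw response body as text."""
    files = [(Path(p).name, Path(p).read_bytes()) for p in paths]
    body, content_type = build_multipart(files)
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": content_type}, method="POST"
    )
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8", errors="replace")


if __name__ == "__main__" and len(sys.argv) > 1:
    # Example: python upload.py student_data.pdf class_register.jpg
    print(upload(sys.argv[1:]))
```

The backend must already be running (python -m server.app) before upload is called.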
- Open a new terminal window
- Navigate to the frontend directory (cd web)
- Install dependencies
npm install
- Set the environment variable for API base URL
export NEXT_PUBLIC_API_BASE="http://localhost:5001"
- Start the development server
npm run dev
- Access the frontend at http://localhost:3000
- PDF files: .pdf
- Image files: .jpg, .jpeg, .png
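A client can filter out unsupported files before uploading them. The sketch below checks the extensions listed above; is_supported is a hypothetical helper, and treating extensions as case-insensitive is an assumption rather than documented backend behaviour.

```python
from pathlib import Path

# Extensions listed in this README.
SUPPORTED_EXTENSIONS = {".pdf", ".jpg", ".jpeg", ".png"}


def is_supported(path):
    """Return True if the file extension is one of the supported formats."""
    # Lower-casing the suffix is an assumption, so "scan.JPG" is accepted too.
    return Path(path).suffix.lower() in SUPPORTED_EXTENSIONS
```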
Tip
Processing is asynchronous but handles only one file at a time. Multi-page PDFs are supported. If multiple files are uploaded, they are processed sequentially, and the final result is exported under the name of the first uploaded file.
Caution
Low-quality images / PDFs may result in inaccurate text extraction.
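The sequential behaviour described in the Tip above can be pictured with a short sketch. It is not the backend's actual code: process_batch and the process_one callback are hypothetical names, and the only behaviour taken from this README is that files are handled one at a time and the export is named after the first file.

```python
from pathlib import Path


def process_batch(paths, process_one):
    """Process files one by one and name the export after the first file.

    process_one is a caller-supplied function (hypothetical) that takes a
    path and returns a list of extracted rows.
    """
    if not paths:
        raise ValueError("at least one file is required")
    rows = []
    for path in paths:  # strictly sequential, one file at a time
        rows.extend(process_one(path))
    export_name = Path(paths[0]).stem  # export named after the first upload
    return export_name, rows
```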
- Process a PDF file
python main_v3.py student_data.pdf
- Process an image file
python main_v3.py class_register.jpg
- Process with verbose logging
python main_v3.py student_data.pdf --verbose
Note
In the examples above, the --verbose flag enables detailed logging for troubleshooting and analysis. student_data.pdf and class_register.jpg are sample file names used only for demonstration.
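To call the script from another Python program, the same command lines can be assembled with subprocess. This is a sketch: build_command and run are hypothetical wrappers, and only the script name, the positional file argument, and the --verbose flag are taken from the examples above.

```python
import subprocess
import sys


def build_command(input_file, verbose=False, script="main_v3.py"):
    """Build the argument list for one run of the processing script."""
    command = [sys.executable, script, input_file]
    if verbose:
        command.append("--verbose")
    return command


def run(input_file, verbose=False):
    """Run the script and raise CalledProcessError on a non-zero exit code."""
    return subprocess.run(build_command(input_file, verbose), check=True)
```

Using sys.executable keeps the call inside the active virtual environment.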
We extract:
- SL No: Serial/Roll/Form numbers
- Student Name: Transliterated to English if needed
- Gender: Male/Female/Other (inferred if missing)
- Course Name: Subject combinations
- Phone: 10-digit mobile numbers
- Language: Detected language of original text
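One way to hold these fields in client code is a small dataclass. The attribute names below are hypothetical and may not match the keys the backend returns. normalize_phone is also an illustrative helper: it keeps a number only when exactly 10 digits remain after stripping separators, and it does not handle country-code prefixes.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class StudentRecord:
    """One extracted row; attribute names are illustrative only."""
    sl_no: str
    student_name: str
    gender: str
    course_name: str
    phone: Optional[str]
    language: str


def normalize_phone(raw):
    """Return a 10-digit number as a string, or None if it does not fit."""
    digits = re.sub(r"[^0-9]", "", raw or "")
    return digits if len(digits) == 10 else None
```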
- 0.9-1.0: High confidence, ready to use
- 0.7-0.89: Good confidence, minor review needed
- 0.5-0.69: Medium confidence, review recommended
- Below 0.5: Low confidence, manual verification required
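The bands above map directly onto a small lookup function. This is a sketch with a hypothetical function name and labels. The list leaves scores between 0.89 and 0.9 unassigned, so the code assumes each band starts at its lower bound (0.9, 0.7, 0.5).

```python
def confidence_band(score):
    """Map a confidence score in [0, 1] to one of the bands listed above."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be between 0 and 1")
    if score >= 0.9:
        return "high"    # ready to use
    if score >= 0.7:
        return "good"    # minor review needed
    if score >= 0.5:
        return "medium"  # review recommended
    return "low"         # manual verification required
```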
Tip
Create your own application credentials for Google Sheets to export results in this format: https://docs.google.com/spreadsheets/d/1SMh8zempXIIwMwoi2IehOldOdT-XZ7xK6z8DIt6BQds/edit?gid=91232041#gid=91232041. Once created, replace the existing credentials with your own.
Steps:
- Go to Google Cloud Console https://console.cloud.google.com/
- Create or select a project
  - Click the project dropdown.
  - Create a new project or use an existing one.
- Enable the Google Sheets API
  - Navigate to APIs & Services > Library.
  - Search for Google Sheets API.
  - Click Enable.
- Create credentials
  - Go to APIs & Services > Credentials.
  - Click Create Credentials > Service Account.
  - Give it a name and finish setup.
  - After creation, open the service account and go to Keys > Add Key > Create new key.
  - Choose JSON. A .json file will download; this file holds your application credentials.
- Set environment variable (for local use)
  - Store the path to that .json file in an environment variable:
    export GOOGLE_APPLICATION_CREDENTIALS="/path/to/your/service-account.json"
  - On Windows (PowerShell):
    setx GOOGLE_APPLICATION_CREDENTIALS "C:\path\to\service-account.json"
Share the Sheet with the service account
- Open your Google Sheet.
- Click Share.
- Add the service account email (something like
[email protected]). - Give it Viewer/Editor access as needed.
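The service account email needed for the Share step is stored in the downloaded key file under client_email. The sketch below reads it from the path in GOOGLE_APPLICATION_CREDENTIALS; service_account_email is a hypothetical helper name, and the optional env argument exists only to make the function easy to test.

```python
import json
import os


def service_account_email(env=None):
    """Return the client_email from the key file named in the environment."""
    env = os.environ if env is None else env
    key_path = env.get("GOOGLE_APPLICATION_CREDENTIALS")
    if not key_path:
        raise RuntimeError("GOOGLE_APPLICATION_CREDENTIALS is not set")
    with open(key_path, encoding="utf-8") as handle:
        key = json.load(handle)
    # Service account key files store the account address under this field.
    return key["client_email"]
```

Calling print(service_account_email()) after setting the environment variable shows the address to add in the Share dialog.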