In this tutorial I'll show you how to convert a PDF to a Google document in Google Drive. There are two ways to convert a PDF to a Google Doc or Google Sheet: the first is a manual process and the second is a fully automated process using the PDFTables API.
With the automated method, once everything is set up you can batch convert PDFs and upload them to Google Drive with one simple command.
If you'd like to skip ahead, go straight to Method 2 - Automatic.
Method 1 - Manual
In this method we will be using PDFTables.com to convert the PDF to a Google doc then uploading it to Google Drive.
Step 1
Go to PDFTables.com and click the green 'CONVERT A PDF' button.
Step 2
Find your PDF in the dialog box that appears then click 'Open'.
Step 3
The PDF will now upload and the output will be presented on a preview page. Click the 'Download as Excel' button, or click the arrow to choose another format.
Step 4
Go to your Google Drive account and move to the folder you'd like your converted file to be saved in.
Step 5
Click the 'New' button in the top left of your screen then choose 'File Upload'.
Step 6
Find your file in the dialog box that appears, then click 'Open'. The upload will start, and the file will appear in the folder once it is complete. You have now successfully converted your PDF to a Google doc.
Method 2 - Automatic
In this method we'll use the PDFTables API and the Google Drive API with Python to convert a PDF to CSV and then upload it to Google Drive. If you don't already have the PDFTables Python library set up and running on your machine, first go to our tutorial on how to convert a PDF to Excel with Python and follow steps 1 and 2.
Step 1
First, enable the Google Drive API and generate your credentials. Follow Annis Souames' blog post on how to upload files automatically to Drive with Python, up to the 'Connecting to Google Drive with PyDrive' section.
Step 2
Update your Python file to match the code below.
```python
from pydrive.auth import GoogleAuth
from pydrive.drive import GoogleDrive
import os
import pdftables_api

# Authenticate with Google; this opens a browser page for consent.
g_login = GoogleAuth()
g_login.LocalWebserverAuth()
drive = GoogleDrive(g_login)

# Convert the PDF to CSV with the PDFTables API.
c = pdftables_api.Client('insert_API_key')
c.csv('input_file_name.pdf', 'output_file_name')

# Upload the converted file to Google Drive under its own file name.
with open("output_file_name.csv", "r") as file:
    file_drive = drive.CreateFile({'title': os.path.basename(file.name)})
    file_drive.SetContentString(file.read())
    file_drive.Upload()
```
- Replace `insert_API_key` with your unique API key.
- Replace `csv` with the format you'd like to convert the PDF to. The options are `xlsx`, `xlsx_single`, `csv`, `xml` or `html`. If you change the format, also change the `.csv` extension in the `open()` call to match.
- Replace `input_file_name.pdf` with your PDF file name.
- Replace `output_file_name` with what you'd like to call the converted output file.
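Because the output file name has to match the format you pick, one option is to keep the format and its extension together in one place. The sketch below is only an illustration: `FORMAT_EXTENSIONS` and `output_name` are hypothetical helpers, not part of the PDFTables library, and the assumption that both Excel formats are saved as `.xlsx` is worth checking against the version you have installed.

```python
import os

# Hypothetical mapping from the format name to the extension of the
# converted file (an assumption for illustration, not read from the library).
FORMAT_EXTENSIONS = {
    "xlsx": ".xlsx",
    "xlsx_single": ".xlsx",
    "csv": ".csv",
    "xml": ".xml",
    "html": ".html",
}


def output_name(pdf_name, fmt):
    """Return the output file name for pdf_name converted to fmt."""
    if fmt not in FORMAT_EXTENSIONS:
        raise ValueError("unsupported format: " + fmt)
    # Drop any folder part and the .pdf extension, then add the new extension.
    base, _ = os.path.splitext(os.path.basename(pdf_name))
    return base + FORMAT_EXTENSIONS[fmt]


print(output_name("input_file_name.pdf", "csv"))
```

Bear in mind that Excel output is binary, so reading it as text and passing it to `SetContentString` is unlikely to work; PyDrive's `SetContentFile`, which takes a file name, is probably the better fit for those formats.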
Step 3
Save your Python script in the same folder as your PDF, then run the script. A browser page will appear where you will need to choose an account and allow access.
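The browser page appears on every run. If that gets tedious, PyDrive can save the credentials to a file and reuse them next time. The sketch below shows one way this might look; `authorize` and the file name `mycreds.txt` are hypothetical, and the function accepts any PyDrive `GoogleAuth` object.

```python
def authorize(gauth, creds_path="mycreds.txt"):
    """Authenticate a PyDrive GoogleAuth object, reusing saved credentials.

    creds_path is a hypothetical file name chosen for this example.
    """
    # Try to load credentials saved by a previous run.
    gauth.LoadCredentialsFile(creds_path)
    if gauth.credentials is None:
        # Nothing saved yet: open the browser consent page.
        gauth.LocalWebserverAuth()
    elif gauth.access_token_expired:
        # The saved token has expired: refresh it without user interaction.
        gauth.Refresh()
    else:
        # The saved credentials are still valid.
        gauth.Authorize()
    # Store the credentials for the next run.
    gauth.SaveCredentialsFile(creds_path)
    return gauth


# Usage (requires PyDrive and your client_secrets.json file):
#   from pydrive.auth import GoogleAuth
#   from pydrive.drive import GoogleDrive
#   drive = GoogleDrive(authorize(GoogleAuth()))
```

The saved file grants access to your Drive account, so keep it out of shared folders and version control.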
Step 4
Go to your Google Drive account. The most recent file will be the one you've just uploaded via the API.
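If you'd rather confirm the upload from Python than from the browser, you can search Drive by file title. This is a rough sketch: `title_query` and `find_by_title` are hypothetical helpers, and it assumes the `drive` object from the script above along with PyDrive's `ListFile(...).GetList()` and the Drive search syntax that PyDrive passes through.

```python
def title_query(title):
    """Build a Drive search query for non-trashed files with this exact title."""
    # Backslashes and single quotes must be escaped inside the quoted value.
    escaped = title.replace("\\", "\\\\").replace("'", "\\'")
    return "title = '{}' and trashed = false".format(escaped)


def find_by_title(drive, title):
    """Return the files in Drive whose title equals `title`."""
    return drive.ListFile({"q": title_query(title)}).GetList()


# Usage, with the authenticated `drive` object from the script above:
#   for f in find_by_title(drive, "output_file_name.csv"):
#       print(f["title"], f["id"])
```

An empty list suggests the upload did not complete. More than one result means several files share the same title, which Drive allows.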
Converting multiple PDFs
If you would like to convert and upload multiple PDFs within a folder, use the script below.
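```python
from pydrive.auth import GoogleAuth
from pydrive.drive import GoogleDrive
import os
import pdftables_api

# Authenticate with Google; this opens a browser page for consent.
g_login = GoogleAuth()
g_login.LocalWebserverAuth()
drive = GoogleDrive(g_login)

c = pdftables_api.Client('insert_API_key')

file_path = "C:\\Users\\MyName\\Documents\\PDFs\\"

# Convert every PDF in the folder, saving each CSV next to its PDF.
for file in os.listdir(file_path):
    if file.endswith(".pdf"):
        c.csv(os.path.join(file_path, file),
              os.path.join(file_path, file + '.csv'))

# Upload every CSV in the folder (including any that were already there).
for file in os.listdir(file_path):
    if file.endswith(".csv"):
        file_drive = drive.CreateFile({'title': os.path.basename(file)})
        # Upload the file's contents rather than just its name.
        with open(os.path.join(file_path, file), "r") as csv_file:
            file_drive.SetContentString(csv_file.read())
        file_drive.Upload()
```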
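Running the batch script a second time converts every PDF again, which uses up more of your PDFTables page allowance. One possible refinement is to skip PDFs that already have a converted CSV. The sketch below uses only the standard library; `pending_conversions` is a hypothetical helper, and it assumes each CSV sits beside its PDF and is named by adding `.csv` to the PDF file name, as in the script's `file + '.csv'`.

```python
from pathlib import Path


def pending_conversions(folder):
    """Return (pdf_path, csv_path) pairs for PDFs with no CSV beside them yet."""
    pairs = []
    for pdf in sorted(Path(folder).glob("*.pdf")):
        # Same naming as the batch script: report.pdf -> report.pdf.csv
        csv_path = pdf.with_name(pdf.name + ".csv")
        if not csv_path.exists():
            pairs.append((pdf, csv_path))
    return pairs


# Usage, with the PDFTables client `c` and `file_path` from the script above:
#   for pdf, csv_path in pending_conversions(file_path):
#       c.csv(str(pdf), str(csv_path))
```

The upload loop would need a similar check if you also want to avoid uploading the same CSV twice.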