PDFTables has an API (Application Programming Interface) so that programmers can integrate PDF data extraction into your operations. It's a simple web based API, so can be called from any programming language, including R.
Automation is key to improving processes, saving money and freeing up resource. As businesses begin to focus on a digital transformation, finding the right tools for the job is crucial.
I'm going to convert a sample PDF invoice from a freight company which will convert into 1 page and to XLSX format. You can convert your PDFs to Excel, CSV, XML or HTML with PDFTables. If you would like to convert only certain pages from a PDF document, see our tutorial on how to extract pages from a PDF document. Let's get started!
Go to PDFTables.com and head to the API page. Here you will find examples of using our API with many languages. Click 'R' from the left menu then follow the R package for PDF to Excel conversion link.
Once all has been installed, you're ready to convert your PDF. If your PDF is not saved in the current working directory, you will need to specify the path in the command. Run the following command, ensuring to input the correct filename, format and API key.
convert_pdf('test/index.pdf', output_file = NULL, format = "xlsx-single", message = TRUE, api_key = "insert_API_key")
Note: to convert your PDF to an Excel file, use either
xlsx-multiple as the format.
Once the conversion is complete, a message will appear with the path where your converted file is located. To get a count of the number of page credits you have available, simply run the following command:
You have now successfully converted your PDF using the PDFTables API with R.
Do you have more questions?
Love PDFTables? Leave us a review on our Trustpilot page!
For more useful R tutorials, head over to r-bloggers.com.