Blog

How to convert PDF to Excel using R

How to convert PDF to Excel using R

PDFTables has an API (Application Programming Interface) so that programmers can integrate PDF data extraction into your operations. It's a simple web based API, so can be called from any programming language, including R.

Automation is key to improving processes, saving money and freeing up resource. As businesses begin to focus on a digital transformation, finding the right tools for the job is crucial.

I'm going to convert a sample PDF invoice from a freight company which will convert into 1 page and to XLSX format. You can convert your PDFs to Excel, CSV, XML or HTML with PDFTables. If you would like to convert only certain pages from a PDF document, see our tutorial on how to extract pages from a PDF document. Let's get started!

Step 1

Go to PDFTables.com and head to the API page. Here you will find examples of using our API with many languages. Click 'R' from the left menu then follow the R package for PDF to Excel conversion link.

API R

Step 2

Now you'll be at a Github repository created by Expersso. You can install the PDFTables package from either CRAN (you'll need to install CRAN first) or Github. I installed from CRAN:

Expersso Github

Step 3

Once all has been installed, you're ready to convert your PDF. If your PDF is not saved in the current working directory, you will need to specify the path in the command. Run the following command, ensuring to input the correct filename, format and API key.

convert_pdf('test/index.pdf', output_file = NULL, format = "xlsx-single", message = TRUE, api_key = "insert_API_key")

Note: to convert your PDF to an Excel file, use either xlsx-single or xlsx-multiple as the format.

Step 4

Once the conversion is complete, a message will appear with the path where your converted file is located. To get a count of the number of page credits you have available, simply run the following command:

get_remaining("insert_API_key")
PDFTables page count

You have now successfully converted your PDF using the PDFTables API with R.

Do you have more questions?

Check out our other blog posts here or our FAQ page. Also, feel free to contact us.

Love PDFTables? Leave us a review on our Trustpilot page!

For more useful R tutorials, head over to r-bloggers.com.

Icons made by Smashicons from www.flaticon.com is licensed by CC 3.0 BY
PDFTables.com uses cookies to provide a service and collect information about how you use our site. If you don't want us to collect information about your site behaviour, please go to our privacy page for more information. Read about our use of cookies.