Maintenance notice

PDFTables.com will be undergoing essential maintenance on Monday 11th November, 2024 for a period from 7 AM until 10 AM GMT; both this site and the API may be unavailable during that period.

Blog

How to change the delimiter in a CSV file

How to change the delimiter in a CSV file

In this blog post we will be showing you how to change the delimiter in a CSV file using Python. A CSV (comma-separated values) file is a text file that uses commas to separate values and can be opened in Microsoft Excel, Google Sheets, a text editor and more.

There may be a time when you need to change the delimiter in order to, for example, import your file into an application. In this example we will be changing the delimiter from a comma to a semicolon.

Before we start

If you don't have Python installed on your computer, follow steps 1 and 2 of blog post Convert PDF to Excel, CSV or XML with Python. You will now have Python, Anaconda and the PDFTables Python library installed.

I will be changing the delimiter in a sample invoice from a freight company that has been converted from PDF to CSV using PDFTables.com.

Sample of CSV file

Step 1

Create a new Python file in the location where your CSV file is saved. Make sure the file is saved as .py format and use a file name of your choice. Add the following code to the new file:

import csv

reader = csv.reader(open("freight_invoice.csv", "rU"), delimiter=',')
writer = csv.writer(open("output.txt", 'w'), delimiter=';')
writer.writerows(reader)

print("Delimiter successfully changed")

You will need to make some or all of the following changes to the script:

  • Replace freight_invoice.csv with name of your input file
  • Replace output.txt with name you'd like to give your output file
  • Replace the semicolon in delimiter=';' with a new delimiter of your choice

Step 2

Open up an Anaconda Prompt instance. You can find this by searching in your computers search box. Press Enter to open an instance.

Locate the folder in which the Python script is saved using cd .. or cd *folder_name* to move around the directories.

Anaconda instance

Step 3

Type python change_delimiter.py (replacing change_delimiter.py with the name of your Python file) then press Enter.

The comma-separated file will now be read in then a new file will be output in .txt format with the new delimiter. You will see the message Delimiter successfully changed once the script has finished running.

Step 4

You can now open the new file in a text editor or import it into an application.

Converted PDF page 5

Converting PDF to CSV first

If you need to extract data from a PDF first, you can convert PDF to CSV using PDFTables.com. The following code uses the PDFTables API to convert PDF to CSV, then uses the code snippet from above to change the delimiter.

import csv
import pdftables_api

c = pdftables_api.Client('my-api-key')
c.csv('freight_invoice.pdf', 'freight_invoice.csv') 

reader = csv.reader(open("freight_invoice.csv", "rU"), delimiter=',')
writer = csv.writer(open("output.txt", 'w'), delimiter=';')
writer.writerows(reader)

print("Delimiter successfully changed")

You will need to make the following changes to the script:

  • Replace my-api-key with your API key which you will find here.

Do you have more questions?

Check out our other blog posts here or our FAQ page. Also, feel free to contact us.

Love PDFTables? Leave us a review on our Trustpilot page!

Semicolon icon, Colon icon and Hashtag Large icon icon by Icons8. CSV icon made by prettycons from www.flaticon.com.
PDFTables.com uses cookies to provide a service and collect information about how you use our site. If you don't want us to collect information about your site behaviour, please go to our privacy page for more information. Read about our use of cookies.