What are CSV files? And how do I deal with them?
If you already know what CSV files are then you can skip this post.
However, for the rest of you who have encountered a file ending in .csv, and wondered what the heck is the difference between this and your classic .xlsx spreadsheet, well, wonder no more.
So what are CSV files?
CSV stands for Comma Separated Values. It is a format best used for tabular data, rows and columns, exactly like a spreadsheet. A CSV file should have the same number of columns in each row.
A CSV file stores data in rows and the values in each row is separated with a separator, also known as a delimiter. Although the file is defined as Comma Separated Values, the delimiter could be anything.
The most common delimiters are: a comma (,), a semicolon (;), a tab (\t), a space ( ) and a pipe (|). But theoretically anything can be a delimiter as long as it is consistent.
We can see the delimiters in action if we open up a .csv file in a text editor, like Notepad in Windows or TextEdit on Mac.
Wait, how come we can open .csv in a text editor? I can’t open my Excel files in a text editor
This is a great question and requires some diving into what files are.
To a computer, CSV files are the same as text files, with an added requirement that a delimiter is used to break up values in a row. So from a computer’s point of view, a .csv file is the EXACT same as the .txt file.
That’s why you can open and deal with a .csv file in any software designed to handle .txt files.
An Excel file (.xlsx, .xls) is a special format designed specifically to be opened in Microsoft Excel. That’s why you can’t open .xlsx files in any random old software. The file opening software has to be built for .xlsx files.
This is why csv files are the preferred data format for many people. They are easily read in many different softwares.
Hmm okay, so how come I can open my .csv files in Excel too?
This is a feature of Excel. It makes .csv files easy to open and deal with. A lot of times, your computer’s default software to open .csv files is actually Excel.
No matter the delimiter, Excel will open the .csv file BUT it will only automatically parse the .csv file into spreadsheet cells if the delimiter is a comma (,).
Ah, I see. So how do I parse other delimiters to cells in Excel?
Great question. There are two ways to do it, which share most of the same steps.
We use both methods depending on the situation, but if you are just starting out, it may be easier to use method 1 until you get comfortable dealing with .csv files. Then you can use the method that you find more appropriate.
We will use the pipe delimited file (|) to illustrate both methods.
Split Text to Columns
- Select the column with all the values. It should be only the first column. If your file has been opened in multiple columns then use method 2
2. Click on Data -> Text to Columns. Select Delimited in the Convert Text to Columns Wizard, then click Next >
3. In Step 2 of 3, select the appropriate delimiter.
Excel provides some default options for the most common delimiters along with an option for any other delimiter, which is what we will use in this situation.
We will specify the pipe (|) in this box, and then click Next >.
You will see there are a few options here such as Treat consecutive delimiters as one and Text qualifier. We will discuss them later on in this post.
4. You will now arrive on Step 3 out of 3. This lets you convert columns to a particular data format. We will let them be text for now, so just click on Finish and you will be greeted with beautifully parsed columns.
Import Data from Text
- Open a new Excel workbook
- Click on Data -> From Text. Select the .csv file you want to open and click Get Data.
We use the From Text option because remember to a computer a .csv file is the same as a .txt file. We can import an actual .txt file here too, and if it is has a delimiter it would work as well
3. After you click Get Data from your file selector, you will follow the same steps as steps 2–4 in method 1. Those remain the same and you will get your values parsed in cells again.
Okay, this is all cool. But how can I create CSV files?
There are two ways to create CSV files.
One way is to create them in a text editor like Notepad or TextEdit, making sure that the same delimiter is consistently used and the number of values in each row is the same.
This method is cumbersome and should be used for small values.
For larger values and files, we can use our favorite software MS Excel to create CSV files. Let’s go through an example to see how to do that.
- Open up a new Excel workbook
- Write Name, Occupation and Notes in the first row
3. Fill the next 5 rows as following
Now we can save this file in .csv format.
Click on Save As, give the file a name, select Comma Separated Value as file type, and click Save.
This saves the file using comma (,) as the delimiter. This is the default, in both windows and mac, MS Excel saves .csv files with comma (,) as the delimiter.
If you want to save the file with another delimiter, for example pipe (|) or semi-colon (;), it is unfortunately not easy. It is an advanced topic that we don’t want to include it in this tutorial.
Now, let’s see how our newly created file looks in a text editor.
Sweet, but what are these quotation marks around some of the values?
Glad you asked. This is the final piece of the csv puzzle. You will run into many cases when your values contain your delimiter.
In this example above, in Notes, two of the values: I love him, great guy and Good speaker, tall contain a comma, which is also our delimiter.
So we need a way to tell the computer to treat these commas differently than the rest of the commas in the file, otherwise we will have a different number of values in each row.
To do this we quote those values that contain our delimiter. You should quote them using double quotes (“comma, containing value”). Excel automatically does this but if you are creating a csv file using another method you need to be aware of this fact.
In our pipe delimited file, we would quote any values that contained a pipe, for example “I love him|great guy”.
This is why some people, us included, advocate for using a delimiter that won’t pop up in your values such as a pipe (|). This way you won’t need to worry about double quoting values.
Reading with Double Quotes
Excel automatically handles reading the double quotes and ignoring because of the setting below.
That’s it! Enjoy your new found learning of dealing with CSV files!
If you want a custom app, data cleaned, or a project built using spreadsheets, you can check out our consulting service here: https://www.lovespreadsheets.com!