"
Last Updated: Wednesday 29th December 2021

Our day-to-day work with computers involves a great deal of data and information exchange. To keep this exchange smooth and easy, users often prefer text files, and one of the most common formats for these text files is '.csv'.

But what is a CSV file? How can we use it to extract and alter data? What is a Python library? Are there other libraries available to process this format? What are the commands and programs in each library?

Don't worry! All these questions are answered here in this article. You'll learn how to use CSV files and process them properly in code to get the most out of your data. Let's get started.

What Is A CSV File?

A Comma-Separated Values (CSV) file is a plain text file that uses a character set such as ASCII or Unicode (UTF-8) to store data. It is mainly used to store data that was originally in tabular form, with each line representing one record. The fields of a record are separated by commas, which gives the format its name. The separators are called 'delimiters.'

[Image: what is a csv file]

The image shows how a CSV file looks. Each line of the file represents a row of the tabular data. You can observe that commas separate the fields.

Why are CSV files created?

Programs that handle large amounts of data create CSV files to export data from spreadsheets or databases conveniently and import it into other programs. From a programmer's point of view, CSV files are straightforward to work with. Any programming language that accepts text file input and supports string data manipulation can work with CSV files directly.
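As a rough illustration of that last point, the sketch below splits CSV text into records using nothing but string methods. The sample data is hypothetical, and this naive approach assumes that no field contains a comma or a quote, which is exactly the case a real CSV parser handles for you.

```python
# Hypothetical sample data; not taken from any real file.
TEXT = "Name,Branch,Year\nAsha,CSE,2\nRavi,ECE,3\n"


def split_records(text):
    """Naively split CSV text into lists of fields.

    Assumes no field contains a comma, a quote, or a line break.
    """
    return [line.split(",") for line in text.splitlines() if line]


records = split_records(TEXT)
for record in records:
    print(record)
```

For anything beyond such simple data, a dedicated CSV library is the safer choice.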

Reading CSV Files With Python's Built-in CSV Library

Python's built-in csv library contains functions to read from as well as write to CSV files. This library supports various CSV formats and includes objects and functions for processing CSV files.

Reading Files With csv

Object used to read from a CSV file: reader object

Function used to open the CSV file: open()

The built-in open() function of Python opens the CSV file as a text file. This function returns a file object that is then passed to csv.reader(), which creates the reader object that processes the file.

Let us take an example of a CSV file, storing the details of students as its data.

Filename: students.txt

[Image: CSV Example]

Python code to read this file:

[Image: code to read 1]

The output is as follows:

[Image: Output]

Each row returned by the reader object is a list of strings containing the data after the delimiters have been removed.
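The reading steps described in this section can be sketched roughly as follows. The student data is a hypothetical stand-in, since the real contents of students.txt are not reproduced here, and an in-memory io.StringIO object is used in place of the file so the example runs on its own.

```python
import csv
import io

# Hypothetical stand-in for the contents of students.txt.
SAMPLE = "Name,Branch,Year\nAsha,CSE,2\nRavi,ECE,3\n"


def read_rows(text_file):
    """Return every record in the file as a list of strings."""
    return list(csv.reader(text_file))


rows = read_rows(io.StringIO(SAMPLE))
header, records = rows[0], rows[1:]

print("Columns:", ", ".join(header))
for row in records:
    # Fields are accessed by position, and every value is a string.
    print(f"{row[0]} studies {row[1]} in year {row[2]}")
```

With a real file, the same function could be called as read_rows(f) inside a with open("students.txt", newline="") as f: block.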

Reading Files Into a Dictionary With csv 

We use the same file as above.

The code to read it as a dictionary is as follows:

[Image: Dictionary reading code]

Notice the difference in the code here. With a dictionary, each field is accessed by its column name, written as the key in row[...], rather than by its position number (as in the previous method).

The result is the same output as before:

[Image: Output]

The first line of the CSV file contains the keys used to build the dictionary. If your CSV file has no such header line, you can specify your own keys through the fieldnames parameter.
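A minimal sketch of both cases, using csv.DictReader on hypothetical student data: in the first case the keys are taken from the header line, and in the second they are supplied through fieldnames because the data has no header.

```python
import csv
import io

# Hypothetical sample data, with and without a header line.
WITH_HEADER = "Name,Branch,Year\nAsha,CSE,2\nRavi,ECE,3\n"
NO_HEADER = "Asha,CSE,2\nRavi,ECE,3\n"

# Case 1: the first line of the file provides the dictionary keys.
with_header = list(csv.DictReader(io.StringIO(WITH_HEADER)))

# Case 2: no header line, so the keys are passed in explicitly.
no_header = list(
    csv.DictReader(io.StringIO(NO_HEADER), fieldnames=["Name", "Branch", "Year"])
)

for row in with_header:
    # Fields are looked up by column name instead of by position.
    print(f"{row['Name']} studies {row['Branch']} in year {row['Year']}")
```

Both variants produce the same list of dictionaries for this data.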

Optional Parameters of reader object:

The reader object can work with different styles of CSV files with the help of additional parameters. Some of these parameters are discussed here:

  • delimiter specifies the character used to separate each field in a record. The default is the comma (',').
  • quotechar defines the character used to surround fields that contain the delimiter character. The default value of this parameter is a double quote ('"').
  • escapechar describes the character used to escape the delimiter character when quotes are not used. The default is no escape character.

For example, suppose we want to add another field, 'date', to the ‘students.txt’ file discussed above.

The date format itself contains commas, so using the comma as the delimiter here would cause confusion and make processing cumbersome. To avoid this, we can specify another character as the delimiter. Alternatively, you can wrap such data in quotes, using the character set by the quotechar parameter, because delimiters inside a quoted field are not treated as separators. If you do not want to use quotes at all, you can instead mark each delimiter inside the data with the character set by the escapechar parameter.
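The three options just described can be compared side by side. The sketch below is illustrative only: the date value and the choice of '|' and '\' as delimiter and escape character are assumptions, not something taken from the original file.

```python
import csv
import io

# Option 1: use a different delimiter so commas in the date are plain text.
pipe_text = "Name|Date\nAsha|March 5, 2021\n"
pipe_rows = list(csv.reader(io.StringIO(pipe_text), delimiter="|"))

# Option 2: keep the comma delimiter and quote the field containing a comma.
quoted_text = 'Name,Date\nAsha,"March 5, 2021"\n'
quoted_rows = list(csv.reader(io.StringIO(quoted_text), quotechar='"'))

# Option 3: no quotes at all; escape the comma inside the data instead.
escaped_text = "Name,Date\nAsha,March 5\\, 2021\n"
escaped_rows = list(
    csv.reader(io.StringIO(escaped_text), quoting=csv.QUOTE_NONE, escapechar="\\")
)

for rows in (pipe_rows, quoted_rows, escaped_rows):
    print(rows[1])
```

All three readers recover the date as a single field, so the choice mostly depends on what the program producing the file can write.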

Writing CSV Files With csv

The object used: writer object.

Method used: .writerow()

[Image: Writing Command]

The optional quotechar parameter defines which character is used to quote fields when writing. The quoting parameter controls when fields are quoted, and it can take the following values:

  • If the value of the quoting parameter equals csv.QUOTE_MINIMAL, then the method .writerow() will quote fields only if they contain the delimiter, the quotechar, or a line-terminator character. This setting is the default.
  • If the value of the quoting parameter equals csv.QUOTE_ALL, then the method .writerow() quotes all the fields present in the data.
  • If the value of the quoting parameter equals csv.QUOTE_NONNUMERIC, then the method .writerow() will quote all fields containing text data and leave numeric fields unquoted. When a file is read with the same setting, all unquoted fields are converted to the float data type.
  • If the value of the quoting parameter equals csv.QUOTE_NONE, then the method .writerow() never quotes fields and escapes delimiters instead, so an escapechar must be supplied if the data contains the delimiter.
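To see the four quoting settings in action, the sketch below writes the same hypothetical row with each of them. The render() helper is made up for this illustration, and lineterminator is set to "\n" only to keep the results easy to compare.

```python
import csv
import io


def render(row, **fmt):
    """Write one row with csv.writer and return the resulting text."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n", **fmt).writerow(row)
    return buffer.getvalue()


# Hypothetical record: two text fields (one containing a comma) and a number.
row = ["Asha", "March 5, 2021", 2]

minimal = render(row, quoting=csv.QUOTE_MINIMAL)
quote_all = render(row, quoting=csv.QUOTE_ALL)
nonnumeric = render(row, quoting=csv.QUOTE_NONNUMERIC)
# QUOTE_NONE needs an escapechar because the date contains the delimiter.
no_quotes = render(row, quoting=csv.QUOTE_NONE, escapechar="\\")

print(minimal, end="")     # Asha,"March 5, 2021",2
print(quote_all, end="")   # "Asha","March 5, 2021","2"
print(nonnumeric, end="")  # "Asha","March 5, 2021",2
print(no_quotes, end="")   # Asha,March 5\, 2021,2
```

To write to a real file, the io.StringIO buffer would be replaced by a file opened with open(filename, "w", newline="").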

Writing Files From A Dictionary With csv

[Image: Writing with dictionary]

Notice that when creating the DictWriter object we have to pass fieldnames as well, and each dictionary given to writerow() uses those field names as its keys.
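A minimal sketch of writing from dictionaries with csv.DictWriter follows. The student records are hypothetical, and an in-memory buffer stands in for the output file.

```python
import csv
import io

# Hypothetical records to write; each dictionary is one row.
students = [
    {"Name": "Asha", "Branch": "CSE", "Year": 2},
    {"Name": "Ravi", "Branch": "ECE", "Year": 3},
]

buffer = io.StringIO()
# fieldnames is required: it fixes the column order and the header line.
writer = csv.DictWriter(
    buffer, fieldnames=["Name", "Branch", "Year"], lineterminator="\n"
)
writer.writeheader()
for student in students:
    writer.writerow(student)

output = buffer.getvalue()
print(output, end="")
```

By default, DictWriter raises a ValueError if a dictionary contains a key that is not listed in fieldnames.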

Processing CSV Files With Pandas Library

pandas is another library available for processing CSV files. It is recommended if you have a considerable amount of data to analyze. pandas is an open-source Python library that can be installed in any Python environment. One popular way to use it is the Anaconda distribution, which comes with Jupyter Notebook. pandas provides many tools and data structures for processing, sharing and analyzing data.

In Anaconda, pandas and its dependencies are installed with the conda install pandas command, as shown below:

[Image: install pandas]

If you are using pip or pipenv for Python installations, the equivalent commands are pip install pandas and pipenv install pandas:

[Image: Pandas Command 3]

With the installation of the pandas library complete, let us now learn how to read and write CSV files with this library.

Reading CSV Files with pandas

Reading a CSV file with the pandas library is very easy compared to other libraries because of its compact code structure.

Filename: students.csv

[Image: Reading a csv file in the pandas library]

df here is short for 'DataFrame.' The pandas.read_csv() method opens and parses the CSV file provided and saves the data in a DataFrame. Printing the DataFrame gives the desired output.

[Image: Pandas output]

The output differs from that of the methods described earlier: pandas prints the data as a table, with the column names as a header. It also adds an index column, which starts from '0' rather than '1'.
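Reading with pandas can be sketched as below, assuming pandas is installed. The data is a hypothetical stand-in for students.csv, passed in through io.StringIO so the example does not depend on a file being present.

```python
import io

import pandas as pd

# Hypothetical stand-in for the contents of students.csv.
SAMPLE = "Name,Branch,Year\nAsha,CSE,2\nRavi,ECE,3\n"

# read_csv accepts a file path or any file-like object.
df = pd.read_csv(io.StringIO(SAMPLE))

print(df)
# The automatically generated row index starts at 0.
print(list(df.index))
```

With a real file, the call would simply be pd.read_csv("students.csv").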

Writing CSV Files with pandas

The code for writing data into a CSV file is as shown below:

Filename: students.csv

[Image: Writing CSV Files with pandas]

Here the ‘Name’ column of the file has been made the index column. To write the data out, the print(df) call is replaced with a call to df.to_csv().
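The idea can be sketched as follows, again assuming pandas is installed and using hypothetical data. The 'Name' column is made the index with index_col, one column is modified as an example of altering the data, and df.to_csv() is called without a path so that it returns the CSV text instead of writing a file.

```python
import io

import pandas as pd

# Hypothetical stand-in for the contents of students.csv.
SAMPLE = "Name,Branch,Year\nAsha,CSE,2\nRavi,ECE,3\n"

# Use the 'Name' column as the index instead of the default 0, 1, 2, ...
df = pd.read_csv(io.StringIO(SAMPLE), index_col="Name")

# Example change to the data: move every student up one year.
df["Year"] = df["Year"] + 1

# With no path argument, to_csv() returns the CSV text as a string.
csv_text = df.to_csv()
print(csv_text, end="")
```

Passing a file name, for example df.to_csv("students_updated.csv") (a hypothetical name), would write the same text to disk.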

Conclusion

CSV files are widely used in today's data and information exchange. Hence, if you understand their structure, composition, and usage, then you are ready to rule the programming world.

Using CSV files makes working with data more efficient. Combined with a programming language like Python, they become far more useful for data manipulation, assessment and application. Other libraries are also available for processing CSV files in Python, but this article has covered the most efficient and compact ones.