Most database and spreadsheet applications can conveniently output table data in the form of CSV (comma-separated values) files. While CSV files are handy because of their simplicity and portability, they are a poor way to display or analyze large amounts of data. To overcome this limitation, a programmer can use the Python programming language and the matplotlib library to plot data from a CSV file and create a readable, visually attractive graph suitable for web or print publication.
Preparation for Plotting CSV Data
Before you actually plot the CSV file in Python, you'll want to make sure you have all the necessary tools and create a test file. This includes having Python and necessary libraries installed as well as having a CSV file that contains two columns of numerical data.
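One quick way to confirm the tools are in place is to ask Python itself. The following is a minimal sketch, not an official installation check; the helper name is_installed is made up for illustration.

```python
import importlib.util
import sys

def is_installed(module_name):
    """Return True if Python can locate the named module (hypothetical helper)."""
    return importlib.util.find_spec(module_name) is not None

print("Python version:", sys.version.split()[0])
for name in ("csv", "matplotlib"):
    print(name, "available:", is_installed(name))
```

If matplotlib is reported as unavailable, it needs to be installed before the plotting steps will run.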
Step 1: Create Test File
First, open your text editor and create a simple CSV file for testing. A sample might look like this:
1,2
2,3
3,8
4,13
5,18
6,21
7,13
7.5,4
2.5,4.3
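If you would rather not type the file by hand, the same test data can be written from Python. This is a small sketch that assumes the file should be named test.csv and saved in the current working directory; the values are the sample pairs shown above.

```python
import csv

# Sample (x, y) pairs copied from the test file above.
rows = [(1, 2), (2, 3), (3, 8), (4, 13), (5, 18),
        (6, 21), (7, 13), (7.5, 4), (2.5, 4.3)]

# newline="" lets the csv module control line endings on every platform.
with open("test.csv", "w", newline="") as handle:
    csv.writer(handle).writerows(rows)
```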
Step 2: Import Necessary Libraries
Now you're ready to import the necessary Python libraries into your code file with these lines of code:
import matplotlib.pyplot as plt
import csv
import sys
Plot Graph in Python from CSV
With your preparation out of the way, you can now get started actually using Python to draw a graph from a CSV file.
Step 1: Create Reader Object
Open the CSV file and create a reader object from it. Declare variables to define the upper and lower bounds for the x and y axis values of the graph:
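csv_reader = csv.reader(open('test.csv'))
# sys.maxint exists only in Python 2; sys.float_info.max works in Python 3.
# Each bound starts at the opposite extreme so the first data value replaces it.
bigx = -sys.float_info.max
bigy = -sys.float_info.max
smallx = sys.float_info.max
smally = sys.float_info.max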
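The reader above leaves the file open until the program exits. A common alternative is to open the file in a with block so it is closed automatically. The sketch below assumes it is acceptable to read all rows into memory at once; read_rows is a hypothetical helper, and the demo writes a throwaway file so the example runs on its own.

```python
import csv
import os
import tempfile

def read_rows(path):
    """Return all rows of a CSV file as lists of strings (hypothetical helper)."""
    # The with statement closes the file even if reading raises an error.
    with open(path, newline="") as handle:
        return list(csv.reader(handle))

# Demo on a temporary file so the sketch does not depend on test.csv.
demo_path = os.path.join(tempfile.mkdtemp(), "test.csv")
with open(demo_path, "w", newline="") as handle:
    handle.write("1,2\n2,3\n3,8\n")

rows = read_rows(demo_path)
print(rows)
```

Note that the csv module returns every value as a string, so numeric columns still need to be converted before plotting.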
Step 2: Iterate Over Rows
Iterate over each row in the reader object, converting the x and y values from strings to floats and storing each row as a vertex in a list. In the same loop, compare the x and y values to track their upper and lower bounds. Sort the vertex list and then loop through it again, this time storing the sorted x and y values in separate lists:
verts = []
for row in csv_reader:
    # Convert once so sorting and plotting use numbers, not strings.
    x, y = float(row[0]), float(row[1])
    verts.append((x, y))
    if x > bigx:
        bigx = x
    if y > bigy:
        bigy = y
    if x < smallx:
        smallx = x
    if y < smally:
        smally = y
verts.sort()
x_arr = []
y_arr = []
for vert in verts:
    x_arr.append(vert[0])
    y_arr.append(vert[1])
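The same result can be reached more compactly with Python's built-in sorted, zip, min, and max functions. This is an alternative sketch rather than the article's method; it reads a few sample rows from an in-memory string (an assumption made so the example is self-contained) instead of from test.csv.

```python
import csv
import io

# A few rows of the sample data, held in memory for the demo.
sample = io.StringIO("1,2\n2,3\n3,8\n7.5,4\n2.5,4.3\n")

# Convert each row to a pair of floats, then sort by x (ties broken by y).
verts = sorted((float(x), float(y)) for x, y in csv.reader(sample))

# zip(*verts) transposes the list of pairs into one tuple per column.
x_arr, y_arr = zip(*verts)

# The bounds fall out of the built-in min and max functions.
smallx, bigx = min(x_arr), max(x_arr)
smally, bigy = min(y_arr), max(y_arr)
```

This version needs no starting values for the bounds, but it does assume the file contains at least one row.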
Step 3: Make a Figure Object
Create a Figure object using the imported matplotlib pyplot module. Add the graph's axes to the Figure by calling the add_axes method and passing it a list of values in the form: left, bottom, width, height. These values define where the graph is placed on the figure. They are fractions of the figure's width and height, so they range from 0.0 to 1.0:
fig = plt.figure()
ax = fig.add_axes([0.1, 0.1, 0.8, 0.8])
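To see how the four numbers map onto the figure, you can read the position back from the axes. The sketch below assumes an off-screen backend (Agg) so that it runs without opening a window; that backend choice is for the demo only.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: off-screen rendering, no display required
import matplotlib.pyplot as plt

fig = plt.figure()
# [left, bottom, width, height], each a fraction of the figure size.
ax = fig.add_axes([0.1, 0.1, 0.8, 0.8])

# get_position() returns the axes rectangle in the same fractional units.
left, bottom, width, height = ax.get_position().bounds
print("right edge at", left + width, "top edge at", bottom + height)
plt.close(fig)
```

With these values the graph leaves a margin of one tenth of the figure on every side, which is where the axis labels and tick labels are drawn.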
Step 4: Format the Graph
Format the graph by adding labels and defining the minimum and maximum values for each axis:
ax.set_xlabel('x data')
ax.set_ylabel('y data')
ax.set_xlim(smallx, bigx)
ax.set_ylim(smally, bigy)
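Setting the limits exactly to the data bounds places the extreme points on the edge of the graph. If you prefer some breathing room, one option is to widen each range slightly before passing it to set_xlim and set_ylim. The helper below is hypothetical, and its 5% default is an arbitrary choice.

```python
def padded_limits(lo, hi, fraction=0.05):
    """Widen the interval [lo, hi] by `fraction` of its span on each side.

    Hypothetical helper; the 5% default is an arbitrary choice.
    """
    pad = (hi - lo) * fraction
    return lo - pad, hi + pad

# Bounds of the sample data in the test file: x in [1, 7.5], y in [2, 21].
x_limits = padded_limits(1.0, 7.5)
y_limits = padded_limits(2.0, 21.0)
print(x_limits, y_limits)
```

The results could then be applied with ax.set_xlim(*x_limits) and ax.set_ylim(*y_limits).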
Step 5: Plot the Graph
Plot the graph by passing in the two lists containing the x and y values retrieved from the CSV file. Customize the line plot by passing in optional values such as line color (color) or line width (lw). Store the image by calling savefig to create an image file on disk, then display the finished graph by calling the show method to open a window. Saving first is the safer order, because show blocks until the window is closed:
ax.plot(x_arr, y_arr, color='blue', lw=2)
fig.savefig('test.png')
plt.show()
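Putting the steps together, a condensed version of the whole script might look like the sketch below. It makes several assumptions for the sake of a self-contained demo: the data comes from an in-memory string rather than test.csv, the Agg backend renders off-screen instead of opening a window, and the image is written to a temporary folder.

```python
import csv
import io
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # assumption: render off-screen; omit to use plt.show()
import matplotlib.pyplot as plt

# Sample data from the test file, held in memory for the demo.
data = io.StringIO("1,2\n2,3\n3,8\n4,13\n5,18\n6,21\n7,13\n7.5,4\n2.5,4.3\n")
verts = sorted((float(x), float(y)) for x, y in csv.reader(data))
x_arr, y_arr = zip(*verts)

fig = plt.figure()
ax = fig.add_axes([0.1, 0.1, 0.8, 0.8])
ax.set_xlabel('x data')
ax.set_ylabel('y data')
ax.set_xlim(min(x_arr), max(x_arr))
ax.set_ylim(min(y_arr), max(y_arr))
line, = ax.plot(x_arr, y_arr, color='blue', lw=2)

# Save to a throwaway folder; replace with 'test.png' for real use.
out_path = os.path.join(tempfile.mkdtemp(), "test.png")
fig.savefig(out_path)
plt.close(fig)
print("saved", out_path)
```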
Important Considerations for Files
To create files that the Python interpreter can read, you must use a plain-text or code editor that saves text-only files. You can store graph images in many different formats, including png, pdf, ps, and svg.
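The output format is normally chosen from the file extension passed to savefig. The sketch below saves one small figure in each of the four formats named above; the Agg backend, the sample points, and the temporary output folder are assumptions made so the example runs anywhere.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # assumption: off-screen rendering, no display required
import matplotlib.pyplot as plt

fig = plt.figure()
ax = fig.add_axes([0.1, 0.1, 0.8, 0.8])
ax.plot([1, 2, 3], [2, 3, 8], color='blue', lw=2)

out_dir = tempfile.mkdtemp()  # throwaway folder for the demo
saved = []
for ext in ("png", "pdf", "ps", "svg"):
    path = os.path.join(out_dir, "test." + ext)
    fig.savefig(path)  # the format is inferred from the extension
    saved.append(path)
plt.close(fig)
print(saved)
```

The png format produces a bitmap, while pdf, ps, and svg are vector formats that scale cleanly for print.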
Consult the Matplotlib Documentation
Some aspects of the matplotlib library installation and functionality vary on different computer platforms. Read the documentation carefully. The library can display numerical information in a vast number of ways and can be finely customized. A thorough reading of the documentation will be necessary to become proficient.