All about plotting data.

Gnuplot

TODO...

xmgrace

TODO...

matplotlib and pyplot

With the Python module matplotlib.pyplot, one can create amazing graphics. Often the syntax is pretty straightforward. Sometimes it is not, and your code may become messy. This section is intended to help you avoid that.

Basic examples

Let's first look at some basic examples. Suppose that you want to plot the graph of a function $ f : [0,1] \rightarrow \mathbb{R} $, which has been defined in your Python code:
import matplotlib.pyplot as plt
import numpy as np
 
def f(x): return x**2 - x + 1 ## insert your function here...
 
xs = np.linspace(0,1,1000) ## use 1000 points for the plot
ys = [f(x) for x in xs] ## compute f(x) for all x in xs (a list; in Python 3, map() only returns an iterator)
 
fig = plt.figure(figsize=[10,10]) ## make a figure
ax = fig.add_subplot(1,1,1) ## and add a (sub) plot
 
ax.plot(xs,ys) ## the actual plotting function
 
fig.savefig("example.png") ## save the figure

Scatter plot

A basic example of a scatter plot
import matplotlib.pyplot as plt
import numpy as np
import numpy.random
 
def addError(x): return x + np.random.normal(0,0.1)
 
xs = np.random.normal(0,1,1000) ## 1000 Normal(0,1) deviates
ys = [addError(x) for x in xs] ## add an error to xs (a list; in Python 3, map() only returns an iterator)
 
fig = plt.figure(figsize=[10,10]) ## make a figure
ax = fig.add_subplot(1,1,1) ## and add a (sub) plot
 
ax.scatter(xs,ys) ## make a scatter plot, using xs and ys
 
fig.savefig("scatter-example.png") ## save the figure


Advanced stuff


Blob plots

blobplot.png

A violin plot is a way to represent how the distribution of a variable changes when a parameter changes. Violin plots are clearer than several histograms on the same plot.
However, violin plots as implemented in Python smooth your distribution, making it difficult to see the details (and they can introduce numerical errors under some conditions).
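
To make the smoothing visible, here is a minimal sketch that draws the same sample once with matplotlib's violinplot and once as a plain histogram. The data (two narrow peaks), the seed, and the number of bins are made up for illustration, and the Agg backend is selected only so that the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend (an assumption, so no display is needed)
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, only for reproducibility
## made-up data: a sample with two narrow peaks
data = np.concatenate([rng.normal(-1, 0.1, 500), rng.normal(1, 0.1, 500)])

fig, (ax_violin, ax_hist) = plt.subplots(1, 2, figsize=(8, 4), sharey=True)

## violin plot: the outline is a kernel density estimate, so it is smoothed
parts = ax_violin.violinplot([data], positions=[0], showmedians=True)
ax_violin.set_title("violinplot (smoothed)")

## horizontal histogram of the same data: the raw bin counts
counts, edges, _ = ax_hist.hist(data, bins=50, orientation="horizontal")
ax_hist.set_title("histogram (as is)")
```

In the left panel the two peaks should look considerably wider than in the histogram on the right, which is the loss of detail described above.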

Blobplot is a workaround, adapted from violin plots, that plots the distribution "as is":
import math
import matplotlib as mpl
import matplotlib.pyplot as plt
import numpy as np
 
# MAKE UP SOME DATA
list_positions=[0,1,2,3,4]    # this contains the positions of your blobplots on the x axis
 
data_for_blobplot=[]            # this is for the data
for pos in list_positions:
  if pos <= 2: some_data = np.random.gumbel(pos,1, 1000)         # generate some data
  else: some_data = np.random.normal(pos,1, 1000)            # generate some other data
  data_for_blobplot.append(some_data)                    # and append it to the list
 
# PLOTTING
color_alpha_blobs=('midnightblue',0.7)            # this is the color of the blobplot
color_mean_or_median='orangered'            # this is the color of the median or mean, whichever you specify
extra='median'                                          # you'll have the median printed on top of the blobplot, alternatively, choose 'mean'
 
fig = plt.figure()                    # makes a figure
ax = fig.add_subplot(111)                # adds a single subplot
 
blob_plot(ax, data_for_blobplot, list_pos=list_positions, extra=extra)    #makes blob plots, if extra is not specified, makes no median/mean
 
ax.set_xlim(-1., len(list_positions)) #to display all of them
ax.set_ylim(-2., 12.)
plt.title('Beautiful blob plots',fontsize=18)
plt.xlabel(r'$\mu$ (yup, that is LaTeX)', fontsize=14)    # raw string, so that \m is not treated as an escape sequence
plt.ylabel('distribution of my variable', fontsize=14)
 
plt.savefig('blobplot.png')     # saving to png, alternatively you can write 'blobplot.pdf', or other formats
plt.show()                        #prints to screen your splendid blobplot

This is the source code that produces the example above, together with the blob_plot function:
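
The blob_plot source itself is not reproduced on this page. As a stand-in, here is a minimal sketch of what such a function could look like, assuming that a "blob" is simply an unsmoothed histogram mirrored around its x position. Only the call signature (ax, data, list_pos, extra) is taken from the example above; the body and the keyword arguments bins, max_width, color, alpha and extra_color are hypothetical, with defaults borrowed from the colors used in the example.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend (an assumption, so no display is needed)
import matplotlib.pyplot as plt
import numpy as np

def blob_plot(ax, data, list_pos, extra=None, bins=40, max_width=0.4,
              color="midnightblue", alpha=0.7, extra_color="orangered"):
    """Hypothetical sketch: draw each sample as a mirrored, unsmoothed histogram."""
    for d, pos in zip(data, list_pos):
        counts, edges = np.histogram(d, bins=bins)
        centers = 0.5 * (edges[:-1] + edges[1:])
        heights = np.diff(edges)
        ## scale the fullest bin to max_width on each side of pos
        widths = max_width * counts / counts.max()
        ## one horizontal bar per bin, centered on pos
        ax.barh(centers, 2 * widths, height=heights, left=pos - widths,
                color=color, alpha=alpha, linewidth=0)
        ## optionally mark the median or the mean
        if extra == "median":
            ax.plot([pos], [np.median(d)], "o", color=extra_color)
        elif extra == "mean":
            ax.plot([pos], [np.mean(d)], "o", color=extra_color)

## usage with made-up data
rng = np.random.default_rng(1)
positions = [0, 1, 2]
samples = [rng.normal(p, 1, 1000) for p in positions]
fig, ax = plt.subplots()
blob_plot(ax, samples, list_pos=positions, extra="median")
```

Because the bars are raw bin counts, no kernel smoothing is involved; the price is that the number of bins has to be chosen by hand.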


Customizing subplots with GridSpec


The module gridspec can be used for managing the relative sizes of the sub-plots of a figure. Let's have a look at an example.
We will take a 3D random sample, and look at it from 3 directions. Such a plot could for instance be interesting when you want to visualize the first 3 principal components during a PCA.
import matplotlib.pyplot as plt
import matplotlib.gridspec as gridspec
import numpy as np
from scipy.stats import multivariate_normal as mnormal
 
## make a new figure
fig = plt.figure(figsize=(7.5, 7.5))
## define a 3 by 3 "gridspec"
gs = gridspec.GridSpec(3, 3)
 
## the ratios of the subplots will be 2:1 (hence the 3x3 gridspec)
ax1 = fig.add_subplot(gs[0:2,0:2])
ax2 = fig.add_subplot(gs[0:2,2], sharey=ax1)
ax3 = fig.add_subplot(gs[2, 0:2], sharex=ax1)
 
## take a 3D gaussian sample, and plot it from 3 directions
N = 1000
samples = [mnormal.rvs(cov=np.diag([1,1,0.25]))
           for _ in range(N)]
xs = [sample[0] for sample in samples]
ys = [sample[1] for sample in samples]
zs = [sample[2] for sample in samples]
 
ax1.scatter(xs, ys, 5, color='k')
ax2.scatter(zs, ys, 5, color='k')
ax3.scatter(xs, zs, 5, color='k')
 
## remove some tick labels
plt.setp(ax2.get_yticklabels(), visible=False)
plt.setp(ax1.get_xticklabels(), visible=False)
ax2.set_xticks(ax2.get_xticks()[::2])
ax3.set_yticks(ax3.get_yticks()[::2])
 
## save the figure
fig.savefig("gridspec-example.png", dpi=100)
This is the resulting figure:

gridspec-example.png
Example using gridspec
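
As a possible alternative to the 3 by 3 grid, GridSpec also accepts the keyword arguments width_ratios and height_ratios. The sketch below sets up a similar 2:1 layout on a 2 by 2 grid; it only creates the axes, and the spacing between the panels may differ slightly from the figure above. The Agg backend is selected only so that the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend (an assumption, so no display is needed)
import matplotlib.pyplot as plt
import matplotlib.gridspec as gridspec

fig = plt.figure(figsize=(7.5, 7.5))
## a 2 by 2 grid: the first column and the first row are twice as large as the second
gs = gridspec.GridSpec(2, 2, width_ratios=[2, 1], height_ratios=[2, 1])

ax1 = fig.add_subplot(gs[0, 0])
ax2 = fig.add_subplot(gs[0, 1], sharey=ax1)
ax3 = fig.add_subplot(gs[1, 0], sharex=ax1)
```

The slicing approach used above remains handy when a panel has to span an irregular block of cells; the ratio arguments are shorter when every row and column just needs a relative size.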

Miscellaneous tips and tricks


## Saved figures can have large margins, and labels can sometimes be placed outside the figure.
## This can be prevented by specifying bbox_inches='tight'.
## This causes the margins to be smaller than normal, but they enclose everything:
 
fig.savefig("my_plot.png", dpi=300, bbox_inches='tight')