In [1]:
# Author: Volker Hoffmann <volker@cheleb.net>, <volker@physik.uzh.ch>
# Date: 02 December 2014
In [2]:
%matplotlib inline
In [3]:
import matplotlib as mpl; mpl.rcParams['savefig.dpi'] = 144
import matplotlib.pyplot as plt
import numpy as np

Genga (Text, HDF5, Plotting)

In this notebook, we'll explore how to load raw data from text and HDF5 files. We'll also cover some elementary plotting, and indexing by boolean arrays.

Loading Text Files

To load text files, we can use the loadtxt() function provided by Numpy. This loads all data into an array of floats. If you have a mix of types, you may need a more customized approach. For exampls, Pandas has a great function called read_csv().

In [4]:
# Load Files
basedir = "/zbox/data/volker/Debris/Runs/Chaos-42/gas_01/run_01"
M = np.loadtxt("%s/Out_run_01_000100000000.dat" % basedir)
In [5]:
# What's In It?
print M.shape
print M[::10,1]
(562, 21)
[    0.    10.  1517.    30.   996.    50.    60.   896.  1439.  1548.
  1145.   110.  1689.  1212.   140.   955.  1711.   580.  1405.   190.
   200.   626.   220.   230.   240.   751.   662.   270.  1057.   939.
   300.  1139.   320.  1541.   340.   696.   360.   892.   400.   390.
  1343.   410.  1070.   760.   440.   450.   460.   470.   480.   490.
   500.   510.  1826.  1205.   984.  1480.   581.]

In [6]:
# Extract Some Columns
t = M[:,0]
i = M[:,1].astype("int")
m = M[:,2]
x = M[:,4]
y = M[:,5]

Basic Plotting

Plotting is straightforward. After creating figure and axis handles (fig, ax), we set a fixed aspect ratio, and then simply make a scatterplot with the scatter() function in Matplotlib. There are many other functions, of which plot(), and bar() are the most useful. Colors, axes labels, and limits are also easy to set. We can also set an alpha value to make points slightly more opaque.

In [7]:
# Prep Figure
fig, ax = plt.subplots(1,1)
ax.set_aspect("equal")

# Plot
ax.scatter(x, y, s=4**2, alpha=0.5)

# Annotate
ax.set_xlabel("X (AU)")
ax.set_ylabel("Y (AU)")

# Set Range
# ax.set_xlim([-5,5])
# ax.set_ylim([-5,5])

pass

Indexing with Logical Arrays

To access parts of an array, we can use boolean indexing. Performing something like x>20 will return a logical array of the same shape of x, with entries either True or False depending on whether the conditions x>20 is fulfilled at this particular index. Using this logical array to index the array (i.e., x[x>20]) then returns only the values where x>20 is True. We can also use the logical array to index any other array.

In [8]:
print i[i<20]
print x[i<20]
[ 0  3  4  8  9 10 12 13 16 17 18 19  6 15  5]
[ 1.50610511  0.34926058 -1.13896797  1.31454613 -2.18372473  1.04929319
  2.77280806  0.29991058  1.99492047  2.49376875 -1.86795788  1.83181432
 -2.36407213 -0.00689411 -0.3143826 ]

Slightly Fancier Plotting

Let's use logical indexing to make a scatterplot of particles with a certain ID. We also adjust the marker used for the points, the size of the points, and also set the color of the marker to blue (the default).

In [9]:
# Jupiter and Saturn (ID 2000, 2001)

# Prep Figure
fig, ax = plt.subplots(1,1)
ax.set_aspect("equal")

# Plot
# We can pass an array of the same shape as x and y to s=XXX to specify marker sizes.
# You can do the same with the colors, btw.
ax.scatter(x[i<2000], y[i<2000], \
           s=(10*m[i<2000]/np.max(m[i<2000]))**2, \
           marker="*", \
           color="b", edgecolor="b", \
           alpha=0.5)

# Annotate
ax.set_xlabel("X (AU)")
ax.set_ylabel("Y (AU)")
ax.set_xlim([-5,5])
ax.set_ylim([-5,5])
# ax.legend(loc="best", fontsize="small", frameon=False)

pass

Loading HDF5 Files, Even Prettier Plots

HDF5 is a hierarchical file format. It is build around the idea of datasets. Datasets can be nested and have attributes that describe them. In Python, we can use the package h5py to interact with HDF5 files. For more, information, have a look at http://www.h5py.org. We also use the Colorbrewer interface (brewer2mpl) to Python, which we will explore in the future.

In [10]:
import h5py
import os
import brewer2mpl as b2m
In [11]:
# Load Color Scheme
c3 = b2m.get_map('Dark2', 'Qualitative', 3)

# Load HDF5 File.
# File handle is open only within the "with" construct. Very handy!
basedir = "/zbox/data/volker/Impacts/Nice1/Kuiper_NoColl"
with h5py.File("%s/Snapshot_000347000000.hdf5" % basedir, "r") as f5:
    
    # Prepare Canvas
    fig, ax = plt.subplots(1,1)
    ax.set_aspect("equal")

    # Ellipses, Inner Planets
    for ii in [ 0, 1, 2, 3 ]:
        xell = f5["ellipses/x"][ii,:]
        yell = f5["ellipses/y"][ii,:]
        ax.plot(xell, yell, lw=1.0, alpha=0.6, c=c3.mpl_colors[0])

    # Ellipses, Outer Planets
    for ii in [ 4, 5, 6, 7, 8]:
        xell = f5["ellipses/x"][ii,:]
        yell = f5["ellipses/y"][ii,:]
        ax.plot(xell, yell, lw=1.0, alpha=0.6, c=c3.mpl_colors[1])

    # Kuiper Belt
    kuiper_ids = ~np.in1d(f5["particles/pid"][()], \
                          np.array([0,1,2,3,4,5,6,7,8]))
    x = f5["particles/x"][kuiper_ids]
    y = f5["particles/y"][kuiper_ids]
    ax.scatter(x, y, \
               s=1, alpha=1.0, \
               edgecolor=c3.mpl_colors[2], \
               color=c3.mpl_colors[2])

    # Timer
    ax.text(0.03, 0.92, r"%.2e yr" % f5.attrs["tout"], \
            horizontalalignment='left', color='black', size='small', \
            transform=ax.transAxes)

    # Title
    tag = os.getcwd()
    ax.set_title("%s" % tag, size='small')

    # Limits, Etc
    ax.set_xlim([-50,50])
    ax.set_ylim([-50,50])
    ax.set_xlabel('X (AU)')
    ax.set_ylabel('Y (AU)')

Let's have a glimpse at the HDF5 file structure. Again, have a look at the official documentation at http://docs.h5py.org/en/2.3/ for a more comprehensive overview.

In [12]:
basedir = "/zbox/data/volker/Impacts/Nice1/Kuiper_NoColl"
with h5py.File("%s/Snapshot_000347000000.hdf5" % basedir, "r") as f5:
    # Inspect File
    print f5
    print f5.keys()
    print f5["particles/x"]
    print f5.attrs.keys()
    print f5["particles/x"].attrs.keys()
    print f5["particles/x"].attrs["units"]
    print f5["ellipses/x"].attrs.keys()
<HDF5 file "Snapshot_000347000000.hdf5" (mode r)>
[u'ellipses', u'particles']
<HDF5 dataset "x": shape (1804,), type "<f8">
[u'tout', u'tout_units', u'run_name', u'coordinate_frame']
[u'units']
AU
[u'units']