In [1]:
# Author: Volker Hoffmann <volker@cheleb.net>, <volker@physik.uzh.ch>
# Date: 02 December 2014

In [2]:
%matplotlib inline

In [3]:
import matplotlib as mpl; mpl.rcParams['savefig.dpi'] = 144
import matplotlib.pyplot as plt
import numpy as np


# Genga (Text, HDF5, Plotting)

In this notebook, we'll explore how to load raw data from text and HDF5 files. We'll also cover some elementary plotting, and indexing by boolean arrays.

To load text files, we can use the loadtxt() function provided by Numpy. This loads all data into an array of floats. If you have a mix of types, you may need a more customized approach. For exampls, Pandas has a great function called read_csv().

In [4]:
# Load Files
basedir = "/zbox/data/volker/Debris/Runs/Chaos-42/gas_01/run_01"

In [5]:
# What's In It?
print M.shape
print M[::10,1]

(562, 21)
[    0.    10.  1517.    30.   996.    50.    60.   896.  1439.  1548.
1145.   110.  1689.  1212.   140.   955.  1711.   580.  1405.   190.
200.   626.   220.   230.   240.   751.   662.   270.  1057.   939.
300.  1139.   320.  1541.   340.   696.   360.   892.   400.   390.
1343.   410.  1070.   760.   440.   450.   460.   470.   480.   490.
500.   510.  1826.  1205.   984.  1480.   581.]


In [6]:
# Extract Some Columns
t = M[:,0]
i = M[:,1].astype("int")
m = M[:,2]
x = M[:,4]
y = M[:,5]


## Basic Plotting¶

Plotting is straightforward. After creating figure and axis handles (fig, ax), we set a fixed aspect ratio, and then simply make a scatterplot with the scatter() function in Matplotlib. There are many other functions, of which plot(), and bar() are the most useful. Colors, axes labels, and limits are also easy to set. We can also set an alpha value to make points slightly more opaque.

In [7]:
# Prep Figure
fig, ax = plt.subplots(1,1)
ax.set_aspect("equal")

# Plot
ax.scatter(x, y, s=4**2, alpha=0.5)

# Annotate
ax.set_xlabel("X (AU)")
ax.set_ylabel("Y (AU)")

# Set Range
# ax.set_xlim([-5,5])
# ax.set_ylim([-5,5])

pass


## Indexing with Logical Arrays¶

To access parts of an array, we can use boolean indexing. Performing something like x>20 will return a logical array of the same shape of x, with entries either True or False depending on whether the conditions x>20 is fulfilled at this particular index. Using this logical array to index the array (i.e., x[x>20]) then returns only the values where x>20 is True. We can also use the logical array to index any other array.

In [8]:
print i[i<20]
print x[i<20]

[ 0  3  4  8  9 10 12 13 16 17 18 19  6 15  5]
[ 1.50610511  0.34926058 -1.13896797  1.31454613 -2.18372473  1.04929319
2.77280806  0.29991058  1.99492047  2.49376875 -1.86795788  1.83181432
-2.36407213 -0.00689411 -0.3143826 ]



## Slightly Fancier Plotting¶

Let's use logical indexing to make a scatterplot of particles with a certain ID. We also adjust the marker used for the points, the size of the points, and also set the color of the marker to blue (the default).

In [9]:
# Jupiter and Saturn (ID 2000, 2001)

# Prep Figure
fig, ax = plt.subplots(1,1)
ax.set_aspect("equal")

# Plot
# We can pass an array of the same shape as x and y to s=XXX to specify marker sizes.
# You can do the same with the colors, btw.
ax.scatter(x[i<2000], y[i<2000], \
s=(10*m[i<2000]/np.max(m[i<2000]))**2, \
marker="*", \
color="b", edgecolor="b", \
alpha=0.5)

# Annotate
ax.set_xlabel("X (AU)")
ax.set_ylabel("Y (AU)")
ax.set_xlim([-5,5])
ax.set_ylim([-5,5])
# ax.legend(loc="best", fontsize="small", frameon=False)

pass


HDF5 is a hierarchical file format. It is build around the idea of datasets. Datasets can be nested and have attributes that describe them. In Python, we can use the package h5py to interact with HDF5 files. For more, information, have a look at http://www.h5py.org. We also use the Colorbrewer interface (brewer2mpl) to Python, which we will explore in the future.

In [10]:
import h5py
import os
import brewer2mpl as b2m

In [11]:
# Load Color Scheme
c3 = b2m.get_map('Dark2', 'Qualitative', 3)

# File handle is open only within the "with" construct. Very handy!
basedir = "/zbox/data/volker/Impacts/Nice1/Kuiper_NoColl"
with h5py.File("%s/Snapshot_000347000000.hdf5" % basedir, "r") as f5:

# Prepare Canvas
fig, ax = plt.subplots(1,1)
ax.set_aspect("equal")

# Ellipses, Inner Planets
for ii in [ 0, 1, 2, 3 ]:
xell = f5["ellipses/x"][ii,:]
yell = f5["ellipses/y"][ii,:]
ax.plot(xell, yell, lw=1.0, alpha=0.6, c=c3.mpl_colors[0])

# Ellipses, Outer Planets
for ii in [ 4, 5, 6, 7, 8]:
xell = f5["ellipses/x"][ii,:]
yell = f5["ellipses/y"][ii,:]
ax.plot(xell, yell, lw=1.0, alpha=0.6, c=c3.mpl_colors[1])

# Kuiper Belt
kuiper_ids = ~np.in1d(f5["particles/pid"][()], \
np.array([0,1,2,3,4,5,6,7,8]))
x = f5["particles/x"][kuiper_ids]
y = f5["particles/y"][kuiper_ids]
ax.scatter(x, y, \
s=1, alpha=1.0, \
edgecolor=c3.mpl_colors[2], \
color=c3.mpl_colors[2])

# Timer
ax.text(0.03, 0.92, r"%.2e yr" % f5.attrs["tout"], \
horizontalalignment='left', color='black', size='small', \
transform=ax.transAxes)

# Title
tag = os.getcwd()
ax.set_title("%s" % tag, size='small')

# Limits, Etc
ax.set_xlim([-50,50])
ax.set_ylim([-50,50])
ax.set_xlabel('X (AU)')
ax.set_ylabel('Y (AU)')


Let's have a glimpse at the HDF5 file structure. Again, have a look at the official documentation at http://docs.h5py.org/en/2.3/ for a more comprehensive overview.

In [12]:
basedir = "/zbox/data/volker/Impacts/Nice1/Kuiper_NoColl"
with h5py.File("%s/Snapshot_000347000000.hdf5" % basedir, "r") as f5:
# Inspect File
print f5
print f5.keys()
print f5["particles/x"]
print f5.attrs.keys()
print f5["particles/x"].attrs.keys()
print f5["particles/x"].attrs["units"]
print f5["ellipses/x"].attrs.keys()

<HDF5 file "Snapshot_000347000000.hdf5" (mode r)>
[u'ellipses', u'particles']
<HDF5 dataset "x": shape (1804,), type "<f8">
[u'tout', u'tout_units', u'run_name', u'coordinate_frame']
[u'units']
AU
[u'units']