Dataframe Storage Mini-Benchmark

27 Dec 2019

A quick benchmark of ways to write, store, and read a Pandas dataframe when you don’t want to parse it from CSV every time. Conclusion:

If you only want to load some of the columns, use Parquet or Feather. HDF5 as Pandas writes it by default (the "fixed" format) cannot do this: the stored dataframe has to be read in its entirety.

Figures:

File Size Comparison

Read Speed Comparison

Write Speed Comparison

Code: