Python/Miscellaneous Snippets and Tips

From PHASTA Wiki
Revision as of 11:35, 9 August 2021 by Jrwrigh (talk | contribs) (Add writing data to rows)

Here are miscellaneous tips and tricks when working with Python files.

Write Data to Text File (e.g. CSV)

Given data in some Python object (most likely a NumPy array, but possibly just a plain Python list), how do you write it out to a file? Use numpy.savetxt (usually written np.savetxt, following the conventional import numpy as np).

Example: Given an array, A, of shape [n, m], simply use

np.savetxt('path/file.dat', A)

which creates a file with n rows and m columns.
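
As a minimal, self-contained sketch of the call above: the array contents and the temporary file location are made up for illustration, and the file is read back with np.loadtxt only to confirm the row and column layout.

```python
import os
import tempfile

import numpy as np

# Hypothetical example data: n = 3 rows, m = 2 columns
A = np.arange(6, dtype=float).reshape(3, 2)

# Write into a temporary directory so the example leaves the working directory untouched
path = os.path.join(tempfile.mkdtemp(), "file.dat")
np.savetxt(path, A)

# Count the lines in the file and read the data back to check the layout
with open(path) as f:
    n_lines = len(f.readlines())
A_read = np.loadtxt(path)

print(n_lines)       # one line per row of A
print(A_read.shape)  # same shape as A
```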

NumPy's documentation describes other useful arguments for changing the numerical format (fmt), changing the separator (delimiter), and adding a header to the file (header).
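
The sketch below shows how those optional arguments might be combined to produce a CSV file. The data, column names, and file location are hypothetical; fmt, delimiter, header, and comments are keyword arguments of np.savetxt.

```python
import os
import tempfile

import numpy as np

# Hypothetical example data with two columns
A = np.array([[1.0, 2.5], [3.0, 4.25]])
path = os.path.join(tempfile.mkdtemp(), "file.csv")

np.savetxt(
    path,
    A,
    fmt="%.3f",     # fixed-point with three decimal places
    delimiter=",",  # comma-separated instead of the default space
    header="x,y",   # written as the first line of the file
    comments="",    # header lines are prefixed with "# " by default; disable that here
)

with open(path) as f:
    lines = f.read().splitlines()

for line in lines:
    print(line)
```

Leaving comments at its default keeps the "# " prefix on the header line, which np.loadtxt then skips as a comment when reading the file back.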

Write multiple 1D arrays as columns

To do this, use numpy.column_stack to create an array with the columns "stacked" together.

Example: Given two 1D arrays, a and b, of the same size, use:

np.savetxt('path/file.dat', np.column_stack((a,b)) )

Two things to note here:

  1. np.column_stack takes a single list or tuple of arrays as its argument, hence the double parentheses ((...)).
  2. np.column_stack creates an entirely new array and copies the given data into it. As such, it doubles the total amount of memory used: once for the original 1D arrays, and again for the new array storing a copy of the original data.
    • If the data format is flexible, consider writing in rows instead of columns, as it is faster (~20%, since it skips the explicit np.column_stack copy) and may use less memory.
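
To make both notes concrete, here is a small sketch with made-up arrays a and b and a temporary file location. It checks that the stacked array is a separate copy and shows the resulting column layout in the file; fmt="%g" is used only to keep the output short.

```python
import os
import tempfile

import numpy as np

# Hypothetical 1D arrays of the same size
a = np.array([1.0, 2.0, 3.0])
b = np.array([10.0, 20.0, 30.0])

# Note the double parentheses: one tuple argument holding both arrays
stacked = np.column_stack((a, b))

# The stacked array has one column per input and does not share memory with them
shape = stacked.shape
shares = np.shares_memory(stacked, a)

path = os.path.join(tempfile.mkdtemp(), "file.dat")
np.savetxt(path, stacked, fmt="%g")

with open(path) as f:
    lines = f.read().splitlines()

print(shape, shares)
for line in lines:
    print(line)
```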

Write multiple 1D arrays as rows

np.savetxt also accepts any 2D array-like input. This means you can pass a list or tuple of 1D arrays, and each array is written as one row.

Example: Given two 1D arrays, a and b, of the same size, use:

np.savetxt('path/file.dat', (a,b) )

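Note that we do not need to call np.column_stack here, which avoids the explicit stacking step. np.savetxt still converts its input to an array internally (via np.asarray), so treat the time and memory savings as approximate and measure them for your own data.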
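
A short sketch of the row layout, again with made-up arrays and a temporary file location. It also shows one way to recover the original arrays, since np.loadtxt returns a 2D array whose rows can be unpacked directly.

```python
import os
import tempfile

import numpy as np

# Hypothetical 1D arrays of the same size
a = np.array([1.0, 2.0, 3.0])
b = np.array([10.0, 20.0, 30.0])

path = os.path.join(tempfile.mkdtemp(), "file.dat")

# Pass the arrays as a tuple; each one becomes a row in the file
np.savetxt(path, (a, b), fmt="%g")

with open(path) as f:
    lines = f.read().splitlines()

# Reading back gives a (2, 3) array, so unpacking yields one array per row
a_read, b_read = np.loadtxt(path)

for line in lines:
    print(line)
```

The trade-off is that the file has one long line per array, which can be awkward to inspect by eye or to load into tools that expect one record per line.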