Difference between revisions of "Python/Miscellaneous Snippets and Tips"

From PHASTA Wiki
(Add writing data to rows)

Revision as of 11:35, 9 August 2021

Here are miscellaneous tips and tricks when working with Python files.

Write Data to Text File (e.g. CSV)

Given data in some Python object (most likely a numpy-derived array, but possibly just a normal Python list), how do you write it out to a file? Use numpy.savetxt (usually written np.savetxt, following the conventional import numpy as np).

Example: Given an array, A, of shape (n, m), use

np.savetxt('path/file.dat', A)

which creates a file with n rows and m columns.

NumPy's documentation describes other useful arguments, such as fmt to change the numerical format, delimiter to change the separator, and header to add a header to the file.
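As a rough sketch of those arguments in use, the snippet below writes a small array as CSV with a header line. The array contents, the column names x and y, and the temporary output path are made up for illustration.

```python
import os
import tempfile

import numpy as np

# Illustrative data: 3 rows, 2 columns.
A = np.array([[1.0, 2.5], [3.0, 4.5], [5.0, 6.5]])

# Hypothetical output location in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "file.csv")

np.savetxt(
    path,
    A,
    fmt="%.3f",     # three decimal places for every value
    delimiter=",",  # comma-separated instead of the default single space
    header="x,y",   # written as the first line of the file
    comments="",    # drop the default "# " prefix in front of the header
)

# Read the file back as text to inspect what was written.
with open(path) as f:
    lines = f.read().splitlines()

print(lines)
```

Without comments="", the header line would start with "# ", which np.loadtxt skips by default when reading the file back.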

Write multiple 1D arrays as columns

To do this, use numpy.column_stack to create an array with the columns "stacked" together.

Example: Given two 1D arrays, a and b, of the same size, use:

np.savetxt('path/file.dat', np.column_stack((a,b)) )

Two things to note here:

  1. np.column_stack takes a single list or tuple of arrays as its argument, hence the double parentheses ((...)).
  2. np.column_stack creates an entirely new array and copies the given data into it. As such, it roughly doubles the total amount of memory used: once for the original 1D arrays, and again for the new array storing a copy of the original data.
    • If the data format is flexible, consider writing in rows instead of columns, as it is faster (~20%) and avoids creating the stacked copy yourself.
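The two notes above can be checked with a short sketch. The array values and the temporary path are illustrative assumptions; np.shares_memory is used only to show that the stacked array is a separate copy.

```python
import os
import tempfile

import numpy as np

# Illustrative 1D arrays of the same size.
a = np.array([1.0, 2.0, 3.0])
b = np.array([10.0, 20.0, 30.0])

# Note the single tuple argument: column_stack((a, b)), not column_stack(a, b).
stacked = np.column_stack((a, b))

print(stacked.shape)                  # one row per element, one column per array
print(np.shares_memory(stacked, a))   # False: the data was copied

# Hypothetical output location in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "columns.dat")
np.savetxt(path, stacked)

# Reading the file back recovers the same 2D layout.
loaded = np.loadtxt(path)
```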

Write multiple 1D arrays as rows

np.savetxt also accepts 2D-like input that is not already an array. This means you can pass a list or tuple of equally sized 1D arrays, and each array is written as one row (one line) of the file.

Example: Given two 1D arrays, a and b, of the same size, use:

np.savetxt('path/file.dat', (a,b) )

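Note that we do not need to invoke np.column_stack here, so our own code skips the stacking step and does not hold on to a second, stacked copy of the data. The resulting file has one line per array rather than one column per array.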
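A small sketch of the row layout, and of reading it back, is given here. The array values and the temporary path are illustrative assumptions.

```python
import os
import tempfile

import numpy as np

# Illustrative 1D arrays of the same size.
a = np.array([1.0, 2.0, 3.0])
b = np.array([10.0, 20.0, 30.0])

# Hypothetical output location in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "rows.dat")

# Each array in the tuple becomes one line of the file.
np.savetxt(path, (a, b))

with open(path) as f:
    n_lines = len(f.read().splitlines())
print(n_lines)  # one line per array

# np.loadtxt returns one row per line, so the arrays can be unpacked directly.
loaded = np.loadtxt(path)
a_back, b_back = loaded
```

If a downstream tool expects the column layout instead, loaded.T gives the transposed view after reading.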