Difference between revisions of "Python/Miscellaneous Snippets and Tips"
Latest revision as of 10:13, 18 September 2022
Here are miscellaneous tips and tricks when working with Python files.
Migrating from MATLAB
NumPy (the de facto numerical array library in Python) has a handy guide for migrating from MATLAB to NumPy: https://numpy.org/doc/stable/user/numpy-for-matlab-users.html
Here are some tips for general Python:
Indexing
Python indices start with 0 instead of 1
This has several cascading effects in the language, such as:
    >>> test = list(range(3))
    >>> print(test)  # Doesn't include 3, but creates 3 integers
    [0, 1, 2]
    >>> test[0:2]  # Excludes the last index, so only 2 integers are returned
    [0, 1]
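As a small illustrative sketch of those cascading effects (the variable names are made up for this example), the following shows how zero-based indexing and stop-exclusive slices fit together on a plain list:

```python
# Zero-based indexing and half-open slices, using only built-in lists.
items = list(range(5))          # [0, 1, 2, 3, 4]

first = items[0]                # index 0 is the first element
last = items[len(items) - 1]    # the last valid index is len(items) - 1
also_last = items[-1]           # negative indices count from the end

# A slice start:stop excludes stop, so it contains stop - start items.
middle = items[1:4]             # [1, 2, 3]

# Two slices that share a boundary split the list without overlap.
head, tail = items[:2], items[2:]
```

One practical consequence is that items[:k] and items[k:] always recombine into the original list, which is rarely true with inclusive, 1-based ranges.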
[...] vs (...) after an object
Say you have object[0] or object['foo'] or object('foo'). What can you tell about object without looking at its definition? Quite a lot, actually!
In general:
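- [...] gets a value from a Python object
  - Either by subscription (object[0] or object['asdf'], see https://docs.python.org/3/reference/expressions.html#subscriptions) or slicing (object[1:3], see https://docs.python.org/3/reference/expressions.html#slicings)
- (...) calls a function (or other callable) with the given arguments (like range(3)).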
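To make the distinction concrete, here is a minimal sketch of a toy class (Squares is a hypothetical name, not something from a library) that supports both forms. Python routes obj[...] to the __getitem__ method and obj(...) to the __call__ method:

```python
class Squares:
    """Toy object that supports both obj[...] and obj(...)."""

    def __getitem__(self, key):
        # Invoked for subscription (sq[3]) and for slicing (sq[1:4]).
        if isinstance(key, slice):
            return [i * i for i in range(*key.indices(10))]
        return key * key

    def __call__(self, value):
        # Invoked when the object is called like a function: sq(3).
        return f"called with {value}"


sq = Squares()
subscripted = sq[3]     # __getitem__ receives the integer 3
sliced = sq[1:4]        # __getitem__ receives slice(1, 4, None)
called = sq(3)          # __call__ receives the argument 3
```

Lists, dictionaries and NumPy arrays implement __getitem__ in the same way, which is why square brackets always read as "look something up" regardless of the object type.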
Examples of getting a value from a Python object:
    >>> lista = ['a', 'b', 3]
    >>> lista[0]  # Accessing item 0
    'a'
    >>> lista[1:]  # Accessing all items of index 1 or greater
    ['b', 3]

    >>> dicta = {'a': 2, 'b': 'foo'}
    >>> dicta['a']  # Accessing the value of the key 'a'
    2
    >>> dicta['b']  # Accessing the value of the key 'b'
    'foo'
Nested Python Objects and Operator Precedence
Python allows you to "chain" operations together, somewhat like piping in a POSIX shell (e.g. ls | grep file). If you see something like object.data['foo'](arg)[:,2:4], this is just successive operations chained together. Generally, Python evaluates each expression (i.e. a series of operations) left to right, subject to operator precedence. Expression evaluation order (https://docs.python.org/3/reference/expressions.html#evaluation-order) and operator precedence (https://docs.python.org/3/reference/expressions.html#operator-precedence) are described in more detail in the Python documentation, which includes a handy precedence table.
Based on operator precedence, the example expression above can be rewritten as:

      ( ( ( object.data ) ['foo'] ) (arg) ) [:,2:4]
    # 3 2 1             1         2       3 4

where the numbers on the comment line denote the order in which the operations are performed (each number sits under the parenthesis pair, or the final slice, that it labels). It could also be rewritten as:

    op1 = object.data
    op2 = op1['foo']
    op3 = op2(arg)
    op4 = op3[:,2:4]

In words, this expression:
1. (op1) References the data attribute of object
2. (op2) Subscripts op1 with the key 'foo', returning the object associated with that key
3. (op3) Calls the function object in op2 with argument arg
4. (op4) Slices the result of the function called in op3
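The expression in the text is schematic (object is a Python builtin with no data attribute), so here is a runnable sketch of the same chain under stated assumptions: holder and make_rows are hypothetical stand-ins, and a plain list slice [1:3] replaces the NumPy-style [:,2:4].

```python
from types import SimpleNamespace


def make_rows(n):
    """Hypothetical helper: return a list of n small rows."""
    return [[i, i + 1] for i in range(n)]


# 'holder' stands in for 'object' in the text (object is a Python builtin).
holder = SimpleNamespace(data={'foo': make_rows})

# Chained form: attribute reference, subscription, call, slice.
chained = holder.data['foo'](4)[1:3]

# Step-by-step form with intermediate variables.
op1 = holder.data       # attribute reference
op2 = op1['foo']        # subscription
op3 = op2(4)            # call
op4 = op3[1:3]          # slicing
```

Both forms produce the same value; the step-by-step version is mainly useful when debugging, since each intermediate result can be inspected.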
Nested Python Objects
Using operator precedence, we can access nested Python objects (such as lists or dictionaries) without having to use intermediate variables:

    >>> lista = [(1, 3, 'asdf'), 'b', 3]
    >>> lista[0]  # Accessing the 0th item (the tuple)
    (1, 3, 'asdf')
    >>> lista[0][2]  # Accessing an item of the nested tuple
    'asdf'

    >>> dicta = {'a': 2, 'b': 'foo', 'bar': {'sub': 'phasta'}}
    >>> dicta['bar']  # Accessing the value of 'bar', a nested dictionary
    {'sub': 'phasta'}
    >>> dicta['bar']['sub']  # Accessing nested dictionary items
    'phasta'

Write Data to Text File (ie. CSV)
Given data in some Python object (most likely a numpy-derived array, but possibly just a normal Python list), how do you write it out to a file? Use numpy.savetxt (or more likely np.savetxt).
Example: Given an array, A, of shape [n, m], simply use

    np.savetxt('path/file.dat', A)

which creates a file with n rows and m columns.
NumPy's documentation describes other useful arguments for changing the numerical format (fmt), changing the separator (delimiter), and adding a header to the file (header).
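As a hedged sketch of those extra arguments, the example below writes a small made-up array as comma-separated values with two decimal places and a header line. The array contents, file name and column names are assumptions chosen for illustration; fmt, delimiter and header are standard np.savetxt parameters.

```python
import os
import tempfile

import numpy as np

A = np.array([[1.0, 2.5], [3.0, 4.25]])   # shape (2, 2)

# Write to a temporary directory so the sketch leaves no files behind.
with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, 'file.csv')
    # fmt controls the number format, delimiter the column separator,
    # and header adds a first line (prefixed with '# ' by default).
    np.savetxt(path, A, fmt='%.2f', delimiter=',', header='x,y')
    with open(path) as f:
        text = f.read()

print(text)
```

The header is written as a comment line so that np.loadtxt skips it when reading the file back.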
Write multiple 1D arrays as columns
To do this, use numpy.column_stack to create an array with the columns "stacked" together.
Example: Given two 1D arrays, a and b, of the same size, use:

    np.savetxt('path/file.dat', np.column_stack((a, b)))

Two things to note here:
- np.column_stack takes a list or tuple as its argument, hence the doubled parentheses ((...)).
- np.column_stack creates an entirely new array and copies the given data into it. As such, it roughly doubles the total amount of memory used: once for the original 1D arrays, and again for the new array storing a copy of the original data.
  - If the data format is flexible, consider writing in rows instead of columns, as it is faster (~20%, since no time is spent copying data) and uses less memory.
Write multiple 1D arrays as rows
np.savetxt will also take 2D-like array input. This means you can pass a list/tuple of arrays and it will process each array as a row.
Example: Given two 1D arrays, a and b, of the same size, use:

    np.savetxt('path/file.dat', (a, b))

Note we do not need to invoke np.column_stack, and thus we don't spend time copying data or take up memory with redundant data.
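To compare the two layouts side by side, here is a minimal sketch that writes the same two made-up arrays once as columns and once as rows, then reads the text back. The array values and file names are assumptions for illustration only.

```python
import os
import tempfile

import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

with tempfile.TemporaryDirectory() as tmpdir:
    col_path = os.path.join(tmpdir, 'columns.dat')
    row_path = os.path.join(tmpdir, 'rows.dat')

    # Columns: one line per index, with a and b side by side.
    np.savetxt(col_path, np.column_stack((a, b)), fmt='%d')
    # Rows: one line per array.
    np.savetxt(row_path, (a, b), fmt='%d')

    with open(col_path) as f:
        as_columns = f.read()
    with open(row_path) as f:
        as_rows = f.read()
```

The column file has three lines of two values each, while the row file has two lines of three values each, so whoever reads the file needs to know which layout was used.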
Running Python Files in Terminal
There are a few ways to run Python files (and Python code more generally): via a Unix shebang, with python, inside an ipython instance, or through an IDE like Spyder. We'll go over how to run a script, script.py, in the terminal:

    ~$ cat script.py
    print('Hello World')

Note: I'll be assuming usage of Python 3.X. If using Python 2.7, use python instead of python3.
Unix Shebang
Using a Unix shebang (https://en.wikipedia.org/wiki/Shebang_(Unix)), you can execute the script directly in a Linux terminal, provided the file has execute permission (for example, granted with chmod +x script.py). This is done by adding #!/usr/bin/env python3 as the first line of the file. This line is not read by Python (# is the comment character), but by the (Linux) kernel, which uses the executable named in the line to run the file. Then you can execute the file directly:

    ~$ cat script.py
    #!/usr/bin/env python3
    print('Hello World')
    ~$ ./script.py
    Hello World

Adding a shebang does not prevent you from running the script with any of the other methods listed below; Python simply ignores the line because it is a comment.
Default Python Interpreter
You can execute script.py by passing the script as an argument to the python interpreter executable, much like you can do with bash, zsh, or perl:

    ~$ python3 script.py
    Hello World

IPython Console
The IPython Console is a very powerful, interactive Python shell (i.e. console or terminal) that is built into many other applications, such as Spyder and JupyterLab. It offers a host of useful features, like tab completion, system shell commands (cd, ls, etc.), a debugging shell, and special "magic" commands. One of those magic commands is %run.
%run allows you to run a script and then interact with the variables that are created in the script, similarly to MATLAB's console. This is especially useful when debugging a script; if your script raises an error and stops, you can inspect the variable states right when the error occurred.
Note: IPython is not installed in default Python installations. It is included in Anaconda installations and can also be installed quite easily via pip (pip install ipython) or conda (conda install ipython).

    ~$ cat script.py
    #!/usr/bin/env python3
    print('Hello World')
    test = 1
    ~$ ipython
    Python 3.9.6 (default, Jun 30 2021, 10:22:16)
    Type 'copyright', 'credits' or 'license' for more information
    IPython 7.23.1 -- An enhanced Interactive Python. Type '?' for help.

    In [1]: %run script.py
    Hello World

    In [2]: test
    Out[2]: 1

Note this can also be done with Python's default console, but it's a bit more clunky:

    ~$ python
    Python 3.9.6 (default, Jun 30 2021, 10:22:16)
    [GCC 11.1.0] on linux
    Type "help", "copyright", "credits" or "license" for more information.
    >>> exec(open('script.py').read())
    Hello World
    >>> test
    1

However, Python's built-in console does not have the nice features of IPython, so IPython is generally preferred.
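If you are limited to the built-in console, a possibly tidier alternative to exec(open(...).read()) is the standard library's runpy.run_path, which runs a file and returns its global variables as a dictionary. The sketch below creates a throwaway copy of script.py in a temporary directory purely so that it is self-contained; in practice you would pass the path of your own script.

```python
import os
import runpy
import tempfile

# Create a throwaway script equivalent to the script.py used above.
with tempfile.TemporaryDirectory() as tmpdir:
    script = os.path.join(tmpdir, 'script.py')
    with open(script, 'w') as f:
        f.write("print('Hello World')\ntest = 1\n")

    # run_path executes the file and returns its global namespace as a dict.
    script_globals = runpy.run_path(script)

# Variables defined by the script are available as dictionary entries.
test = script_globals['test']
```

Unlike exec, this keeps the script's variables in their own dictionary instead of mixing them into the console's namespace, which may or may not be what you want when debugging.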
"Which module version did I load?" and Reloading modules
When working with Vtkpytools and other in-development Python libraries, you may need to determine what version of the library you're working with, and may need to reload that library to get recently-added features.
The following examples use Vtkpytools as the module in question, assuming the library has been imported under an alias via import vtkpytools as vpt.
Which module version did I load?
You can check the release version via:
print(vpt.__version__)
Note: if you are checking the vtk Python library, you need to use vtk.VTK_VERSION, because vtk is special.
If you're dealing with a development version of the library, the version information may not be updated. You can determine the path that the library was loaded from via:
print(vpt.__file__)
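Both checks can be wrapped in a small helper. The sketch below is a hypothetical convenience function (describe_module is not part of any library) and uses the standard-library json module as a stand-in for vpt, so that it runs without Vtkpytools installed:

```python
import json


def describe_module(module):
    """Return (version, path) for an already-imported module.

    Either entry is None when the module does not define the attribute
    (built-in modules such as sys have no __file__, for example).
    """
    version = getattr(module, '__version__', None)
    path = getattr(module, '__file__', None)
    return version, path


version, path = describe_module(json)
print(f"json version: {version}, loaded from: {path}")
```

Using getattr with a default avoids an AttributeError for modules that do not define __version__, which is common for development checkouts and standard-library modules.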
Reload the module
Note that any new instance of the Python interpreter automatically loads the current version of a library from disk; imported modules are cached only within a single interpreter instance.
In IPython Console
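If you're using the IPython console (either from the command line or from Spyder), you can use dreload:

    dreload(vpt)

In newer IPython versions dreload may not be defined by default; in that case it can usually be imported with from IPython.lib.deepreload import reload as dreload.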
In Spyder
If you're in Spyder, in addition to using dreload (specified above), rerunning a script will automatically reload any imported libraries. A message should appear in the console window saying which modules it reloaded.
Using Python standard library
If neither of the above works, or neither is a desirable solution, you can try to reload the module using the Python standard library:

    from importlib import reload
    reload(vpt)

Note that this is a bit finicky, as only the top-level module is reloaded. So in this case, any changes made to the vpt.barfiletools module are not reloaded. Thus you have to do a recursive dance to reload everything. This can be better automated using this StackOverflow answer (https://stackoverflow.com/a/17194836/7564988, reprinted below):

    from types import ModuleType

    try:
        from importlib import reload  # Python 3.4+
    except ImportError:
        # Needed for Python 3.0-3.3; harmless in Python 2.7 where imp.reload is just an
        # alias for the builtin reload.
        from imp import reload

    def rreload(module):
        """Recursively reload modules."""
        reload(module)
        for attribute_name in dir(module):
            attribute = getattr(module, attribute_name)
            if type(attribute) is ModuleType:
                rreload(attribute)
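
The reprinted rreload follows every module-valued attribute, so it can also reload unrelated imports (such as standard-library modules) and may recurse without end when two modules import each other. Below is a minimal sketch of a more conservative variant, assuming Python 3.4+ and that only the library's own package should be reloaded. The name rreload_package and its _seen argument are hypothetical, not part of the StackOverflow answer.

```python
from importlib import reload
from types import ModuleType


def rreload_package(module, _seen=None):
    """Recursively reload `module` and the submodules of its own package.

    Hypothetical variant of rreload: it skips modules outside the top-level
    package and remembers visited modules so import cycles terminate.
    """
    if _seen is None:
        _seen = set()
    prefix = module.__name__.split('.')[0]
    _seen.add(module.__name__)
    for attribute_name in dir(module):
        attribute = getattr(module, attribute_name)
        if not isinstance(attribute, ModuleType):
            continue
        name = attribute.__name__
        in_package = name == prefix or name.startswith(prefix + '.')
        if in_package and name not in _seen:
            rreload_package(attribute, _seen)
    # Reload children first so the parent picks up their new contents.
    return reload(module)
```

With the alias used above, this would be called as rreload_package(vpt). Submodules are discovered from the module's attributes before it is reloaded, so a submodule added to the package since the last import is only picked up on a second call.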