Difference between revisions of "Anaconda"
Revision as of 13:59, 20 October 2020
This is an overview of accessing Anaconda on the viz nodes. Anaconda is a distribution of Python that includes many commonly used packages for scientific/machine-learning applications.
It also has its own language-agnostic environment manager, conda. conda is used to activate access to the Anaconda packages and to the Python interpreter associated with an environment, and to install packages into that environment from the Anaconda public repositories. Note that installing packages through conda is generally preferable to pip, as conda can also install the non-Python libraries that Python tools depend on (such as LAPACK, BLAS, and even PETSc). Management of conda environments (and Python environments in general) won't be covered here, as it shouldn't be necessary for our purposes, but you can learn more about them at this very helpful article.
Accessing Anaconda
The current (20191201) installation of Anaconda is at /projects/tools/anaconda3. This is where Anaconda's Python interpreter is stored, along with the packages associated with it. To access it, you need to activate the conda environment.
Accessing conda
First, you need to have access to conda itself. This is done by adding the following to the .bashrc/.zshrc in your home directory:
# >>> conda initialize >>>
# !! Contents within this block are managed by 'conda init' !!
__conda_setup="$('/projects/tools/anaconda3/bin/conda' 'shell.bash' 'hook' 2> /dev/null)"
if [ $? -eq 0 ]; then
    eval "$__conda_setup"
else
    if [ -f "/projects/tools/anaconda3/etc/profile.d/conda.sh" ]; then
        . "/projects/tools/anaconda3/etc/profile.d/conda.sh"
    else
        export PATH="/projects/tools/anaconda3/bin:$PATH"
    fi
fi
unset __conda_setup
# <<< conda initialize <<<
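Once the block above is in your startup file and you have opened a new shell, a quick check along these lines can confirm that conda is reachable. This is only a sketch: the helper name check_conda is made up for illustration and is not part of conda.

```shell
# Hypothetical helper (not part of conda): report whether conda is
# reachable from the current shell.
check_conda() {
    if command -v conda >/dev/null 2>&1; then
        # conda --version prints the installed conda version
        echo "conda found: $(conda --version 2>&1)"
    else
        echo "conda not found; re-check the block in your shell startup file"
        return 1
    fi
}

# "|| true" keeps the shell going even when conda is missing
check_conda || true
```

If the check fails, the startup file usually has not been re-read yet; opening a new terminal or sourcing the file again is the first thing to try.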
Activating conda Environment
By default, adding the above section to your .bashrc/.zshrc will cause conda to activate the "base" environment whenever a shell is launched.
To keep the environment from being activated by default (recommended to prevent inadvertent conflicts), run conda config --set auto_activate_base False. This will add a setting to the .condarc file in your home directory.
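To see whether the setting was recorded, you can look for it in .condarc. The sketch below assumes conda writes the setting as a top-level line such as auto_activate_base: false; the helper name show_auto_activate and its optional path argument are hypothetical and exist only for illustration.

```shell
# Hypothetical helper: print the auto_activate_base entry from a .condarc
# file, if present. The optional argument overrides the default path.
show_auto_activate() {
    condarc="${1:-$HOME/.condarc}"
    if [ -f "$condarc" ] && grep -q '^auto_activate_base:' "$condarc"; then
        grep '^auto_activate_base:' "$condarc"
    else
        echo "auto_activate_base not set in $condarc"
    fi
}

show_auto_activate
```

Running conda config --show auto_activate_base should report the same value, as resolved by conda itself.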
To activate a conda environment, simply run conda activate [environment name]. If the environment name is not given, it defaults to "base".
To deactivate a conda environment, simply run conda deactivate.
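A short session might look like the sketch below. It relies on the CONDA_DEFAULT_ENV variable, which conda sets to the name of the active environment; the helper show_active_env is a hypothetical name used only for this example.

```shell
# Hypothetical helper: print the name of the active conda environment,
# or "none" when no environment is active.
show_active_env() {
    echo "active conda environment: ${CONDA_DEFAULT_ENV:-none}"
}

show_active_env

# Typical usage once conda is initialized (left commented out so the
# snippet is harmless on machines without conda):
#   conda activate base
#   show_active_env
#   conda deactivate
```

Checking the active environment before installing anything is a cheap way to avoid putting packages in the wrong place.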
Conda Resources
Official documentation on conda can be found here.
An official cheatsheet for conda commands is available at https://docs.conda.io/projects/conda/en/latest/user-guide/cheatsheet.html.
Installing Packages with mamba
Using conda to install packages can be quite slow, as it's written in pure Python and resolving package dependencies is quite hard (see https://stackoverflow.com/a/28102139/7564988).
mamba is a drop-in replacement for conda as a package manager. It's written in C++ and is compatible with conda, but it's a lot faster (by orders of magnitude). See Mamba for more information.
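Because mamba accepts the same install syntax as conda, a script can prefer mamba when it is available and fall back to conda otherwise. The following is a minimal sketch; pick_installer is a hypothetical helper, and numpy in the usage comment is just an example package.

```shell
# Hypothetical helper: print the preferred installer command name,
# choosing mamba when available and conda otherwise.
pick_installer() {
    if command -v mamba >/dev/null 2>&1; then
        echo mamba
    elif command -v conda >/dev/null 2>&1; then
        echo conda
    else
        echo none
        return 1
    fi
}

installer="$(pick_installer)" || true
echo "installer: $installer"

# Example usage inside an activated environment (commented out so the
# snippet does not modify anything):
#   "$installer" install numpy
```

Installing requires write access to the target environment, so on a shared installation this may only work in environments you own.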
Short list of things included in Anaconda installation
- IPython
  - Interactive Python shell
- Jupyter
  - Offers interactive computing tools, most notably Notebooks
  - Notebooks can be used with many different programming languages
- NumPy
  - The fundamental scientific library
  - Offers n-dimensional array objects and functions to perform calculations with them
- Pandas
  - Essentially NumPy arrays with labels
  - Allows for easy data manipulation and plotting
  - See xarray if you're working with structured higher-dimensional data
- SciPy
  - Ecosystem of packages (including NumPy, Matplotlib, SymPy, etc.)
  - Also a standalone package of miscellaneous scientific computing functions
- Matplotlib
  - MATLAB-style plotting package