Anaconda

From PHASTA Wiki

This is an overview of accessing Anaconda on the viz nodes. Anaconda is a distribution of Python that includes many commonly used packages for scientific/machine-learning applications.

Anaconda also ships with its own language-agnostic environment manager, conda. conda is used to activate an environment (which gives access to its Anaconda packages and its associated Python interpreter) and to install packages into the environment from the Anaconda public repositories. Installing packages through conda is generally preferable to pip because conda can also install the non-Python libraries that Python tools depend on (such as LAPACK, BLAS, and even PETSc). Management of conda environments (and Python environments in general) isn't covered here, as it shouldn't be necessary for our purposes, but you can learn more about it at this very helpful article.

Accessing Anaconda

The current (20191201) installation of Anaconda is at /projects/tools/anaconda3. This is where Anaconda's Python interpreter and its associated packages are stored. To use them, you need to activate the conda environment.

Accessing conda

First, you need access to conda itself. To get it, add the following to the .bashrc/.zshrc in your home directory:

# >>> conda initialize >>>
# !! Contents within this block are managed by 'conda init' !!
__conda_setup="$('/projects/tools/anaconda3/bin/conda' 'shell.bash' 'hook' 2> /dev/null)"
if [ $? -eq 0 ]; then
    eval "$__conda_setup"
else
    if [ -f "/projects/tools/anaconda3/etc/profile.d/conda.sh" ]; then
        . "/projects/tools/anaconda3/etc/profile.d/conda.sh"
    else
        export PATH="/projects/tools/anaconda3/bin:$PATH"
    fi
fi
unset __conda_setup
# <<< conda initialize <<<
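After adding the block above and opening a new shell (or re-sourcing the rc file), it can be useful to confirm that conda is actually reachable. The following is a minimal sketch, not part of the original setup; the function name check_conda is a hypothetical helper introduced here for illustration.

```shell
# check_conda is a hypothetical helper, not something conda provides.
# It reports whether a conda command (binary or shell function) is
# visible in the current shell.
check_conda() {
    if command -v conda >/dev/null 2>&1; then
        # conda --version prints the installed conda version
        echo "conda found: $(conda --version 2>/dev/null)"
    else
        echo "conda not found; re-source your rc file or open a new shell"
        return 1
    fi
}

# Run the check without aborting the shell if conda is missing
check_conda || true
```

If the check reports that conda is not found, the most likely cause is that the rc file has not been re-read since the block was added.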

Activating conda Environment

By default, adding the above section to your .bashrc/.zshrc will cause conda to activate the "base" environment whenever a shell is launched.

To keep the base environment from being activated by default (recommended, to prevent inadvertent conflicts), run conda config --set auto_activate_base False. This adds a setting to the .condarc file in your home directory.

To activate a conda environment, simply run conda activate [environment name]. If the environment name is not given, it defaults to "base".

To deactivate a conda environment, simply run conda deactivate.
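To see which environment (if any) is currently active, one option is to inspect the CONDA_DEFAULT_ENV variable, which conda sets on activation. The sketch below assumes that behavior; active_conda_env is a hypothetical helper name, and the session in the comments only illustrates the commands described above.

```shell
# active_conda_env is a hypothetical helper, not a conda command.
# It prints the name of the active conda environment, or "none"
# when no environment is active.
active_conda_env() {
    echo "${CONDA_DEFAULT_ENV:-none}"
}

active_conda_env

# Illustrative interactive session (requires the conda setup above):
#   conda config --set auto_activate_base False   # one-time setting
#   conda activate base                           # enable the environment
#   active_conda_env                              # should now report the base environment
#   conda deactivate                              # return to the plain shell
```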

Conda Resources

Official documentation on conda can be found here.

An official cheatsheet for conda commands is here.

Installing Packages with mamba

Using conda to install packages can be quite slow: it is written in pure Python, and resolving package dependencies is a hard problem. mamba is a drop-in replacement for conda as a package manager. It is written in C++ and is compatible with conda, but it's a lot faster (by orders of magnitude). See Mamba for more information.
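Because mamba accepts the same install syntax as conda, a script can prefer mamba when it is present and fall back to conda otherwise. This is a rough sketch under that assumption; pick_installer is a hypothetical helper, and the package name in the comment is only an example.

```shell
# pick_installer is a hypothetical helper. It prints the preferred
# package manager command: mamba if available, otherwise conda,
# otherwise "none" (with a non-zero exit status).
pick_installer() {
    if command -v mamba >/dev/null 2>&1; then
        echo mamba
    elif command -v conda >/dev/null 2>&1; then
        echo conda
    else
        echo none
        return 1
    fi
}

installer="$(pick_installer)" || true
echo "would install with: $installer"

# Example usage (package name is illustrative):
#   "$installer" install numpy
```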

Short list of things included in the Anaconda installation

  • Spyder
    • A MATLAB-esque IDE for scientific Python
    • Uses IPython as its built-in console
  • IPython
    • Interactive Python shell
  • Jupyter
    • Offers unique interactive computing tools, namely Notebooks
    • Notebooks can be used for many different programming languages
  • NumPy
    • The fundamental scientific library
    • Offers n-dimensional array objects and functions to perform calculations with them
  • Pandas
    • Essentially NumPy arrays with labels
    • Allows for easy data manipulation and plotting
    • See xarray if you're working with structured, higher-dimensional data
  • SciPy
    • Ecosystem of packages (including numpy, matplotlib, sympy, etc.)
    • Also a standalone package of miscellaneous scientific computing functions
  • Matplotlib
    • MATLAB-style plotting package
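
To confirm that the packages listed above are visible from the currently active interpreter, one approach is to try importing each of them. The sketch below assumes the standard import names (numpy, pandas, scipy, matplotlib, IPython); check_modules is a hypothetical helper introduced for illustration.

```shell
# check_modules is a hypothetical helper. For each module name given,
# it reports whether the Python interpreter on PATH can import it.
check_modules() {
    # Prefer "python" (what an activated conda environment provides),
    # falling back to "python3" if that is all that exists.
    py="$(command -v python || command -v python3)" || {
        echo "no python interpreter found"
        return 1
    }
    for mod in "$@"; do
        if "$py" -c "import $mod" >/dev/null 2>&1; then
            echo "$mod: ok"
        else
            echo "$mod: missing"
        fi
    done
}

check_modules numpy pandas scipy matplotlib IPython || true
```

Running this before and after conda activate is a quick way to see the difference between the system interpreter and the Anaconda one.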