Working with virtual environments and conda
To work with Python projects, it’s a good idea to use virtual environments. Virtual environments allow you to create isolated spaces for your Python projects, each with its own set of packages and dependencies. This way, you can avoid conflicts between different projects that may require different versions of the same package. This is particularly important in scientific computing, where you may have a long-running project that you need to keep working with, while also wanting to be able to start something new with its own dependencies.
At the heart of a Python environment are:
- A Python interpreter: a specific version of Python that will be used to execute your code
- An environment management tool: a tool to create, manage, and switch between different environments
- A package management tool: a tool to install, update, and remove packages within an environment
Conda is a tool that combines all three of these functionalities. It allows you to create and manage virtual environments, manage different versions of Python, and install packages. Conda is particularly popular in the scientific Python community due to its ability to handle complex system-level dependencies, such as GPU support, and its support for packages outside of the Python ecosystem (such as R, C libraries, etc.).
There are other tools that can be used to manage virtual environments, such as `venv` (built into Python) and newer ones like `uv` and `pixi`. These use a project-oriented approach, where the environment is tied to a specific project directory. Conda, on the other hand, creates and manages environments centrally, independently of any specific directory. At the end of the day, it’s a matter of preference and use case, but the key take-home message is to always use virtual environments for your projects!
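To make the project-oriented approach concrete, here is a minimal sketch using the `venv` module from the standard library. The `.venv` directory name and the `create_project_env` helper are assumptions made for illustration, and pip is left out only to keep the example fast and offline.

```python
import os
import tempfile
import venv
from pathlib import Path


def create_project_env(project_dir: Path) -> Path:
    """Create a virtual environment in <project_dir>/.venv and return its path."""
    env_dir = project_dir / ".venv"  # assumed location: a common convention, not a requirement
    # with_pip=False keeps this sketch fast and offline; real projects usually want pip.
    # Symlinks are the command-line default on POSIX; on Windows files are copied instead.
    venv.create(env_dir, with_pip=False, symlinks=(os.name != "nt"))
    return env_dir


# Usage: build an environment inside a throwaway "project" directory.
with tempfile.TemporaryDirectory() as tmp:
    env_dir = create_project_env(Path(tmp))
    # pyvenv.cfg records which base interpreter the environment was built from.
    cfg_text = (env_dir / "pyvenv.cfg").read_text()
    print(cfg_text)
```

The environment lives inside the project directory, so deleting the project also deletes the environment. A conda environment, by contrast, is stored centrally and referred to by name.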
Installing conda, if you don’t have it already
- There are several distributions of `conda`, and it can pull packages from many different sources called “channels”. In this workshop we will use the conda-forge distribution called “miniforge” (or “miniforge3”).
  - Go to the conda-forge downloads page: https://conda-forge.org/download/.
  - Download the installer for your operating system (Mac, Windows, Linux) and architecture (x86_64/amd64 or arm64/aarch64) and follow the installation instructions on that page.
  - Note: When prompted whether to “automatically initialize conda”, I recommend saying “yes”. And on Windows, I recommend checking the options to “Create start menu shortcuts” and “Add Miniforge3 to my PATH environment variable”.
- Open a NEW terminal window. If you see `(base)` at the beginning of your prompt, you have installed it successfully. You can also run `conda --version` and see if you get a version output.
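If you would rather check the installation from a script than by eye, something along these lines should do. It is a rough sketch: the `conda_version` helper is a made-up name, and it assumes a `conda` executable is discoverable on your PATH, which may not hold if you skipped the initialization step.

```python
import shutil
import subprocess
from typing import Optional


def conda_version() -> Optional[str]:
    """Return the output of `conda --version`, or None if conda cannot be run."""
    conda = shutil.which("conda")  # looks conda up on PATH, like the shell would
    if conda is None:
        return None
    result = subprocess.run(
        [conda, "--version"], capture_output=True, text=True, check=False
    )
    if result.returncode != 0:
        return None
    # Very old releases printed the version to stderr, so fall back to it.
    return (result.stdout or result.stderr).strip()


version = conda_version()
print(version if version else "conda not found on PATH; try opening a new terminal")
```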
Environments: how do they work?
The `(base)` at the beginning of the line indicates which conda environment is currently active: the `base` environment. Each environment will have its own version of Python, with its own packages. Conda resolves and installs the dependencies of whatever you ask it for.
VERY IMPORTANTLY, DO NOT INSTALL THINGS ON THE BASE ENVIRONMENT. It’s a sure way to make things more confusing for yourself. Anything you do should have its own environment. You should never be doing any work on your base environment: it’s the environment `conda` uses to run itself, so a conflict or problem there can break your entire `conda` installation.
So first of all, let’s try to create a new, “clean” environment. Type: `conda create -n bestpractices`
This will create a new environment named “bestpractices” (`-n` is shorthand for `--name`). Let’s activate it, so we can use it. Type: `conda activate bestpractices`
This should change the beginning of your prompt to `(bestpractices)`.
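The prompt prefix is one way to tell which environment is active. Another is the `CONDA_DEFAULT_ENV` environment variable, which `conda activate` sets to the environment’s name (or to its path, for environments created by path). The sketch below reads it and nags you if you are still on `base`; both helper names are made up for illustration.

```python
import os
from typing import Mapping, Optional


def active_conda_env(environ: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return the active conda environment's name, or None if none is active."""
    # `conda activate` exports CONDA_DEFAULT_ENV; it is unset outside conda.
    return environ.get("CONDA_DEFAULT_ENV") or None


def describe_environment(environ: Mapping[str, str] = os.environ) -> str:
    """Turn the active environment into a short, human-readable status line."""
    name = active_conda_env(environ)
    if name is None:
        return "No conda environment is active."
    if name == "base":
        return "Only base is active: activate a project environment before installing anything."
    return f"Active environment: {name}"


print(describe_environment())
```

Passing the environment in as a mapping keeps the check easy to try out, for example `describe_environment({"CONDA_DEFAULT_ENV": "bestpractices"})`.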
Always remember to activate the environment you want to use or make changes to!
Try running `python`. It probably didn’t work. Why?
Let’s deactivate this empty environment and delete it. To deactivate, type: `conda deactivate`
And then, to delete the environment, use: `conda env remove -n bestpractices`
Try running `conda env list` and you will see you have no environments other than `base`.
Now let’s try to create an environment with Python. Type: `conda create -n bestpractices python=3.13` (or pick a version of your choice). Activate your new environment as before and try: `python --version`
You can also try `which python`, depending on your platform. This should show you a path within the newly created environment, under your user, in a miniforge directory, rather than in a system location like `/bin` or `/usr/bin`.
Now try running `python` to launch the interpreter and then `import numpy as np`. This should fail, but why?
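You can ask the same questions from inside Python. This short sketch prints where the running interpreter lives (roughly what `which python` tells you) and checks whether a module can be found without actually importing it. `can_import` is a hypothetical helper name, and `json` is included only as a standard-library module that should always be present.

```python
import importlib.util
import sys


def can_import(module_name: str) -> bool:
    """Report whether a top-level module can be found, without importing it."""
    return importlib.util.find_spec(module_name) is not None


# The interpreter path should sit inside the active environment's directory.
print("Interpreter:", sys.executable)
print("Environment root:", sys.prefix)

for name in ("json", "numpy"):
    status = "available" if can_import(name) else "not installed in this environment"
    print(f"{name}: {status}")
```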
Installing packages in your environment
With a `conda` environment active, you have two options for installing Python packages:
- Using `conda` itself: `conda install numpy`. This will install the package from the configured conda channel, in this case `conda-forge`. It can be used to install packages that go beyond just Python! For a listing, see: https://conda-forge.org/packages/
- Using the Python package installer `pip`: `pip install numpy`. This will install Python packages from PyPI (the Python Package Index).
Generally, you can use either method, but mixing them can sometimes lead to conflicts. If you do use both, try to install what you need using `conda` first and then use `pip` for packages that are not available via `conda`.
Try running `import numpy as np` in the Python interpreter again. It should work now.
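When you do mix the two, it can help to know which tool installed a given package. Python package metadata usually includes an `INSTALLER` file that records this, typically `pip` or `conda`. The sketch below reads it with `importlib.metadata`. The `installer_of` helper is a made-up name, and the file is optional, so treat a `None` result as “unknown” rather than as proof of anything.

```python
from importlib import metadata
from typing import Optional


def installer_of(package: str) -> Optional[str]:
    """Return the tool recorded as having installed `package`, or None if unknown."""
    try:
        dist = metadata.distribution(package)
    except metadata.PackageNotFoundError:
        return None  # not installed in this environment
    recorded = dist.read_text("INSTALLER")  # None when the file is absent
    return recorded.strip() if recorded else None


for name in ("numpy", "pip"):
    print(name, "->", installer_of(name))
```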
Now let’s deactivate and delete this environment, as before, using `conda deactivate` and `conda env remove -n bestpractices`.
This time, we will try to create an environment using the `environment.yml` in this repository. Navigate to where you cloned the workshop repository in your terminal and have a look at the contents of `environment.yml`.
Now, run: `conda env create -f environment.yml`
This will create a new environment named `bestpractices_final` with all the packages specified in the `environment.yml` file. Activate the new environment as before, using `conda activate bestpractices_final` (why is this the name?), and try running `python --version`. Try running `conda list` to see all the installed packages. Everything listed in the `environment.yml` should be there, along with all dependencies.
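As a complement to reading through the `conda list` output, you could check for specific packages from Python. This is only a sketch: `missing_packages` is a hypothetical helper, the `expected` list is a placeholder rather than the contents of the workshop’s `environment.yml`, and `importlib.metadata` only sees Python distributions, so non-Python conda packages (and packages whose conda name differs from their PyPI name) will not be matched.

```python
from importlib import metadata
from typing import Iterable, List


def missing_packages(expected: Iterable[str]) -> List[str]:
    """Return the names from `expected` that are not installed as Python distributions."""
    missing = []
    for name in expected:
        try:
            metadata.version(name)  # raises if the distribution is not installed
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing


# Placeholder list: substitute the Python packages named in your own environment file.
expected = ["numpy", "pip"]
print("Missing:", missing_packages(expected))
```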
Note: The `pip` equivalent, once you have an activated virtual environment with a Python interpreter, is: `pip install -r requirements.txt`
This will install all packages listed in a `requirements.txt` file from PyPI.
A final tip: append `--dry-run` to `conda` commands that modify an environment (such as `conda create` or `conda install`) to see what they will do without actually making any changes.
Some advice on projects and dependencies
Beyond using virtual environments, here are some best practices when it comes to managing dependencies for your projects:
- Keep track of every package you import from in your code! Maintain either an `environment.yml` (for `conda`) or a `requirements.txt` file (for `pip`) that lists all of the dependencies needed to run your code.
- For reproducibility, it’s a good idea to specify (pin) the specific versions of your dependencies you are using in your environment files.
- For portability and ability to incorporate your code into other projects, avoid pinning versions strictly: pin to the minimum version required for the functionality you need.
- In addition to your environment file, for maximal reproducibility, consider generating a lock file, which specifies the exact versions of all packages and their dependencies in your environment. This can be done with `conda` using conda-lock or with `pip` using pip-tools. (Additionally, both `uv` and `pixi` automatically generate lock files.)
- If you will be sharing your code or expect others to run it, consider making a Python package. Check out the pyOpenSci Python Package Guide for more details on how to do that!
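The difference between the two pinning styles is easiest to see side by side. The sketch below writes `requirements.txt`-style lines either as exact pins (`==`) or as minimum versions (`>=`). `requirement_lines` is a hypothetical helper, the version numbers are made up for the demonstration, and using the installed version as the minimum is a conservative guess rather than a tested lower bound.

```python
from importlib import metadata
from typing import Callable, Iterable, List


def requirement_lines(
    names: Iterable[str],
    exact: bool,
    lookup: Callable[[str], str] = metadata.version,
) -> List[str]:
    """Build pip-style requirement lines, pinned exactly or to a minimum version."""
    operator = "==" if exact else ">="
    return [f"{name}{operator}{lookup(name)}" for name in names]


# Made-up versions so the example runs without any packages installed.
example_versions = {"numpy": "2.1.0", "pandas": "2.2.3"}
names = ["numpy", "pandas"]

# Exact pins: best for reproducing one specific analysis.
print(requirement_lines(names, exact=True, lookup=example_versions.__getitem__))
# prints ['numpy==2.1.0', 'pandas==2.2.3']

# Minimum versions: friendlier when others build on your code.
print(requirement_lines(names, exact=False, lookup=example_versions.__getitem__))
# prints ['numpy>=2.1.0', 'pandas>=2.2.3']
```

Leaving `lookup` at its default reads the versions actually installed in the active environment instead.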