A non-magical introduction to Pip and Virtualenv for Python beginners

Jamie Matthews

One of the hurdles that new Python developers have to get over is understanding the Python packaging ecosystem. This blog post is based on material covered in our Python for Programmers training course, which attempts to explain pip and virtualenv for new Python users.

Prerequisites

Python for Programmers is aimed at developers who are already familiar with one or more programming languages, and so we assume a certain amount of technical knowledge. It will help if you're reasonably comfortable with a command line. The examples below use bash, which is the default shell on Macs and most Linux systems, but the commands are simple enough that the concepts should be transferable to any terminal, such as PowerShell for Windows.

pip

Let's dive in. pip is a tool for installing Python packages from the Python Package Index.

PyPI (which you'll occasionally see referred to as The Cheeseshop) is a repository for open-source third-party Python packages. It's similar to RubyGems in the Ruby world, PHP's Packagist, CPAN for Perl, and NPM for Node.js.

Python actually has another, more primitive, package manager called easy_install. It ships with setuptools, so it is already present on many (but not all) Python installations. pip is preferable for several reasons (for example, it keeps a record of what it installs so that packages can be uninstalled, and its output is clearer), and so should generally be used instead. If easy_install is available, you can use it to install pip as follows:

$ sudo easy_install pip

You can then install packages with pip as follows (in this example, we're installing Django):

# DON'T DO THIS
$ sudo pip install django

Here, we're installing Django globally on the system. But in most cases, you shouldn't install packages globally. Read on to find out why.

virtualenv

virtualenv solves a very specific problem: it allows multiple Python projects that have different (and often conflicting) requirements, to coexist on the same computer.

What problem does it solve?

To illustrate this, let's start by pretending virtualenv doesn't exist. Imagine we're going to write a Python program that needs to make HTTP requests to a remote web server. We're going to use the Requests library, which is brilliant for that sort of thing. As we saw above, we can use pip to install Requests.

But where on your computer does pip install the packages to? Here's what happens if I try to run pip install requests:

$ pip install requests
Downloading/unpacking requests
  Downloading requests-1.1.0.tar.gz (337Kb): 337Kb downloaded
  Running setup.py egg_info for package requests

Installing collected packages: requests
  Running setup.py install for requests

    error: could not create '/Library/Python/2.7/site-packages/requests': Permission denied

Oops! It looks like pip is trying to install the package into /Library/Python/2.7/site-packages/requests. This is a special directory that Python knows about. Anything that's installed in site-packages can be imported by your programs.
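
If you're curious where site-packages lives for the interpreter you're running, Python can tell you. The snippet below is a minimal sketch using the standard library's sysconfig module; the helper name site_packages_dir is ours, not part of any library. The directory usually ends in site-packages, although some Linux distributions name it dist-packages.

```python
import sysconfig


def site_packages_dir():
    """Return the directory where this interpreter installs pure-Python packages."""
    # "purelib" is sysconfig's name for the pure-Python package directory.
    return sysconfig.get_paths()["purelib"]


if __name__ == "__main__":
    print(site_packages_dir())
```

Run it with the global python and then again with a python inside a virtual environment, and the two paths should differ.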

We're seeing the error because /Library/ (on a Mac) is not usually writeable by "ordinary" users. To fix the error, we can run sudo pip install requests (sudo means "run this command as a superuser"). Then everything will work fine:

$ sudo pip install requests
Password:
Downloading/unpacking requests
  Running setup.py egg_info for package requests

Installing collected packages: requests
  Running setup.py install for requests

Successfully installed requests
Cleaning up...

This time it worked. We can now type python and try importing our new library:

>>> import requests
>>> requests.get('http://dabapps.com')
<Response [200]>

So, we now know that we can import requests and use it in our program. We go ahead and work feverishly on our new program, using requests (and probably lots of other libraries from PyPI too). The software works brilliantly, we make loads of money, and our clients are so impressed that they ask us to write another program to do something slightly different.

But this time, we find a brand new feature that's been added to requests since we wrote our first program that we really need to use in our second program. So we decide to upgrade the requests library to get the new feature:

$ sudo pip install --upgrade requests

Everything seems fine, but we've unknowingly created a disaster!

Next time we try to run it, we discover that our original program (the one that made us loads of money) has completely stopped working and is raising errors when we try to run it. Why? Because something in the API of the requests library has changed between the previous version and the one we just upgraded to. It might only be a small change, but it means our code no longer uses the library correctly. Everything is broken!

Sure, we could fix the code in our first program to use the new version of the requests API, but that takes time and distracts us from our new project. And, of course, a seasoned Python programmer won't just have two projects but dozens - and each project might have dozens of dependencies! Keeping them all up-to-date and working with the same versions of every library would be a complete nightmare.
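
When a program breaks like this, a useful first step is to check which version of a library is actually installed. Here is a small sketch that assumes Python 3.8 or newer (for importlib.metadata, which did not exist when the Python 2.7 output above was captured); installed_version is a hypothetical helper name.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package_name):
    """Return the installed version string of a package, or None if it is absent."""
    try:
        return version(package_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Prints the version visible to *this* interpreter, or None if not installed.
    print("requests:", installed_version("requests"))
```

With everything installed globally there is only one answer for the whole machine, and that is exactly the problem.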

How does virtualenv help?

virtualenv solves this problem by creating a completely isolated virtual environment for each of your programs. An environment is simply a directory that contains a complete copy of everything needed to run a Python program, including a copy of the python binary itself, a copy of the entire Python standard library, a copy of the pip installer, and (crucially) a copy of the site-packages directory mentioned above. When you install a package from PyPI using the copy of pip that's created by the virtualenv tool, it will install the package into the site-packages directory inside the virtualenv directory. You can then use it in your program just as before.
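
It can be handy to check from inside a program whether it is running in a virtual environment. The sketch below relies on two interpreter attributes: older virtualenv releases set sys.real_prefix, while newer virtualenv releases and Python's built-in venv make sys.base_prefix differ from sys.prefix. The function name is our own.

```python
import sys


def in_virtual_environment():
    """Return True if this interpreter appears to be running inside a virtual environment."""
    # Older virtualenv releases record the original prefix as sys.real_prefix.
    if hasattr(sys, "real_prefix"):
        return True
    # venv and newer virtualenv keep the original prefix in sys.base_prefix.
    return getattr(sys, "base_prefix", sys.prefix) != sys.prefix


if __name__ == "__main__":
    print("interpreter:", sys.executable)
    print("in a virtual environment:", in_virtual_environment())
```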

How can I install virtualenv?

If you already have pip, the easiest way is to install it globally with sudo pip install virtualenv. Usually pip and virtualenv are the only two packages you ever need to install globally, because once you've got both of these you can do all your work inside virtual environments.

In fact, virtualenv comes with a copy of pip which gets copied into every new environment you create, so virtualenv is really all you need. You can even install it as a separate standalone package (rather than from PyPI). This might be easier for Windows users. See virtualenv.org for instructions.

How do I create a new virtual environment?

You only need the virtualenv tool itself when you want to create a new environment. This is really simple. Start by changing directory into the root of your project directory, and then use the virtualenv command-line tool to create a new environment:

$ cd ~/code/myproject/
$ virtualenv env
New python executable in env/bin/python
Installing setuptools............done.
Installing pip...............done.

Here, env is just the name of the directory you want to create your virtual environment inside. It's a common convention to call this directory env, and to put it inside your project directory (so, say you keep your code at ~/code/projectname/, the environment will be at ~/code/projectname/env/ - each project gets its own env). But you can call it whatever you like and put it wherever you like!
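
If you'd rather create an environment from Python code, the standard library's venv module (included since Python 3.3) can do something very similar. Note that this is a swapped-in alternative, not the virtualenv tool described in this post, and create_environment is a hypothetical helper. The sketch passes with_pip=False to keep it fast and offline, so the resulting environment has no pip; pass with_pip=True if you want one.

```python
import os
import tempfile
import venv


def create_environment(env_dir):
    """Create a virtual environment at env_dir and return the path to its python."""
    # with_pip=False skips bootstrapping pip into the new environment.
    venv.create(env_dir, with_pip=False)
    # Windows uses Scripts\python.exe; other platforms use bin/python.
    bin_dir = "Scripts" if os.name == "nt" else "bin"
    exe = "python.exe" if os.name == "nt" else "python"
    return os.path.join(env_dir, bin_dir, exe)


if __name__ == "__main__":
    # Use a throwaway directory so the demo leaves nothing behind.
    with tempfile.TemporaryDirectory() as tmp:
        print(create_environment(os.path.join(tmp, "env")))
```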

Note: if you're using a version control system like git, you shouldn't commit the env directory. Add it to your .gitignore file (or similar).

How do I use my shiny new virtual environment?

If you look inside the env directory you just created, you'll see a few subdirectories:

$ ls env
bin include lib

The one you care about the most is bin. This is where the local copies of the python binary and the pip installer live. Let's start by using the copy of pip to install requests into the virtualenv (rather than globally):

$ env/bin/pip install requests
Downloading/unpacking requests
  Downloading requests-1.1.0.tar.gz (337kB): 337kB downloaded
  Running setup.py egg_info for package requests

Installing collected packages: requests
  Running setup.py install for requests

Successfully installed requests
Cleaning up...

It worked! Notice that we didn't need to use sudo this time. That's because we're not installing requests globally: we're just installing it inside the env directory, which lives in our home directory.

Now, instead of typing python to get a Python shell, we type env/bin/python, and then...

>>> import requests
>>> requests.get('http://dabapps.com')
<Response [200]>

But that's a lot of typing!

virtualenv has one more trick up its sleeve. Instead of typing env/bin/python and env/bin/pip every time, we can run a script to activate the environment. This script, which can be executed with source env/bin/activate, simply adjusts a few variables in your shell (temporarily) so that when you type python, you actually get the Python binary inside the virtualenv instead of the global one:

$ which python
/usr/bin/python
$ source env/bin/activate
$ which python
/Users/jamie/code/myproject/env/bin/python

So now we can just run pip install requests (instead of env/bin/pip install requests) and pip will install the library into the environment, instead of globally. The adjustments to your shell only last for as long as the terminal is open, so you'll need to remember to rerun source env/bin/activate each time you close and open your terminal window. If you switch to work on a different project (with its own environment) you can run deactivate to stop using one environment, and then source env/bin/activate to activate the other.
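
To take some of the magic out of activation, here is a simplified model of what the script does to your shell's variables, written as a Python function over a dictionary. It is a sketch, not the real script: the real activate does a little more (it also changes your prompt, for example), and activated_environ and its bin_name parameter are our own names.

```python
import os


def activated_environ(env_dir, environ, bin_name="bin"):
    """Return a copy of environ adjusted roughly the way activation would adjust it."""
    env_dir = os.path.abspath(env_dir)
    new_environ = dict(environ)  # leave the caller's mapping untouched
    # Record which environment is active.
    new_environ["VIRTUAL_ENV"] = env_dir
    # Put the environment's bin directory first, so "python" and "pip"
    # resolve to the copies inside the environment.
    bin_dir = os.path.join(env_dir, bin_name)
    old_path = environ.get("PATH", "")
    new_environ["PATH"] = bin_dir + os.pathsep + old_path if old_path else bin_dir
    return new_environ


if __name__ == "__main__":
    print(activated_environ("env", {"PATH": "/usr/bin"})["PATH"])
```

Because only the search path changes, typing env/bin/python directly gets you the same interpreter without activating anything.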

Activating and deactivating environments does save a little typing, but it's a bit "magical" and can be confusing. Make your own decision about whether you want to use it.

Requirements files

virtualenv and pip make great companions, especially when you use the requirements feature of pip. Each project you work on has its own requirements.txt file, and you can use this to install the dependencies for that project into its virtual environment:

$ env/bin/pip install -r requirements.txt

See the pip documentation for more details.
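
To give a feel for what goes in such a file, the sketch below parses a tiny, hypothetical requirements.txt in which every line pins an exact version with ==. This is only a small subset of the format that pip accepts (see its documentation for the rest), and parse_pinned_requirements is an illustrative helper, not something pip provides.

```python
def parse_pinned_requirements(text):
    """Parse 'name==version' lines into a dict, skipping blank lines and comments."""
    pins = {}
    for raw_line in text.splitlines():
        # Drop anything after a '#', then surrounding whitespace.
        line = raw_line.split("#", 1)[0].strip()
        if not line:
            continue
        name, sep, version = line.partition("==")
        if not sep:
            raise ValueError(f"not a pinned requirement: {line!r}")
        pins[name.strip()] = version.strip()
    return pins


# Hypothetical file contents, for illustration only.
EXAMPLE = """\
# requirements.txt for one project
requests==1.1.0
Django==1.5   # pinned so upgrades are deliberate
"""

if __name__ == "__main__":
    print(parse_pinned_requirements(EXAMPLE))
```

Pinning exact versions like this is what lets each project keep working with the library versions it was written against.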

Recap

  • pip is a tool for installing packages from the Python Package Index.
  • virtualenv is a tool for creating isolated Python environments containing their own copy of python, pip, and their own place to keep libraries installed from PyPI.
  • It's designed to allow you to work on multiple projects with different dependencies at the same time on the same machine.
  • You can see instructions for installing it at virtualenv.org.
  • After installing it, run virtualenv env to create a new environment inside a directory called env.
  • You'll need one of these environments for each of your projects. Make sure you exclude these directories from your version control system.
  • To use the versions of python and pip inside the environment, type env/bin/python and env/bin/pip respectively.
  • You can "activate" an environment with source env/bin/activate and deactivate one with deactivate. This is entirely optional but might make life a little easier.

pip and virtualenv are indispensable tools if you're a regular Python user. Both are fairly simple to understand, and we highly recommend getting to grips with them.

If this blog post has sparked your interest in learning Python, check out our Python for Programmers workshop at DabApps HQ in Brighton.

  • scragg

    Very nice article, thanks for writing it. I used virtualenv once because I was using pypy. I did it as a global virtualenv, which after reading this article probably isn't the best way to do it. Any advice on organizing a pypy virtualenv would be helpful.

    • Tinned_Tuna

      This is quite easy, you give virtualenv the path to the pypy interpreter using the -p flag, e.g.

      virtualenv -p /path/to/pypy env

  • rahduro

    Excellent tutorial, new to python, and I always hated when things just install without letting you know where it's going..and next time there is some upgrade all sorts of weird errors keep coming..Thanks a lot.

  • Danny Rosen

    Virtualenvwrapper is pretty sweet too!
    http://virtualenvwrapper.re...

    • oltjano

      Yes it is :D

  • David Echols

    Why no mention of venv? venv is included by default in python 3.3 :)

  • Kurt

    Or even better, activate your environments automatically. I wrote about it here: http://www.burgundywall.com...

  • Jesus Christo

    The best kind of article/tutorial; one that is simple, direct, and actively avoids/dispels magic by assuming minimal background knowledge.

  • Fred

    This is an excellent article. This clarifies some of the stuff I simply skim over in other tutorials. Thanks so much!

  • Henrique

    I would like to just give a different advice regarding virtualenvs and installing dependencies:

    When you create the virtualenv, the current package you're working on doesn't get added to site-packages, so you're forced to be at the repository root to import the package.

    The best approach is to have a proper setup.py file so you can do `python setup.py develop`, which will link the package you're working on into the virtualenv site-packages. This way it acts as it's installed and you can import regardless of cwd.

    If you define your requirements on the setup.py (I think you should), you can even skip the `pip install -r requirements.txt` step.

    I've cooked up a package template that can help getting this working:

    https://github.com/hcarvalh...

  • Madison

    Just one correction on this: easy_install is *not* installed just "when you install Python". It's only installed when you install either distribute or setuptools. I think most Python installations these days will come with one of those almost by default, but not if, say, you're installing Python from python.org on a Mac or Windows.

    Typically if you're building up a Python installation from scratch and not using something like MacPorts or your OS packaging system, you can install distribute by downloading distribute_setup.py (the first hit on Google should be the current version) and then running `python distribute_setup.py` with your Python of choice. *Then* you can use `easy_install pip` to get pip. So distribute_setup.py -> easy_install -> pip.

    (The reasons for a lot of this are historical, and there are efforts underway to simplify the whole issue in the future.)

  • exhuma

    You say that pip is "vastly superior to easy_install", but never give an example as to why! The only advantages of pip that I see are that the console output is nicer, and that you can uninstall packages. But the console output is not really important, and you can just uninstall packages by removing their folders/eggs, so it's not *really* a big gain over easy_install.

    easy_install, on the other hand, lets you install multiple versions of the same package, which can be a life-saver sometimes. Additionally, easy_install lets you create binary, pre-compiled packages (say your package has C-extensions). This means people installing the package don't require a complete build environment with compilers and the required headers.

    Another gripe I have is that pip can either install packages using the "requirements.txt" file *and* using the setup "install_requires" value. Why this ambiguity? What makes requirements.txt better than "install_requires"?

    Personally, I use pip as well as everyone keeps saying it is the "new hotness", and that easy_install will disappear. But I have to say that I prefer easy_install feature-wise.

    Am I not seeing an essential feature which makes pip the clear choice for the future? What is it that makes it "vastly superior"?

    • Madison

      Just a few responses to this: Yes, you already hit two of the reasons pip is really better: More sane and controllable console output, and the fact that it actually keeps a record of what files it installs so as to enable uninstallation. I think the latter alone should be enough.

      But to address some of your other comments: First of all you're confusing easy_install and setuptools (particularly the bdist_egg command that can be used to build binary egg distributions). It's true that until recently pip's inability to install binary packages has been a real shortcoming, especially for those of us who work with scientists ;) The idea of adding egg installation support to pip has been bandied about, but ultimately rejected for several reasons, the chief among them being that as nifty as the egg format is it has several major shortcomings, not the least of which being that it's not described by any accepted standard. Now that the wheel format has been developed (which *is* based on a standard and fixes some of the other problems with eggs) that's not going to be as much of an issue in the future.

      As for installing multiple versions of the same package side by side: that is one of the things that made eggs nifty. Unfortunately the feature is a bit fragile and only really works in some limited circumstances. I don't see a lot of people using that in the wild--instead they're more likely to just use virtualenv for exactly this purpose.

      As for the conflation of requirements.txt and install_requires: The latter is just a feature specific to setuptools for listing a package's runtime requirements. Not all packages use setuptools, and need another way to list their requirements. Also a requirements.txt allows listing any number of (seemingly) unrelated packages needed to set up a software stack for some particular purpose. They need not all be mutual requirements.

      Finally, one other feature that makes pip better (and this is just one): Unlike easy_install, pip actually checks when it installs a package that all of that package's requirements can be installed successfully as well. If any of the requirements fails to install it rolls back the entire installation and does not otherwise modify your system. easy_install does the exact opposite, and can leave a system in a broken state after a failed installation.

      (BTW, these are all good questions so +1 nonetheless)

    • Donald Stufft

      requirements.txt and install_requires serve different purposes.

      install_requires tell you abstract requirements, you don't know where they come from nor what package is going to fulfill them. Typically you want your install_requires to be as non specific as to version as possible to enable more widespread use.

      requirements.txt on the other hand take a set of abstract requirements and make them concrete. It does this by pairing the name and optional version spec from install_requires with an index url from --index-url.

  • chanux

    All this time I used virtualenv wrong. I had mysweetunicornproject_env per project. This was partially because I didn't really understand virtualenv and the lack of education :D.

    Thank you!

  • gamesbook

    I'd love to be able to print this for off-line reference; any chance you could add a print-friendly style sheet to your web site? Thanks!

  • n e lorenson

    Nice! Almost feels too easy...

  • liptonshmidt

    Beautiful, brilliant article. Thank you!

  • Nick

    I've come back to this post multiple times as a reminder of the basics of virtualenv. Thank you, I really appreciate it.

  • Teddy

    Note that the pyvenv script which comes with Python 3.3 does not include distribute/setuptools or pip, you have to install them to the venv yourself.

  • Rabbit

    great article! i've been using venv for several months, and only now do i feel like i get what's going on.

  • Srinath Krishna

    Nice article. But why shouldn't one commit the env directory with the project? If the project is so tied to the env in question, and assuming all developers work on similar platforms, it makes sense to keep the env with the project for easy access.

  • KT

    Excellent article. Thanks!

  • inoumba

    Brilliant! Great Job

  • muatik

    This is quite a concise article, thanks Matthews. I am wondering what are the differences between pyenv's virtual environment system and virtualenv. Could you also tell a little about this?

  • brownshoes

    Just wondering if you or someone could expand on this?

    Note: if you're using a version control system like git, you shouldn't commit the env directory. Add it to your .gitignore file (or similar).

    Why not include the env file in a commit? Aren't all the project files in the env dir? Thanks!

    • kr

      (answer also for Srinath Krishna)

      1. env is a directory, not a file.

      2. env and requirements.txt (and setup.py) contain some equivalent information

      if you have a requirements.txt you should be able to recreate env with

      virtualenv env
      env/bin/pip install -r requirements.txt

      if you have an environment you can create the matching requirements.txt with

      pip freeze > requirements.txt

      requirements.txt is a small text file defining the packages available at run-time.

      env is part of the run-time environment, it is a big (multi-megabyte) directory structure, containing a machine and directory specific python run-time.

      thus committing env is like committing the operating system together with the python program.

  • Tyler Waitt

    I wish every tutorial was like this. Love how you explain everything.

  • maxw

    Thanks, this was beautifully clear.

  • Alex

    Very nice. Python novice, confused about pip and virtualenv; this article helped me. Thank you for your work !

  • SAFEER

    Excellent article. helpful.

  • tohu777

    Excellent, concise, very helpful. Thanks!

  • Joachim Hagege

    Excellent job buddy, thank you very much!

  • Tushar

    This is great article and it really sparked my interest.. Great work Jamie .. Thanks for writing this one..

  • Anibalismo

    Muchas gracias por dedicarle tiempo a compartir información como esta!

  • Pavan

    Best blog one could find for pip and virtualenv awesome work! you made my day :)

  • 84pg

    Thanks so much. Such an awesome summary of what virtualenv does. :)

  • awqonre

    Lovely to have a clear and practical explanation. Thank you!

  • Kåre Jonsson

    Thanks for a fantastic article! Really good for a beginner like myself.

  • Vladimír Kroupa

    Great tutorial. I read it every time I need to refresh my knowledge about virtualenv.

  • kiran

    Excellent article that subsided my confusion and let me go ahead for python projects, thank you very much

  • JacksonTale

    awesome article, very clear explanation

  • Antriksh Yadav

    Thanks for this. As a beginner, I wasn't sure what virtualenv actually did in the directory and how I was supposed to use the folder structure it generated. You described it so well!

  • Bonnie Varghese

    Excellent article! I can confirm now I understand why we need virtualenv after all!

  • Ruben

    great Tutorial. easy to follow and never got bored. Thanks a ton.

  • hhh

    Great article ! Thank you for your teaching skills !

  • James

    This is a fantastic intro. I wish the rest of the development ecosystem had guides this solid. Thanks so much!

  • philippeowagner

    Great article - despite the age. I'd like to introduce virtualenv-mgr https://github.com/arteria/... at this point. virtualenv-mgr is a tool to manage multiple virtualenvs at once. Simply install, uninstall or upgrade specific packages in all virtualenvs at once. Print statistics about the usage of packages over all environments. Find/list virtualenvs for further processing, e.g. as input for virtualenv-mgr. Find all envs having a package installed.

  • micheal

    Wow, great blog post.Really looking forward to read more. Keep writing.
