A non-magical introduction to Pip and Virtualenv for Python beginners
One of the hurdles that new Python developers have to get over is understanding the Python packaging ecosystem. This blog post is based on material covered in our Python for Programmers training course, which attempts to explain pip and virtualenv for new Python users.
Python for Programmers is aimed at developers who are already familiar with one or more programming languages, and so we assume a certain amount of technical knowledge. It will help if you're reasonably comfortable with a command line. The examples below use
bash, which is the default shell on Macs and most Linux systems. But the commands are simple enough that the concepts should be transferable to any terminal, such as PowerShell for Windows.
Let's dive in. pip is a tool for installing Python packages from the Python Package Index.
PyPI (which you'll occasionally see referred to as The Cheeseshop) is a repository for open-source third-party Python packages. It's similar to RubyGems in the Ruby world, PHP's Packagist, CPAN for Perl, and NPM for Node.js.
Python actually has another, more primitive, package manager called
easy_install, which is installed automatically when you install Python itself. pip is vastly superior to
easy_install for lots of reasons, and so should generally be used instead. You can use
easy_install to install pip as follows:
$ sudo easy_install pip
You can then install packages with pip as follows (in this example, we're installing Django):
# DON'T DO THIS
$ sudo pip install django
Here, we're installing Django globally on the system. But in most cases, you shouldn't install packages globally. Read on to find out why.
virtualenv solves a very specific problem: it allows multiple Python projects that have different (and often conflicting) requirements to coexist on the same computer.
What problem does it solve?
To illustrate this, let's start by pretending virtualenv doesn't exist. Imagine we're going to write a Python program that needs to make HTTP requests to a remote web server. We're going to use the Requests library, which is brilliant for that sort of thing. As we saw above, we can use pip to install Requests.
But where on your computer does
pip install the packages to? Here's what happens if I try to run
pip install requests:
$ pip install requests
Downloading/unpacking requests
  Downloading requests-1.1.0.tar.gz (337kB): 337kB downloaded
  Running setup.py egg_info for package requests
Installing collected packages: requests
  Running setup.py install for requests
error: could not create '/Library/Python/2.7/site-packages/requests': Permission denied
Oops! It looks like
pip is trying to install the package into
/Library/Python/2.7/site-packages/requests. This is a special directory that Python knows about. Anything that's installed in
site-packages can be imported by your programs.
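If you're curious, you can see exactly which directories Python searches on import — the site-packages directories among them — from inside a Python shell:

```python
# Print the list of directories Python searches when importing modules;
# the site-packages directories normally appear in this list.
import sys

for directory in sys.path:
    print(directory)
```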
We're seeing the error because
/Library/ (on a Mac) is not usually writeable by "ordinary" users. To fix the error, we can run
sudo pip install requests (
sudo means "run this command as a superuser"). Then everything will work fine:
$ sudo pip install requests
Password:
Downloading/unpacking requests
  Running setup.py egg_info for package requests
Installing collected packages: requests
  Running setup.py install for requests
Successfully installed requests
Cleaning up...
This time it worked. We can now type
python and try importing our new library:
>>> import requests
>>> requests.get('http://dabapps.com')
<Response [200]>
So, we now know that we can
import requests and use it in our program. We go ahead and work feverishly on our new program, using
requests (and probably lots of other libraries from PyPI too). The software works brilliantly, we make loads of money, and our clients are so impressed that they ask us to write another program to do something slightly different.
But this time, we find a brand new feature that's been added to
requests since we wrote our first program that we really need to use in our second program. So we decide to upgrade the
requests library to get the new feature:
$ sudo pip install --upgrade requests
Everything seems fine, but we've unknowingly created a disaster!
The next time we run our original program (the one that made us loads of money), we discover that it has completely stopped working and raises errors. Why? Because something in the API of the
requests library has changed between the previous version and the one we just upgraded to. It might only be a small change, but it means our code no longer uses the library correctly. Everything is broken!
Sure, we could fix the code in our first program to use the new version of the
requests API, but that takes time and distracts us from our new project. And, of course, a seasoned Python programmer won't just have two projects but dozens - and each project might have dozens of dependencies! Keeping them all up-to-date and working with the same versions of every library would be a complete nightmare.
How does virtualenv help?
virtualenv solves this problem by creating a completely isolated virtual environment for each of your programs. An environment is simply a directory that contains a complete copy of everything needed to run a Python program, including a copy of the
python binary itself, a copy of the entire Python standard library, a copy of the
pip installer, and (crucially) a copy of the
site-packages directory mentioned above. When you install a package from PyPI using the copy of
pip that's created by the
virtualenv tool, it will install the package into the
site-packages directory inside the virtualenv directory. You can then use it in your program just as before.
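Later on, if you're ever unsure which copy of Python (and therefore which environment) you're actually running, you can ask the interpreter itself:

```python
# Show which interpreter is running and which environment it belongs to.
import sys

print(sys.executable)  # path to the python binary currently in use
print(sys.prefix)      # root directory of the active environment
```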
How can I install virtualenv?
If you already have
pip, the easiest way is to install it globally with
sudo pip install virtualenv. Usually, pip and
virtualenv are the only two packages you ever need to install globally, because once you've got both of these you can do all your work inside virtual environments.
virtualenv comes with a copy of
pip which gets copied into every new environment you create, so
virtualenv is really all you need. You can even install it as a separate standalone package (rather than from PyPI). This might be easier for Windows users. See virtualenv.org for instructions.
How do I create a new virtual environment?
You only need the
virtualenv tool itself when you want to create a new environment. This is really simple. Start by changing directory into the root of your project directory, and then use the
virtualenv command-line tool to create a new environment:
$ cd ~/code/myproject/
$ virtualenv env
New python executable in env/bin/python
Installing setuptools............done.
Installing pip...............done.
env is just the name of the directory you want to create your virtual environment inside. It's a common convention to call this directory
env, and to put it inside your project directory (so, say you keep your code at
~/code/projectname/, the environment will be at
~/code/projectname/env/ - each project gets its own
env). But you can call it whatever you like and put it wherever you like!
Note: if you're using a version control system like
git, you shouldn't commit the
env directory. Add it to your
.gitignore file (or similar).
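With git, that's a one-line addition (assuming your environment directory is called env):

```shell
# Append the environment directory to .gitignore so git ignores it.
echo 'env/' >> .gitignore
```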
How do I use my shiny new virtual environment?
If you look inside the
env directory you just created, you'll see a few subdirectories:
$ ls env
bin include lib
The one you care about the most is
bin. This is where the local copy of the
python binary and the
pip installer exists. Let's start by using the copy of
pip to install
requests into the virtualenv (rather than globally):
$ env/bin/pip install requests
Downloading/unpacking requests
  Downloading requests-1.1.0.tar.gz (337kB): 337kB downloaded
  Running setup.py egg_info for package requests
Installing collected packages: requests
  Running setup.py install for requests
Successfully installed requests
Cleaning up...
It worked! Notice that we didn't need to use
sudo this time, because we're not installing
requests globally, we're just installing it inside our home directory.
Now, instead of typing
python to get a Python shell, we type
env/bin/python, and then...
>>> import requests
>>> requests.get('http://dabapps.com')
<Response [200]>
But that's a lot of typing!
virtualenv has one more trick up its sleeve. Instead of typing
env/bin/pip every time, we can run a script to activate the environment. This script, which can be executed with
source env/bin/activate, simply adjusts a few variables in your shell (temporarily) so that when you type
python, you actually get the Python binary inside the virtualenv instead of the global one:
$ which python
/usr/bin/python
$ source env/bin/activate
$ which python
/Users/jamie/code/myproject/env/bin/python
So now we can just run
pip install requests (instead of
env/bin/pip install requests) and
pip will install the library into the environment, instead of globally. The adjustments to your shell only last for as long as the terminal is open, so you'll need to remember to rerun
source env/bin/activate each time you close and open your terminal window. If you switch to work on a different project (with its own environment) you can run
deactivate to stop using one environment, and then
source env/bin/activate to activate the other.
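If you're wondering what activation actually does: in essence, the script just remembers your old PATH and puts the environment's bin directory in front of it, so the env's binaries shadow the global ones. A simplified sketch (not the real activate script):

```shell
# Simplified sketch of what `source env/bin/activate` does.
VIRTUAL_ENV="$PWD/env"          # the environment being activated
_OLD_PATH="$PATH"               # saved so deactivate can restore it
PATH="$VIRTUAL_ENV/bin:$PATH"   # env's binaries now come first

# `python` and `pip` now resolve to the copies in env/bin first:
echo "$PATH" | cut -d: -f1      # first PATH entry is .../env/bin

# ...and deactivate (roughly) just puts the old PATH back:
PATH="$_OLD_PATH"
```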
Activating and deactivating environments does save a little typing, but it's a bit "magical" and can be confusing. Make your own decision about whether you want to use it.
virtualenv and pip make great companions, especially when you use the
requirements feature of pip. Each project you work on has its own
requirements.txt file, and you can use this to install the dependencies for that project into its virtual environment:
env/bin/pip install -r requirements.txt
See the pip documentation for more details.
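For reference, a requirements.txt file is just a plain-text list of package names, one per line, usually pinned to exact versions with `==` (the packages and versions below are illustrative):

```
requests==1.1.0
django==1.5
```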
- pip is a tool for installing packages from the Python Package Index.
- virtualenv is a tool for creating isolated Python environments containing their own copy of
pip, and their own place to keep libraries installed from PyPI.
- It's designed to allow you to work on multiple projects with different dependencies at the same time on the same machine.
- You can see instructions for installing it at virtualenv.org.
- After installing it, run
virtualenv env to create a new environment inside a directory called env.
- You'll need one of these environments for each of your projects. Make sure you exclude these directories from your version control system.
- To use the versions of python and
pip inside the environment, type env/bin/python and env/bin/pip respectively.
- You can "activate" an environment with
source env/bin/activate and deactivate one with
deactivate. This is entirely optional but might make life a little easier.
pip and virtualenv are indispensable tools if you're a regular Python user. Both are fairly simple to understand, and we highly recommend getting to grips with them.
If this blog post has sparked your interest in learning Python, check out our Python for Programmers workshop at DabApps HQ in Brighton.
Very nice article, thanks for writing it. I used virtualenv once because I was using pypy. I did it as a global virtualenv, which after reading this article probably isn't the best way to do it. Any advice on organizing a pypy virtualenv would be helpful.
This is quite easy, you give virtualenv the path to the pypy interpreter using the -p flag, e.g.
virtualenv -p /path/to/pypy env
Excellent tutorial, new to python, and I always hated when things just install without letting you know where it's going..and next time there is some upgrade all sorts of weird errors keep coming..Thanks a lot.
Virtualenvwrapper is pretty sweet too!
Yes it is :D
Why no mention of venv? venv is included by default in python 3.3 :)
Or even better, activate your environments automatically. I wrote about it here: http://www.burgundywall.com...
The best kind of article/tutorial; one that is simple, direct, and actively avoids/dispels magic by assuming minimal background knowledge.
This is an excellent article. This clarifies some of the stuff I simply skim over in other tutorials. Thanks so much!
I would like to just give a different advice regarding virtualenvs and installing dependencies:
When you create the virtualenv, the current package you're working on doesn't get added to site-packages, so you're forced to be at the repository root to import the package.
The best approach is to have a proper setup.py file so you can do `python setup.py develop`, which will link the package you're working on into the virtualenv's site-packages. This way it acts as if it's installed, and you can import it regardless of cwd.
If you define your requirements on the setup.py (I think you should), you can even skip the `pip install -r requirements.txt` step.
I've cooked up a package template that can help get this working:
Just one correction on this: easy_install is *not* installed just "when you install Python". It's only installed when you install either distribute or setuptools. I think most Python installations these days will come with one of those almost by default, but not if, say, you're installing Python from python.org on a Mac or Windows.
Typically if you're building up a Python installation from scratch and not using something like MacPorts or your OS packaging system, you can install distribute by downloading distribute_setup.py (the first hit on Google should be the current version) and then running `python distribute_setup.py` with your Python of choice. *Then* you can use `easy_install pip` to get pip. So distribute_setup.py -> easy_install -> pip.
(The reasons for a lot of this are historical, and there are efforts underway to simplify the whole issue in the future.)
You say that pip is "vastly superior to easy_install", but never give an example as to why! The only advantages of pip that I see is that the console output is nicer, and that you can uninstall packages. But the console output is not really important, and you can just uninstall packages by removing their folders/eggs, so it's not *really* a big gain over easy_install.
easy_install, on the other hand, lets you install multiple versions of the same package, which can be a life-saver sometimes. Additionally, easy_install lets you create binary, pre-compiled packages (say your package has C extensions). This means people installing the package don't require a complete build environment with compilers and the required headers.
Another gripe I have is that pip can install packages using both the "requirements.txt" file *and* the setup "install_requires" value. Why this ambiguity? What makes requirements.txt better than "install_requires"?
Personally, I use pip as well as everyone keeps saying it is the "new hotness", and that easy_install will disappear. But I have to say that I prefer easy_install feature-wise.
Am I not seeing an essential feature which makes pip the clear choice for the future? What is it that makes it "vastly superior"?
Just a few responses to this: Yes, you already hit two of the reasons pip is really better: More sane and controllable console output, and the fact that it actually keeps a record of what files it installs so as to enable uninstallation. I think the latter alone should be enough.
But to address some of your other comments: First of all you're confusing easy_install and setuptools (particularly the bdist_egg command that can be used to build binary egg distributions). It's true that until recently pip's inability to install binary packages has been a real shortcoming, especially for those of us who work with scientists ;) The idea of adding egg installation support to pip has been bandied about, but ultimately rejected for several reasons, the chief among them being that as nifty as the egg format is it has several major shortcomings, not the least of which being that it's not described by any accepted standard. Now that the wheel format has been developed (which *is* based on a standard and fixes some of the other problems with eggs) that's not going to be as much of an issue in the future.
As for being able to install multiple versions of the same package side by side: that is indeed one of the things that made eggs nifty. Unfortunately the feature is a bit fragile and only really works in some limited circumstances. I don't see a lot of people using that in the wild--instead they're more likely to just use virtualenv for exactly this purpose.
As for the conflation of requirements.txt and install_requires: The latter is just a feature specific to setuptools for listing a package's runtime requirements. Not all packages use setuptools, and so they need another way to list their requirements. Also, a requirements.txt allows listing any number of (seemingly) unrelated packages needed to set up a software stack for some particular purpose. They need not all be mutual requirements.
Finally, one other feature that makes pip better (and this is just one): Unlike easy_install, pip actually checks when it installs a package that all of that package's requirements can be installed successfully as well. If any of the requirements fails to install it rolls back the entire installation and does not otherwise modify your system. easy_install does the exact opposite, and can leave a system in a broken state after a failed installation.
(BTW, these are all good questions so +1 nonetheless)
requirements.txt and install_requires serve different purposes.
install_requires tells you about abstract requirements: you don't know where they come from nor what package is going to fulfill them. Typically you want your install_requires to be as unspecific about versions as possible, to enable more widespread use.
requirements.txt on the other hand take a set of abstract requirements and make them concrete. It does this by pairing the name and optional version spec from install_requires with an index url from --index-url.
All this time I used virtualenv wrong. I had mysweetunicornproject_env per project. This was partially because I didn't really understand virtualenv and the lack of education :D.
I'd love to be able to print this for off-line reference; any chance you could add a print-friendly style sheet to your web site? Thanks!
Nice! Almost feels too easy...
Beautiful, brilliant article. Thank you!
See the below link,
I've come back to this post multiple times as a reminder of the basics of virtualenv. Thank you, I really appreciate it.
Note that the pyvenv script which comes with Python 3.3 does not include distribute/setuptools or pip, you have to install them to the venv yourself.
great article! i've been using venv for several months, and only now do i feel like i get what's going on.
Nice article. But why shouldn't one commit the env directory with the project? If the project is so tied to the env in question, and assuming all developers work on similar platforms, it makes sense to keep the env with the project for easy access.
Excellent article. Thanks!
Brilliant! Great Job
This is quite a concise article, thanks Matthews. I am wondering what are the differences between pyenv's virtual environment system and virtualenv. Could you also tell a little about this?
Just wondering if you or someone could expand on this?
Note: if you're using a version control system like git, you shouldn't commit the env directory. Add it to your .gitignore file (or similar).
Why not include the env file in a commit? Aren't all the project files in the env dir? Thanks!
(answer also for Srinath Krishna)
1. env is a directory, not a file.
2. env and requirements.txt (and setup.py) contain some equivalent information
if you have a requirements.txt you should be able to recreate env with
env/bin/pip install -r requirements.txt
if you have an environment you can create the matching requirements.txt with
pip freeze > requirements.txt
requirements.txt is a small text file defining the packages available at run-time.
env is part of the run-time environment, it is a big (multi-megabyte) directory structure, containing a machine and directory specific python run-time.
thus committing env is like committing the operating system together with the python program.
I wish every tutorial was like this. Love how you explain everything.
Thanks, this was beautifully clear.
Very nice. Python novice, confused about pip and virtualenv; this article helped me. Thank you for your work !
Excellent article. helpful.
Excellent, concise, very helpful. Thanks!
Excellent job buddy, thank you very much!
This is great article and it really sparked my interest.. Great work Jamie .. Thanks for writing this one..
Muchas gracias por dedicarle tiempo a compartir información como esta!
Best blog one could find for pip and virtualenv awesome work! you made my day :)
Thanks so much. Such an awesome summary of what virtualenv does. :)
Lovely to have a clear and practical explanation. Thank you!
Thanks for a fantastic article! Really good for a beginner like myself.
Great tutorial. I read it every time I need to refresh my knowledge about virtualenv.
Excellent article that subsided my confusion and let me go ahead for python projects, thank you very much
awesome article, very clear explanation
Thanks for this. As a beginner, I wasn't sure what virtualenv actually did in the directory and how I was supposed to use the folder structure it generated. You described it so well!
Excellent article! I can confirm now I understand why we need virtualenv after all!
great Tutorial. easy to follow and never got bored. Thanks a ton.
Great article ! Thank you for your teaching skills !
This is a fantastic intro. I wish the rest of the development ecosystem had guides this solid. Thanks so much!
Great article - despite the age. I'd like to introduce virtualenv-mgr (https://github.com/arteria/...) at this point. virtualenv-mgr is a tool to manage multiple virtualenvs at once. Simply install, uninstall or upgrade specific packages in all virtualenvs at once. Print statistics about the usage of packages over all environments. Find/list virtualenvs for further processing, e.g. as input for virtualenv-mgr. Find all envs having a package installed.
Wow, great blog post. Really looking forward to reading more. Keep writing.