How not to program in python
TL;DR
Whatever you do, make sure you are using versioned python packages, even for simple tasks. And use pip+virtualenv.
So you want to program in python...
It seems like only yesterday, and not 7 years ago, that I decided to learn python. I may not be the best python programmer, but I have made probably every mistake you can, so here are a bunch of things not to do, and a few things you should be doing.
Don’t: write python ‘scripts’
Don’t write programs like this:
temp = input("C: ")
print temp*9/5+32
The way you fix that is not by writing the following:
if __name__ == "__main__":
    temp = input("C: ")
    print temp*9/5+32
And don’t write this either:
def main():
    temp = input("C: ")
    print temp*9/5+32

if __name__ == "__main__":
    main()
No matter how good your logic is, if you couple the logic with your input and output you are painting yourself into a corner. I’ve seen people write scripts like this, and then have other scripts call them using os.system. In a loop. Then they wonder why python is so slow.
Do: Write python modules and packages
Minimally this could look something like:
def ctof(temp):
    return temp*9/5+32

def main():
    temp = input("C: ")
    print ctof(temp)

if __name__ == "__main__":
    main()
Even better would be to have main parse sys.argv rather than working interactively. For simple interactive tools it is hard to beat the cmd module.
Now you have an (albeit poorly named) python module that can properly be imported from a larger program:
>>> import temp
>>> print temp.ctof(100)
212
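As a rough sketch of the sys.argv suggestion above, main can take its input from the command line instead of prompting. This version assumes Python 3 (unlike the Python 2 snippets in this post), where / is true division, so results come back as floats. The use of argparse and the optional argv parameter are assumptions made for illustration, not the only way to do it:

```python
import argparse


def ctof(temp):
    """Convert degrees Celsius to degrees Fahrenheit."""
    return temp * 9 / 5 + 32


def main(argv=None):
    """Read a Celsius value from argv and print it in Fahrenheit.

    When argv is None, argparse falls back to sys.argv[1:].
    """
    parser = argparse.ArgumentParser(description="Convert Celsius to Fahrenheit.")
    parser.add_argument("celsius", type=float, help="temperature in degrees Celsius")
    args = parser.parse_args(argv)
    print(ctof(args.celsius))
```

Calling main(["100"]) prints 212.0. Accepting argv as a parameter keeps main callable from tests and from other programs; to run the file directly, call main() from the usual if __name__ == "__main__": guard.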
Don’t: mess with PYTHONPATH
Now that you have a module you can import, what do you do with it? For years my development/production environment consisted of the following: a lib directory containing modules and packages and a util directory containing scripts that used those modules. This worked fine for a long time, especially when I only had one machine. When I got more systems, I used the high tech method of rsync'ing the entire directory tree to /srv/python or ~/python/ and mucking with the python path. This system worked, but had a number of problems:
- If I wanted to run a program on a new system, I had to rsync the entire directory tree.
- Since there was no dependency information, the first time I wanted to share a program I wrote, I had to figure out the dependencies manually.
- I had no idea what modules were being used, and which were obsolete.
- When I started writing test code and documentation, I did not have a good place to store them. I used a single directory for all my tiny modules because one directory per module seemed like overkill at the time.
- When the version of python on the system was upgraded, bad things happened.
It’s very tempting to simply throw all of your python code into a single directory tree, but that method only causes problems later on.
Do: Create python modules
For the example above, we can write a simple setup.py file:
from setuptools import setup  # entry_points needs setuptools; plain distutils ignores it

setup(name="temp",
    version="1.0",
    py_modules = ["temp"],
    entry_points = {
        'console_scripts': [
            'ctof = temp:main',
        ]
    },
)
If you have a full package instead of a single file module, you should use packages and not py_modules. The official documentation should be read if you are doing anything more complicated. There are fields for your name, short and long descriptions, licensing information, etc. This example was kept purposely short to make it clear that there is not much you actually have to do to get started. Even a barebones setup.py is better than no setup.py.
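For a full package, a slightly fuller setup.py might look like the sketch below. It assumes setuptools and a hypothetical package directory named tempconv containing __init__.py and cli.py; the author, description, and license values are placeholders:

```python
from setuptools import setup

# Assumed layout (hypothetical):
#   setup.py
#   tempconv/__init__.py
#   tempconv/cli.py      <- defines main()
setup(
    name="tempconv",
    version="1.0",
    packages=["tempconv"],  # a package directory, so packages rather than py_modules
    author="Your Name",     # placeholder
    description="Temperature conversion helpers",  # placeholder short description
    license="MIT",          # placeholder
    entry_points={
        "console_scripts": [
            "ctof = tempconv.cli:main",
        ]
    },
)
```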
Don’t: use ‘scripts’ in setup.py (Do: Use entry points)
console_scripts entry_points should be preferred over the ‘scripts’ that setup.py can install. The last time I tried, scripts did not get correctly installed on Windows systems, but console_scripts did.
Additionally, the more code you have in scripts, the less testable code you
have in your modules. When you use scripts, eventually you will get to the
point where they all contain something similar to:
from mypackage.commands import frob
frob()
and at that point, you are just re-implementing what console_scripts does for you.
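To make it concrete what console_scripts saves you from writing, here is a rough sketch of how a spec such as 'ctof = temp:main' maps to a callable. resolve_entry_point is a hypothetical helper written for this illustration, not the setuptools implementation; the installed wrapper does roughly this lookup and then calls the function with no arguments:

```python
import importlib


def resolve_entry_point(spec):
    """Split 'name = module:function' and import the named function.

    Simplified: real tools also accept dotted attribute paths and extras.
    """
    name, _, target = spec.partition("=")
    module_name, _, func_name = target.strip().partition(":")
    module = importlib.import_module(module_name)
    return name.strip(), getattr(module, func_name)


# Demonstrated with a stdlib function so the sketch runs anywhere;
# for this post the spec would be "ctof = temp:main".
name, func = resolve_entry_point("dumps = json:dumps")
print(name, func([1, 2]))
```

Because the target is an ordinary function in an importable module, it can be unit tested like any other code, which is not true of logic buried in a standalone script.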
Do: Version your packages and depend on specific versions.
So, after years of doing-the-wrong-thing, I finally created proper packages for each of my libraries and tools. Shortly after that I started having problems again. While I had been versioning all of my packages, any package that required another package simply depended on the package name and not any specific version of it. This created problems any time I would add new features. I would install the latest version of a utility package on a server, and it would crash since I had forgotten to upgrade the library it depended on. Since I wasn’t syncing the entire directory tree anymore, libraries were becoming out of date.
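One way to avoid this is to declare the required versions in setup.py through the setuptools install_requires field. The sketch below assumes a hypothetical utility package mytool that needs at least version 1.2 of a hypothetical library mylib:

```python
from setuptools import setup

setup(
    name="mytool",  # hypothetical utility package
    version="2.0",
    py_modules=["mytool"],
    # Depend on a version range, not just the package name.
    # "mylib" is a hypothetical dependency; "mylib==1.2" would pin it exactly.
    install_requires=["mylib>=1.2"],
    entry_points={
        "console_scripts": ["mytool = mytool:main"],
    },
)
```

With a requirement like this, installing mytool 2.0 with pip should pull in a new enough mylib instead of leaving a stale copy in place.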
Don’t: install packages system wide (Do: Use virtualenv and pip)
Once you get to the point where you are using versioned packages, you’ll want to be able to install different versions of modules under different python versions. When I was simply sticking everything under /srv/python it was next to impossible to have multiple versions of python. I could change PYTHONPATH to point somewhere else, but there was no easy way to maintain two completely different trees of modules.
It is extremely simple to get started using pip and virtual environments. You can use the -E option to create a virtual environment and install a package in one command. The -E option to pip creates a virtual environment if one doesn’t already exist:
justin@eee:~/tmp$ pip -E python_env install bottle
Creating new virtualenv environment in python_env
New python executable in python_env/bin/python
Installing distribute...done........................
Downloading/unpacking bottle
Downloading bottle-0.9.5.tar.gz (45Kb): 45Kb downloaded
Running setup.py egg_info for package bottle
Installing collected packages: bottle
Running setup.py install for bottle
Successfully installed bottle
Cleaning up...
justin@eee:~/tmp$ ./python_env/bin/python
>>> import bottle
>>> bottle.__file__
'/home/justin/tmp/python_env/lib/python2.7/site-packages/bottle.pyc'
>>>
I can use that same method to install the toy module I wrote for this post as well:
justin@eee:~/tmp$ pip -E python_env install ~/tmp/post/temp_mod/
Unpacking ./post/temp_mod
Running setup.py egg_info for package from file:///home/justin/tmp/post/temp_mod
Installing collected packages: temp
Running setup.py install for temp
Installing ctof script to /home/justin/tmp/python_env/bin
Successfully installed temp
Cleaning up...
pip was also nice enough to install my console_script:
justin@eee:~/tmp$ ./python_env/bin/ctof
C: 34
93
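Newer pip releases no longer accept -E (it was removed in pip 1.1), so the commands above will fail on a current install. A rough equivalent is sketched below, assuming Python 3 and its standard venv module rather than the virtualenv tool used in this post; create_env and install are hypothetical helper names:

```python
import subprocess
import sys
import venv
from pathlib import Path


def create_env(env_dir, with_pip=True):
    """Create a virtual environment and return the path to its interpreter."""
    venv.create(str(env_dir), with_pip=with_pip)
    if sys.platform == "win32":
        return Path(env_dir) / "Scripts" / "python.exe"
    return Path(env_dir) / "bin" / "python"


def install(env_python, requirement):
    """Install a requirement (a name or a local path) into that environment."""
    subprocess.check_call([str(env_python), "-m", "pip", "install", requirement])
```

Something like install(create_env("python_env"), "bottle") would then mirror the pip -E python_env install bottle session, given network access. Running pip through the environment's own interpreter is what keeps the package out of the system-wide site-packages.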
Too long; Did read
The barrier to entry for python is a lot lower compared to a language like java or c++. It’s true that helloworld is simply:
print("Hello, World")
However, if you plan on using python for anything more complicated, you will want to learn how to take advantage of modules and packages. Python doesn’t force you to do this, but not doing so can quickly turn into a maintenance nightmare.