Monday, April 04, 2011

PyCon: Porting to Python 3

Porting to Python 3

py3ksupport.appspot.com has a list of the top 50 Python projects and shows which of them support Python 3. As of the talk, 34% of the top 50 Python projects supported Python 3. I just checked, and it's up to 54%.

There are multiple strategies for porting to Python 3:
  • Only support Python 3.
  • Use separate trees for Python 2 and Python 3.
  • Include both versions in a single download and set package_dir in setup.py.
  • Implement "continuous conversion" using 2to3. This approach is recommended for libraries. Distribute can help.
  • Use a single codebase with no conversion. This requires loads of compatibility hacks. It's fun, but it's ugly. Check out the "six" project if that's what you want to do.
Try 2to3 first. If in doubt, use distribute.
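
To give a feel for the "compatibility hacks" that the single-codebase option needs, here is a minimal hand-rolled sketch. The names PY3, text_type, binary_type, and ensure_text are my own picks for illustration; this is not code from the talk, and it is not how "six" itself is implemented.

```python
import sys

# True when running under Python 3 (or later).
PY3 = sys.version_info[0] >= 3

if PY3:
    text_type = str
    binary_type = bytes
else:
    text_type = unicode  # noqa: F821 (this name only exists on Python 2)
    binary_type = str


def ensure_text(value, encoding="utf-8"):
    """Return value as text, decoding it first if it is binary data."""
    if isinstance(value, binary_type):
        return value.decode(encoding)
    return value
```

Every place that touches strings ends up going through helpers like these, which is why this approach gets ugly quickly.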

Libraries should port as soon as possible.

In order to prepare, use Python 2.7 with the -3 flag. Fix all the deprecation warnings.

Use separate variables for string vs. binary data.
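
A small sketch of what that can look like in practice. The function and variable names are made up for the example; the point is only that bytes and text never share a name.

```python
def read_greeting(raw_bytes, encoding="utf-8"):
    # raw_bytes holds binary data exactly as it arrived (file, socket, ...).
    greeting_text = raw_bytes.decode(encoding)  # text gets its own variable
    return greeting_text.upper()


payload_bytes = u"h\u00e9llo".encode("utf-8")  # binary data
message_text = read_greeting(payload_bytes)    # text
```

Decoding once at the boundary and keeping the two kinds of data apart makes it much easier to see where a bytes value is about to be mixed with text.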

Add "b" or "t" to file mode flags so that it's explicit whether each file is opened as binary or as text.
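
For example, something along these lines. The file name and contents are placeholders, and I'm using a temporary directory just to keep the sketch self-contained.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "demo.dat")

# "wb": write raw bytes; nothing is encoded for us.
with open(path, "wb") as f:
    f.write(b"line one\nline two\n")

# "rt": read text; on Python 3 this gives str.
with open(path, "rt") as f:
    lines = f.read().splitlines()

# "rb": read the same file back as bytes.
with open(path, "rb") as f:
    raw = f.read()
```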

For sort, switch from "cmp" to "key".
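
Here is a rough before-and-after for the sort change. The list of names is invented; functools.cmp_to_key is a standard library helper for the cases where rewriting a comparison function as a key isn't practical.

```python
from functools import cmp_to_key

names = ["bob", "Alice", "carol"]

# Python 2 only (the cmp argument is gone in Python 3):
#     names.sort(cmp=lambda a, b: cmp(a.lower(), b.lower()))

# Key-based version: compute one sort key per item.
by_key = sorted(names, key=str.lower)


def compare_lower(a, b):
    """Old-style comparison: negative, zero, or positive."""
    a, b = a.lower(), b.lower()
    return (a > b) - (a < b)


# Fallback when an existing comparison function has to be kept.
by_cmp = sorted(names, key=cmp_to_key(compare_lower))
```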

Reminder: __foo__ is often pronounced "dunder foo" (aka "double under foo").

Increase your test coverage. This significantly helps!

Use "2to3 -w ." to port an entire directory.

With distribute, pass use_2to3=True to setup() in setup.py.
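
A sketch of how that can look in setup.py. "example_pkg" is a placeholder, and I've left the actual setup() call as a comment so the snippet can be read (and run) without installing anything. Keep in mind that this is a distribute-era option; newer setuptools releases no longer accept use_2to3.

```python
# setup.py (sketch; "example_pkg" is a placeholder name)
import sys

setup_kwargs = dict(
    name="example_pkg",
    version="0.1",
    packages=["example_pkg"],
)

# Only ask for the 2to3 conversion when installing under Python 3.
if sys.version_info[0] >= 3:
    setup_kwargs["use_2to3"] = True

# The real file would finish with:
#     from setuptools import setup
#     setup(**setup_kwargs)
```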

Don't reach for urlparse through urllib2 (as in urllib2.urlparse). Import the urlparse module directly.
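
In Python 3 the urlparse module lives at urllib.parse, and 2to3 rewrites a direct import for you. If you are going the single-codebase route instead, a guarded import along these lines should work (the URL is just the porting HOWTO used as sample input):

```python
try:
    from urllib.parse import urlparse  # Python 3 location
except ImportError:
    from urlparse import urlparse      # Python 2 module

parts = urlparse("http://docs.python.org/py3k/howto/pyporting.html")
host = parts.netloc
```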

Bytes vs. unicode is the hard part.

Using == to compare bytes and str doesn't work in Python 3. b"a"[0] does not equal "a"[0]: indexing bytes gives an integer, while indexing str gives a one-character string. Hence, the comparison silently returns False instead of raising an error.
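
Here is a short illustration of that pitfall under Python 3, along with two ways to make the comparison mean something. The variable names are only for the example.

```python
data = b"a"
text = "a"

first_byte = data[0]   # an int on Python 3
first_char = text[0]   # a one-character str

# No exception is raised; the result is simply False.
silent_mismatch = (first_byte == first_char)

# Option 1: slice instead of indexing, and compare bytes with bytes.
same_as_bytes = (data[0:1] == text.encode("ascii"))

# Option 2: decode first, and compare text with text.
same_as_text = (data.decode("ascii") == text)
```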

Trying to support Python 2.5 and Python 3 at the same time is REALLY hard. It's much easier to support Python 2.7 and Python 3.0 at the same time.

Seeking in a UTF-8 text file is slow because it involves decoding, not simple indexing.
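
The reason is that a character position and a byte offset stop matching as soon as there is a non-ASCII character. A tiny sketch with a made-up string:

```python
text = u"na\u00efve caf\u00e9"   # "naive cafe" with two accented letters
encoded = text.encode("utf-8")

char_index = 6  # position of the "c" when counting characters

# To find the matching byte offset, everything before it has to be
# encoded (or, going the other way, decoded); it can't just be indexed.
byte_offset = len(text[:char_index].encode("utf-8"))

char_count = len(text)      # counts characters
byte_count = len(encoded)   # counts bytes; each accented letter takes two
```

Here the character at index 6 sits at byte offset 7, because the accented letter before it takes two bytes.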

See http://docs.python.org/py3k/howto/pyporting.html.

There's a book called "Porting to Python 3".
