[pycrypto] Once again: Python3 with PyCrypto

Thorsten Behrens sbehrens at gmx.li
Tue Dec 28 15:22:17 CST 2010


How about them apples.

D:\Users\Soenke\Documents\Source\pycrypto>python --version
Python 3.1.3

D:\Users\Soenke\Documents\Source\pycrypto>python setup.py test
running test
..................................................................................................................................
..................................................................................................................................
..................................................................................................................................
..................................................................................................................................
..................................................................................................................................
..................................................................................................................................
.............................................................................................................SelfTest: 
You can ignore the RandomPool_DeprecationWarning that follows.
build\lib.win-amd64-3.1\Crypto\Util\randpool.py:40: 
RandomPool_DeprecationWarning: This application uses RandomPool, which 
is BROKEN in older releases.  See http://www.pycrypto.org/randpool-broken
   RandomPool_DeprecationWarning)
.....................
----------------------------------------------------------------------
Ran 910 tests in 28.723s

OK

Alright, allow me to bask in this for a second. Okay, I'm good.

This also still builds and tests correctly on Python 2.7, 2.2 and 2.1.

I'll commit this to the py3k branch. It's not ready to be merged yet. 
Here's the TODO as I have it:

TODO:
- Check for type of string in functions and throw an error when it's not 
correct. While at it, ensure that functions observe the guidelines below 
re type.
   This is friendlier than just relying on Python's errors.
- Document the expected types for functions. The cipher, the key and the 
input texts are byte-strings. Plaintext decodes are currently 
byte-strings; this needs review.
   In keeping with how Python 3.x's hash functions work, the input MODE 
is a text string. hexdigest() returns a text string, and digest() 
returns a byte-string.
- Compile and test _fastmath.c
- Go through test cases and see which modules are not covered

Volunteers for any of those?
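
For the first TODO item, a type check could look roughly like the sketch 
below. The helper name require_bytes and the wording of the message are 
made up for illustration; nothing like this is in the tree yet.

```python
def require_bytes(value, name):
    """Return value unchanged if it is a byte string, else raise TypeError.

    Hypothetical helper: the name and message are illustrative only.
    """
    if not isinstance(value, bytes):
        raise TypeError("%s must be a byte string, not %s"
                        % (name, type(value).__name__))
    return value


# A byte-string key passes straight through.
key = require_bytes(b"0123456789abcdef", "key")
```

On 3.x, passing a text string such as "secret" raises a TypeError that 
names the offending argument, which is easier to act on than whatever 
error shows up further down the call chain.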

Here is a writeup of what was changed:

Py code:

setup.py invokes 2to3 automatically. This handles int/long and print 
issues, among others.
setup.py will touch nt.py on win32 after build and build again. This is 
necessary so 2to3 can do its magic on that file.

There are still a lot of places in the code that need manual attention 
even with 2to3. They mostly have to do with string (2.x) vs. 
bytes/unicode (3.x) representation.

Use "if sys.version_info[0] == 2:" where needed. Ideally, most of the 
conditional code can be in py3compat.

Replace str(x) with bstr(x) if bytes were intended. Becomes str(x) in 
2.x and bytes(x) in 3.x through py3compat module.
Replace chr(x) with bchr(x) if bytes were intended. Becomes chr(x) in 
2.x and bytes([x]) in 3.x through py3compat module.
Replace ord(x) with bord(x) if bytes were intended. Becomes ord(x) in 
2.x and x in 3.x through py3compat module.
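
A minimal sketch of what these three helpers amount to, written from the 
description above rather than copied from py3compat, so treat the exact 
layout as an assumption:

```python
import sys

if sys.version_info[0] == 2:
    # 2.x: str is already a byte string, so these are thin wrappers.
    def bchr(x):
        return chr(x)

    def bord(x):
        return ord(x)

    def bstr(x):
        return str(x)
else:
    # 3.x: build and inspect bytes objects explicitly.
    def bchr(x):
        return bytes([x])   # one-byte bytes object from an int 0..255

    def bord(x):
        return x            # indexing bytes already yields an int

    def bstr(x):
        return bytes(x)     # expects a buffer or an iterable of ints


data = bchr(0x30) + bchr(0x82)   # two-byte byte string
first = bord(data[0])            # integer value of the first byte
```

Note that on 3.x bytes(x) does not accept a text string without an 
encoding, so bstr() is only a drop-in for str() where x is already 
bytes-like.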

Comparing a string index to a string literal needs to be changed in 3.x, 
as b'string'[0] returns an integer, not b's'.
The comparison can be fixed by indexing the right side, too: "if 
s[0]==b('\x30')[0]:" or "if self.typeTag!=self.typeTags['SEQUENCE'][0]:"
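
To show the difference, here is a small sketch for 3.x only. It uses 
plain b'' literals instead of the b() helper to keep it short, and the 
variable names are made up for illustration:

```python
der = b"\x30\x03\x02\x01\x05"

# Indexing bytes yields an int on 3.x, so this compares int with bytes
# and is False even though the first byte is 0x30.
naive = der[0] == b"\x30"

# Fix from the text: index the right-hand side too (int compared to int).
fixed = der[0] == b"\x30"[0]

# Alternative not mentioned above: slice instead of index, which keeps
# both sides as byte strings.
sliced = der[0:1] == b"\x30"
```

Here naive is False while fixed and sliced are both True.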

String literals need to be bytes if bytes were intended.
Replace "x" with b("x") if bytes were intended. Becomes "x" in 2.x, and 
"x".encode("latin-1") in 3.x through py3compat module.
For example, '"".join' is replaced by 'b("").join', and 's = ""' becomes 
's = b("")'.
Search for \x to find literals that may have been intended as byte strings
!! However, where a human-readable ASCII text string output was 
intended, such as in AllOrNothing.undigest(), leave as a string literal !!
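
A sketch of the b() helper as described, again reconstructed from the 
text and not lifted from py3compat:

```python
import sys

if sys.version_info[0] == 2:
    def b(s):
        return s                      # str is already bytes on 2.x
else:
    def b(s):
        return s.encode("latin-1")    # code points 0..255 map 1:1 to bytes


chunks = [b("\x01\x02"), b("\xff")]
joined = b("").join(chunks)           # was: "".join(chunks)
```

latin-1 is used because it maps every code point from 0 to 255 to the 
byte with the same value, so "\xff" becomes the single byte 0xff.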

Only load python_compat.py "if sys.version_info[0] == 2 and 
sys.version_info[1] == 1:".
The assignment to True, False generates syntax errors in 3.x, and 
versions >= 2.2 don't need the compatibility code.

Where print is used with >> to redirect, use a separate function 
instead. See setup.py for an example
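
The idea, sketched with a hypothetical helper name (print_to is not 
necessarily what setup.py calls it):

```python
import io
import sys


def print_to(stream, text):
    """Write one line to stream; replaces 'print >>stream, text'."""
    stream.write(text + "\n")


# Typical use: send a message to stderr.
print_to(sys.stderr, "building with 2to3")

# The same call works with any file-like object; io.StringIO is used
# here only to capture the output for demonstration.
buf = io.StringIO()
print_to(buf, "running test")
captured = buf.getvalue()
```

Calling write() on the stream sidesteps both the 2.x print statement and 
the 3.x print() function, so the same source parses on either.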

The string module has been changed in 3.x. It lost join and split, 
maketrans now expects bytes, and so on.
Replace string.join(a,b) with b.join(a).
Replace string.split(a) with a.split().
Replace body of white-space-stripping functions with 'return 
"".join(s.split())'

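The three replacements above, as a short sketch. The function name 
strip_whitespace is made up for illustration:

```python
words = ["all", "or", "nothing"]

joined = " ".join(words)          # was: string.join(words, " ")
parts = "all or  nothing".split() # was: string.split("all or  nothing")


def strip_whitespace(s):
    """Remove all whitespace from s, including tabs and newlines."""
    return "".join(s.split())


compact = strip_whitespace(" de ad\tbe\nef ")
```

These string methods work on both 2.x and 3.x, so no conditional code is 
needed for them.
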
Integer division via the "/" operator returns a float in 3.x. This 
causes issues in Util.number.getStrongPrime. As 2.1 does not support 
the "//" operator, a helper function "floordiv()" is brought in via 'if 
sys.version_info' from floordiv.py or py21floordiv.py.
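
To illustrate why this matters for large numbers, here is a sketch. The 
contents of floordiv.py and py21floordiv.py are not reproduced here; the 
divmod()-based version below is a stand-in of my own that avoids the "//" 
token entirely:

```python
def floordiv(a, b):
    """Floor division without the '//' operator (stand-in sketch)."""
    return divmod(a, b)[0]


n = 10**30 + 1

exact = floordiv(n, 2)   # stays an exact integer
lossy = int(n / 2)       # on 3.x, n / 2 is a float and loses precision
```

On 3.x the float result cannot represent a 30-digit value exactly, so 
lossy differs from exact. That is the kind of silent error that breaks 
prime generation.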


C code:

Extended "pycrypto_compat.h". It handles #define's for Python 3.x 
forward compatibility

#include "pycrypto_compat.h"
// All other local includes after this, so they can benefit from the 
definitions in pycrypto_compat.h

The compat header #defines IS_PY3K if compiling on 3.x
The compat header #defines PyBytes_*, PyUnicode_*, PyBytesObject to 
resolve to their PyString* counterparts if compiling on 2.x.
PyLong_* can be dangerous depending on code construct (think an if that 
runs PyInt_* with else PyLong_*),
therefore it is #defined in each individual module if needed and safe to 
do so.

PyString_* has been replaced with PyBytes_* or PyUnicode_* depending on 
intent
PyStringObject has been replaced with PyBytesObject or PyUnicodeObject 
depending on intent.
PyInt_* has been replaced with PyLong_*, where safe to do (in all cases 
so far)

The C code uses "#ifdef IS_PY3K" liberally.
Code duplication has been avoided.

Module initialization and module structure differ significantly in 3.x. 
Conditionals take care of it.

myModuleType.ob_type assignment conditionally becomes PyType_Ready for 3.x.

getattr cannot be used to check against custom attributes in 3.x. For 
3.x, conditional code uses getattro and PyUnicode_CompareWithASCIIString 
instead of strcmp.

hexdigest() needed to be changed to return a Unicode object with 3.x
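
For reference, this is the standard-library behaviour on 3.x that the 
change lines up with. The sketch uses hashlib, not PyCrypto, and the 
choice of sha256 is arbitrary:

```python
import binascii
import hashlib

h = hashlib.new("sha256")   # algorithm name is a text string
h.update(b"abc")            # input data is a byte string

raw = h.digest()            # bytes
hexed = h.hexdigest()       # str (text)

# hexdigest() is the hex encoding of digest(), as text.
same = binascii.hexlify(raw).decode("ascii") == hexed
```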


