[pycrypto] Initial review of Thorsten's Py3k changes
Dwayne C. Litzenberger
dlitz at dlitz.net
Sun Apr 17 15:06:03 CST 2011
Awesome! Thank you!
Thorsten Behrens <sbehrens at gmx.li> wrote:
I am going back into the code to take a peek at your suggestion.

On 1/29/2011 8:47 PM, Dwayne C. Litzenberger wrote:
> Have a look in the various common.py files. All of the hex test vectors are
> being fed through either a2b_hex or b2a_hex. I think it should be possible
> to make versions of b2a_hex and a2b_hex that also do bytes->str and
> str->bytes conversions, respectively.
>
> The following code works in both Python 2.1 and Python 3.2b2:
>
>     from binascii import b2a_hex as _b2a_hex, a2b_hex as _a2b_hex
>     from codecs import ascii_decode as _ascii_decode
>
>     def bin2hex(bts):
>         """Like b2a_hex, but returns a str instead of bytes in Python 3.x"""
>         # ascii_decode returns a (text, length) tuple; keep only the text
>         return _ascii_decode(_b2a_hex(bts))[0]
>
>     def hex2bin(s):
>         """Like a2b_hex, but expects a str instead of bytes in Python 3.x"""
>         return _a2b_hex(s.encode('ascii'))

This would actually make things worse. That it works at all should be considered a bug - there's a TODO I have not followed up on yet, and that TODO is to add type-chec
all functions so that an error is returned if a parameter is not "an object interpretable as a buffer of bytes". That is, if encode() is called with a unicode (str) object, that should raise an error.

The reason I believe that pycrypto should check type is that the Python 3.x stdlib behaves that way:

>>> from hashlib import sha1
>>> h = sha1()
>>> h.update("lorem")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: Unicode-objects must be encoded before hashing
>>> h.update(b"lorem")
>>> print (h.hexdigest())
b58e92fff5246645f772bfe7a60272f356c0151a

For consistency, I have both Crypto.Hash and Crypto.Cipher behaving this way. The changes are in the doc, but in a nutshell:

Crypto.Hash
  Python 3.x: digest() returns a bytes object
  Python 3.x: hexdigest() returns a bytes object
  Python 3.x: The passed argument to update() must be an object interpretable as a buffer of bytes

Crypto.Cipher
  new()
    Python 3.x: ```mode``` is a string object; ```key``` and ``
must be objects interpretable as a buffer of bytes.
  cipher object
    Python 3.x: ```IV``` is a bytes object.
  decrypt()
    Python 3.x: ```string``` must be an object interpretable as a buffer of bytes. decrypt() will return a bytes object.
  encrypt()
    Python 3.x: ```string``` must be an object interpretable as a buffer of bytes. encrypt() will return a bytes object.

If these new conventions will cause an issue, let's discuss that now, before I add the type-checking.

All that having been said, I still think it should be possible to have the vectors be Unicode literals and to convert them to a bytes object when reading them in. It just will need to be done in a different part of the code.

Thorsten
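
For illustration only, here is a minimal Python 3 sketch of the two ideas discussed in this message: rejecting str where a buffer of bytes is expected, and keeping test vectors as text literals that are converted to bytes when they are read in. The names require_bytes, vector_to_bytes and bytes_to_vector are hypothetical and are not part of pycrypto; this is one possible shape under those assumptions, not the code that was committed.

```python
from binascii import a2b_hex, b2a_hex


def require_bytes(value, name="argument"):
    """Hypothetical check: accept bytes-like objects, reject text (str)."""
    if isinstance(value, str):
        raise TypeError("%s must be a bytes-like object, not str" % name)
    # memoryview() raises TypeError for objects without the buffer interface
    return memoryview(value).tobytes()


def vector_to_bytes(hex_text):
    """Hypothetical loader: hex test vector written as text -> bytes."""
    return a2b_hex(hex_text.encode("ascii"))


def bytes_to_vector(data):
    """Hypothetical inverse: bytes -> hex test vector as text."""
    return b2a_hex(require_bytes(data, "data")).decode("ascii")
```

With helpers along these lines, the conversion happens once where the vectors are loaded, and the cipher and hash entry points only ever see bytes.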