[pycrypto] AES, python 2.7 vs 3
Paul_Koning at Dell.com
Mon Feb 10 08:40:53 PST 2014
On Feb 8, 2014, at 3:25 AM, Dave Pawson <dave.pawson at gmail.com> wrote:
> On 7 February 2014 17:01, <Paul_Koning at dell.com> wrote:
>> That’s what pycrypto needs to do, yes. From what Dwayne says, it sounds like that’s currently not finished yet.
>>
>> The easiest way to look at this is as a data type matching exercise. Cryptographic operations are functions that operate on sequences of bytes. Unicode strings are NOT sequences of bytes — they are an entirely different data type.
>
> How (if at all) does that statement change if I am using unicode,
> utf-8 encoding please Paul?
> As I understand it, utf-8 constitutes octets? Or am I wrong?
Yes, UTF-8 encodes Unicode text as a sequence of octets. It is one of several encodings you can use for Unicode, and probably the most popular one. So unless you have a reason to do otherwise, UTF-8 is a good default choice for encoding Unicode strings.
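To make the "one of several encodings" point concrete, here is a small sketch (the variable names are just for illustration) showing that the same Unicode string turns into different octet sequences depending on which encoding you pick:

```python
# Illustrative only: one Unicode string, three encodings, three byte sequences.
text = "a\u00e9"  # the two characters "a" and "é"

utf8 = text.encode("utf-8")       # é becomes two octets: c3 a9
latin1 = text.encode("latin-1")   # é becomes one octet: e9
utf16 = text.encode("utf-16-le")  # every character becomes two octets here

print(utf8, latin1, utf16)
# Character count of the str versus octet count of each encoding.
print(len(text), len(utf8), len(latin1), len(utf16))
```

So "the bytes of a string" is not well defined until an encoding has been chosen.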
>
>
>>
>> It is valid to speak of specific encodings of Unicode strings as sequences of bytes, but the key point is that you have to do the encoding — which means, first of all, choosing WHICH encoding — in order to have that sequence of bytes.
>
> And how does that match with Python 3, which (appears | is) based on
> Unicode strings?
In Python 3, the “str” type is Unicode. To turn it into “bytes” (for I/O, for crypto, or for anything else that needs octet strings), you have to encode it. As I mentioned, UTF-8 is a typical choice, but if you had a reason to use something else, you would specify that encoding instead.
For example:
$ python3
>>> s="foo"
>>> type(s)
<class 'str'>
>>> b=s.encode("utf-8")
>>> type(b)
<class 'bytes'>
>>> b
b'foo'
>>> s="aéö"
>>> b=s.encode("utf-8")
>>> b
b'a\xc3\xa9\xc3\xb6'
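The same pattern applies when handing data to a crypto function. As a rough sketch, here it is with hashlib from the standard library standing in for pycrypto (hashlib is only a stand-in, and the helper name digest_text is made up for this example):

```python
import hashlib


def digest_text(text, encoding="utf-8"):
    """Hypothetical helper: encode explicitly, then hash the resulting bytes."""
    data = text.encode(encoding)
    return hashlib.sha256(data).hexdigest()


# In Python 3, hashlib refuses a str outright; it wants bytes.
try:
    hashlib.sha256("a\u00e9")
    rejected = False
except TypeError:
    rejected = True

# The digest depends on the encoding, because the octets differ.
same_digest = digest_text("a\u00e9", "utf-8") == digest_text("a\u00e9", "latin-1")

# Decoding with the same encoding recovers the original str.
round_trip = "a\u00e9".encode("utf-8").decode("utf-8")

print(rejected, same_digest, round_trip == "a\u00e9")
```

The point is that the caller chooses the encoding, and the function only ever sees bytes.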
>
>
>>
>> Since you have to make those choices, it’s not safe for APIs like crypto to accept strings and effectively do some encoding as a side effect. Better to require bytes in the interface, and let you handle the encode/decode steps explicitly, in the way you want them to be done.
>
>
> Thanks for that Paul... you seem to be pointing at Pycrypto as the
> source of my problem. An approach paper / web page would be very
> helpful to me (and others facing the same issues) in managing this
> slippery (to me) aspect of crypto.
>
> Again, thanks for the comments.
The Python Unicode HOWTO at http://docs.python.org/3/howto/unicode.html might be helpful; it explains what I’ve been talking about in much more detail.
paul