[pycrypto] Hash: Remove "oid" attributes; add "name" attribute

Dwayne Litzenberger dlitz at dlitz.net
Sun Feb 17 23:58:48 PST 2013

[Reposted from 'Hash: Remove "oid" attributes; add "name" attribute']

Legrandin wrote:

>Hi Dwayne,
>The Object ID is an identifier assigned by (inter)national standard 
>bodies (NIST) or recognized private organizations (RSA Inc, Teletrust) 
>to the hash algorithm for use in all the several crypto protocols based 
>on ASN.1 (PKCS#1 signatures, PKCS#7/CMS, PKCS#8 private key 
>encapsulation, SSL/TLS, CA certificates, etc). Nothing stops one from 
>also using it without ASN.1, as a stand-alone numerical string guaranteed 
>to be unique.
>The fact that a few other protocols don't use it (and prefer to have 
>their own internal identifiers, and therefore not leverage work done by 
>others already) does not look to me as a reason to isolate it in the 
>PKCS#1v1.5 signature module, considering that protocols that use it are 
>the majority, and all hashes currently in pycrypto have it (all of them 
>being quite mature).
>The attribute could also remain undefined for any experimental hashes 
>that pycrypto introduces but that don't have an Object ID 
>assigned yet (e.g. Salsa20 maybe?). That would just mean that the hash 
>cannot be used to make PKCS#1v1.5 signatures (which makes sense). If 
>the OID exists, it can be added to the module. If it doesn't, it is not 

My response:

     Be very careful with your use of Object Identifiers.  In many cases there are a
     great many OIDs available for the same algorithm, but the exact OID you're
     supposed to use varies somewhat.
     -- Peter Gutmann, X.509 Style Guide, http://www.cs.auckland.ac.nz/~pgut001/pubs/x509guide.txt

> Protocols that use [OIDs] are the majority

Really?  The only widely-used protocols I can think of are SNMP (which 
is irrelevant here), LDAP (also irrelevant), and the CMS/TLS/PKCS 
protocol suite.  SSH doesn't use them, OpenPGP doesn't use them, DNSSEC 
doesn't use them, OAuth doesn't use them, OpenID doesn't use them, DKIM 
doesn't use them, and I'm pretty sure that IPsec/IKEv1/IKEv2 don't use 
them.  Of the protocols that do use them, which ones actually use the 
OIDs listed in this commit, rather than some ciphersuite identifier like 
pbeWithSHA1AndDES-CBC?  Have there been any major new crypto protocols 
designed in the last decade that use these OIDs?  That use ASN.1?

This is going to be a bit of a long rant.

OIDs and ASN.1 are legacy ITU-T crap, and the protocols built around 
them are overcomplicated and error-prone.  The only reason why I merged 
any ASN.1 stuff at all is because PKCS#1 uses it.  PKCS#1 is a bit of a 
special case, because it's basically synonymous with RSA; even 
protocols that don't otherwise use ASN.1 use PKCS#1.

I think it was you who convinced me that the ASN.1 used by PKCS#1 was 
simple enough that it wouldn't lead to an endless series of bugs.  Even 
so, you *still* got it wrong, as described in LP#1119552 [1].  I'm not 
blaming you; I'm blaming ASN.1 for being such a terrible, complicated, 
obfuscatory way to define and describe data formats.  Hell, the only 
reason why you got it wrong was because *so many other people got it 
wrong early on that the spec was modified to accommodate their errors*.  
And PKCS#1 is a much *simpler* use-case of ASN.1 compared to the rest of 
the CMS/TLS/PKCS suite...

In contrast, PyCrypto *needs* to be kept simple, because we simply don't 
have the developer resources to create a secure CMS/TLS/PKCS 
implementation.  Even if we had the resources, getting it right is 
tricky enough that we *shouldn't* try to make yet another 
implementation---especially not one that's Python-specific.  It would be 
better to pool our limited resources with other FOSS crypto developers 
to improve the existing implementations, or maybe to try to recruit them 
to work on a new project that would become the successor to the existing 
implementations.  One more insecure, resource-starved FOSS CMS/TLS/PKCS 
implementation is not good for users.

OpenSSL exists today, and there are several ways to use it from Python.  
The purpose of PyCrypto is not to reimplement everything that OpenSSL 
already does.  What would be gained by doing that?  If we just wanted to 
make a nicer, more Pythonic API for OpenSSL, we could just add OpenSSL 
as a dependency and be done with it.  (Python itself already uses 
OpenSSL for hashlib, so it's not unprecedented.)

PyCrypto is used by a lot of folks who are either implementing 
recently-created protocols (i.e. *not* CMS/TLS/PKCS), or who 
are---rightly or wrongly---creating new protocols.  One of my goals with 
PyCrypto has been to improve their chances of building something secure, 
and to me that means that I should steer people to simpler, 
easy-to-implement building blocks like OpenPGP and SSH, not complex, 
error-prone things like ASN.1/CMS/TLS/X.509/PKCS#12.

I want to avoid turning PyCrypto into something that treats CMS/TLS/PKCS 
as the gold standard and everything else as a second-class citizen.  
There have already been a few cases of that (for example, the "oid" 
attribute here, and the "pkcs" parameter to RSA.exportKey), and I see 
those things as oversights that need to be fixed, not things that I want 
to entrench further.

I see that you've been building a PKCS#8 implementation in your fork of 
the PyCrypto repo.  I can only assume that you eventually plan to build 
a PKCS#7/CMS implementation, too.  That's fine, but seeing things like 
`algos = { 'PBKDF2WithHMAC-SHA1AndDES-EDE3-CBC' : 
_PBES2_Factory(_PBKDF2_Factory(), _DES_EDE3_CBC_Factory()) }` convinces 
me that it's beyond the scope of what I want to include in PyCrypto, 
unless it were in a well-isolated subdirectory that could be easily 
split into a separate package if the maintenance became too burdensome 
for me.  At a minimum, we'd need to agree that the string "X.509" 
doesn't belong in the module that implements the raw RSA primitive.

You've done a lot of good work and I appreciate your contributions, but 
IMHO you're embedding the PKCS stuff too deeply into the core of 
PyCrypto when I'd prefer to see it in separate subdirectory, or even a 
separate library.  This is partly my fault: I was a bit too anxious to 
merge the PKCS#1 stuff after being absent for a while, so I didn't pay 
close enough attention to the API changes (even though the API is really 
what differentiates PyCrypto from other libraries).  In the future, I'm 
going to try to be more picky upfront about the API, to avoid 
backpedaling like I'm doing right now with the .oid stuff.

As I see it, the PKCS1 stuff probably should have been consolidated into 
something like Crypto.Protocol.PKCS1.  Going forward, the PKCS8 stuff 
should probably go into something like Crypto.Protocol.PKCS8, and a 
future OpenPGP package could go into Crypto.Protocol.OpenPGP.  
RSA.importKey and RSA.exportKey should probably be deprecated and moved 
into the PKCS1 and PKCS8 packages, respectively.
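
To illustrate the layout (a rough sketch only, not existing PyCrypto 
code): the PKCS#1 v1.5 DigestInfo prefixes, which are where the OIDs 
actually end up, could live in the PKCS#1 code and be looked up by the 
hash's name.  The names `_DIGESTINFO_PREFIX` and `digest_info` are 
hypothetical, and hashlib stands in for the Crypto.Hash modules.

```python
import hashlib

# DER-encoded DigestInfo prefixes for PKCS#1 v1.5 (AlgorithmIdentifier with
# NULL parameters, followed by the OCTET STRING header), keyed by hash name.
# Hypothetical table: it belongs to the protocol code, not to the hash.
_DIGESTINFO_PREFIX = {
    "sha1": bytes.fromhex("3021300906052b0e03021a05000414"),
    "sha256": bytes.fromhex("3031300d060960864801650304020105000420"),
}


def digest_info(hash_obj):
    """Return the DER DigestInfo for a hash object that exposes .name."""
    try:
        prefix = _DIGESTINFO_PREFIX[hash_obj.name]
    except KeyError:
        # No OID registered here: the hash simply can't be used for
        # PKCS#1 v1.5 signatures, but it is still a perfectly good hash.
        raise ValueError("no PKCS#1 v1.5 OID known for %r" % hash_obj.name)
    return prefix + hash_obj.digest()
```

With this arrangement the hash object only has to expose a name; the 
signature code decides which OID, if any, goes with it.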

The exact names of the subtrees are debatable, but the idea is to create a 
clear separation between the primitives and the protocols that use them, 
rather than mixing them all together.  This is particularly important 
for the hash modules, since those could eventually become thin wrappers 
around the standard hashlib library---I doubt that would ever happen if 
we insisted on attaching extraneous things like OIDs to them.
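
As a rough sketch of what such a thin wrapper might look like (an 
assumption about a possible future shape, not the current Crypto.Hash 
code; the class name `SHA256Hash` is hypothetical), note that it carries 
a "name" but nothing protocol-specific:

```python
import hashlib


class SHA256Hash(object):
    """Hypothetical thin wrapper: all hashing is delegated to hashlib."""

    name = "sha256"      # plain identifier; no OID attached
    digest_size = 32

    def __init__(self, data=None):
        self._h = hashlib.sha256()
        if data is not None:
            self._h.update(data)

    def update(self, data):
        self._h.update(data)

    def digest(self):
        return self._h.digest()

    def hexdigest(self):
        return self._h.hexdigest()

    def copy(self):
        # Copy the underlying hashlib state so both objects can diverge.
        clone = SHA256Hash()
        clone._h = self._h.copy()
        return clone


def new(data=None):
    """Module-level constructor in the style of the existing hash modules."""
    return SHA256Hash(data)
```

Anything that needs an OID would look it up by `name` in its own code.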

Again, sorry for the long message, but I wanted to explain my thinking 
as clearly as possible.  Let me know what you think.

- Dwayne

[1] https://bugs.launchpad.net/pycrypto/+bug/1119552

Dwayne C. Litzenberger <dlitz at dlitz.net>
  OpenPGP: 19E1 1FE8 B3CF F273 ED17  4A24 928C EC13 39C2 5CF7
