Signature Security

This article attempts to give a technical overview of current thinking on the strength of the primary algorithms used in public key systems, and of why it can still be considered acceptable to use RSA key lengths that are breakable given enough effort (and the effort required is still considerable).

In the end the issue is simply a cost-benefit analysis from the attacker’s viewpoint: is it worth running a few million CPU years’ worth of supercomputer time to break somebody’s RSA key? Especially since all of that effort has to be repeated for every single key.

This article looks at the size of the effort based on published academic work, and estimates the cost of the attack. Currently there are no known ways to “break” RSA and ECC keys quicker than so-called “brute force”, although one must note that “brute force” has become vastly more efficient than when it was first described some 2,200 years ago.

The security of a public key system is based on the difficulty of reversing certain mathematical operations. Reversing them is possible in principle; the goal is that doing so will not be economically worthwhile on a timescale shorter than the useful life of the key. Even that depends on whose key it is, and on what can be done with it.

The difficulty of the RSA algorithm rests on integer factorization. An RSA public key modulus is the product of two prime numbers (that pair forms the private key), and something encrypted with the private key (signed) can easily be decrypted (verified) with the public key.
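As a minimal illustration of that relationship, here is textbook RSA with deliberately tiny, insecure toy numbers and no padding (all values are this sketch’s own):

```python
# Textbook-RSA toy example with deliberately tiny, insecure numbers,
# only to illustrate the sign/verify relationship described above.
p, q = 61, 53                           # the private prime pair
n = p * q                               # public modulus (3233)
e = 17                                  # public exponent
d = pow(e, -1, (p - 1) * (q - 1))       # private exponent (Python 3.8+)

message = 42
signature = pow(message, d, n)          # "encrypt with the private key" = sign
assert pow(signature, e, n) == message  # "decrypt with the public key" = verify
```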

The difficulty of Elliptic Curve cryptography is based on the discrete logarithm. Incidentally, the best attack method on general RSA is closely related to the Discrete Logarithm Problem attack that is also used to break ECC, but for ECC it is dozens of orders of magnitude harder than for RSA (about 10^24 CPU years to break a 256-bit ECC key, for example).

Knowledge of algorithms for these tasks has improved immensely over the past decade. Even so, anyone who wants to overcome the difficulty of these problems (factoring a public key) needs a long computing time on large computer clusters, and that remains very expensive. The art of integer factorization is more advanced than that of discrete logarithms, and therefore a long-term (10 years) safe RSA key is considered to be at least 2048 bits long, while the equivalent ECC key is 224 bits. Double the bit count and, by conservative thinking, it should be safe for 20 years (4096 bits for RSA, 300 bits for ECC).

However, in cryptography nothing is absolutely predictable. Only hindsight is certain, and published attack algorithms have never been less efficient than their predecessors. All it takes is some Radical Technology (an algorithm) that turns what is currently considered a “hard” problem into “just a bit tedious” (from consuming CPU years to consuming CPU hours), and all assumptions about current safety levels become obsolete.

Is it certain that such Radical Technology will not appear? No.
Is it likely that such Radical Technology will appear? No.
How do you know? I don’t.

Although the first integer factorization algorithm was described by Eratosthenes around 240 BCE, the first precursor of the current General Number Field Sieve, the quadratic sieve method, was defined only in 1981 by Carl Pomerance. The relative lateness of efficient methods can be seen as a consequence of the lateness of public key cryptosystems themselves (RSA in 1977, ECC in 1985), which finally gave a motivation for otherwise abstract mathematical research.

Avoid Giving Easy Loopholes

This is not an exhaustive list, and mainly about RSA encryption.

Loophole 1:

Cryptographic primitives are always combined into something called protocols. Defining or implementing a protocol in which some content can be spied on or modified without a trace, without needing the correct key, is the basic mistake made over and over again.

Loophole 2:

Do not use keys that are too small. This applies to all public key systems; what counts as “too small” is discussed elsewhere in this article.

When an adversary is able to recover your private key from its public counterpart, they can forge signatures pretending to be you, and they can decrypt all messages sent to you.

Loophole 3:

The security properties of RSA encryption only hold if m^e > n, where (n, e) are the public key parts of RSA and m is the message treated as an integer. For a very short message and a small exponent it is trivial to decrypt the message, so secure encryption requires messages that are close to the key size, and preferably a high exponent.

For example, if e = 3 and the zero-padded message has a length of about 1/3 of the key, it is trivial to take the cube root of the ciphertext.
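A minimal sketch of that failure, assuming e = 3 and a plaintext so short that m^3 < n; the modulus here is an arbitrary stand-in and the code is purely illustrative:

```python
# Small-exponent failure: if e = 3 and m**3 < n, the "ciphertext" is just m**3
# as an ordinary integer, so an integer cube root recovers m without any key.
def icbrt(x):                          # integer cube root by bisection
    lo, hi = 0, 1 << ((x.bit_length() + 2) // 3 + 1)
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if mid ** 3 <= x:
            lo = mid
        else:
            hi = mid - 1
    return lo

n = (1 << 1024) + 1                    # stand-in for a 1024-bit public modulus
m = int.from_bytes(b"short secret", "big")
c = pow(m, 3, n)                       # m**3 < n, so no modular reduction happens
assert icbrt(c) == m                   # plaintext recovered without the private key
```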

This is why we use e = 2^16 + 1 (65537), and pad the short message (with the PKCS#1 v1.5 padding method, or the PKCS#1 v2.1 PSS padding method) so that its size always matches the key size.

Loophole 4:

Zero padding is to be avoided, because having lots of consecutive zero bytes in the message makes a brute force RSA decryption attempt much easier.

This is why the RSA encryption padding scheme considered the most secure so far (RSA-OAEP) first applies a masking function to the original data, so that its internal structure is not visible under the encryption result. The masked result fed into the encryption is as close to pure random as possible, and brute-force decrypting it is no easier than factoring the recipient’s public key.

The PSS signature algorithm is used for similar reasons.
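For illustration, this is roughly what those choices look like with the widely used Python cryptography package (a sketch only; the 2048-bit key size and SHA-256 parameters are this example’s assumptions, not requirements from the text):

```python
# Sketch using the Python "cryptography" package: OAEP for encryption,
# PSS for signatures, so the value fed to RSA is masked/randomized as described.
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import rsa, padding

key = rsa.generate_private_key(public_exponent=65537, key_size=2048)

ciphertext = key.public_key().encrypt(
    b"short message",
    padding.OAEP(mgf=padding.MGF1(algorithm=hashes.SHA256()),
                 algorithm=hashes.SHA256(), label=None),
)

signature = key.sign(
    b"document to sign",
    padding.PSS(mgf=padding.MGF1(algorithm=hashes.SHA256()),
                salt_length=padding.PSS.MAX_LENGTH),
    hashes.SHA256(),
)
```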

Loophole 5:

Critical for RSA key security is that all the key moduli in the world are relatively prime to each other.

Stated mathematically: for any two public key moduli N1, N2, GCD(N1, N2) = 1.

The reason is that if either prime factor of the public key (call them p, q) is shared with any other public key out there, a trivial and easily scalable program can find it. A small desktop machine can check millions of public keys to see whether your key violates the GCD() condition given above. When the condition is violated, GCD() returns the common p of the two public keys, and trivial division then yields the q of each, so both public keys are compromised. With good quality, high-entropy generators, this scenario happens with 50% probability once about 2^(L/4) keys have been generated for a key length of L. (See the Birthday Paradox.)
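A minimal sketch of that check over a pile of collected moduli (real large-scale surveys use a batch-GCD algorithm, but the principle is the same; the helper name and toy values are this sketch’s own):

```python
# If two RSA moduli share a prime factor, a single GCD reveals it,
# and simple division then yields the remaining factor of each key.
from math import gcd
from itertools import combinations

def find_shared_factors(moduli):
    broken = []
    for n1, n2 in combinations(moduli, 2):
        p = gcd(n1, n2)
        if p != 1:                      # shared prime factor found
            broken.append((n1, p, n1 // p))
            broken.append((n2, p, n2 // p))
    return broken                       # (modulus, p, q) for each broken key

# Toy demo: two moduli accidentally sharing the prime 101.
print(find_shared_factors([101 * 103, 101 * 107, 109 * 113]))
```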

With sufficiently long keys (like 1024 bits) the value space of available factors is large enough that good quality generators are extremely unlikely to produce keys that anybody else has generated or ever will: 2^256 ≈ 10^77. You can calculate when the first factor collision is expected while 1 billion (10^9) keys are being generated per second.
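The back-of-the-envelope version of that calculation, under the assumptions stated above (a collision expected around 2^256 generated keys, 10^9 keys generated per second):

```python
# Expected time to the first shared-factor collision for 1024-bit keys,
# assuming good randomness: ~2**256 keys generated at 10**9 keys per second.
keys_needed = 2 ** 256
seconds = keys_needed / 1e9
years = seconds / (3600 * 24 * 365)
print(f"about {years:.1e} years")       # on the order of 10**60 years
```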

A related point is that the public key factors (p, q) must be of equal size (the index position of the top-most ‘1’ bit in each value is the same), which means that for a key L bits long the factors are both L/2 bits long (L is commonly a multiple of 8, so L/2 is a multiple of 4, and likewise for p, q). This way the factors of a 1024-bit key are never to be found among, e.g., a 1032-bit key’s factors. (Incidentally, this also makes the otherwise very efficient Elliptic Curve Method for factoring large integers inefficient against these RSA keys.)

What happens when a bad quality random number generator is used?

  • Debian OpenSSL key generation catastrophe (2006-2008)
  • “Mining Your Ps and Qs: Detection of Widespread Weak Keys in Network Devices” (see below)
  • “Factoring RSA keys from certified smart cards: Coppersmith in the wild” (see below)

Loophole 6:

Using the same key for encryption and signing with the RSA algorithm has a hole if the decrypted message is signed directly and unmodified. In the following, n is the public modulus (n = p*q) and e is the other public key component.

If we use the same key for encryption and signing, the attacker can use this to decrypt our encrypted message.

The attacker wants to read a ciphertext c = m^e mod n that was encrypted to the victim. He chooses a random number r with GCD(n, r) = 1, computes the blinded value c′ = c·r^e mod n, and sends c′ to the victim for signing. When the victim signs it, the attacker gets c′^d ≡ (c·r^e)^d ≡ c^d·r ≡ m·r (mod n). Multiplying by r^(-1) mod n then yields m (the secret message).
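A toy demonstration of that blinding trick, with deliberately tiny, insecure parameters invented for this sketch:

```python
# Toy numbers only: n = 61*53, e = 17, d = e^-1 mod (p-1)(q-1).
p, q, e = 61, 53, 17
n = p * q
d = pow(e, -1, (p - 1) * (q - 1))

m = 65                                  # the secret message
c = pow(m, e, n)                        # ciphertext the attacker has intercepted

r = 7                                   # attacker's blinding value, GCD(r, n) = 1
c_blinded = (c * pow(r, e, n)) % n      # looks like meaningless data to the victim
sig = pow(c_blinded, d, n)              # victim "signs": (c * r**e)**d = m*r mod n
recovered = (sig * pow(r, -1, n)) % n   # unblind with r**-1 mod n
assert recovered == m                   # the secret message falls out
```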

Countermeasure: See RSA-OAEP, RSA-PSS, ECIES. The message “m” is never signed directly.

Loophole 7:

Sending the same message to multiple recipients (as in CMS message encryption key wrapping) makes a small public exponent vulnerable (even the usual 2^16 + 1, which is chosen for fast encryption).

Therefore messages must never be encrypted plain; they must always use some encapsulation that effectively renders each encryption distinct (RSA-OAEP, PKCS#1 encryption padding).

Loophole 8:

Weak RSA keys. There are several classes of them, and all good generators avoid creating them.

Requiring the private key factors to satisfy q < p < 2q is another way of saying that their binary representations are of the same length. If that rule is violated, the Elliptic Curve Method is able to factor the public key quicker than the General Number Field Sieve. Still, the effort is not trivial by any means.

The public exponent e must be relatively prime to (p-1)(q-1), or the RSA algorithm does not work (the private exponent d cannot be computed).

Shared prime factors are explained above (Loophole 5).

A shared public key happens when both prime factors are shared at the same time. It is not quite as bad as having only one prime factor shared: only one other party can pretend to be you, rather than everybody being able to do it.

The factors p, q are also weak if any of (p-1), (p+1), (q-1), (q+1) has only small prime factors. The factors must of course actually be prime: primality testing is commonly done using the probabilistic Miller-Rabin and Lucas tests, or with Knuth’s Algorithm P (vol. 2, page 395), which is a simplified Miller-Rabin test. Repeating the test T times bounds the probability of accepting a composite as prime at 0.25^T; T = 20 gives P ≈ 0.91·10^-12.

A private exponent d smaller than about N^0.292 is a weakness, but in the normal case with the usual public exponent e = 2^16 + 1 this happens very infrequently, and it is therefore merely a theoretical issue of the general RSA algorithm. Because d is computed from the product (p-1)(q-1), it can in principle take any value, including undesirable ones. The probability of it being undesirable is far lower than the probability that the probabilistic primality test has falsely approved the factors as prime, unless the primality testing is repeated on the order of (bit length of N)/4 times (256 times for a 1024-bit key; usually testing is repeated 30-90 times).
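A sketch of the kind of sanity checks implied by the list above (the helper name and its exact checks are this sketch’s own; real generators follow FIPS 186 / PKCS#1 in far more detail):

```python
# Illustrative weak-key checks drawn from the loophole list above.
from math import gcd

def rsa_key_sanity(p, q, e=65537):
    issues = []
    if p.bit_length() != q.bit_length():
        issues.append("p and q are not the same bit length")
    if not (q < p < 2 * q or p < q < 2 * p):
        issues.append("p/q ratio outside the q < p < 2q band")
    if gcd(e, (p - 1) * (q - 1)) != 1:
        issues.append("e not coprime to (p-1)(q-1); d cannot be computed")
    return issues

miller_rabin_rounds = 20
false_prime_bound = 0.25 ** miller_rabin_rounds   # ~0.91e-12, as noted above
```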

Possible Radical Technologies

Recent developments in quantum computers may or may not be extendable to large integer factorization, but a computer and an algorithm able to do it are still very expensive. The earliest quantum factorization algorithm was published in 1994 (Shor), and another, even more efficient one in 2013. Shor’s algorithm can also be used against ECC’s Discrete Logarithm Problem.

Actual test implementations of these algorithms have factored numbers up to n = 21 (2012), which is a very long way from 2^1024, and we do not yet know whether scaling that far is possible at all.

The largest quantum circuit designs implementable today process 5 bits and consist of a few hundred quantum gates. See Table I at http://arxiv.org/abs/1301.3210 for the number of required gates as the bit width grows. For example, the 512-bit case requires 1.3 million gates for a modular multiplier, and 1.3 billion gates with a depth of about 41 million for modular exponentiation. For the multiplier the gate count grows by a bit over a factor of 4 for each doubling of the bit width; for exponentiation the gate count grows by a factor of 8 per doubling, and the depth by a factor of 4.
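Extrapolating from the quoted 512-bit figures with those growth factors gives a rough feel for larger moduli; a purely illustrative scaling sketch:

```python
# Rough extrapolation using the growth factors quoted above, starting from the
# 512-bit modular exponentiation figures (~1.3e9 gates, depth ~4.1e7).
gates, depth, bits = 1.3e9, 4.1e7, 512
while bits < 2048:
    bits *= 2
    gates *= 8           # gate count: ~8x per doubling of bit width
    depth *= 4           # circuit depth: ~4x per doubling
    print(f"{bits} bits: ~{gates:.1e} gates, depth ~{depth:.1e}")
```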

“But we can put billions of transistors on silicon now!” Sure, but we cannot yet build even hundreds of quantum gates. The tension between circuit depth and quantum decoherence time argues strongly against the quantum computer as a magic bullet for breaking all encryption. Infinite decoherence time appears in the magical communication devices of science fiction; it does not happen in reality.

The Adiabatic Quantum Computation method has been able to factor n = 143, but it has its own difficulties at cryptographically useful key lengths.

Instead of cooled quantum computers, an all-optical transistor was announced in July 2013, and that may raise processing speed from the current level by a factor of 1,000 to 100,000; a CPU clock would rise from 3 GHz to 3 THz or beyond. But the current test system, using cesium vapor at millikelvin temperatures, is not usable beyond proof of concept. In any case, doing something a million times faster than today is easily countered by adding 500 bits to the RSA key length (or 100 bits to the ECC key length).

Cryptography researchers are looking for algorithms in a class called “post-quantum”, for which it can be shown mathematically that there is no quantum loophole to speed up the reversal of the encryption function.

Recent history on integer factorization

The following data comes from published academic papers using donated and grant-funded CPU resources. The estimate that 1.0 kbit RSA will be factored within a few years with the current style of academic resourcing is extrapolated from current achievements. With larger budgets (not very much larger, though) it is an achievable operation today.

Bits/Challenge       Year  Method
RSA-512              2000  General Number Field Sieve
RSA-200 (663 bits)   2005  General Number Field Sieve
Mersenne 1039        2007  Special Number Field Sieve
RSA-768              2009  General Number Field Sieve; approximately 1500 CPU years over 2.5 calendar years
Mersenne 1061        2012  Special Number Field Sieve; approximately 350 CPU years over 1 calendar year
RSA-1024 (estimate)  2015  Guess: 4000 CPU years (based on a note in the Mersenne-1061 factorization report)
RSA-1024 (estimate)  ?     Guess: 2,000,000 CPU years at 2 GHz (eprint 2013/635: Lenstra et al. 2013)

Mersenne numbers are integers of the special form 2^1061 - 1, where the exponent (1061) is a prime, and that special form has implications for the factorization work. They are therefore estimated to be about 10 times easier to factor than an RSA key of equivalent size with approximately equal-sized prime factors. (Having the same prime factor twice is trivial to detect by taking a square root.) Adding more bits increases the task difficulty as follows (based on the complexity estimate of the General Number Field Sieve):

  • 1024 : 1.0
  • 1100 : 18.8
  • 1152 : 131
  • 1536 : 5.28·10^7
  • 2048 : 1.02·10^14

That is, 1.5 kbit keys are already some 50 million times harder than 1024-bit keys, which themselves have not been broken so far. The absolute starting number matters less than the exponent of the difference in difficulty; that is where the safety margin lies. (The ln n in the GNFS complexity formula can be expressed from the number of bits x as ln n = x·ln 2, or as the Excel formula =LN(2)*cell_of_x; the complicated-looking constant in the exponent and the small-o term can be replaced with a constant of about 2 or 3.)
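A sketch of that calculation, using the simplified GNFS cost exp(c·(ln n)^(1/3)·(ln ln n)^(2/3)) with the constant c ≈ 3 mentioned above; because the o(1) term is dropped, it only roughly reproduces the list:

```python
# Relative GNFS effort for different RSA modulus sizes, using the simplified
# complexity exp(c * (ln n)**(1/3) * (ln ln n)**(2/3)) with c ~ 3.
import math

def gnfs_log_effort(bits, c=3.0):
    ln_n = bits * math.log(2)           # ln n for an n of `bits` bits
    return c * ln_n ** (1 / 3) * math.log(ln_n) ** (2 / 3)

base = gnfs_log_effort(1024)
for bits in (1024, 1100, 1152, 1536, 2048):
    ratio = math.exp(gnfs_log_effort(bits) - base)
    print(f"{bits:5d} bits: ~{ratio:.3g}x the 1024-bit effort")
```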

Cost of the Factorization

The following was written before the Lenstra et al. paper was published, and it was based on the assumption that 4,000 CPU years are enough. 2 million CPU years would still make it possible, but definitely impractical.

A notable detail here is that being able to do something at any cost is not the same thing as it being practical. It is generally not worth the money to run supercomputer clusters to factor somebody’s private RSA key on a short timescale (it would still take several days at the very least, never mind what “plausible fiction” TV writers would have us believe). Some estimates:

  • The solution needs to be produced within 3 days (1/100 of a year) instead of 4,000 years on a single CPU, so the task is spread over 400,000 CPUs.
  • Each CPU is delivered in a space- and power-efficient package, like the Chinese Tianhe-2 supercomputer.
  • This task needs only about 1/10 of Tianhe-2; the job could be shortened to, say, 1 day by using all of it (not assumed here).
  • The full system draws 17.8 MW of electricity, say 2 MW for the segment used for this task, for 3 days (72 hours), which gives 144,000 kWh of energy.
  • Electricity prices vary between 0.10 and 0.50 €/kWh. Assuming a cheap source, the electricity bill comes to 14,400 €.
  • The cost of Tianhe-2 is given as approximately 2,400 million yuan, or around 350 million €. Amortizing the hardware over 3 years and charging this task for 1/100 of a year on 1/10 of the system gives a hardware cost of around 117,000 €.

The electricity bill is about 1/9 of the total cost, so roughly speaking the total cost is about 10 times the plain electricity cost. Is it worth about 150,000 € to factor somebody’s 1024-bit RSA key? (Or 1,000 times that?) Compare the above with the “bombshell paper” from 2002 claiming that “a computer cluster capable of factoring RSA-1024 in a day will cost around 200 million €”. Apparently it costs about twice as much.
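The arithmetic behind those figures, as a quick sanity check of the assumptions listed above (a throwaway sketch, not part of any published estimate):

```python
# Back-of-the-envelope cost of the 3-day factoring scenario described above.
power_mw = 2.0                       # MW drawn by the 1/10 slice of Tianhe-2
hours = 72                           # 3 days
energy_kwh = power_mw * 1000 * hours             # 144,000 kWh
electricity_eur = energy_kwh * 0.10              # cheap electricity: 14,400 EUR

hw_price_eur = 350e6                 # whole machine
hw_share_eur = hw_price_eur / (3 * 100 * 10)     # 3-year amortization, 1/100 year, 1/10 of system
total_eur = 10 * electricity_eur     # rule of thumb above: total ~ 10x electricity
print(electricity_eur, round(hw_share_eur), total_eur)   # 14400.0 116667 144000.0
```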

Known algorithms for the job are not good enough to allow this with a set of private resources that is orders of magnitude cheaper ( = smaller). With 4,000 CPU cores the sieving will take around 1 year; with 400 it will take around 10 years. The cost is (hw_cost_unit / time_unit) * time * number_of_machines, and that product does not really change: the total hardware cost of the slower task is the same as that of the faster one.

In any case, a 1.5 kbit key is already around 50 million times more difficult ( = expensive!) to factor than current technology enables for a 1.0 kbit key. Going to a 2.0 kbit key is not really necessary yet. Also, making a habit of regenerating keys once a week is a great way to render any factorization work worthless, even with short keys.

Additional Safety Features with Mobile Signatures

With Mobile Signatures the executed operation leaves trace data in many places, and if somebody disputes a signature that has no corresponding data in the mobile network, the signature becomes highly suspicious.

An even better safety measure is the use of CAdES-T timestamped signatures, where the operator’s timestamp server adds the server’s 2000-3000 bit signature on top of the user’s CAdES-BES signature, effectively stating: “this CAdES-BES happened at this time, through me”. To fake such a signature, the attacker needs to break a) the user’s key and b) the server’s much bigger key.

Applications asking for these signatures must verify that the received signature is indeed a response to the random 32-64 byte challenge they sent in the signature request. Then they can verify that the CAdES-T signer is known and expected. They could also verify that the actual signer’s certificate is known to be that particular user’s own certificate. They should be able to track the user’s and the CAdES-T signer’s certificates to known trust roots per RFC 5280 validation rules and check certificate revocations, but that may get a bit too challenging to implement in every application.

What if 3000 Bit RSA Keys are Too Large for You?

Going to elliptic curves is attractive for two reasons:

  1. Equivalent strength (as it is currently understood) ECC keys are small compared to RSA keys (see Key Length references below)
  2. Algorithms to break ECC keys are still very immature (this is also a risk – somebody can invent something radical)

Consequences/Problems:

  1. Some computation techniques for faster (by 10-50% compared to a trivial algorithm) polynomial evaluation are patented, and most of those patents are still active in the US and in Europe.
  2. Basic trivial evaluation computations are not patentable, and the hardware speed of trust modules keeps rising, so the patented algorithms may not be necessary anymore.
  3. There are lots of published standard curve parameter sets to choose from.
  4. Some of the curve groups are not really compatible with X9.62 parameters (GOST3410 and DSTU4145.)
  5. Some curve groups we are not quite sure of (Chinese SM2.)
  6. Efficiently computable forms of curve points are being formulated at a rate of a few per year, so there are many of them to choose from.
  7. The mathematics of Elliptic Curves is subtle, and the worst thing possible is to pick equations and curve family parameters at random. Some of the possible equations and families are trivially easy to break, or they produce families in which a randomly chosen curve point is, with high probability, non-computable or insecure.

Are Elliptic Curves More Secure?

At present, the art of the Discrete Logarithm Problem is not advanced enough to handle even the Certicom ECC-163 challenge. At Ruhr Universität Bochum a hardware-optimized key-search system was built out of reprogrammable logic chips (FPGAs). This allows it to be used for different kinds of algorithms, and thus to gather experience-based cost/performance data points.

The COPACOBANA machine can brute-force standard DES in about a week (2008), and an artificially short ECC key in a year. See the “docs” section there. The authors state that ECC keys in actual use are well beyond the hardware capabilities of that box, yet they give reasonable estimates of what it would take: 8,000 million such machines running in parallel for 1 year to break ECC-163. However, note that COPACOBANA is slower by about a factor of 10 per compute unit than the latest supercomputers utilizing heavily parallel GPUs for math operations. The cost optimum is somewhere in between.

COPACOBANA has only about 100 compute units versus about a million in top-notch supercomputers. At these scales of hardware requirements, the optimal hardware for each algorithm is likely a specially fabricated ASIC, and even then the cost will run into billions and the run time will be impractically long for things like ECC-224, ECC-384, GP-256, etc.

It could be that because ECC keys are shorter than RSA keys, they are more tractable targets for quantum computers. That remains to be seen.

The recent paper “Elliptic and Hyperelliptic Curves: a Practical Security Analysis” studied the expected difficulty of breaking ECC curves of approximately 256 bits. The result was given as about 10^24 CPU core years at 2 GHz.

Compare this with 10^6 CPU core years for a 1024-bit RSA key and a 10^7-fold difficulty increase for every additional 512 bits: ECC-256 is roughly as hard as a 2300-bit RSA key (or maybe 2500-bit).
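The rough arithmetic behind that comparison, treating the 10^7-per-512-bits rule as a constant ratio (an illustrative sketch using the figures quoted above):

```python
# ECC-256 at ~1e24 CPU years vs RSA-1024 at ~1e6 CPU years, with RSA difficulty
# growing ~1e7-fold per extra 512 bits (the rough rule used above).
import math

gap = math.log10(1e24 / 1e6)          # 18 orders of magnitude
extra_bits = 512 * gap / 7            # ~1300 bits beyond 1024
print(f"ECC-256 is roughly as hard as RSA-{1024 + extra_bits:.0f}")
```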

Notes

  1. This paper talks primarily about RSA keys. ECC keys are mentioned occasionally where they have known related properties.
  2. Some PKI systems do not publish certificates, for various reasons. This does not make the RSA public key a secret: even without the Subject Public Key Info data at hand, it is easy to recover the public key from just two signatures and their plaintext versions. See the EPRINT references below. That reference does not say anything about ECC public keys, but presume that recovering them is no harder than for RSA public keys and you should be safe.
  3. NIST recommends re-generating and re-registering a new key for a public key cryptosystem at least every 2 years, preferably every year. (NIST recommendations are very conservative, to protect against assumed future progress in cryptanalysis.)
  4. DNSSEC operational procedures say that signing keys must be redone every year to maintain the skill of doing so. Forgetting and needing to re-learn are the most expensive parts of PKI maintenance.
  5. Most Mobile Identity systems seem to use 3 to 5 year validity, because key re-generation and certificate re-registration are considered too much trouble for the user and too expensive to handle at service desks.

References

http://eprint.iacr.org/2002/115

Universal Padding Schemes for RSA

Shows that it is OK to use the same key for encryption and signing when the latter is done using PSS. It is probably OK with any padding other than null padding, but for PSS it can be proven in the mathematical sense.

http://eprint.iacr.org/2012/444

Factorization of a 1061-bit number by the Special Number Field Sieve

This is about 1061 bit Mersenne number factorization.

http://eprint.iacr.org/2013/644

Elliptic and Hyperelliptic Curves: a Practical Security Analysis

Establishes a comparable break-effort metric for 256-bit ECC keys.

http://eprint.iacr.org/2013/635

Universal security; from bits and mips to pools, lakes — and beyond

Tries to give a common baseline definition for security levels in terms of energy.

http://eprint.iacr.org/2013/599

Factoring RSA keys from certified smart cards: Coppersmith in the wild

A smart-card security processor with certifications from multiple sources was found to have a bad quality RNG, and that resulted in a complete failure of RSA key security.

http://eprint.iacr.org/2012/588

Breaking Public Keys — How to Determine an Unknown RSA Public Modulus

(Or: why keeping your public key secret just does not work.)

http://eprint.iacr.org/2010/006

Factorization of a 768-bit RSA modulus

https://factorable.net/paper.html

Mining Your Ps and Qs: Detection of Widespread Weak Keys in Network Devices

This attacks the easy loophole of bad RSA key generation: keys that are not random enough produce partial collisions far more frequently than birthday-collision statistics would predict.

http://www.ams.org/notices/199902/boneh.pdf

Twenty Years of Attacks on the RSA Cryptosystem

A review of attacks on RSA.

Wikipedia: Key Size

Relative strength notes, including quantum algorithm assumptions

Key length recommendations

BlueCrypt’s collection of Key Length recommendations from various sources, and length/strength comparisons