Understanding Cryptography with RSA
RSA is an asymmetric cryptographic algorithm that you are probably using right now to view this article over HTTPS. It was designed by Ron Rivest, Adi Shamir and Leonard Adleman, who developed the algorithm in 1977, naming it after the first initials of their last names. Unlike symmetric systems like AES that have a single key with which you can encrypt and decrypt some plaintext, RSA has two keys: a public key, which can be stored and shared publicly, and a private key, which must be kept secret.
Very simply, if Alice is trying to encrypt a message that only Bob can see, Alice would use Bob’s public key to encrypt the message, and Bob would use his private key to decrypt it. This type of cryptosystem, RSA in particular, is used very heavily in key exchanges, digital signing and many other applications that make your data safe for use on the internet. Some of you may have heard about SSH or TLS (HTTPS). Both of these protocols use RSA heavily along with symmetric cryptosystems to encrypt data in transit. The goal of this article is to show you that even with a high school level algebra background, you can understand the math behind cryptography. Hopefully this makes you more interested in mathematics and cryptography! So without further disclaimer, let’s get into some math. It’ll be fun I promise!
The first thing to understand is the concept of modular arithmetic, in particular the addition and multiplication of integers with some particular maximum and minimum. This is not as complicated as it sounds, in fact you probably do it every day when looking at a clock. Let’s say the time is 10am and I want to know what 4 hours from now is. We know to count 11, 12 then 1 and 2. This is because a clock only has 12 hours. In group theory, we call 12 the order and the group of integer numbers that can only go to 12 and no higher the group of integers modulo 12. You’ll often see operations in the following notation showing equivalence:
25 = 1 mod 12
You can read this as 25 is “congruent” (or equivalent) to 1 when in the group of integers modulo 12. To use our clock analogy, if you move the hour hand ahead 25 times from 0, you’re back at 1 o’ clock.
This can be re-written to be applicable to any number that the modulo divides. To put it a bit more mathematically, we’ll use the operator | to denote something is able to divide something else without a remainder. For example 12|24 because 24/12 is an integer with no remainder. For a general case, we can say
a = b mod n is the same as saying
a - b = kn for some integer k
For our example we can see this is true because:
25 - 1 = (2)(12) where k = 2
The Division Algorithm
Stick with me here, this section is a bit long, but it’s crucial to understand how we derive the RSA private key. Mathematically, we figured out that the result of the modulo operation above is 1 using what’s called the division algorithm, which states that any integer can be represented as
a = bq + r
25 = 12(2) + 1
You can see this in the second statement above where we represent 25 as 12(2) + 1. In this case the number to the right of the congruent operator, or 1 in this case, will be the remainder if you divide a by b. Now lets get a bit more general for a second. We can say that if a and b are integers, then there are two other integers out there, call them r and s, where the following equation is true:
gcd(a, b) = ar + bs, where gcd() is the greatest common divisor
We can see this is true in our example as well because the GCD of 12 and 25 is 1, we have 1 = 25(-1) + 12(2). So for a = 25 and b = 12, r = -1 and s = 2.
Do you remember back in grade school math where we had to find the GCD by listing out all the factors of two numbers? Well it turns out there’s a better way to do it that works numbers of any size. You may have heard of a Greek mathematician named Euclid. He’s famous for a lot of discoveries in number theory including the following algorithm for find the GCD of two numbers and it turns out we can also use the same algorithm in reverse to discover r and s in the equation above, which will give us the RSA Private Key.
The goal here is to learn the math behind deriving the RSA Private Key
Let’s say we want to find the GCD of two rather large numbers, for example: 616 and 487. It might take you a while to factor each of those out by hand, but Euclid’s algorithm can help. First we have to plug in the values from the first equation from above where a = 616 and b = 487. It just makes the math a bit easier if the larger number is a.
616 = 487(1) + 129
I’ll assume you know how to do long division, which is how we found that 487 divides into 616 one time with a remainder of 129. Now set the b to a and r to b and do it again and again until we get either a remainder of 0, then we’ve found our GCD. This will make more sense with the example. Now it’s very important that we keep track of the remainder. In this calculation, I’ll keep this value bolded to make it easier to follow.
1) 616 = 487(1) + 129
2) 487 = 129(3) + 100
3) 129 = 100(1) + 29
4) 100 = 29(3) + 13
5) 29 = 13(2) + 3
6) 13 = 3(4) + 1
7) 3 = 1(3) + 0 <-- Now we're done
We can see out of this that 1 is the final remainder before 0 and thus the GCD of 616 and 487. When two numbers have a GCD of 1, they are called relatively prime. This is an important property in RSA as we’ll see in a minute. Now what we really want out of this is not as much the final GCD value as much as the r and s above, which we’ll get by reversing this algorithm. This part is fun and takes a bit of basic number manipulation. Remember, we’re trying to get each of these equations in the form of
gcd(a, b) = ar + bs
We’ll start from the second to the bottom of our list of equations above setting the remainder on the left (which we know is the GCD), and doing a bit of algebra to move the numbers into place on the way up.
1 = 13 - 4(3)
= 13 - 4(29 - 13(2)) plugin step 5 for 3
= 13 - 4(29) + 8(13) distribute
= -4(29) + 9(13) combine
Let me stop here to make sure you understand what’s happening. We took the equation in step 6 from above and plugged in the value for 3 from equation above it (step 5). That is 3 = 29–13(2). What we’re going to try to do is keep the integrity of the remainders going up so at the end, we’ll see our original numbers 616 and 487 and their corresponding multipliers. This is why in the third line above I’m “keeping” the numbers 13 and 29 because those are both remainders. I then need to combine them before doing it again. At each level, you can check your work by plugging in the right side into the calculator to ensure it’s still equal to 1. So now that the first step is done, let’s continue on. Hopefully you’ll see a pattern here.
1 = 13 - 4(3) = 13 - 4(29 - 13(2)) plug in step 5 for 3
= 13 - 4(29) + 13(8) distribute, keep 13 and 29
= -4(29) + 9(13) combine = -4(29) + 9(100 - 29(3)) plug in step 4 for 13
= -4(29) + 9(100) - 27(29) distribute, keep 29 and 100
= 9(100) - 31(29) combine = 9(100) - 31(129 - 100(1)) plug in step 3 for 29
= 9(100) - 31(129) + 31(100) distribute keep 100 and 129
= -31(129) + 40(100) combine = -31(129) + 40(487 - 129(3)) plug in step 2 for 100
= -31(129) + 40(487) - 120(129) distribute, keep 129 and 487
= 40(487) - 151(129) combine
= 40(487) - 151(616 - 487(1)) plug in step 1 for 129
= 40(487) - 151(616) + 151(487) distribute, keep 616 and 487
= 191(487) - 151(616) combine and DONE!
I hope this process made sense and you can see how the final equation is:
gcd(a, b) = ar + bs
1 = (616)(-151) + (487)(191)
Remember these numbers because we’ll come back to them later.
Euler’s Totient Function
Prime numbers are incredibly important in cryptography for some of the properties around the GCD of two primes being equal to 1. Leonhard Euler discovered a function that gives you the exact number of primes from 1 to n that are relatively prime to n. In symbols the function is represented like so:
In plain English this is the repeated product of all distinct prime numbers that divide n up to n. We won’t dive into the specifics of this, because it’s not necessary for what we want to do. For RSA, we need this function to give us the number of relatively prime numbers for a prime number, which as it turns out is just:
φ(p) = (p - 1) where p is a prime number
Ahh that’s much simpler. In plain English all this means is that if you have a prime number p there are (p -1) numbers between 1 and p whose GCD is 1, or to use the proper term, relatively prime.
RSA Key Creation
Now that we have the necessary building blocks, let’s figure out how RSA works. First let’s generate our Public and Private key pair mathematically.
- Pick two prime numbers, p and q, and we’ll say n = pq
- Now we’ll say m = φ(n) = φ(pq) = (p-1)(q-1) from the section above
- Find two values D and E such that D*E = 1 mod m
Our public key will be the combo (n, E), and private key will be (n, D). Luckily, we’ve already done the math, let’s see how.
First we’ll pick prime numbers p = 23 and q = 29
n = pq = 667
m = φ(pq) = (23 - 1)(29 - 1) = (22)(28) = 616
We’ll “randomly” pick a value of E that is relatively prime to this value 616. Typically this number would be picked by a random number generator on a computer or Hardware Security Module (HSM) that uses something like that first Euclidean division algorithm we used to find the GCD behind the scenes to check that the random number it’s picking satisfies this criteria. If it finds one, it uses it, otherwise, it picks another and tries again. For this exercise let’s pick the number E = 487 since it’s relatively prime to 616. Starting to remember these numbers?
The next step is to find numbers D*E such that D*E = 1 mod m. If you remember all the way back to the modular arithmetic section where we talked about modulus able to be represented as a normal equation, we’ll do the same thing here.
D*E - 1 = km, where k is some integer such that k|m
We can rearrange this equation to get something more familiar:
D*E - km = 1 or D*E + (-k)m = 1
Now we have something that looks identical to the GCD equation we already solved for above:
ar + bs = 1
D*E + (-k)m = 1
From the above calculations where we also used 487 and 616, we can fill this in and determine which number is D. So if E = 487, and m = 616, in this equation, k = -151 and D = 191.
Now we have out Private Key (n, D) = (667, 191) and Public Key (n, E) = (667, 487). Let’s do some encryption and give this key pair to Bob to use in the next section.
The hard part is over, now it’s on to the fun stuff of using the keys to encrypt secret messages. Typically messages are text strings encoded as numbers but since those can get really big really fast, we’ll use something simple like lucky number 7 as our secret message, we’ll denote as m, to be encrypted. If Alice wants to encrypt m for only Bob to see and get a ciphertext, c, she would use his public key. Like so (Note that ^ indicates “raised to the power of”)
c = m ^ E mod n
c = 7 ^ 487 mod 667
c = 458
For what it’s worth, you can use Euler’s Theorem and Fermat’s Little Theorem to do that huge exponents like that by hand, but we won’t go into it here. I’ll leave that up to you for an exercise if you’re curious. You could also just open up python and type in the above expression using the modulo operator %.
Now to verify that what we did was correct. If we can reverse this operation and get 7, we’ve successfully done asymmetric encryption with RSA. The reverse process uses Bob’s private key (n, D) = (667, 191) and goes like this to find the plaintext, p.
p = c ^ D mod n
p = 458 ^ 191 mod 667
p = 7
Success! We can now derive RSA keys and use them to both encrypt and decrypt data. In the process we’ve learned a bit about number theory, group theory and cryptography. A practical use for this is in the TLS protocol you likely use every day on HTTPS websites such as this one, where RSA and other asymmetric algorithms are used to exchange the symmetric key used to encrypt your internet traffic so no one can steal your data. I hope you’ve enjoyed this bit of practical math and that you’ll keep on learning even when it’s hard!
Resources and Next Steps
As a final disclaimer, the math in this article is not representative of the actual implemented RSA algorithm, which has many other controls to make it secure. If you’re a developer, please do not use the methods described here to roll your own crypto, but merely as a way to understand the fundementals of the math behind RSA. If you do want to securely create RSA keys properly you should use vetted tools and libraries such as OpenSSL, GnuPG, or ssh-keygen. Finally, if this has made you more interested in cryptography, I’d suggest the following resources that have helped me: