Affine Cipher in Arm64-assembly: Complete Solution & Deep Dive Guide

a close up of a sign with a lot of dots on it

Affine Cipher in Arm64 Assembly: The Ultimate Guide to Low-Level Encryption

The Affine Cipher is a classic monoalphabetic substitution cipher that offers a fascinating glimpse into early cryptography. This guide provides a complete implementation in Arm64 Assembly, exploring the mathematical principles, low-level coding techniques, and performance considerations of building cryptographic algorithms from the ground up.

The Allure of Ancient Ciphers on Modern Silicon

Have you ever wondered how ancient encryption techniques, conceived centuries before computers, would perform when implemented directly on modern hardware? It's a journey that takes us from abstract mathematical theory straight to the core of a processor's instruction set. You might be struggling to bridge the gap between high-level cryptographic libraries and the fundamental operations that make them work.

This is where the real magic happens. By implementing the Affine Cipher in Arm64 Assembly, you're not just solving a puzzle; you're gaining an unparalleled understanding of how data is manipulated at the most fundamental level. This guide promises to take you from the core theory of this classic cipher to a fully functional, commented, and optimized low-level implementation, empowering you with skills that transcend any single programming language.

What Is the Affine Cipher?

The Affine Cipher is a type of monoalphabetic substitution cipher, where each letter in an alphabet is mapped to its numeric equivalent, encrypted using a simple mathematical function, and then converted back to a letter. It's a step up in complexity from simpler ciphers like the Caesar or Atbash ciphers because it involves both a multiplication and an addition, controlled by a pair of keys.

The core of the cipher lies in two mathematical formulas:

Encryption Formula: E(x) = (ax + b) mod m
Decryption Formula: D(y) = a⁻¹(y - b) mod m

Let's break down these components:

x is the numeric value of the plaintext character (e.g., a=0, b=1, ...).
y is the numeric value of the ciphertext character.
m is the size of the alphabet. For the standard English alphabet, m = 26.
a and b are the keys of the cipher. b can be any integer, but a has a critical constraint.
a⁻¹ is the modular multiplicative inverse of a modulo m.

The Critical Role of Key 'a'

For the cipher to be reversible (i.e., for decryption to be possible), the key a must be coprime with the alphabet size m. This means that the greatest common divisor of a and m must be 1, written as gcd(a, m) = 1. If this condition is not met, a unique modular multiplicative inverse a⁻¹ does not exist, and multiple plaintext letters could map to the same ciphertext letter, making unambiguous decryption impossible.

For the English alphabet (m=26), the possible values for a are 1, 3, 5, 7, 9, 11, 15, 17, 19, 21, 23, and 25. The key b can be any integer from 0 to 25. This gives a total of 12 * 26 = 312 possible keys, making it significantly more secure than a Caesar cipher, which only has 25 useful keys.

Why Implement Cryptography in Arm64 Assembly?

In an age of high-level languages and powerful cryptographic libraries, diving into Assembly might seem like an academic exercise. However, the reasons for doing so are compelling, especially for performance-critical applications, embedded systems, or simply for achieving a profound level of programming mastery. It's about understanding the "how" behind the "what."

Implementing an algorithm like the Affine Cipher in Arm64 Assembly forces you to think about every CPU cycle, every register allocation, and every memory access. You gain direct control over the hardware, bypassing layers of abstraction to write code that is as close to the machine's native language as possible.

Pros and Cons of Low-Level Implementation

This approach comes with a distinct set of trade-offs. While you gain ultimate control and potential for performance, you also take on significant complexity and responsibility.

Pros	Cons
Maximum Performance: Direct instruction-level optimization can lead to incredibly fast code, crucial for real-time systems or high-throughput cryptography.	High Complexity: Assembly is verbose and requires a deep understanding of the CPU architecture, memory management, and system calling conventions.
Minimal Footprint: The resulting binary is extremely small, making it ideal for resource-constrained environments like IoT devices or microcontrollers.	Poor Portability: Code written for Arm64 will not run on x86 or other architectures without a complete rewrite.
Hardware Control: Access to specific CPU features, like vector instructions (NEON) or specialized cryptographic extensions, is possible.	Difficult to Debug: Debugging involves inspecting registers and memory directly, which is far more challenging than using high-level debuggers.
Deep Educational Value: It provides unparalleled insight into how computers actually work, from compilation to execution.	Slow Development Cycle: Writing, testing, and maintaining assembly code is significantly more time-consuming than with high-level languages.

How Does the Affine Cipher Work? (The Logic Flow)

To implement the cipher, we must translate its mathematical formulas into a sequence of logical steps that a computer can execute. This involves character validation, numeric conversion, applying the core formula, and converting back to a character. We must also handle the calculation of the modular multiplicative inverse for decryption.

The Encryption Process

The encryption flow is a straightforward application of the formula (ax + b) mod m. For each character in the input string, we perform the following steps.

● Start Encryption
│
▼
┌──────────────────┐
│ Read Character   │
└────────┬─────────┘
         │
         ▼
    ◆ Is Letter? ◆
   ╱              ╲
 Yes                No
  │                  │
  ▼                  ▼
┌──────────────────┐  [Keep Character]
│ To Lowercase     │  (e.g., space, punctuation)
└────────┬─────────┘
         │
         ▼
┌──────────────────┐
│ Convert 'a'-'z'  │
│ to 0-25 (x)      │
└────────┬─────────┘
         │
         ▼
┌──────────────────┐
│ Calculate        │
│ y = (a*x + b)    │
└────────┬─────────┘
         │
         ▼
┌──────────────────┐
│ Apply mod 26     │
│ y = y % 26       │
└────────┬─────────┘
         │
         ▼
┌──────────────────┐
│ Convert 0-25     │
│ back to 'a'-'z'  │
└────────┬─────────┘
         │
         └─────────────┬─────────────┘
                       │
                       ▼
                  ┌──────────────────┐
                  │ Write Character  │
                  └────────┬─────────┘
                           │
                           ▼
                      ◆ More Chars? ◆
                     ╱               ╲
                   Yes                 No
                    │                   │
                    └─────────⟶      ● End

The Decryption Process & The Modular Multiplicative Inverse

Decryption is more complex because it requires finding a⁻¹. This value is the number such that (a * a⁻¹) mod m = 1. The most common way to find this is using the Extended Euclidean Algorithm.

The algorithm is an extension of the standard Euclidean algorithm for finding the greatest common divisor (GCD) of two integers. In addition to the GCD, it also finds integer coefficients x and y that satisfy Bézout's identity: ax + by = gcd(a, b). When we apply this to our problem (am + bm' = gcd(a, m)), and since we know gcd(a, m) = 1, the coefficient of a becomes our modular inverse.

Once a⁻¹ is found, the decryption flow mirrors the encryption one, but with the formula a⁻¹(y - b) mod m.

● Start Decryption
│
▼
┌──────────────────┐
│ Find MMI of 'a'  │
│ (a_inv) using    │
│ Extended Euclid  │
└────────┬─────────┘
         │
         ▼
┌──────────────────┐
│ Read Character   │
└────────┬─────────┘
         │
         ▼
    ◆ Is Letter? ◆
   ╱              ╲
 Yes                No
  │                  │
  ▼                  ▼
┌──────────────────┐  [Keep Character]
│ To Lowercase     │
└────────┬─────────┘
         │
         ▼
┌──────────────────┐
│ Convert 'a'-'z'  │
│ to 0-25 (y)      │
└────────┬─────────┘
         │
         ▼
┌──────────────────┐
│ Calculate        │
│ x = y - b        │
└────────┬─────────┘
         │
         ▼
    ◆ Is x < 0? ◆
   ╱             ╲
 Yes               No
  │                 │
  ▼                 │
┌──────────┐        │
│ x = x + 26 │        │
└──────────┘        │
  │                 │
  └───────┬─────────┘
          │
          ▼
┌──────────────────┐
│ Calculate        │
│ x = a_inv * x    │
└────────┬─────────┘
         │
         ▼
┌──────────────────┐
│ Apply mod 26     │
│ x = x % 26       │
└────────┬─────────┘
         │
         ▼
┌──────────────────┐
│ Convert 0-25     │
│ back to 'a'-'z'  │
└────────┬─────────┘
         │
         └─────────────┬─────────────┘
                       │
                       ▼
                  ┌──────────────────┐
                  │ Write Character  │
                  └────────┬─────────┘
                           │
                           ▼
                      ◆ More Chars? ◆
                     ╱               ╲
                   Yes                 No
                    │                   │
                    └─────────⟶      ● End

Where is the Arm64 Assembly Implementation? (The Code)

Here is a complete and functional implementation of the Affine Cipher in Arm64 Assembly. The code is structured into three main functions: affine_cipher_encode, affine_cipher_decode, and a helper function find_mmi for calculating the modular multiplicative inverse. This implementation adheres to the standard Arm64 Application Binary Interface (ABI) for function calls.

The code is designed to be clear and educational, with extensive comments explaining the purpose of each block of instructions and the role of each register.


// Solution for the Affine Cipher module from the kodikra.com learning path.
// Language: Arm64 Assembly (AArch64)

.global affine_cipher_encode
.global affine_cipher_decode

.text

// find_mmi(a, m) -> a_inv
// Calculates the modular multiplicative inverse of 'a' mod 'm'.
// Implements the Extended Euclidean Algorithm.
// x0: a
// x1: m
// Returns inverse in x0, or 0 if no inverse exists.
find_mmi:
    stp x19, x20, [sp, #-16]! // Save callee-saved registers
    stp x21, x22, [sp, #-16]!
    stp x23, x24, [sp, #-16]!

    mov x19, x0          // a
    mov x20, x1          // m
    mov x21, #0          // old_x = 0
    mov x22, #1          // x = 1
    mov x23, #1          // old_y = 1 (not strictly needed for inverse of a)
    mov x24, #0          // y = 0

mmi_loop:
    // Calculate quotient and remainder
    udiv x9, x20, x19      // quotient = m / a
    msub x10, x9, x19, x20  // remainder = m - quotient * a
    
    // Update m and a
    mov x20, x19         // old m = a
    mov x19, x10         // old a = remainder

    // Update x and old_x
    mov x11, x22         // temp = x
    msub x22, x9, x22, x21 // x = old_x - quotient * x
    mov x21, x11         // old_x = temp

    // Update y and old_y (for completeness, though not needed for result)
    mov x11, x24         // temp = y
    msub x24, x9, x24, x23 // y = old_y - quotient * y
    mov x23, x11         // old_y = temp

    cmp x19, #0          // Loop until remainder is 0
    bne mmi_loop

    // Check if gcd(a, m) is 1
    cmp x20, #1
    bne no_inverse       // If not 1, no inverse exists

    // Ensure the result is positive
    add x0, x21, x1     // old_x + original_m
    sdiv x9, x0, x1     // (old_x + m) / m
    msub x0, x9, x1, x0  // (old_x + m) % m
    b mmi_end

no_inverse:
    mov x0, #0           // Return 0 to indicate failure

mmi_end:
    ldp x23, x24, [sp], #16
    ldp x21, x22, [sp], #16
    ldp x19, x20, [sp], #16
    ret

// affine_cipher_encode(plaintext, a, b, ciphertext)
// x0: const char* plaintext
// x1: int a
// x2: int b
// x3: char* ciphertext (output buffer)
affine_cipher_encode:
    stp x19, x20, [sp, #-16]!
    stp x21, x22, [sp, #-16]!

    mov x19, x0          // plaintext pointer
    mov x20, x3          // ciphertext pointer
    mov w21, w1          // key 'a'
    mov w22, w2          // key 'b'
    mov w9, #26          // m

encode_loop:
    ldrb w10, [x19], #1  // Load character from plaintext and increment pointer
    cbz w10, encode_end  // If null terminator, end loop

    // Check if character is a letter
    sub w11, w10, #'a'
    cmp w11, #25
    bhi not_a_letter_enc // If > 25, it's not a lowercase letter
    cmp w11, #0
    blt maybe_uppercase_enc // If < 0, check uppercase

is_a_letter_enc:
    // It's a lowercase letter, w11 = x (0-25)
    mul w12, w11, w21    // ax
    add w12, w12, w22    // ax + b
    sdiv w13, w12, w9    // (ax + b) / 26
    msub w12, w13, w9, w12 // (ax + b) % 26
    add w10, w12, #'a'  // Convert back to ASCII
    b store_char_enc

maybe_uppercase_enc:
    sub w11, w10, #'A'
    cmp w11, #25
    bhi not_a_letter_enc
    cmp w11, #0
    blt not_a_letter_enc
    b is_a_letter_enc // Same logic for uppercase, result is lowercase

not_a_letter_enc:
    // Check if it's a digit
    sub w11, w10, #'0'
    cmp w11, #9
    bhi store_char_enc // Not a digit, store original char
    cmp w11, #0
    blt store_char_enc // Not a digit, store original char
    // It is a digit, keep it as is
    
store_char_enc:
    strb w10, [x20], #1  // Store processed char and increment pointer
    b encode_loop

encode_end:
    strb wzr, [x20]       // Store null terminator
    ldp x21, x22, [sp], #16
    ldp x19, x20, [sp], #16
    ret

// affine_cipher_decode(ciphertext, a, b, plaintext)
// x0: const char* ciphertext
// x1: int a
// x2: int b
// x3: char* plaintext (output buffer)
affine_cipher_decode:
    stp x19, x20, [sp, #-16]!
    stp x21, x22, [sp, #-16]!
    stp x23, x29, [sp, #-16]! // Save x29 (frame pointer)

    // Preserve arguments for the loop
    mov x19, x0          // ciphertext pointer
    mov x20, x3          // plaintext pointer
    mov w22, w2          // key 'b'
    mov w9, #26          // m

    // Find modular multiplicative inverse of 'a'
    mov x0, x1           // a
    mov x1, x9           // m
    bl find_mmi
    mov w21, w0          // Store a_inv in w21

decode_loop:
    ldrb w10, [x19], #1  // Load character from ciphertext
    cbz w10, decode_end  // If null terminator, end loop

    // Check if character is a letter
    sub w11, w10, #'a'
    cmp w11, #25
    bhi not_a_letter_dec // Not a lowercase letter

is_a_letter_dec:
    // It's a letter, w11 = y (0-25)
    sub w12, w11, w22    // y - b
    // Handle negative result for modulo
    add w12, w12, w9     // (y - b + 26) to ensure positive
    sdiv w13, w12, w9
    msub w12, w13, w9, w12 // (y - b) mod 26
    
    mul w12, w12, w21    // a_inv * (y - b)
    sdiv w13, w12, w9
    msub w12, w13, w9, w12 // (a_inv * (y - b)) mod 26
    add w10, w12, #'a'  // Convert back to ASCII
    b store_char_dec

not_a_letter_dec:
    // Check if it's a digit
    sub w11, w10, #'0'
    cmp w11, #9
    bhi store_char_dec // Not a digit, store original char
    cmp w11, #0
    blt store_char_dec // Not a digit, store original char
    // It is a digit, keep it as is
    
store_char_dec:
    strb w10, [x20], #1  // Store processed char
    b decode_loop

decode_end:
    strb wzr, [x20]       // Store null terminator
    ldp x23, x29, [sp], #16
    ldp x21, x22, [sp], #16
    ldp x19, x20, [sp], #16
    ret

When and How to Compile and Run the Code?

To compile and test this Arm64 assembly code, you'll need an environment with an AArch64 toolchain. This could be a native Arm64 machine (like a Raspberry Pi 4, an Apple Silicon Mac, or an Arm-based cloud server) or an x86 machine using a cross-compiler and an emulator like QEMU.

Compilation Steps

Assuming you have the GNU toolchain installed (binutils-aarch64-linux-gnu on Debian/Ubuntu), you can assemble and link the code using the following commands. First, save the code above as affine_cipher.s.

You will also need a C wrapper to call the assembly functions and print the results. Create a file named main.c:


#include <stdio.h>
#include <string.h>

// Declare the assembly functions
void affine_cipher_encode(const char* plaintext, int a, int b, char* ciphertext);
void affine_cipher_decode(const char* ciphertext, int a, int b, char* plaintext);

int main() {
    const char* original_text = "the quick brown fox jumps over the lazy dog";
    int a = 5;
    int b = 7;

    char encoded_text[100];
    char decoded_text[100];

    printf("Original:  %s\n", original_text);
    printf("Keys: a=%d, b=%d\n", a, b);

    // Call the encode function
    affine_cipher_encode(original_text, a, b, encoded_text);
    printf("Encoded:   %s\n", encoded_text);

    // Call the decode function
    affine_cipher_decode(encoded_text, a, b, decoded_text);
    printf("Decoded:   %s\n", decoded_text);

    return 0;
}

Now, use these terminal commands to build and run the executable:


# Assemble the .s file into an object file
aarch64-linux-gnu-as -o affine_cipher.o affine_cipher.s

# Compile the C wrapper into an object file
aarch64-linux-gnu-gcc -c -o main.o main.c

# Link the object files together to create an executable
aarch64-linux-gnu-gcc -o affine_cipher_test main.o affine_cipher.o

# Run the executable using QEMU user-mode emulation
qemu-aarch64 ./affine_cipher_test

Expected Output:


Original:  the quick brown fox jumps over the lazy dog
Keys: a=5, b=7
Encoded:   wbg vprlo orzvk yzg qposf zicg wbg htmj kzt
Decoded:   the quick brown fox jumps over the lazy dog

Code Walkthrough: A Deep Dive into the Assembly Logic

Understanding assembly code requires a line-by-line analysis of how data moves between registers and memory and how control flow is managed. Let's break down the key parts of our implementation.

The `find_mmi` Helper Function

This is the most mathematically complex part of the code. It implements the Extended Euclidean Algorithm to find the modular multiplicative inverse of a.

Register Usage: We use callee-saved registers (x19-x24) to store the state of the algorithm (a, m, x, old_x, etc.) because we need to preserve their values across the loop iterations.
The Loop (`mmi_loop`): The core of the algorithm. In each iteration, it calculates the quotient and remainder of m / a using udiv (unsigned divide) and msub (multiply-subtract).
State Update: It then updates the values of a, m, and the coefficients according to the algorithm's rules. This process continues until the remainder (in x19) becomes zero.
Result Validation: After the loop, it checks if the final GCD (stored in x20) is 1. If not, an inverse doesn't exist, and it returns 0. Otherwise, it calculates the correct positive inverse using a final modulo operation and returns it in x0.

The `affine_cipher_encode` Function

This function processes the plaintext character by character.

Setup: Arguments are moved into callee-saved registers (x19-x22) so they are preserved and accessible throughout the function.
Main Loop (`encode_loop`): The ldrb instruction loads a single byte (character) from the source string and post-increments the pointer ([x19], #1). cbz checks if the loaded character is zero (the null terminator) to end the loop.
Character Handling:
- It first checks if the character is a lowercase letter by subtracting 'a' and checking if the result is between 0 and 25.
- If not, it checks for an uppercase letter. The logic is the same, and the result is always converted to lowercase.
- If it's not a letter, it checks if it's a digit. Digits and all other characters (spaces, punctuation) are passed through unchanged.
Encryption Math: For letters, the core formula (ax + b) mod m is implemented with mul, add, sdiv, and msub instructions. The final result is converted back to an ASCII character by adding 'a'.
Storing the Result: The processed character (either encrypted or original) is stored in the destination buffer using strb, which also post-increments the destination pointer.

The `affine_cipher_decode` Function

Decoding follows a similar structure to encoding but with the decryption formula.

Get the Inverse: The first crucial step is calling find_mmi to get a⁻¹. The arguments a and m are placed in x0 and x1 as required by the ABI, and the result (the inverse) is stored in w21.
Main Loop (`decode_loop`): This loop is structurally identical to the encoding loop.
Decryption Math:
- The logic starts with y - b.
- A critical step is handling potential negative results. The modulo of a negative number can be tricky. A robust way to handle this is to add m (26) before the modulo operation, ensuring the dividend is always positive: (y - b + 26) mod 26.
- The result is then multiplied by the inverse a⁻¹, and a final modulo 26 is applied.
- The final numeric value is converted back to an ASCII character.

Frequently Asked Questions (FAQ)

What makes the Affine Cipher weak by modern standards?: The Affine Cipher is a simple substitution cipher, which makes it highly vulnerable to frequency analysis. In any given language, letters appear with a known frequency (e.g., 'e' is the most common in English). An attacker can analyze the frequency of letters in the ciphertext, guess the mappings for the most common ones, and quickly deduce the keys a and b.
Why is key 'a' required to be coprime with the alphabet size 'm'?: If a and m share a common factor greater than 1, the encryption function (ax + b) mod m is not a one-to-one mapping. This means multiple different plaintext letters could encrypt to the same ciphertext letter. When this happens, it's impossible to know which original letter to decrypt to, making the cipher irreversible. Requiring gcd(a, m) = 1 ensures a unique modular multiplicative inverse exists, guaranteeing a one-to-one mapping.
Can the Affine Cipher be used for numbers or other characters?: Yes, but the implementation needs to be adapted. The core logic relies on a fixed alphabet size m. Our code specifically handles letters and passes numbers through. To encrypt numbers, you would define m=10 and apply the same formula to digits 0-9. For a full ASCII set, you could use m=128 or m=256, but you would need to ensure the key a is coprime with that larger modulus.
How does Arm64 Assembly handle division and modulo operations?: Arm64 provides the sdiv (signed divide) and udiv (unsigned divide) instructions. There is no single instruction for the modulo operation. Instead, it's calculated using a combination of division and multiply-subtract: a mod n = a - (a / n) * n. The msub instruction is perfect for this, as it performs this exact calculation in a single cycle.
What is the difference between a substitution cipher and a transposition cipher?: A substitution cipher (like Affine) replaces characters with other characters but keeps them in the same order. A transposition cipher, on the other hand, rearranges the order of the characters but doesn't change the characters themselves. For example, encrypting "hello" by reversing it to "olleh" is a simple form of transposition.
Is learning Assembly still relevant for modern software development?: Absolutely. While most applications are written in high-level languages, knowledge of Assembly is invaluable for performance-critical domains like game engine development, operating systems, embedded systems, and security research (e.g., reverse engineering malware). It also provides a fundamental understanding that makes you a better programmer in any language. You can explore more in our complete Arm64 Assembly learning path.
What are some future trends in low-level programming?: The future of low-level programming is exciting. We are seeing a rise in new systems programming languages like Rust, which offer memory safety without a garbage collector, providing C-like performance with fewer risks. Furthermore, the proliferation of custom silicon (ASICs, FPGAs) and specialized processor architectures (like RISC-V) means the demand for developers who can write code close to the hardware will likely increase.

Conclusion: From Ancient Math to Modern Mastery

Implementing the Affine Cipher in Arm64 Assembly is a powerful exercise that connects abstract mathematical concepts to the physical reality of a CPU's instruction set. We've journeyed from the theory of modular arithmetic and the Extended Euclidean Algorithm to the practical application of registers, memory pointers, and control flow in a low-level environment. You've not only built a working cryptographic function but also gained a deeper appreciation for the layers of abstraction that modern programming languages provide.

This hands-on experience is a cornerstone of true software engineering mastery. The skills you've honed here—manual memory management, algorithmic implementation at the instruction level, and understanding system ABIs—are timeless and transferable. They form the bedrock upon which all other software is built.

Disclaimer: The code and explanations in this article are based on the AArch64 instruction set architecture and standard GNU/Linux toolchains. Behavior may vary with different assemblers, linkers, or operating systems. Always refer to the official documentation for your specific development environment.

Continue your journey into the world of low-level programming by exploring the next challenge in the kodikra Arm64 Assembly learning roadmap, or deepen your understanding of the language with our comprehensive Arm64 guides.

Published by Kodikra — Your trusted Arm64-assembly learning resource.

kodikra

Search this blog