All Your Base in Arm64-assembly: Complete Solution & Deep Dive Guide


From Binary to Hex: A Deep Dive into Base Conversion with Arm64 Assembly

Mastering number base conversion is a fundamental skill in low-level programming. This guide provides a comprehensive walkthrough of the "All Your Base" problem using Arm64 assembly, covering the core algorithm, a detailed code implementation, and its practical applications in modern computing systems.


You've just started a new role as an embedded systems engineer. Your first task is to interface with a sensor that outputs its readings as a stream of binary digits. Meanwhile, the main control unit expects these values in a standard decimal format for logging, and the debugging interface needs to display them in hexadecimal. You're caught between different numerical worlds, and a simple misinterpretation could lead to system failure. This isn't just a theoretical math problem—it's a daily reality in low-level development.

This is where the art of base conversion becomes a critical tool in your arsenal. The ability to translate a number from one base to another is essential for data interoperability, debugging, and direct hardware manipulation. In this comprehensive guide, we will dissect the logic behind base conversion and implement a robust solution from scratch in Arm64 assembly, the language that powers everything from your smartphone to massive data centers. We'll turn this seemingly complex challenge into a clear, manageable process, empowering you to handle any numerical representation with confidence.


What Is Number Base Conversion?

At its core, number base conversion is the process of representing the same quantity using a different set of symbols and positional rules. The "base" or "radix" of a number system defines how many unique digits are used to represent numbers. We are most familiar with base-10 (decimal), which uses ten digits (0-9).

The value of any number is determined by its digits and their positions. In a base-$b$ system, a number written with the digits $d_n d_{n-1} \dots d_1 d_0$ has a value given by the polynomial:

$$\text{Value} = (d_n \times b^n) + (d_{n-1} \times b^{n-1}) + \dots + (d_1 \times b^1) + (d_0 \times b^0)$$

Let's see this in action:

  • Base-10 (Decimal): The number 123 means $(1 \times 10^2) + (2 \times 10^1) + (3 \times 10^0) = 100 + 20 + 3 = 123$.
  • Base-2 (Binary): The number 1101 means $(1 \times 2^3) + (1 \times 2^2) + (0 \times 2^1) + (1 \times 2^0) = 8 + 4 + 0 + 1 = 13$ (in decimal).
  • Base-16 (Hexadecimal): The number A9 (where A=10) means $(10 \times 16^1) + (9 \times 16^0) = 160 + 9 = 169$ (in decimal).
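
The three examples above can be reproduced with a short C sketch that evaluates the polynomial term by term. The function name `positional_value` and the choice of `uint32_t` digits are assumptions made for illustration; nothing here comes from the assembly solution, and there is no overflow checking.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: evaluates d_n*b^n + ... + d_1*b^1 + d_0*b^0 directly.
 * digits[0] is the most significant digit. No overflow checking is done. */
uint64_t positional_value(const uint32_t *digits, size_t count, uint32_t base)
{
    assert(base >= 2);                    /* a positional base needs at least two symbols */

    uint64_t value = 0;
    uint64_t weight = 1;                  /* b^0, the weight of the rightmost digit */

    for (size_t i = count; i > 0; i--) {  /* walk from the rightmost digit to the leftmost */
        value += digits[i - 1] * weight;  /* add digit * b^position */
        weight *= base;                   /* move on to the next power of the base */
    }
    return value;
}
```

Calling it with the digits {1, 1, 0, 1} and base 2, or {10, 9} and base 16, gives the same sums as the worked bullets.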

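The fundamental strategy for converting between any two arbitrary bases (e.g., from base-3 to base-7) is to go through a single intermediate value, conventionally described as base-10. In code, that intermediate is simply an integer held in a register; calling it base-10 reflects how we would write it down. The process is always two-phased: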

  1. Convert the number from its source base into its base-10 equivalent.
  2. Convert the resulting base-10 number into the target base.

Why Is This Crucial in Arm64 Assembly?

In high-level languages like Python or Java, base conversion is often handled by built-in functions. However, in assembly language, you are working directly with the metal. Understanding and implementing this logic is not an academic exercise; it's a practical necessity for several reasons:

  • Hardware Interfacing: Peripherals, sensors, and network interfaces often communicate using binary or hexadecimal data. Your code must be able to correctly interpret this raw data.
  • Data Representation: Memory addresses are almost always viewed in hexadecimal. Color values are often represented as RGB hex triplets (e.g., #FF0000 for red). File permissions in Unix-like systems are represented in octal (base-8).
  • Performance Optimization: Hand-written conversion routines can help in tight loops or in freestanding environments where no standard library is available, although any gain over compiler-generated code should be measured rather than assumed.
  • Debugging: When inspecting memory dumps or register values in a debugger like GDB, you're looking at raw binary or hex values. Being able to mentally convert these to decimal is an invaluable skill for understanding program state.

Mastering this concept within the Arm64 assembly learning path provides a foundational understanding of how computers truly represent and manipulate data.


How Does the Conversion Algorithm Work?

As mentioned, our strategy involves a two-phase conversion using base-10 as a bridge. Let's detail the algorithm for each phase.

Phase 1: Converting from Input Base to Base-10

To convert a sequence of digits from any base b_in to base-10, we can process the digits from left to right. This method is an efficient implementation of polynomial evaluation known as Horner's method.

The algorithm is as follows:

  1. Initialize an accumulator (our base-10 value) to 0.
  2. Iterate through each digit of the input number, from left to right (most significant to least significant).
  3. For each digit, multiply the current accumulator value by b_in.
  4. Add the value of the current digit to the accumulator.
  5. After processing all digits, the accumulator will hold the final base-10 value.

Example: Convert 413 (base-5) to base-10.

  • Start with accumulator = 0.
  • First digit is 4: accumulator = (0 * 5) + 4 = 4.
  • Second digit is 1: accumulator = (4 * 5) + 1 = 21.
  • Third digit is 3: accumulator = (21 * 5) + 3 = 105 + 3 = 108.

The base-10 representation of 413 (base-5) is 108.

    ● Start (Input Digits, Input Base)
    │
    ▼
  ┌─────────────────┐
  │ Initialize      │
  │ value = 0       │
  └────────┬────────┘
           │
           ▼
  ┌─────────────────┐
  │ Loop Each Digit │
  └────────┬────────┘
           │
           ▼
  ┌─────────────────────────┐
  │ value = value * input_base │
  └────────────┬────────────┘
               │
               ▼
  ┌─────────────────────────┐
  │ value = value + current_digit │
  └────────────┬────────────┘
               │
               ▼
      ◆ More Digits? ◆
      ╱              ╲
    Yes              No
     │                │
     └────────────────┤
                      │
                      ▼
                 ● End (Base-10 Value)
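
As a rough C sketch of the Phase 1 steps, assuming `uint32_t` digits stored most significant first, the loop below applies Horner's method. The name `to_value` is hypothetical and the sketch does not guard against 64-bit overflow.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: Horner's method, left to right.
 * Each step computes value = value * in_base + digit. */
uint64_t to_value(const uint32_t *digits, size_t count, uint32_t in_base)
{
    assert(in_base >= 2);

    uint64_t value = 0;                       /* step 1: accumulator starts at 0 */
    for (size_t i = 0; i < count; i++) {      /* step 2: most significant digit first */
        assert(digits[i] < in_base);          /* a digit must be smaller than its base */
        value = value * in_base + digits[i];  /* steps 3 and 4 */
    }
    return value;                             /* step 5 */
}
```

With the digits {4, 1, 3} and base 5, the accumulator passes through 4 and 21 before ending at 108, matching the worked example.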

Phase 2: Converting from Base-10 to Output Base

To convert a base-10 number to a target base b_out, we use repeated division and modulo operations. The remainders of these divisions, when read in reverse order, form the digits of the number in the new base.

The algorithm is as follows:

  1. Start with the base-10 number to be converted.
  2. If the number is 0, the result is simply the digit 0.
  3. While the number is greater than 0:
    1. Calculate the remainder of the number divided by b_out. This remainder is the next digit (least significant first).
    2. Store this digit.
    3. Update the number by performing an integer division of the number by b_out.
  4. The stored digits are in reverse order. Reverse them to get the final correct representation.

Example: Convert 108 (base-10) to base-8 (octal).

  • 108 % 8 = 4 (remainder), 108 / 8 = 13 (quotient). Digit is 4.
  • 13 % 8 = 5 (remainder), 13 / 8 = 1 (quotient). Digit is 5.
  • 1 % 8 = 1 (remainder), 1 / 8 = 0 (quotient). Digit is 1.

The digits generated are 4, 5, 1. Reading them in reverse order gives 154. So, 108 (base-10) is 154 (base-8).

    ● Start (Base-10 Value, Output Base)
    │
    ▼
  ┌───────────────────┐
  │ Loop while value > 0 │
  └─────────┬─────────┘
            │
            ▼
  ┌───────────────────────────┐
  │ digit = value % output_base │
  └─────────────┬─────────────┘
                │
                ▼
  ┌───────────────────────────┐
  │ value = value / output_base │
  └─────────────┬─────────────┘
                │
                ▼
      ┌─────────────────┐
      │ Store digit     │
      └───────┬─────────┘
              │
              ▼
      ◆ value > 0? ◆
      ╱            ╲
    Yes             No
     │               │
     └───────────────┤
                     │
                     ▼
           ┌────────────────┐
           │ Reverse Digits │
           └────────────────┘
                     │
                     ▼
                ● End (Output Digits)
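
The Phase 2 steps can be sketched in C along the same lines. The name `from_value` is hypothetical, and the caller is assumed to supply a buffer large enough for the result (64 entries cover any 64-bit value in base 2). A do-while loop is used here so that zero naturally produces the single digit 0.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: writes the digits of value in out_base to out,
 * most significant first, and returns how many digits were written. */
size_t from_value(uint64_t value, uint32_t out_base, uint32_t *out)
{
    assert(out_base >= 2);

    size_t n = 0;
    do {
        out[n++] = (uint32_t)(value % out_base);  /* remainder is the next digit, least significant first */
        value /= out_base;                        /* integer division drops that digit */
    } while (value > 0);

    /* The digits were produced in reverse order, so swap them in place. */
    for (size_t lo = 0, hi = n - 1; lo < hi; lo++, hi--) {
        uint32_t tmp = out[lo];
        out[lo] = out[hi];
        out[hi] = tmp;
    }
    return n;
}
```

For 108 and base 8 the loop stores 4, 5, 1 and the reversal leaves 1, 5, 4 in the buffer.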

Where Is the Logic Implemented? A Detailed Arm64 Code Walkthrough

Now, let's translate this two-phase algorithm into Arm64 assembly code. This implementation follows the Procedure Call Standard for the Arm 64-bit Architecture (AAPCS64), which defines how arguments are passed in registers and how the stack is managed.

The function signature in a C-like notation would be:


/*
 * extern int rebase(
 *     int32_t in_base,         // w0
 *     const int32_t* in_digits, // x1
 *     int in_digit_count,      // w2
 *     int32_t out_base,        // w3
 *     int32_t* out_digits       // x4
 * );
 *
 * Returns: number of output digits, or a negative error code.
 */

Assembling and Linking

To compile and run this code, you would typically write a C wrapper to call the assembly function. You can assemble and link it with the GNU toolchain on an AArch64 Linux system (on another host you would need a cross toolchain targeting AArch64):


# Assemble the assembly source file
as rebase.s -o rebase.o

# Compile the C wrapper (e.g., main.c)
gcc main.c -c -o main.o

# Link them together to create the final executable
gcc main.o rebase.o -o rebase_app

# Run the application
./rebase_app

The Complete Assembly Code

Here is the full, commented implementation. We will break it down section by section.


.equ BAD_BASE, -1
.equ BAD_DIGIT, -2

.text
.globl rebase

/*
 * extern int rebase(
 *     int32_t in_base,         // w0
 *     const int32_t* in_digits, // x1
 *     int in_digit_count,      // w2
 *     int32_t out_base,        // w3
 *     int32_t* out_digits       // x4
 * );
 */
rebase:
    // Function Prologue: Save callee-saved registers
    stp x19, x20, [sp, #-32]!
    stp x21, x22, [sp, #16]

    // --- Input Validation ---
    cmp w0, #2          // Check if in_base >= 2
    blt .bad_base
    cmp w3, #2          // Check if out_base >= 2
    blt .bad_base

    // Move arguments to callee-saved registers to preserve them
    mov w19, w0         // x19 = in_base
    mov x20, x1         // x20 = in_digits pointer
    mov w21, w2         // x21 = in_digit_count
    mov w22, w3         // x22 = out_base

    // --- Phase 1: Convert from Input Base to Base-10 ---
    mov x5, #0          // x5 = base-10 value accumulator, initialize to 0
    mov w6, #0          // w6 = loop counter i = 0

.from_base_loop:
    cmp w6, w21         // while (i < in_digit_count)
    b.ge .to_base_start // Exit loop if i >= count

    ldr w7, [x20, w6, sxtw 2] // Load digit: w7 = in_digits[i] (sxtw 2 scales by 4 bytes)
    
    // Validate digit
    cmp w7, #0          // if (digit < 0)
    blt .bad_digit
    cmp w7, w19         // if (digit >= in_base)
    b.ge .bad_digit

    // Accumulate value: value = (value * in_base) + digit
    mul x5, x5, x19
    add x5, x5, x7, sxtx // Add digit; ldr w7 already zeroed the upper 32 bits of x7

    add w6, w6, #1      // i++
    b .from_base_loop

.to_base_start:
    // --- Phase 2: Convert from Base-10 to Output Base ---
    mov x7, x4          // x7 = current output pointer
    mov w8, #0          // w8 = output digit count

    // Handle special case: input value is 0
    cmp x5, #0
    b.ne .to_base_loop
    // If value is 0, store a single 0 digit and return 1
    mov w9, #0
    str w9, [x7]
    mov w0, #1
    b .exit

.to_base_loop:
    cmp x5, #0          // while (value > 0)
    b.eq .reverse_digits

    // Perform division: value / out_base
    udiv x9, x5, x22    // x9 = quotient = value / out_base
    // Calculate remainder: value % out_base
    // remainder = value - (quotient * out_base)
    msub x10, x9, x22, x5 // x10 = remainder
    
    // Store the digit (remainder)
    str w10, [x7], #4   // Store w10 at [x7] and post-increment x7 by 4
    
    mov x5, x9          // Update value = quotient
    add w8, w8, #1      // Increment output digit count
    b .to_base_loop

.reverse_digits:
    // The digits were stored in reverse order. Now, reverse them in-place.
    mov x9, x4          // x9 = pointer to start (left)
    sub x10, x7, #4     // x10 = pointer to end (right)

.reverse_loop:
    cmp x9, x10         // while (left < right)
    b.ge .done

    ldr w11, [x9]       // Load left digit
    ldr w12, [x10]      // Load right digit
    str w12, [x9], #4   // Store right at left, left++
    str w11, [x10], #-4 // Store left at right, right--
    b .reverse_loop

.done:
    mov w0, w8          // Return value is the output digit count
    b .exit

.bad_base:
    mov w0, BAD_BASE
    b .exit             // sp was already lowered by the prologue, so take the normal epilogue

.bad_digit:
    mov w0, BAD_DIGIT
    // Fallthrough to .exit

.exit:
    // Function Epilogue: Restore callee-saved registers and release the stack frame
    ldp x21, x22, [sp, #16]
    ldp x19, x20, [sp], #32
    ret
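
For cross-checking the listing, it can help to have a plain C model with the same signature and error codes. The sketch below is only an illustrative reference, not part of the original solution: the name `rebase_ref` is hypothetical, and like the assembly it makes no attempt to detect 64-bit overflow.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { BAD_BASE = -1, BAD_DIGIT = -2 };  /* same values as the .equ constants */

/* Hypothetical C model of rebase: returns the output digit count or an error code. */
int rebase_ref(int32_t in_base, const int32_t *in_digits, int in_digit_count,
               int32_t out_base, int32_t *out_digits)
{
    assert(out_digits != NULL);

    if (in_base < 2 || out_base < 2)
        return BAD_BASE;

    /* Phase 1: accumulate the value with Horner's method. */
    uint64_t value = 0;
    for (int i = 0; i < in_digit_count; i++) {
        int32_t d = in_digits[i];
        if (d < 0 || d >= in_base)
            return BAD_DIGIT;
        value = value * (uint64_t)in_base + (uint64_t)d;
    }

    /* Special case: zero is represented by a single 0 digit. */
    if (value == 0) {
        out_digits[0] = 0;
        return 1;
    }

    /* Phase 2: repeated division, least significant digit first. */
    int count = 0;
    while (value > 0) {
        out_digits[count++] = (int32_t)(value % (uint64_t)out_base);
        value /= (uint64_t)out_base;
    }

    /* Reverse in place so the most significant digit comes first. */
    for (int lo = 0, hi = count - 1; lo < hi; lo++, hi--) {
        int32_t tmp = out_digits[lo];
        out_digits[lo] = out_digits[hi];
        out_digits[hi] = tmp;
    }
    return count;
}
```

A test harness could call both `rebase` and this model with the same inputs and compare the return value and output buffer.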

Code Breakdown

1. Prologue and Validation


rebase:
    stp x19, x20, [sp, #-32]!
    stp x21, x22, [sp, #16]

    cmp w0, #2
    blt .bad_base
    cmp w3, #2
    blt .bad_base

    mov w19, w0
    mov x20, x1
    mov w21, w2
    mov w22, w3
  • stp: "Store Pair" saves the registers x19-x22 to the stack. These are callee-saved registers, meaning our function must preserve their original values for the caller. We pre-decrement the stack pointer (sp) by 32 bytes to make space.
  • cmp w0, #2 and cmp w3, #2: These instructions validate the input and output bases. A valid base must be 2 or greater. If not, we branch (blt - Branch if Less Than) to the .bad_base error handler.
  • mov ...: We copy the arguments in w0, x1, w2, and w3 into our saved registers (w19, x20, w21, w22); the output pointer stays in x4. This is good practice because the original argument registers might be overwritten if we call other functions (though we don't here).

2. Phase 1: .from_base_loop


.from_base_loop:
    cmp w6, w21
    b.ge .to_base_start

    ldr w7, [x20, w6, sxtw 2]
    
    cmp w7, #0
    blt .bad_digit
    cmp w7, w19
    b.ge .bad_digit

    mul x5, x5, x19
    add x5, x5, x7, sxtx

    add w6, w6, #1
    b .from_base_loop
  • mov x5, #0: Executed just before the loop, this initializes our 64-bit accumulator x5 to zero.
  • ldr w7, [x20, w6, sxtw 2]: This is the core of reading the input. ldr loads a value from memory. The address is calculated as base address (x20) + (index (w6) * element_size). Since our digits are 32-bit integers (4 bytes), we scale the index w6 by 4. The sxtw 2 operand modifier sign-extends the 32-bit index w6 to 64 bits and then shifts it left by 2 (multiplying by 4) before it is added to x20.
  • Digit Validation: We check if the loaded digit w7 is valid (i.e., 0 <= digit < in_base).
  • mul x5, x5, x19: Implements value = value * in_base.
  • add x5, x5, x7, sxtx: Implements value = value + digit. With sxtx the second operand is the full 64-bit register x7, so no widening from 32 bits takes place here. The addition is still correct because ldr w7 already cleared the upper 32 bits of x7 and the digit has been validated as non-negative. Writing add x5, x5, w7, uxtw would state that intent more explicitly.
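
To make the address arithmetic behind the ldr addressing mode concrete, here is a small C sketch that spells out the same calculation by hand. The helper name `load_digit` is hypothetical; in ordinary C you would simply write `base[index]`, which a compiler lowers to much the same thing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper mirroring: ldr w7, [x20, w6, sxtw 2] */
int32_t load_digit(const int32_t *base, int32_t index)
{
    assert(base != NULL);

    /* sxtw: widen the 32-bit index to 64 bits, keeping its sign.
     * The shift by 2 is written as a multiplication by 4 (the size of one digit). */
    int64_t byte_offset = (int64_t)index * 4;

    /* Add the byte offset to the base address, then load 32 bits from there. */
    const unsigned char *address = (const unsigned char *)base + byte_offset;
    return *(const int32_t *)address;
}
```

For any valid index, the result is the same as `base[index]`.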

3. Phase 2: .to_base_loop


.to_base_loop:
    cmp x5, #0
    b.eq .reverse_digits

    udiv x9, x5, x22
    msub x10, x9, x22, x5
    
    str w10, [x7], #4
    
    mov x5, x9
    add w8, w8, #1
    b .to_base_loop
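  • Zero Handling: Before entering this loop, the code at .to_base_start checks whether the accumulated value x5 is zero. If so, it stores a single 0 digit and returns 1, so the loop itself only ever runs for non-zero values.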
  • udiv x9, x5, x22: Unsigned division. x9 = x5 / x22. The quotient is stored in x9.
  • msub x10, x9, x22, x5: "Multiply-Subtract". This single, powerful instruction calculates the remainder. It computes x5 - (x9 * x22) and stores the result in x10. This is equivalent to the modulo operator.
  • str w10, [x7], #4: "Store Register". This stores the 32-bit remainder from w10 into the memory location pointed to by x7. The , #4 part is a post-index addressing mode, which means the address in x7 is used for the store, and *then* x7 is incremented by 4, ready for the next digit.
  • mov x5, x9: We update our value to be the quotient for the next iteration.
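
The udiv and msub pairing can be mirrored in C as a quick sanity check that value - (quotient * base) really is the remainder. The helper name `remainder_via_msub` is hypothetical. The assert is there because division by zero is undefined in C; the assembly never reaches that case since out_base is validated first.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: computes quotient and remainder the way the loop body does. */
uint64_t remainder_via_msub(uint64_t value, uint64_t base, uint64_t *quotient)
{
    assert(base != 0);

    uint64_t q = value / base;      /* udiv x9, x5, x22      */
    uint64_t r = value - q * base;  /* msub x10, x9, x22, x5 */

    *quotient = q;
    return r;                       /* same result as value % base */
}
```

For 108 and base 8 this yields quotient 13 and remainder 4, the first step of the octal example.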

4. Reversing and Exiting


.reverse_digits:
    mov x9, x4
    sub x10, x7, #4

.reverse_loop:
    cmp x9, x10
    b.ge .done

    ldr w11, [x9]
    ldr w12, [x10]
    str w12, [x9], #4
    str w11, [x10], #-4
    b .reverse_loop

.done:
    mov w0, w8
    b .exit

.exit:
    ldp x21, x22, [sp, #16]
    ldp x19, x20, [sp], #32
    ret
  • .reverse_digits: We set up two pointers: x9 points to the start of the output buffer, and x10 points to the last digit we wrote.
  • .reverse_loop: This is a classic in-place swap loop. We load the values from the left and right pointers, store them in each other's locations, and then move the pointers towards the center.
  • mov w0, w8: The return value (the count of output digits) is placed in w0 as per the AAPCS64.
  • ldp: "Load Pair" restores the callee-saved registers from the stack before returning. The final ldp also increments the stack pointer back to its original position.
  • ret: Returns control to the calling function.
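
The reversal step is the usual two-pointer swap. A C sketch of the same idea, using a hypothetical `reverse_digits` helper that takes pointers to the first and last elements (both inclusive, as x9 and x10 are in the assembly), might look like this:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: reverses the range [first, last] in place. */
void reverse_digits(int32_t *first, int32_t *last)
{
    assert(first != NULL && last != NULL);

    while (first < last) {     /* stop once the pointers meet or cross */
        int32_t tmp = *first;  /* ldr w11, [x9]        */
        *first++ = *last;      /* str w12, [x9], #4    */
        *last-- = tmp;         /* str w11, [x10], #-4  */
    }
}
```

Passing the buffer {4, 5, 1} with `last` pointing at its third element leaves {1, 5, 4}.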

Pros and Cons of This Approach

Implementing numerical algorithms in assembly is a trade-off. It's important to understand when it's appropriate.

Pros:

  • Maximum Performance: Direct instruction-level control can yield highly optimized code, avoiding any abstraction overhead from high-level languages.
  • Hardware Access: It provides unparalleled control over CPU registers, memory, and specific instructions like msub, which can be very efficient.
  • Deep System Understanding: Writing this code forces you to understand exactly how data is represented and manipulated by the CPU, a crucial skill for low-level developers.
  • Minimal Footprint: The resulting machine code is very small, which is critical for constrained environments like microcontrollers or bootloaders.

Cons:

  • High Complexity & Verbosity: Assembly code is much longer and harder to read, write, and maintain than its high-level equivalent.
  • Error-Prone: Manual memory management, register allocation, and adherence to calling conventions increase the risk of bugs that can be hard to track down.
  • Poor Portability: This code is specific to the Arm64 architecture. It would need a complete rewrite for x86-64 or other architectures.
  • Slower Development Time: The development cycle for assembly is significantly longer compared to using a high-level language with a rich standard library.

Frequently Asked Questions (FAQ)

What exactly is a "base" in a number system?
The base, or radix, is the number of unique digits or symbols used to represent numbers in a positional numeral system. For example, base-10 uses ten digits (0-9), while base-2 (binary) uses two digits (0 and 1).
Why use base-10 as an intermediate step for conversion?
Using base-10 as a "universal translator" simplifies the problem. Instead of writing a specific conversion algorithm for every possible pair of bases (e.g., base-3 to base-7, base-5 to base-12), you only need two algorithms: one to convert from any base *to* base-10, and one to convert *from* base-10 to any base. This is far more modular and manageable.
How does Arm64 handle division and modulo efficiently?
The Arm64 instruction set includes the udiv (unsigned divide) instruction. There is no separate modulo instruction; instead, the remainder is obtained with msub (multiply-subtract). msub dst, src1, src2, addend calculates addend - (src1 * src2). By computing value - (quotient * base) right after udiv, we get the remainder with just one additional instruction.
What are common pitfalls when implementing base conversion?
The most common errors are: 1) Forgetting to validate inputs, such as bases less than 2 or digits that are invalid for their base. 2) Off-by-one errors in loops. 3) Forgetting to handle the case where the input number is zero. 4) The biggest pitfall is forgetting that the modulo-division method produces digits in reverse order; you must store and then reverse them for the correct output.
Is this rebase function compliant with the AAPCS64?
It follows the convention in the ways that matter to a caller: it reads its arguments from w0, x1, w2, w3, and x4, places the return value in w0, and uses the stack to save and restore the callee-saved registers (x19-x22) that it modifies. Full compliance also requires every exit path, including the error paths, to leave sp at its entry value. With that in place it can be safely called from C/C++ or other compliant code.
Can this code handle very large numbers?
The current implementation uses a 64-bit register (x5) for the intermediate value. This means it is limited to numbers that can be represented by an unsigned 64-bit integer (up to about $1.8 \times 10^{19}$); larger inputs are not detected and silently wrap around. For arbitrary-precision arithmetic (handling numbers of any size), you would need a more complex implementation using an array of integers to store the value and implementing multi-word arithmetic operations.
How can I test this assembly code effectively?
The best way is to write a test harness in a higher-level language like C. The C code can prepare the input arrays and bases, call the assembly function, and then verify the output against a known-correct implementation (e.g., C's own conversion functions or a manually calculated result). This separates the testing logic from the implementation being tested.
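
Regarding the 64-bit limit mentioned in the FAQ, one way a test harness or a hardened variant could detect overflow is to check each accumulation step before performing it. The C sketch below is an assumption-laden illustration, not something the assembly does: `to_value_checked` is a hypothetical name, and it does not validate individual digits against the base.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: Horner's method with an overflow check.
 * Returns false if the value does not fit in 64 bits; otherwise stores it in *out. */
bool to_value_checked(const uint32_t *digits, size_t count, uint32_t in_base,
                      uint64_t *out)
{
    assert(in_base >= 2);
    assert(out != NULL);

    uint64_t value = 0;
    for (size_t i = 0; i < count; i++) {
        uint64_t d = digits[i];

        /* value * in_base + d fits only if value <= (UINT64_MAX - d) / in_base. */
        if (value > (UINT64_MAX - d) / in_base)
            return false;

        value = value * in_base + d;
    }
    *out = value;
    return true;
}
```

Sixty-four binary 1 digits still fit (they encode the largest 64-bit value), while a 1 followed by sixty-four 0 digits is rejected.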

Conclusion: Beyond the Base

We've journeyed from the theoretical concept of positional notation to a practical, instruction-by-instruction implementation of base conversion in Arm64 assembly. This process highlights the core principles of low-level development: careful algorithm design, meticulous register management, and a deep respect for the underlying hardware architecture.

While you may not write base conversion functions daily, the skills acquired—input validation, loop construction, memory addressing, and adherence to calling conventions—are universally applicable in systems programming. This kodikra.com module serves as a powerful demonstration of how abstract mathematical concepts are transformed into concrete, high-performance code that powers the digital world.

As ARM-based processors continue to dominate in mobile, IoT, and even high-performance computing, a solid grasp of Arm64 assembly is an increasingly valuable asset. To continue your journey, explore our complete Arm64 assembly learning roadmap and discover more challenges that will sharpen your low-level programming skills.

Disclaimer: All code snippets and examples are based on the Armv8-A architecture and standard GNU Assembler syntax, relevant as of the current date. Future architectural revisions may introduce new instructions or behaviors.


Published by Kodikra — Your trusted Arm64-assembly learning resource.