Roman Numerals in 8th: Complete Solution & Deep Dive Guide

The Complete Guide to Roman Numerals in 8th: From Zero to Hero

Learn to master the classic algorithm for converting Arabic numbers to Roman numerals using the stack-based power of the 8th programming language. This guide breaks down the logic behind subtractive notation (like IV for 4) and provides a detailed code walkthrough for handling numbers up to 3,999, turning you into a proficient problem-solver.

Have you ever glanced at the copyright date at the end of a movie, seen a grand clock face, or noticed the numbering for a Super Bowl and been momentarily stumped by a sequence of letters like MCMXCVIII? These are Roman numerals, a numeral system that, while ancient, presents a fascinating and timeless challenge for programmers.

Many developers encounter this problem early in their careers, and it serves as a perfect exercise in algorithmic thinking, data structures, and string manipulation. You might feel a bit of uncertainty tackling this, especially in a unique, stack-oriented language like 8th. But that's precisely the challenge we're here to conquer.

This in-depth guide will not only walk you through the history and logic of Roman numerals but also provide a line-by-line deconstruction of an elegant 8th solution. By the end, you'll have a deep understanding of the "greedy" algorithm and the skills to implement it with confidence, leveraging the unique features of 8th.

What Are Roman Numerals? A System of Symbols and Rules

Before diving into code, we must understand the system we're trying to model. Roman numerals originated in ancient Rome and remained the dominant way of writing numbers throughout Europe well into the Late Middle Ages. Unlike the Arabic system we use today (0, 1, 2...), which is positional, the Roman system is additive and sometimes subtractive, using a combination of letters from the Latin alphabet.

The Core Symbols

The entire system is built upon seven core symbols, each representing a fixed value. Mastering these is the first step.

Symbol	Value	Mnemonic (Memory Aid)
`I`	1	(Imagine one finger)
`V`	5	(The shape of a hand with five fingers)
`X`	10	(Two 'V's, one on top of the other)
`L`	50	Lucidly
`C`	100	Can
`D`	500	Do
`M`	1000	Math

A common mnemonic to remember the larger values is: "Lucidly Can Do Math".

The Two Fundamental Rules of Combination

Simply knowing the symbols isn't enough. The real logic lies in how they are combined. There are two primary rules that govern the formation of any Roman numeral.

The Additive Rule: When a symbol of equal or lesser value is placed after a symbol of greater value, their values are added. This is the most common rule.
- VI = 5 + 1 = 6
- LXX = 50 + 10 + 10 = 70
- MCC = 1000 + 100 + 100 = 1200
The Subtractive Rule: This is the clever trick that avoids clumsy repetitions like IIII for 4. When a symbol of smaller value is placed before a symbol of larger value, the smaller value is subtracted from the larger one. This rule has specific, limited applications.
- IV = 5 - 1 = 4 (Instead of IIII)
- IX = 10 - 1 = 9 (Instead of VIIII)
- XL = 50 - 10 = 40 (Instead of XXXX)
- XC = 100 - 10 = 90 (Instead of LXXXX)
- CD = 500 - 100 = 400 (Instead of CCCC)
- CM = 1000 - 100 = 900 (Instead of DCCCC)

These subtractive pairs are crucial for our algorithm. They represent the "special cases" we need to handle. For this kodikra module, we are concerned with traditional Roman numerals, which effectively limits our maximum convertible number to 3,999 (MMMCMXCIX).

Why Is This a Classic Programming Problem?

The Roman numeral conversion task is a staple in coding interviews and introductory algorithm courses for several good reasons. It's a "low-floor, high-ceiling" problem: easy to start but with nuances that reveal a developer's thought process.

Algorithmic Thinking: It forces you to devise a systematic process. A naive approach might get complicated quickly. The optimal solution requires a "greedy" strategy, which is a fundamental algorithmic concept.
Data Structures: The most elegant solutions use a data structure to map Arabic values to Roman symbols. This could be two parallel arrays, a hash map, or a list of tuples. Our 8th solution uses two parallel arrays, which is highly efficient.
Handling Edge Cases: You must consider the subtractive rules (4, 9, 40, etc.) explicitly. Ignoring them leads to an incorrect and non-standard output.
String Manipulation: The core of the output is building a string piece by piece, a common task in any programming language.
Language Idioms: For a language like 8th, this problem is a perfect showcase for its stack-based manipulation, array iterators (a:each), and use of the return stack for temporary storage.

Solving this problem demonstrates that you can translate a set of human-readable rules into clean, efficient, and logical code.

How Does the Conversion Algorithm Work? The Greedy Approach

The most effective method for this conversion is a greedy algorithm. The name sounds aggressive, but the concept is simple: at every step, make the locally optimal choice. In our case, this means always subtracting the largest possible Roman numeral value from our remaining number.

Imagine you need to convert the number 1994. Here's how a human would do it greedily:

What's the biggest chunk I can take from 1994? It's 1000 (M). Remainder: 994. Result: "M".
What's the biggest chunk from 994? It's 900 (CM). Remainder: 94. Result: "MCM".
What's the biggest chunk from 94? It's 90 (XC). Remainder: 4. Result: "MCMXC".
What's the biggest chunk from 4? It's 4 (IV). Remainder: 0. Result: "MCMXCIV".

The process stops when the remainder is zero. Notice that to make this work, our list of possible "chunks" must include the subtractive pairs (900, 400, 90, 40, 9, 4) and must be sorted in descending order. If we started with smaller numbers, we'd get the wrong result.

Visualizing the Algorithm Flow

This ASCII art diagram illustrates the high-level logic of our greedy approach.

    ● Start (Input: Arabic Number `N`)
    │
    ▼
  ┌────────────────────────┐
  │ Initialize Result = "" │
  └───────────┬────────────┘
              │
    ┌─────────▼──────────┐
    │ Loop through Roman │
    │ values (High to Low)│
    └─────────┬──────────┘
              │
    ╭─────────▼──────────╮
    │ While N >= current │
    │   Roman Value      │
    ╰─────────┬──────────╯
              │
    ┌─────────▼──────────┐
    │ Append Symbol to   │
    │ Result             │
    └─────────┬──────────┘
              │
    ┌─────────▼──────────┐
    │ Subtract Value     │
    │ from N             │
    └─────────┬──────────┘
              │
              ╰──────────╮
                         │
    ┌─────────▼──────────┐
    │ Move to next Roman │
    │ value in list      │
    └───────────┬────────┘
                │
                ▼
           ◆ N == 0?
          ╱         ╲
      Yes            No (Loop continues)
       │
       ▼
    ● End (Output: Result String)

This flow is exactly what we will implement in our 8th code. We'll use two arrays—one for the numeric values and one for the string symbols—and iterate through them together.

Where Is This Implemented in 8th? A Detailed Code Walkthrough

Now, let's dissect the idiomatic 8th solution from the kodikra learning path. 8th is a stack-based language inspired by Forth, so understanding how data moves on the stack is key. The code is concise but incredibly powerful.

The Complete 8th Solution


\ Define our lookup tables as constants
[1000,900,500,400,100,90,50,40,10,9,5,4,1] constant num-lookup
["M","CM","D","CD","C","XC","L","XL","X","IX","V","IV","I"] constant roman-lookup

\ Word definition: n:>roman
\ Stack effect: ( n -- s ) consumes a number, produces a string
: n:>roman
    \ Move input number 'n' to the return stack, push empty string accumulator
    >r "" 
    
    \ Iterate over num-lookup, with its index
    num-lookup (
        \ Inner loop to repeatedly subtract the current value
        repeat
            r@ over n:< !if      \ Is n (from return stack) >= current value?
                dup n:neg n:r+   \ If yes: subtract value from n on return stack
                rot roman-lookup 3 pick a:_@ s:+ -rot \ Append corresponding Roman symbol
            else
                2drop break      \ If no: clean up stack and break inner loop
            then
        again
    ) a:each
    
    \ Clean up final index from a:each and the 'n' (now 0) from return stack
    drop rdrop 
;

Line-by-Line Deconstruction

Let's break down every component of this code.

1. The Lookup Tables


[1000,900,500,400,100,90,50,40,10,9,5,4,1] constant num-lookup
["M","CM","D","CD","C","XC","L","XL","X","IX","V","IV","I"] constant roman-lookup

constant: This word defines a constant. It takes the value on top of the stack (the array) and assigns it to the name that follows (e.g., num-lookup).
Crucial Detail: The arrays are perfectly parallel. The element at index 0 of num-lookup (1000) corresponds to the element at index 0 of roman-lookup ("M"). This parallel structure is the foundation of our mapping. They are also sorted in descending order, which is mandatory for the greedy algorithm.

2. The Word Definition


: n:>roman
    \ ... code ...
;

: n:>roman starts the definition of a new function, or "word," named n:>roman. The ; terminates the definition.
The comment \ Stack effect: ( n -- s ) is a convention in Forth-like languages. It explains that this word expects a number (n) on the stack and will replace it with a string (s) when it's done.

3. Initialization


>r ""

This is the setup phase. Imagine the stack starts with our input number, say 1994.
>r (pronounced "to R"): This operator takes the top item from the data stack (our number 1994) and pushes it onto a secondary stack called the return stack. This is a common idiom to temporarily store a value and keep the main data stack clean for calculations.
"": This pushes an empty string onto the now-empty data stack. This string will be our accumulator, where we build the final Roman numeral.
Stack state: Data Stack: [""], Return Stack: [1994].

4. The Main Loop


num-lookup ( ... ) a:each

num-lookup: Pushes our array of numbers onto the stack.
a:each: This is a powerful iterator. It consumes an array and a quotation (the code inside ( ... )) from the stack. It then executes the quotation for each element of the array. For each iteration, it pushes the element's value and its index onto the stack.

5. The Inner Logic (The Quotation)

This is the heart of the algorithm, executed for each number in num-lookup.


repeat
    r@ over n:< !if
        ...
    else
        2drop break
    then
again

repeat ... again: This creates an infinite loop. We need an explicit break to exit it. This loop is responsible for handling cases like 3000 (M, M, M) where we need to subtract the same value multiple times.
r@ over:
- r@ (pronounced "R fetch"): Copies the top item from the return stack (our remaining number, e.g., 1994) to the data stack.
- over: Copies the second item on the stack to the top. This duplicates the current Arabic value from num-lookup.
n:< !if: This is the comparison. n:< checks if the top item is less than the second. The ! negates the result. So, this effectively checks "is r@ >= over?". If true, we execute the `if` block.
If the condition is true (we can subtract):
```
dup n:neg n:r+
rot roman-lookup 3 pick a:_@ s:+ -rot
```
- dup n:neg n:r+: Duplicates the Arabic value, negates it, and adds it to the number on the return stack. This is the subtraction step.
- The next line is complex stack juggling to append the symbol. It finds the correct Roman symbol using the index provided by a:each and appends it to our accumulator string.
If the condition is false (we cannot subtract):
```
2drop break
```
- 2drop: Removes the two values we used for comparison from the stack.
- break: Exits the inner repeat...again loop, and a:each moves to the next value in num-lookup.

6. Final Cleanup


drop rdrop

After a:each finishes, it leaves the final index on the stack. drop removes it.
Our number on the return stack is now 0. rdrop removes it from the return stack.
The only thing left on the data stack is our beautifully constructed result string, which is the implicit return value of the word.

Visualizing the Stack During an Iteration

Let's trace the stack for the number 94, when `a:each` is at the value `90` (index 5).

● Start of Inner Loop Iteration
  ├─ Data Stack:   ["MCM", 90, 5]  (Result, Value, Index)
  └─ Return Stack: [94]            (N)
          │
          ▼
  `r@ over`
  ├─ Data Stack:   ["MCM", 90, 5, 94, 90]
  └─ Return Stack: [94]
          │
          ▼
  `n:< !if` (94 >= 90 is TRUE)
  ├─ Data Stack:   ["MCM", 90, 5]
  └─ Return Stack: [94]
          │
          ▼
  `dup n:neg n:r+` (Subtract 90 from N)
  ├─ Data Stack:   ["MCM", 90, 5, 90]
  └─ Return Stack: [4]
          │
          ▼
  `... s:+ ...` (Append symbol "XC")
  ├─ Data Stack:   ["MCMXC", 90, 5]
  └─ Return Stack: [4]
          │
          ▼
● Loop Repeats
  `r@ over`
  ├─ Data Stack:   ["MCMXC", 90, 5, 4, 90]
  └─ Return Stack: [4]
          │
          ▼
  `n:< !if` (4 >= 90 is FALSE)
  ├─ Data Stack:   ["MCMXC", 90, 5, 4, 90]
  └─ Return Stack: [4]
          │
          ▼
  `2drop break` (Exit inner loop)
  ├─ Data Stack:   ["MCMXC", 90, 5]
  └─ Return Stack: [4]
          │
          ▼
● `a:each` continues to the next value (50)

Who Uses This and What Are the Alternatives?

This algorithm is a foundational piece of logic. While you might not be building a "Roman Numeral Converter" app daily, the principles are widely applicable in areas like parsers, compilers, and any system that translates one representation of data to another.

Alternative Implementations

While the greedy lookup-table approach is arguably the best for this problem, other methods exist:

Mathematical/Modular Approach: One could use division and modulo arithmetic to determine how many 'M's, 'D's, 'C's, etc., are needed. This often becomes more complex because you then have to go back and replace patterns like CCCC with CD, adding an extra step.
Recursive Solution: A function could call itself with the remainder of the number after subtracting the largest possible chunk. This can be elegant but may be less performant in some languages due to function call overhead.
Using a Hash Map: Instead of two parallel arrays, one could use a g:map in 8th. This might make the code slightly more readable as the key-value relationship is explicit. However, for a fixed and ordered set like this, parallel arrays are extremely fast.

Pros and Cons of the Greedy Lookup Approach

Pros	Cons / Risks
Highly Efficient: The number of iterations is constant and small, regardless of the input number's size (within the 1-3999 range). This gives it O(1) time complexity.	Requires Pre-defined Data: The solution is hard-coded for the specific rules of Roman numerals. It's not a general-purpose number system converter.
Easy to Understand: The logic directly mimics how a human would solve the problem, making it intuitive and easy to debug.	Limited Range: This implementation does not handle numbers outside the 1-3999 range, nor does it validate input (e.g., zero, negative numbers).
Reliable and Correct: As long as the lookup table is correctly ordered and complete, the algorithm is guaranteed to produce the standard, most concise Roman numeral.	Potential for Errors: A mistake in the ordering of the lookup tables (e.g., placing 100 before 900) would completely break the greedy logic and produce incorrect results.

Running the Code in the 8th REPL

To test this solution, save the code in a file named roman-numerals.8th. Then, start the 8th interpreter and load the file.


$ 8th
8th> "roman-numerals.8th" f:load
ok
8th> 1994 n:>roman .s
"MCMXCIV"
ok
8th> 58 n:>roman .s
"LVIII"
ok
8th> 3 n:>roman .s
"III"
ok
8th> 3999 n:>roman .s
"MMMCMXCIX"
ok

The .s word in 8th is used to "dump" the stack, showing us the final string result produced by our n:>roman word.

Frequently Asked Questions (FAQ)

Why can't this code represent numbers larger than 3,999?

The limitation comes from traditional Roman numeral notation itself, not just our code. There is no single, universally accepted symbol for 5,000 or higher values. While historical variations like a vinculum (a bar over a numeral to multiply its value by 1,000) existed, they are not part of the standard system this kodikra module addresses. Our lookup table intentionally stops at 1,000 (M), and the rule of not repeating a symbol more than three times applies (e.g., MMM for 3,000).

What is the purpose of the return stack (`>r`, `r@`) in the 8th solution?

The return stack is a secondary, temporary storage area. In our solution, we use it to hold the main number we are converting (`n`). This is a powerful Forth/8th idiom that keeps the primary data stack uncluttered. It allows us to perform calculations, comparisons, and manipulations on the data stack without losing track of our primary loop variable, which we can easily access anytime with r@ (copy from return stack) or r> (move from return stack).

Is the greedy algorithm always the best approach for this problem?

For converting Arabic to standard Roman numerals, yes. The greedy approach works perfectly because the numeral system is designed in a way that the largest possible symbol choice is always the correct one, provided you include the subtractive pairs (like 900, 400, etc.) in your set of choices. This avoids the need for complex lookaheads or backtracking.

How would I handle invalid input, like zero or negative numbers?

The current code does not handle these cases and might produce an empty string or unexpected behavior. To make it more robust, you would add a "guard" at the beginning of the word. For example: dup 1 n:< if "Invalid Number" throw then. This would check if the input number is less than 1 and, if so, throw an error instead of proceeding with the conversion.

Could I use a `g:map` (hash map) instead of two parallel arrays?

Absolutely. You could define a map where keys are the Arabic numbers and values are the Roman symbols. The main challenge would be that standard hash maps are unordered. You would still need an ordered list of keys (like our num-lookup array) to iterate over to ensure the greedy algorithm works correctly. So, you would likely end up using both a map and an ordered key array, which might be slightly less efficient than the two-array approach for this specific problem.

Why is the `num-lookup` array in descending order?

This is the most critical requirement for the greedy algorithm to function. By processing the values from largest to smallest (1000, then 900, then 500, etc.), we ensure that we always "bite off" the biggest possible chunk from our number. If the array were in ascending order, converting the number 4 would first find '1', resulting in "I", then "II", "III", "IIII", which is incorrect. The descending order guarantees we check for "IV" before we ever consider using "I".

What does `a:each` do in 8th?

a:each is a high-level iterator word for arrays. It takes an array and a "quotation" (a block of code) as input. It then executes that code for every element in the array. For each execution, it provides the element's value and its index on the stack, making it very convenient for tasks that require iterating through a collection and knowing your position, like we do when we need to look up a symbol in our parallel `roman-lookup` array.

Conclusion: From Ancient Rules to Modern Code

We've journeyed from the foundational rules of an ancient number system to a concise, powerful implementation in a modern programming language. The Roman numeral conversion problem is a perfect illustration of how a clear understanding of the rules, combined with the right algorithm—the greedy approach—can lead to an elegant and efficient solution.

You've learned how to structure data using parallel arrays, control program flow with nested loops, and manage state effectively using 8th's unique data and return stacks. This problem is more than just an academic exercise; it's a workout for your logical muscles, preparing you for more complex data transformation and parsing challenges ahead.

Ready to tackle the next challenge in your developer journey? Continue building your skills by exploring the complete 8th 7 learning path on kodikra.com. To deepen your understanding of the language itself, be sure to visit our comprehensive 8th language guide.

Disclaimer: The code and explanations in this article are based on the current stable version of 8th. As the language evolves, specific word names or syntax may change in future releases. Always refer to the official documentation for the most up-to-date information.

Published by Kodikra — Your trusted 8th learning resource.

kodikra

Search this blog