Say in Bash: Complete Solution & Deep Dive Guide


From Digits to Diction: The Complete Guide to Converting Numbers to Words in Bash

Learn to transform any number from 0 to 999,999,999,999 into its full English word equivalent using a powerful Bash script. This comprehensive guide breaks down the core logic, provides a complete, production-ready script, and explains how to handle hundreds, thousands, millions, and beyond with detailed, practical examples.

Imagine working the counter at a bustling deli, tickets in hand, with a line of hungry customers wrapping around the block. The number dispenser spits out ticket `873`, and to ensure clarity over the noise, you need to call out "Eight hundred seventy-three!" Now imagine the ticket is `12,451,902`. Shouting "Twelve million, four hundred fifty-one thousand, nine hundred two" is a mouthful. This is the exact challenge faced in a key module from the exclusive kodikra.com learning curriculum, where automation isn't just a convenience—it's a necessity. You've probably seen this functionality in financial software, check-writing applications, or accessibility tools. The task seems simple for a human, but instructing a computer to do it, especially in a shell scripting language like Bash, requires a fascinating blend of arithmetic and string manipulation. This guide will turn you into the architect of such a solution, transforming you from a script user to a script creator.


What is the Number-to-Word Conversion Problem?

At its core, the number-to-word conversion problem is a classic programming challenge that involves taking a numerical input, such as 123, and producing its text representation, "one hundred twenty-three". The complexity arises not from advanced mathematics, but from the irregular and idiomatic rules of the English language. While our number system is a consistent base-10 system, our language for expressing it is filled with unique words and structures.

For instance, the numbers from 0 to 19 all have unique names (zero, one, twelve, nineteen). After that, we enter a pattern for the tens (twenty, thirty, forty), which are combined with the single-digit names (e.g., twenty-one). This pattern breaks again at 100, where we introduce the word "hundred". This continues with larger denominations like "thousand," "million," "billion," and "trillion," each representing a power of 1000.

The specific challenge from the kodikra learning path sets the upper limit at 999,999,999,999 (just under one trillion). The script must correctly handle this entire range, including edge cases like zero, numbers ending in zero, and large numbers with many zero-placeholders (e.g., 500,000,000,000).

The Scope and Constraints in Bash

We are tasked with creating a Bash script that accepts a single command-line argument—the number to be converted. The script must adhere to these rules:

  • Input Range: It must handle any integer from 0 up to 999,999,999,999.
  • Error Handling: It must gracefully exit with an error message for inputs outside this range, such as negative numbers or numbers that are too large.
  • Output Format: The output should be a single string of English words, properly spaced, without trailing hyphens or extra words like "and". For example, 123 should be "one hundred twenty-three", not "one hundred and twenty-three".
  • Execution Environment: The script should be executable in a standard Bash environment, using common utilities without relying on external libraries or tools.

Why Use Bash for This Text Manipulation Task?

At first glance, a language like Python or JavaScript, with their rich data structures and extensive string manipulation libraries, might seem like a more obvious choice. However, selecting Bash for this task is not just an academic exercise; it highlights the surprising power and ubiquity of the shell for complex text processing.

Strengths of Bash for Text and Number Juggling

  • Ubiquity and Portability: Bash is the default shell on most Linux distributions, ships with macOS (where zsh has been the default interactive shell since macOS Catalina), and is easily available on Windows via WSL (Windows Subsystem for Linux). A script written in Bash is highly portable and can run almost anywhere without pre-configuring a complex development environment.
  • Powerful Arithmetic Expansion: Modern Bash has robust integer arithmetic capabilities built directly into the shell using the $((...)) syntax. This allows for clean and efficient handling of division, modulus, and comparisons needed to deconstruct numbers.
  • First-Class String Handling: While not as feature-rich as a dedicated programming language's string library, Bash provides all the necessary tools for concatenation and conditional logic, which are central to building the final word string.
  • Piping and Command Chaining: For more complex versions of this problem, Bash's ability to pipe the output of one command into another allows for elegant, composable solutions. While our solution will be self-contained, this underlying philosophy makes the shell a powerful data processing tool.
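
As a small illustration of the arithmetic expansion mentioned above, the following sketch takes a number apart with modulus and integer division. The variable names are made up for this example and do not come from the final script.

```bash
#!/usr/bin/env bash
# Illustration only: variable names are chosen for this example.
n=12345

last_three=$(( n % 1000 ))   # remainder: the rightmost three digits
rest=$(( n / 1000 ))         # integer division: everything to the left

echo "last three digits: $last_three"   # prints: last three digits: 345
echo "remaining digits:  $rest"         # prints: remaining digits:  12

# (( ... )) can also serve directly as a condition
if (( last_three > 99 )); then
    echo "the last chunk has a hundreds digit"
fi
```

These two operations, `%` and `/`, are all the arithmetic the conversion needs.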

By solving this problem in Bash, you gain a deeper appreciation for what can be accomplished directly in the command line, sharpening skills that are invaluable for system administration, DevOps, and automation. You learn to think algorithmically within the constraints of the shell, a skill that translates across all programming disciplines. Explore more foundational concepts in our guide to mastering Bash from the ground up.


How to Deconstruct the Problem: The Chunking Strategy

The secret to solving this problem is to stop thinking about a number like 12,345,678,901 as a single, monolithic entity. Instead, we should view it as a sequence of smaller, manageable chunks. The English language naming convention for large numbers is based on groups of three digits. We have names for thousands, millions, billions, etc., and within each of these groups, we simply describe a number between 1 and 999.

Consider the number 12,345,678,901. Let's break it down:

  • 901 is "nine hundred one"
  • 678 is "six hundred seventy-eight" (in the thousands place)
  • 345 is "three hundred forty-five" (in the millions place)
  • 12 is "twelve" (in the billions place)

By processing the number in three-digit chunks from right to left, we can build the final string by converting each chunk and appending the appropriate "scale" word (thousand, million, billion). This reduces a massive problem into a much smaller, repeatable one: how to convert any number from 0 to 999 into words.
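
The breakdown above can be sketched in a few lines. `split_chunks` is a hypothetical helper written for this illustration, not part of the solution; it prints the three-digit groups from right to left.

```bash
#!/usr/bin/env bash
# split_chunks is a hypothetical helper, not part of the article's script.
# It prints the three-digit groups of a non-negative integer, rightmost first.
split_chunks() {
    local -i n=$1
    local chunks=()
    if (( n == 0 )); then
        chunks=(0)
    fi
    while (( n > 0 )); do
        chunks+=( $(( n % 1000 )) )   # peel off the rightmost group
        n=$(( n / 1000 ))             # shift the rest three places right
    done
    echo "${chunks[*]}"
}

split_chunks 12345678901   # prints: 901 678 345 12
split_chunks 1005          # prints: 5 1
```

Because each chunk is an integer, leading zeros inside a group disappear (the `005` in 1,005 becomes `5`), which is what the word conversion needs.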

The Core Algorithm Explained

Our algorithm will follow these logical steps:

  1. Validate Input: Check that the input is a number within the valid range (0 to 999,999,999,999). If not, print an error and exit.
  2. Handle the Base Case: If the input number is 0, print "zero" and stop.
  3. The Chunking Loop:
    • Take the input number and get the remainder when divided by 1000. This gives us the rightmost three-digit chunk (e.g., for 12345, the chunk is 345).
    • Process this three-digit chunk into words (e.g., "three hundred forty-five").
    • If the chunk is not zero, append the correct scale word (e.g., "thousand", "million"). The scale word depends on which iteration of the loop we are in.
    • Update the number by dividing it by 1000 (integer division), effectively shifting it three places to the right (e.g., 12345 becomes 12).
    • Repeat the process until the number becomes zero.
  4. Assemble the Final String: As we process each chunk, we prepend its word representation to our final result string, ensuring correct spacing.
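
To make the loop structure concrete, here is a minimal sketch that leaves each chunk as digits instead of converting it to words, so that only the chunking, scale lookup, and prepending are visible. `label_chunks` is a hypothetical name used for this illustration.

```bash
#!/usr/bin/env bash
# label_chunks is a hypothetical helper used only for illustration.
# Chunks stay as digits so the loop structure remains in focus.
label_chunks() {
    local -i n=$1
    local scale=(thousand million billion)
    local result=""
    local -i scale_idx=0 chunk

    while (( n > 0 )); do
        chunk=$(( n % 1000 ))
        if (( chunk > 0 )); then            # zero chunks contribute no words
            local part=$chunk
            if (( scale_idx > 0 )); then
                part+=" ${scale[scale_idx-1]}"
            fi
            result="$part $result"          # prepend: larger scales go first
        fi
        n=$(( n / 1000 ))
        scale_idx=$(( scale_idx + 1 ))
    done
    echo "${result% }"                      # drop the trailing space
}

label_chunks 12345        # prints: 12 thousand 345
label_chunks 5000000012   # prints: 5 billion 12
```

The second call shows why the zero check matters: the empty thousands and millions groups are skipped, yet the scale index still advances so that `5` is labelled "billion". An input of 0 never enters the loop, which is why zero is handled as a separate case.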

This process is illustrated in the following diagram, which shows how a number is broken down into its constituent parts.

    ● Start with Number (e.g., 123456)
    │
    ▼
  ┌───────────────────────────┐
  │ Loop 1: Scale = ""        │
  ├───────────────────────────┤
  │ n % 1000  →  456          │
  │ Convert   → "four hundred fifty-six"
  │ n / 1000  →  123          │
  └────────────┬──────────────┘
               │
               ▼
  ┌───────────────────────────┐
  │ Loop 2: Scale = "thousand"│
  ├───────────────────────────┤
  │ n % 1000  →  123          │
  │ Convert   → "one hundred twenty-three"
  │ n / 1000  →  0            │
  └────────────┬──────────────┘
               │
               ▼
    ◆ Number is 0?
   ╱
  Yes
  │
  ▼
  ┌──────────────────────────────────────────┐
  │ Combine Results (in reverse order)       │
  │ "one hundred twenty-three" + "thousand"  │
  │ + "four hundred fifty-six"               │
  └──────────────────┬───────────────────────┘
                     │
                     ▼
                 ● End

This "divide and conquer" strategy is the key. Now, let's see how this logic is implemented in a real Bash script.


The Complete Bash Script: A Line-by-Line Walkthrough

Here is the complete, working Bash script that solves the number-to-word problem as defined by the kodikra.com module. Below the code, we will dissect every component to understand its role in the larger system.


#!/usr/bin/env bash

# Data arrays for number names
low=(
    zero one two three four five six seven eight
    nine ten eleven twelve thirteen fourteen fifteen
    sixteen seventeen eighteen nineteen
)
high=(
    [20]=twenty [30]=thirty [40]=forty [50]=fifty
    [60]=sixty [70]=seventy [80]=eighty [90]=ninety
)
scale=(
    thousand million billion trillion
)

# Function to handle numbers from 0-99
say_small() {
    local -i n=$1
    if (( n < 20 )); then
        echo "${low[n]}"
    else
        local -i tens=n/10*10
        local -i units=n%10
        if (( units == 0 )); then
            echo "${high[tens]}"
        else
            echo "${high[tens]}-${low[units]}"
        fi
    fi
}

# Function to handle numbers from 0-999
say_hundreds() {
    local -i n=$1
    if (( n < 100 )); then
        say_small "$n"
    else
        local -i hundreds=n/100
        local -i remainder=n%100
        local result="${low[hundreds]} hundred"
        if (( remainder > 0 )); then
            result+=" $(say_small "$remainder")"
        fi
        echo "$result"
    fi
}

# Main function to handle the entire range
say() {
    local input_str=$1
    # Remove commas for safe arithmetic processing
    local num_str=${input_str//,}
    
    # Validate input is a valid integer
    if ! [[ "$num_str" =~ ^[0-9]+$ ]]; then
        echo "input out of range" >&2
        exit 1
    fi

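    # Drop leading zeros so input such as "007" is not parsed as octal
    while [[ $num_str == 0?* ]]; do
        num_str=${num_str#0}
    done

    # Handle range: more than 12 digits exceeds 999,999,999,999. Checking the
    # length first also keeps huge inputs from overflowing 64-bit arithmetic.
    if (( ${#num_str} > 12 )); then
        echo "input out of range" >&2
        exit 1
    fi

    local -i n=$num_str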

    if (( n == 0 )); then
        echo "zero"
        return
    fi

    local result=""
    local -i scale_idx=0
    
    while (( n > 0 )); do
        local -i chunk=n%1000
        if (( chunk > 0 )); then
            local chunk_words
            chunk_words=$(say_hundreds "$chunk")
            
            local current_scale=""
            if (( scale_idx > 0 )); then
                current_scale=" ${scale[scale_idx-1]}"
            fi
            
            result="$chunk_words$current_scale $result"
        fi
        
        (( n /= 1000 ))
        (( scale_idx++ ))
    done
    
    # Trim trailing space
    echo "${result% }"
}

# Script entry point
main() {
    if (( $# != 1 )); then
        echo "Usage: say.sh <number>" >&2
        exit 1
    fi
    say "$1"
}

main "$@"

Dissecting the Script Components

1. Data Structures: The Name Arrays


low=(
    zero one two three four five six seven eight
    nine ten eleven twelve thirteen fourteen fifteen
    sixteen seventeen eighteen nineteen
)
high=(
    [20]=twenty [30]=thirty [40]=forty [50]=fifty
    [60]=sixty [70]=seventy [80]=eighty [90]=ninety
)
scale=(
    thousand million billion trillion
)
  • low: This is a standard indexed array. The index corresponds to the number, making it trivial to get the word for any number from 0 to 19 (e.g., ${low[12]} returns "twelve").
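  • high: This is a sparse indexed array, not an associative array (which would need declare -A). Only the indices 20, 30, and so on up to 90 are populated, so the tens value can be used directly as the index (e.g., ${high[70]} returns "seventy"). Bash stores only the elements that are set, so the gaps cost nothing.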
  • scale: A simple indexed array holding the names for the powers of 1000. ${scale[0]} is "thousand", ${scale[1]} is "million", and so on.

Storing the names in arrays makes the code clean, readable, and easy to modify (e.g., for localization to another language).
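
A quick way to see how Bash stores the `high` array is to inspect it. The sketch below uses only standard array expansions; the variable `n` is introduced for the example.

```bash
#!/usr/bin/env bash
high=(
    [20]=twenty [30]=thirty [40]=forty [50]=fifty
    [60]=sixty [70]=seventy [80]=eighty [90]=ninety
)

echo "element count: ${#high[@]}"   # prints: element count: 8
echo "indices: ${!high[*]}"         # prints: indices: 20 30 40 50 60 70 80 90

# Subscripts of an indexed array are evaluated arithmetically,
# so an expression can be used directly as the index.
n=73
echo "${high[n/10*10]}"             # prints: seventy

# Unset slots expand to an empty string (unless `set -u` is active).
echo "slot 25 is '${high[25]}'"     # prints: slot 25 is ''
```

Only eight elements exist even though the highest index is 90, which is what "sparse" means for a Bash indexed array.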

2. Helper Function: say_small()


say_small() {
    local -i n=$1
    if (( n < 20 )); then
        echo "${low[n]}"
    else
        # ... logic for 20-99
    fi
}

This function is a specialist, handling only numbers from 0 to 99.

  • It declares its input variable n as a local integer with local -i n=$1. This is good practice to avoid scope pollution and ensure arithmetic context.
  • If n is less than 20, it directly looks up the name in the low array. This is our first base case.
  • If n is 20 or greater, it calculates the tens and units places. For n=73, tens becomes 70 and units becomes 3.
  • It then combines the name from the high array (e.g., "seventy") with the name from the low array ("three"), outputting "seventy-three".
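
The tens and units arithmetic can be checked in isolation. `split_tens_units` is a hypothetical helper written for this example; it mirrors the two assignments inside say_small.

```bash
#!/usr/bin/env bash
# split_tens_units is a hypothetical helper for illustration.
split_tens_units() {
    local -i n=$1
    local -i tens=n/10*10    # evaluated left to right: 73/10 is 7, then 7*10 is 70
    local -i units=n%10
    echo "$tens $units"
}

split_tens_units 73   # prints: 70 3
split_tens_units 40   # prints: 40 0
split_tens_units 99   # prints: 90 9
```

Because the variables are declared with `local -i`, the right-hand side of each assignment is evaluated as arithmetic without needing `$(( ... ))`.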

3. Helper Function: say_hundreds()


say_hundreds() {
    local -i n=$1
    if (( n < 100 )); then
        say_small "$n"
    else
        # ... logic for 100-999
    fi
}

This function builds upon say_small to handle any number up to 999.

  • If the number is less than 100, it simply delegates the work to say_small.
  • Otherwise, it calculates the hundreds digit (n/100) and the remainder (n%100).
  • It constructs the string by taking the hundreds digit's name (e.g., "four"), appending "hundred", and then, only if the remainder is greater than zero, it calls say_small on the remainder and appends that result. This correctly handles cases like 400 (no remainder) and 456.
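
Bash functions cannot return strings directly, so say_hundreds relies on command substitution to capture what say_small prints. The sketch below isolates that pattern; `small_part` and `build_phrase` are hypothetical stand-ins, with `small_part` returning a fixed string.

```bash
#!/usr/bin/env bash
# Both functions are hypothetical stand-ins for illustration.
# small_part "returns" a string by printing it to standard output.
small_part() {
    echo "fifty-six"
}

build_phrase() {
    local -i remainder=$1
    local result="four hundred"
    if (( remainder > 0 )); then
        # $(...) runs small_part in a subshell and captures what it prints
        result+=" $(small_part)"
    fi
    echo "$result"
}

build_phrase 56   # prints: four hundred fifty-six
build_phrase 0    # prints: four hundred
```

The conditional append is what keeps round numbers such as 400 from ending in a stray word or space.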

4. The Main Router: say()

This is the heart of the script, implementing the chunking loop we designed earlier. Note that the loop is iterative rather than recursive: say never calls itself. It acts as a controller, breaking down the large number and delegating the small conversions to say_hundreds.

The function call flow is visualized below:

    ● main() receives input (e.g., "12345")
    │
    ▼
  ┌───────────┐
  │  say()    │
  └─────┬─────┘
        │
        ▼
  ┌──────────────────┐
  │ Loop (n > 0)     │
  ├──────────────────┤
  │ chunk = n % 1000 │
  │ chunk_words = ?  │
  └─────┬────────────┘
        │
        ▼
      ┌───────────────┐
      │ say_hundreds()│
      └──────┬────────┘
             │
             ▼
           ┌──────────┐
           │say_small()
           └──────────┘

Let's trace its execution with the number 12345:

  1. Initialization: n=12345, result="", scale_idx=0.
  2. Loop 1:
    • chunk = 12345 % 1000 which is 345.
    • say_hundreds(345) is called, which returns "three hundred forty-five".
    • scale_idx is 0, so no scale word is added yet.
    • result becomes "three hundred forty-five ".
    • n becomes 12345 / 1000 which is 12. scale_idx increments to 1.
  3. Loop 2:
    • chunk = 12 % 1000 which is 12.
    • say_hundreds(12) is called, which in turn calls say_small(12), returning "twelve".
    • scale_idx is 1, so the scale word is ${scale[0]}, which is "thousand".
    • The new string part is "twelve thousand ". This is prepended to the existing result.
    • result is now "twelve thousand three hundred forty-five ".
    • n becomes 12 / 1000 which is 0. scale_idx increments to 2.
  4. Loop End: The while (( n > 0 )) condition is now false, so the loop terminates.
  5. Final Output: The script echoes the final result after trimming the trailing space with ${result% }, giving the correct output: "twelve thousand three hundred forty-five".
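
The trimming step at the end of the trace uses the `${var% }` form of parameter expansion, which can be tried on its own. The variable names below are chosen for this example.

```bash
#!/usr/bin/env bash
result="twelve thousand three hundred forty-five "

# ${var% } removes one trailing space if present; the variable itself is unchanged.
trimmed=${result% }
echo "[$trimmed]"        # prints: [twelve thousand three hundred forty-five]

# With no trailing space, the expansion leaves the text as it is.
clean="zero"
echo "[${clean% }]"      # prints: [zero]
```

The square brackets are only there to make the presence or absence of the trailing space visible.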

5. Entry Point and Input Validation: main()


main() {
    if (( $# != 1 )); then
        echo "Usage: say.sh <number>" >&2
        exit 1
    fi
    say "$1"
}

main "$@"

This is standard Bash practice for creating executable scripts.

  • The main function first checks if exactly one command-line argument ($#) was provided. If not, it prints a usage message to standard error (>&2) and exits with a non-zero status code to indicate failure.
  • If the argument count is correct, it calls the main say function with the provided argument "$1".
  • The final line, main "$@", executes the main function, passing along all command-line arguments to it. This makes the script's behavior explicit and organized.

Inside say(), we also have robust validation to ensure the input is a number within the allowed range, making the script safe and user-friendly.


# Remove commas for safe arithmetic processing
local num_str=${input_str//,}

# Validate input is a valid integer
if ! [[ "$num_str" =~ ^[0-9]+$ ]]; then
    echo "input out of range" >&2
    exit 1
fi

This snippet first removes any commas from the input string (e.g., "1,234" becomes "1234") using parameter expansion. Then, it uses a regular expression ^[0-9]+$ to ensure the resulting string contains only digits before attempting to treat it as a number.
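
To see which inputs pass this check, the same two steps can be wrapped in a small predicate and run against a few candidates. `is_plain_number` is a hypothetical name introduced for this sketch.

```bash
#!/usr/bin/env bash
# is_plain_number is a hypothetical helper name used for this example.
# It succeeds (exit status 0) when the input, after commas are removed,
# consists only of digits.
is_plain_number() {
    local stripped=${1//,}
    [[ $stripped =~ ^[0-9]+$ ]]
}

for candidate in "1,234" "42" "-5" "12abc" "3.14" ""; do
    if is_plain_number "$candidate"; then
        echo "'$candidate' accepted"
    else
        echo "'$candidate' rejected"
    fi
done
```

Only the first two candidates are accepted. Note that the comma stripping does not check where the commas sit, so an oddly grouped input such as "1,,2" would also pass.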


Use Cases, Alternatives, and Performance Considerations

While this script is a fantastic learning exercise, the underlying logic has practical applications in various domains. Understanding its strengths and weaknesses helps in deciding when to use such a shell script versus a more heavyweight solution.

Who Benefits from This Script? (Practical Applications)

  • Financial Technology (FinTech): Automatically writing out check amounts or invoice totals in words to prevent fraud and ambiguity.
  • Accessibility Tools: Integrating with text-to-speech (TTS) systems to read out numerical data in a more natural, human-friendly way.
  • Automation and Reporting: Generating human-readable reports from raw numerical data. For example, a script could report "server uptime is three thousand six hundred seconds" instead of just "3600".
  • Educational Software: Creating tools that help teach children how to read and write numbers.

Alternative Approaches and Optimizations

The provided solution is clear and modular, but other algorithmic designs are possible. One common alternative is a purely recursive approach.

A single recursive function could handle all numbers. For example, a function `recursive_say(n)` would work like this:

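  • If `n < 20`, return `${low[n]}`.
  • If `n < 100`, return `${high[n/10*10]}`, followed by a hyphen and `recursive_say(n%10)` when the units digit is not zero.
  • If `n < 1000`, return `recursive_say(n/100)` + "hundred", followed by `recursive_say(n%100)` when that remainder is not zero.
  • If `n >= 1000`, it would find the largest applicable scale (e.g., million), and call itself recursively for the part before the scale and the part after: `recursive_say(n / 1000000)` + "million" + `recursive_say(n % 1000000)`, again skipping the last part when the remainder is zero. Without these zero checks, an input such as 20 would come out as "twenty zero".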

This can lead to more compact code but can also be harder to debug due to the deeper call stack.
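
The following is one possible sketch of that recursive design in Bash, not the solution presented in this guide. It assumes the input is already a validated integer between 0 and 999,999,999,999; the `names` array and the `divisor` and `idx` variables are introduced for this example.

```bash
#!/usr/bin/env bash
# A sketch of the single-function recursive design; not the article's script.
low=(
    zero one two three four five six seven eight nine ten
    eleven twelve thirteen fourteen fifteen sixteen seventeen
    eighteen nineteen
)
high=(
    [20]=twenty [30]=thirty [40]=forty [50]=fifty
    [60]=sixty [70]=seventy [80]=eighty [90]=ninety
)
names=(thousand million billion)

# Assumes 0 <= n <= 999,999,999,999; no input validation is performed.
recursive_say() {
    local -i n=$1
    local head tail=""

    if (( n < 20 )); then
        echo "${low[n]}"
        return
    elif (( n < 100 )); then
        head=${high[n/10*10]}
        if (( n % 10 > 0 )); then
            tail="-$(recursive_say $(( n % 10 )))"
        fi
    elif (( n < 1000 )); then
        head="$(recursive_say $(( n / 100 ))) hundred"
        if (( n % 100 > 0 )); then
            tail=" $(recursive_say $(( n % 100 )))"
        fi
    else
        # Find the largest scale (thousand, million, billion) that fits.
        local -i divisor=1000 idx=0
        while (( n / divisor >= 1000 )); do
            divisor=$(( divisor * 1000 ))
            idx=$(( idx + 1 ))
        done
        head="$(recursive_say $(( n / divisor ))) ${names[idx]}"
        if (( n % divisor > 0 )); then
            tail=" $(recursive_say $(( n % divisor )))"
        fi
    fi
    echo "$head$tail"
}

recursive_say 20        # prints: twenty
recursive_say 12345     # prints: twelve thousand three hundred forty-five
recursive_say 1000000   # prints: one million
```

Each recursive call here runs inside a command substitution, so it costs a subshell; that overhead is part of the trade-off against the iterative version.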

Pros and Cons of the Bash Approach

Pros:

  • Zero Dependencies: Runs on any system with Bash, no installation of libraries or runtimes needed.
  • Highly Readable Logic: The modular functions (say_small, say_hundreds) make the code easy to follow.
  • Excellent for Automation: Easily integrates into larger shell scripts for system administration or data processing pipelines.
  • Great Learning Tool: Teaches fundamental concepts of algorithms, modularity, and shell scripting in a practical context.

Cons:

  • Integer Limitations: Bash's built-in arithmetic is limited to signed 64-bit integers. This is fine for our range but would fail for truly massive numbers.
  • Slower Performance: For converting millions of numbers in a tight loop, shell script overhead (forking processes, string manipulation) will be slower than a compiled language like C or Go.
  • No Floating-Point Support: This script is strictly for integers. Handling decimals would require significant modifications and likely external tools like bc or awk.
  • Verbosity: The code can be more verbose than an equivalent implementation in a language like Python, which has more expressive syntax.

Frequently Asked Questions (FAQ)

1. How does the script handle the word "and" in numbers?

This script deliberately omits the word "and", following a more formal or American English convention (e.g., "one hundred twenty-three"). In British English, "one hundred and twenty-three" is common. To add this, you would modify the say_hundreds function to check if a remainder exists and insert " and" before processing it.
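
A minimal sketch of that modification might look as follows. `say_hundreds_british` is a hypothetical variant written for this example; the arrays and say_small are repeated so the snippet runs on its own.

```bash
#!/usr/bin/env bash
low=(
    zero one two three four five six seven eight nine ten
    eleven twelve thirteen fourteen fifteen sixteen seventeen
    eighteen nineteen
)
high=(
    [20]=twenty [30]=thirty [40]=forty [50]=fifty
    [60]=sixty [70]=seventy [80]=eighty [90]=ninety
)

# Same logic as the article's say_small: handles 0-99.
say_small() {
    local -i n=$1
    if (( n < 20 )); then
        echo "${low[n]}"
    elif (( n % 10 == 0 )); then
        echo "${high[n/10*10]}"
    else
        echo "${high[n/10*10]}-${low[n%10]}"
    fi
}

# say_hundreds_british is a hypothetical variant that inserts "and"
# between the hundreds and the remainder.
say_hundreds_british() {
    local -i n=$1
    if (( n < 100 )); then
        say_small "$n"
        return
    fi
    local result="${low[n/100]} hundred"
    if (( n % 100 > 0 )); then
        result+=" and $(say_small $(( n % 100 )))"
    fi
    echo "$result"
}

say_hundreds_british 123   # prints: one hundred and twenty-three
say_hundreds_british 400   # prints: four hundred
```

This covers only the "and" inside a three-digit group. Numbers such as 1,005, where the hundreds digit of the last group is zero, would need extra handling in say and are left out of this sketch.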

2. Can this script be modified to handle negative numbers?

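Yes, with two changes to the main say function. First, detect a leading minus sign and strip it before validation, because the digits-only check ^[0-9]+$ currently rejects it. Then run the rest of the logic on the absolute value and prepend the word "negative" to the final result. Note that this departs from the exercise rules above, which treat negative input as out of range.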

3. What is the largest number Bash can handle for this problem?

On most modern 64-bit systems, Bash uses signed 64-bit integers. This means it can handle numbers up to 2^63 - 1, which is approximately 9.2 quintillion. The script's range check stops just below one trillion and its scale array ends at "trillion", but Bash's arithmetic could support extending it further by raising the limit and adding larger scale names such as "quadrillion" and "quintillion".
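
As a rough check of how far the chunking approach could stretch, the sketch below counts the three-digit groups in the largest signed 64-bit value. It assumes a Bash build with 64-bit arithmetic; the variable names are chosen for this example.

```bash
#!/usr/bin/env bash
# Largest signed 64-bit integer, 2^63 - 1 (assumes 64-bit shell arithmetic).
max=9223372036854775807

echo "digits: ${#max}"                         # prints: digits: 19
chunks=$(( (${#max} + 2) / 3 ))                # round up to whole groups of three
echo "three-digit chunks: $chunks"             # prints: three-digit chunks: 7
echo "scale words needed: $(( chunks - 1 ))"   # prints: scale words needed: 6

if (( 999999999999 < max )); then
    echo "the article's upper limit fits comfortably"
fi
```

Seven chunks need six scale words, so the array would have to run from "thousand" through "quintillion" to cover the full 64-bit range.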

4. How would I adapt this script for another language, like Spanish?

You would need to replace the contents of the low, high, and scale arrays with their Spanish equivalents. However, you might also need to adjust the logic, as grammar rules can differ. For example, some languages have gendered numbers that change based on the noun they modify, which would add significant complexity.

5. Why is the result string built by prepending new parts instead of appending them?

The script processes the number from right to left (smallest chunks first) for convenience using the modulus operator (% 1000). Because we process "four hundred fifty-six" before "twelve thousand", we must prepend the new, larger-scale parts to the front of the result string to maintain the correct final order.

6. Is there a more efficient way than using multiple arrays for the number names?

For this scale, arrays are extremely efficient and readable. A more "mathematical" approach might try to generate the names algorithmically, but this would be far more complex and likely slower due to the irregular naming conventions of English. The current lookup-table approach (using arrays) is the standard and most effective solution.

7. What happens if I input a number with commas, like "1,234,567"?

The script is designed to handle this. The line local num_str=${input_str//,} at the beginning of the say function uses Bash's parameter expansion to find and replace all occurrences of a comma with nothing, effectively stripping them before the string is treated as a number.


Conclusion: From a Simple Task to a Powerful Tool

What begins as a simple request—turning digits into words—unfolds into a comprehensive lesson in algorithmic thinking, modular design, and the hidden strengths of Bash scripting. By breaking a large, intimidating number down into manageable three-digit chunks, we transformed a complex problem into a series of simple, repeatable steps. This project, a cornerstone of the kodikra Bash 5 learning path, demonstrates that you don't always need a complex, general-purpose programming language to build robust and useful tools.

The final script is not just a solution; it's a testament to the power of structured thinking. It's efficient, readable, and portable, capable of running on virtually any modern Unix-like system without modification. You now have a solid foundation for tackling other text and data manipulation challenges directly from the command line, a skill that will serve you well in any area of software development or system administration.

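Disclaimer: The code and explanations in this article are based on Bash version 4.x and newer. The script relies on indexed arrays, local -i, and [[ ... =~ ... ]] rather than on associative arrays (declare -A, which do require Bash 4.0 or later), so it should also run on older releases such as the Bash 3.2 shipped with macOS, although that has not been verified here. Always check your environment's version with bash --version.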


Published by Kodikra — Your trusted Bash learning resource.