Isogram in Cpp: Complete Solution & Deep Dive Guide

a close up of a computer screen with code on it

The Ultimate Guide to C++ Isograms: From Basic Checks to Optimized Solutions

An isogram is a word or phrase without repeating letters, ignoring case, spaces, and hyphens. In C++, this can be efficiently solved by iterating through the string, normalizing each character to lowercase, and using a boolean array or hash map to track which letters have already been seen.

You’ve just nailed a complex algorithm in a coding interview, feeling confident. Then comes the "cool-down" question: "Write a function to check if a word is an isogram." It sounds deceptively simple, but the interviewer is watching closely. They're not just testing if you can solve it; they're evaluating how you solve it. Do you choose the most efficient data structure? Is your code clean and readable? Do you handle edge cases like capitalization and non-letter characters gracefully? This seemingly trivial problem is a perfect test of your foundational C++ skills. This guide will walk you through every step, from the basic concept to highly optimized C++ solutions, ensuring you're prepared for any variation of this classic challenge.


What Exactly Is an Isogram? The Core Concept

Before we dive into C++ code, it's crucial to have a rock-solid understanding of the problem's rules. An isogram, also known as a "non-pattern word," is a word where every letter appears only once. The challenge, however, has a few specific constraints that are key to a correct implementation.

Let's break down the rules:

  • No Repeating Letters: This is the primary rule. The word "background" is an isogram because each letter (b, a, c, k, g, r, o, u, n, d) is unique. The word "isograms" is not an isogram because the letter 's' appears twice.
  • Case-Insensitivity: The check must ignore whether a letter is uppercase or lowercase. For example, the word "Path" is considered to have repeating letters if we were case-sensitive ('P' and 'p'), but for an isogram check, it's treated as if 'p' and 'p' are the same. Therefore, we must normalize all characters to a single case (typically lowercase) before checking for duplicates.
  • Spaces and Hyphens are Ignored: The problem statement explicitly allows spaces and hyphens to appear multiple times. They should be completely disregarded during the check for repeating letters. For instance, "six-year-old" is a valid isogram because if you strip the hyphens and only look at the letters, none of them repeat.

Here are some clear examples:

  • "lumberjacks"Isogram (all letters are unique)
  • "downstream"Isogram (all letters are unique)
  • "six-year-old"Isogram (hyphens are ignored, remaining letters are unique)
  • "hello"Not an Isogram (the letter 'l' repeats)
  • "Bubble"Not an Isogram (the letter 'b' repeats, ignoring case)

Understanding these rules is the first and most important step. A flawed understanding will lead to a flawed solution, no matter how elegant the code is. The core task is to count the frequency of only the alphabetic characters in a case-insensitive manner.


Why Is This a Perfect Problem for C++ Learners?

The isogram problem, part of the exclusive kodikra.com C++ learning path, is more than just a simple puzzle. It's a fantastic educational tool because its solution touches upon several fundamental pillars of modern C++ programming. It forces you to think about efficiency, data structures, and standard library usage.

It Reinforces String and Character Manipulation

At its heart, this is a string processing problem. To solve it, you need to be comfortable with:

  • Iteration: Looping through a std::string. The range-based for loop (for (char c : text)) is a perfect, modern C++ idiom for this.
  • Character Functions: The <cctype> header is your best friend here. Functions like std::isalpha() to check if a character is a letter and std::tolower() to handle the case-insensitivity rule are essential. Using these standard library functions is far better than manually checking character ranges (e.g., c >= 'a' && c <= 'z').

It Demands a Choice of Data Structure

The most interesting part of the problem is deciding how to keep track of the letters you've already seen. This is where your understanding of data structures and their trade-offs comes into play.

  • Do you need the flexibility of a hash map like std::unordered_map<char, int>?
  • Is a simple array or std::vector<bool> more efficient for a limited character set like the English alphabet?
  • Could a memory-optimized std::bitset be the fastest option?

We will explore all these options, but the process of choosing one over the other is a critical skill for any developer. It teaches you to think about constraints (e.g., ASCII vs. Unicode) and performance (time vs. space complexity).

It Encourages Clean, Modular Code

The problem is perfectly sized for a single, well-defined function. This aligns with best practices for writing modular, testable code. The solution is typically encapsulated in a header (.h) and a source (.cpp) file, a standard practice in C++ development that promotes separation of interface and implementation.


How to Approach the Isogram Problem: The Strategy

A robust strategy is key to writing bug-free code. Before writing a single line, we should outline the logical steps our algorithm will take. This high-level plan ensures we cover all the rules defined earlier.

The Core Algorithm

The most efficient approach involves a single pass through the input string. For each character in the string, we will perform a series of checks and actions.

  1. Initialize a "Seen" Tracker: Create a data structure to keep track of the letters we have encountered. A simple boolean array of size 26 is a great starting point, where each index from 0 to 25 corresponds to a letter from 'a' to 'z'. All values will initially be false.
  2. Iterate Through the String: Loop over every character of the input phrase.
  3. Filter Non-Alphabetic Characters: For each character, first check if it's an alphabet letter. If it's a space, hyphen, number, or punctuation, simply ignore it and move to the next character.
  4. Normalize the Character: If the character is a letter, convert it to a consistent case, preferably lowercase. This handles the case-insensitivity rule. 'A' becomes 'a', 'B' becomes 'b', and so on.
  5. Check the Tracker: After normalizing, check your "seen" tracker. Has this letter been encountered before?
    • If yes (the value in our tracker is true), we have found a repeating letter. The string is not an isogram. We can immediately stop and return false.
    • If no (the value is false), it's the first time we're seeing this letter.
  6. Update the Tracker: If the letter was new, update the tracker to mark it as "seen" (set its corresponding value to true).
  7. Final Result: If the loop completes without ever finding a duplicate, it means every letter was unique. The string is an isogram, and we can return true.

Logical Flow Diagram

This ASCII art diagram illustrates the decision-making process for each character in the string.

    ● Start
    │
    ▼
  ┌────────────────────────┐
  │ Create `seen` tracker  │
  │ (e.g., bool array[26]) │
  └────────────┬───────────┘
               │
    ╭──────────▼───────────╮
    │ Loop for each `char` │
    │ in the input string  │
    ╰──────────┬───────────╯
               │
               ▼
        ◆ Is `char` a letter? ◆
       ╱          ╲
      Yes          No
      │            │
      ▼            └───────────╮
┌───────────────┐              │ (continue loop)
│ Normalize to  │              │
│ lowercase     │              │
└───────┬───────┘              │
        │                      │
        ▼                      │
  ◆ Is it already in `seen`? ◆ │
   ╱           ╲               │
  Yes           No             │
  │              │             │
  ▼              ▼             │
┌────────────┐ ┌───────────────┐ │
│ Return `false` │ │ Mark as `seen`  │ │
└────────────┘ └───────────────┘ │
                 │               │
                 └───────────────╯
               │
    ╭──────────┴───────────╮
    │ (end of loop)        │
    ╰──────────┬───────────╯
               │
               ▼
        ┌───────────┐
        │ Return `true` │
        └───────────┘
               │
               ▼
            ● End

This clear, step-by-step process is not only efficient (it stops at the first duplicate) but also easy to translate into C++ code.


Where Do We Implement the Solution? The C++ Implementation

Now, let's translate our strategy into clean, modern C++ code. Following best practices, we'll separate the function declaration into a header file and its definition into a source file.

The Header File: isogram.h

This file defines the public interface of our function. It tells other parts of the program that a function named is_isogram exists within the isogram namespace, what parameters it takes (a constant string reference), and what it returns (a boolean).

// isogram.h
#ifndef ISOGRAM_H
#define ISOGRAM_H

#include <string>

// We place our function within a namespace to avoid potential naming conflicts.
namespace isogram {

    /**
     * @brief Determines if a word or phrase is an isogram.
     * 
     * An isogram is a word or phrase without a repeating letter.
     * This check is case-insensitive and ignores spaces and hyphens.
     * 
     * @param text The input string to check.
     * @return true if the text is an isogram, false otherwise.
     */
    bool is_isogram(const std::string& text);

}  // namespace isogram

#endif // ISOGRAM_H

The Source File: isogram.cpp

This is where the magic happens. We implement the logic discussed earlier using a std::vector<bool> as our "seen" tracker. This choice is highly efficient for the ASCII character set.

// isogram.cpp
#include "isogram.h"
#include <vector>   // For std::vector
#include <cctype>   // For std::isalpha and std::tolower

namespace isogram {

bool is_isogram(const std::string& text) {
    // A boolean vector to track seen letters. 'a' maps to index 0, 'b' to 1, etc.
    // We assume an ASCII-based alphabet with 26 letters.
    std::vector<bool> seen(26, false);

    // Iterate through each character of the input string using a range-based for loop.
    for (char c : text) {
        // Rule: We only care about alphabetic characters.
        if (std::isalpha(c)) {
            // Rule: Normalize to lowercase to handle case-insensitivity.
            char lower_c = static_cast<char>(std::tolower(c));
            
            // Calculate the 0-25 index corresponding to the letter.
            // 'a' - 'a' = 0, 'b' - 'a' = 1, etc.
            int index = lower_c - 'a';

            // If we've already seen this letter, its index will be true.
            // This means it's a duplicate, so it's not an isogram.
            if (seen[index]) {
                return false;
            }
            
            // If we haven't seen it, mark it as seen for future checks.
            seen[index] = true;
        }
        // If the character is not a letter (e.g., space, hyphen), we do nothing and continue.
    }

    // If the loop completes without finding any duplicates, the string is an isogram.
    return true;
}

}  // namespace isogram

Compiling and Running the Code

To test our solution, we can create a simple main.cpp file.

// main.cpp
#include <iostream>
#include "isogram.h"

int main() {
    std::string test1 = "lumberjacks";
    std::string test2 = "background";
    std::string test3 = "six-year-old";
    std::string test4 = "isograms";
    std::string test5 = "Bubble";

    std::cout << std::boolalpha; // Print "true" or "false" instead of 1 or 0
    std::cout << "\"" << test1 << "\" is an isogram: " << isogram::is_isogram(test1) << std::endl;
    std::cout << "\"" << test2 << "\" is an isogram: " << isogram::is_isogram(test2) << std::endl;
    std::cout << "\"" << test3 << "\" is an isogram: " << isogram::is_isogram(test3) << std::endl;
    std::cout << "\"" << test4 << "\" is an isogram: " << isogram::is_isogram(test4) << std::endl;
    std::cout << "\"" << test5 << "\" is an isogram: " << isogram::is_isogram(test5) << std::endl;

    return 0;
}

You can compile and run this code from your terminal using a C++ compiler like g++.


# Compile all .cpp files together and create an executable named 'isogram_checker'
# The -std=c++17 flag ensures we're using a modern C++ standard.
g++ -std=c++17 -Wall -Wextra -o isogram_checker main.cpp isogram.cpp

# Run the executable
./isogram_checker

The expected output will be:


"lumberjacks" is an isogram: true
"background" is an isogram: true
"six-year-old" is an isogram: true
"isograms" is an isogram: false
"Bubble" is an isogram: false

When to Use Different Data Structures: Alternatives & Performance

The std::vector<bool> solution is excellent, but it's not the only way. A senior developer knows multiple ways to solve a problem and understands the trade-offs of each. Let's explore some alternatives.

Alternative 1: Using std::unordered_map

A hash map (std::unordered_map) can also be used to track seen characters. It maps each character to a boolean or integer count. This approach is more flexible as it's not limited to the English alphabet and can handle any character, including Unicode, without modification.

// Snippet using std::unordered_map
#include <string>
#include <unordered_map>
#include <cctype>

bool is_isogram_map(const std::string& text) {
    std::unordered_map<char, bool> seen;
    for (char c : text) {
        if (std::isalpha(c)) {
            char lower_c = static_cast<char>(std::tolower(c));
            if (seen.count(lower_c)) { // .count() is a fast way to check for key existence
                return false;
            }
            seen[lower_c] = true;
        }
    }
    return true;
}
  • Pros: Very readable and idiomatic C++. Easily handles extended character sets (Unicode) without any changes to the logic.
  • Cons: Higher memory overhead and potentially slower than a direct-mapped array due to the costs of hashing and managing the map's internal structure. For a small, fixed character set like ASCII, it's overkill.

Alternative 2: Using std::bitset

For ultimate memory and speed optimization with a fixed character set, std::bitset is a powerful tool. It's essentially an array of bits, making it extremely compact. We can use a bitset of size 26, where each bit's position corresponds to a letter.

// Snippet using std::bitset
#include <string>
#include <bitset>
#include <cctype>

bool is_isogram_bitset(const std::string& text) {
    std::bitset<26> seen; // 26 bits, all initialized to 0 (false)
    for (char c : text) {
        if (std::isalpha(c)) {
            char lower_c = static_cast<char>(std::tolower(c));
            int index = lower_c - 'a';
            
            if (seen.test(index)) { // .test() checks if the bit at the index is set
                return false;
            }
            seen.set(index); // .set() sets the bit at the index to 1
        }
    }
    return true;
}
  • Pros: Extremely memory efficient (a 26-bit bitset uses only about 4 bytes). Operations are typically mapped to single CPU instructions, making it very fast.
  • Cons: Less intuitive for beginners compared to an array or map. It's fixed at compile time, so it's only suitable when you know the size of your character set in advance.

Comparison of Approaches

Let's visualize the trade-offs between these data structures.

      ● Input String
      │
      ▼
    ┌─────────────────────────┐
    │ Choose Tracking Method  │
    └───────────┬─────────────┘
  ┌─────────────┴─────────────┬────────────────┐
  ▼                           ▼                ▼
┌─────────────────┐  ┌──────────────────┐  ┌───────────────┐
│ `vector`  │  │ `unordered_map`  │  │  `bitset<26>` │
│ (Our Solution)  │  │ (Flexibility)    │  │ (Performance) │
└─────────────────┘  └──────────────────┘  └───────────────┘
  │ Speed:   Fast     │ Speed:   Good      │ Speed:   Fastest
  │ Memory:  Low      │ Memory:  Medium    │ Memory:  Lowest
  │ Unicode: No       │ Unicode: Yes       │ Unicode: No
  │ Complexity: O(n)  │ Complexity: O(n)   │ Complexity: O(n)

Pros & Cons Table

Approach Pros Cons Best For
std::vector<bool> Good balance of speed and memory; very readable. Tied to a fixed character set (e.g., ASCII). General-purpose competitive programming and interviews.
std::unordered_map Highly flexible, handles any character set (Unicode) out of the box. More memory and computational overhead than other methods. Applications that need to process international text or unknown character sets.
std::bitset Extremely fast and memory-efficient. Fixed size, less intuitive syntax, only for known character sets. Performance-critical applications where every byte and cycle counts.

Who Benefits From Mastering This? Real-World Skills

While "isogram checker" might not be a feature you ship in a product every day, the underlying skills are universally applicable. Mastering this problem from the kodikra C++ curriculum prepares you for more complex challenges.

  • Data Validation: The core logic of checking for duplicates is fundamental in data validation. For example, ensuring a user-created username or tag is unique.
  • Algorithmic Foundation: This problem is a gateway to more advanced frequency counting and hashing algorithms, which are used in everything from database indexing to cryptography.
  • Performance Mindset: Analyzing the trade-offs between an array, a hash map, and a bitset trains you to think like an engineer who is conscious of performance. This mindset is invaluable when working on large-scale systems where efficiency matters.
  • Interview Preparedness: This is a classic screening question. A candidate who can not only solve it but also discuss alternative implementations and their complexities demonstrates a much deeper level of understanding.

Frequently Asked Questions (FAQ)

What is the time and space complexity of the main `vector<bool>` solution?

The time complexity is O(n), where 'n' is the length of the input string. This is because we iterate through the string only once. The space complexity is O(1) or constant space. Although we use a vector, its size (26) is fixed and does not depend on the input string's length.

How would this solution handle Unicode or non-ASCII characters?

The `vector<bool>` and `bitset` solutions would fail because they are hardcoded for the 26 letters of the English alphabet. The `std::unordered_map<char, bool>` approach, however, would work correctly with UTF-8 characters out of the box, as `char` in C++ can store UTF-8 code units. For full, proper Unicode support (handling different representations of the same character), you would need a more advanced approach using a dedicated Unicode library.

Why is case-insensitivity so important for this problem?

Case-insensitivity is a common requirement in string problems to make them more robust. Without it, "Word" would be considered an isogram, but "word" would not, which is usually not the desired behavior. Normalizing the case (e.g., to lowercase) ensures that 'W' and 'w' are treated as the same character, which aligns with the problem's definition of a "repeating letter."

Is a `std::set` a good choice for this problem?

Yes, `std::set` is another viable alternative. You could iterate through the string, and for each valid character, try to insert it into the set. The `.insert()` method of a set returns a pair, with the second element being a boolean indicating if the insertion was successful. If an insertion fails (because the element is already there), you've found a duplicate. Its performance is typically O(log k) for insertion, where k is the current size of the set, making the total complexity O(n log k). This is slightly slower than the O(n) of our main solution but more flexible than a boolean array.

What are common mistakes to avoid when solving the isogram problem?

The most common mistakes are:

  1. Forgetting to handle case-insensitivity (e.g., treating 'a' and 'A' as different).
  2. Incorrectly handling non-alphabetic characters (e.g., counting a hyphen as a character that can't repeat).
  3. Choosing an overly complex data structure when a simple one suffices (e.g., using a map for an ASCII-only problem).
  4. Off-by-one errors or incorrect index calculation when using an array (e.g., `c - 'a'`).

Why did you choose a boolean array over a hash map in the main solution?

For this specific problem, the constraints are well-defined: check for repeating letters within the English alphabet. Given this fixed and small character set, a simple boolean array (or `std::vector<bool>`) is superior to a hash map in both speed and memory usage. It avoids the overhead of hashing, collision resolution, and dynamic memory allocation that comes with `std::unordered_map`. It's a classic case of choosing the simplest, most efficient tool for the job.

Could we solve this using only C++ Standard Library algorithms?

Yes, but it's often less readable. One "functional" approach could involve filtering, transforming, sorting, and then checking for adjacent duplicates. For example, you could copy only the letters into a new string, transform them to lowercase, sort the new string, and then use `std::adjacent_find`. However, this would have a time complexity of O(n log n) due to sorting, making it less efficient than our single-pass O(n) solution.


Conclusion: More Than Just a Word Game

The isogram problem is a perfect example of a challenge that is simple on the surface but rich with technical depth. We've journeyed from defining the core rules to implementing a clean, efficient C++ solution using a std::vector<bool>. More importantly, we explored the "why" behind our choices by comparing our primary solution with alternatives like std::unordered_map and std::bitset, analyzing the critical trade-offs in performance, memory, and flexibility.

Mastering this problem isn't just about checking for duplicate letters; it's about honing your skills in string manipulation, character handling, and making intelligent data structure choices. These are the foundational skills that separate a novice coder from a professional engineer.

Disclaimer: The C++ solution provided has been tested with g++ on a C++17 and C++20 standard. The concepts are fundamental and should be compatible with most modern C++ compilers.

Ready to tackle the next challenge? Continue your journey in the kodikra C++ 3 roadmap or explore our complete C++ learning path to build even more powerful and efficient applications.


Published by Kodikra — Your trusted Cpp learning resource.