Isogram in ABAP: Complete Solution & Deep Dive Guide

Mastering String Manipulation in ABAP: The Complete Guide to Isogram Detection

An isogram is a word or phrase where no letter repeats. This guide provides a complete, step-by-step tutorial on how to build an efficient isogram checker in modern ABAP, using clean code, internal tables, and best practices for robust data validation logic.

Ever found yourself staring at a complex data validation requirement in an SAP system, wondering about the most efficient way to parse a string? You're not alone. Many ABAP developers face challenges that require them to go beyond simple data moves and dive deep into the world of string manipulation and algorithmic thinking. These tasks can feel daunting, especially when performance is critical.

This article promises to turn that challenge into a skill-building opportunity. We will tackle the classic "Isogram" problem, a fantastic exercise from the exclusive kodikra.com curriculum. By the end of this guide, you will not only have a robust, production-ready ABAP class for detecting isograms but also a much deeper understanding of powerful string functions and internal table operations that you can apply to countless other real-world SAP development scenarios.


What Exactly Is an Isogram?

Before we write a single line of code, it's crucial to understand the problem domain. The definition is simple yet precise, with a few important rules to consider.

An isogram (also known as a "non-pattern word") is a word or phrase that does not contain any repeating letters. The core constraint is the uniqueness of alphabetical characters.

However, there are two special exceptions to this rule:

  • Spaces are allowed to appear multiple times.
  • Hyphens (dashes) are also allowed to appear multiple times.

Let's look at some examples to make this crystal clear:

  • lumberjacks - This is an isogram. Every letter (l, u, m, b, e, r, j, a, c, k, s) appears only once.
  • background - This is also an isogram.
  • six-year-old - This is a perfect example of the rules. Even though it contains a hyphen, all the letters (s, i, x, y, e, a, r, o, l, d) are unique.

Conversely, a word like isograms is not an isogram because the letter 's' appears twice. Similarly, abba is not an isogram because both 'a' and 'b' are repeated.


Why Is This Logic Important for ABAP Developers?

While checking for isograms might seem like a purely academic puzzle, the underlying principles are directly applicable to everyday tasks in the SAP world. Mastering this logic equips you with skills in several critical areas:

  • Data Validation & Cleansing: Imagine a scenario where a user must enter a unique set of preference codes, or a material characteristic must not contain duplicate flags. The logic for checking isograms is the foundation for such validation rules.
  • Algorithmic Thinking: This problem forces you to think about efficiency. How can you check for duplicates with the minimum number of operations? This mindset is invaluable when working with large datasets in SAP, where performance is paramount.
  • Mastery of String Operations: You will become intimately familiar with essential ABAP statements like TRANSLATE, REPLACE, and offset-based string access, which are fundamental for processing any character-based data.
  • Effective Use of Internal Tables: The most efficient solution involves using an internal table to keep track of characters you've already seen. This module is a practical lesson in choosing the right table type (like a SORTED TABLE) for performance-critical lookups.

Ultimately, solving this problem makes you a more versatile and efficient ABAP programmer, ready to tackle more complex data transformation and validation requirements. You can explore more such challenges in our complete ABAP Learning Roadmap.


How to Design the Isogram Detection Algorithm

A robust algorithm is a clear plan of action. Before writing code, we should outline the logical steps required to solve the problem. A good algorithm for checking isograms can be broken down into four main phases.

Step 1: Normalization

The problem states that checks should be case-insensitive (e.g., 'A' is the same as 'a') and that spaces and hyphens should be ignored. Therefore, the first step is to clean or "normalize" the input string.

  • Convert the entire input string to a single case, either lower or upper. Lowercase is a common convention.
  • Remove all occurrences of spaces and hyphens from the string.

For example, if the input is "Six-year-old", after normalization, it should become "sixyearold".
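
The normalization step can be sketched with ABAP's built-in string functions. This is only an illustration of Step 1, not the class code shown later; the variable names are hypothetical, and the statements are assumed to sit inside a method or report.

```abap
" Hypothetical variable names; illustrates Step 1 only.
DATA(lv_input) = `Six-year-old`.

" to_lower( ) returns a lowercase copy of the input.
DATA(lv_normalized) = to_lower( lv_input ).

" occ = 0 replaces every occurrence. The backquoted literals are of
" type string, so the single blank in ` ` is kept as a search pattern.
lv_normalized = replace( val = lv_normalized sub = `-` with = `` occ = 0 ).
lv_normalized = replace( val = lv_normalized sub = ` ` with = `` occ = 0 ).

" lv_normalized now contains `sixyearold`.
```

The functional style avoids modifying the original value, which is convenient when the input is a read-only parameter.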

Step 2: Iteration

Once we have a clean string containing only the letters we need to check, we must examine each character one by one. This means we need a loop that runs from the first character to the last character of the normalized string.

Step 3: Tracking Seen Characters

This is the core of the algorithm. As we iterate through the characters, we need a mechanism to remember which ones we have already encountered. If we see a character that we've already remembered, we know immediately that it's a repeat, and the string is not an isogram.

An internal table in ABAP is the perfect tool for this job. We can add each unique character to this table as we encounter it for the first time.

Step 4: Decision Making

Inside our loop, for each character, we perform a check:

  • "Have I seen this character before?" (i.e., "Does this character already exist in my tracking table?")
  • If yes, we can stop immediately. The string is not an isogram. We don't need to check the rest of the characters.
  • If no, we add the character to our tracking table and move on to the next character in the string.

If our loop completes without ever finding a duplicate, it means every character was unique. Therefore, the string is an isogram.

High-Level Logic Flow

Here is a simple ASCII art diagram illustrating this high-level process flow.

    ● Start (Input String)
    │
    ▼
  ┌───────────────────┐
  │ Normalize String  │
  │ (lowercase,       │
  │  remove ' ', '-') │
  └─────────┬─────────┘
            │
            ▼
  ┌───────────────────┐
  │ Loop Each Char    │
  └─────────┬─────────┘
            │
            ▼
    ◆ Char Seen Before?
   ╱                   ╲
 Yes (Duplicate)      No (Unique)
  │                      │
  ▼                      ▼
┌──────────────┐      ┌──────────────────┐
│ Return FALSE │      │ Add Char to Seen │
└──────────────┘      │ List & Continue  │
                      └──────────────────┘
            │
            ▼
  ● End Loop?
  │
  ▼
┌─────────────┐
│ Return TRUE │
└─────────────┘

Where to Implement the Solution: A Modern ABAP Class

To promote reusable, clean, and testable code, we will implement our logic within a global ABAP class. This is the standard for modern ABAP development, especially in S/4HANA environments. Let's call our class ZCL_KODIKRA_ISOGRAM.

We will define a single public, static method named IS_ISOGRAM.

  • Public: So it can be called from any other program.
  • Static: So we don't need to create an instance of the class to use it. This makes it a simple utility method.

The method will have one importing parameter (the string to check) and one returning parameter (a boolean value indicating if it's an isogram).

The Complete ABAP Code

Here is the full, commented code for the class. We'll break it down in detail in the next section.

CLASS zcl_kodikra_isogram DEFINITION
  PUBLIC
  FINAL
  CREATE PUBLIC .

  PUBLIC SECTION.
    "! <p>Determines if a word or phrase is an isogram.</p>
    "! <p>An isogram is a word or phrase without a repeating letter.</p>
    "! <p>Spaces and hyphens are allowed to appear multiple times.</p>
    "! @parameter iv_string | The input string to check
    "! @parameter rv_is_isogram | abap_true if isogram, abap_false otherwise
    CLASS-METHODS is_isogram
      IMPORTING
        iv_string      TYPE string
      RETURNING
        VALUE(rv_is_isogram) TYPE abap_bool.
  PROTECTED SECTION.
  PRIVATE SECTION.
ENDCLASS.


CLASS zcl_kodikra_isogram IMPLEMENTATION.

  METHOD is_isogram.
    "======================================================================
    " STEP 1: DATA DECLARATION
    "======================================================================
    " lv_processed_string will hold the normalized version of the input.
    DATA lv_processed_string TYPE string.

    " For tracking characters we have already encountered.
    " A sorted table with a unique key is highly efficient for lookups.
    TYPES:
      BEGIN OF ty_char,
        char TYPE c LENGTH 1,
      END OF ty_char.

    DATA lt_seen_chars TYPE SORTED TABLE OF ty_char
                         WITH UNIQUE KEY char.
    DATA ls_seen_char TYPE ty_char.

    "======================================================================
    " STEP 2: NORMALIZE THE INPUT STRING
    "======================================================================
    " Assign input to our working variable.
    lv_processed_string = iv_string.

    " Convert to lowercase to make the check case-insensitive.
    TRANSLATE lv_processed_string TO LOWER CASE.

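    " Remove all spaces and hyphens as per the requirements.
    " Backquoted (string) literals are used because trailing blanks in a
    " type c pattern are ignored, which would make ' ' an empty pattern.
    REPLACE ALL OCCURRENCES OF `-` IN lv_processed_string WITH ``.
    REPLACE ALL OCCURRENCES OF ` ` IN lv_processed_string WITH ``.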

    "======================================================================
    " STEP 3: ITERATE AND CHECK FOR DUPLICATES
    "======================================================================
    " Assume it's an isogram until we find a duplicate.
    rv_is_isogram = abap_true.

    " Get the length of the string to control the loop.
    DATA(lv_len) = strlen( lv_processed_string ).

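    " Loop through each character of the normalized string.
    DO lv_len TIMES.
      " sy-index is 1-based in DO loops, but string offsets are 0-based,
      " so we subtract 1. Offset access expects a variable or literal,
      " not an arithmetic expression, hence the helper variable.
      DATA(lv_offset) = sy-index - 1.
      DATA(lv_current_char) = lv_processed_string+lv_offset(1).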

      " Check if we have seen this character before.
      READ TABLE lt_seen_chars
           WITH KEY char = lv_current_char
           TRANSPORTING NO FIELDS.

      " sy-subrc = 0 means the character was found in our tracking table.
      IF sy-subrc = 0.
        " It's a duplicate. This is not an isogram.
        rv_is_isogram = abap_false.
        " Exit the loop immediately for efficiency.
        EXIT.
      ELSE.
        " It's a new character. Add it to our tracking table.
        ls_seen_char-char = lv_current_char.
        INSERT ls_seen_char INTO TABLE lt_seen_chars.
      ENDIF.
    ENDDO.

  ENDMETHOD.
ENDCLASS.
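
Once the class is active, a call could look like the following sketch. It assumes is_isogram is declared as a static method (CLASS-METHODS), as described above; the report name is hypothetical.

```abap
REPORT zdemo_isogram_call. " hypothetical report name

" Static call: no instance of the class is created.
DATA(lv_is_isogram) = zcl_kodikra_isogram=>is_isogram( `six-year-old` ).

IF lv_is_isogram = abap_true.
  WRITE / 'The input is an isogram.'.
ELSE.
  WRITE / 'The input is not an isogram.'.
ENDIF.
```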

Detailed Code Walkthrough

Let's dissect the ABAP code to understand exactly how it works, piece by piece.

Section 1: Data Declarations


    DATA lv_processed_string TYPE string.

    TYPES:
      BEGIN OF ty_char,
        char TYPE c LENGTH 1,
      END OF ty_char.

    DATA lt_seen_chars TYPE SORTED TABLE OF ty_char
                         WITH UNIQUE KEY char.
    DATA ls_seen_char TYPE ty_char.

Here, we declare the variables we'll need.

  • lv_processed_string: A string variable to hold the cleaned-up version of the input. A separate variable is needed because the importing parameter iv_string is passed by reference and cannot be modified inside the method.
  • lt_seen_chars: This is the most important declaration. We define it as a SORTED TABLE with a UNIQUE KEY on the character field. This is a deliberate performance choice. A sorted table uses a binary search algorithm for lookups (like our READ TABLE statement), which is significantly faster than a linear scan of a standard table, especially as the number of unique characters grows.
  • ls_seen_char: A work area (structure) compatible with our internal table, used to insert new characters.

Section 2: Input Normalization


    lv_processed_string = iv_string.

    TRANSLATE lv_processed_string TO LOWER CASE.

    REPLACE ALL OCCURRENCES OF `-` IN lv_processed_string WITH ``.
    REPLACE ALL OCCURRENCES OF ` ` IN lv_processed_string WITH ``.

This block implements the first step of our algorithm.

  • TRANSLATE ... TO LOWER CASE: This is a standard and highly efficient ABAP statement for case conversion. It ensures that 'I' and 'i' are treated as the same character.
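  • REPLACE ALL OCCURRENCES OF ...: We use this statement twice to strip out all hyphens and spaces, leaving us with only the letters to be checked for uniqueness. The blank pattern should be written as a backquoted string literal (` `), because trailing blanks in a type c literal such as ' ' are ignored in search patterns.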

Section 3: The Core Logic Loop


    rv_is_isogram = abap_true.
    DATA(lv_len) = strlen( lv_processed_string ).

    DO lv_len TIMES.
      DATA(lv_offset) = sy-index - 1.
      DATA(lv_current_char) = lv_processed_string+lv_offset(1).

      READ TABLE lt_seen_chars ...

      IF sy-subrc = 0.
        rv_is_isogram = abap_false.
        EXIT.
      ELSE.
        ls_seen_char-char = lv_current_char.
        INSERT ls_seen_char INTO TABLE lt_seen_chars.
      ENDIF.
    ENDDO.

This is the heart of the function.

  • Pessimistic vs. Optimistic Approach: We start by setting our return value rv_is_isogram to abap_true. This is an "optimistic" approach: we assume the string is an isogram and only change our minds if we find evidence to the contrary (a duplicate character).
  • Looping: A DO lv_len TIMES loop is a simple way to iterate a fixed number of times. Inside, sy-index (which is 1-based) is reduced by 1 to get the 0-based offset needed for character access, so each iteration extracts exactly one character.
  • The Check: READ TABLE lt_seen_chars ... is the crucial check. Because our table is a SORTED TABLE, this read is extremely fast. We use TRANSPORTING NO FIELDS because we don't care about the data in the table; we only care if a row was found.
  • The Verdict: The system variable sy-subrc tells us the outcome. If sy-subrc is 0, a matching entry was found, meaning the character is a duplicate. We immediately set our return value to abap_false and use EXIT to terminate the loop. There's no point in checking further. If sy-subrc is not 0 (usually 4), the character is new, and we INSERT it into our tracking table.
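
Character extraction can also be written with the built-in substring( ) function, which accepts an arithmetic expression for the offset. The following is a minimal sketch with hypothetical variable names, assumed to run inside a report:

```abap
DATA(lv_text) = `abc`.
DATA(lv_len)  = strlen( lv_text ).

DO lv_len TIMES.
  " sy-index counts from 1; string offsets count from 0.
  DATA(lv_char) = substring( val = lv_text off = sy-index - 1 len = 1 ).
  WRITE / lv_char.
ENDDO.
```

Whether to use offset access or substring( ) is largely a matter of style; the function form simply removes the need for a separate offset variable.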

Internal Table Logic Flow

This diagram shows the decision-making process inside the loop for each character.

    ● Get Current Char
    │
    ▼
  ┌───────────────────────────┐
  │ READ TABLE lt_seen_chars  │
  │ WITH KEY char = curr_char │
  └────────────┬──────────────┘
               │
               ▼
      ◆ sy-subrc = 0?
     ╱               ╲
   Yes (Found)      No (Not Found)
    │                  │
    ▼                  ▼
┌──────────────┐   ┌──────────────────┐
│ Set Flag to  │   │ INSERT curr_char │
│ FALSE & EXIT │   │ INTO TABLE       │
└──────────────┘   │ lt_seen_chars    │
                   └──────────────────┘
    │
    ▼
    ● Next Iteration

Alternative Approaches and Performance Considerations

While our chosen solution is robust and efficient, it's helpful for a senior developer to be aware of other potential methods and their trade-offs.

Approach: Sorted Internal Table (Our Solution)
  • Pros: Very fast lookups (binary search, O(log n)). Clean, modern ABAP syntax. Memory efficient for the ASCII character set.
  • Cons: Small overhead for maintaining the sorted order on inserts.

Approach: Hashed Internal Table
  • Pros: Potentially the fastest for lookups (amortized O(1) complexity). Excellent for very large, unique character sets.
  • Cons: Higher memory consumption than a sorted table. The key must be well-defined. For a single character, this is simple.

Approach: Standard Table with READ
  • Pros: Simple to declare.
  • Cons: Very poor performance. Each READ TABLE performs a linear scan (O(n)), making the overall algorithm O(n²), which is unacceptable for long strings.

Approach: Nested Loops (Brute Force)
  • Pros: Requires no extra memory (no internal table).
  • Cons: Extremely inefficient (O(n²)). Involves comparing every character with every other character. Should be avoided in production code.

For this specific problem, both Sorted and Hashed tables are excellent choices. The performance difference between them is likely negligible given the small size of the character set (e.g., 26 letters in English). Our choice of a sorted table balances lookup performance with readable code.
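
For comparison, a hashed-table variant might look like the sketch below. It is an illustration rather than the article's solution: the local class name lcl_isogram_hashed is hypothetical, and it skips spaces and hyphens inside the loop instead of removing them first.

```abap
CLASS lcl_isogram_hashed DEFINITION FINAL.
  PUBLIC SECTION.
    CLASS-METHODS is_isogram
      IMPORTING iv_string            TYPE string
      RETURNING VALUE(rv_is_isogram) TYPE abap_bool.
ENDCLASS.

CLASS lcl_isogram_hashed IMPLEMENTATION.
  METHOD is_isogram.
    TYPES ty_char TYPE c LENGTH 1.
    DATA lt_seen TYPE HASHED TABLE OF ty_char WITH UNIQUE KEY table_line.

    DATA(lv_text) = to_lower( iv_string ).
    DATA(lv_len)  = strlen( lv_text ).

    rv_is_isogram = abap_true.

    DO lv_len TIMES.
      DATA(lv_offset) = sy-index - 1.
      DATA(lv_char) = CONV ty_char( lv_text+lv_offset(1) ).

      " Spaces and hyphens may repeat, so they are not tracked.
      IF lv_char = space OR lv_char = '-'.
        CONTINUE.
      ENDIF.

      " Inserting into a table with a unique key sets sy-subrc to 4
      " when the key already exists, so no separate READ is needed.
      INSERT lv_char INTO TABLE lt_seen.
      IF sy-subrc <> 0.
        rv_is_isogram = abap_false.
        RETURN.
      ENDIF.
    ENDDO.
  ENDMETHOD.
ENDCLASS.
```

Relying on the return code of INSERT folds the lookup and the insert into one statement; the same idea would also work with the sorted table used in the main solution.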

To deepen your knowledge, we recommend exploring the fundamentals of ABAP programming, where we cover internal table types in great detail.


Frequently Asked Questions (FAQ)

1. What is the difference between an isogram and a pangram?

They are different concepts. An isogram has no repeating letters (e.g., "background"). A pangram is a sentence that contains every letter of the alphabet at least once (e.g., "The quick brown fox jumps over the lazy dog"). A string can be both, one, or neither.

2. How does this solution handle Unicode or multi-byte characters?

Our solution uses TYPE c LENGTH 1. In a Unicode SAP system, such a field holds one 16-bit code unit, so accented letters and most non-Latin scripts are still handled as single characters. Characters stored as surrogate pairs (such as emojis) and letters built from combining marks occupy more than one code unit and would be split by this approach. Supporting them would require character extraction and internal table field types that keep the full character representation together.

3. Is the ABAP FIND statement a good way to check for isograms?

No, it would be very inefficient. Using FIND would require you to loop through each character and then, for each one, search the rest of the string for another occurrence. This leads to an O(n²) time complexity, similar to the brute-force nested loop approach, and should be avoided.

4. What is the time complexity of the provided ABAP solution?

The time complexity is approximately O(n log k), where 'n' is the length of the input string and 'k' is the number of unique characters encountered. The 'n' comes from the main loop, and the 'log k' comes from the binary search lookup (READ TABLE) and insertion into the sorted table. Because k is bounded by the size of the alphabet (26 for English letters), the cost grows essentially linearly with n.

5. Could I use a HASHED TABLE for even better performance?

Yes, you absolutely could. A HASHED TABLE would change the lookup time to be, on average, constant time (O(1)). This would make the overall complexity closer to O(n). For this specific problem, the difference is minimal, but for problems with a much larger set of possible keys, a hashed table is often the superior choice.

6. How do I test this ABAP class?

The best practice is to create a local or global test class using the ABAP Unit framework. You would create test methods that call zcl_kodikra_isogram=>is_isogram( ) with various inputs (e.g., "lumberjacks", "isograms", "six-year-old", "") and use cl_abap_unit_assert=>assert_equals( ) to verify that the returned value matches the expected outcome.
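
A local test class along these lines could be placed in the class's test include. This is a sketch: the test class and method names are hypothetical, it assumes is_isogram is callable as a static method, and the expected values follow the isogram rules described earlier.

```abap
CLASS ltc_isogram DEFINITION FINAL FOR TESTING
  DURATION SHORT
  RISK LEVEL HARMLESS.
  PRIVATE SECTION.
    METHODS unique_letters_pass   FOR TESTING.
    METHODS repeated_letter_fails FOR TESTING.
    METHODS hyphens_are_ignored   FOR TESTING.
    METHODS empty_string_passes   FOR TESTING.
ENDCLASS.

CLASS ltc_isogram IMPLEMENTATION.
  METHOD unique_letters_pass.
    cl_abap_unit_assert=>assert_equals(
      exp = abap_true
      act = zcl_kodikra_isogram=>is_isogram( `lumberjacks` ) ).
  ENDMETHOD.

  METHOD repeated_letter_fails.
    cl_abap_unit_assert=>assert_equals(
      exp = abap_false
      act = zcl_kodikra_isogram=>is_isogram( `isograms` ) ).
  ENDMETHOD.

  METHOD hyphens_are_ignored.
    cl_abap_unit_assert=>assert_equals(
      exp = abap_true
      act = zcl_kodikra_isogram=>is_isogram( `six-year-old` ) ).
  ENDMETHOD.

  METHOD empty_string_passes.
    " No characters means no duplicates, so the loop never finds a repeat.
    cl_abap_unit_assert=>assert_equals(
      exp = abap_true
      act = zcl_kodikra_isogram=>is_isogram( `` ) ).
  ENDMETHOD.
ENDCLASS.
```

One test method per rule keeps failures easy to read in the ABAP Unit result display.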

7. Why are spaces and hyphens ignored by definition?

This is simply part of the classic definition of an isogram as defined in this kodikra module. The focus is on the uniqueness of the alphabetical letters themselves, not all characters. This makes the puzzle more interesting by allowing for multi-word phrases to be evaluated.


Conclusion: From Problem to Pattern

We have successfully journeyed from a simple problem statement to a complete, robust, and performant solution in modern ABAP. By tackling the isogram challenge, you've done more than just write a function; you've practiced a pattern of problem-solving that is essential for any senior developer: Analyze, Design, Implement, and Refine.

You've reinforced your understanding of critical ABAP concepts, including class-based design, powerful string manipulation statements, and the strategic use of different internal table types for optimal performance. This knowledge is not confined to this single problem; it is a powerful toolset you can now apply to complex data validation rules, custom report logic, and integration scenarios across the SAP landscape.

Disclaimer: The code provided in this article is written for modern ABAP syntax, primarily found in SAP S/4HANA and recent versions of SAP NetWeaver (7.40 and above). The logic can be adapted for older versions, but the syntax (e.g., inline declarations with DATA()) may need to be adjusted.


Published by Kodikra, your trusted ABAP learning resource.