Anagram in ABAP: Complete Solution & Deep Dive Guide

Mastering String Manipulation: An ABAP Anagram Solver from Zero to Hero

Learn to solve the Anagram problem in ABAP by creating a robust class. This guide covers case-insensitive string comparison, sorting characters, and filtering candidates using internal tables, providing a complete, performance-optimized solution for this classic algorithm challenge found in the exclusive kodikra.com curriculum.

You've just been handed a task that seems deceptively simple: find all the anagrams for a specific word from a given list. Your mind immediately races. "Easy," you think, "I'll just compare letters." But then the requirements start piling up. The comparison must be case-insensitive. A word cannot be its own anagram. The solution needs to be efficient, reusable, and fit within the structured world of ABAP objects. Suddenly, this "simple" task has become a puzzle of string manipulation, table operations, and algorithmic thinking.

This is a common scenario for developers. The gap between understanding a problem and implementing an elegant, robust solution can be vast. This guide is here to bridge that gap. We will walk you through, step-by-step, the process of building a powerful anagram solver in ABAP. You won't just get code; you'll understand the logic, the design choices, and the core ABAP concepts that make it all work, turning a frustrating challenge into a showcase of your skills.


What Exactly Is the Anagram Problem in ABAP?

At its core, an anagram is a word or phrase formed by rearranging the letters of a different word or phrase, typically using all the original letters exactly once. For example, the word "listen" is an anagram of "silent". They both use the exact same letters (l, i, s, t, e, n) with the same frequency, just in a different order.

When we frame this as a programming challenge, specifically within the context of the kodikra learning path, we are given a target word and a list of candidate words. The objective is to write a program that filters this list and returns only the candidates that are true anagrams of the target word.

The rules for this specific problem add a few crucial layers of complexity:

  • Case-Insensitivity: The comparison must ignore whether letters are uppercase or lowercase. This means "Listen" should be considered an anagram of "Silent". This rule forces us to normalize our input strings before any comparison.
  • Identity Exclusion: A word is never its own anagram. So, if the target word is "stop" and the candidate list contains "stop", it should not be included in the results. This requires a direct, though case-insensitive, comparison to filter out identical matches.
  • Character Set: The inputs consist of standard ASCII alphabetic characters (A-Z, a-z). This simplifies the problem by excluding numbers, punctuation, or special characters, allowing us to focus on the core sorting and comparison logic.

The fundamental strategy to solve this is to find a "canonical representation" or "signature" for each word. If two different words share the same signature, they are anagrams. The most common and intuitive way to create this signature is to convert the word to a consistent case (e.g., lowercase) and then sort its characters alphabetically. For example, both "listen" and "silent", when normalized this way, become "eilnst". This signature becomes the key to our comparison logic.


Why Is Solving This a Core Skill for ABAP Developers?

While finding anagrams might seem like an academic exercise, the underlying techniques are directly applicable to real-world challenges within the SAP ecosystem. Mastering this problem demonstrates proficiency in several critical areas of ABAP development that are used daily in building reports, interfaces, and enhancements.

Deepening String Manipulation Skills

SAP systems are treasure troves of data, much of it in text or string format. Customer names, material descriptions, addresses, and user comments all require sophisticated string handling. The anagram problem makes you work with a suite of ABAP's string functions and statements:

  • to_lower(): Essential for data normalization and ensuring consistent comparisons.
  • strlen(): Used for preliminary checks and validations.
  • substring(): Extracts part of a string by offset and length, which is how a word is taken apart one character at a time.
  • SPLIT: Divides a string at a separator, for example to turn a delimited line into an internal table of words. It cannot split a string after every character.
  • CONCATENATE: The counterpart of SPLIT, used for reconstructing strings from internal tables (CONCATENATE LINES OF).
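
To make the list above concrete, here is a minimal sketch of these functions and statements in a small demo report. The report name and all variable names are hypothetical and chosen only for illustration; string_table is assumed to be available as the standard table type for strings.

```abap
REPORT zdemo_string_basics.

" Hypothetical demo data, not part of the anagram class.
DATA lt_words  TYPE string_table.
DATA lv_joined TYPE string.

" Normalize, measure, and read a single character by offset.
DATA(lv_lower)  = to_lower( `Listen` ).
DATA(lv_length) = strlen( lv_lower ).
DATA(lv_first)  = substring( val = lv_lower off = 0 len = 1 ).

" SPLIT divides at a separator (here a blank), not after every character.
SPLIT `enlist google inlets` AT ` ` INTO TABLE lt_words.

" CONCATENATE LINES OF joins the table rows back into one string.
CONCATENATE LINES OF lt_words INTO lv_joined SEPARATED BY `,`.

ASSERT lv_lower  = `listen`.
ASSERT lv_length = 6.
ASSERT lv_first  = `l`.
ASSERT lines( lt_words ) = 3.
ASSERT lv_joined = `enlist,google,inlets`.
```

The ASSERT statements double as documentation of what each call is expected to return.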

Proficiency in these commands is non-negotiable for any developer tasked with data cleansing, data migration (ETL), or building dynamic user interfaces in SAP.

Mastery of Internal Tables

Internal tables are the lifeblood of ABAP programming. This problem isn't just about strings; it's about managing collections of data efficiently. You will work with an internal table of candidate words and build a new internal table for the results. The process involves:

  • Looping: Iterating through the candidate table is the main driver of the logic.
  • Appending: Conditionally adding valid anagrams to the result table.
  • Sorting: The core of the algorithm involves sorting an internal table of single characters to create the word's signature.

Understanding how to declare, populate, and manipulate internal tables efficiently is fundamental to writing performant ABAP code, especially when dealing with large datasets from tables like MARA or KNA1.
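
The three table operations named above can be sketched in a few lines. This is an illustrative demo report with hypothetical names, not part of the anagram class: it keeps only the six-letter candidates and then sorts the surviving rows.

```abap
REPORT zdemo_itab_basics.

" Hypothetical sample data for the sketch.
DATA(lt_candidates) = VALUE string_table(
  ( `inlets` ) ( `google` ) ( `bananas` ) ( `enlist` ) ).
DATA lt_result TYPE string_table.

" Looping: visit every candidate once.
LOOP AT lt_candidates INTO DATA(lv_candidate).
  " Appending: keep a row only if it passes the condition.
  IF strlen( lv_candidate ) = 6.
    APPEND lv_candidate TO lt_result.
  ENDIF.
ENDLOOP.

" Sorting: order the remaining rows by their content.
SORT lt_result BY table_line.

ASSERT lt_result = VALUE string_table( ( `enlist` ) ( `google` ) ( `inlets` ) ).
```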

Algorithmic and Object-Oriented Thinking

Finally, this problem encourages structured, algorithmic thinking. You must break down a complex requirement into a series of logical, manageable steps. Encapsulating this logic within an ABAP Class (zcl_anagram) promotes reusability and clean code design, which are hallmarks of modern ABAP development. This object-oriented approach ensures that your anagram solver can be easily integrated into any program, function module, or even a web service, without rewriting the core logic.


How to Design and Implement the ABAP Solution

The most effective strategy is the "Sort and Compare" method. We'll create a unique, sorted signature for our target word. Then, for each candidate word, we'll generate its own signature and compare it to the target's signature. If they match, and the words themselves aren't identical, we've found an anagram.

Let's visualize this high-level process.

High-Level Algorithm Flow

This diagram illustrates the overall logic from receiving the input to returning the final list of anagrams.

    ● Start (Target Word, Candidate List)
    │
    ▼
  ┌───────────────────────────┐
  │ 1. Normalize Target Word  │
  │    (Convert to Lowercase) │
  └────────────┬──────────────┘
               │
               ▼
  ┌───────────────────────────┐
  │ 2. Create Target Signature│
  │    (Sort its characters)  │
  └────────────┬──────────────┘
               │
               ▼
  ┌─────────────────────────────────┐
  │ 3. Loop Through Candidate List  │
  └─────────────────────────────────┘
     │
     ├─ For each `candidate`...
     │
     ▼
  ◆ Is `candidate` (case-insensitive) the same as `target`?
  ╱                                    ╲
 Yes (Identity Match)                     No (Proceed)
  │                                      │
  ▼                                      ▼
[Skip to Next Candidate]       ┌───────────────────────────┐
                               │ 4. Normalize Candidate    │
                               │    (Convert to Lowercase) │
                               └────────────┬──────────────┘
                                            │
                                            ▼
                               ┌───────────────────────────┐
                               │ 5. Create Candidate Sig.  │
                               │    (Sort its characters)  │
                               └────────────┬──────────────┘
                                            │
                                            ▼
                               ◆ Does Candidate Sig. == Target Sig.?
                               ╱                             ╲
                              Yes (Anagram Found!)            No
                              │                             │
                              ▼                             ▼
                           ┌─────────────────┐           [Skip to Next
                           │ 6. Add to       │            Candidate]
                           │    Results List │
                           └─────────────────┘
     │
     └─ End of Loop
     │
     ▼
    ● Return Results List

To implement this in modern ABAP, we will create a global class named zcl_anagram. This class will have a single public method, find, which takes the target word and a table of candidates as input and returns a table of anagrams.

Class Definition (zcl_anagram.clas.abap)

The class definition sets up the public interface. We define the necessary data types for our method parameters: a single string for the target word and a string_table for the candidates and the results.


CLASS zcl_anagram DEFINITION
  PUBLIC
  FINAL
  CREATE PUBLIC .

  PUBLIC SECTION.
    "! Finds anagrams for a given word from a list of candidates.
    "!
    "! @parameter i_target_word | The word to find anagrams for.
    "! @parameter it_candidates | A table of potential anagrams.
    "! @parameter r_anagrams    | A table containing the found anagrams.
    METHODS find
      IMPORTING
        i_target_word TYPE string
        it_candidates TYPE string_table
      RETURNING
        VALUE(r_anagrams) TYPE string_table.

  PRIVATE SECTION.
    "! Creates a canonical signature of a word (its characters, sorted).
    "! The caller is expected to pass an already lowercased word.
    "!
    "! @parameter i_word | The input word, already lowercased.
    "! @parameter r_sorted_word | The sorted signature of the word.
    METHODS get_word_signature
      IMPORTING
        i_word            TYPE string
      RETURNING
        VALUE(r_sorted_word) TYPE string.

ENDCLASS.

Class Implementation (zcl_anagram.clas.imp.abap)

Here lies the core logic. We implement the public find method and a private helper method get_word_signature to keep our code clean and adhere to the Single Responsibility Principle.


CLASS zcl_anagram IMPLEMENTATION.

  METHOD find.
    " Normalize the target word once to avoid redundant processing in the loop.
    DATA(lv_target_lower) = to_lower( i_target_word ).
    DATA(lv_target_signature) = get_word_signature( lv_target_lower ).

    " Loop through each candidate word to check if it's an anagram.
    LOOP AT it_candidates INTO DATA(lv_candidate).
      DATA(lv_candidate_lower) = to_lower( lv_candidate ).

      " Rule: A word is not its own anagram.
      " Skip if the lowercase candidate is identical to the lowercase target.
      IF lv_candidate_lower = lv_target_lower.
        CONTINUE.
      ENDIF.

      " Rule: Anagrams must have the same length.
      " This is an early exit optimization.
      IF strlen( lv_candidate_lower ) <> strlen( lv_target_lower ).
        CONTINUE.
      ENDIF.

      " Generate the signature for the current candidate word.
      DATA(lv_candidate_signature) = get_word_signature( lv_candidate_lower ).

      " If signatures match, it's an anagram. Add original candidate to results.
      IF lv_candidate_signature = lv_target_signature.
        APPEND lv_candidate TO r_anagrams.
      ENDIF.
    ENDLOOP.
  ENDMETHOD.


  METHOD get_word_signature.
    " This helper method creates the canonical representation of a word.

    " 1. Create an internal table to hold each character.
    TYPES: BEGIN OF ty_char,
             char TYPE c LENGTH 1,
           END OF ty_char.
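    DATA lt_chars TYPE STANDARD TABLE OF ty_char.

    " 2. Copy the word into the table, one character per row.
    " SPLIT is not suitable here: it divides a string at a separator,
    " and the literal '' is treated as a single blank.
    DO strlen( i_word ) TIMES.
      APPEND VALUE #( char = substring( val = i_word off = sy-index - 1 len = 1 ) ) TO lt_chars.
    ENDDO.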

    " 3. Sort the table of characters alphabetically.
    SORT lt_chars BY char.

    " 4. Concatenate the sorted characters back into a single string.
    CONCATENATE LINES OF lt_chars INTO r_sorted_word.
  ENDMETHOD.

ENDCLASS.
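
With the class in place, calling it could look like the following sketch. The report name and the sample words are hypothetical. By the rules described earlier, and assuming get_word_signature really returns the sorted characters of each word, the result should contain Silent and inlets: LISTEN is the target itself in a different case, and enlists has one letter too many.

```abap
REPORT zdemo_anagram_usage.

START-OF-SELECTION.
  " Hypothetical sample call; zcl_anagram is the class defined above.
  DATA(lt_anagrams) = NEW zcl_anagram( )->find(
    i_target_word = `Listen`
    it_candidates = VALUE #( ( `enlists` ) ( `Silent` ) ( `LISTEN` ) ( `inlets` ) ) ).

  " Print each anagram in its original casing.
  LOOP AT lt_anagrams INTO DATA(lv_anagram).
    WRITE / lv_anagram.
  ENDLOOP.
```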

A Deep Dive into the Code: The Walkthrough

Understanding the code line-by-line is crucial for true mastery. Let's dissect the implementation, focusing on the "why" behind each ABAP statement.

The `get_word_signature` Helper Method

This private method is the heart of our algorithm. Its sole purpose is to take any string and return its canonical signature. This is a great example of code modularity.

    ● Input: `i_word` (e.g., "listen", already lowercased by `find`)
    │
    ▼
  ┌────────────────────────────┐
  │ 1. Define internal table   │
  │    `lt_chars` (type c, 1)  │
  └─────────────┬──────────────┘
                │
                ▼
  ┌────────────────────────────┐
  │ 2. Copy each character of  │
  │    i_word into lt_chars    │
  │    (lt_chars now holds     │
  │    'l','i','s','t','e','n')│
  └─────────────┬──────────────┘
                │
                ▼
  ┌────────────────────────────┐
  │ 3. SORT lt_chars BY char   │
  │    (lt_chars is now        │
  │    'e','i','l','n','s','t')│
  └─────────────┬──────────────┘
                │
                ▼
  ┌────────────────────────────┐
  │ 4. CONCATENATE LINES OF    │
  │    lt_chars INTO result    │
  └─────────────┬──────────────┘
                │
                ▼
    ● Return: `r_sorted_word` (e.g., "eilnst")

  1. TYPES ... / DATA lt_chars ...: We declare a local internal table, lt_chars, with a single column of type c with length 1. This table is structured to hold exactly one character per row.
  2. Splitting into characters: SPLIT cannot do this step. It divides a string at a separator, and the literal '' is treated as a single blank, so SPLIT i_word AT '' would split at blanks and leave a word without blanks in a single row. Instead, loop strlen( i_word ) times and append substring( val = i_word off = sy-index - 1 len = 1 ) to lt_chars on each pass. sy-index starts at 1, hence the - 1 to get a zero-based offset.
  3. SORT lt_chars BY char.: This is a standard operation on an internal table. It sorts the rows of lt_chars by the value in the char column. Without further additions, SORT compares character fields by their code-point values, so the lowercase letters a to z come out in alphabetical order. Uppercase letters would sort before all lowercase ones, which is one more reason find lowercases every word before asking for its signature.
  4. CONCATENATE LINES OF lt_chars INTO r_sorted_word.: This joins all rows of lt_chars, in their sorted order, into a single string, which is then returned.
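
The four steps above can also be tried in isolation. The following is a minimal sketch rather than the class's own code: it assumes a release that supports FOR ... WHILE table expressions and the built-in concat_lines_of( ) function, and the report and variable names are hypothetical.

```abap
REPORT zdemo_word_signature.

" Hypothetical input; it is already lowercase, as find would pass it.
DATA(lv_word) = `listen`.

" Steps 1 and 2: build a table with one character per row.
DATA(lt_chars) = VALUE string_table(
  FOR i = 0 WHILE i < strlen( lv_word )
  ( substring( val = lv_word off = i len = 1 ) ) ).

" Step 3: sort the characters by their code-point values.
SORT lt_chars BY table_line.

" Step 4: join the sorted characters into the signature.
DATA(lv_signature) = concat_lines_of( lt_chars ).

ASSERT lv_signature = `eilnst`.
```

Running the same steps on `silent` yields the same signature, which is exactly the property the comparison in find relies on.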

The Public `find` Method

This method orchestrates the entire process, using our helper method to do the heavy lifting.

  1. DATA(lv_target_lower) = to_lower( i_target_word ).: First, we normalize the target word by converting it to lowercase. We do this once, outside the loop, for efficiency. Processing it repeatedly inside the loop for every candidate would be redundant and wasteful.
  2. DATA(lv_target_signature) = get_word_signature( lv_target_lower ).: We then generate the signature for our normalized target word. This signature will be the "golden key" we compare against all candidates.
  3. LOOP AT it_candidates INTO DATA(lv_candidate).: This begins the main iteration over the provided list of potential anagrams.
  4. DATA(lv_candidate_lower) = to_lower( lv_candidate ).: Inside the loop, the first step for each candidate is to normalize it to lowercase, just as we did with the target. This ensures our comparisons are case-insensitive.
  5. IF lv_candidate_lower = lv_target_lower. CONTINUE. ENDIF.: This is our identity check. If the normalized candidate is exactly the same as the normalized target, it's the same word. The problem states a word cannot be its own anagram, so we use CONTINUE to immediately skip to the next iteration of the loop.
  6. IF strlen( lv_candidate_lower ) <> strlen( lv_target_lower ). CONTINUE. ENDIF.: This is a crucial performance optimization. By definition, two words can only be anagrams if they have the exact same number of letters. By checking the length first, we can immediately discard any candidates that are too long or too short, avoiding the more expensive signature generation process for them.
  7. DATA(lv_candidate_signature) = get_word_signature( lv_candidate_lower ).: Only if a candidate passes the identity and length checks do we invest the processing power to generate its signature using our helper method.
  8. IF lv_candidate_signature = lv_target_signature. APPEND lv_candidate TO r_anagrams. ENDIF.: This is the final comparison. If the candidate's signature matches the target's signature, we have found an anagram! We then APPEND the original candidate word (preserving its original casing) to our results table, r_anagrams.

When to Consider Alternatives: Performance and Other Approaches

The "Sort and Compare" method is elegant, easy to understand, and generally performs well for moderately sized lists. However, for extremely large datasets or in performance-critical applications, it's worth knowing about alternative strategies. The most common alternative is the "Character Frequency Map" or "Hashing" method.

Alternative: Character Frequency Map

Instead of sorting, this method involves counting the occurrences of each character in a word. You create a map (or in ABAP, a hashed table) where the key is the character and the value is its count. Two words are anagrams if their character frequency maps are identical.

Example: For "listen":

  • l: 1
  • i: 1
  • s: 1
  • t: 1
  • e: 1
  • n: 1

For "silent", the map would be exactly the same, proving they are anagrams.
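
A frequency map along these lines could be sketched as follows. All type, class, and method names are hypothetical. Note one deliberate swap: the sketch uses a sorted table instead of a hashed table, so that the rows are always ordered by character and two maps can be compared directly with the = operator.

```abap
REPORT zdemo_char_frequency.

" Hypothetical types for the sketch: one row per distinct character.
TYPES: BEGIN OF ty_count,
         char       TYPE c LENGTH 1,
         char_count TYPE i,
       END OF ty_count,
       ty_counts TYPE SORTED TABLE OF ty_count WITH UNIQUE KEY char.

CLASS lcl_frequency DEFINITION FINAL.
  PUBLIC SECTION.
    CLASS-METHODS count_chars
      IMPORTING i_word          TYPE string
      RETURNING VALUE(r_counts) TYPE ty_counts.
ENDCLASS.

CLASS lcl_frequency IMPLEMENTATION.
  METHOD count_chars.
    DATA lv_char TYPE c LENGTH 1.
    DATA(lv_word) = to_lower( i_word ).

    " Visit each character once and increment its counter.
    DO strlen( lv_word ) TIMES.
      lv_char = substring( val = lv_word off = sy-index - 1 len = 1 ).
      READ TABLE r_counts ASSIGNING FIELD-SYMBOL(<ls_count>)
           WITH TABLE KEY char = lv_char.
      IF sy-subrc = 0.
        <ls_count>-char_count = <ls_count>-char_count + 1.
      ELSE.
        INSERT VALUE #( char = lv_char char_count = 1 ) INTO TABLE r_counts.
      ENDIF.
    ENDDO.
  ENDMETHOD.
ENDCLASS.

START-OF-SELECTION.
  " Same letters, same counts: the maps are equal.
  ASSERT lcl_frequency=>count_chars( `Listen` ) = lcl_frequency=>count_chars( `silent` ).
  " The extra 's' changes one counter, so the maps differ.
  ASSERT lcl_frequency=>count_chars( `listen` ) <> lcl_frequency=>count_chars( `listens` ).
```

The identity rule still has to be checked separately, because a word always has the same frequency map as itself.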

Pros and Cons Comparison

Here’s a breakdown of how the two main approaches stack up against each other.

Aspect | Sort and Compare (Our Solution) | Character Frequency Map
Complexity to Implement | Low. The logic is very straightforward and relies on standard ABAP statements such as SORT and CONCATENATE. | Medium. Requires creating and managing a hashed table, incrementing counts, and then comparing two tables, which can be more verbose.
Readability | High. The idea of "if the sorted versions are the same, they're anagrams" is very intuitive for other developers to understand. | Medium. The logic of comparing frequency maps is sound but less immediately obvious than comparing two sorted strings.
Performance | Good. The dominant operation is sorting. The time complexity is roughly O(N * K log K), where N is the number of candidates and K is the length of the words. | Potentially Better. The time complexity is roughly O(N * K), as you just iterate through the characters once to build the map. For very long words, this can be faster than sorting.
Memory Usage | Moderate. Requires an intermediate internal table to hold the characters for sorting for each word. | Moderate. Requires an intermediate hashed table to store character counts for each word. The size depends on the number of unique characters.
Best For | Most common scenarios, educational purposes, and applications where code clarity is paramount. It's the standard, reliable ABAP approach. | Performance-critical scenarios with extremely large word lists or very long words, where the overhead of sorting becomes a bottleneck.

Frequently Asked Questions (FAQ)

How would this solution handle non-alphabetic characters like numbers or punctuation?

As written, the solution treats every character as part of the signature, so digits and punctuation count just like letters. For example, "rail-safety" and "fairy-tales" would still be reported as anagrams, because each contains exactly one hyphen, whereas "dormitory" and "dirty room" would be rejected, because the space makes their lengths differ. To compare letters only, add a preprocessing step that strips every character that is not a letter before the length check and before get_word_signature builds the signature.
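
One way to write such a preprocessing step is the built-in replace( ) function with a regular expression. This is only a sketch with hypothetical names: it assumes the regex parameter is available on your release (newer releases offer pcre as well), and it treats only a to z as letters.

```abap
REPORT zdemo_strip_non_letters.

" Lowercase first, then delete every character outside a-z.
" occ = 0 tells replace( ) to process all occurrences.
DATA(lv_cleaned) = replace( val   = to_lower( `Rail-Safety 2!` )
                            regex = `[^a-z]`
                            with  = ``
                            occ   = 0 ).

ASSERT lv_cleaned = `railsafety`.
```

If the cleaned words are used for the identity check, the length check, and the signature alike, "dormitory" and "dirty room" would be treated as anagrams.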

Is this solution optimized for very large datasets of candidate words?

The solution is reasonably efficient. The key optimizations are normalizing the target word only once and using the length check as a fast-fail mechanism. For millions of candidates, the "Character Frequency Map" approach might offer better performance, but for most typical business scenarios in SAP, this "Sort and Compare" method provides an excellent balance of performance and code clarity.

Can I use this logic in a procedural report program instead of a class?

Absolutely. The core logic from the find and get_word_signature methods can be extracted and placed into a FORM...ENDFORM subroutine or a Function Module. However, using a class (as shown) is the recommended modern ABAP practice as it promotes encapsulation, reusability, and easier unit testing.
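
To illustrate the unit-testing point, here is a minimal ABAP Unit sketch that could live in the local test include of zcl_anagram. The test class and method names are hypothetical, and the expected results simply restate the rules of the problem.

```abap
CLASS ltcl_anagram DEFINITION FINAL FOR TESTING
  DURATION SHORT
  RISK LEVEL HARMLESS.

  PRIVATE SECTION.
    METHODS finds_case_insensitive_match FOR TESTING.
    METHODS skips_identical_word FOR TESTING.
ENDCLASS.

CLASS ltcl_anagram IMPLEMENTATION.

  METHOD finds_case_insensitive_match.
    " "Silent" is an anagram of "listen"; "google" is not.
    DATA(lt_actual) = NEW zcl_anagram( )->find(
      i_target_word = `listen`
      it_candidates = VALUE #( ( `Silent` ) ( `google` ) ) ).

    cl_abap_unit_assert=>assert_equals(
      act = lt_actual
      exp = VALUE string_table( ( `Silent` ) ) ).
  ENDMETHOD.

  METHOD skips_identical_word.
    " A word is never its own anagram, whatever its casing.
    DATA(lt_actual) = NEW zcl_anagram( )->find(
      i_target_word = `stop`
      it_candidates = VALUE #( ( `STOP` ) ( `stop` ) ) ).

    cl_abap_unit_assert=>assert_initial( act = lt_actual ).
  ENDMETHOD.

ENDCLASS.
```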

What is the key difference between using `string` and `c` data types here?

The string data type is dynamic in length, making it ideal for the input parameters where word lengths are unknown. The c data type has a fixed length. We use c LENGTH 1 in our internal table lt_chars because we know for a fact that each row will only ever hold a single character, making it a very precise and efficient choice for that specific task.

Why can't `SPLIT ... AT ''` break a word into characters?

SPLIT divides a string at each occurrence of a separator; it has no mode that splits after every character. The literal '' is not an empty separator either: ABAP treats it as a text field literal containing a single blank, so `SPLIT i_word AT '' INTO TABLE ...` splits at blanks and leaves a word without blanks in a single row. To deconstruct a string into its characters, loop strlen( ) times and read one character per pass with substring( ) or offset/length access.

What are some common pitfalls when working with strings in ABAP?

A common pitfall is forgetting about trailing spaces, especially when working with fixed-length c fields, which can affect comparisons. Another is case sensitivity; unless you consistently use functions like to_lower() or to_upper(), you might get unexpected results. Finally, not being aware of performance implications of operations inside a large loop (like calling the same function with the same inputs repeatedly) can slow down your programs significantly.
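
The first two pitfalls are easy to reproduce. The following sketch uses hypothetical variable names and contrasts a fixed-length c field with a string that carries a trailing blank.

```abap
REPORT zdemo_string_pitfalls.

" A c field is padded with blanks; a string keeps exactly what it is given.
DATA lv_fixed TYPE c LENGTH 5 VALUE 'stop'.
DATA lv_text  TYPE string VALUE `stop `.

" Trailing blanks in c fields are ignored by comparisons and strlen( ).
ASSERT lv_fixed = 'stop'.
ASSERT strlen( lv_fixed ) = 4.

" Trailing blanks in strings are significant.
ASSERT lv_text <> `stop`.
ASSERT strlen( lv_text ) = 5.

" Comparisons are case-sensitive until both sides are normalized.
ASSERT `Stop` <> `stop`.
ASSERT to_lower( `Stop` ) = `stop`.
```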


Conclusion and Next Steps

You have successfully navigated the Anagram problem, transforming a set of requirements into a clean, efficient, and reusable ABAP Objects solution. This journey has reinforced fundamental skills in string manipulation, internal table operations, and algorithmic design. The "Sort and Compare" technique, centered on creating a canonical signature for each word, is a powerful pattern that you can apply to a wide range of data comparison and cleansing tasks in your SAP projects.

By building this solver, you've not only solved a classic computer science puzzle but have also sharpened the exact skills needed to be an effective modern ABAP developer. The ability to break down a problem, design a logical flow, write clean code, and consider alternative approaches is what separates a good developer from a great one.

Technology Disclaimer: The code and concepts presented in this article are based on modern ABAP syntax available in SAP S/4HANA and recent versions of SAP NetWeaver (7.5x and higher). While the core logic is adaptable, specific syntax like inline declarations (DATA(...)) may need to be adjusted for older systems.

Ready to tackle the next challenge? Continue your journey through the ABAP learning modules on kodikra.com to build upon these skills. For a broader overview of the language, explore our complete ABAP language guide.


Published by Kodikra, your trusted ABAP learning resource.