Sieve in Cobol: Complete Solution & Deep Dive Guide

Tabs labeled

Cobol's Hidden Power: Implementing the Sieve of Eratosthenes from Scratch

The Sieve of Eratosthenes is a highly efficient, ancient algorithm used to find all prime numbers up to a specified integer. This guide demonstrates how to implement this classic method using Cobol, leveraging its robust data structures and procedural logic for powerful batch processing and algorithmic problem-solving.

You’ve probably heard whispers of Cobol in the halls of finance and government, a language powering critical systems yet often dismissed as a relic. You might think it's all about crunching numbers for payroll and banking transactions. But what if that same structured, reliable power could be harnessed to solve one of the oldest problems in mathematics: finding prime numbers?

The challenge isn't just about the algorithm; it's about seeing a classic language in a new light. This article bridges that gap. We will take you from the theoretical foundations of the Sieve of Eratosthenes to a practical, line-by-line implementation in Cobol, proving that foundational computer science and enterprise programming are a perfect match.


What Is the Sieve of Eratosthenes?

The Sieve of Eratosthenes is an elegant and surprisingly simple algorithm for finding all prime numbers within a given range. Devised by the Greek mathematician Eratosthenes of Cyrene over two millennia ago, its logic mimics the physical act of sifting through numbers to find the ones that are "prime" or indivisible.

Imagine you have a list of all integers from 2 up to a limit, say 100. The algorithm works as follows:

  • You start with the first prime number, which is 2.
  • You then "cross out" or mark all multiples of 2 (4, 6, 8, etc.) from your list. These numbers cannot be prime because they are divisible by 2.
  • Next, you move to the next unmarked number, which is 3. This is your next prime.
  • You then cross out all multiples of 3 (6, 9, 12, etc.).
  • You continue this process, moving to the next unmarked number (which will always be a prime) and eliminating its multiples.

When you can no longer find a new unmarked number whose square is within your limit, the process is complete. All the numbers that remain unmarked in your list are the prime numbers.

    ● Start with a list of numbers [2, 3, 4, ..., N]
    │
    ▼
  ┌─────────────────┐
  │  Let p = 2      │
  │ (first prime)   │
  └────────┬────────┘
           │
           ▼
  ┌─────────────────────────────────┐
  │ Loop while p*p ≤ N              │
  ├─────────────────────────────────┤
  │   ◆ Is p marked as composite?   │
  │  ╱           ╲                  │
  │ No           Yes                │
  │ │            └─────────────────┐│
  │ ▼                              ││
  │ Mark all multiples of p        ││
  │ (2p, 3p, ...) as composite     ││
  │                                ││
  │ ▼                              ││
  │ Find next unmarked number > p  ││
  │ Set it as the new p            ││
  │ └──────────────────────────────┘│
  └─────────────────────────────────┘
           │
           ▼
    ● All remaining unmarked numbers are prime.

Why Use Cobol for This Algorithm?

At first glance, pairing an ancient Greek algorithm with Cobol might seem unusual. However, Cobol's design philosophy makes it surprisingly well-suited for this task, especially when viewed through the lens of its typical operating environment on mainframes.

Strengths of Cobol for Algorithmic Tasks:

  • Strong Data Structuring: Cobol's DATA DIVISION is exceptionally clear and powerful for pre-defining data structures. The use of tables (arrays) with the OCCURS clause is a perfect fit for representing the list of numbers in the sieve.
  • Batch Processing Prowess: The Sieve is a classic batch process—it takes a defined input (a limit), performs a series of deterministic operations, and produces a final output. This is precisely the kind of workload Cobol was built to handle efficiently.
  • Readability and Maintenance: Well-written Cobol is often described as self-documenting. Its verbose, English-like syntax makes the logic of loops and conditions transparent, which is a core principle of the kodikra learning path curriculum.
  • Performance on Fixed Data: For a fixed-size problem where memory is allocated upfront, Cobol can be very performant. It doesn't have the overhead of dynamic memory allocation found in many modern languages, making its execution predictable.

Implementing the Sieve in Cobol is not just a theoretical exercise; it’s a practical demonstration of how to manage arrays and perform iterative logic in a language that powers a significant portion of the global financial infrastructure. This module from kodikra.com is designed to build these foundational skills.


How to Implement the Sieve in Cobol: The Complete Code

Here is a complete, well-commented Cobol program that implements the Sieve of Eratosthenes. This solution is designed for clarity and follows best practices taught in the kodikra.com exclusive curriculum. It uses a boolean-like flag (0 for prime, 1 for composite) in a table to represent the sieve.


       IDENTIFICATION DIVISION.
       PROGRAM-ID. SieveOfEratosthenes.
       AUTHOR. Kodikra.
      *================================================================
      * This program finds all prime numbers up to a specified limit
      * using the Sieve of Eratosthenes algorithm.
      *================================================================
       ENVIRONMENT DIVISION.
       CONFIGURATION SECTION.
       
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       
      *-- Constants and Configuration
       01 WS-CONFIG.
          05 WS-LIMIT              PIC 9(4) VALUE 1000.
          05 WS-LINE-BREAK         PIC 9(2) VALUE 10. *> Primes per line
          
      *-- Sieve Data Structure
      *   We use a table (array) to represent numbers from 0 to WS-LIMIT.
      *   A value of 0 means the number is potentially prime.
      *   A value of 1 means the number is composite (not prime).
       01 WS-SIEVE-TABLE.
          05 WS-IS-COMPOSITE    PIC 9(1) OCCURS 0 TO 1000 TIMES
                               DEPENDING ON WS-LIMIT
                               INDEXED BY IDX.

      *-- Loop Counters and Calculation Variables
       01 WS-COUNTERS.
          05 P                     PIC 9(4). *> Current prime candidate
          05 I                     PIC 9(4). *> Multiple marker
          05 P-SQUARED             PIC 9(8). *> To check P*P <= LIMIT
          05 PRIME-COUNT           PIC 9(4) VALUE 0.
          05 PRINT-COUNTER         PIC 9(2) VALUE 0.

       PROCEDURE DIVISION.
       
       000-MAIN-PROCEDURE.
           DISPLAY "Finding prime numbers up to " WS-LIMIT "..."
           
           PERFORM 100-INITIALIZE-SIEVE
           PERFORM 200-PERFORM-SIEVE
           PERFORM 300-DISPLAY-PRIMES
           
           DISPLAY " "
           DISPLAY "Found " PRIME-COUNT " prime numbers."
           STOP RUN.

      *----------------------------------------------------------------
      * 100-INITIALIZE-SIEVE
      * Initializes the entire sieve table.
      * Sets 0 and 1 as composite, and all others as potentially prime (0).
      *----------------------------------------------------------------
       100-INITIALIZE-SIEVE.
      *    Initialize all numbers to 0 (potentially prime)
           MOVE ALL '0' TO WS-SIEVE-TABLE.
           
      *    0 and 1 are not prime numbers, so mark them as composite (1)
           MOVE 1 TO WS-IS-COMPOSITE(0).
           MOVE 1 TO WS-IS-COMPOSITE(1).

      *----------------------------------------------------------------
      * 200-PERFORM-SIEVE
      * The core logic of the Sieve of Eratosthenes.
      * It iterates from 2 up to the square root of the limit.
      *----------------------------------------------------------------
       200-PERFORM-SIEVE.
           PERFORM VARYING P FROM 2 BY 1 UNTIL P * P > WS-LIMIT
      *        If P has not been marked, it is a prime
               IF WS-IS-COMPOSITE(P) = 0
      *            Mark all multiples of P as composite
                   PERFORM VARYING I FROM P * P BY P UNTIL I > WS-LIMIT
                       MOVE 1 TO WS-IS-COMPOSITE(I)
                   END-PERFORM
               END-IF
           END-PERFORM.

      *----------------------------------------------------------------
      * 300-DISPLAY-PRIMES
      * Iterates through the sieve and prints all numbers that
      * were not marked as composite.
      *----------------------------------------------------------------
       300-DISPLAY-PRIMES.
           DISPLAY " "
           DISPLAY "Prime numbers are:"
           
           PERFORM VARYING IDX FROM 2 BY 1 UNTIL IDX > WS-LIMIT
               IF WS-IS-COMPOSITE(IDX) = 0
                   ADD 1 TO PRIME-COUNT
                   ADD 1 TO PRINT-COUNTER
                   
      *            Display the prime number with leading zero suppression
                   DISPLAY IDX WITH NO ADVANCING
                   
      *            Add spacing and line breaks for readability
                   IF PRINT-COUNTER < WS-LINE-BREAK
                       DISPLAY " " WITH NO ADVANCING
                   ELSE
                       DISPLAY " " *> Move to the next line
                       MOVE 0 TO PRINT-COUNTER
                   END-IF
               END-IF
           END-PERFORM.

Detailed Code Walkthrough

Let's break down the program's structure and logic, following its execution flow from top to bottom.

    ● Start Program
    │
    ▼
  ┌──────────────────────┐
  │ 000-MAIN-PROCEDURE   │
  └──────────┬───────────┘
             │
             ├─► Call 100-INITIALIZE-SIEVE
             │   └─ Set all array elements to '0'
             │      Mark 0 and 1 as composite ('1')
             │
             ├─► Call 200-PERFORM-SIEVE
             │   └─ Loop P from 2 to sqrt(LIMIT)
             │      IF P is prime (value is '0')
             │         └─ Inner loop: Mark multiples of P as composite
             │
             ├─► Call 300-DISPLAY-PRIMES
             │   └─ Loop through the array
             │      IF element is prime (value is '0')
             │         └─ Display the index number
             │
             ▼
    ● Stop Run

1. IDENTIFICATION and ENVIRONMENT DIVISION

These are standard Cobol boilerplate. The PROGRAM-ID gives our program a name, SieveOfEratosthenes. For this simple console application, the ENVIRONMENT DIVISION is minimal.

2. DATA DIVISION

This is where the magic happens. We define all our variables and data structures.

  • WS-LIMIT: A constant that sets the upper bound for our prime number search. Changing this single value allows the program to handle different ranges.
  • WS-SIEVE-TABLE: This is our core data structure. The OCCURS ... DEPENDING ON WS-LIMIT clause creates a dynamic-sized array (up to a max of 1000 in this case) whose actual size is determined by WS-LIMIT at runtime. Each element, WS-IS-COMPOSITE, is a single digit (PIC 9(1)) that acts as a flag.
  • WS-COUNTERS: A group of variables for controlling our loops (P, I), storing a calculated value (P-SQUARED), and counting results (PRIME-COUNT, PRINT-COUNTER).

3. PROCEDURE DIVISION

This division contains the executable logic, broken into paragraphs (similar to functions or methods).

  • 000-MAIN-PROCEDURE: This is the entry point. It orchestrates the entire process by calling the other paragraphs in the correct sequence: initialize, sieve, and display. Finally, it prints a summary and executes STOP RUN to terminate the program.
  • 100-INITIALIZE-SIEVE: This paragraph prepares our array. It first sets every single element to '0', assuming all numbers are prime. It then explicitly marks indices 0 and 1 as composite (value '1') because, by definition, they are not prime numbers.
  • 200-PERFORM-SIEVE: This is the heart of the algorithm.
    • The outer loop, PERFORM VARYING P FROM 2 BY 1 UNTIL P * P > WS-LIMIT, iterates through potential prime numbers. The condition P * P > WS-LIMIT is a critical optimization; we only need to check for factors up to the square root of the limit.
    • Inside, IF WS-IS-COMPOSITE(P) = 0 checks if our current number P has been marked as composite. If not, it must be a prime.
    • The inner loop, PERFORM VARYING I FROM P * P BY P ..., then marks all multiples of this newly found prime P as composite. It starts from P*P because any smaller multiples (like 2*P, 3*P) would have already been marked by smaller primes (2, 3, etc.).
  • 300-DISPLAY-PRIMES: After the sieve is complete, this paragraph iterates through the entire WS-SIEVE-TABLE from 2 to WS-LIMIT. If an element WS-IS-COMPOSITE(IDX) is still '0', it means the index IDX is a prime number, and it gets displayed on the screen. The logic also includes formatting to print a fixed number of primes per line for better readability.

Compiling and Running the Code

To run this Cobol program, you'll need a Cobol compiler. GnuCOBOL (formerly OpenCOBOL) is a popular open-source option available for Windows, macOS, and Linux.

1. Save the code in a file named sieve.cob.

2. Open your terminal or command prompt and compile the code using the following command:


cobc -x -free sieve.cob

The -x flag creates an executable file, and -free specifies that the source code uses a free-format layout (not the rigid column-based format of older Cobol versions).

3. Run the compiled program:


./sieve

The program will then execute and print the list of prime numbers up to the limit defined in the DATA DIVISION.


Pros, Cons, and Alternative Approaches

While the Sieve of Eratosthenes is highly efficient, it's essential to understand its trade-offs, a key aspect of algorithm analysis covered in the Cobol 4 learning module.

Pros & Cons of the Sieve of Eratosthenes

Pros (Advantages) Cons (Disadvantages)
Fast for a Given Range: It's one of the fastest ways to find all primes up to a reasonably large integer n. Memory Intensive: It requires an array of size n, which can consume significant memory for very large limits.
Simplicity: The algorithm is straightforward to understand and implement. Not for Single Primes: It is inefficient if you only need to check if one specific, large number is prime.
No Division or Modulo: The core logic relies on addition and array access, which are computationally cheaper than division operations. Static Range: The upper limit must be known in advance to allocate the necessary memory.

Alternative Prime-Finding Algorithms

  • Trial Division: This is the most basic method. To test if a number n is prime, you divide it by all integers from 2 up to the square root of n. It's simple but very slow for finding all primes in a range compared to the sieve.
  • Sieve of Atkin: A more complex and optimized version of the sieve algorithm. It has a better theoretical time complexity but is much harder to implement correctly. For most practical purposes up to a reasonable limit, the Sieve of Eratosthenes offers a better balance of performance and simplicity.
  • Segmented Sieve: An optimization of the basic sieve that breaks the range into smaller segments. This allows finding primes in a very large range without requiring a single massive array, thus reducing memory consumption.

Frequently Asked Questions (FAQ)

Is Cobol still relevant today?

Absolutely. Cobol is the backbone of the global financial system, running on mainframes in banking, insurance, and government agencies. Billions of lines of Cobol code are still in production, and there is a high demand for developers who can maintain and modernize these critical systems. Learning it provides a unique and valuable skill set.

Why is the Sieve of Eratosthenes so memory-intensive?

The algorithm requires a data structure, typically an array or a boolean list, to keep track of every number up to the specified limit N. If you want to find primes up to 1 billion, you need an array with 1 billion elements, which can consume gigabytes of RAM. This is its primary limitation.

Can this Cobol program handle very large numbers?

The program's limit is constrained by two factors: the PIC clauses defining the variables and the available memory. The current PIC 9(4) for WS-LIMIT allows numbers up to 9999. To handle larger numbers, you would need to increase the picture clauses (e.g., PIC 9(8)) and ensure your system has enough memory to support the corresponding OCCURS clause in the sieve table.

What does OCCURS ... DEPENDING ON mean in the Cobol code?

The OCCURS clause is how Cobol defines an array (called a table). The DEPENDING ON WS-LIMIT addition makes the table's size variable at runtime, up to the maximum value specified in the clause. This is a powerful feature for creating data structures that adapt to the input data size without overallocating memory.

Why does the inner loop start from p*p?

This is a crucial optimization. Any composite number less than p*p must have a prime factor that is smaller than p. For example, when we get to p=5, we don't need to check 10 (2*5), 15 (3*5), or 20 (4*5=2*2*5). These multiples have already been marked as composite by smaller primes (2 and 3). The first multiple we need to mark for prime p is always p*p.

How can I compile and run this Cobol code?

You need a Cobol compiler. A great free and open-source option is GnuCOBOL. Once installed, you can save the code as a .cob file and compile it from your terminal using the command cobc -x -free your_program.cob. This creates an executable file you can run directly.


Conclusion

We've journeyed from an ancient Greek mathematical concept to a modern implementation in one of the most enduring programming languages in history. By building the Sieve of Eratosthenes in Cobol, you have not only solved a classic computer science problem but also gained practical experience with Cobol's powerful data definition and procedural logic capabilities.

This exercise demonstrates that the principles of efficient algorithm design are timeless and language-agnostic. Cobol, with its structured nature and reliability, proves to be a capable tool for more than just business logic. Mastering these fundamentals is key to becoming a proficient developer, whether you're working on legacy mainframe systems or modern applications.

To continue your journey, we highly recommend exploring the other challenges in the kodikra Cobol 4 learning module. For a broader understanding of the language's features, check out our complete Cobol guide.

Disclaimer: The code in this article was developed and tested using GnuCOBOL 3.1.2. While it uses standard Cobol syntax, behavior may vary slightly with other compilers or versions.


Published by Kodikra — Your trusted Cobol learning resource.