A 25-Year Compute Journey Measured by the Sieve of Eratosthenes
Counting prime numbers up to a given limit is a classic rite of passage for new programmers. It’s often assigned early in a course because it naturally introduces key concepts such as variable types, loops, arrays, procedures, and basic algorithm optimization.
A prime number is a whole number greater than 1 that can be divided evenly only by 1 and itself. For example, 7 is prime because its only divisors are 1 and 7, while 15 is not prime because it can be divided evenly by 1, 3, 5, and 15.
Most beginners approach prime detection by checking divisibility and examining remainders. While this method works, and can be improved with smarter logic, it is not the most efficient for large ranges.
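The remainder-checking approach described above can be sketched as trial division. This is a minimal illustration, not the code used for the benchmarks; the function names `is_prime_trial` and `count_primes_trial` are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Trial division: n is prime if no d in [2, sqrt(n)] divides it evenly. */
bool is_prime_trial(uint32_t n)
{
    if (n < 2) return false;
    if (n < 4) return true;          /* 2 and 3 are prime */
    if (n % 2 == 0) return false;    /* even numbers above 2 are composite */

    /* Test odd divisors only; "d <= n / d" avoids overflow in d * d. */
    for (uint32_t d = 3; d <= n / d; d += 2) {
        if (n % d == 0) return false;
    }
    return true;
}

/* Count primes in [2, limit] by testing each candidate separately. */
uint32_t count_primes_trial(uint32_t limit)
{
    uint32_t count = 0;
    /* A 64-bit loop variable keeps the loop finite when limit is UINT32_MAX. */
    for (uint64_t n = 2; n <= limit; n++) {
        if (is_prime_trial((uint32_t)n)) count++;
    }
    return count;
}
```

Every candidate repeats its own run of divisions, which is why this approach slows down quickly as the limit grows.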
An elegant alternative dates back more than two thousand years, to the 3rd century BCE. The mathematician Eratosthenes of Cyrene developed a remarkably efficient technique now known as the Sieve of Eratosthenes. This algorithm is frequently introduced alongside prime-counting assignments because it relies on array manipulation and demonstrates how algorithmic thinking can dramatically improve performance.
To see how this timeless mathematical problem scales across physical technology, we compiled and ran a native prime-counting C application on machines spanning a quarter century, tracking how hardware evolution changes raw computational speed.
Goal: Find all prime numbers up to a chosen limit.
Because the sieve allocates one large array and repeatedly strides through memory to cross off flags, it makes a useful benchmark not only of raw processor clock speed but also of memory latency and CPU cache capacity.
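The allocate-and-cross-off pattern can be sketched in C as follows. This is an illustrative version under stated assumptions, not the benchmarked application itself: the function name `count_primes_sieve` is hypothetical, and the one-byte-per-number flag layout is an assumption, since the article does not say how the flags are stored.

```c
#include <stdint.h>
#include <stdlib.h>

/* Count primes in [2, limit] with a byte-per-number Sieve of Eratosthenes.
 * Returns 0 if limit < 2 or if the flag array cannot be allocated. */
uint64_t count_primes_sieve(uint64_t limit)
{
    if (limit < 2 || limit >= SIZE_MAX) return 0;

    /* composite[i] == 0 means i is still a prime candidate (assumed layout). */
    uint8_t *composite = calloc((size_t)limit + 1, sizeof *composite);
    if (composite == NULL) return 0;

    /* "p <= limit / p" is p * p <= limit without the overflow risk. */
    for (uint64_t p = 2; p <= limit / p; p++) {
        if (composite[p]) continue;
        /* Start at p * p: smaller multiples were crossed off by smaller primes. */
        for (uint64_t m = p * p; m <= limit; m += p) {
            composite[m] = 1;
        }
    }

    uint64_t count = 0;
    for (uint64_t i = 2; i <= limit; i++) {
        if (!composite[i]) count++;
    }

    free(composite);
    return count;
}
```

Calling `count_primes_sieve(100)` returns 25. The inner loop writes to addresses that are `p` bytes apart, so for larger primes almost every write touches a different cache line, which is what makes the run time sensitive to cache size and memory latency.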
How does your current machine handle the Sieve? Select a calculation threshold, click the button below, and see your browser's speed map directly into the live chart and data table!
To compare all systems fairly, we mapped out the execution times for calculation targets up to 100 Million. Run the 100M benchmark above to see your device plot instantly against legacy and cutting-edge desktop silicon.
The processing times recorded below showcase raw calculation speeds (measured in milliseconds). Lower numbers indicate faster performance.
| Hardware / OS Environment | 1M (ms) | 10M (ms) | 100M (ms) | 1B (ms) |
|---|---|---|---|---|
| Your Device (This Browser) | — | — | — | N/A (Browser Limit) |
| AMD Ryzen 7 9800X3D (Win 11, 64-bit native) | 1 | 11 | 167 | 3,439 |
| AMD Ryzen 7 9800X3D (Linux Mint, VirtualBox VM) | 1 | 8 | 208 | 4,107 |
| AMD Ryzen 7 9800X3D (Win XP 32-bit, VirtualBox VM) | 1.76 | 12.49 | 5,299 | — |
| MacBook Air 2018 (macOS Sonoma) | 6.34 | 62.08 | 797 | 9,502 |
| MacBook Air 2017 (Linux Mint) | 10.3 | 69.69 | 949 | 11,792 |
| MacBook Pro Early 2015 (Win 11, 64-bit) | 5.27 | 91.38 | 1,130 | 13,170 |
| MacBook Pro Early 2015 (macOS Monterey) | 4.24 | 74.08 | 1,221 | 14,629 |
| Acer Aspire One Netbook (WinXP, 1.6GHz Atom) | 42.44 | 639.00 | 7,386 | — |
| Dell Inspiron 5000e (WinXP, PIII 700MHz) | 143.13 | 1,905.45 | 24,080 | — |
| Dell Inspiron 5000e (WinXP, PIII 550MHz) | 152.37 | 1,977.78 | 25,498 | — |
Looking closely at the 100 Million column, the year-2000 Dell Inspiron 5000e running an Intel Pentium III at 550MHz sputtered through the operation in a sluggish 25.5 seconds (25,498 ms). Flash forward to the AMD Ryzen 7 9800X3D: it chews through the exact same logic in 167 milliseconds. That is roughly a 153x speedup (25,498 / 167 ≈ 152.7), highlighting the massive cumulative effect of frequency improvements, branch prediction refinements, and Instructions Per Clock (IPC) gains over 25 years.
A classic Sieve of Eratosthenes needs one large, contiguous array of flags. At a limit of 10 Million, that array (roughly 10 MB if each flag takes one byte) fits comfortably within the Ryzen 7 9800X3D's 96MB of L3 cache, 64MB of which is stacked 3D V-Cache. Because the CPU rarely has to fetch data from system RAM, the run completes in a jaw-dropping 8 to 11 milliseconds.
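To make the cache argument concrete, the flag array's footprint can be estimated and compared against the 96MB L3 figure quoted above. This is a rough sketch: the helper names `sieve_flag_bytes` and `fits_in_l3` are hypothetical, and the flag width is a parameter because the article does not state how the benchmarked program stores its flags.

```c
#include <stdbool.h>
#include <stdint.h>

/* L3 capacity from the text, interpreted as 96 MiB (an assumption). */
#define L3_BYTES (96ULL * 1024 * 1024)

/* Bytes needed for flags 0..limit at a given (hypothetical) flag width.
 * No overflow guard: intended for limits in the range the article tests. */
uint64_t sieve_flag_bytes(uint64_t limit, unsigned bits_per_flag)
{
    uint64_t total_bits = (limit + 1) * bits_per_flag;
    return (total_bits + 7) / 8;   /* round up to whole bytes */
}

/* True if the flag array alone is no larger than the assumed L3 capacity. */
bool fits_in_l3(uint64_t limit, unsigned bits_per_flag)
{
    return sieve_flag_bytes(limit, bits_per_flag) <= L3_BYTES;
}
```

With one byte per flag, a 10 Million limit needs about 9.5 MiB and fits easily. A 100 Million limit needs about 95.4 MiB, which nominally equals the whole cache and leaves no room for anything else, and 1 Billion needs about 954 MiB. Packing flags into single bits would shrink the 100 Million case to roughly 12 MiB, though nothing in the article indicates the benchmarked program does this.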
Look at the Ryzen 7 9800X3D's behavior inside VirtualBox. Running modern Linux Mint costs about 25% at 100M (167ms native vs 208ms in the VM). A 32-bit legacy Windows XP virtual machine, however, sends the same calculation skyrocketing to 5,299 milliseconds, roughly 32 times slower than native. The 1M and 10M runs in that VM stay close to native (1.76ms and 12.49ms), so the penalty appears only once the flag array grows large. A likely culprit is how the 32-bit guest and the hypervisor handle a large memory allocation, rather than raw instruction throughput, though these measurements alone cannot confirm the cause.
On the Intel Core i5 inside the Early 2015 MacBook Pro, the native Windows 11 build surprisingly edged out macOS Monterey once the workload scaled up to 100 Million and 1 Billion numbers (1,130ms vs 1,221ms, and 13,170ms vs 14,629ms), even though macOS was faster at 1M and 10M. This may reflect differences in how the compilers involved (GCC/Clang vs MSVC) optimize the sieve's nested loops.
Dell Inspiron 5000e: Mobile Pentium III (Coppermine architecture), leveraging SDR/early DDR laptop memory speeds.
Acer Aspire One Netbook: Intel Atom N270 netbook era. A low-power, in-order execution processor designed for portability over performance.
MacBook Pro and MacBook Air: multi-core, ultra-low-voltage Intel Core processors with high turbo clocks.
AMD Ryzen 7 9800X3D: state-of-the-art TSMC 4nm process node with stacked L3 V-Cache, operating at blistering IPC throughput.