Thanks to what it describes as a revolutionary PPU, Flow Computing says the performance of any processor can increase up to 100-fold.

According to Flow Computing, a Finnish startup under the umbrella of VTT Technical Research Center, it is possible to increase the performance of any processor by 100 times.

The Finnish startup Flow Computing, under the umbrella of VTT Technical Research Center, is developing a technology consisting of a Parallel Processing Unit (PPU) and a companion compiler that it says can accelerate CPU code by up to 100 times. Although the work is still in progress, the company claims the technology can roughly double the performance of existing code essentially overnight, without any modification.

According to Flow Computing, the new PPU is designed to break the decades-long stagnation in CPU performance. The company argues that this stagnation has left CPUs as the weakest link in computing, but states that its PPU is broadly compatible with existing CPU architectures. Flow Computing has even prepared optimized PPU licenses targeting mobile, PC, and data center processors.

A radical approach with solid footing

Notably, Flow states that it has not produced a chip and does not intend to. Instead, like Arm, it hopes to progress by licensing its IP to companies such as AMD, Apple, Intel, Nvidia, and Qualcomm. The company’s boldest claim is that an integrated PPU can deliver a 100-fold performance increase independent of architecture and with full backward software compatibility. That headline figure, however, requires recompiling software for the PPU; even without recompilation, Flow asserts that performance can be nearly doubled with minimal changes.

Documents published by The Verge suggest that Flow’s claims are not as outlandish as they might first appear: the company does not promise magic, and it targets credible goals by focusing on thread synchronization, concurrent memory access, and context-switching overhead in CPU-based threaded code.
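To make that target concrete, here is a minimal, generic POSIX-threads pattern; this is not Flow’s code, merely an illustration of the bottlenecks named above. Every thread contends on a single mutex-protected counter, so much of the runtime goes to lock traffic and context switching rather than useful work.

```c
/*
 * Illustrative only: a generic pthreads pattern (not Flow's code)
 * showing the kind of synchronization overhead described above.
 * All threads contend on one mutex-protected counter, so execution
 * is dominated by lock traffic rather than the increments themselves.
 */
#include <pthread.h>
#include <stdio.h>

#define THREADS 8
#define ITERS   1000000L

static long counter = 0;
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

static void *worker(void *arg)
{
    (void)arg;
    for (long i = 0; i < ITERS; i++) {
        pthread_mutex_lock(&lock);   /* serializes all threads */
        counter++;
        pthread_mutex_unlock(&lock);
    }
    return NULL;
}

int main(void)
{
    pthread_t t[THREADS];
    for (int i = 0; i < THREADS; i++)
        pthread_create(&t[i], NULL, worker, NULL);
    for (int i = 0; i < THREADS; i++)
        pthread_join(t[i], NULL);
    printf("counter = %ld\n", counter);  /* expect THREADS * ITERS */
    return 0;
}
```

Built with `gcc -pthread`, code of this shape scales poorly as threads are added, which is exactly the class of problem Flow says the PPU is meant to relieve.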

For programs that use POSIX/Linux pthreads, the company promises a doubling of performance simply from running the source code through the Flow compiler. With minor software adjustments, it claims a 10-fold increase, and with full recompilation, as mentioned earlier, gains of up to 100-fold. Moreover, the company emphasizes that the recompiled code will be simpler, because the PPU natively understands vector/matrix operations.
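As a rough sketch of what “simpler” could mean (an assumption for illustration, not Flow’s toolchain or documented behavior), the kind of vector/matrix kernel referred to can be written as plain loops; the claim is that a compiler targeting hardware with native vector/matrix support could keep it in this straightforward form, instead of the manual threading and SIMD-intrinsic scaffolding such kernels usually accumulate on CPUs.

```c
/*
 * Illustrative sketch, not Flow's compiler output: a plain
 * matrix-vector multiply expressed as simple loops. On hardware with
 * native vector/matrix support, the idea is that this naive form is
 * all the programmer would need to write.
 */
#include <stddef.h>

void matvec(size_t rows, size_t cols,
            const float a[rows][cols], const float x[cols], float y[rows])
{
    for (size_t i = 0; i < rows; i++) {
        float sum = 0.0f;
        for (size_t j = 0; j < cols; j++)
            sum += a[i][j] * x[j];   /* inner product of row i with x */
        y[i] = sum;
    }
}
```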

The company’s focus is not on increasing the performance of every processor but rather on dedicated processors aimed at performance-sensitive threaded workloads. This includes applications such as games, video codecs, and networking software.

Time is needed

The company scales its PPU solution from a few cores up to 256 cores. They estimate that a 64-core PPU on a 3 nm process node would occupy 22 mm², significantly less than traditional CPUs, and require 43 W of power. However, the company does not aim to replace CPUs or GPUs. Instead, their goal is to alleviate the workload on CPUs and enable a more flexible gaming environment. Regarding GPUs, there are clear distinctions: PPUs are optimized for parallel processing, whereas GPUs are specialized for graphics processing, making them difficult to substitute for each other. Therefore, the PPU is envisioned as a Co-CPU.

While these ambitions sound bold, for now it is advisable to treat them as exactly that: claims. The company’s foremost challenge ahead is completing its first RISC-V design. Flow has indicated it will provide more technical details in the second half of the year, and hopefully concrete materials will be available then.
