Fractile on song as ARIA awards £5m to AI chip design pioneer

25 Oct, 2024
Newsdesk
Credit – Fractile

Fractile, a UK startup taking a radically different approach to AI chip design, has received £5 million in funding from the Advanced Research + Invention Agency (ARIA).

Founded in 2022 by AI PhD Walter Goodwin, Fractile has already built a world-class team with senior hires from NVIDIA, Arm and Imagination.

In July, the company raised $15m in a round led by Kindred Capital, NATO Innovation Fund and OSE, with ‘angel’ contributions from serial Cambridge entrepreneurs Stan Boland and Hermann Hauser.

Fractile is one of 12 projects to which ARIA is awarding nearly £50m through its Scaling Compute programme, which aims to open up new vectors of progress in computing by reducing the cost of AI hardware.

ARIA was created by an Act of Parliament and is sponsored by the Department for Science, Innovation and Technology. It funds projects across the full spectrum of R&D disciplines, approaches and institutions, looking at how technology can enable a better future and prove critical for the UK in the long term.

Becoming an ARIA R&D Creator is a significant milestone for the Fractile team, as the UK startup continues its journey to build the chips and systems needed to reach the next frontier of AI performance.

Stan Boland

Fractile is regarded as a genuine game-changer. Boland told us after deciding to invest: “Fractile is not focused on training. It’s in training that there’s a huge CUDA* lock by NVIDIA that Graphcore never overcame – that plus insufficient memory bandwidth and inability to scale as the models got bigger.

“By contrast, Fractile’s system is highly scalable and focused on inference. That’s the process (using the Large Language Model example) where the model’s already been built, the training is done and the training weights are known; the job is to take input tokens, take the weights, do some maths and produce output tokens.

“Inference is heavily dominated by matrix-vector multiplies (like 99.999 per cent of the operations) and GPUs are terrible at doing them because for every token, a huge number of weights have to be pulled onto the chip to do some fairly trivial maths, then the result written back to memory.

“That’s why in-memory compute (IMC) is the way to go. Graphcore was also an IMC architecture, as are the ~20 other startups in the space (albeit hampered in Graphcore’s case by lack of scalability), so in itself it’s not surprising that Fractile is adopting this approach.

“But Walter is also doing something very interesting alongside it, which I can’t share right now, but is the key to him getting 100x performance, cost and power advantage.

“Once the models and weights are known, it’s relatively simple to offload the inference task onto different processors; there is no effective CUDA lock. There’s a long way to go on Fractile, but it’s definitely one to watch.”
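The matrix-vector bottleneck Boland describes can be sketched in a few lines. This is a hypothetical illustration of the general principle, not Fractile's design: for each generated token, every weight in a layer must be fetched from memory to perform roughly one multiply-add, giving a very low ratio of arithmetic to data movement.

```python
import numpy as np

# Illustrative sketch: the per-token inference step reduces to
# matrix-vector products between frozen weight matrices and a small
# activation vector. The hidden size below is a hypothetical example.
d_model = 4096
W = np.random.randn(d_model, d_model).astype(np.float32)  # frozen weights
x = np.random.randn(d_model).astype(np.float32)           # one token's activations

y = W @ x  # the operation that dominates LLM inference

# Arithmetic intensity: floating-point operations per byte of weight
# traffic. Every weight is fetched once per token for one multiply-add.
flops = 2 * d_model * d_model   # one multiply and one add per weight
bytes_moved = W.nbytes          # fp32: 4 bytes per weight
intensity = flops / bytes_moved # 0.5 FLOPs per byte for fp32

print(f"{intensity:.2f} FLOPs per byte of weight traffic")
```

At half a floating-point operation per byte moved, a GPU's arithmetic units spend most of their time waiting on memory, which is the case for in-memory compute: performing the multiply where the weights already reside instead of shuttling them on and off chip.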

* CUDA is a software layer that gives direct access to the GPU's virtual instruction set and parallel computational elements for the execution of compute kernels.