AI chip design ‘babe’ Fractile thanks Cambridge for bouncing $15m birth

29 Jul, 2024
Tony Quested
Cambridge backers were very much to the fore in the $15 million birth of Fractile – a startup pledging “a radically different approach to AI chip design.”
Stan Boland. Credit – Five.

The company marked the seed round by claiming it has the technology to solve some of the biggest barriers to better AI performance, not least by handing model builders more design scope while cutting costs.

Serial Cambridge entrepreneurs Stan Boland and Hermann Hauser are angel investors alongside Amar Shah – co-founder of Wayve – which was hatched within Cambridge University.

The round was co-led by Kindred Capital, NATO Innovation Fund – which empowers DeepTech founders to address challenges in defence, security and resilience – and OSE (Oxford Science Enterprises), with participation from Cocoa and Inovia Capital. With this seed cash injection, Fractile has now raised $17.5m (£14m) in total funding.

Fractile claims that, with today’s hardware, AI models are very expensive to run, their performance is inhibited and their potential future capabilities are restricted. That, the company says, makes it hard for AI model builders to deliver meaningful differentiation.

London-headquartered Fractile says it is taking a radically different approach to chip design for AI inference. A key aspect of this is in-memory compute, which removes the need to shuttle model parameters to and from processor chips.
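
The article does not spell out the arithmetic behind that bottleneck, but a rough back-of-envelope sketch shows why shuttling parameters dominates single-stream inference: every generated token requires reading the full set of weights from memory. The model size, weight precision and bandwidth below are illustrative assumptions, not figures from Fractile or any chip vendor.

```python
# Back-of-envelope sketch: why shuttling weights off-chip bounds inference speed.
# All numbers below are illustrative assumptions, not Fractile or NVIDIA figures.

params = 70e9              # assumed model size: 70 billion parameters
bytes_per_param = 2        # assumed 16-bit weights
memory_bandwidth = 3.3e12  # assumed ~3.3 TB/s of memory bandwidth on a high-end accelerator

bytes_per_token = params * bytes_per_param             # every weight read once per generated token
floor_latency_s = bytes_per_token / memory_bandwidth   # bandwidth-bound lower limit

print(f"Weights streamed per token: {bytes_per_token / 1e9:.0f} GB")
print(f"Bandwidth-bound floor: {floor_latency_s * 1e3:.1f} ms/token "
      f"(~{1 / floor_latency_s:.0f} tokens/s)")
```

In-memory compute aims to remove that floor by performing the arithmetic where the weights are stored, rather than streaming them across a memory interface for every token.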

It says its chips will be able to run state-of-the-art AI models at least 100x faster and 10x cheaper, by using novel circuits to execute 99.99 per cent of the operations needed to run model inference. This strategy will also offer significant power savings.

Founded in 2022 by artificial intelligence PhD Walter Goodwin, Fractile has already built a world-class team with senior hires from NVIDIA, Arm and Imagination.

The investment will be used to grow Fractile’s team across silicon, software and AI, build partnerships and accelerate progress towards the company’s first products.

A spokesperson revealed: “There are two paths available to a company attempting to build better hardware for AI inference. The first is specialisation: homing in on very specific workloads and building chips that are uniquely suited to those requirements.

“Because model architectures evolve rapidly in the world of AI, whilst designing, verifying, fabricating and testing chips takes considerable time, companies pursuing this approach face the problem of shooting for a moving target whose exact direction is uncertain.

“The second path is to fundamentally change the way that computational operations themselves are performed, create entirely different chips from these new building blocks, and build massively scalable systems on top of these. This is Fractile’s approach, which will unlock breakthrough performance across a range of AI models both present and future.”

Dr Goodwin, who is CEO, added: “In today’s AI race, the limitations of existing hardware – nearly all of which is provided by a single company – represent the biggest barrier to better performance, reduced cost, and wider adoption.

“Fractile’s approach supercharges inference, delivering astonishing improvements in terms of speed and cost. This is more than just a speed-up – changing the performance point for inference allows us to explore completely new ways to use today’s leading AI models to solve the world’s most complex problems.

“We’re thrilled to have raised our funding from investors with a wealth of experience in the AI and chip industries, to continue to grow our world-class team and to further our technological development and partnerships.”

Stan Boland said: “There’s no question that, in Fractile, Walter is building one of the world’s future superstar companies. He’s a brilliant AI practitioner but he’s also listening intently to the market so he can be certain of building truly compelling products that other experts will want to use at scale.

“To achieve this, he’s already starting to build one of the world’s best teams of semiconductor, software and tools experts with track records of flawless execution. I’ve no doubt Fractile will become the most trusted partner to major AI model providers in short order.”

Business Weekly understands that test chips are due to go onto a multi-project wafer shuttle run by the end of the year.

Boland said: “Fractile is not focused on training. It’s in training that there’s a huge CUDA* lock by NVIDIA that Graphcore never overcame – that plus insufficient memory bandwidth and inability to scale as the models got bigger.

“By contrast, Fractile’s system is highly scalable and focused on inference. That’s the process (using the Large Language Model example) where the model’s already been built, the training is done and the training weights are known; the job is to take input tokens, take the weights, do some maths and produce output tokens.

“Inference is heavily dominated by matrix-vector multiplies (like 99.999 per cent of the operations) and GPUs are terrible at doing them because, for every token, a huge number of weights have to be pulled onto the chip to do some fairly trivial maths, then the result written back to memory.
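
To make that concrete, here is a minimal, illustrative sketch of a single-token decode step through one toy transformer-style layer; the dimensions and structure are assumptions chosen to show the proportions, not Fractile’s or anyone else’s actual stack.

```python
# Minimal sketch of one token's pass through a toy transformer-style layer,
# showing how matrix-vector multiplies dominate the arithmetic at inference time.
# Dimensions and structure are illustrative assumptions, not a real model.
import numpy as np

d_model, d_ff = 4096, 16384
rng = np.random.default_rng(0)

# Stored weights (the "known training weights"): on a GPU these must be
# pulled from memory for every token that is generated.
W_qkv = rng.standard_normal((3 * d_model, d_model), dtype=np.float32)
W_out = rng.standard_normal((d_model, d_model), dtype=np.float32)
W_up = rng.standard_normal((d_ff, d_model), dtype=np.float32)
W_down = rng.standard_normal((d_model, d_ff), dtype=np.float32)

x = rng.standard_normal(d_model, dtype=np.float32)  # activation vector for one token

# The decode step: every heavy operation is a matrix-vector multiply.
qkv = W_qkv @ x
h = W_out @ qkv[:d_model]                # stand-in for the attention output path
y = W_down @ np.maximum(W_up @ h, 0.0)   # feed-forward block with a ReLU

matvec_flops = 2 * sum(W.size for W in (W_qkv, W_out, W_up, W_down))
other_flops = d_ff  # the elementwise ReLU; real layers add norms and softmax, also vector-sized
print(f"matrix-vector share of arithmetic: {matvec_flops / (matvec_flops + other_flops):.4%}")
```

Every one of those weight matrices has to be fetched from memory for each generated token, which is the data-movement cost Boland describes.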

“That’s why in-memory compute (IMC) is the way to go. Graphcore was also an IMC architecture (albeit hampered by a lack of scalability), as are the ~20 other startups in the space, so in itself it’s not surprising that Fractile is adopting this approach.

“But Walter is also doing something very interesting alongside it, which I can’t share right now, but is the key to him getting 100x performance, cost and power advantage.

“Once the models and weights are known, it’s relatively simple to offload the inference task onto different processors; there is no effective CUDA lock. There’s a long way to go on Fractile, but it’s definitely one to watch.”

* CUDA is a software layer that gives direct access to the GPU's virtual instruction set and parallel computational elements for the execution of compute kernels.