New Microsoft partnership accelerates generative AI development

One of the hottest trends in artificial intelligence (AI) this year has been the emergence of popular generative AI models. With technologies including DALL-E and Stable Diffusion, a growing number of startups and use cases are emerging.

Generative AI builds on a number of foundational technologies, including transformer models. Running transformers for generative AI and other use cases can be resource-intensive on the inference side, where a trained model is run to produce its results.

Among the vendors building technology to help accelerate AI inference for transformer models is startup d-Matrix, which raised $44 million in a series A round of funding in April to build out its AI accelerator hardware. The company has developed a digital in-memory compute (DIMC) technology that isn’t publicly available yet, but it has already gotten the attention of Microsoft.

Microsoft and d-Matrix today announced that Microsoft’s Project Bonsai reinforcement learning platform will be supported on the d-Matrix DIMC technology, which the two vendors hope will significantly accelerate AI inference.

“Project Bonsai is a platform which enables our version of deep reinforcement learning and we call it machine teaching,” Kingsuk Maitra, principal applied AI engineer at Microsoft, told VentureBeat. “We have trained a compiler for d-Matrix’s one-of-a-kind digital in-memory compute technology and the early results are very encouraging.”

What Microsoft’s Project Bonsai is all about

Project Bonsai has been in development at Microsoft for the last several years and is currently available as a preview.

Maitra said that the goal of the effort is to abstract away the complexities associated with deep reinforcement learning networks. An initial target for Project Bonsai is industrial controls, including chip design and manufacturing. Part of the technology is a high-level language called Inkling, developed within Microsoft’s Project Bonsai, that is used to train deep reinforcement learning agents to perform control tasks.

Deep reinforcement learning doesn’t require labeled data, Maitra explained. Rather, it learns from feedback from the environment, which can be emulated with a simulator. At the end of a training loop, the result is a trained reinforcement learning (RL) agent, which Microsoft refers to as a “brain.” Brains, when deployed, can take meaningful actions to complete the task at hand.
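
To make that loop concrete, here is a minimal sketch of learning from simulator feedback rather than from labeled data. It is not Project Bonsai or Inkling code, and it uses a small lookup table (tabular Q-learning) in place of a deep network. The simulator, the reward values and every name in it (LineSimulator, train, act) are hypothetical, chosen only for illustration.

```python
import random


class LineSimulator:
    """Hypothetical toy simulator: an agent on positions 0..4 must reach position 4."""

    GOAL = 4

    def reset(self):
        self.pos = 0
        return self.pos

    def step(self, action):
        # Action 0 moves left, action 1 moves right; position is clamped to [0, GOAL].
        delta = 1 if action == 1 else -1
        self.pos = max(0, min(self.GOAL, self.pos + delta))
        done = self.pos == self.GOAL
        # The only feedback is a reward: a small penalty per step, a bonus at the goal.
        reward = 1.0 if done else -0.1
        return self.pos, reward, done


def act(q, state):
    """Greedy action from the learned table; ties go to action 1 (right)."""
    return 0 if q[state][0] > q[state][1] else 1


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Run the training loop against the simulator and return the learned Q-table."""
    rng = random.Random(seed)
    sim = LineSimulator()
    q = [[0.0, 0.0] for _ in range(LineSimulator.GOAL + 1)]
    for _ in range(episodes):
        state = sim.reset()
        done = False
        while not done:
            # Explore with probability epsilon, otherwise act greedily.
            if rng.random() < epsilon:
                action = rng.randrange(2)
            else:
                action = act(q, state)
            next_state, reward, done = sim.step(action)
            # Q-learning update: move the estimate toward reward plus discounted future value.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


if __name__ == "__main__":
    q_table = train()
    print([act(q_table, s) for s in range(LineSimulator.GOAL)])
```

The table returned by train() plays the role of the trained agent: once deployed, act() reads it to choose an action for each state, with no labeled examples involved at any point.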

“We are running active real-life workloads and training the compiler, relative to those real-life workloads, most of them with well-known large language models with different Bonsai brains,” Maitra said.

The d-Matrix Corsair is coming in 2023

Currently, d-Matrix doesn’t have any chips that are publicly available, but the first one, code-named Corsair, is set to debut in 2023.

“We’re building an accelerated computing platform for transformers and specifically focused around generative AI,” Sudeep Bhoja, cofounder and CTO at d-Matrix, told VentureBeat.

Bhoja explained that the chips d-Matrix is developing are modular: they could be packaged together with a CPU or integrated on a PCI card that plugs into a server in the cloud. The company’s DIMC technology is designed to accelerate AI inference with high performance and low latency.

With Microsoft’s Project Bonsai, d-Matrix now has a compiler for its silicon that was trained with deep reinforcement learning. A key end goal for d-Matrix is to help support the continued growth and deployment of generative AI models.

“We want to enable [generative AI models] because it takes a lot of processing power, there are latency constraints and it is user facing,” Bhoja said. “You have to be able to do it in a very energy-efficient way so that the data centers don’t have to bring in more power….”

Sean Michael Kerner