AI Chips: What Are They?

The future of artificial intelligence largely hinges on the development of AI chips.
AI chips refer to specialized computing hardware used in the development and deployment of artificial intelligence systems. As AI has become more sophisticated, the need for higher processing power, speed and efficiency in computers has also grown — and AI chips are essential for meeting this demand.
An AI chip is a specialized integrated circuit designed to handle AI tasks. Graphics processing units (GPUs), field-programmable gate arrays (FPGAs) and application-specific integrated circuits (ASICs) are all considered AI chips.
Many AI breakthroughs of the last decade — from IBM Watson’s historic Jeopardy! win to Lensa’s viral social media avatars to OpenAI’s ChatGPT — have been powered by AI chips. And if the industry wants to continue pushing the limits of technology like generative AI, autonomous vehicles and robotics, AI chips will likely need to evolve as well.
“As the cutting edge keeps moving and keeps changing,” said Naresh Shanbhag, an electrical and computer engineering professor at the University of Illinois Urbana-Champaign, “then the hardware has to change and follow, too.”
 
The term “AI chip” is a broad classification, encompassing various chips designed to handle the uniquely complex computational requirements of AI algorithms quickly and efficiently. This includes graphics processing units (GPUs), field-programmable gate arrays (FPGAs) and application-specific integrated circuits (ASICs). Central processing units (CPUs) can also be used for simple AI tasks, but they are becoming less useful as the industry advances.
 
In general, a chip refers to a microchip, which is an integrated circuit unit that has been manufactured at a microscopic scale using semiconductor material. Components like transistors (tiny switches that control the flow of electrical current within a circuit) are etched into this material to power computing functions, such as memory and logic. While memory chips manage data storage and retrieval, logic chips serve as the brains of the operation, processing the data.
AI chips largely work on the logic side, handling the intensive data processing needs of AI workloads — a task beyond the capacity of general-purpose chips like CPUs. To achieve this, they tend to incorporate a large number of smaller, faster and more efficient transistors. This design allows them to perform more computations per unit of energy, resulting in faster processing speeds and lower energy consumption compared to chips with fewer, larger transistors.
AI chips also feature unique capabilities that dramatically accelerate the computations required by AI algorithms. This includes parallel processing — meaning they can perform multiple calculations at the same time. 
Parallel processing is crucial in artificial intelligence, as it allows multiple tasks to be performed simultaneously, enabling quicker and more efficient handling of complex computations. Because of the way AI chips are designed, they are “particularly effective for AI workloads and training AI models,” Hanna Dohmen, a research analyst at Georgetown University’s Center for Security and Emerging Technology (CSET), told Built In.
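To make that concrete, here is a minimal Python/NumPy sketch contrasting the two styles. The vectorized dot product is only a software stand-in for the thousands of arithmetic units an AI chip can bring to bear at once; the array sizes are arbitrary.

```python
import numpy as np

# A million multiply-accumulate operations, the kind of arithmetic
# that dominates neural network workloads.
x = np.random.rand(1_000_000).astype(np.float32)
w = np.random.rand(1_000_000).astype(np.float32)

# Sequential style: one calculation at a time.
total = 0.0
for i in range(len(x)):
    total += x[i] * w[i]

# Parallel style: a single vectorized operation that hardware with
# many arithmetic units (a GPU, NPU or other AI chip) can spread
# across thousands of lanes simultaneously.
total_vec = float(np.dot(x, w))
```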
The various types of AI chips differ in their hardware and functionality:
GPUs are most often used in the training of AI models. Originally developed for applications that require high graphics performance, like running video games or rendering video sequences, these general-purpose chips are typically built to perform parallel processing tasks. Because AI model training is so computationally intensive, companies connect several GPUs together so they can all train an AI system synchronously, a pattern sketched in the code example after this list.
FPGAs are useful in the application of AI models because they can be reprogrammed “on the fly,” as Tim Fist, a fellow with the Technology and National Security Program at CNAS, put it, allowing them to be “hyper-specialized” for whatever task is at hand. This reprogrammability makes FPGAs efficient across a variety of different tasks, particularly those related to image and video processing.
ASICs are accelerator chips, designed for a very specific use — in this case, artificial intelligence. They are custom-built to support specific applications. ASICs offer computing ability similar to FPGAs, but they cannot be reprogrammed. Because their circuitry has been optimized for one specific task, they often offer superior performance compared to general-purpose processors or even other AI chips. Google’s Tensor Processing Unit (TPU) is an example of an ASIC crafted explicitly to boost machine learning performance.
NPUs (neural processing units) are modern add-ons that enable CPUs to handle AI workloads. They are similar to GPUs, except they’re designed with the more specific purpose of accelerating deep learning models and neural networks. As a result, NPUs excel at processing massive volumes of data to perform advanced AI tasks like object detection, speech recognition and video editing, and for these kinds of neural network workloads they can outperform GPUs.
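As a rough illustration of the multi-GPU training pattern described in the GPU entry above, here is a minimal PyTorch sketch. It assumes PyTorch is installed and uses nn.DataParallel, one simple mechanism for splitting each batch across available GPUs; production systems typically rely on more sophisticated distributed training setups. The model and batch sizes are arbitrary.

```python
import torch
import torch.nn as nn

# A toy model standing in for a real network (sizes are arbitrary).
model = nn.Sequential(nn.Linear(512, 256), nn.ReLU(), nn.Linear(256, 10))

# If more than one GPU is present, replicate the model so each device
# processes a slice of every batch at the same time.
if torch.cuda.device_count() > 1:
    model = nn.DataParallel(model)

device = "cuda" if torch.cuda.is_available() else "cpu"
model = model.to(device)

# One forward pass: the batch of 64 examples is split across the GPUs,
# run simultaneously, and the outputs are gathered back together.
inputs = torch.randn(64, 512, device=device)
outputs = model(inputs)
print(outputs.shape)  # torch.Size([64, 10])
```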
 
Modern artificial intelligence simply would not be possible without these specialized AI chips. Here are just some of the ways they are being used.
AI chips speed up the rate at which AI, machine learning and deep learning algorithms are trained and refined, which is particularly useful in the development of large language models (LLMs). They can leverage parallel processing for sequential data and optimize operations for neural networks, enhancing the performance of LLMs — and, by extension, generative AI tools like chatbots, AI assistants and text generators.
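One reason these chips suit LLMs is that transformer-style models can process every position in a sequence at once rather than token by token. A minimal NumPy sketch of the idea, with all dimensions invented for illustration:

```python
import numpy as np

# A whole 128-token sequence, each token a 64-dimensional vector.
seq_len, d_model, d_ff = 128, 64, 256
tokens = np.random.rand(seq_len, d_model).astype(np.float32)
weights = np.random.rand(d_model, d_ff).astype(np.float32)

# One batched matrix multiply transforms all 128 positions at once,
# rather than stepping through the sequence token by token.
hidden = tokens @ weights
print(hidden.shape)  # (128, 256)
```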
AI chips make AI processing possible on virtually any smart device — watches, cameras, kitchen appliances — in a process known as edge AI. This means that processing can take place closer to where data originates instead of on the cloud, reducing latency and improving security and energy efficiency. AI chips can be used in anything from smart homes to smart cities.
AI chips help advance the capabilities of driverless cars, contributing to their overall intelligence and safety. They are able to process and interpret vast amounts of data collected by a vehicle’s cameras, LiDAR and other sensors, supporting sophisticated tasks like image recognition. And their parallel processing capabilities enable real-time decision-making, helping vehicles to autonomously navigate complex environments, detect obstacles and respond to dynamic traffic conditions.
AI chips are useful in various machine learning and computer vision tasks, allowing robots of all kinds to perceive and respond to their environments more effectively. This can be helpful across all areas of robotics, from cobots harvesting crops to humanoid robots providing companionship.
 
When it comes to the development and deployment of artificial intelligence, AI chips hold several advantages over regular chips, thanks to their distinctive design attributes.
Perhaps the most prominent difference between more general-purpose chips (like CPUs) and AI chips is their method of computing. While general-purpose chips employ sequential processing, completing one calculation at a time, AI chips harness parallel processing, executing numerous calculations at once. This approach means that large, complex problems can be divided up into smaller ones and solved at the same time, leading to swifter and more efficient processing.
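The divide-and-conquer structure can be sketched in ordinary Python. The process pool below is only a software stand-in for hardware parallelism, but the shape of the computation (split a large problem into pieces, solve them simultaneously, combine the results) is the same one AI chips exploit.

```python
import numpy as np
from concurrent.futures import ProcessPoolExecutor

def partial_sum(chunk):
    """Solve one small piece of the larger problem."""
    return float(np.sum(np.square(chunk)))

if __name__ == "__main__":
    data = np.random.rand(4_000_000)
    # Divide the large problem into four smaller ones...
    chunks = np.array_split(data, 4)
    # ...solve all four pieces at the same time, then combine.
    with ProcessPoolExecutor(max_workers=4) as pool:
        total = sum(pool.map(partial_sum, chunks))
    print(total)
```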
AI chips are designed to be more energy-efficient than conventional chips. Some AI chips incorporate techniques like low-precision arithmetic, enabling them to perform computations with fewer transistors, and thus less energy. And because they are adept at parallel processing, AI chips can distribute workloads more efficiently than other chips, resulting in minimized energy consumption. In the long term, this could help reduce the artificial intelligence industry’s massive carbon footprint, particularly in data centers.
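A quick NumPy sketch shows why lower precision helps: the same million model weights stored at three different precisions take up 4 MB, 2 MB and 1 MB, and the less data a chip has to move and operate on, the less energy each computation costs. The 8-bit conversion below is a deliberately naive quantization, included only for illustration.

```python
import numpy as np

weights_fp32 = np.random.rand(1_000_000).astype(np.float32)  # 32-bit floats
weights_fp16 = weights_fp32.astype(np.float16)               # 16-bit floats
weights_int8 = (weights_fp32 * 127).astype(np.int8)          # naive 8-bit quantization

print(weights_fp32.nbytes)  # 4000000 bytes (4 MB)
print(weights_fp16.nbytes)  # 2000000 bytes (2 MB)
print(weights_int8.nbytes)  # 1000000 bytes (1 MB)
```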
Using AI chips could also help edge AI devices run more efficiently. For example, if you want your cellphone to be able to collect and process your personal data without having to send it to a cloud server, the AI chips powering that cellphone must be optimized for energy efficiency so they don’t drain the battery.
Because AI chips are specifically designed for artificial intelligence, they tend to be able to perform AI-related tasks like image recognition and natural language processing with more accuracy than regular chips. Their purpose is to perform intricate calculations involved in AI algorithms with precision, reducing the likelihood of errors. This makes AI chips an obvious choice for more high-stakes AI applications, such as medical imaging and autonomous vehicles, where rapid precision is imperative.
Unlike general-purpose chips, some AI chips (FPGAs and ASICs, for example) can be customized to meet the requirements of specific AI models or applications, allowing the hardware to adapt to different tasks. 
Customizations include fine-tuning certain parameters (variables within a trained model) and optimizing the chip’s architecture for specific AI workloads. This flexibility is essential to the advancement of AI, as it enables developers to tailor the hardware to their unique needs, accommodating variations in algorithms, data types and computational requirements.
 
While AI chips play a crucial role in advancing the capabilities of AI, their future is full of challenges, such as supply chain bottlenecks, a fragile geopolitical landscape and computational constraints.
At the moment, Nvidia is a top supplier of AI hardware and software, controlling about 80 percent of the global market share in GPUs. But Nvidia doesn’t manufacture its own chips; it relies on Taiwan Semiconductor Manufacturing Company (TSMC), which makes roughly 90 percent of the world’s advanced chips, powering everything from Apple’s iPhones to Tesla’s electric vehicles. TSMC is also the sole manufacturer of Nvidia’s powerful H100 and A100 processors, which power the majority of AI data centers.
TSMC’s control over the market has created severe bottlenecks in the global supply chain. The company has limited production capacity and resources, which hinders its ability to meet escalating demand for AI chips.
“The demand for these chips is currently far exceeding the supply,” CNAS’ Fist said. “If you’re an AI developer and you want to buy 10,000 of Nvidia’s latest GPUs, it’ll probably be months or years before you can get your hands on them.”
These supply shortages won’t last forever, though. TSMC’s subsidiary, Japan Advanced Semiconductor Manufacturing (JASM), is constructing a factory in Kumamoto that is expected to be at full production by the end of 2024. TSMC is also building two state-of-the-art plants in Arizona, the first of which is set to begin chip production in 2025.
In the meantime, prominent AI makers like Microsoft, Google and Amazon are designing their own custom AI chips to reduce their reliance on Nvidia.
There have also been wider attempts to counter Nvidia’s dominance, spearheaded by a consortium of companies called the UXL Foundation. For example, the Foundation has developed an open-source alternative to Nvidia’s CUDA platform, and Intel has directly challenged Nvidia with its latest Gaudi 3 chip. In addition, Intel and AMD have created their own processors for laptops and computers, signaling that the semiconductor sector could become less reliant on Nvidia moving forward.
Taiwan, which plays a central role in the global supply of AI chips, is viewed by China as a rogue province as opposed to an independent nation. Because of this, some analysts believe a Chinese invasion could occur within the decade, which would affect TSMC’s ability to manufacture AI chips and put the entire AI industry in jeopardy.
Meanwhile, amid tensions between the U.S. and China, President Joe Biden rolled out a sweeping set of export controls in 2022 that dramatically limit China’s access to AI chips, chip-making equipment and chip design software (much of which is controlled by U.S. companies like Nvidia). Although companies like Intel can still introduce new AI chips in China, they must limit the performance of these chips. 
“We want to restrict China’s military modernization, and we are concerned about the Chinese government using AI chips to develop weapons of mass destruction,” Dohmen, whose research focuses on U.S.-China tech competition, said. But it also comes down to a desire for AI dominance. “We want to be the first, we want to be the best in tech and AI innovation.”
Indeed, as the United States works to limit China’s access to AI hardware, it is also taking steps to reduce its own reliance on chip fabrication facilities in East Asia. In addition to facilitating the two TSMC plants in Arizona, the government has secured a third TSMC site in Phoenix through the CHIPS and Science Act and also set aside more than $52 billion in federal funding and incentives to support U.S. semiconductor manufacturing, research and development.
Developers are creating bigger and more powerful models, driving up computational demands. And “chips need to keep up,” Fist said. But AI chips have finite computational resources.
“The amount of chips that you need to scale a state-of-the-art AI system is growing by about four times every year, which is huge,” Fist added. Meanwhile, the algorithmic efficiency of chips, or the ability to do more with fewer chips, is growing by two times every year. “The requirements, in terms of how many chips we need and how powerful they need to be, are outstripping what the industry is currently able to provide.”
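Those two growth rates imply a simple back-of-the-envelope calculation: if the chips required grow about four times a year while efficiency doubles, net hardware demand still roughly doubles annually, and that compounds quickly.

```python
# Rough arithmetic based on the growth rates quoted above.
chip_growth = 4.0       # chips required grows ~4x per year
efficiency_gain = 2.0   # algorithmic efficiency improves ~2x per year

net_annual_growth = chip_growth / efficiency_gain
print(net_annual_growth)       # 2.0: demand still doubles each year
print(net_annual_growth ** 5)  # 32.0: a 32x increase over five years
```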
Instead of simply throwing more chips at the problem, companies are rushing to figure out ways to improve AI hardware itself. 
One key area of interest is in-memory computing, which eliminates the separation between where the data is stored (memory) and where the data is processed (logic) in order to speed things up. And AI chip designers like Nvidia and AMD have started incorporating AI algorithms to improve hardware performance and the fabrication process. All of this work is essential to keeping up with the breakneck pace at which AI is moving. 
“There are all of these different exponential trends at play,” Fist said. “So there’s this big rush to figure out how do we build even more specialized chips for AI? Or, how do we innovate in other parts of the stack?”
While regular chips are typically general-purpose and designed to accomplish all kinds of computer functions, AI chips are made to handle the complex computations involved in AI-related tasks. Unlike regular chips, AI chips are optimized for specific AI workloads, offering improved performance, speed and energy efficiency.
A CPU (central processing unit) is a general-purpose chip that can handle a wide range of tasks in a computer system, including running operating systems and managing applications. GPUs (graphics processing units) are also general-purpose, but they are typically built to perform parallel processing tasks. They are best suited for rendering images, running video games and training AI models.
The costs of AI chips vary and depend on factors like performance. While AMD’s MI300X chip falls between $10,000 and $15,000, Nvidia’s H100 chip typically costs between $30,000 and $40,000, and sometimes more.
Nvidia dominates the AI chip manufacturing industry, but it faces competition from other major tech companies like Microsoft, Google, Intel, IBM and AMD.
Nvidia’s H100 is considered the top AI chip on the market. According to Nvidia, the chip can train large language models four times faster than the company’s A100 models and generate answers to user prompts 30 times faster. 
Nvidia’s chips are manufactured in Taiwan.
