🇺🇸:I was recently invited by the French newspaper Siècle Digital to discuss Sam Altman's $7 trillion plan to turbocharge chipmaking for the AI era. The podcast is available in French below, and the following are abridged English highlights of our conversation.
🇫🇷:J'ai récemment été invité par le journal français Siècle Digital pour discuter du plan de 7 billions de dollars de Sam Altman visant à bouleverser la fabrication des semi-conducteurs pour l'ère de l'IA. Le podcast est disponible en français ci-dessous, et ce qui suit sont des moments forts abrégés de notre conversation en anglais.
What are Sam Altman's plans? Why does he want to reorganize the semiconductor industry?
Altman has proposed raising 5 to 7 trillion dollars [$5,000,000,000,000 – $7,000,000,000,000] to build chip fabrication plants (fabs) producing the GPUs that companies like OpenAI need to train large language models (LLMs). These fabs would be managed by existing manufacturers like TSMC, and OpenAI would commit to being a major client. Until now, OpenAI has developed its technology using Microsoft's computing resources, which has imposed limits on its growth.
The semiconductor industry poses challenges different from those of Altman's past ventures, however. A former colleague of mine, currently serving as a senior advisor to the RAND Corporation in the United States, warned that "In software, anything is possible—it really just is a money and coding problem… However, in the world of hard tech, you actually have to deal with the laws of physics. You have to think about the real world and engineering challenges, and this stuff is hard to do."
What is a semiconductor? What is its role in AI?
The terms "semiconductors," "integrated circuits," and "electronic chips" are often used interchangeably, though they refer to slightly different concepts in electronics. Semiconductors are materials whose electrical conductivity falls between that of conductors and insulators, used as the foundational material for manufacturing electronic components. Integrated circuits are pieces of semiconductor material onto which electronic components are integrated, while "chip" is a more general term describing a miniaturized electronic device.
Semiconductors play an essential role in AI in the form of specialized processors known as GPUs (Graphics Processing Units), which accelerate the calculations required for machine learning. Companies like Intel, AMD, NVIDIA, Qualcomm, and TSMC play key roles in designing and manufacturing semiconductors for AI.
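To make concrete why GPUs matter here: the core computation in training an LLM is dense matrix multiplication, which a GPU spreads across thousands of cores at once. A minimal sketch in Python (the layer sizes below are illustrative, not taken from any real model):

```python
import numpy as np

# A single dense layer's forward pass: the workhorse operation of LLM training.
# Sizes are illustrative only; real LLM layers are orders of magnitude larger.
batch, d_in, d_out = 64, 4096, 4096
x = np.random.randn(batch, d_in).astype(np.float32)   # input activations
W = np.random.randn(d_in, d_out).astype(np.float32)   # layer weights

# On a CPU this matrix multiply runs on a handful of cores; a GPU spreads the
# same independent multiply-adds across thousands of cores, which is why GPU
# supply gates how fast models can be trained.
y = x @ W
print(y.shape)  # (64, 4096)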
What is the state of global semiconductor production and supply chains? Are there problems that require a complete overhaul? Does NVIDIA's dominance pose a problem?
GPU shortages have grown with the rise of machine learning, leaving AI companies big and small scrambling for the chips they need to train their models.
A young engineer I know recently left his job at a major semiconductor company to start his own AI-focused chip startup, but he is facing a shortage of the GPUs he needs to train his model. The GPU shortage is even more concerning for him than for large clients like OpenAI because his purchasing power can't match that of the giants he hopes to challenge.
The U.S. Congress has tried to support entrepreneurs like my friend through proposals like the "National Artificial Intelligence Research Resource (NAIRR)," which would subsidize computing resources as a public good for students and researchers. However, this initiative has not made significant progress toward funding.
Regarding NVIDIA's dominance of the market: research and development in semiconductor technology is among the most challenging and costly undertakings on the planet. Even in advanced chip design alone (NVIDIA's specialty, since it does not operate its own fabs), allocating 27% of total annual sales to research and development, well beyond the already high industry average of 15%, requires a significant level of market concentration to sustain.
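As a rough illustration of what that R&D intensity means in dollar terms (the revenue and R&D figures below are approximations chosen to be consistent with the percentages cited above, not exact financials):

```python
# Back-of-the-envelope R&D intensity, using approximate figures consistent
# with the percentages cited above (not exact financial statements).
nvidia_revenue_usd = 27e9          # ~ annual sales
nvidia_rd_usd = 7.3e9              # ~ annual R&D spend
industry_avg_intensity = 0.15      # already high by most industries' standards

intensity = nvidia_rd_usd / nvidia_revenue_usd
print(f"NVIDIA R&D intensity: {intensity:.0%}")               # ~27%
print(f"Industry average:     {industry_avg_intensity:.0%}")  # 15%
```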
What resources does Sam Altman have to realize his plans? Can he hope for government assistance, particularly in the U.S.? Are his plans utopian?
The size of such an investment exceeds the annual size of the entire global semiconductor industry many times over. Global chip sales were $527 billion last year and are expected to reach $1 trillion annually by 2030. It also dwarfs the comparatively modest $39 billion in chipmaking grants the U.S. authorized through the CHIPS and Science Act of 2022, passed two years ago when I was working in Washington, and even the enormous government funds offered by China, Taiwan, South Korea, and Japan pale in comparison.
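Putting those figures side by side (a quick sanity check using only the numbers cited above):

```python
# Scale comparison using the figures cited above (USD).
altman_low, altman_high = 5e12, 7e12   # proposed raise
global_chip_sales = 527e9              # last year's global chip sales
chips_act_grants = 39e9                # U.S. CHIPS and Science Act chip grants

print(f"{altman_low / global_chip_sales:.1f}x to "
      f"{altman_high / global_chip_sales:.1f}x annual global chip sales")
print(f"{altman_low / chips_act_grants:.0f}x to "
      f"{altman_high / chips_act_grants:.0f}x the CHIPS Act grants")
# -> roughly 9x-13x global chip sales, and well over 100x the CHIPS Act grants
```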
Achieving his ambitions for the chip industry and the other sectors needed to support AI will require persuading a complex, global network of funders, industrial partners, and governments. That persuasion includes winning U.S. approval for the UAE's involvement, since Washington is wary of Gulf states' cooperation with China. The U.S. government is thus both a significant potential supporter of Altman's efforts and a potential obstacle to funding from Middle Eastern sovereign wealth funds.
Rapidly accelerating AI chipmaking also raises numerous security concerns, two of which I want to mention here:
Heterogeneous integration (HI) has become chipmakers' best hope for advancing computing beyond the limits of Moore's Law. HI involves mixing diverse sets of dies (chiplets), each containing part of a final integrated circuit's functionality. The risk is that combining dies from many sources can introduce hardware weaknesses that malicious actors could exploit to hack devices.
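A conceptual sketch of why HI widens the attack surface: one package now aggregates dies from multiple designers and fabs, each a separate supply-chain link to verify (the vendor and fab names below are hypothetical, purely for illustration):

```python
from dataclasses import dataclass

@dataclass
class Chiplet:
    function: str   # the slice of the final IC's functionality this die provides
    designer: str   # who designed the die
    fab: str        # where it was manufactured
    verified: bool  # whether its hardware has been independently audited

# Hypothetical HI package: every entry is a separate supply-chain link to trust.
package = [
    Chiplet("compute", "VendorA", "FabX", verified=True),
    Chiplet("memory I/O", "VendorB", "FabY", verified=True),
    Chiplet("accelerator", "VendorC", "FabZ", verified=False),
]

# A single unaudited die is enough to compromise the whole integrated package.
unverified = [c for c in package if not c.verified]
print(f"{len(unverified)} of {len(package)} dies lack independent verification")
```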
AI applications have led chipmakers to intentionally adopt less predictable chip behaviors, posing security challenges: the unpredictability makes aberrant behavior harder for existing hardware security measures to detect. Recently, independent researchers discovered a vulnerability in the chips of four major GPU suppliers (not including NVIDIA) that allowed adversaries to spy on other users' LLM training sessions. Fragments of innocent users' data remained in the local memory of shared cloud servers, where malicious actors using the same cloud hardware could read them during subsequent sessions. Requiring GPU manufacturers to clear local memory after each LLM session may sound simple, but the at least four providers who did not anticipate this issue now face the difficulty of modifying source code that has become deeply entrenched over years of rapid iteration on GPU architectures.
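To see how such a leak works in principle, here is a toy simulation in ordinary Python, not actual GPU code: a shared scratch buffer is reused across sessions, and unless it is explicitly cleared, the next tenant can read fragments of the previous tenant's data.

```python
# Toy simulation of leftover data in shared "local memory" (not real GPU code).
shared_local_memory = bytearray(64)  # scratch buffer shared across sessions

def run_session(data: bytes, clear_after: bool) -> None:
    """Simulate one tenant's LLM session using the shared buffer."""
    shared_local_memory[: len(data)] = data  # session writes its working data
    if clear_after:
        shared_local_memory[:] = bytes(len(shared_local_memory))  # zero it out

def snoop() -> bytes:
    """The next tenant on the same hardware simply reads the buffer."""
    return bytes(shared_local_memory).rstrip(b"\x00")

run_session(b"victim prompt fragment", clear_after=False)
print(snoop())  # b'victim prompt fragment'  <- leaks to the next session

run_session(b"victim prompt fragment", clear_after=True)
print(snoop())  # b''  <- the simple mitigation: clear memory after each session
```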
In short, if the world manages to accelerate the semiconductor industry as quickly as Altman envisions, existing security risks, as well as those that are still unknown, will also be accelerated.