Why DeepSeek Could Change What Silicon Valley Believes About A.I.
![Why DeepSeek Could Change What Silicon Valley Believes About A.I.](https://i3.wp.com/static01.nyt.com/images/2025/01/27/multimedia/ROOSE-market-btgv/ROOSE-market-btgv-facebookJumbo.jpg?w=780&resize=780,470&ssl=1)
The artificial intelligence breakthrough that is sending shock waves through stock markets, rattling the giants of Silicon Valley and generating breathless talk about the end of America’s technological dominance arrived with an unassuming, modest title: “Incentivizing Reasoning Capability in LLMs via Reinforcement Learning.”
The 22-page paper, released last week by a Chinese A.I. company called DeepSeek, did not immediately set off alarm bells. It took a few days for researchers to digest the paper’s claims, and the implications of what it described. The company had created a new A.I. model called DeepSeek-R1, built by a team of researchers who claimed to have used a modest number of second-rate A.I. chips to match the performance of leading American A.I. models at a fraction of the cost.
DeepSeek said it had done this by using clever engineering to substitute for raw computing power. And it had done so in China, a country that many experts believed to be a distant second in the global A.I. race.
Some industry watchers initially reacted to DeepSeek’s breakthrough with disbelief. Surely, they thought, DeepSeek had cheated to achieve R1’s results, or fudged its numbers to make its model seem more impressive than it was. Maybe the Chinese government was promoting propaganda to undermine the narrative of American A.I. dominance. Maybe DeepSeek was hiding a stash of illicit Nvidia H100 chips, which are banned under U.S. export controls, and lying about it. Maybe R1 was just a clever repackaging of American A.I. models that did not represent much in the way of real progress.
Eventually, as more people dug into the details of DeepSeek-R1 (which, unlike most A.I. models, was released as open-source software, allowing outsiders to examine its inner workings more closely), their skepticism turned into worry.
And late last week, when many Americans began using DeepSeek’s models for themselves, and the DeepSeek mobile app hit the No. 1 spot in Apple’s App Store, that worry tipped into full-blown panic.
I am skeptical of the most dramatic takes I have seen over the past few days, such as the claim, made by one Silicon Valley investor, that DeepSeek is an elaborate plot by the Chinese government to destroy the American tech industry. I also think it is plausible that the company’s small budget has been badly exaggerated, or that it built on advances made by American A.I. companies in ways it has not disclosed.
But I do think that DeepSeek’s R1 breakthrough was real. Based on conversations I have had with industry insiders, and a week’s worth of experts poking around and testing the paper’s findings for themselves, it appears to be throwing into doubt several major assumptions the American tech industry has been making.
The first is the assumption that in order to build cutting-edge A.I. models, you need to spend huge amounts of money on powerful chips and data centers.
It is hard to overstate how entrenched this dogma has become. Companies like Microsoft, Meta and Google have already spent tens of billions of dollars building out the infrastructure they believed was needed to build and run next-generation A.I. models. They plan to spend billions more, or, in the case of OpenAI, as much as $500 billion through a joint venture with Oracle and SoftBank that was announced last week.
DeepSeek appears to have spent a small fraction of that building R1. We do not know the exact cost, and there are plenty of caveats to make about the figures it has released so far. The cost is almost certainly higher than $5.5 million, the number the company claims it spent training a previous model.
But even if R1 cost 10 times more to train than DeepSeek claims, and even if you factor in other costs it may have excluded, like engineers’ salaries or the costs of doing basic research, it would still be far less than what A.I. companies are spending to develop their most capable models.
The obvious conclusion to draw is not that American tech giants are wasting their money. It is still expensive to run powerful A.I. models once they are trained, and there are reasons to think that spending hundreds of billions of dollars will still make sense for companies like OpenAI and Google, which can afford to pay dearly to stay at the head of the pack.
But DeepSeek’s breakthrough on cost challenges the “bigger is better” narrative that has driven the A.I. arms race in recent years, by showing that relatively small models, when trained properly, can match or exceed the performance of much bigger models.
That, in turn, means that A.I. companies may be able to achieve very powerful capabilities with far less investment than previously thought. And it suggests that we may soon see a flood of investment into smaller A.I. start-ups, and much more competition for the giants of Silicon Valley. (Which, because of the enormous costs of training their models, have mostly been competing with each other until now.)
There are other, more technical reasons that everyone in Silicon Valley is paying attention to DeepSeek. In the research paper, the company reveals some details about how R1 was actually built, which include some cutting-edge techniques in model distillation. (Basically, that means compressing big A.I. models down into smaller ones, making them cheaper to run without losing much in the way of performance.)
DeepSeek also included details suggesting that it was not as hard as previously thought to convert a “vanilla” A.I. language model into a more sophisticated reasoning model, by applying a technique known as reinforcement learning on top of it. (Don’t worry if these terms go over your head; what matters is that methods for improving A.I. systems that were previously closely guarded by American tech companies are now out there on the web, free for anyone to take and replicate.)
Even if the stock prices of American tech giants recover in the coming days, DeepSeek’s success raises important questions about their long-term A.I. strategies. If a Chinese company is able to build cheap, open-source models that match the performance of expensive American models, why would anyone pay for ours? And if you are Meta, the only American tech giant that releases its models as free open-source software, what prevents DeepSeek or another start-up from simply taking your models, which you spent billions of dollars on, and distilling them into smaller, cheaper models that they can offer for pennies?
DeepSeek’s breakthrough also undercuts some of the geopolitical assumptions many American experts had been making about China’s position in the A.I. race.
First, it challenges the narrative that China is meaningfully behind the frontier when it comes to building powerful A.I. models. For years, many A.I. experts (and the policymakers who listen to them) have assumed that the United States had a lead of at least several years, and that copying the advances made by American tech companies was too hard for Chinese companies to do quickly.
But DeepSeek’s results show that China has advanced A.I. capabilities that can match or exceed models from OpenAI and other American A.I. companies, and that breakthroughs made by American companies may be very easy for Chinese companies (or, at least, one Chinese company) to replicate within weeks.
(The New York Times has sued OpenAI and its partner, Microsoft, accusing them of copyright infringement of news content related to A.I. systems. OpenAI and Microsoft have denied those claims.)
The results also raise questions about whether the steps the U.S. government has been taking to limit the spread of powerful A.I. systems to its adversaries (namely, the export controls used to prevent powerful A.I. chips from falling into China’s hands) are working, or whether those regulations need to adapt to take into account new, more efficient ways of training models.
And, of course, there are concerns about what it would mean for privacy and censorship if China took the lead in building powerful A.I. systems used by millions of Americans. Users of DeepSeek’s models have noticed that they routinely refuse to respond to questions about sensitive topics inside China, such as the Tiananmen Square massacre and the Uyghur detention camps. If other developers build on top of DeepSeek’s models, as is common with open-source software, those censorship measures may become embedded across the industry.
Privacy experts have also raised concerns about the fact that data shared with DeepSeek’s models may be accessible to the Chinese government. If you were worried about TikTok being used as a tool of surveillance and propaganda, the rise of DeepSeek should worry you, too.
I am still not sure what the full impact of DeepSeek’s breakthrough will be, or whether we will come to consider the release of R1 a “Sputnik moment” for the A.I. industry, as some have claimed.
But it seems wise to take seriously the possibility that we are in a new era of A.I. brinkmanship now: that the biggest and richest American tech companies may no longer win by default, and that containing the spread of A.I. systems may be harder than we thought.
At the very least, DeepSeek has shown that the A.I. arms race is truly on, and that after several years of dizzying progress, there are still more surprises in store.