How Chinese AI Startup DeepSeek Made a Model that Rivals OpenAI

Today, DeepSeek is one of the only artificial intelligence companies in China that does not rely on funding from tech giants such as Baidu, Alibaba, or ByteDance.

A young group of geniuses eager to prove themselves

According to Liang, when he put together DeepSeek's research team, he was not looking for experienced engineers to build a consumer-facing product. Instead, he focused on doctoral students from China's top universities, including Peking University and Tsinghua University, who were eager to prove themselves. Many had been published in top journals and had won awards at international academic conferences, but they lacked industry experience, according to the Chinese technology publication QBitAI.

“Our core technical positions are mostly filled by people who graduated this year or in the past one or two years,” Liang told 36Kr in 2023. The hiring strategy helped create a collaborative company culture in which people were free to use ample computing resources to pursue unconventional research projects. It is a starkly different way of working from established internet companies in China, where teams often compete for resources. (A recent example: ByteDance accused a former intern, the winner of a prestigious academic award no less, of sabotaging his colleagues' work in order to hoard more computing resources for his team.)

Liang said that students can be a better fit for high-investment, low-profit research. “Most people, when they are young, can devote themselves completely to a mission without utilitarian considerations,” he explained. His pitch to prospective hires is that DeepSeek was created “to solve the most difficult questions in the world.”

Experts say the fact that these young researchers were educated entirely in China adds to their drive. “This younger generation also embodies a sense of patriotism, particularly as they navigate US restrictions and choke points in critical hardware and software technologies,” explains Zhang. “Their determination reflects not only personal ambition but also a broader commitment to advancing China's position as a global leader in innovation.”

Innovation born from a crisis

In October 2022, the United States government began putting together export controls that severely restricted Chinese AI companies from accessing advanced chips like Nvidia's H100. The move presented a problem for DeepSeek. The company had started out with a stockpile of 10,000 A100s, but it needed more to compete with companies such as OpenAI and Meta. “The problem we face has never been funding, but the export controls on advanced chips,” Liang told 36Kr in a second interview in 2024.

DeepSeek had to come up with more efficient methods to train its models. “Many of these approaches are not new ideas, but combining them successfully to produce an advanced model is a remarkable achievement,” says Wendy Zhang, a software engineer.

DeepSeek has also made significant progress on multi-head latent attention (MLA) and mixture-of-experts, two technical designs that make DeepSeek models more cost-effective by requiring fewer computing resources to train. In fact, the latest DeepSeek model is so efficient that it required one-tenth the computing power used to train Meta's comparable Llama 3.1 model, according to the research institute Epoch AI.
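To give a rough sense of why a mixture-of-experts design can lower computing cost, here is a minimal sketch of generic top-k expert routing: a gate scores every expert, but only the best few are actually run for a given input. This is an illustration of the general idea, not DeepSeek's implementation; the function names, the toy experts, and the gate scores are all hypothetical.

```python
import math

def softmax(scores):
    """Turn raw gate scores into probabilities that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def route_top_k(gate_scores, k):
    """Keep the k experts with the highest gate probability and renormalize."""
    probs = softmax(gate_scores)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    mass = sum(probs[i] for i in top)
    return {i: probs[i] / mass for i in top}

def moe_forward(x, experts, gate_scores, k=2):
    """Run only the selected experts and blend their outputs by gate weight."""
    weights = route_top_k(gate_scores, k)
    output = sum(w * experts[i](x) for i, w in weights.items())
    return output, sorted(weights)

# Hypothetical toy experts: each one just scales its input.
experts = [lambda x, s=s: s * x for s in (1.0, 2.0, 3.0, 4.0)]

# Hypothetical gate scores; in a real model a learned layer would produce them.
output, used = moe_forward(10.0, experts, gate_scores=[0.1, 2.0, 0.3, 2.0], k=2)
print(f"experts used: {used}, output: {output}")
```

In this toy setup only two of the four experts are evaluated for the input, so roughly half of the expert computation is skipped. Production systems apply the same principle at far larger scale, with many more experts and learned gates.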

DeepSeek's willingness to share these innovations with the public has earned it considerable goodwill in the global AI research community. For many Chinese AI companies, developing open source models is the only way to play catch-up with their Western counterparts, because it attracts more users and contributors, which in turn helps the models grow. “They have now demonstrated that advanced models can be built with less, though still a lot, and that the current norms of model building leave plenty of room for improvement,” says Zhang. “We are sure to see many more attempts in this direction going forward.”

The news could spell trouble for the current US export controls, which focus on creating bottlenecks in computing resources. “Existing estimates of how much AI computing power China has, and what it can achieve with it, could be upended,” says Zhang.
