Chinese startup DeepSeek’s launch of its latest AI models, which it says are on a par with or better than industry-leading models in the US at a fraction of the cost, is threatening to upset the technology world order.
The company has attracted attention in global AI circles after writing in a paper last month that the training of DeepSeek-V3 required less than US$6 million worth of computing power from Nvidia H800 chips.
DeepSeek’s AI Assistant, powered by DeepSeek-V3, has overtaken rival ChatGPT to become the top-rated free application available on Apple’s App Store in the US.
This has raised doubts about the reasoning behind some US tech companies’ decision to pledge billions of dollars in AI investment, and shares of several big tech players, including Nvidia, have been hit.
WHY IS DEEPSEEK CAUSING A STIR?
The release of OpenAI’s ChatGPT in late 2022 caused a scramble among Chinese tech firms, which rushed to create their own chatbots powered by artificial intelligence.
But after the release of the first Chinese ChatGPT equivalent, made by Chinese search engine giant Baidu, there was widespread disappointment in China at the gap in AI capabilities between US and Chinese firms.
The quality and cost efficiency of DeepSeek’s models have turned this narrative on its head. The two models that have been showered with praise by Silicon Valley executives and US tech company engineers alike, DeepSeek-V3 and DeepSeek-R1, are on a par with OpenAI’s and Meta’s most advanced models, the Chinese startup said.
They are also cheaper to use. DeepSeek-R1, released last month, is 20 to 50 times cheaper to use than OpenAI’s o1 model, depending on the task, according to a post on DeepSeek’s official WeChat account.
But some have publicly expressed scepticism about DeepSeek’s success story.
Scale AI CEO Alexandr Wang (汪滔) said during an interview with CNBC on Jan. 23, without providing evidence, that DeepSeek has 50,000 Nvidia H100 chips, which he claimed would not be disclosed because that would violate Washington’s export controls that ban such advanced AI chips from being sold to Chinese companies. DeepSeek did not immediately respond to a request for comment on the allegation.
Bernstein analysts on Jan. 27 said in a research note that DeepSeek’s total training costs for its V3 model were unknown but were much higher than the US$5.58 million the startup said was used for computing power. The analysts also said the training costs of the equally acclaimed R1 model were not disclosed.
WHO IS BEHIND DEEPSEEK?
DeepSeek is a Hangzhou-based startup whose controlling shareholder is Liang Wenfeng (梁文鋒), co-founder of quantitative hedge fund High-Flyer, according to Chinese corporate records.
Liang’s fund announced in March 2023 on its official WeChat account that it was “starting again,” going beyond trading to concentrate resources on creating a “new and independent research group, to explore the essence of AGI [Artificial General Intelligence].” DeepSeek was created later that year.
ChatGPT maker OpenAI defines AGI as autonomous systems that surpass humans in most economically valuable tasks.
It is unclear how much High-Flyer has invested in DeepSeek. High-Flyer has an office located in the same building as DeepSeek, and it also owns patents related to chip clusters used to train AI models, Chinese corporate records show.
High-Flyer’s AI unit said on its official WeChat account in July 2022 that it owns and operates a cluster of 10,000 A100 chips.
HOW DOES BEIJING VIEW DEEPSEEK?
DeepSeek’s success has already been noticed in China’s top political circles. On Jan. 20, the day DeepSeek-R1 was released to the public, founder Liang attended a closed-door symposium for businesspeople and experts hosted by Chinese Premier Li Qiang (李強), Chinese state news agency Xinhua said.
Liang’s presence at the gathering is potentially a sign that DeepSeek’s success could be important to Beijing’s policy goal of overcoming Washington’s export controls and achieving self-sufficiency in strategic industries like AI. A similar symposium last year was attended by Baidu CEO Robin Li (李彥宏).
(Reuters)
中國新創公司DeepSeek(深度求索)推出了最新的AI(人工智慧)模型,據稱與美國領先業界的模型旗鼓相當,甚至更好,但所需成本只有美國模型的一小部分,這可能顛覆科技世界的秩序。
該公司上個月在一篇論文中指出,DeepSeek-V3的訓練,只需要不到六百萬美元的輝達 H800晶片的運算力,引起了全球AI界的關注。
使用DeepSeek-V3的DeepSeek AI助手,已超越競爭對手ChatGPT,成為美國蘋果App Store上評價最高的免費應用程式。
這引發人們質疑為何一些美國科技公司要在AI領域投入數十億美元,輝達等幾家大型科技公司的股價也因此重挫。
DeepSeek為何引起轟動?
2022年底,OpenAI ChatGPT的發布,讓中國科技公司紛紛跟進,爭相創造自己的AI聊天機器人。
但在中國搜尋引擎巨頭百度發布第一個類似ChatGPT的中文版應用程式後,中國民眾對中美企業在AI能力上的差距普遍感到失望。
DeepSeek模型的品質及成本效益徹底顛覆了此說法。這家中國新創公司表示,DeepSeek-V3和DeepSeek-R1這兩款模型受到矽谷高層及美國科技公司工程師的一致好評,其水準與OpenAI及Meta最先進的模型不相上下。
而且使用DeepSeek也比較便宜。根據DeepSeek微信官方帳號上的一篇文章稱,上月發布的DeepSeek-R1,其使用成本比OpenAI o1模型低20到50倍,視任務而定。
但有些人對DeepSeek的成功故事公開表示懷疑。
Scale AI執行長汪滔1月23日接受CNBC採訪時表示,DeepSeek有五萬個輝達 H100晶片,但他並未提供證據,並聲稱不會揭露這些晶片的下落,因為這會違反華盛頓的出口管制規定,即禁止將此類先進的AI晶片出售給中國公司。對此指控,DeepSeek並未立即回應。
華爾街投資機構伯恩斯坦的分析師1月27日在一份研究報告中強調,DeepSeek的V3模型訓練總成本尚不清楚,但遠高於該新創公司所稱用於算力的558萬美元。分析師也表示,同樣廣受好評的R1模型的訓練成本尚未揭露。
DeepSeek的幕後推手是誰?
DeepSeek是一家位於杭州的新創公司,根據中國公司記錄,其控股股東是量化對沖基金幻方量化的共同創辦人梁文鋒。
2023年3月,梁文鋒的基金在其微信官方帳號上宣布「重新出發」,超越交易,集中資源打造「全新獨立研究團隊,探索AGI(通用人工智慧)的本質」。DeepSeek於同年稍後創立。
開發ChatGPT的OpenAI將AGI定義為:在最具經濟價值的任務中超越人類的自主系統。
目前仍不清楚幻方量化對DeepSeek投資了多少。根據中國公司記錄,幻方量化的辦公室與DeepSeek位於同一棟大樓,並且還擁有訓練AI模型用之晶片群集的相關專利。
幻方量化的AI部門2022年7月在其官方微信上表示,他們所擁有並營運的晶片群集,有一萬個A100晶片。
北京如何看待DeepSeek?
DeepSeek的成功已引起中國高層政界的關注。據新華社報導,1月20日,DeepSeek-R1向公眾發布當天,創始人梁文鋒參加了由中國國務院總理李強主持的一場商人及專家秘密座談會。
梁文鋒出席該會議可能意味,DeepSeek的成功對於北京克服華盛頓的出口管制、實現AI等戰略產業的自給自足的政策目標至關重要。百度執行長李彥宏去年也出席了類似的研討會。
(台北時報林俐凱編譯)