Ant Group Co (螞蟻集團), which is backed by Alibaba Group Holding Ltd (阿里巴巴) cofounder Jack Ma (馬雲), used Chinese-made semiconductors to develop techniques for training artificial intelligence (AI) models that would cut costs by 20 percent, people familiar with the matter said.
Ant used domestic chips, including from Alibaba and Huawei Technologies Co (華為), to train models using the so-called “mixture of experts” machine learning approach, the people said.
It got results similar to those from Nvidia Corp chips, such as the H800, they said.
Hangzhou-based Ant is still using Nvidia for AI development, but is now relying mostly on alternatives, including chips from Advanced Micro Devices Inc and Chinese makers, for its latest models, one of the people said.
The models mark Ant’s entry into a race between Chinese and US companies that has accelerated since DeepSeek (深度求索) demonstrated how capable models can be trained for far less than the billions invested by OpenAI and Alphabet Inc’s Google.
It underscores how Chinese companies are trying to use local alternatives to the most advanced Nvidia semiconductors. While not the most advanced, the H800 is a relatively powerful processor and is barred by the US from export to China.
The company published a research paper this month that said its models at times outperformed those of Meta Platforms Inc on certain benchmarks, a claim that has not been independently verified.
However, if they work as advertised, Ant’s platforms could mark another step forward for Chinese AI development by slashing the cost of inferencing or supporting AI services.
Ant said it cost about 6.35 million yuan (US$875,952) to train 1 trillion tokens using high-performance hardware, but its optimized approach would cut that down to 5.1 million yuan using lower-specification hardware.
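The figures above can be checked with a few lines of arithmetic. The sketch below is illustrative only: the variable names are hypothetical, and the exchange rate is simply the one implied by the yuan and US dollar figures quoted above, not an official rate.

```python
# Training-cost figures quoted in the article (yuan per 1 trillion tokens).
baseline_yuan = 6_350_000   # high-performance hardware
optimized_yuan = 5_100_000  # lower-specification hardware, optimized approach

saving_yuan = baseline_yuan - optimized_yuan
saving_pct = saving_yuan / baseline_yuan * 100

# Assumption: exchange rate implied by the article's US$875,952 conversion.
implied_rate = baseline_yuan / 875_952  # yuan per US dollar
optimized_usd = optimized_yuan / implied_rate

print(f"Saving: {saving_yuan:,} yuan ({saving_pct:.1f} percent)")
print(f"Optimized cost: about US${optimized_usd:,.0f}")
```

The saving works out to 1.25 million yuan, or about 19.7 percent, which is consistent with the roughly 20 percent reduction cited by the people familiar with the matter.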
Tokens are the fundamental units of text — such as words, characters or parts of words — that a language model breaks down and analyzes to understand context, meaning and structure.
In essence, they are the building blocks that enable the model to interpret human language and produce intelligent output.
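As a rough illustration of the idea, the toy sketch below splits a sentence into word-level and character-level tokens. It is an assumption-laden simplification: production language models typically use learned subword vocabularies, and nothing here reflects the tokenizer Ant actually uses.

```python
import re

text = "Ant trains models"

# Word-level tokens: each run of non-whitespace characters.
word_tokens = re.findall(r"\S+", text)

# Character-level tokens: every character, spaces included.
char_tokens = list(text)

print(word_tokens)
print(len(char_tokens))
```

The same sentence yields three tokens at the word level and 17 at the character level, which is why token counts, and the costs tied to them, depend on how text is split.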
The company plans to leverage the recent breakthrough in the large language models it has developed, Ling-Plus and Ling-Lite, for industrial AI solutions including healthcare and finance, the people said.
On English-language understanding, Ant in its paper said that the Ling-Lite model did better in a key benchmark compared with one of Meta’s Llama models.
Ling-Lite and Ling-Plus models outperformed DeepSeek’s equivalents on Chinese-language benchmarks.
Ant has made the Ling models open-source. Ling-Lite contains 16.8 billion parameters, which are adjustable settings that work like knobs and dials to direct the model’s performance.
Ling-Plus has 290 billion parameters, which is considered relatively large in the realm of language models. For comparison, experts estimate that ChatGPT’s GPT-4.5 has 1.8 trillion parameters, MIT Technology Review said. DeepSeek-R1 has 671 billion.
The company faced challenges in some areas of the training, including stability.
Even small changes in the hardware or the model’s structure led to problems, including jumps in the models’ error rate, it said in the paper.
Additional reporting by staff writer