Ant Group Co (螞蟻集團), which is backed by Alibaba Group Holding Ltd (阿里巴巴) cofounder Jack Ma (馬雲), used Chinese-made semiconductors to develop techniques for training artificial intelligence (AI) models that would cut costs by 20 percent, people familiar with the matter said.
Ant used domestic chips, including from Alibaba and Huawei Technologies Co (華為), to train models using the so-called “mixture of experts” machine learning approach, the people said.
It got results similar to those from Nvidia Corp chips, such as the H800, they said.
Hangzhou-based Ant is still using Nvidia chips for AI development, but is now relying mostly on alternatives, including chips from Advanced Micro Devices Inc and Chinese makers, for its latest models, one of the people said.
The models mark Ant’s entry into a race between Chinese and US companies that has accelerated since DeepSeek (深度求索) demonstrated how capable models can be trained for far less than the billions invested by OpenAI and Alphabet Inc’s Google.
The effort underscores how Chinese companies are trying to use local alternatives to the most advanced Nvidia semiconductors. While not Nvidia's most advanced chip, the H800 is a relatively powerful processor, and the US bars its export to China.
The company published a research paper this month that said its models at times outperformed those of Meta Platforms Inc in certain benchmarks, claims that have not been independently verified.
However, if they work as advertised, Ant’s platforms could mark another step forward for Chinese AI development by slashing the cost of inferencing or supporting AI services.
Ant said it cost about 6.35 million yuan (US$875,952) to train on 1 trillion tokens using high-performance hardware, but its optimized approach would cut that to 5.1 million yuan using lower-specification hardware.
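The cost figures above can be checked with a few lines of arithmetic. The sketch below uses only the numbers quoted in this report. The function and variable names are illustrative and do not come from Ant's paper, and the exchange rate is merely the one implied by the report's own US dollar conversion.

```python
# Figures quoted in the report: cost to train on 1 trillion tokens.
HIGH_PERF_COST_YUAN = 6_350_000   # high-performance hardware
OPTIMIZED_COST_YUAN = 5_100_000   # lower-specification hardware, optimized approach
HIGH_PERF_COST_USD = 875_952      # the report's conversion of the first figure


def percent_saving(baseline: float, optimized: float) -> float:
    """Return the cost reduction as a percentage of the baseline."""
    return (baseline - optimized) / baseline * 100


saving = percent_saving(HIGH_PERF_COST_YUAN, OPTIMIZED_COST_YUAN)

# Exchange rate implied by the report's own figures (an inference, not a quoted rate).
implied_yuan_per_usd = HIGH_PERF_COST_YUAN / HIGH_PERF_COST_USD

print(f"Saving: {saving:.1f} percent")
print(f"Implied rate: {implied_yuan_per_usd:.2f} yuan per US$")
```

The saving comes to a little under 20 percent, in line with the reduction the people familiar with the matter described.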
Tokens are the fundamental units of text — such as words, characters or parts of words — that a language model breaks down and analyzes to understand context, meaning and structure.
In essence, they are the building blocks that enable the model to interpret human language and produce intelligent output.
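A rough illustration of how text is broken into tokens is sketched below. It is a toy splitter written for this example, not the tokenizer used by Ant or any other company. Production language models typically rely on learned subword vocabularies, so their token counts would differ from what this function produces.

```python
import re


def toy_tokenize(text: str) -> list[str]:
    """Split text into word tokens and single punctuation tokens.

    This is a simplified stand-in for a real tokenizer (an assumption
    made for illustration only).
    """
    return re.findall(r"\w+|[^\w\s]", text)


tokens = toy_tokenize("Ant trained its models on domestic chips.")
print(tokens)
print(len(tokens))
```

Here each word and the final period count as one token, so the sentence is represented by eight units.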
The company plans to leverage the recent breakthrough in the large language models it has developed, Ling-Plus and Ling-Lite, for industrial AI solutions including healthcare and finance, the people said.
On English-language understanding, Ant in its paper said that the Ling-Lite model did better in a key benchmark compared with one of Meta’s Llama models.
The Ling-Lite and Ling-Plus models also outperformed DeepSeek’s equivalents on Chinese-language benchmarks.
Ant has made the Ling models open-source. Ling-Lite contains 16.8 billion parameters, which are adjustable settings that work like knobs and dials to direct the model’s performance.
Ling-Plus has 290 billion parameters, which is considered relatively large in the realm of language models. For comparison, experts estimate that ChatGPT’s GPT-4.5 has 1.8 trillion parameters, MIT Technology Review said. DeepSeek-R1 has 671 billion.
The company faced challenges in some areas of the training, including stability.
Even small changes in the hardware or the model’s structure led to problems, including jumps in the models’ error rate, it said in the paper.
Additional reporting by staff writer