Alibaba Group Holding Ltd (阿里巴巴) cofounder Jack Ma (馬雲)-backed Ant Group Co (螞蟻集團) used Chinese-made semiconductors to develop techniques for training artificial intelligence (AI) models that would cut costs by 20 percent, people familiar with the matter said.
Ant used domestic chips, including ones from Alibaba and Huawei Technologies Co (華為), to train models using the so-called “mixture of experts” machine learning approach, the people said.
It got results similar to those from Nvidia Corp chips, such as the H800, they said.
Hangzhou-based Ant is still using Nvidia chips for AI development, but is now relying mostly on alternatives, including chips from Advanced Micro Devices Inc and Chinese makers, for its latest models, one of the people said.
The models mark Ant’s entry into a race between Chinese and US companies that has accelerated since DeepSeek (深度求索) demonstrated how capable models can be trained for far less than the billions invested by OpenAI and Alphabet Inc’s Google.
The effort underscores how Chinese companies are trying to use local alternatives to the most advanced Nvidia semiconductors. While not the most advanced, the H800 is a relatively powerful processor, and the US bars its sale to China.
The company published a research paper this month that said its models at times outperformed those of Meta Platforms Inc in certain benchmarks, a claim that has not been independently verified.
However, if they work as advertised, Ant’s platforms could mark another step forward for Chinese AI development by slashing the cost of inferencing or supporting AI services.
Ant said it cost about 6.35 million yuan (US$875,952) to train on 1 trillion tokens using high-performance hardware, but its optimized approach would cut that to 5.1 million yuan using lower-specification hardware.
Tokens are the fundamental units of text — such as words, characters or parts of words — that a language model breaks down and analyzes to understand context, meaning and structure.
In essence, they are the building blocks that enable the model to interpret human language and produce intelligent output.
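The cost figures Ant cited can be checked with simple arithmetic. The short Python sketch below uses only the numbers reported above; the variable names are illustrative, and the exchange rate is merely the one implied by the article's own yuan-to-dollar conversion, not an official rate.

```python
# Figures as reported in the article: cost in million yuan
# to train on 1 trillion tokens.
high_spec_cost = 6.35   # high-performance hardware
optimized_cost = 5.1    # optimized approach on lower-specification hardware

# Absolute and relative saving.
saving = high_spec_cost - optimized_cost
saving_pct = saving / high_spec_cost * 100

# Exchange rate implied by "6.35 million yuan (US$875,952)".
# This is an inference from the article's conversion, not an official rate.
implied_yuan_per_usd = 6_350_000 / 875_952

print(f"Saving: {saving:.2f} million yuan ({saving_pct:.1f}%)")
print(f"Implied exchange rate: {implied_yuan_per_usd:.2f} yuan per US dollar")
```

The saving works out to 1.25 million yuan, or just under 20 percent, which is consistent with the roughly 20 percent cost reduction the people familiar with the matter described.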
The company plans to leverage the recent breakthrough in the large language models it has developed, Ling-Plus and Ling-Lite, for industrial AI solutions including healthcare and finance, the people said.
On English-language understanding, Ant in its paper said that the Ling-Lite model did better in a key benchmark compared with one of Meta’s Llama models.
Ling-Lite and Ling-Plus models outperformed DeepSeek’s equivalents on Chinese-language benchmarks.
Ant has made the Ling models open-source. Ling-Lite contains 16.8 billion parameters, which are adjustable settings that work like knobs and dials to direct the model’s performance.
Ling-Plus has 290 billion parameters, which is considered relatively large in the realm of language models. For comparison, experts estimate that ChatGPT’s GPT-4.5 has 1.8 trillion parameters, MIT Technology Review said. DeepSeek-R1 has 671 billion.
The company faced challenges in some areas of the training, including stability.
Even small changes in the hardware or the model’s structure led to problems, including jumps in the models’ error rate, it said in the paper.
Additional reporting by staff writer