As artificial intelligence (AI) sweeps across the world, many countries have begun to develop their own large language models, while various industries are evaluating the impact of AI usage on their business. Against this background, Academia Sinica recently released a beta version of its large language model-based chatbot, CKIP-Llama-2-7b, hoping to gather public feedback and to make the model a starting point for local AI chatbots so that Taiwan would not fall behind other countries.
However, the top-level research institute last week removed the traditional Chinese-language AI chatbot from its Web site after testing by netizens turned up disturbing answers to basic questions, such as those about Taiwan’s national day, anthem and leader. Netizens also found that the model’s output was not localized enough and used inappropriate wording, as its datasets had been provided by several Chinese research institutes and its dialogue training materials had been compiled in simplified Chinese characters.
Academia Sinica admitted the mistake, saying the questionable responses to netizens’ queries were due to the researchers’ use of Chinese datasets in CKIP-Llama-2-7b. To save time in developing a chat AI, the researchers simply converted the datasets from simplified Chinese into traditional Chinese characters and put the model online for crowdsourced testing, Academia Sinica explained, adding that it had learned a lesson from the incident and would set up a special task force to avoid repeating the mistakes.
Clearly, it is necessary to establish a Taiwan-based large language model using datasets collected locally; otherwise, the content generated by a chat AI would be open to dispute on certain issues. Take CKIP-Llama-2-7b as an example: Academia Sinica had claimed its model could be used for academic and commercial purposes, copywriting, literary creation and question-and-answer systems, as well as customer service, language translation, text editing and teaching Chinese. But without datasets drawn from local language examples that reflect a Taiwanese context, any homegrown AI would be hard pressed to achieve its expected goals and might produce content ill-suited to local users, some AI experts have warned.
This raises the question of whether Taiwan is determined to develop local language datasets and large language models. Granted, doing so is extremely expensive, not only in money, but also in time and in the massive computing power required, and it challenges the government to draft the required budget proposal and obtain approval from legislators. It is also very unlikely that private enterprises would invest more resources in software development on top of hardware upgrades. Nevertheless, local datasets are a fundamental requirement, since large language model AIs need to be trained on massive amounts of data.
In addition to the localization issue, users of chat AIs, whether OpenAI’s ChatGPT or Google’s Bard, face a common problem: inaccuracy and factual errors. Until there is a major breakthrough in AI technology, users’ judgement and knowledge are essential to detect and counter AI bias, which derives from the algorithms’ tendency to reflect the national, cultural and ideological biases of their creators. Take AI-assisted teaching in Taiwan’s classrooms as an example: Teachers’ ability to judge and correct questionable AI-generated content is the key to using the technology successfully, and it is what allows the technology to boost teaching efficiency.
The CKIP-Llama-2-7b incident serves as a reminder that the large-scale use of chat AI has national security implications that the government must address urgently. Moreover, this powerful tool has fundamental flaws of its own, which require users to exercise the utmost discretion based on their own expertise and judgement, rather than placing blind trust in it.