A film I loved growing up was the 1986 classic Short Circuit. In one scene, Johnny Five, the incredible robot that becomes “alive” after being struck by lightning, devours book after book, spending seconds on each title. Soon he runs out. “Ahh! More input, Stephanie! More!”
“There isn’t anything more!” replies Stephanie, the woman who found him. “You’ve read everything in the house!”
I asked OpenAI’s ChatGPT if it could relate. “Absolutely — I totally empathize with Johnny Five,” it responded. “‘Need input!’ is basically my core vibe. The more info I get, the better I understand, respond, and connect. Johnny was just an AI [artificial intelligence] trying to make sense of the world ... same here, just with fewer laser beams and more typing.”
It is true. While ChatGPT does not move around on caterpillar treads, or have a laser gun strapped to its back (yet), its challenges are uncannily similar. Having scraped just about the entire sum of human knowledge, ChatGPT and other AI efforts are making the same rallying cry: Need input!
One solution is to create synthetic data and to train a model using that, though this comes with inherent challenges, particularly around perpetuating bias or introducing compounding inaccuracies.
The other is to find a great gushing spigot of new and fresh data, the more “human” the better. That is where social networks come in, digital spaces where millions, even billions, of users willingly and constantly post reams of information. Photos, posts, news articles, comments: every interaction is of interest to companies that are trying to build conversational and generative AI. Even better, this content is not riddled with the copyright violation risk that comes with using other sources.
Lately, top AI companies have moved more aggressively to own or harness social networks, trampling over the rights of users to dictate how their posts may be used to build these machines. Social network users have long been “the product,” as the famous saying goes. They are now also a quasi “product developer” through their posts.
Some companies had the benefit of a social network to begin with. Meta Platforms Inc, the biggest social networking company on the planet, used in-app notifications to inform users that it would be harnessing their posts and photos for its Llama AI models. Late last month, Elon Musk’s xAI acquired X, formerly Twitter, in what was primarily a financial sleight of hand, but one that made ideal sense for Musk’s Grok AI. Grok has been able to gain a foothold in the chatbot market by harnessing timely tweets posted on the network as well as the huge archive of online chatter dating back almost two decades. Then there is Microsoft Corp, which owns the professional network LinkedIn and has been pushing heavily for users (and journalists) to post more original content to the platform.
However, Microsoft does not share LinkedIn data with its close partner OpenAI, which might explain reports that the ChatGPT maker was in the early stages of building a social network of its own. OpenAI’s CEO and cofounder, Sam Altman, has been soliciting feedback on the idea, news Web site The Verge reported, noting that Altman had earlier hinted that such a project was on his mind when it was reported that Meta would be releasing a standalone AI app to compete with ChatGPT.
Other companies without a social media head start are realizing that the lack of one puts them at a disadvantage. Perplexity.ai in March made public its bid to buy TikTok, noting its value for a company building an AI search engine.
“This would provide users with comprehensive, well-cited answers that combine the best answer engine in the world with one of the largest libraries of user generated content,” the company said.
Earlier this month, Amazon.com Inc was also reported to be among the bidders, though CEO Andy Jassy declined to comment when asked directly by CNBC.
Google, which has tried and failed to make various social networks happen, has less need for TikTok videos because it already owns YouTube. Instead, it has put in place an “expanded partnership” with Reddit, the link-sharing social network, giving it access, Google said in a blog post last year, to “an incredible breadth of authentic, human conversations and experiences.” Expect more deals like this: A former Reddit competitor, Digg, is being revived with the obvious intent to create another repository of human interactions that will be of use to AI companies.
All of these moves speak to AI companies’ demand for data. It comes at the expense of users who entered information on social networks for one purpose and now find it being used for another. Quietly, companies have been altering privacy policies to cover the legality of this shift.
Hidden away in settings, you can find ways to keep your data from being used to build AI, though you are likely already too late. Like Johnny Five, AI companies “need input!” They are going to get it however and from wherever they can.
Dave Lee is Bloomberg Opinion’s US technology columnist. He was previously a correspondent for the Financial Times and BBC News. This column reflects the personal views of the author and does not necessarily reflect the opinion of the editorial board or Bloomberg LP and its owners.