AT&T Labs will start selling speech software that it says is so good at reproducing the sounds, inflections and intonations of a human voice that it can re-create voices and even bring the voices of long-dead celebrities back to life.
The software, which turns printed text into synthesized speech, makes it possible for a company to use recordings of a person's voice to utter new things that the person never said.
The software, called Natural Voices, is not flawless -- its utterances still contain a few robotic tones and unnatural inflections -- and competitors question whether the software is a substantial step up from existing products. But some of those who have tested the technology say it is the first text-to-speech software to raise the specter of voice cloning, replicating a person's voice so perfectly that the human ear cannot tell the difference.
PHOTO: NY TIMES
"If ABC wanted to use Regis Philbin's voice for all of its automated customer-service calls, it could," said Lawrence R. Rabiner, vice president for AT&T Labs.
Potential customers for the software, which is priced in the thousands of dollars, include telephone call centers, companies that make software that reads digital files aloud, and makers of automated voice devices.
James R. Fruchterman, the chief executive of Benetech, a nonprofit organization that uses technology in social-service projects, tested the software along with a dozen people who evaluate technology for blind people, and they said they were impressed.
"Natural Voices gets into the gray area," he said, "where there is plausible deniability that it is a machine."
Rabiner said he is excited about the possibility of resurrecting renowned voices, like that of Harry Caray, the Chicago Cubs announcer who delivered rousing play-by-play broadcasts. "There are probably hours of recordings in archives," he said. Wouldn't it be great, he asked, if Harry Caray's voice could once again be broadcasting in Wrigley Field?
The technology raises several questions. Who, for example, owns the rights to a celebrity's voice? Rabiner predicted that new contracts will be drawn that include voice-licensing clauses.
With computer-generated characters already appearing in place of real ones in some movies, will computer-synthesized voices compete with those of live actors as well?
And although scientists say the technology is not yet good enough to perpetrate fraud, synthesized voices may eventually be capable of tricking people into thinking that they are getting phone calls from people they know.
To build the software that re-creates unique voices -- which AT&T Labs is calling its "custom voice" product -- a person must first go to a studio where engineers record 10 to 40 hours of readings. Texts range from business news reports to nonsense babble. The recordings are then chopped into fragments of sound and sorted into databases. When the software processes a text, it retrieves the fragments and reassembles them to form new sentences.
In the case of long-dead celebrities, archival recordings could be used in the same way.
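The retrieve-and-reassemble step described above can be illustrated with a toy sketch. This is not AT&T's method: the database layout, the function name `synthesize`, and the greedy longest-match rule are all assumptions chosen for illustration, and short lists of numbers stand in for real recorded waveforms.

```python
from typing import Dict, List

# Hypothetical unit database: each key is a text fragment and each value is
# the "audio" recorded for it (fake sample lists stand in for waveforms).
UnitDatabase = Dict[str, List[float]]


def synthesize(text: str, db: UnitDatabase) -> List[float]:
    """Cover `text` left to right with stored fragments and join their audio.

    At each position the longest stored fragment that matches is chosen
    (a simple greedy rule, assumed here for illustration only).
    """
    if not db:
        raise ValueError("unit database is empty")
    longest = max(len(fragment) for fragment in db)
    samples: List[float] = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, then progressively shorter ones.
        for length in range(min(longest, len(text) - i), 0, -1):
            fragment = text[i:i + length]
            if fragment in db:
                samples.extend(db[fragment])
                i += length
                break
        else:
            # No recorded fragment starts at this position.
            raise KeyError(f"no recorded unit covers {text[i]!r}")
    return samples


if __name__ == "__main__":
    demo_db = {"hel": [0.6], "l": [0.4], "o": [0.5]}
    print(synthesize("hello", demo_db))
```

A production system would also have to choose among many recordings of the same sound and smooth the joins between fragments, which this sketch does not attempt.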
Other companies and research centers, like IBM Research and Lernout and Hauspie, are also experimenting with this technique -- which is called concatenative speech synthesis -- to improve the quality of text-to-speech software. It is a big step up, engineers say, from the speech engines that were built from whole pre-recorded words. And it is also a vast improvement, some say, over the entirely computer-generated and therefore robotic sounds that are used in many versions of text-to-speech software on the market today.
Now aided by the declining cost and increasing speed of microprocessors, far smoother sentences are possible, Rabiner said. He said that the speech team at AT&T Labs, led by Juergen Schroeter, an expert in speech synthesis, had created a more refined form of the concatenative technique by breaking a person's voice into "the smallest number of units possible."
A demonstration of the technology will be available on the Web beginning Tuesday at www.naturalvoices.att.com, said Michael Dickman, a spokesman for AT&T Labs.
Still, many engineers are skeptical of claims of a completely simulated voice that is almost indistinguishable from that of a human.
"The methods and algorithms that we know of, they still need a lot more work," said P.S. Gopalakrishnan, the manager of the pervasive speech technologies group at IBM Research, which competes with AT&T Labs in the field.
Now the pressure is on to perfect the technology. Analysts at McKinsey & Co have predicted that the market for text-to-speech software will reach more than US$1 billion in the next five years. In addition to customers like call centers and manufacturers of automated voice systems, the software could also be used by publishers of video games and books-on-tape, and by automobile manufacturers whose cars are equipped with software that gives driving directions. Engineers have said they expect that, in the near future, people will want high-end speech technology that enables them to interact at length with their cell phones and Palm organizers, instead of typing on and squinting at a tiny screen.