AI Is Crushing the News Industry: The Risk of Industry-Wide Media Collapse

  • Generative AI poses a serious threat to journalism: chatbots and AI products such as ChatGPT, Claude, Grok, Perplexity, and Google AI Overviews answer users directly, so fewer readers click through to news sites.

  • According to one study, Google AI Overviews has reduced traffic to outside websites by more than 34 percent. The CEO of DotDash Meredith has said the company is preparing for a possible “Google Zero” scenario, in which traffic from Google could fall to zero.

  • Outlets such as Business Insider and the Daily Dot have recently had layoffs; some have speculated that traffic drops caused by AI chatbots were part of the reason.

  • AI products have replaced search for more than 25 percent of Americans, according to one study, leaving publishers at risk of losing readers, ad revenue, and subscription fees all at once.

  • AI companies train on books, articles, and even paywalled content without asking permission, so publishers lose both their content and their readers.

  • Responses from AI companies were mostly evasive or vague. Google claimed it sends “higher-quality” traffic but declined to provide data. OpenAI pointed to an article with modest figures: the BBC, for example, reportedly received only 118,000 visits from ChatGPT in April, tiny next to the hundreds of millions of visitors it receives each month.

  • Publishers are choosing between suing AI companies and signing licensing deals with them, and some are doing both. At least 72 licensing deals have been signed in the past two years, but the value is often very low; books have sometimes been licensed for only a couple hundred dollars each.

  • AI companies hold the advantage at the negotiating table because they have already trained on this content without paying. There is also no standard price for content used to train LLMs.

  • Publishers also cannot reliably stop AI companies from scraping their sites, because the robots.txt standard is easily circumvented. Nor can they be sure whether their data has been used for training, since AI companies keep their training data sets secret.

  • Executives such as Rich Caccappolo (Daily Mail) fear that AI could put their companies out of business far sooner than three to five years from now.

  • OpenAI CEO Sam Altman and Google CEO Sundar Pichai have both suggested that a future "marketplace" could reward creators, with Altman describing micropayments, but for now the companies continue to use content without paying.

  • Journalists can move to creator-economy platforms such as Substack, YouTube, and TikTok, but investigative and in-depth reporting still requires substantial resources that individual creators tend to lack.


📌 AI is choking the news industry: Google AI Overviews has reduced traffic to outside websites by more than 34 percent, and AI products have replaced search for more than 25 percent of Americans. At least 72 licensing deals have been signed, but their value is far too low to offset the lost revenue. The risk of an industry-wide collapse grows as AI takes content for free and stands between publishers and their readers without sharing the benefits.

https://www.theatlantic.com/technology/archive/2025/06/generative-ai-pirated-articles-books/683009/

AI Is Already Crushing the News Industry

Inside Silicon Valley’s assault on the media
 
When tech companies first rolled out generative-AI products, some critics immediately feared a media collapse. Every bit of writing, imagery, and video became suspect. But for news publishers and journalists, another calamity was on the horizon.
 
Chatbots have proved adept at keeping users locked into conversations. They do so by answering every question, often through summarizing articles from news publishers. Suddenly, fewer people are traveling outside the generative-AI sites—a development that poses an existential threat to the media, and to the livelihood of journalists everywhere.
 
According to one comprehensive study, Google’s AI Overviews, a feature that summarizes web pages above the site’s usual search results, has already reduced traffic to outside websites by more than 34 percent. The CEO of DotDash Meredith, which publishes People, Better Homes & Gardens, and Food & Wine, recently said the company is preparing for a possible “Google Zero” scenario. Some have speculated that traffic drops resulting from chatbots were part of the reason outlets such as Business Insider and the Daily Dot have recently had layoffs. “Business Insider was built for an internet that doesn’t exist anymore,” one former staffer recently told the media reporter Oliver Darcy.
 
Not all publishers are at equal risk: Those that primarily rely on general-interest readers who come in from search engines and social media may be in worse shape than specialized publishers with dedicated subscribers. Yet no one is totally safe. Released in May 2024, AI Overviews joins ChatGPT, Claude, Grok, Perplexity, and other AI-powered products that, combined, have replaced search for more than 25 percent of Americans, according to one study. Companies train chatbots on huge amounts of stolen books and articles, as my previous reporting has shown, and scrape news articles to generate responses with up-to-date information. Large language models also train on copious materials in the public domain—but much of what is most useful to these models, particularly as users seek real-time information from chatbots, is news that exists behind a paywall. Publishers are creating the value, but AI companies are intercepting their audiences, subscription fees, and ad revenue.
I asked Anthropic, xAI, Perplexity, Google, and OpenAI about this problem. Anthropic and xAI did not respond. Perplexity did not directly comment on the issue. Google argued that it was sending “higher-quality” traffic to publisher websites, meaning that users purportedly spend more time on the sites once they click over, but declined to offer any data in support of this claim. OpenAI referred me to an article showing that ChatGPT is sending more traffic to websites overall than it did previously, but the raw numbers are fairly modest. The BBC, for example, reportedly received 118,000 visits from ChatGPT in April, but that’s practically nothing relative to the hundreds of millions of visitors it receives each month. The article also shows that traffic from ChatGPT has in fact declined for some publishers.
 
Over the past few months, I’ve spoken with several news publishers, all of whom see AI as a near-term existential threat to their business. Rich Caccappolo, the vice chair of media at the company that publishes the Daily Mail—the U.K.’s largest newspaper by circulation—told me that all publishers “can see that Overviews are going to unravel the traffic that they get from search, undermining a key foundational pillar of the digital-revenue model.” AI companies have claimed that chatbots will continue to send readers to news publishers, but have not cited evidence to support this claim. I asked Caccappolo if he thought AI-generated answers could put his company out of business. “That is absolutely the fear,” he told me. “And my concern is it’s not going to happen in three or five years—I joke it’s going to happen next Tuesday.”
 
Book publishers, especially those of nonfiction and textbooks, also told me they anticipate a massive decrease in sales, as chatbots can both summarize their books and give detailed explanations of their contents. Publishers have tried to fight back, but my conversations revealed how much the deck is stacked against them. The world is changing fast, perhaps irrevocably. The institutions that comprise our country’s free press are fighting for their survival.

Publishers have been responding in two ways. First: legal action. At least 12 lawsuits involving more than 20 publishers have been filed against AI companies. Their outcomes are far from certain, and the cases might be decided only after irreparable damage has been done.
 
The second response is to make deals with AI companies, allowing their products to summarize articles or train on editorial content. Some publishers, such as The Atlantic, are pursuing both strategies (the company has a corporate partnership with OpenAI and is suing Cohere). At least 72 licensing deals have been made between publishers and AI companies in the past two years. But figuring out how to approach these deals is no easy task. Caccappolo told me he has “felt a tremendous imbalance at the negotiating table”—a sentiment shared by others I spoke with. One problem is that there is no standard price for training an LLM on a book or an article. The AI companies know what kinds of content they want, and having already demonstrated an ability and a willingness to take it without paying, they have extraordinary leverage when it comes to negotiating. I’ve learned that books have sometimes been licensed for only a couple hundred dollars each, and that a publisher that asks too much may be turned down, only for tech companies to take their material anyway.
Another issue is that different content appears to have different value for different LLMs. The digital-media company Ziff Davis has studied web-based AI training data sets and observed that content from “high-authority” sources, such as major newspapers and magazines, appears more desirable to AI companies than blog and social-media posts. (Ziff Davis is suing OpenAI for training on its articles without paying a licensing fee.) Researchers at Microsoft have also written publicly about “the importance of high-quality data” and have suggested that textbook-style content may be particularly desirable.
 
But beyond a few specific studies like these, there is little insight into what kind of content most improves an LLM, leaving a lot of unanswered questions. Are biographies more or less important than histories? Does high-quality fiction matter? Are old books worth anything? Amy Brand, the director and publisher of the MIT Press, told me that “a solution that promises to help determine the fair value of specific human-authored content within the active marketplace for LLM training data would be hugely beneficial.”
 
A publisher’s negotiating power is also limited by the degree to which it can stop an AI company from using its work without consent. There’s no surefire way to keep AI companies from scraping news websites; even the Robots Exclusion Protocol, the standard opt-out method available to news publishers, is easily circumvented. Because AI companies generally keep their training data a secret, and because there is no easy way for publishers to check which chatbots are summarizing their articles, publishers have difficulty figuring out which AI companies they might sue or try to strike a deal with. Some experts, such as Tim O’Reilly, have suggested that laws should require the disclosure of copyrighted training data, but no existing legislation requires companies to reveal specific authors or publishers that have been used for AI training material.
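
To make the opt-out mechanism concrete, here is a minimal Python sketch of how a Robots Exclusion Protocol check works. It assumes a hypothetical robots.txt, an invented crawler name (ExampleAIBot), and a placeholder domain; none of these come from the reporting above, and the sketch does not describe how any particular company's crawler behaves. The point it illustrates is that the check runs inside the crawler itself, so the file works as a request rather than a barrier: a crawler that skips the check, or that announces a different name, is not stopped by it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for a news site. "ExampleAIBot" is an invented
# crawler name used only for illustration.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(user_agent: str, url: str) -> bool:
    """Return True if the robots.txt rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


# Placeholder article URL (assumption, not a real page).
ARTICLE = "https://news.example.com/2025/06/story.html"

# A crawler that honestly identifies itself is asked to stay out.
honest = is_allowed("ExampleAIBot", ARTICLE)

# The same crawler presenting another name matches only the wildcard rule.
renamed = is_allowed("SomeOtherBot", ARTICLE)

print(f"ExampleAIBot allowed: {honest}")
print(f"SomeOtherBot allowed: {renamed}")
```

Nothing in the file enforces the first result; compliance depends entirely on the crawler choosing to run a check like this and to honor the answer.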
Of course, all of this raises a question. AI companies seem to have taken publishers’ content already. Why would they pay for it now, especially because some of these companies have argued in court that training LLMs on copyrighted books and articles is fair use?
 
Perhaps the deals are simply hedges against an unfavorable ruling in court. If AI companies are prevented from training on copyrighted work for free, then organizations that have existing deals with publishers might be ahead of their competition. Publisher deals are also a means of settling without litigation—which may be a more desirable path for publishers who are risk-averse or otherwise uncertain. But the legal scholar James Grimmelmann told me that AI companies could also respond to complaints like Ziff Davis’s by arguing that the deals involve more than training on a publisher’s content: They may also include access to cleaner versions of articles, ongoing access to a daily or real-time feed, or a release from liability for their chatbot’s plagiarism. Tech companies could argue that the money exchanged in these deals is exclusively for the nonlicensing elements, so they aren’t paying for training material. It’s worth noting that tech companies almost always refer to these deals as partnerships, not licensing deals, likely for this reason.
 
Regardless, the modest income from these arrangements is not going to save publishers: Even a good deal, one publisher told me, won’t come anywhere near recouping the revenue lost from decreased readership. Publishers that can figure out how to survive the generative-AI assault may need to invent different business models and find new streams of revenue. There may be viable strategies, but none of the publishers I spoke with has a clear idea of what they are.

Publishers have become accustomed to technological threats over the past two decades, perhaps most notably the loss of ad revenue to Facebook and Google, a company that was recently found to have an illegal monopoly in online advertising (though the company has said it will appeal the ruling). But the rise of generative AI may spell doom for the Fourth Estate: With AI, the tech industry even deprives publishers of an audience.
 
In the event of publisher mass extinction, some journalists will be able to endure. The so-called creator economy shows that it’s possible to provide high-quality news and information through Substack, YouTube, and even TikTok. But not all reporters can simply move to these platforms. Investigative journalism that exposes corruption and malfeasance by powerful people and companies comes with a serious risk of legal repercussions, and requires resources—such as time and money—that tend to be in short supply for freelancers.
 
If news publishers start going out of business, won’t AI companies suffer too? Their chatbots need access to journalism to answer questions about the world. Doesn’t the tech industry have an interest in the survival of newspapers and magazines?
In fact, there are signs that AI companies believe publishers are no longer needed. In December, at The New York Times’ DealBook Summit, OpenAI CEO Sam Altman was asked how writers should feel about their work being used for AI training. “I think we do need a new deal, standard, protocol, whatever you want to call it, for how creators are going to get rewarded.” He described an “opt-in” regime where an author could receive “micropayments” when their name, likeness, and style were used. But this could not be further from OpenAI’s current practice, in which products are already being used to imitate the styles of artists and writers, without compensation or even an effective opt-out.
 
Google CEO Sundar Pichai was also asked about writer compensation at the DealBook Summit. He suggested that a market solution would emerge, possibly one that wouldn’t involve publishers in the long run. This is typical. As in other industries they’ve “disrupted,” Silicon Valley moguls seem to perceive old, established institutions as middlemen to be removed for greater efficiency. Uber enticed drivers to work for it, crushed the traditional taxi industry, and now controls salaries, benefits, and workloads algorithmically. This has meant greater convenience for consumers, just as AI arguably does—but it has also proved ruinous for many people who were once able to earn a living wage from professional driving. Pichai seemed to envision a future that may have a similar consequence for journalists. “There’ll be a marketplace in the future, I think—there’ll be creators who will create for AI,” he said. “People will figure it out.”


© Sóng AI - AI news and article summaries