Cloudflare Introduces Blocking of A.I. Scrapers By Default

The tech company’s customers can automatically block A.I. companies from exploiting their websites, it said, as it moves to protect original content online.

Matthew Prince, the chief executive of Cloudflare, said he was “deeply concerned that the incentives for content creation are dead.” Credit: Jason Henry for The New York Times

By Natallie Rocha

Reporting from San Francisco

July 1, 2025, 6:00 a.m. ET

Cloudflare, a tech company that helps websites secure and manage their internet traffic, said on Tuesday that it had rolled out a new permission-based setting that allows customers to automatically block artificial intelligence companies from collecting their digital data, a move that has implications for publishers and the race to build A.I.

With Cloudflare’s new setting, websites can block — by default — online bots that scrape their data, requiring the website owner to grant access for a bot to collect the content, the company said. In the past, those whom Cloudflare did not flag as a hacker or malicious actor could get through to a website to gather its information.

“We’re changing the rules of the internet across all of Cloudflare,” said Matthew Prince, the chief executive of the company, which provides tools that protect websites from cyberattacks and helps them load content more efficiently. “If you’re a robot, now you have to go on the toll road in order to get the content of all of these publishers.”

Cloudflare is making the change to protect original content on the internet, Mr. Prince said. If A.I. companies freely use data from various websites without permission or payment, people will be discouraged from creating new digital content, he said. The company, which says its network of servers handles about 20 percent of internet traffic, has seen a sharp increase in A.I. data crawlers on the web.

Data for A.I. systems has become an increasingly contentious issue. OpenAI, Anthropic, Google and other companies building A.I. systems have amassed reams of information from across the internet to train their A.I. models. High-quality data is particularly prized because it helps A.I. models become more proficient in generating accurate answers, videos and images.

But website publishers, authors, news organizations and other content creators have accused A.I. companies of using their material without permission and payment. Last month, Reddit sued Anthropic, saying the start-up had unlawfully used the data of its more than 100 million daily users to train its A.I. systems. In 2023, The New York Times sued OpenAI and its partner, Microsoft, accusing them of copyright infringement of news content related to A.I. systems. OpenAI and Microsoft have denied those claims.

