Physical Address
304 North Cardinal St.
Dorchester Center, MA 02124
When Cloudflare accused AI search engine Perplexity of stealthily scraping websites on Monday, while ignoring a site's specific methods to block it, this was not a clear-cut case of an AI web crawler gone wild.
Many people came to Perplexity's defense. They argued that Perplexity accessing sites in defiance of the website owner's wishes, while controversial, is acceptable. And this is a controversy that will certainly grow as AI agents flood the internet: Should an agent accessing a website on behalf of its user be treated like a bot? Or like a human making the same request?
Cloudflare is known for providing anti-bot crawling and other web security services to millions of websites. Essentially, Cloudflare's test case involved setting up a new website with a new domain that had never been crawled by any bot, creating a robots.txt file that specifically blocked Perplexity's known AI crawling bots, and then asking Perplexity about the website's content. And Perplexity answered the question.
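The mechanics of that test can be illustrated with a short sketch. The Python below uses the standard library's `urllib.robotparser` to show how a crawler that honors robots.txt would evaluate such a file. The file contents, the user-agent tokens `PerplexityBot` and `Perplexity-User`, the Chrome user-agent string, and the `is_allowed` helper are all assumptions made for illustration, not the actual file or code used in Cloudflare's test.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that blocks two named AI bots and allows everyone else.
# The bot names are assumptions for illustration.
ROBOTS_TXT = """\
User-agent: PerplexityBot
Disallow: /

User-agent: Perplexity-User
Disallow: /

User-agent: *
Allow: /
"""

# Illustrative browser-style user agent (generic Chrome on macOS).
GENERIC_CHROME_UA = (
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) "
    "AppleWebKit/537.36 (KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36"
)


def is_allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if robots.txt permits this user agent to fetch the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "https://example.com/page.html"
    print(is_allowed("PerplexityBot", url))    # False: blocked by name
    print(is_allowed("Perplexity-User", url))  # False: blocked by name
    print(is_allowed(GENERIC_CHROME_UA, url))  # True: falls under the "*" rule
```

The sketch also shows why the choice of user agent matters in this dispute: robots.txt rules are matched against whatever name the client declares, and compliance is voluntary, so a request that presents itself as an ordinary browser is not covered by a rule aimed at a named bot.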
Cloudflare researchers found that the AI search engine used "a generic browser intended to impersonate Google Chrome on macOS" when its web crawler was blocked. Cloudflare CEO Matthew Prince posted the research on X, writing, "Some supposedly 'reputable' AI companies act more like North Korean hackers."
But many people disagreed with Prince's assessment that this was bad behavior. Those defending Perplexity on sites like X and Hacker News pointed out that what Cloudflare seemed to document was the AI accessing a specific public website when its user asked about that specific website.
"If I as a human request a website, then I should be shown the content," one person on Hacker News wrote, adding, "Why would the LLM accessing the website on my behalf be in a different legal category as my Firefox web browser?"
A Perplexity spokesperson previously denied to TechCrunch that the bots were the company's and called Cloudflare's blog post a sales pitch for Cloudflare. Then on Tuesday, Perplexity published a blog post in its defense (and generally attacking Cloudflare), saying the behavior came from a third-party service it uses occasionally.
But the crux of Perplexity's post made an appeal similar to that of its online defenders.
"The difference between automated crawling and user-driven fetching isn't just technical – it's about who gets to access information on the open web," the post said. "This controversy reveals that Cloudflare's systems are fundamentally inadequate for distinguishing between legitimate AI assistants and actual threats."
Perplexity's accusations are not exactly fair, either. One argument that Prince and Cloudflare used for calling out Perplexity's methods was that OpenAI does not behave in the same way.
"OpenAI is an example of a leading AI company that follows these best practices," Cloudflare wrote. "They respect robots.txt and do not try to evade either a robots.txt directive or a network level block. And ChatGPT Agent is signing HTTP requests using the newly proposed open standard Web Bot Auth."
Web Bot Auth is a Cloudflare-supported standard being developed by the Internet Engineering Task Force that aims to create a cryptographic method for identifying AI agents' web requests.
The debate comes as bot activity reshapes the internet. As TechCrunch has previously reported, bots seeking to scrape massive amounts of content to train AI models have become a menace, especially to smaller sites.
For the first time in the internet's history, bot activity is currently outstripping human activity online, with AI traffic accounting for more than 50%, according to Imperva's Bad Bot report published last month. Most of that activity comes from LLMs. But the report also found that malicious bots now make up 37% of all internet traffic. That activity includes everything from persistent scraping to unauthorized login attempts.
Until LLMs, the internet generally accepted that websites could and should block most bot activity, given how often it was malicious, by using CAPTCHAs and other services (such as Cloudflare). Websites also had a clear incentive to work with specific good actors, such as Googlebot, guiding it on what not to index through robots.txt. Google indexed the internet, which sent traffic to sites.
Now, LLMs are eating an increasing amount of that traffic. Gartner predicts that search engine volume will drop by 25% by 2026. Right now, humans tend to click website links from LLMs at the point when they are most valuable to the website, which is when they are ready to make a transaction.
But if humans adopt agents as the tech industry predicts they will, to arrange our travel, book our dinner reservations, and shop for us, would websites hurt their business interests by blocking them? The debate on X captured the dilemma perfectly:
"I want Perplexity to visit public content on my behalf when I give it a request/task!" wrote one person in response to Cloudflare calling out Perplexity.
"What if the site owners don't want it? They just want you to directly visit the home, see their stuff," argued another, pointing out that the site owner who created the content wants the traffic and potential ad revenue, not to let Perplexity take it.
"This is why I can't see 'agentic browsing' really working – a much harder problem than people think. Most website owners will just block," a third predicted.