
The big steal of Perplexity AI

In every hype cycle, certain patterns of fraud emerge. In the last crypto boom, it was “ponzinomics” and “rug pulls.” With self-driving cars, it was “just five years away!” With AI, it’s seeing just how much unethical nonsense you can get away with.

Perplexity is basically a rent-seeking intermediary on high-quality sources

Perplexity, which is in talks to raise hundreds of millions of dollars, is trying to create a competitor to Google Search. However, Perplexity isn’t trying to create a “search engine.” It wants to create an “answer machine.” The idea is that instead of sifting through a bunch of results to answer your own question with a primary source, you’ll just get an answer that Perplexity found for you. “Factuality and accuracy are what we care about,” Perplexity CEO Aravind Srinivas told The Verge.

This means that Perplexity is basically an intermediary seeking rent from high-quality sources. The value proposition of search was originally that, in exchange for scraping work done by journalists and others, Google’s results sent traffic to those sources. But by providing an answer instead of directing people to click through to a primary source, these so-called “answer machines” rob the primary source of advertising revenue and keep that revenue for themselves. Perplexity is among a group of vampires that includes Arc Search and Google itself.

But Perplexity has gone a step further with its Pages product, which creates a summary “report” based on these primary sources. It’s not just quoting a sentence or two to directly answer a user’s question — it’s creating an entire summary article, and it’s accurate in the sense that it actively plagiarizes the sources it uses.

Forbes found Perplexity bypassing the publication’s paywall to provide a summary of an investigation the publication had done into former Google CEO Eric Schmidt’s drone company. Though Forbes has a metered paywall on some of its work, premium work, like this investigation, sits behind a hard paywall. Not only did Perplexity somehow get around the paywall, it barely cited the original investigation and ganked the original art to use for its report. (For those keeping track at home, taking the art is copyright infringement.)

“Someone else did it” is a good argument for a five-year-old

Aggregation isn’t a particularly new phenomenon, but the scale at which Perplexity can aggregate, along with the copyright infringement in its use of the original art, is quite, um, remarkable. In an effort to calm everyone down, the company’s chief business officer went to Axios to say that Perplexity was developing plans to share revenue with publishers, and how could everyone be so mean to a product still under development?

At this point, Wired jumped in, confirming a finding by Robb Knight: Perplexity’s scraping of Forbes’ work was no exception. In fact, Perplexity ignores the robots.txt code that specifically tells web crawlers not to scrape the page. Srinivas responded in Fast Company that, in fact, Perplexity was not ignoring robots.txt; it just used third-party scrapers that ignored it. Srinivas declined to name the third-party scraper and did not commit to asking that scraper to stop violating robots.txt.
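For readers unfamiliar with the mechanism: robots.txt is a plain-text file a site publishes to say which crawlers may fetch which paths, and honoring it is voluntary. Here is a minimal sketch of how a well-behaved crawler might check it, using Python’s standard urllib.robotparser module. The bot name ExampleAnswerBot, the rules, and the example.com URLs are all hypothetical and are not taken from any real site’s file.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents (illustrative only, not from a real site).
ROBOTS_TXT = """\
User-agent: ExampleAnswerBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def may_fetch(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given rules allow `user_agent` to fetch `url`."""
    parser = RobotFileParser()
    # parse() takes the file's lines; a live crawler would typically call
    # set_url() and read() to download the file from the site instead.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    story = "https://example.com/news/story"
    # The named bot is blocked from the whole site.
    print(may_fetch(ROBOTS_TXT, "ExampleAnswerBot", story))
    # Any other bot falls under the wildcard rules.
    print(may_fetch(ROBOTS_TXT, "OtherBot", story))
```

Nothing in this check is enforced by the server. A scraper that never runs it, or runs it and ignores the result, can still request the page, which is why the arrangement depends on crawlers acting in good faith.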

“Someone else did it” is a good argument for a five-year-old. And consider the response further. If Srinivas wanted to be ethical, he had some options here. Option one is to terminate the contract with the third-party scraper. Option two is to try to convince the scraper to honor robots.txt. Srinivas committed to neither, and it seems to me that there is a clear reason why. Even if Perplexity itself isn’t violating the code, it relies on someone else violating the code to make its “answer machine” work.

To add insult to injury, Perplexity plagiarized Wired’s article about it, even though Wired explicitly blocks Perplexity in its robots.txt file. The bulk of Wired’s article about the plagiarism is about legal remedies, but I’m interested in what’s going on here with robots.txt. It’s a good-faith agreement that has held up for decades, and it’s falling apart thanks to unscrupulous AI companies (that’s right, Perplexity isn’t the only one) hoovering up almost everything available to train their stupid models. And remember how Srinivas said he cared about “factuality”? I’m not sure that’s true, either: Perplexity now surfaces AI-generated results and actual misinformation, Forbes reports.

To my ear, Srinivas was bragging about how charming and clever his lie was

We’ve seen many AI giants engage in questionably legal and arguably unethical practices to get the data they want. To prove Perplexity’s value to investors, Srinivas built a tool to scrape Twitter by pretending to be an academic researcher using API access for research. “I would call my [fake academic] projects just like BrinRank and all that stuff,” Srinivas told Lex Fridman on the latter’s podcast. I assume “BrinRank” is a reference to Google co-founder Sergey Brin; to my ear, Srinivas was bragging about how charming and clever his lie was.

I am not the one telling you that the foundation of Perplexity is lying to dodge the established principles that hold the web together. Its CEO is. That clarifies the actual value proposition of “answer engines.” Perplexity cannot generate actual information on its own and instead relies on third parties whose policies it abuses. The “answer machine” was developed by people who feel free to lie whenever it’s more convenient, and that preference is necessary for how Perplexity works.

So that’s Perplexity’s real innovation here: breaking the foundations of trust that built the Internet. The question is whether any of its users or investors care.
