Research Paper (preprint): Characterizing Web Search in the Age of Generative AI

October 15, 2025 by Gary Price

UPDATE: October 26, 2025

The preprint shared in our October 15 post (below) received media attention today, with coverage by Jonathan Kemper of The Decoder.

Title: AI Chatbots Use Different Sources Than Google Search and Often Cite Less-Known Websites

The researchers compared Google’s organic search results with four generative AI search systems: Google AI Overview, Gemini 2.5 Flash with search, GPT-4o-Search, and GPT-4o with the search tool enabled. More than 4,600 queries across six topics—including politics, product reviews, and science—show just how differently these systems approach the web.

A key difference is when and how these systems choose to search online. GPT-4o-Search always performs a live web search for every query. In contrast, GPT-4o with search tool enabled decides whether to use its internal knowledge or look up new information for each question.

[Clip]

AI search systems surface information from a wider and less predictable set of sources compared to traditional search engines. In the study, 53 percent of the websites cited by AI Overview didn’t appear in Google’s top 10 organic results, and 27 percent weren’t even in the top 100. This means users could be seeing content from sites that are less vetted or less familiar.
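For readers who want to see how a figure like "53 percent not in the top 10" could be derived, here is a minimal Python sketch. It is not the researchers' code. The function names, the domain-level matching rule, and the sample URLs are all assumptions made for illustration, and the paper may define overlap differently.

```python
from urllib.parse import urlparse


def domain(url: str) -> str:
    """Return the lowercased host of a URL, without a leading 'www.'."""
    host = urlparse(url).netloc.lower()
    return host[4:] if host.startswith("www.") else host


def share_outside_top_k(cited_urls, organic_urls, k):
    """Fraction of distinct cited domains that are absent from the top-k organic results.

    Assumption: sources are compared at the domain level, not the exact-URL level.
    """
    cited = {domain(u) for u in cited_urls}
    top_k = {domain(u) for u in organic_urls[:k]}
    if not cited:
        return 0.0
    return len(cited - top_k) / len(cited)


# Made-up data for illustration only; these are not results from the study.
cited = [
    "https://www.example.org/a",
    "https://news.example.com/b",
    "https://blog.example.net/c",
    "https://example.edu/d",
]
organic = [
    "https://example.org/x",
    "https://example.edu/y",
    "https://other.example/z",
]

print(share_outside_top_k(cited, organic, k=10))  # 0.5
```

Running the same function with k=100 on a longer organic list would give the counterpart of the "not even in the top 100" figure.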

[Clip]

The domains chosen by AI systems are often less well-known. Only about a third of the domains used by AI Overview and GPT-Tool were among the 1,000 most-visited sites, compared to 38 percent for organic search. This shift expands the pool of information but may also introduce more obscure perspectives.

Direct to Full Text Article (about 950 words)

—— End Update ——

The preprint linked below was recently shared on arXiv.

Title

Characterizing Web Search in the Age of Generative AI

Authors

Elisabeth Kirsten
Ruhr University Bochum
UAR RC Trust

Jost Grosse Perdekamp
Ruhr University Bochum
UAR RC Trust

Mihir Upadhyay
UAR RC Trust

Krishna P. Gummadi
MPI-SWS

Muhammad Bilal Zafar
Ruhr University Bochum
UAR RC Trust

Source

via arXiv

Abstract

The advent of LLMs has given rise to a new type of web search: Generative search, where LLMs retrieve web pages related to a query and generate a single, coherent text as a response. This output modality stands in stark contrast to traditional web search, where results are returned as a ranked list of independent web pages. In this paper, we ask: Along what dimensions do generative search outputs differ from traditional web search? We compare Google, a traditional web search engine, with four generative search engines from two providers (Google and OpenAI) across queries from four domains. Our analysis reveals intriguing differences. Most generative search engines cover a wider range of sources compared to web search. Generative search engines vary in the degree to which they rely on internal knowledge contained within the model parameters vs. external knowledge retrieved from the web. Generative search engines surface varying sets of concepts, creating new opportunities for enhancing search diversity and serendipity. Our results also highlight the need for revisiting evaluation criteria for web search in the age of Generative AI.

Source: 10.48550/arXiv.2510.11560

Direct to Abstract and Link to Full Text

Filed under: Journal Articles, News, Patrons and Users

About Gary Price

Gary Price (gprice@gmail.com) is a librarian, writer, consultant, and frequent conference speaker based in the Washington D.C. metro area. He earned his MLIS degree from Wayne State University in Detroit. Price has won several awards, including the SLA Innovations in Technology Award and Alumnus of the Year from the Wayne State University Library and Information Science Program. From 2006 to 2009 he was Director of Online Information Services at Ask.com.
