Executive Summary

Survey data, gathered by asking people directly, is the bedrock of the social sciences. Yet an analysis published in Nature in June 2026 suggests that up to 45% of those responses may contain text written by AI. Data built to capture human voices is filling, more and more, with answers that are not human. This piece reads that development through the lens of data quality.

The deeper problem is not just respondents quietly copying from chatbots. Researchers have begun standing up AI as synthetic respondents to produce the results they want — so-called "silicon samples" — and one psychologist called the practice "indistinguishable from fraud." The line that once separated a human response from a machine's is blurring.

So data quality has gained a new question. Until now, quality meant accuracy, completeness, and consistency. On top of those now sits authenticity: "was this response really written by a person?" It is also why human-written data is rising in value again.

Key Figures

Source: David Adam, Nature Vol. 654 (2026)

The four numbers below are different cross-sections of one current. They mark the entrance where responses are collected (up to 45% contamination), the exit where knowledge accumulates (roughly a third AI-written abstracts, a 42% surge in submissions), and the speed connecting the two (one paper in an hour). Together they show machines intruding on both ends of a circuit that once began with human opinion and closed with human review.

Up to 45%

Responses mixed with AI

Share of responses suspected of AI-generated text in one survey analysis

~1/3

AI-written abstracts

Submitted abstracts at one journal mostly AI-written, as of Feb. 2026

42%

Surge in submissions

Rise in paper submissions at one journal after ChatGPT's release

1 hour

One full paper

Time to finish a 28-page paper from real survey data

1

You Ask People, AI Answers

Psychologist Raluca Rilla was reading through survey responses when an awkward sentence stopped her. Asked about their own feelings, one respondent had written, "I don't experience confusion in the same way humans do." It was a tell-tale sign that the person had put the question to a chatbot and pasted the answer back in. The contamination she estimated reaches up to 45% of responses.

Why does this happen? On crowdsourcing platforms like Amazon Mechanical Turk or Prolific, every response carries a small payment, and the faster and more you answer, the more you earn. For a respondent, feeding the question to an AI and copying the answer becomes the rational move. The people collecting the data believe they have bought human thinking, but what arrives is, increasingly, a machine's sentences.

Cross-section of the 18th-century Mechanical Turk chess automaton (1789) — the machine appeared to play chess autonomously, but concealed a human operator inside. Amazon Mechanical Turk takes its name from this device.
▲ The Mechanical Turk (1789): a machine that appeared intelligent but hid a human inside. Amazon Mechanical Turk takes its name from it. Now the reversal plays out — survey responses that appear human increasingly hide AI inside. | Source: Wikimedia Commons (Public Domain)

In one line: When you gather data by asking people, more and more of what comes back is not from people. This is where the authenticity of a response starts to waver.

2

The Deeper Problem: Synthetic Respondents

A respondent secretly using AI is still unintentional contamination. The thornier case is the researcher who deliberately casts AI as the respondent. Psychologist Malte Elson warns that a researcher can assign an AI model parameters like age, education, and political leaning and conjure an entire group of respondents to order. These are "silicon samples" — synthetic, AI-generated respondents. It is generating data rather than collecting it, and because the parameters can be tuned until the desired result appears, Elson described the practice as "indistinguishable from fraud."

Transformer architecture diagram — stacked encoder and decoder layers generating text predictions. This encoder-decoder design is the foundation of the large language models (LLMs) used to create silicon samples.
▲ The Transformer architecture: stacked encoder-decoder layers with multi-head attention. This design underlies the large language models (LLMs) a researcher can configure with demographic parameters to manufacture an entire synthetic respondent pool. | Source: Wikimedia Commons (CC BY 4.0)

The pace of producing knowledge has accelerated too. Political scientist David Lazer demonstrated feeding real survey data into an AI to complete a 28-page paper in a single hour. At one business journal, submissions rose 42% after ChatGPT appeared, and as of February 2026 roughly a third of submitted abstracts were largely AI-written. Into a flow of knowledge that once began with human responses and accrued through human review, machines are now breaking in at both ends at once.

Psychologist Björn Hommel said of this moment that we are "approaching a point where trust in the behavioral and social sciences collapses." When the data is contaminated, the conclusions built on it — and the willingness to believe those conclusions — waver along with it.

3

A New Axis of Quality: Authenticity

For a long time, the criteria for talking about data quality were clear. Is the value accurate? Is anything missing? Is it internally consistent? Catching label errors, bias, and missing values sat at the center of quality control. Every one of those questions asked the same thing: is this value correct?

This episode adds a question of a different grain: who, or what, produced this value? Even for the same response, one written by a person and one written by a machine carry different worth. When machine sentences mix into data meant to measure human opinion, the value can look perfectly accurate and consistent and still no longer be what you set out to measure. Authenticity is a new axis of quality, standing alongside accuracy, completeness, and consistency.

And this is not the social sciences' problem alone. Consumer reviews, user feedback, healthcare surveys, market research — any data gathered on the premise of a human voice faces the same risk. So responses actually written by real people, that is, human-origin data, grow scarcer, and as they grow scarcer their value climbs. In an age when anyone can churn out plausible sentences endlessly, the traces a person leaves by hand become a rare resource.

DIKW Pyramid — four-layer hierarchy of Data, Information, Knowledge, and Wisdom. When data authenticity wavers at the base, every layer above it becomes unstable.
▲ The DIKW Pyramid: Data → Information → Knowledge → Wisdom. When authenticity at the data layer is compromised, the entire pyramid becomes unstable. | Source: Wikimedia Commons (CC BY-SA 3.0)

A shift in perspective: The question of quality has widened by one step, from "is the value correct?" to "was this data made by a person?" The moment authenticity enters as a measurable quality metric, the value of human-origin data rises again.

4

Telling People from Machines

Fortunately, no one is standing still. Researchers are already building a few lines of defense. The most direct is the honeypot question — an item that a person naturally ignores but an AI follows literally, planted in the survey to filter out the responses that take the bait. The aim is to verify authenticity at the moment of collection, not after the responses are in.

Cybersecurity honeypot diagram — a decoy server designed to detect and log scanning attacks. The same principle, applied to surveys, uses a planted question to filter out AI-generated responses.
▲ A cybersecurity honeypot: a decoy server that lures and logs intruders. The survey honeypot borrows the same logic — plant a question that a human naturally skips but an AI follows literally, and use the responses to filter out non-human answers. | Source: Wikimedia Commons (CC BY-SA 3.0)

There is also a way to look not only at what a response says but at how it was made. Behavioral metadata — how long an answer took, the pattern of typing versus pasting — offers clues that separate a human response from a copied machine one. Go a step further, and provenance itself, the record of where the data came from and through which channels and methods it was collected, rises to become a quality metric. Data whose origin is unknown is now hard to trust.

In short, the checklist for inspecting data has gained one item. Beside accuracy and completeness, "was this data made by a person?" has begun to stand. Authenticity is no longer something to guess at by feel; it is a quality metric to be measured through honeypot questions, behavioral signals, and provenance records. Confirming that data gathered by asking people really belongs to people is becoming a new fundamental for everyone who works with data.

Closing: What AI's contamination of surveys revealed was an empty seat in data quality. Authenticity moves into the place that accuracy and consistency alone cannot fill. The first step in protecting the value of human data is to measure, first of all, whether it is human.

R

References

  • 1.Adam, D. (2026). "Will AI ruin the social sciences — or revolutionize them?." Nature, Vol. 654, pp. 22–24. — An article on the analysis that up to 45% of survey responses may contain AI-generated text, the risk of researchers creating AI synthetic respondents (silicon samples), and countermeasures such as honeypots. The figures and quotations in this piece (Rilla, Elson, Lazer, Hommel) are drawn from this article.