Part Four: The Invisible Builders
– Reflections on Artificial Intelligence – A Guide for Thinking Humans
This is the fourth reflection in a series inspired by Melanie Mitchell’s Artificial Intelligence: A Guide for Thinking Humans. For an introduction to the series, head over to the first post here.
When I was reading the chapters on how AI systems are trained, on the vast numbers of labeled examples they require, meticulously organized and verified, a question kept arising: Who did that work? Modern AI models needed millions of tagged images, labeled sentences, and annotated passages of text before learning patterns could emerge. The scale and variety of the data made it impossible for academic teams alone to prepare these datasets. The work was simply too large, too unstructured, and too labor-intensive for a small group of researchers to tackle by themselves.

That gap was filled, in large part, by Amazon Mechanical Turk, a crowdsourcing marketplace launched on November 2, 2005, which made it possible to distribute microtasks to a global workforce in exchange for small fees. The platform's name came from an eighteenth-century automaton that appeared to play chess autonomously but concealed a human operator inside, a fitting metaphor for the illusion of machine intelligence. Likewise, Mechanical Turk supplied human judgment for tasks computers could not yet automate, helping simulate intelligence in systems that otherwise lacked common sense.
At first, Turkers, the workers who completed the Human Intelligence Tasks (HITs), did simple things: identify objects in images, transcribe audio, moderate content, or verify text. But as machine learning advanced, their contributions became more fundamental, providing the labeled training data that models relied on to learn anything at all. These tasks were essential because without labeled data (datasets where a human has already told the machine what an image contains or what sentiment a sentence expresses), supervised learning simply cannot proceed.
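To make that dependency concrete, here is a minimal sketch in Python (using scikit-learn; the four sentences and their "pos"/"neg" labels are invented stand-ins for the kind of judgments crowdworkers supplied). Delete the labels and there is nothing left for the model to learn from.

```python
# A minimal sketch of why supervised learning depends on human labels.
# (Illustrative only: the sentences and "pos"/"neg" labels are invented
# stand-ins for the judgments crowdworkers provided at scale.)
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

# The raw data: text a model cannot interpret on its own.
sentences = [
    "I loved this product",
    "Terrible, broke after one day",
    "Works exactly as described",
    "Complete waste of money",
]

# The human contribution: someone had to read each sentence and decide.
labels = ["pos", "neg", "pos", "neg"]

# Only now can "learning" begin: the model fits to the labelers' judgments.
vectorizer = CountVectorizer()
X = vectorizer.fit_transform(sentences)
model = LogisticRegression().fit(X, labels)

# The prediction is a compressed echo of the human labels above.
print(model.predict(vectorizer.transform(["Broke after a day, terrible"])))
```

Everything the trained model "knows" about sentiment is distilled from that `labels` list, which is exactly the part a person had to supply.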
Early large-scale efforts illustrate this. ImageNet, a dataset foundational to modern computer vision, involved millions of images labeled by thousands of crowdworkers, sourced through Mechanical Turk and other means, between 2008 and 2010. Those labeled examples became the basis on which image recognition models learned to identify objects across thousands of categories.
And so, data grew. Models grew. The illusion of autonomous learning became convincing only because humans had already done the learning for the machines.
We now talk about GPTs and transformers as if they emerged from pure code. As if they learned logic and language by themselves. But those early datasets, and many of the labels that still underpin training sets today, came from real people answering simple questions at scale. Somewhere, in Indian cities, Filipino townships, Kenyan suburbs, or the remote corners of the global North, people completed hundreds of tiny tasks: picking the right category for an image, deciding whether a sentence made sense, or flagging whether a phrase was abusive or benign.
Mitchell touches on the presence of this labor in her book. Most of these workers made a few cents per task, with average earnings in MTurk's early years historically measured at around a dollar or so per hour. Many Turk tasks involved repetitive labeling and classification, tedious at scale and, at times, psychologically demanding when the content was disturbing or complex.
The bitter twist is this: the people who generated the labels that trained models are the least likely to benefit from the systems those labels enabled. They helped build the scaffolding under machine intelligence. Yet the systems they supported are often described as autonomous, self-learning, or self-contained.
I wonder if they knew. That the small, repetitive tasks they were completing click by click would one day render their own skills obsolete. That their labor, offered for cents, would train the very systems that no longer needed them. At the time it would've felt like easy money: look at a picture and identify whether it shows a cat or a giraffe. But those same labels became feedstock for models that can now do the labeling themselves. It took time. As Janelle Shane chronicles in her AI Weirdness blog, the learning happened in stages. Now the machines know.
I wonder: in some strange, science-fictional future, will they realize they trained their replacements? That they were teaching their masters?
We speak of self-learning systems and zero-shot prompts, but that framing is misleading. These models were not born fully formed. They inherited structure from human labor at scale. What appears as autonomous cognition is often just a well-curated, well-laundered distillation of countless small human decisions.
This is not to condemn AI development. It is to call for memory.
If we admire a model’s fluency and prowess, we should also acknowledge the invisible workforce that made it possible. The interface is clean, but the labor that taught it to recognize cats, faces, or sentiments was anything but. That labor cannot speak for itself, nor does it appear in acknowledgments or conference footnotes. But without it, supervised AI as we know it would not exist.
And so, as we think of intelligence, whether artificial or human, let us also think about memory. Not just in the sense of data stored in weights and vectors, but in the sense of historical recognition and ethical acknowledgment. If AI progress forgets its laborers, it risks becoming exploitation wrapped in better public relations.
We should build systems that remember not only data, but debt. If the machines ever come to be seen as the masters of language and reason, let there at least be a footnote:
Built on human labor. Trained on unseen lives.