Over the last year we have seen an explosion of so-called AI-produced material. First it was wild and incredible works of art, then text documents of every kind, from homilies to academic essays. One of the issues in the current Writers Guild strike is whether studios and networks should be allowed to use programs like ChatGPT to create outlines or full-length scripts, which writers would then simply rewrite or polish.
We have labeled the programs that do this work “artificial intelligence.” And on the surface, that label has never seemed more persuasive. These programs generate complete and seemingly original works in an instant. They can also communicate with a person in a way that resembles actual conversation.
But in fact, for the time being anyway, these programs are not sentient; they are just a very complex form of the predictive text you find in Gmail or Google Docs. GPT-3, the model behind the original ChatGPT, was trained on roughly 500 billion “tokens,” words or pieces of words culled from books, articles and the internet, through which it interprets and responds to the prompts given to it. (You’ll also hear this referred to as “large language model” machine learning.) Where Google Docs might suggest the rest of a phrase when you start typing the first word, ChatGPT has so much data at its disposal that it can suggest a whole paragraph or essay. And it continues to learn and develop from the data we enter and the responses it generates.
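To make the “predictive text” comparison concrete, here is a toy sketch of next-word prediction (my own illustration, not how ChatGPT is actually built): it simply counts, in a tiny corpus, which word most often follows which, and suggests the winner. Real large language models use neural networks trained on billions of tokens, but the underlying task is the same: given what came before, predict what comes next.

```python
# Toy next-word predictor: a bigram frequency model.
# Illustrative only; real large language models use neural networks
# over billions of tokens, but the task is the same: predict the
# next token given the previous ones.
from collections import Counter, defaultdict

corpus = (
    "to be or not to be that is the question "
    "attention must be paid to such a person"
).split()

# Count how often each word follows each other word.
following = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    following[prev][nxt] += 1

def suggest(prev_word):
    """Return the word most often seen after prev_word, if any."""
    counts = following.get(prev_word)
    return counts.most_common(1)[0][0] if counts else None

print(suggest("to"))    # -> 'be' ("be" follows "to" most often here)
print(suggest("must"))  # -> 'be'
```

Given “to,” the sketch suggests “be,” because that pairing dominates its tiny corpus; scale the corpus up to much of the internet and the same logic can produce whole plausible paragraphs.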
Now, you might say, why make a big fuss about what we call this? No one is claiming that ChatGPT is C-3PO or that we are approaching the singularity. Chill out.
But in calling these programs “artificial intelligence” we grant them a claim to authorship that is simply untrue. Each of those tokens used by programs like ChatGPT—the “language” in their “large language model”—represents a tiny, tiny piece of material that someone else created. And those authors are not credited for it, paid for it or asked permission for its use. In a sense, these machine-learning bots are actually the most advanced form of a chop shop: They steal material from creators (that is, they use it without permission), cut that material into parts so small that no one can trace them and then repurpose them to form new products.
Maybe that sounds silly. We’re talking about things like breaking a scene from a play into tiny phrases, not Robin Thicke and Pharrell Williams, or more recently Ed Sheeran, being accused of using Marvin Gaye’s music without permission. Where does the principle of fair use come into play when creative products can be sliced into such microscopic units? Some might argue that what ChatGPT does is more akin to Sheeran’s defense: that he was just taking common building blocks and using them in new ways. But I don’t know. “To be or not to be” or “Attention must be paid” certainly seem like someone’s intellectual property.
It is important to remember that the algorithms behind these programs are predictive: they are built to weigh their tokens in relation to one another. It makes sense that, having been given a thousand tokens from a great writer like Toni Morrison or Stephen King, bots like this would be able to either reproduce or repurpose those authors’ voices, ways of thinking and turns of phrase.
And once again, this has all been done without anyone ever obtaining those writers’ permission for their work to help inform the program’s output. A number of image-generating programs are currently facing lawsuits over claims that their databases of billions of copyrighted images, which are then “diffused” to create new images, constitute copyright infringement. (There is also a case currently before the Supreme Court considering when the use of another artist’s material becomes transformative instead of theft.)
In a way, ChatGPT and its ilk are the highest form of separating laborers from the fruit of their labor. We get an answer or a piece of art from a predictive text bot, and the original articles and ideas from which it was generated are so far removed that even their creators don’t realize they have been stolen from. In fact, those creators might join those who think this argument is absurd, which is kind of like admiring somebody’s souped-up hot rod when a tiny part of its hood (or engine design) was stolen from your car.
Back in the day, programs like Napster had everyone believing that we shouldn’t have to pay for music. If I own something, why shouldn’t I be able to share it with whomever I want? Even today, it remains hard to convince some people that there is anything wrong with illegally downloading the latest music from Lizzo or a bootleg of “Guardians of the Galaxy Vol. 3.” Those same people will insist without irony that they are those artists’ biggest fans.
The same is already true of these predictive algorithms. We are so enthralled by what they can do, or the social goods they seem poised to offer, that we don’t want to examine where all of this material actually comes from. Rather than pointing to some future utopia (or robots vs. humans dystopia), what we face in dealing with programs like ChatGPT is the further relentless corrosiveness of late-stage capitalism, in which authorship is of no value. All that matters is content.