With the recent release of the open source Stable Diffusion software, the public now has access to a powerful AI tool for generating images using only text prompts. A number of artists have expressed frustration and anger at this new tool, as the model used to generate images was trained on millions of artists' copyrighted images without their knowledge or consent.
The technology underlying Stable Diffusion, the latent diffusion model (LDM), was itself only published in 2022, by researchers in the CompVis group at LMU Munich; Stability AI funded the compute used to train Stable Diffusion. Yet the technology is already widely available for anyone to use, for any reason, free of charge. Anyone with an Internet connection can now run Stable Diffusion in the cloud or, if they have a reasonably modern computer, from the convenience of their personal machine. Given the power of this new technology and the extremely rapid pace at which it diffused to the public — Stability AI reports there are more than 10 million daily users and 1.5 million subscribers to their paid DreamStudio cloud service — it should come as no surprise that artists are feeling vulnerable.
One such artist, Sarah Andersen, who creates the popular Sarah's Scribbles comic, recently wrote about this new technology in a New York Times op-ed. In the op-ed, she encapsulated well the grievances many artists hold about this novel technology:
“For artists, many of us had what amounted to our entire portfolios fed into the data set without our consent. This means that A.I. generators were built on the backs of our copyrighted work, and through a legal loophole, they were able to produce copies of varying levels of sophistication. When I checked the website haveibeentrained.com, a site created to allow people to search LAION data sets, so much of my work was on there that it filled up my entire desktop screen.”
While I empathize with the anger and frustration artists like Sarah are feeling, in my view there are a few issues with how they are thinking about technologies like Stable Diffusion that I want to address. Firstly, let us consider the LAION data set Sarah is referring to here, LAION-5B. It is a curated data set that LAION put together using data from Common Crawl. Much like Google or the Internet Archive, Common Crawl uses a technology called a web crawler that scours the Internet, moving from hyperlink to hyperlink and downloading whatever it finds. The Common Crawl data has been used by a number of projects, including the other new AI technology making waves recently, OpenAI's GPT-3.
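To make the crawling process concrete, here is a minimal sketch of the core step a crawler repeats: parse a downloaded page and collect the hyperlinks to visit next. This is an illustrative toy using Python's standard library, not Common Crawl's actual implementation, and the example HTML and URLs are made up.

```python
from html.parser import HTMLParser

class LinkExtractor(HTMLParser):
    """Collects the href of every <a> tag on a page --
    the links a crawler would queue up to visit next."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

# A stand-in for a page the crawler has just downloaded.
page = '<p>See <a href="https://example.com/a">A</a> and <a href="/b">B</a>.</p>'

extractor = LinkExtractor()
extractor.feed(page)
print(extractor.links)  # ['https://example.com/a', '/b']
```

A real crawler wraps this in a loop: fetch a URL, extract its links, add unseen ones to a queue, repeat — which is how a crawl that starts from a handful of seed pages ends up touching billions of images.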
Given that the LAION-5B data set consists of nearly 5.8 billion images crawled from across the web, it should come as no more a surprise to anyone who publicly posts images online to find their work in that data set than it would be to find it in a Google search. Common Crawl supports the Robots Exclusion Protocol, and one step artists could take would be to tell CCBot not to crawl the pages where they post their content, if they are worried about it ending up in an AI data set. As Sarah's Scribbles is a wildly popular webcomic that has been running since 2011, and she was presumably unaware of CCBot's scraping, it would be surprising if LAION contained only an "entire desktop screen" of her comics. I would in fact expect dozens of desktop screens' worth of Sarah's content to be included in that data set.
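Opting out of Common Crawl amounts to a couple of lines in a site's robots.txt. The sketch below shows such a rule and verifies its effect with Python's standard `urllib.robotparser`; the domain and path are hypothetical, and whether a crawler honors the rule is, of course, up to the crawler.

```python
import urllib.robotparser

# A robots.txt that blocks Common Crawl's bot (CCBot) from the whole
# site while leaving it open to everyone else.
robots_txt = """\
User-agent: CCBot
Disallow: /

User-agent: *
Allow: /
"""

rp = urllib.robotparser.RobotFileParser()
rp.parse(robots_txt.splitlines())

# Hypothetical URL where an artist posts their work.
url = "https://example.com/comics/page-1"
print(rp.can_fetch("CCBot", url))      # False -- CCBot is asked to stay out
print(rp.can_fetch("Googlebot", url))  # True  -- other crawlers are unaffected
```

The protocol is purely advisory — well-behaved crawlers like CCBot check this file before fetching a page and skip anything it disallows, but it is a request, not an enforcement mechanism.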
However, the more important problem here is the issue of copyright. Intellectual property is an issue I am passionate about; it is the reason I spent several months working through Harvard Law School's challenging CopyrightX course, for example. The common sentiment among artists like Sarah is that this new technology exists thanks to a legal loophole in copyright law that did not anticipate tools that can generate art in the style of existing artists using nothing more than a text prompt.
This way of thinking about Stable Diffusion and similar tools is, in my view, misguided. What Stable Diffusion does is provide novel technological affordances to its users by dramatically lowering the barrier to entry for creating artwork mimicking the style of existing artists. Prior to the advent of Stable Diffusion, you would have to be quite skilled yourself to successfully mimic the style of another artist, and even then it would take time and practice to do so. Now any average Joe can do it.
However, it is no more of a legal “loophole” to use Stable Diffusion to mimic the style of another artist than it is for a talented artist to do so. Artistic style is simply not copyrightable. Works created in the style of an existing artist are new works altogether, with copyright belonging to the person who used the tool to create the work, whether that be Stable Diffusion, Photoshop, or a pen and paper.
Sarah touches on this point in her piece when she writes, "The way I draw is the complex culmination of my education, the comics I devoured as a child and the many small choices that make up the sum of my life." What Sarah neglects to consider is that Stable Diffusion's drawings are also informed by its education, i.e., the programming instructions provided to it by its developers, and the billions of images it has devoured. The artwork generated with Stable Diffusion need not be in the style of an existing artist, but it is certainly informed by the millions of pieces of art it has likewise devoured. Just as a skilled artist could mimic the style of another artist whose works they have previously absorbed, Stable Diffusion is capable of the same. However, the choice of whether to generate such works using artistic tools is still in the hands of the individual. What Stable Diffusion does is take the cognitive load of the requisite education and media consumption off of the human brain and move it to software; it democratizes a skill previously reserved for those who dedicated years to devouring comics and honing their talents.
Whether it is ethical to mimic the style of an artist is separate from the question of whether it should be legal to do so. The law is not and cannot be the remedy to all problematic acts. There are other methods of regulating behavior. For example, an artist who mimicked another artist's style and sold those works for a profit has historically risked a bad reputation in the art industry.
Regulating such behavior was, however, easier when the skill barrier was much higher than simply entering a few words into a text box. The question we face now is how to mitigate the mimicking of artists' works now that this novel technology has made it possible for anyone to do so. It may simply no longer be possible to regulate using social pressures now that the genie is out of the bottle.
Technological innovation will, for better or worse, continue to violate our cultural sensitivities. We should be wary of rushing to enact reactive legislation and regulations that enshrine our sensitivities into law and impede that innovation. To reduce the mistakes we make, we should be thinking proactively about the future capabilities of AI and other technologies and designing rules and regulations beforehand to meet these challenges, as Congress wisely did with the Genetic Information Nondiscrimination Act (GINA) in 2008. While it will not always be possible to anticipate future developments, we need to do better in this area so that we are having these debates on technology policy dispassionately and with the benefit of time on our side.