You have probably seen a variety of posts on social media from people who have access to Dall-E 2 or some other creative AI engine. Apart from the FOMO (where is my invite??), it’s hard not to notice how good the tech has gotten.
In fact, it’s so good that an AI-generated piece of art recently won a prize in an art competition, enraging artists. As an investor in the areas of entertainment, gaming, metaverse and consumer tech at Remagine Ventures, I’m fascinated by the potential of democratising the creative process. I wrote about it on VC Cafe in “The State of Creative Automation” (Sept 2021) and again in “The opportunities for Creative Automation” in April 2022. So what has changed since?
More tools, improved quality, wider access
When I first wrote about ‘creative automation’, Dall-E wasn’t open to beta users, and while generative text-to-image AI existed, it wasn’t widely distributed. In recent months, we’ve seen an explosion in the popularity of these tools, as they are 1) more available, 2) relatively easy to use, and 3) able to give the user a ‘wow’ experience of creating something from nothing.
Text to image
- Dall-E 2 – created by OpenAI and available to the lucky folks who got a beta invite. Terms and conditions apply and content is filtered (e.g. no porn)
- Craiyon – a free, ad-supported generative model with lower image quality
- Midjourney – a Discord server that lets users enter prompts and get results for free
- Stable Diffusion – Research collective Stability.ai has released Stable Diffusion as the first truly open source text-to-image model. Thousands of developers are using it to build new applications at a rapid pace.
- DreamStudio – a UI layer on top of Stable Diffusion, by Stability.ai
- Entsil – free, text to image AI
- Neural love – also capable of image to image creation
- NightCafe – text to image, offering the ability to print your work of art on canvas; the first 10 credits are free
- Promptmania – try these prompts to generate original art works
- Lexica – search over 10M Stable Diffusion images and prompts
For a directory of tools check out creativeai.org or play around in online forums like Reddit – there’s always a new one!
What are the core capabilities available today for text to image AI?
Peter Yang summarised it brilliantly in his creator economy newsletter:
1. Create images from text prompts
2. Create images from rough sketches
3. Extend images beyond their borders (outpainting)
4. Change the art style, e.g. de-impressionizing Monet, or Mona Lisa in cyberpunk style
The impact on creators and artists
Jason Allen’s AI-generated work “Théâtre D’opéra Spatial” took first place in the digital category at the Colorado State Fair, and artists are not happy. The digital artwork was generated from text prompts using Midjourney and then printed on canvas.
Artists are concerned that the acceleration in creative AI will replace the need for painters, illustrators, photographers and other creators, including actors. Cartoonist Matt Bors said:
“Technology is increasingly deployed to make gig jobs and to make billionaires richer, and so much of it doesn’t seem to benefit the public good enough. AI art is part of that. To developers and technically minded people, it’s this cool thing, but to illustrators it’s very upsetting because it feels like you’ve eliminated the need to hire the illustrator”.
“I Went Viral in the Bad Way”, The Atlantic
Artists are also expressing their frustration on Twitter.
The reality is that automation (AI, robots, etc.) is happening all around us and it will certainly have an impact on the number of jobs. Think about the last time you went to McDonald’s. Did you order from the till or a giant touch screen? Inevitably, as the tech gets better and the costs come down, consumers and companies might choose to buy an AI-generated image, logo, 3D model or text rather than pay a person to make one.
But AI advancements can also create new opportunities down the line. They have the potential to lower the barrier for people to become artists and be part of the creator economy.
For example, AI whispering has become a new side hustle. As creative AI programs transform a text prompt into potentially award-winning art, writing those prompts has become a sought-after skill. PromptBase is a new meta marketplace that lets “prompt engineers” sell text descriptions that reliably produce a certain art style or subject on a specific AI platform, The Verge reports. For example, I could buy prompts for 3D astronauts, ape avatars and renders of 3D Pokemon for $1.99. You can also try the Dall-E prompt ebook.
A picture is worth 1,000 words, but what if it’s fake?
Take the following example, created by Israeli author and futurologist Roey Tzezana. He wanted to demonstrate how easy it is to spread misinformation and fake news using off-the-shelf tools. He decided to focus on a conspiracy theory: some people claim that the 1969 landing on the moon by Neil Armstrong was faked.
Using Dall-E and Stable Diffusion, he was able to create black and white images of an alleged fake moon landing in a studio setup. The results, as you can see below, look pretty convincing.
He then used GPT-3 to generate text that portrayed a ‘senior NASA exec, speaking with regret about the fake moon landing’. The AI promptly followed orders and created a text that looks very real. You can read his post (in Hebrew) here.
What happens when text like this makes its way to credible sources like Wikipedia? Who’s policing what is real vs. fake? The problem of fake news is certainly not new, but the tools for AI creation seem to be evolving faster than the policy to protect consumers or the tech to detect what’s real vs. not.
Looking forward
While there are no doubt risks associated with unleashing this kind of technology into the wild without supervision (Reddit already closed several forums that used Stable Diffusion offensively), I’m excited to see where this creative AI revolution will take us.
- Creative AI for 3D content?
- Creative AI for games?
- Creative AI for books?
As an investor, I see big opportunities in using this kind of tech to create new business models and populate the metaverse. If you’re an early-stage founder working on something novel in this space, don’t hesitate to reach out!
Update (Oct 17): The generative AI landscape by Sequoia