“Learn the rules like a pro, so you can break them like an artist.”
Pablo Picasso
Creativity, the ability to invent or design something new from scratch, might seem like the last safe harbour from automation. But with new advances in AI, creativity, like many other parts of our life and work, is now being transformed by technology and offered as a service to consumers and companies.
Creative automation is the ability to generate original, high-quality content (text, video, images, music, art, etc.) by leveraging data and technology. Gartner predicts that 90% of large organisations will embrace some form of RPA (Robotic Process Automation) by 2022, so the kinds of technologies mentioned in this post are likely to grow quickly in the coming months.
A judge ruled last week that an AI cannot be named as the inventor on a patent because it is not a person. Even so, I believe that the Creative AI landscape is ripe for disruption and will advance by leaps and bounds in the next five years. We are not too far off from being able to segment customers in real time, create and launch marketing campaigns generated with AI, and iterate on the creative materials based on the conversion data: a fully autonomous marketing cycle. In this post I cover a non-exhaustive list of content categories that are being re-invented with AI.
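To make the "iterate on the creative materials based on the conversion data" step concrete, here is a minimal sketch in Python. It assumes an epsilon-greedy strategy, which is one simple option and not a description of any product in this post. The function names, the variant labels and the simulated conversion rates are all hypothetical, and a real system would read conversions from live campaign data rather than a random number generator.

```python
import random


def conversion_rate(s):
    """Observed conversion rate for one variant (0.0 before any impressions)."""
    return s["conversions"] / s["impressions"] if s["impressions"] else 0.0


def pick_variant(stats, epsilon, rng):
    """Epsilon-greedy choice: explore at random with probability epsilon,
    otherwise serve the variant with the best observed conversion rate."""
    if rng.random() < epsilon:
        return rng.choice(list(stats))
    return max(stats, key=lambda v: conversion_rate(stats[v]))


def run_campaign(true_rates, rounds=5000, epsilon=0.1, seed=0):
    """Simulate serving ad creatives and learning from conversions.

    true_rates maps a (hypothetical) creative name to its assumed real
    conversion probability; it stands in for live audience behaviour.
    """
    rng = random.Random(seed)
    stats = {v: {"impressions": 0, "conversions": 0} for v in true_rates}
    for _ in range(rounds):
        v = pick_variant(stats, epsilon, rng)
        stats[v]["impressions"] += 1
        # Simulated user response; a real loop would log actual conversions.
        if rng.random() < true_rates[v]:
            stats[v]["conversions"] += 1
    return stats


if __name__ == "__main__":
    results = run_campaign({"headline_a": 0.02, "headline_b": 0.05})
    for name, s in results.items():
        print(name, s["impressions"], round(conversion_rate(s), 4))
```

The design choice here is the trade-off between exploring new creatives and exploiting the ones that already convert. In a fully automated cycle, the set of variants would itself be refreshed by a generative model rather than fixed up front.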
Text
GPT-3, a large language model for AI text generation, is making text automation broadly available. In 2021, more than 1,000 new startups will be powered by this technology.
- Full scale generative text engines:
- GPT-3 / OpenAI – launched only in May 2020 by OpenAI, GPT-3 quickly made waves as the largest language model created at the time. GPT-3 is able to write paragraphs and even poetry with minimal input. GPT-3 licenses are controlled by OpenAI and Microsoft, and the model is not yet open and free for the public. GPT-3 is a very powerful engine and, besides its generative abilities, it can also be used to summarise text, including entire books.
- AI21 Labs – an Israeli startup led by Stanford Professor and NLP expert Yoav Shoham, focused on more fundamental research for NLP. It released Jurassic-1, a free, widely available text generation model (like GPT-3) with 178B parameters, which the company describes as the largest and most sophisticated language model released so far for general use by developers. The new developer tool, AI21 Studio, builds on Haimke, a research model that generates full paragraphs from bullet points, and Wordtune, a Chrome plugin that suggests text rewrites on Twitter, email, Slack, etc. to improve your style of writing.
- WuDao 2.0 model – unveiled by the Beijing Academy of Artificial Intelligence (BAAI) in June 2021, WuDao 2.0 has 1.75tn parameters, 10x those of GPT-3, and can simulate conversations, understand pictures, write poems and even create recipes. It’s also multi-modal and can learn from both text and images.
- AI copywriters for marketers – I’ve been seeing more and more startups focused on text automation for marketing/advertising by commercialising GPT-3, offering companies quicker turnaround times for content creation such as social posts or ads. A number of early stage startups are showing promise in this space:
- Copysmith / Anyword / Copy.ai / Conversion.ai / Jarvis – all similarly positioned and priced, these AI copywriters help marketers automatically generate copy for higher conversion, including product descriptions, ads, emails, websites, listings, etc.
- Automated creative writing
- Kafkai / Contentbot – generate unique and readable blog posts using AI.
- Hyperwrite – wants to write your emails for you by suggesting ‘autocomplete’ for your sentences.
Video
Thanks to advances in GANs (Generative Adversarial Networks), AI is rapidly changing video production using real human characters. This is a relatively new space, as the processing power required to generate these videos in near-real time over the cloud has only recently become available.
- Character based video generated with text
- Hour One – offers companies the ability to create video using one of its hundreds of human characters by simply editing text. There’s no need for a video production/editing background, and the videos can also connect to structured data sources. The characters can speak in any language and it’s possible to create your own characters. (Disclosure: I’m an investor and board member via Remagine Ventures.)
- Other companies in the space of text-to-video using human characters include Synthesia and Rephrase.ai.
- Automated video ads
- Lumen5 – uses AI to automatically match stock photos, stock video footage and music to a text-based script to create video ads.
Images
The majority of the startups in the creative automation space focus on this category, with an emphasis on marketing content (ads, social posts) and eCommerce (listings, reviews, etc.), but also increasingly on art.
- Synthetic media: using GANs again, we now have the ability to create completely new images of people, landscapes, etc.
- This person does not exist – built on Nvidia’s StyleGAN, it might be the poster child for the capabilities of GANs and synthetic media. The Internet quickly caught on and created multiple categories of ‘This X Does Not Exist’, from AI generated cats to landscapes, real estate, memes, etc.
- MetaHuman Creator – launched by Epic Games’ Unreal Engine, a free editor that can generate realistic virtual humans within a few minutes.
- Rosebud – used to focus on creating synthetic faces using GANs, but seems to have pivoted to animating portrait photos.
- Deep Nostalgia – similar to Rosebud, this feature brings old portrait photos to life. It was built by Israeli startup D-ID in collaboration with family tree platform MyHeritage.
- Bria.ai – an Israeli startup, offering enterprises synthetic images on demand to enrich their marketing campaigns.
- Create images from text
- Image-GPT – the image counterpart to OpenAI’s GPT language models – “just as a large transformer model trained on language can generate coherent text, the same exact model trained on pixel sequences can generate coherent image completions and samples”.
- DALL-E – another brilliant project from OpenAI that creates images from text. See my previous coverage on VC Cafe.
- Deep AI – similar to DALL-E, but I found the results very disappointing.
- Art
- Nvidia Canvas – enables users to turn simple brushstrokes into beautiful art. A huge download (1.1GB) is required, but anyone can test it out.
- Artlify – from Israeli-founded startup Art AI, lets users ‘commission’ a piece of AI generated art by simply typing text in a box. Try it!
- Autodraw – start doodling and the Autodraw AI will give you suggestions to make your doodle into a masterpiece. An AI experiment by Google.
Audio and Music
The experience of creating music has always been human-centric, but technology is now able to offer AI beats, vocals and text-to-speech engines that sound increasingly human.
- Music
- OpenAI Jukebox – a neural net that generates music, including rudimentary singing, as raw audio in a variety of genres and artist styles. It’s worth mentioning that OpenAI acknowledges that this is still in its infancy: “While Jukebox represents a step forward in musical quality, coherence, length of audio sample, and ability to condition on artist, genre, and lyrics, there is a significant gap between these generations and human-created music”
- Amper Music (by Shutterstock) – basically the ‘stock photo’ of music. I generated my own track within minutes.
- Boomy – enables users to create AI-generated music instantly and earn royalties from Spotify and other streaming services. Since its inception, Boomy users have created 1.6M songs, which the company puts at 1.8% of all recorded music!
- Solaris – an AI “singer”. From the company’s crowdfunding page: “Similar to how text-to-speech is used to have your computer ‘read words aloud,’ vocal synthesizers let you write music and have your computer ‘sing’ it. This can be a way to draft songs before hiring a real person to sing them, but vocal synthesizers are more often used to feature on songs, giving the songwriter more creative freedom and flexibility not available in a real voice”.
- Virtual concerts are here to stay.
- Platforms like Roblox and Fortnite are making virtual concerts with big acts a recurring event, with artists like Twenty One Pilots, Ariana Grande and others joining the fold and attracting millions of viewers. Several startups are active in this space, including Wave and Fly Machine, which help established artists and DJs reach their audience in virtual worlds or inside games (buzzword alert: METAVERSE!)
- Authentic Artists – started by the former head of music at Oculus, Authentic Artists creates AI generated artists to perform live concerts on Twitch.
- Dubbing and Voice avatars
- DeepDub – an Israeli startup that generates AI voice clones of actors in multiple languages, while keeping the actor’s unique tone of voice. It’s also able to add accents, e.g. English with a Brooklyn accent. Currently working with major studios.
- Papercup – a UK startup offering automated video localisation using AI dubbing. Working with broadcasters such as Sky, BBC and Discovery.
- Podbot.ai – with one click of a button, PodBot.ai generates and records a podcast episode about your topic of interest, complete with a generated host, researched but made-up content, opening music, a summary and a cover photo.
Gaming
AI in gaming has been largely geared towards improving computer-controlled opponents, the extent of which can be seen in DeepMind’s victory against the world champion in Go. But now, leveraging reinforcement learning, gaming companies like Sony and EA are developing intelligent, creative, life-like characters. Microsoft, for example, is hiring a reinforcement learning expert for a new gaming initiative, so I suspect others are working on this as well.
- AI Dungeon – a free-to-play single-player and multiplayer text adventure game which uses artificial intelligence to generate content. When it launched, it got really dark, really fast, so the developers took it down to make modifications that restrict sexual content involving minors. Read the story on Wired.
- Promethean – uses AI to fill virtual worlds with trees, rocks, etc. It can be trained by artists to imitate their style, learning through machine learning techniques and making suggestions during the creative process.
- Sonantic – focuses on AI generated voices for games. It recently made headlines for helping Val Kilmer, who suffered from voice loss after throat cancer. They were able to recreate his voice using past samples, and it’s now available via API for video games.
What’s next for Creative AI tech
Creative AI technology offers to save time and money. In marketing tech, it won’t replace human creativity, but it will make the process of creating and testing visuals quicker and more efficient. Synthetic media in particular also offers the possibility of creating more personalised, diverse characters/models rather than picking a generic one. Creative Automation is one of our investment themes at Remagine Ventures (and we see it closely linked to the Creator Economy, which is another). I’d love to learn about new startups in this space.
My post is by no means an exhaustive list, but you can find a database of creative AI tools here: https://creative-ai.org/
UPDATE: Google has a wonderful AI experiments page dedicated to this topic. Check it out for more inspiration!