Inflection Point for ML and Art

Apr 3

Generating art from text prompts is at an inflection point. A new creative field is developing in front of our eyes.

A community built around Colab notebooks has nailed genuinely useful generation of still images and 2D and 3D animations, as well as video processing.
Tools like Midjourney are making the technology easy to use.
Big companies are demonstrating out-of-this-world results using larger models.
The FOSS community is creating datasets and training models to match the big companies. [LAION.ai]
It is mainstream. [A$AP Ant & A$AP Rocky music video]

*Make-A-Scene: Scene-Based Text-to-Image Generation with Human Priors*
[Facebook Meta AI Research, arXiv]

*from Jack Morris’ The Weird and Wonderful World of AI Art*

The core technical insight is using two ML models.

One paints, the other scores against the prompt.
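
The paint-and-score loop can be sketched in a few lines of Python. This is a toy under loud assumptions: the "image" is a short list of numbers, `paint` and `score` are hypothetical stand-ins for the two models, and the prompt is reduced to a target list. Real tools use neural networks for both roles and typically follow gradients rather than random nudges, so read this only as an illustration of the feedback loop.

```python
import random

def score(image, target):
    """Stand-in for the scoring model: higher means closer to the 'prompt'."""
    return -sum((p - t) ** 2 for p, t in zip(image, target))

def paint(image, rng, step=0.1):
    """Stand-in for the painting model: propose a slightly changed image."""
    return [min(1.0, max(0.0, p + rng.uniform(-step, step))) for p in image]

def generate(target, iterations=500, seed=0):
    """Keep a candidate only when the scorer prefers it to the current image."""
    rng = random.Random(seed)
    image = [rng.random() for _ in target]
    best = score(image, target)
    history = [best]
    for _ in range(iterations):
        candidate = paint(image, rng)
        candidate_score = score(candidate, target)
        if candidate_score > best:
            image, best = candidate, candidate_score
        history.append(best)
    return image, history

# The "prompt" here is just a hypothetical list of target pixel values.
image, history = generate([0.2, 0.8, 0.5, 0.1])
print(f"score went from {history[0]:.4f} to {history[-1]:.4f}")
```

The score can only stay level or improve from one step to the next, which is the whole point: the scorer steers the painter toward the prompt.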

Jack Morris (@jxmnop), a Cornell PhD student studying natural language processing, wrote an excellent article that explains the technical background in full: The Weird and Wonderful World of AI Art.

Creating

Artists have embraced these tools.
They aren’t just entering a text prompt: they’re playing with all the parameters, trying multiple variants at the same time.
Creative work is done by exploring, and creative work is special when it’s distinctive.
Significantly, this lowers the burden on the technical side: it is neither necessary nor desirable to get picture-perfect results on the first try.
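
Exploring many variants at once amounts to sweeping over combinations of settings. A minimal sketch of that idea follows; the parameter names and values (`seed`, `steps`, `guidance_scale`) are hypothetical placeholders for illustration, not the actual settings of any particular notebook.

```python
from itertools import product

def build_variants(prompt, options):
    """Return one settings dict per combination of the option values."""
    names = list(options)
    variants = []
    for values in product(*(options[name] for name in names)):
        settings = dict(zip(names, values))
        settings["prompt"] = prompt
        variants.append(settings)
    return variants

# Hypothetical parameter names and values, for illustration only.
options = {
    "seed": [1, 2, 3],
    "steps": [150, 250],
    "guidance_scale": [5000, 10000],
}
variants = build_variants("Paint Giverny garden like Monet", options)
print(len(variants))  # 3 * 2 * 2 = 12 combinations
```

Each dict could then be handed to whichever tool is in use, one run per variant, and the results compared side by side.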

The two most popular tools currently, in April 2022, are Disco Diffusion and Midjourney. Discoveries are being made at a rapid clip.
The best way I’ve found to keep up with the field as a whole is Reddit, specifically /r/mediasynthesis and /r/discodiffusion.
Zippy’s Disco Diffusion Cheatsheet is an excellent manual, not only for Disco Diffusion itself but also for the community and tooling.

Disco Diffusion

The latest and greatest Colab notebook is Disco Diffusion.
Colab is free to try. You can subscribe to get more features, most importantly more powerful GPUs.

Disco Diffusion can be found on GitHub.
The /r/discodiffusion subreddit and a Discord server welcome newcomers.
Zippy’s Disco Diffusion Cheatsheet is an excellent manual.

Midjourney

Midjourney is a tool in private beta. [Twitter, link to apply in bio]
Over the week it took me to write this, Midjourney became very well-known, and it’s unclear if there are any beta spots left.

Slideshow Gallery


Images in the slideshow, labeled by the text in each file name:
- Claude Monet's Impression Sunrise painted on LSD
- Chromatic aberration of Monet's impression sunrise
- Axonometric taxonomy of colors organized in space
- The true form of the mysterious and powerful angel Castiel, famous award-winning painting
- The galactic senate chamber, science fiction unreal engine 3d render at 8K, award winning
- 360 degree virtual reality rendering of massive galactic senate chamber
- A massive battalion of imperial storm troopers on coruscant (two images)
- Chromatic aberration in spacetime
- Astrophotography on LSD
- Big old library with natural lighting, 16 million rare and beautiful colors stored on the shelves (three images)

Video

Disco Diffusion can create still images, 2D animations, and 3D animations, or take a video as input and repaint each frame.

Here, we take a black and white video of Monet painting in his garden, and repaint each frame in the style of Monet.

Creating this took three different models: a video colorizer, a painting model, and an upscaler.

Below, you can see the output of each step. Left to right:
- B&W video of Monet painting in his garden
- Colorized
- Paint Giverny garden like Monet
- Paint Giverny garden like Monet, in the winter, with an ice blue & white color scheme
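
The frame-by-frame flow described above can be sketched as a chain of stages applied to every frame. Everything here is an assumption for illustration: frames are tiny nested lists of gray values, and `colorize`, `paint`, and `upscale` are trivial placeholders standing in for the three real models, which are not shown.

```python
def colorize(frame):
    """Placeholder colorizer: copy each gray value into R, G, and B."""
    return [[(v, v, v) for v in row] for row in frame]

def paint(frame, tint=(1.0, 0.9, 0.8)):
    """Placeholder painter: tint every pixel; a real model would restyle it."""
    return [[tuple(c * t for c, t in zip(px, tint)) for px in row] for row in frame]

def upscale(frame, factor=2):
    """Placeholder upscaler: nearest-neighbor enlargement by `factor`."""
    out = []
    for row in frame:
        wide = [px for px in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out

def repaint_video(frames, stages):
    """Run every frame through each stage in order."""
    result = []
    for frame in frames:
        for stage in stages:
            frame = stage(frame)
        result.append(frame)
    return result

# Three identical 2x2 grayscale frames as a stand-in for the B&W video.
frames = [[[0.0, 1.0], [0.5, 0.25]] for _ in range(3)]
output = repaint_video(frames, [colorize, paint, upscale])
print(len(output), len(output[0]), len(output[0][0]))
```

Keeping the stages as a plain list makes it easy to inspect the output after any step, which is how a side-by-side comparison like the one above can be assembled.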

Full-sized Gallery


* This isn’t every picture in the slideshow gallery; I skipped some only because it’d take a couple more hours of work to hunt down the full-sized version of each upload.

** @jpohhhh on Twitter; for E-mail, same user name at gmail.com