3 Modes of Making with AI

Published

November 8, 2024

In this post I’ll share a mental model that I’ve found useful recently in thinking through a number of topics related to AI and creativity. I call it the “3 Modes of Making with AI”. This framing has helped me think through how I feel about different coding tools, tease out some nuance in the debate around AI art, and imagine some possible futures I want to build towards!

The trick is to separate out uses of AI into three categories (although in practice it’s often more like a sliding scale). These are:

  1. “Slot Machine” - press a button or throw in a prompt and hope for the best.
  2. “Iterative Refinement” - you’re tweaking an initial result to get it closer to what you want, but the AI is doing most of the heavy lifting.
  3. “Co-Creation” - you’re working to express your vision, with the AI as a tool to help you get there.

As you move from 1 to 3, the agency of the human creator increases, and the output shifts from something ‘generic’ (that anyone might get) to something individual and unique. All three modes have their place of course, as we’ll see as I dig into each one. But explicitly thinking about which zone we’re aiming for can have a big impact on the kinds of tools we make, and I think at present the co-creation end, the one that centers the human creator, is in need of a little more love! So, let’s go through these in a little more detail and then look at some ways we can help shift the balance towards my favourite end of this spectrum :)

Slot Machines

Text-to-image models went from barely-functional to incredibly powerful in the space of a few years, and it’s easy to lose track of how magical they are. Type in a prompt and seconds later you’re staring at a high-quality image of whatever you described.* The magic quickly wears off though, and many either lose interest or begin to take advantage of the tool’s limitless output, re-running the same prompt with slight tweaks just in case the next one is a winner. Ditto the initially-impressive “write me a poem about X” text model demos.

This style of use can be very valuable - often I just need an image or a bit of code that roughly matches what I asked for.

TODO examples

Iterative Refinement

One immediate upgrade to the above is going through a few rounds of refinement. For text/code models, this can range from pointing out a few mistakes or clarifying a requirement, all the way up to asking for many re-writes with detailed feedback each time. For image models, I like the direction Playground are going, where you ask in natural language for edits. Even ChatGPT’s more primitive image gen can do this though - repeatedly asking for “more cute” on a picture of a duckling or something can be entertaining :)

TODO examples

Co-Creation

The most useful, the most varied, and the most underserved IMO.

  • Starting from a sketch, having the AI generate, painting over, repeating, pushing and pulling with the AI “medium” just like I do with watercolor.
  • Writing code a few lines at a time, getting AI help with syntax and brainstorming but also understanding the parts myself and learning new things as we go.
  • Humming melodies, having an AI like Suno ‘cover’ your rough takes, splitting that out and bringing the stems back into a DAW…
  • Using AI for asset generation to speed up the ‘background bits’ for a 3D scene or digital collage.

… TODO expand

WIP: I’m thinking through the best way to express these ideas, maybe this will turn into a video or something…

*ish - even the best models are still far from perfect, but the errors have gone from ‘this is a blurry disfigured mess’ to ‘the text on that tiny sign in the background has bad kerning’.