How to Create Your Album Cover Using AI

December 13, 2022 by Abbey Road Institute Sydney

Is it possible to create “good” album cover art using AI? People are using artificial intelligence to create art. Right now, AI profile images generated by apps like Lensa are showing up everywhere. To see some examples, head to their Instagram page.

Let’s look at how you can use these tools to create an album cover for your next release.

In this blog post, we will be using a simple Mac app called Diffusion Bee. There is something interesting about this app in particular: it does not require any subscriptions or use any online services. All the processing is done locally on your computer while you’re off having a coffee or adding the finishing touches to your track.

About AI image generation

An AI image generator is software that uses machine learning models to turn descriptive instructions into new images. The most common ones at the time of writing are the Midjourney Discord Bot (using Stable Diffusion) and DALL-E 2, but new ones are emerging all the time.

Most of these tools are cloud-based and in the case of Midjourney, you’ll even enter your prompts into Discord for everyone else to see. For anyone familiar with Discord, that is one way to learn how to phrase prompts and to see what type of artwork can be created.

For this example, we’re using Diffusion Bee because it requires no technical knowledge and works on any new Mac running at least macOS 15.2. It runs the Stable Diffusion model locally, which means that you’re not sending any data to the cloud and you don’t have to pay for a subscription either, so you can just keep experimenting.

Check out some examples generated using various prompts on Diffusion Bee:

Cover art requirements

Before we start, we need to remember some essentials for cover art. To ensure that your art will be approved by all major distributors and streaming services, you will need an image, saved as a JPG file, with minimum dimensions of 3000×3000 pixels. That is much larger than most of these tools tend to generate. And of course, all cover art must be square.
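The requirements above can be written down as a quick check. The following Python sketch is only an illustration: the function name meets_cover_art_requirements is made up for this post, and it covers only the three rules named here (individual distributors may apply additional ones).

```python
MIN_SIDE = 3000  # minimum width and height in pixels, as stated above


def meets_cover_art_requirements(width: int, height: int, file_name: str) -> bool:
    """Return True if the image is square, large enough and saved as a JPG.

    Hypothetical helper: it only checks the three rules named in this post.
    """
    is_square = width == height
    is_large_enough = min(width, height) >= MIN_SIDE
    is_jpg = file_name.lower().endswith((".jpg", ".jpeg"))
    return is_square and is_large_enough and is_jpg


# A square 3072x3072 JPG passes; a 768x768 one is too small.
print(meets_cover_art_requirements(3072, 3072, "cover.jpg"))
print(meets_cover_art_requirements(768, 768, "cover.jpg"))
```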

Let’s begin! To start creating you’ll need:

A recent Apple computer (the ones with M1 or M2 processors are best)
The Diffusion Bee app
An app or service to edit the final image with (optional)

Step 1: Install the app, download the data & set things up

The installation could not be simpler: once you have downloaded the app, drag it to your Applications folder and start it up.

When it first launches, it will download the models (you need about 8GB of free space).

Once done, go into the “Options” and change the resolution:

Set the resolution to 768×768 pixels to ensure the image will be as large as possible. We’ll still have to upscale to our final cover art dimensions later. Diffusion Bee has a built-in option to scale your generated artwork up by 4x. This will result in an image that’s 3072×3072 pixels, just over the 3000×3000 minimum we’ll need when submitting to distribution.
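The arithmetic behind that choice is easy to verify. Here is a minimal Python sketch; the function names are invented for this example, and the 512-pixel case is a hypothetical smaller setting included only for comparison.

```python
import math

MIN_SIDE = 3000  # minimum cover art side length in pixels


def upscaled_side(side: int, factor: int) -> int:
    """Side length in pixels after upscaling by an integer factor."""
    return side * factor


def smallest_factor(side: int, min_side: int = MIN_SIDE) -> int:
    """Smallest integer factor that brings `side` up to at least `min_side`."""
    return math.ceil(min_side / side)


print(upscaled_side(768, 4))  # 768 x 4 = 3072, just above the minimum
print(smallest_factor(768))   # factor needed for a 768-pixel image
print(smallest_factor(512))   # factor needed for a hypothetical 512-pixel image
```

Starting from 768 pixels, the built-in 4x upscale is enough. From the hypothetical 512-pixel image, 4x would only reach 2048 pixels, which is below the minimum.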

Step 2: Enter Prompts

The prompts are the main instructions that create the image. They specify the subject matter, style, level of realism, camera angles, lighting and more. It’s a skill in itself to come up with prompts that create the images you have in mind. There is a section for prompt ideas that can give you a starting point. There are also resources online (Promptomania or Lexica) that give you instructions and example prompts to try and learn from.
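One way to keep prompts manageable is to think of them as groups of keywords that get joined into a single comma-separated line. The Python sketch below is just an illustration of that idea: build_prompt is a made-up helper, the grouping into subject, lighting and style is my own assumption, and the keywords are taken from the prompt used later in this post.

```python
def build_prompt(*groups):
    """Join groups of keywords into one comma-separated prompt string."""
    terms = [term for group in groups for term in group]
    return ", ".join(terms)


# Keywords grouped by what they describe (the grouping is an assumption).
subject = ["cafe", "diner"]
lighting = ["deep dusk illuminating", "dynamic lighting"]
style = ["cyberpunk", "oil painting", "expressionism"]

print(build_prompt(subject, lighting, style))
```

Keeping the groups separate makes it easy to swap the subject while leaving the style and lighting untouched.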

Some things work better than others

Advances in this area are extremely fast. It also has to be said that these tools often create strange-looking results if you instruct them to create detailed images of people or animals.

But sometimes, when you get your instructions right and you create an image that looks the way you want, you can keep generating an endless supply of them that you can later choose from. Or you might even want to embrace the strange results this can give you!

Let’s start creating!

Let’s say I am releasing a track that’s got an electronic/jazz vibe and I want the cover art to highlight this as soon as you see it. I want the scene to be a cafe or diner setting that looks bold and cinematic. I like a very colourful example that leans on teal/orange to create bold, contrasting areas. Having read through some examples, I took a starting point and changed the subject matter and added/changed a few instructions.

This is the prompt I’m using for this example:

cafe, diner, deep dusk illuminating, cyberpunk, epic scene, vibrant colors, dynamic lighting, award winning, fantastically beautiful, trending on artstation, 8 k, cinematic teal-and-orange, oil painting, oil, expressionism, teal and orange gradient Dramatic, realistic, high quality, Studio Lighting, Lens Flare, Synthwave

And, lo and behold, here is the first image Diffusion Bee created:

Let’s say we’re happy with that, but we want to see some more options. Every time you hit “generate”, it creates a completely new image for you.

Do you want to see more images to choose from?

Diffusion Bee allows you to create up to 100 images from the same prompt in one go! Be aware that this will take some time, so depending on your computer, you might want to leave this running overnight and come back to it in the morning.

I’m also going to increase the “Steps” from 25 to 50. This will increase the number of iterations per image in order to make them more detailed. Each image also takes a bit longer to create.

Here is an example showing how adding more steps turns noise (1 Step) into a more and more refined image (75 Steps):

What is a seed?

If you want to recreate the same image with different settings or prompts, you’ll have to set the seed to a number you specify. It can be any number you pick. By default, Diffusion Bee uses a random seed, which is why it generates a new image each time. With a fixed seed, it starts from the same point every time, so the same prompt and settings recreate the same image, and changing a setting or the prompt gives you a variation of that image rather than a completely new one.
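The idea of a seed is not specific to image generation. As a loose analogy only (this uses Python’s standard random module and says nothing about how Diffusion Bee works internally), a fixed seed makes a random number generator repeat itself, while a different seed gives a different result. The helper name random_values is made up for this example.

```python
import random


def random_values(seed=None, count=3):
    """Draw `count` pseudo-random numbers, optionally from a fixed seed."""
    rng = random.Random(seed)  # seed=None picks a fresh starting point
    return [rng.random() for _ in range(count)]


# The same seed always gives the same sequence of numbers.
print(random_values(seed=42) == random_values(seed=42))

# A different seed gives a different sequence.
print(random_values(seed=42) == random_values(seed=43))
```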

Let’s make more examples

I’ll generate 20 new images for this example, so I have reset the seed to be randomised. (Note: on a high-spec Mac Studio, it takes about 20 minutes to generate 20 images at the “50 Steps” setting.)
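To get a rough feel for how long a bigger batch might take, you can scale that figure up. This Python sketch assumes the time grows linearly with the number of images, which is my assumption and not something measured; your own Mac will almost certainly give different numbers. The function name estimate_minutes is made up for this example.

```python
def estimate_minutes(image_count: int,
                     reference_minutes: float = 20.0,
                     reference_images: int = 20) -> float:
    """Estimate generation time, assuming time grows linearly with image count.

    The defaults are the figures quoted above (about 20 minutes for 20 images
    at 50 steps on a high-spec Mac Studio). Linear scaling is an assumption.
    """
    minutes_per_image = reference_minutes / reference_images
    return image_count * minutes_per_image


# Rough estimate for a batch of 100 images, the maximum mentioned earlier.
print(estimate_minutes(100))
```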

Step 3: Post Production: upscale the image

Now all that’s left to do is to resize the image that we picked to match the cover art requirements.

Here I will use the menu in the top left corner of the image (the three horizontal lines) and select “Upscale Image” to make it 4x the original size.

And we’re done!

The cover art is ready for upload. You might also want to add the release name and artist name to see how that looks. This can be done using any photo editing app of your choice. I normally do this in Photoshop or Affinity Photo, but Canva is another alternative.

Be aware that some streaming services will reject your cover art if it contains too much text or text that isn’t relevant to your artist name or release title, so make sure that if you’re including any text at all, it matches the release.

Once you have the final version of your cover art, don’t forget to include your artist name and release title in the file name so that anyone who comes across the file knows exactly what it’s for.

Cover art without text

Cover art with added text, more contrast and added film grain
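The file-naming tip above could be sketched like this in Python. The naming scheme (lowercase words joined by hyphens, with a “_cover.jpg” ending) is purely hypothetical, as are the function name and the artist and title used in the example; pick whatever convention works for you.

```python
import re


def cover_art_file_name(artist: str, title: str) -> str:
    """Build a file name like 'artist-name_release-title_cover.jpg'.

    Hypothetical naming scheme: lowercase, words joined by hyphens.
    """
    def slug(text: str) -> str:
        # Replace every run of characters other than a-z and 0-9 with one hyphen.
        return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")

    return f"{slug(artist)}_{slug(title)}_cover.jpg"


# Placeholder artist and title, not a real release.
print(cover_art_file_name("Example Artist", "Midnight Diner"))
```

Note that this simple version also replaces accented letters with hyphens, so names containing them would need a more careful approach.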

What does this mean for the world of art and music?

Is it something we need to be careful with or do we embrace it? This technology advances at a swift pace and it’s starting to impact us in all aspects of art, science, entertainment and more. We are all moving towards using some parts of these types of systems, whether we’re aware of it or not. There is a lot of hype, misinformation and lack of understanding but it’s undeniably going to play a part in our lives in the future.

It’s up to us to raise the issues, talk about them and adjust how we do things.

Can this co-exist with the work and dedication of talented artists, people who have a voice, who have a message? Will it flood the world with a lot more noise? Will it disrupt industries? Maybe it will give us time back that we can use to hone our craft, do the things we enjoy doing?

None of this would be possible without the contribution of thousands of years of art, photography and other types of content to feed the models the data they need to be able to appear “intelligent”. That itself raises issues of copyright, ownership and the ability to opt out of being part of the dataset.

Watch this video explainer to learn more about the topic (tip: watch until the end to get the link to the follow up video as well!):

To read more about tools that leverage artificial intelligence to help you make music, check out “Music at Long Distance – Artificial Intelligence Tools” on the Abbey Road Studios website.
