How do DALL-E and Stable Diffusion stack up against each other when creating AI-generated images? Tune in for an informative session with Oskar Kaszubski, Chief Growth Officer at FirstMovr, and Dave Feinleib, CEO of It’sRapid, as they analyze results from both platforms.
Transcript
David Feinleib
We’re going to be taking a look at a background image. Our scenario is we want to create a holiday ad, and we want to generate some images that we could use as the background for our holiday ad. Tell us, if you would, how does this work? What have we done here?
Oskar Kaszubski
So we basically built a prompt specifically to create a background image for a holiday display ad on Amazon, with a blank area in the middle. "Do not include any text" is a very important instruction, because otherwise all of the image generation AIs will try to write text, and you have to be very careful with that: most image generation models cannot render specific text, so it comes out garbled and illegible. For this purpose, DALL-E is probably the closest to what you’re trying to do, specifically for products. A lot of these images are generated at 512 by 512, and there are ways to upscale them to 1024, 2048, and above. It’s a good start.
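For readers who want to try this themselves, here is a minimal sketch of that kind of prompt against the DALL-E API using OpenAI’s Python library. The prompt wording, model choice, and sizes below are illustrative assumptions, not the exact setup used in the demo.

```python
# Minimal sketch: generating a holiday ad background with DALL-E via
# the OpenAI Python library. The prompt text is illustrative only.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.images.generate(
    model="dall-e-2",  # assumption; use whichever DALL-E model you have access to
    prompt=(
        "Background image for a holiday display ad on Amazon, "
        "festive colors, with a blank area in the middle for a product. "
        "Do not include any text."
    ),
    n=1,
    size="512x512",  # DALL-E 2 also supports 256x256 and 1024x1024
)

print(response.data[0].url)  # URL of the generated background image
```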
Oskar Kaszubski
It would still require a touch from the graphic designers to be usable. Now, when we give the same prompt to Stable Diffusion, you’re actually going to see that it’s a little bit more cartoony in its approach, right? We’re looking at the same prompt and the latest version, which is the XL beta. This is much less viable for placing a product in versus what we’re seeing in DALL-E, where I can see that we could drop in any of the products and have a really good...
David Feinleib
The product, or a CTA, some holiday headlines. There’s not exactly a place for that. These are some nice-looking images, but perhaps harder to use for this scenario, for our display ads. What about things like brand colors? Is that something that a brand can specify here? Is that still kind of on the roadmap?
Oskar Kaszubski
Yes. So, for sure you can create your own color palette and include it; say, predominantly red if you’re Coca-Cola, or blue if you’re PepsiCo. What I do like about Stable Diffusion is that its later versions have the option for a negative prompt, so you can specify which colors to avoid, et cetera, right? You can fine-tune this much better than in DALL-E, where you have to write everything into the prompt itself. But then if you look at Midjourney, Midjourney will always be tuned more toward photorealistic imagery. So I think, from that perspective, if you want to build something that’s a little bit more photorealistic, Midjourney might be your best bet. And if you look here, it leans almost green.
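As a point of reference, the negative prompt Oskar mentions is a first-class parameter in the open Stable Diffusion tooling. Here is a minimal sketch using Hugging Face’s diffusers library with the SDXL base model; the prompts, color choices, and output filename are illustrative assumptions.

```python
# Minimal sketch: Stable Diffusion XL with a negative prompt via the
# diffusers library. Prompts and color choices are illustrative.
import torch
from diffusers import DiffusionPipeline

pipe = DiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    torch_dtype=torch.float16,
).to("cuda")

image = pipe(
    prompt=(
        "Background image for a holiday display ad, red and white "
        "brand color palette, blank area in the middle, no text"
    ),
    # Steer the model away from unwanted colors and artifacts.
    negative_prompt="green, blue, text, watermark, cartoon",
).images[0]

image.save("holiday_background.png")
```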
David Feinleib
Yeah, this is great. So, some very different outputs. I suppose if we were seeing similar outputs, then we’d be a little concerned about similarities in the models. But I think it’s really compelling that we’re seeing such different outputs from these different engines; it really gives us some different possibilities for what we could do with the branding.
Oskar Kaszubski
Yeah. It’s also hard to compare, because using the same prompt across all of them doesn’t necessarily make sense, right? Some of them need to be spoon-fed in terms of how we want the specific image assets we’re generating to look. So we’ll have to mix and match and learn all of them. But I would say Midjourney gives by far the best results in general. For specific cases, though, DALL-E might be a little more usable. For this Amazon project, I would probably stick with DALL-E, which is OpenAI’s image model, versus going directly to Midjourney.