A prompt engineering guide for DALLE-2 (dallery.gallery)
There's a fine line between being so descriptive that the AI hits an edge case and can't get out of it (so every attempt looks the same) and not being descriptive enough (so you can't capture the output you're looking for). DALL-E is already incredibly fast compared to public models and I can't wait for the next order-of-magnitude improvement in generation speed.
Real-time traversal of the generation space is absolutely key for getting the output you want. The feedback loop needs to be as quick as possible, just like with programming.
>Real-time traversal of the generation space is absolutely key for getting the output you want.
I've been sketching around a two-person browser game where a pair of prompters can plug things in together in real-time :D
Damn I am salivating to get access to Dall-E for some projects. Been on the waiting list for quite a while.
I've been experimenting with Midjourney, which is amazing for spooky/ethereal artwork, but it struggles with complex prompts and realism.
That said, some styles (Comic book spreads) seem to come out better on Craiyon. And DALLE 2 does not know what a Crungus is.
You definitely have to play around with prompts to get a feel for how it works and to maximize the chance of getting something closer to what you want.
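One cheap way to "play around" systematically is to generate a grid of prompt variants and paste them in one at a time. A minimal sketch, assuming nothing about any image API: `prompt_variations` and the style/modifier lists are made up for illustration, and it only builds strings.

```python
from itertools import product


def prompt_variations(subject, styles, modifiers):
    """Return one prompt string per (style, modifier) combination.

    Hypothetical helper: it only assembles text and does not call
    DALL-E, Midjourney, or any other service.
    """
    return [
        f"{subject}, {style}, {modifier}"
        for style, modifier in product(styles, modifiers)
    ]


if __name__ == "__main__":
    # Example inputs are assumptions chosen for illustration.
    for prompt in prompt_variations(
        "a chimp riding a bicycle",
        ["digital art", "comic book spread"],
        ["muted colours", "high detail"],
    ):
        print(prompt)
```

Changing one list at a time makes it easier to see which words actually move the output.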
MJ falls apart when you ask for fine detail. It's a bit of the AI cliche where you have to describe the colour, shape, etc. in detail to mold what you want. Asking for a "monkey, gorilla, and chimp riding a bicycle" might get you a chimp riding a monkey-gorilla as a bicycle.
DALL-E is a lot better with words, but it seems to "smooth" some stuff: asking for a bone axe will still show regular axes.
But MJ is probably the best choice if you want to do landscapes and stuff, especially horror/dystopian themed.
https://www.vice.com/en/article/g5vbx9/dall-e-is-now-generat...
Maybe that’s a feature not a bug.
You could be right though. It does "digital art" well, but realistic faces poorly, and they slap down lots of restrictions to avoid deepfaking.
"This person doesn't exist" uses StyleGAN which can definitely do faces, but can't do general pictures.
Interesting topic
Also Gwern has done a lot on this.
> The Office also stated that it would not “abandon its longstanding interpretation of the Copyright Act, Supreme Court, and lower court judicial precedent that a work meets the legal and formal requirements of copyright protection only if it is created by a human author.”
https://www.copyright.gov/rulings-filings/review-board/docs/...
The Copyright Office claimed works created by a non-human aren't copyrightable at all when they refused Slater, but that was never challenged or decided in court. It's not a slam dunk, since the human had to do something to set up the situation and he did it specifically to maximize the chance of the camera recording a monkey selfie.
If I set up a Rube Goldberg machine to snap the photo when the wind blows hard enough, how far removed from the final step do I have to get before it's not me owning the result anymore? That would have been the essence of the case had it gone to court, and it's probably the essence here too.
My guess is the creativity needed for the prompt would make the output at least a jointly derived work regardless of any assignment disclaimers--pretty sure you can't casually transfer copyright ownership outside a work for hire agreement, only grant licenses--but IANAL and that's just a guess.
It also won't allow uploading images with faces in them.
(Its output seems to be a lot more aligned to the input than DALL-E 2's, but also less "artistic" and more like it just did exactly what you said.)