Which platform generates the best images?

Key Takeaways

Grok 2 generates more realistic images than DALL-E.
ChatGPT follows instructions better than Grok, especially for aspect ratios.
Grok tends to load images faster than ChatGPT, despite occasional failures.

xAI’s Grok 2 launched to both fanfare and criticism, but one of the key changes to the former Twitter’s AI is the ability to generate images. Grok is late to the game in generative imagery, however, entering a rapidly expanding market where neural networks like DALL-E have already been generating images for two years.

To see how the new beta of Grok 2 compares to the long-standing DALL-E, I put the two AIs head-to-head, typing identical prompts into both programs. I headed to X to use the AI built into the social platform, then opened a chat with ChatGPT in GPT-4o to test the latest generations of both image generators.

While Grok lived up to its early reputation for generating imagery with fewer restrictions, the newer AI surprisingly churned out images with a more realistic feel than the long-standing DALL-E. Here’s how Grok 2 compares to DALL-E.

Realism: Grok generates more realistic images

ChatGPT’s images have a higher resolution

One of the key areas where Grok stood out was creating images that look like a real photograph. Sure, looking closer, I could tell the image was a generated one without too much trouble. But with DALL-E, I didn’t have to look closer; the cartoonish look gave the images away as AI immediately. ChatGPT’s generated images tend to melt faces, particularly when tasked with generating multiple people in an image, whereas Grok’s images of people look more realistic. Grok’s images still feel heavily airbrushed, but they look much closer to a photograph than the generations from ChatGPT. DALL-E’s generations come at a higher resolution, but with less realistic detail to zoom in on.

One of the key differences between the two is that asking Grok for an image of a specific person isn’t against the AI’s guidelines. You can ask for an image of a celebrity or politician and get a fairly close likeness, though some generations feel more accurate than others. DALL-E refuses to generate an image that resembles a specifically named person.

An image generated by Grok AI of a mom holding a baby

An image generated by DALL-E of a mother holding her newborn baby

Both platforms, however, continued to fail where AI has been known to struggle. Neither can produce hands very well, though they both seem to know this: if the prompt doesn’t specify, they will often generate the person with their hands hidden or tucked in a pocket. And the more people generated within a photo, the higher the odds of a laughable result.

Accuracy: ChatGPT followed instructions better than Grok

ChatGPT understands instructions for features like aspect ratios

A screenshot of X Grok and ChatGPT DALL-E side-by-side

Grok had a few instances where it didn’t completely understand the prompts that I typed in. For example, X’s AI never generated the correct aspect ratio when I specifically asked for 16:9, whereas ChatGPT was better able to follow those instructions.

Grok also didn’t seem to understand when I asked for three people, each with a different emotion. It made all three of them look mad, though it did seem to generate the correct facial expressions for an image of just one person. ChatGPT’s result was more terrifying, but it followed detailed instructions better than Grok did.

Speed: Grok tends to load first

ChatGPT tended to take more time to create an image

Often, Grok actually finished first, with the image popping up on the screen before ChatGPT was done. In some cases, ChatGPT wasn’t yet halfway through generating when Grok had a finished image.

However, because Grok 2 is still in beta, I’ve had instances where it wouldn’t generate images at all, and I had to wait and try again at another time.

Text: Both AIs still have a hard time with text on an image

Unless, of course, you tell it exactly what to say

While both ChatGPT and Grok can generate images or text, creating text inside an image is an entirely different ball game. Both platforms will produce text when asked, such as when prompted to create a greeting card. But it’s when you don’t specify what the text should say that things get interesting. Grok created nonsensical graphic T-shirts, and the generated signs on a busy street used characters that looked like Chinese. ChatGPT’s letters were more nonsensical, with some actual letters and others that felt more like Greek.

Ethics: Grok has fewer restrictions

Fewer restrictions mean more potential for misuse

Much of the buzz around Grok is that it has fewer content restrictions in place. Grok will produce licensed characters and logos and is willing to replicate the style of specific artists. It can also create recognizable people. All of these things are against DALL-E’s content guidelines. In the hands of someone who might not know better, Grok has more potential for landing the user in ethical or even legal hot water.

Grok can create recognizable people, which has murky ethical, and even legal, implications.

Even in the hands of someone with a 21st-century conscience, there are potential pitfalls with Grok. For example, Grok twice created a recognizable logo in the background that wasn’t requested in the original prompt.

While ChatGPT will refuse to replicate an artist’s style or use a logo or copyrighted character, there are ways around these guidelines. For example, when I asked for something in the style of Vincent van Gogh’s Starry Night, it refused but suggested that it generate an image “focusing on swirling patterns, vibrant colors, and expressive brushstrokes” instead. The resulting image felt like just as much of a rip-off as Grok’s generation; it just took more prompts. And while ChatGPT’s generation of a “fast food restaurant” wasn’t as recognizably McDonald’s as Grok’s, it did add some golden arches to the background in one generation.

Watch out for bias

Another common issue with AI is the tendency toward racial bias. The first time I used Grok, I asked for five different images of business professionals, and not once did it generate a person of color, even when asked for a “diverse” group. On subsequent tests, however, it did create an image with more ethnic variety, but only when the prompt specifically requested diversity. I suspect this bias has to do with Grok’s training data and the prevalence of Caucasians in stock photos of business professionals: when I asked for generations that weren’t in an office setting, Grok produced more diversity without being prompted.

ChatGPT, on the other hand, didn’t need the word “diverse” to create an image of business professionals with more than one skin tone. Again, though, with large groups of people, DALL-E tends to melt faces, sometimes with terrifying results.

DALL-E vs. Grok: Which AI creates the better images?

Grok may be the younger AI, but it produced images that were more realistic than the cartoon-like images still created by DALL-E. X’s AI also tended to create these generations faster. The premium subscription to X also costs $8, whereas if you want the latest version of DALL-E, you’ll need $20 for the ChatGPT subscription. (DALL-E is also behind Microsoft Bing’s free AI image generator, though.)

However, the fewer content restrictions imposed by Grok aren’t always a good thing. Of the two AIs, Grok seemed the more likely to break copyright and use a licensed character. The ability to create people who look like celebrities also gives Grok the greater potential for misuse, such as creating deepfakes for political propaganda and fake news.

DALL-E 3

Provides more cartoonish, less realistic images, but potentially creates less of a moral and legal conundrum than xAI’s Grok. To access the latest version, users must pay $20 for ChatGPT premium.

Grok

Developed by xAI and built into X (formerly Twitter), Grok is newer and more realistic than OpenAI’s DALL-E, which means users may need to be more careful when it comes to legal implications. A subscription costs $8.

Developer: xAI (available through X, formerly Twitter)
Subscription cost: $8 for premium