AI Image Generation Takes A Huge Step Forward

Image generation in OpenAI's GPT-4o model takes a significant leap forward in accuracy and context.

My regular readers will note that I don't tend to write about individual AI models, preferring to focus on applications and use cases. However, I want to refer back to an article I wrote in December titled "Has GenAI Lowered Our Standards". In that article, I pointed out that, despite the fantastic advances in image and video generation, GenAI models were still some distance away from replacing current methods in commercial creative applications. This week's developments show that gap might be closing.

Image Generation Leaps Forward

This week, OpenAI added new image generation capabilities to their GPT-4o model, and the initial examples are impressive, to say the least. The most impressive part is how accurately the model captures the context of a request. Take the image of a wine glass below, generated from this prompt: "show me a wine glass with only the tiniest drop of red wine in it." The key part is "only the tiniest drop of red wine".

Previous models struggled with this small detail, even if they did produce a good image of a glass of wine.

Using Images for Context

The other revolutionary development is the model's ability to take an existing image and incorporate it into a new creation. OpenAI use the example of the chainsaw below, which is reused in a subsequent image:

Uploaded image of a chainsaw
Same chainsaw used in subsequent creation

We have to admit that image creation in GPT-4o is a significant advancement. Whilst it is still not 100% perfect, it brings the ability for any regular person to create commercially ready images a huge step closer.

"The model [GPT-4o] brings world knowledge to the equation, so when you ask for an image of Newton’s prism experiment, you don’t have to explain what that is to get an image back."

Jackie Shannon, ChatGPT multimodal product lead.

This week's email is brought to you in partnership with Make.com. I've been using Make for a little while now to automate several processes. It's super easy and saves me quite a few hours a week. You should check it out.


Curated News

Zendesk Unveils AI-Driven Customer Service Platform

At Zendesk Relate 2025, the company announced the Zendesk Resolution Platform, an agentic AI-driven solution aimed at transforming customer service and sales. The platform introduces advanced tools for seamless integration and enhanced AI functionality, including next-generation AI agents capable of reasoning, learning, and adapting. Features like an upgraded AI agent builder using natural language prompts and a comprehensive knowledge graph were also unveiled, emphasizing a human-centered approach to AI deployment in customer engagement.

PwC Introduces 'Agent OS' for Enhanced AI Collaboration

PwC has launched 'Agent OS,' a platform designed to enable AI agents to communicate and work cohesively. After spending 19 months developing AI agents and identifying a lack of communication among them, PwC built 'Agent OS' to serve as a switchboard for enterprise AI. It allows companies to build, customize, and connect AI agents to automate complex tasks, integrating with systems from Anthropic, Google Cloud, and Microsoft Azure to enhance workflow efficiency.

Tech Industry Faces Challenges with AI 'Super Agents'

The rise of advanced AI 'super agents' capable of independently handling complex tasks may strain the current computing infrastructure. A Barclays report indicates that these AI agents generate significantly more processing tokens than traditional chatbots, necessitating massive amounts of computing power. The industry may need to increase its number of high-performance inference chips or adopt more efficient models to balance computational demands with consumer and enterprise needs.

Upcoming AI Events

Generative AI Summit 2025
Novotel West London, London, March 31–April 2

Rise of AI Conference
Berlin, Germany, May 7–8

AI in Business Conference
London, May 15

Google I/O
Shoreline Amphitheatre, Mountain View, CA, USA, May 20–21

Thanks for reading, and see you next Friday.

Simon,


Are you looking for a job in the tech sector? You'd do well to check out all the available roles on Wellfound. Wellfound is a platform that connects startups and tech companies with job seekers. It offers over 130,000 remote and local job listings, allowing candidates to apply with a single profile and view salary and equity details upfront.
