Google announced a significant upgrade for Gemini, its in-house artificial intelligence (AI) chatbot, on Wednesday. The company said that image generation in the chatbot will now be handled by the Imagen 3 AI model for all users. Imagen 3 is the Mountain View-based tech giant’s latest and most capable image generation model. Apart from the Gemini app, the feature is also being extended to the API version of Gemini, letting developers build apps and experiences based on this capability.
Gemini Users Get Access to Imagen 3 AI Model
In a post on X (formerly known as Twitter), the official handle of the Google Gemini App revealed that all users, including those on the free tier, will be able to generate images using Imagen 3. The post highlighted that the AI model offers a high degree of photorealism, adheres to prompts more closely, and adds fewer unwanted elements to images.
Gadgets 360 staff members were able to verify that the Gemini app is indeed using Imagen 3 to generate images. To test its capabilities and compare it with Meta AI, we gave both chatbots the same prompt. The prompt was, “Draw an image of a golden retriever dog sitting on a train berth, looking out through the window at the Alps. The train has a wooden interior and the seats are green in colour. All other passengers on the train are also animals. One human conductor is checking for tickets.”
The generated images can be seen above. While both AI models missed at least one element specified in the prompt, Gemini included more of them. Additionally, while Meta AI generates images in 1280 x 1280 resolution, Imagen 3 images are generated in 2048 x 2048 resolution.
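To put the two resolutions in perspective, the pixel counts can be worked out with a few lines of Python. This is only an illustrative sketch: the helper name `pixel_count` is our own, and the figures are the two resolutions observed in the test above.

```python
def pixel_count(width: int, height: int) -> int:
    """Return the total number of pixels in a width x height image."""
    return width * height


# Resolutions observed in the test above.
meta_ai_pixels = pixel_count(1280, 1280)
imagen3_pixels = pixel_count(2048, 2048)

# How many times more pixels the Imagen 3 image contains.
ratio = imagen3_pixels / meta_ai_pixels

print(f"Meta AI: {meta_ai_pixels:,} pixels")
print(f"Imagen 3: {imagen3_pixels:,} pixels")
print(f"Ratio: {ratio:.2f}x")
```

In other words, the Imagen 3 output holds roughly 4.2 million pixels against about 1.6 million for Meta AI, or about 2.56 times as many. A higher pixel count does not by itself mean a better image.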
Imagen 3 can generate images in a wide range of styles, such as photorealistic images, textured oil paintings, and claymation scenes. Users can also request images that appear as if they were taken with a specific camera or lens, such as a Nikon DSLR, a GoPro, a wide-angle lens, and more.
Google has said that the AI model comes with built-in safeguards to reduce the risk of deepfakes. Every generated image is also watermarked with SynthID, a technology that adds an invisible AI label within the pixels of the image. It cannot be cropped out or removed and is present even in screenshots.