AI Showdown: OpenAI vs Anthropic vs Gemini
I used Lovable to build a Japanese language learning game using 1 prompt on 1 try
Summary
This article serves a simple dual purpose.
To compare the performance of OpenAI, Anthropic, and Gemini on Game Design using 1 prompt on an AI web-app creation platform such as Lovable.
To introduce POSE, Practical One Shot Evaluations, a set of personal evaluations that result in the creation of practical applications, as well as the testing of models.
I’ll specifically talk about designing a Japanese language learning game using English natural language instructions.
Context on Evaluations
If you use AI as a pair programmer, you may have noticed that Multimodal Language Models (MLM) are not consistent in their performance, which is why so many evaluations (benchmarks) exist.
This article is not about the industry’s evaluations.
It’s about creating practical evaluations for heavy MLM users.
I recommend this blog post by Nicholas Carlini, who created a set of personalized evaluations for his line of work. You can use it yourself for free via his GitHub. Would you embrace Artificial Intelligence more if it were more applicable to your life? Personally, I would find it hard to resist.
Why Lovable?
This past weekend Lovable hosted an impromptu Generative AI web app competition in which they asked people to submit:
Their best creation using either OpenAI, Anthropic, or Gemini as the MLM providers
OR
to compare the performance of the 3 models on the same prompt.
The idea
Given that I make language learning products, such as a snake inspired phrase learning game and a language learning journal, I decided to create a simple web app in the same space.
Furthermore, I decided to formally (if slowly) build an open source set of practical evaluations that I can continuously reuse going forward. This is becoming an industry-wide practice, with players such as every.to, who created an AI diplomacy game.
So how about I create a gamified dictionary game inspired by my childhood favorite GameBoy Pokémon, called Kanjimon?
Gameplay is simple:
a user walks around, interacting with a 2D landscape
the game ends once they have correctly answered all the “Wild” Kanji they interact with.
So I made a 1-shot prompt evaluation using the following prompt. It’s part of the POSE repo, which I plan to periodically update with practical assessments of language models using 1-shot prompts.
Let’s talk a bit about the competition.
Financial Incentives
There are 2 competition challenges that I am vying for.
A $25k challenge, which asks people to submit a direct comparison of the 3 different model labs on a specific prompt. Assessment is going to be done based on “depth” and “breadth”.
A $40k challenge, which effectively encourages novel uses of language models in web apps.
I applied for both; this post serves as a requirement for the 1st challenge.
Results
Each of the following images is a result of the same 1-shot prompt on Lovable. I provide the hyperlink to each of the submissions for complete transparency, and I also document each in this YouTube video.
Let’s start with OpenAI’s results. Quite friendly color arrangement. Definitely reminiscent of a 2D game, not necessarily Pokémon. If you watch the YT summary, you’ll see the kanji bobbing.
Next, I would like to touch on what I perceive as the winner of this challenge for this particular prompt. Anthropic provided my preferred Kanjimon design. Its color palette is darker, which is less cheerful. However, the UI is very clean.
Lastly, here’s the product from Gemini. Kanji are very legible.
However, the game design lacks the crisp execution that OpenAI and Anthropic brought.
Method
My one-shot prompt was developed by iterating between Google’s Gemini 2.5 Pro and OpenAI’s ChatGPT (gpt-4o).
I started with the following (very simple) prompt on Gemini.
I want to create a fun retro 2D japanese language learning game for the purposes of a hackathon
so we want to make the content of the game simple
the goal is for people to test their knowledge of Japanese Kanji as if they were playing Pokemon
gameplay goal is for people to face off with Kanji
they have to capture them all
we would need to create the 2D scene of maneuvering around
the action scene where a kanji challenges us to a duel
we would be able to catch the kanji if could type in English what it means
then the game would be over after we catch all 25 kanji
it’s important to imitate the 2D pixelation and gameplay of Pokemon
the kanji can look like the Unknown
given my prompt, revise it for usage with an AI code editor
Clearly, I am relying on the LLM to provide me a better prompt. Note that it will make its best effort to understand me, but ultimately I am responsible for communicating key details.
It came up with a quite thorough description:
Project Title: Kanji Katcha!
1. Core Concept
Create a retro, 2D pixel-art game in the style of early Pokémon games. The player walks around a map and “captures” wild Kanji characters by correctly identifying their English meaning. The game is won when all 25 target Kanji are captured. The entire game should be a single, self-contained HTML file with embedded CSS and JavaScript, requiring no external libraries.
2. Visuals & Aesthetics
Engine: Use the HTML <canvas> element with the 2D rendering context.
Style: Simple, 8-bit pixel art.
Player Character: A 16x16 pixel sprite. It can be a simple character with a single-frame walking animation (just flipping the sprite horizontally is fine).
Map: A tile-based map (e.g., 20x15 tiles). Use two simple tile types: “grass” (a noisy green texture) where Kanji can appear, and “path” (a light brown color) where they cannot.
Kanji Sprites: Each Kanji should be a mysterious, black, 16x16 pixel sprite, visually inspired by the Unown Pokémon. The shape should be an abstract, blocky representation of the actual Kanji. They should have a simple bobbing animation (moving up and down by a few pixels).
3. Game Mechanics & Flow
The game should operate as a state machine with three main states: OVERWORLD, BATTLE, and GAME_OVER.
State 1: OVERWORLD
Display: The canvas shows the tile-based map with the player character and the remaining wild Kanji sprites. A simple UI overlay at the top of the canvas displays “Kanji Captured: X/25”.
Player Control: The player moves on the grid using the Arrow Keys. Movement should be tile by tile.
Kanji Movement: The wild Kanji sprites wander around the “grass” tiles randomly, moving one tile at a time every second.
Encounter: When the player’s tile coordinates match a Kanji’s tile coordinates, the game state immediately switches to BATTLE. The specific Kanji that was touched is passed to the battle state.
State 2: BATTLE
Display: The view changes. The background becomes a solid dark gray. The encountered Kanji sprite is shown in the center, enlarged (e.g., 5x its original size).
UI Elements:
Below the Kanji, an HTML <input type="text"> box is displayed for the user to type their answer.
A “Capture!” button is displayed next to the input box.
Gameplay:
The player types the English meaning of the Kanji into the input box.
The player clicks the “Capture!” button.
On Correct Answer: A message “Correct! [Kanji] was captured!” appears. The game returns to the OVERWORLD state. The captured Kanji is permanently removed from the map, and the counter updates.
On Incorrect Answer: A message “Incorrect, try again!” appears. The input box is cleared, allowing the player to try again.
State 3: GAME_OVER
Trigger: This state is entered when the “Kanji Captured” counter reaches 25/25.
Display: The screen shows a congratulatory message: “You’re a Kanji Master! You’ve captured them all!”
UI Elements: A “Play Again” button appears. Clicking it reloads the page to restart the game.
4. Game Data
Use the following list of 25 JLPT N5 Kanji and their primary English meanings. The game should randomly place these on the “grass” tiles at the start.
一: one
二: two
三: three
四: four
五: five
六: six
七: seven
八: eight
九: nine
十: ten
人: person
日: sun (accept 'day' as well)
月: moon (accept 'month' as well)
火: fire
水: water
木: tree (accept 'wood' as well)
金: gold (accept 'money' as well)
土: earth (accept 'soil' as well)
山: mountain
川: river
田: rice field
口: mouth
目: eye
手: hand
心: heart
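Stepping outside the generated prompt for a moment: the spec above boils down to two pieces of logic, a three-state machine and a forgiving answer check. Here is a minimal TypeScript sketch of how those could fit together. It is my own illustration, not code produced by Gemini, Lovable, or any of the three models. The names `Game`, `ACCEPTED`, `encounter`, `submitAnswer`, and `isCorrect` are hypothetical, and only four of the 25 kanji are included to keep it short.

```typescript
// Hypothetical sketch of the prompt's game logic; names are illustrative only.

type GameState = "OVERWORLD" | "BATTLE" | "GAME_OVER";

// Accepted English answers per kanji (primary meaning first).
// Assumption: a small sample of the 25-kanji list from the prompt.
const ACCEPTED: { [kanji: string]: string[] } = {
  "一": ["one"],
  "日": ["sun", "day"],
  "月": ["moon", "month"],
  "田": ["rice field"],
};

interface Game {
  state: GameState;
  remaining: string[];    // wild kanji still on the map
  captured: number;       // feeds the "Kanji Captured: X/N" counter
  current: string | null; // kanji being battled, if any
}

function newGame(): Game {
  return { state: "OVERWORLD", remaining: Object.keys(ACCEPTED), captured: 0, current: null };
}

// Case-insensitive comparison that also ignores extra whitespace.
function isCorrect(kanji: string, answer: string): boolean {
  const accepted = ACCEPTED[kanji] || [];
  const normalized = answer.trim().toLowerCase().replace(/\s+/g, " ");
  return accepted.indexOf(normalized) !== -1;
}

// OVERWORLD -> BATTLE when the player steps onto a wild kanji's tile.
function encounter(game: Game, kanji: string): Game {
  if (game.state !== "OVERWORLD" || game.remaining.indexOf(kanji) === -1) {
    return game;
  }
  return { state: "BATTLE", remaining: game.remaining, captured: game.captured, current: kanji };
}

// BATTLE -> OVERWORLD on a correct answer, or GAME_OVER once nothing is left.
// A wrong answer leaves the game in BATTLE so the player can try again.
function submitAnswer(game: Game, answer: string): Game {
  if (game.state !== "BATTLE" || game.current === null) {
    return game;
  }
  if (!isCorrect(game.current, answer)) {
    return game;
  }
  const caught = game.current;
  const remaining = game.remaining.filter(function (k) { return k !== caught; });
  return {
    state: remaining.length === 0 ? "GAME_OVER" : "OVERWORLD",
    remaining: remaining,
    captured: game.captured + 1,
    current: null,
  };
}
```

Calling `encounter(newGame(), "日")` followed by `submitAnswer(game, " Day ")` would capture the kanji and return to the overworld. Keeping the transitions as pure functions is just one design choice; it makes the rules easy to test separately from the canvas rendering.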
Google Gemini Canvas can create prototypes. It returned the following result.
However, when I first ran this on Lovable, I got the following result.
Clearly this is not a product to clamor about.
The pixelation concept was taken literally.
The kanji are barely legible — too small.
In the actual gameplay we cannot enter kanji.
So we need to improve the prompt.
Prompt Refinement
I proceeded to refine the aforementioned prompt even further using ChatGPT with the following instructions:
Refine this project description even further such that we create high fidelity 2D game graphics that can mimic a pokemon style game called Kanjimon
ensure we use the “Press Start 2P” font
ensure we use a human like figure with a hat
ensure we have pixelated trees
spell out the exact environment you are trying to create with your prompt …[CURRENT PROMPT]
I then inserted the refined prompt into Lovable and got the results I showed earlier.
Next Steps
Is there a perfect prompt? These models are capable of creating usable products in 1 shot. How can I improve, though?
I would proceed in the following directions.
Test this exact prompt among different Generative AI App makers, such as Bolt.new and Same.dev, as well as local solutions such as Cline, Cursor, Windsurf.
Create more practical 1-shot prompts.
Formalize design instructions in prompts to eliminate ambiguity.
Keep making.