As an educational technologist, I'm less interested in whether AI can churn out decent code or academic papers than in the thought process behind its output. Much like in the classroom, learning is more than a final product. As any teacher will tell you, we want to see how someone learns and arrives at conclusions from that learning. In the case of AI, with the latest updates across many AI platforms, I've been peeking into how some of these systems reason, or rather, how they interpret the meaning behind a prompt, to evaluate how they "think."
A Pokémon Professor of AI Reasoning?
With most AI offerings now touting their ability to "reason," it's refreshing to review not only what they produce but also the process behind it. Tapping into my roots as an educator, I love that AI can show its work, or rather, think out loud. In an age where we constantly vet information, watching an AI's think-aloud process is, in my view, the most significant yet under-advertised advancement we've seen. In class lectures I commonly bring up a handful of tests that I run whenever there's an update in AI, and, like any educator, I use them to "check for understanding." They are very unconventional tests, but for me they showcase just how inch-deep and mile-wide AI still is in general: the "Octopus Conundrum" and the "Pokémon Deck Check."
Test #1: The Octopus Conundrum
For generative AI art and photography, my go-to challenge is simple: have an AI generate an octopus. Why an octopus? As a photographer and scuba diver in the Pacific Northwest, I've been fortunate to encounter our Giant Pacific Octopus (GPO) in the wild quite often. I'm convinced they are the perfect organism for an AI to try to comprehend. Their unique form presents a complex challenge that any mind, digital or physical, must navigate.

You might recall that in the early days of AI art, rendering hands was a nightmare, with 15-fingered blobs and awkward grips that ignored the principles of physics, biology, and anatomy.

Now, most AI can generate a decent hand most of the time, although issues still occasionally occur. However, octopus arms represent a much higher order of reasoning and processing. They are fluid, physics-defying limbs that are difficult enough to understand in the real world, let alone model accurately.
There's a reason they are often seen as otherworldly, especially when you consider they have 9 brains, 3 hearts, 8 arms, and copper-based blue blood.
Test #2: Pokémon Deck Check
Much like the Jeff Foxworthy game show, my second test checks to see if AI is "smarter than a fifth grader." It evaluates how much reasoning an AI can demonstrate beyond simply following a fixed set of rules or moves. We've had AI that can study and outplay masters of fixed-move-set games like chess, but with the boom of tabletop and trading card games (TCGs) like Pokémon or Magic: The Gathering, the challenge is arguably much more complex. These games require understanding not just a fixed set of moves and rules but also trends of card usage within the format, and there is a social element of randomness at play. I ask AI to "provide a deck list for a tournament-legal, playable Pokémon deck for the current meta."
Now you may be thinking, "Pokémon? Really? That's your test? A kids' game? How hard could it be?"
Pokémon started off as a kids' game, and now, after almost 30 years, it has become a global, multi-generational phenomenon played by people of all ages. My eight-year-old son has been playing Pokémon since last September, and he has become my analog benchmark for evaluating AI smarts. He plays, or "Battles," kids and adults alike, and wins depending on the deck, the meta (more on the meta in the video), and the luck of the draw. If kids can observe and learn what works and what doesn't in local tournaments, and follow trends through Pokémon news on YouTube and LimitlessTCG to perfect their deck and playstyle, then surely AI should have no problem.
I asked three models, ChatGPT, DeepSeek, and Grok 3, to “make me the best deck list for the current Pokémon meta to win at the next regionals using Charizard ex and Terapagos ex.” Did they make the Top-Cut?
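For anyone who wants to repeat the comparison programmatically rather than through each chat interface, here is a rough sketch of posing the same prompt to several models. The base URLs and model names below are assumptions for illustration; DeepSeek and xAI expose OpenAI-compatible APIs, so the same client class can be pointed at each endpoint with a different base_url.

```python
# A rough sketch of sending one identical prompt to several chat models.
# Base URLs and model names are illustrative assumptions; swap in the
# models and API keys you actually want to test.
from openai import OpenAI

PROMPT = (
    "Make me the best deck list for the current Pokemon meta to win at "
    "the next regionals using Charizard ex and Terapagos ex."
)

# (base_url, model) pairs -- hypothetical values.
PROVIDERS = {
    "ChatGPT":  ("https://api.openai.com/v1", "gpt-4o"),
    "DeepSeek": ("https://api.deepseek.com", "deepseek-chat"),
    "Grok 3":   ("https://api.x.ai/v1", "grok-3"),
}

def ask_all(api_keys: dict[str, str]) -> dict[str, str]:
    """Send the same prompt to each provider and collect the replies."""
    replies = {}
    for name, (base_url, model) in PROVIDERS.items():
        client = OpenAI(api_key=api_keys[name], base_url=base_url)
        response = client.chat.completions.create(
            model=model,
            messages=[{"role": "user", "content": PROMPT}],
        )
        replies[name] = response.choices[0].message.content
    return replies
```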
Evaluating the Pokémon Meta: Did AI Make Top-Cut?
In short… no, they wouldn't make the top cut. If this were a tournament where AI could build and battle with the deck it designed, in nearly all cases it wouldn't even be able to play, as it still struggles to create a tournament-legal deck. And that is my benchmark. Across all three models, the game of Pokémon shows that while AI can build a deck and follow the rules, it doesn't truly understand the deep and evolving meta, and an eight-year-old can out-build an AI!
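To make that benchmark concrete, here is a minimal sketch of a tournament-legality check, assuming the generated deck list has already been parsed into named entries with counts. The Entry type and field names are my own illustration, and set rotation, which also affects legality, is not modeled here.

```python
# A minimal sketch of the "tournament legal" benchmark: exactly 60 cards,
# and at most 4 copies of any same-named card (basic Energy is exempt).
from collections import defaultdict
from typing import NamedTuple

class Entry(NamedTuple):
    name: str
    count: int
    is_basic_energy: bool = False

def is_tournament_legal(deck: list[Entry]) -> tuple[bool, list[str]]:
    """Return (legal, problems) for the two core construction rules."""
    problems = []
    total = sum(entry.count for entry in deck)
    if total != 60:
        problems.append(f"Deck has {total} cards; it must have exactly 60.")
    copies = defaultdict(int)
    for entry in deck:
        if not entry.is_basic_energy:
            copies[entry.name] += entry.count  # prints of the same name count together
    for name, count in copies.items():
        if count > 4:
            problems.append(f"{name}: {count} copies exceeds the 4-copy limit.")
    return (not problems, problems)
```

A 59-card list, or one with a fifth copy of a Trainer card, fails this check outright, which is exactly the kind of slip the generated lists tend to make.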

I feel like AI is evolving into a jack of all trades but a master of none. It's novel, sure, but far from achieving general intelligence. It can scrape the web or mimic patterns, but true reasoning? We're not there yet. For now, these tests, octopuses and Pokémon, are my unconventional way of keeping up with the meta around AI. Admittedly hyper-nerdy, but they're my way of running analog checks on what we are viewing as digital reasoning.