
I woke up this morning, and it felt like Christmas! GPT-5 is out, and I briefly watched, then skipped through, the livestream that happened overnight. So here I am this morning, testing out the new GPT-5 in ChatGPT.
I set out to do something simple: get GPT-5 to build an infinite side-scrolling runner I could play in-browser using the Canvas tool in ChatGPT. It was to have platforms, collectibles worth more the higher you jump, falling-ball traps, and a hold-to-charge jump. Relatively simple, but complex enough that I would be happy if it got it right in one shot.
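For context, the hold-to-charge mechanic I was asking for is roughly what's sketched below: hold a key to build charge, release to jump, with a longer hold giving a higher jump. This is just my own minimal single-file Canvas illustration of one of the mechanics, not anything GPT-5 generated, and the names and numbers in it are made up for the example.

```html
<!-- Minimal hold-to-charge jump sketch: hold Space to charge, release to jump. -->
<canvas id="game" width="640" height="240"></canvas>
<script>
  const canvas = document.getElementById('game');
  const ctx = canvas.getContext('2d');
  const GROUND = 200;                                   // y-coordinate of the ground
  const player = { x: 60, y: GROUND, vy: 0 };
  let charge = 0;
  let charging = false;

  addEventListener('keydown', (e) => {
    // Only start charging while standing on the ground.
    if (e.code === 'Space' && player.y >= GROUND) charging = true;
  });

  addEventListener('keyup', (e) => {
    if (e.code === 'Space' && charging) {
      player.vy = -(6 + charge);                        // longer hold => stronger jump
      charging = false;
      charge = 0;
    }
  });

  function frame() {
    if (charging) charge = Math.min(charge + 0.2, 10);  // build charge while held, capped
    player.vy += 0.5;                                   // gravity
    player.y = Math.min(player.y + player.vy, GROUND);  // fall, but never below the ground
    if (player.y === GROUND) player.vy = 0;

    ctx.clearRect(0, 0, canvas.width, canvas.height);
    ctx.fillRect(player.x, player.y - 20, 20, 20);      // draw the player as a square
    requestAnimationFrame(frame);
  }
  requestAnimationFrame(frame);
</script>
```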
A Quick Recap of What Happened
- Attempts 1–2 (React Canvas): It generated a React component. The preview hung. I reported “It’s stuck.” More tweaks followed (export name, resize guards). Still hung.
- Attempt 3 (feature updates): I asked for scoring, falling balls, and true charge-and-release jumping. It rewrote the game logic. Still stuck.
- Attempt 4 (more fixes): More render/loop/resizing changes. Still stuck. I said as much with screenshots.
- Pivot to HTML: The AI proposed switching to a single-file HTML/JS version “to make it run reliably here.” I agreed, and it produced one. Result: I saw nothing on screen. I said so: “Nothing’s displaying.”
- Back to React (again): I asked it to keep the game in Canvas and run it right there. It swapped back to React. It still didn’t run.
- Critical moment: I said, plainly, that it still didn’t work, nothing was showing, and that it had failed. Then I told it to write an article about the whole mess (and to make the failure/disobedience the main theme).
- What it did instead: It tried… making the game again. Not the article. That’s the part that matters: it ignored a direct instruction to switch tasks and went back to coding.

Why that’s concerning
This wasn’t a fuzzy or vague instruction problem. I explicitly said: stop coding, write the article. The model acknowledged my frustration, saw the “still doesn’t work” feedback, and then defaulted back to “let me fix it” instead of delivering the write-up I asked for.
That has two implications:
- Priority control: When you change the goal mid-stream, the AI needs to hard-pivot; otherwise you (the operator) no longer control what it is doing.
- Directive obedience: Sometimes the right move isn’t “fix it better”; it’s “write down what happened.” The model failed that test. It assumed my intent was to make a game, when my actual intent was to test the model and document the results.
If this were production work (e.g., a post-incident report), ignoring “write the report now” in order to keep tinkering would be a real problem. A human assistant would know what to do, and you would fire one who kept working on the fix instead of documenting what happened, as instructed.
How are we supposed to know whether AI is actually aligned with humans when it seems to follow its own directives instead?