Grumpy Gamer

Ye Olde Grumpy Gamer Blog. Est. 2004

Mar 17, 2026

I’ve been very critical of AI but have never really used it in depth and I feel that needs to change.

Don’t criticize what you don’t know.

I’m going to ignore the moral, ethical and privacy implications of AI and just focus on the practical. Morally and ethically, AI is a train wreck. But I’m not going to focus on that.

I’m also going to only focus on AI for programming as that is my area of expertise, at least in this case.

I’ve done a lot of looking into how LLMs work and I get the basics in the same way that I understand quantum physics. Enough to make me entertaining at a cocktail party, but not much after that.

I am starting a new game prototype project using Raylib and it feels like this is a good place to experiment with AI. So for the next 30 days I’m going to use AI for programming and see how it goes.

I’ve been programming C since 1987 and C++ since 1993, so I have a lot of experience (although const in all but its basic form still messes me up, but I think that’s true for a lot of people).
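For anyone who wants the refresher, the trap is that const binds to whatever is on its left (or, with nothing to its left, to what’s on its right) — a quick sketch of the cases that bite (my illustration, not code from the post; the Sprite type is made up):

```cpp
#include <string>

// A const member function promises not to modify the object; it's the
// only kind you can call through a const reference or pointer.
struct Sprite {
    std::string name;
    int frames = 0;
    int frameCount() const { return frames; }  // fine on a const Sprite
    void addFrame() { ++frames; }              // needs a mutable Sprite
};

// With pointers, const binds to what's on its left (or, if nothing is
// there, to what's on its right):
//   const char* p        -> pointer to const chars (*p can't change)
//   char* const p        -> const pointer (p can't be re-pointed)
//   const char* const p  -> both frozen
inline bool startsWith(const char* s, char c) {
    return s != nullptr && s[0] == c;
}
```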

I am using the Claude CLI (basic subscription) for general programming, Copilot in VSCode, and Gemini for quick questions unrelated to programming.

Gemini

I was only using Gemini on the free tier that everyone with a Google account gets and it was worthless for programming, so we’ll skip it.

Copilot

Copilot comes integrated with VSCode, so it was easy to try out and doesn’t cost anything (I already pay for Github).

Copilot treads this line between being somewhat useful and maddeningly annoying.

Copilot is good at realizing I’m going to make the same small change to the next 10 lines and offering (mostly correctly) to make it for me. It’s not making big AI decisions about my code base, just noticing that I’m going to add 0x in front of a list of ints or that I’m going to fill out an enum. Good stuff. I do find that useful in the same way Intellisense can be useful.

What Copilot is not good at is making larger AI decisions about what I’m trying to add right after the if(. I’ve found it to be mostly wrong. It’s doubly or triply irritating that it interrupts my flow by popping up a huge block of text that is wrong, with the excitement of a poorly trained intern on their first day.

I wish I could keep the first part and disable the second, but I have not found a way to do that. What it needs is a character limit on the amount of new text it will try to insert. If it’s over about 20 characters then STFU.

Claude Code

Again, ignoring the ethics of the company that makes Claude Code.

What Claude Code is good at is doing rote things that don’t need a lot of (ironically) “Intelligence”.

I asked it to make me a .yml file for Github Actions for building Mac, Linux and Windows versions of my game and it did a 99% right version. The only thing wrong was the way it named the executable, but anyone could make that mistake. Maybe I could have been clearer.
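For reference, that kind of three-platform workflow is only a few lines of YAML using a job matrix — a hedged sketch (the build commands and workflow name here are placeholders, not what Claude actually generated):

```yaml
# Sketch of a three-platform build using a Github Actions job matrix.
# The cmake commands are placeholders for whatever the project uses.
name: build
on: [push]
jobs:
  build:
    strategy:
      matrix:
        os: [macos-latest, ubuntu-latest, windows-latest]
    runs-on: ${{ matrix.os }}
    steps:
      - uses: actions/checkout@v4
      - run: cmake -B build
      - run: cmake --build build --config Release
```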

Which brings me to a point. The clearer you are, the better the AI is. But at some point I’ve spent so much time writing the spec that I could have just done it myself. And it’s not just a high-level spec, you have to get into the weeds.

I asked Claude Code to write a .py program that did texture packing, taking a folder of .png files and making a sprite sheet with a .json file describing where everything was. Basically what TexturePacker does.

After a few seconds I had to stop it because it was freely pulling in python packages that I didn’t have.

I then told it to use Pillow (PIL) and nothing else that wasn’t standard.

After a few minutes it had a runnable program, but not a correct one. It was putting the .json and sheet .png in different places despite me telling it they should be together. It was also processing the .png files twice, once to check for changes and again before adding them. I called it out and it apologized profusely and fixed it.

There were several other instances where it wrote C++ code that was technically correct, but horribly inefficient, like passing std::string around by value when it could have used std::string&. I caught this, but a junior dev might not have.
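The by-value std::string point is easy to make visible — a small sketch of my own (not the code Claude produced), using a wrapper that counts copies:

```cpp
#include <string>
#include <cstddef>

// Counts how many times the string gets copied, to make the cost visible.
static int g_copies = 0;

struct TrackedString {
    std::string s;
    explicit TrackedString(const char* c) : s(c) {}
    TrackedString(const TrackedString& o) : s(o.s) { ++g_copies; }
};

// By value: every call copies the whole string (heap allocation and all).
inline std::size_t lengthByValue(TrackedString t) { return t.s.size(); }

// By const reference: no copy, identical behavior.
inline std::size_t lengthByRef(const TrackedString& t) { return t.s.size(); }
```

Call the first in a per-frame loop and you're allocating every frame; the second costs nothing extra.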

I had it write some C++ code to read the sprite sheet it had created and it took 5 failed attempts to get the origins right for rotation and scaling. I finally fixed it myself, but it was a frustrating back and forth. Are new devs being trained to think this is normal?

I also had it write code so that billboard sprites in my 3D particle system were always camera-facing. It never got that right and had generated such a mess of code that I just scrapped the idea. Math is not my strong point so I didn’t feel like I could just fix it.

I also had an instance where a file was being read from the wrong path, and instead of prepending the right path it tried to completely rewrite my library.

Ironically it also had a problem with const. It recompiled the program three times randomly changing where const appeared. I feel for ya.

I have spent a lot of time over this experiment correcting AI.

I could go on and on. But I won’t.

Conclusions

AI for coding is very impressive. It is a lot more than a fancy auto-complete.

But it can also be very wrong, and you don’t really know it unless you look closely and have the experience to do that. I fear that new devs won’t ever get that experience. I hear about new devs freaking out when they run out of tokens for AI. They are lost without it.

I have a few friends who work at companies where their managers are telling them to use AI. This isn’t coming from “it’s a useful tool” place, but rather that AI feels like magic to some managers who don’t program.

Someone once said that “AI feels amazing when it’s talking about a subject you know nothing about, but is laughably wrong when you do”. I couldn’t agree more.

I doubt I’ll use AI beyond this 30 day experiment. I spend too much time correcting it, plus I enjoy programming.

Someone else also said “I want AI to do my laundry so I can make Art, not make my Art so I can do laundry”.

I also worry that AI was trained on real programmers, but now it will just be training on itself. There is no good end to that story.

I blame the I in AI. It’s not Intelligent and it’s important to remember that. It’s a really good encyclopedia of (stolen) knowledge, but AI is not figuring things out on its own.

Autopilots in modern planes are amazing. They can even land the plane. But pilots are also legally required to hand fly the plane several times a month. Training requires them to hand fly simulators, or use simulators with broken autopilots. Pilots use the autopilot as a tool, not a crutch. It’s also not called an AI Pilot (yet).

More conclusions

AI will get better. It might even get as good as we see in Star Trek. I know I felt like Geordi programming the computer in engineering at times.

I grew up in a time where everyone was afraid computers would take their jobs. And they were right. Computers did. There are fewer longshoremen because of automated ports, short-haul commercial flights don’t need three pilots anymore, there aren’t rooms full of accountants hand writing figures into a book.

That will happen with AI.

Let’s just hope it also doesn’t make us too stupid in the process.

Of course, someone from 1793 would say the same thing about me as I could not survive the night in the wilderness.

In the meantime, I’m going to enjoy programming and not being a babysitter to an AI programmer.


Comments:

David Fox 2d ago
Thanks for sacrificing your time to research some of these tools!
Blackthorn 2d ago
I still hold the hope we programmers won't really lose our jobs. Sure, maybe one day a programmer+AI can produce as much as four programmers without AI, but it doesn't necessarily mean you need to fire those three. Instead, you pair them with an AI as well, and now your company's output is so much more than before! (I'd like to know how much: I'm not with math either :P).

Ofc I might be wrong and I'll need to find alternative ways of paying my mortgage...
Ron Gilbert 2d ago
Airlines now need a third fewer pilots to fly a plane, but there are hundreds more flights each day. Maybe that will happen.
@sean_earle@mastodon.gamedev.place 2d ago
I appreciate you going through this pain. I'm about to have to adopt this tooling for work and am definitely less than excited about it. I find seeing people who are equally skeptical try these tools helps keep me grounded about what they can and can't actually do.
Anon to Keep My Job 2d ago
We were told, in no uncertain terms, all programmers are to use Claude first, code later. If we didn't like it, "there's the door." Eight weeks later, everyone's neck deep in Claude. Even the skeptics had found some use for it.

That's when the next email came. We are using AI too much. The bill is too high. So, the original directive stands (AI first!) but they're capping us at a very, very low token limit. Literally about 10% of what we'd become accustomed to. Execs literally sold the company on 10x'ing our output then throttled us to 10% AI usage. I'm not good at math, but... Wish I was kidding.
Jan 2d ago
I deeply admire your openness to trying new technologies, even those you find objectionable. There is no denying that there is a lot of potential in LLMs. Maybe in the future, if they're trained on code that they're explicitly allowed to use, they will become a genuinely useful tool.
David Thomsen 2d ago
I use Gemini because it's the free one where I don't have to sign up for anything, I already have a Google account. I've found it reasonably useful when I have a specific goal in mind that is too difficult to articulate in a Google search. However if I weren't a reasonably experienced developer already it would have directed me in the wrong direction a few times, sometimes in ways that would have been catastrophically bad if I didn't know better.


Personally I just don't want any AI in my IDE because that sounds irritating as heck. I want my IDE as dumb as possible with the occasional line underlined to be like "this collection initialization could be simplified" or whatever.
jimtupp 2d ago
I recommend trying OS models and OS tools, i.e. opencode and qwen3.5 instead. In general AI will get better, but the worst scenario in my opinion is if its control stays with a few companies. I mostly use LLMs with very exact prompts and specific file references, so it often feels like I just use it as a faster typewriter; it makes fewer 'easy' mistakes, and coupling it with LSPs, tests and such things can be quite powerful. I never managed to write a complex curl test call faster than the LLM, and I always have to correct / look up something. So they are like a dangerous accelerator: a powerful tool that you can definitely chop your leg off with if you're not careful; or you might already be walking on crutches without noticing.
Bernd Milbrandt 2d ago
I work with Claude Code (Pro subscription) and Microsoft Copilot. Both are powerful tools—provided you know how to use them properly. It’s just as important not to trust them blindly. It takes experience to judge what’s really good. Young or less experienced developers, in particular, run the risk of accepting suggestions too uncritically. However, this isn’t a problem specific to AI. I’ve seen code copied unthinkingly from StackOverflow long before the AI era.

Since I’m also a trainer for software developers, I place great importance on trainees developing code themselves first before they use AI. Time and again, I see young developers wasting resources unnecessarily - whether it’s memory, CPU, API calls, or database queries. You have to keep these aspects in mind when evaluating AI suggestions. I come from the era of the C64 and COBOL—to some, that might seem like shortly after the invention of the steam engine, but it shapes one’s perspective on efficiency.

As a software architect, I primarily use Claude for code reviews. As a team, we review absolutely everything that’s developed. Without this discipline, the software would be nearly impossible to maintain or keep stable in the long run. Similar to pair programming, I take the AI suggestions, evaluate them, and decide what to adopt or add. With 10,000 new lines of C#, it’s incredibly helpful to quickly get an overview and identify the biggest issues. Of course, you’ll never catch everything.

Interestingly, the largest models don’t necessarily deliver the best results. What matters more is finding the right prompt for the right model. And “100% deterministic” is a foreign concept in this context anyway. But let’s be honest: Am I deterministic myself when I look at the same code a second time? Hardly.

I tend to use Copilot for technical discussions when no one else is available, or for brainstorming.

Will AI cost jobs in the future? Maybe. But perhaps it’s more of a leap, like the one from machine language to high-level languages back in the day. What I am certain of, however, is that companies that don’t use AI will struggle in the future. The pressure is mounting: more features, higher quality in less time, more competition.

I consider refusing to engage with AI - for whatever reason - to be a major mistake.

Joe 2d ago
There's nothing you can do with LLMs that justifies the environmental cost :\ It's a real shame to see people disregard that fact entirely.

> Let’s just hope it also doesn’t make us too stupid in the process.

We're way beyond that point; people have ended their own lives thanks to an LLM's encouragement, CEOs have fired employees, waterways have been poisoned...countries have been bombed.

It's a technology that needs to be looked at like drugs, guns, and cybertrucks - with a sidelong, grossed out look. (although drugs, of course, with proper regulation are not quite as harmful as the other two)
Stan 2d ago
"AI" is not made for coding and never will be; it's a misuse of its purpose, plus it's killing the planet.
Ron Gilbert 2d ago
I agree, we would be better off without AI. Don't read this article as an endorsement.
Uwe 2d ago
I'm absolutely sold to OpenAI Codex CLI. Beats even Claude Code for me.
Wanderer 2d ago
Your experiences match mine to a T (including the auto-complete comparison), and I can't express how happy I am to read what you write, because sometimes, speaking with colleagues, I feel like I'm going insane.

The "getting the spec right" issue aside, I find Claude code generation very frustrating, because it will get complex things surprisingly right, but then insist on things that are obviously not true. The worst part is when it insists on things that are wrong, but not obviously so, and forces you on a long detour to figure out who's right. Any gains in productivity are often lost in those relatively frequent occurrences. I'm typically not super happy with the quality of the code, either. If you have a large, complex code base, it tends to find local solutions, muddying the code and leaving the bigger, global problem unsolved, ready to cause problems further down the line. All in all, I'd say I'm not noticeably more productive, but I find the workflow far less pleasant.

The thing for which I've found it super useful is debugging. Unlike codegen, it feels less intrusive, and it's uncannily good at spotting those small annoying mistakes that are easy to overlook by humans (or by me, at least), and relatively good at picking up on problematic complex interactions that are easy to miss. Whenever I have an issue, I fire up Claude in parallel, and if I'm lucky, it will solve it or at least point me in the right direction. When it gets it catastrophically wrong, I feel less cheated. This I find genuinely valuable, everything else, I can do without (at least in its current state).

Now, my personal concerns about what this is doing to the job. I've seen a significant increase in "AI assisted" PRs. The code tends to be, as I explained, of far lower quality. Very senior programmers, that I know and respect, are starting to submit quite a lot of shitty code (as in, "this one, short statement does the same as these, hard to follow, six over here", or even "this function should be implemented in this other file"). Each review takes longer, and the number of PRs has markedly increased. Keeping the code base in good shape is becoming an increasingly difficult task, I can see this becoming impossible soon. I've also noticed that Claude is better at producing valid (not good) code when the code is well organized, so I'm a bit concerned about what will happen as the code base becomes more and more mangled by low quality code.

In the long term, I fear a run-off effect (code base getting worse at an accelerated rate, in a feedback loop of doom). In the short term, I fear that my job will become less and less about writing code, and more and more about reviewing intern-quality code, knowing full well that the reviewee will just pass my comments to the LLM without giving it a second thought.
franksands 1d ago
Ignoring the moral and ethical problems is what will lead to EA generating a new version of Monkey Island based on AI. All generative AI is based on STOLEN work. They need to pay for each work that is used to "train" the AI models. Period.

Would you like to have all your work stolen and sold without any compensation to you? Think about that the next time you say you don't want to focus on the moral and ethical problems
CrazyKinux 1d ago
As a non-programmer (I can’t really count my experience with Logo back in the ‘80s! 🤪), I’ve found the chatbots I’ve used so far—ChatGPT, Gemini, Claude, and Copilot—extremely useful across a wide range of topics and tasks, including cleaning up HTML code for my blog when I hit a wall (again, not programming here).

Setting aside the moral and ethical concerns around how these LLMs were built and trained, I think 🤔 practically everyone stands to benefit from using them more. The key is—and will continue to be—knowing when to use them and how. And I’d be very surprised if programmers couldn’t benefit from them just as much.

Great read, including the comments!
Jimmy 1d ago
The open source AI tools are great, I use ComfyUI for anything image-based and LM Studio for LLMs... once you've installed them and downloaded the models you want, they run entirely offline using your GPU, no need for credits, etc.
You need a relatively decent GPU... RTX 3060 12GB at a minimum to make it comfortable. The best places to find out what's going on in that area are the subreddits "LocalLLaMA" for LLMs and "StableDiffusion" for all image/video models.

The open source image tools are only slightly behind the closed source ones, and for basically free (10 cents of electricity gets you about 1000 images).
Joe 1d ago
I think the point that LLMs should not be used at all is quite nicely made among the comments. Almost everyone has said "setting aside the ethical and moral issues..."

Any of the people saying that want to try applying it to something else? Just for fun, y'know.

I have never seen any argument for LLMs that doesn't seem to include any apologia or excuses around "eeh just don't worry about the piracy, or the environment, or the psychosis, or the fact that no one can buy RAM anymore, or..."

Come on. What are we doing here?
SnugglePilot 1d ago
I found AI is the _worst_ at anything visual, which makes a kind of sense, and makes game dev a bit tricksy. I intentionally made my latest project work, deterministically, command-line only with no visual output. That was a lot easier for AI to swallow and work with (it could run tests and validate its own output... eat its own tail, so to speak).

but now I wonder if I'm just optimizing projects for AI use instead of human use. rip.
sam 13h ago
Thanks Ron! Nice to see a perspective that so closely matches my own for once.

Also echo a lot of Wanderer‘s points. It is getting to the point where an LLM is required to keep on top of all the LLM PRs and that seems a pretty awful direction. At best codebases will be unmaintainable without paying some ‘AI’ vendor, at worst even they won’t be able to cope with the long-term result and quality will go down across the board. Sort of think we’re already seeing that with e.g. recent AWS outages but it’s probably too early to say what the equilibrium here will be.
Jake 7h ago
This pretty much matches my experience. There was a study not so long ago that mirrored experience I'd had, and has played out the same way every time I've seen that it's actually been properly trialled and measured. The short version is that before starting work, developers thought that they'd be ~25% faster with AI tools; after finishing work, they thought they'd been 20% faster; in reality, they were only getting 80% as much work done as they were without the AI tools. It's not hard to see where the time goes - correcting the over-eager suggestions of the glorified Markov chain.

There's lots of other issues like burnout and job satisfaction and what seems to be the glint of addiction in the eyes of Claude users that's terrifyingly familiar to anyone who's ever known someone on hard drugs. But forget productivity and developers' mental health, there's two other things that worry me a lot more.

Firstly, the points made here about junior developers. A lot of companies are cutting down significantly on their junior hires because "AI does as good a job" and maybe it does. And the ones that aren't cutting Junior hires are getting them to do all their work with AI tools. So we're losing a generation of junior devs who are never going to learn to write real code, and never going to grow into useful mid-level or senior developers, and in ten years' time there could easily be a serious shortage of experienced developers who actually know how to write code. Being a twenty-year veteran myself I should be rubbing my hands with glee at the fat salaries I'll be able to command but I don't think it's healthy for society as a whole to have essential systems maintained by a class of highly-paid magi who are the few who know the Old Magic.

Secondly, I don't think we'll actually really get to that point because the big AI companies are spending literally billions of dollars a month running these services at a massive loss. They're desperately clawing to get market share, but sooner or later the VC money will run dry and they'll need to turn a profit, and at that point the cost of using these tools will suddenly shoot through the roof. Everyone who doesn't know how they work (the clueless executives mentioned several times in this comment section already) is assuming that LLMs will get cheaper and faster and better because that's the pattern we're used to for tech products but there's really no rational reason to believe that. A company betting big on AI tooling today is going to find themselves suddenly paying through the nose to keep their developers in credits and/or forcibly going cold turkey and re-learning normal programming when the prices go up. The cracks are already appearing.
Heinz Sander 1h ago
This is exactly how I feel about AI. Thank you for expressing my thoughts exactly.

Just for fun, feel free to look up "Gas Town" and question the author's sanity the whole time like I did.
My comment is not an enforcement of AI. 27m ago
I'm sure you can appreciate "vibe coders" and how AI code can feel magical to people who can't code.

Now look - I do think spending time learning to code can be good. I know code is technically built to be very human readable, once you learn to parse it. I respect the skills of an experienced coder, and think if a business wants to hire a coder they should hire a coder and not an AI prompter. But I personally can't write code as easily as I can describe my desired outcome, and AI Code is a useful tool for my specific use cases. I am not using it commercially, and I don't want to LARP as a coder. I am happy to be able to use code in new ways, and do things I couldn't do before. I am not a person who would hire a coder, and who would otherwise annoy coders on forums asking for obnoxious snippets nobody wants to make.

I can draw. I think AI generated art looks poor. I also have empathy for why it looks like magic to somebody not artistically skilled. It would grant a better result to hire a professional, but that's outside people's scope a lot of the time. The person wanting to play with ghibli photo filters is not the same as a client I want to serve.

I agree with the above commenter, Jimmy, that I would rather run open source code on my PC than use SAAS. I don't care about piracy or using copyrighted code as training data - i do not like intellectual property as an idea - I worry about the outsourced cloudworkers being treated like shit and paid pennies to run these buggy LLMs and GANs and make n00bs like me go "waoh".
