LLMs are bad at math - we’ve known this. But what I didn’t know is that the coding agent authors still haven’t solved the issue in 2026.
Just finished reading a post from Wes McKinney where he breaks down how capable LLMs are at adding just 30-50 numbers together:
https://wesmckinney.com/blog/llms-arithmetic/
I assumed they had figured out how to write and call a script to add the numbers…but the LLMs are still trying to do the work inside their own context.
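For a sense of how small that script can be, here is a minimal sketch in Python. The function name `add_numbers` and the sample values are made up for illustration, and it assumes the numbers arrive as plain decimal strings; it is not taken from the linked post.

```python
from decimal import Decimal


def add_numbers(values):
    """Sum numeric strings exactly with Decimal, avoiding float rounding."""
    # Start from Decimal("0") so the result is always a Decimal.
    return sum((Decimal(v) for v in values), Decimal("0"))


# Hypothetical sample input: the kind of list an agent might be asked to total.
numbers = ["19.99", "5.01", "100", "0.1", "0.2"]
print(add_numbers(numbers))
```

The point is that the model only has to write and run something like this, then read back one number, instead of carrying every intermediate sum in its context.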
Just today I asked my LLM to run the unit tests after finishing a job so it could tell whether the work was good. Then the context window started flooding because the output from my integration test was blowing it up. It ended up 99% full, with a big bill to match.
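One way to keep test output from flooding the context is to run the command through a small wrapper that hands back only the exit code and the last few lines. This is a rough sketch, not a feature of any particular agent: `tail` and `run_quietly` are hypothetical helper names, and the noisy command below is a stand-in for a real test runner.

```python
import subprocess
import sys


def tail(text, max_lines=20):
    """Keep only the last max_lines lines, noting how many were dropped."""
    lines = text.splitlines()
    if len(lines) <= max_lines:
        return "\n".join(lines)
    omitted = len(lines) - max_lines
    return "\n".join([f"[{omitted} earlier lines omitted]"] + lines[-max_lines:])


def run_quietly(cmd, max_lines=20):
    """Run a command, capture its output, and return (exit code, short tail)."""
    result = subprocess.run(cmd, capture_output=True, text=True)
    combined = result.stdout + result.stderr
    return result.returncode, tail(combined, max_lines)


# Stand-in for a noisy test run: a child process that prints 1000 lines.
noisy = [sys.executable, "-c", "for i in range(1000): print('line', i)"]
code, summary = run_quietly(noisy)
print("exit code:", code)
print(summary)
```

The exit code alone usually answers "was it good or not"; the tail is there for the failure summary that most test runners print last.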
I’ve heard of people using LLMs to parse phone numbers - because they don’t know that regular expressions exist. Same idea.
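For comparison, here is roughly what the regular-expression version looks like. It is a sketch that assumes 10-digit North American style numbers with an optional +1 prefix; the pattern, the `find_phone_numbers` name, and the sample text are illustrative, and real-world phone formats would need more care.

```python
import re

# Assumed format: optional "+1", then 3-3-4 digits separated by
# spaces, dots, or hyphens, with optional parentheses around the area code.
PHONE_RE = re.compile(
    r"(?:\+?1[\s.-]?)?\(?(\d{3})\)?[\s.-]?(\d{3})[\s.-]?(\d{4})"
)


def find_phone_numbers(text):
    """Return every matched number normalized to the form 555-123-4567."""
    return ["-".join(match.groups()) for match in PHONE_RE.finditer(text)]


sample = "Call (555) 123-4567 or 555.987.6543 after 5pm."
print(find_phone_numbers(sample))
```

Same input, same output, every time, and no tokens spent.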
A good manager of an LLM will be one who knows how to program and when to offload work to deterministic code.