Researchers show that AI can learn a rare programming language by correcting its own errors, raising its coding success rate from 39% to 96%.
There's a lot more to a model than just benchmarks.