Uplevel measured "the time to merge code into a repository [and] the number of pull requests merged" for about 800 developers over a three-month period (comparing the statistics to the previous three months).
The study "found no significant improvements for developers" using Microsoft's AI-powered coding assistant tool Copilot. Use of GitHub Copilot also introduced 41 percent more bugs, according to the study.
The Uplevel study also looked at factors in developer burnout and found that GitHub Copilot did not help there, either. The amount of working time spent outside of standard hours decreased for both the control group and the test group using the coding tool, but it decreased more when the developers weren't using Copilot.
An Uplevel product manager/data analyst acknowledged that there may be other ways to measure developer productivity — but they still consider their metrics solid.
"We heard that people are ending up being more reviewers for this code than in the past... You just have to keep a close eye on what is being generated; does it do the thing that you're expecting it to do?"
The article also quotes the CEO of software development firm Gehtsoft, who says they did not see significant productivity gains from LLM-based coding assistants — but did see them introducing errors into code.
With different prompts generating different code sections, "It becomes increasingly more challenging to understand and debug the AI-generated code, and troubleshooting becomes so resource-intensive that it is easier to rewrite the code from scratch than fix it."