<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Andrew Rutherfoord</title>
    <description>The latest articles on DEV Community by Andrew Rutherfoord (@recursivecube44).</description>
    <link>https://dev.to/recursivecube44</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3849868%2F9ec46109-e0a6-4a71-8c52-18f0eff32809.png</url>
      <title>DEV Community: Andrew Rutherfoord</title>
      <link>https://dev.to/recursivecube44</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/recursivecube44"/>
    <language>en</language>
    <item>
      <title>Developer Productivity in the Age of AI</title>
      <dc:creator>Andrew Rutherfoord</dc:creator>
      <pubDate>Sun, 29 Mar 2026 19:21:05 +0000</pubDate>
      <link>https://dev.to/recursivecube44/developer-productivity-in-the-age-of-ai-1ff6</link>
      <guid>https://dev.to/recursivecube44/developer-productivity-in-the-age-of-ai-1ff6</guid>
      <description>&lt;p&gt;Authors: &lt;a href="https://linkedin.com/in/andrew-rutherfoord" rel="noopener noreferrer"&gt;Andrew Rutherfoord&lt;/a&gt;, &lt;a href="//www.linkedin.com/in/delia-andreea-popa"&gt;Delia Popa&lt;/a&gt;, &lt;a href="https://www.linkedin.com/in/giannis-loukas/" rel="noopener noreferrer"&gt;Ioannis Loukas&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;&lt;span&gt;AI&lt;/span&gt; coding tools use large language models to generate code to achieve the desired goals of a developer. These tools are increasingly used in the software engineering field, with the goal of accelerating implementation work and reducing developer effort. Despite their rapid adoption, the extent to which these tools improve developer productivity remains unclear. Therefore, we aim to bridge this research gap and study the use of these tools and whether it has a measurable impact on productivity.&lt;/p&gt;

&lt;h2&gt;
  
  
  Background
&lt;/h2&gt;

&lt;p&gt;Software productivity encompasses objective dimensions (effort vs. output) and subjective perceptions of efficiency (Weisz et al. 2025). We focus on objective metrics—commit frequency, lines of code (&lt;span&gt;LOC&lt;/span&gt;) modified, and code churn—to measure delivery.&lt;/p&gt;

&lt;p&gt;Code churn is “commonly used to capture the intensity of software changes” (Gomes et al. 2026) and excessive churn in code files can be associated with poor design and technical debt. Therefore, churn can be used as an objective metric for code quality by analyzing changes per file over time.&lt;/p&gt;

&lt;h2&gt;
  
  
  Methodology
&lt;/h2&gt;

&lt;p&gt;To assess how &lt;span&gt;AI&lt;/span&gt;-assisted tools contribute to developer productivity, we aim to answer the following research questions (&lt;span&gt;RQs&lt;/span&gt;):&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;To what extent does the adoption of &lt;span&gt;AI&lt;/span&gt; programming tools affect developer productivity?&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;How does the adoption of &lt;span&gt;AI&lt;/span&gt; programming tools affect code rework trends?&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;By analyzing both code output and code churn, we get a fuller picture of productivity: raw output volume alone can mislead if that code is repeatedly reworked.&lt;/p&gt;

&lt;p&gt;We hypothesize that the adoption of &lt;span&gt;AI&lt;/span&gt; programming tools increases development productivity, but also results in increased code rework.&lt;/p&gt;

&lt;h3&gt;
  
  
  Productivity metrics
&lt;/h3&gt;

&lt;p&gt;To answer our &lt;span&gt;RQs&lt;/span&gt;, we compare metrics before and after &lt;span&gt;AI&lt;/span&gt; adoption using statistical testing. We compute metrics per commit (each commit is one observation) and weekly (commits aggregated by week), shown in Tables 1 and 2. These metrics combine raw output and quality of changes. For example, &lt;code&gt;files_touched&lt;/code&gt; indicates how broad a change is, whilst &lt;code&gt;add_delete_ratio&lt;/code&gt; captures the balance between new code and refactoring.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;strong&gt;Code name&lt;/strong&gt;&lt;/th&gt;
&lt;th&gt;&lt;strong&gt;Definition&lt;/strong&gt;&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;churn&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Lines added &lt;code&gt;+&lt;/code&gt; lines removed. (Faragó et al. 2015)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;net_added&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Lines added not removed in the same commit.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;net_removed&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Lines removed not added in the same commit.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;files_touched&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Number of files modified in the commit.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;is_net_negative&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;1 if &lt;code&gt;net_removed &amp;gt; net_added&lt;/code&gt;, else 0.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Per-commit productivity metrics.&lt;/p&gt;
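&lt;p&gt;One plausible reading of the per-commit definitions above, sketched in Python (the function and field names are our own, not taken from the study's replication package):&lt;/p&gt;

```python
def commit_metrics(added, removed, files_touched):
    """Per-commit metrics from Table 1, under one plausible reading:
    net_added / net_removed are the surplus of additions over deletions
    (and vice versa) within a single commit."""
    return {
        "churn": added + removed,                 # total lines changed
        "net_added": max(added - removed, 0),     # additions not offset by deletions
        "net_removed": max(removed - added, 0),   # deletions not offset by additions
        "files_touched": files_touched,
        "is_net_negative": int(removed > added),  # 1 if net_removed exceeds net_added
    }
```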

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;strong&gt;Code name&lt;/strong&gt;&lt;/th&gt;
&lt;th&gt;&lt;strong&gt;Definition&lt;/strong&gt;&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;gross_churn&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Weekly total lines added + lines removed.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;net_added&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Weekly total net added lines.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;net_removed&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Weekly total net removed lines.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;net_negative_commits&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;net_removed - net_added&lt;/code&gt;.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;add_delete_ratio&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;total_added / total_removed&lt;/code&gt;.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;total_commits&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Number of commits in the week.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;files_touched_per_commit&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Weekly &lt;code&gt;files_touched&lt;/code&gt; divided by &lt;code&gt;total_commits&lt;/code&gt;.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Weekly productivity metrics (aggregated by week).&lt;/p&gt;
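&lt;p&gt;The weekly aggregation can be sketched with the standard library alone, grouping commits by ISO week. The record shape is an assumption, and the weekly net metrics (sums of the per-commit net values) are omitted for brevity:&lt;/p&gt;

```python
from collections import defaultdict
from datetime import date

def weekly_metrics(commits):
    """Aggregate per-commit records into weekly metrics (Table 2).
    Each commit is a dict with 'date', 'added', 'removed', and
    'files_touched' keys -- an assumed shape, not the study's exact schema."""
    weeks = defaultdict(lambda: {"added": 0, "removed": 0, "files": 0, "commits": 0})
    for c in commits:
        iso = c["date"].isocalendar()
        w = weeks[(iso[0], iso[1])]   # key: (ISO year, ISO week)
        w["added"] += c["added"]
        w["removed"] += c["removed"]
        w["files"] += c["files_touched"]
        w["commits"] += 1
    out = {}
    for key, w in weeks.items():
        out[key] = {
            "gross_churn": w["added"] + w["removed"],
            "total_commits": w["commits"],
            "files_touched_per_commit": w["files"] / w["commits"],
            # guard against division by zero when nothing was removed
            "add_delete_ratio": w["added"] / w["removed"] if w["removed"] else None,
        }
    return out
```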

&lt;h3&gt;
  
  
  Dataset Construction
&lt;/h3&gt;

&lt;p&gt;We constructed a dataset of open-source GitHub repositories, allowing us to analyze productivity before and after &lt;span&gt;AI&lt;/span&gt; adoption. Using Google Dorks, we identified 412 repositories containing &lt;code&gt;CLAUDE.md&lt;/code&gt; or &lt;code&gt;AGENTS.md&lt;/code&gt; files. These artifacts provide context for agentic &lt;span&gt;AI&lt;/span&gt; tools, signaling extensive &lt;span&gt;AI&lt;/span&gt; integration within the project.&lt;/p&gt;

&lt;p&gt;We extracted repository data using NeoRepro &lt;sup id="fnref1"&gt;1&lt;/sup&gt;, which uses PyDriller to mine each repository and stores the result, including file modifications, in a Neo4j graph database (Rutherfoord, n.d.). The data structure is shown in Figure 1. For each file changed in a commit, the tool stores the &lt;code&gt;git diff&lt;/code&gt; on the &lt;code&gt;MODIFIED&lt;/code&gt; relation, allowing us to analyze the code changes for our code churn research question. &lt;sup id="fnref2"&gt;2&lt;/sup&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnlv8ufpyqq37negjc6mr.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnlv8ufpyqq37negjc6mr.png" width="800" height="433"&gt;&lt;/a&gt;&lt;/p&gt;
Structure of Data in Neo4j after extraction using NeoRepro &lt;span&gt;(Rutherfoord, n.d.)&lt;/span&gt;.



&lt;h3&gt;
  
  
  Dataset Cleaning
&lt;/h3&gt;

&lt;p&gt;To provide reliable results we exclude repositories with insufficient data for pre- and post adoption comparison. Therefore, we exclude repositories with fewer than 500 total commits, or fewer than 50 commits before or after &lt;span&gt;AI&lt;/span&gt; adoption, leaving 180 repositories.&lt;/p&gt;
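&lt;p&gt;The exclusion rule above amounts to a simple predicate (the parameter names here are ours):&lt;/p&gt;

```python
def keep_repository(total, before, after, min_total=500, min_side=50):
    """Inclusion rule from the cleaning step: at least 500 commits overall
    and at least 50 on each side of the AI adoption date."""
    return total >= min_total and before >= min_side and after >= min_side
```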

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjr2rlkoarc475vzt2to8.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjr2rlkoarc475vzt2to8.png" width="800" height="623"&gt;&lt;/a&gt;&lt;/p&gt;
Proportion of different programming language files used for analysis. 



&lt;p&gt;Furthermore, we grouped related programming-language files by extension (e.g., &lt;code&gt;.c&lt;/code&gt; and &lt;code&gt;.h&lt;/code&gt; as &lt;em&gt;C&lt;/em&gt;) to consolidate the per-language analysis. Finally, we limited our analysis to language groups with at least 1000 files in the dataset (Figure 2).&lt;/p&gt;
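&lt;p&gt;A minimal sketch of such a grouping; only the &lt;code&gt;.c&lt;/code&gt;/&lt;code&gt;.h&lt;/code&gt; pairing comes from the text, and the rest of the mapping is illustrative:&lt;/p&gt;

```python
from os.path import splitext

# Illustrative extension-to-group mapping: only the .c/.h example is from
# the text; the remaining assignments are our assumptions.
LANGUAGE_GROUPS = {
    ".c": "C", ".h": "C",
    ".py": "Python",
    ".js": "JavaScript", ".jsx": "JavaScript",
    ".go": "Go",
    ".rs": "Rust",
    ".sh": "Bash",
    ".css": "CSS", ".scss": "CSS",
}

def language_group(filename):
    """Map a file name to its language group, or None if untracked."""
    ext = splitext(filename)[1].lower()
    return LANGUAGE_GROUPS.get(ext)
```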

&lt;h3&gt;
  
  
  Dataset Overview
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6t4dvdt34z7elhxw70sk.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6t4dvdt34z7elhxw70sk.png" width="600" height="400"&gt;&lt;/a&gt;&lt;/p&gt;
Total number of commits before and after creation of &lt;span&gt;AI&lt;/span&gt; artifact.



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fe4v7ziab546tow2gt72b.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fe4v7ziab546tow2gt72b.png" width="600" height="600"&gt;&lt;/a&gt;&lt;/p&gt;
Distribution of artifact age.







&lt;p&gt;Figure 5 shows the commit distribution before and after the creation of the artifact file across all repositories, revealing approximately five times more commits before adoption than after. Furthermore, Figure 5 shows that the average age of the artifact is 150 days (approximately 5 months). Although this is not enough time to understand the long-term effects of &lt;span&gt;AI&lt;/span&gt; tool usage, it provides sufficient data to identify trends after adoption of the artifacts.&lt;/p&gt;

&lt;h3&gt;
  
  
  Analysis Design
&lt;/h3&gt;

&lt;p&gt;Since our commit data is time-bound, we use methods suited to time-series data for analysis. We analyze changes in metrics by comparing trends from pre- and post &lt;span&gt;AI&lt;/span&gt; tool adoption, with the artifact creation date as the adoption cutoff. We aggregated the metrics on a weekly basis and performed two tests to ensure reliable results. To understand whether the effects differ between programming languages, we performed the analysis per language, with all languages combined as a baseline. To avoid skew from long pre-adoption histories, we trimmed each repository’s pre adoption history to 1.5 times the length of the post adoption data.&lt;/p&gt;

&lt;p&gt;For the analysis, we first perform an intervention analysis using a time-series ARIMAX model (ARIMA with an exogenous regressor) (“What Is an ARIMAX Model?” n.d.) to test whether the introduction of &lt;span&gt;AI&lt;/span&gt; tools is associated with a change in productivity and churn. This allows us to estimate whether adoption was followed by an immediate change in the metric’s level and/or a sustained change in its weekly trend. For each series, we automatically select the ARIMA parameters (p, d, q) using the &lt;code&gt;pmdarima.auto_arima&lt;/code&gt; &lt;sup id="fnref3"&gt;3&lt;/sup&gt; Python function.&lt;/p&gt;
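&lt;p&gt;The intervention regressors can be illustrated with a deliberately simplified sketch: a step dummy for the immediate level shift and a ramp for the sustained trend change, fitted here by ordinary least squares with NumPy. The actual analysis fits these as exogenous regressors inside the ARIMAX model via &lt;code&gt;pmdarima&lt;/code&gt;; this stand-in ignores the ARMA error structure:&lt;/p&gt;

```python
import numpy as np

def intervention_effects(y, t0):
    """Estimate an immediate level shift and a post-adoption trend change
    at week index t0 via plain least squares. A simplified stand-in for
    the ARIMAX fit: same exogenous regressors (step and ramp), but no
    ARMA error structure."""
    n = len(y)
    t = np.arange(n)
    step = (t >= t0).astype(float)                   # 1 from adoption onward
    ramp = np.maximum(t - t0 + 1, 0).astype(float)   # grows after adoption
    X = np.column_stack([np.ones(n), t, step, ramp])
    beta, *_ = np.linalg.lstsq(X, np.asarray(y, dtype=float), rcond=None)
    return {"level_shift": beta[2], "trend_change": beta[3]}
```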

&lt;p&gt;Secondly, to ensure reliability, we use a two-sample t-test to compare each repository’s average metric before and after adoption. Additionally, we apply post adoption offsets of 0, 2 and 4 weeks to reduce sensitivity to short spikes immediately after adoption.&lt;/p&gt;

&lt;p&gt;For our statistical analysis we used a significance level of &lt;code&gt;p = 0.1&lt;/code&gt;.&lt;/p&gt;
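&lt;p&gt;A paired comparison of per-repository means (as summarized in Table 3) can be sketched with the standard library. The p-value here uses a normal approximation to the t distribution, which is reasonable for the roughly 100-repository samples but is our simplification:&lt;/p&gt;

```python
from math import sqrt
from statistics import mean, stdev, NormalDist

def paired_t(before, after):
    """Paired comparison of per-repository means before vs. after adoption.
    Returns the mean difference, the t statistic, and a two-sided p-value
    computed with a normal approximation to the t distribution."""
    diffs = [a - b for b, a in zip(before, after)]
    n = len(diffs)
    d_bar = mean(diffs)
    se = stdev(diffs) / sqrt(n)        # standard error of the mean difference
    t = d_bar / se
    p = 2 * (1 - NormalDist().cdf(abs(t)))
    return {"mean_diff": d_bar, "t": t, "p": p}
```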

&lt;h2&gt;
  
  
  Results
&lt;/h2&gt;

&lt;p&gt;Across all analyses, most repositories do not show a significant change in any metric after &lt;span&gt;AI&lt;/span&gt; adoption. We therefore review the subset of repositories that showed a statistically significant change, rather than changes across the entire project set.&lt;/p&gt;

&lt;h3&gt;
  
  
  Direction of post adoption instantaneous changes
&lt;/h3&gt;

&lt;p&gt;The heatmap in Figure 6 shows, per language group, the share of repositories for which the ARIMAX model detects a statistically significant instantaneous post adoption level shift for each metric. Overall, immediate effects are uncommon: for most metrics and languages, fewer than 10% of repositories show significant changes. This suggests that &lt;span&gt;AI&lt;/span&gt; adoption is generally not associated with an immediate change in activity or churn metrics across projects. However, &lt;code&gt;total commits&lt;/code&gt; is an exception, with 12% of projects seeing a change across all languages, especially for Python (20%) and JavaScript (14%). Furthermore, Python shows comparatively high significance for &lt;code&gt;net added&lt;/code&gt; (17%) and &lt;code&gt;gross churn&lt;/code&gt; (14%), whereas most other churn metrics remain low.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fttpdf3ct3ap09gjs5h3u.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fttpdf3ct3ap09gjs5h3u.png" width="800" height="325"&gt;&lt;/a&gt;&lt;/p&gt;
Repositories with statistically significant immediate post adoption effect per programming language.



&lt;p&gt;Figure 7 further breaks down the immediate effects for Python, showing the direction of the significant changes. For &lt;code&gt;total commits&lt;/code&gt;, the significant change is primarily negative, indicating that most affected projects saw a decrease in commits per week. Conversely, &lt;code&gt;net added&lt;/code&gt; is balanced between positive and negative changes, with no consistent direction. This combination of fewer commits but similar additions suggests that post adoption changes in Python may be associated with fewer, larger commits.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2lhqfpa4zjz3tztylktf.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2lhqfpa4zjz3tztylktf.png" width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;
Significant post adoption immediate changes — Python.



&lt;h3&gt;
  
  
  Trend changes after adoption
&lt;/h3&gt;

&lt;p&gt;Contrary to the immediate effects, Figure 8 shows that significant post adoption trend changes are more common.&lt;/p&gt;

&lt;p&gt;The &lt;code&gt;total commits&lt;/code&gt; metric is the most consistently affected across language groups (27%), with JavaScript (30%), Python (26%), Go (25%), and C (23%) showing significant changes. This further suggests that artifact creation is often associated with a gradual change in development over time rather than an instantaneous shift at &lt;span&gt;AI&lt;/span&gt; adoption.&lt;/p&gt;

&lt;p&gt;Furthermore, &lt;code&gt;files touched per commit&lt;/code&gt; shows a consistent shift (16% overall), particularly for JavaScript (17%), Python (16%), Go (15%), and CSS (14%), indicating a notable change in commit breadth across languages.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6ddh3w6h4n26fr9vqrwr.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6ddh3w6h4n26fr9vqrwr.png" width="800" height="327"&gt;&lt;/a&gt;&lt;/p&gt;
Repositories with statistically significant post adoption trend effect per programming language.



&lt;p&gt;For Python repositories (Figure 9), the dominant pattern is a negative trend change across most metrics. In particular, &lt;code&gt;total commits&lt;/code&gt;, &lt;code&gt;net added&lt;/code&gt;, &lt;code&gt;add delete ratio&lt;/code&gt; and &lt;code&gt;files touched per commit&lt;/code&gt; showed a predominantly negative shift. This suggests that in Python repositories where a significant trend effect is identified, adoption is associated more with an overall decrease in activity than with a sustained increase. A notable exception is &lt;code&gt;net negative&lt;/code&gt;, where a positive trend is observed, with deletions increasingly outweighing additions over time. Overall, Python repositories that exhibited a significant trend change show a generally downward shift in their post adoption trajectory.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmfttqzatwjdakdzdksq1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmfttqzatwjdakdzdksq1.png" width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;
Significant post adoption trend changes — Python.



&lt;p&gt;For JavaScript repositories (Figure 10), the pattern is similar to Python, but less negative overall. The strongest trend effect is again observed for &lt;code&gt;total commits&lt;/code&gt;, where negative changes account for the largest share of significant results. Trend changes are also prominent across &lt;code&gt;gross churn&lt;/code&gt;, &lt;code&gt;net added&lt;/code&gt; and &lt;code&gt;files touched per commit&lt;/code&gt;. Meanwhile, &lt;code&gt;net negative&lt;/code&gt; and &lt;code&gt;add delete ratio&lt;/code&gt; show a mixed pattern, with both positive and negative trends present. This indicates that JavaScript repositories evolve heterogeneously after &lt;span&gt;AI&lt;/span&gt; adoption.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fealsytc6qfpme34h0cl9.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fealsytc6qfpme34h0cl9.png" width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;
Significant post adoption trend changes — JavaScript.



&lt;h3&gt;
  
  
  Paired t-test results
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Group&lt;/th&gt;
&lt;th&gt;Metric&lt;/th&gt;
&lt;th&gt;Delay (weeks)&lt;/th&gt;
&lt;th&gt;Repositories&lt;/th&gt;
&lt;th&gt;Mean diff&lt;/th&gt;
&lt;th&gt;Median diff&lt;/th&gt;
&lt;th&gt;Unit&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;All&lt;/td&gt;
&lt;td&gt;gross churn&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;119&lt;/td&gt;
&lt;td&gt;1815.067&lt;/td&gt;
&lt;td&gt;197.375&lt;/td&gt;
&lt;td&gt;weeks&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;All&lt;/td&gt;
&lt;td&gt;net added&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;119&lt;/td&gt;
&lt;td&gt;602.957&lt;/td&gt;
&lt;td&gt;96.259&lt;/td&gt;
&lt;td&gt;weeks&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;All&lt;/td&gt;
&lt;td&gt;total commits&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;119&lt;/td&gt;
&lt;td&gt;2.511&lt;/td&gt;
&lt;td&gt;0.333&lt;/td&gt;
&lt;td&gt;weeks&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Go&lt;/td&gt;
&lt;td&gt;net removed&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;13&lt;/td&gt;
&lt;td&gt;449.347&lt;/td&gt;
&lt;td&gt;7.800&lt;/td&gt;
&lt;td&gt;weeks&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;CSS&lt;/td&gt;
&lt;td&gt;net removed&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;47&lt;/td&gt;
&lt;td&gt;11.798&lt;/td&gt;
&lt;td&gt;0.000&lt;/td&gt;
&lt;td&gt;weeks&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;CSS&lt;/td&gt;
&lt;td&gt;files touched per commit&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;47&lt;/td&gt;
&lt;td&gt;0.252&lt;/td&gt;
&lt;td&gt;0.106&lt;/td&gt;
&lt;td&gt;weeks&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Rust&lt;/td&gt;
&lt;td&gt;is net negative&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;14&lt;/td&gt;
&lt;td&gt;-0.023&lt;/td&gt;
&lt;td&gt;-0.028&lt;/td&gt;
&lt;td&gt;commits&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Bash&lt;/td&gt;
&lt;td&gt;files touched&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;17&lt;/td&gt;
&lt;td&gt;-0.376&lt;/td&gt;
&lt;td&gt;-0.133&lt;/td&gt;
&lt;td&gt;commits&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;All&lt;/td&gt;
&lt;td&gt;is net negative&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;112&lt;/td&gt;
&lt;td&gt;-0.011&lt;/td&gt;
&lt;td&gt;-0.010&lt;/td&gt;
&lt;td&gt;commits&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Rust&lt;/td&gt;
&lt;td&gt;is net negative&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;14&lt;/td&gt;
&lt;td&gt;-0.022&lt;/td&gt;
&lt;td&gt;-0.026&lt;/td&gt;
&lt;td&gt;commits&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Bash&lt;/td&gt;
&lt;td&gt;files touched&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;17&lt;/td&gt;
&lt;td&gt;-0.360&lt;/td&gt;
&lt;td&gt;-0.176&lt;/td&gt;
&lt;td&gt;commits&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;All&lt;/td&gt;
&lt;td&gt;is net negative&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;112&lt;/td&gt;
&lt;td&gt;-0.011&lt;/td&gt;
&lt;td&gt;-0.008&lt;/td&gt;
&lt;td&gt;commits&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Rust&lt;/td&gt;
&lt;td&gt;is net negative&lt;/td&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;14&lt;/td&gt;
&lt;td&gt;-0.019&lt;/td&gt;
&lt;td&gt;-0.027&lt;/td&gt;
&lt;td&gt;commits&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;All&lt;/td&gt;
&lt;td&gt;is net negative&lt;/td&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;112&lt;/td&gt;
&lt;td&gt;-0.012&lt;/td&gt;
&lt;td&gt;-0.006&lt;/td&gt;
&lt;td&gt;commits&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Summary of paired t-test pre/post differences.&lt;/p&gt;

&lt;p&gt;The paired comparisons shown in Table 3 suggest that, after the &lt;span&gt;AI&lt;/span&gt; artifact creation event, some repositories tend to experience higher development activity and code churn, although this pattern is not uniform across all metrics and language groups. Across all languages, we see positive mean and median differences for &lt;code&gt;gross churn&lt;/code&gt;, &lt;code&gt;net added&lt;/code&gt; and &lt;code&gt;total commits&lt;/code&gt; in the weekly analysis with no delay, indicating that, on average, repositories experienced more lines changed and commits per week after &lt;span&gt;AI&lt;/span&gt; adoption.&lt;/p&gt;

&lt;p&gt;As for language-specific weekly results, Go repositories show an increase in &lt;code&gt;net removed&lt;/code&gt;, while CSS repositories show increases in both &lt;code&gt;files touched per commit&lt;/code&gt; and &lt;code&gt;net removed&lt;/code&gt;. This may indicate that for these languages, post adoption work involved more restructuring, editing or clean-up activity rather than the production of new code. In the CSS case especially, the increase in &lt;code&gt;files touched per commit&lt;/code&gt; suggests that changes became slightly more widespread across files, reflecting broader modifications per commit.&lt;/p&gt;

&lt;p&gt;Conversely, the results at commit level point towards a reduction in rework, as &lt;code&gt;is net negative&lt;/code&gt; decreases for the overall sample and for Rust repositories across all delays. Similarly, for Bash, the negative differences in files touched at delays 0 and 2 imply that commits affected fewer files on average after the event, suggesting more localized changes.&lt;/p&gt;

&lt;h2&gt;
  
  
  Threats to Validity
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Validity of Artifact Files
&lt;/h3&gt;

&lt;p&gt;Our methodology assumes that the presence of &lt;code&gt;CLAUDE.md&lt;/code&gt; or &lt;code&gt;AGENTS.md&lt;/code&gt; files indicates meaningful &lt;span&gt;AI&lt;/span&gt; usage from their creation date. This can be inaccurate: some repositories may have added these files for experimentation or documentation purposes rather than development, whereas others may use &lt;span&gt;AI&lt;/span&gt; tools without adding an artifact file at all. Furthermore, the file may be introduced after &lt;span&gt;AI&lt;/span&gt; use has already begun, making the comparisons inconsistent because the inferred adoption date would be wrong.&lt;/p&gt;

&lt;h3&gt;
  
  
  Selection Bias
&lt;/h3&gt;

&lt;p&gt;The repositories analyzed were collected via Google Dorks search queries for &lt;code&gt;CLAUDE.md&lt;/code&gt; and &lt;code&gt;AGENTS.md&lt;/code&gt; files. Due to rate limits, the results were limited to the first few hundred, restricting the sample to what Google ranked as &lt;em&gt;most relevant&lt;/em&gt;. Additionally, the dataset is dominated by Python and JavaScript repositories (approximately two-thirds), giving those languages disproportionate weight in the results.&lt;/p&gt;

&lt;h3&gt;
  
  
  Short post adoption period
&lt;/h3&gt;

&lt;p&gt;For most repositories, the post &lt;span&gt;AI&lt;/span&gt; adoption period is around 5 months on average, compared to a much longer pre-adoption history. Although we mitigated this by trimming pre-adoption data, the limited post adoption window may not give effects enough time to stabilize, reducing the robustness of the estimated post adoption trends.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;This study examined whether the adoption of &lt;span&gt;AI&lt;/span&gt; development tooling is associated with measurable changes in open-source developer productivity. We used artifact files to identify when &lt;span&gt;AI&lt;/span&gt; was adopted, and compared activity before and after this point using an ARIMAX model to analyze immediate and trend changes.&lt;/p&gt;

&lt;p&gt;Across language groups and metrics, we find that most repositories do not show a statistically significant change in output or churn metrics after &lt;span&gt;AI&lt;/span&gt; adoption. Immediate effects are rare: fewer than 10% of projects show a significant effect for most metrics, with total commits being the main exception. Trend changes are more common than immediate shifts, most consistently for total commits. In JavaScript and Python the effects are more consistent, and generally negative in direction. For Python repositories, commit frequency trends downwards, suggesting a possible shift toward fewer, larger commits in a subset of projects.&lt;/p&gt;

&lt;p&gt;Overall, our results do not support the general claim that usage of &lt;span&gt;AI&lt;/span&gt; development tools results in a significant increase in output, nor a change in output quality. Rather, when effects are detectable, they generally show a tendency towards fewer, larger commits.&lt;/p&gt;

&lt;p&gt;Due to the empirical nature of our findings, the practical implications are not immediately evident. However, we identified two primary strategies for optimizing AI integration. First, given the sparse impact across languages, adoption could be selective, prioritizing languages with proven performance gains when using general purpose tools. Second, as our data shows a shift toward higher-density commits, developers face an increased cognitive burden during review. We recommend enforcing small-batch commit policies to mitigate this complexity, so AI-driven velocity does not compromise architectural clarity or code integrity.&lt;/p&gt;

&lt;p&gt;Future work could aim to use a more targeted methodology by selecting repositories with longer post adoption periods, validating adoption time, and focusing on languages where these tools are more prevalent. This would allow for more robust analysis of the effects of longer term &lt;span&gt;AI&lt;/span&gt; usage.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Replication package available on GitHub:&lt;/strong&gt; &lt;a href="https://github.com/AndrewRutherfoord/ai-dev-productivity-data-replication-package" rel="noopener noreferrer"&gt;AndrewRutherfoord/ai-dev-productivity-data-replication-package&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  References
&lt;/h2&gt;

&lt;p&gt;Weisz, Justin D., et al. 2025. “Examining the Use and Impact of an AI Code Assistant on Developer Productivity and Experience in the Enterprise.” &lt;em&gt;CHI EA ’25: Extended Abstracts of the CHI Conference on Human Factors in Computing Systems&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;Faragó, Csaba, Péter Hegedűs, and Rudolf Ferenc. 2015. “Cumulative Code Churn: Impact on Maintainability.” &lt;em&gt;2015 IEEE 15th International Working Conference on Source Code Analysis and Manipulation (SCAM)&lt;/em&gt;, September, 141–50.&lt;/p&gt;

&lt;p&gt;Gomes, Kevin Cerqueira, Elivelton Ramos Cerqueira, Gabriel Moraes, et al. 2026. “Investigating the Relationship Between Churning and Code Smells.” In &lt;em&gt;Software Engineering and Advanced Applications&lt;/em&gt;, edited by Davide Taibi and Darja Smite. Springer Nature Switzerland.&lt;/p&gt;

&lt;p&gt;Rutherfoord, Andrew. n.d. &lt;em&gt;&lt;a href=""&gt;NeoRepro: A Tool for Creating Replication Packages for Mining Software Repository Research Using a Graph Database&lt;/a&gt;&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;“What Is an ARIMAX Model?” GeeksforGeeks. n.d. Accessed March 28, 2026.&lt;/p&gt;






&lt;ol&gt;

&lt;li id="fn1"&gt;
&lt;p&gt;NeoRepro MSR Tool: &lt;a href="https://github.com/AndrewRutherfoord/NeoRepro-MSR-Tool" rel="noopener noreferrer"&gt;https://github.com/AndrewRutherfoord/NeoRepro-MSR-Tool&lt;/a&gt; ↩&lt;/p&gt;
&lt;/li&gt;

&lt;li id="fn2"&gt;
&lt;p&gt;Replication Package with dataset: &lt;a href="https://github.com/AndrewRutherfoord/ai-dev-productivity-data-replication-package" rel="noopener noreferrer"&gt;https://github.com/AndrewRutherfoord/ai-dev-productivity-data-replication-package&lt;/a&gt; ↩&lt;/p&gt;
&lt;/li&gt;

&lt;li id="fn3"&gt;
&lt;p&gt;Pmdarima package: &lt;a href="https://alkaline-ml.com/pmdarima/" rel="noopener noreferrer"&gt;https://alkaline-ml.com/pmdarima/&lt;/a&gt; ↩&lt;/p&gt;
&lt;/li&gt;

&lt;/ol&gt;

</description>
      <category>ai</category>
      <category>productivity</category>
      <category>analytics</category>
    </item>
  </channel>
</rss>
