Diagnosing CI/CD Failures and Recommending Repair Diff Footprints with Lightweight LLM-Assisted Models: Full Experimental Evaluation on BugSwarm CI Fail–Pass Pairs
Keywords:
CI/CD, DevOps, build failure diagnosis, build repair, Top-k recommendation, diff footprint, BugSwarm, log mining, automated program repair, large language models

Abstract
CI/CD pipelines execute builds and tests on every change, generating high-volume failure signals and log artifacts that demand rapid diagnosis and repair. This paper studies two operational tasks that arise in modern DevOps automation: (i) failure diagnosis and (ii) repair recommendation in a Top-k setting. We conducted a full experimental evaluation on a public BugSwarm-derived artifact list containing 325 Java fail–pass CI pairs (SHA-256: 267fdfc1ee603af3613db96ec79230c7c2e856fa5b4594ffda6ec51f38809df6). We defined three failure types from CI outcome metadata (TEST_FAIL, BUILD_OR_SETUP_FAIL, and NONTEST_FAIL) and defined repair targets as diff-footprint patterns A{a}C{c}D{d} derived from the numbers of added, changed, and deleted files, mapped to 24 classes (the 23 frequent patterns with support ≥ 3, plus OTHER). Using only information available at failure time (commit message, repository metadata, build system, test framework, and test counters), we compared six baselines against a proposed probability-averaging ensemble of text-only and numeric-only multinomial logistic regression. On failure diagnosis (5-fold stratified CV), the proposed ensemble achieved macro-F1 = 0.881 and accuracy = 0.926. On repair recommendation (3-fold stratified CV), it achieved Hit@1 = 0.511, Hit@3 = 0.754, Hit@5 = 0.852, Hit@10 = 0.920, and MRR = 0.657. The manuscript specifies all hyperparameters and reports detailed tables and figures so that the empirical findings are reproducible.
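The two core ideas of the abstract (the A{a}C{c}D{d} diff-footprint labeling and the probability-averaging ensemble of a text-only and a numeric-only multinomial logistic regression) can be sketched in a few lines of scikit-learn. This is a minimal illustration, not the paper's implementation: the toy commit messages, the (tests_run, tests_failed) counters, and the equal 0.5/0.5 averaging weights are assumptions for the sketch.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

def footprint(added: int, changed: int, deleted: int) -> str:
    """Map per-pair file counts to an A{a}C{c}D{d} diff-footprint pattern."""
    return f"A{added}C{changed}D{deleted}"

# Hypothetical toy failure records (values illustrative, not from the dataset).
texts = [
    "fix failing unit test in parser",
    "update maven dependency version",
    "repair flaky integration test",
    "bump gradle wrapper",
    "correct assertion in test case",
    "change build plugin configuration",
]
# Numeric features, e.g. (tests_run, tests_failed) counters at failure time.
nums = np.array([[120, 3], [0, 0], [45, 1], [0, 0], [88, 2], [0, 0]], dtype=float)
# Labels: 0 = TEST_FAIL, 1 = BUILD_OR_SETUP_FAIL.
y = np.array([0, 1, 0, 1, 0, 1])

# Text-only and numeric-only multinomial logistic regression models.
text_clf = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
num_clf = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
text_clf.fit(texts, y)
num_clf.fit(nums, y)

# Probability-averaging ensemble: mean of the two predicted distributions.
proba = 0.5 * (text_clf.predict_proba(texts) + num_clf.predict_proba(nums))
preds = proba.argmax(axis=1)
```

Ranking the classes of `proba` per row (rather than taking only the argmax) yields the Top-k lists from which Hit@k and MRR are computed.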

