Given, When, Then, Again: Mining Subscenario Refactoring Candidates in Behaviour-Driven Test Suites with ML Classifiers and LLM-Judge Baselines

arXiv:2605.14568v2. Abstract: Context. Behaviour-Driven Development (BDD) test suites accumulate duplicated step subsequences. Three published refactoring patterns are available (within-file Background, within-repo reusable-scenario invocation, cross-organisational shared higher-level step), but no prior work automates which recurring subsequences are worth extracting or which mechanism applies. Objective. Rank recurring step subsequences ("slices") by refactoring suitability (extraction-worthy), pre-map each to one of the three patterns, and quantify prevalence acr
The paper applies machine learning (ML classifiers, with LLM-judge baselines for comparison) to a known problem in software development, deciding which duplicated BDD step subsequences to refactor, which makes automating that decision newly feasible.
More efficient software testing, particularly in Behaviour-Driven Development (BDD), can lead to faster and more reliable software releases.
The proposed methodology automates the identification and classification of refactoring candidates in BDD test suites, offering a systematic way to reduce duplication and improve maintainability.
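The identification step can be pictured with a much simpler stand-in than the paper's classifiers. The sketch below is an illustration under stated assumptions, not the authors' method: it counts contiguous step subsequences that recur across scenarios and picks one of the three patterns purely from where the occurrences live. The `Scenario` type, the function names, the pattern labels, and the scope rule are all hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple, Set, Tuple


class Scenario(NamedTuple):
    """Hypothetical record for one BDD scenario."""
    repo: str
    file: str
    steps: Tuple[str, ...]


def recurring_slices(
    scenarios: List[Scenario], min_len: int = 2, min_count: int = 2
) -> Dict[Tuple[str, ...], Set[int]]:
    """Map each contiguous step subsequence to the scenarios that contain it."""
    seen: Dict[Tuple[str, ...], Set[int]] = defaultdict(set)
    for idx, scenario in enumerate(scenarios):
        n = len(scenario.steps)
        for start in range(n):
            for end in range(start + min_len, n + 1):
                seen[scenario.steps[start:end]].add(idx)
    # Keep only slices that occur in at least `min_count` distinct scenarios.
    return {s: ids for s, ids in seen.items() if len(ids) >= min_count}


def suggest_pattern(scenarios: List[Scenario], ids: Set[int]) -> str:
    """Assumed rule: choose a pattern from the scope of the occurrences."""
    files = {(scenarios[i].repo, scenarios[i].file) for i in ids}
    repos = {scenarios[i].repo for i in ids}
    if len(files) == 1:
        return "background"          # all occurrences in one feature file
    if len(repos) == 1:
        return "reusable-scenario"   # several files, one repository
    return "shared-step"             # occurrences span repositories
```

A real tool would also need to parse feature files, normalise step parameters, and score whether a slice is worth extracting at all, which is the part the paper assigns to its classifiers.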
- Software Development Teams
- Organizations using BDD
- AI/ML-driven developer tooling
- Manual code refactoring consultants
- Inefficient software development practices
BDD test suites become more maintainable and efficient through automated refactoring suggestions.
Reduced technical debt and faster feature delivery cycles in software projects employing these tools.
Increased adoption of AI/ML in developer workflows, potentially leading to fully autonomous coding agents handling mundane tasks.
Read at arXiv cs.CL