SIGNAL · AI · Jun 6, 2026, 4:00 AM · Signal 75 · Medium term

A Study of LLMs' Preferences for Libraries and Programming Languages

arXiv:2503.17181v4 · Abstract: Despite the rapid progress of large language models (LLMs) in code generation, existing evaluations focus on functional correctness or syntactic validity, overlooking how LLMs make critical design choices such as which library or programming language to use. To fill this gap, we perform the first empirical study of LLMs' preferences for libraries and programming languages when generating code, covering eight diverse LLMs. We observe a strong tendency to overuse widely adopted libraries such as NumPy; in up to 45% of cases, this usage is

Why this matters

Why now

The study is timely: as LLMs take on more of the work of writing code, the design choices they make by default, such as which library or language to reach for, carry more weight in real software projects. The paper is posted on arXiv, a preprint server, and is already in its fourth revision (v4), which points to active, ongoing research in this area.

Why it’s important

LLM preferences for particular libraries and languages are a form of built-in bias, and that bias will shape future software development, potentially reinforcing existing tech monopolies or creating new lock-in effects. The research also bears on how AI coding tools are evaluated and deployed: a model that passes functional tests may still steer teams toward a narrower set of dependencies than the task requires.

What changes

The study shifts the evaluation of LLM code generation from functional correctness alone to the design choices and biases embedded in the models. It shows that LLMs are not neutral code generators: they exhibit discernible preferences that can influence software architecture and tooling adoption.

Winners

· LLM developers
· Widely adopted library maintainers (e.g., NumPy)
· Companies using LLMs for code development
· AI ethics researchers

Losers

· Developers of niche or less popular libraries
· Companies relying on less mainstream programming languages
· Organizations seeking diversified tech stacks

Second-order effects

Direct

The study directly reveals that LLMs have strong, quantifiable preferences for certain libraries and programming languages, often overusing widely adopted ones.
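To make "quantifiable preferences" concrete, here is a minimal sketch of one way such a preference could be measured: parse each generated Python sample and compute the share of samples that import each library. This is an illustration only, not the paper's methodology; the function names and the sample snippets are hypothetical.

```python
import ast
from collections import Counter


def imported_libraries(source):
    """Return the set of top-level module names imported by one code sample."""
    names = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            # Skip relative imports (level > 0); they are not third-party choices.
            names.add(node.module.split(".")[0])
    return names


def library_shares(samples):
    """Fraction of samples that import each library (each sample counts once)."""
    if not samples:
        return {}
    counts = Counter()
    for source in samples:
        counts.update(imported_libraries(source))
    return {lib: n / len(samples) for lib, n in counts.items()}


# Hypothetical stand-ins for LLM-generated solutions to a set of tasks.
samples = [
    "import numpy as np\nprint(np.arange(3).sum())",
    "import numpy as np\nfrom collections import Counter\n",
    "import math\nprint(math.sqrt(2))",
    "from os.path import join\n",
]

shares = library_shares(samples)
for lib, share in sorted(shares.items(), key=lambda item: -item[1]):
    print(f"{lib}: {share:.0%}")
```

Comparing these shares across models, or against a baseline such as human-written solutions to the same tasks, would be one way to turn an apparent preference into a number.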

Second

This preference could lead to a further entrenchment of dominant software stacks and potentially stifle innovation in alternative or emerging technologies.

Third

Over time, this effect might create a feedback loop: if LLM-generated code that favors popular libraries ends up in the data future models learn from, those models may inherit and amplify the same biases, making it harder for new libraries or languages to gain traction in an AI-assisted development landscape.

Editorial confidence: 95 / 100 · Structural impact: 60 / 100

Original report

This signal links to a primary source. Continuum Brief monitors and indexes it as part of the live intelligence stream — we do not republish source content.

Read at arXiv cs.AI

#cs.SE #cs.AI
