Probing Stylistic Appropriation using Large Language Models: An Evaluation Framework for Copyright Infringement under EU Law

arXiv:2606.31250v1. Abstract: Large language models (LLMs) trained on web-scale corpora generate output that may infringe copyright, yet existing technical safeguards focus narrowly on verbatim memorisation. EU copyright doctrine applies a broader standard: substantial similarity, which extends to stylistic choices, narrative structure, and creative elaboration. This mismatch between what current methods detect and what the law protects leaves a significant compliance gap. We introduce PSALM, an LLM-as-a-judge framework that operationalises EU copyright doctrine through ten e
The proliferation of advanced LLMs and growing legal scrutiny of their output, particularly in Europe, call for practical frameworks for copyright compliance.
This development addresses a critical compliance gap for AI developers and users operating under EU law, impacting how LLMs are trained, deployed, and ultimately monetized.
The proposed PSALM framework offers a method to operationalize complex EU copyright doctrine, moving beyond verbatim memorization detection to evaluate stylistic and structural similarities for potential infringement.
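To make the gap concrete, the following is a minimal sketch of the kind of verbatim-overlap check that the abstract says existing safeguards rely on. It is not PSALM's method and is not taken from the paper: the function names `ngrams` and `verbatim_overlap`, the trigram window, and the example sentences are all assumptions chosen for illustration.

```python
def ngrams(text, n=3):
    """Return the set of lowercase word n-grams in text."""
    tokens = text.lower().split()
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def verbatim_overlap(candidate, reference, n=3):
    """Fraction of the candidate's n-grams that also occur in the reference.

    Hypothetical helper: a crude stand-in for memorisation detection,
    not the metric used by any particular system.
    """
    cand = ngrams(candidate, n)
    if not cand:
        return 0.0
    return len(cand & ngrams(reference, n)) / len(cand)


# Invented example sentences (assumptions, not from the paper).
reference = "the old lighthouse keeper counted the ships each night before the storm"
copied = "the old lighthouse keeper counted the ships each night before the storm"
paraphrase = "every evening an aged keeper of the light tallied passing vessels ahead of the gale"

print(verbatim_overlap(copied, reference))      # exact copy is flagged
print(verbatim_overlap(paraphrase, reference))  # close paraphrase is not
```

In this sketch the exact copy scores 1.0 while the paraphrase, which keeps the same scene and structure in different words, scores 0.0. That is the kind of similarity a verbatim check cannot see and that a judge-based evaluation such as PSALM is meant to assess.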
- Copyright holders (artists, authors)
- Legal tech firms
- AI ethics and compliance consultants
- European legal framework developers
- LLM developers without robust compliance mechanisms
- Users generating content without considering copyright
- Content aggregators relying on public-domain material with stylistic similarities
LLM providers will face increased pressure to integrate 'stylistic appropriation' detection into their models and terms of service.
This could lead to a shift in how LLMs are trained, prioritizing copyright-compliant data and methodologies, potentially impacting model capabilities and development costs.
The success of PSALM could inspire similar legal frameworks globally, creating a fragmented regulatory environment for AI content generation and international content distribution.
Read at arXiv cs.CL