From Hazard Functions to Language Space: Cox-Supervised Distillation of Survival Risk into a Large Language Model

arXiv:2606.08945v1 Announce Type: new Abstract: We investigate whether information about time-to-event risk estimated by a Cox proportional hazards model can be transferred into a generative large language model. We propose a text-based survival modelling pipeline in which structured clinical covariates are converted into text prompts and a Qwen-based large language model is fine-tuned to generate patient-specific survival risk using Cox model predictions as a training target. Across GBSG2, ACTG320, and WHAS500, the model achieves competitive held-out discrimination and calibration despite bei
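The pipeline the abstract describes (structured covariates rendered as text, with Cox model predictions as the training target) can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration: the coefficient values, the covariate names, the prompt template, and the choice of the Cox linear predictor (log relative hazard) as the target are hypothetical and are not taken from the paper.

```python
# Hypothetical Cox coefficients (log hazard ratios). Illustrative values only;
# they are not estimates from the paper or from any real dataset.
COX_COEFS = {"age": 0.02, "tumor_size_mm": 0.01, "positive_nodes": 0.05}


def cox_linear_predictor(covariates, coefs=COX_COEFS):
    """Return beta . x, the log relative hazard under a Cox model."""
    return sum(coefs[name] * covariates[name] for name in coefs)


def covariates_to_prompt(covariates):
    """Serialize structured covariates into a text prompt (assumed template)."""
    fields = "; ".join(
        f"{name}={value}" for name, value in sorted(covariates.items())
    )
    return f"Patient record: {fields}. Predicted risk score:"


def build_training_example(covariates):
    """Pair a text prompt with the teacher (Cox) score rendered as text."""
    log_risk = cox_linear_predictor(covariates)
    return {
        "prompt": covariates_to_prompt(covariates),
        "target": f"{log_risk:.3f}",
    }


# Hypothetical patient record used only to show the shape of one example.
patient = {"age": 50, "tumor_size_mm": 20, "positive_nodes": 4}
example = build_training_example(patient)
print(example["prompt"])
print(example["target"])
```

Each such prompt and target pair would form one supervised fine-tuning example, so the language model learns to emit the teacher's score as text. How the paper actually encodes the target (a linear predictor, a survival probability, or another quantity) is not stated in the excerpt above.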
This work is timely: large language models are advancing rapidly and are increasingly being integrated into specialized domains such as healthcare, where knowledge distillation and efficient prediction matter.
A strategic reader should care because the paper demonstrates a concrete pathway for LLMs to learn and reproduce time-to-event risk estimates from structured clinical covariates rendered as text, extending their use beyond text generation into predictive analytics for high-stakes applications.
This challenges the perception of LLMs as purely linguistic tools: it suggests they can act as distilled representations of established statistical models, potentially making complex medical predictions more interpretable and accessible.
- Healthcare AI developers
- Clinical researchers
- Patients (potentially faster, personalized risk assessment)
- LLM providers (Qwen and similar)
- Traditional statistical model-only platforms
- Developers solely focused on numerical-only deep learning for survival analysis
The first-order effect is the development of more expressive and accessible survival prediction tools built on natural language interfaces.
A plausible second-order consequence is the acceleration of personalized medicine, where patient risk profiles are continuously updated and communicated through intelligent systems.
A speculative but reasoned third-order consequence is the creation of 'digital twin' systems for patient care, where LLMs maintain a comprehensive, dynamic health profile, predicting outcomes and suggesting interventions proactively.
Read at arXiv cs.LG