
arXiv:2605.09081v3. Abstract: We introduce FactoryNet, the first universal pretraining corpus for industrial time-series data. It comprises 51M datapoints across 23k end-to-end task executions (13.3k real, 9.8k synthetic) on six embodiments, unified by a shared schema that enables robust zero-shot cross-embodiment transfer and highly parameter-efficient anomaly detection. Underlying the whole pipeline is a novel schema, Setpoint, Effort, Feedback, Context (S-E-F-C), which maps any actuated system into a common representational frame. The corpus spans 27 annotated anomaly types
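To make the idea of a common representational frame concrete, here is a minimal Python sketch of what an S-E-F-C record might look like. The abstract names only the four groups (Setpoint, Effort, Feedback, Context); the field types, the interpretation of each group, the `SEFCRecord` class, the `to_sefc` and `tracking_error` helpers, and the raw column names are all assumptions made for illustration and do not describe FactoryNet's actual format.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class SEFCRecord:
    """One timestep in a hypothetical S-E-F-C frame (field meanings are assumed)."""
    setpoint: List[float]  # assumed: commanded targets, e.g. desired positions
    effort: List[float]    # assumed: actuator effort, e.g. motor current or torque
    feedback: List[float]  # assumed: measured response, e.g. encoder positions
    context: Dict[str, str] = field(default_factory=dict)  # assumed: metadata


def to_sefc(raw: Dict[str, float],
            mapping: Dict[str, List[str]],
            context: Dict[str, str]) -> SEFCRecord:
    """Map a raw, machine-specific sample into the common frame.

    `mapping` lists, for each S-E-F group, which raw columns belong to it.
    """
    return SEFCRecord(
        setpoint=[raw[name] for name in mapping["setpoint"]],
        effort=[raw[name] for name in mapping["effort"]],
        feedback=[raw[name] for name in mapping["feedback"]],
        context=dict(context),
    )


def tracking_error(record: SEFCRecord) -> List[float]:
    """Per-channel difference between measured feedback and commanded setpoint."""
    if len(record.setpoint) != len(record.feedback):
        raise ValueError("setpoint and feedback must have the same length")
    return [f - s for s, f in zip(record.setpoint, record.feedback)]


# Hypothetical raw sample from a single-axis machine.
raw_sample = {"cmd_pos": 1.0, "motor_current": 0.4, "enc_pos": 0.75}
column_mapping = {
    "setpoint": ["cmd_pos"],
    "effort": ["motor_current"],
    "feedback": ["enc_pos"],
}
record = to_sefc(raw_sample, column_mapping, {"embodiment": "hypothetical_axis"})
errors = tracking_error(record)
```

Under these assumptions, only the per-machine column mapping changes between systems, while anything downstream (such as the tracking-error feature here) is written once against the shared record type.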
The release of FactoryNet marks a significant step toward foundation models for industrial automation, building on recent advances in large-scale AI datasets.
This dataset provides a universal pretraining corpus for industrial time-series data, enabling more robust and efficient AI deployment in manufacturing and operational technology.
Training a single model across diverse industrial systems with zero-shot transfer could accelerate industrial AI adoption and make anomaly detection more efficient.
- Industrial automation sector
- AI/ML research labs
- Manufacturing companies
- Robotics companies
- Legacy industrial analytics firms
- Companies dependent on niche, proprietary industrial datasets
FactoryNet will standardize industrial data representation and accelerate the development of general-purpose industrial AI.
This standardization will lead to a more interconnected and autonomous industrial ecosystem, reducing operational costs and increasing efficiency.
The widespread adoption of foundation models in industry could enable truly adaptive and self-optimizing factories, impacting global supply chains and economic productivity.
Read at arXiv cs.LG