arXiv:2508.10409v3 Announce Type: replace-cross Abstract: This paper constructs a textual dataset for training large language models (LLMs) to learn analog circuit knowledge and customizes LLM training techniques. For dataset construction, high-quality textbooks are collected and decomposed into fine-grained learning nodes, which are then used to construct structured question-thinking-solution-answer (QTSA) quadruples using a multi-agent framework to capture both final answers and thought processes. The resulting dataset consists of 7.26M tokens of unlabeled data for continual pre-training (CP
Source: arXiv cs.AI — read the full report at the original publisher.
