Engineering large language model (LLM)-native software remains a challenging and immature field, according to a new paper on arXiv. Current practice is largely exploratory, relying on experimentation and heuristic techniques such as prompting and context engineering. These approaches are low-level and lack the principled structure needed to support design-level reasoning or analysis.
The authors (Víctor A, Bonomo-Braberman, and Flavia) observe that traditional software engineering leverages modularity and abstraction to communicate and analyze system behavior. To bring similar rigor to LLM-native development, they propose methods for documenting generative flows and for stating properties of LLM-based software designs.
Their initial approach is based on graphical probabilistic models, tailored to capture phenomena characteristic of LLM-native systems. This framework, termed Generation Networks, aims to provide a foundation for principled reasoning about generative interactions and system-level properties in LLM-centric software architectures.
The Challenge of Current LLM Development
The paper characterizes current practice as largely exploratory: developers rely on low-level techniques such as prompting and context engineering, which lack the structure needed for systematic analysis. As a result, LLM-native software systems are difficult to analyze, debug, and verify. The authors state that any more principled method must account for the stochastic, prompt-dependent behavior of large language models while remaining expressive enough to capture emergent phenomena.
Generation Networks: A Proposed Framework
The proposed Generation Networks framework uses graphical probabilistic models to document generative flows. The approach is designed to capture phenomena characteristic of LLM-native systems, including the variability and dependencies introduced by prompts and model stochasticity. By modeling interactions as probabilistic graphs, the framework enables developers to state and analyze properties of LLM-based software designs.
| Current Practice | Generation Networks Framework |
|---|---|
| Exploratory, heuristic | Principled, model-based |
| Low-level prompting | Graphical probabilistic models |
| Lacks structure for analysis | Enables design-level reasoning |
| Difficult to analyze | Provides foundation for analysis |
The paper presents Generation Networks as an initial approach rather than a finished method. Its stated aim is to give LLM-native development a rigor similar to what traditional software engineering gains from modularity and abstraction.
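To make the idea of modeling a generative flow as a probabilistic graph more concrete, here is a minimal Python sketch. It is an illustration only and should not be read as the paper's notation or implementation: the names `Node`, `GenerationNetwork`, `build_demo_network`, and `final_output_is_valid` are hypothetical, the two "LLM" steps are random stubs rather than real model calls, and the probabilities 0.8 and 0.5 are invented for the example. The sketch represents each generation step as a node whose output is sampled conditionally on its parents, then estimates a system-level property by repeated sampling.

```python
import random
from dataclasses import dataclass
from typing import Callable, Dict, List

Values = Dict[str, str]


@dataclass
class Node:
    """One step of a generative flow (hypothetical structure, not from the paper)."""
    name: str
    parents: List[str]
    # Draws this node's output given its parents' outputs.
    sample: Callable[[Values, random.Random], str]


class GenerationNetwork:
    """A directed acyclic graph of stochastic steps, listed in topological order."""

    def __init__(self, nodes: List[Node]) -> None:
        self.nodes = nodes

    def run(self, rng: random.Random) -> Values:
        """Sample one execution of the whole flow."""
        values: Values = {}
        for node in self.nodes:
            parent_values = {p: values[p] for p in node.parents}
            values[node.name] = node.sample(parent_values, rng)
        return values

    def estimate(self, prop: Callable[[Values], bool],
                 trials: int = 20_000, seed: int = 0) -> float:
        """Monte Carlo estimate of the probability that `prop` holds."""
        rng = random.Random(seed)
        hits = sum(1 for _ in range(trials) if prop(self.run(rng)))
        return hits / trials


def build_demo_network() -> GenerationNetwork:
    # Stub for an LLM call: assumed to yield well-formed output 80% of the time.
    def draft(_parents: Values, rng: random.Random) -> str:
        return "valid" if rng.random() < 0.8 else "malformed"

    # Stub for a repair step: passes valid drafts through and is assumed
    # to fix a malformed draft 50% of the time.
    def repair(parents: Values, rng: random.Random) -> str:
        if parents["draft"] == "valid":
            return "valid"
        return "valid" if rng.random() < 0.5 else "malformed"

    return GenerationNetwork([
        Node("draft", [], draft),
        Node("repair", ["draft"], repair),
    ])


def final_output_is_valid(values: Values) -> bool:
    """Example design-level property: the flow ends with well-formed output."""
    return values["repair"] == "valid"


if __name__ == "__main__":
    network = build_demo_network()
    print(network.estimate(final_output_is_valid))
```

Under these assumed probabilities the property can also be worked out by hand as 0.8 + 0.2 × 0.5 = 0.9, so the sampled estimate should land close to that value. In a real system the per-step probabilities would not be known in advance and would have to be measured, which is part of what makes this kind of analysis hard.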
Implications for Enterprise Software Development
For CTOs and technology leaders, the Generation Networks framework offers a potential path to move beyond trial-and-error development of LLM-native systems. By adopting graphical probabilistic modeling, enterprises could apply structured analysis to generative flows, improving reliability and auditability of AI-powered applications. The framework could support design-level reasoning about system properties, helping to identify issues before deployment.
The paper is available on arXiv under a Creative Commons license. The authors have not yet released code or data associated with the article. Future work may involve deeper exploration of the modeling language and validation against real-world LLM-native applications. As LLM-native software becomes more prevalent in enterprise contexts, frameworks that bring rigor to development will be critical for building trustworthy systems.