Generative AI for Future Architecture Scenario Planning
Abstract
Scenario planning is a critical practice in enterprise architecture (EA) for envisioning alternative future states of an organization. This paper expands on prior work by exploring how generative artificial intelligence (AI), specifically large language models (LLMs), can accelerate and enhance the modeling of enterprise target architectures under multiple strategic scenarios. We conduct experiments with several state-of-the-art LLMs (OpenAI GPT-4, Anthropic Claude 2, and Meta LLaMA) on three distinct strategic scenarios for a hypothetical mid-size enterprise: digital channel diversification, internal operational optimization, and sustainability transformation. For each scenario, the LLMs are prompted to generate a future-state architecture, which is then compared to a human expert’s architecture design. Quantitative results show that the AI-generated architectures can cover 80–100% of the strategic objectives identified in a scenario while reducing initial design time from weeks to seconds. For instance, GPT-4 achieved near 100% coverage of scenario objectives with only minor omissions, whereas an open-source LLaMA-based model covered about 80% on average. The AI proposals included slightly more components (a 10–15% higher count) and integrations than the expert designs, indicating a tendency to over-engineer. We introduce a novel evaluation framework, measuring strategic coverage, architecture complexity, expert quality scores, and alignment with TOGAF principles, to systematically assess these architectures. Our experiments demonstrate that GPT-4 and Claude 2 can rapidly produce plausible, strategically aligned future-state architectures that approach the completeness of expert models (achieving an average expert review score of 7–8 out of 10) while requiring only modest human corrections (∼10–15% of elements changed). Meanwhile, the open-source LLaMA model generated useful drafts but with lower fidelity (expert score ∼6/10) and more extraneous elements, requiring greater expert refinement (∼25% of elements changed). Contributions of this work include: (1) the first comparative analysis of multiple LLMs for EA scenario planning, (2) an extended methodology for prompt-driven architecture generation and quantitative evaluation, (3) insights into the strengths and limitations of AI-generated architectures in terms of coverage, complexity, and quality, and (4) recommendations for integrating generative AI into human-centric EA practices. These findings highlight that, when properly guided, generative AI can serve as a valuable “co-architect” that accelerates scenario planning by producing solid first-draft architectures, while human architects steer and validate the results for context-specific accuracy and feasibility.
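To make the framework’s quantitative metrics concrete, the sketch below shows one plausible way to compute strategic coverage and relative design complexity. It is a minimal illustration under assumed data shapes; the `Architecture` class, function names, and example objectives are hypothetical and do not reflect the paper’s actual tooling.

```python
from dataclasses import dataclass

@dataclass
class Architecture:
    """Simplified architecture description: its components and the
    scenario objectives each design addresses (illustrative model only)."""
    components: set[str]
    objectives_addressed: set[str]

def strategic_coverage(arch: Architecture, scenario_objectives: set[str]) -> float:
    """Fraction of the scenario's strategic objectives the architecture addresses."""
    if not scenario_objectives:
        return 1.0
    return len(arch.objectives_addressed & scenario_objectives) / len(scenario_objectives)

def complexity_delta(ai_arch: Architecture, expert_arch: Architecture) -> float:
    """Relative component-count difference of the AI draft vs. the expert baseline."""
    return (len(ai_arch.components) - len(expert_arch.components)) / len(expert_arch.components)

# Hypothetical example: an AI draft covering 4 of 5 objectives
# with 12 components vs. an expert design with 10.
objectives = {"omnichannel", "partner-apis", "analytics", "security", "cloud-migration"}
ai = Architecture(components={f"c{i}" for i in range(12)},
                  objectives_addressed=objectives - {"security"})
expert = Architecture(components={f"e{i}" for i in range(10)},
                      objectives_addressed=objectives)
print(f"coverage:   {strategic_coverage(ai, objectives):.0%}")  # 80%
print(f"size delta: {complexity_delta(ai, expert):+.0%}")       # +20%
```

In this toy run the AI draft scores 80% coverage and a +20% component-count delta, mirroring the kind of coverage and over-engineering figures reported above.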