
AI GM
Level Design / System Design

AI Game Master – Where Storytelling Meets Artificial Intelligence

The AI Game Master (AIGM) is a capstone project that explores the intersection of artificial intelligence and tabletop role-playing game design. Built to function as a cooperative Game Master, the AIGM dynamically generates storylines, environments, and character interactions — adapting in real time to the choices and creativity of human players.



The goal of this project is not to replace human storytellers, but to empower them. The AI Game Master acts as a creative partner, capable of expanding worlds, improvising encounters, and maintaining narrative consistency across sessions. Using natural language processing and procedural generation, it brings the spontaneity of traditional tabletop experiences into an AI-supported environment.
 

This ongoing development combines my background in RPG system design with my growing passion for human-centered AI. Through usability testing, player feedback, and iterative storytelling experiments, the AIGM project continues to evolve — bridging the gap between imagination and technology, and reimagining what it means to share a story with a machine that listens, learns, and responds.

10/26/2025

Introduction

This month marked a major step in the development of my AI Game Master (AIGM) — a capstone project that merges game design, artificial intelligence, and interactive storytelling. As a lifelong RPG fan and designer, I wanted to create a system that could act as a cooperative Game Master, adapting to players’ creativity rather than replacing it.

The AIGM prototype aims to generate unique stories, dialogue, and encounters in real time, bringing the spontaneity of tabletop sessions to a digital, AI-assisted format.

This post focuses on the early story-generation experiments and the interface prototype I built this month, the tools used to create them, and what I learned along the way.

Feature Development: UI Prototype & Story-Generation Experiments

Overview

This month covered the early design phase of the project: a system that merges artificial intelligence and tabletop role-playing games to create interactive storytelling experiences.

As both a game designer and a lifelong RPG fan, I wanted to build something that helps players and Game Masters alike craft memorable campaigns. The AIGM isn’t meant to replace human creativity; it’s meant to enhance it, acting as a collaborative storyteller that adapts to each group’s imagination.

This month, I focused specifically on designing the user interface (UI) and overall layout prototype for the application. The goal was to create an intuitive environment that supports both casual players and experienced Game Masters during AI-assisted sessions.

Story-Generation Experiments

The month began with early, standalone experiments in how the AI would store and recall narrative context. I prototyped a memory cache that logs player choices so the system can reference past events and keep the story consistent. Test sessions focused on balancing creativity against structure, fine-tuning prompt templates to keep responses relevant and reasonably short.

By the end of these experiments, the prototype could generate story hooks, describe environments, and produce dialogue trees that shifted with player tone and decisions.
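The memory cache itself isn’t shown here, but the core idea is a rolling log of events that gets folded back into each prompt. A minimal sketch, assuming a simple capped log (the class name, cap, and methods are illustrative, not the project’s actual implementation):

    from collections import deque

    class StoryMemory:
        """Rolling log of player choices for the AI to reference."""

        def __init__(self, max_events: int = 50):
            # Older events fall off the front so prompts stay a manageable size.
            self.events = deque(maxlen=max_events)

        def log(self, event: str) -> None:
            self.events.append(event)

        def as_context(self) -> str:
            # Joined into each prompt so new responses stay consistent with past events.
            return "\n".join(self.events)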

 

Prototype Layout

The prototype was developed entirely in Python using the Streamlit framework to quickly mock up an interactive tabletop interface. The layout was divided into three major sections:

  1. Narrative Panel – Displays AI or NPC text, player dialogue, and narrative controls (Approve, Edit, Regenerate).

  2. Encounter Tracker – Manages player and enemy hit points, attack rolls, and dice results.

  3. Game Master Controls – Allows for spawning NPCs, adjusting difficulty, and editing the world log.


The goal was to simulate the feel of a digital tabletop session before introducing real AI logic.


Tools & Technologies Used

  • Python 3.11 – Core programming language.

  • Streamlit – Used for building and displaying the user interface, managing session state, and creating interactive elements like buttons, sliders, and text inputs.

  • Python’s random module – Used to generate placeholder dice rolls and attack results.


Development Process

The prototype was built through an iterative process:

  1. Session State Setup: Initialized persistent session variables for encounter data, chat logs, and player/NPC text.

  2. Interface Columns: Structured the app into three columns (left narrative, middle encounters, right GM tools) to mirror the typical tabletop experience.

  3. Interactivity: Added working buttons for saving, loading, dice rolls, and regenerating NPC dialogue placeholders.

  4. Chat Simulation: Implemented a temporary system where the “DM” gives automated responses to user input — simulating AI conversation flow for later testing.


This structure allowed real-time interaction between components, giving me a live preview of how the eventual AI logic will integrate with user input and session data.
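To make that structure concrete, here is a minimal sketch of the three-column, session-state pattern described above. The key names and the canned reply are mine for illustration; the post doesn’t show the prototype’s actual variables:

    import random

    import streamlit as st

    st.set_page_config(layout="wide")  # three panels need the full browser width

    # 1. Session state setup: values survive Streamlit's rerun-per-interaction.
    for key, default in {"chat_log": [], "last_roll": None}.items():
        st.session_state.setdefault(key, default)

    # 2. Interface columns mirroring the tabletop layout.
    narrative, encounters, gm_tools = st.columns([2, 1, 1])

    with narrative:
        st.header("Narrative")
        msg = st.text_input("Say something to the DM")
        # 4. Chat simulation: a canned reply stands in for the future AI.
        if st.button("Send") and msg:
            st.session_state.chat_log.append(f"You: {msg}")
            st.session_state.chat_log.append("DM: The corridor ahead darkens...")
        for line in st.session_state.chat_log:
            st.write(line)

    with encounters:
        st.header("Encounter Tracker")
        # 3. Interactivity: the click triggers a rerun; session state keeps the result.
        if st.button("Roll d20"):
            st.session_state.last_roll = random.randint(1, 20)
        st.write("Last roll:", st.session_state.last_roll)

    with gm_tools:
        st.header("GM Controls")
        st.text_area("World log", key="world_log")

Launched via streamlit run, every interaction re-executes the script from the top, which is why the setdefault-style initialization matters.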


Challenges Faced

The main challenge this month was ensuring the Streamlit layout stayed responsive and readable across different screen widths. Streamlit’s wide-layout option helped, but some nested columns required careful proportioning.

Additionally, keeping session data consistent between updates was tricky: early versions would reset variables whenever an action occurred. This was solved by centralizing initialization in an _init_state() function that preserves all key values between user actions.
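A minimal sketch of what such a function might look like; only the _init_state() name comes from the project, and the keys are illustrative:

    import streamlit as st

    def _init_state() -> None:
        # setdefault only fills in missing keys, so values written on earlier
        # reruns survive each button click instead of being reset.
        defaults = {
            "chat_log": [],      # narrative panel history
            "encounter": {},     # player and enemy hit points
            "world_log": "",     # GM notes
        }
        for key, value in defaults.items():
            st.session_state.setdefault(key, value)

    _init_state()  # called at the top of the script, before any widgets are created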


On a more personal note, a lengthy illness cost me about a week’s worth of development time on the project during this course.

Retrospective

What Went Right

  • Completed the IRB documentation for upcoming usability testing.

  • Feedback from faculty advisor Oleg provided strong direction on ethics and data handling.

  • Completed a fully interactive Streamlit prototype representing all major game functions.

  • Established a solid session-state system for managing user input, encounters, and dialogue.

  • Successfully simulated dice rolls and player actions using Python’s randomization functions.

 

What Went Wrong

  • Personal life – Getting sick left me unable to work much on the project during the last week, so the prototype is behind where I originally envisioned it at this point.

  • Time allocation between coding, testing, and documentation was uneven during mid-month crunch weeks.

  • Early versions of the layout were too narrow, making columns overlap or clip on smaller screens.

  • Encounter tracker logic briefly broke when session states weren’t properly initialized.

  • Time spent on debugging UI alignment slightly delayed documentation progress.

 

How I’ll Improve Moving Forward

  • Continue refining layout scaling for smaller monitors and potential tablet support.

  • Begin linking this prototype to the upcoming AI backend module next month.

  • Add data persistence so sessions can be saved and loaded between runs.

  • Conduct informal user tests to gauge readability and workflow before formal IRB testing begins.

Closing Thoughts

This month’s focus on UI and user interaction flow created a solid foundation for future development. While the AI system is still forthcoming, the current prototype now provides a working environment for testing usability, pacing, and data structure.

 

Designing with Streamlit proved to be an efficient way to visualize gameplay systems in real time, giving a clearer picture of how both players and Game Masters will interact with the AI Game Master once intelligence and narrative features are added.

Next month’s goal is to integrate early AI responses and begin gathering usability data from tabletop players.

11/22/2025

Introduction

This month marked a major step in the development of my Virtual Dungeon Master (Virtual DM) — a capstone project that blends interactive storytelling, character creation tools, and performance-focused backend engineering. As both a game designer and a longtime RPG player, I’ve always wanted to build a system capable of guiding players through character creation and world interaction in a way that feels responsive, intuitive, and supportive rather than restrictive.

The Virtual DM prototype is designed to assist players and Game Masters by generating structured character data, applying races and features, and preparing prompts for later AI-driven narrative modules. This post focuses on the performance testing conducted this month, the usability evaluations completed, and the technical refinements that improved the overall responsiveness of the tool.


Feature Development: Performance Optimization & Usability Improvements

Overview

This month centered on two major development areas: CPU profiling and user testing. After several rounds of iteration, the Virtual DM interface was functional but showing signs of inefficiency. Additionally, I wanted to understand how real users interpreted the character creation flow, especially the “Apply Race” workflow.

Profiling the system revealed CPU bottlenecks around Streamlit’s rerun behavior, repeated JSON loading, and inefficient prompt construction. Meanwhile, user testing with participants of varying tabletop RPG experience provided insight into usability challenges, including terminology confusion and unclear system feedback.

Together, these efforts helped improve both performance and user experience, paving the way for deeper AI integration later in the project.


Tools & Technologies Used

  • Python 3.12 – Main language for backend logic

  • Streamlit – Framework used to build the UI and manage reruns

  • cProfile – Python CPU profiler used for detailed performance testing

  • SnakeViz – Visualization tool for interpreting profiler output

  • JSON configuration files – Store race, class, and rule data

  • Local command-line environment – Used for profiling and debugging

 

Development Process

The month began with a full CPU analysis using cProfile. The profiler was attached during a typical run of the application, including selecting a race, applying racial bonuses, navigating to Step 2, and triggering prompt generation. This produced a 100-second profile containing over seven million function calls, which was then visualized using SnakeViz.
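The exact profiling harness isn’t reproduced here, but the general shape with cProfile and SnakeViz is standard. A sketch, assuming the workflow can be exercised from a driver function (drive_session() is hypothetical):

    import cProfile
    import pstats

    def drive_session() -> None:
        # Hypothetical driver: select a race, apply bonuses, open Step 2,
        # and trigger prompt generation.
        ...

    profiler = cProfile.Profile()
    profiler.enable()
    drive_session()
    profiler.disable()

    profiler.dump_stats("virtual_dm.prof")  # then visualize: snakeviz virtual_dm.prof
    pstats.Stats(profiler).sort_stats("cumulative").print_stats(20)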

The results highlighted that:

  • Streamlit’s event loop consumed the largest amount of CPU time

  • JSON files were being reloaded on every rerun

  • String concatenation during prompt building caused unnecessary overhead

  • Some heavy logic was still running at the global scope

Based on this, several key improvements were made (the first two are sketched after this list):

  • Implemented @st.cache_data to store static race and class data

  • Rewrote prompt building using a list-join approach instead of repeated concatenation

  • Moved expensive logic into functions so Streamlit would not repeatedly execute them

  • Reduced logging noise to eliminate unnecessary console work
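A sketch of the first two changes; the file paths and data shapes are hypothetical, since the project’s loaders aren’t shown here:

    import json

    import streamlit as st

    @st.cache_data  # loaded once; later reruns reuse the cached result
    def load_rules(path: str) -> dict:
        with open(path, encoding="utf-8") as f:
            return json.load(f)

    races = load_rules("data/races.json")      # hypothetical paths
    classes = load_rules("data/classes.json")

    def build_prompt(sections: list[str]) -> str:
        # One join at the end instead of repeated "+=" concatenation in a loop.
        return "\n\n".join(sections)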

Parallel to these improvements, I also conducted usability testing with four participants. Each user completed a Talk-Aloud session where they selected a race, clicked Apply Race, and moved into Step 2 while narrating their thoughts.

A structured questionnaire was created that included:

  • Demographic questions

  • Five Likert-scale usability questions

  • Open-ended follow-up questions

  • A script instructing users how to perform the Talk-Aloud

This process highlighted areas where user expectations did not match interface behavior, especially during page reruns.

 


Challenges Faced

One of the main technical challenges this month was Streamlit’s rerun model. Whenever a widget changes, the entire script re-executes (illustrated after this list), which caused:

  • Repeated data loading

  • Flickering on update

  • Confusion for inexperienced users who believed the app froze
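For readers unfamiliar with Streamlit, a tiny illustration (not project code) of the rerun model: the whole script re-executes from the top on every interaction.

    import streamlit as st

    st.write("script top")  # re-printed on every rerun

    # Clicking the button reruns the entire script; without caching, any file
    # loads placed above would repeat on every single interaction.
    if st.button("Apply Race"):
        st.write("race applied")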


Caching mitigated some issues, but Streamlit’s event loop remains a source of complexity moving forward.

From a usability standpoint, several participants struggled with terminology such as “ability bonuses” and the difference between selecting and applying a race. Users unfamiliar with tabletop RPG mechanics needed more guidance than expected, while experienced users requested clearer confirmation feedback.

Maintaining consistent session data between reruns was another area requiring attention. Several early versions duplicated work or failed to persist state. This was addressed by consolidating initialization within a dedicated setup function.

Retrospective

What Went Right

  • Completed full CPU profiling to identify performance bottlenecks.

  • Implemented caching that significantly reduced redundant operations.

  • Completed Talk-Aloud usability testing with four participants.

  • Created the first formal questionnaire for structured user evaluation.

  • Improved prompt generation logic and removed unnecessary overhead.

  • Gained actionable insight into how new vs. experienced users interpret the UI.

 

What Went Wrong

  • Some portions of the interface remained unclear to beginners, especially the Apply Race workflow.

  • Streamlit rerun behavior caused visible flickering that confused certain participants.

  • Time spent debugging visualization in SnakeViz slowed early analysis.

  • Maintaining consistent session state required several rewrites.

  • Balancing profiling, user testing, and documentation compressed development time.

 

How I’ll Improve Moving Forward

  • Add visual confirmation messages whenever Apply Race succeeds.

  • Implement tooltips to explain terminology like modifiers and ability bonuses.

  • Continue refining caching strategies to minimize reruns and refresh delays.

  • Begin polishing the Step-2 interface to better guide new users.

  • Expand testing to more users before integrating AI logic.

  • Start linking the optimized character creation tool to the upcoming narrative engine.


Closing Thoughts

This month’s focus on profiling and user testing provided a clearer understanding of how Virtual DM performs under real use. The insights gathered from both technical analysis and human evaluation shaped meaningful improvements to the system’s responsiveness, clarity, and reliability.

By refining the character creation workflow now, the project builds a strong foundation for the AI-driven features planned for future iterations. Next month’s goal is to continue polishing the interface, integrate early narrative-generation components, and deepen usability testing with a broader range of tabletop players.
