2025-11-03 | Vibe coding simulations

Agentic coding has gotten quite good. I spent the last week or so building a generalized version of Stanford’s Generative Agents paper that could be extended to multiple scenarios.

This was quite fun. Agent simulations show a lot of promise across the social sciences. In particular, they seem like great additions to existing scenario planning frameworks. Most of those frameworks currently fail to incorporate the sociotechnical dynamics of organizations. They often don’t account for the personalities, aptitudes, and other traits of people, which directly affect the probability of success in organizational change initiatives.

The use cases also extend to emergency response protocols and even to simulated social science experiments, something I call social science in silico.

Right now, Miniverse is best used for vibe coding simulations. The library is small enough to fit in an agent’s context, so you can simply ask your agent to create a new simulation for you. I think that’s pretty cool.

It certainly has limitations, though. For example, I only implemented very basic memory, but I included a way to extend the memory module with your own retrieval function. The Stanford Valentine’s Day replication example includes an implementation of BM25 retrieval-based memory.
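
To make the idea concrete, here is a rough sketch of what a BM25-backed memory could look like. It is not Miniverse's actual interface: the `BM25Memory` class, its `add` and `retrieve` methods, the whitespace tokenizer, and the `k1`/`b` defaults are all assumptions chosen for illustration. The scoring itself is standard Okapi BM25.

```python
import math
from collections import Counter


def tokenize(text):
    # Naive lowercase whitespace tokenizer (an assumption; swap in your own).
    return text.lower().split()


class BM25Memory:
    """Hypothetical memory store ranked with Okapi BM25. Not Miniverse's API."""

    def __init__(self, k1=1.5, b=0.75):
        self.k1 = k1                # term-frequency saturation
        self.b = b                  # strength of length normalization
        self.memories = []          # raw memory strings
        self.term_counts = []       # one Counter of tokens per memory
        self.doc_freq = Counter()   # number of memories containing each term
        self.total_len = 0          # total token count across all memories

    def add(self, text):
        tokens = tokenize(text)
        counts = Counter(tokens)
        self.memories.append(text)
        self.term_counts.append(counts)
        self.total_len += len(tokens)
        for term in counts:
            self.doc_freq[term] += 1

    def _score(self, query_terms, counts, avg_len):
        n = len(self.memories)
        doc_len = sum(counts.values())
        score = 0.0
        for term in query_terms:
            tf = counts.get(term, 0)
            if tf == 0:
                continue
            df = self.doc_freq[term]
            # The "1 +" inside the log keeps the IDF non-negative.
            idf = math.log(1 + (n - df + 0.5) / (df + 0.5))
            norm = tf + self.k1 * (1 - self.b + self.b * doc_len / avg_len)
            score += idf * tf * (self.k1 + 1) / norm
        return score

    def retrieve(self, query, top_k=3):
        # Return up to top_k memories with a positive score, best first.
        if not self.memories:
            return []
        avg_len = self.total_len / len(self.memories)
        query_terms = tokenize(query)
        scored = [
            (self._score(query_terms, counts, avg_len), text)
            for counts, text in zip(self.term_counts, self.memories)
        ]
        scored = [pair for pair in scored if pair[0] > 0]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [text for _, text in scored[:top_k]]


memory = BM25Memory()
memory.add("Isabella is planning a Valentine's Day party at the cafe")
memory.add("Klaus is writing a research paper at the library")
memory.add("Maria invited Klaus to the party at the cafe")
results = memory.retrieve("party at the cafe", top_k=2)
```

With these three made-up memories, the two that mention the party outrank the one about the research paper. A purely lexical scorer like this knows nothing about recency or importance, so a fuller memory module would probably need to combine it with other signals.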

Ultimately, this taught me a good bit about constructing systems for running experiments in AI research, specifically for analyzing LLM personality, behavior, and multi-agent social dynamics.

Here are a few of the notable lessons:

  • In the agentic future, CLIs win: designing Miniverse to work well via CLI accelerated my ability to ship with terminal agents.
  • Memory should be a first-class concern: I made memory an extension point to avoid having to think it through deeply, but the actual memory implementation turned out to be important for making the Stanford replication work.
  • Useful results don’t need complete fidelity: complete fidelity is an important target, but it is a high bar and not always necessary. Stanford’s paper used a tree-graph structure for agents navigating the simulation space; we got away with a grid, and the replication still worked. How these factors affect simulation results is probably worth further study, though.
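
As a rough illustration of how little a grid world needs, here is a sketch of agent navigation as breadth-first search over a 4-connected grid. This is not Miniverse's actual code: the `shortest_path` function, the string-based grid with `#` for walls, and the choice of BFS are all assumptions made for the example.

```python
from collections import deque


def shortest_path(grid, start, goal):
    """Return the shortest list of (row, col) cells from start to goal.

    grid is a list of equal-length strings where '#' marks a wall.
    Returns None when the goal cannot be reached.
    """
    rows, cols = len(grid), len(grid[0])
    came_from = {start: None}   # visited cells mapped to their predecessor
    queue = deque([start])
    while queue:
        cell = queue.popleft()
        if cell == goal:
            # Walk the predecessor links back to the start.
            path = []
            while cell is not None:
                path.append(cell)
                cell = came_from[cell]
            return path[::-1]
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            in_bounds = 0 <= nr < rows and 0 <= nc < cols
            if in_bounds and grid[nr][nc] != "#" and (nr, nc) not in came_from:
                came_from[(nr, nc)] = cell
                queue.append((nr, nc))
    return None


# A tiny made-up map: the agent has to walk around the wall column.
town = [
    "..#",
    "..#",
    "...",
]
path = shortest_path(town, (0, 0), (2, 2))
```

A tree of named areas carries more semantic structure than coordinates do, which is part of why the fidelity question above seems worth studying.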

See also: Miniverse, Valentine’s Replication