Released diagen yesterday, but how does it work?
1. Generate @terrastruct d2 diagrams with the model of your choice. Sonnet seems best, o1 seems needlessly expensive, gemini-flash is insane if you do a few rounds of visual reflection.
What's visual reflection? 👇
x.com/hrishioa/status/1843685800875266470
2. Fix any errors in the diagram by running a few loops: collect the errors from the rendering engine and pass them to a correction model. I've found sonnet is great at this, almost too good.
This is sonnet trying to make the best of things
3. Use a multimodal LLM (gemini-flash-8b is insane for this, but Haiku is the best for price to performance) to provide feedback on the generated diagram, then use that feedback to improve it.
Rinse and repeat.
That and a bunch of prompt optimizations and engineering to pull it together!
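
For the curious, the three steps fit together roughly like this. It's a minimal Python sketch of the loop, not diagen's actual code: every name here (`build_diagram`, `RenderResult`, and the `generate`, `render`, `fix`, `critique`, `revise` callables) is made up for illustration. The model calls and the d2 renderer are passed in as plain functions, so the sketch runs without any API keys.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class RenderResult:
    """Hypothetical wrapper around whatever the rendering engine returns."""
    image: Optional[bytes]  # rendered diagram, or None if rendering failed
    errors: str             # renderer error text, "" on success


def build_diagram(
    description: str,
    generate: Callable[[str], str],           # step 1: description -> d2 source
    render: Callable[[str], RenderResult],    # d2 source -> image or errors
    fix: Callable[[str, str], str],           # step 2: (source, errors) -> source
    critique: Callable[[bytes, str], str],    # step 3: (image, description) -> feedback
    revise: Callable[[str, str], str],        # (source, feedback) -> source
    max_fix_rounds: int = 3,
    reflection_rounds: int = 2,
) -> str:
    """Return d2 source that renders cleanly after a few reflection rounds."""
    code = generate(description)

    for round_no in range(reflection_rounds + 1):
        # Step 2: feed renderer errors to the correction model until clean.
        result = render(code)
        for _ in range(max_fix_rounds):
            if not result.errors:
                break
            code = fix(code, result.errors)
            result = render(code)
        if result.errors or result.image is None:
            raise RuntimeError(f"diagram still fails to render: {result.errors}")

        # The last pass only verifies that the revised diagram renders.
        if round_no == reflection_rounds:
            break

        # Step 3: visual reflection. A multimodal model looks at the image
        # and its feedback drives the next revision. Rinse and repeat.
        feedback = critique(result.image, description)
        code = revise(code, feedback)

    return code
```

In a real setup, `render` would wrap the d2 renderer and the other callables would each wrap a call to whichever model you picked for that step. Keeping them as parameters makes it cheap to swap models per step, which is most of the tuning described above.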