Spellburst: A large-language-model-powered interactive canvas for generative artists
Generative artists work in code. Using programming languages like Processing or AI text-to-image tools, they translate expressive semantics into lines of code that form swirling, colorful patterns or surrealistic landscapes.
But coding art is a time-consuming, complicated process. While a pencil's eraser might fix an errant line or a little yellow might brighten a painting's dark skyline, improving generative art takes trial and error through numerous iterations with often frustratingly opaque interfaces.
After interviewing expert digital artists about these creative frustrations, Stanford scholars have developed a tool called Spellburst to improve the ideation and editing process.
"Translating an artist's imagination into code takes a lot of time, and it's very difficult," says Hariharan Subramonyam, assistant professor at the Graduate School of Education and a faculty fellow at the Stanford Institute for Human-Centered AI.
"A large language model can give you a good starting point. But when the artist wants to explore different textures, different colors or patterns, at that point they want finer control, which large language models can't provide. Spellburst essentially helps artists seamlessly switch between the semantic space and the code."
Built on the large language model GPT-4, Spellburst lets artists enter an initial prompt, such as "a stained glass image of a beautiful, bright bouquet of roses." The model then generates the code to render that concept. But what if the flowers are too pink, or the stained glass doesn't look quite right? Artists can then open a panel of dynamic sliders, generated from the previous prompt, to change any aspect of the image, or they can add modifying notes ("make the flowers a dark red").
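One way to picture the slider panel is as a set of named, range-limited values that are written back into the generated sketch each time a slider moves. The Python below is only an illustrative sketch under that assumption, not Spellburst's implementation: the `Slider` class, the `render_source` function, the parameter names, and the Processing-style snippet in `SKETCH_TEMPLATE` are all hypothetical.

```python
from dataclasses import dataclass
from string import Template


@dataclass
class Slider:
    """A hypothetical slider: a named value constrained to [low, high]."""
    name: str
    low: float
    high: float
    value: float

    def set(self, new_value: float) -> None:
        # Clamp so the value can never leave the slider's range.
        self.value = max(self.low, min(self.high, new_value))


# Made-up generated sketch; tunable values are exposed as $placeholders.
SKETCH_TEMPLATE = Template(
    "fill($red, 20, 40);\n"
    "for (let i = 0; i < $petal_count; i++) drawPetal(i);"
)


def render_source(template: Template, sliders: list) -> str:
    """Write the current slider values back into the sketch source."""
    values = {s.name: format(s.value, "g") for s in sliders}
    return template.substitute(values)


sliders = [Slider("red", 0, 255, 230), Slider("petal_count", 3, 24, 8)]
sliders[0].set(140)  # tone down the red
sliders[1].set(40)   # out of range, so it is clamped to 24
source = render_source(SKETCH_TEMPLATE, sliders)
print(source)
```

In this sketch, moving a slider only changes a number in the code, which is the kind of fine control that a fresh text prompt does not offer.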
Artists can also merge different versions ("combine the color of the flowers in version 4 with the shape of the vase in version 9"). In addition, the tool lets them move from prompt-based exploration to program editing: clicking on the image reveals the code, allowing for more granular fine-tuning.
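The paper's title describes the interface as node-based, so the version history can be imagined as a graph in which each version records the prompt that produced it and the versions it came from. The following Python is a rough sketch under that assumption. The names `Version`, `VersionGraph`, and `fake_model` are hypothetical, and `fake_model` is a stand-in that returns a comment string instead of calling a language model.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Version:
    """One node on the canvas: a prompt, the code it produced, and its parents."""
    id: int
    prompt: str
    code: str
    parents: List[int] = field(default_factory=list)


class VersionGraph:
    def __init__(self, generate: Callable[[str, List[str]], str]):
        # `generate` stands in for the language model call (an assumption).
        self.generate = generate
        self.versions: Dict[int, Version] = {}

    def _add(self, prompt: str, parents: List[int]) -> Version:
        parent_code = [self.versions[p].code for p in parents]
        version_id = len(self.versions) + 1
        code = self.generate(prompt, parent_code)
        version = Version(version_id, prompt, code, list(parents))
        self.versions[version_id] = version
        return version

    def create(self, prompt: str) -> Version:
        return self._add(prompt, [])

    def modify(self, parent_id: int, note: str) -> Version:
        return self._add(note, [parent_id])

    def merge(self, a: int, b: int, instruction: str) -> Version:
        return self._add(instruction, [a, b])


def fake_model(prompt: str, parent_code: List[str]) -> str:
    # Placeholder output; a real tool would return runnable sketch code.
    return f"// {prompt} (from {len(parent_code)} parent sketch(es))"


graph = VersionGraph(fake_model)
root = graph.create("a stained glass image of a bouquet of roses")
darker = graph.modify(root.id, "make the flowers a dark red")
taller = graph.modify(root.id, "make the vase taller")
merged = graph.merge(darker.id, taller.id,
                     "combine the flower color of one with the vase of the other")
print(merged.parents, merged.code)
```

Keeping the parent links makes it possible to branch from any earlier version and to trace which versions a merge drew on.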
'Larger creative leaps'
To better inform the design of Spellburst, the research team interviewed 10 expert creative coders on how they develop their concepts, their creative workflow, and their biggest challenges. Later, the team tested the tool with expert generative artists.
"The feedback was overall very positive," Subramonyam says. "The large language model helps artists bridge from semantic space to code faster, but it also helps them explore many different variations and take larger creative leaps."
The tool has its limitations. The research team saw errors and unexpected results with some prompts, particularly when merging versions, and it was unclear which prompts would lead to the desired results. In addition, the small sample of artists who provided feedback does not represent the full generative artist community.
But the hope is that this tool will be useful for coder artists and maybe even a broader audience, Subramonyam says.
"We want to release the tool as open-source later this year so that artists can start using it, but we also want to study how a tool like this can help novices learn how to make art with code."
The findings are published on the arXiv preprint server.
More information: Tyler Angert et al., Spellburst: A Node-based Interface for Exploratory Creative Coding with Natural Language Prompts, arXiv (2023). DOI: 10.48550/arxiv.2308.03921