New chef dataset brings AI to cooking
Artificial intelligence (AI) can help people shop, plan, and write—but not cook. It turns out humans aren't the only ones who have a hard time following step-by-step recipes in the correct order, but new research from the Georgia Institute of Technology's College of Computing could change that.
Researchers created a dataset called ChattyChef, a collection of cooking dialogs for natural language processing models that can help a user cook a recipe. Paired with the open-source large language model GPT-J, ChattyChef's dialogs let a model follow a recipe along with the user.
The researchers described their work in the paper "Improved Instruction Ordering in Recipe-Grounded Conversation," presented at the 61st Annual Meeting of the Association for Computational Linguistics. The study is also published on the arXiv preprint server.
Although other researchers have theorized about the possibility of an AI chef, Georgia Tech's work pushes the field forward. "We are one of the first research teams to analyze the challenges of using large language models for building an AI chef," said Duong Le, a Ph.D. student in the School of Interactive Computing.
Most attempts at using language models for cooking fail because a model such as GPT-J doesn't understand what the user wants to do next (the user intent) and has difficulty tracking how far the user is in the recipe, which the researchers call the "state of the conversation." It also can't easily answer clarification questions, such as those about ingredient amounts or cooking times.
For example, suppose someone is trying to cook hash browns. The AI tells them to melt butter in the pan and add the potatoes. The user then asks about the next step. A bad bot might jumble the order and tell them to serve the hash browns even though they haven't finished cooking them yet. Or a user asks a follow-up question about how long to cook the hash browns, and the AI isn't precise enough, giving a general time instead of specifying the cooking time for each side.
With this in mind, the researchers ensured their model had two key features:
- User intent detection to determine the user's current intent within a fixed set of possibilities, such as "Ask for next instruction" or "Ask for details about ingredients."
- Instruction state tracking to identify which recipe step the user is on, which works with 80% accuracy.
The combined information from these features supports the third innovation of ChattyChef—response generation. User intent helps generate the best response to answer a user's question. The instruction state selects the most relevant parts of the recipe rather than including the entire recipe, to avoid confusing the user or burdening them with extra steps as they are cooking.
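The way the three pieces fit together can be illustrated with a minimal Python sketch. It is not the researchers' implementation: the keyword rules and word-overlap matching below are toy stand-ins for the learned intent and state-tracking models, and every function name, the "Other" fallback label, and the prompt layout are assumptions made for illustration. Only the two quoted intent labels come from the description above.

```python
# Toy sketch: intent detection + instruction state tracking feed response generation.
# All names here are hypothetical; the heuristics stand in for learned models.

INTENT_NEXT = "Ask for next instruction"             # label quoted in the article
INTENT_INGREDIENTS = "Ask for details about ingredients"  # label quoted in the article
INTENT_OTHER = "Other"                               # assumed fallback for this sketch


def detect_intent(user_utterance: str) -> str:
    """Pick one intent from a fixed set using simple keyword cues."""
    text = user_utterance.lower()
    if any(cue in text for cue in ("next", "then what", "what now")):
        return INTENT_NEXT
    if any(cue in text for cue in ("how much", "how many", "ingredient")):
        return INTENT_INGREDIENTS
    return INTENT_OTHER


def _words(text: str) -> set[str]:
    """Lowercased words with surrounding punctuation stripped."""
    return {w.strip(".,!?").lower() for w in text.split()}


def track_state(recipe_steps: list[str], agent_turns: list[str]) -> int:
    """Return the index of the recipe step the agent most recently covered.

    Toy heuristic: match the latest agent turn to the step with the largest
    word overlap. Returns -1 if the agent has not said anything yet.
    """
    if not agent_turns:
        return -1
    last = _words(agent_turns[-1])
    overlaps = [len(last & _words(step)) for step in recipe_steps]
    return max(range(len(recipe_steps)), key=overlaps.__getitem__)


def select_steps(recipe_steps: list[str], state: int, intent: str) -> list[str]:
    """Keep only the part of the recipe relevant to the current turn."""
    if intent == INTENT_NEXT:
        return recipe_steps[state + 1 : state + 2]  # the upcoming step
    current = max(state, 0)
    return recipe_steps[current : current + 1]      # the step in progress


def build_prompt(recipe_steps: list[str], agent_turns: list[str], user_utterance: str) -> str:
    """Assemble the text a response generator would be conditioned on."""
    intent = detect_intent(user_utterance)
    state = track_state(recipe_steps, agent_turns)
    relevant = select_steps(recipe_steps, state, intent)
    lines = [f"Intent: {intent}", "Relevant recipe steps:"]
    lines += [f"- {step}" for step in relevant]
    lines += [f"User: {user_utterance}", "Agent:"]
    return "\n".join(lines)


RECIPE = [
    "Melt butter in the pan.",
    "Add the potatoes.",
    "Cook each side until golden.",
    "Serve the hash browns.",
]
AGENT_TURNS = ["First, melt some butter in the pan.", "Now add the potatoes."]

print(build_prompt(RECIPE, AGENT_TURNS, "What's next?"))
```

In this hash-brown example, the prompt contains only the cooking step that follows "Add the potatoes," so a generator conditioned on it has no opportunity to skip ahead to serving.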
The ChattyChef dataset is built from wikiHow recipes with positive ratings and fewer than eight steps. The researchers recruited crowd workers to role-play how they might use ChattyChef, to determine what instructions would be best to include in the dataset.
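The recipe selection rule can be written as a simple filter. The Python sketch below is only an illustration: the `Recipe` class and its `positively_rated` flag are hypothetical, since the article does not say how ratings are stored or what counts as positive. Only the step limit comes from the description above.

```python
from dataclasses import dataclass

MAX_STEPS_EXCLUSIVE = 8  # "fewer than eight steps", per the article


@dataclass
class Recipe:
    """Hypothetical record for one recipe; field names are assumptions."""
    title: str
    steps: list[str]
    positively_rated: bool  # assumed flag; the real rating criterion is not given


def keep_recipe(recipe: Recipe) -> bool:
    """Keep recipes that are positively rated and short enough."""
    return recipe.positively_rated and len(recipe.steps) < MAX_STEPS_EXCLUSIVE


def filter_recipes(recipes: list[Recipe]) -> list[Recipe]:
    """Return only the recipes that pass both criteria."""
    return [r for r in recipes if keep_recipe(r)]
```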
The researchers believe the innovations of ChattyChef could be used in many domains besides cooking, such as repair manuals or software documentation.
More information: Duong Minh Le et al, Improved Instruction Ordering in Recipe-Grounded Conversation, arXiv (2023). DOI: 10.48550/arxiv.2305.17280