ByteCraft: Generating video games and animations through bytes
Abstract
Generating new video games using AI has the potential to be the next holy grail of the video game industry. Current AI efforts have focused on two directions: i) controllable video generation and ii) code generated by Large Language Models (LLMs). The first direction is limited by short-term memory and increasing corruption (blur, noise) over time; generating hours of coherent interactive video content is infeasible. The second direction is promising, but it requires substantial human hand-holding and human-provided assets.

In this paper, we instead attempt the overly ambitious problem of end-to-end generation of Small Web Format (SWF) games and animations through bytes. By modeling bytes directly, one needs neither code nor assets to potentially obtain full games with a title screen, narrative, text, graphics, music, and sounds. We make a first attempt by fine-tuning a 7-billion-parameter LLM at 32K context length to generate the bytes of video games and animations conditioned on a text description. Our model (ByteCraft) can generate up to 32K tokens, each containing at most 4-5 bytes (yielding files as large as 140 KB). Some of the generated files are partially working (4.8-12%) or fully working (0.4-1.2%). ByteCraft is a proof of concept highlighting what could be possible given more scaling and engineering effort. We open-source our model and inference code alongside a dataset of 10K synthetic prompts for use with ByteCraft.