(Ludum Linguarum is an open source project that I recently started, and whose creation I’ve been documenting in a series of posts. Its purpose is to let you pull localized content from games, and make flash cards for learning another language. It can be found on GitHub.)
When I started this project, I figured that support for individual games would fall into one of a small set of categories:
- Low effort, where the strings are either in a simple text file or some sort of well-structured file format like XML, where many good tools already exist to pull it apart.
- Cases where the file formats, while bespoke, are well documented, and where there may be tools and code that already exist to parse the file formats.
- The really hard cases – ones where there isn’t a lot of (or any) extant information about how the game stores its resources, and extracting strings and metadata about them is more of a reverse-engineering exercise than anything else.
In this post, I’ll briefly cover a few really simple examples of games that I was able to knock out very quickly: King of Fighters ’98, King of Fighters 2002, Magical Drop V, and Skulls of the Shogun.
King of Fighters ’98 and King of Fighters 2002
While I was working on this project, I started on some of the other supported games first. But then I decided to take a little break and see if there were any games out there that would be really trivial to support. I started browsing through my Steam library, and realized that fighting games were probably good candidates – they contain limited amounts of text, but are definitely globalized.
Both of these games use the Xbox 360 XDK’s XUI library formats to present their UI. (I determined this by the presence of some files and directories with “xui” in their name.) All of the strings in the game are inside a file conveniently named strings.txt inside the data directory.
This is a tab-delimited format with just four columns – a key for the string, a “category” comment field, and then one column for each supported language – “en” for English, and “jp” for Japanese. (It’s interesting that the country code for Japan, “jp”, was used rather than the language code for Japanese, “ja” – I’m not sure whether that was intentional or a mistake.)
In this case, it’s super simple to extract all of the strings, because of the simple formatting, and the one place that I need to look to find them all. I simply read in the file, and directly map the key column to the per-language text for each card.
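As a rough illustration of that mapping, here is a minimal Python sketch (not the project's actual code). It assumes the column order described above (key, category, “en”, “jp”), and the function name `read_strings` and the sample rows are made up for the example. Whether the real file has a header row, and which text encoding it uses, are not covered here; the caller is assumed to open the file appropriately.

```python
import csv
import io


def read_strings(stream, languages=("en", "jp")):
    """Return {language: {key: text}} from a tab-delimited stream.

    Assumed column order (hypothetical, based on the description above):
    key, category, then one column per language in `languages`.
    """
    cards = {lang: {} for lang in languages}
    # QUOTE_NONE: treat quote characters as ordinary text, not CSV quoting.
    reader = csv.reader(stream, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        # Skip blank lines and rows that lack a column for every language.
        if len(row) < 2 + len(languages) or not row[0].strip():
            continue
        key = row[0]
        for offset, lang in enumerate(languages):
            text = row[2 + offset]
            if text:
                cards[lang][key] = text
    return cards


# Made-up sample rows standing in for the contents of strings.txt.
sample = "MENU_START\tmenu\tStart\tスタート\nMENU_EXIT\tmenu\tExit\t終了\n\n"
cards = read_strings(io.StringIO(sample))
```

Each key then becomes one card, with the per-language dictionaries supplying the text for each side.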
(It’s worth noting that King of Fighters XIII doesn’t use the same format or engine, so I wasn’t able to just add support for it using the same code.)
Magical Drop V
Adding support for Magical Drop V just involved reading some XML files within its localization subdirectory, and massaging them slightly to remove invalid and undesirable text. For example, ampersands were not escaped in the XML files, which caused the .NET framework’s XML parser to complain. I also stripped out some obvious placeholder values (“<string placeholder>”).
Overall, it was really quite simple to add support for this game, with the game-specific code only running to about 50 lines.
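The kind of cleanup described above can be sketched in Python roughly as follows. This is an illustration under assumptions, not the project's code: the element name `string`, the attribute `id`, and the function name `extract_strings` are all hypothetical, since the actual schema of the game's XML files isn't described here. It also assumes the placeholder shows up as decoded element text.

```python
import re
import xml.etree.ElementTree as ET

# Matches an "&" that does not begin a named entity (&amp;) or a
# numeric character reference (&#38; or &#x26;).
BARE_AMPERSAND = re.compile(
    r"&(?!(?:[A-Za-z][A-Za-z0-9]*|#[0-9]+|#x[0-9A-Fa-f]+);)"
)

PLACEHOLDER = "<string placeholder>"


def extract_strings(raw_xml, element="string", key_attr="id"):
    """Return {key: text} after escaping bare ampersands.

    `element` and `key_attr` are hypothetical names for illustration.
    """
    # Escape stray ampersands so the parser accepts the document.
    fixed = BARE_AMPERSAND.sub("&amp;", raw_xml)
    root = ET.fromstring(fixed)
    result = {}
    for node in root.iter(element):
        key = node.get(key_attr)
        text = (node.text or "").strip()
        # Drop entries with no key, no text, or the obvious placeholder.
        if key is None or not text or text == PLACEHOLDER:
            continue
        result[key] = text
    return result


# Made-up input with one unescaped ampersand and one placeholder entry.
raw = (
    '<strings>'
    '<string id="a">Rock & Roll</string>'
    '<string id="b">&lt;string placeholder&gt;</string>'
    '<string id="c">Tom &amp; Jerry</string>'
    '</strings>'
)
strings = extract_strings(raw)
```

Pre-processing the text with a regular expression before parsing is a blunt approach, but it keeps the rest of the code on a standard XML parser instead of a hand-written one.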
Skulls of the Shogun
Skulls of the Shogun is a game built on XNA and MonoGame, and actually uses the .NET framework’s globalization support to localize its strings. Thus, I was able to use the framework’s support for loading satellite assemblies to pull out both the keys used to refer to the strings and the content itself, quite easily.
I actually spent more time working out that I had to load the assemblies in the reflection-only context, so that my library and console application would stay independent of whether they run as 32-bit or 64-bit, than I spent writing the rest of the code to support this game!