Traffic Department 2192

HTML Script Generator

When I was young I had the demo of a game called Traffic Department 2192. It's a really strange title - more of a visual novel than a game. Made by one man (as far as I can tell), it's a sci-fi story set on a desert planet (hmm) about a group of beleaguered freedom fighters living under an oppressive alien regime. A few of them have drug addictions. To "spice". (hmm)

I'm not sure how the game ended up the way it did. I remember thinking the gameplay was cool in 1993 but as an adult revisiting it I quickly learned the gameplay is very, very basic and underwhelming. I won't go into it in detail except to say that it's a top-down shooter where you fly around a city bumping into walls and shooting other "hoverskids."

In between each of the missions, there are lengthy dialog sections. And I mean lengthy, in some cases featuring 4-6 different characters. There is no animation in the game. Except for the occasional full-screen painting, almost all the story is delivered in the form of text being displayed beneath static character portraits.

In my opinion this isn't a bad way to experience the game, if you have the time and patience. I think it sets an atmosphere. I think it's good. But I had the idea of enjoying the script without any of the tedious gameplay and slow dialogue text scrolls.

The Scripts

Episode 1
Episode 2 (coming soon)
Episode 3 (coming soon)

I've created a static HTML conversion of the first episode of the game's dialogue script using some of the work product of Maizure. You can get all their code there, along with a series of excellent tutorial videos on reverse engineering this game and rewriting it from scratch. I recommend them. I used their code as a starting point to generate the resource files for this project since they had already done most of the analysis.

Was this successful? No. It turns out that the scenes are not linearly ordered in the file - the last scene in the file is from the middle of the game, and there are other mix-ups throughout. The author probably added these sequences late in the development process and didn't want to reorder the entire file and update all of his pointers just for these minor scenes. It's mostly in order, but it's out of order enough that the story stops making sense, which makes this, unfortunately, just a neat toy project and not a legitimate way to consume the story of this game. I wish I could do better, but at this point I believe it would require further reversing of the game executable that I'm just not up to right now.

I'm going to share my code and process for kicks. This is not my best work. I have cleaned it up and commented it as best I can, but it's hacky, it has known bugs, and if the target use case was broader it would be totally impractical. But I did want to talk about how I did it, because it was fun and maybe it'll be useful info to you.

The Code

If you want to follow along with any of this you will need the TD2192 game data. I'm not sure which version you need. Maizure used the Creative Commons release, but their .C files reference some files I don't have. You can grab the commercial release from many abandonware websites, and I believe you'll know you have the right one if it contains TD-LET.CIN.

Another thing I'll note: I hate compiling C on Windows. I have endless issues with it. Fortunately the code Maizure wrote is pretty straightforward, but if you're struggling, do this:

My goal was not to simulate the experience of the dialogue interface in the game. Instead, I wanted to just lay out the script all in one big document. The portraits would need to be displayed, and every time a portrait changes I wanted to print both portraits on the page again. The colors needed to match the colors of the names under the portraits so you can tell who's speaking and the easiest way to do that was to just use the intended colors.

Maizure went much further than I needed, analyzing the missions, fullscreen images, menu layout and so on. The only info I needed was the dialogue, portraits and palettes.

The dialogue can be extracted from DIALOGUE.DAT, and it has a remarkably simple structure; I used none of Maizure's work for that step. I have not done a lot of proper reverse engineering, but I have hex edited data files from hundreds of DOS games and I can tell you that almost without exception they're never as simple as you hope. Usually they're a chunk of C structs and embedded strings, and doing anything at all is a pain. In contrast, DIALOGUE.DAT is a simple series of instructions with ordinary carriage returns (???) to separate each one. Every line starts with a command identifier - fade in the screen, change the character portraits, draw a line of text, etc. - and the only complication is keeping track of state.

My code does not need to consume every instruction. The only things it needs to know are:

This info needs to relate to two other things: Files containing the portraits, and a color palette.

There are two ways to write this kind of script: One is to extract everything and "rationalize" it - name the portraits "satair.png", "koth.png" etc. and cross-reference them to the text, then create a dict that maps the colors, portraits and text identifiers.

The other way is to just dump the raw hex values for everything, ignoring human readability, and as long as you do it the same way in both places you'll be fine. This is what I did.

The Palette

To extract the palette, I read Maizure's analysis of the palette file - it's literally just triplets of bytes. Palette index 0 is bytes 0, 1 and 2. Palette index 1 is bytes 3, 4 and 5. The values have to be bitshifted left two positions to translate to 8-bit RGB. I wrote this very simple script:

The comments should explain what's going on. Working with bytes in Python is straightforward, fortunately.
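The original script isn't shown here. As a stand-in, here is a minimal sketch of the conversion described above, not the author's code. It assumes every byte in the palette file is a 6-bit value (0-63); the function names and the sample bytes are made up for illustration.

```python
def vga_palette_to_rgb(raw: bytes) -> list[tuple[int, int, int]]:
    """Convert packed 6-bit triplets into 8-bit (r, g, b) tuples."""
    colors = []
    # Step through the data three bytes at a time; a trailing partial
    # triplet (if any) is ignored.
    for offset in range(0, len(raw) - len(raw) % 3, 3):
        r, g, b = raw[offset:offset + 3]
        # Each channel is assumed to be 0-63; shifting left two bits
        # scales it to 0-252.
        colors.append((r << 2, g << 2, b << 2))
    return colors


def css_hex(color: tuple[int, int, int]) -> str:
    """Format an (r, g, b) tuple as a CSS hex color such as #fc0000."""
    return "#{:02x}{:02x}{:02x}".format(*color)


# Made-up sample data: palette index 0 is black, index 1 is full red.
sample = bytes([0, 0, 0, 63, 0, 0])
palette = vga_palette_to_rgb(sample)
print([css_hex(c) for c in palette])  # ['#000000', '#fc0000']
```

Keeping the result as a list indexed by palette number fits the "dump the raw values" approach: a palette index read from the dialogue file can be used directly as the list index.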

The Portraits

I used more of Maizure's output for this because working with images is miserable in any language and most of the work had been done. Maizure's FCETOBMP.C produces sprite strips from the portrait files. What I needed was individual images. I made some minor flow changes to FCETOBMP.C as follows:

First, the utility creates a bitmap that is as wide as all the portraits in the file, then loops over the file copying each portrait in turn to the large bitmap, offset by the width of the previously copied portraits. I needed to change this to create a bitmap the size of one portrait, then save it, then make another.

I adjusted SURFACE_WIDTH to 160, instead of 160 times the number of portraits. I then moved the SDL_CreateRGBSurface call inside of the loop, so that each time around we get a fresh surface. I also changed surface_offset to remove the X offset from each of the previous copied portraits. Finally, I added an sprintf call to generate a new filename for each portrait and write it out, instead of waiting until the end.
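The offset bookkeeping is easier to see outside of SDL. The sketch below is not what FCETOBMP.C does. It works on a hypothetical raw buffer (row-major, one byte per pixel) and shows the same X-offset arithmetic run in reverse, cutting a strip into separate portraits. The 160 pixel width comes from the text above; the height is a parameter because it isn't stated, and the function names are made up.

```python
PORTRAIT_WIDTH = 160  # from the text; height is passed in by the caller


def split_strip(strip: bytes, count: int, height: int,
                width: int = PORTRAIT_WIDTH) -> list[bytes]:
    """Split a row-major, one-byte-per-pixel sprite strip into `count` images."""
    strip_width = width * count
    if len(strip) != strip_width * height:
        raise ValueError("strip size does not match count * width * height")
    portraits = []
    for index in range(count):
        # This is the X offset the strip layout adds for each earlier portrait.
        x_offset = index * width
        rows = []
        for y in range(height):
            start = y * strip_width + x_offset
            rows.append(strip[start:start + width])
        portraits.append(b"".join(rows))
    return portraits


def portrait_filename(index: int) -> str:
    """Name files FACE1.BMP, FACE2.BMP, ... from a zero-based index."""
    return f"FACE{index + 1}.BMP"
```

For example, `split_strip(data, 25, height)` would return 25 buffers of `160 * height` bytes each, and `portrait_filename(0)` gives `FACE1.BMP`.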

Running this utility created FACE1.BMP through FACE25.BMP. This appeared to work perfectly, except that the faces have some horizontal "jitter" - they don't all appear to be centered. I am not sure if this is how they were packed in the game or if I have a bug, but because of how unimportant this project is, and because you can't really tell in the finished output, I didn't look into it deeper. More on this later.

Here is the altered FCETOBMP.C:

The Dialogue

Parsing the dialogue is the meat of the project, and it has some caveats. Maizure covered the structure in their video, but it's so simple that I had already done much of the same by the time I came across it.

I'd like to note at this point that I really hate all hex editors that I've seen. None of them are designed for the specific thing I'm almost always doing: Parsing mixed binary/text data that contains linebreaks. I understand why this is not normally a supported thing - hex editors have to display row offsets and wrap data by bit widths, but is it so much to ask for a hybrid that can display hex values and ASCII side by side and also support line breaks? I may have to write my own.

Consequently, I had to view DIALOGUE.DAT in multiple editors, and I finally found that the best way to do it was to read the data into Python as binary, then write out the string representation, producing this exceptionally useful output:

bytearray(b'\x02\x01*  and i don"t like the officers under my command using foul language on\r')
bytearray(b'\x02\x01*t.d. communication channels, not to mention officers who make crude \r')
bytearray(b'\x02\x01*remarks about my grandmother!  now get the r"ox out of my office and\r')
bytearray(b'\x02\x01*report to recruiting tomorrow!\r')
bytearray(b'\x02\x01* \r')
bytearray(b'\x05\r')
bytearray(b'\x04\r')
bytearray(b'd\r')
bytearray(b'\x06\x04\r')
bytearray(b'\x01\x01\x01.lt. velasquez\r')
bytearray(b'\x01\x02\x04Yop. coordinator amiel\r')
bytearray(b'\x03\r')
bytearray(b'\x02\x01. \r')
bytearray(b'\x02\x01.  satair"s got me running a nursery.  got anything else?\r')
bytearray(b'\x02\x01. \r')
bytearray(b'\x05\r')
bytearray(b'\x02\x01Y \r')
bytearray(b'\x02\x01Y  i shouldn"t be doing this, velasquez.\r')
bytearray(b'\x02\x01Y \r')

If you know how to read this, it beats the hell out of a hex editor, Notepad++, or anything else. In my opinion, anyway. It doesn't decode *everything*, just characters that aren't printable ASCII, and unlike Notepad++ it doesn't produce big ugly tablets with ancient control code names, but shows the actual hex values. It's the best of both worlds.
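A dump like the one above can be produced with a few lines of Python. This is a sketch rather than the exact script used: it assumes records are separated by a single carriage return byte, and `dump_records` is a made-up name.

```python
def dump_records(data: bytes) -> list[str]:
    """Return one repr() string per carriage-return-terminated record."""
    pieces = data.split(b"\r")
    # split() always returns at least one item; the last one is whatever
    # follows the final \r (empty if the data ends with a \r).
    trailing = pieces.pop()
    # split() drops the separators, so put each \r back for display.
    lines = [repr(bytearray(piece + b"\r")) for piece in pieces]
    if trailing:
        lines.append(repr(bytearray(trailing)))
    return lines


# Sample bytes copied from the middle of the dump above.
sample = b"\x05\r\x04\rd\r\x06\x04\r"
for line in dump_records(sample):
    print(line)
```

To run it against the real file, read the data with `open("DIALOGUE.DAT", "rb").read()` and pass it in. Splitting on `b"\r"` by hand matters here: `bytes.splitlines()` would also break on `\n` bytes, which could legitimately show up as parameter values.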

Basic Structure

Above you can see a chunk out of the middle of the dialogue file. You can probably already make out a little of the structure. The first byte of each line is an instruction category - 02 to print text, 06 begins a new scene - and then the following bytes are instruction parameters.

For a text line, I'm not sure what the second byte does, but the third identifies the palette index of the text. You'd think it would indicate whether it's the left or right character - especially since, in the real game, the currently speaking character portrait is highlighted. The author must have done that by matching the palette indexes to the name fields under the character portraits. Odd choice, I would have done it the other way around.

Instruction 01 changes which character portraits are displayed. These are almost always in pairs. The second byte indicates whether it's the left or right portrait that's changing, the third byte says which portrait to use, the fourth byte is a palette color index for the name field, and everything until the next newline is the name.
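The byte layout of the two instructions described so far can be sketched as a small decoder. The class and field names are made up for illustration, the record is assumed to have had its trailing carriage return stripped already, and the meaning of the second byte of a text line is left as "unknown", just as in the text.

```python
from dataclasses import dataclass


@dataclass
class PortraitChange:
    side: int      # second byte: which slot (left or right) is changing
    portrait: int  # third byte: portrait ID
    color: int     # fourth byte: palette index of the name field
    name: str      # remaining bytes up to the carriage return


@dataclass
class TextLine:
    unknown: int   # second byte: purpose not determined
    color: int     # third byte: palette index of the text
    text: str


def parse_record(record: bytes):
    """Decode one record; return None for instructions not handled here."""
    if len(record) >= 4 and record[0] == 0x01:
        return PortraitChange(record[1], record[2], record[3],
                              record[4:].decode("ascii", errors="replace"))
    if len(record) >= 3 and record[0] == 0x02:
        return TextLine(record[1], record[2],
                        record[3:].decode("ascii", errors="replace"))
    return None


# Records taken from the dump above, minus the trailing \r.
print(parse_record(b"\x01\x01\x01.lt. velasquez"))
print(parse_record(b'\x02\x01.  satair"s got me running a nursery.  got anything else?'))
```

Both sample records carry the same color byte (`.`, which is 0x2e), which is the link between a line of text and the name under a portrait.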

Interestingly, this can happen anywhere in a dialogue, not just at the beginning. This has implications.

The MVP

There are several ways I could have written this script. One would be to walk the entire dialogue file, assembling Python data structures for each conversation. Something like:

characters = {
  1: { "name" : "lt. velasquez", "color": 49 },
  2: { "name" : "commander satair", "color": 31 }
}
conversations = [
  [
    {
      "characters" : { 1, 2 },
      "dialogue" : [ 
        [1, "Character 1 speaking"],
        [2, "Character 2 speaking"]
      ]
    }
  ]
]

This is nice and clean and sustainable, which is a waste of time and mental energy for a script I will run exactly once, once it's working. This is sort of "frontloading" the effort - you have to write a pretty complicated program to generate the initial data structure, and then once you have it you can do all the transforms and reinterpretations you like on it, and you never need to re-read the original data.

The problem is, for a "toy" application like this, it's too much brainpower. Consider: Is that how you read the dialogue file I posted up there? Did you scan the entire thing, assembling notions of every character and the scene boundaries, before reading the contents? Or did you read each line in turn, remember what you'd just read before and nothing else? It's dirty pool perhaps since you don't have the whole file, but I assert that trying to write a "proper" utility requires way more Sherlock Holmes Pipe Smoking - hours of contemplative analysis with no action - than I am prepared to apply to such a small task.

It's not that I can't do this. It's not that it'll take longer. It's mostly that it's deeply unsatisfying, and thus hard to actually stick to when the stakes are so low. I want something that spits out results, and fast. So I chose to take the second approach, which is to build a broken and filthy state machine. Performing this transformation with a state machine means that my script will walk through the file, line by line, and because it knows the conversation has to follow a specific order of operations, it will make assumptions when it sees certain instructions about what state the dialogue must be in. This can be messy.

For instance, the dialogue takes the form of several "print line" instructions in a row from one character, and then the other character speaks. When a character says three lines in a row, I want those three DIVs to be wrapped in another DIV with a 'paragraph' class*. Because I'm working line-by-line, I need to do things "before and after" the current line.

* "A DIV with a paragraph class? Don't you mean a P tag?" Well, I would, except P tags can't contain other block level elements, such as divs.

When a new character begins speaking, I need to write a paragraph DIV to the HTML file, then for each line of text I write a line DIV to the file, and then when that character is done talking I need to print a DIV close tag. The only way to tell that a new character is talking is to read the "font color" value and see if it's different than the last time a line was written, so as a result, the closing tag for a paragraph gets written at the beginning of the code that writes a new line of text.

Writing code this way is dirty and confusing at times. The first working draft used several assumptions that led to bugs, such as presuming there would always be two character portrait changes in a row. I also had situations where paragraphs were closing when they shouldn't be, or the portraits were contained inside of a paragraph tag instead of outside of it. These turned out to be due to my own state machine being a very poor one. My brain presumed that the act of leaving a conversation (the 0xd instruction) would result in certain variables - such as the one that tracks what the last color of text was - being reset. So if a scene ended with Velasquez and the next one started with Velasquez, it would never create the opening tag.
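The paragraph wrapping and the reset problem can be sketched in a few lines. This is an illustration of the technique, not the published script: the event tuples, the `scene_end` marker and the CSS class names are all made up, and real input would come from parsing the dialogue file.

```python
from html import escape


def render(events) -> str:
    """Turn ("text", color, line) and ("scene_end",) events into nested DIVs."""
    out = []
    last_color = None  # palette index of the open paragraph, None if none is open

    def close_paragraph():
        nonlocal last_color
        if last_color is not None:
            out.append("</div>")
            # Reset here, so the next speaker always gets an opening tag,
            # even if it is the same character who spoke last.
            last_color = None

    for event in events:
        if event[0] == "text":
            _, color, line = event
            if color != last_color:
                # The previous paragraph's closing tag is only written
                # once the next speaker shows up.
                close_paragraph()
                out.append(f'<div class="paragraph c{color:02x}">')
                last_color = color
            out.append(f'  <div class="line">{escape(line)}</div>')
        elif event[0] == "scene_end":
            close_paragraph()
    close_paragraph()  # end of input: close whatever is still open
    return "\n".join(out)


events = [
    ("text", 0x2E, "first line"),
    ("text", 0x2E, "second line"),
    ("text", 0x59, "reply"),
    ("scene_end",),
    ("text", 0x59, "same speaker, new scene"),
]
print(render(events))
```

Routing every close through one helper that also clears `last_color` is what prevents the "scene ends with Velasquez, next scene starts with Velasquez" case from skipping the opening tag.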

The initial draft of this document also contained a lot of "This is confusing, but that's because I had to do it like this," where by the time I was done explaining why I had to do it that way, I was asking myself "why did I do it that way?" After looking over the code a couple more times I realized I should have done it that way, and did, and removed easily 50% of the original code in the process. It's actually an extremely small script for what it's doing, thanks to the fairly clean dialogue file format, but there are so many gotchas in writing a state machine with no documentation that's meant to run many times in a row that I think this is just par for the course.

Limitations / Bugs

There are still a couple remaining gaps I don't think I can fill:

All of these require mining and integrating data I don't know how to find. From looking over Maizure's videos I can't quite figure it out from them either, but I do see them scraping some data manually out of the EXE. I think this is as far as I want to take this, and I find the results pretty satisfying as is.

There is one bug I can't explain: The portraits mostly match up, but four of them generated wildly incorrect IDs when being read out of the dialogue file. For instance, on line 421 the portrait ID appears to be 77. I can validate this with an editor, but I don't understand why this seems to work differently than every other portrait in the game. I tried debugging it for a while. I think it might have something to do with a variable overflow in the original game which wrapped to the correct value, but I can't seem to replicate that with different operations in Calc.exe, so I don't know what's up. In the end I just manually renamed the appropriate BMPs.

And, of course, there are the out-of-order scenes I mentioned, which trash the whole project's viability. Those aren't a bug though, just something I didn't realize wasn't solvable when I started. The file looked in order in the places I spot checked. Hubris is always punished.

The Code

Here it is. I embedded it in a textarea because sharing .py files from my server is harder than I wish it was; this is easier to update if I need to fix it. Put this in your TD data directory and execute with Python 3. You'll need to modify the input and output filenames and folders if you want to convert the data from episodes 2 and 3. You'll also need the modified FCETOBMP.C from above.

I should note that right before I published this I realized that each scene has an identifier byte right after the 0x06 instruction, and I excitedly rerolled the whole thing to put each conversation into a dict, then sort the dict, after which I found out the sequences are not chronological, so it was a waste of time. I included it further down to show how that works, and because if I ever do decide to extract the dialogue order from the EXE this will be a necessary step for sorting it once I have the right sequence.
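The grouping step described above can be sketched like this. It is a simplified illustration, not the published script: it assumes each record has already been split out and stripped of its carriage return, it drops anything that appears before the first 0x06, and it lumps records together if two scenes share an identifier. The function names are made up.

```python
def group_scenes(records: list[bytes]) -> dict[int, list[bytes]]:
    """Collect records into a dict keyed by the byte that follows each 0x06."""
    scenes: dict[int, list[bytes]] = {}
    current = None
    for record in records:
        if len(record) >= 2 and record[0] == 0x06:
            current = record[1]  # scene identifier byte
            scenes.setdefault(current, [])
        elif current is not None:
            scenes[current].append(record)
    return scenes


def in_id_order(scenes: dict[int, list[bytes]]) -> list[list[bytes]]:
    """Return the scenes sorted by identifier (not by story order)."""
    return [scenes[key] for key in sorted(scenes)]


# Made-up records: scene 4 appears in the file before scene 2.
records = [b"\x06\x04", b"\x03", b"\x02\x01.hi", b"\x06\x02", b"\x02\x01Yyo"]
for scene in in_id_order(group_scenes(records)):
    print(scene)
```

Sorting by identifier only puts the scenes in ID order. To get story order, `sorted(scenes)` would have to be replaced with the real sequence, once that has been recovered from the EXE.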

With sorting:

Contact me at articles@gekk.info - I would love to hear your input, stories, etc.
