Traffic Department 2192

HTML Script Generator

When I was young I had the demo of a game called Traffic Department 2192. It's a really strange title - more of a visual novel than a game. Made by one man (as far as I can tell), it is a sci-fi setting on a desert planet (hmm) telling the story of a group of beleaguered freedom fighters under an oppressive alien regime. A few of them have drug addictions. To "spice". (hmm)

I'm not sure how the game ended up the way it did. I remember thinking the gameplay was cool in 1993 but as an adult revisiting it I quickly learned the gameplay is very, very basic and underwhelming. I won't go into it in detail except to say that it's a top-down shooter where you fly around a city bumping into walls and shooting other "hoverskids."

In between each of the missions, there are lengthy dialog sections. And I mean lengthy, in some cases featuring 4-6 different characters. There is no animation in the game. Except for the occasional full-screen painting, almost all the story is delivered in the form of text being displayed beneath static character portraits.

In my opinion this isn't a bad way to experience the game, if you have the time and patience. I think it sets an atmosphere. I think it's good. But I had the idea of enjoying the script without any of the tedious gameplay and slow dialogue text scrolls.

The Scripts

Episode 1
Episode 2 (coming soon)
Episode 3 (coming soon)

I've created a static HTML conversion of the first episode of the game's dialogue script using some of the work product of Maizure. You can get all their code there, along with a series of excellent tutorial videos on reverse engineering this game and rewriting it from scratch; I recommend them. I used their code as a starting point to generate the resource files for this project since they had already done most of the analysis.

Was this successful? No. It turns out that the scenes are not linearly ordered in the file - the last scene in the file is from the middle of the game, and there are other mixups throughout. The author probably added these sequences late in the development process and didn't want to reorder the entire file and update all of his pointers just for these minor scenes. It's mostly in order, but it's enough out of order to not make any sense, making this, unfortunately, just a neat toy project and not a legitimate way to consume the story of this game. I wish I could do better, but at this point I believe it would require further reversing of the game executable that I'm just not up to right now.

I'm going to share my code and process for kicks. This is not my best work. I have cleaned it up and commented it as best I can, but it's hacky, it has known bugs, and if the target use case was broader it would be totally impractical. But I did want to talk about how I did it, because it was fun and maybe it'll be useful info to you.

The Code

If you want to follow along with any of this you will need the TD2192 game data. I'm not sure which version you need. Maizure used the Creative Commons release, but their .C files reference some files I don't have. You can grab the commercial release from many abandonware websites, and I believe you will know if you have the right one if it contains TD-LET.CIN.

Another thing I'll note: I hate compiling C on Windows. I have endless issues with it. Fortunately the code Maizure wrote is pretty straightforward, but if you're struggling, do this:

Install Visual Studio
Add C:\Program Files (x86)\Microsoft Visual Studio\2017\Community\Common7\Tools to your PATH environment variable. This path has changed between VC revs, so you're on your own figuring out what it is for your version.
Open a command prompt and execute vsdevcmd
Acquire the SDL2 dev libraries and extract them somewhere. I used E:\Code\SDL2-2.0.8
Compile as follows: cl FCETOBMP.c /IE:\CODE\SDL2-2.0.8\INCLUDE /link /SUBSYSTEM:CONSOLE E:\CODE\SDL2-2.0.8\LIB\X86\SDL2.lib

My goal was not to simulate the experience of the dialogue interface in the game. Instead, I wanted to just lay out the script all in one big document. The portraits would need to be displayed, and every time a portrait changes I wanted to print both portraits on the page again. The colors needed to match the colors of the names under the portraits so you can tell who's speaking and the easiest way to do that was to just use the intended colors.

Maizure went much further than I needed, analyzing the missions, fullscreen images, menu layout and so on. The only info I needed was the dialogue, portraits and palettes.

The dialogue can be extracted from DIALOGUE.DAT, and it has a remarkably simple structure; I used none of Maizure's work for that step. I have not done a lot of proper reverse engineering but I have hex edited data files from hundreds of DOS games and I can tell you that almost without exception they're never as simple as you hope; usually a chunk of C structs and embedded strings and doing anything at all is a pain. In contrast, DIALOGUE.DAT is a simple series of instructions with ordinary carriage returns (???) to separate each one. Every line starts with a command identifier - fade in the screen, change the character portraits, draw a line of text, etc. - and the only complication is keeping track of state.

My code does not need to consume every instruction. The only things it needs to know are:

The beginning and end of each scene
When the character portraits/names change
What text to write and in what color

This info needs to relate to two other things: Files containing the portraits, and a color palette.

There are two ways to write this kind of script: One is to extract everything and "rationalize" it - name the portraits "satair.png", "koth.png" etc. and cross-reference them to the text, then create a dict that maps the colors, portraits and text identifiers.

The other way is to just dump the raw hex values for everything, ignoring human readability, and as long as you do it the same way in both places you'll be fine. This is what I did.

The Palette

To extract the palette, I read Maizure's analysis of the palette file - it's literally just triplets of bytes. Palette index 0 is bytes 0, 1 and 2. Palette index 1 is bytes 3, 4 and 5. The values have to be bitshifted left two positions to translate to 8-bit RGB. I wrote this very simple script:

# open palette file
pal = open("td.pal", "rb").read()
entry = 0
# open output CSS
out = open("pal.css", "w")
# there are 256 entries
while entry < 256:
	# the byte offset into the file is the entry index * 3
	offset = entry * 3
	# bitshift each value left by 2 to scale 6-bit values up to 8-bit (63 becomes 252)
	r = pal[offset] << 2
	g = pal[offset+1] << 2
	b = pal[offset+2] << 2
	# write out a CSS entry
	out.write('.color{color} {{ color: rgba({r},{g},{b},1) }}\n'.format(color=entry, r=r, g=g, b=b))
	entry = entry + 1
# flush the CSS to disk
out.close()

The comments should explain what's going on. Working with bytes in Python is straightforward, fortunately.
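One wrinkle with the shift: a VGA palette value tops out at 63, and 63 << 2 is 252, not 255, so the brightest colors come out very slightly dim. Here's a minimal sketch of the difference, assuming the common trick of copying the top two bits into the bottom two. The function names are mine, not Maizure's, and I didn't bother with this in the real script.

```python
def shift_only(v):
    # what the palette script does: 6-bit value shifted up into 8 bits
    return v << 2

def shift_and_fill(v):
    # same shift, plus the top two bits copied into the low two bits,
    # so that 63 maps to 255 instead of 252
    return (v << 2) | (v >> 4)

for v in (0, 32, 63):
    print(v, shift_only(v), shift_and_fill(v))
```

You can't see the difference in the finished page, so the plain shift is good enough here.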

The Portraits

I used more of Maizure's output for this because working with images is miserable in any language and most of the work had been done. Maizure's FCETOBMP.C produces sprite strips from the portrait files. What I needed was individual images. I made some minor flow changes to FCETOBMP.C as follows:

First, the utility creates a bitmap that is as wide as all the portraits in the file, then loops over the file copying each portrait in turn to the large bitmap, offset by the width of the previously copied portraits. I needed to change this to create a bitmap the size of one portrait, then save it, then make another.

I adjusted SURFACE_WIDTH to 160, instead of 160 times the number of portraits. I then moved the SDL_CreateRGBSurface call inside of the loop, so that each time around we get a fresh surface. I also changed surface_offset to remove the X offset from each of the previous copied portraits. Finally, I added an sprintf call to generate a new filename for each portrait and write it out, instead of waiting until the end.

Running this utility created FACE0.BMP through FACE24.BMP (the loop counter starts at zero). This appeared to work perfectly, except that the faces have some horizontal "jitter" - they don't all appear to be centered. I am not sure if this is how they were packed in the game or if I have a bug, but because of how unimportant this project is, and because you can't really tell in the finished output, I didn't look into it deeper. More on this later.
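If you'd rather not fight a C compiler just to see how the portrait file is laid out, the slicing is easy to show in Python. This is only a sketch of the record layout, not a replacement for FCETOBMP.C: it assumes each record is a 768-byte palette followed by 16000 pixel bytes (160x100), which is what the C listing reads, and `split_faces` is a name I made up. It runs on fake data so you don't need the game files.

```python
PALETTE_SIZE = 768            # 256 colors x 3 bytes, 6-bit values
WIDTH, HEIGHT = 160, 100
PIXELS_SIZE = WIDTH * HEIGHT  # 16000 palette indexes, one per pixel
RECORD_SIZE = PALETTE_SIZE + PIXELS_SIZE  # 16768 bytes per portrait

def split_faces(data):
    """Return a list of (palette, pixels) tuples, one per packed portrait."""
    faces = []
    count = len(data) // RECORD_SIZE
    for i in range(count):
        start = i * RECORD_SIZE
        raw_pal = data[start:start + PALETTE_SIZE]
        pixels = data[start + PALETTE_SIZE:start + RECORD_SIZE]
        # scale the 6-bit palette values up to 8-bit, same shift as the C code
        palette = [(raw_pal[n] << 2, raw_pal[n + 1] << 2, raw_pal[n + 2] << 2)
                   for n in range(0, PALETTE_SIZE, 3)]
        faces.append((palette, pixels))
    return faces

# fake two-portrait file: all-white palette, all pixels set to index 0
fake = (bytes([63] * PALETTE_SIZE) + bytes(PIXELS_SIZE)) * 2
faces = split_faces(fake)
print(len(faces), faces[0][0][0], len(faces[0][1]))
```

Each portrait carries its own palette, which is why the palette gets re-read inside the loop rather than once at the top.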

Here is the altered FCETOBMP.C:

/*  Face to BMP file converter designed for use with Traffic Department 2192 NEWFACES files
*	argv[1] points to Face data file
*
* Six issues:
*  1) This program is a security nightmare - don't compile and store
*  2) No trapping, feedback, instructions, or any output whatsoever
*  3) SDL_Surface normally needs to be locked due to concurrency
*  4) Free the SDL surface - in this case, the OS does it
*  5) Void pointer implicit cast (not portable to C++)
*  6) User must preprocess input file Name and extension
*/

#include <stdio.h>	/* Opening Files */
#include <string.h>	/* Manipulating File Names */
#include <stdint.h> /* Fixed-width data types (C99) */
#include <SDL.H>	/* Using SDL data structure */

#undef main

int main(int argc, char* argv[])
{
	FILE *fin; 						/* Input File pointer */
	uint32_t FILE_SIZE;				/* Get file size */
	uint8_t IMAGE_COUNT;            /* Number of packed images */
	uint16_t SURFACE_WIDTH;          /* Width of the output surface */
	SDL_Surface *surface; 			/* SDL Surface struct */
	uint8_t *dst_byte;				/* Pointer to surface struct Pixels */
	uint8_t current_image;			/* Named iterator */
	uint16_t current_byte;			/* Named iterator */
	uint8_t palette[768]; 			/* Palette array: 3 x 8-bit*/
	uint8_t src_byte; 				/* Single byte read from input */
	uint8_t red_p, green_p, blue_p;	/* Palette lookup bytes by color */
	uint8_t image_row, image_col;   /* Current image pixel location */
	uint32_t surface_offset;        /* Byte offset in surface */
	uint8_t FILENAME_LEN; 			/* File Name length*/
	char fout[12]; 					/* New filename container */
	
	/* Open Face File and find file statistics*/
	fin = fopen(argv[1],"rb");
	fseek(fin, 0, SEEK_END);
	FILE_SIZE = ftell(fin);
	rewind(fin);
	IMAGE_COUNT = FILE_SIZE/16768;
	
	/* One surface per portrait now, not one wide strip. Each Image is 160x100 */
	SURFACE_WIDTH = 160;
	
	/* Process each image in file sequentially */
	for(current_image=0; current_image < IMAGE_COUNT; current_image++)
	{
		surface = SDL_CreateRGBSurface(0, SURFACE_WIDTH, 100, 32, 0,0,0,0);
		dst_byte = surface->pixels;
		
		/* Extract image palette - guaranteed 768 bytes*/
		for (current_byte=0; current_byte<768; current_byte++)
			palette[current_byte] = fgetc(fin) << 2;  /* Reranging 6-bit values to 8-bit: 63 -> 252 */
		
		/* Extract 16kb of image data, match to palette, write to surface */
		for (current_byte=0; current_byte<16000; current_byte++) {
			src_byte = fgetc(fin);           /* Read Byte From File */
			red_p = palette[src_byte*3];     /* Lookup Red Channel */
			green_p = palette[src_byte*3+1]; /* Lookup Green Channel */
			blue_p = palette[src_byte*3+2];  /* Lookup Blue Channel */
			
			image_row = current_byte / 160; /* Int type clips fractions*/
			image_col = current_byte % 160;
			surface_offset = (image_row*SURFACE_WIDTH
							  +image_col)*4;
			
			dst_byte[surface_offset] = blue_p; 		/* Set Blue channel pixel */
			dst_byte[surface_offset+1] = green_p;   /* Set Green channel pixel */
			dst_byte[surface_offset+2] = red_p;    	/* Set Red channel pixel */
			dst_byte[surface_offset+3] = 0xff;      /* Set Alpha channel pixel */
		}
		
		sprintf(fout, "FACE%d.BMP", current_image);
		SDL_SaveBMP(surface, fout); 
	}
	fclose(fin);
	
	/* New output filename with .BMP extension */
	//FILENAME_LEN = strlen(argv[1]);    		/* Length of input file name */
	//strncpy(fout,argv[1],FILENAME_LEN-3);  	/* Chop off extension */
	//fout[FILENAME_LEN-3]='\0'; 				/* Nullterm it */
	//strncat(fout,".BMP",4);  				/* Attach BMP extension */
	
	/* Save BMP*/
	 //SDL_SaveBMP(surface, fout); 
	
	
	return 0;
}

The Dialogue

The meat of the project, parsing the dialogue, has some caveats. Maizure covered the structure in their video, but it's so simple that I had already done much of the same by the time I came across it.

I'd like to note at this point that I really hate all hex editors that I've seen. None of them are designed for the specific thing I'm almost always doing: Parsing mixed binary/text data that contains linebreaks. I understand why this is not normally a supported thing - hex editors have to display row offsets and wrap data by bit widths, but is it so much to ask for a hybrid that can display hex values and ASCII side by side and also support line breaks? I may have to write my own.

Consequently, I had to view DIALOGUE.DAT in multiple editors, and I finally found that the best way to do it was to read the data into Python as binary, then write out the string representation, producing this exceptionally useful output:
bytearray(b'\x02\x01*  and i don"t like the officers under my command using foul language on\r')
bytearray(b'\x02\x01*t.d. communication channels, not to mention officers who make crude \r')
bytearray(b'\x02\x01*remarks about my grandmother!  now get the r"ox out of my office and\r')
bytearray(b'\x02\x01*report to recruiting tomorrow!\r')
bytearray(b'\x02\x01* \r')
bytearray(b'\x05\r')
bytearray(b'\x04\r')
bytearray(b'd\r')
bytearray(b'\x06\x04\r')
bytearray(b'\x01\x01\x01.lt. velasquez\r')
bytearray(b'\x01\x02\x04Yop. coordinator amiel\r')
bytearray(b'\x03\r')
bytearray(b'\x02\x01. \r')
bytearray(b'\x02\x01.  satair"s got me running a nursery.  got anything else?\r')
bytearray(b'\x02\x01. \r')
bytearray(b'\x05\r')
bytearray(b'\x02\x01Y \r')
bytearray(b'\x02\x01Y  i shouldn"t be doing this, velasquez.\r')
bytearray(b'\x02\x01Y \r')
If you know how to read this, it beats the hell out of a hex editor, notepad++, or anything else. In my opinion, anyway. It doesn't decode *everything*, just characters that aren't printable ASCII, and unlike notepad++ it doesn't produce big ugly tablets with ancient control code names, but shows the actual hex values. It's the best of both worlds.
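If you want to try the same trick, it's only a few lines. This is a sketch under one assumption: that the file splits cleanly on byte 10, which is what my converter further down does. `sample` stands in for the contents of DIALOGUE.DAT, and `dump_lines` is just a name I picked.

```python
def dump_lines(raw):
    """Split raw bytes on byte 10 and return the printable repr of each line."""
    out = []
    for line in raw.split(b"\n"):
        if line:  # skip the empty chunk left after the final separator
            out.append(str(bytearray(line)))
    return out

# stand-in for open("DIALOGUE.DAT", "rb").read()
sample = b'\x05\r\n\x04\r\nd\r\n\x06\x04\r\n\x01\x01\x01.lt. velasquez\r\n'

for text in dump_lines(sample):
    print(text)
```

Printable ASCII comes through as-is and everything else shows up as a \x escape, which is exactly the mix I wanted.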

Basic Structure

Above you can see a chunk out of the middle of the dialogue file. You can probably already make out a little of the structure. The first byte of each line is an instruction category - 02 to print text, 06 begins a new scene - and then the following bytes are instruction parameters.

For a text line, I'm not sure what the second byte does, but the third identifies the palette index of the text. You'd think it would indicate whether it's the left or right character - especially since, in the real game, the currently speaking character portrait is highlighted. The author must have done that by matching the palette indexes to the name fields under the character portraits. Odd choice, I would have done it the other way around.

Instruction 01 changes which character portraits are displayed. These are almost always in pairs. The second byte indicates whether it's the left or right portrait that's changing, the third byte says which portrait to use, the fourth byte is a palette color index for the name field, and everything until the next newline is the name.

Interestingly, this can happen anywhere in a dialogue, not just at the beginning. This has implications.
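Put together, the two instructions I care about decode like this. It's a minimal sketch using two lines from the dump above; the function names and dict keys are my own invention, and since I don't know what the second byte of a text line does, it's just passed through untouched.

```python
def parse_portrait(line):
    # 0x01, side (1 = left, 2 = right), portrait number, name color, then the name
    return {
        "side": "left" if line[1] == 0x01 else "right",
        "portrait": line[2],
        "color": line[3],
        "name": line[4:].decode("ascii").rstrip("\r"),
    }

def parse_text(line):
    # 0x02, unknown byte, text color, then the text itself
    return {
        "unknown": line[1],
        "color": line[2],
        "text": line[3:].decode("ascii").strip(),
    }

portrait = parse_portrait(b'\x01\x02\x04Yop. coordinator amiel\r')
text = parse_text(b'\x02\x01Y  i shouldn"t be doing this, velasquez.\r')
print(portrait)
print(text)
```

Both come back with color 89 (the byte value of "Y"), and that shared number is the only thing tying the line of text to the right-hand portrait.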

The MVP

There are several ways I could have written this script. One would be to walk the entire dialogue file, assembling Python data structures for each conversation. Something like:

characters = {
  1: { "name" : "lt. velasquez", "color": 49 },
  2: { "name" : "commander satair", "color": 31 }
}
conversations = [
  [
    {
      "characters" : { 1, 2 },
      "dialogue" : [ 
        [1, "Character 1 speaking"],
        [2, "Character 2 speaking"]
      ]
    }
  ]
]

This is nice and clean and sustainable, which is a waste of time and mental energy for a script I will run exactly once, once it's working. This is sort of "frontloading" the effort - you have to write a pretty complicated program to generate the initial data structure, and then once you have it you can do all the transforms and reinterpretations you like on it, and you never need to re-read the original data.
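To give a sense of what that frontloading involves, here's a rough sketch of a builder for a single scene. It assumes the instruction bytes described above and takes lines that have already been split; every name in it, `build_scene` included, is hypothetical. This is not how I wrote the converter.

```python
def build_scene(lines):
    """Collect one scene's raw lines into a characters/dialogue structure."""
    scene = {"characters": {}, "dialogue": []}
    for line in lines:
        if line[0] == 0x01:
            # portrait change: remember which name color belongs to whom
            scene["characters"][line[3]] = {
                "name": line[4:].decode("ascii").strip(),
                "portrait": line[2],
            }
        elif line[0] == 0x02:
            text = line[3:].decode("ascii").strip()
            if text:  # ignore the blank spacer lines
                scene["dialogue"].append([line[2], text])
    return scene

scene = build_scene([
    b'\x01\x01\x01.lt. velasquez\r',
    b'\x01\x02\x04Yop. coordinator amiel\r',
    b'\x02\x01. \r',
    b'\x02\x01.  satair"s got me running a nursery.  got anything else?\r',
    b'\x02\x01Y  i shouldn"t be doing this, velasquez.\r',
])
print(scene["characters"][46]["name"], len(scene["dialogue"]))
```

Even this toy version glosses over portraits changing mid-scene, which is where the real work would start.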

The problem is, for a "toy" application like this, it's too much brainpower. Consider: Is that how you read the dialogue file I posted up there? Did you scan the entire thing, assembling notions of every character and the scene boundaries, before reading the contents? Or did you read each line in turn, remember what you'd just read before and nothing else? It's dirty pool perhaps since you don't have the whole file, but I assert that trying to write a "proper" utility requires way more Sherlock Holmes Pipe Smoking - hours of contemplative analysis with no action - than I am prepared to apply to such a small task.

It's not that I can't do this. It's not that it'll take longer. It's mostly that it's deeply unsatisfying, and thus hard to actually stick to when the stakes are so low. I want something that spits out results, and fast. So I chose to take the second approach, which is to build a broken and filthy state machine. Performing this transformation with a state machine means that my script will walk through the file, line by line, and because it knows the conversation has to follow a specific order of operations, it will make assumptions when it sees certain instructions about what state the dialogue must be in. This can be messy.

For instance, the dialogue takes the form of several "print line" instructions in a row from one character, and then the other character speaks. When a character says three lines in a row, I want those three DIVs to be wrapped in another DIV with a 'paragraph' class*. Because I'm working line-by-line, I need to do things "before and after" the current line.

* "A DIV with a paragraph class? Don't you mean a P tag?" Well, I would, except P tags can't contain other block level elements, such as divs.

When a new character begins speaking, I need to write a paragraph DIV to the HTML file, then for each line of text I write a line DIV to the file, and then when that character is done talking I need to print a DIV close tag. The only way to tell that a new character is talking is to read the "font color" value and see if it's different than the last time a line was written, so as a result, the closing tag for a paragraph gets written at the beginning of the code that writes a new line of text.
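Stripped of the portraits and file handling, the open/close dance looks something like this. It's a sketch only: `lines` is a list of (color, text) pairs with None marking the end of a scene, which is my simplification and not how the real loop receives its data.

```python
def to_html(lines):
    out = []
    lastcolor = None
    for item in lines:
        if item is None:
            # scene over: close any open paragraph and forget the last color,
            # otherwise a scene that opens with the same speaker gets no tag
            if lastcolor is not None:
                out.append("</div>")
            lastcolor = None
            continue
        color, text = item
        if color != lastcolor:
            if lastcolor is not None:
                out.append("</div>")  # close the previous speaker first
            out.append('<div class="paragraph color%d">' % color)
            lastcolor = color
        out.append('<div class="line">%s</div>' % text)
    return out

html = to_html([(46, "one"), (46, "two"), (89, "three"), None, (89, "four"), None])
print("\n".join(html))
```

Notice that the closing tag for one speaker is emitted while handling the next speaker's first line, which is the "before and after" awkwardness in a nutshell.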

Writing code this way is dirty and confusing at times. The first working draft used several assumptions that led to bugs, such as presuming there would always be two character portrait changes in a row. I also had situations where paragraphs were closing when they shouldn't be, or the portraits were contained inside of a paragraph tag instead of outside of it. These turned out to be due to my own state machine being a very poor one. My brain presumed that the act of leaving a conversation (the 'd' instruction, byte 0x64) would result in certain variables - such as the one that tracks what the last color of text was - being reset. Nothing was actually resetting them. So if a scene ended with Velasquez and the next one started with Velasquez, it would never create the opening tag.

The initial draft of this document also contained a lot of "This is confusing, but that's because I had to do it like this," where by the time I was done explaining why I had to do it that way, I was asking myself "why did I do it that way?" After looking over the code a couple more times I realized I shouldn't have done it that way, rewrote it, and removed easily 50% of the original code in the process. It's actually an extremely small script for what it's doing, thanks to the fairly clean dialogue file format, but there are so many gotchas in writing a state machine with no documentation that's meant to run many times in a row that I think this is just par for the course.

Limitations / Bugs

There are still a couple remaining gaps I don't think I can fill:

I'd like to inject the full-screen digital images into the right spots in the dialogue but I can't figure out what triggers those, and I suspect at least some of them are hardcoded in the game binary.
Some text, such as "VULTURESSS DON'T FLY WITH HEADLIGHTSSS," is actually part of in-mission dialogue. The DAT format does not distinguish this; the EXE must have code to account for it, which makes me wonder why they bothered putting scene delimiters in the DAT file at all.
I'd like to include mission names before each scene

All of these require mining and integrating data I don't know how to find. From looking over Maizure's videos I can't quite figure it out from them either, but I do see them scraping some data manually out of the EXE. I think this is as far as I want to take this, and I find the results pretty satisfying as is.

There is one bug I can't explain: The portraits mostly match up, but four of them generated wildly incorrect IDs when being read out of the dialogue file. For instance, on line 421 the portrait ID appears to be 77. I can validate this with an editor, but I don't understand why this seems to work differently than every other portrait in the game. I tried debugging it for a while. I think it might have something to do with a variable overflow in the original game which wrapped to the correct value, but I can't seem to replicate that with different operations in Calc.exe, so I don't know what's up. In the end I just manually renamed the appropriate BMPs.

And, of course, there are the out-of-order scenes I mentioned which trash the whole project's viability. Those aren't a bug though, just something I didn't realize wasn't solvable when I started. The file looked in order in the places I spot checked. Hubris is always punished.

The Code

Here it is. I embedded it in a textarea because sharing .py files from my server is harder than I wish it was; this is easier to update if I need to fix it. Put this in your TD data directory and execute with Python 3. You'll need to modify the input and output filenames and folders if you want to convert the data from episodes 2 and 3. You'll also need the modified FCETOBMP.C from above.

I should note that right before I published this I realized that each scene has an identifier byte right after the 0x06 instruction, and I excitedly rerolled the whole thing to put each conversation into a dict, then sort the dict, after which I found out the sequences are not chronological, so it was a waste of time. I included it further down to show how that works, and because if I ever do decide to extract the dialogue order from the EXE this will be a necessary step for sorting it once I have the right sequence.

# open as binary or you'll get errors
raw_dlg = open("DIALOGUE.DAT", "rb").read()

# str-style .split('\r') raises a TypeError on binary data (raw_dlg.split(b'\n') would
# have worked), so i reimplemented it by hand
dlg = []
# since python only vaguely has types, you have to use a function to create
# the binary equivalent of a string. you can treat bytearray() mostly like a string though.
line = bytearray()
for byte in raw_dlg:
	# 10 is the ascii code for \n (line feed); the \r before it stays on the end of the line
	if byte != 10:
		line.append(byte)
	else:
		# we hit a line feed so the line is over
		dlg.append(line)
		line = bytearray()

# these are state variables representing the portraits, names and text colors for the
# currently active characters
face1 = -1
face2 = -1
name1 = "Nobody"
name2 = "Nobody"
color1 = -1
color2 = -1

# open the output htm file and write out the headers for the stylesheets
out = open("./output/td1.htm", "w")
out.write('<link href="td1.css" rel="stylesheet">\n')
out.write('<link href="pal.css" rel="stylesheet">\n')

# what was the last text color value?
lastcolor = None
inpara = False

# Begin walking the dialogue file
index = 0
for line in dlg:
	# 0x06 means a new scene is beginning
	if line[0] == 0x06:
		print ("Scene begin")		
		# Write a new conversation tag to open the scene
		out.write('<div class="conversation">\n')
	# 0x01 changes portraits
	elif line[0] == 0x01:
		print ("Face change")
		# Byte 1 (counting from zero) denotes which of the faces is changing
		if line[1] == 0x01:
			print ("Left face: " + str(line[2]) + " - " + line[4:].decode("ascii"))
			
			# byte 2 is the portrait index within the portrait file, one-indexed.
			face1 = str(line[2]-1)
			# the name is everything until the end of the line
			name1 = line[4:].decode("ascii").replace('\r', '')
			# byte 3 is the palette index
			color1 = line[3]
		# ditto for face 2
		elif line[1] == 0x02:
			print ("Right face: " + str(line[2]) + " - " + line[4:].decode("ascii"))
			face2 = str(line[2]-1)
			name2 = line[4:].decode("ascii").replace('\r', '')
			color2 = line[3]
			
		# Look ahead to the next line. If it's not a portrait change, then we can
		# go ahead and write out the new portraits.
		if dlg[index+1][0] != 0x01:
			# If we're in the middle of a paragraph, close it first
			if inpara is True:
				out.write("\t</div>\n")
				inpara = False
				lastcolor = None
			
			out.write("""\t<div class="faces">
		<img class="left" src="TD1/FACE{face1}.BMP">
		<img class="right" src="TD1/FACE{face2}.BMP">
	</div>
	<div class="names">
		<div class="name1 color{color1}">{name1}</div>
		<div class="name2 color{color2}">{name2}</div>
	</div>\n
	<hr class="names">\n""".format(face1=face1, face2=face2, name1=name1, name2=name2, color1=color1, color2=color2))
	
	
	# We're writing a line of dialogue. This would be more complicated if
	# I was letting the browser wrap the text or using a <pre> with hard linebreaks,
	# but I chose to make the HTML match the original file very closely. 
	# This means there's one <div> for every single line of text. This is much more
	# compatible with a state machine approach; concatenating consecutive lines of text
	# would be messier code.
	
	# Almost every time the currently active character changes, the text is surrounded
	# by empty lines on either side. I don't know why, and it's not consistent. This is
	# why the != 0xd is in the condition below: on a blank line byte 4 is the trailing \r.
	elif line[0] == 0x02 and line[4] != 0xd:
		# Byte 2 is the palette index for the text. We will output this as a
		# CSS class which matches the pal.css file.
		color = line[2]
		
		# In a lot of cases, when the currently active speaker changes, the instruction 0x05
		# is written. But this isn't consistent, for reasons I don't know. Sometimes - and it
		# seems like it might be when characters are interrupting each other - there isn't one.
		# Changing color is the only way we can know who's talking, so we keep track of the color
		# used in the last line of text, and if this one is different, we create a new paragraph.
		if color != lastcolor:
			# If lastcolor is None, it means we're at the beginning of a new scene, so the
			# new scene code closed out the preceding tags and we don't need to.
			if lastcolor is not None:
				out.write("\t</div c>\n")
			lastcolor = color
			
			# Decide whether the text should be left or right aligned, based on which portrait
			# it should fit with
			align = "left"
			if color == color1:
				align = "left"
			if color == color2:
				align = "right"
			
			# Write the new paragraph
			out.write("""\t<div class="paragraph color{color} {align}">\n""".format(color=color, align=align))
		
		print (line[3:].decode("ascii").strip())
		
		# Write out the line of text, replacing double quotes with single quotes (the DAT file uses " where an apostrophe belongs)
		out.write("""\t\t<div class="line">{line}</div>\n""".format(line=line[3:].decode("ascii").strip().replace('"', "'")))
		
		inpara = True
	elif line[0] == ord('d'):
		# Scene is ending. Reset lastcolor to none and close out open tags.
		lastcolor = None
		
		out.write('\t</div e>\n') # close the last paragraph
		out.write('</div>\n') # close the last conversation
		out.write("<hr>\n") # horizontal rule to separate the scenes
		
		inpara = False
	else:
		pass
	
	index = index + 1
	
# that's it, it's done!
out.close()

With sorting:

# open as binary or you'll get errors
raw_dlg = open("DIALOGUE.DAT", "rb").read()

# str-style .split('\r') raises a TypeError on binary data (raw_dlg.split(b'\n') would
# have worked), so i reimplemented it by hand
dlg = []
# since python only vaguely has types, you have to use a function to create
# the binary equivalent of a string. you can treat bytearray() mostly like a string though.
line = bytearray()
for byte in raw_dlg:
  # 10 is the ascii code for \n (line feed); the \r before it stays on the end of the line
  if byte != 10:
    line.append(byte)
  else:
    # we hit a line feed so the line is over
    dlg.append(line)
    line = bytearray()

# these are state variables representing the portraits, names and text colors for the
# currently active characters
face1 = -1
face2 = -1
name1 = "Nobody"
name2 = "Nobody"
color1 = -1
color2 = -1

# open the output htm file and write out the headers for the stylesheets
out = open("./output/td1.htm", "w")
out.write('<link href="td1.css" rel="stylesheet">\n')
out.write('<link href="pal.css" rel="stylesheet">\n')

# we'll store each conversation here so we can sort them later
conversations = {}

# what was the last text color value?
lastcolor = None
inpara = False

# This will hold the set of lines that assemble the current conversation
outline = []
# This will hold the ID of the current conversation
id = -1
# This is the current index of the dialogue so we can index the array to look ahead
index = 0

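# Debug dump of every raw line, one repr per row (the output filename is arbitrary)
rawo = open("./output/raw.txt", "w")

# Begin walking the dialogue file
for line in dlg:
  rawo.write(str(line) + "\n")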
  
  # 0x06 means a new scene is beginning
  if line[0] == 0x06:
    print ("Scene begin")   
    # get the conversation id
    id = line[1]
    
    # Write a new conversation tag to open the scene
    outline.append('<div class="conversation" id="{id}">\n'.format(id=id))
  # 0x01 changes portraits
  elif line[0] == 0x01:
    print ("Face change")
    # Byte 1 (counting from zero) denotes which of the faces is changing
    if line[1] == 0x01:
      print ("Left face: " + str(line[2]) + " - " + line[4:].decode("ascii"))
      
      # byte 2 is the portrait index within the portrait file, one-indexed.
      face1 = str(line[2]-1)
      # the name is everything until the end of the line
      name1 = line[4:].decode("ascii").replace('\r', '')
      # byte 3 is the palette index
      color1 = line[3]
    # ditto for face 2
    elif line[1] == 0x02:
      print ("Right face: " + str(line[2]) + " - " + line[4:].decode("ascii"))
      face2 = str(line[2]-1)
      name2 = line[4:].decode("ascii").replace('\r', '')
      color2 = line[3]
      
    # Look ahead to the next line. If it's not a portrait change, then we can
    # go ahead and write out the new portraits.
    if dlg[index+1][0] != 0x01:
      # If we're in the middle of a paragraph, close it first
      if inpara is True:
        outline.append("\t</div>\n")
        inpara = False
        lastcolor = None
      
      # Write new portraits
      outline.append("""\t<div class="faces">
    <img class="left" src="TD1/FACE{face1}.BMP">
    <img class="right" src="TD1/FACE{face2}.BMP">
  </div>
  <div class="names">
    <div class="name1 color{color1}">{name1}</div>
    <div class="name2 color{color2}">{name2}</div>
  </div>\n
  <hr class="names">\n""".format(face1=face1, face2=face2, name1=name1, name2=name2, color1=color1, color2=color2))
  
  
  # We're writing a line of dialogue. This would be more complicated if
  # I was letting the browser wrap the text or using a <pre> with hard linebreaks,
  # but I chose to make the HTML match the original file very closely. 
  # This means there's one <div> for every single line of text. This is much more
  # compatible with a state machine approach; concatenating consecutive lines of text
  # would be messier code.
  
  # Almost every time the currently active character changes, the text is surrounded
  # by empty lines on either side. I don't know why, and it's not consistent. This is
  # why the != 0xd is in the condition below: on a blank line byte 4 is the trailing \r.
  elif line[0] == 0x02 and line[4] != 0xd:
    # Byte 2 is the palette index for the text. We will output this as a
    # CSS class which matches the pal.css file.
    color = line[2]
    
    # In a lot of cases, when the currently active speaker changes, the instruction 0x05
    # is written. But this isn't consistent, for reasons I don't know. Sometimes - and it
    # seems like it might be when characters are interrupting each other - there isn't one.
    # Changing color is the only way we can know who's talking, so we keep track of the color
    # used in the last line of text, and if this one is different, we create a new paragraph.
    if color != lastcolor:
      # If lastcolor is None, it means we're at the beginning of a new scene, so the
      # new scene code closed out the preceding tags and we don't need to.
      if lastcolor is not None:
        outline.append("\t</div c>\n")
      lastcolor = color
      
      # Decide whether the paragraph should be left or right aligned, based on which portrait
      # it should fit with
      align = "left"
      if color == color1:
        align = "left"
      if color == color2:
        align = "right"
      
      # Write the new paragraph
      outline.append("""\t<div class="paragraph color{color} {align}">\n""".format(color=color, align=align))
    
    print (line[3:].decode("ascii").strip())
    
    # Write out the line of text, replacing double quotes with single quotes (the DAT file uses " where an apostrophe belongs)
    outline.append("""\t\t<div class="line">{line}</div>\n""".format(line=line[3:].decode("ascii").strip().replace('"', "'")))
    
    inpara = True
  elif line[0] == ord('d'):
    # Scene is ending. Reset lastcolor to none and close out open tags,
    # then save the conversation to the conversation dict
    lastcolor = None
    outline.append('\t</div e>\n') # close the last paragraph
    outline.append('</div>\n') # close the last conversation
    outline.append("<hr>\n") # horizontal rule to separate the scenes
    
    # Add this conversation to the collection of conversations
    conversations[id] = outline
    # Clear the output buffer
    outline = []
    
    # Signal that the last paragraph was closed
    inpara = False
  else:
    pass
  
  index = index + 1
  
# that's it, it's done!
print (sorted(conversations.keys()))

# Sort the conversations, then write them to the html file
for id in sorted(conversations.keys()):
  out.write(''.join(conversations[id]))
out.close()

Contact me at articles@gekk.info - I would love to hear your input, stories, etc.
