James Randall Musings on software development, business and technology.
C# / Blazor Wolfenstein - Part 5 - Decoding Maps and Performance

Decoding Maps

Having loaded the maps as byte arrays I now needed to decompress them. The maps in Wolfenstein are compressed twice: first using a fairly standard run-length encoding algorithm and then using “Carmack compression”, which basically allows repeated blocks of data to be stored once and then referenced through a near or far pointer. Decoding therefore runs the other way round: undo the Carmack compression first, then expand the run-length encoding.
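To make the two stages concrete, here is a minimal sketch of both expansion steps. It is not the code from Level.cs. It assumes the layout used by the original id Software source: Carmack data starts with a 16-bit length in bytes, the pointer tags are 0xA7 (near) and 0xA8 (far) in the high byte of a word, and the run-length tag is supplied by the caller. The names MapDecoder, CarmackExpand and RlewExpand are hypothetical.

```csharp
using System;

public static class MapDecoder
{
    // Assumed tag values for near and far pointers.
    private const byte NearTag = 0xA7;
    private const byte FarTag = 0xA8;

    // Expands Carmack-compressed bytes into 16-bit words.
    public static ushort[] CarmackExpand(byte[] source)
    {
        int inPos = 0;
        int wordCount = ReadWord(source, ref inPos) / 2; // leading length is in bytes
        var output = new ushort[wordCount];
        int outPos = 0;

        while (outPos < wordCount)
        {
            byte low = source[inPos++];
            byte high = source[inPos++];

            if (high != NearTag && high != FarTag)
            {
                // Ordinary word: copy it through unchanged.
                output[outPos++] = (ushort)((high << 8) | low);
            }
            else if (low == 0)
            {
                // Escape: a literal word whose high byte happens to be a tag.
                output[outPos++] = (ushort)((high << 8) | source[inPos++]);
            }
            else
            {
                // Near pointers count back from the write position (in words),
                // far pointers are absolute word offsets into the output.
                int copyFrom = high == NearTag
                    ? outPos - source[inPos++]
                    : ReadWord(source, ref inPos);
                for (int i = 0; i < low && outPos < wordCount; i++)
                    output[outPos++] = output[copyFrom++];
            }
        }
        return output;
    }

    // Expands run-length encoded words: tag, count, value becomes count copies of value.
    public static ushort[] RlewExpand(ushort[] source, int start, int wordCount, ushort tag)
    {
        var output = new ushort[wordCount];
        int inPos = start, outPos = 0;
        while (outPos < wordCount)
        {
            ushort value = source[inPos++];
            if (value != tag) { output[outPos++] = value; continue; }
            int count = source[inPos++];
            ushort repeated = source[inPos++];
            for (int i = 0; i < count && outPos < wordCount; i++)
                output[outPos++] = repeated;
        }
        return output;
    }

    private static int ReadWord(byte[] data, ref int pos)
    {
        int value = data[pos] | (data[pos + 1] << 8); // little-endian
        pos += 2;
        return value;
    }
}
```

If the files follow the original layout, the first word of the Carmack output is the run-length expanded size, so a plane would be decoded with something like RlewExpand(CarmackExpand(bytes), 1, 64 * 64, tag), where tag comes from the map header.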

The Wolfenstein map data is essentially contained on two planes (each being an array of bytes for the 64x64 map). Plane zero contains information about the walls and doors while plane one contains information about game objects (enemies, decoration, treasure etc.). I’ve not yet tackled the game object layer other than to pull out the player’s starting location (expressed as a game object) and the enemy turning points (we’ll come back to these later when we look at the AI), which I found in the F# version to be simpler to deal with as part of the wall structures.
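As an illustration of how a decoded plane can be read, the sketch below looks up a cell and scans the object plane for the player start. It makes some assumptions that are not stated above: each cell is treated as a 16-bit value, cells are stored row by row, and object codes 19 to 22 mark the player start facing north, east, south and west. MapPlanes, TileAt and FindPlayerStart are hypothetical names.

```csharp
using System;

public static class MapPlanes
{
    public const int MapSize = 64;

    // Assumed object codes: 19..22 = player start facing N, E, S, W.
    private const ushort PlayerStartFirst = 19;
    private const ushort PlayerStartLast = 22;

    // Planes are assumed to be stored row by row, so a cell lives at y * 64 + x.
    public static ushort TileAt(ushort[] plane, int x, int y) => plane[y * MapSize + x];

    // Returns the first player start found on the object plane, or null if there is none.
    public static (int X, int Y, int Facing)? FindPlayerStart(ushort[] objectPlane)
    {
        for (int y = 0; y < MapSize; y++)
        {
            for (int x = 0; x < MapSize; x++)
            {
                ushort value = TileAt(objectPlane, x, y);
                if (value >= PlayerStartFirst && value <= PlayerStartLast)
                    return (x, y, value - PlayerStartFirst);
            }
        }
        return null;
    }
}
```

Facing comes back as 0 to 3 in the order north, east, south, west under the assumption above.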

In my F# version I used an unfold to do the decompression, which I was always a little on the fence about. When I look at the resulting code, it’s one of those cases where driving towards an immutable, pure-function solution results (in my view, and with my basic F# skills) in code that is much less readable than the more mutable approach I took in the C# version. Were I to revisit the F# version I think I would take the same approach as I have here. It’s just far more readable.

While doing this I really started to rub up against what I consider (to my F# brain anyway) to be the lack of expressiveness in C#. Some of this code would work really nicely with a pipe operator and curried functions. You can sort of simulate this with a fluent style approach in C#, but when your return types are basic types (byte arrays, tuples etc.) what I really want is to have the extensions as local static functions so they are rightly scoped. But that’s not possible: extension methods have to be declared in a top-level static class, so they can’t be local functions and they can’t live in nested classes either. So you end up with a sub-namespace or not bothering. I chose to not bother in some cases.
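A small sketch of the trade-off: a generic Pipe extension gives something close to F#'s pipe operator, but it has to sit in a top-level static class, while the steps being piped can be static local functions. PipeExtensions, PipeDemo and SumOfDoubled are hypothetical names for illustration and do not come from the project.

```csharp
using System;

// Extension methods must be declared in a top-level, non-generic static class.
public static class PipeExtensions
{
    // Rough stand-in for F#'s |> operator: value.Pipe(f) is f(value).
    public static TResult Pipe<T, TResult>(this T value, Func<T, TResult> next) => next(value);
}

public static class PipeDemo
{
    public static int SumOfDoubled(byte[] data)
    {
        // Static local functions keep the helpers tightly scoped,
        // but they cannot themselves be declared as extension methods.
        static int[] Doubled(byte[] bytes) => Array.ConvertAll(bytes, b => b * 2);
        static int Sum(int[] values)
        {
            int total = 0;
            foreach (int v in values) total += v;
            return total;
        }

        return data.Pipe(Doubled).Pipe(Sum);
    }
}
```

The cost is that Pipe is visible everywhere the namespace is imported, which is exactly the scoping problem described above.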

On the more positive side of things I’m finding the updates to tuples to be incredibly useful.

In any case you can find all this code in Level.cs.

Performance

As part of the map decoding I realised I hadn’t loaded the sprites for the player weapon (the hand holding the gun / knife). These are (unclipped) basically the size of the viewport and when I rendered these through my simple RenderTexture method performance tanked down to around 25fps on a release build.

Although I’m not planning on using this method much I figured there would be some interesting learnings I could apply to the main renderer by optimising this.

Essentially I switched over to using pointers for both writing to the output buffer and reading from the texture buffer, and where I could precalculate things in the outer loop I did so. I made these changes incrementally (first the output buffer, then the input buffer, then the precalculations) and performance stepped from around 25fps to 45fps, to 50fps and then to 60fps, all on a release build. The method now looks like this:

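private void RenderTexture(Texture texture, int x, int y)
{
    // No clipping is done here: the texture must fit inside the buffer at (x, y).
    unsafe
    {
        // Pin both arrays so the GC cannot move them while we hold raw pointers.
        fixed (uint* destPtr = _buffer)
        {
            fixed (uint* fixedSrcPtr = texture.Pixels)
            {
                // Start of the first destination row, calculated once up front.
                uint* destRowPtr = destPtr + (y * _width) + x;
                // The source is read strictly in order, so one running pointer is enough.
                uint* srcPtr = fixedSrcPtr;
                for (int row = 0; row < texture.Height; row++)
                {
                    uint* drawPtr = destRowPtr;
                    for (int col = 0; col < texture.Width; col++)
                    {
                        uint color = *srcPtr++;
                        // Skip transparent pixels but still advance the write position.
                        if (!Pixel.IsTransparent(color))
                            *drawPtr++ = color;
                        else
                            drawPtr++;
                    }
                    // Move down one full row of the output buffer.
                    destRowPtr += _width;
                }
            }
        }
    }
}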
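For contrast, a straightforward indexed version of the same blit might look like the sketch below. This is a hypothetical baseline, not the earlier version of the method: it takes plain arrays rather than the Texture type, and it assumes a pixel is transparent when its top (alpha) byte is zero. It shows the per-pixel work the pointer version removes: two index multiplications and two bounds-checked array accesses for every pixel.

```csharp
using System;

public static class BlitBaseline
{
    // Assumption: a pixel counts as transparent when its top (alpha) byte is zero.
    public static bool IsTransparent(uint color) => (color >> 24) == 0;

    // Copies src (srcWidth x srcHeight) into dest at (x, y), skipping transparent pixels.
    public static void RenderIndexed(
        uint[] dest, int destWidth,
        uint[] src, int srcWidth, int srcHeight,
        int x, int y)
    {
        for (int row = 0; row < srcHeight; row++)
        {
            for (int col = 0; col < srcWidth; col++)
            {
                // Both indices are recomputed from scratch for every pixel.
                uint color = src[row * srcWidth + col];
                if (!IsTransparent(color))
                    dest[(y + row) * destWidth + (x + col)] = color;
            }
        }
    }
}
```

Both versions should produce the same output for a texture that fits inside the buffer, so a sketch like this can double as a reference when checking the unsafe one.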

Something I might need to think about for the render loop is the orientation of the frame buffer. Wolfenstein essentially renders column by column, which would require the frame buffer to be oriented the same way to enable me to do a simple ptr++ type approach with no additional calculations. But by the time I get back to sending the byte array back to Skia it needs to be in a row by row orientation. It might be that the best thing to do is arrange the frame buffer column wise and then rotate it in Skia land.
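A minimal sketch of the column-wise idea, using safe array indexing for clarity: pixels in the same column sit next to each other in memory, so drawing a vertical strip is a simple increment, and a separate pass converts to the row by row layout. ColumnBuffer, FillColumn and ToRowMajor are hypothetical names, and the CPU-side transpose is only one option (the alternative being to let Skia do the rotation).

```csharp
using System;

public static class ColumnBuffer
{
    // Fills rows top..bottom-1 of column x. The buffer is laid out column by column,
    // so the pixel at (x, y) lives at x * height + y and the index just increments.
    public static void FillColumn(uint[] columnMajor, int height, int x, int top, int bottom, uint color)
    {
        int index = x * height + top;
        for (int y = top; y < bottom; y++)
            columnMajor[index++] = color;
    }

    // Converts a column-major buffer into the row-major layout (y * width + x).
    public static uint[] ToRowMajor(uint[] columnMajor, int width, int height)
    {
        var rowMajor = new uint[width * height];
        for (int x = 0; x < width; x++)
        {
            for (int y = 0; y < height; y++)
                rowMajor[y * width + x] = columnMajor[x * height + y];
        }
        return rowMajor;
    }
}
```

The transpose touches every pixel once per frame, so whether this beats writing directly into a row-major buffer is something to measure rather than assume.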

I’ll start with the simple case and then look at this.

Excitingly this is what we’ll be doing next!

If you want to discuss this or have any questions then the best place is on GitHub. I’m only really using Twitter to post updates to my blog these days.