May 22

Lately I’ve had a bunch of people ask how and where I learned how to optimize my game. Well, almost all the optimization techniques I know, I’ve gleaned by reading everything I can on game engine design: blogs, tutorials, forum threads, and books. The rest is just algorithm refactoring based on profiling results.

If you’re using a pre-built game engine, a lot of drawing optimization stuff is already taken care of for you. If you’re building your own engine, though, back-face culling is one of the first things you should turn on. Don’t render triangles you can’t see. Speaking of triangles you can’t see, don’t render anything that is outside of the camera’s viewing frustum. Here are some techniques for frustum culling: http://www.lighthouse3d.com/tutorials/view-frustum-culling/.
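As a rough illustration of the frustum test, here is a minimal sketch of a sphere-versus-plane check in Java. The class name and the plane layout ({a, b, c, d} with unit normals pointing into the frustum) are assumptions made for this example; the linked tutorial covers how to extract the planes.

```java
// Hypothetical culling helper. Each plane is {a, b, c, d} with a unit-length normal
// (a, b, c) pointing into the frustum, so inside points satisfy a*x + b*y + c*z + d >= 0.
final class FrustumCuller {

    /** Returns false only when the sphere lies entirely behind at least one plane. */
    static boolean sphereVisible(float[][] planes, float cx, float cy, float cz, float radius) {
        for (float[] p : planes) {
            // Signed distance from the sphere center to the plane.
            float distance = p[0] * cx + p[1] * cy + p[2] * cz + p[3];
            if (distance < -radius) {
                return false; // completely outside this plane, safe to skip drawing
            }
        }
        return true; // inside or intersecting; conservative near frustum corners
    }
}
```

Testing a bounding sphere per object (or per chunk) is usually much cheaper than testing every triangle, which is why the check is done at that level.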

Drawing with immediate mode might be fine for prototyping, but it will bog down quickly when drawing many triangles. Instead of immediate mode, use display lists, vertex arrays or vertex buffer objects (VBOs). Additionally, indexing vertices can cut down on the amount of data that needs to be sent across the bus: if a vertex is shared by several triangles, it only needs to be sent once. Stay off the bus as much as possible. If you have to draw many models that are the same, but may have different positions, animations and skins, check if your target hardware supports geometry instancing. On older hardware, interleaving your buffer data may increase performance, but it is not necessarily a win on today’s graphics hardware.
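To make the indexing point concrete, here is a minimal sketch in plain Java (no OpenGL calls) that collapses a "triangle soup" into unique vertices plus an index list. The class and method names are hypothetical, and matching vertices by exact float equality is a simplifying assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: collapses duplicate vertices into a unique list plus an index list.
final class VertexIndexer {

    /** Result of indexing: unique vertex positions and one index per input vertex. */
    static final class IndexedMesh {
        final List<float[]> vertices = new ArrayList<>();
        final List<Integer> indices = new ArrayList<>();
    }

    /** Each input vertex is {x, y, z}; exact float equality decides what counts as shared. */
    static IndexedMesh index(float[][] triangleSoup) {
        IndexedMesh mesh = new IndexedMesh();
        Map<List<Float>, Integer> seen = new HashMap<>();
        for (float[] v : triangleSoup) {
            List<Float> key = Arrays.asList(v[0], v[1], v[2]);
            Integer existing = seen.get(key);
            if (existing == null) {
                // First time this position appears: store it and remember its slot.
                existing = mesh.vertices.size();
                mesh.vertices.add(v.clone());
                seen.put(key, existing);
            }
            mesh.indices.add(existing);
        }
        return mesh;
    }
}
```

For a quad built from two triangles, this turns six vertices into four vertices and six small indices; the savings grow with meshes where most vertices are shared.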

Minimizing GPU state changes can increase performance. Using a texture atlas and sorting meshes by texture can reduce texture switching. Use a texture atlas for your GUI. If you’re writing your own engine, write a state container that keeps a CPU-side copy of the GPU state. Checking that container before switching state lets you skip redundant state changes.
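A minimal sketch of such a state container, limited to texture bindings. The names are hypothetical, and the TextureBinder interface is a stand-in for whatever graphics API call actually performs the bind.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical CPU-side mirror of the GPU's texture bindings.
final class GpuStateCache {

    /** Stand-in for the real graphics API call that binds a texture to a unit. */
    interface TextureBinder {
        void bind(int unit, int textureId);
    }

    private final Map<Integer, Integer> boundTextures = new HashMap<>();
    private final TextureBinder binder;
    private int redundantCalls;

    GpuStateCache(TextureBinder binder) {
        this.binder = binder;
    }

    /** Binds only when the cached state differs; returns true if a real bind happened. */
    boolean bindTexture(int unit, int textureId) {
        Integer current = boundTextures.get(unit);
        if (current != null && current == textureId) {
            redundantCalls++; // already bound, skip the driver call
            return false;
        }
        binder.bind(unit, textureId);
        boundTextures.put(unit, textureId);
        return true;
    }

    /** How many binds were skipped because the state was already set. */
    int redundantCalls() {
        return redundantCalls;
    }
}
```

The same pattern extends to blend modes, shaders, and buffers, as long as every state change goes through the container so the cached copy never drifts from the real GPU state.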

If you’re creating a voxel engine, there are some extra optimizations that can be done. Don’t render cube faces that are not visible. Split your terrain into multiple chunks. Don’t render chunks that are empty and don’t render chunks that are completely full but not exposed. If you’re using colored voxels as opposed to textured voxels, you can combine faces of adjacent voxels.
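The hidden-face rule can be sketched as a neighbor check: a face is only worth drawing if the voxel next to it is empty. The class below is a hypothetical example, and treating cells outside the chunk as empty is a simplifying assumption; a real engine would probably consult the neighboring chunk.

```java
// Hypothetical chunk helper: counts the cube faces that actually need to be drawn.
final class VoxelFaces {

    // Offsets to the six neighbors: -x, +x, -y, +y, -z, +z.
    private static final int[][] NEIGHBORS = {
        {-1, 0, 0}, {1, 0, 0}, {0, -1, 0}, {0, 1, 0}, {0, 0, -1}, {0, 0, 1}
    };

    /** solid[x][y][z] is true for filled voxels; cells outside the chunk count as empty. */
    static int countVisibleFaces(boolean[][][] solid) {
        int faces = 0;
        for (int x = 0; x < solid.length; x++) {
            for (int y = 0; y < solid[x].length; y++) {
                for (int z = 0; z < solid[x][y].length; z++) {
                    if (!solid[x][y][z]) {
                        continue; // empty voxels have no faces
                    }
                    for (int[] n : NEIGHBORS) {
                        // A face is visible only when it borders an empty cell.
                        if (!isSolid(solid, x + n[0], y + n[1], z + n[2])) {
                            faces++;
                        }
                    }
                }
            }
        }
        return faces;
    }

    private static boolean isSolid(boolean[][][] s, int x, int y, int z) {
        return x >= 0 && x < s.length
            && y >= 0 && y < s[x].length
            && z >= 0 && z < s[x][y].length
            && s[x][y][z];
    }
}
```

A chunk that returns zero here is either empty or completely buried, which matches the two chunk-skipping cases described above.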

Apr 30

After converting the scripted data elements into pure TML data elements, I decided to also convert the noise function definitions to pure TML data definitions. Before doing this, however, I wanted to consolidate and expand my noise library. I was using value noise and had noise code spread out all over the project.

In my noise research, I ran across Joshua Tippetts’ Accidental Noise Library written in C++. It is modular and supports 2d, 3d, 4d and 6d noise. All the d’s you’d ever need. Also, its modular nature lends itself well to the data-driven function chaining that I want my modders to have access to.
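To give a feel for what function chaining means here, below is a minimal sketch in Java. It is not the Accidental Noise Library or Joise API; every name is hypothetical, and the hash-based source is only a stand-in for a real noise basis.

```java
// Hypothetical function-chaining sketch; NOT the Joise or Accidental Noise Library API.
final class NoiseChain {

    /** One link in the chain: maps a 2d coordinate to a value. */
    interface NoiseFunction {
        double get(double x, double y);
    }

    /** Stand-in basis: a repeatable pseudo-random value in [0, 1) per integer grid cell. */
    static NoiseFunction hashSource(long seed) {
        return (x, y) -> {
            long h = seed;
            h = h * 31 + (long) Math.floor(x);
            h = h * 31 + (long) Math.floor(y);
            h ^= (h >>> 33);
            h *= 0xff51afd7ed558ccdL;
            h ^= (h >>> 33);
            return (h >>> 11) / (double) (1L << 53);
        };
    }

    /** Scales the input coordinates, which changes the feature size of the source. */
    static NoiseFunction scaleDomain(NoiseFunction source, double frequency) {
        return (x, y) -> source.get(x * frequency, y * frequency);
    }

    /** Multiplies the output by scale, then adds bias. */
    static NoiseFunction scaleBias(NoiseFunction source, double scale, double bias) {
        return (x, y) -> source.get(x, y) * scale + bias;
    }

    /** Limits the output to the range [low, high]. */
    static NoiseFunction clamp(NoiseFunction source, double low, double high) {
        return (x, y) -> Math.max(low, Math.min(high, source.get(x, y)));
    }
}
```

Because every module takes a source and returns a new function, a chain such as clamp(scaleBias(scaleDomain(hashSource(42L), 0.1), 2.0, -1.0), -0.5, 0.5) reads like a data description, which is what makes this style a good fit for data-driven definitions.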

During the process of porting the lib to Java, I soon wished I had a way to visualize how the functions were chained and be able to adjust parameters and chaining while seeing the results in real-time. That led to the development of the visual layer you see above with the blue grid background.

The visual layer let me test new modules / functions and quickly see the results. Since I would be using the chained noise functions predominantly for terrain generation (2d value maps), I also wanted a way to see what the terrain might look like without having to export the chain, load up the game and click through to see it. That led to the blocky 3d preview you see above.

The 3d preview runs as a server listening for connections from the 3d output modules in the visual layer. Whenever it receives a new array of values, it rebuilds and displays the output accordingly. This lets me put the 3d preview on any machine, and just leave it running.

I think this is going to dramatically cut down the time spent tweaking noise.

The visual layer and 3d preview are not freely available; however, the underlying noise lib is open-source: https://github.com/codetaylor/Joise

Apr 26

Tuple Markup Language

tl;dr I rewrote Google’s Gson JSON library for use with TML. If you use Java and want clean, human-readable/editable data, use it.

https://github.com/codetaylor/Juple

Initially I decided to expose game data via scripts, or rather through a Java API. The scripts were just Java classes that got loaded dynamically at run-time. Now keep in mind I’m just talking about data, not logic. This seemed like an awesome idea at first. Then I started using it.

It quickly became apparent that there was too much overhead, boilerplate, cruft-shifting type coding going on. If I wanted to add a property to something, first I had to define it in the API, then go write the code to handle populating the value in-game, and finally, go modify the script. On top of that, there was the Javadoc for the API that would have to be kept up to date, not only for the data mapping methods, but also for the logic.

If I have a game entity that has four values, I don’t want to have to define a package, define imports, think of a clever class name (or concoct a new naming convention), write an interface in the API, write the class loading code, write the definition handling code, update the Javadoc… ugh. Monolithic Sisyphus syndrome.

After a little thought, I decided to split the data definition from the scripting altogether. Scripts will now simply exist for logic and data will be defined without logic. Not to say that data will be generated illogically. A sniper? Let’s give her a hand-to-hand combat bonus. What? I didn’t see an ‘if-then’! No no… I digress.

The first tech that came to mind was XML. I played about with XML for a bit and then moved on to JSON. XML felt too cumbersome for what I imagined and JSON was a move in the right direction. Google has a pretty robust, open-source Java JSON lib called Gson, and I really liked how easy it was to use and extend. Then I ran across TML…

TML, or Tuple Markup Language, is a minimalist, all-purpose markup language created by John Judnich. I thought to myself, “That’s it! That’s what I’m going to use!” I delved into the GitHub repo… OK: C, C++, JavaScript, Python. No lib for Java. So sad. Back to JSON I went, sighing all the way.

I was writing stuff like this:

{
    "first name": "John",
    "last name": "Smith",
    "age": 25,
    "address": {
        "street address": "21 2nd Street",
        "city": "New York",
        "state": "NY",
        "postalCode": 10021
    }
}

…while dreaming about stuff like this:

[
    [first name | John]
    [last name | Smith]
    [age | 25]
    [address |
        [street address | 21 2nd Street]
        [city | New York]
        [state | NY]
        [postalCode | 10021]
    ]
]

I felt like I was painting with rocks. I just had to have the power and flexibility of the Gson lib, but for TML.

That’s why I rewrote Gson for TML… FTW.

It’s free, open-source, and has some decent docs. Check it out.

Apr 7

My friend is about halfway done with a ship model and texture for the new game screen. Today I dropped it in to get an idea of how it will look.

Mar 24

It’s been almost two months since I started a major refactor on the context, input, state, asset, scene graph, shader and render systems. I’m pleased to say that the engine is just about back to where it was before I started, with some very beneficial added functionality.

Context System

The rendering context now exists in its own thread and fires events to its listener, the main application. Some of the event methods include initialize(), update(), restart(), stop() and destroy(). The other system components can then be established with access to the rendering context in the listener’s initialize() method. Inside the main game loop, the update() method, all of the other system components are updated. Changing the display properties calls the restart() method causing the render manager and GUI to update with respect to the new properties. The restart() method allows the context to be restarted at any time during the execution of the application without interruption; basically changing any display property will no longer force a restart of the entire application. When the player requests to exit the application, the stop() method is called and after the context is destroyed, the destroy() method is called.

Input System

The old input system was very concrete, inflexible, and too many other systems were dependent on a specific output from the input manager. This became problematic when refactoring the core systems that were too tightly integrated with the input. The new input system uses components that separate the underlying input messages from the output. Keys, buttons, and axes can be mapped to the input system as triggers for defined events and listeners can be registered to listen for those events. Keyboard and mouse input components have been written and other components can be easily written and implemented in the future. Components for joystick or touch-based input could be swapped in without changing any game code. Additionally, this redesign makes user defined controls very easy to implement.
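The trigger-to-event idea can be sketched roughly like this. It is a simplified illustration with hypothetical names (triggers and events are plain strings here), not the engine's actual input code.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of trigger-to-event mapping; names are illustrative only.
final class InputMapper {

    /** Game code implements this and never sees device-specific key or button codes. */
    interface EventListener {
        void onEvent(String eventName, float value);
    }

    private final Map<String, String> triggerToEvent = new HashMap<>();
    private final Map<String, List<EventListener>> listeners = new HashMap<>();

    /** Maps a device-specific trigger such as "KEY_SPACE" to a game-level event such as "jump". */
    void map(String trigger, String eventName) {
        triggerToEvent.put(trigger, eventName);
    }

    void addListener(String eventName, EventListener listener) {
        listeners.computeIfAbsent(eventName, k -> new ArrayList<>()).add(listener);
    }

    /** Called by an input component (keyboard, mouse, joystick) when a trigger fires. */
    void fire(String trigger, float value) {
        String eventName = triggerToEvent.get(trigger);
        if (eventName == null) {
            return; // unmapped trigger, nothing to notify
        }
        List<EventListener> targets = listeners.getOrDefault(eventName, Collections.emptyList());
        for (EventListener listener : targets) {
            listener.onEvent(eventName, value);
        }
    }
}
```

With this split, user-defined controls amount to calling map() with different triggers, and a new device only needs a component that calls fire().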

State System

Game states should not be a set of enums in a switch managed by a singleton. Singletons are like drugs; use one and pretty soon you’re using them all the time, all over the place. It wasn’t long before I had state changes being triggered from code that shouldn’t care about state. This design is naive and gets out of hand rapidly, so needless to say, the old state system was stupid. The new system uses state objects that are initialized and updated in the rendering thread, giving the states access to the application systems and the scene graph. States can encapsulate different portions of game logic; they are kept tidy and clean up after themselves. For example I have a state to control display and interaction with the main menu, a state to display and update the scene graph objects (planet, skybox) in the main menu, and a state that starts the game server and listens and reacts to state changes within the server. The state objects can now be cleanly used to separate different kinds of game logic.
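As a rough sketch of what state objects look like compared to an enum-and-switch design, consider the following. The interface and manager names are hypothetical, and threading is left out entirely.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical state manager: states are objects with a lifecycle, not enum cases.
final class StateManager {

    /** A self-contained slice of game logic that sets itself up and cleans up after itself. */
    interface GameState {
        void initialize();
        void update(float deltaSeconds);
        void cleanup();
    }

    private final List<GameState> active = new ArrayList<>();

    /** Initializes the state and starts updating it every frame. */
    void attach(GameState state) {
        state.initialize();
        active.add(state);
    }

    /** Stops updating the state and lets it release whatever it created. */
    void detach(GameState state) {
        if (active.remove(state)) {
            state.cleanup();
        }
    }

    /** Updates a snapshot of the list so states may attach or detach others while updating. */
    void update(float deltaSeconds) {
        for (GameState state : new ArrayList<>(active)) {
            state.update(deltaSeconds);
        }
    }
}
```

Several states can be active at once, which is how a menu UI state and a menu scene state can run side by side without knowing about each other.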

Asset System

The decision to undergo a major refactor stemmed from the need for better asset management. The old asset system loaded assets like images, fonts and models into memory and kept a hard reference. That’s it. There wasn’t even a way to unload dead assets. The new asset system leverages the java.lang.ref package, specifically the PhantomReference and WeakReference, to automatically free memory from unused assets when it’s needed. Some asset types are cloned from the original cached asset when requested. This allows the properties of the cloned asset to be manipulated independently from the other instances of the asset while still maintaining a reference to the shared asset data. The asset manager keeps track of the clones and, when the garbage collector deems that none of the clones are reachable, it cleans up the original asset if the memory is needed.
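Here is a minimal sketch of a reference-based cache using WeakReference and ReferenceQueue from java.lang.ref. It is a simplified illustration with hypothetical names: it leaves out cloning and PhantomReference, and it assumes assets are looked up by a string key.

```java
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical weakly-referenced cache; not the engine's actual asset manager.
final class WeakAssetCache<T> {

    /** Weak reference that remembers its cache key so the entry can be removed later. */
    private static final class Entry<T> extends WeakReference<T> {
        final String key;

        Entry(String key, T asset, ReferenceQueue<T> queue) {
            super(asset, queue);
            this.key = key;
        }
    }

    private final Map<String, Entry<T>> entries = new HashMap<>();
    private final ReferenceQueue<T> dead = new ReferenceQueue<>();
    private final Function<String, T> loader;

    WeakAssetCache(Function<String, T> loader) {
        this.loader = loader;
    }

    /** Returns the cached asset, reloading it if the collector has already reclaimed it. */
    T get(String key) {
        purge();
        Entry<T> entry = entries.get(key);
        T asset = (entry == null) ? null : entry.get();
        if (asset == null) {
            asset = loader.apply(key);
            entries.put(key, new Entry<>(key, asset, dead));
        }
        return asset;
    }

    /** Drops map entries whose assets were collected; returns how many were removed. */
    int purge() {
        int removed = 0;
        Reference<? extends T> ref;
        while ((ref = dead.poll()) != null) {
            Entry<?> e = (Entry<?>) ref;
            if (entries.get(e.key) == ref) {
                entries.remove(e.key);
                removed++;
            }
        }
        return removed;
    }

    int size() {
        return entries.size();
    }
}
```

As long as some part of the game holds a strong reference to an asset, repeated get() calls return the same instance. Once nothing else references it, the garbage collector is free to clear the weak reference, and the entry is purged on a later call.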

Scene Graph

The scene graph underwent many internal changes, but the overall functionality is relatively unchanged. The old scene graph used separate nodes for rotation, translation and scale. The new scene graph combines the three into a single transform for each node and, instead of converting quaternions to rotation matrices and multiplying those, it simply uses quaternions for rotation. Some optimizations were made to the way the scene graph is updated. Previously, the transform data for each node in the scene graph was calculated every frame. Now, only the nodes that need to be updated are calculated.
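The "only update what changed" idea is commonly done with a dirty flag. Below is a minimal sketch that uses a single x translation instead of a full transform to keep it short; the class is hypothetical and not the engine's scene graph code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical node with a one-dimensional transform, to keep the dirty-flag idea in focus.
final class SceneNode {

    private final List<SceneNode> children = new ArrayList<>();
    private SceneNode parent;
    private float localX;
    private float worldX;
    private boolean dirty = true;
    private int recomputes; // instrumentation for the example only

    void attach(SceneNode child) {
        child.parent = this;
        children.add(child);
        child.markDirty();
    }

    void setLocalX(float x) {
        localX = x;
        markDirty();
    }

    /** Flags this node and its descendants; a node that is already dirty has dirty descendants. */
    private void markDirty() {
        if (dirty) {
            return;
        }
        dirty = true;
        for (SceneNode child : children) {
            child.markDirty();
        }
    }

    /** Recomputes the world value only if this node or an ancestor changed since the last call. */
    float worldX() {
        if (dirty) {
            worldX = (parent == null ? 0f : parent.worldX()) + localX;
            dirty = false;
            recomputes++;
        }
        return worldX;
    }

    int recomputes() {
        return recomputes;
    }
}
```

Nodes that never move cost nothing after their first update, which matters when most of the scene is static.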

Render System

The render system underwent internal changes in the way that the renderable nodes are queued, culled and rendered. These changes aren’t really worth going into detail about. One change that is worth mentioning, however, is the way that the render system handles assets that are uploaded to the GPU, such as textures and buffer data. It uses a method similar to the one used by the asset manager and leverages Java references to be notified when an uploaded asset is no longer reachable CPU-side. When the system is notified of a dead GPU asset, the asset is removed from the GPU. Another new feature of the render system ties into the new asset management: the asset caches can be cleared of specific types of assets, and the render system can then be forced to re-initialize all attached rendering components. These two features, when combined, can be used to instantly reload shaders without restarting the application.

Shader System

All of the shaders are now GLSL 1.50 core instead of GLSL 3.30 core. This was done in response to shader compilation errors on OS X, which, as I understand it, is limited to OpenGL 3.2 and GLSL 1.50. When shader assets are loaded, they can be assigned global uniform bindings, which consist of commonly needed uniforms, such as the view matrix and projection matrix. When the shader is enabled, the global bindings are checked and updated. Technique-specific uniforms exist within the lightweight technique objects and are updated in the same fashion. Many techniques can have a reference to the same shader asset, yet have different values for the technique-specific uniforms.

The old shader system used the positions of the far corners of the frustum in view space to cast a ray through a clip space fragment position and reconstruct the view space position of the fragment from the gbuffer’s depth texture. The frustum corner positions were calculated and passed into the shader every frame. Now the frustum corners are calculated in the shader by multiplying the full-screen quad corners by the inverse of the projection matrix.

Jan 26

The particle code is shaping up nicely. This is now the third re-design from the ground up and this is the one that I’m sticking with. I decided to use the CPU for particles. Yes, it is limiting and, yes, I am aware of things like texture-data, transform feedback VBO, and OpenCL particle systems that run in parallel on the GPU. I chose to implement particles on the CPU because I wanted to have a modular system that was easily extensible. I didn’t want to write a tool that would dynamically generate a shader or OpenCL kernel for every possible combination of parameter manipulation modules. I wanted to have access to the particle data in order to do things like per-particle light sources and collision.

The current system meets these needs by using what I’m calling modules. Modules implement interfaces and are assigned to emitter definitions; essentially a form of dependency injection. Emitter definitions are collected within system definitions or assigned to other emitter definitions as child emitters that spawn either at the birth of a particle or on the death of a particle. Particles exist in an array and are generated on start-up. The particle pool manages particles by allowing emitters to request either a sorted or unsorted particle, and allows a maximum of 10k sorted and 20k unsorted particles.

When the particle manager receives a request to create a new particle system from a definition and attach it to the scene graph, it creates a new system instance, and subsequent emitter instances. The instance classes are very lightweight and contain instance-specific members such as system world position, emitter position relative to the parent system, active particles, and counter modules copied from the definition. Each emitter contains a reference to its definition and, during the particle update loop, the modules attached to the definition are called and passed a particle to act upon.

Using a modular dependency injection in this way greatly reduces conditional branching within the update loop and allows new behavior models to be added quickly, easily, and with very little risk of breaking the save format.
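A minimal sketch of the module idea, assuming a much simpler particle than the real one. All names are hypothetical; the point is only that the update loop calls whatever modules the definition carries instead of branching on emitter settings.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of module-driven particle updates; not the engine's actual classes.
final class ParticleSketch {

    /** Bare-bones particle: 2d position, velocity and age. */
    static final class Particle {
        float x, y, vx, vy, age;
    }

    /** A module changes one aspect of a particle each update. */
    interface ParticleModule {
        void apply(Particle p, float dt);
    }

    /** Shared, immutable-in-spirit description of an emitter: just a list of modules. */
    static final class EmitterDefinition {
        final List<ParticleModule> modules = new ArrayList<>();

        EmitterDefinition with(ParticleModule module) {
            modules.add(module);
            return this;
        }
    }

    static ParticleModule constantAcceleration(float ax, float ay) {
        return (p, dt) -> { p.vx += ax * dt; p.vy += ay * dt; };
    }

    static ParticleModule integrateVelocity() {
        return (p, dt) -> { p.x += p.vx * dt; p.y += p.vy * dt; };
    }

    /** The loop has no per-behavior conditionals; it just runs the definition's modules in order. */
    static void update(EmitterDefinition def, Particle[] particles, int activeCount, float dt) {
        for (int i = 0; i < activeCount; i++) {
            Particle p = particles[i];
            for (ParticleModule module : def.modules) {
                module.apply(p, dt);
            }
            p.age += dt;
        }
    }
}
```

Adding a new behavior means writing one more module and attaching it to a definition; the update loop itself stays untouched.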

Features:

  • Position modules
  • Velocity modules
  • Acceleration modules
  • Rotation modules
  • Animated vector flow field based on Perlin noise
  • Textures
  • Billboard, +x facing, +y facing, +z facing
  • Double sided with reversed texture on back
  • Keyframe interpolation for color, alpha, size, texture index, rotation
  • Burn parameter for smooth interpolation between alpha and additive blending
  • Emissive parameter

Total Project Statistics:

---------------------------------------------------------------
Language     files          blank        comment           code
---------------------------------------------------------------
Java           645          15593           7098          45229
PHP            309          10275           5858          26722
CSS              6            641             68           3242
GLSL            62            458            130           1839
Javascript      14            262             89           1192
XML             11            158             16            771
HTML             3              8              0             42
DOS Batch        1              0              0              1
---------------------------------------------------------------
SUM:          1051          27395          13259          79038
---------------------------------------------------------------
Jan 14
Jan 13

Not sure if art or bad buffer stride.

Jan 3

Update 0.0.146e is live!

This update has a few fixes and one major addition: a sprite animation editor! Currently the animator supports three types of interpolation: step, linear and cosine. Initially I had planned to also use cubic interpolation like a Hermite or TCB spline, but decided against it. The cosine has some of the features of a parameterized spline, like the smooth ease-in and ease-out, but lacks the fine-tuned control of a TCB spline. I’m OK with that for now. The cosine interpolation is cheap and the sprites are blocky. Seriously, how much granularity do we need here?
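For reference, the three interpolation types can be sketched as follows, where a and b are the values of two neighboring keyframes and t runs from 0 to 1 between them. These are the common textbook forms with hypothetical method names, and the step variant shown (hold a until t reaches 1) is an assumption about how stepping is defined.

```java
// Hypothetical helper with the textbook forms of the three interpolation types.
final class Interpolation {

    /** Holds the first keyframe's value until the next keyframe is reached. */
    static float step(float a, float b, float t) {
        return t < 1f ? a : b;
    }

    /** Straight-line blend from a to b. */
    static float linear(float a, float b, float t) {
        return a + (b - a) * t;
    }

    /** Remaps t through half a cosine wave, giving a smooth ease-in and ease-out. */
    static float cosine(float a, float b, float t) {
        float eased = (1f - (float) Math.cos(t * Math.PI)) * 0.5f;
        return a + (b - a) * eased;
    }
}
```

The cosine version costs one cos() call per value and needs no extra parameters, which is the trade-off against a TCB spline: cheaper and simpler, but with no per-key control over the curve's shape.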

There is now an envelope editor that allows you to visualize all of the keyframes in the timeline for each transform and assign an interpolation type for each axis.

Lodestar:
* Fixed crash report newline spacing… again

Launcher:
+ Added animator to the menu
* Fixed empty menu on initial download

Sprite Animator:
+ Initial release; v0.1b

Sprite Editor:
+ Added hotkey Q for Add
+ Added hotkey E for Paint
+ Added hotkey S for Erase 

---------------------------------------------------------------
Language     files          blank        comment           code
---------------------------------------------------------------
Java           483          12830           5477          37052
PHP            309          10275           5858          26722
CSS              6            641             68           3242
GLSL            54            407            111           1579
Javascript      14            262             89           1192
XML             11            158             16            771
HTML             3              8              0             42
DOS Batch        1              0              0              1
---------------------------------------------------------------
SUM:           881          24581          11619          70601
---------------------------------------------------------------
Dec 29

Didn’t get much done this week with the holidays and all, so I’m afraid there will be no update this weekend. I did get a great start on the animator and I hope to release it in next weekend’s update. Stay tuned!

---------------------------------------------------------------
Language     files          blank        comment           code
---------------------------------------------------------------
Java           472          12499           5307          35440
PHP            309          10275           5858          26722
CSS              6            641             68           3242
GLSL            54            407            111           1579
Javascript      14            262             89           1192
XML             10            134             14            669
HTML             3              8              0             42
DOS Batch        1              0              0              1
---------------------------------------------------------------
SUM:           869          24226          11447          68887
---------------------------------------------------------------