Today I'm going to switch gears and write something a little more fun. This is the first article in a series dedicated to developing a QuadTree based terrain for XNA 4. Terrain generation is a hot topic amongst XNA developers and a question that arises often.
There are several different approaches to developing terrain. The simplest method is the “brute force” approach where all terrain data is rendered every frame. With the power of today’s hardware, this is definitely a viable solution for games with simple, low geometry terrains that target the PC, but it’s an approach that doesn’t scale well when terrain becomes more complex.
Another approach to terrain is to implement a Level of Detail (LOD) algorithm that sets out to reduce the complexity of the terrain as it gets further away from the user. This step alone can increase efficiency and produce phenomenal gains in performance while raising the complexity and detail of the terrain. There are several different approaches to LOD, and even more ways to improve performance beyond simply subdividing the terrain.
This series focuses on implementing something very similar to a ROAM algorithm. Aside from simply subdividing the terrain, it also implements further improvements by performing view frustrum culling. View frustrum culling put most simply is the process of culling away, or removing geometry that is not visible to the user.
My terrain implementation started with the following goals.
1. Dynamically calculate and update level of detail.
2. Generate terrain based on a heightmap
3. Use a queryable data structure to aid in locating specific nodes.
4. Perform culling of non-visible areas.
5. Make the detail level of my terrain adjustable.
6. Avoid too much garbage generation (and keep the Heap simple!).
7. Update my terrain asynchronously (this could use improvement).
While I’ve managed to accomplish all of the items above and will show you how to do the same, I'm continually updating and working on the project as I write this series. My implementation is simple and could have many improvements before being used in a production game. Below is a list of improvements I’m aware of that could be made:
1. LOD Banding algorithm - the method I'm using for banding the LOD based on distance is not very forgiving and it takes some tweaking to eliminate "popping" of vertices, though nice results can still be achieved.
2. Support for different vertex types - currently my implementation uses VertexPositionNormalTexture. This was the easiest to just get working with the BasicEffect class, however I would like to separate some of this logic and utilize generic vertex types to make the system more flexible.
3. Terrain Paging - currently the system is designed to work with a single terrain, but could be made to page terrain without much difficulty.
4. Better algorithms all the way around - I'm not a math wizard and some of my calculation logic could be improved.
5. More efficient splitting - currently the terrain merges all nodes and re-calculates all splits each update. It may be possible to simply calculate the deltas since the previous update and perform the splits and merges accordingly without resetting the entire tree.
6. Offload of work – this implementation is entirely CPU bound. It is designed that way to most effectively demonstrate how terrain is generated, however much of the work could be offloaded to the GPU. In addition to texturing, heights and normals can be calculated on the GPU via HLSL. This would greatly reduce the GPU bandwidth, the size of the vertex buffer, and the overall memory footprint of the terrain system as well as provide the ability to perform vertex morphing for smoother transitions between LOD levels without affecting performance.
The current implementation of the terrain uses a single VertexBuffer that contains all of the vertices for the terrain. This is a lot of vertex data when working with large terrains, such as a 1024 x 1024 (actually 1025 x 1025 vertices, more on this later), the overall performance is still very good because vertex data is buffered and not recalculated on each update.
The terrain update is designed for a simple asynchronous update. Two index buffers are currently used, one active and one inactive. The index buffers are initialized with a large size to avoid having to recreate them (and thus generate garbage) between updates. The inactive buffer is populated with each update and the buffers are swapped when the update completes. For working with larger terrains, this starts to lend well to a paged terrain system, which may also require multiple vertex buffers, depending on the size of the terrain. One caveat is that it’s possible to quickly generate more indices than the index buffers have capacity for if culling is disabled. We will explore possible solutions to this as we progress.
Quad Trees for Terrain
The quad tree is the core that drives the terrain. A quad tree is essentially a way to structure your data. In our case it is the data that describes the terrain and its bounds. The purpose of the quad tree is to provide a way to recursively drill down into specific parts of the terrain to perform functions and get information without having to check every vertex of the terrain.
Think of the terrain as a large square. This would be the top node (Quad Node) of the quad tree. We partition this node into four child nodes, each consuming 1/4 of the space of the parent node. The top level node will have a top-left child, a top-right child, a bottom-left child and a bottom-right child. Each of those children will also be divided into 4 children and so on until a desired level is reached.
[QUAD TREE IMAGE COMING SOON]
Now that we know each quad node consumes a specific amount of space, we can find a specific position inside our terrain efficiently. We check to see if our top level node contains our point. If it does, it checks each of its four children. For each child it checks, if the given child does not contain the point, nothing is returned. If a child contains the point, that child continues to check it’s children, and so on, until a node containing the point is found that either does not have any children or is at the maximum depth we’d like to search. As you can see, this lets use dive deep into the tree while continually trimming the number of individual nodes we have to query.
Classes vs. Structs
This implementation uses a mix of classes and structs. Other implementations I've seen use strictly classes, however I'm not fond of that approach, especially when it comes to programming for the XBox 360. Unfortunately, XBox uses a version of the Compact Framework which does not support generational garbage collection. There are many articles on the topic so I won't get into detail here but the basics are this: When you create and destroy a lot of reference type objects (such as classes) you generate garbage, as they are stored on the heap and must be garbage collected. Merely creating and destroying isn't the only problem however. The complexity of your heap (the number of objects referencing other objects) is also an issue as the garbage collector has to follow these reference paths for each object on the heap to determine whether objects still contain references or can be safely freed from memory.
We’ll attempt to strike a balance between heap complexity and memory requirements. My quad nodes are all classes. This was necessary to efficiently traverse the tree as each quad node has references to its four children as well as its neighboring quad nodes. Some examples I have seen also uses classes for the vertices however and I didn't see the sense in adding millions of additional objects to the heap and even more cross references (a single vertex can be shared with a lot of nodes and each node contains 9 vertex references).
Instead, I created a struct that maintains the vertex index in the master array as well as a boolean value indicated whether that particular vertex is active. This creates a higher demand for memory as structs are value types and a single vertex will have a struct referencing it for every quad node it exists in and slightly less efficient lookups as any calls necessary to get the specific vertex data must do a lookup by index on the master vertex array, but the trade-off is a less complex heap and more efficient garbage collection.
Bounding boxes are used to determine whether nodes need to be split. The tree checks for existence of a point within each node's bounding box and makes the determination. Additionally, a bounding frustrum is created from the camera's View and Projection matrices. Checks against this BoundingFrustrum are performed to determine whether nodes should split and whether their vertices should be included in the updated index buffer.
Setting Up the Game Project
Before we begin, we need a way to test our terrain. Go ahead and start by creating a new Windows Game project. You can call your project anything you’d like. This will be the base testing solution for our terrain. There are a few things we need to add to our Game.cs class that we will be using later. We will be adding more, but these should help you get started. Add the following private variables to your Game1.cs class:
private GraphicsDeviceManager _graphics;
Matrix _world = Matrix.Identity;
DateTime _last = DateTime.Now; //For calculating FPS
int _fps; //For calculating FPS
BackgroundWorker _terrainWorker; //Background worker for terrain processing
You'll also need to include the System.ComponentModel namespace for the background worker. Add "using System.ComponentModel;" to the top of your Game1.cs class file.
This segment of the tutorial introduced you to some of the basic concepts that we’re using for our terrain. Keep watching as I will be adding new sections as quickly as I can put them together. At the end of the series I will provide full source code for you to download and use as you’d like. If you have any questions or thoughts on the way I'm approaching my terrain generation, feel free to comment.
<< Terrain Series Table of Contents Go to Part 2 >>