Author Topic:   Terrain generation optimization
Pin Bender
New Member
posted September 12, 1999 06:22 PM            
Hey, here's a question for you:

Since TreadMarks generates the entire tree each frame, would it not be more optimal to generate an indexed polygon list instead of fans? If it were used, you could also get rid of the second pass on the tree because you can split the poly model at the same time that you're splitting the tree.

(At least, this would be possible under DirectX.. I'm not that familiar with OGL..)


LDA Seumas
unregistered
posted September 13, 1999 05:24 AM           
Hi Pin,

Thanks for posting.

That's basically the "use a global vertex list and have all binary triangles index their corners into it" strategy. I've looked into it in the past, but there are a couple of downsides to it.
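
For readers unfamiliar with the idea, here is a minimal C sketch of a shared vertex list with index-only triangles. It uses a small uniform grid rather than a binary triangle tree, and every name in it (GRID, vertex_index, height, and so on) is a hypothetical stand-in, not TreadMarks code.

```c
#include <assert.h>
#include <stddef.h>

#define GRID 4                      /* hypothetical: 4x4 cells, 5x5 vertices */
#define VERTS_PER_SIDE (GRID + 1)

typedef struct { float x, y, z; } Vertex;
typedef struct { unsigned short a, b, c; } IndexedTri;

/* Map integer grid coordinates to a slot in the global vertex list. */
static unsigned short vertex_index(int x, int z)
{
    assert(x >= 0 && x <= GRID && z >= 0 && z <= GRID);
    return (unsigned short)(z * VERTS_PER_SIDE + x);
}

/* Stand-in heightmap lookup (assumption: any height function would do). */
static float height(int x, int z) { return (float)(x + z) * 0.5f; }

/* Fill the shared vertex list once; each vertex is stored exactly once. */
static size_t build_vertices(Vertex *out)
{
    size_t n = 0;
    for (int z = 0; z <= GRID; ++z) {
        for (int x = 0; x <= GRID; ++x) {
            out[n].x = (float)x;
            out[n].y = height(x, z);
            out[n].z = (float)z;
            ++n;
        }
    }
    return n;
}

/* Emit two triangles per cell, referring to corners by index only. */
static size_t build_triangles(IndexedTri *out)
{
    size_t n = 0;
    for (int z = 0; z < GRID; ++z) {
        for (int x = 0; x < GRID; ++x) {
            unsigned short i00 = vertex_index(x, z);
            unsigned short i10 = vertex_index(x + 1, z);
            unsigned short i01 = vertex_index(x, z + 1);
            unsigned short i11 = vertex_index(x + 1, z + 1);
            out[n++] = (IndexedTri){ i00, i01, i10 };
            out[n++] = (IndexedTri){ i10, i01, i11 };
        }
    }
    return n;
}
```

In this toy case 32 triangles reference 96 corners but only 25 distinct vertices are stored, which is the saving being discussed; the scattered index lookups are the cost.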

First is cache coherency, in both the game engine itself and the 3D driver. When you get a lot of triangles in the terrain, the vertex and binary triangle tree dataset is going to start to get a little big. Constantly indexing into semi-random locations in a list while doing splits, followed by more semi-random indexing in the driver when rendering the tris, will probably start to get costly in cache misses. Right now my splitting is fairly cache friendly, as I use a set of recursive functions to split triangles in a fairly linear fashion, with the only reads outside the triangle structures being from very tiny blocks of pre-computed variance values. At all stages, vertex coordinates are passed down through the stack (as integers for x and z, float for y) with the recursive functions.
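
A rough C sketch of this style of recursive split follows. It is an illustration only: the implicit variance tree layout (children of node n at 2n and 2n+1), the SplitCtx struct, and the threshold test are assumptions, and the float y coordinate mentioned above is left out for brevity.

```c
#include <assert.h>

#define MAX_DEPTH 4
#define VAR_NODES (1 << MAX_DEPTH)  /* implicit tree, valid nodes 1..VAR_NODES-1 */

typedef struct {
    const unsigned char *variance;  /* small block of pre-computed variances */
    unsigned char threshold;        /* split while variance exceeds this */
    int leaves;                     /* number of triangles that would be drawn */
} SplitCtx;

/* Recursively split one right-angled triangle. Corner coordinates travel
 * down the call stack as integers; nothing is read from a vertex list. */
static void split(SplitCtx *ctx, int node, int depth,
                  int lx, int lz, int rx, int rz, int ax, int az)
{
    assert(depth >= 0);
    if (depth == 0 || node >= VAR_NODES ||
        ctx->variance[node] <= ctx->threshold) {
        ctx->leaves++;              /* a real engine would emit the triangle */
        return;
    }
    /* The midpoint of the hypotenuse becomes the apex of both children. */
    int cx = (lx + rx) >> 1;
    int cz = (lz + rz) >> 1;
    split(ctx, node * 2,     depth - 1, ax, az, lx, lz, cx, cz);
    split(ctx, node * 2 + 1, depth - 1, rx, rz, ax, az, cx, cz);
}
```

The only memory touched outside the stack is the variance array, which is the cache-friendliness argument made above.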

Another aspect besides cache is texture coordinate sharing. Since my landscape texture is broken into patches, the vertices at the edges need multiple texture coordinates, so having one unified vertex and texcoord array wouldn't be possible. It may work with a change to the Texture Matrix for each patch rendered while the arrays are locked, but I don't trust all cards to properly accelerate the texture matrix, and properly throw out _only_ the cached transformed texcoords on a texture matrix change.
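
To make the edge problem concrete, here is a small C sketch. It assumes each patch's texture spans 0..1 across a 64-unit patch; PATCH_SIZE, TexCoord, and patch_texcoord are hypothetical names chosen for the example.

```c
#include <assert.h>

#define PATCH_SIZE 64   /* assumed patch width in grid units */

typedef struct { float u, v; } TexCoord;

/* Texture coordinate of grid vertex (x, z) as seen by patch
 * (patch_x, patch_z), with the patch texture spanning 0..1. */
static TexCoord patch_texcoord(int x, int z, int patch_x, int patch_z)
{
    int x0 = patch_x * PATCH_SIZE;
    int z0 = patch_z * PATCH_SIZE;
    /* The vertex must lie inside the patch or on its border. */
    assert(x >= x0 && x <= x0 + PATCH_SIZE);
    assert(z >= z0 && z <= z0 + PATCH_SIZE);

    TexCoord t;
    t.u = (float)(x - x0) / (float)PATCH_SIZE;
    t.v = (float)(z - z0) / (float)PATCH_SIZE;
    return t;
}
```

A vertex at x = 64 sits on the border between patch 0 and patch 1: it gets u = 1.0 in the first and u = 0.0 in the second, so a single shared vertex entry cannot carry one texcoord for both.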

I know having one big vertex array and only transforming each terrain vertex once is a really juicy target, and it is one I have longed to be able to achieve... But in all my experiments so far I haven't found anything faster than what I've got now, which does (in a real world game engine perf test) get 400,000 multi-textured or 500,000 single-textured triangles per second on a P-II 450 with TNT-1 under NT4.

What I may try yet is using vertex arrays for each terrain patch, so at least all the vertices within a patch will be shared by all the triangles inside. And using the "change the texture array while the vertex array is still locked" trick, I may be able to significantly boost single-textured multi-pass performance as well (assuming that single-texturing cards actually have drivers that are well optimized for vertex array re-use).

------------------
-- Seumas McNally, Lead Programmer, Longbow Digital Arts


assen
Member
posted September 13, 1999 07:18 AM            
Vertex array setup calls such as glVertexPointer() are supposed to be quite expensive, so it's not a good idea (at least not on all accelerators) to issue one per terrain tile (OK, patch). This might actually turn out to be slower than just pushing all the triangles, in fans or strips.


Pin Bender
New Member
posted September 13, 1999 08:30 PM            
Actually, I guess I forgot to mention that I was thinking along the lines of using one indexed vertex array per patch.. That would probably help with the cache coherency issues..

But if indexed primitives are that expensive under OGL, I'd definitely say stick with the fans.. ;D It's just that under DX, using indexed VBs is the preferred method, to my understanding..

-- Pin


LDA Seumas
unregistered
posted September 14, 1999 03:55 AM           
I think Quake 3 outputs vertices in batches of less than 1,000, so specifying, locking, and rendering a vertex array can't be prohibitively expensive...

Once I get some time to revisit rendering optimizations some more, I will see if Vertex Arrays can provide a speed boost. It will be interesting to see if I'll have to increase my patch size to get better performance, as with a 1,000 meter view distance there could be over two hundred 64x64 patches visible at once, with a current max of 10,000 triangles total in the terrain.
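
The batch-size concern can be put in numbers using only the figures quoted above (10,000 triangles, roughly 200 visible patches). The C helpers below are a hypothetical sketch of that arithmetic, and they assume triangles are spread evenly across patches, which real terrain will not do.

```c
#include <assert.h>

/* Average triangles per visible patch, rounded down. */
static int avg_tris_per_patch(int total_tris, int visible_patches)
{
    assert(visible_patches > 0);
    return total_tris / visible_patches;
}

/* How many old square patches fit into one larger square patch. */
static int patches_merged(int new_size, int old_size)
{
    assert(old_size > 0 && new_size % old_size == 0);
    int per_side = new_size / old_size;
    return per_side * per_side;
}
```

With 10,000 triangles over 200 patches, each vertex array would carry about 50 triangles on average. Going to 128x128 patches merges 4 old patches into one, so about 50 visible patches at about 200 triangles each.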

------------------
-- Seumas McNally, Lead Programmer, Longbow Digital Arts
