In this blog post I want to show the different ways I placed trees and how each one affects performance. How the trees themselves are created is a topic for another blog post. Here I want to show how to place purchased Speedtree trees while keeping performance under control.
PLACE TREES
Measuring the performance in Unity itself is not very reliable. Precise measurements should be made in the finished game. These tests here are only meant to give an indication of how the performance can be improved.
Important: If you work on two monitors, or if you have the “Scene” and the “Game” view displayed simultaneously in Unity, the FPS (frames per second) drop massively, because Unity has to render everything twice. Therefore I only had the “Game” view open for the measurements.
TEST 1 – EMPTY TERRAIN
First I start with an empty terrain. I have applied different ground textures and some elevations in the foreground. This is the test scene:
TEST 2 – TWO TEST TREES (MANUALLY)
Next I bought the tree “Cypress_Oak_Desktop” in the Speedtree shop, exported it, and placed it manually as a prefab on the left side (Speedtree polygon model: Total: 9341 Tris). Afterwards I reduced the tree in Speedtree itself and placed the result manually as a prefab on the right side of the following picture (Speedtree polygon model: Total: 7627 Tris).
Interesting: Although Speedtree tells me that both trees together have 9341 + 7627 = 16968 Tris (triangles), Unity's statistics show a different number. This is quite astonishing.
TEST 3 – TWO TEST TREES (PAINT TREES)
Now I have entered both prefabs as trees in the Terrain Tool to place them:
Strangely enough, all values remained the same; only the number of frames per second (FPS) got worse. However, the FPS cannot be measured exactly in the Unity Editor.
TEST 4 – EIGHT TREES (MANUALLY)
Now I extend the test. Four trees of each prefab are placed manually:
With 6 additional trees we now have 30 batches more and 219.4k Tris more.
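As a rough cross-check, the following sketch compares the triangle increase that Speedtree's numbers would predict with the 219.4k increase Unity shows. It assumes the 6 additional trees are 3 copies of each prefab; the constant and function names are made up for this illustration.

```python
# Triangle counts reported by Speedtree for the two prefabs (from the post).
TRIS_ORIGINAL = 9341
TRIS_REDUCED = 7627

# Assumption: the 6 additional trees are 3 copies of each prefab.
ADDED_PER_PREFAB = 3

# Increase shown by Unity in the post (219.4k, a rounded value).
UNITY_ADDED_TRIS = 219_400


def expected_added_tris(per_prefab: int = ADDED_PER_PREFAB) -> int:
    """Triangles that Speedtree's numbers predict for the additional trees."""
    return per_prefab * (TRIS_ORIGINAL + TRIS_REDUCED)


expected = expected_added_tris()
ratio = UNITY_ADDED_TRIS / expected

print(f"expected from Speedtree: {expected}")
print(f"shown by Unity:          {UNITY_ADDED_TRIS}")
print(f"ratio:                   {ratio:.2f}")
```

Under these assumptions Unity counts roughly four times as many triangles as Speedtree does. One possible explanation, which I have not verified here, is that Unity's statistics count a triangle once for every pass in which it is rendered (for example shadow passes), while Speedtree counts the model's triangles only once.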
TEST 5 – EIGHT TREES (PAINT TREES)
The same test is repeated with the Paint Trees tool on the terrain.
Except that it needs one more batch, nothing has changed.
INTERMEDIATE FACTS
It seems that the numbers Speedtree reports and the numbers Unity displays do not match. Whether Unity is doing something strange or Speedtree is delivering wrong data, I cannot tell.
It is also noticeable that there is no significant difference between placing the trees manually and using the Terrain Painter.
SCENE OPTIMIZATION
Now it is a matter of keeping the number of batches or the number of triangles (Tris) as low as possible, but still generating a dense forest. We do this in several tests.
To better display the LOD effect, I colored the LOD1 trees, i.e. those with low quality, red.
For comparison, I placed exactly the same number of trees manually:
It seems that all values are now slightly better, but not substantially so. The difference of 100,000 triangles is probably due to rounding.
We can now definitely assume that the Terrain Painting Tool does not bring any performance improvements.
LOD VS. NON LOD
As a final test I will now look at how much optimization can be achieved with many trees when working with LODs.
LOD (level of detail) is a technique that displays distant 3D objects less accurately in order to save performance. You create a 3D object and call it LOD0 (high quality), then reduce its level of detail step by step and call the simplified versions LOD1, LOD2, and so on. This way you can show many 3D objects to the player at the same time, because the objects that are far away do not need to be shown as accurately. Take a house as an example. If you are close to the house, you want to show the door handle. But this detail is hardly visible when the house is 300 meters away. Therefore the distant version of the house does not need a door handle, at most a black dot that simulates one.
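The basic idea of picking a model by distance can be sketched in a few lines. This is a simplified illustration, not Unity's implementation: Unity's LOD Group switches on the object's relative height on screen rather than on raw distance, and the switch distances below are hypothetical values.

```python
from bisect import bisect_right

# Hypothetical switch distances in meters (assumption, not from the post).
# Below 50 m: LOD0, from 50 m up to 150 m: LOD1, from 150 m on: LOD2.
LOD_DISTANCES = [50.0, 150.0]


def select_lod(distance: float, thresholds=LOD_DISTANCES) -> int:
    """Return the LOD index (0 = highest quality) for a camera distance."""
    # bisect_right counts how many switch distances have been reached.
    return bisect_right(thresholds, distance)


for d in (10.0, 80.0, 300.0):
    print(f"{d:6.1f} m -> LOD{select_lod(d)}")
```

In the house example, the door handle would only be part of the LOD0 model, so it is rendered at 10 meters but not at 300 meters.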
For this purpose I drew four points on the landscape. At each point I paint trees with a “Brush Size” of 50 and a “Tree Density” of 100, using trees that only exist as LOD0, i.e. high-quality trees. Since the placement within the radius is random, the values can change a bit with every new placement. I have estimated the variation at plus/minus 10%. But since I work with 4 placements, the random error should be reduced a bit.
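How much the error shrinks can be estimated with a small sketch. It assumes that the four placements are independent, contribute about the same number of trees each, and that the plus/minus 10% can be treated like a standard deviation. None of this is measured; the function name is made up for the illustration.

```python
import math


def combined_relative_error(single_error: float, placements: int) -> float:
    """Relative error of the total over several independent placements.

    Assumes independent placements of equal expected size, so the
    relative error shrinks with the square root of their number.
    """
    return single_error / math.sqrt(placements)


print(f"{combined_relative_error(0.10, 4):.1%}")
```

Under these assumptions the four placements together vary by about half as much as a single placement.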
We now do the same test with trees that have a LOD0 and a LOD1 tree model, where I colored the LOD1 model red:
For this test, please ignore the FPS. I accidentally left the editor open in the background. We now notice that the number of batches has not been reduced significantly, but the number of triangles has.
But the biggest performance saver is “GPU Instancing”:
GPU Instancing is a setting that can be enabled on most materials. In this example I only enabled GPU instancing on the leaf material. We now see 15% fewer batches. If I enable GPU instancing on both tree materials, including the trunk, I get a saving of 25%, which is significant.
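Why instancing reduces batches can be illustrated with a toy model: draws that share the same mesh and an instanced material are merged into one batch, everything else costs one batch per draw. This is a deliberate simplification and not how Unity counts internally; it ignores shadow passes, per-batch instance limits, LOD switches and terrain rendering, which is why the real savings in my measurements are far smaller. All names are hypothetical.

```python
def count_batches(draws, instanced_materials=frozenset()):
    """Count batches for (mesh, material) draws in a toy model.

    Draws whose material is in instanced_materials are merged per
    (mesh, material) pair; all other draws cost one batch each.
    """
    batches = 0
    merged = set()
    for mesh, material in draws:
        if material in instanced_materials:
            if (mesh, material) not in merged:
                merged.add((mesh, material))
                batches += 1
        else:
            batches += 1
    return batches


# Eight identical trees, each drawn as a trunk part and a leaf part.
forest = [("oak", "trunk"), ("oak", "leaves")] * 8

print(count_batches(forest))                       # no instancing
print(count_batches(forest, {"leaves"}))           # leaves instanced
print(count_batches(forest, {"leaves", "trunk"}))  # both instanced
```

The model shows the same tendency as the measurement: instancing only the leaves helps, and instancing the trunk as well helps more.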
UGLY LOD1 TREES
The problem with the LOD technique here is that the LOD0 and LOD1 trees use the same material.
What we see here is the typical effect when we have a material with an “Alpha Cutoff” setting:
As a result, the trees in the background (i.e. those that are displayed as LOD1) appear less dense. This can look very ugly depending on the camera settings.
I solve this problem by copying the material for the LOD1 trees, but reducing the alpha cutoff value massively so that the trees in the background also appear dense.
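Why a lower alpha cutoff makes distant trees look denser can be sketched with a tiny one-dimensional "texture". I did not investigate the exact cause in Unity; the sketch assumes one commonly cited explanation, namely that distant leaves are sampled from smaller, averaged versions of the texture (mipmaps), where alpha values are blurred and fewer texels pass the cutoff. The alpha values and cutoffs are invented for the illustration.

```python
def downsample(alpha_row):
    """Average neighbouring pairs, like one mip level of a 1-D alpha row."""
    return [(alpha_row[i] + alpha_row[i + 1]) / 2
            for i in range(0, len(alpha_row) - 1, 2)]


def coverage(alpha_row, cutoff):
    """Fraction of texels that survive the alpha test (alpha >= cutoff)."""
    return sum(a >= cutoff for a in alpha_row) / len(alpha_row)


# Hypothetical leaf alpha row: 1.0 = leaf, 0.0 = empty space.
leaf = [1.0, 0.0, 0.0, 1.0, 1.0, 0.0, 0.0, 0.0]
far = downsample(leaf)  # blurred version used in the distance

print(coverage(leaf, 0.6))  # near tree, original cutoff
print(coverage(far, 0.6))   # far tree, same cutoff: leaves vanish
print(coverage(far, 0.2))   # far tree, lowered cutoff: leaves return
```

In this toy case the lowered cutoff even covers more than the original texture did, so the value for the LOD1 material has to be tuned by eye.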
The effect is enormous. The forest seems much denser in the distance again. It does not look optimal yet, but that is because I only took two LOD models. It would make sense to work with three or even four LOD models or materials.
CONCLUSION
As a layman I have learned a lot. My conclusion:
GPU instancing has a massive impact and reduces the number of batches by about 25%.
LOD is worthwhile with regard to the number of triangles.
There is no significant difference if you place the trees by hand or with the Terrain Painter.
Trees look sparse in the distance. This can be improved by using two materials and different alpha cutoff values.