Tree and graph structures are often visualized as 2D linear tree diagrams with labeled dots and connecting lines. But when the data grows beyond several dozen nodes, these diagrams become large and awkward. This article looks at a 3D Cone Tree scheme for visualizing trees and graphs with several thousand nodes. I outline the layout algorithm, show examples, and discuss strengths and weaknesses for the approach.
Table of Contents
The conventional approach to visualizing trees and graphs is to draw a 2D linear tree diagram like those below. The layout algorithm is straight-forward: At the top draw a dot for a tree's root node, or a graph's chosen starting node. Draw dots for the root's children nodes on a row below it, their children on the next row, and so on, then draw lines between parents and their children. For a graph, some lines will crisscross the figure.
For several dozen nodes, this works fine. But as a data set grows, these diagrams get wide and deep. By the time a data set has thousands or tens of thousands of nodes, a linear tree is very large and difficult to use.
This article illustrates a 3D Cone Tree visualization technique that works well for up to a thousand nodes or so.
Laying out a 3D cone tree
To lay out a 3D cone tree, first build a 2D circle tree that arranges its nodes in concentric circles centered on the root node.
For a perfect tree where all the leaves are at the maximum depth, those leaves are on the outermost circle of a circle tree, and the bottom row of a linear tree. For a fixed width drawing area (such as a computer screen), the outermost circle is π times longer than the linear tree's bottom row. This enables a circle tree to show roughly three times more nodes in the same size drawing.
For a graph, lines crisscross the circle tree to connect distant nodes. This can make a 2D diagram hard to read. To help this, pull the root node up along a 3D vertical axis, stretching the diagram into a cone. The circle tree's concentric circles are now horizontal cross-sections through the cone. Nodes are on the cone's surface, and graph lines crisscross through the cone's interior while the principal graph structure remains on the cone's surface (what constitutes a "principal" graph structure depends upon your data).
To help distinguish front nodes from those behind, add some white fog (depth cueing) to desaturate the colors of rear nodes. You could also add a shaded semi-transparent cone shape. Adding a fake shadow below the cone helps anchor it visually.
Here's a perfect trinary tree as the tree's depth is increased from 1 to 6 and the number of nodes increases from 4 to 1,093. Beyond about a thousand nodes, the dots and lines become very close together and the diagram becomes difficult to read.
Adjusting the layout
The above images are of perfect trinary trees. Real data is rarely this uniform. Instead, leaves may be at any level in the tree, and the number of children may vary from node to node. In these cases, a parent node's children may not be placed directly below it, and the cone tree will look twisted.
For a tree with uniform branching, you can fix the twist by rotating a child ring to center children under their parent. But for a tree or graph with nonuniform branching, you'll always get some twist. Some parents will have many children and others few, so no single rotation angle will center every parent's children beneath the parent.
You can minimize twist by looping through the children on a ring. Compute the difference between their angular position and that of their parent. Sum the differences, compute an average, and rotate the child ring by that average. This will bring most children closer to their parents and reduce twist.
You can eliminate twist by introducing false, invisible child nodes to force the tree's branching to be uniform. This will create gaps in the rings where the false nodes occupy space, but aren't drawn. While the twist is eliminated this way, the gaps take up space and reduce the maximum number of nodes that can be displayed in a fixed-size drawing area before the user has to pan and zoom to see more.
The visualizations above have edges colored by tree depth, but it's more interesting to color using some data parameter. Here I've used a social network data set that models the way an infection spreads from one person to the next. Color is based upon the infection time and reveals a strong path through the social network.
Estimating time and space use
If you have a graph, start by building a minimum spanning tree. Algorithms exist that run in O(e log n) time, where e is the number of edges and n the number of nodes. Once you have a tree, building the cone tree requires O(n) time to loop through the nodes to bin sort them on to rings based upon their tree depth. Adjusting the tree rings to minimize twist has to be done after the rings are full. This requires another pass through the rings and their nodes for O(n) time. Finally, drawing the cone tree requires a pass through the nodes and edges for O(n+e) time.
Storage for the graph requires O(n+e) space for node and edge data, node coordinates, and graphics instructions. Beyond this, the cone tree requires an array of ring data structures. Each ring simply orders the nodes on the ring, adding O(n) space.
Cone trees are easy to build and they are fairly intuitive to use. They work well for up to about 1,000 nodes, after which nodes become tightly packed and the diagram hard to read.
Sparse graphs work well as a cone tree, but dense graphs produce a large number of lines crisscrossing through the interior of the cone, making the diagram messier. For a large number of nodes, the outer surface of the cone becomes full, also making the interior lines harder to see.
Even for large diagrams, the number of lines and polygons to draw the diagram is well within the abilities of even the slowest 3D graphics hardware on today's computers. All of the above 3D images were drawn using Java OpenGL (JOGL) on a Mac laptop.
- Graph theory. Wikipedia has a good (long) overview of graph theory. Note particularly the long list of links to related topics at the end of the article.
- Tree (graph theory). Wikipedia also has a nice overview article about trees in graph theory.
- Minimum spanning tree. And Wikipedia has a good explanation and pseudo code for building a minimum spanning tree from a graph.