Visualizing large trees and graphs using a 3D cone tree layout

Tree and graph structures are often visualized as 2D linear tree diagrams with labeled dots and connecting lines. But when the data grows beyond several dozen nodes, these diagrams become large and awkward. This article looks at a 3D Cone Tree scheme for visualizing trees and graphs with several thousand nodes. I outline the layout algorithm, show examples, and discuss strengths and weaknesses for the approach.

Introduction

The conventional approach to visualizing trees and graphs is to draw a 2D linear tree diagram like those below. The layout algorithm is straight-forward: At the top draw a dot for a tree's root node, or a graph's chosen starting node. Draw dots for the root's children nodes on a row below it, their children on the next row, and so on, then draw lines between parents and their children. For a graph, some lines will crisscross the figure.

A standard tree with a root node, children under it, and their children under them.
Linear Tree
A standard tree with a root node, children under it, and their children under them.
Linear Graph

For several dozen nodes, this works fine. But as a data set grows, these diagrams get wide and deep. By the time a data set has thousands or tens of thousands of nodes, a linear tree is very large and difficult to use.

This article illustrates a 3D Cone Tree visualization technique that works well for up to a thousand nodes or so.

Laying out a 3D cone tree

To lay out a 3D cone tree, first build a 2D circle tree that arranges its nodes in concentric circles centered on the root node.

A standard tree with a root node, children under it, and their children under them.
Circle Tree
A standard tree with a root node, children under it, and their children under them.
Circle Graph

For a perfect tree where all the leaves are at the maximum depth, those leaves are on the outermost circle of a circle tree, and the bottom row of a linear tree. For a fixed width drawing area (such as a computer screen), the outermost circle is π times longer than the linear tree's bottom row. This enables a circle tree to show roughly three times more nodes in the same size drawing.

For a graph, lines crisscross the circle tree to connect distant nodes. This can make a 2D diagram hard to read. To help this, pull the root node up along a 3D vertical axis, stretching the diagram into a cone. The circle tree's concentric circles are now horizontal cross-sections through the cone. Nodes are on the cone's surface, and graph lines crisscross through the cone's interior while the principal graph structure remains on the cone's surface (what constitutes a "principal" graph structure depends upon your data).

A standard tree with a root node, children under it, and their children under them.
Cone Tree
A standard tree with a root node, children under it, and their children under them.
Cone Graph

To help distinguish front nodes from those behind, add some white fog (depth cueing) to desaturate the colors of rear nodes. You could also add a shaded semi-transparent cone shape. Adding a fake shadow below the cone helps anchor it visually.

Here's a perfect trinary tree as the tree's depth is increased from 1 to 6 and the number of nodes increases from 4 to 1,093. Beyond about a thousand nodes, the dots and lines become very close together and the diagram becomes difficult to read.

Adjusting the layout

The above images are of perfect trinary trees. Real data is rarely this uniform. Instead, leaves may be at any level in the tree, and the number of children may vary from node to node. In these cases, a parent node's children may not be placed directly below it, and the cone tree will look twisted.

For a tree with uniform branching, you can fix the twist by rotating a child ring to center children under their parent. But for a tree or graph with nonuniform branching, you'll always get some twist. Some parents will have many children and others few, so no single rotation angle will center every parent's children beneath the parent.

You can minimize twist by looping through the children on a ring. Compute the difference between their angular position and that of their parent. Sum the differences, compute an average, and rotate the child ring by that average. This will bring most children closer to their parents and reduce twist.

You can eliminate twist by introducing false, invisible child nodes to force the tree's branching to be uniform. This will create gaps in the rings where the false nodes occupy space, but aren't drawn. While the twist is eliminated this way, the gaps take up space and reduce the maximum number of nodes that can be displayed in a fixed-size drawing area before the user has to pan and zoom to see more.

Using color

The visualizations above have edges colored by tree depth, but it's more interesting to color using some data parameter. Here I've used a social network data set that models the way an infection spreads from one person to the next. Color is based upon the infection time and reveals a strong path through the social network.

Estimating time and space use

If you have a graph, start by building a minimum spanning tree. Algorithms exist that run in O(e log n) time, where e is the number of edges and n the number of nodes. Once you have a tree, building the cone tree requires O(n) time to loop through the nodes to bin sort them on to rings based upon their tree depth. Adjusting the tree rings to minimize twist has to be done after the rings are full. This requires another pass through the rings and their nodes for O(n) time. Finally, drawing the cone tree requires a pass through the nodes and edges for O(n+e) time.

Storage for the graph requires O(n+e) space for node and edge data, node coordinates, and graphics instructions. Beyond this, the cone tree requires an array of ring data structures. Each ring simply orders the nodes on the ring, adding O(n) space.

Conclusions

Cone trees are easy to build and they are fairly intuitive to use. They work well for up to about 1,000 nodes, after which nodes become tightly packed and the diagram hard to read.

Sparse graphs work well as a cone tree, but dense graphs produce a large number of lines crisscrossing through the interior of the cone, making the diagram messier. For a large number of nodes, the outer surface of the cone becomes full, also making the interior lines harder to see.

Even for large diagrams, the number of lines and polygons to draw the diagram is well within the abilities of even the slowest 3D graphics hardware on today's computers. All of the above 3D images were drawn using Java OpenGL (JOGL) on a Mac laptop.

Further reading

  • Graph theory. Wikipedia has a good (long) overview of graph theory. Note particularly the long list of links to related topics at the end of the article.
  • Tree (graph theory). Wikipedia also has a nice overview article about trees in graph theory.
  • Minimum spanning tree. And Wikipedia has a good explanation and pseudo code for building a minimum spanning tree from a graph.

Comments

Have you build the software,

Have you build the software, is it downloadable?

Re: Have you build the software

The graph visualization algorithms discussed here are part of my research at the San Diego Supercomputer Center (SDSC) at the University of California, San Diego (UCSD). The research is funded by the National Institutes of Health (NIH) as part of their Models of Infectious Disease Agent Study (MIDAS). All of those acronyms have a voice in the distribution of the Java implementation. At this time the software is not available for download, though it may be released as open source in the future.

Thx for response!

Thx for response!

i have acone tree project

well i read your writing about cone tree its so nice and i understood the idea very well but i have a project at my college to do visualization to a cone tree using VTK library c++ the tree should be input file in any form i want but i cant do it at all :(

Hi there, Anybody knows

Hi there,
Anybody knows about a tool or software downloadable or online for cone-tree visualization?
I really need this tool for a presentation at data mining class.

Thanks in advance.

Post new comment

The content of this field is kept private and will not be shown publicly.
  • Allowed HTML tags: <a> <em> <strong> <cite> <code> <ul> <ol> <li> <dl> <dt> <dd>
  • Lines and paragraphs break automatically.
  • Web page addresses and e-mail addresses turn into links automatically.

More information about formatting options

Nadeau software consulting
Nadeau software consulting