Case Study: Large data volume visualization of an evolving emission nebula

Journal papers and polished computer animations on TV and in the movies make the process of scientific visualization look easy. Just click a few buttons and a beautiful, well-lit, and instantly informative animation pops out. Unfortunately, this is rarely the case. This case study looks at the steps and problems involved in a large data volume visualization of star and emission nebula evolution for a planetarium show at the Hayden Planetarium at the American Museum of Natural History. The visualization involved six simulations, three sites, two supercomputers, 30,000 data files, 116,000 rendered images, 1,152 processors, 8.5 CPU years of rendering, and 7 terabytes of data.


Back in 2002, the Hayden Planetarium at the American Museum of Natural History (AMNH), the San Diego Supercomputer Center (SDSC), and the National Center for Supercomputing Applications (NCSA) collaborated to create a 3D animation showing the formation of a glowing emission nebula leading to the creation of our sun and solar system. The completed animation is part of the planetarium's space show The Search for Life: Are We Alone?, narrated by Harrison Ford, and now showing at planetariums around the world.

Simulated emission nebula
The Hayden Planetarium's digital projection system gave us an opportunity to use supercomputer simulations and state-of-the-art computer graphics to take the audience away from Earth and investigate places and events on a galactic scale. In our past work we visualized the static 3D structure of the Orion Nebula. In this new project we visualized a dynamic nebula evolving over 30 million years, requiring more than 1,000 times more data. The image to the right shows the nebula we created.

We'd love to say that everything worked beautifully. But it didn't.

Let's look first at the goals of this visualization, then at the simulations that were run, the data they generated, and the problems we encountered trying to convert the data into a coherent story about the formation of a nebula and our solar system.

Evolving an emission nebula

Simply put, an emission nebula is an enormous cloud of dust and gas that glows. Measuring several light-years across, an emission nebula begins life as a diffuse dark nebula that doesn't glow — it is seen only as a dark patch in the night sky, obscuring the light of stars beyond.

Over time, clumps of higher density gas form and grow within the dark nebula, their gravitational attraction drawing matter from the surrounding cloud. As a clump grows, the weight of layer upon layer of gas builds up, increasing the pressure and temperature at the clump's core. The pressure continues to rise until hydrogen nuclei are packed so tightly together that they fuse, igniting a thermo-nuclear reaction that signals the birth of a star.

Rosette Nebula
Hot young stars born within the nebula radiate their energy outward into the surrounding gas. High-energy photons from the stars ionize the atoms of the gas, knocking electrons from their orbits. As these free electrons recombine with atoms and drop back toward their former orbits, they emit light. It is this light we see as an emission nebula's eerie glow, such as in the Rosette Nebula to the right (Photo by N. Wright/Harvard-Smithsonian Center for Astrophysics).

Since electrons can reside in atoms only in discrete energy levels, when electrons drop from outer to inner orbits they emit light at discrete wavelengths. By examining the spectra of nebulas, astronomers deduce their chemical content. Most emission nebulas are about 90% hydrogen, with the remainder helium, oxygen, nitrogen, and other elements. Ionization of these gases gives nebulas many of the colors we see in astronomical photographs.

Forming an accretion disk

Under the pull of gravity from a newborn star, nearby dust and gas is drawn into a spinning accretion disk surrounding the star. As the material falls inwards and closer to the star, it orbits faster and faster. Some of the matter is sucked into the star itself, while other matter coalesces into the planets of a new solar system.

The HH-111 Stellar Jet
In a process not yet fully understood, the most massive of these stars expel some of their matter in enormous turbulent jets spurting out the north and south poles. Extending for light years in each direction, these jets plow into the nearby nebula, ionizing the gas and causing the jet to glow with reds and blues. Called Herbig-Haro (HH) objects, many of these jets are visible to the Hubble Space Telescope, such as HH-111 in Orion, shown to the right (Photo by B. Reipurth/U. Colorado and NASA).

Accretion disk in the Orion Nebula
Jets like HH-111 are visible to us because they are big and they glow. In contrast, the accretion disk around a new star is small and dark. Measuring only a few 1/1000ths of a light year across, such disks can be seen only as small dark silhouettes against the glowing background of an emission nebula, such as the protoplanetary disk in the Orion Nebula shown to the left (Photo by M.J. McCaughrean/MPIA, C.R. O'Dell/Rice U., and NASA).

Emission nebulas, accretion disks, and possibly stellar jets are all part of the history of Earth's sun and solar system. It is this story — from nebula to planets — that is the topic of the Hayden Planetarium's show, and the story this project strived to tell using supercomputing simulations and visualizations.

Simulating all of this

No known simulation models everything needed to tell this story. Spatial scales are a problem: the nebula for this project spanned 5 light years, while the accretion disk spanned only about 0.004 light years — roughly a 1,000x difference. Temporal scales are also a problem: the collapse of a nebula takes some 30 million years, while the formation of an accretion disk is relatively quick at around 100,000 years — a 300x difference.

So, this project sought to combine results from six simulations:

  1. The gravitational collapse of interstellar dust and gas to form dense clumps. This simulation modeled gas motion under the influence of magnetic fields and self-gravity.
  2. The ignition of a star and the expansion of a cavity around the star. This thermodynamics simulation modeled the ignition of a star from one of the nebula's dense clumps, and the creation of a cavity around the star as nearby dust and gas were blown away.
  3. The propagation of an ionization front through the nearby dust and gas after a star ignites. The simulation modeled the propagation of light outwards from an ignited star, taking into account shadowing caused by dense regions of the nebula. Based upon the light level at a point in space, the simulation marked the point as ionized for oxygen and/or nitrogen.
  4. The movement of stars within the nebula. The N-body gravity simulation modeled the motion of 100 stars.
  5. The collapse of matter into a stellar accretion disk. The simulation modeled the collapse of a Keplerian accretion disk.
  6. The expulsion of matter out the north and south poles of a star to create turbulent jets. The simulation modeled matter pushed outwards at high speed from a star's pole.

Increasing spatial and temporal data resolution

The first simulation's output was driven by the needs of the planetarium show:

  • The flight path for the show dives into the nebula and up to a star as the star ignites in the center. Because the viewpoint got very close, the data resolution had to be high; low-resolution data would exhibit jaggies all across the planetarium dome.
  • The time scale of nebula formation is large, but the time scale of ionization and accretion disk formation is quite short. To transition from one scale to the other, time had to slow down during the animation. To accommodate a slower progress of time, the data's temporal resolution had to be high; low-resolution time would create animation jerkiness.

The planetarium dome is 70 feet across, provides a nearly 180 degree field of view, and seats 400. Spatial jaggies would be terribly embarrassing on that scale, and a jerky animation can actually make the audience dizzy and sick! So producing a smooth, jaggy-free, jerk-free visualization was essential, and it drove the simulation to output data at higher spatial and temporal resolutions than might otherwise be needed. We consider this an interesting twist: usually scientists output whatever they feel they need, and later visualizers of the data chafe at the lack of sufficient resolution to make animations look good. This time, visualization had a voice from the start, and we got plenty of data.

The final simulation produced 10,024 files spanning 30 million years. Each file contained a 512³ volume covering a 5³ light year chunk of space.

Decreasing data size

While having high spatial and temporal resolution is lovely from a visualization standpoint, it is terrible from a disk storage standpoint. Also, though the data was generated at NCSA in Illinois, the visualization was done at SDSC in California — which meant that the data had to be moved from one site to the other.

Obviously, the bigger the files the more disk storage required and the more time needed to move them. Also, once at SDSC the files would be read into memory during rendering, so memory footprint was an issue. Keeping the files small was important.

Volume data of this type is often written with a few double-precision floating-point values per voxel (volume element). At 8-bytes per double-precision value, two such values per voxel gives 2 Gbytes per file. With 10,024 files, that's 20 terabytes! This was much more than we'd like.

To visualize the data, only the gas density at each voxel is needed. This density controls the brightness and opacity of the voxel during rendering. Keeping just the density drops the file size by half — one 8-byte double per voxel gives 1 Gbyte per file and 10 terabytes total. Still a lot.

Reducing the density value from double- to single-precision further halves the disk space needs — down to 5 terabytes. But we can do better.

Gas density values had a wide range — about 9 orders of magnitude, from roughly 10^-4.5 to 10^4.5. If we use this value to vary voxel brightness, we are limited by the brightness resolution of a graphics display — about 3 orders of magnitude, from 0 to 255. For accurate rendering we prefer to render with 16-bit color components, or about 5 orders of magnitude. In any case, the data's 9 orders of magnitude is overkill.

To reduce file size, the 4-byte floating-point density values were mapped onto a logarithmic scale and stored as 14-bit integers. This quantized density into 16,384 levels — more than enough visually. Stored within a 16-bit integer per voxel, the 14-bit density value left 2 bits for voxel flags used during the ionization simulation.

At 16-bits/voxel, that's ¼ gigabyte/file and 2.5 terabytes for 10,024 files — way better than the 20 terabytes we first gasped at.
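For the curious, here is a minimal sketch of this kind of log-scale quantization, written in Python with NumPy. The density bounds, bit layout, and function names are illustrative assumptions — the article does not describe the actual packing code — but the arithmetic matches the sizes above: 512³ voxels at 2 bytes each is 256 Mbytes, or ¼ gigabyte per file.

    import numpy as np

    # Assumed bounds for the ~9 orders of magnitude of gas density (illustrative).
    LOG_MIN, LOG_MAX = -4.5, 4.5
    LEVELS = 2 ** 14                 # 14-bit quantization -> 16,384 density levels

    def pack_voxels(density, nitrogen_ionized, oxygen_ionized):
        """Quantize float densities onto a log scale as 14-bit integers and pack
        two 1-bit ionization flags into the top bits of a 16-bit integer."""
        density = np.maximum(density, 10.0 ** LOG_MIN)           # avoid log(0)
        logd = np.clip(np.log10(density), LOG_MIN, LOG_MAX)
        q = np.round((logd - LOG_MIN) / (LOG_MAX - LOG_MIN) * (LEVELS - 1))
        packed = q.astype(np.uint16)
        packed |= nitrogen_ionized.astype(np.uint16) << 14       # flag bit 14
        packed |= oxygen_ionized.astype(np.uint16) << 15         # flag bit 15
        return packed

    def unpack_density(packed):
        """Recover an approximate density from the low 14 bits."""
        q = (packed & 0x3FFF).astype(np.float32)
        return 10.0 ** (LOG_MIN + q / (LEVELS - 1) * (LOG_MAX - LOG_MIN))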

When good simulations go bad

Wouldn't it be nice if the simulations all did just what they were supposed to? Stars would form, ignite, and dramatically blow away dust and gas to create a beautiful wispy glowing cavity. Accretion disks would form and swirl inward as jets spurt out in dramatic wonder. But no.

Making stars where we want them

Simulated emission nebula's blue swirls
The raw data from the nebula simulation was filled with wonderful swirls and ribbons of high-density material, as seen in the image to the right. When animated, those swirls wiggled about and grew denser as gravity attracted more and more of the surrounding gas into the clumps that would soon form stars. But the story called for just one star to form — not the five or six growing within the data. We had to pick just one to ignite and ionize the gas to make an emission nebula.

Since the story called for a flight into the glowing depths of the nebula, we wanted our destination to be in the center of the data — far from the kind of sharp edges that data cubes have but fuzzy nebulas don't. Unfortunately, none of the stars that formed were polite enough to form in the center of the data cube.

Fortunately, simulations like this wrap around so that the left side matches the right side, the top matches the bottom, and the front matches the back. If you moved an object off of one side, it would reappear on the other side. It's a bit like many of the old side-scroller video games.

This meant we could re-center the data by panning it back and forth, wrapping the data from one side to the other. After checking out each of the top five stars growing in the data, we picked one and prepared to center on it, then ignite the star in dramatic splendor.
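Because the boundaries wrap, re-centering is just a circular shift of each volume. Here is a minimal sketch of the idea in Python/NumPy (the actual tools used on the project aren't described here):

    import numpy as np

    def recenter(volume, star_index):
        """Circularly shift a periodic volume so the voxel at star_index lands
        at the center of the cube.  Data pushed off one face wraps around and
        reappears on the opposite face, just like the simulation's boundaries."""
        center = np.array(volume.shape) // 2
        shift = tuple(int(c - s) for c, s in zip(center, star_index))
        return np.roll(volume, shift, axis=(0, 1, 2))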

Pretending there's a star there

The last step of the nebula collapse simulation was supposed to be used as the initial conditions for a new simulation that would ignite the star and couple thermodynamic effects with ionization calculations. The chosen star would ignite and whooosh, it would blow a cavity within the nebula, ionizing the inner reaches of the cavity. This is what physics says will happen and what we see in nebula photos from the Hubble Space Telescope. Unfortunately, after a truly noble effort by several scientists, the exciting, new, coupled simulation bombed. Time for a backup plan.

Without the coupled simulation's thermodynamic effects, a cavity around the star would not form. And without a cavity, there was no glowing bubble into which to fly. So we cheated and moved the star to an existing cavity in the data. This is clearly wrong from a literal data-is-king standpoint, but the real point is to tell the right story, with or without the right data. In this case physics and the story required a cavity around the star, so we found one.

The 10,024 files of the simulation were re-centered on the empty space of a cavity in the data, instead of any of the high-density protostar clumps. This created another 2.5 terabytes of data.

Removing ionization jaggies

Once re-centered, the last step of the nebula simulation became the initial conditions for a backup simulation that only modeled the propagation of light into the nebula. Dense clumps of dust cast shadows across the nebula. Illuminated gas was ionized if the energy level was sufficient to cause nitrogen and/or oxygen to glow. Since it takes less energy to make nitrogen glow than oxygen, the outer, dimmer reaches of the nebula glowed red from nitrogen, while the inner reaches glowed green from oxygen.

Slice through the simulated emission nebula showing sharp boundaries on an ionization front
Here we were bitten by our earlier decision to keep data files small. Recall that we stored a voxel's gas density as a 14-bit integer and left two bits for flags to indicate ionization. The notion was to set one bit if the voxel contained ionized nitrogen, and the other bit if it contained ionized oxygen. While this worked, it left no way for a voxel to be partially ionized. The resulting data had sharp edges at the boundary of the ionized inner region of the nebula. The image to the left shows a slice through this ionized cloud — red areas are ionized nitrogen, green areas ionized oxygen. The sharp boundaries are distinctly non-nebula-like.

The nebula volume spanned 5 light years side-to-side across 512 voxels. That makes each voxel 5/512 ≈ 0.0098 light years on a side — about 57,282,187,500 miles, or roughly 615 times the distance from the Earth to our Sun. The presumption built into the ionization simulation — a wrong one — was that each of these nebula voxels was uniformly filled and would be ionized fully or not at all. But with voxels that size, it is easy to imagine that they might contain clumps of gas with some regions ionized and others not. Partial ionization of a voxel was a must, but the simulation didn't do it.

Without partial ionization, the ionization region had sharp boundaries that looked very wrong. So to tell the right story, we again had to fake it — this time by running a volumetric blur filter on the data. Initial tests showed that running a large kernel Gaussian blur filter took some 30 hours per file. With 300 ionization data files to do, this was too long. So we used a trick: we downsampled the volumes to a lower resolution, blurred them with a smaller filter kernel, then upsampled them back to the original resolution. This process took about 1/30th of the time, and looked almost the same.
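In effect, the trick trades a little accuracy for a lot of speed. Here is a sketch of the approach using SciPy, with an illustrative blur width and downsampling factor rather than the values used in production:

    import numpy as np
    from scipy.ndimage import gaussian_filter, zoom

    def fast_volume_blur(volume, sigma=8.0, factor=4):
        """Approximate a large-kernel Gaussian blur: downsample the volume,
        blur it with a proportionally smaller kernel, then upsample the result.
        Assumes the volume's dimensions are divisible by `factor` (512 is)."""
        small = zoom(volume, 1.0 / factor, order=1)        # downsample
        small = gaussian_filter(small, sigma / factor)     # blur the small copy
        return zoom(small, factor, order=1)                # upsample back up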

The resulting ionization data showed blurred ionization regions with soft edges, and a soft blending between nitrogen's reds and oxygen's greens. The left image below shows the nebula before we blurred the ionization fronts, and the right image shows the nebula after blurring.

Simulated emission nebula without a blurred ionization front
Simulated emission nebula with a blurred ionization front

Removing stars that are in the way

A nebula 5 light years across contains more than one star. The Orion Nebula, for instance, has literally hundreds of stars visible in Hubble Space Telescope images. While the story focused on one newborn star, the nebula needed to contain these other stars.

Using an N-body gravity simulation, AMNH produced a series of data files showing the motion of 100 stars within the region. The stars meandered through the nebula, interacting with each other to create classic sling-shot maneuvers that added greatly to the realism and correctness of the final visualization. The simulation also produced stars that ran into the camera.

With a simulation, the scientist sets up initial conditions then lets go — letting the physics govern what happens next. This works beautifully from a science perspective, but from a story-telling perspective it means you have no idea where stars are going to be as your flight path zooms the audience into the nebula. After rendering a first draft of the visualization, we discovered that a star had wandered in front of the viewpoint — filling the dome with a frighteningly enormous sun.

The fix was simple, though unscientific: delete the offending star. Each of the several thousand star simulation data files was edited to delete the star. Obviously this is wrong from a purist perspective, but the story is still right, and it is the story that must have priority.

Fixing a backward-spinning accretion disk

The accretion disk simulation modeled the collapse of matter into a spinning disk and the star at its center. The simulation's results were beautiful — they clearly showed the spinning disk. Unfortunately, the temporal resolution of the output files wasn't quite high enough to capture the high-speed motion of the innermost parts of the disk. The result: horrible temporal aliasing.

In animation, we sample a motion in time. The more samples we take per unit time the better we are able to represent the motion that takes place. If the motion is very fast, we need lots more samples or we risk missing interesting bits of the motion.

In old western movies the camera is often aimed at a speeding Conestoga wagon chased by bandits. The spoked wheels turn very rapidly, while the movie camera samples that motion rather coarsely at 24 frames per second. The coarse sample rate is insufficient to capture all of the fast wagon-wheel motion. Instead, a picture is taken when a spoke is at, say, 12 o'clock, then again at 9 o'clock (3/4 turn), then again at 6 o'clock (3/4 turn), then 3 o'clock (3/4 turn), and so on. When these frames are strung together we don't see the wheel turn forward from 12 o'clock to 9 o'clock — that is too big a jump for our brains to accept. Instead, we interpret the wheel as turning backwards from 12 o'clock to 9 o'clock, then 6 o'clock, then 3 o'clock. This backwards-wheel effect, called the wagon-wheel effect, is temporal aliasing. It is what we saw with the spinning accretion disk — the inner reaches of the disk appeared to spin backwards while the outer parts spun forward.
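The arithmetic behind the effect is easy to check. A tiny Python illustration (a toy, not project code): a spoke advancing 3/4 of a turn between frames lands at 12, 9, 6, and 3 o'clock, and the smallest motion consistent with those samples is a quarter turn backwards.

    # A spoke advancing 3/4 of a revolution per frame, sampled at 24 fps.
    true_step = 0.75                                    # revolutions per frame
    frames = [(i * true_step) % 1.0 for i in range(5)]  # sampled spoke angles

    def oclock(rev):
        """Clock position reached by turning `rev` revolutions clockwise from 12."""
        hour = int(round(rev * 12)) % 12
        return hour if hour else 12

    print([f"{oclock(r)} o'clock" for r in frames])       # 12, 9, 6, 3, 12 o'clock
    apparent = ((true_step + 0.5) % 1.0) - 0.5            # smallest equivalent step
    print(f"perceived motion: {apparent:+.2f} rev/frame") # -0.25 (backwards)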

While the simulation was absolutely correct, the temporal aliasing created a wrong interpretation of the action. Again, in service of telling the right story where accretion disks do not spin backwards in the disk center, we faked it — we blurred the disk in time. Blurring removed landmarks in the inner part of the disk, making it impossible to see it spinning in any direction.
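One straightforward way to blur in time is to average each frame with its neighbors along the time axis; the sketch below assumes that approach, and the window width is illustrative, not a value from the project.

    import numpy as np
    from scipy.ndimage import convolve1d

    def temporal_blur(frames, window=5):
        """Average each frame with its neighbors in time (a box filter along
        axis 0).  Fast-moving detail, such as the inner disk's rotation,
        smears out, so the eye no longer reads a false backwards spin there."""
        frames = np.asarray(frames, dtype=np.float32)
        kernel = np.ones(window, dtype=np.float32) / window
        return convolve1d(frames, kernel, axis=0, mode='nearest')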

Making jets the wrong (right) way

The last simulation in the story modeled the growth of stellar jets — giant spurts of glowing gas ejected from the poles of massive stars as they pull in matter from a surrounding accretion disk. As with all of the simulations, the science was right and the data very interesting. But there were problems.

The HH-111 Stellar Jet
Within a nebula, a newborn star pulls in the dust and gas of its neighborhood. That neighborhood is clumpy — it was one such clump that led to the star forming in the first place. So when the star ejects matter in stellar jets, those jets plow into the nearby clumps. Like spraying a water hose at loose dirt on a driveway, the bigger clumps divert the flow rather than yield to it. For stellar jets, the effect is to bend the jet into turbulent wiggles, like those in HH-111 to the right.

Simulated jets that look like fountain pens
The simulation data available, however, did not model this kind of clumpy environment. The jet in the data spurted straight out in a too-perfect line. Compounding the problem, the simulation was 2D, not 3D. To create a 3D jet, we tried spinning the 2D cross-section of the simulation into a surface of revolution. The result looked like a glowing fountain pen, not like a jet. We tried adding randomness to the revolved cross-section, and got a chewed-up glowing fountain pen that still didn't look like a jet. The images on the left show the 2D cross-section spun into 3D, without and with turbulence added to try to wiggle it a bit.

Ultimately, we were forced to abandon the simulation data. To create the jets, we used the HH-111 image and painted a similar jet as it might look from 90 degrees to the side. With these two images arranged in an intersecting “+”, we interpolated between them for all points within a cylinder containing the jet. The result was a 3D jet that we replicated for the north and south poles of the star.
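Here is a sketch of that interpolation in Python/NumPy. The weighting scheme and names are our own illustrative choices; the article describes only the general idea of blending two orthogonal images around the jet's axis.

    import numpy as np

    def jet_volume(img_a, img_b, radius):
        """Build a 3D jet volume from two 2D images that share the jet's axis
        and are arranged like a '+'.  Each voxel samples both images at its
        (axial, radial) position and blends them by how close the voxel sits
        to each image's plane.  Loops kept explicit for clarity; a production
        version would vectorize."""
        length, width = img_a.shape               # rows run along the jet axis
        vol = np.zeros((length, width, width), dtype=np.float32)
        c = (width - 1) / 2.0                     # jet axis through the center
        for z in range(length):
            for y in range(width):
                for x in range(width):
                    dx, dy = x - c, y - c
                    r = np.hypot(dx, dy)
                    if r > radius:
                        continue                  # outside the jet's cylinder
                    # Column each image would use at this radial offset.
                    col_a = min(max(int(round(c + np.sign(dx) * r)), 0), width - 1)
                    col_b = min(max(int(round(c + np.sign(dy) * r)), 0), width - 1)
                    # Weight image A more when the voxel is near A's plane (dy ~ 0).
                    w_a = abs(dx) / (abs(dx) + abs(dy) + 1e-9)
                    vol[z, y, x] = w_a * img_a[z, col_a] + (1 - w_a) * img_b[z, col_b]
        return vol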

A case can be made that using the HH-111 images is more “right” than using the simulation, with its awkward assumption of a non-clumpy neighborhood. Either way, the story told with the jets is right — they do look like that, and they do spurt out the poles of some stars as they form. The images below show a few frames from the approach to the accretion disk and its jets.

Image sequence showing the approach to the new jets

Rendering it all

At last the data was rendered to images. With the data staged on disk, SDSC's IBM SP2 supercomputer rendered some 116,000 images over the course of several test runs and one final run using all 1,152 processors. The total compute time came to 8.5 CPU years, and the total data involved was about 7 terabytes. Here are a few frames from the approach to the interior of the nebula.

Emission nebula frame 1
Emission nebula frame 2
Emission nebula frame 3
Emission nebula frame 4
Emission nebula frame 5
Emission nebula frame 6
Emission nebula frame 7
Emission nebula frame 8


“Don't be a slave to the data” is perhaps the most important lesson here. It would be nice if reality could be simulated to perfection — but we can't do that yet. Instead, there clearly still exists a substantial gap between what we can simulate and what reality really looks like. If you're trying to tell a story about simulations, then by all means stick to the data. But if you're telling a story about reality, as we were, then the gap between data and reality must be bridged somehow. Faking it is essential.

In this work we sometimes faked the data, but always with a very clear understanding of what it should look like based upon the physics and a wealth of knowledge from scientists on the team. We always went as far as we could with the simulation data, and only faked it when there was no other practical choice. We do not claim that the data is right, but we are confident that the story we told is as right as can be told using today's understanding of the universe.


This work was a big project, contributed to by dozens of people.

  • The show's producer was Anthony Braun at AMNH.
  • Art and technical direction were provided by Carter Emmart and Ryan Wyatt at AMNH.
  • The interstellar medium simulation was the work of Mordecai-Mark Mac Low at AMNH, together with Li, Norman, Heitsch, and Oishi.
  • The ionization simulation was the work of Mordecai-Mark Mac Low, together with Clay Budin at AMNH.
  • The star motion simulation was produced by Ryan Wyatt at AMNH.
  • The accretion disk simulation was the work of John Hawley at the University of Virginia.
  • The jet simulation that we wish we'd been able to use was the work of Adam Frank at the University of Rochester in New York.
  • The flight path into the nebula and past the accretion disk was the work of Stuart Leavy and Bob Patterson at NCSA, with contributions by Clay Budin at AMNH.
  • Volume data manipulation and visualization was done by David R. Nadeau at SDSC.
  • The volume renderer is based upon algorithms by Jon Genetti at the University of Alaska, Fairbanks, and updated for the project by Erik Engquist and David R. Nadeau at SDSC.

This work was funded by NASA through the Hayden Planetarium at AMNH.

