Wednesday, July 31, 2013

The End is Just the Beginning

Hubble's last observation for CANDELS is scheduled for August 10, the end of next week. Not counting the images taken next week, Hubble has taken 3,458 pictures for CANDELS over the past three years. So now what?

The upcoming schedule for the last few CANDELS images.

Turning photons into numbers


Hubble's cameras have detectors that capture the minute amounts of light from distant stars and galaxies and convert that light into a tiny electrical signal. That electrical signal is recorded digitally as a set of ones and zeros. Hubble beams this data back to Earth with a radio antenna every few hours. It will typically be several hours to a day before we have the data on our computers.

Turning the numbers into images


Even with fancy computers and software, it takes us about a month to take the raw images that are radioed back to Earth from Hubble and carefully assemble them into science-quality mosaics. This process involves identifying and masking artifacts left by charged particles in the solar wind, or trails of passing satellites or space junk, and subtracting off any scattered light that reflected into the camera from the Earth's surface. We typically take several raw images of each patch of the sky, shifting the telescope a tiny bit between images, so we can almost always tell what is a real star or galaxy and what is an artifact. We've been keeping up pretty well with the influx of data. If you are patient, you can download our processed images from the Mikulski Archive web site.
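To see why several exposures of the same patch of sky help, here is a toy sketch in Python. It is not our actual pipeline: it assumes the exposures have already been aligned, and it uses a simple pixel-by-pixel median, which is only one of several ways to reject artifacts. The tiny 3x3 "images" and the function name `median_combine` are made up for illustration.

```python
from statistics import median

def median_combine(exposures):
    """Combine aligned exposures pixel by pixel using the median."""
    n_rows = len(exposures[0])
    n_cols = len(exposures[0][0])
    return [
        [median(exp[r][c] for exp in exposures) for c in range(n_cols)]
        for r in range(n_rows)
    ]

# Three hypothetical, already-aligned exposures of the same patch of sky.
# The central pixel holds a real source (value ~10) in every exposure.
exposures = [
    [[1, 1, 1], [1, 10, 1], [1, 1, 1]],
    [[1, 1, 1], [1, 11, 1], [1, 1, 90]],  # 90 = artifact in one frame only
    [[1, 1, 1], [1, 10, 1], [1, 1, 1]],
]
stacked = median_combine(exposures)
```

The real source survives because it shows up in every exposure, while the artifact in the corner appears in only one frame and is voted out by the median.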

Turning the images back into numbers


In the end, we expect to detect about 250,000 galaxies in the CANDELS images. To use the images to try to make progress in understanding galaxy evolution, we need to measure the properties of the galaxies: their brightness, their colors, their sizes, and their shapes. We need to use those measurements to try to infer something more fundamental about the galaxies: their star-formation rates, their stellar masses, their ages, the amount of interstellar dust that they contain, and whether or not they harbor a central black hole.

In other words, we need to turn the images back into numbers.

We have a lot of specialized software to help us do this. To detect the galaxies, we use a customized version of a computer program called SExtractor, which identifies faint smudges of light and tries to make a semi-intelligent decision about whether two adjacent smudges are part of a single galaxy or two separate galaxies. That is, it segments the images into different regions, assigning most of the pixels to "background sky" and a few percent of them to separately-detected stars or galaxies.  It took quite a bit of work to get to the point where we were reasonably satisfied with how it was doing this. Nonetheless, there are still a few percent of the sources that have either been poorly "deblended" by the software, or are not real (the most common offender being scattered light from a bright star). At this stage, we can flag most of the artifacts. We're stuck with SExtractor's image segmentation for now, which forces us to worry about the issue for almost every CANDELS paper. If anyone would like a nice image-processing challenge, improving the segmentation step -- using all of the available color information from the multiple images -- would be a big step forward.
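For readers who want a feel for what "segmenting" an image means, here is a deliberately minimal Python sketch. It is not SExtractor's algorithm: it only thresholds the image and groups touching bright pixels together, and it has no deblending step at all, so two overlapping galaxies would simply be merged into one. The function name, the threshold, and the tiny test image are all assumptions made for illustration.

```python
from collections import deque

def segment(image, threshold):
    """Label connected groups of pixels above threshold; 0 = background sky."""
    n_rows, n_cols = len(image), len(image[0])
    labels = [[0] * n_cols for _ in range(n_rows)]
    next_label = 0
    for r0 in range(n_rows):
        for c0 in range(n_cols):
            if image[r0][c0] <= threshold or labels[r0][c0]:
                continue
            # Found an unlabeled bright pixel: start a new source here
            # and spread the label to every bright pixel touching it.
            next_label += 1
            labels[r0][c0] = next_label
            queue = deque([(r0, c0)])
            while queue:
                r, c = queue.popleft()
                for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    rr, cc = r + dr, c + dc
                    if (0 <= rr < n_rows and 0 <= cc < n_cols
                            and image[rr][cc] > threshold
                            and not labels[rr][cc]):
                        labels[rr][cc] = next_label
                        queue.append((rr, cc))
    return labels, next_label

# Hypothetical image with two separate smudges of light.
image = [
    [0, 0, 0, 0, 0, 0],
    [0, 5, 6, 0, 0, 0],
    [0, 4, 0, 0, 7, 0],
    [0, 0, 0, 0, 8, 0],
]
labels, n_sources = segment(image, threshold=2)
```

The hard part, and the part this sketch skips entirely, is deciding when one connected smudge should be split into two galaxies.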

The image segmentation step. The picture to the left shows a small slice of the CANDELS image, with two bright galaxies and a lot of fainter ones. The image to the right illustrates how SExtractor segments the image into different galaxies, which sometimes overlap. Human judgment doesn't always agree with what this software does. Sometimes it becomes more obvious how to break up the objects if you look at full color images instead of black & white images. SExtractor only works in black & white.

SExtractor not only detects the galaxies, it measures their sizes, shapes and brightnesses. It does this very quickly and reasonably precisely. We have done thousands of experiments inserting artificial galaxies into the images to quantify the accuracy and precision of these measurements. These experiments allow us to determine some statistical corrections to SExtractor's measurements, so that we can infer more accurate brightnesses of the galaxies (for example).
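The logic of these artificial-galaxy experiments can be sketched in a few lines of Python. Everything specific here is an assumption: `measure_flux` is a hypothetical stand-in for a real photometry code (it is made to lose 10% of the light and add some noise), and the flux range and noise level are invented. The point is only the procedure: inject sources of known brightness, measure them, and use the average recovered fraction as a statistical correction.

```python
import random

def measure_flux(true_flux, rng):
    """Hypothetical stand-in for a photometry code: it misses 10% of the
    light on average and adds random noise."""
    return 0.9 * true_flux + rng.gauss(0.0, 0.5)

rng = random.Random(42)

# "Inject" 5000 artificial galaxies with known brightnesses and measure them.
true_fluxes = [rng.uniform(20.0, 100.0) for _ in range(5000)]
measured = [measure_flux(f, rng) for f in true_fluxes]

# Average ratio of recovered to injected flux, and the implied correction.
recovered_fraction = (
    sum(m / t for m, t in zip(measured, true_fluxes)) / len(true_fluxes)
)
correction = 1.0 / recovered_fraction

def corrected_flux(measured_flux):
    """Apply the statistical correction to a measured brightness."""
    return measured_flux * correction
```

In practice the correction would depend on the galaxy's brightness, size, and location in the image, so a single number like this is the simplest possible case.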

Having measured the Hubble images, we next need to make use of data from other observatories. Primarily these are large ground-based telescopes like the VLT in Chile, along with the Spitzer infrared space telescope. These images are not nearly as sharp as the Hubble images, so unfortunately many of the galaxies are blended together. However, the Hubble images tell us where almost all of the galaxies are. So we can cut out the individual sources from the Hubble images, blur them to match the image quality of the other telescope, and then ask the computer to tell us what combination of brightnesses of the blended sources best reproduces the blended image. We use a program called TFIT to do this. As was the case for SExtractor, we have done extensive tests with artificial galaxies to convince ourselves that TFIT is doing the measurements correctly. By and large we are satisfied that it is doing an excellent job, but we have a relatively long wish-list of things that we would like to improve, both to remove systematic biases at the few percent level in the brightnesses of all the galaxies, and to deal with the very rare "problem cases" where the answers don't make sense.
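The "what combination of brightnesses best reproduces the blended image" step is, at heart, a least-squares fit. Here is a minimal Python sketch for the simplest case of two overlapping sources. This is not TFIT's code: it assumes the blurred templates are already in hand, treats the pixels as a flat list, and ignores noise weighting and the sky background. The template values and the function name `fit_two_fluxes` are invented for illustration.

```python
def fit_two_fluxes(blended, template_a, template_b):
    """Least-squares fluxes (fa, fb) so that fa*A + fb*B best matches
    the blended image. Solves the 2x2 normal equations directly."""
    saa = sum(a * a for a in template_a)
    sbb = sum(b * b for b in template_b)
    sab = sum(a * b for a, b in zip(template_a, template_b))
    sad = sum(a * d for a, d in zip(template_a, blended))
    sbd = sum(b * d for b, d in zip(template_b, blended))
    det = saa * sbb - sab * sab
    fa = (sad * sbb - sbd * sab) / det
    fb = (sbd * saa - sad * sab) / det
    return fa, fb

# Hypothetical blurred templates of two neighboring galaxies, each
# normalized to unit total flux. They overlap in the middle pixels.
template_a = [0.10, 0.30, 0.40, 0.15, 0.05, 0.00]
template_b = [0.00, 0.05, 0.20, 0.40, 0.25, 0.10]

# A noiseless blended image built from known fluxes of 30 and 12.
blended = [30.0 * a + 12.0 * b for a, b in zip(template_a, template_b)]

flux_a, flux_b = fit_two_fluxes(blended, template_a, template_b)
```

With thousands of sources the same idea becomes a large matrix problem, and with real noise the answers come with uncertainties that grow as the templates overlap more.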

This animation starts with an X-ray image from the Chandra telescope, showing two sources which probably harbor central black holes. It then transitions to the Hubble images, where you can see that some of the galaxies have spiral features and others don't. It then transitions to the infrared images from the Spitzer telescope, and finally the Herschel observatory. You can see that the resolution of the infrared images is much poorer than the Hubble images -- the galaxies are all blended together. Nevertheless, you can make out that some galaxies are "bluer" than others at infrared wavelengths. Looking at the last image, you can convince yourself that the brightest source of far-infrared radiation in this image (which comes primarily from heated dust) is probably the galaxy just below the center, which is also an X-ray source.

The output of this process is a "Multi-wavelength Photometry Catalog," which is one of the most useful products for scientific study. We are striving to make this as reliable as possible because much of the science depends on that.

The multi-wavelength catalogs can be used to estimate "photometric redshifts," which tell us roughly how far away each galaxy is. The brightnesses and colors of each galaxy can be used to infer the stellar mass of each galaxy, the star-formation rates, dust content, and ages of each galaxy.  Our simulations tell us that the stellar mass estimates are pretty accurate (generally to within about a factor of two of the truth), but the estimates for the other quantities are pretty shaky. A lot of work is going into trying to quantify the biases and uncertainties, and perhaps find some way to improve the estimates.
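A photometric redshift is usually found by comparing the measured brightnesses to a model spectrum shifted to many trial redshifts and keeping the best match. The Python sketch below is a cartoon of that idea under heavy assumptions: the "spectrum" is a made-up step function with a break at 4000 Angstroms, the six band wavelengths are invented, and only one template is tried. Real codes use libraries of realistic templates and the actual filter curves.

```python
def template_flux(rest_wavelength):
    """Toy galaxy spectrum (an assumption): faint blueward of a break at
    4000 Angstroms, brighter redward of it."""
    return 0.2 if rest_wavelength < 4000.0 else 1.0

def model_fluxes(z, band_wavelengths):
    """Template brightness in each band for a galaxy at redshift z."""
    return [template_flux(w / (1.0 + z)) for w in band_wavelengths]

def photo_z(observed, errors, band_wavelengths, z_grid):
    """Return the grid redshift with the smallest chi-square."""
    best_z, best_chi2 = None, float("inf")
    for z in z_grid:
        model = model_fluxes(z, band_wavelengths)
        # Best-fitting overall brightness scale at this trial redshift.
        scale = (sum(o * m / e ** 2 for o, m, e in zip(observed, model, errors))
                 / sum(m * m / e ** 2 for m, e in zip(model, errors)))
        chi2 = sum(((o - scale * m) / e) ** 2
                   for o, m, e in zip(observed, model, errors))
        if chi2 < best_chi2:
            best_z, best_chi2 = z, chi2
    return best_z

# Hypothetical band wavelengths in Angstroms, and a fake galaxy at z = 1.5.
bands = [4500.0, 6000.0, 8000.0, 10500.0, 12500.0, 16000.0]
observed = [5.0 * f for f in model_fluxes(1.5, bands)]
errors = [0.1] * len(bands)
z_grid = [i * 0.1 for i in range(31)]  # trial redshifts from 0.0 to 3.0

best_z = photo_z(observed, errors, bands, z_grid)
```

With only six bands, every trial redshift that puts the break between the same two bands fits equally well, so the sketch can only say that the galaxy lies somewhere between redshift 1.0 and 1.7. That is the sense in which these redshifts are rough.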

The other type of measurement is galaxy shapes and sizes. We are going at this in a whole variety of ways. We use a computer program called GALFIT to fit a smooth light profile to each galaxy and extract some basic numbers that characterize the size and shape of the galaxy and give us a measurement of how centrally concentrated the light is in each galaxy. We have some other programs that make estimates of concentration without assuming a smooth light profile, and that measure the asymmetry of the images. Teams of people both in CANDELS and in Galaxy Zoo are inspecting the images and classifying the galaxies into various categories. We also have some software that tries to identify separate clumps of light within each galaxy. We need to calibrate all those measurements by inserting artificial galaxies into the images. This has been done for GALFIT, but it is just getting started for the other measurements.
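As an example of a measurement that does not assume a smooth light profile, here is a minimal Python sketch of an asymmetry index: rotate the galaxy cutout by 180 degrees, subtract it from the original, and compare the leftover light to the total. This is one common convention, not necessarily the one our software uses (normalizations vary, and real codes also subtract the background and search for the best rotation center). The cutouts are invented.

```python
def asymmetry(image):
    """Sum of |I - I_rotated_180| divided by the sum of |I|, for a
    cutout assumed to be centered on the galaxy."""
    # Reversing the row order and each row rotates the image by 180 degrees.
    rotated = [row[::-1] for row in image[::-1]]
    diff = sum(abs(a - b)
               for row, rot in zip(image, rotated)
               for a, b in zip(row, rot))
    total = sum(abs(a) for row in image for a in row)
    return diff / total

# A perfectly symmetric toy galaxy...
smooth = [
    [0, 1, 0],
    [1, 4, 1],
    [0, 1, 0],
]
# ...and the same galaxy with a bright clump in one corner.
clumpy = [
    [0, 1, 3],
    [1, 4, 1],
    [0, 1, 0],
]
a_smooth = asymmetry(smooth)
a_clumpy = asymmetry(clumpy)
```

The symmetric galaxy scores exactly zero, while the clump pushes the index up because it has no counterpart on the opposite side.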

Finally, there are measurements of correlations between galaxies. Galaxies have a propensity to cluster together. We can measure this quantitatively, but we have to carefully account for the fact that the images don't all have the same exposure time. In some patches of sky we can detect fainter galaxies than in other patches of sky. Once again, we resort to inserting artificial galaxies into the images and recovering them to quantify how our detection limits vary across the full survey. The clustering estimates make use of the photometric redshifts, so we also need to do lots of simulations to understand the effect of photometric redshift errors on the clustering measurements.
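A simple way to put a number on clustering is to count pairs of galaxies at a given separation and compare with the count for a random, unclustered catalog covering the same area. The Python sketch below uses the simplest such estimator (data pairs over random pairs, minus one). It is an illustration under stated assumptions, not our analysis code: the galaxy positions are fake, and the randoms are spread uniformly, whereas for a real survey the random catalog would have to follow the varying detection limits described above.

```python
import math
import random

def count_pairs(points, r_min, r_max):
    """Number of distinct pairs with separation in [r_min, r_max)."""
    n = 0
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if r_min <= math.dist(points[i], points[j]) < r_max:
                n += 1
    return n

def correlation(data, randoms, r_min, r_max):
    """Excess probability of finding a pair, relative to random."""
    dd = count_pairs(data, r_min, r_max) / (len(data) * (len(data) - 1) / 2)
    rr = count_pairs(randoms, r_min, r_max) / (len(randoms) * (len(randoms) - 1) / 2)
    return dd / rr - 1.0

rng = random.Random(1)

# Fake clustered "galaxies": 100 groups of 3, scattered around group centers.
centers = [(rng.random(), rng.random()) for _ in range(100)]
data = [(cx + rng.gauss(0.0, 0.01), cy + rng.gauss(0.0, 0.01))
        for cx, cy in centers for _ in range(3)]

# Unclustered comparison catalog over the same unit square.
randoms = [(rng.random(), rng.random()) for _ in range(1000)]

w_small = correlation(data, randoms, 0.0, 0.03)
```

A value well above zero at small separations means that galaxies are found close together more often than chance would predict.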

So you have a bunch of numbers. Now what?


This is where the fun really starts. This is where we can try to give rigorous answers to the scientific questions posed in our original proposal.  We would like to know, for example, how galaxies build up their masses over time. To address this, we can collect the galaxies together in different redshift slices and we can estimate how their stellar masses change as the universe gets older. We can compare these evolving stellar masses to our estimates of the star formation rates to see whether they are consistent. If they aren't, then we must be doing something wrong. Either we are missing some galaxies (for example, because they are obscured by dust), or we are not measuring the mass or the star-formation rates correctly. This kind of consistency check is essential. It's how we gain confidence that we really understand what we are seeing.
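The consistency check can be written down as simple bookkeeping: add up the stars formed between two epochs and compare with the change in stellar mass. The Python sketch below does this with invented numbers. It assumes a factor-of-two tolerance, and it ignores the mass that stars return to the gas when they die, which a real comparison would have to include.

```python
def mass_formed(times_gyr, sfr_msun_per_yr):
    """Trapezoid-rule integral of the star-formation rate over time,
    giving the mass of stars formed in solar masses."""
    total = 0.0
    for i in range(1, len(times_gyr)):
        dt_yr = (times_gyr[i] - times_gyr[i - 1]) * 1.0e9
        total += 0.5 * (sfr_msun_per_yr[i] + sfr_msun_per_yr[i - 1]) * dt_yr
    return total

def is_consistent(mass_start, mass_end, formed, tolerance=2.0):
    """True if the observed mass growth matches the integrated star
    formation to within a multiplicative tolerance."""
    observed = mass_end - mass_start
    if observed <= 0 or formed <= 0:
        return False
    ratio = observed / formed
    return 1.0 / tolerance <= ratio <= tolerance

# Hypothetical measurements at cosmic ages of 1, 2 and 3 billion years.
times_gyr = [1.0, 2.0, 3.0]
sfr = [10.0, 20.0, 30.0]  # solar masses per year (made-up values)

formed = mass_formed(times_gyr, sfr)
ok = is_consistent(mass_start=1.0e10, mass_end=4.5e10, formed=formed)
```

If a check like this fails, the next job is to figure out which side of the ledger is wrong.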

We can ask whether galaxies that have active nuclei -- black holes that are surrounded by hot gas that emits X-rays -- look any different from galaxies with the same stellar mass that don't emit X-rays. So far, somewhat surprisingly, the answer is that they look the same. Now that we have all of the data in hand, we can look more closely with better statistics. It's possible that some classes of galaxies with active nuclei (perhaps the brightest ones) look different, for example.

We can look at how galaxies grow in size. This has been a long-standing puzzle. We know that galaxies were smaller in the past. For galaxies that are continually forming stars, this is not really a problem -- the later stars presumably form in the outskirts, making the galaxies bigger over time. However, galaxies that had already stopped forming stars (become quiescent) when the universe was only 3-5 billion years old are much smaller than quiescent galaxies today. The suspicion is that such galaxies merge together, becoming bigger without forming many more stars. But that suspicion has been hard to verify because we simply haven't had surveys that are large enough to accurately measure the number of galaxies that are in pairs or in the process of merging.

There are lots and lots of other questions, outlined in our original science goals for the survey. Sometimes the work involves comparing directly to theoretical predictions -- for example there are some beautiful predictions for the evolution of star-forming clumps that we can begin to test by dissecting the images. Sometimes there are unexpected discoveries.

So even though the last images are going to be here next week, we still have about two full years of work to make all the measurements, produce the catalogs, and really dig in and try to make sense of what we are seeing.

In communicating our findings to fellow astronomers and to the general public, we often strive to find a very clear diagram or figure to make the statistical evidence readily apparent. In fact, these diagrams are often the route to fame for an astronomer. Every beginning astronomy student learns about the Hubble diagram or the Hertzsprung-Russell diagram. So maybe in the end, having converted the photons into numbers, the numbers into images, and the images back into numbers, we need to very cleverly turn these numbers back into images. Yes, it seems silly, but that's often where we get to exercise our scientific creativity, and it's also often how we gain the most insight into the workings of the universe.
