Adaptive moments
Adaptive moments are the second moments of the object intensity, measured using a particular scheme designed to have near-optimal signal-to-noise ratio. Moments are measured using a radial weight function iteratively adapted to the shape (ellipticity) and size of the object. This elliptical weight function has a signal-to-noise advantage over axially symmetric weight functions. In principle there is an optimal (in terms of signal-to-noise) radial shape for the weight function, which is related to the light profile of the object itself. In practice, a Gaussian with size matched to that of the object is used, and is nearly optimal. Details can be found in Bernstein & Jarvis (2002). The outputs included in the SDSS data release are the following:
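The iterative adaptation can be illustrated with a toy sketch (not the production algorithm of Bernstein & Jarvis 2002): a Gaussian weight is repeatedly re-matched to the measured second moments until it converges to the shape and size of the object. The function name, the starting covariance, and the simple moment-doubling update are all assumptions of this illustration.

```python
import numpy as np

def adaptive_moments(img, n_iter=30):
    """Toy adaptive second moments: iterate an elliptical Gaussian weight
    matched to the measured (weighted) moments of the image."""
    ny, nx = img.shape
    y, x = np.mgrid[0:ny, 0:nx].astype(float)
    cx, cy = nx / 2.0, ny / 2.0          # starting centroid guess
    cov = np.eye(2) * 4.0                # starting weight covariance (pixels^2)
    for _ in range(n_iter):
        inv = np.linalg.inv(cov)
        dx, dy = x - cx, y - cy
        w = np.exp(-0.5 * (inv[0, 0] * dx**2
                           + 2.0 * inv[0, 1] * dx * dy
                           + inv[1, 1] * dy**2))
        s = np.sum(w * img)
        cx, cy = np.sum(w * img * x) / s, np.sum(w * img * y) / s
        dx, dy = x - cx, y - cy
        mxx = np.sum(w * img * dx**2) / s
        myy = np.sum(w * img * dy**2) / s
        mxy = np.sum(w * img * dx * dy) / s
        # For a Gaussian object, the weighted moments are half the object's
        # own moments once the weight matches the object, hence the factor 2.
        cov = 2.0 * np.array([[mxx, mxy], [mxy, myy]])
    return cov, (cx, cy)
```

For a round Gaussian of variance sigma^2 the adapted covariance converges to sigma^2 times the identity; the production code additionally handles sky noise, PSF corrections, and convergence testing.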
The asinh magnitude
Magnitudes within the SDSS are expressed as inverse hyperbolic sine
(or "asinh") magnitudes, described in detail by Lupton, Gunn, & Szalay (1999). They are sometimes
referred to informally as luptitudes. The transformation
from linear flux measurements to asinh magnitudes is designed to be
virtually identical to the standard astronomical magnitude at high
signal-to-noise ratio, but to behave reasonably at low signal-to-noise
ratio and even at negative values of flux, where the logarithm in the
Pogson magnitude
fails. This allows us to measure a flux even in the absence of a
formal detection; we quote no upper limits in our photometry. m=-(2.5/ln10)*[asinh((f/f0)/2b)+ln(b)]. Here, f0 is given by the classical zero point of the magnitude scale, i.e., f0 is the flux of an object with conventional magnitude of zero. The quantity b is measured relative to f0, and thus is dimensionless; it is given in the table of asinh softening parameters (Table 21 in the EDR paper), along with the asinh magnitude associated with a zero flux object. The table also lists the flux corresponding to 10f0, above which the asinh magnitude and the traditional logarithmic magnitude differ by less than 1% in flux. | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Astrometry
A detailed description of the astrometric calibration is given in Pier et al. (2003) (AJ, or astro-ph/0211375). Portions of that discussion are summarized here, and on the astrometry quality overview page. The r photometric CCDs serve as the astrometric reference CCDs for the SDSS. That is, the positions for SDSS objects are based on the r centroids and calibrations. The r CCDs are calibrated by matching up bright stars detected by SDSS with existing astrometric reference catalogs. One of two reduction strategies is employed, depending on the coverage of the astrometric catalogs:
The r CCDs are therefore calibrated directly against the primary astrometric reference catalog. Frames uses the astrometric calibrations to match up detections of the same object observed in the other four filters. The accuracy of the relative astrometry between filters can thus significantly impact Frames, in particular the deblending of overlapping objects, photometry based on the same aperture in different filters, and the detection of moving objects. To minimize the errors in the relative astrometry between filters, the u, g, i, and z CCDs are calibrated against the r CCDs. Each drift scan is processed separately; all six camera columns are processed in a single reduction.

In brief, stars detected on the r CCDs (if calibrating against UCAC), or stars detected on the astrometric CCDs and transformed to r coordinates (if calibrating against Tycho-2), are matched to catalog stars. Transformations from r pixel coordinates to catalog mean place (CMP) celestial coordinates are derived using a running-means least-squares fit to a focal plane model, using all six r CCDs together to solve for both the telescope tracking and the r CCDs' focal plane offsets, rotations, and scales, combined with smoothing-spline fits to the intermediate residuals. These transformations, comprising the calibrations for the r CCDs, are then applied to the stars detected on the r CCDs, converting them to CMP coordinates and creating a catalog of secondary astrometric standards. Stars detected on the u, g, i, and z CCDs are then matched to this secondary catalog, and a similar fitting procedure (each CCD is fitted separately) is used to derive transformations from the pixel coordinates of the other photometric CCDs to CMP celestial coordinates, comprising the calibrations for the u, g, i, and z CCDs.

Note: At the edges of pixels, the quantities objc_rowc and objc_colc take integer values.
Image Classification
This page provides detailed descriptions of various morphological
outputs of the photometry pipelines. We also provide discussion of
some methodology; for details of the Photo pipeline processing please
visit the Photo pipeline
page. Other photometric outputs, specifically the various
magnitudes, are described on the photometry
page.
The frames pipeline also provides several characterizations of the shape and morphology of an object.
Star/Galaxy Classification
In particular, Lupton et al. (2001a) show that the following simple cut works at the 95% confidence level for our data to r=21 and even somewhat fainter:

psfMag - ((deV_L > exp_L) ? deVMag : expMag) > 0.145

If satisfied, type is set to GALAXY for that band; otherwise, type is set to STAR. The global type objc_type is set according to the same criterion, applied to the summed fluxes from all bands in which the object is detected. Experimentation has shown that simple variants on this scheme, such as defining galaxies as those objects classified as such in any two of the three high signal-to-noise ratio bands (namely, g, r, and i), work better in some circumstances. This scheme occasionally fails to distinguish pairs of stars with separation small enough (<2") that the deblender does not split them; it also occasionally classifies Seyfert galaxies with particularly bright nuclei as stars. Further information may be used to refine the star-galaxy separation, depending on the scientific application. For example, Scranton et al. (2001) advocate applying a Bayesian prior to the above difference between the PSF and exponential magnitudes, depending on seeing and using prior knowledge about the counts of galaxies and stars as a function of magnitude.
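The cut can be written out explicitly; the variable names below are hypothetical stand-ins for the catalog quantities:

```python
def classify_band(psf_mag, dev_mag, exp_mag, dev_l, exp_l):
    """Per-band star/galaxy cut of Lupton et al. (2001a): compare the PSF
    magnitude against the better-fitting of the two model magnitudes."""
    model_mag = dev_mag if dev_l > exp_l else exp_mag
    return "GALAXY" if psf_mag - model_mag > 0.145 else "STAR"
```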
Radial Profiles
When converting the profMean values to a local surface
brightness, it is not the best approach to assign the mean
surface brightness to some radius within the annulus and then linearly
interpolate between radial bins. Do not use smoothing
splines, as they will not go through the points in the cumulative
profile and thus (obviously) will not conserve flux. What frames
does, e.g., in determining the Petrosian ratio, is to fit a taut spline to the
cumulative profile and then differentiate that spline fit,
after transforming both the radii and cumulative profiles with asinh
functions. We recommend doing the same here.
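A rough numerical version of this recommendation, using SciPy's ordinary interpolating cubic spline as a stand-in for the taut spline used by frames; the softening scales soft_r and soft_f are arbitrary choices of this sketch:

```python
import numpy as np
from scipy.interpolate import CubicSpline

def surface_brightness(radii, cum_flux, r_eval, soft_r=0.1, soft_f=1.0):
    """Local surface brightness at r_eval from a cumulative profile:
    spline the asinh-transformed cumulative flux against asinh-transformed
    radius, differentiate the spline, and undo the transforms."""
    t = np.arcsinh(np.asarray(radii) / soft_r)     # transformed radius
    c = np.arcsinh(np.asarray(cum_flux) / soft_f)  # transformed cumulative flux
    spl = CubicSpline(t, c)
    t_eval = np.arcsinh(r_eval / soft_r)
    dc_dt = spl(t_eval, 1)                         # spline derivative
    dt_dr = 1.0 / (soft_r * np.sqrt(1.0 + (r_eval / soft_r) ** 2))
    dF_dc = soft_f * np.cosh(spl(t_eval))
    dF_dr = dc_dt * dt_dr * dF_dc                  # chain rule
    return dF_dr / (2.0 * np.pi * r_eval)          # annular flux -> local SB
```

Unlike a taut spline, an interpolating cubic spline can overshoot on sharply curved profiles, but it does pass through the cumulative-profile points and therefore conserves flux at the measured radii.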
Surface Brightness & Concentration Index

It turns out that the ratio of petroR50 to petroR90, the so-called "inverse concentration index," is correlated with morphology (Shimasaku et al. 2001, Strateva et al. 2001). Galaxies with a de Vaucouleurs profile have an inverse concentration index of around 0.3; exponential galaxies have an inverse concentration index of around 0.43. Thus, this parameter can be used as a simple morphological classifier. An important caveat when using these quantities is that they are not corrected for seeing. This causes the surface brightness to be underestimated, and the inverse concentration index to be overestimated, for objects of size comparable to the PSF. The amplitudes of these effects, however, are not yet well characterized.
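As a toy classifier (the 0.37 boundary is a hypothetical midpoint between the two typical values quoted above, not an SDSS-defined threshold):

```python
def crude_morphology(petro_r50, petro_r90):
    """Classify on the inverse concentration index petroR50/petroR90:
    ~0.3 for de Vaucouleurs profiles, ~0.43 for exponential ones."""
    inv_c = petro_r50 / petro_r90
    return "early-type (deV-like)" if inv_c < 0.37 else "late-type (exp-like)"
```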
Model Fit Likelihoods and Parameters

f(deV_L) = deV_L / [deV_L + exp_L + star_L]

and similarly for f(exp_L) and f(star_L). A fractional likelihood greater than 0.5 for any of these three profiles is generally a good threshold for object classification. This works well in the range 18<r<21.5; at the bright end, the likelihoods have a tendency to underflow to zero, which makes them less useful. In particular, star_L is often zero for bright stars. For future data releases we will incorporate improvements to the model fits to give more meaningful results at the bright end.
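The fractional likelihoods are straightforward to compute from the three model likelihoods (a sketch; it does not guard against the underflow case where all three are zero):

```python
def fractional_likelihoods(dev_l, exp_l, star_l):
    """Normalize the three model likelihoods so they sum to one."""
    total = dev_l + exp_l + star_l
    return {"deV": dev_l / total, "exp": exp_l / total, "star": star_l / total}
```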
Ellipticities
The first method measures flux-weighted second moments, defined as:

M_xx = <x^2/r^2>, M_yy = <y^2/r^2>, M_xy = <xy/r^2>

When the object's isophotes are self-similar ellipses, the combinations Q = M_xx - M_yy and U = 2 M_xy measure the ellipticity and position angle of the object.
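A sketch of flux-weighted second moments for a pixelated image, assuming the centroid is already known and forming the combinations Q = M_xx - M_yy and U = 2 M_xy (definitions assumed to follow the EDR paper; the treatment of the central pixel, where x = y = r = 0, is an arbitrary choice of this sketch):

```python
import numpy as np

def stokes_qu(img, cx, cy):
    """Flux-weighted second moments <x^2/r^2> etc. and the derived Q, U."""
    ny, nx = img.shape
    y, x = np.mgrid[0:ny, 0:nx].astype(float)
    dx, dy = x - cx, y - cy
    r2 = dx**2 + dy**2
    r2[r2 == 0] = np.inf          # drop the central pixel's 0/0 term
    f = img.sum()
    mxx = np.sum(img * dx**2 / r2) / f
    myy = np.sum(img * dy**2 / r2) / f
    mxy = np.sum(img * dx * dy / r2) / f
    return mxx - myy, 2.0 * mxy   # Q, U
```

For an object elongated along the x axis, Q is positive and U is near zero; a round object gives Q and U both near zero.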
Isophotal Quantities
Deblending Overlapping Objects
One of the jobs of the frames pipeline is to decide if an initial single detection is in fact a blend of multiple overlapping objects and, if so, to separate, or deblend, them. The deblending process is performed self-consistently across the bands (thus, all children have measurements in all bands). After deblending, the pipeline again measures the properties of these individual children. Bright objects are measured at least twice: once with a global sky and no deblending run (this detection is flagged BRIGHT) and a second time with a local sky. They may also be measured more times if they are BLENDED and a CHILD.

Once objects are detected, they are deblended by identifying individual peaks within each object, merging the list of peaks across bands, and adaptively determining the profile of images associated with each peak, which sum to form the original image in each band. The originally detected object is referred to as the "parent" object and has the flag BLENDED set if multiple peaks are detected; the final set of subimages of which the parent consists are referred to as the "children" and have the flag CHILD set. Note that all quantities in the photometric catalogs (currently in the tsObj files) are measured for both parent and child. For each child object, the quantity parent gives the object id (object) of the parent (for parents themselves or isolated objects, this is set to the object id of the BRIGHT counterpart if that exists; otherwise it is set to -1); for each parent, nchild gives the number of children an object has. Children are assigned the id numbers immediately after the id of the parent. Thus, if an object with id 23 is set as BLENDED and has nchild equal to 2, objects 24 and 25 will be set as CHILD and have parent equal to 23. The list of peaks in the parent is trimmed to combine peaks (from different bands) that are too close to each other (if this happens, the flag PEAKS_TOO_CLOSE is set in the parent).
If there are more than 25 peaks, only the most significant are kept, and the flag DEBLEND_TOO_MANY_PEAKS is set in the parent. In a number of situations, the deblender decides not to process a BLENDED object; in this case the object is flagged as NODEBLEND. Most objects with EDGE set are not deblended. The exceptions are when the object is large enough (larger than roughly an arcminute) that it will most likely not be completely included in the adjacent scan line either; in this case, DEBLENDED_AT_EDGE is set, and the deblender gives it its best shot. When an object is larger than half a frame, the deblender also gives up, and the object is flagged as TOO_LARGE. Other intricacies of the deblending results are recorded in flags described on the Object Flags section of the Flags page.

On average, about 15% - 20% of all detected objects are blended, and many of these are superpositions of galaxies that the deblender successfully treats by separating the images of the nearby objects. Thus, it is almost always the childless (nChild=0, or !BLENDED || (BLENDED && NODEBLEND)) objects that are of most interest for science applications. Occasionally, very large galaxies may be treated somewhat improperly, but this is quite rare. The behavior of the deblender of overlapping images has been further improved since the DR1; these changes are most important for bright galaxies of large angular extent (> 1 arcmin). In the EDR, and to a lesser extent in the DR1, bright galaxies were occasionally "shredded" by the deblender, i.e., interpreted as two or more objects and taken apart. With improvements in the code that finds the centers of large galaxies in the presence of superposed stars, and in the deblending of stars superposed on galaxies, this shredding now rarely happens. Indeed, inspections of several hundred NGC galaxies show that the deblending is correct in 95% of the cases; most of the exceptions are irregular galaxies of various sorts.
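Selecting the childless objects in code might look like this; the bit positions below are placeholders for illustration, not the actual SDSS flag values:

```python
BLENDED = 1 << 3     # hypothetical bit position
NODEBLEND = 1 << 6   # hypothetical bit position

def is_childless(flags, nchild):
    """Childless test combining both formulations from the text:
    nChild == 0, and !BLENDED || (BLENDED && NODEBLEND)."""
    return nchild == 0 and (not (flags & BLENDED) or bool(flags & NODEBLEND))
```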
Reddening and Extinction Corrections
Reddening corrections in magnitudes at the position of each object, stored as extinction, are computed following Schlegel, Finkbeiner & Davis (1998). These corrections are not applied to the ugriz magnitudes in the databases. If you want corrected magnitudes, use dered_[ugriz]; these are the extinction-corrected model magnitudes. All other magnitudes must have the correction applied by hand or as part of your SQL query. Conversions from E(B-V) to total extinction A_lambda, assuming a z=0 elliptical galaxy spectral energy distribution, are tabulated in Table 22 of the EDR Paper.
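Applying the correction by hand is a one-liner; the A_lambda/E(B-V) coefficients below are the values commonly quoted for the SDSS bands and should be checked against Table 22 of the EDR paper before use:

```python
# A_lambda / E(B-V) per band, assuming a z=0 elliptical SED
# (verify against Table 22 of the EDR paper).
A_OVER_EBV = {"u": 5.155, "g": 3.793, "r": 2.751, "i": 2.086, "z": 1.479}

def dered_mag(mag, band, ebv):
    """Return the extinction-corrected magnitude for reddening E(B-V)."""
    return mag - A_OVER_EBV[band] * ebv
```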
Image processing flags
For objects in the calibrated object lists, the photometric pipeline sets a number of flags that indicate the status of each object, warn of possible problems with the image itself, and warn of possible problems in the measurement of various quantities associated with the object. For yet more details, refer to Robert Lupton's flags document. Possible problems associated with individual pixels in the reduced images ("corrected frames") are traced in the image masks.

Objects in the catalog have two major sets of flags:
The "status" of an object

The catalogs contain multiple detections of objects from overlapping CCD frames. For most applications, remove duplicate detections of the same objects by considering only those which have the "primary" flag set in the status entry of the PhotoObjAll table and its Views. A description of status is provided on the details page. The details of determining primary status, and of the remaining flags stored in status, are found on the algorithms page describing the resolution of overlaps (resolve).

Object "flags"

The photometric pipeline's flags describe how certain measurements were performed for each object, and which measurements are considered unreliable or have failed altogether. You must interpret the flags correctly to obtain meaningful results. For each object, there are 59 flags stored as bit fields in a single 64-bit table column called flags in the PhotoObjAll table (and its Views). There are two versions of the flag variable for each object:
Note: This differs from the tsObj files in the DAS, where the individual filter flags are stored as vectors in two separate 32-bit columns called flags and flags2, and the overall flags are stored in a scalar called objc_flags. Here we describe which flags should be checked for which measurements, including whether you need to look at the flag in each filter or at the general flags.

Recommendations

Clean sample of point sources

In a given band, first select objects with PRIMARY status and apply the SDSS star-galaxy separation. Then, define the following meta-flags:

DEBLEND_PROBLEMS = PEAKCENTER || NOTCHECKED || (DEBLEND_NOPEAK && psfErr > 0.2)
INTERP_PROBLEMS = PSF_FLUX_INTERP || BAD_COUNTS_ERROR || (INTERP_CENTER && CR)

Then include only objects that satisfy the following in the band in question:

BINNED1 && !BRIGHT && !SATURATED && !EDGE && (!BLENDED || NODEBLEND) && !NOPROFILE && !INTERP_PROBLEMS && !DEBLEND_PROBLEMS

If you are very picky, you will probably want to exclude the NODEBLEND objects as well. Note that selecting PRIMARY objects implies !BRIGHT && (!BLENDED || NODEBLEND || nchild == 0). These cuts are used in the SDSS quasar target selection code, which is quite sensitive to outliers in the stellar locus. If you want to select very rare outliers in color space, especially single-band detections, add cuts on MAYBE_CR and MAYBE_EGHOST to the above list.

Clean sample of galaxies

As for point sources, but don't cut on EDGE (large galaxies often run into the edge). Also, you may not need to worry about the INTERP problems. The BRIGHTEST_GALAXY_CHILD flag may be useful if you are looking at bright galaxies; it needs further testing. If you want to select (or reject) moving objects (asteroids), cut on the DEBLENDED_AS_MOVING flag, and then cut on the motion itself. See the SDSS Moving Objects Catalog for more details.
An interesting experiment is to remove the restriction on the DEBLENDED_AS_MOVING flag to find objects with very small proper motion (i.e., those beyond Saturn).

Descriptions of all flags

Flags that affect the object's status

These flags must be considered to reject duplicate catalog entries of the same object. By using only objects with PRIMARY status (see above), you automatically account for the most common cases: those objects which are BRIGHT, or which have been deblended (decomposed) into one or more child objects which are listed individually. In the tables, flag names link to detailed descriptions. The "In Obj Flags?" column indicates that this flag will be set in the general (per-object) flags column if it is set in any of the filters. "Bit" is the number of the bit. To find the hexadecimal values used for testing whether a flag is set, please see the PhotoFlags table.
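The recommended point-source cuts above can be sketched with the per-band boolean flags already unpacked into a dict (that representation is an assumption of this sketch; the logic follows the meta-flag definitions in the text):

```python
def clean_point_source(f, psf_err):
    """Apply the recommended clean point-source flag cuts in one band.
    f maps flag names to booleans; psf_err is the PSF magnitude error."""
    deblend_problems = (f["PEAKCENTER"] or f["NOTCHECKED"]
                        or (f["DEBLEND_NOPEAK"] and psf_err > 0.2))
    interp_problems = (f["PSF_FLUX_INTERP"] or f["BAD_COUNTS_ERROR"]
                       or (f["INTERP_CENTER"] and f["CR"]))
    return (f["BINNED1"] and not f["BRIGHT"] and not f["SATURATED"]
            and not f["EDGE"]
            and (not f["BLENDED"] or f["NODEBLEND"])
            and not f["NOPROFILE"]
            and not interp_problems and not deblend_problems)
```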
Flags that indicate problems with the raw data

These flags are mainly informational and important only for some objects and science applications.

Flags that indicate problems with the image

These flags may be hints that an object may not be real or that a measurement on the object failed.

Problems associated with specific quantities

Some flags simply indicate that the quantity in question could not be measured. Others indicate more subtle aspects of the measurements, particularly for Petrosian quantities.

All flags so far indicate some problem or failure of a measurement. The following flags provide information about the processing, but do not indicate a severe problem or failure.

Informational flags related to deblending
Further informational flags
The fiber magnitude
The flux contained within the aperture of a spectroscopic fiber (3" in diameter) is calculated in each band and stored in fiberMag.
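A toy version of such an aperture flux, summing pixels inside a 3"-diameter circle at the SDSS pixel scale of 0.396"/pixel; the production measurement is more involved than this sketch, which simply sums pixels:

```python
import numpy as np

def aperture_flux(img, cx, cy, diameter_arcsec=3.0, pixscale=0.396):
    """Sum pixel values whose centers fall inside a circular aperture."""
    ny, nx = img.shape
    y, x = np.mgrid[0:ny, 0:nx]
    r = np.hypot(x - cx, y - cy) * pixscale   # radius in arcsec
    return img[r <= diameter_arcsec / 2.0].sum()
```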
Notes:
The model magnitude
Important Note for EDR and DR1 data ONLY: Comparing the model (i.e., exponential and de Vaucouleurs fits) and Petrosian magnitudes of bright galaxies in EDR and DR1 data shows a systematic offset of about 0.2 magnitudes (in the sense that the model magnitudes are brighter). This turns out to be due to a bug in the way the PSF was convolved with the models (this bug affected the model magnitudes even when they were fit only to the central 4.4" radius of each object). This caused problems for very small objects (i.e., those close to being unresolved). The code forces model and PSF magnitudes of unresolved objects to be the same in the mean by application of an aperture correction, which then gets applied to all objects. The net result is that the model magnitudes are fine for unresolved objects, but systematically offset for galaxies brighter than roughly 20th magnitude. Therefore, model magnitudes should NOT be used in EDR and DR1 data. This problem has been corrected as of DR2.

Just as the PSF magnitudes are optimal measures of the fluxes of stars, the optimal measure of the flux of a galaxy would use a matched galaxy model. With this in mind, the code fits two models to the two-dimensional image of each object in each band:
1. a pure de Vaucouleurs profile: I(r) = I_e exp{-7.67 [(r/r_e)^(1/4) - 1]}

2. a pure exponential profile: I(r) = I_0 exp(-1.68 r/r_e)

(here r_e is the effective, i.e. half-light, radius, I_e the surface brightness at r_e, and I_0 the central surface brightness). Each model has an arbitrary axis ratio and position angle. Although for large objects it is possible and even desirable to fit more complicated models (e.g., bulge plus disk), the computational expense to compute them is not justified for the majority of the detected objects. The models are convolved with a double-Gaussian fit to the PSF, which is provided by psp. Residuals between the double-Gaussian and the full KL PSF model are added on for just the central PSF component of the image.
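The two (pre-convolution) radial laws can be written down directly; the constants 7.67 and 1.68 make r_e the half-light radius in each case. This sketch omits the truncation and softening applied by the production fits:

```python
import numpy as np

def dev_profile(r, i_e, r_e):
    """de Vaucouleurs r^(1/4) law; i_e is the surface brightness at r_e."""
    return i_e * np.exp(-7.67 * ((r / r_e) ** 0.25 - 1.0))

def exp_profile(r, i_0, r_e):
    """Exponential law; i_0 is the central surface brightness."""
    return i_0 * np.exp(-1.68 * r / r_e)
```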
These fitting procedures yield the quantities
Note that these quantities correctly model the effects of the PSF. Errors for each of the last two quantities (which are based only on photon statistics) are also reported. We apply aperture corrections to make these model magnitudes equal the PSF magnitudes in the case of an unresolved object.

In order to measure unbiased colors of galaxies, we measure their flux through equivalent apertures in all bands. We choose the model (exponential or de Vaucouleurs) of higher likelihood in the r filter, and apply that model (i.e., allowing only the amplitude to vary) in the other bands after convolving with the appropriate PSF in each band. The resulting magnitudes are termed modelMag. The resulting estimate of galaxy color will be unbiased in the absence of color gradients. Systematic differences from Petrosian colors are in fact often seen due to color gradients, in which case the concept of a global galaxy color is somewhat ambiguous. For faint galaxies, the model colors have appreciably higher signal-to-noise ratio than do the Petrosian colors.

Due to the way in which model fits are carried out, there is some weak discretization of model parameters, especially r_exp and r_deV. This is yet to be fixed. Two other issues (negative axis ratios, and bad model magnitudes for bright objects) have been fixed since the EDR.

Caveat: At bright magnitudes (r <~ 18), model magnitudes may not be a robust means to select objects by flux. For example, model magnitudes in target and best imaging may often differ significantly because a different type of profile (de Vaucouleurs or exponential) was deemed the better fit in target vs. best. Instead, to select samples by flux, one should typically use Petrosian magnitudes for galaxies and PSF magnitudes for stars and distant quasars. However, model colors are in general robust and may be used to select galaxy samples by color. Please also refer to the SDSS target selection algorithms for examples.
The Petrosian magnitude
Stored as petroMag. For galaxy photometry, measuring flux is more difficult than for stars, because galaxies do not all have the same radial surface brightness profile, and have no sharp edges. In order to avoid biases, we wish to measure a constant fraction of the total light, independent of the position and distance of the object. To satisfy these requirements, the SDSS has adopted a modified form of the Petrosian (1976) system, measuring galaxy fluxes within a circular aperture whose radius is defined by the shape of the azimuthally averaged light profile.
We define the "Petrosian ratio" RP at a radius
r from
the center of an object to be the ratio of the local surface
brightness in an annulus at r to the mean surface brightness within
r, as described by Blanton et al. (2001a) and Yasuda et al. (2001):

R_P(r) = [∫ from 0.8r to 1.25r of dr' 2πr' I(r') / (π (1.25² - 0.8²) r²)] / [∫ from 0 to r of dr' 2πr' I(r') / (π r²)]

where I(r) is the azimuthally averaged surface brightness profile.
The Petrosian radius r_P is defined as the radius at which R_P(r_P) equals some specified value R_P,lim, set to 0.2 in our case. The Petrosian flux in any band is then defined as the flux within a certain number N_P (equal to 2.0 in our case) of Petrosian radii:

F_P = ∫ from 0 to N_P·r_P of dr' 2πr' I(r')

In the SDSS five-band photometry, the aperture in all bands is set by the profile of the galaxy in the r band alone. This procedure ensures that the color measured by comparing the Petrosian flux F_P in different bands is measured through a consistent aperture. The aperture 2r_P is large enough to contain nearly all of the flux for typical galaxy profiles, but small enough that the sky noise in F_P is small. Thus, even substantial errors in r_P cause only small errors in the Petrosian flux (typical statistical errors near the spectroscopic flux limit of r ~ 17.7 are < 5%), although these errors are correlated. The Petrosian radius in each band is the parameter petroRad, and the Petrosian magnitude in each band (calculated, remember, using only petroRad for the r band) is the parameter petroMag.

In practice, there are a number of complications associated with this definition, because noise, substructure, and the finite size of objects can cause objects to have no Petrosian radius, or more than one. Those with more than one are flagged as MANYPETRO; the largest one is used. Those with none have NOPETRO set. Most commonly, these objects are faint (r > 20.5 or so); the Petrosian ratio becomes unmeasurable before dropping to the limiting value of 0.2; these have PETROFAINT set and have their "Petrosian radii" set to the default value of the larger of 3" or the outermost measured point in the radial profile. Finally, a galaxy with a bright stellar nucleus, such as a Seyfert galaxy, can have a Petrosian radius set by the nucleus alone; in this case, the Petrosian flux misses most of the extended light of the object. This happens quite rarely, but one dramatic example in the EDR data is the Seyfert galaxy NGC 7603 = Arp 092, at RA(2000) = 23:18:56.6, Dec(2000) = +00:14:38.

How well does the Petrosian magnitude perform as a reliable and complete measure of galaxy flux?
Theoretically, the Petrosian magnitudes defined here should recover essentially all of the flux of an exponential galaxy profile and about 80% of the flux for a de Vaucouleurs profile. As shown by Blanton et al. (2001a), this fraction is fairly constant with axis ratio, while as galaxies become smaller (due to worse seeing or greater distance) the fraction of light recovered becomes closer to that fraction measured for a typical PSF, about 95% in the case of the SDSS. This implies that the fraction of flux measured for exponential profiles decreases while the fraction of flux measured for deVaucouleurs profiles increases as a function of distance. However, for galaxies in the spectroscopic sample (r<17.7), these effects are small; the Petrosian radius measured by frames is extraordinarily constant in physical size as a function of redshift.
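Numerically, the Petrosian radius of an azimuthally averaged profile can be found by scanning outward for the radius where the Petrosian ratio first drops below 0.2; the 0.8r-1.25r annulus follows the SDSS definition, while the grid, helper names, and trapezoid integration are choices of this sketch:

```python
import numpy as np

def _trapz(y, x):
    """Trapezoid-rule integral of samples y over abscissae x."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

def petrosian_radius(r_grid, profile, ratio_lim=0.2):
    """First radius where the Petrosian ratio drops below ratio_lim."""
    def mean_sb_within(r):
        m = r_grid <= r
        return _trapz(2 * np.pi * r_grid[m] * profile[m], r_grid[m]) / (np.pi * r**2)
    def annulus_sb(r):
        m = (r_grid >= 0.8 * r) & (r_grid <= 1.25 * r)
        area = np.pi * ((1.25 * r) ** 2 - (0.8 * r) ** 2)
        return _trapz(2 * np.pi * r_grid[m] * profile[m], r_grid[m]) / area
    for r in r_grid[20:]:            # skip the innermost, poorly sampled radii
        if annulus_sb(r) / mean_sb_within(r) < ratio_lim:
            return float(r)
    return None                       # no Petrosian radius found (cf. NOPETRO)
```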
The PSF magnitude
Stored as psfMag. For isolated stars, which are well described by the point spread function (PSF), the optimal measure of the total flux is determined by fitting a PSF model to the object. In practice, we do this by sinc-shifting the image of a star so that it is exactly centered on a pixel, and then fitting a Gaussian model of the PSF to it. This fit is also carried out on the local PSF KL model at each position; the difference between the two is then a local aperture correction, which gives a corrected PSF magnitude. Finally, we use bright stars to determine a further aperture correction to a radius of 7.4" as a function of seeing, and apply this to each frame based on its seeing. This involved procedure is necessary to take into account the full variation of the PSF across the field, including the low signal-to-noise ratio wings. Empirically, this reduces the seeing dependence of the photometry to below 0.02 mag for seeing as poor as 2". The resulting magnitude is stored in the quantity psfMag. The flag PSF_FLUX_INTERP warns that the PSF photometry might be suspect. The flag BAD_COUNTS_ERROR warns that because of interpolated pixels, the error may be underestimated.
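The core of a PSF-model flux fit can be sketched as a linear least-squares amplitude fit to a known, centered PSF model. This is an illustrative sketch, not the frames pipeline: the Gaussian width, the grid size, and the uniform-noise assumption are all hypothetical stand-ins for the locally reconstructed KL PSF.

```python
import math

# Sketch (not the SDSS frames pipeline): measure a star's flux by
# fitting the amplitude of a known, centered PSF model to pixel data.
# For uniform noise the least-squares amplitude is
#     A = sum(data * psf) / sum(psf * psf)
# and the flux is A * sum(psf).
SIGMA = 1.5  # PSF width in pixels (assumed)

def gaussian_psf(size=15):
    c = size // 2
    return [[math.exp(-((x - c) ** 2 + (y - c) ** 2) / (2 * SIGMA ** 2))
             for x in range(size)] for y in range(size)]

def psf_flux(data, psf):
    num = sum(d * p for drow, prow in zip(data, psf)
              for d, p in zip(drow, prow))
    den = sum(p * p for prow in psf for p in prow)
    amplitude = num / den
    return amplitude * sum(p for prow in psf for p in prow)
```

For a noiseless star that is exactly a scaled copy of the PSF, the fit recovers the scale factor times the PSF's summed flux.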
Match and MatchHead Tables
Jim Gray, Alex Szalay, Robert Lupton, Jeff Munn
May 2003, revised January, May, June, July, December 2004
The SDSS data can be used for temporal studies of objects that are re-observed at different times. The SDSS survey observes about 10% of the Northern survey area 2 or more times, and observes the Southern stripe more than a dozen times. The match table is intended to make temporal queries easy by providing a precomputed list of all objects that were observed multiple times. More formally,
But, as always, there are complications:
- Green, Yellow, Red: What if ObjID2 in Run2 is missing?
- Surrogate: When an object is missing in Run2, what do we put in the match table?

Computing the Match table
The Match table is computed using the Neighbors table and has a very similar schema (the Neighbors table stores only mode 1,2 (aka primary/secondary) and type 3,5,6 (aka galaxy, unknown, star) objects):

    CREATE TABLE Match (
        objID1 bigint not null, objID2 bigint not null,   -- object pair
        run1 smallint not null, run2 smallint not null,   -- their run numbers
        type1 tinyint not null, type2 tinyint not null,   -- star, galaxy,...
        mode1 tinyint not null, mode2 tinyint not null,   -- primary, secondary,...
        distance float not null,                          -- in arcminutes
        miss char not null,                               -- " " no miss; RGY: red, green, yellow
        matchHead bigint not null,                        -- see below
        primary key (objID1, objID2)
    ) ON [Neighbors]

    -- now populate the table
    INSERT Match
    SELECT objID as objID1, neighborObjID as objID2,
           (objID & 0x0000FFFF00000000)/power(cast(2 as bigint),32) as run1,
           (neighborObjID & 0x0000FFFF00000000)/power(cast(2 as bigint),32) as run2,
           type as type1, neighborType as type2,
           mode as mode1, neighborMode as mode2,
           distance,
           ' ' as miss,
           0 as matchHead
    FROM Neighbors
    WHERE distance < 1.0/60.0   -- within 1 arcsecond of one another

One arcsecond is a large error for SDSS positions; the vast majority of matches (95%) are within 0.5 arcsecond. But a particular cluster may not form a complete graph (all members connected to all others). To make the graph fully transitive, we repeatedly execute the query to add the "curved" arcs in Figure 1. Notice that the figure shows two objects observed in four runs, and that the two objects are observed only once in the middle two runs. The whole collection is closed to make a "bundle" that will have a matchHead object (the smallest objID of the bundle).
    declare @Trip table (
        objID1 bigint, objID2 bigint,
        run1 smallint, run2 smallint,
        type1 tinyint, type2 tinyint,
        mode1 tinyint, mode2 tinyint,
        primary key (objID1, objID2)
    )

Computing the MatchHead table
Now each cluster of objects in the Match table is fully connected. We can name each cluster by the minimum (non-zero) objID in the cluster and compute the MatchHead table that describes the global properties of the cluster: its name, its average RA and Dec, and the variance in RA and Dec.

    -- build a table of cluster IDs (minimum object ID of each cluster).
    -- compute the minimum object IDs.
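The cluster-naming step (each bundle named by the minimum objID among its members) can be sketched outside SQL as a union-find over the match pairs. This is an illustrative re-implementation, not the production SQL.

```python
# Sketch: computing matchHead (the minimum objID of each fully
# connected cluster of matched detections) from Match pairs with a
# union-find.  Illustrative re-implementation of what the SQL closure
# computes, not the production code.
def match_heads(pairs):
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[max(ra, rb)] = min(ra, rb)  # keep the smaller ID as root

    for a, b in pairs:
        union(a, b)
    # matchHead of each object = root (minimum objID) of its cluster
    return {x: find(x) for x in parent}
```

Because every union makes the larger root point at the smaller one, the root of each finished cluster is its minimum objID, exactly the matchHead naming rule above.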
Matching the Missing Objects
There may be an object in camcol A that should have matching objects in an overlapping camcol B (see Figure 2). In particular, any object in the green part of A should have a matching object in B. Objects in A that are near the edge of B (10 pixels ~ 4 arcseconds; the yellow part of B) may have matching objects in B. In some cases the B area is masked (red), which explains why there is no match. If a "green" A object does not match a B object, then the object is either moving, variable, or masked. We check the masks to see if (A.ra, A.dec) is masked in B; if not, we assume that A is just "missing." Similarly, if a "yellow" A object does not match a B object, then the object is moving, variable, or masked, or edge effects caused the object to be missed. In these edge cases we check whether (A.ra, A.dec) is masked in B; if not, we call the object missing-edge. So, missing objects come in 3 varieties:
In each of these cases we create a match object as the closest object in B to A, and the Match miss flag is set to Green, Red, or Yellow. These "fake" objects do not contribute to the cluster average, variance, or centroid. We add this object to A's cluster (along with all the edges), and we increment the cluster miss count by the number of records we add to the cluster.
The logic for computing missing objects is as follows.
For each RunA in the Regions table. The actual code is a little more complex (about 700 lines of SQL). In the personal SkyServerDR1 there are about 20,000 matches and 10,000 object misses, so it seems that the misses will make an interesting study.
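The decision logic above can be sketched as follows; the mask lookup and edge-distance helpers are hypothetical placeholders for the real mask and survey-geometry tables.

```python
# Sketch of the miss classification described above.  The helpers
# `is_masked_in_b` and `edge_distance_pixels` are hypothetical
# placeholders for mask lookups and survey geometry.
EDGE_PIXELS = 10  # ~4 arcseconds at ~0.4"/pixel, per the text

def classify_miss(ra, dec, is_masked_in_b, edge_distance_pixels):
    """Classify an A object with no B match: Red (masked in B),
    Yellow (near B's edge), or Green (well inside the overlap)."""
    if is_masked_in_b(ra, dec):
        return "Red"
    if edge_distance_pixels(ra, dec) < EDGE_PIXELS:
        return "Yellow"
    return "Green"
```

An unmasked object far from any edge comes out Green, matching the observation above that most misses are Green.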
As a result, a bundle can have dangling pointers to these surrogate objects. Figure 3 shows the diagram of Figure 1 with a fifth overlapping run added. The leftmost object is masked in this new run, so we find a surrogate "red" object for it. The other objects also have no match in this run but are not masked, and are closest to the green (right) object in the figure. The computation takes 4 minutes on the personal SkyServer DR1; it will take a bit longer on the thousand-times-larger DR2, but... As per Robert's request, surrogate match objects are found rather than invented. Sometimes we have to look far away for them (500 arcseconds in some cases). Misses are painted Yellow (near the edge), Red (masked), and Green (well inside the overlap). Most misses are Green.
The graphs of distances are shown in Figure 4.
SDSS ObjID Encoding
The bit encoding for the long (64-bit) IDs that are used as unique keys in the
SDSS catalog tables is described here.
PhotoObjID
The encoding of the photometric object long ID (objID in the photo tables) is described in the table below. This scheme applies to both fieldID and objID (the object-number bits are 0 for fieldID).
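As a sketch of how such an ID unpacks: the run field below uses exactly the mask from the Match-table SQL ((objID & 0x0000FFFF00000000) / 2^32); the remaining bit positions follow the commonly quoted layout and are assumptions to verify against the encoding table for your data release.

```python
# Sketch: decoding an SDSS photometric objID (64-bit).  The run mask
# matches the one used in the Match-table SQL; the other field
# positions (marked "assumed") follow the commonly documented layout
# and should be checked against the release's encoding table.
def decode_objid(objid: int) -> dict:
    return {
        "sky_version": (objid >> 59) & 0xF,      # assumed bits 59-62
        "rerun":       (objid >> 48) & 0x7FF,    # assumed bits 48-58
        "run":         (objid >> 32) & 0xFFFF,   # bits 32-47 (as in the SQL)
        "camcol":      (objid >> 29) & 0x7,      # assumed bits 29-31
        "first_field": (objid >> 28) & 0x1,      # assumed bit 28
        "field":       (objid >> 16) & 0xFFF,    # assumed bits 16-27
        "obj":         objid & 0xFFFF,           # bits 0-15; zero for fieldID
    }
```

Packing the fields with the same shifts and decoding again round-trips, which is a useful self-check even if the exact layout differs by release.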
SpecObjID
The encoding of the long ID for spectroscopic objects is described below. This applies to plateID, specObjID, specLineID, specLineIndexID, elRedshiftID and xcRedshiftID.
Photometric Flux Calibration
The objective of the photometric calibration process is to tie the SDSS imaging data to an AB magnitude system, and specifically to the "natural system" of the 2.5m telescope defined by the photon-weighted effective wavelengths of each combination of SDSS filter, CCD response, telescope transmission, and atmospheric transmission at a reference airmass of 1.3 as measured at APO. The calibration process ultimately involves combining data from three telescopes: the USNO 40-in on which our primary standards were first measured, the SDSS Photometric Telescope (or PT) , and the SDSS 2.5m telescope. At the beginning of the survey it was expected that there would be a single u'g'r'i'z' system. However, in the course of processing the SDSS data, the unpleasant discovery was made that the filters in the 2.5m telescope have significantly different effective wavelengths from the filters in the PT and at the USNO. These differences have been traced to the fact that the short-pass interference films on the 2.5-meter camera live in the same vacuum as the detectors, and the resulting dehydration of the films decreases their effective refractive index. This results in blueward shifts of the red edges of the filters by about 2.5 percent of the cutoff wavelength, and consequent shifts of the effective wavelengths of order half that. The USNO filters are in ambient air, and the hydration of the films exhibits small temperature shifts; the PT filters are kept in stable very dry air and are in a condition about halfway between ambient and the very stable vacuum state. The rather subtle differences between these systems are describable by simple linear transformations with small color terms for stars of not-too-extreme color, but of course cannot be so transformed for very cool objects or objects with complex spectra. Since standardization is done with stars, this is not a fundamental problem, once the transformations are well understood. 
It is these subtle issues that gave rise to our somewhat awkward nomenclature for the different magnitude systems:
Previous reductions of the data, including that used in the EDR, were based on inconsistent photometric equations; this is why we referred to the 2.5m photometry with asterisks: u*g*r*i*z*. With DR1, the photometric equations are properly self-consistent, so we can now drop the asterisks and refer to the 2.5m photometry as u g r i z.

Overview of the Photometric Calibration in SDSS
The photometric calibration of the SDSS imaging data is a multi-step process, because the images from the 2.5m telescope saturate at approximately r = 14, fainter than typical spectrophotometric standards, and because observing efficiency would be greatly reduced if the 2.5m had to interrupt its routine scanning in order to observe separate calibration fields. The first step involved setting up a primary standard star network of 158 stars distributed around the Northern sky. These stars were selected from a variety of sources and span a range in color, airmass, and right ascension. They were observed repeatedly over a period of two years using the US Naval Observatory 40-in telescope located in Flagstaff, Arizona. These observations are tied to an absolute flux system by the single F0 subdwarf star BD+17 4708, whose absolute fluxes in SDSS filters are taken from Fukugita et al. (1996). As noted above, the photometric system defined by these stars is called the u'g'r'i'z' system. You can look at the table containing the calibrated magnitudes for these standard stars. Most of these primary standards have brightnesses in the range r = 8 - 13, and would saturate the 2.5-meter telescope's imaging camera in normal operations. Therefore, a set of 1520 41.5x41.5 arcmin^2 transfer fields, called secondary patches, have been positioned throughout the survey area. These secondary patches are observed with the PT; their size is set by the field of view of the PT camera. These secondary patches are grouped into sets of four.
Each set spans the full set of 12 scan lines of a survey stripe along the width of the stripe, and the sets are spaced along the length of a stripe at roughly 15 degree intervals. The patches are observed by the PT in parallel with observations of the primary standards and processed using the Monitor Telescope Pipeline (mtpipe). The patches are first calibrated to the USNO 40-in u'g'r'i'z' system and then transformed to the 2.5m ugriz system; both the initial calibration to the u'g'r'i'z' system and the transformation to the ugriz system occur within mtpipe. The ugriz-calibrated patches are then used to calibrate the 2.5m imaging data via the Final Calibration Pipeline (nfcalib).

Monitor Telescope Pipeline
The PT has two main functions: it measures the atmospheric extinction on each clear night based on observations of primary standards at a variety of airmasses, and it calibrates secondary patches in order to determine the photometric zeropoint of the 2.5m imaging scans. The extinction must be measured on each night the 2.5m is scanning, but the corresponding secondary patches can be observed on any photometric night, and need not be coincident with the imaging scans that they will calibrate. The Monitor Telescope Pipeline (mtpipe), so called for historical reasons, processes the PT data. It performs three basic functions:
The Final Calibration Pipeline
The final calibration pipeline (nfcalib) works much like mtpipe, computing the transformation between PSF photometry (or other photometry) as observed by the 2.5m telescope and the final SDSS photometric system. The pipeline matches stars between a camera column of 2.5m data and an overlapping secondary patch. Each camera column of 2.5m data is calibrated individually. There are of order 100 stars in each patch in the appropriate color and magnitude range in the overlap. The transformation equations are a simplified form of those used by mtpipe.
Since mtpipe delivers patch stars already calibrated to the
2.5m ugriz system, the nfcalib transformation equations have the following
form:

Assessment of Photometric Calibration
With Data Release 1 (DR1), we now routinely meet our requirements of photometric uniformity of 2% in r, g-r, and r-i and of 3% in u-g and i-z (rms). This is a substantial improvement over the photometric uniformity achieved in the Early Data Release (EDR), where the corresponding values were approximately 5% in r, g-r, and r-i and 5% in u-g and i-z. The improvements between the photometric calibration of the EDR and the DR1 can be traced primarily to the use of more robust and consistent photometric equations by mtpipe and nfcalib and to improvements to the PSF-fitting algorithm and flatfield methodology in the Photometric Pipeline (photo). Note that this photometric uniformity is measured based upon relatively bright stars which are no redder than M0; hence, these measures do not include effects of the u band red leak (see caveats below) or the model magnitude bug.

How to go from Counts in the fpC file to Calibrated ugriz magnitudes?

Asinh and Pogson magnitudes
All calibrated magnitudes in the photometric catalogs are given not as conventional Pogson astronomical magnitudes, but as asinh magnitudes. We show how to obtain both kinds of magnitudes from observed count rates and vice versa. See further down for conversion of SDSS magnitudes to physical fluxes. For both kinds of magnitudes, there are two ways to obtain the zeropoint information for the conversion.
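The asinh definition of Lupton, Gunn & Szalay (1999) can be sketched as follows. The band-dependent softening parameters b below are the values commonly quoted for SDSS, but treat them as assumptions to check against the release documentation.

```python
import math

# Sketch of asinh ("luptitude") vs. Pogson magnitudes, following
# Lupton, Gunn & Szalay (1999).  f_over_f0 is the flux in units of the
# zeropoint flux.  The softening parameters b are the values commonly
# quoted for SDSS (assumed here; check the data release tables).
B_SOFTENING = {"u": 1.4e-10, "g": 0.9e-10, "r": 1.2e-10,
               "i": 1.8e-10, "z": 7.4e-10}

POGSON = 2.5 / math.log(10.0)  # 2.5 / ln 10 = 1.0857...

def asinh_mag(f_over_f0, band):
    """m = -(2.5/ln 10) * [asinh((f/f0)/(2b)) + ln b]"""
    b = B_SOFTENING[band]
    return -POGSON * (math.asinh(f_over_f0 / (2.0 * b)) + math.log(b))

def pogson_mag(f_over_f0):
    """Conventional magnitude; diverges as flux goes to zero."""
    return -2.5 * math.log10(f_over_f0)
```

For fluxes well above the softening scale the two definitions agree to high precision; at zero flux the asinh magnitude stays finite, which is the point of the scheme.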
On a related note, in DR1 one can also use relations similar to the above to estimate the sky level in magnitudes per sq. arcsec (1 pixel = 0.396 arcsec). Either use the header keyword "sky" in the fpC files, or remember to first subtract the "softbias" (= 1000) from the raw background counts in the fpC files. Note that the sky level is also given in the tsField files. This note applies only to DR1 and later data releases. Note also that the calibrated sky brightnesses reported in the tsField files have been corrected for atmospheric extinction.

Computing errors on counts (converting counts to photo-electrons)
The fpC (corrected frames) and fpObjc (object tables with counts for each object instead of magnitudes) files report counts (or "data numbers", DN). However, it is the number of photo-electrons which is actually counted by the CCD detectors and which therefore obeys Poisson statistics. The number of photo-electrons is related to the number of counts through the gain (which is really an inverse gain):

    photo-electrons = counts * gain
The gain is reported in the headers of the tsField and fpAtlas files (and hence also in the field table in the CAS). The total noise contributed by dark current and read noise (in units of DN^2) is also reported in the tsField files in the header keyword dark_variance (and correspondingly as darkVariance in the field table in the CAS), and as dark_var in the fpAtlas header. Thus, the error in DN is given by the following expression:

    error(counts) = sqrt( (counts + sky)/gain + Npix * dark_variance )
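In code, the same noise model reads as follows; this is a sketch whose variable meanings are spelled out just below, and the example gain and dark variance are illustrative, not values from any particular tsField file.

```python
import math

# Sketch: 1-sigma error in DN for an object, combining Poisson noise on
# object + sky photo-electrons with dark current and read noise.
# Real gain and dark_variance values come from the tsField file for the
# frame in question; the numbers in the example call are assumptions.
def counts_error(counts, sky, npix, gain, dark_variance):
    """counts: object DN; sky: sky DN summed over the same area;
    npix: object area in pixels; gain: e-/DN; dark_variance: DN^2."""
    return math.sqrt((counts + sky) / gain + npix * dark_variance)

# Example with assumed values: gain ~ 4.7 e-/DN, dark_variance ~ 1 DN^2
err = counts_error(counts=10000.0, sky=2000.0, npix=50.0,
                   gain=4.7, dark_variance=1.0)
```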
where counts is the number of object counts, sky is the number of sky counts summed over the same area as the object counts, Npix is the area covered by the object in pixels, and gain and dark_variance are the numbers from the corresponding tsField files.

Conversion from SDSS ugriz magnitudes to AB ugriz magnitudes
The SDSS photometry is intended to be on the AB system (Oke & Gunn 1983), by which a magnitude 0 object should have the same counts as a source of Fnu = 3631 Jy. However, this is known not to be exactly true, such that the photometric zeropoints are slightly off the AB standard. We continue to work to pin down these shifts. Our present estimate, based on comparison to the STIS standards of Bohlin, Dickinson, & Calzetti (2001) and confirmed by SDSS photometry and spectroscopy of fainter hot white dwarfs, is that the u band zeropoint is in error by 0.04 mag, uAB = uSDSS - 0.04 mag, and that g, r, and i are close to AB. These statements are certainly not precise to better than 0.01 mag; in addition, they depend critically on the system response of the SDSS 2.5-meter, which was measured by Doi et al. (2004, in preparation). The z band zeropoint is not as certain at this time, but there is mild evidence that it may be shifted by about 0.02 mag in the sense zAB = zSDSS + 0.02 mag. The large shift in the u band was expected because the adopted magnitude of the SDSS standard BD+17 in Fukugita et al. (1996) was computed at zero airmass, thereby making the assumed u response bluer than that of the USNO system response. We intend to give a fuller report on the SDSS zeropoints, with uncertainties, in the near future. Note that our relative photometry is quite a bit better than these numbers would imply; repeat observations show that our calibrations are better than 2%.

Conversion from SDSS ugriz magnitudes to physical fluxes
As explained in the preceding section, the SDSS system is nearly an AB system.
Assuming you know the correction from SDSS zeropoints to AB zeropoints (see above), you can turn the AB magnitudes into a flux density using the AB zeropoint flux density. The AB system is defined such that every filter has a zeropoint flux density of 3631 Jy (1 Jy = 1 Jansky = 10^-26 W Hz^-1 m^-2 = 10^-23 erg s^-1 Hz^-1 cm^-2).
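With the SDSS-to-AB offset already applied, the conversion is a one-liner either way; this is a minimal sketch of the AB zeropoint arithmetic.

```python
import math

# Sketch: AB magnitude <-> flux density, using the 3631 Jy zeropoint
# flux density that defines the AB system.
F0_JY = 3631.0

def ab_mag_to_fnu_jy(m_ab):
    """Flux density in Jy for an AB magnitude."""
    return F0_JY * 10.0 ** (-0.4 * m_ab)

def fnu_jy_to_ab_mag(fnu_jy):
    """AB magnitude for a flux density in Jy."""
    return -2.5 * math.log10(fnu_jy / F0_JY)
```

A magnitude 0 object comes out at 3631 Jy by construction, and the two functions invert each other.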
Then you need to apply the correction for the zeropoint offset between the SDSS system and the AB system; see the description of the SDSS to AB conversion above.

Transformation Equations Between SDSS magnitudes and UBVRcIc
There is a separate page describing the conversion between SDSS magnitudes and UBVRcIc, and the ugriz colors of Vega and the Sun.

Improved photometric calibration ("Übercal")
Ubercal is an algorithm to photometrically calibrate wide-field optical imaging surveys by simultaneously solving for the calibration parameters and relative stellar fluxes using overlapping observations. The algorithm decouples the problem of relative calibration from that of absolute calibration; the absolute calibration is reduced to determining a few numbers for the entire survey. We pay special attention to the spatial structure of the calibration errors, allowing one to isolate particular error modes in downstream analyses. Applying this to the Sloan Digital Sky Survey imaging data, we achieve ~1% relative calibration errors across 8500 sq. deg. in griz; the errors are ~2% for the u band. These errors are dominated by unmodelled atmospheric variations at Apache Point Observatory. For a detailed description of ubercal, please see the Ubercal paper (Padmanabhan et al. 2007, ApJ submitted [astro-ph/0703454]). This improved calibration is available only through the Ubercal table.

Caveats
The u filter has a natural red leak around 7100 Å which is supposed to be blocked by an interference coating. However, under the vacuum in the camera, the wavelength cutoff of the interference coating has shifted redward (see the discussion in the EDR paper), allowing some of this red leak through. The extent of this contamination is different for each camera column. It is not completely clear whether the effect is deterministic; there is some evidence that it varies from one run to another under very similar conditions in a given camera column.
Roughly speaking, however, this is a 0.02 magnitude effect in the u magnitudes for mid-K stars (and galaxies of similar color), increasing to 0.06 magnitude for M0 stars (r-i ~ 0.5), 0.2 magnitude at r-i ~ 1.2, and 0.3 magnitude at r-i = 1.5. There is a large dispersion in the red leak for the redder stars, caused by three effects:
To make matters even more complicated, this is a detector effect. This means that it is not the real i and z which drive the excess, but the instrumental colors (i.e., including the effects of atmospheric extinction), so the leak is worse at high airmass, when the true ultraviolet flux is heavily absorbed but the infrared flux is relatively unaffected. Given these complications, we cannot recommend a specific correction to the u-band magnitudes of red stars, and warn the user of these data about over-interpreting results on colors involving the u band for stars later than K.
Photometric Redshifts
There are no photometric redshifts available for data releases 2 through 4
(DR2-DR4). Starting with DR5, there are two versions of photometric redshift
in the SDSS databases, in the Photoz and
Photoz2 tables respectively. The
algorithms for generating these are described below.
Photoz Table
This set of photometric redshifts has been obtained with the template fitting method; please also see this link for more detailed information about the method. The template fitting approach simply compares the expected colors of a galaxy (derived from template spectral energy distributions) with those observed for an individual galaxy. The standard scenario for template fitting is to take a small number of spectral templates T (e.g., E, Sbc, Scd, and Irr galaxies) and choose the best fit by optimizing the likelihood of the fit as a function of redshift, type, and luminosity, p(z, T, L). Variations on this approach have been developed in the last few decades, including ones that use a continuous distribution of spectral templates, enabling the error function in redshift and type to be well defined. Since a representative set of photometrically calibrated spectra covering the full wavelength range of the filters is not easy to obtain, we have used the empirical templates of Coleman, Wu, and Weedman, extended with spectral synthesis models. These templates were adjusted to fit the calibrations (see Budavari et al. 2000, AJ, 120, 1588). For more detailed information see Csabai et al. 2003, AJ, 125, 580, and references therein. The table contains the estimated redshift, the best matching template's spectral class, K-corrections, and absolute magnitudes. There are also some parameters of the chi-square fitting. Caveats: The quality of the photometric redshift estimation is poor for faint objects (or, to be precise, objects with large photometric errors). The "quality", "zErr" and "tErr" values are just estimates; they are not always reliable. For this estimation we have used galaxy templates for all objects; except for a few misidentified galaxies which were categorized as stars in the photo pipeline, the values for non-galaxies should not be used.
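A minimal sketch of the template-fitting idea follows. The toy flux vectors stand in for redshifted template SEDs integrated through the filter curves; this is not the Photoz pipeline.

```python
# Minimal sketch of template fitting for photometric redshifts: for
# each (template, redshift) pair, predict fluxes and pick the
# chi-square minimum.  The "templates" are toy flux vectors, not the
# real CWW spectra; a real implementation integrates redshifted
# template SEDs through the ugriz filter curves.
def chi2(observed, errors, predicted):
    # Fit the overall scale (luminosity) analytically, then sum residuals.
    scale = (sum(o * p / e**2 for o, p, e in zip(observed, predicted, errors))
             / sum(p * p / e**2 for p, e in zip(predicted, errors)))
    return sum(((o - scale * p) / e) ** 2
               for o, p, e in zip(observed, predicted, errors))

def best_fit(observed, errors, template_grid):
    """template_grid: {(template_name, z): predicted_fluxes}.
    Returns the (template, z) key with the smallest chi-square."""
    return min(template_grid,
               key=lambda key: chi2(observed, errors, template_grid[key]))
```

Fitting the luminosity scale analytically inside chi2 is what lets the grid search run over only (type, redshift), mirroring the p(z, T, L) optimization described above.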
Photoz2 Table
The photometric redshifts from the U. Chicago/Fermilab/NYU group (H. Oyaizu, M. Lima, C. Cunha, H. Lin, J. Frieman, and E. Sheldon) are calculated using a Neural Network method that is similar in implementation to that of Collister and Lahav (2004, PASP, 116, 345). The photo-z training and validation sets consist of over 551,000 unique spectroscopic redshifts matched to nearly 640,000 SDSS photometric measurements. These spectroscopic redshifts come from the SDSS as well as the deeper galaxy surveys 2SLAQ, CFRS, CNOC2, TKRS, and DEEP+DEEP2. We provide photo-z estimates for a sample of over 77.4 million DR6 primary objects, classified as galaxies by the SDSS PHOTO pipeline (TYPE = 3), with dereddened model magnitude r < 22, and which do not have any of the flags BRIGHT, SATURATED, or SATUR_CENTER set. Note that this is a significant change in the input galaxy sample selection compared to the DR5 version of Photoz2. Our data model is
Both the "CC2" and "D1" photo-z's are neural network based estimators. "D1" uses the galaxy magnitudes in the photo-z fit, while "CC2" uses only galaxy colors (i.e., only magnitude differences). Both methods also employ concentration indices (the ratio of PetroR50 and PetroR90). The "D1" estimator provides smaller photo-z errors than the "CC2" estimator, and is recommended for bright galaxies r < 20 to minimize the overall photo-z scatter and bias. However, for faint galaxies r > 20, we recommend "CC2" as it provides more accurate photo-z redshift distributions. If a single photo-z method is desired for simplicity, we also recommend "CC2" as the better overall photo-z estimator. Please see this link for a detailed comparison of the two methods, including performance metrics (photo-z errors and biases), quality plots, and photometric redshift vs. spectroscopic redshift distributions in different magnitude bins. The photo-z errors (1&sigma, or 68% confidence) are computed using an empirical "Nearest Neighbor Error" (NNE) method. NNE is a training set based method that associates similar errors to objects with similar magnitudes, and is found to accurately predict the photo-z error when the training set is representative. The photo-z "flag" value is set to 2 for fainter objects with r > 20, whose photo-z's have larger uncertainties and biases. Full details about the Photoz2 photometric redshifts
are available here and in Oyaizu et al. (2007), ApJ, submitted,
arXiv:0708.0030 [astro-ph].
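The NNE idea can be sketched as a nearest-neighbor lookup in magnitude space; the distance metric and the neighbor count used here are assumptions for illustration, not the published implementation.

```python
import math

# Sketch of a "Nearest Neighbor Error" (NNE) style estimate: the
# photo-z error assigned to an object is the RMS photo-z residual of
# the training-set galaxies with the most similar magnitudes.  The
# Euclidean metric and k=5 are illustrative assumptions.
def nne_error(obj_mags, training, k=5):
    """training: list of (mags, |z_phot - z_spec|) pairs."""
    def dist(mags):
        return math.sqrt(sum((a - b) ** 2 for a, b in zip(mags, obj_mags)))
    nearest = sorted(training, key=lambda t: dist(t[0]))[:k]
    return math.sqrt(sum(res ** 2 for _, res in nearest) / len(nearest))
```

As the text notes, an estimator of this kind predicts errors accurately only when the training set is representative of the objects being labeled.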
QSO Catalog
Building the QsoCatalogAll and QsoConcordanceAll tables
Jim Gray, Sebastian Jester, Gordon Richards, Alex Szalay, Ani Thakar
March 2006

Abstract: We constructed a catalog of all quasar candidates and gathered their "vital signs" from the many different SDSS data sources into one Quasar Concordance table.

1. The Target, Best, and Spec SDSS Datasets
The SDSS Target Database is used to select the targets that will be observed with the SDSS spectrographs. Once made, these targeting decisions are never changed, but the targeting algorithm has improved over time. The SDSS pipeline software is always improving, so the underlying pixels are re-analyzed with each data release. To have a consistent catalog, all the mosaiced pixels, both from early and recent observations, are reprocessed with the new software in subsequent data releases. The output of each of these uniform processing steps is called a Best database. So at any instant there is the historical cumulative Target database and the current Best database. As of early 2006 we have the Early Data Release (EDR) databases and then five "real" data releases: DR1, DR2, DR3, DR4, and DR5. Target selection is done by the various branches (galaxy, quasar, serendipity) of the TARGET selection algorithm. These targets are organized for spectroscopic follow-up by the TILING algorithm (Blanton et al. 2003) [0] as part of a tiling run that works within a tiling geometry. The tiling run places a 2.5 deg. circle over a tiling geometry and then assigns spectroscopic targets to be observed. The circle corresponds to a plate that can be mounted on the SDSS telescope to observe 640 targets at a time. The plates are "drilled" and "plugged" with optical fibers and then "observed". These spectroscopic observations are fed through a pipeline that builds the Spec dataset. Because Spec is relatively small (2% the size of Best), it is included in the Best database.
Unfortunately, only the "main" SDSS target photometry is exported to the Target database (the target photometry for Southern and Special plates is not exported; at best we have the later Best photometry for these objects in the database). The SDSS catalogs are cross-matched with the FIRST, ROSAT, and Stetson catalogs.

2. Overview: Finding Everything That MIGHT be a Quasar
We look in the Target..PhotoObjAll, Best..SpecObjAll, and Best..PhotoObjAll tables to find any object that might be a quasar (a QSO). We build a QsoCatalogAll table that has a row for every combination of nearby TargPhoto-Spec-BestPhoto objects from these lists that are within 1.5 arcseconds of one another. If no matching object can be found from the QSO candidate list, we find a surrogate object: the nearest primary object from the corresponding catalog (Spec, BestPhoto, TargPhoto), if one can be found (again using the 1.5" radius). If an object is still unmatched, we look for a secondary object, or put a zero for that ObjectID (in general, we use zero rather than the SQL null value to represent missing data).

2.1. Overview: QSO Tables
The tables and views created by the quasar concordance algorithm on the Best, Target and Spectro datasets are part of the Best database. The following sections explain how they are computed.
2.2. Overview: Quasar Bunches
The algorithm uses spatial proximity ("is it nearby?") to cross-correlate objects in the Target, Best, and Spec databases. The definition of nearby is fairly loose: the SDSS photo survey pixels are 0.4 arcseconds and the positioning is accurate to 0.1 arcseconds, but the spectroscopic survey has fibers that are 3 arcseconds in diameter. Therefore, the QSO concordance uses the 1.5" fiber radius to define nearby for all 3 datasets. In a perfect world, one SpecObj matches one BestObj and one TargetObj, and they are all marked as QSOs. Some objects have no match in the other catalogs, so we have zeros in those slots of that object's row. But sometimes 2 SpecObj match 3 TargetObj and 4 BestObj, and all 9 objects are marked as QSOs. In this case we get 2x3x4 rows. We group together all the objects that are related in this way as a bunch. Each bunch has a head object ID: the first member of the bunch to be recognized as a possible QSO. The precedence is TargetObjID first; if there is no target in the bunch, then the first SpecObjID (highest S/N primary first); else the first BestObjID. This ordering reflects the first time the object was considered for follow-up spectroscopy. It also avoids a selection bias in the dataset (e.g., Malmquist bias if we were to order on decreasing S/N).

2.3 The QSO Catalog and Concordance
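The head-object precedence described above can be sketched as follows. This is an illustrative stand-in, not the actual implementation: the record layout (`kind`, `obj_id`, `sn`) is hypothetical, and the primary/secondary tie-breaking level is omitted for brevity.

```python
# Hypothetical sketch of the bunch-head precedence: Target objects win,
# then Spec objects (highest S/N first), then Best objects.
# Field names are illustrative, not the actual SDSS schema.

def pick_bunch_head(members):
    """members: list of dicts with 'kind' ('target'|'spec'|'best'),
    'obj_id', and, for spec objects, 'sn' (signal-to-noise ratio)."""
    targets = [m for m in members if m["kind"] == "target"]
    if targets:
        # lowest object ID: an S/N-blind tie-break avoids Malmquist bias
        return min(targets, key=lambda m: m["obj_id"])
    specs = [m for m in members if m["kind"] == "spec"]
    if specs:
        return max(specs, key=lambda m: m["sn"])  # highest S/N spectrum
    bests = [m for m in members if m["kind"] == "best"]
    return min(bests, key=lambda m: m["obj_id"])
```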
3. Overview: A Walkthrough of the Algorithm

Phase 1: Gather the Quasars and Quasar Candidates. As a first step, gather the Target, Spec, and Best quasar-candidate or confirmed objects into a Zones table [1] containing their object identifiers and positions. These are copied from the Best and Target PhotoObjAll tables and the Best SpecObjAll table. These copies are filtered by flags indicating that the objects are QSOs or were targeted as QSOs. For the photo objects (Target and Best), this means they are primary or secondary and flagged (primTarget) as: TARGET_QSO_HIZ OR TARGET_QSO_CAP OR TARGET_QSO_SKIRT OR TARGET_QSO_FIRST_CAP OR TARGET_QSO_FIRST_SKIRT (= 0x0000001F). The spectroscopic objects must have one or more of the following properties:
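The primTarget filter can be illustrated with the individual bit values implied by the listing order and the 0x1F total (a sketch, assuming the standard SDSS primTarget bit assignments):

```python
# The five QSO target-selection bits named above OR together to 0x1F.
# Bit values as implied by the 0x1F total in the text.
TARGET_QSO_HIZ         = 0x00000001
TARGET_QSO_CAP         = 0x00000002
TARGET_QSO_SKIRT       = 0x00000004
TARGET_QSO_FIRST_CAP   = 0x00000008
TARGET_QSO_FIRST_SKIRT = 0x00000010

QSO_MASK = (TARGET_QSO_HIZ | TARGET_QSO_CAP | TARGET_QSO_SKIRT
            | TARGET_QSO_FIRST_CAP | TARGET_QSO_FIRST_SKIRT)

def targeted_as_qso(prim_target):
    """True if any QSO target-selection bit is set in primTarget."""
    return (prim_target & QSO_MASK) != 0
```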
That logic is fine for most spectroscopic objects, but there are "special plates" whose authors overloaded the primary target flags (yes, this made it much harder to understand the data and cost many hours of discussion trying to disambiguate it). One can recognize the standard cases with the predicate plate.programType = 0, meaning that the plate was processed as a "Main" chunk (programType = 0 is "Main"), not as a "special" (programType = 2) or "Southern" (programType = 1) plate. The three-case logic above works fine for "main" targets. The targets for special plates have SpecObj.primTarget & 0x80000000 ≠ 0. Once you know it is a "special" plate, you have to ask if it is a "special target". If it is, you have to ask whether it is in the "Fstar72" group. If not, you can use the standard test ((primTarget & 0x1F) ≠ 0); those nice people did not "overload" the primTarget flags. But the folks who did "Fstar72" overloaded the flags, and so we get the following complex logic:

  -- Select SpecObjects that are either declared QSOs from their spectra
  -- or that were targeted as likely QSOs.
  select S.specObjID
  from   BestDr5.dbo.PlateX     as P
  join   BestDr5.dbo.SpecObjAll as S on P.plateID = S.plateID
  where  S.specClass in (3, 4, 0)      -- class is QSO, HiZ_QSO, or Unknown
     or  S.z > 0.6                     -- or high redshift
     or  (                             -- standard-survey plates
           P.programType = 0           -- MAIN targeting survey
           and S.primTarget & 0x1f != 0
         )
     or  (                             -- special quasar targets from special plates
           -- see http://www.sdss.org/dr4/products/spectra/special.html
           S.primTarget & 0x80000000 != 0
           and (  ( P.programName in ('merged48', 'south22')
                    and S.primTarget & 0x1f != 0 )
               or ( P.programName = 'fstar72'
                    and S.primTarget & 4 != 0 )
               or ( -- bent double-lobed FIRST source counterparts from
                    -- special plates. The "straight double" counterparts
                    -- have already been snuck into the usual FIRST
                    -- counterpart quasar category 0x10.
                    P.programName = 'merged48'
                    and S.primTarget & 0x200000 != 0 )
               )
         )
     or  (                             -- non-special quasar targets from special plates
           S.primTarget & 0x80000000 = 0
           and P.programName in ('merged73', 'merged48', 'south22')
           and S.primTarget & 0x1f != 0
         )

Phase 2: Find the Neighbors. Once the zone table is assembled containing all the candidates, a zones algorithm [1] is used to build a neighbors table among all these objects. Two objects are QSO neighbors if they are within 1.5 arcseconds of one another. The relationship is made transitive, so that friends of friends are all part of the same neighborhood.

Phase 3: Build the Bunches. The neighbors relationship partitions the objects into bunches. We pick a distinguished member from each bunch to represent that bunch, called the bunch head. The selection favors Target, then Spec, then Photo objects; within each category it favors primary, then secondary, then outside objects if there is a tie within one group (e.g., multiple target objects in a bunch). If there are still multiple candidates within these groups, the tie is broken by taking the minimum object ID for PhotoObjs (again, to avoid any selection bias) and the highest S/N for SpecObjs. Given these bunch heads, we record a summary record for each bunch in the QsoBunch table:
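The Phase 2-3 neighbor grouping (transitive friends-of-friends within the 1.5" radius) can be sketched with a union-find pass. This is an illustration only, not the zones algorithm of [1]: it uses an O(n²) pairwise scan and a flat-sky small-angle distance.

```python
import math

ARCSEC = 1.0 / 3600.0  # degrees per arcsecond

def friends_of_friends(positions, radius_arcsec=1.5):
    """Group objects whose pairwise separations chain together within the
    match radius (friends of friends). positions: list of (ra, dec) in
    degrees; flat-sky approximation for illustration. Returns one bunch
    label per object."""
    n = len(positions)
    parent = list(range(n))

    def find(i):                      # union-find with path halving
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    r = radius_arcsec * ARCSEC
    for i in range(n):
        for j in range(i + 1, n):
            ra_i, dec_i = positions[i]
            ra_j, dec_j = positions[j]
            dra = (ra_i - ra_j) * math.cos(math.radians(0.5 * (dec_i + dec_j)))
            ddec = dec_i - dec_j
            if math.hypot(dra, ddec) <= r:
                parent[find(i)] = find(j)   # merge the two neighborhoods
    return [find(i) for i in range(n)]
```

Note that the first and third points of a chain can be farther apart than 1.5" and still share a bunch; only the links need to be within the radius.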
The difference between TargetObjs and TargetPrimaries (etc.) is that TargetObjs counts multiple entries of the same object in the database (e.g., both as a primary and as a secondary), whereas TargetPrimaries helps us identify objects that are either very close together or that were deblended into two objects separated by less than 1.5" (or that lie within a circle of 1.5" radius). Because the object primary flags are not handy at this point of the computation, the bunch statistics are actually computed in Phase 9.

Phase 4: Build the Catalog. Now we grow the QsoCatalogAll table which, for each bunch, has triples drawn from each class of the bunch (a target, a spec, and a best object). For example, the bunch of Figure 1 would produce 4 triples. If there is no object in one of the classes, we fill in with a non-QSO surrogate object: the primary object from that database (Targ, Photo, Spec) closest to the bunch head, or if there is no primary then a secondary (the test insists on the 1.5 arcsecond radius). If no such object can be found, we fill in that slot with a zero object. The resulting table looks like this:
The last 5 "quality fields" are computed in Phase 9.

Phase 5: Find Surrogates for Missing Objects. Some catalog entries have no matching Target, Best, or Spec objects. In these cases we look in the database to find a surrogate object (one that was not a QSO candidate) near the bunch head object; as usual, the search radius is 1.5 arcseconds, and we favor primary over secondary objects and favor low signal-to-noise ratio SpecObjs.

Phase 6: Get the Vital Signs. We now go to the source databases and get the "vital signs" of these photo and spectro objects (both quasar candidates and surrogates), building QsoSpec, QsoTarget, and QsoBest tables to hold these values and, for the photo objects, some additional values from ROSAT and FIRST if there is a match. We then define QsoConcordanceAll as a view on these base tables with the following (~100) fields.

Phase 7: Define the QsoConcordanceAll and QsoConcordance Views. Now we are ready to "glue together" the QsoCatalog with the vital signs to make a "fat table" with all the attributes.
Phase 9: Mark the Primary Triple of Each Bunch, Compute Derived Magnitudes, and Clean Up. Having the QsoConcordanceAll view and all the vital signs in place, we compute some derived values: picking the best triple of each bunch, computing the distances among members of the triple, and computing some derived PSF magnitudes. In the end, the DR5 database has 265,697 bunches, 329,871 triples in the concordance, and 114,883 confirmed quasars. Most bunches have one catalog entry, but about 10% have multiple matches (generally a primary and a secondary best or target object where both are flagged as QSO candidates, or multiple observations of a spectroscopic object). The catalog itself has some interesting cases. In DR5 there are 82,142 cases where the Target, Spec, and Best all agree that the object is a quasar. Since SDSS spectroscopy lags the imaging, it is not surprising that there are 81,011 objects where both Target and Best indicate a likely QSO but there is no spectrogram for the object (the Spec-zero case). With QsoCatalogAll and QsoConcordanceAll in place, we define two views, QsoCatalog (the best of the bunch) and QsoConcordance (the wide version), by picking the best targetObj, spec, and bestObj of each bunch.
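The Phase 4 triple construction reduces to a Cartesian product over the bunch's three classes, with zero standing in for an empty class; the lists-of-IDs interface below is illustrative, not the actual table schema:

```python
from itertools import product

def bunch_triples(target_ids, spec_ids, best_ids):
    """Cartesian product of a bunch's three object classes. An empty
    class contributes the placeholder 0 (the concordance uses zero,
    not NULL, to represent missing objects)."""
    t = target_ids or [0]
    s = spec_ids or [0]
    b = best_ids or [0]
    return list(product(t, s, b))
```

This matches the counting in the text: 2 SpecObj, 3 TargetObj, and 4 BestObj yield 2x3x4 = 24 rows.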
References

[0] Blanton, M., et al., "An Efficient Targeting Strategy for Multiobject Spectrograph Surveys: The Sloan Digital Sky Survey," AJ 125, 2276 (2003).
[1] Szalay, A. S., Fekete, G., O'Mullane, W., Nieto-Santisteban, M. A., Thakar, A. R., Heber, G., Rots, A. H., "There Goes the Neighborhood: Relational Algebra for Spatial Data Search," MSR-TR-2004-32, April 2004.
[2] Szalay, A., Fekete, G., Budavari, T., Gray, J., Pope, A., Thakar, A., "Creating Sectors," August 2003.
Spectroscopic Redshift and Type Determination
The spectro1d pipeline analyzes the combined, merged spectra output by spectro2d and determines object classifications (galaxy, quasar, star, or unknown) and redshifts; it also provides various line measurements and warning flags. The code attempts to measure an emission and an absorption redshift independently for every targeted (non-sky) object. That is, to avoid biases, the absorption and emission codes operate independently, and both operate independently of any target-selection information. The spectro1d pipeline performs a sequence of tasks for each object spectrum on a plate: The spectrum and error array are read in, along with the pixel mask. Pixels with mask bits set to FULLREJECT, NOSKY, NODATA, or BRIGHTSKY are given no weight in the spectro1d routines. The continuum is then fitted with a fifth-order polynomial, with iterative rejection of outliers (e.g., strong lines). The fitted continuum is subtracted from the spectrum. The continuum-subtracted spectra are used for cross-correlating with the stellar templates.

Emission-Line Redshifts

Emission lines (peaks in the one-dimensional spectrum) are found by carrying out a wavelet transform of the continuum-subtracted spectrum fc(λ):

    w(a, b) = ∫ fc(λ) ḡ(λ; a, b) dλ,

where g(λ; a, b) is the wavelet (with complex conjugate ḡ) with translation parameter a and scale parameter b. We apply the à trous wavelet (Starck, Siebenmorgen, & Gredel 1997). For fixed wavelet scale b, the wavelet transform is computed at each pixel center a; the scale b is then increased in geometric steps and the process repeated. Once the full wavelet transform is computed, the code finds peaks above a threshold and eliminates multiple detections (at different b) of a given line by searching nearby pixels. The output of this routine is a set of positions of candidate emission lines.
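A minimal sketch of the à trous ("with holes") scheme, assuming the usual B3-spline version: at scale j the kernel taps are spaced 2^j pixels apart, and each wavelet plane is the difference between successive smooths, so narrow emission lines appear as localized peaks in the small-scale planes. This is an illustration, not the pipeline's implementation.

```python
def _reflect(i, n):
    """Mirror an out-of-range index back into [0, n) (one bounce)."""
    if i < 0:
        i = -i
    if i >= n:
        i = 2 * n - 2 - i
    return i

def a_trous_planes(signal, n_scales=3):
    """A trous transform: smooth with the B3-spline kernel (1,4,6,4,1)/16
    whose taps are spaced 2**j samples apart at scale j; each wavelet
    plane is the difference of successive smooths. Returns (planes,
    final smooth), which sum back to the input signal exactly."""
    kernel = ((-2, 1 / 16), (-1, 4 / 16), (0, 6 / 16), (1, 4 / 16), (2, 1 / 16))
    n = len(signal)
    smooth = list(signal)
    planes = []
    for j in range(n_scales):
        step = 2 ** j
        smoother = [sum(w * smooth[_reflect(i + k * step, n)] for k, w in kernel)
                    for i in range(n)]
        planes.append([a - b for a, b in zip(smooth, smoother)])
        smooth = smoother
    return planes, smooth
```

Because the planes telescope (s0-s1, s1-s2, ...), adding all planes to the final smooth reconstructs the input, and a line at a given pixel peaks at that pixel in the small-scale planes.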
This list of lines with nonzero weights is matched against a list of common galaxy and quasar emission lines, many of which were measured from the composite quasar spectrum of Vanden Berk et al. (2001; because of velocity shifts of different lines in quasars, the wavelengths listed do not necessarily match their rest-frame values). Each significant peak found by the wavelet routine is assigned a trial line identification from the common list (e.g., MgII) and an associated trial redshift. The peak is fitted with a Gaussian, and the line center, width, and height above the continuum are stored in HDU 2 of the spSpec*.fits files as parameters wave, sigma, and height, respectively. If the code detects close neighboring lines, it fits them with multiple Gaussians. Depending on the trial line identification, the line width it tries to fit is physically constrained. The code then searches for the other expected common emission lines at the appropriate wavelengths for that trial redshift and computes a confidence level (CL) by summing over the weights of the found lines and dividing by the summed weights of the expected lines. The CL is penalized if the different line centers do not quite match. Once all of the trial line identifications and redshifts have been explored, an emission-line redshift is chosen as the one with the highest CL and stored as z in the EmissionRedshift table and the spSpec*.fits emission-line HDU. The exact expression for the emission-line CL has been tweaked to match our empirical success rate in assigning correct emission-line redshifts, based on manual inspection of a large number of spectra from the EDR. The SpecLine table also gives the errors, continuum, equivalent width, chi-squared, spectral index, and significance of each line. We caution that the emission-line measurement for Hα should only be used if chi-squared is less than 2.5.
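The confidence-level bookkeeping described above can be sketched as follows. The matching tolerance and the linear mismatch penalty are toy assumptions; the text only says the real penalty is an empirically tuned tweak.

```python
def trial_redshift_cl(trial_z, expected_rest, found_centers, weights, tol=5.0):
    """Toy emission-line confidence level: for a trial redshift, each
    expected line contributes its weight if a found peak lies within
    tol Angstroms of its redshifted wavelength, de-weighted linearly by
    the center mismatch (the actual penalty is empirically calibrated).
    Returns summed matched weight / summed expected weight."""
    total = sum(weights[name] for name in expected_rest)
    score = 0.0
    for name, lam_rest in expected_rest.items():
        lam_obs = lam_rest * (1.0 + trial_z)          # expected position
        best = min((abs(c - lam_obs) for c in found_centers), default=None)
        if best is not None and best < tol:
            score += weights[name] * (1.0 - best / tol)
        # an expected line with no nearby found peak contributes nothing
    return score / total if total else 0.0
```

The trial redshift with the highest such CL over all trial line identifications would then be adopted as the emission-line redshift.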
In the SpecLine table, the "found" lines in HDU 1 denote only those lines used to measure the emission-line redshift, while the "measured" lines in HDU 2 are all lines in the emission-line list, measured at the redshifted positions appropriate to the final redshift assigned to the object. A separate routine searches for high-redshift (z > 2.3) quasars by identifying spectra that contain a Lyα forest signature: a broad emission line with more fluctuation on the blue side than on the red side of the line. The routine outputs the wavelength of the Lyα emission line; while this allows a determination of the redshift, it is not a high-precision estimate, because the Lyα line is intrinsically broad and affected by Lyα absorption. The spectro1d pipeline stores this as an additional emission-line redshift in the EmissionRedshift table. If the highest-CL emission-line redshift uses lines expected only for quasars (e.g., Lyα, CIV, CIII]), then the object is provisionally classified as a quasar. These provisional classifications hold up if the final redshift assigned to the object (see below) agrees with its emission redshift.

Cross-Correlation Redshifts

The spectra are cross-correlated with stellar, emission-line galaxy, and quasar template spectra to determine a cross-correlation redshift and error. The cross-correlation templates are obtained from SDSS commissioning spectra of high signal-to-noise ratio and comprise roughly one for each stellar spectral type from B to almost L, a nonmagnetic and a magnetic white dwarf, an emission-line galaxy, a composite LRG spectrum, and a composite quasar spectrum (from Vanden Berk et al. 2001). The composites are based on co-additions of ∼2000 spectra each. The template redshifts are determined by cross-correlation with a large number of stellar spectra from SDSS observations of the M67 star cluster, whose radial velocity is precisely known.
When an object spectrum is cross-correlated with the stellar templates, its found emission lines are masked out, i.e., the redshift is derived from the absorption features. The cross-correlation routine follows the technique of Tonry & Davis (1979): the continuum-subtracted spectrum is Fourier-transformed and convolved with the transform of each template. For each template, the three highest cross-correlation function (CCF) peaks are found, fitted with parabolas, and output with their associated confidence limits. The corresponding redshift errors are given by the widths of the CCF peaks. The cross-correlation CLs are empirically calibrated as a function of peak level, based on manual inspection of a large number of spectra from the EDR. The final cross-correlation redshift is then chosen as the one with the highest CL from among all of the templates. If there are discrepant high-CL cross-correlation peaks, i.e., if the highest peak has CL < 0.99 and the next highest peak corresponds to a CL that is greater than 70% of the highest peak, then the code extends the cross-correlation analysis for the corresponding templates to lower wavenumber and includes the continuum in the analysis, i.e., it chooses the redshift based on which template provides a better match to the continuum shape of the object. These flagged spectra are then manually inspected (see below). The cross-correlation redshift is stored as z in the CrossCorrelationRedshift table.

Final Redshifts and Spectrum Classification

The spectro1d pipeline assigns a final redshift to each object spectrum by choosing the emission or cross-correlation redshift with the highest CL and stores this as z in the SpecObj table. A redshift status bit mask zStatus and a redshift warning bit mask zWarning are stored. The CL is stored in zConf. Objects with redshifts determined manually (see below) have CL set to 0.95 (MANUAL_HIC set in zStatus), or 0.4 or 0.65 (MANUAL_LOC set in zStatus).
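The core of the cross-correlation step can be sketched in the spirit of Tonry & Davis: on a common log-wavelength grid, a redshift is a uniform pixel shift, so the lag of the CCF peak gives z. This toy version uses a direct (non-FFT) correlation and skips the parabola fit, peak widths, and CL calibration.

```python
import math

def ccf_redshift(spectrum, template, dloglam=1e-4, max_lag=50):
    """Toy cross-correlation redshift: spectrum and template are flux
    arrays on a shared log10-wavelength grid with pixel size dloglam.
    The lag maximizing the cross-correlation gives
        z = 10**(lag * dloglam) - 1.
    Direct correlation for clarity; real pipelines use FFTs."""
    n = len(spectrum)
    best_lag, best_val = 0, -math.inf
    for lag in range(-max_lag, max_lag + 1):
        # positive lag: spectrum features sit redward of the template's
        val = sum(spectrum[i] * template[i - lag]
                  for i in range(max(0, lag), min(n, n + lag)))
        if val > best_val:
            best_lag, best_val = lag, val
    return 10 ** (best_lag * dloglam) - 1.0
```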
Rarely, objects have the entire red or blue half of the spectrum missing; such objects have their CLs reduced by a factor of 2, so they are automatically flagged as having low confidence, and the mask bit Z_WARNING_NO_BLUE or Z_WARNING_NO_RED is set in zWarning as appropriate. All objects are classified in specClass as either a quasar, high-redshift quasar, galaxy, star, late-type star, or unknown. If the object has been identified as a quasar by the emission-line routine, and if the emission-line redshift is chosen as the final redshift, then the object retains its quasar classification. Also, if the quasar cross-correlation template provides the final redshift for the object, then the object is classified as a quasar. If the object has a final redshift z > 2.3 (so that Lyα is or should be present in the spectrum), and if at least two out of the three redshift estimators agree on this (the three estimators being the emission-line, Lyα, and cross-correlation redshifts), then it is classified as a high-z quasar. If the object has a redshift cz < 450 km/s, then it is classified as a star. If the final redshift is obtained from one of the late-type stellar cross-correlation templates, it is classified as a late-type star. If the object has a cross-correlation CL < 0.25, it is classified as unknown. There exist among the spectra a small number of composite objects. Most common are bright stars on top of galaxies, but there are also galaxy-galaxy pairs at distinct redshifts, at least one galaxy-quasar pair, and one galaxy-star pair. Most of these have the zWarning flag set, indicating that more than one redshift was found. The zWarning bit mask mentioned above records problems that the spectro1d pipeline found with each spectrum. It provides compact information about the spectra for end users, and it is also used to trigger manual inspection of a subset of spectra on every plate.
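The classification rules above can be rendered as a decision function. The rule ordering and the argument encoding are illustrative assumptions; the pipeline carries more state than is modeled here.

```python
C_KMS = 299792.458  # speed of light, km/s

def classify(final_z, z_source, emission_says_qso, n_hiz_agree, xcorr_cl,
             late_type_template=False):
    """Toy rendering of the specClass rules in the text.
    z_source: 'emission', 'qso_template', or 'xcorr' (illustrative tags);
    n_hiz_agree: how many of the emission-line, Lya, and cross-correlation
    redshifts agree that z > 2.3. Precedence order is an assumption."""
    if xcorr_cl < 0.25:
        return "unknown"
    if final_z > 2.3 and n_hiz_agree >= 2:
        return "hiz_qso"
    if (emission_says_qso and z_source == "emission") or z_source == "qso_template":
        return "qso"
    if C_KMS * final_z < 450.0:                 # cz < 450 km/s
        return "late_star" if late_type_template else "star"
    return "galaxy"
```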
Users should particularly heed warnings about parts of the spectrum missing, low signal-to-noise ratio in the spectrum, significant discrepancies between the various measures of the redshift, and especially low confidence in the redshift determination. In addition, redshifts for objects with zStatus = FAILED should not be used.

Spectral Classification Using Eigenspectra

In addition to spectral classification based on measured lines, galaxies are classified by a Principal Component Analysis (PCA), using cross-correlation with eigentemplates constructed from SDSS spectroscopic data. The 5 eigencoefficients and a classification number are stored in eCoeff and eClass, respectively, in the SpecObj table and the spSpec files. eClass, a single-parameter classifier based on the expansion coefficients (eCoeff1-5), ranges from about -0.35 to 0.5 for early- to late-type galaxies. A number of changes to eClass have occurred since the EDR. The galaxy spectral classification eigentemplates for DR1 are created from a much larger sample of spectra than were used in the Stoughton et al. EDR paper, now numbering approximately 200,000. The eigenspectra used in DR1 are an early version of those created by Yip et al. (in prep). The sign of the second eigenspectrum has been reversed with respect to that of the EDR; therefore we recommend using the expression

Manual Inspection of Spectra

A small percentage of spectra on every plate are inspected manually, and if necessary, the redshift, classification, zStatus, and CL are corrected. We inspect those spectra whose zWarning or zStatus indicates that there were multiple high-confidence cross-correlation redshifts, that the redshift was high (z > 3.2 for a quasar or z > 0.5 for a galaxy), that the confidence was low, that the signal-to-noise ratio was low in r, or that the spectrum was not measured. All objects with zStatus = EMLINE_HIC or EMLINE_LOC, i.e., for which the redshift was determined only by emission lines, are also examined. If, however, the object has a final CL > 0.98 and zStatus of either XCORR_EMLINE or EMLINE_XCORR, then despite the above, it is not manually checked. All objects with either specClass = SPEC_UNKNOWN or zStatus = FAILED are manually inspected. Roughly 8% of the spectra in the EDR were thus inspected, of which about one-eighth, or 1% overall, had the classification, redshift, zStatus, or CL manually corrected. Such objects are flagged with zStatus changed to MANUAL_HIC or MANUAL_LOC, depending on whether we had high or low confidence in the classification and redshift from the manual inspection. Tests on the validation plates, described in the next section, indicate that this selection of spectra for manual inspection successfully finds over 95% of the spectra for which the automated pipeline assigns an incorrect redshift.
Resolving Multiple Detections and Defining Samples
In addition to reading this section, we recommend that users familiarize themselves with the "status" flags, which indicate what happened to each object during the Resolve procedure. SDSS scans overlap, leading to duplicate detections of objects in the overlap regions. A variety of unique (i.e., containing no duplicate detections of any objects), well-defined (i.e., with explicit area boundaries) samples may be derived from the SDSS database. This section describes how to define those samples. The resolve figure is a useful visual aid for the discussion presented below.

Consider a single drift scan along a stripe, called a run. The camera has six columns of CCDs, which scan six swaths across the sky. A given camera column is referred to throughout with the abbreviation camCol. The unit for data processing is the data from a single camCol for a single run. The same data may be processed more than once; repeat processing of the same run/camCol is assigned a unique rerun number. Thus, the fundamental unit of data processing is identified by run/rerun/camCol. While the data from a single run/rerun/camCol is a scan line of data 2048 columns wide by a variable number of rows (approximately 133,000 rows per hour of scanning), for purposes of data processing the data is split into frames 2048 columns wide by 1361 rows long, resulting in approximately 100 frames per scan line per hour of scanning. Additionally, the first 128 rows from the next frame are added to the previous frame, leading to frames 2048 columns wide by 1489 rows long, where the first and last 128 rows overlap the previous and next frame, respectively. Each frame is processed separately. This leads to duplicate detections for objects in the overlap regions between frames. For each frame, we split the overlap regions in half, and consider only those objects whose centroids lie between rows 64 and 1361+64 as the unique detection of that object for that run/rerun/camCol.
These objects have the OK_RUN bit set in the "status" bit mask. Thus, if you want a unique sample of all objects detected in a given run/rerun/camCol, restrict yourself to all objects in that run/rerun/camCol with the OK_RUN bit set. The boundaries of this sample are poorly defined, as the area of sky covered depends on the telescope tracking. Objects must satisfy other criteria as well to be labeled OK_RUN; an object must not be flagged BRIGHT (as there is a duplicate "regular" detection of the same object); and must not be a deblended parent (as the children are already included); thus it must not be flagged BLENDED unless the NODEBLEND flag is set. Such objects have their GOOD bit set. For each stripe, 12 non-overlapping but contiguous scan lines are defined parallel to the stripe great circle (that is, they are bounded by two lines of constant great circle latitude). Each scan line is 0.20977 arcdegrees wide (in great circle latitude). Each run/camCol scans along one of these scan lines, completely covering the extent of the scan line in latitude, and overlapping the adjacent scan lines by approximately 1 arcmin. Six of these scan lines are covered when the "north" strip of the stripe is scanned, and the remaining six are covered by the "south" strip. The fundamental unit for defining an area of the sky considered as observed at sufficient quality is the segment. A segment consists of all OK_RUN objects for a given run/rerun/camCol contained within a rectangle defined by two lines of constant great circle longitude (the east and west boundaries) and two lines of constant great circle latitude (the north and south boundaries, being the same two lines of constant great circle latitude which define the scan line). Such objects have their OK_SCANLINE bit set in the status bit mask. 
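The frame-overlap rule above (split the 128-row overlaps in half, keep centroids between rows 64 and 1361+64) can be sketched as a single predicate. The half-open boundary convention is an assumption; the text does not specify which edge is inclusive.

```python
FRAME_ROWS = 1361   # unique rows per frame (frames are 1489 rows with overlap)
HALF_OVERLAP = 64   # half of the 128-row overlap with the adjacent frame

def is_unique_in_frame(row_centroid):
    """True when a detection in a 1489-row frame counts as the unique
    detection for its run/rerun/camCol: the centroid must fall between
    row 64 and row 1361+64. Half-open interval assumed here."""
    return HALF_OVERLAP <= row_centroid < FRAME_ROWS + HALF_OVERLAP
```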
A segment consists of a contiguous set of fields, but only portions of the first and/or last field may be contained within the segment, and indeed a given field could be divided between two adjacent segments. If an object is in the first field in a segment, then its FIRST_FIELD bit is set, along with the OK_SCANLINE bit; if it is not in the first field in the segment, then the OK_SCANLINE bit is set but the FIRST_FIELD bit is not. This extra complication is necessary for fields which are split between two segments; those OK_SCANLINE objects without the FIRST_FIELD bit set belong to the first segment (the segment for which this field is the last field), and those OK_SCANLINE objects with the FIRST_FIELD bit set belong to the second segment (the segment for which this field is the first field). A chunk consists of a non-overlapping, contiguous set of segments which span a range in great circle longitude over all 12 scan lines for a single stripe. Thus, the set of OK_SCANLINE objects (with appropriate attention to the FIRST_FIELD bit) in all segments for a given chunk comprises a unique sample of objects in an area bounded by two lines of constant great circle longitude (the east and west boundaries of the chunk) and two lines of constant great circle latitude (+-1.25865 degrees, the north and south boundaries of the chunk). Segments and chunks are defined in great circle coordinates along their given stripe, and contain unique detections only relative to other segments and chunks along the same stripe. Each stripe is defined by a great circle, which is a line of constant latitude in survey coordinates (in survey coordinates, lines of constant latitude are great circles while lines of constant longitude are small circles, switched from the usual meaning of latitude and longitude).
Since chunks are 2.51729 arcdegrees wide, but stripes are separated by 2.5 degrees (in survey latitude), chunks on adjacent stripes can overlap (and towards the poles of the survey coordinate system chunks from more than two stripes can overlap in the same area of sky). A unique sample of objects spanning multiple stripes may then be defined by applying additional cuts in survey coordinates. For a given chunk, all objects that lie within +- 1.25 degrees in survey latitude of its stripe's great circle have the OK_STRIPE bit set in the "status" bit mask. All OK_STRIPE objects comprise a unique sample of objects across all chunks, and thus across the entire survey area. The southern stripes (stripes 76, 82, and 86) do not have adjacent stripes, and thus no cut in survey latitude is required; for the southern stripes only, all OK_SCANLINE objects are also marked as OK_STRIPE, with no additional survey latitude cuts. Finally, the official survey area is defined by two lines of constant survey longitude for each stripe, with the lines being different for each stripe. All OK_STRIPE objects falling within the specified survey longitude boundaries for their stripe have the PRIMARY bit set in the "status" bit mask. Those objects comprise the unique SDSS sample of objects in that portion of the survey which has been finished to date. Those OK_RUN objects in a segment which fail either the great circle latitude cut for their segment, or the survey latitude or longitude cut for their stripe, have their SECONDARY bit set. They do not belong to the primary sample, and represent either duplicate detections of PRIMARY objects in the survey area, or detections outside the area of the survey which has been finished to date. Objects that lie close to the bisector between frames, scan lines, or chunks present some difficulty. Errors in the centroids or astrometric calibrations can place such objects on either side of the bisector. 
A resolution is performed at all bisectors, and if two objects lie within 2 arcsec of each other, then one object is declared OK_RUN/OK_SCANLINE/OK_STRIPE (depending on the test), and the other is not.
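The survey-coordinate cuts described above can be sketched as a status assignment. The bit values here are hypothetical placeholders, not the actual SDSS "status" encoding, and the function models only the stripe-level latitude and longitude cuts:

```python
# Hypothetical bit values for illustration only (not the SDSS encoding).
OK_STRIPE = 0x1
PRIMARY   = 0x2
SECONDARY = 0x4
SOUTHERN_STRIPES = {76, 82, 86}   # no adjacent stripes, so no latitude cut

def stripe_status(stripe, survey_lat, survey_lon, lon_min, lon_max):
    """survey_lat is measured from the stripe's great circle; an object
    within +-1.25 degrees (or on a southern stripe) gets OK_STRIPE, and
    OK_STRIPE objects inside the stripe's survey-longitude boundaries
    get PRIMARY. Everything else falls to SECONDARY."""
    status = 0
    if stripe in SOUTHERN_STRIPES or abs(survey_lat) <= 1.25:
        status |= OK_STRIPE
        if lon_min <= survey_lon <= lon_max:
            status |= PRIMARY
    if not (status & PRIMARY):
        status |= SECONDARY
    return status
```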
Transformations between SDSS magnitudes and UBVRcIc | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
There have been several efforts to calculate transformation equations between ugriz (or u'g'r'i'z') and UBVRcIc. Here, we focus on six of the most current efforts:
There are currently no transformation equations explicitly for galaxies, but Jester et al.'s (2005) and Lupton's (2005) transformation equations for stars should also provide reasonable results for normal galaxies (i.e., galaxies without strong emission lines). Caveat: note that these transformation equations are for the SDSS ugriz (u'g'r'i'z') magnitudes as measured, not for SDSS ugriz (u'g'r'i'z') corrected for AB offsets. If you need AB ugriz magnitudes, please remember to convert from SDSS ugriz to AB ugriz using the AB offsets described at this URL. At the end of this webpage, we estimate the ugriz colors of Vega and the Sun.

Jester et al. (2005)

The following transformation equations were extracted from Table 1 of Jester et al. (2005) and are generally useful for stars and for quasars. The transformation equations for z <= 2.1 quasars are based upon synthetic photometry of an updated version of the quasar composite spectrum of Vanden Berk et al. (2001) using DR1 data, as well as the red and reddened quasar composites of Richards et al. (2003). The transformations for stars were derived from the Smith et al. (2002) u'g'r'i'z' photometry of Landolt stars, suitably transformed from the USNO-1.0m u'g'r'i'z' system to the SDSS 2.5m ugriz system via the u'g'r'i'z'-to-ugriz transformations. The transformation equations for stars supersede those of Fukugita et al. (1996) and Smith et al. (2002).
UBVRcIc -> ugriz
================

Quasars at z <= 2.1 (synthetic)

    Transformation                    RMS residual
    u-g = 1.25*(U-B) + 1.02           0.03
    g-r = 0.93*(B-V) - 0.06           0.09
    r-i = 0.90*(Rc-Ic) - 0.20         0.07
    r-z = 1.20*(Rc-Ic) - 0.20         0.18
    g   = V + 0.74*(B-V) - 0.07       0.02
    r   = V - 0.19*(B-V) - 0.02       0.08

Stars with Rc-Ic < 1.15 and U-B < 0

    Transformation                    RMS residual
    u-g = 1.28*(U-B) + 1.14           0.05
    g-r = 1.09*(B-V) - 0.23           0.04
    r-i = 0.98*(Rc-Ic) - 0.22         0.01
    r-z = 1.69*(Rc-Ic) - 0.42         0.03
    g   = V + 0.64*(B-V) - 0.13       0.01
    r   = V - 0.46*(B-V) + 0.11       0.03

All stars with Rc-Ic < 1.15

    Transformation                    RMS residual
    u-g = 1.28*(U-B) + 1.13           0.06
    g-r = 1.02*(B-V) - 0.22           0.04
    r-i = 0.91*(Rc-Ic) - 0.20         0.03
    r-z = 1.72*(Rc-Ic) - 0.41         0.03
    g   = V + 0.60*(B-V) - 0.12       0.02
    r   = V - 0.42*(B-V) + 0.11       0.03

ugriz -> UBVRcIc
================

Quasars at z <= 2.1 (synthetic)

    Transformation                    RMS residual
    U-B   = 0.75*(u-g) - 0.81         0.03
    B-V   = 0.62*(g-r) + 0.15         0.07
    V-R   = 0.38*(r-i) + 0.27         0.09
    Rc-Ic = 0.72*(r-i) + 0.27         0.06
    B     = g + 0.17*(u-g) + 0.11     0.03
    V     = g - 0.52*(g-r) - 0.03     0.05

Stars with Rc-Ic < 1.15 and U-B < 0

    Transformation                    RMS residual
    U-B   = 0.77*(u-g) - 0.88         0.04
    B-V   = 0.90*(g-r) + 0.21         0.03
    V-R   = 0.96*(r-i) + 0.21         0.02
    Rc-Ic = 1.02*(r-i) + 0.21         0.01
    B     = g + 0.33*(g-r) + 0.20     0.02
    V     = g - 0.58*(g-r) - 0.01     0.02

All stars with Rc-Ic < 1.15

    Transformation                    RMS residual
    U-B   = 0.78*(u-g) - 0.88         0.05
    B-V   = 0.98*(g-r) + 0.22         0.04
    V-R   = 1.09*(r-i) + 0.22         0.03
    Rc-Ic = 1.00*(r-i) + 0.21         0.01
    B     = g + 0.39*(g-r) + 0.21     0.03
    V     = g - 0.59*(g-r) - 0.01     0.01

Karaali, Bilir, and Tuncel (2005)

These transformations appeared in Karaali, Bilir, and Tuncel (2005). They are based on Landolt (1992) UBV data for 224 stars in the color range 0.3 < B-V < 1.1 with SDSS ugr photometry from the CASU INT Wide Field Survey. An improvement over previous SDSS<->UBVRcIc transformations is the use of two colors in each equation, which is particularly helpful for the u-g transformation.
UBVRcIc -> ugriz
================

Stars with 0.3 < B-V < 1.1

    u-g = 0.779*(U-B) + 0.755*(B-V) + 0.801
    g-r = 1.023*(B-V) + 0.016*(U-B) - 0.187

ugriz -> UBVRcIc
================

Stars with 0.3 < B-V < 1.1

    B-V = 0.992*(g-r) - 0.0199*(u-g) + 0.202

Bilir, Karaali, and Tuncel (2005)

These transformation equations appeared in Bilir, Karaali, and Tuncel (2005, AN 326, 321). They are based upon 195 dwarf stars that have both ugriz photometry and Landolt UBV photometry.

UBVRcIc -> ugriz
================

Dwarf (Main Sequence) Stars

    g-r = 1.124*(B-V) - 0.252
    r-i = 1.040*(B-V) - 0.224
    g   = V + 0.634*(B-V) - 0.108

West, Walkowicz, and Hawley (2005)

These transformation equations appeared in West, Walkowicz, and Hawley (2005, PASP 117, 706). They are based upon photometry of M and L dwarf stars from SDSS Data Release 3.

UBVRcIc -> ugriz
================

M0-L0 Dwarfs, 0.67 <= r-i <= 2.01

    Transformation                    RMS residual
    r-i = -2.69 + 2.29*(V-Ic)         0.05
          - 0.28*(V-Ic)**2

M0-L0 Dwarfs, 0.37 <= i-z <= 1.84

    Transformation                    RMS residual
    i-z = -20.6 + 26.0*(Ic-Ks)        0.10
          - 11.7*(Ic-Ks)**2
          - 2.30*(Ic-Ks)**3
          - 0.17*(Ic-Ks)**4

Rodgers et al. (2005)

These equations are from Rodgers et al. (2005, AJ, submitted). They are based upon a set of main-sequence stars from the Smith et al. (2002) u'g'r'i'z' standard star network that also have Landolt UBVRcIc photometry. Note that these equations, strictly speaking, transform from UBVRcIc to u'g'r'i'z' and not to ugriz; the transformation from u'g'r'i'z' to ugriz, however, is rather small. Note also, as with the Karaali, Bilir, and Tuncel (2005) transformations listed above, two colors are used in the u'-g' and g'-r' equations to improve the fits. The use of two colors is especially useful for u'-g', which is strongly affected by the Balmer discontinuity.
UBVRcIc -> u'g'r'i'z'
=====================

Main Sequence Stars

    u'-g' = 1.101(+/-0.004)*(U-B) + 0.358(+/-0.004)*(B-V) + 0.971
    g'-r' = 0.278(+/-0.016)*(B-V) + 1.321(+/-0.030)*(V-Rc) - 0.219
    r'-i' = 1.070(+/-0.009)*(Rc-Ic) - 0.228
    r'-z' = 1.607(+/-0.012)*(Rc-Ic) - 0.371

Lupton (2005)

These are equations that Robert Lupton derived by matching DR4 photometry to Peter Stetson's published photometry for stars.

Stars

    B = u - 0.8116*(u - g) + 0.1313;  sigma = 0.0095
    B = g + 0.3130*(g - r) + 0.2271;  sigma = 0.0107
    V = g - 0.2906*(u - g) + 0.0885;  sigma = 0.0129
    V = g - 0.5784*(g - r) - 0.0038;  sigma = 0.0054
    R = r - 0.1837*(g - r) - 0.0971;  sigma = 0.0106
    R = r - 0.2936*(r - i) - 0.1439;  sigma = 0.0072
    I = r - 1.2444*(r - i) - 0.3820;  sigma = 0.0078
    I = i - 0.3780*(i - z) - 0.3974;  sigma = 0.0063

Here is the CAS SQL query Robert used to perform the matchup of DR4 photometry with Stetson's:

    select dbo.fSDSS(P.objId) as ID, name,
           S.B, S.Berr, S.V, S.Verr, S.R, S.Rerr, S.I, S.Ierr,
           psfMag_u, psfMagErr_u, psfMag_g, psfMagErr_g,
           psfMag_r, psfMagErr_r, psfMag_i, psfMagErr_i,
           psfMag_z, psfMagErr_z,
           case when 0 = (flags_u & 0x800d00000000000) and status_u = 0
                then 1 else 0 end as good_u,
           case when 0 = (flags_g & 0x800d00000000000) and status_g = 0
                then 1 else 0 end as good_g,
           case when 0 = (flags_r & 0x800d00000000000) and status_r = 0
                then 1 else 0 end as good_r,
           case when 0 = (flags_i & 0x800d00000000000) and status_i = 0
                then 1 else 0 end as good_i,
           case when 0 = (flags_z & 0x800d00000000000) and status_z = 0
                then 1 else 0 end as good_z
    from stetson as S
    join star as P on S.objId = P.objId
    join field as F on P.fieldId = F.fieldId
    where 0 = (flags & 0x40006)

Estimates for the ugriz Colors of Vega and the Sun

Assuming V = +0.03 and U-B = B-V = V-Rc = Rc-Ic = 0.00, we find for the A0V star Vega the following:

    g   = -0.08 (+/-0.03)
    u-g = +1.02 (+/-0.08)
    g-r = -0.25 (+/-0.03)
    r-i = -0.23 (+/-0.02)
    i-z = -0.17 (+/-0.02)

where we used the Bilir, Karaali, and Tuncel (2005) transformation for g and the Rodgers et al. (2005) transformations (plus the u'g'r'i'z'-to-ugriz transformations) for the u-g, g-r, r-i, and i-z colors. The error bars in parentheses are rough estimates of the systematic errors, based upon the different values that different sets of transformation equations yield.

Assuming M(V) = +4.82, U-B = +0.195, B-V = +0.650, V-Rc = +0.36, and Rc-Ic = +0.32, we find for the Sun the following:

    M(g) = +5.12 (+/-0.02)
    u-g  = +1.43 (+/-0.05)
    g-r  = +0.44 (+/-0.02)
    r-i  = +0.11 (+/-0.02)
    i-z  = +0.03 (+/-0.02)

where, again, we used the Bilir, Karaali, and Tuncel (2005) transformation for g and the Rodgers et al. (2005) transformations (plus the u'g'r'i'z'-to-ugriz transformations) for the u-g, g-r, r-i, and i-z colors. As above, the error bars in parentheses are rough estimates of the systematic errors.

Last modified: Mon Apr 10 21:30:06 BST 2006
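The tabulated transformations are straightforward to apply in code. Below is a minimal sketch using the Jester et al. (2005) "all stars with Rc-Ic < 1.15" equations from the table above; the function name and interface are illustrative (not part of any SDSS tool), and the result is SDSS ugriz as measured, not AB-corrected.

```python
# Illustrative sketch: apply the Jester et al. (2005) "all stars with
# Rc-Ic < 1.15" transformations quoted above. The colors u-g, g-r,
# r-i, r-z are transformed, then chained into individual magnitudes.

def johnson_to_sdss(U, B, V, Rc, Ic):
    """Convert Johnson-Cousins UBVRcIc to approximate SDSS ugriz."""
    g = V + 0.60 * (B - V) - 0.12
    r = V - 0.42 * (B - V) + 0.11
    u = g + 1.28 * (U - B) + 1.13      # u = g + (u-g)
    i = r - (0.91 * (Rc - Ic) - 0.20)  # i = r - (r-i)
    z = r - (1.72 * (Rc - Ic) - 0.41)  # z = r - (r-z)
    return u, g, r, i, z
```

For example, a Vega-like star (V = +0.03, all colors zero) comes out with g-r about -0.23, close to the value estimated for Vega at the end of this page.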
The SDSS spectroscopic survey will consist of about 2000 circular Tiles, each about 1.5 deg in radius, which contain the objects for a given spectroscopic observation. There are more opportunities to target (obtain the spectrum of) an object if it is covered by multiple tiles: if three tiles cover an area, objects in that area have three times the opportunity to be targeted. At the same time, objects are not targeted uniformly over a plate. The targeting is driven by a program that uses the SDSS photometric observations to schedule the spectroscopic observations. These photometric observations are 2.5 deg wide stripes across the sky. The stripes overlap by about 15%, so the sky is partitioned into disjoint staves, and the tiling is actually done in terms of these staves (see Figure 1). Staves are often misnamed stripes in the database and in other SDSS documentation.
Figure 1. Observations consist of overlapping stripes partitioned into disjoint staves. Tiling Runs work on a set of staves, and each Tiling Geometry region is contained within a stave.
Spectroscopic targeting is done by a tiling run that works with a collection of staves - actually not whole staves but segments of them called chunks. The tiling run generates tiles that define which objects are going to be observed (actually, which holes to drill in an SDSS spectroscopic plate). The tiling run also generates a list of TilingGeometry rectangular regions that describe the sections of the staves that were used to make the tiles. Some TilingGeometry rectangles are positive; others are negative (masks or holes). Subsequent tiling runs may use the same staves (chunks), so tiling runs are not necessarily disjoint. TilingGeometries therefore form rather complex intersections, which we call SkyBoxes.
The goal is to compute contiguous sectors covered by some number of plates and at least one positive TilingGeometry. We also want to know how many plates cover the sector.
This is a surprisingly difficult task because there are subtle interactions. We will develop the algorithm to compute sectors in steps. First we ignore the TilingGeometry and just compute the wedges (Boolean combinations of tiles). Then we build TileBoxes, positive quadrilateral partitions of each tiling region that cover the regions. SkyBoxes are the synthesis of the TileBoxes from several tiling runs into a partitioning of the survey footprint into disjoint positive quadrilaterals. Then, to compute sectors, we simply intersect all wedges with all SkyBoxes. The residue is the tile coverage of the survey. A tile contributes to a sector if the tile contributes to the wedge and the tile was created by one of the tiling runs that contain the SkyBox (you will probably understand that last sentence better after you read to the end of this paper).
Figure 2. A wedge and sector covered by one plate. There are adjoining wedges covered by 2, 3, 4 plates. The lower left corner is an area that is not part of any wedge or sector. SkyBoxes break wedges into sectors and may mask parts of a wedge.
Figure 3. Tile A has a blue boundary; tile B has a red boundary; both are regions of depth 1. Their intersection is yellow, a region of depth 2. The crescents shaded in blue and green are the two wedges of depth 1, and the yellow area is a wedge of depth 2. Nodes are purple dots.
A sector is a wedge modified by intersections with overlapping TilingGeometry regions. If the TilingGeometry regions are complex (multiple convexes) or if they are holes (isMask=1), then the result of the intersection may also be complex (a region of multiple wedges). By going to a SkyBox model we keep things simple. Since SkyBoxes partition the sky into areas of known tile-run depth, SkyBox boundaries do not add any depth to the sectors; they just truncate them to fit in the box boundary and perhaps mask a tile if that tile is in a TilingGeometry hole or if the tile that contributes to that wedge is not part of the TilingGeometry (one of the tiling runs) that make up that SkyBox (Figure 4 shows a simple example of these concepts).
Figure 4. This shows how the tiles and TilingGeometry rectangles intersect to form sectors. The layout has wedges of various depths: depth 1 is gray, depth 2 is light blue, depth 3 is yellow, and depth 4 is magenta. The wedges are clipped by the TilingGeometry boundary to form sectors.
To get started, spCreateWedges() computes the wedge regions, placing them in the Sectors table, and, for each wedge W and each tile T that adds to or subtracts from W, records the T->W mapping in the Sectors2Tiles table (both positive and negative parents). So, in Figure 3, the green wedge (the leftmost wedge) would have tile A as a positive parent and tile B as a negative parent.
Figure 5. Staves (convex sides not illustrated) are processed in chunks. A TilingGeometry is a chunk/stave subset with holes (masks). TileBoxes cover a TilingGeometry with disjoint spherical rectangles. There are many such coverings; two are shown for TG1. The one at left has 23 TileBoxes, while the one at right has 7 TileBoxes.
It is not immediately obvious how to construct the TileBoxes. Figure 6 gives some idea.
First, the whole operation of subtracting out the masks happens inside the larger TilingGeometry, called the Universe, U. We are going to construct nibbles, which are a disjunctive normal form of the blue area, each with at least one negated hole edge to make sure we exclude the hole. These nibbles are disjoint, cover the TilingGeometry, and exclude the mask (white) area.
As described in "There Goes the Neighborhood: Relational Algebra for Spatial Data Search", we represent spherical polygons as a set of half-space constraints of the form h = (hx,hy,hz,c). Point p = (px,py,pz) is inside the half-space if hx*px + hy*py + hz*pz > c. A convex region C = {hi} is the set of points inside each of the hi.
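This representation makes point membership tests trivial. A minimal sketch, with illustrative function names; points are assumed to be 3-vectors on the unit sphere:

```python
# Sketch of the half-space membership test described above: a point p
# is inside half-space h = (hx, hy, hz, c) when its dot product with
# the normal (hx, hy, hz) exceeds c, and inside a convex region when
# it satisfies every constraint.

def inside_halfspace(p, h):
    hx, hy, hz, c = h
    px, py, pz = p
    return hx * px + hy * py + hz * pz > c

def inside_convex(p, constraints):
    """A convex region C = {h_i}: p must lie inside every half-space."""
    return all(inside_halfspace(p, h) for h in constraints)
```

For example, the constraint (0, 0, 1, 0.5) describes the polar cap z > 0.5, and adding more constraints carves out smaller convexes.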
Given that representation we can compute the set N of nibbles covering region R = U-C as follows:
Compute R = U - C, where U and C are convex regions (C is the "hole" in U). The idea is:

    R = U - C
      = U & ~(c1 & c2 & ... & cm)
      = U&~c1 | U&~c2 | ... | U&~cm
      = U&~c1 | U&c1&~c2 | ... | U&c1&c2&...&cm-1&~cm

The terms in the last line are called nibbles. They are disjoint (each term contains a unique ~ci together with all the earlier constraints c1..ci-1 taken positively, so no two terms can overlap) and together they cover R and exclude C (each ~ci excludes C).
    R = {}                      -- the disjoint nibbles will be added to R
    NewU = spRegionCopy U       -- make a copy of U so we do not destroy it
    for each c in C             -- for each constraint (arc of the hull)
        Nibble = NewU & {~c}    -- intersect ~c with the current universe
        if Nibble not empty     -- if ~c intersects the universe
            add Nibble to R     --   add this nibble to the answer set
        NewU = NewU & {c}       -- ~c is covered, so reduce the universe

When each positive TilingGeometry is "nibbled" by its masks, the resulting nibbles are the TileBoxes we need.
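The nibbling loop above transcribes almost directly into Python. This is an illustrative sketch, not the actual spCreateTileBoxes implementation: regions are plain lists of constraints, and the emptiness test is left as a pluggable stub, since deciding whether a spherical convex is empty is the application-specific part.

```python
# Sketch of the nibbling loop. Regions are lists of half-space
# constraints (hx, hy, hz, c); the complement ~c flips both the
# normal and the offset. A real implementation would supply a proper
# is_empty test (e.g. a feasibility check); here it defaults to a
# stub that keeps every candidate nibble.

def negate(h):
    hx, hy, hz, c = h
    return (-hx, -hy, -hz, -c)

def nibbles(universe, hole, is_empty=lambda region: False):
    """Return disjoint convexes covering U - C (U minus the hole C)."""
    result = []
    new_u = list(universe)            # copy so U itself is untouched
    for c in hole:
        nibble = new_u + [negate(c)]  # NewU & {~c}
        if not is_empty(nibble):
            result.append(nibble)
        new_u = new_u + [c]           # ~c is now covered; shrink U
    return result
```

With a one-constraint universe (z > 0) and a one-constraint hole (z > 0.5), this produces a single nibble describing the band between the two caps.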
The procedure spCreateTileBoxes creates, for each TilingGeometry, a set of TileBox regions that cover it. That procedure also records in Region2Boxes a mapping TilingGeometry -> TileBox, so that we can tell which TilingGeometry region covers a box.
SkyBoxes are the unification of all TileBoxes into a partitioning of the entire sky. Logically, SkyBoxes are the Boolean combination of all the TileBoxes - somewhat analogous to the relationship between wedges and tiles. A SkyBox may be covered by multiple TilingGeometries (and have corresponding tiling runs); Region2Boxes records this mapping of TilingGeometry -> TileBox. Figure 7 illustrates how SkyBoxes are computed and how the TilingGeometry relationship is maintained.
Figure 7. SkyBoxes are the intersection of TileBoxes. A pair can produce up to 7 SkyBoxes. The green areas are covered by the union of the tiling runs of the two TileBoxes, and the other SkyBoxes are covered by the tiling runs of their one parent box.
spCreateSkyBoxes builds all the SkyBoxes and records the dependencies. It uses the logic of spRegionQuradangleFourOtherBoxes to create the SkyBoxes from the intersections of TileBoxes.
This may be a fine partition, but two adjacent sectors computed in this way might have the same list of covering TilingGeometries and tiles, in which case they should be unified into one sector. So this first wedge-SkyBox partition is called sectorlets. These sectorlets need to be unified into sectors if they have the same covering tiles. This unification gives us a unique answer (recall that Figure 5 showed many different TileBox partitions; this final step eliminates any "fake" partition boundaries introduced by that step).
Sectorlets are computed as follows: given a wedge W and a SkyBox SB, the area is just W ∩ SB. If that area is non-empty, then we need to compute the list of covering TilingGeometries and tiles. The TilingGeometries come from SB. The tiles are a bit more complex: let T be the set of tiles covering W, and discard from T any tile not created by a tiling run covering SB. In mathematical notation: T(sectorlet) = { t ∈ T : tilingRun(t) ∈ tilingRuns(SB) }.
But, a particular tile or set of tiles can create many sectorlets. We want the sector to be all the adjacent sectorlets with the same list of parent tiles (note that sectorlets have positive (covering) and negative (excluded) parents that make up the sector).
Figure 8. This diagram shows some SDSS data and demonstrates the concepts of Tile, Mask, TileBox, TilingGeometry, SkyBox, Wedge, Sectorlet, and Sector.
The routine spSectorCreateSectors unifies all the sectorlets with the same list of parent tiles into one region. This region may not be connected (masks or tiling geometry may break it into pieces, which are then glued back together - see the example of 5 sectorlets creating one sector in Figure 8).
All these routines are driven by the parent spSectorCreate routine.
There are two main strategies employed to avoid these difficulties: the use of clipped means, and the use of rank statistics such as the median.
Photo performs two levels of sky subtraction; when first processing each frame it estimates a global sky level, and then, while searching for and measuring faint objects, it re-estimates the sky level locally (but not individually for every object).
The initial sky estimate is taken from the median value of every pixel in the image (more precisely, every fourth pixel in the image), clipped at 2.32634 sigma. This estimate of sky is corrected for the bias introduced by using a median, and a clipped one at that. The statistical error in this value is then estimated from the values of sky determined separately from the four quadrants of the image.
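A hedged pure-Python sketch of this first step follows. It takes every fourth pixel, iteratively clips at 2.32634 sigma about the median, and returns the clipped median; Photo's bias correction for the clipped median and the quadrant-based error estimate are omitted, and the function name is illustrative.

```python
# Illustrative sketch of the initial global sky estimate: clipped
# median of every fourth pixel. The clip threshold is the 2.32634
# sigma quoted above; the debiasing Photo applies is NOT included.
from statistics import median, pstdev

def initial_sky(pixels, nsigma=2.32634, niter=3):
    pix = list(pixels)[::4]          # every fourth pixel, as described
    for _ in range(niter):
        med = median(pix)
        sig = pstdev(pix)
        if sig == 0:                 # nothing left to clip
            break
        pix = [p for p in pix if abs(p - med) < nsigma * sig]
    return median(pix)
```

A bright object in the frame is rejected by the clipping, so the estimate stays at the background level.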
Using this initial sky estimation, Photo proceeds to find all the bright objects (typically those with more than 60 sigma detections). Among these are any saturated stars present on the frame, and Photo is designed to remove the scattering wings from at least the brighter of these --- this should include the scattering due to the atmosphere, and also that due to scattering within the CCD membrane, which is especially a problem in the i band. In fact, we have chosen not to aggressively subtract the wings of stars, partly because of the difficulty of handling the wings of stars that do not fall on the frame, and partly due to our lack of a robust understanding of the outer parts of the PSF. With the parameters employed, only the very cores of the stars (out to 20 pixels) are ever subtracted, and this has a negligible influence on the data. Information about star-subtraction is recorded in the fpBIN files, in HDU 4.
Once the BRIGHT detections have been processed, Photo proceeds with a more local sky estimate. This is carried out by finding the same clipped median, but now in 256x256 pixel boxes, centered every 128 pixels. These values are again debiased.
This estimate of the sky is then subtracted from the data, using linear interpolation between these values spaced 128 pixels apart; the interpolation is done using a variant of the well-known Bresenham algorithm usually employed to draw lines on pixellated displays.
This sky image, sampled every 128x128 pixels, is written out to the fpBIN file in HDU 2; the estimated uncertainties in the sky (as estimated from the interquartile range and converted to a standard deviation taking due account of clipping) are stored in HDU 3. The value of sky in each band and its error, as interpolated to the center of the object, are written to the fpObjc files along with all other measured quantities.
After all objects have been detected and removed, Photo has the option of re-determining the sky using the same 256x256 pixel boxes; in practice this has not proved to significantly affect the photometry.
Spectro1D fits spectral features at three separate stages during the pipeline. The first two fits are fits to emission lines only. They are done in the process of determining an emission line redshift and these are referred to as foundLines. The final fitting of the complete line list, i.e. both emission and absorption lines, occurs after the object's classification has been made and a redshift has been measured. These fits are known as measuredLines. In all cases a single Gaussian is fitted to a given feature, therefore the quality of the fit is only good where this model holds up.
The first line fit is done when attempting to measure the object's emission line redshift. Wavelet filters are used to locate emission lines in the spectrum. The goal of these filters is to find strong emission features, which will be used as the basis for a more careful search. The lines identified by the wavelet filter are stored in the specLine table as foundLines, i.e., with the parameter category set to 1. They are stored without any identifications, i.e., they have restWave = 0.
Every one of these features is then tentatively matched to each of a list of candidate emission lines as given in the line table below, and a system of lines is searched for at the position indicated by the tentative matching. The best system of emission lines (if any) found in this process is used to calculate the object's emission-line redshift. The lines from this system and their parameters are stored in the specLine table as foundLines, i.e., with the parameter category set to 1. These lines are identified by their restWave as given in the line table below.
The final line fitting is done for all features (both emission and absorption) in the line list below, and occurs after the object has been classified and a redshift has been determined. This allows for a better continuum estimation and thus better line fits. This latter fit is stored in the specLine table with the parameter category set to 2.
Type of fit | category | restWave |
---|---|---|
"Found" emission lines from wavelet filter | 1 | 0 |
"Found" emission lines from best-fit system to wavelet detections | 1 | restWave from line list |
"Measured" emission and absorption lines according to the object's classification and best redshift | 2 | restWave from line list |
For almost all purposes we recommend the use of the measuredLines (category=2) since these result from the most careful continuum measurement and precise line fits.
All of the line parameters are measured in the observed frame, and no correction has been made for the instrumental resolution.
The continuum is fit using a median/mean filter. A sliding window is created of length 300 pixels for galaxies and stars, or 1000 pixels for quasars. Pixels closer than 8 pixels (560 km/s) for galaxies and stars, or 30 pixels (2100 km/s) for QSOs, to any reference line are masked and not used in the continuum measurement. The remaining pixels in the window are ordered, and the values between the 40th and 60th percentiles are averaged to give the continuum. The category=1 lines are fit with a cruder continuum, given by a fifth-order polynomial fit that iteratively rejects outlying points.
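The 40th-60th percentile filter can be sketched at a single pixel as follows. The names and interface are illustrative; the real pipeline slides this window across the whole spectrum and builds the line mask as described above.

```python
# Illustrative sketch of the "median/mean" continuum filter: within a
# window around the target pixel, drop masked pixels, sort the rest,
# and average the values between the 40th and 60th percentiles.

def continuum_at(flux, center, window=300, masked=frozenset()):
    half = window // 2
    lo, hi = max(0, center - half), min(len(flux), center + half)
    vals = sorted(flux[j] for j in range(lo, hi) if j not in masked)
    if not vals:
        return 0.0                       # fully masked window
    i40 = int(0.40 * len(vals))
    i60 = int(0.60 * len(vals))
    band = vals[i40:i60] or [vals[len(vals) // 2]]
    return sum(band) / len(band)
```

Because only the central fifth of the sorted values is averaged, isolated emission or absorption features barely move the estimate.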
The list of lines that are fit is given in the line table below. Note that a single line in the table often actually represents multiple features. Since the line fits are allowed to drift somewhat in wavelength, the exact precision of the line wavelengths is not important for the fitting itself; it does become important for the emission-line redshift determination. To improve the accuracy of the emission-line redshift determination for QSOs, the wavelengths of many of the lines listed here are not the laboratory values but the average values calculated from a sample of SDSS QSOs taken from Vanden Berk et al. (2001, AJ, 122).
Every line in the reference list is fit as a single Gaussian on top of the continuum subtracted spectrum. Lines that are deemed close enough are fitted simultaneously as a blend. The basic line fitting is performed by the SLATEC common mathematical library routine SNLS1E which is based on the Levenberg-Marquardt method. Parameters are constrained to fall within certain values by multiplying the returned chi-squared values by a steep function. Any lines with parameters falling close to these constraints should be treated with caution. The constraints are: sigma > 0.5 Angstrom, sigma < 100 Angstrom, and the center wavelength is allowed to drift by no more than 450 km/sec for stars and galaxies or 1500 km/sec for QSOs, except for the CIV line which is allowed to be shifted by as much as 3000 km/sec.
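The constraint scheme can be illustrated schematically: the chi-squared of the single-Gaussian model is inflated by a steep factor whenever a parameter leaves its allowed range. The crude objective below stands in for what the SLATEC routine minimizes; it is not the pipeline's code, and the penalty factor is an arbitrary illustrative choice.

```python
# Schematic single-Gaussian line model with a steep chi-squared
# penalty outside the sigma constraints quoted above (0.5-100 A).
# A real fit would hand chisq to a Levenberg-Marquardt optimizer.
from math import exp

def gaussian(wave, amp, center, sigma):
    return amp * exp(-0.5 * ((wave - center) / sigma) ** 2)

def chisq(params, waves, flux, sigma_lo=0.5, sigma_hi=100.0):
    amp, center, sigma = params
    chi2 = sum((f - gaussian(w, amp, center, sigma)) ** 2
               for w, f in zip(waves, flux))
    if not (sigma_lo < sigma < sigma_hi):
        chi2 *= 1e6                      # steep out-of-range penalty
    return chi2
```

Lines whose best-fit parameters land near these hard boundaries should be treated with caution, as the text above notes.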
There are a number of ways the line fitting can fail. If the continuum is bad, the line fits will be compromised. The median/mean filtering routine will always fail for white dwarfs and some A stars, as well as for late-type stars. In addition, it has trouble with galaxies that have a strong 4000 Angstrom break. Likewise, the line fitting will have trouble when the lines are not really Gaussian. The Levenberg-Marquardt routine can fall into local minima, which can happen, for example, when there is self-absorption in a QSO line or when a line has both a narrow and a broad component. One should always check the chi-squared values to evaluate the quality of the fit.
restWave | Line |
---|---|
1857.40 | AlIII_1857 |
8500.36 | CaII_8500 |
8544.44 | CaII_8544 |
8664.52 | CaII_8665 |
1335.31 | CII_1335 |
2326.00 | CII_2326 |
1908.73 | CIII_1909 |
1549.48 | CIV_1549 |
4305.61 | G_4306 |
3969.59 | H_3970 |
6564.61 | Ha_6565 |
4862.68 | Hb_4863 |
4102.89 | Hd_4103 |
3971.19 | He_3971 |
3889.00 | HeI_3889 |
1640.40 | HeII_1640 |
4341.68 | Hg_4342 |
3798.98 | Hh_3799 |
3934.78 | K_3935 |
6707.89 | Li_6708 |
1215.67 | Lya_1215 |
5176.70 | Mg_5177 |
2799.12 | MgII_2799 |
5895.60 | Na_5896 |
2439.50 | NeIV_2439 |
3346.79 | NeV_3347 |
3426.85 | NeVI_3427 |
6529.03 | NI_6529 |
6549.86 | NII_6550 |
6585.27 | NII_6585 |
1240.81 | NV_1241 |
1305.53 | OI_1306 |
6302.05 | OI_6302 |
6365.54 | OI_6366 |
3727.09 | OII_3727 |
3729.88 | OII_3730 |
1665.85 | OIII_1666 |
4364.44 | OIII_4364 |
4932.60 | OIII_4933 |
4960.30 | OIII_4960 |
5008.24 | OIII_5008 |
1033.82 | OVI_1033 |
3836.47 | Oy_3836 |
4072.30 | SII_4072 |
6718.29 | SII_6718 |
6732.67 | SII_6733 |
1397.61 | SiIV_1398 |
1399.80 | SiIV_OIV_1400 |
Each SPECTRO object points to a BEST photo object if there is one near the spectro (ra,dec), and to a TARGET object id if there is a nearby one.
We chose 1 arcsec as the "nearby radius" since that approximates the fiber radius.
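The "nearby" test amounts to an angular-separation cut. A sketch using the haversine formula, with illustrative function names, taking (ra, dec) in degrees:

```python
# Illustrative angular-separation test for the 1 arcsec match radius.
# Positions are (ra, dec) in degrees; the haversine formula is
# numerically stable for the small separations involved here.
from math import radians, degrees, sin, cos, asin, sqrt

def angular_sep_arcsec(ra1, dec1, ra2, dec2):
    ra1, dec1, ra2, dec2 = map(radians, (ra1, dec1, ra2, dec2))
    h = (sin((dec2 - dec1) / 2) ** 2
         + cos(dec1) * cos(dec2) * sin((ra2 - ra1) / 2) ** 2)
    return degrees(2 * asin(sqrt(h))) * 3600.0

def is_nearby(ra1, dec1, ra2, dec2, radius_arcsec=1.0):
    return angular_sep_arcsec(ra1, dec1, ra2, dec2) <= radius_arcsec
```

Note that this positional match applies only to the BEST association; as described below, the TARGET association is done purely through ID numbers.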
This is complicated by the fact that
To resolve these ambiguities, we defined two views:
There is at most one "primary" object at any spot in the sky.
So, the logic is as follows:
TargetInfo.targetObjID is set while loading the data for a chunk into TARGET. The only difference between a targetID and targetObjID is the possible flip of one bit. This bit distinguishes between identical PhotoObjAll objects that are in fields that straddle 2 chunks. Only one of the pair will actually be within the chunk boundaries, so we want to make sure we match to that one. Note that the one of the pair that is actually part of a chunk might not be primary.
So, setting SpecObjAll.targetObjID does not use a positional match - it is all done through ID numbers. This match should always exist, so SpecObjAll.targetObjID always points to something in TARGET.PhotoObjAll. However, it is not guaranteed that SpecObjAll.targetObjID will match something in TARGET.PhotoObj, because in the past we have targeted non-primaries (stripe 10, for example). To make this slightly less confusing, we require anything in SpecObj to have been targeted from something in TARGET.PhotoObj (i.e., primary spectra must have been primary targets).
SpecObjAll objects with targetObjID = 0 are usually fibers that were not mapped, so we didn't have any way to match them to the imaging (for either TARGET or BEST since we don't have an ID or position).
SpecObjAll.bestObjID is set as described above. To be slightly more detailed about the case where there is no BEST.PhotoObj within 1", we go through the modes (primary,secondary,family) in order looking for the nearest BEST.PhotoObjAll within 1".
BEST.PhotoObjAll.specObjID only points to things in SpecObj (ie SpecObjAll.sciencePrimary=1) because the mapping to non-sciencePrimary SpecObjAlls is not unique. You can still do BEST.PhotoObjAll.objID = SpecObjAll.bestObjID to get all the matches.
Ambiguities are not flagged. There are no ambiguities if you start from PhotoObj and go to SpecObj. It might be possible for more than one SpecObj to point to the same PhotoObj, but there are no examples of this unless it is a pathological case. It is possible for a SpecObj to point to something in PhotoObjAll that is not in PhotoObj, but if you are joining with PhotoObj you won't see these. If you start joining PhotoObjAll and SpecObjAll you need to be quite careful because the mapping is (necessarily) complicated.
Because the SDSS spectra are obtained through 3-arcsecond fibers during non-photometric observing conditions, special techniques must be employed to spectrophotometrically calibrate the data. There have been three substantial improvements to the algorithms that photometrically calibrate the spectra:
On each spectroscopic plate, 16 objects are targeted as spectroscopic standards. These objects are color-selected to be F8 subdwarfs, similar in spectral type to the SDSS primary standard BD+17 4708.
The color selection of the SDSS standard stars. Red points represent stars selected as spectroscopic standards. (Most are flux standards; the very blue stars in the right hand plot are "hot standards" used for telluric absorption correction.)
The flux calibration of the spectra is handled by the Spectro2d pipeline. It is performed separately for each of the 2 spectrographs, hence each half-plate has its own calibration. In the EDR and DR1 Spectro2d calibration pipelines, fluxing was achieved by assuming that the mean spectrum of the stars on each half-plate was equivalent to a synthetic composite F8 subdwarf spectrum from Pickles (1998). In the reductions included in DR2, the spectrum of each standard star is spectrally typed by comparing with a grid of theoretical spectra generated from Kurucz model atmospheres (Kurucz 1992) using the spectral synthesis code SPECTRUM (Gray & Corbally 1994; Gray, Graham, & Hoyt 2001). The flux calibration vector is derived from the average ratio of each star (after correcting for Galactic reddening) and its best-fit model. Since the red and blue halves of the spectra are imaged onto separate CCDs, separate red and blue flux calibration vectors are produced. These will resemble the throughput curves under photometric conditions. Finally, the red and blue halves of each spectrum on each exposure are multiplied by the appropriate flux calibration vector. The spectra are then combined with bad pixel rejection and rebinned to a constant dispersion.
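The per-star calibration step reduces to an average of star/model ratios per wavelength bin. A toy sketch with illustrative names; the real pipeline works on resampled, dereddened spectra with masking and smoothing:

```python
# Illustrative sketch: derive a flux-calibration vector as the mean
# ratio of each standard-star spectrum to its best-fit model, pixel
# by pixel. Inputs are lists of equal-length spectra (flux per pixel).

def calibration_vector(star_spectra, model_spectra):
    npix = len(star_spectra[0])
    nstars = len(star_spectra)
    return [sum(star[i] / model[i]
                for star, model in zip(star_spectra, model_spectra)) / nstars
            for i in range(npix)]
```

Dividing each science spectrum by this vector (done separately for the red and blue CCDs) then puts it on the model flux scale.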
Throughput curves for the red and blue channels on the two SDSS spectrographs.
The EDR and DR1 data were nominally corrected for Galactic extinction. The spectrophotometry in DR2 is vastly improved compared to DR1, but the final calibrated DR2 spectra are not corrected for foreground Galactic reddening (a relatively small effect; the median E(B-V) over the survey is 0.034). This may change in future data releases. Users of spectra should note, though, that the fractional improvement in spectrophotometry is much greater than the extinction correction itself.
The second update in the pipeline is relatively minor: We now compute the absolute calibration by tying the r-band fluxes of the standard star spectra to the fiber magnitudes output by the latest version of the photometric pipeline. The latest version now corrects fiber magnitudes to a constant seeing of 2", and includes the contribution of flux from overlapping objects in the fiber aperture; these changes greatly improve the overall data consistency.
The third update to the spectroscopic pipeline is that we no longer use the "smear" observations in our calibration. As the EDR paper describes, "smear" observations are low signal-to-noise ratio (S/N) spectroscopic exposures made through an effective 5.5" by 9" aperture, aligned with the parallactic angle. Smears were designed to account for object light excluded from the 3" fiber due to seeing, atmospheric refraction and object extent. However, extensive experiments comparing photometry and spectrophotometry calibrated with and without smear observations have shown that the smear correction provides improvements only for point sources (stars and quasars) with very high S/N. For extended sources (galaxies) the spectrum obtained in the 3" fiber aperture is calibrated to have the total flux and spectral shape of the light in the smear aperture. This is undesirable, for example, if the fiber samples the bulge of a galaxy, but the smear aperture includes much of its disk: For extended sources, the effect of the smears was to give a systematic offset between spectroscopic and fiber magnitudes of up to a magnitude; with the DR2 reductions, this trend is gone. Finally, smear exposures were not carried out for one reason or another for roughly 1/3 of the plates in DR2. For this reason, we do not apply the smear correction to the data in DR2.
To the extent that all point sources are centered in the fibers in the same way as are the standards, our flux calibration scheme corrects the spectra for losses due to atmospheric refraction without the use of smears. Extended sources are likely to be slightly over-corrected for atmospheric refraction. However, most galaxies are quite centrally concentrated and more closely resemble point sources than uniform extended sources. In the mean, this overcorrection makes the g-r color of the galaxy spectra too red by ~1%.
Spectra for over 250,000 Galactic stars of all common spectral types are available with DR6. These spectra were processed with a pipeline called the 'Spectro Parameter Pipeline' (spp), which computes line indices for a wide range of common features at the radial velocity of the star in question. These outputs are stored in the CAS in a table called sppLines, indexed on the 'specObjID' key for queries joining to other tables such as specObjAll and photoObjAll. The fields available in the sppLines table are:
The star is identified by these fields:

specobjid int64 unique spectrum id, f(plate,mjd,fiberid), bigint (8 bytes)
plate int
mjd int
fiberid int

Then for each star, for each index below, the 'side band' EqW (A) and the 'continuum' EqW (A) are tabulated, along with an error (A) and a mask value (0=bad, 1=ok). The name of each index and the central wavelengths and widths of the on and reference bands are recorded here (in Angstroms):

Index_name Lambda0 Width Upper_refband Width Lower_refband Width
----------------------------------------------------------------
H8w3 3889.0 3.0 3912.0 8.0 3866.0 8.0
H8w12 3889.1 12.0 4010.0 20.0 3862.0 20.0
H8w24 3889.1 24.0 4010.0 20.0 3862.0 20.0
H8w48 3889.1 48.0 4010.0 20.0 3862.0 20.0
KPw12 3933.7 12.0 4010.0 20.0 3913.0 20.0
KPw18 3933.7 18.0 4010.0 20.0 3913.0 20.0
KPw6 3933.7 6.0 4010.0 20.0 3913.0 20.0
CaIIK 3933.6 30.0 4010.0 5.0 3910.0 5.0
CaIIHKp 3962.0 75.0 4010.0 5.0 3910.0 5.0
Heps 3970.0 50.0 4010.0 5.0 3910.0 5.0
KPw16 3933.7 16.0 4018.0 20.0 3913.0 10.0
Sr 4077.0 8.0 4090.0 6.0 4070.0 4.0
HeI 4026.2 12.0 4154.0 20.0 4010.0 20.0
Hdeltaw12 4101.8 12.0 4154.0 20.0 4010.0 20.0
Hdeltaw24 4101.8 24.0 4154.0 20.0 4010.0 20.0
Hdeltaw48 4101.8 48.0 4154.0 20.0 4010.0 20.0
Hdelta 4102.0 64.0 4154.0 20.0 4010.0 20.0
CaI 4226.0 4.0 4232.0 4.0 4211.0 6.0
CaIw12 4226.7 12.0 4257.0 20.0 4154.0 20.0
CaIw24 4226.7 24.0 4257.0 20.0 4154.0 20.0
CaIw6 4226.7 6.0 4257.0 20.0 4154.0 20.0
G 4305.0 15.0 4367.0 10.0 4257.0 20.0
Hgammaw12 4340.5 12.0 4425.0 20.0 4257.0 20.0
Hgammaw24 4340.5 24.0 4425.0 20.0 4257.0 20.0
Hgammaw48 4340.5 48.0 4425.0 20.0 4257.0 20.0
Hgamma 4340.5 54.0 4425.0 20.0 4257.0 20.0
HeIa 4471.7 12.0 4500.0 20.0 4425.0 20.0
G_blue 4305.0 26.0 4507.0 14.0 4090.0 12.0
G_whole 4321.0 28.0 4507.0 14.0 4096.0 12.0
Ba 4554.0 6.0 4560.0 4.0 4538.0 4.0
12C13C 4737.0 36.0 4770.0 20.0 4423.0 10.0
CC12 4618.0 256.0 4780.0 5.0 4460.0 10.0
metal1 4584.0 442.0 4805.8 5.0 4363.0 5.0
Hbetaw12 4862.3 12.0 4905.0 20.0 4790.0 20.0
Hbetaw24 4862.3 24.0 4905.0 20.0 4790.0 20.0
Hbetaw48 4862.3 48.0 4905.0 20.0 4790.0 20.0
Hbeta 4862.3 60.0 4905.0 20.0 4790.0 20.0
C2 5052.0 204.0 5230.0 20.0 4935.0 10.0
C2+MgI 5069.0 238.0 5230.0 20.0 4935.0 10.0
MgH+MgI+C2 5085.0 270.0 5230.0 20.0 4935.0 10.0
MgH+MgI 5198.0 44.0 5230.0 20.0 4935.0 10.0
MgH 5210.0 20.0 5230.0 20.0 4935.0 10.0
CrI 5206.0 12.0 5239.0 8.0 5197.5 5.0
MgI+FeII 5175.0 20.0 5240.0 10.0 4915.0 10.0
MgI 5183.0 2.0 5240.0 10.0 4915.0 10.0
MgIa 5170.5 12.0 5285.0 20.0 5110.0 20.0
MgIb 5176.5 24.0 5285.0 20.0 5110.0 20.0
MgIc 5183.5 12.0 5285.0 20.0 5110.0 20.0
NaI 5890.0 20.0 5918.0 6.0 5865.0 10.0
Naw12 5892.9 12.0 5970.0 20.0 5852.0 20.0
Naw24 5892.9 24.0 5970.0 20.0 5852.0 20.0
Halphaw12 6562.8 12.0 6725.0 50.0 6425.0 50.0
Halphaw24 6562.8 24.0 6725.0 50.0 6425.0 50.0
Halphaw48 6562.8 48.0 6725.0 50.0 6425.0 50.0
Halphaw70 6562.8 70.0 6725.0 50.0 6425.0 50.0
CaH 6788.0 505.0 7434.0 10.0 6532.0 5.0
TiO 7209.0 333.3 7434.0 10.0 6532.0 5.0
CN 6890.0 26.0 7795.0 10.0 6870.0 10.0
OItrip 7775.0 30.0 7805.0 10.0 7728.0 10.0
KI 7687.0 34.0 8080.0 10.0 7510.0 10.0
KIa 7688.0 95.0 8132.0 5.0 7492.0 5.0
NaIa 8187.5 15.0 8190.0 55.0 8150.0 10.0
NaIred 8190.2 33.0 8248.6 5.0 8140.0 5.0
CaIIw26 8498.0 26.0 8520.0 10.0 8467.5 25.0
Paschenw13 8467.5 13.0 8570.0 14.0 8457.0 10.0
CaII 8498.5 29.0 8570.0 14.0 8479.0 10.0
CaIIw40 8542.0 40.0 8570.0 14.0 8479.0 10.0
CaIIa 8542.0 16.0 8600.0 60.0 8520.0 20.0
Paschenw42 8598.0 42.0 8630.5 23.0 8570.0 14.0
CaIIb 8662.1 16.0 8694.0 12.0 8600.0 60.0
CaIIaw40 8662.0 40.0 8712.5 25.0 8630.5 23.0
Paschenaw42 8751.0 42.0 8784.0 16.0 8712.5 25.0
TiO5 7134.4 5.0 7134.4 5.0 7045.9 4.0
TiO8 8457.3 30.0 8457.3 30.0 8412.3 20.0
CaH1 6386.7 10.0 6415.0 10.0 6351.7 10.0
CaH2 6831.9 32.0 7045.9 4.0 7045.9 4.0
CaH3 6976.9 30.0 7045.9 4.0 7045.9 4.0
------------------------------------------------------------------

To get a feel for the columns, try sending this query:

select top 10 * from sppLines
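Each entry in the table above defines an on band and two continuum reference bands. As an illustrative sketch only (not the actual SSPP implementation), an equivalent-width-style index can be computed by estimating the continuum from the two sidebands and integrating the fractional absorption over the on band; the function and toy spectrum below are hypothetical:

```python
import numpy as np

def line_index(wave, flux, lam0, width, up_c, up_w, lo_c, lo_w):
    """Equivalent width (Angstroms) of the on band [lam0 +/- width/2], with the
    continuum estimated by linear interpolation between the mean fluxes in the
    lower and upper reference bands. Illustrative sketch only."""
    def band_mean(center, w):
        sel = (wave > center - w / 2) & (wave < center + w / 2)
        return wave[sel].mean(), flux[sel].mean()
    x1, y1 = band_mean(lo_c, lo_w)
    x2, y2 = band_mean(up_c, up_w)
    on = (wave > lam0 - width / 2) & (wave < lam0 + width / 2)
    w_on = wave[on]
    cont = y1 + (y2 - y1) * (w_on - x1) / (x2 - x1)  # linear continuum
    depth = 1.0 - flux[on] / cont                    # fractional absorption
    # trapezoidal integration; positive EW means net absorption
    return float(np.sum(0.5 * (depth[1:] + depth[:-1]) * np.diff(w_on)))

# Toy spectrum: flat continuum with a Gaussian absorption line at the CaIIK
# position; band parameters taken from the CaIIK row of the table above.
wave = np.arange(3800.0, 4100.0, 0.5)
flux = 1.0 - 0.8 * np.exp(-0.5 * ((wave - 3933.6) / 2.0) ** 2)
ew = line_index(wave, flux, 3933.6, 30.0, 4010.0, 5.0, 3910.0, 5.0)
```

For this toy line (depth 0.8, sigma 2 A) the result is close to the analytic Gaussian equivalent width of about 4 A.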
Here's a sample query that selects bright (g0 < 17) red stars (spectral type K) with large CaII triplet equivalent widths and low errors on those widths:

select sp.plate, sp.mjd, sp.fiberid, sp.g0, sp.gmr0,
       sl.caII, sl.caIIerr, sl.caIIa, sl.caIIaerr,
       sl.caIIb, sl.caIIberr,
       sl.caII + sl.caIIa + sl.caIIb as caIItripsum,
       sp.feha, sp.fehaerr, sp.fehan, sp.logga, sp.loggaerr, sp.loggan
from sppLines as sl, sppParams as sp
where sl.caII + sl.caIIa + sl.caIIb > 12
  and sl.caIImask = 1 and sl.caIIamask = 1 and sl.caIIbmask = 1
  and sl.caIIerr < 2 and sl.caIIaerr < 2 and sl.caIIberr < 2
  and sl.caII > 0 and sl.caIIa > 0 and sl.caIIb > 0
  and sp.gmr0 between 0.8 and 1.1 and sp.g0 < 17
  and sl.specobjid = sp.specobjid

(returns 4 stars)
Spectra for over 250,000 Galactic stars of all common spectral types are available with DR6. These spectra were processed with a pipeline called the 'Spectro Parameter Pipeline' (spp), which computes standard stellar atmospheric parameters such as [Fe/H], log g, and Teff for each star by a variety of methods. These outputs are stored in the CAS in a table called sppParams, indexed on the 'specObjID' key for queries joining to other tables such as specObjAll and photoObjAll. The fields available in the sppParams table are:
n fieldName type/unit Description
0 specobjid int64 64-bit unique spectrum id, f(plate,mjd,fiberid)
1 plate int plate number
2 mjd int Modified Julian Date of (last) observation for this plate/mjd combination
3 fiberid int Fiber number (1-640); 1-320 on spectrograph#1, 321-640 on spectrograph#2
4 fehspectype char[4] SEGUE or SDSS target type abbreviations (type as targeted from photo):
    AGB Asymptotic Giant Branch candidate star
    BHB Blue Horizontal Branch candidate star
    CVR Cool White Dwarf candidate star
    FTO F turnoff candidate star
    GAL SDSS main survey or LRG GALAXY candidate
    GD G Dwarf candidate star
    HOT HOT standard star (main SDSS survey, g-r < 0)
    KD K Dwarf candidate star
    KG K Giant candidate star
    LOW Low metallicity candidate star
    MD M Dwarf candidate star
    PHO Photometric Standard star (usually brighter F dwarf)
    QA Quality Assurance target (in SEGUE, duplicates another fiber)
    QSO SDSS main survey QSO candidate
    RED Reddening Standard star (usually fainter F dwarf)
    ROS may be F/G dwarf or MS/WD or low-latitude target
    SER Serendip. Manual (1. globular/open cluster 2. high proper motion)
    STA Main SDSS survey STAR (not GAL or QSO or Standard)
    SKY Sky fiber, should have no object flux
    WD White Dwarf candidate fiber
5 sptypea char[4] Stellar type classification A: F0-F9, G0-G9, K0-K5, K7, M0-M9, L0-L8, NA
6 hammersptype char[5] Stellar type classification B: 00, AGN, QSO, O, OB, A0, A0p, B6, B9, Car, CAR, CV, CWD, F2, F5, F9, G0, G2, G5, K1, K3, K5, K7, M0V, M1-M8, L0, L1-L5, L9, QSO, SBT, SFM, SKY, STA, T2, WD, WDm
7 flag char[5] SSPP flags: 4 flags for each spectrum as a 4-char string 'abcd'; xxxx = this is not a star, it's a Galaxy/QSO from SDSS main; n = all is normal for this flag position (one of four). By position:
    1. d, D, E, h, H, l, n, N, S, V -- d: possible White Dwarf; D: apparent White Dwarf; E: emission, possible QSO; h: helium line detected; H: apparent Teff too hot for parameter estimation; l: sodium line, possibly late type; N: noise spectrum; S: sky fiber, no signal; V: radial velocity mismatch
    2. C Color mismatch (spectroscopic-color Teff far from the Teff implied by the (g-r)_0 color)
    3. B Balmer flag: unusual Balmer equivalent widths for a star
    4. g G band: unusual G-band equivalent width for a star; carbon star?
8 feha float/dex adopted [Fe/H] value: if abs(fehr-fehw) < 0.15, feha=fehw; if not, feha=(fehr+fehw)/2.0. If not available, set to -9.999
9 fehan int number of estimators; if feha=(fehr+fehw)/2.0, =1; if feha=fehw, =fehwn
10 fehaerr float/dex error/sigma on the [Fe/H] value. If not available, set to -9.999
11 fehw float/dex bi-weight [Fe/H] value from all available estimates of [Fe/H]. If not available, set to -9.999
12 fehwn int number of estimators for bi-weight averaging
13 fehwerr float/dex error/sigma on the [Fe/H] value. If not available, set to -9.999
14 fehr float/dex refined [Fe/H] from assorted estimates of [Fe/H] by comparing synthetic spectra. If not available, set to -9.999
15 fehrn int number of estimators for calculating refined [Fe/H]
16 fehrerr float/dex error/sigma on the [Fe/H] value. If not available, set to -9.999
17 feh1 float/dex YoungSun's Kurucz grid 2 (NGS2 grid). If not available, set to -9.999
18 feh1n int indicator variable, =1 ok, =0 not used
19 feh1err float/dex error/sigma on the [Fe/H] value. If not available, set to -9.999
20 feh2 float/dex YoungSun's Kurucz grid 1 (NGS1 grid). If not available, set to -9.999
21 feh2n int indicator variable, =1 ok, =0 not used
22 feh2err float/dex error/sigma on the [Fe/H] value. If not available, set to -9.999
23 feh3 float/dex [Fe/H] from neural network method. If not available, set to -9.999
24 feh3n int indicator variable, =1 ok, =0 not used
25 feh3err float/dex error/sigma on the [Fe/H] value. If not available, set to -9.999
26 feh4 float/dex [Fe/H] from Ca II K method. If not available, set to -9.999
27 feh4n int indicator variable, =1 ok, =0 not used
28 feh4err float/dex error/sigma on the [Fe/H] value. If not available, set to -9.999
29 feh5 float/dex [Fe/H] from auto-correlation method. If not available, set to -9.999
30 feh5n int indicator variable, =1 ok, =0 not used
31 feh5err float/dex error/sigma on the [Fe/H] value. If not available, set to -9.999
32 feh6 float/dex [Fe/H] from Ca II triplet method. If not available, set to -9.999
33 feh6n int indicator variable, =1 ok, =0 not used
34 feh6err float/dex error/sigma on the [Fe/H] value. If not available, set to -9.999
35 feh7 float/dex [Fe/H] from Wilhelm, Beers, and Gray (1999) method. If not available, set to -9.999
36 feh7n int indicator variable, =1 ok, =0 not used
37 feh7err float/dex error/sigma on the [Fe/H] value. If not available, set to -9.999
38 feh8 float/dex [Fe/H] from Carlos k24 grid. If not available, set to -9.999
39 feh8n int indicator variable, =1 ok, =0 not used
40 feh8err float/dex error/sigma on the [Fe/H] value. If not available, set to -9.999
41 feh9 float/dex [Fe/H] from Carlos ki13 grid. If not available, set to -9.999
42 feh9n int indicator variable, =1 ok, =0 not used
43 feh9err float/dex error/sigma on the [Fe/H] value. If not available, set to -9.999
44 teffa float/K Adopted Teff: weighted mean of all available methods teff1-teff10 below. If not available, set to -9999
45 teffan int number of methods averaged
46 teffaerr float/K error/sigma on the Teff. If not available, set to -9999
47 teff1 float/K Teff from H alpha line. If not available, set to -9999
48 teff1n int indicator variable, =1 ok, =0 not used
49 teff1err float/K error/sigma on the Teff. If not available, set to -9999
50 teff2 float/K Teff from H delta line index. If not available, set to -9999
51 teff2n int indicator variable, =1 ok, =0 not used
52 teff2err float/K error/sigma on the Teff. If not available, set to -9999
53 teff3 float/K Teff from Kurucz models, based on colors. If not available, set to -9999
54 teff3n int indicator variable, =1 ok, =0 not used
55 teff3err float/K error/sigma on the Teff. If not available, set to -9999
56 teff4 float/K Teff from Girardi isochrones (2004). If not available, set to -9999
57 teff4n int indicator variable, =1 ok, =0 not used
58 teff4err float/K error/sigma on the Teff. If not available, set to -9999
59 teff5 float/K Teff from Ivezic et al. (2007), based on colors. If not available, set to -9999
60 teff5n int indicator variable, =1 ok, =0 not used
61 teff5err float/K error/sigma on the Teff. If not available, set to -9999
62 teff6 float/K Teff from YoungSun's Kurucz grid 1 (NGS1 grid). If not available, set to -9999
63 teff6n int indicator variable, =1 ok, =0 not used
64 teff6err float/K error/sigma on the Teff. If not available, set to -9999
65 teff7 float/K Teff from neural network method. If not available, set to -9999
66 teff7n int indicator variable, =1 ok, =0 not used
67 teff7err float/K error/sigma on the Teff. If not available, set to -9999
68 teff8 float/K Teff from Wilhelm, Beers, and Gray (1999) method. If not available, set to -9999
69 teff8n int indicator variable, =1 ok, =0 not used
70 teff8err float/K error/sigma on the Teff. If not available, set to -9999
71 teff9 float/K Teff from Carlos k24 grid. If not available, set to -9999
72 teff9n int indicator variable, =1 ok, =0 not used
73 teff9err float/K error/sigma on the Teff. If not available, set to -9999
74 teff10 float/K Teff from Carlos ki13 grid. If not available, set to -9999
75 teff10n int indicator variable, =1 ok, =0 not used
76 teff10err float/K error/sigma on the Teff. If not available, set to -9999
77 logga float/dex Adopted log g: weighted mean of all available methods logg1-logg8 below. If not available, set to -9.999
78 loggan int number of methods used
79 loggaerr float/dex error/sigma in the log g. If not available, set to -9.999
80 logg1 float/dex log g from YoungSun's Kurucz grid 2 (NGS2). If not available, set to -9.999
81 logg1n int indicator variable, =1 ok, =0 not used
82 logg1err float/dex error/sigma in the log g. If not available, set to -9.999
83 logg2 float/dex log g from YoungSun's Kurucz grid 1 (NGS1). If not available, set to -9.999
84 logg2n int indicator variable, =1 ok, =0 not used
85 logg2err float/dex error/sigma in the log g. If not available, set to -9.999
86 logg3 float/dex log g from neural network method. If not available, set to -9.999
87 logg3n int indicator variable, =1 ok, =0 not used
88 logg3err float/dex error/sigma in the log g. If not available, set to -9.999
89 logg4 float/dex log g from Ca I (4227 A) line index. If not available, set to -9.999
90 logg4n int indicator variable, =1 ok, =0 not used
91 logg4err float/dex error/sigma in the log g. If not available, set to -9.999
92 logg5 float/dex log g from MgH features (around 5170 A). If not available, set to -9.999
93 logg5n int indicator variable, =1 ok, =0 not used
94 logg5err float/dex error/sigma in the log g. If not available, set to -9.999
95 logg6 float/dex log g from Wilhelm, Beers, and Gray (1999) method. If not available, set to -9.999
96 logg6n int indicator variable, =1 ok, =0 not used
97 logg6err float/dex error/sigma in the log g. If not available, set to -9.999
98 logg7 float/dex log g from Carlos k24 grid. If not available, set to -9.999
99 logg7n int indicator variable, =1 ok, =0 not used
100 logg7err float/dex error/sigma in the log g. If not available, set to -9.999
101 logg8 float/dex log g from Carlos ki13 grid. If not available, set to -9.999
102 logg8n int indicator variable, =1 ok, =0 not used
103 logg8err float/dex error/sigma in the log g. If not available, set to -9.999
104 alphafe float/dex [alpha/Fe] from YoungSun's Kurucz grid 2 (NGS2). If not available, set to -9.999 (experimental)
105 alphafen int indicator variable, =1 ok, =0 not used
106 alphafeerr float/dex error/sigma in [alpha/Fe]. If not available, set to -9.999
107 distV float/kpc distance to star if star is a dwarf (luminosity class V)
108 distTO float/kpc distance to star if star is a turnoff star (class IV)
109 distIII float/kpc distance to star if star is a giant (luminosity class III)
110 distAGB float/kpc distance to star if star is on the asymptotic giant branch
111 distHB float/kpc distance to star if star is on the horizontal branch
112 distCAP float/kpc distance to star from Carlos model fits
113 unused1 float unused
114 distZ float/kpc distance to star in the Z direction above the Galactic plane
115 KPindex float/Angstroms Ca II K line index
116 GPindex float/Angstroms G band line index
117 unused2 float unused
118 RVflag char[32] RV flag; if ok, it says OK
119 rva float/km/s Adopted RV (usually ELODIE) with a +7.3 km/s offset added to fit known clusters. THIS IS THE radial velocity to use in nearly all circumstances for stars
120 rvaerr float/km/s error in RV
121 calcrv float/km/s sspp calculated RV; c=3e5 used throughout (i.e. not 299792.)
122 calcrverr float/km/s error in calcrv
123 bsrv float/km/s SpecBS RV, based on templates taken from actual SDSS spectra, zeropointed
124 bsrverr float/km/s SpecBS RV error
125 elodierv float/km/s ELODIE template match RV (best estimate if err not 0, > 20)
126 elodierverr float/km/s ELODIE RV error
127 unused3 float unused
128 unused4 float unused
129 velgal float/km/s Galactocentric radial velocity
130 g0 float/mag Dereddened psf magnitude in the g band; SFD(98) used to subtract E(B-V)*3.793
131 V0 float/mag Derived V magnitude, determined from g0 - 0.561(g-r)_0 - 0.004
132 gmr0 float/mag Dereddened psf color (g-r)_0
133 grhbaa float/mag Adopted predicted (g-r)_0 from grbha, grbhd, grhp
134 grbha float/mag predicted (g-r)_0 from H alpha line index
135 grbhd float/mag predicted (g-r)_0 from H delta line index
136 grhp float/mag predicted (g-r)_0 from half power point
137 BmV0 float/mag (B-V)_0 derived from 0.916(g-r)+0.187
138 BmVBalmer0 float/mag (B-V)_0 derived from H Balmer lines
139 umg0 float/mag Dereddened psf color (u-g)_0
140 rmi0 float/mag Dereddened psf color (r-i)_0
141 imz0 float/mag Dereddened psf color (i-z)_0
142 uerr float/mag error in psf magnitude, u band
143 gerr float/mag error in g
144 rerr float/mag error in r
145 ierr float/mag error in i
146 zerr float/mag error in z
147 ebv float E(B-V) at this (l,b) from SFD98
148 sna float average signal-to-noise ratio per pixel over 3950-6000 A
149 qualitycheck int quality check integer
150 ra double/deg RA (J2000)
151 dec double/deg Dec (J2000)
152 l double/deg Galactic longitude
153 b double/deg Galactic latitude
154 chiHK float chi-square/dof for the best fit to the spectrum in the region of Ca H+K
155 chiGband float chi-square/dof for the best-fit spectrum in the region of the G band
156 chiMg float chi-square/dof for the best-fit spectrum in the region of the Mg triplet
157 pmepoch double/years proper motion epoch
158 pmmatch int proper motion match to USNO-B catalog (1 = match, 0 = no match)
159 pmdelta float/arcsec distance the star has moved
160 pml float/mas/yr proper motion in the Galactic l direction
161 pmb float/mas/yr proper motion in the Galactic b direction
162 pmra float/mas/yr proper motion in ra
163 pmdec float/mas/yr proper motion in dec
164 pmraerr float/mas/yr error in pmra
165 pmdecerr float/mas/yr error in pmdec
166 pmsigra float sigma in ra proper motion
167 pmsigdec float sigma in dec proper motion
168 pmnfit int number of fits to the proper motion
169 usnomag_u float/mag USNO-B mag of the star in the derived u band
170 usnomag_g float/mag USNO-B mag of the star in the derived g band
171 usnomag_r float/mag USNO-B mag of the star in the derived r band
172 usnomag_i float/mag USNO-B mag of the star in the derived i band
173 usnomag_z float/mag USNO-B mag of the star in the derived z band
174 pmdist20 float/mag distance to nearest star brighter than 20th
175 pmdist22 float/mag distance to nearest star brighter than 22nd
176 brun int run number (best reduction of sky) from photoobjall table
177 brerun int rerun number
178 bcamcol int camcol (1-6 on the 2.5m mosaic camera)
179 bfield int field number
180 bobj int object number within field
181 zbsubclass char[32] SpecBS subclass
182 zbelodiesptype char[32] SpecBS ELODIE best-match template type
183 lpar float/mag Newberg-Lenz "l" color, perpendicular to the stellar locus, sensitive to metallicity
184 spar float/mag Ivezic (modified) s color, principal component
185 wpar float/mag Ivezic (modified) w color, principal component
186 p1spar float/mag Ivezic perpendicular-to-s color, component
187 zbclass char[32] SpecBS class
188 zbrchi2 float SpecBS reduced chi-square per dof
189 zbdof float SpecBS dof measure for best-fit template
190 zbvdisp float/km/s SpecBS velocity dispersion (only relevant for galaxies, not for stars)
191 zbvdisperr float/km/s SpecBS error in velocity dispersion
192 zbzwarning int SpecBS warning flag
193 spec_cln char[32] Spectro1d (Chicago) classification: 1 = star, 6 = red star, 0 = unknown
194 sprv float/km/s Spectro1d (Chicago) pipeline RV (also valid for galaxies, computed as cz)
195 sprverr float/km/s Spectro1d (Chicago) pipeline RV error
196 vel_dis float/km/s Spectro1d (Chicago) velocity dispersion (only relevant for galaxies, not for stars)
197 vel_disperr float/km/s Spectro1d (Chicago) velocity dispersion error
198 spz_conf float Spectro1d (Chicago) redshift confidence
199 spz_status int Spectro1d (Chicago) redshift status
200 spz_warning int Spectro1d (Chicago) redshift warning
201 eclass float Spectro1d (Chicago) galaxy principal component classification (Andy Connolly)
202 ecoeff1 float Spectro1d (Chicago) principal component 1 (for galaxies)
203 ecoeff2 float Spectro1d (Chicago) principal component 2 (for galaxies)
204 ecoeff3 float Spectro1d (Chicago) principal component 3 (for galaxies)
205 ecoeff4 float Spectro1d (Chicago) principal component 4
206 ecoeff5 float Spectro1d (Chicago) principal component 5
207 inspect int inspection flag (0 = sspp not manually inspected, yet)
-------------------------------------------------------------------------------------------------

To see all the columns, enter this query in the DR6QA context in CasJobs:

select top 10 * from sppParams

Here's a sample query that selects low-metallicity stars ([Fe/H] < -3.5) that have relatively small error bars on the abundance (err < 0.5 dex), are brighter than 19th mag (dereddened), and for which more than three different measures of [Fe/H] are ok and averaged:

select sp.plate, sp.mjd, sp.fiberid, sp.g0, sp.gmr0,
       sl.caIIK, sl.caIIKerr, sl.caIIKmask,
       sp.feha, sp.fehaerr, sp.fehan, sp.logga, sp.loggaerr, sp.loggan
from sppLines as sl, sppParams as sp
where sp.feha < -3.5
  and sp.fehaerr between 0.01 and 0.5
  and sp.fehan > 3
  and sp.g0 < 19
  and sl.specobjid = sp.specobjid

(5 stars returned)
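The derived-magnitude fields V0 and BmV0 above are simple linear transformations of the dereddened SDSS photometry. A minimal sketch of those two transformations, using the formulas quoted in the field descriptions (the function names are ours, not part of the pipeline):

```python
def v0_from_g(g0, gmr0):
    """V0 = g0 - 0.561*(g-r)_0 - 0.004, per the sppParams V0 field description."""
    return g0 - 0.561 * gmr0 - 0.004

def bmv0_from_gmr(gmr0):
    """(B-V)_0 = 0.916*(g-r)_0 + 0.187, per the sppParams BmV0 field description."""
    return 0.916 * gmr0 + 0.187

# Example: a star with g0 = 16.0 and (g-r)_0 = 0.5
v = v0_from_g(16.0, 0.5)      # 15.7155
bv = bmv0_from_gmr(0.5)       # 0.645
```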
Detailed descriptions of the selection algorithms for the different categories of SDSS targets are provided in the series of papers noted below under Target Selection References. Here we provide short summaries of the various target selection algorithms.
In the SDSS imaging data output tsObj files, the result of target selection for each object is recorded in the 32-bit primTarget flag, as defined in Table 27 of Stoughton et al. (2002). For details, see the Target Selection References.
Note the following subtleties:
The following samples are targeted:
The main galaxy sample target selection algorithm is detailed in Strauss et al. (2002) and is summarized in this schematic flowchart.
Galaxy targets are selected starting from objects which are detected in the r band (i.e. those objects which are more than 5σ above sky after smoothing with a PSF filter). The photometry is corrected for Galactic extinction using the reddening maps of Schlegel, Finkbeiner, and Davis (1998). Galaxies are separated from stars using the following cut on the difference between the r-band PSF and model magnitudes:
rPSF - rmodel >= 0.3
Note that this cut is more conservative for galaxies than the star-galaxy separation cut used by Photo. Potential targets are then rejected if they have been flagged by Photo as SATURATED, BRIGHT, or BLENDED. The Petrosian magnitude limit rP = 17.77 is then applied, which results in a main galaxy sample surface density of about 90 per deg2.
A number of surface brightness cuts are then applied, based on mu50, the mean surface brightness within the Petrosian half-light radius petroR50. The most significant cut is mu50 <= 23.0 mag arcsec-2 in r, which already includes 99% of the galaxies brighter than the Petrosian magnitude limit. At surface brightnesses in the range 23.0 <= mu50 <= 24.5 mag arcsec-2, several other criteria are applied in order to reject most spurious targets, as shown in the flowchart. Please see the detailed discussion of these surface brightness cuts, including consideration of selection effects, in Section 4.4 of Strauss et al. (2002). Finally, in order to reject very bright objects which will cause contamination of the spectra of adjacent fibers and/or saturation of the spectroscopic CCDs, objects are rejected if they have (1) fiber magnitudes brighter than 15.0 in g or r, or 14.5 in i; or (2) Petrosian magnitude rP < 15.0 and Petrosian half-light radius petroR50 < 2 arcsec.
Main galaxy targets satisfying all of the above criteria have the GALAXY bit set in their primTarget flag. Among those, the ones with mu50 >= 23.0 mag arcsec-2 also have the GALAXY_BIG bit set. Galaxy targets that fail all of the surface brightness selection limits but have r-band fiber magnitudes brighter than 19 are accepted anyway (since they are likely to yield a good spectrum) and have the GALAXY_BRIGHT_CORE bit set.
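The main selection steps above can be sketched as a single predicate. This is a simplified illustration, not the Photo/target pipeline: it assumes extinction-corrected magnitudes as inputs, omits the Photo flag rejections, and collapses the intermediate 23.0-24.5 mag arcsec-2 surface-brightness tests into the single mu50 cut:

```python
def is_main_galaxy_target(r_psf, r_model, r_petro, mu50,
                          fiber_g, fiber_r, fiber_i, petro_r50):
    """Sketch of the main galaxy sample cuts from Strauss et al. (2002).
    All magnitudes are assumed already corrected for Galactic extinction;
    SATURATED/BRIGHT/BLENDED flag rejection is assumed done upstream."""
    if r_psf - r_model < 0.3:      # star-galaxy separation
        return False
    if r_petro > 17.77:            # Petrosian magnitude limit
        return False
    if mu50 > 23.0:                # main surface-brightness cut (simplified)
        return False
    # bright-object rejections: fiber-magnitude limits ...
    if fiber_g < 15.0 or fiber_r < 15.0 or fiber_i < 14.5:
        return False
    # ... and the bright-compact rejection
    if r_petro < 15.0 and petro_r50 < 2.0:
        return False
    return True
```

A typical accepted target would be, e.g., a galaxy with r_psf - r_model = 0.5, rP = 17.0, mu50 = 22.0, and unremarkable fiber magnitudes.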
SDSS luminous red galaxies (LRGs) are selected on the basis of color and magnitude to yield a sample of luminous intrinsically red galaxies that extends fainter and farther than the SDSS main galaxy sample. Please see Eisenstein et al. (2001) for detailed discussions of sample selection, efficiency, use, and caveats.
LRGs are selected using a variant of the photometric redshift technique and are meant to comprise a uniform, approximately volume-limited sample of objects with the reddest colors in the rest frame. The sample is selected via cuts in the (g-r, r-i, r) color-color-magnitude cube. Note that all colors are measured using model magnitudes, and all quantities are corrected for Galactic extinction following Schlegel, Finkbeiner, and Davis (1998). Objects must be detected by Photo as BINNED1, BINNED2, or BINNED4 in both r and i, but not necessarily in g; objects flagged by Photo as BRIGHT or SATURATED in g, r, or i are excluded.
The galaxy model colors are rotated first to a basis that is aligned with the galaxy locus in the (g-r, r-i) plane according to:
c⊥ = (r-i) - (g-r)/4 - 0.18

c|| = 0.7(g-r) + 1.2[(r-i) - 0.18]
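The rotation into the locus-aligned basis is a simple linear transformation of the two model colors. A minimal sketch, assuming the c⊥ and c|| definitions of Eisenstein et al. (2001) and extinction-corrected model magnitudes as inputs:

```python
def lrg_rotated_colors(g, r, i):
    """Rotate (g-r, r-i) model colors into the basis aligned with the galaxy
    locus, per Eisenstein et al. (2001). c_perp measures distance from the
    low-redshift locus; c_par measures position along it. Sketch only."""
    gr = g - r
    ri = r - i
    c_perp = ri - gr / 4.0 - 0.18
    c_par = 0.7 * gr + 1.2 * (ri - 0.18)
    return c_perp, c_par

# Example: a red galaxy with g-r = 1.5 and r-i = 0.5
cp, cpar = lrg_rotated_colors(19.0, 17.5, 17.0)
```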
Because the 4000 Angstrom break moves from the g band to the r band at a redshift z ~ 0.4, two separate sets of selection criteria are needed to target LRGs below and above that redshift:
Cut I for z <~ 0.4
Cut II for z >~ 0.4
Cut I selection results in an approximately volume-limited LRG sample to z=0.38, with additional galaxies to z ~ 0.45. Cut II selection adds yet more luminous red galaxies to z ~ 0.55. The two cuts together result in about 12 LRG targets per deg2 that are not already in the main galaxy sample (about 10 in Cut I, 2 in Cut II).
In primTarget, GALAXY_RED is set if the LRG passes either Cut I or Cut II. GALAXY_RED_II is set if the object passes Cut II but not Cut I. However, neither of these flags is set if the LRG is brighter than the main galaxy sample flux limit but failed to enter the main sample (e.g., because of the main sample surface brightness cuts). Thus LRG target selection never overrules main sample target selection on bright objects.
The final adopted SDSS quasar target selection algorithm is described in Richards et al. (2002). However, it should be noted that the implementation of this algorithm came after the last date of DR1 spectroscopy. Thus this paper does not technically describe the DR1 quasar sample, and the DR1 quasar sample is not intended to be used for statistical purposes (but see below). Interested parties are instead encouraged to use the catalog of DR1 quasars being prepared by Schneider et al. (2003, in prep.), which will include an indication of which quasars were also selected by the Richards et al. (2002) algorithm. At some later time, we will also perform an analysis of those objects selected by the new algorithm but for which we do not currently have spectroscopy, and will produce a new sample that is suitable for statistical analysis.
Though the DR1 quasars were not technically selected with the Richards et al. (2002) algorithm, the algorithms used since the EDR are quite similar to this algorithm and this paper suffices to describe the general considerations that were made in selecting quasars. Thus it is worth describing the algorithm in more detail.
The quasar target selection algorithms are summarized in this schematic flowchart. Because the quasar selection cuts are fairly numerous and detailed, readers are strongly encouraged to refer to Richards et al. (2002) for the full discussion of the sample selection criteria, completeness, target efficiency, and caveats.
The quasar target selection algorithm primarily identifies quasars as outliers from the stellar locus, modeled following Newberg & Yanny (1997) as elongated tubes in the (u-g, g-r, r-i) (denoted ugri) and (g-r, r-i, i-z) (denoted griz) color cubes. In addition, targets are also selected by matches to the FIRST catalog of radio sources (Becker, White, & Helfand 1995). All magnitudes and colors are measured using PSF magnitudes, and all quantities are corrected for Galactic extinction following Schlegel, Finkbeiner, and Davis (1998).
Objects flagged by Photo as having either "fatal" errors (primarily those flagged BRIGHT, SATURATED, EDGE, or BLENDED) or "nonfatal" errors (primarily related to deblending or interpolation problems) are rejected from the color selection, but only objects with fatal errors are rejected from the FIRST radio selection. See Section 3.2 of Richards et al. (2002) for the full details. Objects are also rejected (from the color selection, but not the radio selection) if they lie in any of 3 color-defined exclusion regions which are dominated by white dwarfs, A stars, and M star+white dwarf pairs; see Section 3.5.1 of Richards et al. (2002) for the specific exclusion region color boundaries. Such objects are flagged as QSO_REJECT. Quasar targets are further restricted to objects with iPSF > 15.0 in order to exclude bright objects which will cause contamination of the spectra from adjacent fibers.
Objects which pass the above tests are then selected to be quasar targets if they lie more than 4σ from either the ugri or griz stellar locus. The detailed specification of the stellar loci and of the outlier rejection algorithm are provided in Appendices A and B of Richards et al. (2002). These color-selected quasar targets are divided into main (or low-redshift) and high-redshift samples, as follows:
These are outliers from the ugri stellar locus and are selected in the magnitude range 15.0 < iPSF < 19.1. Both point sources and extended objects are included, except that extended objects must have colors that are far from the colors of the main galaxy distribution and that are consistent with the colors of AGNs; these additional color cuts for extended objects are specified in Section 3.4.4 of Richards et al. (2002).
Even if an object is not a ugri stellar locus outlier, it may be selected as a main quasar sample target if it lies in either of these 2 "inclusion" regions: (1) "mid-z", used to select 2.5 < z < 3 quasars whose colors cross the stellar locus in SDSS color space; and (2) "UVX", used to duplicate selection of z <= 2.2 UV-excess quasars in previous surveys. These inclusion boxes are specified in Section 3.5.2 of Richards et al. (2002).
Note that the QSO_CAP and QSO_SKIRT distinction is kept for historical reasons (as some data that are already public use this notation) and results from an original intent to use separate selection criteria in regions of low ("cap") and high ("skirt") stellar density. It turns out that the selection efficiency is indistinguishable in the cap and skirt regions, so that the target selection used is in fact identical in the 2 regions (similarly for QSO_FIRST_CAP and QSO_FIRST_SKIRT, below).
These are outliers from the griz stellar locus and are selected in the magnitude range 15.0 < iPSF < 20.2. Only point sources are selected, as these quasars will lie at redshifts above z~3.5 and are expected to be classified as stellar at SDSS resolution. Also, to avoid contamination from faint low-redshift quasars which are also griz stellar locus outliers, blue objects are rejected according to eq. (1) in Section 3.4.5 of Richards et al. (2002).
Moreover, several additional color cuts are used in order to recover more high-redshift quasars than would be possible using only griz stellar locus outliers. So an object will be selected as a high-redshift quasar target if it lies in any of these 3 "inclusion" regions: (1) "gri high-z", for z >= 3.6 quasars; (2) "riz high-z", for z >= 4.5 quasars; and (3) "ugr red outlier", for z >= 3.0 quasars. The specifics are given in eqs. (6-8) in Section 3.5.2 of Richards et al. (2002).
Irrespective of the various color selection criteria above, SDSS stellar objects are selected as quasar targets if they have 15.0 < iPSF < 19.1 and are matched to within 2 arcsec of a counterpart in the FIRST radio catalog.
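The 2-arcsec positional match against the FIRST catalog can be sketched as below. The function names are ours, and a real implementation would use a spatial index rather than a linear scan over the catalog:

```python
import math

def angular_sep_arcsec(ra1, dec1, ra2, dec2):
    """Angular separation in arcsec between two sky positions
    (inputs in degrees), via the haversine formula."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    sd = math.sin((dec2 - dec1) / 2.0)
    sr = math.sin((ra2 - ra1) / 2.0)
    a = sd * sd + math.cos(dec1) * math.cos(dec2) * sr * sr
    return math.degrees(2.0 * math.asin(math.sqrt(a))) * 3600.0

def has_first_counterpart(ra, dec, first_catalog, radius=2.0):
    """True if any FIRST source lies within `radius` arcsec of (ra, dec)."""
    return any(angular_sep_arcsec(ra, dec, r, d) <= radius
               for r, d in first_catalog)
```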
Finally, those targets which otherwise meet the color selection or radio selection criteria described above, but fail the cuts on iPSF, will be flagged as QSO_MAG_OUTLIER (also called QSO_FAINT). Such objects may be of interest for follow-up studies, but are not otherwise targeted for spectroscopy under routine operations (unless another "good" quasar target flag is set).
A variety of other science targets are also selected; see also Section 4.8.4 of Stoughton et al. (2002). With the exception of brown dwarfs, these samples are not complete, but are assigned to excess fibers left over after the main samples of galaxies, LRGs, and quasars have been tiled.
A variety of stars are also targeted using color selection criteria, as follows:
SDSS objects are positionally matched against X-ray sources from the ROSAT All-Sky Survey (RASS; Voges et al. 1999), and SDSS objects within the RASS error circles (commonly 10-20 arcsec) are targeted using algorithms tuned to select likely optical counterparts to the X-ray sources. Objects are targeted which:
Objects are flagged ROSAT_E if they fall within the RASS error circle but are either too faint or too bright for SDSS spectroscopy.
This is an open category of targets whose selection criteria may change as different regions of parameter space are explored. These consist of:
Tiling is the process by which the spectroscopic plates are designed and placed relative to each other. This procedure involves optimizing both the placement of fibers on individual plates, as well as the placement of plates (or tiles) relative to each other.
Much of the content of this page can be found as a preprint on astro-ph.
NOTE: the term "chunk" or "tiling chunk" is sometimes used to denote a tiling region. To avoid confusion with the correct use of the term chunk, we use "tiling region" here.
Figure 1: Simplified Tiling and Network Flow View
The basic idea is shown in the right half of Figure 1, which shows the appropriate network for the situation in the left half. Using this figure as reference, we here define some terms which are standard in combinatorial literature and which will be useful here:
Imagine a flow of 7 objects entering the network at the source node at the left. We want the entire flow to leave the network at the sink node at the right for the lowest possible cost. The objects travel along the arcs, from node to node. Each arc has a maximum capacity of objects which it can transport, as labeled. (One can also specify a cost for each arc.)
As described above, there is a limit of 55" to how close two fibers can be on the same tile. If there were no overlaps between tiles, these collisions would make it impossible to observe ~10% of the SDSS targets. Because the tiles are circular, some fraction of the sky will be covered with overlaps of tiles, allowing some of these targets to be recovered. In the presence of these collisions, the best assignment of targets to the tiles must account for the presence of collisions, and strive to resolve as many as possible of these collisions which are in overlaps of tiles. We approach this problem in two steps, for reasons described below. First, we apply the network flow algorithm of the above section to the set of "decollided" targets --- the largest possible subset of the targets which do not collide with each other. Second, we use the remaining fibers and a second network flow solution to optimally resolve collisions in overlap regions.
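The first step — pushing as many decollided targets as possible through the capacity-constrained network — can be illustrated with a toy maximum-flow computation. This is only a sketch: the actual SDSS solver is a min-cost flow that also handles per-arc costs, and the node names and capacities below are invented for illustration. Each target has a capacity-1 arc to every tile that covers it, and each tile feeds the sink with capacity equal to its free fibers.

```python
from collections import deque

def max_flow(capacity, source, sink):
    """Edmonds-Karp maximum flow on an adjacency-dict capacity graph."""
    flow = 0
    while True:
        # BFS for an augmenting path in the residual graph.
        parent = {source: None}
        q = deque([source])
        while q and sink not in parent:
            u = q.popleft()
            for v, cap in capacity[u].items():
                if cap > 0 and v not in parent:
                    parent[v] = u
                    q.append(v)
        if sink not in parent:
            return flow
        # Recover the path, find its bottleneck, and push flow along it.
        path, v = [], sink
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        push = min(capacity[u][v] for u, v in path)
        for u, v in path:
            capacity[u][v] -= push
            capacity[v].setdefault(u, 0)
            capacity[v][u] += push
        flow += push

# Toy instance: 3 targets, 2 tiles with 2 free fibers each;
# target t2 is covered only by tile B.
capacity = {
    "source": {"t0": 1, "t1": 1, "t2": 1},
    "t0": {"A": 1, "B": 1}, "t1": {"A": 1}, "t2": {"B": 1},
    "A": {"sink": 2}, "B": {"sink": 2},
    "sink": {},
}
assigned = max_flow(capacity, "source", "sink")
print(assigned)  # 3: every target receives a fiber
```

Because the flow formulation respects capacities globally, t0 is routed through whichever tile leaves room for the more constrained targets.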
Figure 2: Fiber Collisions
The "decollided" set of targets is the maximal subset of targets which are all greater than 55" from each other. To clarify what we mean by this maximal set, consider Figure 2. Each circle represents a target; the circle diameter is 55", meaning that overlapping circles are targets which collide. The set of solid circles is the "decollided" set. Thus, in the triple collision at the top, it is best to keep the outside two rather than the middle one.
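A simplified version of this selection can be sketched as a greedy pass that repeatedly keeps the target with the fewest collisions. The real decollided set is the true maximal subset, so this greedy heuristic is illustrative only; positions here are flat (x, y) coordinates in arcsec:

```python
import math

FIBER_COLLISION_ARCSEC = 55.0

def decollide(targets, sep=FIBER_COLLISION_ARCSEC):
    """Greedy approximation to the 'decollided' set: a large subset of
    targets that are pairwise separated by at least `sep` arcsec.
    `targets` are (x, y) positions in arcsec on a small patch of sky."""
    def collides(a, b):
        return math.hypot(a[0] - b[0], a[1] - b[1]) < sep

    remaining = list(targets)
    kept = []
    while remaining:
        # Keep the target with the fewest collisions first; in a triple
        # collision this retains the two outer targets, as in Figure 2.
        t = min(remaining, key=lambda a: sum(collides(a, b) for b in remaining))
        kept.append(t)
        remaining = [b for b in remaining if not collides(t, b)]
    return kept

# A linear triple: the middle target collides with both neighbors.
triple = [(0.0, 0.0), (40.0, 0.0), (80.0, 0.0)]
print(sorted(decollide(triple)))  # the outer two survive
```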
This determination is complicated slightly by the fact that some targets are assigned higher priority than others. For example, as explained in the Targeting section, QSOs are given higher priority than galaxies by the SDSS target selection algorithms. What we mean here by "priority" is that a higher priority target is guaranteed never to be eliminated from the sample due to a collision with a lower priority object. Thus, our true criterion for determining whether one set of assignments of fibers to targets in a group is more favorable than another is that a greater number of the highest priority objects are assigned fibers.
Once we have identified our set of decollided objects, we use the network flow solution to find the best possible assignment of fibers to that set of objects.
After allocating fibers to the set of decollided targets, there will usually be unallocated fibers, which we want to use to resolve fiber collisions in the overlaps. We can again express the problem of how best to perform the collision resolution as a network, although the problem is a bit more complicated in this case. In the case of binaries and triples, we design a network flow problem such that the network flow solution chooses the tile assignments optimally. In the case of higher multiplicity groups, our simple method for binaries and triples does not work, and we instead resolve the fiber collisions in a random fashion; however, fewer than 1% of targets are in such groups, and the targets lost by making random rather than optimal choices in these groups amount to only a small fraction of that 1%.
We refer the reader to the tiling algorithm paper for more details, including how the fiber collision network flow is designed and caveats about what aspects of the method may need to be changed under different circumstances.
Once one understands how to assign fibers given a set of tile centers, one can address the problem of how best to place those tile centers. Our method first distributes tiles uniformly across the sky and then uses a cost-minimization scheme to perturb the tiles to a more efficient solution.
In most cases, we set initial conditions by simply laying down a rectangle of tiles. To set the centers of the tiles along the long direction of the rectangle, we count the number of targets along the stripe covered by that tile. The first tile is put at the mean of the positions of target 0 and target N_t, where N_t is the number of fibers per tile (592 for the SDSS). The second tile is put at the mean between target N_t and 2N_t, and so on. The counting of targets along adjacent stripes is offset by about half a tile diameter in order to provide more complete covering.
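Assuming 1-D target positions along the long direction of the stripe, the initial placement can be sketched as follows (the function name is ours):

```python
def initial_tile_centers(target_positions, fibers_per_tile=592):
    """Place initial tile centers along a stripe: the i-th tile goes at
    the mean of the positions of target i*N_t and target (i+1)*N_t,
    where N_t is the number of fibers per tile (592 for the SDSS).
    The final index is clamped to the last target."""
    pos = sorted(target_positions)
    n = len(pos)
    centers = []
    i = 0
    while i < n:
        j = min(i + fibers_per_tile, n - 1)
        centers.append(0.5 * (pos[i] + pos[j]))
        i += fibers_per_tile
    return centers

# 10 evenly spaced targets, 4 fibers per tile -> 3 tiles.
print(initial_tile_centers(list(range(10)), fibers_per_tile=4))
```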
The method of perturbing this uniform distribution is iterative. First, one allocates targets to the tiles, but instead of limiting a target to the tiles within a tile radius, one allows a target to be assigned to more distant tiles, at a cost which increases with distance (recall that the network flow accommodates the assignment of costs to arcs). One uses exactly the same fiber allocation procedure as above. This gives each tile some information about the distribution of targets outside it. Then, once one has assigned a set of targets to each tile, one moves each tile to the position which minimizes the cost of its set of targets. With the new positions, one reruns the fiber allocation, perturbs the tiles again, and so on. This method is guaranteed to converge to a minimum (though not necessarily a global minimum), because the total cost must decrease at each step.
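In one dimension, with a quadratic distance cost (so that the cost-minimizing tile position is the mean of its assigned targets), the iteration reduces to a Lloyd-style sketch. The real algorithm performs the assignment step with the capacity-constrained network flow; this simplification just assigns each target to its nearest tile:

```python
def perturb_tiles(targets, centers, iterations=20):
    """Iterate: (1) assign each target to the nearest tile center;
    (2) move each center to the mean of its assigned targets, the
    cost-minimizing position for a quadratic distance cost.
    1-D Lloyd-style sketch of the perturbation step."""
    centers = list(centers)
    for _ in range(iterations):
        groups = {i: [] for i in range(len(centers))}
        for t in targets:
            i = min(range(len(centers)), key=lambda k: abs(t - centers[k]))
            groups[i].append(t)
        centers = [sum(g) / len(g) if g else centers[i]
                   for i, g in groups.items()]
    return centers

# Two clumps of targets pull the tile centers onto the clump means.
print(perturb_tiles([0, 1, 2, 10, 11, 12], [0, 12]))
```

Each step can only lower the total cost, which is why the iteration converges (if only to a local minimum).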
In practice, we also need to determine the appropriate number of tiles to use. Thus, using a standard binary search, we repeatedly run the cost-minimization to find the minimum number of tiles necessary to satisfy the SDSS requirements, namely that we assign fibers to 99% of the decollided targets.
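The search over the number of tiles can be sketched as a standard binary search over a monotone predicate ("does this many tiles satisfy the completeness requirement?"). The predicate below is a trivial stand-in for a full cost-minimization run:

```python
def minimum_tiles(can_satisfy, lo=1, hi=1024):
    """Binary search for the smallest tile count for which `can_satisfy`
    (a monotone predicate, e.g. 'at least 99% of decollided targets are
    assigned fibers') returns True. Assumes can_satisfy(hi) is True."""
    while lo < hi:
        mid = (lo + hi) // 2
        if can_satisfy(mid):
            hi = mid
        else:
            lo = mid + 1
    return lo

# Toy predicate in which exactly 57 tiles are needed.
print(minimum_tiles(lambda n: n >= 57))  # 57
```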
In order to test how well this algorithm works, we have applied it both to simulated and real data. These results are discussed in the Tiling paper.
There are a few technical details which may be useful to mention in the context of SDSS data. Most importantly, we will describe which targets within the SDSS are "tiled" in the manner described here, and how such targets are prioritized. Second, we will discuss the method used by SDSS to deal with the fact that the imaging and spectroscopy are performed within the same five-year time period. Third, we will describe the tiling outputs which the SDSS tracks as the survey progresses. Throughout, we refer to the code which implements the algorithm described above as tiling.
Only some of the spectroscopic target types identified by the target selection algorithms in the SDSS are "tiled." These types (and their designations in the primary and secondary target bitmasks) are described in the Targeting pages). They consist of most types of QSOs, main sample galaxies, LRGs, hot standard stars, and brown dwarfs. These are the types of targets for which tiling is run and for which we are attempting to create a well-defined sample. Once the code has guaranteed fibers to all possible "tiled targets," remaining fibers are assigned to other target types by a separate code.
All of these target types are treated equivalently, except that they are assigned different "priorities," designated by an integer. As described above, the tiling code uses these priorities to help decide fiber collisions. The sense is that a higher priority object will never lose a fiber in favor of a lower priority object. The priorities are assigned in a somewhat complicated way for reasons immaterial to tiling, but the essence is the following: the highest priority objects are brown dwarfs and hot standards, next come QSOs, and the lowest priority objects are galaxies and LRGs. QSOs have higher priority than galaxies because galaxies have higher density and stronger angular clustering. Thus, allowing galaxies to bump QSOs would allow variations in galaxy density to imprint themselves onto variations in the density of QSOs assigned to fibers, which we would like to avoid. For similar reasons, brown dwarfs and hot standard stars (which have extremely low densities on the sky) are given the highest priority.
Each tile, as stated above, is 1.49 degrees in radius, and has the capacity to handle 592 tiled targets. No two such targets may be closer than 55" on the same tile.
The operation of the SDSS makes it impossible to tile the entire 10,000 square degrees simultaneously, because we want to be able to take spectroscopy during non-pristine nights, based on the imaging which has been performed up to that point. In practice, periodically a "tiling region" of data is processed, calibrated, has targets selected, and is passed to the tiling code. During the first year of the SDSS, about one tiling region per month has been created; as more and more imaging is taken and more tiles are created, we hope to decrease the frequency with which we need to make tiling regions, and to increase their size.
A tiling region is defined as a set of rectangles on the sky (defined in survey coordinates). All of these rectangles cover only sky which has been imaged and processed. However, targets may be missed near the edges of a tiling region if that area is not covered by tiles. Thus, tiling is actually run on a somewhat larger area than a single tiling region, so that the areas near the edges of adjacent tiling regions are also included. This larger area is known as a tiling boundary.
The first tiling region which is "supported" by the SDSS is denoted Tiling Region 4. The first tiling region for which the version of tiling described here was run is Tiling Region 7. Tiling regions earlier than Tiling Region 7 used a different (less efficient) method of handling fiber collisions. The earlier version also had a bug which artificially created gaps in the distribution of the fibers. The locations of the known gaps are given in the EDR paper for Tiling Region 4 as the overlaps between plates 270 and 271, plates 312 and 313, and plates 315 and 363 (also known as tiles 118 and 117, tiles 76 and 75, and tiles 73 and 74).
In order to interpret the spectroscopic sample, one needs to use the information about how targets were selected, how the tiles were placed, and how fibers were assigned to targets. We refer to the geometry defined by this information as the "tiling window" and describe how to use it in detail elsewhere. As we note below, for the purposes of data release users it is also important to understand the photometric imaging window that has been released (including, if desired, masks for image defects and bright stars) and which plates have been released.
The observed velocity dispersion sigma is the result of the superposition of many individual stellar spectra, each of which has been Doppler shifted because of the star's motion within the galaxy. It can therefore be determined by analyzing the integrated spectrum of the whole galaxy: the integrated spectrum will be similar to the spectrum of the stars which dominate the light of the galaxy, but with broader absorption lines due to the motions of the stars. The velocity dispersion is a fundamental parameter because it is an observable that quantifies the depth of a galaxy's potential well.
Estimating velocity dispersions for galaxies which have integrated spectra which are dominated by multiple components showing different stellar populations and different kinematics (e.g. bulge and disk components) is complex. Therefore, the SDSS estimates the velocity dispersion only for spheroidal systems whose spectra are dominated by the light of red giant stars. With this in mind, we have selected galaxies which satisfy the following criteria:
Because the aperture of an SDSS spectroscopic fiber (3 arcsec) samples only the inner parts of nearby galaxies, and because the spectrum of the bulge of a nearby late-type galaxy can resemble that of an early-type galaxy, our selection includes spectra of bulges of nearby late-type galaxies. Note that weak emission lines, such as Halpha and/or O II, could still be present in the selected spectra.
A number of objective and accurate methods for measuring velocity dispersions have been developed (Sargent et al. 1977; Tonry & Davis 1979; Franx, Illingworth & Heckman 1989; Bender 1990; Rix & White 1992). These methods are all based on a comparison between the spectrum of the galaxy whose velocity dispersion is to be determined and a fiducial spectral template. The template can be the spectrum of an appropriate star, with spectral lines unresolved at the spectral resolution being used, a combination of different stellar types, or a high-S/N spectrum of a galaxy with known velocity dispersion.
Since different methods can give significantly different results, thereby introducing systematic biases especially for low S/N spectra, we decided to use two different techniques for measuring the velocity dispersion. Both methods find the minimum of
chi^2 = sum { [G - B * S]^2 }, where G is the galaxy spectrum, S the stellar template, and B the Gaussian broadening function (* denotes a convolution).
Fourier-fitting method: chi^2 = sum_k { [G~(k) - B~(k,sigma) S~(k)]^2 / Var_k^2 }, where G~, B~, and S~ are the Fourier transforms of G, B, and S, respectively, and Var_k^2 = sigma_G~^2 + sigma_S~^2 B~(k,sigma). (Note that in Fourier space the convolution becomes a multiplication.)
Direct-fitting method: chi^2 = sum_n { [G(n) - B(n,sigma) S(n)]^2 / Var_n^2 }, evaluated pixel by pixel in the spectral direction. Because the S/N of the SDSS spectra is relatively low, we assume that the observed absorption-line profiles in early-type galaxies are Gaussian.
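The direct-fitting minimization can be sketched with NumPy. This is an illustrative toy, not the SDSS pipeline: a single synthetic absorption line stands in for the galaxy and template spectra, sigma is searched on a coarse grid in pixel units, and the variance is taken as constant:

```python
import numpy as np

def broaden(template, sigma_pix):
    """Convolve a spectrum with a Gaussian of width sigma (in pixels)."""
    half = 4 * int(np.ceil(sigma_pix)) + 1
    x = np.arange(-half, half + 1)
    kernel = np.exp(-0.5 * (x / sigma_pix) ** 2)
    kernel /= kernel.sum()
    return np.convolve(template, kernel, mode="same")

def direct_fit_sigma(galaxy, template, var, sigmas):
    """Direct-fitting estimate: the sigma minimizing
    chi^2 = sum_n [G(n) - B(n, sigma) * S(n)]^2 / Var_n^2."""
    chi2 = [np.sum((galaxy - broaden(template, s)) ** 2 / var) for s in sigmas]
    return sigmas[int(np.argmin(chi2))]

# Toy test: a template absorption line broadened with a known sigma.
pix = np.arange(200.0)
template = 1.0 - np.exp(-0.5 * ((pix - 100.0) / 2.0) ** 2)
galaxy = broaden(template, 3.0)
sigmas = np.arange(1.0, 6.0, 0.5)
print(direct_fit_sigma(galaxy, template, np.ones_like(pix), sigmas))  # 3.0
```

Because the toy galaxy is generated from the template with sigma = 3 pixels, the grid search recovers exactly that value; a real measurement would convert pixels to km/s using the spectral dispersion and fit the rest-frame 4000-7000 Å range.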
It is well known that the two methods have their own particular biases, so we carried out numerical simulations to calibrate them. In our simulations, we chose a template stellar spectrum measured at high S/N, broadened it using a Gaussian with rms sigma_input, added Gaussian noise, and compared the input velocity dispersion with the measured output value. The first broadening allows us to test how well the methods work as a function of velocity dispersion, and the addition of noise allows us to test how well they work as a function of S/N. Our simulations show that the systematic errors on the velocity dispersion measurements appear to be smaller than ~3%, although estimates of low velocity dispersions (sigma < 100 km s-1) are more biased (~5%).
The SDSS uses 32 K and G giant stars in M67 as stellar templates. The SDSS velocity dispersion estimates are obtained by fitting the restframe wavelength range 4000-7000 Å, and then averaging the estimates provided by the "Fourier-fitting" and "Direct-fitting" methods. The error on the final value of the velocity dispersion is determined by adding in quadrature the errors on the two estimates (i.e., the Fourier-fitting and Direct-fitting errors). The typical error is between delta(log sigma) ~ 0.02 dex and 0.06 dex, depending on the signal-to-noise ratio of the spectra. The scatter computed from repeated observations is ~ 0.04 dex, consistent with the amplitude of the errors on the measurements.
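The averaging and quadrature combination described above amounts to the following (a minimal sketch; the function name is ours):

```python
import math

def combine_estimates(sigma_fourier, err_fourier, sigma_direct, err_direct):
    """Average the Fourier-fitting and Direct-fitting sigma estimates and
    add their errors in quadrature, as described in the text."""
    sigma = 0.5 * (sigma_fourier + sigma_direct)
    err = math.sqrt(err_fourier ** 2 + err_direct ** 2)
    return sigma, err

# Illustrative values in km/s.
print(combine_estimates(200.0, 10.0, 210.0, 12.0))
```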
Estimates of sigma are limited by the instrumental dispersion and resolution. The instrumental dispersion of the SDSS spectrograph is 69 km s-1 per pixel, and the resolution is ~ 90 km s-1. In addition, the instrumental dispersion may vary from pixel to pixel, and this can affect measurements of sigma. These variations are estimated for each fiber by using arc lamp spectra (up to 16 lines in the range 3800-6170 Å and 39 lines between 5780-9230 Å). A simple linear fit provides a good description of these variations. This is true for almost all fibers, and allows us to remove the bias such variations may introduce when estimating galaxy velocity dispersions.
The velocity dispersion measurements distributed with SDSS spectra use template spectra convolved to a maximum sigma of 420 km/s. Therefore, velocity dispersions sigma > 420 km/s are not reliable and must not be used.
We recommend that users not use SDSS velocity dispersion measurements for:
Also note that the velocity dispersion measurements output by the SDSS spectro-1D pipeline are not corrected to a standard relative circular aperture. (The SDSS spectra measure the light within a fixed aperture of radius 1.5 arcsec. Therefore, the estimated velocity dispersions of more distant galaxies are affected by the motions of stars at larger physical radii than for similar galaxies which are nearby. If the velocity dispersions of early-type galaxies decrease with radius, then the estimated velocity dispersions (using a fixed aperture) of more distant galaxies will be systematically smaller than those of similar galaxies nearby.)