
Atmos. Meas. Tech., 9, 3513–3525, 2016

www.atmos-meas-tech.net/9/3513/2016/

doi:10.5194/amt-9-3513-2016

GFIT2: an experimental algorithm for vertical profile retrieval from near-IR spectra

Brian J. Connor, Vanessa Sherlock, Geoff Toon, Debra Wunch, and Paul O. Wennberg

BC Consulting Limited, Martinborough, New Zealand

National Institute for Water and Atmospheric Research, Wellington, New Zealand

Jet Propulsion Laboratory, California Institute of Technology, Pasadena, CA, USA

California Institute of Technology, Pasadena, CA, USA

Correspondence to: Brian J. Connor (bcconsulting@xtra.co.nz)

Received: 27 August 2015 – Published in Atmos. Meas. Tech. Discuss.: 23 November 2015

Revised: 29 June 2016 – Accepted: 4 July 2016 – Published: 2 August 2016

Abstract.

An algorithm for retrieval of vertical profiles from

ground-based spectra in the near IR is described and tested.

Known as GFIT2, the algorithm is primarily intended for CO₂, and is used exclusively for CO₂ in this paper. Retrieval of CO₂

vertical profiles from ground-based spectra is theo-

retically possible, would be very beneficial for carbon cycle

studies and the validation of satellite measurements, and has

been the focus of much research in recent years. GFIT2 is

tested by application both to synthetic spectra and to mea-

surements at two Total Carbon Column Observing Network

(TCCON) sites. We demonstrate that there are approximately 3 degrees of freedom for the CO₂ profile, and the algorithm performs as expected on synthetic spectra. We show that the accuracy of retrievals of CO₂ from measurements in the 1.61 µm (6220 cm⁻¹) spectral band is limited by small uncertainties

in calculation of the atmospheric spectrum. We investigate

several techniques to minimize the effect of these uncer-

tainties in calculation of the spectrum. These techniques are

somewhat effective but to date have not been demonstrated

to produce CO₂

profile retrievals with sufficient precision

for applications to carbon dynamics. We finish by discussing

ongoing research which may allow CO₂

profile retrievals

with sufficient accuracy to significantly improve the scien-

tific value of the measurements from that achieved with col-

umn retrievals.

1 Motivation

Since 2004 the Total Carbon Column Observing Network

(TCCON) has measured ground-based near-IR solar spec-

tra. Their high spectral resolution and signal-to-noise ratio

(SNR) allow high-precision measurements of total overhead

column abundance of CO₂

and other gases, which provide

constraints on the global carbon budget and also help val-

idate satellite measurements using the same spectral bands.

The standard analysis consists of least-squares spectral fitting

to derive a multiplicative scale factor applied to an assumed

(“a priori”) CO₂ profile shape. That analysis (“profile scaling”) provides column densities, from which dry-air average mole fractions (“XCO₂”) can be derived.

Allowing the a priori CO₂

profile shape to vary in the re-

trieval process (“profile retrieval”) has several potential ad-

vantages. For one, it can be shown that such an algorithm

has more uniform sensitivity to CO₂

as a function of altitude

and so should be less sensitive to bias from the a priori pro-

file (Fig. 1). Secondly, it has more freedom to fit the observed

spectrum, and thus generally will leave smaller residuals, and

may help in understanding their origin. Finally, it would the-

oretically allow separation of the boundary layer from the

rest of the column, helping to distinguish sources and sinks

at continental and sub-continental scales (Fig. 2). Profile re-

trieval at the accuracy required is challenging, however. Pro-

file retrieval requires that the factors that affect the spectral

line shape (e.g., instrument line shape (ILS), spectroscopic

widths) be accurately known. And since profile retrieval at-

tempts to extract more information from the spectrum than

Published by Copernicus Publications on behalf of the European Geosciences Union.


B. J. Connor et al.: GFIT2: an experimental algorithm for vertical profile retrieval

Figure 1.

Column averaging kernels for simulated retrievals. “Pro-

file retrieval” and “profile scaling” refer to a typical retrieval using

GFIT2 on the 6220 cm⁻¹ band observed operationally. “Optically thin”, “optically thick”, and “intermediate” refer to an idealized, single, isolated spectral line. For these, we calculate a spectrum from a reference profile, perturb the profile at a single altitude,

and calculate a new spectrum. We then perform a least-squares fit

to this synthetic spectrum by deriving a scale factor for the refer-

ence profile. This scale factor, divided by the actual perturbation,

produces a single element of the column averaging kernels shown.

profile scaling, there typically need to be a priori constraints

to keep the retrieval stable (Rodgers, 1976; Solomon et al.,

2000; Dohe, 2013).

It is worth noting that profile retrieval is unlikely to reduce

uncertainties in determination of XCO₂. Wunch et al. (2015) showed that error due to the CO₂ a priori profile shape was less than ∼0.01 % at solar zenith angle (SZA) < 70°, and increased only slowly at higher SZA, unless deliberately crude

a priori profile shapes were chosen.

Recent work on profile retrieval development includes

Kuai et al. (2013), who retrieved CO₂ in three tropospheric layers using the 1.6 µm spectral band, and Dohe (2013), who studied a complete profile retrieval from measurements of the 2.1 µm band.

This paper describes the experimental implementation and

early tests of a profile retrieval algorithm for TCCON spec-

tra. Its layout is as follows. In Sect. 2 we describe the stan-

dard algorithm used for TCCON and briefly discuss the his-

tory of profile retrieval and the chosen algorithm for similar

measurements. Section 3 describes implementation of the al-

gorithm as GFIT2. Section 4 presents tests of GFIT2, both

for synthetic spectra and for real spectra taken to coincide

with overpasses by aircraft making in situ CO₂

measure-

ments. Section 5 presents preliminary conclusions to date,

and Sect. 6 outlines future plans.

Figure 2.

Partial column averaging kernels for the profile retrieval

algorithm of Fig. 1 (GFIT2 applied to the 6220 cm⁻¹ spectral band, with assumed signal-to-noise ratio = 1000).

2 Background and algorithm origins and history

GFIT is the algorithm adopted by the TCCON for analysis of

the spectra; it was developed over many years by Geoff Toon

at JPL. GFIT is also used to analyze MkIV balloon spectra

(e.g., Sen et al., 1996) and was used in the Version 3 pro-

cessing of ATMOS spectra (Irion et al., 2002). It is a profile

scaling algorithm, employing a quasi-linear regression to de-

rive scale factors for all important absorbers as well as other

atmospheric and instrument parameters, such as continuum

level and frequency shift.

GFIT is designed in such a way that its “forward model”

is independent of and separable from its “inverse method”.

These terms are discussed in Rodgers (2000), but briefly

the forward model is an algorithm that calculates the atmo-

spheric spectra comparable to the observed spectra, incor-

porating radiative transfer and molecular physics along with

assumed gas distributions. The inverse method retrieves a

state vector of parameters, such as molecular mixing ratio,

by finding values which provide a best fit to the spectrum

given other assumptions and constraints. The GFIT inverse

method is a form of “optimal estimation” as described further

below, which applies the Gauss–Newton method, iteratively

estimating the parameters by successive approximation.

Ground-based spectra – at microwave, IR, and UV wave-

lengths – have been analyzed in selected applications for

limited altitude profile information, for many years (Connor

et al., 1995, 2007; Pougatchev et al., 1996; Schofield et al.,

2004). The physical origin of the limited profile information in ground-based spectra is somewhat varied. Most commonly,

the pressure-broadened line shape is exploited; however the

use of lines of varied opacity (see Fig. 1) and the use of multi-

ple atmospheric paths (Schofield et al., 2004) are also sources

of profile information.


Figure 3.

A typical spectrum in the 6220 cm⁻¹ band. The calculated spectrum, produced by GFIT in profile scaling mode, is superimposed, with the individual gas contributions shown in color. CO₂, solar lines, and H₂O dominate the visible features. The residuals (also typical) are shown in the upper panel. This spectrum is

one of the atmospheric measurements used in Sect. 4.2 and subse-

quently.

The most common algorithm for the inverse method used

in these and other studies is optimal estimation, formulated

by Rodgers (1976, 2000). That algorithm will be described

in detail in the following section.

Optimal estimation has been implemented as a user-

selected option of inverse method added to the version of

GFIT publicly released in 2012; no other changes to the

standard GFIT algorithm were made. The modified algo-

rithm is known as GFIT2. GFIT is designed to treat each

spectral band independently. All calculations in this paper

are of the 1.61 µm (6220 cm⁻¹) spectral band; the use of other

bands will be discussed briefly in Sect. 5. Figure 3 shows a

typical spectrum from the TCCON site at Lamont, OK, USA.

3 Algorithm and implementation

The optimal estimation formulation of Rodgers (2000) was

adapted and applied for use with the “full physics” algorithm

(inverse method plus forward model) developed for the first

Orbiting Carbon Observatory (OCO) satellite, which failed

to reach orbit in 2009. The OCO inverse method, as it existed

in 2007, is described in Connor et al. (2008) and was used

as a starting point for the development of GFIT2. Much of

the discussion in Sect. 2 of Connor et al. (2008) is directly

applicable.

3.1 Inverse method

The OCO inverse method was adapted for use with GFIT and

is briefly described here. We use the notation and concepts

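of Rodgers (2000). The spectrum, or measurement vector $\mathbf{y}$, is expressed symbolically as $\mathbf{y} = \mathbf{F}(\mathbf{x}) + \boldsymbol{\epsilon}$, where $\mathbf{x}$ is the state vector, $\mathbf{F}$ is the forward model, and $\boldsymbol{\epsilon}$ is the vector of measurement errors.

The solution of the GFIT2 inverse method is the state vector $\hat{\mathbf{x}}$ with maximum a posteriori probability, given the measurement $\mathbf{y}$. We solve for the state vector update $d\mathbf{x}_{i+1}$, using a slightly modified form of Rodgers’ Eq. (5.8), to improve numerical accuracy by avoiding the inversion of a large matrix:

$$\left(\mathbf{S}_a^{-1} + \mathbf{K}_i^T \mathbf{S}_\epsilon^{-1} \mathbf{K}_i\right) d\mathbf{x}_{i+1} = \left[\mathbf{K}_i^T \mathbf{S}_\epsilon^{-1} \left(\mathbf{y} - \mathbf{F}(\mathbf{x}_i)\right) + \mathbf{S}_a^{-1} \left(\mathbf{x}_a - \mathbf{x}_i\right)\right] \quad (1)$$

where $\mathbf{K}$ is the weighting function matrix, or Jacobian, $\mathbf{K} = \partial\mathbf{F}/\partial\mathbf{x}$; $\mathbf{x}_a$ is the a priori state vector; $\mathbf{S}_a$ is the a priori covariance matrix; and $\mathbf{S}_\epsilon$ is the measurement covariance matrix.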
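As a minimal sketch of the update in Eq. (1), the following NumPy fragment solves the linear system for the state update instead of explicitly inverting the left-hand matrix. The function name `oe_step`, the toy linear forward model, and all numerical values are illustrative assumptions and are not taken from the GFIT2 code.

```python
import numpy as np

def oe_step(x_i, x_a, y, f_xi, K, Sa_inv, Se_inv):
    """Solve Eq. (1) for the state update dx (hypothetical helper)."""
    lhs = Sa_inv + K.T @ Se_inv @ K
    rhs = K.T @ Se_inv @ (y - f_xi) + Sa_inv @ (x_a - x_i)
    # Solve the linear system rather than forming the inverse of lhs.
    return np.linalg.solve(lhs, rhs)

# Toy problem (assumed): linear forward model F(x) = K x, noise-free spectrum.
rng = np.random.default_rng(0)
K = rng.normal(size=(50, 3))
x_true = np.array([1.0, -0.5, 2.0])
x_a = np.zeros(3)
Sa_inv = np.eye(3)               # a priori standard deviation of 1
Se_inv = np.eye(50) / 0.01**2    # measurement standard deviation of 0.01
y = K @ x_true

# For a linear forward model a single step from the a priori suffices.
x_hat = x_a + oe_step(x_a, x_a, y, K @ x_a, K, Sa_inv, Se_inv)
```

In this toy case the measurement term dominates the a priori term, so `x_hat` lies very close to `x_true`; with a weaker measurement the solution would be pulled toward `x_a`.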

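After each iteration, we test for convergence. To facilitate that, we compute the change in the solution scaled by its estimated variance:

$$d\sigma_i^2 = d\mathbf{x}_{i+1}^T \hat{\mathbf{S}}^{-1} d\mathbf{x}_{i+1} \quad (2)$$

where $\hat{\mathbf{S}}$ denotes the covariance of the retrieved state, using the relation

$$\hat{\mathbf{S}}^{-1} d\mathbf{x}_{i+1} \cong \left[\mathbf{K}_i^T \mathbf{S}_\epsilon^{-1} \left(\mathbf{y} - \mathbf{F}(\mathbf{x}_i)\right) + \mathbf{S}_a^{-1} \left(\mathbf{x}_a - \mathbf{x}_i\right)\right] \quad (3)$$

$d\sigma_i^2$ is effectively the square of the state vector update in units of the solution variance.

If $d\sigma_i^2 < f n$ (where $n$ is the number of state vector elements and $f$ is an adjustable convergence parameter), convergence is reached.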
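The iteration and convergence test can be sketched as follows. The forward model here is a deliberately simple, mildly nonlinear toy function, and the names `forward`, `jacobian`, and `retrieve` are hypothetical. The scaled step size of Eq. (2) is evaluated as the inner product of the update with the right-hand side of Eq. (1), which is what Eq. (3) allows.

```python
import numpy as np

def forward(x, A):
    """Toy forward model (assumed): mildly nonlinear in A @ x."""
    u = A @ x
    return u + 0.05 * u**2

def jacobian(x, A):
    """Analytic Jacobian dF/dx of the toy forward model."""
    return (1.0 + 0.1 * (A @ x))[:, None] * A

def retrieve(y, x_a, A, Sa_inv, Se_inv, f=1.0, max_iter=20):
    """Iterate the state update until d_sigma2 < f * n or max_iter is hit."""
    x = x_a.copy()
    n = x.size
    for it in range(1, max_iter + 1):
        K = jacobian(x, A)
        rhs = K.T @ Se_inv @ (y - forward(x, A)) + Sa_inv @ (x_a - x)
        dx = np.linalg.solve(Sa_inv + K.T @ Se_inv @ K, rhs)
        d_sigma2 = float(dx @ rhs)   # Eq. (2), using the relation in Eq. (3)
        x = x + dx
        if d_sigma2 < f * n:
            return x, it, True
    return x, max_iter, False

rng = np.random.default_rng(1)
A = rng.uniform(0.0, 1.0, size=(40, 2))
x_true = np.array([1.0, 0.5])
x_a = np.array([0.8, 0.6])
Sa_inv = np.eye(2)
Se_inv = np.eye(40) / 1e-3**2
x_hat, n_iter, ok = retrieve(forward(x_true, A), x_a, A, Sa_inv, Se_inv)
```

The maximum number of iterations plays the role of the safeguard listed among the input parameters in Sect. 3.4; the value 20 is arbitrary.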

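Lastly, we compute the retrieval covariance matrix, $\hat{\mathbf{S}}$, and the averaging kernel matrix, $\mathbf{A}$. $\hat{\mathbf{S}}$ is given by

$$\hat{\mathbf{S}} = \left(\mathbf{K}^T \mathbf{S}_\epsilon^{-1} \mathbf{K} + \mathbf{S}_a^{-1}\right)^{-1} \quad (4)$$

The averaging kernel matrix $\mathbf{A}$ is given by

$$\mathbf{A} = \hat{\mathbf{S}} \mathbf{K}^T \mathbf{S}_\epsilon^{-1} \mathbf{K} \quad (5)$$

Finally, the degrees of freedom (DoF) for signal are given by the trace of the matrix $\mathbf{A}$; the degrees of freedom for the CO₂ profile are the trace of the CO₂-only sub-matrix of $\mathbf{A}$.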
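The retrieval covariance, averaging kernel, and degrees of freedom described above might be computed along the following lines. This is a sketch under stated assumptions: both covariance matrices are taken to be diagonal, the state vector is ordered with the profile levels first, and the Jacobian is random toy data; the helper name `retrieval_diagnostics` is hypothetical.

```python
import numpy as np

def retrieval_diagnostics(K, Sa_diag, Se_diag):
    """Return (S_hat, A) from Eqs. (4) and (5) for diagonal Sa and Se."""
    KtSeK = (K.T / Se_diag) @ K                       # K^T Se^-1 K
    S_hat = np.linalg.inv(KtSeK + np.diag(1.0 / Sa_diag))
    A = S_hat @ KtSeK
    return S_hat, A

# Toy setup (assumed): 6 profile levels followed by 2 auxiliary parameters.
n_prof, n_aux, n_chan = 6, 2, 200
rng = np.random.default_rng(2)
K = rng.normal(size=(n_chan, n_prof + n_aux))
Sa_diag = np.full(n_prof + n_aux, 0.05**2)
Se_diag = np.full(n_chan, 0.5**2)

S_hat, A = retrieval_diagnostics(K, Sa_diag, Se_diag)
dof_signal = np.trace(A)                       # DoF for signal
dof_profile = np.trace(A[:n_prof, :n_prof])    # DoF for the profile block only
```

Because the profile block is a sub-matrix of the full averaging kernel and the diagonal elements lie between 0 and 1 for a diagonal a priori covariance, `dof_profile` can never exceed `dof_signal`.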

To enable use of the Rodgers algorithm, a modified GFIT

code was developed which completely separates the forward

model and inverse method, and allows integration of optimal

estimation profile retrieval with the existing code. Concep-

tually, the experimental, integrated GFIT allows selection of

the existing (profile scaling) or modified (profile retrieval) al-

gorithm. This is simply accomplished by setting a parameter

in an input file. The integrated algorithm has input and output

files identical to the existing GFIT, plus new input and out-

put files specific to profile retrieval, which are not required

unless the modified algorithm is selected.


3.2 Measurement error

A critical input to the algorithm of Eq. (1) is the measurement error covariance, $\mathbf{S}_\epsilon$. It is assumed to be diagonal, and

the simplest assumption is that the error is entirely random

noise independent of frequency. As we will see later, in all

real spectra there are systematic residuals, larger than the ac-

tual noise level, due to spectral features that can only be im-

perfectly modeled in the algorithm. If these features are not

taken into account in constructing $\mathbf{S}_\epsilon$, the retrieved profile

develops severe oscillations.

The simplest way of “de-weighting” spectral features

which remain in the residuals is to increase the estimated measurement error (equivalently, reduce the assumed SNR) at all frequencies, so that residual features are

ignored (treated as measurement error) (e.g., Connor et al.,

1995). As we will see, in practice this is somewhat effective

at damping oscillations, but only at the cost of losing most of

the profile information in the spectrum.

An alternative approach we have attempted to avoid pro-

file oscillations is to vary the assumed spectral error to reflect

the real residuals obtained by spectral fits (Rodgers and Con-

nor, 2003). A two-stage retrieval is run, in which stage 1 is

profile scaling. The residuals from stage 1 are then used to

estimate the spectral error as a function of frequency, which is inserted on the diagonal of $\mathbf{S}_\epsilon$. This procedure greatly reduces

profile oscillations in many synthetic retrievals. We refer to

it as “variable SNR”.
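One way the second stage of such a two-stage scheme could construct the diagonal of the measurement covariance is sketched below. The exact recipe used in GFIT2 is not given in the text, so the running-RMS estimate, the noise floor, the window length, and the name `variable_sigma` are all assumptions made for illustration.

```python
import numpy as np

def variable_sigma(residual, noise_sigma, window=5):
    """Per-channel error estimate: running RMS of the stage-1 residual,
    never smaller than the random-noise level (assumed recipe)."""
    kernel = np.ones(window) / window
    running_rms = np.sqrt(np.convolve(residual**2, kernel, mode="same"))
    return np.maximum(running_rms, noise_sigma)

# Toy stage-1 residual (assumed): random noise plus one systematic feature.
rng = np.random.default_rng(3)
noise_sigma = 1e-3                 # SNR = 1000 for a unit continuum
residual = rng.normal(0.0, noise_sigma, 200)
residual[98:103] += 1e-2           # poorly modeled spectral feature

sigma = variable_sigma(residual, noise_sigma)
Se_diag = sigma**2                 # diagonal of the measurement covariance
effective_snr = 1.0 / sigma        # relative to a continuum level of 1
```

Channels near the systematic feature receive a larger assumed error and hence a lower effective SNR, while clean channels keep the nominal noise level.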

A third approach to the problem of systematic residuals

is to estimate them empirically, and then to include them in

the forward model, multiplied by a scale factor retrieved as

part of the state vector (JPL, 2015). Perhaps the simplest ap-

proach is to estimate the systematic component by averag-

ing the residuals over the entire set of spectra under study.

This technique and simple variants on it will be described in

Sect. 4.4 below.
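Including an empirical residual in the forward model amounts to adding one term to the calculated spectrum and one column to the Jacobian. The sketch below illustrates this for a toy linear problem solved by unweighted least squares, which is a simplification of the optimal estimation step; the helper name `augment` and all data are hypothetical.

```python
import numpy as np

def augment(f_x, K, r_mean, s):
    """Add s * r_mean to the modeled spectrum and append r_mean as the
    Jacobian column for the scale factor s (hypothetical helper)."""
    return f_x + s * r_mean, np.column_stack([K, r_mean])

# Toy linear case (assumed): the "measured" spectrum contains the systematic
# residual with unit amplitude.
rng = np.random.default_rng(4)
K = rng.normal(size=(60, 3))
r_mean = rng.normal(size=60)
x_true = np.array([0.3, -1.2, 0.7])
y = K @ x_true + 1.0 * r_mean

x0, s0 = np.zeros(3), 0.0
f0, K_aug = augment(K @ x0, K, r_mean, s0)
step, *_ = np.linalg.lstsq(K_aug, y - f0, rcond=None)
x_hat, s_hat = x0 + step[:3], s0 + step[3]
```

Since the derivative of the added term with respect to the scale factor is the residual vector itself, no extra forward-model evaluation is needed for the new state element.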

3.3 State vector and a priori uncertainties

The full state vector consists of the CO₂ profile; scale factors for the other gas profiles contributing to the spectrum in the band pass (H₂O, HDO, and CH₄ in the 6220 cm⁻¹ band); the background continuum level, tilt, and curvature; a frequency shift; and a zero level offset. This is identical to the standard GFIT state vector, except for the CO₂ profile itself.

A scale factor multiplying a vector of systematic residuals,

as described in Sect. 3.2, has been added to the state vector

for the retrieval tests of Sect. 4.4.

A critical input is the a priori covariance matrix $\mathbf{S}_a$, specifying assumed uncertainties in the state vector and their correlations. The retrievals in this paper assume that $\mathbf{S}_a$ is diagonal. The a priori uncertainties assumed are guided by those used in the standard GFIT scaling, namely 1 for the three interfering species and the continuum level, 0.1 for the continuum tilt and curvature, 2 for frequency shift, and 0.5 % for zero level offset (which is expected to be approximately zero). The uncertainty in each of the 70 levels in the CO₂

profile is set independently. These uncertainties range from 1

to 5 %, are largest near the surface, and have been adjusted to

improve the test results where possible. Finally the residual

scale factor, when in use, has been assigned an uncertainty

of 10 %, based on the observed variability of the systematic

residuals.
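A diagonal a priori covariance with the uncertainties listed above could be assembled as in the following sketch. The ordering of the state vector and the linear taper of the profile uncertainty from 5 % at the surface to 1 % at the top are assumptions; the text specifies only the 1 to 5 % range and that the largest values are near the surface. Percentages are treated here as fractions of the a priori value.

```python
import numpy as np

N_LEVELS = 70

# CO2 profile: 5 % at the surface tapering linearly to 1 % at the top.
# The linear taper is an assumption; the text gives only the 1-5 % range.
co2_sigma = np.linspace(0.05, 0.01, N_LEVELS)

# Remaining elements, in the order listed in the text (units as in GFIT).
other_sigma = np.array([
    1.0, 1.0, 1.0,   # scale factors for H2O, HDO, CH4
    1.0,             # continuum level
    0.1, 0.1,        # continuum tilt and curvature
    2.0,             # frequency shift
    0.005,           # zero level offset (0.5 %)
    0.10,            # residual scale factor (10 %), only used in Sect. 4.4 tests
])

sigma_a = np.concatenate([co2_sigma, other_sigma])
Sa = np.diag(sigma_a**2)   # diagonal: no correlations between elements
```

Setting each level independently, as done here, leaves the retrieval free to move adjacent levels in opposite directions, which is one reason the measurement error treatment of Sect. 3.2 matters for stability.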

3.4 Other input parameters

The only other input parameters specific to profile retrieval

concern convergence and goodness of fit. They include the

convergence parameter defined in Sect. 3.1, the maximum

acceptable χ² of the spectral fit, and the maximum number

of iterations allowed.

4 Testing GFIT2

4.1 Synthetic spectra

The algorithm was first tested by retrievals on synthetic spec-

tra, where the “true” atmospheric profile is known. In these

tests, the forward model (used by the algorithm) may be

the same as the forward function (which includes all true

physics) or may differ from it in a controlled way.

4.1.1 No forward-model error

We illustrate the most basic test in Fig. 4. Here a synthetic

spectrum was calculated from the profile labeled true (di-

amonds); then GFIT2 was run on the calculated spectrum

without modification, using the a priori profile shown. (The

a priori profile is selected on the basis of climatology.)

The assumed signal-to-noise ratio was 1000. The solid

lines in Fig. 4 show the retrieved profile, and the degrees of

freedom for the profile are shown in the legend. The two re-

sults shown in Fig. 4a and b are typical. The retrieved profile

has 3.3–3.5 degrees of freedom and follows the departure of the true

profile from the a priori reasonably well. This behavior is

consistent with previous experience and with expectations,

and it leads to the conclusion that the algorithm is working

as designed.

4.1.2 Pointing error

Next we used a known instrumental limitation to test the skill

of the variable-SNR technique. Namely, while the instrument

nominally points at the center of the solar disk, it is common

for some error to be introduced by patchy cloud cover or simply by tracking hardware problems. The effect of such pointing error is to introduce a Doppler shift due to solar rotation, making calculations of the solar Doppler shift inaccurate and thus producing an uncompensated shift in the position of solar

lines relative to telluric lines.


Figure 4. (a)

Retrievals from a synthetic spectrum with no forward-

model error.

(b)

As in

(a)

but assuming different true and a priori

profiles.

To assess the effect of pointing error, we assumed that an

error was present equal to 10 % of the solar diameter, which

produces an error in the solar Doppler shift of ∼1.3 ppm. We then applied the variable-SNR modification to the measurement error covariance $\mathbf{S}_\epsilon$, as described in Sect. 3.2. Retrievals with this assumed error are shown in Fig. 5. There is a significant decrease in the degrees of freedom, from ∼3.3–3.5 to ∼2.6. This occurs because the apparent SNR decreases in

the regions of the solar lines. That is, the assumed measure-

ment error is increased at all frequencies where the Doppler

shift error causes an increased residual spectrum, and in some

cases the solar lines and CO₂

features overlap; the increased

measurement error reduces the algorithm’s sensitivity to the

measured spectrum, corresponding to lower DoF.
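The size of the Doppler error quoted above can be checked with a short back-of-envelope calculation. It assumes rigid solar rotation with an equatorial speed of about 2 km/s, a pointing offset directed along the solar equator, and no tilt of the rotation axis toward the observer; these simplifications are ours and are not stated in the text.

```python
C_KM_S = 299_792.458        # speed of light
V_EQ_KM_S = 2.0             # approximate solar equatorial rotation speed (assumed)
BAND_CENTER_CM1 = 6220.0    # band center from the text

offset_diameters = 0.10                  # pointing error from the text
offset_radii = 2.0 * offset_diameters    # the same offset in solar radii
v_los_km_s = V_EQ_KM_S * offset_radii    # rigid rotation, offset along the equator
shift_ppm = v_los_km_s / C_KM_S * 1e6    # fractional frequency shift in ppm
shift_cm1 = shift_ppm * 1e-6 * BAND_CENTER_CM1
```

Under these assumptions the fractional shift comes out at roughly 1.3 ppm, consistent with the value used in the simulation, and corresponds to somewhat less than 0.01 cm⁻¹ at the band center.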

The retrieval in Fig. 5a is qualitatively similar to the one

with higher DoF in Fig. 4a. The retrieved profile in Fig. 5b

spreads the increased CO₂

in the lower troposphere over a

broad range of altitude, showing the effect of poorer vertical

sensitivity compared to Fig. 4b, but does detect the presence

Figure 5. (a)

As Fig. 4a, with telescope pointing error assumed.

(b)

As Fig. 4b but with telescope pointing error assumed.

of enhanced CO₂

in the troposphere. Overall, we believe the

variable-SNR modification is shown to be effective in coping

with systematic residuals due to the level of solar pointing

error introduced.

4.1.3 Linewidth error

Error in spectroscopic parameters is an important cause of

systematic error in the calculated spectra, leading to system-

atic structures in the spectral residuals. Since the profile re-

trieval depends critically on the spectral line shape, spectro-

scopic errors will limit its performance, possibly severely.

The most obvious source of spectroscopic error affecting line

shape is the pressure-broadening coefficient, which simply

scales the linewidth at a given pressure. It is arguably the

largest source of line shape error as well. We will use syn-

thetic spectra and simulated retrievals to evaluate its effects.

For this purpose we have multiplied the pressure-

broadening coefficients in the relevant CO₂ bands near 1.6 µm by 1.01, thus modeling a 1 % error in linewidth, and


Figure 6. (a)

Retrievals with a 1 % error in pressure-broadening

coefficient assumed.

(b) As (a), except the residuals of the scaling

retrieval are included in the profile retrieval’s forward model.

used these modified coefficients to calculate sets of ∼100

synthetic spectra on specific days at Lamont, using actual

SZA and modeled temperature, etc. We have then run both

the scaling and profile retrievals with the original unmodi-

fied coefficients. The average profiles for 1 day are shown in

Fig. 6a.

The scaling retrieval reduces the CO₂ mixing ratio slightly to compensate for the 1 % linewidth error, producing a net error in XCO₂ (and mixing ratio) of 0.2 %. The profile retrieval, on the other hand, produces large oscillations, of ∼5 % at the surface and ∼2 % in the upper troposphere and stratosphere. Despite these large errors in profile, the net error in XCO₂ is of similar magnitude (but opposite sign) to the scaling retrieval.

Fortunately, errors in real retrievals are unlikely to be as

large as in these simulations.

Much effort in recent years has gone into refining knowl-

edge of the spectroscopic parameters needed for modeling

atmospheric CO₂. Devi et al. (2007) state that the typical uncertainty in the pressure-broadening coefficient of strong lines in the 1.6 µm CO₂ bands is only approximately 0.1 %. Admittedly, this is a formal uncertainty derived from their spectral fits and may not include some sources of absolute uncertainty. However, even an absolute uncertainty that small would produce an error in CO₂ at the surface of ∼2 ppm.
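The surface error quoted above follows from scaling the simulation result, as the short calculation below illustrates. It assumes that the profile error scales linearly with the linewidth error and uses a round surface mole fraction of 400 ppm; both are our assumptions for illustration.

```python
SURFACE_CO2_PPM = 400.0         # assumed round surface mole fraction
simulated_width_error = 0.01    # 1 % linewidth error used in the simulation
simulated_surface_error = 0.05  # ~5 % surface CO2 error found in that simulation

# Assumption: the profile error scales linearly with the linewidth error.
sensitivity = simulated_surface_error / simulated_width_error
width_error = 0.001             # ~0.1 % uncertainty (Devi et al., 2007)
surface_error_ppm = sensitivity * width_error * SURFACE_CO2_PPM
```

With a sensitivity of about 5 % in surface CO₂ per 1 % in linewidth, a 0.1 % linewidth error maps to about 0.5 % at the surface, which is approximately 2 ppm.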

The extreme sensitivity of the CO₂

profile to errors in the

pressure-broadening coefficient, and by extension to other

sources of line shape error, motivates a search for ways to

“correct” the forward-model spectra to minimize such ef-

fects. Since the (unknown) true error in a spectroscopic pa-

rameter is constant, it may be expected to produce a spectral

signature which is very similar from measurement to mea-

surement, over many measurements. If we can isolate and

remove that signature, the profile retrieval may be able to

capture variations in profile shape within that set of measure-

ments.

As a preliminary test, we assume that the spectroscopic er-

ror signature is given by the residuals of the scaling retrieval

and add those to the calculated spectrum in the profile re-

trieval. The average profiles, corresponding to Fig. 6a, are

shown in Fig. 6b.

The average retrieved profile is nearly identical to the scaling retrieval; no spurious changes in profile shape are introduced. The derived XCO₂ mole fractions differ by only 0.01 ppm.

Of course, for real measurements, the signature of spectral

error is not so easily derived. Later, in Sect. 4.4, we calculate

the mean residual vector for large sets of real measurements

and attempt profile retrievals including a scale factor applied

to the mean residual vector.

4.1.4 ILS error

Another potentially significant source of error is distortion of

the measured line shape itself. For Fourier transform spec-

trometers (FTSs, as used by TCCON) the ILS is a convolu-

tion of contributions from the finite path difference and the

finite field of view (FOV) of the FTSs. The path difference

and its ILS contribution (a sinc function) are well known, but

the FOV, which contributes a rectangular shape, has an un-

certainty we estimate as 7 %. This causes the observed line

to be broader and weaker than the atmospheric line, and it

progressively has a larger effect as the line is narrower; i.e.,

the error due to finite aperture becomes more important at

lower pressure where the intrinsic line shape is narrower (see

for example Davis et al., 2001).

We illustrate this effect in Fig. 7, which is calculated for

the same spectra as Fig. 6 and so is directly comparable to

Fig. 6a. The net effect of this error is very small in the lower

troposphere and grows only to ∼1 % in the stratosphere. We

conclude that error in the measured line shape is unlikely to

dominate error in the calculated line shape (Sect. 4.1.3).


Figure 7.

Retrievals of the spectra used for Fig. 6 but with an as-

sumed error of 7 % in the instrument field-of-view.

4.2 Atmospheric measurements

Atmospheric spectra are routinely measured in Lamont, Ok-

lahoma, at the Southern Great Plains site of the Department

of Energy Atmospheric Radiation Measurement network. A

Cessna aircraft equipped with air sampling in situ detectors

is flown there on a regular basis and produces CO₂ profiles from ∼0 to 5 km altitude (Biraud et al., 2013). Although these profiles include only about half the total column of atmospheric CO₂, it is in the lowest few kilometers that CO₂ is most variable and least predictable. Therefore the Cessna measurements, coupled with climatological estimates at higher altitudes, are expected to produce reasonable estimates of the full CO₂ profile. In particular, they can test the

profile retrieval algorithm’s ability to detect variations from

climatology in the lower troposphere.

With this in mind, we have chosen several days for

study. On these days, Cessna flights were made, atmo-

spheric conditions were excellent, and many high-quality

near-infrared spectra were recorded. We selected days from

various times of year to allow conditions as variable as possi-

ble. The specific days chosen are 15 June 2011, 5 July 2011,

28 July 2011, 26 August 2011, 24 December 2011, 14 January 2012, and 15 January 2012.

Analysis of the data from these days immediately revealed

significant errors in the solar Doppler shift. This is not only

shown by simple examination of the residuals but is also

formally calculated, as the difference in the frequency shift

observed for solar lines (after correcting for the calculated

Doppler shift) and telluric lines. GFIT does not automatically

take this error into account, by recalculating the spectrum

with the correct Doppler shift. However, these errors can

be corrected by using the retrieved solar–telluric difference

to correct the calculated Doppler shift, and then re-running

the retrieval. All measured spectra and retrievals used and/or

Figure 8. (a)

NCEP temperature profiles at 06:00 and 18:00 for

a single day, less the profile at 12:00 on the same day.

(b) Error in CO₂ retrievals produced by not accounting for the temperature changes of (a). Dashed lines for profile scaling, solid for profile retrieval.

shown in this paper have been “Doppler-shift-corrected” in

this way.

Another potential issue is the temperature profile used in

performing each retrieval.

In clear, dry conditions, the temperature at the surface will

sometimes vary by as much as 20 K during daylight. This

has an important impact on retrieval of tropospheric CO₂, as

shown in Fig. 8. Figure 8a shows the difference in tempera-

ture between 12:00 LMST and 06:00 and 18:00 of the same

day. Figure 8b shows the effect of not correctly including this

temperature variation. In particular Fig. 8b shows the differ-

ence between profiles retrieved assuming the 12:00 temper-

ature profile and those using the temperatures of 06:00 and

18:00. Two sets of curves are shown, one for profile scaling

and one for profile retrieval.


Figure 9. (a)

Average retrieval within about 1 h of a Cessna overflight, assuming SNR = 100, on 28 July 2011. (b) As (a) but for 15 January 2012.

Note that the CO₂

errors produced by imposing an error in

surface temperature are largely confined to the lowest 2 km.

To minimize this effect, we have used the NCEP profiles at

06:00, 12:00, and 18:00 LMT, and interpolated between them

to the approximate time of the Cessna overflights. These in-

terpolated temperature profiles have been used in all the re-

trievals shown subsequently in this section.
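The time interpolation of the temperature profiles can be sketched as below. The text does not state the interpolation scheme, so linear interpolation in time at each level is an assumption, as are the function name `interpolate_profile` and the toy temperature values.

```python
import numpy as np

def interpolate_profile(t_hours, times_hours, profiles):
    """Linearly interpolate (n_times, n_levels) profiles to time t_hours."""
    profiles = np.asarray(profiles, dtype=float)
    return np.array([np.interp(t_hours, times_hours, profiles[:, k])
                     for k in range(profiles.shape[1])])

# Toy temperatures in K at three levels (assumed values), surface first.
times = [6.0, 12.0, 18.0]
profiles = [[280.0, 270.0, 250.0],
            [296.0, 272.0, 250.0],
            [290.0, 271.0, 250.0]]

# Profile at the approximate overflight time, here 09:00 local time.
t_overflight = interpolate_profile(9.0, times, profiles)
```

In this toy example nearly all of the diurnal change sits in the lowest level, mirroring the observation that the resulting CO₂ errors are largely confined to the lowest 2 km.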

4.2.1 SNR = 100

As described in Sect. 3.2, we first attempted retrievals by

setting the SNR low enough to avoid trying to fit system-

atic spectral residuals. The SNR observed to achieve this

for the current dataset is approximately 100. Two examples

are shown in Fig. 9. Figure 9a may be compared directly to

Figs. 4b and 5b. A smoother version of the Cessna data is the

true profile in Figs. 4b and 5b. The a priori profile is the same.

Note that the degrees of freedom for the CO₂

profile (“DoF

”) are 1.4–1.5, implying there is only at best slightly

Figure 10.

Signal-to-noise ratio (SNR) derived for 28 July 2011

from the variable-SNR technique (see text).

more information about the profile than the 1 degree of freedom

available to the scaling retrieval.

We see that the retrieval in Fig. 9a poorly matches the

Cessna profile in the lower troposphere, although it is a small

improvement on the a priori. The profile in Fig. 9b matches

5 atm reasonably well, though that may be for-

tuitous.

For present purposes the thing to note is that assuming

SNR = 100 not only largely masks information on the altitude profile but also avoids profile oscillations (e.g., Fig. 6a),

at least for these 2 days.

4.2.2 Variable SNR

We attempted to include as much profile information as pos-

sible while avoiding “over-fitting” the spectral regions with

the poorest residuals, by using the variable SNR as described

in Sect. 3.2. We illustrate by showing results for the same

days as in the preceding section, 28 July 2011 and 15 Jan-

uary 2012. We show in Fig. 10 the effective SNR on which

we based the diagonal of $\mathbf{S}_\epsilon$ for use in Eq. (1), for the day 28 July 2011.

Note that the effective SNR varies from ∼100 to 750. The mean value is shown in the title and is a bit over 610. This is to be compared to SNR = 1000 in Fig. 4 and SNR = 100 in Fig. 9.

Examples of profiles retrieved with variable SNR are given

in Fig. 11. CO₂ DoF has increased to ∼2.9–3.3, compared to

Fig. 9. Figure 11a is directly comparable to Fig. 9a. There ap-

pears to be some improvement in the lower troposphere but

degradation in the upper troposphere, with a suggestion of

an incipient oscillation. Figure 11b shows a runaway oscilla-

tion. This result shows that the fundamental instability of the

retrieval has not been adequately mitigated by the applica-

tion of the variable-SNR technique. It is worth pointing out

that these 2 days are “typical” of the 7 days studied. In par-


Figure 11. (a)

Average retrieval within about 1 h of a Cessna overflight, assuming variable SNR, on 28 July 2011. (b) As (a) but for 15 January 2012.

ticular, the 2 other winter days show oscillations much like

in Fig. 11b; the 3 other summer days do not, but they show

little if any improvement vs. the Cessna profile.
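For readers less familiar with the DoF diagnostic quoted above, the following sketch computes the degrees of freedom for signal as the trace of the averaging kernel matrix in the standard linear optimal-estimation formalism (Rodgers, 2000). The Jacobian, the covariances, and the SNR values are toy numbers chosen only for illustration and do not reproduce the GFIT2 values. The only point is that raising the assumed SNR raises the DoF.

```python
import numpy as np

def degrees_of_freedom(K, S_eps, S_a):
    """Degrees of freedom for signal: trace of the averaging kernel matrix A.

    A = (K^T S_eps^-1 K + S_a^-1)^-1 K^T S_eps^-1 K  (linear optimal estimation).
    """
    KtSeK = K.T @ np.linalg.solve(S_eps, K)
    A = np.linalg.solve(KtSeK + np.linalg.inv(S_a), KtSeK)
    return float(np.trace(A))

rng = np.random.default_rng(0)
n_chan, n_lev = 40, 6
K = 0.01 * rng.normal(size=(n_chan, n_lev))   # toy Jacobian, purely illustrative
S_a = np.eye(n_lev) * 0.05 ** 2               # toy diagonal a priori covariance

dof = {}
for snr in (100.0, 1000.0):
    S_eps = np.eye(n_chan) / snr ** 2         # noise sigma = 1 / SNR per channel
    dof[snr] = degrees_of_freedom(K, S_eps, S_a)
    print(f"SNR = {snr:6.0f}: DoF = {dof[snr]:.2f}")
```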

The performance of the profile retrieval algorithm with

variable SNR on the measured spectra tested is clearly un-

satisfactory. In an effort to understand this limitation we

next closely examine the spectral residuals the algorithm pro-

duces, as described in the next section.

4.3 Spectroscopic residuals

Past experience indicates that the oscillatory behavior of the

profiles seen in Fig. 11 is most likely driven by a failure of the

forward model to adequately reproduce the measured spec-

trum. This of course may reflect systematic error in either

the model or the instrument. To isolate the spectral signature

of the error, we calculated mean residuals from many spec-

tra. We further expanded our study to examine the residuals

produced by the same model instrument at a different site,

namely Lauder, New Zealand.

On the 7 days examined in Sect. 4.2, a total of 5946 good-quality spectra were recorded at Lamont. There were ∼ 600 spectra day⁻¹ in winter and up to ∼ 850 day⁻¹ in summer.

We have performed retrievals on all of these spectra and cal-

culated the mean residual for each day, along with the over-

all mean residual, in an attempt to isolate systematic spectral

features which the GFIT forward model cannot reproduce.

At Lauder we examined every 10th spectrum recorded on

13 days selected for clear sky and seasonal coverage. The

days were 10 July 2010, 11 July 2010, 30 July 2010, 24 August 2010, 8 September 2010, 2 November 2010, 7 November 2010, 8 November 2010, 4 February 2011, 16 February 2011, 1 April 2011, 21 May 2011, and 28 September 2011. A total of 621 spectra were included; the number of spectra per day ranged from ∼ 10 to 70.

Figures 12–13 show expanded views of portions of the mean residuals for the two sites, for air mass < 2 (Fig. 12) and for 2 < air mass < 4 (Fig. 13). A pair of panels (upper and lower) is shown for each spectral interval, 6205–6210 and 6240–6245 cm⁻¹. The upper panel shows the mean residual. The dashed vertical lines indicate the positions of CO2 spectral lines. The lower panel shows the standard deviation of the daily mean residual vectors.
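The residual statistics described here can be sketched as follows. This is a minimal illustration on synthetic data, assuming the fit residuals are stored as a two-dimensional array (spectra by spectral channels) with one day label per spectrum. The function and variable names are hypothetical, and taking the overall mean over all spectra (rather than over the daily means) is our assumption.

```python
import numpy as np

def residual_statistics(residuals, day_labels):
    """Daily mean residuals, overall mean, and std of the daily mean vectors."""
    residuals = np.asarray(residuals, dtype=float)
    day_labels = np.asarray(day_labels)
    days = np.unique(day_labels)
    # One mean residual vector per day (rows follow the sorted day labels).
    daily_means = np.vstack(
        [residuals[day_labels == d].mean(axis=0) for d in days]
    )
    # Overall mean over all spectra (assumption: not the mean of daily means).
    overall_mean = residuals.mean(axis=0)
    # Sample standard deviation of the daily mean vectors, channel by channel.
    daily_std = daily_means.std(axis=0, ddof=1)
    return days, daily_means, overall_mean, daily_std

# Synthetic example: a fixed 0.5 % deep systematic feature plus random noise.
rng = np.random.default_rng(1)
n_chan = 50
feature = -0.005 * np.exp(-0.5 * ((np.arange(n_chan) - 25) / 2.0) ** 2)
labels = np.repeat(["d1", "d2", "d3"], 200)
resid = feature + rng.normal(scale=0.002, size=(labels.size, n_chan))

days, daily_means, overall_mean, daily_std = residual_statistics(resid, labels)
print(overall_mean[25], daily_std[25])
```

Averaging suppresses the random noise, so a systematic feature that is hidden in any single spectrum stands out in the mean, while the standard deviation of the daily means indicates how stable that feature is from day to day.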

It is immediately clear that there are systematic residuals of ∼ 0.5 % depth at all of the CO2 positions. Also, the CO2 residuals are very similar at Lauder and Lamont, and for both

ranges of air mass. Closer examination shows that the resid-

uals are slightly asymmetric, such that the line center is at

slightly higher frequency than the center of the residual.

While to first order all the CO2 residuals in Figs. 12–13 are

very similar, there are also differences. The systematic resid-

uals are broader at higher air mass, and the same features are

somewhat broader at Lauder than at Lamont.

4.4 Mean bias correction

We strongly suspect that a stable profile retrieval is not possi-

ble in the presence of systematic spectral errors as suggested

by the residuals of Sect. 4.3 and that these will readily pro-

duce the unsatisfactory oscillations seen in Fig. 11. This sys-

tematic spectral signature might be thought of as a “bias”

of the GFIT forward model which prevents it from fitting

the measured spectra precisely enough. The GFIT forward

model, as with most practical atmospheric spectral line mod-

els, uses the Voigt profile as its line shape. However, it has

recently become well understood that the Voigt line shape is

inadequate to model atmospheric spectra at the sub-percent

level. See, for example, Fig. 1 of Long et al. (2011). Unfortu-

nately, improved line shape functions are far more complex,

and, while several of them are known to improve spectral fits

in the laboratory (ibid.), there is no agreement as to which

of them contains the best physical description of the line for-

mation. Since atmospheric spectra are formed in far different

physical conditions than laboratory spectra, it is unclear how to modify the forward model to improve the observed spectral fits.

Figure 12. Mean and standard deviation of residuals in selected spectral intervals, for measurements with air mass < 2.

Pending a future clarification of the physics of line for-

mation, we have attempted to stabilize the algorithm by per-

forming simple “corrections” to the forward model to re-

move, as far as possible, the spectral bias. In the first in-

stance, we have performed retrievals on the Lamont spec-

tra discussed in Sect. 4.2, accounting for systematic spectral

residuals as follows. We have modified the forward model to

include addition of a spectral basis vector, multiplied by a

scale factor, to the modeled spectrum. We calculate the mean

residual spectrum from a large set of the Lamont retrievals

(the set to be defined shortly). We then use those mean resid-

uals as the basis vector to be added to the modeled spectrum.

The scale factor which multiplies the basis vector is incor-

porated in the state vector, to be retrieved for each measured

spectrum. It is typically

∼

In the first instance, we derive the mean residual from the

full set of 7 days of data and use this as the basis vector in

retrievals from each measured spectrum. We show two of the daily retrieved profiles (each the average of ∼ 80 individual retrievals) in Fig. 14.

Figure 13. Mean and standard deviation of residuals in selected spectral intervals, for measurements with 2 < air mass < 4.

Figure 14a and b are directly comparable to Fig. 11a and b.

The same data and algorithm are used except for the addition

of the scaled mean residual as just discussed.
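The idea of adding a scaled basis vector to the forward model, with the scale factor carried in the state vector, can be sketched in a linear setting as follows. This is not the GFIT2 implementation: it is a toy linear optimal-estimation retrieval in which the basis vector becomes an extra Jacobian column. The function name, the a priori value (0) and variance of the scale factor, and all numbers are assumptions made for illustration.

```python
import numpy as np

def retrieve_with_basis_vector(y, K, x_a, S_a, S_eps, basis,
                               scale_prior_var=1.0e2):
    """Linear optimal-estimation retrieval with one extra state element.

    The forward model is y = K x + c * basis, where c is a retrieved scale
    factor. Assumption: the a priori value of c is 0 with a loose variance.
    """
    n = K.shape[1]
    K_aug = np.column_stack([K, basis])      # basis vector as extra column
    x_a_aug = np.append(x_a, 0.0)            # a priori for the scale factor
    S_a_aug = np.zeros((n + 1, n + 1))
    S_a_aug[:n, :n] = S_a
    S_a_aug[n, n] = scale_prior_var
    Se_inv = np.linalg.inv(S_eps)
    lhs = K_aug.T @ Se_inv @ K_aug + np.linalg.inv(S_a_aug)
    rhs = K_aug.T @ Se_inv @ (y - K_aug @ x_a_aug)
    x_hat = x_a_aug + np.linalg.solve(lhs, rhs)
    return x_hat[:n], x_hat[n]

# Toy problem: the "measurement" contains 0.8 times the basis vector.
rng = np.random.default_rng(2)
n_chan, n_lev = 60, 4
K = rng.normal(size=(n_chan, n_lev))
basis = rng.normal(size=n_chan)              # stands in for a mean residual
x_true = np.array([1.0, 0.5, -0.5, 0.2])
y = K @ x_true + 0.8 * basis + rng.normal(scale=1e-3, size=n_chan)

x_hat, scale = retrieve_with_basis_vector(
    y, K, x_a=np.zeros(n_lev), S_a=np.eye(n_lev),
    S_eps=np.eye(n_chan) * 1e-6, basis=basis)
print(x_hat, scale)
```

In this idealized case the basis vector is unrelated to the Jacobian columns, so the scale factor and the profile are cleanly separated. In a real retrieval the mean residual may be partly correlated with the CO2 Jacobians, in which case the correction can absorb profile information.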

The comparison of Figs. 11 and 14 shows a dramatic im-

provement on 15 January 2012, eliminating the large oscilla-

tion of Fig. 11b. Also on 28 July 2011, the profile of Fig. 14a,

after subtracting the mean residual, is less oscillatory than

previously (Fig. 11a). On both days, the integrated column from the profile retrieval, represented by XCO2, is slightly

closer to the estimated true value. This is promising and sug-

gests further development of the “bias correction” procedure.

Deriving the residual vector from the full 7 days of mea-

surements implicitly assumes that the residuals are indepen-

dent of seasonal effects and instrumental adjustments over a

long period. This is unlikely to be the case; in fact we have

already noted in discussing Figs. 12–13 that the residuals de-

pend to some degree on air mass. The mean air mass of a set

of spectra will vary with season and times of day. To lessen the impact of both air mass and potential instrumental variations in the residuals, we have used monthly residuals calculated in a limited range of air mass (1–2 or 2–4) as the spectral correction vectors and run the retrievals once more. The results are shown in Fig. 15.

Figure 14. (a) As Fig. 11a but including a basis vector of mean residuals in the forward model. (b) As (a) but for 15 January 2012.
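Grouping residuals by month and air mass range can be sketched as below. The data layout, the function name, the month labels, and the convention that a bin includes its lower edge and excludes its upper edge are all assumptions made for this illustration.

```python
import numpy as np
from collections import defaultdict

# Air mass ranges used for the correction vectors (1-2 and 2-4).
AIRMASS_BINS = ((1.0, 2.0), (2.0, 4.0))

def binned_mean_residuals(residuals, months, airmasses, bins=AIRMASS_BINS):
    """Mean residual vector for each (month, air mass bin) group.

    Spectra whose air mass falls outside every bin are ignored.
    Assumption: a bin includes its lower edge and excludes its upper edge.
    """
    groups = defaultdict(list)
    for resid, month, am in zip(residuals, months, airmasses):
        for lo, hi in bins:
            if lo <= am < hi:
                groups[(month, (lo, hi))].append(resid)
                break
    return {key: np.mean(rows, axis=0) for key, rows in groups.items()}

# Tiny made-up example: four two-channel residual vectors.
residuals = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]])
months = ["2011-07", "2011-07", "2011-07", "2012-01"]
airmasses = [1.2, 1.8, 2.5, 4.5]

corrections = binned_mean_residuals(residuals, months, airmasses)
for key, vec in corrections.items():
    print(key, vec)
```

Each spectrum would then be retrieved with the correction vector of its own month and air mass bin. Narrower groups follow the residuals more closely, which is consistent with the smaller rms fit reported for this approach, but they also make it easier for the correction to absorb signal that belongs to the profile.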

Unfortunately these results are a clear step backward.

Figure 15a shows no sensitivity to the enhanced lower-tropospheric CO2, and in Fig. 15b we see the return of oscillatory behavior. The XCO2 value from the profile retrieval

minus the estimated true value is similar to the scaled minus

true value on 28 July 2011, and somewhat larger than scaled

minus true value on 15 January 2012. It should be noted

that the final residuals resulting from addition of the scaled,

monthly mean residuals are better (the rms fit is smaller) than

those which result from use of the overall mean residuals.

Despite that, the profiles are worse, suggesting that spectral

features relevant to the CO2 profile are being removed by the

attempt at spectral bias correction.

Our best results to date come from adding the scaled mean

residuals of the set of days under study. With that in mind we will expand the discussion to include days other than the two illustrations used so far.

Figure 15. (a) As Fig. 14a but with a different residual basis vector. (b) As Fig. 14b but with a different residual basis vector.

Figure 16a and b show results for 26 August 2010 and

14 January 2012. They were produced in the same way as

Fig. 14a and b, that is, including addition of a scaled mean

residual. The two dates are chosen to illustrate two features

of the full set of retrievals. Namely, on 26 August we see a

smooth profile with only a suggestion of oscillation, which

seems (maybe fortuitously) to track some enhanced CO2 in the lower troposphere. This description is similar to one for

28 July (Fig. 14a); in fact it is typical of all 4 summer days

of this group. Conversely, 14 January shows a serious oscil-

lation in the profile, unlike 15 January (Fig. 14b). The third

winter day studied, 24 December 2011 (not shown), has an

oscillation similar to 14 January. On both days, XCO2 from

the profile retrieval is slightly closer to the true value than is

the scaled value.

In summary these results fall into two classes. In one, the

retrieved profile is reasonably well behaved but offers little if any improvement on the profile scaling version. In the other, the retrieved profile suffers serious oscillations. XCO2 from the profile retrieval is similar to that from the scaling retrieval.

Figure 16. (a) As Fig. 14a but for 28 August 2011. (b) As Fig. 14b but for 14 January 2012.

5 Conclusions

The algorithm behaves as expected on synthetic data. On real

data, results are usually worse than scaling, given our a priori knowledge of the CO2 distribution with altitude. Spectral

residuals are generally poor. When modifications to the spec-

tra and/or tight constraints force residuals to be small, profile

oscillations tend to be severe. Based on the tests shown in

Sect. 4.1, it would seem that our theoretical knowledge of

the atmospheric spectra is inadequate to provide useful CO2 profiles at the accuracy required. It is important to consider the word “useful”: the required accuracy for useful measurements of CO2 in the troposphere is very high (∼ 0.1–0.2 %) relative to other atmospheric species and altitude regions.

Demands on our knowledge of the spectra are correspond-

ingly high.

There are at least two directions to follow in pursuit of

useful profile retrievals. One is improvements to the forward

model. These could be in the form of more accurate values of

spectral parameters, more appropriate models of spectral line

shape, and/or knowledge of the instrument line shape. All of

these areas have, however, already been the focus of intense

work over an extended period of time, and breakthroughs

may be slow in coming.

A second alternative is to exploit profile information from

sources other than the pressure-broadened line shape. An

immediately accessible source is spectral regions of higher

and lower opacity than the spectral band considered here.

In particular, several other CO2 bands of varying opacity are routinely measured simultaneously with the 1.61 µm (6220 cm⁻¹) band by the TCCON FTSs. A profile retrieval using several bands simultaneously should be explored. For example, the 2.06 µm (4852 cm⁻¹) band has much higher opacity, while the 1.65 µm (6073 cm⁻¹) band has lower opacity than the 1.61 µm band we have used. As shown in

Fig. 1, regions of high opacity are very sensitive to the lower

troposphere, while more optically thin regions are sensitive

to the upper troposphere and stratosphere. Simultaneous re-

trievals using all three bands would be far less sensitive to

details of the spectral line shape and thus might avoid the

difficulties described earlier. A complicating factor, however,

is the likelihood of different errors in band strength in the

three regions. A strategy for self-consistently scaling the band strengths might be required before performing a simultaneous profile retrieval.

An apparently simple alternative has been suggested,

namely imposing a priori constraints on the profile shape by

experimenting with explicit interlayer correlations in the a priori covariance matrix S_a. Such correlations are not necessary simply to preserve profile smoothness. This is fundamentally because in the Rodgers algorithm the a priori

profile shape implicitly imposes the profile fine structure,

which is not strongly influenced by the measurement (see

Rodgers, 1990). So, for example, the algorithm developed at

Stony Brook University for ground-based microwave mea-

surements of ClO has been used successfully for more than

20 years and has always used a diagonal S_a matrix (Solomon et al., 2000; Connor et al., 2013).

Nevertheless explicit interlayer correlations may damp un-

desirable oscillations, and their effect should be explored.

They have been used routinely in the OCO/OCO-2 retrieval

algorithms (Connor et al., 2008; JPL, 2015), where experi-

ence shows that the nature and strength of correlations is key

to doing this successfully. A whole series of experiments,

analogous to those presented in Sect. 4 of this paper, could

be envisioned to decide how best to apply correlations and

to evaluate their efficacy. This would be a valuable part of a

follow-on study, especially if combined with the multiband

retrieval approach. Regrettably, it is beyond the scope of the

present effort.
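To make the notion of explicit interlayer correlations concrete, the sketch below builds an a priori covariance matrix whose off-diagonal elements decay exponentially with the altitude separation between layers. The exponential form, the correlation length, the altitude grid, and the standard deviations are all assumptions chosen for illustration. They are not the correlations used in the OCO/OCO-2 algorithms, nor a recommendation for GFIT2.

```python
import numpy as np

def correlated_prior_covariance(sigma, altitude_km, corr_length_km):
    """A priori covariance with exponentially decaying interlayer correlation.

    S_a[i, j] = sigma[i] * sigma[j] * exp(-|z_i - z_j| / corr_length_km)
    """
    sigma = np.asarray(sigma, dtype=float)
    z = np.asarray(altitude_km, dtype=float)
    corr = np.exp(-np.abs(z[:, None] - z[None, :]) / corr_length_km)
    return np.outer(sigma, sigma) * corr

z = np.arange(0.0, 20.0, 2.0)          # hypothetical 2 km layer grid
sigma = np.full(z.size, 0.01)          # hypothetical 1 % a priori uncertainty
S_a_corr = correlated_prior_covariance(sigma, z, corr_length_km=5.0)
S_a_diag = np.diag(sigma ** 2)         # the diagonal alternative

print(S_a_corr[0, :3])
```

A longer correlation length ties neighboring layers together more strongly and so penalizes layer-to-layer oscillations, at the cost of vertical resolution. Letting the correlation length tend to zero recovers the diagonal matrix.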

Finally, it is our intention to release GFIT2 to the com-

munity, as an option within the public version of GFIT. That

would allow testing and development by a wider range of ex-

perienced researchers. So far that has proven impractical, but

we hope to do so in the near future.

Acknowledgements.

Part of this research was performed at the Jet

Propulsion Laboratory, California Institute of Technology, under

contract with NASA. We thank NASA’s Carbon Cycle Science

Investigation Program for supporting the development of GFIT2

(NNX14AI60G). Operations of TCCON at Lamont, Oklahoma,

are made possible by NASA’s OCO-2 project in collaboration with

the DOE ARM program. Cessna data from the SGP are available

through the ARM archive (www.archive.arm.gov). We thank

Sebastien Biraud for his assistance in interpreting the aircraft data.

Edited by: F. Hase

Reviewed by: two anonymous referees

References

Biraud, S. C., Torn, M. S., Smith, J. R., Sweeney, C., Riley, W. J.,

and Tans, P. P.: A multi-year record of airborne CO2 observations

in the US Southern Great Plains, Atmos. Meas. Tech., 6, 751–

763, doi:10.5194/amt-6-751-2013, 2013.

Connor, B. J., Siskind, D. E., Tsou, J. J., Parrish, A., and Rems-

berg, A. E. E.: Ground-based microwave observations of ozone

in the upper stratosphere and mesosphere, J. Geophys. Res.,

99, 16757–16770, 1994.

Connor, B. J., Parrish, A., Tsou, J.-J., and McCormick, M. P.: Error

analysis for the ground-based microwave ozone measurements

during STOIC, J. Geophys. Res., 100, 9283–9292, 1995.

Connor, B. J., Mooney, T., Barrett, J., Solomon, P., Parrish, A., and

Santee, M.: Comparison of ClO measurements from the Aura

Microwave Limb Sounder to ground-based microwave measure-

ments at Scott Base, Antarctica, in spring 2005, J. Geophys. Res.,

112, D24S42, doi:10.1029/2007JD008792, 2007.

Connor, B. J., Bösch, H., Toon, G., Sen, B., Miller, C., and

Crisp, D.: Orbiting Carbon Observatory: Inverse method and

prospective error analysis, J. Geophys. Res., 113, D05305,

doi:10.1029/2006JD008336, 2008.

Davis, S. P., Abrams, M. C., and Brault, J. W.: Fourier Transform

Spectroscopy, Academic Press, 2001.

Devi, V. M., Benner, D. C., Brown, L. R., Miller, C. E., and Toth, R.

A.: Line Mixing and Speed Dependence in CO2 at 6227.9 cm⁻¹: Constrained multispectrum analysis of intensities and line shapes

in the 30013–00001 band, J. Mol. Spec., 245, 52–80, 2007.

Dohe, S.: Measurements of atmospheric CO2 columns using ground-based FTIR spectra, Doctor of Science dissertation, Karlsruhe Institute of Technology, 101–104, 2013.

Irion, F. W., Gunson, M. R., Toon, G. C., Chang, A. Y., Eldering,

A., Mahieu, E., Manney, G. L., Michelsen, H. A., Moyer, E. J.,

Newchurch, M. J., Osterman, G. B., Rinsland, C. P., Salawitch,

R. J., Sen, B., Yung, Y. L., and Zander, R.: Atmospheric Trace

Molecule Spectroscopy (ATMOS) Experiment Version 3 data re-

trievals, Appl. Opt., 41, 6968–6979, 2002.

JPL: Jet Propulsion Laboratory, Orbiting Carbon Observatory

– 2, Level 2 Full Physics Algorithm Theoretical Basis

Document, http://disc.sci.gsfc.nasa.gov/OCO-2/documentation/

oco-2-v6/OCO2_L2_ATBD.V6.pdf, 2015.

Kuai, L., Wunch, D., Shia, R.-L., Connor, B., Miller, C., and Yung,

Y.: Vertically constrained CO2 retrievals from TCCON measurements, J. Quant. Spec. Rad. Trans., 113, 1753–1761, 2012.

Long, D. A., Bielska, K., Lisak, D., Havey, D. K., Okumura, M.,

Miller, C. E., and Hodges, J. T.: The air-broadened, near-infrared

line shape in the spectrally isolated regime: Evidence of

simultaneous Dicke narrowing and speed dependence, J. Chem.

Phys., 135, 064308, doi:10.1063/1.3624527, 2011.

Pougatchev, N. S., Connor, B. J., Jones, N. B., and Rinsland, C. P.:

Validation of ozone profile retrieval from infrared ground-based

solar spectra, Geophys. Res. Lett., 23, 1637–1640, 1996.

Rodgers, C. D.: Retrieval of Atmospheric Temperature and Compo-

sition From Remote Measurements of Thermal Radiation, Rev.

Geophys. Space Phys., 14, 609–624, 1976.

Rodgers, C. D.: Inverse Methods for Atmospheric Sounding: The-

ory and Practice, World Scientific Publishing Co. Ltd., 2000.

Rodgers, C. D. and Connor, B. J.: Intercomparison of re-

mote sounding instruments, J. Geophys. Res., 108, 4116,

doi:10.1029/2002JD002299, 2003.

Schofield, R., Connor, B. J., Kreher, K., Johnston, P. V., and

Rodgers, C. D.: The retrieval of profile and chemical information

from ground-based UV-Visible DOAS measurements, J. Quant.

Spectrosc. Ra., 86, 115–131, 2004.

Sen, B., Toon, G. C., Blavier, J.-F., Fleming, E. L., and Jackman,

C. H.: Balloon-borne observations of mid-latitude fluorine abun-

dance, J. Geophys. Res., 101, 9045–9054, 1996.

Solomon, P. M., Barrett, J., Connor, B. J., Zoonematkermani, S.,

Parrish, A., Lee, A., Pyle, J., and Chipperfield, M.: Seasonal ob-

servations of chlorine monoxide in the stratosphere over Antarc-

tica during the 1996–1998 ozone holes and comparison with

the SLIMCAT 3D model, J. Geophys. Res., 105, 28979–29001,

2000.

Wunch, D., Toon, G. C., Sherlock, V., Deutscher, N. M.,

Liu, X., Feist, D. G., and Wennberg, P. O.: The To-

tal Carbon Column Observing Network’s GGG2014 Data

Version. Carbon Dioxide Information Analysis Center, Oak

Ridge National Laboratory, Oak Ridge, Tennessee, USA,

doi:10.14291/tccon.ggg2014.documentation.R0/1221662, 2015,
