Data Quality Control

TAO data undergo extensive quality control, including comparisons with historic averages, to ensure that the data released for public use are accurate. This page summarizes the procedures for real-time Autonomous Temperature Line Acquisition System (ATLAS) data, delayed mode ATLAS data, and Acoustic Doppler Current Profiler (ADCP) data. Quality codes, the numeric values that indicate how trustworthy the stored data are, are also described.

Real-time ATLAS data

NDBC data analysts perform quality control of the real-time data on a daily, weekly, and monthly basis.

Daily quality control

The first step of NDBC data quality control is an automated check by a computer program that fails any measurement falling outside a predetermined set of broad upper and lower limits for each measurement type. Next, a second program compares the measurements against historic averages. Suspect data are then compiled into a possible-error report for final quality control by the analyst. The program does not fail these suspect values automatically; the analyst fails them manually after judging the validity of each measurement.

In addition to the error-checking program, daily comparisons are made between the TAO measurements processed at NDBC and the TAO data transmitted via the GTS. Any discrepancies between the two data sets are investigated and corrected immediately.

Data quality control procedures are summarized in the table below:

Measurement: Wind direction
Preliminary gross automated error checking: none listed
Sensor measurements that will generate automated error alerts: Hourly and daily compass or vane zero; daily compass or vane constant; daily direction varies more than 90° from previous day.
Additional daily checks: Visual inspection of last ten days against available model data

Measurement: Wind velocity
Preliminary gross automated error checking: none listed
Sensor measurements that will generate automated error alerts: Daily speed changes more than 5 m s-1 from previous day
Additional daily checks: Visual inspection of last ten days against available model data

Measurement: Relative humidity (RH)
Preliminary gross automated error checking: RH set to missing if > 99.9%
Sensor measurements that will generate automated error alerts: Daily RH outside 65-99%; hourly RH outside 50-100% within past two weeks; changes > 20% from previous day
Additional daily checks: Compare hourly and daily RH against hourly and daily air temperature to ensure an invalid air temperature measurement did not cause the anomaly

Measurement: Air temperature (AT)
Preliminary gross automated error checking: AT set to missing if > 33.0° or < -9.0°.
Sensor measurements that will generate automated error alerts: Daily AT changes > 5°C from previous day; daily AT - SST > 1.4°C; daily AT outside 6-32°C; hourly AT outside 15-33°C within past two weeks
Additional daily checks: Visually inspect hourly air temperature for > 2° changes from previous hour

Measurement: Sea surface temperature (SST)
Preliminary gross automated error checking: SST set to missing if > 33.0° or < -9.0°.
Sensor measurements that will generate automated error alerts: Daily SST changes > 5°C from previous day; daily SST - T at 20 m or 25 m > 0.2°; hourly SST outside 20-30°C within past two weeks
Additional daily checks: Visual inspection of the last two weeks time series plot of SST vs wind vectors

Measurement: Subsurface temperature (T)
Preliminary gross automated error checking: Daily T set to missing if > 99.99° or < -9.0°.
Sensor measurements that will generate automated error alerts: Daily T changes > 5°C from previous day; vertical gradient between adjacent sensors checked. Daily T conforms to the historic values for the current quarter (± 3 s.d. of 90-day mean)
Additional daily checks: Visual inspection of T profiles

Measurement: Rainfall
Preliminary gross automated error checking: Rate > 10 mm hr-1
Sensor measurements that will generate automated error alerts: Sensor output full scale; daily rainfall rate outside -0.1-10 mm hr-1; daily rain rate > 1.0 mm hr-1 for < 5% time raining; daily rain rate < 0.1 mm hr-1 for > 25% time raining
Additional daily checks: Visually inspect last ten days percent time raining for invalid measurement trends

Measurement: Shortwave radiation (SWR)
Preliminary gross automated error checking: Set to missing if > 1400 W m-2. If any SWR value (mean, standard deviation, maximum) reads 0, all are set to missing for that day.
Sensor measurements that will generate automated error alerts: Sensor output zero or full scale; daily radiation outside 50-325 W m-2; max radiation exceeds 1350 W m-2
Additional daily checks: Visual inspection and comparison with time series plots from neighboring sites and satellite imagery.

Measurement: Barometric pressure (BP)
Preliminary gross automated error checking: none listed
Sensor measurements that will generate automated error alerts: BP changes > 5 mb from previous day; daily BP outside 990-1018 mb; hourly BP outside 990-1018 mb in past two weeks
Additional daily checks: Visual inspection and comparison with time series plots from neighboring sites.

Measurement: Salinity
Preliminary gross automated error checking: Computed only for conductivity in range 30.0-70.0 mS cm-1 and T > 0.0°
Sensor measurements that will generate automated error alerts: Salinity changes by > 0.5 psu; salinity outside 31.0-36.5 psu; density inversions computed from daily averaged salinities and temperatures > 0.05 kg m-3; salinity does not conform to the historic values for the current quarter (> ± 3 s.d. of 90-day mean)
Additional daily checks: Visually inspect last ten days of salinity for erratic trends

Measurement: Position
Preliminary gross automated error checking: Data from moorings which have drifted more than 1 degree of latitude or 5 degrees of longitude are excluded from the database.
Sensor measurements that will generate automated error alerts: Buoy position changes from deployment position by > 6 nm
Additional daily checks: Visually inspect the 300 m and 500 m pressure for abrupt decreases corresponding to buoy movement
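As an illustration of how one row of the table could be applied, the following Python sketch encodes the air temperature gross limits and daily alert thresholds listed above. It is a minimal sketch and not NDBC's software: the function names, constant names, and alert strings are hypothetical, and only the daily (not hourly) checks are covered.

```python
import math

# Thresholds from the air temperature (AT) row of the table; names are illustrative.
AT_GROSS_MIN, AT_GROSS_MAX = -9.0, 33.0   # outside this range: set to missing
AT_DAILY_MIN, AT_DAILY_MAX = 6.0, 32.0    # daily AT outside this range: alert
AT_MAX_DAILY_CHANGE = 5.0                 # change from previous day, deg C
AT_MINUS_SST_MAX = 1.4                    # daily AT minus SST, deg C


def gross_check(at):
    """Return NaN (missing) when AT falls outside the gross limits."""
    if at > AT_GROSS_MAX or at < AT_GROSS_MIN:
        return math.nan
    return at


def daily_at_alerts(at_today, at_yesterday, sst_today):
    """List the alert conditions triggered by one daily AT value."""
    alerts = []
    if abs(at_today - at_yesterday) > AT_MAX_DAILY_CHANGE:
        alerts.append("daily change > 5 C")
    if at_today - sst_today > AT_MINUS_SST_MAX:
        alerts.append("AT - SST > 1.4 C")
    if not AT_DAILY_MIN <= at_today <= AT_DAILY_MAX:
        alerts.append("daily AT outside 6-32 C")
    return alerts
```

In keeping with the procedure described earlier, the alerts returned here would only populate the possible-error report; the decision to fail a value remains with the analyst.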

Weekly real-time quality control
Every week, the National Centers for Environmental Prediction (NCEP) compiles statistics of TAO data transmitted via the GTS and compares these statistics to numerical weather prediction Medium Range Forecast (MRF) model output. Weekly mean and RMS differences of daily averaged TAO and NCEP 10 m winds are computed. Daily averaged NCEP winds in these computations are based on four 6-hourly forecasts at 00z, 06z, 12z, and 18z. Weekly means and standard deviations for TAO air temperatures and sea surface temperatures are also computed. Based on these statistics, NCEP reports the number of suspect observations for wind, air temperature, and sea surface temperature according to the criteria listed in the table below.
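The weekly wind statistics can be sketched as follows. This is only an illustration under stated assumptions: the function names are hypothetical, the difference is taken as TAO minus model (the source does not state the sign convention), and the same calculation would be applied separately to each wind component.

```python
import math


def daily_mean_from_forecasts(six_hourly):
    """Average the four 6-hourly forecast values (00z, 06z, 12z, 18z) for one day."""
    if len(six_hourly) != 4:
        raise ValueError("expected four 6-hourly values")
    return sum(six_hourly) / 4.0


def weekly_mean_and_rms_difference(tao_daily, model_daily):
    """Mean and RMS of (TAO - model) over paired daily averages."""
    if len(tao_daily) != len(model_daily) or not tao_daily:
        raise ValueError("need two non-empty series of equal length")
    diffs = [t - m for t, m in zip(tao_daily, model_daily)]
    mean_diff = sum(diffs) / len(diffs)
    rms_diff = math.sqrt(sum(d * d for d in diffs) / len(diffs))
    return mean_diff, rms_diff
```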

The 5-day means of most variables are compared to the previous month's monthly averaged data. Conditions that indicate possible errors are listed below. Analysts investigate anomalies, fail measurements with suspect values, and release only the highest quality data.

Measurement: Wind direction
NDBC checks: Direction differs from monthly average by > 30°
NCEP statistic output: none listed

Measurement: Wind vector components (U/V)
NDBC checks: Mean, standard deviation, or root mean square is ≥ ±3 from NCEP model
NCEP statistic output: Mean and standard deviation of MRF output and TAO winds; RMS difference of MRF and TAO winds

Measurement: Wind speed
NDBC checks: 5-day mean vs monthly average
NCEP statistic output: none listed

Measurement: Relative humidity (RH)
NDBC checks: 5-day average < 40%
NCEP statistic output: none listed

Measurement: Air temperature (AT)
NDBC checks: 5-day mean different from monthly average by > 2°C; mean, standard deviation, or root mean square is ≥ ±3 from NCEP model
NCEP statistic output: Mean and standard deviation of TAO AT; AT < 15.0 or > 35.0

Measurement: Sea surface temperature (SST)
NDBC checks: 5-day mean different from monthly average by > 2°C; mean, standard deviation, or root mean square is ≥ ±3 from NCEP model
NCEP statistic output: Mean and standard deviation of TAO SST; SST < 15.0 or > 35.0

Measurement: Subsurface temperature (T)
NDBC checks: 20°C isotherm differs from monthly average by > 25 m
NCEP statistic output: none listed

Measurement: Rainfall
Mean daily rainfall rate and standard deviation; number of points since deployment where % time raining is > 30%; number of points where rain rate > 4 mm hr-1

Measurement: Shortwave radiation
Mean daily radiation and standard deviation; number of points since deployment where maximum daily radiation > 1350 W m-2; number of points where average daily radiation > 650 W m-2; number of points where average radiation < 50 W m-2

Measurement: Salinity
NDBC checks: Two week time series plot compared to nearby station and for erratic trends
NCEP statistic output: none listed
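A comparison of a 5-day mean with the previous month's average, as used in several of the NDBC checks above, might look like the sketch below. The function names are hypothetical. For wind direction the difference is taken around the compass so that 350° and 10° differ by 20°; the direction means themselves are assumed to be supplied by the caller, since a plain arithmetic mean is not appropriate for angles.

```python
def angular_difference(a_deg, b_deg):
    """Smallest absolute difference between two directions, in degrees (0-180)."""
    return abs((a_deg - b_deg + 180.0) % 360.0 - 180.0)


def five_day_mean(daily_values):
    """Arithmetic mean of the most recent five daily values (not for directions)."""
    if len(daily_values) != 5:
        raise ValueError("expected five daily values")
    return sum(daily_values) / 5.0


def departs_from_monthly(five_day, monthly, limit, is_direction=False):
    """True when the 5-day mean differs from the monthly average by more than limit."""
    if is_direction:
        diff = angular_difference(five_day, monthly)
    else:
        diff = abs(five_day - monthly)
    return diff > limit
```

For example, `departs_from_monthly(five_day_mean(at_values), monthly_at, 2.0)` would apply the 2°C air temperature criterion, where `at_values` and `monthly_at` are placeholders for the analyst's data.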

Monthly real-time quality control

Daily averaged data for the past month are plotted by site for all 55 buoys. The data analyst examines the measurement trends for bad data runs and sensor drift. If the plots indicate errors within a data run, the raw data for the affected periods are examined. After completing quality control of the raw data, the analyst decides whether to fail the data. If a previously failed data run is determined to be valid, the analyst releases the data. The analyst applies the daily quality control measurement thresholds to the sensor or sensors that indicate a possible bad or valid data run.

Delayed mode ATLAS data

General

Raw data recovered from sensor internal memory are first processed using computer programs that apply pre-deployment calibrations and generate time series in engineering units. These programs also flag missing data and perform gross error checks for data that fall outside physically realistic ranges. A log of potential data problems is automatically generated as a result of these procedures.

Next, time series plots, spectral plots, and histograms are generated for all data. Statistics, including the mean, median, standard deviation, variance, minimum and maximum are calculated for each time series.

Individual time series and statistical summaries are examined by trained analysts. Data that have passed gross error checks but which are unusual relative to neighboring data in the time series, and/or which are statistical outliers, are examined on a case-by-case basis. Mooring deployment and recovery logs are searched for corroborating information such as problems with battery failures, vandalism, damaged sensors, or incorrect clocks. Consistency with other variables is also checked. Data points that are ultimately judged to be erroneous are then flagged.

For some variables, additional post-processing after recovery is required to ensure maximum quality. These variable-specific procedures are described below.

Rain Rate

Rainfall data are collected using an RM Young rain gauge and recorded internally at a 1-min sample rate. The RM Young rain gauge consists of a 500 ml catchment cylinder which, when full, empties automatically via a siphon tube. Data from a 3-min period centered near siphon events are ignored. Occasional random spikes, which typically occur during periods of rapid rain accumulation or immediately preceding or following siphon events, are eliminated manually.

Rain rates computed from first differences of 1-min accumulations are often noisy because rate calculations are sensitive to noise in accumulations over short time scales. To reduce this noise, 1-min accumulations are filtered with a 16-point Hanning filter and rates are computed at 10-min intervals. Residual noise in the filtered time series may include occasional spurious negative rain rates, but these rarely exceed a few mm hr-1. Serra et al. (2001) [1] estimate the overall accuracy of 10-min data to be 0.3 mm hr-1 on average.
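The filtering and differencing step can be sketched in Python as follows. This is a hedged illustration rather than the operational code: the function names are hypothetical, the Hanning weights are assumed to have non-zero end points and to be normalized to unit sum, and the input is assumed to be a continuous cumulative series in mm with siphon events and spikes already removed.

```python
import math


def normalized_hanning(n):
    """n-point Hanning weights with non-zero end points, normalized to sum to 1."""
    raw = [0.5 - 0.5 * math.cos(2.0 * math.pi * (k + 1) / (n + 1)) for k in range(n)]
    total = sum(raw)
    return [w / total for w in raw]


def rain_rates_10min(accum_mm, n_filter=16, step=10):
    """Filter 1-min accumulations (mm), then difference every `step` minutes.

    Returns rain rates in mm per hour, one per 10-min interval.
    """
    weights = normalized_hanning(n_filter)
    # Only positions where the full filter fits inside the record are kept.
    smoothed = [
        sum(w * accum_mm[i + k] for k, w in enumerate(weights))
        for i in range(len(accum_mm) - n_filter + 1)
    ]
    sampled = smoothed[::step]
    return [(b - a) * 60.0 / step for a, b in zip(sampled, sampled[1:])]
```

Because the weights are symmetric and normalized, a steady accumulation passes through the filter with its slope unchanged, so a constant 0.1 mm per minute input yields 6 mm hr-1 throughout.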

Subsurface Pressure (and other measurements)

The majority of ATLAS moorings are taut-line moorings. Therefore, vertical excursions of the mooring line are generally small, and subsurface instruments do not deviate far from their nominal measurement depths. Vertical excursions of the mooring line are detected by pressure sensors usually placed at depths of 300 m and 500 m, where the largest line variations typically occur (McCarty et al. (1997) [2]). Large, short-duration, upward spikes in subsurface pressure data are occasionally observed. These spikes usually indicate either purposeful or accidental interaction between fishermen and the moorings. Each spike, and its effects on the subsurface data, is individually evaluated. Data from all subsurface sensors are flagged when pressure excursions exceed the range expected for normal variability.

Salinity

Salinity values are calculated from measured conductivity and temperature data using the method of Fofonoff and Millard (1983) [3]. Surface salinity records are plotted and examined for periods of spiky data caused by response time differences between conductivity and temperature sensors. The identified spiky periods are flagged. If necessary, conductivity values from all depths are adjusted for sensor calibration drift by linearly interpolating over time between values calculated from the pre-deployment calibration coefficients and those derived from the post-deployment calibration coefficients.
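The drift adjustment described above amounts to a time-weighted blend of two conductivity estimates. A minimal sketch, assuming the interpolation runs from the deployment time to the recovery time (the source does not state the end points) and using hypothetical names:

```python
def drift_corrected_conductivity(c_pre, c_post, t, t_deploy, t_recover):
    """Blend pre- and post-calibration conductivities linearly in time.

    c_pre, c_post: conductivity from pre-/post-deployment coefficients
    t, t_deploy, t_recover: times in any consistent unit (e.g. days)
    """
    if t_recover <= t_deploy:
        raise ValueError("recovery must follow deployment")
    if not t_deploy <= t <= t_recover:
        raise ValueError("t must lie within the deployment")
    weight = (t - t_deploy) / (t_recover - t_deploy)
    return (1.0 - weight) * c_pre + weight * c_post
```

At deployment the result equals the pre-deployment value, at recovery it equals the post-deployment value, and in between the correction grows in proportion to elapsed time.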

A thirteen point Hanning filter is applied to the high-resolution (ten minute interval) conductivity and temperature data. A filtered value is calculated at any point for which seven of the thirteen input points are available. The missing points are handled by dropping their weights from the calculation, rather than by adjusting the length of the filter. Salinity values are recalculated from the filtered data and subsampled to hourly intervals.

Delayed mode daily salinity and density values are calculated by taking the mean of the available hourly values for the day. If there are fewer than 12 hourly values available, a daily mean value is not computed.
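The gap handling in the thirteen-point filter and the 12-hour rule for daily means can be illustrated with the sketch below. The names are hypothetical, missing values are represented by `None`, the weights are assumed to have non-zero end points, and points beyond the ends of the record are treated as missing, which is an assumption not stated in the source.

```python
import math


def hanning_weights(n):
    """n-point Hanning weights with non-zero end points (an assumed convention)."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * (k + 1) / (n + 1)) for k in range(n)]


def filter_with_gaps(values, n=13, min_valid=7):
    """Apply the filter, dropping the weights of missing (None) points.

    A filtered value is produced only when at least min_valid of the
    n input points in the window are available.
    """
    weights = hanning_weights(n)
    half = n // 2
    out = []
    for i in range(len(values)):
        num = den = 0.0
        count = 0
        for k in range(n):
            j = i + k - half
            if 0 <= j < len(values) and values[j] is not None:
                num += weights[k] * values[j]
                den += weights[k]
                count += 1
        out.append(num / den if count >= min_valid else None)
    return out


def daily_mean(hourly, min_hours=12):
    """Mean of the available hourly values, or None if fewer than min_hours exist."""
    good = [v for v in hourly if v is not None]
    return sum(good) / len(good) if len(good) >= min_hours else None
```

Dividing by the sum of the surviving weights keeps the filter length fixed at thirteen points while preventing a gap from biasing the result toward zero.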

[1] Serra, Y.L., P. A'Hearn, H.P. Freitag, and M.J. McPhaden, 2001: ATLAS self-siphoning rain gauge error estimates. J. Atmos. Ocean. Tech., in press.

[2] McCarty, M.E., L.J. Mangum, and M.J. McPhaden, 1997: Temperature errors in TAO data induced by mooring motion. NOAA Tech. Memo. ERL PMEL-108, Pacific Marine Environmental Laboratory, Seattle, WA, 68 pp.

[3] Fofonoff, P., and R.C. Millard Jr., 1983: Algorithms for computation of fundamental properties of seawater. Unesco Tech. Pap. Mar. Sci., 44, 53 pp.

[4] Freitag, H.P., M.E. McCarty, C. Nosse, R. Lukas, M.J. McPhaden, and M.F. Cronin, 1999: COARE Seacat data: Calibrations and quality control procedures. NOAA Tech. Memo. ERL PMEL-115, 89 pp.


Subsurface moored Acoustic Doppler Current Profiler (ADCP) data

Velocity profiles are obtained from upward looking Acoustic Doppler Current Profilers (ADCPs) deployed on subsurface moorings at nominal depths of 250 m to 300 m below the sea surface. The narrowband RD Instruments ADCPs have a 20 degree transducer orientation and are set to collect data with 8.68 m nominal bin and pulse lengths. The instruments collect data at a 3 second sample rate and form averages over 15 minutes beginning at the top of the hour.

Velocity data are processed and quality controlled at NDBC after the mooring is recovered and the data retrieved from the instrument's memory. The ADCP velocity measurements assume a constant sound speed of 1536 m s-1 at the transducer. In situ hourly temperature and average salinity measurements are used to adjust the velocities for sound speed variations. The nominal ADCP bin widths, which assume a constant sound speed with depth of 1475.1 m s-1, are adjusted using historical hydrographic sound speed profiles.

The actual depth of the ADCP transducer head is variable in time, as the mooring reacts to variations in ocean currents beneath the instrument. Therefore, velocity profiles need to be adjusted for head depth. The transducer head depth is computed using two independent methods. In the first, the hourly target strength for each beam and each depth bin is computed from the echo intensities. The sea surface appears as a maximum target strength for most (>80%) hourly profiles. A polynomial is fitted to the target strengths of the three bins closest to the surface. The position of the maximum target strength with respect to the ADCP transducer is then used as the depth of the instrument for each hourly profile. The second method estimates the head depth from pressure time series recorded by duplicate pressure sensors mounted near the ADCP transducer. Estimates of head depth from the maximum target strength and from the pressure sensors typically agree to within ±2 m, less than half of the ADCP bin width. The computed transducer head depth and the bin widths (nominal bin widths which have been adjusted for sound speed) are used to compute the bin depths for the hourly ADCP velocity data.
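The first method can be illustrated with a three-point parabolic fit. This sketch assumes the fitted polynomial is a quadratic through exactly three bins, which the source does not state explicitly, and the function name and argument layout are hypothetical.

```python
def surface_range_from_peak(ranges_m, target_strengths):
    """Range (m) from the transducer to the peak of a parabola through three bins.

    ranges_m: distances of the three bin centers from the transducer
    target_strengths: target strength in each of those bins
    """
    a, b, c = ranges_m
    fa, fb, fc = target_strengths
    num = (b - a) ** 2 * (fb - fc) - (b - c) ** 2 * (fb - fa)
    den = (b - a) * (fb - fc) - (b - c) * (fb - fa)
    if den == 0:
        raise ValueError("points are collinear; no peak can be located")
    return b - 0.5 * num / den
```

Locating the vertex of the parabola, rather than taking the strongest bin, allows the surface to be placed between bin centers, which is what makes agreement to within a fraction of a bin width possible.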

Near surface velocity measurements may be in error due to strong reflections from the surface that overcome the sidelobe suppression of the transducer. Hourly data are flagged as bad if the bin depth (the center of the velocity bin) is closer to the surface than D*(1-cos(theta)) + bin width where D is the transducer depth, theta is the angle of the transducer beam relative to vertical, and the bin width has been adjusted for sound speed. Velocities from the remaining depth bins are then interpolated to standard depths at 5 meter intervals.
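The near-surface criterion can be written directly from the expression above. In this sketch the function names are hypothetical, and the default beam angle (20 degrees) and bin width (8.68 m) are the nominal values quoted for these instruments; in practice the sound-speed-adjusted bin width would be passed in.

```python
import math


def sidelobe_limit(head_depth_m, beam_angle_deg, bin_width_m):
    """Shallowest acceptable bin-center depth: D*(1 - cos(theta)) + bin width."""
    theta = math.radians(beam_angle_deg)
    return head_depth_m * (1.0 - math.cos(theta)) + bin_width_m


def flag_near_surface(bin_depths_m, head_depth_m, beam_angle_deg=20.0, bin_width_m=8.68):
    """Return True for each bin whose center is closer to the surface than the limit."""
    limit = sidelobe_limit(head_depth_m, beam_angle_deg, bin_width_m)
    return [depth < limit for depth in bin_depths_m]
```

For a transducer at 275 m, the limit works out to roughly 25 m, so bins centered shallower than that would be flagged as bad.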

The ADCP velocities are also compared with coincident point velocity measurements when available on nearby surface moorings. ADCP and point velocity measurements generally agree to within 5 cm s-1, and no velocity adjustments to the ADCPs have yet been made based on these comparisons. ADCP directions are also checked against available point velocity measurements.

Quality codes and sensor drift

Instrumentation recovered in working condition is returned to PMEL for post-deployment calibration before being reused on future deployments. After post-deployment calibrations are made, the resultant coefficients are compared to the pre-deployment coefficients. A set of output values is computed by applying the calibration equation with the pre-deployment coefficients to a set of input values. Input values are chosen so that the output values range over normal environmental conditions. A second set of output values is generated by applying the calibration equation with the post-deployment coefficients to the same set of input values. Sensor drift is calculated by subtracting the first set of output values from the second set. The sensors are then assigned quality codes based on drift using the following criteria:

1 - Highest Quality. Pre/post-deployment calibrations agree to within sensor specifications. In most cases, only pre-deployment calibrations have been applied.

2 - Default Quality. Pre-deployment calibrations only or post-deployment calibrations only applied. Default value for sensors presently deployed and for sensors which were not recovered or not calibratable when recovered, or for which pre-deployment calibrations have been determined to be invalid.

3 - Adjusted Data. Pre/post calibrations differ, or original data do not agree with other data sources (e.g., other in situ data or climatology), or original data are noisy. Data have been adjusted in an attempt to reduce the error.

4 - Lower Quality. Pre/post calibrations differ, or data do not agree with other data sources (e.g., other in situ data or climatology), or data are noisy. Data could not be confidently adjusted to correct for error.

5 - Sensor or Tube Failed. Used when there is known tube or sensor failure that is preventing measurement information from being collected.

When a recovered sensor meets the criteria for nominal drift, the quality index is changed from the default value of "2" to "1" for highest quality data.  When it does not meet the criteria for sensor drift, the index becomes "4".  If an adjustment based on post-deployment calibrations or other information is later made, the index may then be set to "3" or "1". When damage or loss of an instrument due to vandalism, harsh environmental conditions, electronics failures, or loss of a mooring prevents post-deployment calibration, a default quality of "2" is assigned to the data.

 



Nominal drift criteria:

Measurement: Drift criteria
Air temperature: 0.4°C
Relative humidity: 4%
Wind velocity: 0.6 m s-1 or 6%
Temperature: 0.02°C
Salinity: 0.04 PSU
Rainfall: 0.6 mm hr-1
Shortwave radiation: 2%
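The drift calculation and the initial choice between quality codes "1" and "4" can be sketched as follows. Several details here are assumptions made for illustration: the calibration equation is taken to be a simple polynomial, drift across the input values is summarized by its largest absolute value, a drift exactly equal to the criterion is treated as passing, and all names are hypothetical. Only criteria given in absolute units are included.

```python
# Nominal drift criteria from the table above (absolute units only).
DRIFT_LIMITS = {
    "air_temperature_C": 0.4,
    "temperature_C": 0.02,
    "salinity_PSU": 0.04,
}


def polynomial(coeffs, x):
    """Hypothetical calibration equation: coeffs[0] + coeffs[1]*x + coeffs[2]*x**2 + ..."""
    return sum(c * x ** i for i, c in enumerate(coeffs))


def sensor_drift(pre_coeffs, post_coeffs, inputs):
    """Post-calibration output minus pre-calibration output for each input value."""
    return [polynomial(post_coeffs, x) - polynomial(pre_coeffs, x) for x in inputs]


def quality_code(drift, limit):
    """1 (highest quality) if all drift values are within the limit, else 4 (lower quality)."""
    return 1 if max(abs(d) for d in drift) <= limit else 4
```

A sensor that was never recovered or could not be calibrated would keep the default code "2", and a later adjustment could move a "4" to "3" or "1"; those steps depend on analyst judgment and are not modeled here.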