TEOM Data Reading and Processing
The TEOM (Tapered Element Oscillating Microbalance) instrument is used for continuous monitoring of PM2.5 mass concentrations. This document details the data reading and quality control procedures implemented in the AeroViz package.
Supported File Formats
This module recognizes three categories of TEOM data output, two of which are currently supported:
- Remote Download Format
  - Identified by the 'Time Stamp' column
  - Date format: 'DD - MM - YYYY HH:MM:SS'
  - May contain Chinese month names, which require conversion
  - Column mapping: Time Stamp → time, System status → status, PM-2.5 base MC → PM_NV, PM-2.5 MC → PM_Total, PM-2.5 TEOM noise → noise
- USB Download or Auto Export Format
  - Identified by the 'tmoStatusCondition_0' column
  - Two possible time formats:
    - Standard: 'Date' and 'Time' columns (YYYY-MM-DD HH:MM:SS)
    - Alternative: 'time_stamp' column (similar to the remote format)
  - Column mapping: tmoStatusCondition_0 → status, tmoTEOMABaseMC_0 → PM_NV, tmoTEOMAMC_0 → PM_Total, tmoTEOMANoise_0 → noise
- Other Formats
  - Not implemented; raises NotImplementedError
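The format detection described above can be sketched as a lookup on the identifying column followed by a rename. This is an illustrative sketch only, not AeroViz's internal code: the helper name `standardize_columns` and the two dictionary names are hypothetical, while the column names and mappings are taken from the lists above.

```python
import pandas as pd

# Column mappings copied from the format descriptions above.
REMOTE_COLUMNS = {
    "Time Stamp": "time",
    "System status": "status",
    "PM-2.5 base MC": "PM_NV",
    "PM-2.5 MC": "PM_Total",
    "PM-2.5 TEOM noise": "noise",
}
USB_COLUMNS = {
    "tmoStatusCondition_0": "status",
    "tmoTEOMABaseMC_0": "PM_NV",
    "tmoTEOMAMC_0": "PM_Total",
    "tmoTEOMANoise_0": "noise",
}


def standardize_columns(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: choose a mapping from the identifying column, then rename."""
    if "Time Stamp" in df.columns:
        mapping = REMOTE_COLUMNS
    elif "tmoStatusCondition_0" in df.columns:
        mapping = USB_COLUMNS
    else:
        # Mirrors the documented behaviour for any other layout.
        raise NotImplementedError("Unrecognized TEOM file format")
    return df.rename(columns=mapping)


# Made-up one-row sample in the remote download layout.
remote = pd.DataFrame({
    "Time Stamp": ["01 - 02 - 2024 00:00:00"],
    "System status": [0],
    "PM-2.5 base MC": [10.2],
    "PM-2.5 MC": [12.5],
    "PM-2.5 TEOM noise": [0.004],
})
renamed = standardize_columns(remote)
print(list(renamed.columns))
```

Keying the decision on a single identifying column keeps the check cheap and makes unsupported layouts fail early rather than produce a partially renamed table.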
Data Processing Workflow
The data processing workflow consists of the following steps:
- Data Standardization
- Quality Control Procedures
- Output Data Generation
Data Standardization
- Unifies column names across different data formats
- Handles various time formats, including Chinese month name conversion
- Converts all measurement values to numeric format
- Removes duplicate timestamps and invalid indices
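The standardization steps above might look roughly like the following for the remote download format. It is a sketch under assumptions: the month table `MONTHS`, the helper `parse_remote_time`, and the sample rows are hypothetical, and it assumes the Chinese month name appears in the month position of the 'DD - MM - YYYY HH:MM:SS' timestamp.

```python
import pandas as pd

# Assumed month-name table. The two-character-prefixed names come first because
# '十一月' (November) contains '一月' (January); replacing the short name first
# would corrupt the long one.
MONTHS = {
    "十一月": "11", "十二月": "12",
    "一月": "01", "二月": "02", "三月": "03", "四月": "04", "五月": "05",
    "六月": "06", "七月": "07", "八月": "08", "九月": "09", "十月": "10",
}


def parse_remote_time(stamps: pd.Series) -> pd.DatetimeIndex:
    """Hypothetical helper: swap month names for numbers, then parse the timestamps."""
    for name, number in MONTHS.items():
        stamps = stamps.str.replace(name, number, regex=False)
    parsed = pd.to_datetime(stamps, format="%d - %m - %Y %H:%M:%S", errors="coerce")
    return pd.DatetimeIndex(parsed)


# Made-up rows: one duplicate timestamp, one bad number, one unparseable time.
raw = pd.DataFrame({
    "time": [
        "01 - 二月 - 2024 00:00:00",
        "01 - 二月 - 2024 00:00:00",
        "01 - 十一月 - 2024 00:06:00",
        "not a date",
    ],
    "PM_NV": ["10.2", "10.2", "bad", "9.9"],
    "PM_Total": ["12.5", "12.5", "13.0", "11.0"],
})

df = raw.set_index(parse_remote_time(raw["time"]))[["PM_NV", "PM_Total"]]
# Non-numeric strings become NaN instead of raising.
df = df.apply(pd.to_numeric, errors="coerce")
# Drop rows whose time failed to parse (NaT) and repeated timestamps.
df = df[df.index.notna() & ~df.index.duplicated(keep="first")]
print(df)
```

Using `errors="coerce"` in both conversions turns malformed entries into missing values, so a single bad row does not abort the whole file.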
Quality Control Procedures
- Noise threshold filtering (retains only records with noise < 0.01)
- Value range validation (removes negative or zero values)
- Time-based outlier detection (using 6-hour window IQR filtering)
- Temporal data completeness check (each hour must contain at least 50% of its expected measurements)
- Complete record requirement (both PM_NV and PM_Total columns must have values)
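The quality control steps above can be approximated with plain pandas operations. The sketch below is not the package's own routine: `qc_sketch` is a hypothetical function, the IQR step is a simple trailing 6-hour rolling filter with the conventional 1.5 × IQR fences (the package's time-aware IQR filter may differ), and the `points_per_hour=10` default assumes 6-minute data.

```python
import pandas as pd


def qc_sketch(df: pd.DataFrame, points_per_hour: int = 10) -> pd.DataFrame:
    """Hypothetical QC pass over columns noise, PM_NV, PM_Total on a DatetimeIndex."""
    # 1. Noise threshold: keep only rows with noise below 0.01.
    pm = df.loc[df["noise"] < 0.01, ["PM_NV", "PM_Total"]].sort_index()

    # 2. Value range: negative or zero values become NaN.
    pm = pm.where(pm > 0)

    # 3. Outliers: trailing 6-hour rolling IQR with 1.5 * IQR fences (assumed).
    roll = pm.rolling("6h")
    q1, q3 = roll.quantile(0.25), roll.quantile(0.75)
    iqr = q3 - q1
    pm = pm.where((pm >= q1 - 1.5 * iqr) & (pm <= q3 + 1.5 * iqr))

    # 4. Completeness: blank out hours with fewer than 50% valid values.
    counts = pm.groupby(pm.index.floor("h")).transform("count")
    pm = pm.where(counts >= 0.5 * points_per_hour)

    # 5. Complete records: both PM_NV and PM_Total must be present.
    return pm.dropna()


# Made-up 6-minute data covering three hours, with injected faults.
idx = pd.date_range("2024-02-01 00:00", periods=30, freq="6min")
demo = pd.DataFrame({"noise": 0.005, "PM_NV": 10.0, "PM_Total": 12.0}, index=idx)
demo.loc[idx[2], "PM_NV"] = -1.0       # negative value
demo.loc[idx[5], "PM_Total"] = 500.0   # spike
demo.loc[idx[20:27], "noise"] = 0.05   # noisy stretch leaves the third hour sparse

clean = qc_sketch(demo)
print(len(clean), "rows kept of", len(demo))
```

Masking values to NaN in the intermediate steps, rather than dropping rows immediately, lets the completeness check count valid values per column before the final `dropna()` enforces complete records.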
Output Data
The processed DataFrame contains the following standardized columns:
- PM_NV: Non-volatile PM2.5 concentration (µg/m³)
- PM_Total: Total PM2.5 concentration (µg/m³)
Usage Example
from datetime import datetime
from pathlib import Path
from AeroViz import RawDataReader
# Set data path and time range
data_path = Path('/path/to/your/data/folder')
start_time = datetime(2024, 2, 1)
end_time = datetime(2024, 3, 31, 23, 59, 59)
# Read and process TEOM data
teom_data = RawDataReader(
    instrument='TEOM',
    path=data_path,
    reset=True,
    qc='1MS',
    start=start_time,
    end=end_time,
    mean_freq='1h',
)
# Show processed data
print("\nProcessed TEOM data:")
print(teom_data.head())