AbstractReader
The `AbstractReader` class is the foundation of AeroViz's data reading system, providing a standardized interface for reading and processing aerosol instrument data.
Core Architecture
AbstractReader serves as the base class for all instrument-specific readers in AeroViz. It defines the common interface and provides shared functionality for data processing, quality control, and output formatting.
Overview
The AbstractReader implements a consistent workflow for all aerosol instruments:
- Data Ingestion - Read raw instrument files
- Format Detection - Automatically identify data structure
- Quality Control - Apply built-in validation and filtering
- Standardization - Convert to unified output format
- Metadata Handling - Preserve instrument and measurement metadata
Usage Pattern
While you can use AbstractReader directly, it's typically accessed through the `RawDataReader` factory function, which automatically selects the appropriate reader based on your instrument type.
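The selection step can be pictured as a lookup from instrument name to reader class. The following is a minimal sketch of that idea only, not AeroViz's actual implementation: `BaseReader`, `READERS`, `make_reader`, and the `DEMO` instrument are hypothetical names introduced for illustration.

```python
from abc import ABC, abstractmethod


class BaseReader(ABC):
    """Hypothetical stand-in for an abstract reader base class."""

    @abstractmethod
    def read(self, path: str) -> str:
        ...


class AE33Reader(BaseReader):
    def read(self, path: str) -> str:
        return f"AE33 data from {path}"


class DemoReader(BaseReader):
    def read(self, path: str) -> str:
        return f"DEMO data from {path}"


# Hypothetical registry mapping instrument names to reader classes.
READERS = {
    "AE33": AE33Reader,
    "DEMO": DemoReader,
}


def make_reader(instrument: str) -> BaseReader:
    """Return a reader instance for the named instrument."""
    try:
        return READERS[instrument]()
    except KeyError:
        raise ValueError(f"Unsupported instrument: {instrument!r}") from None


reader = make_reader("AE33")
message = reader.read("/path/to/data")
```

Keeping the mapping in one place means that supporting a new instrument only requires a new subclass and one registry entry.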
Key Features
- Flexible Input Handling - Supports various file formats and structures
- Built-in Quality Control - Configurable data validation and filtering
- Metadata Preservation - Maintains instrument configuration and measurement context
- Extensible Design - Easy to subclass for new instruments
- Error Handling - Robust error reporting and recovery
Implementation Note
AbstractReader is an abstract base class. For actual data reading, use instrument-specific implementations or the `RawDataReader` factory function.
API Reference
AeroViz.rawDataReader.core.AbstractReader
Bases: ABC
Abstract class for reading raw data from different instruments. Each instrument should have a separate class that inherits from this class and implements the abstract methods. The abstract methods are `_raw_reader` and `_QC`.
Lists the files in the given path and loads the cached pickle file if one exists; otherwise it reads the raw data and dumps a pickle file. The pickle file is generated the first time the raw data is read. To re-read the raw data, set `reset=True`.
Core initialization method for reading raw data from different instruments.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`path` | `str \| Path` | The path of the raw data file. | required |
`reset` | `bool \| str` | Whether to reset the raw data before reading. | `False` |
`qc` | `bool \| str` | Whether to read QC data before reading. | `True` |
`**kwargs` | `dict` | Additional keyword arguments passed to the reader. | `{}` |
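The read-or-cache behaviour described above (load the pickle if present, otherwise parse the raw files and write the pickle, with `reset` forcing a re-read) can be sketched as follows. This is an illustration under assumptions, not the library's code: `load_with_cache`, the `_cache.pkl` file name, and the line-based "parsing" are all hypothetical.

```python
import pickle
import tempfile
from pathlib import Path


def load_with_cache(folder: Path, reset: bool = False) -> list:
    """Return parsed data, reusing a pickle cache unless reset is True."""
    cache_file = folder / "_cache.pkl"  # hypothetical cache file name

    if cache_file.exists() and not reset:
        with cache_file.open("rb") as fh:
            return pickle.load(fh)

    # Stand-in for instrument-specific parsing: collect lines from every .txt file.
    data = []
    for raw_file in sorted(folder.glob("*.txt")):
        data.extend(raw_file.read_text().splitlines())

    with cache_file.open("wb") as fh:
        pickle.dump(data, fh)
    return data


with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp)
    (folder / "a.txt").write_text("1\n2\n")
    first = load_with_cache(folder)              # parses raw files, writes the cache
    (folder / "b.txt").write_text("3\n")
    cached = load_with_cache(folder)             # served from the cache, b.txt is not seen
    fresh = load_with_cache(folder, reset=True)  # re-reads the raw files
```

The second call illustrates why a reset option matters: files added after the first read are ignored until the cache is rebuilt.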
Attributes
logger
instance-attribute
Functions
__call__
Process data for specified time range.
__calculate_rates
Calculate acquisition rate, yield rate, and total rate.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`raw_data` | `DataFrame` | Raw data before quality control | required |
`qc_data` | `DataFrame` | Data after quality control | required |
`all_keys` | `bool` | Whether to calculate rates for all deterministic keys | `False` |
`with_log` | `bool` | Whether to output calculation logs | `False` |
Returns:
Type | Description |
---|---|
`dict` | Dictionary containing calculated rates |
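The reference above does not define the three rates, so the sketch below rests on assumed definitions: acquisition rate as the share of expected time steps that have raw data, yield rate as the share of raw records that survive QC, and total rate as the share of expected time steps with valid data. `estimate_rates` and the `BC` column are hypothetical, and the real method may compute these differently.

```python
import pandas as pd


def estimate_rates(raw_data: pd.DataFrame, qc_data: pd.DataFrame) -> dict:
    """Compute percentage rates from row-wise availability (assumed definitions)."""
    expected = len(raw_data.index)                   # time steps in the period
    n_raw = int(raw_data.notna().any(axis=1).sum())  # steps with any raw value
    n_qc = int(qc_data.notna().any(axis=1).sum())    # steps with any valid value

    return {
        "acquisition_rate": round(100 * n_raw / expected, 1) if expected else 0.0,
        "yield_rate": round(100 * n_qc / n_raw, 1) if n_raw else 0.0,
        "total_rate": round(100 * n_qc / expected, 1) if expected else 0.0,
    }


index = pd.date_range("2024-01-01", periods=10, freq="60min")
raw = pd.DataFrame({"BC": [1.0] * 8 + [None, None]}, index=index)
qc = raw.copy()
qc.iloc[:2] = None  # pretend QC rejected the first two records

rates = estimate_rates(raw, qc)
```

Under these definitions the total rate equals the product of the acquisition and yield fractions, which makes it easy to see whether data loss comes from the instrument or from QC.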
__generate_grouped_report
__generate_grouped_report(current_time, weekly_raw_groups, weekly_qc_groups, monthly_raw_groups, monthly_qc_groups)
Generate acquisition and yield reports based on grouped data
_timeIndex_process
Process time index, resample data, extract specified time range, and optionally append new data.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`_df` | `DataFrame` | Input DataFrame with time index | required |
`user_start` | `datetime or str` | Start of user-specified time range | `None` |
`user_end` | `datetime or str` | End of user-specified time range | `None` |
`append_df` | `DataFrame` | DataFrame to append to the result | `None` |
Returns:
Type | Description |
---|---|
`DataFrame` | Processed DataFrame with properly formatted time index |
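A rough pandas sketch of this kind of time-index processing follows: sort and de-duplicate the index, resample to a regular grid, then reindex onto the requested range. `process_time_index`, the 60-minute default frequency, and the mean aggregation are assumptions made for illustration; the actual method may differ, and the optional append step is omitted.

```python
import pandas as pd


def process_time_index(df, start=None, end=None, freq="60min"):
    """Sort, de-duplicate, resample, and clip a time-indexed DataFrame."""
    out = df.copy()
    out.index = pd.to_datetime(out.index)
    out = out[~out.index.duplicated(keep="first")].sort_index()

    # Resample to a regular grid; bins without data become NaN.
    out = out.resample(freq).mean()

    # Reindex onto the requested range so missing periods appear as NaN rows.
    start = pd.Timestamp(start) if start is not None else out.index[0]
    end = pd.Timestamp(end) if end is not None else out.index[-1]
    return out.reindex(pd.date_range(start, end, freq=freq))


raw = pd.DataFrame(
    {"value": [1.0, 3.0, 5.0]},
    index=["2024-01-01 00:10", "2024-01-01 00:50", "2024-01-01 02:30"],
)
hourly = process_time_index(raw, start="2024-01-01 00:00", end="2024-01-01 03:00")
```

Reindexing onto the full requested range keeps gaps visible as NaN rows instead of silently shortening the series.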
_read_raw_files
Read and process raw files.
reorder_dataframe_columns
staticmethod
Reorder DataFrame columns.
time_aware_IQR_QC
staticmethod
filter_error_status
staticmethod
Filter data containing specified error status codes and specially handle certain specific codes.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`_df` | `DataFrame` | A DataFrame containing a 'Status' column | required |
`error_codes` | `list` | A list of status codes for bitwise testing | required |
`special_codes` | `list` | List of special status codes for exact matching | `None` |
Returns:
Type | Description |
---|---|
`DataFrame` | Filtered DataFrame |
Notes
This function performs two types of filtering:
1. Bitwise filtering that checks whether any of the error_codes are present in the Status
2. Exact matching for special_codes
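The two filtering modes in the notes above can be sketched as below. This is an illustrative sketch, not the library's implementation: `drop_error_rows` is a hypothetical helper, the code values are arbitrary, and it assumes that "filtering" means dropping the flagged rows and that 'Status' holds integers.

```python
import pandas as pd


def drop_error_rows(df, error_codes, special_codes=None):
    """Drop rows whose 'Status' has any error bit set or equals a special code."""
    status = df["Status"].astype(int)

    flagged = pd.Series(False, index=df.index)
    for code in error_codes:
        flagged |= (status & code) != 0        # bitwise test: is this bit set?
    if special_codes:
        flagged |= status.isin(special_codes)  # exact match on the full value

    return df[~flagged]


data = pd.DataFrame({"Status": [0, 1, 2, 3, 384], "BC": [10, 11, 12, 13, 14]})
clean = drop_error_rows(data, error_codes=[1], special_codes=[384])
```

Here status 3 is flagged because its lowest bit is set, while 384 is flagged only through the exact-match list, since it shares no bits with the error code 1.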
Related Documentation
- RawDataReader Factory - High-level interface for instrument data reading
- Quality Control - Data validation and filtering options
- Supported Instruments - Available instrument implementations
Quick Example
```python
from datetime import datetime

from AeroViz import RawDataReader

# Using the factory function (recommended)
data = RawDataReader(
    instrument='AE33',
    path='/path/to/data',
    start=datetime(2024, 1, 1),
    end=datetime(2024, 12, 31)
)

# Direct usage (advanced - for custom implementations)
from AeroViz.rawDataReader.core import AbstractReader


class MyInstrumentReader(AbstractReader):
    nam = 'MyInstrument'

    def _raw_reader(self, file):
        # Custom file reading logic
        pass

    def _QC(self, df):
        # Custom QC logic
        return df
```