|
- Overview
- ========
-
- TODS follows the design principal of `D3M <http://datadrivendiscovery.org/>`_.
- The toolkit wraps each function into ``Primitive`` class with an unified
- interface for various functionalities. The goal of this toolkit is to enable
- the users to easily develop outlier detection system for multivariate time series data.
- TODS provides three pervasive outlier scenarios for the given time series data.
- * **Point-wise Outliers** are the outliers that occur on time points. In other words, each time point in the time series data could be an outlier.
- * **Pattern-wise Outliers** refer to the scenario that each outlier is define as a subsequence. It is also known as collective outlier detection.
- * **System-wise Outliers** are defined as a set of time series. For example, each set of time series represents a device (system) that equipped with multiple sensors, where each sensor is represented as an univariate time series. The goal is to detect the anomalous devices from the normal ones.
-
- TODS High-level Design
- ~~~~~~~~~~~~~~~~~~~~~~~~
- Following the typical machine learning pipeline, there are 6 modules lie in TODS: Data prociessing, time series processing, feature analysis, detection algorithms and reinforcement module.
-
- .. image:: img/tods_framework.pdf
- :width: 800
-
-
-
- Data Processing
- ---------------
- Data processing aims on processing data following the tabular fashion. The functionalities including: dataset loading, data filtering, data validation, data binarization, and timestamp transformation.
-
- Timeseries Processing
- ---------------------
- Time series processing provides multiple time series-specific preprocessing techniques including: seasonality/trend decomposition, time series transformation/scaling/smoothing.
-
- Feature Analysis
- ----------------
- Feature analysis module provides exhaustive feature extraction techniques from three aspects: Time domain, frequency domain and latent factor models.
- There are 30 feature extraction methods including statistical methods, time series filters, spectral transformations, and matrix factorization models.
-
- Detection Algorithms
- ---------------------
- Based on the three scenarios above, we provide multiple algorithms including traditional approaches (e.g. IForest, Autoregression), heuristic methods (e.g. HotSax algorithm, Matrix Profile), deep learning methods (e.g. RNN-LSTM, GAN, VAE), and ensemble methods to address each kind of outlier.
-
- Reincforcement Module
- ----------------------
- Reinforcement module is designed for improve the existed model with human expertise. Specifically, rule-based filtering has been developed to allow users to transform the domain knowledge into rule filters, active learning based methods will be involved in the near future as well.
|