This branch contains the source code for the experiments in our paper. We provide everything needed to run them: the datasets, the dataset generators, the pipeline JSON files, the Python scripts, the runner scripts, and the results we obtained from the experiments (in "./result").
To get the datasets, go to "data/script" and run all of the Python scripts there. They download and preprocess the data automatically into the "data/" folder.
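Running every script by hand can be tedious. The helper below is a minimal sketch of one way to run them in turn. It assumes the scripts are plain `.py` files directly under `data/script` and that each one is meant to be launched from inside that directory; the function names are hypothetical and not part of this repository.

```python
import subprocess
import sys
from pathlib import Path


def find_scripts(script_dir):
    """Return the .py files directly under script_dir, sorted by name."""
    return sorted(Path(script_dir).glob("*.py"))


def run_all(script_dir="data/script", dry_run=False):
    """Run each script with the current interpreter, stopping on the first failure.

    Assumption: every script is launched from inside script_dir.
    With dry_run=True the commands are only collected, not executed.
    """
    commands = []
    for script in find_scripts(script_dir):
        cmd = [sys.executable, script.name]
        commands.append(cmd)
        if not dry_run:
            # check=True raises CalledProcessError if a script exits non-zero.
            subprocess.run(cmd, cwd=script_dir, check=True)
    return commands
```

Calling `run_all(dry_run=True)` from the repository root lists the commands without downloading anything, which is a cheap way to check what would be run.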
The pipeline JSON files are organized by algorithm settings.
To run a pipeline, you can first generate your own pipeline JSON files with the provided script:
sh build_pipelines.sh
Then run the pipeline with run_pipeline.py. The example below runs IForest on the GECCO dataset:
python run_pipeline.py --pipeline_path pipelines/simple/pyod_iforest_0.01.json --data_path ./data/water_quality.csv
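To run several pipeline files against the same dataset, the command above can be wrapped in a small loop. The following is a sketch only: it relies solely on the `--pipeline_path` and `--data_path` flags shown above, and the helper names, as well as the assumption that the JSON files sit directly under one directory such as `pipelines/simple`, are illustrative rather than part of this repository.

```python
import subprocess
import sys
from pathlib import Path


def build_command(pipeline_path, data_path):
    """Build the run_pipeline.py invocation for one pipeline file."""
    return [
        sys.executable, "run_pipeline.py",
        "--pipeline_path", str(pipeline_path),
        "--data_path", str(data_path),
    ]


def run_directory(pipeline_dir, data_path, dry_run=False):
    """Run every *.json pipeline directly under pipeline_dir on one dataset.

    Assumption: this is called from the repository root, next to
    run_pipeline.py. With dry_run=True the commands are only returned.
    """
    pipelines = sorted(Path(pipeline_dir).glob("*.json"))
    commands = [build_command(p, data_path) for p in pipelines]
    if not dry_run:
        for cmd in commands:
            # Stop at the first pipeline that fails.
            subprocess.run(cmd, check=True)
    return commands
```

For example, `run_directory("pipelines/simple", "./data/water_quality.csv", dry_run=True)` returns the commands that would be executed, so they can be inspected before starting a long run.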
Alternatively, you can run the pipelines we have already generated in /pipelines using the bash script:
sh run.sh
If you find this work useful, please cite it:
@misc{lai2020tods,
title={TODS: An Automated Time Series Outlier Detection System},
author={Kwei-Harng Lai and Daochen Zha and Guanchu Wang and Junjie Xu and Yue Zhao and Devesh Kumar and Yile Chen and Purav Zumkhawaka and Minyang Wan and Diego Martinez and Xia Hu},
year={2020},
eprint={2009.09822},
archivePrefix={arXiv},
primaryClass={cs.DB}
}
*Please refer to the master branch of TODS for details on running pipelines.
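TODS is a full-stack automated machine learning system, aimed primarily at outlier detection on multivariate time series data. TODS provides a comprehensive set of modules for building machine-learning-based outlier detection systems, including data processing, time series processing, feature analysis, detection algorithms, and a reinforcement module. The functionality these modules provide includes common data preprocessing, smoothing or transformation of time series data, feature extraction from the time or frequency domain, a wide variety of detection algorithms, …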