
Revisiting Time Series Outlier Detection: Definitions and Benchmarks

This branch contains the source code for the experiments in our paper. We provide everything needed to run the experiments: the datasets, the dataset generator, the pipeline JSON files, the Python scripts, the runner, and the results we obtained from the experiments (in "./result").

Resources

Paper: Under review

Datasets

To get the datasets, go to "data/script" and run all of the Python scripts there. They download and preprocess the data automatically into the "data/" folder.
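Running every script by hand can be tedious, so the step above could be automated roughly as follows. This is a minimal sketch, not part of the repository: the helper names find_scripts and run_all are hypothetical, and it assumes each script is meant to be started from inside "data/script" and needs no command-line arguments.

```python
import subprocess
import sys
from pathlib import Path


def find_scripts(script_dir):
    """Return the Python scripts found directly in script_dir, sorted by name."""
    return sorted(Path(script_dir).glob("*.py"))


def run_all(script_dir="data/script"):
    """Run each script in turn from inside script_dir; stop on the first failure."""
    for script in find_scripts(script_dir):
        print(f"running {script.name}")
        # Assumption: the scripts expect script_dir as their working directory.
        subprocess.run([sys.executable, script.name], cwd=script_dir, check=True)
```

Calling run_all() from the repository root would then run the scripts one after another. If a script turns out to expect a different working directory, the cwd argument is the place to adjust.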

Pipeline

The pipeline JSON files are organized by algorithm setting.

Runner

To run a pipeline, you can first generate your own pipeline JSON files with the provided script:

sh build_pipelines.sh

Then run the pipeline with run_pipeline.py. The example below runs IForest on the GECCO dataset:

python run_pipeline.py --pipeline_path pipelines/simple/pyod_iforest_0.01.json --data_path ./data/water_quality.csv

Alternatively, you can directly use the pipelines we have already generated in "pipelines/" with the bash script:

sh run.sh
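To run a whole folder of pipelines on one dataset without going through run.sh, the single-pipeline command shown earlier can be repeated in a loop. The sketch below is illustrative only and does not reproduce what run.sh does: build_command and run_folder are hypothetical helpers, and the only flags used are the two that appear in the example command (--pipeline_path and --data_path).

```python
import subprocess
import sys
from pathlib import Path


def build_command(pipeline_path, data_path):
    """Build the run_pipeline.py command line for one pipeline and one dataset."""
    return [
        sys.executable, "run_pipeline.py",
        "--pipeline_path", str(pipeline_path),
        "--data_path", str(data_path),
    ]


def run_folder(pipeline_dir, data_path, dry_run=True):
    """Run (or, with dry_run=True, only print) every pipeline JSON in pipeline_dir."""
    commands = [build_command(p, data_path)
                for p in sorted(Path(pipeline_dir).glob("*.json"))]
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:
            # Must be started from the repository root, where run_pipeline.py lives.
            subprocess.run(cmd, check=True)
    return commands
```

For example, run_folder("pipelines/simple", "./data/water_quality.csv") prints the commands it would run, and passing dry_run=False executes them. Defaulting to a dry run makes it easy to check the list before starting a long benchmark.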

Cite this Work:

If you find this work useful, please consider citing it:

@misc{lai2020tods,
    title={TODS: An Automated Time Series Outlier Detection System},
    author={Kwei-Harng Lai and Daochen Zha and Guanchu Wang and Junjie Xu and Yue Zhao and Devesh Kumar and Yile Chen and Purav Zumkhawaka and Minyang Wan and Diego Martinez and Xia Hu},
    year={2020},
    eprint={2009.09822},
    archivePrefix={arXiv},
    primaryClass={cs.DB}
}

*Please refer to the master branch of TODS for details on running pipelines.

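TODS is a full-stack automated machine learning system, aimed mainly at outlier detection on multivariate time series data. TODS provides comprehensive modules for building machine-learning-based outlier detection systems, including data processing, time series processing, feature analysis, detection algorithms, and a reinforcement module. The functionality these modules provide includes common data preprocessing, smoothing or transformation of time series data, feature extraction from the time or frequency domain, a wide variety of detection algorithms…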
