
revise readme, add build_pipeline shell script

master · lhenry15 · 4 years ago
commit a2e3c407ce
2 changed files with 40 additions and 18 deletions
  1. benchmark/realworld_data/README.md (+14, -18)
  2. benchmark/realworld_data/build_pipelines.sh (+26, -0)

benchmark/realworld_data/README.md (+14, -18)

@@ -5,18 +5,6 @@ This branch contains the source code for the experiment part of our paper. We provide ever
## Resources
* Paper: Under review

## Cite this Work:
If you find this work useful, you may cite it:
```
@misc{lai2020tods,
title={TODS: An Automated Time Series Outlier Detection System},
author={Kwei-Herng Lai and Daochen Zha and Guanchu Wang and Junjie Xu and Yue Zhao and Devesh Kumar and Yile Chen and Purav Zumkhawaka and Minyang Wan and Diego Martinez and Xia Hu},
year={2020},
eprint={2009.09822},
archivePrefix={arXiv},
primaryClass={cs.DB}
}
```

## Datasets
To get the datasets, go to `data/script` and run all of the Python scripts there; they will download and preprocess the data into the `data/` folder automatically.
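For example, a minimal way to run them all (a sketch, assuming each script takes no arguments and resolves its output folder itself):

```bash
# run every data-preparation script in data/script
cd data/script
for f in *.py; do
    python "$f"
done
```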
@@ -31,10 +19,7 @@ These pipeline JSON files are organized by the different algorithm settings.
To run a pipeline, you can first generate your own pipeline JSON files with the provided script:

```bash
python pipeline_construction/pipeline_construction_simple.py
python pipeline_construction/pipeline_construction_subseq.py
python pipeline_construction/neural/build_AE_pipeline.py
python pipeline_construction/neural/build_RNNLSTM_pipeline.py
sh build_pipelines.sh
```

Then run the pipeline with run\_pipeline.py (below is an example of running IForest on the GECCO dataset):
@@ -47,9 +32,20 @@ python run_pipeline.py --pipeline_path pipelines/simple/pyod_iforest_0.01.json -
Or you can directly use the pipelines we have already generated in /pipelines via the bash script:

```bash
./run.sh
sh run.sh
```



## Cite this Work:
If you find this work useful, you may cite it:
```
@misc{lai2020tods,
title={TODS: An Automated Time Series Outlier Detection System},
author={Kwei-Herng Lai and Daochen Zha and Guanchu Wang and Junjie Xu and Yue Zhao and Devesh Kumar and Yile Chen and Purav Zumkhawaka and Minyang Wan and Diego Martinez and Xia Hu},
year={2020},
eprint={2009.09822},
archivePrefix={arXiv},
primaryClass={cs.DB}
}
```
*Please refer to the master branch of TODS for details on running pipelines.

benchmark/realworld_data/build_pipelines.sh (+26, -0)

@@ -0,0 +1,26 @@
#!/bin/bash
# Build all benchmark pipelines into ./pipelines/.

# Simple and subsequence pipelines.
if [ ! -d "./pipelines/simple" ]; then
    mkdir -p ./pipelines/simple
fi
if [ ! -d "./pipelines/subseq" ]; then
    mkdir -p ./pipelines/subseq
fi
python pipeline_construction/pipeline_construction_simple.py
python pipeline_construction/pipeline_construction_subseq.py

# AE and RNN_LSTM pipelines, built once per dataset CSV.
data="swan_sf creditcard web_attack water_quality"
for d in $data
do
    if [ ! -d "./pipelines/AE/$d" ]; then
        mkdir -p ./pipelines/AE/$d
    fi
    if [ ! -d "./pipelines/RNN_LSTM/$d" ]; then
        mkdir -p ./pipelines/RNN_LSTM/$d
    fi

    python pipeline_construction/neural/build_AE_pipeline.py "./data/"$d".csv"
    python pipeline_construction/neural/build_RNNLSTM_pipeline.py "./data/"$d".csv"
done
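The relative paths `./pipelines`, `./data`, and `pipeline_construction/` suggest the script is meant to be launched from `benchmark/realworld_data`, matching the `sh build_pipelines.sh` command in the README above; for example:

```bash
cd benchmark/realworld_data   # assumed working directory, so the relative paths resolve
sh build_pipelines.sh
```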


