# Results with minimal test RMSE for each kernel on dataset Acyclic

All kernels are tested on the dataset Acyclic, which consists of 185 molecules (graphs).

The prediction models are SVM for classification and kernel ridge regression for regression.

For prediction, we randomly divide the data into train and test subsets, where 90% of the entire dataset is used for training and the rest for testing. 10 splits are performed. For each split, we first train on the training data, then evaluate the performance on the test set. We choose the parameters that perform best on the test set and report the corresponding performance. The final results are the average of the performances on the test sets over all splits.
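
The splitting protocol described above can be sketched roughly as follows. This is a minimal illustration, not the package's own code: `evaluate_over_splits` and the `fit_predict` callback are hypothetical names, the targets are synthetic placeholders, and the use of the population standard deviation is an assumption, since the report does not say which variant is used.

```python
import math
import random
import statistics


def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def evaluate_over_splits(targets, fit_predict, n_splits=10, test_ratio=0.1, seed=0):
    """Mean and standard deviation of the test RMSE over random splits.

    `fit_predict(train_idx, test_idx)` is a hypothetical callback that trains
    on the training indices and returns predictions for the test indices.
    """
    rng = random.Random(seed)
    indices = list(range(len(targets)))
    n_test = max(1, round(test_ratio * len(indices)))
    scores = []
    for _ in range(n_splits):
        rng.shuffle(indices)
        test_idx, train_idx = indices[:n_test], indices[n_test:]
        y_pred = fit_predict(train_idx, test_idx)
        scores.append(rmse([targets[i] for i in test_idx], y_pred))
    # Population standard deviation is an assumption (see the text above).
    return statistics.mean(scores), statistics.pstdev(scores)


# Usage with synthetic targets and a trivial mean-value predictor.
targets = [float(i % 50) for i in range(185)]


def mean_baseline(train_idx, test_idx):
    mean = statistics.mean(targets[i] for i in train_idx)
    return [mean] * len(test_idx)


mean_rmse, std_rmse = evaluate_over_splits(targets, mean_baseline)
```

In practice the callback would fit a kernel ridge regression (or an SVM) on the training block of a precomputed kernel matrix; the mean-value predictor here only keeps the sketch self-contained.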
## Summary

| Kernels       | RMSE(℃) | STD(℃) |    Parameter | k_time |
|---------------|:---------:|:--------:|-------------:|-------:|
| Shortest path | 35.19 | 4.50 | - | 14.58" |
| Marginalized | 18.02 | 6.29 | p_quit = 0.1 | 4'19" |
| Path | 14.00 | 6.94 | - | 37.58" |
| WL subtree | 7.55 | 2.33 | height = 1 | 0.84" |
| Treelet | 8.31 | 3.38 | - | 49.58" |

* RMSE stands for the arithmetic mean of the root mean squared errors on all splits.
* STD stands for the standard deviation of the root mean squared errors on all splits.
* Parameter is the one with which the kernel achieves the best results.
* k_time is the time spent on building the kernel matrix.
* The targets of training data are normalized before calculating the *path kernel* and the *treelet kernel*.
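
The summary table writes k_time in minutes (`'`) and seconds (`"`), while the detailed tables give plain seconds. The small helper below shows one way to convert between the two. The function name `format_k_time` is hypothetical, and the rounding rule (two decimals below one minute, whole seconds above) is inferred from the table entries rather than stated in the report.

```python
def format_k_time(seconds):
    """Format a duration in seconds in the style of the summary table."""
    if seconds < 60:
        # Below one minute: seconds with two decimals, e.g. 14.58"
        return f"{seconds:.2f}" + '"'
    # One minute or more: whole minutes and rounded seconds, e.g. 4'19"
    minutes, rest = divmod(round(seconds), 60)
    return f"{minutes}'{rest:02d}" + '"'


print(format_k_time(14.5768))  # shortest path kernel
print(format_k_time(258.77))   # marginalized kernel, p_quit = 0.1
```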
## Detailed results of each kernel

In each table below:

* The unit of the *RMSEs* and *stds* is *℃*. The unit of the *k_time* is *s*.
* k_time is the time spent on building the kernel matrix.

### Shortest path kernel

```
  RMSE_test    std_test    RMSE_train    std_train    k_time
-----------  ----------  ------------  -----------  --------
     35.192     4.49577       28.3604      1.35718   14.5768
```

### Marginalized kernel

The table below shows the results of the marginalized kernel under different termination probabilities.

```
  p_quit    RMSE_test    std_test    RMSE_train    std_train    k_time
--------  -----------  ----------  ------------  -----------  --------
     0.1      18.0243     6.29247       12.1863      7.03899    258.77
     0.2      18.3376     5.85454       13.9554      7.54407   256.327
     0.3       18.496     5.73492       13.9391      7.95812   255.614
     0.4      19.4491      5.3713       16.2593      6.69358   254.897
     0.5      19.7857     5.55054       17.0181      6.84437   256.757
     0.6      20.1922     5.59122       17.6618      6.56718   256.557
     0.7      21.6614     6.02685       20.5882      5.74601   254.953
     0.8       22.996     6.08347       23.5943      3.80637   252.804
     0.9      24.4241     4.95119       25.8082      3.31207   256.738
```
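
The parameter reported in the summary is the one with the lowest test RMSE. As a small illustration, the selection can be reproduced from the values in the table above; the variable names are illustrative only.

```python
# (p_quit, RMSE_test) pairs copied from the marginalized kernel table.
results = [
    (0.1, 18.0243), (0.2, 18.3376), (0.3, 18.496),
    (0.4, 19.4491), (0.5, 19.7857), (0.6, 20.1922),
    (0.7, 21.6614), (0.8, 22.996), (0.9, 24.4241),
]

# Pick the row with the smallest test RMSE.
best_p_quit, best_rmse = min(results, key=lambda row: row[1])
print(best_p_quit, best_rmse)  # → 0.1 18.0243
```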

### Path kernel

**The targets of training data are normalized before calculating the kernel.**

```
  RMSE_test    std_test    RMSE_train    std_train    k_time
-----------  ----------  ------------  -----------  --------
    14.0015     6.93602       3.76191     0.702594   37.5759
```

### Weisfeiler-Lehman subtree kernel

The table below shows the results of the WL subtree kernel under different subtree heights.

```
  height    RMSE_test    std_test    RMSE_train    std_train    k_time
--------  -----------  ----------  ------------  -----------  --------
       0      15.6859      4.1392       17.6816     0.713183  0.360443
       1      7.55046     2.33179       6.27001     0.654734  0.837389
       2      9.72847     2.05767       4.45068     0.882129   1.25317
       3      11.2961     2.79994       2.27059     0.481516   1.79971
       4      12.8083     3.44694       1.07403     0.637823   2.35346
       5      14.0179     3.67504      0.700602      0.57264   2.78285
       6      14.9184     3.80535      0.691515      0.56462   3.20764
       7      15.6295     3.86539      0.691516      0.56462   3.71648
       8      16.2144     3.92876      0.691515      0.56462   3.99213
       9      16.7257      3.9931      0.691515      0.56462   4.26315
      10      17.1864     4.05672      0.691516     0.564621   5.00918
```
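
To make the role of the height parameter concrete, here is a minimal sketch of the textbook WL subtree kernel for node-labelled graphs. It is not the package's implementation, which may differ in label compression, normalization, and handling of edge labels. The graph representation (an adjacency dict plus a label dict) and the function names are assumptions made for this example.

```python
from collections import Counter


def wl_label_counts(adjacency, labels, height):
    """Count node labels over `height` rounds of Weisfeiler-Lehman relabeling.

    `adjacency` maps each node to a list of its neighbours and `labels` maps
    each node to its initial label (both are illustrative structures).
    """
    counts = Counter((0, label) for label in labels.values())
    current = dict(labels)
    for it in range(1, height + 1):
        # New label = own label plus the sorted multiset of neighbour labels.
        current = {
            node: (current[node], tuple(sorted(current[nb] for nb in adjacency[node])))
            for node in adjacency
        }
        counts.update((it, label) for label in current.values())
    return counts


def wl_subtree_kernel(graph_a, graph_b, height):
    """Dot product of the WL label histograms of two graphs."""
    counts_a = wl_label_counts(*graph_a, height)
    counts_b = wl_label_counts(*graph_b, height)
    return sum(counts_a[key] * counts_b[key] for key in counts_a)


# Two toy heavy-atom chains: C-C-O and C-O-C.
chain = {0: [1], 1: [0, 2], 2: [1]}
graph_a = (chain, {0: "C", 1: "C", 2: "O"})
graph_b = (chain, {0: "C", 1: "O", 2: "C"})

print(wl_subtree_kernel(graph_a, graph_b, 0))  # → 5
print(wl_subtree_kernel(graph_a, graph_a, 1))  # → 8
```

With height 0 only the initial atom labels are compared. Each additional round compares labels that encode a larger neighbourhood, so the feature space becomes more specific as the height grows.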

### Treelet kernel

**The targets of training data are normalized before calculating the kernel.**

```
  RMSE_test    std_test    RMSE_train    std_train    k_time
-----------  ----------  ------------  -----------  --------
     8.3079     3.37838       2.90887       1.2679   49.5814
```

A Python package for graph kernels, graph edit distances and graph pre-image problem.