README for dataset MUTAG


=== Usage ===

This folder contains the following comma separated text files
(replace DS by the name of the dataset):

n = total number of nodes
m = total number of edges
N = number of graphs

(1)  DS_A.txt (m lines)
     sparse (block diagonal) adjacency matrix for all graphs,
     each line corresponds to (row, col) resp. (node_id, node_id)

(2)  DS_graph_indicator.txt (n lines)
     column vector of graph identifiers for all nodes of all graphs,
     the value in the i-th line is the graph_id of the node with node_id i

(3)  DS_graph_labels.txt (N lines)
     class labels for all graphs in the dataset,
     the value in the i-th line is the class label of the graph with graph_id i

(4)  DS_node_labels.txt (n lines)
     column vector of node labels,
     the value in the i-th line corresponds to the node with node_id i

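A minimal sketch of reading the four required files above into per-graph
structures, using only the Python standard library. The function name
load_tu_dataset and the layout of the returned dictionary are illustrative
and not part of the dataset distribution. The sketch assumes that node_id and
graph_id count from 1 in file order, as the "i-th line" wording suggests.

```python
import os
from collections import defaultdict


def load_tu_dataset(folder, ds):
    """Read the required DS_*.txt files and group nodes and edges by graph."""

    def read_rows(suffix):
        path = os.path.join(folder, f"{ds}_{suffix}.txt")
        with open(path) as fh:
            # Each non-empty line holds comma-separated integers.
            return [[int(v) for v in line.split(",")] for line in fh if line.strip()]

    edges = read_rows("A")                                        # m rows: (node_id, node_id)
    graph_of = [row[0] for row in read_rows("graph_indicator")]   # n entries
    graph_labels = [row[0] for row in read_rows("graph_labels")]  # N entries
    node_labels = [row[0] for row in read_rows("node_labels")]    # n entries

    graphs = defaultdict(lambda: {"nodes": {}, "edges": [], "label": None})
    # Assumption: ids are 1-based, so line i describes node_id i.
    for node_id, graph_id in enumerate(graph_of, start=1):
        graphs[graph_id]["nodes"][node_id] = node_labels[node_id - 1]
    for u, v in edges:
        # Block-diagonal matrix: both endpoints lie in the same graph,
        # so the graph of the first endpoint identifies the edge's graph.
        graphs[graph_of[u - 1]]["edges"].append((u, v))
    for graph_id, label in enumerate(graph_labels, start=1):
        graphs[graph_id]["label"] = label
    return dict(graphs)
```

Calling load_tu_dataset("path/to/MUTAG", "MUTAG") would return one entry per
graph_id; the folder path here is a placeholder. Node ids stay global, so any
code that needs per-graph indices has to renumber them itself.
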
The following OPTIONAL files are present if the respective information is available:

(5)  DS_edge_labels.txt (m lines; same size as DS_A.txt)
     labels for the edges in DS_A.txt

(6)  DS_edge_attributes.txt (m lines; same size as DS_A.txt)
     attributes for the edges in DS_A.txt

(7)  DS_node_attributes.txt (n lines)
     matrix of node attributes,
     the comma separated values in the i-th line are the attribute vector of the node with node_id i

(8)  DS_graph_attributes.txt (N lines)
     regression values for all graphs in the dataset,
     the value in the i-th line is the attribute of the graph with graph_id i

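Because the optional files may be missing, a reader has to check for them
before parsing. The sketch below does this for the edge labels and pairs each
label with the edge on the same line of the adjacency file. The function name
load_edge_labels is hypothetical, and integer labels are an assumption that
holds for label files but not for attribute files.

```python
import os


def load_edge_labels(folder, ds):
    """Return {(node_id, node_id): label}, or None if the optional file is absent."""
    labels_path = os.path.join(folder, f"{ds}_edge_labels.txt")
    if not os.path.exists(labels_path):
        return None  # optional file not shipped with this dataset

    with open(os.path.join(folder, f"{ds}_A.txt")) as fh:
        edges = [tuple(int(v) for v in line.split(",")) for line in fh if line.strip()]
    with open(labels_path) as fh:
        labels = [int(line) for line in fh if line.strip()]

    # Line i of the label file describes the edge on line i of the adjacency file.
    if len(edges) != len(labels):
        raise ValueError("edge label file must have one line per adjacency line")
    return dict(zip(edges, labels))
```

The same existence check applies to the other optional files; the attribute
files hold comma separated values per line and would need a float parser
instead of int.
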

=== Description of the dataset ===

The MUTAG dataset consists of 188 chemical compounds divided into two
classes according to their mutagenic effect on a bacterium.

The chemical data was obtained from http://cdb.ics.uci.edu and converted
to graphs, where vertices represent atoms and edges represent chemical
bonds. Explicit hydrogen atoms have been removed and vertices are labeled
by atom type and edges by bond type (single, double, triple or aromatic).
Chemical data was processed using the Chemistry Development Kit (v1.4).

Node labels:

  0  C
  1  N
  2  O
  3  F
  4  I
  5  Cl
  6  Br

Edge labels:

  0  aromatic
  1  single
  2  double
  3  triple

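The two tables above can be kept as lookup dictionaries so that integer labels
read from the files are decoded back into atom symbols and bond types. This is
a small sketch; the names ATOM_SYMBOLS, BOND_TYPES and describe_edge are
illustrative and do not come from the dataset.

```python
# Label tables copied from the README.
ATOM_SYMBOLS = {0: "C", 1: "N", 2: "O", 3: "F", 4: "I", 5: "Cl", 6: "Br"}
BOND_TYPES = {0: "aromatic", 1: "single", 2: "double", 3: "triple"}


def describe_edge(u_label, v_label, edge_label):
    """Render one labeled edge as 'atom -bond- atom' text."""
    return f"{ATOM_SYMBOLS[u_label]} -{BOND_TYPES[edge_label]}- {ATOM_SYMBOLS[v_label]}"
```

For example, describe_edge(0, 2, 2) gives "C -double- O". An unknown label
raises KeyError, which is a deliberate choice here so that a file from a
different dataset is not silently decoded with the MUTAG tables.
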

=== Previous Use of the Dataset ===

Kriege, N., Mutzel, P.: Subgraph matching kernels for attributed graphs. In: Proceedings
of the 29th International Conference on Machine Learning (ICML-2012) (2012).


=== References ===

Debnath, A.K., Lopez de Compadre, R.L., Debnath, G., Shusterman, A.J., and Hansch, C.
Structure-activity relationship of mutagenic aromatic and heteroaromatic nitro compounds.
Correlation with molecular orbital energies and hydrophobicity. J. Med. Chem. 34(2):786-797 (1991).

A Python package for graph kernels, graph edit distances and graph pre-image problem.