
compute_distance_in_kernel_space.py 2.4 kB

# -*- coding: utf-8 -*-
"""compute_distance_in_kernel_space.ipynb

Automatically generated by Colaboratory.

Original file is located at
    https://colab.research.google.com/drive/17tZP6IrineQmzo9sRtfZOnHpHx6HnlMA

**This script demonstrates how to compute the distance in kernel space between the image of a graph and the mean of the images of a group of graphs.**

---

**0. Install `graphkit-learn`.**
"""

"""**1. Get dataset.**"""

from gklearn.utils import Dataset

# Name of the predefined dataset to use.
ds_name = 'MUTAG'

# Initialize a Dataset.
dataset = Dataset()
# Load the predefined dataset "MUTAG".
dataset.load_predefined_dataset(ds_name)
# Print the number of graphs (a bare expression is only displayed in a notebook).
print(len(dataset.graphs))

"""**2. Compute graph kernel.**"""

from gklearn.kernels import PathUpToH
import multiprocessing

# Initialize parameters for graph kernel computation.
kernel_options = {'depth': 3,
                  'k_func': 'MinMax',
                  'compute_method': 'trie'
                  }

# Initialize graph kernel.
graph_kernel = PathUpToH(node_labels=dataset.node_labels,  # list of node label names.
                         edge_labels=dataset.edge_labels,  # list of edge label names.
                         ds_infos=dataset.get_dataset_infos(keys=['directed']),  # dataset information required for computation.
                         **kernel_options,  # options for computation.
                         )

# Compute Gram matrix.
gram_matrix, run_time = graph_kernel.compute(dataset.graphs,
                                             parallel='imap_unordered',  # or None.
                                             n_jobs=multiprocessing.cpu_count(),  # number of parallel jobs.
                                             normalize=True,  # whether to return normalized Gram matrix.
                                             verbose=2  # whether to print out results.
                                             )

r"""**3. Compute distance in kernel space.**

Given a dataset $\mathcal{G}_N$, compute the distance in kernel space between the image of $G_1 \in \mathcal{G}_N$ and the mean of the images of $\mathcal{G}_k \subset \mathcal{G}_N$.
"""

from gklearn.preimage.utils import compute_k_dis

# Index of $G_1$.
idx_1 = 10
# Indices of graphs in $\mathcal{G}_k$.
idx_graphs = range(0, 10)

# Compute the distance in kernel space.
dis_k = compute_k_dis(idx_1,
                      idx_graphs,
                      [1 / len(idx_graphs)] * len(idx_graphs),  # weights for images of graphs in $\mathcal{G}_k$; all equal when computing the mean.
                      gram_matrix,  # Gram matrix of all graphs.
                      withterm3=False
                      )
print(dis_k)
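
The distance computed in step 3 can be written using Gram matrix entries alone. For weights $\alpha_i$ over the graphs in $\mathcal{G}_k$, the squared distance between the image of $G_1$ and the weighted mean of images expands to $k(G_1, G_1) - 2 \sum_i \alpha_i k(G_1, G_i) + \sum_i \sum_j \alpha_i \alpha_j k(G_i, G_j)$. The following is a minimal NumPy sketch of that expansion, not the `gklearn` implementation: the function name `kernel_distance_to_weighted_mean` and the toy linear-kernel data are hypothetical, chosen only so the result can be checked against an explicit feature-space computation.

```python
import numpy as np


def kernel_distance_to_weighted_mean(gram, idx_target, idx_group, weights):
    """Distance in kernel space between one image and a weighted mean of images.

    Hypothetical helper for illustration; it only needs the Gram matrix.
    """
    idx_group = np.asarray(list(idx_group))
    weights = np.asarray(weights, dtype=float)

    # k(G_1, G_1)
    term1 = gram[idx_target, idx_target]
    # 2 * sum_i alpha_i * k(G_1, G_i)
    term2 = 2.0 * weights @ gram[idx_target, idx_group]
    # sum_i sum_j alpha_i * alpha_j * k(G_i, G_j)
    term3 = weights @ gram[np.ix_(idx_group, idx_group)] @ weights

    # Clip tiny negative values caused by floating-point rounding.
    return float(np.sqrt(max(term1 - term2 + term3, 0.0)))


# Toy check with explicit feature vectors and a linear kernel (made-up data).
features = np.array([[0.0, 0.0], [2.0, 0.0], [0.0, 2.0], [4.0, 3.0]])
toy_gram = features @ features.T

toy_distance = kernel_distance_to_weighted_mean(
    toy_gram, idx_target=3, idx_group=range(0, 3), weights=[1 / 3] * 3
)
# With a linear kernel, the same value can be computed directly in feature space.
direct_distance = float(np.linalg.norm(features[3] - features[:3].mean(axis=0)))
print(toy_distance, direct_distance)
```

With a graph kernel the feature map is implicit, so only the Gram-matrix form is available; the linear-kernel toy case is used here because both forms can be compared.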

A Python package for graph kernels, graph edit distances and graph pre-image problem.