
extra_get_basic_and_yago.py 691 B

#encoding=utf-8
'''
The DBpedia dataset uses two kinds of types: yago types and basic types.
A yago type is a type with the yago prefix.
A basic type is an object pointed to by rdf:type.
This script splits the two kinds of types into separate files.
'''
basic = []
yago = []
b = 0        # next id for basic types
y = 100000   # next id for yago types
with open('type id file here') as f:
    for line in f:
        # Strip only the trailing newline so a final line without one is not truncated.
        dou = line.rstrip('\n').split('\t')
        if dou[0].startswith('<yago:'):
            yago.append(dou[0] + "\t%d\n" % y)
            y += 1
        else:
            basic.append(dou[0] + "\t%d\n" % b)
            b += 1
with open('basic types id file here', 'w') as f:
    for entry in basic:  # renamed from `str` to avoid shadowing the built-in
        f.write(entry)
with open('yago type id file here', 'w') as f:
    for entry in yago:
        f.write(entry)
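
To make the split easier to try without real files, here is a minimal sketch that wraps the same logic in a function and runs it on a few in-memory lines. The function name `split_types`, the `yago_start` parameter, and the sample type names are assumptions for illustration; they are not part of the original script, and the real layout of the type id file may differ.

```python
def split_types(lines, yago_start=100000):
    """Split type-id lines into (basic, yago) lists with renumbered ids.

    Hypothetical helper: same logic as the script, but for any iterable of lines.
    """
    basic, yago = [], []
    for line in lines:
        # Only the first tab-separated field (the type name) is kept.
        name = line.rstrip('\n').split('\t')[0]
        if name.startswith('<yago:'):
            yago.append("%s\t%d\n" % (name, yago_start + len(yago)))
        else:
            basic.append("%s\t%d\n" % (name, len(basic)))
    return basic, yago


# Made-up sample input (assumed layout: type name, a tab, then an old id).
sample = ["<Person>\t7\n", "<yago:Scientist>\t8\n", "<Place>\t9"]
basic, yago = split_types(sample)
print(basic)
print(yago)
```

Basic types are numbered from 0 and yago types from 100000, so the two id ranges stay disjoint as long as there are fewer than 100000 basic types.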

The GAnswer system is a natural language QA system developed by the Data Management Lab of the Institute of Computer Science & Technology, Peking University, led by Prof. Zou Lei. GAnswer is able to translate natural language questions to query graphs containing semant