Ensure SnippetsTrees are written with file = unix filename #7
Description
results.xml contains lines like:
<results>
<result pre="ric multi-attribute utility values " name0="exclude" value0="exclude" post="important domains and non-health outcomes, while p" xpath="/*[local-name()='html'][1]/*[local-name()='body'][1]/*[local-name()='div'][1]/*[local-name()='div'][3]/*[local-name()='div'][1]/*[local-name()='p'][1]"/>
</results>
A snippetsTree is an element that contains one or more results elements.
A projectSnippetsTree is an element that contains snippetsTree elements, one for each paper that is processed.
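To make the shape of the data concrete, here is a minimal sketch of reading result elements out of a results document. It uses only the JDK's built-in DOM parser, not ami's own classes; the class name ResultsReader and the shortened sample string are assumptions made for illustration, and only the attribute names (pre, value0, post) come from the example above.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of ami: reads <result> elements from a results document.
class ResultsReader {

    /** Returns one {pre, value0, post} triple per <result> element in the given XML. */
    static List<String[]> readResults(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xml)));

        // Matches elements named exactly "result"; the <results> root is not included.
        NodeList nodes = doc.getElementsByTagName("result");
        List<String[]> rows = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            Element result = (Element) nodes.item(i);
            rows.add(new String[] {
                result.getAttribute("pre"),
                result.getAttribute("value0"),
                result.getAttribute("post")
            });
        }
        return rows;
    }

    public static void main(String[] args) throws Exception {
        // Shortened, illustrative version of the sample line in this issue.
        String xml = "<results>"
            + "<result pre=\"utility values \" name0=\"exclude\" value0=\"exclude\""
            + " post=\"important domains\" xpath=\"/*[local-name()='html'][1]\"/>"
            + "</results>";
        for (String[] row : readResults(xml)) {
            System.out.println(row[0] + "[" + row[1] + "]" + row[2]);
        }
    }
}
```

Nothing in this sketch needs a snippetsTree wrapper: the result elements can be consumed directly from the results document.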
However, we build snippetsTrees directly from results. The current code in SnippetsTree.java relies on the results being saved in a file named exactly results/pluginname/option/results.xml (see line 107). This doesn't make sense, because a snippetsTree written to file gets a name of the form plugin.option.snippets.xml. That makes it impossible to read a snippetsTree back in from its own file and end up with a valid object.
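The mismatch between the two naming conventions can be sketched as follows. This is an illustration only, not the code in SnippetsTree.java: the class and method names are hypothetical, and "word" and "frequencies" are placeholder plugin and option names. Only the two path patterns are taken from the description above.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical illustration of the two file-naming conventions described in this issue.
class SnippetsPathMismatch {

    /** Pattern the reading code is described as expecting: results/<plugin>/<option>/results.xml */
    static Path expectedReadPath(String plugin, String option) {
        return Paths.get("results", plugin, option, "results.xml");
    }

    /** Pattern a snippetsTree is described as being written with: <plugin>.<option>.snippets.xml */
    static Path writtenPath(String plugin, String option) {
        return Paths.get(plugin + "." + option + ".snippets.xml");
    }

    /** True only if a file written under one convention would be found under the other. */
    static boolean roundTrips(String plugin, String option) {
        Path read = expectedReadPath(plugin, option);
        Path written = writtenPath(plugin, option);
        return read.getFileName().equals(written.getFileName());
    }

    public static void main(String[] args) {
        // Placeholder plugin and option names, chosen for illustration.
        System.out.println("read:    " + expectedReadPath("word", "frequencies"));
        System.out.println("written: " + writtenPath("word", "frequencies"));
        System.out.println("round trip possible: " + roundTrips("word", "frequencies"));
    }
}
```

Because the file names never coincide, any fix has to either change the name used on write or relax the path check on read.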
I think this shows that the ami code mixes two functions that should be more strongly decoupled: 1) mining information from papers and 2) formatting it for human reading.
A machine doesn't really need either the snippetsTree or the projectSnippetsTree. We should probably stop generating them (including when they hold post-processed data from the mine, such as word counts) and leave that formatting to a tool further down the line.