Conversation
|
IT seems I answered in the wrong box and pressed the wrong button! |
| if (NULL == dj) continue; | ||
|
|
||
| // Look at the right-going connectors. | ||
| for (Connector* rcon = dj->right; rcon; rcon=rcon->next) |
There was a problem hiding this comment.
Why the left connectors are ignored?
There was a problem hiding this comment.
Because this was a prototype and I did not do those.
| for (Connector* wrc = lkg->link_array[l].rc; wrc; wrc=wrc->next) | ||
| { | ||
| // Skip if no match. | ||
| if (strcmp(rcon->desc->string, wrc->desc->string)) continue; |
There was a problem hiding this comment.
As I wrote in the general response (in the wrong box...), there is 1-1 match between condesct_t elements and connector strings, disregarding if the connectors are multi or not. Thus the condesc_t strings don't include@.
So there is a need for a different comparison.
|
I propose a simpler algo:
The multi-connector costs can be taken from a hash table stored in |
|
When you say
Does this table currently exist, or would it have to be created? |
It doesn't exist. (BTW, I already used |
OK, thanks. I'm going to keep this open, as a to-do item. Finishing it would not be hard - an afternoon of creating the table and hooking it all up. But nothing depends on it right now, so I don't feel like cluttering up the current code with this new stuff. |
|
FYI, this did eventually become urgent, and an alternate work-around was done in commit 1dcdd15 which is adequate for now. |
|
In any case I added this to my TODO list. |
(EDIT: Since I have already written it in the wrong box, I leave it here...)
It seems there are several problems in this code.
Also, there are several other problems related to it.
Regarding
condesc_t::cncost- this is an improper place to put a multi-connector cost: There is a singlecondesc_telement for each connector string (i.e. all multi and non-multi connectors with the same connector string share a singlecondesc_telement). However, each multi-connector in the dict may have its own cost, and these costs may be different.There is also a problem to store the cost in the connector due to the 64 bytes limitation.
A multi-connector cost table could be added to disjuncts, but there is a space constraint there too.
So the only good place for multi-connector costs seems to be in
Sentence_s- a hash table for that can be added.Among the problems that are related:
Disjunct_structbecause this will create a cost mess if relinking is needed (e.g. for implementing phantoms). To that end, anadjust_costfield could be added toDisjunct_struct, but again there is a space problem there.!disjunct) should be updated, to be consistent with the adjusted disjunct costs. However, it is not clear how to textually represent multi-matches of multi-connectors. (I also discovered a bug there regarding multi-connectors that has no implications in the current dict/corpora and in your optional multi-connector examples).Regarding the code, it seems it has several things which are wrong (or that I don't understand).
In any case, the code can be simplified and this will save fixing these things.
I will add my comments to the code itself.