I could not find the demonstration retrieval code in code/data_process.py, so I wrote my own implementation using sentence_transformers and the all-mpnet-base-v2 model. I followed the method outlined in the paper: for each utterance, retrieve the top-1 most similar utterance from the training set, using same-label pairing for training and all-labels pairing for dev/test.
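For reference, here is a minimal sketch of the retrieval step I implemented. This is my own reconstruction, not the paper's code: embeddings are assumed to come from `SentenceTransformer('all-mpnet-base-v2')`, and the function name and masking details are my own.

```python
import numpy as np

def retrieve_demonstrations(query_emb, pool_emb, pool_labels, query_labels=None):
    """Top-1 demonstration retrieval by cosine similarity.

    If query_labels is given (training), each query is restricted to
    same-label candidates; otherwise (dev/test) all labels are searched.
    Embeddings are assumed to be sentence-transformer outputs.
    """
    # Normalize rows so the dot product equals cosine similarity.
    q = query_emb / np.linalg.norm(query_emb, axis=1, keepdims=True)
    p = pool_emb / np.linalg.norm(pool_emb, axis=1, keepdims=True)
    sims = q @ p.T  # shape: (n_query, n_pool)
    if query_labels is not None:
        # Same-label pairing: mask out candidates with a different label.
        mask = query_labels[:, None] != pool_labels[None, :]
        sims[mask] = -np.inf
    return sims.argmax(axis=1)  # index of the top-1 demonstration per query
```

(In training, the query utterance itself is in the pool, so self-retrieval would also need to be excluded; I've omitted that here for brevity.)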
Without demonstration retrieval, I was able to come close to the paper's result. With demonstration retrieval, however, my F1-scores dropped from ~69 to ~25 on MELD (with similar drops on the other datasets). I found that during training the model simply copies the demonstration's emotion label to minimize the loss, which leads to extreme overfitting; in validation this almost never works, since dev/test use all-labels pairing.
Could you share the demonstration retrieval code so we can take a look? As implemented above, retrieval does not seem to be effective.