The token input to the model decoder at each time step is the same as the output token #4

Thanks for the code, it has been very helpful! However, I think it may have the following problems:
1) At each time step, the token fed to the model decoder is the same as the token it is expected to output, so the model only needs to copy its input, which would explain the very high accuracy. The correct approach is to feed the token from time t-1 (not the token from time t) and have the decoder output the token at time t.
2) Teacher forcing should only be used for model training, but the code still uses it when evaluating the model.
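
To illustrate both points, here is a minimal sketch in plain Python. It is not taken from this repository: `BOS`, `EOS`, `shift_right`, `greedy_decode`, and `toy_step` are hypothetical names, and `toy_step` only stands in for a real decoder step. The first function builds decoder inputs that lag the labels by one position for training; the second decodes without teacher forcing by feeding the model its own predictions, which is what I would expect at evaluation time.

```python
from typing import Callable, List, Sequence, Tuple

BOS, EOS = 1, 2  # hypothetical special-token ids, chosen only for this example


def shift_right(target: Sequence[int], bos: int = BOS) -> Tuple[List[int], List[int]]:
    """Build (decoder_inputs, labels) for teacher-forced training.

    The input at step t is the gold token from step t-1 (BOS at t=0);
    the label at step t is the gold token at step t.
    """
    labels = list(target)
    inputs = [bos] + labels[:-1]
    return inputs, labels


def greedy_decode(
    step_fn: Callable[[List[int]], int],
    max_len: int,
    bos: int = BOS,
    eos: int = EOS,
) -> List[int]:
    """Decode without teacher forcing: each step sees only earlier predictions."""
    prefix = [bos]
    for _ in range(max_len):
        next_token = step_fn(prefix)  # the model never sees gold tokens here
        prefix.append(next_token)
        if next_token == eos:
            break
    return prefix[1:]  # drop the leading BOS


def toy_step(prefix: List[int]) -> int:
    """Stand-in for a real decoder step (an assumption, not a trained model)."""
    return EOS if len(prefix) >= 4 else 4 + len(prefix)


inputs, labels = shift_right([5, 6, 7, EOS])
print(inputs)   # [1, 5, 6, 7]
print(labels)   # [5, 6, 7, 2]
print(greedy_decode(toy_step, max_len=10))  # [5, 6, 7, 2]
```

With this layout the decoder can no longer score well by copying its input, and the evaluation accuracy reflects what the model predicts on its own.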
