Thanks for the code! I think it may have the following problems:
At each time step, the token fed to the decoder is the same token it is expected to output, which is why the reported accuracy is so high. The correct decoder input at time t is the token from time t-1, not the token at time t.
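Teacher forcing should only be used during training, but the code also uses it when evaluating the model. At evaluation time the decoder should be fed its own previous predictions instead of the ground-truth tokens.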
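To make the two points concrete, here is a minimal framework-free sketch of what I would expect. It is not taken from the repository: `greedy_decode`, `teacher_forced_inputs`, `toy_step`, and the token ids are all hypothetical names and values chosen for illustration, and `step` stands in for one forward pass of the real decoder.

```python
from typing import Callable, List

# Hypothetical one-step decoder: given the tokens generated so far,
# return the id of the next token. In the real model this would be a
# forward pass followed by an argmax over the vocabulary.
StepFn = Callable[[List[int]], int]


def teacher_forced_inputs(target: List[int], bos_id: int) -> List[int]:
    """Training only: shift the target right by one position, so the
    decoder input at step t is the ground-truth token from step t-1."""
    return [bos_id] + target[:-1]


def greedy_decode(step: StepFn, bos_id: int, eos_id: int, max_len: int) -> List[int]:
    """Evaluation: feed the decoder its own previous predictions.
    The ground-truth target is never passed in."""
    tokens = [bos_id]
    for _ in range(max_len):
        next_token = step(tokens)  # sees only tokens up to step t-1
        tokens.append(next_token)
        if next_token == eos_id:
            break
    return tokens[1:]  # drop the BOS token


def toy_step(prefix: List[int]) -> int:
    """Toy stand-in for the model: predicts the previous token id plus one."""
    return prefix[-1] + 1


if __name__ == "__main__":
    BOS, EOS = 0, 3  # assumed special-token ids for this toy example
    print(teacher_forced_inputs([1, 2, 3], BOS))            # [0, 1, 2]
    print(greedy_decode(toy_step, BOS, EOS, max_len=10))    # [1, 2, 3]
```

With this split, accuracy measured on the output of `greedy_decode` reflects what the model can generate by itself, while `teacher_forced_inputs` is only used to build the decoder input for the training loss.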