
Overfitting when training ACDNet20 in TensorFlow #8

@mrerdem

Thank you for the paper and the repository.

When I follow the procedure for "B. Rebuilding ACDNet20 in Tensorflow", the model starts overfitting after just a few epochs, and the final validation accuracy does not go beyond ~30%. I am using "micro_acdnet_pruned_trained_fold4_86.00" as the reference torch model for the first input.

I should also note that the structure of the pretrained TF model (acdnet20_20khz_fold4) does not match any of the 4 torch models provided: the pretrained TF model has conv filter counts of [4, 32, 12, 23, 18, 38, 43, 62, 58, 77, 37, 50], whereas the torch model has [7, 20, 10, 14, 22, 31, 35, 41, 51, 67, 69, 48]. I also tried training from scratch in TF using the architecture of the pretrained TF model, but the overfitting is still there.
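For reference, this is roughly how I compared the filter counts (a sketch, not the repo's code; the file paths, the checkpoint format, and the assumption that all conv layers sit at the top level of the Keras model are mine):

```python
# Sketch (paths/format assumed): list conv output-channel counts of the
# pretrained TF model and of the reference torch checkpoint.
import tensorflow as tf
import torch

tf_model = tf.keras.models.load_model("acdnet20_20khz_fold4")
tf_filters = [layer.filters for layer in tf_model.layers
              if isinstance(layer, tf.keras.layers.Conv2D)]

ckpt = torch.load("micro_acdnet_pruned_trained_fold4_86.00", map_location="cpu")
state = ckpt.state_dict() if hasattr(ckpt, "state_dict") else ckpt
torch_filters = [w.shape[0] for name, w in state.items()
                 if name.endswith("weight") and w.dim() == 4]

print("TF   :", tf_filters)     # gives [4, 32, 12, 23, 18, 38, 43, 62, 58, 77, 37, 50]
print("torch:", torch_filters)  # gives [7, 20, 10, 14, 22, 31, 35, 41, 51, 67, 69, 48]
```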

At first I thought it was because the weight_decay parameter in SGD is now deprecated, so I implemented L2 regularization on all Conv layers to get the same effect, but the overfitting is still there.
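For what it's worth, this is roughly how I added the L2 regularization (a minimal sketch of my workaround, not the repo's code; the decay value and optimizer hyperparameters are assumptions):

```python
# Sketch of the workaround: emulate SGD weight decay by attaching an L2
# kernel regularizer to every Conv2D layer.
import tensorflow as tf

# Assumed value. Keras L2 adds l2 * sum(w**2) to the loss, so to match a
# PyTorch-style weight_decay wd under plain SGD, use l2 = wd / 2.
WEIGHT_DECAY = 5e-4

def conv_bn_relu(x, filters, kernel_size, strides=(1, 1)):
    x = tf.keras.layers.Conv2D(
        filters, kernel_size, strides=strides, padding="same", use_bias=False,
        kernel_regularizer=tf.keras.regularizers.L2(WEIGHT_DECAY))(x)
    x = tf.keras.layers.BatchNormalization()(x)
    return tf.keras.layers.ReLU()(x)

# Plain SGD without a weight_decay argument (hyperparameters assumed):
optimizer = tf.keras.optimizers.SGD(learning_rate=0.1, momentum=0.9, nesterov=True)
```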

Do you have any suggestions for fixing this issue?
