Spectrograms not updating well at low frequency bins #115

Hello, and thanks for putting together a really useful library!

I'm working on a pneumonia detection problem. My dataset is heavily imbalanced, with 2000+ non-pneumonia cases and only 142 pneumonia cases, so I decided to use 142 cases of each label to keep the dataset balanced.

I am trying to apply the STFT layer in the following model:

![image](https://user-images.githubusercontent.com/29982236/150547472-118873e9-9d0d-455c-a84a-0f84c3b6cabc.png)

with the following parameters:

`self.spec_layer = Spectrogram.STFT(n_fft=256, hop_length=128, sr=8000, trainable=True, output_format="Magnitude")`

I do see the spectrograms change as the model trains, but the updates seem to happen mainly in the higher frequency bins. I would expect the low-frequency bins to be the ones that inform the network's decisions, since lung sounds lie in the 0-4000 Hz range and I sample at 8000 Hz. Here is a spectrogram of a pneumonia sample before training:

![outputs__orig_index_9_label_1](https://user-images.githubusercontent.com/29982236/150548548-1496e3f8-c9c3-411e-ab3f-038a3a6b58b0.png)

and here are its updated versions at epochs 10, 50, and 150, respectively:

![outputs___9_label_1_epoch_10](https://user-images.githubusercontent.com/29982236/150548995-a1d0ebf5-9d24-4015-9735-18e58552e048.png)

![outputs___9_label_1_epoch_50](https://user-images.githubusercontent.com/29982236/150549032-afa86994-574a-4ab6-aef3-f72ed06e6d79.png)

![outputs___9_label_1_epoch_140](https://user-images.githubusercontent.com/29982236/150549056-598ba64f-4731-41ca-9b4a-b4dbf8589234.png)

Since the changes are really hard to see directly, I generated a difference map (trained spectrogram at a given epoch minus the original untrained spectrogram). Here are the difference maps at epochs 10, 50, and 150, respectively:


![diff___9_label_1_epoch_10](https://user-images.githubusercontent.com/29982236/150549075-9718f379-2d3e-4730-b536-4bdf980affd8.png)

![diff___9_label_1_epoch_50](https://user-images.githubusercontent.com/29982236/150549086-57402d18-7b70-44b2-b749-3673b3697935.png)

![diff___9_label_1_epoch_140](https://user-images.githubusercontent.com/29982236/150549109-5213d46e-1feb-487a-9f16-4128c958e644.png)
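For reference, here is a minimal sketch of how such a difference map can be computed, and how the change can be summarised per frequency bin so that it is easier to read than an image. The function names (`difference_map`, `mean_abs_change_per_bin`) and the toy arrays are illustrative assumptions, not my actual training code; real inputs would be the magnitude spectrograms of shape `(freq_bins, frames)` produced by the STFT layer.

```python
import numpy as np


def difference_map(trained_spec, original_spec):
    """Element-wise difference of two spectrograms shaped (freq_bins, frames)."""
    trained_spec = np.asarray(trained_spec, dtype=float)
    original_spec = np.asarray(original_spec, dtype=float)
    if trained_spec.shape != original_spec.shape:
        raise ValueError("spectrograms must have the same shape")
    return trained_spec - original_spec


def mean_abs_change_per_bin(diff):
    """Average absolute change over time, one value per frequency bin."""
    return np.abs(diff).mean(axis=1)


# Toy data standing in for real spectrograms (hypothetical values).
rng = np.random.default_rng(0)
original = rng.random((129, 40))
trained = original.copy()
trained[100:, :] += 0.5  # pretend that only the high bins were updated

diff = difference_map(trained, original)
per_bin = mean_abs_change_per_bin(diff)

# Bins with essentially no change show up as (near) zeros here.
print(per_bin[:5])
print(per_bin[-5:])
```

Plotting `per_bin` against the bin index gives a one-dimensional view of which frequency bins actually moved during training.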

It's difficult to see, but there are some slight modifications in the lower frequency bins 0-24. They are small, though, and there is barely any change in bins 0-12.
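To put those bin indices in physical units, the sketch below maps STFT bin numbers to frequencies for `n_fft=256` and `sr=8000`. It assumes the usual linear one-sided STFT layout, where bin `k` is centred at `k * sr / n_fft`; the variable names are only illustrative.

```python
import numpy as np

n_fft = 256  # from the STFT layer configuration
sr = 8000    # sampling rate in Hz

# Centre frequency of each one-sided STFT bin: k * sr / n_fft
bin_freqs = np.arange(n_fft // 2 + 1) * sr / n_fft

print(len(bin_freqs))   # 129 bins in total
print(bin_freqs[1])     # 31.25 Hz spacing between bins
print(bin_freqs[12])    # 375.0 Hz
print(bin_freqs[24])    # 750.0 Hz
print(bin_freqs[-1])    # 4000.0 Hz (Nyquist)
```

Under that assumption, bins 0-24 cover roughly 0-750 Hz and bins 0-12 cover roughly 0-375 Hz.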

Some of the training parameters are:

parameters.lr = 1e-4
parameters.n_epochs = 150
parameters.batch_size = 32
parameters.audio_length = 5
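As a rough sanity check on the spectrogram size these settings imply, here is a small sketch. It assumes that `audio_length` is in seconds and that the STFT pads the signal so frames are centred (in which case the frame count is `n_samples // hop_length + 1`); both are assumptions on my side rather than something verified against the library.

```python
n_fft = 256
hop_length = 128
sr = 8000
audio_length = 5  # assumed to be in seconds

n_samples = sr * audio_length       # samples per clip
freq_bins = n_fft // 2 + 1          # one-sided spectrum
frames = n_samples // hop_length + 1  # assumes centred (padded) frames

print(n_samples, freq_bins, frames)  # 40000 129 313
```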

I use nnAudio == 0.2.6.



