Hello, thanks for putting together a really useful library!
I'm working on a pneumonia detection problem. My dataset is heavily imbalanced, with 2000+ non-pneumonia cases and only 142 pneumonia cases, so I decided to use 142 cases of each label to keep the dataset balanced.
I am trying to apply the STFT layer in the following model:
with the following parameters:
self.spec_layer = Spectrogram.STFT(n_fft=256, hop_length=128, sr=8000, trainable=True, output_format="Magnitude")
Now, I do see the spectrograms change as the model trains, but the updates seem to be concentrated in the higher frequency bins. I would expect the low-frequency bins to be the ones that inform the network's decisions, since lung sounds lie in the 0-4000 Hz range and I sample at 8000 Hz. Here is a spectrogram of a pneumonia sample before training:
and here are its updated versions at epochs 10, 50, and 150, respectively:
Since the changes are really hard to see, I generated difference maps (= trained spectrogram at a given epoch - original untrained spectrogram). Here are the difference maps at epochs 10, 50, and 150, respectively:
It's difficult to see, but there are some slight modifications in the lower frequency bins 0-24. They are small, though, and there are barely any in bins 0-12.
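For reference, the difference map described above can be sketched roughly as follows. This is only a minimal NumPy illustration: the helper names `difference_map` and `mean_abs_change_per_bin` are hypothetical, the arrays are placeholder data rather than my real spectrograms, and I assume each spectrogram is laid out as (frequency bins, time frames).

```python
import numpy as np

def difference_map(trained, original):
    """Element-wise difference: trained spectrogram minus the untrained one."""
    trained = np.asarray(trained, dtype=np.float64)
    original = np.asarray(original, dtype=np.float64)
    if trained.shape != original.shape:
        raise ValueError("spectrograms must have the same shape")
    return trained - original

def mean_abs_change_per_bin(diff):
    """Average |difference| over time frames: one value per frequency bin."""
    return np.abs(diff).mean(axis=1)

# Placeholder data with an assumed shape (freq_bins, time_frames).
# These are NOT real results, just something to run the helpers on.
rng = np.random.default_rng(0)
original = rng.random((129, 40))
trained = original.copy()
trained[100:, :] += 0.5  # pretend only the upper bins were modified

diff = difference_map(trained, original)
per_bin = mean_abs_change_per_bin(diff)
print(per_bin[:3], per_bin[-3:])
```

Averaging the absolute difference over time gives a single number per bin, which can be easier to compare across epochs than the 2-D maps.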
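To put those bin indices in Hz, here is a small sketch of the bin-to-frequency arithmetic. It assumes a linear frequency scale, where bin k is centred at k * sr / n_fft; I did not set a frequency-scale option on the layer, so this is my assumption about what the bins correspond to at initialization. The helper `bin_to_hz` is hypothetical.

```python
def bin_to_hz(k, n_fft=256, sr=8000):
    """Centre frequency of bin k, assuming a linear frequency scale."""
    return k * sr / n_fft

n_fft, sr = 256, 8000
n_bins = n_fft // 2 + 1       # one-sided spectrum: 129 bins

print(n_bins)                 # 129
print(bin_to_hz(1))           # 31.25  (bin spacing in Hz)
print(bin_to_hz(12))          # 375.0
print(bin_to_hz(24))          # 750.0
print(bin_to_hz(n_bins - 1))  # 4000.0 (Nyquist frequency)
```

Under that assumption, bins 0-24 cover roughly 0-750 Hz and bins 0-12 roughly 0-375 Hz.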
Some of the training parameters are:
parameters.lr = 1e-4
parameters.n_epochs = 150
parameters.batch_size = 32
parameters.audio_length = 5
I use nnAudio == 0.2.6.







