Performance degradation caused by normalization #48

@BoZenKhaa

Description

Performance degradation on PCA

I was getting unexpectedly poor performance from PCA on the Exathlon data (using the VUS-PR metric):

| Method | TSB-AD (latest commit) | Values reported in the paper | Difference |
|---|---|---|---|
| TranAD | 0.95 | 0.10 | 0.86 |
| OFA | 0.85 | 0.58 | 0.28 |
| CNN | 0.95 | 0.68 | 0.27 |
| LSTMAD | 0.96 | 0.82 | 0.14 |
| OmniAnomaly | 0.97 | 0.84 | 0.13 |
| USAD | 0.97 | 0.84 | 0.13 |
| RobustPCA | 0.81 | 0.77 | 0.04 |
| AnomalyTransformer | 0.14 | 0.10 | 0.04 |
| AutoEncoder | 0.91 | 0.91 | 0.00 |
| IForest | 0.32 | 0.35 | -0.04 |
| PCA | 0.53 | 0.95 | -0.42 |

The first column shows values I obtained with the latest commit of TSB-AD; the second shows values from the publication.

The improvement in most methods can probably be ascribed to fixes made since publication (e.g. use of correct hyperparameters). However, there might be a bug affecting PCA, and possibly other methods as well:

Bug (in PCA?)

At least part of the issue might be normalization introduced in a79f315:

```python
# models/PCA.py
X = Window(window = self.slidingWindow).convert(X)
if self.normalize:
    if n_features == 1:
        X = zscore(X, axis=0, ddof=0)
    else:
        X = zscore(X, axis=1, ddof=1)  # <--- 2nd issue

# validate inputs X and y (optional)
X = check_array(X)
self._set_n_classes(y)

# PCA is recommended to use on the standardized data (zero mean and
# unit variance).
if self.standardization:  # <--- 1st issue
    X, self.scaler_ = standardizer(X, keep_scalar=True)
```

In the example of PCA above:

  1. Normalization is applied twice: once via the `normalization` flag and once via the `standardization` flag.
  2. `X = zscore(X, axis=1, ddof=1)` seems to apply normalization independently to each window. Each window contains multiple features and has the form `[feat0_t0, feat0_t1, ..., feat0_tw, feat1_t0, feat1_t1, ..., feat1_tw, ...]`.

It is not clear to me what No. 2 is trying to achieve, but it does not seem correct for PCA. Applying the z-score independently to each time window, for example, destroys information about the absolute magnitude of features across windows, and overall seems only to remove information from the data.
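To make No. 2 concrete, here is a small sketch on hypothetical toy data (not the actual Exathlon windows) showing how the per-window z-score erases magnitude differences between windows, while a column-wise z-score does not:

```python
import numpy as np
from scipy.stats import zscore

# Two sliding windows with very different absolute magnitudes
# (hypothetical toy data, not the Exathlon set).
windows = np.array([
    [1.0,   2.0,   3.0],    # small-magnitude window
    [100.0, 200.0, 300.0],  # 100x larger window, same shape
])

# Per-window z-score (axis=1), as in models/PCA.py: each row is
# standardized on its own, so both windows collapse to the same
# values and the 100x magnitude difference is erased.
per_window = zscore(windows, axis=1, ddof=1)
print(np.allclose(per_window[0], per_window[1]))  # True

# Column-wise z-score (axis=0) standardizes each position across
# windows instead, so the two windows remain distinguishable.
per_column = zscore(windows, axis=0, ddof=0)
print(np.allclose(per_column[0], per_column[1]))  # False
```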

Also, I think normalization might not be needed for IsolationForest at all. For the other methods, I can't tell for sure without looking into them more closely.

Fix

Two parts to this:

  1. In my opinion, PCA should not have both "normalization" and "standardization". I think removing the "normalization" code and renaming "standardization" to "normalization" would be appropriate for consistency with the other methods, even though "standardization" is actually the better term here.
  2. I think `X = zscore(X, axis=1, ddof=1)` is a bug, at least for some methods (PCA, IForest). Naively, I believe the changes from a79f315 could be replaced by using `StandardScaler`, as in `standardizer`, along the columns for all the methods involved.
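A minimal sketch of what I mean by column-wise standardization in No. 2, using scikit-learn's `StandardScaler` directly on hypothetical windowed data (the project's `standardizer` helper would presumably be used in practice):

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Hypothetical windowed data: rows are sliding windows, columns are
# (feature, time-offset) positions within a window.
X = np.array([
    [1.0, 10.0, 100.0],
    [2.0, 20.0, 200.0],
    [3.0, 30.0, 300.0],
])

# Column-wise standardization: each column gets zero mean and unit
# variance computed ACROSS windows, so relative magnitudes between
# windows are preserved rather than erased per window.
scaler = StandardScaler().fit(X)
X_std = scaler.transform(X)
print(np.allclose(X_std.mean(axis=0), 0.0))  # True
print(np.allclose(X_std.std(axis=0), 1.0))   # True
```

Keeping the fitted scaler around (as `standardizer(X, keep_scalar=True)` already does) also allows test data to be transformed with the training statistics.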

I can prepare a PR for either 1. alone, or for both 1. and 2., if that is welcome.

PS: thank you for an awesome project of such a large scope <3
