Tomoya Oda

Kaggle SETI 59th Solution

This article is translated from my Japanese tech blog.
https://tmyoda.hatenablog.com/entry/20210819/1629384283

About the SETI Competition

https://www.kaggle.com/competitions/seti-breakthrough-listen

In this competition, you are given a spectrogram of a signal and have to predict whether it contains an anomalous signal.
(The data used in this competition was artificially generated by a simulator.)

Pipeline

(Figure: overall pipeline)

Augmentation

I didn't have enough time to investigate augmentations thoroughly. For now, I used the following four, plus mixup. I don't know which of them is actually effective...

  • vflip
  • shift_scale_rotate
  • motion_blur
  • spec_augment

I wanted to use SpecAugment with albumentations, so I created a custom transform class as follows.

import numpy as np
from albumentations.core.transforms_interface import ImageOnlyTransform


class SpecAugment(ImageOnlyTransform):
    def __init__(self, alpha=0.1, **kwargs):
        super(SpecAugment, self).__init__(**kwargs)
        self.spec_alpha = alpha

    def apply(self, img, **params):
        x = img
        # zero out a random band along the first axis, up to alpha * height wide
        t0 = np.random.randint(0, x.shape[0])
        delta = np.random.randint(0, int(x.shape[0] * self.spec_alpha))
        x[t0:min(t0 + delta, x.shape[0])] = 0
        # zero out a random band along the second axis, up to alpha * width wide
        t0 = np.random.randint(0, x.shape[1])
        delta = np.random.randint(0, int(x.shape[1] * self.spec_alpha))
        x[:, t0:min(t0 + delta, x.shape[1])] = 0
        return x

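For reference, the four augmentations can be composed in albumentations roughly like this. This is only a sketch using the SpecAugment class above; the parameters and probabilities here are placeholders, not the exact values I tuned. (Mixup is applied at the batch level during training, so it is not part of this transform pipeline.)

import albumentations as A

train_transform = A.Compose([
    A.VerticalFlip(p=0.5),
    A.ShiftScaleRotate(shift_limit=0.0625, scale_limit=0.1, rotate_limit=10, p=0.5),
    A.MotionBlur(blur_limit=5, p=0.5),
    SpecAugment(alpha=0.1, p=0.5),
])

# usage: augmented = train_transform(image=spectrogram)["image"]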

Test Time Augmentation (TTA)

Since I applied four augmentations this time, I decided to perform TTA 16 times. I chose 16 because I wanted every augmentation to be applied at least once to each image during TTA.

For example, with 16 TTA rounds, 4 types of augmentation, and each augmentation applied with probability p = 0.5, the probability that every augmentation is applied at least once is (1 - (1 - p)^N)^K, where N is the number of TTA rounds and K is the number of augmentations:

TTA: 16, augmentations: 4
(1 - (1 - 0.5)^16)^4 ≈ 0.9999

TTA: 4, augmentations: 4
(1 - (1 - 0.5)^4)^4 ≈ 0.7725
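As a quick sanity check, the numbers can be computed directly. This small sketch assumes each augmentation fires independently with probability p in every TTA round:

def prob_all_applied(n_tta, n_aug, p=0.5):
    # probability that each of the n_aug augmentations is applied
    # at least once across n_tta independent TTA rounds
    return (1 - (1 - p) ** n_tta) ** n_aug

print(prob_all_applied(16, 4))  # ~0.9999
print(prob_all_applied(4, 4))   # ~0.7725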

Resizing Network

This model gave my best score so far.
I believe it would be better to feed the image in without resizing, but my GPU doesn't have enough memory.
To input the image without resizing, I would need to reduce the batch size.

However, with imbalanced data like this (9:1), a small batch size can lead to batches that contain only one class.

So, I decided to train with the largest possible image size using this model.
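To illustrate the idea, here is a minimal sketch of a learnable resizing module placed in front of a CNN backbone: the input is downsampled with bilinear interpolation plus a small learnable residual. The layer sizes, output resolution, and residual design below are placeholder assumptions, not my exact architecture.

import torch
import torch.nn as nn
import torch.nn.functional as F

class LearnableResizer(nn.Module):
    """Bilinear downsampling plus a small learnable residual correction."""
    def __init__(self, out_size=(512, 512), channels=1, hidden=16):
        super().__init__()
        self.out_size = out_size
        self.conv = nn.Sequential(
            nn.Conv2d(channels, hidden, kernel_size=3, padding=1),
            nn.BatchNorm2d(hidden),
            nn.LeakyReLU(0.2),
            nn.Conv2d(hidden, channels, kernel_size=3, padding=1),
        )

    def forward(self, x):
        resized = F.interpolate(x, size=self.out_size, mode="bilinear", align_corners=False)
        return resized + self.conv(resized)

class ResizerModel(nn.Module):
    def __init__(self, backbone):
        super().__init__()
        self.resizer = LearnableResizer()
        # backbone should accept single-channel input, e.g. a timm model with in_chans=1
        self.backbone = backbone

    def forward(self, x):
        return self.backbone(self.resizer(x))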

Training

In this competition, the dataset was reset once and completely replaced partway through. So I decided to use the old (pre-reset) data for pre-training. Doing this slightly improved both the LB and CV scores.

Also, the pre-training uses a hold-out split, while the fine-tuning uses 4-fold CV.
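In code, the fine-tuning split looks roughly like this. This is a sketch with toy metadata; in the actual pipeline, each fold is initialized from the weights pre-trained on the old data.

import numpy as np
import pandas as pd
from sklearn.model_selection import StratifiedKFold

# toy stand-in for the post-reset training metadata (roughly 9:1 imbalance)
df = pd.DataFrame({"id": range(1000), "target": np.random.binomial(1, 0.1, 1000)})

skf = StratifiedKFold(n_splits=4, shuffle=True, random_state=42)
for fold, (train_idx, valid_idx) in enumerate(skf.split(df, df["target"])):
    # load the hold-out pre-trained weights here, then fine-tune on this fold
    print(f"fold {fold}: train={len(train_idx)}, valid={len(valid_idx)}")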

Model

I ran into a problem where models would not learn when scaled up (probably due to a bad learning rate and scheduler), even though I tried various architectures (NFNet, VOLO, Swin, ...).

So, I decided to use efficientnetv2_s and efficientnetv2_m, which had good scores.
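For reference, creating these backbones with timm looks something like the snippet below. The exact model-name strings and the input-channel setting depend on the timm version and on how the cadence is stacked, so treat these as assumptions.

import timm

# single-output binary classifier; in_chans=1 assumes the cadence snippets
# are stacked into a single-channel spectrogram image
model_s = timm.create_model("tf_efficientnetv2_s", pretrained=True, in_chans=1, num_classes=1)
model_m = timm.create_model("tf_efficientnetv2_m", pretrained=True, in_chans=1, num_classes=1)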

What I tried

1st Place Solution

I was surprised by the first-place solution.
I think its idea of removing the background can be used in other competitions dealing with spectrograms.

https://www.kaggle.com/c/seti-breakthrough-listen/discussion/266385
