Fitting a Keras classifier isn't as straightforward as it is for sklearn classifiers (like the MLPClassifier). After some struggle through the docs and GitHub issues, I figured out a reusable solution and thought I'd share it here.
Here's what my Keras classification model looks like; I'll wrap it in a function and add a few comments:
# Making an MLP with Keras for classification - predicts probabilities for 4 classes
import tensorflow as tf
from tensorflow.keras.optimizers import SGD

def get_clf_model(input_size, output_size):
    # Use a stochastic gradient descent optimizer
    sgd = SGD(learning_rate=0.01, decay=1e-6, momentum=0.9, nesterov=True)
    # Build the model layer by layer
    model = tf.keras.Sequential([
        # First layer
        tf.keras.layers.Dense(128, activation='relu', input_shape=[input_size],
                              kernel_regularizer='l2'),
        # Dropout layer to reduce overfitting
        tf.keras.layers.Dropout(0.5),
        # Second layer
        tf.keras.layers.Dense(16, activation='sigmoid', kernel_regularizer='l2'),
        # Another dropout layer
        tf.keras.layers.Dropout(0.5),
        # Final layer, outputs a probability value for every class
        tf.keras.layers.Dense(output_size, kernel_initializer='he_uniform', activation='sigmoid'),
    ])
    model.compile(loss='categorical_crossentropy',
                  optimizer=sgd,
                  metrics=['accuracy'])
    model.summary()
    return model
The loss function used is categorical_crossentropy, which expects the Y-labels in one-hot vector form, but the feature selectors, normalizers, and other transformers expect the Y-labels as a plain 1-D array. Apparently, I can't just write another transformer extending the BaseEstimator or TransformerMixin class, since the transformer API only supports transforming features, not labels.
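To make the mismatch concrete, here's a minimal sketch of the two label formats (the integer labels below are made up for illustration):

import numpy as np
from tensorflow.keras.utils import to_categorical

# Plain integer labels: what the sklearn transformers and selectors work with
y_int = np.array([0, 2, 1, 3])

# One-hot labels: what categorical_crossentropy expects
y_onehot = to_categorical(y_int)
print(y_onehot)
# [[1. 0. 0. 0.]
#  [0. 0. 1. 0.]
#  [0. 1. 0. 0.]
#  [0. 0. 0. 1.]]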
The solution that worked for me was breaking the pipeline in two to fit my classifier, and then re-merging it for later use. Here's how it looks in code:
# training & preprocessing for keras classifier
n_features = X_train.shape[1] // 2
# We need to break pipeline into two steps if the keras model expects
# Y-values in one hot vector form.
steps = [
['scaling', preprocessing.PowerTransformer()], # Scaling
['feature_selection', SelectKBest(score_func=f_regression, k=n_features)], # Feature selection
]
prep_pipeline = Pipeline(steps, verbose=True)
X_train_p = prep_pipeline.fit_transform(X_train, Y_train_lab)
# Add class weight for impbalanced dataset
class_weights = class_weight.compute_class_weight('balanced',
np.unique(Y_train_lab),
Y_train_lab)
clf = get_clf_model(n_features, len(le.classes_))
clf.fit(X_train_p, to_categorical(Y_train_lab), epochs=250, class_weight=dict(enumerate(class_weights)))
# Combine the pipeline
pipeline = Pipeline([
*prep_pipeline.steps,
('clf', clf)
])
# Now we can make predictions like this with the whole pipeline
pipeline.predict(X_val)
Of course, the above hack works because prediction has nothing to do with the Y-labels.
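One caveat: predict here returns per-class probabilities, not class names. If you need the original labels back, a small sketch of the mapping, assuming le is the same fitted LabelEncoder that produced Y_train_lab:

import numpy as np

# Per-class probabilities from the Keras model (one row per sample)
probs = pipeline.predict(X_val)

# Pick the most probable class and map it back to the original label names
pred_idx = np.argmax(probs, axis=1)
pred_labels = le.inverse_transform(pred_idx)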