- When the target (y) is categorical
- Cross-entropy is used as the loss function, and accuracy is commonly used as the evaluation metric.
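To make the two quantities concrete, here is a minimal pure-Python sketch of binary cross-entropy and accuracy. The helper names, the clipping constant `eps`, the 0.5 threshold, and the toy labels and probabilities are illustrative assumptions, not taken from Keras.

```python
import math

def binary_cross_entropy(y_true, y_prob, eps=1e-7):
    """Mean binary cross-entropy between 0/1 labels and predicted probabilities."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)

def accuracy(y_true, y_prob, threshold=0.5):
    """Fraction of samples whose thresholded prediction matches the label."""
    hits = sum(1 for y, p in zip(y_true, y_prob) if (p > threshold) == bool(y))
    return hits / len(y_true)

# Toy values chosen for illustration only
y_true = [1, 0, 1, 1]
y_prob = [0.9, 0.2, 0.6, 0.4]

loss = binary_cross_entropy(y_true, y_prob)
acc = accuracy(y_true, y_prob)
print(loss)  # roughly 0.439
print(acc)   # 0.75
```

The loss penalizes confident wrong predictions more heavily than hesitant ones, while accuracy only counts whether each thresholded prediction is right.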
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
whole_data = load_breast_cancer()
X_data = whole_data.data
y_data = whole_data.target
X_train,X_test,y_train,y_test = train_test_split(X_data,y_data,test_size = 0.3, random_state = 7)
Dataset Description
- The breast cancer dataset has a total of 569 data instances (212 malignant, 357 benign)
- 30 attributes (features) to predict the binary class (M/B)
- Doc: http://scikit-learn.org/stable/modules/generated/sklearn.datasets.load_breast_cancer.html#sklearn.datasets.load_breast_cancer
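The figures above can be checked directly against the loaded dataset. This is a small sketch that assumes scikit-learn is installed; the variable names `data` and `counts` are illustrative. In this dataset, label 0 corresponds to malignant and label 1 to benign.

```python
from collections import Counter
from sklearn.datasets import load_breast_cancer

data = load_breast_cancer()
print(data.data.shape)          # (number of instances, number of features)
print(list(data.target_names))  # class names; list index = label value

# Count how many instances carry each label
counts = Counter(data.target.tolist())
print(counts[0], counts[1])     # malignant count, benign count
```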
print(X_train.shape)
print(X_test.shape)
print(y_train.shape)
print(y_test.shape)
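The shapes printed above follow from simple arithmetic on the split ratio. The sketch below assumes that train_test_split rounds the test count up and gives the remainder to the training set when only test_size is passed; the variable names are illustrative.

```python
import math

n_samples = 569   # instances in the breast cancer dataset
n_features = 30
test_size = 0.3

n_test = math.ceil(test_size * n_samples)  # test count is rounded up
n_train = n_samples - n_test               # remainder goes to training

print((n_train, n_features))  # expected X_train.shape
print((n_test, n_features))   # expected X_test.shape
print((n_train,))             # expected y_train.shape
print((n_test,))              # expected y_test.shape
```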
1. Creating a model
- Same as the regression model at the outset
from keras.models import Sequential
model = Sequential()
1–1. Adding layers
- Keras layers can be added to the model
- Adding layers is like stacking lego blocks one by one
- Since this is a binary classification problem, the output layer should use a sigmoid activation (softmax for multi-class problems)
- Doc: https://keras.io/layers/core/
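To see why sigmoid suits a binary output and softmax a multi-class one, here is a minimal pure-Python sketch of both functions. The function names and the sample inputs are illustrative; Keras applies its own implementations internally.

```python
import math

def sigmoid(z):
    """Squash a single score into (0, 1); read as P(class = 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def softmax(zs):
    """Turn a list of scores into probabilities that sum to 1."""
    m = max(zs)                           # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in zs]
    s = sum(exps)
    return [e / s for e in exps]

print(sigmoid(0.0))   # 0.5: a score of zero means "undecided"
print(sigmoid(4.0))   # close to 1

probs = softmax([2.0, 1.0, 0.1])  # three hypothetical class scores
print(probs)
print(sum(probs))     # sums to 1 up to floating-point rounding
```

A single sigmoid unit is enough for two classes because the probability of the other class is just one minus the output.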
from keras.layers import Dense, Activation
# Keras model with three Dense layers of 10 neurons each, followed by a 1-neuron output layer
model.add(Dense(10, input_shape = (30,))) # First layer => input_shape should be explicitly designated (30 features)
model.add(Activation('sigmoid'))
model.add(Dense(10)) # Hidden layer => only output dimension should be designated
model.add(Activation('sigmoid'))
model.add(Dense(10)) # Hidden layer => only output dimension should be designated
model.add(Activation('sigmoid'))
model.add(Dense(1)) # Output layer => output dimension = 1 since it is a binary classification problem
model.add(Activation('sigmoid'))
# This is equivalent to the above code block
# Use one block or the other; start from a fresh model so the layers are not added twice
model = Sequential()
model.add(Dense(10, input_shape = (30,), activation = 'sigmoid'))
model.add(Dense(10, activation = 'sigmoid'))
model.add(Dense(10, activation = 'sigmoid'))
model.add(Dense(1, activation = 'sigmoid'))
1–2. Model compile
- Keras model should be “compiled” prior to training
- Types of loss (function) and optimizer should be designated
- Doc (optimizers): https://keras.io/optimizers/
- Doc (losses): https://keras.io/losses/
from keras import optimizers
sgd = optimizers.SGD(learning_rate = 0.01) # stochastic gradient descent optimizer (older Keras versions name this argument lr)
model.compile(optimizer = sgd, loss = 'binary_crossentropy', metrics = ['accuracy'])
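To give a feel for what the optimizer does with the loss, here is a rough sketch of one stochastic gradient descent step for a single sigmoid unit trained with binary cross-entropy. The weights, input, and label are made-up values, and the real network repeats this across all layers via backpropagation; only the learning rate of 0.01 comes from the code above.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def bce(y, p):
    """Binary cross-entropy for one sample."""
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))

def predict(w, b, x):
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

# Hypothetical parameters and one training sample
w, b = [0.5, -0.3], 0.0
x, y = [1.0, 2.0], 1
lr = 0.01  # same learning rate as the SGD optimizer above

p = predict(w, b, x)
loss_before = bce(y, p)

# For sigmoid + cross-entropy, dL/dw_i = (p - y) * x_i and dL/db = (p - y)
grad_w = [(p - y) * xi for xi in x]
grad_b = p - y

# SGD update: move each parameter a small step against its gradient
w = [wi - lr * gi for wi, gi in zip(w, grad_w)]
b = b - lr * grad_b

loss_after = bce(y, predict(w, b, x))
print(loss_before, loss_after)  # the loss on this sample decreases slightly
```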
# Summary of the model
model.summary()
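The parameter counts reported by the summary can be reproduced by hand. This sketch assumes the architecture described above was added to the model once (30 inputs, three Dense layers of 10 units, one output unit); the helper `dense_params` is a hypothetical name. Activation layers contribute no parameters.

```python
# Layer widths: 30 input features, three Dense(10) layers, one Dense(1) output
layer_sizes = [30, 10, 10, 10, 1]

def dense_params(n_in, n_out):
    """A Dense layer has one weight per input-output pair plus one bias per output."""
    return n_in * n_out + n_out

per_layer = [dense_params(a, b) for a, b in zip(layer_sizes, layer_sizes[1:])]
total_params = sum(per_layer)

print(per_layer)     # [310, 110, 110, 11]
print(total_params)  # 541
```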
2. Training
- Training the model with training data provided
model.fit(X_train, y_train, batch_size = 50, epochs = 100, verbose = 1)
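As a quick sanity check on what batch_size and epochs imply, the sketch below counts the weight updates performed by the call above. It assumes 398 training rows (the 70% share of 569 instances) and that the final, smaller batch of each epoch is still used.

```python
import math

n_train = 398     # assumed number of rows in X_train
batch_size = 50
epochs = 100

steps_per_epoch = math.ceil(n_train / batch_size)             # batches per epoch
last_batch = n_train - (steps_per_epoch - 1) * batch_size     # size of the final batch
total_updates = steps_per_epoch * epochs                      # gradient steps overall

print(steps_per_epoch)  # 8
print(last_batch)       # 48
print(total_updates)    # 800
```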
3. Evaluation
- A Keras model can be evaluated with the evaluate() function
- Evaluation results are returned as a list: the loss first, followed by each metric passed to compile()
- Doc (metrics): https://keras.io/metrics/
results = model.evaluate(X_test, y_test)
print(model.metrics_names) # list of metric names the model is employing
print(results) # actual figure of metrics computed
print('loss: ', results[0])
print('accuracy: ', results[1])
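Rather than indexing the list by position, the names and values can be paired up. The sketch below uses stand-in lists so that it runs without Keras; the numbers are placeholders, not measured results, and the exact metric name ('accuracy' or 'acc') depends on the Keras version.

```python
# Stand-ins for model.metrics_names and the list returned by model.evaluate()
metrics_names = ['loss', 'accuracy']  # hypothetical; older Keras reports 'acc'
results = [0.35, 0.91]                # placeholder values for illustration only

# Pair each metric name with its value
report = dict(zip(metrics_names, results))
for name, value in report.items():
    print(name + ': ', value)
```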
Full code on Google Colaboratory link.