Deep Learning – Classification Example

Deep Learning Classification Problem - Example with sklearn breast cancer dataset - Model Creation and Evaluation

Hello Everyone!!! Its an immense pleasure to write today as this is the first post I am able to write in 2021. Happy New Year!!! 🥳 🎂 🎉 In this article we are going to see the continuation of Deep Learning techniques. We are going to see an Deep Learning model with a Classification Example.

In our last article, we learned to use a simple Neural Network to solve regression problems – Artificial Neural Network Explained with an Regression Example.

If you missed the prequels, please check below:

Artificial Intelligence – A brief Introduction and history of Deep Learning

How Neurons work? and How Artificial Neuron mimics neurons of human brain?

Artificial Neural Network Explained with an Regression Example

Classification Example Dataset:

For easy use and easy to download, the breast cancer dataset from sklearn.datasets has been taken.

from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split

whole_data = load_breast_cancer()

X_data =
y_data =

X_train, X_test, y_train, y_test = train_test_split(X_data, y_data, test_size = 0.3, random_state = 7) 

Finally we have the data set of 398 records for training and 171 records for testing with 30 features. Even though this is a very less data for deep learning, we are using this for experimental purpose.

Documentation of dataset:

X_train.shape, X_test.shape, y_train.shape, y_test.shape

((398, 30), (171, 30), (398,), (171,))

Model Creation:

from keras.models import Sequential

model = Sequential()

A sequential model is created. Lets add deep learning layers to it.

# Keras model with two hidden layer with 10 neurons each 
model.add(Dense(10, input_shape = (30,)))    
# Input layer
# Hidden layer 
# Hidden layer 
# Output layer => output dimension = 1 since it is regression problem

Here it can be noticed the input layer is having input shape of 30. And also 10 is mentioned in first line itself.

Now most of us may have a question, which is the exact input shape? 30 or 10?

The first layer that gets input is the input layer which is having 30 neurons here. each 1 input node is for each feature. the next layer is the first hidden layer which is having 10 neurons.

Image by Author – Input layer and first hidden layer

Also we can see that the output layer is having only 1 neuron. Why it is not having 2 neurons as the target variable is a binary class (1- malign, 0 – benign)?

This one neuron output value will be the probability value between 0 and 1 as we are using sigmoid activation in output layer. The probability value can be determined as 0 or 1n based on the threshold of 0.5.

Image by Author – Last hidden layer and output layer

Now we have constructed the neural network. Lets compile it with optimizers and loss function.

from keras import optimizers

sgd = optimizers.SGD(lr = 0.01)    # stochastic gradient descent optimizer

model.compile(optimizer = sgd, loss = 'binary_crossentropy', metrics = ['accuracy'])

Stochastic gradient descent is used as an optimizer which will be used when Back propagation and “binary_crossentropy” loss function is used for evaluation.

To get the entire structure as a summary, we can use summary() function of the model.


Layer (type)                 Output Shape              Param #   
dense_9 (Dense)              (None, 10)                310       
activation_4 (Activation)    (None, 10)                0         
dense_10 (Dense)             (None, 10)                110       
activation_5 (Activation)    (None, 10)                0         
dense_11 (Dense)             (None, 10)                110       
activation_6 (Activation)    (None, 10)                0         
dense_12 (Dense)             (None, 1)                 11        
activation_7 (Activation)    (None, 1)                 0         
dense_13 (Dense)             (None, 10)                20        
dense_14 (Dense)             (None, 10)                110       
dense_15 (Dense)             (None, 10)                110       
dense_16 (Dense)             (None, 1)                 11        
Total params: 792.0
Trainable params: 792.0
Non-trainable params: 0.0

This summary helps us to understand the total number of weights and biases. Total params: 792.0

Train the Model:

history =, y_train, batch_size = 50, epochs = 100, verbose = 1)

The above code will train the model in 50 batches and 100 epochs.

1 iteration – A data point or a set of data points sent to a neural network once.

batch_size = 50. That means the entire training data points will be divided and for each iteration 50 data points will be given in bulk.

epochs = 100. One epoch mentions that the entire data set is sent to training 1 time. I.e. The neural network will trained with 398 datapoints for 100 times. Each time the weights will be updated using Backpropagation.

So that the final weights will be the optimized approximately. Increasing the number of epochs may or may not optimize it based on the data set, learning rate, activation functions, etc.

As we have given verbose as 1, we will be able to see the accuracy and loss in each epoch.

Image by Author – Logs of Model training

Here you can see that we higher loss and less accuracy.


The model’s evaluation function. The documentation can be found here

results = model.evaluate(X_test, y_test)

32/171 [====>.........................] - ETA: 0s

Now we have the results. This object will have 2 metrics Loss and Accuracy.

print(model.metrics_names)     # list of metric names the model is employing
print(results)                 # actual figure of metrics computed
print('loss: ', results[0])
print('accuracy: ', results[1])

['loss', 'acc']
[0.6395693376050358, 0.6783625724022848]
loss:  0.6395693376050358
accuracy:  0.6783625724022848

To get a clear view about what happened in the whole training process, we can use the object received from statement. We named it as history.

History of the training process

We can use matplotlib for visualization:

from matplotlib import pyplot as plt

plt.legend(['Accuracy', 'Loss'], loc = 'upper left')

If we have added validation split in fit function, we will get both training and validation results.

history =, y_train, validation_split = 0.3, epochs = 100, verbose = 0)
plt.legend(['training', 'validation'], loc = 'upper left')

You can see that the validation and training accuracy is not improved in this new model. We used the same training data for validation split also. So the training data was not enough.


Now you may think that, all that CPU time and GPU time was wasted for less accuracy? Even ML model gave better results!!!

Yes, I too thought the same when I see this accuracy for first time.

We haven’t done one thing yet which is the interesting part of model creation. Yes! Tuning!

This is a base model. We can still tune this model with different number of neurons, layers, epochs, different activation functions, optimizers and loss functions.

Also we can increase the data set to get better results.

I am planning to post an article to tune this model. Will see you in next post.

Thank you for reading our article and hope you enjoyed it. 😊

Like to support? Just click the like button ❤️.

Happy Learning! 👩‍💻

Asha Ponraj
Asha Ponraj

Data science and Machine Learning enthusiast | Software Developer | Blog Writter

Articles: 86


Leave a Reply

Your email address will not be published. Required fields are marked *