Sports Image Classification

Built an image classification model that predicts the sport shown in an image.

Ashwin Mathur

Jun 2, 2023 • 3 min read

Have you ever wondered how machines can recognize sports just by looking at images? Well, it's all thanks to the power of Convolutional Neural Networks (CNNs). In this article, we will explore how CNNs can be used to classify sports images and compare the performance of different CNN architectures.

Dataset

The dataset used for this project is a collection of images representing 100 different types of sports and activities. The sports range from traditional sports like "archery", "arm wrestling", "bowling", "football", "water polo", "weightlifting" to non-traditional ones like "wingsuit flying" and "nascar racing". The goal is to predict the correct sport based on the image.

The dataset consists of 13,572 training, 500 test, and 500 validation images. Each image is 224 x 224 pixels, and the images are split into train, test, and valid directories. The dataset can be downloaded from Kaggle.
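Before training, it can help to confirm that a downloaded copy matches these counts. The snippet below is a minimal sketch that assumes each split directory contains one sub-folder per sport (for example train/archery/...); that layout, the accepted file extensions, and the helper names count_images and summarize are assumptions for illustration and are not taken from the project code.

```python
from pathlib import Path

# Assumed image extensions; adjust if the downloaded files differ.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def count_images(root, splits=("train", "test", "valid")):
    """Return {split: {class_name: image_count}}.

    Assumes the layout root/<split>/<class_name>/<image file>.
    """
    root = Path(root)
    counts = {}
    for split in splits:
        split_dir = root / split
        per_class = {}
        if split_dir.is_dir():
            for class_dir in sorted(p for p in split_dir.iterdir() if p.is_dir()):
                per_class[class_dir.name] = sum(
                    1
                    for f in class_dir.iterdir()
                    if f.is_file() and f.suffix.lower() in IMAGE_SUFFIXES
                )
        counts[split] = per_class
    return counts


def summarize(counts):
    """Collapse per-class counts into {split: (num_classes, num_images)}."""
    return {
        split: (len(per_class), sum(per_class.values()))
        for split, per_class in counts.items()
    }
```

If the layout assumption holds, summarize(count_images("path/to/dataset")) should report 100 classes per split and the image totals quoted above.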

Experiments

To classify the sports images, we compared the performance of five different CNN architectures: a custom CNN, InceptionV3, ResNet50V2, MobileNetV2, and EfficientNetB3. For the pre-trained models, we started from ImageNet weights and added two hidden layers of 256 and 128 neurons respectively, with leaky ReLU activations. Dropout layers with p=0.1 were added to reduce overfitting. Training ran for up to 50 epochs, with early stopping at a patience of 2 epochs. A batch size of 32 was used for training, and Sparse Categorical Cross-Entropy was used as the loss function.
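To make two of these training settings concrete, here is a small, framework-free sketch of what sparse categorical cross-entropy computes and how early stopping with a patience of 2 decides when to halt. It is an illustration rather than the project's training code: the function names are hypothetical, and monitoring validation loss is an assumption, since the monitored quantity is not stated above.

```python
import math


def sparse_categorical_crossentropy(probabilities, labels):
    """Mean negative log-probability the model assigns to the true class.

    `probabilities` holds one row of class probabilities per image, and
    `labels` holds integer class indices (not one-hot vectors), which is
    what makes the loss "sparse".
    """
    eps = 1e-7  # guards against log(0)
    losses = [
        -math.log(max(row[label], eps))
        for row, label in zip(probabilities, labels)
    ]
    return sum(losses) / len(losses)


def stopping_epoch(val_losses, patience=2, max_epochs=50):
    """Return the 1-based epoch at which training would end.

    Training stops once the monitored loss (assumed: validation loss) has
    failed to improve on its best value for `patience` consecutive epochs,
    or when `max_epochs` is reached.
    """
    best = math.inf
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses[:max_epochs], start=1):
        if loss < best:
            best = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch
    return min(len(val_losses), max_epochs)
```

For example, with validation losses of 1.0, 0.8, 0.9, 0.85, the best value (0.8) is reached at epoch 2 and the next two epochs fail to beat it, so training would stop after epoch 4. This is why the models described below finish well before the 50-epoch limit.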

Custom CNN

We started with a baseline custom CNN model with 3 convolution layers and 3 dense layers. A kernel of size 3 x 3 was used for all the convolution layers. Training the model with an Adam optimizer with a learning rate of 0.001 for 47 epochs yielded an Accuracy of 56.44%, F1-Score of 48.48%, and an ROC-AUC Score of 49.46%.

InceptionV3

The InceptionV3 model was initialized with pre-trained ImageNet weights. Only the added Dense layers were trained; the pre-trained layers were kept frozen. Training the model with an Adam optimizer with a learning rate of 0.001 for 22 epochs yielded an Accuracy of 68.92%, F1-Score of 64.92%, and an ROC-AUC Score of 66.64%.

ResNet50V2

The ResNet50V2 model was initialized with pre-trained ImageNet weights. Only the added Dense layers were trained; the pre-trained layers were kept frozen. Training the model with an Adam optimizer with a learning rate of 0.001 for 16 epochs yielded an Accuracy of 72.88%, F1-Score of 70.67%, and an ROC-AUC Score of 69.72%.

MobileNetV2

The MobileNetV2 model was initialized with pre-trained ImageNet weights, and all the layers were fine-tuned. Training the model with an Adam optimizer with a learning rate of 0.001 for 8 epochs yielded an Accuracy of 86.68%, F1-Score of 86.79%, and an ROC-AUC Score of 88.36%.
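The difference between training only the Dense layers and fine-tuning all the layers can be sketched as follows. This assumes TensorFlow/Keras, which the article does not name explicitly; the function build_classifier, its train_base flag, the global-average-pooling step, and the placement of a Dropout layer after each hidden layer are illustrative assumptions, not the project's actual code. Input preprocessing is omitted for brevity.

```python
import tensorflow as tf

NUM_CLASSES = 100  # the dataset covers 100 sports


def build_classifier(weights="imagenet", train_base=True):
    """MobileNetV2 backbone plus the 256/128 dense head described earlier.

    train_base=True  -> every layer is updated (full fine-tuning).
    train_base=False -> the backbone is frozen; only the Dense head trains.
    """
    base = tf.keras.applications.MobileNetV2(
        include_top=False,          # drop the original ImageNet classifier
        weights=weights,
        input_shape=(224, 224, 3),
        pooling="avg",              # assumption: global average pooling
    )
    base.trainable = train_base

    inputs = tf.keras.Input(shape=(224, 224, 3))
    x = base(inputs)
    for units in (256, 128):
        x = tf.keras.layers.Dense(units)(x)
        x = tf.keras.layers.LeakyReLU()(x)
        x = tf.keras.layers.Dropout(0.1)(x)  # assumed placement
    outputs = tf.keras.layers.Dense(NUM_CLASSES, activation="softmax")(x)

    model = tf.keras.Model(inputs, outputs)
    model.compile(
        optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),
        loss=tf.keras.losses.SparseCategoricalCrossentropy(),
        metrics=["accuracy"],
    )
    return model


# Early stopping with a patience of 2 epochs (monitors val_loss by default).
early_stopping = tf.keras.callbacks.EarlyStopping(patience=2)
```

With the backbone frozen, only the three Dense layers contribute trainable weights; with train_base=True the optimizer also updates the convolutional layers, which is the setting described for MobileNetV2 and EfficientNetB3. Training would then be a call such as model.fit(train_ds, validation_data=valid_ds, epochs=50, callbacks=[early_stopping]), where train_ds and valid_ds stand for hypothetical datasets batched at 32 images.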

EfficientNetB3

The EfficientNetB3 model was initialized with pre-trained ImageNet weights, and all the layers were fine-tuned. Training the model with an Adam optimizer with a learning rate of 0.001 for 18 epochs yielded an Accuracy of 92.72%, F1-Score of 91.76%, and an ROC-AUC Score of 96.92%.

Results

The best results were obtained using a fine-tuned EfficientNetB3 model, which achieved an Accuracy of 92.72%, F1-Score of 91.76%, and an ROC-AUC Score of 96.92%. The results from all the models have been summarized in the table below:

Model	Accuracy (%)	F1-Score (%)	ROC-AUC Score (%)
Custom CNN	56.44	48.48	49.46
InceptionV3 (fine-tuned)	68.92	64.92	66.64
ResNet50V2 (fine-tuned)	72.88	70.67	69.72
MobileNetV2 (fine-tuned)	86.68	86.79	88.36
EfficientNetB3 (fine-tuned)	92.72	91.76	96.92
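For readers who want to see how two of these metrics are defined, the sketch below computes accuracy and a macro-averaged F1-Score from true and predicted class indices. Macro averaging is an assumption here, since the averaging scheme behind the reported scores is not stated; ROC-AUC is left out because it needs the full predicted probabilities rather than hard labels. The function names are hypothetical.

```python
def accuracy(y_true, y_pred):
    """Fraction of images whose predicted class matches the true class."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (macro averaging, assumed)."""
    classes = sorted(set(y_true) | set(y_pred))
    f1_scores = []
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denominator = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never appears.
        f1_scores.append(2 * tp / denominator if denominator else 0.0)
    return sum(f1_scores) / len(f1_scores)
```

Because macro averaging weights every sport equally, a model that does poorly on a handful of classes can show an F1-Score noticeably below its accuracy.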

Deploying the Model

A web app was made using Streamlit to make predictions for new images using the best model. The live app can be viewed here.
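An app like this has to turn the model's output, one probability per sport, into something readable. The helper below is a minimal sketch of that step; top_k_predictions is a hypothetical name, and the app's actual code may present results differently.

```python
def top_k_predictions(probabilities, class_names, k=3):
    """Return the k most likely (class_name, probability) pairs, best first."""
    ranked = sorted(
        zip(class_names, probabilities),
        key=lambda pair: pair[1],
        reverse=True,
    )
    return ranked[:k]
```

Showing the top few candidates instead of a single label gives the user a sense of how confident the model is for a given image.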

Run Locally

All the code for this project can be found in this GitHub repository. To run the app locally, follow the instructions provided in the repository.

Install required libraries:

  pip install -r streamlit/requirements.txt

Fine-tune models:

  python sports-clf-custom-cnn.py
  python sports-clf-inception.py
  python sports-clf-resnet.py
  python sports-clf-mobilenet.py
  python sports-clf-efficientnet.py

Generate predictions:

  python sports-clf-final-predictions.py

Conclusion

In conclusion, we have seen how CNNs can be used to classify sports images with high accuracy. We compared the performance of five different CNN architectures and found that the fine-tuned EfficientNetB3 model achieved the best results, with an Accuracy of 92.72%. A model like this could be used to classify sports images, potentially in real time, in applications such as sports analytics, sports broadcasting, and sports betting.