Build a Weapon Detection System on the Nvidia Jetson Platform Leveraging Wavelabs Insight Framework

20 Jul 2022 | Vijay Morampudi

Public security threats are an increasing concern for law enforcement, civilians, and first responders. In most public attacks, first responders can act only after the attack is underway, which forces them to use lethal weapons such as firearms in defense.

Business Problem:

Our client envisioned a security surveillance system, for both home and commercial use, that can detect openly carried weapons (various types of guns) from a surveillance camera feed and raise alerts in real time. The video data must be processed locally, on the edge, and the results pushed to a mobile app.

Working with our client, we built an AI solution for smart cameras: a threat detection and response monitoring system that can detect weapons in real time.

In this blog post, we talk about our journey of leveraging the Insight framework to build the solution:

Solution Approach:

Below is a high-level view of the end-to-end solution deployed to the Nvidia Jetson Xavier-NX:

  • An 8MP camera captures the continuous video feed. HD recording is used so that weapons can be detected from a distance
  • Nvidia’s hardware-accelerated GStreamer elements process the raw camera video stream. Frames extracted from the video are resized
  • Frames are sent to the weapon detection models running on the Jetson Xavier-NX edge device
  • An alert is raised when a gun detection exceeds a threshold, and the first responders are notified
  • The video stream is encoded using Nvidia’s hardware codecs (H.264)
  • The detected frames are published as an encoded video stream over the network, via the RTMP protocol, to an Ant Media server on AWS for remote monitoring
Nvidia Jetson Xavier NX
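
The alerting step in the list above can be sketched as follows. This is a minimal illustration, not the deployed logic: the post only says alerts are "based on threshold", so the confidence threshold value, the sliding window of recent frames, the `min_hits` count, and the label names are all assumptions introduced here to show one way of suppressing single-frame false alarms.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Detection:
    label: str         # class name predicted by the detector
    confidence: float  # detector score in [0, 1]


class AlertGate:
    """Fire an alert only when enough recent frames contain a confident gun detection."""

    def __init__(self, threshold=0.6, window=10, min_hits=5,
                 gun_labels=("long_gun", "hand_gun")):
        # All four defaults are illustrative assumptions, not the deployed settings.
        self.threshold = threshold
        self.min_hits = min_hits
        self.gun_labels = set(gun_labels)
        self.history = deque(maxlen=window)  # one boolean per recent frame

    def update(self, detections):
        """Record one frame's detections and return True if an alert should fire."""
        hit = any(
            d.label in self.gun_labels and d.confidence >= self.threshold
            for d in detections
        )
        self.history.append(hit)
        return sum(self.history) >= self.min_hits
```

Calling `update()` once per processed frame keeps the decision cheap enough to run alongside inference on the edge device. Requiring several hits inside the window trades a short delay for fewer spurious notifications.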

Datasets Discovery:

We recorded video of people carrying authorized weapons (long guns, handguns, shotguns, knives, etc.). Surveillance CCTV cameras were positioned in indoor and outdoor environments and run in day and night modes. In total, around 60 hours of video were recorded, with security personnel carrying their authorized weapons to mimic a threat.

Dataset characteristics | Description
Original size | 220,000 images

The dataset was augmented to supplement the original data, and a data over-sampling technique was applied to balance it.

Dataset characteristics | Description
Augmented size | 1.2 M images
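
As a rough illustration of over-sampling, the sketch below randomly duplicates samples from under-represented classes until every class matches the largest one. The post does not say which over-sampling technique was used, so random duplication is an assumption, as is the simplified `(image_id, class_name)` sample layout (real detection images can contain several classes at once).

```python
import random
from collections import defaultdict


def oversample(samples, seed=0):
    """Duplicate minority-class samples until all classes are the same size.

    `samples` is a list of (image_id, class_name) pairs; this layout is a
    hypothetical simplification for illustration.
    """
    if not samples:
        return []
    rng = random.Random(seed)  # fixed seed keeps the result reproducible
    by_class = defaultdict(list)
    for image_id, class_name in samples:
        by_class[class_name].append((image_id, class_name))

    target = max(len(items) for items in by_class.values())
    balanced = []
    for items in by_class.values():
        balanced.extend(items)
        # Draw the missing copies with replacement from the same class.
        balanced.extend(rng.choices(items, k=target - len(items)))
    return balanced
```

In practice the duplicated images would usually also pass through augmentation, so that the copies are not pixel-identical.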

Image Labelling Workflow on AWS:

We leveraged the Wavelabs AI Insight framework to set up the image labelling workflow on AWS.

Below is a high-level description of the workflow:

  • Video recordings (.mp4) are stored in S3
  • Built code to filter out frames that did not meet the quality bar
  • Built code to select frames and generate unique IDs; 1 out of every 5 frames is selected
  • The CVAT tool is set up and a labelling task is created
  • Defined a set of labelling instructions (exclude frames where gun visibility is < 20%)
  • Built code to augment the images
  • Built code to convert annotations to YOLO format; labelled frames and annotations are stored in S3
Image Labelling Workflow on AWS
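
Two of the steps above, frame selection and YOLO conversion, can be sketched in a few lines. This is an illustrative sketch rather than the code used in the project: the function names are hypothetical, UUIDs are an assumed choice for the unique IDs, and the input boxes are assumed to be pixel-space corner coordinates. The YOLO label layout itself (`class x_center y_center width height`, normalized by image size) is the standard one.

```python
import uuid


def select_frames(frame_indices, step=5):
    """Keep 1 of every `step` frames and attach a unique ID to each kept frame."""
    return [(uuid.uuid4().hex, idx) for idx in frame_indices if idx % step == 0]


def to_yolo(class_id, xmin, ymin, xmax, ymax, img_w, img_h):
    """Convert a pixel-space corner box into one YOLO label line.

    YOLO expects the box centre and size, each divided by the image
    width or height so that all four values fall in [0, 1].
    """
    x_center = (xmin + xmax) / 2.0 / img_w
    y_center = (ymin + ymax) / 2.0 / img_h
    width = (xmax - xmin) / img_w
    height = (ymax - ymin) / img_h
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


# A 200x200 px box in a 1000x800 image:
print(to_yolo(0, 100, 200, 300, 400, 1000, 800))
# → 0 0.200000 0.375000 0.200000 0.250000
```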

Automated Annotation on Nvidia Tesla K80

In total, we annotated 19 classes, 10 of which are gun-related. To reduce false positives, we included objects that look similar to guns in our object detection model. We also annotated faces so that a face can be detected along with the gun.

We set up the annotation pipeline on AWS EC2 instances and used the CVAT tool to define the bounding boxes for 10 different guns.

We ran inference for 8 COCO classes on a Tesla K80, with an inference latency of ~2 sec per image. Frames from the video recordings are sent to a pre-trained object detection model, and the resulting annotations are stored in JSON (class name and bounding box dimensions).

Automated Annotation on Nvidia Tesla-K80
Bounding boxes for 8 COCO classes (person, backpack, umbrella, handbag, suitcase, bottle, knife, cell phone)
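
The automated annotation step can be pictured as filtering a pre-trained detector's output down to the eight classes of interest and writing the result as JSON. The sketch below assumes a hypothetical detection layout of `(class_name, score, (x, y, w, h))` tuples and a hypothetical `min_score` cut-off. The post only specifies that the JSON holds the class name and bounding box dimensions, so the exact field names here are illustrative.

```python
import json

# The eight COCO classes named in the post.
KEEP_CLASSES = {"person", "backpack", "umbrella", "handbag",
                "suitcase", "bottle", "knife", "cell phone"}


def to_annotation_json(frame_id, detections, min_score=0.5):
    """Serialize one frame's pre-annotations, keeping only the classes of interest.

    `detections` is a list of (class_name, score, (x, y, w, h)) tuples; the
    tuple layout, field names, and `min_score` are assumptions.
    """
    boxes = [
        {"class": name, "bbox": {"x": x, "y": y, "width": w, "height": h}}
        for name, score, (x, y, w, h) in detections
        if name in KEEP_CLASSES and score >= min_score
    ]
    return json.dumps({"frame_id": frame_id, "annotations": boxes})
```

Pre-annotations produced this way still need human review in the labelling tool; the automation mainly saves annotators from drawing the common boxes by hand.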

We ran face detection on a Tesla K80, with an inference latency of ~1 sec per image. Frames from the video recordings are sent to a pre-trained face detection model, and the resulting annotations are stored in JSON (class name and bounding box dimensions).

Automated Annotation on Nvidia Tesla-K80
Bounding box for face classes

Model Training:

Data scientists trained several models to identify the one that best meets the performance criteria. We leveraged the Insight training pipeline to train the models on 4 V100 GPUs using the data parallelism mechanism in Keras, which accelerated training by a factor of 3.

Distributed training pipeline on AWS

AI training pipeline AWS

Model flow diagram

The tables below list the algorithm details and hyperparameters used for model training:

Algorithm characteristics | Details
Framework | Darknet
Model learning strategy | Open-source model fine-tuned on custom dataset
Training dataset | 1.2 M images
Training GPUs | 4 x Tesla V100
Training duration | 5 days on 4 V100 GPUs

Hyperparameter | Value
Batch | 64
Subdivisions | 8
Momentum | 0.9
Learning rate | 0.0025
Policy | Step
Iterations | 816,000
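
To put the batch, subdivisions, and iterations values in perspective, the short calculation below derives the per-pass mini-batch and the approximate number of passes over the data. It assumes the usual Darknet meaning of these settings (one weight update per batch, with the batch split into `subdivisions` chunks to fit in GPU memory) and ignores how work is distributed across the four GPUs, so the epoch figure is only a rough estimate.

```python
def training_budget(batch, subdivisions, iterations, dataset_size):
    """Return (mini_batch, images_seen, approx_epochs) for a Darknet-style config."""
    mini_batch = batch // subdivisions   # images per forward/backward pass
    images_seen = batch * iterations     # one batch is consumed per weight update
    epochs = images_seen / dataset_size  # approximate passes over the dataset
    return mini_batch, images_seen, epochs


mini_batch, images_seen, epochs = training_budget(
    batch=64, subdivisions=8, iterations=816_000, dataset_size=1_200_000
)
print(mini_batch, images_seen, round(epochs, 1))
# → 8 52224000 43.5
```

Under these assumptions, each forward pass handles 8 images, and the run covers the 1.2 M augmented images roughly 43 times.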

Model performance for long gun class is shown below:

mAP for Gun

Model Serving

AI engineers optimized the model using TensorRT to maximize throughput and minimize inference latency. We leveraged the Insight serving pipeline to deploy the optimized model to the Nvidia Jetson Xavier-NX.

Model serving

We ran inference on the edge using devices from the Nvidia Jetson family, for both weapon and face detection. The table below summarizes the metrics for the various devices. The Xavier series performed well at high FPS while keeping thermals under control, and the Xavier-AGX supported high throughput. For reasons of economic viability, the Xavier-NX was chosen for deployment.

Nvidia-jetson-family-summary-metrics
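
A throughput figure such as FPS can be collected with a small timing helper like the one below. This is a generic sketch, not the benchmark used for the comparison: `infer` stands in for whatever callable runs the model on one frame, and the warm-up count is an assumed default.

```python
import time


def measure_fps(infer, frames, warmup=5):
    """Return the average frames per second of `infer` over `frames`.

    `infer` is any callable that takes a single frame. A few warm-up calls
    are excluded from timing because the first inferences on a GPU are
    typically slower than steady state.
    """
    for frame in frames[:warmup]:
        infer(frame)
    start = time.perf_counter()
    for frame in frames:
        infer(frame)
    elapsed = time.perf_counter() - start
    return len(frames) / elapsed if elapsed > 0 else float("inf")
```

Running the same helper with the same frames on each device gives comparable numbers, provided the power mode and input resolution are held constant.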

Summary:

We talked about how we built a weapon detection system leveraging Insight and deployed it to an Nvidia Jetson edge device. The continuous video feed from the camera is processed by an AI model that detects weapons and raises an alert to the first responders.

About Wavelabs Insight:

Wavelabs Insight is an end-to-end AI framework that accelerates the journey to launching AI models. It leverages open-source software modules and custom-built components to shorten the time-to-deployment of AI solutions and improve the ROI on your AI initiatives. It provides components for data processing, model development, and deployment.

Our systematic approach of identifying business needs through use case discovery and designing and building innovative AI solutions for deployment and scale translates data insights into business value.

To learn more about it, please reach out to us.