Abstract
In this project we detect abandoned luggage in surveillance video. We use a CNN
architecture to classify individual video frames. For training and testing we
used frames from two video datasets that cover scenarios such as people with
luggage and abandoned luggage in a variety of scenes. We list all the results
and make comparisons where possible. Our results and methodology should benefit
further work on training and testing CNNs for security-related tasks.
Introduction
Many image processing techniques exist, for instance for image enhancement,
but they operate only at the pixel level. For this reason computer vision is
used in many applications, such as character recognition, classification of
medical scans, pattern recognition and object detection. Many operations have
already shifted to machines: automation can take over tasks that humans
typically perform, which reduces human error and the number of working hours
needed to complete those tasks. Computer vision is still weak at object and
activity detection in surveillance video, although automated monitoring would
greatly benefit this task. It would be valuable to build an automatic system
that alerts a human operator whenever anomalous behavior or a suspicious
object is detected.
A major difficulty in this application is that surveillance footage varies
widely. The camera angle and the video quality, which depends on the camera
model, differ between installations, and it is hard to differentiate between
normal and abnormal activities. In this project we propose a solution that
uses CNNs to make video analysis better and easier. The main problem we face
is deciding whether a scene captured from video is normal or not. For this
project we focus only on identifying whether an image contains abandoned
luggage.
Background/Related Work
In the literature, many different methods have been proposed to identify
objects or actions in video. The review paper [3] compares five algorithms for
object detection and tracking; although it focuses on moving objects, the
algorithms also apply to stationary objects. [1] proposes a method for
detecting abandoned luggage in surveillance video in two steps: 1) static
object detection and 2) abandoned luggage detection using a cascade of
convolutional neural networks. Background subtraction is also widely used for
object detection in video analysis. [8] proposes a background (BG) subtraction
method that uses the distribution of image vectors to detect changes in a
scene.
[4] proposes a method based on a Bayesian framework that integrates spectral,
pixel-based and time-related features. Much research has also addressed
abandoned luggage detection in public places. [9] proposes a two-stage method
with an MCMC (Markov chain Monte Carlo) model that tracks objects and uses
this information in the subsequent detection step; BG and foreground (FG)
subtraction techniques further improve the method. [2] formalizes a framework
for abandoned object detection and provides an extensive review of existing
state-of-the-art approaches. The authors also built a multi-configuration
system that combines state-of-the-art approaches to achieve the best
performance. [6] proposes a method that localizes objects using
foreground-mask sampling to extract the area of interest. [5] describes a
method that detects abandoned objects and traces them back to their previous
owner by tracking objects across the frame sequence.
Studying previous work, we also found benchmark challenges such as PETS and
i-LIDS. [7] uses edge detection to locate abandoned objects in video frames
and compares results on the PETS and AVSS datasets. These challenges consist
of multiple scenarios showing people with and without luggage, traveling,
standing, and other situations in which luggage may be abandoned. The methods
we studied work well for abandoned object detection in their own environments
and scenarios, but no available dataset deals exclusively with abandoned
object detection or was created solely for that purpose. This was a challenge
we faced while working on this project.
As the previous work discussed above shows, there has been very little focus
on using CNNs to build a better object detection system; most approaches rely
on classical computer vision methods. In contrast, we propose a method based
on CNNs, which gives us more control over fine-tuning the model to improve our
results.
Dataset
The first challenge in this project was finding a benchmark dataset, as
surveillance video data is scarce. Many small datasets covering only specific
scenarios were available, but they did not fulfil our requirements. Many
papers mention datasets created for their own methodology but have not
published them, and the published ones reflect real scenarios only to a
limited extent. We decided to use a combination of the available datasets
mentioned in previous work, namely CAVIAR and i-LIDS.
The CAVIAR dataset was created at INRIA in 2003-04. It contains different
real-world scenarios such as people walking, meeting others, shopping, and
entering and exiting public places. All video frames were captured with
wide-angle lenses in two different locations. From this dataset we took only
the frames that suit our purpose, for instance people leaving their luggage in
public. Examples from the dataset are shown below.
The i-LIDS dataset was provided by the Home Office Scientific Development
Branch, UK, for research purposes. It contains multiple scenarios, such as
train stations and other public places, all captured with CCTV cameras.
Examples from this dataset are shown below.
Given the limitations of the dataset and the restrictions in video quality and
camera angles mentioned above, we decided to augment the dataset frames with
rotation, shearing, zooming, vertical flip, and width and height shift. The
resulting dataset contains 55,656 frames in total, of which 26,066 are labeled
as abandoned luggage and 29,590 as attended luggage.
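Two of the augmentations listed above can be sketched with NumPy as follows.
This is a minimal illustration, not the code used in the project: the function
names, the 10% maximum shift and the zero-filling of vacated pixels are
assumptions, and rotation, shearing and zooming are omitted for brevity.

```python
import numpy as np


def vertical_flip(frame):
    """Flip a frame top-to-bottom (frame is an H x W or H x W x C array)."""
    return frame[::-1]


def shift(frame, dx=0, dy=0):
    """Shift a frame by dx pixels (width) and dy pixels (height).

    Pixels that move out of the frame are dropped and vacated pixels are
    filled with zeros (an assumed fill strategy).
    """
    h, w = frame.shape[:2]
    out = np.zeros_like(frame)
    if abs(dx) >= w or abs(dy) >= h:
        return out  # everything was shifted out of view
    # Matching source and destination windows along each axis.
    src_x0, src_x1 = max(0, -dx), min(w, w - dx)
    dst_x0, dst_x1 = max(0, dx), min(w, w + dx)
    src_y0, src_y1 = max(0, -dy), min(h, h - dy)
    dst_y0, dst_y1 = max(0, dy), min(h, h + dy)
    out[dst_y0:dst_y1, dst_x0:dst_x1] = frame[src_y0:src_y1, src_x0:src_x1]
    return out


def augment(frame, rng, max_shift_frac=0.1):
    """Apply a random width/height shift, then a vertical flip half the time."""
    h, w = frame.shape[:2]
    max_dx, max_dy = int(w * max_shift_frac), int(h * max_shift_frac)
    dx = int(rng.integers(-max_dx, max_dx + 1))
    dy = int(rng.integers(-max_dy, max_dy + 1))
    out = shift(frame, dx=dx, dy=dy)
    if rng.random() < 0.5:
        out = vertical_flip(out)
    return out


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    frame = rng.integers(0, 256, size=(224, 224, 3), dtype=np.uint8)
    print(augment(frame, rng).shape)
```

Each call to `augment` returns a new frame of the same shape, so the function
can be applied repeatedly to a labeled frame to enlarge the dataset while
keeping the label unchanged.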
Methodology
The dataset used in the project consists of surveillance videos of specific
events. We split it into training, validation and test sets without mixing
frames from the same video across sets, since we did not want the same videos
to be used for both training and testing; this reduces the similarity between
the training and test data. After the split, we shuffled the frames within
each set at random. We then trained the CNN model on the training frames using
transfer learning, based on the VGG-19 model. VGG-19 has 19 weight layers and
is pre-trained on ImageNet. For this project we retrained only the last layer,
which is sufficient for our purpose: classifying video frames as abandoned or
attended luggage. The model was implemented in TensorFlow. All experiments
were run on the free tier of Google Colab; training took almost 5 hours on a
Colab GPU.
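The video-level split described above can be sketched as follows. This is an
illustrative sketch only: the `(video_id, frame_path, label)` record layout,
the function name and the 10% validation and test fractions are assumptions,
not details taken from the project.

```python
import random
from collections import defaultdict


def split_by_video(frames, val_frac=0.1, test_frac=0.1, seed=0):
    """Split frames into train/val/test so that no video spans two sets.

    frames: list of (video_id, frame_path, label) tuples (assumed layout).
    Returns a dict mapping "train", "val" and "test" to shuffled frame lists.
    """
    # Group frames by the video they were extracted from.
    by_video = defaultdict(list)
    for item in frames:
        by_video[item[0]].append(item)

    # Shuffle whole videos, then assign each video to exactly one set.
    rng = random.Random(seed)
    video_ids = sorted(by_video)
    rng.shuffle(video_ids)
    n_test = max(1, round(len(video_ids) * test_frac))
    n_val = max(1, round(len(video_ids) * val_frac))
    groups = {
        "test": video_ids[:n_test],
        "val": video_ids[n_test:n_test + n_val],
        "train": video_ids[n_test + n_val:],
    }

    # Shuffle frames only inside each set, after the split.
    splits = {}
    for name, ids in groups.items():
        items = [item for vid in ids for item in by_video[vid]]
        rng.shuffle(items)
        splits[name] = items
    return splits


if __name__ == "__main__":
    demo = [(f"video{v}", f"video{v}_frame{i}.jpg", v % 2)
            for v in range(10) for i in range(5)]
    for name, items in split_by_video(demo).items():
        print(name, len(items))
```

Splitting before shuffling matters here: consecutive frames of one video are
nearly identical, so shuffling first would place near-duplicates of training
frames in the test set and inflate the measured accuracy.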
Results
For our project we used a transfer learning model: we trained only the last
layer of the VGG-19 model and kept the remaining layers as pre-trained. This
gave good results, which could be improved further by tuning the model with
different combinations of hyperparameter values. For this project we used the
following hyperparameter values:
· Batch Size = 32
· Validation Split = 0.1
· Epochs = 20
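A setup of this kind can be sketched with the Keras API in TensorFlow as
follows. It is a sketch under stated assumptions rather than the project's
actual code: replacing the original classifier with global average pooling and
a single sigmoid unit, the Adam optimizer, the binary cross-entropy loss, the
224 x 224 input size and the names `build_last_layer_model`, `x_train` and
`y_train` are all assumptions.

```python
import tensorflow as tf


def build_last_layer_model(weights="imagenet", input_shape=(224, 224, 3)):
    """VGG-19 with frozen convolutional layers and one trainable output layer."""
    # Pre-trained convolutional base; pooling="avg" yields a 512-d feature vector.
    base = tf.keras.applications.VGG19(
        include_top=False,
        weights=weights,
        input_shape=input_shape,
        pooling="avg",
    )
    base.trainable = False  # keep every pre-trained layer unchanged

    # Inputs are expected to be preprocessed already, for example with
    # tf.keras.applications.vgg19.preprocess_input.
    inputs = tf.keras.Input(shape=input_shape)
    features = base(inputs, training=False)
    # Single trainable layer: probability that the frame shows abandoned luggage.
    outputs = tf.keras.layers.Dense(
        1, activation="sigmoid", name="abandoned_probability"
    )(features)

    model = tf.keras.Model(inputs, outputs)
    # Optimizer and loss are assumptions; the text does not state them.
    model.compile(optimizer="adam", loss="binary_crossentropy",
                  metrics=["accuracy"])
    return model


# Training with the hyperparameter values listed above, where x_train and
# y_train are hypothetical NumPy arrays of preprocessed frames and 0/1 labels:
#
# model = build_last_layer_model()
# model.fit(x_train, y_train, batch_size=32, epochs=20, validation_split=0.1)
```

Because the base is frozen, only the kernel and bias of the final dense layer
are updated during training, which keeps training cheap enough for a free
Colab GPU.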
Figure 2 below shows the training and validation loss. Figure 3 shows the
training and validation accuracy.
Figure 2
Figure 3
Table 1 below lists the training loss, validation loss, training accuracy and
validation accuracy, together with the test accuracy obtained with the trained
model:
Dataset | Train Loss | Validation Loss | Train Accuracy | Validation Accuracy | Test Accuracy
Combined | 0.2511 | 0.2537 | 0.9297 | 0.9279 | 0.9284
As our results show, using the combination of the aforementioned datasets we
obtained a test accuracy of 0.9284 (about 92.8%) in classifying frames into
the correct class.
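To put the reported accuracy in context, the short sketch below converts the
Table 1 values to percentages and compares them with a majority-class
baseline. The baseline is computed from the frame counts of the whole
augmented dataset; the class balance of the test set is not stated, so the
comparison is only indicative.

```python
# Frame counts reported in the Dataset section.
ABANDONED_FRAMES = 26_066
ATTENDED_FRAMES = 29_590

# Metrics reported in Table 1, as fractions between 0 and 1.
TRAIN_ACCURACY = 0.9297
VALIDATION_ACCURACY = 0.9279
TEST_ACCURACY = 0.9284


def majority_baseline(abandoned, attended):
    """Accuracy of always predicting the more frequent class."""
    return max(abandoned, attended) / (abandoned + attended)


def as_percent(fraction):
    """Convert a fraction such as 0.9284 to a percentage such as 92.84."""
    return 100.0 * fraction


baseline = majority_baseline(ABANDONED_FRAMES, ATTENDED_FRAMES)
generalization_gap = TRAIN_ACCURACY - VALIDATION_ACCURACY

print(f"total frames: {ABANDONED_FRAMES + ATTENDED_FRAMES}")
print(f"majority-class baseline: {as_percent(baseline):.2f}%")
print(f"test accuracy: {as_percent(TEST_ACCURACY):.2f}%")
print(f"train/validation accuracy gap: {generalization_gap:.4f}")
```

The small gap between training and validation accuracy suggests that the
retrained layer is not overfitting the training frames.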
Conclusion
In this project we implemented abandoned luggage detection using a pre-trained
CNN model, VGG-19. We split our dataset into training, validation and test
sets and then shuffled the frames within each set to avoid bias. The training
data comes from different surveillance videos covering scenarios such as
people with luggage, people leaving their packages, people meeting others, and
abandoned luggage. We trained our model on Google Colab and obtained good
results, with a test accuracy of about 92.8% (0.9284) using the
hyperparameters epochs=20, batch size=32 and validation split=0.1. Although
our dataset is limited in video quality and size, augmentation such as
horizontal shift, vertical shift and shearing gave better results, and further
improvements are possible with different model tuning approaches.
Future work
This project provides a framework for future work on abandoned luggage
detection with CNN architectures. It can be extended to track luggage
throughout a video in order to locate the person responsible for it. A further
extension would add activity profiling, for example flagging a person's
behavior as suspicious or not by detecting and tracking their activities. The
same methodology can also be used to detect other objects.
Another extension of this work would be to use a dataset with scenarios
specific to our domain and environment, so that we can develop a better and
more general model suitable for transfer learning across related problems.
References
[1] Smeureanu,
S., & Ionescu, R. T. (2018, September). Real-time deep learning method for
abandoned luggage detection in video. In 2018 26th European Signal
Processing Conference (EUSIPCO) (pp. 1775-1779). IEEE.
[2] Luna,
E., San Miguel, J. C., Ortego, D., & Martínez, J. M. (2018). Abandoned
object detection in video-surveillance: survey and comparison. Sensors, 18(12),
4290.
[3] Nascimento,
J. C., & Marques, J. S. (2006). Performance evaluation of object detection
algorithms for video surveillance. IEEE Transactions on Multimedia, 8(4),
761-774.
[4] Li,
L., Huang, W., Gu, I. Y. H., & Tian, Q. (2004). Statistical modeling of
complex backgrounds for foreground object detection. IEEE Transactions
on Image Processing, 13(11), 1459-1472.
[5] Bhargava,
M., Chen, C. C., Ryoo, M. S., & Aggarwal, J. K. (2007, September).
Detection of abandoned objects in crowded environments. In 2007 IEEE
Conference on Advanced Video and Signal Based Surveillance (pp. 271-276).
IEEE.
[6] Liao,
H. H., Chang, J. Y., & Chen, L. G. (2008, September). A localized approach
to abandoned luggage detection with foreground-mask sampling. In 2008
IEEE Fifth International Conference on Advanced Video and Signal Based Surveillance (pp.
132-139). IEEE.
[7] Ilias,
D. A. H. I., El Mezouar, M. C., Taleb, N., & Elbahri, M. (2017). An
edge-based method for effective abandoned luggage detection in complex
surveillance videos. Computer Vision and Image Understanding, 158,
141-151.
[8] Seki,
M., Fujiwara, H., & Sumi, K. (2000, December). A robust background
subtraction method for changing background. In Proceedings Fifth IEEE
Workshop on Applications of Computer Vision (pp. 207-213). IEEE.
[9] Smith,
K. C., Quelhas, P., & Gatica-Perez, D. (2006). Detecting abandoned
luggage items in a public space (No. REP_WORK). IDIAP.