Matlab Code for Basic Video Tracking

Introduction to Video Tracking

Video tracking is the process of locating a moving object (or multiple objects) over time using a camera. It has a variety of uses, including human-computer interaction, security and surveillance, video communication and compression, augmented reality, traffic control, medical imaging and video editing. Video tracking can be time-consuming because of the amount of data contained in video. Adding to the complexity is the possible need to use object recognition techniques for tracking, which is a challenging problem in its own right.

The objective of video tracking is to associate target objects in consecutive video frames. The association can be especially difficult when the objects are moving fast relative to the frame rate. The problem also becomes more complex when the tracked object changes orientation over time. For these situations, video tracking systems usually employ a motion model, which describes how the image of the target might change for different possible motions of the object.
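As a rough illustration of a motion model, the sketch below uses a constant-velocity model, which is one common choice and not necessarily the one used later in this article. The state values and the search radius `halfWin` are made up for the example.

```matlab
% Constant-velocity motion model (illustrative; all numbers are hypothetical).
% State is [x; y; vx; vy] in pixels and pixels per frame.
dt = 1;                          % one frame between updates
F = [1 0 dt 0;                   % x  <- x + vx*dt
     0 1 0  dt;                  % y  <- y + vy*dt
     0 0 1  0;                   % vx unchanged
     0 0 0  1];                  % vy unchanged
state = [100; 50; 4; -2];        % hypothetical target position and velocity
predicted = F * state;           % expected state in the next frame

% Search for the target in a window centred on the predicted position.
halfWin = 15;                    % hypothetical search radius in pixels
searchBox = [predicted(1) - halfWin, predicted(2) - halfWin, ...
             2 * halfWin, 2 * halfWin];   % [x, y, width, height]
```

Restricting the search to `searchBox` is what makes fast-moving targets easier to associate between frames: the tracker looks where the object is expected to be, not where it was.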

Problems Encountered in Video Tracking

The main problems encountered in video tracking, particularly when it is used for counting people, are explained below.

Background Model

Background subtraction: Object detection can be achieved by building a representation of the scene, called the background model, and then finding deviations from that model in each incoming frame. Any significant change in an image region relative to the background model signifies a moving object. While tracking one person against a stationary background may be relatively simple, the problem becomes very complicated with many people. They may cross in front of each other, pass behind occluding objects, move through different lighting, cast shadows, and walk in groups.
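The idea can be sketched on a synthetic grayscale scene. This is a minimal example, assuming a single static background image and a fixed threshold; the threshold value of 30 is hypothetical and would need tuning for real footage.

```matlab
% Synthetic 8-bit grayscale scene: flat background plus a bright "object".
background = uint8(50 * ones(60, 80));
frame = background;
frame(20:30, 40:50) = 200;                 % object present in the new frame

% Absolute difference, computed in double to avoid uint8 saturation.
diffImg = abs(double(frame) - double(background));

threshold = 30;                            % hypothetical; tune per scene
foreground = diffImg > threshold;          % logical mask of changed pixels

numChanged = nnz(foreground);              % number of foreground pixels
```

Using the absolute difference catches objects that are either brighter or darker than the background, whereas a plain subtraction on `uint8` data clips negative differences to zero.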

Occlusion Problem

With a large number of objects, some are fully visible while many are only partly visible because other objects occlude them. Detecting and tracking partially visible or hidden objects is a real challenge. In addition, the camera is not fixed and there is a lot of camera shake, which further increases the complexity of the problem. The poor resolution of the video also compounds the detection problem.

Blob Analysis

Blob tracking may be easy and fast, but it does not work reliably, particularly with people moving in groups. A number of candidate algorithms claim to be capable of differentiating people in groups. People coming from grocery stores or similar shops often push trolleys ahead of them, and these trolleys are typically about the same size as a human being. This makes splitting blobs by height or width a hard task, as it must take movement into account; it is hard to tell whether a blob is a single person or a merger of several people. In busy areas it becomes impossible to tell. Still, blobs with a very large area can be expected to be groups of people or some other moving object. People standing in groups are tough to deal with, even for the human eye. When they stand close together or hold hands, they appear as one big blob.

Blob detection (blob analysis) involves examining the blob image and finding each human being as a foreground object. In some cases this is a rather difficult task, even for the human eye. When the moving persons have colours similar to the background, the blobs become rather unclear, and blobs may contain holes. Dilation is not enough in this case to close the blob completely.
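The merging problem can be seen in a small synthetic mask. The sketch below assumes the Image Processing Toolbox function `bwconncomp` is available; the expected single-person area `singleArea` and the area-based head count are hypothetical heuristics for illustration, not a method taken from this article.

```matlab
% Synthetic foreground mask with three "people", two of them touching.
mask = false(40, 60);
mask(5:24, 5:10)  = true;     % person A (20 x 6 = 120 pixels)
mask(5:24, 30:35) = true;     % person B ...
mask(5:24, 36:41) = true;     % ... standing right next to person C

cc = bwconncomp(mask);                         % 8-connected components
areas = cellfun(@numel, cc.PixelIdxList);      % pixel count of each blob

% Crude head count: divide each blob area by the area of one person.
singleArea = 120;                              % hypothetical, scene dependent
estimatedCount = sum(max(1, round(areas / singleArea)));
```

Here the detector finds only two blobs although three people are present, and the area heuristic happens to recover the right count. A trolley of similar size would be counted as a person by the same rule, which is exactly the weakness described above.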

Lighting Conditions and Noise

The first frame of the video should show a static background; that frame is used as a base frame, and the subsequent video frames are compared against it. It is unrealistic to expect static background frames with similar lighting conditions for day, night and noon. Lighting varies with the seasons and with artificial light sources, so a more sophisticated approach is needed. The frame that results from the subtraction gives a fairly good approximation of the information, but it still contains a lot of noise and requires further processing. Noise should be eliminated so that it does not interfere with the blob detection algorithm.
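One common way to cope with slowly changing light is to update the background with an exponential running average instead of keeping a single base frame. This is a swapped-in technique shown for illustration, not the approach used in the listing later in this article, and the learning rate `alpha` is a hypothetical value.

```matlab
% Running-average background update on a tiny synthetic scene.
alpha = 0.05;                               % hypothetical learning rate
bg = 100 * ones(4, 4);                      % initial background estimate
for k = 1:200
    frame = 120 * ones(4, 4);               % the scene has become brighter
    bg = (1 - alpha) * bg + alpha * frame;  % blend the new frame into the model
end
residual = abs(bg(1, 1) - 120);             % remaining error after adaptation
```

A small `alpha` adapts slowly and keeps moving objects out of the model; a large `alpha` follows lighting changes quickly but can absorb slow-moving people into the background.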

Colour Classification

An object can be segmented based on its colour in RGB or HSV space. This is ideal if the object colour is distinct from the background. An object can also be recognized or classified based on its colour. In the given video sequence, object colours are not distinct as they take a large range of colours. Object recognition based on colour would work for some objects, like rickshaws having a distinct yellow colour, but would fail for cars that can take several colours.
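As a small sketch of colour segmentation in HSV space, the example below isolates a yellow patch on a synthetic image. The hue, saturation and value bounds are hypothetical and would have to be chosen for the actual footage.

```matlab
% Synthetic RGB image (double, range 0..1): dark blue background, yellow patch.
img = zeros(20, 20, 3);
img(:, :, 3) = 0.5;                   % background colour (0, 0, 0.5)
img(5:10, 5:10, 1) = 1;               % yellow patch: R = 1
img(5:10, 5:10, 2) = 1;               %               G = 1
img(5:10, 5:10, 3) = 0;               %               B = 0

hsvImg = rgb2hsv(img);
h = hsvImg(:, :, 1);                  % hue, 0..1 (yellow is near 1/6)
s = hsvImg(:, :, 2);                  % saturation
v = hsvImg(:, :, 3);                  % value (brightness)

% Keep saturated, bright pixels whose hue is close to yellow.
yellowMask = h > 0.10 & h < 0.20 & s > 0.5 & v > 0.5;   % hypothetical bounds
```

Thresholding on hue works here because the target has one distinctive colour; for objects such as cars, no single hue range would cover them.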

Object Boundaries

Edge-based: Object boundaries generally produce strong changes in image intensity, and edge detection is used to find these changes. A key property of edges is that they are less sensitive to illumination changes than colour features. Algorithms that follow the boundary of objects generally use edges as the representative feature. This technique is not suitable when there are numerous cluttered objects in a frame.
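The robustness of edges to lighting can be illustrated with a Sobel gradient computed in base MATLAB. The sketch only demonstrates a uniform brightness offset, which is a simplification of real illumination changes, and the edge threshold is a hypothetical choice.

```matlab
% Vertical step edge, and the same scene made uniformly brighter.
img = zeros(20, 20);
img(:, 11:20) = 100;
brighter = img + 40;

sobelX = [-1 0 1; -2 0 2; -1 0 1];        % horizontal-gradient kernel
sobelY = sobelX.';                        % vertical-gradient kernel
% 'valid' keeps only positions where the kernel fits inside the image.
gradMag = @(I) hypot(conv2(I, sobelX, 'valid'), conv2(I, sobelY, 'valid'));

g1 = gradMag(img);
g2 = gradMag(brighter);
edgeMask = g1 > 0.5 * max(g1(:));         % hypothetical threshold
maxDiff = max(abs(g1(:) - g2(:)));        % change caused by the brightness offset
```

Because the Sobel kernels sum to zero, adding a constant to every pixel leaves the gradient magnitude unchanged, so `edgeMask` is the same for both images, while a colour or intensity threshold would have to be re-tuned.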

Different Gray Levels and Image Quality

Varying gray levels and image quality pose problems especially for image segmentation and binarization techniques, where the foreground is represented as black and the background as white. These techniques produce many false alarms for images with high noise and uneven illumination. Although most images with a simple background and high contrast can be correctly localized and extracted, images with low resolution and a complex background are difficult to extract. Most digital images and videos are stored, processed and transmitted in a compressed form.

Simple Algorithm for Video Surveillance

To perform video tracking, an algorithm analyzes sequential video frames and outputs the movement of targets between the frames. There are a variety of algorithms, each with its own strengths and weaknesses, so it is important to consider the intended use when choosing one. A visual tracking system has two major components: target representation and localization, and filtering and data association.
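The data association step can be sketched as a greedy nearest-neighbour match between object centroids in two consecutive frames. This is one simple option among many, and the centroid coordinates are made up for the example.

```matlab
% Hypothetical object centroids (x, y) in two consecutive frames.
prev = [10 10; 50 40; 80 20];      % frame k
curr = [52 43; 12 11; 79 25];      % frame k+1, detections in arbitrary order

nPrev = size(prev, 1);
nCurr = size(curr, 1);
match = zeros(nPrev, 1);           % match(i) = row of curr assigned to prev(i)
taken = false(nCurr, 1);
for i = 1:nPrev
    delta = curr - repmat(prev(i, :), nCurr, 1);
    d = sqrt(sum(delta .^ 2, 2));  % distance to every current detection
    d(taken) = Inf;                % each detection may be used only once
    [~, j] = min(d);
    match(i) = j;
    taken(j) = true;
end
```

Greedy matching is fast but depends on the order in which targets are processed; when objects cross paths, it can swap identities, which is where motion models and filtering help.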


Matlab Code for Video Surveillance


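% NOTE: the listing as published is incomplete. Lines marked "assumed" are
% reconstructions added so the script is syntactically complete; they are
% not taken from the original code.

B = '.png';
C = 'yjf-00_1-';
numFrames = 10;                          % assumed: number of frames in the sequence

a  = imread(strcat(C, '1', B));          % assumed: first frame is the static background
a1 = a(:,:,1);  a2 = a(:,:,2);  a3 = a(:,:,3);      % assumed: R, G, B planes

for i = 2:numFrames                      % assumed: loop over the remaining frames
    e2 = strcat(C, num2str(i));          % assumed: frame name is prefix + index
    filename3 = strcat(e2, B);
    b  = imread(filename3);
    b1 = b(:,:,1);  b2 = b(:,:,2);  b3 = b(:,:,3);  % assumed: R, G, B planes

    % Channel-wise background subtraction (uint8 results saturate at 0).
    z1 = imsubtract(a1, b1);
    z2 = imsubtract(a2, b2);
    z3 = imsubtract(a3, b3);
    I2 = max(max(z1, z2), z3);           % assumed: I2 merges the three differences

    % Stretch contrast, threshold with Otsu's method, then median-filter the noise.
    I3 = imadjust(I2, stretchlim(I2), [0 1]);
    level = graythresh(I3);
    bw = im2bw(I3, level);
    K  = medfilt2(bw);
    K6 = medfilt2(K, [5, 5]);
    % k6 = double(K6);
    % K6 = edge(K6, 'canny');
    % imshow(K6, [])

    Image = K6;
    [H, W] = size(Image);                % image height and width
    [r, c] = find(bwperim(Image, 4) == 1);   % perimeter pixel rows and columns

    % boundarytrack is a helper that is not included in this article; it is
    % presumed to return the ordered boundary rows (tr) and columns (tc).
    [tr, tc] = boundarytrack(r, c, H, W, 0);

    XB = tr;                             % assumed: rows (only YB survives in the original)
    YB = tc;
    Xmin = min(XB);  Ymin = min(YB);
    Xmax = max(XB);  Ymax = max(YB);

    % Draw the bounding box; line() takes column (x) values first, then rows (y).
    imshow(b); hold on;                  % assumed: box is drawn over the current frame
    line([Ymin Ymin], [Xmin Xmax]);      % left edge
    line([Ymin Ymax], [Xmin Xmin]);      % top edge
    line([Ymin Ymax], [Xmax Xmax]);      % bottom edge
    line([Ymax Ymax], [Xmin Xmax]);      % right edge
    hold off; drawnow;

    % Centre of the box (the published formulas mixed the row and column terms).
    Xg = Xmin + fix((Xmax - Xmin) / 2);
    Yg = Ymin + fix((Ymax - Ymin) / 2);
end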






Matlab GUI for Video Tracking
