Automatic indexing of video documents


First results

The algorithms developed within the project have been tested on 22 video sequences, acquired at a frame rate of 5 or 6 Hz. The images are in CIF format. The average transition detection rate reaches 90.4%, for an average false alarm rate of 1.5%. The results are summarized in Tab.1.
| Sequence | Nb of transitions | % detected transitions | % false alarms | Commentary |
| --- | --- | --- | --- | --- |
| hard rock | 7 cuts - 1 dissolve | 87.5 | 0 | 1 non-detected dissolve |
| embroidery | 5 cuts - 1 dissolve | 83.3 | 0 | 1 non-detected dissolve |
| kart | 9 cuts - 1 dissolve | 80 | 11.1 | 1 non-detected dissolve, 1 non-detected cut, 1 false detection due to a flash, difficult sequence |
| tennis | 6 cuts | 100 | 0 | The cut takes place in the last image of the sequence. This case is not handled by the algorithm. |
| old tennis | 4 cuts | 100 | 0 | |
| sect | 8 cuts - 1 dissolve | 88.9 | 0 | 1 non-detected dissolve |
| lille | 3 cuts - 3 wipes | 66.6 | 0 | 2 non-detected wipes |
| interview | 4 cuts | 100 | 0 | |
| 6 minutes | 5 cuts | 100 | 16.6 (37.5) | 1 false detection due to the M6 band, 2 false detections due to flashes |
| road work | 11 cuts - 1 dissolve | 100 | 0 (7.7) | 1 dissolve detected as a full shot |
| affaire | 2 cuts - 1 dissolve | 100 | 0 | |
| colère | 14 cuts - 1 dissolve | 100 | 0 | |
| procès | 3 cuts - 1 dissolve | 100 | 0 (20) | 1 dissolve detected as a full shot |
| brèves | 13 cuts - 1 dissolve | 100 | 0 | |
| visite | 5 cuts | 80 | 0 (33.3) | 1 non-detected cut, 2 false detections due to a flash |
| abidjan | 61 cuts | 96.7 | 3.2 | 2 non-detected cuts, 1 false detection |
| nantes | 5 cuts - 1 dissolve | 83.3 | 0 | 1 non-detected dissolve |
| paris | 21 cuts - 3 dissolves - 1 page turn - 1 other transition | 84.6 (65.4) | 0 (5.5) | 5 cuts, 2 dissolves, 1 page turn and the other transition were not detected. Only 1 false alarm, due to a dissolve detected as a full shot. The non-detected cuts correspond to the extreme case of successive cuts, which is not handled by the algorithm. A slightly lower threshold (0.05) leaves 1 dissolve and 4 cuts undetected, but at the cost of a higher number of false alarms (2 dissolves detected as full shots and a moving object near the camera (2)). |
| acadie | 32 cuts - 5 dissolves | 86.5 (83.8) | 0 | 1 cut and 5 dissolves not detected. The cut is one of two successive cuts. |
| senghor | 8 cuts - 5 dissolves | 61.5 | 0 | 5 non-detected dissolves |
| jtv1 | 40 cuts | 100 | 0 (6.9) | The 3 false alarms are due to pieces of dissolves which are considered as full shots. These do not correspond to real dissolves, as this sequence is a simulation of a TV newscast, made by concatenating shot parts of a real TV newscast. |
| TOTAL | | 90.4 (89.4) | 1.5 (5.9) | 98.3% (97.1%) of cuts were detected |
Tab.1 Summary of the results obtained on the 22 test sequences.
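
The page does not state how the two percentages are computed. The sketch below is one reading that reproduces several rows of Tab.1, under the assumption that the detection rate is the share of true transitions that were found, and that the false alarm rate is the share of false detections among all detections. The function names are hypothetical and the second formula is inferred from the figures, not taken from the authors.

```python
def detection_rate(detected, total_transitions):
    """Percentage of true transitions that were detected."""
    return 100.0 * detected / total_transitions


def false_alarm_rate(detected, false_alarms):
    """Percentage of false detections among all detections.

    Assumption: the denominator is correct detections plus false alarms.
    This is inferred from Tab.1, not stated in the source.
    """
    return 100.0 * false_alarms / (detected + false_alarms)


# kart: 10 transitions, 2 missed, 1 false detection
print(f"{detection_rate(8, 10):.1f}")    # 80.0
print(f"{false_alarm_rate(8, 1):.1f}")   # 11.1

# 6 minutes: 5 cuts, all detected, 3 false detections counted
print(f"{false_alarm_rate(5, 3):.1f}")   # 37.5
```

With this reading, the parenthesised 37.5 of the row 6 minutes is obtained when all three listed false detections are counted.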


These results clearly show that the algorithm is efficient on simple transitions (cuts), but that dissolve detection still needs to be improved.

Moreover, in most cases, choosing a lower threshold recovers most of the missed transitions, at the cost of a higher rate of false detections. These false detections are easily identified afterwards by means of the relations established between the shots: shot parts can then be put back together to form a single shot.
This is what happens in the sequence Paris for the false detections due to a large moving object near the camera.
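
The trade-off can be illustrated with a toy example. This is only a sketch: it assumes the detector compares some per-frame dissimilarity score with a threshold, which the page does not describe, and the scores and ground truth below are invented for illustration. Only the threshold value 0.05 appears in the source (commentary of the sequence Paris).

```python
def evaluate(scores, true_transitions, threshold):
    """Count (detected, missed, false) for a simple thresholding rule.

    scores: hypothetical per-frame dissimilarity values.
    true_transitions: set of frame indices where a transition really occurs.
    """
    found = {i for i, s in enumerate(scores) if s > threshold}
    detected = len(found & true_transitions)
    missed = len(true_transitions - found)
    false = len(found - true_transitions)
    return detected, missed, false


# Invented data: frames 2, 4 and 6 are real transitions, frame 7 is noise.
scores = [0.01, 0.02, 0.40, 0.03, 0.07, 0.02, 0.55, 0.06, 0.01]
truth = {2, 4, 6}

print(evaluate(scores, truth, 0.10))  # (2, 1, 0): one transition missed
print(evaluate(scores, truth, 0.05))  # (3, 0, 1): recovered, one false alarm
```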

Note also that establishing relations between the shots makes it possible to handle flashes (cf. sequence 6 Minutes), since the two parts of the shot surrounding the flash are related.
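
As a rough illustration of this regrouping, the sketch below merges two related shots, together with anything lying between them, into a single shot when they are close enough. The merging rule, the `max_gap` parameter and the frame ranges are assumptions made for the example; only the relation pairs are taken from Fig.6.

```python
def merge_related(shots, relations, max_gap=1):
    """Merge related shots separated by at most `max_gap` other shots.

    shots: list of (first_frame, last_frame), in temporal order.
    relations: pairs (i, j) of shot indices with i < j.
    """
    merged = list(shots)
    # Process pairs from the end so that earlier indices stay valid.
    for i, j in sorted(relations, reverse=True):
        if 0 < j - i <= max_gap + 1:
            merged[i:j + 1] = [(merged[i][0], merged[j][1])]
    return merged


# Hypothetical frame ranges for the 9 shots of the sequence 6 Minutes.
shots = [(0, 40), (41, 60), (61, 90), (91, 130), (131, 150),
         (151, 151), (152, 180), (181, 220), (221, 260)]
relations = [(1, 2), (4, 6)]  # pairs listed in Fig.6

for shot in merge_related(shots, relations):
    print(shot)
```

With these inputs, shots 1 and 2 become the single range (41, 90), and shots 4 to 6 become (131, 180), leaving six shots.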

The two examples below illustrate the splitting and the hierarchical structuring of two sequences, 6 Minutes and Paris. For each sequence we give, in turn, the key frames, the inner changes detected in each shot, and the related shots.

For the sequence 6 Minutes, four shots are classified as containing a change. These decisions are due, respectively, to the M6 band (shot #1), to the disappearance of text (shot #4), to the appearance of a larger object in the foreground (shot #7) and to a camera motion (shot #8). The same reasons apply to the relation detection in the sequence Paris.

Fig.4 Key frames of sequence 6 Minutes - Copyright M6/CMM/ENSMP
| Shot | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Change | no | yes | no | no | yes | no | no | yes | yes |

Fig.5 Results of the inner shot change detection, sequence 6 Minutes
Relations
shot 1 - shot 2
shot 4 - shot 6
 
Fig.6 Results of the relation detection, sequence 6 Minutes
Fig.7 Key frames of sequence Paris - Copyright CMM/ENSMP
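| Shot | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 20 | 21 | 22 | 23 | 24 | 25 | 26 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Change | no | no | yes | yes | no | no | no | yes | yes | no | no | yes | yes | no | no | no | no | yes | no | no | no | yes | yes | yes | no | no | no |

Fig.8 Results of the inner shot change detection, sequence Paris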
Relations
shot 2 - shot 8
shot 3 - shot 15
shot 4 - shot 5
shot 17 - shot 18
shot 19 - shot 20
shot 24 - shot 25
shot 25 - shot 26
 
Fig.9 Results of the relation detection, sequence Paris
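
One possible way to exploit such a list of pairs is to gather related shots into groups. The sketch below does this with a small union-find structure, under the assumption (not stated on this page) that the relation can be treated as transitive; the function name is hypothetical and the input pairs are those of Fig.9.

```python
def group_related(relations):
    """Group shot indices connected through a chain of relations."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in relations:
        ra, rb = find(a), find(b)
        if ra != rb:
            # Keep the earliest shot as the representative of the group.
            parent[max(ra, rb)] = min(ra, rb)

    groups = {}
    for shot in list(parent):
        groups.setdefault(find(shot), []).append(shot)
    return sorted(sorted(g) for g in groups.values())


paris = [(2, 8), (3, 15), (4, 5), (17, 18), (19, 20), (24, 25), (25, 26)]
print(group_related(paris))
# [[2, 8], [3, 15], [4, 5], [17, 18], [19, 20], [24, 25, 26]]
```

Under this assumption, shots 24, 25 and 26 end up in the same group because they are linked by the two consecutive relations.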

All the original images on this page come from TV newscasts of the French TV channels TF1, FR3, A2 or M6 and are therefore copyrighted by these channels. All other images or photos are copyrighted by the CMM. These documents are protected by copyright law; any unauthorized copy or use is therefore strictly forbidden.
Last update : 01 - 06 - 99
demarty@cmm.ensmp.fr