Automatic indexing of video documents


Establishment of a shot hierarchy

Splitting a video document into shots provides the base level of a larger hierarchy, in which several shots are grouped together according to chosen criteria in order to reach a higher semantic level in the understanding of the document.
Any video document can therefore be structured and split into shots which, when grouped together, form scenes (same unity of place, for example, but different viewpoints), sequences (same unity of subject), and so on.
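
As an illustration only, the hierarchy described above can be sketched as three nested containers. The class and field names below (Shot, Scene, Sequence, first_frame, last_frame) are assumptions made for this example and do not come from the original system.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Shot:
    """Base level of the hierarchy: a contiguous run of frames."""
    index: int
    first_frame: int  # hypothetical field: first frame number of the shot
    last_frame: int   # hypothetical field: last frame number of the shot


@dataclass
class Scene:
    """Shots sharing the same unity of place (possibly different viewpoints)."""
    shots: List[Shot] = field(default_factory=list)


@dataclass
class Sequence:
    """Scenes sharing the same unity of subject, e.g. one news report."""
    scenes: List[Scene] = field(default_factory=list)

    def shot_indices(self) -> List[int]:
        # Flatten the hierarchy back to the list of shot indices it covers.
        return [shot.index for scene in self.scenes for shot in scene.shots]
```

A news report would then be one Sequence whose scenes each gather the shots taken in the same place.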

Let us take the example of a TV newscast, which is of great interest in our framework. For the indexing process, one has to be able to group the different shots so as to extract each news report as a single unit and, from a given report (that is to say, a sequence in the above terminology), to extract all the scenes and illustrations of this report.

When applied to the key frames of two different shots, the transition detection algorithm makes it possible to establish relations between shots. The only change required in the algorithm is the choice of a slightly higher threshold.
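
The idea can be sketched as follows. This is not the transition detection algorithm itself, which is not described on this page: a mean absolute grey-level difference is used as a stand-in measure, each shot is assumed to have a single key frame, and two shots are assumed to be related when no transition is detected between their key frames. The function names and threshold values are hypothetical.

```python
from itertools import combinations
from typing import Dict, List, Tuple

Frame = List[int]  # assumption: a key frame flattened to a list of grey levels

GLOBAL_THRESHOLD = 20.0    # hypothetical value used for ordinary transition detection
RELATION_THRESHOLD = 30.0  # hypothetical value, slightly higher than the global one


def dissimilarity(a: Frame, b: Frame) -> float:
    """Stand-in measure: mean absolute grey-level difference between two frames."""
    if len(a) != len(b) or not a:
        raise ValueError("frames must be non-empty and of the same size")
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def related_shots(key_frames: Dict[int, Frame],
                  threshold: float = RELATION_THRESHOLD) -> List[Tuple[int, int]]:
    """Return the pairs of shots whose key frames show no detected transition."""
    return [
        (i, j)
        for i, j in combinations(sorted(key_frames), 2)
        if dissimilarity(key_frames[i], key_frames[j]) < threshold
    ]
```

Raising the threshold above the global one makes the comparison more tolerant, so that two views of the same set taken at different moments can still be matched.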

It is also useful, still in an indexing framework, to be able to say whether the scene was modified during a shot. Motion estimation would of course provide a lot of useful information in this case. But once again, applying the detection algorithm to the key frames of a single shot is a simple way to get a quick answer to the question "Was there any change during this shot?" This time, the chosen threshold value lies between the global threshold and the relation threshold.
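
A minimal sketch of this test is given below, under the same kind of assumptions: the mean absolute grey-level difference stands in for the actual detection algorithm, consecutive key frames of the shot are compared, and all names and threshold values are hypothetical. Only the ordering of the three thresholds comes from the text.

```python
from typing import List

Frame = List[int]  # assumption: a key frame flattened to a list of grey levels

GLOBAL_THRESHOLD = 20.0        # hypothetical value
RELATION_THRESHOLD = 30.0      # hypothetical value, slightly higher
INNER_CHANGE_THRESHOLD = 25.0  # hypothetical value, between the two others


def dissimilarity(a: Frame, b: Frame) -> float:
    """Stand-in measure: mean absolute grey-level difference between two frames."""
    if len(a) != len(b) or not a:
        raise ValueError("frames must be non-empty and of the same size")
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def shot_has_changed(key_frames: List[Frame],
                     threshold: float = INNER_CHANGE_THRESHOLD) -> bool:
    """Answer "Was there any change during this shot?" from its key frames."""
    # Compare each key frame with the next one; a shot with a single
    # key frame yields no pair and is reported as unchanged.
    return any(
        dissimilarity(a, b) > threshold
        for a, b in zip(key_frames, key_frames[1:])
    )
```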

These two additional pieces of information, related shots and inner shot change, are illustrated for an interview sequence. The key frames of the sequence are shown in Fig.1; Fig.2 and Fig.3 present, respectively, the results of the inner shot change detection and of the extraction of related shots.

Fig.1 Key frames of the sequence interview - Copyright A2/CMM/ENSMP.
Shot     0     1     2     3     4
Change   yes   no    no    no    no
Fig.2 Results of the inner shot change detection, sequence interview.
Relations
shot 0 - shot 3
shot 2 - shot 4
Fig.3 Results of the detection of related shots, sequence interview.
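
Pairwise relations such as those of Fig.3 can be merged into groups of shots, which is one possible step towards building scenes. The sketch below uses a small union-find structure; the function name is hypothetical and this grouping step is an illustration, not part of the method described on this page.

```python
from typing import Dict, Iterable, List, Tuple


def group_related_shots(n_shots: int,
                        relations: Iterable[Tuple[int, int]]) -> List[List[int]]:
    """Merge pairwise shot relations into groups of mutually connected shots."""
    parent = list(range(n_shots))

    def find(x: int) -> int:
        # Follow parent links to the root, shortening the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in relations:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            # Keep the smallest shot index as the representative of the group.
            parent[max(root_a, root_b)] = min(root_a, root_b)

    groups: Dict[int, List[int]] = {}
    for shot in range(n_shots):
        groups.setdefault(find(shot), []).append(shot)
    return [groups[root] for root in sorted(groups)]


# Relations of Fig.3 for the five shots of the interview sequence.
print(group_related_shots(5, [(0, 3), (2, 4)]))  # prints [[0, 3], [1], [2, 4]]
```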


The detected change in the first shot corresponds to the text disappearing in the first frame.
Some additional results of relation and change detection are also available here.
All the original images on this page come from TV newscasts of the French TV channels TF1, FR3, A2 or M6 and are therefore copyrighted by these channels. All other images or photos are copyrighted by the CMM. These documents are protected by copyright law; any unauthorized copy or use is therefore strictly forbidden.
Last update : 01 - 06 - 99
demarty@cmm.ensmp.fr