The Talking Face video consists of 5000 frames taken from a video of a person engaged in conversation. This corresponds to about 200 seconds of recording. The sequence was taken as part of an experiment designed to model the behaviour of the face in natural conversation.
The subject was engaged in conversation with another person in a room over an extended period. The intention was to elicit natural behaviour despite the unnatural circumstances. A static camera was trained on each individual, framed so that the movement of the face remained almost entirely within the image throughout the sequence.
Example images are shown below:-
The data is available below:
The data set has been tracked with an Active Appearance Model (AAM) using a 68-point model. The annotation scheme is shown below. Although the annotation was performed semi-automatically, it has been checked visually and is generally accurate enough to represent the facial movements during the sequence.
The data can be used to train statistical models of facial behaviour, or to test face tracking algorithms.
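As an illustration of the first use, below is a minimal sketch of fitting a point-distribution model (PCA on the tracked shapes) with NumPy. It assumes the 68-point shapes for all frames have already been loaded into an array; the function name and mode count are illustrative, and for simplicity it skips the Procrustes alignment usually applied before the PCA.

    import numpy as np

    def fit_shape_model(shapes, n_modes=10):
        """shapes: array of shape (n_frames, 68, 2) holding (x, y) landmarks."""
        # Flatten each 68-point shape into a single 136-vector.
        flat = shapes.reshape(len(shapes), -1)
        mean = flat.mean(axis=0)
        centred = flat - mean
        # Principal modes of shape variation via SVD of the centred data.
        _, s, vt = np.linalg.svd(centred, full_matrices=False)
        variance = (s ** 2) / (len(shapes) - 1)
        return mean, vt[:n_modes], variance[:n_modes]

The mean shape and the leading modes together form a compact linear model of the facial motion in the sequence.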
Note that the data can also be used to generate a person-specific appearance model using the tools downloadable from here.
Currently, the full 68-point annotation is available for every frame of the sequence.
The image files and point files have corresponding names: e.g. "franck_00127.jpg" has its points in "franck_00127.pts".
The points file format is as follows:-
    version: 1      (this points file version number can be ignored)
    n_points: 68    (the number of labelled points on the image)
    {
    xxxx yyyy
    xxxx yyyy
    .....
    }
For each point, xxxx is the x coordinate and yyyy is the y coordinate, both measured from the top-left corner of the image.
All point files contain 68 points, each corresponding to a specific location on the face (see the diagram above).
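For convenience, here is a minimal Python parser for the format described above. It assumes the standard layout of a version line, an n_points line, and the coordinate pairs enclosed in braces; the function name is illustrative rather than official tooling.

    def read_pts(path):
        # Read all non-empty lines of the points file.
        with open(path) as f:
            lines = [ln.strip() for ln in f if ln.strip()]
        # "n_points: 68" -> 68
        n_points = int(lines[1].split(":")[1])
        # Coordinate pairs start on the line after the opening brace.
        start = lines.index("{") + 1
        points = [tuple(map(float, ln.split()))
                  for ln in lines[start:start + n_points]]
        assert len(points) == n_points
        return points  # list of (x, y), origin at the top-left corner

    # e.g. the landmarks for frame 127:
    points = read_pts("franck_00127.pts")
    print(points[0])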
The annotations have been made available under the aegis of the EU FP5 project FGNET.