A real-time algorithm for accurate localization of
facial landmarks in a single monocular image is proposed. The
algorithm is formulated as an optimization problem, in which
the sum of responses of local classifiers is maximized with
respect to the camera pose by fitting a generic (not person-specific)
3D model. The algorithm simultaneously estimates the
head position and orientation and detects the facial landmarks
in the image. Although the optimization is local, we show that its basin of
attraction is large enough for it to be initialized by a scanning-window
face detector. Further experiments on standard datasets
demonstrate that the proposed algorithm outperforms a state-of-the-art
landmark detector, especially for non-frontal face images,
and that it is capable of reliable and stable tracking over a large range
of viewing angles.
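
The objective described above, a sum of local classifier responses evaluated at the projections of a generic 3D model, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the scaled-orthographic camera, the Euler-angle pose parameterization, the nearest-pixel lookup, and the names `rotation_matrix`, `project`, `pose_score`, `model_points` and `response_maps` are all hypothetical. The response maps are assumed to be precomputed, one per landmark.

```python
import numpy as np


def rotation_matrix(yaw, pitch, roll):
    """Rotation from Euler angles in radians, composed as Rz(roll) @ Rx(pitch) @ Ry(yaw)."""
    cy, sy = np.cos(yaw), np.sin(yaw)
    cp, sp = np.cos(pitch), np.sin(pitch)
    cr, sr = np.cos(roll), np.sin(roll)
    Ry = np.array([[cy, 0.0, sy], [0.0, 1.0, 0.0], [-sy, 0.0, cy]])
    Rx = np.array([[1.0, 0.0, 0.0], [0.0, cp, -sp], [0.0, sp, cp]])
    Rz = np.array([[cr, -sr, 0.0], [sr, cr, 0.0], [0.0, 0.0, 1.0]])
    return Rz @ Rx @ Ry


def project(model_points, pose):
    """Project (K, 3) model points to (K, 2) image points.

    Assumption: scaled-orthographic camera, pose = (yaw, pitch, roll, scale, tx, ty).
    """
    yaw, pitch, roll, scale, tx, ty = pose
    rotated = model_points @ rotation_matrix(yaw, pitch, roll).T
    return scale * rotated[:, :2] + np.array([tx, ty])


def pose_score(pose, model_points, response_maps):
    """Sum of per-landmark responses at the projected landmark positions.

    response_maps has shape (K, H, W); landmarks projected outside the
    image contribute nothing to the sum.
    """
    xy = np.rint(project(model_points, pose)).astype(int)
    height, width = response_maps.shape[1:]
    total = 0.0
    for k, (x, y) in enumerate(xy):
        if 0 <= x < width and 0 <= y < height:
            total += float(response_maps[k, y, x])
    return total
```

A local optimizer would then search over `pose` for the maximum of `pose_score`, starting from the pose implied by a face detector's bounding box; the optimizer itself is left out because the abstract does not specify it.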