Facial Expression Recognition (FER) is a challenging topic in image classification. Some of its applications, such as driver assistance systems, require real-time response or must run on low-resource devices. FER methods can be divided into conventional methods and deep learning methods. Deep learning-based methods have attracted much attention in recent years because of their higher performance, even under challenging scenarios. However, they rely on models that demand high computational resources, while conventional methods depend on hand-crafted features that may not generalize well across scenarios. In this context, some studies seek to reduce the computational complexity of deep learning models while achieving results similar to those of more complex models. But even these reduced-complexity models can require substantial computational resources.

To tackle this problem, we propose ExpressionHop. ExpressionHop is based on a Successive Subspace Learning classification technique called PixelHop [1], which allows us to automatically extract meaningful features without the need for computationally demanding models. As shown in Figure 1, we first extract facial landmark patches from face images and then use PixelHop to extract features. The discriminant feature test is then used for feature selection, and classification is performed with logistic regression. As shown in Table 1, our model achieved higher or similar results compared to traditional and deep learning methods on the JAFFE, CK+, and KDEF datasets. At the same time, a comparison of the number of model parameters indicates that the proposed model demands fewer computational resources, even when compared to newer deep learning methods that rely on reduced-complexity models.
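
To make the feature selection step more concrete, here is a minimal sketch of a discriminant-feature-test style ranking. It is not the ExpressionHop implementation: it assumes a simplified form of the test in which each feature is scored by the lowest weighted entropy obtained when its range is split at one of several evenly spaced thresholds. The function names (`weighted_entropy`, `dft_loss`, `select_features`), the bin count, and the synthetic data that stands in for PixelHop features are all hypothetical choices made for illustration.

```python
import numpy as np


def weighted_entropy(left, right, n_classes):
    """Size-weighted Shannon entropy of the labels on each side of a split."""
    total = len(left) + len(right)
    loss = 0.0
    for part in (left, right):
        if len(part) == 0:
            continue  # an empty side contributes nothing
        probs = np.bincount(part, minlength=n_classes) / len(part)
        probs = probs[probs > 0]  # avoid log2(0)
        loss += (len(part) / total) * -np.sum(probs * np.log2(probs))
    return loss


def dft_loss(feature, labels, n_classes, n_bins=16):
    """Lowest weighted entropy over evenly spaced candidate thresholds."""
    lo, hi = feature.min(), feature.max()
    if lo == hi:
        # A constant feature cannot split the data at all.
        return weighted_entropy(labels, labels[:0], n_classes)
    thresholds = np.linspace(lo, hi, n_bins + 1)[1:-1]
    return min(
        weighted_entropy(labels[feature <= t], labels[feature > t], n_classes)
        for t in thresholds
    )


def select_features(X, y, k, n_bins=16):
    """Return the indices of the k lowest-loss features and all losses."""
    n_classes = int(y.max()) + 1
    losses = np.array(
        [dft_loss(X[:, j], y, n_classes, n_bins) for j in range(X.shape[1])]
    )
    return np.argsort(losses)[:k], losses


# Synthetic stand-in for extracted features (assumption: 200 samples,
# 5 features, 2 classes). Only feature 0 carries class information.
rng = np.random.default_rng(0)
y = np.arange(200) % 2
X = rng.normal(size=(200, 5))
X[:, 0] = 3.0 * y + 0.1 * rng.normal(size=200)

selected, losses = select_features(X, y, k=2)
```

In this sketch a lower loss means a more discriminant feature, so the informative feature is ranked first and the noise features follow. The retained columns, `X[:, selected]`, would then be passed to a classifier such as logistic regression.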


— By Chengwei Wei and Rafael Luiz Testa