Face attributes classification is an important topic in biometrics. Ancillary information about faces, such as gender, age, and ethnicity, is referred to as soft biometrics in forensics. The face gender classification problem has been studied extensively for more than two decades. Before the resurgence of deep neural networks (DNNs) around 7-8 years ago, the problem was treated with the standard pattern recognition paradigm, which consists of two cascaded modules: 1) unsupervised feature extraction and 2) supervised classification via common machine learning tools such as support vector machine (SVM) and random forest (RF) classifiers.
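For readers less familiar with this paradigm, the minimal sketch below pairs an unsupervised feature extractor (PCA) with an SVM classifier in scikit-learn. It only illustrates the two-module structure; the data, image size, and hyper-parameters are placeholders rather than those of any particular system discussed here.

```python
# A minimal sketch of the classical two-module pipeline:
# 1) unsupervised feature extraction, 2) supervised classification.
# The data, image size, and hyper-parameters are illustrative placeholders.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Placeholder data: 1000 flattened gray-scale 32x32 face crops with dummy labels.
rng = np.random.default_rng(0)
X = rng.random((1000, 32 * 32))
y = rng.integers(0, 2, size=1000)   # 0 = female, 1 = male (dummy labels)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Module 1: unsupervised PCA features; Module 2: SVM classifier.
model = make_pipeline(PCA(n_components=64), SVC(kernel="rbf"))
model.fit(X_train, y_train)
print("test accuracy:", model.score(X_test, y_test))
```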

We have seen rapid progress on this topic in recent years due to the application of deep learning (DL) technology. Cloud-based face verification, recognition, and attribute classification technologies have matured and are now used in many real-world biometric systems. Convolutional neural networks (CNNs) offer high classification accuracy. Yet, they rely on large learning models consisting of hundreds of thousands or even millions of parameters. Their superior performance is attributable to factors such as higher input image resolutions, ever larger training sets, and abundant computational/memory resources.

Edge/mobile computing in a resource-constrained environment cannot meet the above-mentioned conditions. The technology of our interest finds applications in rescue missions and/or field operational settings in remote locations, where the accompanying face inference tasks are expected to execute on limited computing and communication infrastructure. It is essential to have a smaller learning model, lower training and inference complexity, and lower input image resolution. The last requirement arises from the need to image individuals at farther standoff distances, which results in faces with fewer pixels.

In this research, MCL worked closely with ARL researchers to develop a new interpretable, non-parametric machine learning solution called the FaceHop method. FaceHop has several desirable characteristics, including a small model size, a small amount of training data, low training complexity, and low-resolution input images. FaceHop follows the traditional pattern recognition paradigm that decouples the feature extraction module from the decision module, but it extracts statistical features automatically instead of relying on handcrafted features.
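To make the idea of automatically extracted statistical features concrete, the hedged sketch below derives data-driven features from local image patches via patch-wise PCA and feeds them to a lightweight linear classifier. This is only an illustration of the decoupled design under assumed patch sizes and dimensions; it is not the FaceHop implementation, and all names and numbers are placeholders.

```python
# Hedged sketch: data-driven statistical features from local patches
# (patch-wise PCA), decoupled from a simple classifier. This illustrates
# the decoupled design only; it is not the FaceHop implementation.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression

def extract_patches(images, patch=4, stride=4):
    """Split each HxW image into non-overlapping patch x patch blocks."""
    n, h, w = images.shape
    blocks = []
    for i in range(0, h - patch + 1, stride):
        for j in range(0, w - patch + 1, stride):
            blocks.append(images[:, i:i + patch, j:j + patch].reshape(n, -1))
    return np.stack(blocks, axis=1)           # shape: (n, num_patches, patch*patch)

# Placeholder data: gray-scale 32x32 faces with dummy gender labels.
rng = np.random.default_rng(0)
X = rng.random((1000, 32, 32))
y = rng.integers(0, 2, size=1000)

patches = extract_patches(X)                  # (1000, 64, 16)
n, p, d = patches.shape

# Unsupervised, statistical feature extraction: PCA fit on all patches.
pca = PCA(n_components=8).fit(patches.reshape(-1, d))
feats = pca.transform(patches.reshape(-1, d)).reshape(n, p * 8)

# Decoupled decision module: a lightweight linear classifier.
clf = LogisticRegression(max_iter=1000).fit(feats[:800], y[:800])
print("held-out accuracy:", clf.score(feats[800:], y[800:]))
```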

The effectiveness of the FaceHop method is demonstrated by experiments on two benchmark datasets. For gray-scale face images of resolution 32×32 obtained from the LFW and the CMU Multi-PIE datasets, FaceHop achieves correct gender classification rates of 94.63% and 95.12% with model sizes of 16.9K and 17.6K parameters, respectively. FaceHop outperforms LeNet-5 in classification accuracy while using a significantly smaller model; LeNet-5 contains 75.8K parameters.

— by Dr. C.-C. Jay Kuo