Face Recognition in iOS - Christmas Edition


Computer vision is a fascinating field of computer science with many practical applications. There are powerful libraries, such as OpenCV and Vuforia, that can perform computer vision tasks like object tracking and shape or face recognition. These libraries are great if you need the most capable solution and are prepared to spend a lot of time learning new things. But while they are powerful, sometimes you might want a simple solution that you can implement in 30 minutes.

What makes iOS great is that there are loads of native frameworks that do magical things for us. One of these is Core Image. It has many useful functions for working with pictures, but for the purposes of this article we will use its face recognition function. Today I will show you the simplest way to recognize faces in live video. To spice things up, we will add a beard, a mustache, and a Santa hat to your face.

The basis for this tutorial is taken from Apple’s example SquareCam.

You can get the full project source code from Devbridge's GitHub repository.


This example was written using Xcode 4.5.2, which you can download from the Mac App Store or the Apple Developer website. Because the Core Image framework appeared in iOS 5.0, you will need a device running iOS 5.0 or later. You also need to be registered in the iOS Developer Program to run apps on your device. This example does NOT work in the Simulator, as the Simulator cannot access the camera.


Create a new project and choose the “Single View Application” option.

[Image: Ios 1]

After filling in the info about your new app go to “Build Phases” and link the shown libraries to the project:

[Image: Ios 2]

Because we want to make the application as simple as possible, we will disable rotation for it and presume that the user will always hold it in portrait mode. If you want to see how to implement face recognition for all orientations please look at Apple’s SquareCam example.

[Image: Ios 3]

Create a new file and name it DetectFace. This class will recognize faces, return their features, and display the video in a UIView.

We will need a layer to show the video (previewLayer), a view to display that layer (previewView), a video data output instance to process the raw data (videoDataOutput), a video data output queue so recognition doesn't block the main thread (videoDataOutputQueue), and a face detector instance to detect faces (faceDetector).

@property (nonatomic, strong) UIView *previewView;
@property (nonatomic, strong) AVCaptureVideoPreviewLayer *previewLayer;
@property (nonatomic, strong) AVCaptureVideoDataOutput *videoDataOutput;
@property (nonatomic, assign) dispatch_queue_t videoDataOutputQueue;
@property (nonatomic, strong) CIDetector *faceDetector;

First, set up an AVCapture session and set all options. We will be using the front camera with VGA quality (640x480).

- (void)setupAVCapture
{
    AVCaptureSession *session = [AVCaptureSession new];
    [session setSessionPreset:AVCaptureSessionPreset640x480];

    // Select a video device and make an input
    // (in a real app you would use the camera the user chose)
    AVCaptureDevice *device = [AVCaptureDevice defaultDeviceWithMediaType:AVMediaTypeVideo];
    if ([UIImagePickerController isCameraDeviceAvailable:UIImagePickerControllerCameraDeviceFront]) {
        for (AVCaptureDevice *d in [AVCaptureDevice devicesWithMediaType:AVMediaTypeVideo]) {
            if ([d position] == AVCaptureDevicePositionFront)
                device = d;
        }
    }

    NSError *error = nil;
    AVCaptureDeviceInput *deviceInput = [AVCaptureDeviceInput deviceInputWithDevice:device error:&error];
    if (error != nil)
        return;
    if ([session canAddInput:deviceInput])
        [session addInput:deviceInput];

    // Make a video data output
    self.videoDataOutput = [AVCaptureVideoDataOutput new];
    // We want BGRA; both Core Graphics and OpenGL work well with 'BGRA'
    NSDictionary *rgbOutputSettings = @{(id)kCVPixelBufferPixelFormatTypeKey : @(kCMPixelFormat_32BGRA)};
    [self.videoDataOutput setVideoSettings:rgbOutputSettings];
    [self.videoDataOutput setAlwaysDiscardsLateVideoFrames:YES]; // discard frames if the data output queue is blocked
    self.videoDataOutputQueue = dispatch_queue_create("VideoDataOutputQueue", DISPATCH_QUEUE_SERIAL);
    [self.videoDataOutput setSampleBufferDelegate:self queue:self.videoDataOutputQueue];
    if ([session canAddOutput:self.videoDataOutput])
        [session addOutput:self.videoDataOutput];
    [[self.videoDataOutput connectionWithMediaType:AVMediaTypeVideo] setEnabled:NO];

    self.previewLayer = [[AVCaptureVideoPreviewLayer alloc] initWithSession:session];
    [self.previewLayer setBackgroundColor:[[UIColor blackColor] CGColor]];
    [self.previewLayer setVideoGravity:AVLayerVideoGravityResizeAspect];
    CALayer *rootLayer = [self.previewView layer];
    [rootLayer setMasksToBounds:YES];
    [self.previewLayer setFrame:[rootLayer bounds]];
    [rootLayer addSublayer:self.previewLayer];
    [session startRunning];
}

Next, let's implement the AVCaptureVideoDataOutputSampleBufferDelegate method captureOutput:didOutputSampleBuffer:fromConnection:, in which we extract the image from the buffer, pass it to the face recognition function featuresInImage:options:, and then send the features array to the controller, which implements DetectFaceDelegate.

- (void)captureOutput:(AVCaptureOutput *)captureOutput didOutputSampleBuffer:(CMSampleBufferRef)sampleBuffer fromConnection:(AVCaptureConnection *)connection
{
    // Got an image
    CVPixelBufferRef pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer);
    CFDictionaryRef attachments = CMCopyDictionaryOfAttachments(kCFAllocatorDefault, sampleBuffer, kCMAttachmentMode_ShouldPropagate);
    CIImage *ciImage = [[CIImage alloc] initWithCVPixelBuffer:pixelBuffer options:(__bridge NSDictionary *)attachments];
    if (attachments)
        CFRelease(attachments);

    /* kCGImagePropertyOrientation: the intended display orientation of the image,
       an integer NSNumber from 1..8 with the same values as defined by the TIFF
       and EXIF specifications; it specifies where the origin (0,0) of the image
       is located (1 is assumed if absent). If present, detection is done based on
       that orientation, but the coordinates in the returned features are still
       based on those of the image. */
    int exifOrientation = 6; // 6 = 0th row is on the right, 0th column is the top; portrait mode
    NSDictionary *imageOptions = @{CIDetectorImageOrientation : @(exifOrientation)};
    NSArray *features = [self.faceDetector featuresInImage:ciImage options:imageOptions];

    // Get the clean aperture: a rectangle that defines the portion of the encoded
    // pixel dimensions that represents image data valid for display
    CMFormatDescriptionRef fdesc = CMSampleBufferGetFormatDescription(sampleBuffer);
    CGRect clap = CMVideoFormatDescriptionGetCleanAperture(fdesc, false /*originIsTopLeft == false*/);

    // This delegate method is called asynchronously as the capture output captures
    // sample buffers; hop to the main queue to update the UI
    dispatch_async(dispatch_get_main_queue(), ^(void) {
        CGSize parentFrameSize = [self.previewView frame].size;
        NSString *gravity = [self.previewLayer videoGravity];
        CGRect previewBox = [DetectFace videoPreviewBoxForGravity:gravity frameSize:parentFrameSize apertureSize:clap.size];
        if ([self.delegate respondsToSelector:@selector(detectedFaceController:features:forVideoBox:withPreviewBox:)])
            [self.delegate detectedFaceController:self features:features forVideoBox:clap withPreviewBox:previewBox];
    });
}

We also need functions to start and stop detection. Because we are recognizing faces in live video, we want our detector to be fast, so we will use the low-accuracy option.

- (void)startDetection
{
    [self setupAVCapture];
    [[self.videoDataOutput connectionWithMediaType:AVMediaTypeVideo] setEnabled:YES];
    NSDictionary *detectorOptions = @{CIDetectorAccuracy : CIDetectorAccuracyLow};
    self.faceDetector = [CIDetector detectorOfType:CIDetectorTypeFace context:nil options:detectorOptions];
}

- (void)stopDetection
{
    [self teardownAVCapture];
}

// Clean up the capture setup
- (void)teardownAVCapture
{
    if (self.videoDataOutputQueue)
        self.videoDataOutputQueue = nil;
}

To pass the face feature array to our view controller, we will use a delegate.

@protocol DetectFaceDelegate <NSObject>
- (void)detectedFaceController:(DetectFace *)controller features:(NSArray *)featuresArray forVideoBox:(CGRect)clap withPreviewBox:(CGRect)previewBox;
@end

We use two additional helper functions to calculate the correct bounds for the images:
+ (CGRect)videoPreviewBoxForGravity:(NSString *)gravity frameSize:(CGSize)frameSize apertureSize:(CGSize)apertureSize;
+ (CGRect)convertFrame:(CGRect)originalFrame previewBox:(CGRect)previewBox forVideoBox:(CGRect)videoBox isMirrored:(BOOL)isMirrored;

Please look at the source code for their implementation.

After creating DetectFace, open the MainStoryboard.storyboard file and add a UIView to your view controller. Then open the Assistant editor and connect this new view to your controller (select the UIView and drag it to your controller while holding Ctrl). Name it previewView.

[Image: Ios 4]

As I mentioned, we will be adding a beard, a mustache, and a hat to the video feed, so we need three image views.

@property (nonatomic, strong) UIImageView *hatImgView;
@property (nonatomic, strong) UIImageView *beardImgView;
@property (nonatomic, strong) UIImageView *mustacheImgView;

Also, create a DetectFace instance, which will do the recognition for us.

@property (strong, nonatomic) DetectFace *detectFaceController;

In the viewDidLoad method, initialize detectFaceController, set the view controller as its delegate, set previewView to display video from detectFaceController, and start detection.

- (void)viewDidLoad
{
    [super viewDidLoad];
    // Do any additional setup after loading the view, typically from a nib.
    self.detectFaceController = [[DetectFace alloc] init];
    self.detectFaceController.delegate = self;
    self.detectFaceController.previewView = self.previewView;
    [self.detectFaceController startDetection];
}

- (void)viewWillUnload
{
    [self.detectFaceController stopDetection];
    [super viewWillUnload];
}

The final step is to implement the DetectFace delegate method. In it, we check whether the image views already exist; if they don't, we create them with the appropriate images and add them to the video preview view. Then we take the coordinates of the face features and position the beard, mustache, and hat to match.

- (void)detectedFaceController:(DetectFace *)controller features:(NSArray *)featuresArray forVideoBox:(CGRect)clap withPreviewBox:(CGRect)previewBox
{
    if (!self.beardImgView) {
        self.beardImgView = [[UIImageView alloc] initWithImage:[UIImage imageNamed:@"beard"]];
        self.beardImgView.contentMode = UIViewContentModeScaleToFill;
        [self.previewView addSubview:self.beardImgView];
    }
    if (!self.mustacheImgView) {
        self.mustacheImgView = [[UIImageView alloc] initWithImage:[UIImage imageNamed:@"mustache"]];
        self.mustacheImgView.contentMode = UIViewContentModeScaleToFill;
        [self.previewView addSubview:self.mustacheImgView];
    }
    if (!self.hatImgView) {
        self.hatImgView = [[UIImageView alloc] initWithImage:[UIImage imageNamed:@"christmas_hat"]];
        self.hatImgView.contentMode = UIViewContentModeScaleToFill;
        [self.previewView addSubview:self.hatImgView];
    }

    for (CIFaceFeature *ff in featuresArray) {
        // Find the correct position for the overlays within the preview layer;
        // the feature box originates in the bottom left of the video frame
        // (bottom right if mirroring is turned on)
        CGRect faceRect = [ff bounds];
        // isMirrored because we are using the front camera
        faceRect = [DetectFace convertFrame:faceRect previewBox:previewBox forVideoBox:clap isMirrored:YES];

        float hat_width = 290.0;
        float hat_height = 360.0;
        float head_start_y = 150.0; // part of the hat image is transparent
        float head_start_x = 78.0;
        float width = faceRect.size.width * (hat_width / (hat_width - head_start_x));
        float height = width * hat_height / hat_width;
        float y = faceRect.origin.y - (height * head_start_y) / hat_height;
        float x = faceRect.origin.x - (head_start_x * width / hat_width);
        [self.hatImgView setFrame:CGRectMake(x, y, width, height)];

        float beard_width = 192.0;
        float beard_height = 171.0;
        width = faceRect.size.width * 0.6;
        height = width * beard_height / beard_width;
        y = faceRect.origin.y + faceRect.size.height - (80 * height / beard_height);
        x = faceRect.origin.x + (faceRect.size.width - width) / 2;
        [self.beardImgView setFrame:CGRectMake(x, y, width, height)];

        float mustache_width = 212.0;
        float mustache_height = 58.0;
        width = faceRect.size.width * 0.9;
        height = width * mustache_height / mustache_width;
        y = y - height + 5;
        x = faceRect.origin.x + (faceRect.size.width - width) / 2;
        [self.mustacheImgView setFrame:CGRectMake(x, y, width, height)];
    }
}


This is a very simple example but it’s a good start to create something interesting.

Please download the full project, with all source code and images, from the public GitHub repository (https://github.com/devbridge/examples/tree/master/SantaFace-iOS).

Have fun coding and have a nice holiday! Ho Ho Ho!

[Image: Ios 5]