Core ML: Integrating Machine Learning models in an iOS app
Today's customers expect applications tailored to their precise needs, and adding Machine Learning (ML) capabilities to your application can address this demand. ML, a branch of Artificial Intelligence (AI), lets software learn and improve from data without being explicitly programmed for each case. Rather than a developer hand-coding every rule, ML tools refine a model by continuously processing large amounts of business data and finding patterns in it.
However, such tasks require computing power that has historically been beyond the capabilities of smartphones and tablets. As a result, smartphone applications have offloaded computationally heavy tasks to a remote data center over an Internet connection rather than running the algorithm on the mobile device itself. Today, developers can bring ML capabilities onto mobile platforms: both Apple and Google have launched frameworks that enable on-device ML. Apple's Machine Learning framework for iOS devices is known as Core ML.
What is Core ML?
Core ML is the foundation for domain-specific frameworks and functionality. It supports Vision for image analysis, Natural Language for Natural Language Processing (NLP), Speech for converting audio to text, and Sound Analysis for identifying sounds in audio. Core ML itself is built on low-level primitives such as Metal Performance Shaders.
The main concept lies in the Core ML model format specification. Core ML model files are serialized with protocol buffers (protobuf), a widely used serialization technology. The specification consists of several .proto files containing protobuf message definitions, which describe the different objects that can be found in an mlmodel file. The .proto files are part of the coremltools repository; they are plain text files that can be opened in any editor.
Model.proto is the main file in the format specification. It defines what a model is, what kinds of inputs and outputs a model can have, and which types of models exist. The model definition contains an important property, the specification version. This version number determines which functionality is available in the mlmodel file and which iOS versions can run the model.
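As a rough illustration of how the specification version ties a model to an operating system release, the sketch below maps the versions discussed in this article to the earliest iOS release that can load them. The function name minimumIOSVersion is hypothetical, and the mapping follows the comments in Model.proto as of Core ML 3; treat it as an assumption and check the coremltools repository for the authoritative list.

```swift
/// Returns the earliest iOS release that can load a model with the given
/// specification version, or nil for versions this sketch does not cover.
/// The mapping is an assumption based on the comments in Model.proto.
func minimumIOSVersion(forSpecificationVersion version: Int) -> String? {
    switch version {
    case 1: return "11.0"   // original Core ML release
    case 2: return "11.2"   // adds 16-bit floating-point weights
    case 3: return "12.0"   // Core ML 2
    case 4: return "13.0"   // Core ML 3
    default: return nil     // later or unknown versions are out of scope here
    }
}

print(minimumIOSVersion(forSpecificationVersion: 4) ?? "unknown")  // prints "13.0"
```

A lookup like this can help when deciding whether a converted model is safe to ship to an app's minimum deployment target.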
Feature upgrades in Specification Versions 2 and 3
Specification version 2 was a small update that added support for 16-bit floating-point weights, which makes mlmodel files about 2x smaller. Core ML 2 (specification version 3) followed with more significant features: weight quantization for even smaller mlmodel files (with no change in inference speed), flexible input sizes, a batch prediction API, and better support for sequential data.
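The size savings follow directly from the number of bits stored per weight. The sketch below works through the arithmetic for a hypothetical model with 1.25 million parameters; the function name and the parameter count are illustrative assumptions, and a real mlmodel file carries some additional overhead beyond the raw weights.

```swift
/// Approximate storage needed for a model's weights, in megabytes (10^6 bytes).
func weightStorageMB(parameterCount: Int, bitsPerWeight: Int) -> Double {
    let bytes = Double(parameterCount) * Double(bitsPerWeight) / 8.0
    return bytes / 1_000_000.0
}

let parameterCount = 1_250_000  // hypothetical model size

let full = weightStorageMB(parameterCount: parameterCount, bitsPerWeight: 32)       // 32-bit floats
let half = weightStorageMB(parameterCount: parameterCount, bitsPerWeight: 16)       // 16-bit floats
let quantized = weightStorageMB(parameterCount: parameterCount, bitsPerWeight: 8)   // 8-bit quantization

print(full, half, quantized)  // prints "5.0 2.5 1.25"
```

Halving the bits per weight halves the weight storage, which is where the "about 2x smaller" figure comes from; quantizing to fewer bits shrinks the file further.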
Highlights of Core ML 3
The latest version of Core ML, Core ML 3 (spec v4), adds the following capabilities, both in Core ML itself and in the frameworks built on it:
On-device ML: Core ML 3 enables advanced neural networks with support for over 100 types of layers, and seamlessly takes advantage of the CPU, GPU, and Neural Engine to provide maximum performance and efficiency. You can run ML models directly on the device.
On-device Training: Core ML models are bundled into apps to help drive intelligent features such as search or object recognition in photos. These models can be updated with on-device user data, helping the models to stay relevant to user behavior without compromising privacy.
Advanced Neural Networks: With support for advanced Neural Networks, sophisticated ML models can now run on-device with Core ML 3. Use and run the latest models, such as cutting-edge neural networks designed to understand images, video, sound, and other rich media.
Vision Framework: This framework simplifies building computer-vision ML features into your app. It offers advanced features such as face detection and tracking, face capture quality, text recognition, image saliency and classification, and image similarity identification. Other features of the Vision framework include improved landmark detection, rectangle detection, barcode detection, object tracking, and image registration. Apps that use Vision can also detect and capture documents with the camera through the new Document Camera API.
Natural Language: The Natural Language framework analyzes natural language text and extracts language-specific metadata from it. You can use this framework, together with Create ML, to train and deploy custom NLP models. Its features include transfer learning for Create ML text models, word embeddings, sentiment classification, and a text catalog. The framework supports English, French, German, Italian, Chinese, and Spanish language processing.
Speech: The Speech framework allows developers to use speech recognition for 10 languages, along with speech saliency features such as pronunciation information, streaming confidence, utterance detection, and acoustic features.
Model Object: The Model object in Core ML 3 contains an isUpdatable property. When this property is set, the model can be trained on-device with new data. Presently, it works only for neural networks and k-Nearest Neighbors (either as a standalone model or inside a Pipeline).
k-NN: k-Nearest Neighbors is a simple algorithm, but it is well suited to on-device training. A common method is to have a fixed neural network, such as VisionFeaturePrint, extract the features from the input data, and then use k-NN to classify those feature vectors. Such a model is fast to "train" because k-NN simply memorizes the examples you give it; it doesn't perform any actual learning on its own. A major downside of k-NN is that prediction becomes slow when a lot of examples have been memorized. However, Core ML also supports a k-d tree variant that is more efficient in such cases.
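To make the k-NN trade-off concrete, here is a minimal, self-contained sketch of a brute-force classifier. It is not Core ML's implementation; the type KNNClassifier, its methods, and the choice of squared Euclidean distance are assumptions made for illustration. "Training" only appends an example, while prediction has to measure the distance to every stored example, which is why it slows down as the memory grows.

```swift
/// A toy k-Nearest Neighbors classifier over feature vectors (illustrative only).
struct KNNClassifier {
    private var examples: [(features: [Double], label: String)] = []
    let k: Int

    init(k: Int) { self.k = k }

    /// "Training" just memorizes the example; no learning takes place.
    mutating func add(features: [Double], label: String) {
        examples.append((features: features, label: label))
    }

    /// Brute-force prediction: rank all stored examples by distance,
    /// then take a majority vote among the k closest.
    func predict(_ query: [Double]) -> String? {
        guard !examples.isEmpty else { return nil }
        let ranked = examples
            .map { (distance: squaredDistance($0.features, query), label: $0.label) }
            .sorted { $0.distance < $1.distance }
        var votes: [String: Int] = [:]
        for neighbor in ranked.prefix(k) {
            votes[neighbor.label, default: 0] += 1
        }
        return votes.max { $0.value < $1.value }?.key
    }

    /// Squared Euclidean distance; the square root is not needed for ranking.
    private func squaredDistance(_ a: [Double], _ b: [Double]) -> Double {
        var sum = 0.0
        for (x, y) in zip(a, b) { sum += (x - y) * (x - y) }
        return sum
    }
}

// Usage: in practice the vectors would come from a feature extractor.
var classifier = KNNClassifier(k: 3)
classifier.add(features: [0, 0], label: "cat")
classifier.add(features: [0, 1], label: "cat")
classifier.add(features: [1, 0], label: "cat")
classifier.add(features: [5, 5], label: "dog")
classifier.add(features: [5, 6], label: "dog")
classifier.add(features: [6, 5], label: "dog")

print(classifier.predict([0.2, 0.1]) ?? "no examples")  // prints "cat"
```

A k-d tree index avoids the full scan by partitioning the stored vectors in advance, at the cost of a more involved data structure.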
How to Integrate a Core ML model into your app?
We will use SqueezeNet.mlmodel for this quick tutorial. It can be downloaded from Apple's machine learning page, which offers pre-trained ML models that can also be used offline.
After downloading SqueezeNet.mlmodel, drag it from Finder into your project’s Project Navigator.
Select the file, and wait for a moment. An arrow will appear when Xcode has generated the model class.
Click the arrow to see the generated class.
You can now see the input and output classes that Xcode has generated, along with the main class, SqueezeNet. The main class contains a model property and two prediction methods.
Note: You might want to use the Vision framework here. It converts familiar image formats into the input type the model expects, converts the SqueezeNetOutput properties into its own result types, and manages the calls to the prediction methods. As a result, out of all the generated code, your code will use only the model property.
Using the Vision framework to wrap the Core ML model
In your ViewController, import the Core ML and Vision frameworks.
Then add a function, detectImage, that wraps the model in the Vision framework.
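A minimal sketch of these two steps follows, assuming the two frameworks are Core ML and Vision and that the model is an image classifier such as SqueezeNet. The exact signature of detectImage is an assumption: the model, keyword, and completion parameters, as well as the helper labelMatches, are hypothetical names introduced here so that the condition being checked is explicit.

```swift
import CoreML
import Vision
import CoreImage

/// The condition checked against the top classification label (hypothetical example).
func labelMatches(_ label: String, keyword: String) -> Bool {
    return label.lowercased().contains(keyword.lowercased())
}

/// Classifies `image` with a Core ML model wrapped in Vision and reports
/// whether the top label satisfies the condition above.
func detectImage(image: CIImage,
                 model: MLModel,
                 keyword: String,
                 completion: @escaping (Bool) -> Void) {
    // Wrap the Core ML model so that Vision can drive it.
    guard let visionModel = try? VNCoreMLModel(for: model) else {
        completion(false)
        return
    }

    // Vision converts the image to the input type the model expects
    // and hands back classification observations sorted by confidence.
    let request = VNCoreMLRequest(model: visionModel) { request, _ in
        guard let results = request.results as? [VNClassificationObservation],
              let top = results.first else {
            completion(false)
            return
        }
        completion(labelMatches(top.identifier, keyword: keyword))
    }

    // perform(_:) runs synchronously, so keep it off the main thread.
    DispatchQueue.global(qos: .userInitiated).async {
        let handler = VNImageRequestHandler(ciImage: image, options: [:])
        do {
            try handler.perform([request])
        } catch {
            completion(false)
        }
    }
}
```

In the tutorial project, the model argument would be the model property of the generated SqueezeNet class. The completion handler is called on a background queue, so dispatch back to the main queue before updating the UI.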
You can now use this function by passing any image to it as a CIImage and checking whether the image matches the condition defined in the detectImage function.
With these steps, you can easily integrate an existing model into your app and create some exciting user experiences.