Machine Learning Solutions at the Edge
Developing machine-learning-based solutions requires expertise from several technical disciplines, but most companies have only a few of them in-house. They therefore often hire data scientists, machine learning engineers and software developers who can create, train, tune and test machine learning models.
The crux: as a rule, these models do not run on embedded hardware or mobile devices, because most machine learning engineers have never deployed models on embedded hardware and are unfamiliar with its tight resource constraints. To run a trained model on mobile SoCs, FPGAs and microcontrollers, it must first be optimized and quantized.
Semiconductor manufacturers, in turn, face the task of developing products that meet new requirements in terms of performance, cost and form factor, all under tight time-to-market constraints. Flexibility in interfaces, inputs and outputs as well as in memory usage is required so that the solutions cover different use cases.
Optimization and quantization facilitated by TensorFlow Lite
Thanks to Google’s TensorFlow Lite, this has become somewhat easier in recent years. The open-source machine learning platform now includes tooling that optimizes and quantizes machine learning models into a FlatBuffers file (*.tflite), using parameters configured for the target deployment environment.
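As an illustration, the following is a minimal sketch of how such a quantized FlatBuffers file can be produced with the TensorFlow Lite converter. The tiny stand-in model, the random calibration data and the output file name are placeholders for this example, not details from the article:

```python
import numpy as np
import tensorflow as tf

# Stand-in for a real trained model; in practice, load the trained Keras model here.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(8,)),
    tf.keras.layers.Dense(4, activation="relu"),
    tf.keras.layers.Dense(2),
])

def representative_data():
    # A few calibration samples so the converter can choose int8 ranges.
    for _ in range(10):
        yield [np.random.rand(1, 8).astype(np.float32)]

converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_data
# Force full integer quantization, as typically required for MCU/FPGA targets.
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.int8
converter.inference_output_type = tf.int8

tflite_model = converter.convert()  # serialized FlatBuffers bytes
with open("model.tflite", "wb") as f:
    f.write(tflite_model)
```

The resulting *.tflite file contains the quantized weights and the graph in one FlatBuffers blob, which is what an embedded toolchain would then consume.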
Ideally, an embedded hardware solution can import the FlatBuffers files from TensorFlow directly, without resorting to proprietary or hardware-specific optimization techniques outside the TensorFlow ecosystem. Software and hardware engineers can then use the quantized and optimized file on FPGAs, SoCs and microcontrollers without friction.
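Before deploying to hardware, such a file can be exercised on the desktop with TensorFlow Lite's Python interpreter. A minimal sketch, in which the stand-in model and the input values are assumptions for illustration:

```python
import numpy as np
import tensorflow as tf

# Stand-in model converted in memory; normally the *.tflite file would be loaded from disk.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(4,)),
    tf.keras.layers.Dense(3),
])
tflite_model = tf.lite.TFLiteConverter.from_keras_model(model).convert()

# Run inference exactly as the runtime on the target would.
interpreter = tf.lite.Interpreter(model_content=tflite_model)
interpreter.allocate_tensors()
inp = interpreter.get_input_details()[0]
out = interpreter.get_output_details()[0]
interpreter.set_tensor(inp["index"], np.ones((1, 4), dtype=np.float32))
interpreter.invoke()
result = interpreter.get_tensor(out["index"])
```

Verifying the model this way catches unsupported operations and shape mismatches before the file ever reaches the embedded toolchain.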
SoC, MCU and FPGA compared
Embedded hardware platforms have limited resources, are not very developer-friendly and are demanding to work with. The reward is low power consumption, low cost and solutions with small dimensions. So what do SoCs, microcontrollers and FPGAs each offer?
SoCs offer the highest performance and many common interfaces, but usually also the highest power consumption. Their interface-specific inputs and outputs occupy a lot of die area, which makes them comparatively expensive.
Microcontrollers score with very low power consumption and a small form factor, but they are often severely limited in machine learning performance and model capacity. Even devices at the upper end of a product portfolio usually offer only specialized interfaces, such as camera or digital-microphone inputs.
FPGAs occupy the broad middle ground between microcontrollers and SoCs. They are available in a wide range of packages with flexible inputs and outputs, so they can implement almost any interface a particular application needs without wasting silicon area. In addition, their configurability means that cost and power consumption scale with performance and with the integration of additional functions. The problem with using FPGAs for machine learning has been their lack of support for, and integration with, SDK platforms such as TensorFlow Lite.
Hybrid µSoC FPGAs with additional PSRAM
To close this gap, Gowin Semiconductor provides an SDK on its GoAI 2.0 platform that extracts models and their coefficients and generates C code for the Arm Cortex-M processor integrated into the FPGAs, along with the bitstream or firmware for the FPGA fabric.
Another challenge is the large flash and RAM footprint of machine learning models. New hybrid µSoC FPGAs such as the Gowin GW1NSR4P address this by embedding 4-8 MB of additional PSRAM. On the GW1NSR4P, this memory is intended primarily for the GoAI 2.0 coprocessor, which accelerates the processing and storage of convolution and pooling layers. The coprocessor works in conjunction with the hardened Cortex-M IP, which controls the layer parameters, model processing and the output of results.
Many vendors of programmable semiconductors also run design-services programs to flatten their customers’ learning curve when using embedded hardware for machine learning. This also applies to Gowin: the GoAI Design Services program supports users who want an off-the-shelf single-chip solution for classification, or implementation support for tested, trained models, but who do not know how to program the embedded hardware themselves.
With such programs, the providers support companies and relieve them of work, so that they need fewer in-house resources for embedded machine learning and its implementation on embedded hardware (TinyML) and can concentrate more on their product development.
Local, embedded machine learning is currently a popular and steadily growing field for many product developers. However, there are significant challenges, since engineers from several disciplines are needed to develop these solutions. Some programmable-semiconductor vendors are responding by supporting popular ecosystem tools on embedded hardware and by offering devices with flexible interfaces and expanded memory, new software tools and design services.
This post is originally from our partner portal Aktuelle-Technik.ch.