The NDK supports ARM Advanced SIMD, commonly known as NEON, an optional instruction-set extension of the ARMv7 spec. NEON provides a set of scalar/vector instructions and registers (shared with the FPU) comparable to MMX/SSE/3DNow! in the x86 world. To function, it requires VFPv3-D32 (32 64-bit hardware FPU registers, instead of the minimum of 16).
The NDK can compile whole modules, or even individual source files, with NEON support. When it does, a single compiler flag enables both the GCC ARM NEON intrinsics and VFPv3-D32.
Not all ARMv7-based Android devices support NEON, but devices that do may benefit significantly from its support for scalar/vector instructions. For x86 devices, the NDK can also translate NEON instructions into SSE, although with several restrictions. For more information, see x86 Support for ARM NEON Intrinsics.
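As a rough illustration of what the GCC ARM NEON intrinsics mentioned above look like in a source file, here is a minimal sketch. The function name add_f32 and the HAVE_NEON_BUILD macro are hypothetical and not part of the NDK. The sketch assumes the compiler defines __ARM_NEON (or __ARM_NEON__) when NEON code generation is enabled, and it falls back to plain C otherwise.

```c
#include <stddef.h>

/* Assumption: the compiler defines one of these macros when the file is
 * built with NEON enabled (for example via LOCAL_ARM_NEON or .neon). */
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#define HAVE_NEON_BUILD 1
#else
#define HAVE_NEON_BUILD 0
#endif

/* Hypothetical helper: out[i] = a[i] + b[i] for n floats. */
void add_f32(const float *a, const float *b, float *out, size_t n)
{
    size_t i = 0;
#if HAVE_NEON_BUILD
    /* Process four floats per iteration in 128-bit NEON registers. */
    for (; i + 4 <= n; i += 4) {
        float32x4_t va = vld1q_f32(a + i);
        float32x4_t vb = vld1q_f32(b + i);
        vst1q_f32(out + i, vaddq_f32(va, vb));
    }
#endif
    /* Scalar loop: handles the tail, or everything in a non-NEON build. */
    for (; i < n; ++i) {
        out[i] = a[i] + b[i];
    }
}
```

The compile-time guard only decides which code is generated. It does not establish that the device running the binary actually supports NEON, which is what the Runtime Detection section covers.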
Using LOCAL_ARM_NEON
To have the NDK build all of a module's source files with NEON support, include the following line in your module definition:
LOCAL_ARM_NEON := true
It can be especially useful to build all source files with NEON support if you want to build a static or shared library that specifically contains NEON code paths.
Using the .neon Suffix
When listing source files in your LOCAL_SRC_FILES variable, you can add the .neon suffix to a file name to indicate that you want that file built with NEON support.
For example, the following definition builds one file with NEON support, and another without it:
LOCAL_SRC_FILES := foo.c.neon bar.c
You can combine the .neon suffix with the .arm suffix, which specifies the 32-bit ARM instruction set for non-NEON instructions. In such a definition, arm must come before neon. For example: foo.c.arm.neon works, but foo.c.neon.arm does not.
Build Requirements
NEON support only works with the armeabi-v7a and x86 ABIs. If the NDK build scripts encounter other ABIs while attempting to build with NEON support, they exit. x86 provides partial NEON support via a translation header. It is important to use checks like the following in your Android.mk file:
# define a static library containing our NEON code
ifeq ($(TARGET_ARCH_ABI),$(filter $(TARGET_ARCH_ABI), armeabi-v7a x86))
include $(CLEAR_VARS)
LOCAL_MODULE := mylib-neon
LOCAL_SRC_FILES := mylib-neon.c
LOCAL_ARM_NEON := true
include $(BUILD_STATIC_LIBRARY)
endif # TARGET_ARCH_ABI == armeabi-v7a || x86
Runtime Detection
Your app must perform runtime detection to confirm that NEON-capable machine code can be run on the target device. This is because not all ARMv7-based Android devices support NEON. The app can perform this check using the cpufeatures library that comes with this NDK.
You should explicitly check that android_getCpuFamily() returns ANDROID_CPU_FAMILY_ARM, and that android_getCpuFeatures() returns a value with the ANDROID_CPU_ARM_FEATURE_NEON flag set. For example:
#include <cpu-features.h>
...
...
if (android_getCpuFamily() == ANDROID_CPU_FAMILY_ARM &&
    (android_getCpuFeatures() & ANDROID_CPU_ARM_FEATURE_NEON) != 0)
{
    // use NEON-optimized routines
    ...
}
else
{
    // use non-NEON fallback routines instead
    ...
}
...
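One way to avoid repeating this check at every call site is to run it once and cache a function pointer. The following is a minimal sketch of that pattern, not something the NDK prescribes. All names other than the cpufeatures calls (sum_fn, sum_scalar, sum_neon_stub, device_has_neon, select_sum, sum_dispatch) are hypothetical. The NEON routine is a stand-in that forwards to the C version so the sketch stays self-contained, and the sketch assumes a non-Android build should always take the fallback path.

```c
#include <stddef.h>

#if defined(__ANDROID__)
#include <cpu-features.h>   /* from the NDK's cpufeatures library */
#endif

typedef long (*sum_fn)(const int *data, size_t n);

/* Portable fallback: plain C loop. */
long sum_scalar(const int *data, size_t n)
{
    long total = 0;
    for (size_t i = 0; i < n; ++i) {
        total += data[i];
    }
    return total;
}

/* Stand-in for a NEON-optimized routine. In a real module this would be
 * defined in a source file built with NEON support. */
long sum_neon_stub(const int *data, size_t n)
{
    return sum_scalar(data, n);
}

/* Wraps the check shown above; returns 0 on non-Android builds (assumption). */
int device_has_neon(void)
{
#if defined(__ANDROID__)
    return android_getCpuFamily() == ANDROID_CPU_FAMILY_ARM &&
           (android_getCpuFeatures() & ANDROID_CPU_ARM_FEATURE_NEON) != 0;
#else
    return 0;
#endif
}

/* Picks an implementation from the detection result. */
sum_fn select_sum(int has_neon)
{
    return has_neon ? sum_neon_stub : sum_scalar;
}

/* Performs detection on the first call and reuses the choice afterwards. */
long sum_dispatch(const int *data, size_t n)
{
    static sum_fn impl = NULL;
    if (impl == NULL) {
        impl = select_sum(device_has_neon());
    }
    return impl(data, n);
}
```

Callers then use sum_dispatch(data, n) without needing to know which variant runs. The lazy initialization here is not synchronized, so an app calling it from several threads may prefer to select the implementation once during startup.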
Sample Code
The source code for the NDK's hello-neon sample provides an example of how to use the cpufeatures library and NEON intrinsics at the same time. This sample implements a tiny benchmark for a FIR filter loop using a C version, and a NEON-optimized one for devices that support it.
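To give a sense of what a FIR filter loop with a C path and a NEON path might look like, here is a small sketch. It is not the hello-neon source, and the function name fir_filter, the FIR_HAVE_NEON macro, and the parameter layout are assumptions made for illustration. It assumes 16-bit samples and coefficients, 32-bit accumulators that do not overflow for the data used, and an input buffer holding at least out_len + kernel_len - 1 samples.

```c
#include <stddef.h>
#include <stdint.h>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#define FIR_HAVE_NEON 1
#else
#define FIR_HAVE_NEON 0
#endif

/* Hypothetical FIR filter:
 *   out[i] = sum over k of in[i + k] * kernel[k]
 * 'in' must hold at least out_len + kernel_len - 1 samples. */
void fir_filter(int32_t *out, const int16_t *in, size_t out_len,
                const int16_t *kernel, size_t kernel_len)
{
    for (size_t i = 0; i < out_len; ++i) {
        int32_t sum = 0;
        size_t k = 0;
#if FIR_HAVE_NEON
        /* Multiply-accumulate four taps at a time into 32-bit lanes. */
        int32x4_t acc = vdupq_n_s32(0);
        for (; k + 4 <= kernel_len; k += 4) {
            int16x4_t x = vld1_s16(in + i + k);
            int16x4_t c = vld1_s16(kernel + k);
            acc = vmlal_s16(acc, x, c);
        }
        /* Fold the four partial sums into one scalar. */
        sum = vgetq_lane_s32(acc, 0) + vgetq_lane_s32(acc, 1) +
              vgetq_lane_s32(acc, 2) + vgetq_lane_s32(acc, 3);
#endif
        /* Scalar taps: the remainder, or all taps in a non-NEON build. */
        for (; k < kernel_len; ++k) {
            sum += (int32_t)in[i + k] * kernel[k];
        }
        out[i] = sum;
    }
}
```

Both paths compute the same result, so the scalar build can serve as a reference when checking the NEON build.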