NNEF Overview


Neural Network Exchange Format (NNEF)

NNEF reduces machine learning deployment fragmentation by enabling a rich mix of neural network training tools and inference engines to be used by applications across a diverse range of devices and platforms.

NNEF 1.0 Specification

The goal of NNEF is to enable data scientists and engineers to easily transfer trained networks from their chosen training framework into a wide variety of inference engines. A stable, flexible and extensible standard that equipment manufacturers can rely on is critical for the widespread deployment of neural networks onto edge devices. NNEF therefore encapsulates a complete description of the structure, operations and parameters of a trained neural network, independent of the training tools used to produce it and the inference engine used to execute it.

NNEF - Solving Neural Net Fragmentation

Convolutional Neural Networks (CNNs) are computationally expensive, and so many companies are actively developing mobile and embedded processor architectures to run neural net-based inference at high speed and low power. As a result of such rapid progress, the market for embedded neural net processing is in danger of fragmenting, creating barriers for developers seeking to configure and accelerate inference engines across multiple platforms.

Today, most neural net toolkits and inference engines use proprietary formats to describe the trained network parameters, making it necessary to construct many proprietary importers and exporters to enable a trained network to be executed across multiple inference engines.

Before NNEF and with NNEF diagram

NNEF has been designed to be reliably exported and imported across tools and engines such as Torch, Caffe, TensorFlow, Theano, Chainer, Caffe2, PyTorch, and MXNet. The NNEF 1.0 Specification covers a wide range of use-cases and network types with a rich set of operations and a scalable design that borrows syntactical elements from existing languages, with formal elements to aid in correctness. NNEF supports the definition of custom compound operations, which offer opportunities for sophisticated network optimizations. Future work will build on this architecture in a predictable way so that NNEF tracks the rapidly moving field of machine learning while providing a stable platform for deployment.

NNEF 1.0

Released as a stable version after industry feedback on the provisional version

Initial focus on passing trained networks from frameworks to embedded inference engines

  • Authoring interchange, importing NNEF into tools, is also an emerging use case

Support for a deployable range of network topologies

  • Rapid evolution to encompass new network types as they emerge from research

NNEF File Structure

Split Structure and Data files

  • Easy independent access to network structure or individual parameter data
  • Set of files can use a container such as tar or zip with optional compression and encryption
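
As a rough illustration of the container idea above, the following Python sketch bundles a folder holding one structure file and several parameter data files into a compressed tar archive, then reads a single member back without unpacking the rest. The helper names (`pack_nnef_folder`, `list_members`, `read_member`) and the example file names `graph.nnef` and `conv1/filter.dat` are assumptions made for this sketch; the authoritative file layout is defined by the NNEF specification.

```python
import tarfile
from pathlib import Path


def pack_nnef_folder(folder, archive_path):
    """Bundle every file under `folder` into a gzip-compressed tar archive.

    Hypothetical helper: the folder is assumed to hold one structure file
    plus one data file per parameter tensor.
    """
    folder = Path(folder)
    with tarfile.open(archive_path, "w:gz") as tar:
        for path in sorted(folder.rglob("*")):
            if path.is_file():
                # Store paths relative to the folder so the archive is relocatable.
                tar.add(path, arcname=path.relative_to(folder).as_posix())


def list_members(archive_path):
    """Return the sorted names of all members stored in the archive."""
    with tarfile.open(archive_path, "r:gz") as tar:
        return sorted(tar.getnames())


def read_member(archive_path, name):
    """Read a single member (structure or one tensor) as raw bytes."""
    with tarfile.open(archive_path, "r:gz") as tar:
        return tar.extractfile(name).read()
```

Because structure and data are stored as separate members, a tool that only needs the network topology can call `read_member(archive, "graph.nnef")` and skip the parameter data entirely.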

NNEF Implementations and Roadmap

Active NNEF roadmap development

  • Track development of new network types
  • Explicit support for authoring interchange
  • Support retraining with third party NNEF tools
  • Address an ever wider range of applications
  • Increase the expressive power of the format

NNEF open source projects

  • Hosted on Khronos NNEF GitHub repository
  • Apache 2.0 license
  • More RFQ projects coming


NNEF Working Group Participants

NNEF Industry Support

Khronos has initiated a series of open source projects, including an NNEF syntax parser/validator and example exporters from a selection of frameworks such as TensorFlow, and welcomes the participation of the machine learning community to make NNEF useful for their own workflows. In addition, the NNEF working group is working closely with the Khronos OpenVX™ working group to enable ingestion of NNEF files. The OpenVX Neural Network extension enables OpenVX 1.2 to act as a cross-platform inference engine, combining computer vision and deep learning operations in a single graph.

Get Involved with NNEF

The NNEF™ working group has released NNEF 1.0. Learn more about becoming a Khronos member and help define the Khronos Neural Network Exchange Format.

Read the press release

List of changes since Provisional version

  • renamed extent type to integer
  • made semicolon after assignments mandatory
  • depth-wise convolution can be expressed with 'groups' = 0
  • added new operations: squeeze, unsqueeze, stack, unstack, slice, argmax_reduce, prelu, RoI operations
  • matmul operation generalized to batched version, 'trA' and 'trB' parameters renamed to 'transposeA' and 'transposeB'
  • renamed 'perm' parameter of transpose operation to 'axes'
  • added 'epsilon' parameter to normalization operations
  • variable labels are restricted to a limited set of characters allowed in file names, and are compared case-insensitively
  • added the missing negative numeric literals to the flat syntax
  • changed syntax of array comprehension to let the loop variable be declared before the loop core expression
  • added the 'in' operator for testing containment of items in arrays
  • made parentheses mandatory for tuple literals in rvalue expressions
  • added appendix that contains both grammars in one place
  • added syntax to specify extensions in the structure description
  • revised tensor binary header (fixed size, aligned fields, quantization info is represented with binary fields instead of text)
  • introduced syntax for explicit tensor data-type specification for operations
  • introduced explicit generic tensor data-type syntax
  • clarifications about type system and casting
  • clarified tensor rank definition, added explicit tensor rank pre- and post-conditions for operations
  • added 'output_shape' parameter to deconv-like operations
  • removed the option of using 0s in the external operation to indicate unknown shapes
  • enhancement of formulas and wording in various places
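
To illustrate the batched matmul entry in the list above, here is a minimal pure-Python sketch of what a batched matrix product with 'transposeA' and 'transposeB' flags computes. It assumes both inputs are rank-3 nested lists (batch, rows, columns) with equal batch size; the function names are hypothetical, and the exact rank and shape rules of the real operation are those given in the NNEF specification, not this sketch.

```python
def transpose(matrix):
    """Swap the rows and columns of a 2D nested list."""
    return [list(row) for row in zip(*matrix)]


def matmul_2d(a, b):
    """Plain matrix product of two 2D nested lists."""
    b_cols = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in b_cols] for row in a]


def batched_matmul(A, B, transposeA=False, transposeB=False):
    """Multiply matching matrices of two batches, optionally transposing each.

    Illustrative only: assumes rank-3 inputs with the same batch size.
    """
    if len(A) != len(B):
        raise ValueError("batch sizes differ")
    result = []
    for a, b in zip(A, B):
        if transposeA:
            a = transpose(a)
        if transposeB:
            b = transpose(b)
        if len(a[0]) != len(b):
            raise ValueError("inner dimensions do not match")
        result.append(matmul_2d(a, b))
    return result
```

For example, `batched_matmul([[[1, 2], [3, 4]]], [[[5, 6], [7, 8]]])` returns `[[[19, 22], [43, 50]]]`, while passing `transposeB=True` multiplies by the transposed second matrix and returns `[[[17, 23], [39, 53]]]`.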