
Exploratory Group - Heterogeneous Communication Overview

 

Your Chance to Help Set a New Communications Standard

Khronos is considering an open standardization initiative to unify low-level communication into a simple API with the aim of reducing application complexity, minimizing development costs, and improving time-to-market for high-performance embedded products. If successful, this new standard could transform the way applications are developed for heterogeneous systems.

Khronos’ methodical Exploratory Group process takes proposals for new open standards and evaluates industry interest before we create a Working Group to develop the standard itself. This process enables us to initiate and focus on standardization efforts that stand an excellent chance of being widely adopted and having a positive impact on the industry.

Khronos Exploratory Group Process

As part of this process we are reaching out to the industry and invite you to review the materials below for a potential new standard for Heterogeneous Communications. They outline the problems that this new standard would address and some potential design directions.

Finally, we encourage you to fill out the online survey below to give us your feedback, and to contact us at [email protected] if you would like to discuss getting involved.

Link to Press Release

The Problem to Solve

The Problem

Many cross-platform communication standards already exist, but for the most part each is focused on particular interconnect hardware, a homogeneous HPC architecture, or a single locality (inter-thread, inter-process, or inter-processor).

Each existing standard has different design methodologies, strengths, and weaknesses. Some are very complex, requiring hundreds of lines of code just to handle simple concepts. Others are intended to be simple but become deceptively complex in practice. Some mask important underlying features, which can hurt latency and determinism.

Unfortunately, there is no single standard that fits all localities, features, and strengths. A single application may be required to use multiple communication interfaces. Over time this application may need to be redistributed across different hardware, requiring a large refactoring of code (e.g. moving inter-process communication to inter-thread communication). This results in high development costs, compromising shortcuts, potential dependence on proprietary third-party APIs that are not de facto standards, and overwhelmed application developers.

A Potential Solution


At a fundamental level, communication is about getting data from one end point to another. This concept is the same for all interconnects and localities.

If multiple low-level communication APIs are wrapped into a high-level open standard without compromising their low-level features and strengths, then this new open standard could be a one-stop shop for developers.

Primary Goals for a New Open Standard

  • Simple API (small learning curve)
  • Built for Performance
    • Very low overhead (low latency, high throughput)
    • Deterministic
  • Dynamic and fault tolerant
  • Portability and Maintainability
    • Interconnect and locality agnostic (one solution for heterogeneous architectures)
    • Reliable (point-to-point) & unreliable (unicast, multicast, IO devices)
    • Extendable to custom interconnects, new technology insertion
    • Wrappers for collective functions
    • Allow for multiple language bindings

Using this new API, distributing an application across different cores, processors, GPUs and IO devices would be faster and easier than with existing solutions.
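To make the "interconnect and locality agnostic" goal concrete, here is a minimal sketch, not a proposed API. The names `ThreadLink`, `SocketLink`, `open_link`, and `app` are hypothetical and exist only for this illustration. The application function is written once; only the string passed to it selects whether the message travels through an in-memory queue or a local socket pair.

```python
import queue
import socket
import struct


class ThreadLink:
    """Hypothetical inter-thread backend: messages move through a queue."""

    def __init__(self):
        self._q = queue.Queue()

    def send(self, payload: bytes) -> None:
        self._q.put(payload)

    def recv(self) -> bytes:
        return self._q.get(timeout=2.0)

    def close(self) -> None:
        pass


class SocketLink:
    """Hypothetical socket backend: length-prefixed messages on a socket pair."""

    def __init__(self):
        self._tx, self._rx = socket.socketpair()
        self._rx.settimeout(2.0)

    def send(self, payload: bytes) -> None:
        # 4-byte big-endian length header, then the payload.
        self._tx.sendall(struct.pack("!I", len(payload)) + payload)

    def recv(self) -> bytes:
        (nbytes,) = struct.unpack("!I", self._read_exact(4))
        return self._read_exact(nbytes)

    def _read_exact(self, nbytes: int) -> bytes:
        data = bytearray()
        while len(data) < nbytes:
            part = self._rx.recv(nbytes - len(data))
            if not part:
                raise ConnectionError("peer closed the connection")
            data += part
        return bytes(data)

    def close(self) -> None:
        self._tx.close()
        self._rx.close()


BACKENDS = {"thread": ThreadLink, "socket": SocketLink}


def open_link(kind: str):
    """Pick a backend by name; the caller's code does not otherwise change."""
    return BACKENDS[kind]()


def app(kind: str) -> bytes:
    """Application logic, identical for every backend."""
    link = open_link(kind)
    try:
        link.send(b"frame-0")
        return link.recv()
    finally:
        link.close()
```

Calling `app("thread")` and `app("socket")` runs the same application code over two different transports, which is the kind of redistribution without refactoring that the goals above describe.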

Why not MPI, MCAPI, OFI, etc?

A new open standard would be conceptually similar to MPI, MCAPI, OFI, etc. It could be used as an alternative to these APIs, or layered over lower-level communication code (e.g. sockets), providing a clean, modern programming abstraction.

Current standards do not seem to fully address the above goals. Technology has moved on considerably since many of these were established. The following is a basic overview of common point-to-point technologies, and how they could compare to a new unified API:

Technologies compared: MPI, MCAPI, Sockets, ZeroMQ, OFI libfabric

Other technologies, such as OpenMP or HSA (Heterogeneous System Architecture)/HSAIL, approach distributed computation in a different way. These are different paradigms that do not compete with the goal of unifying existing point-to-point technologies.

More Details

A more detailed presentation explaining the rationale behind creating a new Heterogeneous Communications API standard is here:

If We Build It, Will You Come and Help?

Who would benefit and potentially be involved in creating and using a new communication open standard?

Potential Implementers:

  • Hardware vendors:
    • COTS board vendors
    • Chip vendors
  • Embedded software vendors:
    • Tool suite vendors
    • OS vendors

Potential users:

  • Mil/Aero primes
  • IoT product developers / vendors
  • Autonomous vehicle / Robot developers
  • Industrial & Medical device vendors

Khronos would ensure that a quorum of relevant companies is interested in participating before initiating a working group to create this new standard.

Help Us Decide the Next Steps

To help us get started, please fill out this survey to allow us to collect feedback on your interests and usage of communication APIs – and your thoughts on a new open standard.

Take the Survey 

Please contact us at [email protected] if you have questions, or would like to discuss getting involved.

New API Proposals

New API Proposals: Takyon

Khronos and Takyon
"Takyon is not a commercial Abaco product. Abaco has developed the API with the intention to get open community feedback and determine if Takyon is a suitable proposal, but also welcome suggested changes or other proposals."

 

Takyon’s goal is to cater to the embedded HPC (High Performance Computing) engineer who is focused on algorithm development, not the complexities of low-level communication. Such engineers need the performance and flexibility of a low-level API and the simplicity of a high-level one, so the application can be refactored in various ways without redesigning source code.

Where this is relevant:

  • Radar and radio communication processing
  • EO and IR video processing
  • Autonomous vehicles
  • Virtual reality, augmented reality, simulators
  • High-performance computing
  • Industrial, medical, IoT

Takyon's Prime Directives

Takyon is designed with the following requirements:

One Way, Zero Copy, Two Sided

This is the formula for achieving the best performance with modern interconnects.

  • One Way: Takyon’s knowledge of the message destination removes the need for back and forth coordination with the sender.
  • Zero Copy: The message is transferred without needing to use any additional intermediate buffering beyond what the underlying interconnect uses.
  • Two Sided: The receiver is notified when the message arrives. No reliance on extra application messages or application-induced receive-side polling.
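The three properties can be illustrated with a small in-process sketch. This is not Takyon code: `OneWayChannel` and its methods are hypothetical names, and shared memory between threads stands in for a real interconnect. The destination buffer is fixed when the channel is created (one way), the sender writes straight into it with no staging buffer (zero copy), and the receiver blocks on a notification instead of polling application data (two sided).

```python
import threading


class OneWayChannel:
    """Hypothetical illustration only; not part of any real API."""

    def __init__(self, recv_buffer: bytearray):
        # The destination memory is known up front, so no handshake is
        # needed before each transfer (one way).
        self._dest = memoryview(recv_buffer)
        self._cond = threading.Condition()
        self._pending = []  # (offset, nbytes) notifications for the receiver

    def send(self, data: bytes, dest_offset: int = 0) -> None:
        src = memoryview(data)
        nbytes = src.nbytes
        # One transfer directly into the receiver's buffer; no intermediate
        # buffer is allocated (zero copy in the sense used above).
        self._dest[dest_offset:dest_offset + nbytes] = src
        with self._cond:
            self._pending.append((dest_offset, nbytes))
            self._cond.notify()

    def recv(self, timeout=None):
        # The receiver is woken when a message arrives (two sided) and
        # learns where the data landed and how large it is.
        with self._cond:
            if not self._cond.wait_for(lambda: self._pending, timeout):
                raise TimeoutError("no message arrived")
            return self._pending.pop(0)
```

A receiver would allocate `buf = bytearray(16)`, build `OneWayChannel(buf)`, and after `recv()` returns `(offset, nbytes)` read the message in place from `buf[offset:offset + nbytes]`.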

Minimal Implicit Synchronization

Synchronizing requires messaging from one end point to the other and can perturb determinism and latency. Takyon only uses implicit synchronization to notify the sender when the message has left and notify the receiver when the message has arrived. All other synchronization is left to the application to control exactly when it happens. This is especially useful with multi-buffering when a synchronization signal only needs to occur after all the buffers are used up. The application should use explicit synchronization to know when:

  • The receiver is ready for more data and the sender can start another message transfer.
  • The sender's buffer can be filled (only needed in the case where the sender and receiver are sharing a common memory buffer).
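The multi-buffering case can be sketched as follows. This is an assumption-laden illustration rather than Takyon code: `run_multibuffered`, the queue names, and the buffer count are all hypothetical, and Python queues stand in for the interconnect. Each message triggers only the implicit "message arrived" notification; the explicit "receiver is ready for more" signal is sent once per full pass over the buffers, not once per message.

```python
import queue
import threading

NUM_BUFFERS = 4  # assumed depth of the multi-buffer set


def run_multibuffered(messages, num_buffers=NUM_BUFFERS):
    """Send messages through num_buffers slots.

    Returns (received_messages, explicit_sync_count).
    """
    slots = [None] * num_buffers
    arrived = queue.Queue()    # implicit: one arrival notification per message
    ring_free = queue.Queue()  # explicit: receiver tells sender to continue
    received = []
    sync_count = 0

    def receiver():
        for _ in range(len(messages)):
            slot = arrived.get()
            received.append(slots[slot])
            # Signal the sender only after the last slot has been consumed.
            if slot == num_buffers - 1:
                ring_free.put(True)

    rx = threading.Thread(target=receiver)
    rx.start()
    for i, msg in enumerate(messages):
        slot = i % num_buffers
        if i > 0 and slot == 0:
            # All buffers are in use: wait for the explicit sync signal.
            ring_free.get()
            sync_count += 1
        slots[slot] = msg
        arrived.put(slot)
    rx.join()
    return received, sync_count
```

With ten messages and four buffers, the sender waits for the explicit signal only twice (before messages 4 and 8) instead of ten times, which is the latency benefit the paragraph above describes.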

Fault Tolerance

Fault tolerance is the ability to recover after an error makes a communication path unusable. This is also known as HAA (high application availability). If something goes wrong with a communication path, the failure can be detected and the path can either be re-established or replaced by an alternative path or method, so that the application continues to run reliably. Fault tolerance in Takyon is achieved via these features:

  • Dynamic Connections: Create and destroy paths at any time during the application's life cycle without affecting any other established paths. Multiple paths can be created between the same two endpoints using the same or different interconnects.
  • Timeouts: When sending or receiving, a path that does not respond within a given amount of time is no longer considered responsive. If a timeout is detected, then the application can take the appropriate action to keep running reliably.
  • Disconnect Detection: While timeouts can imply degraded communication paths, most modern interconnects can detect when a path has disconnected due to remote failures, network failures, etc. If a disconnect is detected, then the application can take the appropriate action to keep running reliably.
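One way an application might combine these features is sketched below. The code is illustrative only and does not use Takyon: `Path`, `PathDisconnected`, and `recv_with_failover` are hypothetical names, and a queue simulates the interconnect. The application tries each available path in order and moves on when it sees either a timeout or a detected disconnect.

```python
import queue


class PathDisconnected(Exception):
    """Raised when a path is known to be down (hypothetical error type)."""


class Path:
    """Hypothetical in-process path used to simulate failures."""

    def __init__(self, name: str):
        self.name = name
        self._q = queue.Queue()
        self._connected = True

    def send(self, msg) -> None:
        if not self._connected:
            raise PathDisconnected(self.name)
        self._q.put(msg)

    def recv(self, timeout: float):
        if not self._connected:
            raise PathDisconnected(self.name)
        try:
            return self._q.get(timeout=timeout)
        except queue.Empty:
            # Nothing arrived in time: report the path as unresponsive.
            raise TimeoutError(self.name) from None

    def disconnect(self) -> None:
        # Simulates a remote or network failure.
        self._connected = False


def recv_with_failover(paths, timeout: float):
    """Return (path_name, message) from the first path that delivers."""
    for path in paths:
        try:
            return path.name, path.recv(timeout)
        except (TimeoutError, PathDisconnected):
            continue  # degraded or dead path: try the next one
    raise RuntimeError("all paths failed")
```

Because paths can be created dynamically, a real application could also respond to a failure by creating a fresh path instead of keeping a standby one; the list of paths here simply keeps the sketch short.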

Follow the Intuition, Not the Hardware

Fundamentally, communication is about getting data from A to B. When using a high-level communication package, there should be no need to have different API groups for each type of interconnect, each type of buffering/synchronization, or the different localities (thread, process, processor). Takyon's core API consists of five simple and intuitive communication functions:

  • Create: create a communication path to a remote endpoint
  • Send: send a message to the remote endpoint (blocking and non-blocking)
  • Send Completion Test: test if a previously started non-blocking send is complete
  • Receive: block until a message is received
  • Destroy: destroy the path

These functions can be used for reliable point-to-point communication and for unreliable one-sided communication, which is useful for things like IO device streaming (video, audio, Lidar, etc.) and network multicasting.
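The shape of such a five-function interface can be sketched in a few lines. The sketch below is not the Takyon API and does not reproduce its signatures: `create`, `Endpoint`, and the method names are hypothetical, and in-memory queues stand in for an interconnect. It only shows how create, send (blocking and non-blocking), send completion test, receive, and destroy fit together.

```python
import queue
import threading


class Endpoint:
    """One end of a hypothetical bidirectional path."""

    def __init__(self, outbox: queue.Queue, inbox: queue.Queue):
        self._outbox = outbox
        self._inbox = inbox
        self._open = True

    def _check_open(self) -> None:
        if not self._open:
            raise RuntimeError("path has been destroyed")

    def send(self, msg, blocking: bool = True) -> threading.Event:
        """Send a message; returns a handle for the completion test."""
        self._check_open()
        done = threading.Event()

        def deliver():
            self._outbox.put(msg)
            done.set()

        if blocking:
            deliver()
        else:
            threading.Thread(target=deliver, daemon=True).start()
        return done

    def is_send_finished(self, handle: threading.Event) -> bool:
        """Test whether a previously started non-blocking send is complete."""
        return handle.is_set()

    def recv(self, timeout=None):
        """Block until a message is received (queue.Empty on timeout)."""
        self._check_open()
        return self._inbox.get(timeout=timeout)

    def destroy(self) -> None:
        self._open = False


def create():
    """Create a path and return its two endpoints."""
    a_to_b, b_to_a = queue.Queue(), queue.Queue()
    return Endpoint(a_to_b, b_to_a), Endpoint(b_to_a, a_to_b)
```

Typical use would be `a, b = create()`, then `a.send(b"ping")` on one side and `b.recv()` on the other, with `destroy()` called on each endpoint when the path is no longer needed.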

Takyon Resources

  • Takyon Users Guide: A detailed usage guide for the Takyon API.
  • Takyon Proposal Overview: A presentation showing how Takyon can fill the “New API” requirements.
  • Takyon Quick Reference Sheet: A simple reference for the Takyon design, API, hello world example, potential extensions, utility functions, and collective functions.
  • Reference Implementation: Give it a test run on Linux, Mac, or Windows. This implementation supports the basic interconnects: socket, memory maps, and memcpy().

New API Proposals: Other Proposals

Other proposals are welcome – please contact us at [email protected] if you would like to discuss getting involved.