D/AVE NX is the latest and most powerful addition to the D/AVE family of rendering cores. It is the first IP to bring full OpenGL ES 2.0/3.1 and Vulkan rendering to the FPGA and SoC world. Targeting graphical user interfaces on displays up to 4K x 4K resolution in the industrial, medical, military, avionics, automotive and consumer markets, D/AVE NX is designed to hit the sweet spot of performance and footprint.
By enabling state-of-the-art shader-based APIs even on small devices, the D/AVE NX core can serve both high-quality 2D and full 3D applications. Working with a modern graphics API such as OpenGL ES 2.0 lets developers rapidly build high-end user interfaces (e.g. based on Qt, Android or WebGL) and makes new, future-proof implementations possible.
D/AVE NX scales easily to fit the resource/performance sweet spot of a particular application. Entire device families can therefore be equipped with differently scaled variants of the core, all of them fully software compatible. A single unified software stack, together with the guarantee of exactly the same visual result (at different speeds), saves significant development resources. D/AVE NX is highly efficient because its internal multi-level scheduler can utilize every hardware element even better than the fixed-function pipeline of the successful earlier D/AVE cores could. Scheduling also does not have to be pre-computed in the compiler, which simplifies the compiler and driver architecture considerably.
Qt Yocto Linux System Solution for Intel PSG SoCs driven by D/AVE NX
D/AVE NX will be provided as an evaluation kit for Intel PSG SoCs, including a complete Qt system solution on Yocto Linux as well as Qt and OpenGL ES 2.0 example applications.
A Qt + OpenGL Demo “Cube” for the Cyclone V SoC based "DE10-Nano SoC Board" is available for download via the following link:
D/AVE NX Feature & Technology Overview
- Unified Shader Architecture
- Fully IEEE compatible floating point ALUs (incl. rounding, denormals etc.)
- True integer arithmetic (8bit, 16bit, 32bit)
- Massively parallel execution with fine-grained multithreading
- Scalability throughout the entire design
- Scaling from a tiny footprint up to high-end performance with exactly the same driver/software stack for all versions: same output at different speeds
- Bandwidth reduction, e.g. by on-the-fly lossless data compression/decompression
- System security features
- Stop on bus error for integration with memory protection units
- Hardware out-of-framebuffer memory access protection
- Full support of all OpenGL ES 2.0/3.1 and Vulkan rendering features
- High render quality
- Highly accurate sub-pixel positioning, interpolation and filtering
- Multiple anti-aliasing techniques (including MSAA)
- Effective texture and frame-buffer compression
- Hardware-supported blending (normal alpha, linear colorspace, Porter-Duff, …)
- Various texture and framebuffer formats
- High resolutions: frame buffers and textures up to 4K x 4K pixels
- Support for Image Transformation & Warping
- Composition Engine
- Memory blocks controlled by Chip Select port
- Prepared for efficient automatic clock gating
- Global clock gating as option
- Low resource consumption (starting at 31K LE)
- Single clock domain architecture
- Bus interface clock frequency may differ from core frequency
- High latency capable
- Optional internal arbitration to work with a single bus master
- Adaptors for common bus protocols
- ARM AMBA: APB for register access, AHB or AXI (preferred) for memory bus master access
- Intel PSG Avalon: bus adaptors for both register and bus master access
- Other bus protocols can be easily adapted
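The blending entry in the feature list above names Porter-Duff and linear-colorspace blending. As a rough software illustration of what those terms mean (not a description of how the D/AVE NX hardware implements them), the following Python sketch applies the Porter-Duff "source over" operator to premultiplied RGBA values and performs the blend in linear light using the standard sRGB transfer functions. All function names are hypothetical and chosen only for this example.

```python
def srgb_to_linear(c: float) -> float:
    """Decode one sRGB-encoded channel in [0, 1] to linear light."""
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4


def linear_to_srgb(l: float) -> float:
    """Encode one linear-light channel in [0, 1] back to sRGB."""
    return l * 12.92 if l <= 0.0031308 else 1.055 * l ** (1 / 2.4) - 0.055


def source_over(src, dst):
    """Porter-Duff 'source over' on premultiplied (r, g, b, a) tuples."""
    k = 1.0 - src[3]  # fraction of the destination that stays visible
    return tuple(s + d * k for s, d in zip(src, dst))


def blend_linear(src_rgb, src_alpha, dst_rgb):
    """Blend a straight-alpha sRGB colour over an opaque sRGB destination.

    The colours are decoded to linear light, blended, and re-encoded.
    """
    src = tuple(srgb_to_linear(c) * src_alpha for c in src_rgb) + (src_alpha,)
    dst = tuple(srgb_to_linear(c) for c in dst_rgb) + (1.0,)
    out = source_over(src, dst)  # resulting alpha is 1.0 for an opaque destination
    return tuple(linear_to_srgb(c) for c in out[:3])


# 50 % white over black: blending the encoded values directly would give 0.5,
# blending in linear light gives a visibly brighter grey.
print(blend_linear((1.0, 1.0, 1.0), 0.5, (0.0, 0.0, 0.0)))
```

The difference between the two results is the reason a linear-colorspace blend mode matters for anti-aliased edges and translucent UI layers.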
Resource Usage and Performance
The actual resource usage of D/AVE NX depends mainly on the number of Shader Units (SUs), the number of Arithmetic Logic Units (ALUs) per shader unit and partly on the bus and cache configuration. The following numbers give an indication of the resource usage and resulting performance for a typical configuration.
FPGA Resource Usage and Performance (preliminary estimates)
| Configuration (*1) | 1 SU with 4 ALUs | 1 SU with 8 ALUs | 2 SUs with 8 ALUs | 2 SUs with 16 ALUs |
|---|---|---|---|---|
| Adaptive Logic Modules (ALMs) | 14k ALMs | 20k ALMs | 34k ALMs | 50k ALMs |
| Logic Elements (LEs) | 31k LEs | 44k LEs | 75k LEs | 110k LEs |
| Pixels per cycle (*2) | 0.5 | 1 | 2 | 4 |
| Performance | 0.8 GFLOPS | 1.6 GFLOPS | 3.2 GFLOPS | 6.4 GFLOPS |

(*2) Best case
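To make the table easier to compare across configurations, the short Python sketch below derives a few per-unit figures from it. It assumes that the ALU count in each column is per shader unit, which matches the wording of the preceding paragraph and the doubling of the GFLOPS figures. The clock frequency is a purely hypothetical input, since the table does not state one, and the resulting fill rate is a best-case value only.

```python
# Values copied from the table above:
# (shader units, ALUs per SU, kALMs, kLEs, best-case pixels/cycle, GFLOPS)
CONFIGS = [
    (1, 4, 14, 31, 0.5, 0.8),
    (1, 8, 20, 44, 1.0, 1.6),
    (2, 8, 34, 75, 2.0, 3.2),
    (2, 16, 50, 110, 4.0, 6.4),
]


def derived(cfg, clock_mhz):
    """Per-unit figures for one configuration.

    clock_mhz is an assumed core clock, not a number from the table.
    """
    sus, alus_per_su, k_alms, k_les, pixels_per_cycle, gflops = cfg
    total_alus = sus * alus_per_su
    return {
        "total_alus": total_alus,
        "gflops_per_alu": gflops / total_alus,
        "les_per_alm": k_les / k_alms,
        # pixels/cycle * 1e6 cycles/s per MHz = megapixels per second
        "best_case_mpixels_per_s": pixels_per_cycle * clock_mhz,
    }


for cfg in CONFIGS:
    print(cfg[:2], derived(cfg, clock_mhz=100.0))  # 100 MHz is illustrative only
```

Under this reading, the quoted performance works out to about 0.2 GFLOPS per ALU in every column, and the LE figures are roughly 2.2 times the ALM figures.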
TES provides Khronos-conformant OpenGL ES 2.0/3.1 and EGL drivers. Both drivers rely on a low-level D/AVE NX driver layer that abstracts hardware details such as register access and makes porting to different CPUs and operating systems much easier.
All drivers have the following features:
- Fully reentrant & thread-safe
- Minimal OS dependency (HAL part separated)
- No inline assembler required
- Support for multiple D/AVE NX instances
- Multi-threading support, i.e. multiple applications can use D/AVE NX concurrently
- Small memory footprint
In addition to the standard Khronos APIs, drivers for the legacy TES APIs of D/AVE HD and D/AVE 2D can be provided on request.
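The driver properties listed above (a separated HAL, support for multiple core instances, thread safety) describe a common layering pattern. The Python sketch below illustrates that pattern in general terms. It is not the TES driver API: the class names, the register offset and the in-memory register file are all hypothetical and stand in for platform-specific memory-mapped I/O.

```python
import threading
from abc import ABC, abstractmethod


class RegisterHal(ABC):
    """Platform-specific part: how 32-bit registers are actually reached."""

    @abstractmethod
    def read32(self, offset: int) -> int: ...

    @abstractmethod
    def write32(self, offset: int, value: int) -> None: ...


class FakeHal(RegisterHal):
    """In-memory stand-in for memory-mapped I/O (illustration only)."""

    def __init__(self):
        self._regs = {}

    def read32(self, offset):
        return self._regs.get(offset, 0)

    def write32(self, offset, value):
        self._regs[offset] = value & 0xFFFFFFFF


class CoreInstance:
    """Portable part: one object per core instance, no global state."""

    REG_CONTROL = 0x00  # hypothetical register offset

    def __init__(self, hal: RegisterHal):
        self._hal = hal
        self._lock = threading.Lock()

    def set_control_bits(self, mask: int) -> int:
        # Read-modify-write under a lock so concurrent callers lose no updates.
        with self._lock:
            value = self._hal.read32(self.REG_CONTROL)
            self._hal.write32(self.REG_CONTROL, value | mask)
            return self._hal.read32(self.REG_CONTROL)


# Two independent instances, each with its own HAL object.
core_a = CoreInstance(FakeHal())
core_b = CoreInstance(FakeHal())
core_a.set_control_bits(0x1)
core_b.set_control_bits(0x2)
print(hex(core_a.set_control_bits(0x4)))  # prints 0x5; core_b does not affect core_a
```

Keeping all hardware access behind one small interface means that a port to a new CPU or operating system only has to supply a new implementation of that interface, while the portable layer stays unchanged.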