Qt Quick Direct3D 12 Adaptation

The Direct3D 12 adaptation for Windows 10, covering both Win32 (the windows platform plugin) and UWP (the winrt platform plugin), is shipped as a dynamically loaded plugin. It is not functional on earlier Windows versions. The plugin is built automatically whenever the necessary D3D and DXGI development files are present. In practice this currently means Visual Studio 2015 and newer.

The adaptation is available both in normal, OpenGL-enabled Qt builds and when Qt was configured with -no-opengl. However, it is never the default, meaning the user or the application has to request it explicitly by setting the QT_QUICK_BACKEND environment variable to d3d12 or by calling QQuickWindow::setSceneGraphBackend().
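
As a minimal sketch, the environment variable can be set from within the application itself, early in main() and before the first QQuickWindow is created. The helper names setEnvVar and requestD3D12Backend are illustrative and not part of Qt. The sketch uses only the C/C++ standard library so that it stays self-contained; in a Qt application, qputenv() or QQuickWindow::setSceneGraphBackend() would normally serve the same purpose.

```cpp
#include <stdlib.h>
#include <cstdlib>
#include <string>

// Portable wrapper for setting a process environment variable.
// Returns true on success.
bool setEnvVar(const std::string &name, const std::string &value)
{
#ifdef _WIN32
    return _putenv_s(name.c_str(), value.c_str()) == 0;
#else
    return ::setenv(name.c_str(), value.c_str(), 1) == 0;
#endif
}

// Hypothetical helper: requests the D3D12 adaptation for this process.
// Intended to run before any Qt Quick window exists.
bool requestD3D12Backend()
{
    return setEnvVar("QT_QUICK_BACKEND", "d3d12");
}
```

Calling requestD3D12Backend() as the first statement of main() keeps the choice inside the application, so users do not have to configure their environment.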

Motivation

This experimental adaptation is the first Qt Quick backend focusing on a modern, lower-level graphics API in combination with a windowing system interface different from the traditional approaches used in combination with OpenGL.

It also allows better integration with Windows, Direct3D being the primary vendor-supported solution. This means that there are fewer problems anticipated with drivers, operations like window resizes, and special events like graphics device loss caused by device resets or graphics driver updates.

Performance-wise the general expectation is a somewhat lower CPU usage compared to OpenGL due to lower driver overhead, and a higher GPU utilization with less wasted idle time. The backend does not heavily utilize threads yet, which means there are opportunities for further improvements in the future, for example to further optimize image loading.

The D3D12 backend also introduces support for pre-compiled shaders. All the backend's own shaders (used by the built-in materials on which the Rectangle, Image, Text, etc. QML types are built) are compiled to D3D shader bytecode when compiling Qt. Applications using ShaderEffect items can choose to ship bytecode either in regular files or via the Qt resource system, or use HLSL source strings. Unlike with OpenGL, the compilation of HLSL source strings is properly threaded, meaning shader compilation will not block the application and its user interface.

Graphics Adapters

The plugin does not necessarily require hardware acceleration. Using WARP, the Direct3D software rasterizer, is also an option. By default the first adapter providing hardware acceleration is chosen. To override this, in order to use another graphics adapter or to force the usage of the software rasterizer, set the environment variable QT_D3D_ADAPTER_INDEX to the index of the adapter. The discovered adapters are printed at startup when QSG_INFO or the logging category qt.scenegraph.general is enabled.
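
The selection rule described above can be modelled roughly as follows. This is a simplified sketch, not the backend's actual code: AdapterInfo and chooseAdapter are hypothetical names, and the handling of an out-of-range index (falling back to the default search) and of a system without hardware adapters (returning -1) are assumptions, since the text does not specify either case.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Simplified stand-in for an enumerated graphics adapter (hypothetical type).
struct AdapterInfo {
    std::string name;
    bool isSoftware; // true for the WARP software rasterizer
};

// Returns the index of the adapter to use, or -1 if none qualifies.
// overrideIndex mirrors QT_D3D_ADAPTER_INDEX; a negative value means "not set".
int chooseAdapter(const std::vector<AdapterInfo> &adapters, int overrideIndex)
{
    // An explicit, valid index always wins, even if it points to WARP.
    if (overrideIndex >= 0 && overrideIndex < static_cast<int>(adapters.size()))
        return overrideIndex;

    // Default: the first adapter that provides hardware acceleration.
    for (std::size_t i = 0; i < adapters.size(); ++i) {
        if (!adapters[i].isSoftware)
            return static_cast<int>(i);
    }
    return -1;
}
```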

Troubleshooting

When encountering issues, always set the QSG_INFO and QT_D3D_DEBUG environment variables to 1 in order to get debug and warning messages printed on the debug output. The latter enables the Direct3D debug layer. Note that the debug layer should not be enabled in production use since it can significantly impact performance (CPU load) due to increased API overhead.

Render Loops

By default the D3D12 adaptation uses a single-threaded render loop similar to OpenGL's windows render loop. There is also a threaded variant available, which can be requested by setting the QSG_RENDER_LOOP environment variable to threaded. However, due to conceptual limitations in DXGI, the windowing system interface, the threaded loop is prone to deadlocks when multiple QQuickWindow or QQuickView instances are shown. Therefore the default is the single-threaded loop for the time being. This means that with the D3D12 backend applications are expected to move their work from the main (GUI) thread out to worker threads, instead of expecting Qt to keep the GUI thread responsive and suitable for heavy, blocking operations.

See the Scene Graph page for more information on render loops and the MSDN page for DXGI regarding the issues with multithreading.

Renderer

The scenegraph renderer in the D3D12 adaptation does not currently perform any batching. This is less of an issue than with OpenGL because state changes do not present a problem in the first place. The simpler renderer logic can also lead to lower CPU overhead in some cases. The trade-offs between the various approaches are currently under research.

Shader Effects

The ShaderEffect QML type is fully functional with the D3D12 adaptation as well. However, the interpretation of the fragmentShader and vertexShader properties is different than with OpenGL.

With D3D12, these strings can either be a URL for a local file or a file in the resource system, or an HLSL source string. The former indicates that the file in question contains pre-compiled D3D shader bytecode generated by the fxc tool, or, alternatively, HLSL source code. The type of the file is detected automatically. This means that the D3D12 backend supports all options from GraphicsInfo.shaderCompilationType and GraphicsInfo.shaderSourceType.

Unlike with OpenGL, a QFileSelector with the extra selector hlsl is used whenever a file is opened. This allows easy creation of ShaderEffect items that are functional across both backends, for example by placing the GLSL source code into shaders/effect.frag, the HLSL source code or - preferably - pre-compiled bytecode into shaders/+hlsl/effect.frag, while simply writing fragmentShader: "qrc:shaders/effect.frag" in QML.
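
The lookup performed for the hlsl selector can be pictured with the following simplified model. It is a sketch of the idea only and does not reproduce QFileSelector: resolveWithHlslSelector is a hypothetical function, and the set of strings stands in for the file system or the resource system.

```cpp
#include <set>
#include <string>

// Simplified model of selector-based lookup: if a "+hlsl" variant of the
// requested file exists in the same directory, prefer it; otherwise keep
// the original path. existingFiles stands in for the real file system.
std::string resolveWithHlslSelector(const std::string &path,
                                    const std::set<std::string> &existingFiles)
{
    const std::string::size_type slash = path.find_last_of('/');
    const bool hasDir = slash != std::string::npos;
    const std::string dir = hasDir ? path.substr(0, slash + 1) : std::string();
    const std::string file = hasDir ? path.substr(slash + 1) : path;

    const std::string candidate = dir + "+hlsl/" + file;
    return existingFiles.count(candidate) > 0 ? candidate : path;
}
```

With both shaders/effect.frag and shaders/+hlsl/effect.frag present, a request for shaders/effect.frag resolves to the +hlsl variant under this model, which is why the QML code can name a single path for both backends.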

See the ShaderEffect documentation for more details.

Multisample Render Targets

The Direct3D 12 adaptation ignores the QSurfaceFormat set on the QQuickWindow or QQuickView (or set via QSurfaceFormat::setDefaultFormat()), with two exceptions: QSurfaceFormat::samples() and QSurfaceFormat::alphaBufferSize() are still taken into account. When the samples value is greater than 1, multisample offscreen render targets will be created with the specified sample count and the maximum supported quality level. The backend automatically performs resolving into the non-multisample swapchain buffers after each frame.

Semi-transparent Windows

When the alpha channel is enabled either via QQuickWindow::setDefaultAlphaBuffer() or by setting alphaBufferSize to a non-zero value in the window's QSurfaceFormat or in the global format managed by QSurfaceFormat::setDefaultFormat(), the D3D12 backend will create a swapchain for composition and go through DirectComposition, since the flip model swapchain (which is mandatory) would not support transparency otherwise.

It is therefore important not to request an alpha channel unnecessarily. When the alphaBufferSize is 0 or the default -1, all these extra steps can be avoided and the traditional window-based swapchain is sufficient.

This is not relevant on WinRT because there the backend always uses a composition swapchain which is associated with the ISwapChainPanel that backs QWindow on that platform.

Mipmaps

Mipmap generation is supported and handled transparently to the applications via a built-in compute shader, but is experimental and only supports power-of-two images at the moment. Textures of other sizes will work too, but this involves a QImage-based scaling on the CPU first. Therefore avoid enabling mipmapping for non-power-of-two (NPOT) images whenever possible.
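
An application that wants to avoid the CPU-side scaling could check image dimensions before enabling mipmapping. The following is a minimal sketch under the assumption that "power-of-two image" means both width and height are powers of two; canMipmapWithoutCpuScaling is a hypothetical application-side helper, not a Qt function.

```cpp
#include <cstdint>

// True if v is a non-zero power of two (1, 2, 4, 8, ...).
bool isPowerOfTwo(std::uint32_t v)
{
    return v != 0 && (v & (v - 1)) == 0;
}

// Hypothetical application-side check: assumes that both dimensions must be
// powers of two for the image to take the compute-shader path directly.
bool canMipmapWithoutCpuScaling(std::uint32_t width, std::uint32_t height)
{
    return isPowerOfTwo(width) && isPowerOfTwo(height);
}
```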

Image Formats

When creating textures via the C++ scenegraph APIs like QQuickWindow::createTextureFromImage(), 32-bit formats will not involve any conversion; they map directly to the corresponding R8G8B8A8_UNORM or B8G8R8A8_UNORM format. Everything else will trigger a QImage-based format conversion on the CPU first.

Unsupported Features

Particles and some other OpenGL-dependent utilities, like QQuickFramebufferObject, are not currently supported.

As with the Software adaptation, text is always rendered using the native method. Distance field-based text rendering is not currently implemented.

The shader sources in the Qt Graphical Effects module have not been ported to any format other than the OpenGL 2.0 compatible one, meaning the QML types provided by that module are not currently functional with the D3D12 backend.

Texture atlases are not currently in use.

The renderer may lack support for certain minor features, for example drawing points and lines with a width other than 1.

Custom Qt Quick items using custom scenegraph nodes can be problematic. Materials are inherently tied to the graphics API. Therefore only items using the utility rectangle and image nodes are functional across all adaptations.

QQuickWidget and its underlying OpenGL-based compositing architecture are not supported. If mixing with QWidget-based user interfaces is desired, use QWidget::createWindowContainer() to embed the native window of the QQuickWindow or QQuickView.

Finally, rendering via QSGEngine and QSGAbstractRenderer is not feasible with the D3D12 adaptation at the moment.

To integrate custom Direct3D 12 rendering, use QSGRenderNode in combination with QSGRendererInterface . This approach does not rely on OpenGL contexts or API specifics like framebuffers, and allows exposing the graphics device and command buffer from the adaptation. It is not necessarily suitable for easy integration of all types of content, in particular true 3D, so it will likely get complemented by an alternative to QQuickFramebufferObject in future releases.

To perform runtime decisions based on the adaptation in use, use QSGRendererInterface from C++ and GraphicsInfo from QML. They can also be used to check the level of shader support (shading language, compilation approach).

When creating custom items, use the new QSGRectangleNode and QSGImageNode classes. These replace the now deprecated QSGSimpleRectNode and QSGSimpleTextureNode. Unlike their predecessors, the new classes are interfaces, and implementations are created via the factory functions QQuickWindow::createRectangleNode() and QQuickWindow::createImageNode().

Advanced Configuration

The D3D12 adaptation can keep multiple frames in flight, similarly to modern game engines. This is somewhat different from the traditional render - swap - wait for vsync model and allows better GPU utilization at the expense of higher resource usage. This means that the renderer will be a number of frames ahead of what is displayed on the screen.

For a discussion of flip model swap chains and the typical configuration parameters, refer to this article.

Vertical synchronization is always enabled, meaning Present() is invoked with an interval of 1.

The configuration can be changed by setting the following environment variables:

  • QT_D3D_BUFFER_COUNT - The number of swap chain buffers in range 2 - 4. The default value is 3.
  • QT_D3D_FRAME_COUNT - The number of frames prepared without blocking in range 1 - 4. Note that Present will start blocking after queuing 3 frames (regardless of QT_D3D_BUFFER_COUNT), unless the waitable object is in use. Every additional frame increases GPU resource usage, since geometry and constant buffer data will have to be duplicated, and involves more bookkeeping on the CPU side. The default value is 2.
  • QT_D3D_WAITABLE_SWAP_CHAIN_MAX_LATENCY - When set to a value between 1 and 16, the frame latency is set to the specified value. This changes the limit for Present() and will trigger a wait for an available swap chain buffer when beginning each frame. Refer to the article above for a detailed discussion. This is considered experimental for now and the default value is 0 (disabled).
  • QT_D3D_BLOCKING_PRESENT - When set to a non-zero value, there will be a CPU-side wait for the GPU to finish its work after each call to Present. This effectively eliminates all parallelism but makes the behavior resemble the traditional swap-blocks-for-vsync model, and can therefore be useful in some special cases. This is not the same as setting the frame count to 1, because that still avoids blocking after Present and may block only when starting to prepare the next frame (or may not block at all, depending on the time gap between the frames). By default blocking present is disabled.
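
The documented ranges and defaults can be summarized in code. The sketch below shows how a launcher or test harness might validate these variables before starting the application; it is not the backend's own parsing logic. D3D12Settings, parseIntInRange and readD3D12Settings are hypothetical names, and treating malformed or out-of-range values as "unset" is an assumption, since the text above does not say how the backend reacts to such values.

```cpp
#include <climits>
#include <cstdlib>

// Parses raw as a decimal integer. Returns fallback if raw is null, empty,
// not a number, or outside the inclusive range [minValue, maxValue].
int parseIntInRange(const char *raw, int minValue, int maxValue, int fallback)
{
    if (!raw || *raw == '\0')
        return fallback;
    char *end = nullptr;
    const long v = std::strtol(raw, &end, 10);
    if (*end != '\0' || v < minValue || v > maxValue)
        return fallback;
    return static_cast<int>(v);
}

// Hypothetical container for the four settings listed above.
struct D3D12Settings {
    int bufferCount;        // 2 - 4, default 3
    int frameCount;         // 1 - 4, default 2
    int waitableMaxLatency; // 1 - 16, default 0 (disabled)
    bool blockingPresent;   // any non-zero integer enables it
};

// Reads the settings from the environment, applying the documented defaults.
D3D12Settings readD3D12Settings()
{
    D3D12Settings s;
    s.bufferCount = parseIntInRange(std::getenv("QT_D3D_BUFFER_COUNT"), 2, 4, 3);
    s.frameCount = parseIntInRange(std::getenv("QT_D3D_FRAME_COUNT"), 1, 4, 2);
    s.waitableMaxLatency = parseIntInRange(
        std::getenv("QT_D3D_WAITABLE_SWAP_CHAIN_MAX_LATENCY"), 1, 16, 0);
    s.blockingPresent = parseIntInRange(
        std::getenv("QT_D3D_BLOCKING_PRESENT"), INT_MIN, INT_MAX, 0) != 0;
    return s;
}
```

Keeping the parsing in a function that takes the raw string makes the range rules easy to check without touching the process environment.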