Vulkan driver compatibility on Mesa Intel Ivy Bridge

12 November 2024
C/C++, Vulkan, Graphics

On older processors such as Intel's Ivy Bridge series, Mesa's Vulkan driver support is limited and some extensions are incompletely implemented.

Other (related) driver repositories or tools specific to Mesa include:

mesa: Modern Gallium3D drivers for Gen 3+ hardware
mesa-amber: Legacy package for classic drivers for Gen 2 - 11 hardware
xf86-video-intel: Legacy Intel DDX driver for Gen 2 - 9 hardware
vulkan-intel: Package for Vulkan support on Broadwell and newer chips

Phoronix reported that Intel's Mesa drivers are preparing for the new Xe kernel driver, which will allow Intel to modernize their kernel driver codebase.

What is the issue? #

Extensions in Vulkan expose additional functionality that may not be available on all hardware or drivers. For example, VK_KHR_swapchain is required to present rendered images to a window, so practically every application with graphical output depends on it.

On Ivy Bridge, however, the Mesa drivers on Fedora or Debian Linux may offer only incomplete Vulkan support, which shows up as limitations or warnings. That incomplete Ivy Bridge support is what we will be addressing.

The incomplete support may mean the GPU lacks specific capabilities or optimizations, which could prevent the application from running correctly if it relies on unsupported extensions. Usually, or at least in my case, this results in a fatal error in the pipeline setup.

Refer to this gist for driver installation.

Validating extension support #

There are two steps here:

Checking that the driver can support Vulkan at the required version
Verifying that all required extensions are available immediately

We begin by checking for Ivy Bridge-specific support before proceeding to Vulkan initialization. To check whether the program is running on an Ivy Bridge CPU, we query the processor with the CPUID instruction, here through the __get_cpuid helper that GCC and Clang provide in <cpuid.h>.

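#include <cpuid.h>   // __get_cpuid (GCC/Clang)
#include <cstring>   // std::memcpy
#include <string>

bool is_ivy_bridge() {
    unsigned int eax, ebx, ecx, edx;

    // Leaf 0: the 12-byte vendor string comes back in EBX, EDX, ECX (in that order).
    if (!__get_cpuid(0, &eax, &ebx, &ecx, &edx))
        return false;

    // memcpy avoids aliasing a char buffer through an unsigned int pointer.
    char vendor[13];
    std::memcpy(vendor + 0, &ebx, 4);
    std::memcpy(vendor + 4, &edx, 4);
    std::memcpy(vendor + 8, &ecx, 4);
    vendor[12] = '\0';

    if (std::string(vendor) != "GenuineIntel")
        return false;

    // Leaf 1: family, model and extended model are packed into EAX.
    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
        return false;

    unsigned int family    = (eax >> 8)  & 0xF;
    unsigned int model     = (eax >> 4)  & 0xF;
    unsigned int ext_model = (eax >> 16) & 0xF;

    if (family == 6) {
        model += (ext_model << 4);
        // 0x3A (58) and 0x3E (62) are the Ivy Bridge model numbers.
        if (model == 0x3A || model == 0x3E)
            return true;
    }

    return false;
}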

We verify the CPU vendor string and then check the CPU family and model. Ivy Bridge processors belong to Intel's family 6 and have model number 58 (0x3A) or 62 (0x3E). The vendor string GenuineIntel is exactly 12 characters long, so it packs into three 4-byte registers (EBX, EDX, ECX) without padding. That's also why AMD's equivalent is AuthenticAMD.
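To make the bit manipulation in is_ivy_bridge easier to follow, here is a minimal sketch of the same decoding written as pure functions that take register values as arguments instead of executing CPUID. The names vendor_from_registers and display_model are hypothetical helpers and are not part of the original program.

```cpp
#include <cstdint>
#include <initializer_list>
#include <string>

// Rebuild the 12-character vendor string from the three registers that
// CPUID leaf 0 returns. The register order is EBX, EDX, ECX, and each
// register holds four characters with the first one in the lowest byte.
std::string vendor_from_registers(std::uint32_t ebx, std::uint32_t edx, std::uint32_t ecx) {
    std::string vendor;
    for (std::uint32_t reg : {ebx, edx, ecx}) {
        for (int byte = 0; byte < 4; ++byte) {
            vendor.push_back(static_cast<char>((reg >> (8 * byte)) & 0xFF));
        }
    }
    return vendor;
}

// Combine the model and extended-model fields of the leaf 1 EAX value.
// As in is_ivy_bridge, the extended model is only applied for family 6.
unsigned int display_model(std::uint32_t eax) {
    unsigned int family    = (eax >> 8)  & 0xF;
    unsigned int model     = (eax >> 4)  & 0xF;
    unsigned int ext_model = (eax >> 16) & 0xF;
    return family == 6 ? model + (ext_model << 4) : model;
}
```

For instance, the register values 0x756E6547, 0x49656E69 and 0x6C65746E spell out GenuineIntel, and an EAX signature of 0x000306A9 decodes to model 0x3A.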

Creating the Vulkan instance #

Vulkan programming begins by creating an instance, the main interface for accessing Vulkan functions. The setup is verbose. If you've worked with OpenGL before, this step will seem excessive and takes some getting used to.

VkInstance create_instance() {
    VkApplicationInfo app_info{};
    app_info.sType = VK_STRUCTURE_TYPE_APPLICATION_INFO;
    app_info.pApplicationName = "Pipeline";
    app_info.applicationVersion = VK_MAKE_VERSION(1, 0, 0);
    app_info.pEngineName = "No Engine";
    app_info.engineVersion = VK_MAKE_VERSION(1, 0, 0);
    app_info.apiVersion = VK_API_VERSION_1_0;

    VkInstanceCreateInfo create_info{};
    create_info.sType = VK_STRUCTURE_TYPE_INSTANCE_CREATE_INFO;
    create_info.pApplicationInfo = &app_info;

    VkInstance instance;

    if (vkCreateInstance(&create_info, nullptr, &instance) != VK_SUCCESS) {
        std::cerr << "failed to create instance" << std::endl;
        std::exit(1);
    }

    return instance;
}

VkApplicationInfo provides metadata about the application. The instance is the first layer in Vulkan's architecture; higher-level constructs such as physical devices are accessed through it. If instance creation fails, there is no point in proceeding and we can terminate the program.
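create_instance reports only that creation failed, not why. On older hardware the result code is often the most useful hint, so a small helper that names it can save some guessing. The sketch below is an assumption-laden illustration: describe_vk_result is a hypothetical name, and the numeric values are copied from the VkResult enum so that the snippet compiles without the Vulkan headers. In real code you would switch on the VkResult enumerators directly.

```cpp
#include <string>

// Map a few VkResult codes relevant to instance and device creation to
// their names. The integers mirror the VkResult enum values; anything
// not listed falls through to a generic message.
std::string describe_vk_result(int result) {
    switch (result) {
        case 0:  return "VK_SUCCESS";
        case -1: return "VK_ERROR_OUT_OF_HOST_MEMORY";
        case -2: return "VK_ERROR_OUT_OF_DEVICE_MEMORY";
        case -3: return "VK_ERROR_INITIALIZATION_FAILED";
        case -6: return "VK_ERROR_LAYER_NOT_PRESENT";
        case -7: return "VK_ERROR_EXTENSION_NOT_PRESENT";
        case -8: return "VK_ERROR_FEATURE_NOT_PRESENT";
        case -9: return "VK_ERROR_INCOMPATIBLE_DRIVER";
        default: return "VkResult " + std::to_string(result);
    }
}
```

A possible use is to store the return value of vkCreateInstance in a variable and print describe_vk_result(static_cast<int>(result)) alongside the existing error message.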

Verifying device extension support #

For the application to function correctly, it must confirm that essential extensions are available on the GPU. The program checks for VK_KHR_SWAPCHAIN_EXTENSION_NAME, which is necessary for rendering output surfaces.

bool check_device_extension_support(VkPhysicalDevice device) {
    uint32_t extension_count;
    vkEnumerateDeviceExtensionProperties(device, nullptr, &extension_count, nullptr);
    std::vector<VkExtensionProperties> available_extensions(extension_count);
    vkEnumerateDeviceExtensionProperties(device, nullptr, &extension_count, available_extensions.data());

    const std::vector<const char*> required_extensions = {
        VK_KHR_SWAPCHAIN_EXTENSION_NAME
    };

    for (const auto& required : required_extensions) {
        bool is_ext_found = false;

        for (const auto& extension : available_extensions) {
            if (strcmp(extension.extensionName, required) == 0) {
                is_ext_found = true;
                break;
            }
        }

        if (!is_ext_found) {
            return false;
        }
    }

    return true;
}
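check_device_extension_support answers yes or no, which is enough for device selection but not for diagnosing a partially supported driver. A variant that returns the missing names could look like the sketch below. It works on plain strings so that it stands alone. find_missing_extensions is a hypothetical helper, and it assumes the names in available were copied out of the extensionName field of each VkExtensionProperties entry.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Return the names in `required` that do not appear in `available`.
// An empty result means every required extension is present.
std::vector<std::string> find_missing_extensions(
        const std::vector<std::string>& available,
        const std::vector<std::string>& required) {
    std::vector<std::string> missing;
    for (const auto& name : required) {
        if (std::find(available.begin(), available.end(), name) == available.end()) {
            missing.push_back(name);
        }
    }
    return missing;
}
```

Printing the returned list before giving up on a device shows exactly which extension the driver does not expose.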

Choosing a compatible physical device #

The next step is to locate a physical device (GPU) that meets all requirements. As a side note, you will want to link with libvulkan (-lvulkan) and specify the flag last (at least with clang++, I'm not familiar with g++).

VkPhysicalDevice pick_physical_device(VkInstance instance) {
    uint32_t device_count = 0;
    vkEnumeratePhysicalDevices(instance, &device_count, nullptr);

    if (device_count == 0) {
        std::cerr << "failed to find GPUs with vulkan support" << std::endl;
        std::exit(1);
    }

    std::vector<VkPhysicalDevice> devices(device_count);
    vkEnumeratePhysicalDevices(instance, &device_count, devices.data());

    for (const auto& device : devices) {
        if (is_device_suitable(device)) {
            return device;
        }
    }

    std::cerr << "failed to find a suitable GPU" << std::endl;
    std::exit(1);
}

If no devices are found or none meet the criteria, the program terminates. This ensures that only GPUs with compatible drivers and support for the required extensions proceed.

Ensuring suitability with `is_device_suitable` #

is_device_suitable checks that the device is an integrated GPU, which is the case for Intel Ivy Bridge:

bool is_device_suitable(VkPhysicalDevice device) {
    VkPhysicalDeviceProperties device_properties;
    vkGetPhysicalDeviceProperties(device, &device_properties);

    VkPhysicalDeviceFeatures device_features;
    vkGetPhysicalDeviceFeatures(device, &device_features);

    return device_properties.deviceType == VK_PHYSICAL_DEVICE_TYPE_INTEGRATED_GPU && check_device_extension_support(device);
}

Notice that it also verifies that all required extensions are available by calling check_device_extension_support, so any device selected has sufficient functionality for the application's requirements.
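The first of the two steps listed earlier, checking the supported Vulkan version, is not part of is_device_suitable as written. One way to add it is to compare the apiVersion field of VkPhysicalDeviceProperties against a minimum. The sketch below reimplements the version packing as plain functions so it compiles on its own. make_version, version_major, version_minor and supports_version are hypothetical names that mirror the bit layout of VK_MAKE_VERSION, and the sketch assumes the variant bits of the version are zero.

```cpp
#include <cstdint>

// Same layout as VK_MAKE_VERSION: major above bit 22, minor in the
// next 10 bits, patch in the low 12 bits.
constexpr std::uint32_t make_version(std::uint32_t major, std::uint32_t minor, std::uint32_t patch) {
    return (major << 22) | (minor << 12) | patch;
}

constexpr std::uint32_t version_major(std::uint32_t v) { return (v >> 22) & 0x7F; }
constexpr std::uint32_t version_minor(std::uint32_t v) { return (v >> 12) & 0x3FF; }

// True when the reported version is at least major.minor.
// The patch number is ignored on purpose.
constexpr bool supports_version(std::uint32_t reported, std::uint32_t major, std::uint32_t minor) {
    return version_major(reported) > major ||
           (version_major(reported) == major && version_minor(reported) >= minor);
}
```

Inside is_device_suitable this would be called as supports_version(device_properties.apiVersion, 1, 0), matching the VK_API_VERSION_1_0 requested at instance creation.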

We first check if the CPU is an Ivy Bridge model. We then initialize a Vulkan instance, select a compatible physical device, and verify that all necessary extensions are available. If all steps succeed, we can work with the instance for other graphics-related stuff and then destroy the Vulkan instance before exiting. Pretty straightforward, right?

In Vulkan, creating a graphics pipeline is a painful process, albeit structured. Each component is explicitly specified to configure rendering behavior. The pipeline defines the stages of rendering, from vertex input to fragment shading and rasterization.

VkPipelineShaderStageCreateInfo         shader_stages[2]{};
VkPipelineVertexInputStateCreateInfo    vertex_input_info{};
VkPipelineInputAssemblyStateCreateInfo  input_assembly{};
VkPipelineRasterizationStateCreateInfo  rasterizer{};
VkPipelineMultisampleStateCreateInfo    multisampling{};
VkPipelineColorBlendAttachmentState     color_blend_attachment{};
VkPipelineColorBlendStateCreateInfo     color_blending{};
VkPipelineLayout                        pipeline_layout;
VkPipelineLayoutCreateInfo              pipeline_layout_info{};
pipeline_layout_info.sType = VK_STRUCTURE_TYPE_PIPELINE_LAYOUT_CREATE_INFO;
vkCreatePipelineLayout(device, &pipeline_layout_info, nullptr, &pipeline_layout);

What did we do? We initialized structures for each stage of the pipeline, so that each part of the rendering process is explicitly set. The majority of the process is selecting the required metadata and initializing it.

VkPipelineShaderStageCreateInfo holds shader stages, while VkPipelineVertexInputStateCreateInfo specifies how vertex data is fetched and processed. The layout configuration is initialized through VkPipelineLayoutCreateInfo.

The input_assembly configuration defines how primitives are constructed from vertex data. rasterizer controls rasterization, setting options for polygon mode, culling, and front-face orientation. multisampling is configured minimally (in our case) to avoid performance costs while supporting basic anti-aliasing. color_blend_attachment and color_blending set up basic color blending, required for rendering transparency and other color operations. As mentioned, the layout itself is described by pipeline_layout_info, and vkCreatePipelineLayout creates the layout that the pipeline will use. Note that the snippet above only value-initializes most of these structures; each one still needs its sType and fields filled in before the pipeline is created.

VkGraphicsPipelineCreateInfo pipeline_info{};
pipeline_info.sType = VK_STRUCTURE_TYPE_GRAPHICS_PIPELINE_CREATE_INFO;
pipeline_info.stageCount = 2;
pipeline_info.pStages = shader_stages;
pipeline_info.pVertexInputState = &vertex_input_info;
pipeline_info.pInputAssemblyState = &input_assembly;
pipeline_info.pRasterizationState = &rasterizer;
pipeline_info.pMultisampleState = &multisampling;
pipeline_info.pColorBlendState = &color_blending;
pipeline_info.layout = pipeline_layout;
VkPipeline graphics_pipeline;
vkCreateGraphicsPipelines(device, VK_NULL_HANDLE, 1, &pipeline_info, nullptr, &graphics_pipeline);

VkGraphicsPipelineCreateInfo collects all the configured states into a single graphics pipeline. Each field in pipeline_info points to a previously defined structure, passing configuration details down the pipeline. For Ivy Bridge, this explicitness lets us keep the setup within known driver constraints. stageCount and pStages define the number and type of shader stages. Each pointer field (e.g. pVertexInputState, pRasterizationState) links to the Vulkan structure that describes the corresponding stage of the pipeline.

Mapping features onto driver support #

This is where the logical device comes in; it is created after a suitable physical device has been found. We will need to configure a logical device with the specific extensions required for rendering and compatibility with the drivers. For this, we define the device queue, select the required extensions, and create a logical device that supports key functionality for integrated GPUs.

float queue_priority = 1.0f;  // priorities range from 0.0 to 1.0

VkDeviceQueueCreateInfo queue_create_info{};
queue_create_info.sType = VK_STRUCTURE_TYPE_DEVICE_QUEUE_CREATE_INFO;
queue_create_info.queueFamilyIndex = 0;
queue_create_info.queueCount = 1;
queue_create_info.pQueuePriorities = &queue_priority;

VkDeviceQueueCreateInfo configures a single device queue, the interface for submitting graphics commands. queueFamilyIndex is hard-coded to 0 on the assumption that the first queue family supports general (graphics) operations. The queue count is set to one, and queue_priority supplies its priority.
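Setting queueFamilyIndex to 0 works when the first family happens to support graphics, but Vulkan does not guarantee that ordering. A more careful approach is to scan the queueFlags reported by vkGetPhysicalDeviceQueueFamilyProperties for the graphics bit. The sketch below shows only the scan and takes the flags as plain integers so it is self-contained. find_graphics_family and kGraphicsBit are hypothetical names, and the constant mirrors the numeric value of VK_QUEUE_GRAPHICS_BIT.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Numeric value of VK_QUEUE_GRAPHICS_BIT in the Vulkan headers.
constexpr std::uint32_t kGraphicsBit = 0x00000001;

// Return the index of the first queue family whose flags include the
// graphics bit, or -1 when no family supports graphics.
int find_graphics_family(const std::vector<std::uint32_t>& family_flags) {
    for (std::size_t i = 0; i < family_flags.size(); ++i) {
        if (family_flags[i] & kGraphicsBit) {
            return static_cast<int>(i);
        }
    }
    return -1;
}
```

The returned index would replace the literal 0 assigned to queue_create_info.queueFamilyIndex, with -1 treated as a reason to reject the device.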

VkPhysicalDeviceFeatures device_features{};
VkDeviceCreateInfo create_info{};
create_info.sType = VK_STRUCTURE_TYPE_DEVICE_CREATE_INFO;
create_info.pQueueCreateInfos = &queue_create_info;
create_info.queueCreateInfoCount = 1;
create_info.pEnabledFeatures = &device_features;

The device features and queue configuration are then added to VkDeviceCreateInfo, the main structure for creating the logical device. The device_features structure is left as default, as Ivy Bridge hardware may not fully support advanced Vulkan features. Ideally, this configuration keeps the logical device lightweight and compatible with any driver’s restricted feature set on older integrated GPU versions.

const std::vector<const char*> device_extensions = {
    VK_KHR_SWAPCHAIN_EXTENSION_NAME,
    "VK_EXT_shader_stencil_export",
    "VK_KHR_maintenance1",
    "VK_EXT_shader_viewport_index_layer"
};

create_info.enabledExtensionCount = static_cast<uint32_t>(device_extensions.size());
create_info.ppEnabledExtensionNames = device_extensions.data();

VK_KHR_SWAPCHAIN_EXTENSION_NAME is required to create swapchains, which manage the rendering surface and enable double buffering, a core feature for any windowed graphics application.

VK_EXT_shader_stencil_export is required for shading operations involving stencil export, which is helpful in applications that rely on stencil-based effects or depth-related optimizations.

VK_KHR_maintenance1 introduces adjustments that reduce edge-case issues in compatibility and provide additional control over negative viewport heights, a required feature for certain rendering setups.

VK_EXT_shader_viewport_index_layer is used for fine control over rendering layers and viewports which is valuable in viewport manipulation, e.g. split-screen or multi-view applications.

If you're lucky and not running on an Intel® Core™ i5 processor like me with third generation graphics, you'll be able to pass the logical device creation step with no issues.
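For those who are not so lucky, one possible workaround is to enable only the extensions the driver actually reports, keeping the swapchain mandatory and treating the rest as optional. This is a sketch under the assumption that the application can fall back gracefully when an optional extension is absent, which the original program does not do. ExtensionPlan and plan_extensions are hypothetical names, and the inputs are plain strings so the example compiles without the Vulkan headers.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Outcome of planning: `ok` is false when a required extension is absent.
struct ExtensionPlan {
    bool ok;
    std::vector<std::string> enabled;  // names to request at device creation
    std::vector<std::string> skipped;  // optional names the driver lacks
};

ExtensionPlan plan_extensions(const std::vector<std::string>& available,
                              const std::vector<std::string>& required,
                              const std::vector<std::string>& optional) {
    auto has = [&available](const std::string& name) {
        return std::find(available.begin(), available.end(), name) != available.end();
    };

    ExtensionPlan plan{true, {}, {}};
    for (const auto& name : required) {
        if (!has(name)) {
            // A missing required extension makes the device unusable.
            plan.ok = false;
            plan.enabled.clear();
            return plan;
        }
        plan.enabled.push_back(name);
    }
    for (const auto& name : optional) {
        // Optional extensions are enabled when present, otherwise recorded.
        (has(name) ? plan.enabled : plan.skipped).push_back(name);
    }
    return plan;
}
```

The enabled list would then be converted to const char* pointers and passed through ppEnabledExtensionNames, while the skipped list tells the renderer which code paths to avoid.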
