The installation of the AMD OpenCL driver was discussed in a previous post. Here, let’s explore in more detail how the AMD OpenCL HelloWorld sample project is built and run.
Include Header File
HelloWorld.cpp includes the OpenCL header file CL/cl.h. The project's Additional Include Directories setting shows that it comes from $(AMDAPPSDKROOT)/include:
As part of the AMD APP SDK installation, the environment variable AMDAPPSDKROOT is set to the directory where the SDK is installed, which is C:\Program Files (x86)\AMD APP SDK\3.0 on my PC:
And it’s not surprising that we can find the include subdirectory there, and CL/cl.h in it:
This is the header file that declares the OpenCL API functions, such as clGetPlatformIDs and others.
Link Library File
When the executable is built, the linker must eventually resolve the functions and symbols declared in cl.h.
The relevant additional library directory is $(AMDAPPSDKROOT)/lib:
And the library file to be linked is OpenCL.lib:
In fact, AMD APP SDK provides library files for both 32-bit (x86) and 64-bit (x86-64) Windows:
The library file OpenCL.lib is however very small, only 28KB:
This probably means that OpenCL.lib is an import library rather than a static library. The LIB tool confirms this:
No .obj files are listed and only OpenCL.dll is referred to in OpenCL.lib.
Dynamic Link Library File
It is clear that HelloWorld.exe will need the dynamic link library OpenCL.dll to run. Go to the directory where HelloWorld.exe is built:
We do not find OpenCL.dll here. Opening the 32-bit HelloWorld.exe in the 32-bit Dependency Walker reveals that HelloWorld.exe does depend on OPENCL.DLL:
Dependency Walker also shows that OpenCL.DLL is from C:\Windows\System32. But 64-bit Dependency Walker shows that 64-bit HelloWorld.exe also depends on a 64-bit OpenCL.dll from the same C:\Windows\System32 directory:
If we check carefully, we can see that the two OpenCL.dll files are different: the 32-bit version is 58KB, and the 64-bit version is 64KB. But they have the same filename and cannot exist in the same directory!
VoidTools Everything search tool shows that the OpenCL.dll at C:\Windows\System32 is 64KB, but OpenCL.dll under C:\Windows\SysWOW64 is 58KB:
It turns out that the 32-bit Dependency Walker sees a redirected C:\Windows\System32 directory, which is in fact C:\Windows\SysWOW64 on Windows 7 x64. This is how 64-bit Windows tricks 32-bit applications into believing that they are running in a real 32-bit environment. The WOW64 DLLs are automatically seen by 32-bit applications as system-wide DLLs.
In either case, two versions of OpenCL.dll (32-bit and 64-bit) are installed to the Windows system directories by the OpenCL driver installation, and are therefore available to all applications.
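The redirection described above can be pictured as a simple path rewrite. The following is only a toy model, not how Windows implements it: `redirect_for_32bit` is a hypothetical helper, and it ignores the subdirectories that the real WOW64 redirector exempts.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical helper: given the path a 32-bit process asks for, return the
// path that is actually opened on 64-bit Windows (toy model, no exemptions).
std::string redirect_for_32bit(const std::string& path) {
    const std::string from = "C:\\Windows\\System32";
    const std::string to   = "C:\\Windows\\SysWOW64";
    if (path.size() < from.size()) return path;

    // Windows paths are case-insensitive, so compare the prefix ignoring case.
    const bool prefix_matches = std::equal(
        from.begin(), from.end(), path.begin(),
        [](char a, char b) {
            return std::tolower(static_cast<unsigned char>(a)) ==
                   std::tolower(static_cast<unsigned char>(b));
        });
    if (!prefix_matches) return path;

    // The prefix must end at a path boundary, so "System32Extra" is untouched.
    if (path.size() > from.size() && path[from.size()] != '\\') return path;

    return to + path.substr(from.size());
}
```

With this model, a 32-bit process asking for C:\Windows\System32\OpenCL.dll ends up with C:\Windows\SysWOW64\OpenCL.dll, which matches the 58KB file that the 32-bit Dependency Walker reported.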
Installable Client Driver Loader
Everything seems fine so far. HelloWorld is able to find the header file, link to the import library, and run with OpenCL.dll installed at the system directory.
But checking the OpenCL.dll file with Dependency Walker reveals something that does not make sense:
- OpenCL.dll only depends on ADVAPI32.DLL and KERNEL32.DLL, both of which are Windows system DLLs. This suggests that the bulk of the OpenCL implementation is contained in OpenCL.dll itself, not in the DLLs it depends on.
- But OpenCL.dll is very small: neither version is larger than 64KB. It’s very unlikely that AMD is able to pack their OpenCL implementation into such a tiny binary file.
If we check the Details tab of the OpenCL.dll properties, it shows something interesting:
The DLL is not provided by AMD! Rather, it is built by Khronos, the standardization organization of OpenCL (and OpenGL and others). Obviously, AMD’s OpenCL implementation is not in this DLL.
However, the product name, Khronos OpenCL ICD, hints at the nature of this DLL file. ICD stands for Installable Client Driver. This is a mechanism that allows multiple OpenCL implementations from different vendors to co-exist in the same system. The mechanism has a few parts:
- ICD Loader. This is a “well-known” proxy that the user OpenCL application talks to. In our case, it is OpenCL.dll provided by Khronos.
  - The ICD Loader (ICDL) implements all OpenCL API functions, so the user does not need to know the actual OpenCL implementation and only needs to talk to OpenCL.dll for all OpenCL business.
  - The ICDL forwards OpenCL API calls to the actual vendor implementations.
- ICD Vendor Libraries, i.e., ICDs. They are the actual OpenCL implementation libraries from vendors. There can be multiple OpenCL libraries from different vendors in the system.
- ICD Loader Vendor Discovery. The discovery has two parts:
  - Vendor enumeration. The vendors need to register their libraries with the system. This is where the ICD Loader finds all vendor libraries:
    - On Windows, vendor library DLL file paths are stored as value names under the registry key HKEY_LOCAL_MACHINE\SOFTWARE\Khronos\OpenCL\Vendors, each with DWORD value 0.
    - On Linux, each vendor library should drop a text file in the directory /etc/OpenCL/vendors, where the text file should have only one line containing the shared object file path.
    - On Android, although it is pretty much Linux, ICD is only available with OpenCL 2.0 and later (see source).
  - Adding libraries. For each enumerated vendor library:
    - The ICD Loader dynamically loads the library through LoadLibrary/dlopen;
    - The ICD Loader looks up the symbols clIcdGetPlatformIDsKHR, clGetPlatformInfo and clGetExtensionFunctionAddress in the dynamically loaded library through GetProcAddress/dlsym;
    - The ICD Loader calls clIcdGetPlatformIDsKHR, clGetPlatformInfo and clGetExtensionFunctionAddress to get the available platforms, their information and extension function addresses, and to make sure the vendor library is ICD compliant.
    - If any of the steps above fails, the vendor library is considered not ICD compliant and is ignored by the ICD Loader.
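The discovery steps above can be sketched as a small simulation. This is not the Khronos loader's code: to keep the sketch portable, LoadLibrary/dlopen and GetProcAddress/dlsym are replaced by a lookup table (`FakeSystem`), the Linux-style vendor files are passed in as strings, and the final step of actually calling the three functions is left out. All type and function names here are hypothetical.

```cpp
#include <cassert>
#include <cctype>
#include <istream>
#include <map>
#include <set>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for the machine: library path -> exported symbol names.
using FakeSystem = std::map<std::string, std::set<std::string>>;

// Vendor enumeration (Linux style): the file holds one line, the library path.
std::string library_path_from_vendor_file(std::istream& file) {
    std::string line;
    std::getline(file, line);
    // Drop trailing whitespace such as a stray '\r'.
    while (!line.empty() &&
           std::isspace(static_cast<unsigned char>(line.back()))) {
        line.pop_back();
    }
    return line;
}

// Adding libraries: "load" the library, then look up the required symbols.
bool looks_icd_compliant(const FakeSystem& system, const std::string& path) {
    const auto lib = system.find(path);  // stands in for LoadLibrary/dlopen
    if (lib == system.end()) return false;

    const char* const required[] = {
        "clIcdGetPlatformIDsKHR",
        "clGetPlatformInfo",
        "clGetExtensionFunctionAddress",
    };
    for (const char* symbol : required) {
        // Stands in for GetProcAddress/dlsym.
        if (lib->second.count(symbol) == 0) return false;
    }
    return true;
}

// Runs both parts; libraries that fail any step are silently ignored.
std::vector<std::string> discover(const FakeSystem& system,
                                  const std::vector<std::string>& vendor_files) {
    std::vector<std::string> accepted;
    for (const std::string& contents : vendor_files) {
        std::istringstream file(contents);
        const std::string path = library_path_from_vendor_file(file);
        if (looks_icd_compliant(system, path)) accepted.push_back(path);
    }
    return accepted;
}
```

Given one library that exports all three symbols, one that exports only clGetPlatformInfo, and one vendor file pointing at a missing library, `discover` returns just the first path.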
How ICDL Works
The OpenCL user application always calls clGetPlatformIDs first before it makes any other OpenCL calls. The implementation of clGetPlatformIDs in ICD Loader performs the discovery step described above, and possibly returns the aggregated platform_ids from different ICD vendors. For any ICD compliant driver, the returned platform_id object must have a dispatch member:
typedef struct _cl_platform_id* cl_platform_id; // cl.h
struct _cl_platform_id // in vendor implementation
{
    struct _cl_icd_dispatch *dispatch;
    // ... remainder of internal data
};
The definition of _cl_icd_dispatch, which contains function pointers to all OpenCL API functions, is provided by Khronos to its members. It is not public, but is similar to this:
struct _cl_icd_dispatch
{
    CL_API_ENTRY cl_int (CL_API_CALL *clGetPlatformIDs)(
        cl_uint num_entries,
        cl_platform_id * platforms,
        cl_uint * num_platforms) CL_API_SUFFIX__VERSION_1_0;
    CL_API_ENTRY cl_int (CL_API_CALL *clGetPlatformInfo)(
        cl_platform_id platform,
        cl_platform_info param_name,
        size_t param_value_size,
        void * param_value,
        size_t * param_value_size_ret) CL_API_SUFFIX__VERSION_1_0;
    /* ...continues... */
};
In fact, the struct _cl_icd_dispatch* dispatch member is present in every OpenCL object, not just cl_platform_id. Therefore, almost every OpenCL API function implementation in the ICD Loader is a straightforward redirection similar to this:
cl_abc clXYZ(cl_object_type obj, ...)
{
    return obj->dispatch->clXYZ(obj, ...);
}
Here, cl_object_type can be any OpenCL object type, for example, cl_platform_id, cl_device_id, cl_context and so on, while clXYZ is a public OpenCL function. This is possible because ICD requires every OpenCL object to carry the dispatch field.
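To make the redirection concrete, here is a cut-down, self-contained model of it. It does not use the real OpenCL headers: `fake_object`, `fake_dispatch`, the two "vendors" and `loader_get_vendor_name` are all hypothetical stand-ins, and the table has a single entry instead of the full API.

```cpp
#include <cassert>
#include <string>

struct fake_object;  // stand-in for cl_platform_id, cl_context, and so on

// Stand-in for _cl_icd_dispatch: a table of function pointers.
struct fake_dispatch {
    const char* (*get_vendor_name)(fake_object* obj);
};

// As in the struct shown earlier, the dispatch pointer comes first and the
// vendor's own data follows it.
struct fake_object {
    const fake_dispatch* dispatch;
    int vendor_private_data;
};

// Two pretend vendor implementations, each with its own table.
static const char* vendor_a_get_name(fake_object*) { return "Vendor A"; }
static const char* vendor_b_get_name(fake_object*) { return "Vendor B"; }
static const fake_dispatch vendor_a_table = { vendor_a_get_name };
static const fake_dispatch vendor_b_table = { vendor_b_get_name };

// Loader side: it knows nothing about the vendors and simply forwards the
// call through whatever table the object carries.
const char* loader_get_vendor_name(fake_object* obj) {
    assert(obj != nullptr && obj->dispatch != nullptr);
    return obj->dispatch->get_vendor_name(obj);
}
```

An object created with `&vendor_a_table` is routed to vendor A's function and one created with `&vendor_b_table` to vendor B's, even though the caller only ever invokes `loader_get_vendor_name`.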
In a rough C++ analogy:
- _cl_icd_dispatch is an interface that contains all the OpenCL API functions as virtual methods.
- Each OpenCL object implements the _cl_icd_dispatch interface.
- The ICD Loader calls the corresponding method on the interface, so the vendor's implementation of the virtual function is the one that runs.
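Spelled out as code, the analogy might look like the sketch below. It is purely illustrative: the class names are made up, the interface has a single method named after the placeholder clXYZ, and real ICDs use a plain C function-pointer table rather than C++ virtual methods.

```cpp
#include <cassert>
#include <string>

// Plays the role of _cl_icd_dispatch: one virtual method per API function.
struct OpenClInterface {
    virtual ~OpenClInterface() = default;
    virtual std::string clXYZ() = 0;
};

// Each vendor's object implements the interface in its own way.
struct VendorAObject : OpenClInterface {
    std::string clXYZ() override { return "handled by vendor A"; }
};

struct VendorBObject : OpenClInterface {
    std::string clXYZ() override { return "handled by vendor B"; }
};

// Plays the role of the ICD Loader: it calls through the interface, and
// virtual dispatch selects the vendor's implementation.
std::string loader_clXYZ(OpenClInterface& obj) {
    return obj.clXYZ();
}
```

The hidden vtable pointer inside each object does the job that the explicit dispatch member does in an ICD-compliant OpenCL object.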
AMD OpenCL Implementation
On 64-bit Windows, you can run either the 64-bit Registry Editor (C:\Windows\regedit.exe) or the 32-bit one (C:\Windows\SysWOW64\regedit.exe). They show exactly the same registry information.
The native 64-bit registry key for OpenCL ICDs is HKEY_LOCAL_MACHINE\SOFTWARE\Khronos\OpenCL\Vendors. The 64-bit OpenCL.dll finds its entries here.
The corresponding key for 32-bit applications on 64-bit Windows is HKEY_LOCAL_MACHINE\SOFTWARE\Wow6432Node\Khronos\OpenCL\Vendors. The 32-bit OpenCL.dll finds its entries here.
Notice that neither of the Vendors entries above has a full path to the ICD DLL. Basically, this means that those DLL files are installed under the Windows system directories. As Everything shows, the AMD OpenCL 64-bit driver amdocl64.dll is under the C:\Windows\System32 directory, and the 32-bit driver amdocl.dll is under C:\Windows\SysWOW64. They are registered under the corresponding registry key for the ICD Loader to discover.
As expected, the actual AMD OpenCL implementation DLL files are much bigger than the ICD Loader: 47MB for 64-bit and 39MB for 32-bit! A few more details about the AMD OpenCL implementation:
Dependency Walker shows that the AMD driver DLL depends on some Windows system DLLs and OpenGL. It exposes OpenCL functions such as clBuildProgram, and some AMD proprietary functions such as aclWriteToMem.
This blog post walks through the header and import library files used in building the AMD HelloWorld OpenCL sample project, and explores the Khronos ICD Loader OpenCL.dll as well as the AMD OpenCL implementation drivers that Windows loads when running the sample.