I’ve been looking into PCIe+CXL. These are my notes.
There is a cxl_test module in the Linux tree under tools/testing/cxl/.
There is a cxl command line tool. On Ubuntu and CentOS you install it via the ndctl package. The name refers to libnvdimm, the kernel's Non-Volatile Memory (NVDIMM) subsystem. I think it is needed for the CXL kernel tests, but it is interesting in its own right, too.
When trying to build the cxl_test module from its directory, I got…
/home/ayoung/linux/tools/testing/cxl/config_check.c: In function ‘check’:
././include/linux/compiler_types.h:352:45: error: call to ‘__compiletime_assert_117’ declared with attribute error: BUILD_BUG_ON failed: !IS_MODULE(CONFIG_CXL_BUS)
  352 | _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__)
This means I need to change the CXL_BUS config option for the kernel build from ‘y’ to ‘m’ so that it is built as a module. The make menuconfig search function shows the output below. Note that PCI Support is the top menu item on the Device Drivers page.
Symbol: CXL_BUS [=y]
Type : tristate
Defined at drivers/cxl/Kconfig:2
Prompt: CXL (Compute Express Link) Devices Support
Depends on: PCI [=y]
Location:
  Main menu
    -> Device Drivers
(1)   -> PCI support (PCI [=y])
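As a quick sanity check before rebuilding, something like the following could confirm how the symbol is set in .config. This is a minimal sketch: the function name config_state is made up for these notes, and it only looks at the literal CONFIG_<symbol>=y or =m lines.

```shell
#!/bin/sh
# Hypothetical helper: report how a Kconfig symbol is set in a .config file.
# Prints "builtin" (=y), "module" (=m), or "unset" (anything else).
config_state() {
    sym="$1"
    cfg="${2:-.config}"
    if grep -q "^CONFIG_${sym}=y\$" "$cfg" 2>/dev/null; then
        echo builtin
    elif grep -q "^CONFIG_${sym}=m\$" "$cfg" 2>/dev/null; then
        echo module
    else
        echo unset
    fi
}

# From the top of the kernel tree, the non-interactive equivalent of the
# menuconfig change would be (not run here):
#   scripts/config --module CXL_BUS
#   make olddefconfig
```

Running config_state CXL_BUS from the top of the kernel tree should print "module" once the option has been switched; the BUILD_BUG_ON above fires while it still prints "builtin".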
Once made, there are a bunch of .ko files in the subdir:
$ find . -name \*.ko
./test/cxl_mock.ko
./test/cxl_mock_mem.ko
./test/cxl_test.ko
./cxl_mem.ko
./cxl_pmem.ko
./cxl_acpi.ko
./cxl_core.ko
./cxl_port.ko
Inserting the test module directly fails:
$ sudo insmod test/cxl_test.ko
insmod: ERROR: could not insert module test/cxl_test.ko: Unknown symbol in module
[11204.608668] cxl_test: Unknown symbol cxl_decoder_autoremove (err -2)
[11204.615136] cxl_test: Unknown symbol devm_cxl_add_dport (err -2)
[11204.621236] cxl_test: Unknown symbol is_cxl_memdev (err -2)
[11204.626927] cxl_test: Unknown symbol cxl_decoder_add_locked (err -2)
[11204.633917] cxl_test: Unknown symbol cxl_switch_decoder_alloc (err -2)
[11204.640706] cxl_test: Unknown symbol cxl_endpoint_decoder_alloc (err -2)
[11204.647649] cxl_test: Unknown symbol to_cxl_port (err -2)
[11204.653117] cxl_test: Unknown symbol register_cxl_mock_ops (err -2)
[11204.659676] cxl_test: Unknown symbol unregister_cxl_mock_ops (err -2)
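insmod loads exactly one module and does not pull in dependencies, so every symbol exported by a module that is not yet loaded shows up as unknown. To get a de-duplicated list of what is missing, a small filter over the kernel log helps. This is a sketch; unknown_symbols is a name invented here, and it assumes the log lines have the "Unknown symbol NAME" shape shown above.

```shell
#!/bin/sh
# Hypothetical helper: read kernel log text on stdin and print each
# symbol named in an "Unknown symbol ..." line once, sorted.
unknown_symbols() {
    sed -n 's/.*Unknown symbol \([A-Za-z0-9_]*\).*/\1/p' | sort -u
}

# Typical use (not run here):
#   sudo dmesg | unknown_symbols
```

Each name in the output can then be searched for in the tree (for example with grep -rn) to find which module needs to be built and loaded first.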
The mock module reports the error:
[11573.093178] cxl_mock: Unknown symbol nvdimm_bus_register
So I built ../nvdimm using the same approach as above. The symbol appears in:
../nvdimm/nfit.mod.c:105: { 0xe9117c1f, "nvdimm_bus_register" },
That brings up the errors:
[11907.753694] libnvdimm: Unknown symbol __wrap_devm_memunmap (err -2)
[11907.760070] libnvdimm: Unknown symbol __wrap___release_region (err -2)
[11907.766676] libnvdimm: Unknown symbol __wrap___devm_request_region (err -2)
[11907.773764] libnvdimm: Unknown symbol __wrap_memunmap (err -2)
[11907.779997] libnvdimm: Unknown symbol __wrap___devm_release_region (err -2)
[11907.787085] libnvdimm: Unknown symbol __wrap_memremap (err -2)
[11907.793345] libnvdimm: Unknown symbol __wrap_iounmap (err -2)
[11907.799217] libnvdimm: Unknown symbol __wrap___request_region (err -2)
[11907.806304] libnvdimm: Unknown symbol __wrap_devm_memremap (err -2)
Some guidance from Dan Williams on how to run the tests is at https://github.com/pmem/ndctl/blob/main/README.md
To build the nvdimm code:
make M=tools/testing/nvdimm
make M=tools/testing/cxl/
sudo make M=tools/testing/nvdimm modules_install
Both of those give:
depmod: WARNING: /lib/modules/5.19.0_ampcxl_+/extra/test/nfit_test.ko needs unknown symbol libnvdimm_test
depmod: WARNING: /lib/modules/5.19.0_ampcxl_+/extra/test/nfit_test.ko needs unknown symbol acpi_nfit_test
depmod: WARNING: /lib/modules/5.19.0_ampcxl_+/extra/test/nfit_test.ko needs unknown symbol pmem_test
depmod: WARNING: /lib/modules/5.19.0_ampcxl_+/extra/test/nfit_test.ko needs unknown symbol device_dax_test
depmod: WARNING: /lib/modules/5.19.0_ampcxl_+/extra/test/nfit_test.ko needs unknown symbol dax_pmem_test
When I try to run the ndctl tests:
sudo meson test -C build
the tests are skipped, due to:
libkmod: DEBUG libkmod/libkmod-module.c:202 kmod_module_parse_depline: 1 dependencies for nfit
test/init: ndctl_test_init: nfit.ko: appears to be production version: /lib/modules/5.19.0_ampcxl_+/kernel/drivers/acpi/nfit/nfit.ko
__ndctl_test_skip: explicit skip test_libndctl:2600
nfit_test unavailable skipping tests
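The skip message hinges on where the module that would be loaded lives on disk. A rough way to check this by hand is to look at the path modinfo reports. The sketch below is an assumption drawn only from the paths in the logs above (test builds land under extra/, the stock module under kernel/); it is not how ndctl itself decides, and module_flavor is a name made up for these notes.

```shell
#!/bin/sh
# Hypothetical helper: guess from an installed module's path whether it is
# the test build (under .../extra/) or the stock build (under .../kernel/).
# The mapping is inferred from the log paths in these notes, not from ndctl.
module_flavor() {
    case "$1" in
        */extra/*)  echo test ;;
        */kernel/*) echo production ;;
        *)          echo unknown ;;
    esac
}

# Typical use (not run here):
#   module_flavor "$(modinfo -n nfit)"
```

If this prints "production" for nfit, the test suite would be expected to skip in the same way as shown above.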
The instructions above showed the way forward: before the tests will run, I needed to perform a modules_install of the modules built for the test (tools/testing/nvdimm and tools/testing/cxl, including explicitly installing the ones under tools/testing/nvdimm/test). This is clearly stated in the instructions.
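Put together, the build and install steps for the two test directories could be scripted roughly as follows. This is a sketch, not the procedure from the README: the helper names run and build_test_modules are invented here, it defaults to printing the commands rather than running them, and it does not cover any separate step that may be needed for the tools/testing/nvdimm/test subdirectory.

```shell
#!/bin/sh
# Hypothetical wrapper: print the command when DRY_RUN is 1 (the default),
# otherwise execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Build and install the out-of-tree test modules; meant to be called from
# the top of the kernel source tree.
build_test_modules() {
    for dir in tools/testing/nvdimm tools/testing/cxl; do
        run make M="$dir"
        run sudo make M="$dir" modules_install
    done
}

build_test_modules
```

With DRY_RUN=0 set in the environment the commands are actually executed; afterwards the ndctl suite can be retried with sudo meson test -C build from the ndctl checkout.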
The error in the log file now shows that the code is x86_64 specific: the module nd_e820, which is related to memory management on x86_64 platforms, fails to load. The file ndctl/test/core.c has the following lines:
if (access("/sys/bus/acpi", F_OK) == 0)
	family = NVDIMM_FAMILY_INTEL;
and then later:
if (family != NVDIMM_FAMILY_INTEL &&
    (strcmp(name, "nfit") == 0 || strcmp(name, "nd_e820") == 0))
	continue;
However, my machine does have the path /sys/bus/acpi but will not build/load the nd_e820 module. This at least indicates where to start working on the test: adding an appropriate AARCH64 family to the core test framework. I suspect the right thing is to add a check of something like /proc/cpuinfo and look at the manufacturer. Alternatively, I could look at uname -m and see what architecture the kernel is running on, if the solution is less vendor specific than what x86_64 requires. Tasks for future days.
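To see what an architecture-based check would have to distinguish, the idea can be prototyped from the shell before touching the C code. This is only a sketch of the decision: family_for_machine and the family labels it prints are placeholders invented here, and the real change would live in ndctl/test/core.c.

```shell
#!/bin/sh
# Hypothetical helper: map a machine string, as printed by `uname -m`,
# to a placeholder family label. The presence of /sys/bus/acpi alone
# cannot make this choice, since ACPI-based arm64 machines have it too.
family_for_machine() {
    case "$1" in
        x86_64|i?86)   echo intel ;;
        aarch64|arm64) echo aarch64 ;;
        *)             echo unknown ;;
    esac
}

# Show both signals for the current machine.
if [ -d /sys/bus/acpi ]; then
    echo "acpi bus: present"
else
    echo "acpi bus: absent"
fi
echo "family guess: $(family_for_machine "$(uname -m)")"
```

On the machine described in these notes, this would be expected to report the ACPI bus as present while guessing the aarch64 family, which is exactly the combination the current check in core.c does not handle.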
For now, I am just going to hijack this check and have it set family to NVDIMM_FAMILY_AARCH64. With that, the first test passes, and maybe some others; I have not looked that closely yet.
Next up I will continue through the tests and see what else I can hammer into place to get them to pass.