From: Jeremie Courreges-Anglas
Subject: Re: UPDATE: numpy
To: Brad Smith
Cc: ports@openbsd.org, daniel@openbsd.org
Date: Tue, 24 Jun 2025 17:46:29 +0200

On Sat, Jun 21, 2025 at 04:49:35AM -0400, Brad Smith wrote:
> On 2025-06-20 1:17 p.m., Jeremie Courreges-Anglas wrote:
> > On Fri, Jun 20, 2025 at 01:25:11AM -0400, Brad Smith wrote:
> > > Here is a diff for numpy that backports patches to enable CPU feature
> > > detection via elf_aux_info() on ARM, PowerPC64, and RISC-V 64.
> > PLEASE, Brad, tell us what you tested and what you didn't test when
> > you propose patches, especially arch-dependent ones.
> > This still builds on riscv64.
> >
> > It shouldn't matter much, but I'd prefer that you initialize 'hwcap'
> > to zero and/or add error checking around elf_aux_info().
> >
> I have build tested and ran 'make test' on both arm64 and riscv64. Nothing
> changes about riscv64.

That's expected, we don't support RVV anyway.

> arm64 -- M1
>
> before:
> = 97 failed, 44699 passed, 345 skipped, 2815 deselected, 33 xfailed, 4
> xpassed, 65 warnings in 525.39s (0:08:45) =
>
> after:
> = 97 failed, 44699 passed, 345 skipped, 2815 deselected, 33 xfailed, 4
> xpassed, 65 warnings in 509.20s (0:08:29) =

The apparent speedup wasn't very stable in my tests, but we should
probably go for it anyway.  It could probably be better.  While fixing
riscv64 I also looked at arm64 and I think we're busted because of
some failing checks:

[...]
Message: During parsing cpu-dispatch: The following CPU features were ignored due to platform incompatibility or lack of support:
"XOP FMA4"
Test features "NEON NEON_FP16 NEON_VFPV4 ASIMD" : Supported
Test features "ASIMDHP" : Unsupported due to Compiler fails against the test code of "ASIMDHP"
Test features "ASIMDFHM" : Unsupported due to Implied feature "ASIMDHP" is not supported
Test features "SVE" : Unsupported due to Implied feature "ASIMDHP" is not supported
Configuring npy_cpu_dispatch_config.h using configuration
Message:
CPU Optimization Options
  baseline:
    Requested : min
    Enabled : NEON NEON_FP16 NEON_VFPV4 ASIMD
  dispatch:
    Requested : max -xop -fma4
    Enabled :

[...]

meson-log.txt:

-----------
Command line: `cc /usr/ports/pobj/py-numpy-2.2.6/numpy-2.2.6/.mesonpy-u2hr_00r/meson-private/tmpgaj2d4u6/testfile.c -o /usr/ports/pobj/py-numpy-2.2.6/numpy-2.2.6/.mesonpy-u2hr_00r/meson-private/tmpgaj2d4u6/output.obj -c -O2 -pipe -g -D_FILE_OFFSET_BITS=64 -O0 -march=armv8.2-a+fp16` -> 0
Running compile:
Working directory: /tmp/tmp4ekm55s8
Source file: /usr/ports/pobj/py-numpy-2.2.6/numpy-2.2.6/numpy/distutils/checks/cpu_asimdhp.c
-----------
Command line: `cc /usr/ports/pobj/py-numpy-2.2.6/numpy-2.2.6/numpy/distutils/checks/cpu_asimdhp.c -o /tmp/tmp4ekm55s8/output.exe -D_FILE_OFFSET_BITS=64 -march=armv8.2-a+fp16` -> 1
stderr:
/tmp//ccZ65tvS.s: Assembler messages:
/tmp//ccZ65tvS.s:50: Error: selected processor does not support `fabd v0.8h,v1.8h,v0.8h'
/tmp//ccZ65tvS.s:53: Error: selected processor does not support `fcvtzs w0,h0'
/tmp//ccZ65tvS.s:61: Error: selected processor does not support `fabd v0.4h,v1.4h,v0.4h'
/tmp//ccZ65tvS.s:64: Error: selected processor does not support `fcvtzs w0,h0'
-----------

This is because the compiler used really is egcc which in turn uses
devel/gas.

> I'll see about zero initializing hwcap.

thx.  It matters more as a good practice to push upstream than as an
additional safety net in our patch.
elf_aux_info(AT_HWCAP) *shouldn't* fail on OpenBSD/arm64.

--
jca
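
For reference, the zero initialization and error checking discussed
above could look roughly like the following.  This is only a sketch,
not the actual numpy patch: query_hwcap() is a hypothetical helper
name, and the getauxval() branch is an assumption added so that the
snippet also builds on Linux.

```c
#include <sys/auxv.h>   /* AT_HWCAP; elf_aux_info() on the BSDs, getauxval() on Linux */

/*
 * Hypothetical helper: return the AT_HWCAP bits, or 0 if they cannot
 * be read.  A result of 0 simply means "no optional CPU features",
 * so callers fall back to the baseline code paths.
 */
unsigned long
query_hwcap(void)
{
	unsigned long hwcap = 0;	/* defined value even if the lookup fails */

#if defined(__OpenBSD__) || defined(__FreeBSD__)
	/* elf_aux_info() returns nonzero on failure; reset hwcap in that case */
	if (elf_aux_info(AT_HWCAP, &hwcap, (int)sizeof(hwcap)) != 0)
		hwcap = 0;
#elif defined(__linux__)
	/* assumption: Linux fallback; getauxval() yields 0 when the entry is absent */
	hwcap = getauxval(AT_HWCAP);
#endif
	return hwcap;
}
```

With this shape a failed lookup is indistinguishable from a CPU without
optional features, which is the conservative outcome for feature
detection.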