From: Landry Breuil
Subject: Re: [new] databases/arrow 18.0.0
To: ports@openbsd.org
Date: Fri, 1 Nov 2024 21:38:26 +0100

On Fri, Nov 01, 2024 at 07:17:31PM +0100, Landry Breuil wrote:
> On Fri, Nov 01, 2024 at 10:10:35AM +0000, Stuart Henderson wrote:
> > On 2024/11/01 10:56, Landry Breuil wrote:
> > > hi,
> > >
> > > following thrift, here's the port for the c++ part of arrow:
> > > https://github.com/apache/arrow/blob/main/cpp/README.md
> > > it provides the parquet library for https://parquet.apache.org/.
> > >
> > > some open questions:
> > > - i've put the port in databases because for me it's sort of a database
> > > format: "The universal columnar format and multi-language toolbox for
> > > fast data interchange and in-memory analytics"
> > >
> > > but it can go into devel or textproc, i'm not settled on it. devel is
> > > already a bit crowded...
> >
> > databases sounds good
> >
> > > - the toplevel in https://github.com/apache/arrow/ has zero build goo,
> > > so from the same distfile one has to build by subdir (eg setting
> > > WRKDIST=${WRKDIR}/${DISTNAME}/cpp), hence the pkgname being arrow-cpp
> > > since i'm only interested in the c++ part.
> >
> > shouldn't that be WRKSRC=${WRKDIST}/cpp?
>
> yes, i'm always confused by the various variations..
>
> > > should i name the port databases/arrow-cpp ? databases/arrow/cpp in
> > > preparation for potential other ports for various bindings ?
> >
> > databases/arrow/cpp sounds a good plan to me. common parts can be
> > factored in Makefile.inc later when we find out what the common parts
> > are :)
>
> here's a new version that:
> - enables json support via textproc/rapidjson
> (CXXFLAGS=-I/usr/local/include was the missing key so that cmake finds
> rapidjson headers)
> - enables building tools & tests, lots of tests run fine:
> 89% tests passed, 9 tests failed out of 81
> (and i just realized some of the parquet test failures are only because
> i forgot to set PARQUET_TEST_DATA in the env)

and here's a third version fetching the test files from github via
DIST_TUPLE, and properly setting TEST_ENV so that more tests pass:

94% tests passed, 5 tests failed out of 81

The following tests FAILED:
         23 - arrow-compute-scalar-temporal-test (Failed)
         34 - arrow-io-file-test (Failed)
         36 - arrow-utility-test (Failed)
         39 - arrow-threading-utility-test (Failed)
         78 - parquet-arrow-test (Failed)

feedback on that last version welcome, oks too. My plan is to enable
lerc in gtiff, and lerc/arrow/avif in gdal when updating to 3.10 in the
coming days.

Landry
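
For readers following the thread, the settings discussed above could look roughly like the fragment below. This is only a sketch, not the Makefile attached to this mail: the `<commit>` placeholder, the `cpp/parquet-testing` target directory, and the use of `${LOCALBASE}` in place of the literal /usr/local are assumptions made for illustration, and the fragment presumes the DIST_TUPLE target directory is taken relative to WRKDIST.

```make
# Hypothetical fragment for databases/arrow/cpp/Makefile (illustrative only).

# Build from the cpp/ subdirectory of the single arrow distfile,
# as suggested in the thread instead of overriding WRKDIST.
WRKSRC =	${WRKDIST}/cpp

# Let cmake find the rapidjson headers installed by textproc/rapidjson.
# LOCALBASE defaults to /usr/local.
CXXFLAGS +=	-I${LOCALBASE}/include

# Fetch the parquet test data from github.
# <commit> and the target directory are placeholders, not the port's values.
DIST_TUPLE +=	github apache parquet-testing <commit> cpp/parquet-testing

# Point the parquet tests at the fetched data directory.
TEST_ENV +=	PARQUET_TEST_DATA=${WRKSRC}/parquet-testing/data
```

With something along these lines in place, `make test` in the port directory would run the test suite with PARQUET_TEST_DATA set in its environment.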