From: Steffen Nurpmeso
Subject: [NEW] sysutils/s-bsdipa
To: ports@openbsd.org
Date: Tue, 15 Apr 2025 23:33:03 +0200

Hello.

S-bsdipa, a mutation of BSDiff: create or apply a ZLIB compressed
binary difference patch. (Think xdelta3 / vcdiff; but very small and
often better.)

In short: the OpenBSD port only supports the 32-bit mode; the binary
is a multiplexer that both creates and applies the patches. The full
test suite is only part of the BsDiPa perl (CPAN) module (because it
would be quite lengthy in C), therefore the port only creates an
on-the-fly test that shows that the executable as such is at least
usable.

In long: ie, that is

  [.] the BSDiff algorithm of Colin Percival, [.] taken from the
  FreeBSD operating system source code, and slightly rearranged.
  There is a freely usable (BSD 2-clause, ISC and MIT licenses)
  plug-and-play ISO C99 and perl implementation available
  (https://github.com/sdaoden/s-bsdipa), which includes further
  references on the algorithm.

  [.this port is] a 32-bit adaption sufficient for email that almost
  halves memory requirements compared to 64-bit, and also produces
  smaller difference control data. The resulting binary difference
  is then ZLIB[RFC1950] compressed[.]

with the following adaptions:

* First of all: the string suffix sorting and difference creation
  approach of Colin Percival has been left unchanged.

* The original had been fixated on 64-bit file sizes and content
  representation. The adaption supports 32-bit as well as 64-bit,
  selected at compile time. Using 32-bit almost halves memory
  requirements, and produces smaller patch control data. It is
  deemed sufficient for email purposes. (32-bit and 64-bit patches
  are not interchangeable.)

* The "magic window of inspection" has been made configurable, from
  the fixed original value 8, which represents a perfect fit for
  compiler output. The adaption uses the default value 16, which is
  a very good fit for textual data.
  The value is, however, irrelevant on the patch application side.

* In order to reduce memory usage during patch generation, the
  adaption uses a shared memory region for differential and extra
  data: the former is therefore stored in reversed order, top down.
  (This reduces memory usage by the size of the target data set.)

* The adaption stores data in big endian (network; MSF; most
  significant byte first) instead of little endian (LSF; least
  significant byte first) byte order.

* The original uses three separate bzip2 streams to serialize
  control, differential and extra data. The adaption separated patch
  generation from the I/O layer, which will therefore see the entire
  readily prepared patch data.[.]
  [The port uses ZLIB[RFC1950] for patch compression.]

* The original header did not contain the size of the extra data,
  which was stored last, with its size implicitly extending to the
  end of the patch. The adaption includes the extra data size in the
  header, allowing more verification tests to be applied once only
  the header has been parsed. This also enables the I/O layer to
  allocate perfectly sized memory with only the header data being
  available.

* The adaption performs memory allocations through user provided
  callbacks.

--steffen
|
|Der Kragenbaer,                The moon bear,
|der holt sich munter           he cheerfully and one by one
|einen nach dem anderen runter  wa.ks himself off
|(By Robert Gernhardt)
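
To illustrate the header and byte-order items in the list above, here
is a minimal sketch in Python. It is not the s-bsdipa on-disk format:
the three-field header layout and the names HEADER, pack_patch and
unpack_patch are assumptions made up for this example. It only shows
the general idea of 32-bit big-endian size fields, an explicit extra
data size that can be checked once the header is parsed, and a single
ZLIB (RFC 1950) stream around the whole prepared patch.

```python
import struct
import zlib

# Hypothetical header, NOT the real s-bsdipa layout: three unsigned
# 32-bit integers in big-endian (network) byte order, giving the
# sizes of the control, differential and extra data blocks.
HEADER = struct.Struct(">III")


def pack_patch(ctrl: bytes, diff: bytes, extra: bytes) -> bytes:
    """Serialize header plus the three blocks as one ZLIB stream."""
    header = HEADER.pack(len(ctrl), len(diff), len(extra))
    return zlib.compress(header + ctrl + diff + extra)


def unpack_patch(blob: bytes) -> tuple[bytes, bytes, bytes]:
    """Decompress, then verify the sizes using only the header."""
    raw = zlib.decompress(blob)
    if len(raw) < HEADER.size:
        raise ValueError("patch is shorter than its header")
    ctrl_len, diff_len, extra_len = HEADER.unpack_from(raw)
    # Because the extra data size is explicit, the total length can
    # be checked before any block is looked at.
    if HEADER.size + ctrl_len + diff_len + extra_len != len(raw):
        raise ValueError("header sizes do not match patch length")
    pos = HEADER.size
    ctrl = raw[pos:pos + ctrl_len]
    pos += ctrl_len
    diff = raw[pos:pos + diff_len]
    pos += diff_len
    extra = raw[pos:pos + extra_len]
    return ctrl, diff, extra
```

With an implicit extra data size (everything up to the end of the
patch) the length check in unpack_patch would not be possible, so
trailing garbage would silently become part of the extra data.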