
From: Steffen Nurpmeso <steffen@sdaoden.eu>
Subject: Re: [NEW] sysutils/s-bsdipa
To: ports@openbsd.org
Date: Thu, 15 May 2025 21:30:51 +0200

Steffen Nurpmeso wrote in
 <20250415213303.moqL3xEJ@steffen%sdaoden.eu>:
 |  S-bsdipa, a mutation of BSDiff:
 |  create or apply a ZLIB compressed binary difference patch.
 |
 |(Think xdelta3 / vcdiff; but very small and often better.)

Any interest in this?
I.e., it is like a combined FreeBSD bsdiff(1)/bspatch(1), but uses
a different storage format.

 |In short: the OpenBSD port only supports the 32-bit mode; the
 |binary is a multiplexer that both creates and applies patches.
 |The full test suite is only part of the BsDiPa perl (CPAN) module
 |(because it would be quite lengthy in C); therefore the port only
 |creates an on-the-fly test that at least shows that the executable
 |as such is usable.
 |
 |In long: i.e., that is
 |
 |  [.]
 |   the BSDiff algorithm of Colin
 |   Percival, [.] taken from the FreeBSD
 |   operating system source code, and slightly rearranged.  There is a
 |   freely usable (BSD 2-clause, ISC and MIT licenses) plug-and-play ISO
 |   C99 and perl implementation available (https://github.com/sdaoden/
 |   s-bsdipa), which includes further references on the algorithm.
 |   [.this port is]
 |   a 32-bit adaption sufficient for email that almost

(..and to mention that I think some other such approaches only
support 32-bit file sizes anyway; the performance is a bitter
thing.  Having said this, the real 32-bit limit is about 500MB,
not sufficient for a CD; the 64-bit variant is a compile flag.)

 |   halves memory requirements compared to 64-bit, and also produces
 |   smaller difference control data.  The resulting binary difference is
 |   then ZLIB[RFC1950] compressed[.]
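
To illustrate the width claim quoted above, here is a minimal Python
sketch.  It is not code from s-bsdipa: the control triple layout
(diff length, extra length, seek offset) follows the classic BSDiff
scheme, and the sample values and the helper name pack_control are
made up.  It only shows that 32-bit control data is half the size of
64-bit control data before compression, and that a ZLIB stream
round-trips.

```python
import struct
import zlib

# Hypothetical control triples (diff length, extra length, seek offset);
# the values are made up for illustration.
triples = [(120, 0, 16), (64, 8, -32), (300, 12, 4)] * 100

def pack_control(triples, fmt):
    # fmt is a struct code: "i" = signed 32-bit, "q" = signed 64-bit.
    # ">" selects big endian (most significant byte first).
    return b"".join(struct.pack(">3" + fmt, *t) for t in triples)

raw32 = pack_control(triples, "i")
raw64 = pack_control(triples, "q")
print(len(raw32), len(raw64))  # 3600 7200

# zlib.compress() emits an RFC 1950 stream; decompress restores the input.
packed = zlib.compress(raw32, 9)
assert zlib.decompress(packed) == raw32
```

How much of the factor of two survives compression depends on the
data, so the sketch makes no claim about compressed sizes.
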
 |
 |with the following adaptions:
 |
 |   *  First of all: the string suffix sorting and difference creation
 |      approach of Colin Percival has been left unchanged.
 |
 |   *  The original had been fixated on 64-bit file sizes and content
 |      representation.  The adaption supports both 32-bit and 64-bit,
 |      selected at compile time.  Using 32-bit almost halves memory
 |      requirements, and produces smaller patch control data.  It is
 |      deemed sufficient for email purposes.  (32-bit and 64-bit patches
 |      are not interchangeable.)
 |
 |   *  The "magic window of inspection" has been made configurable, from
 |      the fixed original value 8, which represents a perfect fit for
 |      compiler output.  The adaption uses the default value 16, which is
 |      a very good fit for textual data.  The value is, however,
 |      irrelevant on the patch application side.
 |
 |   *  In order to reduce memory usage during patch generation, the
 |      adaption uses a shared memory region for differential and extra
 |      data: the former is therefore stored in reversed order, top down.
 |      (This reduces memory usage by the size of the target data set.)
 |
 |   *  The adaption stores data in big endian (network; MSF; most
 |      significant byte first) instead of little endian (LSF; least
 |      significant byte first) byte order.
 |
 |   *  The original uses three separate bzip2 streams to serialize
 |      control, differential and extra data.  The adaption separated
 |      patch generation from the I/O layer, which will therefore see the
 |      entire readily prepared patch data.[.]
 |      [The port uses ZLIB[RFC1950] for patch compression.]
 |
 |   *  The original header did not contain the size of the extra data,
 |      which was stored last, with its size implicitly extending to the
 |      end of the patch.  The adaption includes the extra data size in
 |      the header, allowing more verification tests to be applied with
 |      only the header being readily parsed.  This also enables the I/O
 |      layer to allocate perfectly sized memory with only the header data
 |      being available.
 |
 |   *  The adaption performs memory allocations through user provided
 |      callbacks.
 --End of <20250415213303.moqL3xEJ@steffen%sdaoden.eu>
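
The header and byte order points in the list above can be sketched in
a few lines of Python.  This is an assumption-laden illustration, not
the s-bsdipa format: the three-field header, the field order and the
names build_patch and split_patch are hypothetical.  It shows why
having the extra data size in the header helps: all section sizes can
be verified against the patch size before any payload is touched.

```python
import struct

# Hypothetical header: control, diff and extra lengths as signed 32-bit
# big-endian integers.  The real s-bsdipa layout may differ.
HEADER = struct.Struct(">3i")

def build_patch(ctrl, diff, extra):
    return HEADER.pack(len(ctrl), len(diff), len(extra)) + ctrl + diff + extra

def split_patch(patch):
    if len(patch) < HEADER.size:
        raise ValueError("truncated header")
    ctrl_len, diff_len, extra_len = HEADER.unpack_from(patch)
    if min(ctrl_len, diff_len, extra_len) < 0:
        raise ValueError("negative length")
    # With all three sizes known, the payload size can be checked (and a
    # buffer sized exactly) before any payload byte is looked at.
    if HEADER.size + ctrl_len + diff_len + extra_len != len(patch):
        raise ValueError("length mismatch")
    body = memoryview(patch)[HEADER.size:]
    ctrl = bytes(body[:ctrl_len])
    diff = bytes(body[ctrl_len:ctrl_len + diff_len])
    extra = bytes(body[ctrl_len + diff_len:])
    return ctrl, diff, extra

patch = build_patch(b"CC", b"DDDD", b"E")
assert patch[:4] == b"\x00\x00\x00\x02"  # most significant byte first
assert split_patch(patch) == (b"CC", b"DDDD", b"E")
```

Without the extra length in the header, a reader could only infer it
from wherever the patch happens to end, so a truncated patch would not
be detectable from the header alone.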

--steffen
|
|Der Kragenbaer,                The moon bear,
|der holt sich munter           he cheerfully and one by one
|einen nach dem anderen runter  wa.ks himself off
|(By Robert Gernhardt)