Commit Briefs
less chatty regress
ok stsp@
adjust the expected output so that regress passes again
Note that test015 is still broken on OpenBSD with base patch(1). ok stsp@
ARRAY_LIST allocation optimisation
Rather than realloc in fixed-sized blocks, use the 1.5 * allocated scheme when growing the array. This produces fewer allocations and up to 3x speedup on large diffs. ok stsp@
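The growth policy described above can be sketched as follows. This is a minimal illustration, not the actual ARRAY_LIST macros: the struct int_list type, the helper names, and the minimum capacity of 8 are all assumptions made for the example.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical growable array of ints; not the real ARRAY_LIST macros. */
struct int_list {
	int *head;
	size_t len;
	size_t allocated;
};

/* Next capacity under the 1.5x scheme; returns 0 if it would overflow. */
static size_t
next_capacity(size_t allocated)
{
	size_t grow;

	if (allocated < 8)
		return 8;	/* assumed minimum capacity */
	grow = allocated / 2;
	if (allocated > SIZE_MAX - grow)
		return 0;
	return allocated + grow;
}

/* Append one value, growing when full. Returns 0 on success, -1 on error. */
static int
int_list_add(struct int_list *l, int v)
{
	if (l->len == l->allocated) {
		size_t n = next_capacity(l->allocated);
		int *p;

		if (n == 0 || n > SIZE_MAX / sizeof(*p))
			return -1;
		p = realloc(l->head, n * sizeof(*p));
		if (p == NULL)
			return -1;
		l->head = p;
		l->allocated = n;
	}
	l->head[l->len++] = v;
	return 0;
}
```

Because the capacity grows geometrically, appending n items needs on the order of log(n) reallocations instead of one per fixed-size block.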
fix accounting for line endings in CRLF files
There are two different subtle errors in computing the end of line in diff_data_atomize_text_lines (one per implementation, _fd and _mmap) that cause the '\n' of the '\r\n' case to be left out of the current line. This causes strange bugs when diffing CRLF files, such as printing the "\ No newline at end of file" marker very often and showing the wrong offsets in the hunk headers. ok stsp@
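The intended behaviour can be sketched roughly as below. This is an assumption-laden illustration over an in-memory buffer, not the code from either the _fd or _mmap implementation; the function name line_end is hypothetical, and treating a lone '\r' as a line terminator is assumed here.

```c
#include <stddef.h>

/*
 * Return the offset one past the end of the line starting at pos,
 * including its terminator. A "\r\n" pair counts as one terminator,
 * so the '\n' is never left behind for the next line.
 */
static size_t
line_end(const char *buf, size_t len, size_t pos)
{
	size_t i = pos;

	/* Scan to the first line-ending character, if any. */
	while (i < len && buf[i] != '\r' && buf[i] != '\n')
		i++;

	if (i < len) {
		char c = buf[i++];	/* consume '\r' or '\n' */

		/* If that was a '\r', also pull in a following '\n'. */
		if (c == '\r' && i < len && buf[i] == '\n')
			i++;
	}
	return i;
}
```

For the buffer "a\r\nb\n" this yields line boundaries at offsets 3 and 5, so the second line starts at 'b' rather than at a stray '\n'.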
add missing line offset information for unidiff output
We forgot to generate line offset information for lines of the form "Binary files %s and %s differ", which is causing scrolling problems in tog's diff view. ok thomas_adam
fix a size_t multiplication overflow in diff_myers.c
Found on an OpenBSD armv7 machine running Got regression tests:
test_status_shows_no_mods_after_complete_merge Segmentation fault (core dumped)
The problematic multiplication is kd_len * kd_len in diff_algo_myers() with kd_len set to 65537.
(gdb) p (int)(65537 * 65537)
$64 = 131073
(gdb) p (int)(65537 + 65537)
$65 = 131074
(gdb) p (unsigned int)(size_t)(-1)
$68 = 4294967295
(gdb) p (4294967295 / kd_len)
$71 = 65535
Detect such overflow and run the fallback diff algorithm instead.
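One common way to detect this kind of overflow is to compare one factor against SIZE_MAX divided by the other before multiplying. The sketch below shows that check in isolation; the helper name size_mul_ok is hypothetical and this is not necessarily how diff_algo_myers() performs the test.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Store a * b in *out and return 1 if the product fits in size_t.
 * Return 0 without touching *out if it would overflow.
 */
static int
size_mul_ok(size_t a, size_t b, size_t *out)
{
	if (a != 0 && b > SIZE_MAX / a)
		return 0;
	*out = a * b;
	return 1;
}
```

With a 32-bit size_t, SIZE_MAX / 65537 is 65535, which is smaller than 65537, so the check fails for kd_len * kd_len and the caller can switch to the fallback algorithm.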
avoid writing just one character at a time in diff_output_lines()
With input from ori@ and millert@
fix performance issues in the search for function prototypes
with + ok naddy
patience: do not swallow identical neighbors
Swallowing identical neighbors does not make much sense: if common-unique lines swallow their neighboring ones, they count less, and another bad, shorter sequence may gain more weight than a very long sequence that was combined into just one common-unique chunk. Not swallowing also greatly simplifies the code and avoids bugs that previously required complex fixes.
set diff box recursion limit to UINT_MAX by default
In practice, recursion is already limited by our Myers max-effort cut and this heuristic should generally provide a better split than a limit on the number of diff boxes. A recursion limit is only required for diff configs that do not include the Myers algorithm, which we currently don't have. Discussed with Neels
set a minimum myers effort limit to avoid an early short-cut on small files
discussed with Neels
allow for telling the difference between empty and non-existent files
Adjust labels used in diff output accordingly; non-existent files should have the label "/dev/null"
add another test case where a context line appears as both - and +
This line appears as a context line with regular diff(1) and git diff:
 crd->crd_key = sd->mds.mdd_crypto.scr_key[0];
[[[
+ crd->crd_alg = sd->mds.mdd_crypto.scr_alg;
+ crd->crd_klen = sd->mds.mdd_crypto.scr_klen;
  crd->crd_key = sd->mds.mdd_crypto.scr_key[0];
- bcopy(&blk, crd->crd_iv, sizeof(blk));
+ memcpy(crd->crd_iv, &blkno, sizeof(blkno));
]]]
Our diff produces a different result where this context line is both deleted and added:
[[[
- crd->crd_alg = CRYPTO_AES_XTS;
+ crd->crd_alg = sd->mds.mdd_crypto.scr_alg;
+ crd->crd_klen = sd->mds.mdd_crypto.scr_klen;
+ crd->crd_key = sd->mds.mdd_crypto.scr_key[0];
+ memcpy(crd->crd_iv, &blkno, sizeof(blkno));
+ }
- switch (sd->mds.mdd_crypto.scr_meta->scm_alg) {
- case SR_CRYPTOA_AES_XTS_128:
- crd->crd_klen = 256;
- break;
- case SR_CRYPTOA_AES_XTS_256:
- crd->crd_klen = 512;
- break;
- default:
- goto unwind;
- }
- crd->crd_key = sd->mds.mdd_crypto.scr_key[0];
- bcopy(&blk, crd->crd_iv, sizeof(blk));
- }
- crwu->cr_wu = wu;
- crwu->cr_crp->crp_opaque = crwu;
- return (crwu);
]]]