Commit Briefs

1dce05e8f9 Omar Polo

always cast ctype's is*() arguments to unsigned char

Almost all of them already had an unsigned argument (uint8_t or unsigned char), but cast anyway in case the types are changed in the future. ok stsp@
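
The reason for the cast can be sketched as follows: the is*() functions take an int whose value must be representable as an unsigned char (or be EOF), so passing a plain char that happens to be negative is undefined behavior. The helper name count_leading_space is hypothetical and not part of the diff library.

```c
#include <ctype.h>
#include <stddef.h>

/*
 * Count the leading whitespace bytes in buf.
 * count_leading_space is a hypothetical helper used only for illustration.
 */
static size_t
count_leading_space(const char *buf, size_t len)
{
	size_t n = 0;

	/*
	 * Cast to unsigned char: a plain char may be signed, and a
	 * negative value other than EOF is not a valid isspace() argument.
	 */
	while (n < len && isspace((unsigned char)buf[n]))
		n++;
	return n;
}
```

With the cast in place, bytes above 0x7f are classified according to the current locale instead of invoking undefined behavior.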


674563ab13 tj

Don't return errno when fread fails

fread doesn't consistently set errno on failure.
- On OpenBSD fread sets errno on possible argument overflows, but this doesn't occur on other platforms. fread doesn't set errno on EOF or other failures.
- ferror does not set errno on failure.
Returning errno here is possibly inconsistent. Return EIO here instead. ok stsp@
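
A minimal sketch of the resulting pattern, assuming a caller that needs exactly len bytes and reports errno-style codes. The helper read_exact is hypothetical; the actual call site in the library may distinguish EOF from errors differently.

```c
#include <errno.h>
#include <stdio.h>

/*
 * Read exactly len bytes from f into buf.
 * Returns 0 on success or an errno-style code on failure.
 * read_exact is a hypothetical helper used only for illustration.
 */
static int
read_exact(FILE *f, void *buf, size_t len)
{
	size_t r = fread(buf, 1, len, f);

	if (r != len) {
		/*
		 * errno may be stale or unset after a short read, so
		 * report a fixed code instead of returning errno.
		 */
		return EIO;
	}
	return 0;
}
```

Returning a fixed code keeps the caller from seeing 0 or an unrelated stale errno value when the read fails.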



f400825bc6 Omar Polo

make diff_atom_hash_update private to diff_atomize_text.c

ok stsp@


cd9ef01a44 Omar Polo

reuse diff_atom_hash_update

ok stsp@


ed9312f04b Omar Polo

fix accounting for line endings in CRLF files

There are two different subtle errors in computing the end of line in diff_data_atomize_text_lines (one per implementation, _fd and _mmap) that cause the '\n' of a '\r\n' line ending to be left out of the current line. This causes strange bugs when diffing CRLF files, such as printing the "\ No newline at end of file" marker very often and showing the wrong offsets in the hunk headers. ok stsp@
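
The intended accounting can be sketched like this for an in-memory buffer: the line ending, including both bytes of "\r\n", belongs to the current line, and every read is preceded by a bounds check. The function name line_len_with_eol is hypothetical, and treating a lone '\r' as a line ending is an assumption of this sketch, not a statement about the library's code.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Return the length of the line starting at pos, including its line
 * ending ("\n", "\r" or "\r\n"). end points one past the last valid byte.
 * line_len_with_eol is a hypothetical name used only for illustration.
 */
static size_t
line_len_with_eol(const uint8_t *pos, const uint8_t *end)
{
	const uint8_t *line_end = pos;

	/* Scan up to the first line-ending byte or the end of the buffer. */
	while (line_end < end && *line_end != '\r' && *line_end != '\n')
		line_end++;

	/* Consume "\r\n" as one line ending; check bounds before each read. */
	if (line_end < end && *line_end == '\r')
		line_end++;
	if (line_end < end && *line_end == '\n')
		line_end++;

	return (size_t)(line_end - pos);
}
```

For the input "ab\r\ncd" the first line has length 4, so the next line starts at 'c' rather than at the stray '\n'.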




fe6d58fb52 Christian Weisgerber

add a missing include for uint8_t and switch from <inttypes.h> to <stdint.h>

ok millert stsp


c16dde50bb Stefan Sperling

allow diff API users to atomize files separately

This is a breaking API change (not that we care about that at this point). This can avoid redundant work spent on atomizing a file multiple times. There are use cases where one particular file must be compared to other files over and over again, such as when blaming file history. The old API gave access to both versions of the file to the atomizer just in case a future atomizer implementation needs this. This can still be achieved by passing a second file via the atomizer's private data pointer.


845f35754a Neels Hofmeyr

reflect ignore-whitespace in atom hash


ca85e8bc8f Stefan Sperling

fix off-by-one in the off-by-one fix made in bdfcb086


bdfcb0869a Stefan Sperling

fix off-by-one access beyond mapped file in diff_data_atomize_text_lines_mmap()

Thread 1 received signal SIGSEGV, Segmentation fault.
0x0000013992a89eca in diff_data_atomize_text_lines_mmap (d=0x13b9b455668) \
    at /home/stsp/src/got/got/../lib/diff_atomize_text.c:134
134 if (line_end[0] == '\r'
(gdb) p pos
$1 = (const uint8_t *) 0x13be402006d ""
(gdb) p end
$2 = (const uint8_t *) 0x13be4023000 <error: Cannot access memory at \
    address 0x13be4023000>
(gdb) p end-1
$3 = (const uint8_t *) 0x13be4022fff ""
(gdb) p line_end
$4 = (const uint8_t *) 0x13be4023000 <error: Cannot access memory at \
    address 0x13be4023000>




ad5b3f8555 Neels Hofmeyr

rename diff_atom->d to diff_atom->root, because it always is

The idea was that for each diff box within the files, the atoms would have a backpointer to the current layer of diff_data (indicating the current section), but there is no actual need to update the backpointer in each atom to the current diff_data. That is why the current code always points atom->d to the root diff_data for the entire file. Clarify this by using a proper name. Constructs like atom->d->root->foo are redundant, just use atom->root->foo.
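
The effect of the rename can be sketched with heavily reduced structures. The field layout, the name member, and the helper atom_file_name are hypothetical and only illustrate the access pattern; the real structures carry much more state.

```c
#include <stddef.h>

struct diff_data;

/* Hypothetical, heavily reduced version of an atom. */
struct diff_atom {
	struct diff_data *root;	/* always the diff_data for the entire file */
	size_t pos;
};

/* Hypothetical, heavily reduced version of a diff_data layer. */
struct diff_data {
	struct diff_data *root;	/* the root layer points at itself */
	const char *name;	/* stand-in for file-level state ("foo") */
};

/* Reach file-level state directly instead of via atom->d->root->name. */
static const char *
atom_file_name(const struct diff_atom *atom)
{
	return atom->root->name;
}
```

Because every atom points at the root layer, a single dereference is enough to reach file-level state.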



2a1b94d029 Stefan Sperling

repair DEBUG build




03f497279d Stefan Sperling

return error instead of abort()


e4464189bc Stefan Sperling

rename 'debug.h' to 'diff_debug.h'



3e6cba3a54 Stefan Sperling

replace enum diff_rc errors with plain errno values