Commit Briefs

dac5c75ed0 Stefan Sperling

convert delta cache to a hash table

This approach uses more memory but is much faster. To offset the additional memory usage somewhat, the cache now stores very small deltas only. However, overall memory usage goes up. Hopefully we will find a way to reduce this later. ok op@



a19f439c4e Omar Polo

don't pass $ret to test_done on failure when it's known to be zero

Otherwise the test directory is not left in place; ok tracey@


6a88129775 Stefan Sperling

properly swap cached struct pack array elements in got_repo_cache_pack()

Avoids clobbering open files for delta base/accumulation, leaking file descriptors, and triggering errors from close(2) during got_repo_close() as we try to close the same file descriptor more than once.
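
The kind of swap this fix relies on can be sketched as follows. This is a minimal illustration, not the actual got_repo_cache_pack() code: struct cached_pack, its fields, and swap_cached_packs() are hypothetical names. The point is that exchanging two slots through a temporary keeps every descriptor owned by exactly one slot, whereas a plain assignment would overwrite one slot and leave two slots referring to the same descriptors.

```c
#include <stddef.h>

/* Hypothetical stand-in for a cached pack slot; not got's real struct. */
struct cached_pack {
	int packfd;	/* pack file descriptor */
	int basefd;	/* temp file used for delta bases */
	int accumfd;	/* temp file used for delta accumulation */
};

/*
 * Exchange two cache slots through a temporary copy. Each descriptor
 * remains owned by exactly one slot, so a later cleanup pass can close
 * every descriptor exactly once.
 */
static void
swap_cached_packs(struct cached_pack *cache, size_t i, size_t j)
{
	struct cached_pack tmp;

	if (i == j)
		return;
	tmp = cache[i];
	cache[i] = cache[j];
	cache[j] = tmp;
}
```

After the swap, a loop that closes basefd and accumfd for each slot touches every descriptor once, which is the property the close(2) errors described above were violating.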


b72706c3d1 Stefan Sperling

move creation of tempfiles outside of lib/diff.c

ok tracey


2497f032fa Stefan Sperling

tog: override SIGTERM and SIGINT handlers to avoid ncurses cleanup() handler

ok thomas


cfcf1cbc17 Stefan Sperling

reduce GOT_PACK_CACHE_SIZE to 32, otherwise it uses too many open files

found by tracey


13242195c2 Stefan Sperling

ensure that all open basefd/accumfd get closed in got_repo_close()

found by tracey


571608344a Stefan Sperling

open temporary files needed for delta application in got_repo_open()

This prepares for callers of got_repo_open() that cannot afford to open files in /tmp, such as gotwebd. In a follow-up change, we could ask such callers to pass in the required amount of open temporary files. One consequence is that got_repo_open() now requires the "cpath" pledge promise. Add the "cpath" promise to affected callers and remove it once the repository has been opened. ok tracey




ce2bf7b7c9 Stefan Sperling

fix a bug in findwixt() which caused pack files to be missing parent commits

The 'nskip' variable is supposed to reflect commits which are waiting on the queue and have the 'skip' color. Only increment 'nskip' when adding such commits to the queue. Problem observed with got send -T and a tag pointing to a deleted branch. Test to reproduce the bug written by op@.


d6a28ffe18 Omar Polo

use random seeds for murmurhash2

change the three hardcoded seeds to fresh ones generated on demand via arc4random. Suggested/fixed by and ok stsp@


17cfdba68d Omar Polo

include header


411cbec1f7 Stefan Sperling

shrink struct got_pack_meta a bit by removing the have_reused_delta flag

This flag can be expressed as m->reused_delta_offset != 0 because all deltas in valid pack files will be written at a non-zero offset. We allocate a huge number of these structs during packing, so every little bit helps.
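
As a rough sketch of the idea, assuming a trimmed-down struct (struct pack_meta and has_reused_delta() are hypothetical names; only the reused_delta_offset field and the != 0 test come from the commit message), the removed flag can be replaced by a small helper:

```c
#include <sys/types.h>

/* Hypothetical, trimmed-down stand-in for struct got_pack_meta. */
struct pack_meta {
	off_t reused_delta_offset;	/* 0 means no delta is being reused */
};

/*
 * Derive the former have_reused_delta flag from the offset. This works
 * because a delta in a valid pack file is never written at offset 0.
 */
static int
has_reused_delta(const struct pack_meta *m)
{
	return m->reused_delta_offset != 0;
}
```

Dropping a separate flag can also save padding bytes per struct, which adds up when a huge number of them are allocated.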


adb4bbb29d Stefan Sperling

reduce the amount of memory used for caching deltas during deltification

With files sorted properly for deltification we produce better deltas but end up consuming more memory and risk running into OpenBSD ulimits during packing. To compensate, reduce the threshold for the amount of delta data we store in memory, spooling more deltas into the cache file. ok op@


f8174ca59b Stefan Sperling

store a path hash instead of a verbatim path in pack meta data

This reduces memory use by gotadmin pack. The goal is to sort files which share a path next to each other for deltification. A hash of the path is good enough for this purpose and consumes less memory than a verbatim copy of the path. Git does something similar. ok op@
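
The approach can be sketched as below. This is an illustration under stated assumptions, not the code from the commit: struct meta, hash_path(), cmp_meta() and sort_for_deltification() are hypothetical names, and FNV-1a is swapped in as the hash function purely for brevity. The sketch shows that a fixed-size hash is enough to bring entries with the same path next to each other when sorting.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical meta entry: a 32-bit hash replaces the path string. */
struct meta {
	uint32_t path_hash;
	int id;		/* stand-in for the object ID */
};

/* FNV-1a over the path bytes; chosen here for illustration only. */
static uint32_t
hash_path(const char *path)
{
	uint32_t h = 2166136261u;

	while (*path != '\0') {
		h ^= (unsigned char)*path++;
		h *= 16777619u;
	}
	return h;
}

/* Order by path hash so equal paths end up adjacent; break ties by id. */
static int
cmp_meta(const void *a, const void *b)
{
	const struct meta *ma = a, *mb = b;

	if (ma->path_hash != mb->path_hash)
		return ma->path_hash < mb->path_hash ? -1 : 1;
	return (ma->id > mb->id) - (ma->id < mb->id);
}

static void
sort_for_deltification(struct meta *v, size_t n)
{
	qsort(v, n, sizeof(*v), cmp_meta);
}
```

Two different paths may occasionally collide on the same hash. For this purpose that only costs a less ideal delta candidate, not correctness.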


3e6ceea0bd Stefan Sperling

fix paths stored in pack meta data, improving file deltification

The old code was broken and stored an empty path or filenames instead of a repository-relative path, which means we did not sort files for deltification as intended. Fixing this provides much better deltas in large pack files written by gotadmin pack -a. In my test case, pack size changed from 2GB to 1.5GB. ok op@


17259bfa94 Stefan Sperling

plug a small memleak on error in got_pack_create()


e1f5d7cf67 Stefan Sperling

avoid malloc/free for duplicate check in got_pathlists_insert()

ok op@


c9b75c7bd7 Stefan Sperling

revert "Skip poll(2) if an imsgbuf has a non-empty read buffer"

imsg_read() will call recvmsg() on the file descriptor regardless of the read buffer's state, so we should ensure that data is ready. The read buffer is used by imsg_get(), not imsg_read(). We already call imsg_get() before imsg_read(), and call the latter only if imsg_get() returns zero.


2ab714554f Stefan Sperling

Skip poll(2) if an imsgbuf has a non-empty read buffer.



33fd69c2e5 Stefan Sperling

batch up tree entries in imsg instead of sending one imsg per tree entry

This speeds up loading of trees significantly. ok op@


9985f404ff Stefan Sperling

parse tree entries into an array instead of a pathlist

Avoids some extra malloc/free in a performance-critical path. ok op@