.\"
.\" Copyright (c) 2018 Stefan Sperling <stsp@openbsd.org>
.\"
.\" Permission to use, copy, modify, and distribute this software for any
.\" purpose with or without fee is hereby granted, provided that the above
.\" copyright notice and this permission notice appear in all copies.
.\"
.\" THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES
.\" WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
.\" MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR
.\" ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
.\" WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
.\" ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
.\" OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
.\"
.Dd $Mdocdate$
.Dt GIT-REPOSITORY 5
.Os
.Sh NAME
.Nm git-repository
.Nd Git repository format
.Sh DESCRIPTION
A Git repository stores a series of versioned snapshots of a file hierarchy.
Conceptually, the repository's data model is a directed acyclic graph which
contains four types of objects as nodes:
.Bl -tag -width Ds
.It Blobs
The content of tracked files is stored in objects of type
.Em blob .
.It Trees
A
.Em tree
object points to any number of such blobs, and also to other trees in
order to represent a hierarchy of files and directories.
.It Commits
A
.Em commit
object points to the root element of one tree, and thus records the
state of this entire tree as a snapshot.
Commit objects are chained together to form lines of version control history.
Most commits have just one successor commit, but commits may be succeeded by
an arbitrary number of subsequent commits so that diverging lines of version
control history, known as
.Em branches ,
can be represented.
A commit which precedes another commit is referred to as that other commit's
.Em parent commit .
A commit with multiple parents unites disparate lines of history and is
known as a
.Em merge commit .
.It Tags
A
.Em tag
object associates a user-defined label with another object, which is
typically a commit object.
Tag objects also contain a tag message, as well as author and
timestamp information.
.El
.Pp
Each object is identified by a SHA1 hash calculated over both the object's
header and the data stored in the object.
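.Pp
As an illustration, the identifier calculation can be sketched in Python.
This is a minimal sketch and not code taken from Git or Got.
It assumes the header layout described under OBJECT STORAGE (type name,
space, decimal size, NUL), and the function name
git_object_id
is hypothetical.

```python
import hashlib


def git_object_id(obj_type: str, data: bytes) -> str:
    """Return the hexadecimal SHA1 computed over an object's header and data."""
    # Header: ASCII type name, one space, decimal data size, then a NUL byte.
    header = f"{obj_type} {len(data)}".encode("ascii") + b"\x00"
    return hashlib.sha1(header + data).hexdigest()
```

For example, hashing a blob with no content this way yields the
identifier e69de29bb2d1d6434b8b29ae775ad8c2e48c5391.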
.Sh OBJECT STORAGE
Loose objects are stored as individual files beneath the directory
.Pa objects ,
spread across 256 sub-directories named after the 256 possible hexadecimal
values of the first byte of an object identifier.
The name of the loose object file corresponds to the remaining hexadecimal
byte values of the object's identifier.
.Pp
A loose object file begins with a header which specifies the type of object
as an ASCII string, followed by an ASCII space character, followed by the
object data's size encoded as an ASCII number string.
The header is terminated by a
.Sy NUL
character, and the remainder of the file contains object data.
Loose object files are compressed with
.Xr deflate 3 .
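.Pp
The loose object layout described above can be sketched as follows.
This is an illustrative Python sketch, not Git's or Got's implementation.
It assumes that Python's zlib module produces the same stream format as
deflate(3), and all function names are hypothetical.

```python
import zlib
from pathlib import PurePosixPath
from typing import Tuple


def loose_object_path(object_id: str) -> PurePosixPath:
    """Map a hexadecimal object ID to its path beneath the repository."""
    # The first byte (two hex digits) names the sub-directory,
    # the remaining hex digits name the file.
    return PurePosixPath("objects") / object_id[:2] / object_id[2:]


def encode_loose_object(obj_type: str, data: bytes) -> bytes:
    """Build the compressed content of a loose object file."""
    header = f"{obj_type} {len(data)}".encode("ascii") + b"\x00"
    return zlib.compress(header + data)


def decode_loose_object(raw: bytes) -> Tuple[str, bytes]:
    """Decompress a loose object file and split it into type and data."""
    content = zlib.decompress(raw)
    # The header ends at the first NUL byte; everything after it is data.
    header, _, data = content.partition(b"\x00")
    obj_type, _, size = header.partition(b" ")
    if int(size) != len(data):
        raise ValueError("size in header does not match object data")
    return obj_type.decode("ascii"), data
```

The size check in the decoder is a design choice of this sketch: it rejects
truncated or padded files before the data is used.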
.Pp
Multiple objects can be bundled in a
.Em pack file
for better disk space efficiency and increased run-time performance.
The pack file format introduces two additional types of objects:
.Bl -tag -width Ds
.It Offset Delta Objects
This object is represented as a delta against another object in the
same pack file.
This other object is referred to by its offset in the pack file.
.It Reference Delta Objects
This object is represented as a delta against another object in the
same pack file.
The other object is referred to by its SHA1 object identifier.
.El
.Pp
Pack files are self-contained and may not refer to loose objects or
objects stored in other pack files.
Deltified objects may refer to other deltified objects as their delta base,
forming chains of deltas.
The ultimate base of a delta chain must be an object of the same type as
the original object which is stored in deltified form.
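.Pp
Walking a delta chain back to its ultimate base can be sketched on a toy
in-memory model.
This Python sketch does not parse the real pack file format.
The PackEntry class, the kind strings, the index mapping from object ID to
offset, and the depth limit are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

DELTA_KINDS = ("ofs_delta", "ref_delta")


@dataclass
class PackEntry:
    # "commit", "tree", "blob", "tag", "ofs_delta" or "ref_delta".
    kind: str
    base_offset: Optional[int] = None  # set for offset deltas
    base_id: Optional[str] = None      # set for reference deltas


def find_delta_base(entries: Dict[int, PackEntry], index: Dict[str, int],
                    offset: int, max_depth: int = 50) -> Tuple[int, List[int]]:
    """Return the offset of the ultimate base and the deltas visited on the way."""
    chain: List[int] = []
    while entries[offset].kind in DELTA_KINDS:
        entry = entries[offset]
        chain.append(offset)
        if len(chain) > max_depth:
            raise ValueError("delta chain too long or cyclic")
        if entry.kind == "ofs_delta":
            offset = entry.base_offset
        else:
            # Pack files are self-contained: the base must be in the same pack.
            if entry.base_id not in index:
                raise KeyError("delta base is not in this pack file")
            offset = index[entry.base_id]
    return offset, chain
```

The type of the entry at the returned offset is then the type of the
deltified object, since the ultimate base must share its type.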
.Pp
Each pack file is accompanied by a corresponding
.Em pack index
file, which lists the IDs and offsets of all objects contained in the
pack file.
.Sh REFERENCES
A reference associates a name with an object ID.
A prominent use of references is providing names to branches in the
repository by pointing at commit objects which represent the current
tip commit of a branch.
Because references may point to arbitrary object IDs, their use
is not limited to branches.
.Pp
The name is a UTF-8 string with the following disallowed characters:
.Sq \ \&
(space),
\(a~ (tilde),
\(a^ (caret),
: (colon),
? (question mark),
* (asterisk),
[ (opening square bracket),
\\ (backslash).
Additionally, the name may not contain the two-character sequences
//, .. , and @{.
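.Pp
The naming rules listed above can be checked with a short Python sketch.
It covers only the characters and sequences named in this section and is
not a complete reimplementation of the checks performed by Git or Got.
The function name is hypothetical, and rejecting the empty string is an
assumption of this sketch.

```python
# Characters this section lists as disallowed in reference names.
DISALLOWED_CHARS = set(" ~^:?*[\\")
# Two-character sequences this section lists as disallowed.
DISALLOWED_SEQUENCES = ("//", "..", "@{")


def is_valid_ref_name(name: str) -> bool:
    """Return True if name passes the checks listed in this section."""
    if not name:
        # Assumption of this sketch: an empty name is never valid.
        return False
    if any(ch in DISALLOWED_CHARS for ch in name):
        return False
    return not any(seq in name for seq in DISALLOWED_SEQUENCES)
```

With these rules, refs/heads/main is accepted, while names such as
"my branch" or "main@{1}" are rejected.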
.Pp
Reference names may optionally have multiple components separated by
the / (slash) character, forming a hierarchy of reference namespaces.
Got reserves the
.Pa got/
reference namespace for internal use.
.Pp
A symbolic reference associates a name with the name of another reference.
The most prominent example is the
.Pa HEAD
reference which points at the name of the repository's default branch
reference.
.Pp
References are stored either as a plain file within the repository,
typically under the
.Pa refs/
directory, or in the
.Pa packed-refs
file which contains one reference definition per line.
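.Pp
Resolving a reference name to an object ID can be sketched in Python.
This is a simplified illustration rather than the lookup code of Git or Got.
It assumes that a symbolic reference file starts with "ref: ", that each
packed-refs line holds an object ID, a space and a name, and that lines
starting with "#" or "^" carry no reference definition.
The function names and the hop limit are hypothetical.

```python
import os
from typing import Dict, Optional


def read_packed_refs(repo_dir: str) -> Dict[str, str]:
    """Parse the packed-refs file into a mapping of name to object ID."""
    refs: Dict[str, str] = {}
    path = os.path.join(repo_dir, "packed-refs")
    if not os.path.isfile(path):
        return refs
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            # Skip blank lines, "#" comments and "^" annotation lines.
            if not line or line[0] in "#^":
                continue
            object_id, _, name = line.partition(" ")
            refs[name] = object_id
    return refs


def resolve_ref(repo_dir: str, name: str, max_depth: int = 8) -> Optional[str]:
    """Follow symbolic references until an object ID is found, or return None."""
    for _ in range(max_depth):
        path = os.path.join(repo_dir, *name.split("/"))
        if os.path.isfile(path):
            with open(path, encoding="utf-8") as f:
                value = f.read().strip()
            if value.startswith("ref: "):
                # Symbolic reference: continue with the name it points at.
                name = value[len("ref: "):]
                continue
            return value
        # No plain file exists, so fall back to packed-refs.
        return read_packed_refs(repo_dir).get(name)
    return None
```

The plain file is consulted first, so an on-disk reference overrides a
packed-refs entry of the same name.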
.Pp
Any object which is not directly or indirectly reachable via a reference
is subject to deletion by Git's garbage collector.
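.Pp
The notion of reachability can be sketched on a toy object graph.
This Python sketch is not Git's garbage collector.
The graph is modelled as a hypothetical dictionary mapping each object ID
to the IDs it points at (a commit to its tree and parents, a tree to its
blobs and sub-trees), and the function names are assumptions.

```python
from typing import Dict, Iterable, List, Set


def reachable_objects(ref_targets: Iterable[str],
                      edges: Dict[str, List[str]]) -> Set[str]:
    """Collect every object reachable from the given reference targets."""
    seen: Set[str] = set()
    stack = list(ref_targets)
    while stack:
        object_id = stack.pop()
        if object_id in seen:
            continue
        seen.add(object_id)
        # Objects without outgoing pointers (blobs) have no entry in edges.
        stack.extend(edges.get(object_id, []))
    return seen


def unreachable_objects(all_ids: Iterable[str], ref_targets: Iterable[str],
                        edges: Dict[str, List[str]]) -> Set[str]:
    """Return the objects that no reference reaches, directly or indirectly."""
    return set(all_ids) - reachable_objects(ref_targets, edges)
```

Objects in the second set are the ones a garbage collector would be free
to delete.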
.Sh FILES
.Bl -tag -width packed-refs -compact
.It Pa HEAD
A reference to the current head commit of the Git work tree.
In bare repositories, this file serves as a default reference.
.It Pa ORIG_HEAD
Reference to original head commit.
Set by some Git operations.
.It Pa FETCH_HEAD
Reference to a branch tip commit most recently fetched from another repository.
.It Pa branches/
Legacy directory used by the deprecated Cogito Git interface.
.It Pa config
Git configuration file.
See
.Xr git-config 1 .
.It Pa description
A human-readable description of the repository.
.It Pa got.conf
Configuration file for
.Xr got 1 .
See
.Xr got.conf 5 .
.It Pa hooks/
This directory contains hook scripts to run when certain events occur.
.It Pa index
The file index used by
.Xr git 1 .
This file is not used by
.Xr got 1 ,
which uses the
.Xr got-worktree 5
file index instead.
.It Pa info
Various configuration items.
.It Pa logs/
Directory where reflogs are stored.
.It Pa objects/
Loose and packed objects are stored in this directory.
.It Pa packed-refs
A file which stores references.
Corresponding on-disk references take precedence over those stored here.
.It Pa refs/
The default directory to store references in.
.El
.Pp
A typical Git repository exposes a work tree which allows the user to make
changes to versioned files and create new commits.
When a Git work tree is present, the actual repository data is stored in a
.Pa .git
subdirectory of the repository's root directory.
A Git repository without a work tree is known as a
.Dq bare
repository.
.Xr got 1
does not make use of Git's work tree and treats every repository as if it
were bare.
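.Pp
Locating the repository data for a given path can be sketched in Python.
This is a heuristic written for illustration and is not the detection
logic of Git or Got.
The function name is hypothetical, and treating a directory that contains
HEAD, objects and refs as a bare repository is an assumption of this sketch.

```python
import os
from typing import Optional


def find_repository_dir(path: str) -> Optional[str]:
    """Return the directory holding repository data for path, or None."""
    # With a work tree, the repository data lives in a .git subdirectory.
    dot_git = os.path.join(path, ".git")
    if os.path.isdir(dot_git):
        return dot_git
    # Assumed heuristic: a bare repository holds these entries directly.
    if (os.path.isfile(os.path.join(path, "HEAD"))
            and os.path.isdir(os.path.join(path, "objects"))
            and os.path.isdir(os.path.join(path, "refs"))):
        return path
    return None
```

A caller that treats every repository as bare can pass the returned
directory on to code that reads objects and references.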
.Sh SEE ALSO
.Xr got 1 ,
.Xr gotadmin 1 ,
.Xr deflate 3 ,
.Xr SHA1 3 ,
.Xr got-worktree 5 ,
.Xr got.conf 5
.Sh HISTORY
The Git repository format was initially designed by Linus Torvalds in 2005
and has since been extended by various people involved in the development
of the Git version control system.
.Sh CAVEATS
The particular set of disallowed characters in reference names is a
consequence of design choices made for the command-line interface of
.Xr git 1 .
The same characters are disallowed by Got for compatibility purposes.
Got additionally prevents users from creating reference names with
a leading - (dash) character, because this is rarely intended and
not considered useful.