On Tue, Apr 28, 2015 at 09:17:21AM -0400, Robert P. J. Day wrote:
> i'm curious if git will recognize identical underlying content from
> two different repositories depending on how that content was added and
> committed.

If it were done, it would have to be done on the same filesystem, and
either using reflinks or hardlinks. The former's safer but the latter is
good enough if you trust the object store to never be modified.

$ cd /tmp
$ git clone ~/Documents/Projects/suckless/st
$ git clone st st2
$ cd st2/.git
$ stat ./objects/66/30912ed9c7123c7b20a80b90c44810a96ad9cc
  File: ‘./objects/66/30912ed9c7123c7b20a80b90c44810a96ad9cc’
  Size: 194        Blocks: 8          IO Block: 4096   regular file
Device: 17h/23d    Inode: 1895687     Links: 2
Access: (0644/-rw-r--r--)  Uid: ( 1000/     alp)   Gid: (  100/   users)
Access: 2015-03-16 22:23:04.000000000 -0400
Modify: 2015-03-16 22:23:04.000000000 -0400
Change: 2015-04-28 10:58:59.650934072 -0400
 Birth: -
$ cd /tmp/st/.git
$ stat ./objects/66/30912ed9c7123c7b20a80b90c44810a96ad9cc
  File: ‘./objects/66/30912ed9c7123c7b20a80b90c44810a96ad9cc’
  Size: 194        Blocks: 8          IO Block: 4096   regular file
Device: 17h/23d    Inode: 1895687     Links: 2
Access: (0644/-rw-r--r--)  Uid: ( 1000/     alp)   Gid: (  100/   users)
Access: 2015-03-16 22:23:04.000000000 -0400
Modify: 2015-03-16 22:23:04.000000000 -0400
Change: 2015-04-28 10:58:59.650934072 -0400
 Birth: -

Note the identical inode number, and the link count. Another way to find
this is `find -type f -links +1`.

So it does work if you clone one repo from another on the same
filesystem. However, cloning again the same repo will not result in
deduplication. Makes sense because it'd potentially be a lot of work to
find another repo without keeping a list of recent clones, which I'm
sure some people'd balk at.

> in what circumstances would an identical tree object result in less
> work, or however you want to phrase it?

See also gitnamespaces(7). Apparently that's a way to deduplicate.
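
The inode comparison done by hand with stat earlier in this message can
be scripted. What follows is only a rough sketch: the function name
`shared_objects` is made up for illustration, and it assumes a shell
whose `[` supports `-ef` (same device and inode), which bash and dash
both do. It walks everything under the first repo's .git/objects and
prints the files that are the very same inode in the second repo.

```shell
#!/bin/sh
# Hypothetical helper (not part of git): list the files under
# .git/objects that two repositories share through hard links.
shared_objects() {
    a=$1/.git/objects
    b=$2/.git/objects
    # Enumerate every file in the first object store, loose or packed.
    ( cd "$a" && find . -type f ) |
    while IFS= read -r obj; do
        # -ef is true only when both paths name the same device and inode,
        # i.e. the two directory entries are hard links to one file.
        if [ "$a/$obj" -ef "$b/$obj" ]; then
            printf '%s\n' "${obj#./}"
        fi
    done
}
```

With the two clones from the transcript, `shared_objects /tmp/st /tmp/st2`
would print one relative path per shared file. An empty result means
nothing is hardlinked between the two object stores, for example when
they sit on different filesystems.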