How to compute the SHA of a GitHub folder (=> tree)

The problem raised in this thread has been solved.

Hi,

What didn't you understand? It's just a script: you call it with python githash.py, and from the start of the main function it seems to expect only the path of the blob/tree whose hash you want. I'm not sure it's very useful, though; git hash-object should do the job, shouldn't it?
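For the blob case, here is a minimal Python sketch of what git hash-object computes: the SHA-1 of a "blob <size>" header, a NUL byte, and the file content. The function name git_blob_sha1 is made up for this illustration and is not taken from githash.py.

```python
import hashlib


def git_blob_sha1(content: bytes) -> str:
    """Return the SHA-1 that git assigns to a blob with this content."""
    # Object header: type, a space, the size in decimal, then a NUL byte.
    header = b"blob %d\x00" % len(content)
    return hashlib.sha1(header + content).hexdigest()


if __name__ == "__main__":
    # The empty blob has a well-known hash.
    print(git_blob_sha1(b""))  # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
```

Trees use the same scheme with a "tree" header, but their body is a binary list of entries rather than file content, which is where the discrepancy discussed below comes from.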

In the end: the API may not always return the hash of an actual tree. See the Stack Overflow question and its answer: https://stackoverflow.com/a/54137728/2226755

Your code contains a problem; however, fixing it doesn't remove the discrepancy in the A2 case.

The problem with your code

The official git algorithm for computing hashes on trees drops leading zeros from the mode field. In your examples, that field contains values 100644 and 040000, and the latter is recorded by git as 40000.

Proof:

$ git cat-file tree 4ef57de8e81c8415d6da2b267872e602b1f28cfe|hexdump -C
00000000  31 30 30 36 34 34 20 2e  63 6f 76 65 72 61 67 65  |100644 .coverage|
00000010  72 63 00 44 91 70 d0 fa  eb 75 18 23 10 34 55 64  |rc.D.p...u.#.4Ud|
00000020  fd 18 11 c0 b9 fd 73 31  30 30 36 34 34 20 2e 65  |......s100644 .e|
00000030  64 69 74 6f 72 63 6f 6e  66 69 67 00 75 88 49 36  |ditorconfig.u.I6|
00000040  ea 2d 35 b5 31 af 88 6a  ca d7 47 d4 fd 9b 2a 9e  |.-5.1..j..G...*.|
00000050  31 30 30 36 34 34 20 2e  66 6c 61 6b 65 38 00 69  |100644 .flake8.i|
00000060  e8 72 e3 0d 30 f5 c7 de  32 76 d2 89 d6 ae e8 1c  |.r..0...2v......|
00000070  cf 4a f7 34 30 30 30 30  20 2e 67 69 74 68 75 62  |.J.40000 .github|
...                                                             ^^^^^
...                                                             !!!!!
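The layout visible in the dump (mode, space, name, NUL byte, then 20 raw hash bytes per entry) can be sketched in Python as follows. The names tree_entry and tree_sha1 are hypothetical, and the sketch assumes the entries are already in the order git stores them.

```python
import hashlib


def tree_entry(mode: str, name: str, sha_hex: str) -> bytes:
    """Serialize one tree entry: '<mode> <name>', a NUL byte, 20 raw SHA-1 bytes."""
    mode = mode.lstrip("0") or "0"  # git records 040000 as 40000
    return mode.encode() + b" " + name.encode() + b"\x00" + bytes.fromhex(sha_hex)


def tree_sha1(entries) -> str:
    """Hash a tree from (mode, name, sha_hex) tuples given in git's order."""
    body = b"".join(tree_entry(*entry) for entry in entries)
    return hashlib.sha1(b"tree %d\x00" % len(body) + body).hexdigest()


if __name__ == "__main__":
    # First entry of the dump above: 39 bytes, so the next entry starts at 0x27.
    first = tree_entry("100644", ".coveragerc",
                       "449170d0faeb75182310345564fd1811c0b9fd73")
    print(len(first))     # 39
    print(tree_sha1([]))  # the empty tree: 4b825dc642cb6eb9a060e54bf8d69288fbee4904
```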

But adding removal of leading zeros [1] to your Perl script still doesn’t fix the A2 case (although the computed hash changes, it is still different from the expected one):

$ cat main.sh
XX="$(perl -sane '$F[2] =~ s/(..)/\\x$1/g ; $F[0] =~ s/^0+//g ; print $F[0]." ".$F[1]."\\"."x00".$F[2]' output_a1)"
SIZE=$(echo -en "$XX" | wc -c)

echo "original: 8d66139b3acf78fa50e16383693a161c33b5e048 output:"
echo -en "tree $SIZE\x00$XX" | sha1sum

XX="$(perl -sane '$F[2] =~ s/(..)/\\x$1/g ; $F[0] =~ s/^0+//g ; print $F[0]." ".$F[1]."\\"."x00".$F[2]' output_a2)"
SIZE=$(echo -en "$XX" | wc -c)

echo "original: 4ef57de8e81c8415d6da2b267872e602b1f28cfe output:"
echo -en "tree $SIZE\x00$XX" | sha1sum

$ ./main.sh 
original: 8d66139b3acf78fa50e16383693a161c33b5e048 output:
8d66139b3acf78fa50e16383693a161c33b5e048  -
original: 4ef57de8e81c8415d6da2b267872e602b1f28cfe output:
c5c701b8114582e3bb2e353aac157a7febfcd33b  -

The problem (?) with the GitHub API

The explanation is that the A2 hash 4ef57de8e81c8415d6da2b267872e602b1f28cfe points to a commit object rather than a tree. That commit object in turn refers to the tree with hash c5c701b8114582e3bb2e353aac157a7febfcd33b, which is exactly the value computed by the fixed code:

$ git cat-file -t 4ef57de8e81c8415d6da2b267872e602b1f28cfe
commit

$ git cat-file -p 4ef57de8e81c8415d6da2b267872e602b1f28cfe
tree c5c701b8114582e3bb2e353aac157a7febfcd33b
parent 502a88b41161ec7dbff0862e3d805db397caf366
...

Had you used the tree hash rather than the commit hash for the A2 query, you wouldn’t have had any problem in the first place.
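As an illustration of that lookup, here is a small Python sketch that extracts the tree hash from the text printed by git cat-file -p for a commit. The helper name tree_of_commit is invented for this example; obtaining the text (for instance by running git yourself) is left out.

```python
def tree_of_commit(commit_text: str) -> str:
    """Return the tree hash named in the output of `git cat-file -p <commit>`."""
    for line in commit_text.splitlines():
        if not line:  # a blank line ends the header section
            break
        key, _, value = line.partition(" ")
        if key == "tree":
            return value
    raise ValueError("no tree header found; is this really a commit?")


# Header lines copied from the transcript above.
sample = """tree c5c701b8114582e3bb2e353aac157a7febfcd33b
parent 502a88b41161ec7dbff0862e3d805db397caf366
"""

if __name__ == "__main__":
    print(tree_of_commit(sample))  # c5c701b8114582e3bb2e353aac157a7febfcd33b
```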

An arguable issue with the GitHub API is that it silently resolves a commit hash to the underlying tree instead of returning an error or including in the response an indication of what happened (for example, by setting the sha field to the hash of the tree rather than the query value).
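Since the API gives no hint, one way to notice the substitution yourself is to recompute the tree hash from the returned entries and compare it with the hash you queried. The sketch below assumes a non-recursive, non-truncated response shaped like the GitHub Git Trees API output (a "tree" list whose items carry "mode", "path" and "sha"), with entries in git's order; matches_tree is a hypothetical helper name.

```python
import hashlib


def matches_tree(queried_sha: str, response: dict) -> bool:
    """True if the entries in `response` hash to `queried_sha` as a git tree."""
    body = b""
    for entry in response["tree"]:
        mode = entry["mode"].lstrip("0") or "0"  # 040000 is stored as 40000
        body += mode.encode() + b" " + entry["path"].encode() + b"\x00"
        body += bytes.fromhex(entry["sha"])
    computed = hashlib.sha1(b"tree %d\x00" % len(body) + body).hexdigest()
    return computed == queried_sha.lower()
```

If this returns False for a hash that git cat-file -t reports as a commit, the response most likely describes the commit's tree, as in the A2 case.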


Leon

  1. The quick-and-dirty fix won’t work correctly in one case: when the mode field consists only of zeros. In that case the mode field is erased completely instead of being replaced by a single zero. However, that case cannot occur in practice, since an object with such a mode value would simply be inaccessible to git.
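The edge case can be seen in a short Python sketch that compares the same substitution as the Perl one-liner with a variant that keeps one zero. Both function names are made up for this illustration.

```python
import re


def strip_zeros_naive(mode: str) -> str:
    # Same substitution as the Perl s/^0+// used in main.sh.
    return re.sub(r"^0+", "", mode)


def strip_zeros_safe(mode: str) -> str:
    # The lookahead requires one character to remain after the stripped zeros.
    return re.sub(r"^0+(?=.)", "", mode)


if __name__ == "__main__":
    print(repr(strip_zeros_naive("000000")))  # ''
    print(repr(strip_zeros_safe("000000")))   # '0'
    print(strip_zeros_safe("040000"))         # 40000
```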
