Recent comments posted to this site:
It seems git-annex needs some way to verify that all blobs match its expected format for this security to be strong. My histories have tons of huge binary blobs in them from when I tried upgrading to v6 before it was stable and got a lot of data committed raw. I guess I need to rewrite my history to contain only verified blobs?
It's really too bad git hasn't upgraded to a newer hashing function by now. They should make a plan to do that at least once a decade.
@dvicory if someone only knows the onion service address, they can do
nothing to your repository except connect to it and get rejected
due to failure to authenticate. They need the authentication data too
in order to do any of those things. That was talking about the
addresses generated by git annex peer --gen-addresses,
which include authentication data.
I've improved the wording to avoid confusion between git-annex's addresses and onion addresses.
In the security section, you say that
Anyone who learns the address of a peer can connect to that peer, download the whole history of the git repository, and any available annexed files. They can also upload new files to the peer, and even remove annexed files from the peer. So consider ways that the address of a peer might be exposed.
Do you mean the addresses from git annex peer --gen-addresses here? Say, if someone has only my onion service address, and none of the authentication data that is normally placed in .git/annex/creds/, what can they do with my git repository? I think I might be confused by the use of "address" because of onion addresses, which are not private.
Finally got around to playing with this today, and figured I'd post an example for anyone looking for this in the future. Please forgive the Tcl code if that isn't your thing.
- Adding the "cost command" to your remotes - Example here https://gist.github.com/zpeters/d288531db95aa69a495944a261d17b1c
- The actual script being referenced is really basic. It takes the name of the remote as an argument and, depending on where I am, gives that remote a "high" or "low" cost. Code here https://gist.github.com/zpeters/89ba09289c89225ae5344e65c3562e6c
- The cost script needs to know if I'm "home" or not. Here is an example of "whereami" https://gist.github.com/zpeters/cb8c2be53fd41b4d04a78fe81aaaea58
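For readers who would rather not read Tcl, here is a rough Python sketch of the same idea. It is not a port of the gists above: the remote names, the location labels, and the cost values are all made up for illustration, and the location is passed as a second argument rather than detected by a separate "whereami" script. The only thing it relies on is that git-annex prefers remotes with a lower cost.

```python
#!/usr/bin/env python3
"""Hypothetical cost script: prints a cost for a remote at a given location."""
import sys

# Assumed values for illustration; git-annex prefers remotes with lower cost.
LOW_COST = 50
HIGH_COST = 500

# Hypothetical mapping: for each location, the remotes that are cheap to reach.
CHEAP_REMOTES = {
    "home": {"homeserver"},
    "away": {"cloud"},
}


def cost_for(remote: str, location: str) -> int:
    """Return LOW_COST if the remote is cheap at this location, else HIGH_COST."""
    if remote in CHEAP_REMOTES.get(location, set()):
        return LOW_COST
    return HIGH_COST


# When run as "cost.py REMOTE LOCATION", print the cost on stdout.
if __name__ == "__main__" and len(sys.argv) == 3:
    print(cost_for(sys.argv[1], sys.argv[2]))
```

A script like this would be hooked up through the remote's cost command setting, as in the first gist; the number it prints is what ends up being used as the cost.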
@Sundar, good question.
git annex enableremote will always refuse to enable the remote if there's
a missing parameter, and prompt for the parameter. Finding the right value
is up to you. Most of the time, no additional parameters are needed, or
the parameters are fairly self-explanatory, e.g. login passwords for remote
services.
The difficulty with directory special remotes is that my /foo may not be the same as your /foo, so it can't reuse the directory= that was provided to initremote, and it's up to you to enter the right directory path.
I think this needs to come down to documentation in the repository. The
description of the remote (set by git annex describe)
is a reasonable place to put that, unless you have somewhere better.
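As a rough sketch of that suggestion, the snippet below builds a git annex describe command line that embeds the directory in the description, and pulls the path back out of a description string later. The "directory=<path>" wording inside the description is just a convention assumed here, not something git-annex interprets, and both helper names are hypothetical.

```python
import re
from typing import List, Optional


def describe_command(remote: str, directory: str) -> List[str]:
    """Build the argv for `git annex describe` with a directory hint embedded.

    The hint format is an assumed convention; git-annex treats the
    description as free text.
    """
    text = f"directory special remote; enable with directory={directory}"
    return ["git", "annex", "describe", remote, text]


def directory_hint(description: str) -> Optional[str]:
    """Return the path following 'directory=' in a description, or None."""
    match = re.search(r"directory=(\S+)", description)
    return match.group(1) if match else None
```

The resulting list could be passed to subprocess.run by whoever sets up the remote. Note that the simple pattern stops at whitespace, so a path containing spaces would need a different convention.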
Is there a way to determine the parameters that an enableremote command must use, if one does not know them? The use case is as follows:
* Dev 1 performs an initremote annexed-media directory=/path/to/media ...
* Dev 1 syncs content
* Dev 2 comes along (or Dev 1 comes along months later with a different machine) and clones the repo, but needs to know the directory=/path... in order to 'enableremote'. Is there any way to glean this information from the source repo itself?
The steps would be:
dev1$ git clone git@gitserver:myproject.git && cd myproject
dev1$ mkdir images && touch images/foo1.png
dev1$ git annex initremote annexation.dir directory=/mnt/media/myproject.annex/ encrypted=false
dev1$ git commit && git push && git annex sync --content
```
dev2$ git clone git@gitserver:myproject.git && cd myproject
dev2$ git annex whereis
shows something like ...
whereis images/foo1.png (7 copies) ...
38e67e39-7dfb-45e8-90fc-8c5d01aae0b4 -- annexation.dir
dev2$ git annex enableremote annexation.dir directory=???
```
So how does the new developer know how to define the annexation.dir? Is there any way to extract it from the repo itself? Or must this information be saved into the repo's documentation to avoid losing the reference?
Thanks!
I sometimes receive the following error when trying to upload files to glacier:
Traceback (most recent call last):
File "/home/victor/bin/glacier", line 736, in <module>
main()
File "/home/victor/bin/glacier", line 732, in main
App().main()
File "/home/victor/bin/glacier", line 718, in main
self.args.func()
File "/home/victor/bin/glacier", line 500, in archive_upload
file_obj=self.args.file, description=name)
File "/usr/lib/python2.7/site-packages/boto/glacier/vault.py", line 178, in create_archive_from_file
writer.close()
File "/usr/lib/python2.7/site-packages/boto/glacier/writer.py", line 228, in close
self.partitioner.flush()
File "/usr/lib/python2.7/site-packages/boto/glacier/writer.py", line 79, in flush
self._send_part()
File "/usr/lib/python2.7/site-packages/boto/glacier/writer.py", line 75, in _send_part
self.send_fn(part)
File "/usr/lib/python2.7/site-packages/boto/glacier/writer.py", line 222, in _upload_part
self.uploader.upload_part(self.next_part_index, part_data)
File "/usr/lib/python2.7/site-packages/boto/glacier/writer.py", line 129, in upload_part
content_range, part_data)
File "/usr/lib/python2.7/site-packages/boto/glacier/layer1.py", line 1279, in upload_part
response_headers=response_headers)
File "/usr/lib/python2.7/site-packages/boto/glacier/layer1.py", line 119, in make_request
raise UnexpectedHTTPResponseError(ok_responses, response)
boto.glacier.exceptions.UnexpectedHTTPResponseError: Expected 204, got (408, code=RequestTimeoutException, message=Request timed out.)
gpg: [stdout]: write error: Broken pipe
gpg: DBG: deflate: iobuf_write failed
gpg: [stdout]: write error: Broken pipe
gpg: filter_flush failed on close: Broken pipe
gpg: [stdout]: write error: Broken pipe
gpg: filter_flush failed on close: Broken pipe
git-annex: fd:17: hPutBuf: resource vanished (Broken pipe)
It happens only sometimes. glacier-cli can upload files without problems. The progress of the file upload is also erratic: it jumps to ~90% and then gets stuck. Can I do something to resolve this?
git-annex version: 5.20140717
build flags: Assistant Inotify DBus TDFA
key/value backends: SHA256E SHA1E SHA512E SHA224E SHA384E SHA256 SHA1 SHA512 SHA224 SHA384 WORM URL
remote types: git gcrypt bup directory rsync web glacier ddar hook external
I am using glacier-cli from git master.
Suppose I have two Samba fileservers in two different locations. Can I use git-annex in thin mode + git-annex assistant to automatically synchronize these two fileservers? Specifically, I am trying to understand if:
1) git-annex preserves owner/group IDs and POSIX ACLs;
2) it can efficiently manage a very large number of files/directories (500K+ files);
3) it can be used alongside inotify to efficiently transfer only changed files.
One last thing: in which sense is git-annex not like Unison? It seems it can be configured to have very similar functionality. Am I missing something?
Thank you all.
To clarify, this only prevents SHA1 collision attacks from causing problems with annexed files. Files checked into the git repository itself are still vulnerable to collision attacks.