Recent comments posted to this site:
@Sundar, good question. `git annex enableremote` will always refuse to
enable the remote if there's a missing parameter, and prompt for the
parameter. Finding the right value is up to you. Most of the time, no
additional parameters are needed, or the parameters are fairly
self-explanatory, e.g. login passwords for remote services.
The difficulty with directory special remotes is that my /foo may not be the same as your /foo, so it can't reuse the directory= that was provided to initremote, and it's up to you to enter the right directory path.
I think this needs to come down to documentation in the repository. The
description of the remote (set by `git annex describe`) is a reasonable
place to put that, unless you have somewhere better.
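A minimal sketch of recording the path that way. The helper name and the wording of the description are made up for illustration; the remote name and path are the ones from the question.

```shell
# Hypothetical helper: put a hint about the directory= value into the
# remote's description, which is stored in the git-annex branch.
note_remote_path() {
    remote=$1
    dir=$2
    git annex describe "$remote" \
        "directory special remote; enable with directory=$dir"
}

# Example (run inside the annex, after initremote):
#   note_remote_path annexation.dir /mnt/media/myproject.annex/
```

Once that has been synced, a fresh clone should see the hint wherever git-annex prints repository descriptions, such as in `git annex whereis` output.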
Is there a way to determine the parameters that an enableremote command must use, if one does not know it? The use case is as follows:
* Dev 1 performs an initremote annexed-media directory=/path/to/media ...
* Dev 1 syncs content
* Dev 2 comes along (or Dev 1 comes along months later with a different machine) and clones the repo, but needs to know the directory=/path... in order to 'enableremote'. Is there any way to glean this information from the source repo itself?
The steps would be:
dev1$ git clone git@gitserver:myproject.git && cd myproject
dev1$ mkdir images && touch images/foo1.png
dev1$ git annex initremote annexation.dir directory=/mnt/media/myproject.annex/ encrypted=false
dev1$ git commit && git push && git annex sync --content
```
dev2$ git clone git@gitserver:myproject.git && cd myproject
dev2$ git annex whereis
shows something like ...
whereis images/foo1.png (7 copies) ...
38e67e39-7dfb-45e8-90fc-8c5d01aae0b4 -- annexation.dir
dev2$ git annex enableremote annexation.dir directory=???
```
So how does the new developer know how to define the annexation.dir? Is there any way to extract from the repo itself? Or must this information be saved into the repo's documentation to avoid losing the reference?
Thanks!
I sometimes receive the following error when trying to upload files to glacier:
Traceback (most recent call last):
  File "/home/victor/bin/glacier", line 736, in <module>
    main()
  File "/home/victor/bin/glacier", line 732, in main
    App().main()
  File "/home/victor/bin/glacier", line 718, in main
    self.args.func()
  File "/home/victor/bin/glacier", line 500, in archive_upload
    file_obj=self.args.file, description=name)
  File "/usr/lib/python2.7/site-packages/boto/glacier/vault.py", line 178, in create_archive_from_file
    writer.close()
  File "/usr/lib/python2.7/site-packages/boto/glacier/writer.py", line 228, in close
    self.partitioner.flush()
  File "/usr/lib/python2.7/site-packages/boto/glacier/writer.py", line 79, in flush
    self._send_part()
  File "/usr/lib/python2.7/site-packages/boto/glacier/writer.py", line 75, in _send_part
    self.send_fn(part)
  File "/usr/lib/python2.7/site-packages/boto/glacier/writer.py", line 222, in _upload_part
    self.uploader.upload_part(self.next_part_index, part_data)
  File "/usr/lib/python2.7/site-packages/boto/glacier/writer.py", line 129, in upload_part
    content_range, part_data)
  File "/usr/lib/python2.7/site-packages/boto/glacier/layer1.py", line 1279, in upload_part
    response_headers=response_headers)
  File "/usr/lib/python2.7/site-packages/boto/glacier/layer1.py", line 119, in make_request
    raise UnexpectedHTTPResponseError(ok_responses, response)
boto.glacier.exceptions.UnexpectedHTTPResponseError: Expected 204, got (408, code=RequestTimeoutException, message=Request timed out.)
gpg: [stdout]: write error: Broken pipe
gpg: DBG: deflate: iobuf_write failed
gpg: [stdout]: write error: Broken pipe
gpg: filter_flush failed on close: Broken pipe
gpg: [stdout]: write error: Broken pipe
gpg: filter_flush failed on close: Broken pipe
git-annex: fd:17: hPutBuf: resource vanished (Broken pipe)
It happens only sometimes. glacier-cli can upload files without problems. The progress of the file upload is also erratic: it jumps to ~90% and then gets stuck. Can I do something to resolve this?
git-annex version: 5.20140717
build flags: Assistant Inotify DBus TDFA
key/value backends: SHA256E SHA1E SHA512E SHA224E SHA384E SHA256 SHA1 SHA512 SHA224 SHA384 WORM URL
remote types: git gcrypt bup directory rsync web glacier ddar hook external
I am using glacier-cli from git master.
Suppose I have two Samba fileservers in two different locations. Can I use git-annex in thin mode + git-annex assistant to automatically synchronize these two fileservers? Specifically, I am trying to understand whether:
1) git-annex preserves owner/group IDs and POSIX ACLs;
2) it can efficiently manage a very large number of files/directories (500K+ files);
3) it can be used alongside inotify to efficiently transfer only changed files.
One last thing: in which sense is git-annex not like Unison? It seems it can be configured to have very similar functionality. Am I missing something?
Thank you all.
(My own question, answered by Use The Source Luke method)
If you need to point a `type=S3` special remote at a service which provides only https (in my case, a local CEPH RADOS gateway) then you can do it by setting `port=443`.
This was implemented in 6fcca2f1 and the next tag was 5.20141203. On Ubuntu, that version is available in Xenial but not Trusty.
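A minimal sketch of what that could look like when creating the remote. The wrapper name, remote name, hostname and bucket are placeholders, and `encryption=none` is only an assumption to keep the example short; just `type=S3` and `port=443` come from the comment above. Credentials are assumed to be in `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY`.

```shell
# Hypothetical wrapper: create an S3 special remote that talks to an
# https-only endpoint by setting port=443.
init_https_s3_remote() {
    name=$1
    host=$2
    bucket=$3
    git annex initremote "$name" type=S3 encryption=none \
        host="$host" port=443 bucket="$bucket"
}

# Example with placeholder values:
#   init_https_s3_remote ceph rgw.example.com annex-bucket
```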
@davidriod you can do things like this with special remotes, as long as the special remotes are not encrypted.
I don't really recommend it. With such a shared special remote R and two disconnected git repos (call them A and B), some confusing situations can occur. For example, the only copies of some files may be on special remote R and git repo B. A knows about the copy in R, so git-annex is satisfied there is one copy of the file. But now, B can drop the content from R, which is allowed as the content is in B. A is then left unable to recover the content of the files at all, since they have been removed from R.
Better to connect the two repositories A and B, even if you do work in two separate branches. Then if a file ends up located only on B, A will be able to say where it is, and could even get it from B (if B was set up as a remote).
Thank you for this, I've always wanted such a GUI, and it's been a common user request!
Been using the one-liner. Despite the warning, I'm not dead yet.
There's much more to do than the one-liner.
This post offers instructions.
First simple try: slow
It was slow (estimated >600s for 189 commits).
In tmpfs: about 6 times faster
I cloned the repository into /run/user/1000/rewrite-git, which is a tmpfs mount point. (Machine has plenty of RAM.)
There I also ran `git annex init`; git-annex found its state branches.
On the second try I also ran
git checkout -t remotes/origin/synced/master
So that filter-branch would clean that, too.
There, the `filter-branch` operation finished in 90s on the first try and 149s on the second.
`.git/objects` wasn't smaller.
Practicing reduction on clone
This produced no visible benefit:
time git gc --aggressive
time git repack -a -d
Not even after cloning and retrying on the clone. Oh, but I should have done `git clone file:///path`, as the git-filter-branch man page says in its section titled "CHECKLIST FOR SHRINKING A REPOSITORY".
This (as seen on https://rtyley.github.io/bfg-repo-cleaner/ ) was efficient:
git reflog expire --expire=now --all && git gc --prune=now --aggressive
`.git/objects` shrank from 148M to 58M.
All this was on a clone of the repo in tmpfs.
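For anyone repeating this, here is a rough sketch of how the before/after sizes can be measured. The function names are made up; the expire and gc commands are the ones quoted above, and sizes are reported in KiB as printed by `du -sk`.

```shell
# Size of .git/objects in KiB for the repository at $1
# (defaults to the current directory).
objects_kb() {
    du -sk "${1:-.}/.git/objects" | cut -f1
}

# Run the expire-and-gc step in the current repository and print the
# size of .git/objects before and after.
shrink_and_report() {
    before=$(objects_kb)
    git reflog expire --expire=now --all && git gc --prune=now --aggressive
    after=$(objects_kb)
    echo "objects: ${before}K -> ${after}K"
}
```

Run `shrink_and_report` only on a throwaway clone: after it, the old objects are gone for good.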
Propagating cleaned up branches to origin
This confirmed that filter-branch did not change last tree:
git diff remotes/origin/master..master
git diff remotes/origin/synced/master synced/master
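The same check can be scripted by comparing tree hashes instead of reading diff output. This is only a sketch; `same_tree` is a made-up helper name.

```shell
# Succeeds (exit 0) when both revisions point at the same tree,
# returns 1 when the trees differ and 2 when a revision cannot be resolved.
same_tree() {
    a=$(git rev-parse --verify "$1^{tree}") || return 2
    b=$(git rev-parse --verify "$2^{tree}") || return 2
    [ "$a" = "$b" ]
}

# Example:
#   same_tree remotes/origin/master master && echo "last tree unchanged"
```

Identical tree hashes mean the rewritten branch tip has exactly the same files as the original, even though the commit hashes differ.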
This, expectedly, was refused:
git push origin master
git push origin synced/master
On origin, I checked out the hash of the current master, then on the tmpfs clone ran:
git push -f origin master
git push -f origin synced/master
Looks good.
I'm not doing the aggressive shrink now, because of the "two orders of magnitude more caution than normal filter-branch" recommended by arand.
Now what? Check that nothing precious is broken
I'm planning to do the same operation on the other repos, then:
- if everything seems right,
- if `git annex sync` works between all those fellows,
- etc,
- then I would perform the reflog expire, gc prune on some then all of them, etc.
Joey, does this seem okay? Any comment?