Recent comments posted to this site:

It would be great to have an option to include git history in the export, such that a special remote could be used both to rebuild a repository and to view the contents.
Comment by xloem Thu Aug 10 00:25:27 2017

It would be awesome to be able to move (or copy) between remotes, e.g. git annex copy --from=remote-a --to=remote-b

Use case: I have a repo of ~1TB, but only ~30GB free disk space on my laptop. Currently I have to get and move the files manually, almost one by one, to transfer them from one remote to the other.
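Until such a --from/--to combination exists, the manual workaround can at least be scripted. A minimal sketch (the remote names and file list are illustrative; it assumes git-annex is installed and move --to is used so each file's local copy is dropped before the next transfer, keeping disk usage low):

```python
# Sketch: build the per-file get/move command pairs for the manual
# workaround of shuttling content through the local repo one file at a
# time. Remote names "remote-a"/"remote-b" are placeholders.

def build_transfer_cmds(files, src="remote-a", dst="remote-b"):
    """Return git-annex invocations that fetch each file from src and
    immediately move it to dst, so local disk holds one file at a time."""
    cmds = []
    for f in files:
        cmds.append(["git", "annex", "get", "--from", src, f])
        cmds.append(["git", "annex", "move", "--to", dst, f])
    return cmds

cmds = build_transfer_cmds(["a.bin", "b.bin"])
```

Each pair could then be run with subprocess.run, checking the exit status before continuing.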

Comment by lykos Wed Jul 26 19:08:07 2017

I couldn't get this to run and had a lot of performance issues with rclone on Google Drive, so I adapted the rclone wrapper to gdrive. It's running fine so far, so I thought I'd share it:

https://github.com/Lykos153/git-annex-remote-gdrive

Comment by lykos Fri Jul 21 23:56:12 2017

Also, it's not safe to merge two separate git repositories that have been tuned differently (or one tuned and the other not). git-annex will prevent merging their git-annex branches together, but it cannot prevent git merge remote/master from merging the two master branches, and the result will be ugly at best (git annex fix can clean up the mess somewhat).

My main use repo is 1.7TB large and holds 172,000+ annexed files. Variations in filename case have led to a number of file duplications that are still unresolved (I have scripts that can be used to flatten filename case and fix references in other files, but it will probably mean handling some corner cases, and there are more urgent matters for now).

For these reasons I'm highly interested in the lowercase option, and I'm probably not the only one in a similar situation.

Does migrating to a tuned repository mean unannexing everything and reimporting into a newly created annex, replica by replica, then syncing again? That's a high price in some setups. Or is there a way to somehow git annex sync between a newly created repo and an old, untuned one?

Comment by https://launchpad.net/~stephane-gourichon-lpad Tue Jul 18 11:49:10 2017

In some cases, if the remote supports versioning, it might be cool to be able to export all versions (from the previously exported point, assuming linear progression). I had a chat with the https://quiltdata.com/ folks, a project I just got to know about:
  1. They claim/hope to provide infinite storage for public datasets.
  2. They support a "File" model, so a dataset could simply contain files. If we could (ab)use that, it sounds like a lovely free ride.
  3. They support versioning. If we could export all the versions, that would be super lovely.

It might also help to establish interoperability between the tools.

Comment by yarikoptic Fri Jul 14 20:10:42 2017
  • TRANSFEREXPORT STORE|RETRIEVE Key File Name -- note that File could also contain spaces etc. (not only the Name), so it should be encoded somehow?
  • old external special remote programs ... need to handle an ERROR response -- why not just bump the protocol VERSION to e.g. 2, so those which implement this would reply with the new version number?
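One conceivable encoding for the ambiguous File field is percent-encoding, sketched below. This is purely an illustration of the problem and one possible fix; the actual protocol may well settle on a different scheme:

```python
# Sketch: percent-encode the File and Name fields of a TRANSFEREXPORT
# message so a filename containing spaces cannot be confused with the
# field that follows it. The encoding choice is an assumption made for
# illustration, not what git-annex actually specifies.
from urllib.parse import quote, unquote

def encode_transferexport(direction, key, file_path, name):
    return "TRANSFEREXPORT %s %s %s %s" % (
        direction, key, quote(file_path, safe=""), quote(name, safe=""))

def decode_transferexport(line):
    _, direction, key, file_enc, name_enc = line.split(" ")
    return direction, key, unquote(file_enc), unquote(name_enc)

line = encode_transferexport("STORE", "SHA256--abc", "/tmp/my file", "dir/my file")
```

Because the encoded fields contain no literal spaces, a plain split on spaces recovers all four fields unambiguously.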
Comment by yarikoptic Wed Jul 12 22:09:54 2017

I also wonder if SETURLPRESENT Key Url could also be extended to SETURLPRESENT Key Url Remote, i.e. so that a custom remote could register a URL with the Web remote. In many cases I expect a "custom uploader/exporter", but a public URL then becomes available, so requiring the custom external remote to fetch it would be a bit of overkill.

N.B. I was already burnt once, on a large scale, by our custom remote truthfully replying to CLAIMURL for public URLs (since it can handle them if needed), thus absorbing them into itself instead of relaying responsibility to the 'Web' remote. I had to traverse dozens of datasets and duplicate the urls from the 'datalad' remote to the 'Web' remote.

Comment by yarikoptic Wed Jul 12 22:04:38 2017

DAV = "Distributed Authoring and Versioning", but versioning was forgotten about in the original RFC. Only some servers/clients implement the DeltaV spec (RFC 3253), which came later to fill that gap. But in principle, any DeltaV-compliant WebDAV special remote could then be used for "export" while retaining access to all the versions. References:
  • WebDAV and Autoversioning (Version Control with Subversion)
  • RFC 3253

I got interested when I saw that box.com is supported through WebDAV, but I'm not sure whether DeltaV is supported at all, and apparently the number of versions stored per file depends on the type of account (no versions for a free personal one): https://community.box.com/t5/How-to-Guides-for-Managing/How-To-Track-Your-Files-and-File-Versions-Version-History/ta-p/329

Comment by yarikoptic Wed Jul 12 21:54:49 2017

That would almost work without any smarts on the git-annex side. When it tells the special remote to REMOVEEXPORT, the special remote could remove the file from the HEAD equivalent but retain the content in its versioned snapshots, and keep the url to that registered. But that doesn't actually work, because the url is registered for that special remote, not the web special remote. Once git-annex thinks the file has been removed from the special remote, it will never try to use the url registered for that special remote.

So, to support versioning-capable special remotes, there would need to be an additional response to REMOVEEXPORT that says "I removed it from HEAD, but I still have a copy in this url, which can be accessed using the web special remote".
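Such a reply might look something like the sketch below; the REMOVEEXPORT-KEPT keyword and its shape are invented here purely to illustrate the idea, and any real extension would define its own message:

```python
# Sketch: a versioning-capable special remote answering REMOVEEXPORT.
# Both response keywords below are hypothetical, made up to illustrate
# "removed from HEAD, but a copy survives at this url, retrievable via
# the web special remote".
def handle_removeexport(key, versioned_url=None):
    if versioned_url is not None:
        # Content remains reachable in a versioned snapshot.
        return "REMOVEEXPORT-KEPT %s %s" % (key, versioned_url)
    return "REMOVEEXPORT-SUCCESS %s" % key

kept = handle_removeexport("SHA256--abc", "https://host/versions/42/f")
gone = handle_removeexport("SHA256--abc")
```

On receiving the KEPT variant, git-annex could register the url under the web special remote rather than the exporting remote, sidestepping the problem described above.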

Comment by joey Wed Jul 12 18:09:00 2017
We provide quite an up-to-date standalone backport build of git-annex (package name git-annex-standalone) through NeuroDebian for all Debian/Ubuntu releases, so you might want to enable the NeuroDebian repository (apt-get install neurodebian on a recent Debian/Ubuntu, or follow the NeuroDebian website for instructions).
Comment by yarikoptic Wed Jul 12 17:57:21 2017