Recent comments posted to this site:
It would be awesome to be able to move (or copy) directly between remotes,
e.g. git annex copy --from=remote-a --to=remote-b
Use case: I have a repo of ~1TB, but only ~30GB of free disk space on my laptop. Currently I have to manually get and move the files almost one by one to transfer them from one remote to the other.
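Until a direct remote-to-remote copy exists, the one-by-one workaround can be sketched roughly like this (remote-a and remote-b are placeholder names; this sketch also assumes filenames without whitespace):

```shell
# Rough sketch of the manual workaround: stream content through the
# laptop one file at a time, since local disk can't hold the full repo.
for f in $(git annex find --in=remote-a --not --in=remote-b); do
    git annex get "$f" --from=remote-a   # fetch to local disk
    git annex move "$f" --to=remote-b    # upload, then drop local copy
done
```

Since move drops the local copy after each transfer, disk usage stays bounded by the largest single file rather than the whole repo.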
I couldn't get this to run, and I had a lot of performance issues with rclone on Google Drive, so I adapted the rclone wrapper to gdrive. It's running fine so far, so I thought I'd share it:
Also, it's not safe to merge two separate git repositories that have been tuned differently (or one tuned and the other not). git-annex will prevent merging their git-annex branches together, but it cannot prevent git merge remote/master from merging the two branches, and the result will be ugly at best (git annex fix can clean up the mess somewhat).
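If such a merge has already happened, the cleanup mentioned above looks roughly like this (a sketch; git annex fix repairs the symlinks, but it cannot reconcile the tuning difference itself):

```shell
# After an accidental 'git merge remote/master' across tuning
# differences, the annex symlinks may point at wrong object paths.
git annex fix   # rewrites broken annex symlinks in the work tree
```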
My main repo is 1.7TB and holds 172,000+ annexed files. Variations in filename case have led to a number of file duplications that are still not resolved (I have some scripts that can be used to flatten filename case and fix references in other files, but it will probably mean handling some corner cases, and there are more urgent matters for now).
For these reasons I'm highly interested in the lowercase tuning option, and I'm probably not the only one in this situation.
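As an illustration of the case-duplication problem (a sketch, not the actual scripts mentioned above), paths that collide once lowercased can be listed like this:

```shell
# List annexed paths that become duplicates when lowercased.
# Sketch only; a real cleanup also has to merge/rename the duplicates.
git annex find | tr '[:upper:]' '[:lower:]' | sort | uniq -d
```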
Does migrating to a tuned repository mean unannexing everything and reimporting it into a newly created annex, replica by replica, then syncing again? That's a high price in some setups. Or is there a way to somehow git annex sync between a newly created repo and an old, untuned one?
In some cases, if the remote supports versioning, it might be cool to be able to export all versions (from the previously exported point, assuming linear progression). I had a chat with the https://quiltdata.com/ folks, a project which I just got to know about:

1. They claim/hope to provide infinite storage for public datasets.
2. They support a "File" model, so a dataset could simply contain files. If we could (ab)use that, it sounds like a lovely free ride.
3. They support versioning. If we could export all the versions, that would be super lovely.
It might also help to establish interoperability between the tools.
TRANSFEREXPORT STORE|RETRIEVE Key File Name
-- note that File could also contain spaces etc. (not only the Name), so it should be encoded somehow?

"old external special remote programs ... need to handle an ERROR response"
-- why not just bump the protocol VERSION to e.g. 2, so remotes which implement this would reply with the new version number?
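On the wire, the version bump suggested above could look like this (a hypothetical sketch; VERSION 2 and this opt-in semantics are not part of the current protocol):

```
remote    -> git-annex:  VERSION 2                          (hypothetical: remote opts in to export support at startup)
git-annex -> remote:     TRANSFEREXPORT STORE Key File Name
remote    -> git-annex:  TRANSFER-SUCCESS STORE Key         (existing success reply)
```

Old remotes would keep replying VERSION 1 and never be sent the new requests, avoiding the need to handle an ERROR response.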
I also wonder if SETURLPRESENT Key Url could be extended to SETURLPRESENT Key Url Remote, i.e. so that a custom remote could register a URL with the web remote? In many cases I expect a "custom uploader/exporter", but with a public URL then becoming available, so demanding a custom external remote just to fetch it would be a bit of an overkill.
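The extension proposed above might look like this (the three-argument form is purely hypothetical; the current protocol's SETURLPRESENT takes only Key and Url):

```
SETURLPRESENT Key Url        # current form: url is recorded for this remote
SETURLPRESENT Key Url web    # proposed form: url is recorded for the web remote,
                             # so any clone can fetch it without the custom remote
```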
N.B. I was already burnt once on a large scale by our custom remote truthfully replying to CLAIMURL for public URLs (since it can handle them if needed), thus absorbing them into itself instead of relaying responsibility to the web remote. I had to traverse dozens of datasets and duplicate the urls from the 'datalad' remote to the web remote.
DAV stands for "Distributed Authoring and Versioning", but versioning was forgotten in the original RFC. Only some servers/clients implement the DeltaV spec (RFC 3253), which came later to fill that gap. But in principle, any DeltaV-compliant WebDAV special remote could then be used for "export" while retaining access to all the versions. References:
- WebDAV and Autoversioning
- Version Control with Subversion
- RFC 3253
I got interested when I saw that box.com is supported through WebDAV, but I'm not sure whether DeltaV is supported at all, and apparently the number of versions stored per file depends on the type of account (no versions for a free personal one): https://community.box.com/t5/How-to-Guides-for-Managing/How-To-Track-Your-Files-and-File-Versions-Version-History/ta-p/329
That would almost work without any smarts on the git-annex side. When it tells the special remote to REMOVEEXPORT, the special remote could remove the file from the HEAD equivalent but retain the content in its versioned snapshots, and keep the url to that registered. But that doesn't actually work, because the url is registered for that special remote, not the web special remote. Once git-annex thinks the file has been removed from the special remote, it will never try to use the url registered for that special remote.
So, to support versioning-capable special remotes, there would need to be
an additional response to REMOVEEXPORT
that says "I removed it from HEAD,
but I still have a copy at this url, which can be accessed using
the web special remote".
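Such an exchange could be sketched like this (the REMOVEEXPORT-KEPT reply is purely hypothetical and not part of any existing protocol):

```
git-annex -> remote:  REMOVEEXPORT Key Name
remote    -> git-annex:  REMOVEEXPORT-KEPT Key Url   (hypothetical: removed from the
                                                      HEAD equivalent, but a versioned
                                                      copy remains at Url, fetchable
                                                      via the web special remote)
```

git-annex would then record Url against the web special remote rather than this one, so the content stays retrievable even after the export no longer lists the file.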
apt-get install neurodebian
(on a recent Debian/Ubuntu, or follow the NeuroDebian website for instructions).