Recent comments posted to this site:
scheme:<arbitrary json>, but I am not sure whether this might become an issue later. I could also encode the data with base64 or something similar, in which case any size limitations would still be relevant. The JSON variant, though, has the added benefit of being much more readable in whereis output.
When you run into weird 'no space left on device' errors although there clearly is enough room on your disk while running git annex repair, it probably means that your /tmp is too small (you can verify with watch -n0 df -h or just df -h). You can instruct git annex repair to use a different directory for intermediate storage via the TMPDIR environment variable:
mkdir /path/to/dir/with/enough/space
TMPDIR=/path/to/dir/with/enough/space git annex repair
# The new directory is now used, and the 'no space left on device' error should disappear
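Before rerunning the repair, it can help to confirm that the replacement directory really has more room than /tmp. A minimal sketch, assuming a POSIX shell with df and awk; the helper name free_kb is made up for illustration and is not part of git-annex:

```shell
# free_kb DIR: print the free space of the filesystem holding DIR, in KiB.
# Hypothetical helper for illustration; not part of git-annex.
free_kb() {
    # df -P prints a header line plus one data line; -k reports 1024-byte
    # blocks, and field 4 is the available space.
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# Compare /tmp with whatever TMPDIR currently points at (falls back to /tmp).
for dir in /tmp "${TMPDIR:-/tmp}"; do
    echo "$dir: $(free_kb "$dir") KiB free"
done
```

Run it once with and once without TMPDIR set to see whether the new location actually offers more space.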
Might be worth adding a note about TMPDIR to the git annex repair manpage, @joey?
Note that while bare repos can have worktrees in git, the combination is not really supported in git-annex; as a result, git annex add doesn't work ("You cannot run this command in a bare repository.") and git annex fix would change all object links from two-character (../.git/annex/objects/KP/4p/SHA256...) to three-character (../.git/annex/objects/7e3/613/SHA256...) names.
I see there is a --list option to list benchmarks, but it just "errored out" with:
❯ git annex benchmark --list
Missing: COMMAND
Usage: git-annex COMMAND
git-annex - benchmarking
... here comes long list of all commands ...
of which that short Usage: help seems to be wrong (I did give the command). But I wonder what the invocation of annex benchmark would be if I were to run a collection of benchmarks to compare different filesystems, like I did in the past in https://www.datalad.org/test_fs_analysis.html?
FWIW git-annex version 10.20220724 (same with 10.20221003)
Joey, you said:
.git/annex/othertmp has to be on the same filesystem as the work tree and git repository.
Is that for atomic/quick renames, which are important for adjusted branch unlocked mode? Asking because in datalad we have that "reckless" mode where we symlink an entire .git/annex from the original clone, e.g. to have a quick throwaway clone to access the data, possibly without even modifying the state of any annexed key.
Or how else are we stepping on the shovel here?
I've audited the code and the only place I could find where it did not work to have othertmp on a different filesystem is in the bittorrent special remote when it downloads a torrent file. But that also failed when .git/annex/tmp was on a different filesystem! (Since it was moving between the two directories.) I've fixed that.
It's still best to keep things on the same filesystem, because cross-filesystem moves can be expensive, and git-annex sometimes falls back to less ideal behavior in other ways too when operating across filesystems. Also, of course, you avoid being the one who gets to find and report breakage like the above.
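To see whether a given setup keeps everything on one filesystem, the device IDs of the paths can be compared. This is only a rough check, sketched under the assumption that GNU stat is available; same_fs is a hypothetical helper, not a git-annex command:

```shell
# same_fs A B: succeed if paths A and B are on the same filesystem.
# Hypothetical helper; assumes GNU stat (on BSD/macOS use: stat -L -f %d).
same_fs() {
    # %d is the ID of the device holding the file; -L follows symlinks,
    # so a symlinked .git/annex is judged by where it really lives.
    dev_a=$(stat -L -c %d -- "$1" 2>/dev/null) || return 1
    dev_b=$(stat -L -c %d -- "$2" 2>/dev/null) || return 1
    [ "$dev_a" = "$dev_b" ]
}

# Run from the top of a repository's work tree.
if same_fs . .git/annex/othertmp; then
    echo "same filesystem"
else
    echo "different filesystems (or a path is missing)"
fi
```

The same call works for .git/annex/tmp or any other pair of directories.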
That --list turns out to come from the criterion library, and you can actually use it but only if you provide the mandatory "-- command" parameters too. For example:
git-annex benchmark --list -- version
version
That's not useful at all, since all it can list is the same benchmark you provide on input. But this is a consequence of supporting other criterion options like --iters and --csv that can be useful.
git-annex benchmark does not use a canned set of benchmarks; it benchmarks a git-annex subcommand or subcommands that you specify.
After creating git-annex-backend-XFOO, how should I use this file? Where should I put this file?
Put it somewhere in your PATH. Make sure the script is executable.
Just out of curiosity, what hashing scheme does your custom backend implement?
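Putting the script in PATH and making it executable could look roughly like this. It is only a sketch: install_backend is a made-up helper, and ~/.local/bin is an assumed directory that may or may not be in your PATH:

```shell
# install_backend SCRIPT DIR: copy SCRIPT into DIR and make it executable.
# Hypothetical helper for illustration; DIR must be listed in PATH for
# git-annex to find the program.
install_backend() {
    mkdir -p "$2" &&
    cp "$1" "$2/" &&
    chmod +x "$2/${1##*/}"    # ${1##*/} is the file name without its directory
}

# Example (assumes ~/.local/bin is in PATH):
#   install_backend ./git-annex-backend-XFOO "$HOME/.local/bin"
#   command -v git-annex-backend-XFOO   # prints the path if it is found
```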
@matthias.risze length is not an issue. You should avoid characters that are not usually in URLs, particularly whitespace and newlines.
It seems to me though that your special remote would perhaps be better served by using the SETSTATE and GETSTATE commands (see external special remote protocol).
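If the JSON-in-URL variant is kept anyway, one way to follow the advice about whitespace is to percent-encode only the whitespace, which leaves the rest of the JSON readable. A minimal sketch, assuming a POSIX shell with awk; url_safe is a hypothetical helper and not part of git-annex or the special remote protocol:

```shell
# url_safe TEXT: percent-encode whitespace (and literal '%') in TEXT so the
# result can be embedded in a URL while the JSON stays readable.
# Hypothetical helper; not part of git-annex or the special remote protocol.
url_safe() {
    printf '%s' "$1" | awk '
        {
            gsub(/%/, "%25")    # escape existing percent signs first
            gsub(/ /, "%20")
            gsub(/\t/, "%09")
            gsub(/\r/, "%0D")
            # awk splits input on newlines, so rejoin records with %0A
            printf "%s%s", (NR > 1 ? "%0A" : ""), $0
        }
        END { print "" }'
}

url_safe '{"bucket": "my data", "id": 7}'
# prints: {"bucket":%20"my%20data",%20"id":%207}
```

Writing the JSON compactly in the first place (no spaces after ':' and ',') keeps the encoded form shorter still.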