Recent comments posted to this site:
@john, the difference is that while addurl can make up a filename to use if you do not provide one, rmurl needs you to specify a filename.
So, yes, "git annex rmurl --file=whatever url" would be more consistent, but it requires typing more by making something that is not actually optional into an option. And "git annex addurl file url" would make the command more consistent with rmurl, but harder to use.
Consistency is not everything.
(Also, the rmurl batch interface would then be less consistent with its command-line interface.)
- git checkout failed,
fatal: cannot create directory at
'doc/bugs/Assertion96cnt6040sizeof4095nl95value95type95LC95TIME4147sizeof4095nl95value95type95LC95TIME91093414139failed.': Filename too long
fixed by updating longpaths in gitconfig,
stack setup ok
stack install failed,
regex-tdfa-1.2.3.1: copy/register Progress 58/168 removeDirectoryRecursive:removeContentsRecursive:removePathRecursive:removeContentsRecursive:removePathRecursive:removeContentsRecursive:removePathRecursive:removeContentsRecursive:removePathRecursive:removeContentsRecursive:RemoveDirectory
Will I have to try the Make approach on Windows?
Because "git add foo" does not work in direct mode.
This is really not the place to be having a conversation about this. If you want something changed in git-annex, open a bug report or todo item.
Something like
git config remote.origin.annex-ignore true
could be useful information on this page. It took me 1.5 hours to figure this out (while coding)...
Built git-annex with profiling, using stack build --profile
(For reproducibility, running git-annex in a clone of the git-annex repo https://github.com/RichiH/conference_proceedings with rev 2797a49023fc24aff6fcaec55421572e1eddcfa2 checked out. It has 9496 annexed objects.)
Profiling git-annex find +RTS -p:
total time = 3.53 secs (3530 ticks @ 1000 us, 1 processor)
total alloc = 3,772,700,720 bytes (excludes profiling overheads)
COST CENTRE             MODULE                  %time  %alloc
spanList                Data.List.Utils          32.6    37.7
startswith              Data.List.Utils          14.3     8.1
md5                     Data.Hash.MD5            12.4    18.2
join                    Data.List.Utils           6.9    13.7
catchIO                 Utility.Exception         5.9     6.0
catches                 Control.Monad.Catch       5.0     2.8
inAnnex'.checkindirect  Annex.Content             4.6     1.8
readish                 Utility.PartialPrelude    3.0     1.4
isAnnexLink             Annex.Link                2.6     4.0
split                   Data.List.Utils           1.5     0.8
keyPath                 Annex.Locations           1.2     1.7
This is interesting!
Fully 40% of CPU time and allocations are in list (really String) processing, and the details of the profiling report show that spanList and startswith and join are all coming from calls to replace in keyFile and fileKey.
Both functions nest several calls to replace, so perhaps that could be unwound into a single pass and/or a ByteString used to do it more efficiently.
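To make the "single pass" idea concrete, here is a rough sketch in Python (not git-annex's Haskell code). The escape table and the function names are assumptions made up for illustration; the point is only that a chain of whole-string replacements can be collapsed into one traversal that maps each character once, and that the reverse direction can be done in one traversal too.

```python
# Hypothetical escape table, for illustration only (an assumption, not
# copied from keyFile). Each special character maps to its escaped form.
ESCAPES = {"&": "&a", "%": "&s", ":": "&c", "/": "%"}
# Two-character escape sequences and what they decode to.
UNESCAPES = {"&a": "&", "&s": "%", "&c": ":"}


def escape_nested(s: str) -> str:
    """Chain of replacements: the whole string is traversed once per rule."""
    s = s.replace("&", "&a")
    s = s.replace("%", "&s")
    s = s.replace(":", "&c")
    s = s.replace("/", "%")
    return s


def escape_single_pass(s: str) -> str:
    """Visit each character once, emitting its escape if it has one."""
    return "".join(ESCAPES.get(c, c) for c in s)


def unescape_single_pass(s: str) -> str:
    """Reverse of escape_single_pass, also in one left-to-right traversal."""
    out = []
    i = 0
    while i < len(s):
        pair = s[i:i + 2]
        if pair in UNESCAPES:
            # A two-character escape sequence such as "&a".
            out.append(UNESCAPES[pair])
            i += 2
        elif s[i] == "%":
            # A bare "%" stands for "/".
            out.append("/")
            i += 1
        else:
            out.append(s[i])
            i += 1
    return "".join(out)
```

With this table the two escape functions agree because no rule in the chain produces a character that a later rule rewrites. If the real rules did interact, the single-pass table would have to be derived more carefully.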
12% of run time is spent calculating the md5 hashes for the hash directories for .git/annex/objects. Data.Hash.MD5 is from MissingH, and it is probably a quite unoptimised version. Switching to the version in cryptonite would probably speed it up a lot.
Normally, unlocking a file requires a copy to be made of its content, so that its original content is preserved
Does that imply that, on v7 on a file system that does not support hard links, such as FAT32, git annex adjust --unlock would effectively be creating a duplicate of all files via cp (which is incredibly costly time-wise, especially for big repos and huge files) and would effectively double the size the repository occupies?
Yes. If you don't want the copy, set annex.thin as documented on the man page.
In using git-annex in the past, I've always found it counterintuitive that rmurl uses the following form to remove a URL from a file:
While, in contrast, addurl uses a flag to designate the file whose list of URLs the URL should be added to:
It would make sense (at least to me) to make the syntax for these more congruous so that both commands use either two positional arguments or one positional argument and one keyword argument / flag.