Recent comments posted to this site:

i have had many problems trying this on an NTFS filesystem. the idea was to share files with a friend using a Mac (we're desperate) and to have a partial checkout that only showed the files that were present.

first, git annex upgrade --version=7 doesn't work - i don't know when or if git-annex-upgrade ever supported that option.

then git annex sync --content some_file some_directory --no-push --no-pull doesn't work either: this will tell you that some_file is not a remote, because a remote is the argument git annex sync expects. I tried the -C (--content-of) option, but it doesn't work on missing files:

git-annex: /media/anarcat/red-rhl/video/tv/directory/missing-file.mkv not found

note that this is the local repository path, not the remote. missing-file.mkv is present on the remote, but is totally missing locally. I have no idea how I can fetch that file, even in unlocked mode, it's really strange... -- anarcat

Comment by anarcat Fri Mar 15 14:03:07 2019

For anyone dealing with files with spaces, try this:

git annex find --include '*' --format='${escaped_file} ${escaped_key}\n' | \
    sort -k2 | uniq --all-repeated=separate -f1 | \
    sed 's/ [^ ]*$//'

Using escaped_file escapes the filename, which will avoid whitespace so the rest of the pipe commands work correctly. You'll need to deal with the files being escaped in the final output, but you'll see them correctly. This worked for me.

Comment by chris Fri Mar 15 14:03:07 2019
As the key never contains spaces, it is better to have the key first. Then the filename is everything after the key (plus separator) up to the newline.
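
A minimal sketch of that idea, assuming input lines of the form "KEY FILENAME" and filenames that contain no newlines. dupes_by_key is a hypothetical helper name, not a git-annex command:

```shell
# Hypothetical helper: read "KEY FILENAME" lines on stdin and print the
# filenames that share a key, with a blank line between groups.
dupes_by_key() {
    LC_ALL=C sort -k1,1 | awk '
    function flush(   i) {
        # Only groups with more than one file are duplicates.
        if (n > 1) {
            if (printed) print ""
            for (i = 1; i <= n; i++) print names[i]
            printed = 1
        }
    }
    {
        key = $1
        # The filename is everything after the key and one separator.
        name = substr($0, length(key) + 2)
        if (key != prev) { flush(); prev = key; n = 0 }
        names[++n] = name
    }
    END { flush() }'
}

# Possible usage inside a repository:
#   git annex find --include '*' --format='${key} ${file}\n' | dupes_by_key
```

With the key first there is no need to escape the filename, since spaces after the first separator are simply part of the name.
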
Comment by CandyAngel Fri Mar 15 14:03:07 2019

git-annex looks at the file's stat() and only if the device id is the same

They are indeed not the same across subvolumes of the same BTRFS file system

$> time cp --reflink=auto home/yoh/reprotraining.ova scrap/tmp
cp --reflink=auto home/yoh/reprotraining.ova scrap/tmp  0.00s user 0.00s system 92% cpu 0.004 total

$> stat home/yoh/reprotraining.ova scrap/tmp/reprotraining.ova
  File: home/yoh/reprotraining.ova
  Size: 5081213952  Blocks: 9924248    IO Block: 4096   regular file
Device: 2fh/47d Inode: 61771704    Links: 1
Access: (0600/-rw-------)  Uid: (47521/     yoh)   Gid: (47522/     yoh)
Access: 2018-06-14 19:23:25.000000000 -0400
Modify: 2018-06-11 15:35:57.000000000 -0400
Change: 2018-06-14 19:23:25.891351983 -0400
 Birth: -
  File: scrap/tmp/reprotraining.ova
  Size: 5081213952  Blocks: 9924248    IO Block: 4096   regular file
Device: 30h/48d Inode: 190040764   Links: 1
Access: (0600/-rw-------)  Uid: (47521/     yoh)   Gid: (47522/     yoh)
Access: 2019-03-06 10:38:02.610657786 -0500
Modify: 2019-03-06 10:38:02.610657786 -0500
Change: 2019-03-06 10:38:02.610657786 -0500
 Birth: -
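
The same comparison can be scripted. A minimal sketch, assuming GNU stat (the -c option); same_device is a hypothetical helper name and only mimics the device id check described above:

```shell
# Hypothetical helper: succeed only if both paths report the same
# device id (st_dev), printed by GNU stat as a decimal number via %d.
same_device() {
    [ "$(stat -c %d -- "$1")" = "$(stat -c %d -- "$2")" ]
}

# Possible usage:
#   same_device home/yoh/reprotraining.ova scrap/tmp && echo "same device"
```
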

cp seems to just attempt a cheap clone

/* Perform the O(1) btrfs clone operation, if possible.
   Upon success, return 0.  Otherwise, return -1 and set errno.  */
static inline int
clone_file (int dest_fd, int src_fd)
{
#ifdef FICLONE
  return ioctl (dest_fd, FICLONE, src_fd);
#else
  (void) dest_fd;
  (void) src_fd;
  errno = ENOTSUP;
  return -1;
#endif
}

and if that one fails, assumes that full copy is required:

  /* --attributes-only overrides --reflink.  */
  if (data_copy_required && x->reflink_mode)
    {
      bool clone_ok = clone_file (dest_desc, source_desc) == 0;
      if (clone_ok || x->reflink_mode == REFLINK_ALWAYS)
        {
          if (!clone_ok)
            {
              error (0, errno, _("failed to clone %s from %s"),
                     quoteaf_n (0, dst_name), quoteaf_n (1, src_name));
              return_val = false;
              goto close_src_and_dst_desc;
            }
          data_copy_required = false;
        }
    }
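
In shell terms, that fallback is roughly what the following sketch does by hand. It assumes GNU cp; copy_prefer_clone is a hypothetical name, and this only approximates the behaviour of --reflink=auto rather than reproducing it:

```shell
# Hypothetical helper: try an O(1) clone first and fall back to a
# full data copy if the clone is not possible.
copy_prefer_clone() {
    cp --reflink=always -- "$1" "$2" 2>/dev/null || cp -- "$1" "$2"
}

# Possible usage:
#   copy_prefer_clone home/yoh/reprotraining.ova scrap/tmp/reprotraining.ova
```
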

BTW, why use rsync instead of a regular cp on a local filesystem when the copy is across devices?

Comment by yarikoptic Fri Mar 15 14:03:07 2019
Is there already a way to specify flags to youtube-dl on a per-file basis? I think it would be OK to do it either during addurl (modifying the resulting reference that is stored in the annex somehow), or during git-annex get. This is so that the preferred format can be specified. Primarily this would make it possible to download audio-only formats for some files. (Apologies if I missed some documentation on how to achieve this.)
Comment by gan Fri Mar 15 14:03:07 2019

Sometimes I want to move files from one git annex repo to another. It would be really awesome if one could do something like:

git annex find --in here --and --not --in-repo /path/to/OTHER-REPO

Just to make myself clear. I do not mean "other remote" (foreign instance of "same" repo). I actually mean different repos without common location tracking, no common branches, etc. The only concession I would make (since I think it's necessary) would be that the same backend has to be used in both repos.

This approach could also be relevant for other git annex commands, e.g.:

git annex move file --to-repo /path/to/OTHER-REPO

Is there any way to do it? Or would this be a feature request worth considering?
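
In the meantime, one possible workaround is to compare key lists taken from the two repos. A rough sketch, assuming both repos use the same backend; keys_only_in_first and the .keys file names are made up for illustration and are not part of git-annex:

```shell
# Hypothetical helper: print the keys listed in file $1 that do not
# appear in file $2 (one key per line in each file).
keys_only_in_first() {
    a=$(mktemp) && b=$(mktemp) || return 1
    LC_ALL=C sort -u -- "$1" > "$a"
    LC_ALL=C sort -u -- "$2" > "$b"
    # comm -23 keeps only the lines unique to the first sorted input.
    LC_ALL=C comm -23 -- "$a" "$b"
    rm -f -- "$a" "$b"
}

# Possible usage:
#   (cd /path/to/THIS-REPO  && git annex find --in here --format='${key}\n') > this.keys
#   (cd /path/to/OTHER-REPO && git annex find --in here --format='${key}\n') > other.keys
#   keys_only_in_first this.keys other.keys
```

This only lists candidate keys; it does not move any content between the repos.
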

Comment by mario Fri Mar 15 14:03:07 2019
So, to clarify - I read your first answer. But if this could be done during get, then perhaps it's OK because it is an explicit request for the potentially unsafe operation?
Comment by gan Fri Mar 15 14:03:07 2019
I see that annex.thin doesn't support FAT. What's the best option to save disk space when you are using FAT? I'm currently trying to put files that are more than 50% of a drive's size on that drive, with a v7 repository. Is that possible?
Comment by tjbk123 Fri Mar 15 14:03:07 2019

@gan, there's not much point in providing flags that are only used in the initial download; the main point in adding the url to git-annex is so you can download the same content from it again later.

Comment by joey Fri Mar 15 14:03:07 2019

In using git-annex in the past, I've always found it counterintuitive that rmurl uses the following form to remove a URL from a file:

git annex rmurl [file] [url]

In contrast, addurl uses a flag to designate the file whose list of URLs the new URL should be added to:

git annex addurl [url] --file=[file]

It would make sense (at least to me) to make the syntax for these more congruous so that both commands use either two positional arguments or one positional argument and one keyword argument / flag.

Comment by pellman.john Fri Feb 8 19:29:09 2019