git-annex's metadata works best when files have a lot of useful metadata attached to them.

To make git-annex automatically set the year and month when adding files, run git config annex.genmetadata true.

A git commit hook can be set up to extract lots of metadata from files like photos, mp3s, etc.

  1. Install the extract utility, from http://www.gnu.org/software/libextractor/
    apt-get install extract
  2. Download pre-commit-annex and install it in your git-annex repository as .git/hooks/pre-commit-annex.
    Remember to make the script executable!
  3. Run: git config metadata.extract "artist album title camera_make video_dimensions"
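Step 2 can be scripted. What follows is a minimal sketch, assuming pre-commit-annex has already been downloaded to a local path. The function name install_annex_hook and its two arguments are made up for illustration and are not part of git-annex.

```shell
#!/bin/sh
# Hypothetical helper (not part of git-annex): copy a downloaded
# pre-commit-annex script into a repository's hooks directory and
# make it executable.
# Usage: install_annex_hook /path/to/pre-commit-annex /path/to/repo
install_annex_hook() {
    src=$1
    repo=$2
    hooks="$repo/.git/hooks"
    if [ ! -f "$src" ]; then
        echo "hook script not found: $src" >&2
        return 1
    fi
    if [ ! -d "$hooks" ]; then
        echo "no hooks directory at: $hooks" >&2
        return 1
    fi
    # Install under the name git-annex looks for, then set the
    # executable bit so the hook can actually be run.
    cp -- "$src" "$hooks/pre-commit-annex" &&
        chmod +x "$hooks/pre-commit-annex"
}
```

Step 3 (setting metadata.extract) still has to be run in the repository afterwards.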

Now any fields you list in metadata.extract will be extracted and stored when files are committed.

To get a list of all possible fields, run: extract -L | sed 's/ /_/g'
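The sed step turns spaces in the listed names into underscores, which is the form used in metadata.extract. Here is a small sketch of that conversion on sample input. The function name to_field_names is made up for illustration, and the sample names are only inferred from the camera_make and video_dimensions fields used above, not copied from real extract output.

```shell
# Replace every space with an underscore, one field name per line,
# exactly as the sed command above does.
to_field_names() {
    sed 's/ /_/g'
}

# Sample input standing in for the output of "extract -L".
printf 'camera make\nvideo dimensions\ntitle\n' | to_field_names
```

Names that contain no spaces, like title, pass through unchanged.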

By default, if git-annex already has a metadata field for a file, its value will not be overwritten with metadata taken from the file. To allow overwriting, run: git config metadata.overwrite true

http://projects.iq.harvard.edu/fits might be an even better choice than libextractor. We use it at work and it's not too bad, but it can be slow to start up due to the JVM.

Is there a way for this to be done globally, without having to install and configure the hook for each repository? It seems like a fairly useful feature that could be factored into git-annex itself (as opposed to being shipped as a shell script)...

Also, is there a way to retroactively parse the tags from existing files (as opposed to only new files added to the repo)?

thanks

Comment by https://id.koumbit.net/anarcat Tue Apr 1 04:18:10 2014

@anarcat, I have modified pre-commit-annex so if it's passed already annexed files, it'll extract their metadata.

So this can be used to add metadata to files added before you installed the hook, or if you've configured more fields to be extracted.
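As an illustration, the hook could be run by hand on existing files. This is only a sketch under a few assumptions: the wrapper name extract_existing is made up, it expects to be run from the top of the repository's work tree, and it checks the executable bit first because a hook that is not executable cannot be run.

```shell
# Hypothetical wrapper (not part of git-annex): run the installed
# pre-commit-annex hook on already annexed files, passing the files
# as arguments.
extract_existing() {
    hook=.git/hooks/pre-commit-annex
    if [ ! -x "$hook" ]; then
        echo "$hook is missing or not executable" >&2
        return 1
    fi
    "$hook" "$@"
}

# Example (file names are placeholders):
#   extract_existing photos/2013/*.jpg
```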

Comment by http://joeyh.name/ Thu Apr 17 20:15:07 2014

Seemingly pre-commit hooks are not being called on Windows. Could it have to do with git annex sync bypassing them when doing commits?

On the other hand, genmetadata works, although that is not enough for me since I want to preserve the complete last modification date/time. I was in the process of modifying the supplied pre-commit script to call "stat -c %Y" (which, by the way, works fine on Windows, while the latest binaries for extract fail there).

Am I correct in assuming that direct mode (on Windows at least) bypasses hooks, namely pre-commit as well as pre-commit-annex?

@Michele git annex sync in a direct mode repository does bypass the pre-commit hook. However, it will try to run the pre-commit-annex hook.

Most likely, the hook script does not appear executable on Windows, so git-annex cannot run it.

Comment by joey Tue Jan 20 16:52:28 2015

@Michele after testing, git-annex on Windows seems not to see a file as executable even when it has the executable bit set. I have opened a bug report, "windows isExecutable fail", and worked around the problem now.

Comment by joey Tue Jan 20 17:19:34 2015

@Joey I just tested a nightly build and now pre-commit-annex is called, and with my modifications it automatically adds last modified times for content. It's just a matter of adding:

field="datemod"
value=$(stat -c %Y "$f")
addmeta "$f" "$field" "$value"

to the body of the process() function in the supplied pre-commit-annex script. Thanks.
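For reference, here is a small sketch of what that value looks like. It assumes GNU coreutils (stat -c, and date -d with an @seconds argument). The helper names file_mtime and epoch_to_utc are illustrative and not part of pre-commit-annex.

```shell
# Print a file's last modification time as seconds since the epoch,
# the same value the snippet above stores in the datemod field.
file_mtime() {
    stat -c %Y -- "$1"
}

# Convert epoch seconds to a readable UTC timestamp (GNU date).
epoch_to_utc() {
    date -u -d "@$1" +%Y-%m-%dT%H:%M:%SZ
}
```

Quoting the file name keeps paths that contain spaces in one piece.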
