Recent comments posted to this site:
With systemd, git-annex assistant --autostart --foreground either ignores --foreground or quits immediately.
I managed to keep the process alive with RemainAfterExit=on:
[Service]
User=%i
ExecStart=/usr/bin/git-annex assistant --autostart --foreground
ExecStop=/usr/bin/git-annex assistant --autostop
RemainAfterExit=on
Restart=on-failure
RestartSec=5
but the git-annex processes do not honour the --foreground option, which causes a lot of zombies over time (it is not entirely clear why).
My current solution is to have one service per annex repository and avoid --autostart,
but this is annoying because it requires passing the path as %I
and wrapping git-annex in a bash script to get the repo owner as the user.
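The per-repository workaround described above could be sketched roughly as follows. This is only an illustration, not the poster's actual script: the function names `repo_owner` and `run_assistant` are made up here, and it assumes GNU `stat`, util-linux `runuser`, and that the wrapper is started as root.

```shell
#!/bin/sh
# Hypothetical wrapper: run the assistant for one repository as its owner.
# Assumes GNU stat (-c %U) and util-linux runuser; must be started as root.

# Print the name of the user who owns the given repository directory.
repo_owner() {
    stat -c %U -- "$1"
}

# Run the assistant in the foreground for one repository, as its owner.
# exec replaces the shell, so the service manager supervises git-annex directly.
run_assistant() {
    repo=$1
    owner=$(repo_owner "$repo") || return 1
    cd "$repo" || return 1
    exec runuser -u "$owner" -- git-annex assistant --foreground
}
```

A script built on this would end with `run_assistant "$1"`, and the unit's ExecStart would pass the repository path taken from %I as that argument.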
Hi folks!
We are considering introducing git-annex with gcrypt in hybrid mode as secure storage for common data in our company, and I'd rather not delete and reinit the repo every time somebody new is granted access. A little testing with current git-annex showed that GCRYPT_FULL_REPACK with a forced git-push of all branches makes the git repo accessible to the newcomer (I get the files), but not the annexed data (gpg error "No secret key" in git annex get; git annex info secretRepo just lists my first key).
Has anybody successfully tested adding keyids in hybrid encryption later on? Which further steps were needed to make it work?
Thanks for any input! :)
Cheers
Jörn
@joey - thanks, that's prompt feature request fulfilment :-)
Looking more closely at the duplicates, it turns out that not everything got duplicated, just the "older" episodes. It turns out the newer episodes do have guid values saved (as itemid in the metadata) and the older episodes do not. I think this is most likely because I was running a fairly old git-annex until about October 2016, on a fairly old OS install, but then upgraded to a more recent one (now about 6 months old) which does track them. My assumption (without checking every file) is the episodes downloaded before October 2016 are ones that got duplicated.
I've edited the main page and added a note that GUIDs are tracked in versions since 2015, since I didn't obviously find that listed anywhere before.
Ewen
@rok it's a consequence of using smudge/clean filters; git add passes the file through the filters.
@ewen importfeed already tracks guids, since 2015. Relevant commit is [[!commit f95a8c867223b2e17d036d0d3377bf0fc9d3adff]]
You may well have an older version of git-annex that didn't do that. But there are probably also feeds that lack a useful guid, or that even make a change that changes the guid of an existing item.
With git annex metadata, you can see the itemid, which is where the guid is stored.
PS, please post in todo when you have a request..
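As a rough illustration of the point above about itemid, the stored guid can be read back with the `--get` option of `git annex metadata`. The helper names `episode_guid` and `list_guids` are made up for this sketch, and the file names in the usage note are placeholders.

```shell
# Print the itemid metadata field (where importfeed stores the guid) of one file.
episode_guid() {
    git annex metadata --get itemid "$1"
}

# Print "file<TAB>guid" for each file given; the guid column is empty
# for files that have no itemid recorded.
list_guids() {
    for f in "$@"; do
        printf '%s\t%s\n' "$f" "$(episode_guid "$f")"
    done
}
```

Running something like `list_guids podcasts/*.mp3` inside the repository would then show which episodes predate guid tracking, since their second column stays empty.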
While tracking podcast media URLs usually works to avoid duplicate downloads, when it fails it usually fails spectacularly. In particular if a podcast feed decides to update all the URLs (for old and new podcasts) to use a different URL scheme, then suddenly that looks like a huge volume of new URLs, and all of them get downloaded again -- even if the content has actually already been retrieved from a different URL (ie, older URL scheme). For instance the acast.com service has changed their URL scheme a couple of times in the last 1-2 years, rewriting all the historical URLs, so I have three copies of many of the episodes on podcasts on their service :-( (Many downloaded; some skipped once I caught the bulk download and stopped it/reran with --fast or --relaxed to make placeholders instead. acast.com seem to have managed to cause even more confusion by rewriting many of the older mp3 files with new id3 tags, thus changing the file size/hashes -- it definitely made cleaning up more complicated.)
Some (all?) podcast feeds also have a guid field, which specifies what should be a unique, unchanging per-episode identifier that other podcatchers use to track "seen this" content. In theory that guid value should be stable even across media URL changes -- at least if it isn't, then a podcaster changing both the guid and the media URL will almost certainly induce re-downloads in most podcatchers, and will thus hopefully realise the problem early on (eg, during testing) rather than in production.
Can git-annex be extended to track the guid values as well as the filenames, so git annex importfeed can avoid downloading episodes where it has already processed that guid, and instead just add the newly listed url as an alternate web URL for that specific episode (which has been my manual workaround)? Perhaps the episode guid could be stored as additional metadata, along with some sort of feed unique ID (link?), and then an index built/consulted when importfeed runs (although that "feed unique ID" would probably also have to be updatable by the user, to cope with "the feed URL has now changed from http:// to https://", which also seems to be happening a bunch at present).
Ewen
PS: Apologies for duplicate partial comment; I think my browser decided some key combination meant "do default form action", which is post -- and I wasn't finished writing. I couldn't see a way to edit the comment, hence deleting/readding.
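The manual workaround mentioned in this comment (recording the newly listed URL against an episode that is already in the annex) might look roughly like the sketch below. It relies on `git annex addurl --file=`, which records an extra URL when the named file already exists. The helper name `add_alternate_url` is made up for illustration. This only makes sense when the new URL still serves the same content; if the server has rewritten the file (for example with new id3 tags), the existing key no longer describes what that URL returns.

```shell
# Record NEW_URL as an additional web location for an episode file that is
# already in the annex, instead of letting importfeed download it again.
add_alternate_url() {
    file=$1
    new_url=$2
    # An annexed file whose content is absent is a dangling symlink,
    # so test for a symlink as well as for an existing file.
    if [ ! -e "$file" ] && [ ! -L "$file" ]; then
        echo "no such annexed file: $file" >&2
        return 1
    fi
    git annex addurl --file="$file" "$new_url"
}
```

Afterwards `git annex whereis` on the file should list both the old and the new web URL.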
Thanks, joey.
Your last comment put me on the right track. The problem was not in the repository, but an old stale global .gitconfig in my homedir. I had only checked $XDG_CONFIG_HOME/git/config, where my global git config currently resides, and totally forgot about this old config. Stupid me!
git config --show-origin --get annex.largefiles
was my savior here, as it clearly indicated that there was indeed an (unintended) config setting and where to find the file. So I can strongly recommend that anybody experiencing strange behavior try this one-liner. It would have saved me hours of time.
Thanks for your help! :)
Cheers
Jörn
Note that if annex.largefiles is set in git config (including global git config), it overrides the .gitattributes setting. So a reasonable guess would be that you set it in the git config.
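To check both places at once, a small helper along these lines may be useful. It is only a sketch, and `largefiles_sources` is a made-up name. It prints any annex.largefiles value found in git config together with the file it came from, followed by what .gitattributes says for a given path.

```shell
# Show where annex.largefiles comes from for a given path:
# first any git config value (with its origin file), then .gitattributes.
largefiles_sources() {
    echo "git config:"
    # --get-all exits non-zero when the key is unset anywhere.
    git config --show-origin --get-all annex.largefiles || echo "(not set)"
    echo "gitattributes:"
    git check-attr annex.largefiles -- "$1"
}
```

Run as `largefiles_sources foo` inside the repository; if the "git config" part prints a value, that value is the one in effect, as noted above.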
@joern.mankiewicz, you need to file a bug report with enough information to reproduce your problem.
annex.largefiles in .gitattributes works fine:
joey@darkstar:~/tmp> git init ttt
Initialized empty Git repository in /home/joey/tmp/ttt/.git/
joey@darkstar:~/tmp> cd ttt
joey@darkstar:~/tmp/ttt> git annex init
init ok
(recording state in git...)
joey@darkstar:~/tmp/ttt> echo '* annex.largefiles=nothing' > .gitattributes
joey@darkstar:~/tmp/ttt> touch foo
joey@darkstar:~/tmp/ttt> git annex add foo
add foo (non-large file; adding content to git repository) ok
(recording state in git...)
You may want to consider using onion-grater to limit possible escalations in the use of that control port:
RFP: [[!debbug 859125]]