Flickr Local Backup?

Flickr is dead / dying. Yahoo! is owned by someone like CBS… i’m not even sure i want to know. Time to get out. Maybe not completely, but that’s a different story.

My plan is to make a local backup / archive of everything that i’ve uploaded. In theory there is a copy of everything elsewhere, but as previous discussed, my organisational skills leave a little to be desired.

Pick a tool. There is a Python Flickr API kit thing, so presumably someone will have written a backup tool. Checkout GitHub and see what is there. The most recently updated / highly rated choice is Flickrmirrorer. Lets give that a go:

$ git clone --recursive https://github.com/markdoliner/flickrmirrorer.git
Cloning into 'flickrmirrorer'...
remote: Counting objects: 483, done.
remote: Total 483 (delta 0), reused 0 (delta 0), pack-reused 483
Receiving objects: 100% (483/483), 104.01 KiB | 0 bytes/s, done.
Resolving deltas: 100% (225/225), done.
$ cd flickrmirrorer/
$ ./flickrmirrorer.py --help
Traceback (most recent call last):
 File "./flickrmirrorer.py", line 49, in <module>
 import requests
ImportError: No module named requests

Ah, fun… Python libraries. This always ends well. According to the README.md we need:

  • python 2.something or python 3.anything
  • python dateutil
    • Ubuntu: apt-get install python-dateutil
  • python flickrapi library 2.0 or newer.
  • python requests

I’ve not done any Python dev on this machine, and it’s macOS not Linux, so no apt-get here. The first thing to try is installing pip… with easy_install. Sometimes it’s pip, sometimes it’s easy_install. They might well be equivalent… no idea!

$ sudo easy_install pip
...
$ sudo pip install requests
...
$ sudo pip install dateutils
...
$ sudo easy_install flickrapi

Might have just as easily been able to install flickrapi with pip, but they said they wanted easy_install. And now:

$ ./flickrmirrorer.py --help
usage: flickrmirrorer.py [-h] [-v] [-q] [-s] [--ignore-views]
 [--ignore-photos] [--ignore-videos]
 [--delete-unknown]
 destdir

Create a local mirror of your flickr data.

positional arguments:
 destdir the path to where the mirror shall be stored

optional arguments:
 -h, --help show this help message and exit
 -v, --verbose print progress information to stdout
 -q, --quiet print nothing to stdout if the mirror succeeds
 -s, --statistics print transfer-statistics at the end
 --ignore-views do not include views-counter in metadata
 --ignore-photos do not mirror photos
 --ignore-videos do not mirror videos
 --delete-unknown delete unrecognized files in the destination directory.
 Warning: if you choose to ignore photos or videos, they
 will be deleted!

Cool. Now all we need to do is setup a disk to keep all this stuff and kick it off:

$ ./flickrmirrorer.py /Volumes/FlickrDump/backup/
0:121: execution error: "https://www.flickr.com/services/oauth/authorize?oauth_token=<token>&perms=read" doesn’t understand the “open location” message. (-1708)

Please authorize Flickr Mirrorer to read your photos, titles, tags, etc.

1. Visit https://www.flickr.com/services/oauth/authorize?oauth_token=<token>&perms=read
2. Click "OK, I'LL AUTHORIZE IT"
3. Copy and paste the code here and press 'return'

XYZ-ABC-123

Fetching 33991776973.jpg
Updated metadata for 33991776973.jpg
Fetching 32923862986.jpg
Updated metadata for 32923862986.jpg
Fetching 31996362424.jpg
...

This will presumably take a while. An hour or so in:

$ ls -1 *.jpg | wc -l
 900

Edit: of course it died right after this was posted…

Traceback (most recent call last):
 File "./flickrmirrorer.py", line 812, in <module>
 main()
 File "./flickrmirrorer.py", line 807, in main
 mirrorer.run()
 File "./flickrmirrorer.py", line 195, in run
 self._run_helper()
 File "./flickrmirrorer.py", line 237, in _run_helper
 self._download_all_photos()
 File "./flickrmirrorer.py", line 288, in _download_all_photos
 new_files |= self._download_photo(photo)
 File "./flickrmirrorer.py", line 398, in _download_photo
 photo_datetime = get_photo_datetime(photo)
 File "./flickrmirrorer.py", line 147, in get_photo_datetime
 return dateutil.parser.parse(photo['datetaken'])
 File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/dateutil/parser.py", line 697, in parse
 return DEFAULTPARSER.parse(timestr, **kwargs)
 File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/dateutil/parser.py", line 310, in parse
 ret = default.replace(**repl)
ValueError: month must be in 1..12

<sigh>

It seems that some of the images in Flickr has no date taken data at all. Isn’t that fun. Time for a little hackery:

$ git diff flickrmirrorer.py 
diff --git a/flickrmirrorer.py b/flickrmirrorer.py
index ef9aa85..29df9a2 100755
--- a/flickrmirrorer.py
+++ b/flickrmirrorer.py
@@ -144,7 +144,11 @@ def get_photo_datetime(photo):
 datetime.datetime
 """
 if photo['datetakenunknown'] == "0":
- return dateutil.parser.parse(photo['datetaken'])
+ try:
+ return dateutil.parser.parse(photo['datetaken'])
+ except ValueError:
+ # now what?
+ return datetime.datetime(1970,1,1,0,0,0,0)
 
 try:
 parsed = datetime.datetime.strptime(photo['title'], '%Y%m%d_%H%M%S')

Which is to say:

  • wrap the parsing of ‘datetaken’ in a try / catch
  • return the beginning of unix time and move on

Suppose it might have been easier to edit the metadata on flickr… but where is the fun in that!

Advertisements