exiftool-json-db
Maintain a JSON collection of photos and videos with their metadata
This is one of the core modules of thumbsup.github.io.
Purpose
This package helps maintain a JSON database of photo & video files, including all their metadata. The result is the same as running exiftool on an entire folder, except that results are cached and only updated when files are added/changed/deleted.
This means you can update the database within a few seconds when adding 20 photos to a collection of 10,000 - and then load that full collection in memory for processing (including captions, timestamps, GPS data...) in just a few milliseconds. See below for examples of useful queries to run.
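The incremental update described above can be pictured as a comparison between the files on disk and the entries already in the database. The sketch below is an illustration only, not this package's actual implementation: `computeDelta` and its inputs (Maps from relative path to modification time in milliseconds) are hypothetical.

```javascript
// Hypothetical helper: compare what is on disk with what the database knows.
// Both arguments are Maps of relative path -> modification time (ms).
function computeDelta (onDisk, inDatabase) {
  const added = []
  const modified = []
  const deleted = []
  for (const [path, mtime] of onDisk) {
    if (!inDatabase.has(path)) added.push(path)
    else if (mtime > inDatabase.get(path)) modified.push(path)
  }
  for (const path of inDatabase.keys()) {
    if (!onDisk.has(path)) deleted.push(path)
  }
  return { added, modified, deleted }
}

// Made-up sample data: one new file, one changed file, one removed file.
const delta = computeDelta(
  new Map([['a.jpg', 100], ['b.jpg', 250], ['c.jpg', 300]]),
  new Map([['a.jpg', 100], ['b.jpg', 200], ['d.jpg', 50]])
)
console.log(delta)
```

In a scheme like this, only the added and modified files would need to go through exiftool again, which is why a small change to a large collection stays fast.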
Requirements
This package requires exiftool from http://www.sno.phy.queensu.ca/~phil/exiftool/ (version 9.70 or above).
Quick start
npm install -g exiftool-json-db
On the command line:
exiftool-json-db --media '/Photos/Holidays' --database '/Documents/holidays.json'
Or programmatically:
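const database = require('exiftool-json-db')
// Option names are assumed to mirror the command-line flags above
const emitter = database.create({
  media: '/Photos/Holidays',
  database: '/Documents/holidays.json'
})
emitter.on('done', (files) => console.log(`${files.length} files in the database`))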
This will create or update /Documents/holidays.json which uses the following format:
[
  {
    "SourceFile": "NewYork/IMG_5364.jpg",
    "File": {
      "FileSize": "449 kB",
      "MIMEType": "image/jpeg"
      /* ... */
    },
    "EXIF": {
      "Orientation": "Horizontal (normal)",
      "DateTimeOriginal": "2017:01:07 13:59:56"
      /* ... */
    },
    "Composite": {
      "GPSLatitude": "+51.5285578",
      "GPSLongitude": -0.2420248
      /* ... */
    }
  }
]
Some notes on the structure:
- the format is identical to the raw exiftool output
  - it doesn't try to parse date strings, and doesn't assume timezones when absent
  - it doesn't fix GPS format oddities, like -10.000 (number) and "+10.000" (string)
  - it doesn't merge similar fields together, like EXIF:ImageDescription and IPTC:Caption-Abstract
- all SourceFile paths are relative to the input folder. This means the database stays valid when processing photos from a removable drive, or a drive whose mount point changes over time
- the names of the groups and tags are exactly as documented at http://www.sno.phy.queensu.ca/~phil/exiftool/TagNames/index.html
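Because the database keeps the raw values, any normalization is left to the code that reads it. The helpers below are a minimal sketch of what that might look like; `toCoordinate`, `parseExifDate` and `caption` are hypothetical names, not part of this package, and they assume the decimal GPS format and date format shown in the example above.

```javascript
// Coordinates may be a number (-10.000) or a signed string ("+10.000").
function toCoordinate (value) {
  if (value === undefined || value === null || value === '') return null
  const number = typeof value === 'number' ? value : parseFloat(value)
  return Number.isFinite(number) ? number : null
}

// "2017:01:07 13:59:56" -> plain numeric fields, with no timezone assumed.
function parseExifDate (value) {
  const match = /^(\d{4}):(\d{2}):(\d{2}) (\d{2}):(\d{2}):(\d{2})/.exec(value || '')
  if (!match) return null
  const [year, month, day, hour, minute, second] = match.slice(1).map(Number)
  return { year, month, day, hour, minute, second }
}

// Pick the first caption-like field that is present.
function caption (file) {
  const exif = file.EXIF || {}
  const iptc = file.IPTC || {}
  return exif.ImageDescription || iptc['Caption-Abstract'] || null
}

console.log(toCoordinate('+51.5285578'), toCoordinate(-0.2420248))
console.log(parseExifDate('2017:01:07 13:59:56'))
```

Returning plain fields instead of a `Date` keeps the "no timezone assumed" property: the caller decides which timezone applies.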
Examples of useful queries
Once you have run exiftool-json-db to update your database, you can run useful queries on the JSON data.
Node.js and jq are easy choices for processing JSON.
- Find all camera models used
jq '[.[].EXIF.Model] | unique' holidays.json
- Group photos by aperture value
jq 'group_by(.Composite.Aperture) | map({Aperture: .[0].Composite.Aperture, Files: map(.SourceFile)})' holidays.json
- Find all "group" photos (where the camera identified more than 5 faces)
jq '.[] | select([.XMP.RegionInfo.RegionList[]? | select(.Type == "Face")] | select(length > 5)) | .SourceFile' holidays.json
- Find all photos within 10km of London
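const geodist = require('geodist')
const db = require('./holidays.json')
const LONDON = { lat: 51.5285578, long: -0.2420248 }
// The filter below is a reconstruction; the geodist options are assumed
db.filter(file => {
  const gps = file.Composite || {}
  if (gps.GPSLatitude === undefined || gps.GPSLongitude === undefined) return false
  const coords = { lat: gps.GPSLatitude, long: gps.GPSLongitude }
  return geodist(LONDON, coords, { unit: 'km' }) < 10
})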
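The same queries can also be written in plain Node.js without extra dependencies. The functions below are a sketch under a few assumptions: the function names are hypothetical, the sample data is made up, and the distance uses a haversine formula with a mean Earth radius of 6371 km rather than any particular library. Unlike the jq versions, `cameraModels` skips files without a model and `groupByAperture` keeps first-seen order instead of sorting.

```javascript
// Unique camera models, sorted.
function cameraModels (db) {
  const models = db.map(file => file.EXIF && file.EXIF.Model).filter(Boolean)
  return [...new Set(models)].sort()
}

// Group file paths by aperture value.
function groupByAperture (db) {
  const groups = new Map()
  for (const file of db) {
    const aperture = file.Composite ? file.Composite.Aperture : undefined
    if (!groups.has(aperture)) groups.set(aperture, [])
    groups.get(aperture).push(file.SourceFile)
  }
  return [...groups].map(([Aperture, Files]) => ({ Aperture, Files }))
}

// Photos with more than `minFaces` face regions.
function groupPhotos (db, minFaces = 5) {
  return db.filter(file => {
    const regions = (file.XMP && file.XMP.RegionInfo && file.XMP.RegionInfo.RegionList) || []
    return regions.filter(region => region.Type === 'Face').length > minFaces
  }).map(file => file.SourceFile)
}

// Great-circle distance in km between two { lat, long } points (haversine).
function distanceKm (a, b) {
  const toRad = deg => deg * Math.PI / 180
  const dLat = toRad(b.lat - a.lat)
  const dLong = toRad(b.long - a.long)
  const h = Math.sin(dLat / 2) ** 2 +
    Math.cos(toRad(a.lat)) * Math.cos(toRad(b.lat)) * Math.sin(dLong / 2) ** 2
  return 2 * 6371 * Math.asin(Math.sqrt(h))
}

// Photos within `radiusKm` of `center`; files without GPS data are skipped.
function near (db, center, radiusKm) {
  return db.filter(file => {
    const gps = file.Composite || {}
    const lat = parseFloat(gps.GPSLatitude) // accepts "+51.5" and 51.5 alike
    const long = parseFloat(gps.GPSLongitude)
    if (Number.isNaN(lat) || Number.isNaN(long)) return false
    return distanceKm(center, { lat, long }) < radiusKm
  }).map(file => file.SourceFile)
}

// Made-up sample entries in the database format.
const sample = [
  { SourceFile: 'London/a.jpg', EXIF: { Model: 'Camera A' }, Composite: { Aperture: 2, GPSLatitude: '+51.5', GPSLongitude: -0.2 } },
  { SourceFile: 'NewYork/b.jpg', EXIF: { Model: 'Camera A' }, Composite: { Aperture: 8, GPSLatitude: '+40.71', GPSLongitude: -74.0 } },
  { SourceFile: 'c.jpg', EXIF: { Model: 'Camera B' }, Composite: { Aperture: 2 } }
]
const LONDON = { lat: 51.5285578, long: -0.2420248 }
console.log(cameraModels(sample))
console.log(near(sample, LONDON, 10))
```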
Programmatic usage
create()
collection
events
create() returns an event emitter that emits the following:
- basic stats about the collection, before any processing is done
- before a file is processed (index is 1 based, e.g. 1/3)
- unexpected error, cannot recover
- finished, passing the collection as an argument (the done event)
In case you need to process the list of files straight away, you don't need to re-load holidays.json from disk. The done event includes the whole updated array as an argument.
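As a sketch of working with the collection straight from the done event: the helper names below are hypothetical, and a plain Node.js EventEmitter stands in for the emitter returned by create() so the example runs without exiftool installed.

```javascript
const { EventEmitter } = require('events')

// Count entries per MIME type in the collection passed to `done`.
function countByMimeType (files) {
  const counts = {}
  for (const file of files) {
    const type = (file.File && file.File.MIMEType) || 'unknown'
    counts[type] = (counts[type] || 0) + 1
  }
  return counts
}

// Works with any emitter that fires `done` with the collection array.
function summarizeWhenDone (emitter, callback) {
  emitter.once('done', files => callback(countByMimeType(files)))
}

// Stand-in emitter and made-up collection, for illustration only.
const fakeEmitter = new EventEmitter()
let summary = null
summarizeWhenDone(fakeEmitter, counts => { summary = counts })
fakeEmitter.emit('done', [
  { SourceFile: 'a.jpg', File: { MIMEType: 'image/jpeg' } },
  { SourceFile: 'b.jpg', File: { MIMEType: 'image/jpeg' } },
  { SourceFile: 'c.mp4', File: { MIMEType: 'video/mp4' } }
])
console.log(summary)
```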
debug
By default, the library does not print any extra information. The command-line tool only prints basic stats and the progress bar.
To display extra troubleshooting info, set the following environment variable:
DEBUG="*" exiftool-json-db