2:1
mediafile: Media File Metadata Utilities
1 Introduction
The mediafile package provides utilities for dealing with collections of media
files (still image, audio, video) and the metadata properties of those files.
Currently, this package provides procedures for extracting metadata from a few
popular media file formats, and procedures for maintaining a database of media
files currently in various filesystem directory trees. This functionality is
useful for media-player applications, and for managing collections of media
files.
This package is currently implemented in pure Racket code: it neither links
new native code into the Racket process nor runs external programs.
This package was written for the RackOut appliance, but has not
been exercised heavily.
2 Types
(mediafile-type? x) → boolean?
  x : any/c
Predicate for whether or not x is a mediafile-type.
A valid type is either a symbol naming a MIME content-type, or a list of
symbols in which the last symbol is the MIME content-type and the one or more
preceding symbols are encodings atop the content-type. For example, file
"foo.tif" might have type 'image/tiff, and file "foo.tif.gz" might have type '(gzip image/tiff).
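The shape described above can be sketched in a few lines of plain Racket. The names sketch-mediafile-type?, sketch-type-content-type, and sketch-type-encodings are hypothetical and are not part of this package; the package's own mediafile-type? may check more (or less) than this sketch does. In particular, the sketch does not verify that a symbol is a registered MIME name.

```racket
#lang racket/base

;; Hypothetical re-implementation for illustration only; the package's
;; own mediafile-type? may differ in details.
(define (sketch-mediafile-type? x)
  (or (symbol? x)
      ;; The list form needs at least one encoding plus the content-type,
      ;; so it must hold at least two symbols.
      (and (list? x)
           (pair? x)
           (pair? (cdr x))
           (andmap symbol? x))))

;; The content-type is the symbol itself, or the last symbol of a list.
(define (sketch-type-content-type type)
  (if (symbol? type)
      type
      (car (reverse type))))

;; The encodings are every symbol before the last; none for a bare symbol.
(define (sketch-type-encodings type)
  (if (symbol? type)
      '()
      (reverse (cdr (reverse type)))))
```

With these definitions, (sketch-type-content-type '(gzip image/tiff)) yields 'image/tiff and (sketch-type-encodings '(gzip image/tiff)) yields '(gzip).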
(mediafile-props? x) → boolean?
  x : any/c
Predicate for whether or not x is a mediafile-props, which is used to represent properties of a media file.
A props is an alist of alists of symbols to datums. In other words, following this contract:
(listof (cons/c any/c
                (listof (cons/c symbol?
                                any/c))))
The top-level alist is for “parts”, such as for distinguishing
multiple media objects in a single container file. The car of each top-level pair can be any datum, although it will often be a number giving the position of the part in the container, unless there is a better unique key. A special car is #f, which denotes properties of the entire container file.
The cdr of these top-level pairs is the second-level alist, which is
symbol-to-datum pairs specific to the part. The names of the symbols are often
specific to the type of either the file or the part. The datum values
corresponding to the names in the part can be of any type; an application
wishing to do more with the value than display it in raw form must have a priori knowledge of the type, such as that 'exif:metering-mode typically has values like 'center-weighted-average and 'spot, and what those values mean for the application.
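As a rough illustration of this two-level structure, the following sketch builds a small props value by hand and looks up a property in it. The helper sketch-props-ref and the example: property names are hypothetical; only 'exif:metering-mode and its value 'spot come from the text above, and real scans will produce different keys.

```racket
#lang racket/base

;; Illustrative props value. The part keys are #f (whole container)
;; and 0 (first part); the example: names are made up for this sketch.
(define example-props
  '((#f . ((example:container-note . "whole-file property")))
    (0  . ((exif:metering-mode . spot)
           (example:width . 640)))))

;; Hypothetical helper: look up property `name` within part `part-key`,
;; returning `default` when either the part or the property is absent.
(define (sketch-props-ref props part-key name [default #f])
  (let ([part (assoc part-key props)])
    (if part
        (let ([prop (assq name (cdr part))])
          (if prop (cdr prop) default))
        default)))
```

For instance, (sketch-props-ref example-props 0 'exif:metering-mode) returns 'spot, while a lookup in a missing part returns the supplied default.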
(struct mediafile (path type identity size mtime props)
  #:extra-constructor-name make-mediafile
  #:transparent)
  path : path?
  type : mediafile-type?
  identity : any/c
  size : any/c
  mtime : any/c
  props : mediafile-props?
Struct representing a mediafile. The identity, size, and mtime values are intended to help determine whether a file has been
modified since it was last scanned for properties.
3 Content Types
This package currently supports a few different MIME content-types,
listed in the following subsections, along with lists of references that were
used in the implementation for each content-type.
3.1 TIFF (image/tiff)
3.2 JPEG/Exif (image/jpeg)
ITU CCITT T.81, Terminal Equipment and Protocols for Telematic Services -
Information Technology - Digital Compression and Coding of Continuous-Tone
Still Images - Requirements and Guidelines, 1992-09
CIPA DC-008-Translation-2010: Exchangeable image file
format for digital still cameras: Exif Version 2.3, 2010-04-26
David Burren, EXIF MakerNote of Canon, Revision 1.15, 2001-06-03
Evan Hunter, Canon Makernote information, viewed 2012-11-17
Canon MakerNote Tags defined in Exiv2, viewed 2012-11-17
GVsoft Exif MakerNote information, viewed 2012-11-23
3.3 Ogg Vorbis (audio/ogg)
4 Files and Scanning
This section lists procedures for maintaining a database of "mediafile" objects corresponding to files in filesystem directory trees.
(path->mediafile path
                 [#:canonicalize-path? canonicalize-path?
                  #:old-mediafile old-mediafile
                  #:type-mandatory? type-mandatory?
                  #:props-mandatory? props-mandatory?
                  #:exception? exception?])
  → mediafile?
  path : path-string?
  canonicalize-path? : boolean? = #true
  old-mediafile : (or/c #f mediafile?) = #f
  type-mandatory? : boolean? = #false
  props-mandatory? : boolean? = #false
  exception? : boolean? = #true
Yields a mediafile, given a path to the file. If #:old-mediafile is given, then that value will be returned if the file does not seem to have changed since that mediafile was created, which potentially saves the cost of scanning for properties.
If there is a problem creating a mediafile, then the behavior depends on #:exception? – if true, then an exception is raised; if false, then this procedure returns #false rather than a mediafile. The #:type-mandatory? and #:props-mandatory? arguments specify what should be considered a “problem” for this purpose.
The #:canonicalize-path? argument specifies whether to store a canonicalized path in the mediafile, rather than the path argument verbatim. Most applications will want a
canonicalized path, which is the default behavior.
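The reuse behavior described for #:old-mediafile can be pictured with the following self-contained sketch. It does not call this package: the record struct, path->record, and expensive-scan are hypothetical stand-ins, and the comparison on size and modification time is only an assumption about what "does not seem to have changed" might mean. The package's actual criteria, including how it uses the identity field, are not specified here.

```racket
#lang racket/base

;; Hypothetical stand-in for a mediafile; only the fields needed for
;; change detection are modelled.
(struct record (path size mtime props) #:transparent)

;; Stand-in for an expensive metadata scan. It counts its calls so that
;; the effect of reusing an old record is observable.
(define scan-count 0)
(define (expensive-scan path)
  (set! scan-count (add1 scan-count))
  '())

;; Returns `old` when the file's size and modification time still match
;; it; otherwise builds a fresh record by rescanning the file.
(define (path->record path #:old [old #f])
  (define size (file-size path))
  (define mtime (file-or-directory-modify-seconds path))
  (if (and old
           (equal? (record-path old) path)
           (= (record-size old) size)
           (= (record-mtime old) mtime))
      old
      (record path size mtime (expensive-scan path))))
```

Calling path->record twice on an unmodified file, passing the first result as #:old on the second call, returns the same record and leaves scan-count at 1.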
(scan-mediafiles start-path-or-paths
                 [#:canonicalize-paths? canonicalize-paths?
                  #:type-mandatory? type-mandatory?
                  #:props-mandatory? props-mandatory?
                  #:old-hash old-hash
                  #:remove-other-paths? remove-other-paths?])
  → immutable-hash?
  start-path-or-paths : (or/c path-string? (list-of path-string?))
  canonicalize-paths? : boolean? = #true
  type-mandatory? : boolean? = #false
  props-mandatory? : boolean? = #false
  old-hash : immutable-hash? = #f
  remove-other-paths? : boolean? = #true
Scans filesystems recursively, beneath the paths given as start-path-or-paths, and returns a hash that maps paths to mediafile objects.
If #:old-hash is provided, then this hash is used as a starting point for the hash that will ultimately be returned, such as for updating from a previous run of scan-mediafiles. If #:old-hash is provided, then #:remove-other-paths? determines whether paths in the old hash that are not within the scope of start-path-or-paths should be removed before returning the new hash.
The #:canonicalize-paths?, #:type-mandatory?, and #:props-mandatory? arguments are passed to path->mediafile.
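One plausible reading of #:remove-other-paths? is sketched below, without using this package. The helpers path-within? and merge-scan are hypothetical, and the sketch assumes that entries inside the scanned scope come only from the fresh scan, so that only out-of-scope entries of the old hash are affected by the flag. The package may merge differently, for example by reusing in-scope old entries to avoid rescanning.

```racket
#lang racket/base

;; Hypothetical helper: is `path` equal to or beneath directory `root`?
;; Compares path elements, so "/music2" is not treated as inside "/music".
(define (path-within? path root)
  (let loop ([p (explode-path path)]
             [r (explode-path root)])
    (cond [(null? r) #t]
          [(null? p) #f]
          [(equal? (car p) (car r)) (loop (cdr p) (cdr r))]
          [else #f])))

;; Hypothetical merge step. `fresh-hash` holds the results of scanning
;; beneath `start-paths`. Old entries outside that scope are carried over
;; only when remove-other-paths? is false.
(define (merge-scan old-hash fresh-hash start-paths
                    #:remove-other-paths? [remove-other? #t])
  (if remove-other?
      fresh-hash
      (for/fold ([result fresh-hash])
                ([(path value) (in-hash old-hash)]
                 #:unless (for/or ([root (in-list start-paths)])
                            (path-within? path root)))
        (hash-set result path value))))
```

Under these assumptions, rescanning only "/music" with the flag set to #false keeps an old "/photos/b.jpg" entry, whereas the default drops it.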
5 Test Files
This package contains some files that are used for test data.
5.1 Current Test Files
The following directory structure exists in the source code
distribution for this package.
"test-files/"
"exif-org/" – JPEG/Exif and other files, from http://exif.org/samples.html, courtesy of John Hawkins.
"public-domain/" – Files known to be in the legal public domain, for
testing with a breadth of file creators (e.g., different camera models) and
situations (e.g., different Ogg container layouts).
5.2 Contributing Test Files
Please note that we are not currently soliciting contributions of
test data for this package. This section remains just in case we resume.
If you’d like to contribute a JPEG file from a particular camera
model, that would be very welcome. Here’s how:
1. Set camera to capture the image in a relatively small file size. This means setting camera to low resolution,
   high compression, low quality, etc. (The small size is to make including files
   with the package more practical.)
2. Choose a photographic subject (e.g., stop sign, cloud,
   thumbtack, light switch) that:
   - Does not contain any trademarks or copyrighted
     material (no brand names, logos, book pages, etc.).
   - Does not contain anything personally-identifiable,
     such as faces.
   - Is G-rated. (No showing off Racket programmer abs.)
   - Is not too complicated, so should compress well.
3. Take photo with camera.
4. Do not edit the photo in any way at all – it must be
   byte-for-byte identical to how the camera first wrote it to your memory card.
5. Email the file to: neil@neilvandyke.org
6. In the text of the email, please state “This file is in
   the public domain.” Note that you are legally giving up all copyright to the
   image, to make including the file in a regression test suite more practical.
6 Known Issues
Assemble a suite of test input files, without legal encumbrances.
Preferably small enough file sizes to include in PLaneT package, as part of
built-in unit tests.
Needs more real-world testing with diversity of files.
Malformed or insufficiently supported TIFF and Exif files can
result in infinite loops or use excessive resources.
Support for additional Exif MakerNotes, especially Nikon ones.
A little bit more support for Canon Exif MakerNotes is possible,
such as decoding Custom Functions.
Add support for getting info about multiple streams in Ogg
files.
Support additional file types, especially JPEG/JFIF, PNG, and the
multiple MP3 ID3 variants.
Add feature to map properties of different formats to common ontology.
Make a variation on scan-mediafiles with a fold interface, such as for various ways of getting
incremental results, including if the scanning is in concurrent thread or
process.
7 History
Version 2:1 — 2016-03-02
Version 2:0 — 2016-02-27
Moving from PLaneT to new package system.
Fixed test cases in "mediafile-exif.rkt". Old change, perhaps unreleased.
Version 1:0 — 2012-11-27
8 Legal
Copyright 2012, 2016 Neil Van Dyke. This program is Free Software; you can
redistribute it and/or modify it under the terms of the GNU Lesser General
Public License as published by the Free Software Foundation; either version 3
of the License, or (at your option) any later version. This program is
distributed in the hope that it will be useful, but without any warranty;
without even the implied warranty of merchantability or fitness for a
particular purpose. See http://www.gnu.org/licenses/ for details. For other
licenses and consulting, please contact the author.
Test files from exif.org (used with permission) and/or that are in the public
domain might also be included with this software, and no copyright on those
test files is claimed by the author of this software.