A digital library is a collection of digital books, digital articles, digital films or digital documents in other media.
The American government funded several digital library research and development projects, on the assumption that libraries were central to the information superhighway. Most of these projects designed and built software architectures that are too bandwidth-hungry, too proprietary, or flexible in the wrong dimensions for use on the Internet. Examples include the Alexandria Digital Library Project http://alexandria.sdc.ucsb.edu/ and Persival http://www.cs.columbia.edu/diglib/PERSIVAL/. Successes include the Perseus Project http://www.perseus.tufts.edu/. Interestingly enough, none of these has achieved quite the success of a number of essentially unfunded projects such as Project Gutenberg http://promo.net/pg/ or Greenstone http://nzdl.org.
Many digital libraries are modeled on traditional libraries, whose organisation is the accumulated result of about 3000 years of literary development. Documents are arranged in collections, each with a collections policy, which is an applicable test for inclusion in the collection (that is, the collections policy can be used to determine whether or not a document should be included in the collection). Finding aids, such as subject, author and title indexes (recently combined into a computerized library catalog), enable users to find individual documents within a collection to answer specific information needs. An information need can be anything from needing to know how many tons of coal were mined in Fiji last year (none), to finding a copy of Hamlet, to looking for pornography.
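The relationship between a collection, its collections policy and its finding aids can be sketched in a few lines of Python. This is only an illustrative model, not a description of any real digital library system: the names Document, Collection, drama_only and the find_by_* methods are all hypothetical, and the policy is assumed to be a simple yes/no test on a document's subjects.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Callable


@dataclass(frozen=True)
class Document:
    """A hypothetical catalog record for one document."""
    title: str
    author: str
    subjects: tuple[str, ...]


@dataclass
class Collection:
    """A collection guarded by a policy and served by three indexes."""
    # The collections policy: a test that decides whether a document belongs.
    policy: Callable[[Document], bool]
    documents: list[Document] = field(default_factory=list)
    # Finding aids: one index per access point (author, title, subject).
    by_author: dict = field(default_factory=lambda: defaultdict(list))
    by_title: dict = field(default_factory=lambda: defaultdict(list))
    by_subject: dict = field(default_factory=lambda: defaultdict(list))

    def add(self, doc: Document) -> bool:
        """Add doc if the policy accepts it; return whether it was added."""
        if not self.policy(doc):
            return False
        self.documents.append(doc)
        self.by_author[doc.author.lower()].append(doc)
        self.by_title[doc.title.lower()].append(doc)
        for subject in doc.subjects:
            self.by_subject[subject.lower()].append(doc)
        return True

    def find_by_author(self, author: str) -> list[Document]:
        return list(self.by_author.get(author.lower(), []))

    def find_by_title(self, title: str) -> list[Document]:
        return list(self.by_title.get(title.lower(), []))

    def find_by_subject(self, subject: str) -> list[Document]:
        return list(self.by_subject.get(subject.lower(), []))


def drama_only(doc: Document) -> bool:
    """Assumed example policy: only documents catalogued as drama."""
    return "drama" in doc.subjects


hamlet = Document("Hamlet", "William Shakespeare", ("drama", "tragedy"))
coal_report = Document("Fiji Coal Production", "Anonymous", ("mining",))

plays = Collection(policy=drama_only)
plays.add(hamlet)        # accepted: satisfies the policy
plays.add(coal_report)   # rejected: fails the policy
```

In this sketch, a user with the information need "a copy of Hamlet" would call plays.find_by_title("Hamlet"), while the coal report would have to be sought in a different collection whose policy admits it.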
Digital libraries are currently an active subject of research in a number of areas, including text mining, natural language processing, classification theory, the applicability of traditional classification systems (Library of Congress Classification System, Dewey Decimal Classification and Relative Index, ACM Classification Scheme, Harvard classification system) to digital media, and the automatic generation of hypertext from inferred rules.
Economically, digital libraries are going to be very important in the current renegotiation of the economics of the publishing industry. Historically, libraries have been funded by consumers of texts and have acted in their interests, and to a large extent this is still so (consider public libraries and university libraries); they have also been champions of free speech and copyright ``fair use'' exemptions. Moving away from a model where `copy' has a meaningful definition calls for new ways of remunerating authors, editors, illustrators and other workers in the making of books. The move away from physical books is also likely to upset those with a strong attachment to books as physical objects.
Haunting all digital libraries is the specter of the Library of Alexandria, whose burning is essentially the defining fact of the history of librarianship. To some extent, much of Western civilization's obsession with buried, arcane and hidden knowledge stems from a lingering cultural fear that in the barbarian founding of Western civilization we burnt an irreplaceable record of who we are, where we've come from and where we're going. The stories of Jorge Luis Borges illustrate this fear very well. The Library of Alexandria is the standard by which all collections of information are measured; digital libraries come no closer than traditional libraries, but are perhaps closing faster.