Collection: INTERAMER
Number: 71
Year: 2002
Author: Johann Van Reenen, Editor
Title: Digital Libraries and Virtual Workplaces. Important Initiatives for Latin America in the Information Age
Creation and conversion of data
The most significant contribution
of digital libraries is the creation of digital content, whether “born
digital” or converted from print-based formats. Several representative
digital library projects have been initiated to advance developments
in both arenas. Noerr’s Digital Tool Kit and Fox’s Chapters 4 and
5 of this book provide extensive descriptions of current projects.
Most library ventures into the creation
of digital libraries include the conversion of local collections from
print-based formats to digital formats. There are two major methods
for conversion of text-based collections: manual re-keying of text
and Optical Character Recognition (OCR) scanning. Before any text
is converted, a document structure or schema must be determined and
a markup method developed. The application software used
to produce the collection must also be considered.
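As a minimal sketch of what deciding on a document structure before conversion can look like, the Python example below wraps converted text in a small XML record and rejects records that lack required header fields. The element names (`document`, `header`, `body`, `p`) and the list of required fields are assumptions made for illustration, not a schema recommended by this chapter; a real project would adopt or adapt an established schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical schema: these field and element names are illustrative only.
REQUIRED_FIELDS = ("title", "author", "year")


def build_record(fields, body_paragraphs):
    """Wrap converted text in a simple XML structure with a checked header."""
    missing = [name for name in REQUIRED_FIELDS if not fields.get(name)]
    if missing:
        raise ValueError("missing required fields: " + ", ".join(missing))

    document = ET.Element("document")
    header = ET.SubElement(document, "header")
    for name in REQUIRED_FIELDS:
        ET.SubElement(header, name).text = fields[name]

    body = ET.SubElement(document, "body")
    for paragraph in body_paragraphs:
        ET.SubElement(body, "p").text = paragraph
    return document


record = build_record(
    {"title": "Sample Report", "author": "A. Author", "year": "2002"},
    ["First paragraph.", "Second paragraph."],
)
xml_text = ET.tostring(record, encoding="unicode")
print(xml_text)
```

Fixing the structure first means that every item, whether re-keyed or scanned, ends up in the same shape and can be indexed by the same software.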
If re-keying of the text
is chosen, the materials are removed from their shelves or cabinets,
data entry is performed, and errors are detected and then corrected. This
process is labor intensive and slow, and it can be expensive if executed
in-house by the library. Commercial service bureaus or conversion
vendors with highly skilled data entry operators located in several
countries can provide cost-effective, quality conversions. Typically,
re-keying is considered to be the most expensive means of conversion.
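The chapter does not say how keying errors are detected. One common approach, assumed here purely for illustration, is double keying: two operators key the same text independently and the two versions are compared. The sketch below uses Python's standard `difflib` module to list the lines on which the two passes disagree; the function name and sample text are hypothetical.

```python
import difflib


def keying_discrepancies(first_pass, second_pass):
    """Return (line_number, first_lines, second_lines) where two keyings differ.

    line_number is 1-based and refers to the first pass.
    """
    first_lines = first_pass.splitlines()
    second_lines = second_pass.splitlines()
    matcher = difflib.SequenceMatcher(a=first_lines, b=second_lines, autojunk=False)

    discrepancies = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue  # both operators keyed these lines identically
        discrepancies.append((i1 + 1, first_lines[i1:i2], second_lines[j1:j2]))
    return discrepancies


# Sample text: the second operator mistyped "scanning" on line 2.
first_pass = (
    "Digital libraries create content.\n"
    "OCR scanning is an option.\n"
    "Training is a must."
)
second_pass = (
    "Digital libraries create content.\n"
    "OCR scaning is an option.\n"
    "Training is a must."
)

for line_number, first_text, second_text in keying_discrepancies(first_pass, second_pass):
    print(line_number, first_text, second_text)
```

With the sample text, only line 2 is reported, so a proofreader needs to check a single line against the original rather than reread the whole page.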
OCR scanning is a viable option
for some text data conversion projects. Rather than re-keying the
text, scanners are used to “read” the characters and convert them
into digitally encoded text. Materials to be scanned must be
removed from their storage containers. Bound materials may need to
be photocopied or unbound. Scanning speeds can vary depending on the
capacity of the scanners and the PC hardware and software selected.
The quality of the original document ultimately will determine the
quality of the scanned document.
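One way to put a number on the quality of OCR output, offered here as an assumption and not as a method described in this chapter, is to compare a sample of the OCR text with a hand-checked transcription and compute a character error rate: the edit distance between the two divided by the length of the checked text. The sketch below implements this in plain Python; the function names are hypothetical.

```python
def edit_distance(reference, candidate):
    """Levenshtein distance: minimum insertions, deletions and substitutions."""
    previous = list(range(len(candidate) + 1))
    for i, ref_char in enumerate(reference, start=1):
        current = [i]
        for j, cand_char in enumerate(candidate, start=1):
            cost = 0 if ref_char == cand_char else 1
            current.append(
                min(
                    previous[j] + 1,         # character missing from candidate
                    current[j - 1] + 1,      # extra character in candidate
                    previous[j - 1] + cost,  # match or substitution
                )
            )
        previous = current
    return previous[-1]


def character_error_rate(reference, ocr_output):
    """Edit distance divided by the length of the hand-checked reference."""
    if not reference:
        raise ValueError("reference text must not be empty")
    return edit_distance(reference, ocr_output) / len(reference)


# The OCR engine read the letter "l" as the digit "1": one error in seven characters.
print(character_error_rate("library", "1ibrary"))
```

Measuring a small sample from each batch gives an early indication of whether poor originals will need re-scanning, manual correction, or re-keying.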
Documents that
include images (e.g., photos, drawings, maps, graphs) must be converted
using scanners and document imaging techniques. Integrated commercial
packages bundle hardware (servers, scanners, and workstations)
with the software needed to index and retrieve the processed
collections.
Considerable progress has been
made in recent years in the functionality and productivity
of conversion techniques. Open standards should be used where available,
affordable and feasible. Avoid investing in conversions that
leave data in long-term archival storage in proprietary data formats
such as Adobe’s Acrobat. Will a reader program exist in 20 years that
can read that data? Or will that data have to be repeatedly converted
to keep pace with future versions of reader programs?
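A policy of this kind can be checked mechanically. The sketch below, in Python, flags files whose extension is not on a project's list of formats accepted for archival storage. The list shown is a hypothetical placeholder, not a recommendation from this chapter; each project would need to agree on its own list and keep it under review as formats and standards change.

```python
from pathlib import PurePath

# Hypothetical policy list; replace with the formats your project has agreed on.
ACCEPTED_FORMATS = {".txt", ".xml", ".sgml", ".tif", ".tiff", ".png"}


def flag_for_review(file_names, accepted_formats=ACCEPTED_FORMATS):
    """Return the file names whose extension is not on the accepted list."""
    flagged = []
    for name in file_names:
        suffix = PurePath(name).suffix.lower()  # "" when there is no extension
        if suffix not in accepted_formats:
            flagged.append(name)
    return flagged


sample_files = ["map01.tif", "report.pdf", "letter.XML", "notes"]
print(flag_for_review(sample_files))
```

Under the placeholder list, the sample run flags `report.pdf`, following the text's Acrobat example, and `notes`, which has no extension at all.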
Cornell University Library has published
a Digital Imaging Tutorial in both English and Spanish (Cornell 2000).
The Northeast Document Conservation Center conducts the School
for Scanning; see its web site for more information (http://www.nedcc.org).
Saffady (1999, p. 291) and Lesk
(1997, p. 48) offer comprehensive chapters on digital libraries and
text conversions. Archive Builders provides extensive information
on document conversion techniques; see its web site for more information
(http://www.ArchiveBuilders.com).
While the focus of this chapter
has been on the technical aspects of digital libraries, perhaps the
most valuable investment an organization can make is in the human
resources necessary to build and support the hardware infrastructure
and to create and convert content. The recruitment and hiring
of knowledgeable technicians is crucial, but the commitment does not
end there. Ongoing training is a must, and staff will need to survey
technological developments and new digital library projects continually to
maintain an awareness of new techniques.