Digital Data: Preservation and Re-Use

SAA 2000



Digital Archiving Pilot Project for Excavation Records (DAPPER)

Dr. Damian Robinson
Archaeology Data Service
29th February 2000


Introduction

Archaeologists are creating digital data in the course of their work, and its preservation is as significant to the future of the archaeological record as the preservation of conventional finds and paper archives. Modern excavations create huge amounts of digital information. Whether it is the on-site recording of the archaeology, specialist databases created during post-excavation or publication-standard interpretative maps and plans, digital information can be created at every stage from assessment to publication.

Traditionally the entire archive would have been transferred to a museum at the end of a project's life, but in a recent survey into the state of museum archaeological archives in England, Swain (1999, 47) noted that "most museums do not have the correct technology to store, access and curate in the long-term those archives for which computer files form an important part." The Swain report also illustrated that little digital material was being transferred to museums for archival purposes. This comment is echoed in Strategies for Digital Data (Condron et al 1999, 29-32, and Figure 6.5), which showed that the majority of extensively digitised projects are either retained by their creators or are in the hands of local government organisations. Strategies for Digital Data, however, also highlighted the inadequate archival policies of these organisations (Condron et al 1999, 33-39). Archaeologists are in danger of losing digital material, much of which is central to the archaeological record.

The Digital Archiving Pilot Project for Excavation Records (DAPPER) arose from these concerns. DAPPER was a collaborative venture between the Archaeology Data Service (ADS), English Heritage (EH), the Museum of London Archaeology Service (MoLAS) and the Oxford Archaeological Unit. The Pilot Project aimed to:

Choosing projects to archive

In order to demonstrate the concept and potential of this new form of information provision it was necessary to choose the excavations to archive with care. After a lengthy consultation period two reasonably large and 'high profile' sites were identified as most suitable: the Royal Opera House excavation by the Museum of London Archaeology Service and Eynsham Abbey, excavated by the Oxford Archaeological Unit.

Excavations at Eynsham Abbey
Between 1989 and 1992 the Oxford Archaeological Unit (OAU) undertook a full excavation of part of the precinct of the medieval abbey, revealing part of the cloister and south ranges, kitchens and domestic buildings. Evidence for earlier activity included the 11th century Anglo-Saxon abbey and earlier Minster, and a Bronze Age enclosure.

Excavations at The Royal Opera House
Following exploratory work in 1995 full-scale excavations were undertaken in 1996 by MoLAS at the site of the Royal Opera House in Covent Garden, Greater London. The excavation examined the largest area of the Saxon trading port of Lundenwic so far to become available and has provided a wealth of information on the form and economy of this settlement.

Key factors in identifying appropriate projects

The contents of the archives

It is important to note that in neither case was the digital archive planned as part of the dissemination phase of the project, and that both Eynsham Abbey and the Royal Opera House will be published as traditional monographs. The resultant digital archives are, however, remarkably similar in their content, because both originate in large-scale, well-funded post-excavation projects. Each archive consists essentially of the digital residues of the post-excavation process and represents a digital version of the 'research archive'.

Eynsham Abbey
The project was never designed as an exercise in digital data collection or management, but rapid changes in computer technology in the last five years have meant that the project developed in this direction.

The downloadable archive contains text files, databases made available as comma-delimited text, JPEG images, a 3D reconstruction of the medieval abbey available in DWG or DXF format (i.e. AutoCAD) and a number of digitised drawings, also available as DWG or DXF files. Image files have also been deposited in both preservation and dissemination formats.

The contents of the Eynsham digital archive can be summarised as follows:

Dataset documentation
  • Project background
  • Excavation - aims, methodology
  • Post-excavation - phasing, structures
  • Specialist report summaries and database documentation, bibliographies

Text files
  • Project documentation

Delimited text files
  • Architectural stone database
  • Context database keywords
  • Glass lead cames database
  • Vessel glass database
  • Window glass database
  • Lead objects database
  • Post-Roman pottery database
  • Environmental sample register
  • Environmental sampling results register
  • Small finds database
  • Parallels for small finds database
  • Structures database
  • Tile database

Drawing files
  • Sections
  • Phase plans
  • Structure plans
  • Site plans
  • 3-D reconstruction

Images

Royal Opera House
All data from the Royal Opera House project was collected in computer-based formats. This excavation was the first for which MoLAS used its state-of-the-art integrated database and GIS recording system. The on-line archive is currently incomplete: only those elements that will remain unaltered have been deposited. These comprise Geographical Information System (GIS) files, interpretive groupings and data files consisting of context, artefact and ecofact attribute sets. The remainder of the digital archive, consisting of the higher-level interpretative datasets, will be deposited when the publication is complete. A much wider variety of queries will be possible with the full dataset, permitting assemblages to be related directly to buildings and periods of activity.

Dataset documentation

Text files
  • Group descriptions

Delimited text files
  • Field records by context
  • Basic interpretation
  • Roman pottery
  • Post-Roman pottery
  • Saxon pottery fabrics
  • Registered finds
  • Coins
  • Loom weights
  • Expansion codes for registered finds
  • Assessment level animal bone
  • Post assessment animal bone
  • Animal bone codes
  • Animal species expansion codes
  • Processed environmental samples
  • Botanical remains - analysis
  • Botanical expansion codes

ArcView themes
  • Context groups
  • Trench edges
  • Unreal edges
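
To illustrate the kind of query that relates finds to interpretive groups, the following is a minimal Python sketch that joins two comma-delimited tables on a shared context number. The field names (context, group, find_id, material) and all of the sample rows are invented for illustration; they are not taken from the Royal Opera House archive, whose actual fields are described in its dataset documentation.

```python
import csv
import io

# Hypothetical extracts standing in for two downloaded delimited text files.
# Field names and values are invented for illustration only.
CONTEXTS = """context,group
1001,G1
1002,G1
1003,G2
"""
FINDS = """find_id,context,material
F1,1001,bone
F2,1003,ceramic
F3,1002,ceramic
"""

def finds_by_group(contexts_text, finds_text):
    """Return {group: [find_id, ...]} by joining finds to groups on context."""
    # Build a lookup from context number to interpretive group.
    group_of = {row["context"]: row["group"]
                for row in csv.DictReader(io.StringIO(contexts_text))}
    grouped = {}
    for row in csv.DictReader(io.StringIO(finds_text)):
        # Finds whose context has no group are collected under "ungrouped".
        group = group_of.get(row["context"], "ungrouped")
        grouped.setdefault(group, []).append(row["find_id"])
    return grouped

result = finds_by_group(CONTEXTS, FINDS)
print(result)
```

With real data the same join could be carried out in a database or GIS package; the sketch only shows that plain delimited text is sufficient to rebuild such relationships.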

Accessing the archives

DAPPER has led the way in developing digital archives, but if the archives were difficult to find or impossible to use, the project would ultimately be a failure. Consequently the project also concentrated on assessing the utility of the digital archives and gathered feedback from the user community, so that the archives could be reviewed at an early stage and modified prior to the public launch.

Resource discovery
The Eynsham Abbey and Royal Opera House excavation archives have the potential to be accessed in one of three ways:

  • Using the ArchSearch catalogue. The site-level metadata records can be searched: for example, a Keyword search for 'Eynsham' and 'Abbey' returns 51 records. One of these records has 'Excavation Archive' included in the Name of Resource field. This record leads to a more detailed catalogue record describing the site, its excavators and its digital archive, which may be accessed via a hotlink entitled 'Resource Details' (Figure 1). Equally, a user interested in GIS resources could do a What Search for 'GIS'. This would return five records, one of which includes 'Excavation Archive' in its title to indicate that GIS resources are available from the Royal Opera House archive.
Figure 1: A section of the metadata record for the Royal Opera House excavation

  • In the left hand frame of ArchSearch there is a Project Archive button (Figure 2), which leads the user to a list of projects with downloadable resources.
Figure 2: The Project Archive button

  • Although not yet implemented by the OAU or MoLAS, it is possible for both units to hot-link directly from their own web pages to their project archives.

Resource delivery
One of the major data delivery issues concerned the user interface: whether a complex, visually pleasing, heavily packaged interface should be developed, or whether the raw data should simply be presented. The heavily packaged interface would be user friendly and consequently might increase access to the data. This approach, however, was rejected as it would have been prohibitively expensive to produce, would have set a precedent for all subsequent deposits and the heavily packaged nature of the data would probably present migration problems. More importantly perhaps, the packaging of data may restrict its potential for reinterpretation. Consequently it was decided to present the data in standard formats with sufficient on-line support documentation to enable the re-use of the data. This was regarded as the ideal, sustainable, cost-effective solution.

The resource delivery issues brought into focus the question of who the data is aimed at. The ADS receives its core funding from the Higher Education sector and consequently it is the scholarly community at which its data resources are aimed. Equally the ADS user community, as reflected in Strategies for Digital Data (Condron et al 1999), clearly wanted raw, rather than packaged, data. The simple, unpackaged DAPPER interface consequently reflects the professional archaeological community's desire for raw data.

The ADS has recently submitted a proposal to the Joint Information Systems Committee to fund the creation of an online tutorial on the reuse of the digital resources from the DAPPER excavation archives. This will involve the re-purposing of the raw data into Internet-deliverable teaching modules and will expose the next generation of archaeological students to the manipulation and re-use of digital excavation archives.

The cost of digital archiving

Digital archiving entails expenditure, both in the setting up and running of a digital archive and in preparing, depositing, accessioning and curating the information resources. If archiving of the digital component of fieldwork is to become a part of the everyday practice of archaeology, DAPPER also had to demonstrate the commercial case for such archives. The pilot has allowed us to quantify the effort and costs to both units and archives in the preservation and dissemination of computer-based data.

In a recent survey into the use and re-use of digital information in archaeology (Condron et al 1999, 70-71), the respondents clearly accepted the principle of cost recovery for digital archives, although the notion of charging re-users for access to information was firmly rejected. Given this clear steer from its user community the ADS has formulated and implemented a charging policy.

The central tenets of this policy are that:

The experiences gained from DAPPER have proved invaluable in the development of the ADS charging policy (ADS 1999a). More information on the charging policy, including a detailed breakdown of the costs and charging categories, is available online at http://ads.ahds.ac.uk/project/userinfo/charging.html

Digital archiving cost of Eynsham Abbey            £6,484.05
Cost of Eynsham Abbey excavations                  £290,000
Cost of Eynsham Abbey post-excavation              £250,000
Proportion of budget spent on digital archiving    1.2%

Digital archiving cost of ROH Phase 1              £655
Cost of Royal Opera House project (total)          £650,000
Proportion of budget spent on digital archiving    0.1%
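
The proportions quoted above can be checked with a short calculation. The Python sketch below uses only the figures reported in the cost tables; it assumes that the Eynsham Abbey project budget is the sum of the excavation and post-excavation costs, which is an inference from the 1.2% figure rather than something stated explicitly.

```python
# Figures as reported in the cost tables above (pounds sterling).
eynsham_archiving = 6484.05
# Assumption: total budget = excavation cost + post-excavation cost.
eynsham_total = 290_000 + 250_000
roh_archiving = 655
roh_total = 650_000

def archiving_share(archiving_cost, project_cost):
    """Return the digital archiving cost as a percentage of the project cost."""
    return 100 * archiving_cost / project_cost

eynsham_pct = archiving_share(eynsham_archiving, eynsham_total)
roh_pct = archiving_share(roh_archiving, roh_total)
print(f"Eynsham Abbey: {eynsham_pct:.1f}%")
print(f"Royal Opera House (Phase 1): {roh_pct:.1f}%")
```

Under that assumption the calculation reproduces the reported 1.2% and 0.1%.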

The Eynsham Abbey and Royal Opera House digital archives are both essentially research archives, yet they differ quite markedly in their cost. This is clearly due to the different ways in which the two sites were analysed during post-excavation. Although the stratigraphic drawings of both sites are available digitally, the use of GIS by MoLAS ensured a deposit of 3 files, as opposed to the 404 Eynsham Abbey site plan CAD files. The 80 structure plans, 15 phase plans, 2 sections, a trench plan, a composite plan and a 3-D reconstruction of the Abbey also swelled the number of Eynsham Abbey CAD files. Consequently the cost of archiving is closely related to the presentation of the drawing information.

It should not be assumed, however, that GIS is preferable to CAD because it is more cost effective to archive. Both file formats have their own distinct advantages and disadvantages. The GIS structure of the Royal Opera House archive, for example, would require specialist software and training to facilitate effective understanding and reuse of the data. It is consequently difficult for a user with little knowledge of GIS to use the archive. On the other hand, the Eynsham Abbey archive is structured like a traditional archaeological publication, with its separate structure and phase plans. With appropriate viewers, the CAD files can be accessed via the Internet. Consequently the CAD presentation of the data can open up access to the information for non-expert users. Yet for highly experienced users the MoLAS GIS offers a very powerful analytical tool, with interpretative functionality that is beyond CAD. It is recommended that units continue to undertake their post-excavation work in the ways most appropriate to them and that the production of a comprehensive digital archive should come before matters of cost.

In its Charging Policy the ADS states that "it is usual for digital archiving costs to add an overhead of less than 5% to the total project budget" (ADS 1999a). DAPPER has illustrated that for projects of the scale of Eynsham Abbey and the Royal Opera House, the actual cost of digital archiving can be as little as 1% of the total project budget. It must be recognised, however, that this 1% figure relates only to large-scale excavation projects. For medium to smaller-scale projects with a variety of files, the actual cost of archiving will represent a greater proportion of the total project budget, in the region of 3-5%.

Contributing to the debates on digital project archiving

The Eynsham Abbey and Royal Opera House excavations were both extensive and well funded post-excavation projects that produced research level archives. It is recognised, however, that not every archaeological project merits this level of expenditure and archiving. Consequently DAPPER enabled the ADS to consider the different levels of digital archiving. Four distinct types of digital archive can be produced (Figure 3). These archives are:

Figure 3: A model for the production of archaeological digital archives

Conclusion

DAPPER has delivered:

The archiving and Internet delivery of digital project data
DAPPER enabled the Archaeology Data Service to develop and implement the world's first online digital project archive.
DAPPER has provided some 'start-up' funding to help with the development of the Project Archives section of ArchSearch. It provided the necessary digital infrastructure that subsequent deposits will benefit from; for example, the creation of the Download support pages is a vital component of the digital archive.
DAPPER gave the University-based service a valuable introduction to commercial archaeology and has enabled it to develop its Charging Policy (ADS 1999a).
DAPPER also gave the ADS and the units the chance to discuss some of the inevitable queries that the units had concerning digital archives and enabled the ADS to reassess its policies and procedures in the light of unit concerns.
DAPPER provided the necessary funding and time for the Museum of London Archaeology Service and the Oxford Archaeological Unit to prepare their data for deposit and to reassess and develop their post-excavation procedures and documentation.

The costs of digital archives
DAPPER has demonstrated that an effective digital archive can be delivered for a fraction of the total project cost. For substantial fieldwork projects the DAPPER project suggests that the estimated archive deposition cost (up to 5% of the total project budget) as reported in the ADS Charging Policy (ADS 1999a) can be revised downwards, to around 1% or less of the total project budget. These costs need to be written into the original budgetary specifications.

Evaluating user reactions
The results so far from DAPPER have been very encouraging. The Royal Opera House archive was released in August, followed by Eynsham Abbey in October 1999. Following the launch of the excavation archives the ADS website has received an increased number of visitors and has experienced record-breaking user figures. Over 8,000 people have visited the project archives and either accessed or downloaded approximately 10,000 separate files within them. People are downloading and starting to reuse the data: one researcher remarked that the archives would prove very useful in his research, and that the time and money he expended on downloading the entire Royal Opera House archive and reassembling it into its GIS form were well worth it.

The DAPPER resources are also being used as teaching datasets in the Archaeological Information Systems Master's degree run at the University of York. Other universities, such as Durham and Glasgow, have also expressed an interest in using the DAPPER resources.

DAPPER has been a highly successful pilot project. It represents a new beginning for both the archiving and the dissemination of digital archaeological information.



Return to index of papers for April, 2000, SAA Session "Digital Data: Preservation and Re-Use"


