Harrison Eiteljorg, II
(See email contacts page for the author's email address.)
A great many papers at the recent CAA (Computer Applications and Quantitative Methods in Archaeology) meetings in Williamsburg, VA, dealt with attempts to construct web-based data resources for archaeology. Much good, careful work was in evidence in those papers, and a great deal of time and effort was also clear. The papers generally explored the technologies required and the systems that must be in place to make such web resources function properly.
Building such web resources is, like most academic exercises, a process that depends upon determining for whom you are working and what your aims are. That seems obvious; we don't give the same presentation to graduate students that we give to elementary-school children; we don't speak to an audience of scholars as we do to one of interested amateurs. Similarly, we prepare a graduate-student presentation with specific aims, and we prepare a presentation for a general audience with aims that are very different, but nonetheless specific.
The papers at the CAA meetings, however, were so focused on technology issues that it too often seemed as if the presenters had not decided who made up the audience for the resources they were creating and, consequently, had also neglected to define their archaeological aims with the requisite care. Instead, there were too many talks about resources - either real or hypothetical, but mostly somewhere in between - that seemed to be based on the notion so widely quoted from the movie Field of Dreams, "If you build it, they will come." (For those who are not familiar with the movie, it concerns a farmer from the American Midwest named Ray Kinsella who hears a mysterious, disembodied voice saying to him, "If you build it, they will come." He builds a baseball diamond in the middle of his cornfield, and the ghosts of old baseball players emerge from the cornfield to play baseball games for crowds of people who converge on the farm, apparently drawn by some invisible force.) In the movie the audience does not know who is meant by "they." But the term, in the end, means either the ghosts of baseball players past or the fans who will come to watch them -- or both. The viewer cannot be sure whether or not Ray Kinsella, the hero, understood the "they" reference, but he clearly understood the "it" reference; it was a baseball diamond.
It seemed to me in Williamsburg that, of those who were discussing the provision of web resources, too many focused on the technology without discussing explicitly either the "they" (who will come) or the "it" (exactly what should be built). It was as if, to continue the analogy to the movie, Ray Kinsella knew he could use his tractor and plow for something and began fiddling about with them because he had a strong feeling that something desirable would come of it. But Ray Kinsella knew what he needed to build, and he built it, a baseball diamond.
Putting material on the web seems to be the natural end product of good research. In addition, sharing data has become everyone's idea of an imperative. The problems lie in those notions of audience and aims. The intended audience is one determinant of the way a web resource and access thereto are structured. The aim of the project is another. Both are critical. They are, however, rather easily ignored when collaborative work between archaeologists and computer scientists yields a desire to explore possibilities rather than a need to accomplish a specific end result. For example, papers on natural language processing for harvesting data from text documents were very interesting discussions of computer-science problems when dealing with the humanities, but, absent some specific users and needs, they did not seem to me to point toward any useful conclusions concerning archaeological use of the technology. How, after all, does one judge such a project without some measure of its utility? And how does one measure utility without a well-defined aim?
Of course, this kind of exploration can easily happen when you have people (i.e., computer-science graduate students) looking for ways to demonstrate new technologies and doing so for a community that has vague and unspecified needs for those technologies (i.e., archaeologists). When the audience's only desire is to see what might emerge, and the designers are simply interested in trying something new with existing technology, the result is not likely to be useful except as a proof-of-concept exercise.
So it seemed that, in Williamsburg, there were many discussions of the semantic web, Linked Data, and the like, all of which lead to one or another approach to presenting archaeological information on the web. What was missing time and again was a clear sense of an audience for the resource and a defined archaeological intent. That does not mean the papers were uninteresting. The natural language processing papers, for instance, had considerable intrinsic interest. Rather, it is meant as a reminder that, as Robert Burns famously pointed out, "The best laid schemes o' Mice an' Men,/ gang aft agley." It is all too likely that those schemes for making web resources will go awry if they are made with scant attention to real-world users or archaeological aims and therefore without mechanisms for measuring utility.
In fact, of course, the presenters may have had a clear sense of audience and of purpose. Their audiences may have been computer-science experts, and their purposes may have generally been to show that some specific approach would work to present archaeological data on the web. However, general notions concerning data access and web-based data resources do not advance the archaeological cause; we need to see valuable resources for a real audience in order to offer a proper appraisal.
An example of what would seem to me to be a carefully defined project should illustrate my point. An aggregated data set of all ancient ship remains, let us say from the earliest found to the end of the Byzantine period, could be conceived to be of value to serious students (not limited to professional scholars) of the history of shipping and/or ship-building. Setting that group of users as the audience would, in turn, help to lead the designer of such a resource to decide what information should be included, if available. For instance, ribs-first versus shell-first construction should be included as a data item. Having the audience defined in that way would also tend to discourage the addition of much information (other than bibliographic) about excavation of the ships in question, and a more careful definition of the audience would be needed to determine whether or not to include cargo information that might be relevant to trade routes. The value of creating such a resource should be real, and, precisely because the resource would have been designed to serve a defined set of needs, many ways of testing and improving it could be devised so that, over time, its creators would be able to learn important lessons about the utility of this specific resource and, one would hope, some generalizable things about web resources for archaeology. Such a project would be far more than a proof of concept; it would be a proof of the concept put into practice.
Building web resources for archaeology requires so much time, skill, effort, and expense that it should not be just an experiment by and for computer specialists to prove general possibilities. If a resource-creation project is intended to be an archaeological project as well as a computer-science one, there should be an attempt to solve a particular, defined problem for a particular, defined group of users. We need the proof that is in the pudding, not just the general concept, if we are to be sure that the benefits are real, tangible, and in keeping with the costs.
Readers of this newsletter are likely to be familiar with TED (Technology, Entertainment, Design) and the presentations on the TED website by Tim Berners-Lee and Hans Rosling (the second of two presentations by Dr. Rosling). Their presentations are relevant here. Mr. Berners-Lee made an argument that may be persuasive, but it lacked both a specific audience and a specific intent. Dr. Rosling, on the other hand, similarly extolled the need for openly accessible data resources -- but with careful and specific uses of those data. He had explicit needs and desires, and he was able to use the data he sought to make clear and cogent points about matters of concern to him. We need the visionaries like Mr. Berners-Lee to provide the general ideas and to encourage us to look beyond the confines of our present world, but to make those ideas concrete and to prove their utility to a specific audience with real needs we need Dr. Rosling and others like him.
Archaeology has some notable projects that demonstrate, as Dr. Rosling did, the value of web access to data. (The ADS comes most quickly to mind, and I will not try to present a longer list, lest I omit important and valuable resources.) At present, though, it seems to me that archaeology needs many more examples of real value being provided in the form of web-based data resources that answer concrete needs for real scholars and scholarship -- and fewer examples of what we might be able to do with one or another bit of technological wizardry for an undefined audience and an unspecified purpose. This is particularly true when aggregated data are the resources in question since the process of data aggregation is so complex and necessarily varies enormously from data of one type to data of any other type.
-- Harrison Eiteljorg, II
(A relevant item, "An Illustration or a Data Source?" by the same author, may be found in this issue of the newsletter. It concerns the differences in data-representation techniques that arise from different conceptions of ultimate use.)