On the merits of dust

Posted on September 3, 2012 by Brodie Waddell

Mark’s recent post – and the related questions that come up at EMOB, Tim Hitchcock’s blog, and elsewhere – got me thinking a bit more ‘about the relative merits of (cheap, easy and efficient) access to digitised primary sources on one hand, and to (often expensive, labour intensive and time-consuming) hands-on access to original materials on the other’.

This is something that I’m conflicted about too. On the one hand, I have an emotional and aesthetic preference for the dusty originals. On the other, I often find at least as much useful material in the clean, searchable digitised sources.

At a practical level, I’m inclined to throw in my lot with the digitisers. Wonderful resources like EEBO, ECCO, TCP, EBBA, BHO, OBO and LL opened up new worlds to me (especially when I was a student in Canada) and to many other scholars. Without them, much of my work would be impossible or, at the very least, about ten times slower.

An information related to the theft of two pewter pots, from the Middlesex Sessions Papers, dated March 1760, digitised at London Lives.

Nonetheless, I believe that there is another consideration that is rarely mentioned in discussions like these: there is an undeniable tendency for digitisation to reinforce existing biases in source use. Before digitisation began, people tended to use printed works more than manuscripts and to use southern English (especially metropolitan) archives more than distant archives. This makes perfect sense: if you are based at Oxford or Cambridge or flying into London from North America, why wouldn’t you focus on the sources accessible there? Digitisation has made this bias even stronger. Print has been digitised before mss and southern/metropolitan archives have digitised more than less central ones. (See, e.g., the sites mentioned above and also TNA and the ERO.)

What this means is that one often finds historians extrapolating from the same types of evidence, with the same innate biases, rather than drawing on anything even approximating a ‘random sample’. Indeed, I often find myself doing this, so I don’t blame anyone else for doing the same.

In contrast, I’ve been privileged to have had the opportunity (i.e. time, funding) over the last few years to trek regularly to a range of different county record offices and simply dive into their material for a particular period to see what I find. As a result, I feel like I’ve gained a genuinely stronger sense of what was going on than I would have had I been constrained by the limits of digitised material as it exists now or even as it continues to expand in the near future. I can see now that some previous historians may have mischaracterised events and periods purely because they were unable to explore a range of local material.

Obviously this isn’t something everyone, or even most historians, is able to do, so I unhesitatingly endorse all the good work that is going into digitising ever-more material and making it accessible to a much wider audience of researchers. Still, we must guard against the temptation to think that the great masses of sources that have been digitised somehow represent a more balanced source base merely because they are now so numerous. Biases remain and they may even be growing stronger.

PS: As Gavin Robinson is showing with his series blogging a soldier’s letters from the English Civil War, even when a manuscript source has previously been transcribed and printed (and will eventually be digitised), it’s often worth revisiting the original. Earlier editors sometimes made hilarious errors or took liberties with the text that can completely change the meaning.

14 thoughts on “On the merits of dust”

Lydia on September 4, 2012 at 7:44 am said:

An interesting debate. My PhD would have been impossible without digital resources (or at least, taken ten times as long and been 1/10th as good), so in theory I’m all for them. But I mostly use newspaper databases, which are keyword searchable. These are incredibly useful, and make a lot possible, but seeing the article in the context of the page, and seeing what else is going on, that may or may not relate to what I’m looking for, provides a wealth of material that you don’t get when you look at the digitised version. So reels of microfilm have to remain in the frame. I also agree with the points about bias – our reliance on digitised sources may mean we miss a lot. Plus the keyword searches are not 100% sound yet, so we miss things there too. For me, for example, realising that I needed to search for “lynch law” rather than “lynching” threw up rather more results – in fact, far more than I could get through. That’s the other problem with digitisation I suppose – will anyone ever *finish* a project, or will they just stop reading at a certain point?

- Brodie Waddell on September 4, 2012 at 3:58 pm said:
  
  I agree completely, Lydia, on the opportunities offered by digital resources. I too would have taken longer and produced a worse thesis if I hadn’t had access to EEBO, etc.
  
  You make a good point about newspapers. My impression is that they were under-used prior to digitisation but now they can be used almost too easily (e.g. bits taken out of context, etc.). In some ways, there is still a big difference between seventeenth-centuryists who use EEBO (not very keyword searchable) and eighteenth-centuryists who use ECCO and the Burney Collection (very searchable, but often patchy results due to poor OCR) and of course nineteenth-centuryists like yourself (very searchable and effective, but often overwhelming amounts of material). Each faces somewhat different ‘digital’ problems.
  
Matthew Jackson on September 4, 2012 at 8:03 am said:

Really interesting stuff, Brodie. Having spent nearly 2 months down in Bristol’s archive office, I certainly feel like I’m gaining a really clear sense of how people ticked in Bristol, who the persistent culprits were, who the detested neighbours were, who the most vociferous and lamented official was etc., etc. I know this isn’t really what you were hoping to evoke out of me as a reader, but I thought I’d leave that here nonetheless.

The question you raise about the innate bias in current digitised material is a provocative one, and honestly not a point that I’d taken time to reflect upon before now, so thanks. In twenty years’ time, I wonder what the balance between digital and manuscript usage will be? If there’s a huge swing to the former, which I expect there would be, then perhaps the best way forward within the humanities is to pump money into digitising as much of the county record office material as possible. This would hopefully remove a substantial chunk of the bias we’re speaking of here, and help to offer a more geographically balanced range of available digital sources. There is one major and one minor (at least that I can think of at 9:03) pitfall to my plan. 1. I’m not sure how much more money can be, and indeed whether there is any at all to be, invested into digitisation projects of that scale. Perhaps, once digitised, a subscription service could be introduced, though? 2. What would happen to county archives as a result? Genealogists and family historians are the essential core of attendees at the Bristol archive I’m about to visit after my morning cup of coffee, so I’m not sure that digitising late medieval and early modern legal material, for example, would be massively detrimental to archive attendance in that sense. But it seems a sad thought that these old bundles of folios would sit forever in partially damp, hidden and dark cellars waiting for the day that someone fills out a ticket with their name on it.

- Brodie Waddell on September 4, 2012 at 4:28 pm said:
  
  Yes, the possibility of immersion in the records for a particular place is something that can have real, practical benefits and that is something that only really works through physical visits to the archive. When digitisation does happen, it is nearly always limited to one or two particular series (e.g. a single court, etc.) though there are partial exceptions (London Lives draws on a range of material) and I’m aware of one attempt at total digitisation (Earls Colne, 1380-1854).
  
  Which brings me to your second point. I agree that digitising big chunks of local record office mss would be a great step, but I think your suspicion about funding is exactly right: it simply doesn’t exist on a sufficient scale. Even London Lives, which managed to go much further than most, is still not even close to comprehensive. It covers no more than a fraction of the metropolitan material for the eighteenth century. To find money to do the same thing for England’s dozens of county record offices is hard to imagine. Likewise, the National Archives generally only digitises upon request (and payment) by private individuals, making the images publicly available thereafter. This is a promising funding-model, but then there is a bias towards documents that are most interesting to genealogists, etc.
  
  That’s a bit of a depressing place to end, so I’ll just finish by saying that I think all of the current and forthcoming digitisation projects are great and will be immensely useful. We just need to always remember their limitations.
  
manyheadedhailwood on September 5, 2012 at 1:04 am said:

I’m really pleased to see that my post has sparked a bit of a debate, and reassured that most people seem to feel as conflicted as I do! There are a number of good points raised that I need to chew over, but I’d just like to cough up a couple of initial reactions.

First, I’m not convinced that digitisation necessarily reinforces existing biases. I’m thinking here of the digitisation of broadside ballads (primarily by EBBA), which has helped to breathe new life into a source that was previously accessible only in slightly unwieldy nineteenth-century printed editions. The fact that students can now approach many aspects of early modern life – such as alehouse-going, or marriage – through popular literature arguably helps to address an older bias for gleaning contemporaries’ attitudes towards these things from (often more authoritarian) printed conduct literature. At the least, it offers an extra lens through which to study these issues, and often a different perspective. It is possible though, as Brodie suggests in regard to newspapers, that they may become over-used because they are so easy to access.

Second, a greater reliance on visiting the record office is not necessarily a safeguard against getting a blinkered or unrepresentative view of the past. It is arguable that an earlier tradition of early modern English history based on very detailed local case studies that were painstakingly researched in the archives produced its fair share of distortions. Working closely on the records of one particular place can heavily colour the way you view early modern society, but your conclusions may not apply to other times and places. And counties that have more surviving material, or are more accessible, or even that have better facilities for research, will attract more attention and skew the historical narrative towards prioritising their story.

In other words, archival work can also provide distorting effects on the way we understand the past. Indeed, it seems to me that this is a broader problem we face with all of our sources, rather than something that is attributable to the division between digital and archival sources.

- Brodie Waddell on September 11, 2012 at 10:03 pm said:
  
  I think you’re right, Mark. The increasing availability and use of ballads since digitisation is a very strong counter-point to my suspicion that the process might enhance previous biases. Whereas before historians and students often relied on ‘canonical’ texts (Shakespeare and his ilk) that could be accessed through modern printed editions, now they’re able to draw on a whole universe of printed material through EEBO, etc.
  
  I’m also in agreement about the danger in believing that archival research is somehow immune to the temptation to extrapolate from one rich set of sources – e.g. a particular county, or court, or even a single village. A big collection of documents is unlikely to be very representative if it’s the product of a narrow base.
  
  But I stand by part of the post at least. The archives and record offices that are doing the most digitisation (e.g. The National Archives, London Metropolitan Archives, Essex Record Office) are exactly the same ones that were already well-used. Today, it is quite possible to write an article with a decent ‘archival base’ without ever leaving your desk. Frankly, that’s wonderful. I’m really excited about this progress and I do _not_ want to turn back the clock. Yet, it does mean that there will be (already is?) a rising proportion of scholarship that relies exclusively on those already well-studied places, a trend with real consequences.
  
  – Brodie
  
  - manyheadedhailwood on September 12, 2012 at 5:05 am said:
    
    You’re right Brodie, and that brings us back to your discussion with Matt above – it is hard to see a way out of this given that the funding for a wide-scale digitisation of county record office material isn’t there.
    
    On the other hand, perhaps we should see digital resources as a kind of ‘gateway’ form of research, rather than one that a new generation of scholars are becoming dependent upon. It seems to me to be a fairly common experience that people who cut their teeth researching with digital resources then develop a desire to complement their knowledge with archival research – and often in archives that are less well trodden. If so, we may be worrying more than is necessary.
    
    – Mark
  - Brodie Waddell on September 14, 2012 at 8:12 pm said:
    
    I love the idea of digital work as ‘gateway’ research. If I ever put together a grant proposal for a big digitisation project, I’ll be sure to include that…
    
    ‘This new resource will allow young people to experiment recreationally with ‘soft’ digital sources, which will become a ‘gateway’ for them to the ‘hard’ archival manuscripts that their newfound addiction will soon require.’
Laura Sangha on September 8, 2012 at 9:44 am said:

I have just been catching up with all of these discussions and have enjoyed them immensely – thanks to all involved! I just wanted to return to something that Matt (I think) pointed out in his response to Mark’s earlier post – the questions that you are asking of your sources are also a key issue here. Digitisation, as we are all aware, gives us the ability to ask research questions that would never have been attempted in an earlier era.

As a historian of religious cultures, when I was working on angels I wanted to uncover ‘popular’ belief – popular in the sense of opinions that were widely held, widely disseminated, and not just embedded in theological spheres – beliefs that seemed to have broader purchase in English life beyond the pen of a clergyman. Unfortunately, there is no angels archive, nor any source genre that is more likely to contain material relating to belief about angels.

Subsequently, for a large part of the time I was an ‘EEBO’ historian, as printed material was undoubtedly the source base most suited to my central research question, and digitisation meant that I could process enormous amounts of material quickly. Keyword searches threw up an enormous amount of information. But that was not the only material that I was systematically searching – I also read my way through primers, catechisms, scriptural commentaries, funeral sermons, and anything else that I thought might be relevant. Printed material was my chosen genre – it was most likely to show me when and how people came into contact with ideas about angels.

Similarly, when it came to writing about ‘popular’ (i.e. explicitly non-ecclesiastical, non-elite) belief, I turned to printed drama and ballads, but also personal diaries, travel literature, and visual culture for my answers. So I guess what I am trying to say is that a good, properly trained historian will always be aware of the potentials and shortfalls of their source type, and will treat it as such. Digitisation has almost added a new element to this, in a way. My research project was made ‘do-able’ by the database of printed material, but had the sources not been digitised they would still have been my sources; I just would have done a lot more travelling (and probably needed an entire career’s worth of research time). So to return to one of Brodie’s points, yes, perhaps this will encourage a tendency to utilise printed material as a source base. But that does not mean that the questions asked or the answers found will not change substantially along the way.

- Brodie Waddell on September 11, 2012 at 10:05 pm said:
  
  Thanks for your thoughts, Laura. I especially like your point about the limited range of research questions that we were once able to ask. As you say, ‘there is no angels archive’ – and that applies to a wide range of other important historical topics too. Thanks to digitisation, it’s now much more possible to read across genres and to explore culture quantitatively as well as qualitatively.
  
  And your final point is the one I’d like to reiterate most strongly. This is something I was trying to get at in the original post but you express it much more clearly. In short, digitisation creates brilliant new avenues of research – it does not, however, mean that we can just relax, run a few keyword searches and let the articles write themselves. As ‘good historians’, we now have another set of issues that we need to be thinking critically about.
  