Open Objects... has moved: digital history

Thursday, 11 December 2014

The rise of interpolated content?

One thing that might stand out when we look back at 2014 is the rise of interpolated content. We've become used to translating around auto-correct errors in texts and emails but we seem to be at a tipping point where software is going ahead and rewriting content rather than prompting you to notice and edit things yourself.

iOS doesn't just highlight or fix typos, it changes the words you've typed. To take one example, iOS users might use 'ill' more than they use 'ilk', but if I type 'ilk' I'm not happy when it's replaced by an algorithmically-determined 'ill'. As a side note, understanding the effect of auto-correct on written messages will be a challenge for future historians (much as it is for us sometimes now).

And it's not only text. In 2014, Adobe previewed GapStop, 'a new video technology that eases transitions and removes pauses from video automatically'. It's not just editing out pauses, it's creating filler images from existing images to bridge the gaps so the image doesn't jump between cuts. It makes it a lot harder to tell when someone's words have been edited to say something different to what they actually said - again, editing audio and video isn't new, but making it so easy to remove the artefacts that previously provided clues to the edits is.

Photoshop has long let you edit the contrast and tone in images, but now their Content-Aware Move, Fill and Patch tools can seamlessly add, move or remove content from images, making it easy to create 'new' historical moments. The images on extrapolated-art.com, which uses '[n]ew techniques in machine learning and image processing [...] to extrapolate the scene of a painting to see what the full scenery might have looked like' show the same techniques applied to classic paintings.

But photos have been manipulable since they were first used, so what's new? As one Google user reported in It’s Official: AIs are now re-writing history, 'Google’s algorithms took the two similar photos and created a moment in history that never existed, one where my wife and I smiled our best (or what the algorithm determined was our best) at the exact same microsecond, in a restaurant in Normandy.' The important difference here is that he did not create this new image himself: Google did, without asking or specifically notifying him. In twenty years' time, this fake image may become part of his 'memory' of the day. Automatically generated content like this also takes the question of intent entirely out of the process of determining 'real' from interpolated content. And if software starts retrospectively 'correcting' images, what does that mean for our personal digital archives, for collecting institutions and for future historians?

Interventions between the act of taking a photo and posting it on social media might be one of the trends of 2015. Facebook are about to start 'auto-enhancing' your photos, and apparently, Facebook Wants To Stop You From Uploading Drunk Pictures Of Yourself. Supposedly this is to save your mum and boss from seeing them; the alternative path of building a social network that doesn't show everything you do to your mum and boss was lost long ago. Would the world be a better place if Facebook or Twitter had a 'this looks like an ill-formed rant, are you sure you want to post it?' function?

So 2014 seems to have brought the removal of human agency from the process of enhancing, and even creating, text and images. Where do we go from here? How will we deal with the increase of interpolated content when looking back at this time? I'd love to hear your thoughts.

Thursday, 4 December 2014

Three ways you can help with 'In their own words: collecting experiences of the First World War' (and a CENDARI project update)

Somehow it's a month since I posted about my CENDARI research project (in Moving forward: modelling and indexing WWI battalions) on this site. That probably reflects the rhythm of the project - less trying to work out what I want to do and more getting on with doing it. A draft post I started last month simply said, 'A lot of battalions were involved in World War One'. I'll do a retrospective post soon, and here's a quick summary of on-going work.

First, a quick recap. My project has two goals - one, to collect a personal narrative for each battalion in the Allied armies of the First World War; two, to create a service that would allow someone to ask 'where was a specific battalion at a specific time?'. Together, they help address a common situation for people new to WWI history who might ask something like 'I know my great-uncle was in the 27th Australian battalion in March 1916, where would he have been and what would he have experienced?'.

I've been working on streamlining and simplifying the public-facing task of collecting a personal narrative for each battalion, and have written a blog post, Help collect soldiers’ experiences of WWI in their own words, that reduces it to three steps:

Take one of the diaries, letters and memoirs listed on the Collaborative Collections wiki, and
Match its author with a specific regiment or battalion.
Send in the results via this form.

If you know of a local history society, family historian or anyone else who might be interested in helping, please send them along to this post: Help collect soldiers’ experiences of WWI in their own words.

Work on specifying the data structures needed to support a look-up service that answers questions about a specific unit's location and activities at a specific time has largely moved to the wiki:

Talk:British battalions and regiments in World War I
Talk:British Army Hierarchies
Template talk:Battalion - what information should be recorded on every battalion/unit page?
Template talk:Infobox command structure - what structured data should be recorded about military hierarchies?
Template talk:Infobox theatre of war/doc - what structured data should be recorded about a unit's activities and engagements in the war?
Template talk:Infobox military unit - what structured data should be recorded about a battalion/unit?

You can see the infobox structures in progress by flipping from the talk to the Template tabs. You'll need to request an account to join in but more views, sample data and edge cases would be really welcome.

Populating the list of battalions and other units has been a huge task in itself, partly because very few cultural institutions have definitive lists of units they can (or want to) share, but it's necessary to support both core goals. I've been fortunate to have help (see 'Thanks and recent contributions' on 'How you can help') but the task is on-going so get in touch if you can help!

So there are three different ways you can help with 'In their own words: collecting experiences of the First World War':

collect diaries linked to specific battalions;
help check or complete the lists of Australian battalions, British battalions and regiments, Canadian battalions and regiments, Indian battalions, Italian battalions and New Zealand battalions in World War I;
review and contribute to the data structures needed to record information about military units in the Talk and Template pages above.

Finally, last week I was in New Zealand to give a keynote on this work at the National Digital Forum. The video for 'Collaborative collections through a participatory commons' is online, so you can catch up on the background for my project if you've got 40 minutes or so to spare. Should you be in Dublin, I'm giving a talk on 'A pilot with public participation in historical research: linking lived experiences of the First World War' at the Trinity Long Room Hub today (thus the poster).

And if you've made it this far, perhaps you'd like to apply for one of the CENDARI Visiting Research Fellowships 2015 yourself?

Friday, 31 October 2014

Moving forward: modelling and indexing WWI battalions

A super-quick update from my CENDARI Fellowship this week. I set up the wiki for In their own words: linking lived experiences of the First World War a week ago but only got stuck into populating it with lists of various national battalions this week. My current task list, copied from the front page, is to:

Populate list of military units: Australian battalions in World War I, British battalions and regiments in World War I, Canadian battalions in World War I, Indian battalions in World War I, Italian battalions in World War I, New Zealand battalions in World War I. A list of battalions is needed to form the basis for the collecting process. (I'm starting with a list of divisions because I can get it from Wikipedia, but I know this is problematic)
Collate lists of personal diaries, letters, memoirs that can be linked to units through their authors
Collate lists of official unit diaries and histories
Collate resources on researching World War One records to help researchers know where to start
Create a sample battalion page as a demonstrator to show how personal accounts can be linked
Collate information about private letters, diaries and memoirs

If you can help with any of that, let me know! Or just get stuck in and edit the site.

I've started another Google Doc with very sketchy Notes towards modelling information about World War One Battalions. I need to test it with more battalion histories and update it iteratively. At this stage my thinking is to turn it into an InfoBox format to create structured data via the wiki. It's all very lo-fi and much less designed than my usual projects, but I'm hoping people will be able to help regardless.

So, in this phase of the project, the aim is to find a personal narrative - a diary, letters, memoirs or images - for each military unit in the British Army. Can you help?

Friday, 17 October 2014

In which I am awed by the generosity of others, and have some worthy goals

A quick update from my CENDARI fellowship working on a project that's becoming 'In their own words: linking lived experiences of the First World War'. I've spent the week reading (again a mixture of original diaries and letters, technical stuff like ontology documentation and also WWI history forums and 'amateur' sites) and writing. I put together a document outlining a range of possible goals and some very sketchy tech specs, and opened it up for feedback. The goals I set out are copied below for those who don't want to delve into detail. The commentable document, 'Linking lived experiences of the First World War': possible goals and a bunch of technical questions, goes into more detail.

However, the main point of this post is to publicly thank those who've helped by commenting and sharing on the doc, on twitter or via email. Hopefully I'm not forgetting anyone, as I've been blown away by and am incredibly grateful for the generosity of those who've taken the time to at least skim 1600 words (!). It's all helped me clarify my ideas and find solutions I'm able to start implementing next week. In no order at all - at CENDARI, Jennifer Edmond, Alex O'Connor, David Stuart, Benjamin Štular, Francesca Morselli, Deirdre Byrne; online Andrew Gray @generalising; Alex Stinson @ DHKState; jason webber @jasonmarkwebber; Alastair Dunning @alastairdunning; Ben Brumfield @benwbrum; Christine Pittsley; Owen Stephens @ostephens; David Haskiya @DavidHaskiya; Jeremy Ottevanger @jottevanger; Monika Lechner @lemondesign; Gavin Robinson @merozcursed; Tom Pert @trompet2 - thank you all!

Worthy goals (i.e. things I'm hoping to accomplish, with the help of historians and the public; only some of which I'll manage in the time)

At the end of this project, someone who wants to research a soldier in WWI but doesn't know a thing about how armies were structured should be able to find a personal narrative from a soldier in the same bit of the army, to help them understand experiences of the Great War.

Hopefully these personal accounts will provide some context, in their own words, for the lived experiences of WWI. Some goals listed are behind-the-scenes stuff that should just invisibly make personal diaries, letters and memoirs more easily discoverable. It needs datasets that provide structures that support relationships between people and documents; participatory interfaces for creating or enhancing information about contemporary materials (which feed into those supporting structures), and interfaces that use the data created.

More specifically, my goals include:

A personal account by someone in each unit linked to that unit's record, so that anyone researching a WWI name would have at least one account to read. To populate this dataset, personal accounts (diaries, letters, etc) would need to be linked to specific soldiers, who can then be linked to specific units. Linking published accounts such as official unit histories would be a bonus. [Semantic MediaWiki]

Researched links between individual men and the units they served in, to allow their personal accounts to be linked to the relevant military unit. I'm hoping I can find historians willing to help with the process of finding and confirming the military unit the writer was in. [Semantic MediaWiki]

A platform for crowdsourcing the transcription and annotation of digitised documents. The catch is that the documents for transcription would be held remotely on a range of large and small sites, from Europeana's collection to library sites that contain just one or two digitised diaries. Documents could be tagged/annotated with the names of people, places, events, or concepts represented in them. [Semantic MediaWiki??]

A structured dataset populated with the military hierarchy (probably based on The British order of battle of 1914-1918) that records the start and end dates of each parent-child relationship (an example of how much units moved within the hierarchy)

A published webpage for each unit, to hold those links to official and personal documents about that unit in WWI. In future this page could include maps, timelines and other visualisations tailored to the attributes of a unit, possibly including theatres of war, events, campaigns, battles, number of privates and officers, etc. (Possibly related to CENDARI Work Package 9?) [Semantic MediaWiki]

A better understanding of what people want to know at different stages of researching WWI histories. This might include formal data gathering, possibly a combination of interviews, forum discussions or surveys.
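To make the dated hierarchy goal above a little more concrete, here is a minimal sketch in Python of how parent-child relationships with start and end dates might be recorded and queried. It is an illustration only, not the project's data model or anything built in Semantic MediaWiki: the `Membership` class, the `chain_of_command` function and every unit name and date are placeholders invented for the example.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Membership:
    """One dated parent-child link between two units (hypothetical model)."""
    child: str
    parent: str
    start: date
    end: Optional[date] = None  # None means the link has no recorded end

    def covers(self, day: date) -> bool:
        """True if this link was in force on the given day."""
        return self.start <= day and (self.end is None or day <= self.end)


def chain_of_command(unit: str, day: date, links: list[Membership]) -> list[str]:
    """Walk upwards from a unit, following only links valid on `day`."""
    chain = [unit]
    current = unit
    while True:
        parents = [m.parent for m in links if m.child == current and m.covers(day)]
        # Stop at the top of the hierarchy, or if the data loops back on itself.
        if not parents or parents[0] in chain:
            break
        current = parents[0]
        chain.append(current)
    return chain


# Placeholder units and dates, invented purely for illustration.
LINKS = [
    Membership("Battalion A", "Brigade X", date(1915, 1, 1), date(1916, 2, 29)),
    Membership("Battalion A", "Brigade Y", date(1916, 3, 1)),
    Membership("Brigade X", "Division 1", date(1915, 1, 1)),
    Membership("Brigade Y", "Division 2", date(1915, 1, 1)),
]

print(chain_of_command("Battalion A", date(1916, 3, 15), LINKS))
```

Storing each link with its own dates, rather than a single 'parent' field per unit, is what lets the same battalion sit under different brigades at different times.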

Goals that are more likely to drop off, or become quick experiments to see how far you can get with accessible tools:

Trained 'named entity recognition' and 'natural language processing' tools that could be run over transcribed text to suggest possible people, places, events, concepts, etc [this might drop off the list as the CENDARI project is working on a tool called Pineapple (PDF poster). That said, I'll probably still experiment with the Stanford NER tool to see what the results are like]

A way of presenting possible matches from the text tools above for verification or correction by researchers. Ideally, this would be tied in with the ability to annotate documents

The ability to search across different repositories for a particular soldier, to help with the above.
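As a very rough illustration of the 'suggest possible matches for verification' idea above, the sketch below looks up a hand-made list of known names in a piece of transcribed text and returns candidates for a researcher to confirm or reject. It is deliberately not a trained named entity recognition tool and does not use Stanford NER or Pineapple; it is a plain list look-up, and the `suggest_entities` function, the list of names and the sample sentence are all invented for the example.

```python
import re


def suggest_entities(text, gazetteer):
    """Return (start, end, matched_text, entity_type) for each known name found.

    `gazetteer` maps a known name to its type (e.g. 'place' or 'person').
    Matching is case-insensitive and only on whole words.
    """
    suggestions = []
    for name, entity_type in gazetteer.items():
        pattern = r"\b" + re.escape(name) + r"\b"
        for match in re.finditer(pattern, text, flags=re.IGNORECASE):
            suggestions.append(
                (match.start(), match.end(), match.group(0), entity_type)
            )
    # Sort by position so suggestions appear in reading order.
    return sorted(suggestions)


# Invented sample data: placeholder names, not taken from any real diary.
GAZETTEER = {"Place A": "place", "Place B": "place", "Major Example": "person"}
SAMPLE = "We marched from Place A to place b before dawn with Major Example."

for start, end, found, kind in suggest_entities(SAMPLE, GAZETTEER):
    print(f"{start}-{end}: {found!r} might be a {kind}")
```

Keeping the character offsets with each suggestion means a confirmed match could later be stored as an annotation on that exact span of the transcript.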

Commentable version: 'Linking lived experiences of the First World War': possible goals and a bunch of technical questions.

Friday, 10 October 2014

Linking lived experiences of WWI through battalions?

Another update from my CENDARI Fellowship at Trinity College Dublin, looking at 'In their own words: linking lived experiences of the First World War', which is a small-scale, short-term pilot based on WWI collections. My first post is Defining the scope: week one as a CENDARI Fellow. Over the past two weeks I've done a lot of reading - more WWI diaries and letters; WWI histories and historiography; specialist information like military structures (orders of battle, etc). I've also sketched out lots of snippets of possible functions, data, relationships and other outcomes.

I've narrowed the key goal (or minimum viable product, if you prefer) of my project to linking personal accounts of the war - letters, diaries, memoirs, photographs, etc - to battalions, by creating links from the individual who wrote them to their military unit. Once these personal accounts are linked to particular military units, they can be linked to higher units - from the battalion, ship or regiment to brigade, corps, etc - and to particular places, activities, events and campaigns. The idea behind this is to provide context for an individual's experience of WWI by linking to narratives written by people in the same situation. I'm still working out how to organise the research process of matching the right soldier to the right battalion/regiment/ship so that relevant personal stories are discoverable. I'm also still working out which attributes of a battalion are relevant, how granular the data will be, and how to design for the inevitable variation in data quality (for example, the availability of records for different armies varies hugely). Finally, I’m still working out which bits need computer science tools and which need the help of other historians.
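The core linking step described above (account to author, author to unit) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the project's implementation: the `Account` class, the `accounts_by_unit` function and all the names in the sample data are hypothetical placeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Account:
    """A personal account (diary, letters, memoir) and its author."""
    title: str
    author: str


def accounts_by_unit(accounts, author_units):
    """Group account titles by the unit their author served in.

    `author_units` maps an author's name to a unit name. Accounts whose
    author has not yet been matched to a unit are grouped under None,
    so the outstanding research is visible rather than silently dropped.
    """
    grouped = {}
    for account in accounts:
        unit = author_units.get(account.author)
        grouped.setdefault(unit, []).append(account.title)
    return grouped


# Placeholder data, invented for illustration only.
ACCOUNTS = [
    Account("Diary 1", "Soldier A"),
    Account("Letters 1", "Soldier B"),
    Account("Memoir 1", "Soldier C"),
]
AUTHOR_UNITS = {"Soldier A": "Battalion A", "Soldier B": "Battalion A"}

print(accounts_by_unit(ACCOUNTS, AUTHOR_UNITS))
```

Keeping the author-to-unit mapping separate from the accounts themselves mirrors the two research tasks involved: finding accounts, and confirming which unit each writer was in.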

Given the number of centenary projects, I was hoping to find more structured data about WWI entities. Trenches to Triples would be a useful source of permanent URLs, and terms to train named entity recognition, but am I missing other sources?

There's a lot of content, and so much activity around WWI records, but it's spread out across the internet. Individual people and small organisations are digitising and transcribing diaries and letters. Big collecting projects like Europeana have lots of personal accounts, but they're often not transcribed and they don't seem to be linked to structured data about the item itself. Some people have painstakingly transcribed unit diaries, but they're not linked from the official site, so others wouldn't know there's a more easily read version of the diary available. I've been wondering if you could crowdsource the process of transcribing records held elsewhere, and offer the transcripts back to sites. Using dedicated transcription software would let others suggest corrections, and might also make it possible to link sections of the text to external 'entities' like names, places, events and concepts.

Albert Henry Bailey. Image: Sir George Grey Special Collections, Auckland Libraries, AWNS-19150909-39-5

To help figure out the issues researchers face and the variations in available resources, I'm researching randomly selected soldiers from different Allied forces. I've posted my notes on Private Albert Henry Bailey, service number 13/970a. You'll see that they're in prose form, and don't contain any structured data. Most of my research used digitised-but-not-transcribed images of documents, with some transcribed accounts. It would definitely benefit from deeper knowledge of military history - for a start, which battalions were in the same place as his unit at the same time?

This account of the arrival and first weeks of the Auckland Mounted Rifles at Gallipoli from the official unit history gives a sense of the density and specificity of local place names, as does the official unit diary, and I assume many personal accounts. I'm not sure how named entity recognition tools will cope, and ideally I'd like to find lists of places to 'train' the tools (including possibly some from the 'Trenches to Triples' project).

If there aren't already any structured data sources for military hierarchies in WWI, do I have to make one? And if so, how? The idea would be to turn prose descriptions like this Australian War Memorial history of the 27th AIF Battalion, this order of battle of the 2nd Australian Division and any other suitable sources into structured data. I can see some ways it might be possible to crowdsource the task, but it's a big task. But it's worth it - providing a service that lets people look up which higher military units, places, activities and campaigns a particular battalion/regiment/ship was linked to at a given time would be a good legacy for my research.

I'm sure I'm forgetting lots of things, and my list of questions is longer than my list of answers, but I should end here. To close, I want to share a quote from the official history of the Auckland Mounted Rifles. The author said he 'would like to speak of the splendid men of the rank and file who died during this three months' struggle. Many names rush to the memory, but it is not possible to mention some without doing an injustice to the memory of others'. I guess my project is driven by a vision of doing justice to the memory of every soldier, particularly those ordinary men who aren't as easily found in the records. I'm hoping that drawing on the work of other historians and re-linking disparate sources will help provide as much context as possible for their experiences of the First World War.

--
Update, 15 October 2014: if you've made it this far, you might also be interested in chipping in at 'Linking lived experiences of the First World War': possible goals and a bunch of technical questions.

Monday, 28 July 2014

The sounds of silence

I've been reading World War One diaries and letters (getting distracted by sources is an occupational hazard in my research) as I look for sample primary sources for teaching crowdsourcing at the HILT summer school in Maryland next week and for my CENDARI fellowship later this year.

I noticed one line in the Diary of William Henry Winter WWI 1915 that manages to convey a lot without directly giving any information about his opinions or relationship with this person:

'Major Saunders is supposed to be on his way back here as well but I don't know as he is coming back to our Coy, I hope not any way. We have got a good man now.'

There's nothing in the rest of the entries online that provides any further background. It may be that sections of this correspondence either didn't survive, weren't held by the same person, or perhaps were edited before deposit with the library or during transcription (it's particularly hard to judge as the site doesn't have images of the original document), so this particular silence may not have been intentional.

Whatever the case, it's a good reminder that there are silences behind every piece of content. While it's an amazing time to research the lives of those caught up in WWI as more and more private and public material is digitised and shared, silences can be created in many ways - official archives privilege some voices over others, personal collections can be censored or remain tucked away in a shoebox, and large parts of people's experiences simply went unrecorded. Content hidden behind paywalls or inaccessible to search engines (whether inadvertently hidden behind a search box or through lack of text transcription or description) is effectively hushed, if not exactly silenced. Sources and information about WWI collected via community groups on Facebook may be lost the next time they change their terms and conditions, or only partially shared. Our challenge is to make the gaps and questions about what was collected visible (audible?) while also being careful not to render the undigitised or unsearchable invisible in our rush to privilege the easily-accessible.

[Update: I've just realised that Winter might not have needed to provide further context as it seems many men in his unit were from the same region as him, and therefore his relationship with the Major may have pre-dated the war. Tacit knowledge is of course another example of the unrecorded, and one perhaps more familiar to us now than the unsayable.]

Sunday, 25 May 2014

Piloting a Participatory History Commons

I've been awarded a CENDARI Visiting Research Fellowship at Trinity College Dublin for a project called 'Bridging collections with a participatory Commons: a pilot with World War One archives'. I've posted my proposal at the link above, and when I start in September I'll post about my progress here. CENDARI have now published the list of all 2014 Fellows and a neat summary of the programme: 'The CENDARI Visiting Research Fellowships are intended to support and stimulate historical research in the two pilot areas of medieval European culture and the First World War, by facilitating access to key archives, specialist knowledge and collections in CENDARI host institutions'.

As I said in my post, 'it's an ambitious project which requires tackling community building, user experience design, historical materials and programming, and I'll be drawing on the expertise of many people'. I'll post as I go - but first, I'd best get back to finishing up my PhD thesis!

In the meantime, here's a small collection of things I've written as I think through what a participatory commons is and how it might work: my poster and talk notes for Herrenhausen conference and my keynote for Sharing is Caring, 'Enriching cultural heritage collections through a Participatory Commons platform: a provocation about collaborating with users'.

Wednesday, 19 March 2014

Early PhD findings: Exploring historians' resistance to crowdsourced resources

I wrote up some early findings from my PhD research for conferences back in 2012 when I was working on questions around 'but will historians really use resources created by unknown members of the public?'. People keep asking me for copies of my notes (and I've noticed people citing an online video version which isn't ideal) and since they might be useful and any comments would help me write up the final thesis, I thought I'd be brave and post my notes.

A million caveats apply - these were early findings, my research questions and focus have changed and I've interviewed more historians and reviewed many more participative history projects since then; as a short paper I don't address methods etc; and obviously it's only a tiny part of a huge topic... (If you're interested in crowdsourcing, you might be interested in other writing related to scholarly crowdsourcing and collaboration from my PhD, or my edited volume on 'Crowdsourcing our cultural heritage'.) So, with those health warnings out of the way, here it is. I'd love to hear from you, whether with critiques, suggestions, or just stories about how it relates to your experience. And obviously, if you use this, please cite it!

Exploring historians' resistance to crowdsourced resources

Scholarly crowdsourcing may be seen as a solution to the backlog of historical material to be digitised, but will historians really use resources created by unknown members of the public?

The Transcribe Bentham project describes crowdsourcing as 'the harnessing of online activity to aid in large scale projects that require human cognition' (Terras, 2010a). 'Scholarly crowdsourcing' is a related concept that generally seems to involve the collaborative creation of resources through collection, digitisation or transcription. Crowdsourcing projects often divide up large tasks (like digitising an archive) into smaller, more manageable tasks (like transcribing a name, a line, or a page); this method has helped digitise vast numbers of primary sources.

My doctoral research was inspired by a vision of 'participant digitization', a form of scholarly crowdsourcing that seeks to capture the digital records and knowledge generated when researchers access primary materials in order to openly share and re-use them. Unlike many crowdsourcing projects which are designed for tasks performed specifically for the project, participant digitization harnesses the transcription, metadata creation, image capture and other activities already undertaken during research and aggregates them to create re-usable collections of resources.

Research questions and concepts

When Howe clarified his original definition, stating that the 'crucial prerequisite' in crowdsourcing is 'the use of the open call format and the large network of potential laborers', a 'perfect meritocracy' based not on external qualifications but on 'the quality of the work itself', he created a challenge for traditional academic models of authority and credibility (Howe 2006, 2008). Furthermore, how does anonymity or pseudonymity (defined here as often long-standing false names chosen by users of websites) complicate the process of assessing the provenance of information on sites open to contributions from non-academics? An academic might choose to disguise their identity to mask their research activities from competing peers, from a desire to conduct early exploratory work in private or simply because their preferred username was unavailable; but when contributors are not using their real names they cannot derive any authority from their personal or institutional identity. Finally, which technical, social and scholarly contexts would encourage researchers to share (for example) their snippets of transcription created from archival documents, and to use content transcribed by others? What barriers exist to participation in crowdsourcing or prevent the use of crowdsourced content?

Methods

I interviewed academic and family/local historians about how they evaluate, use, and contribute to crowdsourced and traditional resources to investigate how a resource based on 'meritocracy' disrupts current notions of scholarly authority, reliability, trust, and authorship. These interviews aimed to understand current research practices and probe more deeply into how participants assess different types of resources, their feelings about resources created by crowdsourcing, and to discover when and how they would share research data and findings.

I sought historians investigating the same country and time period in order to have a group of participants who faced common issues with the availability and types of primary sources from early modern England. I focused on academic and 'amateur' family or local historians because I was interested in exploring the differences between them to discover which behaviours and attitudes are common to most researchers and which are particular to academics and the pressures of academia.

I recruited participants through personal networks and social media, and conducted interviews in person or on Skype. At the time of writing, 17 participants have been interviewed for up to 2 hours each. It should be noted that these results are of a provisional nature and represent a snapshot of on-going research and analysis.

Early results

I soon discovered that citizen historians are perfect examples of Pro-Ams: 'knowledgeable, educated, committed, and networked' amateurs 'who work to professional standards' (Leadbeater and Miller, 2004; Terras, 2010b).

How do historians assess the quality of resources?

Participants often simply said they drew on their knowledge and experience when sniffing out unreliable documents or statements. When assessing secondary sources, their tacit knowledge of good research and publication practices was evident in common statements like '[I can tell from] it's the way it's written'. They also cited the presence and quality of footnotes, and the depth and accuracy of information as important factors. Transcribed sources introduced another layer of quality assessment - researchers might assess a resource by checking for transcription errors that are often copied from one database to another. Most researchers used multiple sources to verify and document facts found in online or offline sources.

When and how do historians share research data and findings?

It appears that between accessing original records and publishing information, there are several key stages where research data and findings might be shared. Stages include acquiring and transcribing records, producing visualisations like family trees and maps, publishing informal notes and publishing synthesised content or analysis; whether a researcher passes through all the stages depends on their motivation and audience. Information may change formats between stages, and since many claim not to share information that has not yet been sufficiently verified, some information would drop out before each stage. It also appears that in later stages of the research process the size of the potential audience increases and the level of trust required to share with them decreases.

For academics, there may be an additional, post-publication stage when resources are regarded as 'depleted' – once they have published what they need from them, they would be happy to share them. Family historians meanwhile see some value in sharing versions of family trees online, or in posting names of people they are researching to attract others looking for the same names.

Sharing is often negotiated through private channels and personal relationships. Methods of controlling sharing include showing people work in progress on a screen rather than sending it to them and using email in preference to sharing functionality supplied by websites – this targeted, localised sharing allows the researcher to retain a sense of control over early stage data, and so this is one key area where identity matters. Information is often shared progressively, and getting access to more information depends on your behaviour after the initial exchange - for example, crediting the provider in any further use of the data, or reciprocating with good data of your own.

When might historians resist sharing data?

Participants gave a range of reasons for their reluctance to share data. Being able to convey the context of creation and the qualities of the source materials is important for historians who may consider sharing their 'depleted' personal archives - not being able to provide this means they are unlikely to share. Being able to convey information about data reliability is also important. Some information about the reliability of a piece of information is implicitly encoded in its format (for example, in pencil in notebooks versus electronic records), hedging phrases in text, in the number of corroborating sources, or a value judgement about those sources. If it is difficult to convey levels of 'certainty' about reliability when sharing data, it is less likely that people will share it - participants felt a sense of responsibility about not publishing (even informally) information that hasn't been fully verified. This was particularly strong in academics. Some participants confessed to sneaking forbidden photos of archival documents they ran out of time to transcribe in the archive; unsurprisingly it is unlikely they would share those images.

Overall, if historians do not feel they would get information of equal value back in exchange, they seem less likely to share. Professional researchers do not want to give away intellectual property, and feel sharing data online is risky because the protocols of citation and fair use are presently uncertain. Finally, researchers did not always see a point in sharing their data. Family history content was seen as too specific and personal to have value for others; academics may realise the value of their data within their own tightly-defined circles but not realise that their records may have information for other biographical researchers (i.e. people searching by name) or other forms of history.

Which concerns are particular to academic historians?

Reputational risk is an issue for some academics who might otherwise share data. One researcher said: 'we are wary of others trawling through our research looking for errors or inconsistencies. [...] Obviously we were trying to get things right, but if we have made mistakes we don't want to have them used against us. In some ways, the less you make available the better!'. Scholarly territoriality can be an issue – if there is another academic working on the same resources, their attitude may affect how much others share. It is also unclear how academic historians would be credited for their work if it was performed under a pseudonym that does not match the name they use in academia.

What may cause crowdsourced resources to be under-used?

In this research, 'amateur' and academic historians shared many of the same concerns for authority, reliability, and trust. The main reported cause of under-use (for all resources) is not providing access to original documents as well as transcriptions. Researchers will use almost any information as pointers or leads to further sources, but they will not publish findings based on that data unless the original documents are available or the source has been peer-reviewed. Checking the transcriptions against the original is seen as 'good practice', part of a sense of responsibility 'to the world's knowledge'.

Overall, the identity of the data creator is less important than expected - for digitised versions of primary sources, reliability is not vested in the identity of the digitiser but in the source itself. Content found on online sites is tested against a set of finely-tuned ideas about the normal range of documents rather than the authority of the digitiser.

Cite as:

Ridge, Mia. “Early PhD Findings: Exploring Historians’ Resistance to Crowdsourced Resources.” Open Objects, March 19, 2014. http://openobjects.blogspot.co.uk/2014/03/early-phd-findings-exploring-historians.html.

References

Howe, J. (undated). Crowdsourcing: A Definition. http://crowdsourcing.typepad.com

Howe, J. (2006). Crowdsourcing: A Definition. http://crowdsourcing.typepad.com/cs/2006/06/crowdsourcing_a.html

Howe, J. (2008). Join the crowd: Why do multinationals use amateurs to solve scientific and technical problems? The Independent. http://www.independent.co.uk/life-style/gadgets-and-tech/features/join-the-crowd-why-do-multinationals-use-amateurs-to-solve-scientific-and-technical-problems-915658.html

Leadbeater, C., and Miller, P. (2004). The Pro-Am Revolution: How Enthusiasts Are Changing Our Economy and Society. Demos, London. http://www.demos.co.uk/files/proamrevolutionfinal.pdf

Terras, M. (2010a). Crowdsourcing cultural heritage: UCL's Transcribe Bentham project. Presented at: Seeing Is Believing: New Technologies For Cultural Heritage. International Society for Knowledge Organization, UCL (University College London). http://eprints.ucl.ac.uk/20157/

Terras, M. (2010b). “Digital Curiosities: Resource Creation via Amateur Digitization.” Literary and Linguistic Computing 25, no. 4 (October 14, 2010): 425–438. http://llc.oxfordjournals.org/cgi/doi/10.1093/llc/fqq019

Sunday, 5 January 2014

2013 in review: crowdsourcing, digital history, visualisation, and lots and lots of words

A quick and incomplete summary of my 2013 for those days when I wonder where the year went... My PhD was my main priority throughout the year, but the slow increase in word count across my thesis is probably only of interest to me and my supervisors (except where I've turned down invitations to concentrate on my PhD). Various other projects have spanned the years: my edited volume on 'Crowdsourcing our Cultural Heritage', working as a consultant on the 'Let's Get Real' project with Culture24, and I've continued to work with the Open University Digital Humanities Steering Group, ACH and to chair the Museums Computer Group.

In January (and April/June) I taught all-day workshops on 'Data Visualisation for Analysis in Scholarly Research' and 'Crowdsourcing in Libraries, Museums and Cultural Heritage Institutions' for the British Library's Digital Scholarship Training Programme.

In February I was invited to give a keynote on 'Crowd-sourcing as participation' at iSay: Visitor-Generated Content in Heritage Institutions in Leicester (my event notes). This was an opportunity to think through the impact of the 'close reading' people do while transcribing text or describing images, crowdsourcing as a form of deeper engagement with cultural heritage, and the potential for 'citizen history' this creates (also finally bringing together my museum work and my PhD research). This later became an article for Curator journal, From Tagging to Theorizing: Deepening Engagement with Cultural Heritage through Crowdsourcing (proof copy available at http://oro.open.ac.uk/39117). I also ran a workshop on 'Data visualisation for humanities researchers' with Dr. Elton Barker (one of my PhD supervisors) for the CHASE 'Going Digital' doctoral training programme.

In March I was in the US for THATCamp Feminisms in Claremont, California (my notes), to run a workshop on 'Data visualisation as a gateway to programming', and gave a paper on 'New Challenges in Digital History: Sharing Women's History on Wikipedia' at the 'Women's History in the Digital World' conference at Bryn Mawr, Philadelphia (posted as 'New challenges in digital history: sharing women's history on Wikipedia – my draft talk notes'). I also wrote an article for Museum Identity magazine, Where next for open cultural data in museums?.

In April I gave a paper, 'A thousand readers are wanted, and confidently asked for': public participation as engagement in the arts and humanities, on my PhD research at Digital Impacts: Crowdsourcing in the Arts and Humanities (see also my notes from the event), and a keynote on 'A Brief History of Open Cultural Data' at GLAM-WIKI 2013.

In May I gave an online seminar on crowdsourcing (with a focus on how it might be used in teaching undergraduates wider skills) for the NITLE Shared Academics series. I gave a short paper on 'Digital participation and public engagement' at the London Museums Group's 'Museums and Social Media' at Tate Britain on May 24, and was in Belfast for the Museums Computer Group's Spring meeting, 'Engaging Visitors Through Play', then whipped across to Venice for a quick keynote on 'Participatory Practices: Inclusion, Dialogue and Trust' (with Helen Weinstein) for the We Curate kick-off seminar at the start of June.

In June the Collections Trust and MCG organised a Museum Informatics event in York and we organised a 'Failure Swapshop' the evening before. I also went to Zooniverse's ZooCon (my notes on the citizen science talks) and to Canterbury Cathedral Archives for a CHASE event on 'Opening up the archives: Digitization and user communities'.

In July I chaired a session on Digital Transformations at the Open Culture 2013 conference in London on July 2, gave an invited lightning talk at the Digital Humanities Oxford Summer School 2013, ran a half-day workshop on 'Designing successful digital humanities crowdsourcing projects' at the Digital Humanities 2013 conference in Nebraska, and had an amazing time making what turned out to be Serendip-o-matic at the Roy Rosenzweig Center for History and New Media at George Mason University's One Week, One Tool in Fairfax, Virginia (my posts on the process), with a museumy road trip via Amtrak and Greyhound to Chicago, Cleveland and Pittsburgh in between the two events.

In August I tidied up some talk notes for publication as 'Tips for digital participation, engagement and crowdsourcing in museums' on the London Museums Group blog.

October saw the publication of my Curator article and Creating Deep Maps and Spatial Narratives through Design with Don Lafreniere and Scott Nesbit for the International Journal of Humanities and Arts Computing, based on our work at the Summer 2012 NEH Advanced Institute on Spatial Narrative and Deep Maps: Explorations in the Spatial Humanities. (I also saw my family in Australia and finally went to MONA).

In November I presented on 'Messy understandings in code' at Speaking in Code at UVA's Scholars' Lab, Charlottesville, Virginia, gave a half-day workshop on 'Data Visualizations as an Introduction to Computational Thinking' at the University of Manchester and spoke at the Digital Humanities at Manchester conference the next day. Then it was down to London for the MCG's annual conference, Museums on the Web 2013 at Tate Modern. Later that month I gave a talk on 'Sustaining Collaboration from Afar' at Sustainable History: Ensuring today's digital history survives.

In December I went to Hannover, Germany for the Herrenhausen Conference: "(Digital) Humanities Revisited – Challenges and Opportunities in the Digital Age" where I presented on 'Creating a Digital History Commons through crowdsourcing and participant digitisation' (my lightning talk notes and poster are probably the best representation of how my PhD research on public engagement through crowdsourcing and historians' contributions to scholarly resources through participant digitisation are coming together). In the final days of 2013, I went back to my old museum metadata games, and updated them to include images from the British Library and a first pass at making them responsive for mobile and tablet devices.

Tuesday, 17 December 2013

Why we need to save the material experience of software objects

Conversations at last month's Sustainable History: Ensuring today's digital history survives event [my slides] (and at the pub afterwards) touched on saving the data underlying websites as a potential solution for archiving them. This is definitely better than nothing, but as a human-computer interaction researcher and advocate for material culture in historical research, I don't think it's enough.

Just as people rue the loss of the information and experiential data conveyed by the material form of objects when they're converted to digital representations - size, paper and print/production quality, marks from wear through use and manufacture, access to its affordances, to name a few - future researchers will rue the information lost if we don't regard digital interfaces and user experiences as vital information about the material form of digital content and record them alongside the data they present.

Can you accurately describe the difference between using MySpace and Facebook in their various incarnations? There's no perfect way to record the experience of using Facebook in December 2013 so it could be compared with the experience of using MySpace in 2005, but usability techniques like screen-recording software linked to eyetracking or think-aloud tests would help preserve some of the tacit knowledge and context users bring to sites alongside the look-and-feel, algorithms and treatments of data the sites present to us. It's not a perfect solution, but a recording of the interactions and designs from both sites for common tasks like finding and adding a friend would tell future researchers infinitely more about changes to social media sites over eight years than simple screenshots or static webpages. But in this case we're still missing the notifications on other people's screens, the emails and algorithmic categorisations that fan out from simple interactions like these...

Even if you don't care about history, anyone studying software - whether websites, mobile apps, digital archives, instrument panels or procedural instructions embedded in hardware - still needs solid methods for capturing the dynamic and subjective experience of using digital technologies. As Lev Manovich says in The Algorithms of Our Lives, when we use software we're "engaging with the dynamic outputs of computation"; studying software culture requires us to "record and analyze interactive experiences, following individual users as they navigate a website or play a video game ... to watch visitors of an interactive installation as they explore the possibilities defined by the designer—possibilities that become actual events only when the visitors act on them".

The Internet Archive does a great job, but in researching the last twenty years of internet history I'm constantly hitting the limits of their ability to capture dynamic content, let alone the nuance of interfaces. The paradox is that as more of our experiences are mediated through online spaces and the software contained within small boxy devices, we risk leaving fewer traces of our experiences than past generations.

Saturday, 23 March 2013

New challenges in digital history: sharing women's history on Wikipedia - my talk notes

I'm at The Albert M. Greenfield Digital Center for the History of Women's Education at Bryn Mawr College for the inaugural Women's History in the Digital World Conference. Since I'm about to speak and ask historians to share their research and write history in public, I thought I should also be brave and share my draft talk notes (which I've now updated with formatted references, though Blogger is still re-formatting things slightly oddly).

Introduction: New challenges in digital history: sharing women's history on Wikipedia

[slide – title, my details]
Hi, I'm Mia. I'm actually doing a PhD on scholarly crowdsourcing, or collaboratively creating online resources, and thinking about the impact of digitality on the practices of historians, so this paper is indirectly related to my research but isn't core to it.
I proposed this paper as a deliberate provocation: 'if we believe the subjects of our research are important, then we should ensure they are represented on freely available encyclopedic sites like Wikipedia'. Just in case you're not familiar with it, Wikipedia is a free online encyclopedia 'that anyone can edit.' It contains 25 million articles, over 4 million of them in English and the rest in 285 other languages, and has 100,000 active contributors[1].

'Brilliant Women' at the National Portrait Gallery

The genesis of this paper was two-fold. The 2008 exhibition 'Brilliant Women: 18th Century Bluestockings' at the UK National Portrait Gallery made the point that 'Despite the fact that 'bluestockings' made a substantial contribution to the creation and definition of national culture their intellectual participation and artistic interventions have largely been forgotten'. As a computer programmer, reinventing the wheel and other inefficient processes drive me crazy, and I began to think about how digital publishing could intervene in the cycle of remembering and forgetting that seemed to be the fate of brilliant women throughout history. How could historians use digital platforms to stop those histories being lost and to make them easy for others to find?

[Screenshot – Caitlin Moran quote from How to be a woman: 'Even the ardent feminist historian, male or female – citing Amazons and tribal matriarchies and Cleopatra – can't conceal that women have done basically f*ck-all for the last 100,000 years']
A few years later, by then a brand-new PhD student, I attended the Women's History Network conference in London in 2011 and learnt of so many interesting lives that challenged conventional mainstream historical narratives of gender. I wished that others could hear those stories too. But when I asked if any of these histories were available outside academia on sites like Wikipedia, there was a strong sense that editing Wikipedia was something that other people did. But who better to make a case for better representation of women's histories than the people in that room? Who else has the skills, knowledge and the passion? Some academic battles may have been won regarding the importance of women's histories, but representing women's histories on the sites where ordinary people start their queries is hugely important. The quote on this slide illustrates why – even if it was meant in jest, it represents a certain world view.

WikiWomen's Collaborative

[slide – logos from http://en.wikipedia.org/wiki/Wikipedia:WikiWomen%27s_History_Month http://meta.wikimedia.org/wiki/WikiWomen%27s_Collaborative ]
Of course, I'm not the first, and definitely not the most qualified to make this point. I would also like to acknowledge the work of many groups and individuals, particularly within Wikipedia, that's preceded this.[2]

[slide – Scripps editathon, #tooFEW]
Things move fast in the digital world and we're at a different moment than the one when I proposed this paper. Gender issues on Wikipedia had been discussed for a number of years but there's been a recent burst of activity, including the #tooFEW ('Feminists Engage Wikipedia') editathons ('a scheduled time where people edit Wikipedia together, whether offline, online, or a mix of both'[3]), held online and in person across four physical sites.[4][5] I was going to be provocative and ask you to create Wikipedia entries about the histories you've invested so much in researching, but some of that is happening already. As a result, this is version 2 of this paper, but my starting question remains the same – assuming we believe that women's history is important, what's wrong with our current methods of research dissemination and dialogue?

The case of the Invisible Scholarship

[slide – outline of section]
Cumulative centuries of archival and theoretical work have been spent recovering women's histories, yet much of this inspiring scholarship might as well not exist when so few people have access to it. Sadly, it's currently the case that scholarship that isn't deliberately made public is invisible outside academia. The open access movement, with all its thorny complications, is one potential solution. Engaging in new forms of open scholarship and disseminating research on sites where the public already goes to learn about history is another.

If it's not Googleable, it doesn't exist.

[slide – screenshot of unsuccessful search for Ina von Grumbkow]
Most content searches start and end online. The content and links available to search engines inform their assumptions about the world, and they in turn shape the world view presented on the results screen. If the name of a historical figure doesn't show up in Google, how else would someone find out about them? While college students might be heavy users of Google's specialist Google Scholar search, it's unlikely that people would come across it accidentally, not least because there's a 'semantic gap' between the language used in academia and the language used in everyday speech. Writing for Wikipedia means writing in everyday language, and the site is heavily indexed by search engines – it doesn't take long for content created on Wikipedia – even on a user's talk page and not the main site – to show up in Google results. So one reason to take history on Wikipedia seriously is that it affects what search engines know about the world.

'Did you mean… hegemony?'

Search for 'Viscountess Ranelagh', Google says 'Did you mean Viscount'. No.

[slide – screenshot of search for 'Viscountess Ranelagh and the Authorisation of Women's Knowledge in the Hartlib Circle', Google says 'Did you mean Viscount'. No.]
Scholarship and sources contained in specialist online archives and repositories are often off-limits to the Google bots that crawl the web looking for content to index. Because search engines normalise certain assumptions about the world, getting more content about women's histories in publicly accessible spaces will eventually have an effect on the algorithms that determine suggestions for 'did you mean' etc. Contributions to sites like Wikipedia can eventually become contributions to the 'knowledge graphs' that determine the answers to questions we ask online.

If it's behind a paywall, it only exists for a privileged few

[Slide - Screenshot of blocked attempt to access 'Wives and daughters of early Berlin geoscientists and their work behind the scenes']
Specialist users will be able to find academic research via Google Scholar, but any independent scholars in attendance will be able to speak to the difficulties in gaining access to journal articles without membership of an institutional library. Journal articles obviously have a lot of value within academic communities, but the research they represent is only available to a privileged few.

Why does Wikipedia matter?

[slide: For some, Wikipedia is the font of all wisdom]
Wikipedia is one of the most visited websites in the world. As one commentator said, 'people turn to Wikipedia as an objective resource' but 'it's not so objective in many ways.'[6]

However, as the free online encyclopedia 'that anyone can edit', it also provides the ability to take direct action to fix the under-representation of women's history. William Cronon, President of the AHA, said, 'Wikipedia provides an online home for people interested in histories long marginalized by the traditional academy'[7] – this may not be entirely true yet, but we can hope.

Wikipedia is not yet encyclopedic

[Slide – Ina screenshot]
The English version of Wikipedia has over 4 million articles but it still has some way to go to become truly encyclopedic. Martha Saxton has noted the absence of women's history content on Wikipedia and was distressed by 'its superficiality and inaccuracies when present'.[8] Just as female assistants, secretaries, collectors, illustrators, correspondents, translators, salonists, cataloguers, text book writers, popularisers, explorers, pioneers and colleagues have been left out of traditional academic histories and gradually reclaimed by historians, they are often still invisible on Wikipedia. This may be partly because not enough women edit Wikipedia – as Wikipedia User Gobonobo says, 'editors often contribute to topics they are familiar with and that concern them [...] This systemic bias has the potential to exacerbate an historical record that already gives undue emphasis to men.'[9]

The under-representation of women's history undermines Wikipedia's claim to be encyclopedic. Issues include missing entries or omissions in coverage for existing topics, entries with inaccurate content, a failure to represent a truly 'neutral point of view', and a representation of 'male' as the default gender.

Many notable women have been buried in pages titled for their husbands, brothers, tutors, etc. In 1908 Ina von Grumbkow undertook an expedition to Iceland. She later made significant contributions to the field of natural history and wrote several books, but other than passing references online and a mention on her husband's Wikipedia page, her story is only available to those with access to sources like the 'Earth Sciences History' journal[10][11].

[Slide: 'Main articles: List of Fellows of the Royal Society and List of female Fellows of the Royal Society'.]
Some of the categories used in Wikipedia posit the default gender as male. For example, there's a 'List of Fellows of the Royal Society' and a 'List of female Fellows of the Royal Society'.

Wikipedia and the challenges of digital history

Writing for Wikipedia encapsulates many, but not all, of the challenges of digital history.

New forms of writing

Writing for Wikipedia calls upon historians to write engaging, intellectually accessible, succinct text that still accurately represents its subject. It not only means valuing the work and skills in writing public history, it requires the ability to write history in public.

Writing for a 'neutral point of view' – one of the key values of Wikipedia – is challenging for historians. Many may find it difficult to believe that it's even possible, and it's difficult to achieve.[12]

Unlike traditional historical scholarship, characterised by 'possessive individualism' [13] and honed to perfection before publication, Wikipedia entries are considered a work in progress, and anyone who spots an issue is asked to fix it themselves or flag it for others to review.

It won't advance your career

While it might have a large public impact, editing Wikipedia is work that isn't credited in academia, and it takes time that could be used for projects that would count for career advancement. More importantly from Wikipedia's point of view, you can't promote your own work on the site, so writing about your own research interests is not straightforward if not many people have published in your area of expertise.

“On the internet, nobody knows you're a professor”

In a comment with 'pointers for academics who would like to contribute to Wikipedia' on a Chronicle article, commentator 'operalala' said, '"On the internet nobody knows you're a professor." If you're used to deferential treatment at your home institution, you'll be treated like everybody else in the Wide Open Internet.'[14] Or in William Cronon's words, you must 'give up the comfort of credentialed expertise'.[15] Anyone can edit, re-shape or even delete your work.

Just like academia, Wikipedia has ways of establishing the credibility and reputation of a contributor, and just like any other community, there are etiquettes and conventions to observe. Claire Potter warns that, as newcomers to the community, academics should not think of Wikipedia as 'another realm for intellectuals to colonize and professionalize'.[16]

The opportunities and challenges of women's history as public history on Wikipedia

Opportunities

#WomenSciWP editathon at the Royal Society

Wikipedia uses red links to represent entries that could be created but don't yet exist. Women's history editathons often create lists of red-linked names as suggested topics that could be created.[17] Projects on and outside Wikipedia, and events at institutions like the Smithsonian and Royal Society and, just last weekend, at three THATCamps across the United States, might be part of a critical mass of people learning how to edit Wikipedia to better include women's history.

Compared to the lengthy process of writing for academic publication, a new Wikipedia entry can be created in a few hours, allowing for time to structure the content and format the references as necessary to pass the first quality bar. An existing entry can be corrected in minutes. Each editathon or personal edit improves the representation of women's history, and there's something very satisfying about turning red links blue.

Ina von Grumbkow's name red-linked on her husband's Wikipedia page

Adding the brackets that turn a piece of text into a red link, suggesting the possibility of an entry to be created, is a small but potentially powerful intervention. Red links can render the gaps and silences visible.

Resistance

Creating or editing entries on women's history may be relatively easy, but making sure they stay there is less so. There are countless examples of women having to fight to keep changes in as other editors revert them or argue about their choice of sources or the significance or notability of their topic. Wikipedians are zealous in preventing spammers and crackpots polluting the quality of the site, which explains some of the rapid 'nominations for deletion', but some pockets of the site are also hostile to women's history or to women themselves.

Saxton said editing Wikipedia is 'not for the faint of heart' and 'a lesson in how little women's history has penetrated mainstream culture'. There's work to be done in sharing and normalising an understanding of the historical circumstances and cultural contexts that created difficulties for women. We might know that, as Janet Abbate said, 'The laws and social conventions of a given time and place strongly shape the kinds of technical training available to women and men, the career options open to them, their opportunities for advancement and recognition' [18] but until other Wikipedians understand that, there will continue to be issues around 'notability'. Having those conversations as many times as necessary might be tiring and uncomfortable or even controversial, but it's part of the work of representing women's history on Wikipedia.

Tensions

'Reliable sources'

Wikipedians may have different definitions of 'reliable sources' than scholarly researchers. As one academic discovered:
"Wikipedia is not 'truth,' Wikipedia is 'verifiability' of reliable sources. Hence, if most secondary sources which are taken as reliable happen to repeat a flawed account or description of something, Wikipedia will echo that." [19]

The same gatekeepers matter

As some academics have found, 'Wikipedia differs from primary-source research, from scholarly writing, and how it privileges existing rather than new knowledge'.[20][21] Wikipedia is not the place to redress fundamental issues with silences in the archives or in the profession overall, not least because on Wikipedia, primary research is bad and secondary sources are good.[22] This puts the onus back on to traditional academic publishing in peer-reviewed journals and books that can be cited in Wikipedia articles, though other published works such as 'credible and authoritative books' and 'reputable media sources' can also be cited.

'Notability'

'A person is presumed to be notable if he or she has received significant coverage in reliable secondary sources that are independent of the subject. [...] the person who is the topic of a biographical article should be "worthy of notice" – that is, "significant, interesting, or unusual enough to deserve attention or to be recorded" within Wikipedia as a written account of that person's life.' [23] 'The common theme in the notability guidelines is that there must be verifiable, objective evidence that the subject has received significant attention from independent sources to support a claim of notability.' [24] This creates obvious difficulties for some women's histories.

It's also difficult to judge where 'notability' should end. When does focusing on exceptional women become counter-productive? When do we risk creating a new canon? When does it stop being remarkable that a woman became prominent in a field and start being more accepted, if still not expected? [25] At what point should writing shift from individual entries to integration into more general topics?

Conclusion

Sometimes it's hard to tell whether Wikipedia lags behind academia's acceptance and general integration of women's history into mainstream history or whether it is representative of the field's more conservative corners. Recent digital history projects are doing a good job in explaining some of the issues with key sources for Wikipedia like the Oxford Dictionary of National Biography [26] , and I'd hope that this continues. As Martha Saxton said, 'integrating women's experience into broad subjects' is 'both more challenging intellectually and ultimately, more to the point of the overall project of bringing women into our acknowledged history'. [27]

But it's also clearly up to us to make a difference. If it's worth researching the life and achievements of a notable woman, it's worth making sure their contribution to history is available to the world while improving the quality of the world's biggest encyclopaedia. And it doesn't mean going it alone. It's still just Women's History Month so it's not too late to sign up and join one of the women's history projects, or to plan something with your students. [28] [29] [30]

I'd like to close with quotes from two different women. Executive Director of the Wikimedia Foundation, Sue Gardner: 'Wikipedia will only contain "the sum of all human knowledge" if its editors are as diverse as the population itself: you can help make that happen. And I can't think of anything more important to do, than that.' [31]

And to quote Laura Mandell's keynote yesterday: 'Let's write and publish about each other's projects so that future historians will have those sources to write about. ... Nothing changes through thinking alone, only through massive amounts of re-iteration'. [32]

[Update: based on questions afterwards, you may want to get started with Wikipedia:How to run an edit-a-thon, or sign up and say hello at Wikipedia:WikiProject Women's History. You could also join in the Global Women Wikipedia Write-In #GWWI on April 26 (1-3pm, US EST), and they have a handy page on How to Create Wikipedia Entries that Will Stick.

And update April 30, 2013: check out 'Learning to work with Wikipedia - New Pages Patrol and how to create new Wikipedia articles that will stick' by the excellent Adrianne Wadewitz.

Update, June 9: if you're thinking of setting a class assignment involving editing Wikipedia, check out their 'For educators' and 'Assignment Design' pages for tips and contact points. June 18: see also Nicole Beale's 'Wikipedia for Regional Museums'.

Update, August 21, 2013: content on Wikipedia appears to have had an additional boost in Google's search results, making it even more important in shaping the world's knowledge. More at 'The Day the Knowledge Graph Exploded'.

New link, February 2014: Jacqueline Wernimont's Notes for #tooFEW Edit a thon based on a training session by Adrianne Wadewitz are a useful basic introduction to editing.]

References

[1] Various. ‘Wikipedia’. 2013. Wikipedia. http://en.wikipedia.org/wiki/Wikipedia.

[2] See for example http://en.wikipedia.org/wiki/Wikipedia:WikiProject_Women%27s_History http://en.wikipedia.org/wiki/Wikipedia_talk:WikiProject_Feminism http://en.wikipedia.org/wiki/Wikipedia:Meetup/DC_30

[3] https://en.wikipedia.org/wiki/Wikipedia:How_to_run_an_edit-a-thon

[4] http://en.wikipedia.org/wiki/Wikipedia:Meetup/Feminists_Engage_Wikipedia

[5] Barnett, Fiona. 2013. ‘#tooFEW - Feminists Engage Wikipedia’. HASTAC. March 11. http://hastac.org/blogs/fionab/2013/03/11/toofew-feminists-engage-wikipedia.

[6] Gobry, Pascal-Emmanuel. 2011. ‘Wikipedia Is Hampered By Its Huge Gender Gap’. Business Insider. January 31. http://www.businessinsider.com/wikipedia-is-hampered-by-its-huge-gender-gap-2011-1#.

[7] Cronon, William. 2012. ‘Scholarly Authority in a Wikified World’. Perspectives on History, American Historical Association. February 7. http://www.historians.org/perspectives/issues/2012/1202/Scholarly-Authority-in-a-Wikified-World.cfm.

[8] Saxton, Martha. 2012. ‘Wikipedia and Women’s History: A Classroom Experience’. Writing History in the Digital Age. http://writinghistory.trincoll.edu/crowdsourcing/saxton-etal-2012-spring/.

[9] Gobonobo. 2013. ‘User:Gobonobo/Gender Gap Red List’. Wikipedia. https://en.wikipedia.org/wiki/User:Gobonobo/Gender_Gap_red_list

[10] Various. ‘Hans Reck’. Wikipedia. https://en.wikipedia.org/wiki/Hans_Reck

[11] Mohr, B. A. R. 2010. Wives and daughters of early Berlin geoscientists and their work behind the scenes. Earth Sciences History 29 (2): 291–310.

[12] As commenter Operalala suggested, one challenge is recognising ‘the difference between the plurality of academia and the singularity of a Wikipedia article’. Comment http://chronicle.com/article/The-Undue-Weight-of-Truth-on/130704/#comment-437781354 on Messer-Kruse, Timothy. 2012. ‘The “Undue Weight” of Truth on Wikipedia’. The Chronicle of Higher Education. February 12. http://chronicle.com/article/The-Undue-Weight-of-Truth-on/130704/.

[13] Rosenzweig, Roy. 2006. ‘Can History Be Open Source? Wikipedia and the Future of the Past’. The Journal of American History 93 (1) (June): 117–46. https://chnm.gmu.edu/essays-on-history-new-media/essays/?essayid=42

[14] Operalala on Messer-Kruse, 2012.

[15] Cronon, 2012.

[16] Potter, Claire. 2013. ‘Looking for the Women on Wikipedia: Readers Respond’. The Chronicle of Higher Education. March 14. http://chronicle.com/blognetwork/tenuredradical/2013/03/looking-for-the-women-on-wikipedia-readers-respond/

[17] For example, https://en.wikipedia.org/wiki/User:Gobonobo/Gender_Gap_red_list https://en.wikipedia.org/wiki/User:T._Anthony/Women_in_Red https://en.wikipedia.org/wiki/User:Dsp13/Redlinks/Women

[18] Janet Abbate, "Guest Editor's Introduction: Women and Gender in the History of Computing," IEEE Annals of the History of Computing, vol. 25, no. 4, pp. 4-8, October-December, 2003

[19] Messer-Kruse, 2012.

[20] Anderson, Jill. 2013. ‘A Supposedly Fun Thing I’ll (Probably) Never Do Again’. True Stories Backward. http://girlhistorian.wordpress.com/2013/03/16/a-supposedly-fun-thing-ill-probably-never-do-again/

[21] Messer-Kruse, 2012.

[22] Various. 2013. ‘Wikipedia:No Original Research’. Wikipedia. https://en.wikipedia.org/wiki/Wikipedia:No_original_research

[23] Various. 2013. ‘Wikipedia:Notability (people)’. Wikipedia. http://en.wikipedia.org/wiki/Wikipedia:Notability_(people)

[24] Various. 2013. ‘Wikipedia:Notability’. Wikipedia. https://en.wikipedia.org/wiki/Wikipedia:NOTE

[25] Or as Christie Aschwanden says when proposing the 'Finkbeiner test' for contemporary journalism about women in science, 'treating female scientists as special cases only perpetuates the idea that there’s something extraordinary about a woman doing science'. Aschwanden, Christie. 2013. ‘The Finkbeiner Test’. Double X Science. March 5. http://www.doublexscience.org/the-finkbeiner-test/

[26] For a recent example, see ‘An Entry of One’s Own, or Why Are There So Few Women In the Early Modern Social Network?’ 2013. Six Degrees of Francis Bacon. March 8. http://sixdegreesoffrancisbacon.com/post/44879380376/an-entry-of-ones-own-or-why-are-there-so-few-women-in and ‘Gender and Name Recognition’. 2013. Six Degrees of Francis Bacon. March 20. http://sixdegreesoffrancisbacon.com/post/45833622936/gender-and-name-recognition

[27] Saxton, 2012

[28] http://en.wikipedia.org/wiki/Wikipedia:WikiWomen%27s_History_Month

[29] Potter, Claire. 2013. ‘Prikipedia? Or, Looking for the Women on Wikipedia’. The Chronicle of Higher Education. March 10. http://chronicle.com/blognetwork/tenuredradical/2013/03/prikipedia-looking-for-the-women-on-wikipedia/

[30] For advice, see: Wikimedia Outreach. 2013. ‘Education Portal/Tips and Resources’. Wikipedia Outreach Wiki. http://outreach.wikimedia.org/wiki/Education_Portal/Tips_and_Resources

[31] A comment on Gardner, Sue. 2010. ‘Unlocking the Clubhouse: Five Ways to Encourage Women to Edit Wikipedia’. Sue Gardner’s Blog. November 14. http://suegardner.org/2010/11/14/unlocking-the-clubhouse-five-ways-to-encourage-women-to-edit-wikipedia/

[32] Mandell, Laura. 2013. "Feminist Critique vs. Feminist Production in Digital Humanities." Keynote presented at the Women’s History in the Digital World conference, Bryn Mawr College, Pennsylvania, March 22, 2013.