Many digital humanists are discovering the power and flexibility of “R.” DMDH is fortunate to have Leigh Fisher, a graduate student in Biostatistics, lead this workshop and help to field questions. We've also recently begun circulating an informal survey on Twitter to gather information about how humanists are using R, and to investigate the many ways in which it is taught, whether in course instruction or through workshops like ours.
Because R does have a steep learning curve, our workshop is intended to be only an introduction to R's basic functionality and interdisciplinary applications. A link to the slides can be found at the bottom of this post. In addition to these slides, you'll find a short, yet immensely useful, list of resources for further instruction and inquiry. The most strongly recommended is Coursera's 4-week course on R, taught by two professors from Johns Hopkins University's Department of Biostatistics, which is a part of their data science certificate program.
- Statistical methods for studying literature using R (a tutorial by Jeff Rydberg-Cox of the University of Missouri-Kansas City)
- Executing R in PHP (Matthew Jockers)
- Sample R script to generate scatterplot from CSV input
- David Birnbaum’s DH Tools page
- Coursera course on R
- Anthony Kenny's The Computation of Style: An Introduction to Statistics for Students of Literature and the Humanities
- J.F. Burrows' Computation into Criticism: A Study of Jane Austen’s Novels
- Douglas Biber's Corpus Linguistics: Investigating Language Structure and Use
- R.H. Baayen's Analyzing Linguistic Data: A Practical Introduction to Statistics Using R
- Matt Jockers's Text Analysis with R for Students of Literature
- Stefan Gries' Statistics for Linguistics with R
- Stefan Gries' Quantitative Corpus Linguistics with R
- Alan Liu’s Resource Page
Interested in Digital Humanities, but not sure where to start? Having trouble identifying the right technology for your research project? Interested in getting more involved with the UW DH community? Come to office hours!
Members of Demystifying Digital Humanities, UW-IT, and UW Libraries are offering informal office hours for Digital Humanities researchers this quarter. We offer individual consulting to help researchers plan and execute emergent research projects and explore potential solutions for them. Digital Humanities Office Hours will be held on Thursday afternoons, 3:00-5:00 pm, in OUGL 230 on April 24th, May 8th, May 22nd, and June 12th.
Here are just a few of the questions that you might ask, and that we would be happy to talk through with you.
- I think I want a ____________ (database, timeline, exhibit for displaying images, etc.); what platforms should I look at?
- I want to __________; what kind of technical skills do I need?
- How do I learn TEI/MySQL/Omeka/Python/ArcGIS?
- Where do I find resources and/or funding that will help me _______?
- How do I prepare my data for that platform?
- How do I digitize [material]?
- How do I install [platform]?
- Can you help me clean up this text so that I can use it with [platform]?
These office hours are a pilot project — we hope to learn more about the type of support that people doing DH at UW need. The more people who come, the more we can learn, and the more likely it is that we'll be able to continue to offer office hours in the future. So: don't wait! Come to office hours this quarter!
Registration is now open for our spring Demystifying Digital Humanities workshops, which focus on project ideation and development. If you're thinking about developing a digital humanities project, or even just want to learn more about the tools that you might use, then these workshops will help you explore your options.
You can register through the Catalyst survey here:
Our spring workshops focus on project development — you can see the details below. Please note, due to the Easter weekend holiday, these workshops will be held one week apart, on April 5th and 12th.
April 5th, 9:30a.m.-1:00p.m.: Big Project; Small Project: Steps in Ideation and Development: This workshop takes you step by step through the work of developing a project, growing it, and learning how to present what you're doing while you work: at conferences, workshops, and in scholarship/grant applications. We'll also talk about project management software, the decision-making process for learning new skills, and finding collaborators.
April 12th, 9:30a.m.-1:00p.m.: Available Tools: Free, Cheap, and Premium: This workshop is all about finding the tools out there that you can use to get started. We'll talk about what the UW makes available (more software than you may have realized!), and the features you need to look for and be aware of when you choose to use a particular tool.
Today we're experimenting with a new kind of event — one where people can get together and test out various data visualization tools, and just get a sense of what it's like to work with them, while Brian and Sarah and I are easily available to provide tech support.
To do this, we're providing links to several different tools, and tutorials that accompany them.
Gephi is a network visualization tool — it allows you to see the connections between various individuals. Here’s a good tutorial for getting started with it. Gephi also provides a number of test datasets that you can work with.
Edited to add: if you're working with Gephi on a MacBook, without a mouse, then you can hold down the Command key while navigating on your trackpad to move the graph around.
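If you'd rather bring your own data than use a test dataset, one common route is to save it as an edge list: a CSV file with one row per connection. Here is a minimal Python sketch of that idea. It assumes Gephi's spreadsheet import with `Source`, `Target`, and `Weight` column headers, and the names in the sample data are hypothetical placeholders, not a real dataset.

```python
import csv
import io
from collections import Counter


def write_edge_list(pairs, handle):
    """Write (person, person) pairs to handle as a weighted, undirected edge list."""
    # Sort each pair so that (A, B) and (B, A) count as the same connection.
    weights = Counter(tuple(sorted(pair)) for pair in pairs)
    writer = csv.writer(handle)
    # Header names assumed to match what Gephi's spreadsheet import expects.
    writer.writerow(["Source", "Target", "Weight"])
    for (source, target), weight in sorted(weights.items()):
        writer.writerow([source, target, weight])
    return len(weights)


# Hypothetical sample: who exchanged letters with whom.
letters = [("Ada", "Ben"), ("Ben", "Ada"), ("Ben", "Cy")]

buffer = io.StringIO()
edge_count = write_edge_list(letters, buffer)
print(buffer.getvalue())
```

To save a real file instead of printing, pass a file opened with `open("edges.csv", "w", newline="", encoding="utf-8")` in place of the `StringIO` buffer (the filename is just an example).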
ManyEyes is a text visualization tool created by IBM. Unfortunately, it only runs in browsers with Java enabled, so you'll need to run it in Safari or Firefox. There are a number of different visualizations that you can run, and a huge number of data sets to work with; just search for a particular text. We have one sample data set below, and will have more available during the workshop. Because of copyright issues, they can't be linked here.
MIT's Simile Exhibit allows you to create a timeline using a Google Spreadsheet. It's a really useful tool, though it's a little more work than some of the others. See an example of its use here. There's a tutorial here, written by David Karger of MIT. We think you might also find it useful to have a version of the HTML code with extra comments for explanation of what each part is doing.
Finally, we think you might enjoy playing with the TAPoR (Text Analysis Portal) tool suite. TAPoR has hundreds of tools for visualization, and most of them allow you to simply copy and paste plain text in, or upload a plain text file. Some of these tools are in development, meaning that they're a little bit crash-prone; you just have to keep experimenting with them. Two tools that work well, however, are Textometrica and Voyant Cirrus.
If you want to find texts to play around with, Project Gutenberg is a great source — just look for the plain text file with UTF-8 encoding.
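If you're curious what a word-frequency tool is doing under the hood, here is a rough Python sketch that counts the most common words in a plain text. It is only an illustration, not how any of the tools above are actually implemented; the function names and the very simple tokenizing rule (lowercase letters and apostrophes) are our own assumptions.

```python
import re
from collections import Counter


def word_frequencies(text, top_n=10):
    """Return the top_n most common words in text as (word, count) pairs."""
    # Simplistic tokenizer: lowercase everything, keep runs of letters and apostrophes.
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(words).most_common(top_n)


def word_frequencies_from_file(path, top_n=10):
    """Same as word_frequencies, but reads a UTF-8 plain text file first."""
    with open(path, encoding="utf-8") as handle:
        return word_frequencies(handle.read(), top_n)


# Tiny made-up sample so the sketch runs without downloading anything.
sample = "The cat saw the hat. The hat sat."
print(word_frequencies(sample, top_n=2))
```

With a downloaded plain text file, you would call `word_frequencies_from_file` with the path to wherever you saved it.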
We've uploaded a few more text files that you might want to play with:
There are two upcoming events that we think will be of interest to visitors to DMDH.org:
Play With Your Data!
February 13th, 12:30-2:00, Simpson Center Collaboration Lab, 218D
If you want to test out digital humanities tools, then this event is your chance. Come to the Simpson Center's new collaboration space to experiment with a variety of text and content analysis and visualization platforms. Experiment with organizing your data, or learn about ways to clean large data sets. Don't worry if you feel like you don't have a data set yet: we'll have practice data that you can play with, in a variety of tools, including MIT's Simile Exhibit, IBM's ManyEyes, Gephi, and more!
No prior programming experience necessary.
Collaborating With Strangers
February 19th, 2:00-4:00, Allen Library Research Commons
Looking for a way to connect with others on campus involved in digital scholarship? Interested in learning about the skills and assets that exist on this topic on campus?
Collaborating with Strangers workshops connect UW students, faculty and researchers during 3-minute speed meetings where participants exchange and generate ideas and build new foundations to start or improve upon research projects. It might otherwise take you years to meet the people you’ll connect with during one-on-one…