Tag Archives: Digital History

Still Playing Catch-Up

As I was flipping through the February 2014 issue of the American Historical Review, I was encouraged to see that the American historical profession’s flagship journal seems to be doing a pretty decent job of publishing the impressive work of female historians. Three out of its four main articles were written by women and four out of the five books in its “Featured Reviews” section were also by women. That’s encouraging. But what about the rest of the February issue? Figuring out how many women are in the 176 contributors for this single issue is a lot harder. And what about not just this issue, but all five issues it publishes annually? And what about not just this year, but every year since its inception in 1895?

Looking at gender representation in the American Historical Review is exactly the kind of historical project that lends itself well to digital analysis. Collecting individual author information from 120 years of publication history would take an enormous amount of tedious labor. Fortunately, the information is already online. I wrote a Python script to scrape the table of contents from every AHR issue and then, with the help of Bridget Baird, began to process all of this text to try to extract the books that were reviewed in the AHR, their authors, and the names of the people reviewing them. The data was something of a nightmare, but we were eventually able to get everything we wanted: around 60,000 books, authors, and reviewers. The challenge then became: was there a way to automatically identify the gender of all of these different people? Especially for a dataset that spanned more than a hundred years, we needed a way to take into account potential changes in naming conventions. A historian named Leslie who was born before 1950 was likely to be a man, but if that same Leslie was born after 1950 the person was likely to be a woman. Bridget’s solution was for us to write a program that relies on a database of names from the Social Security Administration dating back to 1880 to account for these changes. This approach is not without problems. It only includes American names while subtly reinforcing an insidious gender binary framework. Nevertheless, it does contribute a useful new digital humanities methodology and one that we are planning to explore with Lincoln Mullen in more depth.
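A minimal sketch of this year-aware approach, not the actual program described above: it assumes the name data has been loaded as (name, year, sex, count) rows, and the sample counts, the function names, and the 75% threshold are all invented for illustration.

```python
from collections import defaultdict

# Hypothetical rows in the shape (name, year, sex, count).
# The numbers are invented for illustration, not real SSA figures.
SAMPLE_ROWS = [
    ("leslie", 1900, "M", 900), ("leslie", 1900, "F", 100),
    ("leslie", 1970, "M", 200), ("leslie", 1970, "F", 800),
    ("mary", 1900, "F", 5000), ("mary", 1900, "M", 20),
]


def build_index(rows):
    """Map name -> year -> {"F": count, "M": count}."""
    index = defaultdict(lambda: defaultdict(lambda: {"F": 0, "M": 0}))
    for name, year, sex, count in rows:
        index[name.lower()][year][sex] += count
    return index


def guess_gender(first_name, start_year, end_year, index, threshold=0.75):
    """Return "female", "male", or "unknown" for a name within a birth-year window."""
    female = male = 0
    for year, counts in index.get(first_name.lower(), {}).items():
        if start_year <= year <= end_year:
            female += counts["F"]
            male += counts["M"]
    total = female + male
    if total == 0:
        return "unknown"  # initials and names missing from the data land here
    if female / total >= threshold:
        return "female"
    if male / total >= threshold:
        return "male"
    return "unknown"  # too evenly split to call


index = build_index(SAMPLE_ROWS)
print(guess_gender("Leslie", 1880, 1950, index))
print(guess_gender("Leslie", 1951, 2000, index))
```

Returning "unknown" rather than forcing a guess mirrors the cautious handling of initials and ambiguous names described in the chart captions.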

This might come as a real shock, but the American Historical Review didn’t feature very many women for much of its publication history. Over the first eighty years of the AHR’s existence there were rarely more than a handful of books written by female authors in any given issue – as a percentage of all authors, women made up less than 10% of reviewed books through the 1970s. But things began to change in the late 1970s, when female authors began a steady ascent in the AHR’s reviews. By the end of the 1980s women’s books had nearly doubled in the journal. By the twenty-first century there were three times as many women as there had been in the 1970s.

gender_percent_byyear

Gender of book authors (as a percent of all authors) in the American Historical Review between 1895 and 2013. The number of authors categorized as “Unknown” in the early years stems from the widespread use of initials (ex. K. T. Drew). Most of these authors were likely men, but we’ve erred on the safe side in categorizing them as Unknown. In the later years, many of the “Unknowns” stem from non-U.S. names.

But other numbers paint a less rosy picture. Lincoln Mullen’s recent work on history dissertations showed a similarly steady upwards trajectory in the number of female-authored history dissertations since 1950. Although it has plateaued in recent years, women have very nearly closed the gap in terms of newly completed history dissertations. But the glass ceiling remains stubbornly low in terms of what happens from that point onwards. In book reviews published in the AHR male authors continue to outnumber female authors by a factor of nearly 2 to 1. Whereas there is now a gap of around 3-5% separating the proportion of male and female dissertation authors, that gap jumps to 25-35% in terms of the proportion of male and female book authors being reviewed in the American Historical Review.

mf_diss_book_bluegreen

Gender of dissertation authors and of book authors in the American Historical Review. Note: The above chart only looks at authors whose gender was successfully identified by the program. It is also something of an apples-to-oranges comparison given that Lincoln and I were using slightly different methods, but it gives a rough sense for the gap between dissertations and the AHR.

On the reviewer side of the equation, things aren’t much better. There are still more than twice as many male reviewers as female reviewers in the AHR. But gender inflects this relationship in less direct ways. In particular, we can look at the gender dynamics of who reviews whom. About three times as many men write reviews of male-authored books as do women. In the case of female-authored books, there are slightly more male reviewers than female reviewers but the ratio is much closer to 50/50. In short, women are much more likely to write reviews of other women. And while men still write reviews of the majority of female-authored books, they tend to gravitate towards male authors – who are, of course, already over-represented in the AHR.

male_authors_withreviewers

Gender of reviewers for male-authored books. Note: The above chart only looks at authors and reviewers whose gender was successfully identified by the program.

female_authors_withreviewers

Gender of reviewers for female-authored books. Note: The above chart only looks at authors and reviewers whose gender was successfully identified by the program.
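The kind of cross-tabulation behind these two charts can be sketched as follows. This is an illustration only: the (author, reviewer) pairs are invented, and the function name is hypothetical rather than taken from the original program.

```python
from collections import Counter

# Hypothetical (author_gender, reviewer_gender) pairs, invented for illustration.
reviews = [
    ("male", "male"), ("male", "male"), ("male", "male"), ("male", "female"),
    ("female", "male"), ("female", "female"),
    ("male", "unknown"),
]


def reviewer_shares(pairs):
    """For each author gender, return the share of reviews written by each reviewer gender."""
    # Drop any pair where either gender could not be identified.
    known = [(a, r) for a, r in pairs if "unknown" not in (a, r)]
    counts = Counter(known)
    shares = {}
    for author in ("male", "female"):
        total = sum(n for (a, _), n in counts.items() if a == author)
        if total:
            shares[author] = {
                r: counts[(author, r)] / total for r in ("male", "female")
            }
    return shares


shares = reviewer_shares(reviews)
print(shares)
```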

Bridget and I were also able to extract the subjects used by the AHR to categorize their reviews. Although these conventions changed quite a bit over time, I took a stab at aggregating them into some broad categories for the past forty years. Essentially, I wanted to find out the gender representation within different historical fields. As you can see in the chart below, the proportion of men and women is not the same for all fields. Caribbean/Latin American history has had something approaching equal representation for the past decade-and-a-half. In both African history and Ancient/Medieval history female historians made some quite dramatic gains during the late-nineties and aughts. The guiltiest parties, however, are also the two subject categories that publish the most book reviews: Modern/Early Modern Europe and the United States/Canada. Both of them have made steady progress but still hover at around two-thirds male.

categories_gender_bytime

The different subjects are sorted left-to-right by the number of reviews in the AHR. Again, please note that the above chart only looks at authors whose gender was successfully identified by the program.

Women are now producing history dissertations at nearly the same rate as men, but the flagship journal of the American historical profession has yet to catch up. There are, of course, a lot of factors at play. This gap might reflect a substantial time-lag as a younger, more evenly-balanced generation gradually moves its way through the ranks even as an older, male-skewed generation continues to publish monographs. It might reflect biases in the wider publishing industry, or the fact that female historians continue to bear a disproportionate amount of the time-burden of caring for families. That the AHR continues to publish far more reviews of male authors than female authors is depressing, but unfortunately not surprising given the systemic inequalities that continue to exist across the profession.

Text Analysis of Martha Ballard’s Diary (Part 2)

Given Martha Ballard’s profession as a midwife, it is no surprise that she carefully recorded the 814 births she attended between 1785 and 1812. These events were given precedence over more mundane occurrences by noting them in a separate column from the main entry. Doing so allowed her to keep track not only of the births, but also record payments and restitution for her work. These hundreds of births constituted one of the bedrocks of Ballard’s experience as a skilled and prolific midwife, and this is reflected in her diary.

As births were such a consistent and methodically recorded theme in Ballard’s life, I decided to begin my programming with a basic examination of the deliveries she attended. This examination would take the form of counting the number of deliveries throughout the course of the diary and grouping them by various time-related characteristics, namely: year, month, and day of the week.

Process and Results

The first basic step for performing a more detailed text analysis of Martha Ballard’s diary was to begin cleaning up the data. One step was to take all the words and (temporarily) turn every uppercase letter into a lowercase letter. This kept Python from seeing “Birth” and “birth” as two separate words. For the purposes of this particular program, it was more important to distill words into a basic unit rather than maintain the complexity of capitalized characters.

Once the data was scrubbed, we could turn to writing a program that would count the number of deliveries recorded in the diary. The program we wrote does the following:

  1. Checks to see if Ballard wrote anything in the “birth” column (the first column of the entries that she also used to keep track of deliveries)
  2. If she did write anything in that column, check to see if it contains any of the words: “birth”, “brt”, or “born”.
  3. I then printed the remainder of the entries that contained text in the “birth” column but did not contain one of the above words. From this short list I manually added an additional seven entries into the program, in which she appeared to have attended a delivery but did not record it using the above words.

Using these parameters, the program could iterate through the text and recognize the occurrence of a delivery. Now we could begin to organize these births.
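The three steps above can be sketched roughly as follows. The entry layout (a dict with "date", "birth_column", and "text" keys), the sample entries, and the single manual date are assumptions made for illustration; only the three keywords come from the description above.

```python
from datetime import date

KEYWORDS = ("birth", "brt", "born")

# Hypothetical stand-in for the seven manually identified entries.
MANUAL_DELIVERY_DATES = {date(1790, 3, 2)}


def is_delivery(entry):
    """Apply the column check, then the keyword check, then the manual list."""
    column = entry["birth_column"].strip().lower()  # lowercase so "Birth" matches "birth"
    if not column:
        return False
    if any(word in column for word in KEYWORDS):
        return True
    return entry["date"] in MANUAL_DELIVERY_DATES


def unmatched_entries(entries):
    """Entries with text in the birth column that no rule recognizes, for manual review."""
    return [e for e in entries if e["birth_column"].strip() and not is_delivery(e)]


# Invented sample entries, not transcriptions from the diary.
entries = [
    {"date": date(1790, 3, 1), "birth_column": "Birth. A son.", "text": "..."},
    {"date": date(1790, 3, 2), "birth_column": "At Mrs. Hypothetical's", "text": "..."},
    {"date": date(1790, 3, 3), "birth_column": "", "text": "..."},
    {"date": date(1790, 3, 4), "birth_column": "Clear and cold", "text": "..."},
]

delivery_count = sum(is_delivery(e) for e in entries)
print(delivery_count, len(unmatched_entries(entries)))
```

Keeping the manual additions in a separate set leaves the keyword rule simple and makes the hand-made exceptions easy to audit.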

First, we returned the birth counts for each year of the diary, which were then inserted into a table and charted in Excel:

Year Deliveries

At the risk of turning my analysis into a John Henry-esque woman vs. machine, I compared my figures to the chart that Laurel Ulrich created in A Midwife’s Tale that tallied the births Ballard attended (on page 232 of the soft-cover edition). The two charts follow the same broad pattern:

YearDeliveriesCompare

Note: I reverse-built her chart by creating a table from the printed chart, then making my own bar graph. Somewhere in the translation I seem to have misplaced one of the deliveries (Ulrich lists 814 total, whereas I keep counting 813 on her graph). Sorry!

However, a closer look reveals small discrepancies in the numbers for each individual year. I calculated each year’s discrepancy as follows, using Ulrich’s numbers as the “true” figures (she is the acting President of the AHA, after all) from which my own figures deviated, and found that the average deviation for a given year was 4.86%. Apologies for the poor formatting, I had trouble inserting tables into WordPress:

        Deliveries Count
Year    Manual (Ulrich)    Computer Program    Difference    Deviation (from Ulrich)
1785    28                 24                  4             14.29%
1786    33                 35                  2             6.06%
1787    33                 33                  0             0.00%
1788    27                 28                  1             3.70%
1789    40                 43                  3             7.50%
1790    34                 35                  1             2.94%
1791    39                 39                  0             0.00%
1792    41                 43                  2             4.88%
1793    53                 50                  3             5.66%
1794    48                 48                  0             0.00%
1795    50                 55                  5             10.00%
1796    59                 56                  3             5.08%
1797    54                 55                  1             1.85%
1798    38                 38                  0             0.00%
1799    50                 51                  1             2.00%
1800    27                 23                  4             14.81%
1801    18                 14                  4             22.22%
1802    11                 12                  1             9.09%
1803    19                 18                  1             5.26%
1804    11                 11                  0             0.00%
1805    8                  8                   0             0.00%
1806    10                 11                  1             10.00%
1807    13                 13                  0             0.00%
1808    3                  3                   0             0.00%
1809    21                 22                  1             4.76%
1810    17                 18                  1             5.88%
1811    14                 14                  0             0.00%
1812    14                 14                  0             0.00%
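For readers who want to check the arithmetic, here is a short sketch that recomputes the deviation column and the average from the table's figures. It assumes the deviation is the absolute difference divided by Ulrich's count, which reproduces the percentages listed; the variable and function names are my own.

```python
# (year, Ulrich's manual count, program count), copied from the table above.
COUNTS = [
    (1785, 28, 24), (1786, 33, 35), (1787, 33, 33), (1788, 27, 28),
    (1789, 40, 43), (1790, 34, 35), (1791, 39, 39), (1792, 41, 43),
    (1793, 53, 50), (1794, 48, 48), (1795, 50, 55), (1796, 59, 56),
    (1797, 54, 55), (1798, 38, 38), (1799, 50, 51), (1800, 27, 23),
    (1801, 18, 14), (1802, 11, 12), (1803, 19, 18), (1804, 11, 11),
    (1805, 8, 8), (1806, 10, 11), (1807, 13, 13), (1808, 3, 3),
    (1809, 21, 22), (1810, 17, 18), (1811, 14, 14), (1812, 14, 14),
]


def deviation_percent(manual, computed):
    """Absolute difference as a percentage of the manual (Ulrich) count."""
    return abs(computed - manual) / manual * 100


deviations = {year: deviation_percent(m, c) for year, m, c in COUNTS}
average_deviation = sum(deviations.values()) / len(deviations)

print(f"{average_deviation:.2f}%")  # prints 4.86%
```

Note that this is an unweighted average of the yearly percentages, so a low-volume year such as 1801 (18 deliveries) counts as much as a busy year such as 1796 (59 deliveries).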

Keeping the knowledge in the back of my mind that my birth analysis differed slightly from Ulrich’s, I went on to compare my figures with other factors, including the frequency of deliveries by month over the course of the diary.

MonthDeliveries

If we extend the results of this chart and assume a standard nine-month pregnancy, we can also determine roughly which months Ballard’s neighbors were most likely to be having sex. Unsurprisingly, the warmer period between May and August appears to be a particularly fertile time:

Conceptions
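The shift from birth month to estimated conception month is simple modular arithmetic. A minimal sketch, assuming a flat nine-month offset and months numbered 1 to 12; the monthly counts below are invented, not the diary's figures.

```python
def conception_month(birth_month, gestation_months=9):
    """Return the month (1-12) that falls `gestation_months` before `birth_month`."""
    return (birth_month - 1 - gestation_months) % 12 + 1


def conceptions_by_month(births_by_month, gestation_months=9):
    """Shift a {birth month: count} tally back to estimated conception months."""
    shifted = {month: 0 for month in range(1, 13)}
    for birth_month, count in births_by_month.items():
        shifted[conception_month(birth_month, gestation_months)] += count
    return shifted


# Hypothetical monthly birth counts (1 = January), invented for illustration.
births = {2: 10, 3: 12, 11: 5}
conceptions = conceptions_by_month(births)
print(conceptions)
```

Under this offset, births in February through May correspond to conceptions in May through August.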

Finally, I looked at how often births occurred on different days of the week. There wasn’t a strong pattern, beyond the fact that Sunday and Thursday seemed to be abnormally common days for deliveries. I’m not sure why that was the case, but would love to hear speculation from any readers.

DeliveriesDayWeek
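The three groupings used above (year, month, and day of the week) can be sketched with a few counters. This assumes each detected delivery has already been reduced to a date; the sample dates are invented for illustration.

```python
from collections import Counter
from datetime import date

# date.weekday() returns 0 for Monday through 6 for Sunday.
WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]


def tally(delivery_dates):
    """Count deliveries by year, by month number, and by day of the week."""
    by_year = Counter(d.year for d in delivery_dates)
    by_month = Counter(d.month for d in delivery_dates)
    by_weekday = Counter(WEEKDAYS[d.weekday()] for d in delivery_dates)
    return by_year, by_month, by_weekday


# Hypothetical delivery dates, not taken from the diary.
sample = [date(1785, 1, 1), date(1785, 1, 2), date(1786, 1, 8)]
by_year, by_month, by_weekday = tally(sample)
print(by_year, by_month, by_weekday)
```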

Analysis

The discrepancies between the program’s tally of deliveries and Ulrich’s delivery count speak to broader issues in “digital” text mining versus “manual” text mining:

Data Quality

Ulrich’s analysis is a result of countless hours spent eye-to-page with the original text. And as every history teacher drills into their students when conducting research, looking directly at the primary documents minimizes the degrees of interpretation that can alter the original documents.  In comparison, my analysis is the result of the original text going through several levels of transformation, like a game of telephone:

Original text -> Typed transcription -> HTML tables -> Python list -> Text file -> Excel table/chart

Each level increases the chance of a mistake.  For instance, a quick manual examination using the online version of the diary for 1785 finds an instance of a delivery (marked by ‘Birth’) showing up in the online HTML, but which does not appear in the “raw” HTML files our program is processing and analyzing.

On the other hand, a machine doesn’t get tired and miscount a word tally or accidentally skip an entry.

Context

Ulrich brings to bear on her textual analysis years of historical training and experience along with a deeply intimate understanding of Ballard’s diary. This allows her to take into account one of the most important aspects of reading a document: context. Meanwhile, our program’s ability to understand context is limited quite specifically to the criteria we use to build it. If Ballard attended a delivery but did not mark it in the standard “birth” column like the others, she might mention it more subtly in the main body of the entry. Whereas Ulrich could recognize this and count it as a delivery, our program cannot (at least with the current criteria).

Where the “traditional” skills of a historian come into play with data mining is in the arena of defining these criteria. Using her understanding of the text on a traditional level, Ulrich could create far, far superior criteria than I could for counting the number of deliveries Martha Ballard attends. The trick comes in translating a historian’s instinctual eye into a carefully spelled-out list of criteria for the program.

Revision

One area that is advantageous for digital text mining is that of revising the program. Hypothetically, if I realized at a later point that Ballard was also tallying births using another method (maybe a different abbreviated word), it’s fairly simple to add this to the program’s criteria, hit the “Run” button, and immediately see the updated figures for the number of deliveries. In contrast, it would be much, much more difficult to do so manually, especially if the realization came at, say, entry number 7,819. The prospect of re-skimming thousands of entries to update your totals would be fairly daunting.

Geeking Out with History

I’ve been meaning to blog about this for a while, but last month the Digital Youth Project released the results of their three-year study: “Hanging Out, Messing Around, and Geeking Out.” The project, through funding from the John D. and Catherine T. MacArthur Foundation and collaborative scholarship between researchers in the UC system, looked to examine young people’s use of and interaction with new media in three realms: communication, learning and play. The overall results are both fascinating and encouraging, and I’d recommend at least reading the two-page summary of their findings.

The title stems from the three modes of use the researchers identify. Hanging out is the primarily social interaction between friends and peers, exemplified by social networking sites, instant messaging, or text messaging.  The second mode, messing around, is a form of digital exploration and expression, exemplified by uploading videos or photos, trying out different online applications, or passing along discoveries (think Elf Yourself or LOLCats, for two admittedly trivial examples). The final mode, geeking out, is diving into a specific topic, finding a community of like-minded enthusiasts, and working towards a degree of expertise in the area.

For anyone interested in youth participation in new media, reading the white paper is a must. K-12 educators with even a passing interest in what their students are doing should take a glance at it. Upon first skimming it, I thought it did a great job of refuting several commonly-held perceptions about young people’s activity online. First, under the hanging out topic, of particular note is the refutation of a still-pervasive myth that kids  go online and end up primarily interacting with strangers. Instead, the researchers write, “With these ‘friendship-driven’ practices, youth are almost always associating with people they already know in their offline lives.” For the majority of young people, the idea of going into online chatrooms and striking up friendships with complete strangers is largely a relic of the past. With messing around, the researchers stress the fact that young people are not passive recipients of media, but they are increasingly participatory members of a community. There is a critical element of trial-and-error, as kids explore and incorporate (or reject) new activities. Finally, my favorite mode: geeking out. The authors highlight the important point that “one can geek out on topics that are not culturally marked as ‘geeky’.”

For some reason, this relatively innocuous assertion provoked a lot of thought on my end. “Geeking out” still carries strong cultural connotations, bringing to mind images of traditional nerd culture – see Timothy Burke’s recent post on Batman comics, in which he offers a disclaimer: “This entry is going to be the maximally geeky one.” But “geeking out” as a verb can increasingly apply to non-“geeky” subjects: sports-obsessed fantasy football participants, any and every kind of music enthusiast, political gossip and speculation, etc. Breaking down barriers of entry into extremely specialized fields of interest has been one of the true hallmarks of the internet.

Which leads me to the title of this post: historians need to take advantage of the digital landscape to geek out with history. Without any amount of exaggeration, I can confidently say that my own geeking out with history has contributed just as much to my identity as a “historian” as my semesters of traditional scholarly training. Subscribing to blogs or listening to podcasts will not replace formal instruction. But it can certainly enhance the learning process, and in my mind, offers a higher ceiling for immediate participation and access. If a student writes an essay for a college course, most of the time the only reader will be the professor, and possibly some fellow students if it is a seminar. Meanwhile, if a student takes that same energy and enthusiasm for their subject online, they can read related thoughts from scholars around the world, exchange comments and dialogue with some of those scholars, or post that same essay as a blog and receive feedback from a much greater number of readers.

On the research side, I think many academics are awakening to the vast potential for vertical exploration of historical source material. Fifteen years ago, doing research on a historical subject meant countless trips to archives and libraries, excursions that were largely hindered by geographical and financial considerations. Today, digitization projects have greatly streamlined the process of finding and accessing this material. In doing so, they are opening up the door for anyone to geek out with history. Genealogists and armchair historians have always greatly contributed to the field, whether or not academics like to admit it. But in the years to come, the ability of non-professionals to do professional work will grow and grow. This is a double-edged sword, as greater participation enhances the possibilities for collective intelligence and collaboration, while also running the risk of suffering from a “Barnes and Noble Syndrome,” of an environment dominated by cream-puff analysis and a lack of vigorous interpretative context.

Academic historians need to get their hands dirty online. Read (and write) blogs, mine some data, listen to podcasts, enter a virtual world, upload media, explore databases, leave comments, and share your research. Take some chances and make mistakes. In short, geek out.

Towards a “History This” Command Line

Mozilla Labs recently released the 0.1 version of Ubiquity, a Firefox extension that allows the user to interact with and direct their browser through intuitive, written commands. Ubiquity has met with largely positive and excited reviews from the tech community, from folks at Lifehacker to Hackaday to Tools for Thought. The extension currently allows for a variety of commands. The common example that everyone likes to point to is the “map these” command, where you select text, hit the keystroke to bring up Ubiquity and type “map these,” which brings up something like the following:

From there, you can do a variety of things with the map itself, including navigating and moving, or inserting it into a separate page. And of course you can also highlight text, and in Ubiquity type “email this to _____,” which then searches through your Gmail contacts and sends the highlighted text to them. The most common example I’ve read is if you are looking for a restaurant at which to eat with a friend. You can highlight or type in the restaurant name, map it, look for reviews on Yelp, check your calendar for conflicts, and email an invitation to your friend with all of this information included.

Ubiquity interacts with a wide variety of sites through APIs, including YouTube, Weather, Yelp, Twitter, and Flickr. In addition, you can translate and define words, run calculations, export events to your calendar, count words in an article, or convert units. In many ways, it seems to blur the earlier function of Hyperwords (which I covered in a previous post) with the intuitive command line structure of Quicksilver (for Mac users) or Launchy (for Windows).

I immediately thought of interesting commands someone could write for engaging in historical research. Developing a Ubiquity command set for historians would go a long way towards encouraging traditionalists to finally break into digital history. Instead of reading scary words like Python or machine learning, a researcher with little technological background could hit a couple of keystrokes and be off and running with relatively in-depth analysis of digitized archival material. In many ways, Ubiquity could potentially act as a “gateway drug” for digital history. Of course, this all hinges on at least two things:

1) Quality, standardized digitization of source materials combined with quality, standardized open APIs. Dan Cohen has great arguments for the importance of a digitized collection like Google Books not only having an API, but having a good one.

2) Someone in the digital humanities would have to develop these tailored commands for different archives (Bill Turkel, you know you’re interested…) There’s already a Mozilla Labs wiki for creating new commands that looks relatively straightforward, but would probably be above most members of the history community. I’m intrigued by the idea, but unfortunately my own forays into digital history programming have presently taken a backseat to applying to grad schools. Please let me know if anyone in the digital humanities is interested in this…

I feel that Ubiquity takes a substantial next step in the evolution of online interactivity. It’s admittedly buggy (although given its 0.1 version status, this will certainly get better), but it embodies so much of what is positive in today’s digital environment: namely open-source collaboration. Mozilla Labs actively encourages anyone and everyone to develop their own commands and to share them with others. This openness combines with an intuitive simplicity that makes it truly remarkable. As of right now, Ubiquity is a fantastic timesaver and cool trick, but it lacks depth. Almost anything you do in Ubiquity could be done before – just slower and with much less efficiency or ease of use. I have absolutely no doubt that as the open-source developer community jumps on board, this will change.

But for right now what Ubiquity does best is to begin to break down the barriers between computer geeks and laypeople. Some people are writing about the irony of returning to the infant state of the computer interface: the command line. While interesting, these two instances are fundamentally different: not many people would know how to write even a simple program when faced with earlier command lines, but just about anyone I know can type “Map this” into Ubiquity and get far more complex results. Even as programmers find new ways to write more and more advanced commands, ordinary Firefox users will adopt the basics of Ubiquity in greater and greater numbers. What I foresee in Ubiquity is part of a broader movement that shifts common computing further down the Web 2.0-blazed path of heightened and evolving user participation, control, and access. Instead of having the website developer determine how and where you can go, suddenly you are at the controls of an increasingly powerful and easy-to-use command center for accessing and manipulating data. And I can only dream of the day a grad student will be able to highlight some archival text, type “history this” into their command line, and have a fully-compiled dissertation written before their eyes.

Review: Placing History (III)

(This is the third installment of my review of Placing History. See the first and the second parts.)

I’ve finally finished Placing History: How Maps, Spatial Data, and GIS are Changing Historical Scholarship. As my previous posts have made clear, I’m quite impressed with the breadth and depth of the compilation. As before, I’ll briefly recount the remaining chapters, and wrap up my thoughts at the end.

“Mapping Husbandry in Concord: GIS as a Tool for Environmental History,” by Brian Donahue. I liked this chapter for a multitude of reasons. On a personal note, his research is quite similar (though wider in scale) to the work I did in mapping property holdings and transactions of Venture Smith. So in a self-congratulatory mood, I found myself nodding with satisfied agreement at his various points about the benefits and drawbacks to mapping land deeds and parcels. On a less personal level, I liked the various angles he took in pursuing his study of Concord – especially examining seemingly disparate holdings of a variety of original families and noting patterns of land use.

“Combining Space and Time: New Potential for Temporal GIS,” by Michael Goodchild. For starters, the cover illustration for this chapter was a piece of Charles Minard’s famous “Carte Figurative,” which depicts a staggering array of geographic, temporal, and statistical information regarding Napoleon’s ill-fated Russian campaign:

Charles Minard's "Carte Figurative"

Information graphic guru Edward Tufte described it as “the best statistical graphic ever drawn,” which effectively canonized it for any map and information graphic nerd such as myself. This is a roundabout way of saying I was excited to start reading Goodchild’s chapter. Goodchild doesn’t disappoint, as he uses decades of geography experience to explore ways in which the field is gradually shifting to incorporate temporal data. Although it’s heavy on technical geography, it’s a rewarding chapter that covers one of the fundamental challenges of historical GIS: how do you visually display the relationship between space and time? Goodchild predicts that this challenge will rapidly diminish, as tools and systems to display things such as dynamic data, or even a history-specific model, will become more and more accessible and widespread.

“New Windows on the Peutinger Map of the Roman World,” by Richard J.A. Talbert and Tom Elliott. Talbert and Elliott present an analysis of the Peutinger Map, a nearly 7-meter-long Roman map depicting the Mediterranean world and beyond, constructed around 300 CE:

Detail of Peutinger's Map

I liked this chapter a lot, despite my complete unfamiliarity with the subject matter. The authors make compelling arguments backed by GIS analysis, such as: “the basis of the map’s design was not its network of land routes (as has always been assumed) but rather the shorelines and principal rivers and mountain ranges, together with the major settlements marked by pictorial symbols.” They present a quantitative analysis of routes, and utilize a histogram to further examine the segments and their distances.

“History and GIS: Implications for the Discipline,” by David J. Bodenhamer. This chapter, along with the first chapter and conclusion, gives the best “big-picture” perspective on historical GIS. Bodenhamer describes the field of history as a whole, in particular elements of it that relate to spatial analysis. He believes that in order for GIS to become a valuable historical tool, “it must do so within the norms embraced by historians…” GIS is well-situated to do so, because it uses a format of presenting information (the map) that historians are already familiar with, and its visualization and integration of information makes it easier to display the complexity of historical interpretation. He also discusses the challenges to historical GIS. One point I really liked was that technology as a whole, and GIS in particular, often requires a level of precision that historical documents cannot display within “a technology that requires polygons to be closed and points to be fixed by geographical coordinates.” Other challenges range from the theoretical (ex. temporal analysis) to the practical (ex. learning a completely new discipline). Finally, he succinctly sums up one of the greatest challenges: “GIS does not strike many historians as a useful technology because we are not asking questions that allow us to use it profitably.” I could not have said it better myself – until historians begin to ask the type of questions that can be addressed through spatial analysis, GIS will likely remain a technological oddity within the discipline.

“What Could Lee See At Gettysburg?” by Anne Kelly Knowles. This is probably one of the most accessible chapters in the book for a layperson. It combines an engaging narrative prose with rich, stylistic maps, and a “popular” subject matter (the Battle of Gettysburg). But more importantly, it clearly presents an answer to a historical question, while contextualizing the issue and presenting possible ideas for future studies. Viewshed (line-of-sight) analysis is of obvious and particular interest to military historians, but it has other implications as well. In particular, this chapter illustrates the phenomenal power of GIS to transport the reader to the past, and get a micro sense of “being” there.

Beyond thoroughly enjoying Placing History, I believe it’s an important contribution to the field of historical methodology in general, and (of course) historical GIS in particular. The compilation gives a wonderful balance while thoroughly exploring the topic: its current state and background, case studies ranging from micro to macro and “hard” to “soft”, discussions on theory and approach, and an outline for the future. I recommend the book to educators, historians, digital humanists, or anyone with even a passing interest in a growing and valuable area of scholarship.

Scattered Links – 7/20/2008

Bill Turkel wrote a thought-provoking post titled “Towards a Computational History.” I agree completely with his section on collective intelligence. A lot of digital history spells out the methodology of tools and technology, but the more theoretical shift in production and dissemination of information is of course equally important to the future of the field.

Eric Rauchway wrote a great article for The New Republic explaining why parallels between John McCain and Teddy Roosevelt fall flat.

One of the pleasant benefits of taking a break from school and having a 9-5 job (along with a peaceful 45 minute metro commute each way) is that I can read a ton of books that aren’t assigned to me by a course syllabus. A Pulitzer Prize, along with some interesting blog reviews, has placed Daniel Walker Howe’s What Hath God Wrought onto my short list. For similar reasons, Kate Summerscale’s The Suspicions of Mr. Whicher has been added as well.

Finally, Matthias Schulz of Spiegel Online has an interesting article on how the myth of Cyrus II as a pioneer of human rights developed. Schulz attacks this particularly insidious piece of propaganda, and isn’t afraid to take issue with heavyweights such as the United Nations and Nobel Peace Prize recipient Shirin Ebadi. The historian in me appreciates his revisionism, but would just like to see his sources.

Review: Placing History (II)

(This is the second installment of my review of Placing History. See the first and the third parts)

I’ve just finished reading about half of Placing History: How Maps, Spatial Data, and GIS are Changing Historical Scholarship, edited by Anne Kelly Knowles. I’ll briefly go through each chapter I’ve read so far, and focus on the ones that particularly interested me.

“Creating a GIS for the History of China,” by Peter K. Bol. Bol, chair of Harvard’s Department of East Asian Languages and Civilizations, discusses his China Historical GIS project. The project attempts to create a basic framework and data source (both spatial and temporal) for geospatial analysis of Chinese history. On a theoretical note, Bol argues that in the case of China, historical GIS should rely more on point data than on polygons for marking boundaries and territory, in order to better replicate the top-down administrative system of traditional Chinese cartography.

“Teaching With GIS,” by the late Robert Churchill and Amy Hillier. Churchill gives a good overview of the value of GIS in a liberal arts education. I liked his point that one of the benefits of using historical GIS is that any in-depth use of the technology requires an equally in-depth understanding of the problem you’re looking to address. Great point. Because so much of GIS is front-loaded, in that you spend a huge amount of effort in obtaining and managing the data, it requires you to really get your hands dirty in the sources themselves. Hillier gives a lot of great examples of students’ work using historical GIS, mostly Philadelphia-based data. Some of them also included a great 1896 map by W.E.B. Du Bois detailing social class in the city. She also gives some useful tips for educators who want to incorporate GIS.

“Scaling the Dust Bowl,” by Geoff Cunfer. I loved this chapter. Cunfer follows up his previous research in Knowles’s first book, Past Time, Past Place, with additional analysis of the Dust Bowl. In this chapter, he takes on the common perception of the Dust Bowl as championed by Donald Worster’s Dust Bowl: The Southern Plains in the 1930s. While some of Cunfer’s analysis supports Worster, he takes issue with Worster’s commonly-held assertion that the capitalistic over-development of lands for farming was the major factor in the fabled 1930’s dust storms. Cunfer first demonstrates through spatial analysis that, although plow-up during the 1920’s did contribute to the Dust Bowl, it was in fact instances of drought that had a much more direct correlation.

He goes on to further his critique of the notion that the Dust Bowl was an extraordinary phenomenon caused by human activity. By examining and mapping newspaper accounts of dust storms from the 19th century, along with storms after the 1930’s, he finds that “dust storms are a normal part of southern plains ecology, occurring whenever there are extended dry periods.” Although extensive plowing can enhance the problem, it was not “the sole and simple cause of the Dust Bowl.” Cunfer’s analysis succeeds on many different levels. First, I like the accessibility of it. There’s always a temptation to include too much in the final products, to show off the fruits of your hours and hours of labor. Instead, his maps are clear, uncluttered, and persuasive. Second, I like the way he blended traditionally quantitative analysis tools (GIS) with qualitative historical research (newspaper accounts). He does a good job of highlighting this tension, and aptly warns of its danger, while explaining simply how he accomplished it. Third, his work is a great example of the “right way” to use new technology to both challenge and supplement traditional historical arguments, and in doing so, present an original and different narrative.

“‘A Map is Just a Bad Graph’: Why Spatial Statistics are Important in Historical GIS,” by Ian Gregory. This chapter was much more technical, and included scary words like “regression coefficients” and “heteroscedasticity.” Although statistics in particular, and math in general, are low down on my list of skills, I got a fair amount out of the chapter. I liked his critique of the traditional thematic map, which usually displays one type of data with only one variable involved. Statistical analysis can go beyond simple thematic maps and really open up the powerful underbelly of GIS.

There are several more chapters that I am looking forward to reading and reviewing in a later post.