
Tags: digital humanities

Digital Humanities Grant Opportunities for Art Historians

3D Photogrammetry For Cultural Heritage Workshop

CFP deadline December 1, 2017.

A one-week training workshop (March 25-31, 2018) at UCSC on photogrammetry for early-stage graduate students. Participants in this workshop will gain intensive hands-on experience in the techniques and processing workflow for photogrammetric recording for cultural heritage projects, presented within the context of a critical engagement in discussions of the politics of digital knowledge production. Click here for more information: ARC Photogrammetry Workshop Call UCSC.

 

Research Project: Ed Ruscha’s “Streets of Los Angeles”

CFP deadline January 19, 2018.

Scholars from a wide range of fields, including but not limited to digital humanities, cultural geography, architecture, art history, photography, and visual culture, are invited to submit proposals for research projects investigating Ed Ruscha’s “Streets of Los Angeles” archive. Interdisciplinary approaches and team-based projects are particularly encouraged. Selected researchers would collaborate with Getty Research Institute (GRI) staff as part of a larger research-technology project, which seeks to digitize a portion of the archive and make it publicly accessible in innovative ways. The goal is to publish resulting scholarship at the close of the project. For more details, click here.

 

Visualizing Venice Summer Institute: Advanced Topics in Digital Art History: 3D (Geo)Spatial Networks

CFP deadline January 5, 2018.

This Getty Foundation-supported workshop will support interdisciplinary teams focused on the hard questions of Digital Art History as a discipline, a set of methods, and a host of technical and institutional challenges and opportunities.

Participants will gather from June 4-16, 2018 in Venice, Italy at Venice International University, with follow-up activities taking place over the course of the 2018-19 academic year, and leading into a follow-on gathering in Summer of 2019 that will operate as a writing and digital publication workshop, building upon work done over the course of the year by the project teams and in collaboration with our wider network.

 

NEH Digital Humanities Advancement Grants

CFP deadline January 16, 2018.

Digital Humanities Advancement Grants (DHAG) support digital projects throughout their lifecycles, from early start-up phases through implementation and long-term sustainability. Experimentation, reuse, and extensibility are hallmarks of this grant category, leading to innovative work that can scale to enhance research, teaching, and public programming in the humanities.

This program is offered twice per year. Proposals are welcome for digital initiatives in any area of the humanities.

 

 

Follow us on social media
Twitter: @ah_library_ucb
Instagram: berkeley_art_history_library

HathiTrust Research Center (HTRC) UnCamp Fellowships

The UCB Libraries are delighted to offer a limited number of general fellowships for free admission to the 2018 HTRC UnCamp at UC Berkeley.
These fellowships are open to current UC Berkeley students and staff. All qualified applicants will be accepted in order of application while fellowships are available, though priority will be given to student applicants. Fellowship applications are due by Nov 13.
Apply here:
Note: Those who do not receive fellowship awards will be informed in time to register at the UnCamp Early Bird price.
About the HathiTrust Research Center (HTRC) 2018 UnCamp
Location: University of California Libraries, Berkeley, CA
Dates: January 25-26, 2018
HTRC UnCamp 2018 aims to facilitate the creation of a national community focused on improving research use of the HathiTrust corpus through computational analysis. The UnCamp will discuss topics relevant to understanding and utilizing the HathiTrust Digital Library corpus within the modern computational research ecosystem. This includes discussion of practices and experiences in mass-scale data mining, visualization, and analysis of the HT collection, with the goal of improving the quality of access and use of the collection by means of the HTRC Data Capsule and other affiliated research tools.
Stacy Reardon
Literatures and Digital Humanities Librarian
438 Doe Library | University of California, Berkeley | Berkeley, CA 94720
sreardon@berkeley.edu

Event: HathiTrust Research Center (HTRC) UnCamp

The UC Berkeley Libraries are excited to host the HathiTrust Research Center (HTRC) UnCamp, on January 25-26, 2018.
HTRC UnCamp 2018 aims to bring together researchers, developers, instructors, librarians, and other information professionals to showcase innovative research, participate in hands-on coding and demonstration sessions, and build community around themes of digital libraries, metadata, copyright, digital humanities, computational text analysis, and digital pedagogy. The UnCamp will discuss topics relevant to understanding and utilizing the HathiTrust Digital Library, including:
  • Demystifying HathiTrust metadata
  • Fair use, copyright, and non-consumptive research
  • HathiTrust development, news, and updates
  • Digital pedagogy and text analysis curricula
  • Scholarly tools and methods for text analysis
  • Corpus creation
Registration:
  • Early registration price of $100 through November 29, 2017.
  • Standard price of $150 begins on November 30, 2017.
More info is available from the Library news.

Workshops: Digital Publishing

Whether you are looking to create a companion website for your book or a full-scale digital project, this workshop series is designed to get you up and running with the user-friendly, open source web publishing platforms Scalar, WordPress, Omeka and Drupal.

  • All platforms are easily managed right through your web browser.
  • No programming or coding knowledge is required.
  • Options for hosting will be covered.
  • Technology workshops will be hands-on; bring a laptop if you can.

This series is designed for faculty, graduate students, and staff in the Humanities and Social Sciences and is open to any member of the UC Berkeley community. Register at bit.ly/dp-berk

WordPress for Easy and Attractive Websites
Thursday, April 20, 4-5pm
Academic Innovation Studio, Dwinelle Hall 117 (Level D)

In this hands-on workshop, we will learn the basics of creating a WordPress site, a web-based platform good for blogs, scholarly portfolios, and websites. By the end of the workshop, you will know how to post content, embed images and video, customize themes and appearance, and work with plugins.
Register

Omeka for Digital Collections and Exhibits
Wednesday, April 26, 1-2pm
D-Lab, 350 Barrows Hall

Omeka is ideal for creating and displaying an online collection or exhibit composed of many digital items. If you have a bunch of digital images, scans, and files around a certain theme or project, and you would like to organize, describe, and showcase these files, Omeka may be a good fit for you. In this hands-on workshop, we will learn how to add and describe items in Omeka, the basics of the Dublin Core metadata schema, and how to create webpages with the Simple Pages plugin.
Register

Copyright and Fair Use for Digital Projects
Thursday, April 27, 11am-12pm
D-Lab, 350 Barrows Hall

This training will help you navigate the copyright, fair use, and usage rights of including third-party content in your digital project. Whether you seek to embed video from other sources for analysis, post material you scanned from a visit to the archives, add images, upload documents, or more, understanding the basics of copyright and discovering a workflow for answering copyright-related digital scholarship questions will make you more confident in your publication. We will also provide an overview of your intellectual property rights as a creator and ways to license your own work.

Designing in Drupal
Friday, April 28, 11am-12pm
Academic Innovation Studio, Dwinelle Hall 117 (Level D)

Drupal is a powerful open source content management system that provides a flexible platform for developing web-based digital research projects. This workshop will cover the basics of how Drupal works, how you can create templates for storing your research materials, and how you can organize, display, and analyze those materials. Drupal is a good choice for many kinds of projects, including websites and projects underpinned by a database.

Scalar for Multimedia Digital Projects
Tuesday, May 2, 5-6pm
Berkeley Center for New Media Commons, 340 Moffitt

Developed by the Alliance for Networking Visual Culture, Scalar is a web platform designed especially for multimedia digital projects and for multimedia academic texts. Like WordPress, it is easy to create content, but it is distinguished by multiple ways of navigating through a project, annotation and metadata features, and image and video options. Choose it to develop born digital projects and books, or as a companion site for traditional scholarship. In this hands-on workshop, we’ll learn how to create a Scalar project, create pages and media, add metadata and annotations, and define paths.

Register at bit.ly/dp-berk

Go from Analog to Digital Texts with OCR


A collection of digitized texts marks the start of a research project —  or does it?

For many social sciences and humanities researchers, creating searchable, editable, and machine-readable digital texts out of heaps of paper in archival boxes or from books painstakingly sourced from overlooked corners of the library can be a tedious, time-consuming process.

Scholars using traditional methodologies may find it advantageous to have a digital copy of their source material, if only to be able to more easily search through it. For anyone who wants to use computational methods and tools, converting print sources to digital text is a prerequisite. The process of converting an image of scanned text to digital text involves Optical Character Recognition (OCR) software. New developments in campus services are providing additional options for researchers who wish to prepare their texts this way.

What resources does UC Berkeley offer to convert scans to digital text?

  • For basic needs, try the Library’s scanners.
  • For documents with complex layouts or for additional language support, ABBYY FineReader with Berkeley’s OCR virtual desktop is a solution.
  • Finally, Tesseract can handle large scale OCR projects.

Books and simple documents: library scanners with OCR software

All of the UC Berkeley libraries, including the Main (Gardner) Stacks, have at least one Scannx scanner station with built-in OCR software. This software automatically identifies and splits apart pages when you’re scanning a book, and it performs OCR on any text it can identify. You can save your results as a “Searchable PDF” (with embedded OCR output) or as a Microsoft Word document, or you can save page images as TIFF, JPEG, or PDF files (omitting digitized text). For book scanning or simple document scanning, the library scanners can take you from analog to digital in a single step.

Complex layouts or language support: ABBYY FineReader and Berkeley Research Computing’s OCR virtual desktop

If your source material has a complex layout (like irregular columns, embedded images, and/or tables that you want to continue to edit as tables) or uses a non-Latin alphabet, ABBYY FineReader OCR may get you better OCR results. FineReader supports Arabic, Chinese, Cyrillic, Greek, Hebrew, Japanese, and Thai, among other languages.

On campus, FineReader is available on computers in the D-Lab (350 Barrows). From off campus, the OCR virtual research desktop provided through Berkeley Research Computing’s AEoD service (Analytic Environments on Demand, pronounced “A-odd”) allows users to log into a virtual Windows environment from their own laptop or desktop computer anywhere there’s an internet connection. If you’re visiting an archive and aren’t sure that your image capture setup is getting good enough results to use as OCR input, you can log into the OCR virtual research desktop and try out a couple samples, then refine your process as needed. You can also work on your OCR project from home, or on nights and weekends when campus buildings are closed. To use the OCR virtual research desktop, sign up for access at http://research-it.berkeley.edu/ocr.

FineReader is not generally recommended for very large numbers of PDFs because each conversion must be started by hand. However, if you don’t need to differentiate the origin of your various source PDFs (e.g., if your text analysis will treat all text as part of a single corpus, and it doesn’t matter which of the million PDFs any particular bit of text originally came from), you might be able to use FineReader by creating one or more “mega-PDFs” that combine tens or hundreds of source PDFs and letting it run over a long period of time. At a certain point, however, Tesseract might be a better choice.
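The grouping step behind the “mega-PDF” approach can be sketched in a few lines of Python. This is only a minimal sketch, not a Research IT tool: the function name plan_mega_pdfs, the batch size of 100, and the file names are all assumptions made for illustration, and the merging itself would still happen in whatever PDF tool you prefer.

```python
from pathlib import Path
from typing import Iterator, List


def plan_mega_pdfs(pdf_paths: List[Path], batch_size: int = 100) -> Iterator[List[Path]]:
    """Yield consecutive groups of at most batch_size source PDFs."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    ordered = sorted(pdf_paths)  # fixed order so reruns produce the same batches
    for start in range(0, len(ordered), batch_size):
        yield ordered[start:start + batch_size]


# Hypothetical file names, used only to show the grouping.
sources = [Path(f"scan_{i:04d}.pdf") for i in range(250)]
batches = list(plan_mega_pdfs(sources, batch_size=100))
for number, batch in enumerate(batches, start=1):
    print(f"mega_{number:03d}.pdf <- {len(batch)} files "
          f"({batch[0].name} to {batch[-1].name})")
```

Keeping the batches in a fixed, sorted order means you can rerun the plan later and get the same groups, which helps if a long conversion is interrupted partway through.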

OCR at scale: Tesseract on the Savio high-performance compute cluster

If you have thousands, hundreds of thousands, or millions of PDFs to OCR, a high-powered, automated solution is usually best. One such option is the open source OCR engine Tesseract. Research IT has installed Tesseract in a container that you can use on the Savio high performance computing (HPC) cluster. For researchers who are less comfortable with the command line, there is also a Jupyter notebook available that provides the necessary commands and “human-readable” documentation, in a form that you can run on the cluster. Any tenure-track faculty member is eligible for a Faculty Computing Allowance for using Savio. For graduate students, talk to your advisor about signing up for an allowance and receiving access.
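To give a sense of what automating Tesseract looks like, here is a minimal Python sketch that loops over page images and calls the tesseract command-line program. It is not the Research IT container or notebook, and running it on Savio would involve steps not shown here. It assumes your pages are already image files (TIFF in this example), since Tesseract reads images rather than PDFs, and that tesseract is installed and on your PATH. The function names and the directory and file names are hypothetical.

```python
import shutil
import subprocess
from pathlib import Path
from typing import List


def build_tesseract_command(image: Path, out_dir: Path, lang: str = "eng") -> List[str]:
    """Return the command that writes out_dir/<image stem>.txt for one page image."""
    output_base = out_dir / image.stem  # Tesseract adds the .txt extension itself
    return ["tesseract", str(image), str(output_base), "-l", lang]


def ocr_directory(image_dir: Path, out_dir: Path, lang: str = "eng") -> int:
    """OCR every .tif file in image_dir and return the number of pages processed."""
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract was not found on PATH")
    out_dir.mkdir(parents=True, exist_ok=True)
    count = 0
    for image in sorted(image_dir.glob("*.tif")):
        subprocess.run(build_tesseract_command(image, out_dir, lang), check=True)
        count += 1
    return count


# Show the command for one hypothetical page without running anything.
command = build_tesseract_command(Path("pages/box12_p001.tif"), Path("ocr_out"))
print(" ".join(command))
```

Because each page gets its own output file named after the source image, this approach keeps track of where every bit of text came from, which the “mega-PDF” workaround does not.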

No matter how large or small your OCR project is, UC Berkeley has the perfect tool for you in scanning equipment, ABBYY FineReader, or Tesseract. Happy converting!

 

Related Event: From Sources to Data: Using OCR in the Classroom

March 16, 2017

10:30am to 12:00pm

Open to: All faculty, graduate students, and staff

 

Questions?

Quinn Dombrowski, Research IT  quinnd [at] berkeley.edu

Stacy Reardon, Library  sreardon [at] berkeley.edu
Thank you to Cody Hennesy for suggestions. Cross posted on the D-Lab blog and the Research IT blog.

 

Gallica Gives

"Ça, mon enfant, c'est du pain.. [That, my child, is bread...]" par Gottlob in L'Assiette au beurre (1901)

Online since 1997, Gallica remains one of the major digital libraries available for free on the Internet. With more than 12 million high-resolution digital objects from the collections of the Bibliothèque Nationale de France (BnF) as well as from hundreds of partner institutions, it includes books, journals, newspapers, manuscripts, maps, images, audio files, and more. The illustration above “Ça, mon enfant, c’est du pain.. [That, my child, is bread…]” by Fernand-Louis Gottlob was published in one of the first issues of the weekly satirical magazine L’Assiette au beurre (1901-1936) which is also held in print at UC Berkeley. Committed to the ever-evolving needs of its user community, Gallica’s social media outlets include Facebook, Twitter, Pinterest and even a BnF app.

Where to Find the Texts for Text Mining

Sketch for Monotype Digital Type Wall
frame1351170437122. Marcin Ignac, CC BY-NC-ND 2.0

Text mining, the process of computationally analyzing large swaths of natural language texts, can illuminate patterns and trends in literature, journalism, and other forms of textual culture that are sometimes discernible only at scale, and it’s an important digital humanities method. If text mining interests you, then finding the right tool — whether you turn to an entry-level system like Voyant or master a programming language like Python — is only a part of the solution. Your analyses are only as strong as the texts you’re working with, after all, and finding authoritative text corpora can sometimes be difficult due to paywalls and licensing restrictions. The good news is the UC Berkeley Libraries offer a range of text corpora for you to analyze, and we can help you get your hands on things we don’t already have access to.
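If you have never seen text mining in code, the simplest possible version is counting words across a set of texts. The Python sketch below is illustrative only: the two “documents” are made-up snippets rather than a real corpus, the function names are our own, and the tokenizer is a deliberately crude assumption that only recognizes unaccented Latin letters and apostrophes.

```python
import re
from collections import Counter
from typing import Dict, List


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into runs of letters and apostrophes."""
    return re.findall(r"[a-z']+", text.lower())


def term_frequencies(documents: Dict[str, str]) -> Counter:
    """Count how often each word appears across all documents."""
    counts: Counter = Counter()
    for text in documents.values():
        counts.update(tokenize(text))
    return counts


# Two made-up snippets standing in for a real corpus.
corpus = {
    "doc1": "The library holds the newspapers.",
    "doc2": "The newspapers describe the city.",
}
frequencies = term_frequencies(corpus)
print(frequencies.most_common(2))
```

Real projects usually go further, for example by removing very common words or comparing counts between time periods, but the pattern of reading texts, breaking them into tokens, and counting stays the same.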

The first step in your exploration should be the library’s Text Mining Guide, which lists text corpora that are either publicly accessible (e.g., the Library of Congress’s Chronicling America newspaper collection) or are available to UCB faculty, students, and staff (e.g., JSTOR Data for Research). The content of these sources is available in a variety of formats: you may be able to download the texts in bulk, use an API, or make use of a content provider’s in-platform tools. In other cases (e.g., ProQuest Historical Newspapers), the library may be able to arrange access upon request. While the scope of the corpora we have access to is wide, we are particularly strong in newspaper collections, pre-20th century English literature collections, and scholarly texts.

What happens if the library doesn’t have what you need? We regularly facilitate the acquisition of text corpora upon request, and you can always email your subject librarian with specific requests or questions. The library will deal with licensing questions so you don’t have to, and we’ll work with you to figure out the best way to make the texts available for your work, often with the help of our friends in the D-Lab or Research IT. We also offer the Data Acquisition and Access Program to provide special funding for one-time data set purchases, including text corpora. Your requests and suggestions help the library develop our collection, making text mining easier for the next researcher who comes along.

Important caveats:

  • Unless explicitly stated, our contracts for most Library databases and library resources (e.g., Scopus, Project MUSE) don’t allow for bulk download. Please avoid web scraping licensed library resources on your own: content providers realize what is happening pretty quickly, and they react by shutting down access for our entire campus. Ask your subject librarian for help instead.
  • Keep in mind that many of the vendors themselves are limited by their own contractual agreements in how they can provide access to a particular resource, and in how much access they can provide. It’s not uncommon for specific contemporary newspapers and journals to be unavailable for analysis at scale, even when library funding for access may be available.

Related resources:

 

Stacy Reardon and Cody Hennesy
Contact us at sreardon [at] berkeley.edu; chennesy [at] berkeley.edu

Digital Humanities for Tomorrow

Opening the Conversation About DH Project Preservation

By Rachael G. Samberg & Stacy Reardon


Digital Object Maker. Sayf, CC BY-NC-ND 2.0

After intensive research, hard work, and maybe even fundraising, you launch your digital humanities (DH) project into the world. Researchers anywhere have instant access to your web app, digital archive, data set, or project website. But what will happen to your scholarly output in five years? In twenty-five? What happens if you change institutions, or institutional priorities shift? Will your digital project be updated or forced to close up shop? Who should ensure that your project remains available to researchers? Which departments should guide long-term sustainability of your research?

Connect Your Scholarship: Open Access Week 2016

Open Access Week 2016

Open Access connects your scholarship to the world, and for the week of Oct. 24-28, the UC Berkeley Library is highlighting these connections with five exciting workshops and panels.

What’s Open Access?

Open Access (OA) is the free, immediate, online availability of scholarship. Often, OA scholarship is also free of accompanying copyright or licensing reuse restrictions, promoting further innovation. OA removes barriers between readers and scholarly publications—connecting readers to information, and scholars to emerging scholarship and other authors with whom they can collaborate, or whose work they can test, innovate with, and expand upon.

Open Access Week @ UC Berkeley

OA Week 2016 is a global effort to bring attention to the connections that OA makes possible. At UC Berkeley, the University Library—with participation from partners like the D-Lab, California Digital Library, DH@Berkeley, and more—has put together engaging programming demonstrating OA’s connections in action. We hope to see you there.

Schedule

To register for these events and find out more, please visit our OA Week 2016 guide.

  • Digital Humanities for Tomorrow
    2-4 pm, Monday October 24, Doe Library 303
  • Copyright and Your Dissertation
    4-5 pm, Monday October 24, Sproul Hall 309
  • Publishing Your Dissertation
    2-3 pm, Tuesday October 25, Sproul Hall 309
  • Increase and Track Your Scholarly Impact
    2-3 pm, Thursday October 27, Sproul Hall 309
  • Current Topics in Data Publishing
    2-3 pm, Friday October 28, Doe Library 190

You can also talk to a Library expert from 11 a.m. – 1 p.m. on Oct. 24-28 at:

  • North Gate Hall (Mon., Tue.)
  • Kroeber Hall (Wed.–Fri.)

Event attendance and table visits earn raffle tickets for a prize drawing on October 28!

Sponsored by the UC Berkeley Library, and organized by the Library’s Scholarly Communication Expertise Group. Contact Library Scholarly Communication Officer, Rachael Samberg (rsamberg@berkeley.edu), with questions.

Event: Editions Inside of Archives: Literary Editing and Preservation at the Mark Twain Project

I’m sharing this event announcement because it may be of interest to you.

The Literature and Digital Humanities Working Group, and the Americanist Colloquium, would like to invite you to join us at the following talk:

Editions Inside of Archives: Literary Editing and Preservation at the Mark Twain Project

Christopher Ohge

Thursday October 13th, 6.30pm

D-Lab Collaboratory, 350 Barrows Hall

The Mark Twain Papers & Project not only contains the largest collection of material by and about Mark Twain, it also employs several editors working toward a complete scholarly edition of Mark Twain’s writings and letters. The editors in the Project are sometimes involved in archival management, preservation, and “digital humanities” endeavors. Yet the goals of the archive both overlap with and diverge from those of a scholarly edition, especially in that editions produced by the Mark Twain Project use material from other archives, and considering the limit to which editorial work can be faithful to physical manuscripts. Archival projects are sometimes done at the expense of editorial projects, and vice versa; each enterprise has its gains and losses.

Digital scholarly editing can also depart from more traditional print editorial enterprises. When editorial policy modifications occur simultaneously with the evolution of digital interfaces, what is an editor to do? Put another way, when “digitizing” an old book with a different editorial policy, is one obliged to “re-edit” the text or compromise about how to present the product of a different set of expectations for editing and designing scholarly editions? How do notions of readability and reliability change with concurrent technological innovations? I shall examine instances where the physical archive, the digital archive, and editions at the Mark Twain Project have illuminated common as well as new ground on reading, editing, and cultural heritage.

 
