I'm Gregory Wiedeman,

an archivist who makes information more accessible. Sometimes I write, give presentations, or code.

About

This is a place to put some of the things I'm working on. I am currently the University Archivist at the University at Albany, SUNY.

I focus on collecting documentation of the university and making it available for research use. I get to solve problems with paper records and legacy digital records, work with metadata at scale, and create ways to make a lot of content available on the web.

I do a ton of work with XML, Python, obsolete digital formats, and web technologies.

I have a research background in 19th-century American history and did some work on how people understood faraway places, particularly the Eastern Mediterranean.

This is not really a blog, but I have some plans to write stuff here that's not really suitable for publication.

Writing

I write mostly on archives, particularly automating collection management and born-digital records.

Review of Metadata by Marcia Lei Zeng and Jian Qin. Discusses different views of metadata among libraries and archives. Published in Archival Issues.

Article that shows archivists how Python can be used to automate workflows between different systems and reduce manual tasks. Published in Practical Technology for Archives Issue no.7.

Working with a multitude of digital tools is now a core part of an archivist’s skillset. We work with collection management systems, digital asset management systems, public access systems, ticketing or request systems, local databases, general web applications, and systems built on smaller systems linked through application programming interfaces (APIs). Over the past years, more and more of these applications have evolved to meet a variety of archival processes. We no longer expect a single tool to solve all our needs and have embraced the "separation of concerns" design principle that smaller, problem-specific and modular systems are more effective than large monolithic tools that try to do everything. All of this has made the lives of archivists easier and empowered us to make our collections more accessible to our users.

Guest post for the Archive-It Blog: In recent years many archives have expanded to preserve the web among other formats in their traditional collecting areas. Yet, unlike traditional formats, the best way to make web archives available to researchers is not with boxes and call numbers, but in their original environment — an Internet-connected web browser. This post discusses how at the University at Albany, SUNY, we are providing minimal access to this new type of archival material together with other formats that originated with the same creators.

Instructional module for the CPR Electronic Records Committee that describes how to do a web crawl using Archive-It

Code4Lib article on the development of ANTS, the digital records transfer system I developed to package records with checksums and filesystem metadata and make network transfers.

Archivists have developed a consensus that forensic disk imaging is the easiest and most effective way to preserve the authenticity and integrity of born-digital materials. Yet, disk imaging also has the potential to conflict with the needs of institutional archives – particularly those governed by public records laws. An alternative possibility is to systematically employ digital forensics tools during accession to acquire a limited amount of contextual metadata from filesystems. This paper will discuss the development of a desktop application that enables records creators to transfer digital records while employing basic digital forensics tools in the records’ native computing environment to gather record-events from NTFS filesystems.
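
As a rough illustration of packaging records with checksums and filesystem metadata, here is a minimal Python sketch. It is not the ANTS code: the function names `file_checksum` and `build_manifest` are hypothetical, and it assumes that a SHA-256 digest plus the size and modified time from `os.stat` are enough for the example. It does not read NTFS record-events, which need the forensics tools described above.

```python
import hashlib
import os
from datetime import datetime, timezone


def file_checksum(path, algorithm="sha256", chunk_size=65536):
    """Return the hex digest of a file, reading in chunks to limit memory use."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Walk `root` and record a checksum and basic filesystem metadata per file.

    Hypothetical helper for illustration; the dictionary keys are assumptions,
    not a format used by any particular transfer tool.
    """
    manifest = []
    for folder, _subfolders, filenames in os.walk(root):
        for name in sorted(filenames):
            path = os.path.join(folder, name)
            info = os.stat(path)
            modified = datetime.fromtimestamp(info.st_mtime, tz=timezone.utc)
            manifest.append({
                "path": os.path.relpath(path, root),
                "sha256": file_checksum(path),
                "size_bytes": info.st_size,
                "modified": modified.isoformat(),
            })
    return manifest
```

Calling `build_manifest` on a folder before transfer and again after receipt, then comparing the `sha256` values, is one simple way to confirm that nothing changed in transit.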

Article that introduces XQuery to non-programmer archivists and encourages them to see archival description as data. Published in Practical Technology for Archives Issue no.3.

XML has long been an important tool for archivists. The addition of XQuery provides a simple and easy-to-learn tool to extract, transform, and manipulate the large amounts of XML data that archival repositories have committed resources to develop and maintain – particularly EAD finding aids. XQuery allows archivists to make use of that data. Furthermore, using XQuery to query EAD finding aids, rather than merely reformat them with XSLT, forces archivists to look at finding aids as data. This will provide better knowledge of how EAD may be used and further understanding of how finding aids may be better encoded. This article provides a simple how-to guide to get archivists to start experimenting with XQuery.
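
To give a sense of what treating a finding aid as data can look like, here is a small sketch. It uses Python's standard `xml.etree.ElementTree` rather than XQuery, so it shows the idea and not the article's method. The sample record is invented for the example, omits the namespace that schema-based EAD files carry, and the function name `list_series` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Invented, heavily simplified EAD-style record (no namespace, assumption for brevity).
SAMPLE_EAD = """
<ead>
  <archdesc level="collection">
    <did>
      <unittitle>Example Family Papers</unittitle>
      <unitdate>1850-1900</unitdate>
    </did>
    <dsc>
      <c01 level="series">
        <did><unittitle>Correspondence</unittitle><unitdate>1850-1875</unitdate></did>
      </c01>
      <c01 level="series">
        <did><unittitle>Photographs</unittitle></did>
      </c01>
    </dsc>
  </archdesc>
</ead>
"""


def list_series(ead_xml):
    """Return one dict per top-level component with its level, title, and date."""
    root = ET.fromstring(ead_xml)
    rows = []
    for component in root.iterfind("./archdesc/dsc/c01"):
        did = component.find("did")
        rows.append({
            "level": component.get("level"),
            "title": did.findtext("unittitle"),  # None when the element is absent
            "date": did.findtext("unitdate"),
        })
    return rows


series = list_series(SAMPLE_EAD)
# Query the description as data: which series were encoded without a date?
undated = [row["title"] for row in series if row["date"] is None]
```

Here `undated` contains only "Photographs", the kind of encoding gap that is easy to miss when a finding aid is only ever reformatted for display.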

Unpublished paper I wrote in graduate school for a preservation class. It argues that Nicholson Baker's criticism of microfilming newspapers is muddled in misunderstanding and that the more effective argument against this practice is that preservation microfilming often ignores access concerns. The argument I like best in this piece is that use, not just time, is what damages acidic newsprint.

Unpublished paper I wrote in graduate school on how to make finding aids more accessible on the web.

Unpublished master's thesis.

This study addresses how nineteenth-century Americans perceived the lands of the Eastern Mediterranean. The project rests upon a detailed examination of American primary school geography textbooks that enjoyed widespread circulation during the century. The lack of an effective education apparatus in the period rendered American students incredibly reliant on their textbooks. These texts reflect the general common knowledge of the region shared by most educated Americans. Additionally, this study draws support from a thorough analysis of travel accounts that were extraordinarily popular during the period. These works offered Americans a chance to explore vicariously the most interesting lands of the Levant.

Nineteenth-century Americans sought to locate their essential place, meaning and mission within a universal system of world processes. Geography authors fulfilled this social need by providing students with a systemized structure of knowledge about the Eastern Mediterranean. This framework enabled students to address the complex realities of the region in a simplified and palatable manner – a process that was also used to satisfy various social pressures. This episteme of the Eastern Mediterranean provided the context for Americans to regulate their self-meanings and cultural missions in the nineteenth century. Often, the concepts of this knowledge structure took the form of dichotomies which acted as defining antitheses. Students located themselves within these oppositions which became constructs of Sameness and Otherness. The structured framework of knowledge about the Levant provided the setting in which these processes played out. Thus, the people, places, and practices of the region were marked as aspects of “us” and “them” – of heritage and Otherness.

Presentations

Beyond Finding Aids: New Approaches to Archival Description

2017 July

Our SAA 2017 panel on rethinking finding aids for presenting archival description on the web

Describing Web Archives: Automating Access with APIs and ArchivesSpace

2017 June

Talk about using DACS to describe web archives, and automating the process with APIs

Link Ranking in Web Archives

2017 June

Winning project for Archives Unleashed 4.0 at the British Library

Providing Basic Access to Web Archives Provenance Data

2017 June

Lightning Talk at Archives Unleashed 4.0 at the British Library

Pragmatic Processing: Diverse Approaches to Different Collections

2017 June

Intro to a panel I chaired on pragmatic/extensible processing for the 2017 New York Archives Conference in Utica

Born-Digital Records in Practice at UAlbany

2017 April

Guest class for Electronic Records Management graduate course.

What's a Digital Archivist?

2017 April

Talk at MARAC Newark on the differences between "Digital Archivists" and IT Professionals

Automating Web Archives Records in ASpace

2017 February

Short talk at WASAPI Symposium at the Internet Archive

No More Finding Aids: A New Frontend for Special Collections & Archives at UAlbany

2016 December

Presentation at New England Code4Lib in Amherst, MA that argues the finding aid as a concept is no longer useful.

APIs, WARCS, and NLP: How Technology has Created New Opportunities for Historical Research

2016 November

Presentation at Researching New York/Conference on New York State History in Albany, NY

Updating web archives extents/dates using the ASpace API and Archive-it CDX

2016 October

Talk and hands-on workshop at Beyond the Basics ArchivesSpace Skill Share, Philadelphia, PA

SIP Your Pics: an almost-OAIS workflow for 180,000 images

2016 August

Part of a panel on processing born-digital photographs for the SAA Annual Meeting 2016 in Atlanta.

A Sustainable, Large-Scale, Minimal Approach to Accessing Web Archives

2016 August

Talk for the Archive-It Partner's meeting on automating access to Web Archives in finding aids using Archive-It's CDX API

Marcia Brown at the State College for Teachers (1936-1940)

2016 April

Renowned author and illustrator's college experience

How to Metadata

2016 March

Guest lecture for graduate Digital Libraries class

Transferring Records to the Archives with ANTS

2016 March

Code4Lib 2016 Lightning Talk

Auto Upload and On-Demand Digitization

2016 January

Talk at METRO conference

Managing Descriptive Metadata with Open XML... For Now

2015 August

Lightning talk I gave during the Collection Management Tools Roundtable at SAA in Cleveland

Effective Metadata Systems for Archives: EAD and Unified Collection Management Systems

2015 June

Talk I gave at the 2015 New York Archives Conference in Fredonia, NY

Using XML data with XQuery

2014 November

Guest lesson for graduate XML class

Lowell Thomas and American Cultural Knowledge of Palestine and Arabia

2013 March

Talk at Northeastern University Graduate History Conference

Code

Example Scripts for Learning Python

Python

Basic sample scripts to get some results with Python. For use with "Python for Archivists".

Researching NY 2016 Scripts

Python, D3Plus

Example scripts for using computational tools like NLP for historical research. For November 2016 presentation

ArchivesSpace/Archive-It Example Workshop

Python

This is a working example of how to automate description of Web Archives in ArchivesSpace using the API and the Archive-It and Internet Archive CDX Servers. For ASpace Skill Share

Basic access to web archives scripts

Python

Scripts for adding web archives to EAD files from Archive-It's CDX API. For Archive-It Blog post

createSIP.py

Python

Command line tool for automating SIPs as bags with an additional metadata file extracted from filesystem data

Processing Scripts for Born-Digital Photos

Python

Scripts used to automate extraction of 180,000 photos from disk images and make static pages for display

Random Born-Digital Scripts

Mostly Python

Includes script to automate disk imaging from external drives

Records Transfer Scripts

Python

Script to crawl shared drive for new folders and package into SIP based on hash index

Collections Access System

XSLT, Bootstrap, CSS, JQuery

Custom XTF instance with Bootstrap 3

ANTS: Archives Network Transfer System

Python, wx, Pyinstaller

ANTS is designed for transferring digital records to an institutional archives using digital forensics tools.

EADMachine 2.0

Python, wx, Pyinstaller

Simple GUI to quickly create basic collection-level or series-level EAD finding aids

Auto Upload

Python

Enables easy on-demand digitization and display. Detects files in a directory, creates access and preservation copies, edits EAD, and transforms to HTML

EAD tools

Python

Python scripts for cleaning up and standardizing EAD, and inserting semantic identifiers

EADValidator

Python, Bootstrap

Script for strict rule-based validation for EAD files

EADMachine

Python, wx, Pyinstaller

EADMachine is an easy EAD creation and editing tool. It is a Python .exe that reads and writes complete EAD files from spreadsheets.

CV

Gregory Wiedeman

Science Library 356
1400 Washington Ave
Albany, NY 12222
gwiedeman[at]albany[dot]edu

Education

2013 M.S. Information Science
University at Albany, SUNY

  • ALA accredited program with concentration in Archives & Records Management

2013 M.A. History
University at Albany, SUNY

2010 B.A. History
Marist College, Poughkeepsie, NY

Employment

2015-Present University Archivist

M.E. Grenander Department of Special Collections and Archives
University Libraries
University at Albany, SUNY

  • Manages acquisitions, accessioning, collection development, archival processing for the University Archives

  • Developing born-digital records collections program for University Archives that meets public records laws and professional standards

    • Also leads born-digital collecting for manuscript collections
  • Implemented minimal processing techniques and use-based processing to reduce the number of inaccessible collections

  • Performs primary reference services for University Archives, working with students, faculty, staff, alumni, and the general public

  • Oversees web archiving program for Albany.edu domain and outside collecting areas

  • Leading role in implementing new technologies in collection management and public access

    • Designed and developed new web access system for archival collections

    • Mix of Drupal, XTF, and Static page generation, with Bootstrap 3 and back-end Python scripting

    • Leading ArchivesSpace implementation and data migration

    • Implemented guerrilla user testing program

    • Led year-long legacy metadata cleanup process.

    • Implemented reference ticket system and provided vision and direction for on-demand digitization

  • Manages undergraduate and graduate student assistants performing archival processing and digitization

2014-2015 Project Archivist

M.E. Grenander Department of Special Collections and Archives
University Libraries
University at Albany, SUNY

  • Implemented Council On Library and Information Resources (CLIR) Hidden Collections grant to arrange and describe 710 cubic feet of the National Death Penalty Archives (July 2014-March 2015)

    • Processed records of empirical research on Capital Punishment and the effect of race in charging, sentencing, and jury decision-making
  • Supervised three Graduate Student Assistants and three Undergraduate Student Assistants processing a variety of archival material in accordance with Describing Archives: A Content Standard (DACS)

  • Undertook New York State Documentary Heritage Program Grant (April 2014-June 2014) to arrange and describe four collections containing 82 cubic feet of material concerning LGBT civil rights activism

  • Developed an automated Encoded Archival Description (EAD) authoring workflow that maintains consistent local practices and is comfortable for archivists without technical expertise

    • Uses Microsoft Excel XML mapping and basic Python desktop GUI

2011-2014 Project Archivist (2013-2014), Archives Assistant (2011-2013)

Archives and Special Collections
James A. Cannavino Library
Marist College, Poughkeepsie, N.Y.

  • Responsible for the arrangement and description of all major incoming collections from July 2011-April 2014, using professional processing standards, including DACS, MARC, LCNAF, AAT

  • Performed regular in-person and remote reference services, working regularly with undergraduate students and scholarly researchers

  • Regularly managed 5-12 undergraduate student assistants and occasional MLS interns in archival processing, digitization, and creating online exhibits

  • Developed web exhibits, online finding aids, and access tools using XML, EAD, XSLT, XQuery, HTML, and CSS

  • Regularly managed large-scale digitization projects

    • Helped manage Lowell Thomas Digitization Project ($103,979 NHPRC Grant)

    • Over 39,000 images in more than 8 photographic formats

    • Coordinated metadata creation and quality control workflow

    • Developed imaging standards in collaboration

  • Performed regular and on-demand preservation activities

  • Directed the movement and storage of the department’s entire holdings for renovation in summer of 2012

  • Developed copyright and use statements for online collections

  • Performed off-site appraisal of Arthur Glowka Papers

2012-2014 Collections Project Manager

Dutchess County Historical Society, Poughkeepsie, NY

  • Arranged and described 19th century collections and made them available for use

  • Collections included comprehensive administrative records from local units in both the Civil War and the American Revolution

  • Implemented DACS, minimal processing techniques, and other professional standards

  • Wrote and obtained $8,297 New York State Documentary Heritage Grant for the Hubbard Family Papers

  • Developed and implemented low-cost preservation procedures

  • Designed new, easy-to-use collection management and access workflows that are not software-dependent

2009-2010 Undergraduate Student Assistant

Archives and Special Collections
James A. Cannavino Library
Marist College, Poughkeepsie, N.Y.

  • Participated in the two-year project to arrange and describe the 1,200-linear-foot Lowell Thomas Papers, supported by a $140,000 NHPRC Grant

Skills

Applications – git, Microsoft Office, Adobe Creative Suite, eXist XML Database, BaseX
Archives Applications – ArchivesSpace, Archivists’ Toolkit, BitCurator Environment, PastPerfect
CMS/Frameworks – Drupal, WordPress, Bootstrap
Disk Imaging – Guymager, dd, FTK Imager
Digital Forensic Tools – The Sleuth Kit, fiwalk, Plaso
Digital Repositories – Luna Insight
Digital Scholarship Tools – NLTK, twarc, D3Plus
Markup Languages – HTML 5, JSON, XML, XSLT, Markdown
Metadata Standards – DACS, EAD (EAD 2002 and EAD3), MODS, METS, PREMIS, Dublin Core, Schema.org
Operating Systems – Windows, Ubuntu Linux
Programming Languages – Python, PHP, XQuery, CSS 3, JavaScript
Shell Scripting – Bash, Windows PowerShell
System Administration – Apache, Tomcat, MySQL, ArchivesSpace