
AI-based Analysis of Historical Handwritten Text Transcription and Historical Image Analysis

Submission Number: 167
Submission ID: 3796
Submission UUID: f7842b35-5003-45ea-9f50-527cf4532103
Submission URI: /form/project

Created: Wed, 06/28/2023 - 10:07
Completed: Wed, 06/28/2023 - 10:07
Changed: Mon, 07/03/2023 - 16:35

Remote IP address: 128.6.36.20
Submitted by: Udi Zelzion
Language: English

Is draft: No
Webform: Project
CAREERS
{Empty}
ai (271), data-analysis (422), deep-learning (303), distributed-computing (92), python (69), tensorflow (51)
{Empty}

Project Leader

Sonia Yaco
{Empty}
{Empty}

Project Personnel

{Empty}
{Empty}
{Empty}

Project Information

Margaret Clark Griffis was one of the first women missionaries to Japan during its modernization (1870s) and is credited with helping modernize Japanese women’s education. From 1872 to 1874, she served as a teacher and assistant principal at the Jo-Gakko girls' school, the first Japanese government school for girls. Margaret kept very detailed diaries during her time in Japan. These texts are important because they can provide insights into Japan's history, culture, and social practices, particularly with respect to Christian missionary activities. The diaries, together with historical images, are part of the William Elliot Griffis Collection at Rutgers’ Special Collections and University Archives.
This project proposes an AI-based transcription system for historical texts written by Christian missionaries in Japan during the 19th century, along with image analysis of historic photos of Japan from the late 19th century. AI can expedite transcription and improve its accuracy, and the resulting digital archives make the texts more accessible and searchable for researchers and scholars, supporting a deeper understanding of Japan's past.

Project Information Subsection

{Empty}
{Empty}
We are looking for a graduate student to conduct research on applying AI methods to understand the libraries' text and image collections. The student will test and deploy machine learning and deep learning models to transcribe the handwritten text and identify objects in the historical photographs. The project requires data exploration, modeling, and analysis in Python, using packages such as Pandas, scikit-learn, and TensorFlow.
{Empty}
Practical applications
{Empty}
Rutgers University
CoRE Building, 96 Frelinghuysen Road
Piscataway, New Jersey 08854
CR-Rutgers
07/01/2023
Yes
Already behind
3
Start date is flexible
6
{Empty}
07/12/2023
{Empty}
01/10/2024
  • Milestone Title: Launch presentation
    Milestone Description: Present an overview of the project at the July monthly CAREERS meeting.
    Completion Date Goal: 2023-07-12
  • Milestone Title: Refine Model
    Milestone Description: Refining the "Griffis" handwriting transcription model.
    Completion Date Goal: 2023-08-16
  • Milestone Title: Run on larger dataset and data analysis
    Milestone Description: Applying handwriting transcription model to additional texts and analyzing the results.
    Completion Date Goal: 2023-09-13
  • Milestone Title: Selecting images for test set
    Milestone Description: Selecting the images for the test set for model training.
    Completion Date Goal: 2023-10-11
  • Milestone Title: Applying Model
    Milestone Description: Applying the model on a large collection of images and combining the data with the transcription data.
    Completion Date Goal: 2023-12-21
  • Milestone Title: Wrap presentation
    Milestone Description: Presenting the project outcomes at the January CAREERS monthly meeting.
    Completion Date Goal: 2024-01-10
{Empty}
{Empty}
The student will learn to test and deploy machine learning and deep learning models that transcribe handwritten text and identify objects in historical photographs, and to run those models on distributed computing resources.

{Empty}
{Empty}
Access to Rutgers' HPC cluster named Amarel
{Empty}

Final Report

{Empty}
{Empty}
{Empty}
{Empty}
{Empty}
{Empty}
{Empty}
{Empty}
{Empty}
{Empty}