Technology Table Models
READ-COOP |
From census records to trade ledgers, a handwritten table in a historical document often holds a wealth of valuable information. Yet much of this data remains locked on paper, accessible only by manually deciphering each page. Imagine the potential if these tables could be transformed into digital, searchable spreadsheets, allowing you to analyse whole volumes at the click of a button.

With AI tools like Transkribus, digitising tabular data in this way is now a reality. Transkribus enables users to convert handwritten tables into structured, searchable data, unlocking the stories within and making research both more efficient and more effective. In this post, we will show you how to convert handwritten tables into digital spreadsheets, helping you bring historical records from paper to pixels.

Transkribus uses the power of AI to turn handwritten tables into digital data. © “Carnegie Corporation Register of Applications from Educational Institutions, 1911-1920” (Carnegie Corporation of New York Records)

Bring your handwritten tables from the archive to the computer

The tables in historical records provide valuable insights into aspects of life at particular times and places. But no matter if the table is handwritten, printed, or a mixture of both, its potential is often locked within physical documents. By digitising these tables and converting them into a format like Excel, researchers can search, analyse, and reorganise data in ways that simply aren’t possible on paper.

For example, you could transform lists of trades from centuries-old apprenticeship records into a dataset that reveals changing economic trends over time. Or you could convert registers of handwritten addresses into a searchable resource for genealogical research. Once digitised, these tables can also be uploaded to platforms like Transkribus Sites, where they can be shared and accessed worldwide.

What kind of records can be converted into spreadsheets?

Not all tables are equally easy to digitise. For best results, the tables should have a consistent layout across each page, with similar handwriting styles and clear separation between rows and columns. High-quality scans are essential, as blurry or poorly lit images can make it difficult for AI to accurately identify table boundaries or recognise text.

The columns and rows in a table should be separated, whether by a line or blank space. © “A catalogue of pictures” (Paul Mellon Centre for Studies in British Art)

Historical records that meet these criteria, such as census tables or registers with well-defined sections, are ideal candidates for this digitisation process.

How to convert handwritten tables into spreadsheets

Below is an outline of the workflow for turning handwritten tables into digital spreadsheets. More detailed information about each step of the workflow can be found in our Help Center.

1. Train a Table Model

The first step is to train an AI Table Model that can locate the columns and rows in your handwritten table. To do this, you first need to prepare training data. Upload scanned images of your document to Transkribus, and use the Editor to mark where all the rows and columns are. You will need to prepare between 20 and 50 documents as training data, depending on the complexity and homogeneity of your tables.

Save these images as "Ground Truth" and use them to train a Table Model. You may need to retrain the model a few times before it is sufficiently accurate.

2. Run the table model

Now that your table model is ready, you can apply it to the rest of your document collection. This process tells Transkribus to identify the columns and rows in each table in your documents, creating a structured map of the tables.

3. Perform the layout recognition

Before performing the text recognition, it is important to run a separate layout recognition first. This allows Transkribus to locate the text within each cell of the table. When running the layout recognition, you should also select the following advanced parameters: "Keep existing regions" and "Split lines on region border". These parameters make sure that the table format recognised in Step 2 is not changed through the layout recognition.

4. Perform the text recognition

Once Transkribus has located the cells and the text contained within them, it can then recognise the text and create a digital version of it. This is done in exactly the same way as a "normal" text recognition. A digital version of the table will be displayed on the right side of the Editor screen, and this can be corrected and edited as required.

5. Export as an Excel file

Finally, download the data to your computer as a spreadsheet file. This allows you to open it in programs such as Microsoft Excel or Google Sheets.

Can I export all the document pages as one spreadsheet?

Yes, Transkribus allows you to choose between exporting each page as a separate sheet within the spreadsheet or merging all pages into one comprehensive spreadsheet. This flexibility makes it easy to structure your output in a way that suits your research goals.

Can I export it in another digital format?

Your tabular data doesn't have to be exported as an Excel file. Files can be downloaded from Transkribus in various different formats, from PDF to TEI, meaning you can choose the format that allows you to process and analyse your data most effectively after processing it in Transkribus.

How can I improve access to my documents?

Once your tables are digitised, consider uploading them to Transkribus Sites. This platform allows you to share your digital records online, making them accessible to researchers, historians, and the general public worldwide. By publishing on Transkribus Sites, you enhance the visibility of your work and contribute valuable resources to the wider research community.

Can I use Transkribus to digitise other types of documents?

Certainly. Transkribus is designed to digitise a wide range of documents, from medieval manuscripts to handwritten notes. It can also be used as optical character recognition (OCR) software for printed books. Its versatility makes it a valuable tool for any archive, library, or research institution working with historical records.

Where can I get more information on this topic?

The Transkribus Help Center has information and tutorials for all the different features and workflows in Transkribus. In the Table Models section, you can find step-by-step instructions for training a Table Model and applying it to your documents. Alternatively, check out the recording of our Table Models webinar on our YouTube channel for a walkthrough of how to train a Table Model, and visit our Events page for information about upcoming webinars for both beginner and advanced users.
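
Once a table has been exported as a spreadsheet, it can be analysed with a few lines of code, for instance to look for the kind of trend in apprenticeship trades mentioned earlier. The following Python sketch is not part of Transkribus and is only an illustration: it assumes the exported table has been saved as a CSV file from a spreadsheet program, and the column names `year` and `trade`, the function name `trades_per_decade`, and the sample rows are all made up for this example.

```python
import csv
import io
from collections import Counter

# Made-up sample rows standing in for an exported table saved as CSV.
# The column names "year" and "trade" are hypothetical.
SAMPLE_CSV = """year,trade
1712,weaver
1715,cooper
1718,weaver
1723,smith
1727,weaver
"""


def trades_per_decade(csv_file):
    """Count how often each trade appears in each decade."""
    counts = Counter()
    for row in csv.DictReader(csv_file):
        year = row["year"].strip()
        trade = row["trade"].strip().lower()
        if not year.isdigit() or not trade:
            continue  # skip rows with an empty or unreadable cell
        decade = int(year) // 10 * 10
        counts[(decade, trade)] += 1
    return counts


counts = trades_per_decade(io.StringIO(SAMPLE_CSV))
for (decade, trade), n in sorted(counts.items()):
    print(f"{decade}s  {trade}: {n}")
```

To run the same count on a real export, the `io.StringIO(SAMPLE_CSV)` argument could be replaced with a file opened via `open("export.csv", newline="", encoding="utf-8")`, where `export.csv` is a placeholder file name and the column names are adjusted to match your table.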