NER in Archival Finding Aids: Extended
Abstract
The amount of information preserved in Portuguese archives has increased over the years. These documents represent a national heritage of high importance, as they portray the country's history. Currently, most Portuguese archives have made their finding aids available to the public in digital format; however, these data do not have any annotation, so it is not always easy to analyze their content. In this work, Named Entity Recognition solutions were created that allow the identification and classification of several named entities from the archival finding aids. These named entities translate into crucial information about their context and, with high-confidence results, they can be used for several purposes, for example, the creation of smart browsing tools by using entity linking and record linking techniques. In order to achieve high result scores, we annotated several corpora to train our own Machine Learning algorithms in this context domain. We also used different architectures, such as CNNs, LSTMs, and Maximum Entropy models. Finally, all the created datasets and ML models were made available to the public with a developed web platform, NER@DI.
Similar resources
Access to Archival Finding Aids: Context Matters
We detail the design of a search engine for archival finding aids based on an XML database system. The resulting system shows results—which can vary in granularity from individual archival items to the whole fonds—within the context of the archive. The presentation preserves the archival structure by providing important contextual information, and all individual results can be “clicked”, warpin...
Searching Archival Finding Aids: Retrieval in Original Order?
Archival principles such as Provenance (keeping material from the same creator together) and its corollary Original Order (keeping the order of creation intact) could help improve access to archival materials. We investigate the importance of relevance ranking and ‘Original Order’ when searching finding aids in EAD using XML Retrieval. Our experiment shows that relevance ranking is of paramount ...
THE IMPACT OF COMPUTERIZATION ON ARCHIVAL FINDING AIDS: A RAMP STUDY prepared by
The impact of computerization on archival finding aids: a RAMP study / prepared by Christopher Kitching [for the] General Information Programme and UNISIST. PREFACE In order to aid Member States, particularly developing countries, to meet their needs in the specialized areas of Archives Administration and Records Management, the Division of the General Information Programme has developed a long...
Modeling Archival Repositories for Digital Libraries Extended
This paper studies the archival problem: how a digital library can preserve electronic documents over long periods of time. We analyze how an archival repository can fail and we present different strategies that help solve the problem. We introduce ArchSim, a simulation tool for evaluating an implementation of an archival repository system and compare options such as different disk reliabili...
Journal
Journal title: Machine Learning and Knowledge Extraction
Year: 2022
ISSN: 2504-4990
DOI: https://doi.org/10.3390/make4010003