The Lens MetaRecord and LensID: An open identifier system for aggregated metadata and versioning of knowledge artefacts

26 November 2019 by Cambia Staff in News

Abstract

Ambiguity is inherent in the digital records of entities such as patents, scholarly works, human names, and institutions. While some progress has been made in preserving each entity’s one-to-one relationship using open persistent identifiers, in this contribution we show how The Lens has used the MetaRecord (MeR) concept, along with the open LensID identifier, to begin mapping the one-to-many relationships among these data elements, disambiguating name variants, and organizing contextual metadata.

Introduction

Our world is different now. It is increasingly connected through digital decision analyses and communications, yet anyone who tries to link data elements in an innovation ecosystem to help solve a problem faces an overwhelming, confusing mess. Most information is still stored in silos and, where accessible, is often not machine-readable; ambiguity in the names of people and institutions is high; and recorded knowledge varies confusingly from site to site. This structural complexity generates inefficiencies, errors, and uncertainty in the innovation sector.
Besides conferring an exclusive right on an invention or a solution to a problem, a patent is an accumulated body of scientific, technical, and industrial knowledge that evolves continuously, building on other related knowledge artefacts. To gain insight into an invention in an innovation pathway, it is critical to access, analyse, and monitor its varied contextual metadata: the patent file wrapper, its family members, the citations, the office actions, the legal challenges, and other relevant data elements. While patent offices provide information retrieval systems based on patent publication key numbers, capabilities for mapping the contextual complexity around an invention, and so improving one’s confidence level, are still lacking. Similarly, with the recent proliferation of scholarly preprints, postprints, annotations, commentary, and other related data, many public institutions struggle to implement and provide library users with their own comprehensive metadata on scholarly works.
To enable a clearer and deeper understanding of the diverse capabilities necessary for science and technology to contribute effectively and efficiently to solving problems of critical importance to society, The Lens has begun addressing some of the structural complexities encountered around innovation knowledge artefacts (KA), initially patents and scholarly works.
Currently, The Lens hosts more than 118 M patent and 208 M scholarly work records, representing at least 95 jurisdictions and 193 affiliation-based countries, respectively. Moreover, The Lens enriches its biological data with more than 313 M disclosed genetic sequences and their metadata. All Lens data are served as annotatable digital public goods resolvable by identifiers. In addition, The Lens implements a MetaRecord (MeR) concept to manage the complexities of record variability, sources, and the relevance of contextual metadata to the original record. Complemented by a unique open persistent identifier, the LensID, the Lens MetaRecord provides an open, granular, and dynamic mapping system with a logged history of knowledge around a knowledge artefact entity. Such an entity can be a patent, a scholarly work, a human name, or an institution. In this contribution, we describe the concept and its applications in patents and scholarly works, and discuss its interoperability with other identifiers and its potential use in disambiguating human and institution names.
This concept is similar to the concept of “work”, which has been thoroughly reviewed among library cataloguers and underlies successful, complex bibliographic retrieval tools (Smiraglia, 2001). Leading librarians such as Antonio Panizzi, Eva Verona, Seymour Lubetzky, and Kristin Antelman have shaped that concept as a “central object for retrieval” in the bibliographic universe to exploit recorded knowledge. Over time, a consensus seems to have emerged that, while relationships among works are complex, a taxonomy of relationships can still be structured so that they can be expressed explicitly for information retrieval (Antelman, 2004).
Antelman advanced the need for a “superwork record” identifier that focuses on the functional relationships among the various editions of a work, and on variation in its author names and bibliographic information, rather than on its descriptive details, and highlighted the need for such a system to preserve and enrich users’ access to, and use of, collective knowledge on a work. The Lens MetaRecord and its LensID aspire to provide such a solution.

Read The Full Paper
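
To make the one-to-many idea concrete, the sketch below models a MetaRecord as a container that groups several source records for the same artefact under one identifier and logs each change. It is an illustration only, not The Lens's actual schema or API: the class names, the field names (`source`, `external_id`, `title`), the `build_index` helper, and the identifier values are all assumptions made for this example.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class SourceRecord:
    """One version of an artefact as reported by one source (hypothetical shape)."""
    source: str        # e.g. a patent office or a scholarly index
    external_id: str   # the identifier that source uses
    title: str


@dataclass
class MetaRecord:
    """Groups the known variants of one artefact under a single identifier."""
    lens_id: str  # placeholder value here, not a real LensID
    variants: list[SourceRecord] = field(default_factory=list)
    history: list[tuple[datetime, str]] = field(default_factory=list)

    def add_variant(self, record: SourceRecord) -> bool:
        """Attach a variant and log it; return False if it is already present."""
        if record in self.variants:
            return False
        self.variants.append(record)
        stamp = datetime.now(timezone.utc)
        self.history.append((stamp, f"added {record.source}:{record.external_id}"))
        return True


def build_index(meta_records: list[MetaRecord]) -> dict[tuple[str, str], str]:
    """Map every (source, external_id) pair back to its grouping identifier."""
    index: dict[tuple[str, str], str] = {}
    for mer in meta_records:
        for rec in mer.variants:
            index[(rec.source, rec.external_id)] = mer.lens_id
    return index


# Two variants of the same scholarly work, grouped under one made-up identifier.
mer = MetaRecord(lens_id="EXAMPLE-0001")
mer.add_variant(SourceRecord("preprint-server", "pp-123", "A study of X"))
mer.add_variant(SourceRecord("journal", "doi-456", "A Study of X"))
index = build_index([mer])
```

In this sketch, looking up either the preprint or the journal identifier in `index` leads to the same grouping identifier, which is the one-to-many mapping in miniature. The `history` list stands in for the logged history of changes; a real system would need far richer provenance than a timestamp and a message.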