What is the Lens?
The Lens serves linked open knowledge artefacts and metadata with tools to inform effective, efficient and equitable problem solving.
The problems facing our society and the planet require solutions that will only arise through coordinating the diverse capabilities of people and institutions. Achieving this requires making comprehensive, relevant knowledge available to more and different problem solvers, so that they can create products and practices that change our paths and options. These problem solvers are people, and they are institutions. Each has different needs.
The required knowledge falls into many silos of specialization, ranging from scholarly research and patent knowledge to policy, laws, regulations, investment, social norms and business data. These silos create barriers to visualizing effective partnerships, opportunities, risks and trajectories (PORTs), making inclusive problem-solving slow, difficult and expensive.
The Lens, the flagship project of the social enterprise Cambia, seeks to source, merge and link diverse open knowledge sets, including scholarly works and patents, to inform discovery, analysis, decision making and partnering on a human-centered user experience built on an open web platform, Lens.org, with toolkits designed to optimize institutional effectiveness in problem solving.
With over 20 years of development, supported by prominent philanthropic organizations, The Lens ingests, cleans, aggregates, normalizes and serves more than 225 million scholarly works, more than 127 million global patent records, and more than 370 million patent sequences, with rich metadata including the people and institutions that generate this knowledge and the linkages between them, drawn from diverse data sources.
The Lens architecture is built around the Lens MetaRecord. 'Knowledge artefacts', including scholarly works and patents, exist in a constellation of forms, timelines, degrees of access and quality. By integrating multiple identifiers and sources to provide an open MetaRecord, the best metadata can be assembled, normalized and exposed, while maintaining provenance and linkages.
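As a rough illustration of the MetaRecord idea, the sketch below folds several source records for one artefact into a single record, keeping the first available value for each field and noting which source supplied it. It is not the Lens implementation: the names `SourceRecord`, `MetaRecord`, `merge` and the source-priority list are hypothetical, the example identifiers and titles are made up, and the "best metadata" rule is assumed here to be a simple source ranking.

```python
from __future__ import annotations
from dataclasses import dataclass, field


@dataclass
class SourceRecord:
    """One source's view of an artefact (hypothetical structure)."""
    source: str                     # e.g. "crossref" or "pubmed"
    identifiers: dict[str, str]     # e.g. {"doi": "..."}
    fields: dict[str, str]          # e.g. {"title": "..."}


@dataclass
class MetaRecord:
    """Merged view: all identifiers, chosen fields, and where each came from."""
    identifiers: dict[str, str] = field(default_factory=dict)
    fields: dict[str, str] = field(default_factory=dict)
    provenance: dict[str, str] = field(default_factory=dict)  # field -> source


def merge(records: list[SourceRecord], priority: list[str]) -> MetaRecord:
    """Merge records for one artefact, preferring sources earlier in `priority`."""
    rank = {name: i for i, name in enumerate(priority)}
    meta = MetaRecord()
    # Visit sources from most to least preferred; unknown sources go last.
    for rec in sorted(records, key=lambda r: rank.get(r.source, len(rank))):
        for scheme, value in rec.identifiers.items():
            # Normalize identifiers so the same ID from two sources matches.
            meta.identifiers.setdefault(scheme, value.strip().lower())
        for name, value in rec.fields.items():
            if value and name not in meta.fields:
                meta.fields[name] = value
                meta.provenance[name] = rec.source  # keep provenance
    return meta


# Made-up example data for illustration only.
example = merge(
    [
        SourceRecord("pubmed", {"pmid": "123", "doi": "10.1000/ABC"},
                     {"title": "A Title", "abstract": "Some text"}),
        SourceRecord("crossref", {"doi": "10.1000/abc"}, {"title": "A title"}),
    ],
    priority=["crossref", "pubmed"],
)
```

Here the title would come from the preferred source while the abstract, missing there, is filled from the other, and `example.provenance` records which source supplied each field.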
How are the Lens capabilities organized?
The Lens at its core is an aggregator of metadata, combining three unique content sets and one management tool as a base offering. This base supports the four primary functions of the Lens, which are to discover, analyse, manage and share knowledge.
- Scholarly Works: Discovery and analytics tools providing access to a global corpus of scholarly literature metadata with citation indexing.
- Patents: Discovery and analytics tools on a comprehensive collection of patent literature with citation indexing.
- PatSeq: A facility to search and analyze biological sequences disclosed in patent literature.
- Collections: A management tool to track, monitor, and analyze a collection of works or a collection of patents dynamically or statically.
These foundational tools have equitable access, i.e. they are available to any user who needs them. They are open, free, and allow for private and secure access to different sources of information.
The Lens has also developed Application Programming Interfaces (APIs) for both scholarly works and patents, customized datasets, bulk downloads of patent sequences, and other specialized institutional tools and metrics. These are licensed to help defray the costs of keeping the Lens a trans-disciplinary, trans-domain open platform.
Open for Outcomes
To us ‘Open’ is defined more by its impact than by its process. It's about who is included. We need an innovation system that is open to all - priorities and people.
This requires transparency, effectiveness, equity, interoperability and efficiency. We need an active effort to enable inclusive problem solving for neglected people and priorities. Not all of the institutions or people involved buy fully into a ‘sharing’ model. Most businesses are in fact competitive, and confidentiality is in their DNA. They are still critically important parts of problem solving and must be included in our definition of ‘open’. So Open for Business means we include them and try to create the tools they need using fully FAIR (shareable) data, which they can keep as confidential as they wish. Sharing is possible and encouraged, but not required. We recognize - especially in the last two decades - that surveillance is anathema to normal business practice, and we respect that. No ads, no surveillance. But costs matter hugely. So our goal as a global non-profit organisation is to charge as little as we can, commensurate only with keeping our facility growing, improving and serving all sectors of society.
Planetary ecosystem sustainability, economic and social equity, food security and global health are not just ‘priorities’; they are essential for our species to persist. They are shared existential problems. And creating solutions requires Collective Action. But this is neither Kumbaya nor Commie rhetoric. It is a simple statement that choices made by people and institutions can - in the aggregate - drive outcomes that benefit us all. But only if these choices are clear and actionable - informed by evidence but inspired by imagination. And we need to find the frictions and rents in our current innovation system and decrease them.
Open - to us - means opening up this capability to more and different people, to address as many important challenges as possible. The ‘process’ of open, whether open source, open access, open data or open science, can be a part of our vision, but no one of these, nor all of them together, is sufficient to address the need. Inclusion is perhaps the only real measure of our success.
Virtually all problem solving requires discovery, coordination and incentivization of multiple partners. This in turn requires building bridges between domains of expertise, and even domains of culture and norms. The Lens is the main project of a social enterprise focused on changing the global problem-solving capability through a long-game strategy. That strategy includes creating a platform serving linked open knowledge artefacts and open metadata that can inform problem solving by a community vastly larger and more inclusive than any single group, be it scholars and academia or lawyers and businesses.
By design, we made the Lens.org platform open and freely accessible, so that any and all users have access to all our data, analytics and visualizations (contingent on the usual compliance with good-actor rules). Users have this access anonymously and comprehensively, including the right to export data, without even having an account or logging into the Lens. Moreover, with a cost-free personal account (which can still be anonymous, and which adds account services such as saved history, recurring queries and collection management), any user can currently export 50,000 records at a time.
What software does the Lens use?
All of the software that comprises the Lens platform itself is Open Source and free to use. The code base is not proprietary code [1]. The following is a brief overview of the main technologies and frameworks that make up the Lens applications:
- Lens servers run within the Amazon EC2 cloud-computing platform
- We use PostgreSQL, MySQL and MongoDB databases
- Elasticsearch and Apache Lucene are used for text search
- NGINX and Apache HTTP Servers are used for proxies and load balancing
- The backend applications are powered by Apache Tomcat and Gunicorn
- Images and static resources are stored and served using Amazon S3/CloudFront
- The open source software used in the PatSeq facility includes:
  - PatSeq Finder: NCBI BLAST+
  - PatSeq Analyzer: opencb/genome-maps, opencb/CellBase, BWA, BLAT, Apache Solr
While the Lens platform is built on open source software, as a small group we have unfortunately not yet had the capacity to share our own code. With sufficient time and funds, we’d be delighted to audit, clean, document and share our codebase.
Our Data Partners
- Microsoft Academic - www.academic.microsoft.com [2]
- Crossref - www.crossref.org
- ORCID - www.orcid.org
- PubMed - www.ncbi.nlm.nih.gov/pubmed
- Impactstory - www.impactstory.org
- CORE - www.core.ac.uk
- European Patent Office (EPO) - www.epo.org
- United States Patent and Trademark Office (USPTO) - www.uspto.gov
- IP Australia - www.ipaustralia.gov.au
- World Intellectual Property Organization (WIPO) - www.wipo.int
The Lens supports the following initiatives:
- I3: The Lens is a founding member of the Innovation Information Initiative, which collaborates on innovation data, analytics and metrics with the MIT Knowledge Futures Group, Boston University, and the Swiss Federal Institute of Technology in Lausanne.
- ORCID: ORCID provides a persistent digital identifier (an ORCID iD) that you own and control, and that distinguishes you from every other researcher. You can connect your iD with your career information including education, employment, funding, publications, peer review, and more. The Lens supports ORCID-linked services and enables you to claim and sync your works with your ORCID record using your profile.
- Crossref: Crossref is a not-for-profit member organization that exists to make scholarly communications better by providing scholarly metadata on an open infrastructure. The Lens deposits patent citation event data into Crossref.
- Microsoft Academic: Microsoft Academic uses machine learning, semantic inference and knowledge discovery to enable exploration of scholarly information in powerful ways. The Lens provides patent metadata to Microsoft Academic.
- I4OA: The Lens supports the Initiative for Open Abstracts (I4OA) and the call for unrestricted availability of abstracts to boost the discovery of research.
- ROR: The Lens is a community adviser and supporter of the Research Organization Registry (ROR) initiative.
[1] As a historical aside, 20 years ago, in the absence of quality indexing and search tools such as Lucene, we had to create from whole cloth a memory-resident search tool in C, called Dekko, that allowed Patent Lens to provide a high-performance, free and open full-text patent search, the first in the world. It is much nicer now that we can build on and contribute to open projects.
[2] Microsoft Academic was decommissioned at the end of 2021.