What is the Lens?
The Lens is an open global cyberinfrastructure to make the innovation system more efficient and fair, more transparent and inclusive.
The Lens is building an open platform for Innovation Cartography. Specifically, the Lens serves nearly all of the patent documents in the world as open, annotatable digital public goods that are integrated with scholarly and technical literature along with regulatory and business data. The Lens will allow document collections, aggregations, and analyses to be shared, annotated, and embedded to forge open mapping of the world of knowledge-directed innovation. Ultimately, this will restore the role of the patent system as a teaching resource to inspire and inform entrepreneurs, citizens and policy makers.
Within the next two years, we expect to host over 95% of the world's patent information and link to most of the scholarly literature, creating open public innovation portfolios of individuals and institutions. Using all open source components, we are working to create open schemas by which patent documents can be used to teach and communicate, rather than confuse and intimidate.
Underlying data and analytics will be available to the public through APIs. By creating and freely sharing APIs and by building modular, standardized specifications, we envision growing public use of innovation cartography to decrease the fear, uncertainty and doubt that hinder investment and enthusiasm.
Check out the latest stats on the Lens patent data (coverage, date range, and various accessible metadata). Updates are currently performed every 3-4 weeks. The following patent data sources are ingested and integrated in the Lens:
- The European Patent Office’s DocDB bibliographic data from 1907 – present: 81+ million documents from nearly 100 jurisdictions.
- USPTO Applications from 2001 – present with full text and images.
- USPTO Grants from 1976 – present with full text and images.
- USPTO Assignments (14+ million).
- European Patent Office (EP) Grants from 1980 – present with full text and images.
- WIPO PCT Applications from 1978 – present with full text and images.
- Australian Patent Full Text from IP Australia.
Similarly, the Lens displays scholarly data. The data sources integrated so far are:
- Scholarly records from PubMed (28M)
- Scholarly records from Crossref (94M)
- Scholarly records from Microsoft Academic (158M)
And here is a brief list of the metadata available in the scholarly records:
- Lens Scholarly ID
- citation identifiers
- publication date
- publication type
- authors (first and last name, order, affiliation)
- start and end pages, volume, issue
- funding/grant information
- keywords (PubMed only)
- mesh_term (PubMed only)
- Field of Study
- chemicals (PubMed only)
- clinical_trial data (PubMed only)
- citing patents
- scholarly citations
- recommended works
- references (string with identifiers if available)
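To make the shape of a scholarly record more concrete, the metadata list above can be sketched as a simple data structure. This is only an illustration: the class names, field names, types, and the helper `ordered_author_names` are assumptions chosen to mirror the list, not the Lens's actual schema or API.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Author:
    """One author entry; field names are hypothetical."""
    first_name: str
    last_name: str
    order: int                        # position in the author list
    affiliation: Optional[str] = None


@dataclass
class ScholarlyRecord:
    """Illustrative container mirroring the metadata list (not the real schema)."""
    lens_id: str                                                # Lens Scholarly ID
    citation_ids: Dict[str, str] = field(default_factory=dict)  # identifier type -> value
    publication_date: Optional[str] = None
    publication_type: Optional[str] = None
    authors: List[Author] = field(default_factory=list)
    start_page: Optional[str] = None
    end_page: Optional[str] = None
    volume: Optional[str] = None
    issue: Optional[str] = None
    funding: List[str] = field(default_factory=list)
    fields_of_study: List[str] = field(default_factory=list)
    citing_patents: List[str] = field(default_factory=list)
    scholarly_citations: List[str] = field(default_factory=list)
    recommended_works: List[str] = field(default_factory=list)
    references: List[str] = field(default_factory=list)
    # The following are listed as PubMed only, so they stay empty otherwise.
    keywords: List[str] = field(default_factory=list)
    mesh_terms: List[str] = field(default_factory=list)
    chemicals: List[str] = field(default_factory=list)
    clinical_trials: List[str] = field(default_factory=list)


def ordered_author_names(record: ScholarlyRecord) -> List[str]:
    """Return 'First Last' strings sorted by each author's stated order."""
    ranked = sorted(record.authors, key=lambda a: a.order)
    return [f"{a.first_name} {a.last_name}" for a in ranked]


# Usage with placeholder values (not real data):
example = ScholarlyRecord(
    lens_id="EXAMPLE-ID",
    authors=[Author("B", "Two", order=2), Author("A", "One", order=1)],
)
print(ordered_author_names(example))
```

Keeping the PubMed-only fields as empty lists by default lets records from Crossref or Microsoft Academic share the same structure without special cases.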
All of the software that comprises the Lens application itself is Open Source and free to use. The following is a brief overview of the main technologies and frameworks that make up the Lens:
- Lens servers run within the Amazon EC2 cloud-computing platform
- We use PostgreSQL, MySQL and MongoDB databases
- Elasticsearch and Apache Lucene are used for text search
- NGINX and Apache HTTP Server are used for proxying and load balancing
- The backend applications are powered by Apache Tomcat and Gunicorn
- Images and static resources are stored and served using Amazon S3/CloudFront