Coronavirus COVID-19 Datasets

The Lens has assembled free and open datasets of patent documents, scholarly research works metadata and biological sequences from patents, and deposited them in a machine-readable and explorable form.

Human Coronaviruses Data Initiative

The Lens is building an interactive tool for understanding the landscape of patent and research works in any domain, including human coronaviruses and COVID-19. However, the full ‘Report Builder’ functionality is still under development.

Considering the urgency of the crisis and the need to develop improved practices and products for diagnosis, therapeutics, medical devices and tools for protection and treatment, vaccines and other interventions, The Lens has shared draft collections and datasets created for the Human Coronavirus landscape (see below). For more up-to-date data, please see the live Lens Report, Human Coronaviruses: Patent and Research Landscape. Alternatively, The Lens has provided all search queries, so you can easily replicate them in The Lens and export the most recent versions of the datasets.

Surface and Repurpose

Patents are typically first published 18 months after filing, so patents specifically targeted to this new COVID-19 agent may not appear until June of 2021. However, the virus and disease are related to the SARS and MERS viral outbreaks of the last two decades so there is a wealth of knowledge to mine - to 'Surface and Repurpose' - that can help in the fight against COVID-19.
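The 18-month estimate can be checked with simple date arithmetic. The following is a minimal Python sketch; the filing date of 15 December 2019 is an assumption chosen for illustration (it is not stated by The Lens), and `add_months` is a hypothetical helper.

```python
from datetime import date

def add_months(d: date, months: int) -> date:
    """Shift a date forward by whole months, clamping the day to 28 so the result is always valid."""
    total = d.year * 12 + (d.month - 1) + months
    year, month_index = divmod(total, 12)
    return date(year, month_index + 1, min(d.day, 28))

# Assumed earliest filing date for a COVID-19-specific patent (illustrative only).
earliest_filing = date(2019, 12, 15)
first_publication = add_months(earliest_filing, 18)
print(first_publication.strftime("%B %Y"))
```

Under that assumption, the first publications fall in mid-2021, consistent with the estimate above.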

Surfacing technologies and methods of making things within the patent literature is essential, whether the patent is no longer protected in many parts of the world, the patents have expired, or the technologies exist with rights. This knowledge must be openly accessible for re-use and repurposing in the fight against COVID-19. Discovering and forging new partnerships that can bring unique capabilities to the fight is critical, as is targeting investments at the highest impact pathways and trajectories. All of this requires open, reliable evidence.

To inform any move towards common action, we need open data and shared, evidence-driven analysis. We hope these datasets and analytic tools help.

Will we need a COVID-19 Commons?

To develop fast, effective solutions, we may need patented technologies or particular in-house capabilities to be shared in a COVID-19 Commons, whether voluntarily, through incentives or by mandate.

In other times of national or global crisis, such pools or license commons have been developed to accelerate solutions and share the benefits widely. In 1917, one year before the Spanish Influenza pandemic, the US government, in the throes of World War One, forced the formation of a licensing commons (e.g. the Manufacturers Aircraft Association) to break the patent logjam that was halting progress on military aircraft. The HIV/AIDS crisis was also critical in stimulating improvements to the TRIPS Agreement to clarify compulsory licensing for public health emergencies.

Will the COVID-19 crisis require such interventions?

Patent Collections Updated: 19 May 2020
Documents
Families
Cited
Downloads
Coronavirus: Broad keywords based patents
(title:(Coronavirus) OR abstract:(Coronavirus) OR claims:(Coronavirus)) OR (title:("Severe acute Respiratory syndrome") OR abstract:("Severe acute Respiratory syndrome") OR claims:("Severe acute Respiratory syndrome")) OR (title:("coronaviridae") OR abstract:("coronaviridae") OR claims:("coronaviridae")) OR claims:("SARS-CoV") OR claims:("MERS-CoV") OR claims:("COVID 19") OR claims:("Wuhan coronavirus") OR claims:("2019-nCoV") OR claims:("Middle East respiratory") OR (title:(COVID 19) OR abstract:(COVID 19) OR claims:(COVID 19))
Coronavirus: SARS & MERS patents
(title:(coronavirus) OR abstract:(coronavirus) OR claims:(coronavirus)) OR (title:("severe acute respiratory") OR abstract:("severe acute respiratory") OR claims:("severe acute respiratory")) OR (title:("Middle East respiratory") OR abstract:("Middle East respiratory") OR claims:("Middle East respiratory"))
Coronavirus: SARS & MERS TAC patents
((title:(SARS) OR abstract:(SARS) OR claims:(SARS)) AND (title:("severe acute respiratory") OR abstract:("severe acute respiratory") OR claims:("severe acute respiratory"))) OR ((title:("Middle East respiratory") OR abstract:("Middle East respiratory") OR claims:("Middle East respiratory")) AND (title:("MERS") OR abstract:("MERS") OR claims:("MERS"))), plus 42 patents from the filter Biologicals = (227859, 228330) applied to an empty search, which added 28 new patents to the original collection.
Coronavirus: Limited keyword based patents
((title:(SARS) OR abstract:(SARS) OR claims:(SARS)) AND (title:("severe acute respiratory") OR abstract:("severe acute respiratory") OR claims:("severe acute respiratory"))) OR ((title:("Middle East respiratory") OR abstract:("Middle East respiratory") OR claims:("Middle East respiratory")) AND (title:("MERS") OR abstract:("MERS") OR claims:("MERS")) OR (title:(COVID 19) OR abstract:(COVID 19) OR claims:(COVID 19)))
Coronavirus: CPC based patents
classification_cpc:(C12N2770/20011) OR classification_cpc:(C12N2770/000*) OR classification_cpc:(C07K14/165) OR classification_cpc:(A61K39/215) OR classification_cpc:(G01N2333/165)
Coronavirus: Declared patseq organism patents
sequence_organism_taxid:227859 OR sequence_organism_taxid:228330 OR sequence_organism_taxid:693995 OR sequence_organism_taxid:277944 OR sequence_organism_taxid:11137 OR sequence_organism_taxid:1335626
Coronavirus: SARS patents
((title:(SARS) OR abstract:(SARS) OR claims:(SARS)) AND (title:("severe acute respiratory") OR abstract:("severe acute respiratory") OR claims:("severe acute respiratory"))) + those patents with sequence derived from SARS coronavirus
Coronavirus: MERS patents
((title:("Middle East respiratory") OR abstract:("Middle East respiratory") OR claims:("Middle East respiratory")) AND (title:("MERS") OR abstract:("MERS") OR claims:("MERS")))
Coronavirus: SARS diagnosis patents
Coronavirus_patents_SARS collection refined using claims: (diagnos* OR detect*)
Coronavirus: MERS diagnosis patents
Coronavirus_patents_MERS collection refined using claims: (diagnos* OR detect*)
Coronavirus: SARS treatment patents
Coronavirus_Broad_SARS collection refined with claims:(antiviral OR vaccin* OR treat*)
Coronavirus: MERS treatment patents
Coronavirus: Broad MERS collection refined with claims:(antiviral OR vaccin* OR treat*)
Coronavirus: Hong Kong University patents
Most relevant patents on SARS in HK: pub_num:7553944 OR pub_num:7785775 OR pub_num:7361747 OR pub_num:7371837 OR pub_num:7375202 OR pub_num:2006/007795 OR pub_num:2004/085633 OR pub_key:CN_102021147_A OR pub_key:CN_101076591_A
Ventilators
classification_cpc:(A61M16\/*) grouped by simple family
Respirators and surgical masks
classification_cpc:(A62B23\/025*) OR classification_cpc:(A41d13\/11*)
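The title/abstract/claims pattern that recurs in the patent queries above can be generated rather than typed by hand. The following is a minimal Python sketch that only assembles query strings in the syntax visible in the listed searches; the helper names (`quote`, `any_field`, `any_term`, `any_value`) are hypothetical and are not part of any Lens tool or API.

```python
from typing import Iterable, Sequence

# Fields searched in the keyword-based patent collections listed above.
PATENT_FIELDS = ("title", "abstract", "claims")

def quote(term: str) -> str:
    """Wrap multi-word terms in double quotes so they are searched as a phrase."""
    return f'"{term}"' if " " in term else term

def any_field(term: str, fields: Sequence[str] = PATENT_FIELDS) -> str:
    """Build (title:(t) OR abstract:(t) OR claims:(t)) for a single term."""
    clauses = " OR ".join(f"{field}:({quote(term)})" for field in fields)
    return f"({clauses})"

def any_term(terms: Iterable[str], fields: Sequence[str] = PATENT_FIELDS) -> str:
    """OR together the per-term clauses."""
    return " OR ".join(any_field(term, fields) for term in terms)

def any_value(field: str, values: Iterable[object]) -> str:
    """Build field:v1 OR field:v2 ..., as in the taxid-based query."""
    return " OR ".join(f"{field}:{value}" for value in values)

# Reproduces the 'SARS & MERS patents' query text.
sars_mers = any_term(["coronavirus", "severe acute respiratory", "Middle East respiratory"])
print(sars_mers)

# First two organism taxids from the 'Declared patseq organism patents' query.
taxids = any_value("sequence_organism_taxid", [227859, 228330])
print(taxids)
```

The resulting strings can be pasted into the Lens search box; generating them keeps long queries consistent across fields and makes it easier to add or remove terms.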
Scholarly Works Collections Updated: 19 May 2020
Works
Citing
Downloads
Coronavirus: COVID-19
Coronavirus: Broad collection filtered by "Wuhan coronavirus" OR "Wuhan-Hu1" OR "2019-nCoV" OR "COVID 19" OR "SARS-CoV-2"
Coronavirus: Broad
Manually curated collection based on this search (title:(Coronavirus OR SARS OR nCoV OR coronavirinae) OR abstract:(Coronavirus OR SARS OR nCoV OR coronavirinae ) OR keyword:(Coronavirus OR SARS OR nCoV OR coronavirinae ) OR field_of_study:(Coronavirus OR SARS OR nCoV OR coronavirinae )) and later combined with this search: title:("Coronavirus" OR "coronavirinae" OR "Wuhan coronavirus" OR "2019-nCoV" OR "COVID 19" OR "SARS-CoV*" OR "MERS-COV*" OR "severe acute respiratory" OR "Middle East respiratory") OR abstract:("Coronavirus" OR "coronavirinae" OR "Wuhan coronavirus" OR "2019-nCoV" OR "COVID 19" OR "SARS-CoV*" OR "MERS-COV*" OR "severe acute respiratory" OR "Middle East respiratory") OR keyword:("Coronavirus" OR "coronavirinae" OR "Wuhan coronavirus" OR "2019-nCoV" OR "COVID 19" OR "SARS-CoV*" OR "MERS-COV*" OR "severe acute respiratory" OR "Middle East respiratory") OR field_of_study:("Coronavirus" OR "coronavirinae" OR "Wuhan coronavirus" OR "2019-nCoV" OR "COVID 19" OR "SARS-CoV*" OR "MERS-COV*" OR "severe acute respiratory" OR "Middle East respiratory")
Coronavirus: SARS-CoV-1
Coronavirus Broad collection filtered by: title:(SARS) OR title:("severe acute respiratory") OR field_of_study:("severe acute respiratory")
Coronavirus: MERS
Coronavirus: Broad collection filtered by title:"MERS" OR field_of_study:"MERS" OR title: ("Middle east respiratory" ) OR field_of_study: ("Middle east respiratory" )
Coronavirus: Transmission
Coronavirus: Broad Collection filtered by: title:(spread*) OR title:(transmi*)
Coronavirus: Diagnosis & Treatment
Coronavirus: Broad collection filtered by : title:(diagnos*) OR title:(treat* ) OR title:(detect*) OR title:(antiviral) OR title:(vaccin*)
Coronavirus: Treatment
Coronavirus: Broad collection filtered by: (title:treat*) OR ( title: antiviral) OR (title:"disease manag*") OR (title: "disease control") OR (title:vaccin*)
Coronavirus: SARS-CoV-1 diagnosis
Coronavirus: Broad_SARS-CoV-1 collection filtered by "detect*" OR "diagnos*"
Coronavirus: MERS diagnosis
Coronavirus: MERS collection filtered by "detect*" OR "diagnos*"
Coronavirus: Spike protein scholarly works
(title:("S protein") OR abstract:("S protein") OR keyword:("S protein") OR field_of_study:("S protein")) AND (title:(Coronavirus) OR abstract:(Coronavirus) OR keyword:(Coronavirus) OR field_of_study:(Coronavirus))
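Several of the scholarly collections above are derived from the Broad collection by wildcard filters on the title, such as title:(spread*) OR title:(transmi*). The following is a minimal Python sketch of how a similar filter could be applied locally to exported records. It assumes each record is a dict with a "title" key (a hypothetical export shape), uses made-up titles, and only approximates the matching that The Lens performs, whose tokenization may differ.

```python
import re
from typing import Iterable

def wildcard_to_regex(pattern: str) -> re.Pattern:
    """Turn a trailing-* term such as 'transmi*' into a whole-word, case-insensitive regex."""
    stem = re.escape(pattern.rstrip("*"))
    suffix = r"\w*" if pattern.endswith("*") else ""
    return re.compile(rf"\b{stem}{suffix}\b", re.IGNORECASE)

def title_matches(title: str, patterns: Iterable[str]) -> bool:
    """True if any word in the title matches any of the wildcard patterns."""
    return any(wildcard_to_regex(p).search(title) for p in patterns)

# Made-up records for illustration only; not taken from the Lens datasets.
records = [
    {"title": "Airborne transmission of respiratory viruses"},
    {"title": "Spread of infection in hospital wards"},
    {"title": "Widespread antibody responses after vaccination"},
    {"title": "Spike protein structure"},
]

# Local equivalent of: title:(spread*) OR title:(transmi*)
transmission = [r for r in records if title_matches(r["title"], ["spread*", "transmi*"])]
for record in transmission:
    print(record["title"])
```

Because the pattern is anchored at a word boundary, "Widespread" is not treated as a match for spread*, which mirrors prefix matching on whole words.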
PatSeq Finder Search Updated: 19 May 2020
Nucleocapsid phosphoprotein: Wuhan-Hu1
Sequence similarity search results in the Lens PatSeq-aa database using BLASTp for GeneID:43740575 (ORF9 structural protein of the NCBI reference Wuhan-Hu-1 isolate sequence)
Number of hits: >500; Avg. hit sequence length: 415 (min 50 - max 674); Unique Patent Docs: 172
ORF8 protein: Wuhan-Hu1
Sequence similarity search results in the Lens PatSeq-aa database using BLASTp for GeneID:43740577 (ORF8 protein of the NCBI reference Wuhan-Hu-1 isolate sequence)
Number of hits: 104; Avg. hit sequence length: 153 (min 84 - max 503); Unique Patent Docs: 63
ORF7a protein: Wuhan-Hu1
Sequence similarity search results in the Lens PatSeq-aa database using BLASTp for GeneID:43740573 (ORF7a protein of the NCBI reference Wuhan-Hu-1 isolate sequence)
Number of hits: 177; Avg. hit sequence length: 116 (min 15 - max 784); Unique Patent Docs: 61
ORF6 protein: Wuhan-Hu1
Sequence similarity search results in the Lens PatSeq-aa database using BLASTp for GeneID:43740572 (ORF6 protein of the NCBI reference Wuhan-Hu-1 isolate sequence)
Number of hits: 86; Avg. hit sequence length: 48 (min 15 - max 65); Unique Patent Docs: 57
Membrane glycoprotein: Wuhan-Hu1
Sequence similarity search results in the Lens PatSeq-aa database using BLASTp for GeneID:43740571 (ORF5, structural protein of the NCBI reference Wuhan-Hu-1 isolate sequence)
Number of hits: >500; Avg. hit sequence length: 205 (min 15 - max 6,090); Unique Patent Docs: 189
ORF3a protein: Wuhan-Hu1
Sequence similarity search results in the Lens PatSeq-aa database using BLASTp for GeneID:43740569 (ORF3a protein of the NCBI reference Wuhan-Hu-1 isolate sequence)
Number of hits: 158; Avg. hit sequence length: 115 (min 15 - max 420); Unique Patent Docs: 55
Envelope protein: Wuhan-Hu1
Sequence similarity search results in the Lens PatSeq-aa database using BLASTp for GeneID:43740570 (ORF4, structural protein; E protein of the NCBI reference Wuhan-Hu-1 isolate sequence)
Number of hits: 356; Avg. hit sequence length: 198 (min 15 - max 1,288); Unique Patent Docs: 147
Surface glycoprotein: Wuhan-Hu1
Sequence similarity search results in the Lens PatSeq-aa database using BLASTp for GeneID:43740568 (Structural protein; spike protein of the NCBI reference Wuhan-Hu-1 isolate sequence)
Number of hits: >500; Avg. hit sequence length: 1,214 (min 623 - max 1,281); Unique Patent Docs: 239
ORF1ab: Wuhan-Hu1
Sequence similarity search results in the Lens PatSeq-aa database using BLASTp for GeneID:43740578 (ORF1ab, polyprotein of the NCBI reference Wuhan-Hu-1 isolate sequence)
Number of hits: >500; Avg. hit sequence length: 4,355 (min 601 - max 7,176); Unique Patent Docs: 113
Wuhan-Hu1 full genome sequence
Sequence similarity search results in the Lens PatSeq-nt database using BLASTn for the NCBI reference sequence of Wuhan-Hu-1 isolate, NC_045512.2
Number of hits: >500; Avg. hit sequence length: 29,705 (min 21,221 - max 37,971); Unique Patent Docs: 105
S protein of HKU1
Sequence similarity search results in the Lens PatSeq-aa database using BLASTp for GeneID: 3200426 (spike glycoprotein of NCBI reference Human coronavirus HKU1 sequence)
Number of hits: >501; Avg. hit sequence length: 1,331 (min 604 - max 1,385); Unique Patent Docs: 108
nsp12-HKU1
Sequence similarity search results in the Lens PatSeq-aa database using BLASTp for accession YP_173236.1 (non structural protein 12 or RNA-dependent RNA polymerase of NCBI reference Human coronavirus HKU1 sequence)
Number of hits: >500; Avg. hit sequence length: 3,498 (min 24 - max 7,176); Unique Patent Docs: 154
Analyze in PatCite

Analyze linkages between academic research and inventions

You can also use PatCite to explore the citation networks of coronavirus-related scholarly works cited in patents, together with their citing patent families.

Collections Disclaimer

While most of these collections were based on search queries (a description is included for each collection), some manual editing and refining was also used to eliminate clearly irrelevant documents. It is important to keep in mind that these are draft collections and we will continue to refine and update them with each new data release. If you have any specific questions, or if you have advanced domain knowledge and would like to help, please email support@lens.org or osmat.jefferson@lens.org.

Attribution

The Lens provides the data as-is, and under terms requiring only attribution. For any additional information, please see the Lens data terms of use.

Suggested Citation

Human Coronavirus Innovation Landscape: Patent and Research Works Open Datasets. Accessed [date] at https://about.lens.org/covid-19 .

Other COVID-19 Resources

Acknowledgments

Twenty years ago, The Rockefeller Foundation funded Cambia to create the predecessor to The Lens, Patent Lens. We’re pleased that Rockefeller is again working with Cambia and The Lens to advance an open, inclusive and transparent innovation system. We’re honored to acknowledge the support of The Rockefeller Foundation once again as The Lens scales for impact.