With the integration of scholarly works from PubMed, CrossRef and Microsoft Academic into the Lens, we have been working hard to leverage modern open-source tools, such as Elasticsearch, to enable real-time discovery and analysis across the entire corpus of 197M+ scholarly works.
Until now, The Lens has offered a fairly limited set of basic visualisations within our analysis view for scholarly works. While we continue to enrich the scholarly corpus, the ability to visualise the many scholarly facets now available has not kept pace with developments in the data.
"The purpose of visualization is insight, not pictures." Ben Shneiderman
Following this maxim, we have been working on a range of new visualisations and charts to explore and analyse scholarly works. While most of these new visualisations are fairly standard chart types (e.g. scatter plots, line charts, etc.), they allow for a higher degree of interactivity in order to expose meaning and refine, share and understand the underlying data.
Today we’re excited to introduce the new Lens scholarly analytic facility, developed using yet another open-source project, Vega-Lite. This technology allowed us to release a new suite of visualisations that we hope will make exploring our scholarly works more insightful.
"Vega-Lite is a high-level grammar of interactive graphics. It provides a concise JSON syntax for rapidly generating visualizations to support analysis." (vega.github.io/vega-lite)
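To give a feel for that concise JSON syntax, here is a minimal sketch that builds a small Vega-Lite bar chart specification as a Python dictionary and prints it as JSON. The data values and field names (`year`, `works`) are made up for illustration, and the schema version in `$schema` is an assumption rather than the version used by the Lens.

```python
import json

# Hypothetical sample data: scholarly works per publication year (made-up numbers).
works_per_year = [
    {"year": 2015, "works": 120},
    {"year": 2016, "works": 150},
    {"year": 2017, "works": 180},
]


def bar_chart_spec(rows):
    """Return a minimal Vega-Lite bar chart specification as a plain dict."""
    return {
        # Assumed schema version; any published Vega-Lite schema URL could be used.
        "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
        "description": "Illustrative only: scholarly works per publication year.",
        "data": {"values": rows},
        "mark": "bar",
        "encoding": {
            "x": {"field": "year", "type": "ordinal", "title": "Publication year"},
            "y": {"field": "works", "type": "quantitative", "title": "Scholarly works"},
        },
    }


spec = bar_chart_spec(works_per_year)
spec_json = json.dumps(spec, indent=2)
print(spec_json)
```

The printed JSON can be pasted into the Vega online editor to render the chart.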
The combination of Vega-Lite and Elasticsearch has opened up a range of possibilities for analysing and visualising the scholarly works corpus, and with today’s release we have taken a big step towards leveraging these modern, open-source technologies to their full potential. Below we provide an overview of the new features and chart types now available in the Lens scholarly analysis.
Accessing the new features
The new analysis features can be found on the analysis tab for any scholarly search. Simply run a scholarly search and select the Analysis tab above the menu bar at the top of the results list.
This will launch you into the default dashboard, which shows an overview of the data in your search and provides a sample of the chart types that can be used to visualise your results. If this is your first time viewing the new analysis dashboard, a feature tour will launch to provide a guided walk-through of the analysis features. This can be re-launched at any time by selecting `Launch Guided Tour` on the toolbar.
New chart types
With the help of Vega-Lite, we have added a number of new chart types to our existing visualisations, which provide a lot more flexibility in exploring and customising your analysis to gain further insight into the scholarly data. These include:
- Scatter plots for visualising the Scholarly Citations and Patent Citations of up to 1,000 individual scholarly works. This visualisation shows the top cited works (sorted by either patent citations or scholarly citations) and is fantastic for identifying outliers: individual works with unusual ratios of citing patents to citing scholarly works. We have also added a viewfinder below the scatter plot (another great feature available in Vega-Lite) that provides a selectable timeline based on the publication year of the works.
- Line or stacked bar charts for visualising Historical Trends in scholarly works across multiple facets (e.g. top institutions, authors, fields of study, etc.) have finally been added. Now you can view the data from many facets over time (sorry it took so long to add!) and switch between a line chart and a stacked bar chart.
- Grouped/stacked bar charts for new Targeted Analysis visualisations. This allows you to view the results of nested aggregations, in which one facet is grouped by another. This is a great visualisation for quickly comparing the top entities in two facets, for example authors by field of study or institution, institutions by country/region or field of study, or journals by subject.
- Multi-dimensional scatter plot for Citation Comparison. This visualisation plots the number of unique citing patents against the number of unique citing scholarly works for the top cited entities in our facets (e.g. the most cited authors or institutions), offering a new way of visualising influence on the patent and scholarly corpus and identifying noteworthy outliers. In future releases, it will be expanded to let you compare the number of unique citing inventions (citing patent families) or expanded patent families, providing even more insight in the context of invention and scholarship.
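For readers curious about what a nested aggregation looks like, here is a minimal sketch in Python using the standard Elasticsearch terms aggregation with a sub-aggregation. It is not the Lens implementation: the field names (`institution`, `field_of_study`), the aggregation names (`groups`, `splits`) and the sample response are all hypothetical, and the counts are made up.

```python
import json


def nested_terms_query(group_field, split_field, group_size=10, split_size=5):
    """Build a search body that buckets by one facet, then by a second facet."""
    return {
        "size": 0,  # return aggregation buckets only, no individual hits
        "aggs": {
            "groups": {
                "terms": {"field": group_field, "size": group_size},
                "aggs": {
                    "splits": {"terms": {"field": split_field, "size": split_size}}
                },
            }
        },
    }


def flatten_buckets(response):
    """Turn the nested buckets into flat rows, one per (group, split) pair."""
    rows = []
    for group in response["aggregations"]["groups"]["buckets"]:
        for split in group["splits"]["buckets"]:
            rows.append(
                {"group": group["key"], "split": split["key"], "works": split["doc_count"]}
            )
    return rows


# Hypothetical field names; the real index mapping is not described in this post.
query = nested_terms_query("institution", "field_of_study")

# Hand-written response in the usual Elasticsearch bucket shape (made-up counts).
sample_response = {
    "aggregations": {
        "groups": {
            "buckets": [
                {
                    "key": "University A",
                    "doc_count": 30,
                    "splits": {
                        "buckets": [
                            {"key": "Biology", "doc_count": 20},
                            {"key": "Chemistry", "doc_count": 10},
                        ]
                    },
                },
                {
                    "key": "University B",
                    "doc_count": 15,
                    "splits": {"buckets": [{"key": "Biology", "doc_count": 15}]},
                },
            ]
        }
    }
}

rows = flatten_buckets(sample_response)
print(json.dumps(rows, indent=2))
```

Flat rows like these are a convenient input for a grouped or stacked bar chart, with one facet on the axis and the other mapped to colour.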
A key feature of the new suite of analysis and visualisation tools is the ability to completely customise the charts to explore the results in more depth and share these insights with others. To make this easier, we have developed a chart wizard that helps you find and visualise the information of most interest through a two-step process for selecting the most relevant visualisation from a number of pre-set chart configurations.
Selecting Add New Chart from the analysis toolbar initialises the chart wizard, which guides you to select the facet you're most interested in and then offers a set of pre-set chart configurations for that facet. The configuration you choose is added to the dashboard as a new chart.
In addition to the chart wizard, every chart in a dashboard can be further customised using the settings menu on the chart tile, giving you a wide array of options for tailoring each chart and visualising the search results. The title and chart description can also be edited to further customise and annotate the visualisations in the analysis dashboard in preparation for sharing.
Sharing and Export
Once you have used the customisation features to create a highly tailored dashboard to analyse and visualise your search results or collection, the new Presentation Mode and sharing features allow you to preview and share your analysis with anyone at the click of a button.
For example, you can use presentation mode to preview the dashboard as a report or share the report with your colleagues or clients.
After creating your dashboard, you can use the Presentation Mode button on the analysis toolbar to preview the dashboard in a report view. As the name suggests, this provides a clean ‘presentation’ report-view of the dashboard and is the default view for anyone you share the dashboard with. Presentation mode hides the filters and other aspects of the user interface, presenting your charts and annotations in a clean report layout with an automatically generated table of contents. You can exit presentation mode using the button in the top left of the report to continue editing the dashboard or refine your search.
Once you are happy with your dashboard and analysis, you can share it with anyone using the Share Dashboard button on the analysis toolbar or at the top right of the report. This will prompt you to give your report a title and then give you the option to share it via LinkedIn, Facebook or email, or to copy the link.
In addition to creating and sharing a report, you now have further options for exporting the chart visualisations and data. The settings menu on each chart tile provides an option to download the chart as an editable SVG or a PNG image file.
You can now also view the source data and code behind each chart. The Data only tab provides the aggregated data behind the chart in JSON format, while the Vega Spec tab provides the chart JSON in Vega-Lite format. Vega has its own online editor which can be used to further customise the chart JSON, so be sure to try it out!
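If you prefer to tweak a copied spec in code rather than in the online editor, a minimal sketch in Python might look like the following. The spec shown is a hypothetical stand-in for what the Vega Spec tab provides (the real specs will be richer), and `customise` is an illustrative helper, not part of any Lens or Vega tooling. It relies on the Vega-Lite behaviour that bar marks with a colour encoding are stacked by default.

```python
import copy
import json

# Hypothetical stand-in for JSON copied from a chart's Vega Spec tab.
copied_spec = """
{
  "data": {"values": [
    {"year": 2016, "field": "Biology", "works": 40},
    {"year": 2016, "field": "Chemistry", "works": 25},
    {"year": 2017, "field": "Biology", "works": 55},
    {"year": 2017, "field": "Chemistry", "works": 30}
  ]},
  "mark": "line",
  "encoding": {
    "x": {"field": "year", "type": "ordinal"},
    "y": {"field": "works", "type": "quantitative"},
    "color": {"field": "field", "type": "nominal"}
  }
}
"""


def customise(spec, mark=None, title=None):
    """Return a modified copy of a Vega-Lite spec, leaving the original untouched."""
    new_spec = copy.deepcopy(spec)
    if mark is not None:
        new_spec["mark"] = mark
    if title is not None:
        new_spec["title"] = title
    return new_spec


line_spec = json.loads(copied_spec)

# Switching the mark to "bar" while keeping the colour encoding gives stacked bars.
bar_spec = customise(line_spec, mark="bar", title="Works per year by field of study")
print(json.dumps(bar_spec, indent=2))
```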
If you find our current visualisations don’t meet your needs, please let us know so we can continue to improve our analysis and visualisation options. In the meantime, you can always export any set of search results (up to 50K scholarly records at a time) in BibTeX, CSV, JSON or RIS format and perform your own analysis using software like Excel or Tableau.
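An exported file can also be analysed with a few lines of code. The sketch below counts works per publication year using only the Python standard library. The column names (`title`, `publication_year`, `scholarly_citations`) and the sample rows are assumptions for illustration; check the header row of your own export and adjust `year_column` to match.

```python
import csv
import io
from collections import Counter

# Made-up sample standing in for an exported CSV; real exports have more columns.
SAMPLE_CSV = """title,publication_year,scholarly_citations
Paper A,2016,12
Paper B,2016,3
Paper C,2017,7
"""


def works_per_year(csv_file, year_column="publication_year"):
    """Count records per publication year, skipping rows with no year."""
    counts = Counter()
    for row in csv.DictReader(csv_file):
        year = (row.get(year_column) or "").strip()
        if year:
            counts[year] += 1
    return dict(sorted(counts.items()))


result = works_per_year(io.StringIO(SAMPLE_CSV))
print(result)
```

To run this on a real export, replace the `io.StringIO(SAMPLE_CSV)` argument with a file opened via `open(path, newline="", encoding="utf-8")`.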
We hope that you find these additions to our analysis and visualisation facilities useful. This is the first set of features to leverage our new tools and infrastructure. Although the features rolled out in this release only cover our scholarly data, rest assured that our engineers are hard at work updating our patent infrastructure so we can provide similar analysis and visualisation tools for our patent data and the patent-scholarly works linkages in the near future.
With the addition of this new functionality we are bound to have made some mistakes. We've been through a lot of design iterations to release this set of analysis features, but there are still stacks of features planned for future releases! So please feel free to send us your feedback on what analysis features you want to see in the Lens or any other suggestions.
We would love to see how you use the analysis functionality or hear about any oddities that you come across in the data. You can reach us at firstname.lastname@example.org, so please don’t hesitate to get in touch and share your interesting visualisations with us. Thank you for using The Lens!