The Dimcli Python library: Magic Commands

The purpose of this notebook is to show how to use Dimcli magic commands.

Python magic commands are essentially shortcuts that let you perform common operations without having to type much code.

For example, Dimcli magic commands can be used to quickly launch queries or to retrieve API documentation.

Magic commands can be very useful when testing things out, e.g. when trying out a new query or checking what data is available in Dimensions on a certain topic.

Prerequisites

This notebook assumes you have installed the Dimcli library and are familiar with the Getting Started tutorial.

[1]:
!pip install dimcli --quiet

import dimcli
from dimcli.shortcuts import *
import sys

print("==\nLogging in..")
# https://github.com/digital-science/dimcli#authentication
ENDPOINT = "https://app.dimensions.ai"
if 'google.colab' in sys.modules:
  # On Google Colab, ask for the credentials interactively
  import getpass
  USERNAME = getpass.getpass(prompt='Username: ')
  PASSWORD = getpass.getpass(prompt='Password: ')
  dimcli.login(USERNAME, PASSWORD, ENDPOINT)
else:
  # Elsewhere, credentials are left empty; the output below shows the
  # login then happens via the local dsl.ini file
  USERNAME, PASSWORD = "", ""
  dimcli.login(USERNAME, PASSWORD, ENDPOINT)
dsl = dimcli.Dsl()
==
Logging in..
Dimcli - Dimensions API Client (v0.7.4.2)
Connected to: https://app.dimensions.ai - DSL v1.27
Method: dsl.ini file

[2]:
# !pip install dimcli -U --quiet
[3]:
# username = ""
# password = ""
# endpoint = "https://app.dimensions.ai"

# # import all libraries and login
# import dimcli
# dimcli.login(username, password, endpoint)
# dsl = dimcli.Dsl()

Dimcli ‘magic’ commands

Dimcli includes 5 types of magic commands:

  1. %dsl can be used to run an API query

  2. %dslloop can be used to run a paginated API query, iterating over the result pages up to a maximum of 50k records

  3. %dsldf can be used to run an API query and transform the JSON data to a dataframe

  4. %dslloopdf can be used to run a paginated API query and transform the JSON data to a dataframe

  5. %dsldocs can be used to programmatically extract API schema information

Tip: Accessing data returned by magic queries

By default the results of magic command queries are saved into a variable called dsl_last_results:

[4]:
%dsl search publications for "something" return publications limit 1
type(dsl_last_results)
Returned Publications: 1 (total = 6055557)
Time: 1.18s
[4]:
dimcli.core.api.DslDataset

Note: a Dimcli DslDataset object is a wrapper around the raw JSON data that provides various functionalities (e.g. counting objects, returning dataframes, etc.).

[5]:
print(dsl_last_results.publications[0]['title'])
The Prosimetrum Form 1: Verses as the Voice of the Past
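
To make the idea of "a wrapper around the raw JSON data" more concrete, here is a minimal sketch. The class `MiniDataset` and its method `count_total` are hypothetical names invented for this illustration, and the `_stats` key is an assumption about the payload shape; this is not Dimcli's actual implementation.

```python
# Hypothetical stand-in for a dataset wrapper; NOT Dimcli's actual code.
class MiniDataset:
    """Wraps a raw JSON-like dict and exposes its top-level keys as attributes."""

    def __init__(self, data):
        self.json = data

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails,
        # so `self.json` itself is resolved the usual way.
        try:
            return self.json[name]
        except KeyError:
            raise AttributeError(name) from None

    def count_total(self):
        # Assumes the payload carries its total under "_stats".
        return self.json["_stats"]["total_count"]


# Sample payload; the title and total are copied from the output above.
raw = {
    "publications": [
        {"title": "The Prosimetrum Form 1: Verses as the Voice of the Past"}
    ],
    "_stats": {"total_count": 6055557},
}

results = MiniDataset(raw)
print(results.publications[0]["title"])
print(len(results.publications), "of", results.count_total())
```

The point of the sketch is only that attribute-style access such as `.publications` can be a thin layer over a plain dictionary.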

1. Simple queries with %dsl or %%dsl

These commands let you run an API query by typing it after %dsl.

Moreover, if you press ‘tab’ after the command, you can also take advantage of a custom DSL autocompleter.

These commands are shortcuts for the standard syntax:

dsl = dimcli.Dsl()
dsl.query("...<some dsl query>...")

Single-line version: %dsl

[6]:
%dsl search publications where journal.title="Nature Energy" return publications
Returned Publications: 20 (total = 1082)
Time: 0.70s
WARNINGS [1]
Please review your query, as it contains an entity filter (journal.title) that can lead to incomplete results. More details on https://docs.dimensions.ai/dsl/language.html#literal-fields-vs-entity-fields
[6]:
<dimcli.DslDataset object #4398528256. Records: 20/1082>

Multi-line version: %%dsl

You can split the query into multiple lines, only this time you need to use the %%dsl command (two %):

[7]:
%%dsl
search publications
where year in [2013:2018] and research_orgs="grid.258806.1"
return publications[title]
Returned Publications: 20 (total = 3727)
Time: 0.49s
[7]:
<dimcli.DslDataset object #4398527008. Records: 20/3727>

Note: the autocompleter is available only with single-line queries.
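
When a query needs to be assembled from Python variables, one option is to build the query string in code and pass it to the standard `dsl.query(...)` syntax shown earlier. The helper `build_query` below is a hypothetical name used only for this sketch; it simply reproduces the text of the multi-line query above.

```python
def build_query(source, where, fields=None):
    """Assemble a DSL query string from its parts (hypothetical helper)."""
    # With fields: "publications[title+year]"; without: "publications"
    return_clause = f"{source}[{'+'.join(fields)}]" if fields else source
    return f"search {source} where {where} return {return_clause}"


query = build_query(
    source="publications",
    where='year in [2013:2018] and research_orgs="grid.258806.1"',
    fields=["title"],
)
print(query)
# A logged-in client could then run it with: dimcli.Dsl().query(query)
```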

2. Loop queries with %dslloop or %%dslloop

This magic command automatically loops over all the pages of a results set, until all possible records have been returned.

This is a shortcut for the dimcli.Dsl.query_iterative method, which takes care of timing queries appropriately and aggregating results within a single object (see the Dimcli Library: Installation and Querying notebook for more details).

Single-line version: %dslloop

[8]:
%dslloop search publications for "malaria AND Egypt" where year=2015 return publications
Starting iteration with limit=1000 skip=0 ...
0-1000 / 2496 (4.16s)
1000-2000 / 2496 (1.73s)
2000-2496 / 2496 (0.91s)
===
Records extracted: 2496
[8]:
<dimcli.DslDataset object #4658682560. Records: 2496/2496>

Multi-line version: %%dslloop

[9]:
%%dslloop
search publications
where year in [2013:2018] and research_orgs="grid.258806.1"
return publications
Starting iteration with limit=1000 skip=0 ...
0-1000 / 3727 (2.05s)
1000-2000 / 3727 (4.34s)
2000-3000 / 3727 (2.01s)
3000-3727 / 3727 (1.71s)
===
Records extracted: 3727
[9]:
<dimcli.DslDataset object #4398004736. Records: 3727/3727>

Like before, the results of a loop query are stored into the dsl_last_results variable.

[10]:
dsl_last_results.stats
[10]:
{'total_count': 3727}
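
The limit/skip pagination visible in the loop output can be sketched roughly as follows. This is a simplified illustration rather than Dimcli's code: `fetch_page` is a made-up stand-in for a single API call, `query_all` is a hypothetical name, and the sketch leaves out the query timing that the real method handles.

```python
def fetch_page(limit, skip, total=2496):
    """Stand-in for one API call: returns fake records plus the total count."""
    stop = min(skip + limit, total)
    return {"records": [{"id": i} for i in range(skip, stop)], "total": total}


def query_all(limit=1000, max_records=50_000):
    """Collect every page by increasing `skip` until no records remain."""
    records, skip = [], 0
    while skip < max_records:
        page = fetch_page(limit=limit, skip=skip)
        records.extend(page["records"])
        skip += limit
        # Stop on an empty page or once the reported total has been covered
        if not page["records"] or skip >= page["total"]:
            break
    return records


all_records = query_all()
print(len(all_records))
```

With a fake total of 2496 and a limit of 1000, the loop makes three calls (skip=0, 1000, 2000), matching the three progress lines printed by the single-line example.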

3. Returning dataframes: %dsldf and %%dsldf

These magic commands are similar to the ones above, only they transform the data directly into Pandas dataframe objects.

Dataframes are then easy to sort, analyse, export as CSV and use within visualisation software.

Single-line version: %dsldf

[11]:
%dsldf search publications where journal.id="jour.1136447" return publications
Returned Publications: 20 (total = 1082)
Time: 0.49s
[11]:
title pages author_affiliations year id type journal.id journal.title issue volume
0 The role of exciton lifetime for charge genera... 1-9 [[{'first_name': 'Andrej', 'last_name': 'Class... 2020 pub.1130455861 article jour.1136447 Nature Energy NaN NaN
1 A global analysis of the progress and failure ... 1-8 [[{'first_name': 'Galina', 'last_name': 'Alova... 2020 pub.1130455686 article jour.1136447 Nature Energy NaN NaN
2 Effects of technology complexity on the emerge... 1-11 [[{'first_name': 'Kavita', 'last_name': 'Suran... 2020 pub.1130455730 article jour.1136447 Nature Energy NaN NaN
3 The short-term costs of local content requirem... 1-9 [[{'first_name': 'Benedict', 'last_name': 'Pro... 2020 pub.1130456280 article jour.1136447 Nature Energy NaN NaN
4 How to split an exciton 1-2 [[{'first_name': 'Tracey M.', 'last_name': 'Cl... 2020 pub.1130456254 article jour.1136447 Nature Energy NaN NaN
5 Operando decoding of chemical and thermal even... 1-10 [[{'first_name': 'Jiaqiang', 'last_name': 'Hua... 2020 pub.1130291960 article jour.1136447 Nature Energy NaN NaN
6 Molecularly engineered photocatalyst sheet for... 1-8 [[{'first_name': 'Qian', 'last_name': 'Wang', ... 2020 pub.1130290805 article jour.1136447 Nature Energy NaN NaN
7 Sacrificing nothing to reduce CO2 1-2 [[{'first_name': 'Tuo', 'last_name': 'Wang', '... 2020 pub.1130292228 article jour.1136447 Nature Energy NaN NaN
8 Benefits and costs of a utility-ownership busi... 1-9 [[{'first_name': 'Galen', 'last_name': 'Barbos... 2020 pub.1130148818 article jour.1136447 Nature Energy NaN NaN
9 Realizing high zinc reversibility in rechargea... 1-7 [[{'first_name': 'Lin', 'last_name': 'Ma', 'co... 2020 pub.1130097386 article jour.1136447 Nature Energy NaN NaN
10 Challenges and prospects for negawatt trading ... 1-8 [[{'first_name': 'Wayes', 'last_name': 'Tushar... 2020 pub.1130043666 article jour.1136447 Nature Energy NaN NaN
11 Diagnosing and correcting anode-free cell fail... 1-10 [[{'first_name': 'A. J.', 'last_name': 'Louli'... 2020 pub.1130003672 article jour.1136447 Nature Energy NaN NaN
12 Molecular engineering of dispersed nickel phth... 1-9 [[{'first_name': 'Xiao', 'last_name': 'Zhang',... 2020 pub.1130002178 article jour.1136447 Nature Energy NaN NaN
13 Five thermal energy grand challenges for decar... 1-3 [[{'first_name': 'Asegun', 'last_name': 'Henry... 2020 pub.1130002495 article jour.1136447 Nature Energy NaN NaN
14 Quantification beyond expenditure 1-2 [[{'first_name': 'Harriet', 'last_name': 'Thom... 2020 pub.1129934526 article jour.1136447 Nature Energy NaN NaN
15 Impacts of climate change on energy systems in... 1-9 [[{'first_name': 'Seleshi G.', 'last_name': 'Y... 2020 pub.1129832206 article jour.1136447 Nature Energy NaN NaN
16 Leaving the competition in its wake 555-556 [[{'first_name': 'Ian D.', 'last_name': 'Broad... 2020 pub.1129666495 article jour.1136447 Nature Energy 8 5
17 Understanding and applying coulombic efficienc... 561-568 [[{'first_name': 'Jie', 'last_name': 'Xiao', '... 2020 pub.1128743948 article jour.1136447 Nature Energy 8 5
18 A holistic approach to interface stabilization... 596-604 [[{'first_name': 'Zonghao', 'last_name': 'Liu'... 2020 pub.1129487556 article jour.1136447 Nature Energy 8 5
19 Energy justice towards racial justice 551-551 NaN 2020 pub.1130097517 article jour.1136447 Nature Energy 8 5
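
The dotted column names in the table above (journal.id, journal.title) are what you typically get when nested JSON is flattened into columns. The sketch below shows that effect with pandas' `json_normalize`. The nested shape of the sample records is an assumption, and this is not necessarily how Dimcli builds its dataframes.

```python
import pandas as pd

# Sample records; values come from the table above, the nesting is assumed.
records = [
    {
        "id": "pub.1129666495",
        "title": "Leaving the competition in its wake",
        "year": 2020,
        "journal": {"id": "jour.1136447", "title": "Nature Energy"},
        "issue": "8",
        "volume": "5",
    },
    {
        "id": "pub.1130456254",
        "title": "How to split an exciton",
        "year": 2020,
        "journal": {"id": "jour.1136447", "title": "Nature Energy"},
        # no issue/volume yet: these cells become NaN in the dataframe
    },
]

df = pd.json_normalize(records)
print(sorted(df.columns))
print(df[["title", "journal.title", "volume"]])
```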

Multi-line version: %%dsldf

You can split the query into multiple lines, only this time you need to use the %%dsldf command (two %):

[12]:
%%dsldf
search publications
where year in [2013:2018] and research_orgs="grid.258806.1"
return publications[title+year+times_cited] sort by times_cited
Returned Publications: 20 (total = 3727)
Time: 0.50s
[12]:
year times_cited title
0 2014 504 CH3NH3SnxPb(1-x)I3 Perovskite Solar Cells Cove...
1 2015 503 Asymmetric Supercapacitors Using 3D Nanoporous...
2 2016 323 Hierarchical Gaussian Descriptor for Person Re...
3 2018 322 Brain Intelligence: Go beyond Artificial Intel...
4 2014 318 Improved understanding of the electronic and e...
5 2014 248 Underwater image dehazing using joint trilater...
6 2013 247 Comparative study of ceramic and single crysta...
7 2016 236 Flexible Graphene-Based Supercapacitors: A Review
8 2017 202 Highly Luminescent Phase-Stable CsPbI3 Perovsk...
9 2014 187 Recent Progress of Counter Electrode Catalysts...
10 2018 183 Motor Anomaly Detection for Unmanned Aerial Ve...
11 2018 177 Low illumination underwater light field images...
12 2014 173 Hole-Conductor-Free, Metal-Electrode-Free TiO2...
13 2016 159 Implementation of Super-Twisting Control: Supe...
14 2013 149 Study of rare-earth-doped scintillators
15 2015 142 Low-Temperature and Solution-Processed Amorpho...
16 2015 126 Insight into Perovskite Solar Cells Based on S...
17 2014 125 All-Solid Perovskite Solar Cells with HOCO-R-N...
18 2016 117 Photoelectrochemical CO2 reduction by a p-type...
19 2016 117 Fermi-level-dependent charge-to-spin current c...

Note: the autocompleter is available only with single-line queries.
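
As an illustration of sorting and exporting, the sketch below rebuilds three rows of the table above by hand and applies standard pandas operations. In a live session you would apply the same calls to the dataframe returned by the magic command; the variable names here are arbitrary.

```python
import pandas as pd

# Three rows copied from the output above (one title is truncated there too).
df = pd.DataFrame(
    {
        "year": [2018, 2016, 2013],
        "times_cited": [322, 236, 149],
        "title": [
            "Brain Intelligence: Go beyond Artificial Intel...",
            "Flexible Graphene-Based Supercapacitors: A Review",
            "Study of rare-earth-doped scintillators",
        ],
    }
)

# Sort chronologically, breaking ties by citation count (highest first)
by_year = df.sort_values(["year", "times_cited"], ascending=[True, False])

# With no path argument, to_csv returns the CSV text instead of writing a file
csv_text = by_year.to_csv(index=False)
print(csv_text)
```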

4. Looped dataframe queries: %dslloopdf and %%dslloopdf

These commands behave just like the dataframe magics above, except that they trigger an iterative query that attempts to extract all records available for a given DSL query, up to the maximum limit of 50k.

[13]:
%dslloopdf search publications for "malaria AND Egypt" where year=2015 return publications
Starting iteration with limit=1000 skip=0 ...
0-1000 / 2496 (3.11s)
1000-2000 / 2496 (1.83s)
2000-2496 / 2496 (0.90s)
===
Records extracted: 2496
[13]:
type volume pages author_affiliations id year title issue journal.id journal.title
0 chapter 65 1002-1012 [[{'first_name': '', 'last_name': 'UN', 'corre... pub.1090179052 2015 Population NaN NaN NaN
1 chapter NaN 127-136 NaN pub.1007697326 2015 D NaN NaN NaN
2 chapter 65 902-936 [[{'first_name': '', 'last_name': 'UN', 'corre... pub.1090180227 2015 International trade, finance and transport NaN NaN NaN
3 chapter 65 753-785 [[{'first_name': '', 'last_name': 'UN', 'corre... pub.1090179484 2015 Human rights country situations NaN NaN NaN
4 chapter NaN 1-80 NaN pub.1086527362 2015 Part I. Introduction to Applied Mathematics NaN NaN NaN
... ... ... ... ... ... ... ... ... ... ...
2491 chapter NaN 101-147 [[{'first_name': 'A.A.', 'last_name': 'Gajadha... pub.1028227767 2015 6 Foodborne apicomplexan protozoa Coccidia NaN NaN NaN
2492 book 16 NaN NaN pub.1042689035 2015 Sustainable Agriculture Reviews, Cereals NaN NaN NaN
2493 chapter 88 165-241 [[{'first_name': 'Rafael', 'last_name': 'Toled... pub.1033609880 2015 Chapter Five Strongyloidiasis with Emphasis on... NaN NaN NaN
2494 book 4 NaN NaN pub.1009592139 2015 Urban Vulnerability and Climate Change in Afri... NaN NaN NaN
2495 book 335 NaN NaN pub.1008794359 2015 Proceedings of Fourth International Conference... NaN NaN NaN

2496 rows × 10 columns
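
Conceptually, a looped dataframe query amounts to collecting several pages of records and stacking them into a single table. The sketch below shows one way to do that with pandas. The split into pages is invented for the example (the ids and types are taken from the output above), and this is not a description of Dimcli's internals.

```python
import pandas as pd

# Two made-up "pages" of records; ids and types are copied from the table above.
pages = [
    [
        {"id": "pub.1090179052", "type": "chapter", "year": 2015},
        {"id": "pub.1007697326", "type": "chapter", "year": 2015},
    ],
    [
        {"id": "pub.1042689035", "type": "book", "year": 2015},
    ],
]

# ignore_index=True renumbers rows 0..n-1 across all pages
df = pd.concat([pd.DataFrame(page) for page in pages], ignore_index=True)

print(df)
print(df["type"].value_counts())
```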

5. Getting API schema documentation with %dsldocs

The %dsldocs magic prints out information about the fields and entities available via the Dimensions Search Language. This command returns a tabular version of the online data model specs (in case you are interested, this is possible thanks to the describe DSL command).

For example, if you pass a source name like grants, what you get back is a nice table showing all fields available for that source.

[14]:
%dsldocs grants
[14]:
sources field type description is_filter is_entity is_facet
0 grants abstract string Abstract or summary from a grant proposal. False False False
1 grants active_year integer List of active years for a grant. True False True
2 grants category_bra categories `Broad Research Areas <https://dimensions.fres... True True True
3 grants category_for categories `ANZSRC Fields of Research classification <htt... True True True
4 grants category_hra categories `Health Research Areas <https://dimensions.fre... True True True
5 grants category_hrcs_hc categories `HRCS - Health Categories <https://dimensions.... True True True
6 grants category_hrcs_rac categories `HRCS – Research Activity Codes <https://dimen... True True True
7 grants category_icrp_cso categories `ICRP Common Scientific Outline <https://dimen... True True True
8 grants category_icrp_ct categories `ICRP Cancer Types <https://dimensions.freshde... True True True
9 grants category_rcdc categories `Research, Condition, and Disease Categorizati... True True True
10 grants category_sdg categories SDG - Sustainable Development Goals True True True
11 grants category_uoa categories `Units of Assessment <https://dimensions.fresh... True True True
12 grants concepts string Concepts describing the main topics of a grant... False False False
13 grants date_inserted date Date when the record was inserted into Dimensi... True False False
14 grants dimensions_url string Link pointing to the Dimensions web application False False False
15 grants end_date date Date when the grant ends. True False False
16 grants foa_number string The funding opportunity announcement (FOA) num... True False False
17 grants funder_countries countries The country linked to the organisation funding... True True True
18 grants funders organizations The organisation funding the grant. This is no... True True True
19 grants funding_aud float Funding amount awarded in AUD. True False False
20 grants funding_cad float Funding amount awarded in CAD. True False False
21 grants funding_chf float Funding amount awarded in CHF. True False False
22 grants funding_currency string Original funding currency. True False True
23 grants funding_eur float Funding amount awarded in EUR. True False False
24 grants funding_gbp float Funding amount awarded in GBP. True False False
25 grants funding_jpy float Funding amount awarded in JPY. True False False
26 grants funding_nzd float Funding amount awarded in NZD. True False False
27 grants funding_org_acronym string Acronym for funding organisation. True False True
28 grants funding_org_city string City name for funding organisation. True False True
29 grants funding_org_name string Name of funding organisation. True False True
30 grants funding_usd float Funding amount awarded in USD. True False False
31 grants grant_number string Grant identifier, as provided by the source (e... True False False
32 grants id string Dimensions grant ID. True False False
33 grants investigator_details json Additional details about investigators, includ... True False False
34 grants language string Grant original language, as ISO 639-1 language... True False True
35 grants language_title string ISO 639-1 language code for the original grant... True False True
36 grants linkout string Original URL for the grant. False False False
37 grants original_title string Title of the grant in its original language. False False False
38 grants research_org_cities cities City of the research organisations receiving t... True True True
39 grants research_org_countries countries Country of the research organisations receivin... True True True
40 grants research_org_names string Names of organizations investigators are affil... True False False
41 grants research_org_state_codes states State of the organisations receiving the grant... True True True
42 grants research_orgs organizations GRID organisations receiving the grant (note: ... True True True
43 grants researchers researchers Dimensions researchers IDs associated to the g... True True True
44 grants start_date date Date when the grant starts, in the format 'YYY... True False False
45 grants start_year integer Year when the grant starts. True False True
46 grants title string Title of the grant in English (if the grant la... False False False

Similarly, you can pass the name of an object of type ‘Entity’, e.g. countries:

[15]:
%dsldocs countries
[15]:
entities field type description is_filter is_entity is_facet
0 countries id string GeoNames country code (eg 'US' for `geonames:6... True False False
1 countries name string GeoNames country name. True False False

But don’t worry if you don’t get it right: if you pass a wrong object name, the full list of available sources and entities is printed (in the Dimcli version used here, a KeyError traceback follows the list).

[16]:
%dsldocs unknown
Can't recognize this object. Dimcli knows about:
 Sources=[publications - grants - patents - clinical_trials - policy_documents - researchers - organizations - datasets] Entities=[categories - cities - countries - journals - org_groups - states - publication_links - open_access]
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
<ipython-input-16-e3d3c8c65656> in <module>
----> 1 get_ipython().run_line_magic('dsldocs', 'unknown')

~/Envs/jupyterlab/lib/python3.8/site-packages/IPython/core/interactiveshell.py in run_line_magic(self, magic_name, line, _stack_depth)
   2324                 kwargs['local_ns'] = sys._getframe(stack_depth).f_locals
   2325             with self.builtin_trap:
-> 2326                 result = fn(*args, **kwargs)
   2327             return result
   2328

<decorator-gen-130> in dsldocs(self, line)

~/Envs/jupyterlab/lib/python3.8/site-packages/IPython/core/magic.py in <lambda>(f, *a, **k)
    185     # but it's overkill for just that one bit of state.
    186     def magic_deco(arg):
--> 187         call = lambda f, *a, **k: f(*a, **k)
    188
    189         if callable(arg):

~/Envs/jupyterlab/lib/python3.8/site-packages/dimcli/jupyter/magics.py in dsldocs(self, line)
    144         d = {header: [], 'field': [], 'type': [], 'description':[], 'is_filter':[], 'is_entity': [],  'is_facet':[],}
    145         for S in docs_for:
--> 146             for x in sorted(res.json[header][S]['fields']):
    147                 d[header] += [S]
    148                 d['field'] += [x]

KeyError: 'unknown'
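
If you prefer to avoid the traceback, you can check an object name before asking for its documentation. The sketch below hard-codes the sources and entities printed in the message above; the function `classify` is a hypothetical helper for this example, not part of Dimcli, and the lists may differ in other API versions.

```python
# Names copied from the "Dimcli knows about" message above.
SOURCES = [
    "publications", "grants", "patents", "clinical_trials",
    "policy_documents", "researchers", "organizations", "datasets",
]
ENTITIES = [
    "categories", "cities", "countries", "journals",
    "org_groups", "states", "publication_links", "open_access",
]


def classify(name):
    """Return 'sources', 'entities', or None for an unrecognised name."""
    if name in SOURCES:
        return "sources"
    if name in ENTITIES:
        return "entities"
    return None


for candidate in ["grants", "countries", "unknown"]:
    print(candidate, "->", classify(candidate))
```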

Finally, if no object is requested, the full documentation for all the sources gets returned.

[17]:
%dsldocs
[17]:
sources field type description is_filter is_entity is_facet
0 publications altmetric float Altmetric attention score. True False False
1 publications altmetric_id integer AltMetric Publication ID True False False
2 publications authors json Ordered list of authors names and their affili... True False False
3 publications book_doi string The DOI of the book a chapter belongs to (note... True False False
4 publications book_series_title string The title of the book series book, belong to. False False False
... ... ... ... ... ... ... ...
286 datasets research_org_states states State of the organisations the publication aut... True True True
287 datasets research_orgs organizations GRID organisations linked to the publication a... True True True
288 datasets researchers researchers Dimensions researchers IDs associated to the d... True True True
289 datasets title string Title of the dataset. False False False
290 datasets year integer Year of publication of the dataset. True False True

291 rows × 7 columns
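
Because the documentation comes back as a table, it can be filtered like any other dataframe, for instance to list the fields that can be used as facets. The sketch below assumes the result is a pandas dataframe with the columns shown above and rebuilds a few rows of the grants table by hand, so that it runs without an API connection.

```python
import pandas as pd

# A few rows copied from the %dsldocs grants output above.
docs = pd.DataFrame(
    [
        {"sources": "grants", "field": "abstract", "type": "string",
         "is_filter": False, "is_entity": False, "is_facet": False},
        {"sources": "grants", "field": "active_year", "type": "integer",
         "is_filter": True, "is_entity": False, "is_facet": True},
        {"sources": "grants", "field": "funders", "type": "organizations",
         "is_filter": True, "is_entity": True, "is_facet": True},
        {"sources": "grants", "field": "funding_usd", "type": "float",
         "is_filter": True, "is_entity": False, "is_facet": False},
        {"sources": "grants", "field": "title", "type": "string",
         "is_filter": False, "is_entity": False, "is_facet": False},
    ]
)

# Boolean columns can be used directly as row masks
facet_fields = docs.loc[docs["is_facet"], "field"].tolist()
non_filter_fields = docs.loc[~docs["is_filter"], "field"].tolist()

print("Facets:", facet_fields)
print("Not usable as filters:", non_filter_fields)
```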



Note

The Dimensions Analytics API allows you to carry out sophisticated research data analytics tasks like the ones described on this website. Also check out the associated GitHub repository for examples, the source code of these tutorials and much more.