
The Dimcli Python library: Magic Commands

The purpose of this notebook is to show how to use Dimcli magic commands.

Magic commands, an IPython/Jupyter feature, are essentially shortcuts that let you perform common operations without having to type much code.

For example, Dimcli magic commands can be used to quickly launch queries or to retrieve API documentation.

Magic commands can be very useful when testing things out, e.g. while trying out a new query or checking what data is available in Dimensions on a certain topic.

[1]:
import datetime
print("==\nCHANGELOG\nThis notebook was last run on %s\n==" % datetime.date.today().strftime('%b %d, %Y'))
==
CHANGELOG
This notebook was last run on Jan 24, 2022
==

Prerequisites

This notebook assumes you have installed the Dimcli library and are familiar with the ‘Getting Started’ tutorial.

[1]:
!pip install dimcli --quiet

import dimcli
from dimcli.utils import *
import sys
#

print("==\nLogging in..")
# https://digital-science.github.io/dimcli/getting-started.html#authentication
ENDPOINT = "https://app.dimensions.ai"
if 'google.colab' in sys.modules:
  import getpass
  KEY = getpass.getpass(prompt='API Key: ')
  dimcli.login(key=KEY, endpoint=ENDPOINT)
else:
  KEY = ""
  dimcli.login(key=KEY, endpoint=ENDPOINT)
dsl = dimcli.Dsl()
Searching config file credentials for 'https://app.dimensions.ai' endpoint..
==
Logging in..
Dimcli - Dimensions API Client (v0.9.6)
Connected to: <https://app.dimensions.ai/api/dsl> - DSL v2.0
Method: dsl.ini file

Dimcli ‘magic’ commands

Dimcli includes 5 types of magic commands:

  1. %dsl can be used to run an API query

  2. %dslloop can be used to run an API query with pagination (i.e. iterating over the result pages, up to 50k records)

  3. %dsldf can be used to run an API query and transform the JSON data to a dataframe

  4. %dslloopdf can be used to run a paginated API query and transform the JSON data to a dataframe

  5. %dsldocs can be used to programmatically extract API schema information

Tip: Accessing data returned by magic queries

By default the results of magic command queries are saved into a variable called dsl_last_results:

[4]:
%dsl search publications for "something" return publications limit 1
type(dsl_last_results)
Returned Publications: 1 (total = 6938991)
Time: 1.87s
[4]:
dimcli.core.api.DslDataset

Note: a Dimcli DslDataset object is a wrapper around the raw JSON data, which provides various functionalities (e.g. counting objects, returning dataframes, etc.).

[5]:
print(dsl_last_results.publications[0]['title'])
Which Factor Influences Environmental Care Characters More: Knowledge of Issue or Demographic Factors?
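To make the "wrapper around raw JSON" idea concrete, here is a minimal sketch of such a wrapper. It is purely illustrative and is not Dimcli's DslDataset implementation: the class name ToyDataset is hypothetical, and the payload shape (records under a source key, totals under "_stats") is an assumption based on the outputs shown in this notebook.

```python
class ToyDataset:
    """Illustrative stand-in for a result wrapper; NOT Dimcli's DslDataset."""

    def __init__(self, raw_json):
        self.json = raw_json  # keep the raw payload accessible

    @property
    def stats(self):
        # Assumption: the payload carries totals under a "_stats" key.
        return self.json.get("_stats", {})

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails: expose
        # top-level keys such as "publications" as attributes.
        try:
            return self.__dict__["json"][name]
        except KeyError:
            raise AttributeError(name) from None


# Hand-written payload mirroring the single-record query above.
raw = {
    "publications": [
        {"title": "Which Factor Influences Environmental Care Characters "
                  "More: Knowledge of Issue or Demographic Factors?"}
    ],
    "_stats": {"total_count": 6938991},
}
data = ToyDataset(raw)
print(data.publications[0]["title"])
print(data.stats["total_count"])
```

With the real library, the same access pattern applies to dsl_last_results, as in the cell above.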

1. Simple queries with %dsl or %%dsl

These commands let you run an API query by typing it after %dsl.

Moreover, if you press ‘tab’ after the command, you can take advantage of a custom DSL autocompleter.

These commands are shortcuts for the standard syntax:

dsl = dimcli.Dsl()
dsl.query("...<some dsl query>...")

Single-line version: %dsl

[6]:
%dsl search publications where journal.title="Nature Energy" return publications
Returned Publications: 20 (total = 1359)
Time: 0.84s
[6]:
<dimcli.DslDataset object #4423808816. Records: 20/1359>

Multi-line version: %%dsl

You can split the query into multiple lines, only this time you need to use the %%dsl command (two %):

[7]:
%%dsl
search publications
where year in [2013:2018] and research_orgs="grid.258806.1"
return publications[title]
Returned Publications: 20 (total = 5807)
Time: 1.98s
[7]:
<dimcli.DslDataset object #4561410032. Records: 20/5807>

Note: the autocompleter is available only with single-line queries.

2. Loop queries with %dslloop or %%dslloop

This magic command automatically loops over all the pages of a result set, until all possible records have been returned.

This is a shortcut for the dimcli.Dsl.query_iterative method, which takes care of timing queries appropriately and aggregating results within a single object (see the Dimcli Library: Installation and Querying notebook for more details).
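The looping idea can be sketched in a few lines. This is a simplified illustration of limit/skip pagination, not Dimcli's actual implementation: fetch_all and fake_fetch are hypothetical names, the API is replaced by an in-memory stand-in, and the pause between requests that the real method handles is left out.

```python
def fetch_all(fetch_page, limit=1000, max_records=50000):
    """Collect records page by page until the result set is exhausted.

    fetch_page(limit, skip) is a hypothetical callable returning
    (records, total_count) for one page.
    """
    records, skip = [], 0
    while skip < max_records:
        page, total = fetch_page(limit, skip)
        records.extend(page)
        skip += limit
        if not page or skip >= total:
            break  # nothing left to fetch
    return records[:max_records]


# Stand-in for the API: a made-up result set of 2699 records.
FAKE_TOTAL = 2699


def fake_fetch(limit, skip):
    ids = range(skip, min(skip + limit, FAKE_TOTAL))
    return [{"id": i} for i in ids], FAKE_TOTAL


all_records = fetch_all(fake_fetch)
print(len(all_records))
```

Each call asks for the next window of records by increasing skip, and the loop stops once the total count or the overall cap has been reached.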

Single-line version: %dslloop

[8]:
%dslloop search publications for "malaria AND Egypt" where year=2015 return publications
Starting iteration with limit=1000 skip=0 ...
0-1000 / 2699 (3.15s)
1000-2000 / 2699 (2.91s)
2000-2699 / 2699 (2.84s)
===
Records extracted: 2699
[8]:
<dimcli.DslDataset object #4423455024. Records: 2699/2699>

Multi-line version: %%dslloop

[9]:
%%dslloop
search publications
where year in [2013:2018] and research_orgs="grid.258806.1"
return publications
Starting iteration with limit=1000 skip=0 ...
0-1000 / 5807 (1.73s)
1000-2000 / 5807 (1.61s)
2000-3000 / 5807 (1.58s)
3000-4000 / 5807 (1.55s)
4000-5000 / 5807 (2.25s)
5000-5807 / 5807 (1.47s)
===
Records extracted: 5807
[9]:
<dimcli.DslDataset object #4561410992. Records: 5807/5807>

Like before, the results of a loop query are stored into the dsl_last_results variable.

[10]:
dsl_last_results.stats
[10]:
{'total_count': 5807}

3. Returning dataframes: %dsldf and %%dsldf

These magic commands are similar to the ones above, only they transform the data directly into Pandas dataframe objects.

Dataframes are then easy to sort, analyse, export as CSV and use within visualisation software.

Single-line version: %dsldf

[11]:
%dsldf search publications where journal.id="jour.1136447" return publications
Returned Publications: 20 (total = 1359)
Time: 19.09s
[11]:
authors id pages title type year journal.id journal.title issue volume
0 [{'affiliations': [{'city': 'Zurich', 'city_id... pub.1144714324 1-10 Techno-economic analysis of renewable fuels fo... article 2022 jour.1136447 Nature Energy NaN NaN
1 [{'affiliations': [{'city': 'Stanford', 'city_... pub.1144625544 1-13 Rational solvent molecule tuning for high-perf... article 2022 jour.1136447 Nature Energy NaN NaN
2 [{'affiliations': [{'city': 'Shanghai', 'city_... pub.1144466459 1-9 Toxic potency-adjusted control of air pollutio... article 2022 jour.1136447 Nature Energy NaN NaN
3 [{'affiliations': [{'city': 'State College', '... pub.1144465259 1-7 Integrated hydrological, power system and econ... article 2022 jour.1136447 Nature Energy NaN NaN
4 [{'affiliations': [{'city': 'Taiyuan', 'city_i... pub.1144361028 1-10 Fuel cells with an operational range of –20 °C... article 2022 jour.1136447 Nature Energy NaN NaN
5 [{'affiliations': [{'city': 'Waterloo', 'city_... pub.1144359921 1-11 High areal capacity, long cycle life 4 V ceram... article 2022 jour.1136447 Nature Energy NaN NaN
6 [{'affiliations': [{'city': 'Chapel Hill', 'ci... pub.1144054488 1-9 Evolution of defects during the degradation of... article 2021 jour.1136447 Nature Energy NaN NaN
7 [{'affiliations': [{'city': 'Darmstadt', 'city... pub.1144037103 1-2 Whittling iridium down to size article 2021 jour.1136447 Nature Energy NaN NaN
8 [{'affiliations': [{'city': 'Dalian', 'city_id... pub.1143966904 1154-1163 Ti1–graphene single-atom material for improved... article 2021 jour.1136447 Nature Energy 12 6
9 [{'affiliations': [{'city': 'Kyoto', 'city_id'... pub.1143964787 1176-1187 Overcoming humidity-induced swelling of graphe... article 2021 jour.1136447 Nature Energy 12 6
10 [{'affiliations': [{'city': 'Canberra', 'city_... pub.1143932020 1-12 Energy insecurity during temperature extremes ... article 2021 jour.1136447 Nature Energy NaN NaN
11 [{'affiliations': [{'city': 'Erlangen', 'city_... pub.1143931976 1-9 A bilayer conducting polymer structure for pla... article 2021 jour.1136447 Nature Energy NaN NaN
12 [{'affiliations': [{'city': 'Leeds', 'city_id'... pub.1143836040 1188-1197 Characterizing the energy use of disabled peop... article 2021 jour.1136447 Nature Energy 12 6
13 [{'affiliations': [{'city': 'Ulsan', 'city_id'... pub.1143836012 1164-1175 Subnano-sized silicon anode via crystal growth... article 2021 jour.1136447 Nature Energy 12 6
14 [{'affiliations': [{'city': 'Geneva', 'city_id... pub.1143835741 1-9 Integration of prosumer peer-to-peer trading d... article 2021 jour.1136447 Nature Energy NaN NaN
15 [{'affiliations': [{'city': 'Berlin', 'city_id... pub.1143833707 1-9 An open-access database and analysis tool for ... article 2021 jour.1136447 Nature Energy NaN NaN
16 [{'affiliations': [{'city': 'Davis', 'city_id'... pub.1143833650 1-2 A dataquake for solar cells article 2021 jour.1136447 Nature Energy NaN NaN
17 [{'affiliations': [{'city': 'Dresden', 'city_i... pub.1143833612 1092-1093 Upscaling sub-nano-sized silicon particles article 2021 jour.1136447 Nature Energy 12 6
18 [{'affiliations': [{'city': 'Galway', 'city_id... pub.1143780310 1094-1095 Cooking fuel switch or mix article 2021 jour.1136447 Nature Energy 12 6
19 [{'affiliations': [{'city': 'Messina', 'city_i... pub.1143745033 1096-1097 Fluorine-doping boosts performance article 2021 jour.1136447 Nature Energy 12 6
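The dotted column names in the output above (journal.id, journal.title) suggest that nested JSON objects are flattened into separate columns when the dataframe is built. The following is a minimal standard-library sketch of that kind of flattening, under the assumption that each record holds a nested journal object; the flatten helper is hypothetical and is not how Dimcli or pandas implement it.

```python
def flatten(record, parent_key="", sep="."):
    """Flatten nested dicts into a single level with dotted keys."""
    flat = {}
    for key, value in record.items():
        full_key = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            flat.update(flatten(value, full_key, sep))
        else:
            flat[full_key] = value  # lists (e.g. authors) are kept as-is
    return flat


# Hand-written record using values from the first row of the output above.
record = {
    "id": "pub.1144714324",
    "year": 2022,
    "journal": {"id": "jour.1136447", "title": "Nature Energy"},
}
print(flatten(record))
```

List-valued fields such as authors are not expanded here, which matches the way they appear as raw lists in the authors column.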

Multi-line version: %%dsldf

You can split the query into multiple lines, only this time you need to use the %%dsldf command (two %):

[12]:
%%dsldf
search publications
where year in [2013:2018] and research_orgs="grid.258806.1"
return publications[title+year+times_cited] sort by times_cited
Returned Publications: 20 (total = 5807)
Time: 0.58s
[12]:
    times_cited  title                                              year
0   744          Asymmetric Supercapacitors Using 3D Nanoporous...  2015
1   710          CH3NH3Sn x Pb(1–x)I3 Perovskite Solar Cells Co...  2014
2   564          Brain Intelligence: Go beyond Artificial Intel...  2017
3   503          Highly Luminescent Phase-Stable CsPbI3 Perovsk...  2017
4   457          Improved Understanding of the Electronic and E...  2014
5   456          Long Noncoding RNA NEAT1-Dependent SFPQ Reloca...  2014
6   403          Hierarchical Gaussian Descriptor for Person Re...  2016
7   387          Flexible Graphene-Based Supercapacitors: A Review  2016
8   320          Comparative study of ceramic and single crysta...  2013
9   318          Underwater image dehazing using joint trilater...  2014
10  296          Motor Anomaly Detection for Unmanned Aerial Ve...  2017
11  263          Implementation of Super-Twisting Control: Supe...  2016
12  222          Hole-Conductor-Free, Metal-Electrode-Free TiO2...  2014
13  218          Recent Progress of Counter Electrode Catalysts...  2014
14  216          Colloidal Synthesis of Air-Stable Alloyed CsSn...  2017
15  209          Low illumination underwater light field images...  2018
16  203          Photoelectrochemical CO2 reduction by a p-type...  2016
17  203          Fermi-level-dependent charge-to-spin current c...  2016
18  193          Wound intensity correction and segmentation wi...  2016
19  192          Low-Temperature and Solution-Processed Amorpho...  2015

Note: the autocompleter is available only with single-line queries.

4. Looped dataframe queries: %dslloopdf and %%dslloopdf

These commands behave just like the dataframe magics above, only they trigger an iterative query that attempts to extract all the records available for a chosen DSL query, up to the maximum limit of 50k.

[13]:
%dslloopdf search publications for "malaria AND Egypt" where year=2015 return publications
Starting iteration with limit=1000 skip=0 ...
0-1000 / 2699 (2.18s)
1000-2000 / 2699 (2.01s)
2000-2699 / 2699 (2.58s)
===
Records extracted: 2699
[13]:
authors id pages title type year volume issue journal.id journal.title
0 [{'affiliations': [], 'corresponding': '', 'cu... pub.1142494539 473-520 Literatur chapter 2015 NaN NaN NaN NaN
1 NaN pub.1142492136 NaN Lexikon der Mensch-Tier-Beziehungen book 2015 Band 1 NaN NaN NaN
2 NaN pub.1142474104 NaN Die Erforschung der Kolonien, Expeditionen und... book 2015 Band 75 NaN NaN NaN
3 NaN pub.1142467346 NaN Vom Geist des Bauches, Für eine Philosophie de... book 2015 NaN NaN NaN NaN
4 [{'affiliations': [], 'corresponding': '', 'cu... pub.1142370689 1-19 Introduction chapter 2015 NaN NaN NaN NaN
... ... ... ... ... ... ... ... ... ... ...
2694 [{'affiliations': [{'city': 'Tampa', 'city_id'... pub.1000633746 477-500 Zika Virus chapter 2015 NaN NaN NaN NaN
2695 [{'affiliations': [], 'corresponding': '', 'cu... pub.1000392107 224-274 Chapter 4 Mitigation chapter 2015 NaN NaN NaN NaN
2696 [{'affiliations': [{'city': 'Buea', 'city_id':... pub.1000250057 26580-26595 The chemistry and biological activities of nat... article 2015 5 34 jour.1046724 RSC Advances
2697 [{'affiliations': [{'city': 'New York City', '... pub.1000241832 59-73 Peacekeeping and the Rule of Law: Challenges P... chapter 2015 NaN NaN NaN NaN
2698 NaN pub.1000058849 NaN Handbook of Sustainable Luxury Textiles and Fa... book 2015 NaN NaN NaN NaN

2699 rows × 10 columns
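For use outside a notebook cell, the four query magics can be approximated with regular method calls. The sketch below is one possible arrangement, assuming a logged-in dimcli.Dsl() instance is passed in; run_like_magic is a hypothetical helper name, while query, query_iterative and as_dataframe are the library methods the magics are shortcuts for.

```python
def run_like_magic(dsl, query, loop=False, as_df=False):
    """Approximate %dsl, %dslloop, %dsldf and %dslloopdf with method calls.

    dsl is expected to be a logged-in dimcli.Dsl() instance
    (any object with the same methods works for testing).
    """
    # loop=True mirrors the %dslloop family, loop=False plain %dsl.
    data = dsl.query_iterative(query) if loop else dsl.query(query)
    # as_df=True mirrors the dataframe-returning magics.
    return data.as_dataframe() if as_df else data


# Example call, equivalent in spirit to the %dslloopdf cell above
# (left commented out because it needs API credentials):
# df = run_like_magic(
#     dimcli.Dsl(),
#     'search publications for "malaria AND Egypt" where year=2015 return publications',
#     loop=True,
#     as_df=True,
# )
```

Unlike the magics, this helper does not populate dsl_last_results; the return value has to be assigned explicitly.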

5. Getting API schema documentation with %dsldocs

The %dsldocs magic prints out information about the fields and entities available via the Dimensions Search Language. This command returns a tabular version of the data model specs online (in case you are interested, this is possible thanks to the describe DSL command).

For example, if you pass a source name like grants, what you get back is a nice table showing all fields available for that source.

[14]:
%dsldocs grants
[14]:
sources field type description is_filter is_entity is_facet
0 grants abstract string Abstract or summary from a grant proposal. False False False
1 grants active_year integer List of active years for a grant. True False True
2 grants category_bra categories `Broad Research Areas <https://dimensions.fres... True True True
3 grants category_for categories `ANZSRC Fields of Research classification <htt... True True True
4 grants category_hra categories `Health Research Areas <https://dimensions.fre... True True True
5 grants category_hrcs_hc categories `HRCS - Health Categories <https://dimensions.... True True True
6 grants category_hrcs_rac categories `HRCS – Research Activity Codes <https://dimen... True True True
7 grants category_icrp_cso categories `ICRP Common Scientific Outline <https://dimen... True True True
8 grants category_icrp_ct categories `ICRP Cancer Types <https://dimensions.freshde... True True True
9 grants category_rcdc categories `Research, Condition, and Disease Categorizati... True True True
10 grants category_sdg categories SDG - Sustainable Development Goals True True True
11 grants category_uoa categories `Units of Assessment <https://dimensions.fresh... True True True
12 grants concepts json Concepts describing the main topics of a publi... True False False
13 grants concepts_scores json Relevancy scores for `concepts`. True False False
14 grants date_inserted date Date when the record was inserted into Dimensi... True False False
15 grants dimensions_url string Link pointing to the Dimensions web application False False False
16 grants end_date date Date when the grant ends. True False False
17 grants foa_number string The funding opportunity announcement (FOA) num... True False False
18 grants funder_countries countries The country linked to the organisation funding... True True True
19 grants funders organizations The organisation funding the grant. This is no... True True True
20 grants funding_aud float Funding amount awarded in AUD. True False False
21 grants funding_cad float Funding amount awarded in CAD. True False False
22 grants funding_chf float Funding amount awarded in CHF. True False False
23 grants funding_currency string Original funding currency. True False True
24 grants funding_eur float Funding amount awarded in EUR. True False False
25 grants funding_gbp float Funding amount awarded in GBP. True False False
26 grants funding_jpy float Funding amount awarded in JPY. True False False
27 grants funding_nzd float Funding amount awarded in NZD. True False False
28 grants funding_org_acronym string Acronym for funding organisation. True False True
29 grants funding_org_city string City name for funding organisation. True False True
30 grants funding_org_name string Name of funding organisation. True False True
31 grants funding_usd float Funding amount awarded in USD. True False False
32 grants grant_number string Grant identifier, as provided by the source (e... True False False
33 grants id string Dimensions grant ID. True False False
34 grants investigators json Additional details about investigators, includ... True False False
35 grants language string Grant original language, as ISO 639-1 language... True False True
36 grants language_title string ISO 639-1 language code for the original grant... True False True
37 grants linkout string Original URL for the grant. False False False
38 grants original_title string Title of the grant in its original language. False False False
39 grants research_org_cities cities City of the research organisations receiving t... True True True
40 grants research_org_countries countries Country of the research organisations receivin... True True True
41 grants research_org_names string Names of organizations investigators are affil... True False False
42 grants research_org_state_codes states State of the organisations receiving the grant... True True True
43 grants research_orgs organizations GRID organisations receiving the grant (note: ... True True True
44 grants researchers researchers Dimensions researchers IDs associated to the g... True True True
45 grants start_date date Date when the grant starts, in the format 'YYY... True False False
46 grants start_year integer Year when the grant starts. True False True
47 grants title string Title of the grant in English (if the grant la... False False False

Similarly, for objects of type ‘Entity’, e.g. countries:

[15]:
%dsldocs countries
[15]:
   entities   field  type    description                                        is_filter  is_entity  is_facet
0  countries  id     string  GeoNames country code (eg 'US' for `geonames:6...  True       False      False
1  countries  name   string  GeoNames country name.                             True       False      False
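As a rough picture of how such a table can be derived from schema data, the sketch below turns a schema-shaped dictionary into one row per field. The nesting (header, then object name, then "fields") follows what the outputs in this notebook suggest, but the exact payload returned by the describe command is an assumption here, and schema_rows and sample_schema are hypothetical names.

```python
COLUMNS = ("type", "description", "is_filter", "is_entity", "is_facet")


def schema_rows(schema, header, names):
    """Turn a schema-shaped dict into flat rows, one per field."""
    rows = []
    for name in names:
        fields = schema[header][name]["fields"]
        for field in sorted(fields):
            row = {header: name, "field": field}
            row.update({col: fields[field].get(col) for col in COLUMNS})
            rows.append(row)
    return rows


# Hand-written sample mirroring the countries output above; the first
# description is truncated exactly as displayed there.
sample_schema = {
    "entities": {
        "countries": {
            "fields": {
                "name": {
                    "type": "string",
                    "description": "GeoNames country name.",
                    "is_filter": True, "is_entity": False, "is_facet": False,
                },
                "id": {
                    "type": "string",
                    "description": "GeoNames country code (eg 'US' for `geonames:6...",
                    "is_filter": True, "is_entity": False, "is_facet": False,
                },
            }
        }
    }
}

for row in schema_rows(sample_schema, "entities", ["countries"]):
    print(row)
```

A list of rows like this can be handed directly to a dataframe constructor to get a table similar to the one above.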

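But don’t worry if you don’t get it right: if you pass an unrecognized object name, the full list of available sources and entities is printed. (In the run shown here, with Dimcli v0.9.6, that message is followed by a KeyError traceback.)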

[17]:
%dsldocs unknown
Can't recognize this object. Dimcli knows about:
 Sources=[clinical_trials - datasets - grants - organizations - patents - policy_documents - publications - reports - researchers - source_titles] Entities=[categories - cities - countries - journals - open_access - publication_links - states]
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
/var/folders/zk/bxslv_1d01b983n6l5ky91b80000gn/T/ipykernel_28928/3474323623.py in <module>
----> 1 get_ipython().run_line_magic('dsldocs', 'unknown')

~/Envs/jupyterlab/lib/python3.9/site-packages/IPython/core/interactiveshell.py in run_line_magic(self, magic_name, line, _stack_depth)
   2362                 kwargs['local_ns'] = self.get_local_scope(stack_depth)
   2363             with self.builtin_trap:
-> 2364                 result = fn(*args, **kwargs)
   2365             return result
   2366

~/Envs/jupyterlab/lib/python3.9/site-packages/decorator.py in fun(*args, **kw)
    230             if not kwsyntax:
    231                 args, kw = fix(args, kw, sig)
--> 232             return caller(func, *(extras + args), **kw)
    233     fun.__name__ = func.__name__
    234     fun.__doc__ = func.__doc__

~/Envs/jupyterlab/lib/python3.9/site-packages/IPython/core/magic.py in <lambda>(f, *a, **k)
    185     # but it's overkill for just that one bit of state.
    186     def magic_deco(arg):
--> 187         call = lambda f, *a, **k: f(*a, **k)
    188
    189         if callable(arg):

~/Envs/jupyterlab/lib/python3.9/site-packages/dimcli/jupyter/magics.py in dsldocs(self, line)
    364         d = {header: [], 'field': [], 'type': [], 'description':[], 'is_filter':[], 'is_entity': [],  'is_facet':[],}
    365         for S in docs_for:
--> 366             for x in sorted(res.json[header][S]['fields']):
    367                 d[header] += [S]
    368                 d['field'] += [x]

KeyError: 'unknown'
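To avoid the traceback, an object name can be checked before it is passed to %dsldocs. The sketch below hard-codes the sources and entities listed in the message above, so it reflects this particular run and may not match other API versions; docs_target_kind is a hypothetical helper, not part of Dimcli.

```python
# Names copied from the "Dimcli knows about" message printed above.
KNOWN_SOURCES = [
    "clinical_trials", "datasets", "grants", "organizations", "patents",
    "policy_documents", "publications", "reports", "researchers",
    "source_titles",
]
KNOWN_ENTITIES = [
    "categories", "cities", "countries", "journals", "open_access",
    "publication_links", "states",
]


def docs_target_kind(name):
    """Return 'sources', 'entities', or None for an unrecognized name."""
    if name in KNOWN_SOURCES:
        return "sources"
    if name in KNOWN_ENTITIES:
        return "entities"
    return None


for name in ("grants", "countries", "unknown"):
    print(name, "->", docs_target_kind(name))
```

A None result signals that the name would not be recognized, so the %dsldocs call can be skipped or corrected.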

Finally, if no object is requested, the full documentation for all the sources gets returned.

[18]:
%dsldocs
[18]:
sources field type description is_filter is_entity is_facet
0 clinical_trials abstract string Abstract or description of the clinical trial. False False False
1 clinical_trials acronym string Acronym of the clinical trial. True False False
2 clinical_trials active_years integer List of active years for a clinical trial. True False True
3 clinical_trials altmetric float Altmetric Attention Score. True False False
4 clinical_trials associated_grant_ids string Dimensions IDs of the grants associated to the... True False False
... ... ... ... ... ... ... ...
349 source_titles sjr float SJR indicator (SCImago Journal Rank). This ind... True False False
350 source_titles snip float SNIP indicator (source normalized impact per p... True False False
351 source_titles start_year integer Year when the source started publishing. True False True
352 source_titles title string The title of the source. False False False
353 source_titles type string The source type: one of `book_series`, `procee... True False True

354 rows × 7 columns


Note

The Dimensions Analytics API allows you to carry out sophisticated research data analytics tasks like the ones described on this website. Also check out the associated GitHub repository for examples, the source code of these tutorials and much more.
