
The Dimcli Python library: Magic Commands

The purpose of this notebook is to show how to use Dimcli magic commands.

Python magic commands are essentially shortcuts that let you perform common operations without having to type much code.

For example, Dimcli magic commands can be used to quickly launch queries or to retrieve API documentation.

Magic commands are particularly useful for quick experiments, e.g. trying out a new query or checking what data is available in Dimensions on a certain topic.

[1]:
import datetime
print("==\nCHANGELOG\nThis notebook was last run on %s\n==" % datetime.date.today().strftime('%b %d, %Y'))
==
CHANGELOG
This notebook was last run on Jul 28, 2023
==

Prerequisites

This notebook assumes you have installed the Dimcli library and are familiar with the ‘Getting Started’ tutorial.

[2]:
!pip install dimcli --quiet

import dimcli
from dimcli.utils import *
import sys
#

print("==\nLogging in..")
# https://digital-science.github.io/dimcli/getting-started.html#authentication
ENDPOINT = "https://app.dimensions.ai"
if 'google.colab' in sys.modules:
  import getpass
  KEY = getpass.getpass(prompt='API Key: ')
  dimcli.login(key=KEY, endpoint=ENDPOINT)
else:
  KEY = ""
  dimcli.login(key=KEY, endpoint=ENDPOINT)
dsl = dimcli.Dsl()
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 240.6/240.6 kB 6.7 MB/s eta 0:00:00
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.6/1.6 MB 80.6 MB/s eta 0:00:00
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 51.1/51.1 kB 5.6 MB/s eta 0:00:00
==
Logging in..
API Key: ··········
Dimcli - Dimensions API Client (v1.1)
Connected to: <https://app.dimensions.ai/api/dsl> - DSL v2.7
Method: manual login
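
Outside Colab, the login cell above leaves KEY as an empty string. As a hedged sketch, the key could instead be read from an environment variable. The variable name DIMENSIONS_API_KEY and the helpers resolve_api_key and login_from_env are assumptions made for illustration; they are not part of Dimcli. Only dimcli.login and dimcli.Dsl come from the notebook itself.

```python
import os

ENDPOINT = "https://app.dimensions.ai"


def resolve_api_key(env_var="DIMENSIONS_API_KEY"):
    """Return the API key stored in an environment variable, or "" if unset.

    The variable name is a hypothetical convention, not something Dimcli defines.
    """
    return os.environ.get(env_var, "").strip()


def login_from_env():
    """Log in with the key from the environment (needs dimcli and network access)."""
    import dimcli  # imported lazily so resolve_api_key works without dimcli

    key = resolve_api_key()
    if not key:
        raise RuntimeError("No API key found in the environment")
    dimcli.login(key=key, endpoint=ENDPOINT)
    return dimcli.Dsl()
```

Keeping the key out of the notebook source avoids committing it by accident when the notebook is shared.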

Dimcli ‘magic’ commands

Dimcli includes 5 types of magic commands:

  1. %dsl can be used to run an API query

  2. %dslloop can be used to run an API query with pagination, i.e. iterating over the result pages up to a maximum of 50k records

  3. %dsldf can be used to run an API query and transform the JSON data to a dataframe

  4. %dslloopdf can be used to run a paginated API query and transform the JSON data to a dataframe

  5. %dsldocs can be used to programmatically extract API schema information

Tip: Accessing data returned by magic queries

By default the results of magic command queries are saved into a variable called dsl_last_results:

[3]:
%dsl search publications for "something" return publications limit 1
type(dsl_last_results)
Returned Publications: 1 (total = 7968398)
Time: 0.55s
WARNINGS [1]
Field current_organization_id of the authors field is deprecated and will be removed in the next major release.
[3]:
dimcli.core.api.DslDataset

Note: a Dimcli DslDataset object is a wrapper around the raw JSON data that provides various utilities (e.g. counting objects, returning dataframes).

[4]:
print(dsl_last_results.publications[0]['title'])
Assessing the pragmatic competence of Arab learners of English: The case of apology
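
To make the idea of "a wrapper around the raw JSON data" more concrete, here is a minimal sketch of such a wrapper. MiniDataset is a hypothetical class written for this illustration and is not Dimcli's implementation; the "_stats" key is an assumption about the shape of the raw payload. It only mimics the two behaviours shown in this notebook: attribute access to the records and a stats dictionary.

```python
class MiniDataset:
    """Hypothetical stand-in for a dataset wrapper; not Dimcli's implementation."""

    def __init__(self, raw):
        self.json = raw  # the raw JSON payload, kept as a plain dict

    def __getattr__(self, name):
        # Called only when normal attribute lookup fails: expose top-level
        # JSON keys (e.g. "publications") as attributes.
        raw = self.__dict__.get("json", {})
        if name in raw:
            return raw[name]
        raise AttributeError(name)

    @property
    def stats(self):
        # Assumption: query statistics live under the "_stats" key.
        return self.json.get("_stats", {})

    def __len__(self):
        # Count the records across all list-valued keys.
        return sum(len(v) for v in self.json.values() if isinstance(v, list))


# Values copied from the cell outputs above.
raw = {
    "publications": [
        {"title": "Assessing the pragmatic competence of Arab learners of English: The case of apology"}
    ],
    "_stats": {"total_count": 7968398},
}
data = MiniDataset(raw)
print(data.publications[0]["title"])
print(len(data), data.stats["total_count"])
```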

1. Simple queries with %dsl or %%dsl

These commands let you run an API query by typing it after %dsl.

Moreover, if you press ‘tab’ after the command, you can also take advantage of a custom DSL autocompleter.

These commands are shortcuts for the standard syntax:

dsl = dimcli.Dsl()
dsl.query("...<some dsl query>...")

Single-line version: ``%dsl``

[5]:
%dsl search publications where journal.title="Nature Energy" return publications
Returned Publications: 20 (total = 1705)
Time: 1.31s
WARNINGS [1]
Field current_organization_id of the authors field is deprecated and will be removed in the next major release.
[5]:
<dimcli.DslDataset object #139304202085152. Records: 20/1705>

Multi-line version: ``%%dsl``

You can also split the query over multiple lines. In that case, use the cell magic %%dsl (two % signs) on the first line and write the query on the lines that follow:

[6]:
%%dsl
search publications
where year in [2013:2018] and research_orgs="grid.258806.1"
return publications[title]
Returned Publications: 20 (total = 6369)
Time: 0.28s
[6]:
<dimcli.DslDataset object #139304202086784. Records: 20/6369>

Note: the autocompleter is available only with single-line queries.
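
Since the magics are shortcuts for dsl.query, the same multi-line query can also be assembled as an ordinary Python string, which is handy when parts of it vary. The sketch below rebuilds the query from the multi-line example above. build_org_query and run_query are hypothetical helper names introduced here; run_query needs Dimcli installed and a prior login, so it is defined but not called.

```python
def build_org_query(grid_id, start_year, end_year, fields="title"):
    """Compose a DSL query like the multi-line example above as one string."""
    lines = [
        "search publications",
        f'where year in [{start_year}:{end_year}] and research_orgs="{grid_id}"',
        f"return publications[{fields}]",
    ]
    return "\n".join(lines)


def run_query(query):
    """Run the query with the standard syntax (needs dimcli and a prior login)."""
    import dimcli  # imported lazily; only needed when actually querying

    dsl = dimcli.Dsl()
    return dsl.query(query)


query = build_org_query("grid.258806.1", 2013, 2018)
print(query)
```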

2. Loop queries with %dslloop or %%dslloop

This magic command automatically loops over all the pages of a results set, until all possible records have been returned.

This is a shortcut for the dimcli.Dsl.query_iterative method, which takes care of timing queries appropriately and aggregating the results within a single object (see the Dimcli Library: Installation and Querying notebook for more details).

Single-line version: ``%dslloop``

[7]:
%dslloop search publications for "malaria AND Egypt" where year=2015 return publications
Starting iteration with limit=1000 skip=0 ...
0-1000 / 2852 (1.48s)
1000-2000 / 2852 (1.16s)
2000-2852 / 2852 (1.06s)
===
Records extracted: 2852
Warnings:  3
[7]:
<dimcli.DslDataset object #139304216831904. Records: 2852/2852>

Multi-line version: ``%%dslloop``

[8]:
%%dslloop
search publications
where year in [2013:2018] and research_orgs="grid.258806.1"
return publications
Starting iteration with limit=1000 skip=0 ...
0-1000 / 6369 (1.75s)
1000-2000 / 6369 (1.14s)
2000-3000 / 6369 (1.19s)
3000-4000 / 6369 (1.03s)
4000-5000 / 6369 (2.06s)
5000-6000 / 6369 (1.20s)
6000-6369 / 6369 (3.98s)
===
Records extracted: 6369
Warnings:  7
[8]:
<dimcli.DslDataset object #139304216833344. Records: 6369/6369>

Like before, the results of a loop query are stored into the dsl_last_results variable.

[9]:
dsl_last_results.stats
[9]:
{'total_count': 6369}
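
The progress lines printed by the loop magics (limit=1000, skip=0, then 0-1000, 1000-2000, ...) suggest a limit/skip pagination pattern. The sketch below illustrates that pattern against a fake in-memory backend; it is not Dimcli's code. fetch_all, fetch_page and fake_fetch are hypothetical names, and the pauses between requests that the real method takes care of are left out.

```python
def fetch_all(fetch_page, limit=1000, max_records=50000):
    """Collect records page by page until the result set (or the cap) is exhausted.

    fetch_page(limit, skip) is a hypothetical callable returning (records, total).
    """
    records, skip = [], 0
    while skip < max_records:
        page, total = fetch_page(limit, skip)
        records.extend(page)
        print(f"{skip}-{skip + len(page)} / {total}")
        skip += limit
        if not page or skip >= total:
            break
    return records[:max_records]


# Fake backend standing in for the API: 2852 records, as in the first loop example.
FAKE_RESULTS = [{"id": i} for i in range(2852)]


def fake_fetch(limit, skip):
    return FAKE_RESULTS[skip:skip + limit], len(FAKE_RESULTS)


all_records = fetch_all(fake_fetch)
print("Records extracted:", len(all_records))
```

With the fake backend, three pages are requested and 2852 records are collected, mirroring the shape of the output shown for the "malaria AND Egypt" query.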

3. Returning dataframes: %dsldf and %%dsldf

These magic commands are similar to the ones above, only they transform the data directly into Pandas dataframe objects.

Dataframes are then easy to sort, analyse, export as CSV and use within visualisation software.

Single-line version: ``%dsldf``

[10]:
%dsldf search publications where journal.id="jour.1136447" return publications
Returned Publications: 20 (total = 1705)
Time: 0.26s
WARNINGS [1]
Field current_organization_id of the authors field is deprecated and will be removed in the next major release.
[10]:
id title authors pages type year journal.id journal.title issue volume
0 pub.1162679092 Climate change impacts on planned supply–deman... [{'affiliations': [{'city': 'Beijing', 'city_i... 1-11 article 2023 jour.1136447 Nature Energy NaN NaN
1 pub.1162678127 Unequal residential heating burden caused by c... [{'affiliations': [{'city': 'Beijing', 'city_i... 1-10 article 2023 jour.1136447 Nature Energy NaN NaN
2 pub.1161701679 Sodium-ion batteries: capturing and reducing d... [{'affiliations': [{'city': 'Daejeon', 'city_i... 1-2 article 2023 jour.1136447 Nature Energy NaN NaN
3 pub.1160834246 Insights into advanced models for energy pover... [{'affiliations': [{'city': 'Ljubljana', 'city... 1-3 article 2023 jour.1136447 Nature Energy NaN NaN
4 pub.1160826991 Using narratives to infer preferences in under... [{'affiliations': [{'city': 'Zurich', 'city_id... 1-13 article 2023 jour.1136447 Nature Energy NaN NaN
5 pub.1160822084 Contextualizing coal communities for Australia... [{'affiliations': [{'city': 'Canberra', 'city_... 1-3 article 2023 jour.1136447 Nature Energy NaN NaN
6 pub.1160811717 Connecting women in the hydrogen world [{'affiliations': [{'city': 'Berlin', 'city_id... 1-1 article 2023 jour.1136447 Nature Energy NaN NaN
7 pub.1160805012 Silicon solar cells step up [{'affiliations': [{'city': 'Sydney', 'city_id... 1-2 article 2023 jour.1136447 Nature Energy NaN NaN
8 pub.1160649649 Identifying the intrinsic anti-site defect in ... [{'affiliations': [{'city': 'Beijing', 'city_i... 1-9 article 2023 jour.1136447 Nature Energy NaN NaN
9 pub.1160568081 Engineering relaxors by entropy for high energ... [{'affiliations': [{'city': 'Beijing', 'city_i... 1-9 article 2023 jour.1136447 Nature Energy NaN NaN
10 pub.1160403680 2D/3D heterojunction engineering at the buried... [{'affiliations': [{'city': 'Chongqing', 'city... 1-10 article 2023 jour.1136447 Nature Energy NaN NaN
11 pub.1160394361 Diversifying the solvent [{'affiliations': [{'city': 'Singapore', 'city... 1-2 article 2023 jour.1136447 Nature Energy NaN NaN
12 pub.1160392655 Understanding hydrogen electrocatalysis by pro... [{'affiliations': [{'city': 'Boston', 'city_id... 1-11 article 2023 jour.1136447 Nature Energy NaN NaN
13 pub.1160386348 High-entropy electrolytes for practical lithiu... [{'affiliations': [{'city': 'Stanford', 'city_... 1-13 article 2023 jour.1136447 Nature Energy NaN NaN
14 pub.1160383454 Increasing the reach of low-income energy prog... [{'affiliations': [{'city': 'Chicago', 'city_i... 1-9 article 2023 jour.1136447 Nature Energy NaN NaN
15 pub.1160378064 A Li-rich layered oxide cathode with negligibl... [{'affiliations': [{'city': 'Hong Kong', 'city... 1-10 article 2023 jour.1136447 Nature Energy NaN NaN
16 pub.1160324074 Reduction of bulk and surface defects in inver... [{'affiliations': [{'city': 'Wuhan', 'city_id'... 1-11 article 2023 jour.1136447 Nature Energy NaN NaN
17 pub.1160323839 Addendum to: Understanding environmental trade... [{'affiliations': [{'city': 'Freiburg', 'city_... 1-2 article 2023 jour.1136447 Nature Energy NaN NaN
18 pub.1162678453 Devices for Li-mediated synthesis [{'affiliations': [{'city': None, 'city_id': N... 641-641 article 2023 jour.1136447 Nature Energy 7 8
19 pub.1162678348 Granularity and green recovery [{'affiliations': [{'city': None, 'city_id': N... 642-642 article 2023 jour.1136447 Nature Energy 7 8

Multi-line version: ``%%dsldf``

As before, you can split the query over multiple lines by using the cell magic %%dsldf (two % signs) on the first line:

[11]:
%%dsldf
search publications
where year in [2013:2018] and research_orgs="grid.258806.1"
return publications[title+year+times_cited] sort by times_cited
Returned Publications: 20 (total = 6369)
Time: 0.26s
[11]:
title times_cited year
0 Asymmetric Supercapacitors Using 3D Nanoporous... 849 2015
1 Brain Intelligence: Go beyond Artificial Intel... 819 2017
2 CH3NH3Sn x Pb(1–x)I3 Perovskite Solar Cells Co... 818 2014
3 Highly Luminescent Phase-Stable CsPbI3 Perovsk... 697 2017
4 Improved Understanding of the Electronic and E... 558 2014
5 Long Noncoding RNA NEAT1-Dependent SFPQ Reloca... 533 2014
6 Pt‐Free Counter Electrode for Dye‐Sensitized S... 506 2014
7 Flexible Graphene-Based Supercapacitors: A Review 481 2016
8 Hierarchical Gaussian Descriptor for Person Re... 468 2016
9 Comparative study of ceramic and single crysta... 396 2013
10 Motor Anomaly Detection for Unmanned Aerial Ve... 392 2017
11 Implementation of Super-Twisting Control: Supe... 358 2016
12 Underwater image dehazing using joint trilater... 354 2014
13 Development of X-ray-induced afterglow charact... 314 2014
14 Fermi-level-dependent charge-to-spin current c... 293 2016
15 Colloidal Synthesis of Air-Stable Alloyed CsSn... 290 2017
16 Thermal diodes, regulators, and switches: Phys... 287 2017
17 Photoelectrochemical CO2 reduction by a p-type... 278 2016
18 Hole-Conductor-Free, Metal-Electrode-Free TiO2... 252 2014
19 Low illumination underwater light field images... 235 2018

Note: the autocompleter is available only with single-line queries.
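
The single-line dataframe above has columns such as journal.id and journal.title, which suggests that nested JSON objects are flattened into dotted column names while lists (like authors) are kept as they are. The plain-Python sketch below shows one way such flattening can work. The flatten helper and the assumed shape of the raw record are illustrative only; this is not a description of how Dimcli builds its dataframes.

```python
def flatten(record, prefix=""):
    """Flatten nested dicts into a single dict with dotted keys.

    Lists (such as an authors field) are left untouched.
    """
    flat = {}
    for key, value in record.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, prefix=f"{name}."))
        else:
            flat[name] = value
    return flat


# Assumed shape of one raw record; the values are taken from the table above.
record = {
    "id": "pub.1162678453",
    "title": "Devices for Li-mediated synthesis",
    "year": 2023,
    "journal": {"id": "jour.1136447", "title": "Nature Energy"},
}
row = flatten(record)
print(sorted(row))
```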

4. Looped dataframe queries: %dslloopdf and %%dslloopdf

These commands behave just like the dataframe magics above, except that they trigger an iterative query that attempts to extract all the records available for the chosen DSL query, up to the maximum limit of 50k.

[12]:
%dslloopdf search publications for "malaria AND Egypt" where year=2015 return publications
Starting iteration with limit=1000 skip=0 ...
0-1000 / 2852 (1.68s)
1000-2000 / 2852 (1.21s)
2000-2852 / 2852 (3.93s)
===
Records extracted: 2852
Warnings:  3
[12]:
id title type year authors pages volume issue journal.id journal.title
0 pub.1154679158 The Sound of the Sundial book 2015 NaN NaN NaN NaN NaN NaN
1 pub.1154675682 VI The Witch Savitri chapter 2015 [{'affiliations': [], 'corresponding': '', 'cu... 93-126 NaN NaN NaN NaN
2 pub.1142494539 Literatur chapter 2015 [{'affiliations': [], 'corresponding': '', 'cu... 473-520 NaN NaN NaN NaN
3 pub.1142492136 Lexikon der Mensch-Tier-Beziehungen book 2015 NaN NaN Band 1 NaN NaN NaN
4 pub.1142474104 Die Erforschung der Kolonien, Expeditionen und... book 2015 NaN NaN Band 75 NaN NaN NaN
... ... ... ... ... ... ... ... ... ... ...
2847 pub.1000633746 Zika Virus chapter 2015 [{'affiliations': [{'city': 'Tampa', 'city_id'... 477-500 NaN NaN NaN NaN
2848 pub.1000392107 Chapter 4 Mitigation chapter 2015 [{'affiliations': [], 'corresponding': '', 'cu... 224-274 NaN NaN NaN NaN
2849 pub.1000250057 The chemistry and biological activities of nat... article 2015 [{'affiliations': [{'city': 'Buea', 'city_id':... 26580-26595 5 34 jour.1046724 RSC Advances
2850 pub.1000241832 Peacekeeping and the Rule of Law: Challenges P... chapter 2015 [{'affiliations': [{'city': 'New York City', '... 59-73 NaN NaN NaN NaN
2851 pub.1000058849 Handbook of Sustainable Luxury Textiles and Fa... book 2015 NaN NaN NaN NaN NaN NaN

2852 rows × 10 columns

5. Getting API schema documentation with %dsldocs

The %dsldocs magic returns information about the sources, entities and fields available via the Dimensions Search Language (DSL). The result is a tabular version of the data model documentation published online; behind the scenes, it relies on the DSL describe command.

For example, if you pass a source name like grants, what you get back is a nice table showing all fields available for that source.

[13]:
%dsldocs grants
[13]:
sources field type description is_filter is_entity is_facet
0 grants abstract string Abstract or summary from a grant proposal. False False False
1 grants active_year integer List of active years for a grant. True False True
2 grants category_bra categories `Broad Research Areas <https://dimensions.fres... True True True
3 grants category_for categories ANZSRC Fields of Research classification (alia... True True True
4 grants category_for_2008 categories `ANZSRC Fields of Research classification <htt... True True True
5 grants category_for_2020 categories `ANZSRC Fields of Research classification <htt... True True True
6 grants category_hra categories `Health Research Areas <https://dimensions.fre... True True True
7 grants category_hrcs_hc categories `HRCS - Health Categories <https://dimensions.... True True True
8 grants category_hrcs_rac categories `HRCS – Research Activity Codes <https://dimen... True True True
9 grants category_icrp_cso categories `ICRP Common Scientific Outline <https://dimen... True True True
10 grants category_icrp_ct categories `ICRP Cancer Types <https://dimensions.freshde... True True True
11 grants category_rcdc categories `Research, Condition, and Disease Categorizati... True True True
12 grants category_sdg categories SDG - Sustainable Development Goals True True True
13 grants category_uoa categories `Units of Assessment <https://dimensions.fresh... True True True
14 grants concepts json Concepts describing the main topics of a publi... True False False
15 grants concepts_scores json Relevancy scores for `concepts`. True False False
16 grants date_inserted date Date when the record was inserted into Dimensi... True False False
17 grants dimensions_url string Link pointing to the Dimensions web application False False False
18 grants end_date date Date when the grant ends. True False False
19 grants foa_number string The funding opportunity announcement (FOA) num... True False False
20 grants funder_org_acronym string None True False True
21 grants funder_org_cities cities City name for funding organisation. True True True
22 grants funder_org_countries countries The country linked to the organisation funding... True True True
23 grants funder_org_name string Name of funding organisation. True False True
24 grants funder_org_states states State name for funding organisation. True True True
25 grants funder_orgs organizations The organisation funding the grant. This is no... True True True
26 grants funding_aud float Funding amount awarded in AUD. True False False
27 grants funding_cad float Funding amount awarded in CAD. True False False
28 grants funding_chf float Funding amount awarded in CHF. True False False
29 grants funding_cny float Funding amount awarded in CNY. True False False
30 grants funding_currency string Original funding currency. True False True
31 grants funding_eur float Funding amount awarded in EUR. True False False
32 grants funding_gbp float Funding amount awarded in GBP. True False False
33 grants funding_jpy float Funding amount awarded in JPY. True False False
34 grants funding_nzd float Funding amount awarded in NZD. True False False
35 grants funding_schemes string Information that the data sources provide rega... True False False
36 grants funding_usd float Funding amount awarded in USD. True False False
37 grants id string Dimensions grant ID. True False False
38 grants investigators json Additional details about investigators, includ... True False False
39 grants keywords string Keywords provided by the original data source. True False True
40 grants language string Grant original language, as ISO 639-1 language... True False True
41 grants language_title string ISO 639-1 language code for the original grant... True False True
42 grants linkout string Original URL for the grant. False False False
43 grants original_title string Title of the grant in its original language. False False False
44 grants project_numbers json Grant identifiers, as provided by the source (... True False False
45 grants research_org_cities cities City of the research organisations receiving t... True True True
46 grants research_org_countries countries Country of the research organisations receivin... True True True
47 grants research_org_names string Names of organizations investigators are affil... True False False
48 grants research_org_state_codes states State of the organisations receiving the grant... True True True
49 grants research_orgs organizations GRID organisations receiving the grant (note: ... True True True
50 grants researchers researchers Dimensions researchers IDs associated to the g... True True True
51 grants score float For full-text queries, the relevance score is ... True False False
52 grants start_date date Date when the grant starts, in the format 'YYY... True False False
53 grants start_year integer Year when the grant starts. True False True
54 grants title string Title of the grant in English (if the grant la... False False False

The same works for objects of type ‘Entity’, e.g. countries:

[14]:
%dsldocs countries
[14]:
entities field type description is_filter is_entity is_facet
0 countries id string GeoNames country code (eg 'US' for `geonames:6... True False False
1 countries name string GeoNames country name. True False False

Don’t worry if you get the name wrong: if you pass an unrecognised object name, the full list of available sources and entities is printed. Note that in the run shown here (Dimcli v1.1), the list is followed by a KeyError traceback.

[15]:
%dsldocs unknown
Can't recognize this object. Dimcli knows about:
 Sources=[clinical_trials - datasets - funder_groups - grants - organizations - patents - policy_documents - publications - reports - research_org_groups - researchers - source_titles] Entities=[categories - cities - countries - journals - open_access - publication_links - repositories - states]
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
<ipython-input-15-e3d3c8c65656> in <cell line: 1>()
----> 1 get_ipython().run_line_magic('dsldocs', 'unknown')

/usr/local/lib/python3.10/dist-packages/IPython/core/interactiveshell.py in run_line_magic(self, magic_name, line, _stack_depth)
   2416                 kwargs['local_ns'] = self.get_local_scope(stack_depth)
   2417             with self.builtin_trap:
-> 2418                 result = fn(*args, **kwargs)
   2419             return result
   2420

<decorator-gen-126> in dsldocs(self, line, cell)

/usr/local/lib/python3.10/dist-packages/IPython/core/magic.py in <lambda>(f, *a, **k)
    185     # but it's overkill for just that one bit of state.
    186     def magic_deco(arg):
--> 187         call = lambda f, *a, **k: f(*a, **k)
    188
    189         if callable(arg):

/usr/local/lib/python3.10/dist-packages/dimcli/jupyter/magics.py in dsldocs(self, line, cell)
    449         d = {header: [], 'field': [], 'type': [], 'description':[], 'is_filter':[], 'is_entity': [],  'is_facet':[],}
    450         for S in docs_for:
--> 451             for x in sorted(res.json[header][S]['fields']):
    452                 d[header] += [S]
    453                 d['field'] += [x]

KeyError: 'unknown'
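
One way to avoid the traceback above is to check a name against the known sources and entities before passing it to %dsldocs. The sketch below hardcodes the names printed in the message above, so it reflects only the API version used in this run. object_kind is a hypothetical helper, not a Dimcli function.

```python
# Names copied from the message printed by `%dsldocs unknown` above.
SOURCES = {
    "clinical_trials", "datasets", "funder_groups", "grants", "organizations",
    "patents", "policy_documents", "publications", "reports",
    "research_org_groups", "researchers", "source_titles",
}
ENTITIES = {
    "categories", "cities", "countries", "journals", "open_access",
    "publication_links", "repositories", "states",
}


def object_kind(name):
    """Return "source", "entity", or None for an unrecognised name."""
    if name in SOURCES:
        return "source"
    if name in ENTITIES:
        return "entity"
    return None


for name in ("grants", "countries", "unknown"):
    print(name, "->", object_kind(name))
```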

Finally, if no object is requested, the full documentation for all the sources gets returned.

[16]:
%dsldocs
[16]:
sources field type description is_filter is_entity is_facet
0 clinical_trials abstract string Abstract or description of the clinical trial. False False False
1 clinical_trials acronym string Acronym of the clinical trial. True False False
2 clinical_trials active_years integer List of active years for a clinical trial. True False True
3 clinical_trials altmetric float Altmetric Attention Score. True False False
4 clinical_trials associated_grant_ids string Dimensions IDs of the grants associated to the... True False False
... ... ... ... ... ... ... ...
400 source_titles sjr float SJR indicator (SCImago Journal Rank). This ind... True False False
401 source_titles snip float SNIP indicator (source normalized impact per p... True False False
402 source_titles start_year integer Year when the source started publishing. True False True
403 source_titles title string The title of the source. False False False
404 source_titles type string The source type: one of `book_series`, `procee... True False True

405 rows × 7 columns
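
The boolean columns in these schema tables (is_filter, is_entity, is_facet) make it easy to answer questions such as "which grants fields can be used as facets?". The sketch below does this in plain Python on a handful of rows copied from the `%dsldocs grants` output above; fields_where is a hypothetical helper. With the actual dataframe, the same selection could be done with ordinary pandas boolean indexing.

```python
# A few rows copied from the `%dsldocs grants` output above.
SCHEMA_ROWS = [
    {"field": "abstract", "type": "string", "is_filter": False, "is_facet": False},
    {"field": "active_year", "type": "integer", "is_filter": True, "is_facet": True},
    {"field": "funding_usd", "type": "float", "is_filter": True, "is_facet": False},
    {"field": "start_year", "type": "integer", "is_filter": True, "is_facet": True},
    {"field": "title", "type": "string", "is_filter": False, "is_facet": False},
]


def fields_where(rows, flag):
    """Return the names of the fields whose boolean flag column is True."""
    return [row["field"] for row in rows if row[flag]]


print(fields_where(SCHEMA_ROWS, "is_facet"))
print(fields_where(SCHEMA_ROWS, "is_filter"))
```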



Note

The Dimensions Analytics API lets you carry out sophisticated research data analytics tasks like the ones described on this website. Also check out the associated GitHub repository for examples, the source code of these tutorials and much more.
