
The Dimcli Python library: Magic Commands

The purpose of this notebook is to show how to use Dimcli magic commands.

Magic commands are an IPython/Jupyter feature: they are essentially shortcuts that let you perform common operations without having to type much code.

For example, Dimcli magic commands can be used to quickly launch queries or to retrieve API documentation.

Magic commands can be very useful when testing things out e.g. while trying out a new query, or checking what data is available in Dimensions on a certain topic.

Prerequisites

This notebook assumes you have installed the Dimcli library and are familiar with the Getting Started tutorial.

[5]:
!pip install dimcli -U --quiet
[6]:
username = ""
password = ""
endpoint = "https://app.dimensions.ai"

# import all libraries and login
import dimcli
dimcli.login(username, password, endpoint)
dsl = dimcli.Dsl()
Dimcli - Dimensions API Client (v0.6.9)
Connected to endpoint: https://app.dimensions.ai - DSL version: 1.24
Method: dsl.ini file

Dimcli ‘magic’ commands

Dimcli includes 5 types of magic commands:

  1. %dsl can be used to run an API query

  2. %dslloop can be used to run an API query with pagination (iterating over result pages, up to 50k records)

  3. %dsldf can be used to run an API query and transform the JSON data to a dataframe

  4. %dslloopdf can be used to run a paginated API query and transform the JSON data to a dataframe

  5. %dsldocs can be used to programmatically extract API schema information

Tip: Accessing data returned by magic queries

By default the results of magic command queries are saved into a variable called dsl_last_results:

[7]:
%dsl search publications for "something" return publications limit 1
type(dsl_last_results)
Returned Publications: 1 (total = 5908299)
[7]:
dimcli.core.api.DslDataset

Note: a Dimcli DslDataset object is a wrapper around the raw JSON data, which provides various functionalities (e.g. counting objects, returning dataframes, etc.).

[8]:
print(dsl_last_results.publications[0]['title'])
Introduction: Murra, Materialism, Anthropology, and the Andes
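
To make the idea of "a wrapper around the raw JSON data" more concrete, here is a minimal sketch. The `ResultWrapper` class and the sample payload are hypothetical illustrations, not Dimcli's actual DslDataset implementation; the `_stats` key is assumed to hold the totals, in line with the `total_count` statistic shown elsewhere in this notebook.

```python
class ResultWrapper:
    """Hypothetical stand-in for a dataset wrapper; NOT Dimcli's DslDataset."""

    def __init__(self, raw_json):
        self.json = raw_json

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails: expose top-level
        # JSON keys (e.g. "publications") as attributes.
        try:
            return self.__dict__["json"][name]
        except KeyError:
            raise AttributeError(name)

    @property
    def stats(self):
        # Assumed location of the query statistics in the raw payload.
        return self.json.get("_stats", {})

    def count(self):
        # Number of records held in list-valued keys of the payload.
        return sum(len(v) for v in self.json.values() if isinstance(v, list))


# Made-up payload shaped like a single-record query result.
sample = {
    "_stats": {"total_count": 5908299},
    "publications": [{"id": "pub.0000000001", "title": "An example title"}],
}

results = ResultWrapper(sample)
first_title = results.publications[0]["title"]
print(first_title, results.count(), results.stats["total_count"])
```

The point of such a wrapper is convenience: the raw JSON stays available, while attribute access and helper methods save some typing during exploration.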

1. Simple queries with %dsl or %%dsl

These commands let you run an API query by typing it after %dsl.

Moreover, if you press ‘tab’ after the command, you can also take advantage of a custom DSL autocompleter.

These commands are shortcuts for the standard syntax:

dsl = dimcli.Dsl()
dsl.query("...<some dsl query>...")

Single-line version: %dsl

[9]:
%dsl search publications where journal.title="Nature Energy" return publications
Returned Publications: 20 (total = 1022)
WARNINGS [1]
Please review your query, as it contains an entity filter (journal.title) that can lead to incomplete results. More details on https://docs.dimensions.ai/dsl/language.html#literal-fields-vs-entity-fields
[9]:
<dimcli.DslDataset object #4656067664. Records: 20/1022>

Multi-line version: %%dsl

To split the query across multiple lines, use the %%dsl command instead (two % signs):

[10]:
%%dsl
search publications
where year in [2013:2018] and research_orgs="grid.258806.1"
return publications[title]
Returned Publications: 20 (total = 3769)
[10]:
<dimcli.DslDataset object #4655822096. Records: 20/3769>

Note: the autocompleter is available only with single-line queries.

2. Loop queries with %dslloop or %%dslloop

This magic command automatically loops over all the pages of a result set, until all possible records have been returned.

This is a shortcut for the dimcli.Dsl.query_iterative method, which takes care of timing queries appropriately and aggregating results within a single object (see the Dimcli Library: Installation and Querying notebook for more details).
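
The looping logic can be sketched in a few lines. This is an illustration of the general pagination technique (`limit` / `skip` until everything is collected), not Dimcli's implementation: `fetch_all`, `fake_run_query` and the shape of the returned dict are all hypothetical, and a real client would also pause between requests to respect API rate limits.

```python
import re


def fetch_all(run_query, base_query, page_size=1000, max_records=50000):
    """Collect records page by page until none are left or the cap is hit.

    `run_query` is a hypothetical callable that takes a query string and
    returns a dict with "records" (a list) and "total" (an int).
    """
    records = []
    skip = 0
    while skip < max_records:
        page = run_query(f"{base_query} limit {page_size} skip {skip}")
        records.extend(page["records"])
        skip += page_size
        # Stop on an empty page or once every available record is collected.
        if not page["records"] or len(records) >= page["total"]:
            break
    return records[:max_records]


FAKE_TOTAL = 2463  # pretend the query matches this many records


def fake_run_query(query):
    """Stand-in for an API call: serves slices of a made-up result set."""
    limit, skip = map(int, re.search(r"limit (\d+) skip (\d+)$", query).groups())
    ids = range(skip, min(skip + limit, FAKE_TOTAL))
    return {"records": [{"id": i} for i in ids], "total": FAKE_TOTAL}


all_records = fetch_all(fake_run_query, "search publications return publications")
print(len(all_records))
```

With a page size of 1000 and 2463 matching records, the loop issues three queries and returns all the records in a single list.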

Single-line version: %dslloop

[11]:
%dslloop search publications for "malaria AND Egypt" where year=2015 return publications
1000 / ...
1000 / 2463
2000 / 2463
2463 / 2463
===
Records extracted: 2463
[11]:
<dimcli.DslDataset object #4655636240. Records: 2463/2463>

Multi-line version: %%dslloop

[12]:
%%dslloop
search publications
where year in [2013:2018] and research_orgs="grid.258806.1"
return publications
1000 / ...
1000 / 3769
2000 / 3769
3000 / 3769
3769 / 3769
===
Records extracted: 3769
[12]:
<dimcli.DslDataset object #4655682576. Records: 3769/3769>

As before, the results of a loop query are stored in the dsl_last_results variable.

[13]:
dsl_last_results.stats
[13]:
{'total_count': 3769}

3. Returning dataframes: %dsldf and %%dsldf

These magic commands are similar to the ones above, except that they transform the data directly into Pandas dataframe objects.

Dataframes are then easy to sort, analyse, export as CSV and use within visualisation software.
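
As a rough illustration of why dataframes are convenient, the sketch below builds one from a few made-up records (not real Dimensions data) and then sorts, aggregates and exports it. It uses plain pandas only; `pd.json_normalize` is one way to flatten nested JSON into dotted column names such as journal.title, and it is not necessarily what Dimcli does internally.

```python
import io

import pandas as pd

# Made-up records, shaped loosely like publication JSON with a nested journal.
records = [
    {"title": "Paper A", "year": 2014, "times_cited": 12,
     "journal": {"id": "jour.1", "title": "Journal X"}},
    {"title": "Paper B", "year": 2015, "times_cited": 40,
     "journal": {"id": "jour.2", "title": "Journal Y"}},
    {"title": "Paper C", "year": 2014, "times_cited": 7,
     "journal": {"id": "jour.1", "title": "Journal X"}},
]

# Nested dicts become dotted columns: journal.id, journal.title.
df = pd.json_normalize(records)

# Sort by citations, most cited first.
top = df.sort_values("times_cited", ascending=False).reset_index(drop=True)

# Simple analysis: total citations per year.
per_year = df.groupby("year")["times_cited"].sum()

# Export as CSV (to an in-memory buffer here; a file path works the same way).
buffer = io.StringIO()
top.to_csv(buffer, index=False)
csv_text = buffer.getvalue()

print(top[["title", "times_cited"]])
print(per_year)
```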

Single-line version: %dsldf

[14]:
%dsldf search publications where journal.id="jour.1136447" return publications
Returned Publications: 20 (total = 1022)
[14]:
author_affiliations year type title id pages journal.id journal.title issue volume
0 [[{'first_name': 'Fumiaki', 'last_name': 'Aman... 2020 article In the loop pub.1127689083 1-2 jour.1136447 Nature Energy NaN NaN
1 [[{'first_name': 'Xiang', 'last_name': 'Yu', '... 2020 article Stoichiometric methane conversion to ethane us... pub.1127686607 1-9 jour.1136447 Nature Energy NaN NaN
2 [[{'first_name': 'Swapna', 'last_name': 'Ganap... 2020 article Fast interfaces pub.1127689305 1-2 jour.1136447 Nature Energy NaN NaN
3 [[{'first_name': 'Joan A.', 'last_name': 'Case... 2020 article Publisher Correction: Coal-fired power plant c... pub.1127611570 1-1 jour.1136447 Nature Energy NaN NaN
4 [[{'first_name': 'Vanesa', 'last_name': 'Castá... 2020 article Energy access is needed to maintain health dur... pub.1127501502 1-3 jour.1136447 Nature Energy NaN NaN
5 [[{'first_name': 'Xue', 'last_name': 'Wang', '... 2020 article Efficient electrically powered CO2-to-ethanol ... pub.1127499508 1-9 jour.1136447 Nature Energy NaN NaN
6 [[{'first_name': 'Michael J.', 'last_name': 'F... 2020 article Make fun of your research pub.1127404382 1-3 jour.1136447 Nature Energy NaN NaN
7 [[{'first_name': 'Jens', 'last_name': 'Appel',... 2020 article Cyanobacterial in vivo solar hydrogen producti... pub.1127347060 1-10 jour.1136447 Nature Energy NaN NaN
8 [[{'first_name': 'Oliver', 'last_name': 'Lenz'... 2020 article Hydrogen comes alive pub.1127347477 1-2 jour.1136447 Nature Energy NaN NaN
9 [[{'first_name': 'Joan A.', 'last_name': 'Case... 2020 article Coal-fired power plant closures and retrofits ... pub.1127254634 1-2 jour.1136447 Nature Energy NaN NaN
10 [[{'first_name': 'Michelle', 'last_name': 'Gra... 2020 article COVID-19 assistance needs to target energy ins... pub.1127249329 1-3 jour.1136447 Nature Energy NaN NaN
11 [[{'first_name': 'Jun', 'last_name': 'Du', 'in... 2020 article Spectroscopic insights into high defect tolera... pub.1127686387 1-9 jour.1136447 Nature Energy NaN NaN
12 [[{'first_name': 'Iván', 'last_name': 'Mora-Se... 2020 article Turn defects into strengths pub.1127687396 1-2 jour.1136447 Nature Energy NaN NaN
13 [[{'first_name': 'Jiangyan', 'last_name': 'Wan... 2020 article Electrolytes for microsized silicon pub.1127174686 1-2 jour.1136447 Nature Energy NaN NaN
14 [[{'first_name': 'William E.', 'last_name': 'M... 2020 article Improving alkaline ionomers pub.1127160760 1-2 jour.1136447 Nature Energy NaN NaN
15 [[{'first_name': 'Sunil', 'last_name': 'Mani',... 2020 article The drivers of sustained use of liquified petr... pub.1127145259 1-8 jour.1136447 Nature Energy NaN NaN
16 [[{'first_name': 'Ji', 'last_name': 'Chen', 'i... 2020 article Electrolyte design for LiF-rich solid–electrol... pub.1126820081 1-12 jour.1136447 Nature Energy NaN NaN
17 [[{'first_name': 'Joan A.', 'last_name': 'Case... 2020 article Improved asthma outcomes observed in the vicin... pub.1126634439 1-11 jour.1136447 Nature Energy NaN NaN
18 NaN 2020 article Recovering fast and slow pub.1126269829 273-273 jour.1136447 Nature Energy 4 5
19 [[{'first_name': 'Constantine E.', 'last_name'... 2020 article Mandatory building energy audits alone are ins... pub.1125976997 282-283 jour.1136447 Nature Energy 4 5

Multi-line version: %%dsldf

To split the query across multiple lines, use the %%dsldf command instead (two % signs):

[15]:
%%dsldf
search publications
where year in [2013:2018] and research_orgs="grid.258806.1"
return publications[title+year+times_cited] sort by times_cited
Returned Publications: 20 (total = 3769)
[15]:
year title times_cited
0 2014 CH3NH3SnxPb(1-x)I3 Perovskite Solar Cells Cove... 488
1 2015 Asymmetric Supercapacitors Using 3D Nanoporous... 480
2 2014 Improved understanding of the electronic and e... 301
3 2016 Hierarchical Gaussian Descriptor for Person Re... 297
4 2018 Brain Intelligence: Go beyond Artificial Intel... 282
5 2013 Comparative study of ceramic and single crysta... 235
6 2014 Underwater image dehazing using joint trilater... 228
7 2016 Flexible Graphene-Based Supercapacitors: A Review 210
8 2014 Recent Progress of Counter Electrode Catalysts... 183
9 2017 Highly Luminescent Phase-Stable CsPbI3 Perovsk... 168
10 2014 Hole-Conductor-Free, Metal-Electrode-Free TiO2... 167
11 2018 Motor Anomaly Detection for Unmanned Aerial Ve... 163
12 2018 Low illumination underwater light field images... 161
13 2013 Study of rare-earth-doped scintillators 146
14 2016 Implementation of Super-Twisting Control: Supe... 146
15 2015 Low-Temperature and Solution-Processed Amorpho... 137
16 2013 Maximum Torque per Ampere (MTPA) Control of an... 128
17 2014 All-Solid Perovskite Solar Cells with HOCO-R-N... 122
18 2015 Insight into Perovskite Solar Cells Based on S... 113
19 2016 Photoelectrochemical CO2 reduction by a p-type... 106

Note: the autocompleter is available only with single-line queries.

4. Looped dataframe queries: %dslloopdf and %%dslloopdf

These commands behave just like the dataframe magics above, except that they trigger an iterative query that attempts to extract all records available for a chosen DSL query, up to the maximum limit of 50k.

[16]:
%dslloopdf search publications for "malaria AND Egypt" where year=2015 return publications
1000 / ...
1000 / 2463
2000 / 2463
2463 / 2463
===
Records extracted: 2463
[16]:
id title volume author_affiliations type year pages issue journal.id journal.title
0 pub.1090179052 Population 65 [[{'first_name': '', 'last_name': 'UN', 'initi... chapter 2015 1002-1012 NaN NaN NaN
1 pub.1007697326 D NaN NaN chapter 2015 127-136 NaN NaN NaN
2 pub.1086527362 Part I. Introduction to Applied Mathematics NaN NaN chapter 2015 1-80 NaN NaN NaN
3 pub.1090180227 International trade, finance and transport 65 [[{'first_name': '', 'last_name': 'UN', 'initi... chapter 2015 902-936 NaN NaN NaN
4 pub.1090179484 Human rights country situations 65 [[{'first_name': '', 'last_name': 'UN', 'initi... chapter 2015 753-785 NaN NaN NaN
... ... ... ... ... ... ... ... ... ... ...
2458 pub.1046846662 Schistosoma japonicum NaN [[{'first_name': 'Zhongdao', 'last_name': 'Wu'... chapter 2015 1-7 NaN NaN NaN
2459 pub.1038593222 Phenomics in Crop Plants: Trends, Options and ... NaN NaN book 2015 NaN NaN NaN NaN
2460 pub.1009941724 Pediatric Urology, Contemporary Strategies fro... NaN NaN book 2015 NaN NaN NaN NaN
2461 pub.1007654739 The Quest for Trust in the Face of Uncertainty... NaN [[{'first_name': 'Nadine', 'last_name': 'Beckm... chapter 2015 59-83 NaN NaN NaN
2462 pub.1015527531 Flax: Sustainability Is the New Luxury NaN [[{'first_name': 'Joan', 'last_name': 'Farrer'... chapter 2015 19-41 NaN NaN NaN

2463 rows × 10 columns

5. Getting API schema documentation with %dsldocs

The %dsldocs magic prints out information about the fields and entities available via the Dimensions Search Language. This command returns a tabular version of the data model specs online (in case you are interested, this is possible thanks to the describe DSL command).

For example, if you pass a source name like grants, what you get back is a nice table showing all fields available for that source.

[17]:
%dsldocs grants
[17]:
sources field type description is_filter is_entity is_facet
0 grants abstract string Abstract or summary from a grant proposal. False False False
1 grants active_year integer List of active years for a grant. True False True
2 grants category_bra categories `Broad Research Areas <https://app.dimensions.... True True True
3 grants category_for categories `ANZSRC Fields of Research classification <htt... True True True
4 grants category_hra categories `Health Research Areas <https://app.dimensions... True True True
5 grants category_hrcs_hc categories `HRCS - Health Categories <https://app.dimensi... True True True
6 grants category_hrcs_rac categories `HRCS – Research Activity Codes <https://app.d... True True True
7 grants category_icrp_cso categories `ICRP Common Scientific Outline <https://app.d... True True True
8 grants category_icrp_ct categories `ICRP Cancer Types <https://app.dimensions.ai/... True True True
9 grants category_rcdc categories `Research, Condition, and Disease Categorizati... True True True
10 grants category_uoa categories `Units of Assessment <https://app.dimensions.a... True True True
11 grants concepts string Concepts describing the main topics of a grant... False False False
12 grants date_inserted date Date when the record was inserted into Dimensi... True False False
13 grants end_date date Date when the grant ends. True False False
14 grants foa_number string The funding opportunity announcement (FOA) num... True False False
15 grants funder_countries countries The country linked to the organisation funding... True True True
16 grants funders organizations The organisation funding the grant. This is no... True True True
17 grants funding_aud float Funding amount awarded in AUD. True False False
18 grants funding_cad float Funding amount awarded in CAD. True False False
19 grants funding_chf float Funding amount awarded in CHF. True False False
20 grants funding_currency string Original funding currency. True False True
21 grants funding_eur float Funding amount awarded in EUR. True False False
22 grants funding_gbp float Funding amount awarded in GBP. True False False
23 grants funding_jpy float Funding amount awarded in JPY. True False False
24 grants funding_nzd float Funding amount awarded in NZD. True False False
25 grants funding_org_acronym string Acronym for funding organisation. True False True
26 grants funding_org_city string City name for funding organisation. True False True
27 grants funding_org_name string Name of funding organisation. True False True
28 grants funding_usd float Funding amount awarded in USD. True False False
29 grants grant_number string Grant identifier, as provided by the source (e... True False False
30 grants id string Dimensions grant ID. True False False
31 grants investigator_details json Additional details about investigators, includ... True False False
32 grants language string Grant original language, as ISO 639-1 language... True False True
33 grants language_title string ISO 639-1 language code for the original grant... True False True
34 grants linkout string Original URL for the grant. False False False
35 grants original_title string Title of the grant in its original language. False False False
36 grants research_org_cities cities City of the research organisations receiving t... True True True
37 grants research_org_countries countries Country of the research organisations receivin... True True True
38 grants research_org_state_codes states State of the organisations receiving the grant... True True True
39 grants research_orgs organizations GRID organisations receiving the grant (note: ... True True True
40 grants researchers researchers Dimensions researchers IDs associated to the g... True True True
41 grants start_date date Date when the grant starts, in the format 'YYY... True False False
42 grants start_year integer Year when the grant starts. True False True
43 grants title string Title of the grant in English (if the grant la... False False False

Similarly, for objects of type ‘Entity’, e.g. countries:

[18]:
%dsldocs countries
[18]:
entities field type description is_filter is_entity is_facet
0 countries id string GeoNames country code (eg 'US' for `geonames:6... True False False
1 countries name string GeoNames country name. True False False

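But don’t worry if you don’t get it right: if you pass an unrecognised object name, the full list of available sources and entities is printed. Note that in the version used here (v0.6.9) the message is followed by a KeyError traceback.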

[19]:
%dsldocs unknown
Can't recognize this object. Dimcli knows about:
 Sources=[publications - grants - patents - clinical_trials - policy_documents - researchers - organizations - datasets] Entities=[categories - cities - countries - journals - org_groups - states - open_access]
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
<ipython-input-19-e3d3c8c65656> in <module>
----> 1 get_ipython().run_line_magic('dsldocs', 'unknown')

~/Envs/jupyterlab/lib/python3.7/site-packages/IPython/core/interactiveshell.py in run_line_magic(self, magic_name, line, _stack_depth)
   2315                 kwargs['local_ns'] = sys._getframe(stack_depth).f_locals
   2316             with self.builtin_trap:
-> 2317                 result = fn(*args, **kwargs)
   2318             return result
   2319

<decorator-gen-130> in dsldocs(self, line)

~/Envs/jupyterlab/lib/python3.7/site-packages/IPython/core/magic.py in <lambda>(f, *a, **k)
    185     # but it's overkill for just that one bit of state.
    186     def magic_deco(arg):
--> 187         call = lambda f, *a, **k: f(*a, **k)
    188
    189         if callable(arg):

~/Envs/jupyterlab/lib/python3.7/site-packages/dimcli/jupyter/magics.py in dsldocs(self, line)
    144         d = {header: [], 'field': [], 'type': [], 'description':[], 'is_filter':[], 'is_entity': [],  'is_facet':[],}
    145         for S in docs_for:
--> 146             for x in sorted(res.json[header][S]['fields']):
    147                 d[header] += [S]
    148                 d['field'] += [x]

KeyError: 'unknown'
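
The last frame of the traceback suggests that the schema is a nested dictionary, looked up as `res.json[header][S]['fields']`. As an illustration only, the sketch below flattens a dictionary of that shape into table rows and checks the requested name first, so that an unknown name produces a readable error instead of a KeyError. The function `schema_rows` and the sample schema are hypothetical and are not part of Dimcli.

```python
def schema_rows(schema, header, name=None):
    """Flatten a nested schema dict into one row (dict) per field.

    Raises ValueError listing the known objects when `name` is not recognised.
    """
    known = sorted(schema[header])
    if name is not None and name not in schema[header]:
        raise ValueError(
            f"Unknown object {name!r}; known {header}: {', '.join(known)}"
        )
    targets = [name] if name is not None else known
    rows = []
    for target in targets:
        fields = schema[header][target]["fields"]
        for field_name in sorted(fields):
            rows.append({header: target, "field": field_name, **fields[field_name]})
    return rows


# Made-up schema fragment, shaped as the traceback suggests.
sample_schema = {
    "entities": {
        "countries": {
            "fields": {
                "name": {"type": "string", "is_filter": True},
                "id": {"type": "string", "is_filter": True},
            }
        }
    }
}

rows = schema_rows(sample_schema, "entities", "countries")

try:
    schema_rows(sample_schema, "entities", "unknown")
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(rows)
print(error_message)
```

A list of row dictionaries like `rows` can be passed straight to `pandas.DataFrame` to obtain a table similar to the ones shown above.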

Finally, if no object is requested, the full documentation for all the sources is returned.

[20]:
%dsldocs
[20]:
sources field type description is_filter is_entity is_facet
0 publications altmetric float Altmetric attention score. True False False
1 publications altmetric_id integer AltMetric Publication ID True False False
2 publications authors json Ordered list of authors names and their affili... True False False
3 publications book_doi string The DOI of the book a chapter belongs to (note... True False False
4 publications book_series_title string The title of the book series book, belong to. False False False
... ... ... ... ... ... ... ...
262 datasets research_org_states states State of the organisations the publication aut... True True True
263 datasets research_orgs organizations GRID organisations linked to the publication a... True True True
264 datasets researchers researchers Dimensions researchers IDs associated to the d... True True True
265 datasets title string Title of the dataset. False False False
266 datasets year integer Year of publication of the dataset. True False True

267 rows × 7 columns



Note

The Dimensions Analytics API allows you to carry out sophisticated research data analytics tasks like the ones described on this website. Also check out the associated GitHub repository for examples, the source code of these tutorials and much more.
