Exploring The Dimensions Search Language (DSL) - Quick Intro¶
This Notebook takes you through the basics of using the Dimensions API.
See also: official DSL documentation online
In this tutorial we leverage the capabilities of the Dimcli library in the context of Jupyter Notebooks. Dimcli is an open source Python library that simplifies common operations like logging in, querying and displaying results.
[1]:
import datetime
print("==\nCHANGELOG\nThis notebook was last run on %s\n==" % datetime.date.today().strftime('%b %d, %Y'))
==
CHANGELOG
This notebook was last run on Jan 24, 2022
==
Prerequisites¶
This notebook assumes you have installed the Dimcli library and are familiar with the ‘Getting Started’ tutorial.
[1]:
!pip install dimcli -U --quiet
import dimcli
from dimcli.utils import *
import sys
print("==\nLogging in..")
# https://digital-science.github.io/dimcli/getting-started.html#authentication
ENDPOINT = "https://app.dimensions.ai"
if 'google.colab' in sys.modules:
    import getpass
    KEY = getpass.getpass(prompt='API Key: ')
    dimcli.login(key=KEY, endpoint=ENDPOINT)
else:
    KEY = ""
    dimcli.login(key=KEY, endpoint=ENDPOINT)
dsl = dimcli.Dsl()
Searching config file credentials for 'https://app.dimensions.ai' endpoint..
==
Logging in..
Dimcli - Dimensions API Client (v0.9.6)
Connected to: <https://app.dimensions.ai/api/dsl> - DSL v2.0
Method: dsl.ini file
What the query statistics refer to¶
When performing a DSL search, a _stats object is returned, which contains useful information such as the total number of records available for a search.
[2]:
res1 = dsl.query("""
search publications
where year in [2013:2018] and research_orgs="grid.258806.1"
return publications""", verbose=False)
print(res1.stats) # PS this is short for `res1.json['_stats']`
{'total_count': 5807}
It is important to note, though, that the total number always refers to the main source one is searching for, not necessarily the results being returned. For example, in this query we return researchers linked to publications:
[3]:
res2 = dsl.query("""
search publications
where year in [2013:2018] and research_orgs="grid.258806.1"
return researchers""", verbose=False)
print(res2.stats)
{'total_count': 5807}
Still 5807 records! That’s because the total count always refers to the main object type one is searching for, not to the facet being returned.
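To make the point concrete without calling the API, here is a minimal sketch that reads the total from a response-shaped dictionary. The two dictionaries are hand-written stand-ins rather than live output, and the `publications` and `researchers` keys are an assumption about the JSON layout; only `_stats` and `total_count` come from the cells above. `total_count` is a hypothetical helper name.

```python
def total_count(payload):
    """Return the total reported under `_stats`, or None if it is missing."""
    return payload.get("_stats", {}).get("total_count")

# Hand-written stand-ins for the two responses above (not live API output).
# The result-list keys are assumed; only `_stats` is shown in the notebook.
pubs_response = {"_stats": {"total_count": 5807}, "publications": []}
researchers_response = {"_stats": {"total_count": 5807}, "researchers": []}

# Both queries search the same source, so both report the same total.
same_total = total_count(pubs_response) == total_count(researchers_response)
print(same_total)
```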
Tip: this basic information about the objects returned is also available via the count_batch and count_total attributes of the query results object.
[4]:
result = dsl.query("""
search publications
for "malaria AND congo"
return publications[basics]
limit 30
""", verbose=False)
# print some stats using the Result object
print("Results in this batch: ", result.count_batch)
print("Results in total: ", result.count_total)
print("Errors: ",result.errors)
Results in this batch: 30
Results in total: 86890
Errors: None
Working with fields¶
Note: in the following examples we use the magic command %%dsldf for quicker querying.
Control the fields you return¶
[5]:
%%dsldf
search publications
return publications[id+title+year+doi]
limit 5
Returned Publications: 5 (total = 124736479)
Time: 2.29s
[5]:
 | doi | id | title | year |
---|---|---|---|---|
0 | 10.13170/depik.10.3.22492 | pub.1144593888 | Profile of ectoparasites and biometric conditi... | 2022 |
1 | 10.1007/s11708-021-0812-6 | pub.1144587500 | Experimental study of stratified lean burn cha... | 2022 |
2 | 10.1145/3480027 | pub.1141731113 | Opportunities and Challenges in Code Search Tools | 2022 |
3 | 10.1145/3479393 | pub.1141731112 | Ransomware Mitigation in the Modern Era: A Com... | 2022 |
4 | 10.1145/3478680 | pub.1141731111 | Service Computing for Industry 4.0: State of t... | 2022 |
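When the list of fields is decided at run time, the `source[field+field]` syntax can be assembled from a Python list. The helper below is a sketch with a hypothetical name (`return_clause`); it only builds the query string and does not check that the fields exist.

```python
def return_clause(source, fields=None, limit=None):
    """Build a DSL `return` clause such as `return publications[id+title]`."""
    clause = f"return {source}"
    if fields:
        # Fields are joined with `+` inside square brackets, as in the cell above.
        clause += "[" + "+".join(fields) + "]"
    if limit is not None:
        clause += f" limit {limit}"
    return clause

query = "search publications " + return_clause(
    "publications", ["id", "title", "year", "doi"], limit=5
)
print(query)
```

The resulting string can be passed to `dsl.query` in the same way as the hand-written queries in this notebook.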
Make a mistake, and the DSL will tell you which fields you could have used¶
[6]:
%%dsldf
search publications
return publications[dois]
limit 100
Returned Errors: 1
Time: 4.06s
1 QueryError found
Semantic errors found:
Field / Fieldset 'dois' is not present in Source 'publications'. Available fields: abstract,acknowledgements,altmetric,altmetric_id,arxiv_id,authors,authors_count,book_doi,book_series_title,book_title,category_bra,category_for,category_hra,category_hrcs_hc,category_hrcs_rac,category_icrp_cso,category_icrp_ct,category_rcdc,category_sdg,category_uoa,clinical_trial_ids,concepts,concepts_scores,date,date_inserted,date_online,date_print,dimensions_url,doi,field_citation_ratio,funder_countries,funders,id,issn,issue,journal,journal_lists,journal_title_raw,linkout,mesh_terms,open_access,pages,pmcid,pmid,proceedings_title,publisher,recent_citations,reference_ids,referenced_pubs,relative_citation_ratio,research_org_cities,research_org_countries,research_org_country_names,research_org_names,research_org_state_codes,research_org_state_names,research_orgs,researchers,resulting_publication_doi,source_title,subtitles,supporting_grant_ids,times_cited,title,type,volume,year and available fieldsets: basics,book,categories,extras
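Because the error message lists the valid field names, it can also be used to suggest a likely correction. The sketch below parses a shortened copy of the message above (the field list is abbreviated for readability) and uses the standard library's `difflib` to find close matches. It assumes the message keeps the wording shown here, which may change between API versions; `available_fields` and `suggest` are hypothetical helper names.

```python
import difflib
import re

def available_fields(error_message):
    """Extract the comma-separated field names from a DSL error message."""
    match = re.search(r"Available fields: (\S+)", error_message)
    return match.group(1).split(",") if match else []

def suggest(wrong_name, error_message, n=3):
    """Return up to `n` valid field names that resemble `wrong_name`."""
    return difflib.get_close_matches(wrong_name, available_fields(error_message), n=n)

# Shortened copy of the message above; the real field list is much longer.
message = (
    "Field / Fieldset 'dois' is not present in Source 'publications'. "
    "Available fields: abstract,authors,date,doi,id,issn,title,year "
    "and available fieldsets: basics,book,categories,extras"
)
print(suggest("dois", message))
```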
Full text search¶
You can search across the full text of a document, or restrict the search to a specific index, for example concepts or title and abstract only.
[8]:
%dsldf search publications in concepts for "situ detection OR malaria" return publications
Returned Publications: 20 (total = 238349)
Time: 2.24s
[8]:
 | authors | id | pages | title | type | volume | year | journal.id | journal.title | issue |
---|---|---|---|---|---|---|---|---|---|---|
0 | [{'affiliations': [{'city': 'Qingdao', 'city_i... | pub.1143924946 | 111-122 | In-situ constructing visible light CdS/Cd-MOF ... | article | 69 | 2022 | jour.1138885 | Particuology | NaN |
1 | [{'affiliations': [{'city': 'Johor Bahru', 'ci... | pub.1141511490 | 27-34 | In situ biosynthesized silver nanoparticle-inc... | article | 67 | 2022 | jour.1138885 | Particuology | NaN |
2 | [{'affiliations': [{'city': 'Tianjin', 'city_i... | pub.1141114620 | 59-70 | Thermodynamic and kinetic mechanism of phase t... | article | 66 | 2022 | jour.1138885 | Particuology | NaN |
3 | [{'affiliations': [{'city': 'Montpellier', 'ci... | pub.1144553393 | 107498 | Small angle x-ray scattering to investigate th... | article | 127 | 2022 | jour.1096852 | Food Hydrocolloids | NaN |
4 | [{'affiliations': [{'city': 'Beijing', 'city_i... | pub.1144437185 | 199-206 | Dual-function redox mediator enhanced lithium-... | article | 113 | 2022 | jour.1053018 | Journal of Material Science and Technology | NaN |
5 | [{'affiliations': [{'city': 'Wollongong', 'cit... | pub.1144337486 | 90-104 | Effects of inter-layer remelting frequency on ... | article | 113 | 2022 | jour.1053018 | Journal of Material Science and Technology | NaN |
6 | [{'affiliations': [{'city': 'Taipei', 'city_id... | pub.1144230882 | 100831 | Traditional Chinese medicine attenuates hospit... | article | 11 | 2022 | jour.1048721 | Integrative Medicine Research | 2 |
7 | [{'affiliations': [{'city': 'Wuhan', 'city_id'... | pub.1143825067 | 1-10 | Solar fuel generation over nature-inspired rec... | article | 112 | 2022 | jour.1053018 | Journal of Material Science and Technology | NaN |
8 | [{'affiliations': [{'city': 'Pretoria', 'city_... | pub.1143661252 | 153-161 | Heat-treatment effect on anti-corrosion behavi... | article | 5 | 2022 | jour.1319579 | International Journal of Lightweight Materials... | 2 |
9 | [{'affiliations': [{'city': 'Aachen', 'city_id... | pub.1143575354 | 100081 | Adjustment of chemical composition with dissim... | article | 5 | 2022 | jour.1386545 | Journal of Advanced Joining Processes | NaN |
10 | [{'affiliations': [{'city': 'Guilin', 'city_id... | pub.1143206146 | 189-203 | Remarkable catalysis of spinel ferrite XFe2O4 ... | article | 111 | 2022 | jour.1053018 | Journal of Material Science and Technology | NaN |
11 | [{'affiliations': [{'city': 'Mosul', 'city_id'... | pub.1142709745 | e00787 | Rehabilitation and Repair of AL- Tahera Church... | article | 16 | 2022 | jour.1150125 | Case Studies in Construction Materials | NaN |
12 | [{'affiliations': [{'city': 'Guangzhou', 'city... | pub.1142053424 | 120-132 | 3D-printed bioactive ceramic scaffolds with bi... | article | 12 | 2022 | jour.1053750 | Bioactive Materials | NaN |
13 | [{'affiliations': [{'city': 'Suzhou', 'city_id... | pub.1142009407 | 169-184 | Local bone metabolism balance regulation via d... | article | 12 | 2022 | jour.1053750 | Bioactive Materials | NaN |
14 | [{'affiliations': [{'city': 'Sydney', 'city_id... | pub.1144688828 | 100069 | Development of a portable Universal Testing Ma... | article | 4 | 2022 | jour.1392528 | Advances in Industrial and Manufacturing Engin... | NaN |
15 | [{'affiliations': [{'city': 'Hefei', 'city_id'... | pub.1144613324 | 106950 | A sensitive carbon monoxide sensor for industr... | article | 152 | 2022 | jour.1038957 | Optics and Lasers in Engineering | NaN |
16 | [{'affiliations': [{'city': 'Shanghai', 'city_... | pub.1144588137 | 163702 | Co decoration of molybdenum sulfide and carbon... | article | 902 | 2022 | jour.1041821 | Journal of Alloys and Compounds | NaN |
17 | [{'affiliations': [{'city': "Xi'an", 'city_id'... | pub.1144433265 | 163631 | Solid CoZn glycerate template-based engineerin... | article | 902 | 2022 | jour.1041821 | Journal of Alloys and Compounds | NaN |
18 | [{'affiliations': [{'city': 'Monterrey', 'city... | pub.1144368261 | 100180 | Laccase-assisted biosensing constructs – Robus... | article | 5 | 2022 | jour.1378967 | Case Studies in Chemical and Environmental Eng... | NaN |
19 | [{'affiliations': [{'city': 'Harbin', 'city_id... | pub.1144117575 | 121033 | In situ unraveling surface reconstruction of N... | article | 305 | 2022 | jour.1039901 | Applied Catalysis B Environmental | NaN |
[9]:
%%dsldf
search publications in title_abstract_only for "nanotechnology"
return publications
limit 3
Returned Publications: 3 (total = 98598)
Time: 1.14s
[9]:
 | authors | id | pages | title | type | volume | year | journal.id | journal.title | issue |
---|---|---|---|---|---|---|---|---|---|---|
0 | [{'affiliations': [{'city': 'Guangzhou', 'city... | pub.1143936192 | 334-361 | Energetics Systems and artificial intelligence... | article | 8 | 2022 | jour.1150945 | Energy Reports | NaN |
1 | [{'affiliations': [{'name': 'CAS Key Laborator... | pub.1143460385 | 31-48 | Toxicity of manufactured nanomaterials | article | 69 | 2022 | jour.1138885 | Particuology | NaN |
2 | [{'affiliations': [{'city': 'Huzhou', 'city_id... | pub.1144622580 | 978-983 | The Effect of Bone Morphogenetic Protein 2 (BM... | article | 12 | 2022 | jour.1047400 | Journal of Biomaterials and Tissue Engineering | 5 |
A simple author search¶
[10]:
%%dsldf
search publications in authors for "\"Daniel Hook\""
return publications
limit 10
Returned Publications: 10 (total = 85)
Time: 1.14s
[10]:
 | authors | id | title | type | year | journal.id | journal.title | issue | pages | volume |
---|---|---|---|---|---|---|---|---|---|---|
0 | [{'affiliations': [], 'corresponding': '', 'cu... | pub.1143968248 | Connecting Scientometrics: Dimensions as a rou... | preprint | 2021 | jour.1371339 | arXiv | NaN | NaN | NaN |
1 | [{'affiliations': [{'city': 'St Louis', 'city_... | pub.1142152310 | PT -symmetric classical mechanics | article | 2021 | jour.1043366 | Journal of Physics Conference Series | 1 | 012003 | 2038 |
2 | [{'affiliations': [{'city': 'Townsville', 'cit... | pub.1141486003 | Can I breastfeed my baby with Down syndrome? A... | article | 2021 | jour.1057714 | Journal of Paediatrics and Child Health | 12 | 1866-1880 | 57 |
3 | [{'affiliations': [{'city': 'London', 'city_id... | pub.1137191304 | Scaling Scientometrics: Dimensions on Google B... | article | 2021 | jour.1292498 | Frontiers in Research Metrics and Analytics | NaN | 656233 | 6 |
4 | [{'affiliations': [], 'corresponding': '', 'cu... | pub.1136235066 | $PT$-symmetric classical mechanics | preprint | 2021 | jour.1371339 | arXiv | NaN | NaN | NaN |
5 | [{'affiliations': [], 'corresponding': '', 'cu... | pub.1134860042 | Scaling Scientometrics: Dimensions on Google B... | preprint | 2021 | jour.1371339 | arXiv | NaN | NaN | NaN |
6 | [{'affiliations': [{'city': 'London', 'city_id... | pub.1134491856 | Real-Time Bibliometrics: Dimensions as a Resou... | article | 2021 | jour.1292498 | Frontiers in Research Metrics and Analytics | NaN | 595299 | 5 |
7 | [{'affiliations': [{'name': 'Digital Science'}... | pub.1124226668 | Dimensions: Bringing down barriers between sci... | article | 2020 | jour.1377615 | Quantitative Science Studies | 1 | 387-395 | 1 |
8 | [{'affiliations': [{'city': 'Oxford', 'city_id... | pub.1115957159 | Perception, prestige and PageRank | article | 2019 | jour.1037553 | PLOS ONE | 5 | e0216783 | 14 |
9 | [{'affiliations': [], 'corresponding': '', 'cu... | pub.1119449118 | The Price of Gold: Curiosity? | preprint | 2019 | jour.1371339 | arXiv | NaN | NaN | NaN |
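The author query above wraps the name in escaped quotes (`"\"Daniel Hook\""`) so that it is matched as an exact phrase. A small sketch of building such a query from Python follows; `search_phrase` is a hypothetical helper, and it simply rejects phrases that already contain quotes or backslashes rather than guessing how the DSL would want them escaped.

```python
def search_phrase(source, index, phrase):
    """Build an exact-phrase search restricted to one index."""
    if '"' in phrase or "\\" in phrase:
        # Escaping rules for these characters are not covered in this notebook.
        raise ValueError("phrase must not contain quotes or backslashes")
    # The phrase is wrapped in escaped quotes inside the outer DSL string.
    return f'search {source} in {index} for "\\"{phrase}\\"" return {source}'

query = search_phrase("publications", "authors", "Daniel Hook")
print(query)
```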
..or search for publications by a specific researcher ID¶
[11]:
%%dsldf
search publications
where researchers.id = "ur.013514345521.07"
return publications[doi+researchers]
limit 1
Returned Publications: 1 (total = 22)
Time: 2.68s
[11]:
 | doi | researchers |
---|---|---|
0 | 10.1201/9781003042570-10 | [{'first_name': 'Rashi', 'id': 'ur.01001350755... |
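The filters used in this notebook (a year range, an organisation ID, a researcher ID) all follow the same `where ... and ...` pattern. The sketch below composes such a clause from small pieces; the helper names (`where_clause`, `year_range`, `equals`) are hypothetical and cover only the filter forms that appear in the queries shown here.

```python
def year_range(start, end):
    """Range filter on the year field, e.g. `year in [2013:2018]`."""
    return f"year in [{start}:{end}]"

def equals(field, value):
    """Equality filter on a field, with the value in double quotes."""
    return f'{field}="{value}"'

def where_clause(*conditions):
    """Join individual conditions with `and` into one `where` clause."""
    return "where " + " and ".join(conditions)

clause = where_clause(
    year_range(2013, 2018),
    equals("research_orgs", "grid.258806.1"),
)
print(clause)
```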
Sources VS Facets¶
One of the queries above is using the researchers facet of the publications source.
In general, source queries can return up to 1000 records per query. For example, this returns an error:
[12]:
dsl.query("""
search publications
where year in [2013:2018] and research_orgs="grid.258806.1"
return publications limit 2000
""")
Returned Errors: 1
Time: 0.57s
1 QueryError found
Semantic errors found:
Limit 2000 exceeds maximum allowed limit 1000
[12]:
<dimcli.DslDataset object #4812964912. Errors: 1>
You can paginate through source results up to 50000 rows¶
With sources, you can use the limit/skip syntax in order to paginate through results:
[13]:
dsl.query("""
search publications
where year in [2013:2018] and research_orgs="grid.258806.1"
return publications limit 1000 skip 1000
""")
Returned Publications: 1000 (total = 5807)
Time: 2.40s
[13]:
<dimcli.DslDataset object #4407315520. Records: 1000/5807>
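The limit/skip arithmetic can be sketched as a generator that yields one query per page. The two caps are taken from this notebook (1000 records per query, 50000 rows overall); `paged_queries` is a hypothetical helper that only builds query strings. Dimcli also provides a `query_iterative` method intended to automate this kind of loop, so treat the code below as an illustration of the mechanics rather than the recommended approach.

```python
MAX_LIMIT = 1000   # per-query cap, as reported by the API error above
MAX_ROWS = 50000   # overall pagination cap stated in this section

def paged_queries(base_query, total, limit=MAX_LIMIT):
    """Yield one query per page, using `limit` and `skip` to walk the results."""
    if not 1 <= limit <= MAX_LIMIT:
        raise ValueError(f"limit must be between 1 and {MAX_LIMIT}")
    reachable = min(total, MAX_ROWS)
    for skip in range(0, reachable, limit):
        # Shrink the last page so it never asks for rows beyond `reachable`.
        page_size = min(limit, reachable - skip)
        yield f"{base_query} limit {page_size} skip {skip}"

base = (
    'search publications '
    'where year in [2013:2018] and research_orgs="grid.258806.1" '
    'return publications'
)
queries = list(paged_queries(base, total=5807))
print(len(queries))
```

In practice the `total` argument would come from `count_total` on the first result.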
You can return max 1000 facet rows¶
It is important to remember that when using facets you cannot use the skip operation, so the maximum number of records is always 1000.
[14]:
dsl.query("""
search publications
where year in [2013:2018] and research_orgs="grid.258806.1"
return researchers limit 1 skip 1000
""")
Returned Errors: 1
Time: 0.95s
1 QueryError found
Semantic errors found:
Offset is not supported for facet results
[14]:
<dimcli.DslDataset object #4811599632. Errors: 1>
While this works…
[15]:
dsl.query("""
search publications
where year in [2013:2018] and research_orgs="grid.258806.1"
return researchers limit 1000
""")
Returned Researchers: 1000
Time: 2.94s
[15]:
<dimcli.DslDataset object #4811691728. Records: 1000/5807>
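Both restrictions can be checked on the client before a query is sent, which avoids a round trip that is bound to fail. The sketch below is an illustration only: `return_with_paging` is a hypothetical helper, and the caller has to say whether the returned name is a facet, since the helper has no way of knowing.

```python
MAX_LIMIT = 1000  # maximum rows per query for both sources and facets

def return_with_paging(name, limit, skip=0, is_facet=False):
    """Build `return <name> limit N [skip M]`, rejecting combinations the API refuses."""
    if not 1 <= limit <= MAX_LIMIT:
        raise ValueError(f"limit {limit} exceeds maximum allowed limit {MAX_LIMIT}")
    if is_facet and skip:
        raise ValueError("skip is not supported for facet results")
    clause = f"return {name} limit {limit}"
    if skip:
        clause += f" skip {skip}"
    return clause

facet_clause = return_with_paging("researchers", 1000, is_facet=True)
source_clause = return_with_paging("publications", 1000, skip=1000)
print(facet_clause)
print(source_clause)
```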
Just make a mistake, and you will get the complete list of available facets¶
[16]:
dsl.query("""
search publications
return years
""")
Returned Errors: 1
Time: 0.74s
1 QueryError found
Semantic errors found:
Facet 'years' is not present in source 'publications'. Available facets are: authors_count,category_bra,category_for,category_hra,category_hrcs_hc,category_hrcs_rac,category_icrp_cso,category_icrp_ct,category_rcdc,category_sdg,category_uoa,funder_countries,funders,journal,journal_lists,mesh_terms,open_access,publisher,referenced_pubs,research_org_cities,research_org_countries,research_org_state_codes,research_orgs,researchers,source_title,times_cited,type,year
[16]:
<dimcli.DslDataset object #4811597088. Errors: 1>
Note
The Dimensions Analytics API allows you to carry out sophisticated research data analytics tasks like the ones described on this website. Also check out the associated GitHub repository for examples, the source code of these tutorials and much more.