
Exploring The Dimensions Search Language (DSL) - Deep Dive

This tutorial provides a detailed walkthrough of the most important features of the Dimensions Search Language.

This tutorial is based on the Query Syntax section of the official documentation, so it can be used as an interactive version of that documentation: it lets you try out the various DSL queries presented there.

What is the Dimensions Search Language?

The DSL aims to capture the type of interaction with Dimensions data that users are accustomed to performing graphically via the web application. It enables web app developers, power users, and others to carry out such interactions by writing query statements in a syntax loosely inspired by SQL but particularly suited to our specific domain and data organization.

Note: this notebook uses the Python programming language, but the DSL queries are not Python-specific and can be reused with any other API client.

Prerequisites

This notebook assumes you have installed the Dimcli library and are familiar with the Getting Started tutorial.

[1]:
!pip install dimcli --quiet

import dimcli
from dimcli.shortcuts import *
import json
import sys
import pandas as pd
#

print("==\nLogging in..")
# https://github.com/digital-science/dimcli#authentication
ENDPOINT = "https://app.dimensions.ai"
if 'google.colab' in sys.modules:
  import getpass
  USERNAME = getpass.getpass(prompt='Username: ')
  PASSWORD = getpass.getpass(prompt='Password: ')
  dimcli.login(USERNAME, PASSWORD, ENDPOINT)
else:
  USERNAME, PASSWORD  = "", ""
  dimcli.login(USERNAME, PASSWORD, ENDPOINT)
dsl = dimcli.Dsl()
==
Logging in..
Dimcli - Dimensions API Client (v0.6.9.2)
Connected to endpoint: https://app.dimensions.ai - DSL version: 1.25
Method: dsl.ini file

Sections Index

  1. Basic query structure

  2. Full-text searching

  3. Field searching

  4. Searching for researchers

  5. Returning results

  6. Aggregations

1. Basic query structure

DSL queries consist of two required components: a search phrase that indicates the scientific records to be searched, and one or more return phrases which specify the contents and structure of the desired results.

The simplest valid DSL query is of the form search <source> return <result>:

[3]:
%%dsldf
search grants return  grants limit 5
Returned Grants: 5 (total = 5310256)
[3]:
language title_language active_year project_num start_year funding_org_name id title start_date original_title funders end_date
0 en en [2021] 2018-HRSI-1548 2021 New Brunswick Health Research Foundation grant.8690978 APPROACH to Enriching the Real World Evidence ... 2021-11-30 APPROACH to Enriching the Real World Evidence ... [{'id': 'grid.484521.e', 'acronym': 'NBHRF', '... NaN
1 en en [2021] 1301720F 2021 Fund for Scientific Research grant.8950252 Molecular mechanism of DNA double strand break... 2021-10-01 Mécanismes moléculaires de la formation et la ... [{'id': 'grid.424470.1', 'acronym': 'FRS FNRS'... NaN
2 en en [2021, 2022, 2023] M 2734 2021 FWF Austrian Science Fund grant.8715161 Life as concept and as science 2021-10-01 Life as concept and as science [{'id': 'grid.25111.36', 'acronym': 'FWF', 'ci... 2023-09-30
3 en en [2021, 2022, 2023] 892933 2021 European Commission grant.8964235 Scintillation Light For New Physics with Liqui... 2021-09-01 Scintillation Light For New Physics with Liqui... [{'id': 'grid.270680.b', 'acronym': 'EC', 'cit... 2023-08-31
4 en en [2021, 2022, 2023] 893021 2021 European Commission grant.8963889 Jet quenching for heavy-ion collisions at the LHC 2021-09-01 Jet quenching for heavy-ion collisions at the LHC [{'id': 'grid.270680.b', 'acronym': 'EC', 'cit... 2023-08-31

search source

A query must begin with the word search followed by a source name, i.e. the name of a type of scientific record, such as grants or publications.

What are the sources available? See the data sources section of the documentation.

Alternatively, we can use the ‘schema’ API (describe) to return this information programmatically:

[4]:
dsl.query("describe schema")
[4]:
<dimcli.DslDataset object #4635749392. Dict keys: 'sources', 'entities'>

A more useful query might also make use of the optional for and where phrases to limit the set of records returned.

[5]:
%%dsldf
search grants  for "lung cancer"
    where active_year=2000
return  grants  limit 5
Returned Grants: 5 (total = 1734)
[5]:
project_num end_date start_date original_title start_year title_language id funding_org_name funders active_year language title
0 F32HL010455 2002-01-01 2000-12-31 ROLE OF CD44 ISOFORMS IN ENDOTHELIAL CELL DAMAGE 2000 en grant.2386513 National Heart Lung and Blood Institute [{'id': 'grid.279885.9', 'country_name': 'Unit... [2000, 2001, 2002] en ROLE OF CD44 ISOFORMS IN ENDOTHELIAL CELL DAMAGE
1 R01HL063695 2004-11-30 2000-12-18 ESTROGEN, ANGIOGENESIS AND ENDOTHELIAL PROGENI... 2000 en grant.2537116 National Heart Lung and Blood Institute [{'id': 'grid.279885.9', 'country_name': 'Unit... [2000, 2001, 2002, 2003, 2004] en ESTROGEN, ANGIOGENESIS AND ENDOTHELIAL PROGENI...
2 R01HL066221 2007-11-30 2000-12-18 GENETIC ANALYSIS OF EPHRIN-EPH SIGNALING IN AN... 2000 en grant.2537801 National Heart Lung and Blood Institute [{'id': 'grid.279885.9', 'country_name': 'Unit... [2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007] en GENETIC ANALYSIS OF EPHRIN-EPH SIGNALING IN AN...
3 R01HL062244 2017-12-31 2000-12-15 Synthetic Heparan Sulfate: Probing Biosynthesi... 2000 en grant.2536777 National Heart Lung and Blood Institute [{'id': 'grid.279885.9', 'country_name': 'Unit... [2000, 2001, 2002, 2003, 2004, 2005, 2006, 200... en Synthetic Heparan Sulfate: Probing Biosynthesi...
4 R01CA088932 2019-03-31 2000-12-01 Regulation of Telomerase by Sphingolipid Signa... 2000 en grant.2475193 National Cancer Institute [{'id': 'grid.48336.3a', 'country_name': 'Unit... [2000, 2001, 2002, 2003, 2004, 2005, 2006, 200... en Regulation of Telomerase by Sphingolipid Signa...
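
The phrases seen so far (search, for, where, return, limit) can also be assembled programmatically. The following is a minimal sketch, not part of Dimcli: build_query is a hypothetical helper that simply concatenates the phrases into a query string. It does not escape special characters in the search term.

```python
def build_query(source, term=None, where=None, limit=None):
    """Assemble a DSL query string (hypothetical helper, not part of Dimcli)."""
    parts = [f"search {source}"]
    if term is not None:
        # the search term goes in double quotes after the `for` keyword
        parts.append(f'for "{term}"')
    if where is not None:
        # the filter expression is used exactly as given
        parts.append(f"where {where}")
    parts.append(f"return {source}")
    if limit is not None:
        parts.append(f"limit {limit}")
    return " ".join(parts)


query = build_query("grants", term="lung cancer", where="active_year=2000", limit=5)
print(query)
```

The resulting string is equivalent to the query in the cell above and could be passed to dsl.query() once logged in.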

return result (source or facet)

The most basic return phrase consists of the keyword return followed by the name of a record or facet to be returned.

This must be the name of the source used in the search phrase, or the name of a facet of that source.

[6]:
%%dsldf
search grants for "laryngectomy"
return grants limit 5
Returned Grants: 5 (total = 110)
[6]:
start_date title end_date title_language project_num id funders original_title funding_org_name start_year language active_year
0 2019-08-15 Wearable silent speech technology to enhance i... 2024-07-31 en R01DC016621 grant.8554260 [{'id': 'grid.214431.1', 'types': ['Facility']... Wearable silent speech technology to enhance i... National Institute on Deafness and Other Commu... 2019 en [2019, 2020, 2021, 2022, 2023, 2024]
1 2019-04-01 Construction of a nursing system leading to im... 2023-03-31 en 19H03937 grant.8428997 [{'id': 'grid.54432.34', 'types': ['Nonprofit'... Construction of a nursing system leading to im... Japan Society for the Promotion of Science 2019 ja [2019, 2020, 2021, 2022, 2023]
2 2019-04-01 Development of self-directed TE shunt speech t... 2022-03-31 en 19K10927 grant.8441322 [{'id': 'grid.54432.34', 'types': ['Nonprofit'... Development of self-directed TE shunt speech t... Japan Society for the Promotion of Science 2019 ja [2019, 2020, 2021, 2022]
3 2019-04-01 Development of an olfactory improvement progra... 2021-03-31 ja 19K19574 grant.8422934 [{'id': 'grid.54432.34', 'types': ['Nonprofit'... 喉頭がん、下咽頭がんにより喉頭摘出術を受けた患者に対する嗅覚向上プログラムの開発 Japan Society for the Promotion of Science 2019 ja [2019, 2020, 2021]
4 2019-03-01 Early postoperative complications of laryngect... 2019-05-01 lv AP-44/19 grant.9013618 [{'id': 'grid.453247.3', 'types': ['Government... Agrīnās laringektomiju pēcoperācijas komplikāc... Ministry of Education and Science 2019 lv [2019]

For example, let’s see which facets are available for the grants source:

[7]:
fields = dsl.query("describe schema")['sources']['grants']['fields']
[x for x in fields if fields[x]['is_facet']]
[7]:
['category_uoa',
 'category_for',
 'category_hrcs_hc',
 'category_hra',
 'category_rcdc',
 'language',
 'funder_countries',
 'research_org_state_codes',
 'category_hrcs_rac',
 'research_org_cities',
 'start_year',
 'funders',
 'funding_currency',
 'research_org_countries',
 'active_year',
 'funding_org_city',
 'funding_org_name',
 'language_title',
 'funding_org_acronym',
 'research_orgs',
 'researchers',
 'category_icrp_cso',
 'category_icrp_ct',
 'category_bra']
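
The same filtering logic can be wrapped in a small function. This is a sketch only: sample_fields is a hand-written stand-in for the 'fields' dictionary returned by describe schema, reduced to the is_facet flag, and facet_names is a hypothetical helper name.

```python
# Hand-written stand-in for
# dsl.query("describe schema")['sources']['grants']['fields'];
# the real response has many more fields and more keys per field.
sample_fields = {
    "id": {"is_facet": False},
    "title": {"is_facet": False},
    "funders": {"is_facet": True},
    "start_year": {"is_facet": True},
}


def facet_names(fields):
    """Return the sorted names of the fields flagged as facets."""
    return sorted(name for name, info in fields.items() if info.get("is_facet"))


print(facet_names(sample_fields))
```

Sorting the names makes the output stable between runs, which is convenient when comparing sources.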

2. Full-text Searching

Full-text search or keyword search finds all instances of a term (keyword) in a document, or group of documents.

Full-text search works by using search indexes, each of which can target a specific section of a document, e.g. its abstract, authors, or full text.

[8]:
%%dsldf
search publications
    in full_data for "moon landing"
return publications limit 5
Returned Publications: 5 (total = 168408)
[8]:
type pages author_affiliations year id title
0 chapter 14-30 [[{'first_name': 'Alessandro', 'last_name': 'B... 2020 pub.1127643502 1. Into the Woods (Via Cuma 320, Bacoli)
1 chapter 82-103 NaN 2020 pub.1125153646 ANDRIY BONDAR
2 chapter 160-174 [[{'first_name': 'Laura', 'last_name': 'Marcus... 2020 pub.1126253233 15. H. G. Wells at Uppark
3 chapter 224-240 [[{'first_name': 'Jacob L.', 'last_name': 'Mac... 2020 pub.1127632269 12. The Silence of Aeneid 6 in Augustine’s Con...
4 chapter 232-271 [[{'first_name': 'Alison', 'last_name': 'Finla... 2020 pub.1125633591 Skald Sagas in their Literary Context 2: Possi...

2.1 in [search index]

This optional phrase consists of the particle in followed by a term indicating a search index, specifying for example whether the search is limited to full text, title and abstract only, or title only.

[9]:
%%dsldf
search grants
    in title_abstract_only for "something"
return grants limit 5
Returned Grants: 5 (total = 9677)
[9]:
start_date title end_date title_language project_num id funders original_title funding_org_name start_year language active_year
0 2020-10-01 SaTC: CORE: Medium: Collaborative: Hardening O... 2024-09-30 en 1954521 grant.9046367 [{'id': 'grid.457785.c', 'types': ['Government... SaTC: CORE: Medium: Collaborative: Hardening O... Directorate for Computer & Information Science... 2020 en [2020, 2021, 2022, 2023, 2024]
1 2020-10-01 SaTC: CORE: Medium: Collaborative: Hardening O... 2024-09-30 en 1955270 grant.9046432 [{'id': 'grid.457785.c', 'types': ['Government... SaTC: CORE: Medium: Collaborative: Hardening O... Directorate for Computer & Information Science... 2020 en [2020, 2021, 2022, 2023, 2024]
2 2020-10-01 SaTC: CORE: Medium: Collaborative: Hardening O... 2024-09-30 en 1954712 grant.9046384 [{'id': 'grid.457785.c', 'types': ['Government... SaTC: CORE: Medium: Collaborative: Hardening O... Directorate for Computer & Information Science... 2020 en [2020, 2021, 2022, 2023, 2024]
3 2020-09-30 The Cosmology of the Early and Late Universe 2023-09-29 en ST/T000732/1 grant.8673892 [{'id': 'grid.14467.30', 'types': ['Government... The Cosmology of the Early and Late Universe Science and Technology Facilities Council 2020 en [2020, 2021, 2022, 2023]
4 2020-09-01 Decoding the Infrared Spectra of High Frequenc... 2023-08-31 en 1900095 grant.8966252 [{'id': 'grid.457875.c', 'types': ['Government... Decoding the Infrared Spectra of High Frequenc... Directorate for Mathematical & Physical Sciences 2020 en [2020, 2021, 2022, 2023]

For example, let’s see which search fields are available for the grants source:

[10]:
dsl.query("describe schema")['sources']['grants']['search_fields']
[10]:
['concepts',
 'title_abstract_only',
 'title_only',
 'noun_phrases',
 'investigators',
 'full_data']
[11]:
%%dsldf
search grants
    in full_data for "graphene AND computer AND iron"
return grants limit 5
Returned Grants: 5 (total = 10)
[11]:
start_date title end_date title_language project_num id funders original_title funding_org_name start_year language active_year
0 2019-01-01 Weyl and Dirac semimetals and beyond - predict... 2021-12-31 en 19-43-04129 grant.8413990 [{'id': 'grid.454869.2', 'types': ['Nonprofit'... Weyl and Dirac semimetals and beyond - predict... Russian Science Foundation 2019 en [2019, 2020, 2021]
1 2018-01-01 Project of the organization of the 18th Intern... 2018-12-31 ru 18-02-20097 grant.8731867 [{'id': 'grid.452899.b', 'types': ['Government... Проект организации 18-ой Международной конфере... Russian Foundation for Basic Research 2018 ru [2018]
2 2016-02-22 Subject subsidy for maintaining the research p... 2016-12-31 pl 4491/E-370/S/2016 grant.7397800 [{'id': 'grid.425823.a', 'types': ['Government... Dotacja podmiotowa na utrzymanie potencjału ba... Ministry of Science and Higher Education 2016 pl [2016]
3 2015-02-19 Subject subsidy for maintaining the research p... 2015-12-31 pl 4491/E-370/S/2015 grant.7397795 [{'id': 'grid.425823.a', 'types': ['Government... Dotacja podmiotowa na utrzymanie potencjału ba... Ministry of Science and Higher Education 2015 pl [2015]
4 2014-04-09 Intentional grant for conducting in 2014 the F... 2014-12-31 pl 4491/E-370/M/2014 grant.7397490 [{'id': 'grid.425823.a', 'types': ['Government... Dotacja celowa na prowadzenie w 2014 przez Wyd... Ministry of Science and Higher Education 2014 pl [2014]

Special search indexes for person names make it possible to run full-text searches on publication authors or grant investigators. Please see the Searching for researchers section below for more information on how searches work in this case.

[12]:
%dsldf search publications in authors for "\"Jennifer A Doudna\"" return publications limit 5
Returned Publications: 5 (total = 323)
[12]:
title author_affiliations issue id year volume type pages journal.id journal.title
0 Machine learning predicts new anti-CRISPR prot... [[{'first_name': 'Simon', 'last_name': 'Eitzin... 9 pub.1125959258 2020 48 article 4698-4708 jour.1018982 Nucleic Acids Research
1 Author Correction: Phage-assisted evolution of... [[{'first_name': 'Michelle F.', 'last_name': '... NaN pub.1127737872 2020 NaN article 1-1 jour.1115214 Nature Biotechnology
2 Huge and variable diversity of episymbiotic CP... [[{'first_name': 'Christine Y', 'last_name': '... NaN pub.1127645424 2020 NaN preprint 2020.05.14.094862 jour.1293558 bioRxiv
3 Cancer-specific loss of TERT activation sensit... [[{'first_name': 'Alexandra M', 'last_name': '... NaN pub.1127163455 2020 NaN preprint 2020.04.25.061606 jour.1293558 bioRxiv
4 Blueprint for a Pop-up SARS-CoV-2 Testing Lab [[{'first_name': 'Innovative Genomics Institut... NaN pub.1126635310 2020 NaN article 2020.04.11.20061424 jour.1369542 medRxiv

2.2 for "search term"

This optional phrase consists of the keyword for followed by a search term string, enclosed in double quotes (").

Strings in double quotes can contain nested quotes escaped by a backslash \. This ensures that the string in nested double quotes is searched for as a single phrase, not as multiple words.

An example of a phrase: "\"Machine Learning\"" : results must contain Machine Learning as a phrase.

[13]:
%dsldf search publications for "\"Machine Learning\"" return publications limit 5
Returned Publications: 5 (total = 1139898)
[13]:
pages id type title author_affiliations year volume issue journal.id journal.title
0 243-248 pub.1124666091 chapter Towards maritime traffic coordination in the e... [[{'first_name': 'Eetu', 'last_name': 'Heikkil... 2020 NaN NaN NaN NaN
1 1726672 pub.1125710665 article Recognizing hotspots in Brief Eclectic Psychot... [[{'first_name': 'Sytske', 'last_name': 'Wiege... 2020 11 1 jour.1045059 European Journal of Psychotraumatology
2 41-54 pub.1126735888 article Capacitated vehicle routing problem with colum... [[{'first_name': 'Baze University Abuja', 'las... 2020 3 1 jour.1365688 Open Journal of Discrete Applied Mathematics
3 219-250 pub.1124034443 chapter Die Erfassung und Messung von Bedeutungsstrukt... [[{'first_name': 'Jan', 'last_name': 'Goldenst... 2020 NaN NaN NaN NaN
4 83-94 pub.1124677880 chapter Korean Technical Innovation: toward Autonomous... [[{'first_name': 'Yongwon', 'last_name': 'Kwon... 2020 NaN NaN NaN NaN

Example of multiple keywords: "Machine Learning" : this searches for keywords independently.

[14]:
%dsldf search publications for "Machine Learning" return publications limit 5
Returned Publications: 5 (total = 2400834)
[14]:
pages id type title author_affiliations year
0 84-118 pub.1124947017 chapter 4. Visualizing the Division of Labor: William ... [[{'first_name': 'John', 'last_name': 'Barrell... 2020
1 65-368 pub.1127396158 chapter Documents NaN 2020
2 87-139 pub.1125380179 chapter I. THE PHILOSOPHY OF SUCCESS [[{'first_name': 'Heinrich Robert', 'last_name... 2020
3 243-248 pub.1124666091 chapter Towards maritime traffic coordination in the e... [[{'first_name': 'Eetu', 'last_name': 'Heikkil... 2020
4 1-60 pub.1124109965 chapter George Eliot’s Spinoza. An introduction [[{'first_name': 'Benedictus de', 'last_name':... 2020

Note: Special characters, such as any of ^ " : ~ \ [ ] { } ( ) ! | & + must be escaped by a backslash \. Also, please note escaping rules in Python (or other languages). For example, when writing a query with escaped quotes, such as search publications for "\"phrase 1\" AND \"phrase 2\"", in Python, it is necessary to escape the backslashes as well, so it would look like: 'search publications for "\\"phrase 1\\" AND \\"phrase 2\\""'.

See the official docs for more details.
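
To make the Python escaping rules concrete, here is a minimal sketch. as_phrase is a hypothetical helper (not part of Dimcli) that wraps a string in backslash-escaped double quotes; it does not escape any of the other special characters listed above. The raw-string variant shows an alternative to doubling the backslashes by hand.

```python
def as_phrase(text):
    """Wrap text in backslash-escaped double quotes (hypothetical helper)."""
    # '\\"' in Python source is the two characters: backslash, double quote
    return f'\\"{text}\\"'


term = f'{as_phrase("phrase 1")} AND {as_phrase("phrase 2")}'
query = f'search publications for "{term}" return publications'

# A raw string keeps the backslashes as typed, so no doubling is needed
raw_query = r'search publications for "\"phrase 1\" AND \"phrase 2\"" return publications'

print(query)
print(query == raw_query)
```

Both strings contain exactly one backslash before each nested quote, which is what the DSL expects.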

2.3 Boolean Operators

A search term can consist of multiple keywords or phrases connected using Boolean logic operators, e.g. AND, OR and NOT.

[15]:
%dsldf search publications for "(dose AND concentration)" return publications limit 5
Returned Publications: 5 (total = 5259655)
[15]:
title id year type pages author_affiliations issue volume journal.id journal.title
0 ANHANG. Part 2 pub.1126070644 2020 chapter 802-1094 NaN NaN NaN NaN NaN
1 England in 1845 and in 1885 pub.1126070808 2020 chapter 61-66 NaN NaN NaN NaN NaN
2 Translational studies of estradiol and progest... pub.1124948447 2020 article 1723857 [[{'first_name': 'Antonia V', 'last_name': 'Se... 1 11 jour.1045059 European Journal of Psychotraumatology
3 7. Conservation of the Amsterdam Sunflowers: F... pub.1125801745 2020 chapter 175-206 [[{'first_name': 'Ella', 'last_name': 'Hendrik... NaN NaN NaN NaN
4 New findings questioning the construct validit... pub.1124216519 2020 article 1708145 [[{'first_name': 'Julian D', 'last_name': 'For... 1 11 jour.1045059 European Journal of Psychotraumatology

When specifying Boolean operators with keywords such as AND, OR and NOT, the keywords must appear in all uppercase.

The operators available are shown in the table below.

Boolean Operator   Alternative Symbol   Description
AND                &&                   Requires both terms on either side of the Boolean operator to be present for a match.
NOT                !                    Requires that the following term not be present.
OR                 ||                   Requires that either term (or both terms) be present for a match.
                   +                    Requires that the following term be present.
                   -                    Prohibits the following term (that is, matches on fields or documents that do not include that term). The - operator is functionally similar to the Boolean operator !.

[16]:
%dsldf search publications for "(dose OR concentration) AND (-malaria +africa)" return publications limit 5
Returned Publications: 5 (total = 1355217)
[16]:
type pages year id title author_affiliations
0 chapter 65-368 2020 pub.1127396158 Documents NaN
1 chapter 129-143 2020 pub.1124248733 8. India in the Early Nuclear Age [[{'first_name': 'Campbell', 'last_name': 'Cra...
2 chapter 155-174 2020 pub.1127822864 The Economy of Detainability [[{'first_name': 'Nicholas', 'last_name': 'De ...
3 chapter 634-688 2020 pub.1124248682 17. Institutions for Infrastructure in Develop... [[{'first_name': 'Antonio', 'last_name': 'Esta...
4 chapter 285-304 2020 pub.1124946791 16. The Neuroethology of Birdsong [[{'first_name': 'Eliot A.', 'last_name': 'Bre...
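
When such search terms are built from lists of keywords, a small helper can keep the operators uppercase and the parentheses balanced. This is a sketch under stated assumptions: combine is a hypothetical function (not part of Dimcli), and the terms are used as given, without escaping.

```python
def combine(operator, terms):
    """Join search terms with a Boolean operator, wrapped in parentheses.

    Hypothetical helper, not part of Dimcli. Terms are used as given.
    """
    if operator not in ("AND", "OR"):
        # Boolean keywords must appear in all uppercase in the DSL
        raise ValueError("operator must be 'AND' or 'OR' (uppercase)")
    return "(" + f" {operator} ".join(terms) + ")"


inner = combine("OR", ["dose", "concentration"])
term = f"{inner} AND (-malaria +africa)"
query = f'search publications for "{term}" return publications limit 5'
print(query)
```

The printed string reproduces the query used in the cell above.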

Combining keywords and Boolean operators makes it possible to construct rather sophisticated queries. For example, here’s a real-world query used to extract publications related to COVID-19.

[70]:
q_inner = """ "2019-nCoV" OR "COVID-19" OR "SARS-CoV-2" OR "HCoV-2019" OR "hcov" OR "NCOVID-19" OR
    "severe acute respiratory syndrome coronavirus 2" OR "severe acute respiratory syndrome corona virus 2"
    OR (("coronavirus"  OR "corona virus") AND (Wuhan OR China OR novel)) """

# tip: dsl_escape is a dimcli utility function for escaping special characters
q_outer = f"""search publications in full_data for "{dsl_escape(q_inner)}" return publications"""
print(q_outer)

dsl.query(q_outer)
search publications in full_data for " \"2019-nCoV\" OR \"COVID-19\" OR \"SARS-CoV-2\" OR \"HCoV-2019\" OR \"hcov\" OR \"NCOVID-19\" OR
    \"severe acute respiratory syndrome coronavirus 2\" OR \"severe acute respiratory syndrome corona virus 2\"
    OR ((\"coronavirus\"  OR \"corona virus\") AND (Wuhan OR China OR novel)) " return publications
Returned Publications: 20 (total = 99186)
[70]:
<dimcli.DslDataset object #4639883024. Records: 20/99186>

2.4 Wildcard Searches

The DSL supports single and multiple character wildcard searches within single terms. Wildcard characters can be applied to single terms, but not to search phrases.

[17]:
%dsldf search publications in title_only for "ital? malaria" return publications limit 5
Too Many Requests for the Server. Sleeping for 30 seconds and then retrying.
Returned Publications: 5 (total = 142)
[17]:
title author_affiliations id year type pages journal.id journal.title volume issue
0 Seasons in Italy: Northern European travelers,... [[{'first_name': 'Benjamin', 'last_name': 'Rei... pub.1124231018 2020 article 1-20 jour.1141817 Journal of Tourism and Cultural Change NaN NaN
1 Updated guidelines for malaria prophylaxis in ... [[{'first_name': 'Guido', 'last_name': 'Caller... pub.1123222257 2020 article 101544 jour.1034401 Travel Medicine and Infectious Disease 33 NaN
2 Clinical management of imported malaria in Ita... [[{'first_name': 'Luciana', 'last_name': 'Lepo... pub.1125332077 2020 article 28-33 jour.1089291 Microbiologica 43 1
3 Investigation on potential malaria vectors (An... [[{'first_name': 'Valentina', 'last_name': 'Ta... pub.1113815431 2019 article 151 jour.1030597 Malaria Journal 18 1
4 Increasing imported malaria in children and ad... [[{'first_name': 'Fiorenza', 'last_name': 'Pan... pub.1113201846 2019 article 34-39 jour.1034401 Travel Medicine and Infectious Disease 29 NaN
[18]:
%dsldf search publications in title_only for "it* malaria" return publications limit 5
Returned Publications: 5 (total = 1498)
[18]:
type pages author_affiliations issue volume year id title journal.id journal.title
0 article 24 [[{'first_name': 'Monica P.', 'last_name': 'Sh... 1 19 2020 pub.1124106064 The effectiveness of older insecticide-treated... jour.1030597 Malaria Journal
1 article 109809 [[{'first_name': 'Berge', 'last_name': 'Tsanou... NaN 136 2020 pub.1126819455 Modeling pyrethroids repellency and its role o... jour.1026215 Chaos Solitons & Fractals
2 article 100333 [[{'first_name': 'Toussaint', 'last_name': 'Ro... NaN 33 2020 pub.1124902730 Severe-malaria infection and its outcomes amon... jour.1042240 Spatial and Spatio-temporal Epidemiology
3 article NaN [[{'first_name': 'Arif Jamal', 'last_name': 'S... NaN 67 2020 pub.1127964785 Neurological disorder and psychosocial aspects... jour.1006696 Folia Parasitologica
4 preprint NaN [[{'first_name': 'Jifar', 'last_name': 'Hassen... NaN NaN 2020 pub.1127968073 Urban Malaria Prevalence and Its Associated Ri... jour.1380788 Research Square

Wildcard Search Type                                               Special Character   Example
Single character - matches a single character                      ?                   The search string te?t would match both test and text.
Multiple characters - matches zero or more sequential characters   *                   The wildcard search tes* would match test, testing, and tester. You can also use wildcard characters in the middle of a term. For example: te*t would match test and text. *est would match pest and test.
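
The ? and * rules can be tried out locally with Python's standard fnmatch module, which uses the same two wildcard characters. This is only an analogy: the DSL matches indexed terms on the server, and details such as case handling or tokenization may differ from what fnmatch does. The word list is made up for illustration.

```python
from fnmatch import fnmatchcase

# Sample words chosen to mirror the examples in the table
words = ["test", "text", "testing", "tester", "pest"]


def matching(pattern):
    """Return the sample words matched by a shell-style wildcard pattern."""
    return [w for w in words if fnmatchcase(w, pattern)]


print(matching("te?t"))  # single-character wildcard
print(matching("tes*"))  # multi-character wildcard at the end of a term
print(matching("te*t"))  # multi-character wildcard in the middle of a term
print(matching("*est"))  # multi-character wildcard at the start of a term
```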

2.5 Proximity Searches

A proximity search looks for terms that are within a specific distance from one another.

To perform a proximity search, add the tilde character ~ and a numeric value to the end of a search phrase. For example, to search for formal and model within 10 words of each other in a document, use the search:

[19]:
%dsldf search publications for "\"formal model\"~10" return publications limit 5
Returned Publications: 5 (total = 468787)
[19]:
pages id type title author_affiliations year volume issue journal.id journal.title
0 84-102 pub.1124248667 chapter 2. Clientelistic Politics and Economic Develop... [[{'first_name': 'Pranab', 'last_name': 'Bardh... 2020 NaN NaN NaN NaN
1 1726722 pub.1125320181 article Building cooperative learning to address alcoh... [[{'first_name': 'Oladapo', 'last_name': 'Olad... 2020 13 1 jour.1041075 Global Health Action
2 xi-xvi pub.1125144025 chapter Foreword NaN 2020 NaN NaN NaN NaN
3 137-159 pub.1125788857 chapter 6. Hierarchy and Power in the Tropical Forest [[{'first_name': 'Irving', 'last_name': 'Goldm... 2020 NaN NaN NaN NaN
4 136-161 pub.1125789336 chapter 6. The Structure and Workings of Employer-Prom... [[{'first_name': 'Joseph F.', 'last_name': 'Ge... 2020 NaN NaN NaN NaN
[20]:
%dsldf search publications for "\"digital humanities\"~5  +ontology" return publications limit 5
Returned Publications: 5 (total = 7345)
[20]:
pages id type title author_affiliations volume year issue journal.id journal.title
0 89 pub.1127423858 article Citizen science in the social sciences and hum... [[{'first_name': 'Loreta', 'last_name': 'Taugi... 6 2020 1 jour.1136613 Palgrave Communications
1 471-478 pub.1127978306 proceeding Atlante dei siti fortificati della provincia d... [[{'first_name': 'Maurizio', 'last_name': 'Tos... NaN 2020 NaN NaN NaN
2 NaN pub.1127498852 monograph Emerging Extended Reality Technologies For Ind... [[{'first_name': 'Jolanda G.', 'last_name': 'T... NaN 2020 NaN NaN NaN
3 185-196 pub.1124901249 article Sparse Low Rank Factorization for Deep Neural ... [[{'first_name': 'Sridhar', 'last_name': 'Swam... 398 2020 NaN jour.1128607 Neurocomputing
4 585-604 pub.1120871378 article WebKey: a graph-based method for event detecti... [[{'first_name': 'Elham', 'last_name': 'Rasoul... 54 2020 3 jour.1327483 Journal of Intelligent Information Systems

The distance referred to here is the number of term movements needed to match the specified phrase.

In the first example above ("formal model"~10), if formal and model were 10 positions apart in a field but model appeared before formal, more than 10 term movements would be required to bring the terms together and position model to the right of formal with a space in between.
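
A proximity term is just an escaped phrase followed by ~ and a number, so it can be generated from Python. The sketch below assumes a hypothetical helper named proximity (not part of Dimcli) and does not escape special characters inside the phrase.

```python
def proximity(phrase, distance):
    """Build a proximity search term: an escaped phrase followed by ~distance.

    Hypothetical helper, not part of Dimcli.
    """
    if not isinstance(distance, int) or distance < 0:
        raise ValueError("distance must be a non-negative integer")
    # '\\"' is a literal backslash followed by a double quote
    return f'\\"{phrase}\\"~{distance}'


term = proximity("formal model", 10)
query = f'search publications for "{term}" return publications limit 5'
print(query)
```

The printed string is the same query used in the first proximity example above.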

3. Field Searching

Field searching lets you use a specific field of a source as a query filter. This can be a literal field, such as the type of a publication, its date, or its MeSH terms. Or it can be an entity field, such as the journal title of a publication or the country name of its author affiliations.

What are the fields available for each source? See the data sources section of the documentation.

Alternatively, we can use the ‘schema’ API (describe) to return this information programmatically:

[21]:
%dsldocs publications
[21]:
sources field type description is_filter is_entity is_facet
0 publications altmetric float Altmetric attention score. True False False
1 publications altmetric_id integer AltMetric Publication ID True False False
2 publications authors json Ordered list of authors names and their affili... True False False
3 publications book_doi string The DOI of the book a chapter belongs to (note... True False False
4 publications book_series_title string The title of the book series book, belong to. False False False
5 publications book_title string The title of the book a chapter belongs to (no... False False False
6 publications category_bra categories `Broad Research Areas <https://app.dimensions.... True True True
7 publications category_for categories `ANZSRC Fields of Research classification <htt... True True True
8 publications category_hra categories `Health Research Areas <https://app.dimensions... True True True
9 publications category_hrcs_hc categories `HRCS - Health Categories <https://app.dimensi... True True True
10 publications category_hrcs_rac categories `HRCS – Research Activity Codes <https://app.d... True True True
11 publications category_icrp_cso categories `ICRP Common Scientific Outline <https://app.d... True True True
12 publications category_icrp_ct categories `ICRP Cancer Types <https://app.dimensions.ai/... True True True
13 publications category_rcdc categories `Research, Condition, and Disease Categorizati... True True True
14 publications category_sdg categories SDG - Sustainable Development Goals True True True
15 publications category_uoa categories `Units of Assessment <https://app.dimensions.a... True True True
16 publications concepts json None True False False
17 publications concepts_scores json None True False False
18 publications date date The publication date of a document, eg "2018-0... True False False
19 publications date_inserted date Date when the record was inserted into Dimensi... True False False
20 publications doi string Digital object identifier. True False False
21 publications field_citation_ratio float Relative citation performance of article when ... True False False
22 publications funder_countries countries The country of the organisations funding this ... True True True
23 publications funders organizations The GRID organisation funding this publication. True True True
24 publications id string Dimensions publication ID. True False False
25 publications issn string International Standard Serial Number True False False
26 publications issue string The issue number of a publication. True False False
27 publications journal journals The journal a publication belongs to. True True True
28 publications journal_lists string Independent grouping of journals outside of Di... True False True
29 publications linkout string Original URL for a publication full text. False False False
30 publications mesh_terms string Medical Subject Heading terms as used in PubMed. True False True
31 publications open_access_categories open_access Open Access categories for publications. See b... True True True
32 publications pages string The pages of the publication, as they would ap... True False False
33 publications pmcid string PubMed Central ID. True False False
34 publications pmid string PubMed ID. True False False
35 publications proceedings_title string Title of the conference proceedings volume ass... False False False
36 publications publisher string Name of the publisher as a string. True False True
37 publications recent_citations integer Number of citations received in the last two y... True False False
38 publications reference_ids string Dimensions publication ID for publications in ... True False False
39 publications relative_citation_ratio float Relative citation performance of an article wh... True False False
40 publications research_org_cities cities City of the organisations authors are affiliat... True True True
41 publications research_org_countries countries Country of the organisations authors are affil... True True True
42 publications research_org_country_names string Country name of the organisations authors are ... True False False
43 publications research_org_names string Names of organizations authors are affiliated to. True False False
44 publications research_org_state_codes states State of the organisations authors are affilia... True True True
45 publications research_org_state_names string State name of the organisations authors are af... True False False
46 publications research_orgs organizations GRID organisations associated to a publication... True True True
47 publications researchers researchers Researcher IDs matched to the publication's au... True True True
48 publications resulting_publication_doi string For preprints, the DOIs of the resulting full ... True False False
49 publications supporting_grant_ids string Grants supporting a publication, returned as a... True False False
50 publications times_cited integer Number of citations (note: does not support em... True False True
51 publications title string Title of a publication. False False False
52 publications type string Publication type (one of: article, chapter, pr... True False True
53 publications volume string Publication volume. True False False
54 publications year integer The year of publication (note: when the `date`... True False True

3.1 where

This optional phrase consists of the keyword where followed by a filters phrase consisting of DSL filter expressions, as described below.

[22]:
%dsldf search publications where type = "book" return publications limit 5
Returned Publications: 5 (total = 289608)
[22]:
id type title year
0 pub.1125300609 book Duoethnography in English Language Teaching 2020
1 pub.1108455576 book The Indo-Aryans of Ancient South Asia 2020
2 pub.1031251220 book Scholia in Aeschinem 2020
3 pub.1124703342 book Learning to Read Talmud 2020
4 pub.1125300607 book Sociolinguistic Perspectives on Migration Control 2020

If a for phrase is also used in a filtered query, the system will first apply the filters, and then search the resulting restricted set of documents for the search term.

[23]:
%dsldf search publications for "malaria" where type = "book" return publications limit 5
Returned Publications: 5 (total = 12374)
[23]:
type year id title
0 book 2020 pub.1127956583 Food Microbiology and Biotechnology
1 book 2020 pub.1127885675 Armed Conflict Survey 2020
2 book 2020 pub.1127764124 Textiles, Identity and Innovation: In Touch
3 book 2020 pub.1127540316 Phagocytes and Cellular Immunity
4 book 2020 pub.1127312535 Pharmaceutical Drug Product Development and Pr...

3.2 in

For convenience, the DSL also supports shorthand notation for filters where a particular field should be restricted to a specified range or list of values (although the same logic may be expressed using complex filters as shown below).

Syntax: a range filter consists of the field name, the keyword in, and a range of values enclosed in square brackets ([]), where the range consists of a low value, colon :, and a high value.

[24]:
%%dsldf
search grants
    for "malaria"
    where start_year in [ 2010 : 2015 ]
return grants limit 5
Returned Grants: 5 (total = 3046)
[24]:
language title_language active_year project_num start_year funding_org_name id title start_date original_title funders end_date
0 en en [2015, 2016, 2017] R21AI120981 2015 National Institute of Allergy and Infectious D... grant.4729738 Bloodborne tropical pathogen detection using m... 2015-12-28 Bloodborne tropical pathogen detection using m... [{'id': 'grid.419681.3', 'acronym': 'NIAID', '... 2017-11-30
1 en en [2015, 2016, 2017, 2018, 2019] R21AI120973 2015 National Institute of Allergy and Infectious D... grant.4729736 Field-deployable Assay for Differential Diagno... 2015-12-24 Field-deployable Assay for Differential Diagno... [{'id': 'grid.419681.3', 'acronym': 'NIAID', '... 2019-02-28
2 en en [2015, 2016, 2017, 2018] R21AI109439 2015 National Institute of Allergy and Infectious D... grant.4729699 T cell driven antigen discovery for vaccine ca... 2015-12-21 T cell driven antigen discovery for vaccine ca... [{'id': 'grid.419681.3', 'acronym': 'NIAID', '... 2018-11-30
3 en en [2015, 2016, 2017, 2018] 91488 2015 Volkswagen Foundation grant.4854433 Senior Fellowship for Dr. Eduardo Samo Gudo: E... 2015-12-18 Senior Fellowship for Dr. Eduardo Samo Gudo: E... [{'id': 'grid.452969.5', 'acronym': 'Volkswage... 2018-12-18
4 en en [2015, 2016, 2017, 2018, 2019] MIS-311250 2015 National Institute of Food and Agriculture grant.8821176 Biology, Ecology & Management of Emerging Dise... 2015-12-10 Biology, Ecology & Management of Emerging Dise... [{'id': 'grid.482914.2', 'acronym': 'NIFA', 's... 2019-09-30

Syntax: a list filter consists of the field name, the keyword in, and a list of one or more values enclosed in square brackets ([]), where values are separated by commas (,):

[25]:
%%dsldf
search grants
    for "malaria"
    where research_org_name in [ "UC Berkeley", "UC Davis", "UCLA"  ]
return grants limit 5
Returned Grants: 0
WARNINGS [1]
Field 'research_org_name' is deprecated in favor of research_orgs. Please refer to https://docs.dimensions.ai/dsl/releasenotes.html for more details
[25]:
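
When queries are assembled programmatically, the range and list notations described above can be generated from Python values. The following is a minimal sketch, not part of Dimcli: the helper names dsl_value, range_filter and list_filter are hypothetical, and escaping only double quotes is an assumption based on the quoting rule used elsewhere in this tutorial. The field research_org_name is reused from the cell above purely for illustration; per the warning shown there, it is deprecated.

```python
def dsl_value(value):
    """Render a Python value as a DSL literal: integers bare, strings in double quotes."""
    if isinstance(value, bool):
        raise TypeError("booleans are not handled by this sketch")
    if isinstance(value, int):
        return str(value)
    # Assumption: embedded double quotes are escaped with a backslash.
    escaped = str(value).replace('"', '\\"')
    return '"' + escaped + '"'


def range_filter(field, low, high):
    """Build a range filter, e.g. start_year in [ 2010 : 2015 ]."""
    return f"{field} in [ {dsl_value(low)} : {dsl_value(high)} ]"


def list_filter(field, values):
    """Build a list filter, e.g. field in [ "a", "b" ]."""
    if not values:
        raise ValueError("a list filter needs at least one value")
    rendered = ", ".join(dsl_value(v) for v in values)
    return f"{field} in [ {rendered} ]"


# Rebuild the range query from cell [24] as a plain string.
query = (
    'search grants for "malaria" '
    f"where {range_filter('start_year', 2010, 2015)} "
    "return grants limit 5"
)
print(query)
print(list_filter("research_org_name", ["UC Berkeley", "UC Davis", "UCLA"]))
```

The resulting string can then be passed to whichever API client is in use, for example the dsl object created in the setup cell.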

3.3 count - filter function

The filter function count is supported on some fields in publications (e.g. researchers and research_orgs).

The use of this filter is shown in the examples below:

[26]:
%%dsldf
search publications
    for "malaria"
    where count(research_orgs) > 5
return research_orgs limit 5
Returned Research_orgs: 5
[26]:
id count country_name name longitude state_name city_name latitude linkout types acronym
0 grid.4991.5 1477 United Kingdom University of Oxford -1.254010 Oxfordshire Oxford 51.753437 [http://www.ox.ac.uk/] [Education] NaN
1 grid.8991.9 1396 United Kingdom London School of Hygiene & Tropical Medicine -0.130700 Camden London 51.520900 [http://www.lshtm.ac.uk/] [Education] LSHTM
2 grid.38142.3c 1015 United States Harvard University -71.116650 Massachusetts Cambridge 42.377052 [http://www.harvard.edu/] [Education] NaN
3 grid.21107.35 814 United States Johns Hopkins University -76.620280 Maryland Baltimore 39.328888 [https://www.jhu.edu/] [Education] JHU
4 grid.7445.2 730 United Kingdom Imperial College London -0.175478 Westminster London 51.498600 [http://www.imperial.ac.uk/] [Education] NaN

Number of publications with more than 50 researchers.

[27]:
%%dsldf
search publications
    for "malaria"
    where count(researchers) > 50
return publications limit 5
Returned Publications: 5 (total = 190)
[27]:
id type title author_affiliations year journal.id journal.title pages volume issue
0 pub.1127418736 article Mapping geographical inequalities in childhood... [[{'first_name': 'Robert C', 'last_name': 'Rei... 2020 jour.1077219 The Lancet NaN NaN NaN
1 pub.1127157285 article Frequency and management of maternal infection... [[{'first_name': 'Mercedes', 'last_name': 'Bon... 2020 jour.1048786 The Lancet Global Health e661-e671 8 5
2 pub.1126151286 article Genetic tool development in marine protists: e... [[{'first_name': 'Drahomíra', 'last_name': 'Fa... 2020 jour.1033763 Nature Methods 481-494 17 5
3 pub.1127247220 article A SARS-CoV-2 protein interaction map reveals t... [[{'first_name': 'David E.', 'last_name': 'Gor... 2020 jour.1018957 Nature 1-13 NaN NaN
4 pub.1125560167 article Triple artemisinin-based combination therapies... [[{'first_name': 'Rob W', 'last_name': 'van de... 2020 jour.1077219 The Lancet 1345-1360 395 10233

Number of publications with more than one researcher.

[28]:
%%dsldf
search publications
where count(researchers) > 1
return funders limit 5
Returned Funders: 5
[28]:
id count types city_name longitude name country_name linkout acronym latitude state_name
0 grid.419696.5 1758857 [Government] Beijing 116.339830 National Natural Science Foundation of China China [http://www.nsfc.gov.cn/publish/portal1/] NSFC 40.005177 NaN
1 grid.270680.b 645606 [Government] Brussels 4.363670 European Commission Belgium [http://ec.europa.eu/index_en.htm] EC 50.851650 NaN
2 grid.424020.0 565529 [Government] Beijing 116.316284 Ministry of Science and Technology of the Peop... China [http://www.most.gov.cn/eng/] MOST 39.827835 NaN
3 grid.48336.3a 554556 [Government] Rockville -77.101190 National Cancer Institute United States [http://www.cancer.gov/] NCI 39.004326 Maryland
4 grid.54432.34 525799 [Nonprofit] Tokyo 139.740390 Japan Society for the Promotion of Science Japan [http://www.jsps.go.jp/] JSPS 35.687160 NaN

International collaborations: number of publications with more than one author and affiliations located in more than one country.

[29]:
%%dsldf
search publications
where count(researchers) > 1
and count(research_org_countries) > 1
return funders limit 5
Returned Funders: 5
[29]:
id count types city_name longitude name country_name linkout acronym latitude
0 grid.419696.5 433678 [Government] Beijing 116.339830 National Natural Science Foundation of China China [http://www.nsfc.gov.cn/publish/portal1/] NSFC 40.005177
1 grid.270680.b 331110 [Government] Brussels 4.363670 European Commission Belgium [http://ec.europa.eu/index_en.htm] EC 50.851650
2 grid.424150.6 150024 [Facility] Bonn 7.147797 German Research Foundation Germany [http://www.dfg.de/en/] DFG 50.699340
3 grid.424020.0 143572 [Government] Beijing 116.316284 Ministry of Science and Technology of the Peop... China [http://www.most.gov.cn/eng/] MOST 39.827835
4 grid.54432.34 132520 [Nonprofit] Tokyo 139.740390 Japan Society for the Promotion of Science Japan [http://www.jsps.go.jp/] JSPS 35.687160

Domestic collaborations: number of publications with more than one author and affiliations located in exactly one country.

[30]:
%%dsldf
search publications
where count(researchers) > 1
and count(research_org_countries) = 1
return funders limit 5
Returned Funders: 5
[30]:
id count types city_name longitude name country_name linkout acronym latitude state_name
0 grid.419696.5 1285232 [Government] Beijing 116.339830 National Natural Science Foundation of China China [http://www.nsfc.gov.cn/publish/portal1/] NSFC 40.005177 NaN
1 grid.424020.0 411382 [Government] Beijing 116.316284 Ministry of Science and Technology of the Peop... China [http://www.most.gov.cn/eng/] MOST 39.827835 NaN
2 grid.48336.3a 406370 [Government] Rockville -77.101190 National Cancer Institute United States [http://www.cancer.gov/] NCI 39.004326 Maryland
3 grid.54432.34 361012 [Nonprofit] Tokyo 139.740390 Japan Society for the Promotion of Science Japan [http://www.jsps.go.jp/] JSPS 35.687160 NaN
4 grid.280785.0 314257 [Facility] Bethesda -77.099380 National Institute of General Medical Sciences United States [http://www.nigms.nih.gov/Pages/default.aspx] NIGMS 38.997833 Maryland
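
The international and domestic queries above differ only in the comparison applied to count(research_org_countries). A small sketch of how both could be generated from one template follows; the function name collaboration_query and its parameters are hypothetical, and only the DSL text itself comes from the cells above.

```python
# The only part that changes between the two collaboration queries.
COUNTRY_FILTERS = {
    "international": "count(research_org_countries) > 1",
    "domestic": "count(research_org_countries) = 1",
}


def collaboration_query(kind, facet="funders", limit=5):
    """Return the DSL query for one kind of multi-author collaboration."""
    if kind not in COUNTRY_FILTERS:
        raise ValueError(f"unknown collaboration kind: {kind!r}")
    return (
        "search publications "
        "where count(researchers) > 1 "
        f"and {COUNTRY_FILTERS[kind]} "
        f"return {facet} limit {limit}"
    )


for kind in COUNTRY_FILTERS:
    print(kind, "->", collaboration_query(kind))
```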

3.4 Filter Operators

A simple filter expression consists of a field name, an in-/equality operator op, and the desired field value.

The value must be a string enclosed in double quotes (") or an integer (e.g. 1234).

The available operators are:

op            meaning
=             is (or contains if the given field is multi-value)
!=            is not
>             is greater than
<             is less than
>=            is greater than or equal to
<=            is less than or equal to
~             partially matches (see partial-string-matching below)
is empty      is empty (see emptiness-filters below)
is not empty  is not empty (see emptiness-filters below)

A couple of examples:

[31]:
%dsldf search datasets where year > 2010 and year < 2012 return datasets limit 5
Returned Datasets: 5 (total = 38341)
[31]:
authors keywords id title year journal.id journal.title
0 [{'name': 'Minna Väliranta', 'orcid': ''}, {'n... [PANGAEA] 10993892 (Table 1) Radiocarbon ages of samples taken fr... 2011 jour.1020344 Journal of Biogeography
1 [{'name': 'Charles-Edouard Thuróczy', 'orcid':... [PANGAEA] 10993247 Average fluorescence and dissolved iron and Fe... 2011 jour.1023157 Deep Sea Research Part II Topical Studies in O...
2 [{'name': 'Charles-Edouard Thuróczy', 'orcid':... [PANGAEA] 10993244 (Table 1) Average fluorescence in the surface ... 2011 NaN NaN
3 [{'name': 'Charles-Edouard Thuróczy', 'orcid':... [PANGAEA] 10993241 Dissolved and dissolvable iron concentrations ... 2011 jour.1312079 Journal of Geophysical Research
4 [{'name': 'Jean-François Therrien', 'orcid': '... [PANGAEA] 10993193 (Table 1) Movement parameters of nine adult fe... 2011 jour.1023041 Journal of Avian Biology
[32]:
%dsldf search patents where assignees != "grid.410484.d" return patents limit 5
Returned Patents: 5 (total = 39704493)
[32]:
publication_date inventor_names granted_year filing_status assignees assignee_names id title times_cited year
0 2009-12-09 [TUMBACK, STEFAN, SCHNELLE, KLAUS-PETER] 2009.0 Grant [{'id': 'grid.6584.f', 'city_name': 'Stuttgart... [Robert Bosch GmbH, BOSCH GMBH ROBERT] EP-1409282-B1 METHODS FOR OPERATING A MOTOR VEHICLE DRIVEN B... 0 2001
1 2009-12-10 [SHKEDI, ROY] NaN Application NaN [SHKEDI ROY] WO-2009149128-A2 TARGETED TELEVISION ADVERTISEMENTS ASSOCIATED ... 1 2009
2 2009-12-09 [RIVIELLO, JOHN, M., REY, MARIA, A.] 2009.0 Grant [{'id': 'grid.418190.5', 'acronym': 'Life Tech... [Dionex Corp, DIONEX CORP] EP-0868664-B1 MULTI-CYCLE LOOP INJECTION FOR TRACE ANALYSIS ... 0 1996
3 2009-12-09 [TANAKA, EIJI, HIGASHI, TAMIO, KITAMURA, TAKAN... 2009.0 Grant [{'id': 'grid.471210.1', 'city_name': 'Tokyo',... [Kuraray Co Ltd, KURARAY CO] EP-0861808-B1 Waste water treatment apparatus 1 1998
4 2009-12-09 [NAKAI, MICHIHIRO, SHIMA, KENSUKE, HIDAKA, HIR... 2009.0 Grant [{'id': 'grid.471143.4', 'city_name': 'Tokyo',... [Fujikura Ltd, FUJIKURA LTD] EP-0805365-B1 Optical waveguide grating and production metho... 0 1997
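
The operator rules listed in this section can also be checked before a query is sent. The sketch below is one possible way to do that; filter_expr is a hypothetical helper, not a Dimcli function, and the validation mirrors only what this section states (values are quoted strings or integers, ~ needs a string, and the emptiness operators take no value).

```python
COMPARISON_OPS = {"=", "!=", ">", "<", ">=", "<=", "~"}
EMPTINESS_OPS = {"is empty", "is not empty"}


def filter_expr(field, op, value=None):
    """Build a single filter expression after validating the operator and value."""
    if op in EMPTINESS_OPS:
        if value is not None:
            raise ValueError(f"'{op}' does not take a value")
        return f"{field} {op}"
    if op not in COMPARISON_OPS:
        raise ValueError(f"unsupported operator: {op!r}")
    if isinstance(value, bool) or not isinstance(value, (int, str)):
        raise TypeError("value must be a string or an integer")
    if op == "~" and not isinstance(value, str):
        raise TypeError("the ~ operator requires a string value")
    if isinstance(value, str):
        # Strings go in double quotes; nested quotes are backslash-escaped.
        rendered = '"' + value.replace('"', '\\"') + '"'
    else:
        rendered = str(value)
    return f"{field} {op} {rendered}"


# Rebuild the filter part of cell [31] from two simple expressions.
filters = " and ".join([filter_expr("year", ">", 2010), filter_expr("year", "<", 2012)])
query = f"search datasets where {filters} return datasets limit 5"
print(query)
print(filter_expr("assignees", "!=", "grid.410484.d"))
```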

3.5 Partial string matching with ~

The ~ operator indicates that the given field need only partially, instead of exactly, match the given string (the value used with this operator must be a string, not an integer).

For example, the filter where research_orgs.name~"Saarland Uni" would match both the organization named “Saarland University” and the one named “Universitätsklinikum des Saarlandes”, and any other organization whose name includes the terms “Saarland” and “Uni” (the order is unimportant).

[33]:
%%dsldf
search patents
    where assignee_names ~ "IBM"
return assignees limit 5
Returned Assignees: 5
[33]:
id count city_name name country_name
0 grid.410484.d 329418 Armonk IBM (United States) United States
1 grid.471366.1 22089 George Town GlobalFoundries (Cayman Islands) Cayman Islands
2 grid.14648.3f 5071 Winchester IBM (United Kingdom) United Kingdom
3 grid.420451.6 3555 Mountain View Google United States
4 grid.472772.3 2717 Beijing Lenovo (China) China
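
To build an intuition for the matching rule described above (every term must occur somewhere in the field, in any order), it can be approximated locally in plain Python. This is only an illustrative approximation under the assumption that terms are matched as case-insensitive substrings; the actual matching happens on the server and may tokenize text differently. The function name partially_matches is hypothetical.

```python
def partially_matches(field_value, query):
    """Return True if every whitespace-separated term of query occurs in field_value."""
    terms = query.split()
    if not terms:
        return False
    haystack = field_value.casefold()
    # Order is unimportant: each term is checked independently.
    return all(term.casefold() in haystack for term in terms)


names = [
    "Saarland University",
    "Universitätsklinikum des Saarlandes",
    "University of Oxford",
]
matches = [name for name in names if partially_matches(name, "Saarland Uni")]
print(matches)
```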

3.6 Emptiness filters is empty

To select records in which a given field is populated, or records in which it is empty, use filters such as where research_orgs is not empty or where issn is empty.

[34]:
%%dsldf
search publications
    for "iron graphene"
    where researchers is empty
    and research_orgs is not empty
return publications[id+title+researchers+research_orgs+type] limit 5
Returned Publications: 5 (total = 2066)
[34]:
type research_orgs id title
0 article [{'id': 'grid.411507.6', 'country_name': 'Indi... pub.1127980991 Sensitive determination of kojic acid in tomat...
1 article [{'id': 'grid.411507.6', 'country_name': 'Indi... pub.1127901191 Copper oxide immobilized clay nano architectur...
2 article [{'id': 'grid.33764.35', 'country_name': 'Chin... pub.1125095130 Molecular Dynamics Simulations of Melting Iron...
3 article [{'id': 'grid.411510.0', 'country_name': 'Chin... pub.1124438091 Sulfur-Doped Alkylated Graphene Oxide as High-...
4 article [{'id': 'grid.410726.6', 'country_name': 'Chin... pub.1127875464 Application of Raman spectroscopy to probe fun...

4. Searching for Researchers

The DSL offers different mechanisms for searching for researchers (e.g. publication authors, grant investigators), each of them presenting specific advantages.

4.1 Exact name searches

Special full-text indices allow you to look up a researcher’s name and surname exactly as they appear in the source documents they derive from.

This approach has a broad scope, as it searches the full collection of Dimensions documents irrespective of whether a researcher was successfully disambiguated (and hence given a Dimensions ID). On the other hand, it only matches names as they appear in the source document, so different spellings or initials are not necessarily returned by a single query.

search in [authors|investigators|inventors]

It is possible to look up publication authors using a specific search index called authors.

This method expects case-insensitive phrases, in the format "\"<first name> <last name>\"" or in reverse order. Note that nested quotes inside a double-quoted string must always be escaped with a backslash (\).

[35]:
%dsldf search publications in authors for "\"Charles Peirce\"" return publications limit 5
Returned Publications: 5 (total = 229)
[35]:
title author_affiliations id year type pages
0 26. Assurance through Reasoning [[{'first_name': 'Charles S.', 'last_name': 'P... pub.1123488542 2019 chapter 565-585
1 Abbreviations of Peirce’s Works and Archives [[{'first_name': 'Charles S.', 'last_name': 'P... pub.1123488550 2019 chapter x-xii
2 5. On Logical Graphs [[{'first_name': 'Charles S.', 'last_name': 'P... pub.1123488521 2019 chapter 211-261
3 12. Peripatetic Talks [[{'first_name': 'Charles S.', 'last_name': 'P... pub.1123488528 2019 chapter 348-366
4 14. On the First Principles of Logical Algebra [[{'first_name': 'Charles S.', 'last_name': 'P... pub.1123488530 2019 chapter 385-398

Initials can also be used instead of the first name. These are examples of valid researcher search phrases:

  • \"Peirce, Charles S.\"

  • \"Charles S. Peirce\"

  • \"CS Peirce\"

  • \"Peirce CS\"

  • \"C S Peirce\"

  • \"Peirce C S\"

  • \"C Peirce\"

  • \"Peirce C\"

  • \"Charles Peirce\"

  • \"Peirce Charles\"
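
Since an exact name search only matches one spelling at a time, it can be useful to generate the variants listed above and run one query per variant. The sketch below produces the same ten forms for a given name; name_variants is a hypothetical helper and handles only a single middle initial.

```python
def name_variants(first, last, middle_initial=None):
    """Return the name spellings listed above for one researcher."""
    f = first[0]
    variants = []
    if middle_initial:
        m = middle_initial[0]
        variants += [
            f"{last}, {first} {m}.",
            f"{first} {m}. {last}",
            f"{f}{m} {last}",
            f"{last} {f}{m}",
            f"{f} {m} {last}",
            f"{last} {f} {m}",
        ]
    variants += [
        f"{f} {last}",
        f"{last} {f}",
        f"{first} {last}",
        f"{last} {first}",
    ]
    return variants


for variant in name_variants("Charles", "Peirce", "S"):
    # Each variant still needs escaped quotes when placed in a DSL query.
    print('\\"' + variant + '\\"')
```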

Warning: in order to produce valid results, an author or investigator search query must contain at least two components (e.g., name and surname, either in full or as initials).

An investigators search works like an authors search, except that it applies to grants and clinical trials through a separate search index called investigators. Patents can be searched in the same way using the index inventors.

[36]:
%%dsldf
search clinical_trials in investigators for "\"John Smith\""
return clinical_trials limit 5
Returned Clinical_trials: 2 (total = 2)
[36]:
active_years id investigator_details title
0 [2008, 2009, 2010, 2011, 2012, 2013, 2014, 201... NCT00689533 [[John M Flynn, MD, Principal Investigator, Ch... VEPTR Implantation to Treat Children With Earl...
1 NaN NCT01241149 [[Ellie Mentler, MD, Principal Investigator, U... Prospective Evaluation of Symptom Resolution i...
[37]:
%%dsldf
search grants in investigators for "\"Satoko Shimazaki\""
return grants limit 5
Returned Grants: 4 (total = 4)
[37]:
start_date title end_date title_language project_num id funders original_title funding_org_name start_year language active_year
0 2020-09-01 Kabuki Actors, Print Technology, and the Theat... 2021-08-31 en FEL-263245-19 grant.7925589 [{'id': 'grid.422239.c', 'types': ['Government... Kabuki Actors, Print Technology, and the Theat... National Endowment for the Humanities 2020 en [2020, 2021]
1 2018-04-01 Genealogy research on female saints in the Pal... 2021-03-31 ja 18K00431 grant.7527261 [{'id': 'grid.54432.34', 'types': ['Nonprofit'... 古・中英語期における女性聖人伝の系譜研究:Aelfricのテクストと言語を中心に Japan Society for the Promotion of Science 2018 ja [2018, 2019, 2020, 2021]
2 2015-04-01 Images of Women in the Old English Lives of Sa... 2018-03-31 en 15K02313 grant.5858713 [{'id': 'grid.54432.34', 'types': ['Nonprofit'... Images of Women in the Old English Lives of Sa... Japan Society for the Promotion of Science 2015 en [2015, 2016, 2017, 2018]
3 2012-04-01 Reception and Transfromation of the Images of ... 2015-03-31 en 24520310 grant.6086985 [{'id': 'grid.54432.34', 'types': ['Nonprofit'... Reception and Transfromation of the Images of ... Japan Society for the Promotion of Science 2012 en [2012, 2013, 2014, 2015]
[38]:
%%dsldf
search patents in inventors for "\"John Smith\""
return patents limit 5
Returned Patents: 5 (total = 501)
[38]:
title times_cited filing_status publication_date id year assignee_names assignees inventor_names granted_year
0 Diagnostic method 0 Application 2002-10-31 US-20020160362-A1 2001 [AstraZeneca AB, SMITH JOHN CRAIG] [{'id': 'grid.418151.8', 'city_name': 'Södertä... [John Smith] NaN
1 Automotive heat exchanger 0 Grant 2006-03-22 GB-2384299-B 2002 [Llanelli Radiators Ltd, Calsonic Kansei UK Lt... [{'id': 'grid.472810.8', 'city_name': 'Llanell... [SMITH JOHN] 2006.0
2 Microelectronic assemblies with composite cond... 2 Application 2005-06-23 US-20050133900-A1 2005 [Tessera Inc, TESSERA INC] [{'id': 'grid.455499.0', 'city_name': 'San Jos... [John Smith] NaN
3 A lockable safety insert for an electrical dom... 0 Grant 2004-11-03 IE-S20030195-A2 2003 [SMITH JOHN] NaN [SMITH JOHN] 2004.0
4 Ammunition cartridge 0 Application 2014-10-22 GB-2513101-A 2013 [Eley Ltd, ELEY LTD] NaN [SMITH JOHN] NaN
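
The four exact-name queries in this section share the same shape and differ only in the source and in the name index. A minimal sketch of a query builder that picks the index for a given source follows; NAME_INDEX and name_search_query are hypothetical names, and the mapping simply restates the source/index pairs used in the cells above.

```python
# Name index to use for each source, as used in the examples above.
NAME_INDEX = {
    "publications": "authors",
    "grants": "investigators",
    "clinical_trials": "investigators",
    "patents": "inventors",
}


def name_search_query(source, name, limit=5):
    """Build an exact-name search for a source, using the matching name index."""
    if source not in NAME_INDEX:
        raise ValueError(f"no name index known for source {source!r}")
    if len(name.replace(",", " ").split()) < 2:
        raise ValueError("a name search needs at least two components")
    # The phrase is wrapped in escaped quotes inside the double-quoted string.
    phrase = '\\"' + name + '\\"'
    return f'search {source} in {NAME_INDEX[source]} for "{phrase}" return {source} limit {limit}'


print(name_search_query("patents", "John Smith"))
print(name_search_query("grants", "Satoko Shimazaki"))
```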

4.2 Fuzzy Searches

This type of search is similar to full-text search, with the difference that it allows searching by only a part of a name, e.g. only the ‘last name’ of a person, by using the where clause.

Note: at the moment, this type of search is only available for publications. Other sources will add this option in the future.

For example:

[39]:
%%dsldf
search publications where authors = "Hawking"
return publications[id+doi+title+authors] limit 10

Generally speaking, using a where clause to search authors is less precise than using the relevant exact-search syntax.

On the other hand, using a where clause can be handy if one wants to combine an author search with another full-text search index.

For example:

[40]:
%%dsldf
search publications
    in title_abstract_only for "dna replication"
    where authors = "smith"
return publications limit 5
Returned Publications: 5 (total = 1527)
[40]:
pages id type title author_affiliations volume year issue journal.id journal.title
0 37 pub.1124910780 article Genetic associations with clozapine-induced my... [[{'first_name': 'Paul', 'last_name': 'Lacaze'... 10 2020 1 jour.1045271 Translational Psychiatry
1 46 pub.1125664041 article An epigenome-wide association study of posttra... [[{'first_name': 'Mark W.', 'last_name': 'Logu... 12 2020 1 jour.1042271 Clinical Epigenetics
2 11 pub.1124060243 article Longitudinal epigenome-wide association studie... [[{'first_name': 'Clara', 'last_name': 'Snijde... 12 2020 1 jour.1042271 Clinical Epigenetics
3 250-256 pub.1126387158 article Molecular Targeting of Cancer-Associated PCNA ... [[{'first_name': 'Shanna J.', 'last_name': 'Sm... 17 2020 NaN jour.1052368 Molecular Therapy - Oncolytics
4 rna.073114.119 pub.1125466205 article Reciprocal monoallelic expression of ASAR lncR... [[{'first_name': 'Michael', 'last_name': 'Hesk... 26 2020 6 jour.1114285 RNA

4.3 Using the disambiguated Researchers database

The Dimensions Researchers source is a database of researcher information algorithmically extracted and disambiguated from all of the other content sources (publications, grants, clinical trials etc.).

By using the researchers source it is possible to match an ‘aggregated’ person object linking together multiple publication authors, grant investigators etc., irrespective of the form their names take in the original source documents.

However, this database does not contain all the author and investigator information available in Dimensions.

Think, for example, of authors of older publications, authors with very common names that are difficult to disambiguate, or very new authors who have only one or a few publications. In such cases, using a full-text authors search might be more appropriate.

Examples:

[41]:
%%dsldf
search researchers for "\"Satoko Shimazaki\""
return researchers[basics+obsolete]
Returned Researchers: 4 (total = 4)
[41]:
id obsolete first_name last_name research_orgs
0 ur.014307627665.09 0 Satoko Shimazaki [{'id': 'grid.19006.3e', 'types': ['Education'...
1 ur.010537333602.30 1 Satoko Shimazaki NaN
2 ur.07751146721.59 0 Satoko Shimazaki NaN
3 ur.015527473602.63 0 Satoko Shimazaki [{'id': 'grid.266190.a', 'types': ['Education'...

NOTE: pay attention to the obsolete field, which indicates the status of the researcher ID. 0 means that the researcher ID is still active, 1 means that it is no longer valid. This is due to the ongoing process of refinement of Dimensions researchers.

Hence the query above is best written like this:

[42]:
%%dsldf
search researchers where obsolete=0 for "\"Satoko Shimazaki\""
return researchers[basics+obsolete]
Returned Researchers: 3 (total = 3)
[42]:
id research_orgs first_name last_name obsolete
0 ur.014307627665.09 [{'id': 'grid.19006.3e', 'acronym': 'UCLA', 's... Satoko Shimazaki 0
1 ur.07751146721.59 NaN Satoko Shimazaki 0
2 ur.015527473602.63 [{'id': 'grid.266190.a', 'acronym': 'UCB', 'st... Satoko Shimazaki 0
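
If researcher records were already retrieved without the obsolete=0 filter, the same selection can be applied afterwards on the client side. The sketch below works on a hand-copied subset of the first result table above (only the id and obsolete columns); active_only is a hypothetical helper name.

```python
# Subset of the unfiltered result shown in cell [41].
records = [
    {"id": "ur.014307627665.09", "obsolete": 0},
    {"id": "ur.010537333602.30", "obsolete": 1},
    {"id": "ur.07751146721.59", "obsolete": 0},
    {"id": "ur.015527473602.63", "obsolete": 0},
]


def active_only(rows):
    """Keep only researcher records whose ID is still active (obsolete == 0)."""
    return [row for row in rows if row.get("obsolete") == 0]


active_ids = [row["id"] for row in active_only(records)]
print(active_ids)
```

Filtering in the query itself remains preferable where possible, since it avoids transferring records that are then discarded.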

With Researchers, one can use other fields as well:

[43]:
%%dsldf
search researchers
    where obsolete=0 and last_name="Shimazaki"
return researchers[basics] limit 5
Returned Researchers: 5 (total = 468)
[43]:
id research_orgs first_name last_name
0 ur.013510032403.65 [{'id': 'grid.419075.e', 'acronym': 'ARC', 'st... Tatsuo Shimazaki
1 ur.010700310627.87 [{'id': 'grid.471199.3', 'city_name': 'Kyoto',... Tomomi Shimazaki
2 ur.011035131473.19 [{'id': 'grid.415776.6', 'acronym': 'NIPH', 'c... Dai Shimazaki
3 ur.016627632300.80 NaN Koji Shimazaki
4 ur.013205240215.48 [{'id': 'grid.420062.2', 'city_name': 'Tokyo',... Toshiyuki Shimazaki

5. Returning results

After the search phrase, a query must contain one or more return phrases, specifying the content and format of the information that should be returned.

5.1 Returning Multiple Sources

A single return phrase cannot return multiple results. To return several sources or facets from one search, add one return phrase for each of them.

[44]:
%%dsldf
search publications
return funders limit 5
return research_orgs limit 5
return year
Returned Year: 20
Returned Funders: 5
Returned Research_orgs: 5
[Warning] Dataframe created from first available key, but more than one JSON key found: ['year', 'funders', 'research_orgs']
[44]:
id count
0 2019 5488513
1 2018 5118512
2 2017 4789054
3 2016 4403893
4 2015 4219869
5 2014 4077817
6 2013 3885112
7 2012 3624608
8 2011 3506759
9 2010 3090707
10 2009 2960156
11 2007 2804855
12 2008 2789225
13 2020 2506606
14 2006 2496433
15 2005 2283046
16 2004 2163862
17 2003 1972500
18 2002 1841789
19 2001 1779842
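
As the warning above indicates, a query with several return phrases yields one JSON key per returned item, and only the first one ends up in the dataframe. A rough sketch of how such a response could be split into separate tables is shown below. The sample dictionary is hand-written and only mimics the shape suggested by the warning: the year rows are copied from the output above, the other lists are left empty on purpose, and the underscore-prefixed metadata key is an assumption about how the response is structured.

```python
def split_result(result_json):
    """Return one list of rows per returned key, skipping metadata entries."""
    return {
        key: rows
        for key, rows in result_json.items()
        if not key.startswith("_") and isinstance(rows, list)
    }


# Hand-written stand-in for a multi-return response (not real API output).
sample = {
    "_stats": {},  # assumed metadata entry, ignored by split_result
    "year": [{"id": 2019, "count": 5488513}, {"id": 2018, "count": 5118512}],
    "funders": [],
    "research_orgs": [],
}

tables = split_result(sample)
for key, rows in tables.items():
    print(key, len(rows))
```

Each entry of tables can then be converted to its own dataframe, for example with pd.DataFrame(rows).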

5.2 Returning Specific Fields

For control over which information from each given record will be returned, a source or entity name in the results phrase can be optionally followed by a specification of fields and fieldsets to be included in the JSON results for each retrieved record.

The fields specification may be an arbitrary list of field names enclosed in brackets ([, ]), with field names separated by a plus sign (+). A minus sign (-) can be used to exclude a field or a fieldset from the result. Field names listed within brackets must be “known” to the DSL, and therefore only a subset of fields may be used in this syntax (see note below).

[45]:
%%dsldf
search grants
return grants[grant_number + title + language] limit 5
Returned Grants: 5 (total = 5310256)
[45]:
grant_number title language
0 2018-HRSI-1548 APPROACH to Enriching the Real World Evidence ... en
1 1301720F Molecular mechanism of DNA double strand break... en
2 M 2734 Life as concept and as science en
3 892933 Scintillation Light For New Physics with Liqui... en
4 893021 Jet quenching for heavy-ion collisions at the LHC en
[46]:
%%dsldf
search clinical_trials
return clinical_trials [id+ title + acronym + phase] limit 5
Returned Clinical_trials: 5 (total = 562451)
[46]:
id title phase acronym
0 NCT00249756 Re-Entry MTC for Offenders With MICA Disorders NaN NaN
1 NCT00249782 A Phase II, Randomized, Partial-Blind, Paralle... Phase 2 NaN
2 NCT00249795 A Parallel Randomized Controlled Evaluation of... Phase 3 ACTIVE I
3 NCT00249847 A Feasibility Study of Positron Emission Tomog... NaN NaN
4 NCT00249860 A Multicentre Phase III Study of Interferon-be... Phase 3 NaN

Shortcuts: fieldsets

The fields specification may be the name of a pre-defined fieldset (e.g. extras, basics). These are shortcuts that can be handy when testing out new queries, for example.

NOTE: in general, when writing code used in integrations or long-standing extraction scripts, it is best to return specific fields rather than a predefined set. This also has the advantage of making queries faster by avoiding the extraction of unnecessary data.

[47]:
%%dsldf
search grants
return grants [basics] limit 5
Returned Grants: 5 (total = 5310256)
WARNINGS [2]
Field 'project_num' is deprecated in favor of grant_number. Please refer to https://docs.dimensions.ai/dsl/releasenotes.html for more details
Field 'title_language' is deprecated in favor of language_title. Please refer to https://docs.dimensions.ai/dsl/releasenotes.html for more details
[47]:
start_date title language project_num id funders original_title funding_org_name start_year title_language active_year end_date
0 2021-11-30 APPROACH to Enriching the Real World Evidence ... en 2018-HRSI-1548 grant.8690978 [{'id': 'grid.484521.e', 'types': ['Nonprofit'... APPROACH to Enriching the Real World Evidence ... New Brunswick Health Research Foundation 2021 en [2021] NaN
1 2021-10-01 Molecular mechanism of DNA double strand break... en 1301720F grant.8950252 [{'id': 'grid.424470.1', 'types': ['Nonprofit'... Mécanismes moléculaires de la formation et la ... Fund for Scientific Research 2021 en [2021] NaN
2 2021-10-01 Life as concept and as science en M 2734 grant.8715161 [{'id': 'grid.25111.36', 'types': ['Nonprofit'... Life as concept and as science FWF Austrian Science Fund 2021 en [2021, 2022, 2023] 2023-09-30
3 2021-09-01 Scintillation Light For New Physics with Liqui... en 892933 grant.8964235 [{'id': 'grid.270680.b', 'types': ['Government... Scintillation Light For New Physics with Liqui... European Commission 2021 en [2021, 2022, 2023] 2023-08-31
4 2021-09-01 Jet quenching for heavy-ion collisions at the LHC en 893021 grant.8963889 [{'id': 'grid.270680.b', 'types': ['Government... Jet quenching for heavy-ion collisions at the LHC European Commission 2021 en [2021, 2022, 2023] 2023-08-31
[48]:
%%dsldf
search publications
return publications [basics+times_cited] limit 5
Returned Publications: 5 (total = 110113720)
WARNINGS [1]
Field 'author_affiliations' is deprecated in favor of authors. Please refer to https://docs.dimensions.ai/dsl/releasenotes.html for more details
[48]:
type times_cited pages author_affiliations year id title journal.id journal.title issue volume
0 article 0 1-18 [[{'first_name': 'Nihal', 'last_name': 'ATA TU... 2020 pub.1125931386 Visual research on the trustability of classic... jour.1142190 Hacettepe Journal of Mathematics and Statistics NaN NaN
1 chapter 0 21-48 [[{'first_name': 'Nienke', 'last_name': 'Bakke... 2020 pub.1125801740 2. The Sunflowers in Perspective NaN NaN NaN NaN
2 chapter 0 333-349 NaN 2020 pub.1125632078 Literature NaN NaN NaN NaN
3 monograph 32 NaN [[{'first_name': 'Jochen', 'last_name': 'Taupi... 2020 pub.1096916023 Die Standesordnungen der freien Berufe NaN NaN NaN NaN
4 article 0 1711335 [[{'first_name': 'Nathaly', 'last_name': 'Aya ... 2020 pub.1124196727 The gender responsiveness of social marketing ... jour.1041075 Global Health Action 1 13
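
The fields specification described in this section (names joined with +, exclusions marked with -, and fieldsets such as basics) lends itself to being generated from Python lists, which helps keep long-standing scripts explicit about the fields they request. The following is a sketch under those stated rules only; return_phrase is a hypothetical helper, and whether a particular field may be excluded from a particular fieldset is not checked here.

```python
def return_phrase(source, include=(), exclude=(), limit=None):
    """Build a return phrase such as: return grants[grant_number+title] limit 5."""
    phrase = f"return {source}"
    if include:
        spec = "+".join(include)
        # Exclusions are appended with a minus sign after the included names.
        spec += "".join(f"-{name}" for name in exclude)
        phrase += f"[{spec}]"
    elif exclude:
        raise ValueError("exclusions need at least one included field or fieldset")
    if limit is not None:
        phrase += f" limit {limit}"
    return phrase


print(return_phrase("grants", ["grant_number", "title", "language"], limit=5))
print(return_phrase("publications", ["basics", "times_cited"], limit=5))
print(return_phrase("publications", ["basics"], exclude=["title"]))
```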

The fields specification may also be all (written [all]), to indicate that all fields available for the given source should be returned.

[49]:
%%dsldf
search publications
return publications [all] limit 5
Returned Publications: 5 (total = 110113720)
WARNINGS [10]
Field 'FOR_first' is deprecated in favor of category_for. Please refer to https://docs.dimensions.ai/dsl/releasenotes.html for more details
Field 'FOR' is deprecated in favor of category_for. Please refer to https://docs.dimensions.ai/dsl/releasenotes.html for more details
Field 'author_affiliations' is deprecated in favor of authors. Please refer to https://docs.dimensions.ai/dsl/releasenotes.html for more details
Field 'terms' is deprecated in favor of concepts. Please refer to https://docs.dimensions.ai/dsl/releasenotes.html for more details
Field 'RCDC' is deprecated in favor of category_rcdc. Please refer to https://docs.dimensions.ai/dsl/releasenotes.html for more details
Field 'HRCS_RAC' is deprecated in favor of category_hrcs_rac. Please refer to https://docs.dimensions.ai/dsl/releasenotes.html for more details
Field 'HRCS_HC' is deprecated in favor of category_hrcs_hc. Please refer to https://docs.dimensions.ai/dsl/releasenotes.html for more details
Field 'open_access' is deprecated in favor of open_access_categories. Please refer to https://docs.dimensions.ai/dsl/releasenotes.html for more details
Field 'references' is deprecated in favor of reference_ids. Please refer to https://docs.dimensions.ai/dsl/releasenotes.html for more details
Field 'category_ua' is deprecated in favor of category_uoa. Please refer to https://docs.dimensions.ai/dsl/releasenotes.html for more details
[49]:
date_inserted references reference_ids researchers linkout id pages concepts_scores concepts year ... volume RCDC category_rcdc research_org_country_names research_org_countries research_org_names research_org_cities category_hra issue research_orgs
0 2020-03-28 [pub.1107763504, pub.1061471419, pub.100981774... [pub.1107763504, pub.1061471419, pub.100981774... [{'id': 'ur.015425340575.47', 'first_name': 'N... https://dergipark.org.tr/tr/download/article-f... pub.1125931386 1-18 [{'concept': 'Cox regression', 'relevance': 0.... [Cox regression, regression, research, method,... 2020 ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
1 2020-03-22 NaN NaN NaN NaN pub.1125801740 21-48 [{'concept': 'perspective', 'relevance': 0.055... [perspective, sunflower] 2020 ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
2 2020-03-15 NaN NaN NaN NaN pub.1125632078 333-349 NaN NaN 2020 ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
3 2017-12-07 NaN NaN NaN NaN pub.1096916023 NaN NaN NaN 2020 ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
4 2020-01-21 [pub.1038918292, pub.1013186597, pub.101488649... [pub.1038918292, pub.1013186597, pub.101488649... [{'id': 'ur.07430064243.75', 'first_name': 'Na... https://www.tandfonline.com/doi/pdf/10.1080/16... pub.1124196727 1711335 [{'concept': 'social marketing interventions',... [social marketing interventions, tropical dise... 2020 ... 13 [{'id': '498', 'name': 'Behavioral and Social ... [{'id': '498', 'name': 'Behavioral and Social ... [Switzerland] [{'id': 'CH', 'name': 'Switzerland'}] [Universita della Svizzera Italiana, Graduate ... [{'id': 2657896, 'name': 'Zürich'}, {'id': 265... [{'id': '3903', 'name': 'Population & Society'}] 1 [{'id': 'grid.424404.2', 'types': ['Education'...

5 rows × 53 columns

5.3 Returning Facets

In addition to returning source records matching a query, it is possible to facet on the entity fields related to a particular source and return only those entity values as an aggregated view of the related source data. This operation is similar to a group by or a pivot table.

Warning: Faceting can return at most 1000 results. This ensures adequate performance for all queries. Furthermore, although the limit operator is allowed, the skip operator cannot be used.
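To make the group-by analogy concrete, here is a minimal offline sketch in pandas. It does not call the API: the records and the column names (pub_id, org) are made up for illustration, and the result only mimics the general shape of a facet result (an entity value plus a count).

```python
import pandas as pd

# Hypothetical toy records: one row per (publication, research organization) link.
links = pd.DataFrame({
    "pub_id": ["p1", "p1", "p2", "p3", "p3", "p4"],
    "org":    ["A",  "B",  "A",  "A",  "C",  "B"],
})

# Faceting on 'org' is comparable to grouping by org, counting the distinct
# publications in each group, ordering by that count and keeping the top N.
facet = (
    links.groupby("org")["pub_id"]
    .nunique()
    .rename("count")
    .sort_values(ascending=False, kind="stable")
    .head(2)            # plays the role of 'limit 2'
    .reset_index()
)
print(facet)
```

Note that a publication linked to several organizations contributes to the count of each of them, so facet counts do not have to add up to the number of matching publications.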

[50]:
%%dsldf
search publications
    for "coronavirus"
return research_orgs limit 5
Returned Research_orgs: 5
[50]:
id count country_name name longitude state_name city_name latitude linkout types acronym
0 grid.194645.b 984 China University of Hong Kong 114.13708 Hong Kong Hong Kong 22.283287 [http://www.hku.hk/] [Education] HKU
1 grid.21107.35 827 United States Johns Hopkins University -76.62028 Maryland Baltimore 39.328888 [https://www.jhu.edu/] [Education] JHU
2 grid.38142.3c 760 United States Harvard University -71.11665 Massachusetts Cambridge 42.377052 [http://www.harvard.edu/] [Education] NaN
3 grid.25879.31 725 United States University of Pennsylvania -75.19322 Pennsylvania Philadelphia 39.952457 [http://www.upenn.edu/] [Education] NaN
4 grid.4991.5 703 United Kingdom University of Oxford -1.25401 Oxfordshire Oxford 51.753437 [http://www.ox.ac.uk/] [Education] NaN
[51]:
%%dsldf
search publications
    for "coronavirus"
return research_org_countries limit 5
return year limit 5
return category_for limit 5
Returned Category_for: 5
Returned Research_org_countries: 5
Returned Year: 5
[Warning] Dataframe created from first available key, but more than one JSON key found: ['category_for', 'research_org_countries', 'year']
[51]:
   id    count  name
0  2211  61716  11 Medical and Health Sciences
1  2206  21254  06 Biological Sciences
2  3114  19179  1108 Medical Microbiology
3  3053  15688  1103 Clinical Sciences
4  3177  15199  1117 Public Health and Health Services

For control over the organization and headers of the JSON query results, the return keyword in a return phrase may be followed by the keyword in and then a name for this group of results, enclosed in double quotes (").

Also, one can define aliases that replace the default JSON field names with user-provided ones.

See the official documentation for more details about this feature.

[52]:
%%dsldf
search publications
return in "facets" funders
return in "facets" research_orgs
Returned Facets: 2
[52]:
funders research_orgs
0 [{'id': 'grid.419696.5', 'count': 1951296, 'ty... [{'id': 'grid.26999.3d', 'count': 325233, 'typ...

5.4 What the query statistics refer to - sources VS facets

When performing a DSL search, a _stats object is returned, which contains some useful information, e.g. the total number of records available for a search.

[53]:
%%dsldf
search publications
  where year in [2013:2018] and research_orgs="grid.258806.1"
return publications limit 5
Returned Publications: 5 (total = 3768)
[53]:
type pages author_affiliations issue volume year id title journal.id journal.title
0 article 18124-18131 [[{'first_name': 'Siewteng', 'last_name': 'Sim... 12 3 2018 pub.1110885950 Development of Organo-Dispersible Graphene Oxi... jour.1157000 ACS Omega
1 proceeding NaN [[{'first_name': 'T.', 'last_name': 'Miyagi', ... NaN NaN 2018 pub.1110925389 Nuclear Ab Initio Calculations with the Unitar... NaN NaN
2 article 29200-29209 [[{'first_name': 'Taro', 'last_name': 'Toyoda'... 51 122 2018 pub.1110369527 Anisotropic Crystal Growth, Optical Absorption... jour.1038386 The Journal of Physical Chemistry C
3 article 28491-28496 [[{'first_name': 'Liang', 'last_name': 'Wang',... 50 122 2018 pub.1110271601 Indium Zinc Oxide Electron Transport Layer for... jour.1038386 The Journal of Physical Chemistry C
4 article 43682-43690 [[{'first_name': 'Ami', 'last_name': 'Nomura',... 50 10 2018 pub.1110222625 Chalcopyrite ZnSnSb2: A Promising Thermoelectr... jour.1041450 ACS Applied Materials & Interfaces

It is important to note though that the total number always refers to the main source, never the facets one is searching for.

For example, in this query we return researchers linked to publications:

[54]:
%%dsldf
search publications
  where year in [2013:2018] and research_orgs="grid.258806.1"
return researchers limit 5
Returned Researchers: 5
[54]:
id count research_orgs first_name last_name orcid_id
0 ur.01055753603.27 138 [grid.14003.36, grid.266298.1, grid.258806.1, ... Shuzi Shuzi Hayase NaN
1 ur.011212042763.67 102 [grid.258806.1, grid.27476.30, grid.462727.2] Masayuki Hikita NaN
2 ur.01144540527.52 98 [grid.258806.1, grid.177174.3, grid.11135.37, ... Ting-Li Ma [0000-0002-3310-459X]
3 ur.07644453127.11 96 [grid.258806.1, grid.471634.3, grid.11417.32, ... M Kozako M Kozako NaN
4 ur.016357156077.09 91 [grid.54432.34, grid.454850.8, grid.268415.c, ... Huimin Lu [0000-0001-9794-3221]

NOTE: a query can return at most 1000 facet results (due to performance limitations), so if there are more than 1000 it is not possible to know the total number.
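The distinction can be sketched with a hand-written response. The dictionary below is hypothetical: it reuses the total and the first two researcher rows from the examples above, and the key name total_count inside _stats is an assumption about the JSON layout rather than something shown in this tutorial. The helper facet_total is likewise illustrative.

```python
FACET_CAP = 1000  # maximum number of facet results, as stated in the note above

# Hypothetical response shape; 'total_count' is an assumed key name.
response = {
    "_stats": {"total_count": 3768},          # counts publications (the main source)
    "researchers": [                          # facet rows, each with its own count
        {"id": "ur.01055753603.27", "count": 138},
        {"id": "ur.011212042763.67", "count": 102},
    ],
}

def facet_total(facet_rows, requested_limit=FACET_CAP):
    """Return the exact number of facet values, or None when it cannot be known.

    If fewer rows than the requested limit came back, the list is complete.
    If the list is exactly as long as the limit, more values may exist.
    """
    if len(facet_rows) < requested_limit:
        return len(facet_rows)
    return None

print("publications matching the search:", response["_stats"]["total_count"])
print("researchers (if knowable):", facet_total(response["researchers"]))
```

In other words, the statistics describe how many publications matched, while the number of distinct researchers can only be inferred from the length of the facet list, and only when that list is shorter than the limit requested.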

5.5 Paginating Results

At the end of a return phrase, the user can specify the maximum number of results to be returned and the number of top records to skip over before returning the first result record. This makes it possible, for example, to retrieve large result sets page by page (i.e. “paging” results), as described below.

This is done using the keyword limit followed by the maximum number of results to return, optionally followed by the keyword skip and the number of results to skip (the offset).

[55]:
%%dsldf
search publications return publications limit 10
Returned Publications: 10 (total = 110113720)
[55]:
title author_affiliations id year type pages journal.id journal.title issue volume
0 Visual research on the trustability of classic... [[{'first_name': 'Nihal', 'last_name': 'ATA TU... pub.1125931386 2020 article 1-18 jour.1142190 Hacettepe Journal of Mathematics and Statistics NaN NaN
1 2. The Sunflowers in Perspective [[{'first_name': 'Nienke', 'last_name': 'Bakke... pub.1125801740 2020 chapter 21-48 NaN NaN NaN NaN
2 Literature NaN pub.1125632078 2020 chapter 333-349 NaN NaN NaN NaN
3 Die Standesordnungen der freien Berufe [[{'first_name': 'Jochen', 'last_name': 'Taupi... pub.1096916023 2020 monograph NaN NaN NaN NaN NaN
4 The gender responsiveness of social marketing ... [[{'first_name': 'Nathaly', 'last_name': 'Aya ... pub.1124196727 2020 article 1711335 jour.1041075 Global Health Action 1 13
5 To start or to complete? – Challenges in imple... [[{'first_name': 'Mahendra M', 'last_name': 'R... pub.1124099280 2020 article 1704540 jour.1041075 Global Health Action 1 13
6 Long-term trends in seasonality of mortality i... [[{'first_name': 'Benjamin-Samuel', 'last_name... pub.1124649186 2020 article 1717411 jour.1041075 Global Health Action 1 13
7 Eine Warnung an alle, dy sych etwaz duncken: D... [[{'first_name': 'Ulla', 'last_name': 'William... pub.1125632729 2020 chapter 167-190 NaN NaN NaN NaN
8 Marienklagen und Pietà [[{'first_name': 'Georg', 'last_name': 'Satzin... pub.1125635978 2020 chapter 241-276 NaN NaN NaN NaN
9 Johannes Taulers Via negationis [[{'first_name': 'Walter', 'last_name': 'Haug'... pub.1125632704 2020 chapter 76-93 NaN NaN NaN NaN

If paging information is not provided, the default values limit 20 skip 0 are used, so the query "search publications return publications" is equivalent to "search publications return publications limit 20 skip 0".

Combining limit and skip across multiple queries enables paging or batching of results; e.g. to retrieve 30 grant records divided into 3 pages of 10 records each, the following three queries could be used:

return grants limit 10           => get 1st 10 records for page 1 (skip 0, by default)
return grants limit 10 skip 10   => get next 10 for page 2; skip the 10 we already have
return grants limit 10 skip 20   => get another 10 for page 3, for a total of 30
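The three queries above follow a simple pattern that can be generated programmatically. The following is a minimal sketch, not part of Dimcli: the function name paged_queries and its parameters are hypothetical, and it only builds the query strings without sending them to the API.

```python
def paged_queries(base_query, source, page_size, total):
    """Yield one DSL query string per page, advancing 'skip' by page_size each time."""
    for skip in range(0, total, page_size):
        yield f"{base_query} return {source} limit {page_size} skip {skip}"

queries = list(paged_queries("search grants", "grants", page_size=10, total=30))
for q in queries:
    print(q)
```

Each generated string could then be passed to dsl.query(...) in turn, stopping early if a page comes back with fewer records than page_size.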

5.6 Sorting Results

A sort order for the results in a given return phrase can be specified with the keyword sort by followed by the name of

* a field (in the case that a source is being requested)
* an indicator (aggregation) (in the case that one or more facets are being requested).

By default, the result set of full text queries (search ... for "full text query") is sorted by “relevance”. Additionally, it is possible to specify the sort order using the asc or desc keywords. By default, descending order is selected.
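As an illustration of how the pieces of a return phrase fit together, here is a small sketch that assembles one from its parts. The helper return_phrase is hypothetical (it is not a Dimcli function) and covers only the sort by, asc/desc, limit and skip keywords described in this section.

```python
def return_phrase(source, sort_by=None, order=None, limit=None, skip=None):
    """Build a 'return ...' phrase: optional sort (asc/desc), then limit, then skip."""
    if order not in (None, "asc", "desc"):
        raise ValueError("order must be 'asc' or 'desc'")
    if skip is not None and limit is None:
        raise ValueError("skip is only written after a limit")
    parts = [f"return {source}"]
    if sort_by:
        parts.append(f"sort by {sort_by}")
        if order:
            parts.append(order)   # omitted => the API default (descending) applies
    if limit is not None:
        parts.append(f"limit {limit}")
    if skip is not None:
        parts.append(f"skip {skip}")
    return " ".join(parts)

print(return_phrase("grants", sort_by="title", order="desc", limit=5))
print(return_phrase("grants", sort_by="relevance", limit=5))
```

The first call reproduces the return phrase used in the grants example that sorts by title.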

[56]:
%%dsldf
search grants
    for "nanomaterials"
return grants sort by title desc limit 5
Returned Grants: 5 (total = 17719)
[56]:
project_num end_date start_date original_title start_year title_language id funding_org_name funders active_year language title
0 2018/29/N/ST5/01240 2022-03-31 2019-04-01 x 2019 pl grant.8518592 National Science Center [{'id': 'grid.436846.b', 'country_name': 'Pola... [2019, 2020, 2021, 2022] pl x
1 280331443 NaN 2015-01-01 Transmissionselektronenmikroskop 2015 en grant.4841519 German Research Foundation [{'id': 'grid.424150.6', 'country_name': 'Germ... [2015] en Transmissionselektronenmikroskop
2 220923099 NaN 2012-01-01 Transmissionselektronenmikroskop 2012 en grant.4823271 German Research Foundation [{'id': 'grid.424150.6', 'country_name': 'Germ... [2012] de Transmissionselektronenmikroskop
3 3E120109 2015-06-13 2011-06-16 Snowcontrol. 2011 en grant.6774902 Belgian Federal Science Policy Office [{'id': 'grid.425119.a', 'country_name': 'Belg... [2011, 2012, 2013, 2014, 2015] en Snowcontrol.
4 245513494 NaN 2014-01-01 Röntgenquelle 2014 en grant.4834305 German Research Foundation [{'id': 'grid.424150.6', 'country_name': 'Germ... [2014] de Röntgenquelle
[57]:
%%dsldf
search grants
    for "nanomaterials"
return grants  sort by relevance desc limit 5
Returned Grants: 5 (total = 17719)
[57]:
start_date title end_date title_language project_num id funders original_title funding_org_name start_year language active_year
0 2012-06-01 Optically-active chiral nanomaterials 2013-05-31 en 11/W.1/I2065 grant.3984032 [{'id': 'grid.437854.9', 'types': ['Nonprofit'... Optically-active chiral nanomaterials Science Foundation Ireland 2012 en [2012, 2013]
1 2016-04-01 Polymer Nanomaterials 2017-03-31 en 617505 grant.6973622 [{'id': 'grid.452912.9', 'types': ['Government... Polymer Nanomaterials Natural Sciences and Engineering Research Council 2016 en [2016, 2017]
2 2016-04-01 Polymer Nanomaterials 2017-03-31 en 617153 grant.6973270 [{'id': 'grid.452912.9', 'types': ['Government... Polymer Nanomaterials Natural Sciences and Engineering Research Council 2016 en [2016, 2017]
3 2013-04-01 Polymer Nanomaterials 2014-03-31 en 543663 grant.3643972 [{'id': 'grid.452912.9', 'types': ['Government... Polymer Nanomaterials Natural Sciences and Engineering Research Council 2013 en [2013, 2014]
4 2010-04-01 Polymer Nanomaterials 2011-03-31 en 454382 grant.2865162 [{'id': 'grid.452912.9', 'types': ['Government... Polymer Nanomaterials Natural Sciences and Engineering Research Council 2010 en [2010, 2011]

Number of citations per publication

[58]:
%%dsldf
search publications
return publications  [doi + times_cited]
    sort by times_cited limit 5
Returned Publications: 5 (total = 110023255)
[58]:
   times_cited  doi
0  230793       NaN
1  196708       10.1038/227680a0
2  178696       10.1016/0003-2697(76)90527-3
3  87448        10.1006/meth.2001.1262
4  82895        10.1103/physrevlett.77.3865

Recent citations per publication. Note: recent citations refers to the number of citations accrued in the last two-year period. A single value is stored per document and the year window rolls over in July.

[59]:
%%dsldf
search publications
return publications [doi + recent_citations]
    sort by recent_citations limit 5
Returned Publications: 5 (total = 110023255)
[59]:
   recent_citations  doi
0  29381             10.1006/meth.2001.1262
1  22006             10.1103/physrevlett.77.3865
2  21376             10.1176/appi.books.9780890425596
3  20907             10.1109/cvpr.2016.90
4  20077             10.1191/1478088706qp063oa

When a facet is being returned, the indicator used in the sort phrase must either be count (the default, such that sort by count is unnecessary), or one of the indicators specified in the aggregate phrase, i.e. one whose values are being computed in the faceting operation.
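This rule can be expressed as a small client-side check. The sketch below is illustrative only: check_facet_sort is a hypothetical helper, not part of Dimcli or the API. The exclusion of fcr_gavg is taken from the schema description shown later in this tutorial, which states that it cannot be used for sorting.

```python
NON_SORTABLE = {"fcr_gavg"}  # per the schema description: cannot be used for sorting

def check_facet_sort(sort_by, aggregates=()):
    """Return sort_by if it is valid for a facet query, otherwise raise ValueError."""
    allowed = {"count", *aggregates} - NON_SORTABLE
    if sort_by not in allowed:
        raise ValueError(
            f"cannot sort facets by {sort_by!r}; allowed: {sorted(allowed)}"
        )
    return sort_by

# Valid: rcr_avg is listed in the aggregate phrase.
print(check_facet_sort("rcr_avg", ["altmetric_median", "rcr_avg"]))
# Valid: count is always available.
print(check_facet_sort("count"))
```

Running such a check before sending a query makes the constraint explicit, although the API itself remains the authority on what is accepted.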

[60]:
%%dsldf
search publications
    for "nanomaterials"
return research_orgs
    aggregate altmetric_median, rcr_avg sort by rcr_avg limit 5
Returned Research_orgs: 5
[60]:
id count rcr_avg altmetric_median types city_name longitude name country_name linkout latitude acronym state_name
0 grid.11444.34 1 207.839996 343.0 [Facility] Shanghai 121.467255 Shanghai Institute of Hypertension China [http://www.china-sih.com/] 31.211678 NaN NaN
1 grid.11485.39 1 207.839996 343.0 [Nonprofit] London -0.106269 Cancer Research UK United Kingdom [http://www.cancerresearchuk.org/] 51.531322 CRUK NaN
2 grid.11642.30 1 207.839996 343.0 [Education] Saint-Denis 55.484550 University of La Réunion Reunion [http://www.univ-reunion.fr/university-of-reun... -20.901735 NaN NaN
3 grid.120073.7 1 207.839996 343.0 [Healthcare] Cambridge 0.140000 Addenbrooke's Hospital United Kingdom [http://www.cuh.org.uk/addenbrookes-hospital] 52.176000 NaN Cambridgeshire
4 grid.20931.39 1 207.839996 343.0 [Education] London -0.134000 Royal Veterinary College United Kingdom [http://www.rvc.ac.uk/] 51.536800 RVC NaN

6. Aggregations

In a return phrase requesting one or more facet results, aggregation operations to perform during faceting can be specified after the facet name(s) by using the keyword aggregate followed by a comma-separated list of one or more indicator names corresponding to the source being searched.
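Conceptually, an aggregate phrase computes one summary value per facet entry. The following offline sketch shows the idea with pandas on made-up data; the column names and values are hypothetical, and the result columns are merely named after the rcr_avg and altmetric_median indicators for readability.

```python
import pandas as pd

# Hypothetical publications, each attributed to a single organization for simplicity.
pubs = pd.DataFrame({
    "org":       ["A", "A", "A", "B", "B"],
    "rcr":       [1.0, 2.0, 3.0, 0.5, 1.5],
    "altmetric": [4,   10,  1,   7,   3],
})

# One row per organization: a count plus the requested aggregations.
facet = pubs.groupby("org").agg(
    count=("rcr", "size"),
    rcr_avg=("rcr", "mean"),
    altmetric_median=("altmetric", "median"),
).reset_index()
print(facet)
```

As in the real facet results, the count column appears alongside the requested indicators.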

[61]:
%%dsldf
search publications
    where year > 2010
return research_orgs
    aggregate rcr_avg, altmetric_median limit 5
Returned Research_orgs: 5
[61]:
id count rcr_avg altmetric_median country_name name longitude state_name city_name latitude linkout types acronym
0 grid.17063.33 140923 1.692821 4.0 Canada University of Toronto -79.395000 Ontario Toronto 43.661667 [http://www.utoronto.ca/] [Education] NaN
1 grid.38142.3c 136543 2.213127 5.0 United States Harvard University -71.116650 Massachusetts Cambridge 42.377052 [http://www.harvard.edu/] [Education] NaN
2 grid.11899.38 132248 1.045882 2.0 Brazil University of São Paulo -46.730103 NaN São Paulo -23.563051 [http://www5.usp.br/en/] [Education] USP
3 grid.83440.3b 120731 1.906856 4.0 United Kingdom University College London -0.133982 NaN London 51.524470 [http://www.ucl.ac.uk/] [Education] UCL
4 grid.26999.3d 119074 1.181334 2.0 Japan University of Tokyo 139.762220 NaN Tokyo 35.713333 [http://www.u-tokyo.ac.jp/en/] [Education] UT

What are the metrics/aggregations available? See the data sources documentation for information about available indicators.

Alternatively, we can use the ‘schema’ API (describe) to return this information programmatically:

[62]:
# Retrieve the schema, then list the metrics (indicators) available for each source
schema = dsl.query("describe schema")
for source, info in schema['sources'].items():
    print("SOURCE:", source)
    for metric in info['metrics'].values():
        print("--", metric['name'], " => ", metric['description'])
SOURCE: publications
-- count  =>  Total count
-- altmetric_median  =>  Median Altmetric attention score
-- altmetric_avg  =>  Altmetric attention score mean
-- citations_total  =>  Aggregated number of citations
-- citations_avg  =>  Arithmetic mean of citations
-- citations_median  =>  Median of citations
-- recent_citations_total  =>  For a given article, in a given year, the number of citations accrued in the last two year period. Single value stored per document, year window rolls over in July.
-- rcr_avg  =>  Arithmetic mean of `relative_citation_ratio` field.
-- fcr_gavg  =>  Geometric mean of `field_citation_ratio` field (note: This field cannot be used for sorting results).
SOURCE: grants
-- count  =>  Total count
-- funding  =>  Total funding amount, in USD.
SOURCE: patents
-- count  =>  Total count
SOURCE: clinical_trials
-- count  =>  Total count
SOURCE: policy_documents
-- count  =>  Total count
SOURCE: researchers
-- count  =>  Total count
SOURCE: organizations
-- count  =>  Total count
SOURCE: datasets
-- count  =>  Total count
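For further analysis it can be convenient to have this listing as a table. The sketch below flattens a schema-like dictionary into a pandas DataFrame. To stay runnable offline it uses a small excerpt copied by hand from the output above, arranged in the nested shape that the preceding cell relies on (sources, then metrics, then name and description); metrics_table is a hypothetical helper.

```python
import pandas as pd

# Hand-copied excerpt of the listing above, in the assumed nested shape.
schema_excerpt = {
    "sources": {
        "grants": {"metrics": {
            "count":   {"name": "count",   "description": "Total count"},
            "funding": {"name": "funding", "description": "Total funding amount, in USD."},
        }},
        "patents": {"metrics": {
            "count":   {"name": "count",   "description": "Total count"},
        }},
    }
}

def metrics_table(schema):
    """Flatten sources -> metrics into one row per (source, metric)."""
    rows = [
        {"source": source, "metric": m["name"], "description": m["description"]}
        for source, info in schema["sources"].items()
        for m in info["metrics"].values()
    ]
    return pd.DataFrame(rows)

table = metrics_table(schema_excerpt)
print(table)
```

The same function should apply to the object returned by describe schema, provided it supports the dictionary-style access used in the cell above.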

NOTE: In addition to any specified aggregations, count is always computed and reported when facet results are requested.

[63]:
%%dsldf
search grants
    for "5g network"
return funders
    aggregate count, funding sort by funding limit 5
Returned Funders: 5
[63]:
id count funding acronym city_name name types longitude linkout latitude country_name state_name
0 grid.270680.b 175 834354500.0 EC Brussels European Commission [Government] 4.363670 [http://ec.europa.eu/index_en.htm] 50.851650 Belgium NaN
1 grid.421091.f 68 52650403.0 EPSRC Swindon Engineering and Physical Sciences Research Cou... [Government] -1.784602 [https://www.epsrc.ac.uk/] 51.567093 United Kingdom England
2 grid.457785.c 106 49446108.0 NSF CISE Arlington Directorate for Computer & Information Science... [Government] -77.111000 [http://www.nsf.gov/dir/index.jsp?org=CISE] 38.880580 United States Virginia
3 grid.55047.33 5 47182381.0 NCRD Warsaw National Centre for Research and Development [Government] 21.007630 [http://www.ncbr.gov.pl/en/] 52.227455 Poland NaN
4 grid.457810.f 73 24371660.0 NSF ENG Arlington Directorate for Engineering [Government] -77.111000 [http://www.nsf.gov/dir/index.jsp?org=ENG] 38.880580 United States Virginia

Aggregated total number of citations

[64]:
%%dsldf
search publications
    for "ontologies"
return funders
    aggregate citations_total
    sort by citations_total  limit 5
Returned Funders: 5
[64]:
id count citations_total types city_name longitude name country_name linkout state_name acronym latitude
0 grid.48336.3a 12083 807005.0 [Government] Rockville -77.101190 National Cancer Institute United States [http://www.cancer.gov/] Maryland NCI 39.004326
1 grid.280785.0 11603 777080.0 [Facility] Bethesda -77.099380 National Institute of General Medical Sciences United States [http://www.nigms.nih.gov/Pages/default.aspx] Maryland NIGMS 38.997833
2 grid.280128.1 4424 575386.0 [Facility] Bethesda -77.096930 National Human Genome Research Institute United States [https://www.genome.gov/] Maryland NHGRI 38.996967
3 grid.270680.b 18022 548865.0 [Government] Brussels 4.363670 European Commission Belgium [http://ec.europa.eu/index_en.htm] NaN EC 50.851650
4 grid.52788.30 4838 418936.0 [Nonprofit] London -0.135005 Wellcome Trust United Kingdom [http://www.wellcome.ac.uk/] NaN WT 51.525867

Arithmetic mean number of citations

[65]:
%%dsldf
search publications
return funders
    aggregate citations_avg
    sort by citations_avg limit 5
Returned Funders: 5
[65]:
id count citations_avg types city_name longitude name country_name linkout state_name latitude
0 grid.478308.0 169 276.136095 [Nonprofit] Washington D.C. -77.039730 Alexander & Margaret Stewart Trust United States [http://www.stewart-trust.org/] District of Columbia 38.90116
1 grid.453780.d 143 186.685315 [Nonprofit] Washington D.C. -77.039520 Accelerate Brain Cancer Cure United States [http://www.abc2.org/] District of Columbia 38.90672
2 grid.478789.d 568 164.917254 [Other] Las Vegas -115.299850 Donald W. Reynolds Foundation United States [http://www.dwreynolds.org/] Nevada 36.19046
3 grid.417710.4 181 162.027624 [Company] Rockville -77.203760 Human Genome Sciences (United States) United States [http://www.hgsi.com] Maryland 39.09665
4 grid.429197.0 719 146.849791 [Other] New City -73.982895 Helen Hay Whitney Foundation United States [http://www.hhwf.org/] New York 41.15845

Geometric mean of FCR

[66]:
%%dsldf
search publications
return funders
    aggregate fcr_gavg limit 5
Returned Funders: 5
[66]:
id fcr_gavg count acronym city_name name types longitude linkout latitude country_name state_name
0 grid.419696.5 2.304725 1951296 NSFC Beijing National Natural Science Foundation of China [Government] 116.339830 [http://www.nsfc.gov.cn/publish/portal1/] 40.005177 China NaN
1 grid.270680.b 3.281903 677891 EC Brussels European Commission [Government] 4.363670 [http://ec.europa.eu/index_en.htm] 50.851650 Belgium NaN
2 grid.424020.0 2.523239 612579 MOST Beijing Ministry of Science and Technology of the Peop... [Government] 116.316284 [http://www.most.gov.cn/eng/] 39.827835 China NaN
3 grid.48336.3a 4.901802 584689 NCI Rockville National Cancer Institute [Government] -77.101190 [http://www.cancer.gov/] 39.004326 United States Maryland
4 grid.54432.34 2.258015 574493 JSPS Tokyo Japan Society for the Promotion of Science [Nonprofit] 139.740390 [http://www.jsps.go.jp/] 35.687160 Japan NaN
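To see why a geometric mean is often preferred for skewed citation ratios, compare it with the arithmetic mean on a toy example. The values below are invented for illustration, and this sketch shows only the textbook definition; how the API treats edge cases such as zero values is not covered in this tutorial.

```python
from statistics import geometric_mean, mean

# Hypothetical field citation ratios for four publications; one is an outlier.
fcr_values = [0.5, 1.0, 2.0, 16.0]

arithmetic = mean(fcr_values)            # pulled upwards by the outlier
geometric = geometric_mean(fcr_values)   # fourth root of the product of the values

print("arithmetic mean:", arithmetic)
print("geometric mean: ", round(geometric, 6))
```

Here the product of the values is 16, so the geometric mean is about 2, while the arithmetic mean is 4.875: a single highly cited publication shifts the arithmetic mean far more than the geometric one.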

Median Altmetric Attention Score

[67]:
%%dsldf
search publications
return funders aggregate altmetric_median
    sort by altmetric_median limit 5
Returned Funders: 5
[67]:
id count altmetric_median types city_name longitude name country_name linkout acronym latitude state_name
0 grid.258806.1 6 309.0 [Education] Kitakyushu 130.839200 Kyushu Institute of Technology Japan [https://www.kyutech.ac.jp/english/] KIT 33.894436 NaN
1 grid.470711.4 2 110.5 [Nonprofit] Edinburgh -3.219597 Chest Heart and Stroke Scotland United Kingdom [http://www.chss.org.uk/] CHSS 55.946075 NaN
2 grid.443873.f 5 99.0 [Nonprofit] Chicago -87.626480 LUNGevity Foundation United States [http://www.lungevity.org/] LUNG 41.878674 Illinois
3 grid.473856.b 2 66.0 [Government] Washington D.C. -77.016370 Administration for Children and Families United States [https://www.acf.hhs.gov/] ACF 38.885940 District of Columbia
4 grid.473769.8 1 33.0 [Nonprofit] Bethesda -77.097880 Bladder Cancer Advocacy Network United States [http://www.bcan.org/] BCAN 38.988724 Maryland


Note

The Dimensions Analytics API allows you to carry out sophisticated research data analytics tasks like the ones described on this website. Also check out the associated GitHub repository for examples, the source code of these tutorials, and much more.
