
Exploring The Dimensions Search Language (DSL) - Deep Dive

This tutorial provides a detailed walkthrough of the most important features of the Dimensions Search Language.

This tutorial is based on the Query Syntax section of the official documentation, so it can be used as an interactive version of that section: it lets you try out the various DSL queries presented there.

What is the Dimensions Search Language?

The DSL aims to capture the type of interaction with Dimensions data that users are accustomed to performing graphically via the web application. It enables web app developers, power users, and others to carry out such interactions by writing query statements in a syntax loosely inspired by SQL but particularly suited to our specific domain and data organization.

Note: this notebook uses the Python programming language, but none of the DSL queries is Python-specific; they can all be reused with any other API client.

Prerequisites

This notebook assumes you have installed the Dimcli library and are familiar with the Getting Started tutorial.

[1]:
!pip install dimcli --quiet

import dimcli
from dimcli.shortcuts import *
import json
import sys
import pandas as pd
#

print("==\nLogging in..")
# https://github.com/digital-science/dimcli#authentication
ENDPOINT = "https://app.dimensions.ai"
if 'google.colab' in sys.modules:
  import getpass
  USERNAME = getpass.getpass(prompt='Username: ')
  PASSWORD = getpass.getpass(prompt='Password: ')
  dimcli.login(USERNAME, PASSWORD, ENDPOINT)
else:
  USERNAME, PASSWORD  = "", ""
  dimcli.login(USERNAME, PASSWORD, ENDPOINT)
dsl = dimcli.Dsl()
==
Logging in..
Dimcli - Dimensions API Client (v0.7.4.2)
Connected to: https://app.dimensions.ai - DSL v1.27
Method: dsl.ini file

Sections Index

  1. Basic query structure

  2. Full-text searching

  3. Field searching

  4. Searching for researchers

  5. Returning results

  6. Aggregations

1. Basic query structure

DSL queries consist of two required components: a search phrase that indicates the scientific records to be searched, and one or more return phrases which specify the contents and structure of the desired results.

The simplest valid DSL query is of the form search <source> return <result>:

[2]:
%%dsldf
search grants return  grants limit 5
Returned Grants: 5 (total = 5514056)
Time: 0.61s
[2]:
funders title start_year title_language original_title project_num funding_org_name language start_date id active_year end_date
0 [{'id': 'grid.420488.2', 'city_name': 'The Hag... Sensing alarm responses of ungulate herds to p... 2021 en Sensing alarm responses of ungulate herds to p... RAAK.PRO02.048 Dutch Research Council en 2021-12-27 grant.6946936 [2021] NaN
1 [{'id': 'grid.270680.b', 'city_name': 'Brussel... Functional analysis of ribosome heterogeneity ... 2021 en Functional analysis of ribosome heterogeneity ... 890218 European Commission en 2021-12-01 grant.9064785 [2021, 2022, 2023] 2023-11-30
2 [{'id': 'grid.484521.e', 'state_name': 'New Br... APPROACH to Enriching the Real World Evidence ... 2021 en APPROACH to Enriching the Real World Evidence ... 2018-HRSI-1548 New Brunswick Health Research Foundation en 2021-11-30 grant.8690978 [2021] NaN
3 [{'id': 'grid.270680.b', 'city_name': 'Brussel... Knowledge Transfer in Global Gender Programmes... 2021 en Knowledge Transfer in Global Gender Programmes... 894029 European Commission en 2021-10-01 grant.9064813 [2021, 2022, 2023, 2024] 2024-09-30
4 [{'id': 'grid.424470.1', 'city_name': 'Brussel... Molecular mechanism of DNA double strand break... 2021 en Mécanismes moléculaires de la formation et la ... 1301720F Fund for Scientific Research en 2021-10-01 grant.8950252 [2021] NaN

search source

A query must begin with the word search followed by a source name, i.e. the name of a type of scientific record, such as grants or publications.

What are the sources available? See the data sources section of the documentation.

Alternatively, we can use the ‘schema’ API (describe) to return this information programmatically:

[3]:
dsl.query("describe schema")
[3]:
<dimcli.DslDataset object #4399011200. Dict keys: 'sources', 'entities'>

A more useful query might also make use of the optional for and where phrases to limit the set of records returned.

[4]:
%%dsldf
search grants  for "lung cancer"
    where active_year=2000
return  grants  limit 5
Returned Grants: 5 (total = 1745)
Time: 0.50s
[4]:
funders title end_date start_year title_language original_title project_num funding_org_name language start_date id active_year
0 [{'id': 'grid.279885.9', 'state_name': 'Maryla... ROLE OF CD44 ISOFORMS IN ENDOTHELIAL CELL DAMAGE 2002-01-01 2000 en ROLE OF CD44 ISOFORMS IN ENDOTHELIAL CELL DAMAGE F32HL010455 National Heart Lung and Blood Institute en 2000-12-31 grant.2386513 [2000, 2001, 2002]
1 [{'id': 'grid.279885.9', 'state_name': 'Maryla... ESTROGEN, ANGIOGENESIS AND ENDOTHELIAL PROGENI... 2004-11-30 2000 en ESTROGEN, ANGIOGENESIS AND ENDOTHELIAL PROGENI... R01HL063695 National Heart Lung and Blood Institute en 2000-12-18 grant.2537116 [2000, 2001, 2002, 2003, 2004]
2 [{'id': 'grid.279885.9', 'state_name': 'Maryla... GENETIC ANALYSIS OF EPHRIN-EPH SIGNALING IN AN... 2007-11-30 2000 en GENETIC ANALYSIS OF EPHRIN-EPH SIGNALING IN AN... R01HL066221 National Heart Lung and Blood Institute en 2000-12-18 grant.2537801 [2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007]
3 [{'id': 'grid.279885.9', 'state_name': 'Maryla... Synthetic Heparan Sulfate: Probing Biosynthesi... 2017-12-31 2000 en Synthetic Heparan Sulfate: Probing Biosynthesi... R01HL062244 National Heart Lung and Blood Institute en 2000-12-15 grant.2536777 [2000, 2001, 2002, 2003, 2004, 2005, 2006, 200...
4 [{'id': 'grid.419213.c', 'state_name': 'New Je... SmokeLess States Program - Implementation 2001-02-28 2000 en SmokeLess States Program - Implementation 41067 Robert Wood Johnson Foundation en 2000-12-01 grant.8616620 [2000, 2001]
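
The phrases seen so far (search, for, where, return, limit) can also be assembled programmatically. The helper below is a minimal sketch and not part of Dimcli: `build_query` and its parameters are hypothetical names introduced for illustration, and the function only concatenates strings without checking that the source or field names exist.

```python
def build_query(source, term=None, where=None, limit=None):
    """Assemble a DSL query string from its phrases (hypothetical helper)."""
    parts = [f"search {source}"]
    if term is not None:
        # the search term is always enclosed in double quotes
        parts.append(f'for "{term}"')
    if where is not None:
        # `where` is passed through as an already formatted filter expression
        parts.append(f"where {where}")
    # return the same source that was searched
    parts.append(f"return {source}")
    if limit is not None:
        parts.append(f"limit {limit}")
    return " ".join(parts)


q = build_query("grants", term="lung cancer", where="active_year=2000", limit=5)
print(q)
```

The resulting string can then be passed to `dsl.query(q)`, in the same way as the queries in the cells above.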

return result (source or facet)

The most basic return phrase consists of the keyword return followed by the name of a record or facet to be returned.

This must be the name of the source used in the search phrase, or the name of a facet of that source.

[5]:
%%dsldf
search grants for "laryngectomy"
return grants limit 5
Returned Grants: 5 (total = 115)
Time: 0.50s
[5]:
start_date language id original_title title_language title active_year start_year funding_org_name end_date project_num funders
0 2020-04-01 ja grant.9201764 喉頭全摘出者の家族の術後生活への移行を促進する外来での生活支援プログラムの開発 ja Development of an outpatient life support prog... [2020, 2021, 2022, 2023, 2024] 2020 Japan Society for the Promotion of Science 2024-03-31 20K10777 [{'id': 'grid.54432.34', 'types': ['Nonprofit'...
1 2019-09-29 en grant.8674095 UKRI CDT in SLT- Continuous End-to-End Streami... en UKRI CDT in SLT- Continuous End-to-End Streami... [2019, 2020, 2021, 2022, 2023] 2019 Engineering and Physical Sciences Research Cou... 2023-09-28 2268211 [{'id': 'grid.421091.f', 'types': ['Government...
2 2019-08-15 en grant.8554260 Wearable silent speech technology to enhance i... en Wearable silent speech technology to enhance i... [2019, 2020, 2021, 2022, 2023, 2024] 2019 National Institute on Deafness and Other Commu... 2024-07-31 R01DC016621 [{'id': 'grid.214431.1', 'types': ['Facility']...
3 2019-04-01 ja grant.8428997 Construction of a nursing system leading to im... en Construction of a nursing system leading to im... [2019, 2020, 2021, 2022, 2023] 2019 Japan Society for the Promotion of Science 2023-03-31 19H03937 [{'id': 'grid.54432.34', 'types': ['Nonprofit'...
4 2019-04-01 ja grant.8422934 喉頭がん、下咽頭がんにより喉頭摘出術を受けた患者に対する嗅覚向上プログラムの開発 ja Development of an olfactory improvement progra... [2019, 2020, 2021] 2019 Japan Society for the Promotion of Science 2021-03-31 19K19574 [{'id': 'grid.54432.34', 'types': ['Nonprofit'...

For example, let’s see which facets are available for the grants source:

[6]:
fields = dsl.query("describe schema")['sources']['grants']['fields']
[x for x in fields if fields[x]['is_facet']]
[6]:
['category_hrcs_rac',
 'active_year',
 'funding_org_acronym',
 'category_rcdc',
 'funder_countries',
 'funders',
 'research_org_state_codes',
 'start_year',
 'research_orgs',
 'research_org_countries',
 'funding_org_name',
 'researchers',
 'language',
 'category_icrp_cso',
 'category_sdg',
 'category_uoa',
 'category_for',
 'language_title',
 'category_bra',
 'category_hrcs_hc',
 'category_hra',
 'research_org_cities',
 'funding_currency',
 'category_icrp_ct',
 'funding_org_city']
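
The same lookup can be wrapped in a small reusable function. This is a sketch under assumptions: `fields_with_flag` is a hypothetical helper, and `sample_schema` is a hand-written stand-in that only mimics the nesting used in the cell above (`sources` → source name → `fields` → per-field flags), so that the example runs without an API connection. The `is_filter` flag is assumed to sit next to `is_facet`, as the column names of the `%dsldocs` output suggest.

```python
def fields_with_flag(schema, source, flag="is_facet"):
    """Return the sorted names of the fields of `source` whose boolean `flag` is true."""
    fields = schema["sources"][source]["fields"]
    return sorted(name for name, info in fields.items() if info.get(flag))


# Illustrative stand-in for the result of dsl.query("describe schema");
# the flag values below are examples, not real schema data.
sample_schema = {
    "sources": {
        "grants": {
            "fields": {
                "active_year": {"is_facet": True, "is_filter": True},
                "funders": {"is_facet": True, "is_filter": True},
                "title": {"is_facet": False, "is_filter": False},
            }
        }
    }
}

print(fields_with_flag(sample_schema, "grants"))
print(fields_with_flag(sample_schema, "grants", flag="is_filter"))
```

With a live connection, the object returned by `dsl.query("describe schema")` would be passed in place of `sample_schema`.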

2. Full-text Searching

Full-text search or keyword search finds all instances of a term (keyword) in a document, or group of documents.

Full-text search works by using search indexes, which can target specific sections of a document, e.g. its abstract, authors, full text, etc.

[7]:
%%dsldf
search publications
    in full_data for "moon landing"
return publications limit 5
Returned Publications: 5 (total = 174854)
Time: 1.33s
[7]:
title pages author_affiliations year id type
0 Bringing My Wife and Children to the Field 185-206 [[{'first_name': 'Leberecht', 'last_name': 'Fu... 2020 pub.1128623295 chapter
1 UJ to Frederick and Maud Clapp 3-250 NaN 2020 pub.1130258832 chapter
2 1. Into the Woods (Via Cuma 320, Bacoli) 14-30 [[{'first_name': 'Alessandro', 'last_name': 'B... 2020 pub.1127643502 chapter
3 1898–1899 Movies and Entrepreneurs 66-90 [[{'first_name': 'Patrick', 'last_name': 'Loug... 2020 pub.1126778139 chapter
4 2. Grand Steerage 51-81 [[{'first_name': 'Barry', 'last_name': 'Naught... 2020 pub.1129002686 chapter

2.1 in [search index]

This optional phrase consists of the particle in followed by a term indicating a search index, specifying for example whether the search is limited to full text, title and abstract only, or title only.

[8]:
%%dsldf
search grants
    in title_abstract_only for "something"
return grants limit 5
Returned Grants: 5 (total = 10001)
Time: 0.53s
[8]:
start_year funding_org_name end_date active_year language start_date funders title_language id original_title title project_num
0 2021 European Commission 2023-08-31 [2021, 2022, 2023] en 2021-09-01 [{'id': 'grid.270680.b', 'name': 'European Com... en grant.9064570 Deciphering fundamental constraints on pathoge... Deciphering fundamental constraints on pathoge... 890630
1 2021 European Research Council 2025-12-31 [2021, 2022, 2023, 2024, 2025] en 2021-01-01 [{'id': 'grid.452896.4', 'name': 'European Res... en grant.8964099 Overcoming stellar activity in radial velocity... Overcoming stellar activity in radial velocity... 865624
2 2021 Swedish Research Council for Health Working Li... 2022-12-31 [2021, 2022] en 2021-01-01 [{'id': 'grid.434365.3', 'name': 'Swedish Rese... en grant.9242822 Everyday Violence: Understanding and preventin... Everyday Violence: Understanding and preventin... 2020-01152_Forte
3 2021 European Commission 2022-12-31 [2021, 2022] en 2021-01-01 [{'id': 'grid.270680.b', 'name': 'European Com... en grant.9065705 Political Dynamics of Slow-Onset Disasters: Co... Political Dynamics of Slow-Onset Disasters: Co... 897656
4 2020 Directorate for Computer & Information Science... 2024-09-30 [2020, 2021, 2022, 2023, 2024] en 2020-10-01 [{'id': 'grid.457785.c', 'name': 'Directorate ... en grant.9046367 SaTC: CORE: Medium: Collaborative: Hardening O... SaTC: CORE: Medium: Collaborative: Hardening O... 1954521

For example, let’s see which search fields are available for the grants source:

[9]:
dsl.query("describe schema")['sources']['grants']['search_fields']
[9]:
['title_only', 'investigators', 'title_abstract_only', 'full_data', 'concepts']
[10]:
%%dsldf
search grants
    in full_data for "graphene AND computer AND iron"
return grants limit 5
Returned Grants: 5 (total = 10)
Time: 0.51s
[10]:
start_year funding_org_name end_date active_year language start_date funders title_language id original_title title project_num
0 2019 Russian Science Foundation 2021-12-31 [2019, 2020, 2021] en 2019-01-01 [{'id': 'grid.454869.2', 'name': 'Russian Scie... en grant.8413990 Weyl and Dirac semimetals and beyond - predict... Weyl and Dirac semimetals and beyond - predict... 19-43-04129
1 2018 Russian Foundation for Basic Research 2018-12-31 [2018] ru 2018-01-01 [{'id': 'grid.452899.b', 'name': 'Russian Foun... ru grant.8731867 Проект организации 18-ой Международной конфере... Project of the organization of the 18th Intern... 18-02-20097
2 2016 Ministry of Science and Higher Education 2016-12-31 [2016] pl 2016-02-22 [{'id': 'grid.425823.a', 'name': 'Ministry of ... pl grant.7397800 Dotacja podmiotowa na utrzymanie potencjału ba... Subject subsidy for maintaining the research p... 4491/E-370/S/2016
3 2015 Ministry of Science and Higher Education 2015-12-31 [2015] pl 2015-02-19 [{'id': 'grid.425823.a', 'name': 'Ministry of ... pl grant.7397795 Dotacja podmiotowa na utrzymanie potencjału ba... Subject subsidy for maintaining the research p... 4491/E-370/S/2015
4 2014 Ministry of Science and Higher Education 2014-12-31 [2014] pl 2014-04-09 [{'id': 'grid.425823.a', 'name': 'Ministry of ... pl grant.7397490 Dotacja celowa na prowadzenie w 2014 przez Wyd... Intentional grant for conducting in 2014 the F... 4491/E-370/M/2014

Special search indexes for person names make it possible to run full-text searches on publication authors or grant investigators. Please see the Searching for researchers section below for more information on how searches work in this case.

[11]:
%dsldf search publications in authors for "\"Jennifer A Doudna\"" return publications limit 5
Returned Publications: 5 (total = 332)
Time: 0.69s
[11]:
id title volume pages type year author_affiliations journal.id journal.title issue
0 pub.1129492680 Engineering of Monosized Lipid-Coated Mesoporo... 114 358-368 article 2020 [[{'first_name': 'Achraf', 'last_name': 'Noure... jour.1034525 Acta Biomaterialia NaN
1 pub.1130231355 Site-Specific Bioconjugation through Enzyme-Ca... NaN NaN article 2020 [[{'first_name': 'Marco J.', 'last_name': 'Lob... jour.1051962 ACS Central Science NaN
2 pub.1130116638 Chemistry of Class 1 CRISPR-Cas effectors: bin... NaN jbc.rev120.007034 article 2020 [[{'first_name': 'Tina Y.', 'last_name': 'Liu'... jour.1077138 Journal of Biological Chemistry NaN
3 pub.1129110288 A scoutRNA Is Required for Some Type V CRISPR-... 79 416-424.e5 article 2020 [[{'first_name': 'Lucas B.', 'last_name': 'Har... jour.1117828 Molecular Cell 3
4 pub.1129776449 DNA capture by a CRISPR-Cas9–guided adenine ba... 369 566-571 article 2020 [[{'first_name': 'Audrone', 'last_name': 'Lapi... jour.1346339 Science 6503

2.2 for "search term"

This optional phrase consists of the keyword for followed by a search term string, enclosed in double quotes (").

Strings in double quotes can contain nested quotes escaped by a backslash \. This will ensure that the string in nested double quotes is searched for as if it was a single phrase, not multiple words.

An example of a phrase: "\"Machine Learning\"" : results must contain Machine Learning as a phrase.

[12]:
%dsldf search publications for "\"Machine Learning\"" return publications limit 5
Returned Publications: 5 (total = 1217944)
Time: 1.88s
[12]:
type pages author_affiliations id year title volume issue journal.id journal.title
0 chapter 243-248 [[{'first_name': 'Eetu', 'last_name': 'Heikkil... pub.1124666091 2020 Towards maritime traffic coordination in the e... NaN NaN NaN NaN
1 chapter 39-60 [[{'first_name': 'Anya', 'last_name': 'Kamenet... pub.1130268195 2020 2. DIY U NaN NaN NaN NaN
2 article 1726672 [[{'first_name': 'Sytske', 'last_name': 'Wiege... pub.1125710665 2020 Recognizing hotspots in Brief Eclectic Psychot... 11 1 jour.1045059 European Journal of Psychotraumatology
3 article 41-54 [[{'first_name': 'Baze University Abuja', 'las... pub.1126735888 2020 Capacitated vehicle routing problem with colum... 3 1 jour.1365688 Open Journal of Discrete Applied Mathematics
4 chapter 219-250 [[{'first_name': 'Jan', 'last_name': 'Goldenst... pub.1124034443 2020 Die Erfassung und Messung von Bedeutungsstrukt... NaN NaN NaN NaN

An example of multiple keywords: "Machine Learning" : each keyword is searched for independently, so results need not contain the two words as a phrase.

[13]:
%dsldf search publications for "Machine Learning" return publications limit 5
Returned Publications: 5 (total = 2524786)
Time: 1.53s
[13]:
type pages id year title author_affiliations
0 chapter 65-368 pub.1127396158 2020 Documents NaN
1 chapter 114-125 pub.1127466829 2020 The influence of ecological constraints on the... [[{'first_name': 'André', 'last_name': 'Boyer'...
2 chapter 84-118 pub.1124947017 2020 4. Visualizing the Division of Labor: William ... [[{'first_name': 'John', 'last_name': 'Barrell...
3 chapter 217-276 pub.1126774980 2020 4 Hinduism [[{'first_name': 'Laurie L.', 'last_name': 'Pa...
4 chapter 44-58 pub.1125150382 2020 3. Rural-Urban Divides and Digital Literacy in... [[{'first_name': 'Daariimaa', 'last_name': 'Ma...

Note: Special characters, such as any of ^ " : ~ \ [ ] { } ( ) ! | & + must be escaped by a backslash \. Also, please note escaping rules in Python (or other languages). For example, when writing a query with escaped quotes, such as search publications for "\"phrase 1\" AND \"phrase 2\"", in Python, it is necessary to escape the backslashes as well, so it would look like: 'search publications for "\\"phrase 1\\" AND \\"phrase 2\\""'.

See the official docs for more details.
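
The Python escaping rule is easy to get wrong, so here is a short sketch of it. It only manipulates strings and needs no API connection; `as_phrase` is a hypothetical helper introduced for illustration, not a Dimcli function.

```python
def as_phrase(text):
    """Wrap text in backslash-escaped double quotes so it is searched as one phrase."""
    return f'\\"{text}\\"'


# In a normal Python string, \" collapses to a plain quote: the backslash is lost.
naive = 'search publications for "\"phrase 1\""'
assert "\\" not in naive

# Three equivalent ways to keep the backslashes in the query string.
doubled = 'search publications for "\\"phrase 1\\" AND \\"phrase 2\\""'
raw = r'search publications for "\"phrase 1\" AND \"phrase 2\""'
built = f'search publications for "{as_phrase("phrase 1")} AND {as_phrase("phrase 2")}"'

assert doubled == raw == built
print(raw)
```

Raw strings (the `r'...'` form) are often the most readable option, since the query then looks exactly as it would in the DSL documentation.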

2.3 Boolean Operators

A search term can consist of multiple keywords or phrases connected using boolean logic operators, e.g. AND, OR and NOT.

[14]:
%dsldf search publications for "(dose AND concentration)" return publications limit 5
Returned Publications: 5 (total = 5370106)
Time: 1.00s
[14]:
id title volume issue pages type year author_affiliations journal.id journal.title
0 pub.1124948447 Translational studies of estradiol and progest... 11 1 1723857 article 2020 [[{'first_name': 'Antonia V', 'last_name': 'Se... jour.1045059 European Journal of Psychotraumatology
1 pub.1128226413 Interrupting traumatic memories in the emergen... 11 1 1750170 article 2020 [[{'first_name': 'Sara A.', 'last_name': 'Free... jour.1045059 European Journal of Psychotraumatology
2 pub.1128351891 7. Wetland Animal Ecology NaN NaN 242-284 chapter 2020 [[{'first_name': 'Darold P.', 'last_name': 'Ba... NaN NaN
3 pub.1130114635 Correlation Of Calcium Levels With The Strengh... NaN NaN 174-181 chapter 2020 [[{'first_name': 'Joserizal', 'last_name': 'Se... NaN NaN
4 pub.1125801745 7. Conservation of the Amsterdam Sunflowers: F... NaN NaN 175-206 chapter 2020 [[{'first_name': 'Ella', 'last_name': 'Hendrik... NaN NaN

When specifying Boolean operators with keywords such as AND, OR and NOT, the keywords must appear in all uppercase.

The operators available are shown in the table below.

Boolean Operator   Alternative Symbol   Description
AND                &&                   Requires both terms on either side of the Boolean operator to be present for a match.
NOT                !                    Requires that the following term not be present.
OR                 ||                   Requires that either term (or both terms) be present for a match.
                   +                    Requires that the following term be present.
                   -                    Prohibits the following term (that is, matches on fields or documents that do not include that term). The - operator is functionally similar to the Boolean operator !.

[15]:
%dsldf search publications for "(dose OR concentration) AND (-malaria +africa)" return publications limit 5
Returned Publications: 5 (total = 1402625)
Time: 0.88s
[15]:
type pages author_affiliations id year title
0 chapter 634-688 [[{'first_name': 'Antonio', 'last_name': 'Esta... pub.1124248682 2020 17. Institutions for Infrastructure in Develop...
1 chapter 285-304 [[{'first_name': 'Eliot A.', 'last_name': 'Bre... pub.1124946791 2020 16. The Neuroethology of Birdsong
2 chapter 1-8 [[{'first_name': 'John S.', 'last_name': 'Hend... pub.1125788851 2020 Introduction: Murra, Materialism, Anthropology...
3 chapter 129-143 [[{'first_name': 'Campbell', 'last_name': 'Cra... pub.1124248733 2020 8. India in the Early Nuclear Age
4 chapter 100-114 [[{'first_name': 'Isabelle', 'last_name': 'Roh... pub.1128661435 2020 4. The Franco Regime and the Jews of North Afr...
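
Expressions like the one above can be composed from smaller pieces instead of being typed by hand. The functions below are a minimal sketch with hypothetical names (`group`, `any_of`, `all_of`, `require`, `prohibit`); they only build strings and do not validate the terms or escape special characters.

```python
def group(expression):
    """Wrap an expression in parentheses."""
    return f"({expression})"


def any_of(*terms):
    """Join terms with the OR operator (uppercase, as the DSL requires)."""
    return " OR ".join(terms)


def all_of(*terms):
    """Join terms with the AND operator."""
    return " AND ".join(terms)


def require(term):
    """Mark a term as mandatory with the + prefix."""
    return f"+{term}"


def prohibit(term):
    """Mark a term as excluded with the - prefix."""
    return f"-{term}"


expression = all_of(
    group(any_of("dose", "concentration")),
    group(" ".join([prohibit("malaria"), require("africa")])),
)
query = f'search publications for "{expression}" return publications limit 5'
print(query)
```

Keeping the operators inside helper functions makes it harder to slip in a lowercase `and` or `or`, which the DSL would not treat as an operator.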

Combining keywords and boolean operators makes it possible to construct rather sophisticated queries. For example, here’s a real-world query used to extract publications related to COVID-19.

[16]:
q_inner = """ "2019-nCoV" OR "COVID-19" OR "SARS-CoV-2" OR "HCoV-2019" OR "hcov" OR "NCOVID-19" OR
    "severe acute respiratory syndrome coronavirus 2" OR "severe acute respiratory syndrome corona virus 2"
    OR (("coronavirus"  OR "corona virus") AND (Wuhan OR China OR novel)) """

# tip: dsl_escape is a dimcli utility function for escaping special characters
q_outer = f"""search publications in full_data for "{dsl_escape(q_inner)}" return publications"""
print(q_outer)

dsl.query(q_outer)
search publications in full_data for " \"2019-nCoV\" OR \"COVID-19\" OR \"SARS-CoV-2\" OR \"HCoV-2019\" OR \"hcov\" OR \"NCOVID-19\" OR
    \"severe acute respiratory syndrome coronavirus 2\" OR \"severe acute respiratory syndrome corona virus 2\"
    OR ((\"coronavirus\"  OR \"corona virus\") AND (Wuhan OR China OR novel)) " return publications
Returned Publications: 20 (total = 193181)
Time: 6.47s
[16]:
<dimcli.DslDataset object #4662481968. Records: 20/193181>

2.4 Wildcard Searches

The DSL supports single and multiple character wildcard searches within single terms. Wildcard characters can be applied to single terms, but not to search phrases.

[17]:
%dsldf search publications in title_only for "ital? malaria" return publications limit 5
Returned Publications: 5 (total = 144)
Time: 0.88s
[17]:
title pages author_affiliations year issue id type volume journal.id journal.title
0 Non-imported malaria in Italy: paradigmatic ap... 857 [[{'first_name': 'Daniela', 'last_name': 'Bocc... 2020 1 pub.1128245696 article 20 jour.1024954 BMC Public Health
1 A Cluster of Cryptic Plasmodium falciparum Mal... NaN [[{'first_name': 'Gaetano', 'last_name': 'Brin... 2020 NaN pub.1130290794 article NaN jour.1023805 Vector-Borne and Zoonotic Diseases
2 Seasons in Italy: Northern European travelers,... 1-20 [[{'first_name': 'Benjamin', 'last_name': 'Rei... 2020 NaN pub.1124231018 article NaN jour.1141817 Journal of Tourism and Cultural Change
3 Updated guidelines for malaria prophylaxis in ... 101544 [[{'first_name': 'Guido', 'last_name': 'Caller... 2020 NaN pub.1123222257 article 33 jour.1034401 Travel Medicine and Infectious Disease
4 Clinical management of imported malaria in Ita... 28-33 [[{'first_name': 'Luciana', 'last_name': 'Lepo... 2020 1 pub.1125332077 article 43 jour.1089291 Microbiologica
[18]:
%dsldf search publications in title_only for "it* malaria" return publications limit 5
Returned Publications: 5 (total = 1541)
Time: 0.51s
[18]:
type volume pages author_affiliations id year issue title journal.id journal.title
0 article 20 857 [[{'first_name': 'Daniela', 'last_name': 'Bocc... pub.1128245696 2020 1 Non-imported malaria in Italy: paradigmatic ap... jour.1024954 BMC Public Health
1 article 19 24 [[{'first_name': 'Monica P.', 'last_name': 'Sh... pub.1124106064 2020 1 The effectiveness of older insecticide-treated... jour.1030597 Malaria Journal
2 article 19 299 [[{'first_name': 'Lemu', 'last_name': 'Golassa... pub.1130290155 2020 1 The biology of unconventional invasion of Duff... jour.1030597 Malaria Journal
3 article 13 348 [[{'first_name': 'Richard', 'last_name': 'Echo... pub.1129556766 2020 1 High insecticide resistances levels in Anophel... jour.1039457 BMC Research Notes
4 article NaN 104530 [[{'first_name': 'Kirti', 'last_name': 'Upmany... pub.1130570962 2020 NaN Allelic variation of msp-3α gene in Plasmodium... jour.1027256 Infection Genetics and Evolution

Wildcard Search Type                                               Special Character   Example
Single character - matches a single character                      ?                   The search string te?t would match both test and text.
Multiple characters - matches zero or more sequential characters   *                   The wildcard search tes* would match test, testing, and tester. You can also use wildcard characters in the middle of a term. For example: te*t would match test and text. *est would match pest and test.
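
To get a feel for these two wildcards without sending a query, the matching rules can be imitated locally. This is only an analogy: it uses Python's standard `fnmatch` module, whose `?` and `*` behave as described in the table, and it says nothing about how the Dimensions search engine tokenizes or indexes text. `matching_terms` and `vocabulary` are hypothetical names.

```python
from fnmatch import fnmatchcase


def matching_terms(pattern, terms):
    """Return the terms that match a ?/* wildcard pattern (local analogy only)."""
    return [t for t in terms if fnmatchcase(t, pattern)]


vocabulary = ["test", "text", "testing", "tester", "pest", "toast"]

print(matching_terms("te?t", vocabulary))  # single-character wildcard
print(matching_terms("tes*", vocabulary))  # trailing multi-character wildcard
print(matching_terms("te*t", vocabulary))  # wildcard in the middle of a term
print(matching_terms("*est", vocabulary))  # leading wildcard
```

Note that `fnmatch` also gives a special meaning to square brackets, so the analogy holds only for patterns made of letters, `?` and `*`.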

2.5 Proximity Searches

A proximity search looks for terms that are within a specific distance from one another.

To perform a proximity search, add the tilde character ~ and a numeric value to the end of a search phrase. For example, to search for formal and model within 10 words of each other in a document, use the search:

[19]:
%dsldf search publications for "\"formal model\"~10" return publications limit 5
Returned Publications: 5 (total = 483576)
Time: 2.48s
[19]:
title pages author_affiliations year id type issue volume journal.id journal.title
0 1. The Political Economy of Environmental Just... 1-20 [[{'first_name': 'H. Spencer', 'last_name': 'B... 2020 pub.1130374367 chapter NaN NaN NaN NaN
1 15. Organizational Governance 513-555 [[{'first_name': 'Nicolai J.', 'last_name': 'F... 2020 pub.1130267294 chapter NaN NaN NaN NaN
2 2. Clientelistic Politics and Economic Develop... 84-102 [[{'first_name': 'Pranab', 'last_name': 'Bardh... 2020 pub.1124248667 chapter NaN NaN NaN NaN
3 4. The Classification of Organizational Forms 84-110 [[{'first_name': 'Martin', 'last_name': 'Ruef'... 2020 pub.1130269657 chapter NaN NaN NaN NaN
4 Building cooperative learning to address alcoh... 1726722 [[{'first_name': 'Oladapo', 'last_name': 'Olad... 2020 pub.1125320181 article 1 13 jour.1041075 Global Health Action
[20]:
%dsldf search publications for "\"digital humanities\"~5  +ontology" return publications limit 5
Returned Publications: 5 (total = 8109)
Time: 1.36s
[20]:
id title volume issue pages type year author_affiliations journal.id journal.title
0 pub.1128167997 The gains of reduction in translational proces... 6 1 109 article 2020 [[{'first_name': 'Anita', 'last_name': 'Wohlma... jour.1136613 Palgrave Communications
1 pub.1127423858 Citizen science in the social sciences and hum... 6 1 89 article 2020 [[{'first_name': 'Loreta', 'last_name': 'Taugi... jour.1136613 Palgrave Communications
2 pub.1129593819 A methodology for multilayer networks analysis... 5 1 41 article 2020 [[{'first_name': 'Maria', 'last_name': 'Malek'... jour.1158525 Applied Network Science
3 pub.1127978306 Atlante dei siti fortificati della provincia d... NaN NaN 471-478 proceeding 2020 [[{'first_name': 'Maurizio', 'last_name': 'Tos... NaN NaN
4 pub.1122198573 Semantic-based privacy settings negotiation an... 111 NaN 879-898 article 2020 [[{'first_name': 'Odnan Ref', 'last_name': 'Sa... jour.1125399 Future Generation Computer Systems
The distance referred to here is the number of term movements needed to match the specified phrase. In the first example above, if formal and model were 10 spaces apart in a field, but model appeared before formal, more than 10 term movements would be required to move the terms together and position model to the right of formal with a space in between.
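
The notion of "term movements" can be made concrete with a small local model. The sketch below assumes that the distance behaves like the phrase "slop" of Lucene-style search engines, which the tutorial does not state explicitly; it handles two-word phrases only, splits text on whitespace, and uses the hypothetical names `term_moves` and `within_slop`.

```python
def term_moves(tokens, first, second):
    """Smallest number of one-position moves needed to make some occurrence of
    `second` sit immediately after some occurrence of `first`.
    Returns None if either term is missing."""
    pos_first = [i for i, t in enumerate(tokens) if t == first]
    pos_second = [i for i, t in enumerate(tokens) if t == second]
    if not pos_first or not pos_second:
        return None
    # For the phrase "first second", `second` belongs at position p + 1.
    return min(abs(q - (p + 1)) for p in pos_first for q in pos_second)


def within_slop(text, first, second, slop):
    """True if the two terms can be brought into phrase order within `slop` moves."""
    moves = term_moves(text.lower().split(), first, second)
    return moves is not None and moves <= slop


in_order = "a formal and testable model"
reversed_order = "a model that is formal"

print(term_moves(in_order.split(), "formal", "model"))
print(term_moves(reversed_order.split(), "formal", "model"))
print(within_slop(reversed_order, "formal", "model", 2))
```

Under this model, terms in reversed order cost two extra moves compared with the same gap in phrase order, which is why a reversed pair can fall outside a distance that an in-order pair would satisfy.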

3. Field Searching

Field searching lets you use a specific field of a source as a query filter. For example, this can be a literal field such as the type of a publication, its date, mesh terms, etc. Or it can be an entity field, such as the journal title for a publication, the country name of its author affiliations, etc.

What are the fields available for each source? See the data sources section of the documentation.

Alternatively, we can use the ‘schema’ API (describe) to return this information programmatically:

[21]:
%dsldocs publications
[21]:
sources field type description is_filter is_entity is_facet
0 publications altmetric float Altmetric attention score. True False False
1 publications altmetric_id integer AltMetric Publication ID True False False
2 publications authors json Ordered list of authors names and their affili... True False False
3 publications book_doi string The DOI of the book a chapter belongs to (note... True False False
4 publications book_series_title string The title of the book series book, belong to. False False False
5 publications book_title string The title of the book a chapter belongs to (no... False False False
6 publications category_bra categories `Broad Research Areas <https://dimensions.fres... True True True
7 publications category_for categories `ANZSRC Fields of Research classification <htt... True True True
8 publications category_hra categories `Health Research Areas <https://dimensions.fre... True True True
9 publications category_hrcs_hc categories `HRCS - Health Categories <https://dimensions.... True True True
10 publications category_hrcs_rac categories `HRCS – Research Activity Codes <https://dimen... True True True
11 publications category_icrp_cso categories `ICRP Common Scientific Outline <https://dimen... True True True
12 publications category_icrp_ct categories `ICRP Cancer Types <https://dimensions.freshde... True True True
13 publications category_rcdc categories `Research, Condition, and Disease Categorizati... True True True
14 publications category_sdg categories SDG - Sustainable Development Goals True True True
15 publications category_uoa categories `Units of Assessment <https://dimensions.fresh... True True True
16 publications concepts json Concepts describing the main topics of a publi... True False False
17 publications concepts_scores json Relevancy scores for `concepts`. True False False
18 publications date date The publication date of a document, eg "2018-0... True False False
19 publications date_inserted date Date when the record was inserted into Dimensi... True False False
20 publications dimensions_url string Link pointing to the Dimensions web application False False False
21 publications doi string Digital object identifier. True False False
22 publications field_citation_ratio float Relative citation performance of article when ... True False False
23 publications funder_countries countries The country of the organisations funding this ... True True True
24 publications funders organizations The GRID organisation funding this publication. True True True
25 publications id string Dimensions publication ID. True False False
26 publications issn string International Standard Serial Number True False False
27 publications issue string The issue number of a publication. True False False
28 publications journal journals The journal a publication belongs to. True True True
29 publications journal_lists string Independent grouping of journals outside of Di... True False True
30 publications linkout string Original URL for a publication full text. False False False
31 publications mesh_terms string Medical Subject Heading terms as used in PubMed. True False True
32 publications open_access_categories open_access Open Access categories for publications. See b... True True True
33 publications pages string The pages of the publication, as they would ap... True False False
34 publications pmcid string PubMed Central ID. True False False
35 publications pmid string PubMed ID. True False False
36 publications proceedings_title string Title of the conference proceedings volume ass... False False False
37 publications publisher string Name of the publisher as a string. True False True
38 publications recent_citations integer Number of citations received in the last two y... True False False
39 publications reference_ids string Dimensions publication ID for publications in ... True False False
40 publications referenced_pubs publication_links Publication IDs of the publications in the ref... True True True
41 publications relative_citation_ratio float Relative citation performance of an article wh... True False False
42 publications research_org_cities cities City of the organisations authors are affiliat... True True True
43 publications research_org_countries countries Country of the organisations authors are affil... True True True
44 publications research_org_country_names string Country name of the organisations authors are ... True False False
45 publications research_org_names string Names of organizations authors are affiliated to. True False False
46 publications research_org_state_codes states State of the organisations authors are affilia... True True True
47 publications research_org_state_names string State name of the organisations authors are af... True False False
48 publications research_orgs organizations GRID organisations associated to a publication... True True True
49 publications researchers researchers Researcher IDs matched to the publication's au... True True True
50 publications resulting_publication_doi string For preprints, the DOIs of the resulting full ... True False False
51 publications supporting_grant_ids string Grants supporting a publication, returned as a... True False False
52 publications times_cited integer Number of citations (note: does not support em... True False True
53 publications title string Title of a publication. False False False
54 publications type string Publication type (one of: article, chapter, pr... True False True
55 publications volume string Publication volume. True False False
56 publications year integer The year of publication (note: when the `date`... True False True

3.1 where

This optional phrase consists of the keyword where followed by a filters phrase consisting of DSL filter expressions, as described below.

[22]:
%dsldf search publications where type = "book" return publications limit 5
Returned Publications: 5 (total = 296478)
Time: 0.57s
[22]:
id title type year volume
0 pub.1125300609 Duoethnography in English Language Teaching book 2020 NaN
1 pub.1108455576 The Indo-Aryans of Ancient South Asia book 2020 NaN
2 pub.1125300607 Sociolinguistic Perspectives on Migration Control book 2020 NaN
3 pub.1108473781 Die Passion Christi in Literatur und Kunst des... book 2020 NaN
4 pub.1129458015 Neuromodulation for Facial Pain book 2020 35

If a for phrase is also used in a filtered query, the system will first apply the filters, and then search the resulting restricted set of documents for the search term.

[23]:
%dsldf search publications for "malaria" where type = "book" return publications limit 5
Returned Publications: 5 (total = 12497)
Time: 0.48s
[23]:
type id year title
0 book pub.1130620714 2020 Nano-Enabled Medical Applications
1 book pub.1130505886 2020 Human Ecology, Human Economy
2 book pub.1130318304 2020 Pharmaceutical Biocatalysis
3 book pub.1129886893 2020 Wild Plants
4 book pub.1130227719 2020 Medicine in the Twentieth Century
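
When queries are assembled programmatically, the phrase order used in the two cells above (search, then for, then where, then return) can be captured in a small helper. The following is a minimal sketch in plain Python; `build_query` is a hypothetical helper, not part of Dimcli, and it only concatenates strings without validating them.

```python
def build_query(source, term=None, filters=None, limit=5):
    """Assemble a DSL query string in the order: search / for / where / return.

    Hypothetical helper for illustration only; it does not validate the query.
    """
    parts = [f"search {source}"]
    if term is not None:
        # Full-text search term, enclosed in double quotes
        parts.append(f'for "{term}"')
    if filters:
        # Several filter expressions are combined with `and`
        parts.append("where " + " and ".join(filters))
    parts.append(f"return {source} limit {limit}")
    return " ".join(parts)


query = build_query("publications", term="malaria", filters=['type = "book"'])
print(query)
```

The resulting string is the same query as in cell [23] and could be sent with the `dsl` object created at the top of this notebook.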

3.2 in

For convenience, the DSL also supports shorthand notation for filters where a particular field should be restricted to a specified range or list of values (although the same logic may be expressed using complex filters as shown below).

Syntax: a range filter consists of the field name, the keyword in, and a range of values enclosed in square brackets ([]), where the range consists of a low value, colon :, and a high value.

[24]:
%%dsldf
search grants
    for "malaria"
    where start_year in [ 2010 : 2015 ]
return grants limit 5
Returned Grants: 5 (total = 3134)
Time: 0.52s
[24]:
funders title end_date start_year title_language original_title project_num funding_org_name language start_date id active_year
0 [{'id': 'grid.419681.3', 'state_name': 'Maryla... Bloodborne tropical pathogen detection using m... 2017-11-30 2015 en Bloodborne tropical pathogen detection using m... R21AI120981 National Institute of Allergy and Infectious D... en 2015-12-28 grant.4729738 [2015, 2016, 2017]
1 [{'id': 'grid.419681.3', 'state_name': 'Maryla... Field-deployable Assay for Differential Diagno... 2019-02-28 2015 en Field-deployable Assay for Differential Diagno... R21AI120973 National Institute of Allergy and Infectious D... en 2015-12-24 grant.4729736 [2015, 2016, 2017, 2018, 2019]
2 [{'id': 'grid.419681.3', 'state_name': 'Maryla... T cell driven antigen discovery for vaccine ca... 2018-11-30 2015 en T cell driven antigen discovery for vaccine ca... R21AI109439 National Institute of Allergy and Infectious D... en 2015-12-21 grant.4729699 [2015, 2016, 2017, 2018]
3 [{'id': 'grid.452969.5', 'city_name': 'Hanover... Senior Fellowship for Dr. Eduardo Samo Gudo: E... 2018-12-18 2015 en Senior Fellowship for Dr. Eduardo Samo Gudo: E... 91488 Volkswagen Foundation en 2015-12-18 grant.4854433 [2015, 2016, 2017, 2018]
4 [{'id': 'grid.482914.2', 'state_name': 'Distri... Biology, Ecology & Management of Emerging Dise... 2019-09-30 2015 en Biology, Ecology & Management of Emerging Dise... N/A National Institute of Food and Agriculture en 2015-12-10 grant.8821176 [2015, 2016, 2017, 2018, 2019]

Syntax: a list filter consists of the field name, the keyword in, and a list of one or more values enclosed in square brackets ([]), where values are separated by commas (,):

[25]:
%%dsldf
search grants
    for "malaria"
    where research_org_name in [ "UC Berkeley", "UC Davis", "UCLA"  ]
return grants limit 5
Returned Grants: 0
Time: 0.46s
WARNINGS [1]
Field 'research_org_name' is deprecated in favor of research_orgs. Please refer to https://docs.dimensions.ai/dsl/releasenotes.html for more details
[25]:
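
The two forms of the in filter described above can also be generated from Python values. This is a minimal sketch; `in_range` and `in_list` are hypothetical helper names, and the field names are only examples taken from this tutorial.

```python
def in_range(field, low, high):
    """Range filter: field in [ low : high ]."""
    return f"{field} in [ {low} : {high} ]"


def in_list(field, values):
    """List filter: strings are double-quoted, integers are left bare."""
    rendered = [f'"{v}"' if isinstance(v, str) else str(v) for v in values]
    return f"{field} in [ {', '.join(rendered)} ]"


range_filter = in_range("start_year", 2010, 2015)
list_filter = in_list("type", ["article", "chapter"])
print(range_filter)
print(list_filter)
```

Either string can be placed after the where keyword of a query.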

3.3 count - filter function

The filter function count is supported on some fields in publications (e.g. researchers and research_orgs).

Use of this filter is shown in the example below:

[26]:
%%dsldf
search publications
    for "malaria"
    where count(research_orgs) > 5
return research_orgs limit 5
Returned Research_orgs: 5
Time: 2.59s
[26]:
id count types name latitude longitude linkout city_name country_name state_name acronym
0 grid.4991.5 1571 [Education] University of Oxford 51.753437 -1.254010 [http://www.ox.ac.uk/] Oxford United Kingdom Oxfordshire NaN
1 grid.8991.9 1473 [Education] London School of Hygiene & Tropical Medicine 51.520900 -0.130700 [http://www.lshtm.ac.uk/] London United Kingdom Camden LSHTM
2 grid.38142.3c 1095 [Education] Harvard University 42.377052 -71.116650 [http://www.harvard.edu/] Cambridge United States Massachusetts NaN
3 grid.21107.35 867 [Education] Johns Hopkins University 39.328888 -76.620280 [https://www.jhu.edu/] Baltimore United States Maryland JHU
4 grid.7445.2 803 [Education] Imperial College London 51.498600 -0.175478 [http://www.imperial.ac.uk/] London United Kingdom Westminster NaN

Number of publications with more than 50 researchers.

[27]:
%%dsldf
search publications
    for "malaria"
    where count(researchers) > 50
return publications limit 5
Returned Publications: 5 (total = 241)
Time: 0.78s
[27]:
id title volume issue pages type year author_affiliations journal.id journal.title
0 pub.1130215447 The global distribution of lymphatic filariasi... 8 9 e1186-e1194 article 2020 [[{'first_name': 'Aniruddha', 'last_name': 'De... jour.1048786 The Lancet Global Health
1 pub.1126915860 Health sector spending and spending on HIV/AID... 396 10252 693-724 article 2020 [[{'first_name': 'Global Burden of Disease Hea... jour.1077219 The Lancet
2 pub.1130211369 Mapping geographical inequalities in access to... 8 9 e1162-e1185 article 2020 [[{'first_name': 'Aniruddha', 'last_name': 'De... jour.1048786 The Lancet Global Health
3 pub.1129557093 Mapping geographical inequalities in oral rehy... 8 8 e1038-e1060 article 2020 [[{'first_name': 'Local Burden of Disease Diar... jour.1048786 The Lancet Global Health
4 pub.1130303438 Use of hydroxychloroquine in hospitalised COVI... NaN NaN NaN article 2020 [[{'first_name': 'Augusto', 'last_name': 'Di C... jour.1100229 European Journal of Internal Medicine

Number of publications with more than one researcher.

[28]:
%%dsldf
search publications
where count(researchers) > 1
return funders limit 5
Returned Funders: 5
Time: 1.89s
[28]:
id count city_name types name country_name linkout latitude acronym longitude state_name
0 grid.419696.5 1870091 Beijing [Government] National Natural Science Foundation of China China [http://www.nsfc.gov.cn/publish/portal1/] 40.005177 NSFC 116.339830 NaN
1 grid.270680.b 674156 Brussels [Government] European Commission Belgium [http://ec.europa.eu/index_en.htm] 50.851650 EC 4.363670 NaN
2 grid.424020.0 597126 Beijing [Government] Ministry of Science and Technology of the Peop... China [http://www.most.gov.cn/eng/] 39.827835 MOST 116.316284 NaN
3 grid.48336.3a 568639 Rockville [Government] National Cancer Institute United States [http://www.cancer.gov/] 39.004326 NCI -77.101190 Maryland
4 grid.54432.34 542093 Tokyo [Nonprofit] Japan Society for the Promotion of Science Japan [http://www.jsps.go.jp/] 35.687160 JSPS 139.740390 NaN

International collaborations: number of publications with more than one author and affiliations located in more than one country.

[29]:
%%dsldf
search publications
where count(researchers) > 1
and count(research_org_countries) > 1
return funders limit 5
Returned Funders: 5
Time: 1.09s
[29]:
id count types name latitude longitude linkout city_name country_name acronym
0 grid.419696.5 452873 [Government] National Natural Science Foundation of China 40.005177 116.339830 [http://www.nsfc.gov.cn/publish/portal1/] Beijing China NSFC
1 grid.270680.b 344656 [Government] European Commission 50.851650 4.363670 [http://ec.europa.eu/index_en.htm] Brussels Belgium EC
2 grid.424150.6 157949 [Facility] German Research Foundation 50.699340 7.147797 [http://www.dfg.de/en/] Bonn Germany DFG
3 grid.424020.0 149276 [Government] Ministry of Science and Technology of the Peop... 39.827835 116.316284 [http://www.most.gov.cn/eng/] Beijing China MOST
4 grid.54432.34 136016 [Nonprofit] Japan Society for the Promotion of Science 35.687160 139.740390 [http://www.jsps.go.jp/] Tokyo Japan JSPS

Domestic collaborations: number of publications with more than one author and affiliations located in exactly one country.

[30]:
%%dsldf
search publications
where count(researchers) > 1
and count(research_org_countries) = 1
return funders limit 5
Returned Funders: 5
Time: 2.58s
[30]:
id count city_name types name country_name linkout latitude acronym longitude state_name
0 grid.419696.5 1373160 Beijing [Government] National Natural Science Foundation of China China [http://www.nsfc.gov.cn/publish/portal1/] 40.005177 NSFC 116.339830 NaN
1 grid.424020.0 435916 Beijing [Government] Ministry of Science and Technology of the Peop... China [http://www.most.gov.cn/eng/] 39.827835 MOST 116.316284 NaN
2 grid.48336.3a 415902 Rockville [Government] National Cancer Institute United States [http://www.cancer.gov/] 39.004326 NCI -77.101190 Maryland
3 grid.54432.34 371463 Tokyo [Nonprofit] Japan Society for the Promotion of Science Japan [http://www.jsps.go.jp/] 35.687160 JSPS 139.740390 NaN
4 grid.280785.0 326036 Bethesda [Facility] National Institute of General Medical Sciences United States [http://www.nigms.nih.gov/Pages/default.aspx] 38.997833 NIGMS -77.099380 Maryland
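
The international and domestic collaboration queries above differ only in the comparison applied to the country count, so they can be generated from one function. A minimal sketch follows; `count_filter` and `collaboration_query` are hypothetical helpers. The set of operators accepted here is an assumption, as the examples above only demonstrate > and =.

```python
def count_filter(field, op, n):
    """Render a count() filter such as: count(researchers) > 1."""
    # Assumption: count() accepts the usual comparison operators
    if op not in {"=", "!=", ">", "<", ">=", "<="}:
        raise ValueError(f"unsupported operator: {op}")
    return f"count({field}) {op} {int(n)}"


def collaboration_query(international=True):
    """Publications with several researchers, in several or exactly one country."""
    country_op = ">" if international else "="
    filters = [
        count_filter("researchers", ">", 1),
        count_filter("research_org_countries", country_op, 1),
    ]
    return "search publications where " + " and ".join(filters) + " return funders limit 5"


print(collaboration_query(international=True))
print(collaboration_query(international=False))
```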

3.4 Filter Operators

A simple filter expression consists of a field name, an in-/equality operator op, and the desired field value.

The value must be a string enclosed in double quotes (") or an integer (e.g. 1234).

The available operators are:

op              meaning
=               is (or contains if the given field is multi-value)
!=              is not
>               is greater than
<               is less than
>=              is greater than or equal to
<=              is less than or equal to
~               partially matches (see section 3.5 below)
is empty        is empty (see section 3.6 below)
is not empty    is not empty (see section 3.6 below)

A couple of examples:

[31]:
%dsldf search datasets where year > 2010 and year < 2012 return datasets limit 5
Returned Datasets: 5 (total = 38764)
Time: 0.53s
[31]:
keywords id authors year title
0 [human populations, single nucleotide polymorp... 105 [{'name': 'Blaise Li', 'orcid': '0000-0003-308... 2011 India Africa Asia HGDP HapMap frappe K3
1 [human populations, single nucleotide polymorp... 106 [{'name': 'Blaise Li', 'orcid': '0000-0003-308... 2011 India Africa Asia HGDP HapMap frappe K4
2 [human populations, single nucleotide polymorp... 107 [{'name': 'Blaise Li', 'orcid': '0000-0003-308... 2011 India Africa Asia HGDP HapMap frappe K5
3 [human populations, single nucleotide polymorp... 108 [{'name': 'Blaise Li', 'orcid': '0000-0003-308... 2011 India Africa Asia HGDP HapMap frappe K6
4 [human populations, single nucleotide polymorp... 109 [{'name': 'Blaise Li', 'orcid': '0000-0003-308... 2011 India Africa Asia HGDP HapMap frappe K7
[32]:
%dsldf search patents where assignees != "grid.410484.d" return patents limit 5
Returned Patents: 5 (total = 40195054)
Time: 0.66s
[32]:
id times_cited title assignees granted_year assignee_names year publication_date inventor_names filing_status
0 EP-1409282-B1 0 METHODS FOR OPERATING A MOTOR VEHICLE DRIVEN B... [{'id': 'grid.6584.f', 'name': 'Robert Bosch (... 2009 [Robert Bosch GmbH, BOSCH GMBH ROBERT] 2001 2009-12-09 [TUMBACK, STEFAN, SCHNELLE, KLAUS-PETER] Grant
1 EP-0868664-B1 0 MULTI-CYCLE LOOP INJECTION FOR TRACE ANALYSIS ... [{'id': 'grid.418190.5', 'name': 'Thermo Fishe... 2009 [Dionex Corp, DIONEX CORP] 1996 2009-12-09 [RIVIELLO, JOHN, M., REY, MARIA, A.] Grant
2 EP-0861808-B1 1 Waste water treatment apparatus [{'id': 'grid.471210.1', 'name': 'Kuraray (Jap... 2009 [Kuraray Co Ltd, KURARAY CO] 1998 2009-12-09 [TANAKA, EIJI, HIGASHI, TAMIO, KITAMURA, TAKAN... Grant
3 EP-0805365-B1 0 Optical waveguide grating and production metho... [{'id': 'grid.471143.4', 'name': 'Fujikura (Ja... 2009 [Fujikura Ltd, FUJIKURA LTD] 1997 2009-12-09 [NAKAI, MICHIHIRO, SHIMA, KENSUKE, HIDAKA, HIR... Grant
4 EP-1970973-B1 0 Method for thermal matching of a thermoelectri... [{'id': 'grid.426571.3', 'name': 'Imec the Net... 2009 [INTERUNIVERSITAIR MICROELEKTRONICA CENTRUM NE... 2007 2009-12-09 [LEONOV, VLADIMIR] Grant
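
The quoting rule for filter values (strings in double quotes, integers bare) is easy to get wrong when filters are built from variables. The sketch below shows one way to apply it; `simple_filter` is a hypothetical helper and performs no check that the field or operator is valid for a given source.

```python
def simple_filter(field, op, value):
    """Render `field op value`; strings are double-quoted, integers are not."""
    # bool is a subclass of int in Python, so it is rejected explicitly
    if isinstance(value, bool) or not isinstance(value, (str, int)):
        raise TypeError("value must be a string or an integer")
    rendered = f'"{value}"' if isinstance(value, str) else str(value)
    return f"{field} {op} {rendered}"


filters = [simple_filter("year", ">", 2010), simple_filter("year", "<", 2012)]
where_clause = "where " + " and ".join(filters)
print(where_clause)
print(simple_filter("assignees", "!=", "grid.410484.d"))
```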

3.5 Partial string matching with ~

The ~ operator indicates that the given field need only partially, instead of exactly, match the given string (the value used with this operator must be a string, not an integer).

For example, the filter where research_orgs.name~"Saarland Uni" would match both the organization named “Saarland University” and the one named “Universitätsklinikum des Saarlandes”, and any other organization whose name includes the terms “Saarland” and “Uni” (the order is unimportant).

[33]:
%%dsldf
search patents
    where assignee_names ~ "IBM"
return assignees limit 5
Returned Assignees: 5
Time: 2.04s
[33]:
id count name city_name country_name
0 grid.410484.d 336471 IBM (United States) Armonk United States
1 grid.471366.1 22104 GlobalFoundries (Cayman Islands) George Town Cayman Islands
2 grid.14648.3f 5139 IBM (United Kingdom) Winchester United Kingdom
3 grid.420451.6 3555 Google Mountain View United States
4 grid.472772.3 2716 Lenovo (China) Beijing China
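
To build an intuition for which names a ~ filter can match, the behaviour described above (every term must occur, in any order) can be approximated locally. This is only a rough sketch: it assumes case-insensitive substring matching per term, which may differ from how the DSL actually tokenizes and normalises text, and `partially_matches` is a hypothetical function.

```python
def partially_matches(query, name):
    """Return True if every whitespace-separated term of `query` occurs in `name`.

    Assumption: case-insensitive substring matching; term order is ignored.
    """
    haystack = name.casefold()
    return all(term.casefold() in haystack for term in query.split())


names = [
    "Saarland University",
    "Universitätsklinikum des Saarlandes",
    "University of Oxford",
]
matches = [n for n in names if partially_matches("Saarland Uni", n)]
print(matches)
```

Under this approximation the first two names match, as in the Saarland example above, while the third does not because it lacks the term "Saarland".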

3.6 Emptiness filters is empty

To select records in which a specific field has a value, or records in which that field is empty, use filters such as where research_orgs is not empty or where issn is empty.

[34]:
%%dsldf
search publications
    for "iron graphene"
    where researchers is empty
    and research_orgs is not empty
return publications[id+title+researchers+research_orgs+type] limit 5
Returned Publications: 5 (total = 1883)
Time: 1.71s
[34]:
id research_orgs type title
0 pub.1129668998 [{'id': 'grid.440673.2', 'name': 'Changzhou Un... article Removal of Toxic Heavy Metal Ions (Pb, Cr, Cu,...
1 pub.1129771684 [{'id': 'grid.412246.7', 'name': 'Northeast Fo... article Nanofluid-based pulsating heat pipe for therma...
2 pub.1129041696 [{'id': 'grid.79703.3a', 'name': 'South China ... article Fabrication of the novel Ag-doped SnS2@InVO4 c...
3 pub.1130477930 [{'id': 'grid.452276.0', 'name': 'Institute of... article Atomically-precise dopant-controlled single cl...
4 pub.1130537929 [{'id': 'grid.79703.3a', 'name': 'South China ... article Crafting visible-light-absorbing dye-doped pha...

4. Searching for Researchers

The DSL offers different mechanisms for searching for researchers (e.g. publication authors, grant investigators), each of them presenting specific advantages.

4.1 Exact name searches

Special full-text indices allow looking up a researcher’s first name and surname exactly as they appear in the source documents they derive from.

This approach has a broad scope, as it searches the full collection of Dimensions documents irrespective of whether a researcher was successfully disambiguated (and hence given a Dimensions ID). On the other hand, it only matches names as they appear in the source document, so different spellings or initials are not necessarily returned by a single query.

search in [authors|investigators|inventors]

It is possible to look up publication authors using a specific search index called authors.

This method expects case-insensitive phrases in the format "\"<first name> <last name>\"" or in reverse order. Note that double quotes nested inside a double-quoted string must always be escaped with a backslash (\).

[35]:
%dsldf search publications in authors for "\"Charles Peirce\"" return publications limit 5
Returned Publications: 5 (total = 144)
Time: 0.62s
[35]:
title pages author_affiliations year id type
0 5. On Logical Graphs 211-261 [[{'first_name': 'Charles S.', 'last_name': 'P... 2019 pub.1123488521 chapter
1 12. Peripatetic Talks 348-366 [[{'first_name': 'Charles S.', 'last_name': 'P... 2019 pub.1123488528 chapter
2 Bibliography of Peirce’s References 642-651 [[{'first_name': 'Charles S.', 'last_name': 'P... 2019 pub.1123488545 chapter
3 14. On the First Principles of Logical Algebra 385-398 [[{'first_name': 'Charles S.', 'last_name': 'P... 2019 pub.1123488530 chapter
4 26. Assurance through Reasoning 565-585 [[{'first_name': 'Charles S.', 'last_name': 'P... 2019 pub.1123488542 chapter

Initials can also be used instead of the first name. These are examples of valid researcher search phrases:

  • \"Peirce, Charles S.\"

  • \"Charles S. Peirce\"

  • \"CS Peirce\"

  • \"Peirce CS\"

  • \"C S Peirce\"

  • \"Peirce C S\"

  • \"C Peirce\"

  • \"Peirce C\"

  • \"Charles Peirce\"

  • \"Peirce Charles\"

Warning: to produce valid results, an author or investigator search query must contain at least two components (e.g. name and surname, either in full or as initials).

Searching for investigators works like searching for authors, except that grants and clinical trials are searched using a separate index, investigators, and patents using the index inventors.
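
Escaping the nested quotes by hand is error-prone, so it can help to generate the search phrase from a plain name. The sketch below wraps a name in escaped double quotes and enforces the two-component rule from the warning above; `exact_name_phrase` and `author_query` are hypothetical helpers, not Dimcli functions.

```python
def exact_name_phrase(name):
    """Wrap a researcher name in escaped double quotes for an exact search."""
    components = name.replace(",", " ").split()
    if len(components) < 2:
        raise ValueError("need at least two name components, e.g. 'C Peirce'")
    # Inner quotes are escaped with a backslash: "\"Charles Peirce\""
    return f'"\\"{name}\\""'


def author_query(name, limit=5):
    """Build an exact-name query against the authors index."""
    phrase = exact_name_phrase(name)
    return f"search publications in authors for {phrase} return publications limit {limit}"


print(author_query("Charles Peirce"))
```

The printed query is the same as the one in cell [35].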

[36]:
%%dsldf
search clinical_trials in investigators for "\"John Smith\""
return clinical_trials limit 5
Returned Clinical_trials: 3 (total = 3)
Time: 0.72s
[36]:
id active_years title investigator_details
0 NCT00689533 [2008, 2009, 2010, 2011, 2012, 2013, 2014, 201... VEPTR Implantation to Treat Children With Earl... [[John M Flynn, MD, Principal Investigator, Ch...
1 NCT01241149 NaN Prospective Evaluation of Symptom Resolution i... [[Ellie Mentler, MD, Principal Investigator, U...
2 NCT04072380 [2019, 2020] A Phase 2, Double-blind, Placebo-controlled, P... [[Rohith G. Patel, MD, Principal Investigator,...
[37]:
%%dsldf
search grants in investigators for "\"Satoko Shimazaki\""
return grants limit 5
Returned Grants: 4 (total = 4)
Time: 0.54s
[37]:
funders title end_date start_year title_language original_title project_num funding_org_name language start_date id active_year
0 [{'id': 'grid.422239.c', 'state_name': 'Distri... Kabuki Actors, Print Technology, and the Theat... 2022-08-31 2021 en Kabuki Actors, Print Technology, and the Theat... FEL-263245-19 National Endowment for the Humanities en 2021-09-01 grant.7925589 [2021, 2022]
1 [{'id': 'grid.54432.34', 'city_name': 'Tokyo',... Genealogy research on female saints in the Pal... 2021-03-31 2018 ja 古・中英語期における女性聖人伝の系譜研究:Aelfricのテクストと言語を中心に 18K00431 Japan Society for the Promotion of Science ja 2018-04-01 grant.7527261 [2018, 2019, 2020, 2021]
2 [{'id': 'grid.54432.34', 'city_name': 'Tokyo',... Images of Women in the Old English Lives of Sa... 2018-03-31 2015 en Images of Women in the Old English Lives of Sa... 15K02313 Japan Society for the Promotion of Science en 2015-04-01 grant.5858713 [2015, 2016, 2017, 2018]
3 [{'id': 'grid.54432.34', 'city_name': 'Tokyo',... Reception and Transfromation of the Images of ... 2015-03-31 2012 en Reception and Transfromation of the Images of ... 24520310 Japan Society for the Promotion of Science en 2012-04-01 grant.6086985 [2012, 2013, 2014, 2015]
[38]:
%%dsldf
search patents in inventors for "\"John Smith\""
return patents limit 5
Returned Patents: 5 (total = 502)
Time: 0.75s
[38]:
title publication_date granted_year assignee_names year inventor_names times_cited filing_status id assignees
0 A lockable safety insert for an electrical dom... 2004-11-03 2004.0 [SMITH JOHN] 2003 [SMITH JOHN] 0.0 Grant IE-S20030195-A2 NaN
1 Automotive heat exchanger 2006-03-22 2006.0 [Llanelli Radiators Ltd, Calsonic Kansei UK Lt... 2002 [SMITH JOHN] 0.0 Grant GB-2384299-B [{'id': 'grid.472810.8', 'city_name': 'Llanell...
2 Extractor 2007-10-25 NaN [SMITH JOHN A] 2007 [John Smith] 6.0 Application US-20070245563-A1 NaN
3 Boom utilized in a geometric end effector system 2018-02-06 2018.0 [DESTACO Europe GmbH, CAPITAL FORMATION INC, D... 2014 [John Smith] NaN Grant US-9884426-B2 [{'id': 'grid.472738.d', 'city_name': 'Teltow'...
4 Ammunition cartridge 2014-10-22 NaN [Eley Ltd, ELEY LTD] 2013 [SMITH JOHN] 0.0 Application GB-2513101-A NaN

4.2 Fuzzy Searches

This type of search is similar to full-text search, with the difference that it allows searching by only part of a name (e.g. only the last name of a person) by using the where clause.

Note: at the moment, this type of search is only available for publications. Other sources will add this option in the future.

For example:

[39]:
%%dsldf
search publications where authors = "Hawking"
return publications[id+doi+title+authors] limit 5

Generally speaking, using a where clause to search authors is less precise than using the relevant exact-search syntax.

On the other hand, using a where clause can be handy if one wants to combine an author search with another full-text search index.

For example:

[40]:
%%dsldf
search publications
    in title_abstract_only for "dna replication"
    where authors = "smith"
return publications limit 5
Returned Publications: 5 (total = 1544)
Time: 1.14s
[40]:
title pages author_affiliations year issue id type volume journal.id journal.title
0 Identifying epigenetic biomarkers of establish... 95 [[{'first_name': 'Ryan', 'last_name': 'Langdon... 2020 1 pub.1128835470 article 12 jour.1042271 Clinical Epigenetics
1 Genetic associations with clozapine-induced my... 37 [[{'first_name': 'Paul', 'last_name': 'Lacaze'... 2020 1 pub.1124910780 article 10 jour.1045271 Translational Psychiatry
2 Genomic analyses of early responses to radiati... 8979 [[{'first_name': 'Saket', 'last_name': 'Choudh... 2020 1 pub.1128124846 article 10 jour.1045337 Scientific Reports
3 An epigenome-wide association study of posttra... 46 [[{'first_name': 'Mark W.', 'last_name': 'Logu... 2020 1 pub.1125664041 article 12 jour.1042271 Clinical Epigenetics
4 Longitudinal epigenome-wide association studie... 11 [[{'first_name': 'Clara', 'last_name': 'Snijde... 2020 1 pub.1124060243 article 12 jour.1042271 Clinical Epigenetics

4.3 Using the disambiguated Researchers database

The Dimensions Researchers source is a database of researcher information algorithmically extracted and disambiguated from all of the other content sources (publications, grants, clinical trials etc.).

By using the researchers source it is possible to match an ‘aggregated’ person object linking together multiple publication authors, grant investigators etc., irrespective of the form their names take in the original source documents.

However, this database does not contain all of the author and investigator information available in Dimensions.

E.g. think of authors from older publications, or authors with very common names that are difficult to disambiguate, or very new authors who have only one or a few publications. In such cases, using the full-text authors search might be more appropriate.

Examples:

[41]:
%%dsldf
search researchers for "\"Satoko Shimazaki\""
return researchers[basics+obsolete]
Returned Researchers: 4 (total = 4)
Time: 1.24s
[41]:
id first_name last_name obsolete research_orgs
0 ur.07751146721.59 Satoko Shimazaki 0 NaN
1 ur.010537333602.30 Satoko Shimazaki 1 NaN
2 ur.014307627665.09 Satoko Shimazaki 0 [{'id': 'grid.19006.3e', 'types': ['Education'...
3 ur.015527473602.63 Satoko Shimazaki 0 [{'id': 'grid.266190.a', 'types': ['Education'...

NOTE: pay attention to the obsolete field, which indicates the researcher ID status: 0 means that the researcher ID is still active, 1 means that the researcher ID is no longer valid. This is due to the ongoing process of refinement of Dimensions researchers.

Hence the query above is best written like this:

[42]:
%%dsldf
search researchers where obsolete=0 for "\"Satoko Shimazaki\""
return researchers[basics+obsolete]
Returned Researchers: 3 (total = 3)
Time: 1.21s
[42]:
last_name first_name id obsolete research_orgs
0 Shimazaki Satoko ur.07751146721.59 0 NaN
1 Shimazaki Satoko ur.014307627665.09 0 [{'id': 'grid.19006.3e', 'name': 'University o...
2 Shimazaki Satoko ur.015527473602.63 0 [{'id': 'grid.266190.a', 'name': 'University o...
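
If records have already been retrieved without the obsolete filter, the same clean-up can be applied on the client side. A minimal sketch in plain Python follows; the IDs and flags are copied from the output of cell [41], and `active_only` is a hypothetical helper.

```python
# Researcher IDs and obsolete flags as returned by cell [41]
researchers = [
    {"id": "ur.07751146721.59", "obsolete": 0},
    {"id": "ur.010537333602.30", "obsolete": 1},
    {"id": "ur.014307627665.09", "obsolete": 0},
    {"id": "ur.015527473602.63", "obsolete": 0},
]


def active_only(records):
    """Keep only records whose researcher ID is still active (obsolete == 0)."""
    return [r for r in records if r.get("obsolete") == 0]


active_ids = [r["id"] for r in active_only(researchers)]
print(active_ids)
```

Filtering in the query itself, as in cell [42], remains preferable because obsolete records are then never transferred.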

With Researchers, one can use other fields as well:

[43]:
%%dsldf
search researchers
    where obsolete=0 and last_name="Shimazaki"
return researchers[basics] limit 5
Returned Researchers: 5 (total = 454)
Time: 0.72s
[43]:
last_name first_name id research_orgs
0 Shimazaki Tatsuo ur.013510032403.65 [{'id': 'grid.419075.e', 'name': 'Ames Researc...
1 Shimazaki Tomomi ur.010700310627.87 [{'id': 'grid.471199.3', 'name': 'Murata (Japa...
2 Shimazaki Dai ur.011035131473.19 [{'id': 'grid.415776.6', 'name': 'National Ins...
3 Shimazaki Koji ur.016627632300.80 NaN
4 Shimazaki Toshiyuki ur.013205240215.48 [{'id': 'grid.420062.2', 'name': 'Nissan Chemi...

5. Returning results

After the search phrase, a query must contain one or more return phrases, specifying the content and format of the information that should be returned.

5.1 Returning Multiple Sources

A single return phrase cannot return multiple results. To return several result sets from one query, use multiple return phrases, one per result.

[44]:
%%dsldf
search publications
return funders limit 5
return research_orgs limit 5
return year
Returned Year: 20
Returned Research_orgs: 5
Returned Funders: 5
Time: 4.38s
[Warning] Dataframe created from first available key, but more than one JSON key found: ['year', 'research_orgs', 'funders']
[44]:
id count
0 2019 5573486
1 2018 5172592
2 2017 4817375
3 2016 4426951
4 2015 4244304
5 2020 4166850
6 2014 4101478
7 2013 3909910
8 2012 3646455
9 2011 3527334
10 2010 3103077
11 2009 2963600
12 2008 2804045
13 2007 2800801
14 2006 2503143
15 2005 2293845
16 2004 2174926
17 2003 1991601
18 2002 1849468
19 2001 1789290
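
As the warning above indicates, a query with several return phrases produces one JSON key per result, and only the first key is turned into a dataframe automatically. The sketch below shows how such a payload could be separated by key in plain Python. It assumes that metadata keys start with an underscore (e.g. `_stats`); `split_results` is a hypothetical helper, and the payload is a hand-written stand-in for a real response.

```python
def split_results(payload):
    """Map each result key to its list of records, skipping metadata keys.

    Assumption: metadata keys start with an underscore.
    """
    return {
        key: value
        for key, value in payload.items()
        if not key.startswith("_") and isinstance(value, list)
    }


# Stand-in payload: the year counts are copied from the output above;
# the other two lists are left empty rather than invented.
payload = {
    "year": [{"id": 2019, "count": 5573486}, {"id": 2018, "count": 5172592}],
    "research_orgs": [],
    "funders": [],
    "_stats": {},
}
tables = split_results(payload)
print(sorted(tables))
```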

5.2 Returning Specific Fields

For control over which information from each given record will be returned, a source or entity name in the results phrase can be optionally followed by a specification of fields and fieldsets to be included in the JSON results for each retrieved record.

The fields specification may be an arbitrary list of field names enclosed in brackets ([, ]), with field names separated by a plus sign (+). A minus sign (-) can be used to exclude a field or a fieldset from the result. Field names listed within brackets must be “known” to the DSL, and therefore only a subset of fields may be used in this syntax (see note below).

[45]:
%%dsldf
search grants
return grants[grant_number + title + language] limit 5
Returned Grants: 5 (total = 5514056)
Time: 0.46s
[45]:
grant_number title language
0 RAAK.PRO02.048 Sensing alarm responses of ungulate herds to p... en
1 890218 Functional analysis of ribosome heterogeneity ... en
2 2018-HRSI-1548 APPROACH to Enriching the Real World Evidence ... en
3 894029 Knowledge Transfer in Global Gender Programmes... en
4 1301720F Molecular mechanism of DNA double strand break... en
[46]:
%%dsldf
search clinical_trials
return clinical_trials [id+ title + acronym + phase] limit 5
Returned Clinical_trials: 5 (total = 582398)
Time: 0.50s
[46]:
phase id title acronym
0 N/A NCT02318264 Influence of Elastic Tape on Activation of the... NaN
1 N/A NCT02318290 Opioids Withdrawal Syndrome in Critically Ill ... WAAICUP
2 Phase 2 NCT02318303 A Double-blind, Randomized, Parallel-group, Co... NaN
3 N/A NCT02318316 "Exhaled Breath Condensate" in Allogeneic Stem... NaN
4 Phase 1 NCT02318329 A Phase 1 Open-Label, Dose-Finding Study Evalu... NaN

Shortcuts: fieldsets

The fields specification may be the name of a pre-defined fieldset (e.g. extras, basics). These are shortcuts that can be handy when testing out new queries, for example.

NOTE: In general, when writing code used in integrations or long-standing extraction scripts, it is best to return specific fields rather than a predefined set. This also has the advantage of making queries faster by avoiding the extraction of unnecessary data.

[47]:
%%dsldf
search grants
return grants [basics] limit 5
Returned Grants: 5 (total = 5514056)
Time: 0.57s
WARNINGS [2]
Field 'project_num' is deprecated in favor of grant_number. Please refer to https://docs.dimensions.ai/dsl/releasenotes.html for more details
Field 'title_language' is deprecated in favor of language_title. Please refer to https://docs.dimensions.ai/dsl/releasenotes.html for more details
[47]:
language start_date project_num title_language funding_org_name id original_title start_year active_year title funders end_date
0 en 2021-12-27 RAAK.PRO02.048 en Dutch Research Council grant.6946936 Sensing alarm responses of ungulate herds to p... 2021 [2021] Sensing alarm responses of ungulate herds to p... [{'id': 'grid.420488.2', 'name': 'Dutch Resear... NaN
1 en 2021-12-01 890218 en European Commission grant.9064785 Functional analysis of ribosome heterogeneity ... 2021 [2021, 2022, 2023] Functional analysis of ribosome heterogeneity ... [{'id': 'grid.270680.b', 'name': 'European Com... 2023-11-30
2 en 2021-11-30 2018-HRSI-1548 en New Brunswick Health Research Foundation grant.8690978 APPROACH to Enriching the Real World Evidence ... 2021 [2021] APPROACH to Enriching the Real World Evidence ... [{'id': 'grid.484521.e', 'name': 'New Brunswic... NaN
3 en 2021-10-01 894029 en European Commission grant.9064813 Knowledge Transfer in Global Gender Programmes... 2021 [2021, 2022, 2023, 2024] Knowledge Transfer in Global Gender Programmes... [{'id': 'grid.270680.b', 'name': 'European Com... 2024-09-30
4 en 2021-10-01 1301720F en Fund for Scientific Research grant.8950252 Mécanismes moléculaires de la formation et la ... 2021 [2021] Molecular mechanism of DNA double strand break... [{'id': 'grid.424470.1', 'name': 'Fund for Sci... NaN
[48]:
%%dsldf
search publications
return publications [basics+times_cited] limit 5
Returned Publications: 5 (total = 112275335)
Time: 1.20s
WARNINGS [1]
Field 'author_affiliations' is deprecated in favor of authors. Please refer to https://docs.dimensions.ai/dsl/releasenotes.html for more details
[48]:
year id pages author_affiliations volume times_cited issue type title journal.id journal.title
0 2020 pub.1130041027 1793599 [[{'first_name': 'Thanos', 'last_name': 'Karat... 11 0 1 article Adverse and benevolent childhood experiences i... jour.1045059 European Journal of Psychotraumatology
1 2020 pub.1129454261 191-202 [[{'first_name': 'Rafael', 'last_name': 'Valdi... NaN 0 NaN chapter FACTORES PSICOSOCIALES ASOCIADOS A MENORES CON... NaN NaN
2 2020 pub.1125632078 333-349 NaN NaN 0 NaN chapter Literature NaN NaN
3 2020 pub.1124099280 1704540 [[{'first_name': 'Mahendra M', 'last_name': 'R... 13 0 1 article To start or to complete? – Challenges in imple... jour.1041075 Global Health Action
4 2020 pub.1124649186 1717411 [[{'first_name': 'Benjamin-Samuel', 'last_name... 13 1 1 article Long-term trends in seasonality of mortality i... jour.1041075 Global Health Action
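
The bracketed specifications used in the cells above (field lists, fieldsets, and combinations of the two) follow one pattern, which can be generated from Python lists. This is a minimal sketch; `fields_spec` is a hypothetical helper, and the way exclusions are rendered (a minus sign before each excluded name) is an assumption based on the description of the syntax earlier in this section.

```python
def fields_spec(include, exclude=()):
    """Render a fields specification such as [basics+times_cited]."""
    if not include:
        raise ValueError("at least one field or fieldset is required")
    spec = "+".join(include)
    # Assumption: each excluded field or fieldset is appended with a minus sign
    spec += "".join(f"-{name}" for name in exclude)
    return f"[{spec}]"


spec = fields_spec(["basics", "times_cited"])
query = f"search publications return publications{spec} limit 5"
print(query)
print(fields_spec(["basics"], exclude=["title"]))
```

Listing fields explicitly in this way also keeps extraction scripts independent of changes to the predefined fieldsets.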

The fields specification may also be [all], which indicates that all fields available for the given source should be returned.

[49]:
%%dsldf
search publications
return publications [all] limit 5
Returned Publications: 5 (total = 112275334)
Time: 1.27s
WARNINGS [10]
Field 'references' is deprecated in favor of reference_ids. Please refer to https://docs.dimensions.ai/dsl/releasenotes.html for more details
Field 'terms' is deprecated in favor of concepts. Please refer to https://docs.dimensions.ai/dsl/releasenotes.html for more details
Field 'RCDC' is deprecated in favor of category_rcdc. Please refer to https://docs.dimensions.ai/dsl/releasenotes.html for more details
Field 'HRCS_RAC' is deprecated in favor of category_hrcs_rac. Please refer to https://docs.dimensions.ai/dsl/releasenotes.html for more details
Field 'FOR' is deprecated in favor of category_for. Please refer to https://docs.dimensions.ai/dsl/releasenotes.html for more details
Field 'author_affiliations' is deprecated in favor of authors. Please refer to https://docs.dimensions.ai/dsl/releasenotes.html for more details
Field 'category_ua' is deprecated in favor of category_uoa. Please refer to https://docs.dimensions.ai/dsl/releasenotes.html for more details
Field 'HRCS_HC' is deprecated in favor of category_hrcs_hc. Please refer to https://docs.dimensions.ai/dsl/releasenotes.html for more details
Field 'open_access' is deprecated in favor of open_access_categories. Please refer to https://docs.dimensions.ai/dsl/releasenotes.html for more details
Field 'FOR_first' is deprecated in favor of category_for. Please refer to https://docs.dimensions.ai/dsl/releasenotes.html for more details
[49]:
open_access_categories pages publisher altmetric_id type title year recent_citations doi times_cited ... authors references HRCS_RAC volume open_access concepts_scores journal.id journal.title research_org_state_names research_org_state_codes
0 [{'id': 'closed', 'description': 'No freely av... 333-349 De Gruyter 0 chapter Literature 2020 0 10.1515/9783110823547-013 0 ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
1 [{'id': 'oa_all', 'description': 'Article is f... 1704540 Taylor & Francis 74041725 article To start or to complete? – Challenges in imple... 2020 0 10.1080/16549716.2019.1704540 0 ... [{'first_name': 'Mahendra M', 'last_name': 'Re... [pub.1084776885, pub.1026226848, pub.100783600... [{'id': '10801', 'name': '8.1 Organisation and... 13 [Open Access - all, Open Access - publisher, O... [{'concept': 'isoniazid preventive therapy', '... jour.1041075 Global Health Action NaN NaN
2 [{'id': 'oa_all', 'description': 'Article is f... 1717411 Taylor & Francis 75135566 article Long-term trends in seasonality of mortality i... 2020 1 10.1080/16549716.2020.1717411 1 ... [{'first_name': 'Benjamin-Samuel', 'last_name'... [pub.1070577469, pub.1035360137, pub.111994906... NaN 13 [Open Access - all, Open Access - publisher, O... [{'concept': 'cause-specific mortality', 'rele... jour.1041075 Global Health Action [New Jersey] [{'id': 'US-NJ', 'name': 'New Jersey'}]
3 [{'id': 'closed', 'description': 'No freely av... 167-190 De Gruyter 0 chapter Eine Warnung an alle, dy sych etwaz duncken: D... 2020 0 10.1515/9783110950762-012 0 ... [{'first_name': 'Ulla', 'last_name': 'Williams... NaN NaN NaN NaN NaN NaN NaN NaN NaN
4 [{'id': 'closed', 'description': 'No freely av... 241-276 De Gruyter 0 chapter Marienklagen und Pietà 2020 0 10.1515/9783110922035-011 0 ... [{'first_name': 'Georg', 'last_name': 'Satzing... NaN NaN NaN NaN NaN NaN NaN NaN NaN

5 rows × 57 columns
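
Each deprecation warning printed above names an old field and its replacement. The following is a minimal sketch, assuming the warning text keeps the exact wording shown in that output, that turns such messages into a lookup table. The function name `deprecated_field_map` is hypothetical and is not part of Dimcli.

```python
import re

# Matches warnings such as:
# "Field 'terms' is deprecated in favor of concepts. Please refer to ..."
_DEPRECATION = re.compile(
    r"Field '(?P<old>[^']+)' is deprecated in favor of (?P<new>\w+)"
)


def deprecated_field_map(warnings):
    """Map each deprecated field name to the replacement named in its warning."""
    mapping = {}
    for message in warnings:
        match = _DEPRECATION.search(message)
        if match:  # warnings with a different wording are ignored
            mapping[match.group("old")] = match.group("new")
    return mapping


# Three of the warnings printed above, copied verbatim
sample_warnings = [
    "Field 'references' is deprecated in favor of reference_ids. Please refer to https://docs.dimensions.ai/dsl/releasenotes.html for more details",
    "Field 'terms' is deprecated in favor of concepts. Please refer to https://docs.dimensions.ai/dsl/releasenotes.html for more details",
    "Field 'FOR' is deprecated in favor of category_for. Please refer to https://docs.dimensions.ai/dsl/releasenotes.html for more details",
]

replacements = deprecated_field_map(sample_warnings)
print(replacements)
```

A table like this can help when replacing [all] with an explicit field list that avoids the deprecated names.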

5.3 Returning Facets

In addition to returning source records matching a query, it is possible to facet on the entity fields related to a particular source and return only those entity values as an aggregated view of the related source data. This operation is similar to a "group by" or a pivot table.

Warning: Faceting can return at most 1000 results, which ensures adequate performance for all queries. Furthermore, although the limit operator is allowed, the skip operator cannot be used.

[50]:
%%dsldf
search publications
    for "coronavirus"
return research_orgs limit 5
Returned Research_orgs: 5
Time: 0.53s
[50]:
id count name latitude state_name types linkout country_name longitude city_name acronym
0 grid.38142.3c 1394 Harvard University 42.377052 Massachusetts [Education] [http://www.harvard.edu/] United States -71.11665 Cambridge NaN
1 grid.21107.35 1288 Johns Hopkins University 39.328888 Maryland [Education] [https://www.jhu.edu/] United States -76.62028 Baltimore JHU
2 grid.17063.33 1199 University of Toronto 43.661667 Ontario [Education] [http://www.utoronto.ca/] Canada -79.39500 Toronto NaN
3 grid.4991.5 1183 University of Oxford 51.753437 Oxfordshire [Education] [http://www.ox.ac.uk/] United Kingdom -1.25401 Oxford NaN
4 grid.194645.b 1176 University of Hong Kong 22.283287 Hong Kong [Education] [http://www.hku.hk/] China 114.13708 Hong Kong HKU
[51]:
%%dsldf
search publications
    for "coronavirus"
return research_org_countries limit 5
return year limit 5
return category_for limit 5
Returned Research_org_countries: 5
Returned Year: 5
Returned Category_for: 5
Time: 0.60s
[Warning] Dataframe created from first available key, but more than one JSON key found: ['research_org_countries', 'year', 'category_for']
[51]:
id count name
0 US 44418 United States
1 CN 19128 China
2 GB 14325 United Kingdom
3 DE 8371 Germany
4 IT 8316 Italy
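
As the warning above indicates, a query with several return phrases yields one JSON key per facet, and only the first key ends up in the dataframe. The sketch below shows one way to separate the keys. It assumes the response is a plain dictionary with one list of rows per facet plus bookkeeping keys such as `_stats`. The helper name `facet_tables` is hypothetical, and the `year` row in the sample is a made-up placeholder rather than real data.

```python
def facet_tables(payload):
    """Return {facet_name: rows}, skipping bookkeeping keys such as '_stats'."""
    return {
        key: rows
        for key, rows in payload.items()
        if not key.startswith("_") and isinstance(rows, list)
    }


# Hypothetical payload shaped like a multi-facet response.
# The country rows are copied from the output above; the year row is a
# made-up placeholder used only to show a second key.
sample_payload = {
    "_stats": {},  # contents omitted
    "research_org_countries": [
        {"id": "US", "count": 44418, "name": "United States"},
        {"id": "CN", "count": 19128, "name": "China"},
    ],
    "year": [
        {"id": 2020, "count": 1},  # placeholder, not real data
    ],
}

tables = facet_tables(sample_payload)
for name, rows in tables.items():
    print(name, "->", len(rows), "rows")
```

Each list of rows can then be passed to `pd.DataFrame` to get one dataframe per facet.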

For control over the organization and headers of the JSON query results, the return keyword in a return phrase may be followed by the keyword in and then a group name for this group of results, enclosed in double quotes (").

Also, one can define aliases that replace the default JSON field names with names provided by the user.

See the official documentation for more details about this feature.

[70]:
%%dsl
search publications
return in "facets" funders
return in "facets" research_orgs
Returned Facets: 2
Time: 2.77s
[70]:
<dimcli.DslDataset object #4663838032. Records: 2/112275334>

5.4 What the query statistics refer to - sources VS facets

When performing a DSL search, a _stats object is returned, which contains useful information such as the total number of records available for a search.

[53]:
%%dsldf
search publications
  where year in [2013:2018] and research_orgs="grid.258806.1"
return publications limit 5
Returned Publications: 5 (total = 3727)
Time: 0.55s
[53]:
type volume pages author_affiliations id year issue title journal.id journal.title
0 article 3 18124-18131 [[{'first_name': 'Siewteng', 'last_name': 'Sim... pub.1110885950 2018 12 Development of Organo-Dispersible Graphene Oxi... jour.1157000 ACS Omega
1 proceeding NaN NaN [[{'first_name': 'T.', 'last_name': 'Miyagi', ... pub.1110925389 2018 NaN Nuclear Ab Initio Calculations with the Unitar... NaN NaN
2 article 122 29200-29209 [[{'first_name': 'Taro', 'last_name': 'Toyoda'... pub.1110369527 2018 51 Anisotropic Crystal Growth, Optical Absorption... jour.1038386 The Journal of Physical Chemistry C
3 article 122 28491-28496 [[{'first_name': 'Liang', 'last_name': 'Wang',... pub.1110271601 2018 50 Indium Zinc Oxide Electron Transport Layer for... jour.1038386 The Journal of Physical Chemistry C
4 article 10 43682-43690 [[{'first_name': 'Ami', 'last_name': 'Nomura',... pub.1110222625 2018 50 Chalcopyrite ZnSnSb2: A Promising Thermoelectr... jour.1041450 ACS Applied Materials & Interfaces

It is important to note though that the total number always refers to the main source, never the facets one is searching for.

For example, in this query we return researchers linked to publications:

[54]:
%%dsldf
search publications
  where year in [2013:2018] and research_orgs="grid.258806.1"
return researchers limit 5
Returned Researchers: 5
Time: 0.86s
[54]:
id count last_name first_name research_orgs orcid_id
0 ur.01055753603.27 140 Hayase Shuzi Shuzi [grid.419082.6, grid.482504.f, grid.14003.36, ... NaN
1 ur.011212042763.67 102 Hikita Masayuki [grid.27476.30, grid.462727.2, grid.258806.1] NaN
2 ur.01144540527.52 100 Ma Ting-Li [grid.177174.3, grid.30055.33, grid.11135.37, ... [0000-0002-3310-459X]
3 ur.07644453127.11 96 Kozako M Kozako M [grid.462727.2, grid.471634.3, grid.258806.1, ... NaN
4 ur.016357156077.09 86 Lu Huimin [grid.454850.8, grid.41156.37, grid.258806.1, ... [0000-0001-9794-3221]

NOTE: a facet query returns at most 1000 results (due to performance limitations), so when more than 1000 facet values exist it is not possible to know their total number.

5.5 Paginating Results

At the end of a return phrase, the user can specify the maximum number of results to be returned and the number of top records to skip over before returning the first result record. This makes it possible to return large result sets page by page (i.e. “paging” results), as described below.

This is done using the keyword limit followed by the maximum number of results to return, optionally followed by the keyword skip and the number of results to skip (the offset).

[55]:
%%dsldf
search publications return publications limit 10
Returned Publications: 10 (total = 112275335)
Time: 0.46s
[55]:
title pages author_affiliations year issue id type volume journal.id journal.title
0 Adverse and benevolent childhood experiences i... 1793599 [[{'first_name': 'Thanos', 'last_name': 'Karat... 2020 1 pub.1130041027 article 11 jour.1045059 European Journal of Psychotraumatology
1 FACTORES PSICOSOCIALES ASOCIADOS A MENORES CON... 191-202 [[{'first_name': 'Rafael', 'last_name': 'Valdi... 2020 NaN pub.1129454261 chapter NaN NaN NaN
2 Literature 333-349 NaN 2020 NaN pub.1125632078 chapter NaN NaN NaN
3 To start or to complete? – Challenges in imple... 1704540 [[{'first_name': 'Mahendra M', 'last_name': 'R... 2020 1 pub.1124099280 article 13 jour.1041075 Global Health Action
4 Long-term trends in seasonality of mortality i... 1717411 [[{'first_name': 'Benjamin-Samuel', 'last_name... 2020 1 pub.1124649186 article 13 jour.1041075 Global Health Action
5 Eine Warnung an alle, dy sych etwaz duncken: D... 167-190 [[{'first_name': 'Ulla', 'last_name': 'William... 2020 NaN pub.1125632729 chapter NaN NaN NaN
6 Marienklagen und Pietà 241-276 [[{'first_name': 'Georg', 'last_name': 'Satzin... 2020 NaN pub.1125635978 chapter NaN NaN NaN
7 Johannes Taulers Via negationis 76-93 [[{'first_name': 'Walter', 'last_name': 'Haug'... 2020 NaN pub.1125632704 chapter NaN NaN NaN
8 Die editorische Einheit ,Textstufe' 177-194 [[{'first_name': 'Hermann', 'last_name': 'Zwer... 2020 NaN pub.1125636152 chapter NaN NaN NaN
9 ad Iliadis librum Ζ 123-221 NaN 2020 NaN pub.1125636759 chapter NaN NaN NaN

If paging information is not provided, the default values limit 20 skip 0 are used, so a return phrase with no paging keywords is equivalent to the same phrase ending in limit 20 skip 0.

Combining limit and skip across multiple queries enables paging or batching of results; e.g. to retrieve 30 grant records divided into 3 pages of 10 records each, the following three queries could be used:

return grants limit 10           => get 1st 10 records for page 1 (skip 0, by default)
return grants limit 10 skip 10   => get next 10 for page 2; skip the 10 we already have
return grants limit 10 skip 20   => get another 10 for page 3, for a total of 30
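
The three paged queries above can also be generated programmatically. This is a minimal sketch that only builds the query strings and does not contact the API. The function name `paged_queries` and the `search grants` base query are assumptions made for illustration.

```python
def paged_queries(base_query, return_phrase, total, page_size):
    """Yield one DSL query string per page, using limit and skip."""
    for skip in range(0, total, page_size):
        # the last page may be shorter than page_size
        limit = min(page_size, total - skip)
        yield f"{base_query} {return_phrase} limit {limit} skip {skip}"


queries = list(
    paged_queries("search grants", "return grants", total=30, page_size=10)
)
for query in queries:
    print(query)
```

Each string can be passed to `dsl.query` in turn. Dimcli also provides a `query_iterative` method that handles this looping, which may be the more convenient option in practice.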

5.6 Sorting Results

A sort order for the results in a given return phrase can be specified with the keyword sort by followed by the name of a field (when a source is being requested) or of an indicator (aggregation) (when one or more facets are being requested).

By default, the result set of full-text queries (search ... for "full text query") is sorted by “relevance”. Additionally, it is possible to specify the sort order using the asc or desc keywords. By default, descending order is selected.

[56]:
%%dsldf
search grants
    for "nanomaterials"
return grants sort by title desc limit 5
Returned Grants: 5 (total = 18268)
Time: 0.51s
[56]:
start_date language id original_title title_language title active_year start_year funding_org_name project_num funders end_date
0 2012-01-01 de grant.4823271 Transmissionselektronenmikroskop en Transmissionselektronenmikroskop [2012] 2012 German Research Foundation 220923099 [{'id': 'grid.424150.6', 'types': ['Facility']... NaN
1 2015-01-01 en grant.4841519 Transmissionselektronenmikroskop en Transmissionselektronenmikroskop [2015] 2015 German Research Foundation 280331443 [{'id': 'grid.424150.6', 'types': ['Facility']... NaN
2 2011-06-16 en grant.6774902 Snowcontrol. en Snowcontrol. [2011, 2012, 2013, 2014, 2015] 2011 Belgian Federal Science Policy Office 3E120109 [{'id': 'grid.425119.a', 'types': ['Government... 2015-06-13
3 2014-01-01 de grant.4834305 Röntgenquelle en Röntgenquelle [2014] 2014 German Research Foundation 245513494 [{'id': 'grid.424150.6', 'types': ['Facility']... NaN
4 2015-01-01 de grant.4839883 Röntgendiffraktometer en Röntgendiffraktometer [2015] 2015 German Research Foundation 279250642 [{'id': 'grid.424150.6', 'types': ['Facility']... NaN
[57]:
%%dsldf
search grants
    for "nanomaterials"
return grants  sort by relevance desc limit 5
Returned Grants: 5 (total = 18268)
Time: 0.45s
[57]:
start_date language id original_title title_language title active_year start_year funding_org_name end_date project_num funders
0 2012-06-01 en grant.3984032 Optically-active chiral nanomaterials en Optically-active chiral nanomaterials [2012, 2013] 2012 Science Foundation Ireland 2013-05-31 11/W.1/I2065 [{'id': 'grid.437854.9', 'types': ['Nonprofit'...
1 2000-09-01 en grant.3526883 NOVEL LANTHANIDE LUMINESCENT SYSTEMS: FROM SUP... en NOVEL LANTHANIDE LUMINESCENT SYSTEMS: FROM SUP... [2000, 2001, 2002, 2003] 2000 Foundation for Science and Technology 2003-12-31 35378 [{'id': 'grid.22919.31', 'types': ['Nonprofit'...
2 2003-03-01 en grant.3531153 Transport properties and electrochemical appli... en Transport properties and electrochemical appli... [2003, 2004, 2005, 2006] 2003 Foundation for Science and Technology 2006-08-31 39381 [{'id': 'grid.22919.31', 'types': ['Nonprofit'...
3 2014-04-01 en grant.4167216 Polymer Nanomaterials en Polymer Nanomaterials [2014, 2015] 2014 Natural Sciences and Engineering Research Council 2015-03-31 557300 [{'id': 'grid.452912.9', 'types': ['Government...
4 2012-01-01 en grant.4849153 Novel biocomposite nanomaterials en Novel biocomposite nanomaterials [2012, 2013, 2014, 2015] 2012 Israel Science Foundation 2015-12-31 25813 [{'id': 'grid.425339.a', 'types': ['Nonprofit'...

Number of citations per publication

[58]:
%%dsldf
search publications
return publications  [doi + times_cited]
    sort by times_cited limit 5
Returned Publications: 5 (total = 112275334)
Time: 1.70s
[58]:
times_cited doi
0 231730 NaN
1 197598 10.1038/227680a0
2 180841 10.1016/0003-2697(76)90527-3
3 91278 10.1006/meth.2001.1262
4 85717 10.1103/physrevlett.77.3865

Recent citations per publication. Note: recent citations are the citations accrued in the last two-year period. A single value is stored per document, and the year window rolls over in July.

[59]:
%%dsldf
search publications
return publications [doi + recent_citations]
    sort by recent_citations limit 5
Returned Publications: 5 (total = 112275334)
Time: 1.24s
[59]:
recent_citations doi
0 33085 10.1006/meth.2001.1262
1 25320 10.1109/cvpr.2016.90
2 24834 10.1103/physrevlett.77.3865
3 24068 10.1176/appi.books.9780890425596
4 23012 10.1191/1478088706qp063oa

When a facet is being returned, the indicator used in the sort phrase must either be count (the default, such that sort by count is unnecessary), or one of the indicators specified in the aggregate phrase, i.e. one whose values are being computed in the faceting operation.

[60]:
%%dsldf
search publications
    for "nanomaterials"
return research_orgs
    aggregate altmetric_median, rcr_avg sort by rcr_avg limit 5
Returned Research_orgs: 5
Time: 3.12s
[60]:
id count rcr_avg altmetric_median types name latitude longitude linkout city_name country_name acronym state_name
0 grid.11444.34 1 207.399994 345.0 [Facility] Shanghai Institute of Hypertension 31.211678 121.467255 [http://www.china-sih.com/] Shanghai China NaN NaN
1 grid.11485.39 1 207.399994 345.0 [Nonprofit] Cancer Research UK 51.531322 -0.106269 [http://www.cancerresearchuk.org/] London United Kingdom CRUK NaN
2 grid.11642.30 1 207.399994 345.0 [Education] University of La Réunion -20.901735 55.484550 [http://www.univ-reunion.fr/university-of-reun... Saint-Denis Reunion NaN NaN
3 grid.120073.7 1 207.399994 345.0 [Healthcare] Addenbrooke's Hospital 52.176000 0.140000 [http://www.cuh.org.uk/addenbrookes-hospital] Cambridge United Kingdom NaN Cambridgeshire
4 grid.20931.39 1 207.399994 345.0 [Education] Royal Veterinary College 51.536800 -0.134000 [http://www.rvc.ac.uk/] London United Kingdom RVC NaN
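
The rule for sorting facets (sort by count, or by an indicator listed in the aggregate phrase) can be checked before a query is sent. The sketch below builds a facet return phrase and rejects any other sort indicator. The helper name `facet_return_phrase` and its parameters are hypothetical, not part of Dimcli or the DSL.

```python
def facet_return_phrase(facet, aggregates=(), sort_by="count", order=None, limit=None):
    """Build a facet return phrase, rejecting sort indicators that are not computed."""
    # count is always available; other indicators must appear in the aggregate list
    allowed = {"count", *aggregates}
    if sort_by not in allowed:
        raise ValueError(
            f"cannot sort by {sort_by!r}; expected one of {sorted(allowed)}"
        )
    if order not in (None, "asc", "desc"):
        raise ValueError("order must be 'asc', 'desc' or None")

    parts = [f"return {facet}"]
    if aggregates:
        parts.append("aggregate " + ", ".join(aggregates))
    # sorting by count in the default order needs no explicit clause
    if sort_by != "count" or order is not None:
        parts.append(f"sort by {sort_by}" + (f" {order}" if order else ""))
    if limit is not None:
        parts.append(f"limit {limit}")
    return " ".join(parts)


phrase = facet_return_phrase(
    "research_orgs",
    aggregates=("altmetric_median", "rcr_avg"),
    sort_by="rcr_avg",
    limit=5,
)
print(phrase)
```

The printed phrase matches the return phrase of the query above. Asking for `sort_by="citations_total"` with the same aggregates would raise a `ValueError`, because that indicator is not being computed.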

6. Aggregations

In a return phrase requesting one or more facet results, aggregation operations to perform during faceting can be specified after the facet name(s) by using the keyword aggregate followed by a comma-separated list of one or more indicator names corresponding to the source being searched.

[61]:
%%dsldf
search publications
    where year > 2010
return research_orgs
    aggregate rcr_avg, altmetric_median limit 5
Returned Research_orgs: 5
Time: 14.61s
[61]:
id count rcr_avg altmetric_median name latitude state_name types linkout country_name longitude city_name acronym
0 grid.17063.33 146656 1.701046 4.000000 University of Toronto 43.661667 Ontario [Education] [http://www.utoronto.ca/] Canada -79.395000 Toronto NaN
1 grid.38142.3c 144250 2.230168 5.132735 Harvard University 42.377052 Massachusetts [Education] [http://www.harvard.edu/] United States -71.116650 Cambridge NaN
2 grid.11899.38 138910 1.050863 2.000000 University of São Paulo -23.563051 NaN [Education] [http://www5.usp.br/en/] Brazil -46.730103 São Paulo USP
3 grid.83440.3b 126466 1.914593 4.000000 University College London 51.524470 NaN [Education] [http://www.ucl.ac.uk/] United Kingdom -0.133982 London UCL
4 grid.26999.3d 122350 1.185757 2.000000 University of Tokyo 35.713333 NaN [Education] [http://www.u-tokyo.ac.jp/en/] Japan 139.762220 Tokyo UT

What are the metrics/aggregations available? See the data sources documentation for information about available indicators.

Alternatively, we can use the ‘schema’ API (describe) to return this information programmatically:

[62]:
schema = dsl.query("describe schema")
# for each source, print the name and description of every available metric
for source_name, source in schema['sources'].items():
    print("SOURCE:", source_name)
    for metric in source['metrics'].values():
        print("--", metric['name'], " => ", metric['description'])
SOURCE: publications
-- count  =>  Total count
-- altmetric_median  =>  Median Altmetric attention score
-- altmetric_avg  =>  Altmetric attention score mean
-- citations_total  =>  Aggregated number of citations
-- citations_avg  =>  Arithmetic mean of citations
-- citations_median  =>  Median of citations
-- recent_citations_total  =>  For a given article, in a given year, the number of citations accrued in the last two year period. Single value stored per document, year window rolls over in July.
-- rcr_avg  =>  Arithmetic mean of `relative_citation_ratio` field.
-- fcr_gavg  =>  Geometric mean of `field_citation_ratio` field (note: This field cannot be used for sorting results).
SOURCE: grants
-- count  =>  Total count
-- funding  =>  Total funding amount, in USD.
SOURCE: patents
-- count  =>  Total count
SOURCE: clinical_trials
-- count  =>  Total count
SOURCE: policy_documents
-- count  =>  Total count
SOURCE: researchers
-- count  =>  Total count
SOURCE: organizations
-- count  =>  Total count
SOURCE: datasets
-- count  =>  Total count
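
The same schema information can be collected into a dictionary instead of being printed, which makes it easy to check whether an indicator exists for a source. This is a minimal sketch that runs offline on a cut-down stand-in for the schema. The nesting mirrors the lookups used in the cell above, the inner dictionary keys are an assumption, and the function name `metrics_by_source` is hypothetical.

```python
def metrics_by_source(schema):
    """Return {source_name: [indicator names]} from a 'describe schema' style dict."""
    return {
        name: [metric["name"] for metric in source["metrics"].values()]
        for name, source in schema["sources"].items()
    }


# Cut-down stand-in for the schema, holding only the keys read above.
# Names and descriptions are copied from the printed output; the keys of
# each "metrics" dictionary are assumed to equal the metric names.
sample_schema = {
    "sources": {
        "grants": {
            "metrics": {
                "count": {"name": "count", "description": "Total count"},
                "funding": {
                    "name": "funding",
                    "description": "Total funding amount, in USD.",
                },
            }
        },
        "patents": {
            "metrics": {
                "count": {"name": "count", "description": "Total count"},
            }
        },
    }
}

available = metrics_by_source(sample_schema)
print(available)
print("funding" in available["grants"])
```

With a live connection, the result of the describe schema query could be passed in place of `sample_schema`, provided it exposes the same nested keys.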

NOTE In addition to any specified aggregations, count is always computed and reported when facet results are requested.

[63]:
%%dsldf
search grants
    for "5g network"
return funders
    aggregate count, funding sort by funding limit 5
Returned Funders: 5
Time: 0.47s
[63]:
id count funding types name latitude longitude linkout city_name country_name acronym state_name
0 grid.270680.b 194 923867691.0 [Government] European Commission 50.851650 4.363670 [http://ec.europa.eu/index_en.htm] Brussels Belgium EC NaN
1 grid.421091.f 69 53295321.0 [Government] Engineering and Physical Sciences Research Cou... 51.567093 -1.784602 [https://www.epsrc.ac.uk/] Swindon United Kingdom EPSRC England
2 grid.457785.c 113 51989327.0 [Government] Directorate for Computer & Information Science... 38.880580 -77.111000 [http://www.nsf.gov/dir/index.jsp?org=CISE] Arlington United States NSF CISE Virginia
3 grid.55047.33 8 50109038.0 [Government] National Centre for Research and Development 52.227455 21.007630 [http://www.ncbr.gov.pl/en/] Warsaw Poland NCRD NaN
4 grid.453115.7 33 29462562.0 [Government] Innovation and Technology Commission 22.282640 114.166580 [http://www.itc.gov.hk/en/about/org.htm] Hong Kong China ITC NaN

Aggregated total number of citations

[64]:
%%dsldf
search publications
    for "ontologies"
return funders
    aggregate citations_total
    sort by citations_total  limit 5
Returned Funders: 5
Time: 1.18s
[64]:
id count citations_total types name latitude longitude linkout city_name country_name state_name acronym
0 grid.48336.3a 13207 864977.0 [Government] National Cancer Institute 39.004326 -77.101190 [http://www.cancer.gov/] Rockville United States Maryland NCI
1 grid.280785.0 12900 830574.0 [Facility] National Institute of General Medical Sciences 38.997833 -77.099380 [http://www.nigms.nih.gov/Pages/default.aspx] Bethesda United States Maryland NIGMS
2 grid.280128.1 4857 608945.0 [Facility] National Human Genome Research Institute 38.996967 -77.096930 [https://www.genome.gov/] Bethesda United States Maryland NHGRI
3 grid.270680.b 19178 588854.0 [Government] European Commission 50.851650 4.363670 [http://ec.europa.eu/index_en.htm] Brussels Belgium NaN EC
4 grid.52788.30 5530 447416.0 [Nonprofit] Wellcome Trust 51.525867 -0.135005 [http://www.wellcome.ac.uk/] London United Kingdom NaN WT

Arithmetic mean number of citations

[65]:
%%dsldf
search publications
return funders
    aggregate citations_avg
    sort by citations_avg limit 5
Returned Funders: 5
Time: 2.17s
[65]:
id count citations_avg name latitude state_name types linkout country_name longitude city_name acronym
0 grid.478308.0 185 260.870270 Alexander & Margaret Stewart Trust 38.901160 District of Columbia [Nonprofit] [http://www.stewart-trust.org/] United States -77.039730 Washington D.C. NaN
1 grid.453780.d 144 190.722222 Accelerate Brain Cancer Cure 38.906720 District of Columbia [Nonprofit] [http://www.abc2.org/] United States -77.039520 Washington D.C. NaN
2 grid.478789.d 586 168.203072 Donald W. Reynolds Foundation 36.190460 Nevada [Other] [http://www.dwreynolds.org/] United States -115.299850 Las Vegas NaN
3 grid.417710.4 182 164.719780 Human Genome Sciences (United States) 39.096650 Maryland [Company] [http://www.hgsi.com] United States -77.203760 Rockville NaN
4 grid.484432.d 1 150.000000 Macmillan Cancer Support 51.488003 NaN [Nonprofit] [https://www.macmillan.org.uk/] United Kingdom -0.123164 London Macmillan Cancer Support

Geometric mean of FCR

[66]:
%%dsldf
search publications
return funders
    aggregate fcr_gavg limit 5
Returned Funders: 5
Time: 3.48s
[66]:
id fcr_gavg count types name latitude longitude linkout city_name country_name acronym state_name
0 grid.419696.5 2.337550 2048348 [Government] National Natural Science Foundation of China 40.005177 116.339830 [http://www.nsfc.gov.cn/publish/portal1/] Beijing China NSFC NaN
1 grid.270680.b 3.310395 706071 [Government] European Commission 50.851650 4.363670 [http://ec.europa.eu/index_en.htm] Brussels Belgium EC NaN
2 grid.424020.0 2.555937 641515 [Government] Ministry of Science and Technology of the Peop... 39.827835 116.316284 [http://www.most.gov.cn/eng/] Beijing China MOST NaN
3 grid.48336.3a 4.933944 598242 [Government] National Cancer Institute 39.004326 -77.101190 [http://www.cancer.gov/] Rockville United States NCI Maryland
4 grid.54432.34 2.288957 587177 [Nonprofit] Japan Society for the Promotion of Science 35.687160 139.740390 [http://www.jsps.go.jp/] Tokyo Japan JSPS NaN

Median Altmetric Attention Score

[67]:
%%dsldf
search publications
return funders aggregate altmetric_median
    sort by altmetric_median limit 5
Returned Funders: 5
Time: 7.19s
[67]:
id count altmetric_median city_name types name country_name linkout latitude acronym longitude state_name
0 grid.258806.1 8 150.5 Kitakyushu [Education] Kyushu Institute of Technology Japan [https://www.kyutech.ac.jp/english/] 33.894436 KIT 130.839200 NaN
1 grid.470711.4 2 108.5 Edinburgh [Nonprofit] Chest Heart and Stroke Scotland United Kingdom [http://www.chss.org.uk/] 55.946075 CHSS -3.219597 NaN
2 grid.443873.f 5 96.0 Chicago [Nonprofit] LUNGevity Foundation United States [http://www.lungevity.org/] 41.878674 LUNG -87.626480 Illinois
3 grid.473856.b 2 66.0 Washington D.C. [Government] Administration for Children and Families United States [https://www.acf.hhs.gov/] 38.885940 ACF -77.016370 District of Columbia
4 grid.419979.b 2 44.0 Philadelphia [Healthcare] Einstein Healthcare Network United States [http://www.einstein.edu/] 40.036827 AEHN -75.143140 Pennsylvania


Note

The Dimensions Analytics API lets you carry out sophisticated research data analytics tasks like the ones described on this website. Also check out the associated GitHub repository for examples, the source code of these tutorials and much more.
