Expert Identification with the Dimensions API - An Introduction¶
This notebook shows how to use the new expert identification feature of the Dimensions Analytics API.
Prerequisites¶
This notebook assumes you have installed the Dimcli library and are familiar with the Getting Started tutorial.
[1]:
!pip install dimcli --quiet
import dimcli
from dimcli.utils import *
import json
import sys
import pandas as pd
print("==\nLogging in..")
# https://digital-science.github.io/dimcli/getting-started.html#authentication
ENDPOINT = "https://app.dimensions.ai"
if 'google.colab' in sys.modules:
    import getpass
    KEY = getpass.getpass(prompt='API Key: ')
    dimcli.login(key=KEY, endpoint=ENDPOINT)
else:
    KEY = ""
    dimcli.login(key=KEY, endpoint=ENDPOINT)
dsl = dimcli.Dsl()
==
Logging in..
Dimcli - Dimensions API Client (v0.8.2)
Connected to: https://app.dimensions.ai - DSL v1.28
Method: dsl.ini file
At a glance¶
At its simplest, an expert search query looks like this:
[6]:
%%dsl
identify experts from concepts "malaria OR \"effective malaria vaccine\" OR \"effective prevention\""
using publications
where year >= 2015
return experts[basics]
[6]:
<dimcli.DslDataset object #4405912048. Dict keys: '_stats', '_version', '_copyright', 'experts'>
The query takes a list of concepts defining the expertise you’re looking for, plus other parameters defining the pool of publications to be used, and it returns a list of researchers sorted by relevance.
[8]:
pd.DataFrame(dsl_last_results['experts'])
[8]:
| | id | score | orcid_id | first_name | last_name | research_orgs | docs_found |
---|---|---|---|---|---|---|---|
0 | ur.01332073522.49 | 4.307605 | [0000-0002-3396-1700] | Nicholas John | White | [grid.417815.e, grid.22072.35, grid.5335.0, gr... | 7 |
1 | ur.01303637137.59 | 3.853188 | [0000-0001-8300-9593] | Miriam K | Laufer | [grid.8271.c, grid.10595.38, grid.420069.9, gr... | 6 |
2 | ur.01314633455.19 | 3.788729 | NaN | Ritabrata | Kundu | [grid.414710.7] | 6 |
3 | ur.01355076624.38 | 3.788729 | NaN | Jaydeep Choudhury | Choudhury | [grid.414710.7] | 6 |
4 | ur.01333507624.36 | 3.360286 | NaN | Christopher Vine | Plowe | [grid.15653.34, grid.4305.2, grid.94365.3d, gr... | 3 |
5 | ur.07764267264.89 | 3.211528 | [0000-0002-7951-0745] | Francois Henri | Nosten | [grid.11586.3b, grid.4367.6, grid.412433.3, gr... | 4 |
6 | ur.01323510115.98 | 3.214206 | NaN | Danielle I | Stanisic | [grid.1008.9, grid.1042.7, grid.1049.c, grid.4... | 3 |
7 | ur.0752141120.95 | 3.214206 | NaN | Michael Francis | Good | [grid.1048.d, grid.417993.1, grid.1003.2, grid... | 3 |
8 | ur.015476113652.05 | 3.180696 | [0000-0001-5725-9118] | Brian Mellor | Greenwood | [grid.8348.7, grid.10025.36, grid.415375.1, gr... | 5 |
9 | ur.016122312437.59 | 3.172880 | NaN | Ogobara K | Doumbo | [grid.8191.1, grid.8982.b, grid.10548.38, grid... | 3 |
10 | ur.01225135650.70 | 2.805810 | [0000-0002-1018-7898] | James G | Beeson | [grid.1013.3, grid.1056.2, grid.1042.7, grid.1... | 3 |
11 | ur.01162445502.98 | 2.653315 | NaN | Martha | Sedegah | [grid.428999.7, grid.201075.1, grid.290496.0, ... | 3 |
12 | ur.01165702423.17 | 2.653315 | NaN | Michael R | Hollingdale | [grid.417587.8, grid.265436.0, grid.8991.9, gr... | 3 |
13 | ur.0703623237.41 | 2.653315 | NaN | Eileen D | Villasante | [grid.415913.b] | 3 |
14 | ur.01240215027.61 | 2.616639 | NaN | Rose M | Mcgready | [grid.1005.4, grid.462844.8, grid.4991.5, grid... | 3 |
15 | ur.01204711510.82 | 2.598776 | [0000-0001-9773-2192] | Alfonso Javier | Rodriguez-Morales | [grid.419226.a, grid.441853.f, grid.8171.f, gr... | 4 |
16 | ur.0667763776.52 | 2.515726 | NaN | Ashley Michael | Vaughan | [grid.28046.38, grid.413019.e, grid.53964.3d, ... | 2 |
17 | ur.01022543462.48 | 2.427615 | [0000-0002-0607-6941] | Paul | Garner | [grid.417153.5, grid.7445.2, grid.48004.38, gr... | 4 |
18 | ur.01270143765.64 | 2.422990 | [0000-0003-4566-4030] | Joel | Tarning | [grid.8761.8, grid.501272.3, grid.413674.3, gr... | 4 |
19 | ur.0751102271.80 | 2.368244 | [0000-0002-9415-1357] | Simon J | Draper | [grid.425090.a, grid.4991.5, grid.10253.35, gr... | 3 |
Often though, we start from some text and want to find experts relevant to that text (as opposed to starting from concepts).
The expert identification workflow, in such a case, consists of two steps:
1. Concept extraction from text
2. Expert identification using concepts
In the first step, the user extracts concepts from an abstract. The user can review and modify the list of extracted concepts and then feed it into the actual expert identification query. In the following sections we will go through these steps in detail.
Step 1: Concept Extraction¶
Extracting concepts is implemented using the extract_concepts DSL function. This is the syntax:
extract_concepts("publication abstract")
This query will return a list of extracted concepts, ordered by weight, in descending order. For example:
[2]:
abstract = """We describe monocrystalline graphitic films, which are a few atoms thick but are nonetheless stable under ambient conditions,
metallic, and of remarkably high quality. The films are found to be a two-dimensional semimetal with a tiny overlap between
valence and conductance bands, and they exhibit a strong ambipolar electric field effect such that electrons and
holes in concentrations up to 10 per square centimeter and with room-temperature mobilities of approximately 10,000 square
centimeters per volt-second can be induced by applying gate voltage.
"""
abstract = abstract.replace("\n", " ")
res = dsl.query(f"""extract_concepts("{abstract}")""")
CONCEPTS = res['extracted_concepts']
pd.DataFrame(CONCEPTS)
[2]:
| | 0 |
---|---|
0 | ambipolar electric field effect |
1 | two-dimensional semimetal |
2 | room-temperature mobility |
3 | electric field effects |
4 | field effects |
5 | graphitic films |
6 | gate voltage |
7 | conductance band |
8 | square centimeter |
9 | films |
10 | electrons |
11 | semimetals |
12 | ambient conditions |
13 | atoms |
14 | holes |
15 | centimeters |
16 | metallic |
17 | voltage |
18 | band |
19 | high quality |
20 | valence |
21 | mobility |
22 | overlap |
23 | effect |
24 | conditions |
25 | concentration |
26 | quality |
27 | monocrystalline graphitic films |
28 | tiny overlap |
29 | strong ambipolar electric field effect |
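Before moving on to step 2, the extracted list can be reviewed and trimmed. The following is a minimal sketch of one possible review pass, not part of Dimcli: `review_concepts` and its parameters are hypothetical names, and the sample list is copied from the first ten rows of the output above. It drops case-insensitive duplicates, an explicit list of unwanted concepts and, optionally, concepts with too few words (single-word concepts such as "films" tend to be generic).

```python
def review_concepts(concepts, min_words=1, drop=()):
    """Return concepts in their original (weight) order, minus unwanted ones.

    Hypothetical helper for illustration; not a Dimcli function.
    """
    dropped = {d.lower() for d in drop}
    seen = set()
    kept = []
    for concept in concepts:
        key = concept.strip().lower()
        # Skip empty strings, duplicates and explicitly dropped concepts
        if not key or key in seen or key in dropped:
            continue
        # Skip concepts with fewer words than requested
        if len(key.split()) < min_words:
            continue
        seen.add(key)
        kept.append(concept.strip())
    return kept


# First ten concepts from the extraction output above
sample = [
    "ambipolar electric field effect", "two-dimensional semimetal",
    "room-temperature mobility", "electric field effects", "field effects",
    "graphitic films", "gate voltage", "conductance band",
    "square centimeter", "films",
]

reviewed = review_concepts(sample, min_words=2, drop=["square centimeter"])
print(reviewed)
```

The reviewed list keeps the original ordering, so slicing it (for example `reviewed[:5]`) still selects the highest-weighted concepts.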
Step 2: Expert Identification¶
The concepts extracted in step one can be used in an identify experts query, for example:
identify experts from concepts "+malaria OR \"effective malaria vaccine\" OR \"effective prevention\""
using publications
where research_org_countries is not empty
and year >= 2013
return experts[basics]
limit 20 skip 0
annotate organizational, coauthorship overlap
with ["ur.016204724721.35", "ur.012127355561.32"]
Returned experts are ordered by their relevance.
A few important things to remember:
- Sources. Expert identification can use either publications or grants (when not specified, publications are used).
- Default connector is AND. When multiple concepts are provided, they are automatically combined into an AND query. To match any of the concepts, add OR connectors explicitly.
- Where conditions. It is possible to specify where-filters, but they are not required. The fields available for filtering are exactly the same as the ones in standard search expressions.
- Pagination. Similarly, the paging-phrase is optional. By default, the top 20 experts are returned; using limit/skip it is possible to retrieve up to a maximum of 200.
- Overlap annotations. Annotating results with organizational and/or coauthorship overlap produces an additional JSON object for each identified expert. This object has two parts:
  - The organizational overlap is a boolean value that is true if the expert and the researchers from the query have the same current research organization.
  - The coauthorship conflict is the number of documents the expert has coauthored with any of the researchers provided in the query in the last three years.
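The points above can be put together in a small query builder. This is a sketch under stated assumptions rather than a Dimcli feature: `escape_quotes`, `build_expert_query` and `page_windows` are hypothetical helpers, `escape_quotes` only mimics the quote escaping that the notebook delegates to `dsl_escape`, and the 200 cap is read here as the total number of experts reachable through paging.

```python
def escape_quotes(text):
    # Stand-in for the double-quote escaping the notebook gets from dimcli's dsl_escape
    return text.replace('"', '\\"')


def build_expert_query(concepts, connector="AND", source="publications",
                       where=None, limit=20, skip=0):
    """Assemble an `identify experts` query string from a list of concepts."""
    if connector not in ("AND", "OR"):
        raise ValueError("connector must be 'AND' or 'OR'")
    if source not in ("publications", "grants"):
        raise ValueError("source must be 'publications' or 'grants'")
    quoted = ['"%s"' % c for c in concepts]
    # AND is the default connector, so a plain space is enough; OR must be explicit
    joined = (" OR " if connector == "OR" else " ").join(quoted)
    lines = [
        'identify experts from concepts "%s"' % escape_quotes(joined),
        "using %s" % source,
    ]
    if where:
        lines.append("where " + " and ".join(where))
    lines.append("return experts[basics]")
    lines.append("limit %d skip %d" % (limit, skip))
    return "\n".join(lines)


def page_windows(page_size=20, max_experts=200):
    """Yield (limit, skip) pairs covering at most `max_experts` results."""
    for skip in range(0, max_experts, page_size):
        yield min(page_size, max_experts - skip), skip


q = build_expert_query(["malaria", "effective prevention"], connector="OR",
                       where=["year >= 2015"])
print(q)
print(list(page_windows(page_size=50)))
```

The resulting string could then be passed to `dsl.query(q)` as in the examples that follow; building it separately makes it easy to print and inspect the query before sending it.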
Example 1. Basic query using concepts¶
[3]:
# take the top 15 concepts
some_concepts = " ".join(['"%s"' % x for x in CONCEPTS[:15]])
q = f"""
identify experts
from concepts "{dsl_escape(some_concepts)}"
return experts
"""
print("Query:\n======", q)
dsl.query(q).as_dataframe()
Query:
======
identify experts
from concepts "\"ambipolar electric field effect\" \"two-dimensional semimetal\" \"room-temperature mobility\" \"electric field effects\" \"field effects\" \"graphitic films\" \"gate voltage\" \"conductance band\" \"square centimeter\" \"films\" \"electrons\" \"semimetals\" \"ambient conditions\" \"atoms\" \"holes\""
return experts
[3]:
| | id | score | research_orgs | last_name | first_name | docs_found | orcid_id |
---|---|---|---|---|---|---|---|
0 | ur.011033016243.08 | 7.87576 | [grid.4886.2, grid.424048.e, grid.425037.7, gr... | Firsov | Anatoly A | 1 | NaN |
1 | ur.01146544531.57 | 7.87576 | [grid.5379.8] | Jiang | Da | 1 | NaN |
2 | ur.011535264111.51 | 7.87576 | [grid.4886.2, grid.5379.8, grid.5254.6, grid.5... | Dubonos | Sergey V | 1 | NaN |
3 | ur.01207120103.29 | 7.87576 | [grid.5379.8, grid.425037.7, grid.116068.8, gr... | Novoselov | Konstantin Sergeevich | 1 | [0000-0003-4972-5371] |
4 | ur.0657076451.24 | 7.87576 | [grid.8547.e, grid.5386.8, grid.184769.5, grid... | Zhang | Yuanbo | 1 | NaN |
5 | ur.0721730631.45 | 7.87576 | [grid.7340.0, grid.5254.6, grid.418975.6, grid... | Geim | Andre Konstantin | 1 | [0000-0003-2861-8331] |
6 | ur.07423561367.62 | 7.87576 | [grid.4886.2, grid.425081.a, grid.28171.3d, gr... | Morozov | Sergey V | 1 | [0000-0003-3075-7787] |
7 | ur.0767105504.29 | 7.87576 | [grid.4886.2, grid.7340.0, grid.5337.2, grid.4... | Grigorieva | Irina V | 1 | [0000-0001-5991-7778] |
Example 2. Query with OR connectors¶
Note: this time we return all expert fields by using the syntax experts[all].
[4]:
some_concepts = " OR ".join(['"%s"' % x for x in CONCEPTS[:15]])
q = f"""
identify experts
from concepts "{dsl_escape(some_concepts)}"
return experts[all]
"""
print("Query:\n======", q)
dsl.query(q).as_dataframe()
Query:
======
identify experts
from concepts "\"ambipolar electric field effect\" OR \"two-dimensional semimetal\" OR \"room-temperature mobility\" OR \"electric field effects\" OR \"field effects\" OR \"graphitic films\" OR \"gate voltage\" OR \"conductance band\" OR \"square centimeter\" OR \"films\" OR \"electrons\" OR \"semimetals\" OR \"ambient conditions\" OR \"atoms\" OR \"holes\""
return experts[all]
[4]:
| | id | score | research_orgs | orcid_id | total_grants | last_grant_year | obsolete | last_name | total_publications | first_publication_year | last_publication_year | current_research_org | first_name | first_grant_year | docs_found |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | ur.01207120103.29 | 8.406035 | [grid.5379.8, grid.425037.7, grid.116068.8, gr... | [0000-0003-4972-5371] | 11 | 2023.0 | 0 | Novoselov | 590 | 1997 | 2020 | grid.5379.8 | Konstantin Sergeevich | 2006.0 | 3 |
1 | ur.0721730631.45 | 8.406035 | [grid.7340.0, grid.5254.6, grid.418975.6, grid... | [0000-0003-2861-8331] | 10 | 2024.0 | 0 | Geim | 582 | 1991 | 2020 | grid.5379.8 | Andre Konstantin | 2006.0 | 3 |
2 | ur.07423561367.62 | 8.406035 | [grid.4886.2, grid.425081.a, grid.28171.3d, gr... | [0000-0003-3075-7787] | 6 | 2021.0 | 0 | Morozov | 269 | 1990 | 2020 | grid.425081.a | Sergey V | 2013.0 | 3 |
3 | ur.0657076451.24 | 8.355439 | [grid.8547.e, grid.5386.8, grid.184769.5, grid... | NaN | 0 | NaN | 0 | Zhang | 74 | 2004 | 2019 | grid.8547.e | Yuanbo | NaN | 2 |
4 | ur.011033016243.08 | 8.081146 | [grid.4886.2, grid.424048.e, grid.425037.7, gr... | NaN | 0 | NaN | 0 | Firsov | 25 | 2003 | 2018 | grid.424048.e | Anatoly A | NaN | 2 |
5 | ur.01146544531.57 | 8.081146 | [grid.5379.8] | NaN | 0 | NaN | 0 | Jiang | 11 | 2004 | 2008 | grid.5379.8 | Da | NaN | 2 |
6 | ur.011535264111.51 | 7.875760 | [grid.4886.2, grid.5379.8, grid.5254.6, grid.5... | NaN | 0 | NaN | 0 | Dubonos | 81 | 1990 | 2009 | grid.425037.7 | Sergey V | NaN | 1 |
7 | ur.0767105504.29 | 7.875760 | [grid.4886.2, grid.7340.0, grid.5337.2, grid.4... | [0000-0001-5991-7778] | 4 | 2021.0 | 0 | Grigorieva | 158 | 1989 | 2020 | grid.5379.8 | Irina V | 2007.0 | 1 |
8 | ur.011513332561.53 | 5.697777 | [grid.39158.36, grid.260539.b, grid.69566.3a] | NaN | 23 | 2011.0 | 0 | Ohta | 208 | 1976 | 2020 | grid.260539.b | Nobuhiro | 1987.0 | 21 |
9 | ur.01055006635.53 | 3.264948 | [grid.450314.7, grid.4605.7, grid.7727.5, grid... | NaN | 12 | 2011.0 | 0 | Kvon | 326 | 1983 | 2020 | grid.4605.7 | Ze Don | 1993.0 | 8 |
10 | ur.013312524031.58 | 3.022535 | [grid.443127.7, grid.420030.5, grid.419396.0, ... | NaN | 28 | 2003.0 | 0 | Yamazaki | 258 | 1966 | 2013 | grid.39158.36 | Iwao | 1984.0 | 11 |
11 | ur.01203703171.12 | 2.642304 | [grid.26999.3d, grid.69566.3a, grid.472717.0, ... | [0000-0002-6631-5131] | 11 | 2022.0 | 0 | Chiba | 208 | 2000 | 2020 | grid.136593.b | Daichi | 2009.0 | 9 |
12 | ur.0740560235.48 | 2.564912 | [grid.4886.2, grid.15276.37, grid.11899.38, gr... | NaN | 0 | NaN | 0 | Olshanetsky | 91 | 1989 | 2020 | grid.450314.7 | Eugene | NaN | 6 |
13 | ur.01034030721.03 | 2.516857 | [grid.5338.d, grid.116068.8, grid.21941.3f, gr... | [0000-0001-8217-8213] | 7 | 2023.0 | 0 | Jarillo-Herrero | 233 | 2000 | 2020 | grid.116068.8 | Pablo | 2009.0 | 5 |
14 | ur.01340766601.27 | 1.758987 | [grid.469490.6, grid.432790.b, grid.431860.8, ... | NaN | 6 | 2009.0 | 0 | Venkatesan | 777 | 1975 | 2020 | grid.4280.e | Thirumalai Venky | 1984.0 | 6 |
15 | ur.011775522057.45 | 1.718926 | [grid.448924.7, grid.450314.7, grid.4605.7, gr... | NaN | 0 | NaN | 0 | Mikhailov | 222 | 1995 | 2020 | grid.450314.7 | Nikolay N | NaN | 5 |
16 | ur.01024676171.26 | 1.649508 | [grid.5801.c, grid.481554.9, grid.410387.9, gr... | NaN | 0 | NaN | 0 | Bednorz | 129 | 1976 | 2017 | grid.410387.9 | Johannes Georg | NaN | 4 |
17 | ur.01267137567.67 | 1.649508 | [grid.410387.9, grid.7307.3, grid.10392.39, gr... | [0000-0001-6331-2640] | 7 | 2020.0 | 0 | Mannhart | 358 | 1986 | 2020 | grid.419552.e | Jochen D | 1990.0 | 4 |
18 | ur.07376375471.82 | 1.637159 | [grid.4886.2, grid.4605.7, grid.7727.5, grid.4... | NaN | 2 | 2019.0 | 0 | Kozlov | 52 | 2007 | 2020 | grid.450314.7 | Dmitriy A | 2014.0 | 4 |
19 | ur.07410612715.77 | 1.631930 | [grid.24434.35, grid.39158.36, grid.20515.33, ... | [0000-0002-9982-141X] | 15 | 2008.0 | 0 | Nishimura | 173 | 1987 | 2020 | grid.20515.33 | Yoshinobu | 1989.0 | 5 |
Example 3. Query with where filters¶
[5]:
some_concepts = " ".join(['"%s"' % x for x in CONCEPTS[:10]])
q = f"""identify experts
from concepts "{dsl_escape(some_concepts)}"
using publications
where research_org_countries is not empty
and year >= 2000
and times_cited > 100
return experts
"""
print("Query:\n======", q)
dsl.query(q).as_dataframe()
Query:
====== identify experts
from concepts "\"ambipolar electric field effect\" \"two-dimensional semimetal\" \"room-temperature mobility\" \"electric field effects\" \"field effects\" \"graphitic films\" \"gate voltage\" \"conductance band\" \"square centimeter\" \"films\""
using publications
where research_org_countries is not empty
and year >= 2000
and times_cited > 100
return experts
[5]:
| | id | score | first_name | research_orgs | last_name | docs_found | orcid_id |
---|---|---|---|---|---|---|---|
0 | ur.011033016243.08 | 7.383875 | Anatoly A | [grid.4886.2, grid.424048.e, grid.425037.7, gr... | Firsov | 1 | NaN |
1 | ur.01146544531.57 | 7.383875 | Da | [grid.5379.8] | Jiang | 1 | NaN |
2 | ur.011535264111.51 | 7.383875 | Sergey V | [grid.4886.2, grid.5379.8, grid.5254.6, grid.5... | Dubonos | 1 | NaN |
3 | ur.01207120103.29 | 7.383875 | Konstantin Sergeevich | [grid.5379.8, grid.425037.7, grid.116068.8, gr... | Novoselov | 1 | [0000-0003-4972-5371] |
4 | ur.0657076451.24 | 7.383875 | Yuanbo | [grid.8547.e, grid.5386.8, grid.184769.5, grid... | Zhang | 1 | NaN |
5 | ur.0721730631.45 | 7.383875 | Andre Konstantin | [grid.7340.0, grid.5254.6, grid.418975.6, grid... | Geim | 1 | [0000-0003-2861-8331] |
6 | ur.07423561367.62 | 7.383875 | Sergey V | [grid.4886.2, grid.425081.a, grid.28171.3d, gr... | Morozov | 1 | [0000-0003-3075-7787] |
7 | ur.0767105504.29 | 7.383875 | Irina V | [grid.4886.2, grid.7340.0, grid.5337.2, grid.4... | Grigorieva | 1 | [0000-0001-5991-7778] |
Example 4. Adding Overlap Annotations (e.g. for conflict-of-interest checks)¶
[6]:
overlap_researchers = ["ur.011535264111.51", "ur.011033016243.08", "ur.01207120103.29"]
q = f"""
identify experts
from concepts "{dsl_escape(some_concepts)}"
using publications
where research_org_countries is not empty
and year >= 2000
return experts
annotate coauthorship, organizational overlap
with {json.dumps(overlap_researchers)}
"""
print("Query:\n======", q)
dsl.query(q).as_dataframe()
Query:
======
identify experts
from concepts "\"ambipolar electric field effect\" \"two-dimensional semimetal\" \"room-temperature mobility\" \"electric field effects\" \"field effects\" \"graphitic films\" \"gate voltage\" \"conductance band\" \"square centimeter\" \"films\""
using publications
where research_org_countries is not empty
and year >= 2000
return experts
annotate coauthorship, organizational overlap
with ["ur.011535264111.51", "ur.011033016243.08", "ur.01207120103.29"]
[6]:
| | id | score | first_name | research_orgs | last_name | docs_found | overlap.coauthorship | overlap.organizational | orcid_id |
---|---|---|---|---|---|---|---|---|---|
0 | ur.011033016243.08 | 7.382478 | Anatoly A | [grid.4886.2, grid.424048.e, grid.425037.7, gr... | Firsov | 1 | 3 | True | NaN |
1 | ur.01146544531.57 | 7.382478 | Da | [grid.5379.8] | Jiang | 1 | 0 | True | NaN |
2 | ur.011535264111.51 | 7.382478 | Sergey V | [grid.4886.2, grid.5379.8, grid.5254.6, grid.5... | Dubonos | 1 | 0 | True | NaN |
3 | ur.01207120103.29 | 7.382478 | Konstantin Sergeevich | [grid.5379.8, grid.425037.7, grid.116068.8, gr... | Novoselov | 1 | 153 | True | [0000-0003-4972-5371] |
4 | ur.0657076451.24 | 7.382478 | Yuanbo | [grid.8547.e, grid.5386.8, grid.184769.5, grid... | Zhang | 1 | 0 | False | NaN |
5 | ur.0721730631.45 | 7.382478 | Andre Konstantin | [grid.7340.0, grid.5254.6, grid.418975.6, grid... | Geim | 1 | 38 | True | [0000-0003-2861-8331] |
6 | ur.07423561367.62 | 7.382478 | Sergey V | [grid.4886.2, grid.425081.a, grid.28171.3d, gr... | Morozov | 1 | 6 | False | [0000-0003-3075-7787] |
7 | ur.0767105504.29 | 7.382478 | Irina V | [grid.4886.2, grid.7340.0, grid.5337.2, grid.4... | Grigorieva | 1 | 6 | True | [0000-0001-5991-7778] |
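For a conflict-of-interest check, the annotated results can be screened in plain Python. This is a minimal sketch, assuming each expert record carries a nested `overlap` object with `coauthorship` and `organizational` keys (inferred from the `overlap.coauthorship` and `overlap.organizational` columns above, not confirmed here against the API documentation). `has_conflict` is a hypothetical helper, and the sample records reuse three rows from the table.

```python
def has_conflict(expert, max_coauthored=0):
    """True if the expert shares an organization, or more than
    `max_coauthored` documents, with the researchers given in the query."""
    overlap = expert.get("overlap", {})
    shares_org = bool(overlap.get("organizational"))
    coauthored = overlap.get("coauthorship", 0)
    return shares_org or coauthored > max_coauthored


# Sample records with values taken from the table above (assumed JSON shape)
experts = [
    {"id": "ur.01146544531.57", "last_name": "Jiang",
     "overlap": {"coauthorship": 0, "organizational": True}},
    {"id": "ur.0657076451.24", "last_name": "Zhang",
     "overlap": {"coauthorship": 0, "organizational": False}},
    {"id": "ur.07423561367.62", "last_name": "Morozov",
     "overlap": {"coauthorship": 6, "organizational": False}},
]

conflict_free = [e["last_name"] for e in experts if not has_conflict(e)]
print(conflict_free)
```

The `max_coauthored` threshold is a design choice left to the caller: a strict review process might keep it at zero, while a small field where most experts have collaborated at some point may need a more lenient value.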
Example 5. Query with MUST/NOT Operators¶
By default, the string containing a list of concepts is interpreted as a sequence of AND clauses. That is, the query tries to match the highest number of concepts without any preference.
It is possible to specify MUST/NOT rules for concepts by prefixing them with the + and - operators inside the concepts string.
Note: concept phrases (concepts composed of more than one word) need to be wrapped in quotes, and the quotes need to be escaped with a \.
[7]:
concepts = """
+"ambipolar electric field effect"
-"graphitic films"
+"films"
"electric field effects"
"""
q = f"""
identify experts
from concepts "{dsl_escape(concepts)}"
using publications
return experts
"""
print("Query:\n======", q)
dsl.query(q).as_dataframe()
Query:
======
identify experts
from concepts "
+\"ambipolar electric field effect\"
-\"graphitic films\"
+\"films\"
\"electric field effects\"
"
using publications
return experts
[7]:
| | id | score | research_orgs | last_name | first_name | docs_found | orcid_id |
---|---|---|---|---|---|---|---|
0 | ur.01005576245.93 | 3.480071 | [grid.6520.1, grid.121334.6] | Henrard | Luc | 1 | NaN |
1 | ur.01251242035.86 | 3.480071 | [grid.6520.1, grid.121334.6, grid.12082.39, gr... | Latil | Sylvain | 1 | NaN |
2 | ur.01000623240.81 | 2.589067 | [grid.164295.d] | Syers | Paul | 1 | NaN |
3 | ur.01046736440.46 | 2.589067 | [grid.266100.3, grid.410443.6, grid.250008.f, ... | Butch | Nicholas Patrick | 1 | NaN |
4 | ur.01060352233.12 | 2.589067 | [grid.266100.3, grid.440050.5, grid.410443.6, ... | Paglione | John-Pierre | 1 | NaN |
5 | ur.01200656557.13 | 2.589067 | [grid.47840.3f, grid.499241.3, grid.184769.5, ... | Fuhrer | Michael Sears | 1 | [0000-0001-6183-2773] |
6 | ur.01205352017.54 | 2.589067 | [grid.31501.36, grid.164295.d, grid.35541.36, ... | Kim | Dohun | 1 | [0000-0001-9687-2089] |
7 | ur.01025667341.62 | 2.061342 | [grid.263856.c, grid.78837.33, grid.35043.31, ... | Sysoev | Victor V | 1 | [0000-0002-0372-1802] |
8 | ur.01245543252.06 | 2.061342 | [grid.14476.30, grid.24434.35] | Shekhirev | Mikhail A | 1 | [0000-0002-8381-1276] |
9 | ur.01276657166.76 | 2.061342 | [grid.426324.5, grid.10420.37, grid.24434.35, ... | Lipatov | Alexey | 1 | [0000-0001-5043-1616] |
10 | ur.013212454037.49 | 2.061342 | [grid.78837.33] | Lashkov | Andrey V | 1 | [0000-0001-6794-8523] |
11 | ur.016560200577.43 | 2.061342 | [grid.24434.35] | Vorobeva | Nataliia S | 1 | NaN |
12 | ur.0646414360.09 | 2.061342 | [grid.35043.31, grid.24434.35, grid.170430.1, ... | Sinitskii | Alexander S | 1 | [0000-0002-8688-3451] |
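Writing the `+` and `-` prefixes by hand gets error-prone as the list of concepts grows. Below is a small sketch of a helper that assembles the concepts string from three lists. `concept_clause` is a hypothetical name, and the result still needs its quotes escaped (the notebook uses `dsl_escape` for this) before it is placed inside the query.

```python
def concept_clause(must=(), must_not=(), optional=()):
    """Build a concepts string with + (MUST) and - (NOT) prefixes."""
    def quote(concept):
        # Wrap every concept in quotes, as Example 5 does even for the single word "films"
        return '"%s"' % concept

    parts = ["+" + quote(c) for c in must]
    parts += ["-" + quote(c) for c in must_not]
    parts += [quote(c) for c in optional]
    return " ".join(parts)


clause = concept_clause(
    must=["ambipolar electric field effect", "films"],
    must_not=["graphitic films"],
    optional=["electric field effects"],
)
print(clause)
```

This reproduces the same set of rules used in Example 5, only on a single line and with the MUST concepts grouped first.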
Example 6. MUST together with AND/OR¶
[8]:
concepts = """
(+"ambipolar electric field effect" -"graphitic films") OR
(+"films" -"electric field effects")
"""
q = f"""
identify experts
from concepts "{dsl_escape(concepts)}"
using publications
return experts
"""
print("Query:\n======", q)
dsl.query(q).as_dataframe()
Query:
======
identify experts
from concepts "
(+\"ambipolar electric field effect\" -\"graphitic films\") OR
(+\"films\" -\"electric field effects\")
"
using publications
return experts
[8]:
| | id | score | first_name | research_orgs | last_name | docs_found | orcid_id |
---|---|---|---|---|---|---|---|
0 | ur.014516430466.88 | 10.359314 | Ledford C | [grid.411377.7] | Carter | 18 | NaN |
1 | ur.01034030721.03 | 3.576317 | Pablo | [grid.5338.d, grid.116068.8, grid.21941.3f, gr... | Jarillo-Herrero | 3 | [0000-0001-8217-8213] |
2 | ur.010122277451.23 | 3.499347 | Alberta | NaN | Meyer | 4 | NaN |
3 | ur.012760700525.87 | 3.499347 | Esther | NaN | Aschemeyer | 4 | NaN |
4 | ur.011313310557.79 | 2.884011 | Erwin Randolph | [grid.26009.3d] | Parson | 5 | NaN |
5 | ur.011224625507.86 | 2.798074 | W | [grid.461804.f] | Feneberg | 1 | NaN |
6 | ur.015134442047.63 | 2.798074 | Manfred A | [grid.16463.36] | Hellberg | 1 | [0000-0003-0785-8125] |
7 | ur.01150036175.42 | 2.434489 | Peng | [grid.59025.3b] | Ren | 2 | NaN |
8 | ur.013275477227.26 | 2.434489 | Lan | [grid.17635.36, grid.59025.3b, grid.451303.0, ... | Wang | 2 | [0000-0001-7124-2718] |
9 | ur.056250446.77 | 2.434489 | Azat | [grid.59025.3b] | Sulaev | 2 | NaN |
10 | ur.0624630056.98 | 2.434489 | Shun-Qing | [grid.8547.e, grid.450298.2, grid.194645.b, gr... | Shen | 2 | NaN |
11 | ur.0756673070.05 | 2.434489 | Bin | [grid.59025.3b] | Xia | 2 | NaN |
12 | ur.01003543541.78 | 2.370124 | Shlomo | [grid.12136.37, grid.13992.30, grid.133342.4, ... | Efrima | 3 | NaN |
13 | ur.015146652071.00 | 2.370124 | D | [grid.13992.30, grid.7489.2] | Yogev | 3 | NaN |
14 | ur.012476642650.51 | 2.251025 | B Ruby | [grid.205975.c] | Rich | 3 | NaN |
15 | ur.01022425321.95 | 2.067556 | Andrey A | [grid.35043.31, grid.7491.b, grid.4764.1, grid... | Turchanin | 2 | [0000-0003-2388-1042] |
16 | ur.01161437031.05 | 2.067556 | Joachim | [grid.5719.a, grid.419534.e, grid.4372.2, grid... | Mayer | 2 | [0000-0003-3292-5342] |
17 | ur.01163755245.41 | 2.067556 | Konstantin B | [grid.4886.2, grid.457334.2, grid.411233.6, gr... | Efetov | 2 | [0000-0003-2245-1366] |
18 | ur.01172120354.34 | 2.067556 | Armin | [grid.7491.b, grid.7700.0, grid.414703.5, grid... | Gölzhäuser | 2 | [0000-0002-0838-9028] |
19 | ur.0704114136.03 | 2.067556 | Thomas | [grid.10392.39, grid.4764.1, grid.7491.b] | Weimann | 2 | NaN |
Example 7. Wildcard searches¶
[9]:
concepts = """temperat* "ray diffraction" -magnet* """
q = f"""
identify experts
from concepts "{dsl_escape(concepts)}"
using publications
return experts
"""
print("Query:\n======", q)
dsl.query(q).as_dataframe()
Query:
======
identify experts
from concepts "temperat* \"ray diffraction\" -magnet* "
using publications
return experts
[9]:
| | id | score | research_orgs | last_name | first_name | docs_found | orcid_id |
---|---|---|---|---|---|---|---|
0 | ur.010752560241.92 | 9.023557 | [grid.494717.8, grid.411717.5, grid.5399.6, gr... | Buscail | Henri | 4 | NaN |
1 | ur.016151106345.71 | 8.850567 | [grid.461616.2] | Kolarik | Vladislav | 4 | NaN |
2 | ur.012006337013.67 | 8.245036 | [grid.461616.2] | Engel | Walter | 4 | NaN |
3 | ur.01264404625.74 | 8.127706 | [grid.425759.8, grid.415877.8, grid.465435.5, ... | Boldyreva | Elena V | 4 | [0000-0002-1401-2438] |
4 | ur.01356350415.50 | 8.127706 | [grid.415877.8, grid.4605.7, grid.418421.a, gr... | Zakharov | Boris A | 4 | [0000-0002-3520-632X] |
5 | ur.07650346631.13 | 7.050485 | [grid.4444.0, grid.462844.8, grid.424133.3, gr... | Itié | Jean-Paul | 3 | NaN |
6 | ur.011274203435.25 | 6.858861 | [grid.27736.37, grid.418094.0] | Kocharyan | Vahan | 3 | NaN |
7 | ur.015270341551.59 | 6.810949 | [grid.494717.8, grid.5399.6] | Caudron | Eric | 3 | NaN |
8 | ur.012153454351.77 | 6.641345 | [grid.461616.2, grid.466709.a] | Juez-Lorenzo | Maria Del Mar | 3 | NaN |
9 | ur.011235502761.97 | 6.217678 | NaN | Triviño | F | 3 | NaN |
10 | ur.012267521017.15 | 6.217678 | NaN | Vázquez | T | 3 | NaN |
11 | ur.012630443761.25 | 6.217678 | NaN | Ruiz De Gauna | A | 3 | NaN |
12 | ur.012352473245.95 | 6.180346 | [grid.47894.36, grid.168010.e, grid.299175.1, ... | Qadri | Syed B | 3 | NaN |
13 | ur.013352205311.74 | 6.088813 | [grid.412761.7] | Ustinova | I S | 3 | NaN |
14 | ur.0654202176.07 | 6.088813 | [grid.4886.2, grid.465372.1] | Kadyrova | Nadezda I | 3 | NaN |
15 | ur.012077537127.36 | 6.035814 | [grid.461616.2, grid.4561.6, grid.4886.2] | Eisenreich | Norbert | 3 | NaN |
16 | ur.01015306115.93 | 5.794733 | [grid.418421.a, grid.4605.7, grid.435414.3] | Losev | Evgeniy A | 3 | [0000-0003-1743-4166] |
17 | ur.014146743075.39 | 5.561709 | [grid.4795.f, grid.463879.7, grid.411840.8, gr... | Hagenmuller | Paul | 3 | NaN |
18 | ur.015400602443.87 | 5.309069 | [grid.423902.e, grid.435347.2] | Guseinov | G G | 3 | NaN |
19 | ur.012651704451.05 | 5.108303 | [grid.32197.3e, grid.136593.b] | Oguni | Masaharu | 2 | NaN |