
Data Access

Getters

Functions to retrieve configuration data and questionnaire values for MaRDMO.

Provides fast, app-config-backed accessors for Wikibase vocabulary (items, properties), ontology registries (MathModDB, MathAlgoDB), RDMO question definitions, and project answer values. Most getters are thin wrappers around MaRDMO.apps.MaRDMOConfig attributes or functools.lru_cache-decorated file readers.

Provides:

  • get_mathmoddb – return a helpers.PropertyRegistry for MathModDB
  • get_mathalgodb – return a helpers.PropertyRegistry for MathAlgoDB
  • get_publication_mapping – return a helpers.PropertyRegistry for shared publication roles
  • get_options – return the RDMO options dict from the app config
  • get_items – return the Wikibase items dict from the app config
  • get_properties – return the Wikibase properties dict from the app config
  • get_questions – return the questions sub-dict for a given catalog section
  • get_url – return a configured URL for a Wikibase provider
  • get_item_url – return the base item-browse URL for a Wikibase provider
  • get_data – load and cache JSON data from a package-relative file
  • get_sparql_query – load and cache a SPARQL query file
  • get_sparql_query_optional – load and cache a SPARQL query file, or None if absent
  • get_id – retrieve attribute field(s) from all questionnaire values at a URI
  • get_answers – read questionnaire values for one attribute and merge into a dict
  • get_user_entries – fetch raw ID/name/description values for a domain attribute

get_answers(project, val, config)

Read questionnaire values for one attribute and merge them into val.

Iterates over all rdmo.projects.models.Value objects for the attribute described by config, determines the appropriate path in val via a constants.flag_dict handler, and calls helpers.nested_set to write the entry.

Parameters:

Name | Type | Description | Default
project | | RDMO project instance. | required
val | | Top-level answers dict that is mutated in place. | required
config | | Dict describing how to map a questionnaire attribute to val (keys listed below). | required

Keys of config:

 * ``"uri"``              – RDMO attribute URI fragment
 * ``"key1"``             – top-level key in *val*
 * ``"key2"``             – second-level key (may be empty)
 * ``"set_prefix"``       – bool flag: use set_prefix as path element
 * ``"set_index"``        – bool flag: use set_index as path element
 * ``"collection_index"`` – bool flag: use collection_index
 * ``"external_id"``      – bool flag: write external_id field
 * ``"option_text"``      – bool flag: write option URI as text

Returns:

Type Description

The mutated val dict.

Source code in MaRDMO/getters.py
def get_answers(project, val, config):
    '''Read questionnaire values for one attribute and merge them into *val*.

    Iterates over all :class:`~rdmo.projects.models.Value` objects for the
    attribute described by *config*, determines the appropriate path in *val*
    via a :data:`~.constants.flag_dict` handler, and calls
    :func:`~.helpers.nested_set` to write the entry.

    Args:
        project: RDMO project instance.
        val:     Top-level answers dict that is mutated in place.
        config:  Dict describing how to map a questionnaire attribute to *val*:

                 * ``"uri"``              – RDMO attribute URI fragment
                 * ``"key1"``             – top-level key in *val*
                 * ``"key2"``             – second-level key (may be empty)
                 * ``"set_prefix"``       – bool flag: use set_prefix as path element
                 * ``"set_index"``        – bool flag: use set_index as path element
                 * ``"collection_index"`` – bool flag: use collection_index
                 * ``"external_id"``      – bool flag: write external_id field
                 * ``"option_text"``      – bool flag: write option URI as text

    Returns:
        The mutated *val* dict.
    '''

    val.setdefault(config["key1"], {})

    try:
        values = project.values.filter(
            snapshot=None,
            attribute=Attribute.objects.get(uri = f"{BASE_URI}{config['uri']}")
            )
    except Attribute.DoesNotExist:
        values = []

    if not (config["key1"] or config["key2"]):
        return val

    for value in values:

        # Set Prefix IDX
        prefix_idx = None
        if value.set_prefix:
            prefix_idx = int(value.set_prefix.split('|')[0])

        # Set Flags
        flags = (
                 bool(config["set_prefix"]),
                 bool(config["set_index"]),
                 bool(config["collection_index"]),
                 bool(config["external_id"]),
                 bool(config["option_text"]),
                )

        # Set Attribute
        attribute = 'option_uri' if value.option else 'text' if value.text else None

        if not attribute:
            # Ignore if not Attribute Set
            continue

        # Get Flag Combo Handler
        handler = flag_dict[flags]

        # Get Entry and Path
        entry, path = handler(value, attribute, config, prefix_idx)

        # Generate nested Dict Entry
        nested_set(data=val,
                   path=path,
                   entry=entry)

    return val
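
The final step of the loop, writing an entry at a computed path, can be sketched with a stand-in for helpers.nested_set. The real helper is not shown on this page, so nested_set_sketch, the path layout, and the key names below are assumptions for illustration only.

```python
def nested_set_sketch(data, path, entry):
    """Hypothetical stand-in for helpers.nested_set.

    Walks `path`, creating intermediate dicts as needed, and stores
    `entry` under the last path element.
    """
    node = data
    for key in path[:-1]:
        node = node.setdefault(key, {})
    node[path[-1]] = entry
    return data


val = {}
# Assumed path layout: key1 -> set index -> key2
nested_set_sketch(val, ["software", 0, "Name"], "NumPy")
nested_set_sketch(val, ["software", 0, "ID"], "mardi:Q123")  # made-up ID
nested_set_sketch(val, ["software", 1, "Name"], "SciPy")
print(val)
```

Because each call only touches the branch named by its path, repeated calls for different questionnaire values accumulate into one nested dict, which is what get_answers returns.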

get_data(file_name) cached

Load and return JSON data from a file relative to the MaRDMO package root.

Result is cached indefinitely after the first read.

Parameters:

Name | Type | Description | Default
file_name | | Path relative to the package directory (e.g. "data/items.json"). | required

Returns:

Type Description

Parsed JSON value (typically a dict or list).

Source code in MaRDMO/getters.py
@lru_cache(maxsize=None)
def get_data(file_name):
    '''Load and return JSON data from a file relative to the MaRDMO package root.

    Result is cached indefinitely after the first read.

    Args:
        file_name: Path relative to the package directory (e.g. ``"data/items.json"``).

    Returns:
        Parsed JSON value (typically a dict or list).
    '''
    path = os.path.join(os.path.dirname(__file__), file_name)
    with open(path, "r", encoding="utf-8") as json_file:
        data = json.load(json_file)
    return data
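
The effect of "cached indefinitely" can be seen in a small self-contained sketch. load_json_cached is a hypothetical variant of get_data that takes an absolute path so it can run against a temporary file; the caching decorator is the same.

```python
import json
import os
import tempfile
from functools import lru_cache


@lru_cache(maxsize=None)
def load_json_cached(path):
    """Same caching pattern as get_data, but with an absolute path (assumption)."""
    with open(path, "r", encoding="utf-8") as json_file:
        return json.load(json_file)


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "items.json")
    with open(path, "w", encoding="utf-8") as f:
        json.dump({"software": "Q1"}, f)

    first = load_json_cached(path)

    # Rewriting the file does not affect later calls: the cached value wins.
    with open(path, "w", encoding="utf-8") as f:
        json.dump({"software": "Q2"}, f)
    second = load_json_cached(path)

hits = load_json_cached.cache_info().hits
same_object = first is second
print(second, hits, same_object)
```

Two consequences follow from this pattern: every caller receives the same object, so mutating the returned dict or list changes what later callers see, and edits to the file are only picked up after cache_clear() or a process restart.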

get_id(project, uri, keys)

Retrieve one or more attribute fields from all questionnaire values at uri.

Parameters:

Name | Type | Description | Default
project | | RDMO project instance. | required
uri | | Full RDMO attribute URI to filter on. | required
keys | | List of rdmo.projects.models.Value field names to read (e.g. ["set_index"], ["external_id"], or ["set_index", "set_prefix"]). When keys has exactly one element, pipe-separated string values are split and only the first part is returned. | required

Returns:

Type Description

List of scalar values (single-key case) or list of lists (multi-key case), one entry per matching questionnaire value.

Source code in MaRDMO/getters.py
def get_id(project, uri, keys):
    '''Retrieve one or more attribute fields from all questionnaire values at *uri*.

    Args:
        project: RDMO project instance.
        uri:     Full RDMO attribute URI to filter on.
        keys:    List of :class:`~rdmo.projects.models.Value` field names to
                 read (e.g. ``["set_index"]``, ``["external_id"]``, or
                 ``["set_index", "set_prefix"]``).  When *keys* has exactly
                 one element, multi-value pipe-separated strings are split and
                 only the first part is returned.

    Returns:
        List of scalar values (single-key case) or list of lists (multi-key
        case), one entry per matching questionnaire value.
    '''
    values = project.values.filter(
        snapshot=None,
        attribute=Attribute.objects.get(
            uri=uri
        )
    )
    identifiers = []
    if len(keys) == 1:
        for value in values:
            identifier = getattr(value, keys[0])
            if isinstance(identifier, str) and '|' in identifier:
                identifier = identifier.split('|')[0]
            identifiers.append(identifier)
    else:
        for value in values:
            identifier = []
            for key in keys:
                identifier.append(getattr(value, key))
            identifiers.append(identifier)
    return identifiers
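
The difference between the single-key and multi-key cases is easiest to see on plain objects. The sketch below copies the extraction loop from get_id and applies it to hypothetical stand-ins for Value rows, so no database is needed; extract_fields and the sample data are not part of MaRDMO.

```python
from types import SimpleNamespace


def extract_fields(values, keys):
    """Extraction loop of get_id, applied to any objects with attributes."""
    identifiers = []
    if len(keys) == 1:
        for value in values:
            identifier = getattr(value, keys[0])
            # Only the single-key case trims "a|b" down to "a"
            if isinstance(identifier, str) and "|" in identifier:
                identifier = identifier.split("|")[0]
            identifiers.append(identifier)
    else:
        for value in values:
            identifiers.append([getattr(value, key) for key in keys])
    return identifiers


# Hypothetical stand-ins for rdmo Value rows
values = [
    SimpleNamespace(set_index=0, set_prefix="2|0", external_id="mardi:Q1"),
    SimpleNamespace(set_index=1, set_prefix="3", external_id="wikidata:Q2"),
]

single = extract_fields(values, ["set_prefix"])
multi = extract_fields(values, ["set_index", "set_prefix"])
print(single)
print(multi)
```

Note that in the multi-key case the pipe-separated set_prefix is returned unchanged, so callers that need only the first part have to split it themselves.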

get_item_url(source)

Return the base URL for browsing Wikibase items on a provider's wiki.

The Wikidata URI is taken from the internal MaRDMO.constants.WIKIDATA constant. All other sources are read from settings.MARDMO_PROVIDER.

Parameters:

Name | Type | Description | Default
source | | Provider key: "wikidata" or a key in MARDMO_PROVIDER. | required

Returns:

Type Description

URL string ending in "/wiki/Item:" for constructing full item links.

Source code in MaRDMO/getters.py
def get_item_url(source):
    '''Return the base URL for browsing Wikibase items on a provider's wiki.

    Wikidata URI is taken from the internal :data:`~MaRDMO.constants.WIKIDATA`
    constant.  All other sources are read from ``settings.MARDMO_PROVIDER``.

    Args:
        source: Provider key — ``"wikidata"`` or a key in ``MARDMO_PROVIDER``.

    Returns:
        URL string ending in ``"/wiki/Item:"`` for constructing full item links.
    '''
    if source == 'wikidata':
        return f"{WIKIDATA['uri']}/wiki/Item:"
    return f"{settings.MARDMO_PROVIDER[source]['uri']}/wiki/Item:"

get_items()

Return the Wikibase items dict from the app config.

Returns:

Type Description

Dict mapping item label strings to Wikibase QID strings.

Source code in MaRDMO/getters.py
def get_items():
    '''Return the Wikibase items dict from the app config.

    Returns:
        Dict mapping item label strings to Wikibase QID strings.
    '''
    return apps.get_app_config("MaRDMO").items

get_mathalgodb()

Return a helpers.PropertyRegistry for the MathAlgoDB ontology.

Returns:

Type Description

helpers.PropertyRegistry wrapping MaRDMO.apps.MaRDMOConfig.mathalgodb.

Source code in MaRDMO/getters.py
def get_mathalgodb():
    '''Return a :class:`~.helpers.PropertyRegistry` for the MathAlgoDB ontology.

    Returns:
        :class:`~.helpers.PropertyRegistry` wrapping
        :attr:`~MaRDMO.apps.MaRDMOConfig.mathalgodb`.
    '''
    return PropertyRegistry(
        apps.get_app_config("MaRDMO").mathalgodb
    )

get_mathmoddb()

Return a helpers.PropertyRegistry for the MathModDB ontology.

Returns:

Type Description

helpers.PropertyRegistry wrapping MaRDMO.apps.MaRDMOConfig.mathmoddb.

Source code in MaRDMO/getters.py
def get_mathmoddb():
    '''Return a :class:`~.helpers.PropertyRegistry` for the MathModDB ontology.

    Returns:
        :class:`~.helpers.PropertyRegistry` wrapping
        :attr:`~MaRDMO.apps.MaRDMOConfig.mathmoddb`.
    '''
    return PropertyRegistry(
        apps.get_app_config("MaRDMO").mathmoddb
    )

get_options()

Return the RDMO options dict from the app config.

Returns:

Type Description

Dict mapping RDMO option URIs to display strings.

Source code in MaRDMO/getters.py
def get_options():
    '''Return the RDMO options dict from the app config.

    Returns:
        Dict mapping RDMO option URIs to display strings.
    '''
    return apps.get_app_config("MaRDMO").options

get_properties()

Return the Wikibase properties dict from the app config.

Returns:

Type Description

Dict mapping property label strings to Wikibase property ID strings.

Source code in MaRDMO/getters.py
def get_properties():
    '''Return the Wikibase properties dict from the app config.

    Returns:
        Dict mapping property label strings to Wikibase property ID strings.
    '''
    return apps.get_app_config("MaRDMO").properties

get_publication_mapping()

Return a helpers.PropertyRegistry for the shared publication role options.

Returns:

Type Description

helpers.PropertyRegistry wrapping MaRDMO.apps.MaRDMOConfig.publication_mapping.

Source code in MaRDMO/getters.py
def get_publication_mapping():
    '''Return a :class:`~.helpers.PropertyRegistry` for the shared publication role options.

    Returns:
        :class:`~.helpers.PropertyRegistry` wrapping
        :attr:`~MaRDMO.apps.MaRDMOConfig.publication_mapping`.
    '''
    return PropertyRegistry(
        apps.get_app_config("MaRDMO").publication_mapping
    )

get_questions(question_set)

Return the questions sub-dict for a given catalog question set.

Parameters:

Name | Type | Description | Default
question_set | | Key identifying the question set (catalog section name). | required

Returns:

Type Description

Dict mapping question names to their RDMO attribute URI fragments and related metadata.

Source code in MaRDMO/getters.py
def get_questions(question_set):
    '''Return the questions sub-dict for a given catalog question set.

    Args:
        question_set: Key identifying the question set (catalog section name).

    Returns:
        Dict mapping question names to their RDMO attribute URI fragments and
        related metadata.
    '''
    return apps.get_app_config("MaRDMO").questions[question_set]

get_sparql_query(file_name) cached

Load and return the contents of a SPARQL query file.

Result is cached indefinitely after the first read.

Parameters:

Name | Type | Description | Default
file_name | | Path relative to the package directory (e.g. "model/queries/field.sparql"). | required

Returns:

Type Description

Query string (may contain {} format placeholders).

Source code in MaRDMO/getters.py
@lru_cache(maxsize=None)
def get_sparql_query(file_name):
    '''Load and return the contents of a SPARQL query file.

    Result is cached indefinitely after the first read.

    Args:
        file_name: Path relative to the package directory (e.g.
                   ``"model/queries/field.sparql"``).

    Returns:
        Query string (may contain ``{}`` format placeholders).
    '''
    path = os.path.join(os.path.dirname(__file__), file_name)
    with open(path, "r", encoding="utf-8") as sparql_file:
        return sparql_file.read()
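
Since the returned string may contain {} format placeholders, one detail is worth illustrating: SPARQL itself uses braces for graph patterns, so a template filled with str.format has to double its literal braces. The template text, the placeholder names, and the use of str.format below are assumptions; the actual MaRDMO query files are not shown on this page.

```python
# Hypothetical template text, standing in for the contents of a .sparql file.
# Literal SPARQL braces are doubled so that str.format leaves them in place.
template = (
    "SELECT ?item ?label WHERE {{\n"
    "  ?item wdt:{prop} wd:{qid} ;\n"
    "        rdfs:label ?label .\n"
    "}} LIMIT {limit}"
)

# Fill the named placeholders; the doubled braces collapse to single ones.
query = template.format(prop="P31", qid="Q42", limit=10)
print(query)
```

Because the cached template string is immutable, formatting it on every call is safe: each call to format produces a new string and leaves the cached value untouched.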

get_sparql_query_optional(file_name) cached

Load a SPARQL query file, returning None if the file does not exist.

Result is cached indefinitely after the first read. This includes a None result, so a file created later is not picked up until the cache is cleared.

Parameters:

Name | Type | Description | Default
file_name | | Path relative to the package directory. | required

Returns:

Type Description

Query string on success; None when the file is absent.

Source code in MaRDMO/getters.py
@lru_cache(maxsize=None)
def get_sparql_query_optional(file_name):
    '''Load a SPARQL query file, returning ``None`` if the file does not exist.

    Result is cached indefinitely after the first read.

    Args:
        file_name: Path relative to the package directory.

    Returns:
        Query string on success; ``None`` when the file is absent.
    '''
    path = os.path.join(os.path.dirname(__file__), file_name)
    if not os.path.exists(path):
        return None
    with open(path, "r", encoding="utf-8") as sparql_file:
        return sparql_file.read()

get_url(source, url_type)

Return a URL for a Wikibase provider.

Wikidata URLs are taken from the internal MaRDMO.constants.WIKIDATA constant. All other sources are read from settings.MARDMO_PROVIDER.

Parameters:

Name | Type | Description | Default
source | | Provider key: "wikidata" or a key in MARDMO_PROVIDER. | required
url_type | | URL type key: one of "api", "sparql", or "uri". | required

Returns:

Type Description

URL string for the requested source and type.

Source code in MaRDMO/getters.py
def get_url(source, url_type):
    '''Return a URL for a Wikibase provider.

    Wikidata URLs are taken from the internal :data:`~MaRDMO.constants.WIKIDATA`
    constant.  All other sources are read from ``settings.MARDMO_PROVIDER``.

    Args:
        source:   Provider key — ``"wikidata"`` or a key in ``MARDMO_PROVIDER``.
        url_type: URL type key — one of ``"api"``, ``"sparql"``, or ``"uri"``.

    Returns:
        URL string for the requested source and type.
    '''
    if source == 'wikidata':
        return WIKIDATA[url_type]
    return settings.MARDMO_PROVIDER[source][url_type]
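
The two-level lookup used by get_url, and reused by get_item_url, can be sketched without Django. The dict shapes below are assumptions: the real WIKIDATA constant and settings.MARDMO_PROVIDER are not shown on this page, and the example.org URLs are placeholders.

```python
# Assumed shape of the internal Wikidata constant
WIKIDATA_SKETCH = {
    "api": "https://www.wikidata.org/w/api.php",
    "sparql": "https://query.wikidata.org/sparql",
    "uri": "https://www.wikidata.org",
}

# Assumed shape of settings.MARDMO_PROVIDER, with placeholder URLs
PROVIDERS_SKETCH = {
    "mardi": {
        "api": "https://portal.example.org/w/api.php",
        "sparql": "https://query.portal.example.org/sparql",
        "uri": "https://portal.example.org",
    },
}


def url_for(source, url_type):
    """Wikidata comes from the constant, every other source from the settings dict."""
    if source == "wikidata":
        return WIKIDATA_SKETCH[url_type]
    return PROVIDERS_SKETCH[source][url_type]


def item_url_for(source):
    """Base item URL in the style of get_item_url: '<uri>/wiki/Item:'."""
    return f"{url_for(source, 'uri')}/wiki/Item:"


mardi_item = item_url_for("mardi") + "Q1"
print(url_for("wikidata", "sparql"))
print(mardi_item)
```

In this sketch an unknown source or url_type raises KeyError, because both lookups are plain dict subscripts.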

get_user_entries(project, query_attribute, values)

Fetch raw ID, Name, and Description questionnaire values for a domain attribute.

Parameters:

Name | Type | Description | Default
project | | RDMO project instance. | required
query_attribute | | Attribute path fragment (e.g. "software"); the three sub-attributes <fragment>/id, /name, and /description are queried. | required
values | | Dict to populate; mutated in place. | required

Returns:

Type Description

The values dict with keys "id", "name", and "description", each holding a django.db.models.QuerySet of rdmo.projects.models.Value instances, or an empty list when the attribute does not exist.

Source code in MaRDMO/getters.py
def get_user_entries(project, query_attribute, values):
    '''Fetch raw ID, Name, and Description questionnaire values for a domain attribute.

    Args:
        project:         RDMO project instance.
        query_attribute: Attribute path fragment (e.g. ``"software"``); the
                         three sub-attributes ``<fragment>/id``, ``/name``,
                         and ``/description`` are queried.
        values:          Dict to populate; mutated in place.

    Returns:
        The *values* dict with keys ``"id"``, ``"name"``, and ``"description"``,
        each holding a :class:`~django.db.models.QuerySet` of
        :class:`~rdmo.projects.models.Value` instances, or an empty list when
        the attribute does not exist.
    '''
    for question in ('id', 'name', 'description'):
        attr = Attribute.objects.filter(
            uri=f'{BASE_URI}domain/{query_attribute}/{question}'
        ).first()
        values[question] = project.values.filter(snapshot=None, attribute=attr) if attr else []
    return values

Queries

Functions that query external data sources (Wikibase API, SPARQL endpoints, Crossref, etc.).

Provides helpers for:

  • query_item / query_api – Wikibase wbsearchentities lookups
  • query_sparql / query_sparql_pool – SPARQL SELECT requests
  • query_sources / query_sources_with_user_additions – combined MaRDI + Wikidata search with optional user-defined entries
  • query_api_per_class – class-filtered full-text search on the MaRDI portal

query_api(api_url, search_term, timeout=5)

Search a Wikibase API endpoint for items whose label matches search_term.

Uses the wbsearchentities action with limit=10.

Parameters:

Name | Type | Description | Default
api_url | | Full URL of the Wikibase API endpoint. | required
search_term | | Label string to search for. | required
timeout | | Request timeout in seconds. | 5

Returns:

Type Description

List of raw Wikibase search-result dicts; empty list on any error.

Source code in MaRDMO/queries.py
def query_api(api_url, search_term, timeout=5):
    '''Search a Wikibase API endpoint for items whose label matches *search_term*.

    Uses the ``wbsearchentities`` action with ``limit=10``.

    Args:
        api_url:     Full URL of the Wikibase API endpoint.
        search_term: Label string to search for.
        timeout:     Request timeout in seconds (default 5).

    Returns:
        List of raw Wikibase search-result dicts; empty list on any error.
    '''
    try:
        response = requests.get(
            api_url,
            params={
                'action': 'wbsearchentities',
                'format': 'json',
                'language': 'en',
                'type': 'item',
                'limit': 10,
                'search': search_term
            },
            headers={'User-Agent': 'MaRDMO (https://zib.de; reidelbach@zib.de)'},
            timeout=timeout
        )
        response.raise_for_status()  # Raise an error on bad HTTP status codes
        try:
            return response.json().get('search', [])
        except ValueError:
            # Malformed JSON
            logger.error("Failed to parse JSON.")
    except requests.exceptions.RequestException as e:
        logger.error("Request failed due to %s", e)

    return []
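
To make the request concrete without touching the network, the sketch below builds the GET URL that corresponds to the params dict in query_api, using only the standard library. build_search_url is a hypothetical helper and the search term is an arbitrary example.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_search_url(api_url, search_term):
    """Build the GET URL matching the wbsearchentities params used in query_api."""
    params = {
        "action": "wbsearchentities",
        "format": "json",
        "language": "en",
        "type": "item",
        "limit": 10,
        "search": search_term,
    }
    return f"{api_url}?{urlencode(params)}"


url = build_search_url("https://www.wikidata.org/w/api.php", "heat equation")
query = parse_qs(urlsplit(url).query)
print(url)

# query_api reads the hits from the "search" key and falls back to an empty list
sample_response = {"search": [{"id": "Q1", "label": "example"}]}  # made-up payload
hits = sample_response.get("search", [])
```

The same fallback explains why callers never need to handle exceptions from query_api: request failures and malformed JSON are logged and turned into an empty list.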

query_api_per_class(search_term, item_class)

Search the MaRDI portal for items belonging to one or more Wikibase classes.

Performs a MediaWiki full-text search filtered by haswbstatement:<instance-of>=QID (combined with | for multiple classes), then fetches English labels and descriptions for the matching QIDs via wbgetentities. The "instance of" property ID is read from the configured properties.json so it works across portals.

Parameters:

Name | Type | Description | Default
search_term | str | Free-text search string (appended with * for prefix matching). | required
item_class | | QID string or list of QID strings; e.g. "Q42" or ["Q42", "Q43"]. | required

Returns:

Type | Description
list[dict] | List of {"id": "mardi:<QID>", "text": "Label (Description) [mardi]"} dicts; empty list on any network error or no hits.

Source code in MaRDMO/queries.py
def query_api_per_class(search_term: str, item_class) -> list[dict]:
    '''Search the MaRDI portal for items belonging to one or more Wikibase classes.

    Performs a MediaWiki full-text search filtered by
    ``haswbstatement:<instance-of>=QID`` (combined with ``|`` for multiple
    classes), then fetches English labels and descriptions for the matching
    QIDs via ``wbgetentities``.  The "instance of" property ID is read from
    the configured ``properties.json`` so it works across portals.

    Args:
        search_term: Free-text search string (appended with ``*`` for prefix matching).
        item_class:  QID string or list of QID strings; e.g. ``"Q42"`` or
                     ``["Q42", "Q43"]``.

    Returns:
        List of ``{"id": "mardi:<QID>", "text": "Label (Description) [mardi]"}``
        dicts; empty list on any network error or no hits.
    '''
    if isinstance(item_class, str):
        item_class = [item_class]

    instance_of  = get_properties()['instance of']
    class_filter = '|'.join(f'{instance_of}={qid}' for qid in item_class)
    api_url      = get_url('mardi', 'api')

    try:
        search_resp = requests.get(
            api_url,
            params={
                'action':      'query',
                'list':        'search',
                'srsearch':    f'{search_term}* haswbstatement:{class_filter}',
                'srnamespace': 120,
                'srlimit':     50,
                'srprop':      'snippet',
                'format':      'json',
            },
            headers={'User-Agent': 'MaRDMO (https://zib.de; reidelbach@zib.de)'},
            timeout=5,
        )
        search_resp.raise_for_status()
        hits = search_resp.json().get('query', {}).get('search', [])
    except requests.exceptions.RequestException as e:
        logger.error("Class-based MaRDI search failed: %s", e)
        return []

    if not hits:
        return []

    qids = [hit['title'].removeprefix('Item:') for hit in hits]

    try:
        entity_resp = requests.get(
            api_url,
            params={
                'action':    'wbgetentities',
                'ids':       '|'.join(qids),
                'props':     'labels|descriptions',
                'languages': 'en',
                'format':    'json',
            },
            headers={'User-Agent': 'MaRDMO (https://zib.de; reidelbach@zib.de)'},
            timeout=5,
        )
        entity_resp.raise_for_status()
        entities = entity_resp.json().get('entities', {})
    except requests.exceptions.RequestException as e:
        logger.error("MaRDI entity fetch failed: %s", e)
        return []

    results = []
    for qid in qids:
        entity      = entities.get(qid, {})
        label       = entity.get('labels',       {}).get('en', {}).get('value', qid)
        description = entity.get('descriptions', {}).get('en', {}).get('value',
                                                                        'No Description Provided!')
        results.append({
            'id':   f'mardi:{qid}',
            'text': f'{label} ({description}) [mardi]',
        })

    return results
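
The search string sent in the first request can be reproduced in isolation. In the sketch below, build_class_search is a hypothetical helper that mirrors the string construction above, and "P31" is an assumed "instance of" property ID; the real ID comes from get_properties().

```python
def build_class_search(search_term, item_class, instance_of="P31"):
    """Compose the srsearch value; "P31" is an assumed 'instance of' property ID."""
    if isinstance(item_class, str):
        item_class = [item_class]
    # One "<property>=<QID>" clause per class, joined with "|" (logical OR)
    class_filter = "|".join(f"{instance_of}={qid}" for qid in item_class)
    return f"{search_term}* haswbstatement:{class_filter}"


one = build_class_search("heat", "Q42")
many = build_class_search("heat", ["Q42", "Q43"])

# Search hits carry titles such as "Item:Q42"; the QID is the part after the prefix
qid = "Item:Q42".removeprefix("Item:")  # str.removeprefix needs Python 3.9+
print(one)
print(many)
print(qid)
```

Accepting either a single QID or a list keeps call sites short while the rest of the function only ever deals with a list.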

query_item(label, description, api=get_url('mardi', 'api'))

Search a Wikibase API for an item matching label and description exactly.

Skips the request if description is empty or the placeholder string "No Description Provided!".

Parameters:

Name | Type | Description | Default
label | | Label (or alias) to search for (case-insensitive). | required
description | | Exact description the matched item must have. | required
api | | Wikibase wbsearchentities API URL; defaults to MaRDI Portal. | get_url('mardi', 'api')

Returns:

Type Description

The Wikibase QID string of the first match, or None if no match.

Source code in MaRDMO/queries.py
def query_item(label, description, api = get_url('mardi', 'api')):
    '''Search a Wikibase API for an item matching *label* and *description* exactly.

    Skips the request if *description* is empty or the placeholder string
    ``"No Description Provided!"``.

    Args:
        label:       Label (or alias) to search for (case-insensitive).
        description: Exact description the matched item must have.
        api:         Wikibase ``wbsearchentities`` API URL; defaults to MaRDI Portal.

    Returns:
        The Wikibase QID string of the first match, or ``None`` if no match.
    '''
    # Only check Items with description
    if not description or description == 'No Description Provided!':
        return None

    # Get data from API
    data = query_api(api, label)

    # Normalize input label for comparison (case insensitive)
    norm_label = label.strip().lower()

    matched_items = []
    for item in data:
        item_label = item.get('label', '').strip().lower()
        item_description = item.get('description', '').strip()
        item_aliases = [alias.strip().lower() for alias in item.get('aliases', [])]

        # Check label or alias match AND description match
        if (
            (item_label == norm_label or norm_label in item_aliases)
            and item_description == description
        ):
            matched_items.append(item)

    if matched_items:
        return matched_items[0]['id']

    return None
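
The matching rule (case-insensitive label or alias, exact description) can be tried on sample data. first_exact_match is a hypothetical helper that applies the same comparison as the loop above, and the search results are made up.

```python
def first_exact_match(results, label, description):
    """Return the id of the first result whose label/alias and description match."""
    norm_label = label.strip().lower()
    for item in results:
        item_label = item.get("label", "").strip().lower()
        item_aliases = [alias.strip().lower() for alias in item.get("aliases", [])]
        label_matches = item_label == norm_label or norm_label in item_aliases
        # The description comparison is case-sensitive, unlike the label comparison
        if label_matches and item.get("description", "").strip() == description:
            return item["id"]
    return None


# Made-up results in the shape of wbsearchentities hits
results = [
    {"id": "Q10", "label": "Heat Equation", "description": "physical model"},
    {"id": "Q11", "label": "heat equation", "description": "partial differential equation"},
    {"id": "Q12", "label": "diffusion equation", "aliases": ["Heat equation"],
     "description": "partial differential equation"},
]

by_label = first_exact_match(results, " Heat equation ", "partial differential equation")
by_alias = first_exact_match(results[2:], "heat equation", "partial differential equation")
no_match = first_exact_match(results, "heat equation", "Partial differential equation")
print(by_label, by_alias, no_match)
```

The last call returns None only because of the capital "P" in the description, which shows how strict the description check is.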

query_sources(search, item_class=[], sources=None, not_found=True)

Query one or more knowledge-graph sources for items matching search.

Queries the requested sources in parallel, merges the results, and sorts them by relevance to search using helpers.rank_by_search_term.

Parameters:

Name | Type | Description | Default
search | | Free-text search string. | required
item_class | | QID string or list of QID strings used for class-filtered search on MaRDI (e.g. "Q42" or ["Q42", "Q43"]). Wikidata is always searched by label/description only. | []
sources | | List of source keys to query; defaults to ["mardi", "wikidata"]. | None
not_found | | If True (default), prepends a {"id": "not found", "text": "not found"} sentinel to the option list. | True

Returns:

Type Description

List of {"id": …, "text": …} option dicts, sorted by relevance.

Source code in MaRDMO/queries.py
def query_sources(search, item_class=[], sources=None, not_found=True):
    '''Query one or more knowledge-graph sources for items matching *search*.

    Queries the requested sources in parallel, merges the results, and sorts
    them by relevance to *search* using :func:`~.helpers.rank_by_search_term`.

    Args:
        search:     Free-text search string.
        item_class: QID string or list of QID strings used for class-filtered
                    search on MaRDI (e.g. ``"Q42"`` or ``["Q42", "Q43"]``).
                    Wikidata is always searched by label/description only.
        sources:    List of source keys to query; defaults to
                    ``["mardi", "wikidata"]``.
        not_found:  If ``True`` (default), prepends a ``{"id": "not found",
                    "text": "not found"}`` sentinel to the option list.

    Returns:
        List of ``{"id": …, "text": …}`` option dicts, sorted by relevance.
    '''
    if sources is None:
        sources = ['mardi', 'wikidata']

    source_functions = {}
    if 'mardi' in sources:
        source_functions['mardi'] = lambda s: query_api_per_class(s, item_class)
    if 'wikidata' in sources:
        source_functions['wikidata'] = lambda s: query_api(get_url('wikidata', 'api'), s)

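    # ThreadPool rejects zero workers, so keep at least one even when no
    # known source was requested. The context manager shuts the worker
    # threads down once all results are in.
    with ThreadPool(processes=max(1, len(source_functions))) as pool:
        results = pool.map(lambda func: func(search), source_functions.values())
    results_dict = dict(zip(source_functions.keys(), results))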

    options = []
    for source in sources:
        if source not in results_dict:
            continue
        raw = results_dict[source][:25]
        if source == 'wikidata':
            # query_api returns raw Wikibase search dicts; format them here
            display_key = 'display'
            raw = [
                {
                    'id':   f"wikidata:{r['id']}",
                    'text': (f"{r[display_key].get('label', {}).get('value', 'No Label Provided!')}"
                             f" ({r[display_key].get('description', {}).get('value', 'No Description Provided!')})"
                             f" [wikidata]"),
                }
                for r in raw
                if display_key in r
            ]
        options += raw

    options.sort(key=lambda opt: rank_by_search_term(opt, search))

    if not_found:
        options = [{'id': 'not found', 'text': 'not found'}] + options

    return options
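
The parallel fan-out at the heart of query_sources can be exercised with stub sources instead of network calls. In the sketch below, fan_out, the stub data, and the length-based sort key are all assumptions; in particular the real rank_by_search_term is not shown on this page and is replaced by a placeholder.

```python
from multiprocessing.pool import ThreadPool


def fan_out(search, source_functions):
    """Call every source function with `search` in parallel; keep results per source."""
    if not source_functions:
        return {}
    # Threads (not processes) are used, so lambdas need no pickling
    with ThreadPool(processes=len(source_functions)) as pool:
        results = pool.map(lambda func: func(search), list(source_functions.values()))
    return dict(zip(source_functions.keys(), results))


# Stub sources replacing the network calls (hypothetical data)
stubs = {
    "mardi": lambda s: [{"id": "mardi:Q1", "text": f"{s} model (stub) [mardi]"}],
    "wikidata": lambda s: [{"id": "wikidata:Q2", "text": f"{s} (stub) [wikidata]"}],
}

per_source = fan_out("heat equation", stubs)
options = [opt for source in ("mardi", "wikidata") for opt in per_source[source]]

# Placeholder ranking: shorter text first (stands in for rank_by_search_term)
options.sort(key=lambda opt: len(opt["text"]))
print([opt["id"] for opt in options])
```

pool.map returns results in the order of its input, which is why zipping them back onto the dict keys pairs each result list with the right source.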

query_sources_with_user_additions(search, project, setup)

Fetch KG options, prepend matching user-defined entries, and optionally allow creation.

Combines external source results (via query_sources) with entries the user has already added to the questionnaire (cached by project/attribute). Optionally inserts a creation option for new entries.

Parameters:

Name | Type | Description | Default
search | | Free-text search string entered by the user. | required
project | | RDMO project instance used to look up existing user entries. | required
setup | | Configuration dict (keys listed below). | required

Keys of setup:

 * ``sources``          – list of source keys (default
   ``["mardi", "wikidata"]`` when ``None``)
 * ``item_class``       – QID or list of QIDs for class filtering
 * ``query_attributes`` – list of attribute URI fragments to
   query for user-defined entries
 * ``creation``         – bool; if ``True``, prepend an option
   that lets users create a new entry with the search text

Returns:

Type Description

List of {"id": …, "text": …} option dicts.

Source code in MaRDMO/queries.py
def query_sources_with_user_additions(search, project, setup):
    '''Fetch KG options, prepend matching user-defined entries, and optionally allow creation.

    Combines external source results (via :func:`query_sources`) with entries
    the user has already added to the questionnaire (cached by project/attribute).
    Optionally inserts a creation option for new entries.

    Args:
        search:  Free-text search string entered by the user.
        project: RDMO project instance used to look up existing user entries.
        setup:   Configuration dict with keys:

                 * ``sources``          – list of source keys (default
                   ``["mardi", "wikidata"]`` when ``None``)
                 * ``item_class``       – QID or list of QIDs for class filtering
                 * ``query_attributes`` – list of attribute URI fragments to
                   query for user-defined entries
                 * ``creation``         – bool; if ``True``, prepend an option
                   that lets users create a new entry with the search text

    Returns:
        List of ``{"id": …, "text": …}`` option dicts.
    '''
    if setup['sources'] is None:
        setup['sources'] = ['mardi', 'wikidata']

    # Query external sources
    try:
        options = query_sources(
            search=search,
            item_class=setup['item_class'],
            sources=setup['sources'],
            not_found=False,
        )
    except (requests.exceptions.RequestException, KeyError, ValueError, TypeError) as e:
        logger.error("Query sources failed: %s", e)
        options = []

    if setup.get('query_attributes'):
        # Get or build user entries dictionary
        cache_key = f"user_entries_{project.id}_{','.join(setup['query_attributes'])}"
        dic = cache.get(cache_key)

        if dic is None:
            logger.debug("Cache miss for %s, querying database", cache_key)
            try:
                dic = query_user_entries(project, setup)
            except Exception as e:
                logger.error("User entries query failed: %s", e)
                dic = {}
            else:
                # Cache only successful lookups so a transient failure is not served for 180 s
                cache.set(cache_key, dic, timeout=180)
                logger.debug("Cached user entries for %s", cache_key)
        else:
            logger.debug("Cache hit for %s", cache_key)

        # Filter and merge options
        options_user = [
            {'id': value['id'], 'text': key}
            for key, value in dic.items()
            if search.lower() in key.lower()
        ]
        options = options_user + options

    # Add creation option if needed
    if setup['creation']:
        creation_option = {'id': 'not found', 'text': search}
        if creation_option not in options:
            options.insert(0, creation_option)

    return options
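
The ordering of the returned list can be hard to see through the caching and error handling. The sketch below isolates only the merge step, under the assumption that the KG options and the user-entry dict have already been fetched; merge_options and all sample data are hypothetical names introduced for illustration.

```python
def merge_options(search, kg_options, user_entries, creation):
    """Prepend matching user entries to KG options, then an optional creation option."""
    needle = search.lower()
    # Case-insensitive substring match on the display label, as in the original
    options_user = [
        {"id": value["id"], "text": label}
        for label, value in user_entries.items()
        if needle in label.lower()
    ]
    options = options_user + kg_options

    if creation:
        creation_option = {"id": "not found", "text": search}
        # Skip the creation option when an identical entry is already listed
        if creation_option not in options:
            options.insert(0, creation_option)
    return options


# Invented sample data standing in for query_sources / query_user_entries results.
kg_options = [{"id": "mardi:Q1", "text": "Heat equation (PDE) [mardi]"}]
user_entries = {
    "Heat kernel (fundamental solution)": {"id": "not found"},
    "Wave equation (PDE) [othersource]": {"id": "othersource:Q2"},
}

merged = merge_options("heat", kg_options, user_entries, creation=True)
for opt in merged:
    print(opt)
```

With creation enabled, the resulting order is: creation option, matching user entries, then external results.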

query_sparql(query, sparql_endpoint)

Execute a SPARQL SELECT query against sparql_endpoint and return bindings.

Parameters:

Name Type Description Default
query

SPARQL SELECT query string (UTF-8 encoded on the wire).

required
sparql_endpoint

URL of the SPARQL endpoint (e.g. the MaRDI or Wikidata endpoint).

required

Returns:

Type Description

List of result-binding dicts from results.bindings; empty list on any network error, HTTP error, or missing endpoint.

Source code in MaRDMO/queries.py
def query_sparql(query, sparql_endpoint):
    '''Execute a SPARQL SELECT query against *sparql_endpoint* and return bindings.

    Args:
        query:           SPARQL SELECT query string (UTF-8 encoded on the wire).
        sparql_endpoint: URL of the SPARQL endpoint (e.g. the MaRDI or Wikidata endpoint).

    Returns:
        List of result-binding dicts from ``results.bindings``; empty list on
        any network error, HTTP error, or missing endpoint.
    '''
    if not sparql_endpoint:
        logger.warning("SPARQL query attempted without a valid endpoint.")
        return []

    try:
        response = requests.post(
            sparql_endpoint,
            data=query.encode("utf-8"),
            headers={
                "User-Agent": "MaRDMO (https://zib.de; reidelbach@zib.de)",
                "Content-Type": "application/sparql-query; charset=UTF-8",
                "Accept": "application/sparql-results+json"
            },
            timeout=60,
        )
        # Check if request was successful
        if response.status_code == 200:
            return response.json().get('results', {}).get('bindings', [])

        logger.error(
            "SPARQL request to %s failed with status %s: %s",
            sparql_endpoint,
            response.status_code,
            response.text,
        )

        return []

    except requests.exceptions.ConnectionError:
        logger.error(
            "SPARQL query failed: unable to connect to %s.",
            sparql_endpoint
        )

    except requests.exceptions.RequestException as e:
        logger.exception("SPARQL request failed: %s", e)

    return []
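
Callers receive the raw bindings list, in which each row maps a variable name to a small dict with at least "type" and "value" keys (the standard SPARQL JSON results layout), and variables that are unbound in a row are simply absent. A minimal sketch of flattening such rows follows; flatten_bindings and the sample rows are hypothetical and not part of MaRDMO.

```python
def flatten_bindings(bindings):
    """Reduce each binding row to {variable: value}, dropping type/language metadata."""
    return [
        {var: cell["value"] for var, cell in row.items()}
        for row in bindings
    ]


# Invented rows shaped like the list query_sparql returns on success.
sample = [
    {
        "item": {"type": "uri", "value": "https://example.org/entity/Q1"},
        "label": {"type": "literal", "xml:lang": "en", "value": "Heat equation"},
    },
    # Second row has no "label" binding, so the key is missing entirely
    {"item": {"type": "uri", "value": "https://example.org/entity/Q2"}},
]

rows = flatten_bindings(sample)
for row in rows:
    print(row["item"], row.get("label", "<no label>"))
```

Because query_sparql returns an empty list on every failure path, code like this needs no separate error branch, but it also cannot distinguish "no results" from "request failed" without checking the log.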

query_sparql_pool(query_input)

Execute multiple SPARQL queries in parallel and return a keyed result dict.

Parameters:

Name Type Description Default
query_input

Dict mapping a source key to a (query, endpoint) tuple, e.g. {"mardi": (query_str, url), "wikidata": ...}.

required

Returns:

Type Description

Dict with the same keys as query_input, each mapping to the list of SPARQL result bindings returned by :func:query_sparql.

Source code in MaRDMO/queries.py
def query_sparql_pool(query_input):
    '''Execute multiple SPARQL queries in parallel and return a keyed result dict.

    Args:
        query_input: Dict mapping a source key to a ``(query, endpoint)`` tuple,
                     e.g. ``{"mardi": (query_str, url), "wikidata": ...}``.

    Returns:
        Dict with the same keys as *query_input*, each mapping to the list of
        SPARQL result bindings returned by :func:`query_sparql`.
    '''
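    if not query_input:
        # ThreadPool rejects processes=0, so an empty input yields an empty result
        return {}
    # Run one worker thread per endpoint; the context manager releases them afterwards
    with ThreadPool(processes=len(query_input)) as pool:
        results = pool.map(lambda args: query_sparql(*args), query_input.values())
    data = dict(zip(query_input.keys(), results))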
    return data
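
The fan-out pattern can be tried without network access. The sketch below assumes ThreadPool comes from multiprocessing.pool (the import is not shown in this section) and replaces query_sparql with a hypothetical stub, fake_query; run_pool is likewise an illustrative name.

```python
from multiprocessing.pool import ThreadPool


def fake_query(query, endpoint):
    """Stand-in for query_sparql: returns one fake binding instead of doing HTTP."""
    return [{"endpoint": {"type": "literal", "value": endpoint}}]


def run_pool(query_input, worker=fake_query):
    """Run worker(query, endpoint) for every entry in parallel, keyed like the input."""
    if not query_input:
        return {}
    with ThreadPool(processes=len(query_input)) as pool:
        # starmap unpacks each (query, endpoint) tuple and preserves input order
        results = pool.starmap(worker, query_input.values())
    return dict(zip(query_input.keys(), results))


# Placeholder queries and endpoint URLs, for illustration only.
query_input = {
    "mardi": ("SELECT * WHERE { ?s ?p ?o } LIMIT 1", "https://example.org/mardi/sparql"),
    "wikidata": ("SELECT * WHERE { ?s ?p ?o } LIMIT 1", "https://example.org/wikidata/sparql"),
}

data = run_pool(query_input)
for key, bindings in data.items():
    print(key, bindings[0]["endpoint"]["value"])
```

Since the pool's map-style calls return results in input order, zipping them back onto the keys is safe even when the requests finish out of order.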

query_user_entries(project, setup)

Build a label→ID dict from entries the user has already added to the questionnaire.

Reads ID, Name, and Description values from the RDMO database for each attribute listed in setup["query_attributes"], aligns them by set_index/set_prefix, and returns only entries whose source is not in setup["sources"] (i.e. not already covered by KG results).

Parameters:

Name Type Description Default
project

RDMO project instance.

required
setup

Configuration dict (see :func:query_sources_with_user_additions).

required

Returns:

Type Description

Dict mapping display strings ("Label (Description)" or "Label (Description) [source]") to {"id": item_id} dicts.

Source code in MaRDMO/queries.py
def query_user_entries(project, setup):
    '''Build a label→ID dict from entries the user has already added to the questionnaire.

    Reads ID, Name, and Description values from the RDMO database for each
    attribute listed in ``setup["query_attributes"]``, aligns them by
    ``set_index``/``set_prefix``, and returns only entries whose source is not
    in ``setup["sources"]`` (i.e. not already covered by KG results).

    Args:
        project: RDMO project instance.
        setup:   Configuration dict (see :func:`query_sources_with_user_additions`).

    Returns:
        Dict mapping display strings (``"Label (Description)"`` or
        ``"Label (Description) [source]"``) to ``{"id": item_id}`` dicts.
    '''
    dic = {}

    for query_attribute in setup['query_attributes']:

        # Get entries from database
        values = get_user_entries(
            project=project,
            query_attribute=query_attribute,
            values={}
        )

        # Align id/name/description by numeric index
        entries_by_idx = {}
        for value_id in values['id']:
            idx = value_id.set_index
            entries_by_idx.setdefault(idx, {})['id'] = value_id

        for value_name in values['name']:
            idx = int(value_name.set_prefix)
            entries_by_idx.setdefault(idx, {})['name'] = value_name

        for value_desc in values['description']:
            idx = int(value_desc.set_prefix)
            entries_by_idx.setdefault(idx, {})['description'] = value_desc

        # Process aligned entries
        for idx in sorted(entries_by_idx.keys()):
            entry = entries_by_idx[idx]

            # Skip indices that have a name or description but no (non-empty) ID value
            if 'id' not in entry or not entry['id'].text:
                continue

            # Build item
            if entry['id'].text == 'not found':
                # User-defined item
                label = entry['name'].text or "No Label Provided!"
                description = entry['description'].text or "No Description Provided!"
                item_id = 'not found'
                source = 'user'
            else:
                # External ID item
                label, description, source = extract_parts(entry['id'].text)
                _, item_id = entry['id'].external_id.split(':')
                item_id = f"{source}:{item_id}"

            # Add to dictionary if not from primary sources
            if source not in setup['sources']:
                if source == 'user':
                    dic[f"{label} ({description})"] = {'id': item_id}
                else:
                    dic[f"{label} ({description}) [{source}]"] = {'id': item_id}

    return dic
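
The alignment step groups three separate value lists into one record per index. A minimal sketch of that grouping and of the display-string rule follows, using SimpleNamespace objects as stand-ins for RDMO Value instances; align_entries, display_key, and the sample values are hypothetical.

```python
from types import SimpleNamespace


def align_entries(ids, names, descriptions):
    """Group id/name/description values into one dict per numeric index."""
    entries = {}
    for value in ids:
        # ID values carry their position in set_index
        entries.setdefault(value.set_index, {})["id"] = value
    for value in names:
        # Name and description values carry it as a string in set_prefix
        entries.setdefault(int(value.set_prefix), {})["name"] = value
    for value in descriptions:
        entries.setdefault(int(value.set_prefix), {})["description"] = value
    return entries


def display_key(label, description, source):
    """Build the dict key: user-defined entries get no [source] suffix."""
    base = f"{label} ({description})"
    return base if source == "user" else f"{base} [{source}]"


# Invented stand-ins for database values.
ids = [SimpleNamespace(set_index=0, text="not found"),
       SimpleNamespace(set_index=1, text="not found")]
names = [SimpleNamespace(set_prefix="1", text="Wave equation"),
         SimpleNamespace(set_prefix="0", text="Heat equation")]
descriptions = [SimpleNamespace(set_prefix="0", text="parabolic PDE")]

aligned = align_entries(ids, names, descriptions)
for idx in sorted(aligned):
    entry = aligned[idx]
    label = entry["name"].text
    description = entry["description"].text if "description" in entry else "No Description Provided!"
    print(idx, display_key(label, description, "user"))
```

Converting set_prefix with int() is what lets values stored with a string prefix land in the same bucket as the integer set_index of the matching ID value.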