Comparing different interpretations of the set-based Dempster-Shafer theory for explainable and fair prediction of hereditary BRCA1/2 mutations

Authors

  • Ekaterina Auer*, Department of Electrical Engineering and Computer Science, University of Applied Sciences Wismar, Germany
  • Wolfram Luther, Department of Computer Science and Applied Cognitive Science, University of Duisburg-Essen, Germany
  • Lorenz Gillner, Department of Electrical Engineering and Computer Science, University of Applied Sciences Wismar, Germany
  • Pascal Müller, Department of Electrical Engineering and Computer Science, University of Applied Sciences Wismar, Germany

Abstract

In recent decades, the use of the Dempster-Shafer theory (DST [1]) for software-based decision-making or classification, for example, in the context of machine learning, has gained significant importance. This is largely due to the theory's ability to provide a detailed explanation of the underlying reasoning that generates the results, thereby increasing the software's interpretability and trustworthiness. Another advantage of the DST framework is its capacity to handle conflicting or missing decision criteria and data.

DST captures the uncertainty associated with incomplete or imprecise knowledge, in addition to the uncertainty inherent in random events. Specifically, the traditional (discrete) DST assigns the probability $p_i\in\mathbb{R}$ of a random variable $X$ taking on a realization $x$ not to a point (or crisp) value $x=x_i\in\mathbb{R}$ but to a set of values, such as an interval $[\underline{x}_i,\overline{x}_i]\subset\mathbb{R}$, so that $P(X\in[\underline{x}_i,\overline{x}_i])=p_i$. Further, $p_i$ itself may be treated not as a crisp value but as ranging within a certain set (e.g., $[\underline{p}_i,\overline{p}_i]\subset\mathbb{R}$). This latter extension of DST can be achieved in various ways, each with its own strengths and limitations.
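As a minimal sketch of the set-based view described above, the following Python snippet attaches crisp masses $p_i$ to interval focal elements $[\underline{x}_i,\overline{x}_i]$ and computes the standard DST belief and plausibility of a query interval. The function names and all numbers are illustrative assumptions and are not taken from the talk or from [1].

```python
from fractions import Fraction

# Focal elements: closed intervals (lo, hi), each with a crisp mass p_i.
# All values are made up for illustration; the masses sum to 1.
focal_elements = [
    ((0.0, 2.0), Fraction(1, 2)),
    ((1.0, 3.0), Fraction(3, 10)),
    ((0.0, 5.0), Fraction(1, 5)),
]


def belief(query, elements):
    """Sum the masses of focal elements fully contained in the query interval."""
    q_lo, q_hi = query
    return sum(m for (lo, hi), m in elements if q_lo <= lo and hi <= q_hi)


def plausibility(query, elements):
    """Sum the masses of focal elements that intersect the query interval."""
    q_lo, q_hi = query
    return sum(m for (lo, hi), m in elements if lo <= q_hi and q_lo <= hi)


if __name__ == "__main__":
    query = (0.0, 3.0)
    print(belief(query, focal_elements), plausibility(query, focal_elements))
```

Belief and plausibility bracket the unknown probability of the query interval from below and above; the gap between them reflects the imprecision of the focal elements rather than randomness.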

This talk centers on hereditary breast and ovarian cancer syndrome (HBOC) and on evaluating the cumulative risk of inheriting a harmful mutation in the BRCA1/2 genes. In our recent papers (e.g., [2]), we introduced an interval DST method that categorizes individuals into risk classes based on their personal and family history and provides explainable results. Here, we explore approaches based on further interval-based arithmetics and discuss their merits and limitations for this specific application, with a special focus on computing fair lower risk bounds for individuals with low risk.
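To illustrate why the choice of interval arithmetic matters for lower bounds, the sketch below bounds the belief in a risk class when the masses themselves are intervals $[\underline{p}_i,\overline{p}_i]$. It uses a generic result for interval-valued masses that must sum to one; it is not claimed to be the method of [2]. The class labels, the function name `belief_bounds`, and all numbers are hypothetical.

```python
from fractions import Fraction as F

# Hypothetical frame of discernment {"low", "high"} with interval-valued masses
# (lower, upper). Labels and numbers are illustrative assumptions only.
elements = [
    (frozenset({"low"}), (F(2, 5), F(3, 5))),
    (frozenset({"high"}), (F(1, 10), F(1, 5))),
    (frozenset({"low", "high"}), (F(1, 5), F(3, 10))),
]


def belief_bounds(query, elements):
    """Tight bounds on Bel(query) over all crisp masses inside the given
    intervals that sum to 1."""
    total_lo = sum(m_lo for _, (m_lo, _) in elements)
    total_hi = sum(m_hi for _, (_, m_hi) in elements)
    if not (total_lo <= 1 <= total_hi):
        raise ValueError("no crisp mass assignment sums to 1")

    inside = [m for s, m in elements if s <= query]         # focal sets within query
    outside = [m for s, m in elements if not (s <= query)]  # all other focal sets

    # Naive bound: add the lower (upper) masses of the contributing sets.
    # Normalization bound: whatever the other sets cannot absorb must go inside.
    lower = max(sum(lo for lo, _ in inside), 1 - sum(hi for _, hi in outside))
    upper = min(sum(hi for _, hi in inside), 1 - sum(lo for lo, _ in outside))
    return lower, upper


if __name__ == "__main__":
    print(belief_bounds(frozenset({"low"}), elements))
```

With these made-up numbers, simply adding lower masses would give 2/5 for the class "low", whereas accounting for the sum-to-one constraint raises the lower bound to 1/2. Differences of this kind are one reason why different interval-based formulations can lead to different lower bounds.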

Published

2025-04-21

Section

Conference Contributions