Postdoctoral Researcher
Max Planck Institute for Software Systems, Germany
bghosh[at]mpi-sws.org
I am a postdoctoral researcher at Max Planck Institute for Software Systems, Germany. My research interest is on trustworthy machine learning and formal methods. Currently, I work with Krishna P. Gummadi on understanding the foundation of Large Language Models (LLMs) in terms of memorization and generalization through the lens of formal methods.
I have completed my Ph.D. from the School of Computing, National University of Singapore (NUS) on 30 September 2023 under the supervision of Prof. Kuldeep S. Meel. My Ph.D. thesis is Interpretability and Fairness in Machine Learning: A Formal Methods Approach.
I have completed my bachelor's degree (BSc.) in Computer Science and Engineering from Bangladesh University of Engineering and Technology (BUET) in September 2017. My undergraduate thesis is on socio-spatial group queries advised by Dr. Mohammed Eunus Ali.
In the past decades, machine learning (ML) has made significant progress in medical image classification.
The success can be attributed to two factors: (i) unique patient data collected and processed
by clinics/hospitals and (ii) corresponding ML models solving the underlying classification task.
In practice, patient data may contain sensitive information unique to patients’ demography; and
ML models often require higher computational resources beyond the affordability of an individual hospital.
Considering practical concerns, we explore a collaborative ML approach in which
the data provider, referred to as the client, aims to leverage the computational
resources of a server in jointly training a unified ML model without the need
to share any raw data. Specifically, we focus on the skin lesion classification
problem using a real-world dataset containing multi-modal image inputs and
multi-label ground truth.
To enable collaborative yet privacy-preserving skin lesion classification,
we develop a learning framework called SplitFusionNet based on u-shape split
learning. The key idea of SplitFusionNet is to split the ML model into a
(client, server) partition of deep neural network layers: the client
layers process multi-modal input data and multi-labels, while server
layers perform computationally extensive mid-layer computations.
Additionally, we apply lossless compression and decompression to
improve the communication cost between the client and the server.
Experimentally, SplitFusionNet requires less training pipeline
time than non-split centralized training while achieving equal
predictive performance.
Fairness in machine learning has attained significant focus due to the widespread application in high-stake decision-making tasks. Unregulated machine learning classifiers can exhibit bias towards certain demographic groups in data, thus the quantification and mitigation of classifier bias is a central concern in fairness in machine learning. In this paper, we aim to quantify the influence of different features in a dataset on the bias of a classifier. To do this, we introduce the Fairness Influence Function (FIF). This function breaks down bias into its components among individual features and the intersection of multiple features. The key idea is to represent existing group fairness metrics as the difference of the scaled conditional variances in the classifier’s prediction and apply a decomposition of variance according to global sensitivity analysis. To estimate FIFs, we instantiate an algorithm FairXplainer that applies variance decomposition of classifier’s prediction following local regression. Experiments demonstrate that FairXplainer captures FIFs of individual feature and intersectional features, provides a better approximation of bias based on FIFs, demonstrates higher correlation of FIFs with fairness interventions, and detects changes in bias due to fairness affirmative/punitive actions in the classifier.
We propose neighborhood-based core decomposition: a novel way of decomposing hypergraphs into hierarchical neighborhood-cohesive subhypergraphs. Alternative approaches to decomposing hypergraphs, e.g., reduction to clique or bipartite graphs, are not meaningful in certain applications, the later also results in inefficient decomposition; while existing degree-based hypergraph decomposition does not distinguish nodes with different neighborhood sizes. Our case studies show that the proposed decomposition is more effective than degree and clique graph-based decompositions in disease intervention and in extracting provably approximate and application-wise meaningful densest subhypergraphs. We propose three algorithms: Peel, its efficient variant E-Peel, and a novel local algorithm: Local- core with parallel implementation. Our most efficient parallel algorithm Local-core(P) decomposes hypergraph with 27M nodes and 17M hyperedges in-memory within 91 seconds by adopting various optimizations. Finally, we develop a new hypergraph-core model, the (neighborhood, degree)-core by considering both neighborhood and degree constraints, design its decomposition algorithm Local-core+Peel, and demonstrate its superiority in spreading diffusion.
In recent years, machine learning (ML) algorithms have been deployed in safety-critical and high-stake decision-making, where the fairness of algorithms is of paramount importance. Fairness in ML centers on detecting bias towards certain demographic populations induced by an ML classifier and proposes algorithmic solutions to mitigate the bias with respect to different fairness definitions. To this end, several fairness verifiers have been proposed that compute the bias in the prediction of an ML classifier -- essentially beyond a finite dataset -- given the probability distribution of input features. In the context of verifying linear classifiers, existing fairness verifiers are limited by accuracy due to imprecise modelling of correlations among features and scalability due to restrictive formulations of the classifiers as SSAT or SMT formulas or by sampling.
In this paper, we propose an efficient fairness verifier, called FVGM, that encodes the correlations among features as a Bayesian network. In contrast to existing verifiers, FVGM proposes a stochastic subset-sum based approach for verifying linear classifiers. Experimentally, we show that FVGM leads to an accurate and scalable assessment for more diverse families of fairness-enhancing algorithms, fairness attacks, and group/causal fairness metrics than the state-of-the-art. We also demonstrate that FVGM facilitates the computation of fairness influence functions as a stepping stone to detect the source of bias induced by subsets of features.
Machine learning has become omnipresent with applications in various safety-critical domains such as medical, law, and transportation. In these domains, high-stake decisions provided by machine learning necessitate researchers to design interpretable models, where the prediction is understandable to a human. In interpretable machine learning, rule-based classifiers are particularly effective in representing the decision boundary through a set of rules comprising input features. Examples of such classifiers include decision trees, decision lists, and decision sets. The interpretability of rule-based classifiers is in general related to the size of the rules, where smaller rules are considered more interpretable. To learn such a classifier, the brute-force direct approach is to consider an optimization problem that tries to learn the smallest classification rule that has close to maximum accuracy. This optimization problem is computationally intractable due to its combinatorial nature and thus, the problem is not scalable in large datasets. To this end, in this paper we study the triangular relationship among the accuracy, interpretability, and scalability of learning rule-based classifiers.
The contribution of this paper is an interpretable learning framework IMLI, that is based on maximum satisfiability (MaxSAT) for synthesizing classification rules expressible in proposition logic. IMLI considers a joint objective function to optimize the accuracy and the interpretability of classification rules and learns an optimal rule by solving an appropriately designed MaxSAT query. Despite the progress of MaxSAT solving in the last decade, the straightforward MaxSAT-based solution cannot scale to practical classification datasets containing thousands to millions of samples. Therefore, we incorporate an efficient incremental learning technique inside the MaxSAT formulation by integrating mini-batch learning and iterative rule-learning. The resulting framework learns a classifier by iteratively covering the training data, wherein in each iteration, it solves a sequence of smaller MaxSAT queries corresponding to each mini-batch. In our experiments, IMLI achieves the best balance among prediction accuracy, interpretability, and scalability. For instance, IMLI attains a competitive prediction accuracy and interpretability w.r.t. existing interpretable classifiers and demonstrates impressive scalability on large datasets where both interpretable and non-interpretable classifiers fail. As an application, we deploy IMLI in learning popular interpretable classifiers such as decision lists and decision sets.
As a technology ML is oblivious to societal good or bad, and thus, the field of fair machine learning has stepped up to propose multiple mathematical definitions, algorithms, and systems to ensure different notions of fairness in ML applications. Given the multitude of propositions, it has become imperative to formally verify the fairness metrics satisfied by different algorithms on different datasets. In this paper, we propose a stochastic satisfiability (SSAT) framework, Justicia, that formally verifies different fairness measures of supervised learning algorithms with respect to the underlying data distribution. We instantiate Justicia on multiple classification and bias mitigation algorithms, and datasets to verify different fairness metrics, such as disparate impact, statistical parity, and equalized odds. Justicia is scalable, accurate, and operates on non-Boolean and compound sensitive attributes unlike existing distribution-based verifiers, such as FairSquare and VeriFair. Being distribution-based by design, Justicia is more robust than the verifiers, such as AIF360, that operate on specific test samples. We also theoretically bound the finite-sample error of the verified fairness measure.
Social networks with location enabling technologies, also known as geo-social networks, allow users to share their location-specific activities and preferences through check-ins. A user in such a geo-social network can be attributed to an associated location (spatial), her preferences as keywords (textual), and the connectivity (social) with her friends. The fusion of social, spatial, and textual data of a large number of users in these networks provide an interesting insight for finding meaningful geo-social groups of users supporting many real-life applications, including activity planning and recommendation systems. In this article, we introduce a novel query, namely, Top-k Flexible Socio-Spatial Keyword-aware Group Query (SSKGQ), which finds the best k groups of varying sizes around different points of interest (POIs), where the groups are ranked based on the social and textual cohesiveness among members and spatial closeness with the corresponding POI and the number of members in the group. We develop an efficient approach to solve the SSKGQ problem based on our theoretical upper bounds on distance, social connectivity, and textual similarity. We prove that the SSKGQ problem is NP-Hard and provide an approximate solution based on our derived relaxed bounds, which run much faster than the exact approach by sacrificing the group quality slightly. Our extensive experiments on real data sets show the effectiveness of our approaches in different real-life settings.
The success of MaxSAT (maximum satisfiability) solving in recent years has motivated researchers to apply MaxSAT solvers in diverse discrete combinatorial optimization problems. Group testing has been studied as a combinatorial optimization problem, where the goal is to find defective items among a set of items by performing sets of tests on items. In this paper, we propose a MaxSAT-based framework, called MGT, that solves group testing, in particular, the decoding phase of non-adaptive group testing. We extend this approach to the noisy variant of group testing, and propose a compact MaxSAT-based encoding that guarantees an optimal solution. Our extensive experimental results show that MGT can solve group testing instances of 10000 items with 3% defectivity, which no prior work can handle to the best of our knowledge. Furthermore, MGT has better accuracy than the LP-based approach. We also discover an interesting phase transition behavior in the runtime, which reveals the easy-hard-easy nature of group testing.
Machine learning algorithms that produce rule-based predictions in Conjunctive Normal form (CNF) or in Disjunctive Normal form (DNF) are arguably some of the most interpretable ones. For example, decision set is an interpretable model in practice, that represents the decision function in the form of DNF. In this paper, we consider relaxed definitions of standard OR/AND operators which allow exceptions in the construction of a clause and also in the selection of clauses in a rule. Building on these relaxed definition, we introduce relaxed-CNF rules, which are motivated by the popular usage of checklists in the medical domain and generalizes the widely employed rule representations including CNF, DNF, and decision sets. While the combinatorial structure of relaxed-CNF rules offers exponential succinctness, the naive learning techniques are computationally expensive. To this end, we propose a novel incremental mini-batch learning procedure, called CRR, that employs advances in the Mixed-Integer Linear Programming (MILP) solvers to efficiently learn relaxed-CNF rules. Our experimental analysis demonstrates that CRR can generate relaxed-CNF rules, which are more accurate and sparser compared to the alternative rule-based models.
The wide adoption of machine learning in the critical domains such as medical diagnosis, law, education had propelled the need for interpretable techniques due to the need for end user to understand the reasoning behind decisions due to learning systems. The computational intractability of interpretable learning led practitioners to design heuristic techniques, which fail to provide sound handles to tradeoff accuracy and interpretability.
Motivated by the success of MaxSAT solvers over the past decade, recently MaxSAT-based approach, called MLIC, was proposed that seeks to reduce the problem of learning interpretable rules expressed in Conjunctive Normal Form (CNF) to a MaxSAT query. While MLIC was shown to achieve accuracy similar to that of other state oft he art black-box classifiers while generating small interpretable CNF formulas, the runtime performance of MLIC is significantly lagging and renders approach unusable in practice. In this context, authors raised the question: Is it possible to achieve the best of both worlds, i.e., a sound framework for interpretable learning that can take advantage of MaxSAT solvers while scaling to real-world instances?
In this paper, we take a step towards answering the above question in affirmation. We propose an incremental approach to MaxSAT based framework that achieves scalable runtime performance via partition-based training methodology. Extensive experiments on benchmarks arising from UCI repository demonstrate that IMLI achieves up to three orders of magnitude runtime improvement without loss of accuracy and interpretability.
A socio spatial group query finds a group of users who possess strong social connections with each other and have the minimum aggregate spatial distance to a meeting point. Existing studies limit the socio spatial group search to either finding best group of a fixed size for a single meeting location, or a single group of a fixed size w.r.t. multiple meeting locations. However, it is highly desirable to consider multiple meeting locations/POIs in a real-life scenario in order to organize impromptu activities of user groups of various sizes. In this paper, we propose Top k Flexible Socio Spatial Group Query (Top k-FSSGQ) to find and rank the top k groups w.r.t. multiple POIs where each group follows the minimum social connectivity constraints. We devise a ranking function to measure the group score by combining social closeness, spatial distance, and group size, which provides the flexibility of choosing groups of different sizes under different constraints. To effectively process the Top k-FSSGQ, we first develop an Exact approach that ensures early termination of the search based on the derived upper bounds. We prove that the problem is NP-hard, hence we first present a heuristic based approximation algorithm to effectively select members in intermediate solution groups based on the social connectivity of the users. Later we design a Fast Approximate approach based on the relaxed social and spatial bounds, and connectivity constraint heuristic. Experimental studies have verified the effectiveness and efficiency of our proposed approaches on real datasets.
Our paper on reliable knowledge estimation in LLMs is accepted at WSDM, 2025. We propose a simple method that can reliably estimate the latent knowledge across multiple LLMs and relation queries.
Paper accepted on split learning for multi-modal medical image classification at IEEE CAI, 2024. Joint work with Yuan Wang, Huazhu Fu, Wei Qingsong, Yong Liu, and Rick Goh.
I have jointly presented a tutorial with Debabrota Basu on auditing fairness in machine learning at IJCAI 2023.
I have defended my Ph.D. thesis in NUS.
I am joining Institute of High Performance Computing (IHPC), A*STAR, Singapore as a scientist.
Our tutorial on "Auditing Bias of Machine Learning Algorithms: Tools and Overview" is accepted in IJCAI 2023. This is a joint work with Debabrota Basu.
My thesis abstract is accepted in IJCAI 2023 Doctoral Consortium. The title of the abstract is "Interpretability and Fairness in Machine Learning: A Formal Methods Approach".
Our work on "Neighborhood-based Hypergraph Core Decomposition" is accepted in VLDB 2023. This is a joint work with Naheed Anjum Arafat, Arijit Khan, and Arpit Kumar Rai.
Our work on "How Biased are Your Features?: Computing Fairness Influence Functions with Global Sensitivity Analysis" is accepted in FAccT 2023. This is a joint work with Debabrota Basu and Kuldeep S. Meel.