Bishwamittra Ghosh is a PhD student in School of Computing, National University of Singapore (NUS). Currently, he is working under the supervision of Kuldeep Meel. His current research interest is in interpretable and fair machine learning models.
He has completed his BSc in Computer Science and Engineering from Department of Computer Science and Engineering, Bangladesh University of Engineering and Technology (BUET) in September 2017. His undergraduate thesis is on socio-spatial group queries under the supervision of Dr. Mohammed Eunus Ali. He has been a lecturer at United International University, Bangladesh. He loves experimenting with ideas in CS by combining various formal techniques, e.g. SAT (satisfiability), CP (constraint programming) with machine learning.
The wide adoption of machine learning in the critical domains such as medical diagnosis, law, education had propelled the need for interpretable techniques due to the need for end user to understand the reasoning behind decisions due to learning systems. The computational intractability of interpretable learning led practitioners to design heuristic techniques, which fail to provide sound handles to tradeoff accuracy and interpretability.
Motivated by the success of MaxSAT solvers over the past decade, recently MaxSAT-based approach, called MLIC, was proposed that seeks to reduce the problem of learning interpretable rules expressed in Conjunctive Normal Form (CNF) to a MaxSAT query. While MLIC was shown to achieve accuracy similar to that of other state oft he art black-box classifiers while generating small interpretable CNF formulas, the runtime performance of MLIC is significantly lagging and renders approach unusable in practice. In this context, authors raised the question: Is it possible to achieve the best of both worlds, i.e., a sound framework for interpretable learning that can take advantage of MaxSAT solvers while scaling to real-world instances?
In this paper, we take a step towards answering the above question in affirmation. We propose an incremental approach to MaxSAT based framework that achieves scalable runtime performance via partition-based training methodology. Extensive experiments on benchmarks arising from UCI repository demonstrate that IMLI achieves up to three orders of magnitude runtime improvement without loss of accuracy and interpretability.
As a technology ML is oblivious to societal good or bad, and thus, the field of fair machine learning has stepped up to propose multiple mathematical definitions, algorithms, and systems to ensure different notions of fairness in ML applications. Given the multitude of propositions, it has become imperative to formally verify the fairness metrics satisfied by different algorithms on different datasets. In this paper, we propose a stochastic satisfiability (SSAT) framework, Justicia, that formally verifies different fairness measures of supervised learning algorithms with respect to the underlying data distribution. We instantiate Justicia on multiple classification and bias mitigation algorithms, and datasets to verify different fairness metrics, such as disparate impact, statistical parity, and equalized odds. Justicia is scalable, accurate, and operates on non-Boolean and compound sensitive attributes unlike existing distribution-based verifiers, such as FairSquare and VeriFair. Being distribution-based by design, Justicia is more robust than the verifiers, such as AIF360, that operate on specific test samples. We also theoretically bound the finite-sample error of the verified fairness measure.
A socio spatial group query finds a group of users who possess strong social connections with each other and have the minimum aggregate spatial distance to a meeting point. Existing studies limit the socio spatial group search to either finding best group of a fixed size for a single meeting location, or a single group of a fixed size w.r.t. multiple meeting locations. However, it is highly desirable to consider multiple meeting locations/POIs in a real-life scenario in order to organize impromptu activities of user groups of various sizes. In this paper, we propose Top k Flexible Socio Spatial Group Query (Top k-FSSGQ) to find and rank the top k groups w.r.t. multiple POIs where each group follows the minimum social connectivity constraints. We devise a ranking function to measure the group score by combining social closeness, spatial distance, and group size, which provides the flexibility of choosing groups of different sizes under different constraints. To effectively process the Top k-FSSGQ, we first develop an Exact approach that ensures early termination of the search based on the derived upper bounds. We prove that the problem is NP-hard, hence we first present a heuristic based approximation algorithm to effectively select members in intermediate solution groups based on the social connectivity of the users. Later we design a Fast Approximate approach based on the relaxed social and spatial bounds, and connectivity constraint heuristic. Experimental studies have verified the effectiveness and efficiency of our proposed approaches on real datasets.
We have improved fairness-verification for linear classifiers both in terms of accuracy and scalability. The paper is available in arXiv now.
I have completed my internship at Goldman Sachs. During my internship, I have experimented with recent advances in transformer-based models, such as BERT, in natural language processing (NLP).
I have joined Goldman Sachs, Singapore as an AI research intern.
Our work on formal fairness verification based on Stochastic Boolean Satisfiability (SSAT) is accepted in AAAI 2021. This is a joint work with Debabrota Basu and Kuldeep S. Meel.