Publications

2020

Unpaired Data Empowers Association Tests
M. Gong, P. Liu, F. C. Sciurba, P. Stojanov, D. Tao, G. Tseng, K. Zhang, K. Batmanghelich
Accepted to Bioinformatics (Preprint: bioRxiv, DOI: 10.1101/839159)
[pdf][code]

Abstract

To achieve a holistic view of the underlying mechanisms of human diseases, the biomedical research community is moving toward harvesting retrospective data available in Electronic Healthcare Records (EHRs). The first step toward causal understanding is to perform association tests between types of potentially high-dimensional biomedical data, such as genetic data, blood biomarkers, and imaging data. To obtain reasonable power, current methods require a substantial sample size of individuals with both data modalities. This prevents researchers from using much larger EHR samples that include individuals with at least one data type, limits the power of the association test, and may result in a higher false discovery rate. We present a new method called the Semi-paired Association Test (SAT) that makes use of both paired and unpaired data. In contrast to classical approaches, incorporating unpaired data allows SAT to produce better control of false discovery and, under some conditions, improve the association test power. We study the properties of SAT theoretically and empirically, through simulations and application to real studies in the context of Chronic Obstructive Pulmonary Disease. Our method identifies an association between the high-dimensional characterization of Computed Tomography (CT) chest images and blood biomarkers, as well as the expression of dozens of genes involved in the immune system.

Hierarchical Amortized Training for Memory-efficient High-Resolution 3D GAN
Li Sun, Junxiang Chen, Yanwu Xu, Mingming Gong, Ke Yu, Kayhan Batmanghelich
Preprint: arXiv:2008.01910
[pdf]

Abstract

Generative Adversarial Networks (GANs) have many potential medical imaging applications, including data augmentation, domain adaptation, and model explanation. Due to the limited embedded memory of Graphical Processing Units (GPUs), most current 3D GAN models are trained on low-resolution medical images. In this work, we propose a novel end-to-end GAN architecture that can generate high-resolution 3D images. We achieve this goal by separating training and inference. During training, we adopt a hierarchical structure that simultaneously generates a low-resolution version of the image and a randomly selected sub-volume of the high-resolution image. The hierarchical design has two advantages: First, the memory demand for training on high-resolution images is amortized among sub-volumes. Furthermore, anchoring the high-resolution sub-volumes to a single low-resolution image ensures anatomical consistency between sub-volumes. During inference, our model can directly generate full high-resolution images. We also incorporate an encoder with a similar hierarchical structure into the model to extract features from the images. Experiments on 3D thorax CT and brain MRI demonstrate that our approach outperforms the state of the art in image generation, image reconstruction, and prediction of clinically relevant variables.
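The amortized training scheme described above can be illustrated with a minimal sketch (hypothetical code and names, not the paper's implementation): each training step sees a cheap low-resolution copy of the whole volume plus one randomly located high-resolution sub-volume, whose corner coordinate anchors it to the low-resolution image.

```python
import numpy as np

def sample_training_pair(volume, scale=4, sub=32, rng=None):
    """Hypothetical sketch of hierarchical amortized sampling: return a
    low-resolution copy of the full volume and one randomly located
    high-resolution sub-volume together with its corner coordinate."""
    if rng is None:
        rng = np.random.default_rng(0)
    # crude low-resolution version via strided subsampling
    low_res = volume[::scale, ::scale, ::scale]
    # random corner of the high-resolution sub-volume (anchors it spatially)
    corner = tuple(int(rng.integers(0, s - sub + 1)) for s in volume.shape)
    z, y, x = corner
    sub_volume = volume[z:z + sub, y:y + sub, x:x + sub]
    return low_res, sub_volume, corner

vol = np.zeros((128, 128, 128), dtype=np.float32)
low, patch, corner = sample_training_pair(vol)
```

Only the sub-volume is processed at full resolution per step, which is what keeps the memory footprint constant regardless of the output resolution.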

Label-Noise Robust Domain Adaptation
Xiyu Yu, Tongliang Liu, Mingming Gong, Kun Zhang, Kayhan Batmanghelich, Dacheng Tao
ICML, 2020
[pdf]

Abstract

Domain adaptation aims to correct classifiers when faced with a distribution shift between the source (training) and target (test) domains. State-of-the-art domain adaptation methods make use of deep networks to extract domain-invariant representations. However, existing methods assume that all the instances in the source domain are correctly labeled, while in reality it is unsurprising to obtain a source domain with noisy labels. In this paper, we are the first to comprehensively investigate how label noise can adversely affect existing domain adaptation methods in various scenarios. Further, we theoretically prove that there exists a method that can essentially reduce the side effects of noisy source labels in domain adaptation. Specifically, focusing on the generalized target shift scenario, where both the label distribution $P_Y$ and the class-conditional distribution $P_{X|Y}$ can change, we discover that the denoising Conditional Invariant Component (DCIC) framework can provably ensure (1) extracting invariant representations given examples with noisy labels in the source domain and unlabeled examples in the target domain and (2) estimating the label distribution in the target domain with no bias. Experimental results on both synthetic and real-world data verify the effectiveness of the proposed method.

Semi-Supervised Hierarchical Drug Embedding in Hyperbolic Space
Ke Yu, Shyam Visweswaran, Kayhan Batmanghelich
Accepted to Journal of Chemical Information and Modeling (DOI: 10.1021/acs.jcim.0c00681)
[pdf][PrePrint][code]

Abstract

Learning accurate drug representation is essential for tasks such as computational drug repositioning and prediction of drug side effects. A drug hierarchy is a valuable source that encodes human knowledge of drug relations in a tree-like structure where drugs that act on the same organs, treat the same disease, or bind to the same biological target are grouped together. However, its utility in learning drug representations has not yet been explored, and currently described drug representations cannot place novel molecules in a drug hierarchy. Here, we develop a semi-supervised drug embedding that incorporates two sources of information: (1) underlying chemical grammar that is inferred from molecular structures of drugs and drug-like molecules (unsupervised), and (2) hierarchical relations that are encoded in an expert-crafted hierarchy of approved drugs (supervised). We use the Variational Auto-Encoder (VAE) framework to encode the chemical structures of molecules and use the knowledge-based drug-drug similarity to induce the clustering of drugs in hyperbolic space. Hyperbolic space is well suited to encoding hierarchical concepts. Both quantitative and qualitative results support that the learned drug embedding can accurately reproduce the chemical structure and induce the hierarchical relations among drugs. Furthermore, our approach can infer the pharmacological properties of novel molecules by retrieving similar drugs from the embedding space. We demonstrate that the learned drug embedding can be used to find new uses for existing drugs and to discover side effects. We show that it significantly outperforms baselines in both tasks.
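The role of hyperbolic space in the abstract above can be made concrete with the geodesic distance of the Poincaré ball model (a standard formula, sketched here with hypothetical names; the paper's embedding model is more involved): distances blow up toward the boundary of the unit ball, mirroring the exponential growth of nodes in a tree.

```python
import numpy as np

def poincare_distance(u, v, eps=1e-9):
    """Geodesic distance in the Poincare ball model of hyperbolic space
    (points live strictly inside the unit ball)."""
    sq_dist = np.sum((u - v) ** 2)
    denom = (1.0 - np.sum(u ** 2)) * (1.0 - np.sum(v ** 2))
    return float(np.arccosh(1.0 + 2.0 * sq_dist / (denom + eps)))

origin = np.array([0.0, 0.0])
near = np.array([0.5, 0.0])
far = np.array([0.9, 0.0])   # closer to the boundary => much farther away
```

General concepts can then sit near the origin and specific drugs near the boundary, which is why a tree-like drug hierarchy embeds with low distortion.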


3D-BoxSup: Positive-Unlabeled Learning of Brain Tumor Segmentation Networks From 3D Bounding Boxes
Yanwu Xu, Mingming Gong, Junxiang Chen, Ziye Chen, Kayhan Batmanghelich
Frontiers in Neuroscience, 2020
[pdf]

Abstract

Accurate segmentation is an essential task when working with medical images. Recently, deep convolutional neural networks achieved state-of-the-art performance on many segmentation benchmarks. Regardless of the network architecture, deep learning-based segmentation methods view the segmentation problem as a supervised task that requires a relatively large number of annotated images. Acquiring a large number of annotated medical images is time-consuming, and high-quality segmented images (i.e., strong labels) crafted by human experts are expensive. In this paper, we propose a method that achieves competitive accuracy from a “weakly annotated” image, where the weak annotation is obtained via a 3D bounding box denoting an object of interest. Our method, called “3D-BoxSup,” employs a positive-unlabeled learning framework to learn segmentation masks from 3D bounding boxes. Specifically, we consider the pixels outside the bounding box as positively labeled data and the pixels inside the bounding box as unlabeled data. Our method can suppress the negative effects of pixels residing between the true segmentation mask and the 3D bounding box and produce accurate segmentation masks. We applied our method to segment a brain tumor. The experimental results on the BraTS 2017 dataset (Menze et al., 2015; Bakas et al., 2017a,b,c) demonstrate the effectiveness of our method.
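The positive-unlabeled labeling scheme described above can be sketched as follows (illustrative code with hypothetical names, not the 3D-BoxSup implementation): voxels outside the box are known background, while voxels inside are an unlabeled mixture of tumor and background.

```python
import numpy as np

def pu_label_masks(shape, box):
    """Hypothetical sketch of the labeling scheme: voxels outside the 3D
    bounding box are (positively) labeled background; voxels inside are
    unlabeled, i.e., a mixture of tumor and background."""
    (z0, z1), (y0, y1), (x0, x1) = box
    inside = np.zeros(shape, dtype=bool)
    inside[z0:z1, y0:y1, x0:x1] = True
    labeled_background = ~inside   # known non-tumor voxels
    unlabeled = inside             # tumor + background, label unknown
    return labeled_background, unlabeled

bg, unl = pu_label_masks((10, 10, 10), ((2, 5), (2, 5), (2, 5)))
```

The PU learning machinery then estimates which unlabeled voxels are background, rather than naively treating everything inside the box as tumor.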


Human-Machine Collaboration for Medical Image Segmentation
Mahdyar Ravanbakhsh, Vadim Tschernezki, Felix Last, Tassilo Klein, Kayhan Batmanghelich, Volker Tresp, Moin Nabi
IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2020
[pdf]

Abstract

Image segmentation is a ubiquitous step in almost any medical image study. Deep learning-based approaches achieve state-of-the-art results on the majority of image segmentation benchmarks. However, end-to-end training of such models requires sufficient annotation. In this paper, we propose a method based on a conditional Generative Adversarial Network (cGAN) to address segmentation in a semi-supervised, human-in-the-loop fashion. More specifically, we use the generator of the GAN to synthesize segmentations on unlabeled data and use the discriminator to identify unreliable slices for which expert annotation is required. Quantitative results on a standard benchmark show that our method is comparable with state-of-the-art fully supervised methods in slice-level evaluation, despite requiring far less annotated data.

Explanation by Progressive Exaggeration
Sumedha Singla, Brian Pollack, Junxiang Chen, Kayhan Batmanghelich
International Conference on Learning Representation (ICLR) 2020 [Spotlight]
[pdf][code]

Abstract

As machine learning methods see greater adoption in high-stakes applications such as medical image diagnosis, the need for model interpretability and explanation has become more critical. Classical approaches that assess feature importance (e.g., saliency maps) do not explain how and why a particular region of an image is relevant to the prediction. We propose a method that explains the outcome of a classification black box by gradually exaggerating the semantic effect of a given class. Given a query input to a classifier, our method produces a progressive set of plausible variations of that query, which gradually change the posterior probability from its original class to its negation. These counterfactually generated samples preserve features unrelated to the classification decision, so that a user can employ our method as a “tuning knob” to traverse the data manifold while crossing the decision boundary. Our method is model-agnostic and only requires the output value and gradient of the predictor with respect to its input.

Generative-Discriminative Complementary Learning
Yanwu Xu, Mingming Gong, Junxiang Chen, Tongliang Liu, Kun Zhang, Kayhan Batmanghelich
Proceeding of Thirty-Fourth AAAI Conference on Artificial Intelligence (AAAI-20)
[pdf]

Abstract

The majority of state-of-the-art deep learning methods are discriminative approaches, which model the conditional distribution of labels given input features. The success of such approaches heavily depends on high-quality labeled instances, which are not easy to obtain, especially as the number of candidate classes increases. In this paper, we study the complementary learning problem. Unlike ordinary labels, complementary labels are easy to obtain because an annotator only needs to provide a yes/no answer to a randomly chosen candidate class for each instance. We propose a generative-discriminative complementary learning method that estimates the ordinary labels by modeling both the conditional (discriminative) and instance (generative) distributions. Our method, which we call Complementary Conditional GAN (CCGAN), improves the accuracy of predicting ordinary labels and is able to generate high-quality instances in spite of weak supervision. In addition to extensive empirical studies, we theoretically show that our model can retrieve the true conditional distribution from the complementarily labeled data.

Weakly Supervised Disentanglement by Pairwise Similarities
Junxiang Chen, Kayhan Batmanghelich
Proceeding of Thirty-Fourth AAAI Conference on Artificial Intelligence (AAAI-20)
[pdf][code]

Abstract

Recently, research on unsupervised disentanglement learning with deep generative models has gained substantial popularity. However, without introducing supervision, there is no guarantee that the factors of interest can be successfully recovered. In this paper, we propose a setting where the user introduces weak supervision by providing similarities between instances based on a factor to be disentangled. The similarity is provided as either a discrete (yes/no) or a real-valued label describing whether a pair of instances is similar. We propose a new method for weakly supervised disentanglement of latent variables within the framework of the Variational Autoencoder. Experimental results demonstrate that utilizing weak supervision substantially improves the performance of the disentanglement method.

2019

Geometry-Consistent Adversarial Networks for One-Sided Unsupervised Domain Mapping
H. Fu*, M. Gong*, C. Wang, K. Batmanghelich, K. Zhang, D. Tao (*: equal contribution)
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2019
[pdf][code]

Abstract

Unsupervised domain mapping aims to learn a function GXY that translates domain X to domain Y in the absence of paired examples. Finding the optimal GXY without paired data is an ill-posed problem, so appropriate constraints are required to obtain reasonable solutions. One of the most prominent constraints is cycle consistency, which requires that an image translated by GXY be mapped back to the input image by an inverse mapping GYX. While cycle consistency requires the simultaneous training of GXY and GYX, recent studies have shown that one-sided domain mapping can be achieved by preserving pairwise distances between images. Although cycle consistency and distance preservation successfully constrain the solution space, they overlook the special property that simple geometric transformations do not change the semantic structure of images. Based on this property, we develop a geometry-consistent generative adversarial network (GcGAN), which enables one-sided unsupervised domain mapping. GcGAN takes the original image and its counterpart transformed by a predefined geometric transformation as inputs and generates two images in the new domain, coupled with the corresponding geometry-consistency constraint. The geometry-consistency constraint reduces the space of possible solutions while keeping the correct solutions in the search space. Quantitative and qualitative comparisons with the baseline (GAN alone) and state-of-the-art methods, including CycleGAN and DistanceGAN, demonstrate the effectiveness of our method.
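A toy version of the geometry-consistency idea (hypothetical code; GcGAN applies it to deep generators on images): for a geometric transform f with inverse f^-1, the translator should commute with f, so the mean absolute difference between f^-1(G(f(x))) and G(x) is penalized.

```python
import numpy as np

def geometry_consistency_loss(generator, x, transform, inverse):
    """Toy sketch: translate x and the transformed x, undo the transform on
    the second output, and penalize any disagreement (mean absolute error).
    `generator`, `transform`, `inverse` stand in for G_XY, f, f^{-1}."""
    y = generator(x)
    y_from_transformed = inverse(generator(transform(x)))
    return float(np.mean(np.abs(y_from_transformed - y)))

x = np.arange(16.0).reshape(4, 4)
flip = lambda a: a[:, ::-1]          # a simple geometric transformation
```

A translator that is equivariant to the transform (e.g., pointwise scaling under a horizontal flip) incurs zero loss, while one that scrambles spatial structure does not.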

Twin Auxiliary Classifiers GAN
M. Gong*, Y. Xu*, C. Li, K. Zhang, K. Batmanghelich (*: equal contribution)
NeurIPS 2019 [Spotlight 2.4%]
[pdf][code]

Abstract

Conditional generative models have enjoyed remarkable progress over the past few years. One of the most popular conditional models is the Auxiliary Classifier GAN (AC-GAN), which generates highly discriminative images by extending the GAN loss function with an auxiliary classifier. However, the diversity of the samples generated by AC-GAN tends to decrease as the number of classes increases, limiting its power on large-scale data. In this paper, we identify the source of the low-diversity issue theoretically and propose a practical solution. We show that the auxiliary classifier in AC-GAN imposes perfect separability, which is disadvantageous when the supports of the class distributions overlap significantly. To address the issue, we propose the Twin Auxiliary Classifiers Generative Adversarial Net (TAC-GAN), which benefits from a new player that interacts with the other players (the generator and the discriminator) in the GAN. Theoretically, we demonstrate that TAC-GAN can effectively minimize the divergence between the generated and real-data distributions. Extensive experimental results show that our TAC-GAN can successfully replicate the true data distributions on simulated data and significantly improves the diversity of class-conditional image generation on real datasets.

Generative Interpretability: Application in Disease Subtyping
P. Yadollahpour, A. Saeedi, S. Singla, F. C. Sciurba, K. Batmanghelich
Submitted to IEEE Transactions on Medical Imaging
[pdf]

Abstract

We present a probabilistic approach to characterize heterogeneous disease in a way that is reflective of disease severity. In many diseases, multiple subtypes of disease present simultaneously in each patient. Generative models provide a flexible and readily explainable framework to discover disease subtypes from imaging data. However, discovering local image descriptors of each subtype in a fully unsupervised way is an ill-posed problem and may result in loss of valuable information about disease severity. Although supervised approaches, and more recently deep learning methods, have achieved state-of-the-art performance for predicting clinical variables relevant to diagnosis, interpreting those models is a crucial yet challenging task. In this paper, we propose a method that aims to achieve the best of both worlds, namely we maintain the predictive power of supervised methods and the interpretability of probabilistic methods. Taking advantage of recent progress in deep learning, we propose to incorporate the discriminative information extracted by the predictive model into the posterior distribution over the latent variables of the generative model. Hence, one can view the generative model as a template for interpretation of a discriminative method in a clinically meaningful way. We illustrate an application of this method on a large-scale lung CT study of Chronic Obstructive Pulmonary Disease (COPD), which is a highly heterogeneous disease. As our experiments show, our interpretable model does not compromise the prediction of the relevant clinical variables, unlike purely unsupervised methods. We also show that some of the discovered subtypes are correlated with genetic measurements suggesting that the discovered subtypes characterize the underlying etiology of the disease.

Robust Ordinal VAE: Employing Noisy Pairwise Comparisons for Disentanglement
J. Chen, K. Batmanghelich
Preprint: arXiv:1910.05898
[pdf][code]

Abstract

Recent work by Locatello et al. (2018) has shown that an inductive bias is required to disentangle factors of interest in Variational Autoencoder (VAE). Motivated by a real-world problem, we propose a setting where such bias is introduced by providing pairwise ordinal comparisons between instances, based on the desired factor to be disentangled. For example, a doctor compares pairs of patients based on the level of severity of their illnesses, and the desired factor is a quantitative level of disease severity. In a real-world application, the pairwise comparisons are usually noisy. Our method, Robust Ordinal VAE (ROVAE), incorporates the noisy pairwise ordinal comparisons in the disentanglement task. We introduce non-negative random variables in ROVAE, such that it can automatically determine whether each pairwise ordinal comparison is trustworthy and ignore the noisy comparisons. Experimental results demonstrate that ROVAE outperforms existing methods and is more robust to noisy pairwise comparisons in both benchmark datasets and a real-world application.

2018

Subject2Vec: Generative-Discriminative Approach from a Set of Image Patches to a Vector
S. Singla, M. Gong, S. Ravanbakhsh, F. Sciurba, B. Poczos, K. N. Batmanghelich
Medical Image Computing & Computer-Assisted Intervention
[pdf]

Abstract

We propose an attention-based method that aggregates local image features into a subject-level representation for predicting disease severity. In contrast to classical deep learning, which requires a fixed-dimensional input, our method operates on a set of image patches; hence it can accommodate input images of variable size without resizing. The model learns a clinically interpretable subject-level representation that is reflective of the disease severity. Our model consists of three mutually dependent modules that regulate each other: (1) a discriminative network that learns a fixed-length representation from local features and maps it to disease severity; (2) an attention mechanism that provides interpretability by focusing on the areas of the anatomy that contribute most to the prediction task; and (3) a generative network that encourages diversity of the local latent features. The generative term ensures that the attention weights are non-degenerate while maintaining the relevance of the local regions to the disease severity. We train our model end-to-end in the context of a large-scale lung CT study of Chronic Obstructive Pulmonary Disease (COPD). Our model gives state-of-the-art performance in predicting clinical measures of severity for COPD. The distribution of the attention provides the regional relevance of lung tissue to the clinical measurements.


A structural equation model for imaging genetics using spatial transcriptomics
S. M. H. Huisman, A. Mahfouz, K. Batmanghelich, B. P. F. Lelieveldt, M. J. T. Reinders
Brain Informatics
[pdf]

Abstract

Imaging genetics deals with relationships between genetic variation and imaging variables, often in a disease context. The complex relationships between brain volumes and genetic variants have been explored with both dimension reduction methods and model-based approaches. However, these models usually do not make use of the extensive knowledge of the spatio-anatomical patterns of gene activity. We present a method for integrating genetic markers (single nucleotide polymorphisms) and imaging features, which is based on a causal model and, at the same time, uses the power of dimension reduction. We use structural equation models to find latent variables that explain brain volume changes in a disease context, and which are in turn affected by genetic variants. We make use of publicly available spatial transcriptome data from the Allen Human Brain Atlas to specify the model structure, which reduces noise and improves interpretability. The model is tested in a simulation setting and applied on a case study of the Alzheimer’s Disease Neuroimaging Initiative.

Causal Generative Domain Adaptation Networks
M. Gong, K. Zhang, B. Huang, C. Glymour, D. Tao, K. Batmanghelich
Preprint: arXiv:1804.04333
[pdf]

Abstract

We propose a new generative model for domain adaptation, in which training data (source domain) and test data (target domain) come from different distributions. An essential problem in domain adaptation is to understand how the distribution shifts across domains. For this purpose, we propose a generative domain adaptation network to understand and identify the domain changes, which enables the generation of new domains. In addition, focusing on single domain adaptation, we demonstrate how our model recovers the joint distribution on the target domain from unlabeled target domain data by transferring valuable information between domains. Finally, to improve transfer efficiency, we build a causal generative domain adaptation network by decomposing the joint distribution of features and labels into a series of causal modules according to a causal model. Due to the modularity property of a causal model, we can improve the identification of distribution changes by modeling each causal module separately. With the proposed adaptation networks, the predictive model on the target domain can be easily trained on data sampled from the learned networks. We demonstrate the efficacy of our method on both synthetic and real data experiments.

Deep Diffeomorphic Normalizing Flows
H. Salman, P. Yadollahpour, T. Fletcher, K. Batmanghelich
Preprint: arXiv:1810.03256
[pdf]

Abstract

The Normalizing Flow (NF) models a general probability density by estimating an invertible transformation applied on samples drawn from a known distribution. We introduce a new type of NF, called Deep Diffeomorphic Normalizing Flow (DDNF). A diffeomorphic flow is an invertible function where both the function and its inverse are smooth. We construct the flow using an ordinary differential equation (ODE) governed by a time-varying smooth vector field. We use a neural network to parametrize the smooth vector field and a recurrent neural network (RNN) for approximating the solution of the ODE. Each cell in the RNN is a residual network implementing one Euler integration step. The architecture of our flow enables efficient likelihood evaluation, straightforward flow inversion, and results in highly flexible density estimation. An end-to-end trained DDNF achieves competitive results with state-of-the-art methods on a suite of density estimation and variational inference tasks. Finally, our method brings concepts from Riemannian geometry that, we believe, can open a new research direction for neural density estimation.
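The Euler construction mentioned above can be sketched in a few lines (a generic fixed-step Euler integrator, not the paper's network): each update x &lt;- x + h*v(x, t) plays the role of one residual cell of the RNN.

```python
import numpy as np

def euler_flow(x0, vector_field, t0=0.0, t1=1.0, steps=1000):
    """Fixed-step Euler integration of dx/dt = v(x, t); each update
    x <- x + h * v(x, t) corresponds to one residual cell in the RNN view
    of the flow (a generic sketch, not the paper's architecture)."""
    h = (t1 - t0) / steps
    x = np.asarray(x0, dtype=float)
    t = t0
    for _ in range(steps):
        x = x + h * vector_field(x, t)   # one Euler step = one residual block
        t += h
    return x

# dx/dt = x has the exact solution x(1) = e * x(0)
approx = euler_flow(np.array([1.0]), lambda x, t: x, steps=10000)
```

Because each step is a small, near-identity residual map, the composed transformation stays invertible (for small enough h), which is the property the flow exploits.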

An Efficient and Provable Approach for Mixture Proportion Estimation Using Linear Independence Assumption
X. Yu, T. Liu, M. Gong, K. Batmanghelich, D. Tao
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018
[pdf][code]

Abstract

In this paper, we study the mixture proportion estimation (MPE) problem in a new setting: given samples from the mixture and the component distributions, we identify the proportions of the components in the mixture distribution. To address this problem, we make use of a linear independence assumption, i.e., the component distributions are linearly independent of each other, which is much weaker than the assumptions exploited in previous MPE methods. Based on this assumption, we propose a method (1) that uniquely identifies the mixture proportions, (2) whose output provably converges to the optimal solution, and (3) that is computationally efficient. We show the superiority of the proposed method over state-of-the-art methods in two applications, learning with label noise and semi-supervised learning, on both synthetic and real-world datasets.
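The intuition behind the linear independence assumption can be sketched on histograms (a deliberately simplified estimator with hypothetical names; the paper's estimator and guarantees are different): if the mixture density is a convex combination of component densities, the proportions solve a linear system, and linear independence is exactly what makes the solution unique.

```python
import numpy as np

def estimate_proportions(component_hists, mixture_hist):
    """Simplified MPE sketch: treat histograms as densities, solve the
    linear system F @ pi = f_mix by least squares, then project onto the
    simplex by clipping and renormalizing."""
    F = np.stack(component_hists, axis=1)               # bins x components
    pi, *_ = np.linalg.lstsq(F, mixture_hist, rcond=None)
    pi = np.clip(pi, 0.0, None)
    return pi / pi.sum()

f1 = np.array([0.7, 0.2, 0.1])   # component 1 (toy 3-bin histogram)
f2 = np.array([0.1, 0.3, 0.6])   # component 2, linearly independent of f1
mix = 0.25 * f1 + 0.75 * f2
pi = estimate_proportions([f1, f2], mix)
```

If f1 and f2 were linearly dependent, infinitely many proportion vectors would reproduce the same mixture, which is why the weaker assumption still suffices for identifiability.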

Deep Ordinal Regression Network for Monocular Depth Estimation
H. Fu, M. Gong, C. Wang, K. Batmanghelich, D. Tao
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018
[pdf][code]

Abstract

Monocular depth estimation, which plays a crucial role in understanding 3D scene geometry, is an ill-posed problem. Recent methods have gained significant improvement by exploring image-level information and hierarchical features from deep convolutional neural networks (DCNNs). These methods model depth estimation as a regression problem and train the regression networks by minimizing mean squared error, which suffers from slow convergence and unsatisfactory local solutions. Besides, existing depth estimation networks employ repeated spatial pooling operations, resulting in undesirable low-resolution feature maps. To obtain high-resolution depth maps, skip connections or multi-layer deconvolution networks are required, which complicates network training and consumes much more computation. To eliminate or at least largely reduce these problems, we introduce a spacing-increasing discretization (SID) strategy to discretize depth and recast depth network learning as an ordinal regression problem. By training the network with an ordinal regression loss, our method achieves much higher accuracy and faster convergence at the same time. Furthermore, we adopt a multi-scale network structure that avoids unnecessary spatial pooling and captures multi-scale information in parallel. The proposed deep ordinal regression network (DORN) achieves state-of-the-art results on three challenging benchmarks, i.e., KITTI, Make3D, and NYU Depth v2, and outperforms existing methods by a large margin.
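The spacing-increasing discretization (SID) described above places thresholds uniformly in log-depth, so bins widen with distance, matching the larger depth uncertainty of far-away scene points. A sketch (hypothetical parameter names) for a depth range [alpha, beta] split into K intervals:

```python
import numpy as np

def sid_thresholds(alpha, beta, K):
    """Spacing-increasing discretization: K+1 thresholds uniform in
    log-depth over [alpha, beta], so depth bins widen with distance."""
    i = np.arange(K + 1)
    return np.exp(np.log(alpha) + i * np.log(beta / alpha) / K)

t = sid_thresholds(1.0, 80.0, 8)   # e.g., a 1-80 m depth range, 8 bins
```

The network then predicts, for each threshold, whether the true depth exceeds it, which is what turns depth estimation into ordinal regression.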

Textured Graph-Based Model of the Lungs: Application on Tuberculosis Type Classification and Multi-drug Resistance Detection
Y.D. Cid, K. Batmanghelich, H. Müller
International Conference of the Cross-Language Evaluation Forum for European Languages
[pdf]

Abstract

Tuberculosis (TB) remains a leading cause of death worldwide. Two main challenges when assessing computed tomography scans of TB patients are detecting multi-drug resistance and differentiating TB types. In this article, we model the lungs as a graph entity where nodes represent anatomical lung regions and edges encode interactions between them. This graph is able to characterize the texture distribution along the lungs, making it suitable for describing patients with different TB types. In 2017, the ImageCLEF benchmark proposed a task based on computed tomography volumes of patients with TB. This task was divided into two subtasks: multi-drug resistance prediction, and TB type classification. The participation in this task showed the strength of our model, leading to the best results in the competition for multi-drug resistance detection (AUC = 0.5825) and good results in TB type classification (Cohen’s Kappa coefficient = 0.1623).

2017

Transformations Based on Continuous Piecewise-Affine Velocity Fields
O. Freifeld, S. Hauberg, J. Fisher III, N. Batmanghelich
IEEE Transactions on Pattern Analysis and Machine Intelligence
[pdf]

Abstract

We propose novel finite-dimensional spaces of well-behaved Rn → Rn transformations. The latter are obtained by (fast and highly accurate) integration of continuous piecewise-affine velocity fields. The proposed method is simple yet highly expressive, effortlessly handles optional constraints (e.g., volume preservation and/or boundary conditions), and supports convenient modeling choices such as smoothing priors and coarse-to-fine analysis. Importantly, the proposed approach, partly due to its rapid likelihood evaluations and partly due to its other properties, facilitates tractable inference over rich transformation spaces, including using Markov chain Monte Carlo methods. Its applications include, but are not limited to: monotonic regression (more generally, optimization over monotonic functions); modeling cumulative distribution functions or histograms; time-warping; image warping; image registration; real-time diffeomorphic image editing; data augmentation for image classifiers. Our GPU-based code is publicly available.
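A toy 1D analogue of the construction above (illustrative only; the paper works in Rn with fast, specialized integration): Euler-integrating a continuous piecewise-affine velocity field yields a monotone, hence invertible, warp of the interval.

```python
def integrate_cpa_velocity(x0, slopes, intercepts, cells, steps=1000):
    """Toy 1D sketch: Euler-integrate a continuous piecewise-affine velocity
    field v(x) = slopes[c] * x + intercepts[c] on [0, 1] (c = index of the
    uniform cell containing x) for unit time; the result is a monotone,
    invertible warp of the interval."""
    x = float(x0)
    h = 1.0 / steps
    for _ in range(steps):
        c = min(int(x * cells), cells - 1)   # cell containing x
        x += h * (slopes[c] * x + intercepts[c])
    return x

# v(x) = x on [0, 0.5) and v(x) = 1 - x on [0.5, 1]: continuous, zero at
# both endpoints, so 0 and 1 stay fixed and interior points move monotonically.
warp = lambda x0: integrate_cpa_velocity(x0, [1.0, -1.0], [0.0, 1.0], 2)
```

Such warps are exactly the kind of object useful for monotonic regression and CDF modeling listed in the abstract: monotone, smooth, and cheap to evaluate.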

A Likelihood-Free Approach for Characterizing Heterogeneous Diseases in Large-Scale Studies
J. Schabdach, S. Wells, M. Cho, N. Batmanghelich
Information Processing in Medical Imaging (IPMI)
[pdf]

Abstract

We propose a non-parametric approach for characterizing heterogeneous diseases in large-scale studies. We target diseases where multiple types of pathology present simultaneously in each subject and a more severe disease manifests as a higher level of tissue destruction. For each subject, we model the collection of local image descriptors as samples generated by an unknown subject-specific probability density. Instead of approximating the probability density via a parametric family, we propose to sidestep the parametric inference by directly estimating the divergence between subject densities. Our method maps the collection of local image descriptors to a signature vector that is used to predict a clinical measurement. We are able to interpret the prediction of the clinical variable at the population and individual levels by carefully studying the divergences. We illustrate an application of this method on simulated data as well as on a large-scale lung CT study of Chronic Obstructive Pulmonary Disease (COPD). Our approach outperforms classical methods on both simulated and COPD data and demonstrates state-of-the-art prediction on an important physiologic measure of airflow (the forced expiratory volume in one second, FEV1).


2016

Unsupervised Discovery of Emphysema Subtypes in a Large Clinical Cohort
P. Binder, N. Batmanghelich, R. J. Estepar, P. Golland
7th International Workshop on Machine Learning in Medical Imaging (MLMI)
[pdf]

Abstract

Emphysema is one of the hallmarks of Chronic Obstructive Pulmonary Disease (COPD), a devastating lung disease often caused by smoking. Emphysema appears on Computed Tomography (CT) scans as a variety of textures that correlate with disease subtypes. It has been shown that the disease subtypes and textures are linked to physiological indicators and prognosis, although neither is well characterized clinically. Most previous computational approaches to modeling emphysema imaging data have focused on supervised classification of lung textures in patches of CT scans. In this work, we describe a generative model that jointly captures heterogeneity of disease subtypes and of the patient population. We also describe a corresponding inference algorithm that simultaneously discovers disease subtypes and population structure in an unsupervised manner. This approach enables us to create image-based descriptors of emphysema beyond those that can be identified through manual labeling of currently defined phenotypes. By applying the resulting algorithm to a large data set, we identify groups of patients and disease subtypes that correlate with distinct physiological indicators.

Probabilistic Modeling of Imaging, Genetics and the Diagnosis
K.N. Batmanghelich, A. Dalca, G. Quon, M. Sabuncu, P. Golland
IEEE Trans Med Imaging
[pdf]

Abstract

We propose a unified Bayesian framework for detecting genetic variants associated with disease by exploiting image-based features as an intermediate phenotype. The use of imaging data for examining genetic associations promises new directions of analysis, but currently the most widely used methods make sub-optimal use of the richness that these data types can offer. Typically, image features are first selected based on their relevance to the disease phenotype. Then, in a separate step, a set of genetic variants is identified to explain the selected features. In contrast, our method performs these tasks simultaneously in order to jointly exploit information in both data types. The analysis yields probabilistic measures of clinical relevance for both imaging and genetic markers. We derive an efficient approximate inference algorithm that handles the high dimensionality of image and genetic data. We evaluate the algorithm on synthetic data and demonstrate that it outperforms traditional models. We also illustrate our method on Alzheimer’s Disease Neuroimaging Initiative data.

 

Nonparametric Spherical Topic Modeling with Word Embeddings
N. Batmanghelich, A. Saeedi, K. Narasimhan, S. Gershman
Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (ACL)
[pdf]

Abstract

Traditional topic models do not account for semantic regularities in language. Recent distributional representations of words exhibit semantic consistency over directional metrics such as cosine similarity. However, neither categorical nor Gaussian observational distributions used in existing topic models are appropriate to leverage such correlations. In this paper, we propose to use the von Mises-Fisher distribution to model the density of words over a unit sphere. Such a representation is well-suited for directional data. We use a Hierarchical Dirichlet Process for our base topic model and propose an efficient inference algorithm based on Stochastic Variational Inference. This model enables us to naturally exploit the semantic structures of word embeddings while flexibly discovering the number of topics. Experiments demonstrate that our method outperforms competitive approaches in terms of topic coherence on two different text corpora while offering efficient inference.
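A minimal sketch of why the von Mises-Fisher distribution suits unit-norm word embeddings: with a shared concentration parameter kappa, the vMF log-density reduces (up to a constant) to a scaled cosine similarity, so topic assignment becomes a directional nearest-mean rule. The function names below are illustrative, not from the paper's implementation.

```python
import numpy as np

def normalize(v):
    # project vectors onto the unit sphere
    return v / np.linalg.norm(v, axis=-1, keepdims=True)

def vmf_log_kernel(x, mus, kappa):
    """Unnormalized vMF log-density: kappa * <mu, x>.
    With a shared kappa the normalizer is identical across topics,
    so it drops out of the argmax below."""
    return kappa * x @ mus.T

def assign_topics(word_vecs, topic_means, kappa=10.0):
    # hard-assign unit word embeddings to the most likely vMF topic
    x = normalize(word_vecs)
    mus = normalize(topic_means)
    scores = vmf_log_kernel(x, mus, kappa)  # shape (n_words, n_topics)
    return np.argmax(scores, axis=1)
```

This directional view is exactly the property the abstract highlights: cosine-similar embeddings land in the same topic, which categorical or Gaussian observation models do not capture.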

 

Inferring Disease Status by non-Parametric Probabilistic Embedding
N. Batmanghelich, A. Saeedi, R. J. Estepar, M. Cho, S. Wells
Workshop on Medical Computer Vision: Algorithms for Big Data (MCV)
[pdf]

Abstract

Computing similarity between all pairs of patients in a dataset enables us to group the subjects into disease subtypes and infer their disease status. However, robust and efficient computation of pairwise similarity is a challenging task for large-scale medical image datasets. We specifically target diseases where multiple subtypes of pathology present simultaneously, rendering the definition of similarity a difficult task. To define pairwise patient similarity, we characterize each subject by a probability distribution that generates its local image descriptors. We adopt a notion of affinity between probability distributions which lends itself to similarity between subjects. Instead of approximating the distributions by a parametric family, we propose to compute the affinity measure indirectly using an approximate nearest neighbor estimator. Computing pairwise similarities enables us to embed the entire patient population into a lower dimensional manifold, mapping each subject from high-dimensional image space to an informative low-dimensional representation. We validate our method on a large-scale lung CT scan study and demonstrate the state-of-the-art prediction on an important physiologic measure of airflow (the forced expiratory volume in one second, FEV1) in addition to a 5-category clinical rating (the so-called GOLD score).
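The embedding step can be sketched with classical spectral embedding: given a symmetric pairwise affinity matrix, eigenvectors of the normalized graph Laplacian map each subject to a low-dimensional coordinate. This is a generic stand-in for the manifold embedding used in the paper, under the assumption of a precomputed affinity matrix.

```python
import numpy as np

def embed_from_affinity(A, dim=2):
    """Map subjects to `dim` coordinates from a symmetric pairwise
    affinity matrix A (higher = more similar) via spectral embedding."""
    d = A.sum(axis=1)
    # symmetric normalized graph Laplacian: I - D^{-1/2} A D^{-1/2}
    L = np.eye(len(A)) - A / np.sqrt(np.outer(d, d))
    w, v = np.linalg.eigh(L)  # eigenvalues in ascending order
    # skip the trivial smallest-eigenvalue vector, keep the next `dim`
    return v[:, 1:dim + 1]
```

On a population with two well-separated subgroups, the first embedding coordinate already separates the clusters, which is the behavior the test below checks.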

2015

Highly-Expressive Spaces of Well-Behaved Transformations: Keeping It Simple
O. Freifeld, S. Hauberg, N. Batmanghelich
Proceedings of the IEEE International Conference on Computer Vision (ICCV)
[pdf]

Abstract

Generative Method to Discover Genetically Driven Image Biomarkers
K.N. Batmanghelich*, A. Saeedi*, M. Cho, R. Jose, P. Golland
In Proc. IPMI: International Conference on Information Processing and Medical Imaging
[pdf]

Abstract

We present a generative probabilistic approach to discovery of disease subtypes determined by the genetic code. In many diseases, multiple types of pathology may present simultaneously in a patient, making quantification of the disease challenging. Our method seeks common co-occurring image and genetic patterns in a population as a way to model these two different data types jointly. We assume that each patient is a mixture of multiple disease subtypes and use the joint generative model of image and genetic markers to identify disease subtypes guided by known genetic influences. Our model is based on a variant of the so-called topic models that uncover the latent structure in a collection of data. We derive an efficient variational inference algorithm to extract patterns of co-occurrence and to quantify the presence of a heterogeneous disease in each patient. We evaluate the method on simulated data and illustrate its use in the context of Chronic Obstructive Pulmonary Disease (COPD) to characterize the relationship between image and genetic signatures of the COPD subtypes in a large patient cohort.

 

2014

Spherical Topic Models for Imaging Phenotype Discovery in Genetic Studies
K.N. Batmanghelich, M. Cho, R. Jose, P. Golland
In Proc. BAMBI: Workshop on Bayesian and Graphical Models for Biomedical Imaging, International Conference on Medical Image Computing and Computer-Assisted Intervention (MICCAI)
[pdf]

Abstract

In this paper, we use Spherical Topic Models to discover the latent structure of lung disease. This method can be widely employed when a measurement for each subject is provided as a normalized histogram of relevant features. Here, the resulting descriptors are used as phenotypes to identify genetic markers associated with Chronic Obstructive Pulmonary Disease (COPD). Features extracted from images capture the heterogeneity of the disease and therefore promise to improve detection of relevant genetic variants in Genome Wide Association Studies (GWAS). Our generative model is based on normalized histograms of image intensity of each subject, and it can be readily extended to other forms of features as long as they are provided as normalized histograms. The resulting algorithm represents the intensity distribution as a combination of meaningful latent factors and mixing coefficients that can be used for genetic association analysis. This approach is motivated by a clinical hypothesis that COPD symptoms are caused by multiple coexisting disease processes. Our experiments show that the new features enhance the previously detected signal on chromosome 15 with respect to standard respiratory and imaging measurements.

Diversifying Sparsity Using Variational Determinantal Point Processes
K.N. Batmanghelich, G. Quon, A. Kulesza, M. Kellis, P. Golland, L. Bornn
arXiv preprint
[pdf]

Abstract

We propose a novel diverse feature selection method based on determinantal point processes (DPPs). Our model enables one to flexibly define diversity based on the covariance of features (similar to orthogonal matching pursuit) or alternatively based on side information. We introduce our approach in the context of Bayesian sparse regression, employing a DPP as a variational approximation to the true spike and slab posterior distribution. We subsequently show how this variational DPP approximation generalizes and extends mean-field approximation, and can be learned efficiently by exploiting the fast sampling properties of DPPs. Our motivating application comes from bioinformatics, where we aim to identify a diverse set of genes whose expression profiles predict a tumor type where the diversity is defined with respect to a gene-gene interaction network. We also explore an application in spatial statistics. In both cases, we demonstrate that the proposed method yields significantly more diverse feature sets than classic sparse methods, without compromising accuracy.
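A common, simple way to operationalize DPP-style diversity is greedy MAP inference: repeatedly add the feature that most increases the log-determinant of the selected kernel submatrix. The sketch below shows only this diversity mechanism; the paper's contribution is a variational DPP approximation to the spike-and-slab posterior, not this greedy rule.

```python
import numpy as np

def greedy_dpp_map(K, k):
    """Greedily select k items under a DPP with PSD kernel K by
    maximizing log det of the selected submatrix at each step.
    Similar items (large off-diagonal K) are penalized, so the
    selected set is diverse."""
    selected = []
    for _ in range(k):
        best, best_val = None, -np.inf
        for j in range(len(K)):
            if j in selected:
                continue
            idx = selected + [j]
            sign, logdet = np.linalg.slogdet(K[np.ix_(idx, idx)])
            if sign > 0 and logdet > best_val:
                best, best_val = j, logdet
        selected.append(best)
    return selected
```

With two near-duplicate features and one distinct feature, the greedy rule skips the duplicate, illustrating why DPP selection yields more diverse sets than magnitude-based sparse methods.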

BrainPrint in the Computer-Aided Diagnosis of Alzheimer’s Disease
C. Wachinger, K. Batmanghelich, P. Golland, M. Reuter
Challenge on Computer-Aided Diagnosis of Dementia, MICCAI, 2014
[pdf]

Abstract

We investigate the potential of shape information in assisting the computer-aided diagnosis of Alzheimer’s disease and its prodromal stage of mild cognitive impairment. We employ BrainPrint to obtain an extensive characterization of the shape of brain structures. BrainPrint captures shape information of an ensemble of cortical and subcortical structures by solving the 2D and 3D Laplace-Beltrami operator on triangular and tetrahedral meshes. From the shape descriptor, we derive features for the classification by computing lateral shape differences and the projection on the principal component. Volume and thickness measurements from FreeSurfer complement the shape features in our model. We use the generalized linear model with a multinomial link function for the classification. Next to manual model selection, we employ the elastic-net regularizer and stepwise model selection with the Akaike information criterion. Training is performed on data provided by the Alzheimer’s Disease Neuroimaging Initiative (ADNI) and testing on the data provided by the challenge. The approach runs fully automatically.

2013

Joint Modeling of Imaging and Genetics
K.N. Batmanghelich, A.V. Dalca, M.R. Sabuncu, P. Golland
In Proc. IPMI: International Conference on Information Processing and Medical Imaging
[pdf]

Abstract

We propose a unified Bayesian framework for detecting genetic variants associated with a disease while exploiting image-based features as an intermediate phenotype. Traditionally, imaging genetics methods comprise two separate steps. First, image features are selected based on their relevance to the disease phenotype. Second, a set of genetic variants is identified to explain the selected features. In contrast, our method performs these tasks simultaneously to ultimately assign probabilistic measures of relevance to both genetic and imaging markers. We derive an efficient approximate inference algorithm that handles the high dimensionality of imaging and genetic data. We evaluate the algorithm on synthetic data and show that it outperforms traditional models. We also illustrate the application of the method on ADNI data.

2012

Dominant Component Analysis of Electrophysiological Connectivity Networks
Y. Ghanbari, L. Bloy, N. Batmanghelich, R. Verma
International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI)
[pdf]

Abstract

Connectivity matrices obtained from various modalities (DTI, MEG and fMRI) provide a unique insight into brain processes. Their high dimensionality necessitates the development of methods for population-based statistics, in the face of small sample sizes. In this paper, we present such a method applicable to functional connectivity networks, based on identifying the basis of dominant connectivity components that characterize the patterns of brain pathology and population variation. Projection of individual connectivity matrices into this basis allows for dimensionality reduction, facilitating subsequent statistical analysis. We find dominant components for a collection of connectivity matrices by using the projective non-negative component analysis technique which ensures that the components have non-negative elements and are non-negatively combined to obtain individual subject networks, facilitating interpretation. We demonstrate the feasibility of our novel framework by applying it to simulated connectivity matrices as well as to a clinical study using connectivity matrices derived from resting state magnetoencephalography (MEG) data in a population of subjects diagnosed with autism spectrum disorder (ASD).
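The non-negative decomposition idea can be illustrated with standard Lee-Seung multiplicative updates for NMF, a simpler relative of the projective non-negative component analysis used in the paper: non-negative components combined non-negatively, which is what makes the learned basis interpretable. This is a sketch of the shared idea, not the paper's projective algorithm.

```python
import numpy as np

def nmf(X, r, n_iter=500, eps=1e-9):
    """Factor a non-negative matrix X ~ W @ H with W, H >= 0 using
    Lee-Seung multiplicative updates. Multiplicative updates preserve
    non-negativity as long as the initialization is positive."""
    rng = np.random.default_rng(0)
    n, m = X.shape
    W = rng.random((n, r))
    H = rng.random((r, m))
    for _ in range(n_iter):
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H
```

Applied to connectivity matrices stacked as columns of X, the columns of W play the role of dominant components and H holds the per-subject mixing coefficients used for group statistics.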

An Integrated Framework for High Angular Resolution Diffusion Imaging-Based Investigation of Structural Connectivity
L. Bloy, M. Ingalhalikar, N. Batmanghelich
Brain Connectivity
[pdf]

Abstract

Structural connectivity models hold great promise for expanding what is known about the ways information travels throughout the brain. The physiologic interpretability of structural connectivity models depends heavily on how the connections between regions are quantified. This article presents an integrated structural connectivity framework designed around such an interpretation. The framework provides three measures to characterize the structural connectivity of a subject: (1) the structural connectivity matrix describing the proportion of connections between pairs of nodes, (2) the nodal connection distribution (nCD) characterizing the proportion of connections that terminate in each node, and (3) the connection density image, which presents the density of connections as they traverse through white matter (WM). Individually, each possesses different information concerning the structural connectivity of the individual and could potentially be useful for a variety of tasks, ranging from characterizing and localizing group differences to identifying novel parcellations of the cortex. The efficiency of the proposed framework allows the determination of large structural connectivity networks, consisting of many small nodal regions, providing a more detailed description of a subject’s connectivity. The nCD provides a gray matter contrast that can potentially aid in investigating local cytoarchitecture and connectivity. Similarly, the connection density images offer insight into the WM pathways, potentially identifying focal differences that affect a number of pathways. The reliability of these measures was established through a test/retest paradigm performed on nine subjects, while the utility of the method was evaluated through its applications to 20 diffusion datasets acquired from typically developing adolescents.
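The bookkeeping behind measures (1) and (2) can be sketched as follows: given streamline endpoint pairs, accumulate a symmetric count matrix, normalize it to connection proportions, and read the nodal connection distribution (nCD) off the row sums. The exact normalization in the paper may differ; this is schematic, and the function name is an illustrative assumption.

```python
import numpy as np

def connectivity_measures(streamline_endpoints, n_nodes):
    """From a list of (node_i, node_j) streamline endpoint pairs,
    build (1) the connectivity matrix of connection proportions and
    (2) the nodal connection distribution (nCD)."""
    C = np.zeros((n_nodes, n_nodes))
    for i, j in streamline_endpoints:
        # count each streamline symmetrically at both of its endpoints
        C[i, j] += 1
        C[j, i] += 1
    C /= C.sum()           # normalize counts to proportions
    ncd = C.sum(axis=1)    # proportion of connections touching each node
    return C, ncd
```

Because every streamline contributes equally at its two endpoints, the nCD sums to one and highlights nodes where connections concentrate.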

Generative-Discriminative Basis Learning for Medical Imaging
N. Batmanghelich, B. Taskar, C. Davatzikos
IEEE Trans Med Imaging
[pdf]

Abstract

This paper presents a novel dimensionality reduction method for classification in medical imaging. The goal is to transform very high-dimensional input (typically, millions of voxels) to a low-dimensional representation (a small number of constructed features) that preserves discriminative signal and is clinically interpretable. We formulate the task as a constrained optimization problem that combines generative and discriminative objectives and show how to extend it to the semi-supervised learning (SSL) setting. We propose a novel large-scale algorithm to solve the resulting optimization problem. In the fully supervised case, we demonstrate accuracy rates that are better than or comparable to state-of-the-art algorithms on several datasets while producing a representation of the group difference that is consistent with prior clinical reports. Effectiveness of the proposed algorithm for SSL is evaluated with both benchmark and medical imaging datasets. In the benchmark datasets, the results are better than or comparable to the state-of-the-art methods for SSL. For evaluation of the SSL setting in medical datasets, we use images of subjects with Mild Cognitive Impairment (MCI), which is believed to be a precursor to Alzheimer’s disease (AD), as unlabeled data. AD subjects and Normal Control (NC) subjects are used as labeled data, and we try to predict conversion from MCI to AD on follow-up. The semi-supervised extension of this method not only slightly improves the generalization accuracy for the labeled data (AD/NC) but is also able to predict which subjects are likely to convert to AD.

 

2011

Regularized Tensor Factorization for Multi-Modality Medical Image Classification
N. Batmanghelich, B. Taskar, C. Davatzikos
International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI 2011)
[pdf]

Abstract

This paper presents a general discriminative dimensionality reduction framework for multi-modal image-based classification in medical imaging datasets. The major goal is to use all modalities simultaneously to transform very high-dimensional images into a lower-dimensional representation in a discriminative way. In addition to being discriminative, the proposed approach has the advantage of being clinically interpretable. We propose a framework based on regularized tensor decomposition and show that different variants of tensor factorization imply different hypotheses about the data. Inspired by the idea of multi-view dimensionality reduction in the machine learning community, two different kinds of tensor decomposition and their implications are presented. We have validated our method on a multi-modal longitudinal brain imaging study and compared it with publicly available SVM-based classification software that has shown state-of-the-art classification rates in a number of publications.

Disease Classification and Prediction via Semi-Supervised Dimensionality Reduction
N. Batmanghelich, D. Ye, K. Pohl, B. Taskar, C. Davatzikos
ISBI 2011 (Oral Presentation)
[pdf]

Abstract

We present a new semi-supervised algorithm for dimensionality reduction which exploits information in unlabeled data in order to improve the accuracy of image-based disease classification. We perform dimensionality reduction by adopting the formalism of constrained matrix decomposition of [1] to semi-supervised learning. In addition, we add a new regularization term to the objective function to better capture the affinity between labeled and unlabeled data. We apply our method to a dataset consisting of medical scans of subjects classified as Normal Control (NC) and Alzheimer’s Disease (AD). The unlabeled data are scans of subjects diagnosed with Mild Cognitive Impairment (MCI), who are at high risk of developing AD in the future. We measure the accuracy of our algorithm in classifying scans as AD and NC. In addition, we use the classifier to predict which subjects with MCI will convert to AD and compare those predictions to the diagnoses given at later follow-ups. The experiments highlight that unlabeled data greatly improve the accuracy of our classifier.

 

2010

Prediction of MCI Conversion via MRI, CSF Biomarkers, and Pattern Classification
C. Davatzikos, P. Bhatt, L. Shaw, N. Batmanghelich, J. Trojanowski
Neurobiology of Aging
[pdf]

Abstract

Magnetic resonance imaging (MRI) patterns were examined together with cerebrospinal fluid (CSF) biomarkers in serial scans of Alzheimer’s Disease Neuroimaging Initiative (ADNI) participants with mild cognitive impairment (MCI). The SPARE-AD (Spatial Pattern of Abnormalities for Recognition of Early AD) score, summarizing brain atrophy patterns, was tested as a predictor of short-term conversion to Alzheimer’s disease (AD). MCI individuals that converted to AD (MCI-C) had mostly positive baseline SPARE-AD and atrophy in temporal lobe gray matter (GM) and white matter (WM), posterior cingulate/precuneus, and insula. MCI-C individuals also had mostly AD-like baseline CSF biomarkers. MCI nonconverters (MCI-NC) had mixed baseline SPARE-AD and CSF values, suggesting that some MCI-NC subjects may later convert. Those MCI-NC with the most negative baseline SPARE-AD scores (normal brain structure) had significantly higher baseline Mini Mental State Examination (MMSE) scores (28.67) than others, and a relatively low annual rate of MMSE decrease (-0.25). MCI-NC with midlevel baseline SPARE-AD displayed faster annual rates of SPARE-AD increase (indicating progressing atrophy). The SPARE-AD and CSF combination improved prediction over individual values. In summary, both SPARE-AD and CSF biomarkers showed high baseline sensitivity; however, many MCI-NC had abnormal baseline SPARE-AD and CSF biomarkers. Longer follow-up will elucidate the specificity of baseline measurements.

Application of Trace-Norm and Low-Rank Matrix Decomposition for Computational Anatomy
N. Batmanghelich, A. Gooya, B. Taskar, C. Davatzikos
IEEE Computer Society Workshop on Mathematical Methods in Biomedical Image Analysis (MMBIA10)
[pdf]

Abstract

We propose a generative model to distinguish normal anatomical variations from abnormal deformations given a group of images with normal and abnormal subjects. We assume that abnormal subjects share common factors which characterize the abnormality. These factors are hard to discover due to the large variance of normal anatomical differences. Assuming that the deformation fields are parametrized by their stationary velocity fields, these factors constitute a low-rank subspace (the abnormal space) that is corrupted by high-variance normal anatomical differences. We assume that these normal anatomical variations are not correlated. We form an optimization problem and propose an efficient iterative algorithm to recover the low-rank subspace. The algorithm iterates between image registration and decomposition steps and hence can be seen as a group-wise registration algorithm. We apply our method to synthetic and real data and discover abnormalities of the population that cannot be recovered by some of the well-known matrix decompositions (e.g., Singular Value Decomposition).
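A minimal stand-in for the decomposition step (ignoring the registration loop) is an alternating shrinkage scheme in the spirit of robust PCA: singular-value thresholding recovers a low-rank part, and element-wise soft-thresholding absorbs the remaining high-variance residual. The parameter names and the simple alternation below are assumptions for illustration, not the paper's algorithm.

```python
import numpy as np

def low_rank_sparse_split(M, lam=0.1, n_iter=100):
    """Crudely split a data matrix into a low-rank part L (shared
    abnormality subspace) plus a residual S (uncorrelated normal
    variation) by alternating shrinkage steps."""
    L = np.zeros_like(M)
    S = np.zeros_like(M)
    for _ in range(n_iter):
        # singular-value thresholding for the low-rank part
        U, sv, Vt = np.linalg.svd(M - S, full_matrices=False)
        L = (U * np.maximum(sv - lam, 0.0)) @ Vt
        # element-wise soft-thresholding for the residual
        S = np.sign(M - L) * np.maximum(np.abs(M - L) - lam, 0.0)
    return L, S
```

By construction, the unexplained part M - L - S is bounded element-wise by the threshold lam, so the two components together account for the data up to a small remainder.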