Research




Research interests

  • Statistical modelling in high-dimensional spaces with mixed covariates; Stochastic computation and simulation methods (MCMC, Monte Carlo) for mining massive datasets; High-performance and distributed statistical computing; Graphical models and related graph theory; Categorical data analysis in large, sparse multi-way tables; Multivariate survival analysis.


  • Statistical genomics; Modelling approaches to combining genomic, clinical and sequence information; Stochastic modelling of high-dimensional biological networks; Graphical association networks for exploration, visualization and summarization of gene expression data.



Publications

2010
(1) Rodriguez, A., Lenkoski, A. and Dobra, A. (2010). Sparse covariance estimation in heterogeneous samples. Submitted for publication. [PDF]
(2) Dobra, A. and Fienberg, S.E. (2010). The generalized shuttle algorithm. Algebraic and geometric methods in statistics, Volume dedicated to Professor Giovanni Pistone, Cambridge University Press (P. Gibilisco, E. Riccomagno, M.P. Rogantin and H.P. Wynn, eds.), 135-156. [PDF]
(3) Lenkoski, A. and Dobra, A. (2010). Computational aspects related to inference in Gaussian graphical models with the G-Wishart prior. Journal of Computational and Graphical Statistics, accepted for publication. [PDF]
(4) Dobra, A., Briollais, L., Jarjanazi, H., Ozcelik, H. and Massam, H. (2010). Applications of the mode oriented stochastic search (MOSS) algorithm for discrete multi-way data to genomewide studies. To appear in the volume Bayesian Modeling in Bioinformatics, Taylor & Francis (D. Dey, S. Ghosh and B. Mallick, eds.). [PDF]
2009
(1) Dobra, A. (2009). Variable selection and dependency networks for genomewide data. Biostatistics, 10, 621-639. [PDF]
(2) Dobra, A. (2009). Computing exact p-values in incomplete multi-way tables. Technical Report No. 548, Department of Statistics, University of Washington. [PDF]
(3) Dobra, A. and Lenkoski, A. (2009). Copula Gaussian graphical models. Technical Report No. 555, Department of Statistics, University of Washington. [PDF]
(4) Dobra, A., Eicher, T.S. and Lenkoski, A. (2009). Modeling uncertainty in macroeconomic growth determinants using Gaussian graphical models. Statistical Methodology, accepted for publication. [PDF]
(5) Massam, H., Liu, J. and Dobra, A. (2009). A conjugate prior for discrete hierarchical log-linear models. The Annals of Statistics, 37, 3431-3467. [PDF]
(6) Dobra, A. and Massam, H. (2009). The mode oriented stochastic search (MOSS) algorithm for log-linear models with conjugate priors. Statistical Methodology, in press. [PDF]
(7) Dobra, A., Fienberg, S.E., Rinaldo, A., Slavkovic, A. and Zhou, Y. (2009). Algebraic statistics and contingency table problems: estimation and disclosure limitation. IMA Volume 149 on Emerging applications of algebraic geometry (S. Sullivant and M. Putinar, eds.). Springer Science, 63-88. [PDF]
2007
(1) Hans, C., Dobra, A. and West, M. (2007). Shotgun stochastic search for "large p" regression. Journal of the American Statistical Association, 102, 507-516. [PDF]
2006
(1) Dobra, A., Tebaldi, C. and West, M. (2006). Data augmentation in multi-way contingency tables with fixed marginal totals. Journal of Statistical Planning and Inference, 136, 355-372. [PDF]
(2) Huber, M., Chen, Y., Dinwoodie, I., Dobra, A. and Nicholas, M. (2006). Monte Carlo algorithms for Hardy-Weinberg proportions. Biometrics, 62, 49-53. [PDF]
2005
(1) Chen, Y., Dinwoodie, I., Dobra, A. and Huber, M. (2005). Lattice points, contingency tables, and sampling. Contemporary Mathematics, 374, 65-78. [PDF]
(2) Jones, B., Carvalho, C., Dobra, A., Hans, C., Carter, C. and West, M. (2005). Experiments in stochastic computation for high dimensional graphical models. Statistical Science, 20, 388-400. [PDF]
(3) DeLong, M., Yao, G., Wang, Q., Dobra, A., Black, E.P., Chang, J.T., Bild, A., West, M., Nevins, J.R. and Dressman, H. (2005). DIG - a system for gene annotation and functional discovery. Bioinformatics, 21, 2957-2959. [PDF]
(4) Rich, J.N., Hans, C., Jones, B., Iversen, E.S., McClendon, R.E., Rasheed, B.K.A., Dobra, A., Dressman, H.K., Bigner, D.D., Nevins, J.R. and West, M. (2005). Gene expression profiling and analysis in graphical association studies in glioblastoma survival. Cancer Research, 65, 4051-4058. [PDF]
2004
(1) Dobra, A. and Sullivant, S. (2004). A divide-and-conquer algorithm for generating Markov bases of multi-way tables. Computational Statistics, 19, 347-366. [PDF]
(2) Hauser, E.R., Gregory, S., Seo, D., Dobra, A., Iversen, E., Karra, R., Haynes, C., Stenger, J., Xu, H., Wang, L., Huang, L., West, M., Sketch, M., Vance, J., Kraus, W.E. and Goldschmidt, P. (2004). Convergence of genome-wide expression analysis and genome-wide linkage analysis identifies candidate genes for atherosclerosis. Circulation 110(17:Supplement):III823.
2003
(1) Dobra, A., Jones, B., Hans, C., Nevins, J. and West, M. (2003). Sparse graphical models for exploring gene expression data. Journal of Multivariate Analysis, special issue on Multivariate Methods in Genomic Data Analysis, 90, 196-212. [PDF]
(2) Dobra, A. (2003). Markov bases for decomposable graphical models. Bernoulli, 9, No. 6, 1-16. [PDF]
(3) Dobra, A., Karr, A. and Sanil, A. (2003). Preserving confidentiality of high-dimensional tabulated data: statistical and computational issues. Statistics and Computing, 13, 363-370. [PDF]
(4) Dobra, A., Fienberg, S. E. and Trottini, M. (2003). Assessing the risk of disclosure of confidential categorical data. Bayesian Statistics 7 (J. M. Bernardo, M. J. Bayarri, J. O. Berger, A. P. Dawid, D. Heckerman, A. F. M. Smith and M. West, eds.), Oxford University Press, 125-144. [PDF]
(5) Dobra, A. and Fienberg, S. E. (2003). How large is the World Wide Web? Web Dynamics (M. Levene and A. Poulovassilis, eds.), Springer-Verlag, 23-45. [PDF]
(6) Dobra, A. and Fienberg, S. E. (2003). Bounding entries in multi-way contingency tables given a set of marginal totals. In Y. Haitovsky, H. R. Lerche and Y. Ritov, editors, Foundations of Statistical Inference, Proceedings of the Shoresh Conference 2000, 3-16. Springer-Verlag, Berlin. [PDF]
2002
(1) Dobra, A., Karr, A., Sanil, A. and Fienberg, S. E. (2002). Software systems for tabular data releases. International Journal of Uncertainty, Fuzziness and Knowledge Based Systems, special issue on Aggregation and Risk Assessment on Statistical Disclosure Control, 10, 529-544. [PDF]
(2) Karr, A., Dobra, A. and Sanil, A. (2002). Table servers: protecting confidentiality in tabular data releases. Communications of the ACM, special issue on Digital Government, 46, No. 1, 57-58. [PDF]
(3) Dobra, A., Erosheva, E. A. and Fienberg, S. E. (2002). Disclosure limitation methods based on bounds for large contingency tables with application to disability data. In H. Bozdogan, ed., Statistical Data Mining and Knowledge Discovery, CRC Press. [PDF]
2001
(1) Dobra, A. and Fienberg, S. E. (2001). Bounds for cell entries in contingency tables induced by fixed marginal totals. UNECE Statistical Journal, 18, 363-371. [PDF]
2000
(1) Dobra, A. and Fienberg, S. E. (2000). Bounds for cell entries in contingency tables given marginal totals and decomposable graphs. Proceedings of the National Academy of Sciences, 97, No. 22, 11885-11892. [PDF]

Software

BMSS: Learns dependency networks with continuous and binary variables.
GSA: Calculates bounds, enumerates all feasible tables and performs exact testing for multi-way contingency tables with fixed marginal totals.
MC3TABLES: Performs the MC3 stochastic search for decomposable, graphical and hierarchical log-linear models.
MOSSTABLES: Performs the mode oriented stochastic search (MOSS) for cluster, decomposable, graphical and hierarchical log-linear models.
MOSSLARGETABLES: Performs the mode oriented stochastic search (MOSS) for regressions with discrete variables. It also does the MOSS log-linear search for each relevant regression.