# Estimation and Sampling in Slow Mixing Markov Processes

Date: 2015-08-13 Add to Google CalendarTime: 1130am-2pm

Location: Holmes Hall 389

Speaker: Ramezan Paravitorghabeh, candidate for PhD, advisor: Dr. Narayana Prasad Santhanam

Abstract

This dissertation concentrates on estimation and sampling aspects in slow mixing Markov Processes. When a process has slow mixing property, a large number of observations is needed before exploring the state space of the process properly. Empirical properties of nite sized samples from Markov processes need not reflect stationary properties. When empirical counts of samples eventually reflect the stationary properties, we say that the process has mixed. The contributions of this work revolve around interpretation of samples before mixing has occurred. In first part of the work, we deal with estimation of parameters of the process, while in the second part, we will show how the evolution of the samples obtained from the process can be exploited to identify subsets of state space which communicate well together. This insight will be translated into algorithmic rules for detecting communities in graphs.

Estimation: Given a length-n sample generated by an unknown Markov process (model ) over a nite alphabet A and any string w of symbols from A, we want estimates of the conditional probability distribution of symbols following w (model parameters), as well as the stationary probability of w. Two distinct problems that complicate estimation in this setting are (i) long memory, and (ii) slow mixing which could happen even with only one bit of memory. Any consistent estimator can only converge pointwise over the class of all ergodic Markov models. Combining results in universal compression with Aldous' coupling arguments, we obtain sufficient conditions on the length-n sample (even for slow mixing models) to identify when naive (i) estimates of the model parameters and (ii) estimates related to the stationary probabilities are accurate; and also bound the deviations of the naive estimates from true values.

Community Detection: We introduce a new randomized algorithm for community detection in graphs by defining Markov random walks on them. The mixing properties of the random walks we construct are used to identify communities. The more polarized the communities are, the slower mixing the random walk is. The algorithm creates a random walk on the nodes of the graph.

We start different, coupled random walks from different nodes. We then adapt the coupling from the past approach to identify clusters before the chain mixes (rather than sample from the stationary distribution). The Markov random walks are built such that the restriction of the random walk to within any one cluster mixes much faster than the overall walk itself. Finally, we analyze the performance of the algorithm on specific graph structures, including the Stochastic Block Models, LFR benchmark random graphs and real world networks.

<back>