
Weighted Latent Dirichlet Allocation, an Optimized and Generalized Probabilistic Topic Model

Muzafar Rasool Bhat, M. Arif Wani

Abstract


In this study, we introduce a new distribution, the Mixture Weighted Dirichlet Distribution (MWDD), which generalizes several other distributions, viz. the Dirichlet distribution and the Length- or Size-Biased, Area-Biased and Volume-Biased Dirichlet distributions. The aim of this research is to introduce and implement MWDD for probabilistic topic modeling of a corpus of textual data, as an optimized as well as generalized probabilistic topic model. Various statistical and structural properties of the proposed distribution, including its moments about the origin (1st, 2nd and rth), variance and standard deviation, are thoroughly studied. Cora, a dataset of 2410 scientific documents in LDA format suited to probabilistic topic modeling experiments, is used in this study. The generative process of the new probabilistic topic modeling technique using MWDD is also elaborated in this manuscript. To compare the efficiency of the existing model and of various special cases of the proposed model, AIC, AICC, BIC and log-likelihood measures are used. This study concludes that a special case of the proposed model, namely the Mixture Volume Biased Weighted Dirichlet distribution (MVBWDD), is efficient, as it has the lowest AIC, AICC and BIC values, with the mixture parameter p varying from 0 to 1, on probabilistic topic modeling of the dataset.
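The model comparison described above relies on standard information criteria computed from a fitted model's log-likelihood. The following is a minimal sketch of those textbook formulas, not the authors' implementation; the function name `information_criteria`, the parameter names, and the example numbers are hypothetical and chosen only for illustration.

```python
import math


def information_criteria(log_likelihood, k, n):
    """Return AIC, AICc and BIC for a fitted model.

    log_likelihood: maximized log-likelihood of the model
    k: number of estimated parameters (assumed known by the caller)
    n: number of observations (e.g. documents)
    """
    if n - k - 1 <= 0:
        raise ValueError("AICc requires n > k + 1")
    aic = 2 * k - 2 * log_likelihood
    # AICc adds a small-sample correction to AIC.
    aicc = aic + (2 * k * (k + 1)) / (n - k - 1)
    bic = k * math.log(n) - 2 * log_likelihood
    return {"AIC": aic, "AICc": aicc, "BIC": bic}


# Hypothetical values for illustration only; these are not results from the study.
candidates = {
    "model_a": information_criteria(log_likelihood=-1050.0, k=4, n=200),
    "model_b": information_criteria(log_likelihood=-1040.0, k=5, n=200),
}
# Lower values indicate a better trade-off between fit and complexity.
best_by_bic = min(candidates, key=lambda name: candidates[name]["BIC"])
```

Under each criterion the model with the lowest value is preferred, which is the sense in which the abstract calls the special case with the lowest AIC, AICC and BIC efficient.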


Keywords


Dirichlet distribution; LDA; Weighted Dirichlet Distribution (WLDA); Probabilistic Topic Modelling; Mixture Weighted Dirichlet Distribution (MWDD); MWDD Topic Model.
