Docente
Lecturer
- Paolo Frasconi
- Dipartimento di Sistemi e Informatica
- Email:
- Ricevimento: Giovedi 15:00-17:00Office hours: Thursday 3:00pm-5:00pm
Orario delle lezioni
Class schedule
| Mercoledi | 16:00-18:00 | Aula 205 (S. Marta) |
| Giovedi | 17:00-19:00 | Aula 205 (S. Marta) |
| Venerdi | 10:15-13:15 | Aula 204 (S. Marta) |
| Wednesday | 4:00pm-6:00pm | Room 205 (S. Marta) |
| Thursday | 5:00pm-7:00pm | Room 205 (S. Marta) |
| Friday | 10:15am-1:15pm | Room 204 (S. Marta) |
Obiettivi
Goals
- Al termine del corso conoscerete numerosi algoritmi fondamentali ed alcuni algoritmi avanzati per l'apprendimento statistico, avrete basi di teoria computazionale dell'apprendimento e sarete in grado di progettare soluzioni allo stato dell'arte per risolvere problemi applicativi.
- You will learn about several fundamental and some advanced algorithms for statistical learning, you will know the basics of computational learning theory, and will be able to design state-of-the-art solutions to application problems.
Crediti
Credits
- L'insegnamento vale 9 crediti formativi, formalmente corrispondenti a circa 72 ore di didattica assistita (lezioni ed esercitazioni in classe), e 153 ore di studio individuale ed attività sperimentale.
- Per altri corsi di laurea (p.es. Ingegneria dell'automazione e Informatica per la facoltà di Scienze) il corso vale i crediti formativi stabiliti dal vostro ordinamento (tipicamente 5). In questi casi potete abbandonare le lezioni una volta conclusa la parte sui modelli grafici probabilistici.
- The course is worth 9 credits (about 72 hours in class, and 153 hours of individual study and experimental work).
- For other degree programs (e.g. M.Sc. in Automation, or M.Sc. in Computer Science) the course is worth the credits that have been assigned to it in your curriculum (usually 5). In these cases, you may drop classes after we complete the part on probabilistic graphical models.
Prerequisiti
Prerequisites
- I prerequisiti necessari includono una buona conoscenza di un linguaggio di programmazione e solide basi di matematica (analisi, algebra lineare, teoria della probabilità). Conoscenze preliminari di tecniche di ottimizzazione e di statistica sono utili ma non strettamente necessarie.
- A good knowledge of a programming language, and a solid background in mathematics (calculus, linear algebra, and probability theory) are necessary prerequisites to this course. Previous knowledge of optimization techniques and statistics would be useful but not strictly necessary.
Laboratorio virtuale
Virtual lab
- Potete esercitarvi sugli algoritmi esposti durante il corso e preparare l'esame utilizzando il laboratorio virtuale del corso di Laurea Magistrale in Ingegneria Informatica. Contattatemi con nome, cognome e numero di matricola per ottenere l'accesso al laboratorio
- You can practice on the main algorithms and prepare the exams on the M.Sc. Computer Engineering degree program virtual lab. Please contact me providing name and student ID to get access to the lab.
Esame
Exams
- L'esame consiste di un'unica prova orale. L'argomento è a vostra scelta ma vi consiglio di concordarlo con me prima di iniziare la preparazione.
- Di solito viene assegnata la lettura di alcuni articoli scientifici e potrebbe essere richiesto riprodurre alcuni risultati sperimentali.
- Dovrete esporre brevemente (30 minuti) il vostro argomento. Assicuratevi che la presentazione includa un'introduzione al problema studiato, una breve rassegna della letteratura rilevante, descrizione tecnica del metodo e, se appropriato, una descrizione dettagliata del lavoro sperimentale. Potete usare strumenti multimediali oer la vostra presentazione. Siete responsabili della comprensione dei concetti rilevanti e della teoria.
- Potete formare gruppi di due studenti (tre in casi eccezionali e motivati) per lavori di tipo sperimentale. In questi casi assicuratevi che il contributo apportato da ciascuno di voi sia chiaramente riconoscibile.
- A partire da quest'anno la verbalizzazione è elettronica e pertanto è obbligatoria la registrazione online usando questo link. Per offrire maggiore flessibilità, le date effettive di esame non sono esattamente legate al calendario. Per favore contattatemi via email per concordare la data del vostro esame.
- There is a single oral final exam. You can choose the exam topic but you are strongly advised to discuss with me before you begin working on it.
- Typically, you will be assigned a set of papers to read and you may be asked to reproduce some experimental results.
- You will be required to give a short (30 min) presentation during the exam. Please ensure that your presentation includes an introduction to the problem being addressed, a brief review of relevant literature, technical derivation of methods, and, if appropriate, a detailed description of experimental work. You are allowed to use multimedia tools to prepare your presentation. You are responsible for understanding all the relevant concepts and the underlying theory.
- You can work in groups of two to carry out experimental works (three is an exceptional number that you must motivate clearly). If you do so, please ensure that personal contributions to the overall work are clearly identifiable.
- Exams will be registered electronically starting from this year. Therefore you must enroll online using this link. To provide extra flexibility, actual exam dates are not exactly bound to the calendar. Please send me email to schedule your exam.
Libri suggeriti
Suggested books
- [HTF09] T. Hastie, R. Tibshirani, and J. Friedman (2009) The Elements of Statistical Learning Springer (free PDF copy available).
- [KF09] D. Koller and N. Friedman (2009) Probabilistic Graphical Models: Principles and Techniques MIT Press
- [D08] L. De Raedt (2008) Logical and Relational Learning. Springer.
- + Altri libri potenzialmente utili:
- + Additional books that may be useful:
Programma sintetico
Topics
- Apprendimento con supervisione. Regressione. Classificazione. Teoria computazionale dell'apprendimento. Modelli lineari generalizzati. Apprendimento non supervisionato. Macchine a vettori di supporto. Nuclei. Boosting e bagging. Modelli grafici probabilistici (semantica, inferenza, apprendimento di parametri e struttura). Apprendimento con uscite strutturate. Programmazione logica induttiva e apprendimento statistico-relazionale. Applicazioni.
- Supervisised learning. Regression. Classification. Computational learning theory. Generalized linear models. Unsupervised learning. Support vector machines. Kernels. Boosting and bagging. Probabilistici graphical models (semantics, inference, parameters and structure learning). Structured-output learning. Inductive logic programming and statistical relational learning. Applications.
Calendario e materiale didattico
Calendar and handouts
| Data | Argomenti | Letture e materiale* |
| 13/10/2010 | Aspetti organizzativi del corso. Introduzione: l'apprendimento automatico per risolvere problemi difficili in IA. | HTB09 1. |
|---|---|---|
| 14/10/2010 | Apprendimento con supervisione. Tipologie di ingressi e uscite. Esempi di applicazioni. Compiti: regressione, classificazione, classificazione multi-classe, ranking, uscite strutturate. Schema astratto di algoritmo di apprendimento supervisionato. Assunzioni sui dati. | HTB09 2. Appunti della lezione |
| 15/10/2010 | Regressione lineare ordinaria. Stima ai minimi quadrati. Interpretazione geometrica. Analisi statistica. Teorema di Gauss-Markov. Decomposizione bias-varianza. | HTB09 3.2 Dimostrazione del teorema di Gauss-Markov |
| 21/10/2010 | Selezione di un sottoinsieme di predittori: best-subset, forward selection, backward selection. Regressione ridge e lasso. Discussione. | HTB09 3.3, 3.4 |
| 22/10/2010 | Il principio di massima verosimiglianza. Esempi. Proprietà degli stimatori a massima verosimiglianza: consistenza, equivarianza, gaussianità asintotica, ottimalità. Minimi quadrati per la regressione come caso particolare di massima verosimiglianza. | Wasserman 2004, cap. 9 Massima verosimiglianza |
| 27/10/2010 | Modelli generativi per la classificazione. Analisi discriminante lineare (Gaussiana). Errore di Bayes. Esempi. | HTB09 4.3. DHS 5.8.2 |
| 28/10/2010 | Regressione logistica. Ottimizzazione con Newton-Raphson. Modelli lineari generalizzati (esempio: regressione di Poisson). | HTB09 4.4. M. Jordan. Why the logistic function? - UAI 1995 |
| 29/10/2010 | Regressione softmax (regressione logistica multiclasse). Introduzione a Naive Bayes. | HTB09 4.4, 6.6.3. T. Mitchell: Generative and Discriminative Classifiers: Naive Bayes and Logistic Regression |
| 3/11/2010 | Naive Bayes. Laplace smoothing. Caso di attributi categorici. Discretizzazione di attributi continui. Modello multinomiale per la categorizzazione del testo. |
A. McCallum and K. Nigam.
A Comparison of Event Models for Naive Bayes Text Classification
- AAAI 1998
U. Fayyad and K. Irani: Multi-interval discretization of continuous-valued attributes for classification learning - IJCAI 1993 |
| 4/11/2010 | Decomposizione dell'errore di generalizzazione. Learning curves. Naive Bayes vs. Logistic Regression. Introduzione a SVM: Iperpiano a massimo margine. | A. Ng and M. Jordan. On Discriminative vs. Generative Classifiers: A Comparison of Logistic Regression and Naive Bayes - NIPS 2002 |
| 10/11/2010 | Iperpiano a massimo margine (MMH). Teoria di Karush Kuhn Tucker per problemi di ottimizzazione convessi. Struttura della soluzione di MMH. | HTB09 4.5. Schoelkopf and Smola (2002) Learning with Kernels Ch. 6, 7.3 |
| 11/11/2010 | Macchine a vettori di supporto (C-SVC, nu-SVC). Analisi dei vettori di supporto. Risolutori per SVM. SMO. SVM come minimizzazione di una loss empirica con regolarizzazione. Hinge vs. logit (regressione logistica) loss. | HTB09 12.1, 12.2.
L. Bottou and C.J. Lin (2006). Support Vector Machine Solvers. L. Bottou Trade-offs of Large-Scale Learning - NIPS 2008 LibSVM software |
| 12/11/2010 | Regressione a vettori di supporto. Introduzione informale ai kernels: Kernel polinomiale e Gaussiano. Perceptron e Kernel Perceptron. Voted-perceptron. | HTB09 12.3.1, 12.3.2. Cristianini and Shawe-Taylor, Ch. 2 Y. Freund and R.E. Schapire: Large Margin Classification Using the Perceptron Algorithm |
| 17/11/2010 | Teorema di Mercer. Spazi di Hilbert a nucleo riproducente. Teorema di rappresentazione. Kernel ridge regression. | HTB09 5.8, 12.3 T. Poggio and S. Smale The Mathematics of Learning: Dealing with Data (quite readable) F. Cucker and S. Smale On the mathematical foundations of learning (very detailed but somewhat difficult paper) |
| 18/11/2010 | Progetto di kernels. Proprietā di chiusura. Min-kernel. Kernels su istogrammi. Pyramidal match kernel. Kernels convoluzionali. |
K. Grauman and T. Derrel:
The Pyramid Match Kernel: Discriminative Classification with Sets of Image Features - ICCV 2005 D. Haussler (1999): Convolution Kernels on Discrete Structures - MLJ 1999 |
| 19/11/2010 | Kernels per dati strutturati: k-grams per classificazione di proteine, kernels per alberi sintattici. Feature hashing. |
C. Leslie et al.:
The spectrum kernel: A string kernel for SVM protein classification - PSB 2002
P. Kuksa et al.: Scalable Algorithms for String Kernels with Inexact Matching - NIPS 2008 M. Collins and N. Duffy Convolution Kernels for Natural Language - NIPS 2001 K. Weinberger et al. Feature Hashing for Large Scale Multitask Learning ICML 2009. |
| 24/11/2010 | Panoramica sui kernels per grafi. Teoria statistica dell'apprendimento. Bounds per la generalizzazione. Caso di spazi di ipotesi a dimensione finita. VC-dimension e bound di Vapnick. |
L. Ralaivola et al. (2005)
Graph kernels for chemical informatics Bousquet et al. (2004)Introduction to Statistical Learning Theory |
| 25/11/2010 | Introduzione ai modelli grafici probabilistici. Probabilitā soggettive (credenze). Indipendenza condizionale e modelli di indipendenza. Mappe di indipendenza non orientate (Markov networks). |
Details in KF09 Ch 1-4 D. Koller et al. (2007)Graphical models in a Nutshell in L. Getoor and B. Taskar: Introduction to Statistical Relational Learning |
| 26/11/2010 | Reti di Markov: decomposizione della distribuzione congiunta (Teorema di Hammersley-Clifford). Reti di Bayes. D-separazione. Parametrizzazione. Introduzione all'inferenza (eliminazione di variabili). | A. Darwiche Bayesian Networks Communications of the ACM. |
| 1/12/2010 | Inferenza con l'algoritmo del junction tree. |
D. Koller et al. (2007)Graphical models in a Nutshell in L. Getoor and B. Taskar: Introduction to Statistical Relational Learning |
| 02/12/2010 | Software per reti Bayesiane. Apprendimento dei parmetri nelle reti Bayesiane: Stima a massima verosimiglianza con dati completamente osservati. Apprendimento Bayesiano, priors Beta e di Dirichlet. |
Bayesian networks software: Hugin,
Genie. D. Heckerman (1997) Bayesian Networks for Data Mining Data Mining and Knowledge Discovery. |
| 03/12/2010 | Apprendimento con dati mancanti. Expectation-Maximization (EM). Modelli a mistura e misture di gaussiane (GMM). K-means. | HTB09 8.5, 13.2.1, 13.2.3, 14.3.6, 14.3.7. T. Minka Expectation-Maximization as lower bound maximization Tech. Report. 1999. |
| 9/12/2010 | EM per l'apprendimento nelle reti Bayesiane.
Plates, dipendenze tra i parametri nelle reti Bayesiane.
Idee base su reti probabilistiche relazionali. Apprendimento semi-supervisionato. Auto-addestramento, uso di modelli a mistura, Naive Bayes semi-supervisionato. |
K. Nigam et al. (2006)
Semi-Supervised Text Classification Using EM in O Chapelle et al.:
Semi-supervised learning L. Getoor et al. (2007) Probabilistic Relational Models in L. Getoor and B. Taskar: Introduction to Statistical Relational Learning |
| 10/12/2010 |
Co-training. Semi-supervised SVM. Regolarizzazione entropica per regressione logistica. Introduzione agli Hidden Markov Models |
Zhu and Goldberg
Introduction to Semi-Supervised Learning 2009. L.R. Rabiner A Tutorial on Hidden Markov Models and Selected Applications in Speech Recognition. Proc. IEEE 77(2):257-286, 1989. |
| 15/12/2010 | Hidden Markov Models (Inferenza e procedura forward-backward di Baum-Welch) | L.R. Rabiner A Tutorial on Hidden Markov Models and Selected Applications in Speech Recognition. Proc. IEEE 77(2):257-286, 1989. |
| 16/12/2010 | Algoritmo di Viterbi per HMMs. Apprendimento con EM. HMMs discriminativi (Maximum Entropy Markov Models). | |
| 17/12/2010 | Discriminative sequence labeling: Conditional random fields. Structured-output learning and SVM-HMM |
C. Sutton and A. McCallum (2007)
An Introduction to Conditional Random Fields for Relational Learning.
in L. Getoor and B. Taskar: Introduction to Statistical Relational Learning Y. Altun et al (2007). Support Vector Machine Learning for Interdependent and Structured Output Spaces. in BakIr et al: Predicting Structured Data Mallet (LDA and CFR Software) |
| 12/1/2011 | Apprendumento statistico relazionale. Introduzione alla logica di Markov. | P. Domingos et al. (2008) Markov Logic in L. De Raedt et al. Probabilistic Inductive Logic Programming. |
| 13/1/2011 | Logica di Markov e applicazioni. |
M. Lippi and P. Frasconi (2009)
Prediction of protein beta-residue contacts by Markov logic
networks with grounding-specific weights Bioinformatics 25(18):2326-2333. Alchemy (Markov Logic software) |
| 14/1/2011 | Topic modeling. Latent Dirichlet Allocation. |
D. Blei et al. (2003)
Latent Dirichlet Allocation Journal of Machine Learning Research 3, 993--1022 T.L Griffiths and M. Steyvers (2004) Finding scientific topics Proceedings of the National Academy of Sciences 101(Supp. 1):5228 Y. Wang (2008) Distributed Gibbs Sampling of Latent Topic Models: The Gritty Details Mallet (LDA and CFR Software) |
| Date | Topics | Readings and handouts* |
| 13/10/2010 | Administrivia. Introduction. Solving difficult AI problems: the machine learning approach. | HTB09 1. |
|---|---|---|
| 14/10/2010 | Supervised learning. Types of inputs and outputs. Application examples. Tasks: regression, classification, multi-class classification, ranking, structured-output learning. Abstract view of a supervised learning algorithm. Assumptions on data. | HTB09 2. Class notes |
| 15/10/2010 | Ordinary Linear Regression. Least squares estimates. Geometric interpretation. Statistical analysis. Gauss-Markov Theorem. Bias-variance decomposition. | HTB09 3.2. |
| 21/10/2010 | Subset selection: best subset, forward and backward stepwise. Ridge regression and Lasso. Discussion. | HTB09 3.3, 3.4 |
| 22/10/2010 | The maximum likelihood principle. Examples. Properties of maximum likelihood estimators: consistency, equivariance, asyntotic normality, optimality. Ordinary least squares as a special case of MLE. | Wasserman 2004, Ch. 9 Maximum likelihood |
| 27/10/2010 | Generative models for classification. Linear (Gaussian) discriminant analysis. Bayes risk. Esempi. | HTB09 4.3. DHS 5.8.2 |
| 28/10/2010 | Logistic regression. Newton-Raphson optimization. Generalized linear models (example: Poisson regression). | HTB09 4.4. M. Jordan. Why the logistic function? - UAI 1995 |
| 29/10/2010 | Softmax regression (multiclass logistic regression). Introduction to Naive Bayes. | HTB09 4.4, 6.6.3. T. Mitchell Generative and Discriminative Classifiers: Naive Bayes and Logistic Regression |
| 3/11/2010 | Naive Bayes. Laplace smoothing. General categorical attributes. Discretization of continuous attributes. Multinomial model for text categorization. |
A. McCallum and K. Nigam.
A Comparison of Event Models for Naive Bayes Text Classification
- AAAI 1998
U. Fayyad and K. Irani: Multi-interval discretization of continuous-valued attributes for classification learning - IJCAI 1993 |
| 4/11/2010 | Decomposition of the generalization error. Learning curves. Naive Bayes vs. Logistic Regression. Introduction to SVM: Maximal margin hyperplane. | A. Ng and M. Jordan. On Discriminative vs. Generative Classifiers: A Comparison of Logistic Regression and Naive Bayes - NIPS 2002 |
| 10/11/2010 | Maximal margin hyperplane. Karush Kuhn Tucker theory for solving convex optimization problems in dual form. Structure of the MMH solution. | HTB09 4.5. Schoelkopf and Smola (2002) Learning with Kernels Ch. 6, 7.3 |
| 11/11/2010 | Support vector machines (C-SVC, nu-SVC). Analysis of support vectors. Overview of SVM solvers. SMO. SVM as regularized empirical loss minimization. Hinge vs. logit (logistic regression) loss. | HTB09 12.1, 12.2.
L. Bottou and C.J. Lin (2006). Support Vector Machine Solvers. L. Bottou Trade-offs of Large-Scale Learning - NIPS 2008 LibSVM software |
| 12/11/2010 | Support vector regression. Informal introduction to kernels: Polynomial and Gaussian kernel. Perceptron and Kernel Perceptron. Voted-perceptron. | HTB09 12.3.1, 12.3.2. Cristianini and Shawe-Taylor, Ch. 2 Y. Freund and R.E. Schapire: Large Margin Classification Using the Perceptron Algorithm |
| 17/11/2010 | Mercer's theorem. Reproducing kernel Hilbert spaces. Representer theorem. Kernel ridge regression. | HTB09 5.8, 12.3 T. Poggio and S. Smale The Mathematics of Learning: Dealing with Data (quite readable) F. Cucker and S. Smale On the mathematical foundations of learning (very detailed but somewhat difficult paper) |
| 18/11/2010 | Designing kernels. Closure properties. Min-kernel. Kernels on histograms. Pyramidal match kernel. Convolution kernels. |
K. Grauman and T. Derrel:
The Pyramid Match Kernel: Discriminative Classification with Sets of Image Features - ICCV 2005 D. Haussler (1999): Convolution Kernels on Discrete Structures - MLJ 1999 |
| 19/11/2010 | Kernels for structured instances: k-grams for protein classification, subtree kernels for parse trees. Feature hashing. |
C. Leslie et al.:
The spectrum kernel: A string kernel for SVM protein classification - PSB 2002
P. Kuksa et al.: Scalable Algorithms for String Kernels with Inexact Matching - NIPS 2008 M. Collins and N. Duffy Convolution Kernels for Natural Language - NIPS 2001 K. Weinberger et al. Feature Hashing for Large Scale Multitask Learning ICML 2009. |
| 24/11/2010 | Overview of graph kernels. Learning theory. Generalization bounds. Case of finite-dimensional hypothesis spaces. VC-dimension and Vapnick's bound. |
L. Ralaivola et al. (2005)
Graph kernels for chemical informatics Bousquet et al. (2004)Introduction to Statistical Learning Theory |
| 25/11/2010 | Introduction to probabilistic graphical models. Review of probabilities and subjective probability (beliefs). Conditional independence and dependency models. Undirected I-maps (Markov networks). |
Details in KF09 Ch 1-4 D. Koller et al. (2007)Graphical models in a Nutshell in L. Getoor and B. Taskar: Introduction to Statistical Relational Learning |
| 26/11/2010 | Markov networks: decomposition of the joint distribution (Hammersley-Clifford theorem). Bayesian networks. D-separation. Parameterization. Introduction to inference (variable elimination). | A. Darwiche Bayesian Networks Communications of the ACM. |
| 1/12/2010 | Inference with the junction tree algorithm. |
D. Koller et al. (2007)Graphical models in a Nutshell in L. Getoor and B. Taskar: Introduction to Statistical Relational Learning |
| 02/12/2010 | Using Bayesian network software. Parameter learning in Bayesian networks: Maximum likelihood estimation with complete data. Bayesian learning, Beta and Dirichlet priors. |
Bayesian networks software: Hugin,
Genie. D. Heckerman (1997) Bayesian Networks for Data Mining Data Mining and Knowledge Discovery. |
| 03/12/2010 | Learning with missing data. Expectation-Maximization (EM). Mixture models and mixtures of Gaussians (GMM). K-means. | HTB09 8.5, 13.2.1, 13.2.3, 14.3.6, 14.3.7. T. Minka Expectation-Maximization as lower bound maximization Tech. Report. 1999. |
| 9/12/2010 | EM for learning Bayesian Networks.
Plates, parameter dependences in Bayesian networks, core ideas of probabilistic relational networks. Semi-supervised learning: self-training, using mixture models, semi-supervised Naive Bayes. |
K. Nigam et al. (2006)
Semi-Supervised Text Classification Using EM in O Chapelle et al.:
Semi-supervised learning L. Getoor et al. (2007) Probabilistic Relational Models in L. Getoor and B. Taskar: Introduction to Statistical Relational Learning |
| 10/12/2010 |
Co-training. Semi-supervised SVM and Logistic Regression. Introduction to Hidden Markov Models |
Zhu and Goldberg
Introduction to Semi-Supervised Learning 2009. L.R. Rabiner A Tutorial on Hidden Markov Models and Selected Applications in Speech Recognition. Proc. IEEE 77(2):257-286, 1989. |
| 15/12/2010 | Hidden Markov Models (Baum-Welch's forward backward procedure) | L.R. Rabiner A Tutorial on Hidden Markov Models and Selected Applications in Speech Recognition. Proc. IEEE 77(2):257-286, 1989. |
| 16/12/2010 | Viterbi Algorithm for HMMs. Learning with EM. Discriminative HMMs (Maximum Entropy Markov Models). | |
| 17/12/2010 | Discriminative sequence labeling: Conditional random fields. Structured-output learning and SVM-HMM |
C. Sutton and A. McCallum (2007)
An Introduction to Conditional Random Fields for Relational Learning.
in L. Getoor and B. Taskar: Introduction to Statistical Relational Learning Y. Altun et al (2007). Support Vector Machine Learning for Interdependent and Structured Output Spaces. in BakIr et al: Predicting Structured Data Mallet (LDA and CFR Software) |
| 12/1/2011 | Overview of statistical relational learning. Introduction to Markov logic. | P. Domingos et al. (2008) Markov Logic in L. De Raedt et al. Probabilistic Inductive Logic Programming. |
| 13/1/2011 | Markov logic and applications. |
M. Lippi and P. Frasconi (2009)
Prediction of protein beta-residue contacts by Markov logic
networks with grounding-specific weights Bioinformatics 25(18):2326-2333. Alchemy (Markov Logic software) |
| 14/1/2011 | Topic modeling. Latent Dirichlet Allocation. |
D. Blei et al. (2003)
Latent Dirichlet Allocation Journal of Machine Learning Research 3, 993--1022 T.L Griffiths and M. Steyvers (2004) Finding scientific topics Proceedings of the National Academy of Sciences 101(Supp. 1):5228 Y. Wang (2008) Distributed Gibbs Sampling of Latent Topic Models: The Gritty Details Mallet (LDA and CFR Software) |

