Latent Variable Models
In generative modeling we want to model the generative distribution of the observed variables , [^1]. Take for example a set of images that depict a pendulum excited by a force. We would like to generate the successive frames of the pendulum swinging behavior and we do so we are assisted by a set of latent variables that represent the underlying laws of physics. Generative modeling is especially well suited for:- Testing out hypotheses about the underlying rules that generated the observed data. Such rules can also offer interpretable models.
- Ability to capture causal relationships, since the ability of a factor to generate data very close to the ones observed, offers a strong indication of such relationship.
- Semi-supervised learning where the generated data are very close to already labeled data and therefore can improve classification model accuracy.
- Think of what factor make a person a real one. For example, can we have a person without a head (assume we are interested in living humans) ? Lets call the headedness factor a latent variable and similarly specify the other factors grouping them together into a set of rendom variables that we will call , captured via their distribution .
- Generate an image from the latent representation. This is a process that takes the latent representation and generates an image of the person (hopefully realistic).
Probabilistic Graphical Models
A graph representation, the Probabilistic Graphical Model (PGM) (also called Bayesian network) can be used to capture dependencies between the random variables involved in the modeling of a problem. We can use such representations to efficiently compute conditional probabilities using the graph to identify the conditionally independent random variables that are present in our problem. By convention, we represent PGMs as directed graphs, with nodes being the random variables involved in the model and directed edges indicating a parent child relationship, with the arrow pointing to a child, representing that the child nodes are probabilistically conditioned on the parent(s). We have assumed that our model does not have variables involved in directed cycles and therefore we call such graphs Directed Acyclic Graphs (DAGs). In a hypothetical example of a joint distribution with random variables,



