
Specification Of Multinomial Model In Tensorflow Probability

I am playing with a mixed multinomial discrete choice model in TensorFlow Probability. The function should take an input of a choice among 3 alternatives. The chosen alternative is …

Solution 1:

More than likely this is an issue with your initial state and number of chains. You can try to initialize your kernel outside of the sampler call:

nuts_kernel = tfp.mcmc.NoUTurnSampler(
    target_log_prob_fn=mmnl_log_prob,
    step_size=init_step_size,
)
adapt_nuts_kernel = tfp.mcmc.DualAveragingStepSizeAdaptation(
    inner_kernel=nuts_kernel,
    num_adaptation_steps=nuts_burnin,
    step_size_getter_fn=lambda pkr: pkr.step_size,
    log_accept_prob_getter_fn=lambda pkr: pkr.log_accept_ratio,
    step_size_setter_fn=lambda pkr, new_step_size: pkr._replace(step_size=new_step_size),
)

and then do

nuts_kernel.bootstrap_results(initial_state)

and investigate the shapes of the log-likelihood and proposal states that are returned.

Another thing to do is to feed your initial state into your log-likelihood/posterior and check that the dimensions of the returned log-likelihoods match what you expect (if you are running multiple chains, it should return one log-likelihood per chain).

It is my understanding that the batch dimension (# chains) has to be the first one in all your vectorized calculations.
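As a quick shape check for the batch-first convention, here is a NumPy stand-in for a vectorized log-likelihood (the quadratic form is purely illustrative):

```python
import numpy as np

def toy_log_prob(beta):
    # beta: [num_chains, num_params]. Reduce only over the trailing
    # parameter axis, leaving the leading chain axis intact.
    return -0.5 * np.sum(beta**2, axis=-1)

initial_state = np.zeros((10, 3))  # 10 chains, 3 parameters
lp = toy_log_prob(initial_state)
print(lp.shape)  # (10,) -- one log-likelihood per chain
```

Reducing over `axis=-1` rather than summing everything is what keeps the chain dimension alive through the calculation.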

The very last part of my blog post on tensorflow and custom likelihoods has working code for an example that does this.


Solution 2:

I was able to get reasonable results from my model. Thank you to everyone for the help! The following points helped solve the various issues.

  1. Use of JointDistributionSequentialAutoBatched() to produce consistent batch shapes. You need tf-nightly installed for access.

  2. More informative priors for hyperparameters. Because of the exponential transformation inside the Multinomial() distribution, uninformative hyperpriors (e.g., with sigma = 1e5) quickly push large positive numbers into exp(), leading to infinities.

  3. Setting the step size, etc. was also important.

  4. I found an answer by Christopher Suter to a recent question on the TensorFlow Probability forum, specifying a model from Stan, useful. In particular, I adopted the idea of drawing a sample from my prior as the starting point for the initial likelihood parameters.

  5. Despite JointDistributionSequentialAutoBatched() correcting the batch shapes, I went back and corrected my joint distribution shapes so that printing log_prob_parts() gives consistent shapes (i.e., [10,1] for 10 chains). I still get a shape error without using JointDistributionSequentialAutoBatched(), but the combination seems to work.

  6. I separated my affine() into two functions. They do the same thing, but splitting them removes retracing warnings. affine() was able to broadcast its inputs, but their shapes differed, and it was easier to write two functions that set up the inputs with consistent shapes. Differently shaped inputs cause TensorFlow to retrace the function.

