Hi,
I am using Cobaya to test a modified model in CAMB, varying only the two additional modification parameters and keeping all other parameters fixed. I chose fiducial values for the two parameters, say A and B, to generate mock DES data using the DES likelihood, and then fit the model to that data using the DES likelihood only, with A and B varied. The run converges very well (quite small R-1), the best-fit and mean values of A and B are very close to my fiducials, and the A-B contours are smooth. However, the chi2 stats from getdist look something like:
Best fit sample -log(Like) = -1.831643
Ln(mean 1/like) = 1.779169
mean(-Ln(like)) = -0.828389
-Ln(mean like) = -1.151868
2*Var(Ln(like)) = 2.311052
which contains some negative values, and it is strange because the chi2 can never be negative. Also from the margestats file, the result for chi2 is like:
parameter  mean           sddev          lower1          upper1         limit1  lower2          upper2         limit2
chi2*      2.0083858E+00  2.1499081E+00  -1.0033729E-02  2.4238994E+00  two     -2.3517432E-01  6.1697150E+00  two     \chi^2
so the negative chi2 limits might come from the sddev being large compared to the mean.
I searched the chi2 values in the chains and found no negative values, so I was wondering whether this negativity comes from some kind of normalization process in getdist. Are the results trustworthy in this case? Thanks.
Cobaya: negative chi2 for a model

 Posts: 31
 Joined: October 09 2021
 Affiliation: Simon Fraser University

 Posts: 1983
 Joined: September 23 2004
 Affiliation: University of Sussex
Re: Cobaya: negative chi2 for a model
The total also has contributions from the prior. The chi2 for DES should be listed separately in the chain.

 Posts: 31
 Joined: October 09 2021
 Affiliation: Simon Fraser University
Re: Cobaya: negative chi2 for a model
Thanks. I checked the margestats file again and DES Y1 is the only contribution to the chi2:
chi2*                2.0083858E+00  2.1499081E+00  -1.0033729E-02  2.4238994E+00  two  -2.3517432E-01  6.1697150E+00  two
chi2__des_y1.joint*  2.0083858E+00  2.1499081E+00  -1.0033729E-02  2.4238994E+00  two  -2.3517432E-01  6.1697150E+00  two

 Posts: 1983
 Joined: September 23 2004
 Affiliation: University of Sussex
Re: Cobaya: negative chi2 for a model
I guess the negative 2 sigma marginalized limit could be a numerical error, because the chi2 distribution drops very suddenly at zero. You should give getdist the >0 range for any parameter where you know there is a lower bound. The "Best fit sample -log(Like)" must have a negative contribution though, e.g. from priors.
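To illustrate the point about the sharp drop at zero, here is a minimal sketch using plain Gaussian smoothing. It is not GetDist's actual density estimator, and it assumes the chi2 is roughly chi-squared distributed with two degrees of freedom (consistent with the mean of about 2 quoted above, but still an assumption). The function name and the bandwidth value are illustrative only.

```python
import math
import random


def gaussian_kde_mass_below(samples, bandwidth, bound=0.0):
    """Fraction of a plain Gaussian-smoothed density that lies below `bound`."""
    def norm_cdf(z):
        return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    # Each sample contributes a Gaussian kernel; sum the part of each kernel below the bound.
    return sum(norm_cdf((bound - x) / bandwidth) for x in samples) / len(samples)


random.seed(0)
# chi2 with 2 degrees of freedom: density exp(-x/2)/2, largest at x = 0 and zero below it.
chi2_samples = [-2.0 * math.log(1.0 - random.random()) for _ in range(20000)]

# No individual sample is negative ...
print("smallest sample:", min(chi2_samples))
# ... but smoothing that ignores the hard bound pushes some mass below zero.
leak = gaussian_kde_mass_below(chi2_samples, bandwidth=0.3)
print(f"smoothed mass below zero: {leak:.3f}")
```

If the leaked mass exceeds the tail fraction used for a two-tail limit, the lower limit read off the smoothed density comes out negative even though every sample is non-negative. Telling the density estimator about the hard lower bound avoids this.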

 Posts: 31
 Joined: October 09 2021
 Affiliation: Simon Fraser University
Re: Cobaya: negative chi2 for a model
Thanks. It looks like the problem does indeed come from the prior. I noticed that the minuslogpost term in the chain is negative, which seems abnormal. For this simple test, when I set tight priors on the two parameters, they are constrained and converge very well, but with this negative prior contribution. If I set wider priors instead, the chains fail with errors saying "the chains get stuck after many attempts..." and so forth, so the result seems to depend strongly on the prior settings. Does this mean that my tight prior setup is wrong?
Also, I am still confused about why the sampling would produce negative -log(prior) values, since by definition all probabilities (including the prior and the likelihood function) should fall in (0,1), so minus their log should always be positive.

 Posts: 37
 Joined: April 15 2013
 Affiliation: RWTH Aachen
Re: Cobaya: negative chi2 for a model
Two issues here:
* Probabilities do fall between 0 and 1, assuming a properly normalised probability density function p(x). But the density itself, whose minus log is what `minuslogprior` represents, can take any positive value (so its log can be either positive or negative), as long as any probability defined as P(x in [a,b]) = int_a^b p(x) dx is between 0 and 1.
* Prior probabilities in Cobaya are not necessarily normalised. They are if you specify your prior entirely with `prior` fields inside `params` definitions. But they are not if you specify logprior functions via a separate `prior` block, unless you have ensured that yourself.
So in general, it is not an issue that your minuslogprior has either sign.
If your MCMC chains get stuck with a wide prior, you can either:
* Give each parameter a "ref" distribution, specified in the same way as the priors but with tighter bounds (the initial points for the MCMC are drawn from it), as well as a "proposal" field with a reasonable value for the expected standard deviation of the final sample for each parameter (an order-of-magnitude estimate should be enough). If the chains keep getting stuck, reduce the value of "proposal" for each parameter until they no longer do.
* Use PolyChord, which is less efficient but better at finding modes within small priors. If you do, MPI is not necessary but helps a lot.
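As a concrete illustration of the first point, here is a small sketch for a product of normalised uniform priors. The helper name and the bounds are made up for the example and are not taken from the run discussed in this thread.

```python
import math


def minus_log_uniform_prior(bounds):
    """-log of a normalised product of uniform priors.

    `bounds` is a list of (min, max) pairs; the density inside the box is
    1 / prod(max - min), so -log(density) = sum(log(max - min)).
    """
    return sum(math.log(hi - lo) for lo, hi in bounds)


# Tight box: volume 0.5 * 0.2 = 0.1, density 10 > 1, so -log(prior) is negative.
tight = minus_log_uniform_prior([(0.0, 0.5), (0.0, 0.2)])
# Wide box: volume 5 * 20 = 100, density 0.01 < 1, so -log(prior) is positive.
wide = minus_log_uniform_prior([(0.0, 5.0), (0.0, 20.0)])
print(tight, wide)
```

In both cases the density integrates to 1 over the box, so every probability is still between 0 and 1. Only the density value, and hence the sign of its log, changes with the width of the prior.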
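The first option could look roughly like the sketch below, written as a Cobaya input dictionary in Python. All numerical values are placeholders, the parameter names A and B simply follow the original post, and the likelihood and theory entries are assumptions about the setup rather than a copy of it.

```python
# Sketch of a Cobaya input with wide priors, tight "ref" ranges and "proposal" widths.
# All numbers are hypothetical; adapt them to the actual parameters.
info = {
    "likelihood": {"des_y1.joint": None},  # assumed from the chi2__des_y1.joint column
    "theory": {"camb": None},              # the modified CAMB would be configured here
    "params": {
        "A": {
            "prior": {"min": 0.0, "max": 10.0},  # wide prior
            "ref": {"min": 0.9, "max": 1.1},     # initial points are drawn from here
            "proposal": 0.05,                    # rough guess of the posterior std
        },
        "B": {
            "prior": {"min": -5.0, "max": 5.0},
            "ref": {"min": -0.1, "max": 0.1},
            "proposal": 0.02,
        },
    },
    "sampler": {"mcmc": None},  # default MCMC settings
}

# Sanity check: each "ref" range must lie inside the corresponding prior range.
for name, spec in info["params"].items():
    assert spec["prior"]["min"] <= spec["ref"]["min"] < spec["ref"]["max"] <= spec["prior"]["max"], name
```

With Cobaya installed, such a dictionary can be passed to `run` from `cobaya.run`. For the second option, the `sampler` entry would name `polychord` instead of `mcmc`.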