Model Specification#
We cannot begin to do inference on our data without writing out, explicitly, a model for the data. Heuristically, a model ought to be a ‘story for how the observed data was generated’. Our model consists of two components:
A parameterized physical model for the lensing system, consisting of a model for the mass profile of the main lens, potentially a model for the effects of any nearby interlopers, and models for the light profiles of the lens and the source.
A probabilistic model, consisting of a prior for the physical parameters and a likelihood function. Defining a likelihood requires a noise model, which for most purposes will model the noise on a given pixel as the quadrature sum of background Gaussian noise \(\sigma_{bkg}\) (conventionally written as `background_rms`) and Poisson shot noise with exposure time \(t_{exp}\) (conventionally written as `exp_time`).
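To make this noise model concrete, here is a minimal sketch (not the package's implementation; `noise_variance` is a hypothetical helper) of the per-pixel noise variance implied by the description above:

```python
import numpy as np

def noise_variance(model_image, background_rms, exp_time):
    """Per-pixel noise variance: quadrature sum of Gaussian background
    noise and Poisson shot noise (a hypothetical helper, not gigalens API).

    model_image is assumed to be in counts/sec, so the shot-noise
    variance in those units is model_image / exp_time. Negative model
    fluxes are clipped to zero before the Poisson term.
    """
    return background_rms**2 + np.clip(model_image, 0.0, None) / exp_time
```

For example, a pixel with 4 counts/sec, `background_rms = 0.1`, and a 100 s exposure has variance \(0.1^2 + 4/100 = 0.05\).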
Keep in mind that for any given physical model, there can be several valid choices for the probabilistic model. Therefore, in our implementation, we are careful to keep these two components of the model distinct (see Section 2.1 of our paper). The following are our package's high-level descriptions of a physical and a probabilistic model.
- class gigalens.model.PhysicalModel(lenses, lens_light, source_light)[source]#
A physical model for the lensing system.
- Parameters
  lenses (list of `MassProfile`) – A list of mass profiles used to model the deflection
  lens_light (list of `LightProfile`) – A list of light profiles used to model the lens light
  source_light (list of `LightProfile`) – A list of light profiles used to model the source light
- lenses#
  A list of mass profiles used to model the deflection
  - Type
    list of `MassProfile`
- lens_light#
  A list of light profiles used to model the lens light
  - Type
    list of `LightProfile`
- source_light#
  A list of light profiles used to model the source light
  - Type
    list of `LightProfile`
- class gigalens.model.ProbabilisticModel(prior, bij=None, *args)[source]#
A probabilistic model for the lensing system.
- Parameters
prior – Prior distribution of lens parameters
bij – A bijector that can unconstrain physical parameters (e.g., applying an exponential bijector to strictly positive variables)
*args – Information about observed data (typically includes the observed image, estimated noise characteristics, etc.)
- prior#
Prior distribution of lens parameters
- bij#
A bijector that can unconstrain physical parameters (e.g., applying an exponential bijector to strictly positive variables)
- abstract log_prob(simulator, z)[source]#
Returns the unconstrained log posterior density (i.e., includes the Jacobian factor due to the bijector)
- Parameters
  simulator (`LensSimulatorInterface`) – An object that can simulate a lens with unconstrained parameters `z`
  z – Unconstrained parameters
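The Jacobian correction in this unconstrained log posterior can be illustrated with a minimal, framework-free sketch; all names here are hypothetical, and the actual implementation uses TFP bijectors:

```python
import math

def log_prob_unconstrained(z, forward, log_det_jac, log_prior, log_likelihood):
    # Change of variables: x = forward(z) maps unconstrained z to physical
    # parameters, and log|det dx/dz| = log_det_jac(z) corrects the density.
    x = forward(z)
    return log_prior(x) + log_det_jac(z) + log_likelihood(x)

# Example: the exp bijector unconstrains a strictly positive parameter.
# x = exp(z), so log|det dx/dz| = z.
lp = log_prob_unconstrained(
    0.0,
    forward=math.exp,
    log_det_jac=lambda z: z,
    log_prior=lambda x: -x,        # Exponential(1) log density on x > 0
    log_likelihood=lambda x: 0.0,  # flat likelihood, for illustration only
)
```

Here `lp` is the Exponential(1) log density at \(x = e^0 = 1\) plus a zero Jacobian term, i.e. \(-1\).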
The TensorFlow implementations of the two classes above are given below.
- class gigalens.tf.model.ForwardProbModel(prior, observed_image=None, background_rms=None, exp_time=None)[source]#
Probabilistic model defined using the simulated image as an estimator for the noise variance map. Linear parameters are not automatically solved for using least squares.
- observed_image#
  The observed image.
- pack_bij#
A bijector that reshapes a tensor into a structured parameter object (i.e., dictionaries of parameter values). It does not change the parameter values whatsoever; it only reshapes them.
- unconstraining_bij#
A bijector that maps from physical parameter space to unconstrained parameter space. It outputs a structure identical to the input; it only maps values to unconstrained space.
- bij#
The composition of `pack_bij` and `unconstraining_bij`. The inverse method of `bij` will unconstrain parameter values and then flatten the entire structure into one tensor.
- log_prob(simulator: gigalens.tf.simulator.LensSimulator, z)[source]#
  Evaluate the reparameterized log posterior density for a batch of unconstrained lens parameters `z`. Simulates the lenses with parameters `z` to evaluate the log likelihood. The log prior is evaluated in physical parameter space, with the log determinant of the bijector Jacobian added.

  Notes

  The log determinant for the default bijector will output a scalar regardless of the batch size of `z`, due to the flattening behavior of `tfp.bijectors.Split`. To get around this, we use the fact that `pack_bij` has determinant 1 (it simply reshapes/restructures the input), so the determinant comes purely from `unconstraining_bij`.

  - Parameters
    simulator (`LensSimulator`) – A simulator object that has the same batch size as `z`.
    z (`tf.Tensor`) – Parameters in unconstrained space with shape `(bs, d)`, where `bs` is the batch size and `d` is the number of parameters per lens.
  - Returns
    The reparameterized log posterior density, with shape `(bs,)`.
- class gigalens.tf.model.BackwardProbModel(prior, observed_image, background_rms, exp_time)[source]#
Probabilistic model defined using the observed image as an estimator for the noise variance map. Linear parameters are automatically solved for using least squares.
- observed_image#
  The observed image.
- pack_bij#
A bijector that reshapes a tensor into a structured parameter object (i.e., dictionaries of parameter values). It does not change the parameter values whatsoever; it only reshapes them.
- unconstraining_bij#
A bijector that maps from physical parameter space to unconstrained parameter space. It outputs a structure identical to the input; it only maps values to unconstrained space.
- bij#
The composition of `pack_bij` and `unconstraining_bij`. The inverse method of `bij` will unconstrain parameter values and then flatten the entire structure into one tensor.
- log_prob(simulator: gigalens.tf.simulator.LensSimulator, z)[source]#
  Evaluate the reparameterized log posterior density for a batch of unconstrained lens parameters `z`. Simulates the lenses with parameters `z` (using least squares to solve for linear light parameters) to evaluate the log likelihood. The log prior is evaluated in physical parameter space, with the log determinant of the bijector Jacobian added.

  Notes

  The log determinant for the default bijector will output a scalar regardless of the batch size of `z`, due to the flattening behavior of `tfp.bijectors.Split`. To get around this, we use the fact that `pack_bij` has determinant 1 (it simply reshapes/restructures the input), so the determinant comes purely from `unconstraining_bij`.

  - Parameters
    simulator (`LensSimulator`) – A simulator object that has the same batch size as `z`.
    z (`tf.Tensor`) – Parameters in unconstrained space with shape `(bs, d)`, where `bs` is the batch size and `d` is the number of parameters per lens.
  - Returns
    The reparameterized log posterior density, with shape `(bs,)`.
Prior Specification#
Once the physical model for the lens is settled, the next step is to specify the prior. For example, if our model consists of an SIS + shear mass model, two Sersic profiles for the lens light, and one Sersic profile for the source light, the prior might look like the following:
```python
import tensorflow as tf
from tensorflow_probability import distributions as tfd

lens_prior = tfd.JointDistributionSequential(
    [
        tfd.JointDistributionNamed(
            dict(
                theta_E=tfd.Normal(1.5, 0.25),
                center_x=tfd.Normal(0, 0.05),
                center_y=tfd.Normal(0, 0.05),
            )
        ),
        tfd.JointDistributionNamed(
            dict(gamma1=tfd.Normal(0, 0.05), gamma2=tfd.Normal(0, 0.05))
        ),
    ]
)
lens_light_prior = tfd.JointDistributionSequential(
    [
        tfd.JointDistributionNamed(
            dict(
                R_sersic=tfd.LogNormal(tf.math.log(1.0), 0.15),
                n_sersic=tfd.Uniform(2, 6),
                center_x=tfd.Normal(0, 0.05),
                center_y=tfd.Normal(0, 0.05),
                Ie=tfd.Normal(150.0, 0.5),
            )
        ),
        tfd.JointDistributionNamed(
            dict(
                R_sersic=tfd.LogNormal(tf.math.log(1.0), 0.15),
                n_sersic=tfd.Uniform(2, 6),
                center_x=tfd.Normal(0, 0.05),
                center_y=tfd.Normal(0, 0.05),
                Ie=tfd.Normal(150.0, 0.5),
            )
        ),
    ]
)

source_light_prior = tfd.JointDistributionSequential(
    [
        tfd.JointDistributionNamed(
            dict(
                R_sersic=tfd.LogNormal(tf.math.log(0.25), 0.15),
                n_sersic=tfd.Uniform(0.5, 4),
                center_x=tfd.Normal(0, 0.25),
                center_y=tfd.Normal(0, 0.25),
                Ie=tfd.Normal(150.0, 0.5),
            )
        )
    ]
)

prior = tfd.JointDistributionSequential(
    [lens_prior, lens_light_prior, source_light_prior]
)
```
Note that this is a different model than the one used in our paper. Although this may appear complex at first, the length is only because the model has 20 parameters, each of which must be given a prior distribution. All this says is that the prior distribution is a product of independent distributions, beginning with \(\theta_E \sim \mathcal{N}(1.5, 0.25)\) and ending with \(I_{e,src} \sim \mathcal{N}(150.0, 0.5)\). This way of defining the prior is very flexible and allows for dependent distributions as well. The dependence can be specified with functions:
```python
lens_prior = tfd.JointDistributionNamed(
    dict(
        theta_E=tfd.Normal(1.5, 0.25),
        center_x=lambda theta_E: tfd.Normal(theta_E, 0.05),
        center_y=lambda theta_E, center_x: tfd.Normal(theta_E, center_x),
    )
)
```
This corresponds to a prior in which \(\theta_E \sim \mathcal{N}(1.5, 0.25)\), \(c_x \mid \theta_E \sim \mathcal{N}(\theta_E, 0.05)\), and \(c_y \mid \theta_E, c_x \sim \mathcal{N}(\theta_E, c_x)\). Obviously, this is a completely silly prior, but it demonstrates the flexibility of the `tfp.distributions.JointDistribution` interface.
Sampling from the prior will produce samples that have an identical structure to the prior. Sampling twice from the example prior might give something like:
```python
[
    [
        {
            "theta_E": [1.2670871, 1.3154074],
            "center_y": [0.057920452, -0.07995515],
            "center_x": [-0.02513125, -0.034906887],
        },
        {
            "gamma2": [0.026222957, 0.0174865],
            "gamma1": [-0.014953673, 0.025195256],
        },
    ],
    [
        {
            "n_sersic": [4.55016, 2.9208045],
            "center_y": [-0.08046845, -0.06934502],
            "center_x": [-0.0336911, -0.013034515],
            "R_sersic": [1.0177809, 0.9325099],
            "Ie": [150.55762, 150.0797],
        },
        {
            "n_sersic": [4.1814566, 3.0877438],
            "center_y": [0.033258155, 0.055594254],
            "center_x": [0.08249615, 0.043057345],
            "R_sersic": [1.2974737, 1.0547239],
            "Ie": [150.98956, 149.52852],
        },
    ],
    [
        {
            "n_sersic": [1.9998379, 2.1873033],
            "center_y": [-0.16878262, 0.24757618],
            "center_x": [0.46696508, 0.3786787],
            "R_sersic": [0.2810259, 0.27873865],
            "Ie": [150.91617, 150.61823],
        }
    ],
]
```
This is a format that is friendly for the gigalens code to work with.
Unconstraining and Restructuring Parameters#
Although the above format is more human-readable, TensorFlow prefers to work directly with Tensors. More precisely, it prefers to work with unconstrained tensors (to ensure differentiability). In the current formulation, there are certain parameters that are constrained (such as Sersic index). To handle this, we make use of bijectors that unconstrain parameters and then restructure them into a convenient form.
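For intuition, a bounded parameter such as a Sersic index restricted to \((2, 6)\) can be mapped to the real line with a scaled logit transform; this is a sketch of the idea, not necessarily the exact bijector gigalens uses:

```python
import math

def unconstrain(x, low, high):
    # Scaled logit: maps the interval (low, high) to the whole real line.
    p = (x - low) / (high - low)
    return math.log(p / (1.0 - p))

def constrain(z, low, high):
    # Inverse: scaled sigmoid, mapping the real line back into (low, high).
    return low + (high - low) / (1.0 + math.exp(-z))
```

The round trip `constrain(unconstrain(x, low, high), low, high)` recovers `x`, and the midpoint of the interval maps to zero in unconstrained space.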
TensorFlow Implementation

The built-in `tfp.bijectors.Restructure` does the restructuring conveniently. However, this still outputs a list of tensors, when ideally we would like to concatenate all of them into one single Tensor. This can be done with `tfp.bijectors.Split`, but this will completely flatten the parameters into a length `bs*d` vector. We reshape using the remaining two transforms (`tfp.bijectors.Reshape` and `tfp.bijectors.Transpose`). Typically, we will use `z` to denote this flattened, unconstrained vector.
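As a rough pure-Python analogue of this restructure-and-flatten step (assuming a sorted-key ordering for determinism, which need not match the actual bijector's ordering):

```python
def flatten_sample(sample):
    # Recursively flatten a nested list/dict structure of parameter values
    # into one flat list, mirroring what the packing bijector's inverse
    # does; keys are visited in sorted order for a deterministic layout.
    if isinstance(sample, dict):
        return [v for k in sorted(sample) for v in flatten_sample(sample[k])]
    if isinstance(sample, list):
        return [v for item in sample for v in flatten_sample(item)]
    return [sample]
```

For instance, `flatten_sample([[{"theta_E": 1.5}], [{"Ie": 150.0, "R_sersic": 1.0}]])` yields a single flat list of three values.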
Due to our use of `tfp.bijectors.Split`, the `gigalens.tf.model.ForwardProbModel.log_prob()` and `gigalens.tf.model.BackwardProbModel.log_prob()` methods both expect a rank-2 Tensor `z` with shape `(bs, d)`. If you are looking to evaluate the log density for just one example, you must expand the first dimension so it has shape `(1, d)`.
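For example, a single parameter vector can be promoted to a batch of one by adding a leading axis (shown here with NumPy for illustration; with TensorFlow one would use `tf.expand_dims` or `z[tf.newaxis]`):

```python
import numpy as np

d = 20                             # number of parameters per lens (example value)
z_single = np.zeros(d)             # one unconstrained parameter vector, shape (d,)
z_batch = z_single[np.newaxis, :]  # shape (1, d), as log_prob expects
```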
JAX Implementation

Since the JAX integration with TFP is still largely experimental, `tfp.substrates.jax.bijectors.Split` does not appear to work well with traced JAX arrays (which JAX uses in JIT-compiled functions). Therefore, the parameters typically remain in list form rather than as a single JAX array (although the entire structure is at least flattened into one list). Internally, this is managed by converting all parameters to lists before evaluating the prior density.
For all intermediate modeling steps, we will always prefer to work in unconstrained space, converting back to physical parameter space only when absolutely necessary. The final posterior samples from HMC will, of course, be converted to physical parameters by applying the bijector. We note that the use of unconstrained parameters means we must reparameterize the posterior density; this is easily done by multiplying the prior density by the absolute value of the determinant of the bijector Jacobian.
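Written out, if \(f\) is the bijector taking unconstrained parameters \(z\) to physical parameters \(x = f(z)\), the reparameterized posterior is

\[
p(z \mid \mathcal{D}) \;\propto\; \mathcal{L}\left(\mathcal{D} \mid f(z)\right)\, p\left(f(z)\right)\, \left|\det J_f(z)\right|,
\]

so in log space the Jacobian term \(\log\left|\det J_f(z)\right|\) is simply added to the log likelihood and log prior.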