Model Specification#
We cannot begin to do inference on our data without writing out, explicitly, a model for the data. Heuristically, a model ought to be a ‘story for how the observed data was generated’. Our model consists of two components:
A parameterized physical model for the lensing system, consisting of a model for the mass profile of the main lens, potentially a model for the effects of any nearby interlopers, and models for the light profiles of the lens and the source.
A probabilistic model, consisting of a prior for the physical parameters and a likelihood function. Defining a likelihood requires a noise model, which for most purposes will model the noise on a given pixel as the quadrature sum of background Gaussian noise \(\sigma_{bkg}\) (conventionally written as `background_rms`) and Poisson shot noise with exposure time \(t_{exp}\) (conventionally written as `exp_time`).
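To make this noise model concrete, here is a minimal sketch (not the package's implementation; `noise_variance` is a hypothetical helper) of the per-pixel noise variance implied by the description above:

```python
import numpy as np

def noise_variance(model_image, background_rms, exp_time):
    """Per-pixel noise variance: quadrature sum of Gaussian background
    noise and Poisson shot noise (a hypothetical helper, not gigalens API).

    model_image is assumed to be in counts/sec, so the shot-noise
    variance in those units is model_image / exp_time. Negative model
    fluxes are clipped to zero before the Poisson term.
    """
    return background_rms**2 + np.clip(model_image, 0.0, None) / exp_time
```

For example, a pixel with 4 counts/sec, `background_rms = 0.1`, and a 100 s exposure has variance \(0.1^2 + 4/100 = 0.05\).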
Keep in mind that for any given physical model, there can be several valid choices for the probabilistic model. Therefore, in our implementation, we are careful to keep these two components of the model distinct (see Section 2.1 of our paper). The following are our package's high-level descriptions of a physical and a probabilistic model.
- class gigalens.model.PhysicalModel(lenses, lens_light, source_light)[source]#
A physical model for the lensing system.
- Parameters
  lenses (list of `MassProfile`) – A list of mass profiles used to model the deflection
  lens_light (list of `LightProfile`) – A list of light profiles used to model the lens light
  source_light (list of `LightProfile`) – A list of light profiles used to model the source light
- lenses#
  A list of mass profiles used to model the deflection
  - Type
    list of `MassProfile`
- lens_light#
  A list of light profiles used to model the lens light
  - Type
    list of `LightProfile`
- source_light#
  A list of light profiles used to model the source light
  - Type
    list of `LightProfile`
- class gigalens.model.ProbabilisticModel(prior, bij=None, *args)[source]#
A probabilistic model for the lensing system.
- Parameters
prior – Prior distribution of lens parameters
bij – A bijector that can unconstrain physical parameters (e.g., applying an exponential bijector to strictly positive variables)
*args – Information about observed data (typically includes the observed image, estimated noise characteristics, etc.)
- prior#
Prior distribution of lens parameters
- bij#
A bijector that can unconstrain physical parameters (e.g., applying an exponential bijector to strictly positive variables)
- abstract log_prob(simulator, z)[source]#
Returns the unconstrained log posterior density (i.e., includes the Jacobian factor due to the bijector)
- Parameters
  simulator (`LensSimulatorInterface`) – An object that can simulate a lens with unconstrained parameters `z`
  z – Unconstrained parameters
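The Jacobian correction in this unconstrained log posterior can be illustrated with a minimal, framework-free sketch; all names here are hypothetical, and the actual implementation uses TFP bijectors:

```python
import math

def log_prob_unconstrained(z, forward, log_det_jac, log_prior, log_likelihood):
    # Change of variables: x = forward(z) maps unconstrained z to physical
    # parameters, and log|det dx/dz| = log_det_jac(z) corrects the density.
    x = forward(z)
    return log_prior(x) + log_det_jac(z) + log_likelihood(x)

# Example: the exp bijector unconstrains a strictly positive parameter.
# x = exp(z), so log|det dx/dz| = z.
lp = log_prob_unconstrained(
    0.0,
    forward=math.exp,
    log_det_jac=lambda z: z,
    log_prior=lambda x: -x,        # Exponential(1) log density on x > 0
    log_likelihood=lambda x: 0.0,  # flat likelihood, for illustration only
)
```

Here `lp` is the Exponential(1) log density at \(x = e^0 = 1\) plus a zero Jacobian term, i.e. \(-1\).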
The TensorFlow implementations of the two classes above are given below.
- class gigalens.tf.model.ForwardProbModel(prior, observed_image=None, background_rms=None, exp_time=None)[source]#
Probabilistic model defined using the simulated image as an estimator for the noise variance map. Linear parameters are not automatically solved for using least squares.
- observed_image#
  The observed image.
- pack_bij#
A bijector that reshapes a tensor into a structured parameter object (i.e., dictionaries of parameter values). It does not change the parameter values whatsoever; it only reshapes them.
- unconstraining_bij#
A bijector that maps from physical parameter space to unconstrained parameter space. It outputs a structure identical to the input; it only maps values to unconstrained space.
- bij#
The composition of `pack_bij` and `unconstraining_bij`. The inverse method of `bij` will unconstrain parameter values and then flatten the entire structure into one tensor.
- log_prob(simulator: gigalens.tf.simulator.LensSimulator, z)[source]#
  Evaluate the reparameterized log posterior density for a batch of unconstrained lens parameters `z`. Simulates the lenses with parameters `z` to evaluate the log likelihood. The log prior is evaluated in physical parameter space, with the log determinant of the bijector Jacobian added.

  Notes

  The log determinant for the default bijector will output a scalar regardless of the batch size of `z`, due to the flattening behavior of `tfp.bijectors.Split`. To get around this, we use the fact that `pack_bij` has determinant 1 (it simply reshapes/restructures the input), so the determinant comes purely from `unconstraining_bij`.

  - Parameters
    simulator (`LensSimulator`) – A simulator object that has the same batch size as `z`.
    z (`tf.Tensor`) – Parameters in unconstrained space with shape `(bs, d)`, where `bs` is the batch size and `d` is the number of parameters per lens.
  - Returns
    The reparameterized log posterior density, with shape `(bs,)`.
- class gigalens.tf.model.BackwardProbModel(prior, observed_image, background_rms, exp_time)[source]#
Probabilistic model defined using the observed image as an estimator for the noise variance map. Linear parameters are automatically solved for using least squares.
- observed_image#
  The observed image.
- pack_bij#
A bijector that reshapes a tensor into a structured parameter object (i.e., dictionaries of parameter values). It does not change the parameter values whatsoever; it only reshapes them.
- unconstraining_bij#
A bijector that maps from physical parameter space to unconstrained parameter space. It outputs a structure identical to the input; it only maps values to unconstrained space.
- bij#
The composition of `pack_bij` and `unconstraining_bij`. The inverse method of `bij` will unconstrain parameter values and then flatten the entire structure into one tensor.
- log_prob(simulator: gigalens.tf.simulator.LensSimulator, z)[source]#
  Evaluate the reparameterized log posterior density for a batch of unconstrained lens parameters `z`. Simulates the lenses with parameters `z` (using least squares to solve for linear light parameters) to evaluate the log likelihood. The log prior is evaluated in physical parameter space, with the log determinant of the bijector Jacobian added.

  Notes

  The log determinant for the default bijector will output a scalar regardless of the batch size of `z`, due to the flattening behavior of `tfp.bijectors.Split`. To get around this, we use the fact that `pack_bij` has determinant 1 (it simply reshapes/restructures the input), so the determinant comes purely from `unconstraining_bij`.

  - Parameters
    simulator (`LensSimulator`) – A simulator object that has the same batch size as `z`.
    z (`tf.Tensor`) – Parameters in unconstrained space with shape `(bs, d)`, where `bs` is the batch size and `d` is the number of parameters per lens.
  - Returns
    The reparameterized log posterior density, with shape `(bs,)`.
Prior Specification#
Once the physical model for the lens is settled, the next step is to specify the prior. For example, if our model consists of an SIS + shear mass model, two Sersic profiles for the lens light, and one Sersic profile for the source light, the prior might look like the following:
```python
import tensorflow as tf
from tensorflow_probability import distributions as tfd

lens_prior = tfd.JointDistributionSequential(
    [
        tfd.JointDistributionNamed(
            dict(
                theta_E=tfd.Normal(1.5, 0.25),
                center_x=tfd.Normal(0, 0.05),
                center_y=tfd.Normal(0, 0.05),
            )
        ),
        tfd.JointDistributionNamed(
            dict(gamma1=tfd.Normal(0, 0.05), gamma2=tfd.Normal(0, 0.05))
        ),
    ]
)
lens_light_prior = tfd.JointDistributionSequential(
    [
        tfd.JointDistributionNamed(
            dict(
                R_sersic=tfd.LogNormal(tf.math.log(1.0), 0.15),
                n_sersic=tfd.Uniform(2, 6),
                center_x=tfd.Normal(0, 0.05),
                center_y=tfd.Normal(0, 0.05),
                Ie=tfd.Normal(150.0, 0.5),
            )
        ),
        tfd.JointDistributionNamed(
            dict(
                R_sersic=tfd.LogNormal(tf.math.log(1.0), 0.15),
                n_sersic=tfd.Uniform(2, 6),
                center_x=tfd.Normal(0, 0.05),
                center_y=tfd.Normal(0, 0.05),
                Ie=tfd.Normal(150.0, 0.5),
            )
        ),
    ]
)

source_light_prior = tfd.JointDistributionSequential(
    [
        tfd.JointDistributionNamed(
            dict(
                R_sersic=tfd.LogNormal(tf.math.log(0.25), 0.15),
                n_sersic=tfd.Uniform(0.5, 4),
                center_x=tfd.Normal(0, 0.25),
                center_y=tfd.Normal(0, 0.25),
                Ie=tfd.Normal(150.0, 0.5),
            )
        )
    ]
)

prior = tfd.JointDistributionSequential(
    [lens_prior, lens_light_prior, source_light_prior]
)
```
Note that this is a different model than the one used in our paper. Although this may appear complex at first, the length is only because the model has 20 parameters, each of which must be given a prior distribution. All this says is that the prior distribution is a product of independent distributions, beginning with \(\theta_E \sim \mathcal{N}(1.5, 0.25)\) and ending with \(I_{e,src} \sim \mathcal{N}(150.0, 0.5)\). This way of defining the prior is very flexible and allows for dependent distributions as well. The dependence can be specified with functions:
```python
lens_prior = tfd.JointDistributionNamed(
    dict(
        theta_E=tfd.Normal(1.5, 0.25),
        center_x=lambda theta_E: tfd.Normal(theta_E, 0.05),
        center_y=lambda theta_E, center_x: tfd.Normal(theta_E, center_x),
    )
)
```
This corresponds to a prior in which \(\theta_E \sim \mathcal{N}(1.5, 0.25)\), \(c_x \mid \theta_E \sim \mathcal{N}(\theta_E, 0.05)\), and \(c_y \mid \theta_E, c_x \sim \mathcal{N}(\theta_E, c_x)\). Obviously, this is a completely silly prior, but it demonstrates the flexibility of the `tfp.distributions.JointDistribution` interface.
Sampling from the prior will produce samples that have an identical structure to the prior. Sampling twice from the example prior might give something like:
```python
[
    [
        {
            "theta_E": [1.2670871, 1.3154074],
            "center_y": [0.057920452, -0.07995515],
            "center_x": [-0.02513125, -0.034906887],
        },
        {
            "gamma2": [0.026222957, 0.0174865],
            "gamma1": [-0.014953673, 0.025195256],
        },
    ],
    [
        {
            "n_sersic": [4.55016, 2.9208045],
            "center_y": [-0.08046845, -0.06934502],
            "center_x": [-0.0336911, -0.013034515],
            "R_sersic": [1.0177809, 0.9325099],
            "Ie": [150.55762, 150.0797],
        },
        {
            "n_sersic": [4.1814566, 3.0877438],
            "center_y": [0.033258155, 0.055594254],
            "center_x": [0.08249615, 0.043057345],
            "R_sersic": [1.2974737, 1.0547239],
            "Ie": [150.98956, 149.52852],
        },
    ],
    [
        {
            "n_sersic": [1.9998379, 2.1873033],
            "center_y": [-0.16878262, 0.24757618],
            "center_x": [0.46696508, 0.3786787],
            "R_sersic": [0.2810259, 0.27873865],
            "Ie": [150.91617, 150.61823],
        }
    ],
]
```
This is a format that is friendly for the gigalens code to work with.
Unconstraining and Restructuring Parameters#
Although the above format is more human-readable, TensorFlow prefers to work directly with Tensors. More precisely, it prefers to work with unconstrained tensors (to ensure differentiability). In the current formulation, there are certain parameters that are constrained (such as Sersic index). To handle this, we make use of bijectors that unconstrain parameters and then restructure them into a convenient form.
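For intuition, a bounded parameter such as a Sersic index restricted to \((2, 6)\) can be mapped to the real line with a scaled logit transform; this is a sketch of the idea, not necessarily the exact bijector gigalens uses:

```python
import math

def unconstrain(x, low, high):
    # Scaled logit: maps the interval (low, high) to the whole real line.
    p = (x - low) / (high - low)
    return math.log(p / (1.0 - p))

def constrain(z, low, high):
    # Inverse: scaled sigmoid, mapping the real line back into (low, high).
    return low + (high - low) / (1.0 + math.exp(-z))
```

The round trip `constrain(unconstrain(x, low, high), low, high)` recovers `x`, and the midpoint of the interval maps to zero in unconstrained space.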
TensorFlow Implementation

The built-in `tfp.bijectors.Restructure` does the restructuring conveniently. However, this still outputs a list of tensors, when ideally we would like to concatenate all of them into one single Tensor. This can be done with `tfp.bijectors.Split`, but this will completely flatten the parameters into a length `bs*d` vector. We reshape using the remaining two transforms (`tfp.bijectors.Reshape` and `tfp.bijectors.Transpose`). Typically, we will use `z` to denote this flattened, unconstrained vector.
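As a rough pure-Python analogue of this restructure-and-flatten step (assuming a sorted-key ordering for determinism, which need not match the actual bijector's ordering):

```python
def flatten_sample(sample):
    # Recursively flatten a nested list/dict structure of parameter values
    # into one flat list, mirroring what the packing bijector's inverse
    # does; keys are visited in sorted order for a deterministic layout.
    if isinstance(sample, dict):
        return [v for k in sorted(sample) for v in flatten_sample(sample[k])]
    if isinstance(sample, list):
        return [v for item in sample for v in flatten_sample(item)]
    return [sample]
```

For instance, `flatten_sample([[{"theta_E": 1.5}], [{"Ie": 150.0, "R_sersic": 1.0}]])` yields a single flat list of three values.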
Due to our use of `tfp.bijectors.Split`, the `gigalens.tf.model.ForwardProbModel.log_prob()` and `gigalens.tf.model.BackwardProbModel.log_prob()` methods both expect a rank-2 Tensor `z` with shape `(bs, d)`. If you are looking to evaluate the log density for just one example, you must expand the first dimension so it has shape `(1, d)`.
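For example, a single parameter vector can be promoted to a batch of one by adding a leading axis (shown here with NumPy for illustration; with TensorFlow one would use `tf.expand_dims` or `z[tf.newaxis]`):

```python
import numpy as np

d = 20                             # number of parameters per lens (example value)
z_single = np.zeros(d)             # one unconstrained parameter vector, shape (d,)
z_batch = z_single[np.newaxis, :]  # shape (1, d), as log_prob expects
```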
JAX Implementation

Since the JAX integration with TFP is still largely experimental, `tfp.substrates.jax.bijectors.Split` does not appear to work well with traced JAX arrays (which JAX uses in JIT-compiled functions). Therefore, the parameters typically remain in list form rather than as a single JAX array (although the entire structure is at least flattened into one list). Internally, this is managed by converting all parameters to lists before evaluating the prior density.
For all intermediate modeling steps, we will always prefer to work in unconstrained space, converting back to physical parameter space only when absolutely necessary. The final posterior samples from HMC will, of course, be converted to physical parameters by applying the bijector. We note that the use of unconstrained parameters means we must reparameterize the posterior density; this is easily done by multiplying the prior density by the absolute value of the determinant of the bijector Jacobian.
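Written out, if \(f\) is the bijector taking unconstrained parameters \(z\) to physical parameters \(x = f(z)\), the reparameterized posterior is

\[
p(z \mid \mathcal{D}) \;\propto\; \mathcal{L}\left(\mathcal{D} \mid f(z)\right)\, p\left(f(z)\right)\, \left|\det J_f(z)\right|,
\]

so in log space the Jacobian term \(\log\left|\det J_f(z)\right|\) is simply added to the log likelihood and log prior.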