[Docs] Implementing Custom mean function documentation #674

Closed
michael-ziedalski opened this issue May 3, 2019 · 15 comments

@michael-ziedalski

michael-ziedalski commented May 3, 2019

Hello, all. While Gaussian Processes are often described as not needing a mean function (or needing only a zero or constant one), that is not true in practice, and being able to implement custom means would be incredibly useful.

I see in the docs that gpytorch.means.Mean seems to be set aside for this specific purpose, but there is no explanation (or example) demonstrating how to use it.

Is there an obvious method of defining simple, custom means that I am not seeing? If not, I think a couple of basic examples would be helpful.

@KeAWang
Collaborator

KeAWang commented May 3, 2019

You can find some examples here: https://github.com/cornellius-gp/gpytorch/tree/5411c905b778280122e2524fc4aafd13cdc7270d/gpytorch/means.

Basically you just need to define how to initialize the Mean object and define its forward function much like how you would define a Kernel.

@michael-ziedalski
Author

michael-ziedalski commented May 3, 2019

Going over the examples above, I am still slightly confused. Would you be willing to write a tiny example, such as a mean that depends on just one input?

So that, for example, based on this picture, [attached plot: noisy data with a growing trend], the single input, x, would be able to partially capture the growing trend?

@KeAWang
Collaborator

KeAWang commented May 3, 2019

Something like this

import torch
import gpytorch

class AffineMean(gpytorch.means.Mean):
    def __init__(self, input_size, batch_shape=torch.Size()):
        super().__init__()
        self.register_parameter(name='weight', parameter=torch.nn.Parameter(torch.randn(*batch_shape, input_size)))
        self.register_parameter(name='bias', parameter=torch.nn.Parameter(torch.randn(*batch_shape, 1)))

    def forward(self, x):
        # x: (..., n, d); returns a mean of shape (..., n)
        return torch.einsum('...j, ...ij->...i', self.weight, x) + self.bias

would learn a different affine mean for each batch dimension.

@jacobrgardner
Member

Thanks, Alex. Worth noting: just doing a matmul between an n x d input x and a d x 1 weight matrix w would also work -- the einsum seems like overkill here, since matmul broadcasts over any batch dimensions just fine :-).

If there's still some confusion about this let us know and I'll try to give a few other examples.
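A quick sketch of the equivalence, assuming x is a batch of inputs with shape (batch, n, d) and w is a shared weight vector of shape (d,):

```python
import torch

# Check that the matmul formulation matches the einsum used in AffineMean above.
x = torch.randn(3, 5, 2)   # batch of 3, n=5 points, d=2 features
w = torch.randn(2)         # weight vector of size d

einsum_out = torch.einsum('...j, ...ij->...i', w, x)   # shape (3, 5)
matmul_out = (x @ w.unsqueeze(-1)).squeeze(-1)         # shape (3, 5)

assert torch.allclose(einsum_out, matmul_out)
```

Both rely on the same broadcasting over leading batch dimensions; the matmul version just makes the n x d times d x 1 structure explicit.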

@dtort

dtort commented Jul 15, 2019

I am still a little confused as to how to use gpytorch means. I want to implement a constant mean of 5, but I don't understand how to specify that in the Constant Mean or if I have to implement a custom mean for that.

@jacobrgardner
Member

@dtort For that, just use the constant mean: initialize the constant to 5, then make it untrainable by setting requires_grad=False:

self.mean_module = ConstantMean()
self.mean_module.initialize(constant=5.)
self.mean_module.constant.requires_grad = False
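A torch-only sketch (stand-in parameters, not gpytorch API) of why setting requires_grad=False keeps the constant fixed: frozen parameters receive no gradient, so optimizer steps never touch them.

```python
import torch

# A frozen "constant" and a trainable stand-in parameter.
const = torch.nn.Parameter(torch.tensor(5.0))
const.requires_grad = False

weight = torch.nn.Parameter(torch.tensor(1.0))
opt = torch.optim.SGD([weight], lr=0.1)

loss = weight * 2.0 + const   # d(loss)/d(weight) = 2
loss.backward()
opt.step()

assert const.item() == 5.0          # the frozen constant is unchanged
assert abs(weight.item() - 0.8) < 1e-6   # the trainable parameter moved: 1.0 - 0.1 * 2
```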

@Alaya-in-Matrix

Alaya-in-Matrix commented Oct 23, 2019

I would like to use a neural network as the mean function. Is this a valid implementation of a mean function?

import torch.nn as nn
import gpytorch as gp

class MLPMean(gp.means.Mean):
    def __init__(self, dim):
        super(MLPMean, self).__init__()
        self.mlp = nn.Sequential(
            nn.Linear(dim, 32),
            nn.ReLU(),
            nn.Linear(32, 1))

        count = 0
        for n, p in self.mlp.named_parameters():
            self.register_parameter(name='mlp' + str(count), parameter=p)
            count += 1

    def forward(self, x):
        m = self.mlp(x)
        # squeeze only the output dimension, so a batch of size 1 survives
        return m.squeeze(-1)

Is the register_parameter necessary? My code still works fine if that part is removed.

@jacobrgardner
Member

@Alaya-in-Matrix Yep, that's valid. No need for the register_parameter calls -- register_parameter is for registering raw tensors as parameters on a Module in PyTorch, and torch.nn.Linear modules already have all of their parameters registered :-).
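A torch-only check of that point (hypothetical MLPHolder class for illustration): a submodule's parameters show up on the containing Module automatically.

```python
import torch.nn as nn

# Submodule parameters are registered automatically -- no register_parameter needed.
class MLPHolder(nn.Module):
    def __init__(self):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(2, 32), nn.ReLU(), nn.Linear(32, 1))

holder = MLPHolder()
names = [n for n, _ in holder.named_parameters()]
assert names == ['mlp.0.weight', 'mlp.0.bias', 'mlp.2.weight', 'mlp.2.bias']
```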

@gpleiss
Member

gpleiss commented Nov 12, 2019

closing for now - reopen if there's still questions :)

@gpleiss gpleiss closed this as completed Nov 12, 2019
@tniggs84

tniggs84 commented Mar 8, 2020

Hello, I apologize for the basic question, but I am sort of new to both GP’s and Torch, so would appreciate any help you can provide.

Basically I have a 2D surface I’d like to fit but if it is easier, we can pretend it’s 1D. The challenge I have is that I have strong prior belief in the basic shape of this function and that the noise varies (in this case decreases as both axes grow). Ideally I’d like to specify a grid of values as priors and some noise around each point that is also custom.

This is the only thread I've been able to find that shows anything about specifying the mean function, and I'm about out of ideas. Can anyone help me?

Thank you in advance,
Ted

@jacobrgardner
Member

Hi @tniggs84

To encode a prior about the basic shape of the function, I might recommend taking your grid of points and fitting some interpolating model to the grid of points and using that as a mean function. If your function is well modeled by a kernel method (e.g., a GP) you might even consider kernel regression (e.g., first take your grid of points, fit a GP to them and use the mean prediction as a prior mean).

To encode the noises that you know, try using a FixedNoiseGaussianLikelihood:

class FixedNoiseGaussianLikelihood(_GaussianLikelihoodBase):

This takes a noise vector that specifies some known amount of noise at each training data. If you don't have specific known noise values at the training data and only at a grid, you could again train some interpolating model on the grid values and predict at the training data, and use the vector of predictions as your fixed noises.
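A minimal 1-D sketch of that interpolation idea, with hypothetical grid values and numpy.interp standing in for the interpolating model:

```python
import numpy as np

# Hypothetical 1-D grid of prior mean values and prior noise levels.
grid_x = np.array([0.0, 1.0, 2.0, 3.0])
grid_mean = np.array([0.0, 2.0, 3.0, 3.5])    # believed shape of the function
grid_noise = np.array([1.0, 0.5, 0.25, 0.1])  # noise decreasing along the axis

# Interpolate both onto the actual training inputs.
train_x = np.array([0.5, 1.5, 2.5])
prior_mean_at_train = np.interp(train_x, grid_x, grid_mean)    # [1.0, 2.5, 3.25]
fixed_noise_at_train = np.interp(train_x, grid_x, grid_noise)  # [0.75, 0.375, 0.175]
```

The first vector would back a custom mean function, and the second is the kind of per-point noise vector FixedNoiseGaussianLikelihood expects.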

If you have more questions about this, please open another issue so that it's easier to track.

@tniggs84

Thank you for your help, @jacobrgardner! I did follow your directions and created a new issue (#1073). Can you please take a look, as I still have some outstanding questions?

Thank you!
Ted

@Jahnvi99

Hello, I am new to gpytorch. Can someone help me define a negative quadratic function as the prior for the mean?

@gpleiss
Member

gpleiss commented Jun 23, 2021

@Jahnvi99 can you please open up a discussion topic with your question?

@Jahnvi99

Sure, as you suggested, I opened a discussion topic with my question!
