-
Notifications
You must be signed in to change notification settings - Fork 6k
Annealed Langevin Sampler #1032
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. |
def step( | ||
self, | ||
model_output: torch.FloatTensor, | ||
timestep: int, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@anton-l can you check here if this fits with our scheduler API in general ? It seems like t
is more used as an index then the corresponding timestep no?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@natolambert I think we have to change this here - We don't really want to pass timesteps as indexes for sigma except for the indexes and sigma are always identical or this use case is very special.
Overall, if you don't think we can also adapt the scheduler later on - fine for me as well
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did a literature review to track this down. So what is happening here is that Score matching with Langevin dynamics (SMLD) (this paper) is an intellectual predecessor of SDE VE/P, which also uses this self.sigma[timestep]
API.
What I need to do to match this to the SDE VE scheduler is the make it index discrete sigmas, and that'll make timestep be an int again.
@anton-l could you review here? :-) |
It would useful to compare this sampler to other samplers with the same prompt/seed/etc with images to see the difference it might have on images. |
@natolambert would it be possible to add an integration test with one of the existing models? Both for a usage example to understand the API (see Patrick's comment above), and for test coverage |
Leaving open for now :-) Let me know if you require another review @natolambert - but I agree with @anton-l. We'd need some tests and also I think we might have to change the API regarding timesteps rather being indices. |
Yeah I'll come back to this when I have time. Mostly this was a learning side-project for me, so leaving open is preferred. |
What do people think. Is it worth spending time fleshing this out? SDE VE is the improved version of this scheduler, so unless we want to include it for completeness, it's not adding much. Really it goes from NCSN to NCSNv2 to SDE VE/P. We don't have the samplers for NCSN or NCSNv2 exactly. I could work to add them, but it may be better for a community member to pick this up? |
I think pretty much no-one uses SDE VE and SDE VP or Karras VE (<- this scheduler has a design that doesn't fit really anyways) Think a community contributor could def take this over :-) |
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread. Please note that issues that do not follow the contributing guidelines are likely to be ignored. |
Closes #84
Maybe @MinkaiXu can give a review on this?