RFC-0011-InferenceMode #17
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
@@ -3,12 +3,12 @@ Note: a large part of this RFC will become "InferenceMode" documentation once it

## Goals:
- Provide an RAII guard in C++ and a context manager in the Python frontend to switch between inference mode and normal mode, with the following constraints (see the sketch below):
  - Correctness is always guaranteed (unlike `AutoNonVariableTypeMode`, which risks producing silently wrong results).
  - Performance of inference mode should match the existing `AutoNonVariableTypeMode`, which is widely used in prod.
  - Switching between normal mode and inference mode should be easy, requiring minimal code change.
- Make `AutoNonVariableTypeMode` an internal-only API; replace all call sites of `AutoNonVariableTypeMode` outside the pytorch codebase with the new `InferenceMode`.

## Non-goals:
- Match the theoretical best inference performance, which can be achieved by stripping all autograd-related machinery at build time (not flexible).
**Comment:** Actually, are we sure about this? If you write code solely in InferenceMode, with no interaction with non-InferenceMode tensors, it seems to me that theoretical best performance should be attainable (since we never have to hit the safety code for the intermixing situation).

**Reply:** Oh yes, without the intermixing situation (and maybe some startup cost initializing the dispatch table) the performance should be attainable. :D
- Allow the most flexible interaction between normal mode and inference mode. The current main use case for inference mode is "either inference or normal" without mixing, so we ban many interactions between the two modes to keep the implementation simple.
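To make the proposed ergonomics concrete, here is a minimal sketch of the C++ RAII usage from the Goals above. It assumes the guard is ultimately spelled `c10::InferenceMode`; the exact name and namespace are not fixed by this diff.

```cpp
#include <torch/torch.h>

// Hypothetical prod call site: migrating away from AutoNonVariableTypeMode
// should only require swapping the guard, per the goals above.
torch::Tensor run_inference(const torch::Tensor& input) {
  c10::InferenceMode guard;  // inference mode stays on until `guard` is destroyed
  // Tensors created in this scope skip autograd-graph construction and
  // as_view/version_counter setup.
  return torch::relu(input.matmul(input.t()));
}
```

The Python context manager would be the `with`-statement analogue of the same guard; as the review thread at the end of this page notes, it is secondary to the C++ API.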
# Different levels of control over autograd (copied from @Alban)
@@ -60,11 +60,11 @@ In this RFC we introduce the following new concepts:
  return result;
}
```
- **Inference mode** can be turned on when you are sure you don't need any autograd computation. This saves the cost of creating the autograd graph and the `as_view` / `version_counter` setup compared to normal mode.
- **Inference tensor** is defined as a tensor without the Autograd **and** InplaceOrView keys on it.
- **Normal tensor** has both the Autograd & InplaceOrView keys. This includes both `requires_grad=true` and `requires_grad=false` tensors (see the [Ideal end state] section for more details).
- Additional notes:
  - All inference tensors are created in inference mode, but not all tensors created in inference mode are inference tensors. For example, a view of a normal tensor created in inference mode is still a normal tensor (but with a special `creation_meta`!); see the sketch below.
  - (Autograd & !InplaceOrView) and (!Autograd & InplaceOrView) are invalid states; we never have such tensors.
**Comment:** Maybe we should have called this Autograd and NoAutograd LOL

**Reply:** You mean autograd tensors and NoAutograd tensors? I like those names, but they sound too related to GradMode, which would be confusing to users :(
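To ground the taxonomy above, a hedged sketch of the expected classification, assuming the `c10::InferenceMode` guard and an `is_inference()` accessor on `Tensor` (illustrative spellings, not API finalized by this diff):

```cpp
#include <torch/torch.h>
#include <iostream>

int main() {
  // Normal tensor: carries both the Autograd and InplaceOrView keys,
  // even though requires_grad defaults to false.
  torch::Tensor normal = torch::ones({2, 2});

  {
    c10::InferenceMode guard;
    // Created inside inference mode: an inference tensor
    // (neither Autograd nor InplaceOrView key).
    torch::Tensor inf = torch::ones({2, 2});
    // A view of a normal tensor taken in inference mode remains a normal
    // tensor, just tagged with a special creation_meta.
    torch::Tensor view_of_normal = normal.view({4});

    std::cout << inf.is_inference() << "\n";             // expect: 1
    std::cout << view_of_normal.is_inference() << "\n";  // expect: 0
  }
}
```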
# Expected Behavior
**Comment:** The Python frontend context manager isn't that important, right? Because this is oriented to performance use cases where you ought to be in C++ only anyway (it's good that it is possible and maybe some time we should add it, but I wouldn't say it's a primary goal).

**Reply:** Yep, agreed; didn't plan to add that until we become stable on the C++ end. Mentioning it here just to make sure it's possible. :D