RFC-0011-InferenceMode #17

Merged: 10 commits, Jun 10, 2021

minor copyedit
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
ezyang committed Mar 17, 2021
commit 23fa81d1cab195e3b283eab7d5ca26e019e37e32
8 changes: 4 additions & 4 deletions RFC-0011-InferenceMode.md
@@ -3,12 +3,12 @@ Note: a large part of this RFC will become "InferenceMode" documentation once it
## Goals:
- Provide a RAII in C++ and a context manager in Python frontend to switch between inference mode and normal mode, with the following constraints:
> Contributor: The Python frontend context manager isn't that important, right? Because this is oriented to performance use cases where you ought to be in C++ only anyway (it's good that it is possible and maybe some time we should add it, but I wouldn't say it's a primary goal)

> Contributor (Author): Yep agreed, didn't plan to add that until we become stable on the C++ end. Mentioning it here just to make sure it's possible. :D

- correctness is always guaranteed (compared to `AutoNonVariableTypeMode`, which risks producing silently wrong results).
-- performance of infenrence mode should match current existing `AutoNonVariableTypeMode` which is widely used in prod.
+- performance of inference mode should match current existing `AutoNonVariableTypeMode` which is widely used in prod.
- switching between normal mode and inference mode should be easy, requiring only minimal code changes.
- Make `AutoNonVariableTypeMode` an internal-only API, replacing all call sites of `AutoNonVariableTypeMode` outside the PyTorch codebase with the new `InferenceMode` (a usage sketch follows this list).
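
As a rough sketch of that migration (not taken from the RFC; it assumes the guard is exposed as `c10::InferenceMode` as proposed, and the `predict` helper and `Linear` module are placeholder names), the change at each call site is essentially one line: swap the old guard for the new one.

```cpp
#include <torch/torch.h>

// Hypothetical call site: run one forward pass with autograd bookkeeping skipped.
torch::Tensor predict(torch::nn::Linear& layer, const torch::Tensor& input) {
  // Old pattern (to become an internal-only API):
  //   at::AutoNonVariableTypeMode non_var_guard(true);
  // Proposed replacement; a one-line change, with correctness guaranteed:
  c10::InferenceMode guard;
  return layer->forward(input);
}
```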

## Non-goals:
-- Match the theorectical best inference performance which can be achieved by striping all autograd related stuff at build time (not flexible).
+- Match the theoretical best inference performance which can be achieved by stripping all autograd related stuff at build time (not flexible).
> Contributor: Actually, are we sure about this? If you write code solely in InferenceMode, with no interaction with non-InferenceMode tensors, it seems to me that theoretical best performance should be attainable (since we never have to hit the safety code for the intermixing situation).

> Contributor (Author): oh yes, without the intermixing situation (and maybe some startup cost initializing the dispatch table) the performance should be attainable. :D

- Allowing the most flexible interaction between normal mode and inference mode. The current main use case for inference mode is "either inference or normal" without mixing, so we ban a lot of interactions between the two modes to keep the implementation simple (see the sketch below).
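
To make the banned interaction concrete, here is a minimal illustrative sketch (not text from the RFC; it assumes the `c10::InferenceMode` guard proposed here) of the kind of mixing that is intentionally restricted rather than supported:

```cpp
#include <torch/torch.h>

int main() {
  torch::Tensor x;
  {
    c10::InferenceMode guard;
    x = torch::ones({2, 2});  // x is an inference tensor (no Autograd/InplaceOrView keys)
  }
  // Back in normal mode. Feeding x into autograd-recording computation is one of the
  // interactions this RFC bans to keep the implementation simple, so a line like the
  // one commented out below is expected to hit a safety check instead of silently working:
  torch::Tensor w = torch::ones({2, 2}, torch::requires_grad());
  // auto y = x.mul(w);  // banned mixing of an inference tensor with a normal tensor
  return 0;
}
```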

# Different levels of control over autograd (copied from @Alban)
@@ -60,11 +60,11 @@ In this RFC we introduce the following new concepts:
```
  // ... (earlier lines of this code block are collapsed in the diff view)
  return result;
}
```
-- **Inference mode** can be turned on when you are sure you don't need any autograd computation. This saves the cost of creating autograd graph and as_view/version_counter setup compared to the normal mode.
+- **Inference mode** can be turned on when you are sure you don't need any autograd computation. This saves the cost of creating autograd graph and `as_view` / `version_counter` setup compared to the normal mode.
- **Inference tensor** is defined as a tensor without Autograd **and** InplaceOrView keys on it.
- **Normal tensor** has both Autograd & InplaceOrView keys. This includes both `requires_grad=true` and `requires_grad=false` tensors. (see [Ideal end state] section for more details).
- Additional notes:
-- All Inference tensors are created in inference mode, but not all of the tensors created in inference mode are inference tensors. For example, a view of normal tensor created in inference mode is still a normal tensor (but with special creation_meta!).
+- All Inference tensors are created in inference mode, but not all of the tensors created in inference mode are inference tensors. For example, a view of a normal tensor created in inference mode is still a normal tensor (but with special `creation_meta`!).
- (Autograd & !InplaceOrView) and (!Autograd & InplaceOrView) are invalid states; we don't have such tensors.
> Contributor (@ezyang, Mar 17, 2021): Maybe we should have called this Autograd and NoAutograd LOL

> Contributor (Author): you mean autograd tensors and NoAutograd tensors? I like those names but they sound too related to the GradMode, which will be confusing to users :(
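
Putting the definitions above together, a minimal sketch (illustrative only, assuming the `c10::InferenceMode` guard proposed in this RFC) of which tensors end up as inference tensors versus normal tensors:

```cpp
#include <torch/torch.h>

int main() {
  torch::Tensor normal = torch::ones({2, 2});  // normal tensor: Autograd + InplaceOrView keys

  {
    c10::InferenceMode guard;                  // RAII: inference mode for this scope only
    torch::Tensor inf = torch::ones({2, 2});   // inference tensor: neither key, and no
                                               // view / version_counter setup
    torch::Tensor v = normal.view({4});        // view of a normal tensor: still a normal
                                               // tensor, but tagged with a special creation_meta
  }
  // Normal mode resumes when the guard is destroyed.
  return 0;
}
```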


# Expected Behavior