port distributed pipeline test files for Intel GPU #159033

Open: wincent8 wants to merge 15 commits into `main`.

Conversation

@wincent8 wincent8 commented Jul 24, 2025

This PR ports all distributed pipeline test files to Intel GPU (XPU).
Intel GPU support is enabled with the following methods, keeping the original code style wherever possible:

  1. use instantiate_device_type_tests()
  2. use "torch.accelerator.current_accelerator()" to determine the accelerator backend
  3. use "requires_accelerator_dist_backend()" in place of requires_nccl()
  4. use "get_default_backend_for_device()" to get the backend
  5. enable XPU for some test paths
  6. add TEST_MULTIACCELERATOR to common_utils for all backends
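
For readers unfamiliar with the pattern, items 2 and 4 can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `detect_device_type`, `pick_backend`, and the `ASSUMED_DEFAULT_BACKENDS` table are hypothetical names introduced here, and the device-to-backend mapping is an assumption about PyTorch's defaults. In the actual tests the backend would come from `get_default_backend_for_device()` instead of a hand-written table.

```python
# Hypothetical sketch: choose a distributed backend from the current
# accelerator instead of hard-coding "cuda" and "nccl".
try:
    import torch
except ImportError:  # allow the sketch to run without PyTorch installed
    torch = None

# Assumed defaults, for illustration only; the tests in this PR rely on
# get_default_backend_for_device() rather than a table like this.
ASSUMED_DEFAULT_BACKENDS = {"cuda": "nccl", "xpu": "xccl", "cpu": "gloo"}


def detect_device_type() -> str:
    """Return the current accelerator type, or "cpu" if none is found."""
    accelerator_mod = getattr(torch, "accelerator", None) if torch is not None else None
    if accelerator_mod is None:
        return "cpu"
    # current_accelerator() returns a torch.device, or None on CPU-only builds.
    accelerator = accelerator_mod.current_accelerator()
    return accelerator.type if accelerator is not None else "cpu"


def pick_backend(device_type: str) -> str:
    """Map a device type to a backend name, falling back to gloo."""
    return ASSUMED_DEFAULT_BACKENDS.get(device_type, "gloo")


if __name__ == "__main__":
    device_type = detect_device_type()
    print(device_type, pick_backend(device_type))
```

Keeping the device type as a value looked up at runtime is what lets the same test body run on CUDA and XPU without per-backend branches.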

cc @H-Huang @awgu @wanchaol @fegin @fduwjj @wz337 @wconstab @d4l3k @pragupta @gujinghui @EikanWang @fengyuan14 @guangyey

@wincent8 wincent8 requested a review from a team as a code owner July 24, 2025 10:31

pytorch-bot bot commented Jul 24, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/159033

Note: Links to docs will display an error until the docs builds have been completed.

✅ You can merge normally! (4 Unrelated Failures)

As of commit 909fcf7 with merge base f636736:

FLAKY - The following job failed but was likely due to flakiness present on trunk:

BROKEN TRUNK - The following jobs failed but were also present on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

UNSTABLE - The following job is marked as unstable, possibly due to flakiness on trunk:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@pytorch-bot pytorch-bot bot added the oncall: distributed Add this issue/PR to distributed oncall triage queue label Jul 24, 2025

linux-foundation-easycla bot commented Jul 24, 2025

CLA Signed

The committers listed above are authorized under a signed CLA.

@wincent8
Author

@pytorchbot label "module: xpu"
@pytorchbot label "triaged"

@pytorch-bot pytorch-bot bot added the module: xpu Intel XPU related issues label Jul 25, 2025
@wincent8
Author

@pytorchbot label "triaged"

@pytorch-bot pytorch-bot bot added the triaged This issue has been looked at by a team member, and triaged and prioritized into an appropriate module label Jul 25, 2025
@guangyey guangyey added the ciflow/xpu Run XPU CI tasks label Jul 25, 2025

pytorch-bot bot commented Jul 25, 2025

To add the ciflow label ciflow/xpu please first approve the workflows that are awaiting approval (scroll to the bottom of this page).

This helps ensure we don't trigger CI on this PR until it is actually authorized to do so. Please ping one of the reviewers if you do not have access to approve and run workflows.

@pytorch-bot pytorch-bot bot removed the ciflow/xpu Run XPU CI tasks label Jul 25, 2025
@guangyey guangyey added the topic: not user facing topic category label Jul 25, 2025
@guangyey guangyey added the ciflow/xpu Run XPU CI tasks label Jul 25, 2025
@pytorch-bot pytorch-bot bot removed the ciflow/xpu Run XPU CI tasks label Jul 25, 2025
@wincent8
Author

@pytorchbot label "module: xpu"

@wincent8
Author

@pytorchbot label "triaged"

@guangyey guangyey added the ciflow/xpu Run XPU CI tasks label Jul 26, 2025
Collaborator

@guangyey guangyey left a comment

Overall LGTM. I recommend changing TEST_MULTIGPU to TEST_MULTIACCELERATOR.

@pytorch-bot pytorch-bot bot removed the ciflow/xpu Run XPU CI tasks label Jul 28, 2025
@guangyey guangyey changed the title from "[WIP]port distributed pipeline test files for Intel GPU" to "port distributed pipeline test files for Intel GPU" Jul 28, 2025
@guangyey
Collaborator

Thanks for the update!

@guangyey guangyey moved this to Review Required in PyTorch Intel Jul 28, 2025
@guangyey guangyey added the ciflow/xpu Run XPU CI tasks label Jul 28, 2025
@guangyey guangyey requested review from kwen2501, d4l3k and albanD July 28, 2025 06:58
@pytorch-bot pytorch-bot bot removed the ciflow/xpu Run XPU CI tasks label Jul 28, 2025
@guangyey guangyey added the ciflow/xpu Run XPU CI tasks label Jul 28, 2025
@guangyey guangyey removed the status in PyTorch Intel Jul 28, 2025
@guangyey guangyey moved this to Review Required in PyTorch Intel Jul 28, 2025
Labels
ciflow/xpu - Run XPU CI tasks
module: xpu - Intel XPU related issues
oncall: distributed - Add this issue/PR to distributed oncall triage queue
open source
topic: not user facing - topic category
triaged - This issue has been looked at by a team member, and triaged and prioritized into an appropriate module
Projects
Status: Review Required
4 participants