[NO MERGE] is no_x_dim really faster? #159048

jataylo · 2025-07-24T15:11:31Z

Testing ROCm inductor CI

cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @ipiszy @chenyang78 @kadeng @muchulee8 @amjames @chauhang @aakhundov @coconutruben

pytorch-bot · 2025-07-24T15:11:35Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/159048

📄 Preview Python docs built from this PR
📄 Preview C++ docs built from this PR
❓ Need help or want to give feedback on the CI? Visit the bot commands wiki or our office hours

Note: Links to docs will display an error until the docs builds have been completed.

❌ 4 New Failures, 1 Unrelated Failure

As of commit fdfc7c0 with merge base bcf34d2:

NEW FAILURES - The following jobs have failed:

Check Labels / Check labels (gh)
RuntimeError: Error checking labels: PR does not have required labels
inductor / unit-test / linux-jammy-cpu-py3.12-gcc11-inductor-triton-cpu / test (inductor-triton-cpu, 1, 1, linux.12xlarge) (gh)
inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestCPU::test_2d_reduction_no_x_dim_cpu
Lint / lintrunner-noclang / linux-job (gh)
>>> Lint for torch/_inductor/choices.py:
pull / linux-jammy-cuda12.8-py3.10-gcc11-sm89 / test (default, 2, 5, linux.g6.4xlarge.experimental.nvidia.gpu) (gh)
inductor/test_torchinductor_strided_blocks.py::TritonBlockPointerTestGPU::test_2d_reduction_no_x_dim_cuda

UNSTABLE - The following job is marked as unstable, possibly due to flakiness on trunk:

pull / linux-jammy-py3_9-clang9-xla / test (xla, 1, 1, linux.12xlarge, unstable) (gh) (#158876)
sccache: error: couldn't connect to server

This comment was automatically generated by Dr. CI and updates every 15 minutes.

github-actions · 2025-07-24T15:12:52Z

This PR needs a `release notes:` label

If your changes are user facing and intended to be a part of release notes, please use a label starting with release notes:.

If not, please add the topic: not user facing label.

To add a label, you can comment to pytorchbot, for example
@pytorchbot label "topic: not user facing"

For more information, see
https://github.com/pytorch/pytorch/wiki/PyTorch-AutoLabel-Bot#why-categorize-for-release-notes-and-how-does-it-work.

pytorch-bot · 2025-07-25T00:45:10Z

Warning: Unknown label ciflow/inductor-perf-test-nightly.
Currently recognized labels are

ciflow/binaries
ciflow/binaries_libtorch
ciflow/binaries_wheel
ciflow/triton_binaries
ciflow/inductor
ciflow/inductor-periodic
ciflow/inductor-rocm
ciflow/inductor-perf-test-nightly-rocm
ciflow/inductor-perf-compare
ciflow/inductor-micro-benchmark
ciflow/inductor-micro-benchmark-cpu-x86
ciflow/inductor-perf-test-nightly-x86-zen
ciflow/inductor-cu126
ciflow/linux-aarch64
ciflow/mps
ciflow/nightly
ciflow/periodic
ciflow/periodic-rocm-mi300
ciflow/rocm
ciflow/rocm-mi300
ciflow/s390
ciflow/slow
ciflow/trunk
ciflow/unstable
ciflow/xpu
ciflow/torchbench
ciflow/op-benchmark
ciflow/pull
ciflow/h100
ciflow/h100-distributed
ciflow/win-arm64
ciflow/h100-symm-mem
ciflow/h100-cutlass-backend

Please add the new label to .github/pytorch-probot.yml

jataylo · 2025-07-25T00:45:24Z

@pytorchbot rebase -b main

pytorchmergebot · 2025-07-25T00:46:52Z

@pytorchbot started a rebase job onto refs/remotes/origin/main. Check the current status here

pytorchmergebot · 2025-07-25T00:46:55Z

Successfully rebased no_x_dim_test onto refs/remotes/origin/main, please pull locally before adding more changes (for example, via git checkout no_x_dim_test && git pull --rebase)

jataylo · 2025-07-25T09:45:56Z

@pytorchbot rebase

pytorchmergebot · 2025-07-25T09:47:22Z

@pytorchbot started a rebase job onto refs/remotes/origin/viable/strict. Check the current status here

pytorchmergebot · 2025-07-25T09:47:25Z

Successfully rebased no_x_dim_test onto refs/remotes/origin/viable/strict, please pull locally before adding more changes (for example, via git checkout no_x_dim_test && git pull --rebase)

…im removal (#2417)

We noticed that persistent reduction kernels can perform extremely poorly: https://ontrack-internal.amd.com/browse/SWDEV-539215

The root cause is that, under certain size restrictions and for certain kernels, "no_x_dim" mode is enabled, which embeds a static XBLOCK=1 into the kernel, so tuning is not optimal. Removing this mode and enabling autotune achieves 2x performance, which shows that new heuristics are needed.

We will bring this into 2.7 for a perf uplift. Discussion is ongoing with upstream on removing no_x_dim; they are in agreement provided there is no perf regression. The draft PR pytorch#159048 shows no perf loss on ROCm for any inductor benchmark.

Removing the tests because they are no longer relevant.
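To make the XBLOCK point concrete, here is a minimal pure-Python sketch of how a persistent reduction is tiled. It is an illustration under stated assumptions, not Inductor's generated code: the function name `persistent_row_sum` and its return shape are hypothetical, and each "program" simply reduces whole rows in one pass. With `xblock=1` (the value "no_x_dim" hard-codes) every row gets its own program; a larger `xblock` gives the same results with fewer programs, and that is the knob autotuning can explore once the value is no longer fixed.

```python
def persistent_row_sum(matrix, xblock):
    """Emulate a persistent reduction: each program reduces `xblock` whole rows.

    Hypothetical helper for illustration only. Returns the per-row sums and
    the number of programs that would be launched.
    """
    xnumel = len(matrix)
    out = [0] * xnumel
    # Ceiling division: how many programs are needed to cover all rows.
    num_programs = (xnumel + xblock - 1) // xblock
    for pid in range(num_programs):
        xoffset = pid * xblock
        # The final program may cover fewer than `xblock` rows (masking).
        for xindex in range(xoffset, min(xoffset + xblock, xnumel)):
            # "Persistent": the entire reduction dimension is reduced at once.
            out[xindex] = sum(matrix[xindex])
    return out, num_programs


rows = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12],
        [13, 14, 15, 16], [17, 18, 19, 20]]

fixed_out, fixed_programs = persistent_row_sum(rows, xblock=1)
tuned_out, tuned_programs = persistent_row_sum(rows, xblock=4)
print(fixed_out == tuned_out, fixed_programs, tuned_programs)
```

The sketch only shows that the choice of XBLOCK changes the launch shape and not the result; which value is actually fastest depends on the hardware and the problem size, which is what the benchmarks in this PR are meant to measure.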

jataylo added the ciflow/inductor-perf-test-nightly-rocm label (Trigger inductor perf tests on ROCm) Jul 24, 2025

pytorch-bot bot added the ciflow/inductor and module: inductor labels Jul 24, 2025

pytorchbot added the open source label Jul 24, 2025

jataylo added the ciflow/inductor-perf-test-nightly label (Trigger nightly inductor perf tests) Jul 25, 2025

pytorchmergebot force-pushed the no_x_dim_test branch from 861d9ea to 9b5cc18 Jul 25, 2025 00:46

jataylo added and then removed the pt2-pass-rate-regression label (Track regression of PT2 dashboard pass rate) Jul 25, 2025

[NO MERGE] is no_x_dim really faster?

fdfc7c0

pytorchmergebot force-pushed the no_x_dim_test branch from 9b5cc18 to fdfc7c0 Jul 25, 2025 09:47

jataylo mentioned this pull request Jul 25, 2025

[SWDEV-539215] - Autotune support for persistent reduction and no_x_dim removal ROCm/pytorch#2417

Merged
