Convolution: CPU memory increase with growing number of different sequence lengths #62441

### Issue type

Bug

### Have you reproduced the bug with TensorFlow Nightly?

No

### Source

source

### TensorFlow version

tf 2.14.0

### Custom code

Yes

### OS platform and distribution

Linux Ubuntu 22.04

### Mobile device

_No response_

### Python version

3.11

### Bazel version

_No response_

### GCC/compiler version

_No response_

### CUDA/cuDNN version

CUDA V11.8.89, cuDNN version 8600

### GPU model and memory

NVIDIA GeForce GTX 1080 Ti

### Current behavior?

I noticed a linear increase in CPU memory usage in my setups when applying a convolution to raw waveforms (i.e., sequences that are long in time and have a single feature dimension). I was able to isolate the issue, and it seems to be related to the number of different sequence lengths that occur. If the sequence length is fixed to 100k, memory consumption is constant. If it is randomly sampled from a given range, memory consumption grows asymptotically towards a limit that increases with the size of the range. This can be observed in the plot below. Note also that memory consumption is not influenced by the absolute sequence length, only by the size of the range.

![image](https://github.com/tensorflow/tensorflow/assets/45091115/cbaeaadf-6d95-434f-bf2f-1a4aec232efc)
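
If the memory growth is indeed tied to the number of distinct sequence lengths seen so far (an assumption, not something verified here), a saturating curve is what one would expect: with uniform sampling, the count of distinct lengths approaches the size of the range. The following is a minimal sketch of that reasoning using only the standard library; the helper names are hypothetical and not part of the reproduction code.

```python
import random


def expected_distinct_lengths(n_time_min, n_time_max, n_steps):
    """Expected number of distinct lengths after n_steps uniform draws.

    The upper bound is exclusive, matching np.random.randint.
    """
    n_values = n_time_max - n_time_min
    return n_values * (1.0 - (1.0 - 1.0 / n_values) ** n_steps)


def simulated_distinct_lengths(n_time_min, n_time_max, n_steps, seed=0):
    """Count the distinct lengths actually drawn in one simulated run."""
    rng = random.Random(seed)
    seen = set()
    for _ in range(n_steps):
        seen.add(rng.randrange(n_time_min, n_time_max))
    return len(seen)
```

For a fixed length the count stays at 1, while for a wide range it keeps rising until nearly every length in the range has occurred, which would match constant memory in the first case and slow saturation in the second.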

I measured the memory consumption using `watch_memory()` from [here](https://github.com/rwth-i6/returnn/blob/c230d1408a2d7620a9d00a5171c998d3876e69dc/returnn/util/watch_memory.py#L13). The different runs in the plot correspond to different `n_time_min` and `n_time_max` in the stand-alone code.

I reproduced the issue with an apptainer image built on top of the [tensorflow 2.14 image from dockerhub](https://hub.docker.com/layers/tensorflow/tensorflow/2.14.0-gpu-jupyter/images/sha256-981372796921ef7bb75f4fe5fbe98c335824d08233bed57586633199028d5e18?context=explore). The image definition file looks as follows:

<details>

```
Bootstrap: docker
From: tensorflow/tensorflow:2.14.0-gpu 
Stage: build

%post
    apt update -y

    # all the fundamental basics; zsh is needed because calling the cache manager might launch the user shell
    DEBIAN_FRONTEND=noninteractive apt install -y wget git unzip gzip libssl-dev lsb-release zsh \
        bison libxml2-dev libopenblas-dev libsndfile1-dev libcrypto++-dev libcppunit-dev \
        parallel xmlstarlet python3-lxml htop strace gdb sox python3-pip cmake ffmpeg vim

    cd /usr/local
    git clone https://github.com/rwth-i6/cache-manager.git
    cd bin
    ln -s ../cache-manager/cf cf

    echo /usr/local/lib/python3.11/dist-packages/tensorflow > /etc/ld.so.conf.d/tensorflow.conf
    ldconfig

    apt install -y python3 python3-pip
    pip3 install -U pip setuptools wheel
    pip3 install ipdb
    pip3 install h5py six soundfile librosa==0.10 better-exchook dm-tree psutil
    pip3 install --ignore-installed psutil flask ipython
    pip3 install git+https://github.com/rwth-i6/sisyphus
    pip3 install black==22.3.0 matplotlib typing-extensions typeguard  # sequitur-g2p==1.0.1668.23
    pip3 install memray objgraph Pympler

```

</details>

### Standalone code to reproduce the issue

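```python
import numpy as np
import tensorflow as tf

n_feat = 1  # raw waveform: one feature per time step
n_out = 30  # number of output channels
filter_size = 5
n_steps = 100000
n_time_min = 10000  # lower bound of the sampled sequence length (inclusive)
n_time_max = 30000  # upper bound of the sampled sequence length (exclusive)
batch_size_max = 400000  # maximum number of time steps per batch

# Filter shape: (filter_width, in_channels, out_channels).
filters = tf.Variable(tf.random.normal((filter_size, n_feat, n_out), stddev=0.01))
for step in range(n_steps):
    # Sample a new sequence length every step; the number of distinct
    # lengths is what appears to drive the memory growth.
    n_time = np.random.randint(n_time_min, n_time_max)
    # Keep the total number of time steps per batch roughly constant.
    n_batch = batch_size_max // n_time
    x = tf.random.normal((n_batch, n_time, n_feat))
    y = tf.nn.convolution(
        x,
        filters=filters,
        padding="VALID",
    )
```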
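
To watch the memory while such a loop runs without depending on RETURNN, something like the following could be used. This is a minimal sketch, not the `watch_memory()` helper used for the plot: `rss_mib` and `track_rss` are hypothetical names, and the code only samples the resident set size of the current process (read from `/proc/self/statm` on Linux, with a peak-RSS fallback elsewhere).

```python
import os
import resource
import sys


def rss_mib():
    """Return the resident set size of this process in MiB."""
    try:
        # Linux: the second field of /proc/self/statm is the resident size in pages.
        with open("/proc/self/statm") as f:
            resident_pages = int(f.read().split()[1])
        return resident_pages * os.sysconf("SC_PAGE_SIZE") / 2**20
    except OSError:
        # Fallback: peak RSS, reported in KiB on Linux and in bytes on macOS.
        peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
        return peak / 2**20 if sys.platform == "darwin" else peak / 2**10


def track_rss(step_fn, n_steps, every=1000):
    """Call step_fn(step) n_steps times and sample the RSS every `every` steps."""
    samples = []
    for step in range(n_steps):
        step_fn(step)
        if step % every == 0:
            samples.append((step, rss_mib()))
    return samples
```

Moving the loop body into a function `step_fn(step)` and passing it to `track_rss` yields `(step, MiB)` pairs that can be printed or plotted for different `n_time_min` / `n_time_max` settings.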


### Relevant log output

_No response_
