Audio resampling layer for tensorflow

1k Views Asked by ir0098 At 28 July 2025 at 00:22

I need to resample audio signals inside a custom model. The resampling is not a pre/post-processing step that can be done outside the model; it is part of the model's internal design, so the layer also needs a gradient. For the resampling itself I plan to use TensorFlow I/O:

tfio.audio.resample

The operation works perfectly and is easy to use as a pre/post-processing step; however, implementing it as a custom layer embedded within the model is challenging because I don't know how to implement the backward pass.

How should the backward pass be implemented for such a 1D signal resampling layer?
Is there any other open-source 1D signal resampling layer that could be used?

P.S. I tried conventional upsampling/pooling-style layers, but they are not accurate enough compared with tfio, which implements other resampling methods such as FFT-based ones.
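For the FFT-based case, one possible direction is to build the resampler from TensorFlow ops that already have gradients, so that no hand-written backward pass is needed. The following is only a minimal sketch under stated assumptions, not how tfio.audio.resample works internally: the names fft_resample and FFTResample are hypothetical, the input length is assumed to be statically known, and the Nyquist bin is not given the special handling a production resampler would apply. It truncates or zero-pads the spectrum from tf.signal.rfft and inverts it with tf.signal.irfft.

```python
import tensorflow as tf


def fft_resample(x, num_out):
    """Resample the last axis of a real tensor to num_out samples via the FFT.

    Hypothetical helper: assumes x has a statically known last dimension.
    """
    num_in = x.shape[-1]
    bins_in = num_in // 2 + 1
    bins_out = num_out // 2 + 1

    spec = tf.signal.rfft(x)  # shape (..., bins_in), complex
    if bins_out <= bins_in:
        # Downsampling: drop the frequency bins above the new Nyquist limit.
        spec = spec[..., :bins_out]
    else:
        # Upsampling: append zero-valued high-frequency bins.
        pad = [[0, 0]] * (len(x.shape) - 1) + [[0, bins_out - bins_in]]
        spec = tf.pad(spec, pad)

    # The spectrum now has exactly num_out // 2 + 1 bins, matching fft_length.
    y = tf.signal.irfft(spec, fft_length=[num_out])
    # irfft normalises by num_out, so rescale to preserve the amplitude.
    return y * (num_out / num_in)


class FFTResample(tf.keras.layers.Layer):
    """Thin Keras wrapper (hypothetical name) around fft_resample."""

    def __init__(self, num_out, **kwargs):
        super().__init__(**kwargs)
        self.num_out = int(num_out)

    def call(self, inputs):
        return fft_resample(inputs, self.num_out)
```

Because every op in the sketch is differentiable, wrapping a call such as FFTResample(128)(x) in tf.GradientTape should yield gradients with respect to x without a custom gradient definition. The spectrum is cropped or padded explicitly, rather than relying on irfft to do it, so that the tensor passed to irfft always matches fft_length.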

For more context, please have a look at: another question


Jirayu Kaewprateep On 29 March 2022 at 16:07

You need to state the objective of the resampling. It can be done in many ways, including summarizing the signal with sine values so that it can be represented at a smaller size.

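Changing the sampling rate can save data space. The sample below first plots the scaled sine of the first five seconds, 0.05 * tf.math.sin(audio[:5 * 22050]).numpy(), and then builds two sections,

sec_1 = np.zeros((2750)) * tf.math.sin(audio[0:2750]).numpy() and

sec_2 = np.ones((2750)) * tf.math.sin(audio[2750:5500]).numpy()

which are concatenated and plotted. Note that sec_1 is all zeros because it is multiplied by np.zeros.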

[ Sample ]:

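import numpy as np
import tensorflow as tf

import matplotlib.pyplot as plt

# Read and decode the WAV file; audio has shape (samples, channels).
contents = tf.io.read_file("F:\\temp\\Python\\Speech\\temple_of_love-sisters_of_mercy.wav")
audio, sample_rate = tf.audio.decode_wav(
    contents, desired_channels=-1, desired_samples=-1, name=None
)

print(audio)
print(sample_rate)

# Plot the first five seconds (the file is sampled at 22050 Hz).
plt.plot(audio[:5 * 22050])
plt.show()
plt.close()

# Plot the scaled sine of the same five seconds.
plt.plot(0.05 * tf.math.sin(audio[:5 * 22050]).numpy())
plt.show()
plt.close()

# Use a single channel so the sections stay 1-D; multiplying a (2750,) array
# by a (2750, 1) array would broadcast to (2750, 2750).
mono = audio[:, 0]
sec_1 = np.zeros((2750)) * tf.math.sin(mono[0:2750]).numpy()    # zeroed section
sec_2 = np.ones((2750)) * tf.math.sin(mono[2750:5500]).numpy()  # kept section

# Concatenate the two sections and plot the result.
plt.plot(0.05 * tf.concat([sec_1, sec_2], 0).numpy())
plt.show()
plt.close()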

[ Output ]:

array([[0.],
       [0.],
       [0.],
       ...,
       [0.],
       [0.],
       [0.]], dtype=float32)>, sample_rate=<tf.Tensor: shape=(), dtype=int32, numpy=22050>)

tf.Tensor(22050, shape=(), dtype=int32)
