Saving and loading the downsampled audio results in a tensor with format (32,32,32,128)


I am trying to load a .wav file, downsample it, and save the result as a tensor of shape (32, 32, 32, 128), using the code below.

However, it does not work. Please help me with this.

import torch
import torch.nn.functional as F
import torchaudio

# Load the audio and resample it to 8 kHz
waveform, sample_rate = torchaudio.load('output.wav')
downsample_rate = 8000
downsample_resample = torchaudio.transforms.Resample(sample_rate, downsample_rate, resampling_method='sinc_interpolation')
down_sampled = downsample_resample(waveform)

print(down_sampled.shape)  # torch.Size([1, 25121])
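
For reference, a quick element-count check shows how far apart the two shapes are. This is only a sketch using the numbers quoted above (25121 samples, target shape (32, 32, 32, 128)); the variable names are illustrative.

```python
import math

target_shape = (32, 32, 32, 128)
num_samples = 25121  # taken from the printed shape torch.Size([1, 25121])

# A reshape keeps every element, so the counts would have to match exactly.
required = math.prod(target_shape)
print("elements required by target shape:", required)  # 4194304
print("elements available:", num_samples)
print("ratio required / available:", required / num_samples)
```

The target shape needs roughly 167 times more values than the downsampled audio contains, so a plain `view` or `reshape` cannot succeed without padding, repeating, or interpolating the signal first.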

I also tried the code below, but it still does not work.

target_shape = (32, 32, 32, 128)

# Calculate the number of elements in the down_sampled tensor
num_elements = down_sampled.numel()

# If the number of elements matches the target shape, reshape it
if num_elements == torch.tensor(target_shape).prod():
    reshaped_down_sampled = down_sampled.view(target_shape)
    print("Reshaped down_sampled shape:", reshaped_down_sampled.shape)
else:
    # If the number of elements does not match, resize it
    resized_down_sampled = F.interpolate(down_sampled.unsqueeze(0).unsqueeze(0), size=target_shape[1:3], mode='bilinear', align_corners=False)
    resized_down_sampled = resized_down_sampled.squeeze(0).squeeze(0)
    print("Resized down_sampled shape:", resized_down_sampled.shape)
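
One possible way to reach the target shape, sketched under assumptions: the audio is mono with shape (1, N), and stretching it along the time axis is acceptable. The helper name `stretch_to_shape` is hypothetical, and a random tensor stands in for the real `down_sampled`. The sketch interpolates along the single time dimension (mode='linear' on a 3D input) up to the exact element count and only then reshapes, instead of using a 2D bilinear resize to `target_shape[1:3]`, which can only produce a (32, 32) result.

```python
import math

import torch
import torch.nn.functional as F

target_shape = (32, 32, 32, 128)

# Stand-in for the resampled audio (assumption): mono, shape (1, 25121)
down_sampled = torch.randn(1, 25121)


def stretch_to_shape(signal, shape):
    """Linearly interpolate a (1, N) signal to prod(shape) samples, then reshape."""
    n = math.prod(shape)
    # 'linear' mode expects (batch, channels, length), so add a batch dimension
    stretched = F.interpolate(signal.unsqueeze(0), size=n, mode="linear", align_corners=False)
    # stretched has shape (1, 1, n); the element count now matches the target
    return stretched.reshape(shape)


reshaped = stretch_to_shape(down_sampled, target_shape)
print("Reshaped down_sampled shape:", reshaped.shape)
```

Note that this stretches about 25 thousand samples to about 4.2 million, so the result no longer corresponds to 8 kHz audio. If the original samples must be preserved, zero-padding (for example with `F.pad`) up to the required length before reshaping may be a better fit; which option is appropriate depends on what consumes the tensor.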