PyAutoTune silent output

I've been playing around with the PyAutoTune module for a while now, and I can't seem to make it work. Unfortunately, documentation is non-existent, and the only thing I have to work with is the example in the Examples folder.

What I would like to do is get a .wav file from a TTS engine and autotune it, to get a more robot-like voice. When I run the FromFileAutoTune example from the command line, I only get an empty wav file with no sound in it.

What I've tried:

  • I've tried writing print statements to monitor the different parts of the process.
  • I noticed that my wav file had a 16-bit depth, so I tried reading the chunks as 16-bit integers and then converting them to float32. The data I get is exactly the same, so that didn't really help.
  • I can see that the chunks are read correctly and that the datas array has the right format.
  • I can see that the rawfromC data returned by AutoTune.Tuner(...) is almost always just a long string of 0s, which worries me.
  • I've checked the other example (RealTimeAutoTune) and it works really well. I'm able to speak in my microphone and get the autotuned version in (almost) real time. So I know that the module somehow works there.
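
For reference, the 16-bit conversion and the zero check from the list above can be reproduced without audiolab. This is only a minimal sketch using the standard library; the helper names pcm16_to_float and nonzero_fraction are made up for illustration and are not part of PyAutoTune.

```python
import array
import sys


def pcm16_to_float(raw):
    """Convert little-endian 16-bit PCM bytes to floats in [-1.0, 1.0)."""
    samples = array.array("h")  # signed 16-bit integers
    samples.frombytes(raw)
    if sys.byteorder == "big":
        samples.byteswap()  # WAV sample data is stored little-endian
    return [s / 32768.0 for s in samples]


def nonzero_fraction(chunk):
    """Return the fraction of samples in a chunk that are not exactly zero."""
    if len(chunk) == 0:
        return 0.0
    return sum(1 for s in chunk if s != 0) / float(len(chunk))


if __name__ == "__main__":
    # Three hand-written samples standing in for a chunk read from a file.
    chunk = pcm16_to_float(b"\x00\x00\x00\x40\x00\x80")
    print(chunk, nonzero_fraction(chunk))
```

Calling nonzero_fraction on both the input chunk and the array returned by the tuner, chunk by chunk, would show exactly where the signal disappears.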

Since the data that I pass to AutoTune.Tuner() has virtually the same structure in both examples, and the other parameters are exactly the same, I would expect a non-blank output, but that is not the case. Am I missing something obvious? Right now AutoTune.Tuner() is just a magical black box to me: there is no indication of what inputs it requires or what output it produces, except what can be deduced from the two examples provided. Anything that can point me in the right direction is greatly appreciated!

Here's my code so far:

#Copyright (c) 2012, Eng Eder de Souza
#AutoTune from Wav File Example!

import sys
import numpy
import scikits.audiolab as audiolab 
import AutoTune

FORM_CORR=0
SCALE_ROTATE=0
LFO_QUANT=0
CONCERT_A=440.0
FIXED_PITCH=2.0
FIXED_PULL=0.1
CORR_STR=1.0
CORR_SMOOTH=0.0
PITCH_SHIFT=1.0
LFO_DEPTH=0.1
LFO_RATE=1.0
LFO_SHAPE=0.0
LFO_SYMM=0.0
FORM_WARP=0.0
MIX=1.0
KEY="c"
CHUNK=2048


NewSignal=[]

if len(sys.argv)<3 :
        print('Usage: '+sys.argv[0]+' <Input audio file.wav> <Output audio file.wav>')
        sys.exit(0)

IN=sys.argv[1]
OUT=sys.argv[2]





f = audiolab.Sndfile(IN, 'r')

FS = f.samplerate
nchannels  = f.channels



datas = f.read_frames(CHUNK, dtype=numpy.int16).astype(numpy.float32)*(1.0/32768.0)
print(datas) #this returns the correct data structure ([0.xxxxx 0.xxxxxx ... 0.xxxx])

while len(datas) > 0:  # datas is a numpy array, so compare its length rather than testing against ''

    print(".")
    Signal = datas.data[:]
    print(datas.dtype) #this outputs float32, as it should be

    rawfromC=AutoTune.Tuner(Signal,FS,CHUNK,SCALE_ROTATE,SCALE_ROTATE,LFO_QUANT,CONCERT_A,FIXED_PITCH,FIXED_PULL,CORR_STR,CORR_SMOOTH,PITCH_SHIFT,LFO_DEPTH,LFO_RATE,LFO_SHAPE,LFO_SYMM,FORM_WARP,MIX,KEY)
    for s in rawfromC:
        NewSignal.append(s)
    print(rawfromC) #this almost only returns an array of 0s

    try:
        datas = f.read_frames(CHUNK, dtype=numpy.int16).astype(numpy.float32)*(1.0/32768.0)
        print("datas:"+str(datas)) #datas seems to be correctly formatted

    except:
        break

dataArray = numpy.array(NewSignal)
fmt         = audiolab.Format('wav', 'pcm32')


# making the file .wav
afile =  audiolab.Sndfile(OUT, 'w', fmt, nchannels, FS)

#writing in the file
afile.write_frames(dataArray)

print("Done!")

EDIT: Looking at the output, there are actually some chunks where almost 25% of the entries are not 0. But still, this is very far from the desired output...
