How to stream a .aac URL to the wit.ai speech API


I am trying to take the URL of a Facebook Messenger audio clip and forward the audio at that URL to the wit.ai speech API.

The incoming message provides a payload URL which, when downloaded, yields a .aac file. From the API docs (HTTP API - speech endpoint), it looks like the .aac file type is not supported.

I have tried changing the header to send the mpeg3 content type (in the hope that it might be accepted); however, all my responses have no text, and the Wit console shows "no text" in the logs for the incoming message. To be clear, I have tried the request below with every audio content type listed on the docs page.

The send is in the form:

curl -XPOST 'https://api.wit.ai/speech?' \
 -i -L \
 -H "Authorization: Bearer <TOKEN>" \
 -H "Content-Type: audio/mpeg3" \
 -H "Transfer-encoding: chunked" \
 --data-binary "https://cdn.fbsbx.com/v/<rest of url>"

The request itself is accepted, as the response below shows, but no text is returned, so I assume the file type is my issue.

HTTP/1.1 200 OK
Server: nginx/1.8.0
Date: Wed, 04 Jan 2017 12:51:13 GMT
Content-Type: application/json
Content-Length: 91
Connection: keep-alive

{
   "msg_id" : "12265ac7-3050-4cd2-94c1-7bf0d27eeab4",
   "_text" : "",
   "entities" : { }
}
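A note on the request above: curl sends the argument of --data-binary literally unless it starts with @, so the body posted here was the text of the URL rather than the audio stored at it. That may be part of why "_text" is empty, separately from the format question. Below is a minimal Python 3 sketch of the difference, assuming the clip is downloaded first and its bytes are used as the body. The names build_speech_request and fetch_audio are hypothetical, and the sketch does not make Wit accept AAC; it only shows what the body should contain.

```python
import urllib.request

# Endpoint taken from the curl call above
WIT_SPEECH_URL = "https://api.wit.ai/speech"


def build_speech_request(audio_bytes, token, content_type="audio/wav"):
    """Build (but do not send) a POST whose body is the audio itself."""
    if not isinstance(audio_bytes, (bytes, bytearray)):
        # Guards against the mistake of passing the clip's URL as the body
        raise TypeError("body must be raw audio bytes, not a URL string")
    return urllib.request.Request(
        WIT_SPEECH_URL,
        data=bytes(audio_bytes),
        method="POST",
        headers={
            "Authorization": "Bearer " + token,
            "Content-Type": content_type,
        },
    )


def fetch_audio(url):
    """Download the clip so its bytes, not its address, become the body."""
    with urllib.request.urlopen(url) as resp:
        return resp.read()
```

Sending would then look like urllib.request.urlopen(build_speech_request(fetch_audio(clip_url), token)), where clip_url and token are placeholders for the payload URL and the Wit token.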

Checking the Wit console under "Voice" in the inbox, I see nothing, so it is evidently not picking up the audio because my headers and/or file type are wrong.

I don't think it is possible to stream the file that the Messenger app creates (on iOS, if that matters), so is it possible to convert .aac to .wav at runtime using Node or Python on the backend?

Any help appreciated.


There is 1 solution below


For now, it looks like the technical answer to my question is "it can't be done", as Wit doesn't support that format. However, as a workaround I have created a Python script that takes the name of the downloaded file and creates a local converted .wav file, which I then stream to the speech API.

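from pydub import AudioSegment  # relies on ffmpeg (or avconv) to decode AAC
import json
import os
import sys
import uuid

# Directory holding both the downloaded .aac files and the converted .wav files
PATH = os.path.join(os.path.dirname(os.path.realpath(sys.argv[0])), "speech")


def convertFiletoWAV(infile):
    """Convert PATH/<infile> from AAC to a uniquely named WAV file in PATH."""
    out_file = 'speech_' + str(uuid.uuid1()) + '.wav'
    out_name = os.path.join(PATH, out_file)
    in_name = os.path.join(PATH, infile)
    AudioSegment.from_file(in_name, format="aac").export(out_name, format="wav")
    return {
        'in_full': in_name,
        'in_file': infile,
        'out_full': out_name,
        'out_file': out_file
    }


def main(infile):
    # Print the result as one JSON line so the calling Node process can parse it
    print(json.dumps(convertFiletoWAV(str(infile))))
    sys.stdout.flush()


if __name__ == "__main__":
    main(sys.argv[1])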

The script is then called from the JS audio handler:

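// Assumed imports: the exact require forms depend on the installed package
// versions. `config` is the application's own config object, defined elsewhere.
const fs = require('fs')
const request = require('request')
const PythonShell = require('python-shell')
const uuidV1 = require('uuid/v1')

let convertAndParseSpeech = (url) => {
  let file_p = config.root + '/python/speech/'
  let file = "fb_" + uuidV1() + "_down.aac"
  return new Promise(function(resolve, reject) {
    // Download the clip from the URL passed in the Facebook payload
    let stream = request
      .get(url)
      .pipe(fs.createWriteStream(file_p + file))

    stream.on('error', reject)
    stream.on('finish', () => {
      console.log(file);
      PythonShell.defaultOptions = {
        scriptPath: config.root + '/python/'
      };
      var options = {
        mode: 'text',
        args: [file]
      }
      PythonShell.run('convertAudio.py', options, (err, results) => {
        if (err || !results || !results.length) {
          return reject(err || new Error('No output from convertAudio.py'))
        }
        // In text mode, results is an array of stdout lines; the script prints one JSON line
        let converted
        try {
          converted = JSON.parse(results[0])
        } catch (parseErr) {
          return reject(parseErr)
        }
        console.log("RESULTS :", converted)
        // Stream the converted WAV file to the speech endpoint
        fs.createReadStream(file_p + converted.out_file)
          .pipe(
            request.post({
              url: 'https://api.wit.ai/speech?v=20160526',
              json: true,
              headers: {
                "Content-Type": "audio/wav",
                "Authorization": "Bearer " + config.WIT_TOKEN
              }
            }, (err, res, body) => {
              // Succeed only when there is no transport error and Wit returned 200
              if (!err && res.statusCode == 200) {
                resolve(body);
              } else {
                reject(err || new Error('Wit returned status ' + res.statusCode));
              }
            })
          );
      })
    })
  });
}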

This works when split into individual parts or when run from the command line; however, when running tests I get an error back from the Python script saying pydub is unavailable.

ImportError: No module named pydub
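One way to narrow down an error like the one above is to check which interpreter actually runs the script, since a module installed for one Python is not visible to another. The following is only a diagnostic sketch; the function name interpreter_report is hypothetical, and it is written to run under either Python 2.7 or Python 3. Running it once from the command line and once through PythonShell, then comparing the "executable" values, should show whether the two environments differ.

```python
import importlib
import json
import sys


def interpreter_report(module_name="pydub"):
    """Report which interpreter is running and whether it can import a module."""
    try:
        module = importlib.import_module(module_name)
        found = True
        location = getattr(module, "__file__", None)
    except ImportError:
        found = False
        location = None
    return {
        "executable": sys.executable,       # path of the running interpreter
        "version": sys.version.split()[0],  # e.g. "2.7.13" or "3.11.4"
        "module": module_name,
        "found": found,
        "location": location,
    }


if __name__ == "__main__":
    # Optional first argument: the module to look for (defaults to pydub)
    print(json.dumps(interpreter_report(*sys.argv[1:2])))
```

If the executables differ, python-shell's pythonPath option is, as far as I know, the setting for pointing it at a specific interpreter; check that against the documentation for the installed version.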

I will follow up with the PythonShell people to see what I'm doing wrong and edit the answer with the result, but this solution gives me the basic functionality I wanted for now.
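If the pydub import stays a problem in that environment, one possible alternative is to call ffmpeg directly, since pydub itself hands non-WAV formats to ffmpeg (or avconv) for decoding. This swaps pydub for a subprocess call and is a sketch under assumptions: ffmpeg is on the PATH, and the helper names ffmpeg_command and convert_to_wav are hypothetical. The output naming mirrors the script above.

```python
import os
import subprocess
import uuid


def ffmpeg_command(src, dst):
    """Return the argument list for a plain AAC-to-WAV conversion."""
    # -y overwrites an existing output file; -i names the input file.
    # ffmpeg picks the WAV container from the .wav extension of dst.
    return ["ffmpeg", "-y", "-i", src, dst]


def convert_to_wav(src, out_dir):
    """Convert src into a uniquely named .wav file in out_dir and return its path."""
    dst = os.path.join(out_dir, "speech_" + str(uuid.uuid1()) + ".wav")
    # check_call raises CalledProcessError if ffmpeg exits with a non-zero status
    subprocess.check_call(ffmpeg_command(src, dst))
    return dst
```

Because the command is passed as a list rather than a shell string, file names containing spaces or shell metacharacters are handed to ffmpeg unchanged.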