Why does the Cloud Natural Language Model API return so many NULLs?

21 Views Asked by At

I have been working with Google's Cloud Natural Language Model in BigQuery, and I have noticed that a significant percent of requests generate a NULL response. Why?

Here's an example...

This code creates a table with 54 movie reviews from the publicly available IMDB movie reviews dataset, then creates a remote connection to the API, and then uses the function ML.UNDERSTAND_TEXT to do NLP on the reviews via the API. 10 out of the 54 results = NULL. I have tried this again with a different sample of movie reviews with the same result.

The code:

-- from: https://www.samthebrand.com/sentiment-analysis-using-sql-ai-bigquery/
-- Isolate a sample of natural language data

CREATE OR REPLACE TABLE `[project].[dataset].[table]` AS
SELECT review, movie_url, label, reviewer_rating
FROM `bigquery-public-data.imdb.reviews`
WHERE reviewer_rating IS NOT NULL AND RAND() < 0.001;

-- Create a connection to a remote model

CREATE OR REPLACE MODEL `[project].[dataset].[model_name_a]`
REMOTE WITH CONNECTION `[dataset].[location].[dataset]`
OPTIONS (REMOTE_SERVICE_TYPE = 'CLOUD_AI_NATURAL_LANGUAGE_V1');

-- Run the data through your model to extract sentiment

CREATE OR REPLACE TABLE `[project].[dataset].[table]` AS
SELECT *, float64(ml_understand_text_result.document_sentiment.score) as sentiment_score, float64(ml_understand_text_result.document_sentiment.magnitude) as magnitude_score, 
FROM ML.UNDERSTAND_TEXT(
  MODEL `[project].[dataset].[model_name_a]`,
  (SELECT review AS text_content, movie_url, label, reviewer_rating from `[dataset].[table]`)
  STRUCT('analyze_sentiment' AS nlu_option));

-- see how many NULLs

SELECT sentiment_score, COUNT(*)
FROM `[project].[dataset].[table]`
GROUP BY 1 ORDER BY 1;
0

There are 0 best solutions below