RserveException: eval failed Syntax error


I have an R function that removes all HTML markup from an HTML page. It works when I run it in R, but when I run it through Rserve it produces this error:

Exception in thread "main" org.rosuda.REngine.Rserve.RserveException: eval failed, request status: R parser: syntax error

at org.rosuda.REngine.Rserve.RConnection.eval(RConnection.java:234)
at CereScope_Data.main(CereScope_Data.java:80)

The Java eval call where I get the error:

REXP lstrRemoveHtml = cobjConn.eval("RemoveHtml('" + lstrRawData + "')");

My R function (rawdata is an HTML page):

RemoveHtml <- function(rawdata) {
  
  library("tm")
  
  ## Converting Data To UTF-8 Format
  ## Creating Corpus
  Encoding(rawdata) <- "latin1"
  docs <- Corpus(VectorSource(iconv(rawdata, from = "latin1", to = "UTF-8", sub = "")))
  
  toSpace <- content_transformer(function(x , pattern) gsub(pattern, " ", x))
  
  docs <- gsub("[^\\b]*(<style).*?(</style>)", " ", docs)
  docs <- Corpus(VectorSource(gsub("[^\\b]*(<script).*?(</script>)", " ", docs)))
  docs <- tm_map(docs, toSpace, "<.*?>")
  docs <- tm_map(docs, toSpace, "(//).*?[^\n]*")
  docs <- tm_map(docs, toSpace, "/")
  docs <- tm_map(docs, toSpace, "\\\\t")
  docs <- tm_map(docs, toSpace, "\\\\n")
  docs <- tm_map(docs, toSpace, "\\\\")
  docs <- tm_map(docs, toSpace, "@")
  docs <- tm_map(docs, toSpace, "\\|")
  
  docs <- tm_map(docs, toSpace, "\\\"")
  docs <- tm_map(docs, toSpace, ",")
  RemoveHtmlDocs <- tm_map(docs, stripWhitespace)
  
  return(as.character(RemoveHtmlDocs)[1])
}

Update - Things I tried already

  1. Escaping characters that may cause problems, such as single quotes, double quotes, and backslashes
  2. Assigning the whole data to an R variable through eval and then running the function

New Update - Question Solved

  1. Unescaped characters such as single quotes, double quotes, and backslashes were causing the problem.
  2. Another line, which was no longer necessary, was also causing the problem because I had not commented it out or removed it.

Thanks all! Check my answer for a description.

BEST ANSWER

Escape characters were the issue. To solve the problem, I escaped the backslashes and quotes in the data. I created this method to make it simpler:

public static String Regexer(String Data) {
    String RegexedData = Data.replaceAll("\\\\", "\\\\\\\\").replaceAll("'", "\\\\'").replaceAll("\"", "\\\\\"");
    return (RegexedData);
}

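The replacement strings in the method above are escaped twice: once for the Java string literal and once for the replacement syntax of replaceAll. The result is that a single backslash ends up in front of each backslash and quote, so R also reads them as escaped characters.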
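As an illustration of the same idea, here is a minimal sketch that uses String.replace, which treats its arguments as literal text rather than regex, so each replacement needs only one level of Java escaping. The class name RStringEscaper and the helpers escapeForR and buildCall are hypothetical names chosen for this example; they are not part of Rserve. As far as I can tell it produces the same output as the method above.

```java
public class RStringEscaper {

    // Hypothetical helper: escapes text so it can sit inside a quoted R string literal.
    // The backslash must be handled first; otherwise the backslashes added for the
    // quotes would be doubled by the later replacement.
    static String escapeForR(String data) {
        return data
                .replace("\\", "\\\\")   // \  becomes \\
                .replace("'", "\\'")     // '  becomes \'
                .replace("\"", "\\\"");  // "  becomes \"
    }

    // Hypothetical helper: builds the R call text that would be passed to eval.
    static String buildCall(String rawData) {
        return "RemoveHtml('" + escapeForR(rawData) + "')";
    }

    public static void main(String[] args) {
        String raw = "<a href='x.html'>it's a \"test\" with a \\ backslash</a>";
        // Print the expression to inspect what the R parser would receive.
        System.out.println(buildCall(raw));
    }
}
```

Printing the built expression before sending it is a cheap way to spot a quote that would end the R string early.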

Tip: Don't forget to convert the returned REXP to a Java variable.

ANOTHER ANSWER

The error lies in:

REXP lstrRemoveHtml = cobjConn.eval("RemoveHtml('" + lstrRawData + "')");

In Java, \ is an escape character, so \" puts a literal double quote into the Java string. That quote then acts as the string delimiter in the R expression.

Solution: Build the expression string from lstrRawData first, then pass it to eval:

String exp = "RemoveHtml(\"" + lstrRawData + "\")";
REXP lstrRemoveHtml = cobjConn.eval(exp);
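To check what R would actually receive from a concatenation like the one above, the expression can be built and inspected without a server. The sketch below is only an illustration: ExpressionPreview, buildExpression and countUnescapedDoubleQuotes are hypothetical names, and the quote count is a rough heuristic, not an R parser. It assumes a well-formed call with one double-quoted argument contains exactly two unescaped double quotes.

```java
public class ExpressionPreview {

    // Hypothetical helper mirroring the concatenation shown above.
    static String buildExpression(String rawData) {
        return "RemoveHtml(\"" + rawData + "\")";
    }

    // Counts double quotes that are not preceded by an escaping backslash.
    static int countUnescapedDoubleQuotes(String s) {
        int count = 0;
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '\\') {
                i++;            // skip the character that the backslash escapes
            } else if (c == '"') {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        String plain = "<p>hello</p>";
        String quoted = "<a href=\"x.html\">link</a>";

        // Anything other than 2 suggests the R string literal ends early.
        System.out.println(countUnescapedDoubleQuotes(buildExpression(plain)));
        System.out.println(countUnescapedDoubleQuotes(buildExpression(quoted)));
    }
}
```

If the raw data itself contains double quotes, they still need escaping before concatenation; changing the delimiter alone does not cover that case.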