Removing smart quotes from an SQL server text column

2k Views Asked by At

Problem: My company is using Liferay 6.0 which being upgraded to Liferay 6.1 GA2. In the 6.0 we have a large number of rows in an SQL server database column that have smart quote characters. In the automatic upgrade the smart quotes are being replaced by a '?'. We were planning to run a find and replace SQL statement on the content before upgrading to prevent this issue. I wrote a small Java program to find the ASCII values of the character. The character appears differently in ANSI ("A‎A") than in UTF-8 ("A‎A"). I have sandwiched the smart quote character between capital A's in the preceding sentence.

public class CheckAscii {
    public static void main(String args[]) {
        asciiVals("A‎A"); //ANSI
        asciiVals("A‎A"); //UTF-8
    }
    static void asciiVals(String str) {
        System.out.println("Length of the string: " + str + " is " + str.length());
        for (int j=0; j<str.length(); j++){
            System.out.println("Ascii value of char " + str.charAt(j) + " : " + (int)str.charAt(j));
        }
        System.out.println("");
    }
}

The output was

Length of the string: A‎A is 5
Ascii value of char A : 65
Ascii value of char â : 226
Ascii value of char € : 8364
Ascii value of char Ž : 381
Ascii value of char A : 65

Length of the string: A‎A is 3
Ascii value of char A : 65
Ascii value of char ‎ : 8206
Ascii value of char A : 65

The below queries did not give me any rows,

SELECT [COLUMN] FROM [TABLE] where [COLUMN] like '%['+char(226)+']%'
SELECT [COLUMN] FROM [TABLE] where [COLUMN] like '%['+char(8364)+']%'
SELECT [COLUMN] FROM [TABLE] where [COLUMN] like '%['+char(381)+']%'
SELECT [COLUMN] FROM [TABLE] where [COLUMN] like '%['+char(8206)+']%'

This query fetches,

select CHAR(226), CHAR(8364), CHAR(381), CHAR(8206)

â null null null

I don't how to look for those characters in the text. Does anyone know how to form a search query for the smart quotes in SQL Server?

1

There are 1 best solutions below

0
On

If the data type is in fact of TEXT, first cast the column to VARCHAR(MAX) like this:

SELECT [COLUMN] FROM [TABLE] where CAST([COLUMN] AS VARCHAR(MAX)) like '%['+char(226)+']%'
SELECT [COLUMN] FROM [TABLE] where CAST([COLUMN] AS VARCHAR(MAX)) like '%['+char(8364)+']%'
SELECT [COLUMN] FROM [TABLE] where CAST([COLUMN] AS VARCHAR(MAX)) like '%['+char(381)+']%'
SELECT [COLUMN] FROM [TABLE] where CAST([COLUMN] AS VARCHAR(MAX)) like '%['+char(8206)+']%'