Extract URL part of Text before a certain character

168 Views Asked by At

I am trying to extract the hostname from these strings, i.e. everything before the first % sign.

URL before the first % sign must be extracted (the bold parts), I have tried to use MID, SUBSTRING, CHARINDEX, LEFT, and LEN and just getting all muddled up.

Here are the code (at the bottom is one of my attempts).

DROP TABLE IF EXISTS #temp2;

CREATE TABLE #temp2 (url_link VARCHAR(2500));

INSERT INTO #temp2 (url_link)
VALUES
    ('az.account.box.com%2fsignup%2fcoll'),
    ('az.account.box.com%2fsignup%2fcollab%2f6kn7988x7i%3fbox_source%3dlegacy-external_collab_email%26amp%3bbox_action%3dclick_button'),
    ('app.box.com%2ffile%2f1105756887416%3fsb%3d%2factivity%2fcomments%2f515200504%26comment_id%3d515200504'),
    ('evelyn.sharefile.eu%2ff%2ffo292663-12a3-437f-8ec4-17a8f49ba4b6%3fa%3dc23e44cbd721d386'),
    ('app.box.com%2f'),
    ('app.box.com%2ffolder%2f0'),
    ('www.evernote.com%2fLogin.action');

SELECT
    url_link,
    CHARINDEX('%', url_link) AS [Pos%Char],
    LEFT(url_link, CHARINDEX('%', url_link) - 1) AS ExtractedHostname
FROM #temp2;
1

There are 1 best solutions below

0
Aswin P.M On

You're using LEFT function to extract the substring before the % sign, but you need to subtract 1 from the length to exclude the % sign itself. Here's the corrected version of your SQL query:

SELECT
    url_link,
    CHARINDEX('%', url_link) AS [Pos%Char],
    LEFT(url_link, CHARINDEX('%', url_link) - 1) AS ExtractedHostname
FROM #temp2;