awk dot in regex doesn't match space

Question

101 Views Asked by minseong At 18 February 2024 at 07:01

I want to print everything after the leading whitespace. E.g. given an indented "   hello there" I want to print "hello there", and given an indented "   what's up" I want to print "what's up".

I wrote this fully expecting it to work:

{ print $0 }
match($0, /[^[:space:]].*$/) {
    print $1
}

I thought the regex /[^[:space:]].*$/ would match the first non-space character, and then .*$ would match all of the characters after it.

But the regex only seems to capture up to the next whitespace:

$ echo hello there | awk -f after_indent.awk
hello there
hello

**minseong** · Answer 1 · 2024-02-18T07:33:03.873000

$1 always refers to the first field of the record; it is not the match result. match() records where the match starts and how long it is in RSTART and RLENGTH, so you have to use substr to get the matched text:

{ print $0 }
match($0, /[^[:space:]].*$/) {
    print substr($0, RSTART, RLENGTH)
}
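
To make the difference visible, here is a minimal sketch that prints RSTART, RLENGTH, $1 and the substr result side by side. The indented sample input and the helper name show_match are assumptions made for illustration only:

```shell
# show_match is a hypothetical helper name used only for this illustration.
show_match() {
    printf '%s\n' "$1" | awk '
        match($0, /[^[:space:]].*$/) {
            # RSTART: 1-based index where the match begins; RLENGTH: its length
            printf "RSTART=%d RLENGTH=%d\n", RSTART, RLENGTH
            # $1 is just the first whitespace-separated field
            printf "field1=[%s]\n", $1
            # substr() extracts the text that match() actually found
            printf "match=[%s]\n", substr($0, RSTART, RLENGTH)
        }'
}

show_match '   hello there'
# RSTART=4 RLENGTH=11
# field1=[hello]
# match=[hello there]
```

So the regex did match through to the end of the line; only print $1 cut the output short.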

**The fourth bird** · Answer 2 · 2024-02-18T10:56:35.990000

You could also simply remove any leading whitespace from each line. Because * also matches the empty string, sub always succeeds and returns 1, so every line is printed, whether or not it was indented.

awk 'sub(/^[[:space:]]*/, "")' file

If both your example strings are in file, that will print:

hello there
what's up
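
As a rough illustration of why the quantifier matters, the sketch below runs the same input through a * version and a + version. The function names strip_all and strip_indented_only and the sample input are made up for this example:

```shell
# strip_all / strip_indented_only are hypothetical names for this illustration.
strip_all() {
    # * also matches the empty string, so sub() returns 1 for every line
    awk 'sub(/^[[:space:]]*/, "")'
}

strip_indented_only() {
    # + needs at least one whitespace character, so sub() returns 0
    # (and the line is not printed) when there is no leading whitespace
    awk 'sub(/^[[:space:]]+/, "")'
}

printf '%s\n' 'hello there' "   what's up" | strip_all
# hello there
# what's up

printf '%s\n' 'hello there' "   what's up" | strip_indented_only
# what's up
```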

**glenn jackman** · Answer 3 · 2024-02-18T14:01:59.837000

With GNU awk, the match function can take a 3rd argument: an array that will contain the text matched in capturing parentheses:

gawk 'match($0, /([^[:blank:]].*)/, m) {print m[1]}' file
# ...............^..............^

m[1] contains the text from the 1st pair of parentheses.

**Ed Morton** · Answer 4 · 2024-02-19T14:59:30.580000

FWIW I'd just use sed for this, e.g. given this input:

$ cat file
hello there
          what's up

and using any POSIX sed, choose one of the following depending on whether or not you really want to duplicate lines in the output and how you want to handle lines that don't start with spaces:

$ sed 's/^[[:space:]]*//p' file
hello there
hello there
what's up
what's up

$ sed 's/^[[:space:]]\+//p' file
hello there
what's up
what's up

$ sed -n 's/^[[:space:]]*//p' file
hello there
what's up

$ sed -n 's/^[[:space:]]\+//p' file
what's up

**RARE Kpop Manifesto** · Answer 5 · 2024-02-21T14:24:25.787000

echo "      what's up\n \t  hello there   \t " |

Trimming just the head:

mawk ++NF FS='^[ \t-\r]+' OFS=

|what's up|
|hello there     |

Trimming head and tail:

gawk ++NF FS='^[ \t-\r]+|[ \t-\r]+$' OFS=

|what's up|
|hello there|

If \v, \f, and \r cannot occur in your data, it gets a lot cleaner:

change [ \t-\r]+ to [ \t]+
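
The same head-and-tail trimming can also be sketched with gsub and POSIX character classes rather than the FS reassignment used above. This is a different technique from the one in this answer, and the function name trim_both is hypothetical:

```shell
# trim_both is a hypothetical name used only for this illustration.
trim_both() {
    awk '{
        # delete a run of whitespace anchored at the start or at the end
        gsub(/^[[:space:]]+|[[:space:]]+$/, "")
        # wrap the result in | so any leftover whitespace would be visible
        print "|" $0 "|"
    }'
}

printf "      what's up\n \t  hello there   \t \n" | trim_both
# |what's up|
# |hello there|
```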
