Using AWK with multiple delimiters


I'm trying to print the date part of column $1 together with the whole of columns $2, $3 and $5, but I'm getting different output.

Input:

2023-08-01 05:30:01,Lakers,CA,LA,US
2023-10-05 16:40:23,Denver Nuggets,CO,DN,US
2024-01-20 16:40:23,Utah Jazz,UT,SLC,US

Expected output:

2023-08-01,Lakers,CA,US

What I tried:

cut -d, -f1,2,3,5

awk -F',' '{ print $1,$2,$3,$5 }'

There are 4 best solutions below

Ed Morton On BEST ANSWER

This may be what you want, using any awk, but without expected output in the question it's a guess:

$ awk 'BEGIN{FS=OFS=","} {sub(/ [^,]+/,""); print $1, $2, $3, $5}' file
2023-08-01,Lakers,CA,US
2023-10-05,Denver Nuggets,CO,US
2024-01-20,Utah Jazz,UT,US
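
To see what the sub() call does on its own, the sketch below prints the whole record right after the substitution, before any fields are selected. The function name strip_time is made up for illustration; the awk program is the one above without the field selection.

```shell
# strip_time is a hypothetical helper name, not part of the answer.
# sub() with no third argument edits $0: it deletes the first space and the
# run of non-comma characters after it, then awk re-splits the fields.
strip_time() {
  awk 'BEGIN{FS=OFS=","} {sub(/ [^,]+/,""); print}'
}

printf '%s\n' '2023-10-05 16:40:23,Denver Nuggets,CO,DN,US' | strip_time
# prints: 2023-10-05,Denver Nuggets,CO,DN,US
```

Only the first match is replaced, so the space inside "Denver Nuggets" is left alone.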
jhnc On
awk '{ sub(/ .*/,"",$1); print $1,$2,$3,$5 }' FS=, OFS=, input
  • set input and output field separators to both be comma
  • strip the space and everything after it from field 1
  • print the relevant fields
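
The three steps above can be tried without a file named input by feeding a sample line on standard input. This is only a sketch: the function name date_fields is hypothetical, and the separators are set with -F and -v rather than as trailing FS=, OFS=, operands, which should be equivalent here.

```shell
# date_fields is a hypothetical wrapper name used only for this example.
date_fields() {
  # -F, sets the input separator; -v OFS=, sets the output separator.
  # sub() on $1 drops the first space and everything after it in field 1.
  awk -F, -v OFS=, '{ sub(/ .*/, "", $1); print $1, $2, $3, $5 }'
}

printf '%s\n' '2023-08-01 05:30:01,Lakers,CA,LA,US' | date_fields
# prints: 2023-08-01,Lakers,CA,US
```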
pmf On

$1 appears to be a timestamp formatted as YYYY-MM-DD HH:MM:SS. If you want $1 to only contain the date part, you could use substr to extract the first 10 characters only:

awk 'BEGIN{FS=OFS=","} {print substr($1,1,10),$2,$3,$5}'
2023-08-01,Lakers,CA,US
2023-10-05,Denver Nuggets,CO,US
2024-01-20,Utah Jazz,UT,US
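
substr($1,1,10) relies on the date always being exactly 10 characters wide. If that cannot be assumed, one possible variant is to split $1 on whitespace and keep the first piece. This is an alternative sketch, not part of the answer above; the function name date_by_split and the array name parts are made up.

```shell
# date_by_split is a hypothetical helper name.
date_by_split() {
  awk 'BEGIN { FS = OFS = "," }
       {
         # Split field 1 on whitespace; parts[1] is whatever precedes the
         # first space (the date in the sample data).
         split($1, parts, " ")
         print parts[1], $2, $3, $5
       }'
}

printf '%s\n' '2024-01-20 16:40:23,Utah Jazz,UT,SLC,US' | date_by_split
# prints: 2024-01-20,Utah Jazz,UT,US
```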
Daweo On

If you want to use cut, you can get the desired output as follows. Let the content of file.txt be

2023-08-01 05:30:01,Lakers,CA,LA,US
2023-10-05 16:40:23,Denver Nuggets,CO,DN,US
2024-01-20 16:40:23,Utah Jazz,UT,SLC,US

then

cut --bytes=1-10,20- file.txt | cut --delimiter=, --fields=1,2,3,5

gives output

2023-08-01,Lakers,CA,US
2023-10-05,Denver Nuggets,CO,US
2024-01-20,Utah Jazz,UT,US

Explanation: As your timestamp is in a fixed-width format, the hour-minute-second part can be removed by instructing cut to keep only certain byte ranges: 1 to 10 (the date) and 20 to the end (the first , and everything after it). A second cut then selects the desired columns separated by ,. Disclaimer: this solution assumes you're dealing solely with ASCII-encoded files.
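
The long options used above (--bytes, --delimiter, --fields) are GNU spellings and may not be accepted by other cut implementations. A sketch of the same pipeline with the short POSIX options -b, -d and -f, reading standard input instead of file.txt (the function name cut_date_fields is hypothetical):

```shell
# cut_date_fields is a hypothetical helper name used only for this example.
cut_date_fields() {
  # Stage 1: keep bytes 1-10 (the date) and byte 20 onward (from the first comma).
  # Stage 2: select comma-separated fields 1, 2, 3 and 5.
  cut -b 1-10,20- | cut -d, -f1,2,3,5
}

printf '%s\n' '2023-08-01 05:30:01,Lakers,CA,LA,US' | cut_date_fields
# prints: 2023-08-01,Lakers,CA,US
```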