I would love if someone could help me with this(updated):
I have a table that looks like this:
Id | name | countryId Sequence
--------+--------+------------------------+-------------------
23 | John | 102 | 1
45 | Luis | 102 | 1
23 | Alex | 102 | 2
45 | James | 102 | 2
And I want to merge similar ids into a single row using the ARRAY_AVG, so I use this query
SELECT DISTINCT
b.Id
String.Join(",", ARRAY_AGG(b.name)) AS names
FROM table AS b
GROUP BY b.Id;
However doing this returns the following table but the name has been sorted on ASC
Id | names
--------+--------+
23 | Alex,John
45 | James,Luis
Since I want to create the names field as the name appears(based on index) so e.g John,Alex. so I tried this:
SELECT DISTINCT
b.Id,
String.Join(",", ARRAY_AGG(b.name).OrderBy(p => b.Sequence)) AS names
FROM table AS b
GROUP BY b.Id, b.Sequence;
But I get the results but this time, it is a bit different
Id | names
--------+--------+
23 | Alex
23 | John
45 | James
45 | Luis
The Sequence is ignored and the result isn't desired.
So I want to ask for the use the ARRAG_AVG properly in U-SQL.
Is your dummy data at the start of your question accurate? If so, there is no difference in
countryIdvalues so there is no 'definitive' sort order for yournamevalues when using yourOrderBy.Within SQL based languages, there is also no concept of data order when consuming or exporting data besides what is explicity specified in an
order by. If you don't have a suitable, definitive sort order as defined by the data, then you cannot guarantee consistency over multiple executions.In order to get your sort order dependant on the position in the source file, you will need to either augment your source data to include an index value, or create a custom exporter that will scan through the file line by line and add the index on read.
What you have to ask yourself is why you even need to do this in the first place? Concatenating strings like this into the one row is generally bad practice in set based languages and is very rarely appropriate.