How do the Perl 6 set operations compare elements?


Running under moar (2016.10)

Consider this code that constructs a set and tests for membership:

my $num_set = set( < 1 2 3 4 > );
say "set: ", $num_set.perl;
say "4 is in set: ", 4 ∈ $num_set;
say "IntStr 4 is in set: ", IntStr.new(4, "Four") ∈ $num_set;
say "IntStr(4,...) is 4: ", IntStr.new(4, "Four") == 4;
say "5 is in set: ", 5 ∈ $num_set;

A straight 4 is not in the set, but the IntStr version is:

set: set(IntStr.new(4, "4"),IntStr.new(1, "1"),IntStr.new(2, "2"),IntStr.new(3, "3"))
4 is in set: False
IntStr 4 is in set: True
IntStr(4,...) is 4: True
5 is in set: False

I think most people aren't going to expect this, but the docs don't say anything about how this might work. I don't have this problem if I don't use the quote words (i.e. set( 1, 2, 3, 4)).


8
On BEST ANSWER

I think this is a bug, but not in the set stuff. The other answers were very helpful in sorting out what was important and what wasn't.

I used the angle-brackets form of the quote words. The quote-words form is supposed to be equivalent to the explicitly quoted list (that is, comparing the two with eqv should give True). Here's the doc example:

<a b c> eqv ('a', 'b', 'c')

But, when I try this with a word that is all digits, this is broken:

 $ perl6
 > < a b 137 > eqv ( 'a', 'b', '137' )
 False

But, the other forms work:

> qw/ a b 137 / eqv ( 'a', 'b', '137' )
True
> Q:w/ a b 137 / eqv ( 'a', 'b', '137' )
True

The angle-bracket word quoting uses IntStr:

> my @n = < a b 137 >
[a b 137]
> @n.perl
["a", "b", IntStr.new(137, "137")]

Without the word quoting, the all-digits word comes out as a plain Str:

> ( 'a', 'b', '137' ).perl
("a", "b", "137")
> ( 'a', 'b', '137' )[*-1].perl
"137"
> ( 'a', 'b', '137' )[*-1].WHAT
(Str)
> my @n = ( 'a', 'b', '137' );
[a b 137]
> @n[*-1].WHAT
(Str)

You typically see these sorts of errors when there are two code paths to get to a final result instead of shared code that converges to one path very early. That's what I would look for if I wanted to track this down (but, I need to work on the book!)

This does highlight, though, that you have to be very careful about sets. Even if this bug were fixed, there are other, non-buggy ways that eqv can fail. I would still have failed because 4 as an Int is not "4" as a Str. I think this level of attention to data types is unperly in its DWIMery. It's certainly something I'd have to explain very carefully in a classroom and still watch everyone mess up on it.

For what it's worth, I think the results of gist tend to be misleading in their oversimplification, and sometimes the results of perl aren't rich enough (e.g. hiding Str which forces me to .WHAT). The more I use those, the less useful I find them.

But, knowing that I messed up before I even started would have saved me from that code spelunking that ended up meaning nothing!

0
On

Write your list of numbers using commas

As you mention in your answer, your code works if you write your numbers as a simple comma-separated list rather than using the <...> construct.

Here's why:

4 ∈ set 1, 2, 3, 4 # True

A bare numeric literal in code, like the 4 to the left of ∈, constructs a single value with a numeric type. (In this case the type is Int, an integer.) If a set constructor receives a list of similar literals on the right then everything works out fine.

<1 2 3 4> produces a list of "dual values"

The various <...> "quote words" constructs turn the list of whitespace separated literal elements within the angle brackets into an output list of values.

The foundational variant (qw<...>) outputs nothing but strings. Using it for your use case doesn't work:

4 ∈ set qw<1 2 3 4> # False

The 4 on the left constructs a single numeric value, type Int. Meanwhile the set constructor receives a list of strings, type Str: ('1','2','3','4'). The ∈ operator doesn't find an Int in the set because all the values are Strs, so it returns False.

Moving along, the huffmanized <...> variant outputs Strs unless an element is recognized as a number. If an element is recognized as a number then the output value is a "dual value". For example a 1 becomes an IntStr.

According to the doc "an IntStr can be used interchangeably where one might use a Str or an Int". But can it?

Your scenario is a case in point. While 1 ∈ set 1,2,3 and <1> ∈ set <1 2 3> both work, 1 ∈ set <1 2 3> and <1> ∈ set 1, 2, 3 both return False.

So it seems the ∈ operator isn't living up to the quoted doc's claim of dual value interchangeability.

This may already be recognized as a bug in the set operation and/or other operations. Even if not, this sharp "dual value" edge of the <...> list constructor may eventually be viewed as sufficiently painful that Perl 6 needs to change.
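
In the meantime, one possible workaround is to coerce the dual values to plain Ints before building the set. This is only a minimal sketch; the variable names are made up for illustration, and it assumes every word in the list really is an integer:

```raku
# Hypothetical names; assumes all words in the list are integers.
my $duals = set(<1 2 3 4>);               # elements are IntStr dual values
my $ints  = set(<1 2 3 4>.map(*.Int));    # elements are plain Int

my $dual-four = <4>;                      # a single IntStr

say 4 ∈ $duals;            # False: an Int is not among the IntStr elements
say 4 ∈ $ints;             # True
say $dual-four ∈ $duals;   # True: IntStr matches IntStr
```

The coercion makes both sides of ∈ the same type, which is what the comma-separated form gives you for free.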

1
On

Just to add to the other answers, I'd like to point out a consistency here between sets and object hashes.

An object hash is declared as my %object-hash{Any}. This effectively hashes on each object's .WHICH method, which is similar to how sets distinguish individual members.

Substituting the set with an object hash:

my %obj-hash{Any};

%obj-hash< 1 2 3 4 > = Any;
say "hash: ", %obj-hash.keys.perl;
say "4 is in hash: ", %obj-hash{4}:exists;
say "IntStr 4 is in hash: ", %obj-hash{ IntStr.new(4, "Four") }:exists;
say "IntStr(4,...) is 4: ", IntStr.new(4, "Four") == 4;
say "5 is in hash: ", %obj-hash{5}:exists;

gives similar results to your original example:

hash: (IntStr.new(4, "4"), IntStr.new(1, "1"), IntStr.new(2, "2"), IntStr.new(3, "3")).Seq
4 is in hash: False
IntStr 4 is in hash: True
IntStr(4,...) is 4: True
5 is in hash: False
1
On

You took a wrong turn in the middle. The important part is what nqp::existskey is called with: the k.WHICH. This method is there for value types, i.e. immutable types where the value - rather than identity - defines if two things are supposed to be the same thing (even if created twice). It returns a string representation of an object's value that is equal for two things that are supposed to be equal. For <1>.WHICH you get IntStr|1 and for 1.WHICH you get just Int|1.
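
You can see this for yourself with a quick check. This is just a sketch with made-up variable names; the exact strings that .WHICH prints may differ between Rakudo versions, so only the comparisons are annotated:

```raku
my $int  = 4;      # plain Int
my $dual = <4>;    # IntStr, as produced by angle-bracket quoting

say $int.WHICH;    # exact text depends on the Rakudo version
say $dual.WHICH;

say $int.WHICH eq $dual.WHICH;        # False: different identities
say $int.WHICH eq $dual.Int.WHICH;    # True once the dual value is coerced to Int
```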

3
On

As explained in the Set documentation, sets compare object identity, same as the === operator:

Within a Set, every element is guaranteed to be unique (in the sense that no two elements would compare positively with the === operator)

The identity of an object is defined by the .WHICH method, as timotimo elaborates in his answer.
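
A small sketch of what that means for the values in the question (variable names are only for illustration): numeric equality and identity disagree for an Int and its IntStr counterpart, so a single set can hold both as distinct elements.

```raku
my $int  = 4;      # plain Int
my $dual = <4>;    # IntStr

say $int ==  $dual;        # True:  numerically equal
say $int === $dual;        # False: not the same value identity
say $int === $dual.Int;    # True:  both are the Int 4

my $both = set($int, $dual);
say $both.elems;           # 2: the set keeps them as separate members
```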