TITLE

Synopsis 3: Summary of Perl 6 Operators

AUTHOR

Luke Palmer <luke@luqui.org>

VERSION

  Maintainer: Larry Wall <larry@wall.org>
  Date: 8 Mar 2004
  Last Modified: 16 Jun 2006
  Number: 3
  Version: 41

Changes to existing operators

Several operators have been given new names to increase clarity and better Huffman-code the language, while others have changed precedence.

New operators

Hyper operators

The Unicode characters » (\x[BB]) and « (\x[AB]) and their ASCII digraphs >> and << are used to denote "list operations", which operate on each element of two lists (or arrays) and return a list (or array) of the results. Spaces are not allowed on the "pointy" end of each "hyper", but are allowed on the blunt end (except for postfix operators, which must still follow postfix spacing rules, but do allow for an additional dot before the "hyper").

For example:

     (1,1,2,3,5) »+« (1,2,3,5,8);  # (2,3,5,8,13)

If either argument is insufficiently dimensioned, Perl "upgrades" it:

     (3,8,2,9,3,8) >>-<< 1;          # (2,7,1,8,2,7)

In fact, this is the only form that will work for an unordered type such as a Bag:

     Bag(3,8,2,9,3,8) >>-<< 1;       # Bag(2,7,1,8,2,7) ~~ Bag(1,2,2,7,7,8)

When using a unary operator, only put the "hyper" on the side of the single operand:

     @negatives = -« @positives;

     @positions»++;            # Increment all positions

     @positions.»++;           # Same thing, dot form
     @positions».++;           # Same thing, dot form
     @positions.».++;          # Same thing, dot form
     @positions\  .»\  .++;    # Same thing, long dot form

     @objects.».run();
     ("f","oo","bar").>>.chars;   # (1,2,3)

Note that method calls are really postfix operators, not infix, so you shouldn't put a « after the dot.

Hyper operators are defined recursively on arrays, so:

    -« [[1, 2], 3]               #    [-«[1, 2], -«3]
                                 # == [[-1, -2], -3]
    [[1, 2], 3] »+« [4, [5, 6]]  #    [[1,2] »+« 4, 3 »+« [5, 6]]
                                 # == [[5, 6], [8, 9]]

More generally, hyper operators work recursively for any object matching the Each role even if the object itself doesn't support the operator in question:

    Bag(3,8,[2,Seq(9,3)],8) >>-<< 1;         # Bag(2,7,[1,Seq(8,2)],7)
    Seq(3,8,[2,Seq(9,3)],8) >>-<< (1,1,2,1); # Seq(2,7,[0,Seq(7,1)],7)

In particular, tree node types with Each semantics enable visitation:

    $tree.».foo;        # short for $tree.foo, $tree.each: { .».foo }

If not all nodes support the operation, you need a form of it that specifies the call is optional:

    $tree.».?foo;       # short for $tree.?foo, $tree.each: { .».?foo }
    $tree.».*foo;       # short for $tree.*foo, $tree.each: { .».*foo }

You are not allowed to define your own hyper operators, because they are supposed to have consistent semantics derivable entirely from the modified scalar operator. If you're looking for a mathematical vector product, this isn't where you'll find it. A hyperoperator is one of the ways that you can promise to the optimizer that your code is parallelizable. (The tree visitation above is allowed to have side effects, but it is erroneous for the meaning of those side effects to depend on the order of visitation. [Conjecture: we could allow dependencies that assume top-down visitation and leave only sibling calls unordered.])

Even in the absence of hardware that can do parallel processing, hyperoperators may be faster than the corresponding scalar operators if they can factor out looping overhead to lower-level code, or can apply loop-unrolling optimizations, or can factor out some or all of the MMD dispatch overhead, based on the known types of the operands (and also based on the fact that hyper operators promise no interaction among the "iterations", whereas the corresponding scalar operator in a loop cannot make the same promise unless all the operations within the loop are known to be side-effect free).

In particular, infix hyperops on two int or num arrays need only do a single MMD dispatch to find the correct function to call for all pairs, and can further bypass any type-checking or type-coercion entry points to such functions when there are known to be low-level entry points of the appropriate type. (And similarly for unary int or num ops.)

Application-wide analysis of finalizable object types may also enable such optimizations to be applied to Int, Num, and such. In the absence of that, run-time analysis of partial MMD dispatch may save some MMD searching overhead. Or particular object arrays might even keep track of their own run-time type purity and cache partial MMD dispatch tables when they know they're likely to be used in hyperops.

Reduction operators

The other metaoperator in Perl 6 is the reduction operator. Any infix operator (except for non-associating operators and assignment operators) can be surrounded by square brackets in term position to create a list operator that reduces using that operation:

    [+] 1, 2, 3;      # 1 + 2 + 3 = 6
    my @a = (5,6);
    [*] @a;           # 5 * 6 = 30

As with all metaoperators, space is not allowed inside. The whole thing parses as a single token.

A reduction operator really is a list operator, and is invoked as one. Hence, you can implement a reduction operator in one of two ways. Either you can write an explicit list operator:

    proto prefix:<[+]> (*@args) {
        my $accum = 0;
        while (@args) {
            $accum += @args.shift();
        }
        return $accum;
    }

or you can let the system autogenerate one for you based on the corresponding infix operator, probably by currying:

    # (examples, actual system may define prefix:[**] instead)
    &prefix:<[*]> ::= &reduce.assuming(&infix:<*>, 1);
    &prefix:<[**]> ::= &reducerev.assuming(&infix:<**>);

As a special form of name, the non-prefix notation, as in

    proto [foo] (*@args) {
        ...
    }

or

    &[foo] ::= ...

defines both the [foo] reduce operator and the foo infix operator. Where appropriate, use of the infix form may be optimized like this:

    # Original          # Optimized
    $a foo $b           # [foo] $a, $b
    $a foo $b foo $c    # [foo] $a, $b, $c

If the reduction operator is defined separately from the infix operator, it must associate the same way as the operator used:

    [-] 4, 3, 2;      # 4-3-2 = (4-3)-2 = -1
    [**] 4, 3, 2;     # 4**3**2 = 4**(3**2) = 262144

For list-associating operators (like <), all arguments are taken together, just as if you had written it out explicitly:

    [<] 1, 3, 5;      # 1 < 3 < 5

If fewer than two arguments are given, a dispatch is still attempted with whatever arguments are given, and it is up to the receiver of that dispatch to deal with fewer than two arguments. Note that the proto list operator definition is the most general, so you are allowed to define different ways to handle the one argument case depending on type:

    multi prefix:<[foo]> (Int $x) { 42 }
    multi prefix:<[foo]> (Str $x) { fail "Can't foo a single Str" }

However, the zero argument case must of necessity be handled by the proto version, since there is no type information to dispatch on. Operators that wish to specify an identity value should do so by specifying the proto listop. Among the builtin operators, [+]() returns 0 and [*]() returns 1, for instance.

By default, if there is one argument, the built-in reduce operators return that one argument. However, this default doesn't make sense for operators like < that don't return the same type as they take, so these kinds of operators overload the single-argument case to return something more meaningful. All the comparison operators return a boolean for either 1 or 0 arguments. Negated operators return Bool::False, and all the rest return Bool::True.
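
For instance, following the rules just described (the result values are the ones this synopsis lists for the builtin operators):

    [+] 5;      # 5            (default: the single argument is returned)
    [<] 5;      # Bool::True   (comparison operator, one argument)
    [!=] 5;     # Bool::False  (negated comparison operator, one argument)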

You can also make a reduce operator of the comma operator. This has the effect of dereferencing its arguments into another argument list as if they'd been placed there directly.

    @args = \@foo,1,2,3;
    push [,] @args;     # same as push @foo,1,2,3

See S06 for more.

You may also reduce using the semicolon second-dimension separator:

    [[;] 1,2,3]   # equivalent to [1;2;3]

Builtin reduce operators return the following identity operations:

    [**]()      # 1     (arguably nonsensical)
    [*]()       # 1
    [/]()       # fail  (reduce is nonsensical)
    [%]()       # fail  (reduce is nonsensical)
    [x]()       # fail  (reduce is nonsensical)
    [xx]()      # fail  (reduce is nonsensical)
    [+&]()      # +^0   (-1 on 2's complement machine)
    [+<]()      # fail  (reduce is nonsensical)
    [+>]()      # fail  (reduce is nonsensical)
    [~&]()      # fail  (sensical but 1's length indeterminate)
    [~<]()      # fail  (reduce is nonsensical)
    [~>]()      # fail  (reduce is nonsensical)
    [+]()       # 0
    [-]()       # 0
    [~]()       # ''
    [+|]()      # 0
    [+^]()      # 0
    [~|]()      # ''    (length indeterminate but 0's default)
    [~^]()      # ''    (length indeterminate but 0's default)
    [&]()       # all()
    [|]()       # any()
    [^]()       # one()
    [!=]()      # Bool::False   (also for 1 arg)
    [==]()      # Bool::True    (also for 1 arg)
    [<]()       # Bool::True    (also for 1 arg)
    [<=]()      # Bool::True    (also for 1 arg)
    [>]()       # Bool::True    (also for 1 arg)
    [>=]()      # Bool::True    (also for 1 arg)
    [~~]()      # Bool::True    (also for 1 arg)
    [!~]()      # Bool::False   (also for 1 arg)
    [eq]()      # Bool::True    (also for 1 arg)
    [ne]()      # Bool::False   (also for 1 arg)
    [lt]()      # Bool::True    (also for 1 arg)
    [le]()      # Bool::True    (also for 1 arg)
    [gt]()      # Bool::True    (also for 1 arg)
    [ge]()      # Bool::True    (also for 1 arg)
    [=:=]()     # Bool::True    (also for 1 arg)
    [===]()     # Bool::True    (also for 1 arg)
    [&&]()      # Bool::True
    [||]()      # Bool::False
    [^^]()      # Bool::False
    [//]()      # undef
    [=]()       # undef    (same for all assignment operators)
    [,]()       # ()
    [¥]()       # []

User-defined operators may define their own identity values, but there is no explicit identity property. The value is implicit in the behavior of the 0-arg reduce, so mathematical code wishing to find the identity value for an operation can call prefix:{"[$opname]"}() to discover it.
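
As a sketch, assuming $opname holds the name of one of the builtin infix operators from the table above (the variable names here are only for illustration):

    my $opname = '+';
    my $zero  = prefix:{"[$opname]"}();   # 0, the result of [+]()
    $opname = '~';
    my $empty = prefix:{"[$opname]"}();   # '', the result of [~]()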

To call some other non-infix function as a reduce operator, you may define an alias in infix form. The infix form will parse the right argument as a scalar even if the aliased function would have parsed it as a list:

    &infix:<dehash> ::= postcircumfix:<{ }>;
    $x = [dehash] $a,'foo','bar';  # $a<foo><bar>, not $a<foo bar>

Alternately, just define your own prefix:<[dehash]> routine.

Note that, because a reduce is a list operator, the argument list is evaluated in list context. Therefore the following would be incorrect:

    $x = [dehash] %a,'foo','bar';

You'd instead have to say one of:

    $x = [dehash] \%a,'foo','bar';
    $x = [dehash] %a<foo>,'bar';

On the plus side, this works without a star:

    @args = (\%a,'foo','bar');
    $x = [dehash] @args;

A reduce operator returns only a scalar result regardless of context. (Even [,] returns a single Capture object which is then spliced into the outer argument list.) To return all intermediate results, backslash the operator:

    say [\+] 1..*  #  (1, 3, 6, 10, 15, ...)

The visual picture of a triangle is not accidental. To produce a triangular list of lists, you can use a "triangular comma":

    [\,] 1..5
    [1],
    [1,2],
    [1,2,3],
    [1,2,3,4],
    [1,2,3,4,5]

If there is ambiguity between a triangular reduce and an infix operator beginning with backslash, the infix operator is chosen, and an extra backslash indicates the corresponding triangular reduce. As a consequence, defining an infix operator beginning with backslash, infix:<\x> say, will make it impossible to write certain triangular reduction operators, since [\x] would mean the normal reduction of infix:<\x> operator, not the triangular reduction of infix:<x>. This is deemed to be an insignificant problem.

Junctive operators

|, &, and ^ are no longer bitwise operators (see "Operator Renaming") but now serve a much higher cause: they are now the junction constructors.

A junction is a single value that is equivalent to multiple values. They thread through operations, returning another junction representing the result:

     (1|2|3) + 4;                            # 5|6|7
     (1|2) + (3&4);                          # (4|5) & (5|6)

Note how when two junctions are applied through an operator, the result is a junction representing the operator applied to each combination of values.

Junctions come with the functional variants any, all, one, and none.

This opens doors for constructions like:

     unless $roll == any(1..6) { print "Invalid roll" }

     if $roll == 1|2|3 { print "Low roll" }
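
The functional variants can express the same tests. As a sketch, assuming the usual reading of any and none (any corresponds to the | constructor, and none is true only when no value matches):

     if $roll == any(1,2,3) { print "Low roll" }       # same test as 1|2|3
     if $roll == none(1..6) { print "Invalid roll" }   # same test as unless ... any(1..6)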

Junctions work through subscripting:

    print if @foo[any(1,2,3)]

Junctions are specifically unordered. So if you say

    for all(@foo) {...}

it indicates to the compiler that there is no coupling between loop iterations and they can be run in any order or even in parallel.

Chained comparisons

Perl 6 supports the natural extension to the comparison operators, allowing multiple operands.

    if 1 < $a < 100 { say "Good, you picked a number *between* 1 and 100." }

    if 3 < $roll <= 6              { print "High roll" }

    if 1 <= $roll1 == $roll2 <= 6  { print "Doubles!" }

Note: any operator beginning with < must have whitespace in front of it, or it will be interpreted as a hash subscript instead.
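
A short illustration of the difference (the variable and the message are only for the example):

    if $a < 100 { say "small" }     # whitespace before <: a comparison
    if $a<100 { say "small" }       # wrong: < begins a hash subscript on $a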

Binding

A new form of assignment is present in Perl 6, called "binding," used in place of typeglob assignment. It is performed with the := operator. Instead of replacing the value in a container like normal assignment, it replaces the container itself. For instance:

    my $x = 'Just Another';
    my $y := $x;
    $y = 'Perl Hacker';

After this, both $x and $y contain the string "Perl Hacker," since they are really just two different names for the same variable.

There is another variant, spelled ::=, that does the same thing at compile time.

There is also an identity test, =:=, which tests whether two names are bound to the same underlying variable. $x =:= $y would return true in the above example.
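
By contrast, ordinary assignment copies the value into a separate variable, so the identity test fails even though the values are equal. A sketch continuing the example above, with $z as a new, hypothetical variable:

    my $z = $x;     # ordinary assignment, not binding
    $x =:= $y;      # true:  two names for the same variable
    $x =:= $z;      # false: equal values, but different variables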

The binding fails if the type of the variable being bound is sufficiently inconsistent with the type of the current declaration.

Declarators

The list of variable declarators has expanded from my and our to include:

    my $foo             # ordinary lexically scoped variable
    our $foo            # lexically scoped alias to package variable
    has $foo            # object attribute
    env $foo            # environmental lexical
    state $foo          # persistent lexical (cloned with closures)
    constant $foo       # lexically scoped compile-time constant

Variable declarators such as my now take a Signature as their argument. The parentheses around the signature may be omitted for a simple declaration that declares a single variable, along with its associated type and traits. Parentheses must always be used when declaring multiple parameters:

    my $a;              # okay
    my ($b, $c);        # okay
    my $b, $c;          # wrong: "Use of undeclared variable: $c"

The syntax for a Signature when one isn't expected is:

    :(Dog $a, *@c)

The colon (and sometimes the parens) may be omitted within declarators where a signature is expected, for instance in the formal list of a loop block:

    for @dogpound -> Dog $fido { ... }

If a Signature is assigned to (whether declared or colon form), the signature is converted to a list of lvalue variables and the ordinary rules of assignment apply, except that the evaluation of the right side and the assignment happen at a time determined by the declarator. (With my this is always when an ordinary assignment would happen.) If the signature is too complicated to convert to an assignment, a compile-time error occurs. Assignment to a signature makes the same scalar/list distinction as ordinary assignment, so

    my $a = foo();      # foo in scalar context
    my ($a) = foo();    # foo in list context

If a Signature is bound to an argument list, then the binding of the arguments proceeds as if the Signature were the formal parameters for a function, except that, unlike in a function call, the parameters are bound rw by default rather than readonly. See Binding above.

Note that temp and let are not variable declarators, because their effects only take place at runtime. Therefore, they take an ordinary lvalue object as their argument. See S04 for more details.

There are a number of other declarators that are not variable declarators. These include both type declarators:

    package Foo
    module Foo
    class Foo
    role Foo
    subset Foo

and code declarators:

    sub foo
    method foo
    submethod foo
    multi foo
    proto foo
    regex foo
    rule foo
    token foo

These all have their uses and are explained in subsequent Synopses.

Argument List Interpolating

Perl 5 forced interpolation of a function's argument list by use of the & prefix. That option is no longer available in Perl 6, so instead the [,] reduction operator serves as an interpolator, by casting its operands to Capture objects and inserting them into the current argument list.

It can be used to interpolate an Array or Hash into the current call, as positional and named arguments respectively.
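
A minimal sketch, where foo is a hypothetical function and the variable names are only for illustration:

    my @pos  = (1, 2, 3);
    my %opts = (verbose => 1);
    foo([,] @pos);      # like foo(1, 2, 3): three positional arguments
    foo([,] %opts);     # like foo(verbose => 1): one named argument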

Note that those arguments still must comply with the subroutine's signature, but the presence of [,] defers that test until run time for that argument (and for any subsequent arguments):

    my @args = (scalar @foo, @bar);
    push [,] @args;

is equivalent to:

    push @foo, @bar;

as is this:

    my $args = \(@foo, @bar);    # construct a Capture object
    push [,] @$args;

In list context, a Scalar holding an Array object does not flatten. Hence

    $bar = @bar;
    push @foo, $bar;

merely pushes a single Array object onto @foo. You can explicitly flatten it in either of these ways:

    push @foo, @$bar;
    push @foo, $bar[];

Those two forms work because the slurpy array in push's signature flattens the Array object into a list argument.

Note that those two forms also allow you to specify list context on assignment:

    @$bar = (1,2,3);
    $bar[] = (1,2,3);

The last is particularly useful at the end of a long name naming an array attribute:

    $foo.bar.baz.bletch.whatever.attr[] = 1,2,3;

The empty [] and .[] postfix operators are interpreted as zero-dimensional slices returning the entire array, not null slices returning no elements. Likewise for {} and .{} on hashes, not to mention the <>, .<>, «», and .«» constant and interpolating slice subscripting forms.
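
For example, with a hypothetical array @a holding (1, 2, 3) and a hypothetical hash %h:

    @a[]      # the entire array (1, 2, 3), not an empty slice
    @a.[]     # same thing, dot form
    %h{}      # the entire hash
    %h.{}     # same thing, dot form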

The [,] operator interpolates lazily for Array and Range objects. To get an immediate interpolation like Perl 5 does, add the eager list operator:

    func([,] 1..Inf);         # works fine
    func([,] eager 1..Inf);   # never terminates

To interpolate a function's return value, you must say:

    push [,] func()

Within the argument list of a [,], function return values are automatically exploded into their various parts, as if you'd said:

    \$capture := func();
    push [,] $$capture: @$capture, %$capture;

or some such. The [,] then handles the various zones appropriately depending on the context. An invocant only makes sense as the first argument to the outer function call. An invocant inserted anywhere else just becomes a positional argument at the front of its list, as if its colon changed back to a comma.

If you already have a capture variable, you can interpolate all of its bits at once using the prefix:<=> operator. The above is equivalent to

    \$capture := func();
    push [,] =$capture;

Piping operators

The new operators ==> and <== are akin to UNIX pipes, but work with functions that accept and return lists. For example,

     @result = map { floor($^x / 2) },
                 grep { /^ \d+ $/ },
                   @data;

can also now be written:

     @data ==> grep { /^ \d+ $/ }
           ==> map { floor($^x / 2) }
           ==> @result;

or:

     @result <== map { floor($^x / 2) }
             <== grep { /^ \d+ $/ }
             <== @data;

Either form more clearly indicates the flow of data. See S06 for more of the (less-than-obvious) details on these two operators.

Invocant marker

An appended : marks the invocant when using the indirect-object syntax for Perl 6 method calls. The following two statements are equivalent:

    $hacker.feed('Pizza and cola');
    feed $hacker: 'Pizza and cola';

A colon may also be used on an ordinary method call to indicate that it should be parsed as a list operator:

    $hacker.feed: 'Pizza and cola';

This colon is a separate token. A colon prefixing an adverb is not a separate token. Therefore, under the longest-token rule,

    $hacker.feed:xxx('Pizza and cola');

is tokenized as an adverb applying to the method:

    $hacker.feed :xxx('Pizza and cola');

not as an xxx sub in the argument list of .feed:

    $hacker.feed: xxx('Pizza and cola');  # wrong

If you want both meanings of colon, you have to put it twice:

    $hacker.feed: :xxx('Pizza and cola'), 1,2,3;

(For similar reasons it's best to put whitespace after the colon of a label.)

zip

In order to support parallel iteration over multiple arrays, Perl 6 has a zip function that builds Seq objects from the elements of two or more arrays.

    for zip(@names; @codes) -> [$name, $zip] {
        print "Name: $name;   Zip code: $zip\n";
    }

zip has an infix synonym, the Unicode operator ¥, and its ASCII equivalent, Y.
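
With the infix form, the loop above could be written in either of these ways (a sketch; the loop bodies are elided):

    for @names ¥ @codes -> [$name, $zip] { ... }
    for @names Y @codes -> [$name, $zip] { ... }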

To read arrays in parallel like zip but just sequence the values rather than generating tuples, use each instead of zip.

    for each(@names; @codes) -> $name, $zip {
        print "Name: $name;   Zip code: $zip\n";
    }

The each function reads to the end of the longest list, not counting lists that are known to be infinite such as 0..Inf. Missing values are replaced with undef. In contrast, use roundrobin if you just wish to skip missing entries:

    for roundrobin(@queue1; @queue2; @queue3) -> $next {
        ...
    }

To read arrays serially rather than in parallel, use cat(@x;@y).
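
To compare these functions, here is a sketch using two small arrays of unequal length. The arrays are hypothetical, and the results shown follow from the descriptions above:

    my @a = (1, 2, 3);
    my @b = ('a', 'b');

    each(@a; @b)          # 1, 'a', 2, 'b', 3, undef
    roundrobin(@a; @b)    # 1, 'a', 2, 'b', 3
    cat(@a; @b)           # 1, 2, 3, 'a', 'b'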

Minimal whitespace DWIMmery

Whitespace is no longer allowed before the opening bracket of an array or hash accessor. That is:

    %monsters{'cookie'} = Monster.new;  # Valid Perl 6
    %people  {'john'}   = Person.new;   # Not valid Perl 6

One of the several useful side-effects of this restriction is that parentheses are no longer required around the condition of control constructs:

    if $value eq $target {
        print "Bullseye!";
    }
    while 0 < $i { $i++ }

It is, however, still possible to align accessors by explicitly using the long dot syntax:

     %monsters.{'cookie'} = Monster.new;
     %people\ .{'john'}   = Person.new;
     %cats\   .{'fluffy'} = Cat.new;

Precedence

Perl 6 has 22 precedence levels (which is fewer than Perl 5):

    loose and            and

Comma is the only listop that is allowed to occur where an operator is expected. All other listops function as a term within the list to the left.