TITLE

Synopsis 6: Subroutines

AUTHOR

Damian Conway <damian@conway.org> and Allison Randal <al@shadowed.net>

VERSION
  Maintainer: Larry Wall <larry@wall.org>
  Date: 21 Mar 2003
  Last Modified: 26 May 2006
  Number: 6
  Version: 37

This document summarizes Apocalypse 6, which covers subroutines and the new type system.

Subroutines and other code objects

Subroutines (keyword: sub) are non-inheritable routines with parameter lists.

Methods (keyword: method) are inheritable routines which always have an associated object (known as their invocant) and belong to a particular kind or class.

Submethods (keyword: submethod) are non-inheritable methods, or subroutines masquerading as methods. They have an invocant and belong to a particular kind or class.

Regexes (keyword: regex) are methods (of a grammar) that perform pattern matching. Their associated block has a special syntax (see Synopsis 5). (We also use the term "regex" for anonymous patterns of the traditional form.)

Tokens (keyword: token) are regexes that perform low-level non-backtracking (by default) pattern matching.

Rules (keyword: rule) are regexes that perform non-backtracking (by default) pattern matching (and also enable rules to do whitespace dwimmery).

Macros (keyword: macro) are routines whose calls execute as soon as they are parsed (i.e. at compile-time). Macros may return another source code string or a parse-tree.

Routine modifiers

Multimethods (keyword: multi) are routines that can have multiple variants that share the same name, selected by arity, types, or some other constraints. They may have multiple invocants.

Prototypes (keyword: proto) specify the commonalities (such as parameter names, fixity and associativity) shared by all multis of that name in the scope of the proto declaration.

A modifier keyword may occur before the routine keyword in a named routine:

    proto sub foo {...}
    multi sub foo {...}
    proto method bar {...}
    multi method bar {...}

If the routine keyword is omitted, it defaults to sub.
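
As a rough illustration (describe is a made-up name, not part of the language), a set of multi variants might be written like this, with the last declaration relying on the default routine keyword:

    multi sub describe (Int $n) { return "an integer" }
    multi sub describe (Str $s) { return "a string" }
    multi describe ($a, $b)     { return "two values" }   # "sub" omitted

    say describe(42);        # selected by type: the Int variant
    say describe("forty");   # selected by type: the Str variant
    say describe(4, 2);      # selected by arity: the two-argument variant

The bodies are placeholders; the point is only that the arguments of each call determine which variant runs.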

Named subroutines

The general syntax for named subroutines is any of:

     my RETTYPE sub NAME ( PARAMS ) TRAITS {...}    # lexical only
    our RETTYPE sub NAME ( PARAMS ) TRAITS {...}    # also package-scoped
                sub NAME ( PARAMS ) TRAITS {...}    # same as "our"

The return type may also be put inside the parentheses:

    sub NAME (PARAMS --> RETTYPE) {...}

Unlike in Perl 5, named subroutines are considered expressions, so this is valid Perl 6:

    my @subs = (sub foo { ... }, sub bar { ... });

Anonymous subroutines

The general syntax for anonymous subroutines is:

    sub ( PARAMS ) TRAITS {...}

But one can also use a scope modifier to introduce the return type first:

     my RETTYPE sub ( PARAMS ) TRAITS {...}
    our RETTYPE sub ( PARAMS ) TRAITS {...} # means the same as "my" here

Trait is the name for a compile-time (is) property. See "Traits and Properties".

Perl5ish subroutine declarations

You can declare a sub without parameter list, as in Perl 5:

    sub foo {...}

Arguments implicitly come in via the @_ array, but they are readonly aliases to actual arguments:

    sub say { print qq{"@_[]"\n}; }   # args appear in @_

    sub cap { $_ = uc $_ for @_ }   # Error: elements of @_ are read-only

If you need to modify the elements of @_, declare the array explicitly with the is rw trait:

    sub swap (*@_ is rw) { @_[0,1] = @_[1,0] }

Blocks

Raw blocks are also executable code structures in Perl 6.

Every block defines an object of type Code, which may either be executed immediately or passed on as a Code object. A bare block where an operator is expected is bound to the current statement level control syntax. A bare block where a term is expected merely produces a Code object. If the term bare block occurs in a list, it is considered the final element of that list unless followed immediately by a comma or comma surrogate.

"Pointy subs"

Semantically the arrow operator -> is almost a synonym for the anonymous sub keyword, except that the parameter list of a pointy sub does not require parentheses, and a pointy sub may not be given traits. Syntactically a pointy sub is parsed exactly like a bare block.

    $sq = -> $val { $val**2 };  # Same as: $sq = sub ($val) { $val**2 };

    for @list -> $elem {        # Same as: for @list, sub ($elem) {
        print "$elem\n";        #              print "$elem\n";
    }                           #          }

It also behaves like a block with respect to control exceptions. If you return from within a pointy sub, it will return from the innermost enclosing sub or method, not the block itself. It is referenced by &?BLOCK, not &?ROUTINE.
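
A small sketch of that return behavior, using a hypothetical first_even sub. The return inside the pointy block leaves first_even itself, so the loop stops at the first match:

    sub first_even (@list) {
        for @list -> $elem {
            return $elem if $elem % 2 == 0;   # returns from first_even, not just the block
        }
        return "none";                        # reached only if no element matched
    }

    my @nums = 1, 3, 4, 5;
    say first_even(@nums);                    # 4

Had the loop body been an anonymous sub instead, the same return would only have ended that inner sub.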

Stub declarations

To predeclare a subroutine without actually defining it, use a "stub block":

    sub foo {...}     # Yes, those three dots are part of the actual syntax

The old Perl 5 form:

    sub foo;

is a compile-time error in Perl 6 (because it would imply that the body of the subroutine extends from that statement to the end of the file, as class and module declarations do).

Redefining a stub subroutine does not produce an error, but redefining an already-defined subroutine does. If you wish to redefine a defined sub, you must explicitly use the "is instead" trait.

The ... is the "yadayadayada" operator, which is executable but returns a failure. You can also use ??? to produce a warning, or !!! to always die. These also officially define stub blocks if used as the only expression in the block.

Globally scoped subroutines

Subroutines and variables can be declared in the global namespace, and are thereafter visible everywhere in a program.

Global subroutines and variables are normally referred to by prefixing their identifiers with * (short for "GLOBAL::"). The * is normally required on the declaration but may be omitted on use if the reference is unambiguous:

    $*next_id = 0;
    sub *saith($text)  { print "Yea verily, $text" }

    module A {
        my $next_id = 2;    # hides any global or package $next_id
        saith($next_id);    # print the lexical $next_id;
        saith($*next_id);   # print the global $next_id;
    }

    module B {
        saith($next_id);    # Unambiguously the global $next_id
    }

Only the name is installed into the GLOBAL package by *. To define subs completely within the scope of the GLOBAL namespace you should use "package GLOBAL {...}" around the declaration.

Lvalue subroutines

Lvalue subroutines return a "proxy" object that can be assigned to. It's known as a proxy because the object usually represents the purpose or outcome of the subroutine call.

Subroutines are specified as being lvalue using the is rw trait.

An lvalue subroutine may return a variable:

    my $lastval;
    sub lastval () is rw { return $lastval }

or the result of some nested call to an lvalue subroutine:

    sub prevval () is rw { return lastval() }

or a specially tied proxy object, with suitably programmed FETCH and STORE methods:

    sub checklastval ($passwd) is rw {
        return new Proxy:
                FETCH => sub ($self) {
                            return lastval();
                         },
                STORE => sub ($self, $val) {
                            die unless check($passwd);
                            lastval() = $val;
                         };
    }

Other methods may be defined for specialized purposes such as temporizing the value of the proxy.
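
The declarations above do not show an lvalue sub on the left of an assignment, so here is a minimal sketch of that use. It reuses the lastval example, relies on the final expression of the block being the return value, and starts $lastval at 1 purely for illustration:

    my $lastval = 1;
    sub lastval () is rw { $lastval }   # final expression is the value returned

    lastval() = 42;     # the assignment goes through to $lastval
    say $lastval;       # 42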

Operator overloading

Operators are just subroutines with special names and scoping. An operator name consists of a grammatical category name followed by a single colon followed by an operator name specified as if it were a hash subscript (but evaluated at compile time). So any of these indicate the same binary addition operator:

    infix:<+>
    infix:«+»
    infix:<<+>>
    infix:{'+'}
    infix:{"+"}

Use the & sigil just as you would on ordinary subs.

Unary operators are defined as prefix or postfix:

    sub prefix:<OPNAME>  ($operand) {...}
    sub postfix:<OPNAME> ($operand) {...}

Binary operators are defined as infix:

    sub infix:<OPNAME> ($leftop, $rightop) {...}

Bracketing operators are defined as circumfix where a term is expected or postcircumfix where a postfix is expected. A two-element slice containing the leading and trailing delimiters is the name of the operator.

    sub circumfix:<LEFTDELIM RIGHTDELIM> ($contents) {...}
    sub circumfix:{'LEFTDELIM','RIGHTDELIM'} ($contents) {...}

Contrary to A6, there is no longer any rule about splitting an even number of characters. You must use a two element slice. Such names are canonicalized to a single form within the symbol table, so you must use the canonical name if you wish to subscript the symbol table directly (as in PKG::{'infix:<+>'}). Otherwise any form will do. (Symbolic references do not count as direct subscripts since they go through a parsing process.) The canonical form always uses angle brackets and a single space between slice elements. The elements are not escaped, so PKG::circumfix:{'<','>'} is canonicalized to PKG::{'circumfix:<< >>'}, and decanonicalizing always involves stripping the outer angles and splitting on space, if any. This works because a hash key knows how long it is, so there's no ambiguity about where the final angle is. And space works because operators are not allowed to contain spaces.

Operator names can be any sequence of non-whitespace characters including Unicode characters. For example:

    sub infix:<(c)> ($text, $owner) { return $text but Copyright($owner) }
    method prefix:<±> (Num $x --> Num) { return +$x | -$x }
    multi sub postfix:<!> (Int $n) { $n < 2 ?? 1 !! $n*($n-1)! }
    macro circumfix:«<!-- -->» ($text) is parsed / .*? / { "" }

    my $document = $text (c) $me;

    my $tolerance = ±7!;

    <!-- This is now a comment -->

Whitespace may never be part of the name (except as separator within a <...> or «...» slice, as in the example above).

A null operator name does not define a null or whitespace operator, but a default matching subrule for that syntactic category, which is useful when there is no fixed string that can be recognized, such as tokens beginning with digits. Such an operator must supply an is parsed trait. The Perl grammar uses a default subrule for the :1st, :2nd, :3rd, etc. regex modifiers, something like this:

    sub regex_mod_external:<> ($x) is parsed(token { \d+[st|nd|rd|th] }) {...}

Such default rules are attempted in the order declared. (They always follow any rules with a known prefix, by the longest-token-first rule.)

Although the name of an operator can be installed into any package or lexical namespace, the syntactic effects of an operator declaration are always lexically scoped. Operators other than the standard ones should not be installed into the * namespace. Always use exportation to make non-standard syntax available to other scopes.

Parameters and arguments

Perl 6 subroutines may be declared with parameter lists.

By default, all parameters are readonly aliases to their corresponding arguments--the parameter is just another name for the original argument, but the argument can't be modified through it. To allow modification, use the is rw trait. To pass-by-copy, use the is copy trait.
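
As a sketch of the two traits (bump and doubled are hypothetical names), compare a parameter that writes through to the caller's variable with one that works on a private copy:

    sub bump    ($n is rw)   { $n++ }                 # changes the caller's variable
    sub doubled ($n is copy) { $n *= 2; return $n }   # changes only a private copy

    my $count = 1;
    bump($count);                   # $count is now 2
    my $twice = doubled($count);    # $twice is 4; $count is still 2

Without either trait, the attempts to modify $n would be errors, since the parameter would be a readonly alias.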

Parameters may be required or optional. They may be passed by position, or by name. Individual parameters may confer a scalar or list context on their corresponding arguments, but unlike in Perl 5, this is decided lazily at parameter binding time.

Arguments destined for required positional parameters must come before those bound to optional positional parameters. Arguments destined for named parameters may come before and/or after the positional parameters. (To avoid confusion it is highly recommended that all positional parameters be kept contiguous in the call syntax, but this is not enforced, and custom arg list processors are certainly possible on those arguments that are bound to a final slurpy or arglist variable.)

Named arguments

Named arguments are recognized syntactically at the "comma" level. Since parameters are identified using identifiers, the recognized syntaxes are those where the identifier in question is obvious. You may use either the adverbial form, :name($value), or the autoquoted arrow form, name => $value. These must occur at the top "comma" level, and no other forms are taken as named pairs by default. Pairs intended as positional arguments rather than named arguments may be indicated by extra parens or by explicitly quoting the key to suppress autoquoting:

    doit :when<now>,1,2,3;      # always a named arg
    doit (:when<now>),1,2,3;    # always a positional arg

    doit when => 'now',1,2,3;   # always a named arg
    doit (when => 'now'),1,2,3; # always a positional arg
    doit 'when' => 'now',1,2,3; # always a positional arg

Only bare keys with valid identifier names are recognized as named arguments:

    doit when => 'now';         # always a named arg
    doit 'when' => 'now';       # always a positional arg
    doit 123  => 'now';         # always a positional arg
    doit :123<now>;             # always a positional arg

Going the other way, pairs intended as named arguments that don't look like pairs must be introduced with the [,] reduction operator:

    $pair = :when<now>;
    doit $pair,1,2,3;                   # always a positional arg
    doit [,] %$pair,1,2,3;              # always a named arg
    doit [,] %(get_pair()),1,2,3;       # always a named arg
    doit [,] %('when' => 'now'),1,2,3;  # always a named arg

Note that, to apply [,] to a single arg you may need to use parentheses. In general it doesn't matter.

Likewise, if you wish to pass a hash and have its entries treated as named arguments, you must dereference it with a [,]:

    %pairs = {:when<now> :what<any>};
    doit %pairs,1,2,3;          # always a positional arg
    doit [,](%pairs),1,2,3;     # always named args

Variables with a : prefix in rvalue context autogenerate pairs, so you can also say this:

    $when = 'now';
    doit $when,1,2,3;   # always a positional arg of 'now'
    doit :$when,1,2,3;  # always a named arg of :when<now>

In other words :$when is shorthand for :when($when). This works for any sigil:

    :$what      :what($what)
    :@what      :what(@what)
    :%what      :what(%what)
    :&what      :what(&what)

There is a corresponding shortcut for hash keys if you prefix the subscript instead of the sigil. The : is not functioning as an operator here, but as a modifier of the following token:

    doit %hash:<a>,1,2,3;
    doit %hash:{'b'},1,2,3;

are short for

    doit :a(%hash<a>),1,2,3;
    doit :b(%hash{'b'}),1,2,3;

Ordinary hash notation will just pass the value of the hash entry as a positional argument regardless of whether it is a pair or not. To pass both key and value out of the hash as a positional pair, use :p instead:

    doit %hash<a>:p,1,2,3;
    doit %hash{'b'}:p,1,2,3;

(The :p stands for "pairs", not "positional"--the :p adverb may be placed on any Hash object to make it mean "pairs" instead of "values".)

Pair constructors are recognized syntactically at the call level and put into the named slot of the Capture structure. Hence they may be bound to positionals only by name, not as ordinary positional Pair objects. Leftover named arguments can be slurped into a slurpy hash.

Because named and positional arguments can be freely mixed, the programmer always needs to disambiguate pair literals from named arguments with parentheses or quotes:

    # Named argument "a"
    push @array, 1, 2, :a<b>;

    # Pair object (a=>'b')
    push @array, 1, 2, (:a<b>);
    push @array, 1, 2, 'a' => 'b';

Perl 6 allows multiple same-named arguments, and records the relative order of arguments with the same name. When there is more than one argument with the same name, the @ sigil in the parameter list causes the arguments to be concatenated:

    sub fun (Int @x) { ... }
    fun( x => 1, x => 2 );              # @x := (1, 2)
    fun( x => (1, 2), x => (3, 4) );    # @x := (1, 2, 3, 4)

Other sigils bind only to the last argument with that name:

    sub fun (Int $x) { ... }
    fun( x => 1, x => 2 );              # $x := 2
    fun( x => (1, 2), x => (3, 4) );    # $x := (3, 4)

This means a hash holding default values must come before known named parameters, similar to how hash constructors work:

    # Allow "x" and "y" in %defaults to be overridden
    f( [,](%defaults), x => 1, y => 2 );

Invocant parameters

A method invocant may be specified as the first parameter in the parameter list, with a colon (rather than a comma) immediately after it:

    method get_name ($self:) {...}
    method set_name ($me: $newname) {...}

The corresponding argument (the invocant) is evaluated in scalar context and is passed as the left operand of the method call operator:

    print $obj.get_name();
    $obj.set_name("Sam");

Multimethod and multisub invocants are specified at the start of the parameter list, with a colon terminating the list of invocants:

    multi sub handle_event ($window, $event: $mode) {...}   # two invocants
    multi method set_name ($self, $name: $nick) {...}       # two invocants

If the parameter list for a multi contains no colon to delimit the list of invocant parameters, then all positional parameters are considered invocants. If it's a multi method or multi submethod, an additional implicit unnamed self invocant is prepended to the signature list.

For the purpose of matching positional arguments against invocant parameters, the invocant argument passed via the method call syntax is considered the first positional argument:

    handle_event($w, $e, $m);   # calls the multi sub
    $w.handle_event($e, $m);    # ditto, but only if there is no
                                # suitable $w.handle_event method

Invocants may also be passed using the indirect object syntax, with a colon after them. The colon is just a special form of the comma, and has the same precedence:

    set_name $obj: "Sam";   # try $obj.set_name("Sam") first, then
                            # fall-back to set_name($obj, "Sam")
    $obj.set_name("Sam");   # same as the above

Passing too many or too few invocants is a fatal error if no matching definition can be found.

An invocant is the topic of the corresponding method or multi if that formal parameter is declared with the name $_. A method's first invocant always has the alias self. Other styles of self can be declared with the self pragma.

Required parameters

Required parameters are specified at the start of a subroutine's parameter list:

    sub numcmp ($x, $y) { return $x <=> $y }

Required parameters may optionally be declared with a trailing !, though that's already the default for positional parameters:

    sub numcmp ($x!, $y!) { return $x <=> $y }

The corresponding arguments are evaluated in scalar context and may be passed positionally or by name. To pass an argument by name, specify it as a pair: parameter_name => argument_value.

    $comparison = numcmp(2,7);
    $comparison = numcmp(x=>2, y=>7);
    $comparison = numcmp(y=>7, x=>2);

Pairs may also be passed in adverbial pair notation:

    $comparison = numcmp(:x(2), :y(7));
    $comparison = numcmp(:y(7), :x(2));

Passing the wrong number of required arguments to a normal subroutine is a fatal error. Passing a named argument that cannot be bound to a normal subroutine is also a fatal error. (Methods are different.)

The number of required parameters a subroutine has can be determined by calling its .arity method:

    $args_required = &foo.arity;

Optional parameters

Optional positional parameters are specified after all the required parameters and each is marked with a ? after the parameter:

    sub my_substr ($str, $from?, $len?) {...}

Alternately, optional fields may be marked by supplying a default value. The = sign introduces a default value:

    sub my_substr ($str, $from = 0, $len = Inf) {...}

Default values can be calculated at run-time. They may even use the values of preceding parameters:

    sub xml_tag ($tag, $endtag = matching_tag($tag) ) {...}

Arguments that correspond to optional parameters are evaluated in scalar context. They can be omitted, passed positionally, or passed by name:

    my_substr("foobar");            # $from is 0, $len is infinite
    my_substr("foobar",1);          # $from is 1, $len is infinite
    my_substr("foobar",1,3);        # $from is 1, $len is 3
    my_substr("foobar",len=>3);     # $from is 0, $len is 3

Missing optional arguments default to their default value, or to an undefined value if they have no default. (A supplied argument that is undefined is not considered to be missing, and hence does not trigger the default. Use //= within the body for that.)
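
A self-contained sketch of a default that is computed from a preceding parameter. This variant of xml_tag builds the end tag inline rather than calling a helper, which is an assumption made here so that the example stands alone:

    sub xml_tag ($tag, $endtag = "</" ~ $tag ~ ">") {
        return "<" ~ $tag ~ ">" ~ $endtag;
    }

    say xml_tag("p");          # <p></p>   ($endtag computed from $tag)
    say xml_tag("br", "");     # <br>      (empty string supplied, so no default)

The second call shows that a supplied argument, even an empty one, suppresses the default.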

(Conjectural: Within the body you may also use exists on the parameter name to determine whether it was passed. Maybe this will have to be restricted to the ? form, unless we're willing to admit that a parameter could be simultaneously defined and nonexistent.)

Named parameters

Named-only parameters follow any required or optional parameters in the signature. They are marked by a : prefix:

    sub formalize($text, :$case, :$justify) {...}

This is actually shorthand for:

    sub formalize($text, :case($case), :justify($justify)) {...}

If the longhand form is used, the label name and variable name can be different:

    sub formalize($text, :case($required_case), :justify($justification)) {...}

so that you can use more descriptive internal parameter names without imposing inconveniently long external labels on named arguments.

Arguments that correspond to named parameters are evaluated in scalar context. They can only be passed by name, so it doesn't matter what order you pass them in, so long as they don't intermingle with any positional arguments:

    $formal = formalize($title, case=>'upper');
    $formal = formalize($title, justify=>'left');
    $formal = formalize($title, :justify<right>, :case<title>);

Named parameters are optional unless marked with a following !. Default values for optional named parameters are defined in the same way as for positional parameters, but may depend only on the values of parameters that have already been bound. (Note that binding happens in the call order, not declaration order.) Named optional parameters default to undef if they have no default. Named required parameters fail unless an argument pair of that name is supplied.

Again, note the use of adverbial pairs in the argument list. See S02 for the correspondence between adverbial form and arrow notation.
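
To make the optional and required cases concrete, here is a sketch of a formalize variant whose :case parameter has a default and whose :justify parameter is required. The body and the sample title are invented for illustration:

    sub formalize ($text, :$case = 'title', :$justify!) {
        return $text ~ " ($case, $justify)";
    }

    say formalize("war and peace", :justify<left>);                 # $case defaults to 'title'
    say formalize("war and peace", :case<upper>, :justify<right>);  # both supplied by name
    # formalize("war and peace", :case<upper>);   # would fail: required justify is missing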

List parameters

List parameters capture a variable length list of data. They're used in subroutines like print, where the number of arguments needs to be flexible. They're also called "variadic parameters", because they take a variable number of arguments. But generally we call them "slurpy" parameters because they slurp up arguments.

Slurpy parameters follow any required or optional parameters. They are marked by a * before the parameter:

    sub duplicate($n, *%flag, *@data) {...}

Named arguments are bound to the slurpy hash (*%flag in the above example). Such arguments are evaluated in scalar context. Any remaining variadic arguments at the end of the argument list are bound to the slurpy array (*@data above) and are evaluated in list context.

For example:

    duplicate(3, reverse => 1, collate => 0, 2, 3, 5, 7, 11, 14);
    duplicate(3, :reverse, :!collate, 2, 3, 5, 7, 11, 14);  # same

    # The @data parameter receives [2, 3, 5, 7, 11, 14]
    # The %flag parameter receives { reverse => 1, collate => 0 }

Slurpy scalar parameters capture what would otherwise be the first elements of the variadic array:

    sub head(*$head, *@tail)         { return $head }
    sub neck(*$head, *$neck, *@tail) { return $neck }
    sub tail(*$head, *@tail)         { return @tail }

    head(1, 2, 3, 4, 5);        # $head parameter receives 1
                                # @tail parameter receives [2, 3, 4, 5]

    neck(1, 2, 3, 4, 5);        # $head parameter receives 1
                                # $neck parameter receives 2
                                # @tail parameter receives [3, 4, 5]

Slurpy scalars still impose list context on their arguments.

Slurpy parameters are treated lazily -- the list is only flattened into an array when individual elements are actually accessed:

    @fromtwo = tail(1..Inf);        # @fromtwo contains a lazy [2..Inf]

You can't bind to the name of a slurpy parameter: the name is just there so you can refer to it within the body.

    sub foo(*%flag, *@data) {...}

    foo(:flag{ a => 1 }, :data[ 1, 2, 3 ]);
        # %flag has elements (flag => (a => 1)) and (data => [1,2,3])
        # @data has nothing

Slurpy block

It's also possible to declare a slurpy block: *&block. It slurps up any nameless block, specified by {...}, at either the current positional location or the end of the syntactic list. Put it first if you want the option of putting a block either first or last in the arguments. Put it last if you want to force it to come in as the last argument.

Argument list binding

The underlying Capture object may be bound to a single scalar parameter marked with a \.

    sub bar ($a,$b,$c,:$mice) { say $mice }
    sub foo (\$args) { say $args.perl; &bar.call($args); }

The .call method of Code objects accepts a single Capture object, and calls it without introducing a CALLER frame.

    foo 1,2,3,:mice<blind>;     # says "\(1,2,3,:mice<blind>)" then "blind"

It is allowed to specify a return type:

    sub foo (\$args --> Num) { ... }

Apart from that, no other parameters are allowed in the signature after the List. Parameters before the List either do not show up in the List or are marked as already bound somehow. In other words, parameters are bound normally up to the List parameter, and then \$args takes a snapshot of the remaining input without further attempts at binding.

Flattening argument lists

The reduce operator [,] casts each of its arguments to a Capture object, then splices each of those captures into the argument list it occurs in.

Casting Capture to Capture is a no-op:

    [,](\(1, x=>2));    # Capture, becomes \(1, x=>2)

Pair and Hash become named arguments:

    [,](x=>1);          # Pair, becomes \(x=>1)
    [,]{x=>1, y=>2};    # Hash, becomes \(x=>1, y=>2)

List (also Seq, Range, etc.) are simply turned into positional arguments:

    [,](1,2,3);         # Seq, becomes \(1,2,3)
    [,](1..3);          # Range, becomes \(1,2,3)
    [,](1..2, 3);       # List, becomes \(1,2,3)
    [,]([x=>1, x=>2]);  # List (from an Array), becomes \((x=>1), (x=>2))

For example:

    sub foo($x, $y, $z) {...}    # expects three scalars
    @onetothree = 1..3;          # array stores three scalars

    foo(1,2,3);                  # okay:  three args found
    foo(@onetothree);            # error: only one arg
    foo([,]@onetothree);         # okay:  @onetothree flattened to three args

The [,] operator flattens lazily -- the array is flattened only if flattening is actually required within the subroutine. To flatten before the list is even passed into the subroutine, use the eager list operator:

    foo([,] eager @onetothree);          # array flattened before &foo called

Multidimensional argument list binding

Some functions take more than one list of positional and/or named arguments, which they do not wish to be flattened into one list. For instance, zip() wants to iterate several lists in parallel, while array and hash subscripts want to process multidimensional slices. The set of underlying argument lists may be bound to a single array parameter declared with a double @@ sigil:

    sub foo (*@@slices) { ... }

Note that this is different from

    sub foo (\$slices) { ... }

insofar as \$slices is bound to a single argument-list object that makes no commitment to processing its structure (and maybe doesn't even know its own structure yet), while *@@slices has to create an array that binds the incoming dimensional lists to the array's dimensions, and make that commitment visible to the rest of the scope via the sigil so that constructs expecting multidimensional lists know that multidimensionality is the intention.

It is allowed to specify a return type:

    sub foo (*@@slices --> Num) { ... }

The invocant does not participate in multi-dimensional argument lists, so self is not present in any of the @@slices below:

    method foo (*@@slices) { ... }

The @@ sigil is just a variant of the @ sigil, so @@slices and @slices are really the same array. In particular, @@_ is really the good old @_ array viewed as multidimensional.

Zero-dimensional argument list

If you call a function without parens and supply no arguments, the argument list becomes a zero-dimensional slice. It differs from \() in several ways:

    sub foo (*@@slices) {...}
    foo;        # +@@slices == 0
    foo();      # +@@slices == 1

    sub bar (\$args = \(1,2,3)) {...}
    bar;        # $args === \(1,2,3)
    bar();      # $args === \()

Pipe operators

The variadic list of a subroutine call can be passed in separately from the normal argument list, by using either of the "pipe" operators: <== or ==>.

Each operator expects to find a call to a variadic receiver on its "sharp" end, and a list of values on its "blunt" end:

    grep { $_ % 2 } <== @data;

    @data ==> grep { $_ % 2 };

It binds the (potentially lazy) list from the blunt end to the slurpy parameter(s) of the receiver on the sharp end. In the case of a receiver that is a variadic function, the pipe is received as part of its slurpy list. So both of the calls above are equivalent to:

    grep { $_ % 2 }, @data;

Note that all such pipes (and indeed all lazy argument lists) supply an implicit promise that the code producing the lists may execute in parallel with the code receiving the lists. (Pipes, hyperops, and junctions all have this promise of parallelizability in common, but differ in interface. Code which violates these promises is erroneous, and will produce undefined results when parallelized.) In particular, a pipeline may not begin and end with the same array. (You may, however, assign to an array that is used within a pipeline on the right side of the assignment, since list assignment will clear and copy as necessary to make it work.) That is, this doesn't work:

    @data <== grep { $_ % 2 } <== @data;

but this does:

    @data = grep { $_ % 2 } <== @data;

Leftward pipes are a convenient way of explicitly indicating the typical right-to-left flow of data through a chain of operations:

    @oddsquares = map { $_**2 }, sort grep { $_ % 2 }, @nums;

    # more clearly written as...

    @oddsquares = map { $_**2 } <== sort <== grep { $_ % 2 } <== @nums;

Rightward pipes are a convenient way of reversing the normal data flow in a chain of operations, to make it read left-to-right:

    @oddsquares =
            (@nums ==> grep { $_ % 2 } ==> sort ==> map { $_**2 });

Note that the parens are necessary there due to precedence.

If the operand on the sharp end of a pipe is not a call to a variadic operation, it must be something else that can be interpreted as a list receiver.

Any list operator is considered a variadic operation, so ordinarily a list operator adds any piped input to the end of its list. But sometimes you want to interpolate elsewhere, so the *** term may be used to indicate the target of a pipe without the use of a temporary array:

    foo() ==> say ***, " is what I meant";
    bar() ==> ***.baz();

Piping to the * "whatever" term is considered a pipe to the lexically following *** term:

    0..*       ==> *;
    'a'..*     ==> *;
    pidigits() ==> *;

    # outputs "(0, 'a', 3)\n"...
    for zip(***) { .perl.say }

You may use a variable (or variable declaration) as a receiver, in which case the list value is bound as the "todo" of the variable. Do not think of it as an assignment, nor as an ordinary binding. Think of it as iterator creation. In the case of a scalar variable, that variable contains the newly created iterator itself. In the case of an array, the new iterator is installed as the method for extending the array. Unlike with assignment, no clobbering of the array is implied. It's therefore more like a push than an assignment.

In general you can simply think of a receiver array as representing the results of the pipeline, so you can equivalently write any of:

    my @oddsquares <== map { $_**2 } <== sort <== grep { $_ % 2 } <== @nums;

    my @oddsquares
        <== map { $_**2 }
        <== sort
        <== grep { $_ % 2 }
        <== @nums;

    @nums ==> grep { $_ % 2 } ==> sort ==> map { $_**2 } ==> my @oddsquares;

    @nums
    ==> grep { $_ % 2 }
    ==> sort
    ==> map { $_**2 }
    ==> my @oddsquares;

Since the pipe iterator is bound into the final variable, the variable can be just as lazy as the pipe that is producing the values.

Because pipes are bound to arrays with "push" semantics, you can have a receiver for multiple pipes:

    my @foo;
    0..2       ==> @foo;
    'a'..'c'   ==> @foo;
    say @foo;   # 0,1,2,'a','b','c'
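
The "push" semantics above can be pictured with a small sketch. Python is used here only as a neutral illustration language, and the LazyArray class with its feed method is a hypothetical name invented for this example, not part of Perl 6. Each pipe is kept as a pending iterator and is drained only when elements are actually requested:

```python
from itertools import count, islice


class LazyArray:
    """Hypothetical stand-in for an array that receives several pipes."""

    def __init__(self):
        self._items = []   # elements already pulled from the pipes
        self._todo = []    # pending iterators, in the order they were piped in

    def feed(self, iterable):
        # Like `... ==> @foo`: append a source without clobbering earlier ones.
        self._todo.append(iter(iterable))
        return self

    def __iter__(self):
        # Replay what is already reified, then extend lazily from the todo list.
        index = 0
        while True:
            if index < len(self._items):
                yield self._items[index]
                index += 1
            elif self._todo:
                try:
                    self._items.append(next(self._todo[0]))
                except StopIteration:
                    self._todo.pop(0)
            else:
                return


foo = LazyArray()
foo.feed(range(0, 3)).feed("abc")
flat = list(foo)                      # [0, 1, 2, 'a', 'b', 'c']

# An infinite source is fine as long as the consumer stops early.
lazy = LazyArray().feed(count(0))
first_five = list(islice(lazy, 5))    # [0, 1, 2, 3, 4]
```

The sketch models only the concatenating default; it makes no attempt at the parallel execution that pipes are allowed to use.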

Note how the pipes are concatenated in @foo so that @foo is a list of 6 elements. This is the default behavior. However, sometimes you want to capture the outputs as a list of two iterators, namely the two iterators that represent the two input pipes. You can get at those two iterators by using the name @@foo instead, where the "pipe" twigil marks a multidimensional array, that is, an array of slices.

    0..*       ==> @@foo;
    'a'..*     ==> @@foo;
    pidigits() ==> @@foo;

    for zip(@@foo) { say }

        [0,'a',3]
        [1,'b',1]
        [2,'c',4]
        [3,'d',1]
        [4,'e',5]
        [5,'f',9]
        ...

Here @@foo is an array of three iterators, so

    zip(@@foo)

is equivalent to

    zip(@@foo[0]; @@foo[1]; @@foo[2])

A semicolon inside brackets is equivalent to stacked pipes. The code above could be rewritten as:

    (0..*; 'a'..*; pidigits()) ==> my @@foo;
    for @@foo.zip { say }

which is in turn equivalent to

    for zip(0..*; 'a'..*; pidigits()) { say }

A named receiver array is useful when you wish to pipe into an expression that is not an ordinary list operator, and you wish to be clear where the pipe's destination is supposed to be:

    picklist() ==> my @baz;
    my @foo = @bar[@baz];

Various contexts may or may not be expecting multi-dimensional slices or pipes. By default, ordinary arrays are flattened, that is, they have "cat" semantics. If you say

    (0..2; 'a'..'c') ==> my @tmp;
    for @tmp { say }

then you get 0,1,2,'a','b','c'. If you have a multidim array, you can ask for cat semantics explicitly with cat():

    (0..2; 'a'..'c') ==> my @@tmp;
    for @@tmp.cat { say }

As we saw earlier, "zip" produces little arrays by taking one element from each list in turn, so

    (0..2; 'a'..'c') ==> my @@tmp;
    for @@tmp.zip { say }

produces [0,'a'],[1,'b'],[2,'c']. If you don't want the subarrays, then use each() instead:

    (0..2; 'a'..'c') ==> my @@tmp;
    for @@tmp.each { say }

and then you just get 0,'a',1,'b',2,'c'. This is good for

    for @@tmp.each -> $i, $a { say "$i: $a" }
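
The three views described above (cat, zip and each) can be compared side by side in a rough sketch. Python is used purely for illustration; the variable names are invented here, and Python's zip stops at the shortest input, which may or may not match what Perl 6 does for slices of unequal length:

```python
from itertools import chain

slices = [range(0, 3), "abc"]   # stands in for (0..2; 'a'..'c')

# cat: one slice after another
cat_view = list(chain.from_iterable(slices))

# zip: little arrays, one element taken from each slice in turn
zip_view = [list(group) for group in zip(*slices)]

# each: the same interleaving, but without the subarrays
each_view = [item for group in zip(*slices) for item in group]

# Consuming the interleaved view two at a time, like -> $i, $a { ... }
pairs = [f"{i}: {a}" for i, a in zip(*slices)]
```

Here cat_view is [0, 1, 2, 'a', 'b', 'c'], zip_view is [[0, 'a'], [1, 'b'], [2, 'c']], and each_view is [0, 'a', 1, 'b', 2, 'c'].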

In list context the @@foo notation is really a shorthand for [;](@@foo). In particular, you can use @@foo to interpolate a multidimensional slice in an array or hash subscript.

If @@foo is currently empty, then for zip(@@foo) {...} acts on a zero-dimensional slice (i.e. for (zip) {...}), and outputs nothing at all.

Note that with the current definition, the order of pipes is preserved left to right in general regardless of the position of the receiver.

So

    ('a'..*; 0..*) ==> *;
    for zip(*** <== @foo) -> [$a, $i, $x] { ...}

is the same as

    'a'..* ==> *;
    0..*   ==> *;
    for zip(*** <== @foo) -> [$a, $i, $x] { ...}

which is the same as

    for zip('a'..*; 0..*; @foo) -> [$a, $i, $x] { ...}

And

    @foo ==> *;
    0..* ==> *;
    for each(***) -> $x, $i { ...}

is the same as

    0..* ==> *;
    for each(@foo; ***) -> $x, $i { ...}

which is the same as

    for each(@foo; 0..*) -> $x, $i { ...}

Note that the each method is also sensitive to multislicing, so you could also just write that as:

    (@foo; 0..*).each: -> $x, $i { ...}

Also note that these come out identical for ordinary arrays:

    @foo.each
    @foo.cat

The @@($foo) coercer can be used to pull a multidim out of some object that contains one, such as a Capture or Match object. Like @(), @@() defaults to @@($/), and returns a multidimensional view of any match that repeatedly applies itself with :g and the like. In contrast, @() would flatten those into one list.

Closure parameters

Parameters declared with the & sigil take blocks, closures, or subroutines as their arguments. Closure parameters can be required, optional, named, or slurpy.

    sub limited_grep (Int $count, &block, *@list) {...}

    # and later...

    @first_three = limited_grep 3, {$_<10}, @data;

(The comma is required after the closure.)

Within the subroutine, the closure parameter can be used like any other lexically scoped subroutine:

    sub limited_grep (Int $count, &block, *@list) {
        ...
        if block($nextelem) {...}
        ...
    }
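
The body of limited_grep is elided above, so the following is only one plausible reading, sketched in Python for illustration: it assumes the routine returns at most $count elements for which the closure is true. The closure parameter is called like any other function inside the body:

```python
from typing import Callable, List, TypeVar

T = TypeVar("T")


def limited_grep(count: int, block: Callable[[T], bool], *items: T) -> List[T]:
    """Return at most `count` items for which `block` is true (assumed semantics)."""
    kept: List[T] = []
    for item in items:
        if len(kept) >= count:
            break
        if block(item):          # the closure parameter used as a plain callable
            kept.append(item)
    return kept


data = [12, 3, 7, 40, 9, 1]
first_three = limited_grep(3, lambda x: x < 10, *data)   # [3, 7, 9]
```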

The closure parameter can have its own signature in a type specification written with :(...):

    sub limited_Dog_grep ($count, &block:(Dog), Dog *@list) {...}

and even a return type:

    sub limited_Dog_grep ($count, &block:(Dog --> Bool), Dog *@list) {...}

When an argument is passed to a closure parameter that has this kind of signature, the argument must be a Code object with a compatible parameter list and return type.

Type parameters

Unlike normal parameters, type parameters often come in piggybacked on the actual value as "kind", and you'd like a way to capture both the value and its kind at once. (A "kind" is a class or type that an object is allowed to be. An object is not officially allowed to take on a constrained or contravariant type.) A type variable can be used anywhere a type name can, but instead of asserting that the value must conform to a particular type, it captures the actual "kind" of the object and also declares a package/type name by which you can refer to that kind later in the signature or body. For instance, if you wanted to match any two Dogs as long as they were of the same kind, you could say:

    sub matchedset (Dog ::T $fido, T $spot) {...}

(Note that ::T is not required to contain Dog, only a type that is compatible with Dog.)

The :: sigil is short for "subset" in much the same way that & is short for "sub". Just as & can be used to name any kind of code, so too :: can be used to name any kind of type. Both of them insert a bare identifier into the grammar, though they fill different syntactic spots.

Note that it is not required to capture the object associated with the class unless you want it. The sub above could be written

    sub matchedset (Dog ::T, T) {...}

if we're not interested in $fido or $spot. Or just

    sub matchedset (::T, T) {...}

if we don't care about anything but the matching.

Unpacking array parameters

Instead of specifying an array parameter as an array:

    sub quicksort (@data, $reverse?, $inplace?) {
        my $pivot := shift @data;
        ...
    }

it may be broken up into components in the signature, by specifying the parameter as if it were an anonymous array of parameters:

    sub quicksort ([$pivot, *@data], $reverse?, $inplace?) {
        ...
    }

This subroutine still expects an array as its first argument, just like the first version.
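
As a loose analogy, Python's starred assignment performs the same head/rest split. The sketch below is an illustration only: the function body is an assumption (the original elides it), the $inplace option is left out, and unlike the `:= shift @data` version it does not modify the caller's list:

```python
def quicksort(data, reverse=False):
    """Out-of-place quicksort whose first step unpacks the pivot and the rest."""
    if not data:
        return []
    pivot, *rest = data          # like [$pivot, *@data] in the signature
    smaller = [x for x in rest if x < pivot]
    larger = [x for x in rest if x >= pivot]
    result = quicksort(smaller) + [pivot] + quicksort(larger)
    return result[::-1] if reverse else result


ascending = quicksort([3, 1, 2])                  # [1, 2, 3]
descending = quicksort([3, 1, 2], reverse=True)   # [3, 2, 1]
```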

Unpacking a single list argument

To match the first element of the slurpy list, use a "slurpy" scalar:

    sub quicksort (:$reverse, :$inplace, *$pivot, *@data)

Unpacking hash parameters

Likewise, a hash argument can be mapped to a hash of parameters, specified as named parameters within curlies. Instead of saying:

    sub register (%guest_data, $room_num) {
        my $name := delete %guest_data<name>;
        my $addr := delete %guest_data<addr>;
        ...
    }

you can get the same effect with:

    sub register ({:$name, :$addr, *%guest_data}, $room_num) {
        ...
    }

Unpacking tree node parameters

You can unpack tree nodes in various dwimmy ways by enclosing the bindings of child nodes and attributes in parentheses following the declaration of the node itself:

    sub traverse ( BinTree $top ( $left, $right ) ) {
        traverse($left);
        traverse($right);
    }

In this, $left and $right are automatically bound to the left and right nodes of the tree. If $top is an ordinary object, it binds the $top.left and $top.right attributes. If it's a hash, it binds $top<left> and $top<right>. If BinTree is a signature type and $top is a List (argument list) object, the child types of the signature are applied to the actual arguments in the argument list object. (Signature types have the benefit that you can view them inside-out as constructors with positional arguments, such that the transformations can be reversible.)

However, the full power of signatures can be applied to pattern match just about any argument or set of arguments, even though in some cases the reverse transformation is not intuitable. For instance, to bind to an array of children named .kids or .<kids>, use something like:

    sub traverse ( NAry $top ( :kids [$eldest, *@siblings] ) ) {
        traverse($eldest);
        traverse(@siblings);
    }

Likewise, to bind to a hash element of the node and then bind to keys in that hash by name:

    sub traverse ( AttrNode $top ( :%attr{ :$vocalic, :$tense } ) ) {
        say "Has {+%attr} attributes, of which";
        say "vocalic = $vocalic";
        say "tense = $tense";
    }

You may omit the top variable if you prefix the parentheses with a colon to indicate a signature. Otherwise you must at least put the sigil of the variable, or we can't correctly differentiate:

    my Dog ($fido, $spot)   := twodogs();       # list of two dogs
    my Dog $ ($fido, $spot) := twodogs();       # one twodog object
    my Dog :($fido, $spot)  := twodogs();       # one twodog object

Subsignatures can be matched directly within regexes by using :(...) notation.

    push @a, "foo";
    push @a, \(1,2,3);
    push @a, "bar";
    ...
    my ($i, $j, $k);
    @a ~~ rx/
            <,>                         # match initial elem boundary
            :(Int $i,Int $j,Int? $k)    # match lists with 2 or 3 ints
            <,>                         # match final elem boundary
          /;
    say "i = $<i>";
    say "j = $<j>";
    say "k = $<k>" if defined $<k>;

If you want a parameter bound into $/, you have to say $<i> within the signature. Otherwise it will try to bind an external $i instead, and fail if no such variable is declared.

Note that unlike a sub declaration, a regex-embedded signature has no associated "returns" syntactic slot, so you have to use --> within the signature to specify the of type of the signature, or match as an arglist:

    :(Num, Num --> Coord)
    :(\Coord(Num, Num))

A consequence of the latter form is that you can match the type of an object with :(\Dog) without actually breaking it into its components. Note, however, that it's not equivalent to say

    :(--> Dog)

which would be equivalent to

    :(\Dog())

that is, match a nullary function of type Dog. Nor is it equivalent to

    :(Dog)

which would be equivalent to

    :(\Any(Dog))

and match a function taking a single value of type Dog.

Note also that bare \(1,2,3) is never legal in a regex since the first paren would try to match literally.

Attributive parameters

If a submethod's parameter is declared with a . or ! after the sigil (like an attribute):

    submethod initialize($.name, $!age) {}

then the argument is assigned directly to the object's attribute of the same name. This avoids the frequent need to write code like:

    submethod initialize($name, $age) {
        $.name = $name;
        $!age  = $age;
    }

To rename an attribute parameter you can use the explicit pair form:

    submethod initialize(:moniker($.name), :youth($!age)) {}

The :$name shortcut may be combined with the $.name shortcut, but the twigil is ignored for the parameter name, so

    submethod initialize(:$.name, :$!age) {}

is the same as:

    submethod initialize(:name($.name), :age($!age)) {}

Note that $!age actually refers to the private "has" variable that can be referred to either as $age or $!age.

Placeholder variables

Even though every bare block is a closure, bare blocks can't have explicit parameter lists. Instead, they use "placeholder" variables, marked by a caret (^) after their sigils.

Using placeholders in a block defines an implicit parameter list. The signature is the list of distinct placeholder names, sorted in Unicode order. So:

    { $^y < $^z && $^x != 2 }

is a shorthand for:

    -> $x,$y,$z { $y < $z && $x != 2 }

Note that placeholder variables syntactically cannot have type constraints. Also, it is illegal to use placeholder variables in a block that already has a signature, because the autogenerated signature would conflict with that.
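
The way the implicit parameter list is derived can be sketched as follows. This is a simplified illustration in Python, not how any Perl 6 compiler works: the implicit_signature helper is a hypothetical name, it scans source text with a regular expression instead of parsing it, and Python's default string ordering (by code point) is taken as an approximation of "Unicode order":

```python
import re

# A sigil, a caret, then the placeholder's name.
PLACEHOLDER = re.compile(r"([$@%&])\^(\w+)")


def implicit_signature(block_source: str) -> list:
    """Return the distinct placeholders of a block as parameters, sorted by name."""
    found = {(m.group(2), m.group(1)) for m in PLACEHOLDER.finditer(block_source)}
    return [sigil + name for name, sigil in sorted(found)]


params = implicit_signature("{ $^y < $^z && $^x != 2 }")   # ['$x', '$y', '$z']
```

A placeholder that appears more than once still contributes a single parameter, since only distinct names count.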

Native types

Values with these types autobox to their uppercase counterparts when you treat them as objects:

    bit         single native bit
    int         native signed integer
    uint        native unsigned integer (autoboxes to Int)
    buf         native buffer (finite seq of native ints or uints, no Unicode)
    num         native floating point
    complex     native complex number
    bool        native boolean

Undefined types

These can behave as values or objects of any class, but always return a .id of 0. One can create them with the built-in undef and fail functions. (See S02 for how failures are handled.)

    Undef       Undefined (can serve as a prototype object of any class)
    Whatever    Wildcard (like undef, but subject to do-what-I-mean via MMD)
    Failure     Failure (throws an exception if not handled properly)

Immutable types

Objects with these types behave like values, i.e. $x === $y is true if and only if their types and contents are identical.

    Bit         Perl single bit (allows traits, aliasing, undef, etc.)
    Int         Perl integer (allows Inf/NaN, arbitrary precision, etc.)
    Str         Perl string (finite sequence of Unicode characters)
    Num         Perl number
    Complex     Perl complex number
    Bool        Perl boolean
    Exception   Perl exception
    Code        Base class for all executable objects
    Block       Executable objects that have lexical scopes
    List        Lazy Perl list (composed of Seq and Range parts)
    Seq         Completely evaluated (hence immutable) sequence
    Range       Incrementally generated (hence lazy) sequence
    Set         Unordered Seqs that allow no duplicates
    Junction    Sets with additional behaviours
    Pair        Seq of two elements that serves as a one-element Mapping
    Mapping     Pairs with no duplicate keys
    Signature   Function parameters (left-hand side of a binding)
    Capture     Function call arguments (right-hand side of a binding)

Mutable types

Objects with these types have distinct .id values.

    Array       Perl array
    Hash        Perl hash
    Scalar      Perl scalar
    Buf         Perl buffer (a stringish array of memory locations)
    IO          Perl filehandle
    Routine     Base class for all wrappable executable objects
    Sub         Perl subroutine
    Method      Perl method
    Submethod   Perl subroutine acting like a method
    Macro       Perl compile-time subroutine
    Regex       Perl pattern
    Match       Perl match, usually produced by applying a pattern
    Package     Perl 5 compatible namespace
    Module      Perl 6 standard namespace
    Class       Perl 6 standard class namespace
    Role        Perl 6 standard generic interface/implementation
    Object      Perl 6 object
    Grammar     Perl 6 pattern matching namespace

Value types

Explicit types are optional. Perl variables have two associated types: their "value type" and their "implementation type". (More generally, any container has an implementation type, including subroutines and modules.) The value type is stored as its of property, while the implementation type of the container is just the object type of the container itself.

The value type specifies what kinds of values may be stored in the variable. A value type is given as a prefix or with the of keyword:

    my Dog $spot;
    my $spot of Dog;

In either case this sets the of property of the container to Dog.

Subroutines have a variant of the of property, returns, that sets the returns property instead. The returns property specifies a constraint (or perhaps coercion) to be enforced on the return value (either by explicit call to return or by implicit fall-off-the-end return). This constraint, unlike the of property, is not advertised as the type of the routine. You can think of it as the implicit type signature of the (possibly implicit) return statement. It's therefore available for type inferencing within the routine but not outside it. If no inner type is declared, it is assumed to be the same as the of type, if declared.

    sub get_pet() of Animal {...}       # of type, obviously
    our Animal sub get_pet() {...}      # of type
    sub get_pet() returns Animal {...}  # inner type

A value type on an array or hash specifies the type stored by each element:

    my Dog @pound;  # each element of the array stores a Dog

    my Rat %ship;   # the value of each entry stores a Rat

The key type of a hash may be specified as a shape trait--see S09.

Implementation types

The implementation type specifies how the variable itself is implemented. It is given as a trait of the variable:

    my $spot is Scalar;             # this is the default
    my $spot is PersistentScalar;
    my $spot is DataBase;

Defining an implementation type is the Perl 6 equivalent to tying a variable in Perl 5. But Perl 6 variables are tied directly at declaration time, and for performance reasons may not be tied with a run-time tie statement unless the variable is explicitly declared with an implementation type that does the Tieable role.

However, package variables are always considered Tieable by default. As a consequence, all named packages are also Tieable by default. Classes and modules may be viewed as differently tied packages. Looking at it from the other direction, classes and modules that wish to be bound to a global package name must be able to do the Package role.

Hierarchical types

A non-scalar type may be qualified, in order to specify what type of value each of its elements stores:

    my Egg $cup;                       # the value is an Egg
    my Egg @carton;                    # each elem is an Egg
    my Array of Egg @box;              # each elem is an array of Eggs
    my Array of Array of Egg @crate;   # each elem is an array of arrays of Eggs
    my Hash of Array of Recipe %book;  # each value is a hash of arrays of Recipes

Each successive of makes the type on its right a parameter of the type on its left. Parametric types are named using square brackets, so:

    my Hash of Array of Recipe %book;

actually means:

    my Hash[of => Array[of => Recipe]] %book; 

Because the actual variable can be hard to find when complex types are specified, there is a postfix form as well:

    my Hash of Array of Recipe %book;           # HoHoAoRecipe
    my %book of Hash of Array of Recipe;        # same thing

The returns form may be used in subroutines:

    my sub get_book ($key) returns Hash of Array of Recipe {...}

Alternately, the return type may be specified within the signature:

    my sub get_book ($key --> Hash of Array of Recipe) {...}

There is a slight difference, insofar as the type inferencer will ignore a returns but pay attention to --> or prefix type declarations, also known as the of type. Only the inside of the subroutine pays attention to returns.

You may also specify the of type as the of trait:

    my Hash of Array of Recipe sub get_book ($key) {...}
    my sub get_book ($key) of Hash of Array of Recipe {...}

Polymorphic types

Anywhere you can use a single type you can use a set of types, for convenience specifiable as if it were an "or" junction:

    my Int|Str $error = $val;              # can assign if $val~~Int or $val~~Str

Fancier type constraints may be expressed through a subtype:

    subset Shinola of Any where {.does(DessertWax) and .does(FloorTopping)};
    if $shimmer ~~ Shinola {...}  # $shimmer must do both interfaces

Since the terms in a parameter could be viewed as a set of constraints that are implicitly "anded" together (the variable itself supplies type constraints, and where clauses or tree matching just add more constraints), we relax this to allow juxtaposition of types to act like an "and" junction:

    # Anything assigned to the variable $mitsy must conform
    # to the type Fish and either the Cat or Dog type...
    my Cat|Dog Fish $mitsy = new Fish but { int rand 2 ?? .does Cat
                                                       !! .does Dog };

Parameter types

Parameters may be given types, just like any other variable:

    sub max (int @array is rw) {...}
    sub max (@array of int is rw) {...}

Generic types

Within a declaration, a class variable (either by itself or following an existing type name) declares a new type name and takes its parametric value from the actual type of the parameter it is associated with. It declares the new type name in the same scope as the associated declaration.

    sub max (Num ::X @array ) {
        push @array, X.new();
    }

The new type name is introduced immediately, so two such types in the same signature must unify compatibly if they have the same name:

    sub compare (Any ::T $x, T $y) {
        return $x eqv $y;
    }

Return types

On a scoped subroutine, a return type can be specified before or after the name. We call all return types "return types", but distinguish two kinds of return type, the inner type and the of type, because the of type is normally an "official" named type and declares the official interface to the routine, while the inner type is merely a constraint on what may be returned by the routine from the routine's point of view.

    our sub lay returns Egg {...}       # inner type
    our Egg sub lay {...}               # of type
    our sub lay of Egg {...}            # of type
    our sub lay (--> Egg) {...}         # of type

    my sub hat returns Rabbit {...}     # inner type
    my Rabbit sub hat {...}             # of type
    my sub hat of Rabbit {...}          # of type
    my sub hat (--> Rabbit) {...}       # of type

If a subroutine is not explicitly scoped, it belongs to the current namespace (module, class, grammar, or package), as if it were scoped with the our scope modifier. Any return type must go after the name:

    sub lay returns Egg {...}           # inner type
    sub lay of Egg {...}                # of type
    sub lay (--> Egg) {...}             # of type

On an anonymous subroutine, any return type can only go after the sub keyword:

    $lay = sub returns Egg {...};       # inner type
    $lay = sub of Egg {...};            # of type
    $lay = sub (--> Egg) {...};         # of type

but you can use a scope modifier to introduce an of prefix type:

    $lay = my Egg sub {...};            # of type
    $hat = my Rabbit sub {...};         # of type

Because they are anonymous, you can change the my modifier to our without affecting the meaning.

The return type may also be specified after a --> token within the signature. This doesn't mean exactly the same thing as returns. The of type is the "official" return type, and may therefore be used to do type inferencing outside the sub. The inner type only makes the return type available to the internals of the sub so that the return statement can know its context, but outside the sub we don't know anything about the return value, as if no return type had been declared. The prefix form specifies the of type rather than the inner type, so

    my Fish sub wanda ($x) { ... }

is known to return an object of type Fish, as if you'd said:

    my sub wanda ($x --> Fish) { ... }

not as if you'd said

    my sub wanda ($x) returns Fish { ... }

It is possible for the of type to disagree with the inner type:

    my Squid sub wanda ($x) returns Fish { ... }

or equivalently,

    my sub wanda ($x --> Squid) returns Fish { ... }

This is not lying to yourself--it's lying to the world. Having a different inner type is useful if you wish to hold your routine to a stricter standard than you let on to the outside world, for instance.

Properties and traits

Compile-time properties are called "traits". The is NAME (DATA) syntax defines traits on containers and subroutines, as part of their declaration:

    constant $pi is Approximated = 3;   # variable $pi has Approximated trait

    my $key is Persistent(:file<.key>);

    sub fib is cached {...}

The will NAME BLOCK syntax is a synonym for is NAME (BLOCK):

    my $fh will undo { close $fh };    # Same as: my $fh is undo({ close $fh });

The but NAME (DATA) syntax specifies run-time properties on values:

    constant $pi = 3 but Inexact;       # value 3 has Inexact property

    sub system {
        ...
        return $error but False if $error;
        return 0 but True;
    }

Properties are predeclared as roles and implemented as mixins--see S12.

Subroutine traits

These traits may be declared on the subroutine as a whole (individual parameters take other traits).

is signature

The signature of a subroutine. Normally declared implicitly, by providing a parameter list and/or return type.

returns/is returns

The inner type constraint that a routine imposes on its return value.

of/is of

The of type that is the official return type of the routine.

will do

The block of code executed when the subroutine is called. Normally declared implicitly, by providing a block after the subroutine's signature definition.

is rw

Marks a subroutine as returning an lvalue.

is parsed

Specifies the subrule by which a macro call is parsed. The parse always starts after the macro token, but the token may be referred to within the subrule as $<KEY>.

is cached

Marks a subroutine as being memoized.

is inline

Suggests to the compiler that the subroutine is a candidate for optimization via inlining.

is tighter/is looser/is equiv

Specifies the precedence of an operator relative to an existing operator. equiv also specifies the default associativity to be the same as the operator to which the new operator is equivalent. tighter and looser operators default to left associative.

is assoc

Specifies the associativity of an operator explicitly. Valid values are:

    Tag         Examples        Meaning of $a op $b op $c
    ===         ========        =========================
    left        + - * / x       ($a op $b) op $c
    right       ** =            $a op ($b op $c)
    non         cmp <=> ..      ILLEGAL
    chain       == eq ~~        ($a op $b) and ($b op $c)
    list        | & ^ ¥         listop($a, $b, $c) or listop($a; $b; $c)

Note that operators "equiv" to relationals are automatically considered chaining operators. When creating a new precedence level, the chaining is determined by the presence or absence of "is assoc('chain')", and other operators defined at that level are required to be the same.

PRE/POST

Mark blocks that are to be unconditionally executed before/after the subroutine's do block. These blocks must return a true value, otherwise an exception is thrown.

When applied to a method, a PRE block automatically also calls all PRE blocks on any method of the same long name in each parent class. The precondition is satisfied if either the method's own PRE block returns true, or all of its parents' PRE blocks return true. This "me-or-all-my-parents" requirement applies recursively to each parent's PRE block as well.

When applied to a method, a POST block automatically also calls all POST blocks on any method of the same long name in every ancestral class. The postcondition is satisfied only if the method's own POST block and every one of its ancestral POST blocks all return true.

FIRST/LAST/NEXT/KEEP/UNDO/etc.

Mark blocks that are to be conditionally executed before or after the subroutine's do block. These blocks are generally used only for their side effects, since most return values will be ignored. FIRST may be an exception, but in that case you probably want to use a state variable anyway.
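
The inheritance rules for PRE and POST given above can be summarized in a short sketch. It is written in Python for illustration only; MethodContract and the two functions are hypothetical names, and the sketch computes just the outcome (it short-circuits instead of running every block). It also assumes that a failing PRE block with no parents at all counts as a failure, which the text does not spell out:

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class MethodContract:
    """Hypothetical record of one class's PRE/POST blocks for a method."""
    pre: Callable[[], bool]
    post: Callable[[], bool]
    parents: List["MethodContract"] = field(default_factory=list)


def pre_satisfied(contract: MethodContract) -> bool:
    # "Me or all my parents", applied recursively to each parent.
    if contract.pre():
        return True
    # Assumption: with no parents to fall back on, the precondition fails.
    return bool(contract.parents) and all(pre_satisfied(p) for p in contract.parents)


def post_satisfied(contract: MethodContract) -> bool:
    # The method's own POST block and every ancestral POST block must hold.
    return contract.post() and all(post_satisfied(p) for p in contract.parents)


base = MethodContract(pre=lambda: True, post=lambda: True)
strict = MethodContract(pre=lambda: False, post=lambda: True, parents=[base])
broken = MethodContract(pre=lambda: True, post=lambda: False, parents=[base])
heir = MethodContract(pre=lambda: False, post=lambda: True, parents=[broken])
```

With these definitions, strict passes its precondition through its parent, while heir fails its postcondition because an ancestor's POST block fails. A subclass may therefore weaken a precondition but never a postcondition.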

Parameter traits

The following traits can be applied to many types of parameters.

is readonly

Specifies that the parameter cannot be modified (e.g. assigned to, incremented). It is the default for parameters.

is rw

Specifies that the parameter can be modified (assigned to, incremented, etc). Requires that the corresponding argument is an lvalue or can be converted to one.

When applied to a variadic parameter, the rw trait applies to each element of the list:

    sub incr (*@vars is rw) { $_++ for @vars }

(The variadic array as a whole is always modifiable, but such modifications have no effect on the original argument list.)

is ref

Specifies that the parameter is passed by reference. Unlike is rw, the corresponding argument must already be a suitable lvalue. No attempt at coercion or autovivification is made, so unsuitable values throw an exception when you try to modify them.

is copy

Specifies that the parameter receives a distinct, read-writable copy of the original argument. This is commonly known as "pass-by-value".

    sub reprint ($text, $count is copy) {
        print $text while $count-- > 0;
    }

is context(TYPE)

Specifies the context that a parameter applies to its argument. Typically used to cause a final list parameter to apply a series of scalar contexts:

    # &format may have as many arguments as it likes,
    # each of which is evaluated in scalar context

    sub format(*@data is context(Scalar)) {...}

Note that the compiler may not be able to propagate such a scalar context to a function call used as a parameter to a method or multisub whose signature is not visible until dispatch time. Such function call parameters are called in list context by default, and must be coerced to scalar context explicitly if that is desired.

The return function

The return function notionally throws a control exception that is caught by the current lexically enclosing Routine to force a return through the control logic code of any intermediate block constructs. With normal blocks this can be optimized away to a "goto". All Routine declarations have an explicit declarator such as sub or method; bare blocks and "pointy" subs are never considered to be routines in that sense. To return from a block, use leave instead--see below.

The return function preserves its argument list as a Capture object, and responds to the left-hand Signature in a binding. This allows named return values if the caller expects one:

    sub f { return :x<1> }
    sub g ($x) { print $x }

    my $x := *f();  # binds 1 to $x, via a named argument
    g(*f());        # prints 1, via a named argument

To return a literal Pair object, always put it in an additional set of parentheses:

    return( (:x<1>), (:y<2>) ); # two positional Pair objects

Note that the postfix parentheses on the function call don't count as "additional". However, as with any function, whitespace after the return keyword prevents that interpretation and turns it instead into a list operator:

    return :x<1>, :y<2>; # two named arguments (if caller uses *)
    return ( :x<1>, :y<2> ); # two positional Pair objects

If the function ends with an expression without an explicit return, that expression is also taken to be a Capture, just as if the expression were the argument to a return list operator (with whitespace):

    sub f { :x<1> } # named-argument binding (if caller uses *)
    sub f { (:x<1>) } # always just one positional Pair object 

On the caller's end, the Capture is interpolated into any new argument list much like an array would be, that is, as a scalar in scalar context, and as a list in list context. This is the default behavior, but as with an array, the caller may use prefix:<[,]> to inline the returned values as part of the new argument list. The caller may also bind the returned Capture directly.
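
The difference between returning a named argument and returning a positional Pair can be imitated, very roughly, in Python. The Capture class and call_with helper below are hypothetical stand-ins written for this illustration; a Python tuple plays the role of a Pair, and nothing here models the scalar-versus-list context behavior described above:

```python
class Capture:
    """Hypothetical argument-list object: positional and named parts kept apart."""

    def __init__(self, *positional, **named):
        self.positional = positional
        self.named = named


def f():
    return Capture(x=1)            # like: return :x<1>


def f_pair():
    return Capture(("x", 1))       # like: return( (:x<1>) ), one positional Pair


def g(x):
    return x


def call_with(func, capture):
    # Like g(*f()): interpolate the capture into a fresh argument list.
    return func(*capture.positional, **capture.named)


named_result = call_with(g, f())         # x is bound by name, giving 1
pair_result = call_with(g, f_pair())     # x is bound positionally to the pair
```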

The caller function

The caller function returns an object that describes a particular "higher" dynamic scope, from which the current scope was called.

    say "In ",           caller.sub,
        " called from ", caller.file,
        " line ",        caller.line;

caller may be given arguments telling it what kind of higher scope to look for, and how many such scopes to skip over when looking:

    $caller = caller;                      # immediate caller
    $caller = caller Method;               # nearest caller that is a method
    $caller = caller Bare;                 # nearest caller that is a bare block
    $caller = caller Sub, :skip(2);        # caller three levels up
    $caller = caller Block, :label<Foo>;   # caller whose label is 'Foo'

The want function

The want function returns an object that contains information about the context in which the current block, closure, or subroutine was called.

The returned context object is typically tested with a smart match (~~) or a when:

    given want {
        when Scalar {...}           # called in scalar context
        when List   {...}           # called in list context
        when Lvalue {...}           # expected to return an lvalue
        when 2      {...}           # expected to return two values
        ...
    }

or has the corresponding methods called on it:

       if (want.Scalar)    {...}    # called in scalar context
    elsif (want.List)      {...}    # called in list context
    elsif (want.rw)        {...}    # expected to return an lvalue
    elsif (want.count > 2) {...}    # expected to return more than two values

Note these are pseudo type associations. There's no such thing as an Lvalue object, and a List is really an unbound argument list object, parts of which may in fact be eventually bound into scalar context.

The leave function

A return statement causes the innermost surrounding subroutine, method, rule, token, regex (as a keyword), macro, or multimethod to return. Only declarations with an explicit keyword such as "sub" may be returned from. You may not return from a quotelike operator such as rx//.

To return from other types of code structures, the leave function is used:

    leave;                      # return from innermost block of any kind
    leave Method;               # return from innermost calling method
    leave &?ROUTINE <== 1,2,3;  # Return from current sub. Same as: return 1,2,3
    leave &foo <== 1,2,3;       # Return from innermost surrounding call to &foo
    leave Loop where { .label eq 'COUNT' };  # Same as: last COUNT;

Note that the last example is equivalent to

    last COUNT;

and, in fact, you can return a final loop value that way:

    last COUNT <== 42;

If supplied, the first argument to leave is a Selector, and will be smart-matched against the dynamic scope objects from inner to outer. The first that matches is the scope that is left.
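
The selection rule can be traced outside Perl 6. The following Python sketch is only a model of "smart-match from inner to outer and take the first match"; the Scope record, its fields, and find_scope_to_leave are hypothetical names introduced for illustration, and a plain predicate stands in for the Selector.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Scope:
    """Hypothetical stand-in for a dynamic scope object."""
    kind: str                      # e.g. "Block", "Loop", "Sub", "Method"
    label: Optional[str] = None


def find_scope_to_leave(
    dynamic_scopes: List[Scope],
    selector: Optional[Callable[[Scope], bool]] = None,
) -> Optional[Scope]:
    """Return the scope that a leave with this selector would exit.

    dynamic_scopes is ordered from outermost to innermost, so the search
    walks it in reverse. With no selector, the innermost scope of any
    kind is chosen, mirroring a bare `leave;`.
    """
    for scope in reversed(dynamic_scopes):
        if selector is None or selector(scope):
            return scope
    return None  # no enclosing scope matched the selector


# Outermost first: a sub containing a labelled loop containing a bare block.
scopes = [Scope("Sub"), Scope("Loop", "COUNT"), Scope("Block")]

innermost = find_scope_to_leave(scopes)
count_loop = find_scope_to_leave(
    scopes, lambda s: s.kind == "Loop" and s.label == "COUNT"
)
```

In this model the bare call picks the inner block, while the labelled selector skips it and picks the COUNT loop.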

Temporization

The temp function temporarily replaces the value of an existing variable, subroutine, or other object in a given scope:

    {
       temp $*foo = 'foo';      # Temporarily replace global $foo
       temp &bar := sub {...};  # Temporarily replace sub &bar
       ...
    } # Old values of $*foo and &bar reinstated at this point

temp invokes its argument's .TEMP method. The method is expected to return a Code object that can later restore the current value of the object. At the end of the lexical scope in which the temp was applied, the subroutine returned by the .TEMP method is executed.

The default .TEMP method for variables simply creates a closure that assigns the variable's pre-temp value back to the variable.
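
The save-then-restore protocol can be modeled in Python. This is a sketch of the mechanism only, not Perl 6 code: Variable, its TEMP method, and the temp context manager are hypothetical stand-ins, and the with-block plays the role of the enclosing lexical scope.

```python
from contextlib import contextmanager


class Variable:
    """Hypothetical container standing in for a Perl 6 variable."""

    def __init__(self, value):
        self.value = value

    def TEMP(self):
        # Default behaviour: capture the pre-temp value in a closure
        # that assigns it back when called.
        saved = self.value

        def restore():
            self.value = saved

        return restore


@contextmanager
def temp(var, new_value):
    """Model of `temp $var = new_value` for the duration of a with-block."""
    restore = var.TEMP()      # ask the object how to restore itself
    var.value = new_value
    try:
        yield var
    finally:
        restore()             # runs when the with-block is exited


foo = Variable("original")
with temp(foo, "foo"):
    inside = foo.value        # the temporary value is visible here
after = foo.value             # the old value has been reinstated
```

The point of routing the restore through a method is that a container can override TEMP and return a different closure, which is what the storage-class example relies on.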

New kinds of temporization can be created by writing storage classes with their own .TEMP methods:

    class LoudArray is Array {
        method TEMP {
            print "Replacing $.id() at {caller.location}\n";
            my $restorer = $.SUPER::TEMP();
            return { 
                print "Restoring $.id() at {caller.location}\n";
                $restorer();
            };
        }
    }

You can also modify the behaviour of temporized code structures, by giving them a TEMP block. As with .TEMP methods, this block is expected to return a closure, which will be executed at the end of the temporizing scope to restore the subroutine to its pre-temp state:

    my $next = 0;
    sub next {
        my $curr = $next++;
        TEMP {{ $next = $curr }}  # TEMP block returns the closure { $next = $curr }
        return $curr;
    }

    # and later...

    say next();     # prints 0; $next == 1
    say next();     # prints 1; $next == 2
    say next();     # prints 2; $next == 3
    if ($hiccough) {
        say temp next();  # prints 3; closes $curr at 3; $next == 4
        say next();       # prints 4; $next == 5
        say next();       # prints 5; $next == 6
    }                     # $next = 3
    say next();     # prints 3; $next == 4
    say next();     # prints 4; $next == 5
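
The trace in the comments above can be checked with a small Python model. It is a sketch under stated assumptions: next_id, state, and the restorers list are hypothetical names, and passing a list to next_id stands in for calling the sub under temp, with the list playing the role of the enclosing scope.

```python
state = {"next": 0}


def next_id(restorers=None):
    """Return the current counter value and advance the counter.

    When a restorers list is supplied (the model of `temp next()`), a
    closure is registered that resets the counter to the value seen by
    this particular call, as the TEMP block does with $curr.
    """
    curr = state["next"]
    state["next"] += 1
    if restorers is not None:
        restorers.append(lambda: state.update(next=curr))
    return curr


printed = [next_id(), next_id(), next_id()]

restorers = []                       # stands in for the `if` block's scope
printed.append(next_id(restorers))   # the `temp next()` call
printed.append(next_id())
printed.append(next_id())
for restore in reversed(restorers):  # scope exit: run the restoring closures
    restore()
restored_to = state["next"]

printed += [next_id(), next_id()]
```

Only the temporized call registers a restorer, so the two plain calls inside the block are undone along with it when the scope ends.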

Hypothetical variables use the same mechanism, except that the restoring closure is called only on failure.

Note that "env" variables may be a better solution than temporized globals in the face of multithreading.

Wrapping

Every Routine object has a .wrap method. This method expects a single Code argument. Within the code, the special call function will invoke the original routine, but does not introduce a CALLER frame:

    sub thermo ($t) {...}   # set temperature in Celsius, returns old temp

    # Add a wrapper to convert from Fahrenheit...
    $id = &thermo.wrap( { call( ($^t-32)/1.8 ) } );

The call to .wrap replaces the original Routine with the Code argument, and arranges that the call to call invokes the previous version of the routine. In other words, the call to .wrap has more or less the same effect as:

    &old_thermo := &thermo;
    &thermo := sub ($t) { old_thermo( ($t-32)/1.8 ) }

Except that &thermo is mutated in-place, so &thermo.id stays the same after the .wrap.

The call to .wrap returns a unique identifier that can later be passed to the .unwrap method, to undo the wrapping:

    &thermo.unwrap($id);

This does not affect any other wrappings placed on the routine.
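
A Python model may help in following how stacked wrappers and their identifiers interact. It is a sketch only: the Routine class here is hypothetical, and because Python has no implicit call function, each wrapper receives call as an explicit first argument.

```python
import itertools


class Routine:
    """Hypothetical model of a routine that can be wrapped in place."""

    def __init__(self, body):
        self._body = body
        self._wrappers = []              # (id, wrapper) pairs, oldest first
        self._ids = itertools.count(1)

    def wrap(self, wrapper):
        """Install wrapper(call, *args) and return its unique identifier."""
        wrap_id = next(self._ids)
        self._wrappers.append((wrap_id, wrapper))
        return wrap_id

    def unwrap(self, wrap_id):
        """Remove one wrapper by identifier, leaving the others in place."""
        self._wrappers = [(i, w) for (i, w) in self._wrappers if i != wrap_id]

    def __call__(self, *args):
        return self._invoke(len(self._wrappers), args)

    def _invoke(self, depth, args):
        if depth == 0:
            return self._body(*args)     # the original routine
        _, wrapper = self._wrappers[depth - 1]

        def call(*inner_args):
            # Invokes the previous version of the routine.
            return self._invoke(depth - 1, inner_args)

        return wrapper(call, *args)


current = {"celsius": 20.0}


def set_celsius(t):
    """Set the temperature in Celsius and return the old temperature."""
    old = current["celsius"]
    current["celsius"] = t
    return old


thermo = Routine(set_celsius)
fahrenheit_id = thermo.wrap(lambda call, t: call((t - 32) / 1.8))
doubling_id = thermo.wrap(lambda call, t: call(t) * 2)

first_old = thermo(32)               # 32 F becomes 0 C; old value is doubled
celsius_after_first = current["celsius"]

thermo.unwrap(fahrenheit_id)         # only the Fahrenheit wrapper is removed
second_old = thermo(25)              # 25 is now taken as Celsius
```

The thermo object itself is never replaced in this model; only its internal wrapper list changes, which is the property the text describes as the routine being mutated in place.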

A wrapping can also be restricted to a particular dynamic scope with temporization:

    # Add a wrapper to convert from Kelvin
    # wrapper self-unwraps at end of current scope
    temp &thermo.wrap( { call($^t - 273.15) } );

The entire argument list may be captured by the \$args parameter. It can then be passed to call as *$args:

    # Double the return value for &thermo
    &thermo.wrap( -> \$args { call(*$args) * 2 } );

The wrapper is not required to call the original routine; it can call another Code object by passing the Capture to its call method:

    # Transparently redirect all calls to &thermo to &other_thermo
    &thermo.wrap( -> \$args { &other_thermo.call(*$args) } );

Outside a wrapper, call implicitly calls the next-most-likely method or multi-sub; see S12 for details.

As with any return value, you may capture the returned Capture of call by binding:

    my \$retval := call(*$args);
    ... # postprocessing
    return *$retval;

The &?ROUTINE object

&?ROUTINE is always an alias for the lexically innermost Routine (which may be a Sub, Method or SubMethod), so you can specify recursion on an anonymous sub:

    my $anonfactorial = sub (Int $n) {
                            return 1 if $n<2;
                            return $n * &?ROUTINE($n-1);
                        };

You can get the current routine name by calling &?ROUTINE.name. (The outermost routine at a file-scoped compilation unit is always named &MAIN in the file's package.)

Note that &?ROUTINE refers to the current single sub, even if it is declared "multi". To redispatch to the entire suite under a given short name, just use the named form, since there are no anonymous multis.

The &?BLOCK object

&?BLOCK is always an alias for the current block, so you can specify recursion on an anonymous block:

    my $anonfactorial = -> Int $n { $n < 2
                                        ?? 1
                                        :: $n * &?BLOCK($n-1)
                                  };

&?BLOCK.label contains the label of the current block, if any.

If the innermost lexical block is part of a Routine, then &?BLOCK just returns the Block object within it.

[Note: to refer to any $? or &? variable at the time the sub or block is being compiled, use the COMPILING:: pseudopackage.]

Currying

Every Code object has an .assuming method. This method does a partial binding of a set of arguments to a signature and returns a new function that takes only the remaining arguments.

    &textfrom := &substr.assuming(str=>$text, len=>Inf);

or equivalently:

    &textfrom := &substr.assuming(:str($text) :len(Inf));

or even:

    &textfrom := &substr.assuming:str($text):len(Inf);

It returns a Code object that implements the same behaviour as the original subroutine, but has the values passed to .assuming already bound to the corresponding parameters:

    $all  = textfrom(0);   # same as: $all  = substr($text,0,Inf);
    $some = textfrom(50);  # same as: $some = substr($text,50,Inf);
    $last = textfrom(-1);  # same as: $last = substr($text,-1,Inf);
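
The binding rule (named values fixed in advance, remaining parameters filled positionally) can be modeled in Python. This is a sketch, not Perl 6: assuming and substr below are hypothetical helpers written for the illustration, the parameter names string and length stand in for str and len, and the simplified substr only approximates the built-in.

```python
import inspect
import math


def assuming(func, **bound):
    """Return a function with `bound` already fixed by parameter name.

    The returned function takes the remaining parameters positionally,
    in their declared order.
    """
    remaining = [name for name in inspect.signature(func).parameters
                 if name not in bound]

    def curried(*args):
        if len(args) > len(remaining):
            raise TypeError("too many arguments for curried function")
        return func(**bound, **dict(zip(remaining, args)))

    return curried


def substr(string, pos, length=math.inf):
    """Simplified substr: `length` characters of `string` from `pos`.

    A negative `pos` counts back from the end of the string.
    """
    start = pos if pos >= 0 else max(len(string) + pos, 0)
    if length == math.inf:
        return string[start:]
    return string[start:start + length]


text = "The quick brown fox"
textfrom = assuming(substr, string=text, length=math.inf)

all_text = textfrom(0)     # same as substr(text, 0, inf)
some_text = textfrom(4)    # same as substr(text, 4, inf)
last_char = textfrom(-1)   # same as substr(text, -1, inf)
```

Because the bound values are matched by name, the one remaining parameter (pos) can be supplied positionally even though it sits between the two bound parameters in the signature.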

The result of a use statement is a (compile-time) object that also has an .assuming method, allowing the user to bind parameters in all the module's subroutines/methods/etc. simultaneously:

    (use IO::Logging).assuming(logfile => ".log");

This form should generally be restricted to named parameters.

To curry a particular multimethod it may be necessary to specify the type of one or more of its invocants:

    &woof ::= &bark:(Dog).assuming :pitch<low>;
    &pine ::= &bark:(Tree).assuming :pitch<yes>;

Macros

Macros are functions or operators that are called by the compiler as soon as their arguments are parsed (if not sooner). The syntactic effect of a macro declaration or importation is always lexically scoped, even if the name of the macro is visible elsewhere. As with ordinary operators, macros may be classified by their grammatical category. For a given grammatical category, a default parsing rule or set of rules is used, but those rules that have not yet been "used" by the time the macro keyword or token is seen can be replaced by use of the "is parsed" trait. (This means, for instance, that an infix operator can change the parse rules for its right operand but not its left operand.)

In the absence of a signature to the contrary, a macro is called as if it were a method on the current match object returned from the grammar rule being reduced; that is, all the current parse information is available by treating self as if it were a $/ object. [Conjecture: alternate representations may be available if arguments are declared with particular AST types.]

Macros may return either a string to be reparsed, or a syntax tree that needs no further parsing. The textual form is handy, but the syntax tree form is generally preferred because it allows the parser and debugger to give better error messages. Textual substitution on the other hand tends to yield error messages that are opaque to the user. Syntax trees are also better in general because they are reversible, so things like syntax highlighters can get back to the original language and know which parts of the derived program come from which parts of the user's view of the program.

To aid in returning a syntax tree, Perl provides a "quasiquoting" mechanism using the quote q:code, followed by a block intended to represent an AST:

    return q:code { say "foo" };

Modifiers to the :code adverb can modify the operation:

    :ast(MyAst)         # Default :ast(AST)
    :lang(Ruby)         # Default :lang($?PARSER)
    :unquote<[: :]>     # Default "triple rule"

Within a quasiquote, variable and function names resolve according to the lexical scope of the macro definition. Unrecognized symbols raise errors when the macro is being compiled, not when it's being used.

To make a symbol resolve to the (partially compiled) scope of the macro call, use the COMPILING:: pseudo-package:

    macro moose () { q:code { $COMPILING::x } }

    moose(); # macro-call-time error
    my $x;
    moose(); # resolves to 'my $x'

If you want to mention symbols from the scope of the macro call, use the import syntax as modifiers to :code:

    :COMPILING<$x>      # $x always refers to $x in caller's scope
    :COMPILING          # All free variables fallback to caller's scope

If those symbols do not exist in the compiling scope, a compile-time exception is thrown at macro call time.

Similarly, in the macro body you may either refer to the $x declared in the scope of the macro call as $COMPILING::x, or bind to them explicitly:

    my $x := $COMPILING::x;

You may also use an import list to bind multiple symbols into the macro's lexical scope:

    require COMPILING <$x $y $z>;

Note that you need to use the run-time := and require forms, not ::= and use, because the macro caller's compile-time is the macro's runtime.

Splicing

Bare AST variables (such as the arguments to the macro) may not be spliced directly into a quasiquote because they would be taken as normal bindings. Likewise, program text strings to be inserted need to be specially marked or they will be bound normally. To insert an "unquoted" expression of either type within a quasiquote, use the quasiquote delimiter tripled, typically a bracketing quote of some sort:

    return q:code { say $a + {{{ $ast }}} }
    return q:code [ say $a + [[[ $ast ]]] ]
    return q:code < say $a + <<< $ast >>> >
    return q:code ( say $a + ((( $ast ))) )

The delimiters don't have to be bracketing quotes, but the following is probably to be construed as Bad Style:

    return q:code / say $a + /// $ast /// /

(Note to implementors: this must not be implemented by finding the final closing delimiter and preprocessing, or we'll violate our one-pass parsing rule. Perl 6 parsing rules are parameterized to know their closing delimiter, so adding the opening delimiter should not be a hardship. Alternately the opening delimiter can be deduced from the closing delimiter. Writing a rule that looks for three opening delimiters in a row should not be a problem. It has to be a special grammar rule, though, not a fixed token, since we need to be able to nest code blocks with different delimiters. Likewise when parsing the inner expression, the inner parser subrule is parameterized to know that }}} or whatever is its closing delimiter.)

Unquoted expressions are inserted appropriately depending on the type of the variable, which may be either a syntax tree or a string. (Again, syntax tree is preferred.) The case is similar to that of a macro called from within the quasiquote, insofar as reparsing only happens with the string version of interpolation, except that such a reparse happens at macro call time rather than macro definition time, so its result cannot change the parser's expectations about what follows the interpolated variable.

Hence, while the quasiquote itself is being parsed, the syntactic interpolation of an unquoted expression into the quasiquote always results in the expectation of an operator following the variable. (You must use a call to a submacro if you want to expect something else.) Of course, the macro definition as a whole can expect whatever it likes afterwards, according to its syntactic category. (Generally, a term expects a following postfix or infix operator, and an operator expects a following term or prefix operator.)

Quasiquotes default to hygienic lexical scoping, just like closures. The visibility of lexical variables is limited to the q:code expression by default. A variable declaration can be made externally visible using the COMPILING:: pseudo-package. Individual variables can be made visible, or all top-level variable declarations can be exposed using the q:code(:COMPILING) form.

Both examples below will add $new_variable to the lexical scope of the macro call:

    q:code {  my $COMPILING::new_variable;   my $private_var; ... }
    q:code(:COMPILING) { my $new_variable; { my $private_var; ... } }

(Note that :COMPILING has additional effects described in Macros.)

Anonymous hashes vs blocks

{...} is always a block. However, if it is completely empty or consists of a single list, the first element of which is either a hash or a pair, it is executed immediately to compose a Hash object.

The standard pair list operator is equivalent to:

    sub pair (*@LIST) {
        my @pairs;
        for @LIST -> $key, $val {
            push @pairs, $key => $val;
        }
        return @pairs;
    }

or more succinctly (and lazily):

    sub pair (*@LIST) {
        gather {
            for @LIST -> $key, $val {
                take $key => $val;
            }
        }
    }

The standard hash list operator is equivalent to:

    sub hash (*@LIST) {
        return { pair @LIST };
    }
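
For comparison, the same two operators can be modeled in Python, where a generator gives the lazy behaviour of the gather/take version. This is a sketch only: pair and make_hash are hypothetical names, and rejecting an odd number of elements is an assumption of the model, since the text above does not say what happens in that case.

```python
import itertools
from typing import Any, Dict, Iterable, Iterator, Tuple


def pair(items: Iterable[Any]) -> Iterator[Tuple[Any, Any]]:
    """Lazily yield (key, value) tuples, consuming items two at a time."""
    iterator = iter(items)
    for key in iterator:
        try:
            value = next(iterator)
        except StopIteration:
            # Assumption of this model: a dangling key is an error.
            raise ValueError("pair() needs an even number of elements") from None
        yield key, value


def make_hash(items: Iterable[Any]) -> Dict[Any, Any]:
    """Build a dictionary from a flat key, value, key, value sequence."""
    return dict(pair(items))


pairs = list(pair([1, 2, 3, 4, 5, 6]))
table = make_hash([1, 2, 3, 4, 5, 6])

# Laziness: only the first two elements of an endless sequence are consumed.
first = next(pair(itertools.count(1)))
```

As in the Perl 6 definitions, the hash builder is just the pair operator with its results collected into a mapping.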

So you may use sub or hash or pair to disambiguate:

    $ref =  sub { 1, 2, 3, 4, 5, 6 };   # Anonymous sub returning list
    $ref =      { 1, 2, 3, 4, 5, 6 };   # Anonymous sub returning list
    $ref =      { 1=>2, 3=>4, 5=>6 };   # Anonymous hash
    $ref =      { 1=>2, 3, 4, 5, 6 };   # Anonymous hash
    $ref =  hash( 1, 2, 3, 4, 5, 6 );   # Anonymous hash
    $ref =  hash  1, 2, 3, 4, 5, 6  ;   # Anonymous hash
    $ref = { pair 1, 2, 3, 4, 5, 6 };   # Anonymous hash

Pairs as lvalues

Pairs can be used as lvalues. The value of the pair is the recipient of the assignment:

    (key => $var) = "value";

When binding pairs, names can be used to "match up" lvalues and rvalues, provided you write the left side as a signature using :(...) notation:

    :(:who($name), :why($reason)) := (why => $because, who => "me");

(Otherwise the parser doesn't know it should parse the insides as a signature and not as an ordinary expression until it gets to the :=, and that would be bad. Possibly we should require a "my" out front as well...)

Out-of-scope names

GLOBAL::<$varname> specifies the $varname declared in the * namespace. Or maybe it's the other way around...

CALLER::<$varname> specifies the $varname visible in the dynamic scope from which the current block/closure/subroutine was called, provided that variable is declared with the "env" declarator. (Implicit lexicals such as $_ are automatically assumed to be environmental.)

ENV::<$varname> specifies the $varname visible in the innermost dynamic scope that declares the variable with the "env" declarator.

MY::<$varname> specifies the lexical $varname declared in the current lexical scope.

OUR::<$varname> specifies the $varname declared in the current package's namespace.

COMPILING::<$varname> specifies the $varname declared (or about to be declared) in the lexical scope currently being compiled.

OUTER::<$varname> specifies the $varname declared in the lexical scope surrounding the current lexical scope (i.e. the scope in which the current block was defined).