This page was generated at 2008-05-17 02:01:11 GMT.
(syn r14541, pugs-tests r20436, pugs-smoke r19912)

TITLE

Synopsis 6: Subroutines

AUTHOR

Damian Conway <damian@conway.org> and Allison Randal <al@shadowed.net>

VERSION

  Maintainer: Larry Wall <larry@wall.org>
  Date: 21 Mar 2003
  Last Modified: 17 Mar 2008
  Number: 6
  Version: 94

This document summarizes Apocalypse 6, which covers subroutines and the new type system.

Subroutines and other code objects

Subroutines (keyword: sub) are non-inheritable routines with parameter lists.

Methods (keyword: method) are inheritable routines which always have an associated object (known as their invocant) and belong to a particular kind or class.

Submethods (keyword: submethod) are non-inheritable methods, or subroutines masquerading as methods. They have an invocant and belong to a particular kind or class.

Regexes (keyword: regex) are methods (of a grammar) that perform pattern matching. Their associated block has a special syntax (see Synopsis 5). (We also use the term "regex" for anonymous patterns of the traditional form.)

Tokens (keyword: token) are regexes that perform low-level non-backtracking (by default) pattern matching.

Rules (keyword: rule) are regexes that perform non-backtracking (by default) pattern matching (and also enable rules to do whitespace dwimmery).

Macros (keyword: macro) are routines whose calls execute as soon as they are parsed (i.e. at compile-time). Macros may return another source code string or a parse-tree.

Routine modifiers

Multis (keyword: multi) are routines that can have multiple variants that share the same name, selected by arity, types, or some other constraints.

Prototypes (keyword: proto) specify the commonalities (such as parameter names, fixity, and associativity) shared by all multis of that name in the scope of the proto declaration. A proto also adds an implicit multi to all routines of the same short name within its scope, unless they have an explicit modifier. (This is particularly useful when adding to rule sets or when attempting to compose conflicting methods from roles.)

Only (keyword: only) routines do not share their short names with other routines. This is the default modifier for all routines, unless a proto of the same name was already in scope.

A modifier keyword may occur before the routine keyword in a named routine:

    only sub foo {...}
    proto sub foo {...}
    multi sub foo {...}
    only method bar {...}
    proto method bar {...}
    multi method bar {...}

If the routine keyword is omitted, it defaults to sub.

Modifier keywords cannot apply to anonymous routines.
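
As an illustrative sketch of the multi modifier (the routine name describe and its messages are hypothetical, chosen only for this example), the following variants share one short name and are selected by argument type or by arity:

    multi sub describe (Int $n)  { "integer $n" }
    multi sub describe (Str $s)  { "string $s" }
    multi sub describe ($a, $b)  { "two arguments: $a and $b" }

    say describe(42);         # selected by type: the Int variant
    say describe("six");      # selected by type: the Str variant
    say describe(1, 2);       # selected by arity: the two-parameter variant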

Named subroutines

The general syntax for named subroutines is any of:

     my RETTYPE sub NAME ( PARAMS ) TRAITS {...}    # lexical only
    our RETTYPE sub NAME ( PARAMS ) TRAITS {...}    # also package-scoped
                sub NAME ( PARAMS ) TRAITS {...}    # same as "our"

The return type may also be put inside the parentheses:

    sub NAME (PARAMS --> RETTYPE) {...}

Unlike in Perl 5, named subroutines are considered expressions, so this is valid Perl 6:

    my @subs = (sub foo { ... }, sub bar { ... });
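
A small sketch of both points, assuming an implementation that follows this section (the name double and the sub bodies are hypothetical): a return type declared inside the signature, and named subs stored in an array and called through it.

    sub double (Int $n --> Int) { return $n * 2 }       # return type inside the parentheses
    my @subs = (sub foo { "foo" }, sub bar { "bar" });  # named subs used as expressions

    say double(21);     # 42
    say @subs[1]();     # calls bar through the array element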

Anonymous subroutines

The general syntax for anonymous subroutines is:

    sub ( PARAMS ) TRAITS {...}

But one can also use a scope modifier to introduce the return type first:

     my RETTYPE sub ( PARAMS ) TRAITS {...}
    our RETTYPE sub ( PARAMS ) TRAITS {...}

In this case there is no effective difference, since the distinction between my and our is only in the handling of the name, and in the case of an anonymous sub, there isn't one.

Trait is the name for a compile-time (is) property. See "Properties and traits".

Perl5ish subroutine declarations

You can declare a sub without parameter list, as in Perl 5:

    sub foo {...}

This is equivalent to

    sub foo (*@_, *%_) {...}

Positional arguments implicitly come in via the @_ array, but unlike in Perl 5 they are readonly aliases to actual arguments:

    sub say { print qq{"@_[]"\n}; }   # args appear in @_
    sub cap { $_ = uc $_ for @_ }   # Error: elements of @_ are read-only

Also unlike in Perl 5, Perl 6 has true named arguments, which come in via %_ instead of @_. (To construct pseudo-named arguments that come in via @_ as in Perl 5, the p5-to-p6 translator will use the ugly p5=> operator instead of Perl 6's => Pair constructor.)

If you need to modify the elements of @_ or %_, declare the array or hash explicitly with the is rw trait:

    sub swap (*@_ is rw, *%_ is rw) { @_[0,1] = @_[1,0]; %_<status> = "Q:S"; }

Note: the "rw" container trait is automatically distributed to the individual elements by the slurpy star even though there's no actual array or hash passed in. More precisely, the slurpy star means the declared formal parameter is not considered readonly; only its elements are. See "Parameters and arguments" below.

Note also that if the sub's block contains placeholder variables (such as $^foo or $:bar), those are considered to be formal parameters already, so in that case @_ or %_ fill the role of sopping up unmatched arguments. That is, if those containers are explicitly mentioned within the body, they are added as slurpy parameters. This allows you to easily customize your error message on unrecognized parameters. If they are not mentioned in the body, they are not added to the signature, and normal dispatch rules will simply fail if the signature cannot be bound.
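
The three implicit forms described in this section can be sketched side by side. The sub names here are hypothetical, and the sketch assumes an implementation that adds @_ and %_ to the signature when the body mentions them:

    sub count_args  { @_.elems }       # no parameter list: positionals arrive in @_
    sub status_of   { %_<status> }     # named arguments arrive in %_
    sub square      { $^x * $^x }      # the placeholder $^x acts as a formal parameter

    say count_args(1, 2, 3);           # 3
    say status_of(status => "ok");     # ok
    say square(7);                     # 49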

Blocks

Raw blocks are also executable code structures in Perl 6.

Every block defines an object of type Code, which may either be executed immediately or passed on as a Code object. How a block is parsed is context dependent.

A bare block where an operator is expected terminates the current expression and will presumably be parsed as a block by the current statement-level construct, such as an if or while. (If no statement construct is looking for a block there, it's a syntax error.) This form of bare block requires leading whitespace because a bare block where a postfix is expected is treated as a hash subscript.

A bare block where a term is expected merely produces a Code object. If the term bare block occurs in a list, it is considered the final element of that list unless followed immediately by a comma or colon (intervening \h* or "unspace" is allowed).

"Pointy blocks"

Semantically the arrow operator -> is almost a synonym for the sub keyword as used to declare an anonymous subroutine, insofar as it allows you to declare a signature for a block of code. However, the parameter list of a pointy block does not require parentheses, and a pointy block may not be given traits. In most respects, though, a pointy block is treated more like a bare block than like an official subroutine. Syntactically, a pointy block may be used anywhere a bare block could be used:

    my $sq = -> $val { $val**2 };
    say $sq(10); # 100
    my @list = 1..3;
    for @list -> $elem {
        say $elem; # prints "1\n2\n3\n"
    }

It also behaves like a block with respect to control exceptions. If you return from within a pointy block, the block is transparent to the return; it will return from the innermost enclosing sub or method, not from the block itself. It is referenced by &?BLOCK, not &?ROUTINE.
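
To make the transparency to return concrete, here is a minimal sketch (first_even is a hypothetical name). The return inside the pointy block leaves the enclosing sub, not just the loop body:

    sub first_even (@numbers) {
        for @numbers -> $n {
            return $n if $n % 2 == 0;   # leaves first_even, not just the block
        }
        return;                         # reached only if nothing matched
    }

    say first_even([1, 3, 4, 5]);       # 4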

A normal pointy block's parameters default to readonly, just like parameters to a normal sub declaration. However, the double-pointy variant defaults parameters to rw:

    for @list <-> $elem {
        $elem++;
    }

This form applies rw to all the arguments:

    for @kv <-> $key, $value {
        $key ~= ".jpg";
        $value *= 2 if $key ~~ :e;
    }
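
The effect on the caller's data can be seen in a short sketch (the array name and values are hypothetical). Because the double-pointy parameter is rw, each assignment writes through to the array element:

    my @prices = 1, 2, 3;
    for @prices <-> $price {
        $price *= 10;           # modifies the element itself, not a copy
    }
    # @prices now holds 10, 20, 30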

Stub declarations

To predeclare a subroutine without actually defining it, use a "stub block":

    sub foo {...}     # Yes, those three dots are part of the actual syntax

The old Perl 5 form:

    sub foo;

is a compile-time error in Perl 6 (because it would imply that the body of the subroutine extends from that statement to the end of the file, as class and module declarations do). The only allowed use of the semicolon form is to declare a MAIN sub--see "Declaring a MAIN subroutine" below.

Redefining a stub subroutine does not produce an error, but redefining an already-defined subroutine does. If you wish to redefine a defined sub, you must explicitly use the "is instead" trait.

The ... is the "yadayadayada" operator, which is executable but returns a failure. You can also use ??? to produce a warning, or !!! to always die. These also officially define stub blocks if used as the only expression in the block.

It has been argued that ... as literal syntax is confusing when you might also want to use it for metasyntax within a document. Generally this is not an issue in context; it's never an issue in the program itself, and in the few places where it could be an issue in the documentation, a comment will serve to clarify the intent, as above. The rest of the time, it doesn't really matter whether the reader takes ... as literal or not, since the purpose of ... is to indicate that something is missing whichever way you take it.

Globally scoped subroutines

Subroutines and variables can be declared in the global namespace, and are thereafter visible everywhere in a program.

Global subroutines and variables are normally referred to by prefixing their identifiers with * (short for "GLOBAL::"). The * is required on the declaration unless the GLOBAL namespace can be inferred some other way, but the * may be omitted on use if the reference is unambiguous:

    $*next_id = 0;
    sub *saith($text)  { print "Yea verily, $text" }
    module A {
        my $next_id = 2;    # hides any global or package $next_id
        saith($next_id);    # print the lexical $next_id;
        saith($*next_id);   # print the global $next_id;
    }
    module B {
        saith($next_id);    # Unambiguously the global $next_id
    }

However, under stricture (the default for most code), the * is required on variable references. It's never required on sub calls, and in fact, the syntax

    $x = *saith($y);

is illegal, because a * where a term is expected is always parsed as the "whatever" token. If you really want to use a *, you must also use the sigil along with the twigil:

    $x = &*saith($y);

Only the name is installed into the GLOBAL package by *. To define subs completely within the scope of the GLOBAL namespace you should use "package GLOBAL {...}" around the declaration.

Lvalue subroutines

Lvalue subroutines return a "proxy" object that can be assigned to. It's known as a proxy because the object usually represents the purpose or outcome of the subroutine call.

Subroutines are specified as being lvalue using the is rw trait.

An lvalue subroutine may return a variable:

    my $lastval;
    sub lastval () is rw { return $lastval }

or the result of some nested call to an lvalue subroutine:

    sub prevval () is rw { return lastval() }

or a specially tied proxy object, with suitably programmed FETCH and STORE methods:

    sub checklastval ($passwd) is rw {
        return new Proxy:
                FETCH => method {
                            return lastval();
                         },
                STORE => method ($val) {
                            die unless check($passwd);
                            lastval() = $val;
                         };
    }

Other methods may be defined for specialized purposes such as temporizing the value of the proxy.
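
As a usage sketch of the simplest case above, assigning to the call stores into the variable the sub hands back. For this example only, the sub relies on the block's final value rather than an explicit return; that is a choice made for the sketch, not something this section requires:

    my $lastval = 0;
    sub lastval () is rw { $lastval }    # the block's final value is the variable itself

    lastval() = 42;                      # assigns through the returned container
    say $lastval;                        # 42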

Operator overloading

Operators are just subroutines with special names and scoping. An operator name consists of a grammatical category name followed by a single colon followed by an operator name specified as if it were a hash subscript (but evaluated at compile time). So any of these indicates the same binary addition operator:

    infix:<+>
    infix:«+»
    infix:<<+>>
    infix:{'+'}
    infix:{"+"}

Use the & sigil just as you would on ordinary subs.

Unary operators are defined as prefix or postfix:

    sub prefix:<OPNAME>  ($operand) {...}
    sub postfix:<OPNAME> ($operand) {...}

Binary operators are defined as infix:

    sub infix:<OPNAME> ($leftop, $rightop) {...}
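
A brief sketch of these declarations in use, assuming an implementation that follows this section. The operator name avg is hypothetical, and the factorial body uses a reduction instead of recursion to keep the example short:

    sub infix:<avg>  ($leftop, $rightop) { ($leftop + $rightop) / 2 }
    sub postfix:<!>  (Int $n)            { [*] 1..$n }

    say 4 avg 10;               # 7
    say 5!;                     # 120
    say &infix:<avg>(4, 10);    # 7, the same operator called through its & name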

Bracketing operators are defined as circumfix where a term is expected or postcircumfix where a postfix is expected. A two-element slice containing the leading and trailing delimiters is the name of the operator.

    sub circumfix:<LEFTDELIM RIGHTDELIM> ($contents) {...}
    sub circumfix:{'LEFTDELIM','RIGHTDELIM'} ($contents) {...}

Contrary to Apocalypse 6, there is no longer any rule about splitting an even number of characters. You must use a two-element slice. Such names are canonicalized to a single form within the symbol table, so you must use the canonical name if you wish to subscript the symbol table directly (as in PKG::{'infix:<+>'}). Otherwise any form will do. (Symbolic references do not count as direct subscripts since they go through a parsing process.) The canonical form always uses angle brackets and a single space between slice elements. The elements are not escaped, so PKG::circumfix:{'<','>'} is canonicalized to PKG::{'circumfix:<< >>'}, and decanonicalizing always involves stripping the outer angles and splitting on space, if any. This works because a hash key knows how long it is, so there's no ambiguity about where the final angle is. And space works because operators are not allowed to contain spaces.

Operator names can be any sequence of non-whitespace characters including Unicode characters. For example:

    sub infix:<(c)> ($text, $owner) { return $text but Copyright($owner) }
    method prefix:<±> (Num $x --> Num) { return +$x | -$x }
    multi sub postfix:<!> (Int $n) { $n < 2 ?? 1 !! $n*($n-1)! }
    macro circumfix:«<!-- -->» ($text) is parsed / .*? / { "" }
    my $document = $text (c) $me;
    my $tolerance = ±7!;
    <!-- This is now a comment -->

Whitespace may never be part of the name (except as separator within a <...> or «...» slice subscript, as in the example above).

A null operator name does not define a null or whitespace operator, but a default matching subrule for that syntactic category, which is useful when there is no fixed string that can be recognized, such as tokens beginning with digits. Such an operator must supply an is parsed trait. The Perl grammar uses a default subrule for the :1st, :2nd, :3rd, etc. regex modifiers, something like this:

    sub regex_mod_external:<> ($x) is parsed(token { \d+[st|nd|rd|th] }) {...}

Such default rules are attempted in the order declared. (They always follow any rules with a known prefix, by the longest-token-first rule.)

Although the name of an operator can be installed into any package or lexical namespace, the syntactic effects of an operator declaration are always lexically scoped. Operators other than the standard ones should not be installed into the * namespace. Always use exportation to make non-standard syntax available to other scopes.

Parameters and arguments

Perl 6 subroutines may be declared with parameter lists.

By default, all parameters are readonly aliases to their corresponding arguments--the parameter is just another name for the original argument, but the argument can't be modified through it. This is vacuously true for value arguments, since they may not be modified in any case. However, the default forces any container argument to also be treated as an immutable value. This extends down only one level; an immutable container may always return an element that is mutable if it so chooses. (For this purpose a scalar variable is not considered a container of its singular object, though, so the top-level object within a scalar variable is considered immutable by default. Perl 6 does not have references in the same sense that Perl 5 does.)

To allow modification, use the is rw trait. This requires a mutable object or container as an argument (or some kind of protoobject that can be converted to a mutable object, such as might be returned by an array or hash that knows how to autovivify new elements). Otherwise the signature fails to bind, and this candidate routine cannot be considered for servicing this particular call. (Other multi candidates, if any, may succeed if they don't require rw for this parameter.) In any case, failure to bind does not by itself cause an exception to be thrown; that is completely up to the dispatcher.

To pass-by-copy, use the is copy trait. An object container will be cloned whether or not the original is mutable, while an (immutable) value will be copied into a suitably mutable container. The parameter may bind to any argument that meets the other typological constraints of the parameter.
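
The difference between the two traits can be sketched as follows (the sub and variable names are hypothetical). With is rw the caller's variable changes; with is copy only the sub's private copy does:

    sub bump_in_place ($n is rw)   { $n++ }              # modifies the caller's variable
    sub bumped        ($n is copy) { $n++; return $n }   # modifies a private copy

    my $x = 1;
    bump_in_place($x);      # $x is now 2
    my $y = bumped($x);     # $x is still 2, $y is 3

    # sub broken ($n) { $n++ }   # would fail when called: parameters are readonly by default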
