Revision 28344

Date: 2009/09/21 20:57:15
Author: lwall
Revision Log:
[S03,S09]
Range objects are now primarily intervals in C<cmp>
Extend dwimminess of series to handle steps and limits readably
:by is deemed Too Ugly and is now dead, David Green++
Use series operator to replace :by semantics more readably
Range objects used as lists now simply mutate .. to ...
(taking into account ^ though)
Alpha ranges must now match endpoint using !after semantics on non-eqv
Simplify range semantics when used as subscripts
Kill unshifty negative subscript lvalues as too error prone
Spec way to declare modular subscripts
Files:

  • docs/Perl6/Spec/S03-operators.pod

     
    14 14
    15 15 Created: 8 Mar 2004
    16 16
    17 Last Modified: 2 Sep 2009
    18 Version: 172
    17 Last Modified: 21 Sep 2009
    18 Version: 173
    19 19
    20 20 =head1 Overview
    21 21
     
    270 270
    271 271 Pair composers
    272 272
    273 :by(2)
    273 :limit(5)
    274 274 :!verbose
    275 275
    276 276 =item *
     
    1413 1413
    1414 1414 Adverbs will generally attach the way you want when you say things like
    1415 1415
    1416 1 .. $x+2 :by(2)
    1416 1 op $x+2 :mod($x)
    1417 1417
    1418 1418 The proposed internal testing syntax makes use of these precedence rules:
    1419 1419
     
    1751 1751 More typically the function is unary, in which case any extra values
    1752 1752 in the list may be construed as human-readable documentation:
    1753 1753
    1754 0,2,4 ... { $_ + 2 } # same as 1..*:by(2)
    1754 0,2,4 ... { $_ + 2 } # all the evens
    1755 0,2,4 ... *+2 # same thing
    1755 1756 <a b c> ... { .succ } # same as 'a'..*
    1756 1757
    1757 The function need not be monotonic, of course:
    1758 The function need not be monotonically increasing, of course:
    1758 1759
    1759 1760 1 ... { -$_ } # 1, -1, 1, -1, 1, -1...
    1760 1761 False ... &prefix:<!> # False, True, False...
     
    1763 1764
    1764 1765 () ... { rand } # list of random numbers
    1765 1766
    1766 The function may also be slurpy (*-ary), in which case all the
    1767 The function may also be slurpy (n-ary), in which case all the
    1767 1768 preceding values are passed in (which means they must all be cached
    1768 1769 by the operator, so performance may suffer).
    1769 1770
     
    1773 1774 1,1 ... { $^a + 1, $^b * 2 } # 1,1,2,2,3,4,4,8,5,16,6,32...
    1774 1775
    1775 1776 If the right operand is C<*> (Whatever) and the sequence is obviously
    1776 arithmetic or geometric, the appropriate function is deduced:
    1777 arithmetic or geometric (from examining its I<last> 3 values), the appropriate function is deduced:
    1777 1778
    1778 1779 1, 3, 5 ... * # odd numbers
    1779 1780 1, 2, 4 ... * # powers of 2
    1780 1781
    1781 Conjecture: other such patterns may be recognized in the future,
    1782 depending on which unrealistic benchmarks we want to run faster. C<:)>
    1782 If there are only two values so far, C<*> assumes an arithmetic
    1783 progression. If there is only one value (or if the final values do
    1784 not support the requisite arithmetic), C<*> assumes incrementation
    1785 via C<.succ>. Hence these come out the same:
    1783 1786
    1784 Note: the yada operator is recognized only where a term is expected.
    1785 This operator may only be used where an infix is expected. If you
    1786 put a comma before the C<...> it will be taken as a yada list operator
    1787 expressing the desire to fail when the list reaches that point:
    1787 1..*
    1788 1...*
    1789 1,2,3...*
    1788 1790
    1789 1..20, ... "I only know up to 20 so far mister"
    1791 If the list on the left is C<Nil>, C<*> will return a single C<Nil>.
    1790 1792
    1791 If the yada operator finds a closure for its argument at compile time,
    1792 it should probably whine about the fact that it's difficult to turn
    1793 a closure into an error message. Alternately, we could treat
    1794 an ellipsis as special when it follows a comma to better support
    1795 traditional math notation.
    1793 Conjecture: other such patterns may be recognized in the future,
    1794 depending on which unrealistic benchmarks we want to run faster. C<:)>
    1796 1795
    1797 1796 The function may choose to terminate its list by returning ().
    1798 1797 Since this operator is list associative, an inner function may be
     
    1809 1808 10,20,30,40,50,60,70,80,90,
    1810 1809 100,200,300,400,500,600,700,800,900
    1811 1810
    1811 If the right operand is a list and the first element of the list is
    1812 a function or C<*>, the second element of the list imposes a limit
    1813 on the prior sequence. (The limit is inclusive on an exact match,
    1814 and in general is compared using C<!after> semantics, so an inexact
    1815 match is *not* included.) Hence the preceding example may be rewritten
    1816
    1817 1 ... * + 1, 9
    1818 10 ... * + 10, 90
    1819 100 ... * + 100, 1000
    1820
    1821 or as
    1822
    1823 1, 2, 3 ... *,
    1824 10, 20, 30 ... *,
    1825 100, 200, 300 ... *, 1000
    1826
    1827 In the latter case the preceding 3 elements are used to deduce
    1828 the correct arithmetic progression, so the 3, 30, and 300
    1829 terms are necessary.
    1830
    1831 If the first element of the list is numeric, a C<*> is assumed
    1832 before it, and the first element is again taken as the limit.
    1833 So the preceding example reduces to:
    1834
    1835 1, 2, 3 ...
    1836 10, 20, 30 ...
    1837 100, 200, 300 ... 1000
    1838
    1839 These rules may seem complicated, but they're essentially just replicating
    1840 what a human does naturally when you say "and so on".
    1841
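The deduction and limit rules added above can be modelled outside Perl 6. The following is a minimal Python sketch, not the spec's implementation: `deduce_step` and `series` are hypothetical names, `+ 1` stands in for `.succ`, and only increasing sequences are handled. It deduces an arithmetic or geometric step from the last three values, falls back as the text describes, and stops as soon as the next value would come after the limit, so an exact match is included and an overshoot is not.

```python
from fractions import Fraction
from itertools import islice

def deduce_step(seed):
    """Return a next-value function deduced from the last values of seed."""
    tail = seed[-3:]
    if len(tail) == 3:
        a, b, c = tail
        if b - a == c - b:                      # arithmetic progression
            d = c - b
            return lambda x: x + d
        if a != 0 and b != 0 and Fraction(b) / Fraction(a) == Fraction(c) / Fraction(b):
            r = Fraction(b) / Fraction(a)       # geometric progression
            return lambda x: x * r
    elif len(tail) == 2:                        # two values: assume arithmetic
        d = tail[1] - tail[0]
        return lambda x: x + d
    return lambda x: x + 1                      # assumed stand-in for .succ

def series(seed, limit=None):
    """Yield the seed, then deduced values; stop once a value is after the limit."""
    step = deduce_step(seed)
    yield from seed
    current = seed[-1]
    while True:
        current = step(current)
        if limit is not None and current > limit:
            return                              # inexact match: not included
        yield current

print(list(series([1, 2, 3], 9)))               # limit reached exactly
print(list(series([1, 3, 5], 10)))              # 11 would overshoot 10
print(list(islice(series([1, 2, 4]), 6)))       # unbounded, so take six
```

The geometric branch uses `Fraction` so that ratios such as 2/3 compare exactly; that is a choice made for this sketch only.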
    1842 Note that the sequence
    1843
    1844 1.0 ... *+0.2, 2.0
    1845
    1846 is calculated in C<Rat> arithmetic, not C<Num>, so the C<2.0> matches
    1847 exactly and terminates the sequence.
    1848
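The point about C<Rat> arithmetic can be illustrated with exact rationals in Python. This is a sketch under the assumption that `fractions.Fraction` plays the role of C<Rat> and a Python `float` the role of C<Num>; `count_up` is a hypothetical helper, not anything from the spec.

```python
from fractions import Fraction

def count_up(start, step, limit):
    """Collect start, start+step, ... while the value is not after the limit."""
    values = []
    current = start
    while current <= limit:
        values.append(current)
        current += step
    return values

# Exact rational arithmetic: the last value is exactly 2.
rat = count_up(Fraction(1), Fraction(1, 5), Fraction(2))
print(rat[-1] == 2, len(rat))

# Binary floating point accumulates rounding error, so the final
# element is not guaranteed to compare equal to 2.0.
num = count_up(1.0, 0.2, 2.0)
print(num)
```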
    1849 Note: the yada operator is recognized only where a term is expected.
    1850 This operator may only be used where an infix is expected. If you
    1851 put a comma before the C<...> it will be taken as a yada list operator
    1852 expressing the desire to fail when the list reaches that point:
    1853
    1854 1..20, ... "I only know up to 20 so far mister"
    1855
    1856 If the yada operator finds a closure for its argument at compile time,
    1857 it should probably whine about the fact that it's difficult to turn
    1858 a closure into an error message. Alternately, we could treat
    1859 an ellipsis as special when it follows a comma to better support
    1860 traditional math notation.
    1861
    1812 1862 In slice context the function's return value is appended as a capture
    1813 1863 rather than as a flattened list of values, and the argument to each
    1814 1864 function call is the previous capture in the list.
    1815 1865
    1866 If a series is generated using a non-monotonic C<.succ> function, it is
    1867 possible for it never to reach the endpoint. The following matches:
    1868
    1869 'A' ... 'Z'
    1870
    1871 but since 'Z' increments to 'AA', none of these ever terminate:
    1872
    1873 'A' ... 'z'
    1874 'A' ... '_'
    1875 'A' ... '~'
    1876
    1877 The compiler is allowed to complain if it notices these, since if you
    1878 really want the infinite list you can always write:
    1879
    1880 'A' ... *
    1881
    1882 To preserve Perl 5 semantics, you'd need something like:
    1883
    1884 'A' ... { my $new = $_.succ; ($_ ne $endpoint && $new.chars <= 1) ?? $new !! () }
    1885
    1886 But since lists are lazy in Perl 6, we don't try to protect the user this way.
    1887
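Why C<'A' ... 'z'> never terminates can be seen in a small model of the carry. The Python sketch below is an illustration only: `succ` handles uppercase ASCII strings and nothing else, `alpha_series` is a hypothetical name, the endpoint test is plain equality (an assumption made for this sketch), and the `cap` argument exists solely so the demonstration halts.

```python
def succ(s):
    """Increment an uppercase ASCII string with carry: 'A'->'B', 'Z'->'AA'."""
    chars = list(s)
    i = len(chars) - 1
    while i >= 0:
        if chars[i] != 'Z':
            chars[i] = chr(ord(chars[i]) + 1)
            return ''.join(chars)
        chars[i] = 'A'          # carry into the next position to the left
        i -= 1
    return 'A' + ''.join(chars)  # every position carried: grow by one letter

def alpha_series(start, end, cap=100):
    """Yield start, succ(start), ... until a value equals end or cap is hit."""
    current = start
    for _ in range(cap):
        yield current
        if current == end:
            return
        current = succ(current)

print(len(list(alpha_series('A', 'Z'))))   # reaches 'Z' and stops
print(len(list(alpha_series('A', 'z'))))   # never sees 'z'; only the cap stops it
```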
    1816 1888 =back
    1817 1889
    1818 1890 Many of these operators return a list of C<Capture>s, which depending on
     
    2937 3009
    2938 3010 The C<..> range operator has variants with C<^> on either end to
    2939 3011 indicate exclusion of that endpoint from the range. It always
    2940 produces a C<Range> object. Range objects are immutable (but can
    2941 spawn mutable C<RangeIterator> objects, and a C<RangeIterator> can
    2942 be interrogated for its current C<.from> and C<.to> values,
    2943 which change as they are iterated). The C<.minmax> method returns
    2944 both as a two-element list representing the interval. Ranges are not
    2945 autoreversing: C<2..1> is always a null range. Likewise, C<1^..^2>
    2946 produces no values when iterated, but does represent the interval from
    2947 1 to 2 excluding the endpoints when used as a pattern. To specify
    2948 a range in reverse use:
    3012 produces a C<Range> object. Range objects are immutable, and primarily
    3013 used for matching intervals. C<1..2> is the interval from 1 to 2
    3014 inclusive of the endpoints, whereas C<1^..^2> excludes the endpoints
    3015 but matches any real number in between.
    2949 3016
    2950 2..1:by(-1)
    3017 Range objects support C<.min> and C<.max> methods representing
    3018 their left and right arguments. The C<.minmax> method returns both
    3019 values as a two-element list representing the interval. Ranges are
    3020 not autoreversing: C<2..1> is always a null range.
    3021
    3022 If used in a list context, a C<Range> object returns an iterator that
    3023 produces a series of values starting at the min and ending at the max.
    3024 Either endpoint may be excluded using C<^>. Hence C<1..2> produces
    3025 C<(1,2)> but C<1^..^2> is equivalent to C<2..1> and produces no values (Nil).
    3026 To specify a series that counts down, use a reverse:
    3027
    2951 3028 reverse 1..2
    3029 reverse 'a'..'z'
    2952 3030
    2953 (The C<reverse> is preferred because it works for alphabetic ranges
    2954 as well.) Note that, while C<.minmax> normally returns C<(.from,.to)>,
    2955 a negative C<:by> causes the C<.minmax> method to return C<(.to,.from)>
    2956 instead. You may also use C<.min> and C<.max> to produce the individual
    2957 values of the C<.minmax> pair, but again note that they are reversed
    2958 from C<.from> and C<.to> when the step is negative. Since a reversed
    2959 C<Range> changes its direction, it swaps its C<.from> and C<.to> but
    2960 not its C<.min> and C<.max>.
    3031 Alternately, for numeric sequences, you can use the series operator instead
    3032 of the range operator:
    2961 3033
    2962 Because C<Range> objects are lazy, they do not automatically generate
    2963 a list. They only do so when iterated.
    2964 One result of this is that a reversed C<Range> object is still lazy.
    2965 Another is that smart matching against a C<Range> object smartmatches the
    3034 100,99,98 ... 0
    3035 100 ... *-1, 0 # same thing
    3036
    3037 In other words, any C<Range> used as a list assumes C<.succ> semantics,
    3038 never C<.pred> semantics. No other increment is allowed; if you wish
    3039 to increment a numeric sequence by some number other than 1, you must
    3040 use the C<...> series operator. (The C<Range> operator's C<:by> adverb
    3041 is hereby deprecated.)
    3042
    3043 0 ... *+0.1, 100 # 0, 0.1, 0.2, 0.3 ... 100
    3044
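The split between a range as an interval and a range used as a list can be sketched as follows. This Python model is an assumption-laden illustration: the `Range` class, its field names, and the `matches` method are invented here, and iteration simply adds 1 as a stand-in for `.succ`.

```python
from dataclasses import dataclass

@dataclass(frozen=True)            # immutable, like the spec's Range objects
class Range:
    low: float
    high: float
    excl_low: bool = False         # models a leading ^
    excl_high: bool = False        # models a trailing ^

    def matches(self, x):
        """Interval test: any real number between the endpoints matches."""
        above = x > self.low if self.excl_low else x >= self.low
        below = x < self.high if self.excl_high else x <= self.high
        return above and below

    def minmax(self):
        return (self.low, self.high)

    def __iter__(self):
        """List use: count upward from the min, never downward."""
        current = self.low + 1 if self.excl_low else self.low
        while (current < self.high) if self.excl_high else (current <= self.high):
            yield current
            current += 1

open_range = Range(1, 2, excl_low=True, excl_high=True)   # 1^..^2
print(open_range.matches(1.5), list(open_range))          # matches 1.5, iterates to nothing
print(list(Range(2, 1)))                                  # 2..1 is a null range
print(list(reversed(list(Range(1, 2)))))                  # counting down needs a reverse
```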
    3045 Smart matching against a C<Range> object smartmatches the
    2966 3046 endpoints in the domain of the object being matched, so fractional
    2967 3047 numbers are C<not> truncated before comparison to integer ranges:
    2968 3048
     
    2974 3054 typespace the range is operating, as inferred from the left operand.
    2975 3055 A C<*> on the left means "negative infinity" for types that support
    2976 3056 negative values, and the first value in the typespace otherwise as
    2977 inferred from the right operand. (For signed infinities the signs
    2978 reverse for a negative step.) A star on both sides prevents any type
    2979 from being inferred other than the C<Ordered> role.
    3057 inferred from the right operand. (A star on both sides is not allowed.)
    2980 3058
    2981 3059 0..* # 0 .. +Inf
    2982 'a'..* # 'a' .. 'zzzzzzzzzzzzzzzzzzzzzzzzzzzzz...'
    3060 'a'..* # 'a' le $_
    2983 3061 *..0 # -Inf .. 0
    2984 *..* # "-Inf .. +Inf", really Ordered
    3062 *..* # Illegal
    2985 3063 1.2.3..* # Any version higher than 1.2.3.
    2986 3064 May..* # May through December
    2987 3065
    2988 Note: infinite lists are constructed lazily. And even though C<*..*>
    2989 can't be constructed at all, it's still useful as a selector object.
    2990
    2991 For any kind of zip or dwimmy hyper operator, any list ending with C<*>
    2992 is assumed to be infinitely extensible by taking its final element
    2993 and replicating it:
    2994
    2995 @array, *
    2996
    2997 is short for something like:
    2998
    2999 @array[0..^@array], @array[*-1] xx *
    3000
    3001 3066 An empty range cannot be iterated; it returns a C<Nil> instead. An empty
    3002 3067 range still has a defined min and max, but the min is greater than the max.
    3003 3068
    3004 If a range is generated using a magical autoincrement, it stops if the magical
    3005 increment would "carry" and make the next value longer (in graphemes) than the "to" value, on the
    3006 assumption that the sequence can never match the final value exactly. Hence,
    3007 all of these produce 'A' .. 'Z':
    3069 Ranges that are iterated transmute into the corresponding series operator,
    3070 and hence use C<!after> semantics to determine an end to the sequence.
    3008 3071
    3009 'A' .. 'Z'
    3010 'A' .. 'z'
    3011 'A' .. '_'
    3012 'A' .. '~'
    3013
    3014 3072 =item *
    3015 3073
    3016 3074 The unary C<^> operator generates a range from C<0> up to
     
    3789 3847 will apply the hyper operator to just the values but return a new
    3790 3848 hash value with the same set of keys as the original hash.
    3791 3849
    3850 For any kind of zip or dwimmy hyper operator, any list ending with C<*>
    3851 is assumed to be infinitely extensible by taking its final element
    3852 and replicating it:
    3853
    3854 @array, *
    3855
    3856 is short for something like:
    3857
    3858 @array[0..^@array], @array[*-1] xx *
    3859
    3792 3860 =head2 Reduction operators
    3793 3861
    3794 3862 Any infix operator (except for non-associating operators)
  • docs/Perl6/Spec/S09-data.pod

     
    13 13
    14 14 Created: 13 Sep 2004
    15 15
    16 Last Modified: 17 Jun 2009
    17 Version: 34
    16 Last Modified: 21 Sep 2009
    17 Version: 35
    18 18
    19 19 =head1 Overview
    20 20
     
    203 203
    204 204 @dwarves[7] = 'Sneaky'; # Fails with "invalid index" exception
    205 205
    206 However, it is legal for a C<Range> object to extend beyond the end
    207 of an array as long as its min value is a valid subscript; the range
    208 is truncated as necessary to map only valid locations.
    209
    206 210 It's also possible to explicitly specify a normal autoextending array:
    207 211
    208 212 my @vices[*]; # Length is: "whatever"
    209 213 # Valid indices are 0..*
    210 214
    215 For subscripts containing ranges extending beyond the end of
    216 autoextending arrays, the range is truncated to the actual current
    217 size of the array rather than the declared size of that dimension.
    218 It is allowed for such a range to start one after the end, so that
    219
    220 @array[0..*]
    221
    222 merely returns Nil if C<@array> happens to be empty. However,
    223
    224 @array[1..*]
    225
    226 would fail because the range's min is too big.
    227
    228 Going the other way, it is allowed for a range to start with a negative
    229 number as long as the endpoint is at least -1; in this case the
    230 front of the range is truncated.
    231
    232 Note that these rules mean it doesn't matter whether you say
    233
    234 @array[*]
    235 @array[0 .. *]
    236 @array[0 .. *-1]
    237 @array[-Inf .. *-1 ]
    238
    239 because they all end up meaning the same thing.
    240
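One way to read the truncation rules for an autoextending array is sketched below in Python. The function name `slice_by_range` is hypothetical, `math.inf` stands in for `*` as an endpoint (so `*-1` is written as `len(array) - 1`), and the two error conditions are this sketch's interpretation of "min is too big" and "endpoint is at least -1".

```python
import math

def slice_by_range(array, low, high):
    """Model @array[low..high] with the range truncated to valid indices."""
    size = len(array)
    if low > size:                       # may start at most one after the end
        raise IndexError("range min is too big")
    if high < -1:                        # a negative start needs an endpoint >= -1
        raise IndexError("range endpoint is below -1")
    start = max(low, 0)                  # truncate the front
    stop = min(high, size - 1)           # truncate the back
    return [array[i] for i in range(start, int(stop) + 1)]

array = [10, 20, 30]
last = len(array) - 1                    # plays the role of *-1
print(slice_by_range(array, 0, math.inf))        # @array[0 .. *]
print(slice_by_range(array, 0, last))            # @array[0 .. *-1]
print(slice_by_range(array, -math.inf, last))    # @array[-Inf .. *-1]
print(slice_by_range([], 0, math.inf))           # empty array: nothing, no error
```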
    241 As a special form, numeric subscripts may be declared as cyclical
    242 using an initial C<%>:
    243
    244 my @seasons[%4];
    245
    246 In this case, all numeric values are taken modulo 4, and no range truncation can
    247 ever happen. If you say
    248
    249 @seasons[-4..7] = 'a' .. 'l';
    250
    251 then each element is written three times and the array ends up with C<['i','j','k','l']>.
    252
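The modular-subscript example can be checked with a short Python sketch. `assign_modular` is a hypothetical helper; it relies on Python's `%` returning a non-negative result for a positive modulus, which matches the wrap-around described above.

```python
def assign_modular(array, indices, values):
    """Store each value at its index taken modulo the array length."""
    size = len(array)
    for index, value in zip(indices, values):
        array[index % size] = value      # -4 maps to 0, 7 maps to 3
    return array

seasons = [None] * 4                     # models: my @seasons[%4]
assign_modular(seasons, range(-4, 8), "abcdefghijkl")   # @seasons[-4..7] = 'a' .. 'l'
print(seasons)                           # ['i', 'j', 'k', 'l']
```

Twelve values land on four slots, so each slot is written three times and the last write wins.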
    211 253 =head1 Typed arrays
    212 254
    213 255 The type of value stored in each element of the array (normally C<Object>)
     
    492 534 but not:
    493 535
    494 536 my @virtue{*..6};
    495 my @koalas{*..*};
    496 537 my @celebs{*};
    497 538
    498 539 These last three are not allowed because there is no first index, and
     
    633 674 @array[*+1] # Second element after the end of the array
    634 675
    635 676 @array[*-3..*-1] # Slice from third-last element to last element
    677 @array[*-3..*] # (Same thing via range truncation)
    636 678
    637 679 (Note that, if a particular array dimension has fixed indices, any
    638 attempt to index elements after the last defined index will fail.)
    680 attempt to index elements after the last defined index will fail,
    681 except in the case of range truncation described earlier.)
    639 682
    640 Using a standard index less than zero prepends the corresponding number
    641 of elements to the start of the array and then maps the negative index
    642 back to zero:
    683 Negative subscripts are never allowed for standard subscripts unless
    684 the subscript is declared modular.
    643 685
    644 @results[-1] = 42; # Same as: @results.unshift(42)
    645
    646 @dwarves[-2..-1] # Same as: @dwarves.unshift(<Groovy Sneaky>)
    647 = <Groovy Sneaky>;
    648
    649 Note that, as with a normal C<unshift>, the new elements are
    650 actually stored starting at standard index zero, after pre-existing
    651 elements have been bumped to the right. Hence after the assignments
    652 in the preceding example:
    653
    654 say @results[0]; # 42
    655 say @dwarves[0]; # Groovy
    656
    657 Using a negative index on an array of fixed size will fail if the
    658 resulting number of elements exceeds the defined size.
    659
    660 Note that the behaviour of negative indices in Perl 6 is
    661 different to that in Perl 5:
    662
    663 # Perl 5...
    664 ............_____________________________..................
    665 : | | | | | | : :
    666 .....:.....|_____|_____|_____|_____|_____|.....:.....:.....
    667 [0] [1] [2] [3] [4] [5] [6] [7]
    668 [-7] [-6] [-5] [-4] [-3] [-2] [-1]
    669
    670
    671 # Perl 6...
    672 ............_____________________________..................
    673 : | | | | | | : :
    674 .....:.....|_____|_____|_____|_____|_____|.....:.....:.....
    675 [-2] [-1] [0] [1] [2] [3] [4] [5] [6] [7]
    676 [*-7] [*-6] [*-5] [*-4] [*-3] [*-2] [*-1] [*+0] [*+1] [*+2]
    677
    678 686 The Perl 6 semantics avoids indexing discontinuities (a source of subtle
    679 687 runtime errors), and provides ordinal access in both directions at both
    680 688 ends of the array.