Perl Weekly Challenge: Week 376
Challenge 1:
Chessboard Squares
You are given two coordinates of a square on 8x8 chessboard.
Write a script to find the given two coordinates have the same colour.
8 W B W B W B W B
7 B W B W B W B W
6 W B W B W B W B
5 B W B W B W B W
4 W B W B W B W B
3 B W B W B W B W
2 W B W B W B W B
1 B W B W B W B W
a b c d e f g h
Example 1
Input: $c1 = "a7", $c2 = "f4"
Output: true
Example 2
Input: $c1 = "c1", $c2 = "e8"
Output: false
Example 3
Input: $c1 = "b5", $c2 = "h2"
Output: false
Example 4
Input: $c1 = "f3", $c2 = "h1"
Output: true
Example 5
Input: $c1 = "a1", $c2 = "g8"
Output: false
Because the colors of the squares of a chessboard alternate in a regular way it is easy given the coordinate of a square to determine its' color. In fact we can do it in one line of Raku or Perl code. As my one-liner actually contains two statements, I've split it for clarity.
First we need to get the input into the script. The first line takes two coordinates
as command-line arguments, .join()s them together and then splits them into an array of individual characters with .comb(). As I write this, it occurs to me that I could
have used ^2 instead of 0..1 and saved 2 characters but it's too late now. I'll
try and remember for next time.
my @a = (@*ARGS[0..1].join).comb;
Now we take the numeric value of the 1st element which is a letter with .ord() and
add the value of the 2nd element (which is already a number) to it. If the resulting
value is even, the square is white or if it is odd, the square is black. We can
determine if the value is odd or even by using the % 2 (modulus 2) operator.
The same procedure is followed for the 3rd and 4th elements. The two results are
compared with == and if they are both even or both odd, True is output otherwise False.
say (@a[0].ord + @a[1]) % 2 == (@a[2].ord + @a[3]) % 2
We can do the same thing in Perl but unfortunately booleans are still not automatically
stringified so we have to add ? "true" : "false" at the end.
my @a = split//, join q{}, @ARGV[0,1];
say (((ord($a[0]) + $a[1]) % 2 == (ord($a[2]) + $a[3]) % 2) ? "true" : "false")
Challenge 2:
Doubled Words
You are given a string (which may contain embedded newlines) which is taken from a page on a website. The string will not contain brackets qw{ [ ] }.
Write a script that will find doubled words (such as “this this”) and highlight (wrap in brackets) each doubled word.
The script should:
- Work across lines, even finding situations where a word at the end of
one line is repeated at the beginning of the next.
- Find doubled words despite capitalization differences, such as with
'The the...', as well as allow differing amounts of whitespace (spaces,
tabs, newlines, and the like) to lie between the words.
- Find doubled words even when separated by HTML tags. For example, to
make a word bold: '...it is <B>very</B> very important...'. Only show
lines containing doubled words.
Adapted from Mastering Regular Expressions, Third Edition by Jeffrey E. F. Friedl
Example 1
Input: $str = "you're given the job of checking the pages on a\nweb server for doubled words (such as 'this this'), a common problem\nwith documents subject to heavy editing."
Output: "web server for doubled words (such as '[this] [this]'), a common problem"
Example 2
Input: $str = "Find doubled words despite capitalization differences, such as with 'The\nthe...', as well as allow differing amounts of whitespace (spaces,\ntabs, newlines, and the like) to lie between the words."
Output: "Find doubled words despite capitalization differences, such as with '[The]\n[the]...', as well as allow differing amounts of whitespace (spaces,"
Example 3
Input: $str = "to make a word bold: '...it is <B>very</B> very important...'."
Output: "to make a word bold: '...it is <B>[very]</B> [very] important...'."
Example 4
Input: $str = "Perl officially stands for Practical Extraction and Report Language, except when it doesn't."
Output: ""
Example 5
Input: $str = "There's more than one one way to do it.\nEasy things should be easy and hard things should be possible."
Output: "There's more than [one] [one] way to do it."
The ascription to Friedl's book (which I own and is great btw) gives away how we should solve this problem. Regular Expressions are very powerful but advanced usages require quite a bit of thought. I spent more time on this challenge than any in recent memory.
Because I am more familiar with Perl regular expressions, I did Perl first. The
core of the script is a single use of the s/// (substitution) operator. I used
the /x flag so I could spread it out over several lines for legibility.
$str =~ s/
The first part of the operator is a regular expression search pattern. Because
we are using the /i flag, the search will be case-insensitive. The /g flag
makes the search global and /s flag treats the entire input as one continuous string
so matches can occur accross line boundaries.
The first thing to match is a sequence of alphanumeric characters (the character class \w) preceded by a word boundary (\b). This is captured by the surrounding parentheses for future
use. I'm not exactly sure why we needed the preceding \b but it would not work properly
without it.
\b (\w+)
Prospective doubled words can be separated by whitespace \s. The spec says we
should also treat HTNL tags as separators. The following regular expression matches
whitespace or tags. All this is also captured. Note that the expression for an HTML
tag is also in parentheses but this is only to group it together. We signal to the
regex engine that this group is not for capturing by putting ?: before the operning
parenthesis.
( \s | (?: \s? <[^>]+> \s? )+ )
Then we try and match another instance of the first match represented by the back reference
\1. This too is captured.
(\1)
Now for the second part of the operator. We have three matches: a word, intervening
whitespace, and the words' double. Because we have the /e flag, we can interpolate them into a string as backreferences $1,$2, and $3. We enclose the doubled words, i.e.
$1 and $3 in square brackets. This string replaces whatever was previously in that position in $str.
/
"\[$1]$2\[$3]"
In Perl, flags to the s/// operator go at the end whereas in Raku they go after
the s.
/igxe;
Now we only want to print lines that contain matches so first we split() $str into
lines and with grep() filter any line that contains [; or we could have looked
for ] but we don't need both. The matched lines are join()ed together with newlines
and the whole thing is output with say().
say join qq{\n}, grep { /[\[]/ } split /\n/, $str;
For Raku, we are employing the same method but Raku regular expressions have a somewhat different syntax I'm not 100% used to yet so there may be better ways to do some of this.
Because function parameters are immutable by default in Raku, we can't modify $str
so the first order of business is to copy it to another variable.
my $text = $str;
Although Raku does have a S/// operator I'm using the String classes' .subst()
method to achieve the same thing.
$text = $text.subst(
The first argument is a regular expression search pattern as in the Perl version.
/
One annoying thing about Raku regular expressions is that some flags go before
the expression and some go within it and it isn't always clear to me which is which.
If you are using .subst() it's even worse; some flags go after as a separate parameter
as we shall see shortly.
:i
\b in Raku is <?wb>. I don't know why they had to make it more verbose like that.
Yet another pet peeve is that if you want to use a reference to a capture group such
as $0 within the same regular expression, you have to assign it to a variable first
and then reference that variable. Luckily I recalled this from PWC 328,
<?wb> (\w+) {} :my $m = $0;
This is the part that deals with separators.
(\s || [\s? '<' <-[>]>+ '>' \s?]+)
And this is the second instance of the doubled word.
($m)
/,
The substitution is done in the same way as Perl.
{ "\[$0]$1\[$2]" },
As I wrote earlier, some flags are defined as additional parameters at the end of
.subst(). There is no need for equivalents to the \x and \e flags as those
are default behaviors for Raku regular expressions.
:g
);
Printing matched lines only is done in the same way as in Perl but shortcut methods
like .lines() and .contains() plus left to right method chaining make it a lot
more readable in my opinion.
$text.lines.grep({ .contains(q{[}) }).join("\n").say;