Perl Weekly Challenge: Week 99

Challenge 1:

Pattern Match

You are given a string $S and a pattern $P.

Write a script to check if given pattern validate the entire string. Print 1 if pass otherwise 0.

The patterns can also have the following characters:

? - Match any single character.
* - Match any sequence of characters.

Example 1
Input: $S = "abcde" $P = "a*e"
Output: 1
Example 2
Input: $S = "abcde" $P = "a*d"
Output: 0
Example 3
Input: $S = "abcde" $P = "?b*d"
Output: 0
Example 4
Input: $S = "abcde" $P = "a*c?e"
Output: 1

This is a very limited subset of regular expressions so all that needs to be done is to convert it to the equivalant Perl regexp syntax.

$P =~ s/\*/.*/g;
$P =~ s/\?/./g;

Then the resulting regexp is applied to $S. It is placed in a capture group so that ...

$S =~ /($P)/;

... we can compare what got matched to $S (also checking that something got captured in the first place.) If the result and $S are equal, it means the entire string was matched so we print 1. Otherwise, we print 0.

say q{}, (defined $1 && $S eq $1) ? 1 : 0;

(Full code on Github.)

This is the Raku translation:

sub MAIN(
    Str $S, #= a string.
    Str $P  #= a pattern.
) {

Rakus ability to call methods and chain them together makes the code very compact. Note the use of <{ ... }> to interpolate code into the middle of a regexp.

    $S ~~ / ( <{ $P.subst('*', '.*', :g).subst('?', '.', :g) }> ) /;

    say ($0 && $S eq $0) ?? 1 !! 0;
}

(Full code on Github.)

Challenge 2:

Unique Subsequence

You are given two strings $S and $T.

Write a script to find out count of different unique subsequences matching $T without changing the position of characters.

Example 1
Input: $S = "littleit', $T = 'lit'
Output: 5

    1: [lit] tleit
    2: [li] t [t] leit
    3: [li] ttlei [t]
    4: litt [l] e [it]
    5: [l] ittle [it]
Example 2
Input: $S = "london', $T = 'lon'
Output: 3

    1: [lon] don
    2: [lo] ndo [n]
    3: [l] ond [on]

Like the previous problem, this one also involves creating a regexp. While I did solve it, I'm kind of uncomfortable with how I did it. I have a feeling that there must be a better way.

The first thing I did was to make an array of regex patterns which will represent subsequences. And I added a literal match of $T as the first one.

Then I split T into a set of subsequences with a 'gap' between the first group of letters and the second group where the gap was succesively replaced by a regexp representing the potential difference btween $S and the two groups.

So for example, if $S = 'london' which has 6 characters and $T = 'lon' which has 3, the maximum size of the gap can be 6 - 3 = 3 characters. so for each one of l,on, and lo,n (lon is already taken care of,) I insert .{1}, .{2}, and .{3} making a total of 2 * 3 = 6 extra patterns.

my @patterns = ( $T );
for my $i (0 .. (length $T) - 2) {
    for my $j (1 .. (length $S) - (length $T)) {
        my @t = split //, $T;
        $t[$i] .= ".{$j}";
        my $pattern = join q{}, @t;
        push @patterns, $pattern;
    }
}

Having completed our list of patterns we go through each one and count all the ones that match and finally, print the result.

my $count = 0;
for my $pattern (@patterns) {
    if ($S =~ /($pattern)/) {
        $count++;
    }
}

say $count;

(Full code on Github.)

This is what the algorithm above looks like translated to Raku.

sub MAIN (
    Str $S, #= a string.
    Str $T  #= a subsequence to be matched in <S>
) {
    my @patterns = ( $T );
    for 0 .. $T.chars - 2 -> $i {
        for 1 .. $S.chars - $T.chars -> $j {
            my @t = $T.comb;
            @t[$i] ~= " . ** {$j} ";
            @patterns.push( / <{ @t.join(q{}) }> / );
        }
    }

    my $count = 0;
    for @patterns -> $pattern {
        if $S ~~ / ( $pattern ) / {
            $count++;
        }
    }

    say $count;
}

(Full code on Github.)