Perl Weekly Challenge: Week 35

Challenge 1:

Write a program to encode text into binary encoded morse code.

Pay attention to any changes which might need to be made to the text to make it valid morse code.

Morse code consists of dots, dashes and gaps. It can be encoded in binary in the following fashion:
  dot: 1
  dash: 111
  intra-character gap: 0
  character gap: 000
  word gap: 0000000
  An intra-character gap is inserted between the dots and dashes in a character.

This week's challenges made me reminisce. When I was about 12 years old I made a simple electronic "clacker" for a school science fair. I didn't win, but I learned a lot about Morse code, navigation, telegraphs and so on, some of which I can even remember today.

Solving the challenge is just a matter of translating input characters into sequences of binary digits. This can be done most efficiently with a lookup table.

use constant CHARACTER_GAP => '000';
use constant WORD_GAP => '0000000';

my %to_morse = (
    'A' => '10111',
    'B' => '111010101',
    'C' => '11101011101',
    'D' => '1110101',
    'E' => '1',
    'F' => '101011101',
    'G' => '111011101',
    'H' => '1010101',
    'I' => '101',
    'J' => '1011101110111',
    'K' => '111010111',
    'L' => '101110101',
    'M' => '1110111',
    'N' => '11101',
    'O' => '11101110111',
    'P' => '10111011101',
    'Q' => '1110111010111',
    'R' => '1011101',
    'S' => '10101',
    'T' => '111',
    'U' => '1010111',
    'V' => '101010111',
    'W' => '101110111',
    'X' => '11101010111',
    'Y' => '1110101110111',
    'Z' => '11101110101',
    '1' => '10111011101110111',
    '2' => '101011101110111',
    '3' => '1010101110111',
    '4' => '10101010111',
    '5' => '101010101',
    '6' => '11101010101',
    '7' => '1110111010101',
    '8' => '111011101110101',
    '9' => '11101110111011101',
    '0' => '1110111011101110111',
);
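The values in this table follow mechanically from the usual dot/dash notation, so they can be checked (or generated) with a few lines of code. This is a minimal sketch, not part of my solution; the helper name binary_from_dots_and_dashes is made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: convert dot/dash notation such as '.-' into the
# binary form used for the values of %to_morse.
sub binary_from_dots_and_dashes {
    my ($symbols) = @_;
    my %element = ('.' => '1', '-' => '111');

    # A single 0 (the intra-character gap) separates adjacent elements.
    return join '0', map { $element{$_} } split //, $symbols;
}

print binary_from_dots_and_dashes('.-'),   "\n";   # A is 10111
print binary_from_dots_and_dashes('-.-.'), "\n";   # C is 11101011101
```
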

The actual code that uses this table looks like this:

sub morse_encode {
    my ($message) = @_;

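    my @words = split /\W+/, $message;

First split the message into words. \W matches any single non-word character (the word-boundary assertion is \b), so splitting on \W+ treats each run of spaces and punctuation as one separator.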

    for my $word (@words) {
        my @chars = split q{}, $word;
        for my $c (@chars) {
            if (exists $to_morse{uc $c}) {
                $c = $to_morse{uc $c};
            }
        }

Then each word is split into characters, and every character found in our table (lower-case characters are upper-cased first) is replaced with its binary representation. Characters that are not in the table are left unchanged. I haven't implemented them, but apparently there are a few extensions to International Morse Code for common punctuation. I shudder to think how it could have dealt with Unicode. Luckily there is better tech for international communications now.

        $word = join CHARACTER_GAP, @chars;
    }

Then the characters are joined back into words with the character gap sequence between each one.

    return join WORD_GAP, @words;
}

(Full code on Github.)

Finally, the words are joined back into the message with the word gap sequence between each one, and the result is returned to the caller. A string of '0' and '1' characters is a very inefficient representation, since it spends a whole byte on every bit. I really ought to pack it into a bit string, which you can do in Perl with pack and the 'b*' (ascending bit order within each byte) or 'B*' (descending bit order) template. But the spec didn't mention it, so I didn't do it.
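For the curious, here is roughly what the packing could look like. It is only a sketch under my own assumptions: pack_bits and unpack_bits are hypothetical helpers, and because pack pads the last byte with zero bits, the sketch keeps the original bit count alongside the packed data.

```perl
use strict;
use warnings;

# Hypothetical helper: pack a string of '0'/'1' characters into bytes.
# Returns the bit count as well, because pack pads the final byte with
# zero bits and the true length would otherwise be lost.
sub pack_bits {
    my ($bits) = @_;
    return (length($bits), pack('B*', $bits));
}

# Hypothetical helper: the inverse of pack_bits.
sub unpack_bits {
    my ($count, $packed) = @_;
    return substr(unpack('B*', $packed), 0, $count);
}

my $bits = '10111' . '000' . '111010101';   # A, character gap, B
my ($count, $packed) = pack_bits($bits);

printf "%d bits in %d bytes\n", $count, length $packed;   # 17 bits in 3 bytes
print unpack_bits($count, $packed), "\n";
```
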

I won't repeat the lookup table for the Raku version. The encoding code is also very similar to my Perl version modulo language syntax differences.

sub morse_encode(Str $message) {

    my @words = split /\W+/, $message;

    for @words <-> $word {
        my @chars = $word.comb;
        for @chars <-> $c {
            if %to_morse{uc $c}:exists {
                $c = %to_morse{uc $c};
            }
        }
        $word = @chars.join($CHARACTER_GAP);
    }

    return @words.join($WORD_GAP);
}

(Full code on Github.)

One problem I ran into was that the loop variable you declare with -> in a for loop is read-only. Most of the time that's what you want, but in this case I'm modifying the elements of @words and @chars as I go, so that won't do. I learned that you can make the loop variable read-write by using <-> instead.
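For comparison, the Perl version needs no special syntax because a foreach loop variable is an alias for the current element rather than a copy. A tiny standalone illustration (the array contents are just an example):

```perl
use strict;
use warnings;

my @chars = ('s', 'o', 's');

# $c is an alias for each element, not a copy, so assigning to it
# changes @chars itself.
for my $c (@chars) {
    $c = uc $c;
}

print "@chars\n";   # S O S
```
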

Challenge 2:

Write a program to decode binary morse code.

Consider how it might be possible to recover from badly formed morse code.

a) by splitting the morse code on gaps

b) without looking further than one digit ahead

The second challenge is the reverse of the first. So we start by reversing the keys and values of the lookup table.

my %from_morse = (
    '10111'               => 'A',
    '111010101'           => 'B',
    '11101011101'         => 'C',
    '1110101'             => 'D',
    '1'                   => 'E',
    '101011101'           => 'F',
    '111011101'           => 'G',
    '1010101'             => 'H',
    '101'                 => 'I',
    '1011101110111'       => 'J',
    '111010111'           => 'K',
    '101110101'           => 'L',
    '1110111'             => 'M',
    '11101'               => 'N',
    '11101110111'         => 'O',
    '10111011101'         => 'P',
    '1110111010111'       => 'Q',
    '1011101'             => 'R',
    '10101'               => 'S',
    '111'                 => 'T',
    '1010111'             => 'U',
    '101010111'           => 'V',
    '101110111'           => 'W',
    '11101010111'         => 'X',
    '1110101110111'       => 'Y',
    '11101110101'         => 'Z',
    '10111011101110111'   => '1',
    '101011101110111'     => '2',
    '1010101110111'       => '3',
    '10101010111'         => '4',
    '101010101'           => '5',
    '11101010101'         => '6',
    '1110111010101'       => '7',
    '111011101110101'     => '8',
    '11101110111011101'   => '9',
    '1110111011101110111' => '0',
);
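I typed the table out again, but it could also be generated from %to_morse. A minimal sketch with an abbreviated table, assuming (as is true for Morse code) that no two characters share the same code:

```perl
use strict;
use warnings;

# Abbreviated table for illustration only.
my %to_morse = (
    'E' => '1',
    'T' => '111',
    'A' => '10111',
);

# In list context a hash flattens to (key, value, key, value, ...).
# reverse turns that into (value, key, ...), which builds the inverse
# hash. This is only safe because every code in the table is unique.
my %from_morse = reverse %to_morse;

print "$from_morse{'10111'}\n";   # A
```
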

sub morse_decode {
    my ($message) = @_;

    my @words = split WORD_GAP, $message;

This time we split the message into words with the word gap sequence.

    for my $word (@words) {
        my @chars = split CHARACTER_GAP, $word;
        for my $c (@chars) {
            if (exists $from_morse{$c}) {
                $c = $from_morse{$c};
            }
        }

Then each word is split into characters with the character gap sequence. They are still binary strings at this point. We try to find each binary string in our lookup table and, if it is present, replace it with its translation.

        $word = join q{}, @chars;
    }

    return join q{ }, @words;
}

(Full code on Github.)

And then the characters are joined into words and the words back into a complete message just like in challenge one.
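A quick way to convince yourself that the two halves agree is a round trip. The sketch below is self-contained, so it uses cut-down tables (only S and O) and compact stand-ins named encode and decode rather than the full morse_encode and morse_decode above; those names are mine, for illustration only.

```perl
use strict;
use warnings;

use constant CHARACTER_GAP => '000';
use constant WORD_GAP      => '0000000';   # seven zeros, per the challenge text

# Abbreviated tables for illustration; only S and O are defined here.
my %to_morse   = ('S' => '10101', 'O' => '11101110111');
my %from_morse = reverse %to_morse;

# Compact stand-in for morse_encode.
sub encode {
    my ($message) = @_;
    my @words;
    for my $word (split /\W+/, $message) {
        push @words, join CHARACTER_GAP,
            map { exists $to_morse{uc $_} ? $to_morse{uc $_} : $_ }
            split //, $word;
    }
    return join WORD_GAP, @words;
}

# Compact stand-in for morse_decode.
sub decode {
    my ($bits) = @_;
    my @words;
    for my $word (split WORD_GAP, $bits) {
        push @words, join q{},
            map { exists $from_morse{$_} ? $from_morse{$_} : $_ }
            split CHARACTER_GAP, $word;
    }
    return join q{ }, @words;
}

my $bits = encode('sos sos');
print "$bits\n";
print decode($bits), "\n";   # SOS SOS
```

Note that the decoded text comes back in upper case, since Morse code has no notion of letter case.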

The Raku version is also just a translation from Perl as in challenge one.

sub morse_decode(Str $message) {

    my @words = $message.split($WORD_GAP);

    for @words <-> $word {
        my @chars = $word.split($CHARACTER_GAP);
        for @chars <-> $c {
            if %from_morse{$c}:exists {
                $c = %from_morse{$c};
            }
        }
        $word = @chars.join;
    }

    return @words.join(q{ });
}

(Full code on Github.)