Yes, I said instaparse. Here is how to actually use it.

Create a project and add instaparse to your dependencies:

[instaparse "1.4.8"]
 
and at the top of your file:
 
(ns example.core
  (:require [instaparse.core :as insta]))
 
Now we will take the previous example, the problem from day 4.
 
It gives us a series of keys like this:
 
aaaaa-bbb-z-y-x-123[abxyz] 
a-b-c-d-e-f-g-h-987[abcde] 
not-a-real-room-404[oarel] 
totally-real-room-200[decoy]
 
Which keys are the real rooms?
Each key is made up of an encrypted name
(lowercase letters separated by dashes)
followed by a dash, a sector ID, and a checksum in square brackets.
It is valid if the checksum is the five most common letters
in the encrypted name, in order, with ties broken by alphabetization.

  • aaaaa-bbb-z-y-x-123[abxyz] is a real room because the most common letters are a (5), b (3), and then a tie between x, y, and z, which are listed alphabetically.
  • a-b-c-d-e-f-g-h-987[abcde] is a real room because although the letters are all tied (1 of each), the first five are listed alphabetically.
  • not-a-real-room-404[oarel] is a real room.
  • totally-real-room-200[decoy] is not.
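
The checksum rule above can be sketched in a few lines of Clojure. This is only an illustration: the function name checksum is made up for this example, and it assumes the encrypted name is passed in with the dashes already removed.

```clojure
;; Hypothetical helper: computes the checksum a real room should carry.
(defn checksum [letters]
  (->> (frequencies letters)                     ; e.g. {\a 5, \b 3, \z 1, ...}
       (sort-by (fn [[ch n]] [(- n) (int ch)]))  ; most frequent first, ties alphabetical
       (take 5)
       (map first)
       (apply str)))

(checksum "aaaaabbbzyx")      ;=> "abxyz"
(checksum "notarealroom")     ;=> "oarel"
(checksum "totallyrealroom")  ;=> "loart", which does not match "decoy"
```

A room is real when this value equals the checksum in its square brackets.
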
Let's define a variable called "lines" to be fed to our parser:

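;; The continuation lines start at column 0 on purpose:
;; any indentation would become part of the string.
(def lines "aaaaa-bbb-z-y-x-123[abxyz]
a-b-c-d-e-f-g-h-987[abcde]
not-a-real-room-404[oarel]")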

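Here is the parse function we used previously:

;; parse-room refers to clojure.string through the alias "string"
(require '[clojure.string :as string])

(defn parse-room [s]
  (let [parts (string/split s #"-")                   ; pieces between dashes
        [id chk] (string/split (last parts) #"\[")]   ; "123[abxyz]" -> "123" and "abxyz]"
    {:word  (apply concat (butlast parts))             ; letters of the encrypted name
     :chksum (butlast chk)                             ; drop the trailing "]"
     :id (Integer/parseInt id)}))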

It relies on clojure.string/split, which works like this (these REPL examples use the alias str rather than string):

user=> (require '[clojure.string :as str])

user=> (str/split "Split me up" #" ")
["Split" "me" "up"]

user=> (str/split "q1w2e3r4t5y6u7i8o9p0" #"\d+")
["q" "w" "e" "r" "t" "y" "u" "i" "o" "p"]

; Note that the 'limit' arg is the maximum number of strings to
; return (not the number of splits)
user=> (str/split "q1w2e3r4t5y6u7i8o9p0" #"\d+" 5)
["q" "w" "e" "r" "t5y6u7i8o9p0"]



So if we run the function:

(parse-room lines)
 
We get the following map: 
{:word (\a \a \a \a \a \b \b \b \z \y \x \1 \2 \3 \[ \a \b \x \y \z \] \newline
\a \b \c \d \e \f \g \h \9 \8 \7 \[ \a \b \c \d \e \] \newline
\n \o \t \a \r \e \a \l \r \o \o \m),
:chksum (\o \a \r \e \l),
:id 404}
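
parse-room expects a single key, so a multi-line string has to be split into lines first. Here is a minimal sketch of calling it once per line. It repeats the lines and parse-room definitions from above so that it runs on its own, and it assumes clojure.string is aliased as string. The name rooms is made up for this example.

```clojure
(require '[clojure.string :as string])

(def lines "aaaaa-bbb-z-y-x-123[abxyz]
a-b-c-d-e-f-g-h-987[abcde]
not-a-real-room-404[oarel]")

(defn parse-room [s]
  (let [parts (string/split s #"-")
        [id chk] (string/split (last parts) #"\[")]
    {:word  (apply concat (butlast parts))
     :chksum (butlast chk)
     :id (Integer/parseInt id)}))

;; split-lines breaks the input on newlines, so each call sees exactly one key
(def rooms (map parse-room (string/split-lines lines)))

(map :id rooms)          ;=> (123 987 404)
(:chksum (first rooms))  ;=> (\a \b \x \y \z)
```
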

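In the map above, the whole string was treated as one key, so :word swallowed the first two keys entirely. To see why, look at the first step of parse-room on its own: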

(string/split lines #"-")

;=> ["aaaaa" "bbb" "z" "y" "x" "123[abxyz]\na" "b" "c" "d" "e" "f" "g" "h"
;    "987[abcde]\nnot" "a" "real" "room" "404[oarel]"]

Inside parse-room, the let form binds the name "parts" to this vector.
The second binding then takes the last element and splits it on the opening bracket:

(string/split (last (string/split lines #"-")) #"\[")
;=> ["404" "oarel]"]

This result is destructured by the binding form

[id chk]

which binds id to "404" and chk to "oarel]". The function then returns the map

{:word  (apply concat (butlast parts))
 :chksum (butlast chk)
 :id (Integer/parseInt id)}

Example key:

aaaaa-bbb-z-y-x-123[abxyz]

We want to parse this into a map of 3 keys:

:parts - ["aaaaa" "bbb" "z" "y" "x"]
:id - 123
:chksum - "abxyz"
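
To make the target shape concrete, here is a minimal sketch that builds this map with a single regular expression rather than a grammar. The name parse-key and the regex are assumptions made for this illustration, not code from the earlier post.

```clojure
(require '[clojure.string :as string])

;; Hypothetical helper: one key in, one map out.
(defn parse-key [s]
  (let [[_ enc-name id chksum]
        (re-matches #"([a-z-]+)-(\d+)\[([a-z]+)\]" s)]
    {:parts  (string/split enc-name #"-")  ; encrypted name, split on dashes
     :id     (Integer/parseInt id)         ; sector ID as a number
     :chksum chksum}))                     ; checksum without the brackets

(parse-key "aaaaa-bbb-z-y-x-123[abxyz]")
;=> {:parts ["aaaaa" "bbb" "z" "y" "x"], :id 123, :chksum "abxyz"}
```
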

Defining context-free grammars:

Category         Notations                            Example
Rule             : := ::= =                           S = A
End of rule      ; . (optional)                       S = A;
Alternation      |                                    A | B
Concatenation    whitespace or ,                      A B
Grouping         ()                                   (A | B) C
Optional         ? []                                 A? [A]
One or more      +                                    A+
Zero or more     * {}                                 A* {A}
String terminal  "" ''                                'a' "a"
Regex terminal   #"" #''                              #'a' #"a"
Epsilon          Epsilon epsilon EPSILON eps ε "" ''  S = 'a' S | Epsilon
Comment          (* *)                                (* This is a comment *)
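
As a rough sketch of how this notation could be applied to a room key, here is one possible grammar. The rule names (room, parts, word, id, chksum) and the helper room->map are assumptions made for this example, not necessarily where this series ends up. Angle brackets hide the dashes and square brackets from the parse tree, and insta/transform turns the tree into the map we are after.

```clojure
(require '[instaparse.core :as insta])

;; One possible grammar for a single key such as "aaaaa-bbb-z-y-x-123[abxyz]"
(def room-parser
  (insta/parser
    "room   = parts <'-'> id <'['> chksum <']'>
     parts  = word (<'-'> word)*
     word   = #'[a-z]+'
     id     = #'[0-9]+'
     chksum = #'[a-z]+'"))

;; Hypothetical helper: parse one key and reshape the tree bottom-up.
(defn room->map [s]
  (insta/transform
    {:word   identity                                  ; [:word "bbb"] -> "bbb"
     :parts  vector                                    ; collect the words
     :id     (fn [digits] (Integer/parseInt digits))   ; "123" -> 123
     :chksum identity
     :room   (fn [parts id chksum]
               {:parts parts :id id :chksum chksum})}
    (room-parser s)))

(room->map "aaaaa-bbb-z-y-x-123[abxyz]")
;=> {:parts ["aaaaa" "bbb" "z" "y" "x"], :id 123, :chksum "abxyz"}
```
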

We'll keep working on this. Check out the next problem!
