Consider the following strings
"13:15"
"1:15pm"
"quarter past 1pm"
These may be just strings to a computer but to a human, we can see that they all represent a time. Furthermore, we see that there is some structure to these strings. So whereas “quarter past 1pm” makes sense as a time. “Foohoogablahblahl” does not.
Now our first instinct might be reach for regex to parse these strings and extract the data out of them, but I'd advise against it. We will solve this problem with parser combinators using the parser-ts library.
Let's begin with the first example "13:15" which is the time in 24 hour format. Upon closer examination, we see that it is comprised of three parts <hours> <:> <minutes> each with its own set of rules.
Taking a look at the <hours> component, we know that there are some rules. First, it must be a number and that number must be between 0 & 23. So to begin with, we reach for the int parser from the parser-ts/lib/string
module.
import { run } from "parser-ts/lib/code-frame";
import * as S from "parser-ts/lib/string";
console.log(run(S.int, "14")); //[1] Right 14
console.log(run(S.int, "asdf4")); //[2] Left "Expected: an integer"
console.log(run(S.int, "14aaaaa")); //[3] Right 14
Let's break this down. On line [1], we ran the parser on the string "14" and we got back a Right instance containing the number 14. On line [2], we see that if the int parser is unable to extract out a number, we error out with a Left instance containing a helpful error message. On line [3], we see that we succeed in extracting out the number 14 and we abandon the rest of the string.
Phew, that was a lot, however we aren't done yet because if we try to do
console.log(run(S.int, "29")); // Right 29
We succeed with the number 29 which isn't what we want because there isn't a 29th hour in 24 hour format. To solve this, we reach for our first combinator filter. Filter allows us to pass in a predicate so that we can filter out our invalid values. After this, we finally have our 24 hours parser.
const isValid24Hour = (hours: number) => hours >= 0 && hours <= 23;
const hours24Parser = P.expected(
pipe(S.int, P.filter(isValid24Hour)),
"24 hour clock hours must be between 0 & 23"
);
console.log(run(hours24Parser, "14")); // Right 14
console.log(run(hours24Parser, "32")); // Left "24 hour clock hours must be between 0 & 23"
Nice! Next we need our colon parser which matches just the ":" character.
const colonParser = S.string(":");
Finally, we apply a similar approach to the hours24Parser for our minutes parser
const isValidMinute = (minutes: number) => minutes >= 0 && minutes <= 59;
const minutesParser = P.expected(
pipe(S.int, P.filter(isValidMinute)),
"Minutes must be between 0 & 59"
);
At this stage, we have our hours24Parser
, our colonParser
and our minutesParser
. So the next challenge is to figure out a way to mash all three of these together. Now Parsers are monads which means that we can chain them together. To make life a bit simpler, we will be using the bind
and bindTo
convenience methods to store our extracted data.
const time24hoursParser = pipe(
hours24Parser,
P.bindTo("hours"), // store hours into an object with key 'hours'
P.chainFirst(() => colonParser), // ignore and don't store the colon
P.bind("minutes", () => minutesParser) //store minutes under the key 'minutes'
console.log(run(time24hoursParser, "00:24")); // Right { hours:0, minutes: 24}
);
Excellent! Let's give this hours, minutes data object a type and call it a TimeObj
type TimeObj = {
hours: number;
minutes: number;
};
Okay, let's take a breather now. Imagine you are an international amazing race participant and you have just woken up on a bus after three consecutive connecting flights. The bus is currently traveling in a tunnel and you have no idea what year it is let alone the month, day or time. Spying a fellow passenger with a wristwatch, you groggily come to and have the following conversation:
You: Hello fellow human. I appear to have regained consciousness after my intraplanetary travels. May I inquire the reading on your portal time tracking device?
Passenger: ooookay....
Passenger: <looks down at watch>
Passenger: uhhh, it is currently quarter past 1
You: May I ascertain the position of the sun relative to our position as well?
Passenger: huh? ohh
Passenger: quarter past 1...pm
So how do we parse the string "quarter past 1pm"? To do this, we will need to break down the structure of that string into a grammar.
<long-form-time> ::= <segment> <relative> <hours12><ampm>
<segment> ::= quarter | half
<relative> ::= past | to
<hours12> ::= 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12
<ampm> ::= am | pm
Within our grammar, we allow sentences like "quarter past 1pm", "quarter to 5pm" and "half past 12pm".
So let's get into it. The first thing to tackle would be the string "quarter" and "half".
const quarterParser = pipe(
S.string("quarter"),
P.map(() => 15)
);
const halfParser = pipe(
S.string("half"),
P.map(() => 30)
);
const segmentParser = P.either(quarterParser, () => halfParser);
console.log(run(segmentParser, "half")); // Right 30
We have a quarterParser to match on the string "quarter" and return back the number 15. We have a halfParser to match on the string "half" and return back the number 30. We then use the either combinator to try one and then the other if the first one fails.
Next up will be tackling the <relative> segment of our grammar. This one requires a bit of thought. When we see the word "past" in the sentence "quarter past 1pm", what does it represent?
It represents that the notion that we should add 15 minutes to whatever the hours component is. The type signature of this notion might look like:
(minutes: number) => (hours:number) => TimeObj
So let's get to it
const minutesPast = (minutes: number) => (hours24: number): TimeObj => ({
hours: hours24,
minutes
});
const pastParser = pipe(
S.string("past"),
P.map(() => minutesPast)
);
Let's break this down. First we define a function minutesPast
which takes in the minutes, then the hours in 24 hour format and returns back our 24 hour time object representation.
Next we define the pastParser
, this parser matches on the string "past" and carries within it the minutesPast function.
Similarly, we do the same for the toParser
const minutesTo = (minutes: number) => (hours24: number): TimeObj => {
if (hours24 === 0) {
return {
hours: 23,
minutes: 60 - minutes
};
} else {
return {
hours: hours24 - 1,
minutes: 60 - minutes
};
}
};
const toParser = pipe(
S.string("to"),
P.map(() => minutesTo)
);
We also need a way to parse our <hours12> segment and convert that into a 24 hour format
const isValid12Hour = (hours: number) => hours >= 1 && hours <= 12;
const num12Parser = P.expected(
pipe(S.int, P.filter(isValid12Hour)),
"12 hour clock hours must be between 1 & 12"
);
const amParser = pipe(
S.string("am"),
P.map(() => (hours12: number) => (hours12 === 12 ? 0 : hours12))
);
const pmParser = pipe(
S.string("pm"),
P.map(() => (hours12: number) => (hours12 === 12 ? 12 : hours12 + 12))
);
const amPmParser = P.either(amParser, () => pmParser);
const hours12Parser = P.seq(
num12Parser,
(hours12) =>
pipe(
amPmParser,
P.map((f) => f(hours12))
)
);
console.log(run(hours12Parser, "1am")) // Right 1
console.log(run(hours12Parser, "3pm")) // Right 15
console.log(run(hours12Parser, "14am")) // Left "12 hour clock hours must be between 1 & 12"
Let's break this down. Using a similar technique to the hours24Parser
earlier in this article, we create the num12Parer
to only allow numbers between 1 & 12.
The amParser
and pmParser
are Parsers that match on either am or pm and contains a function that converts 12 hour time to 24 hour time. We then join the amParser
and pmParser
together using the either
combinator.
The hours12Parser
takes the num12Parser
and combines it using seq which is a combinator that passes results from the first Parser to the next Parser if it succeeds. If it succeeds, we pass the parsed hours12 number to the amPmParser
, as that contains a function inside, we use P.map
to apply our hours12 number to the contained function getting back a number representing the hours in 24 hour time.
Phew! We now have all the parsers for our individual segments. Time to glue them altogether.
const longFormParser = pipe(
segmentParser, // [1]
P.chainFirst(() => S.spaces1), // [2]
P.chain((minutes) =>
pipe(
P.either(toParser, () => pastParser),
P.map((f) => f(minutes)) // [3]
)
),
P.chainFirst(() => S.spaces1), //ignore 1 or more spaces
P.chain((f) =>
pipe(
hours12Parser, // [4]
P.map((hours12) => f(hours12)) // [5]
)
)
);
console.log(run(longFormParser, "quarter to 12am")); //Right {hours:23, minutes:45}
console.log(run(longFormParser, "half past 12am")); //Right {hours:0, minutes: 30}
Let's talk it out. [1] We start with our segmentParser
to match on "quarter" or "half" and get back the number 15 or 30. [2] We then use chainFirst
with S.spaces1
to ignore 1 or more whitespace characters. [3] We then use either our toParser
or pastParser
to get our function of the shape minutes -> hours -> TimeObj
. Since we have our minutes already, we partially apply the minutes to our function [3]. After ignoring some more spaces, we use the hours12Parser
to extract out the number of hours in 24 hour time [4]. Finally, we have the last argument for our function from [3] and we have the time represented in our TimeObj
format.
With that, our alien friend now knows what time it is to help it win the amazing race!
To conclude, I leave you with a poem
A string is not only a string
Wherewith a data structure lies within
Regex don't you use
Use Parser combinators for the win!
~derp
Link to codesandbox:
https://codesandbox.io/s/time-parser-example-jwrwk3?file=/src/index.ts
Top comments (0)