DEV Community

SEN LLC

RFC 5545 is a 200-Page Spec. Your iCalendar Feed Needs About 150 Lines.

A PHP 8.2 + Slim 4 microservice that takes JSON events and returns valid .ics feeds. The point wasn't to replace sabre/vobject; it was to understand what iCalendar actually is by building the 90% subset from scratch.

Sooner or later someone on your team asks for a calendar feed. Deploy windows. Course schedules. A recurring ops meeting nobody wants to remember to re-book. Maintenance windows the support team should see in their client. The ask is always small and the solution is always the same: generate an .ics file, drop it behind a URL, tell people to subscribe.

So you write some code that concatenates strings, skim RFC 5545, and ship it. Three months later someone's Outlook rejects the feed because a line is 78 bytes instead of 75, or Apple Calendar silently drops every event because you used LF instead of CRLF, or a recurring meeting's title contains an unescaped comma, gets split into two fields, and now the summary is "Standup" and the rest of the title is lost.

I've been in that room. This article is the one I wish I'd had.

🔗 GitHub: https://github.com/sen-ltd/ical-api

[Screenshot: a curl request producing iCalendar output, with CRLF line endings visible in xxd]

The repo is a small Slim 4 service with four routes:

  • POST /ical: JSON body, returns text/calendar
  • GET /ical?url=<json-source>: fetch JSON from a URL and convert
  • POST /validate: accepts .ics text, returns structural errors
  • GET /health: the usual

No sabre/vobject. No eluceo/ical. The iCal builder is written by hand in ~150 lines of PHP. That's deliberate. If you're going to own a feed generator you should understand the rules it's applying.

What RFC 5545 actually requires for the 90% case

The spec is intimidating. Two hundred pages, VTIMEZONE sub-grammar, BYDAY inside RRULE with its own mini-language of -1SU and +2MO, attachment encoding, attendees with participation states. Most of it is noise for the feed you're going to ship.

Strip it down to the minimum a subscribing client will accept, and you get this checklist:

  1. CRLF line endings. Every line, including the last one.
  2. Line folding at 75 octets. Not characters. Octets.
  3. Text escaping for \, ;, ,, and newlines in SUMMARY, DESCRIPTION, LOCATION.
  4. Envelope: BEGIN:VCALENDAR, VERSION:2.0, PRODID, END:VCALENDAR.
  5. Events need UID, DTSTAMP, DTSTART, DTEND.
  6. A working RRULE if you want recurrence.

That's it. Six rules. Miss any of them and some consumer somewhere will refuse to load your feed, and the error messages are terrible.

Rule 1: CRLF or nothing

RFC 5545 §3.1 is unambiguous:

Content lines are delimited by a line break, which is a CRLF sequence (CR character followed by LF character).

There is no SHOULD to hide behind here: the grammar ends every content line with CRLF. Real consumers differ only in how they punish a violation. Apple Calendar will load a LF-only feed and silently drop events. Outlook will reject it outright. Google Calendar is lenient but inconsistent.

In PHP the trap is "\n". Every concatenation you do with "\n" instead of "\r\n" is a bug you won't see until you're looking at xxd output. My builder assembles with \n internally for readability and then runs the whole document through a folder that re-joins with \r\n. One place to get it right, one place that can be wrong.
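A minimal sketch of that "one place" idea, assuming the builder hands over an array of unfolded lines. The name renderDocument is hypothetical, not the repo's, and the fold callback stands in for whatever folder you use.

```php
<?php
// Hypothetical helper: an illustration of the shape, not the repo's code.
// $lines holds unfolded content lines without any line terminator.
function renderDocument(array $lines, callable $fold): string
{
    $out = '';
    foreach ($lines as $line) {
        // The only place in the program where a line terminator is written.
        $out .= $fold($line) . "\r\n";
    }
    return $out;
}

// Usage with an identity "folder", since every line here is short.
$ics = renderDocument(
    ['BEGIN:VCALENDAR', 'VERSION:2.0', 'END:VCALENDAR'],
    fn (string $line): string => $line
);
```

Because the terminator is appended per line, the last line gets its CRLF too, which rule 1 requires.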

Verification is cheap:

curl -sS -X POST http://localhost:8000/ical -d @payload.json | xxd | head
00000000: 4245 4749 4e3a 5643 414c 454e 4441 520d  BEGIN:VCALENDAR.
00000010: 0a56 4552 5349 4f4e 3a32 2e30 0d0a 5052  .VERSION:2.0..PR

0d 0a at every line break. If you see 0a without a preceding 0d, your feed is broken regardless of how pretty the text preview looks.
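If you would rather have a test do the squinting, the same check can be sketched in PHP. The function name hasBareLf is an assumption for illustration; it only looks for an LF without a preceding CR and does not catch a stray CR on its own.

```php
<?php
// Hypothetical helper: true when the text contains an LF not preceded by CR.
function hasBareLf(string $ics): bool
{
    // Negative lookbehind: match "\n" only if the previous byte is not "\r".
    return preg_match('/(?<!\r)\n/', $ics) === 1;
}

$broken = hasBareLf("BEGIN:VCALENDAR\nVERSION:2.0\n");      // LF-only feed
$fine   = hasBareLf("BEGIN:VCALENDAR\r\nVERSION:2.0\r\n");  // CRLF feed
```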

Rule 2: line folding at 75 octets

This is the rule most implementations get subtly wrong. The spec:

Lines of text SHOULD NOT be longer than 75 octets, excluding the line break. Long content lines SHOULD be split into a multiple line representations using a line "folding" technique.

"Octets." Not characters. Not "the result of strlen on a string you think is ASCII." When the SUMMARY contains "会議室 A のミーティング" (a totally normal Japanese title), every kanji and kana is 3 bytes in UTF-8. A folder that measures with mb_strlen counts characters, so it treats 25 Japanese characters as 25 units when they actually occupy 75 octets. It will pass a line it believes is under the limit that is really 78 octets or more, and some strict parsers then reject the whole event.

Worse: if you split at an arbitrary byte offset you can cut a multi-byte codepoint in half. The continuation line starts with a leading space (that's how folding works) and now contains an invalid UTF-8 sequence. Some consumers ignore it. Others throw.
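Both failure modes are easy to see in a few lines. This is a small illustration, assuming the mbstring extension is loaded and the source file is saved as UTF-8.

```php
<?php
$summary = 'SUMMARY:会議室 A のミーティング';

$chars  = mb_strlen($summary, 'UTF-8'); // 21 characters
$octets = strlen($summary);             // 41 octets: 11 ASCII + 10 characters of 3 bytes each

// Cutting at an arbitrary byte offset can split a codepoint.
$half    = substr('あ', 0, 2);                      // E3 81, missing the final 82
$isValid = mb_check_encoding($half, 'UTF-8');       // false
```

The 75-octet budget has to be measured with strlen, and the cut point has to be moved back to a codepoint boundary.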

Here's the folder I ended up with:

public static function foldLine(string $line): string
{
    $len = strlen($line);            // strlen counts octets, which is what the spec limits
    if ($len <= 75) return $line;

    $out = '';
    $pos = 0;
    $first = true;

    while ($pos < $len) {
        $budget = $first ? 75 : 74;  // leading SPACE costs 1 octet
        $end    = min($len, $pos + $budget);

        // Back off if we landed inside a UTF-8 continuation byte (10xxxxxx).
        if ($end < $len) {
            while ($end > $pos && (ord($line[$end]) & 0xC0) === 0x80) {
                $end--;
            }
            // Malformed input (a full budget of continuation bytes) would
            // leave $end === $pos and stall the loop; split at the budget.
            if ($end === $pos) {
                $end = $pos + $budget;
            }
        }

        $chunk = substr($line, $pos, $end - $pos);
        $out  .= ($first ? '' : "\r\n ") . $chunk;
        $pos   = $end;
        $first = false;
    }

    return $out;
}

Two things worth pointing out:

  • The budget drops from 75 on the first line to 74 on continuations, because the leading SPACE that introduces a folded continuation is itself an octet that counts toward the 75-octet budget of that line.
  • The UTF-8 back-off uses the high-bit pattern. Continuation bytes in UTF-8 are 10xxxxxx, so we walk backwards until we hit a leading byte. In the worst case we walk back 3 bytes (for a 4-byte codepoint).

My test for this is uncompromising:

public function testMultiByteCharactersAreNotSplitMidCodepoint(): void
{
    // 40 Japanese characters × 3 bytes = 120 bytes, across two segments.
    $line = str_repeat('あ', 40);
    $folded = ICalFolder::foldLine($line);
    foreach (explode("\r\n", $folded) as $i => $seg) {
        $content = $i === 0 ? $seg : substr($seg, 1);
        $this->assertTrue(mb_check_encoding($content, 'UTF-8'));
    }
}

Every segment must survive mb_check_encoding. If you split inside the E3 81 82 that encodes あ, the first segment ends with E3 or E3 81, which is broken UTF-8, and the test fails loudly.
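Two more properties are worth checking alongside encoding: no physical line exceeds 75 octets, and unfolding gives back the original line. The sketch below restates the folding loop in compact form so it runs on its own; unfoldLine is a hypothetical helper, not part of the repo.

```php
<?php
// Compact restatement of the folding loop from this article.
function foldLine(string $line): string
{
    $len = strlen($line);
    $out = '';
    $pos = 0;
    while ($pos < $len) {
        $budget = $pos === 0 ? 75 : 74;          // continuation lines spend 1 octet on SPACE
        $end    = min($len, $pos + $budget);
        while ($end < $len && $end > $pos && (ord($line[$end]) & 0xC0) === 0x80) {
            $end--;                              // never cut inside a codepoint
        }
        if ($end === $pos) {
            $end = min($len, $pos + $budget);    // malformed input: still make progress
        }
        $out .= ($pos === 0 ? '' : "\r\n ") . substr($line, $pos, $end - $pos);
        $pos  = $end;
    }
    return $out;
}

// Hypothetical inverse: unfolding removes CRLF followed by one SPACE or HTAB.
function unfoldLine(string $folded): string
{
    return str_replace(["\r\n ", "\r\n\t"], '', $folded);
}

$line      = 'SUMMARY:' . str_repeat('あ', 40);   // 8 + 120 = 128 octets
$folded    = foldLine($line);
$segments  = explode("\r\n", $folded);
$maxOctets = max(array_map('strlen', $segments)); // longest physical line, SPACE included
$roundTrip = unfoldLine($folded) === $line;
```

Measuring each segment with strlen, leading SPACE included, is what verifies the 74-octet continuation budget.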

Rule 3: text escaping

RFC 5545 §3.3.11 defines TEXT values:

Input      Output
\          \\
;          \;
,          \,
newline    \n (literal backslash + n)

The critical thing is the order of operations. If you escape semicolons first and backslashes second, you double-escape everything because the backslashes you introduced in step 1 get mangled in step 2. Backslash first, always:

public static function text(string $value): string
{
    $value = str_replace(["\r\n", "\r"], "\n", $value); // normalize newlines
    $value = str_replace('\\', '\\\\', $value);         // backslash FIRST
    $value = str_replace(';',  '\\;',  $value);
    $value = str_replace(',',  '\\,',  $value);
    $value = str_replace("\n", '\\n',  $value);         // newline LAST
    return $value;
}

A test that nails down the ordering:

public function testEscapingOrderDoesNotDoubleEscape(): void
{
    $out = ICalEscaper::text("\\,");
    $this->assertSame('\\\\\\,', $out);
}

Input is one backslash + one comma. Expected output is two backslashes (the escaped backslash) + one backslash + one comma (the escaped comma). Four characters total. If you got five you escaped the backslash twice, which means you escaped the comma first, added a backslash, then escaped the backslashes.
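To see the five-character failure directly, here is a small side-by-side. Both function names are made up for the illustration; the second mirrors the ordering of the escaper above, reduced to the two replacements that matter.

```php
<?php
// Deliberately WRONG order: comma first, backslash second.
function escapeWrongOrder(string $value): string
{
    $value = str_replace(',', '\\,', $value);    // adds a backslash...
    $value = str_replace('\\', '\\\\', $value);  // ...which then gets doubled
    return $value;
}

// Correct order: backslash first, then comma.
function escapeRightOrder(string $value): string
{
    $value = str_replace('\\', '\\\\', $value);
    $value = str_replace(',', '\\,', $value);
    return $value;
}

$input = '\\,';                      // one backslash + one comma
$wrong = escapeWrongOrder($input);   // 5 characters: four backslashes and a bare comma
$right = escapeRightOrder($input);   // 4 characters: escaped backslash, escaped comma
```

In the wrong output the comma ends up preceded by an escaped backslash rather than an escaping one, so a parser reads it as a list separator again.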

Rules 4 and 5: the envelope and the VEVENT essentials

These are straightforward but easy to miss because most examples online only show VCALENDAR or only show VEVENT, never the combination. Every event needs:

  • UID: globally unique, stable across updates. If you don't pass one, my validator auto-generates <random>-<timestamp>@ical-api. The consumer deduplicates by UID, so "stable across updates" matters if you're going to change a meeting: reuse the same UID or your subscribers will see both.
  • DTSTAMP: when the .ics was generated, in UTC. Required by the spec. Different from DTSTART.
  • DTSTART: when the event begins.
  • DTEND: when it ends. Must be strictly after DTSTART. My service returns 422 if you pass end <= start, because this is exactly the kind of bug that gets shipped.
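The three mechanical pieces in that list can be sketched as follows. The helper names, the 8-byte random part, and the use of a Unix timestamp are assumptions; the article only specifies the <random>-<timestamp>@ical-api shape.

```php
<?php
// Hypothetical helpers; names and exact formats are illustrative assumptions.

// Fallback UID in the <random>-<timestamp>@ical-api shape.
function generateUid(): string
{
    return bin2hex(random_bytes(8)) . '-' . time() . '@ical-api';
}

// DTSTAMP value: the current instant in UTC, iCal basic form.
function dtstampNow(): string
{
    return gmdate('Ymd\THis\Z');
}

// True only when the end instant is strictly after the start instant.
function endIsAfterStart(string $start, string $end): bool
{
    return new DateTimeImmutable($end) > new DateTimeImmutable($start);
}

$uid   = generateUid();
$stamp = dtstampNow();
$ok    = endIsAfterStart('2026-04-20T09:00:00Z', '2026-04-20T09:15:00Z');
```

A generated UID changes on every request, so it is only a fallback. For an event you expect to edit later, pass your own UID and keep it.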

Here's what the builder emits for one event:

public static function buildEvent(array $ev, string $dtstamp): array
{
    $lines = [];
    $lines[] = 'BEGIN:VEVENT';
    $lines[] = 'UID:' . $ev['uid'];
    $lines[] = 'DTSTAMP:' . $dtstamp;
    $lines[] = 'DTSTART' . self::datetimeSuffix($ev['start']);
    $lines[] = 'DTEND'   . self::datetimeSuffix($ev['end']);
    $lines[] = 'SUMMARY:' . ICalEscaper::text($ev['title']);
    if (!empty($ev['location']))    $lines[] = 'LOCATION:' . ICalEscaper::text($ev['location']);
    if (!empty($ev['description'])) $lines[] = 'DESCRIPTION:' . ICalEscaper::text($ev['description']);
    if (!empty($ev['url']))         $lines[] = 'URL:' . $ev['url'];
    if (!empty($ev['recurrence']))  $lines[] = 'RRULE:' . self::rrule($ev['recurrence']);
    $lines[] = 'END:VEVENT';
    return $lines;
}

Every free-text field (SUMMARY, LOCATION, DESCRIPTION) goes through ICalEscaper::text. URL does not, because URL is a URI value type, not a TEXT value type, and URIs have their own escaping rules (percent encoding) that live at a layer below iCal.
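That builder covers rule 5. For rule 4, the envelope, a sketch might look like this. The function name and the PRODID value are placeholders, not the repo's; METHOD:PUBLISH is included because the service emits it.

```php
<?php
// Hypothetical envelope builder: wraps already-built VEVENT lines.
// The default PRODID is a placeholder, not the repo's actual identifier.
function buildCalendar(array $eventLines, string $prodId = '-//example//ical-sketch//EN'): array
{
    return array_merge(
        ['BEGIN:VCALENDAR', 'VERSION:2.0', 'PRODID:' . $prodId, 'METHOD:PUBLISH'],
        $eventLines,
        ['END:VCALENDAR']
    );
}

$lines = buildCalendar(['BEGIN:VEVENT', 'UID:demo@example', 'END:VEVENT']);
```

The result is still an array of unfolded lines; folding and CRLF joining happen afterwards, in one place.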

Rule 6: the RRULE 90% case

RRULE is a language of its own. The full grammar supports BYDAY=TU,TH, BYSETPOS=-1, BYMONTHDAY, BYYEARDAY, WKST, and combinations thereof. The vast majority of real feeds need only three frequencies, an optional interval, and a terminator:

  • FREQ=DAILY (every day)
  • FREQ=WEEKLY (every week on the same weekday)
  • FREQ=MONTHLY (every month on the same day of month)
  • Optionally INTERVAL=2 (every other)
  • Terminated by COUNT=N or UNTIL=20261231T235959Z (never both)

That's the subset I accept in the JSON API:

{"recurrence": {"freq": "WEEKLY", "interval": 2, "count": 10}}

becomes

RRULE:FREQ=WEEKLY;INTERVAL=2;COUNT=10

If you need BYDAY or BYSETPOS (the "last Friday of every month" pattern), you need a real iCal library. I could extend the validator and builder to handle it, but then I'd be shipping a worse sabre/vobject and the article wouldn't make sense.
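The accepted subset is small enough to sketch in full. This is an illustration under assumptions, not the repo's rrule() method: the name buildRrule and the error messages are mine, and the JSON is assumed to be decoded to an array already.

```php
<?php
// Hypothetical sketch of the accepted RRULE subset.
function buildRrule(array $r): string
{
    $freq = strtoupper((string) ($r['freq'] ?? ''));
    if (!in_array($freq, ['DAILY', 'WEEKLY', 'MONTHLY'], true)) {
        throw new InvalidArgumentException('freq must be DAILY, WEEKLY or MONTHLY');
    }
    if (isset($r['count'], $r['until'])) {
        throw new InvalidArgumentException('use count or until, never both');
    }

    $parts = ['FREQ=' . $freq];
    if (isset($r['interval'])) {
        $interval = (int) $r['interval'];
        if ($interval < 1) {
            throw new InvalidArgumentException('interval must be >= 1');
        }
        $parts[] = 'INTERVAL=' . $interval;
    }
    if (isset($r['count'])) {
        $count = (int) $r['count'];
        if ($count < 1) {
            throw new InvalidArgumentException('count must be >= 1');
        }
        $parts[] = 'COUNT=' . $count;
    }
    if (isset($r['until'])) {
        // UNTIL is expected in UTC basic form, e.g. 20261231T235959Z.
        if (preg_match('/^\d{8}T\d{6}Z$/', (string) $r['until']) !== 1) {
            throw new InvalidArgumentException('until must look like 20261231T235959Z');
        }
        $parts[] = 'UNTIL=' . $r['until'];
    }
    return implode(';', $parts);
}

$rule = buildRrule(['freq' => 'WEEKLY', 'interval' => 2, 'count' => 10]);
// $rule is "FREQ=WEEKLY;INTERVAL=2;COUNT=10"
```

Rejecting anything outside the whitelist is the point: an unsupported key fails loudly instead of producing an RRULE that only some clients understand.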

The timezone problem

This is where I chose to simplify hardest. iCalendar has three ways to express a datetime:

  1. UTC: 20260420T090000Z. Unambiguous. Works everywhere.
  2. Floating: 20260420T090000 (no Z, no TZID). Means "9 AM wherever the subscriber is." Useful for birthdays, bad for meetings.
  3. Local with TZID: DTSTART;TZID=America/New_York:20260420T090000, plus a full VTIMEZONE block in the envelope with every DST transition for the calendar's span.

Option 3 is what you want for a production feed with multi-region subscribers. Option 3 is also where the spec goes from hard to miserable: you need to emit daylight-saving rules from an IANA source, handle historical transitions correctly, and pick a sensible range.

I emit options 1 and 2. If the input datetime has a Z suffix or an explicit UTC offset, the output gets Z. Otherwise it becomes floating. This is the right answer if you're building an internal tool where "my server's clock" and "my subscribers' clocks" are going to agree, or if everything is already in UTC. It's the wrong answer for a public-facing meeting scheduler, and that's when you reach for sabre/vobject.
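The UTC-or-floating decision fits in a few lines. This sketch assumes ISO 8601 input and uses a hypothetical name, formatDateTimeValue; the repo's datetimeSuffix may differ in detail (this version returns only the value, without the leading colon).

```php
<?php
// Hypothetical helper illustrating the UTC-or-floating rule.
function formatDateTimeValue(string $iso): string
{
    // A trailing Z or a numeric offset after the time part means "absolute instant".
    $hasZone = preg_match('/T[\d:.]+(Z|[+-]\d{2}:?\d{2})$/i', $iso) === 1;
    $dt      = new DateTimeImmutable($iso);

    if ($hasZone) {
        // Normalise to UTC and mark it with Z.
        return $dt->setTimezone(new DateTimeZone('UTC'))->format('Ymd\THis\Z');
    }
    // No zone information: keep the wall-clock time as a floating value.
    return $dt->format('Ymd\THis');
}

$utc      = formatDateTimeValue('2026-04-20T09:00:00Z');       // 20260420T090000Z
$offset   = formatDateTimeValue('2026-04-20T11:00:00+02:00');  // 20260420T090000Z
$floating = formatDateTimeValue('2026-04-20T09:00:00');        // 20260420T090000
```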

The validator endpoint dogfoods the builder

The builder's output has to be valid. The validator reads iCal text and returns structural errors. If I point the validator at my own builder's output, it should return {valid: true, errors: []}. That's a test I can run without understanding either module separately.

public function testValidateOnValidIcsReturnsValidTrue(): void
{
    $good = file_get_contents(__DIR__ . '/fixtures/simple.ics');
    $r    = $this->request('POST', '/validate', $good, 'text/plain');
    $this->assertTrue(json_decode($r['body'], true)['valid']);
}

The fixture was generated by hand to match what the builder would produce: same CRLF, same properties, same format. If the validator rejects it, the validator is wrong. If the validator accepts something the builder didn't produce, the validator is too lenient. Both bugs get caught the moment you run the suite.

The validator also catches common hand-written mistakes: missing UID, DTSTAMP in ISO 8601 form instead of iCal basic form, LF-only documents, unterminated VEVENT, missing VERSION or PRODID.
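A structural checker for exactly those mistakes can stay small. The following is a sketch under assumptions, not the repo's validator: the function name, the error strings, and the simplified property parsing (it ignores colons inside quoted parameter values) are all mine.

```php
<?php
// Hypothetical structural checker: an illustration, not the repo's validator.
function validateIcs(string $ics): array
{
    $errors = [];

    // An LF that is not preceded by CR means the document is not CRLF-delimited.
    if (preg_match('/(?<!\r)\n/', $ics) === 1) {
        $errors[] = 'line endings must be CRLF';
    }

    // Unfold continuation lines, then split leniently so the remaining
    // checks still run on LF-only input.
    $unfolded = preg_replace('/\r?\n[ \t]/', '', $ics);
    $lines    = preg_split('/\r?\n/', trim((string) $unfolded));

    if ($lines[0] !== 'BEGIN:VCALENDAR') {
        $errors[] = 'missing BEGIN:VCALENDAR';
    }
    if ($lines[count($lines) - 1] !== 'END:VCALENDAR') {
        $errors[] = 'missing END:VCALENDAR';
    }
    foreach (['VERSION', 'PRODID'] as $prop) {
        if (preg_grep('/^' . $prop . ':/', $lines) === []) {
            $errors[] = "missing $prop";
        }
    }

    $inEvent = false;
    $props   = [];
    foreach ($lines as $line) {
        if ($line === 'BEGIN:VEVENT') {
            if ($inEvent) {
                $errors[] = 'unterminated VEVENT';
            }
            $inEvent = true;
            $props   = [];
        } elseif ($line === 'END:VEVENT' && $inEvent) {
            foreach (['UID', 'DTSTAMP', 'DTSTART'] as $required) {
                if (!isset($props[$required])) {
                    $errors[] = "VEVENT missing $required";
                }
            }
            if (isset($props['DTSTAMP']) && preg_match('/^\d{8}T\d{6}Z$/', $props['DTSTAMP']) !== 1) {
                $errors[] = 'DTSTAMP must use basic UTC form, e.g. 20260420T090000Z';
            }
            $inEvent = false;
        } elseif ($inEvent && preg_match('/^([A-Za-z0-9-]+)(?:;[^:]*)?:(.*)$/', $line, $m) === 1) {
            // Property name is the part before the first ';' or ':'.
            $props[strtoupper($m[1])] = $m[2];
        }
    }
    if ($inEvent) {
        $errors[] = 'unterminated VEVENT';
    }

    return ['valid' => $errors === [], 'errors' => $errors];
}

$sample = implode("\r\n", [
    'BEGIN:VCALENDAR', 'VERSION:2.0', 'PRODID:-//example//ical-sketch//EN',
    'BEGIN:VEVENT', 'UID:demo@example', 'DTSTAMP:20260420T080000Z',
    'DTSTART:20260420T090000Z', 'DTEND:20260420T091500Z', 'SUMMARY:Team standup',
    'END:VEVENT', 'END:VCALENDAR',
]) . "\r\n";
$result = validateIcs($sample);
```

Collecting every error instead of stopping at the first one matches the {valid, errors} response shape and makes hand-written feeds faster to fix.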

Tradeoffs I shipped with

  • No VTIMEZONE. Explained above. If you need it, use sabre/vobject.
  • No BYDAY, BYSETPOS, BYMONTHDAY in RRULE. Ditto.
  • No attendees, no ATTACH, no CLASS, no TRANSP. A feed, not a scheduling system.
  • Apple Calendar quirks. Apple is stricter than the spec in some ways and looser in others. If a subscriber is on iCloud and your feed still doesn't load, the answer is usually "add METHOD:PUBLISH" (I do) or "stop relying on floating times" (valid feedback).
  • Request size limit. 256 KB default on POST body, 512 KB default on fetched URLs. Enough for thousands of events; small enough that nobody can DoS the process.
  • No authentication. This is a microservice you put behind your own proxy or in a VPN. If you expose it publicly you want an auth layer in front of it.

Try it in 30 seconds

docker run --rm -p 8000:8000 ghcr.io/sen-ltd/ical-api

curl -sS -X POST http://localhost:8000/ical \
  -H "Content-Type: application/json" \
  -d '{
    "calendar": {"name": "Standups"},
    "events": [{
      "title": "Team standup",
      "start": "2026-04-20T09:00:00Z",
      "end":   "2026-04-20T09:15:00Z",
      "recurrence": {"freq": "DAILY", "count": 10}
    }]
  }' > standups.ics

curl -sS -X POST http://localhost:8000/validate \
  -H "Content-Type: text/plain" \
  --data-binary @standups.ics

If you want to see the CRLF sitting on every line:

xxd standups.ics | head

Every 0d 0a you see is a tiny victory.

Closing

Entry #134 in a 100+ portfolio series by SEN LLC.

Feedback welcome. The next thing I want to add is a VTIMEZONE emitter that doesn't lie, which probably means backing it with the IANA tz database and admitting I'll be learning a lot.
