Introduction
Hi everyone! In this post, I'll summarize what I did this week on my GSoC project: defining and implementing a specification for the dhall-docs comment format.
Why do we need a spec?
For two reasons.
Distinguish between internal and documentation comments
It is really common in many programming languages to write comments that you don't want the documentation generator to render. A simple way to achieve this is to use "markers" in comments. Haddock, for instance, requires | as a marker in both single-line and block comments:
-- | Valid haddock comment
-- ignored haddock comment
{-| Valid haddock comment -}
{- Ignored haddock comment -}
Javadoc does the same by requiring that Javadoc comments start with /**.
For the same reason, we need to be able to distinguish between internal and documentation comments in dhall-docs.
Indentation issues
We are going to use Markdown as the markup language for the documentation generator; specifically, a flavor of Markdown named CommonMark. Markdown, unlike Dhall, is sensitive to indentation. The following Markdown document:
first column
    4 columns later
will render "first column" in a normal paragraph, and "4 columns later" as an Indented Code Block. I invite you to give it a try.
In the beginning, since dhall-docs only supported header comments, we used the first column of the line as the base of indentation. That means that something like this:
{-
foo
bar
baz
-}
will be rendered as 3 different paragraphs, in different lines, whereas:
{-
    foo
    bar
    baz
-}
will render all three lines as an indented code block. But this doesn't scale well once we support other documentation comments (such as comments on record fields). Take this example:
{
  {-
    should this be indented???
  -}
  foo = bar
}
The indentation of the comment's content is not clear here, and forcing users to write it like this:
{
  {-
should this be indented???
  -}
  foo = bar
}
is awful and was never an option.
Final specification
Rewriting the final specification here would make no sense, but you can read it here. Block comments are heavily based on Dhall's multi-line strings, so they feel familiar to anyone who already knows the language. You can read the specification for multi-line strings here. Single-line comments, on the other hand, were something completely new, or at least I couldn't find anything similar in Dhall's language design that could help, but I think they will be comfortable for users.
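To make the multi-line string idea concrete, here is a minimal sketch of stripping the indentation shared by all lines of a comment body before handing it to Markdown. The name stripCommonIndent is hypothetical and not taken from dhall-docs, and the sketch only counts spaces and ignores the finer points of the real multi-line string rules.

```haskell
import Data.Char (isSpace)

-- | Remove the indentation shared by every non-blank line.
-- Hypothetical helper for illustration; only spaces count as indentation.
stripCommonIndent :: [String] -> [String]
stripCommonIndent ls = map (drop n) ls
  where
    -- Indentation width of each line that contains some text
    indents = [length (takeWhile (== ' ') l) | l <- ls, not (all isSpace l)]
    -- Blank-only input has no common indentation to remove
    n = if null indents then 0 else minimum indents
```

With something like this, a comment nested inside a record can be indented along with the surrounding code, and only the indentation beyond the common prefix reaches the Markdown renderer.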
To give you a summary:
- Block comments start with {-| and a newline, e.g.
{-|
foo
bar
-}
- Single-line comments can span several lines, but the first one must start with "--| " (note the final whitespace) and every other line must start with "--  " (note the two whitespaces). Also, they need to be vertically aligned, e.g.
--| foo
--  bar
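As a rough illustration of the single-line rule, the following sketch accepts a group of lines only when the first starts with "--| " and all the others start with "--  ". The function name singleLineDocText is an assumption of mine, and the check is deliberately simpler than the real specification (for example, it rejects a bare "--" line).

```haskell
import Data.List (isPrefixOf)

-- | Return the comment text when the lines follow the rule summarised above,
-- or Nothing otherwise. Hypothetical helper, not the dhall-docs implementation.
singleLineDocText :: [String] -> Maybe [String]
singleLineDocText [] = Nothing
singleLineDocText (first : rest)
  | "--| " `isPrefixOf` first && all ("--  " `isPrefixOf`) rest
      -- Both prefixes are four characters wide, which keeps the text aligned
      = Just (map (drop 4) (first : rest))
  | otherwise = Nothing
```

Because both prefixes have the same width, dropping them leaves the text of every line starting at the same column.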
Implementation
I have to admit: text manipulation is really hard and messy for me. My first implementation of the specification was really awful and cumbersome. One of my mentors gave me this post with some tips on type-driven design, and I applied them to the implementation. I highly recommend reading that post.
I defined the following data types:
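{-# LANGUAGE DataKinds #-}
{-# LANGUAGE KindSignatures #-}

import Data.List.NonEmpty (NonEmpty)
import Data.Text (Text)
import Text.Megaparsec (SourcePos) -- assumed origin of SourcePos

-- Used only as a type-level tag once promoted by DataKinds
data CommentType = DhallDocsComment | MarkedComment | RawComment
type ListOfSingleLineComments = NonEmpty (SourcePos, Text)

-- `a` is a phantom parameter: it never appears in the constructors
data DhallComment (a :: CommentType)
    = BlockComment Text
    | SingleLineComments ListOfSingleLineComments
    deriving Show
newtype DhallDocsText = DhallDocsText Text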
Note the language extensions I'm using (disclaimer: I'm still learning Haskell, and specifically the use of language extensions, some of which require some math background. I apologize if I say something wrong. If you want to correct my terminology, please leave a comment on the post):
- DataKinds allows us to extend Haskell's kind system by promoting our data types to kinds (and their constructors to types).
- KindSignatures allows the a :: CommentType syntax. In that example it means "I accept any type of kind CommentType", i.e. DhallDocsComment, MarkedComment, or RawComment.
That allows us to have these types of comments:
DhallComment RawComment
DhallComment MarkedComment
DhallComment DhallDocsComment
and you know for sure what kind of text each of those types stores. This is something I love about Haskell: the type system itself helps you write correct programs.
After that, all that was left was writing functions that map between each possible DhallComment:
-- checks if a DhallComment has the `|` marker
parseMarkedComment
    :: DhallComment 'RawComment
    -> Maybe (DhallComment 'MarkedComment)

-- check that a MarkedComment is valid against the `dhall-docs` spec
parseDhallDocsComment
    :: DhallComment 'MarkedComment
    -> Either CommentParseError (DhallComment 'DhallDocsComment)

-- Manipulates the comment's text. Since the Comment
-- is a 'DhallDocsComment it should never fail
parseDhallDocsText
    :: DhallComment 'DhallDocsComment
    -> DhallDocsText
To ensure that the implementation works properly, the dhall-docs test setup was enhanced with several unit test cases for this, which you can see here.
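To show why the phantom type parameter is useful, here is a cut-down sketch of the same idea. It is not the dhall-docs code: the names Comment, markComment and markedText are hypothetical, only block comments are modelled, and String replaces Text so the example depends on base alone.

```haskell
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE KindSignatures #-}

import Data.List (isPrefixOf)

-- Promoted to a kind by DataKinds; its constructors become type-level tags
data CommentType = MarkedComment | RawComment

-- The parameter `a` is a phantom: it records which checks the text has passed
newtype Comment (a :: CommentType) = Comment String
  deriving (Eq, Show)

-- | The only way to obtain a marked comment is to check a raw one.
markComment :: Comment 'RawComment -> Maybe (Comment 'MarkedComment)
markComment (Comment txt)
  | "{-|" `isPrefixOf` txt = Just (Comment txt)
  | otherwise = Nothing

-- | Accepts marked comments only; passing a raw comment does not type-check.
markedText :: Comment 'MarkedComment -> String
markedText (Comment txt) = txt
```

Calling markedText directly on a Comment 'RawComment is rejected by the compiler, so a later stage can never skip an earlier check by accident.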
Examples of using this new spec
Recently a PR was opened to modify the header comments of Dhall's Prelude files. You can see the PR here:
Use `dhall-docs` comment format for Prelude #1045
... so that the comment headers are included in the generated documentation
To end
I have one month left to finish the project, and some core and cool features are still missing. As soon as I finish them I'll post here so you can check them out. I have to say: I'm really excited to finish this project :)
Thanks for reading!