TL;DR
If you only want to see my current progress, visit https://hydra.dhall-lang.org/build/63538/download/1/docs/
Motivation
What is a documentation generator? If you're familiar with other programming languages such as Java or Haskell, you may know of tools that analyze your source code, extract useful information from your comments and display it in a markup language such as HTML.
In Haskell, that tool is haddock. It analyzes your source code, searching for comment annotations on your data types and functions, and reports them in a nice way using HTML.
For instance, the following function declaration:
-- | Function description
haddockExample
    :: a -- ^ input description
    -> b -- ^ output description
will render something like this in the generated HTML:
The generated HTML can then be uploaded to any host and served from there. This is useful if you're a package maintainer and want to let your consumers know how to use your packages.
The main goal of this #GSoC project is to build a similar tool for the Dhall configuration language. I set some milestones for the project, and this post will focus on the following:
- Generate readable documentation with a nice UI/UX from a Dhall package. A Dhall package is essentially (at this moment) a folder with several Dhall files, ending with the .dhall extension.
- Add rendered source code (first iteration). Haddock does it. For each Haskell module, it will create:
  - The HTML documentation
  - An HTML rendering of the source code, similar to this
Documentation generator
At this moment, I have developed the dhall-docs executable, which takes the following flags:
- --input, which is a relative or absolute path to the Dhall package.
- --output-link, which is a symlink (defaulting to ./docs) to the generated documentation.
- --package-name, which is the package name used in HTML titles and in the location where the generated documentation is actually saved.
The tool will traverse the whole --input directory recursively, searching for all the files that end in .dhall, and parse them. If a file fails to parse as a Dhall expression, it won't be included.
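The traversal described above can be sketched roughly as follows. This is only an illustration that relies on the directory and filepath packages, not the actual dhall-docs implementation; the names isDhallFile and findDhallFiles are made up for this example, and the parsing step is left out.

```haskell
import Control.Monad (forM)
import System.Directory (doesDirectoryExist, listDirectory)
import System.FilePath (takeExtension, (</>))

-- | True when the path ends with the `.dhall` extension.
-- (Hypothetical helper, not taken from dhall-docs.)
isDhallFile :: FilePath -> Bool
isDhallFile path = takeExtension path == ".dhall"

-- | Recursively collect every `.dhall` file below the given directory.
-- Symlinked directories are followed, so a link cycle would loop forever;
-- a real tool would need to guard against that.
findDhallFiles :: FilePath -> IO [FilePath]
findDhallFiles dir = do
    entries <- listDirectory dir
    found <- forM entries $ \entry -> do
        let path = dir </> entry
        isDir <- doesDirectoryExist path
        if isDir
            then findDhallFiles path
            else return [path | isDhallFile path]
    return (concat found)
```

Each returned path would then be handed to the Dhall parser, and the files that fail to parse would be dropped.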
Structure of the generated documentation
In each directory of the generated documentation, an index.html file is generated listing the subpackages (the directories in it) and the exported Dhall files at that level.
If we visit a .dhall file, we can see something like this:
The following is the actual source code that the tool took as part of its input:
{-
`subtract m n` computes `n - m`, truncating to `0` if `m > n`
-}
let subtract
    : Natural → Natural → Natural
    = Natural/subtract

let example0 = assert : subtract 1 2 ≡ 1

let example1 = assert : subtract 1 1 ≡ 0

let example2 = assert : subtract 2 1 ≡ 0

let property0 = λ(n : Natural) → assert : subtract 0 n ≡ n

let property1 = λ(n : Natural) → assert : subtract n 0 ≡ 0

let property2 = λ(n : Natural) → assert : subtract n n ≡ 0

in  subtract
As you may notice, the Documentation header in the generated HTML corresponds to the header comment, and the actual source code is the rest of the file. The header is written in Markdown; dhall-docs uses mmark as its Markdown parser and preprocessor.
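To give a feeling for the header/source split, here is a minimal sketch that separates a leading block comment from the rest of a file. It is an assumption-laden toy: it only uses base, works on String, ignores `--` line comments, and the name splitHeader is hypothetical. The real tool relies on the Dhall parser rather than on anything like this.

```haskell
import Data.Char (isSpace)

-- | Split file contents into (header comment body, remaining source).
-- The header is Nothing when the file does not start with `{-`
-- or when that comment is never closed.
splitHeader :: String -> (Maybe String, String)
splitHeader contents =
    case dropWhile isSpace contents of
        '{' : '-' : rest -> case go (1 :: Int) "" rest of
            Just (body, remaining) -> (Just body, remaining)
            Nothing                -> (Nothing, contents)
        _ -> (Nothing, contents)
  where
    -- `depth` counts the block comments that are currently open (they can
    -- nest); `acc` holds the comment body in reverse order.
    go _ _ [] = Nothing
    go depth acc ('{' : '-' : xs) = go (depth + 1) ('-' : '{' : acc) xs
    go depth acc ('-' : '}' : xs)
        | depth == 1 = Just (reverse acc, xs)
        | otherwise  = go (depth - 1) ('}' : '-' : acc) xs
    go depth acc (x : xs) = go depth (x : acc) xs
```

The first component is what would be fed to the Markdown parser, and the second is what would be rendered as source code.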
This is the first iteration of the work. I plan to expand the places where annotation comments can go, such as record type labels.
Here is the list of PRs involved in this task:
- dhall-haskell#1833 introduced the repository skeleton.
- dhall-haskell#1845 was the first attempt to generate this documentation, without any CSS and parsing the header without a Markdown preprocessor.
- dhall-haskell#1848 improved the CSS rules.
- dhall-haskell#1863 parsed the Markdown contents of the header and rendered them on the HTML page.
- dhall-haskell#1871 added a small CI/CD configuration to generate sample documentation. You can see it here. This was my hardest task since I didn't know any Nix; it feels good to have actually accomplished it.
- dhall-haskell#1876 stored the generated documentation at $XDG_DATA_HOME/dhall-docs, following the XDG specification. Documentation is stored at $XDG_DATA_HOME/dhall-docs/${SHA256_OF_DOCS}-${PACKAGE-NAME}, which makes it content-addressable.
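The storage layout from the last PR can be illustrated with a small sketch. It assumes the hash of the generated docs has already been computed elsewhere and is passed in as a string; storeEntryName and storePath are hypothetical names, and only getXdgDirectory from the directory package is a real library call.

```haskell
import System.Directory (XdgDirectory (XdgData), getXdgDirectory)
import System.FilePath ((</>))

-- | Directory name of the form `<hash>-<package-name>`.
-- The hash is taken as an input here; computing it is out of scope.
storeEntryName :: String -> String -> FilePath
storeEntryName docsHash packageName = docsHash <> "-" <> packageName

-- | Full path under the XDG data directory for `dhall-docs`.
-- `getXdgDirectory XdgData` honours $XDG_DATA_HOME and falls back to the
-- platform default when the variable is unset.
storePath :: String -> String -> IO FilePath
storePath docsHash packageName = do
    base <- getXdgDirectory XdgData "dhall-docs"
    return (base </> storeEntryName docsHash packageName)
```

Because the hash is part of the directory name, two different versions of the docs for the same package never overwrite each other, and the --output-link symlink only has to point at the latest one.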
All of this work was really ad hoc, so I won't add any implementation details: you can see them in the PRs. The next section was way more interesting to implement, so please keep reading :)
Rendered Source Code (first iteration)
In the previous section I showed a first iteration of the rendered source code. There were several ways of doing this task, and the one I was about to start with was traversing the Dhall AST to generate Html (). In FP terms, I would write a catamorphism. In non-FP terms, I would write a mapper.
But this was going to involve a lot of lines of code, and would repeat some of what the Dhall.Pretty module of the dhall package already does, i.e. define formatting rules for the AST elements and tokens.
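To make that first option concrete, here is a toy sketch of what the hand-written traversal looks like. The Expr type below is a four-constructor stand-in invented for this example, not the real Dhall AST, and the output is a plain String without any HTML escaping.

```haskell
-- | A toy expression type; NOT the real Dhall AST.
data Expr
    = NaturalLit Integer
    | Var String
    | Plus Expr Expr
    | Let String Expr Expr

-- | Wrap some text in a <span> carrying a CSS class (no escaping is done).
spanWith :: String -> String -> String
spanWith cls body = "<span class=\"" <> cls <> "\">" <> body <> "</span>"

-- | One case per constructor: every formatting decision lives here, and
-- each sub-expression is rendered by a recursive call.
renderExpr :: Expr -> String
renderExpr expr = case expr of
    NaturalLit n -> spanWith "literal" (show n)
    Var name -> spanWith "label" name
    Plus l r ->
        renderExpr l <> " " <> spanWith "operator" "+" <> " " <> renderExpr r
    Let name value body ->
        spanWith "keyword" "let" <> " " <> spanWith "label" name <> " = "
            <> renderExpr value <> " " <> spanWith "keyword" "in" <> " "
            <> renderExpr body
```

Even in this toy, the function has to know which token is a keyword, a label or an operator. The real AST has far more constructors, so this knowledge would have to be repeated for each of them.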
The Dhall.Pretty module uses the prettyprinter package under the hood. Its core consists of the following functions and ADT:
data Ann
    = Keyword     -- ^ Used for syntactic keywords
    | Syntax      -- ^ Syntax punctuation such as commas, parentheses, and braces
    | Label       -- ^ Record labels
    | Literal     -- ^ Literals such as integers and strings
    | Builtin     -- ^ Builtin types and values
    | Operator    -- ^ Operators
    deriving Show

-- Create a `Doc Ann` from a Dhall expression,
-- annotating elements using our syntactic rules
prettyExpr :: Pretty a => Expr s a -> Doc Ann

-- SimpleDocStream can later be rendered as `Text` on
-- a terminal
layout :: Doc ann -> Pretty.SimpleDocStream ann
This module contained basically everything I had to do, and repeating code is bad! The prettyprinter package says in its description:
A prettyprinter/text rendering engine. Easy to use, well-documented, ANSI terminal backend exists, HTML backend is trivial to implement, no name clashes, Text-based, extensible.
so I thought: "man, there should be a way to generate Html () from a Dhall expression using this module".
And there was a way. The package documentation recommends using SimpleDocTree instead of SimpleDocStream to render HTML-like output, and the package itself exports a utility to do the conversion: treeForm. Traversing the SimpleDocTree ADT made all this work possible in the following function:
{-# LANGUAGE OverloadedStrings #-}

import Data.Text (Text)
import Dhall.Core (Expr, Import)
import Dhall.Pretty (Ann (..))
import Dhall.Src (Src)
import Lucid

import qualified Data.Text
import qualified Data.Text.Prettyprint.Doc.Render.Util.SimpleDocTree as Pretty
import qualified Dhall.Pretty

exprToHtml :: Expr Src Import -> Html ()
exprToHtml expr = renderTree prettyTree
  where
    -- Pretty-print the expression, then convert the stream into a tree
    prettyTree = Pretty.treeForm
        $ Dhall.Pretty.layout
        $ Dhall.Pretty.prettyExpr expr

    -- Indentation after a line break
    textSpaces :: Int -> Text
    textSpaces n = Data.Text.replicate n (Data.Text.singleton ' ')

    renderTree :: Pretty.SimpleDocTree Ann -> Html ()
    renderTree sds = case sds of
        Pretty.STEmpty -> return ()
        Pretty.STChar c -> toHtml $ Data.Text.singleton c
        Pretty.STText _ t -> toHtml t
        Pretty.STLine i -> br_ [] >> toHtml (textSpaces i)
        Pretty.STAnn ann content -> encloseInTagFor ann (renderTree content)
        Pretty.STConcat contents -> foldMap renderTree contents

    -- Wrap annotated content in a <span> with a class derived from the annotation
    encloseInTagFor :: Ann -> Html () -> Html ()
    encloseInTagFor ann = span_ [class_ classForAnn]
      where
        classForAnn = "dhall-" <> case ann of
            Keyword -> "keyword"
            -- omitted for brevity
This is similar to the first option: transform the Dhall AST (Expr s e) to Html (). The difference is that we don't have to worry about the types of syntactic elements: that logic stays in the Dhall.Pretty module, and this function only creates the Html () from its output.
The PR that introduced that change is dhall-haskell#1892, and we can see how many additions/deletions that change involved:
Neat! A lot of value in less than 160 lines of code.
The things I learnt along the way
Of course, I've improved my Haskell, specifically my knowledge of the package ecosystem. One of the things that overwhelmed (and sometimes annoyed) me is how package versions are resolved: since we have to ensure our project works on several GHC versions with several package versions, we have to be really sure about the version of a package that we are adding.
This was difficult in the first two weeks, since I had to do a little research about packages to render HTML and parse Markdown, and fight against the CI/CD pipeline when a version error occurred.
Thankfully, I now understand better how it works. I don't think I'll add more packages in the future of the project, but I now know how to tackle that kind of issue.
Another thing I learned a little about during the project was Nix. In short, it's a functional package manager. Fun fact: I see that a lot of people who enter the functional programming world tend to use only tools that follow that paradigm. Every time I searched for something about Nix, the example was a Haskell project.
If you made it this far
Thanks for reading!