Chicken Blog

Or how I built my own static site generator.

It has been a while since I have updated this blog, which frustrates me. I have many more ideas than I have time to write them down, much less put them into practice. I have taken steps to alleviate this situation: I chickenized my blog.

Why static websites are so fucking hard

I think I would post more if some of the most tedious parts of maintaining a website were automated. Currently I manage this blog, my personal website, and my girlfriend’s business catalog. These are all static websites which are updated only occasionally. How hard can that be?

For starters, HTML sucks. While CSS is somewhat bloated, at least it encourages modularity: reusing styles across different pages is a breeze. But HTML does not support shareable components without JavaScript; and I’ll be damned if I have to use JS. And pages that share not only style but also content come up all the time.

This problem is well-known, and webmasters have created many tools to solve it. Yet, nothing I found met my requirements:

Most of these are self-explanatory. For writing HTML, I usually draft the content in djot rather than markdown, convert it to HTML, and then tweak it until I like it. Crucially, I then delete the draft and make all subsequent changes to the HTML file directly. Why I think this is the best approach to generating HTML deserves a blog post of its own.

The only “tool” I found that came close to meeting all the requirements is make; yes, that make. Indeed, one can make a static site generator with make. But it stretches make past its limits, and I get the feeling that it’s the wrong tool for the job.

What to do? I started to look for general-purpose macro processors (GPMP). These are tools that allow you to create macros: give some text a name (perhaps some parameters too) which can be expanded in many different places. A reusable HTML component would just be a macro in this context.

The only GPMP I found was GNU’s m4. As is common for GNU software, it was full of bloat, so it was a hard pass from me.1

I decided to write my own GPMP (in C). I was halfway there when I realized, to my delight, that I was reinventing a subset of the Scheme programming language. I threw my code away and started to write a program in Scheme to build my blog.

Why and what Scheme

I never thought that a fully fledged programming language was the tool I was looking for, but now I can’t imagine any other way. A hand-crafted program beats a framework any day of the week. I will come back to this point later.

The language choice is interesting. For this project I used Scheme, specifically CHICKEN Scheme. Scheme is a simple dynamically-typed language, with good support for both object-oriented and functional programming. Compared to other implementations of Scheme, CHICKEN offers:

  • Support for raw multiline strings with embedded expressions. We will see this is crucial.
  • A transpiler to C, for maximum portability.
  • The right language standard, the Revised Revised Revised Revised Revised Report on the Algorithmic Language Scheme (also known as R5RS), rather than the bloated Revised Revised Revised Revised Revised Revised Report on the Algorithmic Language Scheme (also known as R6RS).2

I usually write everything in C, but in this case CHICKEN outshines C by a large margin.

I am familiar with Scheme from the wizard book SICP, but this is my first time coding anything useful in Scheme, and my first time using CHICKEN. Nevertheless, I think it came out quite well.

What follows is a description of my program, in the painful detail that can only come from a beginner. Why? Because I want to refer back to this post when I do the same thing for websites other than my blog, and more generally whenever I use CHICKEN again. There are some design patterns I don’t want to forget. If you are learning CHICKEN like me, I think this section could be useful to you. If not, feel free to skip ahead to the conclusion.

So, let’s get on with how I chickenized my blog, shall we?

Chickenization

You can find all the code (with minor modifications) on sourcehut.

Disclaimer: The code snippets here are licensed under the GNU AGPLv3.

The goal

My blog is statically generated from these files:

  • data/posts/
  • data/posts/metadata.scm
  • data/template/

The data/posts directory includes all my posts in HTML but only the content. Taking this post as an example:

> head -6 data/posts/chicken-blog.html
<p>
  It has been a while since I have updated this blog, which frustrates me. I
  have many more ideas than I have time to write them down, much less put them
  into practice. I have taken steps to alleviate this situation: I chickenized
  my blog.
</p>

The data/posts/metadata.scm file contains, well, all the metadata I need to associate with each post. I write these as Scheme S-expressions, one for each post.

> head -6 data/posts/metadata.scm
((title . "Chicken blog")
  (description . "How I chickenized my blog.")
  (year . 2026)
  (month . 6)
  (day . 2)
  (content-file . "chicken-blog.html"))
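
Each entry is an ordinary association list, so it can be read back with read-list and queried with assq. The snippet below is only a minimal sketch of that idea: the sample string and the field helper are hypothetical and are not part of the actual program, which parses the entries differently (see the generate module).

```scheme
(import scheme (chicken base) (chicken io) (chicken port))

;; Two hypothetical entries in the same shape as metadata.scm.
(define sample
  "((title . \"First\") (year . 2025))\n((title . \"Second\") (year . 2026))")

;; read-list keeps calling read until end of input and collects the results.
(define entries (with-input-from-string sample read-list))

;; Hypothetical helper: look up a field by key, failing loudly if it is absent.
(define (field entry key)
  (let ((pair (assq key entry)))
    (if pair
        (cdr pair)
        (error "Missing metadata field" key))))

(print (field (car entries) 'title))
(print (field (cadr entries) 'year))
```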

The data/template/ directory contains everything that needs to go in my website, minus the HTML.

With one command I can generate my site from these files and deploy it to live. It takes less than two seconds.

Programming in CHICKEN

Let’s actually code this thing. I divided my program into five modules:

  • generate.scm
  • types.scm
  • render.scm
  • dates.scm
  • utils.scm

The generate.scm file is the entry point of the program so we start there.

The generate module

(module generate ()
  (...import modules (generate)...)

  (...post-metadata->post procedure...)
  (...write-page procedure...)

  (...filepath definitions...)

  (...create post objects...)

  (...write post pages...)
  (...write index page...))

The module generate () just names the current module “generate” (the empty parentheses mean that this module doesn’t export anything).

; (...import modules (generate)...)
(import scheme
      (chicken base)
      (chicken io)
      (chicken pathname)
      (chicken process-context)
      (chicken file)
      (chicken sort)
      types
      dates
      render)

Okay, so we are importing some modules; what do they do? The scheme module is basically all of the R5RS Scheme language, and (chicken base) adds CHICKEN-specific extensions. These are both included by default when you are not inside a module. The other (chicken *) modules are like the C standard library, and their names are self-explanatory, except for (chicken process-context), which we will need to access command-line arguments. The rest of the modules are the ones we will define later. Usually, I would have qualms about using non-standard language extensions but, since CHICKEN can compile everything into standard C code, I don’t give a toss.

Now, let’s write a procedure that can parse the metadata we saw earlier. The procedure post-metadata->post takes two arguments: data is one of the S-expressions in the metadata file, and content-dir is the directory that contains the content files, in our case data/posts. It returns a “post object”, which we will define in the types module.

; (...post-metadata->post procedure...)
(define (post-metadata->post data content-dir)
  (define title)
  (define description)
  (define year)
  (define month)
  (define day)
  (define content-file)
  (define (process-record record)
    (let ((key (car record))
          (value (cdr record)))
      (cond ((eq? key 'title) (set! title value))
            ((eq? key 'description) (set! description value))
            ((eq? key 'year) (set! year value))
            ((eq? key 'month) (set! month value))
            ((eq? key 'day) (set! day value))
            ((eq? key 'content-file) (set! content-file value))
            (else (error "Unknown metadata field" key)))))
  (for-each process-record data)
  (define date (make-date year month day))
  (make-post title
             description
             date
             (make-pathname content-dir
                            content-file)))

Not too bad, I hope. As we will see when we get to the types module, a post object is a special instance of a “page object”.3 These objects have all the information needed to write them to the desired HTML file. Let us define the procedure that does just that. To add flexibility, we include an output-dir parameter which allows us to change the location of the site.

; (...write-page procedure...)
(define (write-page page output-dir)
  (let ((file (make-pathname output-dir
                             (page-output-file page)))
        (rendered (render-page page)))
    (receive (dir fname ext) (decompose-pathname file)
      (create-directory dir #t))
    (delete-file* file)
    (with-output-to-file file
                         (lambda () (write-string rendered)))))

The render-page procedure outputs a string which is all the HTML corresponding to the page. It is defined in the render module.
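
Since write-page leans on make-pathname and decompose-pathname from (chicken pathname), here is a small standalone sketch of how those two procedures behave. The paths are made up for illustration.

```scheme
(import scheme (chicken base) (chicken pathname))

;; Join a directory and a file name; the separator is added for us.
(define file (make-pathname "site" "2026/06/02/chicken-blog.html"))
(print file)

;; The optional third argument is the extension, with or without the leading dot.
(print (make-pathname "2026/06/02" "chicken-blog" ".html"))

;; Split a path into its directory, base name, and extension (three values).
(receive (dir name ext) (decompose-pathname file)
  (print "directory: " dir)
  (print "name:      " name)
  (print "extension: " ext))
```

The directory part is what write-page hands to create-directory, so nested folders exist before the file is written.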

Next, we define some parameters that the user can specify on the command line.

; (...filepath definitions...)
(define posts-dir (car (command-line-arguments)))
(define output-dir (cadr (command-line-arguments)))
(define metadata-file (make-pathname posts-dir "metadata.scm"))

(assert (file-exists? metadata-file))

Now the user can specify where the post contents and metadata are located (for us data/posts) and where the pages should be created (as we will see later, this will be the directory site).

We have everything that we need to create all post objects from the metadata and content files. We collect them into a list, posts, which is sorted in reverse-chronological order.

; (...create post objects...)
(define posts
  (sort (map (cute post-metadata->post <> posts-dir)
             (with-input-from-file metadata-file read-list))
        (lambda (post1 post2)
          ; newest first: a strict ordering, obtained by swapping the arguments
          (date<? (post-date post2)
                  (post-date post1)))))

The cute macro (from SRFI 26) is just syntactic sugar for specializing parameters, which is exactly what we need because we only call post-metadata->post with content-dir set to posts-dir. The read-list procedure reads every S-expression in a file and returns them as a list.
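
To make cute and sort less abstract, here is a tiny standalone sketch. The names add-suffix and years are made up for this example.

```scheme
(import scheme (chicken base) (chicken sort))

;; (cute f <> x) builds a one-argument procedure, roughly (lambda (a) (f a x)),
;; with x evaluated once when the procedure is created.
(define add-suffix (cute string-append <> ".html"))
(print (map add-suffix '("index" "about")))

;; sort takes the sequence first and a "less than" predicate second;
;; passing > instead gives descending order.
(define years '(2024 2026 2025))
(print (sort years >))
```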

Finally, we write the HTML for the posts.

; (...write post pages...)
(for-each (cute write-page <> output-dir) posts)

We can also create the main page, which is supposed to have links to all the posts. This is why we sorted the posts list.

; (...write index page...)
(write-page (make-index (render-navigation posts)) output-dir))

The make-index procedure creates the page object of index.html, and takes as an argument a navigation menu for the posts.

And that is all for the generate module.

The types module

We get to do the good kind of OOP in the types module.

(module types *
  (import scheme
          (chicken io)
          (chicken pathname)
          utils
          dates)

(...make-index procedure...)

(...make-post procedure...)

  (...getters for page objects...)
  (...getters specific to post objects...))

The * in the module declaration means we are going to export everything defined in this module. You will soon see why I am indenting some things all the way to the left.

; (...make-index procedure...)
  (define (make-index navigation)
(...index header definition...)
(...index content definition...)
    (lambda (op)
      (cond ((eq? op 'title) "nagbu's blog")
            ((eq? op 'description) "nagbu's blog.")
            ((eq? op 'content) content)
            ((eq? op 'stylesheets) '("/style.css"))
            ((eq? op 'header) header)
            ((eq? op 'url-pathname) "/")
            ((eq? op 'output-file) "index.html"))))

This is the classic message-passing method to represent objects; I learned it from SICP. Basically, an object (in this case a record) is just a function with some state. It accepts a message and does the appropriate thing, which here means returning one of its fields. We can wrap this functionality in accessors (getters, for us) as follows.

; (...getters for page objects...)
(define (page-title page) (page 'title))
(define (page-description page) (page 'description))
(define (page-content page) (page 'content))
(define (page-stylesheets page) (page 'stylesheets))
(define (page-url-pathname page) (page 'url-pathname))
(define (page-output-file page) (page 'output-file))
(define (page-header page) (page 'header))

I hope you can infer what these fields represent, except maybe content, which is just what goes inside the HTML tag <main>.
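
If message passing is new to you, the following stripped-down sketch shows the whole pattern in isolation. The make-simple-page constructor and its two fields are hypothetical and exist only for this example. Unlike the constructors in this post, it adds an else clause so that an unknown message raises an error instead of silently returning an unspecified value.

```scheme
(import scheme (chicken base))

;; A hypothetical "page" with two fields, built as a closure over its arguments.
(define (make-simple-page title output-file)
  (lambda (op)
    (cond ((eq? op 'title) title)
          ((eq? op 'output-file) output-file)
          (else (error "Unknown operation" op)))))

;; Getters simply send the corresponding message.
(define (simple-page-title page) (page 'title))
(define (simple-page-output-file page) (page 'output-file))

(define about (make-simple-page "About" "about.html"))
(print (simple-page-title about))
(print (simple-page-output-file about))
```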

The header field is defined as follows.

; (...index header definition...)
    (define header #<<HTML
<header>
  <a id="site-name" href="/">nagbu's blog</a>
  <nav>
    <ul>
      <li><a href="https://www.nagbu.net">Main site</a></li>
    </ul>
  </nav>
</header>
HTML
    )

Multiline raw strings in a heredoc style, what isn’t there to love? Unfortunately, we have to indent them all the way to the left in order to not introduce extraneous indentation. We can also embed expressions in them.

    (define content #<#HTML
<nav>
  <dl>
    #(indent "
    " navigation)
  </dl>
</nav>
HTML
    )

Here we use a clever trick. We need to embed the navigation into the content, but it needs to come out indented, even if it contains multiple lines. The indent procedure, defined in the utils module, replaces every newline in its second argument with the first argument.

; (...indent procedure...)
(define (indent indent-string content)
  (string-translate* content `(("\n" . ,indent-string))))

So, by passing the current indent as the first argument, we get the right result. We are already beginning to see the strengths and limitations of CHICKEN strings; I will return to the topic of string literals in the conclusion.
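
Here is the trick in isolation, as a small standalone sketch. The navigation string is made up; in the real program it comes from render-navigation.

```scheme
(import scheme (chicken base) (chicken string))

;; Same definition as in the utils module.
(define (indent indent-string content)
  (string-translate* content `(("\n" . ,indent-string))))

;; A made-up two-line fragment standing in for the rendered navigation.
(define navigation "<dt>2026-06-02</dt>\n<dd>Chicken blog</dd>")

;; Every newline becomes "newline plus four spaces", so the second line
;; lines up with the first one inside the <dl>.
(print "  <dl>\n    " (indent "\n    " navigation) "\n  </dl>")
```

Note that only the lines after the first are shifted; the first line gets its indentation from the surrounding template.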

Anyway, posts are defined similarly.

; (...make-post procedure...)
  (define (make-post title description date content-file)
    (define header #<#HTML
<header>
  <nav>
    <a href="/">Back to blog</a>
  </nav>
  <hgroup>
    <h1>#{title}</h1>
    <p><time>#(date-dashes date)</time></p>
  </hgroup>
</header>
HTML
    )
    (define stylesheets '("/style.css" "/post.css"))
    (define url-pathname (string-append "/"
                                    (date-slashes date)
                                    "/"
                                    (urlify title)))
    (define output-file (make-pathname (date-slashes date)
                                       (urlify title)
                                       ".html"))
    (lambda (op)
      (cond ((eq? op 'title) (string-append title " | nagbu's blog"))
            ((eq? op 'title-raw) title)
            ((eq? op 'description) description)
            ((eq? op 'date) date)
            ((eq? op 'content)
             (with-input-from-file content-file read-string))
            ((eq? op 'stylesheets) stylesheets)
            ((eq? op 'url-pathname) url-pathname)
            ((eq? op 'output-file) output-file)
            ((eq? op 'header) header))))

As you can see, posts have a couple of fields which do not apply to general pages. So, we need extra getters.

; (...getters specific to post objects...))
(define (post-date post) (post 'date))
(define (post-title-raw post) (post 'title-raw)))

Other modules

The render module is nothing special: rendering is just concatenating information from the page objects in the right order. The dates module contains a date data type which I found useful, and the utils module is just a few convenient procedures for string manipulation.
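
The dates module is not shown in this post. As a rough idea only, a module offering the interface used above (make-date, date<?, date-dashes, date-slashes) could look something like the sketch below. The list representation, the pad2 and date-join helpers, and the zero-padded output are all assumptions made for illustration; the real module may well differ.

```scheme
(import scheme (chicken base))

;; Assumption: a date is simply the list (year month day).
(define (make-date year month day) (list year month day))

;; Lexicographic comparison: year first, then month, then day.
(define (date<? a b)
  (cond ((null? a) #f)
        ((< (car a) (car b)) #t)
        ((> (car a) (car b)) #f)
        (else (date<? (cdr a) (cdr b)))))

;; Hypothetical helper: render a number with at least two digits.
(define (pad2 n)
  (if (< n 10)
      (string-append "0" (number->string n))
      (number->string n)))

;; Hypothetical helper: join the three components with a separator.
(define (date-join date separator)
  (string-append (number->string (car date)) separator
                 (pad2 (cadr date)) separator
                 (pad2 (caddr date))))

(define (date-dashes date) (date-join date "-"))
(define (date-slashes date) (date-join date "/"))

(print (date-dashes (make-date 2026 6 2)))
(print (date-slashes (make-date 2026 6 2)))
```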

Compiling CHICKEN

To be honest, while writing the program was satisfying, compiling it using modules was rather unpleasant. Initially I wrote everything in a single file, and everything worked fine. When I migrated to modules I realized that CHICKEN inherits all the problems that C has when compiling multiple source files. This makes sense since the CHICKEN compiler, called csc, uses a C compiler under the hood. Unlike with C, there is little information on the web on how to solve CHICKEN compilation issues. Eventually I figured it out, but only after sweating a bit.

Use csm!

Before you get down in the weeds with csc, consider using csm. It automatically resolves dependencies and calls csc, multiple times if necessary, with the right flags (kind of like CMake I guess). Moreover, it echoes the commands it runs and it can generate a Makefile too.

However, I wanted to learn how to use csc from the start because:

  • csm has limited configuration.
  • csm might not be available in a given CHICKEN distribution.
  • I will eventually have to.

Still, csm helped me reverse engineer how csc works.

Maybe use csc?

There is some csc theory that we need to understand before attempting to compile anything. With no flags, csc attempts to build an executable, which it does by first transpiling Scheme to C, and then calling a C compiler. For this project, I want a statically linked executable, but we will allow the CHICKEN standard library to be loaded dynamically. For a “fully static” executable (useful for minimizing dependencies) look into the -static flag; for a dynamically loadable shared object (useful for loading at the REPL) look into the -dynamic flag; and for a C transpilation (useful for porting to systems without CHICKEN installed) look into the -t flag. The deployment section of the manual and the modules section of the manual have some more hints.

First attempt

Clearly, we need object files for each module we import. We can tell csc to generate those with the -c flag. Then we can gather all object files and compile, right? Let’s try it with a Makefile.

CSC       = chicken-csc

# For safe optimization
CSCFLAGS  = -O3

bin/generate-pages: src/generate.scm build/render.o build/utils.o build/types.o build/dates.o
	mkdir -p bin
	$(CSC) $(CSCFLAGS) -o bin/generate-pages \
		build/render.o build/utils.o build/types.o build/dates.o src/generate.scm

build/render.o: src/render.scm build/utils.o build/types.o build/dates.o
	mkdir -p build
	$(CSC) $(CSCFLAGS) -c -o build/render.o src/render.scm

build/types.o: src/types.scm build/dates.o build/utils.o
	mkdir -p build
	$(CSC) $(CSCFLAGS) -c -o build/types.o src/types.scm

build/dates.o: src/dates.scm
	mkdir -p build
	$(CSC) $(CSCFLAGS) -c -o build/dates.o src/dates.scm

build/utils.o: src/utils.scm
	mkdir -p build
	$(CSC) $(CSCFLAGS) -c -o build/utils.o src/utils.scm

If you try this, you will find that the dependencies across modules cannot be resolved, e.g. types cannot find dates or utils. That makes some sense; how does csc resolve imports anyway? To answer that question, we need a deeper understanding of the inner workings of csc.

Interlude: how csc works

(Disclaimer: This subsection is based on what I, a beginner, have learned from reading the less-than-optimal documentation and some online forums. Some of it is probably wrong. Take it with a grain of salt.)

So far I have used the term “compile” as referring to the whole process of turning Scheme source files into executables, or maybe object files. As with C, it helps to separate this process into three stages.

Preprocessing

Whenever the preprocessor sees an (import <module>), it needs to know what procedures and variables <module> exports. By including this information in the preprocessor’s output, a later stage can check for use of undeclared identifiers and take advantage of type information.4 Also, if the module happens to modify syntax or define a macro, it gets expanded at this stage.

Compiling

This is where CHICKEN code gets transformed into machine code (via a C transpilation under the hood). Any unresolved identifiers stay unresolved, but they must be declared before use.

Linking

This stage replaces unresolved identifiers with the memory addresses of the actual functions, which live in other compilation units.

So, as I understand it, we failed at the preprocessing stage: we did not tell types what procedures dates and utils export (this information is not included in object files by default). To solve this problem, C uses header files containing function prototypes and macro definitions. The C compiler looks for these files in its include path, which is nothing more than a list of directories; I will loosely call it the “library path”. What is the CHICKEN equivalent?

Second attempt

CHICKEN has “import libraries” which are the equivalent of header files. Unlike in C, we don’t have to write these explicitly since csc can generate them automatically.5 To do this we use the -J flag, which outputs the “header file”, named <module-name>.import.scm, in the current directory. For other modules to use this file, we need to change their “library path” to point to wherever the “header file” is. The flag -I is the same as in a C compiler.6 All in all, you might come up with this.

CSC       = chicken-csc
CSCFLAGS  = -O3

bin/generate-pages: src/generate.scm build/render.o build/utils.o build/types.o build/dates.o
	mkdir -p bin
	$(CSC) $(CSCFLAGS) -o bin/generate-pages \
		build/render.o build/utils.o build/types.o build/dates.o src/generate.scm \
		-I build

build/render.o: src/render.scm build/utils.o build/types.o build/dates.o
	mkdir -p build
	$(CSC) $(CSCFLAGS) -cJ -o build/render.o -I build src/render.scm
	mv render.import.scm build

build/types.o: src/types.scm build/dates.o build/utils.o
	mkdir -p build
	$(CSC) $(CSCFLAGS) -cJ -o build/types.o -I build src/types.scm
	mv types.import.scm build

build/dates.o: src/dates.scm
	mkdir -p build
	$(CSC) $(CSCFLAGS) -cJ -o build/dates.o src/dates.scm
	mv dates.import.scm build

build/utils.o: src/utils.scm
	mkdir -p build
	$(CSC) $(CSCFLAGS) -cJ -o build/utils.o src/utils.scm
	mv utils.import.scm build

This almost works! All imports are resolved.7 The error is now in the very last step, when we link all of the object files together. From what I understand, this happens because we haven’t told the linker in what order the compilation units are supposed to be run, so it runs them in some arbitrary order and gets confused. To make a long story short, we need to give each compilation unit a name, which can be done with the -unit flag, and explicitly tell the linker which units depend on which, which is done with the -uses flag.
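
As far as I can tell from the manual, both flags have source-level equivalents in the form of declarations, which can be placed at the top of a file instead of being passed on the command line. The fragment below is only a sketch of how the top of src/types.scm might look with that approach; it is not what my project does and it is not a complete, runnable file on its own.

```scheme
;; src/types.scm (fragment). Intended as the source-level equivalent of:
;;   csc -unit types -uses dates -uses utils ...
(declare (unit types))
(declare (uses dates utils))

(module types *
  (import scheme dates utils)
  ;; ... definitions as shown earlier ...
  )
```

I prefer the flags because they keep all the build information in one place, the Makefile.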

Final attempt

At last, we get a working Makefile, to which I added some niceties.

.POSIX:
.SUFFIXES:

CSC       = chicken-csc
CSCFLAGS  = -O3
POSTS     = data/posts
SITEDIR   = site
TEMPLATE  = data/template

site: $(POSTS) $(TEMPLATE) bin/generate-pages
	rm -rf $(SITEDIR)
	cp -R $(TEMPLATE) $(SITEDIR)
	bin/generate-pages $(POSTS) $(SITEDIR)

live: site
	rsync -avvP --delete  $(SITEDIR)/ nagbu_blognagbu@ssh.nyc1.nearlyfreespeech.net:/home/public

live-dry-run: site
	rsync -avvPn --delete $(SITEDIR)/ nagbu_blognagbu@ssh.nyc1.nearlyfreespeech.net:/home/public

bin/generate-pages: src/generate.scm build/render.o build/utils.o build/types.o build/dates.o
	mkdir -p bin
	$(CSC) $(CSCFLAGS) -o bin/generate-pages \
		build/render.o build/utils.o build/types.o build/dates.o src/generate.scm \
		-I build -uses render -uses utils -uses types -uses dates

build/render.o: src/render.scm build/utils.o build/types.o build/dates.o
	mkdir -p build
	$(CSC) $(CSCFLAGS) -J -o build/render.o -c src/render.scm -unit render \
		-I build -uses utils -uses types -uses dates
	mv render.import.scm build

build/types.o: src/types.scm build/dates.o build/utils.o
	mkdir -p build
	$(CSC) $(CSCFLAGS) -J -o build/types.o -c src/types.scm -unit types \
		-I build -uses dates -uses utils
	mv types.import.scm build

build/dates.o: src/dates.scm
	mkdir -p build
	$(CSC) $(CSCFLAGS) -J -o build/dates.o -c src/dates.scm -unit dates
	mv dates.import.scm build

build/utils.o: src/utils.scm
	mkdir -p build
	$(CSC) $(CSCFLAGS) -J -o build/utils.o -c src/utils.scm -unit utils
	mv utils.import.scm build

clean:
	rm -rf build bin $(SITEDIR)

(Small shoutout to NearlyFreeSpeech.NET, who have been great hosting providers; no, I’m not sponsored. The rsync takes a couple of seconds tops.)

Running make rebuilds the site, and make live, well, makes it live. How nice is that?8

Conclusion

And that was how I chickenized my blog. What did we learn?

Multiline raw interpolated string literals

If you are going to attempt something like this, I’d say to choose a programming language with good support for multiline raw interpolated string literals (MRISLs). CHICKEN does have them, but I have some gripes with them.

  • The escape character # is not customizable.
  • You have to indent them all the way to the left to prevent extraneous whitespace.
  • The indent of multiline embedded expressions is not handled automatically (we fixed this with the clever but kinda ugly9 indent procedure).

While I can see CHICKEN adding support for changing the escape character, the other complaints are too harsh, and I’m not aware of any language that solves them (though C# at least attempted it).

Not all is lost. People are considering including MRISLs with customizable escape character in R7RS large edition, see the associated SRFI 267. In addition, CHICKEN provides an API to customize its parser, see the read-syntax module; though I have not experimented with it yet, it may be able to solve some of the problems we encountered.

Frameworks and SSGs

It took one day for me to get a working version of this program, and another day figuring out how to compile it. As a result I now have an extensible system which fulfills all my needs and no more.

Would I have been better off using someone else’s static site generator? Well, I don’t like relying on software which limits what you can do. Using a handcrafted program I can easily add support for RSS, different versions of my content (e.g. translated into other languages, or in text form for terminals), syndication, etc. Would someone else’s SSG support that and whatever else I want to add in the future? Big maybe. And we haven’t factored in all the time spent on learning the tool/framework, which is very specialized knowledge compared with the programming experience I acquired in this project.

For any computer task, if you know how to program, I think you are better off picking whatever language you like and coding up a solution than relying on someone else’s framework. Sure, you have time constraints. But if it is something that you care about enough, like I care about this blog, handcrafting is always the better option.

We want to solve concrete problems, not anticipate the tasks others might have in the future, so we create applications instead of frameworks. We write editors, not text-editing toolkits, we write games instead of engines. We do not generalize unnecessarily, as we will never be able to fully comprehend how our code might be used. Moreover, we often evade our responsibility to solve actual problems by procrastinating and conceiving one-size-fits-all pseudo solutions for future generations that will probably never use them. So, we instead identify a problem that might be addressed by a computerized solution and do nothing but working towards that solution. We do not create abstractions for abstraction's sake but to simplify our current task.
Felix Winkelmann (creator of CHICKEN), The BRUTALIST Programming Manifesto.

  1. I am not a GNU hater, quite the opposite. But their software is subpar.↩︎

  2. You can think of Scheme standards in terms of C standards as follows. Everything before R5RS is pre-ANSI C, R5RS itself is C89, R6RS is C++, and the small edition of R7RS—which is the most current standard at the time of writing—is C99, maybe a bit more. From the looks of it, the large edition of R7RS, whenever it comes out, is going to be C23.↩︎

  3. While I generally dislike OOP, I find it natural in situations like this one. Scheme is nice in that OOP is not mandatory, and one can easily write applications with only some OOP sprinkled in. Due to this, however, a post object is not truly an instance of a page object because Scheme doesn’t have built-in notions of inheritance, so we have to manually enforce that post objects support all operations one wants on page objects. On the other hand, this is not hard to do, and the added flexibility of Scheme makes this tradeoff worth it to me.↩︎

  4. Scheme is dynamically typed, but CHICKEN has some facilities for working with types explicitly (similar to Python). Type checking is not very robust but it does allow for big optimizations. I chose not to use explicit types in my project.↩︎

  5. Remember how at the top of each module we specified all the exports? Here is where that information is used.↩︎

  6. Though I haven’t been able to verify this, I would expect that directories added to the path with -I get the highest priority in library search. Otherwise, we might be in an awkward situation if we accidentally define a module with the same name as another module installed on our machine.↩︎

  7. Even standard library imports are resolved, in essentially the same way. The “library path” (the path that gets added to with the -I option) includes the directory where the CHICKEN standard library is located on your machine (for me that’s /usr/lib/chicken/11). This directory contains, among other things, files of the form *.import.so. I gather that these are like the “header files” we talked about earlier, but in the case of shared libraries these files also include the actual machine code. In that case, I presume that instead of merely specifying that an identifier corresponds to a module, we can give the actual memory address? I don’t understand this very well.↩︎

  8. You might complain that the Makefile is ugly: it has a lot of repetition and the need to explicitly list dependencies for each compilation unit is not ideal. You are welcome to use csm or write a script that generates the Makefile. Personally, I like to keep my Makefiles simple and POSIX-compliant.↩︎

  9. “Clever but kinda ugly”: words I live by.↩︎