Legacy Structure

It’s time to deal with moving new drafts into the website content structure. The command I originally wanted to set up to make this work was:

$ maetl collection -i drafts/redesign

To keep the mapping to what’s already there conceptually simple, I’ve decided to retain the notebooks organisation for now (and solve the design and navigation problems later), so the command becomes:

$ maetl notebook -i drafts/redesign

What I’ll do now is create a new simplified URL scheme, and provide the capability for individual documents to override this with a metadata field (I’m pretty sure this already works in my current publishing setup, but I want to step through everything again from first principles to ensure it’s robust enough to survive the next few years of my manic writing activity).

There are a few things I need to do right away to get this working.

First, I need a way of detecting the format of what’s in the drafts path and identifying what information and metadata are missing before it can be turned into a publishable collection of notes. Next, I need to read everything out of the drafts path and write it into a new path in the maetl.net content structure with the extra metadata for publishing merged in.
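As a rough sketch of the "what's missing" check, assuming metadata arrives as a hash with symbol keys: the `REQUIRED_FIELDS` list and the `missing_fields` helper are hypothetical names made up for illustration, not part of the existing site code.

```ruby
# Hypothetical list of fields a note needs before it can be published.
REQUIRED_FIELDS = %i[url title summary created_at published_at].freeze

# Returns the required fields that are absent or blank in a metadata hash.
def missing_fields(meta)
  REQUIRED_FIELDS.select do |field|
    value = meta[field]
    value.nil? || value.to_s.strip.empty?
  end
end

# A draft that only has a title still needs everything else filled in.
draft_meta = { title: "Motivation", url: "" }
puts missing_fields(draft_meta).inspect
```

Anything this reports as missing is a candidate for a generated default during the import step.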

Here’s what the site content types now look like (where notes is legacy and notebooks is an empty new collection):

essays/
notebooks/  # new structure
notes/      # legacy structure
pages/
projects/
talks/

Extracting content types from drafts

The first thing I do here is a bit of discovery around what I can do quickly within the filesystem APIs (no git or any monstrous databases and apps). I can’t be bothered searching Google for docs and specs, so I’ll fire up irb and use the REPL to find out what data is easily accessible on file objects:

$ irb
> doc = File::Stat.new("drafts/redesign/motivation.md")
> doc.inspect

This tells me I can get the creation date of a document from the birthtime field. I’ll have to assume for now that the fields of the File::Stat struct I’m looking at are macOS specific (which highlights how the code I’m writing here is deliberately meant for use on my local machine only).
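If the script ever needs to run somewhere that doesn't expose a birth time, a hedged fallback might look like the sketch below. `created_time` is a hypothetical helper name. Ruby raises `NotImplementedError` from `File::Stat#birthtime` on platforms without support, and that error is not a `StandardError`, so it has to be rescued explicitly.

```ruby
# Returns the creation time of a file, falling back to the last
# modification time where the platform does not expose a birth time.
def created_time(path)
  stat = File::Stat.new(path)
  stat.birthtime
rescue NotImplementedError
  # birthtime is unavailable here; mtime is the closest approximation.
  stat.mtime
end
```

The modification time is only an approximation of when a draft was started, so this trades accuracy for portability.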

I also want to open up any Markdown files and scan for front matter sections. I’ll ignore other file types for now (I’ll return to this when I come to clean up essays and talks).
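For readers without those utilities to hand, scanning for front matter can be sketched roughly as follows. This is not the site's actual implementation: `split_front_matter` and the `FRONT_MATTER` pattern are hypothetical, and the sketch assumes front matter is a YAML block delimited by `---` lines at the very top of the file.

```ruby
require "yaml"
require "date"

# Matches a leading block delimited by lines containing only `---`.
FRONT_MATTER = /\A---\s*\n(.*?\n?)^---\s*$\n?/m

# Splits a document into [body, metadata]. Documents without front matter
# come back unchanged, paired with an empty metadata hash.
def split_front_matter(source)
  match = FRONT_MATTER.match(source)
  return [source, {}] unless match

  # safe_load refuses arbitrary objects; dates and times are allowed
  # explicitly because front matter usually carries timestamps.
  meta = YAML.safe_load(match[1], permitted_classes: [Date, Time], symbolize_names: true) || {}
  [match.post_match, meta]
end

body, meta = split_front_matter("---\ntitle: Hello\n---\nBody text\n")
puts meta.inspect
puts body
```

Returning the body first mirrors the `text, meta` ordering used when calling `read_yfm`, though the real helper may well behave differently.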

I already have existing utilities within my website repo to do this so I’ll grab them and mash it all together:

include ContentUtils

def import_dir
  unless params[:import].exist?
    raise ArgumentError.new("Import path `#{params[:import]}` does not exist")
  end

  params[:import].realpath
end

def default_note(text, path, stat)
  {
    url: "/#{path.dirname.basename}/#{path.basename(path.extname)}",
    title: path.basename(path.extname).to_s.gsub("-", " "), # strip the extension so ".md" stays out of the title
    summary: text[0, 120].gsub("\n", " "),
    created_at: stat.birthtime,
    published_at: DateTime.now,
    updated_at: DateTime.now
  }
end

def prepare_notebook
  export_dir = CONTENT_PATH.join("notebooks/#{import_dir.basename}")

  mkdir_p(export_dir)

  Dir[import_dir.join("*.{txt,md}")].each do |text_file|
    path = Pathname.new(text_file)
    stat = File::Stat.new(text_file)
    text, meta = read_yfm(text_file)
    frontmatter = default_note(text, path, stat).merge(meta)
    write_yfm(export_dir.join(path.basename), text, frontmatter)
  end
end

I could probably delete the url key here (the static site generator would automatically default to generating a URL from the directory and filename pattern) but it’s fine to leave it for now. It’s a quick script that cleans up a Markdown file, regardless of whether or not it has clean metadata attached.
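The per-document override falls out of the merge order in prepare_notebook: the generated defaults come first, and whatever the document declares wins. A small sketch of that behaviour, with one caveat I'm assuming rather than know: it only works if the parsed metadata uses the same key type as the defaults. `merge_front_matter` is a hypothetical helper that normalises keys to symbols first.

```ruby
# Merges document metadata over generated defaults. Keys are normalised
# to symbols so that "url" from a plain YAML load still overrides :url.
def merge_front_matter(defaults, meta)
  defaults.merge(meta.transform_keys(&:to_sym))
end

defaults = { url: "/redesign/motivation", title: "motivation" }

# The document's own url replaces the generated one; the title default stays.
merged = merge_front_matter(defaults, { "url" => "/why-redesign" })
puts merged.inspect

# Without normalising, a string key sits alongside the symbol key
# instead of overriding it.
stray = defaults.merge({ "url" => "/why-redesign" })
puts stray.keys.inspect
```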

Picking this up in the static site generator is a one-line config change in Ruby to set :notebooks => '.md'.

So I run this and immediately crash into an error:

$ maetl notebook -i drafts/redesign
Errno::ENOENT: No such file or directory @ rb_sysopen - ./templates/layouts/notebooks.html

There’s no template for the new content type. At this point, I have a choice: wrangle config/logic inside the static generator to map the notebooks type to an existing template, or just cp the existing template to a new notebooks.html file. I take the quick option and copy it:

$ cp templates/layouts/notes.html templates/layouts/notebooks.html

So I run the script again, and boom, it works without error. But when I go to preview the generated files, I see only the index.html has been generated. None of the entries are there.

Turns out when rethinking the process of writing drafts, I forgot one very important part of how the static site generator works. Each entry on the site has a status field which must be set to published for the content to appear on the site. If status isn’t specified in the metadata, the entry defaults to draft and is hidden.

In making the draft publishing step into a separate script, perhaps I was subconsciously working towards getting rid of the status field. Right now, it’s being used more for unpublishing old content I don’t want on the site, so it serves a useful purpose even though the naming and organisation feels all wrong now.

I think the compromise here is just to add the field to the front matter of the new entries and call it a day. Now everything is working end-to-end and the site is building correctly.
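To illustrate the status behaviour and the fix, here is a minimal sketch. It assumes symbol-keyed metadata and a generator that treats a missing status as draft. `published?` and `PUBLISH_DEFAULTS` are hypothetical names, not the generator's real API.

```ruby
# Assumption: a missing status means "draft", so the entry stays hidden.
def published?(meta)
  meta.fetch(:status, "draft").to_s == "published"
end

# Hypothetical default merged underneath each document's own metadata,
# so a document can still opt out with an explicit `status: draft`.
PUBLISH_DEFAULTS = { status: "published" }.freeze

imported  = { title: "Motivation" }               # no status in the draft
opted_out = { title: "Scratch", status: "draft" } # explicitly held back

entries = [imported, opted_out].map { |meta| PUBLISH_DEFAULTS.merge(meta) }
visible = entries.select { |meta| published?(meta) }

puts visible.map { |meta| meta[:title] }.inspect
```

Merging the default underneath rather than on top keeps status usable for unpublishing old content.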

The presentation looks terrible, there’s a bunch of missing metadata and I’m still not handling images properly. On the positive side, I can put all this new stuff onto the site now and I’ve refamiliarised myself with how all the legacy content is structured. I’m starting to get a feel for what I might be able to do to clean things up and simplify the workflow.