The Bitspeech Sonnets

The Bitspeech Sonnets is a computer-generated book produced for NaNoGenMo 2019 which uses a constructed language of English-like consonant and vowel fragments to assemble poems based on traditional sonnet rhyme schemes.

Example of a generated sonnet rhyming ‘rongun’ and ‘zagan’

I originally had a more ambitious, simulation-heavy concept for a generated book weaving a non-human narrative out of the interactions between trees in a forest, but this got dropped due to major time restrictions this month. I still hope to revisit this concept in future, but that may have to happen outside of NaNoGenMo.

But I did want to do something, with the constraint that it would have to be smaller and simpler. Towards the end of the month I started thinking about asemic writing and constructed phonetics and syllables. I came across a few interesting precedents for hashing algorithms and pronounceable mapping formats such as Bitspeak by Victor Maia, and the idea began to coalesce into a project that seemed more manageable and didn’t require funneling huge amounts of time and energy into an intricate simulation and emergent narrative experiment with an uncertain outcome.

In all my explorations of generative writing and narrative design, I hadn’t yet dived into poetry generation, so this was also an opportunity to try something new in an area where I hadn’t built up a repertoire of techniques and code libraries and could start fresh with no idea what I was doing.

Language

‘Bitspeech’ is an asemic language where each word is constructed from a hexadecimal number, with the underlying bits mapped onto a lookup table of consonant and vowel pairs. Different dialects of Bitspeech can be formed by changing the available sets of consonants and vowels or changing the underlying bit mapping to use different sized lookup tables.

In the variation used for this project, each pair of hex characters translates to a byte. When split into a pair of 4-bit keys, this gives 16 possible positions in the set of consonants and 16 in the set of vowels, so each byte maps to one consonant and vowel pair.

Bits  Consonant  Vowel
0000  p          a
0001  b          e
0010  t          i
0011  d          o
0100  k          u
0101  g          an
0110  ch         en
0111  j          in
1000  f          un
1001  v          on
1010  l          ai
1011  r          ei
1100  m          oi
1101  y          ui
1110  s          aw
1111  z          ow
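As a rough illustration of how the table could be applied, here is a minimal Python sketch. It is not the project's actual code. The function name `encode` is made up for this example, and two details are assumptions: that the high 4 bits of each byte select the consonant and the low 4 bits select the vowel, and that bytes are read most significant first.

```python
# Lookup tables copied from the table above, indexed by the 4-bit value.
CONSONANTS = ["p", "b", "t", "d", "k", "g", "ch", "j",
              "f", "v", "l", "r", "m", "y", "s", "z"]
VOWELS = ["a", "e", "i", "o", "u", "an", "en", "in",
          "un", "on", "ai", "ei", "oi", "ui", "aw", "ow"]


def encode(number: int) -> str:
    """Map a non-negative integer to a word, one syllable per byte."""
    if number < 0:
        raise ValueError("number must be non-negative")
    # Use the smallest whole number of bytes that holds the value (at least 1).
    length = max(1, (number.bit_length() + 7) // 8)
    syllables = []
    for byte in number.to_bytes(length, "big"):
        consonant = CONSONANTS[byte >> 4]   # assumption: high 4 bits
        vowel = VOWELS[byte & 0x0F]         # assumption: low 4 bits
        syllables.append(consonant + vowel)
    return "".join(syllables)


print(encode(0xB958))
print(encode(0xF055))
```

Under this assumed bit ordering, 0xB958 comes out as ‘rongun’ and 0xF055 as ‘zagan’. Swapping the two halves of each byte would give a different but equally valid dialect.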

Procedure

Turning this into a book of sonnets is not very subtle and mostly uses brute force.

The first step is to aggregate a Bitspeech lexicon from a bunch of sequential and randomly generated integers, providing additional indices for syllable counts and running Double Metaphone on the generated words to get a set of approximations of how the words sound when spoken.
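A lexicon of this kind could be sketched roughly as follows. Everything named here is hypothetical: `build_lexicon`, its parameters and default values, and the fixed seed are choices made for the example. Most importantly, `sound_key` is only a placeholder that upper-cases the word; it is not Double Metaphone, and a real version would swap in a Double Metaphone implementation at that point. The bit ordering in `encode` is also an assumption.

```python
import random
from collections import defaultdict

CONSONANTS = "p b t d k g ch j f v l r m y s z".split()
VOWELS = "a e i o u an en in un on ai ei oi ui aw ow".split()


def byte_length(number):
    """Number of bytes (and therefore syllables) needed for the integer."""
    return max(1, (number.bit_length() + 7) // 8)


def encode(number):
    """One consonant and vowel pair per byte (assumed: high bits = consonant)."""
    data = number.to_bytes(byte_length(number), "big")
    return "".join(CONSONANTS[b >> 4] + VOWELS[b & 0x0F] for b in data)


def sound_key(word):
    """Placeholder for a phonetic encoding. NOT Double Metaphone."""
    return word.upper()


def build_lexicon(sequential=256, randoms=1000, max_value=0xFFFFFF, seed=2019):
    """Collect words from sequential and random integers, with two indices."""
    rng = random.Random(seed)
    numbers = set(range(sequential))
    numbers.update(rng.randrange(max_value + 1) for _ in range(randoms))

    by_syllables = defaultdict(list)  # syllable count -> list of words
    sounds = {}                       # word -> phonetic approximation
    for number in sorted(numbers):
        word = encode(number)
        by_syllables[byte_length(number)].append(word)
        sounds[word] = sound_key(word)
    return by_syllables, sounds


by_syllables, sounds = build_lexicon()
print({count: len(words) for count, words in by_syllables.items()})
```

Because each byte becomes exactly one syllable in this dialect, the syllable count falls out of the byte length and needs no separate analysis.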

Because of the combinatorial explosion given 50,000–100,000 possible words to compare as end rhymes, it’s not viable to precompute lists of rhyming words in the lexicon, so this is done during the assembly of each poem.
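To put rough numbers on that, the count of unordered word pairs grows with the square of the lexicon size. The helper name `pair_count` is just for this example.

```python
import math


def pair_count(lexicon_size):
    """Number of unordered word pairs that a full rhyme table would cover."""
    return math.comb(lexicon_size, 2)


for size in (50_000, 100_000):
    print(f"{size:,} words -> {pair_count(size):,} pairs")
```

That works out to roughly 1.25 billion pairs at 50,000 words and roughly 5 billion at 100,000, whereas scoring candidates against a single root word only touches each word in the lexicon once.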

Poems are assembled starting from a rhyme scheme, where sonnets are assumed to have a structure of 10 syllables per line with end rhymes matching the rhyme scheme (iambic pentameter isn’t possible here because the syllabic and phonetic information is so crude and lacks the concept of stress, but this would be an interesting direction to explore within this overall framework).

For example, the procedure for generating a rhyme scheme for a Petrarchan sonnet of the form ABBA ABBA CDE CDE starts by randomly picking a root rhyming word for each of the unique rhyme labels A, B, C, D and E. Then the full list of end rhymes is selected by scanning the lexicon for the closest rhyming words for each label. In this case, A and B each get four rhyming words assigned and C, D and E get two each.
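The selection step might look something like the sketch below. The names `plan_end_rhymes` and `same_ending` are invented for the example, and it assumes the root word counts as one of the words assigned to its label, which the description above leaves open. The scoring function is passed in as a parameter; `same_ending` is a deliberately naive stand-in that compares spellings.

```python
import random
from collections import Counter


def plan_end_rhymes(scheme, words, score, rng):
    """Return {label: [end words]}, one word per line carrying that label."""
    counts = Counter(scheme.replace(" ", ""))  # ABBA ABBA CDE CDE -> A:4 B:4 C:2 D:2 E:2
    available = list(words)
    plan = {}
    for label, needed in counts.items():
        root = rng.choice(available)
        available.remove(root)
        # Brute force: score every remaining word against the root.
        ranked = sorted(available, key=lambda w: score(root, w), reverse=True)
        chosen = ranked[:needed - 1]  # assumption: the root fills one slot
        for word in chosen:
            available.remove(word)
        plan[label] = [root] + chosen
    return plan


def same_ending(a, b):
    """Naive stand-in score: 1 if the last two letters match, else 0."""
    return int(a[-2:] == b[-2:])


# A tiny hand-made word list, just enough to fill the 14 lines of a sonnet.
demo_words = [c + v for c in "ptkgflrms" for v in ("a", "ow", "an", "un")]
plan = plan_end_rhymes("ABBA ABBA CDE CDE", demo_words, same_ending, random.Random(7))
for label, group in plan.items():
    print(label, group)
```

Removing chosen words from the pool keeps the same word from ending two different lines, which is a design choice for this sketch rather than something the procedure requires.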

Rhymes are scored by matching Double Metaphone representations for each word, starting with the last sound and working backwards. This is crude and doesn’t always make sense, but it’s good enough to get the generator working end to end.
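One way to express that backwards comparison is sketched here. It works on phonetic codes that have already been computed, so it does not call any Double Metaphone library itself. The function names are made up, and treating each word as a pair of codes (primary and secondary) and taking the best match across them is an assumption about how the two representations might be combined. The codes in the demo are hand-written for illustration, not real library output.

```python
def trailing_matches(code_a, code_b):
    """Count matching symbols, starting from the last and working backwards."""
    count = 0
    for x, y in zip(reversed(code_a), reversed(code_b)):
        if x != y:
            break
        count += 1
    return count


def rhyme_score(codes_a, codes_b):
    """Best trailing match across all non-empty code pairs of two words."""
    return max(
        (trailing_matches(a, b) for a in codes_a if a for b in codes_b if b),
        default=0,
    )


# Hand-written illustrative codes: (primary, secondary), secondary may be empty.
print(rhyme_score(("RNKN", ""), ("SKN", "")))
print(rhyme_score(("JM", "AM"), ("SNT", "")))
```

A higher score means more of the ending sounds line up. Words with no sounds in common at the end score 0.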

With the list of possible end rhymes generated, poems are assembled line by line, working back from the last word, filling in the gaps by selecting random words per syllable until the line length constraint is exhausted. The lines are placed in stanzas, according to the rhyme scheme pattern.
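A single line could be filled in along these lines. This is a sketch under assumptions: `assemble_line` is a hypothetical name, the syllable index is a plain dictionary from syllable count to words, and the choice of picking a random word length first and then a random word of that length is one interpretation of ‘selecting random words per syllable’.

```python
import random


def assemble_line(end_word, end_syllables, by_syllables, rng, target=10):
    """Build a line backwards from its end rhyme until the syllables run out."""
    words = [end_word]
    remaining = target - end_syllables
    while remaining > 0:
        # Only word lengths that still fit in the remaining space are eligible.
        sizes = [n for n, group in by_syllables.items() if n <= remaining and group]
        if not sizes:
            raise ValueError("no word fits the remaining syllables")
        size = rng.choice(sizes)
        words.append(rng.choice(by_syllables[size]))
        remaining -= size
    # Words were collected last to first, so reverse them for reading order.
    return " ".join(reversed(words))


# Tiny hand-made index for the demo: syllable count -> words.
demo_index = {1: ["pa", "zow"], 2: ["rongun", "zagan"], 3: ["patipa"]}
print(assemble_line("jamow", 2, demo_index, random.Random(1)))
```

As long as the index contains some one-syllable words, the loop can always finish the line exactly on the target count.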

Example of a generated sonnet rhyming ‘jamow’ and ‘santow’

That’s really all there is to it. As a poetry generator, the sonnets are really only as bad or as good as the underlying data allows. With a lexicon containing more semantically structured data with richer information about syllables and phonetic structure and a more accurate or creative method of scoring rhyme matches, the basic procedure here has the potential to produce really interesting results. But that entails a very different focus and commitment to meaning and expression, rather than asemic writing.

Presentation

While book design is by no means a requirement—or even necessarily desirable—for NaNoGenMo entries, a big motivation for me to work on these projects is the process of presenting the finished work in book form with a cover, front matter and consistent typesetting.

Cover of a generated book: yellow and umber

Cover of a generated book: green and grey

This time round, I kept things relatively simple with a hand-picked palette of potential spot colours for the cover and minimal text formatting. I considered adding a table of contents for the poems, but that seemed like scope creep, though I would definitely consider it if the generated books were smaller and more focused on individual poems, rather than tumescent volumes of text aimed at crossing the 50,000 word boundary.

Further Thoughts

I was happy to get this finished and had a surprising amount of fun making this work, despite the obvious flaws and limitations with rhyme and syllable detection.

The project feels relatively ‘done’ at this point, but there are still plenty of directions to take it further, such as looking at a better rhyming dictionary, mapping syllable stress patterns to embed iambic pentameter effects, and attempting to model discourse-level properties of the traditional sonnet form, like proposition and resolution and the turn in the ninth line.

There are also a few unanswered aesthetic questions to think about. For instance, is Bitspeech really asemic? Do the familiar consonants, vowels and diphthongs imbue the text with unavoidable meaning? Do the words feel like anything?

Thankfully, despite being totally new to poetry generation, I didn’t make too much of a mess or get tangled in knots with the codebase, so the structure here has the potential to be used as a seedwork for further poetry experiments.

If you think you’d be interested in taking this further, please let me know by commenting on the GitHub issue or contacting me on Twitter. I’m happy to go through the code and add comments and documentation to help make it more understandable and easier to build on.