
WordPress and GeSHi Markdown Formatting

Written by Bunkers on March 5, 2017

Yesterday I went through the process of swapping out the Markdown parser used by the Typewriter plugin for Parsedown. The next hurdle I came to was still related to the formatting of code blocks. As far as I can tell, the recommended markup for a code block with the language defined is this:

<pre>
    <code class="language-php">
...

This does seem sensible, but I can be stubborn sometimes and I like using a plugin called WP-Syntax. It uses GeSHi to generate the highlighted markup in PHP, on the server. Most other syntax highlighters I've come across are JavaScript-based, so they do the highlighting on the client.

The problem is that Parsedown outputs the markup above for code blocks, but GeSHi wants these code blocks to look like this:

<pre lang="php">
...

So that's a single pre element rather than a code element embedded within a pre tag. Fortunately Parsedown, because of the nature of its implementation as a single PHP class, is easily extended. You can override the methods of the parser and manipulate the output.

I'm not going to dump all the code in here, but the methods we're interested in are blockCode, blockCodeContinue, blockCodeComplete, blockFencedCode, blockFencedCodeContinue and blockFencedCodeComplete. As the names suggest, three methods are called during the parsing of each kind of code block. The first defines the tags for output, and in this case it is where you can swap the HTML to be a single pre tag rather than a code tag as a child of a pre tag.

The generic code block gets started when the parser comes across a line with an indentation level of four, and the fenced code block by the use of three backtick characters (```), as per the specification. So the first methods check for these situations and set up the HTML that will wrap the subsequent code in the block.
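As a rough illustration of what those first methods need to produce, here is a minimal standalone sketch. It is written as plain functions so it runs without Parsedown loaded; the function names are hypothetical, and in a real subclass the same logic would sit inside the overridden blockCode and blockFencedCode methods. The array keys assume the block layout described in this post, where the element's text is a plain string instead of a nested code element.

```php
<?php
// Hypothetical helper: mirrors what an overridden blockFencedCode() might return.
// $lineText is the current line with its leading indentation already stripped.
function geshi_fenced_block_start($lineText)
{
    // An opening fence: three or more backticks, then an optional language name.
    if (!preg_match('/^(`{3,})[ ]*([\w-]+)?[ ]*$/', $lineText, $matches)) {
        return null; // not the start of a fenced code block
    }

    // A single <pre> element; its text is a string, not a nested <code> element.
    $element = array(
        'name' => 'pre',
        'text' => '',
    );

    if (!empty($matches[2])) {
        // The lang attribute, rather than class="language-...".
        $element['attributes'] = array('lang' => $matches[2]);
    }

    return array(
        'char'    => $lineText[0], // remembered so the closing fence can be matched
        'element' => $element,
    );
}

// Hypothetical helper: mirrors an overridden blockCode() for indented blocks.
// $lineBody is the raw line and $indent its indentation level.
function geshi_indented_block_start($lineBody, $indent)
{
    if ($indent < 4) {
        return null; // not indented far enough to be a code block
    }

    return array(
        'element' => array(
            'name' => 'pre',
            'text' => substr($lineBody, 4), // strip the four-space indent
        ),
    );
}
```

Calling geshi_fenced_block_start with a line of three backticks followed by php gives a block whose element is a pre with a lang attribute of php and an empty text string, ready for the following lines to be appended.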

For both block types the Continue and Complete functions are the same as the original apart from one global change. These functions append content to our HTML elements, but our HTML structure has gone from two elements to one. So everywhere $Block['element']['text']['text'] appears we need to change it to $Block['element']['text'].
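To make that change concrete, here is a hedged sketch of the fenced Continue and Complete steps working on the flattened structure. Again these are standalone functions with hypothetical names, not the plugin's actual code. The logic assumes the original methods behave the way I remember them in the Parsedown version I was using (appending each line with a leading newline, and HTML-escaping the text on completion), so check it against your own copy of the class.

```php
<?php
// Hypothetical mirror of blockFencedCodeContinue() for the single-<pre> structure.
// $line is an array with 'text' (indentation stripped) and 'body' (raw line).
function geshi_fenced_block_continue(array $line, array $block)
{
    if (isset($block['complete'])) {
        return null; // the closing fence has already been seen
    }

    if (isset($block['interrupted'])) {
        // A blank line occurred inside the block; keep it in the output.
        $block['element']['text'] .= "\n";
        unset($block['interrupted']);
    }

    // A closing fence: three or more of the character that opened the block.
    $fence = '/^' . preg_quote($block['char'], '/') . '{3,}[ ]*$/';
    if (preg_match($fence, $line['text'])) {
        // Drop the newline that was prepended to the first code line.
        $block['element']['text'] = (string) substr($block['element']['text'], 1);
        $block['complete'] = true;
        return $block;
    }

    // Was $Block['element']['text']['text'] in the two-element version.
    $block['element']['text'] .= "\n" . $line['body'];

    return $block;
}

// Hypothetical mirror of blockFencedCodeComplete(); assumes the original escapes HTML here.
function geshi_fenced_block_complete(array $block)
{
    $block['element']['text'] = htmlspecialchars(
        $block['element']['text'],
        ENT_NOQUOTES,
        'UTF-8'
    );

    return $block;
}
```

The only structural difference from the nested version is the array path: every read and write goes to $block['element']['text'] directly, because there is no inner code element any more.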

Now we have two classes that render Markdown, and we can decide whether to output GeSHi formatted code blocks or the (probably more) standard version. In the plugin for this I've added an option to the Writing settings page which toggles the version of the Markdown parser I'm using. There are lots of great guides for adding WordPress settings, but I used a variation on the code presented in this one from WP Engineer.

The code in that tutorial takes the approach of having a single option defined with a value that is a serialised array. This is useful for storing a number of settings in a single option, but probably unnecessary in this instance. It's also a question of taste, so you may want to simplify things a bit and store the setting as a single value.

I'll present the full code for the plugin in the next post, where we'll also discuss some of the other syntax highlighting options I could have used (and may still use) that would have saved me from having to do any of this!