Improving WordPress Part 3 – SoC & The Editor (A suggestion to Core)

If you already know all about the WordPress editor, you can skip directly to the section on separation of concern.

The WordPress editor is a curious beast. It’s tightly bound to the core codebase, despite being a completely external project (TinyMCE), and it has given rise to a significant number of hacks and workarounds that try to support the various workflows of different WordPress users. Since 2017 is the year for WordPress core to focus on the editor, I thought I’d put down some thoughts, in the hopes that I might help inform some decisions.

First, let’s talk about what the editor does, and walk through some of the different workflows it supports.

The WordPress editor, at its heart, is just a way to get the body of a post, whether text or HTML, into the database field “post_content”. That field is designed to be the sole, canonical source for WordPress post/page content, whether accessed by excerpt, search, feed, API, or any other method.

The simplest way to insert content into post_content is via “Text Mode”, where the user enters plain text, to be published as-is. Of course, unless the content is wrapped in <pre> tags, it would look terribly unformatted to viewers if loaded directly, so on save and output WordPress runs several regex-based filters, such as wpautop, to produce readable content.

Text mode is useful for bloggers and writers who want to focus on content, and get their work out quickly and cleanly. The fact that it renders as plain text also makes it a breeze to use other editors and either copy/paste when complete, or hook into the WordPress API, and publish content directly from your editor of choice, such as MarsEdit.  However, it leaves the author with very little control over appearance and formatting.

This lack of control is addressed in a few ways. Firstly, for formatting, text mode supports inserting HTML directly into the body of the post. Secondly, dynamic content can be added to the post via “shortcodes”, a BBCode-like language that theme and plugin authors can expand on. Both HTML and shortcodes, when entered, are stored directly in post_content.

Since adding HTML in plain-text isn’t all that intuitive (and because formatting it runs afoul of wpautop), WordPress also provides a visual editor, which renders tags as HTML, and provides easy access to several inline elements and headings.

Now, in 2017, the WordPress core team is working on revamping the visual editor to focus on “content blocks” (user-defined formatted areas). This project is codenamed Gutenberg. However, there is a deep-rooted issue with these plans, which is cross-compatibility. The sad fact is, none of these modes actually work smoothly together.

Pitfalls of the unified post_content system

To maintain the tagless flexibility of text mode, HTML is added to posts at run-time, based on whitespace that would otherwise be ignored. Because of that, it is impossible to effectively and legibly enter raw HTML or even shortcodes in text mode: basic formatting and indenting of HTML (or of shortcodes with content) will cause all kinds of <p>s, <br>s, and &nbsp;s to be inserted into the output. Yes, this can be turned off with filters, but that leaves the two modes incompatible, as you can’t switch from one to the other without changing filters.
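To make the conflict concrete, here is a minimal sketch in Python rather than WordPress’s PHP. The function toy_autop is a hypothetical, heavily simplified stand-in for a whitespace-driven filter; it is not the real wpautop, which has many more rules. It only shows why a filter that reads meaning into whitespace collides with HTML that was indented for legibility.

```python
import re


def toy_autop(text: str) -> str:
    """Toy whitespace filter: blank lines become paragraphs, single newlines become <br />.

    Hypothetical simplification for illustration only; not WordPress's wpautop.
    """
    chunks = [c.strip() for c in re.split(r"\n\s*\n", text.strip()) if c.strip()]
    return "\n".join("<p>" + c.replace("\n", "<br />\n") + "</p>" for c in chunks)


# Plain text comes out the way a writer would expect.
plain = "First paragraph.\n\nSecond paragraph,\nwith a line break."
print(toy_autop(plain))

# Hand-formatted HTML gets paragraph and break tags injected into its structure.
hand_formatted = "<ul>\n  <li>One</li>\n\n  <li>Two</li>\n</ul>"
print(toy_autop(hand_formatted))
```

In this toy version, the list ends up split across two paragraphs with break tags inside it, which is the kind of output that forces authors to write their HTML on one unreadable line.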

Next, visual mode escapes raw HTML that is entered into it. If you switch to text mode to add HTML and then back to visual mode, the code is often “normalized” by the browser, in ways that can absolutely destroy functionality. Certain valid HTML5 constructs (such as links that wrap block-level elements) are not supported by the visual editor, and will be converted to block-level elements that contain links, in a way that often breaks both semantics and accessibility. On top of that, some HTML is filtered out by WordPress’ sanitization functions, arbitrarily removing JavaScript handlers and some advanced attributes. So visual mode is also incompatible with both text mode and raw HTML.

Also, any HTML or shortcodes added to post_content becomes visible to the WordPress search functionality. This can lead to confusion for users, as the shortcode [fusion-gallery class='cold'] might appear on pages that have nothing to do with cold fusion, or the tag <marquee type="odeon"> might appear on pages that have nothing to do with the Odeon Theater’s front visage. In the worst case scenario, this type of search ambiguity can be used for penetration testing, as you can search for strings like [ngg_images, to search for plugins with known security flaws.
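The search problem can be sketched in a few lines of Python. This assumes a simplified search that requires every search word to appear somewhere in the stored content; the sample posts, naive_search, and strip_markup are all hypothetical names made up for this illustration, not WordPress functions.

```python
import re

# Hypothetical sample posts: title -> raw post_content.
posts = {
    "Gallery page": "Photos from the lab. [fusion-gallery class='cold']",
    "Physics news": "New claims about cold fusion were published today.",
}


def strip_markup(content: str) -> str:
    """Remove anything that looks like a [shortcode] or an <html tag>."""
    return re.sub(r"\[[^\]]*\]|<[^>]*>", " ", content)


def naive_search(posts: dict, term: str, ignore_markup: bool = False) -> list:
    """Return titles whose content contains every word of the search term."""
    words = term.lower().split()
    hits = []
    for title, content in posts.items():
        haystack = strip_markup(content) if ignore_markup else content
        if all(word in haystack.lower() for word in words):
            hits.append(title)
    return hits


print(naive_search(posts, "cold fusion"))                      # matches markup too
print(naive_search(posts, "cold fusion", ignore_markup=True))  # matches text only
```

Searching the raw field returns the gallery page purely because of its shortcode name and attribute, while searching a markup-free version returns only the post that is actually about the topic.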

Starting with WordPress 4.4, images added to posts also run through filters to add responsive source sets (formerly RICG Responsive Images). These filters match images based on their class attribute, while images in visual mode are displayed by their src attribute, meaning that an image’s source can be modified in text mode and saved, without modifying its class attribute. As a result, the image can look great to the user at a specific resolution, but other viewers on other devices may be served an entirely different asset. Furthermore, as these image sizes are chosen from the theme’s defined thumbnail sizes, themes with poorly chosen thumbnail sizes can serve extremely undersized images to certain screen sizes, and none of this is presented to the user.

The new idea with content blocks (the Gutenberg project) is to add sections of HTML with special markup into the content, to allow users to insert columns or images or other custom blocks, as defined by themes or plugins. I believe this idea will further compound the core issue that the WordPress editor is dealing with: lack of separation of concern.

What is Separation of Concern, and why is it Necessary?

Separation of concern is a concept in programming (and several other fields) which says “let different pieces of a project serve specific tasks”. Normally, with web design, it means that your HTML should be clean and semantic and clearly describe your site’s structure, your JavaScript should be minimal and well documented and offer additional functionality, and your CSS should be logical and offer visual styling. However, when it comes to WordPress, people usually mean it as “let the theme handle presentation of data, and plugins handle exposition of functionality”.  The problem is that the entire concept breaks down, when it comes to post_content.

post_content only makes sense as the sole source of data for posts when it is only serving raw text. That made sense when WordPress was a new system just for bloggers, and still mostly made sense as individual features were added, each compromising SoC. But in 2017, we’ve gotten to the tipping point: WordPress is no longer a blogging platform with bolt-ons, it’s a website platform with great blogging capabilities. As such, we need a change of approach.

Most people who still use post_content for pages other than blogs are relying heavily on shortcodes and HTML in text mode to meet modern formatting needs. This clutters search, and (as I mentioned in my last “Improving WordPress” post) damages auto excerpts and feeds. In addition, when themes/plugins break or web standards change, they are left with potentially hundreds of pages of broken code. As far as I can tell, content blocks are still subject to this issue.

Those who have abandoned post_content for ACF or Beaver Builder or Widget Areas or flat HTML templates are heavily reliant on plugins that fight the core for every inch, and still fail to properly address problems like search, feed, and excerpts, as well as SEO plugins and other tools that trust post_content to be the primary source of data.

How should the issue be addressed, moving forward?

There are already filters and actions in many places in the code that can override the normal functioning of post_content. As a first step, I propose that those filters be reviewed and holes be filled, to make the WordPress editor into a completely modular construct. All references to post_content would pass through filters before being used, making the editor fully replaceable, not only globally, but on a per-post_type basis. That way, the traditional editor could be maintained for blog posts or API connections, but it could be replaced wholesale wherever something more is needed, without regard for maintaining old functionality. Plugin and theme authors would be informed that direct access to the post_content field was deprecated, and that they should move to the helper functions provided.
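As a rough illustration of this first step, here is a small Python sketch of a filter registry. The add_filter and apply_filters functions are toy stand-ins that borrow the names of the WordPress hook functions; get_content, the hook names, and the “blocks” field are hypothetical and exist only for this example.

```python
from collections import defaultdict

# Toy hook registry: hook name -> list of callbacks.
_filters = defaultdict(list)


def add_filter(hook, callback):
    """Register a callback to run when the named hook is applied."""
    _filters[hook].append(callback)


def apply_filters(hook, value, *args):
    """Pass a value through every callback registered on the hook, in order."""
    for callback in _filters[hook]:
        value = callback(value, *args)
    return value


def get_content(post):
    """Hypothetical helper that replaces direct reads of post_content."""
    value = apply_filters("get_content", post["post_content"], post)
    return apply_filters("get_content_" + post["post_type"], value, post)


def render_landing_page(value, post):
    """Replacement renderer: ignores post_content and builds output from structured data."""
    return "\n".join("<section>" + block + "</section>" for block in post.get("blocks", []))


# Only the (hypothetical) landing_page post type gets the replacement renderer.
add_filter("get_content_landing_page", render_landing_page)

blog = {"post_type": "post", "post_content": "Hello world"}
landing = {"post_type": "landing_page", "post_content": "Intro Pricing", "blocks": ["Intro", "Pricing"]}

print(get_content(blog))
print(get_content(landing))
```

The blog post passes through untouched, while the landing page is rendered by its own code path. Anything that reads content through the helper gets the right output without knowing which editor produced it.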

As a second step, I recommend abstracting all post_content functionality into a class, similar to the existing Walker class for nav menus, that would be an optional value when declaring custom post types. Custom classes would be implemented wholesale, to ensure that all edge cases like search, SEO plugins, API/CLI access, and feeds would be addressed by plugin authors, or would explicitly block access to these methods with a provided explanation (not all forms of editor need to support saving via API or connecting to MarsEdit, for example). This would give a strong path forward to legitimize the already flourishing community of content builder plugins, and give them better options for data storage, as they would have complete control of page rendering.

As the third step, I recommend re-purposing the post_content field as the “plain-text representation” of a post. It would be maintained for fast searches, feeds, and auto-excerpts (depending on the editor class functionality in question), but would no longer be considered the canonical version of a post, in any way. All methods of the editor class that change the post’s content would be expected to update post_content.
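The second and third steps can be sketched together. The following Python is only one possible shape for such a class, under stated assumptions: Editor, ClassicEditor, BlockEditor, EditorError, the “channel” names, and the “editor_data” field are all hypothetical, and nothing here reflects an existing WordPress API.

```python
import re
from abc import ABC, abstractmethod


class EditorError(Exception):
    """Raised when an editor is asked to save through a channel it does not support."""


class Editor(ABC):
    # Channels (admin screen, API, ...) this editor accepts saves from.
    supported_channels = frozenset({"admin"})

    @abstractmethod
    def render(self, data) -> str:
        """Build the HTML that is sent to the browser."""

    @abstractmethod
    def plain_text(self, data) -> str:
        """Build the plain-text representation used for search, feeds, and excerpts."""

    def save(self, post: dict, data, channel: str = "admin") -> dict:
        if channel not in self.supported_channels:
            raise EditorError(f"{type(self).__name__} does not support saving via {channel}")
        post["editor_data"] = data                    # structured, canonical data
        post["post_content"] = self.plain_text(data)  # kept in sync on every save
        return post


class ClassicEditor(Editor):
    supported_channels = frozenset({"admin", "api"})

    def render(self, data) -> str:
        return data

    def plain_text(self, data) -> str:
        return " ".join(re.sub(r"<[^>]+>", " ", data).split())


class BlockEditor(Editor):
    def render(self, data) -> str:
        tags = {"heading": "h2"}
        return "\n".join(
            "<{0}>{1}</{0}>".format(tags.get(block["type"], "p"), block["text"]) for block in data
        )

    def plain_text(self, data) -> str:
        return "\n\n".join(block["text"] for block in data)


blocks = [{"type": "heading", "text": "Pricing"}, {"type": "paragraph", "text": "Plans start small."}]
page = BlockEditor().save({"post_type": "page"}, blocks)
print(page["post_content"])
print(BlockEditor().render(blocks))
```

Because every save goes through the class, post_content cannot be written without the editor regenerating it, and an editor that cannot honor an API save refuses it with an explanation instead of silently storing something it cannot render.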

As the final step, I recommend creating an extensible “content bucket” system, with pre-defined names like “asides”, “pull-quotes”, “features”, “headings”, “images”, etc., that custom editor classes could use as a common storage system. That way, if you switch editors, a lot of your content could come along, and simply be sorted into new buckets. In addition, themes could start to support these buckets in different ways, taking some of the burden of the presentation layer off the editor.
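A bucket system could be as simple as a shared dictionary of named lists. The Python sketch below is an assumption about how that might look; sort_into_buckets, import_buckets, and the “unsorted” fallback bucket are hypothetical names, and the bucket names are taken from the examples above.

```python
# Pre-defined bucket names that every editor class would agree on.
STANDARD_BUCKETS = ("asides", "pull-quotes", "features", "headings", "images")


def sort_into_buckets(items):
    """Group (bucket_name, value) pairs; unknown names fall into 'unsorted'."""
    buckets = {}
    for name, value in items:
        key = name if name in STANDARD_BUCKETS else "unsorted"
        buckets.setdefault(key, []).append(value)
    return buckets


def import_buckets(buckets, understood):
    """Split buckets into what a new editor can use and what it has to leave alone."""
    kept = {name: values for name, values in buckets.items() if name in understood}
    left_behind = sorted(name for name in buckets if name not in understood)
    return kept, left_behind


old_editor_items = [
    ("headings", "Pricing"),
    ("images", "hero.jpg"),
    ("carousel", "slide-1"),  # editor-specific, not a standard bucket
    ("headings", "FAQ"),
]
buckets = sort_into_buckets(old_editor_items)
kept, left_behind = import_buckets(buckets, understood={"headings", "images"})
print(kept)
print(left_behind)
```

The new editor picks up the headings and images it understands, and the editor-specific content stays stored rather than being lost.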

Benefits to the proposed system of abstraction

  • Existing plugin ecosystems are moved in a healthier direction, rather than being abandoned
  • Flexibility for various use cases is maintained, without requiring every feature to be backward compatible
  • True Separation of Concern
  • The ability to update post HTML in a single location, as web technologies evolve
  • The ability to stop parsing HTML with regular expressions, which is a BIG no-no in programming
  • The flexibility to apply post content in new ways that haven’t even been considered yet, without being tethered to old design choices
  • The ability to maintain the interface exactly as-is for legacy users, while still improving their experience by solving the issues of mixing plain-text mode with HTML

Possible objections to the proposed system of abstraction, and rebuttals

  • We don’t get the benefit of “free browser rendering” – Since posts already pass through several regular expressions on the way to the browser, each of which introduces potential errors, and given the existing system of shortcodes, I don’t believe the performance hit is as large as imagined. Several other CMSs (such as Drupal) already assemble pages on the fly. In addition, with complete control over the editor class, there is no reason this abstraction couldn’t also improve integration with full-page cache systems like Varnish, or at least single post-body caches, where all static elements could be pre-rendered in a cache, with markers for where to insert dynamic elements.
  • post_content will likely get out of sync with the actual structured data – This issue can be fairly well mitigated in a class-based system, as editor classes will specifically declare the forms of editing they allow. For example, if you try to edit a post via API when the post_type only supports a visual editor, it will throw an error.
  • Changing editors will lose data – That’s actually the beauty of SoC… changing editors would only lose structure. Content could be rebuilt at any time from the plain-text version. In addition, this issue exists currently with changing themes or plugins. Even if content blocks in Gutenberg are able to exist purely as flat HTML, a plugin or theme change would leave custom blocks as uneditable blobs that don’t render properly. As a further enhancement, as a common language for data storage is created, several common blocks could transition data smoothly between editor classes, without losing either structure or data.
  • There are already filters for this in place – There are some filters in place for parts of this, and they are often applied in a piecemeal fashion by various plugins. A class-based system that extends a common base class, and is required to provide specific functionality, will enforce a holistic view on these editors.
  • This doesn’t help Core – Abstracting the editor could actually do a lot of good for Core, by allowing the team to build Gutenberg and future enhancements without tying their hands to the existing functionality. They could offer the choice of the traditional editor or the Gutenberg editor for pages or posts, or even provide a more visual option for pages only, that doesn’t need to be rooted in the idea of “posts”. In addition, with WordPress’s recent acquisitions of some major plugins, this would open the door to fixing and legitimizing hugely popular plugins like Visual Composer or Beaver Builder or even Divi, so that their large communities are back in line with WordPress ideals, and eventually it could pave the way for acquisition of one or more of the larger players.

Final Thoughts and Disclaimers

I am the primary author of a page builder called Blockade, which uses many of the same concepts that the content blocks team are pursuing. Obviously there could be some conflict of interest involved in my re-steering of project objectives, but that isn’t really the case. I built Blockade to be less of a visual builder than the competition, but I have always known that it is a hack, and isn’t capable of being the polished solution I want. That is not to put it behind other builders. They are all hacks, and so is the Gutenberg plan. Separation of Concern is the only way forward that doesn’t tie developers’ hands in one way or another. If Gutenberg becomes part of core, I will simply turn Blockade into a set of enhancements for that system. If the editor abstraction came into play, I could actually cut loose and build the ideal page builder I’m dreaming of. Either way, there is a path forward for me… but I worry about the path forward for all the websites that have chosen other enhancements to the editor experience, who may not have a path forward with the current plan.


Improving WordPress Part 2 – Better Auto Excerpts and Feeds

WordPress allows most HTML in the post_content field, allowing a fair amount of flexibility for formatting post content for browsers. However, when that content needs to be presented in feeds or excerpts, it is run through a set of sanitizing filters that strips this HTML, leaving bare text.

For the most part this works OK, but when WordPress strips block-level tags like <div>, <blockquote>, or even <br>, nothing is left between the text on either side, so words run together.

It would normally be a fairly simple task to add a regular expression which would add whitespace around such tags, before stripping them. However, WordPress considers the functions in question so integral to the core functionality, that they do not include any filters to hook to at all.
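The regular-expression idea itself is easy to sketch. The Python below is an illustration of the technique only, not the plugin’s code and not WordPress’s PHP; the list of block-level tags and both function names are assumptions chosen for the example.

```python
import re

# Assumed (incomplete) list of tags that should behave like word boundaries.
BLOCK_TAGS = r"div|blockquote|p|br|li|h[1-6]"


def strip_tags(html: str) -> str:
    """Remove every tag, leaving whatever text was around it."""
    return re.sub(r"<[^>]*>", "", html)


def strip_tags_spaced(html: str) -> str:
    """Pad block-level tags with spaces before stripping, then collapse the whitespace."""
    padded = re.sub(rf"</?(?:{BLOCK_TAGS})\b[^>]*>", r" \g<0> ", html, flags=re.IGNORECASE)
    return " ".join(strip_tags(padded).split())


html = "<div>First block</div><blockquote>A quote</blockquote>Line one<br>Line two"
print(strip_tags(html))
print(strip_tags_spaced(html))
```

Plain stripping jams the blocks together, while padding first keeps a space wherever a block boundary used to be. Inline tags such as <b> are left unpadded, so a word that is only partly bold is not split in two.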

To address this shortcoming, I needed to step back a few levels, and rewrite several WordPress core functions from the last place there is a usable hook. To save you all the trouble I went through, here is a single-file plugin that will insert the relevant missing formatting:


Improving WordPress Part 1 – Pass WordPress Admin Notices Across Page Redirects

I’ve been writing a lot of WordPress plugins recently, and one task keeps popping up that doesn’t seem to have a definitive answer in the WordPress core. I’ll need to show a message (confirmation, success, error, notice, warning, info, etc.) to a user, after redirecting them through one of the scripts used to save changes, such as admin-post.php (for example, using the save_post action with a custom metabox).

Because I’m a strong proponent of DRY programming (Don’t Repeat Yourself), I wanted a universal tool to solve the problem, once and for all. I’ve created a small class that I call WP_Persistent_Notices. It’s a singleton, and is pluggable, so there should be no issues with simply including it in a theme or plugin, as is, and not worrying about another theme or plugin also including it.
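The general pattern behind such a tool can be sketched in Python. This is a guess at the shape of the idea, not the WP_Persistent_Notices class itself: PersistentNotices, its methods, and the in-memory dictionary (standing in for whatever per-user storage survives a redirect) are all hypothetical.

```python
class PersistentNotices:
    """Singleton that queues notices per user until the next page view shows them."""

    _instance = None

    def __new__(cls):
        # Every caller shares one instance, so including the class twice is harmless.
        if cls._instance is None:
            cls._instance = super().__new__(cls)
            cls._instance._store = {}  # stand-in for storage that outlives the request
        return cls._instance

    def add(self, user_id, message, level="info"):
        """Queue a notice before redirecting the user."""
        self._store.setdefault(user_id, []).append((level, message))

    def flush(self, user_id):
        """Return the queued notices for display and clear them, so each shows once."""
        return self._store.pop(user_id, [])


# The save handler queues a notice, then redirects.
PersistentNotices().add(7, "Settings saved.", level="success")

# The page the user lands on asks the same singleton for anything pending.
first_view = PersistentNotices().flush(7)
second_view = PersistentNotices().flush(7)
print(first_view)
print(second_view)
```

The notice appears on the first page view after the redirect and is gone on the second, which is the behavior a one-time admin message needs.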


The Facts of Life (politics in a coding blog??)

First things first, I am not a politician, and frankly, I am uninterested in changing anyone’s opinions on how to run this country.  In fact, I’d prefer to be staying out of this field entirely.  However, large numbers of otherwise intelligent people have been making statements recently that have no basis in logic, and I needed to weigh in on this intellectual dishonesty.  We need to talk about the two-party system.

Let me set the record straight for some of the voters out there, especially those considering who the “lesser of two evils” is. I am not going to talk about the qualities of various candidates, because that is shouting into a hurricane. Instead, I’m gonna give you some cold, hard, irrefutable facts about voting.

Do you live in Ohio, Colorado, Iowa, Nevada, New Hampshire, Virginia, or Florida? If not, your vote will not have any effect on the presidency. Your state is not contested, and you are not an elector. You are utterly and completely meaningless to the overall election.


Introducing WP SmartCrop – Intelligent, Responsive Image Cropping for WordPress

I’ve been taking another look at old code recently, and I dusted off a couple on-the-fly smart cropping demos that I wrote, years ago. At the time, they relied on questionable hacks and ran extremely slowly, making them impractical for real-world use.

However, with the rise and widespread adoption of CSS3, and the recent incorporation of responsive image srcsets into WordPress 4.4, the timing seemed right to finally complete the toolkit, and offer WordPress users truly responsive images.