If you already know all about the WordPress editor, you can skip directly to the section on separation of concern.
The WordPress editor is a curious beast. It’s oddly tied into the core codebase despite being built on a completely external project (TinyMCE), and it has given rise to a significant number of hacks and workarounds that try to support the various workflows of different WordPress users. Since 2017 is the year for WordPress core to focus on the editor, I thought I’d put down some thoughts, in the hope that they might help inform some decisions.
First, let’s talk about what the editor does, and walk through some of the different workflows it supports.
The WordPress editor, at its heart, is just a way to get the body of a post, whether text or HTML, into the database field “post_content”. That field is designed to be the sole, canonical source for WordPress post/page content, whether accessed by excerpt, search, feed, API, or any other method.
The simplest way to insert content into post_content is via “Text Mode”, where the user enters plain text, to be published as-is. Of course, unless the content is wrapped in <pre> tags, that content would look terribly unformatted to viewers if loaded directly, so on save and output WordPress runs several regex-based filters, such as wpautop, to produce readable content.
Text mode is useful for bloggers and writers who want to focus on content, and get their work out quickly and cleanly. The fact that it renders as plain text also makes it a breeze to use other editors and either copy/paste when complete, or hook into the WordPress API, and publish content directly from your editor of choice, such as MarsEdit. However, it leaves the author with very little control over appearance and formatting.
This lack of control is addressed in a few ways. Firstly, for formatting, text mode supports inserting HTML directly into the body of the post. Secondly, dynamic content can be added to the post via “shortcodes”, which are a bbcode-like language that theme and plugin authors can expand on. Both HTML and shortcodes, when entered, are stored directly in post_content.
Since adding HTML in plain text isn’t all that intuitive (and because formatting it runs afoul of wpautop), WordPress also provides a visual editor, which renders tags as HTML, and provides easy access to several inline elements and headings.
Now, in 2017, the WordPress core team is working on revamping the visual editor to focus on “content blocks” (user-defined formatted areas). This project is codenamed Gutenberg. However, there is a deep-rooted issue with these plans, which is cross-compatibility. The sad fact is, none of these modes actually work smoothly together.
Pitfalls of the unified post_content system
To maintain the tagless flexibility of text mode, HTML is added to posts at run-time, based on whitespace that would otherwise be ignored. Because of that, it is impossible to enter raw HTML, or even shortcodes, in text mode in a way that is both effective and legible: basic formatting and indenting of HTML (or of shortcodes with content) will cause all kinds of <p>s to be inserted into the output. Yes, this can be turned off with filters, but that leaves the two modes incompatible, as you can’t switch from one to the other without changing filters.
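To make the whitespace problem concrete, here is a minimal sketch in Python (used purely as a neutral sketching language; none of this is WordPress code). The function naive_autop is a hypothetical, heavily simplified stand-in for a wpautop-style filter, and [columns] is a made-up shortcode. The real wpautop is considerably more careful, but the same class of problem appears whenever whitespace is turned into markup.

```python
import re


def naive_autop(text: str) -> str:
    """Toy whitespace-to-paragraph filter. NOT WordPress's real wpautop."""
    # A blank line (possibly containing spaces) is treated as a paragraph break.
    chunks = re.split(r"\n\s*\n", text.strip())
    paragraphs = []
    for chunk in chunks:
        # A single newline inside a paragraph becomes a line break.
        body = chunk.strip().replace("\n", "<br />\n")
        paragraphs.append(f"<p>{body}</p>")
    return "\n".join(paragraphs)


# Plain prose comes out the way a writer would expect.
print(naive_autop("First paragraph.\n\nSecond paragraph."))

# A shortcode with content, laid out legibly with blank lines and indentation,
# gets its opening and closing tags wrapped in paragraphs of their own.
print(naive_autop("[columns]\n\n  Left text\n\n[/columns]"))
```

In this toy version, the legible layout of the shortcode is exactly what triggers the unwanted paragraph tags, which is the trade-off described above.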
Also, any HTML or shortcodes added to post_content become visible to the WordPress search functionality. This can lead to confusion for users, as the shortcode [fusion-gallery class='cold'] might appear on pages that have nothing to do with cold fusion, or the tag <marquee type="odeon"> might appear on pages that have nothing to do with the Odeon Theater’s front visage. In the worst case scenario, this type of search ambiguity can be used for penetration testing, as you can search for strings like [ngg_images to find plugins with known security flaws.
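The search ambiguity can be illustrated with a small Python sketch. Everything in it is an assumption made for illustration: the Post structure, the text_only field (a hypothetical markup-free copy of the post), and the matching rule (every word of the query must appear somewhere in the searched text). It is not how WordPress search is implemented.

```python
from dataclasses import dataclass


@dataclass
class Post:
    title: str
    stored: str     # what sits in the content field, markup included
    text_only: str  # hypothetical markup-free copy of the same post


POSTS = [
    Post(
        "Winter photo gallery",
        "[fusion-gallery class='cold'] Snowy peaks at dawn.",
        "Snowy peaks at dawn.",
    ),
    Post(
        "Cold fusion explained",
        "A short history of cold fusion research.",
        "A short history of cold fusion research.",
    ),
]


def matches(haystack: str, query: str) -> bool:
    """Assumed rule: every word of the query occurs somewhere in the text."""
    hay = haystack.lower()
    return all(word in hay for word in query.lower().split())


def search_stored(posts, query):
    """Search the stored field, so shortcode names and attributes count as hits."""
    return [p.title for p in posts if matches(p.stored, query)]


def search_text_only(posts, query):
    """Search only the text a reader would actually see."""
    return [p.title for p in posts if matches(p.text_only, query)]


print(search_stored(POSTS, "cold fusion"))
print(search_text_only(POSTS, "cold fusion"))
```

Searching the stored field returns the photo gallery for “cold fusion”, purely because of its shortcode; searching the text-only copy does not.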
Starting with WordPress 4.4, images added to posts also run through filters to add responsive source sets (formerly RICG Responsive Images). These filters match images based on their class attribute, while images in visual mode are displayed by their src attribute, meaning that an image’s source can be modified in text mode and saved without modifying its class attribute. As a result, the image can look great to the user at a specific resolution, but viewers on other devices may be served an entirely different asset. Furthermore, as these image sizes are chosen from the theme’s defined thumbnail sizes, themes with poorly chosen thumbnail sizes can serve extremely undersized images to certain screen sizes, and none of this is presented to the user.
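Here is a toy Python model of the mismatch described above. It is a sketch under stated assumptions, not WordPress’s actual filter: the ATTACHMENTS table, the file paths, and the add_srcset function are all hypothetical, and the real filter may perform checks that this version omits. The only detail taken from WordPress is the wp-image-<ID> class naming.

```python
import re

# Hypothetical attachment table: ID -> available renditions (URL, width in px).
ATTACHMENTS = {
    42: [("/uploads/cat-300.jpg", 300), ("/uploads/cat-1024.jpg", 1024)],
}


def add_srcset(img_tag: str) -> str:
    """Toy filter: pick renditions by the wp-image-<ID> class, ignoring src."""
    match = re.search(r'class="[^"]*\bwp-image-(\d+)\b[^"]*"', img_tag)
    if not match or int(match.group(1)) not in ATTACHMENTS:
        return img_tag  # nothing we recognise, leave the tag alone
    renditions = ATTACHMENTS[int(match.group(1))]
    srcset = ", ".join(f"{url} {width}w" for url, width in renditions)
    return img_tag.replace("<img ", f'<img srcset="{srcset}" ', 1)


# The author changed src in text mode, but the class still points at image 42.
edited = '<img src="/uploads/dog-1024.jpg" class="wp-image-42">'
print(add_srcset(edited))
```

In this model the author sees the dog picture from src, while any browser that picks a candidate from srcset is offered the cat.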
The new idea with content blocks (project Gutenberg) is to add sections of HTML with special markup into the content, to allow users to insert columns or images or other custom blocks, as defined by themes or plugins. I believe this idea will further compound the core issue that the WordPress editor is dealing with: lack of separation of concern.
What is Separation of Concern, and why is it Necessary?
post_content only makes sense as the sole source of data for posts when it is only serving raw text. That made sense when WordPress was a new system just for bloggers, and still mostly made sense as individual features were added, each compromising Separation of Concern (SoC), but in 2017 we’ve gotten to the tipping point, where WordPress is no longer a blogging platform with bolt-ons; it’s a Website Platform with great blogging capabilities. As such, we need a change of approach.
Most people who still use post_content for pages other than blogs are relying heavily on shortcodes and HTML in text mode to meet modern formatting needs. This clutters search, and (as I mentioned in my last “Improving WordPress” post) damages auto excerpts and feeds. In addition, when themes/plugins break or web standards change, they are left with potentially hundreds of pages of broken code. As far as I can tell, content blocks are still subject to this issue.
Those who have abandoned post_content for ACF or Beaver Builder or Widget Areas or flat HTML templates are heavily reliant on plugins that fight the core for every inch, and still fail to properly address problems like search, feed, and excerpts, as well as SEO plugins and other tools that trust post_content to be the primary source of data.
How should the issue be addressed, moving forward?
There are already filters and actions in many places in code that can override the normal functioning of post_content. As a first step, I propose that those filters be reviewed and holes be filled to make the WordPress editor into a completely modular construct. All references to post_content would pass through filters before being used, making the editor fully replaceable, not only globally, but on a per post_type basis. That way, the traditional editor could be maintained for blog posts or API connections, but it could be replaced wholesale wherever something more is needed, without regard for maintaining old functionality. Plugin and Theme authors would be informed that direct access to the post_content field was deprecated, and that they should move to use the helper functions provided.
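As a rough illustration of this first step, here is a minimal Python sketch of “every read goes through a filter”. The names add_filter and apply_filters are borrowed from WordPress’s vocabulary, but this registry is a toy written for this post; the hook name get_content, the helper of the same name, the landing_page post type, and the sections field are all hypothetical.

```python
from collections import defaultdict

# Toy filter registry: hook name -> list of callbacks, applied in order.
_filters = defaultdict(list)


def add_filter(hook, callback):
    _filters[hook].append(callback)


def apply_filters(hook, value, post):
    for callback in _filters[hook]:
        value = callback(value, post)
    return value


def get_content(post):
    """Helper that callers use instead of reading post['post_content'] directly."""
    return apply_filters("get_content", post["post_content"], post)


def landing_page_content(value, post):
    """Replace the content wholesale, but only for one post type."""
    if post["post_type"] != "landing_page":
        return value  # leave blog posts and everything else untouched
    return "\n".join(post["sections"])


add_filter("get_content", landing_page_content)

blog = {"post_type": "post", "post_content": "Hello", "sections": []}
page = {
    "post_type": "landing_page",
    "post_content": "",
    "sections": ["<h1>Hi</h1>", "<p>Body</p>"],
}
print(get_content(blog))
print(get_content(page))
```

The point of the helper is that code calling get_content never needs to know which editor produced the post, which is what would make direct field access safe to deprecate.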
As a second step, I recommend abstracting all post_content functionality into a class, similar to the existing Walker class for nav menus, that would be an optional value when declaring custom post types. Custom classes would be implemented wholesale, to ensure that all edge cases like search, SEO plugins, API/CLI access, and feeds would be addressed by plugin authors (or would explicitly block access to these methods, with a provided explanation; for example, not all forms of editor need to support saving via API or connecting to MarsEdit). This would give a strong path forward to legitimize the already flourishing community of content builder plugins, and would give them better options for data storage, as they would have complete control of page rendering.
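A class-based editor along these lines might look something like the following Python sketch. This is one possible shape, not a proposed API: the class names, the supports set, the UnsupportedAccess exception, and the EDITORS mapping (standing in for the optional value passed when declaring a post type) are all hypothetical.

```python
from abc import ABC, abstractmethod


class UnsupportedAccess(Exception):
    """Raised when an editor explicitly declines an access method."""


class Editor(ABC):
    # Access methods this editor agrees to handle; subclasses may narrow the set.
    supports = frozenset({"visual", "api", "feed", "search"})

    def require(self, method):
        if method not in self.supports:
            name = type(self).__name__
            raise UnsupportedAccess(f"{name} does not support '{method}' access")

    @abstractmethod
    def render(self, data):
        """Return the HTML for the page."""

    @abstractmethod
    def plain_text(self, data):
        """Return a markup-free version for search, feeds, and excerpts."""

    def save_via_api(self, data, new_body):
        self.require("api")
        data["body"] = new_body
        return data


class ClassicEditor(Editor):
    def render(self, data):
        return f"<p>{data['body']}</p>"

    def plain_text(self, data):
        return data["body"]


class VisualOnlyEditor(Editor):
    supports = frozenset({"visual", "search"})  # explicitly no API or feed access

    def render(self, data):
        return "".join(f"<section>{block}</section>" for block in data["blocks"])

    def plain_text(self, data):
        return " ".join(data["blocks"])


# Stand-in for declaring an editor class per post type.
EDITORS = {"post": ClassicEditor(), "landing_page": VisualOnlyEditor()}
```

Because render and plain_text are abstract, an editor class cannot be instantiated without addressing both, and an editor that opts out of API access fails loudly with an explanation instead of silently corrupting data.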
As the third step, I recommend re-purposing the post_content field as the “plain-text representation” of a post. It would be maintained for fast searches, feeds, and auto-excerpts (depending on the editor class functionality in question), but would no longer be considered the canonical version of a post in any way. All methods of the editor class that change the post’s content would be expected to update post_content.
As the final step, I recommend creating an extensible “content bucket” system, with pre-defined names like “asides”, “pull-quotes”, “features”, “headings”, “images”, etc., that custom editor classes could use as a common storage system. That way, if you switch editors, a lot of your content could come along, and simply be sorted into new buckets. In addition, themes could start to support these buckets in different ways, taking some of the burden of the presentation layer off the Editor.
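The bucket idea can be sketched in a few lines of Python. The bucket names below come from the list above, except for “body”, which is a hypothetical catch-all added for this example; the migrate function and its fallback rule are likewise assumptions, one of many ways a switch between editors could be handled.

```python
def migrate(buckets, supported):
    """Carry content over to a new editor.

    Buckets the new editor understands are kept as they are. Content from
    any other bucket is folded into a catch-all 'body' bucket, so switching
    editors loses structure but not content.
    """
    kept = {name: list(items) for name, items in buckets.items() if name in supported}
    leftovers = [
        item
        for name, items in buckets.items()
        if name not in supported
        for item in items
    ]
    if leftovers:
        kept.setdefault("body", []).extend(leftovers)
    return kept


old = {
    "headings": ["Welcome"],
    "pull-quotes": ["Less is more."],
    "images": ["sunrise.jpg"],
}

# The new editor only knows about headings and images.
print(migrate(old, {"headings", "images"}))
```

Here the pull-quote survives the switch as ordinary body content, which is the “lose structure, keep content” behaviour the bucket system is meant to allow.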
Benefits to the proposed system of abstraction
- Existing plugin ecosystems are moved into a healthier direction, rather than being abandoned
- Flexibility for various use cases is maintained, while avoiding having to be bound by making all features back-compatible
- True Separation of Concern
- The ability to update post html in a single location, as web technologies evolve
- The ability to stop parsing HTML with regular expressions, which is a BIG no-no in programming
- The flexibility to apply post content in new ways that haven’t even been considered yet, without being tethered to old design choices.
- The ability to maintain the interface, exactly as-is for legacy users, while still improving their experience by solving the issues of mixing plain-text mode with html.
Possible objections to the proposed system of abstraction, and rebuttals
- We don’t get the benefit of “free browser rendering” – Since posts already pass through several regular expressions on the way to the browser, each of which introduces potential errors, and given the existing system of shortcodes, I don’t believe the performance hit is as large as imagined. Several other CMS systems (such as Drupal) already assemble pages on the fly. In addition, with complete control over the editor class, there is no reason this abstraction couldn’t also improve integration with full-page cache systems like Varnish, or at least single post-body caches, where all static elements could be pre-rendered in a cache, with markers for where to insert dynamic elements.
- post_content will likely get out of sync with the actual structured data – This issue can be fairly well mitigated in a class-based system, as editor classes will specifically declare the forms of editing they allow. For example, if you try to edit a post via API when the post_type only supports a visual editor, it will throw an error.
- Changing editors will lose data – That’s actually the beauty of SoC… changing editors would only lose structure. Content could be rebuilt at any time from the plain-text version. In addition, this issue exists currently with changing themes or plugins. Even if content blocks in Gutenberg are able to exist purely as flat HTML, a plugin or theme change would leave custom blocks as uneditable blobs that don’t render properly. As a further enhancement, as a common language for data storage is created, several common blocks could transition data smoothly between editor classes, without losing either structure or data.
- There are already filters for this in place – There are some filters in place for parts of this, and they are often applied in a piecemeal fashion by various plugins. A class-based system, in which every editor extends a common base class and is required to provide specific functionality, will enforce a holistic view of these editors.
- This doesn’t help WordPress.com or Core – Abstracting the editor actually could do a lot of good for the core, by allowing them to build Gutenberg and future enhancements without having to tie their hands to the existing functionality. WordPress.com could offer the choice of the traditional editor or the Gutenberg editor for pages or posts, or even provide a more visual option for pages only, that doesn’t need to be rooted in the idea of “posts”. In addition, with WordPress’s recent acquisitions of some major plugins, this would open the door to fixing and legitimizing hugely popular plugins like Visual Composer or Beaver Builder or even Divi, so that their large communities are back in line with WordPress ideals, and eventually it could pave the way for acquisition of one or more of the larger players.
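The post-body cache mentioned in the first objection, with static elements pre-rendered and markers left for dynamic ones, can be sketched roughly as follows in Python. The marker format, the CACHE and DYNAMIC tables, and the function names are all hypothetical; this is a toy model of the idea, not an integration with Varnish or any real cache.

```python
CACHE = {}

# Hypothetical dynamic elements: name -> function producing HTML per request.
DYNAMIC = {
    "greeting": lambda ctx: f"Hello, {ctx['user']}!",
}


def marker(name):
    """Placeholder left in the cached HTML where a dynamic element belongs."""
    return f"<!--dynamic:{name}-->"


def prerender(post_id, parts):
    """Expensive step, done once per save: join the static HTML and markers."""
    CACHE[post_id] = "".join(parts)
    return CACHE[post_id]


def serve(post_id, ctx):
    """Cheap step, done per request: fill each marker with fresh output."""
    html = CACHE[post_id]
    for name, render in DYNAMIC.items():
        html = html.replace(marker(name), render(ctx))
    return html


prerender(1, ["<h1>Title</h1>", marker("greeting"), "<p>Body</p>"])
print(serve(1, {"user": "Ada"}))
```

Only the marker substitution runs on each request, so the cost of assembling the static parts of the page is paid once rather than on every view.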
Final Thoughts and Disclaimers
I am the primary author of a page builder called Blockade, which uses many of the same concepts that the content blocks team are pursuing. Obviously there could be some conflict of interest involved in my re-steering of project objectives, but that isn’t really the case. I built Blockade to be less of a visual builder than the competition, but I have always known that it is a hack, and isn’t capable of being the polished solution I want. That is not to put it behind other builders. They are all hacks, and so is the Gutenberg plan. Separation of Concern is the only way forward that doesn’t tie developers’ hands in one way or another. If Gutenberg becomes part of core, I will simply turn Blockade into a set of enhancements for that system. If the editor abstraction came into play, I could actually cut loose and build the ideal page builder I’m dreaming of. Either way, there is a path forward for me… but I worry about the path forward for all the websites that have chosen other enhancements to the editor experience, who may not have a path forward with the current plan.