The Gutenberg that could have been

Posted on September 5, 2017 by gschoppe

I have been very vocal in the WordPress community about the fundamental issues I see with the new visual editor being bundled with version 5.0. One response I keep hearing is “how would you do it differently?” So, I thought I’d outline a hypothetical roadmap for the Gutenberg that might have been.

Just for those who haven’t been reading my previous posts, my most fundamental issues with Gutenberg fall into just five topics (if you are familiar with the issues, feel free to skip ahead):

WordPress posts should be stored in a universal Structured Data format
Block-based editing means that posts are structured into blocks (obviously). The current format of post_content is a series of hacks layered on top of the original plain-text implementation, which is fundamentally unstructured. By definition, this format cannot preserve all the details of the original data: it is literally incapable of supporting lossless, structured data, no matter how many new elements are introduced. Structuring data in a JSON/Mobiledoc format would benefit the existing data, as well as setting WordPress up for the next decade. Benefits include:
1. Images can be re-rendered from ID on the fly, so if thumbnail sizes change or filenames get updated, or images get deleted, the post will be able to adapt. Oh, and say goodbye to mixed content warnings when a site gets migrated to https.
2. Links can be re-rendered from the ID on the fly, so if permalinks change or posts get deleted, the post can adapt.
3. Embeds can be modified to match the source protocol, and can be regularly tested for failure.
4. HTML structure can’t be accidentally broken in text-mode.
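
To make the first two benefits concrete, here is a minimal sketch in Python (not PHP, and not the actual Mobiledoc schema). The node types, the ATTACHMENTS and PERMALINKS tables, and the function names are all made up for illustration. The point is only that a node storing an ID can be re-rendered against whatever the site looks like today.

```python
from html import escape

# Hypothetical lookup tables standing in for the attachment and post records.
ATTACHMENTS = {37: {"path": "/uploads/2017/09/cat.jpg", "alt": "A cat"}}
PERMALINKS = {12: "/about-us/"}


def render_children(node, site_url):
    return "".join(render_node(child, site_url) for child in node["children"])


def render_node(node, site_url):
    """Render one node of a simplified, made-up structure to HTML."""
    kind = node["type"]
    if kind == "text":
        return escape(node["value"])
    if kind == "paragraph":
        return "<p>%s</p>" % render_children(node, site_url)
    if kind == "image":
        attachment = ATTACHMENTS.get(node["id"])
        if attachment is None:
            return ""  # image was deleted: emit nothing rather than a broken tag
        src = site_url + attachment["path"]  # scheme follows the current site URL
        return '<img src="%s" alt="%s">' % (escape(src), escape(attachment["alt"]))
    if kind == "link":
        inner = render_children(node, site_url)
        path = PERMALINKS.get(node["post_id"])
        if path is None:
            return inner  # target post is gone: keep the text, drop the anchor
        return '<a href="%s">%s</a>' % (escape(site_url + path), inner)
    raise ValueError("unknown node type: %s" % kind)


def render(nodes, site_url):
    return "".join(render_node(node, site_url) for node in nodes)


POST = [
    {"type": "paragraph", "children": [
        {"type": "text", "value": "Read more "},
        {"type": "link", "post_id": 12,
         "children": [{"type": "text", "value": "about us"}]},
    ]},
    {"type": "image", "id": 37},
    {"type": "image", "id": 99},  # refers to a deleted attachment
]
```

Calling render(POST, "https://example.com") after a migration produces https URLs without touching the stored post, and the image with the missing ID simply disappears from the output.
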
WordPress shouldn’t marginalize third-party admin tools, especially while touting the API’s ability to revolutionize editing
The new block-based editor provides a first-class experience to the core editor, but leaves everyone else dealing with HTML comments, which (other than the existing more and page comments) will not be rendered at all by any HTML-based visual editor. Shortcodes, while a terrible long-term solution, are a structure all editors have had to deal with already, so they are guaranteed not to break those editors’ interfaces. This interface could be upgraded as new API endpoints are introduced, to consume the real, structured data.
WordPress needs to actively reduce the number of hacks involved in the editing process, not increase them
As things stand, post_content is processed by several functions before being output. Here are a few of them, and how they could be implemented with real structured data:
1. wpautop turns whitespace into paragraph and break markup. In a real structured format, each p or br would be stored as a separate node in the structure. This preserves whitespace, so it can be used to add clarity in certain views where necessary (like in pre or code blocks, which can be isolated nodes with raw text inside, or in poetry).
2. Shortcodes are parsed by a complicated and error-prone regular expression. In a real structured format, these could be stored as a custom type of node, massively simplifying the parsing process and allowing for a Shortcake-like visual interface for editing them.
3. capital_P_dangit just makes no sense. This should be a spellcheck option, not a filter on the content.
4. RICG responsive images use yet another messy regular expression. Making images a type of node in the structure will make safe modification of image tags trivial.
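
As a rough illustration of points 1 and 2, here is a Python sketch with hypothetical node types and a hypothetical HANDLERS registry. Paragraphs, preformatted text, and shortcodes are each their own node, so nothing has to guess at whitespace or scan the text with a regular expression.

```python
from html import escape


def gallery(attrs):
    # Hypothetical handler; a real one would look up the images by ID.
    return '<div class="gallery" data-ids="%s"></div>' % escape(attrs["ids"])


# Hypothetical registry mapping shortcode names to handlers.
HANDLERS = {"gallery": gallery}


def render(nodes):
    out = []
    for node in nodes:
        kind = node["type"]
        if kind == "paragraph":
            out.append("<p>%s</p>" % escape(node["text"]))
        elif kind == "preformatted":
            # Raw text is kept exactly as stored; no whitespace guessing.
            out.append("<pre>%s</pre>" % escape(node["text"]))
        elif kind == "shortcode":
            handler = HANDLERS.get(node["name"])
            # Unknown shortcodes render as nothing instead of leaking [tags].
            out.append(handler(node["attrs"]) if handler else "")
        else:
            raise ValueError("unknown node type: %s" % kind)
    return "\n".join(out)


DOC = [
    {"type": "paragraph", "text": "Some photos:"},
    {"type": "shortcode", "name": "gallery", "attrs": {"ids": "1,2,3"}},
    {"type": "preformatted", "text": "line one\n\n  indented line"},
]
```

The blank line inside the preformatted node survives rendering, which is exactly the case a whitespace-to-paragraph filter has to special-case.
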
WordPress posts should be rendered differently in different contexts
WordPress search should only deal with the text of the content, and auto-excerpts should respect whitespace around blocks. Feeds and AMP pages need to render in a manner compatible with the limitations of the format. Emails need to render with ridiculous 1990s-era HTML. These are not secondary interfaces to be rendered from the “true” HTML version. They are equally important representations of the data that should be able to be produced from the common structure. This also would support future, unknown formats.
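
Here is a small Python sketch of that idea, using a made-up block list and hypothetical function names. One structure feeds three independent renderers, and none of them is derived from another's output.

```python
from html import escape

# A made-up, flat block list; real content would nest.
DOC = [
    {"type": "heading", "text": "Hello"},
    {"type": "paragraph", "text": "First paragraph."},
    {"type": "image", "src": "/cat.jpg", "alt": "A cat"},
    {"type": "paragraph", "text": "Second paragraph."},
]


def to_html(doc):
    parts = []
    for block in doc:
        if block["type"] == "heading":
            parts.append("<h2>%s</h2>" % escape(block["text"]))
        elif block["type"] == "paragraph":
            parts.append("<p>%s</p>" % escape(block["text"]))
        elif block["type"] == "image":
            parts.append('<img src="%s" alt="%s">'
                         % (escape(block["src"]), escape(block["alt"])))
    return "".join(parts)


def to_search_text(doc):
    # Text only; images contribute their alt text.
    words = [block.get("text") or block.get("alt", "") for block in doc]
    return " ".join(w for w in words if w)


def to_excerpt(doc, max_words=4):
    # Blocks are joined with a space, so words never run together
    # across block boundaries.
    words = []
    for block in doc:
        if block["type"] in ("heading", "paragraph"):
            words.extend(block["text"].split())
    excerpt = " ".join(words[:max_words])
    return excerpt + "..." if len(words) > max_words else excerpt
```

A feed, AMP, or email renderer would be one more function over the same DOC.
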
React is too problematic for the infinite use cases supported by WordPress
Many users have weighed in on the issue or non-issue of React’s Patent Clause. Regardless of personal opinions on the subject, some companies’ legal departments are unwilling to accept the terms of that license. This is an unacceptable limitation for the most widely-used CMS on the market. WordPress will lose customers if it embeds React. At the very least, core needs to choose Preact by default, and allow users to swap in React or another clone by dequeue/enqueue, if they so prefer. The core team claims this decision is still up in the air for Gutenberg, but with a React-based feature plugin rolling toward integration with core in the beginning of 2018, it’s clear that the decision is being made by de facto usage.

Now that the issues and benefits have been laid out, let’s talk about a potential path forward.

The Roadmap that Wasn’t

WordPress 4.9

4.9 would be treated as a de facto LTS release, as it is the last version before Gutenberg changes begin, and it would continue to receive security updates for a minimum of 3 years. This would calm the nerves of large businesses worried about the fast changes that are to come, as well as supporting those who simply can’t be served by Gutenberg. Baseline opt-in telemetry is introduced, as well as plugin and update signing, just to make sure we have a strong footing in security for our business users for the next few years.

WordPress 4.10.0

WordPress 4.10 would be all about setting the stage for Gutenberg. Every connection to the admin, from the API to XML-RPC to WP-CLI to native functions, is given an update/create_post_content function or an update/create_post function that sits between the user and the raw post_content field. A new set of endpoints is also added on the back end to allow users to update or receive a structured view of the data. This version of the data will match the MobileDoc format, but for now will be rendered from the existing post_content. Developers are informed that directly editing post_content will no longer work as of WordPress 5.0, and that they should start updating their code to use these new endpoints. This would be about nine months out from the 5.0 release.
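
To sketch what that in-between layer might look like, here is a toy Python version. The PostStore class, its method names, and the paragraph-only parser are assumptions for illustration, not a proposal for the real API or the real MobileDoc shape. In this phase the structured view is derived from post_content on read, and structured writes are flattened back into it.

```python
import re


def structure_from_legacy(post_content):
    """Derive a structured view from legacy post_content.

    Toy parser: treats blank-line-separated chunks as paragraphs.
    """
    chunks = re.split(r"\n\s*\n", post_content.strip())
    return [{"type": "paragraph", "text": c.strip()} for c in chunks if c.strip()]


def legacy_from_structure(nodes):
    """Flatten the structured view back into legacy post_content."""
    return "\n\n".join(n["text"] for n in nodes if n["type"] == "paragraph")


class PostStore:
    """Hypothetical gatekeeper: all reads and writes go through here."""

    def __init__(self):
        self._rows = {}  # post_id -> raw post_content

    def update_post_content(self, post_id, post_content):
        self._rows[post_id] = post_content

    def get_post_structure(self, post_id):
        # For now the structure is rendered from the existing post_content.
        return structure_from_legacy(self._rows[post_id])

    def update_post_structure(self, post_id, nodes):
        self.update_post_content(post_id, legacy_from_structure(nodes))


store = PostStore()
store.update_post_content(1, "Hello\n\nWorld\n")
```

Because nothing touches the raw field except the store, the storage format behind it can change later without breaking callers.
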

WordPress 4.11.0

In 4.11, the public would get their first taste of Gutenberg. It is a visual refresh to the editor that maintains feature parity, but now stores structured data through the standard structured endpoints. That structured data, however, still also ends up parsed into post_content as our existing monstrous mishmash of formats, for now. There are also new filters and actions available for rendering that structured data for feeds and excerpts. A ‘cache clear’ action allows existing editors to update the structured data from the unstructured data, as a temporary shim. We stress that Gutenberg is coming into its own in the next version, and plugins need to be updated. All plugins should be able to use the structured data alone to communicate with the database. A cron regularly checks the integrity of the structured and unstructured forms of the post against each other, and adds a warning in the admin panel if they get out of sync, informing administrators that some part of the code on their site is not ready for WordPress 5, and giving them the option to resync the structured data by parsing post_content. Eager developers could get started on alternative editors that consume the same MobileDoc format, as the format implementation is feature complete. This would be three months from the 5.0 release.
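
The integrity check could be as simple as re-rendering the structure and comparing fingerprints. This Python sketch assumes a paragraph-only structure and hypothetical field names (post_content, post_structure); a real check would use the real legacy renderer.

```python
import hashlib


def render_legacy(structure):
    # Stand-in for the real structure-to-post_content renderer.
    return "\n\n".join(n["text"] for n in structure if n["type"] == "paragraph")


def fingerprint(text):
    # Collapse whitespace so formatting-only differences are not flagged.
    normalized = " ".join(text.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def find_out_of_sync(posts):
    """Return IDs of posts whose two stored forms no longer agree."""
    return [
        post["id"]
        for post in posts
        if fingerprint(render_legacy(post["post_structure"]))
        != fingerprint(post["post_content"])
    ]


POSTS = [
    {"id": 1, "post_content": "Hello\n\nWorld",
     "post_structure": [{"type": "paragraph", "text": "Hello"},
                        {"type": "paragraph", "text": "World"}]},
    # Some plugin wrote to post_content directly and skipped the endpoints.
    {"id": 2, "post_content": "Edited directly",
     "post_structure": [{"type": "paragraph", "text": "Original text"}]},
]
```

The IDs returned by find_out_of_sync(POSTS) are the posts that would trigger the admin warning and the offer to resync.
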

WordPress 5.0.0

Gutenberg launches with full block support, probably based on Slate.js and Preact. No TinyMCE or contenteditable to be found! Gutenberg is limited in scope to the editor area, rather than the entire page, but has the ability to implement blocks that store to metadata or other custom storage, and default blocks can be locked in place on certain post types, making them just a standardized interface for custom metadata (although the existing areas are not removed; deprecation of those can wait until Gutenberg has had time to gain support). All standard post data is stored in the structured format, in a database field called post_structure. The old post_content is now a flat text version of the content that is used only to power search. It has a parser, just like HTML content, feed content, and email content, so search can get an optimized version of the keywords and phrases. A set of filters exists to allow users to save other fields into post_content, so complex search structures involving categories and meta are now trivial to implement. A set of helper functions is introduced to parse blocks to/from the legacy shortcode format, using PEG parsers (generated by hand, not via beta versions of software). No comment structure exists in this compatibility layer, just legacy formats. These functions are slated for eventual deprecation, in favor of raw consumption of the structured format, but they will help third-party editors get through the transition. The cron check for bad data in the post_content is retained for now, with more forceful wording.
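
As a sketch of the search side, here is how a flat post_content could be built from the structure, with filters adding other fields. The filter registry, the function names, and the post shape are all hypothetical, and this is Python rather than WordPress's actual filter API.

```python
# Hypothetical filter registry, loosely modelled on the idea of WordPress filters.
SEARCH_FILTERS = []


def add_search_filter(fn):
    SEARCH_FILTERS.append(fn)
    return fn


def build_search_text(post):
    """Build the flat text that would be saved into post_content for search."""
    parts = [node["text"] for node in post["structure"] if "text" in node]
    for search_filter in SEARCH_FILTERS:
        parts = search_filter(parts, post)
    return " ".join(parts)


@add_search_filter
def include_categories(parts, post):
    # Make category names searchable alongside the body text.
    return parts + post.get("categories", [])


POST = {
    "structure": [
        {"type": "paragraph", "text": "Hello world"},
        {"type": "image", "id": 37},  # no text, so it adds nothing to search
    ],
    "categories": ["news", "wordpress"],
}
```

A plugin that wants meta values searchable would register one more filter, rather than rewriting the search query.
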

WordPress 5.1.0 and Beyond

Gutenberg’s scope is slowly increased to incorporate other functionality on the site. Once the parallel metabox structure in Gutenberg has gained support, we gradually phase out the old-style metaboxes. Next we integrate the customizer panel into Gutenberg. Eventually we might even see an entire admin interface based on Gutenberg components. However, we take that scope creep very slowly and reevaluate at every step. Even if that never occurs, the fundamental data structure of WordPress is far stronger, allowing third-party plugins to create entirely new admin interfaces, and allowing core to try things out in various ways, to see what the market responds to. Of course, telemetry is key in all of this.

To be clear, this is not the Gutenberg we’re getting, and to me, at least, that is deeply unfortunate.

23 Comments

Dac Chartrand says:

September 5, 2017 at 9:55 pm

Serialized PHP data in full-text DB columns is also a hack; use JSON columns instead.

Hindsight is 20/20.

- gschoppe says:
  
  September 5, 2017 at 10:23 pm
  
  I absolutely agree with that statement, and if core was given the proper attention, I believe an option to replace serialization with JSON encoding would already exist. These things could be shimmed and replaced over time, however the level of hack involved in post_content is light years beyond choosing the wrong form of serialization for options and postmeta, and repairing it is directly in line with the stated goals of 2017.
  
  - Dac Chartrand says:
    
    September 5, 2017 at 10:48 pm
    
    I guess what I was trying to say is that WordPress got pretty far on regular expressions in spite of technically superior choices being widely available (New Jersey style?🙂 ). I’m looking at your proposed Mobiledoc vs how the PEG parser works. On docs and specs alone, clearly the first is “better” and more thought out but check out that link. However ugly, assuming it works, PEG with HTML comments is easy to grasp (and JSON). It warrants further discussion.
    
    That said, I just wanted to say that this article and the previous are great. I work on a plugin that uses WP as a dev platform and we are trying, working even, on figuring out what we are supposed to do with Gutenberg. Your “in the trenches” version of things matches more how I feel than the executive vision. Hopefully some of it is taken to heart.
    
    Good times.
    
    - gschoppe says:
      
      September 5, 2017 at 11:41 pm
      
      It’s funny that you mention “Worse is Better” because that’s the feeling I get when I imagine the WordPress ecosystem with React at its core. It’s clearly a better structure than our ad-hoc solutions with PHP and whatever JS people pile on, but it may have too high a barrier to entry or too much of a prescriptive nature to facilitate the continued growth of the ecosystem.
      
      I do agree with you entirely that when Gutenberg comes out (inevitably without a real structure), I’ll do what I can to keep working with it. It will just be one more hack in the list of hacks I have to keep in mind when developing. Hopefully it won’t be the straw that breaks the camel’s back and leads to me forking Ghost to make a CMS out of it.
      
      But there is no doubt in my mind that however far we get on cobbled-together data structures, we could get further, faster on a real JSON format, and it would encourage much better growth in the ecosystem.
      
Christopher A says:

September 6, 2017 at 8:20 am

I think we could go one step further and solve almost all of the issues by introducing wp_blocks, wp_blockmeta and wp_block_relationship tables. This would allow us to continue to store raw HTML for each block, make each block searchable using a boolean column, allow for blocks to be reused across posts and widgets, and allow for block updates without parsing through every single post.

- gschoppe says:
  
  September 6, 2017 at 8:33 am
  
  I think this could be an excellent v2 iteration. The reason I say v2 is that I believe it would be a simple extension to a strong json format to create a block that would support partials and templates that are stored elsewhere.
  
  The majority of blocks should only be storing page-specific data like “this image is ID 37” or “this paragraph has an em tag as a child.” Breaking out to block tables at that level of granularity would be more of a negative than a positive, as it would have a massive impact on performance, for minimal gains.
  
  Even with more complex blocks like slideshows or recent posts lists, each post should only be storing the fact that the block was used, and whatever page-specific data is necessary to render it in context.
  
  From there, in the v2 expansion, reusable sections could be saved as partials and templates, and included as blocks, so the template could be modified later. This would give the best of both worlds, as these reusable blocks could have options too, allowing the template to expose editable fields on a per-page basis. And allowing a partial or template to contain multiple blocks, so theoretically users could build their own.
  
  The Block analogy kind of breaks down at this point, which is why I prefer using the Atomic Design language. By that standard, v1 would introduce atoms (simple nodes like textnode and dynamic elements that don’t nest), molecules (wrapping nodes like divs or even strongs and dynamic elements that contain other elements), and pages (exactly what you think they are). v2 could add support for organisms (reusable partials made up of groups of blocks) and layouts (reusable page templates, rendered from blocks).
  
  - kaiser says:
    
    September 9, 2017 at 5:50 pm
    
    If you would leave out molecules and organisms, it might be easier to wrap your head around. Both naming wise as well as regarding the “reusability” part. Everything else gets too complicated and (imho) raises the entry barrier.
    
    - gschoppe says:
      
      September 9, 2017 at 6:35 pm
      
      Not sure I follow…
      
      Molecules would be any block that can contain another block, which would be necessary for features like columns or metabox blocks.
      
      Organisms would be saved collections of blocks that can be inserted together… in layman’s terms “snippets” or “Template Partials”. Most page builders support partials now, and they get a fair bit of use.
      
John Teague says:

September 6, 2017 at 6:39 pm

Informative and instructive.

I talked about the need to create an LTS version and move on to a complete rework of WordPress from the ground up including JSON, meta, options, builder, and moving to a fully REST compliant model a few years ago. You can imagine the reaction, in equal parts disdain, silence, polite pats on the head, and fork challenges.

You explained it much better than I did.

- gschoppe says:
  
  September 6, 2017 at 7:24 pm
  
  To be honest, a forked “NextPress” version would be very hard to build solo, but it wouldn’t take too many dedicated developers to have a pretty good core team.
  
  - Kalen Johnson says:
    
    September 6, 2017 at 8:43 pm
    
    Start a Slack room, and they will come….
    
  - fwolf says:
    
    September 8, 2017 at 12:16 am
    
    Call it FourthWord or FourthPress, based upon the “four” in the current version number, incorporate all future fixes, and then continue over to some kind of NextPress, that might go further into creating more stable data structures, while preferably also keeping the i18n issues in mind.
    
    FourthWord / FourthPress is the 4.10+ stable release that tries to cater to the massive existing WordPress developer and CMS end-user community; NextPress is the more flexible, improved 5.x release you are suggesting in your fictive Gutenberg road map.
    
    Oh yes, and I’m totally prepared for heavy-duty work on a fork. Been prepared since the pest called Theme Customize(r) reared its ugly head at us 😉
    
    cu, w0lf.
    
    - gschoppe says:
      
      September 8, 2017 at 12:23 am
      
      I think there’s room for both under the same fork. If you picture it as similar to the Drupal or Magento codebase, the major releases can contain quite large breaking changes, because the previous versions still get security updates and backports of those features that are relevant to them. Basically XPress4 would be compatible with WordPress 4 and would continue to get updates beyond 4.10 to keep it stable, and XPress5 would be the new, improved structure that is still concerned with back compatibility, but willing to move forward.
      
John Teague says:

September 6, 2017 at 8:50 pm

Personally, I think it’s time to consider it seriously. With Automattic moving quickly to grab up self-hosted business market share with .com and Jetpack, there’s an opportunity here. So, I’d be up for that. And I’d be willing to donate resources for it. No Trac please.

- gschoppe says:
  
  September 6, 2017 at 8:55 pm
  
  Definitely GitHub… Worth a thought. The first version could just be a rebranded WP 4.9, that would receive long term support. It could be a lifeline to a lot of Enterprise clients.
  
  - Adrianne says:
    
    September 7, 2017 at 2:17 pm
    
    OMG yes. Gutenberg is going to destroy my edu clients who 1) already struggle with putting regular content into their site and 2) don’t have the budget to redo their websites when the new version breaks it.
    
    Part of me understands why Matt is trying to make WP more accessible to people via a page-builder like interface, but as someone commented on your earlier article, there are still a TON of other things that make WP inaccessible to people that choose Wix, SquareSpace, etc over WP.
    
    I’m really debating if now is the time to jump ship and learn a new CMS… I just don’t know.
    
The Saint Michael says:

September 7, 2017 at 9:48 am

I don’t like how it is developing, not at all. It will give users/owners too much power to decide over design and layout, by default.

And when they have that power, count on it, websites will be ruined. Last time I was stupid enough to give an owner the choice of Google fonts, it ruined the website completely: he chose some idiotic fonts popular for restaurant menus and price lists, but now used for text. Later there was no brake, no step back, no corrective move. Just a trivial example.

Next, all this talk about writing flow, here, there. It is bull. Maybe employees at Automattic live in this world, but surely not the big majority of WordPress users. I will tell them how it works:

– No serious, registered web developer/company in Europe will build a website for under 900 €. Maybe students.
– Add 200 € more to it and you have the monthly salary of the majority of people in Europe.
– They work one month very hard for it.
– They pay at least 900 € for a basic personal/company presentation website, more likely 1500 – 2000 €.
– They are scared to death when they are inside WordPress admin, afraid to ruin something so expensive, something they gave so much money for.
– They do not care about Gutenberg blocks.
– They do not care about “writing flow” and all the other fancy talk to sell Gutenberg better.

- gschoppe says:
  
  September 7, 2017 at 10:17 am
  
  It’s interesting, because we have clients that fall to both sides of this spectrum, but I agree entirely about how bad unchecked control over design/layout could be for users.
  
  Our clients want different things in different contexts, ranging from:
  
  – Total control with all available layout tools
  – Some subset of layout tools, but not all
  – The ability to edit the text content or images in layouts, but not to modify them
  – Predefined sections that can be strung together in various orders
  – Predefined layouts that must be filled out in the same way every time
  
  and, of course, these needs often vary across the site based on several different parameters, like:
  
  – Post Type
  – Category/Taxonomy Term
  – Post Format
  – Post Author
  – Post Status
  – Random metadata entry
  – User Capabilities
  
  At the moment, Gutenberg supports whole-theme support and is discussing global block whitelists/blacklists on GitHub… they are a long way from the granular level of control necessary to work in the various client situations I see daily.
  
Matt says:

September 8, 2017 at 8:36 pm

Wow, you really don’t agree with any decision WordPress has made in the past or any of the plans for the future. Why do you use it?

- gschoppe says:
  
  September 8, 2017 at 8:49 pm
  
  I certainly hope that this comment is by a random user who wants to stir up trouble, rather than the man in charge of the direction of the world’s most widely-used CMS.
  
  I use WordPress because of a large community of great developers who build on top of a deeply flawed but popular platform to make awesome things.
  
  Many of those developers have liked or shared this post on Twitter or other social media, or reached out to me directly, because many of them have similar concerns.
  
  If this really is by the real Matt, dismissive comments like this are not how you listen to a community.
  
  I would love to see a point by point discussion of the issues and concerns I, and many other voices in the community, have raised, to show that they are being considered and discussed, even if not accepted.
  
- mzalewski says:
  
  September 9, 2017 at 7:46 pm
  
  That’s disappointing. Gutenberg is a missed opportunity – I’ve always had a huge amount of respect for the WP team and WordPress in general, but have become disillusioned over the past 6 months.
  
  gschoppe (and others) have attempted to bring these issues up many times, even posting in the Slack channels. It would have been great to see a discussion between the core team and the wider development community around the direction of Gutenberg, but it seems these concerns are being ignored.
  
  Maybe you’re right – it might be time to move on.
  
The Saint Michael says:

September 9, 2017 at 8:26 am

I still wish Gutenberg will succeed. I like it somehow, I mean working inside it.
But I cannot go against myself. I give it about a 25% chance to succeed without making life very hard for the majority of developers.

Come on, man, it is ridiculous. They should close this GitHub page and take a pause; they just shame themselves. They have planned something so big, then they ask other people on GitHub how they will solve the problem with metaboxes and custom fields. Like it is a small and annoying, not so important thing.

Will we solve it today with stupid iframes? Yes, shouted the public from the last row. Cheers, it is iframes for today. Tomorrow we try something else. But don’t worry, we will ask you on GitHub.

Anyway, what is the point of chasing new WordPress users when you cover more than 25% of all websites on the Internet (or however much)? Desire to play God?
I mean, very fine. But not at any price, and not by making plans public before the plan for metaboxes is finished and prototyped. It should not happen, ever.

Stephen Petrey says:

September 19, 2017 at 1:39 pm

Here’s to hoping Gutenberg doesn’t mar the Automattic organization any further…. I still cannot believe it’s going to ship in 5.0; it’s absurd. Great post, Greg.


Greg Schoppe
