Matt Mullenweg, CEO of Automattic and founder/director of the WordPress Foundation, recently wrote a blog post entitled “We Called it Gutenberg for a Reason”, which attempted to address the widespread concerns voiced about the direction of the new WordPress editor. In general, the post made a lot of big promises about how Gutenberg would solve everyone’s problems. Unfortunately, many of those claims don’t live up to reasonable scrutiny. So, I find myself writing a response to the post, voicing some of the issues I find with what I see as an overly optimistic view.
I’m going to base this post around addressing the various bolded sections in Matt’s post, so if you haven’t read it, you may want to.
Developers and agencies will be able to create interactive templates that clients can easily update without breaking things or dealing with custom post types: Imagine a custom “employee” block that you can add to an About page that includes a picture, name, and bio. They’ll be able to replace most meta boxes, and they’ll get a chance to update old code or clients to work in this new paradigm.
This response fundamentally misunderstands the value that Custom Post types and Metaboxes provide agencies. They serve to provide separation between structured data and design. For example, a ‘staff’ custom post type would have a totally different interface than a generic post, requiring the user to enter the structured data, like name, position, and photo as raw text or image fields that are completely decoupled from the page’s design. This stops content editors, who are not on the design team, from deciding that certain staff pages should look different or behave differently.
Physically separating the interface for these post types from each other serves to enforce the idea that they are separate concepts, and that one shouldn’t make a random post a child of a staff page, for example.
There has been talk on GitHub of making metabox blocks, which would allow for feature parity on this front, but I fail to see any benefit from the changeover. Basically, Matt is touting the ability to create unstructured blocks of data, in an ad hoc manner, when agencies and developers need structured data instead. This decoupled structure protects the data attached to a post through the process of a complete redesign of the front end, and providing a more design-centric interface just begs content editors to try to make it “look good”, rather than leaving that to their design team.
In addition, lots of metaboxes store data that isn’t shown on the single post view of a page. It might be open graph metatags or data for a filterbar, or enhanced keywords for site search. Moving this data into a view that is integrated with the post_content just blurs the line between unstructured content and structured data, and will confuse users.
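To make the separation concrete, here is a minimal sketch in Python. It is not WordPress code, and the `StaffMember` type and both render functions are hypothetical names chosen for illustration. The record holds only structured fields, and each design is a separate function over it, so a redesign never touches the stored data.

```python
from dataclasses import dataclass
from html import escape


@dataclass(frozen=True)
class StaffMember:
    """Structured record: the editor fills in fields, not markup."""
    name: str
    position: str
    photo_id: int  # reference to a media item, not an <img> tag


def render_card(member: StaffMember) -> str:
    """One possible design, owned by the design team."""
    return (
        f'<div class="staff-card" data-photo="{member.photo_id}">'
        f"<h3>{escape(member.name)}</h3>"
        f"<p>{escape(member.position)}</p></div>"
    )


def render_list_item(member: StaffMember) -> str:
    """A redesign only swaps the template; the record is untouched."""
    return f"<li>{escape(member.name)}, {escape(member.position)}</li>"


alice = StaffMember(name="Alice Example", position="Editor", photo_id=42)
print(render_card(alice))
print(render_list_item(alice))
```

Because the content editor only ever sees three fields, there is nothing for them to restyle on a per-page basis.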
Plugin developers will be able to completely integrate into every part of WordPress, including posts, pages, custom post types, and sidebars without having to hack TinyMCE or squeeze their entire feature behind a toolbar button. Today, every plugin that extends WordPress does it in a different way; Gutenberg’s blocks provide a single, easy-to-learn entry point for an incredible variety of extensions. Some folks have already begun to port their plugins over and finding it easier to build and to have a much improved UI, I’m looking forward to highlighting those stories as we get further along and more people write about them.
Matt uses the term hack to describe integrating with TinyMCE. This is very telling, as WordPress has been tightly coupled to TinyMCE for a decade, yet still thinks of their API as a hack. It exists as such, because almost no documentation exists in the WordPress codex for this tightly coupled component. TinyMCE has a poorly documented API that needs a lot of attention. However, TinyMCE is remaining a core part of the new editor. What Matt is saying is “This full-featured API is badly documented, so we built a new API around it, rather than write the documentation”.
Perhaps Gutenberg will be different, but maybe some effort should be made to improve the documentation we have, before creating a whole new kettle to document. (It should be noted that the current Gutenberg documentation, although in beta, is definitely not up to the necessary standard.)
It should also be noted that when Matt talks about a “single entry point”, he’s really over-simplifying the issue. Gutenberg adds the need to write React components, to interface with WordPress via the API, but the API isn’t the core code. Any sufficiently deep integration is still going to be rooted in PHP, so really we just added the necessity for every project to incorporate JS and CSS components as well as the aforementioned PHP, complicating scope, and narrowing the field of competent developers who can work on it.
Theme developers won’t need to bundle tons of plugins or create their own page builders. There’ll be a standard, portable way to create rich layouts for posts and guide people setup right in the interface, no 20-step tutorials or long videos needed. Every theme will be able to compete with multi-functional premium themes without locking users into a single theme or compromising their experience.
This would be wonderful, if it were even close to what the core team is actually building. In practice, the project is eschewing “theme builder-like features”: its goals focus on “a vertical river of content”, and it continually pushes off such fundamental needs as a common interface for nested blocks. What will actually happen is that Gutenberg will launch and page builders will take one of two routes. Either they will implement custom blocks to fill in these basic needs that Gutenberg left off, continuing to contribute to theme lock-in, and leaving free themes behind as they don’t have the manpower to create all the necessary custom blocks, or they will continue to completely ignore the post editor and go on with their custom solutions. As long as custom blocks are necessary to create a full page builder in Gutenberg, there really is no improvement here over any of the awful shortcode-based builders we’ve seen in the past.
This is a major turning point for WordPress. Either they are creating a common page builder or they aren’t, but this is not a pool they can continue to dip their toes into, advertising one thing, and developing another.
Core developers will be able to work in modern technologies and not worry about 15 years of backwards compatibility. We’ll be able to simplify how menus, widgets, and the editor work to use a common set of code and concepts. The interface will be instantly responsive.
This is possibly the largest and most bald-faced lie of the entire post. Developers are saddled with 15 years of poor decisions with regard to the WordPress editor, and they just keep digging deeper with every iteration. Let’s consider a few:
- wpautop – post_content isn’t really stored as HTML. Instead it is stored as some sort of monstrous combination of structural white-space and HTML. This persists into Gutenberg, although there is an automated function to abstract it away. This is a significant hurdle to developers, as the conversion from one to the other fundamentally cannot maintain all the data contained in the raw HTML, so things are lost in translation, in unexpected ways.
- shortcodes – these were a hack to allow for inserting dynamic content into static posts. They use a non-standard format that is parsed out of HTML via regular expression (a task that is fundamentally buggy and ideologically flawed).
- flat HTML image storage – images are stored in the media library by ID, but stored in the post as HTML tags with src attributes. This means that deleting or updating an image that is used in many posts will not update the images themselves, leading to broken image resources. You also cannot update thumbnail sizes used in posts in bulk, for the same reason.
- wp_kses – a filter that is run to purify post_content, stripping “dangerous tags and attributes”… once again, this is based on regular expressions.
- RICG source sets – These modifications to image tags happen at render-time, once again powered by regular expressions.
- capital_p_dangit – This is a perfect example of how data purity is misunderstood by Matt and several players in the core team. In version 3.0, Matt added a filter to core that changes the spelling of WordPress on render, to correct the capitalization. What should have been a spelling suggestion in TinyMCE was implemented as a hard filter on post_content, using regular expressions. This has the potential to break URLs and embedded code that needs to have accurate naming conventions. All this was brought to the core team’s attention at the time, and despite widespread and detailed explanation of the issue, was retained as a completely vanity-based change to core.
- And now we add comment-based post structure – Rather than learning from the misstep of creating a new structured data standard in shortcodes, the core team has doubled down, and is creating a second structured data standard, based in HTML comments, that is, once again, parsed using Regular Expressions. (It appears I was incorrect about this point. It is parsed using PEG, which is theoretically a parser capable of handling HTML parsing. HOWEVER, the PEG parser for PHP is generated by a project called Peg.js, which is at version 0.10 and does not promise stability until version 1.0, and a project called phpegjs, which is at version 1.0.0beta7 and has only one contributor. This does not inspire faith.) There is a universal standard for flattened structured data storage that almost every other service uses, called JSON. There is even a standard for storing block-based documents in JSON, called MobileDoc, which is currently being used by Ghost. Rolling your own, to try to preserve back compatibility with unstructured posts is a horrible misstep. In a logically developed system, there would be a conversion step, from which point, all posts would be stored in logical, structured data. The plain-text editor would simply be a new endpoint to produce MobileDoc with a different interface, similar to how Ghost supports Markdown. This would make the single source of truth a real structured format.
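The capital_p_dangit concern can be illustrated with a small Python sketch. This is not the actual WordPress filter; `naive_capital_p` is a hypothetical stand-in for any render-time text replacement applied to flat post content, and the URL is made up. It shows why a blanket replacement is risky on flat HTML and harmless when text and URLs are stored as separate fields.

```python
import re


def naive_capital_p(content: str) -> str:
    """Hypothetical render-time filter: fix the spelling everywhere."""
    return re.sub(r"Wordpress", "WordPress", content)


# Flat HTML: the filter cannot tell visible text from a URL.
post = (
    'See <a href="https://example.com/files/Wordpress-logo.png">'
    "the Wordpress logo</a>."
)
filtered = naive_capital_p(post)
print(filtered)  # the href is rewritten too, so the link may now 404

# Structured storage: the same filter is applied to the text field only.
link = {
    "type": "link",
    "href": "https://example.com/files/Wordpress-logo.png",
    "text": "the Wordpress logo",
}
safe = {**link, "text": naive_capital_p(link["text"])}
print(safe["href"])  # unchanged
```

On a case-sensitive file system, the rewritten `href` in the first case points at a file that does not exist.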
These are all issues that are going to continue to hinder core developers again and again. In addition, WordPress is built on old PHP. You cannot argue that the core is going to soar ahead with React and the API without considering that there are serious flaws in the foundation that remain unfixed. WPDB still fails to provide proper binding for variables, passwords can still be saved in some monstrous MD5-derived format, plugin downloads are unsigned, XML-RPC remains a security risk… these are only a few of the thousands of core issues that exist, which have not been repaired in many years. Pretending that a new coat of React paint will allow the developers to ignore these issues and work with an API abstraction is a misstep in the extreme, especially when Trac tickets involving these issues are being closed as “not related to the core focus of 2017”.
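For readers unfamiliar with the term, this is roughly what “proper binding for variables” means in general. The sketch uses Python’s built-in `sqlite3` module rather than WPDB, and the `posts` table and the input string are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, title TEXT)")
conn.execute("INSERT INTO posts (title) VALUES (?)", ("Hello",))
conn.execute("INSERT INTO posts (title) VALUES (?)", ("Secret draft",))

user_input = "nothing' OR '1'='1"

# Unsafe: the input is spliced into the SQL text and changes the query.
unsafe_sql = f"SELECT title FROM posts WHERE title = '{user_input}'"
unsafe_rows = conn.execute(unsafe_sql).fetchall()

# Bound: the driver passes the value separately from the SQL text,
# so it can only ever be compared as a literal string.
safe_rows = conn.execute(
    "SELECT title FROM posts WHERE title = ?", (user_input,)
).fetchall()

print(len(unsafe_rows), len(safe_rows))
```

With binding, the value never becomes part of the statement, so escaping mistakes cannot turn data into SQL.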
This claim is a lot like the original goal of making WordPress database-agnostic, via pluggable Database engines. It seems like it might be possible on the surface, until you realize how many hooks there are to old, bad decisions.
Web hosts will have better signup rates, as Gutenberg opens up WordPress to an entirely new set of people for whom WordPress was too complex and hard to set up before. (Remember our goal: to democratize publishing.) Their churn rates will go down: they’ll stop bleeding customers to Wix, Weebly, and Squarespace, and fewer people will abandon their sites because it was too hard to make things look the way they wanted.
You cannot please everyone all the time. The idea that WordPress can win back the Squarespace crowd shows a lack of knowledge of the competitor’s client base. There are a million concerns, from domain to SSL, to just FTP that WordPress cannot address, and that will never be as simple as they are on a fully-integrated hosted environment. Users who do not need anything more than what these WYSIWYG builders offer will continue to go there, and their SEO, page speed, and ability to add custom features will suffer as a result. Trying to focus the WordPress experience on this lowest tier is simply going to cause the ecosystem to bleed the power-users who build interesting things with it by bringing the downsides of WYSIWYG builders into WordPress to cater to those who will never spend a dime on their own sites.
Users will finally be able to build the sites they see in their imaginations. They’ll be able to do things on mobile they’ve never been able to before. They’ll never have to see a shortcode again. Text pasted from Word will get cleaned up and converted to blocks automatically and instantly. (I pasted the first version of this post from Google Docs and it worked great.) They’ll start manipulating their sites in ways that would have taken a developer. They’ll be able to move from blogging to using WordPress as a CMS without missing a beat. Editing posts will just work; they’ll write more. They’ll learn blocks once, and then be able to instantly use and understand 90%+ of plugins.
This is a very interesting statement, beyond the overly rosy assumptions that are low on facts, because it highlights a common use case: writing elsewhere and pasting into WordPress. Many people go this route because they either don’t trust, or don’t like the WordPress editing experience. Many reviews of Gutenberg by writers have stated that their issues with the interface actually got worse, rather than better, due to the constantly changing controls and lack of consistent interface elements. Given those facts, I find it somewhat telling that even Matt Mullenweg, champion of Gutenberg as the future of writing, chooses to do his writing elsewhere.
(People were worried when the printing press was invented, too. A Swiss biologist warned against the “confusing and harmful abundance of books,” but I’d say it all worked out in the end.)
Statements like that serve to denigrate those with legitimate criticism, despite Matt’s statements to the contrary. But that is mostly what we are hearing these days. There is a lot of “listening” and “considering”, but those with meaningful concerns only hear platitudes back.
At the same time, releasing meaningful usage data is prevented by Matt, claiming that it will only be used as ammunition. Knowledge is not ammunition. It is the basis of informed decisions.
Making decisions in the absence of knowledge, based on platitudes and mandates is the basis of religions, and pardon me, but I don’t want to join up to the religion of Matt. I want to work in a modern, data-driven, open-source environment.
Gutenberg will ship with WordPress 5.0, but the release will come out when Gutenberg is ready, not vice versa.
This highlights one of the major flaws of the entire Gutenberg project: The lack of a metric for failure. If version 5.0 is determined by Gutenberg, and slated for a Jan 2018 release, there is no time for the team to fail and re-evaluate core decisions. At the beginning, when the data structure was first being designed, it was highlighted that comments were a poor solution, and that nesting needed to be addressed as early as possible, and that React posed concerns for certain companies running WordPress. Those who raised concerns were told “this is a prototype, it will all be rebuilt before moving forward”. Now, we see that all of those poor decisions are cemented in code by a lack of time to refactor, and many of the issues that will highlight these core flaws are being pushed off as “not part of the scope of the 1.0 release”.
Scope is a word we hear a lot, and there are a lot of problems with it. Gutenberg was originally a codeword for the new editor, which would add block support, and basically replace the existing editor block. From there, the scope has changed to first encompass the customizer, then metaboxes, and now full page-building and the entire “WordPress experience”. The project began without a tight scope or definition of success and failure, and now is naturally suffering from scope creep. But everything it moves to now encompass comes at the expense of further development and iterations on the core idea.
Gutenberg is quickly becoming the complete interface for WordPress 5.0, but is still the size and scope of a team tasked with a new feature for the editor, working from poorly scoped and roughed-in ideas about data storage and interface. These are huge problems, and they are not related to “fear of change” or “lack of visibility”. They are systemic problems stemming from severely lacking project management, metrics, and priorities from day one.
Matt called it Gutenberg for a reason, but I find it a fairly ironic one. The real innovation of Gutenberg’s press was good planning for the future (a durable alloy that would hold up under abuse) and a flexible data structure (type cases, leading, etc) that could handle unforeseen use cases. Gutenberg’s press didn’t try to imitate full-plate presses, and it didn’t try to be back compatible for existing woodblocks. It was revolutionary because it rebuilt the press from the ground up, with future-safe techniques… and yet here we are, staring down the barrel of HTML comments as an ad-hoc data structure and an editor that still relies on content-editable wrapped in a hacky API, in 2017.
Apologies if this seems blunt at times, I’m trying to address various points.
> Basically, Matt is touting the ability to create unstructured blocks of data, in an ad hoc manner, when Agencies and Developers need structured data, instead.
This is a rather incorrect assumption of what he is saying. Blocks are about how you interact with the data, not where that data lives.
> Providing a more design-centric interface just begs content editors to try to make it “look good”, rather than leaving that to their design team.
This is also a mischaracterization. The design team in your example now has the ability to provide a block—or a template—that specifies how things are meant to be displayed while giving their authors a way to edit the information visually and directly.
> Based in HTML comments, that is, once again, parsed using Regular Expressions
They are not parsed with regular expressions.
> Either they will implement custom blocks to fill in these basic needs that Gutenberg left off, continuing to contribute to theme lock-in
I don’t follow the inference in this conclusion. A custom block can be served as a plugin, working across themes, given that understanding blocks would be part of core.
> At the beginning, when the data structure was first being designed, it was highlighted that comments were a poor solution, and that nesting needed to be addressed as early as possible.
This is stated as a fact while it is just an opinion. I am happy to engage in discussing the merits and tradeoffs of the approach, but characterizing it just as a “horrible misstep” makes it hard. It seems you are proposing the data format should essentially create a breach between any content created pre-gutenberg and any content created post-gutenberg.
If you look at how Gutenberg works, an object tree is precisely the data format the editor uses to manipulate and present information. We serialize it back into «post_content» as HTML because we don’t want to fracture the expectations of themes, plugins, third-party apps, email contexts, readers, feeds, etc, around the shape they expect content to be without intervention. The fact you can create a post in Gutenberg and still open it in the old editor, or a mobile app, and have access to your content is something I believe should be valued. We are not starting from scratch, we have more than a decade of a large portion of the web content to answer to.
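As a rough illustration of the idea described above, here is a Python sketch that serializes a list of block objects into HTML with comment delimiters. It is only loosely modelled on the approach discussed in this thread: the `serialize` function, the block names, and the exact delimiter and attribute syntax are assumptions, not the project’s real grammar.

```python
import json


def serialize(blocks):
    """Flatten block objects into HTML wrapped in comment delimiters."""
    parts = []
    for block in blocks:
        attrs = block.get("attrs")
        # Attributes, if any, ride along as JSON inside the opening comment.
        attr_text = f" {json.dumps(attrs)}" if attrs else ""
        parts.append(
            f"<!-- wp:{block['name']}{attr_text} -->\n"
            f"{block['html']}\n"
            f"<!-- /wp:{block['name']} -->"
        )
    return "\n\n".join(parts)


blocks = [
    {"name": "paragraph", "attrs": None, "html": "<p>Hello</p>"},
    {"name": "image", "attrs": {"id": 7}, "html": '<img src="/uploads/cat.jpg">'},
]
content = serialize(blocks)
print(content)
```

A consumer that ignores HTML comments still sees the plain `<p>` and `<img>` markup, which is the compatibility property being argued for here.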
Even then, if you look a bit beyond, you could also see that it gives us a platform to eventually make the switch to store JSON directly. (It is likely how the following steps of customization would work.)
> From there, the scope has changed to first encompass the customizer, then metaboxes, and now full page-building and the entire “WordPress experience”.
Scope never changed. We are not tackling customizer nor full-page building in the first iteration of Gutenberg. Meta-boxes are part of the current editor so they need to be addressed in some way. You call things like nested blocks “fundamental” and that they need to be done as early as possible, yet that would definitely increase scope beyond the original goals. In any case, we actually did have support for nesting in our grammar implementation. It was left out because the UI would need a lot of dedicated attention, but it was already clear we could support it from a technical point of view.
The goal has always been the post editor screen, and it’s one of the reasons why columns are also not part of the first implementation. The pieces that have hinted at the future evolutions had been done purely to communicate the overall vision, as that has proven to be important to articulate more broadly.
That may be what some members of the team are saying, but that clearly isn’t the “benefit” touted in Matt’s post.
My point is that many metaboxes make no sense as a visual interpretation, either because they will not be displayed visually, or on that page, or because showing how the data appears causes certain types of clients to try to “goose” the data to look better, adding unwanted whitespace or messing with formatting where they shouldn’t.
You are correct. I was misinterpreting the regex-like grammar of the PEG parser for a regex builder. I am however concerned about the use of phpegjs and peg.js in core, as phpegjs has only a single contributor and is in beta, and peg.js is at version 0.10 and claims that stability is not guaranteed until version 1.0. I will print a correction, but I still firmly believe the choice of rolling a new, custom format is a major misstep over using a trusted format with a time-tested, proven parser.
How would that be any better than what we have now, with page-builder plugins? Either way, you are locked into something that stores your data in a custom way. The only difference is that in one version the custom storage method might be a shortcode, and in the other it’s a “block”… either way, if you don’t have the plugin/theme to parse a block, you’re SOL. So yay, we end up with a million competing and incompatible column block standards, rather than a million competing and incompatible column shortcodes.
I’m sorry, but it is a hard fact. You are dealing with structured data. It MUST be stored in a structured format. This mashup of structured and unstructured data just leads WordPress further down the path of compounded hacks on post_content. Back compatibility could be preserved easily by maintaining a flattened cache of the post in post_content, and storing the structured content separately. Then you simply implement a converter for old posts, to convert them to the new structured format. Readers, Feeds, excerpts and even WordPress’s site search already suffer because post_content includes HTML tags and shortcodes, and pulling them out naively leaves orphaned data. Instead, you could implement a set of parsers for feeds, excerpts, etc that would allow them to be generated separately from the structured content. That way, blocks could define how they are flattened in different contexts, providing an optimal flat text version, search version, email version, etc. There should be a single source of truth, but it shouldn’t be the flattened post_content.
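A minimal Python sketch of what I mean, under stated assumptions: the JSON document shape, the block types, and the function names (`to_html`, `to_search_text`, `flatten`) are all hypothetical and not MobileDoc or any WordPress API. The structured document is the source of truth, and each context gets its own flattener.

```python
import json
from html import escape

# Hypothetical structured document: the single source of truth.
doc = {
    "version": 1,
    "blocks": [
        {"type": "heading", "text": "Our team"},
        {"type": "paragraph", "text": "We build things."},
        {"type": "image", "media_id": 12, "alt": "Team photo"},
    ],
}


def to_html(block):
    """Flattener for the post_content cache."""
    if block["type"] == "heading":
        return f"<h2>{escape(block['text'])}</h2>"
    if block["type"] == "paragraph":
        return f"<p>{escape(block['text'])}</p>"
    if block["type"] == "image":
        return f'<img data-media-id="{block["media_id"]}" alt="{escape(block["alt"])}">'
    return ""


def to_search_text(block):
    """Flattener for site search: plain words only, no markup."""
    return block.get("text") or block.get("alt", "")


def flatten(document, flattener, sep):
    """Apply one flattener to every block and join the non-empty results."""
    return sep.join(filter(None, (flattener(b) for b in document["blocks"])))


stored = json.dumps(doc)  # what would be persisted
post_content_cache = flatten(json.loads(stored), to_html, "\n")
search_index = flatten(doc, to_search_text, " ")
print(post_content_cache)
print(search_index)
```

The flattened HTML is then a derived cache that can be regenerated at any time, and the search index never has to strip tags out of markup.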
A secondary benefit of storing as JSON/MobileDoc is the ability to natively update links and image sources in posts when they change. With that and the various enhancements to feeds, excerpts, search, etc, there are clear advantages to real structured data that go way beyond just Gutenberg.
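As a sketch of that benefit (hypothetical data shapes again, not an existing WordPress or MobileDoc API): if image blocks store a media ID instead of a `src` attribute, changing or deleting the media item is reflected in every post at render time.

```python
# Hypothetical media library: ID -> current URL.
media_library = {12: "/uploads/2017/team.jpg"}

# Hypothetical posts: image blocks reference media by ID only.
posts = {
    "about": [{"type": "image", "media_id": 12}],
    "contact": [
        {"type": "paragraph", "text": "Meet the team."},
        {"type": "image", "media_id": 12},
    ],
}


def image_urls(blocks, media):
    """Resolve image blocks to URLs at render time; skip deleted media."""
    return [
        media[b["media_id"]]
        for b in blocks
        if b["type"] == "image" and b["media_id"] in media
    ]


def render_all():
    return {slug: image_urls(blocks, media_library) for slug, blocks in posts.items()}


before = render_all()

# Replace the file once; no post has to be rewritten.
media_library[12] = "/uploads/2017/team-v2.jpg"
after = render_all()

# Delete the media item; posts simply render no image instead of a broken one.
del media_library[12]
after_delete = render_all()

print(before, after, after_delete)
```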
It would also allow for trivial importing from other modern CMSs like Ghost.
Gutenberg is a breaking change. To give it the treatment it deserves, it needs to be breaking to old editors and apps. New versions could implement the same structured format, or a compatibility layer could be created (similar to Ghost’s markdown support) which would provide a flat version for old editors and apps, and convert to the structured data on save. This would make the unstructured format a second-class citizen, as it should be, not the primary storage method.
Bull. Metaboxes are part of the edit PAGE. The editor is everything that gets loaded with wp_editor(), AKA the thing that directly produces post_content. Encompassing the entire edit page is absolutely an expansion of scope.
Seeing columns as not a part of the post_content is just dumb, as there are a million post formats that need columns in content, and they are one of the first blocks that will get a million competing and incompatible implementations. The same applies to a common interface for nested blocks. If you don’t implement it, you will get a thousand competing crap versions immediately, which will erode user trust.
It is not a breaking change at a data level.
I think we are exhausting the bandwidth of the asynchronous conversation a bit, but you are more than welcome to join us on the core slack at the #core-editor channel to keep the conversation going and help build the project.
Given that all third-party apps expect that source to be the HTML-like post, this cannot be changed without breaking them completely. And if you don’t break them, it naturally fractures the single source of truth, leading to all sorts of issues at a very wide level given WordPress’ ubiquity.
Please, don’t call ideas you disagree with just dumb, it doesn’t help with the discussion. We have communicated that columns are part of the project, just not in the initial version. If others want to start implementing them before, please, it’d be very welcome as an exploration and potential contribution to the project.
It is a breaking change at every other level. It makes sense for it to be breaking at a data level.
I would prefer to have these conversations in a public forum, because they were discussed many times in private, and nothing came of it. Open letters and public posts allow the general community to better understand the implications of decisions made in slack.
If those third party apps feed through any PHP code before hitting the database, a translation layer could be implemented to let them talk HTML, while the database stores the real format. That could be maintained as legacy code, as they are transitioned to deal with the modern format. If they do interact directly with the database, that is not a supported path, and there is no reason to expect them to stay up to date in other ways, or to update in the future… I’m sure there are third party tools that offer visual editors right now and strip comments other than page or more… those will break. There is the possibility of breaking things no matter what you do, but some choices improve the state of WordPress, and some just add more hacks.
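A translation layer of that kind could look something like the following Python sketch. Everything here is hypothetical: the document shape, the catch-all `html` block type, and the two function names are made up to show the round trip, not taken from any existing code.

```python
from html import escape


def save_from_legacy(html: str) -> dict:
    """An old client submits HTML; store it as one freeform block."""
    return {"version": 1, "blocks": [{"type": "html", "html": html}]}


def load_for_legacy(doc: dict) -> str:
    """An old client requests a post; hand back flat HTML."""
    parts = []
    for block in doc["blocks"]:
        if block["type"] == "html":
            parts.append(block["html"])  # stored verbatim, returned verbatim
        elif block["type"] == "paragraph":
            parts.append(f"<p>{escape(block['text'])}</p>")
    return "\n".join(parts)


# Round trip for a legacy client: what goes in comes back out.
legacy_html = "<p>Written in an old app</p>"
assert load_for_legacy(save_from_legacy(legacy_html)) == legacy_html

# A post created in the structured format is still readable by old clients.
native_doc = {"version": 1, "blocks": [{"type": "paragraph", "text": "New & structured"}]}
print(load_for_legacy(native_doc))
```

The database only ever holds the structured form; the HTML view exists at the boundary for clients that have not been updated.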
And I am sorry to use the word dumb, but it is hard to imagine why one of the most commonly added features via builders, shortcodes, and other hacks is left out, in favor of just matching the feature set of Medium. I don’t know why on earth a million competing column standards could possibly improve the uniformity of the editor experience.
Greg Holmes says:
“my point is that many metaboxes make no sense as a visual interpretation, either because they will not be displayed visually, or on that page, or because showing how the data appears causes certain types of clients to try to “goose” the data to look better, adding unwanted whitespace or messing with formatting where they shouldn’t.”
Precisely. WordPress, welcome to Concrete5 … I can’t wait to have to click on some meaningless little visual strip in the visual editing field, so that I can edit some meta data or code block. Yay.
Dennis Snell says:
Just passing through here, but I think you might be misinformed about the parser.
There aren’t any Regular Expressions in the formally-specified grammar which is used to generate a parser that converts the stored text into a tree data structure. If you have found some in the source code where they are inappropriate please point them out here so we can identify them and fix them.
You are correct in that Regular Expressions are guaranteed to fail when parsing a context-free (or more complicated) language like HTML, which is why we chose as a project not to use them.
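A small Python sketch of why this matters, using a made-up `[col]…[/col]` marker (this is neither the shortcode syntax nor the Gutenberg grammar, and `parse` is a hypothetical helper). A lazy regular expression pairs the first opener with the first closer, so nesting is lost, while a parser that tracks depth with a stack recovers the tree.

```python
import re

text = "[col]a[col]b[/col]c[/col]"

# Regex attempt: the outer opener is matched with the INNER closer.
regex_result = re.findall(r"\[col\](.*?)\[/col\]", text)


def parse(source):
    """Stack-based parser: each [col] opens a nested list, [/col] closes it."""
    open_tag, close_tag = "[col]", "[/col]"
    root = []
    stack = [root]
    i = 0
    while i < len(source):
        if source.startswith(open_tag, i):
            node = []
            stack[-1].append(node)
            stack.append(node)
            i += len(open_tag)
        elif source.startswith(close_tag, i):
            if len(stack) == 1:
                raise ValueError("closing marker without an opener")
            stack.pop()
            i += len(close_tag)
        else:
            # Literal text: extend the previous text run or start a new one.
            current = stack[-1]
            if current and isinstance(current[-1], str):
                current[-1] += source[i]
            else:
                current.append(source[i])
            i += 1
    if len(stack) != 1:
        raise ValueError("opener without a closing marker")
    return root


tree = parse(text)
print(regex_result)
print(tree)
```

The regex returns a single, truncated match, whereas the parser returns the outer column containing text, an inner column, and more text.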
You are correct. I was misinterpreting the regex-like grammar of the PEG parser for a regex builder. I am however concerned about the use of phpegjs and peg.js in core, as phpegjs has only a single contributor and is in beta, and peg.js is at version 0.10 and claims that stability is not guaranteed until version 1.0. I will print a correction, but I still believe the choice of rolling a new, custom format is a major misstep.
Vladimir Prelovac says:
Enjoying reading your commentary, Greg. Great to have a technical perspective to balance things for those of us who are unable to go deeper into it.
Couple of questions for you:
1. What do you think is the fundamental problem with Gutenberg?
2. How would you take WordPress to the next level differently?
To me, there appear to be two major, fundamental problems with Gutenberg.
If I was building the next Gutenberg editor, I would start by converting the data structure to a common JSON-based format. This would benefit people who want to build custom page builders, as well as providing a way to do things like automatically update post links and image sources in posts, since those would no longer be stored as flat HTML. Then, I would build an optional block-based editor for those situations where one is needed (which is not the same as all situations), based on slate.js, and using preact (for lighter-weight code that doesn’t carry the weight of the Facebook license terms). However, a common, standards-based post format would empower third parties to create their own editors, adding choice to the WordPress community, rather than removing it.
> However, a common, standards-based post format would empower third parties to create their own editors
I think this is a strong point Gutenberg is missing right now for sure. In this age of “headless WordPress”, the WP-Admin and the Theme are _not_ the only things interacting with the content anymore.
WordPress powers native mobile apps, Calypso and other similar de-coupled apps, and in our case it actually hydrates our newspaper print services.
So having a good standard for content is much more important than the WP-Admin interface, as the data will be touched by EVERY consumer, but the editor interface will be optional. . .you can interact with your site over REST or WPGraphQL or XML-RPC, etc. . .so the actual UI of Gutenberg is bypassed by several clients, but the structure of the data is needed by EVERY method of interacting with it.
Thanks for the support, this is one of the crucial reasons I am pushing for a MobileDoc implementation. That would allow developers to work with WordPress data regardless of the interface. JSON is supported by every major programming language, out of the box, so it would be as easy to create a front end or an editor in node.js as it would be in Python, Java, or even C#.
Passing through here and I’d like to add a note of agreement. Using HTML comments to coerce structured data into an unstructured monolithic post is a backward-thinking decision. Like many people doing custom development, I interact with WP through API. Now I have to implement yet another translation layer for yet another mutant hack of a data standard.
Wow. Simply, Wow. What an eye opener.
As a very mediocre WordPress developer, I admit I fell into believing many of Matt’s promises. I really liked his post, as it explained a lot of what Gutenberg is all about, or at least of what Matt thinks it’s all about.
Still, when I read it I kept getting this weird feeling that “something is not right”. It wasn’t the bold claims: we all saw in the past software that changed dramatically when certain features were introduced, and Gutenberg, at least on the surface of it, seemed to be such a milestone.
So it wasn’t the promises and their scope. It was my humble knowledge that “things in WordPress just don’t work this way”, and that they simply can’t just become so rosy just by a revamped editing experience that was so rushed to core.
I read a lot of articles criticizing Gutenberg, as well as a few that praised it, or at least gave it more credit (there simply are way more articles to the con than to the pro). None of those articles explained so clearly what is so fundamentally wrong with Gutenberg, and on so many levels, while addressing every aspect of WordPress use cases (for this clear structure I guess we should thank Matt :).
So thanks mate for a great piece, which also taught me quite a few new concepts. I feel I am now much more informed, if not enlightened, about the problems we would all probably encounter with a rushed implementation of Gutenberg in WordPress’s next major release.
I really hope posts like this one will change something!
Jason Bahl says:
I’m pretty critical of Gutenberg as well, _but_ they are moving fast. Concerns I’ve had when I first used it have changed over time. The project is still _super early_. . .the 1.0 milestone isn’t the same as semver 1.0, so it shouldn’t be treated as if it’s out of beta and ready for core merge.
The project is moving REALLY fast and it’s been fun to see it evolve. Lots of concerns by the community have been noted, discussed and in some cases already addressed, at least to some degree.
I was VERY skeptical about the project when it began, and I still am to a degree, but I do think the way it has progressed gives me much more optimism that over time some of the major concerns can be sorted out.
I do think that HTML comments aren’t the best long-term solution for the data storage, but I do believe Matias is correct in that it provides a way to do this now without breaking existing content, and paves a path toward alternate data storage solutions.
The parser is already defining the shape of the data, so now we can store that where we want. . .WordPress _is_ filterable after all. I imagine folks (the core Gutenberg team and the community at large) will experiment with alternative data-stores to post_content to store the JSON data and hydrate blocks with that alt store instead of the post_content, and over time we’ll figure out the best long term data store for the shape of data we’re dealing with. For example, my team will likely experiment with storing the blocks in ElasticSearch so that we can do some cool aggregations of data across our content, including the block meta, etc, but still filter the content and hydrate the block editor with the format that’s expected.
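As a rough illustration of what “the parser defines the shape of the data” could look like, here is a minimal sketch in Python. It is not Gutenberg’s actual parser (that one is a PEG grammar). The regular expression, the function name `parse_blocks`, and the output keys are assumptions made for this example, and it only handles flat, non-nested blocks.

```python
import json
import re

# Matches <!-- wp:name {optional JSON attrs} --> inner html <!-- /wp:name -->
# Simplified on purpose: nested blocks are not supported.
BLOCK_RE = re.compile(
    r"<!--\s+wp:(?P<name>[a-z][a-z0-9/-]*)\s+(?P<attrs>\{.*?\}\s+)?-->"
    r"(?P<html>.*?)"
    r"<!--\s+/wp:(?P=name)\s+-->",
    re.DOTALL,
)


def parse_blocks(post_content):
    """Turn comment-delimited post_content into a list of plain dicts."""
    blocks = []
    for match in BLOCK_RE.finditer(post_content):
        raw_attrs = match.group("attrs")
        blocks.append(
            {
                "name": match.group("name"),
                "attrs": json.loads(raw_attrs) if raw_attrs else {},
                "html": match.group("html").strip(),
            }
        )
    return blocks


post_content = """<!-- wp:paragraph -->
<p>Hello world</p>
<!-- /wp:paragraph -->

<!-- wp:image {"id": 5, "align": "left"} -->
<figure><img src="/uploads/cat.jpg" alt="Cat"></figure>
<!-- /wp:image -->"""

blocks = parse_blocks(post_content)
print(json.dumps(blocks, indent=2))
```

Once the content is in this shape, the resulting list could be written to any store (post meta, a search index, and so on) and turned back into the comment format when the editor needs it.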
I think the Gutenberg team is doing a fantastic job with the task at hand. It’s an extremely ambitious project and they’re doing very well at handling the criticism, taking notes, discussing it, and making plans to address it.
I still have concerns, but I also have a lot of hope and optimism. . .and, at the end of the day, it’s Open Source so we can dive in at any time and contribute solutions to the problems that concern us most!
Unfortunately, I think the biggest issue here is that it relegates the data structure for post_content to a second-class role in the project to revolutionize the editor. In the age of the WordPress API, the data structure is the most crucial component of the project. I have heard the argument that filters can be used and plugins built, as with previous controversial changes, but the problem is that these decisions shape core, and once they are made, things will solidify around them, making them impossible to change.
Once the mobile editor app is producing flattened HTML comments, that will become the argument that it needs to stay. The same applies to third-party tools. Critical flaws in the data structure of core get solidified by plugins and external programs that try to follow best practices.
And yes, I’m architecting a drop-in MobileDoc implementation, but it will take time to write a project that goes against the grain of core, and if core launches with this awful structure, it becomes MUCH harder to ever rectify it. Right now, a major breaking change is coming in the form of Gutenberg. Either that single breaking change involves converting post_content, or core has to justify a second breaking change soon after, to fix the hole they’ve dug for themselves. And how easy is it gonna be to justify a second fundamental change in one year? Or in two? Or in three? Instead, break things once and put WordPress’s data on a strong footing. Otherwise, this same lame argument about breaking changes will be made to justify never merging a proper data structure into core.
Not really. You could imagine a new API endpoint that returns the JSON representation of the post, and third party apps—like mobile apps—could interact directly with it and never touch the HTML source.
I think you are seeing something as a fundamental flaw and a forever-closed door that is not really the case, something which was designed precisely to allow us to move forwards without alienating existing content.
So you are saying that it is acceptable, at some point, to introduce a breaking change to how third-party tools save data. If so, do it now, when there is already a major breaking change in the editor occurring. The introduction of Gutenberg is the time to make these changes, as they are expected with a complete overhaul. Making them later just causes a second round of breaking changes, and erodes user trust.
If the plan was to ever get a real format, Gutenberg would launch with a shortcode-based PEG parser, instead of a comment-based one, because it would use the tools already there to shim until the new structure comes in. Adding a completely new format, specifically to remove it later is illogical, and frankly unbelievable. Because once there is a format, people will start generating it in ways you haven’t expected, and then that becomes the excuse to maintain it for back-compatibility.
No, I am saying that we could introduce tools for third-party apps that wish to leverage blocks to consume a JSON structure. The endpoint would handle how that is saved.
Not really, shortcodes suffer from not being invisible if interpreted as HTML. Comments are native to the specification, so a prescriptive grammar can be built that doesn’t affect output at all. WordPress, in fact, already uses comments for the “more tag” and for splitting a post into pages.
As long as that JSON structure is generated from a parsed HTML/comments/shortcode mess, the overall codebase of WordPress continues to become an unending tower of hacks.
Shortcodes are not invisible, but that would actually be a benefit to third-party visual editors. If a third-party editor doesn’t render comments, there is no way to know where a slideshow goes, or how it is delineated. Shortcodes, while certainly a stopgap, would at least let them see the delineations. I am not saying this is a good solution, but comments are even worse, because to many editors they will be invisible, unless they are the known more or page comments.
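A small sketch of the visibility difference, using Python’s standard `html.parser` as a stand-in for any tool that only renders visible text. The `VisibleText` class and the sample markup are illustrative assumptions, not taken from any real editor.

```python
from html.parser import HTMLParser


class VisibleText(HTMLParser):
    """Collects only text nodes, the way a naive renderer would."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)

    # handle_comment is deliberately not overridden: comments are dropped.


def visible_text(markup):
    parser = VisibleText()
    parser.feed(markup)
    parser.close()  # flush any buffered trailing text
    return "".join(parser.parts).strip()


comment_version = '<p>Intro</p><!-- wp:gallery {"ids":[1,2]} --><!-- /wp:gallery -->'
shortcode_version = '<p>Intro</p>[gallery ids="1,2"]'

print(visible_text(comment_version))    # the gallery marker is gone
print(visible_text(shortcode_version))  # the gallery marker is still visible
```

In the comment version the tool has no trace of where the gallery belongs, while in the shortcode version the delimiter survives as text.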
Paul Dahlen says:
Well, here is one WordPress developer who has asked for clarity, expressed concerns, and hoped for answers on three different posts concerning Gutenberg including Matt’s vision. None of my concerns have been answered, clarified, or addressed at all. This causes me to believe that a) I don’t have the name value of some who seem to be getting answers while others like me aren’t, b) I’m simply being ignored, or c) there are no plans to address the issues I’ve written about. I’ve found this community to be very cliquish, so I suspect it’s a combination of all 3.
I’ve done my part. I’ve made my concerns known to Matt in response to his treatise, so at the risk of simply being a squeaky wheel, I will retire from writing more, and I will wait to see just how this goes down in the end. I’ve had the experience of being the project lead developer on a very popular flight simulator plug-in, where a project was designed and coding was started, and others continued to scope creep and push, push. When it went live years ago it was met with 90% bad press and members left. It’s the only project I led and released that I am still not proud of. I was arrogant enough to think that “they’re just complaining because no one likes change” and “they don’t understand the big picture” (we had no big picture).
One positive thing that’s come out of this is my introduction to several other tools that Matt considers “competitors” and I find I am impressed with a couple of them. I was not aware they were there until this all started. So that’s the silver in this cloud. I hope Gutenberg is a good thing for the entire WP community including end users who have never even read these various forums and blogs. If it’s not, you will simply see people reaching for something else that does what they want to do better. Best wishes!
Christopher A says:
I have concerns about structured data inside of HTML comments as well. One of the most overlooked issues is security. By default, every WordPress site is searchable via “?s=” queries. Raw post_content text is searched for matches. There are many instances where you’d want to store private data with a block and do not intend for it to be publicly searchable.
There has been talk of allowing blocks to store data in postmeta. While this solves the problem from a technical standpoint, a user will not be informed as to where the data is being stored. There will be no way to tell which data will be publicly searchable.
This is an excellent point, and one of the many existing problems with flat post-content that Gutenberg (as written) will make much worse. If a JSON-encoded structure were used, searches could use a “search cache” instead, which would be a flattened field containing the portions of the structured content that make sense to search. It would also make search much more plugin-friendly, as plugins could add filters to the search cache renderer, to add specific metadata to the search cache. This would vastly improve search for things like WooCommerce or WP Job Manager, and is impossible with the current “post_content is the single source of truth” mindset.
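Here is a minimal sketch of the search cache idea in Python. Nothing here is an existing WordPress API: the block layout, the `SEARCHABLE_FIELDS` table, and the filter list are all assumptions for illustration. The point is that only whitelisted fields reach the cache, and a plugin can add its own metadata through a filter.

```python
# Which fields of each (hypothetical) block type are safe to search.
SEARCHABLE_FIELDS = {
    "paragraph": ["text"],
    "heading": ["text"],
    "image": ["alt", "caption"],
}

# Plugin-supplied callables: (parts, post) -> parts
search_cache_filters = []


def build_search_cache(post):
    """Flatten the searchable portions of a structured post into one string."""
    parts = []
    for block in post["blocks"]:
        for field in SEARCHABLE_FIELDS.get(block["type"], []):
            value = block.get(field)
            if value:
                parts.append(str(value))
    for filter_fn in search_cache_filters:
        parts = filter_fn(parts, post)
    return " ".join(parts)


def add_product_sku(parts, post):
    """Example plugin filter: make product SKUs searchable, and nothing else."""
    skus = [b["sku"] for b in post["blocks"] if b["type"] == "product" and "sku" in b]
    return parts + skus


search_cache_filters.append(add_product_sku)

post = {
    "blocks": [
        {"type": "paragraph", "text": "Our new kettle is here."},
        {"type": "product", "sku": "KET-001", "supplier_cost": "4.20"},
        {"type": "image", "src": "/uploads/kettle.jpg", "alt": "Red kettle"},
    ]
}

cache = build_search_cache(post)
print(cache)
```

The private `supplier_cost` value never reaches the cache because no extractor or filter asks for it, which is the opposite of searching raw post_content.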
Hey Greg, thanks for the post. Do you have a recommendation for a CMS that structures data well? Couldn’t hurt to be prepared, just in case.
I am currently looking for a good answer to this, as I haven’t kept up with the most recent CMSs for a year or so. Drupal has had structured data in a roll-your-own kind of way for a very long time, but is long in the tooth. MobileDoc is at the heart of Ghost, which seems to be making fantastic data decisions, but is still adamant that their solution is primarily a blog. Looking at how slate.js serializes state is a great example of the goal, but it isn’t integrated in any CMSs that I’ve seen.
The hard part is that few CMSs try to do everything integrated in one package like WordPress does, so normally the core wouldn’t care how data is stored, as it would all be via abstraction, and different editors could insert different renderers and encoders, however they want.
My personal reading / learning list currently features Bolt, ProcessWire, ImpressPages and October CMS. Also included, for the static flavour, are SiteCake, AutoMad and HTMLy.
My personal requirements for a proper CMS or Toolkit are:
– at least as flexible as WP (think in terms of action / filter hooks, themes and so on)
– no hackish module crap
– proper, easily understood file and data structure
– future-proof data structure (something like Custom Post Types etc.)
– no overly complex and excessive PHP “we need every new and bleeding-edge feature that is under the stars, the moon and the sun” gangbang orgy
– easily readable, clean and properly documented / commented code
– no obscure quasi-features like TypoScript etc.
– if possible / in the scope, a reliable, user friendly Admin UI would be nice
– proper developer documentation (and NO, some thrown together PHPDocs are NOT to be considered proper documentation!) or at least good tutorials
Considering how the core team handled criticism of the decision to convert the text widget to a neutered visual editor widget that stripped HTML… I’m not holding my breath that they are listening to the critiques and comments that are critical on this project. I’m tempted to fork the last non-Gutenberg version of WP when it launches. I’m also NOT happy about the use of React because of the patent implications with FB.
Thank you for making your voice heard. This so reminds me of the push for the customizer, which only a portion of sites are using.
There’s such a push to be like the new guys, when we could focus on making WordPress a solid platform, and not just blog software with some CMS-like features.
Focus on data, security, CMS and let Gutenberg be OPTIONAL. Sure, turn it on by default. Then I will turn it off. 😉
Thanks Greg for the post.
Gutenberg has rolled out across our sites, yet you can’t even apply superscript! It’s a really poor editor and I have a number of people here simply confused about what it’s for. Things that used to be very simple are now three or four mouse clicks away. Not good.
I know. If you take a look at the reviews section on the Gutenberg plugin, it is clear that the majority of WordPress users feel the same way, but Matt doesn’t want to listen.
I recommend installing the Classic Editor plugin on every site you control, immediately, before WP 5.0 makes it even worse.