HTML Processors - HL Vanilla Community
<main> <article class="userContent"> <div class="embedExternal embedImage display-large float-none"> <div class="embedExternal-content"> <a class="embedImage-link" href="https://us.v-cdn.net/6030677/uploads/QPDJH2QHXB6H/microsoftteams-image-288-29.png" rel="nofollow noreferrer noopener ugc" target="_blank"> <img class="embedImage-img" src="https://us.v-cdn.net/6030677/uploads/QPDJH2QHXB6H/microsoftteams-image-288-29.png" alt="MicrosoftTeams-image (8).png" height="108" width="1356" loading="lazy" data-display-size="large" data-float="none"></img></a> </div> </div> <p>Vanilla allows you to customize content formatting by adding HTML processors. These processors can modify the content of a post or article before it's rendered on the page.</p><h2 data-id="html-processor-fundamentals">HTML Processor Fundamentals</h2><p>The first step to adding an HTML processor to Vanilla's content formatting pipeline is to...create the HTML processor. To do this, you'll need to create a new PHP class that extends <code class="code codeInline" spellcheck="false" tabindex="0">Vanilla\Formatting\Html\Processor\HtmlProcessor</code>. This parent class will contain most of the functionality you'll need to get started. You'll primarily be responsible for implementing your own <code class="code codeInline" spellcheck="false" tabindex="0">processDocument</code> method. This method is responsible for taking an object representation of an HTML document, modifying it and returning the updated value.</p><p>Registering a new HTML processor requires an addon. The HTML processor class will live in the addon directory. 
The addon's primary class will be used to register the HTML processor with Vanilla's formatting service via a hook.</p><h2 data-id="example-html-processor">Example HTML Processor</h2><p>Here's an example of a basic HTML processor.</p><pre class="code codeBlock" spellcheck="false" tabindex="0">&lt;?php

namespace Vanilla\MyAddon\Formatting\Processors;

use DOMText;
use Vanilla\Formatting\Html\HtmlDocument;
use Vanilla\Formatting\Html\Processor\HtmlProcessor;

class CustomProcessor extends HtmlProcessor
{
    // Matches tags like [giphy abc123]; the ID is captured as $1.
    private const REPLACE_PATTERN = "/\[giphy ([A-Z0-9]+)\]/i";

    // Selects text nodes that contain "[giphy " and are not inside a link.
    private const XPATH_QUERY = '/html/body//text()[not(ancestor::a) and (contains(.,"[giphy "))]';

    /**
     * @inheritDoc
     */
    public function processDocument(HtmlDocument $document): HtmlDocument
    {
        $nodes = $document-&gt;queryXPath(self::XPATH_QUERY);

        /** @var DOMText $node */
        foreach ($nodes as $node) {
            // Build a fragment holding the text with each tag swapped for an image.
            // appendXML() parses its input as XML, so text containing a bare
            // &amp; or &lt; would need escaping first.
            $fragment = $document-&gt;getDom()-&gt;createDocumentFragment();
            $fragment-&gt;appendXML(
                preg_replace(
                    self::REPLACE_PATTERN,
                    '&lt;img src="https://media.giphy.com/media/$1/giphy.gif" /&gt;',
                    $node-&gt;data
                )
            );
            // Swap the original text node for the fragment.
            $node-&gt;parentNode-&gt;replaceChild($fragment, $node);
        }

        return $document;
    }
}
</pre><p>The above processor will look for occurrences of <code class="code codeInline" spellcheck="false" tabindex="0">[giphy {ID}]</code> in content, where {ID} is the unique alphanumeric ID of a GIPHY image, and replace each one with an image tag for the GIF.</p><h3 data-id="how-it-works">How It Works</h3><p>As mentioned, the processor only needs one method: <code class="code codeInline" spellcheck="false" tabindex="0">processDocument</code>. This method is automatically invoked for registered HTML processors as part of the content formatting pipeline. It takes one parameter: <code class="code codeInline" spellcheck="false" tabindex="0">$document</code>. This is an instance of <code class="code codeInline" spellcheck="false" tabindex="0">Vanilla\Formatting\Html\HtmlDocument</code> and represents a piece of content that has been rendered as HTML. 
For example, a discussion written in Markdown would be rendered as HTML, then passed to this method as an instance of <code class="code codeInline" spellcheck="false" tabindex="0">HtmlDocument</code>. The <code class="code codeInline" spellcheck="false" tabindex="0">processDocument</code> method would only be aware of the HTML representation. The original Markdown would be inaccessible to the HTML processor.</p><p>Breaking it down a little more, let's take a look at the first line of this method.</p><pre class="code codeBlock" spellcheck="false" tabindex="0">$nodes = $document-&gt;queryXPath(self::XPATH_QUERY);
</pre><p>We perform an XPath query to look for text nodes in the document containing our GIPHY tag code. Because we're explicitly querying for text nodes, every item in the result will be a <code class="code codeInline" spellcheck="false" tabindex="0">DOMText</code> instance. Your own XPath queries may return other node types. Effectively utilizing XPath is beyond the scope of this document, but <a href="https://devhints.io/xpath" rel="nofollow noreferrer ugc">Devhints' XPath Cheatsheet</a> is a great resource.</p><p>Now that we have some nodes to target, we can begin modifying them in a basic loop.</p><pre class="code codeBlock" spellcheck="false" tabindex="0">$fragment = $document-&gt;getDom()-&gt;createDocumentFragment();
</pre><p>The first operation in our loop is to create an empty <code class="code codeInline" spellcheck="false" tabindex="0">DOMDocumentFragment</code> associated with our HTML document. 
This will ultimately be used to replace the text node we're currently targeting.</p><pre class="code codeBlock" spellcheck="false" tabindex="0">$fragment-&gt;appendXML(
    preg_replace(
        self::REPLACE_PATTERN,
        '&lt;img src="https://media.giphy.com/media/$1/giphy.gif" /&gt;',
        $node-&gt;data
    )
);
</pre><p>After nabbing the content of our text node, using its <code class="code codeInline" spellcheck="false" tabindex="0">data</code> property, we make our replacements using a simple regular expression and PHP's <code class="code codeInline" spellcheck="false" tabindex="0">preg_replace</code> function. The result of this call is used to set the content of our newly created <code class="code codeInline" spellcheck="false" tabindex="0">DOMDocumentFragment</code>. At this point, we're ready to commit our substitution to the document.</p><pre class="code codeBlock" spellcheck="false" tabindex="0">$node-&gt;parentNode-&gt;replaceChild($fragment, $node);
</pre><p>In the final step, we replace the current <code class="code codeInline" spellcheck="false" tabindex="0">DOMText</code> instance with our <code class="code codeInline" spellcheck="false" tabindex="0">DOMDocumentFragment</code> in the document. This officially applies the image tag replacement to the content being formatted.</p><h2 data-id="manipulation-of-the-dom">Manipulation of the DOM</h2><p>The ability to fully harness the potential of HTML processors in Vanilla largely depends on your familiarity with <a href="https://www.php.net/manual/en/book.dom.php" rel="nofollow noreferrer ugc">PHP's DOM library</a>, particularly <a href="https://www.php.net/manual/en/class.domdocument.php" rel="nofollow noreferrer ugc"><code class="code codeInline" spellcheck="false" tabindex="0">DOMDocument</code></a> and its related classes. 
The <code class="code codeInline" spellcheck="false" tabindex="0">Vanilla\Formatting\Html\HtmlDocument</code> class provides some utility and convenience methods to help developers use these DOM objects more effectively. Beyond reading up on PHP's own DOM library, you should check out the public methods provided by the <code class="code codeInline" spellcheck="false" tabindex="0">HtmlDocument</code> class and those provided by its traits. There is likely a method there that will reduce the effort required to build your HTML processor.</p><p>In addition to PHP's <code class="code codeInline" spellcheck="false" tabindex="0">DOMDocument</code> and Vanilla's <code class="code codeInline" spellcheck="false" tabindex="0">HtmlDocument</code>, Vanilla also has a utility class for working with the DOM: <code class="code codeInline" spellcheck="false" tabindex="0">Vanilla\Formatting\Html\DomUtils</code>. This class is a collection of methods for performing common tasks on an instance of <code class="code codeInline" spellcheck="false" tabindex="0">DOMDocument</code>.</p><h2 data-id="registering-your-html-processor">Registering Your HTML Processor</h2><p>We have our processor. All that's left to do is register it. This is a very simple process, using Vanilla's container to ensure the processor will be registered with the formatting service. If you don't already have one in your addon, <a href="https://success.vanillaforums.com/kb/articles/245-event-and-handlers" rel="nofollow noreferrer ugc">you'll need to implement a <code class="code codeInline" spellcheck="false" tabindex="0">container_init</code> hook</a>. 
Here's a simplified example:</p><pre class="code codeBlock" spellcheck="false" tabindex="0">// Assumes Container, Reference, and BaseFormat are imported with use statements in the addon class.
public function container_init(Container $dic): void
{
    $dic-&gt;rule(BaseFormat::class)
        -&gt;addCall("addHtmlProcessor", [
            new Reference(\Vanilla\MyAddon\Formatting\Processors\CustomProcessor::class),
        ]);
}
</pre><p>This rule adds a call to <code class="code codeInline" spellcheck="false" tabindex="0">addHtmlProcessor</code> for all formatters instantiated by the container, providing our <code class="code codeInline" spellcheck="false" tabindex="0">Vanilla\MyAddon\Formatting\Processors\CustomProcessor</code> class from the above example as the new processor. The formatters will automatically use the new processor on the HTML they generate.</p> </article> </main>