Generating A Table Of Contents Using jSoup And ColdFusion

I’m authoring my Feature Flags Book using Markdown. Then, I’m converting the Markdown into HTML using Flexmark and ColdFusion. And, once I have the raw HTML, I’m using jSoup to augment the DOM for output. As part of this, I’m dynamically injecting a Table of Contents (ToC). In the book, I’m only including the h2 headings; but, it got me thinking about how I might use jSoup and ColdFusion to create a more inclusive table of contents.

The main issue here is the “Impedance Mismatch” between the structure of the HTML document and the structure of the Table of Contents. The HTML structure is relatively flat (possibly even entirely flat), wherein all of the heading elements can be siblings. We think of these headings as being hierarchical. But, this is a “mental model”, not a structural model.

The table of contents, on the other hand, is (often) a hierarchical model, wherein “nested headers” are rendered as nested lists. As such, in order to dynamically render the ToC, we have to translate the implicit hierarchy of headers into an explicit hierarchy of data structures.

This is an algorithm that we intuitively understand; but, it’s not the easiest to describe. Given a header (H), we need to walk up the pending Tree structure until we find a parent header (P) such that P<level> is semantically greater than H<level> (that is, P has the numerically smaller heading number, such as an h2 sitting above an h3). This means that we’ve located the direct parent of the given header; and, at that point, we can append the header (H) to the children of (P).

To explore this, I created a flat HTML file that has a series of header elements from H1 all the way down to H6 (content abbreviated for the blog):

<h1>My Groovy Manifesto (h1)</h1>
<h2>Chapter 1 (h2)</h2>
<h3>Subsection 1-1 (h3)</h3>
<h4>Subsection 1-1-1 (h4)</h4>
<h5>Subsection 1-1-1-1 (h5)</h5>
<h6>Subsection 1-1-1-1-1 (h6)</h6>
<h4>Subsection 1-1-2 (h4)</h4>
<h3>Subsection 1-2 (h3)</h3>
<h3>Subsection 1-3 (h3)</h3>
<h2>Chapter 2 (h2)</h2>
<h3>Subsection 2-1 (h3)</h3>
<h4>Subsection 2-1-1 (h4)</h4>
<h4>Subsection 2-1-2 (h4)</h4>
<h2>Chapter 3 (h2)</h2>
<h3>Subsection 3-1 (h3)</h3>

As you can see, all of the headers are siblings of each other; the “hierarchy” is semantic, not structural. Generating a structural table of contents in ColdFusion (Lucee CFML) looks like this:

<cfscript>

	document = javaNew( "org.jsoup.Jsoup" )
		.parseBodyFragment( fileRead( "./content.htm" ) )
	;

	// The heading nodes in the HTML content are hierarchical from a semantic perspective,
	// but are all siblings from a structural perspective. As such, we need to translate
	// that FLAT structure into a TREE structure for our table of contents. Each section /
	// heading is going to contain a level and a set of sub-sections (children).
	toc = [
		level: 0,
		children: []
	];

	// In order to generate a hierarchical structure, we need to keep track of the
	// "parent" heading. This way, we'll know when we encounter a child of the previous
	// heading; or, if we have to traverse back up the "parent chain" to find an
	// appropriate location in a different heading.
	parent = toc;

	// I determine how deep the table of contents should go. Not every single header
	// necessarily adds value to the ToC (from a user experience perspective).
	maxLevelInToc = 5;

	for ( node in document.select( "h1, h2, h3, h4, h5, h6" ) ) {

		current = [
			level: val( node.tagName().right( 1 ) ),
			title: node.text(),
			children: [],
			// NOTE: By default, we're going to assume that the current heading node is a
			// subsection of the parent heading node. We'll validate this below.
			parent: parent
		];

		if ( current.level > maxLevelInToc ) {

			continue;

		}

		// The current/parent assumption above is ONLY CORRECT if the current level is
		// greater than the parent level (ex, h3 vs h2). However, if the current level is
		// smaller than or equal to the parent level, we have to travel up the TREE until
		// we find the appropriate parent (ex, if current node is h2 and parent is h2, we
		// have to travel up the parent-chain until we find the h1 that will contain the
		// current h2).
		while ( current.level <= current.parent.level ) {

			current.parent = current.parent.parent;

		}

		// Now that we've identified the correct parent/child relationship for our current
		// node, we can add it to the correct children collection and then track the
		// current node as the parent for subsequent headings. This will create a
		// bidirectional tree structure.
		current.parent.children.append( current );
		parent = current;

	} // END: For-loop.

	// At this point, we've aggregated all of our document headings. Render them as a
	// series of nested lists, starting with our root TOC container.
	renderSection( toc.children );

	// ------------------------------------------------------------------------------- //
	// ------------------------------------------------------------------------------- //

	/**
	* I render the given table-of-content (ToC) sections. This function calls itself
	* recursively while there are children to render.
	*/
	public void function renderSection( required array sections ) {

		if ( ! sections.len() ) {

			return;

		}

		```
		<cfoutput>
			<ul>
				<cfloop item="local.section" array="#sections#">
					<li>
						#encodeForHtml( section.title )#
						#renderSection( section.children )#
					</li>
				</cfloop>
			</ul>
		</cfoutput>
		```

	}


	/**
	* I create a new Java class wrapper using the jSoup JAR files.
	*/
	public any function javaNew( required string className ) {

		var jarPaths = [
			expandPath( "./jsoup-1.16.1.jar" )
		];

		return( createObject( "java", className, jarPaths ) );

	}

</cfscript>

As you can see, our Tree structure is bidirectional. As we iterate over the header elements, we build a connection from the parent heading to its subheadings as well as a connection from the subheading back to its parent. This bidirectionality allows us to walk back up the ToC structure when we need to find the appropriate semantic parent.
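As an aside, the back-references make the structure circular, which can complicate things like serialization. The same walk-up logic could probably be expressed with a stack of "open" sections instead of parent pointers. What follows is only a minimal sketch under stated assumptions, not the approach used above: the `headings` array is hypothetical, hard-coded stand-in data for the values that would come from the jSoup heading nodes, and `stack` and `section` are illustrative names.

```cfml
<cfscript>

	// ASSUMPTION: hypothetical (level, title) pairs standing in for the values that
	// would be read from the heading nodes. No jSoup is involved in this sketch.
	headings = [
		{ level: 1, title: "My Groovy Manifesto" },
		{ level: 2, title: "Chapter 1" },
		{ level: 3, title: "Subsection 1-1" },
		{ level: 2, title: "Chapter 2" }
	];

	toc = { level: 0, children: [] };

	// The stack holds the chain of "open" sections, from the root down to the most
	// recently added heading. It takes the place of the parent references.
	stack = [ toc ];

	for ( heading in headings ) {

		section = {
			level: heading.level,
			title: heading.title,
			children: []
		};

		// Pop sections until the top of the stack outranks the current heading. The
		// root (level 0) is never popped because heading levels start at 1.
		while ( section.level <= stack.last().level ) {

			stack.deleteAt( stack.len() );

		}

		// The top of the stack is now the semantic parent of the current heading.
		stack.last().children.append( section );
		stack.append( section );

	}

	// With no parent references, the tree contains no cycles.
	writeOutput( serializeJson( toc ) );

</cfscript>
```

The trade-off is that a section can no longer find its own parent after the loop finishes; if the rendering step only ever walks downward (as a recursive list renderer does), that may not matter.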

Once we have the nested data structure, we can then render it as a series of nested lists:

A table of contents rendered as a series of nested unordered lists.

jSoup is such an amazing tool. I was a bit slow to adopt it (it’s been around for years). But, now that I have it as part of my ColdFusion tool-belt, I’m always finding more ways to leverage it.

Want to use code from this post?
Check out the license.


