Divs are played out
We all love our <div> tags. They've been around for decades, and for decades they've been the go-to element when you need to wrap some stuff in a block for styling or structural purposes. It's still very common to look through production websites and see stuff like this:
<div class="container" id="header">
    <div class="header header-main">Super duper best blog ever</div>
    <div class="site-navigation">
        <a href="/">Home</a>
        <a href="/about">About</a>
        <a href="/archive">Archive</a>
    </div>
</div>
<div class="container" id="main">
    <div class="article-header-level-1">
        Why you should buy more cheeses than you currently do
    </div>
    <div class="article-content">
        <div class="article-section">
            <div class="article-header-level-2">
                Part 1: Variety is spicy
            </div>
            <!-- cheesy content -->
        </div>
        <div class="article-section">
            <div class="article-header-level-2">
                Part 2: Cows are great
            </div>
            <!-- more cheesy content -->
        </div>
    </div>
</div>
<div class="container" id="footer">
    Contact us!
    <div class="contact-info">
        <p class="email">
            <a href="mailto:us@example.com">us@example.com</a>
        </p>
        <div class="street-address">
            <p>123 Main St., Suite 404</p>
            <p>Yourtown, AK, 12345</p>
            <p>United States of America</p>
        </div>
    </div>
</div>
Hoo, that's a lot of <div>s. And hey, it works. I mean, mostly. It has the structure you need, and I'm sure it'll look the way you intend by the time you're done styling it. But it has some big problems:
- Accessibility - Many a11y tools are pretty smart, and try their best to parse the structure of a page to help guide users through it in the way the page's author intends, and to give users easy jump points to navigate quickly to the section of the page they care about. But <div>s don't really impart any useful info about the structure of a document. The smartest a11y tool in the world still isn't a human, and can't be expected to parse class and id attributes and recognize all the weird and wild ways that devs all over the world name their blocks. I can recognize that class="article-header-level-2" is a subheading, but a robot can't. (And if it can, get it out of my computer, I'm not ready for the AGI revolution just yet.)
- Readability - To read this code, you need to carefully scan for the class names, picking them out from between the <div class="..."></div> boilerplate. And once you're a few levels deep in the markup, it becomes tricky to keep track of which </div> closing tags go with which <div...> opening tags. You start to rely very heavily on IDE features like coloring different indentation levels or highlighting the matching tag for you to keep track of where you are, and in larger documents it can require a lot of scrolling on top of those features.
- Consistency and standards - It can be frustrating to start a new job or move to a new project and have to learn from scratch all the crazy markup conventions used across the codebase. If everyone had a standardized way to mark up common structures in web documents, it would be much easier to skim an HTML file in an unfamiliar codebase and get a quick handle on what it's supposed to represent. If only there was such a standard...
HTML5: Such a standard
HTML5 is not new. That's an understatement; an initial working draft was released for public comment in January of 2008 (11 years ago!), and it became a full-fledged W3C recommendation in October of 2014, 4½ years ago. So, like, it's been around for a while.
One of the primary advancements of HTML5 was introducing a standardized set of semantic elements. The term "semantic" refers to the meaning of a word or a thing, so "semantic elements" are elements designed to mark up the structure of a document in a more meaningful way, a way that makes it clear what they're for, what purpose they serve in the document. And importantly, since they're standardized, these elements define the document in a way that everyone can use and understand, robots included.
I think the HTML5 spec itself sums up the issue nicely in a note under the definition of the <div> element:
NOTE:
Authors are strongly encouraged to view the div element as an element of last resort, for when no other element is suitable. Use of more appropriate elements instead of the div element leads to better accessibility for readers and easier maintainability for authors.
Source: https://www.w3.org/TR/html5/grouping-content.html#the-div-element
I'll divide the semantic block elements into two categories: primary structure and content indicators. These aren't standard terms or anything; I just made them up for this article. But I think the distinction is useful enough. 🤷‍♂️
Primary Structures
There's a super common pattern that can be found in websites, tutorials, and even CSS libraries all over the internet, and for good reason. We often divide a page at its topmost level into three regions: header, main, and footer, then divide those regions into sections as needed. I included this in my example above to prove the point:
<div class="container" id="header">...</div>
<div class="container" id="main">
    ...
    <div class="article-section">...</div>
    ...
</div>
<div class="container" id="footer">...</div>
I've seen (and used) this pattern for decades, and it makes a ton of sense to structure a document this way, both for readability of the HTML and for easier styling of the page in CSS. The header and footer elements also make partial templates in languages like PHP or Rails/ERB a ton easier to work with, as you can include common header and footer partials all over the site:
<?php include 'header.php'; ?>
<div id="main">...</div>
<?php include 'footer.php'; ?>
So here's the thing: everyone agrees that this is a nice pattern to follow. And that includes the folks at the WHATWG and W3C, who standardized the pattern into four new elements in HTML5 with very clear names: <header>, <main>, <footer>, and <section>.
  
  
  Bookends: <header> and <footer>
The <header> and <footer> elements are basically twins: they're very similarly defined in the spec and follow the same set of rules about where they're allowed to be used, with the only difference being their semantic purposes: headers go at the beginnings of things, footers go at the ends of things. And by "things", I mean more than just the <body> of your page: this pair of elements is designed to be used within any part of your document that represents a chunk of content with a clear beginning and end. This can include things like forms, articles, sections of articles, posts on a social media site, cards, etc.
Headers and footers are attached semantically to the closest "sectioning root" or "sectioning content" element. These are things like <body>, <blockquote>, <section>, <td>, <aside>, and lots of others; the spec's definitions of "sectioning root" and "sectioning content" have the full lists. Assistive technologies can use these elements and others to generate an outline of a document, which can help users navigate it more easily. You shouldn't have more than one <header> or <footer> per sectioning root/content. (One of each is fine, but not two of the same.)
As a final note, <header>s very often hold the heading element (<h1>-<h6>) for their context. This is not necessary, but can help to group other related elements with the heading, like links, images, or subheadings, and can help maintain a consistent structure even when the heading is the only element in the <header>.
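Here's a rough sketch of that grouping, reusing one of the sections from the cheese example. The tagline, the link, and the class name are made up purely for illustration; the point is only that the <header> bundles the heading with its related bits, and the <footer> closes out the same chunk of content:

```html
<section>
    <header>
        <h2>Part 1: Variety is spicy</h2>
        <!-- Hypothetical tagline, grouped with the heading it belongs to -->
        <p class="tagline">In which we meet more cheeses than strictly necessary</p>
    </header>
    <!-- cheesy content -->
    <footer>
        <!-- Hypothetical closing link that applies to this section only -->
        <a href="/archive">More posts about cheese</a>
    </footer>
</section>
```

Note that there's exactly one <header> and one <footer> here, and both belong to the enclosing <section>, not to the page as a whole.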
  
  
  The good stuff: <main>
The third primary region element, <main>, is special. The spec says two very important things about <main>:
The main content area of a document includes content that is unique to that document and excludes content that is repeated across a set of documents such as site navigation links, copyright information, site logos and banners and search forms (unless the document or application's main function is that of a search form).
Source: https://www.w3.org/TR/html5/grouping-content.html#elementdef-main
So <main> is where you put the good stuff, the important parts of a page, the reason the user came to this page in particular, not your site in general. In other words, the main content. 🤯
All that other stuff, logos and search forms and navigation and such, can go in a <header> or <footer> within the <body> but outside of <main>.
There must not be more than one visible main element in a document. If more than one main element is present in a document, all other instances must be hidden using the hidden attribute.
Source: https://www.w3.org/TR/html5/grouping-content.html#elementdef-main
This is pretty unique. Unlike <header> and <footer> (and most other block elements), <main> can't be used all over the page within arbitrary sectioning content; it should be used once and only once. Or rather, it can be used multiple times in a document, but only one <main> element should be visible at a time; all others must be hidden with the hidden attribute, which basically acts like display: none; in CSS. If you think about it, this suggests a pretty useful pattern for preloading views in an app: create a new <main hidden>, fetch some content that the user is likely to view next (e.g., the next article in a series, the next slide in a slideshow, etc.), and when the user clicks the link/button to load that view, swap out the current <main> with the preloaded one by toggling the hidden attribute on both.
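As a minimal sketch of that preloading idea (the ids, the slide text, and the button are all assumptions for the example, and the fetching step is left out entirely), the swap itself could look something like this:

```html
<main id="current-view">
    <h1>Slide 1</h1>
</main>

<!-- Assumed to be already fetched and filled in; hidden so only one <main> is visible -->
<main id="next-view" hidden>
    <h1>Slide 2</h1>
</main>

<button id="next-button" type="button">Next</button>

<script>
    // Swap the views by flipping the `hidden` attribute on both <main> elements.
    document.getElementById('next-button').addEventListener('click', () => {
        document.getElementById('current-view').hidden = true;
        document.getElementById('next-view').hidden = false;
    });
</script>
```

Setting the hidden property in JavaScript adds or removes the hidden attribute, so at any given moment there's still only one visible <main> in the document.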
Before continuing, let's pause and review the example from above. Here's how it would look if we used <header>, <main>, and <footer> for the main structure of the article:
<header>
    <h1>Super duper best blog ever</h1>
    ...
</header>
<main>
    <h2>Why you should buy more cheeses than you currently do</h2>
    ...
</main>
<footer>
    Contact us!
    <div class="contact-info">us@example.com</div>
</footer>
That's so much nicer already! But there's still plenty of work to do.
  
  
  Break it down: <section>
So we've got a basic outline for our page: a header, a footer, and a main content region. Now it's time to add some of that sweet, sweet content.
Typically you'll want to break your content down into sections, especially for mass text content like this article, because no one likes reading impenetrable walls of text.
That's where <section> comes in. This one is the simplest in terms of rules: structurally speaking, it's basically just a <div> with special semantic meaning. A <section> begins a new "sectioning content" region, so it can have its own <header> and/or <footer>.
What's the difference, then, between a <section> and a regular old <div>, and when should you use each? Well, allow me to quote the spec once again:
NOTE:
The <section> element is not a generic container element. When an element is needed only for styling purposes or as a convenience for scripting, authors are encouraged to use the <div> element instead. A general rule is that the <section> element is appropriate only if the element's contents would be listed explicitly in the document's outline.
Source: https://www.w3.org/TR/html5/sections.html#the-section-element
You know, as a quick aside, the HTML5 spec is actually pretty readable. It's one of the more readable specs out there. Every time I glance at it for a quick answer, I inevitably learn something unexpected and useful, especially if I start clicking links. Give it a try some time!
So in short, if you would list this portion of the document in the table of contents, use a <section>. If not, use a <div> or something else.
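To make that rule of thumb concrete, here's a small sketch. The class name is hypothetical; the idea is just that the <section> would show up in a table of contents, while the <div> exists only so CSS has something to grab onto:

```html
<section>
    <h2>Part 1: Variety is spicy</h2>
    <!-- Styling-only wrapper with no meaning of its own: a <div> is the right tool -->
    <div class="two-column-layout">
        <p>...</p>
        <p>...</p>
    </div>
</section>
```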
Content Indicators
Okay, so we've got a solid structure for our page. Instead of just slinging <div>s all over, we've explicitly marked the main content region of the page, and we've called out headers, footers, and sections. But there's definitely more semantics than that to our document.
Let's talk about a few of the elements added in HTML5 that communicate content semantics rather than structure.
  
  
  The whole shebang: <article>
The <article> element is used to represent a fully self-contained region of content, something that could be plucked out of your page and dropped into another and still make sense on its own. This might be a literal article or blog post, but could also be used for a social media post like a tweet or a Facebook wall post.
The HTML5 spec recommends that every article have a heading that identifies it, ideally using a heading element (<h1>-<h6>). An <article> can also have <header>, <footer>, and <section> elements, so you really could use it to embed a full document fragment with all the structure it needs within another page.
To return to the example from way up above, let's rewrite the class="article-*" elements using an <article> and some of the other elements we've discussed.
<article>
    <header>
        <h1>Why you should buy more cheeses than you currently do</h1>
    </header>
    <section>
        <header>
            <h2>Part 1: Variety is spicy</h2>
        </header>
        <!-- cheesy content -->
    </section>
    <section>
        <header>
            <h2>Part 2: Cows are great</h2>
        </header>
        <!-- more cheesy content -->
    </section>
</article>
Isn't that a ton more readable than the original? And again, not only is it easier to read, it's way more useful for assistive tech; robots can't always figure out your specific class name pattern, but they can follow this structure.
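The same element also works for smaller self-contained items, like posts in a feed. As a quick sketch (the headings and post titles are invented for the example), several of them can sit side by side inside one <section>:

```html
<section>
    <h2>Latest posts</h2>
    <!-- Each post makes sense on its own, so each gets its own <article> -->
    <article>
        <h3>Brie is underrated</h3>
        <p>...</p>
    </article>
    <article>
        <h3>A beginner's guide to blue cheese</h3>
        <p>...</p>
    </article>
</section>
```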
  
  
  Getting around: <nav>
This element is a bit more well-known than others. <nav> is designed to clearly identify the main navigation blocks on the page, the groups of links that help the user find their way around the rest of the site (e.g. a site map or list of links in the header) or the current page (e.g. a table of contents).
In our example up top, let's apply a <nav> to that group of links in the header.
<nav>
    <a href="/">Home</a>
    <a href="/about">About</a>
    <a href="/archive">Archive</a>
</nav>
Doesn't change the structure at all, but you know what it is at a glance rather than needing to read and process the class name on a <div> to figure it out, and more importantly the robots can find it too. 
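A <nav> for the current page works the same way. Here's a sketch of a table of contents for the cheese article; the id values and the "On this page" heading are assumptions for the example, not something from the original markup:

```html
<nav>
    <h2>On this page</h2>
    <ol>
        <!-- Each link points at the id of a section further down the page -->
        <li><a href="#part-1">Part 1: Variety is spicy</a></li>
        <li><a href="#part-2">Part 2: Cows are great</a></li>
    </ol>
</nav>

<section id="part-1">
    <h2>Part 1: Variety is spicy</h2>
    <!-- cheesy content -->
</section>
<section id="part-2">
    <h2>Part 2: Cows are great</h2>
    <!-- more cheesy content -->
</section>
```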
  
  
  Getting in touch: <address>
The last element we'll discuss is <address>. This element is intended to call out contact info, and it's often used in the main page <footer> to mark up the mailing address, phone number, customer service email address, etc. for a business.
Interestingly, the rules for how to mark up the content within an <address> element are left open. The spec mentions that there are several other specs that address this, and it probably is outside the scope of HTML itself to provide that level of granularity.
A common solution is RDFa, also a W3C spec, which uses attributes on tags to label the different components of data. Here's what the footer from our example might look like when marked up with <address> elements and RDFa:
<footer>
    <section class="contact" vocab="http://schema.org/" typeof="LocalBusiness">
        <h2>Contact us!</h2>
        <address property="email">
            <a href="mailto:us@example.com">us@example.com</a>
        </address>
        <address property="address" typeof="PostalAddress">
            <p property="streetAddress">123 Main St., Suite 404</p>
            <p>
                <span property="addressLocality">Yourtown</span>,
                <span property="addressRegion">AK</span>,
                <span property="postalCode">12345</span>   
            </p>
            <p property="addressCountry">United States of America</p>
        </address>
    </section>
</footer>
RDFa is admittedly a bit verbose, but it's pretty handy for marking up data. If you're interested in learning more about RDFa, here's a few links:
- The W3C's RDFa primer
- A description of schemas and links to a bunch of them on schema.org
- The LocalBusiness schema used above
Conclusion
Okay, we've covered a lot, and we've seen a lot of it applied to our example in bits and pieces. But let's put it all together and see what it looks like.
<header>
    <h1>Super duper best blog ever</h1>
    <nav>
        <a href="/">Home</a>
        <a href="/about">About</a>
        <a href="/archive">Archive</a>
    </nav>
</header>
<main>
    <article>
        <header>
            <h1>Why you should buy more cheeses than you currently do</h1>
        </header>
        <section>
            <header>
                <h2>Part 1: Variety is spicy</h2>
            </header>
            <!-- cheesy content -->
        </section>
        <section>
            <header>
                <h2>Part 2: Cows are great</h2>
            </header>
            <!-- more cheesy content -->
        </section>
    </article>
</main>
<footer>
    <section class="contact" vocab="http://schema.org/" typeof="LocalBusiness">
        <h2>Contact us!</h2>
        <address property="email">
            <a href="mailto:us@example.com">us@example.com</a>
        </address>
        <address property="address" typeof="PostalAddress">
            <p property="streetAddress">123 Main St., Suite 404</p>
            <p>
                <span property="addressLocality">Yourtown</span>,
                <span property="addressRegion">AK</span>,
                <span property="postalCode">12345</span>   
            </p>
            <p property="addressCountry">United States of America</p>
        </address>
    </section>
</footer>
If you ask me, that's 100x more readable than the original example, and it's going to be 100x more effective for SEO and accessibility purposes, too.
These are by no means the only semantic elements in HTML. There are lots of additional elements that help to tag and structure your text content, embedded media, etc. Here are a few to check out if you're enjoying this and want to dig deeper. You might recognize a few:
And that's just a start! Like I said, when you start reading the HTML spec, it's tough to stop. It's an incredibly rich language, and I think people underestimate it a little too often.
 
 
              

 
    
Top comments (128)
Awesome post!
You have a dangling </div> in the last HTML code block. :)
Oh no! Hahaha, thanks for pointing that out! Fixed it
I agree with everyone, very nice and useful post! Especially for me, since I'm not a web developer but sometimes do side and personal projects that involve web development!
As a side note, there's still one extra </div> in the code sample in the "The whole shebang: <article>" section ;)
Thanks, fixed!
Which tag should I use to display the code?
For semantic purposes, use the <code> tag. For display purposes, to make it look like code, it depends:
If it's an inline code snippet in the middle of a sentence, you can just use the <code> tag:
The <code>while(condition)</code> loop is useful for loops with an unknown number of iterations
If it's an independent block of code, possibly with multiple lines, and you don't need syntax highlighting, you can wrap your <code> tag in the "preformatted text" tag, <pre>.
If you do need syntax highlighting, you still use the <code> tag, but you're probably going to want to bring in a library that does the very hard work of parsing and highlighting your code, like Prism or highlight.js.
Bottom line, though, no matter what extra stuff you're doing for display purposes, code should always be wrapped in a <code> tag.
Great post.
I've been doing this for over a year and love it.
I feel you should have given some mention to Custom Tags. They help so much in pushing this to the next level.
PS-- Content Indicators has a hanging < /div>
Thanks, fixed!
By custom tags, do you mean custom elements? Those are a whole other ball of wax. They're part of the Web Components API, which requires JavaScript and a ton of extra domain knowledge. This article is just an introduction to an important feature of HTML5, and Web Components are not a part of HTML5, so that would fall pretty far outside the scope of this article.
To be honest, I'm pretty unfamiliar with custom elements myself, so I'm not the person to write that article. Sounds like you have some background, so maybe you are! I'd love to read it!
Yes custom elements are what I mean.
Though there is far more to it than I know or understand (for now). In its simplest form you can just write a tag as <box></box> (or any word you choose) and be able to target that tag as you would any other.
JavaScript:
document.getElementsByTagName("box")
CSS:
box{display: none;}
All without understanding anything else of the API.
The greatest thing about a custom tag is it is blank.
No pre-attached CSS or code like 'p' or 'div' have. You're free to write without having to remember what has what attached.
I use this kind of coding extensively in my Browser Game "EVO Idle". As well as some new Dynamics to manipulate information quickly and easily. (Be warned some are considered against the standard.)
Right, that's the thing. Those aren't so much "custom elements" as they are "nonstandard elements", because the browser doesn't understand them, and they don't have any functionality backing them. It is true that you can write arbitrary tags in your markup, and IIRC the browser treats them like <span>s by default (which is to say, as generic inline elements), but this isn't standard practice, and it's usually considered a bad idea.
A custom tag name indicates to other developers that there's something special about this tag, that it's a component or a proper Custom Element with some JS behind it or something, and it can be very confusing to read markup with lots of nonstandard elements that aren't backed by any other code, especially if they're mixed with actual components that are backed by code. So my recommendation is to use a standard element with a class="..." instead of a custom tag name. The only thing that changes in your selector is an extra . before what would have been the element name, and is now a class name.
Just my two cents, take it or leave it
html.spec.whatwg.org/#custom-elements
Yep, that's a nice summary! They distinguish there between "non-standard elements", arbitrary tags that have not been added to the CustomElementRegistry, and "custom elements", which have:
(My emphasis added.)
The word "defining" is linked to the portion of the standard that describes how to define a custom element, which begins as follows:
I disagree. If people wish to make assumptions about everything then it is their own fault for messing up.
A simple Google check will verify if a tag is standard or custom.
Using custom tags with or without javascript makes the document more readable. Which is the point you were making.
Not everything needs to be predefined.
If it were the language would never have evolved in the first place.
It is people pushing out of the standard practice that evolve the language.
If you wish to be confined by such limitations that is your choice.
I choose to evolve my coding style by trying new things and creating new concepts. Even in the face of the people's backward concepts of what is and is not proper coding.
Alright, please read what I have to say here in full. I need to say something that I feel is very important.
I fully respect that you have a different opinion from me on how best to write your markup. And I'm perfectly cool with that; you can write your HTML in whatever way you see fit, and I'll be happy to hear how it works for you, and what benefits you find in it. Genuinely, reading about alternative viewpoints on web development is one of my hobbies, and I almost always find something I like in every one I explore.
But please do not turn around and label my attempts to explain the existing standards and best practices of the web platform as "backward concepts of what is and is not proper coding". Because these are not some arbitrary whims handed down from some oligarchic hierarchy of web gods. These are standards with a huge amount, literally decades, of research, community-wide debate, and iterative revisions behind them, and there are very good reasons why they exist.
It honestly hurts to be accused of trying to "confine" and "limit" people from "trying new things and creating new concepts" simply because I'm explaining the background and advantages of the specs that are out there. My point was never that nonstandard tags are evil and you should feel bad for using them. But you started the conversation by suggesting that I should have promoted nonstandard elements as a good practice that "help[s] so much in pushing this to the next level." The fact that you said that nonstandard tags can build on the techniques that I discussed in my article tells me that I did a poor job of emphasizing the reasons why we use standardized semantic tags. Semantic tags are not primarily about improving code readability. If that was the case, there would have been no point in defining a spec for them; we could just standardize the use of arbitrary tag names and let common patterns develop within the community, like we have with CSS class names.
My point in this thread was to let you know that there is a very real difference between arbitrary nonstandard elements, which have no defined semantics or behavior and can't be used by assistive tech or web crawlers, and true custom elements, for which the developer has explicitly defined the behavior and semantics for the browser.
Because here's the thing: the semantic web isn't just a matter of preference or style or convenience. It has a huge direct impact on the lives of many, many users, those who rely on assistive technology, which in turn relies on the semantics it can parse from the text to help those users.
If you've never tried to use the web with a screen reader before, please do. I think every web developer needs to do this periodically in order to better understand how many of their users interact with the web, and how honestly horrible a lot of the web can be for users who rely on assistive tech. If your site is built with nonstandard elements with no defined semantics, then the best that a screen reader can do is read the text top-to-bottom, with no way to let the user easily navigate the page. But if you use the elements I talked about in this article, a screen reader can build an outline of the page to give to the user, and it makes it a hundred times easier to navigate the page.
And structured data specs like RDFa help fill in the rest of the semantics that aren't expressible in HTML alone. Seriously, browse a little through schema.org/docs/full.html and look at all the options. And that's all stuff that assistive tools can potentially utilize to give users more context about what the page represents. (And by the way, it can dramatically help your SEO on top of this.)
In my experience, there's a tragic lack of attention paid to semantics in web development training, and this knowledge gap actively hurts the users that actually need it. That's a big part of why I wrote this article. Using semantic HTML and structured data formats improves the lives of many people much more directly than you might expect. Nonstandard tags, unfortunately, don't, and I worry they may redirect devs away from the standard methods that do because nonstandard tags require fewer characters and zero research.
Maybe I should have emphasized the a11y aspect of semantics more strongly in my article, and maybe I'll write a follow-up to do just that. But please, please don't think that I or anyone else is trying to enforce some arbitrary restrictions that stifle innovation by recommending that devs avoid nonstandard elements and use existing semantics frameworks instead. What I'm trying to do is help one of the most underserved and ignored groups of users on the web.
This feels like a hugely under-appreciated comment, it's a very good reply. I seriously think you can turn this comment into its own blog.
Thanks! I may do that
Please do :-)
For posterity, since new people seem to keep finding this article and this comment thread, I did write that follow-up:
Why I care about the Semantic Web
Ken Bellows ・ Apr 22 '19 ・ 7 min read
Good article. It can be tricky when a design isn't at all related to the common news, blogs, etc. In those times I've found this flowchart pretty handy to help reason about semantics. html5doctor.com/downloads/h5d-sect...
Nice, that's an awesome chart! Haven't come across it before. Kinda surprised that <header> and <footer> don't show up anywhere
Perhaps a v2 is in order ;)
Edit: I just noticed the date on it is 2011-07-22, so yeah, a v2 really would be good.
I completely agree with using semantic HTML. I usually try to use the sectioning elements.
But I notice that I'm still heavily using <div>s inside of an <article> or <section>, especially when doing layout with flexbox.
I tend to create a lot of wrapping <div>s, for grouping together some elements as one flex child.
(e.g. for using the margin-*: auto; trick altogether)
It kind of feels unsemantic, but then again, since these are only used for layout, it seems to be ok.
How do other people feel about that?
I have two feelings:
First and foremost, it's almost definitely fine! The point here isn't to totally get rid of <div>s, it's to stop using them in cases where they have some semantic meaning that's covered by another element. But generic containers with no semantics to them are exactly what <div>s are, so if they're just grouping things for styling purposes, not semantic purposes, you're all good!
Second, with the above said, as kind of a side comment, if you're creating a lot of nested flex containers, you probably would be better served with a flatter HTML structure and CSS Grid for the layout. Grid is supported pretty solidly at this point, and it's usually not hard to create fallbacks for older browsers that still look perfectly fine (even if they don't exactly match the mockup) without adding any extra markup. If you haven't tried Grid yet, give it a shot, it'll blow your mind! (If you can't tell, I'm very excited about Grid, I wrote a couple of articles about how pumped I am haha, I recommend this one if you wanna see the thing I love most)
I'm more and more trying to use Grid, you're right it is mindblowing.
Right now I'm using it for overall page layout and flex for smaller sub items. I need to reconsider if nested flexboxes can be done with grid.
(I actually found my way to this article, via your Why we need CSS subgrid article. Great one, too!)
I just wanted to chime in on the CSS Grid discussion.
If the only reason you aren't using CSS Grid yet is because IE11 needs to look identical; I wrote a whole series on how to write modern CSS Grid code that works perfectly in IE with no fallback styles.
css-tricks.com/css-grid-in-ie-debu...
PS. Great article, I'll definitely be recommending it to people.
Stop telling me what to do!
Great post.
But something bugs me: multiple <section> in one <article>?
When I learned HTML5 ~10 years ago, I was told the contrary, multiple <article> in one <section> (see alsacreations.com/xmedia/doc/origi...).
What is the best practice?
Well, you can do both, depending on the scope of the tags. For example, I might put a tweet in an <article>, and I might show multiple tweets in a single <section>. But typically, on a site like a blog where you have literal articles or other long-form text, you'd have one <article> that wraps the main text content of the document, and that <article> would contain multiple <section>s.
Let's look at the spec again.
<article>:
<section>:
So the idea that an <article> can contain multiple <section>s seems natural to me. And in fact, the spec itself shows a code example under the definition of <section> linked above where an <article> has multiple <section> elements within it.
Thanks for answering!
It feels clearer now.
Last question, just to be sure I understand well, would it be correct to do the following :
I don't see why not! Two tangential comments though:
Each <section> should ideally have a heading tag (<h1>-<h6>) to identify the section. Remember that a <section> should be something you'd list in your table of contents, so what would you call that section? Though I recognize you might have skipped it for the sake of example code, and of course this has no effect on the main point
Your comment syntax is slightly wrong, it should be <!-- rather than <--!. But again, no effect on the main point
Thanks again!
And :facepalm: for comment syntax X)
Good article!
However, I have strong and complex feelings on this, having been interested in the semantic web since 2003, and I remember ardently following xhtml1.0 strict standards.
main: great, good.
header/footer: awesome, thanks
nav: oh wow how did we live through xhtml without it?
article/section/aside: wow, trash, wtf?
Article, section and aside are so vague that they semantically mean about the same thing as div. You still have to drop identifying classes on them to actually note what they are intended to contain.
Comments/assorted widgets/supplementary info can be articles, but also can be sections, but can also be asides or contain or be contained by any mixture of the above. I'm not saying div is the gold standard, but this is one area whatwg kinda dropped the ball. Semantically, it doesn't say what it is any more than div does.
I'm not alone here, either.
That's a very interesting perspective, genuinely, and I'll have to think more about it.
But my initial reaction is that I think article and section at least have pretty clear meanings:
- <article> should be used only for a high-level portion of the document that is sufficiently independent that if it were plucked out of your page and dropped into another page, it would still make perfect sense, like a blog post or tweet.
- <section> is a block that doesn't meet the above independence criteria, because it only makes sense within the context of the surrounding content, but that you would list in your table of contents.
An <aside> is, admittedly, a bit more vague: is a "note" block in the middle of your text an "aside"? I often use phrases in my writing like "as an aside, ...", but I wouldn't put the following paragraph in an <aside>. So maybe its name is a bit misleading. But I still think it serves an understandable purpose: if I never read the content within the aside, I shouldn't be confused about anything in the article, but the content in it should either enhance my understanding of the content (e.g., example 13 in the spec adds background info about a country that might help an unfamiliar reader) or give me additional functionality, like a sidebar with links or buttons.
I think this is sort of an inherent problem with trying to fit any rigid spec to human language: we want the spec to be written using words we understand, but human language is incredibly fluid and non-rigid, so there are bound to be confusing cases where it's unclear which element to use, etc. But I still think there are clear black and white areas, even if there are lots of grays in between.
All that said, I'm interested to hear your feedback, and I'm very interested to read more about the problems people see with the spec; it should be helpful in explaining it better down the line. Thanks for pointing this out to me!
I think part of the issue we run into is that the human language being used to describe this is specifically the language of print layouts like newspapers or magazines. That language was far better known back in the late '90s and early '00s, but more and more developers are going to be seeing this through web-first eyes.
Your explanations are spot-on though. Even with aside, while we might see it as vague in its contents, you nailed the purpose.
Thanks! And yeah, I agree, the farther we get from the days of mostly-print-media, the less obvious the metaphors become.
I also think with <aside> specifically there are sort of two competing metaphors: the "sidebar" layout element, and the semantic "aside", for tangential info. These are really two very different things, but the spec currently allows for both, which is confusing. As I understand it from some googling, the spec initially only allowed for the semantic usage, and did not recommend using <aside> for sidebars with content unrelated to the main content, like navigation links, etc. But the spec was later amended because of common confusion and the perceived need for a sidebar element to explicitly allow both usages. I'm unsure how I feel about that move.
I enjoyed this refresher and I agree there's more to markup than divs. With that said, the current crop of available HTML tags are a little confining. They seem biased towards newsy, bloggy, content-heavy sites (section, article, header, footer, p, aside etc). It would be great if there were a few tags aimed at web applications - tags like controlbar, preview, settings, livecontent, and user etc. I often find myself trying to map the meaning of some "thing" I'm working on to the nearest-matching standard tag. In the absence of more semantically appropriate choices, this all-too-often turns out to be a div.
That's a pretty good point, and one that a lot of people have discussed in the last couple of years.
As for most of the semantic tags being biased toward content-heavy sites, I'd say you're like 75% right. Tags like <p>, <aside>, and <article> are pretty specifically defined in terms of representing that kind of content. But some of the others, like <header>, <footer>, and <section>, while arguably still defined with content-heavy applications in mind, are IMHO still pretty useful for web applications if you focus on the idea of an "outline" for your site, and how that impacts accessibility in particular. Assistive tech like screen readers uses elements like these to build an outline of the parts of the page for the user, so you could still have the main interactive regions wrapped in sections, with <h1>-<h6> tags to label them, and <header>s and <footer>s where they make sense, though those are probably rarer, especially <footer>.
As for introducing new elements that are more focused on web applications, I have three thoughts.
First, I think this would be much more difficult than it might seem at first glance. How to structure and divide up the parts of a web application is pretty controversial even within a single team, let alone trying to standardize semantic markup for the whole industry. There actually have been some discussions about this, and several proposed (and even partially implemented) elements have been retracted; see <menu>, for example.
Second, there already is a way to do this, though it's not quite as clean as semantic elements: the Roles Model, which uses the role="" attribute to define pretty specifically what elements in the page are for, and it has a pretty large selection of "widget roles" that are super useful within web apps. I also expect it would be 100x easier to add a new role than a new semantic element.
Third, there is also another approach that's been floating around for a few years now and gaining some steam that will allow the community to develop and share these elements, maybe create some community de facto standards, without waiting on the W3C for it: Custom Elements. They aren't just for widgets: you can register custom elements to play semantic roles as well, if you want, and define some amount of semantics and behaviors for them. I haven't played with them too much, but have heard good things from those who have.
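To make the second and third points a bit more concrete, here's a rough sketch, not a recommendation. The toolbar role and the aria-label attribute are standard ARIA, and customElements.define is the standard registration call, but the <control-bar> tag name and the ControlBar class are invented for illustration (loosely borrowing the "controlbar" idea from the comment above), and the buttons are placeholders.

```html
<!-- Option 1: a generic element given meaning with an ARIA widget role -->
<div role="toolbar" aria-label="Editor controls">
  <button type="button">Bold</button>
  <button type="button">Italic</button>
</div>

<!-- Option 2: a hypothetical custom element that applies the same role itself -->
<control-bar aria-label="Editor controls">
  <button type="button">Bold</button>
  <button type="button">Italic</button>
</control-bar>

<script>
  // Hypothetical element: custom element names must contain a hyphen.
  class ControlBar extends HTMLElement {
    connectedCallback() {
      // Only set a default role if the page author didn't provide one.
      if (!this.hasAttribute("role")) {
        this.setAttribute("role", "toolbar");
      }
    }
  }
  customElements.define("control-bar", ControlBar);
</script>
```

Either way, assistive tech should end up seeing a toolbar; the custom element just moves the role out of the markup and into a reusable definition.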
As a final note, I just saw a super interesting thread on Twitter on a very closely related subject, then switched to Dev and saw your comment in my notifications. This is the first tweet of several, and I recommend clicking through and reading the whole thread.
Thanks for your response. Of course you are right about all of it, but roles and custom elements seem a little verbose, if not downright complicated for app design and scaffolding. HTML tags are so basic by comparison - you can write and read and understand them immediately. I'm not saying the web needs to be simple - heaven knows it's not, but there is something to be said for brevity when it can be had. In my HTML fantasy, there would be just a handful of common, handy tags for web applications and I would be universally heralded and financially rewarded for my contribution to humanity.
Well that's the awesome thing about custom elements, once they're written, they are just HTML tags. The idea with them is to pass them around like any other library. So hey, give it a try, write that fantasy set of web app tags, share them around and get your fame!
This is certainly a lot better than just divs and spans, however there are also some quirks with HTML5's section elements.
Section and Aside bring some nastiness to the outline, for example.
Using more than one h1 (like in your example) is also generally discouraged for accessibility, even though the spec seems to encourage it.
I usually don't include a heading in the nav (even though the outline wants it), and only use 1 h1 element on the page at all times (literally the main heading on the page, the rest gets an h2/h3/etc where it makes hierarchical sense).
So in your example I would change the 2nd h1 to h2 and all the h2 to h3 for a somewhat nicer outline.
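As a rough sketch of what that change would look like, assuming the article's example uses one <h1> for the site title and a second <h1> for the post title (the heading text here is borrowed from the div-heavy example at the top of the post, and the exact nesting is a guess):

```html
<header>
  <!-- The only h1 on the page -->
  <h1>Super duper best blog ever</h1>
</header>
<main>
  <article>
    <!-- Was the second h1, now an h2 -->
    <h2>Why you should buy more cheeses than you currently do</h2>
    <section>
      <!-- Was an h2, now an h3 -->
      <h3>Part 1: Variety is spicy</h3>
      <!-- cheesy content -->
    </section>
    <section>
      <h3>Part 2: Cows are great</h3>
      <!-- more cheesy content -->
    </section>
  </article>
</main>
```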
Other than that, I really encourage writing more semantic markup in general. There's a ton of useful tags out there that are way better than your styling hooks (div/span)!
Thanks for the context! I'll readily admit that I'm no expert on the outline algorithm, but I will say that the rumors I've heard suggest that an updated outline algorithm is beginning to be implemented in a few browsers, and the updated algorithm prefers a new <h1> within each sectioning context, e.g. within each <section>, <article>, etc. But even so, it's not the predominant algorithm in the wild right now, so fair point, one <h1> per page.
How do you examine the generated outline for a page, as perceived by assistive tech? Are there good tools out there to help with this?
Hey Ken!
Yeah the spec calls for a new outline in each section, so multiple H1's are technically allowed, but assistive technologies and (as much as I hate this argument) SEO don't always line up with the spec. So for now it's safer to go the 1 H1 route.
Usually I use W3C's validator (validator.w3.org/nu) with the outline option checked to test for at least the basics.
Then there's some other tools (like the "Siteimprove Accessibility Checker" plugin for Chrome) that help a bunch as well (but they focus less on semantics and more on pure a11y).
Isn't it kind of redundant to use a 'header' tag just to wrap an 'h1' tag? Seems to me to clutter up the DOM structure...
That's certainly true, and in many cases you don't need it. You can definitely just put the <h1> (or <h2>, ...) tag on its own, and AFAIK that works just as well for screen readers, SEO, etc. But the advantage of a <header> is being able to group other things like inline icons, section anchor links, etc., and I find myself going back and adding those later often enough that I have just made a habit of using a <header> wrapper almost all the time. But with that said, it's a very YMMV situation, so feel free to skip the <header> if you feel confident you don't need more than the <h1>, there's nothing wrong with that semantically.
If you think of a magazine or newspaper article it makes more sense. Things like a byline, publication date, subtitle, those are all still part of the header for that article. You would want to group them semantically.
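For instance, a minimal sketch of that kind of grouping might look like the following. The subtitle, author name, date, and class names are all placeholders made up for illustration, not anything taken from the article itself.

```html
<article>
  <header>
    <h1>Why you should buy more cheeses than you currently do</h1>
    <!-- Placeholder subtitle, byline, and date: everything here is hypothetical -->
    <p class="subtitle">A hypothetical subtitle</p>
    <p class="byline">By <a href="/about">A. Hypothetical Author</a></p>
    <p>Published <time datetime="2019-01-01">January 1, 2019</time></p>
  </header>
  <!-- cheesy content -->
</article>
```

If the heading is genuinely all there is, the <header> wrapper can be dropped and the <h1> left on its own.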