HTML Markup: Tag, You're It

#beginners #html #overview

Overview

The element is the fundamental building block of an HTML document. The purpose of an HTML element is to give semantic meaning to the content of the page.

A headline isn't a headline because of how big it is or its typeface. It's a headline because someone has decided that its purpose on the page is to be a short introduction of what follows. Thus, the beginning and end of the most important headline on a page are marked by opening and closing <h1>...</h1> tags.

A paragraph is a set of sentences that are somehow related. It may be visually set apart using line-spacing or indentation, or even making the very first letter larger. Spacings or indentations have nothing to do with whether or not something is a paragraph, though. The formatting is only there to cognitively tip you off that this group of sentences needs to be thought of as being interrelated somehow. A paragraph may be formatted, but formatting does not make something a paragraph. In HTML, what makes a paragraph a paragraph is the opening and closing <p>...</p> tags.

Semantics refers to how the structure of some thing can provide meaning beyond what the content of the thing itself literally says. HTML is used to define the structure of a document by delimiting its content with HTML elements in order to functionally label the various parts.
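As a small illustration of the idea, here is a sketch of a headline and a paragraph marked up by purpose rather than by appearance. The wording of the content is made up for the example.

```html
<!-- The tags say what each piece of content is, not how it should look. -->
<h1>Tag, You're It</h1>
<p>This sentence introduces an idea. This one builds on it, which is why both sit inside the same paragraph element.</p>
```

Nothing here mentions size, typeface, or spacing. Those are left for the browser (and later, CSS) to decide.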

Syntax

Most HTML elements take this general form:

<name attribute="value">This is the element's content.</name>

Things to note:

Elements are delimited by two tags: the opening tag (e.g. <name>) and the closing tag (e.g. </name>). Closing tags contain a / character followed by the name of the element being closed.
An opening tag can contain more than one attribute, but it isn't required to have any. Some tags serve no purpose without at least one attribute, though. For example, an <a> (anchor) tag with no attributes is valid ... but kind of pointless.
Most attributes require a value be given, but there are exceptions: boolean attributes such as disabled or checked are switched on simply by being present.
The closing tag never contains attributes or values.
The element is everything from the beginning of the opening tag to the end of the closing tag. So saying "the blockquote element" means "everything from <blockquote> to </blockquote> including the words that come between them." Those words in between the tags are the element's content.
Elements can contain other elements as content. An element cannot start inside another element and then end outside it. It also cannot start outside another element and then end inside it.
An element that contains another element is called a parent element. An element that's entirely inside another element is called a child element. An element that's both inside and outside of another element is called a mistake. It's like your dog: in or out, buddy. Not both.
Whitespace generally doesn't matter in the element's content. Multiple spaces between words in the element's content are collapsed to a single space. Line breaks are treated like spaces and collapsed along with them. There is an element with the name pre (<pre> ... </pre>) that preserves all whitespace and line breaks in the content as written.
Elements that are unknown to a browser don't cause an error. The browser ignores the tags it doesn't recognize and displays whatever content sits between them as if the tags weren't there. In this way they are much like students. Don't understand something? Ignore it and act as if it doesn't exist!
Not all values of attributes require quotation marks, but any value that contains a space or certain special characters (such as quotes, =, <, or >) does. Since it is always OK to surround values with quotes, it is considered a best practice to just surround all values with quotes no matter what.
An attribute can have multiple values. Values may be separated by commas, semi-colons, or spaces depending on the attribute. The delimiter you choose from memory will usually be wrong. Them's the rules.
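Several of the points above can be sketched in a few lines of markup. This is only an illustration: the class names intro and lead and the text content are invented for the example.

```html
<!-- Parent and child: the em element sits entirely inside the p element. -->
<!-- The class attribute holds two values, separated by a space. -->
<p class="intro lead">Elements can <em>nest</em> inside other elements.</p>

<!-- A mistake: em opens inside p but closes outside it. -->
<!-- <p>Badly <em>nested</p></em> -->

<!-- A boolean attribute (disabled) needs no value. -->
<button type="button" disabled>Can't click me</button>

<!-- Runs of spaces and line breaks collapse to a single space... -->
<p>These     words
   end up on one line.</p>

<!-- ...unless the content is inside a pre element. -->
<pre>These     words
   keep their spacing.</pre>
```

The badly nested line is left commented out on purpose. Browsers will try to repair markup like that, but the result is rarely what the author meant.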

While the vast majority of elements require both opening and closing tags, there are a few that never take a closing tag. These are formally referred to as void elements because they cannot have any content. Examples of this kind of element include br, hr, img, embed, and input.

Void elements are informally referred to as "self-closing" elements. Several of them, such as img and embed, are also replaced elements, meaning that the tag will be replaced entirely by content from outside the HTML document. The two terms are not interchangeable, though: video and iframe are replaced elements that still require closing tags. Other void elements, such as meta and link, can be found within the head element of an HTML document. These are not replaced elements since they aren't intended to be visible to the page's readers.

Most void elements have one or more attributes. In the following img element, there are two attributes, src and alt.

<img src="foo.png" alt="An image of a foo">

The src attribute takes a value that defines where the source of the image may be found. It is written in the form of a URL (Uniform Resource Locator) and may be either absolute or relative in form. These concepts will be covered in more detail than you ever wanted in another post in this series.
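As a quick preview, the two forms might look like the sketch below. The address example.com and the images/ folder are placeholders, not real locations.

```html
<!-- Absolute URL: spells out the full address of the image. -->
<img src="https://example.com/images/foo.png" alt="An image of a foo">

<!-- Relative URL: resolved against the address of the current page. -->
<img src="images/foo.png" alt="An image of a foo">
```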

The alt attribute's value is a textual label describing what the image depicts. Browsers typically show this text when something prevents the image from being displayed, and screen readers read it aloud in place of the image.

Since everything necessary to display the image is present in this tag and there is no need for content within the element itself, no closing tag is necessary. Any closing tag would simply be ignored because that's what a browser always does when it doesn't understand something.

A version of HTML called XHTML required that elements without a closing tag end with the characters /> (e.g. <br /> or <hr />). The current HTML 5 standard doesn't require this because it annoyed everyone. There are some among us who got used to doing it, so it's not considered wrong -- just unnecessary. Like Kardashians.
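Side by side, the two styles look like this. Current browsers should treat both the same way, since the trailing slash on these elements is simply ignored.

```html
<!-- HTML 5 style: no trailing slash needed. -->
<br>
<hr>
<img src="foo.png" alt="An image of a foo">

<!-- XHTML style: still accepted, just unnecessary. -->
<br />
<hr />
<img src="foo.png" alt="An image of a foo" />
```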

HTML Element Categories

HTML elements can be categorized by the function they serve. The following is a list of the most important categories. A comprehensive list of the elements in each category can be found on the Mozilla Developer Network (MDN).

Main Root: The single element from which all others in an HTML document descend.
Document Metadata: Elements that describe information about a page, not the information that appears on a page.
Sectioning Root: The single element from which all visible content in an HTML document descends.
Content Sectioning: The elements that logically structure the overall HTML document.
Text Content: Elements that define the semantic role of discrete content blocks.
Inline Text Semantics: These semantic elements classify arbitrary chunks of text within a block.
Image and Multimedia: These are elements that embed images, audio, and video content within HTML pages.
Embedded Content: Other types of non-HTML content that can be displayed within the context of a web page are marked with these elements.
Scripting: Elements that allow for including code that allows the user to interact with the page's content to perform tasks.
Table content: Elements that allow data to be organized in a tabular manner.
Forms: Elements that allow users to enter information to be processed by the web server through a variety of means.
Interactive elements: Elements used to create interactive user interfaces.
Obsolete and Deprecated Elements: These are elements that should not be used. Ever. Really. Not ever. Even if you really, really, really want to. Even if you found some code that uses it and it makes your page look the way you want it to.
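To see how a few of these categories fit together, here is a rough sketch of a minimal page with each element labeled by its category. The page title, text, and the foo.png file name are placeholders chosen for illustration.

```html
<!DOCTYPE html>
<html lang="en">                    <!-- Main Root -->
  <head>
    <meta charset="utf-8">          <!-- Document Metadata -->
    <title>Example page</title>     <!-- Document Metadata -->
  </head>
  <body>                            <!-- Sectioning Root -->
    <header>                        <!-- Content Sectioning -->
      <h1>Example page</h1>         <!-- Content Sectioning -->
    </header>
    <main>                          <!-- Content Sectioning -->
      <p>                           <!-- Text Content -->
        Some <em>important</em> words.   <!-- em: Inline Text Semantics -->
      </p>
      <img src="foo.png" alt="An image of a foo">   <!-- Image and Multimedia -->
    </main>
  </body>
</html>
```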

No one memorizes all of these elements. Some become familiar through repeated usage. All share some attributes, and some have attributes of their own. No one remembers them all. That's why there are online references like MDN. Lists like this can assist in determining where to find an element that fulfills a particular kind of role.
