Authoring Webpages/Printable version


Authoring Webpages

The current, editable version of this book is available in Wikibooks, the open-content textbooks collection, at
https://en.wikibooks.org/wiki/Authoring_Webpages

Permission is granted to copy, distribute, and/or modify this document under the terms of the Creative Commons Attribution-ShareAlike 3.0 License.

Requirements


In order to use this tutorial, you must have a web browser and a text editor. (Most operating systems ship with both.)

If you use Microsoft Windows, you will find an editor called Notepad in your Accessories.

Note: a text editor is not the same as a word processor. Do not use a word processor for this course.

Once you get more comfortable authoring webpages, you may want to acquire a more powerful text editor. For now, however, editors like Notepad, TextEdit in plain-text mode (macOS; SimpleText on older Macintosh systems) or Nano (GNU/Linux) suffice.

Also, some minimal computing knowledge is assumed. If you do not know how to operate a computer, this course is not for you yet.

Testing: Your first Webpage

In order to test whether you fulfil the requirements, you will now create your first web page in the following simple steps.

  • Open your text editor program.
  • Type in the following simple HTML document. Note: all valid webpages must start with a DOCTYPE. This selects the version of HTML used by the web page.
<!DOCTYPE html>
<html>
 <head>
  <title> Simple document </title>
 </head>
 <body>
  <p>The text of the document goes here!</p>
 </body>
</html>
  • Save your text as a file called first.html to your hard disk (or other storage medium). For the remainder of this course, it would be handy if you had a separate folder where you could save your webpages.
  • Open your web browser program.
  • Open the .html file you just saved in your web browser. Most browsers have a menu called something like 'File/Open File' for this.
  • Your browser should now display your first webpage.

This doesn't look very impressive, does it? Well, it will get much more impressive soon, and the good news is that it will not get much more difficult.

To reach true impressiveness though, you need to be able to reach your audience. To reach your audience, a webserver program needs to know about your webpages, so that it can serve your pages to all who ask for them.

Webservers

You could run your own webserver, but setting up such a program is beyond the scope of this tutorial.

If you follow this course in a class, the teacher will have set up a webserver for you.

If you follow this course by yourself, please check the website hosting options your Internet Service Provider (ISP) has available. (The ISP is the company that connects you to the internet. If you are connected through your work or college, consult with your systems administrator about hosting possibilities.) A form of web hosting may even be a free service your ISP offers you as part of your access package.

Recapitulation

In short, you need:

  1. A text editor (program)
  2. A web browser (program)
  3. A webserver (program+internet connection, or computer+program+internet connection)

The first two items you need right away, the last you need when you want to publish your webpages on the World Wide Web.




Introduction


The Internet

The World Wide Web is only one of many Internet services. The Internet is a network of computers. ("Net" is short for "network".)

The Internet provides the means to pass packets of data around the globe. In order to make sense of these packets, you need services: protocols (sets of rules) that tell you what a certain piece of data means, and computer programs (software) that allow access to data based on such protocols.

The web is one such service, as are e-mail, FTP, torrents, Instant Messaging, etc. All these services require particular programs that understand their protocols, although some programs can work with multiple services. For instance, a web browser may also be capable of downloading files from an FTP site or managing e-mail.

The World Wide Web

The web was created to make finding and exchanging documents across the Internet easier. Before the web, if you had a document to offer through the Internet (for instance, a scientific paper, or a list of jokes, or a recipe, or your curriculum vitae), you had to put it up on an FTP site and then pass the address of that FTP site to someone else.

With the web, you embed the address in the page instead of passing it around. More than that, you can hide the real address from view by wrapping it in descriptive text. Instead of typing a bunch of numbers into your FTP client and then moving from folder to folder until you stumble on a file called ppr_john.ps, you can create a 'link' called "John's paper on astroturf" that takes the reader directly to the desired file.

Hypertext

HTML stands for "HyperText Markup Language". But what is "hypertext"?

Webpages are documents. They are files on computers. When displayed in a browser, they are displayed as "hypertext". The definition of files and computers is outside the scope of this textbook, but hypertext needs to be explained before we can continue.

On a computer, a text file behaves like print: like a book or a magazine. You can scroll through the file instead of having to turn pages, but the text behaves the same way it would if it were printed. It simply sits on the page and you read it. Hypertext, on the other hand, has additional functionality; mainly, the use of "hyperlinks". ("Hyperlinks" are commonly referred to simply as "links".) A link is a part of the page that, when clicked on by a mouse (or tapped with a finger on a touchscreen), takes the reader to a different part of the document, or to an entirely different document. Typically, links are clickable parts of text, but they can also be images. For instance, clicking a chapter heading in the table of contents could take you to that chapter; a linked phrase could take you to a footnote or a reference document for further information.

Probably the first description of "hypertext" appeared in 1945, when Vannevar Bush wrote an article in The Atlantic Monthly called "As We May Think," about a futuristic device he called a "Memex". He described the device as electronically linked to a library and able to display books and films from the library. The Memex also gave the reader the ability to automatically follow references to the work referenced.

The Memex did more than offer linked information to a user, though. It was a tool for establishing links as well as following them.

HTML was designed by Tim Berners-Lee with similar goals in mind: to provide a way for scientists to create a huge library of interlinked works and to provide a way for the users of this library to alter certain documents: for instance, to add annotations or links.

The latter part of Berners-Lee's dream did not really materialize until the invention of the first wiki, WikiWikiWeb. Wiki pages are a special type of web page that the reader can edit. For example, this textbook is part of a wiki. Anybody can change the contents of this textbook on its website.

Since hypertext is so different from normal text, there are certain things that need to be considered when writing it.

Where hypertext links to another document, the author needs to make clear what it links to. (The dreaded "click here" is, without a doubt, the worst way to create a hyperlink, as it tells the user absolutely nothing about the linked document.)
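
As a small sketch of the difference, the two paragraphs below link to the same document. The file name astroturf.html and the wording are made up for this illustration; the a element used here is explained in the next chapter.

```html
<!-- Uninformative: the link text says nothing about the target. -->
<p>To read John's paper on astroturf, <a href="astroturf.html">click here</a>.</p>

<!-- Descriptive: the link text itself names the linked document. -->
<p>Read <a href="astroturf.html">John's paper on astroturf</a>.</p>
```

A reader who skims only the links, or who hears them read out by a speaking browser, learns where the second link leads without needing the surrounding sentence.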

Where a hypertext document is part of a larger whole (say, a website), it is often helpful to the users if they can find out which part of a larger whole it is. The document should link to a home page. The home page is a web page that forms the "front", or table of contents, of a website. The home page usually contains information about the website and has menus of links that allow the user to navigate to various parts of the website.

For instance, a home page might say: "This is the personal home page of Clarence Wiley". This suggests to the visitor that the web pages found there are probably of a personal nature, most likely created by Clarence, and of some value to Clarence.
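
In HTML, a link back to such a home page might look like the sketch below. The file name index.html is an assumption made for this example; your website may use a different name for its home page.

```html
<!-- Placed on every page of the site, this tells visitors which site they are on. -->
<p><a href="index.html">The personal home page of Clarence Wiley</a></p>
```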

Similarly, web pages that are part of a website often use a uniform style. Since arriving at a uniform writing style is not always easy or convenient (think of a website with dozens of authors), other hints may be employed that tell visitors where they are. One of these hints is explained in the following chapter.

Dangerous metaphors

Calling web documents "pages" is a metaphor. They aren't really "pages", as in a book. However, it is useful to explain characteristics of the new and unknown by comparing them with similar characteristics of the old and known. As such, metaphors can be a useful and powerful device. However, new inventions have often suffered when a metaphor was applied to them too rigidly.

It is said by some that when television was invented, it took fifty years for the new medium to escape from being a stage in a box. Far into the twentieth century, television was made by aiming a camera at a stage (a 'set' in television and cinema terms) and just passing on to the viewer what went on.

Oddly enough, the things that are possible on television, but not on stage, were already possible when the first consumer grade television sets were being produced: broadcasting with a time-delay (record now, broadcast later), editing a program, using unusual viewpoints, animation, overlaying images et cetera. Of course there were a few 'revolutionaries' who used these techniques, but in general the metaphor (television is like a stage) held the new medium captive.

Even today, news anchors can sit behind their desk clad in nothing but underwear below the waist, safe in the knowledge that the camera will never do anything the audience of a stage play wouldn't do.

Today, the web has been struck similarly hard by failing metaphors. Since the web is clearly tied to computers, a lot of people confuse authoring web pages with programming. Since a lot of web content was written from the start by people using graphical web browsers, designing a web page is often primarily considered a graphical design task. (Contrary to popular belief, the first web browser displayed images and used an early form of stylesheets. [1])

False metaphor #1: Programming

Programming is the art of creating a computer program. A computer program is something that tells the computer what to do. Usually, a computer program is a list of instructions. For instance, a computer programmer can write a program that tells the computer to open a window on a screen, and display a large, bold text in the top-left corner.

A hypertext document can be implemented as a computer program. A modern-day example of programming hypertext would be PDF, the format originated by Adobe for distributing print documents across computer networks and to printers. However, HTML, the hypertext language for the web, is not a programming language. Instead, it is a markup language: it allows one to "mark up" the structure of a document. (You may want to revisit this section later, once you have "marked up" a few web pages yourself.) HTML is a way to tell a web browser what the different parts of a document are. For example, one part of a web page could be a paragraph and another part could be a list.

Viewing HTML as a programming language means that you view its constructs, its labels, its mark-up, as instructions to the browser. For instance, you may want to indicate that a particular piece of text should be printed in large and bold letters. You could use the HTML code for headings for this, because most graphical browsers will display a heading as large, bold text. However, you may get visitors whose browsers don't display a heading as large, bold text. That's the moment when the trap of the false metaphor closes around you.

The important thing to remember is that HTML tells a web browser what the different parts of a page are, not how they should look.
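
A minimal sketch of the distinction, using made-up text: both fragments below may look similar in a graphical browser, but only the first tells the browser that the text is a heading. The b element (bold text) is not covered in this course and appears here only for contrast.

```html
<!-- Structural: labels the text as the main heading of the page. -->
<h1>Manual of the Cool 3000 ice box</h1>

<!-- Presentational: merely asks for bold text and says nothing about its role. -->
<p><b>Manual of the Cool 3000 ice box</b></p>
```

A speaking browser or a search engine can recognise the first fragment as a heading however it is displayed; it has no way of knowing that the second one was meant as a heading.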

False metaphor #2: web authoring as graphical design

To view the web as a graphical medium is a much more insidious problem, because it is largely a correct view. Most web pages are browsed through a graphical browser. On such occasions, the graphical design of a web page can have a decisive influence on how well the content of that page is perceived and received by the visitor.

The problem lies not so much in seeing the web as a graphical medium, but in our assumptions on what a "graphical medium" is. Since web pages are often displayed on a computer screen, web page authors often design a layout grid with certain dimensions. Not everybody may be using the same dimensions, though, and visitors are hardly ever willing to change the dimensions of their windowing system to accommodate the wishes of a website's author.

The web can be displayed on a vast array of devices, some of which are not graphical at all: think of speaking browsers or tactile browsers (such as Braille displays for blind users). You need to adjust your assumptions on what a graphical medium means to write good web pages.

The practice of creating web pages that can be accessed by a wide variety of types of browsers is called "accessibility". As the web grows, accessibility becomes more and more important.

Other, less damaging metaphors

It is clear that when we wish to see the web in the light of another, better-understood invention, we need to do so with care, clearly delineating the limits of our comparisons.

I would like to propose a few metaphors that are just as useful as the ones before, but that have less potential for damage.

The media that are perhaps the most natural candidates for comparison with the web are, unsurprisingly, other Internet services. They all provide a way for people to talk directly to other people without the intervention of middlemen such as editors and publishers. This possibility for direct contact stems from the underlying low-level protocols of the Internet. On the Internet, every computer can talk to any other computer.

Other systems for sharing information in the free-form manner the web allows are abundant in real life. Just try to imagine all the possibilities you have when you want to announce a neighborhood party to strangers, or when you want to share your daily troubles with relative strangers: letter pages, bulletin boards, pen pal magazines, etc. spring to mind.

Conclusion

The important idea to take away from this chapter is that the web is a way of sharing information. There is nothing wrong with running programs on the web, or with presenting graphical design on it. These are well-understood and accepted uses of the web. What you should make a distinction between, though, is the web as a way to present information and the form and shape that information takes.

Your need to share --> Your way of sharing it --> An audience.

Questions and Exercises

1. Collect examples of typical web pages. How do they fit into the web? Which role do they play?

2. Find a web page. Make a list of all of the page's possible users, and write down how they would experience the page. Would they find the information they were looking for? Which information would they not be able to find? Would they have an easy way of finding more information through hyperlinks?

If you follow this course as part of a class, let the teacher pick a web page and see how many different audiences the students can come up with.

If you follow this course by yourself, you could go to http://www.google.com and enter a random word in its search box, then activate the "I'm Feeling Lucky" link. I find that the names of kitchen things often make great 'random' words ('ladle', 'cinnamon', 'fridge', 'stove', etc).

3. Try to view a web page using a non-graphical browser, or a browser, such as Opera, with all graphics capabilities switched off. If this exercise is done as part of a class, form duos: let one student face away, let the other read out what's going on in the web page. Let the student who cannot read the web page give instructions and see how hard or easy it is to use the website.

4. Try to think of a subject you would like to create a web page or a website about. Go to a search engine and try to find web pages about this or similar subjects. For instance, if you would like to create a website for your football club, try to find the websites of other local and internationally famous clubs. According to your personal taste and opinions, what have the authors of these sites done right? What have they done wrong?

Answers

For answers, see Answers to Questions and Exercises.





Creating a simple page

Creating a simple webpage

Time to get our hands dirty! (In a manner of speaking.)

Text structures

In the first chapter, which stated the requirements for following this course, a small exercise was printed, in which you created your first simple web page. If you haven't done that exercise yet, go there now and do it.

Text entered into a text editor and saved to a hard disk, or other form of storage, is often called a "plain text file". Plain text files generally provide three small ways for separating text. Tabs and spaces are used for separating words and returns are used for separating paragraphs.

People have found creative ways of producing intricate layouts using just these small methods of markup. In web pages, however, this method of laying out text doesn't work. A web browser will collapse all consecutive spaces, tab stops and returns into one single space or soft return, depending on where on the line the word occurs. HTML handles layout in a very different way, which will be discussed later.
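
The following sketch shows this collapsing at work; the text is made up for the illustration. Although the source spreads the words over several lines with irregular spacing, a browser treats the paragraph as one ordinary sentence with single spaces between the words.

```html
<!DOCTYPE html>
<html>
 <head>
  <title>Collapsing white space</title>
 </head>
 <body>
  <p>These     words   are
     spread   over

     several      lines.</p>
 </body>
</html>
```

Displayed in a browser, the paragraph reads "These words are spread over several lines." Where the line breaks fall depends only on the width of the browser window.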

When you structure a text, you generally do so to make it easier to digest and to read. By making chapter and section headings more pronounced, you allow the reader to skim over a text until they find an especially interesting part. By using introductions and abstracts, you allow a reader to decide if this text will be interesting to them. You can use illustrations, because sometimes people will much sooner understand what's going on when they can see what's going on.

HTML, the HyperText Markup Language for the web, and its XML-based reformulation XHTML, the eXtensible HyperText Markup Language, allow you to impose a structure on a plain text document. They replace the few simple mark-up methods that plain text offers with their own. (Note: XHTML has been superseded by HTML5.)

HTML lets you do all this by letting you label certain parts of the text as a heading, a paragraph, an image, a table, a list, etc. Some structures are not supported by HTML, since it is a relatively simple markup language. For instance, there is no element for introductions or leads. The reader will have to infer from the position in a text which is the introduction, the lead or the abstract.

Each part of an HTML document is called an element. Elements are delimited by labels called tags.

An example:

 <p>This is an <em>important</em> example.</p>

What you see above is a paragraph element with an embedded emphasis element. The paragraph element starts with a <p> tag and ends with a </p> tag. The emphasis element starts with an <em> tag and ends with an </em> tag.

A web page by any other name...

There are several versions of the HTML standard. The most current is HTML5. However, web browsers should also support older versions of the standard, so that examples such as the one given in the Requirements chapter are considered valid HTML documents.

Every web page must have a title element (under older versions of HTML this was the only required element):

 <title>My webpage</title>

The title is the text by which the browser window will be named. It is the text that appears at the top of your browser window, and it is the default name of the bookmark (or favorite) used by the browser. It is also the title by which the page will be listed by a search engine.

Find a descriptive title. The heading of your page will often do just fine. The heading of this chapter is "Creating a simple web page", so its title could be the same text. Since this web page explains to you how to create a simple web page, that would be an excellent title.

Bad titles abound on the web. For example, a company might name their page

 <title>Big Fridge Manufacturer Inc.</title>

If you are lucky, they will even add in a bit of information related to the web page you are visiting, for instance:

 <title>Big Fridge Manufacturer Inc. - Manual of the Cool 3000 ice box</title>

Better for the visitor is:

 <title>Manual of the Cool 3000 ice box - Big Fridge Manufacturer Inc.</title>

After all, it is much more likely that the visitor who gets to this page is searching for the manual, rather than for company info.

So, the lesson is: always put the important information first. Often, you only have a limited number of characters available to you in your bookmarks menu, in the search engine listing or in the window title bar; utilize that space to the maximum.

An HTML document with only a title element is not very useful. We will now introduce you to a couple of elements that will allow you to make good use of 90% of the power of the web.

A simple linking webpage

  • title: the name of a page.
  • h1: the most important heading(s) on a page, often the same as the title.
  • h2: a sub-heading.
  • p: a paragraph.
  • a: a link.

With these elements, we can make the following simple web page:

<!DOCTYPE html>
<html>
 <head>
  <title>Friends and family of Clemence Wylie</title>
 </head>
 <body>
  <h1>Friends and family</h1>
  <p>The following are links to the websites of my friends and family</p>
  <h2>Friends</h2>
  <p><a href="http://www.tomsawyer.us">Tom Sawyer</a></p>
  <h2>Family</h2>
  <p><a href="http://www.tantejeanette.ca">Aunt Jeanette</a></p>
 </body>
</html>

Exercise 2-1

Copy the above sample code to your text editor. Save it as exercise2-1.html. Open it in a web browser. Does it display like you expected?

Answers

For answers, see Answers to Questions and Exercises.

Creating a link

The a tags in the previous example have a special purpose. They create anchors. (Anchors are more commonly referred to as "links".) They contain attributes with attached values. Attributes are parts of a tag that give the browser additional information about the element. Each attribute name is followed by an equals sign (=) and a value in quotation marks (").

a tags have several possible attributes. The most important are href and id. href is the attribute that defines the URL (Uniform Resource Locator) or URI (Uniform Resource Identifier), more commonly known as an "address". This is the destination that the link leads to: another document, or a location within the same document. Commonly, addresses are called URLs; however, this practice has become deprecated, and it is now recommended that you use the broader term "URI" instead.[1]

The id is a unique name for the link, which can be used by other links to refer to it.

(Note: Previous versions of HTML used name instead of id.)
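
As a sketch of how href and id work together, the link in the first paragraph below jumps to the anchor in the second paragraph of the same page. The id value note-cinnamon and the text are invented for this example.

```html
<!-- The # in the href is followed by the id of the anchor to jump to. -->
<p>The recipe calls for cinnamon
(<a href="#note-cinnamon">see the note on cinnamon</a>).</p>

<!-- The target: an anchor carrying the matching id. -->
<p><a id="note-cinnamon"></a>Note: add the cinnamon last.</p>
```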

URIs typically have the following form:

protocol://domain/path#named_anchor

For example

https://www.example.com/books.html#section2

The protocol for web pages is usually http (HyperText Transfer Protocol) or its secure variant https. In this example, the link would take you to the section2 section of the books.html document on the www.example.com domain, using https.

Note, however, that most parts of this are optional, depending upon how you want to use the URI. URIs can be relative or absolute. An absolute URI will include the domain as part of the address. (The www.example.com part.)

An href can just be a relative path (for example wines/french/red/bordeaux.html): in that case, the address will be calculated from the page that contains the link.

An href can also consist of just a protocol and a domain name: http://www.example.com/ leads to the website with that address; the web server of that site is supposed to figure out which document you want. It typically defaults to index.html or default.htm.
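
To make the difference concrete, here is a sketch with three links. It assumes, purely for illustration, that the page containing them is located at https://www.example.com/wines/index.html; the comments show where each link then leads.

```html
<!-- Absolute URI: leads to the same address wherever this page is stored. -->
<p><a href="https://www.example.com/books.html#section2">Section 2 of the book list</a></p>

<!-- Relative path: resolved against the folder of the containing page, giving
     https://www.example.com/wines/french/red/bordeaux.html -->
<p><a href="french/red/bordeaux.html">Red Bordeaux wines</a></p>

<!-- Protocol and domain only: the web server decides which document to send. -->
<p><a href="https://www.example.com/">Example home page</a></p>
```

Relative paths keep working when a whole website is moved to another domain, which is one reason to prefer them for links within your own site.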

Elements deconstructed

An HTML document consists of elements. These elements are constructed as follows:

<tag>contents</tag>

An opening tag may contain attributes. Attributes often have values.

<tag attribute1="value" attribute2="value2" attribute3>

The tag that closes an element is just like the opening tag, but has a slash in front of the name, and cannot contain attributes:

</tag>

Some elements cannot contain other elements. The HTML standard defines which elements can be contained by an element. The permitted combinations vary from version to version.

Elements are either block-level elements or inline elements. With block-level elements, the browser sets off the element in its own "block". It has a return placed both before and after it. Some examples of this are headings (h1, h2, h3, and so on), paragraphs (p), and list items (li). Inline elements are not treated this way, so (for example) they can be inserted into paragraphs without disrupting the flow of the paragraph. Good examples of this are anchors (a), emphasis (em), and images (img).

Block-level elements can contain inline elements, but inline elements cannot contain block-level elements.

For instance, the following is valid HTML:

 <h1><a>Valid HTML</a></h1>

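But this is not:

 <a><h1>Invalid HTML</h1></a>

(HTML5 relaxes this particular rule for the a element, which may wrap block-level content there; the principle still applies to other inline elements such as em.)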
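
As a further sketch of valid nesting (the text and the file name family.html are invented), a block-level paragraph can hold several inline elements, and inline elements can be nested inside each other, as long as each one is closed before its parent is:

```html
<p>This paragraph contains <em>emphasised text</em>, a
<a href="family.html">link with an <em>emphasised</em> word</a>
and nothing that breaks its flow.</p>
```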

Validity

The term "valid HTML" has already been mentioned a few times. Since web pages are authored by people, and people make mistakes, web browsers tend to be extremely forgiving towards those mistakes. They will even try to correct your mistakes.

Still, there are several reasons why you should try to mark up a web page with valid HTML:

  • different browsers may correct your mistakes differently
  • future browsers might not be as forgiving
  • valid HTML is easier to read and maintain
  • when trying to correct bad markup, it helps if you are not side-tracked by other possible errors

The organization responsible for maintaining the HTML standard is the World Wide Web Consortium. It runs a validation service that you can use to check if your HTML is valid. You can find it at http://validator.w3.org. It is a good practice to validate your HTML with this service.

A common mistake is forgetting to start every document with a DOCTYPE. A document without a DOCTYPE is automatically invalid (see "HTML version information" in the HTML specification). Note: many texts erroneously state that the DOCTYPE is optional. It is true that all major browsers will forgive the absence of a DOCTYPE, but this does not make the page valid. The appearance of a page may vary noticeably between different browsers if the DOCTYPE is omitted, because each browser has its own peculiarities when rendering such pages.
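
For reference, these are two DOCTYPE declarations you are likely to encounter: the short HTML5 form used throughout this book, and the longer form prescribed by the HTML 4.01 Strict standard. A document uses one or the other as its very first line, never both; they are shown together here only for comparison.

```html
<!-- HTML5 -->
<!DOCTYPE html>

<!-- HTML 4.01 Strict -->
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
   "http://www.w3.org/TR/html4/strict.dtd">
```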

Exercises

Time to have some fun. The following exercises will let you make some simple web pages and websites. The goal is to teach you the power of several different ways of linking.

Exercise 2-2

Copy the example web page above to the clipboard and open http://validator.w3.org. Paste the example into the 'Validate by Direct Input' section and click on 'Check'. Is the example valid?

Exercise 2-3

If you have an anchor <a id="anchor1"></a>, then <a href="#anchor1">link</a> will link to it. That means that when you activate the link, the web page will be displayed starting at the anchor (rather than as usual from the top).

Make a copy of the web page you created in Exercise 2-1 and save it as exercise2-3.html. Change this file to include a 'menu' of anchors at the top that link to the headings of the different subsections (Family, Friends).

Exercise 2-4

There is a hybrid form of book and game called Choose Your Own Adventure (CYOA). In such a game-book, you read a bit of text as in a normal book, but after a while, you get to make a choice as to how to continue.

For instance:

You are sitting in the tub, soaking and relaxing. Your rubber ducky is chattering away happily when suddenly a pike grabs it from below and drags it down.

- If you dive into the water to save the ducky, go to page 89

- If you pull the plug to empty the bath, go to page 24

In this exercise, you will write a short CYOA, in which the choices are represented by anchors that will lead to the text continuing from that choice. Every "chapter" must be a separate web page.

Keep it snappy and don't spend too much time on this. Ten to twenty web pages should be sufficient. The story does not need to be good or finished.

Tip: Create a template HTML file on which you can base all subsequent chapters.

Exercise 2-5
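
Such a template could look like the sketch below. The placeholder texts and the file names chapter-a.html and chapter-b.html are only suggestions; replace them for every chapter you write.

```html
<!DOCTYPE html>
<html>
 <head>
  <title>Chapter title goes here - My adventure</title>
 </head>
 <body>
  <h1>Chapter title goes here</h1>
  <p>The text of this chapter goes here.</p>
  <!-- One link per choice; each leads to the page where the story continues. -->
  <p><a href="chapter-a.html">If you make the first choice, continue here</a></p>
  <p><a href="chapter-b.html">If you make the second choice, continue here</a></p>
 </body>
</html>
```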

Create a web page and save it as "exercise2-5.html". The web page should contain a short, informative text about a subject of your choice. It should contain at least three working links to external websites about the subject.

Answers

For answers, see Answers to Questions and Exercises.

Images

Including an image on a webpage is done using the img element. img is one of a class of elements called "void" elements, often referred to as "self-closing". Such elements have no content and no closing tag. Instead, their single tag ends with />. (In HTML5, the slash is optional; the examples in this book include it.)

The img element has two obligatory attributes: src and alt.

src takes a URI as its value. In this case, the URI will be the "address" (location) of the image.

Since URIs can be relative, if the image is located in the same folder as the web page that includes it, the URI can consist merely of the file name of the image. More commonly, the image will be located with other images in a directory called img. It is a good practice to keep your images in a separate directory, because it will make your site better organized.

The alt attribute contains a textual description that appears when the image cannot be displayed. For instance, if the image is a photo of a lake with a castle, you could have the following code:

<img src="img/lakecastle.jpeg" alt="photo of a lake and a castle" />

When the purpose of the image is purely decorative, you might want to use an empty alt value. That way, when the image cannot be displayed, no superfluous description interrupts the flow of the page's main text.

<img src="img/prettypattern.jpeg" alt="" />

However, when the image has a function to fulfill on a webpage, the presence of alt text is very important for visitors who can't see the image. For instance, many webpages have navigation built from menus of links, where the links are represented by images. If the images can't be displayed and there's no alt text, users won't be able to use the navigation.

<a href="family.html"><img src="img/button-family.png" alt="Family" /></a>

The img element lets you embed an image on a page. You can of course also link to an image that you do not want to display on the page, because it has no role there. For instance, if you want to offer people the chance to download photos you made, you can offer links to those photos. For that you use the same a element that we have been using to link to other webpages:

<a href="img/lakecastle-large.jpeg">Photo of a lake and a castle (JPEG, 512 KiB)</a>

Note that you can create a link to any file that can be located using a URL. By indicating that the photo is stored in the JPEG format (a very common file format for photos) and by stating the file size, we let visitors decide whether they A) can use a file of this format and B) are willing to download a file of this size.

Pre-formatted text

HTML contains many more elements (HTML 4.01, for example, defines 91 different elements), but for now we will discuss only one more before moving on to the style of web-writing.

The pre element allows you to retain plain text formatting (as discussed briefly at the beginning of this chapter). This means that within the element, consecutive spaces, tabs and hard returns will not be collapsed into a single space or soft return.

There is little use for this element. It stops the text from reflowing neatly when the browser width is reduced or expanded, causing visitors to scroll horizontally, which web-surfers generally hate to do. HTML and its companion layout language CSS have plenty of options to display line-breaks and indentation. Also, it is pretty meaningless in non-visual browsers.

However, when you wish to copy pre-formatted text from other documents, it may be handy to use the pre element until you have the time to mark that text up.

Example:

<pre>1  2    4        8</pre>
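A slightly longer sketch shows that hard returns are retained as well as spaces. The table contents are invented for illustration; in a graphical browser the columns would normally line up, because pre is usually rendered in a fixed-width font.

```html
<pre>
Item        Qty
Apples        3
Pears        12
</pre>
```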

Further reading

Later during this course, we will discuss further elements. However, the intention of this course is not to make you fluent in HTML; it is to make you fluent in authoring webpages.

Generally, to fully comprehend something requires that you fully comprehend its form first. You cannot be a successful karateka if you cannot perform the various moves. You cannot be a successful French speaker if you have not mastered its grammar and vocabulary first. However, knowing all the ways to hit someone does not make you a good karateka, and knowing all the words and rules of the French language does not prevent you from becoming a mumbling baboon the next time you need to speak French.

To fully comprehend authoring webpages, you need to look beyond the language in which you write them. This is what we will do in most of the further chapters of this book.

The official HTML5 Recommendation of the World Wide Web Consortium is published on the W3C website. Although this documentation can, at times, be pretty hard to read, it represents the last word on any discussion of what is valid HTML5 and what is not.

Further, one of the authors of the HTML 4.01 Recommendation, Dave Raggett, has written a couple of handy guides to HTML and its companion layout language CSS, which you can find at http://www.w3.org/MarkUp/#tutorials. If there are points in this and later chapters that you do not fully comprehend, you could do worse than study Dave's texts. They are much clearer than the official specifications, and short enough to study alongside this text.

More exercises

The following exercises are optional. You can use them to practice putting images on your webpages.

Exercise 2-6

Download the following images, and put them all on a webpage that you will save as 'exercise2-6.html'. Think of useful alt texts.

To download linked files, many browsers offer a Save Link As function. In graphical browsers using a mouse, this function is often part of the context menu. On a PC, this means you have to click your right mouse button; on Mac OS, you have to hold down the Ctrl key and press the mouse button.

(images to follow later)

Answers

For answers, see Answers to Questions and Exercises.

References

  1. "The Difference Between URLs and URIs". Daniel Miessler. https://danielmiessler.com/study/url-uri/


How to write for the web

Introduction

In the previous chapter, you learned how to actually create a webpage.

To reiterate, you should have learned

  • that webpages consist of text with markup tags;
  • that the markup is part of a language called HyperText Mark-up Language (HTML);
  • that HTML must be valid, for various reasons;
  • that you can use a text editor to write HTML; and
  • how to apply some basic HTML elements to a text, such as paragraphs (p), links/anchors (a), important and less important headings (h1, h2), images (img), etc.

In other words, you have learned some of the mechanics of creating a webpage, and if you have followed the exercises, you actually gained some experience in writing webpages.

Now, you will begin to learn how to apply that knowledge in a meaningful way.

As we observed before, there is no right or wrong way to make a webpage. The page may consist of valid or invalid HTML, but whether a page is successful depends solely on the goals you have set yourself and whether they were met. However, we have also noted that there is no way of knowing whether the goals were met for all visitors.

Generally, your goal will be to tell the world a story.

If you manufacture lawnmowers, your goal with a webpage may be:

  • to entice people to buy your lawnmowers
  • to teach customers about how to use their lawnmower
  • to educate people about the dangers involved with lawnmowers
  • to show stock-holders in your company how business is going
  • to make it easy for the press to write about your lawnmowers
  • to impart your love for making lawnmowers onto others
  • to offer a community of lawnmower-minded people the chance to interact with each other
  • to show suppliers how to reach your factory
  • to warn customers of shoddy lookalike products
  • to accuse competitors
  • to respond to accusations of competitors
  • ...

This list is endless. Who knows why somebody makes a webpage? There could be a million reasons.

But once you have committed yourself to telling your story, it would help if you could get your story across.

General writing skills are beyond the scope of this textbook. There are other Wikibooks (under development at the time of writing) that teach you exactly this. See for instance How to write an essay.

Reading problems on the web

Although few hard facts are known about this, it is generally assumed that reading from a screen is harder than reading from paper. For instance, it is often said to take about 30% more time to read a text off a screen than off paper. Although little is known about the reason for this, it could be because most screens are static, while readers who use paper generally can move the paper around. Another theory holds that because the resolution of text on paper is much higher than that on screen (300dpi vs. 100dpi), it is 'easier for the eye' to read text off paper.

Most visitors browse websites using a browser that displays the contents of a webpage on a screen. The standard typography of web browsers seems to compensate for the harder reading experience: paragraphs are separated by empty lines.

Regardless of how difficult it appears to be to read off a screen, web surfers have shown a marked preference for lay-outs that make it easier to scan a text for important points. Some of the qualities of these lay-outs are:

  • list-type information in lists
  • clearly marked hyperlinks
  • no horizontal (sideways) scrolling
  • all the important information above the 'fold'
  • inverted pyramid structuring of text
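The first two qualities can be sketched in markup as follows. This is only an illustration: the lawnmower product names and the file name safety.html are invented for the example.

```html
<!-- Before: list-type information buried in a sentence -->
<p>We sell push mowers, electric mowers and ride-on mowers.</p>

<!-- After: the same information as a list that is easy to scan -->
<p>We sell:</p>
<ul>
 <li>push mowers</li>
 <li>electric mowers</li>
 <li>ride-on mowers</li>
</ul>

<!-- A clearly marked hyperlink: the link text itself says where the link leads -->
<p>Read our <a href="safety.html">lawnmower safety guide</a>.</p>
```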

Often, visitors will print out text that they want to read with attention, while they skim over text that does not warrant such close scrutiny.

Information scent

Whatever they are searching for, most web surfers follow a similar strategy in hopping from page to page. They will 'scan' (skim through) a webpage, looking for interesting information. They will seek out elements that seem most promising to deliver that information, such as headings, emphasized text, and hyperlinks. If within the first few seconds a webpage does not seem to promise interesting information, the visitor will hop on the most promising hyperlink out of there. If there are no promising hyperlinks, visitors may even use the Back function of their browsers to return to a previous page.

The trail of promising elements is similar to the trail a bloodhound follows when hunting prey. It is therefore said that elements on a webpage emit an 'information scent'. The stronger that scent, the more likely the user is to follow it.

However, if a certain type of element has been producing false trails often, users will start turning away from such elements. For instance, although animated content is much more attention grabbing than static content, visitors turn away from animations more and more. This may be because web surfers have started to equate animations with advertisements; and advertisements rarely contain the sort of information the visitor was looking for.

By knowing how to make a webpage more accessible to those who use graphical browsers, and by knowing how to apply the right amount of 'scent' to certain elements on a page, the webpage author arms himself with the right tools to reach his audience optimally.

In the following, we will take a look at a couple of such tools.

Links vs. Text

The term Hypertext consists of two parts: 'hyper' and 'text'. Hyper is originally a Greek word meaning 'over', 'beyond'. Hypertext is something that takes text to a new level; but it still remains text. Hypertext has all the properties text has, and then some.

Although the title of this section makes it seem so, the extra properties of text, i.e. the (hyper)links, are not at war with the original text. At least not all the time.

If the theories of the web experts are correct, a web user surfs from page to page, following a trail of information scent. A user may be looking for a specific piece of information and trying all likely avenues that may lead to it; may be looking for specific information, but get side-tracked by other information; or may not be looking for information at all, but just for some fun. All of them seem to use the same strategy of following a trail of promising leads.

The longer a visitor stays on a page that does not hold his interest, the more likely it becomes that he starts looking for a way to escape. Although the promised information or fun may lie in the plain text that is presented on a webpage, all the escape routes are embedded in hyperlinks. Any text that does not immediately capture a visitor's attention is therefore unlikely to be read later on, unless the preceding text has already made abundantly clear to the visitor that this is where he wants to stay.

Let me repeat that it is not the author's task to entertain the reader. But an author who does not want to entertain readers will find himself without an audience. Similarly, an author who sets out to maximize the sense of reward someone gets from reading his work will reach the maximum audience for that type of work.

???more to follow

The fold

The fold is a term that could in some sense be used for all media, but is pretty useless when speaking of streamed media such as speech. It is a term used for visual browsers, meaning the bottom edge of the first displayed part of a webpage.

For instance, if a webpage consists of fifty lines of text, but only the first twenty are displayed, the fold is beneath the twentieth and above the twenty-first line.

That may seem to be an arbitrary position to give its own name, but it is not, because webpage visitors will generally look at what's above the fold, not at what's beyond.

Unfortunately, it is not a very useful metric: each visitor has his web browser set up differently, so the position of the fold differs from visitor to visitor.

However, it teaches us that things that need to be viewed by as many visitors as possible, need to be as high up on a page as possible.

The inverted pyramid

So far we have discussed the difficulty with which a visitor reads a webpage off a screen, and the haste with which he does so. Fail to grab the right person's attention, and he will disappear to the greener grass of other webpages. The web is quite vicious like that: all your competitors are your direct neighbours.

In the section about the fold we discussed how it is important to maximize your readership by putting important information at the top.

The method of writing a text with the most important sentence first, the second most important sentence second, the third most important sentence third, etcetera, is called the Inverted Pyramid.

Traditional essay writing holds that a piece is structured as follows: introduction, expansion, conclusion. For most of us, it is pretty common practice to sit on important facts for a while, and only reveal them as the piece progresses. As readers, we have become accustomed to text being structured like this, and are willing to show a certain amount of forgiveness when an author does not come to the point right away. On the web, we click away without even blinking our eyes when we encounter a text that does not immediately tell us what we want to know.
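As a minimal sketch of the inverted pyramid in markup, consider the invented product notice below. The product name and all details are hypothetical. The point is only the order: the most important fact comes first, and background comes last.

```html
<!-- Most important fact first: what the reader must know and do -->
<h1>Stop using the X100 lawnmower: blade guard may come loose</h1>
<p>Return your X100 to your dealer for a free repair.</p>

<!-- Second most important: who is affected -->
<p>Only X100 mowers with a serial number starting with "A" are affected.</p>

<!-- Least important: background, for readers who are still interested -->
<p>The fault was found during routine testing at our factory.</p>
```

A reader who leaves after the heading still takes away the essential message.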

Pitfalls

So far, we have been stressing what happens when a visitor does not find the information he is looking for. But what happens when the visitor does find what he is looking for: does web text still need to be as oddly contorted as described before in order to appeal to the visitor?

One would hope not. How would Alice in Wonderland read if we had put the most important facts first? If we had put links to more interesting literature in the text? If we had forgone all artistic ambition just to satisfy some weary surfer, who wasn't going to read our text anyway?

The usability extremists would say, yes, there are no artistic values--just a new world, with new rules.

Unfortunately for them, people do read on the web. They are moved by what they encounter. They have ways to circumvent the difficulties of the web medium. For instance, people who find an interesting text on the web often print it out, so that they can read it from an easier medium.

I would also argue, without any facts to back me up, that the nature of a text plays a role. Informative and persuasive texts may be constructed according to the guidelines given before, whereas entertaining texts may be constructed in the old-fashioned, pre-web way.

If you are trying to sell me something, for instance, there's really no need to fluff up your text with "marketese". More likely than not, I will hit the Back button of my browser faster than you can say 'percentage'.

If you are trying to inform me of something, I might be more lenient. The trade we are making (you get my attention, I get your insights) is much more immediate, and therefore worth more to me.

But even the authors of entertaining webpages can learn something here. The most important thing: you write for an audience. The medium plays a role in that. A hard rock band must use loudspeakers. It's what the audience expects. (And of course, the hard rock band that knows this can play on this, and show their soft side in an 'unplugged' session--but such things are better only tried by those who understand their medium.)

Even if the entertaining text is going to be printed out, even if the user is going to sit in front of his screen and persevere through your entire, brilliant play, some of the methods outlined before make sense.

If you do not use the inverted pyramid, you should at least use heading elements for headings, leave font sizes alone (yes, fiddling with them is unfortunately possible), let paragraphs stay separated by blank lines, etcetera. It is not rare to find that an author of science fiction stories has found a way to display his story exactly as on paper, forgetting in the meantime that he is not publishing on paper, and losing part of his audience.
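The difference between a real heading and a faked one can be sketched as follows. The chapter title and story text are invented. The font element shown in the first fragment is obsolete in HTML5 and appears here only as an example of what to avoid.

```html
<!-- Avoid: a "heading" faked with presentational markup -->
<p><font size="5"><b>Chapter 1</b></font></p>

<!-- Prefer: a real heading element; each browser chooses a suitable rendering -->
<h2>Chapter 1</h2>

<!-- Separate p elements; graphical browsers put a blank line between them by default -->
<p>The ship left orbit at dawn.</p>
<p>Nobody on board had slept.</p>
```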

Most of the promotional value of an entertaining text is in the text itself. Entertainment is highly valued: if what you wrote is good, others will link to it, including search engines.

Usability and Accessibility

The World Wide Web Consortium has produced a set of Web Content Accessibility Guidelines that provide guidance on some areas of writing web pages. The guidelines are quite daunting at first. Don't try to understand them all in a single reading. Some guidelines are easy to follow whilst others can require significant effort.

Watchfire's WebXACT is a popular free online tool for checking pages for accessibility. Although it can only check some of the guidelines and occasionally fails pages that do meet the guidelines, it is a good starting point for learning the guidelines.

Checkpoint 10.4, "Until user agents handle empty controls correctly, include default, place-holding characters in edit boxes and text areas.", is now considered unnecessary and has been removed from the draft of the next version of the guidelines. See the UK's Royal National Institute for the Blind's article Place-holding text in form elements for more information.

Checkpoint 4.2, "Specify the expansion of each abbreviation or acronym in a document where it first occurs.": it is now considered good practice to mark up all occurrences of an abbreviation instead of just the first occurrence. HTML 4.01 and XHTML 1.0/1.1 have two elements for marking up abbreviations. The abbr element is used to mark up any abbreviation. The acronym element can be used to mark up acronyms. Since acronyms are a special type of abbreviation, they can also be marked up with abbr. Abbreviations that are not acronyms should not be marked up with acronym.
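A minimal sketch of marking up abbreviations, with the expansion given in the title attribute. The sentence is invented for illustration. Only abbr is used, since it covers acronyms as well and the acronym element is obsolete in HTML5.

```html
<p>The <abbr title="World Wide Web Consortium">W3C</abbr> publishes the
specifications for <abbr title="HyperText Markup Language">HTML</abbr> and
<abbr title="Cascading Style Sheets">CSS</abbr>.</p>
```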

Questions and Exercises

Exercise 3-1

Behold the following text: "It is always the dead that are praised, or so claim those who are unsuccessful while alive. Perhaps this is so, partly because some people only revere that which has passed. However, works that have withstood the test of time, have withstood its scrutiny. Long-dead authors that are still remembered, are remembered because of the quality of their work. And no work can be styled that of a genius, until other works have come along against which it can be compared." (Freely adapted from Samuel Johnson's "Preface to Shakespeare".)

Rewrite this paragraph using the inverted pyramid.


Answers

For answers, see Answers to Questions and Exercises.



Adapting a webpage for visual browsers

Introduction

The HyperText Mark-up Language (HTML) allows you to add structure and hyperlinks to a text. This in turn allows a web browser to display that text in a useful manner to the user. The mark-up you use has little or nothing to do with the display of the text: this makes it possible to display the hypertext on a wide array of devices.

The web browser has to make a translation between the mark-up you provide and display properties. For instance, a heading can be displayed in bold, large text on a graphical browser, and can be spoken out loudly in a speech browser, et cetera.

Generally, manufacturers of web browsers make good choices. As well-trained readers, we know that a bold large text on its own line over a mass of normal sized text is probably its heading. So when a browser renders a heading like that, we tend to recognize it as a heading without even thinking. List items are preceded by list item markers such as numbers or 'bullets', emphasized text is printed in bold or italics, et cetera.

Manufacturers of graphical web browsers even take into account that reading off a screen is hard: paragraphs are divided by empty lines, text is generally printed (by default) using a relatively large font-size, and hyperlinks are clearly marked as such, by using different colors and underlines or surrounding lines (the latter in the case of images).

There are two things though, that current web browsers are getting wrong. One is that they display each and every webpage the same; the other that they generally fail at presenting webpages in an aesthetically pleasing way.

It can certainly be argued that the browser manufacturers cannot be blamed for these problems.

Exercise 4-1

Why can manufacturers of web browsers not be blamed for rendering all webpages alike? Why can they not be blamed for displaying webpages in an aesthetically unpleasing way? You may also argue the reverse, if you like.

One of the few clues a visitor might have about the quality of webpages is provided by a concept called website. A website is a collection of webpages that belong together. Because they belong together, they provide a powerful hint to the visitor that information that was promised on one page of the website, may be found on another page of that website. Similarly, if the visitor accepts the voice of a webpage author as an authority, other webpages by that author may provide an interesting target for further visits.

There are several ways in which an author can make clear that webpages form part of one overarching website. One of these ways has already been discussed: by using a sensible title text, authors can show that pages belong together. The title of this page, "Adapting a webpage for visual browsers - Wikibooks", helps to underline this. All pages on the Wikibooks website have a title that ends with "- Wikibooks". If you feel like reading other textbooks, the message is clear: you should be on this site.

Another way to suggest a relation between webpages, that is, to suggest a website, is in the strategic use of the Uniform Resource Locator (URL), the address of a webpage that is often displayed in a web browser's address bar. You can use similar addresses for similar webpages.

The third way to suggest a relation between webpages is probably the most useful, though: you can use a uniform visual style to indicate to visitors on which website they are.

A bit of history

Originally, HTML was a mish-mash of graphical display mark-up and structural mark-up. This sounds perhaps worse than it was: only a few elements were reserved for visual presentation, such as B for bold text, and I for italicized text. Further, PRE allowed an author to display text using 'plain text formatting' (as discussed before), BR forced a graphical browser to start displaying following text on a new line, and IMG let you display an image at a certain point in the text.

All this wasn't so bad, really. It did not introduce a huge drop in accessibility, but allowed authors of pages for graphical clients (more or less the default since the beginning of the web) to 'ponce up' their pages and make them more attractive to visitors.

However, by allowing display-only elements in an otherwise display-independent language, Tim Berners-Lee opened the door for abuse by browser manufacturers.

The possibility to create a visually pleasing web page made the web a more attractive place to be. Just like the DTP revolution made everybody think that they were graphical designers (mistakenly, most of the time), so did the graphical web open the door for letting form rule over content.

When the web became more popular with users, it of course also became more popular with businesses. Actual businesses were founded to produce web browsers, something almost unheard of before. The most successful of these was Netscape.

Browser wars

Netscape quickly recognised that the instant appeal of pretty webpages was as strong a selling point for a web browser as, if not stronger than, the other, more solid appeals, such as the promise of becoming one's own publisher, or of being able to traverse an associative landscape of ideas.

However, Netscape only had control over the browser, not over the web itself or its underlying language, HTML. So what Netscape did was let its web browser, Navigator, recognise an extended version of HTML. Frames, for instance, are a Netscape invention, as are JavaScript and the FONT element. Using this strategy, Netscape Navigator soon became by far the most used browser on the web. With it the web (and with the web the internet) became a public space, rather than the academic space it had been before.

The one software company that had thoroughly missed the internet boat was Microsoft. To the day of writing this, Microsoft still does not understand the internet: it does not see it as a space within which to operate, but rather as a thing to be owned. In the beginning it even tried to replace it with its own 'internet', called MSN; which is of course nonsense, because the internet is not an atomic network. It is a network of networks. MSN was to be part of the internet, much to the chagrin of Bill Gates and his people.

Meanwhile, Netscape looked further ahead. For them, the internet was a vehicle to gain control over the 'desktop', the metaphor 1980s' operating systems use to describe themselves. If everybody had Netscape Navigator installed, and Netscape Navigator was this universal tool that could be used for everything, from playing games to word processing, it did not really matter which operating system was running beneath the browser.

Microsoft was, through its original disdain for the internet and the web, suddenly threatened in the core reason of its existence. It then made a strategic decision that pulled it right into the centre of the internet: it would build its own browser.

Microsoft's browser was called Internet Explorer, and it started the so-called browser wars. Microsoft started playing Netscape's game. It first supported almost all of Netscape's HTML extensions, added a few of its own, and changed the behaviour of some elements in a way that made them work slightly better.

The web was under threat of balkanization: division into parts so small that dealing with the web, and especially authoring web pages, was to become an ordeal that would almost prove too much for authors and surfers alike.

Authors suddenly had to decide which browsers to support, or whether they should support specific browsers at all. The possibility to treat the web as a purely visual medium planted the misguided thought in a lot of heads that web pages should look alike everywhere.

This was when the World Wide Web Consortium stepped in, and started rallying for stricter standards, and a division of document structure and document lay-out. The smartest move of the W3C was to get the browser manufacturers on board. With Microsoft and Netscape both having a direct say in what the next versions of HTML would look like, they became stakeholders with an interest in creating a useful web language.

The W3C introduced a style sheet language for suggesting certain lay-outs to a web browser, called CSS. That abbreviation stands for Cascading Style Sheets. Stylesheets had a couple of things going for them. Most importantly, they promised a 'code once, view everywhere' approach to web authoring. This was good for the W3C, because that approach was what HTML was about in the first place. And it was good for the authors, because now they did not have to learn about the quirks of every web browser.

From the start, stylesheets had other advantages:

  • They could be stored in separate files, so that the style for an entire site could be stored in one stylesheet;
  • They allowed for chaining ('cascading') stylesheets, so that part of a website could have its own distinct style from the rest of the site, while still looking part of the site; and
  • They enabled a few tricks that Netscape and Microsoft hadn't gotten to yet, such as more control over text styles.
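The first two advantages can be sketched with the head of a single page that links to two separate stylesheet files. The file names css/site.css and css/manuals.css, and the site name, are assumptions made for this example.

```html
<head>
 <title>Manuals - Example Lawnmowers</title>
 <!-- Site-wide stylesheet, shared by every page of the site -->
 <link rel="stylesheet" href="css/site.css" />
 <!-- Extra stylesheet for the manuals section only. It is linked later, so where
      both files set the same property with selectors of equal weight, this one
      takes precedence -->
 <link rel="stylesheet" href="css/manuals.css" />
</head>
```

Pages outside the manuals section would link to css/site.css only, so a change to that one file would restyle the whole site.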

Future

It could be argued that allowing a few graphical display elements in a hypertext language was a good move. It popularized the web, and with it the internet. It introduced a great deal of authors to the web. However, today many of these authors see the web as a visual medium rather than the (hyper)textual medium it really is. Weaning those people from their wrong notion may prove an impossible task.

Websites and home pages

Websites

As we saw before, webpages form part of the web, because they link to other pages or because other pages link to them. In other words, web pages do not live in a vacuum. When you receive a leaflet for a pizza delivery service in your letter box, for instance, there is no context. You do not generally receive copies of competing services simultaneously, or a map that shows you where the delivery service is located, or an encyclopaedia that tells you about the history of pizzas.

A webpage does come with such a context. For instance, if you visit a webpage that lets you order pizzas, you probably visited a page before that which let you choose from several pizza delivery services. Also, the order page may link to a map, or to an article about the history of pizzas. Even if it does not do so, the web browser may provide additional functionality to you. For instance, if you selected the name of a pizza in the Firefox web browser, then right-clicked, a menu item would appear that would let you search Google for the selected text in a new tab in the background.

Although all webpages have such a context, only the ones provided by you are under your control. When you group a number of related webpages, such a grouping is called a website. Webpages may form a website for any number of reasons: because they belong to the same subject, because they are hosted on the same server, because they live on the same domain, or because they are created by a single person or organisation.

For instance, the collection of webpages at http://www.nasa.gov forms the website of the US national space agency NASA. They may not all be served by the same server, and they are not all created by the same person, but they live on the same domain, try to be the voice of NASA on the web and deal with topics that are all related to NASA.

A person or an organisation may of course have multiple websites; exactly what is and is not called a website in this respect is not really important.

Websites are often characterised by

  • a main site navigation ("menus" of hyperlinks that are repeated on every webpage and that lead to important sections),
  • a coherent visual style across webpages, including a logo and a favicon,
  • coherent page names, for instance by repeating the site's name in the title element,
  • a natural division of the website in topics and subtopics.
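As a rough sketch, the clues listed above might appear in a single page as follows. The site name, file names and menu entries are all invented for the example.

```html
<!DOCTYPE html>
<html>
 <head>
  <!-- Coherent page name: page title followed by the site's name -->
  <title>Contact - Example Lawnmowers</title>
  <!-- Favicon and shared stylesheet give every page the same visual identity -->
  <link rel="icon" href="favicon.ico" />
  <link rel="stylesheet" href="css/site.css" />
 </head>
 <body>
  <!-- Main site navigation, repeated on every page -->
  <nav>
   <ul>
    <li><a href="index.html">Home</a></li>
    <li><a href="products.html">Products</a></li>
    <li><a href="contact.html">Contact</a></li>
   </ul>
  </nav>
  <h1>Contact</h1>
  <p>You can reach our factory by phone or by post.</p>
 </body>
</html>
```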

All these clues tell you after going from one page to another that you are either on the same website, or have left for another website. Most web authors get these clues at least partially right, and most web page visitors are able to tell which website they are on.

An interesting feature of a webpage is then that it contains information that is at the core of what the author wanted to say with that particular webpage, but that it also contains information to the visitor that they are on a particular website, where they are on that site, where they can go, etc.

Most webpages will contain a lot of information that is pertinent to the subject of that page, and a little information about the website itself. The exceptions are called home pages; home pages are about the website they are part of.

Home pages

The "homepage" is the main page of a website. It is used as a central hub for the rest of the site. This is the file that is displayed when going to a web address that doesn't specify a document, e.g. http://www.example.com/. It is typically named index.html, index.htm, default.html, or default.htm.

A homepage has a number of functions and a number of rules and heuristics that it should adhere to. The functions of a homepage are to:

  • provide a navigational aid for the site
  • provide information on the site's theme
  • establish the brand identity of the website (e.g. the site pages' appearance)
  • provide a means for visitors to reorient themselves
  • provide the minimal location for a web page author to link to
  • live at an easy to remember location

Let's review these.

Navigational aid

When a visitor follows an information scent to your web page, the web page may or may not fulfill the information wish of the visitor. In the latter case, a visitor will either want to track back, or follow further scents.

As we have noted, the indication that a webpage is part of a website provides a powerful hint to the visitor that the same voice that wrote the webpage has written other webpages on possibly similar subjects. A visitor who does not succeed at the current webpage, or who has succeeded but has now changed goals, may want to further explore your website.

For example, if you collect jokes, and a visitor thinks the first of your jokes is funny, they may wish to read more of your jokes.

Links on the webpage to related webpages can be very useful; but sometimes these links are lacking, or they are worded in a way that does not help your visitor, or they do not make clear that they lead to parts of the same website; and often a webpage does not show the intentions an author has with a website, the freshness of a website et cetera.

A homepage should provide this sort of information, or at least lead to it. The homepage of a larger website should also provide alternative ways of navigation. This could be done through:

  • a search function
  • a main menu
  • a catalogue
  • highlighted webpages

A common way to link to a homepage is to use the logo or site name as a hyperlink, or to link to it from a breadcrumb trail.
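Both ways of linking to the homepage can be sketched in a few lines of HTML. The file name logo.png, the site name and the "/jokes/" section are hypothetical, and "/" is assumed to be the address of the homepage.

```html
<!-- Logo (or site name) as a link to the homepage -->
<p><a href="/"><img src="logo.png" alt="Wily's Pages"></a></p>

<!-- Breadcrumb trail: the first crumb leads back to the homepage -->
<p>
 <a href="/">Home</a> &gt;
 <a href="/jokes/">Jokes</a> &gt;
 Knock-knock jokes
</p>
```

The alt text on the logo matters here: if the image cannot be shown, the site name still works as the link text.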

Site style

In order to find out that you are on a website, you need to have visited at least two webpages on that site: one to establish that a certain style was used, another to verify that style. Any webpage on a site can be used for that second function, but the homepage must always be usable in this way.

Site meaning

The homepage of a website should always make clear what the website is about, just like a webpage should always make clear what that webpage is about.

News

In lieu of other designated places where a visitor can review whether pages of a website have been added or updated, the homepage should be regularly updated to indicate that a website is still "alive". Popular ways to do this are to show leads to recent news items, to regularly change highlighted items, or to regularly make simple changes to the layout of the homepage. Another way is to introduce seasonal elements into a homepage. For instance, you could use your homepage to wish visitors happy holidays.

Other designated places that allow you to assess the liveliness of a website are the news page, or for example the Recent Changes page at Wikipedia. You can use these pages instead of the homepage for situational feedback, as long as it is clear to visitors that they need to look somewhere else, for instance because you clearly link to a news section.

Location

In general, you will wish to let the world know about specific webpages, instead of your website. However, there are instances when the latter is desirable. For those instances it would be useful if the homepage were easy to find and to get to. One way to achieve this is to locate the homepage at the shortest possible URL. If your webpages are at http://www.example.com/~wily/friends.html, http://www.example.com/~wily/album.html, and http://www.example.com/~wily/contact.html, your homepage should be at http://www.example.com/~wily/. (When faced with a request for a directory instead of a file, a webserver will generally start looking for files with a certain name, such as index.html, index.php, welcome.html etc. This behaviour differs from webserver to webserver, but index.html is usually a pretty safe file name to give to your homepage.)

Also, when you forget to link to a homepage, or when the link cannot easily be found, visitors will apply a trick called the directory traversal attack. Despite its ghoulish name and the fact that it is a crime in the UK, this is a perfectly moral and fine thing to do. It works by guessing the parts of a URL that are superfluous for the homepage address.

For example, if you are at http://www.example.com/~wily/friends.html, removing "friends.html" or "~wily/friends.html" from the full address in the address bar of your web browser may lead to the homepage of this webpage's site.


Recapitulation

In essence, a homepage provides situational knowledge about a website. It should show a visitor what the site is about, what its main themes are, how you can get there, how fresh a site is, etc.

Cascading Style Sheets

Stylesheets and the style element

CSS (Cascading Style Sheets) is a language used to "style" markup, such as HTML. CSS is a series of rules, and each rule has three parts: a selector, a property, and a value.

 a {
   color: red;
   font-style: italic;
 }

In this example, "a" is the selector. It selects all anchors ("links") in the document. Each declaration that affects anchors is enclosed in the braces ({}) following the selector. Here, the two properties are "color" and "font-style". "color" sets the color of the text, and "font-style" sets the style of the font (normal, italic or oblique). Anchors will appear as red, italicized text.

There are three methods of adding CSS to a web page, but the third is considered the best for most purposes.

1) As a tag attribute. The name of the attribute is "style".

<a style="color: red; font-style: italic;">Example link</a>

This is considered the worst way of adding styles, in most cases. The reason for this is that it's very hard to maintain. If you wanted to change the color of the anchors on your site, you'd have to find and replace the attribute in every "a" element in every web page, which could occur hundreds or even thousands of times, depending on the size of the website.

2) As a style element. This element is placed inside the head element.

 <style>
   a {
     color: red;
     font-style: italic;
   }
 </style>

This is considered the second worst way of adding styles, in most cases. While not nearly as difficult to maintain as the first method, you'd still have to synchronize all of the style elements across all documents, and as your site grows, this would become more and more awkward.

3) As a linked document. The document ends in .css (e.g. "styles.css"). This is considered the best way, in most cases, because it is the easiest to maintain. To link your web page to your stylesheet, add this in your head tag:

 <link rel="stylesheet" href="styles.css" />

Now, if you want to change any part of the anchors' appearance, you'd only have to edit one file, and it will affect all pages on the website that have that link in the head.
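A complete page using the third method could look like the sketch below. It assumes that a file called styles.css, containing the "a" rule from the start of this section, sits in the same folder as the page; first.html is the page created in the Requirements chapter.

```html
<!DOCTYPE html>
<html>
 <head>
  <title>Linked stylesheet example</title>
  <!-- styles.css is assumed to be in the same folder as this page -->
  <link rel="stylesheet" href="styles.css">
 </head>
 <body>
  <p>Read <a href="first.html">my first page</a>.</p>
 </body>
</html>
```

If the stylesheet is found, the link should appear in red italics. If it appears in the browser's default link style, the most likely cause is a wrong file name or location in the href attribute.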

Typography on the web

  • Empty lines separating block level elements
  • Line width
  • Losing control
  • Practice: font family
  • Practice: italics, bold, font-size
  • Practice: line-width
  • Practice: line-height

Colours: dangerous and beautiful

  • Easy branding with colour
  • Danger: colour blindness
  • Danger: browser settings
  • Solution: never use just colours
  • Solution: always define all colours
  • Practice: applying colours to text and links
  • Practice: applying colour to backgrounds
  • Practice: applying coloured borders
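The two "solutions" in this outline can be illustrated with a small sketch. The colour values and the class name "note" are arbitrary choices made for the example. A style element is used only to keep the example in one file; on a real site these rules would normally live in the linked stylesheet.

```html
<!DOCTYPE html>
<html>
 <head>
  <title>Colour example</title>
  <style>
   /* Always define text and background colour together, so that a
      visitor's browser settings cannot combine into unreadable text */
   body {
     color: black;
     background-color: white;
   }
   /* Define both link states, and keep the underline so that
      colour is never the only clue that something is a link */
   a:link    { color: #0000cc; text-decoration: underline; }
   a:visited { color: #551a8b; text-decoration: underline; }
   /* A coloured border around paragraphs with the class "note" */
   p.note {
     border: 2px solid #cc0000;
   }
  </style>
 </head>
 <body>
  <p>Plain text with <a href="first.html">a link</a>.</p>
  <p class="note">Note: this paragraph has a coloured border.</p>
 </body>
</html>
```

The second paragraph starts with the word "Note" as well as having a red border, so a visitor who cannot see the colour still gets the message.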

Preparing for print

The print style sheet.
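One possible shape of a print style sheet is sketched here, under the assumption that the navigation list carries a hypothetical class name "navigation". The rules inside the @media print block are applied only when the page is printed.

```html
<!DOCTYPE html>
<html>
 <head>
  <title>Print example</title>
  <style>
   /* Rules inside this block apply only when the page is printed */
   @media print {
     /* Menus are of no use on paper ("navigation" is a hypothetical class name) */
     ul.navigation { display: none; }
     /* Plain black on white stays readable on any printer */
     body { color: black; background-color: white; }
   }
  </style>
 </head>
 <body>
  <ul class="navigation">
   <li><a href="/">Home</a></li>
  </ul>
  <p>The text of the document goes here!</p>
 </body>
</html>
```

The same rules can also be kept in a separate file and linked with a media="print" attribute on the link element, which fits the linked-document method recommended earlier.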

Questions and Exercises

Answers

For answers, see Answers to Questions and Exercises.




HTML, XHTML and DOCTYPEs

The current version of HTML is HTML5. Since modern browsers have been supporting HTML5 for several years now (as of 2016), you'll probably want to use that one.

Versions of HTML prior to 4.0 are obsolete. Web pages in these versions should be withdrawn or converted to a variant of HTML version 4.01 or HTML5. No new page should be created using a version prior to HTML 4.01. You shouldn't be using any variant of XHTML, since it has largely been abandoned in favor of HTML5.

Internet Explorer 8.0 and earlier do not support HTML5. In December 2014, about 6% of web surfers were still using one of these versions.

Validation

Regardless of the version of HTML you choose to use, it is important that you regularly validate your pages to ensure they conform to your chosen specification. The World Wide Web Consortium (W3C) has an online validator at https://validator.w3.org/. You can either upload your HTML file or copy and paste its contents into the validator.

HTML5 Doctype

The Doctype for HTML5 is <!DOCTYPE html>

All HTML5 documents must start with this line. Note that while this looks like a tag, it has special rules. It does not have a closing tag, nor is it self-closing.




Authoring websites

URLs

URLs should be permanent. When you begin to put together a website, make sure that the URLs you choose will survive redesigns. A useful article on good URLs is Choose URIs wisely.

Navigation bars

Navigation bars normally consist of lists of links. Since they are lists, they should be marked up using the ul or ol elements. (HTML5 also provides the nav element, which can be wrapped around such a list to mark it as navigation.)

 <ul>
  <li><a href="/" accesskey="1">Home</a></li>
  <li><a href="/news/">News</a></li>
  <li><a href="/contact/">Contact us</a></li>
  <li><a href="/accesskeys/" accesskey="0">Access keys</a></li>
 </ul>

Every page on your website should include a link to your home page. The link should be in the same place on every page. In most cases it will be the first link in your primary navigation bar.
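In HTML5, a navigation bar like the one above can additionally be wrapped in a nav element, which tells browsers and assistive software that the list is site navigation. This is a minimal sketch reusing the same hypothetical addresses, with the home page link first:

```html
<nav>
 <ul>
  <!-- Home page link first, in the same place on every page -->
  <li><a href="/">Home</a></li>
  <li><a href="/news/">News</a></li>
  <li><a href="/contact/">Contact us</a></li>
 </ul>
</nav>
```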

Access keys

Access keys are keyboard shortcuts for links and form elements. Important links that appear on all pages on your website, e.g. the link to the home page, should be given an access key. The access key bindings should be the same on all pages on your website. You should include a web page that lists all the access keys used on your website.
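The sketch below shows an access key on a link and on a form element, followed by the kind of list an access keys page might contain. The "/search/" address and the choice of key 4 for the search box are assumptions made for the example; keys 1 and 0 match the navigation bar shown earlier.

```html
<!-- On every page: the same key always means the same thing -->
<p><a href="/" accesskey="1">Home</a></p>
<form action="/search/">
 <p>
  <label for="q">Search</label>
  <input type="text" id="q" name="q" accesskey="4">
  <input type="submit" value="Go">
 </p>
</form>

<!-- On the page that lists the access keys -->
<dl>
 <dt>1</dt> <dd>Home page</dd>
 <dt>4</dt> <dd>Search box</dd>
 <dt>0</dt> <dd>This list of access keys</dd>
</dl>
```

How an access key is triggered (for instance which modifier key must be held down) differs between browsers and operating systems, which is one more reason to document the keys on a page of their own.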

Search and Site Map

You can't have a link from your home page to every other page on the website. If your website contains more than a few pages you need a site map and possibly a search tool so users can find relevant pages easily.
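A site map can be as simple as a nested list of links that mirrors the division of the site into topics and subtopics. In this sketch the "/news/archive/" page is a hypothetical addition to the addresses used in the navigation bar example.

```html
<h1>Site map</h1>
<ul>
 <li><a href="/">Home</a></li>
 <li><a href="/news/">News</a>
  <!-- Subtopics are nested inside the list item of their parent topic -->
  <ul>
   <li><a href="/news/archive/">News archive</a></li>
  </ul>
 </li>
 <li><a href="/contact/">Contact us</a></li>
 <li><a href="/accesskeys/">Access keys</a></li>
</ul>
```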




Promote your website

What is the Internet

The Internet is a vast network that connects computers all over the world. Through the Internet, people can share information and communicate from anywhere with an Internet connection.


Preventing link rot

If you maintain a web site, or if you use links to other web sites (like in a blog or on a wiki), then you could suffer from link rot. Link rot is the process by which links on a website gradually become irrelevant or broken as time goes on, because websites that they link to disappear, change their content or redirect to new locations. Link rot particularly affects free web hosts, like GeoCities, where people lose interest in maintaining their site.

Discovering

Detecting link rot for a given URL may be difficult using automated methods. If a URL is accessed and returns an HTTP 200 (OK) response, it may be considered accessible, but the contents of the page may have changed and may no longer be relevant. Some web servers also return a soft 404: a page that indicates the URL is no longer accessible, but that is served with a 200 (OK) response instead of a 404 (Not Found). In the end, the only reliable way to test that a link is still valid is to click through and check it.

Combating

Webmasters

A number of basic rules can help webmasters to reduce link rot, including:

  • Do not keep a hyperlink collection unless you are willing to look after it.
  • Design your hyperlinks to be maintained, for example by keeping them in a central hyperlink collection.
  • Do not link to sub-pages ("deep linking") unless you are confident that they will remain stable.
  • Use hyperlink checker software or a Content Management System (CMS) with link checking included.
  • Use permalinks.
  • Put an e-mail address or other contact information, with specific instructions, on the same page as the links ("Found a bad link? Contact links@example.com and we'll fix it.").
  • When changing domains, help others fix their link pages by spreading the information well ahead of the migration, and use HTTP status codes to communicate that a page has moved (e.g. "301: Moved Permanently").
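The contact notice suggested in the list above could be marked up as follows. This is a minimal sketch; the address links@example.com is the placeholder used in the list and should be replaced by a real one.

```html
<p>
 Found a bad link? Contact
 <!-- A mailto: link opens the visitor's e-mail program with the address filled in -->
 <a href="mailto:links@example.com">links@example.com</a>
 and we'll fix it.
</p>
```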

Citing URLs

When linking you should avoid citing "unstable" Internet references. There are several approaches that you can take to avoid introducing link rot:

  • Avoid using URLs that point to resources on a personal site.
  • Use Persistent Uniform Resource Locators (PURLs) and digital object identifiers (DOIs) whenever possible.
  • Use WebCite to permanently archive and retrieve cited Internet references.

Tools

There are a number of tools that can be used to combat link rot by archiving web resources:

  • WebCite, a tool specifically for scholarly authors, journal editors and publishers to permanently archive "on-demand" and retrieve cited Internet references.
  • Archive-It, a subscription service that allows institutions to build, manage and search their own web archive.
  • hanzo:web is a personal web archiving service created by Hanzo Archives that can archive a single web resource, a cluster of web resources, or an entire website, as a one-off collection, scheduled/repeated collection, an RSS/Atom feed collection or collect on-demand via Hanzo's open API.
  • Spurl.net is a free on-line bookmarking service and search engine that allows users to save important web resources.

Modern management

On Wikipedia and other wiki-based websites, only external links still present a maintenance problem. Wikipedia uses a clear color system for internal links, so the user can see whether a link is live before clicking on it. When referencing an old website or dated information, users can link to pages in the Internet Archive, which allows for a reliable permanent link.




Answers

Rewrite this paragraph using the inverted pyramid.


Behold the following text: "Long-dead authors that are still remembered, are remembered because of the quality of their work. And no work can be styled that of a genius, until other works have come along against which it can be compared. It is always the dead that are praised, or so claim those who are unsuccessful while alive. Perhaps this is so, partly because some people only revere that which has passed. However, works that have withstood the test of time, have withstood its scrutiny." (Freely adapted from Samuel Johnson's "Preface to Shakespeare".)


Teacher's Guide

Goals of this course

This course has only one goal: to teach the practice of creating webpages.

If you want to teach your pupils about computing, networking, the internet, HTML et cetera, I would like to refer you to other Wikibooks.

Requirements

In order to be able to teach this course, you need to:

  • familiarise yourself with the material;
  • provide your students with a computer linked to the internet, or make sure they have access to such a computer for the duration of the lesson;
  • provide your students with web browsing software, or make sure they have access to a web browser;
  • provide your students with text editing software, or make sure they have access to a text editor;
  • set up a web server, and provide access to it;
  • provide storage space for your students' projects; and
  • make sure your students possess a minimum of computing experience.

Optional:

  • graphics to be used in the practice projects; and
  • graphics editing software, to create such graphics.

Web browsers

The web browsers you use should be able to handle HTML 4 or better. Web browsers that are shipped with current operating systems are acceptable for this goal.

There are lessons in this course that deal with graphical presentation of web pages. You may skip these lessons, but if you don't, you need to provide a so-called graphical web browser. Well-known browsers of this kind include Mozilla, Apple Safari, Opera, and Microsoft Internet Explorer. These browsers are either provided with your operating system, or can be downloaded and used freely.

If you need to download and install a web browser, we recommend the Firefox family of web browsers.

Ideally you will provide your students with several web browsers on side-by-side computers. This will allow them to see how the rendering of web pages varies between different browsers.

Text editors

The text editors that generally accompany an operating system are all that is needed to create webpages in this course.

Although you may use editors that are specifically geared towards creating and editing websites, we advise against this. It might make your students dependent on such editors. Even if such an editor is all they will use after they have finished the course, learning the basics of creating a webpage can be very helpful during troubleshooting.

Webservers

None of the practice projects are so complex that they require a particular server set-up. Later versions of this course may require more advanced webservers. If you do not know how to set up a webserver, consult your systems administrator.

One thing to keep in mind is that you may wish to run your webserver solely on an intranet. That way, your students cannot abuse the school's Internet connection to publish information the school does not wish to be associated with.

If you are the systems administrator, you might find Apache2Triad or more generally Wikipedia:comparison of web servers helpful.

On the other hand, using a free web hosting service is likely to be easier than setting up even the easiest intranet web server. It may help student motivation to see that their work is becoming part of the "real" WWW that anyone in the world can immediately read, and not just another homework assignment that only the teacher can read. Although some hosting services provide "templates" and "wizards" for creating websites, we recommend against using those tools -- it might make students dependent on that hosting service. Instead, use ordinary file uploads.

Storage space

The practice projects are all reasonably small. You may devise a large end-of-course project though. Also, your students may wish to study webpages they created earlier on. For these reasons, it could be handy to provide your students with storage space to keep their projects on and reference later.

For scale: the HTML, CSS and JavaScript files of one larger-than-average website (not counting image files, MP3s, videos, etc.) total less than 8 MB.

Does this need to be separate from the files on the web server?

Previous knowledge

If your students are new to computing, you may want to spend the first lesson acquainting them with some general principles of computing. At the very least, they should know how to:

  • Start up and switch off a computer or session;
  • Start and quit programs;
  • Open and close files from programs;
  • Copy, move, create and delete files on a file system;
  • Operate a web browser; specifically, how to open a URL; how to open a file; how to navigate between webpages using hyperlinks; how to navigate between webpages using browser functionality.

Class project

Since this is a work book as much as a text book, students will benefit from having engaging projects in which they can test their skills to the fullest.

Many of the exercises in this book are designed to provide such projects, but they will probably be found lacking by almost all students. The exercises are limited by their nature: they must assume incomplete knowledge in the earlier chapters of the book, and they must be finishable within the time set for homework.

Therefore, it would be advisable to have an end-project in which the students can test all the skills and knowledge they acquired during this course.

The teacher is free to invent such a project, and hopefully in the course of time, many cool project proposals will be added to this Wikibook.

However, a teacher of minors can also choose to participate in the six-monthly ThinkQuest competition. ThinkQuest is a website creation competition for elementary and secondary school students that has produced some of the finest sites on the web today. Students are pitted against other students from all over the world. The competition is fierce and the participants need to use all they've got to have a chance at winning. All participating teams must have a school teacher as team leader, but the students themselves must do all the work.

In due time, this Wikibook will provide all the basic knowledge required for participating in and winning a ThinkQuest competition with one major exception: this book will not deal with embeddable content, such as images, animation, video et cetera. These form an important part of the web, but they fill a whole separate book on their own: the Web Design book.

Students who are following this course on their own are encouraged to think up their own projects. If you come up with fun projects, please share them by posting their descriptions here.