One warning: This document is meant to be a continual "work-in-progress" that reflects the ever-changing nature of the WWW. So, if you spot any technical inaccuracies, please be forgiving, and tell me what's new! Thanks.
However, if you want to do anything at all fancy, such as link to other documents, put in graphics, make bulleted or numbered lists, or whatever, then you'll need to be able to do a bit more than this. This Lab isn't long, and you'll be able to do a lot more when you're finished today. Let me know if there are any confusing sections so I can improve it.
As a markup language, HTML is concerned less with the appearance of a document than with its structure. Rich Text Format (RTF), on the other hand, is an example of a formatting language. The difference between them is that, in HTML, you would use commands to mark the headings, normal paragraphs, lists (and whether they are numbered or not), and even things like addresses. In RTF, you would use commands (usually the word processing program does this for you) to indicate the typeface, font size, and style of the text in a document.
HTML is a relatively simple application of Standard Generalized Markup Language (SGML). HTML is simple enough to type in directly without any sort of HTML editor. HTML editors are useful, especially if you have massive quantities of documents to write, but they are not necessary to get started.
In general, HTML commands begin with a < and end with a >. The commands are almost never case-sensitive and are either "container" or "separator" commands (although there are numerous exceptions to both of those generalizations). By a container, I mean that there is usually a beginning command and an ending command. The commands are then applied to the text between the beginning and ending commands. An example of a container command is the title command, which surrounds the text that is designated as the document's title with <title> and </title>. An example of a separator command is the command used to separate paragraphs (<p>).
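As a rough illustration of the two kinds of commands just described, here is a small fragment (not a complete document). The sample text is made up for this sketch:

```html
<!-- Container command: the text between <title> and </title> is the title -->
<title>My First Page</title>

<!-- Commands are not case sensitive, so this would mean the same thing:
     <TITLE>My First Page</TITLE> -->

<!-- Separator command: <p> marks the break between paragraphs -->
This is the first paragraph.<p>
This is the second paragraph.<p>
```

The lines between <!-- and --> are comments. Browsers skip them, so they are a handy way to leave notes to yourself in a file.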
White space, meaning anything that is not a printable character (spaces, tabs, and line breaks), is generally ignored in HTML: a run of white space is treated as if it were a single space. Leaving a blank line in your document will generally not create a blank line when the document is displayed in a browser.
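To see the white space rule in action, try something like the fragment below. It is only a sketch with made-up spacing. Both paragraphs should come out looking the same in a browser, because extra spaces and line breaks in the source are not carried over to the display:

```html
<!-- Typed normally -->
The quick brown fox jumped over the slow lazy dogs.<p>

<!-- Same words with extra spaces, line breaks, and blank lines -->
The    quick brown
fox      jumped over


the slow lazy dogs.<p>
```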
Finally, not every element common to typical documents is included in HTML. You will occasionally have problems converting some documents. For example, the version of HTML in common use today doesn't support equations, and support for tables is relatively new and not available in all browsers.
A quick word about the HTML philosophy
There is a sort of paradigm shift involved with using HTML, especially
for people used to having a lot of control over the look of a page
(such as publishers). With the Web, the user or reader
has a lot more control over the look and feel of documents than most
publishers are used to.
The designers of HTML were concerned with conveying the structure of documents in a simple, portable way. The advantage of this approach is that users on almost any kind of system, from a Macintosh to a simple, dumb terminal, can view documents formatted with HTML with relatively little loss of information.
Some people attempt to get around formatting limitations in HTML by using a lot of graphics or using HTML commands in such a way as to make their documents look "just right" in one particular browser (such as Netscape or Explorer). Often, the result is that those same documents look terrible or are even inaccessible in another browser.
It's probably better in the long run to use general HTML commands. Your documents will be more accessible to more people and you won't have to worry as much about how your documents will look on browsers that you don't have access to.
The top part of the document should also have a section for heading information which is surrounded by the <head> and </head> commands. There are several items of information that you can put in the header, but almost all of it is totally ignored by most browsers out there. One piece of information you should always have in the heading is the title of your file. For our purposes, you should name each of your files in sequence as YOUR_WEBNAME1,2,3, etc., followed by ".html". So, for example, your first file might read "Iam_Asenior1.html" (if your webname is Iam Asenior). The title (as mentioned earlier) is surrounded by the <title> and </title> commands.
The title of a document is not normally displayed as part of the page, but is often displayed in some sort of special section in most browsers (Netscape puts the title in a Document Title box just under its menu, for example). The title is also used by most browsers when saving the user's "hotlist," so ordinarily it should be both descriptive and short enough to fit comfortably on one line.
Finally, the "body" of the document should also be marked off with the <body> and </body> commands. This is the part of the document that is normally displayed as the page in most browsers.
This is what a typical document would look like so far:
<html>
<head>
<title>This is my title</title>
</head>
<body>
</body>
</html>
(Remember that white space doesn't matter, so this stuff could all be on one line if you wanted. It makes no difference.)
Most browsers recognize at least four heading levels. There is support in HTML for more than that, but what I mean by "recognize" is that the browser gives up to four levels of headings a distinct style. After the fourth level, it gets difficult to tell the heading levels apart. If you get much beyond that, you should consider breaking up your document into multiple pages.
The heading commands look like <hX> and </hX>, where X is the heading level. In most documents on the Web, the first heading is a duplicate of the document's title. Our typical document would look like this after we added the first heading:
<html>
<head>
<title>This is my title</title>
</head>
<body>
<h1>This is my title</h1>
</body>
</html>
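To see how your own browser tells the heading levels apart, you could try a sketch along these lines. The titles and text are made up for illustration, and the exact sizes and styles you see will depend on the browser:

```html
<html>
<head>
<title>Lab Notes</title>
</head>
<body>
<!-- The first heading repeats the document's title -->
<h1>Lab Notes</h1>
<h2>Part One</h2>
<h3>Section A</h3>
<h4>A minor point</h4>
Text under the fourth-level heading.<p>
<h2>Part Two</h2>
Text under a second-level heading.<p>
</body>
</html>
```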
Our sample document would look like this after we added a paragraph or two and a subheading:
<html>
<head>
<title>This is my title</title>
</head>
<body>
<h1>This is my title</h1>
This is a sample paragraph. The majority of most documents contain
this type of construct. <p>
<h2>This is a subheading</h2>
The quick brown fox jumped over the slow lazy dogs.
The quick brown fox jumped over the slow lazy dogs.<p>
The quick brown fox jumped over the slow lazy dogs.
The quick brown fox jumped over the slow lazy dogs.<p>
</body>
</html>
I put the paragraph marks after each paragraph in this example, but they can just as easily go in front of each paragraph. The <p> is just a separator.
Assignment #2: Take time now to prepare a sample document similar to the one above, but you don't have to use the "Quick brown fox" sentence as your sample text. Feel free to express yourself within community standards. When finished, view it in Netscape and compare your image with others in the class. Remember to save this file to your storage device!
Lists
There are three kinds of lists in HTML: ordered, unordered, and a
special kind called a definition list. The ordered lists are numbered.
Unordered ones typically just use bullets to mark each item.
In ordered lists, the browser takes care of inserting the actual numbers. This behavior is convenient for authors because if you insert or delete items in an ordered list, you don't have to worry about renumbering everything. An ordered list begins with <ol> and ends with </ol>.
Unordered lists typically use bullets to mark off each item in the list, but this is up to the browser (a DOS browser may use asterisks or dashes, for example). An unordered list begins with <ul> and ends with </ul>.
In both kinds of lists, the individual items are designated with a <li> command. This is another one of those commands that isn't typically used as a container (i.e. it doesn't have a corresponding </li> command).
You can also nest lists to get an outline effect.
Here's our sample document with a few lists thrown in:
<html>
<head>
<title>This is my title</title>
</head>
<body>
<h1>This is my title</h1>
This is a sample paragraph. The majority of most documents contain
this type of construct. <p>
<h2>This is a subheading</h2>
The quick brown fox jumped over the slow lazy dogs.
The quick brown fox jumped over the slow lazy dogs.<p>
Here's an ordered list:<p>
<ol>
<li> first item.
<li> second item.
<li> notice that &lt;p&gt; commands are not necessary to
separate list items.
</ol>
Here's an unordered list:<p>
<ul>
<li> an item.
<li> another item.
<li> here's a nested list
<ul>
<li> a nested item
<li> another nested item
</ul>
<li> the last item
</ul>
The quick brown fox jumped over the slow lazy dogs.
The quick brown fox jumped over the slow lazy dogs.<p>
</body>
</html>
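Assignment #3: Using the file you created in #2 above, make 2 new files with your text converted to ordered and unordered lists. View your files in Netscape, and remember to save the files to your storage device!
This is roughly what the lists above would look like when rendered with your browser (the exact bullets and indentation are up to the browser):
Here's an ordered list:
  1. first item.
  2. second item.
  3. notice that <p> commands are not necessary to separate list items.
Here's an unordered list:
  * an item.
  * another item.
  * here's a nested list
      * a nested item
      * another nested item
  * the last item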
Here's a sample definition list:
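<dl>
<dt> First Term
<dd> First term's definition.
<dt> Second term (or title, or whatever)
<dd> Text that explains or expands on the second term.
</dl>
This is roughly what it would look like in your browser:
First Term
    First term's definition.
Second term (or title, or whatever)
    Text that explains or expands on the second term.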
URLs
A URL (Uniform Resource Locator) is
the address of a document or resource. It usually takes this form:
protocol://machine.name/directory/document.name
The protocol is the Internet protocol used to reach the document or resource. On the Web, it is typically "http", but it can be any of numerous other things (such as ftp, gopher, telnet, etc.). The machine.name is just what you think it is: the name of the host where the document resides (such as www.pcweek.ziff.com).
The directory and document.name components of the URL are self-explanatory: they say where on that host the document lives and what it is called.
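Here is a sketch of how the pieces of a URL fit together, written as a short list of links (the link command itself is covered in the next section). The host names www.example.com and ftp.example.com, along with the directories and file names, are placeholders invented for this illustration, not real documents:

```html
<!-- protocol:      http
     machine.name:  www.example.com
     directory:     /lab/
     document.name: intro.html -->
<ul>
<li> <a href="http://www.example.com/lab/intro.html">A document reached with http</a>
<li> <a href="ftp://ftp.example.com/pub/readme.txt">A file reached with ftp</a>
</ul>
```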
The easiest way to get the URL of a document is to find it using Netscape and then copy the URL into your HTML document. (Copy the text in the Document URL field near the top of the Netscape window.)
Putting Links in HTML documents
The HTML command for putting a link into a document takes this form:
<a href="URL">text of link</a>
You put the URL in the quotes following the "href=" and put the text of the link (the part that users will click or select to activate the link) after the > and before the </a>.
So, here's our document with a few links:
<html>
<head>
<title>This is my title</title>
</head>
<body>
<h1>This is my title</h1>
This is a sample paragraph. The majority of most documents contain
this type of construct. Here's a link embedded in the document right
<a href="assignment1.html">here</a>.<p>
<h2>This is a subheading</h2>
The quick brown fox <a href="assignment2.html">jumped</a> over the slow lazy dogs.
The quick brown fox jumped over the slow lazy dogs.<p>
...and so on..
(Of course, the words you choose to have highlighted for the link should indicate something about the nature of the link.)
Assignment #4: Set up your file here to include links to your other assignment files. View it and save it to your storage device!
However, that doesn't mean that a few colorful images are a bad thing on a web page. Images are also often necessary to make a point that can't be made using text only.
To add an image to your document, you need to convert it first into GIF format. There are a number of tools for doing that available on the WWW, so I won't get into that here, but if you can't find something or need help, see me.
To make things easier on yourself, put the images that you want to show in your document into the same directory as your document. It is possible to display a GIF image that is stored almost anywhere (even somewhere on the Internet), but let's not get into those kinds of complexities right now.
The HTML command for inserting an image at the current position takes the following form:
<IMG SRC="name_of_image.gif">
That's all it takes to insert a GIF file into your document. Notice that the location is given relative to the current document. The location does not have to be a full URL (but it can be if you want). This same trick can be used for normal links (not just for images). Some say it's faster to use relative URLs, but I've never really noticed a great difference in speed. However, writing this way will make moving your files over to a different server much easier.
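To make the difference between relative and full URLs concrete, here is a small sketch. The file names and the host www.example.com are made up for illustration. The first pair of commands only works when logo.gif and assignment2.html sit in the same directory as the current document. The second pair spells out the whole address:

```html
<!-- Relative: looked up starting from the current document's directory -->
<img src="logo.gif">
<a href="assignment2.html">My second assignment</a><p>

<!-- Full URLs: the same files addressed from anywhere (hypothetical host) -->
<img src="http://www.example.com/lab/logo.gif">
<a href="http://www.example.com/lab/assignment2.html">My second assignment</a><p>
```

If the whole directory moved to a different server, the relative versions would keep working as they are, while the full URLs would each need to be edited.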
Image Options
There is also one optional argument to the IMG command that you may
want to use occasionally.
You can "suggest" to the browser that the image be aligned in a particular way with the surrounding text using the "align=" directive. The choices are "top", "middle", or "bottom", which indicate where the base of the image should be in relation to the base line of the surrounding text.
Here's a couple of examples:
<img align="top" src="./image.gif"> Some Text.
<img align="middle" src="./image.gif"> Some Text.
<img align="bottom" src="./image.gif"> Some Text.
Here's how it would look in your browser:
[IMAGE] Some Text.
[IMAGE] Some Text.
[IMAGE] Some Text.
Another useful option is to suggest a text-only alternative for browsers that don't support in-line images. The "alt" directive is used like this:
<img alt="o" src="./image.gif"> Some Text.
<img alt="o" src="./image.gif"> Some Text.
<img alt="o" src="./image.gif"> Some Text.
For users on a text-only browser like Lynx or the W3 mode of Emacs, these items would appear as just "o" instead of something like [IMAGE]. Some web servers make very good use of this directive, displaying icons for image-oriented browsers and simple word links (like "[Home] [Next]") for text-only browsers.
Assignment #5: Set up a simple home page for your project now (if applicable). View it and save it to your storage device!
That said, many thanks to the providers of html guides across the WWW, and a special acknowledgement to Eamonn Sullivan, on whose work much of this tutorial is based.