Web Systems Lecture 4 - HTML

This lecture is divided into hyperlinked sections

Introduction
The World Wide Web
Hyperlinks
Compatibility Issues with HTML
Web Page Sections
Page Heading
Metatags
Usage of Metatags
Page Body
Images
Graphics Interchange Format (gif)
Joint Photographic Experts Group (jpg)
Portable Network Graphics (png)
Conclusion
Tutorial Questions


Introduction

This section of the course is concerned with HyperText Markup Language (HTML), which is used to create the documents found on the World Wide Web.

We will look at the format of a web page and the sections that it is divided into.

We will see what metatags within document headers are used for.

We will see how images are stored for display on web pages.


The World Wide Web

The World Wide Web is a service that is able to link the information that is held on many computers across the world. References within documents that are held as hyperlinks can refer to another document or a picture or a music file held on a totally different computer somewhere else in the world.

This makes the service easier to use and independent of location: the user at the client end does not need to know where any document that he or she is looking at is stored.


Hyperlinks

When a web page refers to another file, the words that describe the link are usually underlined and often displayed in a different colour. An example of a hyperlink shows this.

See the beginning of this document. If you are reading this online, when you pass the cursor over the hyperlinked area, the cursor changes to a pointing hand. This indicates to the user that the text is a hyperlink to another file or indeed a different place in the same file.

When the user clicks (or double-clicks, depending on user preferences) the hyperlink, the browser (or word processing program) displays the file that the designer of the web page specified in the link.

It is possible to make a hyperlink using an image; the user clicks on the image and the browser fetches the related file and displays it. 
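The kinds of link described above might be written as in the following sketch. The file names (lecture5.html, index.html, logo.gif) and the id "images" are invented for illustration and do not refer to real files.

```html
<!-- A text link to another file (hypothetical file name) -->
<a href="lecture5.html">Go to the next lecture</a>

<!-- A link to a different place in the same file; the target is marked with an id -->
<a href="#images">Jump to the Images section</a>
<h2 id="images">Images</h2>

<!-- An image used as a hyperlink (hypothetical image file) -->
<a href="index.html"><img src="logo.gif" alt="Home page" /></a>
```

In each case the href attribute holds the location of the target, and whatever sits between <a> and </a> (text or an image) is the area the user clicks. Older pages often mark a target within the page with <a name="images"> rather than an id.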


Compatibility Issues with HTML

Because users around the world will be equipped with different computers having different displays (EGA, VGA, SVGA etc.) with differing specifications, it may seem that designing a web page with text, images, headings and differing font sizes would be a daunting task.

Fortunately, these considerations were taken into account when HTML was originally designed. HTML describes a web page in a general manner, without reference to specific sizes. Heading 1 is designed to be larger than Heading 2, but the end user defines how large the fonts will appear on his or her display. As a result, Heading 1 is drawn larger than Heading 2 regardless of the default size that the user has chosen.

Thus HTML gives the browser guidance on how to display the page, but it does not fix the exact layout. This means that a line of text in a web page displayed on one monitor may hold more characters than the same line displayed on a different monitor.
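As a small illustration, the fragment below marks up two levels of heading and a paragraph without stating any sizes. The wording of the headings and paragraph is invented for the example.

```html
<!-- No font sizes are given: the browser chooses them from the user's settings -->
<h1>Web Systems</h1>
<h2>Lecture 4</h2>
<p>This paragraph is wrapped to fit whatever width the user's browser window has.</p>
```

However the user has configured the browser, the h1 text should appear more prominent than the h2 text, and the paragraph will break at different points on different displays.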

The browser is responsible for interpreting the commands (tags) used within the HTML and carries out the job of rendering the page components locally on the client machine. This means that if a user has a monochrome display then the page will have colours suppressed and shades of grey will be used to make text that is highlighted stand out from the rest of the page.

Using the hardware available, the browser creates the best possible representation of the author’s intended page.


Web Page Sections

All HTML documents consist of two sections, the heading and the body.

The heading exists to help identify the page and to supply details about it to the browser, to search engines and to other software that handles the page.

The body contains the text and other information that the author wants to present to the user.


Page Heading

The very first line of a WWW document should contain a document type definition of the web page – a doctype.

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">

This describes to the browser the language version that the page was written in and allows it to choose the best way of rendering the page.

Next comes the heading of the page (the head element), which contains a series of metatags.

Finally, the heading contains the title of the page, which is displayed in the title bar at the top of the browser window. This is different from the filename. The file may be index.html, but the title could be ‘Fred’s Interesting Life Pages’.
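Putting these pieces together, a minimal page saved as index.html might look like the sketch below. It reuses the doctype shown above. The body text is placeholder content, and the xmlns attribute on the html element is something the XHTML 1.0 specification expects rather than something covered in this lecture.

```html
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
  <!-- Metatags go here -->
  <title>Fred's Interesting Life Pages</title>
</head>
<body>
  <!-- Placeholder content; everything the user sees goes in the body -->
  <p>Welcome to my pages.</p>
</body>
</html>
```

The title appears in the browser's title bar, not in the page itself; only the content of the body is drawn in the browser window.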


Metatags

Metatags fall into three classes of use.

The first is to store information about the document, information that will be used by your server to track or maintain the document. This is most commonly done in an intranet environment. For example, you might have a metatag that identifies the author of the page and another metatag that identifies the expiration date of the content of the page.

A second use is to make something happen to the page automatically. The two most common applications are to automatically refresh a screen with another page or to play a sound file automatically.

A third use is to help search spiders index the site. Some of the major search engines look for metatags when they index a site and use the metatag data to better identify and categorise the site. Using metatags can help the placement of your site within these search directories, but it does not guarantee that your site will be indexed "correctly".

Metatags are not strictly required for a site to be indexed by a search engine. The automatic crawlers or search spiders will index a site whether or not its pages have metatags. The leading search/directory companies (Google, Yahoo, Excite) pay no attention to metatags when they compile their lists. However, putting certain terms in the metatags may better define how your site is indexed by AltaVista, InfoSeek and HotBot.


Usage of Metatags

The general form of a metatag is shown below:

<meta http-equiv="server command" content="data">

If you are using the metatag to store server-specific information (like the author of the page) or to request some action (like opening a new page or playing a sound), the tag will have two attributes, http-equiv and content. In practice, descriptive items such as the author are more often written with a name attribute in place of http-equiv.


http-equiv

The first attribute is http-equiv. It names the metatag and tells the browser to treat the tag as the equivalent of an HTTP header of that name sent by the server with the page.

If you are using the metatag to refresh the page, this attribute will have the value "refresh". Most web browsers recognise this value in a metatag and act appropriately.

<meta http-equiv="refresh"> 


content

The second attribute is content. The value of this attribute further defines the value in "http-equiv." For example, if you are playing a sound or loading a new page, the content attribute contains the URL of the page to load.

This metatag tells the browser to refresh the contents of its window. The refresh will happen after 30 seconds and the new page is named "index2.html":

<meta http-equiv="refresh" content="30; url=index2.html">
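For comparison, the other two uses of metatags (document information and help for search spiders) might look like the sketch below. The author name, date and keyword values are invented for illustration. The names "author", "description" and "keywords" are widely used conventions, and whether a given server or search engine acts on them is an assumption that varies from one to another. Note that the descriptive tags use a name attribute where the refresh example uses http-equiv.

```html
<!-- Document information: who wrote the page (hypothetical author) -->
<meta name="author" content="A. N. Author" />

<!-- Expiry date of the content, written as an HTTP-style date -->
<meta http-equiv="expires" content="Tue, 01 Jan 2002 00:00:00 GMT" />

<!-- Information for search spiders -->
<meta name="description" content="Lecture notes introducing HTML" />
<meta name="keywords" content="HTML, hyperlinks, metatags, images" />
```

All of these belong inside the heading of the page, alongside the title.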


Page Body

This is the section of the web page that will be viewed by the people who access your page. The majority of people will not realise that the page has a header or know what it is for. They will be interested in the function of the page and what it has to offer them in the form of text, images and sound.


Images

Images can be displayed within web pages and the types that may be displayed vary greatly. A web browser is able to display images in many different formats.
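An image is placed on a page with the img element, as in the sketch below. The file name, description and dimensions are invented for illustration.

```html
<!-- Hypothetical image file; alt gives text for browsers that cannot show the picture -->
<img src="photo.jpg" alt="The author at his desk" width="320" height="240" />
```

The src attribute tells the browser where to fetch the image file from, and the optional width and height (in pixels) let it reserve space on the page before the image has finished loading.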

The simplest image format is the bitmap. This is a file that contains details of the colour of every single pixel in the image. It is the type created by Paint, the program included with the Windows operating system. It has one major disadvantage: it tends to be an extremely wasteful method of storing image information. An image may contain a lot of white space with the subject in the centre, yet each and every white pixel is described in the bitmap even though it carries none of the picture.

A webpage may contain a reference to a bitmap, but this wastes the storage space on the web server and also means that the page will take longer to load the picture than necessary.

A solution to this wastage of space is to encode the picture so that the redundant areas of the picture are described more concisely. There are three widely used formats in which images are stored for use on the WWW, although more exist.


Graphics Interchange Format (gif)

The Graphics Interchange Format is a lossless 8-bit (256-colour) protocol for "on-line transmission and interchange of raster graphic data in a way that is independent of the hardware used in their creation or display". This format was developed by CompuServe.

This format is good for images that contain large areas of exactly the same colour, such as company logos and cartoon-type images.


Joint Photographic Experts Group (jpg)

JPEG is a lossy compression scheme. The greater the compression, the greater the degree of information loss. This can be user determined, to optimise the trade-off between resultant image size and image quality. The algorithm exploits some of the ways in which the human eye perceives and analyses images, so that compressed images still appear to be of high quality when looked at by human eyes.

As the name suggests, this image format was developed especially for true-colour photographs which, unlike cartoon-type images, have varying colours, shades and hues. Because the compression is lossy, the original file cannot be recreated from the jpg. The compression ratio can be as much as 50:1 or more, but image quality suffers.


Portable Network Graphics (png)

The Portable Network Graphics (png) format was designed to replace the older and simpler GIF format and, to some extent, the much more complex TIFF format. PNG has three main advantages over GIF: alpha channels (variable transparency), gamma correction (cross-platform control of image brightness), and two-dimensional interlacing (a method of progressive display). PNG also compresses better than GIF in almost every case, but the difference is generally only around 5% to 25%.
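The guidance in the three sections above could be applied as in the following sketch, which picks a format according to the content of each picture. All file names are hypothetical, and the choices shown are rules of thumb rather than fixed requirements.

```html
<!-- Flat areas of identical colour: GIF (hypothetical file names throughout) -->
<img src="company-logo.gif" alt="Company logo" />

<!-- True-colour photograph: JPEG -->
<img src="staff-photo.jpg" alt="Staff photograph" />

<!-- Needs variable transparency over the page background: PNG -->
<img src="badge.png" alt="Award badge" />
```

The markup is the same in every case; only the file that src points to changes, so an image can be re-saved in a different format without altering the rest of the page.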


Conclusion

The World Wide Web provides links to files that may be held on different web servers around the world. To access these documents the user does not need technical knowledge of the computers on which they are stored; clicking a hyperlink is enough.

Hyperlinks let authors create documents from which the user can easily reach other files. The hyperlink informs your browser of the location of the file and your browser can then make a request for the file.

HTML describes to browsers in a general fashion how the page is to be displayed. The browser makes its decision based upon the hardware that is available for display.

Web pages consist of two sections, heading and body. The heading contains information intended for the server and the client to act upon. The body contains the information that will be displayed on the client’s computer display.

Within the heading are metatags that specify attributes of the page and allow browsers to deal with the page appropriately.

Images that are displayed on web pages should be compressed to save on server storage and bandwidth so that they load more quickly on the client machine.


Tutorial Questions

Why should a web document require a heading that describes the DTD to the browser?

When designing a web page, how would you ensure that the page looks as you intended on different displays?

What criteria would you use when deciding on the method of compression for an image on your web page?

What compression algorithms exist for compressing music that is stored in wav files?


 



(c) MM Clements 2001