The Internet is a loose amalgam of thousands of computer networks reaching millions of people all over the world. Although its original purpose was to provide researchers with access to expensive hardware resources, the Internet has demonstrated such speed and effectiveness as a communications medium that it has transcended the original mission. It has, in recent years, grown so large and powerful that it is now an information and communication tool you cannot afford to ignore. Today the Internet is being used by all sorts of people and organizations (newspapers, publishers, TV stations, celebrities, teachers, librarians, hobbyists, and business people) for a variety of purposes, from communicating with one another to accessing valuable services and resources. You can hardly pick up a newspaper or magazine without reading about how the Internet is playing a part in someone's life or project or discovery. To appreciate what the Internet has to offer you, imagine discovering a whole system of highways and high-speed connectors that cut hours off your commuting time. Or a library you can use any time of the night or day, with acres of books and resources, and unlimited browsing.

WHENCE IT CAME

The Internet universe was created by an unassuming bang in 1969 with the birth of ARPANET, an experimental project of the U.S. It had a humble mission: to explore experimental networking technologies that would link researchers with remote resources such as large computer systems and databases. The success of ARPANET helped cultivate numerous other networking initiatives, which grew up intertwined; 25 years later, these have evolved into an ever-expanding, complex organism comprising tens of millions of people and tens of thousands of networks. Most users describe the Internet, or the Net, as a "network of networks"; it appears to stretch forever.
In this cyber-sphere, people in geographically distant lands communicate across time zones without ever seeing each other, and information is available 24 hours a day from thousands of places. The Internet is inhabited by millions of regular folks, non-techies who use it daily to communicate and search for information. When this book was first written in the spring of 1992, the Internet population was mostly researchers and academics, and there weren't many applications and interest groups of relevance to the general public. Two years later, mainstream services dominate use of the Internet.

IT KEEPS GOING AND GOING...

It's important to understand the significance of the Internet's growth and popularity. Similarly, stand-alone computers are useful, but their potential is limited by isolated applications (word processors and spreadsheets, for example) and the amount of money you have to spend on disk drives and CD-ROMs. A mere direct full-time or dial-up connection to the worldwide Internet gives you access to more info-goods, services, and people than you'll ever find on your own isolated computer or local-area network. The Internet is already the largest computer network in the world and, in terms of connected networks, people, and resources, it's getting larger, and therefore more valuable, literally by the minute.

How large is the Internet? According to the Internet Society (ISOC), a professional organization of Internet developers, influencers, and users, as of spring 1994, the Internet reached 69 countries directly and 146 via email gateways, and consisted of 23,659 networks and 2.217 million computers. The bulk of Internet computers and networks still belongs to the research and education communities. There's definitely a rising trend in commercial activity and connectivity; many businesses have realized that they can link their enterprise networks to the Internet and gain instant access to their customers.
Some market research indicates that online services in general make up almost a billion-dollar industry, with an estimated 25 percent per year growth, so it stands to reason that providers of these services are migrating to the Internet, where the action is. The types of resources accessible via the Internet are growing at an astounding rate. Examples of some Internet resources are a database of regularly updated weather information in Michigan, an online magazine, a cartoon, and an archive of daily newspaper articles. A resource can also be a mailing list or a newsgroup that brings together people from all over the world to discuss shared interests such as soccer, cooking, and poetry. Suffice it to say that there are literally tens of thousands of servers, archive sites, mailing lists, newsgroups, and databases available on the Internet.

The Success of the Internet

It's hard to imagine how the Internet has grown so fast and been so successful without some ambitious organization or individual managing the project. Yet no one has a monopoly on access to or use of the Internet; there's no monolithic empire called Internet, Inc., controlling accounts and application development or roping off the backstage parts of cyberspace. Although you may not think about it often, standards play a big part in your everyday life. Libraries catalog books according to a standard system, so that once you learn it, you can walk into any library and find the books you need. Cooperation is a major ingredient in interoperability. The Internet's nervous system does not have a central brain, such as a powerful supercomputer that controls its operation by feeding it commands and directing its limbs to perform key functions. The technology that makes it happen is known as internetworking; it creates a universality among disparate systems, enabling the networks and computers to communicate.
Fundamentally, the Internet revolves around the concept of a packet, a basic building block or a digital brick. The packets are then individually routed from network to network until they reach their destination, where they are reassembled and presented to the user or computer process. This method of networking is very flexible and robust. If a network goes down (meaning it isn't available to transfer information), the packets can be rerouted to other networks in many cases. While most neophytes probably don't care about these standards and technical details, an understanding of the underlying infrastructure will help in learning to use the Internet properly and in taking full advantage of its powerful capabilities. It goes deeper than that, though; understanding from the bottom up how separate computers and networks fit together will give you an appreciation for the net culture: the sharing, cooperative spirit that is inherent in the Internet. Chapter 2 further defines these concepts of interoperability and open standards, as well as explaining how the protocols and networks come together to make the Internet work.

THE EQUALIZER

You can see how open standards enable businesses and individuals to compete on a level playing field in developing networking software and products. Once you, an Internet user, are jacked in, you have access to the same resources as the rest of the millions of Internet users, whether you're located in Sydney or Stockholm. The phrase "democratization of communication" often comes up in discussions about the Internet, which is, indeed, a truly democratic forum. The network doesn't care if you're president of a Fortune 500 company or a warehouse clerk, a potato farmer or a molecular biologist. Your tidings and opinions are handled the same way, and it's the worth and wit of what you have to say that determines who's willing to listen, not your title. It's also never been so easy to be both a consumer and a producer of services.
If you're ambitious enough and aspire to be an electronic entrepreneur who provides commercial services or Internet access, there's nothing to prevent you: no long lines, no paperwork, and no regulations. Once your network is directly hooked into the Internet, all the computers on that network are accessible from every other Internet-connected computer. This environment empowers the individual; it encourages and stimulates participation, imagination, and innovation. There are numerous stories of how just one or two people have leveraged the Net to do great things, whether it's to publish a newsletter, make a name, or develop contacts. If you don't have access to a whiz-bang, high-speed Internet connection or to a large multi-user computer, that's not a problem. Your virtual storefront may be thousands of miles and two countries away, but it's probably a few seconds' hyperdrive from every location.

Ask an Internet wizard what this network is all about, and you'll probably get a long and dusty discourse studded with acronyms and techspeak. It's friendly if you approach it right, but potentially huge and terrifying, especially to people who don't know its special ways. Fortunately for you, the most important principle of all is that you don't have to fully understand how the Internet works to use it. Plenty of blissfully unaware Internet users are pounding away at keyboards and communicating merrily, with absolutely no knowledge of how the Internet fits together.

A NETWORK OF NETWORKS

The Internet is a worldwide web of interconnected university, business, military, and science networks. The Internet is made up of little Local Area Networks (LANs), citywide Metropolitan Area Networks (MANs), and huge Wide Area Networks (WANs) that connect computers for organizations all over the world. These networks are hooked together with everything from regular dial-up phone lines to high-speed dedicated leased lines, satellites, microwave links, and fiber optic links.
This network web extends all over the world, but trying to describe all of it and how it fits together is a bit like trying to count the stars. Some network maps show the Internet as a cloud, because it's just too complex to draw in all of the links.

What the Experts Are Saying

"It's a biological phenomenon." (John Perry Barlow, National Net '93)

"I'm starting to think of the Internet as a kaleidoscope." (Jean Armour Polly, Manager of Network Development and User Training, NYSERNet, Inc.)

So think of the Internet as a cloud of links. The cloud hides all the ugly details: the hardware, the physical links, the acronyms, and the network engineers. Overall, the Internet is the fastest global network around. .544Mbps for larger organizations. Gigabit-per-second network speeds currently being tested will allow even more advanced applications and services, such as complex weather prediction models produced by supercomputers and transmitted to weather centers. Or transmitting extremely large (tens or hundreds of megabytes) databases, for example, earthquake data transferred from a collection site to the Institute of Geophysics and Planetary Physics for analysis.

IN THE BEGINNING

The Internet was not born full-blown in its present worldwide form of thousands of networks and connections. The ARPANET, described in Chapter 1, initially linked researchers with remote computer centers, allowing them to share hardware and software resources, such as computer disk space, databases, and computers. The original ARPANET itself split into two networks in the early 1980s, the ARPANET and Milnet (an unclassified military network), but connections made between the networks allowed communication to continue. At first this interconnection of experimental and production networks was called the DARPA Internet, but later the name was shortened to just the Internet. Access to the ARPANET in the early years was limited to the military, defense contractors, and universities doing defense research.
Cooperative, decentralized networks such as UUCP, a worldwide Unix communications network, and USENET (User's Network) came into being in the late 1970s, initially serving the university community and, later, commercial organizations. In the early 1980s, more-coordinated networks, such as the Computer Science Network (CSNET) and BITNET, began providing nationwide networking to the academic and research communities. These networks were not part of the Internet, but later special connections were made to allow the exchange of information between the various communities. The next big moment in Internet history was the birth in 1986 of the National Science Foundation Network (NSFNET), which linked researchers across the country with five supercomputer centers. Soon expanded to include the mid-level and statewide academic networks that connected universities and research consortiums, the NSFNET began to replace the ARPANET for research networking. CSNET soon found that many of its early members (computer science departments) were connected via the NSFNET, so it ceased to exist in 1991.

HOW COMPUTERS TALK

The computers on a network have to be able to talk to one another. There are lots of protocol standards out there, such as DECnet, SNA, IPX, and AppleTalk, but to actually communicate, two computers have to be using the same protocol at the same time. Developed by DARPA in the 1970s, TCP/IP was part of an experiment in internetworking, that is, connecting different types of networks and computer systems. First used ubiquitously on the ARPANET in 1983, it was also implemented and made available at no cost for computers running the Berkeley Software Distribution (BSD) of the Unix operating system. TCP/IP, developed with public funds, is considered an open, non-proprietary protocol, and there are now implementations of it for almost every type of computer on the planet.
Non-proprietary means that no one company (not IBM, not DEC, not Novell) has exclusive rights to the products needed to connect to the Internet. TCP/IP isn't the only protocol suite that is considered open. Since the early 1980s, the International Organization for Standardization (ISO) has been developing the Open Systems Interconnection (OSI) protocols. While many of the OSI protocols and applications are still evolving, a few are actually being used in some networks on the Internet, and more are planned. The whole idea of protocols and standards can get complicated, but as an Internet neophyte, all you need to be concerned with are the applications that TCP/IP offers. The difference between applications and protocols is that you don't actually see the protocols (they're invisible to the end user), but you will access the Internet using the applications that conform to these standards.

The Internet Toolbox

Three TCP/IP applications (electronic mail, remote login, and file transfer) are the Internet equivalent of the hammer, screwdriver, and crescent wrench in your toolbox. There are plenty of fancier applications using variations on or combinations of these basic tools, but wherever you roam on the Internet, you should have the Big Three available to you. The three basic Internet services, as well as the more powerful and colorful applications, are covered in later chapters, but here's a quick introduction to get you on your way.

Electronic mail, also known as email or messaging, is the most commonly available and most frequently used service on the Internet. For example, a third-grade student in Texas can send an email message to a third-grader in Japan to ask how kids spend their free time there.

Remote login is an interactive tool that allows you to access the programs and applications available on another computer.
For example, say Sven, a student at the University of Oslo, is heading out to a ski vacation in the Rocky Mountains and wants to check the weather conditions and snowfall there. An Internet computer at the University of Michigan houses a weather database called the Weather Underground, with temperatures, precipitation data, and even earthquake alerts for the entire United States. Sven uses the remote login tool to connect to this computer and interactively query the Weather Underground for the information he needs.

File transfer, the third of the Big Three tools, allows files to be transferred from one computer to another. For example, you may be interested in information on Chernobyl from the Library of Congress's Glasnost online exhibit of documents from the former Soviet Union. Using file transfer, you can download those articles from the computer where they're stored onto your own personal computer, where you can read them, print them out, or clip and incorporate parts of them into a paper you're writing.

There are quite a few applications available today that use a combination or variation of these three tools to hide details even further. These operate on a client/server model; that is, you use the client on your computer, and it contacts servers for directions and information. Clients and servers don't have to be located in the same geographical area, and in many cases on the Internet, they aren't. This technology is very flexible; during one session, your client may access servers all over the world to help you find information. As the Internet grows larger, locating the information you need will become difficult unless you're using information discovery and retrieval tools. The major resource-browsing applications, which operate on the client/server concept, include archie, Gopher, WorldWideWeb (WWW), Wide Area Information Servers (WAIS), and Mosaic.

How Does TCP/IP Work?
When you're actually using the above-mentioned tools, information of various types is being transferred from one computer to another. Each packet contains a piece of the information or document (several hundred characters, or bytes), plus some ID tags, such as the addresses of the sending and receiving computers. Suppose that you wanted to take apart an old covered bridge in New England and move it lock, stock, and barrel to California (people do do these things). You would dismantle the sections, label them very carefully, and ship them out on three, four, maybe even five different trucks. The trucks get to California at various times, with one arriving a little later than the others, but your careful labels indicate which sections go up first, second, and third. Each packet, as TCP/IP handles it with its addressing information, can travel just as independently. Just as you might drive a different route to work to save a few minutes here or there, the packets may travel different networks to get to the destination computer. The packets may arrive out of order, but that's okay, because each packet also contains sequence information about where the data it's carrying goes in the document, and the receiving computer can reconstruct the whole enchilada. The switches are computers called routers, which are programmed to figure out the best packet routes, just as a travel agent might help you find the best flights with the fewest layovers. So it can travel over a fighter-jet network running at Mach-whatever speeds and connecting supercomputers that interconnects with a biplane network operating a lot slower.

The Networks That Make Up the Internet

The Internet network connections don't follow any specific model, but there is a hierarchy of sorts. Mid-level networks, in turn, take traffic from the backbones and distribute it to member networks, the neighborhood roads of the networking world.
For example, the Texas Higher Education Network (THENet) is a mid-level network, connecting over 100 universities and research facilities in Texas. Each of the network links has speed limitations, but speeds are determined by the technology used (not by some packet policeman). .544Mbps or 56Kbps. Local-area networks are much faster. Local-area network pipes are usually pretty large, and therefore more water (or data) can be blasted through them than can be pumped (transmitted) during the same amount of time through a wide-area network pipe.

Seamless Worldwide Networking

Once all the pipes (networks) are in place, the Internet, which is actually tens of thousands of networks, looks seamless to the user. By means of internetworking (that is, by connecting networks together to enable communication and information exchange), all the details are hidden from you: the packets, the routers, and all those interconnections. Despite legions of different computers and disparate networks, somehow the whole web works, and any computer directly connected to the Internet can talk to all the other computers on the Internet. So you, working on a computer in your office in Israel or in your spare bedroom in Los Angeles, can communicate with a colleague in South Africa or a friend in Calgary.

A network neophyte, faced with a cryptic computer prompt, may find it hard to picture the Internet as a friendly, peopled place. Through email and the other methods of online communication, people have become best friends without ever seeing or talking to each other. It is not uncommon for people to turn to the Net for answers; a question posted to online communities (mailing lists and conferences) can yield dozens of invaluable tales of experiences and testimonials within hours. Online communication, perhaps the ultimate in democratic exchange of information, eliminates barriers. On the Internet, people can communicate asynchronously and in real time.
Asynchronous (Greek for "not at the same time") communication means that someone can type in a message and send it off, but the recipient doesn't have to be around to receive it. You can send messages whenever you want to, they reach their destination quickly, and the recipients can read and respond when they want to. Real-time, interactive communication (such as the Internet Relay Chat facility described later in this chapter), in contrast, means that as someone is talking (that is, typing), you see it on your screen as it is typed.

ALL (OR ALMOST ALL) ABOUT ELECTRONIC MAIL

Electronic mail is the most popular application on the Internet today. It's hard to imagine any other form of communication that can be so intimate and yet so wide-reaching, so focused, or so expansive.

Email is sometimes compared to fax, but there are some fundamental differences. Electronic mail on the Internet is, for the most part, text that can be sent over a variety of network links (everything from dial-up to fiber-optic lines). It usually costs the same to send email to one person as it does to send it to a group of people, while it would cost more in time and maybe paper to send a fax to those same people, especially if they're a long-distance call away. Both are asynchronous forms of communication, eliminating telephone tag; that is, it's not required for the recipient to be present to receive either electronic mail or a fax. Interestingly enough, there are some projects on the Internet that combine the capabilities of both fax and email, and while interest is growing, the ubiquitous ability to fax over the Internet is not available just yet. For example, a librarian using this experimental system in Canberra, Australia, could send a fax from his Internet-connected workstation to a remote printer/fax machine in Riverside, California.

Historically, Internet email has been text-based without some of the frills that many local-area, network-based email systems have.
Text-based means that the message is only words (just like what you're reading right now) and can't include graphics, forms, and so on. Internet email is starting to branch out with some implementations, including the ability to query distributed directory databases (an online directory service for people's email addresses), encode and decode messages for privacy purposes (see the Security Issues section in Chapter 5), and send formats other than just text, such as graphic images, sounds, and different character sets (Asian language text, for example), using Multipurpose Internet Mail Extensions, commonly referred to as MIME. The reason you need something like MIME is that the current Internet email system cannot transfer a non-text file such as a picture without doing something special to it; there are funny characters in these files that can mess up their transfer. If you have a need to transfer non-text documents via email, be sure to inquire whether or not your provider's email application offers MIME support.

There are other ways of sending software and graphics within a message if you don't have MIME support. Many email applications (some are mentioned below) support automatic conversion of non-text files into ASCII format; that way, those funny codes in the binary file are converted to something the mail system can handle (plain text) without burping. One such program that's sensitive to the Internet's email digestive system and can convert binary files to text is called BinHex, and it's available for both Macs and PCs. To you, BinHex files will look like a bunch of nonsense random characters on the screen; they begin with the line, "This file must be converted with BinHex . . ." If you receive one of these in your email or through other means (and it's not automatically converted for you), you'll need the BinHex utility to transform the file to its original format, a binary file.
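To make the binary-to-text idea concrete, here is a minimal sketch in Python, offered as an illustration rather than as how any particular mail program works. It uses base64, the text encoding most commonly used with MIME, and not the BinHex format itself (BinHex uses a different scheme). The byte values in `picture` and the file name `picture.bin` are made up for the example, and the addresses are borrowed from this chapter's made-up Letterman message.

```python
import base64
from email import message_from_bytes, policy
from email.message import EmailMessage

# Hypothetical binary payload: bytes that are not printable text.
picture = bytes([0x00, 0xFF, 0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A])

# Base64 turns every 3 bytes into 4 printable ASCII characters,
# so the result can pass through a text-only mail system.
encoded = base64.b64encode(picture)
decoded = base64.b64decode(encoded)

# Build a MIME message: a plain-text body plus a binary attachment.
msg = EmailMessage()
msg["From"] = "letterman@sullivan-theater.cbs.com"
msg["To"] = "melman@sullivan-theater.cbs.com"
msg["Subject"] = "Tonight's Show"
msg.set_content("The picture is attached.")
msg.add_attachment(
    picture,
    maintype="application",
    subtype="octet-stream",
    filename="picture.bin",  # hypothetical file name
)

# What travels over the network is plain text only.
wire = msg.as_bytes()

# The receiving side parses the text and recovers the original bytes.
received = message_from_bytes(wire, policy=policy.default)
attachment = next(received.iter_attachments())
restored = attachment.get_content()
```

The point of the sketch is the round trip: the bytes in `restored` should match `picture` exactly, even though everything in `wire` is ordinary ASCII text. A MIME-aware mail program performs both steps for you, which is why it is worth asking your provider about MIME support.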
Normal Heroes Always Make a Detour

In 1990, after 15 years as editor, journalist, translator, and head of the Moscow News Computer Department, Anatoly Voronov started exchanging email with Dave Caulkins, an American setting up GlasNet in Russia. Their offices were three blocks apart, but their messages went through the Moscow Teleport host in San Francisco, which had a connection to the Internet. Voronov ascribed the roundabout routing to the famous principle expounded in the Russian movie classic Aybolit-66, "Normal'niye geroi vsegda idut v obkhod" ("Normal heroes always make a detour"). GlasNet became fully operational in 1991, with Voronov on staff. I remember a posting from a Chinese student in America, a participant in the Tiananmen Square events in Beijing, offering to share his personal experiences of how to beat tanks in the heart of the city. People wondered why the KGB didn't cut our connection. And we had a trick: the UUCP connection was originated in San Francisco, because at that time a non-authorized person or organization could not call abroad from Moscow.

Sending Email

Email is really fast; it is sent and received in seconds, minutes at the most. All you need is access to the Internet, an email program, and the email address of the person with whom you wish to communicate.

Access to the Internet. Chapter 2 discussed the differences between being directly connected to the Internet and being on an outernet network such as UUCP or BITNET, or a commercial service like America Online or CompuServe.

Email Programs. A common characteristic of email programs is that they let you compose and send email, and then read and organize the email you receive.

Post Office Email Programs

If you're accessing the Internet using a PC or Macintosh, there are several different ways you can read and send email. One of the more popular applications uses the Post Office Protocol (POP).
In a nutshell, the POP system allows your personal workstation to get its email from a big computer that serves as a post office, delivering the mail when you or your computer ask for it. In order to use a POP-based email application, you need Internet access via dial-up or full-time connectivity and a POP mail account on a post office computer (ask your Internet provider).

Email Addresses. Internet email addresses are, in fact, very simple. The host part of the address should be recognizable to you: a series of words separated by dots, as discussed in the domain name section of Chapter 2.

    username@hostname

Suppose that you know that Dave's computer name is sullivan-theater.cbs.com.

Sending It Off. Each email program is different, so if you're not familiar with yours, you may have to fumble around a bit or actually read the manual or online documentation. You will need to specify that you want to send a message, either by typing send, clicking a send button, or by performing some other wonderful computer incantation. The email program will prompt you for information, asking for the recipient's email address, the key piece of information the program needs to send the message to the recipient. A good subject description makes the person to whom you're sending aware of the nature of your message, whether it's important or whimsical. If there's someone else you think would be interested in the message, here's a chance to include his or her address. If you have the disk space, it's a good idea to send a copy to yourself so you'll have a record of your outgoing messages. There may come a day when you'll need to know exactly what you said to someone! After you've answered all the email program prompts, you can compose your message, using your email program's editor, which may or may not be similar to the word processor with which you're familiar.
It's important to make your message easy to read and understand; some hints for effective communication are discussed in detail in the Netiquette section of this chapter.

Trading Places: New Dimensions to Interlibrary Loans

Paula Garrett of Batavia, Illinois, and Katie Wilson of Sydney, Australia, in an effort to see how the other half lives, traded homes and jobs for six months. The venture was a complete success because of the Internet and the Australian Academic and Research Network (AARNet), according to Katie. "We found it made a huge difference to be able to keep up with our jobs and keep things flowing smoothly. Six months was not a very long time in which to learn the jobs, and they are senior with a lot of responsibility, so the constant email communication helped us hold it all together!" There's no question that the Internet has helped end the cloistered image of librarians.

Anatomy of an Email Message

An email message has two basic parts, the header information and the body of the message. These normally don't concern you, but they are necessary for the email programs and for debugging purposes.

    From letterman@sullivan-theater.cbs.com Fri Feb 4 11:51:36 EST 1994
    Received: by sullivan-theater.cbs.com id AA06414 5.65 IDA-1.3.5.for melman; Fri Feb 4 94 11:51:35 -0500
    From: David Letterman <letterman@sullivan-theater.cbs.com>
    Message-Id: <9402041651.AA06414@sullivan-theater.cbs.com>
    Subject: Tonight's Show
    To: melman@sullivan-theater.cbs.com
    Cc: letterman@sullivan-theater.cbs.com
    Date: Fri, 4 Feb 94 11:51:34 EST
    Status: OR

    Larry "Bud",

    For tonight's show, we'd like you to stand on your head and sing the theme song to The Jetsons.

    Please try to be a good sport and don't scare the children.

    Thanks,
    Dave

In this made-up example, David Letterman has sent email to Larry Bud, asking him for a favor.

Receiving and Keeping Up with the Mail

Receiving email requires less effort than sending it.
When you fire up your email program, it fetches your mail from an online mailbox (if there's anything in it), and then usually displays a one-line summary for each message in there. This summary will include information such as the message number, the date the message was sent, the sender, and the subject. You can select which message you want to read by typing the corresponding number, or by selecting it with your mouse.

Here's an example of a message summary line:

    1 Feb 4 David Letterman (20) Tonight's Show

This is message number 1 in Larry Bud's email box. In this example, the number in parentheses indicates the number of lines in the message (20), but it could refer to the number of characters too.

If you think you can't keep up with the junk mail that flows into your snail-mailbox each day, then just wait until you collect dozens of keypals and you're busily exchanging messages every day. Most everyone loves to get email; it will probably give you a tiny thrill to see the message "You have new mail" when you check your electronic mailbox. But because it's so easy to send and receive email, you may find that you can't keep up with all the messages you receive! You should set up a good routine for sorting your mail, deleting trivial messages, and filing the rest by saving them in separate electronic folders sorted by people or topics. If you don't keep up with your email efficiently, your messages will stack up in the inbox as they proliferate, and your email program may slow to a crawl. Your email program may allow you to sort incoming messages by date, sender, subject, size, or in other ways, and these functions can help you dispose of messages quickly.

Get ready to switch gears on the infobahn! Almost anything you can think of is there for the taking--graphics, software, books, library catalogs, bulletin boards, data, sounds, movies, journals, newsletters, newspapers, and magazines.
There are many thousands of independent databases, archives, and online services available via the Internet, making it one huge virtual library. Unfortunately, this electronic library is not as well organized as a real library. However, graphical interfaces and user-friendly tools have entered the scene, and can help you chart a course through what at first may appear to be a vast and unnavigable info-jungle. Of course, not everything is online yet, but the amount and diversity of information available online is increasing so rapidly that today you can find quite a bit of what you are looking for.

The first edition of this book reported an impressive number of free public offerings, most of which were found in the academic and research domains. ... provides online newspaper and professional articles, the Official Airline Guide, financial services, and pharmaceutical directories--all accessible to subscribers. The Lexis (for legal research) and Nexis (for business, financial, and general news) databases from Mead Data Central are also accessible.

"Libraries are the last democratic educational institution." --Gloria Steinem, speech at the American Library Association in July 1992.

There are too many useful information resources to list. This chapter will help you take advantage of the Internet by engaging it as an external brain, a vast storehouse of information resources. Already there are applications that utilize distributed hypertext; they link related resources together, allowing users to travel a never-ending web of information.

What are the implications of having such widespread, ready access to timely information? The world is on its way to becoming everyone's information oyster and, therefore, the ways we learn and do business will probably change.
People who will succeed in tomorrow's world will be those who can learn, discern, and deal with issues rapidly and intelligently using information tools.

Everything You Know About Intellectual Property Is Wrong

"Thus, the rights of invention and authorship adhered to activities in the physical world. In other words, the bottle was protected, not the wine. Now, as information enters cyberspace, the native home of Mind, these bottles are vanishing. With the advent of digitization, it is now possible to replace all previous information storage forms with one metabottle: complex and highly liquid patterns of ones and zeros."

Source: John Perry Barlow, "The Economy of Ideas: A Framework for Rethinking Patents and Copyrights in the Digital Age (Everything you know about intellectual property is wrong)," Wired magazine, March 1994.

There are some things to keep in mind while accessing information over the Internet. As for the validity and accuracy of documents, keep in mind the plausible situation in which a document has been archived, downloaded, annotated, edited, and saved by a friend before being emailed to you.

The following pages will walk you through the most basic Internet information access and retrieval tools: remote login and file transfer. It's useful to know about these applications and how they work, but with the proliferation of graphical and menued front ends, you may not need to pull them out of your info-toolbox. The latter half of this chapter explains how to get started with information discovery and retrieval applications such as archie, Gopher, WAIS, WWW, and Mosaic.

CAN YOU GET THERE FROM HERE?

Reading this chapter may tantalize and frustrate those who have only limited access to the Internet. If you really need access to a particular resource, your system gurus or provider may be able to offer you another path.

USING ONLINE RESOURCES AND SERVICES

There are several classes of info-tools described in this chapter.
Also--and this is part of the Internet standard disclaimer--the tools may operate differently on your system, so be sure to read local documentation and any instructions shown on the screen. In some cases, you have to type the commands; in others, you may use a straightforward menu system; in others, you may be clicking icons. The examples used in this chapter will, for the most part, be from a command-level perspective, showing the commands (most of them in lowercase) as you would type them on many computers.

The first info-tool class includes the very basic, low-level devices you can use to access just about anything. These are interfaces, applications that present the Net as a graphical environment, using icons, which when selected will call up appropriate tools and select the right resources.

Keep in mind that the services explained in this chapter will most likely not be located on your own computer. You're not transmitting and receiving communication as you were in the last chapter; you or your applications are going out and actively getting information from other places all over the globe. The point is, on the Internet, it doesn't matter where it is, nor do you need to know in many cases where it is.

Let Me In!

Despite system differences, you will usually need to know a few specific pieces of information, such as the name of the computer or host that you want to connect to, perhaps a login id, and a password. Some computer systems require that you know the magic word to be let in to an account, and usually "please" won't work. The id (also known as a username or userid) lets the computer know who you are, and the password (which only you should know) proves it's really you. If you live in Amsterdam, it's unlikely that you're going to have an account on a computer in Tokyo, unless you have some type of special arrangement with an organization there.

Public Services

If you don't have any accounts on other systems, you may be wondering what you can use these tools for.
All you need to know is the login id or name of the service, and that's usually easily available or very well known. Most of these services don't require passwords or, if they do, they either publish them, accept anything as a password, or request that you type in your email address or some other information that lets them track who's using their resources.

A word on the hospitality of people and organizations providing publicly accessible services, file transfer sites, databases, and other resources. Sometimes it's requested that you use a service after working hours; if so you should respect that rule, keeping in mind the time zone as well.

Different Environments

When you are accessing remote services, you are connecting to another environment that may look very different from what you're used to using on your own system.

Bienvenidos a Mexico!

Sometimes the benefits of networking come in subtle packages. The Bush School in Seattle, Washington, is one of the first schools in the world to give Internet accounts to all the students, not just the teachers. Fred Dust, the school's Headmaster, relates how the Internet plays a very important role in learning, professional development, and parental involvement at his school. For example, two ninth graders were experimenting one day with online library access. They'd lived their lives in English-speaking Washington State, had taken classes in Spanish, but hadn't realized it was actually used somewhere.

The interface--the face that the other computer presents to you--will probably be different from the one you're familiar with. Don't worry; the public interfaces to these systems are pretty robust, so you won't harm anything if you make a few mistakes. Most of these online services don't come with manuals, so you'll need to read the instructions and use the help screens that are shown when you sign on.
A contact name is sometimes listed with the description of the service or on one of the initial login screens; if you have problems, you can email or call.

Error Messages

Occasionally you'll get an error message or just not be able to get to that computer. First--and most likely--is that you misspelled or mistyped the name of the computer, in which case you'll get a message such as "unknown host."

If you know that the computer exists and that you have the correct name, and you still get an error message, you can try something else. If this is the case, and you do know the IP address, you can always try substituting it for the computer name. If you have the right computer name, and the remote computer doesn't respond after you initiate a connection using an information tool, there may be problems with the network or the remote computer may be "down"--that is, not working or available.

ACCESSING INTERACTIVE SERVICES

Remote login is a basic tool that lets you fly electronically all over the world, reaching your destination in a fraction of a second.

How It Works

Remote login on the Internet is a lot like using your modem to dial into another computer, but it's usually much faster and you don't actually have to dial a phone number. The name of the protocol that enables remote login is Telnet, which is also the name of the command on many systems that lets you log in to other computers. When using Telnet to log in to a computer, just issue the telnet command followed by a space and the name of the computer. For example, if you want to check out an online book order service called Book Stacks Unlimited, Inc., type the following:

    telnet books.com

The Telnet program will make a connection to the books.com system. In this particular example, you'll be asked to type in your full name, pick a password, and specify your contact information (email and address).
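As a rough illustration of what happens when you type telnet followed by a computer name, the sketch below splits such a command into its parts, opens a plain network connection, and translates the most common failures into messages like the ones described above. Some resources also need a port number, an extra identifier typed after the computer name; when none is given, Telnet uses its standard port, 23. This is a minimal sketch under stated assumptions, not a real Telnet client: it does not speak the Telnet protocol itself, and the function names parse_telnet_command, describe_failure, and try_connect are made up for this example.

```python
import socket

DEFAULT_TELNET_PORT = 23  # standard Telnet port, used when none is typed


def parse_telnet_command(command):
    """Split 'telnet host [port]' into a (host, port) pair."""
    words = command.split()
    if len(words) not in (2, 3) or words[0] != "telnet":
        raise ValueError("expected: telnet host [port]")
    host = words[1]
    port = int(words[2]) if len(words) == 3 else DEFAULT_TELNET_PORT
    return host, port


def describe_failure(error):
    """Turn a connection error into a short, readable message."""
    if isinstance(error, socket.gaierror):
        # The name could not be looked up: probably misspelled or mistyped.
        return "unknown host"
    if isinstance(error, socket.timeout):
        # No answer: network trouble, or the remote computer is down.
        return "no response"
    if isinstance(error, ConnectionRefusedError):
        return "connection refused"
    return "connection failed"


def try_connect(host, port=DEFAULT_TELNET_PORT, timeout=10.0):
    """Try to reach host on port and report what happened as a string."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "connected"
    except OSError as error:
        return describe_failure(error)
```

For instance, try_connect(*parse_telnet_command("telnet books.com")) would attempt the connection from the example above and return one of these strings. The host may no longer exist, so no particular result is promised.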
You can then use the menu system to search and order books (over 240,000 titles are offered), and participate in a book discussion group.

Now, when you telnet to most other systems, you are usually greeted by a computerized "Who goes there?" routine. The typical prompt is Login: or Username:, at which time you type your login id or username followed by the RETURN key. It is not shown because your password is supposed to be secret, and you don't want any folks kibitzing behind you to see what it is.

In some cases when you connect to a resource, you'll have to specify an additional identifier called a port number.

    telnet madlab.sprl.umich.edu 3000

Here it was necessary to specify the port number, 3000, because it identifies a specific program. Resource guides always include the port numbers with the instructions for accessing resources, so if you don't see one, don't worry about it. There's something for everyone, such as local weather reports, snow ski reports for some parts of the country, earthquake reports for other parts, and hurricane reports.

Sometimes when you log in to another system, you'll be asked about your terminal type. Some resources, such as online library catalogs, are running on IBM mainframes, however, so you might have to use a different version of Telnet called tn3270 (if it exists on your system) in order to emulate an IBM 3270 terminal.

Now that you've learned what you can do on the Internet and a bit about how it works, it's time to cover a few advanced Internet topics. The Internet is more than just how-to. And there are some technical niceties--such as directory services and advanced methods for finding email addresses--that you can master if you're willing. A "Finding More Help" section of this chapter gives some direction for times when you need additional information or help with an Internet problem.
Put a few million people together anywhere, even in electronic cyberspace, and they'll develop some kind of culture--a fabric of shared experiences, shared recreation, shared fears, shared rules of behavior--that makes them all feel part of a community. Now it's time to learn about some of the less tangible aspects of the Internet culture, the Net legends, and the notable--and notorious--subculture of network games.

LEGENDS ON THE INTERNET

Probably everyone knows at least one story that qualifies as an "urban legend"--a story that, while it may have started with a grain of truth, has been embroidered and retold until it has passed into the realm of myth.

The following stories document the most well known of the bunch. Be street smart and wary of any posting promising fame and fortune, or asking you to forward a message far and wide.

The Infamous Modem Tax

The FCC Modem Tax Scare is a classic example of an Internet legend that refuses to die. The tax was quickly squashed in a congressional committee, and it was not--repeat not--under reconsideration at the time this book was published. The scare resurfaces continually on the networks, just like Jason from the Friday the 13th movie series, riling new users at the prospect that their new-found electronic freedom is about to be taxed.

The FCC story is essentially innocuous, although its constant recycling through the Internet wastes people's time, as well as network resources. It has also created a "cry wolf" situation, and if another modem tax ever is proposed, it will certainly be harder to mobilize the opposition.

A Catchy Title Should Appear Here

Dave Barry, noted author and nationally syndicated humor columnist for the Miami Herald, is an Internet regular. ... feature.dave_barry, has been keeping users entertained on a weekly basis for several years. Wanting to understand the erudition and sensitivity of his articles, thousands of jacked-in Dave followers formed a USENET newsgroup called alt.fan.dave_barry.
There, fans from Waterloo to Waxahachie discuss his articles and books, recent Dave sightings, those witty postcard replies to his fans, and his thriving presidential campaign in 1992 (his catchword was "A Catchy Slogan Should Appear Here"). When asked what he thought of his electronic devotees, the Internet, and this book, Barry had this to say: "I think it is truly a wonderful thing that, through the Miracle of Computers, millions of people can read my column instead of leading productive lives."

Humor abounds on the Internet, and even researchers and educators have been known to search out a laugh.

Get-Well Cards Gone Amok

Back in the mid-eighties, a British seven-year-old named Craig Shergold was diagnosed as having an inoperable brain tumor. Craig wanted to set the Guinness record for receiving the most get-well cards, and his efforts got worldwide publicity, from mimeographed sheets to email pleas. Shergold is in his late teens now, and he's doing just fine; his brain tumor was successfully treated. He did set the Guinness record for get-well cards in 1989, and has gotten more than thirty million cards to date.

Incredibly, however, the Craig Shergold story keeps circulating on the Internet, as fresh as the day it started. The Shergolds (Craig's parents), the hospital--even Ann Landers--have sent out pleas to stop the flow, but the story has taken on a life of its own, and the cards keep rolling in. The hospital and post office, which have to cope with all the mail, sell some of it to stamp collectors and paper recyclers.

So, if you see a plea on the network for cards for a little boy who's dying with a brain tumor, pass it up.

How to Win Enemies and Influence People Against You

The promise of easy and fast money is one that few people can resist. If you don't want to Lose Your Friends or Lose Your Internet Access, just say no to chain letters and pyramid messages in general.

Speaking of things not to send--everyone hates junk mail, but Internet users hate it even more.
You may be tempted to take advantage of the Internet for your business marketing programs, but consider the consequences before broadcasting commercial product and service advertisements: literally thousands of angry people will bombard your email box and tie up your phone to tell you how much they don't appreciate your doing that. A recent widely publicized case involved a lawyer in Arizona who sent a description of his services to more than 9,000 USENET newsgroups. He received over 30,000 email messages, and it's probably safe to say that none of them are fit to print in this book.

Just because the current models of advertising and direct mail don't work doesn't mean that you can't use the Internet to promote your products. It's perfectly acceptable to provide a database or archive with details of your offerings that people can peruse when they want to. Some recommended books and journals that explain this new fine art of doing business in cyberspace are listed in the Appendix.

Following the Internet to the Letter

Jayne Levin is an independent businesswoman who has successfully substituted Internet know-how for start-up capital to fund her own newsletter. She uses the Internet for interviews, production, reviews, marketing, and sales. After one year, her newsletter has been very successful, and she expects it to be profitable after the first year of publication.

"I decided to launch The Internet Letter after exploring and writing about the Internet for a year, feeding my intellectual curiosity and seeing its power to help companies cut communications costs, gather corporate intelligence, and leverage scant resources. As a start-up company, I didn't have much money. The Internet offered invaluable resources, including desktop publishing software that was much less expensive than similar software sold at a computer store. With an Internet account that cost only $15 a month, I greatly reduced long-distance phone bills, conducting interviews online.
CARL, a free database service, provides abstracts (sometimes full text) on articles that have appeared in national dailies and other publications. The Internet also provided a vehicle to distribute and sell my newsletter. I was contacted, via email, by a person in the former Soviet Union who asked for permission to translate the newsletter into Russian. I received requests for trial subscriptions from people in Turkey, India, Brazil, Cuba, Singapore, and Israel, and others used the electronic subscription coupon to sign up as charter subscribers."

Source: Jayne Levin, Editor and Publisher, Net Week, Inc.

GAMES

Just about every computer user has at least one game tucked away somewhere--the kind you play surreptitiously when the boss isn't watching or when you've got a bad case of writer's block. There are shareware and freeware games you can download for your own computer, as well as game newsgroup discussions and email lists. There's the Trivia USENET Newsgroup, whose participants have gotten past naming all the seven dwarfs and have now moved on to higher-order thinking--naming all the characters in sitcoms from long ago (Gilligan's Island, the Brady Bunch, Laverne & Shirley) (...games.trivia).

As you might imagine, the big games on the Internet tend to match the network itself in scale and complexity, and they are a world and culture unto themselves. The games can feature fantasy combat, booby traps, and magic. Players interact in real time, and can change the world in the game as they play it by creating environments, rules, and characters. All the games demand an intense learning process to figure out all the characters and game idiosyncrasies, not to mention the rules.

Some people literally spend all of their waking hours in the game. Many of the game players seem to feel the need to leave their mark on the game, and generations of game variations have evolved. In most games, new players take on a persona and then participate in the game.
To quote from the Frequently Asked Questions document for Multi-User Dungeons, "You can walk around, chat with other characters, explore dangerous monster-infested areas, solve puzzles, and even create your very own rooms, descriptions, and items." If these games sound interesting, check out the USENET newsgroups under the hierarchy rec.games.muds or alt.mud.

SECURITY ISSUES

Computer security is a major issue no matter where you go, what type of computer you use, or whether or not your computer is connected to a network. This section will provide some insight into security on the Internet and the answers to those questions.

First of all, you should realize that despite its military origins, the Internet is not a classified network. The ARPANET was a network research experiment, so there was a lot of collaboration, with information being transferred between machines and researchers. Besides, the ARPANET was a small community, and users left their doors unlocked, just as trusting folks in small towns do. Today, the Internet is a massive cooperative with tens of thousands of networks--several orders of magnitude larger than the ARPANET--all tied together.

What's not so secure about the Internet? When a new computer arrives at an organization, all the factory-set passwords and network configurations need to be changed; if they're not, the host will be an easy target for break-ins and outside attacks. Since all parts must work together to make the entire Internet secure, it's probably best to assume that things just aren't and act accordingly. Fortunately, when they do, lessons are learned, holes or weaknesses get fixed, problems are highlighted, and the Internet takes another step toward becoming more secure.

The Food Is Better in the Virtual Dorm, or, Finding the Quad on a Penta Chip

A simple multi-user role playing game in cyberspace called Multi User Dungeons (MUDs) may turn out to be the key to an entirely new approach to education.
Recent Internet explorers playing MUDs saw new applications for these interactive, virtual worlds that were far from the Dungeons and Dragons and Star Trek realms of the early MUDs and their derivatives. In an attempt to incorporate education and distance learning into the virtual environment, MIT's MicroMUSE (Multi User Simulated Environment) University laid the foundation for educational uses of a technology once viewed cynically as a time-wasting and resource-gobbling game.

Over the last two years, virtual colleges have begun to appear. Unlike traditional online, email-based distance learning classes, virtual colleges provide micro-worlds that enhance the subject matter being presented and provide environments in which students and faculty interact in real time. Typical of these new environments are DeanzaMUSE at De Anza College in Cupertino, California, and MariMUSE at Phoenix Community College in Phoenix, Arizona.

DeanzaMUSE is a precise replication of the real De Anza college. At the same time, the VR (virtual reality) campus serves as a metaphor for navigating the information resources of the Internet. For example, the DeanzaMUSE campus planetarium has specialized links to astronomical resources around the world, the Euphrat Gallery features exhibits of JPEG images drawn from a variety of sources, and the Bio-Sciences classrooms access data from similar programs at major universities and research centers.

Phoenix College offers a credit course through its Language Arts division taught entirely on MariMUSE. Depending upon the course being offered, class might be held on the deck of a Viking ship, at a street corner in New York City, or in a quiet study in sixteenth-century England. With nearly two years of experience in the newly emerging field of virtual instruction, MariMUSE instructors are doing pioneering work in the development of instructional tools and techniques.
Virtual colleges may provide an entirely new and highly cost-effective environment in which to explore education in the twenty-first century.

Source: Stan Lim

Breaking Down Account Doors

The press regularly reports on hackers breaking into computers and causing damage. "Hacker" in the computer world is a term of respect--hackers are basically nuts about computers and like to learn systems inside and out. Real hackers aren't angels, but they don't get their kicks from breaking into other systems to exploit holes and snooping in someone else's information.

mudhead n. ... with the consolation, however, that they made wizard level. When encountered in person, all a mudhead will talk about is two topics: the tactic, character, or wizard that is supposedly always unfairly stopping him/her from becoming a wizard or beating a favorite MUD, and the MUD he or she is writing or going to write because all existing MUDs are so dreadful!

Source: The New Hacker's Dictionary, edited by Eric S.

What Can You Do?

As a user of the Internet, you can't do much about fixing security problems if the computer you're using to access the Net is not your own.

Most levels of service on the Internet require some type of authentication to prove it's really you accessing the service. Your userid is usually well known (you give it out so people can send email, for example), so the only way you can protect yourself is with a secret password.

If an undesirable gets your password and uses it to enter your account uninvited, worse things can happen than just your files being looked at, modified, or deleted.

Never give anyone your password without a valid reason. When you do give it to someone so that he or she can obtain necessary information or perform an action, change the password as soon as he or she is done. If you get an account on another system, such as a public database or bulletin board, do not use the same password that you use on your local system.
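The password advice in this section can be turned into a simple checklist. The sketch below checks a candidate password against a few of the recommendations given under "How to Pick a Password": at least six characters, not all numbers, some numbers and punctuation mixed in, and no easily guessed personal words. The function name password_problems and the personal_words parameter are hypothetical, and passing these checks is no guarantee that a password is hard to guess.

```python
import string


def password_problems(password, personal_words=()):
    """Return a list of reasons a password looks weak (empty if none)."""
    problems = []
    if len(password) < 6:
        problems.append("shorter than six characters")
    if password.isdigit():
        problems.append("made up entirely of numbers")
    if not any(ch.isdigit() for ch in password):
        problems.append("contains no numbers")
    if not any(ch in string.punctuation for ch in password):
        problems.append("contains no punctuation")
    # Names of spouses, pets, streets, and so on are supplied by the caller.
    lowered = password.lower()
    for word in personal_words:
        if word and word.lower() in lowered:
            problems.append("contains an easily guessed word: " + word)
    return problems


print(password_problems("123456"))
print(password_problems("Rex", personal_words=["rex"]))
print(password_problems("Ih8!2type"))
```

The last call prints an empty list because that made-up password is long enough and mixes letters, numbers, and punctuation.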
Don't write your password down and leave the paper in an obvious place, such as in the desk drawer next to your computer.

Copying the scams in which callers try to get your credit card number over the phone, some potential intruders call or send email claiming to be a system administrator. These con artists will tell you that, for various reasons, you need to change the password for your account to the one they provide you.

How to Pick a Password

An easily guessed password is one of the most common causes of security problems. If you don't know how to change your password, put it at the top of your list of things to learn. They should also not be easily guessed, such as your husband's or wife's name, girlfriend's or boyfriend's name, the dog's name, your license plate, the street where you live, your birthday--you get the picture.

So what can passwords be? It's recommended that the word be at least six characters long. This way the password is not a word, but it's easy to remember and hard to guess. It's also recommended to mix some numbers with the letters and throw in some punctuation for pizzazz, but never make your password all numbers.

Don't Try This at Home

You can't point-and-click on CompuServe to make toast in Cairo, but way out on the frontiers of Internet development, the cognoscenti are whipping up elegant hacks to do just that. TGV, Inc., a networking software company in Santa Cruz, California, first got involved in networking home appliances at a chance meeting between then TGV Technical Support Manager Stuart Vance and Simon Hackett of the University of Adelaide. In December 1989, Stuart was in Adelaide for a networking conference, and discovered in conversation with Simon a mutual love of perverse interesting computer and networking applications. They decided that it should be easy to extend control across a network, using the TCP/IP network management protocol SNMP (Simple Network Management Protocol).
Upon returning home, Stuart managed to persuade TGV management to fund Simon's development of a custom controller to interface to a Pioneer Stereo system. Engineers at TGV wrote a small IP stack for the microprocessor, and Hackett and Vance ported the Epilogue Technology SNMP agent to run on the controller. Additionally, they developed (but never quite completed) a home electronics SNMP Management Information Base for selecting input (CD, tuner, cassette deck, phonograph), volume, tuner band and frequency, and other standard stereo features.

The stereo system project led to further collaboration between TGV and Hackett, including: one of two independent implementations of an SNMP-manageable Sunbeam toaster; an SNMP-manageable Sony 60-disc CD jukebox; and the Interphone, a scheme for audio communication over TCP/IP. Simon has since founded Internode Systems, a networking company in Australia, and continues to work with Stuart on connecting unconventional and conventional devices to the Internet.

Perhaps Hackett and Vance were influenced by Stephen Wright (a comedian), who several years ago told this story: "In my house, there's this light switch that doesn't do anything."

Source: Stuart Vance

Can People Read My Email?

Can they read it? But you need to realize that once email leaves your system, it may sit on another computer hundreds or thousands of miles away, and you have no control over who has access to it. The best thing to do is to realize that your email is not going to be secure, and to avoid transmitting sensitive material, as already recommended in Chapter 3. Even if no one reads your email while it's in transit, the recipient could forward the message on to whomever he or she pleases. It is physically possible to tap networks, just like tapping telephone lines.

Encrypt means simply that it's encoded into something that no one else can read without the proper key (the digital equivalent of a Captain Marvel decoder ring).
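To make the idea of a key concrete, here is a toy example in the decoder-ring spirit. It scrambles a message by combining each byte with a repeating key, and the same key turns the scrambled bytes back into the original. This is an illustration only: it is not secure, it is not how the encryption programs used on the Internet work, and the function name xor_with_key and the sample message and keys are invented for this sketch.

```python
from itertools import cycle


def xor_with_key(data, key):
    """XOR each byte of data with the repeating key (toy cipher only)."""
    if not key:
        raise ValueError("key must not be empty")
    return bytes(b ^ k for b, k in zip(data, cycle(key)))


message = b"Meet me at the stage door at noon."
key = b"decoder-ring"

scrambled = xor_with_key(message, key)    # unreadable without the key
restored = xor_with_key(scrambled, key)   # the same key reverses it
wrong_guess = xor_with_key(scrambled, b"wrong-key")

print(scrambled)
print(restored)
print(wrong_guess == message)
```

Anyone who intercepts only the scrambled bytes sees gibberish, while the holder of the key recovers the message. Real encryption software rests on far stronger mathematics than this.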
There are no automatic mechanisms available in the Internet right now to encrypt email, but if you have the necessary software on your computer, you can do it. An increasing number of people are interested in the privacy of their correspondence, and a number of programs and solutions are popping up to assist them. PEM implementations are unfortunately not in widespread use yet, but they've begun to proliferate, and may be coming soon to an email application near you. Another encryption program in use on the Internet is called Pretty Good Privacy (PGP), and it's used a lot outside the United States.

Once you're a regular on the Internet, you'll notice that a lot of computers out there run the Unix operating system. Unix was, and is, popular among researchers and computer science departments (which made up the early Internet), partly because some of the first versions of TCP/IP were distributed free with one version of Unix known as the Berkeley Software Distribution (BSD). Many computer companies sell their machines with Unix and TCP/IP bundled in, which makes it a more popular combination than some of the other computers and operating systems, for which TCP/IP support has to be ordered separately.

You don't have to be a Unix expert to use the Internet, but it doesn't hurt to know some of the basic commands. If you're using the Internet, however, sooner or later you'll have to deal with Unix face-to-face, so included in this chapter are some explanations of the more common idiosyncrasies and applications you may encounter when using Unix on the Internet. Knowing how to navigate through directories and use some of the basic Unix commands will make you a more powerful Internet user. Be aware that this chapter will give you only the barest of tools to get you started and help you accomplish what you need to do.
If they don't, you should use the help facility (explained below), or call your local help desk to find out what the proper command or sequence of commands is.

LOGGING IN

Let's start with the basics--getting access to your Unix account.

    login:

At this point, type your userid.

    Passwd:

When you type your password, it will not and should not display on the screen. Important!

GETTING HELP

Unix may not always offer a lot of help outright, but it does have a help facility called man, which stands for manual pages. If you ever need help with a command, type

    man command

where command is the name of the command.

More

The man command uses another Unix program called more that lets you page through files--meaning it shows you one screen at a time of the file instead of letting it fly off the screen. To advance to the next page, simply hit the space bar once (typing return will only advance the file by one line).

Other Things to Know

Many of the applications and commands mentioned below refer to control commands. When either of these precedes a letter, you should hold down the CTRL key, and at the same time, press the command letter. For example, if you see ^G, ^g, or CTRL-G written in documentation, you should hold down the control key while pressing the g key (lowercase g).

THE UNIX FILE SYSTEM

The Unix file system--the way files are organized on the computer's hard disk--is hierarchical, similar to the DOS file system. If you understand how the DOS file system works, then it shouldn't take you long to find your way around Unix systems. Since a good number of the public file archive sites are computers running the Unix operating system, learning your way up and down a Unix directory (as was discussed in Chapter 4) will come in pretty handy.

As a user on a Unix Internet system, you have your own space on the file system. You can organize this file cabinet any way you want--it can be very structured, neat, and tidy, or it can be extremely messy and unorganized.
If you like some order to your life, then you'll be happy to know that you can create directories that house files or other directories. A directory can be compared to a manila folder, which you can use to organize and store papers (files) and other folders (directories).

Miles of Files, Directories, and Commands

Once you start surfing the Internet, you'll be pulling down articles, books, and software, among other things, from all over the place. You'll also probably be creating quite a few files, using either the Unix editors mentioned below, or by uploading them from your own PC or Mac (if you're dialed in to the Unix system). Because there are so many different types of files and so many ways to create them, you should definitely have an organized system for storing them.

First of all, to see which files you have in your home directory, you can use the ls command to list them. If you type ls with no arguments, it will list your current directory (which, if you've just logged in, is your home directory). When you type ls -l, you'll see several columns of information that specify permissions, links, owner, group, size in bytes, and the time of the last modification for each file. Here's a sample listing of David Letterman's home directory after he types ls -l to get a long listing:

    sullivan-theater ls -l
    total 16
    drwx------ 4 letterman cbs 1536 Feb 13 14:34 .
    drwx------ 1 letterman cbs 3449 Feb 13 10:49 Mail
    -rw-r--r-- 1 letterman cbs 9383 Jan 17 11:03 LightBulbJokes
    drwxr-xr-x 2 letterman cbs 1024 Feb 12 10:39 News
    -rw-r--r-- 1 letterman cbs  792 Feb  6 17:51 guest-list
    -rw------- 1 letterman cbs 5097 Jan 29 13:59 tonights-jokes
    -rw-r--r-- 1 letterman cbs 5039 Jan  1 10:59 top-ten-list

The top two files indicate directories: . is the current directory, and .. is the parent directory. Your listing may not have as many columns as in this example. The directories here are the current directory (.), the parent (..), Mail, and News.
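If you'd like to see how the columns of a long listing break apart, here is a minimal sketch in the Bourne shell. It takes one line copied from the sample listing and splits it on spaces. The variable names are made up for this illustration, and the sketch assumes the nine-column layout of the sample, so it may need adjusting if your system prints fewer columns.

```shell
#!/bin/sh
# One line copied from the sample long listing.
line='-rw-r--r-- 1 letterman cbs 9383 Jan 17 11:03 LightBulbJokes'

# Split the line on whitespace into the positional parameters $1..$9.
# The "--" keeps the leading dash from being read as an option.
set -- $line

permissions=$1          # a leading "d" here would mean a directory
links=$2
owner=$3
group=$4
size=$5                 # size in bytes
modified="$6 $7 $8"     # month, day, and time of last modification
name=$9

echo "$name: $size bytes, owned by $owner (group $group), modified $modified"
```

Because the permissions column of this line starts with a dash rather than a d, the entry is an ordinary file, not a directory.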
Sometimes you need to know the size of a file, especially if you're running out of disk space on your account.

When you log in, your current directory is your home directory.

Again, another warning about case sensitivity. You need to type commands and names exactly as they appear in resource guides, email, and news. If the output of an ls command shows a file called LightBulbJokes, and you want to look at that file, you need to type the name exactly as shown; lightbulbjokes or Lightbulbjokes simply will not work. This may seem a little tedious, so you might want to use the copy and paste functions of your workstation, if they're available.

There are several commands to view files; one of the most common ones, already explained above, is more.

    more LightBulbJokes

This program lets you page through a file.

If you decide you don't want to keep the LightBulbJokes file anymore, use the remove file command, rm.

    rm LightBulbJokes

Or, if you want to rename the file, you can use the move command, mv.

    mv LightBulbJokes LBJ

This would rename the file to LBJ. If you forget where you are--what directory you're in--type the print working directory command, pwd.

If tracy wants to go back to her home directory, she can issue one of two commands in this example. She can type cd, which by itself automatically puts her back in her home directory (/home/tracy) no matter where she is. Or she can type cd .. (yes, that's cd and then two periods), which will change her current directory to the parent of the one she's currently in (up one level).

Back to the David Letterman example.

    mkdir jokes

The next thing he should do is move the two joke files he has in his home directory to the jokes directory.

    mv LightBulbJokes jokes
    mv tonights-jokes jokes

He is using the rename command to move the files, but they do keep their original names in the jokes directory.

    cd jokes

Now his current directory is /home/letterman/jokes.
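The directory housekeeping described here can be tried safely in a scratch directory. The sketch below is one way to do it, assuming a Bourne-style shell and the mktemp command (which not every Unix system has). The file contents are placeholders, and the scratch directory is removed at the end so nothing real is touched.

```shell
#!/bin/sh
# Work in a throwaway scratch directory so no real files are touched.
scratch=$(mktemp -d)
cd "$scratch" || exit 1

# Placeholder stand-ins for the two joke files in the example.
echo "placeholder joke text" > LightBulbJokes
echo "placeholder monologue" > tonights-jokes

mkdir jokes                  # create the new directory
mv LightBulbJokes jokes      # move each file into it; the names do not change
mv tonights-jokes jokes
cd jokes                     # make jokes the current directory
pwd                          # prints a path ending in /jokes
listing=$(ls)                # capture the names now in the jokes directory
echo "$listing"

# Tidy up: step out of the scratch directory, then delete it and its contents.
cd / && rm -r "$scratch"
```

Remember that names are case sensitive: the files moved here are LightBulbJokes and tonights-jokes, exactly as typed.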
    LightBulbJokes tonights-jokes

He can then edit or look at the two files in that directory. ... and he'll be one level above /home/letterman. If he decides later on to remove the jokes directory, he needs to use the rm -r command:

    rm -r jokes

This will recursively delete every file in the jokes directory, and the jokes directory entry as well.

These instructions are the bare minimum, but they should get you started moving, removing, renaming, and looking at files and directories.

Sydney Is Burning

As I flipped through my email messages one morning, I suddenly received a new one entitled "The Sydney Bush Fires." The mail was from my Australian keypal, and he was telling me and some of his other keypals what it was like to be experiencing the bush fires that were burning all round Sydney. Forgetting all about my other messages for the time being, I quickly wrote back and arranged to go with him to the KIDLINK IRC (Internet Relay Chat). On IRC, a place where, amazingly, people can talk back and forth, I was able to ask my friend all about the disaster. It turned out he was less than ten kilometers from the fires, he could see the flame-tinged sky and smell the smoke from his window, and he was able to tell me how far the fires were from the famous Opera House and the Taronga Park Zoo. During the next several days, I communicated through email several times more with my Sydney friend, and the fires got even closer to his house. However, all week long the information about the Sydney fires that I brought to current events in my social studies class was more up-to-date than anything in the newspapers. That is only one of my amazing network experiences, but it is one that illustrates the way being on a computer network and having access to the Internet has changed my life in wonderful ways.

From a winning essay, "Networks: Where Have You Been All My Life?" by Rachel Weston, rweston@cap.gwu.edu, Grade 7, Georgetown Day School, Washington, D.C.
Creating Files

Perhaps one of the reasons Unix doesn't have a good reputation for being user-friendly is the choice of editors. They're not that hard to use--it's just that in most cases you can't use your mouse to point, click, and insert text. Or, you can create a file on your PC or Macintosh, and upload it (transfer it from your PC or Mac to the Unix system).

Uploading Files to the Unix System. If you decide to heck with learning another editor, and you'd rather upload files you created with your easy-to-use word processor on your PC, here's what you need to do.

Most likely the files you've created on your home computer are not text files. For example, a word-processed file is not a text file because there are a lot of codes and symbols in it known only to your word processor program. You can also convert a word-processed file to a text file by saving it as "text with line breaks" or "text with no line breaks." To transfer a text file from your own computer to the Unix computer, initiate Kermit on the Unix system by typing kermit -r; the -r option means that the Unix system is going to receive the file. You should then escape back to your PC or Macintosh and initiate the sending process on your home computer by specifying what file you're sending, either through the menu system (choose Send file) or by typing a command (for example, send file1.txt). Unfortunately, there are so many communications packages that it is impossible to tell you here what to do in your particular situation. If everything goes according to plan, your text file should transfer nicely, and then it will be on the Unix system.

If you decide to transfer a binary file, such as software or a word-processed document, you can do that too.

Vi. Here's your basic survival guide. To create or edit an existing file, you type vi filename, where filename is the name of the file you want to create or modify.
When you fire up vi, you'll recognize a vi session because your screen doesn't contain much explanatory information, just the text in the file or, if the file is empty, a bunch of ~s in the first column of every line.

Upon initiating a vi session, you'll automatically be put into command mode. When you type i, it won't show up on your screen, but you will instantly be put into insert mode. Now everything you type goes into vi's temporary editing buffer. Here's a way to tell if you're in command mode or insert mode.

You can move around and position your cursor by using the arrow keys. You can also use the letters h, j, k, l (in command mode only) to move left, down, up, and right, respectively. To delete characters when you're in command mode, position the cursor over the character to be deleted and type x.

It's possible to include other files in a vi session.

    :r filename

where filename is the name of the file located in your current directory. Notice that when you type a colon in command mode, the cursor automatically positions itself at the bottom of the screen.

Finding out how to leave vi is the biggest question most newbies have right after they start it up.

SUMMARY OF VI COMMANDS

Command Mode

Insert Commands:
    i     insert before the cursor
    a     insert after the cursor
    o     open (or start inserting in) the line below the cursor
    O     open (or start inserting in) the line above the cursor

Delete Commands:
    dd    delete the current line
    dw    delete word
    x     delete the character under the cursor

Exiting Commands:
    :w    write (or save) file
    :q!   quit without saving changes
    :wq   write (save) changes and quit

Other:
    :r filename   include filename in buffer
    ESC           returns to Command Mode

PICO. To edit a file with PICO, simply type pico filename. The top line is a status line, the third line from the bottom is used for informational messages, and the bottom two lines provide a summary of the commands you can execute.

To perform functions, PICO makes use of the control commands mentioned above.
For example, to delete a line, position the cursor on the line, and type CTRL-k (hold down the control key and press the k key).

To quit PICO, type CTRL-x (^X). The message line (three lines from the bottom) will ask you if you want to save your creation or changes before exiting. Unless you don't want to save your changes for some reason (perhaps you made too many mistakes and would like to recover the old version), you should always save. It will default to the filename you specified at startup, so just press return if you want to write over that file.

SUMMARY OF PICO COMMANDS

    ^G    Get Help
    ^X    Exit
    ^O    WriteOut (save changes)
    ^J    Justify (format the current paragraph)
    ^R    Read file (insert a file at cursor position)
    ^W    Where is (position the cursor at a specified text string)
    ^Y    Previous page (position the cursor at the previous screen page)
    ^V    Next page (position the cursor at the next screen page)
    ^K    Delete line
    ^U    Undelete line
    ^C    Current position of cursor
    ^T    Spell

Emacs.

    emacs filename

To learn about emacs, check out the online tutorial by typing ^H T.

Now that you know what you want to do on the Internet, or at least where you want to go exploring, you'll want to get connected. This chapter tells you what you need to get started, your choices for individual access, where to go for services, and the basics for connecting a business or organization. Demand for Internet access is increasing worldwide, but there are more connectivity choices for individuals and businesses in the United States because of the many competing provider services there.

EASY STREET

If you work for an institution or a company with full-time access through a network connection to the Internet, you have the shortest path of all. All you need to do is sit down at your office terminal or workstation and, using the instructions and Internet applications supplied by your in-house computer gurus, log on and get going.
For example, a college's local-area network (LAN) might get access to the Internet by making a connection through a leased phone line to a regional network. Once that connection is made, in most cases, every computer on the local-area network has full-time access--meaning, the Internet is available all the time, day and night.

ALL YOU NEED TO GET STARTED DIALING INTO THE INTERNET

Fortunately, these days there are more and more ways to get access to the Internet if you're an individual computer user or small business. Connecting an entire business or organization's network is more complex than can be covered in detail here, but an overview of the major steps is included later in this chapter (see Connecting Your Business or Organization).

Modems

If you're in the market for a modem, then read this section before whipping out your credit card. A little planning and research in the modem department on your part will make your journey to the Internet a bit easier. Modems are, simply put, computer appliances that convert the digital signal from your computer into an analog sound wave that can be transmitted over telephone lines. A modem at the other end converts the analog signal back into a digital signal that is understood by the computer you're talking to. As with any computer-related purchase, you should buy the very best modem you can afford--perhaps even a bit better than you can afford. Technology changes fast, and five years from now, today's high-speed modems will be as obsolete as that dinosaur of modems, the 300bps acoustic coupler. If you've already got a slower modem, don't despair just yet. Many individuals are still using 2400bps or slower modems that they've had for several years to access the Internet and other services. All of the access and information systems support them, and, for the occasional user, the difference in online and/or long-distance charges may not be too significant.
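To get a feel for what modem speed means in practice, you can estimate transfer times with a little arithmetic. The sketch below is a rough calculation in the Bourne shell, not a benchmark. The function name transfer_seconds is made up for this example, the 100K file is just an illustrative size, and the sketch assumes about 10 bits on the line for every byte (8 data bits plus a start bit and a stop bit), with no compression, retransmissions, or protocol overhead.

```shell
#!/bin/sh
# transfer_seconds SIZE_KB SPEED_BPS
# Rough estimate of the seconds needed to move SIZE_KB kilobytes over a
# link running at SPEED_BPS bits per second.
# Assumption: 10 bits on the wire per byte (8 data bits plus start and
# stop bits). The result is rounded down to a whole second.
transfer_seconds() {
    echo $(( $1 * 1024 * 10 / $2 ))
}

# Compare a few modem speeds for an illustrative 100K file.
for speed in 300 2400 9600 14400; do
    echo "$speed bps: about $(transfer_seconds 100 "$speed") seconds for a 100K file"
done
```

Under these assumptions a 100K file needs about 426 seconds (roughly seven minutes) at 2400bps, and a little over one minute at 14.4Kbps. Real transfers will vary with line quality and with any error correction or compression in use.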
Using a 2400bps modem, you can access electronic mail, Telnet, FTP, and the terminal client gopher application. However, the bigger the message or file, the longer it will take to show on your screen or transfer to your computer.

If you plan to spend a lot of time online and run applications like Mosaic, or if you need quick, error-free access, spring for a high-speed modem with error correction and data compression. You can use Mosaic if you're dialing into the Internet with a modem, but you must be using a modem that runs at least 9.6Kbps (preferably at 14.4Kbps or faster). See the Full-Access Dial-up Connection section to learn how to use Mosaic and other client applications via a dial-up link.

The ideal modem for telecommunications not only communicates at high speeds but also has error correction and data compression features. Error correction protocols help filter out line noise, which throws garbage characters--like pdf--on your screen, and they ensure an error-free transmission. Data compression, while a useful feature, may not help you much on some bulletin boards and information services that have already compressed their files because your modem can't compress them any further. So it would be wise, especially if you are planning to spend a lot for a high-speed modem, to check some independent sources before you buy.

Communications Software

The second required component is software that will enable communication. Communications software, which is installed on your personal computer, sets up the three-way conversation between your computer, the modem, and the remote computer or terminal server. Since you are dialing into the Internet, there are many types of communication packages available, enabling three different kinds of connections.

TYPES OF CONNECTIONS

The following sections explain the three basic access options you have as an individual independent user.

Terminal Emulation

What It Is.
Using your modem and free or commercial communications software, such as Kermit, PROCOMM, WhiteKnight, or MicroPhone, you can dial into an Internet-connected computer or communications server and basically turn your PC or Mac into a dumb terminal that will most likely emulate a VT100, a venerable terminal produced in the millions by Digital Equipment Corporation (DEC). And there are a number of free implementations, like Kermit, that are widely distributed through various channels, such as user groups, bulletin board systems, and the Internet. Once connected, everything you type is from the perspective of the remote computer into which you have dialed.

COMMON MODEM STANDARDS AND TYPICAL SPEEDS

The following specify some common modem standards. Many of these--the ones that begin with a "V"--are defined by the Consultative Committee for International Telegraph and Telephone (CCITT), an international organization that develops communications standards. The third column estimates the time it would take to transfer a 100K file (the average size of many documents or image files on the Internet).

    Modulation Standard   Speed      Approx. Transfer Time
    V.22                  1200bps    14 minutes
    V.22bis               2400bps    7 minutes
    V.32                  9.6Kbps    2 minutes
    V.32bis               14.4Kbps   1 minute
    V.34                  28.8Kbps   30 seconds

    Standard              Type
    V.42                  Error Correction
    V.42bis               Data Compression
    MNP 4                 Error Correction
    MNP 5                 Data Compression

Notes: Speeds are represented here in bits per second (bps), not in baud.

Be aware that the other end must support the same standards in order to achieve the desired connection rate. V.34 is also known as V.fast, and is supposed to be available in the summer of 1994. A popular high-speed modem these days is one that conforms to V.32bis with V.42 and V.42bis.

There are many other standards.

When you use FTP or Gopher to transfer a file, be aware that you are transferring the file to the Internet-connected computer you are dialed into, not to your own computer.
If you want the file to reside on your PC or Mac, then you have to execute another transfer process by downloading it using a different kind of file transfer protocol, such as Kermit, Xmodem, Ymodem, or Zmodem. This is perhaps one of the biggest stumbling blocks for new users--the confusion about where the file actually is and how to make it show up where you want. When you transfer a file to the middle guy using FTP or Gopher, remember that you then need to tell the middle guy to transfer it to your own computer.

For example, suppose that you're using Kermit to dial into an Internet-connected computer on the Zilker Parknet (a commercial Internet provider located in Austin, Texas). You're zipping around the planet checking out the scene, when you find an archive of online books available via anonymous FTP on host vtucs.cc.vt.edu in the /Files/infores/books directory (the URL is ftp://vtucs.cc.vt.edu/Files/infores/books). After you browse the digital shelves looking for a book you can curl up with on your laptop and read, you decide on Walden by Henry David Thoreau.

At this point, Walden is on the Zilker Parknet computer (the middle guy), not on your own computer. You need to initiate another transfer (using Kermit, Xmodem, Ymodem, or Zmodem, for example) from Zilker Parknet to your PC or Mac. Here's how to do this if the middle-guy computer and your computer both have Kermit.

edu, path Eris Information Services Eris Files Information Resources Books.

Offline Software Access

What It Is. Offline software access brings some of the Internet functions, such as electronic mail, USENET news, and file transfer, straight to your computer, but lets you work offline. When that happens, the software makes the connection, performs the required functions, such as transferring email back and forth, and then disconnects.
In addition to taking care of the communications, this software also provides email, an editor for composing messages, and perhaps news readers.

Although you're not interactively using the Internet, you can still do a lot of useful things, such as downloading electronic mail and news, and reading messages and postings at your leisure on your home computer rather than tying up a phone line or running up connection charges. But be aware that not all of the Internet's applications, particularly remote login, Gopher, and Mosaic, are available to you, since you can't issue commands and receive information interactively when you're not connected. Despite this limited functionality, these client connections are recommended for novice users, because they are more user-friendly than many of the public-access systems. With such access, you work with a familiar graphical application on your PC or Macintosh, not on a foreign computer account. You also don't have to worry about taking the extra step of transferring files from a middle-guy Internet computer to your home computer (as you do with dial-up terminal emulation access)--the software does all of this for you.

Full-Access Dial-up Connection

What It Is. A more advanced client connection uses client networking software and a high-speed modem to actually become a directly connected computer on the Internet. This type of access differs from the services above because you are skipping the terminal-emulation middle guy, so to speak, and you're interactively using the Internet, not working offline. What makes this happen is a fast modem (the fastest you can get, at least 9.6Kbps), and software that conforms to Serial Line Internet Protocol (SLIP) or Point-to-Point Protocol (PPP). Either of these, used in conjunction with graphical Internet client applications like Gopher and Mosaic, brings the power and flexibility of the Internet straight to your home computer over an ordinary telephone line.
SLIP and PPP are different, but each performs essentially the same function--that is, they make your computer a peer computer on the Internet. A SLIP or PPP connection is a great way to connect, but it can be more expensive and a bit more difficult to configure. When you use this type of connection, you are actually executing Internet applications on your own computer, not on an Internet-connected computer that you've dialed into. For example, if you want to transfer a file using FTP from a public-access site, you transfer that file straight to your home computer instead of working with the terminal-emulation middle guy.

How It Works. You must dial into another computer or terminal server that is running SLIP (if your computer is running SLIP) or PPP (if your computer is running PPP) to make this connection. They help you get set up at the beginning of the connection, but they are essentially invisible after you get going. You'll also need a unique Internet Protocol (IP) address, because your computer must be identified on the network. Your provider will most likely assign you an address, or the remote SLIP/PPP server will assign you a number to use when you make the connection. You may want a registered hostname as well, but as with the IP address and any other required information and parameters, your network provider will probably be able to assist you.

Internet to the Rescue!

Tired of those busy signals when you're trying to reach technical support for your computer? Over the past year, they've gotten bug fixes and patches for their Sun Microsystems workstations and technical support from their router vendor, Cisco Systems.

One of the company's software engineers told us about how the Internet recently saved the day (and night) for him when his boss needed a network monitoring problem fixed by Friday morning and it was 4:59 p.m.

Source: Peter Ho, Unocal Corp.
CHOOSING AN INDIVIDUAL ACCESS PROVIDER

Network access for individuals is a new and evolving market, one that is growing very quickly. So finding the services you want, the access, and the right price is not as simple as picking a long-distance phone carrier, or getting phone service through your local phone company. Use the information here and in the Providers section of the Appendix as a general guide to starting your own research.

Public Dial-up Internet Access Systems

Lots of companies offer dial-in access to their large Internet-connected computer systems, giving you terminal emulation or (if available) SLIP/PPP access to the Internet. All of these services offer file transfer, remote login, Gopher, and news services, in addition to electronic mail and (depending on the system) a variety of other services, including commercial databases. Access is usually via a phone call to the system's local number, although some systems also offer access via public data networks, such as CompuServe Packet Network (CPN). Many public-access providers are expanding and adding access points in more cities, so you may want to contact them for their latest local dial-in information.

More often than not, the type of computer into which you're dialed is running the Unix operating system. Many providers also offer menu systems that eliminate the requirement of a computer science/Unix internals degree and simplify things greatly. If you are forced to wade through the Unix muck, be sure to refer to Chapter 6, which includes information on some common commands, applications, and how to get help if you get stuck. To be fair, Unix isn't all that bad, and once you get the hang of the system, it can be quite fun to use.

See the Providers section in the Appendix for a list of public access dialup systems compiled by Peter Kaminski.

As mentioned in Chapter 2, there are lots of regional academic/research and national commercial Internet providers that offer individual access to their networks.
The commercial providers, such as CERFnet, UUNET, ANS CO+RE, Sprint, and PSI, offer a wide range of access for individuals, from terminal emulation to full-time SLIP or PPP access.

Everything-but-the-Kitchen-Sink Providers

You've probably been shaking your head at all the background work you have to do just to find graphical, user-friendly interfaces, and to find an Internet provider. Well, be on the lookout for commercial products that combine full Internet access, an Internet provider, and all the parts needed to make graphical client applications like Gopher, WAIS, and Mosaic work.

Special Interest/Professional Groups

You may be eligible for inexpensive Internet access through a special interest or professional group. If you are a teacher and are interested in finding out more about access to the Internet, then contact your district's computer coordinator or regional computing consortium to find out about your access options.

Community Networks

Community networks are springing up in cities all over the world. In addition to acting as online town halls, providing information about city government and local functions, they often offer email and perhaps full access to the Internet.

See the Appendix to find out if there's an education network or Freenet in your area.

Alternative Phone Access

The services listed above are great if you live in a big city with local dial-in access points. However, if you live in a rural area, you travel frequently, or your chosen system is an expensive long-distance call away, you should investigate other access methods.

CompuServe Packet Network (CPN). If your chosen system allows access via CPN, use your modem to dial CompuServe's information service, (800) 848-4480 in the United States, to find your closest CPN access number.

Toll-Free Service. There are some Internet providers in the United States that offer a toll-free 800 number that gets you access to a communications or terminal server.
Be aware, however, that 800 numbers are not free, and the cost is passed on to you, just like a long-distance charge. The last thing you need is a big surprise on your bill, because those blissful Internet hours can add up quickly.

Major City Dial-in Service. Access is usually made via the local phone system to a terminal server or communications server connected directly to the Internet. A terminal server is basically a bouncing-off point to the Internet, a computer that accepts connections and allows you to use the Internet to remotely log in to other computers. Terminal servers have modems attached to them so that users can dial in and, from there, remotely log in to any computer on the Internet, or initiate a SLIP/PPP connection to become directly connected. Who does it: UUNET's TAC Access, EUnet's Traveller (major cities in Europe), and PSI's Global Dialing Service (GDS) offer local dial access in many cities.

Imagine you are a florist and you can ship anywhere in the world. This is what Jennings Florists in Victoria, British Columbia is doing, with full-color pictures of its most popular gift baskets and flower arrangements. The only link you need to reach the Jennings Florists catalog on the World-Wide Web (see Figure 1-1, Jennings Florists Web Site) is called a URL, or Uniform Resource Locator (http://www.islandnet.com/JenningsFlorists).

Or suppose you love tennis, and you want to collect and make available tennis news, equipment tips, and player information and set up an Internet tennis specialty shop. This is exactly what Tenagra Corporation decided to do; their WWW Tennis Server includes articles and links to tennis information throughout the Internet.

tennisserver.com

Or suppose your company sells computer software or hardware, and your phone lines are constantly busy with help questions and requests for the latest updates or pricing schedules.
Many top high-tech companies, including Sun Microsystems, IBM, Microsoft, and Novell, post technical notes, price lists, and even software upgrades on the Internet that their users can download immediately (see Figure 1-2, Sun Microsystems Web Site, and Figure 1-3, IBM Web Site).

    http://www.sun.com
    http://www.ibm.com
    http://www.microsoft.com
    http://www.novell.com

Or say you're a real estate broker, and you'd like to advertise your properties more widely. A company called Coolware has set up a site at which it is soliciting realtors and home owners to list real estate for sale. See Figure 1-4, Palo Alto Real Estate Web Site, for the sample of listings it has posted for the City of Palo Alto, California.

coolware.com real realestate.html

And the Encyclopaedia Britannica is starting to provide the full text of all its volumes across the Internet (see Figure 1-5, Britannica Online Web Site) for a fee. The company is marketing the service to colleges and universities but plans to make it inexpensive enough that individuals will subscribe.

eb.com

As you can see, it's a new world in publishing. You don't even have to have a computer if you want to hire someone to publish on the Internet for you.

Publishing on the Internet simply means putting information on one computer where it can be seen by others on the Internet. Although the commercial and marketing possibilities are creating enormous interest, the bulk of what is published today on the Internet is available for free. Companies are finding that traditional advertising techniques don't transfer well to the Internet; instead they are supplying detailed information and participating in technical forums in which their products are discussed.
A Quick History of Internet Publishing Techniques

Publishing on the Internet basically consists of making computer files available on one computer, usually called a server, and allowing others to view or download them via other computers, usually called clients.

Programmers and computer technical administrators have used this procedure for many years. In fact, Gopher, WWW, and WAIS take very similar steps behind the scenes as you browse the Internet, blissfully oblivious to the details.

File Transfer Protocol (FTP)

File Transfer Protocol, or FTP, is the original method of using the Internet to transfer files between different computer systems. FTP requires that you know the name of the computer to which you wish to connect and have a login ID and password for that computer.

For years anonymous FTP was the method of choice for publishing on the Internet. Anonymous FTP (a term that's used as a noun and a verb on the Internet) does not require the client to have an ID for the publishing computer in order to connect and download files.

Anonymous FTP allowed true publishing to occur on the Internet, because once you made your information available via anonymous FTP, anyone on the Internet could download it. This method proved so popular among the programmers and technical people (most Internet users in those days) that thousands of FTP sites (computers with files available for anonymous FTP downloads) sprang up over the years. The problem was finding the site that had what you wanted and then locating the information in the thousands of files that might be stored on that same computer.

Archie--Indexing FTP Sites

In 1990 Peter Deutsch and Alan Emtage, grad students at McGill University in Canada, came up with Archie, an interesting approach to solving this problem.

Net-wide Index to Computer Archives, rodent being a reference to the Gopher servers for Veronica indexes.
Deutsch and Emtage set up one computer to connect automatically to a certain number of FTP servers every night and download their directory structures and indexes. They added these indexes to a database and then allowed anyone on the Internet to connect to their machine and run a search program. The search program let the user search by file name and returned all the occurrences of a particular file name, complete with date, directory path, and FTP site address.

e-mail the results to you.com an excellent DOS file lister from shareware author Vernon Buerg .

Although Archie allowed users to quickly find the FTP sites with the most recent copies of particular files, it wasn't especially user friendly. While connected to an FTP site, you could make your way through the various directories by typing in the appropriate change directory (cd) command, but you couldn't view a file to determine whether it contained what you were looking for. For that, you first needed to download the file with the FTP get command and then look at it on your own machine.

bunyip.com:8000 products archie archie.html

Gopher and Gopher+

In 1991 the University of Minnesota made all this easier.

Short connections between the client and server programs (so the server could quickly handle a client's request and move on to the next)
Stateless connections (no memory of previous contacts between client and server, so the server wouldn't have to remember what stage each client was at)

This approach resulted in a menulike system that allows users either to see the contents of a file or to follow a link to other menus or files on another system. Thus was born the ability to browse the Internet, because Gopher allows you to read the text files you come across and, depending on your computer, to view pictures or hear sound embedded in the files you select.

The success of Gopher's simple interface led to explosive growth in the number of Gopher servers in the United States and then around the world.
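To make the idea of short, stateless connections concrete, here is a minimal Python sketch of a Gopher exchange as the published protocol (RFC 1436) describes it: the client connects, sends one selector line, reads the reply until the server closes the connection, and is then forgotten by the server. The host name gopher.example.edu and the sample menu are made up for illustration, and the fetch function is shown only to outline the exchange; it is not called here.

```python
import socket
from typing import List, NamedTuple


class MenuItem(NamedTuple):
    item_type: str  # one character, e.g. "0" = text file, "1" = another menu
    title: str      # the text the user sees in the menu
    selector: str   # string the client sends back to fetch this item
    host: str       # server that holds the item (may be a different machine)
    port: int


def parse_menu(raw: str) -> List[MenuItem]:
    """Turn the text of a Gopher menu reply into a list of menu items."""
    items = []
    for line in raw.splitlines():
        if line == ".":  # a lone period marks the end of the menu
            break
        fields = line.split("\t")
        if len(fields) < 4 or not fields[0]:
            continue     # skip lines that do not look like menu entries
        items.append(MenuItem(fields[0][0], fields[0][1:], fields[1],
                              fields[2], int(fields[3])))
    return items


def fetch(host: str, selector: str = "", port: int = 70) -> str:
    """One complete request: connect, send the selector, read, disconnect."""
    with socket.create_connection((host, port), timeout=10) as conn:
        conn.sendall(selector.encode("latin-1") + b"\r\n")
        chunks = []
        while True:
            data = conn.recv(4096)
            if not data:  # the server closes the connection when it is done
                break
            chunks.append(data)
    return b"".join(chunks).decode("latin-1")


# A made-up reply, in the tab-separated layout a server would send.
sample = ("1Campus Information\t1/campus\tgopher.example.edu\t70\r\n"
          "0About This Server\t0/about.txt\tgopher.example.edu\t70\r\n"
          ".\r\n")

for item in parse_menu(sample):
    print(item.item_type, item.title, "->", item.host, item.selector)
```

Because every menu line carries its own host and port, following a link is just another independent call to fetch, possibly to a different server; nothing has to be remembered between requests.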
Gopher's success also led to extensions of the original Gopher protocol to allow for such things as electronic forms, abstracts, alternate data formats (or views), and the ability to store meta-information, or behind-the-scenes information about the files (such as modification date, file size, language, and administrator of the file).

Veronica--Indexing Gopher Sites

The hundreds, then thousands, of Gopher servers springing up created the need to be able to find specific items among all the Gopher sites. At the University of Nevada, Reno (UNR), Steven Foster applied the Archie model to Gopherspace (the collection of all items in all Gopher servers in the world) and called it Veronica. That is, he set up a computer to connect to all the registered Gopher servers and directed it to follow their menus, collecting menu and file names as it went. The compilation was searchable, and what was especially nice was that the results showed up as one or more Gopher menus, which the user could follow directly to the files or Gopher servers themselves. As the obvious utility of this system became known, universities around the world began running Veronica servers, which soon forced each to introduce its offerings with an opening screen that allows the user to select a server and type of search to use. See Figure 1-9 for an example of the selection screen and Figure 1-10 for the results of a search on the word Gambia. The dynamic nature of this database meant that you could do a Veronica search one day and find three matches and then the next day do the same search and find 10 or 12 more, depending on what had been added.
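The collect-then-search approach described above can be sketched in a few lines of Python. This is only an illustration of the general idea of a title index, not a description of how Veronica itself was implemented; the titles, server names, and selectors are invented, and the collection step is replaced by a hard-coded list.

```python
from collections import defaultdict

# Made-up entries standing in for what a nightly crawl might collect:
# (menu title, server, selector)
COLLECTED = [
    ("The Gambia: Country Profile", "gopher.example.edu", "0/africa/gambia.txt"),
    ("Travel Advisory - Gambia", "gopher.example.org", "0/travel/gambia"),
    ("Senegal River Basin Maps", "gopher.example.edu", "1/africa/maps"),
]

PUNCTUATION = ".,:;-()"


def build_index(entries):
    """Map each lowercased title word to the positions of entries containing it."""
    index = defaultdict(set)
    for position, (title, _server, _selector) in enumerate(entries):
        for word in title.lower().split():
            word = word.strip(PUNCTUATION)
            if word:
                index[word].add(position)
    return index


def search(index, entries, query):
    """Return the entries whose titles contain every word of the query."""
    words = [w.strip(PUNCTUATION) for w in query.lower().split()]
    words = [w for w in words if w]
    if not words:
        return []
    matching = set.intersection(*[index.get(w, set()) for w in words])
    return [entries[i] for i in sorted(matching)]


INDEX = build_index(COLLECTED)
for title, server, selector in search(INDEX, COLLECTED, "Gambia"):
    print(f"{title}  ({server} {selector})")
```

Each result keeps the server and selector it came from, which is what lets the search results be presented as a menu the user can follow directly. Rebuilding the index after each crawl is also why the same query can return more matches from one day to the next.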
Suddenly, Gopher became much more powerful--instead of working your way through menu after menu, you could connect to a Veronica server and search all the menus and files for a specific word or phrase that might appear in their titles.

Wide Area Information Server (WAIS)

WAIS (pronounced "ways"), started in 1988, was an experimental project designed to come up with easier ways to search Internet files for content. The team was headed by Brewster Kahle, then of Thinking Machines Corporation (a producer of parallel-processing computers), in cooperation with Apple Computer, Dow Jones, and KPMG Peat Marwick. The goal was to create a way to easily search large amounts of text, images, or other files scattered among different computers. Based on the ANSI standard search-and-retrieval protocol Z39.50, then under development, WAIS allows users to type in a question in their natural language--it does not force the user to learn and use a particular computer language or syntax. WAIS then does the dirty work of translating the question into its own query format and sending it off to various WAIS servers on the Internet. The servers in turn search their full-text indexes and return to the user a list of hits, or matches, ranked by how well they match the original query.

The companies that had formed the development team made the details and program source code for WAIS publicly available because they realized that WAIS would be much more useful--and have a shot at becoming an industry standard--if it were commonly accepted and used by others across the Internet. Although WAIS as a front end, or client, hasn't caught on nearly as well as Gopher or WWW, it has become the search mechanism of choice, running in the background of other front-end systems.

World-Wide Web (WWW)

The World-Wide Web was invented by Tim Berners-Lee in 1989 in an attempt to efficiently store research data at CERN, the European Particle Physics Laboratory in Geneva, Switzerland.
Berners-Lee, a consultant with a background in text-processing software development, wanted a system that would make it easy for various researchers to build up separate bodies of information and then link them electronically by matching the real links in the information (such as going from a file about horses to more specific files about thoroughbreds, quarter horses, or Olympic three-day eventing). He based the system on the concept of hypertext, or text with links that can be followed electronically to other documents, files, sounds, images, or even programs. The main advantage, or power, of hypertext lies in its ability to link different pieces of information in simple ways, at exactly the spot at which you thought of the connection. In a way, hypertext links are like footnotes, except that they are easier to follow and can be of any length. For example, if this page were hypertext, it could have links to the history of hypertext, examples of hypertext, and even a video of someone discussing hypertext.

The World-Wide Web system is known by various names. WWW, W3, and Web are intuitive, but because it uses the HyperText Transfer Protocol (HTTP) and HyperText Markup Language (HTML), the servers are technically called HTTP servers.

One important contribution of WWW was the Uniform Resource Locator (URL). This address system allows you to declare, all on one line, the protocol (the type of connection, such as FTP, Gopher, and so on), the name and port number of the host computer (a port is like a doorway, or loading dock, of a computer), and the directory path and file name. Net users made it common practice to include a URL for their home page (a personal spot on the Web) at the end of their e-mail messages.

The other main innovation of WWW was HTML, or HyperText Markup Language. Briefly (Chapter 4 goes into more depth), HTML is a relatively simple set of codes that turns ordinary text into hypertext when viewed by a WWW browser.
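As a small illustration of both ideas, the Python sketch below holds a few lines of HTML in a string, pulls out the links the way a browser must before it can follow them, and then splits each URL into the parts just described. The page content and the example.com and example.edu addresses are invented for the illustration; the parsing uses Python's standard html.parser and urllib.parse modules.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit

# A made-up page: ordinary text plus a few codes (tags) that mark up
# a title, a heading, and two hypertext links.
PAGE = """<html>
<head><title>Horses</title></head>
<body>
<h1>Horses</h1>
<p>See also <a href="http://www.example.com:8000/horses/thoroughbreds.html">thoroughbreds</a>
and <a href="gopher://gopher.example.edu/11/horses">a Gopher menu about horses</a>.</p>
</body>
</html>"""


class LinkCollector(HTMLParser):
    """Collect the target of every <a href="..."> link in a document."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


collector = LinkCollector()
collector.feed(PAGE)

for url in collector.links:
    parts = urlsplit(url)
    # protocol, host name, port (None when the URL leaves it out), path
    print(parts.scheme, parts.hostname, parts.port, parts.path)
```

Note that the second link points at a Gopher server rather than a Web server: because the protocol is part of the address, one hypertext page can link to material published in several different ways.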
An HTML file created on a Macintosh will look pretty much the same when viewed on a PC or a UNIX workstation. The ability to move a single file between different types of computer systems and have it work the same way in all of them is called portability, an essential ingredient when you want the whole world to use something. Although WWW was seeing some success--the number of servers was increasing steadily--the creation of a WWW browser called Mosaic by the National Center for Supercomputing Applications (NCSA) in Champaign-Urbana, Illinois, led to an explosion of interest in WWW. This program, which was being given away free, was created first for X-Windows on UNIX and soon after for Macintosh and PCs running Microsoft Windows. Gopher was nice and easy to use, but Mosaic was fun and much more impressive in what it could display, mostly because publishers could interweave text and images in documents to create the equivalent of a glossy brochure distributed on the Internet. Among the more popular early demonstrations of WWW was a tour of the Krannert Art Museum at the University of Illinois. Gopher could provide the same information but as menus, and the user could look at one element at a time but not the whole thing. That is, with Gopher you keep coming back to a menu from which you choose a text file, an image file, or a sound file.

This difference has a big impact on the look of a document browsed on the Internet. Imagine a colorful WWW page with buttons or links that play sound files or run movie files as soon as you select them. But WWW and Mosaic were designed in high-intensity computing environments, in which all participants had fast Internet connections and sophisticated workstations (although they will work on a 386 PC running Windows with a 14,400-bps modem).
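To get a feel for what a modem of that speed means for a page, here is a rough back-of-the-envelope estimate in Python. The file sizes are invented, and the calculation assumes 10 bits on the line for every byte sent (8 data bits plus start and stop bits) and ignores compression, protocol overhead, and line noise, so treat the results as ballpark figures only.

```python
def download_seconds(size_bytes: int, line_bps: int, bits_per_byte: int = 10) -> float:
    """Rough transfer time for a file over a serial modem link.

    bits_per_byte=10 is an assumption (8 data bits plus start and stop bits).
    """
    return size_bytes * bits_per_byte / line_bps


MODEM_BPS = 14_400

# Hypothetical parts of a home page and their sizes in bytes.
PAGE_PARTS = {
    "text (HTML)": 4_000,
    "logo image": 30_000,
    "photograph": 120_000,
}

for name, size in PAGE_PARTS.items():
    seconds = download_seconds(size, MODEM_BPS)
    print(f"{name:12s} {size:>8,d} bytes  about {seconds:4.0f} seconds")

total = sum(PAGE_PARTS.values())
print(f"{'whole page':12s} {total:>8,d} bytes  about "
      f"{download_seconds(total, MODEM_BPS):4.0f} seconds")
```

Under these assumptions the text arrives in a few seconds, while the images account for nearly all of the waiting, which is the reason to keep an eye on image sizes when many readers dial in.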
When you start publishing on the Internet, it is good to remember that many people who want access to your information may in fact have slow Internet connections and not very powerful computers. If your company's home page takes 10 minutes to download and can only be appreciated on a top-of-the-line workstation, you may not get the audience you want. The reality of WWW is that graphic presentations are more attractive and exciting than plain text, but people won't see them if it takes five minutes or more for those images to appear on their screens. Your images won't slow down text-only Web browsers like Lynx (because Lynx simply ignores them), but those using text-mode browsers won't see and enjoy your images.

Summary

Internet publishing consists of posting or putting material on the Internet where it can be viewed or downloaded from other computers. This process has evolved both in the user interface and in the tools that have been built to enable people to find that material. We've covered a lot of ground in this chapter, so let's review the terms--you're going to need them to understand the rest of this book:

FTP (File Transfer Protocol) is a tried-and-true method for transferring files, but users have to know which computer has the directory and file they want, and the process is confusing to the novice.

Archie is a program that actively goes out and indexes all the known anonymous FTP servers and makes that information freely available via the Archie client program.

Gopher presents menus that lead to files that you can view easily, as well as links to other servers. The links are easy for the novice to follow, but with thousands of Gopher servers all over the world, finding what you want is often a problem.

WAIS creates indexes of the full text of files, which it then allows you to search by writing a simple query in natural language. It then applies that query to a collection of full-text indexes, supplying you with a list of those documents that most closely match.
For example, a search on wine might list hundreds of documents that you can then narrow down to those wines from California or Chile or France.

Unfortunately, it does not lend itself to Veronica or Archie searches, although new indexing methods are being developed with a great deal of success.

Three of the most popular publishing protocols are Gopher, World-Wide Web (WWW), and Wide Area Information Server (WAIS). They are three different protocols, or techniques for doing basically the same thing--locating a computer that has something you want, connecting to it, finding the file you're interested in, and then downloading or viewing it. Each consists of computers that act as servers, which wait for requests from other computers (clients), and then send the files you request back over the Internet.

This chapter discusses the groundwork you'll need to do and some of the steps involved in publishing on the Internet. Take a look at the list in the sidebar on page 20, and then we'll go through the steps one by one. Later, in Chapters 3, 4, and 5 (on Gopher, WWW, and WAIS), we'll go through these steps in detail for each type of Internet server and use examples. You need not go through all these steps now, and they are not meant to scare you off or make the process seem intimidating. Instead, they are meant as a guideline or sort of check list for you to use to keep track of all the various elements involved.

Joining the Internet Community

Think of the Internet as a place to which you are moving and that you want to fit into. Your first job is to join the Internet community in the sense of learning how to make your way around and how to fit in. On the Internet you'll find a community of individuals, companies, and organizations that are discussing issues, modifying software, and creating new tools and techniques for publishing on the Internet, as well as publishing the most amazing variety of information.
Keeping abreast of their activities through Usenet newsgroups and e-mail mailing lists and tracking the relevant subjects through Gopher and WWW indexes are essential parts of your publishing process because, like it or not, you are joining a community, and you need to be aware of what's going on out there. The first step is to start getting to know the people, organizations, and issues that shape what you do on the Internet.

Now we'll go through the kinds of resources you'll come across and explain them in a little more detail. Later on, in Chapters 3, 4, and 5 (on Gopher, WWW, and WAIS), you'll find tables of resources for each kind of server.

Usenet Newsgroups

Usenet News is an extremely large, international, cooperatively run system for exchanging messages.

clarinet.com that provides Associated Press and Reuters newsfeeds through Usenet News for a fee.

Disk space often is the limiting factor because a full Usenet newsfeed, consisting of all messages from all newsgroups all over the Internet, generates more than 5,400 megabytes per month.

A newsgroup is basically a subject area that is created so that those who are interested in that subject may exchange messages.

newusers.questions and news.announce.newusers. Other categories are often specific to a country or a university.

News-reading programs exist for almost any computer platform conceivable--from mainframes to plain vanilla DOS PCs. I saw more than 8,000 newsgroups when I did this last at the University of California at Los Angeles, but UCLA has a Clarinet feed, which adds several thousand newsgroups. You then select those newsgroups to which you wish to subscribe. There's no charge unless you license a commercial service such as Clarinet. This isn't a permanent decision--it just helps narrow the focus a little.
Once you've subscribed to some newsgroups, you'll see (depending on your news reader) a number of threads, or discussion chains.

unc.edu usenet-b home.html

Usenet newsgroups are an extremely dynamic source of information on the Internet. Asking questions is fine after you've monitored the group for a little while and looked at any FAQ (Frequently Asked Questions) files available. Alternative newsgroups are the easiest to create, and they are often formed as an immediate reaction to something, either a new piece of software or a new protocol or just an idea. Later, if a subject gains wider interest, it might be voted in as an official newsgroup in comp or one of the other domains.

Here are some newsgroups to check out.

infosystems discusses various types of information servers, past, present, and future.
infosystems.announce lists new information servers.
answers is where many newsgroups post their FAQs and other documentation files.
answers is a good way to keep track of at least some ways in which the Internet is growing.

FAQ (Frequently Asked Questions) Files

FAQ files are often extremely useful repositories of beginner and intermediate questions in different areas. They are usually composed by the regular readers of a newsgroup, so that habitués aren't constantly distracted by questions from newcomers.

FAQs pop up on the most amazing subjects, and they are periodically reposted to the Usenet newsgroup alt.answers. Check out Figure 2-1 for the first of 28 screens of the FAQ archive at the Massachusetts Institute of Technology (MIT).

mit.edu pub usenet news.answers .

RFCs

RFCs (Requests for Comment) are a valuable resource for the more technically inclined; I mention them here because they demonstrate an important part of Internet culture: the development of new technology through open and widespread discussion.
First, someone develops an idea (which could be a new protocol, standard, or even a new type of service or specific tool); when they start to get serious about it, they write up an RFC document, which describes the idea in detail.

Eventually, either the idea dies because of lack of interest, or it becomes a recognized guidepost ("standard" has a more official meaning) for Internet programmers, developers, and users.

RFCs are available online via Gopher, FTP, and e-mail.

internic.net:70 00 rfc rfc-retrieval.internic.net rfc .internic.net or call 800-444-4345 (choose prompt 3 from the InterNIC voicemail menu).

Mailing Lists

Mailing lists are e-mail-based systems for carrying on discussion groups. Note: You'll hear these referred to as list servers or listservs as well, but list servers are actually the hardware and software for mailing list administration. Basically, a computer is set up to accept mail at a certain address and redirect it to all members of the list. One advantage of mailing lists is that you will get the messages whether you look for them or not, usually within a few hours of when they are sent.

Additionally, most mailing lists have archives, in which all their correspondence is saved. When you join a mailing list, you usually will receive a message that describes all the commands available for that list. If it doesn't, or if you don't want to subscribe to the list, try searching some Internet indexers to see if someone has put together an archive for that particular mailing list.

internic.net--This announcements-only mailing list is for anyone interested in getting current information about InterNIC and the services it provides.

internic.net, and in the body of the message type Subscribe announce your name .

internic.net--This moderated list (each message is approved or controlled by the moderator) is intended for staff at midlevel, campus, and discipline-specific network information centers (NICs).
Content includes InterNIC services aimed specifically at NICs, and mail sent to this list will be oriented to providing services, including new training resources and documentation, to end users.

internic.net in the body of the message type Subscribe nics your name .

internic.net--This announcements-only list is a group effort of people all over the Internet to concentrate announcements of new resources on one list. If you would like more information about participating in the Net resources list by becoming a monitor, send a note to scout@internic.net.

internic.net; in the body of the message type Subscribe net-resources your name .

RFC announcements are distributed via two mailing lists: the IETF-Announce list (IETF stands for Internet Engineering Task Force) and the RFC-DIST list.

reston.va.us.DDN.MIL.

Magazines and Journals (Print)

Newsstand and postal subscription magazines and journals that focus entirely on the Internet are starting to arrive on the scene, and established magazines are paying more attention to the Internet and the subject of cyberspace.

Internet Business Advantage com.

Internet Business Journal phoenix.ca sie iar-home.html .phoenix.ca sie ibj-home.html ; 613-565-0982; fax 613-565-4433; e-mail strangelove.comsubscriptions.

Internet World com; http://www.mecklerweb.com

NetGuide bills itself as the Guide to Online Services and the Internet. It is available at newsstands.

cmp.cmp.com net .

Wired com for details.hotwired.com .S.; fax 415-222-6399; e-mail talkzsubs wired.com.

Online Magazines and Journals

Online magazines are growing rapidly. In the list that follows I've included both print magazines that are starting to offer some or all of their text online and true online magazines and journals that are published entirely on the Internet.
Some are specifically oriented to Internet publishing issues, whereas others are computer magazines that talk about Internet issues, among other things.

Computer-Mediated Communications Magazine
It is edited by John December, co-author of World Wide Web Unleashed, and reports on people, events, technology, public policy, culture, practices, research, and applications of computer-mediated communication.
unc.edu cmc mag current toc.html

GNN (Global Network Navigator)
This WWW site developed by O'Reilly and Associates (publishers of Ed Krol's Whole Internet User's Guide and Catalog) is one of the best and most popular examples of Internet publishing.
digital.com gnn GNNhome.html

HotWired
These include technology, way new journalism, the arts, commerce, and electronic conversation.
wired.com

Public-Access Computer Systems Review
This free journal published by the University of Houston is primarily aimed at librarians and others who maintain publicly accessible computers.
lib.uh.edu:70 11 articles e-journals uhlibrary pacsreview

San Jose Mercury News
This daily newspaper in California's Silicon Valley has tons of computer industry and Internet news and is available by online subscription.
sjmercury.com

St.times.stpete.fl.us default.html

TechWeb offers links to many of the CMP magazines, including Communications Week, Comm Week International, Computer Reseller News, Computer Retail Week, Electronic Buyers News, Electronic Engineering Times, Home PC, Information Week, Interactive Age, Internet Business Report, Netguide, Network Computing, OEM Magazine, Open Systems Today, VAR Business, and Windows Magazine.
The search feature is WAIS based, and you can choose all magazines or just specific ones and then perform the search.
techweb.com

Ziff-Davis Publishing
This site offers links to many of the Ziff-Davis magazines, including PC Magazine, PCWeek, PC Computing, MacWeek, MacUser, Computer Shopper, and Windows Sources.
ziff.com

Also see this index of online journals: http: www.w3.org hypertext DataSources bySubject Electronic Journals.html .

Internet Sites and Services of Interest to Internet Publishers

Look around the Internet for relevant standards and operating procedures. The point is to be a good Internet citizen.

The following sites offer Internet guidelines and reference information that you might find useful:

InterNIC
The service areas are Information Services (run by General Atomics), Directory and Database Services (run by AT&T), and Registration Services (by Network Solutions, Inc.).
internic.net or http://www.internic.net

One particularly useful site provided by InterNIC is its archive, Internet Documentation (RFCs, FYIs [For Your Information files], and so on).
internic.net:70 11 .ds .internetdocs

Quality, Guidelines, and Standards for Internet Information Resources
This is a collaborative gathering of thoughts and ideas on the subject of improving information servers of all kinds. Among other lists and documents to which it provides links is Top Ten Things Not to Do on a Web Page.
http://coombs.anu.edu.au/SpecialProj/QLTY/QltyHome.html

The Directory of Electronic Journals, Newsletters, and Academic Discussion Lists
cni.org:70 11 scomm edir

Organizations and Associations

You may not be a joiner, but you should be aware of these organizations, because they will be helping to shape the Internet in the years to come:

The Internet Society
This international nonprofit society's principal purpose is to maintain and extend the development and availability of the Internet and its associated technologies and applications--both as an end in itself, and as a means of enabling organizations, professions, and individuals worldwide to more effectively collaborate, cooperate, and innovate in their respective fields and interests. It offers both individual and organization memberships.
isoc.org
gopher://gopher.isoc.org
ftp://ftp.isoc.org/isoc

The Electronic Frontier Foundation (EFF)
The EFF is a nonprofit civil liberties organization working to protect freedom of expression, privacy, and access to online resources and information.
It was founded in July 1990 by John Barlow, Mitch Kapor (founder of Lotus), Steve Wozniak (cofounder of Apple), and others to ensure that the principles embodied in the U.S.
S.eff.org .org.

Internet Engineering Task Force (IETF)
The IETF is a large open international community of network designers, operators, vendors, and researchers concerned with the evolution of Internet architecture and the smooth operation of the Internet.
ietf.cnri.reston.va.us

The World-Wide Web Consortium (W3C)
W3C operates under the leadership of Tim Berners-Lee, the author of WWW, and was formed to document and encourage the development of a common set of tools and basic programs for continued WWW development.
w3.org

Electronic Privacy Information Center (EPIC)
EPIC, based in Washington, D.C., is a public interest organization established in 1994 to focus public attention on emerging privacy issues relating to the National Information Infrastructure (better known as the Information Superhighway), such as the Clipper Chip, the Digital Telephony proposal, medical records privacy, national identification systems, and the sale of consumer data. EPIC publishes online newsletters and reports, pursues litigation under the Freedom of Information Act, and conducts policy research on emerging privacy issues.
digicash.com epic or send e-mail to info@epic.org.

Internet Indexers and Directories

Many excellent books and online guides are available if you need help finding what's out there.

Make notes as you search through this material, because later you will want to be sure that your service is listed in these indexes, directories, newsgroups, and mailing lists.

Also, learn from your competitors. Pay attention to how servers are set up and how documents are arranged, as well as what special features they provide to their users.
For example, many sites allow you to search by key term, but one equipment manufacturer allows you to search by picture.

Pay attention to feedback, searching, and charging mechanisms to see how easy and intuitive they are to use, and note which charging systems each site has chosen.

Netiquette

Netiquette, short for network etiquette, is important. Likewise, it is far easier to take offense because online interaction excludes nuances of rhythm, mood, and context and all body language. Although flame wars (angry messages are called flames) have often broken out on unmoderated mailing lists and Usenet newsgroups, no one encourages the practice. Watch (or lurk) for a while in any mailing list or newsgroup to get a feel for the tone and to avoid asking newbie (newcomer) questions. No matter how strongly you feel about a subject, it is not appropriate to send information or opinions about it to unrelated groups.

Most books about exploring the Internet talk about Internet manners, but you might want to look at Netiquette by Virginia Shea, published by Albion Books.

bookport.com Albion catNetiquette.html

Identifying Your Competition

A word about competition might be appropriate here.

Competition may not be quite the right word in your case, depending on the nature of the server and service you want to establish. Ideally, your server can work in combination with others to provide broader, more detailed resources to the Internet community instead of striving to kill someone off. Remember, the Internet has reached today's interesting state only because of the incredible spirit of cooperation that it has somehow inspired. Finding what's out there in your area of expertise is essential. For example, suppose for some reason you decided that your goal in life was to put up a WWW site dedicated to the comic strip Calvin and Hobbes by Bill Watterson.
Once you were finished, wouldn't you be surprised to learn that five other sites (one each in the United States, France, Holland, Sweden, and Norway) already are dedicated to Calvin and Hobbes?

eng.hawaii.edu Contribs justin Archive Index.html

Defining Your Goals

What can you sensibly hope to attain by publishing on the Internet?

Or you might want to publish on the Internet because doing so decreases the amount of time your staff spends searching for or providing information, because you want to experiment with new technology that may benefit you or your company, or because you want to sell things and make money. One advantage of defining your goals early is that it enables you to look at how others are achieving similar goals. You can become more analytical as you scour the Internet, noting the techniques each site uses and adding them to your repertoire. And I should say something about experimenting and keeping your goals loosely defined. More than one company has found that a side benefit of publishing on the Internet became more important and potentially more profitable than its original goal.

, a builder of electrical connectors, started with the idea of putting its catalog of 80,000 parts on CD-ROMs. In the process the company developed a unique scheme for finding exactly the right part by picture, name, or part number.

Identifying Your Audience

What do you know about your audience? If you want to publish information about Macintosh computers, you'll be able to store files in Macintosh formats and ignore other file formats. On the other hand, if you want all those interested in political science to read your work--no matter whether they're coming from a Macintosh, Sun Microsystems workstation, PC, or mainframe--you should plan to put all your documents in a portable format such as plain text or HTML or in a variety of formats. Who is going to be happy to see your information on the Internet?
If the members of your audience tend to flock to certain newsgroups or e-mail lists, you'll want to be sure to announce your server in those places. If someone is already doing this, you might consider offering to mirror, or duplicate, their service as a way to decrease the load on their server or staff.

LOCATION

For example, if you want to sell a service to an institution with a clearly defined range of Internet Protocol (IP) addresses, you could rely on the IP screening capabilities in almost all Gopher, WWW, and WAIS servers. However, if your users will be scattered all over the Internet, you might have to use individual password-based authentication instead, which is technologically more difficult and requires much more administrative overhead.

EQUIPMENT

Do you expect that your audience will primarily be dialing in from home, or will your clients have direct cable connections to the Internet? For dial-ins you'll want to make sure that downloading information over a slow modem does not take forever (more than a minute or two), and if it does, perhaps you should rethink the information you want to display.

CHARACTERISTICS

SECURITY AND ACCESS

What If I Don't Know?

If you don't know what kind of audience to expect, you should consider adding a poll to your server. That way you can learn more about the people who connect to your server and decide later whether to focus more narrowly or broadly. Don't forget the possibility of contacting other, more experienced Web, Gopher, or WAIS providers to find out what they think you might expect.

Identifying Your Data

The point I'm trying to make is that you need to identify and analyze the type of information files you want to make available on the Internet. Will you be making announcements, offering catalogs, posting press releases, product information, technical documents, collections of quotations, collections of photos, previews of movies, or the latest stock market quote?
Will it be plain text, formatted text, hypertext, sound, picture, video, program to be run, program to be downloaded, or something else? In addition, think through what happens after your server is set up. If your data are timely, you should plan to update regularly, and be sure that the date you last updated the file is a prominent feature.

Another part of identifying your data consists of finding and recruiting information providers, or those people or organizations that will provide you with what you plan to publish. For example, you might provide company press releases and arrange for the public relations departments to supply them to you with dates for posting and removal.

Identifying Your Needs

Internet publishing can be done with little cost and time, or it can be a massive million-dollar effort.

Hardware

Given the kind of information you want to publish and the size of the audience you think you might expect, what kind of computer system is available to you to meet those needs? For example, you might start out with an inexpensive PC running Windows or with a Macintosh. These systems are familiar to many and are easy to set up to run Gopher or Web server software (or both in combination). Note that larger, more powerful UNIX-based machines are getting easier to use. Sun Microsystems and Silicon Graphics are starting to bundle some models with Web server software and management tools to simplify server administration. See Chapter 11 for a discussion of future trends, but expect that you may end up using a combination of servers of all shapes and sizes.

Network Connection

Do you have a dedicated Internet connection, or do you dial in for Internet access? In the latter case, you wouldn't want your home machine to be your server, because it wouldn't be available most of the time.

Software

It may be obvious to you which server (Gopher, WWW, or WAIS) or combination you want to use, but review your decision before you take the plunge.
Look through Chapters 3 (Gopher), 4 (WWW), 5 (WAIS), and 6 (Other Tools)--including the servers you don't intend to use--and take a particularly close look at the examples in Chapter 10 for what others did in similar situations.

Human Resources

Do you have the expertise and time in-house to comfortably run your server? Depending on the size and complexity of the computer system you'll use for your server, you may need to recruit the assistance of a system administrator. Some of the Gopher, Web, and WAIS software discussed in this book requires a system administrator to get it set up. You will also need the person or persons who do most of the work involved in collecting, organizing, and preparing the data and documents that go on your server. They don't need much system experience at all, because the process of loading the data into the correct directories can be made quite simple. You may indeed want everyone in your organization to add and keep track of their own files on your Gopher, WWW, or WAIS server. However, you should plan to entrust someone with the overall structure and organization of your server, and that person would be the data librarian.

Make sure you have the technical support you'll need. If you plan to run a server on a system with which you are not familiar, and your system administrator is too busy to help, you should think carefully about what support you'll need and where you'll get it. Check the documentation and the newsgroup or mailing list for the type of server you plan to use to get an estimate of the technical expertise it requires. It is possible, even preferable, to have your system administrator set up and run the server, then give control of various sections to the data owners or maintainers. Remember, one goal of each of the three main server protocols, Gopher, WWW, and WAIS, was to increase the ease of, and access to, Internet publishing.

What level of staff effort will be required to keep your information current?
Often this isn't an issue, because you'll simply put information on your server and replace it every few months or so when it gets updated. But if your company's reputation depends on the timeliness and accuracy of the information your server provides, it would be a good idea to make sure that the burden won't be overwhelming. Will you need the services of a graphic designer? Will you need management to change policies or procedures in order to get the information and updates your server will need?

Identifying Available Resources

When identifying hardware for use as a server, think about a backup. Don't underestimate the time and learning involved. This will be a learning experience for your whole crew, so be sure that they and you are up for it.

Determining How to Make Up the Difference

If you don't have the right computer equipment available, it's often possible to piggyback on another server. One advantage of Internet publishing is that, for the most part, it requires no specialized skill to format and prepare the information. For that reason you might include a broader range of your staff and personnel as data suppliers than you might first have expected.

Identifying Charging Mechanisms

How to charge for information--if at all--is an area that is changing rapidly. You don't have to give away your product to be part of the Internet community, but try to offer more than just advertising and order forms. This is the strategy that GE Plastics takes in providing hard technical information about its products and its design guides on its site, which has no ordering or charge-account mechanisms. ge.com gep homepage.html

Microsoft provides access to much of its technical support Knowledge Base via the Web, although it also sells the information on CD-ROM. microsoft.com pages kb kb.htm

Putting the information on the Internet also provides customers with an alternative to Microsoft's phone and fax support.
Or you might use your company's server to sponsor a community organization or just provide information about your area as a gesture of community support. But if you absolutely have to charge, some techniques are in use and others are in the advanced planning or testing stages. This is an area that is changing rapidly, so pay close attention to the sites and discussion groups you find that deal with this.

Identifying Security Risks

Methods exist for limiting access to all or part of your server from certain Internet subnets. Be aware of this possibility when laying out your server, because you might want some sections to be available only to those on your system and others to be available to everyone. Basically, this means that any gateways (for example, proxy, caching, and e-mail) to your Gopher, Web, or WAIS server should block out the same IP addresses that your server restricts.

The extent to which security is an issue for you depends on the nature of your organization and computer system. A firewall is a method of isolating your company's or organization's computers behind a computer that acts as a gatekeeper, or firewall. All outgoing requests for information or services go to that one machine, which hides the sender's machine address but passes on the request. For lower security you still must ensure against unauthorized alteration of your server files and operating system. Several software developers for Gopher, WWW, and WAIS have already started posting detailed analyses of their server software's security holes and risks, as well as their recommendations.

Estimating Your Costs

Like anything else, what you will spend depends on what you want to do and, in this case, how many people will come to see what you've done (i.e., browse your Gopher, Web, or WAIS server). Many people mistakenly assume that because dial-up Internet-browsing accounts can cost as little as $15 per month, Internet-publishing accounts are just as inexpensive.
The type of Internet access account needed to do what's described in this book is much more expensive, sometimes running hundreds or even thousands of dollars per month, with additional hardware and line installations adding to the cost. That higher price has to do with the types of service available in your area and the Internet throughput (size of the pipeline) you'll need.

The Gopher, WWW, WAIS, or other server software you may use varies widely in price as well. Some of it is available for free, and other programs cost $5,000 to $15,000 plus annual maintenance charges of $1,000 or more.

The type of computer you use will also be an expense, but thankfully there is a movement toward using less expensive PCs and Macintoshes in the place of the larger, more support-intensive UNIX machines.

Don't let these costs scare you off. Nineteen percent of respondents expected to spend less than $5,000, 13 percent said more than $45,000, and 35 percent didn't know.

Planning the Presentation of Your Data

A dedicated individual, or a team of editorial consultants, graphic artists, and programmers, can develop a sophisticated, well-organized information server. Christine Quinn, Stanford University's director of electrical engineering for computer and network services, describes Stanford's attempts to put a consistent Stanford face on its collection of Web servers in an article titled "From Grass Roots to Corporate Image--The Maturation of the Web," which she presented at the 1994 WWW conference in Chicago. Because Stanford was looking for a unifying theme, the university hired a graphic artist to insert pictures of Stanford's famous architecture in its home pages.
Quinn says that if you're ready to spend thousands of dollars on a brochure that will be seen by hundreds, why not spend an equal amount to do it right on the Internet, where it will be seen by thousands? ncsa.uiuc.edu SDG IT94 Proceedings Campus.Infosys quinn quinn.html

Dale Dougherty of O'Reilly and Associates, publishers of UNIX administration books, describes Internet publishing as a collaborative art. At the 1994 WWW conference in Chicago he recalled how his company's extremely popular Global Network Navigator (GNN) site needed more editorial and production people than technical staffers. According to Dougherty, a publisher organizes an audience, and publishing on the Internet has the same requirements as any publishing enterprise: Editorial requirements include a point of view, goals, the writing and editing of new material, and the updating of old material. Even the presentation of your menus and the design of your hypertext require consideration.

How the information is presented--known on the Net as the metaphor that is used--can make a big difference in how easily and quickly newcomers learn their way around your information server. A metaphor is a way to describe the information server or how it is arranged by comparing it with something commonly understood. A book is another metaphor; although a book can be emulated in hypertext and has certain advantages, the flexibility of Internet servers may lead to even easier ways to organize information. The Open Software Foundation Research Institute, a nonprofit research and development organization based in Cambridge, Massachusetts, has collected ideas, analogies, and metaphors that might be useful in describing data and its relationships. osf.org:8001 www InfoPresForm

Some of the ways of organizing information that researchers at the institute have come across on the World-Wide Web are shown at http: riwww.osf.org:8001 www InfoPresForm results.html .
The structure of your organization might provide an obvious structure for your server, but you may prefer a more subject-oriented model based on the main areas your server covers. If a company is composed of different divisions, its server might be split up that way as well, with branches leading off to areas that represent each division. If a company is global, the server might have links leading to the different geographic regions and then to the countries and company offices within those regions. If a company or organization offers a variety of services or functions, it could split its Gopher or Web server along those lines.

Extensive testing is one of the best ways to make your server and data presentation user friendly. Links can die or change on you--that link you thought was fascinating and pertinent two months ago may have changed focus or moved to another site. Also test your server's presentation by asking novice users to explore it, and then ask them to summarize what your server contains.

Always keep in mind that users are a resource. On the Internet the bias is toward criticism and feedback (with the assumption that you'll fix what's wrong), so take advantage of that, as well as the ideas your users may provide. In their travels through the Internet they may have come across a design or presentation solution that could be useful to you.

Another method is to try different approaches to presenting the same data and then use your system's logs to determine which route browsers use most. Different people tend to learn in different ways. Keep these styles in mind when designing your server.
For example, you could present a single concept with a written explanation, a diagram, a step-by-step example, and a discussion of the benefits and implications.

Preparing Your Data

Gopher servers generally need less data preparation than WWW and WAIS servers.

Acquiring the Material

You should analyze the need for

Formatting the Material

Formatting concerns include
1, Word for Windows 2.0, PostScript, and so on
Whether those formats exclude use of those files on other types of computers, such as Macintosh, UNIX, VAX, and IBM mainframe

Access to the Material

In determining which image file formats to use, you should ask whether

Future

You also need to consider your needs in the future: Do you need to convey information that isn't yet standard on the Internet, such as tables and formulas that aren't easy to format on Web servers using HTML 2.0?

Indexing Your Data

You might consider indexing the contents of your Gopher and WWW servers (WAIS servers have built-in indexing), although it's not required or even always necessary. If you have just a few documents and your menus aren't very deep, users are unlikely to get lost or miss items on your server. However, if you have an extensive menu structure, a large collection of documents, or large documents, you should consider an index. The usual practice is to place a search link at the beginning of your opening screen that offers users the chance to search the contents of your Gopher or WWW server. Users then can search quickly through the index for what they want, instead of working their way through various submenus or hypertext links. If you've ever been unable to find something on a server that you knew was there, you'll appreciate what a convenience the ability to search the contents of a server can be. Jughead, developed by Rhett "Jonzy" Jones, a programmer at the University of Utah, is one tool available for indexing UNIX Gopher server menus.

You also can index the full text of documents on your Gopher and WWW servers.
Full-text indexing means that every word in each document is indexed, not just the menu description line (for Gopher servers). For example, if you have a document summarizing the history of the conflict in the Middle East and it mentions Syria, the menu line might say only Middle Eastern Conflict--History. Someone searching a menu-only index for everything she can find about Syria won't turn up that document.

You'll want to think about what kinds of searches your users might make and whether you'll need sophisticated indexing features such as structured field searching, which means that users can search fields or certain parts of the text, including title, author, and date. Whenever the text information you are publishing is so big that searching on a simple word or words fails to narrow the field, you need to consider whether fields are available and field searching is necessary.

Menu and full-text indexing are not the only kinds of indexing. If you plan to provide a large number of nontext files, such as programs, pictures, sound, and video, you will want to index them. You might index only their file names (or menu items), but if you have explanatory blurbs about each one, you will want to index the explanations so that indexes retrieve the nontext file as well.

External indexers are usually programs that attempt to go out and collect information from all over the world and then make that index available from one server. External indexers are a crucial element in indexing Gopher and Webspace (Webspace means the content of all the servers browsable with WWW browsers). The best way to do this is to look at the descriptions in Table 2-1, follow the links to each one, and then read more about them. Then you should register your server with all those indexes that you deem appropriate if you want your server to be truly internationally available.

Indexing has many advantages for both you and your users.
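To make the difference between menu indexing and full-text indexing concrete, here is a minimal Python sketch. It is only an illustration, not how Jughead or any real indexer works: the file name, the menu line, and the document body are made-up sample data, and build_index and search are hypothetical helper names.

```python
import re
from collections import defaultdict


def build_index(texts):
    """Map each lowercase word to the set of file names whose text contains it."""
    index = defaultdict(set)
    for name, text in texts.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(name)
    return index


def search(index, word):
    """Return the sorted file names that contain the given word."""
    return sorted(index.get(word.lower(), set()))


# Hypothetical sample data: one menu line and one document body for the same file.
menu_lines = {"mideast.txt": "Middle Eastern Conflict--History"}
bodies = {"mideast.txt": "A summary of the conflict, including the role of Syria."}

menu_index = build_index(menu_lines)       # indexes only the menu description
full_text_index = build_index(bodies)      # indexes every word of the document

print(search(menu_index, "Syria"))         # the menu-only index misses the file
print(search(full_text_index, "Syria"))    # the full-text index finds it
```

The same inverted-index idea would extend to nontext files by indexing their explanatory blurbs under the file's name. A production indexer would add features this sketch leaves out, such as field searching and incremental updates.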
An index can lessen the load on your server and ease the frustration and complaints users have about getting lost in large servers. But indexing can be difficult to install and complicates server administration, one responsibility of which is ensuring that the indexes are always up to date.

Registering or Announcing Your Server

Gopher, WWW, and WAIS all have free central registries that attempt to keep track of all the servers in the world. See the indexing sections of the Gopher, WWW, and WAIS chapters (Chapters 3, 4, and 5) for a list of these services and an explanation of their philosophies and indexing techniques. If your server performs a service or acts as a resource for a certain population on the Internet, post information about it to the relevant Usenet newsgroup and e-mail the appropriate mailing list. The easiest way to find related sites is to search the Internet directories and indexes using terms that describe your own server.

Maintaining Reliability

A reachable site is a well-maintained site. If your Gopher, WWW, or WAIS server is often down, unavailable, or filled with links that don't work or data that are out of date, your company's reputation could be hurt. I've even heard of someone who set up a machine to do this monitoring and page him if something is down. Links can be incorrect if they were entered incorrectly and never checked or if they changed without notification. A server has no way of knowing all the sites that link to it, so posting notices of changes to everyone who needs them is impossible, although some server administrators attempt to notify all sites they are aware of.
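The kind of routine link check described above can be sketched in a few lines of Python. This is an assumption-laden illustration rather than a description of any existing tool: the function names are hypothetical, the host names are placeholders, and the check only confirms that a host answers on its port, not that the linked item still exists there.

```python
import socket


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unknown host names.
        return False


def find_dead_links(links, probe=can_connect):
    """Return the labels of links whose host and port cannot be reached.

    links is a list of (label, host, port) tuples; probe is any callable
    taking (host, port) and returning True when the target answers.
    """
    return [label for label, host, port in links if not probe(host, port)]


# Hypothetical links, checked with a stand-in probe so the example
# runs without touching the network.
links = [
    ("Campus phone book", "gopher.example.edu", 70),
    ("Weather maps", "old-host.example.edu", 70),
]
reachable = {("gopher.example.edu", 70)}


def fake_probe(host, port):
    return (host, port) in reachable


print(find_dead_links(links, probe=fake_probe))
```

Run on a schedule with the default probe, a script along these lines could feed the sort of monitoring and paging arrangement mentioned above.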
Some free UNIX programs for Gopher and WWW servers that automatically attempt to verify that all your links are working include:
Go4check-- gopher: tjgopher.tju.edu 00 networks internet tools gopher go4check
Anchor Checker-- http: www.ugrad.cs.ubc.ca spider q7f192 branch checker.html
Link Verifier-- http: wsk.eit.com wsk dist doc admin webtest verify links.html

You should consider mirroring to another site or backup server if you expect to have a popular server. The Internet Movie Database started out in Cardiff, Wales, and now has mirrors in Australia, the United States, Germany, and Japan.

In another mirroring arrangement one site maintains several machines in parallel, each acting as the Gopher or WWW server. This is what the University of Minnesota Gopher crew had to do to handle the extreme load on its Mother Gopher server (because Gopher was invented there, the University of Minnesota's site is called the Mother Gopher). The university runs 10 Mac IIc's running the A/UX operating system in parallel, each with an 80MB hard disk, just to support the top level of its Gopher server.

Evaluating Server Performance and Audience Response

Evaluation is an important but often ignored component of any process. In this case most of the servers come with some sort of log file, which is easily activated if not so easily analyzed. Chapters 3, 4, and 5 on Gopher, WWW, and WAIS servers detail what is possible with those servers, but check to see whether this information can be gleaned from the log file:

Turning log files into useful reports is sometimes more difficult than you'd expect, and some utilities have been written to simplify this process.
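As a rough idea of what such a utility does, the Python sketch below counts retrievals per item from a log. The log layout (date, host address, item path, separated by spaces) is an assumption made for this example, since each server writes its own format, and mask_host and summarize are hypothetical names. As a privacy precaution the sketch also drops the last component of each IP address before counting.

```python
from collections import Counter


def mask_host(address):
    """Replace the last component of a dotted IPv4 address with 'x'."""
    parts = address.split(".")
    if len(parts) == 4 and all(p.isdigit() for p in parts):
        parts[-1] = "x"
    return ".".join(parts)


def summarize(log_lines):
    """Count retrievals per item and per masked network address."""
    items = Counter()
    networks = Counter()
    for line in log_lines:
        fields = line.split()
        if len(fields) != 3:
            continue  # skip lines that do not match the assumed layout
        _date, host, item = fields
        items[item] += 1
        networks[mask_host(host)] += 1
    return items, networks


# Made-up log lines in the assumed "date host item" layout.
log = [
    "1995-03-01 128.101.54.7 /catalog/parts.txt",
    "1995-03-01 128.101.54.9 /catalog/parts.txt",
    "1995-03-02 192.0.2.14 /press/release1.txt",
    "malformed line",
]

items, networks = summarize(log)
for item, count in items.most_common():
    print(count, item)
```

Adapting the sketch to a real server would mostly mean changing how each line is split into fields.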
Again, it's important to check with the Usenet newsgroup or e-mail mailing list for your particular brand of server to stay current with what these utilities can do.

Because Gopher, Web, and WAIS servers routinely log the host name and IP address of each user's computer, along with the items retrieved, you do need to be aware of the potential for invading users' privacy. To protect users' privacy many log analysis programs automatically strip off the lowest level of IP address in their reports, which removes this link to individuals.

Nothing prevents you from building your own evaluation tools into your server. One WWW site has a sign-in sheet that allows users to leave a comment as well as their name and e-mail address, data then added automatically to what is displayed on the server. Be aware that not all browsers support online forms, so always give your users an alternative, such as a way to e-mail their comments. Another option is to provide users with a voice telephone number, fax number, e-mail address, and/or postal address from which they may obtain further information. Although this somewhat defeats the purpose of publishing on the Internet, responses can be an objective measure of the interest generated by your server. Or perhaps you've posted a catalog, and users place their orders through an existing telephone or fax order system. In that case make sure that the order form or telephone operator asks for the source of users' information, so that you can accurately evaluate your server's contribution.

Using What You've Learned

You may or may not have planned to implement your Gopher, WWW, or WAIS server in stages, but almost certainly you will learn from the process and find better ways to accomplish your goals.
Given that this is a most malleable medium, you should plan to continually check, revise, update, and perhaps redesign your server. However, if you change information or remove menus that may have been linked by other servers, you should post explanatory notes. In such cases it is helpful to leave a link in place that describes the change and gives the new link information. You may also find it helpful to keep an administrator's log, describing the changes you make to software and hardware and particularly any changes in philosophy on menu arrangement or hypertext design. Also record any addition of other resources, such as a searchable index, or major bodies of information, such as new departments. You may find the log useful in the future, or it may give another administrator an insight into your server's design. Much of this is of interest to your users as well, so you should consider adding a What's New section to your server for providing this information online. If you do, you should still keep a separate and private administrator's log in which you record changes or upgrades in your system or server software as well as problems you encounter and their solutions.

Summary

This chapter is about the groundwork you should do before you start publishing on the Internet. This includes exploring Gopher, Web, and WAIS sites both in your subject area and in others to get ideas and techniques you can use in your server. Become familiar with FAQs, mailing lists (and their archives), and Usenet News, as well as the various Internet indexers and directories listed in Table 2-1. You will need to answer many questions--about your audience, your competition, charging policies, and how the information you publish will be gathered as well as maintained. The presentation and organization of your data can be an important factor in your site's success. Evaluate your server log files, particularly the error logs.
This is a growing medium, so assume that your server will need to be redesigned regularly, that it is a perpetual work in progress. Finally, remember that the Internet is not just millions of potential customers; it's also a network of people with varied interests, enthusiasms, and talents.

Depending on whom you ask, Gopher and its newer version, Gopher+, are in the midst of either their growing pains or their death throes. Gopher has already been surpassed by WWW in total number of servers and amount of traffic on the Internet, but with more than 7,000 servers worldwide and an extremely active development community, Gopher is an Internet publishing method that is hard to ignore.

ADVANTAGES DISADVANTAGES

But even if you're enticed by WWW, don't pass Gopher by too quickly. Graphics are nice, but for a large percentage of Internet users, graphics are a luxury that their modem speed can't support. Considering that preparation of material for Gopher is extremely simple and the Gopher software is easy to install and maintain, Gopher, like FTP, will probably always have a place on the Internet. Gopher is an excellent choice when graphics are not an issue and a wide variety of people with little or no training need to add their files to the server.

Gopher: Pros and Cons

Gopher has the advantage of being a simple protocol that is easy to set up and maintain.
Adding documents and files to a Gopher server is usually a matter of just copying them into the Gopher directory structure. The average size of Gopher menus is small, usually 2K to 6K, whereas WWW home pages are usually 5K to 10K or more, and inline images (graphics that appear within a Web document) can increase that size to 100K or more. Gopher also does not require the user to have an extremely fast computer. A Gopher server is also flexible--it can serve up almost any type of file; Gopher+ allows you to offer different versions of the same file (for example, WordPerfect, PostScript, and plain text versions of the same document, or Spanish, French, and Swedish versions).

The biggest drawback to Gopher is that it just isn't as sexy or impressive in appearance as WWW can be. Although it can make picture files available for downloading, it can't mix the pictures and text together in a glossy brochure-like presentation as WWW can. One drawback to Gopher as a client program is that Gopher client programs can't browse WWW servers, although WWW client programs can view Gopher servers and retrieve data.

How Gopher Works

For technical details on the operation of Gopher, see The Internet Gopher Protocol and Gopher+--Upward Compatible Enhancements to the Internet Gopher Protocol, both available at gopher: boombox.micro.umn.edu:70 11 gopher gopher protocol . Here's a simple description of what goes on between a Gopher server and client:

If you want something else, you repeat the process. You need to study the exact language or syntax used in the Gopher protocol only if you're writing Gopher server or client software.

Gopher and Gopher+ Features

Unfortunately, it's not enough to decide you want to run a Gopher server. Whether Gopher+ is an option depends on the type of computer you're going to run your server on.
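As background for comparing Gopher and Gopher+, the basic exchange summarized under How Gopher Works can be sketched in Python. This is a simplified reading of the basic Gopher protocol, not a full client: the client connects (port 70 by default), sends a selector string followed by a carriage return and line feed, and reads until the server closes the connection. A menu comes back as tab-separated lines ending with a line that holds a single period. The host names and menu entries below are invented, and build_request, parse_menu, and fetch are hypothetical names.

```python
import socket


def build_request(selector=""):
    """A basic Gopher request is just the selector followed by CR LF."""
    return selector.encode("ascii") + b"\r\n"


def parse_menu(response_text):
    """Split a menu response into dictionaries, one per menu line."""
    items = []
    for line in response_text.split("\r\n"):
        if line == ".":
            break  # a lone period marks the end of the menu
        if not line:
            continue
        fields = line[1:].split("\t")
        if len(fields) < 4:
            continue  # skip lines that are not well-formed menu entries
        display, selector, host, port = fields[:4]
        items.append({
            "type": line[0],  # one-character item type, e.g. 0, 1, or 8
            "display": display,
            "selector": selector,
            "host": host,
            "port": int(port),
        })
    return items


def fetch(host, selector="", port=70, timeout=10.0):
    """Send one request and read until the server disconnects."""
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(build_request(selector))
        chunks = []
        while True:
            data = conn.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks).decode("latin-1")


# A made-up menu response, so the parser can be tried without a network.
sample = (
    "1Campus Information\t1/campus\tgopher.example.edu\t70\r\n"
    "0About This Server\t0/about.txt\tgopher.example.edu\t70\r\n"
    "8Library Catalog\t\tlibrary.example.edu\t23\r\n"
    ".\r\n"
)

for item in parse_menu(sample):
    print(item["type"], item["display"], item["host"], item["port"])
```

Choosing a menu item means making a new connection with that item's host, port, and selector, which is the "repeat the process" step. Gopher+ extends this exchange with extra fields that the sketch does not attempt to handle.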
Gopher+ adds some nice features to Gopher, such as interactive online forms, multiple versions of the same file (called views in Gopher+ documents), and additional meta-information (or information about the files, such as administrator, date updated, and so on). Gopher simplifies downloading from FTP servers because you can immediately view what you download and then download more. Shell scripts are specially written programs that allow you to add other functions to your Gopher server.

The list that follows details features of both Gopher0 and Gopher+, but be aware that some Gopher servers may have additional features.

Gopher0 FEATURES

Answers requests in one step and then disconnects, which allows the Gopher server to run faster and handle more requests.

Gopher+ FEATURES

Gopher Resources

One way to learn about Gopher is to download the server software and set one up. But for those of us who like to test the waters first, see Table 3-1 for some sites and resources that will tell you a great deal about Gopher matters. Note that some of the mailing lists are duplicates of Usenet newsgroups.

Generally, your plan to learn more about Gopher and Gopher+ should be to
Learn how to use the bookmark feature in your Gopher or WWW browser.
Look at how they are laid out, and watch for techniques that make it easier or harder for you as a user to find information.
Don't be upset if you don't get a response (they are probably as busy as you are), but it's very much within the culture of the Internet to ask how something is being done.

The University of Minnesota's Mother Gopher site is the first listed in Table 3-1 because it is the official Gopher archive.

Types of Items Gopher Handles

Gopher item types are important because they describe the kinds of information and services Gopher servers can deliver.
But the Gopher server has to tell the Gopher client exactly what type of information will be sent out if the user's Gopher client program is to transmit and display the text file or image correctly. Most Gopher browsers (the terms browser and client program are used interchangeably) use a different symbol for each type of information they show. The users don't actually see the type codes listed in Table 3-2, but as a Gopher administrator, you will be dealing with them often. Right now the key thing to know is that Gopher has a wide variety of types; you can come back and learn the details later. See Table 3-2 for the current list of the many types of data and services Gopher servers can provide.

Some types are not files at all. They represent some other kind of action that the Gopher server can provide, such as a telnet session and a tn3270 session type. When a Gopher browser program sees type 8, which signifies a telnet session, it takes the host's name and port number provided and starts its own telnet session to that host computer. You can create your own data type too, although if you want anyone else to recognize it, you should register it and make sure their browser programs know how to handle it. The list of Gopher item types has grown since Gopher was originally created. This flexibility is extremely important, because it means that Gopher has the capacity to grow and adapt and allow you to connect your users to many, if not all, existing Internet services.

Given the wide variety of formats in which you can publish using Gopher, don't forget the lowest common denominator, plain text. Although PostScript, Portable Document Format (PDF) from Adobe, and various word-processing formats make documents look much nicer than plain text, only plain text will be viewable if the user is on a mainframe Gopher client.

Gopher Servers

Gopher servers have been written or ported (short for transported) to many different types of machines.
The University of Minnesota's Gopher Development Team started with a version for machines running the UNIX operating system, which includes Sun Microsystems, Silicon Graphics, IBM RS6000, and NeXT computers. Other groups then wrote versions for their favorite machines, including DEC Vaxen, IBM mainframes, PCs running Windows NT and OS/2, and even lowly DOS machines. Some servers run only the basic Gopher protocol, but many have moved on to Gopher+. GN, from John Franks (no relation to the author), a professor at Northwestern University, is a combination Gopher and HTTP (WWW) server.

As with any software you need to address at least these five concerns when choosing a Gopher server:
Function--Look at the functions supported, the extra capabilities the server provides, as well as its record of compatibility with add-on indexing programs.
Platform--Check the platform (operating system and hardware) required, such as UNIX, Windows, and Macintosh, and its relative stability.
Load--Investigate the load this server platform can be reasonably expected to handle.
Support--Verify the level of support and activity this software enjoys, including newsgroups and mailing lists, as well as more formal commercial support mechanisms.
Price--Determine the licensing policy.

The section that follows discusses the capabilities of many existing Gopher server programs.

UNIX

UNIX is the most powerful and flexible operating system available for Gopher servers.

Gopherd. micro.umn.edu:70 11 gopher Unix

GN. This server is unique in that it serves both the Gopher and WWW worlds, sending the same documents out via either protocol, depending on the type of request.
The GN server can act as both a Gopher (but not Gopher+) and WWW (HTTP) multiprotocol server and has been recommended as an intermediate step between Gopher and WWW. unicom.com 1 FAQ whereas the software can be found at http: hopf.math.nwu.edu:70 .

VMS

VMS, short for Virtual Memory System, is Digital Equipment Corporation's multi-user, multitasking operating system.

VMSGopher Server.6b is a free Gopher server for the VMS operating system. not always available at Boombox, so check other sites as well.

University of Minnesota.psu.edu or gopher: niord.shsu.edu or gopher: gopher.wfeb.edu . Precompiled (ready-to-run) executables are available at gopher: trln.lib.unc.edu . The Minnesota archive is available at gopher: boombox.micro.umn.edu:70 11 gopher VMS .

Macintosh

Macintoshes are a user-friendly place to set up a Gopher server. You can manage quite a bit of customization by using AppleScript (a command language that comes with System 7) and MacPerl (a free Macintosh port of the Perl language). An alternative to mounting a Gopher server on a Macintosh or PowerMac is to run A/UX (Apple's UNIX) instead of the normal Macintosh operating system.

Gopher Surfer.0.micro.umn.edu:70 11 gopher Mac server

Windows NT

Microsoft Windows NT 3.5 is now a serious choice for Gopher administrators, primarily because of the ease of installation and administration. Windows NT also offers the ability on some servers to interact with Visual Basic and data in regular Windows applications such as Microsoft Excel and Access.

EMWAC GopherS for Windows NT. GopherS, pronounced Gopher-ess, is a Windows NT-based Gopher0 server (not Gopher+) that runs as a Windows NT service, so you don't have to remain logged in.
It handles multiple simultaneous connections using multiple threads and can handle WAIS database searching if you use its WAISTOOL tool kit. ed.ac.uk in the directory pub gophers.ed.ac.uk HTML internet toolchest top.HTML

DOS and Windows

DOS runs on a minimal PC, so it is ideal for setting up extremely inexpensive Gopher servers. Windows has the advantage of being widely used (even if it's not always stable), so it is an easy platform to begin with.

KA9Q NOS.micro.umn.edu:70 11 gopher PC server ka9q

GO4HAM.DLL written by Gunter Hille of the University of Hamburg.informatik.uni-hamburg.de pub net Gopher pc go4ham or from gopher: boombox.micro.umn.edu:70 11 gopher PC server hamburg

OS/2

IBM's OS/2 version Warp 3.0 has the advantage of being relatively stable while running on PCs with less memory than Windows NT requires.

GoServe.x operating system.micro.umn.edu:70 11 gopher os2

Other Platforms

MVSGopher Server.micro.umn.edu:70 11 gopher mvs

Rice CMS Gopher Server.micro.umn.edu:70 11 gopher Rice CMS

VieGOPHER.micro.umn.edu:70 11 VieGOPHER

Other Servers

The best way to find out about new Gopher servers is to go looking on gopher: boombox.micro.umn.edu , check the Gopher FAQ periodically, and watch the Gopher newsgroups (see Table 3-1).

Installing Gopher Server Software

Gopher and Gopher+ are protocols. The protocol does not specify how information is to be stored on the server, nor does it in any way define how the Gopher server should go about doing its job. The programmers who wrote each of the Gopher servers had a lot of leeway to do things in a way that made the most sense for the machine and operating system they were using. For these reasons I won't give you step-by-step instructions. We'll go through the basic steps of installing and setting up Gopher server software, using the UNIX Gopher server as a model.
The point here is to understand what options are generally available and the kinds of decisions you'll have to make, no matter which type of Gopher server you choose. If you have problems, check the mailing list or Usenet newsgroup (and their archives) related to your server software to see whether someone offers a solution. Instead, I assume that you (or the person who will be responsible for your server) will be able to use the software documentation to install it, and I will concentrate on providing you with an explanation of the concepts and techniques that you'll begin to use once the software is installed.

Downloading, Uncompressing, and Compiling the Software

You have to get your software from somewhere, and usually that means you're getting it by Gopher or anonymous FTP from one of the sites listed earlier in this chapter. On UNIX the files come in the form filename.tar.Z, which means they've been both tarred and compressed. Make sure you have plenty of disk space, then type uncompress filename.tar.Z, which will replace your compressed version with a much larger file without the .Z at the end. Then make sure you are sitting in a directory one level above the directory into which you want to put your files, and type tar -xvf filename.tar, which will create the subdirectories and unpack all the files that go into them. Macintosh files usually come in BinHex format (.hqx) to travel safely across the Internet, and sometimes they are also in a self-extracting archive (.sea). .sea.hqx, but unloading it is easy with the shareware BinHex program (use Archie to find it on the Internet). UNIX and VMS versions usually have to be compiled before they can be run. WAIS is the program most commonly added to Gopher, and it's essential for some of the indexed searching functions you may want. The best thing you can do is read the documentation carefully and take careful notes of each step in the compile process.
If you do have to go asking for help in the Gopher newsgroups, you'll be able to accurately summarize your situation (don't leave out the details of your operating system). And when the next release of your Gopher server software comes out, you'll be able to avoid the mistakes you made the first time around.

What's Involved in Gopher Server Configuration

Gopher server configuration refers to editing certain configuration files to add information about your site and decide how the server should behave in various situations. It consists of such things as the server's name, administrator's name and e-mail address, time zone, default language, location, latitude/longitude, and other identifying factors. The hard part of configuration is understanding what each configuration option means and deciding how important it is in the operation of the server. The configuration information will be in different locations and may have different names, depending on the version of Gopher server software you use. conf, makefile.config, and conf.h. There also are some command line options when starting the Gopherd program.

Configuring Site-Specific Information

The site-specific information for the Minnesota UNIX Gopher includes such items as organization, site, administrator, latitude/longitude, time zone, host alias, and language. Organization is the group that owns the site, such as UCLA or Addison-Wesley. The site is the name you want to appear when your server is contacted, such as Social Sciences Computing for my Gopher server. This information is required under the Gopher protocol because it will be listed for every menu item on that Gopher server that does not specifically have its own administrator information. conf file.
The 3-D version of Gopher, which is in development, will use latitude and longitude to provide a graphic display of the server's geographic location. mit.edu:8001 geo or telnet: martini.eecs.umich.edu:3000 .

Remember that the machine you put your server on today may not be the machine it's on next year. For that reason it's best to set up a host alias, instead of the specific name of the machine on which it resides. For example, we (Social Sciences Computing at UCLA) originally set up our Gopher server on one machine, but instead of registering that name, we used an alias. Within six months we had to move our Gopher server to a machine with a different name, but the alias stayed the same. Language identifies the default or most commonly used language in the text files on your Gopher server.

Configuring Gopher Performance

Gopher can be configured to behave in different ways in particular situations. The cache time variable sets the amount of time in seconds before the server checks whether the Gopher files have changed. Because it's faster for the Gopher server to grab a cache file of a directory list than to actually run a directory listing, this setting affects the server's response time, particularly for large directories. The number of seconds should be low when you expect rapid change in a directory and high when change is likely to be rare. You can configure your Gopher server to ignore (or not include) any files that end with a particular extension or that match a particular pattern. txt.SP could designate a text file in Spanish, whereas message.txt.DE means text in German (Deutsch). When Gopher browsers select this item, they see a list of the different Spanish and German views available for that document. 1, Word for Windows, PostScript, and plain text.
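
The trade-off behind the cache time setting can be illustrated with a small Python sketch. This is not how the Minnesota server implements its cache; the class name DirectoryCache and its parameters are hypothetical, and the sketch keeps listings in memory rather than in a cache file. It only shows the idea of reusing a directory listing until a chosen number of seconds has passed.

```python
import os
import time


class DirectoryCache:
    """Reuse a directory listing for cache_seconds (hypothetical helper)."""

    def __init__(self, cache_seconds, clock=time.monotonic):
        self.cache_seconds = cache_seconds
        self.clock = clock          # injectable so the behavior can be tested
        self._entries = {}          # path -> (time stored, sorted file names)

    def listing(self, path):
        now = self.clock()
        cached = self._entries.get(path)
        if cached is not None and now - cached[0] < self.cache_seconds:
            return cached[1]        # still fresh: skip the directory scan
        names = sorted(os.listdir(path))
        self._entries[path] = (now, names)
        return names
```

With a low value the listing is rebuilt often, so new files show up quickly at the cost of more directory scans; with a high value large directories are answered faster, but changes take longer to appear.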
The decoder option allows you to run files with certain extensions through certain programs before delivery. zip.zip files.

Configuring Access and Security

One of the first things you need to think about after you have the Gopher server running is whether you need to restrict access. If anyone, for any reason, should not see what's on your server, you should consider whether you want to put that information in a public place like the Internet. Pay close attention to exactly what options for limiting access your server software provides, and be sure you understand the method in use. With these techniques you can restrict part or all of your Gopher server. You can have certain information available only to those coming from certain domains and have other information sent to different domains.

Restricting or Allowing Access by Domain Name. The first method of restricting access is to allow or restrict access based on the Internet domain or subdomain of the Gopher user's machine. sscnet.ucla.edu , sscnet stands for Social Sciences Computer Network, which is in the UCLA network, which is in the edu (education) domain, as opposed to the com (commercial), gov (government), mil (military), or .uk (United Kingdom), .fr (France), .au (Australia), .br (Brazil), and .sg (Singapore) domains. edu domain. Further, if I had something else of interest only to those in the social sciences, I could restrict access to only those people whose machines are in the sscnet.ucla.edu subnet. This method works when you have a small number of local subdomains that can be easily listed and anything else shouldn't be allowed to see your server. This is appropriate when your company or campus is licensed for certain types of things, but you don't want to publish them for the world.
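
The domain test described above can be sketched in a few lines of Python. This is an illustration only, not the access-control code of any particular Gopher server; the function name domain_allowed is hypothetical, and the sketch assumes the server has already obtained the client's host name.

```python
def domain_allowed(hostname, allowed_domains):
    """Return True if hostname is one of allowed_domains or a subdomain of one."""
    host = hostname.lower().rstrip(".")
    for domain in allowed_domains:
        domain = domain.lower().strip(".")
        # Match the domain itself or anything ending in ".domain";
        # the leading dot keeps "notucla.edu" from matching "ucla.edu".
        if host == domain or host.endswith("." + domain):
            return True
    return False
```

For example, a list containing only ucla.edu admits a machine in sscnet.ucla.edu, whereas a list containing only sscnet.ucla.edu turns away machines elsewhere on the campus network.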
The restricted access option specifies IP addresses or subnet domains that are or are not allowed access to specific directories or the whole server, whereas you can also limit access by type (browse, read, or search only) for certain machines or subnets. Bummermsg is the message that goes out to anyone refused access to your Gopher server because of the restrictions you set up.

Hiding Your Server. Because Gopher servers are expected to run (or be listening) on port 70, hiding your Gopher server means running it on a nonstandard port. Port numbers 1023 and below are reserved for the system administrator, but others can use ports with higher numbers, so you'll often see personal Gopher servers listed at port 7000 instead of Gopher's usual port 70. Sites that need different features provided by different Gopher servers often run both but on different ports of the same UNIX machine. Although the degree of safety obtained by running on a nonstandard port number is not great, it does keep casual browsers out, simply because they would have to try all your ports to find your Gopher server. Of course, once someone sees a link or URL to your server, nothing prevents that person from copying it and linking to it. But if you don't publicize this number, it is extremely unlikely that someone will search all possible ports on your machine just to find a hypothetical Gopher server.

Requiring a Password for Connection. Requiring a password (sometimes called a ticket) for connection to the Gopher server is basically a low-security method of validating users. But for now you should know that this capability is built into at least the University of Minnesota UNIX Gopher server.

Startup: Stand Alone Versus Inetd

This section applies only to UNIX servers.
Obviously, you want your Gopher server to come up automatically whenever your server is rebooted. Inetd is a special process on UNIX that listens for a call on various ports and, when it finds one, starts up the appropriate program, in this case the Gopher server program Gopherd. This would seem sensible, because the Gopher server runs only when a request comes through, but unfortunately the time delay to start up the Gopher server program, as well as other factors, makes this the less desirable alternative. Standalone mode means that your Gopher server is running all the time, sitting and waiting for someone to talk to it on port 70 or whichever port you've assigned it to watch. That's why the UNIX Gopher documentation strongly suggests that you run your Gopher server from a limited privilege account, with only enough privileges to access the directories it needs.

Setting Up the Gopher Data Directory

No matter what version of the Gopher server software you use, you will need an area in which to store files and documents. You are not obliged to have many subdirectories, but subdirectories often help to keep things organized and keep your Gopher menus from getting too long. And using different subdirectories for different departments makes it easy to set up group permissions (if your operating system allows it) so that one department can update its own material but not another department's. Figure 3-2 shows how the first-level directory from the example in Figure 3-1 would look in Gopher. Short names for directories and files are convenient for Gopher managers, but users like longer names that include some description. names file. This section isn't meant as a tutorial but rather as an example of the kind of control almost any Gopher server allows you.
When using the University of Minnesota's UNIX Gopher server software, putting a period at the front of a file name makes that file invisible to Gopher. The example in Figure 3-3 uses the first two of these and shows a .names file set up to give longer names and change the order. names file to make these cosmetic changes. Note how much more descriptive that Gopher menu is now.

For the final step in this example let's add a link back to UCLA's main Gopher server, so that our users can find their way out to other parts of the campus. Also, the Chicano Studies Research Center has its own Gopher server, and because that's one of the departments we support, let's link to it as well. names file or to another file that starts with a period. links and keep it separate. links file, and Figure 3-6 shows the results of adding the two links. Figure 3-5 shows how to add information that describes how and where to locate the other Gopher servers. And the Chicano Studies Gopher server is a Gopher+ server, which allows us to add the administrator's name and contact information. Figure 3-6 shows the two new link items in alphabetical order as items 2 and 5 on the Gopher menu, the result of the Gopher server program's doing its job of coordinating the various files and items to make a useful Gopher menu out of them.

The next step in setting up this Gopher server would be to add files and links to each directory (anthro, econ, geog, and sscnet). .names and .links files. If you need to add more directories at the top or in any subdirectory along the way, you're free to do so.

Useful Techniques for Adding Information and Files

The whole point of the original design of Gopher (and WWW as well, for that matter) was to make it easy for people to publish their information. Eventually, you will find that you don't have time to handle all the files and links that need to be added to your server.
Because no one knows as much as you do about the operation of your Gopher server, it is important to have easy, convenient, foolproof methods for others in your company or organization to add information to your Gopher server. What follows are some techniques for adding information to your Gopher server.

Group Permissions

If you are using UNIX or any other operating system that allows rights to be assigned to a group of users, take advantage of that feature to allow individuals to update their own areas of expertise in your Gopher server. Set up a group and then give that group access to the files and directories in its area of the Gopher data directory. The Macintosh Gopher Surfer server administrator can use the User and Group Privileges features of Apple Personal Filesharing to permit participants in a Macintosh AppleTalk network to put their files in your Gopher by just dragging them over with a mouse. Be sure that users know what to do if they need to modify the access rights to the files they add. The limitations of this method are that the users need to know their way around your operating system and they have to be familiar with any specific Gopher server set-up instructions your server may have.

Adding Documents with FTP

On UNIX systems you can give users the rights to FTP files into their subdirectories of your Gopher data directory. names file format can design and upload that file and control the appearance of the file names and order of appearance.

Linking to Users' Directories

You can create links within the Gopher data directory that point to files in other directories, specifically a subdirectory of the home directory of the person you want to administer that section of the Gopher data. This works on UNIX and may work on any other operating system that allows aliases or links to make a file appear to be in two places at once.
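
On UNIX the kind of link just described is a symbolic link. The following Python sketch shows one way to create it; the function name link_user_directory and any paths you pass to it are hypothetical, and whether your particular Gopher server follows symbolic links is something to confirm in its documentation.

```python
import os


def link_user_directory(data_dir, menu_name, user_dir):
    """Make user_dir appear inside data_dir under menu_name via a symbolic link."""
    link_path = os.path.join(data_dir, menu_name)
    if os.path.lexists(link_path):
        # Refuse to overwrite an existing file, directory, or link.
        raise FileExistsError(link_path)
    os.symlink(user_dir, link_path, target_is_directory=True)
    return link_path
```

The person who owns the target directory can then add and remove files there without needing write access to the rest of the Gopher data directory.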
Connecting Gopher to a Database

Although most Gopher servers store their data in files, that isn't necessarily the only way to go. This arrangement allows you to use Gopher to access information in a sophisticated and powerful database package, where it might be used for many other purposes as well. For example, you could keep your company's product information, prices, and inventory in a database that you also make available via Gopher. This has its advantages, especially if you can link Gopher to an existing mainframe SQL (Structured Query Language) relational database that is already being used for other purposes. statements. The University of Minnesota Gopher Development Team wanted to give Gopher users access to information about campus events that was stored on a mainframe computer. The data was in an SQL database (in this case Sybase), so the team developed the GopherSQL gateway to translate Gopher requests into SQL query statements. This process is described in Paul Lindner's paper presented at the 1994 Gopher Conference, A Gopher Interface to Relational Databases, which you can find at gopher: boombox.micro.umn.edu:70 11 gopher Unix gopher-gateways .

Getting Others to Collect Links

Encourage your staff and users to suggest links to you by saving them in their own bookmark files and e-mailing them to you.

Adding Links via Forms

Build a Gopher form on which users or staff members can enter link information. The limitations are that users need to have a browser that is compatible with Gopher+ (forms are a Gopher+ feature) and that anyone can add documents to your Gopher server if you have not installed some authorization or password control options.

Using Veronica to Collect Links

Use the Veronica Gopher Indexing program to search out links to related or useful information. link file for the entire search instead of one item at a time and add those links to your Gopher server.
Adding Documents via E-Mail GMAIL

GMAIL is a Perl script for UNIX Gopher servers that allows your data maintainers to send you their additions and deletions via e-mail (see Table 3-4). This script allows you to create a list of e-mail addresses and match each to a directory in which users can deposit messages. The subject line of the message becomes the Gopher menu item for that file, and the rest of the mail header is stripped off for security and appearance. The limitation of GMAIL is that each e-mail ID can deposit in only one directory, although each directory can have multiple contributors.

Adding Documents via E-Mail Appending to a File

On UNIX systems you can set up an alias such that any messages received are added to the end of a specified file instead of being sent to a person. The University of Minnesota's Gopher server displays these files (they are also called mail spool files) as a menu of items, using the subject line of each message as its menu item. This is a good way to add the messages from a mailing list to your Gopher server, if you make that alias a subscriber to the mailing list. One solution would be to run a cron job (a program on UNIX that lets you schedule certain events at specified dates and times) so that the file is renamed once a week with the week name. One limitation of constantly appending to a file is that the latest items always appear at the end of the Gopher menu.

GopherLunch

GopherLunch is a Gopher data submission system from Arizona State University (see Table 3-4). It runs on UNIX Gopher servers and allows relatively easy submission of documents via ASK block forms, FTP, or e-mail messages.

Gopher Tools

Whenever you get a bunch of programmers interested in something, you'll soon see that they've developed additional programs and tools that make their job easier.
Tools like Jughead were developed to index Gopher servers and allow Boolean searches (logical operators, such as search for this AND that OR something else).

Gopher Menu Layout

The layout of a Gopher server is not as simple a project as you might think.

Get lost and give up in disgust.
Falsely assume you don't have what they want when they can't find it immediately.
Look for something by one name or in one category, when you've grouped or named it differently.
Have to go so deeply into your Gopher menu that they can never get back or tell anyone else how to get there (of course knowledgeable users would save a bookmark to that link once they had found it).
Spend too much time following links and menus to get to a popular item that should have been near the top of the menu.
Find an index search of your server's contents but then have no way to understand the terminology you used in your documents, and all their guesses fail to hit.

Here are some guidelines/rules/suggestions for Gopher menu design:

1. Be clear and descriptive in your menu titles.
2. Avoid vague words like other and miscellaneous.
3. Keep your main Gopher menus to one screen, two at most.
4. Test your menus on novices.
5. Make a searchable index of everything on your server and make it highly visible.

Password Protection

Sooner or later Gopher managers find that they need some sort of password verification to allow only certain users access to particular data files or directories. Providing password protection is a problem endemic to the Internet publishing industry, and some say its solution is the key to making the Internet commercially successful (as opposed to just wildly successful at encouraging cooperation and sharing of information). However, you should be aware that some Gopher servers offer a lightweight security system that exchanges some form of password or token to prove the identity of the user.
It is called lightweight security because the password is usually sent across the Internet in the clear or with only modest encryption. Lightweight security systems can be enticing, because they appear to give you at least a certain level of security. Although you might be acutely aware of the limitations of your security scheme, your data librarians and others who add items to your server may not always be so cautious.

What Logs Can Tell You

Almost all Gopher servers have some provision for keeping logs, and some utility programs provide statistics so you can analyze those logs.

IP address of all clients
Dates and times of connection
Item(s) retrieved (this does not include links to items on other servers, because they would attach to the other server, not yours)
Host names of all clients

These logs make it possible to determine the number of times an item is retrieved, when it was retrieved, and its relative popularity. You cannot yet get a reading on the type of clients (Gopher, Gopher+, HTTP, and so on) connecting to your Gopher server. That information would tell you how many of the Gopher+ features you use are beyond the abilities of the Gopher client programs that access your site. Knowing what files are most and least popular can be useful for measuring utility as well as for redesigning your layout. Perhaps the information you know your users would find most helpful is so far away from the top level that they never find it.

However, you need to be aware of the potential for invasion of privacy inherent in these logs. Gopher and Web administrators often automatically strip off the lowest level of IP address in their reports, precisely to remove this link to individuals. I'm told that at least one Gopher site has had to modify its Gopher server software to remove the possibility of linking this information, because it has a grant that requires absolute anonymity for those retrieving information.
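
Both ideas above, counting retrievals and stripping the lowest level of the IP address, can be sketched in Python. The record format here (pairs of client address and item path) is an assumption made for illustration; real Gopher server logs differ by product and would need their own parsing. The function names mask_client and summarize are likewise hypothetical.

```python
import ipaddress
from collections import Counter


def mask_client(address):
    """Drop the last octet of an IPv4 address, e.g. 192.0.2.57 -> 192.0.2.0."""
    network = ipaddress.ip_network(address + "/24", strict=False)
    return str(network.network_address)


def summarize(records):
    """Count retrievals per item and collect the masked client networks.

    records is an iterable of (client_address, item_path) pairs,
    an assumed format rather than any server's actual log layout.
    """
    item_counts = Counter()
    networks = set()
    for address, item in records:
        item_counts[item] += 1
        networks.add(mask_client(address))
    return item_counts, networks
```

Calling item_counts.most_common() on the result lists items from most to least retrieved, which is the popularity ranking useful for redesigning a menu, while the report itself never carries a full client address.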
Of course no log actually tells you whether the information downloaded was useful or was even what the user expected. To gather that kind of information you could provide an online form for your users to fill out, asking them specific questions about certain features or selected files on your Gopher server.

Setting Up Forms Using ASK Blocks

Among the most useful enhancements in Gopher+ is the ability to interact with your users. The information is sent back to your Gopher server to be appended to a file or processed by a shell script or program. Uses of ASK blocks are endless, including asking for comments, filling out an order form, and checking off answers in a multiple-choice questionnaire. ASK block forms can be built from the following component commands:

Ask asks the user a question and allows a one-line response.
AskL asks the user a question and allows a multiple-line response.
AskP asks the user a question and hides the response (as is commonly done with passwords) so that no one can see the response over the user's shoulder.
ChooseF asks the user for the name of a local file on her machine.
AskF asks the user to create a new file name on his machine so that the Gopher server can send and store something there.
Select shows the user a set of options.
Choose shows the user a set of options from which the user may select only one (similar to Macintosh radio buttons).

Check out these Gopher test sites at the University of Minnesota and the University of Indiana for examples of the latest ASK types. For consistency we've given URLs here, but if your Gopher browser doesn't handle them, just connect to the sites listed (mudhoney.micro.umn.edu and FTP.bio.indiana.edu) and follow the menus to the Gopher examples.

gopher: mudhoney.micro.umn.edu 11 gplustest
gopher: FTP.bio.indiana.edu 11 Gopher Gopher test ask 09 application Gopher -menu 20En US

ASK blocks are fairly simple. 5 gopher: grace.skidmore.edu 0R50706-69321- help Gopher plus ask-tips .
One big drawback to using Gopher ASK blocks is that Web browsers don't support them and it's not certain when or if they will.

Security Holes and How to Prevent Them

To think about security you have to understand where your main points of weakness are and what you might be able to do about them. It's important to prevent a user from somehow subverting the Gopher server to get it to perform actions that you didn't intend. Another area of security is in limiting the directories that are accessible to the Gopher server program. If the Gopher server can see only the files in your Gopher data directories, no one will be able to trick it into handing over something it shouldn't. If, for example, you're laying your Gopher server over an FTP server, make sure that no one can make an anonymous upload. Finally, scripts and programs that are run in response to Gopher requests are always a point of concern. If it's possible to shell out or execute other commands via one of these programs, you have to make sure that nothing dangerous can be run via your script. Different versions of the Gopher server software offer different solutions to these problems, and some may raise additional concerns. Check the relevant newsgroup and list server archives for discussions of these issues, and read through the documentation for the server. For the Minnesota UNIX Gopher server start by reading Guide to Safe Gophering, written by Paul Lindner for the 1994 Gopher conference ( gopher: boombox.micro.umn.edu:70 00 Gopher Gopher Conference 94 Papers SafeGopher ). Remember that you may have security problems simply by being connected to the Internet.

Running Scripts or Programs from a Gopher Menu

You occasionally may want to provide a Gopher menu option that runs a program or shell script on your server and returns the answer to the user. Who is logged onto this machine?
When someone chooses this item, it executes the who program to show the list of users currently logged on and sends the results to the client. This permits you to truly provide access to changing data or information, because your server will reply to the user with the most recent information.

1. Checking for the latest weather information
2. Displaying the last 10 comments made by users
3. Checking for what has changed on the Gopher server in the last few days

All kinds of scripts or programs can be run behind the scenes by the Gopher server--they need not be interactive, but they should make some sort of text response to the Gopher user. For example, a WAIS gateway takes a query by a user on a Gopher server, sends it out to a WAIS server, and returns the answer via Gopher. Different operating systems have different scripting languages; Perl is available for Macintosh, DOS, and UNIX.

Registering Your Gopher Server

Registering your Gopher server is essential if you want others to be able to find it easily and if you want it to be included in index servers like Veronica. tc.umn.edu , which is then referenced by other Gopher lists. You get your Gopher server on this list by sending the university an e-mail message with the following information: server's name, host name, port number, administrative contact, an optional selector string, and an abstract describing the content of your server. net; otherwise send e-mail to gopher@boombox.micro.umn.edu.

Indexing

Although you will design your Gopher menus for ease of access and the convenience of your users, that's often not enough. This means people have to spend more time searching through your menus to find the material they want or assume (sometimes incorrectly) that you don't have it.
Minimizing the time users spend and improving the accuracy of their searches are in everyone's interest: yours, because users taking wrong paths tie up your server, and theirs, because they obviously don't want to spend more time searching than they have to. Indexing the contents of your Gopher server (also known as your section of Gopherspace) is helpful to everyone. Gopher indexing usually means that all the menu items and directories in your Gopher site are indexed and available for searching.

You can get your server indexed in several ways. You can do it, and you can arrange for Veronica to do it. If you do the indexing, somewhere on your main Gopher menu you'll want to offer users the choice of searching the entire contents of your Gopher site or just going through the Gopher menus. If the Veronica database shows that your server has something in one of the Gopher menus that matches, the searcher is given links to those items on your server.

The remainder of the chapter briefly summarizes some indexing schemes that are available to Gopher administrators.

Veronica

Veronica is a central indexing service that attempts to index all of Gopherspace each month. Getting your server indexed by Veronica is easy. scs.unr.edu:70 00 veronica veronica-faq . Assuming your site is not under restricted access, Veronica will know to index your server if you register it with the Mother Gopher at the University of Minnesota or if your server is linked to a Gopher server that is registered with the Mother Gopher. The biggest problem is deciding which areas of your Gopher server you don't want indexed. Of course, if you don't want any part of your server indexed, that is easy to arrange as well by using a Veronica control file to leave instructions for Veronica. scs.unr.edu:70 00 veronica About Index-control veronica-ctl .
This is a plain text file that is left on your Gopher server for the Veronica harvester to find when it comes by to index your Gopher server.

Note: If you are running the UNIX Gopher server from the University of Minnesota, gopherd.conf gives you the option of completely excluding your Gopher server from Veronica harvesting.

You might think, "Why not let Veronica go ahead and index my entire Gopher site?" But you'll be doing yourself and the Internet a favor if you think this through carefully. Veronica updates only once a month, so if you have items that will be added and replaced several times within a month (like Usenet newsfeeds), Veronica indexing won't be helpful. The Veronica control file cannot specify individual files, only menus (directories). Therefore it might be worth rearranging your menu structure if you find that some Gopher menus contain items that should be indexed and others that should not.

Jughead

Jughead is an alternative to Veronica that lets you generate your own indexes of your Gopher menus and items, but unfortunately it runs only on UNIX systems. Jughead's designer meant it as a tool for Gopher administrators; among other things, it builds an index of a selected Gopherspace--titles and menu items only, not full text. Like Veronica, it then can act as an index server by listening for search requests at a certain port and responding with the appropriate Gopher item or document.

One advantage of Jughead is that it was written by a Gopher administrator specifically for Gopher servers, as opposed to some of the other indexing programs, which are more general.

Glimpse

Glimpse is a powerful full-text indexing and query system that allows approximate matching, Boolean searching, and limited regular expression (UNIX-style) searching.
Glimpse does not directly support Gopher servers, but you may be able to make it work as a back-end indexer for Gopher, which is how WAIS works. cs.arizona.edu:1994 is that development work is continuing, whereas freeWAIS appears to be stalled.

WAIS

WAIS indexing is a powerful way to index the full text of your Gopher server and much more.

GN Gopher/WWW Server

GN is a combination Gopher and WWW server that has its own indexing scheme built in. math.nwu.edu:70

Summary

Gopher is an inexpensive and simple method for publishing information on the Internet that does not require users to have a fast or powerful computer. Gopher's ability to link to items on other computers as easily as to those on the same computer freed users from worries about login IDs, passwords, and protocols when moving from system to system.

Indexing programs like Veronica and Jughead increase the utility of Gopher tremendously, particularly because the results of each search become a Gopher menu of their own. Unsuspecting users might think the Gopher menu of search results they are seeing (see Figure 1-10 for an example of a Veronica search) is stored some place on the Internet, but actually it is created at the time of the search. This means that if the user does the same search two weeks later, the menu that appears might be different, because other resources have been added to the Internet and indexed by Veronica in the interim.

Gopher servers and clients (browsers) come in two forms, Gopher+ and plain Gopher, or Gopher0. The enhancements in Gopher+ include online forms (ASK blocks), the ability to provide different views of the same document, and the addition of meta-information (information about the item, such as date, author, language, and so on).

Gopher server software is available for a wide variety of computer platforms.
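The point made above, that a menu of search results is built at the time of the search rather than stored anywhere, can be illustrated with a small Python sketch of a title-only index in the spirit of Jughead. This is not Jughead's or Veronica's actual code: the catalogue, the host name gopher.example.edu, and the function names are all hypothetical. The only part taken from the Gopher protocol itself is the shape of a menu line (item type and title, selector, host, and port separated by tabs, with a lone period ending the listing).

```python
# Hypothetical in-memory catalogue: (item type, menu title, selector, host, port).
# In the Gopher protocol, type "0" is a text file and type "1" is a menu.
CATALOGUE = [
    ("0", "Campus Weather Forecast", "0/weather/today", "gopher.example.edu", 70),
    ("1", "Library Services", "1/library", "gopher.example.edu", 70),
    ("0", "Library Opening Hours", "0/library/hours", "gopher.example.edu", 70),
]


def search_titles(query, catalogue=CATALOGUE):
    """Return items whose title contains every word of the query (case-insensitive)."""
    words = query.lower().split()
    return [item for item in catalogue
            if all(word in item[1].lower() for word in words)]


def build_menu(items):
    """Format items as Gopher menu lines, created fresh for each search."""
    lines = [f"{kind}{title}\t{selector}\t{host}\t{port}\r\n"
             for kind, title, selector, host, port in items]
    return "".join(lines) + ".\r\n"  # a lone period marks the end of the menu


if __name__ == "__main__":
    print(build_menu(search_titles("library hours")), end="")
```

Because the menu is assembled from whatever the catalogue holds at query time, adding an item to the catalogue changes the result of the same search the next time it is run, which is the behavior the text describes for Veronica.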
Usually, these servers allow you to restrict access to all or part of your Gopher data directories to machines from certain domains of the Internet.

The virtue of publishing via Gopher servers is that information can be made available simply by dropping the appropriate file into the Gopher data directory. Often, however, particularly with UNIX systems, nontechnical users will want simpler methods than FTP to move their files and information into the appropriate Gopher data directories. There are many alternatives, including GMAIL, which allows e-mail accounts to be used as trusted contributors to specific directories of your Gopher server. The design of your Gopher server's menu layout is an important factor in making it easy for users to find what they need. Indexing your Gopher server with Jughead or other indexers is an excellent way to allow users to quickly search the contents of your server to find out if you have what they want. Registering your Gopher server with the Mother Gopher server at the University of Minnesota guarantees that it will appear in the master list of all Gophers in the world. Be careful to use the Veronica control file if you do not want all areas of your Gopher server to be indexed. Security is always an issue when dealing with the Internet, but properly configured, a Gopher server doesn't add to that risk.

Gopher is said by some to be dying, but it provides an efficient method for making text, graphics, and other files available across the Internet (although not in one document like WWW). Because Gopher browsers can run on even the simplest computers available (a two-floppy PC, for example) and with the slowest of modems, it will continue to serve a purpose long into the future.

People like color and pictures. The promise of the Internet as a new communications medium has been fulfilled more by WWW than by any other Internet application. Gopher is easy to use and manage.
But WWW is the Internet application that makes people sit up and take notice, because of its colorful graphics and text. Although it may not be fair to say that WWW has taken over the Internet, WWW traffic has certainly been increasing astronomically. WWW browsers such as NCSA Mosaic and Netscape are so easy to use that they have fascinated and attracted many new Internet users. The ability of WWW to combine colorful graphics with text has led to extremely creative Web sites put up by companies, organizations, universities, and even individuals.

WWW: Pros and Cons

WWW servers allow you to present information with an impressive mix of graphics and text. However, WWW requires greater bandwidth than Gopher to transport its image files across the Internet, and transport can be extremely slow over telephone lines, even with 14,400-bps modems. WWW server software is available for a wide variety of computers, including UNIX machines and the Windows, Macintosh, and OS/2 operating systems. HTML, HyperText Markup Language, which is the basis for most WWW documents, is easily decipherable, written in plain text, and portable between operating systems.

WWW uses one easily understood interface to connect the user with most types of Internet resources, including Gopher, WAIS, Finger, telnet, tn3270, and FTP.

One serious problem with WWW is that HTML is being used as a page layout language when it really wasn't designed for that.

ADVANTAGES
WWW browsers work with many types of Internet resources, including Gopher0, WAIS, Usenet News, Finger, telnet, tn3270, and FTP.

DISADVANTAGES
Most servers can't limit the number of connections to the server (called load limiting) to avoid overloading the network or server.

WWW servers let you think in terms of publishing an electronic version of a full-color brochure.

How WWW Works

WWW works similarly to the Gopher protocol: the server waits for requests from WWW browsers and then fulfills each request if it can.
Like Gopher, HTTP servers can send anything, but the information usually consists of HTML text files with embedded inline images. Where Gopher sends plain text files, which the client can read immediately, WWW browser software must receive and then interpret the HTML that comes through the Web. The WWW protocol, HTTP, is a stateless client/server protocol, which means that a Web server does not have a long attention span. It receives a request from a client, such as Mosaic, Netscape, or Lynx, and it processes that request and responds with either the information requested or an error message.

HTTP servers handle a broader range of commands than a Gopher server, and the protocol is slightly more complicated (and still evolving). But they're basically both doing the same thing: providing files or actions upon request, not unlike what the anonymous FTP servers do. Essential to WWW are Uniform Resource Locators, or URLs, which have become the phone numbers of the Internet. They were invented along with WWW by Tim Berners-Lee as a simple method for describing links to files on other systems. For a discussion of the related developments of Uniform Resource Identifiers (URIs), Uniform Resource Names (URNs), Uniform Resource Characteristics (URCs), and Uniform Resource Agents (URAs), see Chapter 11.

Here's a simple description of what goes on between a WWW server and client. The Web client program reads the URL to get the protocol to be used (Gopher, HTTP, FTP, and so on) and the address and port number of the server to be contacted (if it's the same server, it's considered local). The URL text usually consists of a file name and/or a directory, but it could also be the name of a program or text for a database query. The Web server passes the file or item on to the Web client if it can, or an error message if it can't. If the file is in HTML format, the Web client immediately scans it for inline images.
If it finds any, it starts the process over again with separate requests for each of those image files (assuming the user has not turned off Display Images). If the item specified by the link was a program, the Web server would run the program and send the output back to the Web client. If the item specified was a database query, the Web client would receive the results of the search. If the item specified was a file that the Web client doesn't know how to display, it can pass it off to a helper application or viewer program.

If the user wants something else, she repeats the process, starting a new connection each time. From the Web server's point of view, the process is efficient because it doesn't have to keep track of ongoing requests.

WWW Features

Although not all WWW servers offer all the features listed here, most will. This provides the user with a sort of touch-screen interface; the user can click on different areas of a map or other image and get further information on that area.

WWW Resources

Resources for learning how to use and build Web servers are all over the Internet. Pay close attention to the Usenet newsgroups and e-mail list servers listed in Table 4-1, because they'll be a source of ongoing support, ideas, and information as WWW continues to develop.

Web Data Types

Like Gopher servers, Web servers can deliver many different types of data files in addition to HTML, which is the most common.

HTML

HyperText Markup Language, or HTML, is a method of marking plain text for both structural and layout elements, as well as links to other documents and images, sounds, movie clips, and so on. You can write it on a Macintosh, serve it from a Sun Microsystems workstation, and view it on a Windows PC. Any WWW client should be able to interpret the HTML you write, so long as you conform to the HTML standard.

SGML

Standard Generalized Markup Language, or SGML, may become a popular language for WWW documents.
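The request cycle described under How WWW Works (read the URL for protocol, server, and port; ask for the document; scan the returned HTML for inline images; request each image separately) can be sketched in Python. This is a simplified illustration, not the code of Mosaic, Netscape, or any real browser. The host www.example.org, the file names, the function names, and the table of default ports are assumptions made for the example, and nothing is actually sent over the network.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlsplit

# Conventional default ports, used when the URL does not name one.
DEFAULT_PORTS = {"http": 80, "gopher": 70, "ftp": 21}


def plan_request(url):
    """Split a URL into protocol, server address, port, and item to ask for."""
    parts = urlsplit(url)
    port = parts.port or DEFAULT_PORTS.get(parts.scheme)
    return parts.scheme, parts.hostname, port, parts.path or "/"


def http_get_request(path):
    """Build a minimal HTTP/1.0 request; each document gets its own connection."""
    return f"GET {path} HTTP/1.0\r\n\r\n"


class InlineImageFinder(HTMLParser):
    """Collect the SRC attribute of every IMG tag in a page."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names before calling this.
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.sources.append(src)


def inline_image_urls(page_url, html_text):
    """Return the absolute URL of each inline image, one follow-up request apiece."""
    finder = InlineImageFinder()
    finder.feed(html_text)
    return [urljoin(page_url, src) for src in finder.sources]


if __name__ == "__main__":
    page = "http://www.example.org/docs/index.html"  # hypothetical page
    print(plan_request(page))
    for image in inline_image_urls(page, '<H1>Welcome</H1><IMG SRC="logo.gif">'):
        print(http_get_request(urlsplit(image).path), end="")
```

The sketch keeps no record of earlier requests between calls, which mirrors the stateless behavior the text attributes to HTTP: the page and each of its images are fetched as independent requests.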
Many large organizations have used SGML extensively, and there is a movement to create SGML viewers that can be hooked up to WWW browsers as helper applications. One advantage seems to be SGML's ability to fully define every structural element in a body of text, a useful feature when automated programs are designed to go out and retrieve specific sections of documents for you, such as a table of contents.

PDF

Portable Document Format, or PDF, is the name for Adobe Systems' method of representing documents in a manner independent of the original application, software, hardware, and operating system used to create those documents. In other words, it doesn't matter what type of machine or program created the document--it will show up virtually the same on any type of machine with a PDF viewer.

Gopher

Because WWW was designed to pull all existing Internet services together in a web, it should be able to connect to Gopher servers.

WAIS

WAIS, or Wide Area Information Server, is another Internet resource that WWW was designed to work with.

GIF

GIF, or Graphics Interchange Format, is a method of storing graphics made popular by CompuServe, a commercial bulletin board service. That is, if you publish your images in GIF format, most people using graphical Web browsers won't have to do anything special to see them. GIF does a good job with crisp, sharp images (like icons), whereas JPEG is better for realistic images, such as scanned photographs. A controversy erupted in early 1995 when Unisys, the patent holder for LZW compression, a key element of GIF, decided to charge CompuServe and its developers for the right to use this part of GIF. End users did not have to pay anything, and although talk about creating a new image standard is flying, GIF is a mainstay on WWW for now because all graphical Web browsers can display inline GIF images.
TIFF

Tagged Image File Format (TIFF) is a widely used image format often associated with scanning software.

CGI Scripts

Technically, Common Gateway Interface scripts, or CGI scripts, are not a data type. CGI is a way to write programs (or shell scripts or Perl scripts) that perform some action when an item is chosen.

Plain Text

Plain text files can be stored on Web servers, but they won't look as good as HTML text. Once users get a look at nicely formatted HTML files, they're no longer satisfied with a plain text file, which is what they usually see on a Gopher or WAIS server.

PostScript

PostScript is a printer-independent language (that is, it doesn't matter which printer so long as it is PostScript compatible), also from Adobe Systems.

Sound Files

Several different sound file formats are available, and none works on all platforms.

JPEG

JPEG (Joint Photographic Experts Group) is a system for compressing still images that is better than GIF for full-color photos. A JPEG image usually has to be downloaded and opened in an external viewer, although some WWW browsers, like Netscape, handle inline JPEGs.

MPEG

MPEG (Moving Picture Experts Group) is a method for compressing movie images.

HTML--HyperText Markup Language

HTML is the essential ingredient in building a Web server. That is, HTML is the first computer code to offer a broadly accepted method for creating and displaying hypertext with images and text and sound and movie files in the same document. Unfortunately, the very attractiveness of the hypertext leads many to the false conclusion that they can further control the appearance of these Web pages. 0 offers many more formatting options than HTML 2.0. HTML assumes that different Web browsers display documents differently, according to the needs of the operating system and equipment for which they're designed.
Lynx cannot display images at all, and it can't handle different font sizes or even italics, because it was designed for plain VT100 terminals, a common type of dumb terminal. Lynx is a common way to connect to the Internet because it requires only a telnet connection, which is possible over even the slowest modems. As with Gopher, HTML has been evolving into newer versions. This process can be slow, because even when everyone agrees that improvements are necessary, it is hard to get all parties concerned to agree on how the improvements should be made. 0 was completed in the spring of 1995, and 3.0 and 3.1 are under discussion.

What HTML Is and Is Not

The purpose of HTML is one of the most hotly discussed issues in the Web world. S. That means that the information might be displayed as text or a graphic outline of that text or transposed to audio output.

SGML does not concern itself with the layout or final form in which the text will appear. Because different clients have different types of machines with different capabilities, SGML purists say that it is foolish and just plain wrong to try to specify in any way how the text will be displayed in the final form. document. On the other hand, many, if not most, publishers were drawn to the Web by its ability to present attractive mixes of graphics and text. Unconcerned with the SGML history of HTML, multimedia authors simply want to publish in this new medium and have more control over the way things are displayed. Naturally, SGML purists feel that HTML's use as a page layout language is completely inappropriate and in fact a perversion of the pure SGML concept.
However, SGML purists see H1 as a structural marker for the document, the principal header at the head of the main section.

Although the SGML purists may sound overly restrictive, in the long run they are probably right in suggesting that strict adherence to these principles will make these documents more useful for future document retrieval techniques.

Conforming to HTML Standards

You can get into some bad habits writing HTML. In the interest of improving the standard of Web pages, Mark Gaither of HaL Software Systems has created a free online HTML conformance-checking service (http: www.halsoft.com html-val-svc) on the Internet. cc.gatech.edu grads j Kipp.Jones HaLidation validation-form.html . You can either submit your HTML text to be checked or point to a URL that the HTML validation service will then check, sending its comments back to you. The key point here is that writing HTML that is clean and conforms to current HTML standards helps ensure that your documents will display correctly in all compliant browsers. Gaither, author of "Why Validate Your HTML Documents" (http: www.halsoft.com html whyvalidate.html), has commented that one reason to ensure HTML compliance is so that newer and better browsers can be written with the assurance that most HTML pages on the Web are correct. That makes the job of designing a new browser easier, because its designers don't have to continually allow for variations and mistakes in HTML layout. The writers of browser programs will be able to add core functions and greater sophistication if they can be sure the browsers will encounter well-structured HTML. The article called for the ability to inline pieces of documents as a way to cite someone else's work without infringing on their copyright (http: www.igd.fhg.de www www95 papers 95 webip.html). To use a piece of another document inline would require the ability to extract and retrieve specific sections of someone else's HTML to be used in your document.
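To give a feel for what a conformance check looks for, here is a deliberately tiny Python sketch that reports container tags that are never closed or are closed out of order. It is not the HaL validation service and does not check a document against the HTML standard. The set of tags it watches and the wording of its messages are assumptions chosen for illustration, and a real validator tests far more than nesting.

```python
from html.parser import HTMLParser

# Tags this sketch expects to be explicitly closed. The selection is an
# assumption made for the example, not the rule set of any real validator.
CONTAINERS = {"html", "head", "body", "h1", "h2", "h3", "a", "ul", "ol", "pre"}


class NestingChecker(HTMLParser):
    """Track open container tags and note any that close out of order."""

    def __init__(self):
        super().__init__()
        self.open_tags = []
        self.problems = []

    def handle_starttag(self, tag, attrs):
        if tag in CONTAINERS:
            self.open_tags.append(tag)

    def handle_endtag(self, tag):
        if tag not in CONTAINERS:
            return
        if self.open_tags and self.open_tags[-1] == tag:
            self.open_tags.pop()
        else:
            self.problems.append(f"unexpected </{tag}>")


def check_nesting(html_text):
    """Return a list of problem descriptions; an empty list means none were found."""
    checker = NestingChecker()
    checker.feed(html_text)
    checker.close()
    return checker.problems + [f"<{tag}> never closed" for tag in checker.open_tags]


if __name__ == "__main__":
    for problem in check_nesting("<body><h1>Murals</h2><ul><li>One</ul></body>"):
        print(problem)
```

A page that looks fine in one forgiving browser can still fail a check like this one, which is the kind of bad habit the text warns about.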
Writing HTML Quickly

One of the best tools for learning to write HTML quickly is the capability of all Web browsers to view and capture the source HTML of any document you find on the Web. That means that if you see something you like, you can look at it to see how it was done, and save it for further reference. But as with learning a foreign language, look at a wide variety of examples and pay attention to how they are used in different situations. Another quick way to get started is to convert an existing word-processing document into HTML, using some of the filters and converters discussed later in this chapter. LaTeX, Word for Windows, and WordPerfect documents can be converted this way, although sometimes you have to use a two-step process (first saving them in RTF, Rich Text Format). Once you've done that, making changes in your original word-processing document and converting again is an easy way to create useful Web documents quickly. Eventually, you'll need to learn HTML codes to handle problems that the converters and filters don't handle, but you can accomplish a lot by going this route first. EasyHTML is a program from NCSA (National Center for Supercomputing Applications--creator of NCSA Mosaic) that lets you create and save HTML pages interactively by filling out online forms on a Web server. The user answers questions and creates an HTML page one step at a time, with immediate feedback after each step, to show what the HTML page will look like. One is straight HTML and the other is as a form, so that EasyHTML can be used to re-edit the file.
EasyHTML doesn't let you use the full set of HTML commands, but it lets your users create simple HTML documents quickly and easily. ncsa.uiuc.edu easyhtml

Several good online tutorials and guides to HTML are listed in Table 4-1.

Guides to Writing HTML

An easy way to find out what mistakes others are making and avoid them is to search for sites that have gone to the trouble of compiling common errors and mistakes in writing HTML.

willamette.edu html-composition strict-html.html
cwis.uci.edu:8042 Staff StyleGuide.html
w3.org hypertext WWW Provider Style Overview.html
pcweek.ziff.com eamonn crash course.html
wa.com htmldev devpage dev-page.html

The most common mistake in HTML is to believe that the way your documents appear on your browser is the way that they'll appear on every other Web-browsing program.

Relative Versus Absolute Addressing

Not all URL links have to be absolute, or completely written out. It is often more useful to simply refer to other documents or files relative to the current server, directory, and document.

http: latino.sscnet.ucla.edu murals Sparc SPARC.html

And this link is relative, because it assumes the file is in the same directory as the file that does the linking (this trick does not work when linking to another server, of course):

sparc2.html

Note how short the second URL is. For example, if our server gets too popular and we're forced to move it to a new machine, or a different directory, the relative links won't need updating.

SGML Instead of HTML?

Many Web sites are using SGML for the primary storage of their Web documents and then converting them to HTML.

Several commercial SGML editors are available: 3.1 by SoftQuad, Inc.

HTML Editors, Filters, and Converters

Table 4-2 offers a partial list of HTML editors available for creating HTML documents.

One common complaint while I was writing this was that many HTML editors (especially those for Windows) are limited to files that are smaller than 32K.
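The way a browser turns a relative link into a full address, as discussed under Relative Versus Absolute Addressing, can be tried out with Python's standard urljoin function. In this sketch the host www.example.org and its directories are hypothetical stand-ins rather than the addresses printed above; only the file names SPARC.html and sparc2.html come from the text.

```python
from urllib.parse import urljoin

# Hypothetical address of the page that contains the links.
OLD_PAGE = "http://www.example.org/murals/Sparc/SPARC.html"
# The same page after the site moves to another machine and directory.
NEW_PAGE = "http://web2.example.org/art/Sparc/SPARC.html"


def resolve(page_url, link):
    """Return the address a browser would request for a link found on page_url."""
    return urljoin(page_url, link)


if __name__ == "__main__":
    # The relative link follows the page to its new home without being edited.
    print(resolve(OLD_PAGE, "sparc2.html"))
    print(resolve(NEW_PAGE, "sparc2.html"))
    # An absolute link keeps pointing at the old machine after the move.
    print(resolve(NEW_PAGE, "http://www.example.org/murals/Sparc/sparc2.html"))
```

The relative link resolves against whichever directory the page currently lives in, while the absolute link would have to be edited by hand after the move.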
One solution I've heard of for editors with a 32K file-size limit is to split large HTML files into pieces and then edit those pieces separately.

Another aspect of HTML editors that you should consider is whether they discourage or prevent you from making mistakes in HTML.

Web (HTTP) Servers

When choosing a Web server (technically they're called HTTP servers), the first place to check for online information is the WWW FAQ by Thomas Boutell at http: sunsite.unc.edu boutell faq www faq.html . proper.com www servers-chart.html .

As with Gopher servers, you'll need to address at least five concerns when choosing a Web server:

Function--Look at the functions it supports, the extra capabilities the server provides, and its record of compatibility with add-on indexing programs.
Support--Verify the level of support and activity this software enjoys, including newsgroups and mailing lists, as well as more formal commercial support mechanisms.

Another concern is ease of operation. You may find it worthwhile to first set up a Web server on a PowerMac or Windows machine, learn what's involved, and try out your publishing ideas.

Some servers can run as proxy servers. This means that if your organization has a firewall, you could have a proxy server sitting on the firewall, passing requests out to the rest of the WWW and saving the documents returned (called caching) so that they'll be ready for the next person who asks. Security provisions might be important to your organization if you want to limit the access to some sections of your Web server.

The section that follows lists many existing WWW (HTTP) servers and describes some of their various features. Some aspects to check out include the load, or number of requests that can be handled in a day; how easy it is to add information; the ability to limit the number of connections depending on the load; and the general popularity of the server (assuming that the more popular servers tend to get better user support from the Web community).
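The caching behavior of a proxy server mentioned above (pass the request out, save what comes back, and hand the saved copy to the next person who asks) can be sketched in a few lines of Python. This is only an illustration of the idea, not how the CERN server or any other proxy is implemented. The class name, the fetch callback, and the hit and miss counters are all hypothetical, and the sketch never expires or refreshes a saved document, which a real proxy would need to do.

```python
class CachingProxy:
    """Remember every document fetched so that repeat requests stay local."""

    def __init__(self, fetch):
        self._fetch = fetch   # callable taking a URL and returning the document
        self._cache = {}      # saved documents, keyed by URL
        self.hits = 0         # requests answered from the cache
        self.misses = 0       # requests passed out to the wider Web

    def get(self, url):
        if url in self._cache:
            self.hits += 1
            return self._cache[url]
        self.misses += 1
        document = self._fetch(url)
        self._cache[url] = document
        return document


if __name__ == "__main__":
    def pretend_fetch(url):
        # Stand-in for a real network request; the reply text is made up.
        return f"<HTML>contents of {url}</HTML>"

    proxy = CachingProxy(pretend_fetch)
    proxy.get("http://www.example.org/index.html")  # goes out through the firewall
    proxy.get("http://www.example.org/index.html")  # served from the saved copy
    print(proxy.hits, proxy.misses)
```

Passing the fetch function in as a parameter keeps the sketch testable without a network connection; a real proxy would open its own connections to the outside servers.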
As of May 1995, in my humble opinion, the CERN and NCSA servers are by far the most popular Web servers, with GN, MacHTTP, and Windows-based servers coming behind. I expect a great increase in the use of Macintosh, Windows NT, and OS/2 as platforms as more and more individuals start putting up Web sites on desktop machines.

UNIX

Until recently, most Web servers ran on UNIX, because most of the Internet was developed on UNIX, an extremely flexible and powerful operating system. Many Web servers and tools for Web management were developed first on UNIX and later rewritten (ported) to other operating systems. The drawbacks to using UNIX-based WWW servers are that you need to be sure you have good UNIX support, and it can be a complicated and arcane field. Some companies, notably Sun Microsystems and Silicon Graphics, are working to make this easier, because they are in the business of selling UNIX hardware. enst.fr pioch httpd .

CERN HyperText Transfer Protocol Daemon

The CERN HyperText Transfer Protocol Daemon (HTTPD) server is available at no charge for UNIX and VMS and provides the ability to run as a proxy server with document caching for faster access. w3.org hypertext WWW Daemon Status.html

NCSA HyperText Transfer Protocol Daemon

The NCSA server is available at no charge and runs on UNIX systems. Server administrators can decide whether they want to permit their machine's users (account holders) to write their own HTML files in their own directory (as opposed to the server's data directory). com smith would get you to Smith's home page directory. ncsa.uiuc.edu docs

GN

The GN server is a free combination Gopher/WWW server by Professor John Franks of Northwestern University.
He stopped development before Gopher+, and now he's moved on to a WWW-only server called WN, which you may want to consider if you don't need Gopher. math.nwu.edu:70

Netscape

Netscape Communications Corporation offers two commercial UNIX-based Web servers, Netscape Communications ($1,500; free to nonprofit educational and charitable groups) and Netscape Commerce Server ($5,000). Commerce Server makes possible secure financial transactions (for example, paying with a credit card over the Internet) with clients using their Netscape browser. com

Open Market WebServer and Secure WebServer

Open Market offers two commercial UNIX-based Web servers, one for information services ($1,495) and the other (Secure WebServer, $4,995) for electronic commerce. openmarket.com

Plexus

The Plexus Web server is UNIX based and was written in Perl by Tony Sanders, based on an earlier version by Marc Van Heyningen at Indiana University. bsdi.com server doc plexus.html

WN

The WN server by Professor John Franks of Northwestern University is a free HTTP server with many features, including title and keyword searching. math.nwu.edu

Apache

Apache is a public domain, patched (improved) version of NCSA's 1.3 HTTPD Web server.
It was created by a group of WWW providers and part-time HTTPD programmers to get HTTPD to behave the way they wanted it to. 3 HTTPD.apache.org

EIT Webmaster's Starter Kit

Enterprise Integration Technologies (EIT) has put together the Webmaster's Starter Kit, which is based on the NCSA HTTPD server for UNIX.

The enhancements extend NCSA's HTTPD in several ways, including virtual document configuration options, automatic server monitoring and restarting, request prioritization, polite errors, and polite down time. eit.com wsk doc

VMS

VMS (Virtual Memory System) is Digital Equipment Corporation's multi-user, multitasking operating system.

CERN HTTPD

The CERN HTTPD server is available at no charge for VMS and provides the ability to run as a proxy server with document caching for faster access. w3.org hypertext WWW Daemon Status.html

Region 6 HTTP Server

This WWW server for VMS, written by David L. Jones of Ohio State University, is said to offer a performance advantage over the CERN version, because it runs with DECthreads, which allows it to serve multiple users simultaneously. eng.ohio-state.edu www doc serverinfo.html

Macintosh

Macintoshes are user-friendly places to set up WWW servers, and they are gaining in popularity. It allows those who use Macintoshes to set up a Web server on the type of computer they're used to and conveniently make their information available worldwide. And for small and medium-sized Web sites, a Macintosh works as well as a more powerful and expensive UNIX machine, according to Joe Holmes of Sonoma State University in California. sonoma.edu btools theTest.html ; reactions and further commentary are available at http: www.sonic.net net.dreams word current.html . One of his main points is that while UNIX workstations are generally much more powerful than Macintoshes, they come with a large cost in UNIX support staff.
If you don't have that staff in-house, you should definitely consider using Macintoshes as Web servers, especially if you don't expect yours to be the largest site on the Internet. By using AppleScript and MacPerl, you can manage quite a bit of customization. MacTCP (the underlying software that communicates with the Internet) is the factor that limits the Web load that Macintosh-based servers can handle, but Apple is working on a replacement. For excellent resources and data on Macintosh WWW systems and servers, look at the Macintosh WWW Development Guide by Jon Wiederspan of the University of Washington (http: www.uwtc.washington.edu Computing WWW Mac Directory.html). Another alternative for mounting a Web server on a Macintosh or PowerMac is to run A/UX (Apple's UNIX) instead of the normal Macintosh operating system.

WebSTAR (Formerly MacHTTP)

The WebSTAR (MacHTTP) Web server for Macintosh was originally written by Chuck Shotton of BIAP Systems (http: brain.biap.com) and has some enthusiastic users. 0 has been renamed WebSTAR ($295 educational, $795 other, from StarNine Technologies; http: www.starnine.com) and speeded up and enhanced considerably. WebSTAR allows for multiple simultaneous transfers; CGI scripts in AppleScript, MacPerl, or any other language; directory and page password security; and setting maximum simultaneous connections. It is comparable to commercial Web servers on other platforms. 2 is still available from StarNine Technologies ($75 educational, $95 other).

Netwings

Netwings is an HTTP server for Macintosh built on the 4D database system. More than just a Web (HTTP) server, it provides most functions you would want from an Internet server, such as e-mail, mailing list, and database services.
A one-user license costs $1,495, and prices escalate from there: five-user license, $7,150; 25-user license, $33,650; 50-user license, $59,500; 100-user license, $104,650. com

HTTPD4Mac

HTTPD4Mac is a bare-bones Macintosh Web server written by Bill Melotti and is not a port of either the NCSA or CERN servers. 246.18.52

MacCommon Lisp Server

You can now interface your Lisp programs to the world to show exactly what you can do better and faster in Lisp (a programming language). 0 and HTML 2.0 that comes complete with source code. ai.mit.edu projects iiip doc cl-http home-page.html

Windows NT

Windows NT 3.5 is now a serious choice for many Web administrators, primarily because of the ease of installation and administration. Windows NT also offers the ability on some servers to interact with Visual Basic and data in regular Windows applications, such as Microsoft Excel and Access.

EMWAC HTTPD for Windows NT

Both a freeware and a professional Web server for Windows NT ($1,995 in the United States; $2,490 on the international market) are available from the European Microsoft Windows NT Academic Centre (EMWAC) at the University of Edinburgh. http: emwac.ed.ac.uk html internet toolchest https contents.htm

SAIC-HTTP

A noncommercial license is available to use the Web server for Windows NT developed by San Diego-based Science Applications International Corporation (SAIC). itl.saic.com features.html

Netscape

Netscape offers a Windows NT version of its Communications server that has the same features as the UNIX version. netscape.com for details.

Folio Infobase Web Server

Folio Corporation is licensing the Edinburgh University Computing Service to provide an HTTP server that runs on Windows NT and generates HTML pages on the fly from Folio infobases. Infobase is Folio's term for its extremely fast and efficient full-text searchable database system that first appeared as the software behind Novell NetWare online manuals. S.
Tax Code to WordPerfect manuals to the Bible, and this server will make it easy to put them onto the Internet. By adding this feature to Web server software, Folio will enable its UNIX users to browse the Folio infobases through their Web browsers.folio.com
WebSite
WebSite ($499) is a 32-bit WWW server for Windows NT 3.5 released in May 1995.ora.com gnn bus ora item website.html
InterNotes Web Publisher
Lotus Development Corporation has announced that it is selling a Web server ($7,500) that runs on Windows NT and acts as a gateway between its popular Lotus Notes system and the WWW. Lotus Notes includes data replication and synchronization (which keeps versions of the database up to date on all machines), as well as full-text search, and has support for Macintosh, DOS, Windows, and OS/2 clients. InterNotes Web Publisher translates Lotus Notes documents and databases into HTML and delivers them in response to WWW queries. It also automatically creates HTML pages of Lotus Notes views, giving browsers an easy way to navigate among documents on the Web site.lotus.com inotes
DOS and Windows
DOS runs on a minimal PC, so it is ideal for extremely inexpensive servers. Windows has the advantage of being widely used (even if it's not always stable), and its user-friendly features make it an easy platform with which to start Web publishing.
Hype-It 1000
Hype-It 1000 ($549) and Hype-It 2000 ($1,995) are commercial Web servers that run on a 386 PC running DOS and support 30 simultaneous connections.com homepage.htm
KA9Q NOS HTTP
KA9Q NOS is a full-fledged Network Operating System (NOS) that runs under DOS and acts as a server for e-mail, FTP, Gopher, WWW, and CSO (Central Services Organization).chem.ufl.edu ka9q ka9q.html
HTTPD for Windows by Robert Denny
This Windows version of HTTPD has most of the features of the popular UNIX version, including CGI scripts.city.net win-httpd
Other Platforms
HTTPD for the Amiga
This is a port (or transfer) of the NCSA HTTPD server to the Amiga computer platform.Omnipresence.com amosaic 2.0
GLACI HTTPD
The Great Lakes Area Commercial Internet (GLACI) HTTPD server runs on Novell NetWare 3.x and up as an NLM (NetWare Loadable Module). It can be configured with IP access lists and can also allow users to store their personal HTML documents in their Novell home directory.glaci.com info glaci-httpd.html
Webshare CMS HTTPD
This free server by Rick Troth is written to run on the VM/CMS operating system.ua.edu troth rickvmsw rickvmsw.html
Deciding Who Runs Your Web Server
This is a good time to remind you of the difference between a system administrator and a data librarian. A system administrator keeps track of the technical details of the operating system and of the computer on which your server runs. Neither description may fit you exactly, but whereas the cooperation of a good system administrator is essential for the UNIX Web servers, much of the job of maintaining your Web server will fall to the data librarian.
Take control of your data. So get your system administrator to read the documentation and go over it with you, so that you know what your server is doing.
Installing a Web Server
You have three areas to consider, no matter which Web server you install:
Server Configuration.
Configuring your server requires you to specify where the server will store its log files, what port it will answer (normally 80), an e-mail address for the server administrator, the root directory in which the server program files reside, the host name that the server should give out, how long the server should wait for a client once a transaction has started, and so on.
You'll come across another Web server configuration issue with UNIX Web servers. On the other hand, if you don't see much WWW traffic, it seems a shame to have the Web server daemon always running.
Resource Configuration.
UserDir is important because it determines whether your staff or others with accounts on your Web server can set up their own public HTML directory. Otherwise, those with accounts on your system would be able to publish on the Internet simply by creating a subdirectory by that name in their directory and putting files into it.someplace.com, Jack Todd (with an account under the name todd) could refer to his own Web directory as http://www.someplace.com/~todd/. He could put as much or as little as he wanted in that directory (including subdirectories), and it would all be available under that URL, which he might put on his business card. This is happening all over the place, and you'll need to decide how much freedom you want your system users to have. Remember, this applies only to people with accounts on your system (running NCSA HTTPD server), not people browsing your Web site from elsewhere.
The DirectoryIndex command lets you avoid showing the contents of a directory.html default name exists in a given directory, it will not send the directory list or index but will instead send the index.html file to the Web browser. If a file with that name is not there, the Web server will create an index of the contents of that directory and send that to the Web browser.html file, the contents of that directory are available only if someone links to them by name.
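The DirectoryIndex behavior just described (send the index file if one exists, otherwise generate a listing) can be sketched in a few lines of Python. This is only an illustration of the decision the server makes, not the NCSA HTTPD code itself; the function name directory_response and the file names in the demonstration are assumptions made for the example.

```python
import os
import tempfile
from html import escape
from urllib.parse import quote


def directory_response(dir_path, index_name="index.html"):
    """Return the HTML a server might send for a request that names a directory."""
    index_path = os.path.join(dir_path, index_name)
    if os.path.isfile(index_path):
        # An index file exists: send it, so the directory listing is never shown.
        with open(index_path, encoding="utf-8") as handle:
            return handle.read()
    # No index file: build a listing that links to everything in the directory.
    rows = ['<LI><A HREF="%s">%s</A>' % (quote(name), escape(name))
            for name in sorted(os.listdir(dir_path))]
    return "<UL>\n" + "\n".join(rows) + "\n</UL>"


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        # Hypothetical file that is not ready for the public yet.
        with open(os.path.join(root, "draft.html"), "w", encoding="utf-8") as fh:
            fh.write("<P>Not ready yet</P>")
        print(directory_response(root))  # the generated listing exposes draft.html
        with open(os.path.join(root, "index.html"), "w", encoding="utf-8") as fh:
            fh.write("<P>Under construction</P>")
        print(directory_response(root))  # index.html is sent instead of a listing
```

Running the demonstration shows the shielding effect: once a placeholder index file is present, the names of the other files no longer appear in the response.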
If you have no links to that directory from elsewhere, users will have no way of knowing what the names of the other files are.html file when you're ready, but in the meantime it acts as a shield for that directory. One other advantage is that you can shorten your home page URL.sscnet.ucla.edu murals murals.html, I can change the name of murals.html to index.html and leave it off the URL so it becomes http://latino.sscnet.ucla.edu/murals/, and the effect is the same.
The AddType directive allows you to set certain file extensions to be treated as certain types of data.htm an equivalent of .html to ease editing of your HTML files on DOS systems that can have only three-character extensions.htm extension and then FTP them up to your Web server without changing their names.
Access Control (Security).
If anyone for any reason should not see what's on your server, you'll need to know how to keep that person out.
Here are some simple techniques for restricting access to your server or sections of your server:
When restricting access by Internet domain or subdomain, you can also restrict access to subdirectories or block someone from using the entire Web server. For more information about restricting by domain, see the section on Configuring Access and Security in Chapter 3, because the process is similar. Restricting by domain works when you have a small number of local subdomains that can be listed easily; anything else can be said to be external and shouldn't be allowed to see your server. This is appropriate when your company or campus is licensed for certain types of things, and you don't want to publish them for the world.
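The domain test behind this kind of restriction is simple enough to sketch in Python. The function below is a hypothetical illustration, not any server's actual access code: it assumes the server has already looked up the client's host name, and the domain names in the usage lines are examples only.

```python
def host_allowed(hostname, allowed_domains):
    """Return True if hostname is one of the allowed domains or lies inside one."""
    host = hostname.lower().rstrip(".")
    for domain in allowed_domains:
        suffix = domain.lower().strip(".")
        # Match the domain itself or any host beneath it. The leading dot
        # keeps "notsomeplace.com" from matching "someplace.com".
        if host == suffix or host.endswith("." + suffix):
            return True
    return False


# Hypothetical short list of local subdomains; everything else is external.
LOCAL_DOMAINS = ["someplace.com", "sscnet.ucla.edu"]

print(host_allowed("www.someplace.com", LOCAL_DOMAINS))
print(host_allowed("latino.sscnet.ucla.edu", LOCAL_DOMAINS))
print(host_allowed("notsomeplace.com", LOCAL_DOMAINS))
```

The first two calls report an allowed host and the third does not. A check like this is only as trustworthy as the host-name lookup behind it, which is one reason it suits licensed material better than truly confidential material.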
Hiding your server means running it on a nonstandard port (other than port 80) and not advertising its host name and port number except to those who need access.
Successfully hiding your files and directories depends on your ascertaining that a particular set of files or directories on your server has absolutely no links and that users can't browse and find them on their own.html (the NCSA default) is present in that directory, the file will be used instead. This is a technique usually used to present a home page for that directory with links to the contents of that directory. But if you deliberately don't make links for the files you want to hide, no one will be able to find them in that directory unless they already know the file names.
Several Web servers provide for password protection. Ask whether the password is passed over the Internet in the clear or whether it's encoded in some manner that is relatively secure. Again, for real security look to one of the security systems being developed, such as Netscape's SSL, S-HTTP from Enterprise Integration Technologies, and the combination from Terisa Systems. As of May 1995 the W3 Organization had not yet decided on a standard means of providing secure exchanges to Web servers across the Internet.
Designing Your Web Pages
As with Gopher servers, everyone has an opinion about the best way to lay out a Web server. http://www.arcade.uiowa.edu/hardin-www/jaffeDesign2.html http://info.med.yale.edu/caim/StyleManual_Top.HTML
Keep Home Pages Small
Keep your home page or any other entry points small. Remember that many people will connect to your home page just to see what's there, not because they really want all the information you have to offer. Because small size equals speed on the Internet, it's only courteous to have a small, quickly downloaded home page that summarizes the information available with links to the larger bodies of content.
Watch Total Size of Your Pages
The more images you have, and the larger they are, the longer your page will take to load. The inline images that go with an HTML page add to the total time it will take to display, so keep them small and have as few as possible. However, if you use inline images (of green and red dots, for example) several times in a document, most Web browsers cache, or save, them locally, so they are downloaded just once and then displayed several times.
Use Thumbnail Images as Teasers
Offer thumbnail graphic images that link to larger images on a separate page.
Offer Text-Only Views
Providing an alternative on your home page for text-only viewers is essential for the many people who use a text-mode Web browser. In many cases they have no other option, because the equipment they're using or their connection to the Internet will not support graphics.
Text-only views also help those who have graphical Web browsers but turn off image loading so that they can move through Web sites quickly and explore what is available.
Text views of your site are important to another group of Web users: the blind or vision impaired.
Have a Unifying Element or Graphic
Users can easily become confused as they move through link after link of your server. Repeating a small graphic or piece of text at the top of all your pages will remind them where they are. If you use a graphic, be sure it's small and that you give a text description for it using the ALT (alternate) attribute in HTML. In this way users of text-only browsers get your text description of the image instead of just a line that says "image."
Provide Sets of Pages for Downloading
Consider offering a way for users to download all the linked files and images pertaining to a certain subject or theme in your Web server, if you think they'll want the whole set.
For example, finding an online manual for the NCSA Web server is nice, but sometimes it's more convenient for users to get it all at once so that they don't have to worry that the connection to NCSA will be down just when they need to look at the manual. Group the relevant Web pages and images, and compress them into one file that users can download easily and view locally.
Provide Print Versions
Users find PostScript or PDF (Portable Document Format) documents useful alternatives to your Web pages. A PostScript, PDF, or Microsoft Word version of your information can provide even better document formatting than is possible with HTML. In certain cases, treating your Web server as a distribution method for these print versions of your information may be advantageous. Also, remember that many people have less time online than they would like, and they may prefer to download the information you have to offer in order to read and study it at their leisure.
Divide Your Information into Chunks
Wherever possible, divide your information into chunks that can be conveyed in one- to two-screen pages. If you find your documents are taking more than a screen or two, you should think about splitting them into smaller sections.
Offer Alternatives (ALTs) for Inline Images
Take advantage of the HTML ALT code in your inline images to provide text descriptions of those images for the users who can't view them.
Explain Your Server on Your Front Page
Although not everyone will enter your Web server through your front page, many will. If your server is intended to accommodate only students on your campus, make this clear so that others won't waste their time.
Sign and Date Every Document
The World-Wide Web is new and exciting, but time does pass, and people do need to know when documents were written and last updated.
Avoid "Click Here" Links
Make each link descriptive.
If the text of your links uses the word "here" (as in "Click here for tips on investing"), the indexer will pick up a bunch of "heres" and nothing useful.
Map Your Server
Although it's not always possible or necessary to map every link or even every section of your server, users find maps useful. See Figure 4-1 for an example of how one company used a map of its Web site to provide access to any page on its Web server with three clicks, at most, of the mouse.
Ask for Feedback
Make it easy for users to give you feedback. Build a comment form or supply a mailto link, and be sure to list your e-mail address in case the first two don't work.
Use Directional Links
Be sure that you put a link back to your home page on all your other pages. Remember that some users will land in the middle of your Web pages after doing an Internet search, and unless you give them a way to get to your home page, they may just give up.
Warn of Large Files
When your links lead to large files (more than 50K), let your readers know in advance, ideally by specifying the file size.
Adding Information and Files
Several methods are available for adding items to your Web server.
Typing text directly into an HTML editor gives you control over every step of the process but can be quite time-consuming. Although you might not be able to imagine this now, eventually you may find that you need greater detail in accessing pieces of documents.
Converting word-processing files to HTML is already possible for WordPerfect and Microsoft Word for Windows (versions 2.0 and 6.0). When you have a large amount of information, creating a database in which to store it may be a good idea.
Sims, computing coordinator for the business faculty at Edith Cowan University in Australia, is to combine a naming convention with a database to create individual HTML documents for each record in the database.html.
Finally, as you explore the Internet and in your reading, you will come across links or URLs to all sorts of useful or interesting locations. Explore the bookmark capability of your browser so that you become proficient at organizing and copying the links that you save there, a relatively painless way to add to your server's resources. Internet policy isn't clear on this, but it's probably best to tell the administrator of any site to which you have provided a link.
Be Careful of Embedded Codes
It is tempting to take your existing text files and online documents and add them directly to your Web server. I once came across a situation in which a vendor's Web site examples used the characters < and > in the text. HTML interpreted them as codes, and they did not display correctly on either Web browser to which I had access at the time (Lynx and Mosaic). The greater than and less than signs were being interpreted as HTML codes, but because they did not adhere to HTML coding requirements, they caused the loss of some text.
Web Tools
Many special tools, both free and commercial (even more than for Gopher; see Table 4-3), have been developed to help Web administrators manage their servers. Realtime in this case means a special image or document that might be always changing (like the Southern California Traffic Report, http://www.scubed.com/caltrans) or documents created on the fly or in reaction to input from a user.
Advanced Web Techniques
Once you've got your Web server up and running, and you've tired of the thrill of seeing your HTML pages on screen, you'll want to try out some of the more advanced capabilities of your Web server.
Images
Images present a wealth of possibilities and problems.
Also, color quality varies considerably according to whether the image was scanned in 8-, 16-, or 24-bit color and viewed on Macintosh, Windows, or UNIX workstations.
Postage Stamp or Thumbnail Images
Large image files take time to transfer, so many sites make thumbnail or postage stamp-sized reductions of their image files to give users a taste of the images available. Plan for the smaller versions to take up 2K to 10K instead of the 20K to 200K a larger image file might take. Although the postage stamp images are stored as GIF files so they can be used as inline images, the original full-size image could be in GIF or JPEG.
Transparent GIFs
Making the background of a GIF image transparent, so that it takes the color of whatever background the browser uses, can be a nice aesthetic touch. The giftrans program can do this on UNIX and DOS, and a program aptly called Transparency can do this on Macintoshes.
Clickable Images
A clickable image is a graphic image on a Web page that has certain sensitive areas that, if clicked with a mouse, link to some other page or program. When the user clicks on part of the image, its x-y coordinates are sent back to the WWW server, where a CGI script or program processes them and performs some specific action, depending on which section of the image the user selected.
Here's a short list of examples:
A clickable map of a city could lead to a list of the restaurants nearest the spot chosen.
An image could be enlarged by clicking on various sections, assuming you have larger versions of the image scanned and available.
Clicking a painting in different areas could lead to commentary on various aspects of the painting by art critics and historians.
A dissection of an animal, or even a human being, can be guided by clicking your way through pictures of the organs.
A common use is to employ a clickable image as a sort of menu to the various areas of the Web site.
To create a clickable image, decide first which areas of the image should lead to which actions. The NCSA version of clickable images gives you a choice of points, squares, rectangles, circles, and polygons with as many as 100 vertices. Points would be difficult for the user to click accurately, so they use a "closest to" rule: the point that is closest to the coordinates clicked will respond. Then mark these areas, and record in a map file the x-y coordinates and the URL to the appropriate image, document, or script for each area. The top left corner is 0,0; heading down increases the value on the y-axis, and moving to the right increases the value on the x-axis.
The major drawback of the current method of providing clickable images is that users get no immediate feedback as they move the mouse over the image. In development are several alternatives for image mapping that allow the browser to do more of the work and provide immediate feedback.
Interactive Image Format (IIF)
Interactive Image Format (IIF) is a form of client-side clickable image developed at the University of Michigan by the Weather Underground (a weather service, not the 1960s radicals). As you move your mouse across the map, text in a box at the top tells you which weather-reporting station is nearby.sprl.umich.edu
Creating GIF Images on the Fly
"On the fly" means that your server creates the GIF image only after the user requests it. The image might be a special graphic representing flood damage in an area, a zoom-lens view of a map, a spinning globe, a graph of rising or falling profits, or the amount of coffee left in a pot.
CGI Scripts
CGI scripts are behind many of the most interesting applications on the Web. Whenever the Web server does something special (as opposed to simply providing a file), it does so because of a CGI script.
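The clickable-image lookup described earlier is a good first example of the work such a script does. The following Python sketch is offered under stated assumptions: the region list is an in-memory stand-in for a map file (it is not the NCSA map-file syntax), the URLs are made up, and the rule that shapes take precedence over points is an assumption of this example. Coordinates follow the convention given above, with 0,0 at the top left.

```python
import math


def point_in_polygon(x, y, vertices):
    """Ray-casting test: count how many polygon edges lie to the right of (x, y)."""
    inside = False
    for i in range(len(vertices)):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % len(vertices)]
        if (y1 > y) != (y2 > y):  # edge spans the horizontal line through y
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def lookup(x, y, regions, default_url=None):
    """Return the URL for a click at (x, y); the nearest point wins if no shape is hit."""
    nearest_url, nearest_dist = None, None
    for shape, coords, url in regions:
        if shape == "rect":
            (x1, y1), (x2, y2) = coords
            if min(x1, x2) <= x <= max(x1, x2) and min(y1, y2) <= y <= max(y1, y2):
                return url
        elif shape == "circle":
            (cx, cy), radius = coords
            if math.hypot(x - cx, y - cy) <= radius:
                return url
        elif shape == "poly":
            if point_in_polygon(x, y, coords):
                return url
        elif shape == "point":
            dist = math.hypot(x - coords[0], y - coords[1])
            if nearest_dist is None or dist < nearest_dist:
                nearest_url, nearest_dist = url, dist
    return nearest_url if nearest_url is not None else default_url


# Hypothetical menu image: each entry is (shape, coordinates, URL).
REGIONS = [
    ("rect", ((0, 0), (100, 50)), "/menu/news.html"),
    ("circle", ((200, 100), 30), "/menu/weather.html"),
    ("poly", [(0, 60), (80, 60), (40, 120)], "/menu/maps.html"),
    ("point", (300, 20), "/menu/help.html"),
    ("point", (300, 180), "/menu/about.html"),
]

print(lookup(10, 10, REGIONS))    # inside the rectangle
print(lookup(290, 150, REGIONS))  # no shape hit, so the nearest point responds
```

A real script would read the click coordinates from the request and answer with a redirect to the chosen URL; that plumbing is left out here.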
CGI, or the Common Gateway Interface, is a standard way for external programs to interact with Web and other information servers. Some functions, like gateways to Archie, Finger (a user information lookup command; see Chapter 6), and WAIS, have already been added by CGI scripts or programs that come with your WWW server software.
The Common Gateway Interface designates how programs will interact with Web servers behind the scenes. For example, a commonly used CGI script takes the information from an online comment form and e-mails it to the Web administrator. It does this by taking the information submitted on the form and using it to fill out a mail message, which it sends. CGI programs can be UNIX shell scripts (a series of commands in a file, like DOS batch files), Perl scripts, or programs written in C, C++, or other programming languages.ncsa.uiuc.edu cgi interface.html .ncsa.uiuc.edu cgi ; one link to a CGI primer is http://hoohoo.ncsa.uiuc.edu/cgi/primer.html.
NCSA-Supplied CGI Scripts
The simplest way to start using and writing CGI scripts is to look at the CGI scripts and example programs that came with your server or those mentioned here.
jj is an example HTML form that Champaign, Illinois, residents use to order submarine sandwiches from a restaurant called Jimmy John's.
test-cgi is a script that echoes all the information that accompanies a CGI script request, so that you can test the design of your forms.
uptime runs the uptime command on the server, if it exists, and sends the response to the user.pl is a CGI script written in Perl that passes queries on to a WAIS server, wraps the response in HTML, and sends it back to the user.
Other CGI script tricks include adding access counts to your documents, which allows your users to see how many times a particular document has been accessed.
Chuck Musciano of Harris Corporation explains how to do this and gives away a C program for the NCSA HTTPD server that counts accesses and displays a message saying you are visitor number such and such.corp.harris.com access counts.html
Server-Side Includes
Server-side includes are a feature of NCSA's HTTPD server that permits a server to automatically include information from files and environment variables in an HTML file as it is being sent to the user. One useful application is to have your HTML documents automatically show the date they were last updated, without you or anyone else modifying the date in the file. Server-side includes have many other uses, but be sure that you look for any warnings or advisories about them, because in some situations they may pose a security threat.
Forms
Online forms are among the most useful and appreciated features of Web servers.
Basically, forms consist of two parts: the HTML description of the form and the CGI program on the server that will do something with the answers to the form's questions.
HTML 2.0 forms can be created from the following pieces:
INPUT specifies a field that the user fills out. You control the maximum length of the text the user can enter as well as the size of the onscreen space. Input fields can be of the following types:
CHECKBOX provides users with a list of values and allows them to check off the ones they want to use.
HIDDEN sends the contents of this field to the server along with the completed form but doesn't show the field to the user.
IMAGE lets you specify an image that, when clicked, causes the form to be submitted with the x-y coordinates of the clicked spot.
PASSWORD is the same as TEXT (a single-line entry field), but what the user types does not show up on screen.
SELECT allows the user to select from a list of headings or options.
I've seen one Web server with a SELECT option of several hundred items, but because SIZE wasn't specified, users of some browsers could choose only those options that appeared on the first screen because the browser didn't scroll automatically. Once SIZE was set on this server (to 17 in this case), the browser started scrolling through the entire list, showing 17 items at a time.
TEXTAREA lets users enter more than one line of text.
Almost all the fields also have a name attached to them so that your CGI script can easily distinguish between them. Each answer the user gives is paired with its variable name, and these sets are sent back to the Web server.
Several books on HTML and many online resources, including one from NCSA (http://www.ncsa.uiuc.edu/SDG/Software/Mosaic/Docs/fill-out-forms/overview.html), provide, among other things, examples of forms, complete with explanations. At the end of its document NCSA is kind enough to offer a test server that you can use to test any forms that you develop. The script simply echoes back to you any of the fields that it receives, so you can see whether they work as you expected. Don't forget that you can always look at the source HTML of any form you come across on the World-Wide Web.
Some examples of forms in use on the Web include a form for adding home pages and a form that forwards e-mail. Figure 4-2 shows part of a form used by the Web site HUMnet (Humanities Computing at UCLA) to allow students, staff, and faculty to add their own home pages to the humanities Web server. Figure 4-3 shows a portion of the HTML that created the form that appears in Figure 4-2. Note all the departments listed as options and how Art History is listed as the default option by using OPTION SELECTED (see Figure 4-3). The E-mail Forwarding Request Form in Figure 4-4 also was designed and is used by HUMnet at UCLA.
Note that HUMnet also uses this form to communicate the rules and establish the information required for e-mail forwarding on its system.
Web Indexing
One criticism of the WWW is that it is not nearly as easy to index as Gopher servers. If you want to index all documents and links in a Web server, you have to actually scan through all the HTML documents, looking for the title of each document, as well as all the HTML links. And when you do find a link, what do you index: the link's URL or the link's descriptive text (see Figure 4-6)? The link URL is great for getting you somewhere, but isn't always useful for describing what it's linking to.
Nonetheless, the reasons for indexing Web servers are the same as for Gopher servers: so your users can find out what you have and don't have quickly and with minimal frustration, and so your server is efficient in providing its information. The important thing to keep track of is what is being indexed: file names and links, document titles, the full text of all documents, or an abstract written by the Web administrator. Look at them all, but be sure that when you choose one, you know exactly what it is designed to index and what its advantages and disadvantages are.
Internal Indexers
I call do-it-yourself indexing internal indexing.igd.fhg.de neuss
GlimpseHTTP is a powerful full-text indexing and query system that allows approximate matching, Boolean searching, and limited regular expression (UNIX-style) searching.cs.arizona.edu:1994 ghttp
External Indexers (Web Crawlers, Spiders, and Worms)
External indexers are programs that reside on someone else's machine but index all servers. Often called Web crawlers, spiders, worms, and robots, they are programs that attempt to traverse all the known menus or links in Gopher and Web servers in order to build comprehensive indexes of Gopher or Web space.
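Whether the indexer is internal or external, the scanning step described under Web Indexing comes down to pulling the title and the links out of each HTML document. The sketch below uses Python's standard html.parser module to do that, keeping both the URL and the descriptive text of every link so that either can be indexed. The class and function names and the sample page are hypothetical; this is not the code of any indexer named in this chapter.

```python
from html.parser import HTMLParser


class PageIndexer(HTMLParser):
    """Collect the document title and (URL, descriptive text) pairs for its links."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.links = []
        self._in_title = False
        self._href = None      # URL of the link currently open, if any
        self._text = []        # text seen so far inside that link

    def handle_starttag(self, tag, attrs):
        # html.parser reports tag and attribute names in lowercase.
        if tag == "title":
            self._in_title = True
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                self._href = href
                self._text = []

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
        elif tag == "a" and self._href is not None:
            text = " ".join("".join(self._text).split())  # collapse whitespace
            self.links.append((self._href, text))
            self._href = None

    def handle_data(self, data):
        if self._in_title:
            self.title += data
        if self._href is not None:
            self._text.append(data)


def index_page(html_text):
    """Return (title, links) for one HTML document."""
    parser = PageIndexer()
    parser.feed(html_text)
    parser.close()
    return parser.title.strip(), parser.links


# Hypothetical page used only to show the output.
SAMPLE = ('<HTML><HEAD><TITLE>Murals of East LA</TITLE></HEAD><BODY>'
          '<A HREF="murals/index.html">Mural photographs</A> or click '
          '<A HREF="tips.html">here</A>.</BODY></HTML>')

title, links = index_page(SAMPLE)
print(title)
for url, text in links:
    print(url, "->", text)
```

The second link in the sample shows why descriptive link text matters: an index built from link text would record only the word "here" for it.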
It is important that you understand the distinctions between each one so that you can make sure that your server and its documents are indexed in the way that's most appropriate for them.
These programs require your cooperation to avoid excessive efforts on their part and to limit the load on your server. Or you may not want it indexed for some reason, either to limit access or because the information (such as Usenet News links) changes too quickly. Certain hidden files can be placed in specified locations on your server that tell the indexing programs which sections to index and which to ignore. The link to Martijn Koster's list of Web Wanderers, a good list of Web robots and spiders, is http://web.nexor.co.uk/mak/doc/robots/active.html.
ALIWEB
ALIWEB asks Webmasters to summarize in an index file the various themes or subjects covered in their WWW server.nexor.co.uk aliweb doc aliweb.html , it looks for this file and checks its structure; this index file is picked up by the ALIWEB harvester at regular intervals. The index file is a plain ASCII file in a special format left in a special place on the Web server.
GENBBB
GENBBB allows you to add WWW titles, reports, and links to other virtual libraries via an online WWW form.cs.colorado.edu homes mcbryan public html bb summary.html
EINet Galaxy
EINet Galaxy lets you add your own annotations and links to its catalog.einet.net galaxy.html
Lycos
Register your Gopher, WWW, or FTP site with Lycos at http://lycos.cs.cmu.edu/lycos-register.html.
World-Wide Web Virtual Library
This is a directory or catalog of the Internet by subject.w3.org.w3.org hypertext DataSources bySubject Overview.html
Advertising and Registering Your Web Server
Putting something on the Web doesn't necessarily publicize it. Unless you want to have a little-known Web server, you'll want to register it and advertise it as much as possible.
One drawback to the Internet is that there is no guaranteed way to intensively advertise your site among the tens of thousands of other Web sites available for browsing. Try these techniques:
w3.org hypertext DataSources www Geographical generation hew-servers.html .
infosystems.www.announce.
w3.org, which should get your server on the W3O World-Wide Web servers list.
uiuc.edu in HTML format and third person to get your announcement on the NCSA Mosaic's What's New page.
signature file (an e-mail feature on many systems that lets you automatically place your organization, title, and other data at the end of every message), on your business cards, brochures, and such, and spread the news via word of mouth and e-mail.
Summary
WWW is a method of publishing on the Internet that is generating tremendous interest and enthusiasm among users.
WWW (technically, HTTPD) servers are available in both commercial and free versions for most types of computer systems. Although UNIX has been the operating system of choice for development and for most WWW servers, that is changing as more and more individuals set up servers on their desktop computers: PCs running Windows or Windows NT, or Macintoshes. Another option is to take a 486 or Macintosh and run it with a UNIX operating system, which can make it perform much faster and more efficiently as a Web or Gopher server. Writing HTML is getting easier with the new tools (both HTML editors and word-processing add-ons) that are being developed.
Design your Web site so that its contents and purpose are clear to even the most casual user. Where possible, index the contents with WAIS, Glimpse, ICE, or some other indexing program to allow users to quickly search for material. When you can't avoid having large documents or files, label the links with the size of the file so that users are forewarned. Considerable programming effort and creativity are being applied to making Web servers do more than just provide documents and files on demand.
CGI scripts are programs that can be added to most Web servers to perform special functions such as searches, updating databases, performing commercial transactions, and providing interactive online forms. Sun's HotJava is adding an actual programming language that will be understood by any HotJava-compatible Web browser (see Chapter 11). With more than 25,000 WWW servers on the Internet, and the number increasing all the time, it is extremely important to register and advertise your Web site in as many places as possible. Now, with robots and spiders like Lycos from Carnegie Mellon University that traverse Webspace to add to their indexes of Web documents, the situation has improved dramatically.
WAIS, pronounced "ways," stands for Wide Area Information Server, an animal quite different from Gopher or WWW. Whereas Gopher and WWW are systems that help users to look at documents on various servers with ease, and indexing was added as an afterthought, WAIS was designed from the beginning to retrieve information from multiple indexed document sources. The emphasis on searching and the ability to query many different data sources simultaneously are why WAIS is often the indexer or search engine running in the background for Gopher and WWW. A key point about WAIS is that your users do not have to use special WAIS clients, because you can use WAIS indexes with a Gopher server or a WWW-to-WAIS gateway.
To a greater extent than the various versions of Gopher and WWW, WAIS is split between a commercial version, offered by WAIS, Inc., and freeware versions supported by various organizations and individuals., costs $15,000, and there has been no intermediate version available at a lower price.
WAIS: Pros and Cons
WAIS is one approach to indexed searching.
It has the advantage of being written to work with both text and nontext files, so you can use a WAIS server for a collection of photographic images, word definitions, book reviews, video or audio clips, and almost anything else you can imagine. However, one major drawback of WAIS is that the organization that had supported the noncommercial version (called freeWAIS) has dropped it in favor of a model based on the international standard Z39.50 (v.2 or v.1992). For example, if you index all your e-mail with the structured fields version of WAIS, you might designate fields for the sender, the subject, and the date.

How WAIS Works

WAIS does its work by transmitting a natural language query--a question in plain English--to a select number of information servers previously chosen by the user. Each of those servers compares the request with the documents it has on file and sends back the headlines, or document titles, of its closest matches, ranked by how many words they have in common with the query. The user sees the list of headlines (each of which is a link to the original) alongside a number that gives each headline's relative ranking against the query. With relevance feedback a user can take one document and use it as the basis for a new search in an effort to find more documents that closely match the subject matter of the first one. In effect, the server now has a much more specific search query to work from, with many more example words to try to match.

When WAIS Is Useful

When you have a large amount of text that you want to be completely searchable, you need an indexed server tool like WAIS. The text could consist of many different files, or it could be one or more large files, with blocks of information divided in some standard way.
For example, a card catalog might have title, author, publication date, and such, with a blank line between each record (a record is a set of fields). For example, you might profitably use WAIS to search the S.

WAIS and Z39.50

WAIS and Z39.50 are both protocols for retrieving information from computer to computer. That is, they are designed to carry on an extended conversation between the client and server until the search session is complete. Z39.50 and WAIS allow for a longer interaction, building on previous responses. Z39.50 (which was started in the United States) has gone through several versions, beginning with the largely unimplemented 1988 version (Z39.50-1988). .50-1988 and built into a productive working system. 50 standard continued, however, and not in the same direction as the WAIS enhancements. 50-1992 (also called Z39.50-V2) improved the standard, among other things by bringing it closer to a similar international standard that also was being developed. , the principal holder of the WAIS torch, has altered its product to conform to the newer standard. org , which had been the developer and guardian of the freeWAIS noncommercial version, decided to switch its development efforts to the newer Z39.50 standard. 50 implementation and includes a UNIX client, server, HTTP-to-Z39.50 gateway, and an e-mail-to-Z39.50 gateway. FreeWAIS is a single-indexer, single-search-engine system, which means that it supports only one method of indexing documents and one search engine for searching those documents. Ideally, users would be able to search many different databases without needing to learn the ins and outs of each one. WAIS fits in with Z39.50 in that the WAIS protocol has become what is called a Z39.50 profile, or a specific application of Z39.50. 50 standard. 50 protocol are used in a given system and how the system should interpret these portions. 50 profiles. The U.S.
Government Information Locator Service (GILS), started by the Office of Management and Budget in concert with the Information Policy Committee of the Information Infrastructure Task Force, is another Z39.50 profile. The purpose of GILS is to make accessible the tremendous quantity of economic data, environmental data, and technical information collected and processed by various agencies of the U.S. 50 so that the information can be retrieved in a variety of ways. usgs.gov:80 gils . The single biggest use of Z39.50 might be the bib-1 profile developed for the library OPAC (Online Public-Access Catalog) market. 50. It is hoped that as bib-1 proves successful, profiles for other applications may be developed, such as geographic and medical information systems. 50 and WAIS resources available online, as well as WWW-to-Z39.50 gateways.

Searching Capabilities and Limitations

Some WAIS servers support stemming, which reduces every word in a query to its word stem. It is important to understand that the rankings or scores that WAIS generates for documents are just approximations or guesses at what you really want. Technically, they may match on a certain number of words, but those words might be used in different contexts, so that the document with the highest score might not be useful at all. might bring poetry with those words in one stanza or information about the University of Michigan's Blue Skies Gopher project but nothing that answers the question. To use a sports analogy, think of computer searching as getting you into the ballpark but not playing the game for you. The key to WAIS is that by using a powerful search algorithm, it locates and orders documents that match your query, but it doesn't guarantee their utility to your needs.

Fielded Searching

Fielded searching gives you the ability to search for certain information, such as author or publisher or company.
So if you are looking for information about publishing on the Internet, you would prefer to find the words "publishing on the Internet" in the title of a document instead of scattered throughout the text. When you use fielded searching, a document does not receive a high ranking because it has each of those words individually in its text but because it has those words in its title.

WAIS Servers

The selection among WAIS servers is not nearly as ample as it is for Gopher and WWW.

FreeWAIS (CNIDR)

Thinking Machines gave the original WAIS program to the public domain. org welcome.cnidr.html . 3 has Boolean searching and stemming, and the source code is available for other programmers to modify, but it does not offer structured fields and has many bugs that haven't been fixed. 50-1992 standard. 3 is available at ftp: ftp.cnidr.org pub NIDR.tools freewais .

FreeWAIS-sf

The sf in freeWAIS-sf stands for structured fields. FreeWAIS-sf is a UNIX-based WAIS server that builds on the freeWAIS code to provide the ability to search structured fields (including text, date, and numbers) as well as full text. Such a search for James Bond in the title would turn up only those titles in which the words "James Bond" appear, not all articles or martini recipes that mention James Bond. FreeWAIS-sf can be used as a plain WAIS server and is compatible with existing WAIS clients. In addition, it improves on freeWAIS by allowing the user to define document format and headline layout without having to be a C language programmer. It supports country-specific character sets (8-bit only), which means that server operators will have fewer problems indexing data, at least in European languages.
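The difference between full-text and fielded matching described above can be illustrated with a small Python sketch. This is not freeWAIS-sf code: the record layout, the sample records, and the function names are all assumptions made up for the illustration, and a real server would rank results rather than simply filter them.

```python
# Illustrative only: a toy contrast between full-text and fielded matching.
# The record layout, sample data, and function names are assumptions for
# this sketch; they are not part of freeWAIS-sf or any WAIS server.

RECORDS = [
    {"title": "James Bond: The Complete Film Guide",
     "text": "A survey of every film in the series."},
    {"title": "Classic Martini Recipes",
     "text": "Shaken, not stirred, as James Bond orders it."},
    {"title": "Spy Fiction of the Cold War",
     "text": "Covers several authors, including the creator of James Bond."},
]


def full_text_search(records, phrase):
    """Return titles of records with the phrase anywhere in title or text."""
    needle = phrase.lower()
    return [r["title"] for r in records
            if needle in r["title"].lower() or needle in r["text"].lower()]


def fielded_search(records, field, phrase):
    """Return titles of records with the phrase in one named field only."""
    needle = phrase.lower()
    return [r["title"] for r in records if needle in r[field].lower()]


everywhere = full_text_search(RECORDS, "James Bond")
titles_only = fielded_search(RECORDS, "title", "James Bond")
print(len(everywhere), "full-text matches;", len(titles_only), "title match")
```

The full-text search returns the martini recipes along with everything else that mentions the name, while the search restricted to the title field returns only the one record whose title contains it.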
Development is continuing, but this is not a commercial product. informatik.uni-dortmund.de freeWAIS-sf Two other programs are available. informatik.uni-dortmund.de SFgate SFgate And SFproxy lets you index your personal WWW hotlist or bookmark file--the collection of Web links you've saved--and easily add URLs. informatik.uni-dortmund.de SFgate SFproxy.html

WAISserver by WAIS, Inc.

WAISserver 2.0 by WAIS, Inc., is the top-of-the-line commercial WAIS server ($15,000). WAIS, Inc.'s customers include Britannica Online and CMP's TechWeb, among others. wais.com

WAISserver for VMS

WAISserver for VMS is noncommercial software by Jim Fullton (formerly of the University of North Carolina, now at CNIDR) and runs on VAX/VMS systems (contact Fullton if you want to use the software for commercial purposes). unc.edu pub packages infosystems wais servers vms vms-server .

WAISserver for Windows NT (EMWAC)

The European Microsoft Windows NT Academic Centre (EMWAC) is an integral part of Computing Services of the University of Edinburgh and has been set up to support and act as a focus for Windows NT within academia. ed.ac.uk html internet toolchest top.html

WAISserver for Windows

There is an older, free, pre-WinSockets (a program that allows Windows to talk TCP/IP) version of a WAIS server that will run on Windows. It was written by Tony Addyman and is noncommercial; he provides no support but will help someone who wants to take it over. salford.ac.uk in pub wserver.zip.

Indexing Alternatives to WAIS

Some alternatives to WAIS don't use the WAIS protocol but do provide full-text indexing.

ISite: ZDist Under a New Name

ISite, a server for the protocol Z39.50.92, is based directly on the latest version of the Z39.50 protocol. org talks whyzdist.html . cnidr.org software zdist zdist.html .S.
INQUERY

The INQUERY system, from the Center for Intelligent Information Retrieval (CIIR) at the University of Massachusetts, is not intended to be an off-the-shelf information retrieval system. cs.umass.edu inqueryhomepage.html

Glimpse

Glimpse is a powerful indexing and query system that allows fast file searches (free for nonprofit use; for licensing information contact the authors at glimpse cs.colorado.edu). cs.colorado.edu The GlimpseHTTP gateway allows WWW searching of Glimpse indexes. cs.arizona.edu:1994

ICE

ICE, an indexing program for WWW servers by Christian Neuss, is a lightweight, easy-to-install alternative to WAIS gateways. According to the author, ICE is beerware--if you decide that you like it, send him a can, or case, of your favorite beer. igd.fhg.de neuss

Open Text

Open Text Corporation offers a commercial alternative to WAIS, also called Open Text, that is best known for its use with large tagged structures such as SGML databases. Try searching--it's extremely impressive, particularly because it lets you search specific sections of HTML documents as well as do proximity and occurrence count searches. The latter means that the searcher will send back only those documents in which your search term appears a specific number of times. Open Text would like to run its Open Text Web Index as a charged service, but it is available for free while the company tries to convince people to buy its Open Text indexing products. uunet.ca:8080

WAIS Resources

WAIS resources are much less prevalent than Gopher and WWW resources, although they are sorely needed. Managing a WAIS server is an inherently more complicated task, so you may find the newsgroups and mailing lists to be fairly technical.

Estimating Computer Requirements

WAIS servers can run on UNIX systems, Windows NT, and Windows.
Linux is one version of UNIX that runs on 386 PCs, and Macintoshes can also run a form of UNIX as their operating system.

Installing a WAIS Server

As with Gopher and WWW servers, setting up a WAIS server is a two-part process: installing the server and adding the data, documents, or images it will serve. With Gopher servers data preparation is a simple matter of putting the data, document, and image files into the data directory. But indexing is what WAIS servers are all about. We'll use freeWAIS-sf 1.1 as an example. It is a free WAIS server for UNIX machines, and it has some advanced features, including the ability to search structured fields. It is an enhancement of the original version of freeWAIS and was written by Ulrich Pfeifer and Tung Huynh of the University of Dortmund, Germany. The installation process consists of running a configuration script that checks out the kind of UNIX system you have. 1 do: waisserver is the program that sits and listens at port 210 (that's the default, but you can name another port). waisindex is the program that indexes your document and data files in preparation for making them available to After you have tested your server and pronounced it ready for use, you set it up to be ready and waiting at all times. See the discussion on inetd versus local in Chapter 4, because the same concerns apply here (inetd applies only to UNIX). Basically, under inetd the server starts up only when a request comes through; it doesn't waste system resources by waiting, but its startup is slower.

WAIS Fundamentals

Because WAIS is based on indexing, the most important part of setting up a WAIS server is to index your data, whether it's a collection of text files, images, sound files, or programs. For files that change often, you can build new indexes separately and then use a shell script or a timer to swap them with the older indexes every day or every hour, whenever necessary. , allows for incremental indexing, which means no swapping is necessary.
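The build-then-swap routine mentioned above for frequently changing files could be sketched in Python roughly as follows. Treat it as a sketch under stated assumptions: the directory names, the swap_in_new_index function, and the demo_build stand-in are hypothetical, and the real indexing step (running waisindex with whatever options your installation needs) is deliberately left as a placeholder.

```python
import os
import shutil
import tempfile


def swap_in_new_index(live_dir, build_index):
    """Build a fresh index beside live_dir, then swap it into place.

    build_index is a hypothetical callable that writes index files into
    the directory it is given (in practice it would run the indexer).
    """
    parent = os.path.dirname(os.path.abspath(live_dir))
    new_dir = tempfile.mkdtemp(prefix="index-new-", dir=parent)
    old_dir = live_dir + ".old"

    build_index(new_dir)              # slow step; the live index is untouched

    if os.path.isdir(old_dir):
        shutil.rmtree(old_dir)        # discard the backup from the last swap
    if os.path.isdir(live_dir):
        os.rename(live_dir, old_dir)  # keep the previous index as a backup
    os.rename(new_dir, live_dir)      # the new index goes live


def demo_build(dest_dir):
    # Stand-in for the real indexer: writes one placeholder file.
    with open(os.path.join(dest_dir, "demo.idx"), "w") as fh:
        fh.write("placeholder index\n")


workspace = tempfile.mkdtemp()
live = os.path.join(workspace, "index")
swap_in_new_index(live, demo_build)   # first build
swap_in_new_index(live, demo_build)   # rebuild; the first copy becomes index.old
print(sorted(os.listdir(workspace)))
```

Because the slow work happens in a separate directory, searches keep using the old index until the final renames. There is still a brief moment between the two renames when the live directory is missing, so a scheduled job like this is best run at a quiet hour.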
Although it understands 30 different types or formats of files (see Table 5-2 on WAIS data types), you can also specify other types explicitly. This capability is particularly important because it allows you to include newer types of files and multimedia formats as they are developed. You can index them by their file name, or you can relate them to a text file with descriptive information and index that.

WAIS File Types

When you use waisindex to index your files, you need to specify what type they are. As you can see, there are many special-purpose file formats on the list.

Text

A text file is any file that contains text and is stored in ASCII. The rule here is that when the server finds a match, it returns the entire file to the client, whether that file is a single résumé or 1,500 résumés, a conference paper or 12 papers from one conference, or a margarita recipe or a collection of recipes. This can be a problem when all you want a user to see is a single résumé or a single recipe or a single conference report.

Para

Say your file of names and addresses has a blank line between each name, as in Figure 5-1. If you use the para (short for paragraph) type, each name and address combination is treated as a separate document, because a blank line separates the addresses into paragraphs. The headline would be the first line of each paragraph, which would work fine for this example.

One Line

If you want every line of a file to be treated separately, you would specify the file as the one line type.

Dash

With the dash type you can have several paragraphs grouped together, so long as the groups are separated by a row of dashes. This is useful when the text is longer than a paragraph but not long enough to be in a file by itself. In Figure 5-3 student 1 wrote an essay of several paragraphs, student 2 wrote only one paragraph, and student 3 wrote three paragraphs.

Mail or Rmail

Large amounts of text often accumulate in mail files.
Use the mail or rmail type, and WAIS can index each message as a separate item, with the subject line of the message acting as the headline. The system will treat the subject line for each as its headline, even though it is not the first line in the section.

Netnews or RN

Similarly, you can use waisindex to specify Usenet News (Netnews) files or those saved by the rn news reader as a type.

GIF, TIFF, and PICT Image Files

If it seems strange that WAIS can index image files, don't be fooled. Another technique is to match an explanatory text file with a nontext file, one with the extension .text, the other with an extension suitable to the type of file it is, such as .gif, .au, .mpeg, and so on. text and lincoln.gif would both turn up during a search for Lincoln. WAIS indexes the text in the description file, and it can be set up to return the titles of both the nontext and text files when it finds a match.

More Information on Types

Most of the data types listed in Table 5-2 are probably unfamiliar to you. Even when they are familiar, it can be helpful to have a model to check against to ensure that you and WAIS really are speaking the same language. For that reason the nice folks at Dortmund University in Germany put a set of example files on the Net: ftp: ls6-www.informatik.uni-dortmund.de pub wais fmt-examples .

Indexing with WAIS

Once you've identified the type of file format you have and know where you'll store the indexes, you're ready to run the waisindex program to create all the indexes that waisserver will need. txt. The -export parameter means to send the information on your WAIS server to the master directory of WAIS servers at WAIS, Inc. wais.com Figure 5-5 shows an index from a file of mural artists' biographies that fit the para (paragraph) type. Unfortunately, as you can see in Figure 5-5, the name of the artist leads right into the first line of the biography.
Because the para type uses what it finds on the first line to create the headline presented in query results, this is messier than we want. After checking that the colon usage is consistent throughout the file, we can write a simple Perl script to find everything up to the colon and put a carriage return character after it. Now when waisindex takes the first line of each paragraph to be the title for that record, the artist's name and birthdate show up alone for a clean title. Indexing HTML files is a common occurrence with the tremendous increase in WWW servers. cis.ohio-state.edu hypertext faq usenet wais-faq freeWAIS-sf faq.html .

WAIS Tools

The key to indexing documents is to clearly separate and identify each piece that you might want to treat separately. If a WAIS search finds a match in a 20-page chapter, the person who requested the material still will have to plow through 20 pages to find the section that matches.

Parsers and Filters

Parsers and filters are tools that can split up your text or rearrange it for you. Basically, you can program parsers and filters to take advantage of any built-in patterns in the text to break it the way you want it. Say you have a file of mail messages that are separated by a row of equal signs (Microsoft Mail for DOS can produce files like this), and you want to index each message individually.

Create Your Own Filters

It may be possible to create your own filters by using a language such as Perl, C, or even shell scripts. You can have the server make such changes automatically when you have a regular supply of files that need exactly the same alteration.
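The two filters described above (breaking the line after the colon so that the artist's name becomes the headline, and splitting a mail file on rows of equal signs) might look something like this. The text mentions Perl; this sketch uses Python instead, and the function names, the five-character minimum for a separator row, and the sample records are all assumptions invented for the illustration.

```python
import re


def headline_on_own_line(text):
    """Break each blank-line-separated paragraph after its first colon."""
    paragraphs = re.split(r"\n\s*\n", text.strip())
    fixed = []
    for para in paragraphs:
        head, sep, rest = para.partition(":")
        if sep:
            fixed.append(head.strip() + ":\n" + rest.strip())
        else:
            fixed.append(para)  # no colon found: leave the paragraph alone
    return "\n\n".join(fixed)


def split_on_equals_rows(text, min_len=5):
    """Split text into messages wherever a line is only equal signs."""
    pattern = r"(?m)^={%d,}[ \t]*$" % min_len
    parts = re.split(pattern, text)
    return [p.strip() for p in parts if p.strip()]


# Made-up sample data, standing in for the biography and mail files.
biographies = (
    "Artist One (b. 1900): Painted murals in public buildings.\n"
    "Worked mostly in fresco.\n"
    "\n"
    "Artist Two (b. 1912): Known for large outdoor works."
)
mail_file = "first message\n==========\nsecond message\n"

cleaned = headline_on_own_line(biographies)
messages = split_on_equals_rows(mail_file)
print(cleaned.splitlines()[0])
print(messages)
```

After the first filter, each paragraph opens with a short line holding only the name and birthdate, which is what a paragraph-type index would pick up as the headline. As the text advises, check that the colon or separator convention really is consistent across the file before running a filter like this over it.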
But the problem often is that each file is different, and someone has to find the easiest way to convert it to a form that will work.

Advertising Your WAIS Sources

Once you have installed your WAIS server and set up some databases for public searching, you can use the Register parameter in waisindex to register with the central WAIS directories.

Summary

WAIS is a very different animal from Gopher or WWW. Gopher and WWW provide information over the Internet, but for the most part any indexing that is done is hired out, that is, done by other programs. WAIS and Z39.50, to which it's related, are specifically about indexing text databases and providing for online searches of those databases. This is a large and growing area of Internet publishing, but it is often hidden behind Gopher and WWW front ends. The commercial version of WAIS servers is available only from WAIS, Inc., and is expensive ($15,000), although the company is considering a free version (see Chapter 11). The improved freeWAIS-sf version adds the ability to do structured field searches, which adds quite a bit more flexibility for the user. Z39.50 is a U.S. 50. lib.ncsu.edu staff morgan alcuin wwwed-catalogs.html for Library Catalogs with WWW-to-Z39.50 Interfaces. S. 50-based program called GILS to make available the vast amount of information and data collected by various U.S.

Although this book focuses on Gopher, WWW, and WAIS as the three most popular and effective tools for publishing on the Internet, there are others.

Perl

Imagine a programming language written specifically for dealing with text. Instead of being designed for square roots and numerical calculations, this language is designed for finding and manipulating text in large or small files. This is Perl, and you or your Web administrator should become familiar with it, because much of what you'll be dealing with is text--transforming it, rearranging it, and filtering it. Many utilities for Web and Gopher servers are written in Perl.
One such feature is the associative array, which allows you to store data in an array (sort of a table) subscripted (referenced or addressed) by text instead of just by number. For example, you might make an array of countries in which the two-character country code is stored under the full country name ($country{"Singapore"} = "SG" or $country{"BR"} = "Brazil"). cs.cmu.edu:8001 Web People rgs perl.html

Python

Python is an extremely powerful programming language that has text-handling features and is object oriented. 1, and Windows NT, so Python programs are portable between all these platforms. python.org .

Glimpse

Glimpse, which stands for Global Implicit Search, is a powerful tool for indexing the contents of a file system. cs.arizona.edu:1994

Harvest

Harvest is a different approach to collecting, organizing, searching, and replicating relevant information across the Internet. Harvest was developed by the Internet Research Task Force Research Group on Resource Discovery (IRTF-RD) in a project funded by ARPA (the Advanced Research Projects Agency, the founder of the Internet). cs.colorado.edu Harvest may have solved three primary problems of Internet publishing: In the bargain Harvest's developers created an extremely powerful set of tools that are available for commercial licensing or free educational and government use.

FTP Daemons

FTP daemons (daemons are programs that run by themselves) are an easy way to make files available publicly. Well, it's still around, and although it's not as friendly or easy to use as Gopher, WWW, and WAIS, it does work.
You might choose to run an FTP daemon when you know that your users will have no trouble using FTP and you don't want to bother with anything extra.

Finger

Finger is a UNIX utility that was designed to give current information about a particular person, usually whether they are currently logged in, how they spell their name, and the like. plan and .project, which will be displayed with their login information when they are fingered (queried with the Finger program). One creative idea was to use this utility to quickly make available simple company sales or contact information. Just create a pseudo UNIX account name, such as sales or products, and put the appropriate company marketing information in the .plan and .project files for these accounts. name.com or finger products company.com on their computer, they'll receive that company's sales or product information. washington.edu . This system has the advantage of being easy for a UNIX system administrator to set up, and it can provide a small amount of information quickly.

CSO

CSO is a computer-resident phone book, or a sort of White Pages for the e-mail and telephone directory system used at many universities. According to Ed Krol's Whole Internet User's Guide and Catalog, CSO (sometimes also called a QI server) got its name from the Computing Services Office of the University of Illinois at Urbana-Champaign. What's important for us is that with a client program called PH, or with Gopher and from Web browsers, looking information up in these computerized White Pages is simple. The server software for CSO can be found at ftp: vixen.cso.uiuc.edu pub qi.tar.gz . cso.uiuc.edu pub phitarigz . cso.uiuc.edu with subscribe info-ph in the body of the message.

List Servers

Electronic mailing lists, commonly called list servers, are so common on the Internet that it's hard to think of them as a form of publishing, but they are.
Finally, some list servers are one-way and consist entirely of announcements from a central source with no mechanism for sending replies or comments. Open lists, which promote discussion between any and all subscribers of the mailing list, whether they know each other or not. Moderated lists, which are edited by a moderator who decides which messages will get published and which will not. These lists tend to stay on subject more, but an active list can create a great deal of work for the moderator. Although most of the thousands of list servers are free, at least one group has figured out how to make money publishing this way. WEBster began with the Journal of High Performance Computing and, as it developed its customized list server software, added WEBster and is considering offering others. You can find out more about WEBster by sending e-mail to sos webster.tgc.com.tgc.com or call 800-795-4472 (from outside the U.S., call 619-625-0070). tgc.com webster.html

MUDs and MOOs

MUD stands for (take your pick) Multiple-User Dimension, Dungeon, or Dialogue. Each user takes the role of a character that can walk around, chat with other characters, explore monster-infested areas, solve puzzles, and even create rooms, descriptions, and items. Basically, they allow a group of people to join together to play in and create the illusion of an artificial world. If you think these are just games and have nothing to do with Internet publishing, here are some possibilities to make you change your mind: For example, you could use MUD or MOO to help people negotiate a mutually acceptable site for a sewage treatment plant. MUD and MOO could be used to take students or customers through the various steps of a computer program, lab experiment, or engineering project.
MOO could be used to create a library for historians, so they can edit and annotate bibliographies online. MOO and MUD are not typical Internet publishing tools, but they have some interesting capabilities for those who want to give their users an imaginative experience. John December, co-author of World Wide Web Unleashed, maintains an excellent online guide to Internet tools that contains this section on MUDs and MOOs: http: www.rpi.edu Internet Guides decemj itools cmc-group-mu .html Also check out Francis Litterio's Web site on MUDs and MOOs at http: draco.centerline.com:8080 franl mud.html .

Hyper-G

The Institute for Information Processing and Computer-Supported New Media (IICM) at Graz University of Technology in Austria is developing Hyper-G to replace WWW. The Web and Hyper-G have some basic differences in philosophy, primarily that Hyper-G maintains the connection between the user and the server throughout the session. Although maintaining the connection imposes a greater load on the server, it allows the server to know who the user is and what that user has been doing. When combined with several different layers of user access rights, Hyper-G could enable the server to interact with the user in a very different way than WWW servers can. A paper about using Hyper-G to publish on the Web was presented at the April 1995 World-Wide Web Conference in Darmstadt, Germany. igd.fhg.de www www95 papers 105 hgw3.html For more information see ftp: ftp.iicm.tu-graz.ac.at pub Hyper-G , http: hgiicm.tu-graz.ac.at . tu-graz.ac.at or telnet info.tu-graz.ac.at and log in as info.

HyperNews

HyperNews is a conferencing system that is a cross between WWW and Usenet News (although it doesn't rely on Usenet News) that lets people add comments or responses to items posted to a HyperNews Web server.
The comments are stored hierarchically (with sections divided into subsections that are in turn divided into more subsections), and users can add new areas on their own. You might find this a useful tool, particularly if you want to create a discussion group for a subject and make the feedback immediately available publicly. ncsa.uiuc.edu HyperNews get hypernews.html

Internet Relay Chat (IRC)

Internet Relay Chat is a sort of free party line on the Internet for fast typists. It was designed in Finland as a replacement for the UNIX talk command, but it has since grown into an enormously popular and addictive Internet resource. IRC is divided into hundreds of channels on all sorts of subjects in many different languages (primarily English but also German, Japanese, French, Finnish, Spanish, and others). Users connect to IRC servers using free IRC client software and check the list of channels for one that looks interesting. Such immediate interaction without regard for geography proved its utility during the Persian Gulf War and the attempted coup against Boris Yeltsin, when live reports came from IRC in those areas out to the rest of the world. rpi.edu Internet Guides decemj itools cmc-mass-irc.html

WebChat

WebChat is a realtime, fully multimedia chatting application that is available as an add-on for most WWW servers. irsociety.com webchat.html . This might be something you could use to schedule an online discussion with your company president or just for discussions on topics related to your Web site.

Summary

Gopher, WWW, and WAIS together provide a wide variety of Internet publishing capabilities (especially when enhanced with CGI scripts). Perl and Python are computer languages that are available on many types of computers for free. Glimpse is an indexing alternative to WAIS (see Chapter 5 for other indexing alternatives), and Harvest tackles the whole problem of collecting, organizing, searching, and replicating information across the Internet.
It can solve problems you didn't know you had, such as conserving network traffic and speeding up delivery by replicating portions of your data on various servers automatically. FTP daemons and Finger are traditional UNIX services (available on other platforms as well) that can be useful in Internet publishing. List servers (electronic mailing lists) are another old standby that are useful for widening your audience because many people find that e-mail is easier to get than full Internet access. MUDs and MOOs are multiple-user systems that permit many people to build a virtual world together. Because of their unique capabilities they may provide exactly the kind of interactive experience that you want to add to your site. Hyper-G may be the replacement for WWW because it maintains the connection between the user and the server and has several different layers of user access rights. HyperNews, Internet Relay Chat, and WebChat provide an interactive set of tools. As you can see, some creative software is already available, and more is being developed. It will behoove you to stay abreast of what's being developed on the Internet and how it might complement what you're doing.

First, the Internet is not a mass of consumers waiting to buy what you want to sell. Television, radio, newspapers, and magazines are almost entirely one way, allowing little or no feedback or interaction between the readers and the writers, whereas the Internet comes from a tradition of intense discussion of almost every issue imaginable. Don't go onto the Internet with old ideas about marketing and sales--they won't work, and they'll do you great damage. Imagine a world in which "Question Authority" is a way of life, not a motto on a bumper sticker. And remember that much of your audience is in a position to evaluate, analyze, test, and comment on any exaggerated claims you might make for your product.

Demographics

The demographics of the Internet are changing.
Users of the Internet are not just the young, white, nerdy male computer students that many people assume make up the Net population. For the last few years, partly because of Gopher, WWW, and WAIS, the Net has seen a huge influx of new faces. First, more and more university faculty, staff, and students are seeing the Internet as important to their research and communication needs. Finally, the major commercial online services, like America Online, CompuServe, and Prodigy, are racing to provide full graphical browsing of the Internet along with their commercial databases. For all these reasons the quantity and variety of potential users of your Internet publishing efforts have changed. But because the Internet is expanding so rapidly and is controlled by no one, it is impossible to say with any precision exactly what the demographics are. The Graphics, Visualization, and Usability Center of the College of Computing at Georgia Institute of Technology in Atlanta has made some valiant and sophisticated attempts to find out, however, at least for the WWW. cc.gatech.edu gvu user surveys . Georgia Tech received 13,000 responses to the April 1995 demographics questionnaire, which also had separate sections on WWW browser usage, consumer attitudes and preferences, and questions for Web service providers. Because the survey is voluntary, the center makes no claims about the representativeness of the data for the entire Web population.

Opportunities

The Internet offers many opportunities in addition to direct sales. If you are thinking about making people pay for access to your Gopher, WWW, or WAIS servers, several factors make this more complicated than you might suspect. This chapter attempts to describe some charging schemes that are in development or use and to explain how they are different.

Is Anyone Making Money on the Net?

Don't expect to get rich, but yes, some are making money on the Internet.
The officers of First Virtual Holding Company, one of the first transaction systems on the Internet (see "Survey of Charging Techniques" later in this chapter), say that thousands of dollars are coming in daily through its InfoHaus online store. Author Stu Sjouwerman said that in 10 weeks he sold 200 copies of his book "Make Money on the Internet."

In the spring of 1994 Laura Fillmore, president of the Online BookStore, told attendees at a Texas conference on making money on the Internet that they should expect to use imagination and creativity to come up with new approaches to publishing (bus.utexas.edu ravi laura talk.html). Her point is that the Net remains a learning experience for everyone, and it's not always possible to predict what you'll learn or whether you'll make money in the process. See com obs english papers top.htm for some of her other talks and papers.

Economic Models

There are several different economic models for making money on the Internet.

Here are some ways to approach online sales:
Per copy--You sell the text or item to one user at a time, charging for each copy downloaded.
Site license--You require a site, or collection of sites, to pay a fee for the right to download an unlimited number of copies of the text. Site licensing often is used in combination with subscriptions to limit the right to download to a certain period of time.
For free in its entirety--You allow downloading of the entire text in one format (such as ASCII) for free but charge for it in other formats, such as print.
Sponsorship--You sponsor a site on the Net much as you might sponsor a Little League team or a broadcast on PBS. Usually, there is no charge to the user, although sponsors often require users to register so they know who is getting their message.
Some Net advertisers believe that if they do not offer browsers content with some quality, browsers won't drop in to see their ads.

Table 7-1 lists several Net sites and resources where people are dealing with issues related to commerce on the Internet.

Charging Issues and Concepts

Once you get into the world of online commerce, you have to pay attention to a range of issues and concepts that could affect the security of any system you put in place.

If you are considering a system that involves electronic money, checks, or tickets, can the buyer get away with reusing them (in effect, double spending)?

Some sophisticated charging systems may depend on equipment, banking participation, international exchange agreements, and such, that may take time to establish or invent. Sometimes people agree on the merits of these tools and they are widely adopted, which is what happened with Gopher and WWW.

Something that is scalable will operate just as well when millions of people are using it as when a few hundred are. Given the size of the Internet and its amazing growth, you want a program or system that can grow past one machine or site.

Will the system be flexible, or does it force every buyer and every seller to interact in exactly the same way, with the same equipment? The latter capability is called language localization, which means that the software will run in the language of the country in which it is being used. If someone comes up with a new and improved authentication protocol (a way to ensure the right person is using your system), will you be unable to use it because your charging system isn't adaptable?

If the usual credit card transaction charges apply (20 cents or more per transaction), you need a different approach for selling things such as screens of text at a half-cent each.
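The half-cent example above can be worked through as simple arithmetic. The following is a minimal sketch in Python, not part of any charging system described in this chapter; the function names and the 10 percent ceiling on fees are assumptions chosen only for illustration.

```python
import math
from fractions import Fraction


def fee_multiple(price_cents, fee_cents):
    """Return how many times larger the flat fee is than the item price."""
    return Fraction(fee_cents) / Fraction(price_cents)


def items_per_charge(price_cents, fee_cents, max_fee_share):
    """Return the smallest number of items to bundle into one charge so that
    the flat fee is no more than max_fee_share of the total billed."""
    needed = Fraction(fee_cents) / (Fraction(price_cents) * Fraction(max_fee_share))
    return math.ceil(needed)


# Figures from the text: a half-cent screen of text, a 20-cent card charge.
half_cent = Fraction(1, 2)
card_fee = 20

# Hypothetical policy (an assumption): fees should eat at most 10% of revenue.
fee_ceiling = Fraction(1, 10)

print(fee_multiple(half_cent, card_fee))
print(items_per_charge(half_cent, card_fee, fee_ceiling))
```

Under these assumptions the card charge is 40 times the price of a single screen, and a seller would have to accumulate 400 screens into one charge before the fee dropped to a tenth of the amount billed. That gap is why several of the systems surveyed later aim at very small per-transaction costs.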
Whereas buyers might concede the seller's need to know their identity, they might very much resent and boycott the seller because of a charging system that allows outsiders to track their purchases.

Digital Cash

Digital cash is the electronic equivalent of cash money. Digital cash systems usually rely on some form of public key encryption or digital signature to determine the value and validity of this electronic currency.

Digital Signatures

Originally proposed in 1976 by Whitfield Diffie, a digital signature is a way to ensure that something composed entirely of electrons is actually a message sent to you by a certain person.

Heavy Duty and Lightweight Security

Lightweight security schemes are considered to have certain fundamental flaws that leave them open to attack by sophisticated programmers. But they can be useful when the value or security of what you need to protect is not likely to inspire sophisticated attacks. (contrib.andrew.cmu.edu usr db74 kerberos.html or http: nii.isi.edu info kerberos documentation.html)

Private Key Encryption

Private key encryption is the type described in most spy novels. Someone uses a key or cipher to encrypt a document, and only those who have a copy of that key can decode the message. This is a powerful method, but it requires the safe transmission of the key between the people who need to see the message.

Public Key Encryption

Public key encryption, the brainchild of Diffie, is in its best-known form, RSA, based on the difficulty of finding the prime factors of extremely large numbers. The public and private keys both do one-way encryption; that is, whatever is encrypted using one key can only be decoded by using the other. If someone encrypts a message to you with your public key, only the owner of the matching private key (presumably you) can decrypt the message.

Survey of Charging Techniques

This section reviews some efforts to develop online charging.
Think of this as a survey of a growing field and inquire online for more information about any that catch your interest.

Anonymous Credit Cards

A series of papers at the AT&T Research site describes an anonymous credit card system that preserves the anonymity of the parties and the security of the transaction while arranging for payment for goods and services (research.att.com acc).

CyberCash

CyberCash can be used to buy and sell information as well as hard goods. The CyberCash approach to Internet commerce is to establish a trustworthy link between the Internet and the traditional banking world (see Figure 7-1). CyberCash has teamed up with Wells Fargo, the seventh largest bank in the United States, and Check Free Corporation, the leading electronic commerce company in the United States. CyberCash offers credit and debit card transactions and eventually plans to offer true digital cash that can be transferred among friends and strangers and not just merchants. CyberCash allows credit card holders to encrypt their personal credit card data in a way that only CyberCash can decrypt.

CyberCash can handle charges, voids, and returns, as well as peer-to-peer transactions (direct exchanges between two equal parties) and transactions too small to handle through normal credit card channels (cybercash.com).

Ecash by DigiCash Corporation

The DigiCash Corporation's ecash(TM) system (see Figure 7-2) provides an electronic equivalent for most functions of cash, especially anonymity. This ability to cut the chain of interlocking information that invades privacy is one of the main goals of ecash and DigiCash's related plans for echecks. Ecash was announced in May 1994 at the First International WWW Conference in Geneva; a 1 million open-ended trial run began in October 1994 during the Second International WWW Conference in Chicago.
During the trial period (which has no set end date) DigiCash gives 100 in cyberbucks to every participant to spend in participating cybershops.

Ecash (short for electronic cash, which is the same as digital cash; http: www.digicash.com ecash ecash-home.html) relies on public key cryptography to create digital signatures that are then used with random-number blinding to ensure the privacy of all parties. DigiCash (digicash.com) has offices in Amsterdam and Palo Alto, California.

The client software for ecash is available for Macintosh, Windows, and UNIX platforms after registration. The server software is available for UNIX WWW servers (both NCSA and CERN) and Windows (but only for testing), and is under development for Macintosh WWW servers. Although DigiCash is running an ecash bank for the trial, it has no plans to link cyberdollars to any real currency. Instead, the company is discussing licensing arrangements with banks, financial institutions, and other organizations (possibly governments) that are interested in issuing ecash. There are ecash shops, ecash customers, and ecash banks (although the cyberbucks have no value, the items and services being sold do). Although it is not possible for the bank or government authorities to link the buyer to a specific transaction, it is possible for buyers to prove definitively, if they wish, that they have made a particular payment.

The ecash system works as follows:

First Virtual

First Virtual Holdings, Inc. (fv.com)

Instead of developing complicated password and encryption schemes, First Virtual set out to design a system that does not need to send any confidential information over the Internet and does not depend on particular hardware or software. First Virtual's solution is to replace your credit card number (which you provide by voice when you first sign up) with a First Virtual account that you use for all transactions.
You might ask why First Virtual's account number doesn't run the same risks as a credit card number when passing over the Internet. The system was designed to work from any country but initially requires a credit card and, for merchants, a checking account from a financial institution in the United States or Canada (fv.com faq index.html). First Virtual offers a software addition to Web servers that allows them to accept First Virtual payments. Sellers have a $10 registration fee and a transaction fee of 29 cents plus 2 percent of the value of the transaction, which is deducted from the amount paid by the buyer. InfoHaus (infohaus.com) is an electronic go-between that will sell your items for you, for a commission, of course, and a monthly charge of $1.50 per megabyte of storage. According to Tom Gable, spokesman for First Virtual, InfoHaus merchants were doing thousands of dollars in sales per day in April 1995.

A transaction on the First Virtual system would proceed as follows:

"Yes" means First Virtual should charge the buyer's credit card for that amount; "no" means the product was in some way unsatisfactory--don't charge. If a buyer responds with "fraud," First Virtual immediately deactivates the buyer's account and contacts the buyer about establishing a new one. First Virtual's system requires the sellers to be willing to allow buyers to download their product with no absolute guarantee of getting paid each time. Except for requiring a certain amount of trust, First Virtual's is an elegant system for certain types of sales and appears to be growing.

Mondex

Mondex is an electronic cash smart card (a plastic card with a microcomputer chip embedded in it) that allows the safe movement of money over the Internet. Each time a Mondex card is used, the chip on the card generates a unique digital signature that is recognized by the other Mondex card involved in the transaction. The digital signature is the guarantee that the cards involved are genuine and that they are dealing with genuine Mondex signals.
This recognition process also identifies the card for which the cash is intended, which means that a third party cannot intercept funds (mondex.com mondex home.htm).

NetBill Electronic Commerce Project

Carnegie Mellon University's Information Networking Institute is designing the protocols that will allow users with NetBill accounts to buy from merchants whose servers run NetBill software. The institute is designing the system so that it is possible to bill for 1-cent transactions (credit cards usually charge 25 to 50 cents per transaction); its focus will be network-delivered (downloaded) goods, with a certified delivery protocol to guarantee delivery. In February 1995 Carnegie Mellon and Visa formed a partnership to develop and conduct a precommercial trial of NetBill by the end of the year (ini.cmu.edu netbill). The constraints are that the system assumes realtime communication between three parties and it uses encryption (which limits its exportability).

NetCheque/NetCash

NetCheque/NetCash (at http: nii-server.isi.edu info NetCheque), which is being developed at the University of Southern California's Information Sciences Institute, works much like paper checks.
Based on the Kerberos security software system and Prospero file system, users registered with NetCheque servers can write checks to other users.

NetCheque software was released in December 1994 for testing and development.

Security--works on open networks but protects all parties to the transaction
Flexibility--allows different kinds of payments: personal checks, cashier's checks, credit cards, and eventually electronic cash
Scalability--can handle extremely large numbers of transactions
Efficiency--a per-transaction cost of a fraction of a cent
Unobtrusiveness--does not interrupt other computing activities and is expected to integrate easily with existing network and online software, such as CompuServe, America Online, and Prodigy

NetChex

NetChex is a virtual checking account system for online transactions in development by Net 1, Inc., based in Phoenix. The client software runs on DOS or Windows machines and permits authorized users (members) to gain access to and transmit electronic checks for free. According to the June 26, 1995, edition of PC Week, NetChex is ready to unveil its system but is waiting to ally with a bank or larger partner (netchex.com).

Open Market

Open Market's payment system, as embodied in its $4,995 WebServer, allows for the purchase of both hard goods and information. The system (openmarket.com) uses existing Internet and World-Wide Web protocols, but it comes in separate parts, or modules, each performing a specific function. The modular design means that when improvements in authentication or security schemes or some other part of the process come along, the newer version can replace the appropriate module, and the server need not replace the entire system.

The Open Market purchase process goes like this:

The Open Market server debits the buyer's credit card account and gives the buyer an access link, which is a link back to the seller's server that also serves as confirmation of the sale, complete with a digital signature.
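To make the idea of an access link that carries its own proof of sale more concrete, here is a minimal sketch in Python. It is not Open Market's method: it swaps in a keyed hash (HMAC) shared between a payment server and a seller in place of a true public key digital signature, and the secret, URL, parameter names, and function names are all hypothetical.

```python
import hashlib
import hmac
from urllib.parse import parse_qsl, urlencode

# Hypothetical secret known only to the payment server and the seller's server.
SHARED_SECRET = b"example-secret-not-for-real-use"


def make_access_link(base_url, item_id, buyer_id):
    """Build a link back to the seller whose final parameter authenticates the rest."""
    payload = urlencode([("item", item_id), ("buyer", buyer_id)])
    tag = hmac.new(SHARED_SECRET, payload.encode("utf-8"), hashlib.sha256).hexdigest()
    return f"{base_url}?{payload}&sig={tag}"


def verify_access_link(link):
    """Return True only if the link's tag matches its other parameters."""
    _, _, query = link.partition("?")
    pairs = parse_qsl(query)
    if not pairs or pairs[-1][0] != "sig":
        return False
    params, tag = pairs[:-1], pairs[-1][1]
    expected = hmac.new(
        SHARED_SECRET, urlencode(params).encode("utf-8"), hashlib.sha256
    ).hexdigest()
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest(tag, expected)


link = make_access_link("http://seller.example/download", "report-42", "buyer-7")
print(verify_access_link(link))
print(verify_access_link(link.replace("report-42", "report-43")))
```

In this sketch the seller's server honors a link only if the tag checks out, so a buyer cannot alter the item named in the link without invalidating it. A real digital signature would let the seller verify the link without holding the payment server's secret, which is one reason a production system might prefer it.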
Open Market has some advantages:

Authentication methods can range from asking for name and password to challenging (asking questions only the legitimate buyer can answer) to requiring the buyer to have a hand-held authenticator.

PayNet

PayNet Corporation is working with the Thompson Publishing Group to develop a system that focuses specifically on business-to-business information. PayNet provides a service for niche business publications (such as "Management of Aboveground Storage Tanks") that are distributed as newsletters and inserts to looseleaf notebooks.

PayNet is a three-party payment system; providers and customers register with PayNet and the customer gains access to any publication in the system.

The billing approach is a hybrid of telephone and credit card billing systems.

PayNet is expected to work like this:

The PayNet server has to be online to complete the transaction, but there will be multiple servers so that one will always be available.

Secure HTTP

Secure HTTP (http: www.commerce.net information standards drafts shttp.txt) is being developed by Enterprise Integration Technologies (EIT) as an extension of the HyperText Transfer Protocol to provide a secure means of transporting information across the Internet. Secure HTTP can be used in a wide variety of WWW contexts because it is concerned only with the way messages are formatted and the protocol by which they are exchanged.

Secure Sockets Layer (SSL)

Netscape calls its solution to security problems SSL (Secure Sockets Layer; http: home.mcom.com info SSL.html). Netscape has proposed that the W3O working group on security consider SSL for part of a general security approach for the Web.
Netscape is also working with W3O and others to establish open security standards for the Net.

Netscape's system works at a low level, below the application level but above TCP/IP, to secure transmission privacy between a client and server, no matter what application they're running--FTP, Telnet, Gopher, Usenet News, e-mail, WWW, or anything else that comes along.

Netscape's SSL provides three types of protection:
Server authentication, so that the client, or user, can be sure they are dealing with the server they expected

The key feature of Netscape's security scheme is that it would underlie the actual application you are using without interfering with it.

Shen Plan

Shen (http: www.w3.org hypertext WWW Shen ref shen.html) is a security scheme being developed under the sponsorship of CERN and the European Union. The philosophy is to build as much as possible on existing RFCs, especially the Privacy Enhanced Mail (PEM) standard, in order to encourage integration of e-mail, Usenet News, and Web systems. The PEM RFCs (internic.net rfc) define message encryption and authentication techniques for electronic mail over the Internet. The Shen security scheme provides for three levels of security:

Terisa Systems SecureWeb Toolkit

Terisa Systems was founded as a joint venture of Enterprise Integration Technologies (EIT) and RSA Data Security to formulate a security system for the WWW. For users and those wishing to sell on the Internet, such competition promised confusion, because some sites would require one security system and others would use another.
An elegant solution was reached, however, when America Online, CompuServe, Prodigy, IBM, and Netscape Communications announced their intent to join Terisa as equity holders in the effort to develop Terisa's SecureWeb client and server tool kits, incorporating both S-HTTP and SSL by late 1995 (see Figure 7-5).

The inclusion of America Online, CompuServe, and Prodigy in this venture is particularly important, because they are the three largest commercial alternatives to the Internet (terisa.com).

Offline Payment

Some companies have avoided the wait for secure Internet commerce to develop by collecting money and credit card information offline. Voicemail systems or human telephone operators are often used to take credit card numbers over the phone and then activate the requested service online or supply the user with an authorization code that can be used safely via e-mail or online forms. Although the authorization code might be stolen via the Internet, it is not widely useful like a credit card number, so it is much less vulnerable.

WEBster Electronic Magazine via E-mail

An electronic magazine called WEBster came up with an ingenious method for selling its electronic text over the Internet (see Figure 7-6). The system enables WEBster to market to a much larger population--all those who have e-mail access, as opposed to those who have a WWW or Gopher browser (tgc.com webster.html).

Summary

Internet publishing is not just about selling things, but this chapter is. Commerce is becoming part of the Internet, although the new forms of communication between companies and their customers may be more important than direct transfers of electronic cash or checks. Marketing on the Internet has a different flavor, because Internet users have the same ability to disseminate their message worldwide as do the corporations and organizations doing the marketing.
This is done by active participation in relevant newsgroups and mailing lists, with an emphasis on solid technical information rather than marketing hype.

Different economic models exist for making money on the Internet, which has room for many more.

Some charging techniques are already in place (First Virtual and CyberCash) and others are in testing or coming online (DigiCash and CyberCash). The technology is being built into Web browsers and servers now and may come to FTP, Gopher, and others in the near future.

If you still have not recovered from the travails of setting up your own PC and the very idea of establishing a server yourself is giving you nightmares, you can have someone else set up and run your Gopher, WWW, or WAIS server--and it can be quite inexpensive. Read the sidebar to this chapter, "Finding and Selecting an Internet Service Provider," if you need a better connection to the Internet or just to learn about some of the network connection issues that will affect you.

Reasons to Hire Out

There are a lot of reasons not to do the work of publishing on the Internet yourself. You might not be at all sure of what you're doing (despite reading this book), and you don't want to risk your company or organization's reputation while you experiment. Or perhaps placing your Web pages on someone else's server for a few dollars a month is the quickest and easiest method to get your material online.

Twenty-Four-Hour Internet Connection

Providing a part-time Internet publishing service doesn't make any sense because this is truly an international medium that requires 24-hour access. If you don't have an Internet connection 24 hours a day, you will want to weigh the costs of running such a connection yourself against the costs of hiring out the service.

Fast Internet Connections

You may already have a 24-hour Internet connection, but Internet publishing can add quite a bit to the load on your computer.
A Gopher, Web, or WAIS server that becomes popular might quickly overwhelm all the other traffic you send via that link. That's when it might make more sense to use someone else's service, and let the provider worry about handling the load.

Computer Power

One good reason to hire out your Internet publishing is to make sure that the computing power available on your Gopher, Web, or WAIS server matches the demand you expect. If the load you expect will overburden your computer, it might make a great deal of sense to hire out your Internet publishing venture to some place with a much more powerful computer.

An Inexpensive Alternative

At the very least you need a computer and an Internet connection to run your own Gopher, WWW, or WAIS server. But you can become an Internet publisher for as little as a $10 set-up fee and an e-mail account if you use First Virtual's InfoHaus service. These options might be ideal for an artist, writer, computer programmer, or small business or organization that wants to establish an Internet presence with minimal upfront costs.

Expertise

Obtaining expertise is one of the most popular reasons to hire out the work of establishing and running a server. Paying the right consultants to do the work can be a safe and efficient method of ensuring a professional-looking result. UNIX servers are among the most powerful machines for running a Net site, but they demand a great deal of attention and experience to manage. It's safe to say that you'll spend more time managing the UNIX operating system than the Gopher, Web, or WAIS servers.

Security

If security concerns are paramount in your organization, the safest way to limit your vulnerability is to run your service on someone else's machine.

What You Pay For

Publishing on the Internet is a new field, and it's tempting to hand the responsibility to a consulting firm or marketing agency that sounds or looks good.
But charges vary widely, and the impression you make on the Internet can have little correlation to the amount of money invested.

First and foremost, you are paying for someone else's 24-hour Internet connection. However, connections provide various capacities, so learn what type of connection your provider has, and get the provider to estimate how many users and how much data its computers can support comfortably.

Paying for an Internet consultant or service provider's expertise is reasonable.

Obviously, you'll be paying for a portion of whatever computer system and server software your outside provider is running.

One way to spend more money is to have someone else prepare all your documents and data for you. And in the beginning that might be exactly what you want to do, if only to get started right and get your staff trained and up to speed. But don't lock yourself into a situation in which you must continue to pay for consulting in addition to the server and storage costs. Also, find out what kind of tools the experts provide to help you assemble and publish your data. Can you write your own scripts, or do your experts provide scripts and programs for most common functions, such as online forms and collecting users' comments? How much storage space are you getting for your money? Is there a charge for how much is actually delivered to users as well as for storing your documents and data?

If at all possible, make sure that your site's link is short, easy to remember, and contains the name of your company or organization. But now at least one commercial Web server has the ability to respond as if it were multiple servers (as many as 256), each with a different name and IP address. It is much easier to remember a short URL than a long one, and why not advertise your company or organization's name instead of the Internet provider's?

When you are told that your Web site has had 1,000 connections per day, don't be too impressed.
So if your organization's home page has nine inline images plus the home page text itself, each user will show up as 10 connections in the logs.

Services and Consultants

If you're convinced you need to hire out your Internet publishing, and you know what to look for and watch out for, it's time to consider what's available. Assuming you are comfortable with communicating via e-mail, fax, and phone, your Internet consultant or service could be on the other side of the world as easily as right next door. In fact, when dealing with localization to different languages or cultures, contracting with someone in the country you wish to reach might be only sensible.

How to Find Services and Consultants

Because this field is expanding so rapidly, it makes sense to search the Internet itself for information about services and consultants (directory.net). In May 1995 Open Market listed more than 370 companies that are in the business of creating a presence on the Internet for other organizations (ncsa.uinc.edu HyperNews get www leasing.html). Another way to find services and consultants is to look at the Internet sites you like. If the site is maintained by an Internet publishing service or consultant, it almost always will have a link to its own site or information on how to contact it.

Internet publishing consultants are increasingly a cottage industry. Their one- and two-person shops offer services ranging from planning your Internet strategy to writing HTML pages and CGI scripts to installing software and hardware at your site.

Internet publishing services, on the other hand, are usually large organizations with a dedicated Internet connection and powerful computer of their own. In addition to helping you plan your Internet strategy and develop your content, they offer a site with storage and sometimes usage fees.
For example, an Internet shopping center might offer storefronts with leasing costs pegged to the amount of material or space provided.

The next section briefly profiles eight companies that offer some form of Internet publishing service.

Cyber Publishing, Inc.

Cyber Publishing, Inc. (http: www.travelweb.com thisco cyber cyber.html; see Figure 8-1), based in Phoenix, specializes in consulting and constructing World-Wide Web pages and sites (travelweb.com), including electronic brochures for Hyatt Resorts and Best Western International. Cyber Publishing e-mails that:

"The biggest danger is companies or individuals spending time or money building Web pages or Web sites with little or no real idea of to whom they are talking nor what real business those people can or will do with these pages. The mad rush to build pages--and the mad rush of individuals and companies to sop up the money they perceive is out there waiting for someone to offer to build these pages--is creating a glut of sometimes colorful but mainly useless Web pages and sites that really serve no larger purpose than the vanity of a decision-maker in the sponsoring company. Later this year (1995), I expect a tremendous negative reaction in the press based on reaction from many of these companies now shelling out $25,000-$100,000 to pioneer on the Web, but who will receive no real business or other tangible benefit from their Web investment. I sincerely hope there are enough serious, industrial-strength Web sites such as TravelWeb in existence by that time to ward off the negative publicity from the thousands of disenchanted companies that have invested in vanity sites."

You can contact Cyber Publishing and Bruce Covill by telephone at 602-912-8822 or by e-mail at bcovill@cyberpub.com.

EINet Corporation

EINet Corporation of Austin, Texas, provides consulting and planning services to help companies use the Internet to achieve their business goals (see Figure 8-2).
The EINet staff provides the technology, services, and labor necessary to meet clients' objectives using Macintosh, IBM PC, or UNIX platforms.

Staff members have been working on electronic commerce and security solutions for the Internet since 1992. They also have an interesting concept, the Virtual Private Internet, which uses cryptographic techniques and firewalls to allow businesses to create a private set of resources across the Internet. For information on EINet's service offerings, send e-mail to info@einet.net; voice 800-844-4638 (U.S.), 512-338-3544 (outside the U.S.) (einet.net).

Innovative Concepts Network

Innovative Concepts Network (ICNet) offers what it calls its Virtual Host Service (see Figure 8-3). You can establish your own FTP, Gopher, or WWW server on Virtual Host Service's machines for $200 per month, including support, plus $200 in set-up fees. Virtual Host Service means that you get your own URL and IP address, which you do not share with anyone else.

Virtual Host Service's servers are mostly Pentiums running BSDI UNIX with customized NCSA HTTPD for Web services and GN for Gopher services. You can add anything you want, including a back-end database (the server looks up answers to questions in a separate database and sends the results to the user) (ic.net). Virtual Host Service is based in Ann Arbor, Michigan; voice 313-998-0090; e-mail support@ic.net.

Internet Distribution Services, Inc.

Internet Distribution Services, Inc., develops and implements electronic storefronts or catalogs for organizations that want to market their products and services on the Internet (see Figure 8-4). The company (service.com) has created more than 40 servers for a variety of large and small organizations, including
Ziff-Davis Expositions
Siemens Rolm Communications, Inc.
Silicon Graphics
Country Fare Restaurant
Palo Alto Weekly
CareerMosaic
Syntex
DIALOG Information Services
Science Television
Center for Software Development
NEC
Informix
National Semiconductor
Bay City News Wire Service
Shopping 2000
CyberCash

Costs consist of fixed-price set-ups and monthly operation fees if Internet Distribution Services provides space on one of its Internet servers.

For more information contact Marc Fleischmann at 415-856-8265 (voice) or e-mail marcf@netcom.com.

Liberty Hill Cyberwerks

Liberty Hill Cyberwerks is a one-person shop, based in San Francisco and run by Eric Theise, an academic-turned-Internet-educator, consultant, and writer (see Figure 8-5). A long-time Internet educator (in academia, online forums such as the WELL's Internet conference, public spaces, and in-house settings), Theise began running Internet servers for clients in early 1993 when Gopher was still the information space of choice. Clients include Circle International (global logistics), digital: threads (Web strategy and marketing consultants), Hearts of Space Radio Records, New Albion Records, Project Inform (national AIDS treatment information clearing house), online newsletters such as Brock N. Meeks's CyberWire Dispatch and DataLine--The Glass Ceiling, and local organizations such as the Bay Area Internet Users Group, the Virtual Reality Education Foundation, Radio Valencia (a cafe performance space Theise frequents), and others. Cyberwerks prices its work on a per-project basis, with server set-ups thus far ranging from $1,000 to $7,000 and monthly maintenance fees ranging from $200 to $800. Theise favors a configuration that dedicates a computer to each client; he finds this offers the greatest flexibility in configuration and the option to simply relocate the computer should the client decide to bring its Internet connection in-house.
But he is now offering virtual host services as well--a single server masquerades under multiple domain names in a way that is largely invisible to the outside world. Theise describes Liberty Hill Cyberwerks's mission as twofold: to educate people in the possibilities of networking technologies--cyberspace literacy--in forums ranging from corporate boardrooms and trade shows to free or low-cost venues such as bars and book stores, galleries and theaters, museums and community centers; and to put up servers with significant content--erring in the direction of functionality instead of glitz--while representing clients in online forums and offering them training in ways to begin to do it themselves. Liberty Hill Cyberwerks's connection to the Internet is a T-1 line through a regional provider, the Little Garden (at http://www.tlg.net) .bsdi.com . Because Cyberwerks believes in the importance of reaching as many people as possible, it relies heavily on the Web/Gopher capabilities of John Franks's GN server and Brent Chapman's Majordomo mailing list manager. Cyberwerks devotes one machine to virtual Web hosts using Franks's WN server and an experimental server capable of running the NCSA HTTPD and Apache Web servers, University of Minnesota's Gopherd, and Cornell University's audio/videoconferencing reflector software, CU-SeeMe. Theise says: "For heaven's sake, if you're thinking of hiring it out, browse through as many of your potential providers' pages as possible, preferably with a couple of different browsers (such as Netscape and Lynx). A quick scan around the Web at the multitude of typo-laden pages, painful color schemes, and broken links indicates that there's a new self-proclaimed Internet presence expert coming online every hour." You can reach Eric S.com; http://cyberwerks.com, gopher://cyberwerks.com, mailto:info@cyberwerks.com.

WAIS, Inc.
WAIS, Inc., of San Francisco was founded in 1992 by Internet pioneer Brewster Kahle (who helped design the original version of WAIS while at Thinking Machines Corporation). WAIS, Inc., provides products and services for organizations that want to deliver information over the Internet (see Figure 8-6). With the WAIS, Inc., product WAISserver (TM), organizations such as Dow Jones, Novell, CMP Publications, and Encyclopedia Britannica have given Internet users access to books, magazines, news, product data sheets, technical overviews, company information, and more. WAISserver allows content providers to index and publish large databases on the Internet, where they can be searched from around the world. WAIS Production Services create turnkey, Internet-based services using WAISserver as the core technology. WAIS Production Services can integrate custom modules for user registration and feedback, transaction-based and subscription-based billing, personalized invoicing for online shopping, archive searching for back issues, automatic content expiration, and new content alerting. You can reach WAIS, Inc., at 415-356-5400 (voice); fax 415-356-5444; e-mail frontdesk@wais.com; http://www.wais.com.

Web Communications

Web Communications offers both FTP and WWW server privileges at $9.95 (personal), $29.95 (business), and $95 (corporate) per month. This allows for between 5MB (personal and business) and 10MB (corporate) of disk storage per month. Although WebCom does not allow you to write your own CGI scripts, the company has made it easy to add online forms. WebCom also provides mailing list services that are easily set up by filling out online forms. Your WWW pages are automatically indexed into an Archie database and can be easily added to the ALIWEB index by placing text markers on each Web page. The WebCom server runs on an HP9000 series 800 (which is a mid-range work group server designed to service up to 100 simultaneous connections) .5Mbps , and company officials say they will upgrade the link as needed.
WebCom is a self-service Web provider (see Figure 8-7) .S.webcom.com .

Summary

It's possible to hire out all or part of the Internet publishing process. It can be done inexpensively or you can spend a lot of money, and many different services offer very different plans. But even if you want to do all the work yourself, you need to make sure that your Internet connection can handle the load. Creating a presence on the Internet can be as simple as giving some files to a Gopher or Web provider. Whatever you decide to do, be sure that you know what your money is buying in graphics, marketing, and programming expertise, as well as hardware and software. The variety of consulting and service approaches profiled in this chapter gives you an idea of what's possible. As you see interesting sites on the Internet, look for who set them up, because you'll often find links to consulting companies and individuals.

Publishing on the Internet is not just a matter of hardware and software. Publishing on the Internet is as much of an advance today as the invention of the encyclopedia was in its day. If your Internet search on a subject one day yields nothing interesting, the same search a day or two later might well bring several new sources. In fact, searching the Internet through different indexes on the same day will yield different sources, just because no one index covers everything. Publishing on the Internet is so simple and inexpensive relative to traditional publishing techniques that many who would not previously have considered it will find themselves in the publishing business.

A Different Kind of Publishing

The Internet is a new medium for publishers.

Text, Pictures, and More

You can put pictures and text together on the Internet, just as you can in magazines, newspapers, and books, but on the Internet you can publish a lot more.
Users can run programs from your server to search, calculate interest rates, or make animated figures jump around on the screen.

You Don't Control the Whole Package

This will be disappointing to the control freaks among us, but the appearance of your documents will vary from browser to browser, and not everyone will start at the beginning of your home page or at the top level of your Gopher server. Because of the open nature of Gopher and WWW, it's possible for people browsing your server to save the links to the parts of your server that they find most useful. And if they pass those links on, or others get them through index searches, they'll return to your server directly at the point that most interests them, which may not be where you expected at all.

Change

Once a book or a magazine comes off the presses, it's done. On the other hand, you'll have to spend a significant amount of time maintaining links to other servers and updating your own material.

Technology

Although print technology has advanced quickly in recent years (remember when color pictures first appeared in newspapers?), the speed of change is much greater and much more apparent on the Internet, as with anything having to do with computers. available.

Interactivity

More than any other medium, the Internet provides ways to get rapid feedback from your readers. In fact, the real wealth of the Internet is the knowledge and experience that all the participants around the world bring to the table. Usenet News, Internet Relay Chat (see Chapter 6), and Gopher and Web forms give you access to that knowledge and experience.

Geography

Distance simply is not a factor when someone has Internet access. But leapfrogging geography leads to one of the great pluses of the Internet: it's possible to build a community where there isn't one geographically. People from all over the world meet via mailing lists, Usenet News, Internet Relay Chat, and HyperNews to discuss their favorite subjects.
Access to Information

More than anything else, the ability to include information from other sites and servers makes the Internet truly a new medium. A footnote in a book or a citation in a magazine takes a certain amount of effort and energy to follow up on. But links in Web pages or Gopher sites are so simple to follow that a reader who isn't paying attention may not even notice where all the items are coming from. Thus you could establish a new server about a particular subject, not because you're an expert on that subject but because you've tracked down links to a number of sites that do have experts.

Copyright

Let me start this section by stating that I am not a lawyer, and the opinions expressed here are no substitute for competent legal counsel. Copyright is how authors protect themselves from unauthorized copying of their work. There is tremendous concern on all fronts about what copyright means and how it can be enforced in an environment like the Internet, where something has to be copied just to be viewed. Although legal challenges and changes to existing copyright statutes may help us adapt to the digital age, some facts are not in dispute. Copyright law does apply to the Internet. The United States and most other countries consider original material copyrighted as soon as it is created and fixed in a tangible medium. Text on Gopher, WWW, and WAIS servers as well as e-mail is considered a tangible medium and thus fulfills that criterion.

Notice

Nonetheless, you should place a copyright notice on work on the Internet. U.S. law does not require a copyright notice, but giving copyright notice is an announcement of your intention to protect your rights. If you own the copyright to something that you or someone else wishes to publish on the Internet, make sure it is labeled as copyrighted material, with you as the copyright holder.
This is not true, and labeling your material as copyrighted is an important step in changing that presumption and the public perception of the protection of copyrighted materials. Proper copyright notice reads simply: © John Doe 1995. Given that someone can easily link to just one item in your Gopher, Web, or WAIS server, it would seem appropriate to repeat the copyright notice with each document or file, although U.S. Your copyright protection for original work is the same whether you've given notification or not, but the deterrent effect is greater if each document has its own notice. Finally, with the massive indexing by Archie, Veronica, Lycos, the Web Crawler, InfoSeek, Yahoo, and so on, a significant percentage of the browsers reaching your site will not be coming in through your home page. They may never see your home page, so if that's the only place the copyright notice sits, you'll lose whatever deterrent effect such a notice offers.

Permission

Make sure you have written permission for any copyrighted material you put on your server. Using someone's copyrighted material and giving that person credit does not satisfy copyright law. Copyright is about authors having control of where their work is copied and being compensated for the use of their works. S. A legal exception to copyright law is fair use for certain purposes, such as criticism, comment, news reporting, teaching, scholarship, and research. S. The purpose and character of the use, including whether such use is of commercial nature or is for nonprofit educational purposes; The distinction between fair use and infringement may be unclear and not easily defined. loc.gov:70 00 copyright fls fl102 If you can't find the copyright owner to get permission, don't use the material.
There are limits to what is considered a reasonable effort to locate the copyright owner, but you may have to prove that you made sufficient effort in court, which takes time and money.

Images

With modern graphics tools it is becoming easier to embed a small text copyright notice within an image instead of just placing the notice in text below the image (on Web servers). Gopher and WAIS servers have no provision for sending text alongside an image, so the only way to include a copyright notice is to embed it in the image. You do face a trade-off between the need to include the copyright notice on the work and the damage done to the image by overwriting part of it with the notice.

Public Domain

It is a myth that by not defending a copyright from infringement the material passes into the public domain. The only ways for copyrighted material to pass into the public domain are for the author to explicitly waive all rights to it or for the copyright to expire. S.

International Copyright

International copyright does not exist, but many countries abide by copyright agreements between countries and through international conventions, principally the Berne Convention, to which the United States is a signatory. It is always wise to check the copyright laws of the countries concerned in specific cases and to consult a knowledgeable attorney. S.S.law.cornell.edu topics copyright.html Hopefully other nations' copyright laws will soon be available online as well.

Editing for an International Audience

Be aware that the material you put on the Internet is going to be available in more than 60 countries. Avoid large graphics, or offer text alternatives, because many international sites have narrow-bandwidth Internet connections, and downloading large images can be extremely slow and painful.

Summary

The Internet is a different medium, and the rules and potential have not been fully explored.
You will find that new tools and capabilities are arriving constantly, so you will need to allocate time to keep abreast of these changes. Interactivity is one aspect of Internet publishing that is easy to lose sight of. If you provide for it, users can download and view material from your server, upload comments, fill out online surveys, participate in chat sessions, or simply give e-mail feedback. Finally, geography can simply disappear. If only 20 people in the world are interested in your subject and they all live in different places, you won't be getting together for dinner. But by using a central Web, Gopher, or WAIS server, as well as e-mail mailing lists and Usenet newsgroups, you can unite your efforts. At the same time, there are times when it is appropriate to emphasize geography, such as publishing oral histories from your community, local maps, and tourist information. Although the Internet is a different publishing medium, copyright laws still apply. Because publishing on the Internet fulfills the legal criterion of being fixed in a tangible medium, most material is copyrighted automatically as soon as it appears. It is probably better to embed your copyright notice in your images so that it cannot be deleted if someone copies the images out of context. While you should protect your own copyrighted material, you should also respect the copyrights of others and never use someone else's material without permission. International copyright protection is not guaranteed even though many countries have signed international copyright agreements such as the Berne Convention.
Be aware that the material you put on the Internet will be available in more than 60 countries, so you need to avoid local geographic references and slang that won't make sense to others.

This chapter profiles and interviews administrators of exemplary and interesting sites from around the world to show you how they've put it all together.

Britannica Online http://www.eb.com

For a company that has been in business for 225 years, Encyclopaedia Britannica certainly got off the mark quickly when it came to publishing on the Internet. According to Doug Paul, executive vice president and general manager of Britannica Publishing Division, and Anne Long, executive director, electronic products, Britannica first started experimenting with multimedia in 1988, with its creation of Compton's Multimedia CD-ROM. They sold the Compton's New Media Division in 1993 when they saw the CD-ROM business moving toward entertainment instead of education. The initial text-only CD-ROM version of the Encyclopaedia Britannica ($995) has enjoyed sales stronger than expected and the recently released illustrated version is also proving to be successful. The Compton's period was a crucial experience, however, in that it made Britannica aware of the value of electronic publishing and brought the company a joint relationship with an advanced programming group in La Jolla, California. In December 1992 Britannica began exploring the possibility of putting the text of the Encyclopaedia Britannica on college and university servers. Britannica is still working out its marketing strategies, adding individual subscriptions and other institutions to what had been only college and university access. It has been using IP address validation but has recently added passwords, which will allow the company to sell individual subscriptions as well as site licenses.
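To make the difference between the two access schemes concrete, here is a minimal sketch in Python. It is not Britannica's implementation, which the text does not describe; the network range, the account name, and the function name is_authorized are all hypothetical. A request is admitted either because it comes from a licensed site's address range (the site-license case) or because it carries a valid subscriber password (the individual-subscription case).

```python
import hmac
import ipaddress

# Hypothetical data for illustration only; none of it comes from Britannica.
SITE_LICENSE_NETWORKS = [ipaddress.ip_network("192.0.2.0/24")]
SUBSCRIBER_PASSWORDS = {"reader1": "s3cret"}  # a real service would store hashes


def is_authorized(client_ip, username=None, password=None):
    """Admit a request by licensed address range or by subscriber password."""
    address = ipaddress.ip_address(client_ip)
    # Site license: any machine on a licensed campus network gets in.
    if any(address in network for network in SITE_LICENSE_NETWORKS):
        return True
    # Individual subscription: fall back to a username and password.
    if username is None or password is None:
        return False
    expected = SUBSCRIBER_PASSWORDS.get(username)
    if expected is None:
        return False
    # compare_digest avoids leaking information through comparison timing.
    return hmac.compare_digest(expected.encode(), password.encode())
```

Address checks alone cannot serve a subscriber who dials in from home through an arbitrary provider, which is presumably why passwords open the door to individual sales.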
In addition to copyright statements and legal notices scattered here and there, Britannica uses a variety of means to secure its intellectual property rights and those of its third-party providers. Britannica has embedded copyright notices in many of its images instead of putting that information in a line of text beneath the graphic.

Technical Notes

Britannica Online consists of the full range of articles in the print set with a growing number of links (3,000) to outside resources. Britannica Online dynamically serves every article. Britannica has been using a Sun Microsystems Sparc 10 with 64MB of memory as its server behind a firewall. ., server software modified to optimize relevance ranking. g.

Advice

Harold Kester, vice president and general manager of the Advanced Technology Group of Encyclopedia Britannica North America, urges entrepreneurs to "understand the utility of what you're doing and develop a business plan around that." He notes that brand name identity is almost always a key to success. delivering products on the Internet, it's probably ten times more important.

Burlington Coat Factory http://www.coat.com

Burlington Coat Factory's Web site offers company information, a map to its stores, and pictures of some of its products. We had been using the Internet for several years for e-mail and occasional FTP access. In April 1994 I got the idea that Burlington should be able to easily put up a Web presence and should also be able to turn it into a commercial site. By June I had a functioning server in-house behind the firewall and was creating internal content. By July we had the prototype of the external pages pretty well set up and went to the marketing side to get approval. Burlington promoted its site with a coupon good for a $5 discount on a purchase of $50 or more to anyone who filled in an online form commenting on the site.
At the peak of the offer Burlington was getting more than 100 responses a day and serving an average of 10 documents per visit, Young said. Burlington plans to use the Net as an active sales site but is waiting for a security protocol to be established. Actual out-of-pocket costs directly attributable to the Web pages are essentially zero--Burlington's firewall machine and Internet link were already in place and budgeted for other uses, Young said. "In terms of people time, we are looking at about four man months, probably of a senior analyst, about half of the time in learning," Young continued. Burlington is working on using a Web site to disseminate information to employees; an employee telephone directory and some graphics already are online. Burlington wants to add linked biographies, skill matrices, and weekly status reports and store them as Web documents for easy reference, Young said.

Technical Notes

Burlington runs its Web server on a Sun Sparcstation with a firewall and runs the NCSA HTTPD server. All the scripts for forms processing were written in C based on the models provided with the NCSA server. One set of linked entry forms has the first form run a program that creates the next form and so on, passing session information along to ensure continuity across four linked entry screens.

Advice

Young thinks that anyone who sees the Net as a pot of gold is probably dreaming right now, but for those who look to learn from their adventures in Web publishing, and for those who seek to further the culture of the Net, the minimal costs are far outweighed by the potential future opportunities.

CareerMosaic http://www.careermosaic.com

CareerMosaic, launched by Bernard Hodes Advertising, provides employers with a cost-effective way to tell prospective employees about the company.
According to Tim Gibbon, executive vice president, and Bruce Moore, director of systems and planning for the Western region, companies can present as much information as they want, including graphics, annual reports, and the full text of a pension plan. With the aid of Bernard Hodes Advertising, companies can communicate their corporate personality in hypermedia, using text files, graphics, audio, photographs, and video. Companies represented include Bellcore, First USA (financial services), the Good Guys! (electronic retail chain), Intel, Intuit, National Semiconductor, NeXT, Sun Microsystems, Symantec, Union Bank, Wells Fargo Bank, and Cedars-Sinai Medical Center of Los Angeles. Online job application forms are in the works. Moore said the most innovative thing CareerMosaic does is publish so that the material is equally accessible from all types of Web browsers. The challenge has been to keep CareerMosaic's Web designers from using all the experimental HTML features that Netscape and others are devising.

Technical Notes

To write HTML pages on the fly CareerMosaic created a custom database application built on the 4th Dimension database package running on PowerPCs.

Advice

Moore said that the most important lesson is not to forget what you know about your own area.

CICA Shareware Repository http://www.cica.indiana.edu, gopher://gopher.cica.indiana.edu, and ftp://ftp.cica.indiana.edu

The Center for Innovative Computer Applications (CICA) at Indiana University is home to one of the world's largest collections of Microsoft Windows public domain and shareware applications, tips, utilities, drivers, bitmaps, and such. 5GB. Its Web site sees approximately 50,000 logins per month requesting more than 70GB of those files, for an average of 2.3GB per day of file transfers.
The site is considered so valuable that it is mirrored by at least 15 other sites around the world, including sites in England, Holland, Germany, Finland, Sweden, Switzerland, Israel, Australia, Taiwan, Thailand, Japan, Singapore, and Poland. According to FTP librarian and administrator Michael Regoli, the site is maintained by volunteers on an old machine. On the administrative computing side of things, network bandwidth has been the biggest obstacle. People think we are limited to 45 connections from the Internet during the day because we haven't the horsepower on the desktop to service that many users. If left unchecked, and wide open, the FTP site could easily consume half, if not more, of Indiana's three T-1 links.

Technical Notes

The FTP server is a Sun Sparcstation 1 with 28MB of RAM. It will be moving to a 90MHz Pentium PCI bus with 64MB of RAM and a PCI Ethernet card for faster network access. 3, with 64MB of main memory (85 MIPS). The bandwidth is effectively increased by the more than 15 mirror sites because they provide an entirely new set of lines into the archive.

CMP TechWeb http://techweb.cmp.com/techweb

CMP publishes 16 computer trade magazines; CMP TechWeb, which started in November 1994, is its platform for interactive publishing, which the company expects will become a big business itself. TechWeb offers recent headlines, a resource guide for technical job hunters, links to current issues of its magazines, and an extremely useful full-text search of its issues published since January 1994.
Since July 31, 1995, TechWeb has required that you register with the company for these services, but all that involves is filling out an online survey. The CMP magazine list includes Communications Week, Comm Week International, Computer Reseller News, Computer Retail Week, Electronic Buyers News, Electronic Engineering Times, Home PC, Information Week, Interactive Age, Internet Business Report, Max CD-ROM, Netguide, Network Computing, OEM Magazine, VAR Business, and Windows Magazine. Most are controlled subscription magazines, which means that they are free to people who hold jobs that put them in the target populations the magazines' advertisers want to reach. Users query TechWeb's WAIS searcher by filling out an online form that permits a user to specify which or all of the 16 CMP magazines to search, as well as starting and ending dates, title, author, section, column, and text.

Technical Notes

TechWeb's site runs on UNIX with Netscape and WAIS server software.

Advice

Mitchell York, publishing director of CMP Interactive Media, says: "It's wonderful that literally anyone can now be a publisher with a potential audience of millions. But the fact is most Web sites don't have staying power--that is, the ability to attract one user more than once or twice. It's quite okay for someone to put up a Web site for personal enjoyment, but if the purpose is to build an audience, you're in a different league. In this arena, the key is to have targeted content aimed at an audience that needs the information you supply, and that the information is refreshed often."

CyberWire Dispatch http://cyberwerks.com/cyberwire and gopher://cyberwerks.com/11/cyberwire

In September 1993 readers of the WELL's Wired conference were greeted with a new topic that audaciously began with: This is the place for breaking news from the CyberFront. In other cases, there could be a string of dispatches running throughout the day as an issue or situation develops. Quick, off-the-cuff, interactive journalism.
You don't need credentials, a degree or editorial approval. You do need attitude, desire and nerve. It was followed by the first issue of what became Brock N. Meeks's CyberWire Dispatch. Over subsequent weeks the Dispatch quickly developed a reputation on and off the WELL for brash, occasionally off-color, but always accurate reporting on federal telecommunications policy and the antics of those working to change cyberspace for better or worse. Meeks, an award-winning veteran journalist who is Washington bureau chief for INTER@CTIVE WEEK, understands better than anyone the speed, reach, and freedom associated with online publishing. Although Meeks's early dispatches were used without acknowledgment in the bylined articles of others, the now-copyrighted Dispatch has been cited by national news wires and publications like the Economist. Dispatch was originally distributed only on the WELL and over the com-priv mailing list. In mid-1994 Meeks asked Liberty Hill Cyberwerks's Eric Theise what it would take to set up a dedicated Dispatch mailing list. Theise, a supporter of Meeks's work, set up a list and searchable Web/Gopher archives of back issues in a matter of days at no charge. The Web site has links to organizations Meeks thinks you should know about, including the Electronic Privacy Information Center, Voters Telecomm Watch, and other organizations working to preserve constitutional freedoms on the Internet. The e-mail mailing list has about 7,100 subscribers, many of which are local mail exploders (which resend to many other addresses) .S.

Technical Notes

Combined Web and Gopher traffic through the Dispatch's archives averages roughly 1,500 hits per day. The CyberWire Dispatch archive server runs GN for Gopher/WWW services on Liberty Hill Cyberwerks's main server, an Intel 486/66 with 32MB of RAM, running BSD UNIX. The mailing list is managed with Majordomo.
Liberty Hill Cyberwerks uses a simple sed script to reformat the mailings into a form usable by the Web and Gopher server.

Advice

Theise says: "Never underestimate the amount of time it's going to take to manage a large mailing list. Despite the fact that Majordomo, LISTSERV, and Listproc (three different e-mail list server programs) automate many of the functions needed to keep a list going, you'll be amazed--you'll despair?--over what can happen with a large list. And, as anyone who's been on a list for more than two days knows, there's a steady stream of people who don't know how to unsubscribe from the list, even if you've told them repeatedly."

Hearts of Space Radio Records http://www.hos.com, gopher://hos.com, mailto:info@hos.com

Liberty Hill Cyberwerks's Eric Theise tells this story about driving from San Francisco to Minneapolis for the 1993 Gopher Developers Conference: In the middle of Nebraska in the middle of the night, a National Public Radio affiliate began broadcasting the syndicated Hearts of Space program. After a very long day of country and western and classic rock, the HoS mix of space, ambient, electronic, and ethnic music was a welcome change to these tired ears. At the end of the program, producer/host Stephen Hill announced that program playlists were available in the WELL's radio conference. As soon as I arrived, I sent him e-mail asking if he'd be interested in making his playlists available beyond the WELL's subscription-only doors. Within weeks, the duo had opened the Hearts of Space section of the GN-based WELL Gopher, offering a searchable archive of playlists from mid-1990 to the present, a list of the nearly 300 stations carrying the program, a catalog of CD and cassette releases on Hearts of Space's record labels, and a simple About Hearts of Space file. music.newage, Theise recalled. A year after they teamed up, the Web was becoming the information space of choice. com.
The Gopherspace and listener mailing list came online in fall 1994; early 1995 saw the opening of the Web site and the installation of an e-mail gateway between Hearts of Space's office/studio network and the Internet, Theise said. From a marketing perspective, one of the most valuable features of the Hearts of Space Web site is the ability to link selections in the Hearts of Space Radio playlists to descriptions, cover art, and sound samples from releases on Hearts of Space's own labels. music.newage, rec.music.ambient, alt.radio.networks.npr, and other groups for discussion and questions about its programming and artists.

Technical Notes

Marking up five years' worth of playlists by hand would have been impossible, so Theise created a relatively simple awk (UNIX utility) script to recognize HoS releases by title and insert links in addition to the boilerplate HTML. Simple awk and sed scripts were used extensively in the initial formatting of the release sheets for the approximately 70 releases from Hearts of Space, and they still use sed scripts to facilitate the weekly site updates. The Hearts of Space Internet server is an Intel 486/66-based computer with 16MB of RAM. com autoresponder, and GN as its Gopher and Web server. It's a textbook example of using GN to deliver materials via Gopher and the Web and makes fairly extensive use of GN's built-in search facilities so that browsers can search the playlist archives and the recordings catalog. As of July 1995 nearly 2,100 listeners receive the weekly playlist via e-mail, and the autoresponder--mentioned at the end of each radio broadcast--averages dozens of inquiries per day. Credit card orders for CDs and cassettes are accepted through Web forms and e-mail; customers concerned with security can fax their orders or send them through regular mail.
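The playlist markup described in the Technical Notes above was done with awk, and the script itself is not reproduced here. As a rough illustration of the same idea, the following Python sketch recognizes release titles in plain-text playlist lines and wraps them in links, then adds boilerplate HTML around the result. The catalog entries, the page paths, and the function names are hypothetical, not taken from the Hearts of Space site.

```python
import html
import re

# Hypothetical catalog mapping release titles to catalog pages.
RELEASES = {
    "Example Album": "/releases/example-album.html",
    "Example Album One": "/releases/example-album-one.html",
}


def link_releases(line, releases=RELEASES):
    """Escape one playlist line for HTML and link any known release titles."""
    escaped = html.escape(line)
    if not releases:
        return escaped
    # Match against escaped titles, since the line has already been escaped.
    by_escaped_title = {html.escape(t): url for t, url in releases.items()}
    # Longest titles first, so a short title never wins over a longer one.
    titles = sorted(by_escaped_title, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(t) for t in titles))

    def add_link(match):
        title = match.group(0)
        return '<a href="%s">%s</a>' % (by_escaped_title[title], title)

    # A single substitution pass never rescans the links it has inserted.
    return pattern.sub(add_link, escaped)


def playlist_page(heading, lines):
    """Wrap the linked playlist lines in minimal boilerplate HTML."""
    body = "\n".join(link_releases(line) for line in lines)
    return (
        "<html><head><title>%s</title></head>\n"
        "<body><h1>%s</h1>\n<pre>\n%s\n</pre></body></html>\n"
        % (html.escape(heading), html.escape(heading), body)
    )
```

Calling playlist_page("Program 1", playlist_lines) on each archived playlist would regenerate the whole archive in one pass, which is the point of scripting the job instead of marking up years of playlists by hand.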
Hearts of Space's internal QuickMail system exchanges e-mail with the Internet using the StarNine MailLink Remote UUCP gateway, allowing staff to be in communication with artists, radio stations, distributors, and others worldwide.

InfoUCLA http://www.ucla.edu

UCLA's Web site is similar to many university Web sites in that it provides central access to campus resources. These include the campus e-mail and phone directory, campus map, library hours and policies, library card catalog access, admissions and records information, administrative policies, central computing, associated students, bookstore, and student newspaper, as well as a diverse set of departmental Web, Gopher, and FTP servers. Many research centers are putting at least some of their research material up, and several national journals based at UCLA are thinking about turning electronic. The interesting thing that UCLA is doing, though, is providing virtually free dial-in SLIP/PPP access to the Internet and e-mail for all students, staff, and faculty. Called BruinOnline (after the campus mascot), this ambitious project started as a reaction to an increase, from one year to the next, of 10,000 undergraduates who wanted e-mail accounts. The implications of every student's having e-mail access (from campus labs, if a student didn't have a computer at home) are just starting to hit the faculty. The possibilities are fascinating: receiving homework assignments by e-mail, creating local Usenet newsgroups or mailing lists for every class, starting online discussion groups and holding online office hours, building class Web sites, and sending and updating problem sets by e-mail or FTP. Some faculty members are already preparing their lesson plans in HTML, and a Slavic languages professor had his students learn enough HTML to annotate a Russian novel.
Technical Notes

InfoUCLA sits on an IBM RISC System/6000, model 59H, with 256MB of RAM, running NCSA HTTPD, freeWAIS, Gopher, and QI (a campus directory service).

Internet Movie Database http: www.msstate.edu Movies , http: www.msstate.edu Movies alternative access.html

The Internet Movie Database WWW server is a repository of movie details, reviews, ratings, and trivia as well as biographies and filmographies of actors and directors. The Internet Movie Database began in 1989 with periodic postings to the Usenet newsgroup rec.arts.movies. Since then it's grown into an international volunteer effort whose principal objective is to provide useful and up-to-date movie information freely available on-line, across as many systems and platforms as possible, according to its FAQ. The database is available via FTP, WWW, and e-mail and grows by about 7,000 filmography additions and 500 new titles per week. All data are contributed freely by movie lovers around the world, and the management is all volunteer.arts.movies and other Usenet Newsgroups continues. The author of the Web interface to the Internet Movie Database, Rob Hartill, was inducted into the WWW Hall of Fame in 1994, and the database itself received an honorable mention in the Best Entertainment category.

The users maintain the database and police its accuracy. This is a method of ensuring that good data don't get corrupted by accident or malice while relying on user-supplied data and input. Hartill, who manages the WWW interface to the database, and Col Needham, overall coordinator and creator of the Internet Movie Database, write: The database is maintained by a core team of 15 people at the moment. For most of them, it's a case of giving up a few hours of their spare time each week and being able to quickly respond to a problem that needs fixing. In the first four months of 1995 about 1,500 different people contributed new data to the system--some just a few lines and others tens of thousands of additions.
Hartill adds, "A few of the managers spend 10 or more hours a week, just trying to stay up to date with submissions, and then there's Col, who probably spends more time working on the database than the rest of us put together."

Technical Notes

The voting system (users get to vote on the ratings of any movie they look up) is coordinated independently of the different mirror sites because it actually predates them. The original interface with the movie database was developed in UNIX shell script language and over the years was converted to faster C code. When the WWW interface was added, the data were rearranged for optimal speed for the most common types of query, but that meant losing the wider range of query types possible via the traditional interface. The mail server runs on a Pentium machine running SCO (Santa Cruz Operation, Inc.) UNIX; the main FTP site is a NeXT machine. All time, computers, and effort are donated. The mail server and the server that collects contributions to the database are provided courtesy of the PC Users Group in England. From Cardiff alone the database gets more than 60,000 connections a day.S.-based mirror sites in the pipeline, these figures are sure to grow dramatically.cm.cf.ac.uk htbin Graphs todays stats .

Advice

Rob Hartill: Don't be put off by the size of the Internet. Don't be fooled by the silky smooth duck gliding across the pond--there's a hell of a lot of kicking going on underneath.

Col Needham: Start small and keep at it.

Kaleidospace http://kspace.com, gopher://gopher.kspace.com, ftp://ftp.kspace.com

This arts-oriented Web, Gopher, and FTP site allows artists in all media to showcase and even sell their work on the Internet. Artists pay a set-up fee of $100 and then either a monthly fee of $25 or a 10 percent commission on their sales.
Artists can then announce future exhibits and installations, list their biographies, and allow sample downloads of their music, writings, photographs, paintings, or rotating video clips of their sculpture or ceramics. Before Kaleidospace had a secure online transaction system running, its procedure was to take orders via online forms or e-mail and then call back to confirm the order and take credit card information. Now Kaleidospace is running Netscape's Commerce Server for all the artists, so it can take secure orders directly over the Internet.

Jeannie Novak, Kaleidospace founder and principal programmer and designer, started the Santa Monica, California-based company in January 1994 because, as an independent musician, she wanted an alternative way to distribute her album of classical and acoustic piano music. As of July 1995 Kaleidospace had Web pages for more than 200 artists and had moved into Web consulting with nearly 100 commercial clients. Novak said, "We do ALL the HTML for the artists. This way we create a consistent interface for the users and are able to support artists who don't have computer access (about 40 percent)." Pete Markiewicz, who does general support at Kaleidospace, adds that Novak writes all the HTML by hand but is working in Perl to create auto-entry forms specifically targeted for Kaleidospace Web page formats.

Technical Notes

The primary Web server runs on an accelerated Sun Sparc 2 with 64MB of RAM running a modified version of NCSA HTTPD. The Netscape Commerce Server runs on a Pentium 90 with 64MB of RAM. The combined Gopher-FTP server was run off a Macintosh Quadra with 24MB of RAM for a year, but Kaleidospace is moving it to a UNIX machine. Novak and Markiewicz claimed that Web site development is easier on Macintoshes, so they build sites there and transfer them to UNIX later. They frequently get several thousand individual users a day (with hits in the hundreds of thousands).
The chat room is done with WebChat server software.irsociety.com wbs.html Kaleidospace runs a virtual host system for its commercial clients to allow each to have its own host name.

Advice

Jeannie Novak advises: The main way to cope with high traffic is to increase RAM to 64 or 128 MB. Style and design are much more important than fancy HTML tricks. Make sure you have focused, real content not available elsewhere.coms are jarring and ultimately irritating to the users. On our own site we work with artists, musicians, writers, CD-ROM authors, performers, filmmakers, animators, software developers, and artisans--but they're ALL independent (meaning not under contract).

Los Angeles Murals Home Page http: latino.sscnet.ucla.edu murals

This Web site is attempting to document and display some of the 1,500 murals in Los Angeles. It is a combined effort of the Social and Public Arts Resource Center, the Mural Conservancy of Los Angeles, UCLA Chicano Studies Research Center, Social Sciences Computing at UCLA, and Robin Dunitz, who is contributing the contents of her book, Street Gallery: A Guide to 1,000 L.A.Plans include providing a clickable map of Los Angeles, with links to murals by location, artists' biographies, and lists of murals by artist, sponsor, neighborhood, and subject.

Technical Notes

This site is run on a Sun Sparcstation 10 with 36MB of RAM, running NCSA's HTTPD server.

MicroSemanario gopher: gopher.uba.ar 11 microsem , ftp: ftp.informatik.uni-muenchen.de local rec argentina micros , and http: www.informatik.uni-muenchen.de rec argentina

MicroSemanario (semanario means weekly in Spanish) was born to serve the many Argentine expatriates around the globe. Because of the country's economic woes, many Argentine scholars and scientists left during the last 30 years to continue their careers in the United States and Europe, with smaller numbers in Brazil, Israel, and Australia.
Because detailed news about Argentina can be difficult to obtain outside of Argentina, especially in Spanish, the School of Sciences (Facultad de Ciencias Exactas) of the University of Buenos Aires decided in November 1990 to start sending out Argentine news by e-mail over the Internet. This free e-mail newsweekly is divided into sections dealing with politics, economics, society, culture, education, science, and sports. According to Guillermo Gimenez de Castro, originator and director of MicroSemanario, "MicroSemanario is a weekly summary, and as such we take our sources from all the forms of press available: oral, written, and televised."

MicroSemanario is distributed through two mailing lists totalling more than 3,000 subscribers and has an estimated readership, including family members and friends of the recipients, of 6,000 to 9,000. The size of the weekly edition has grown to roughly 60K, so it is split into two parts to avoid e-mail problems. MicroSemanario gets some support from the university but relies to a large extent on volunteer labor for the writing and editing of the news summaries.

According to an article for Internet News by Ricardo Bravo (Centro de Comunicacion Cientifica-Universidad de Buenos Aires) and director Gimenez de Castro, The main aim of MicroSemanario is not that of journalism, but to provide fellow-countrymen something to stay in touch with their society, alleviating in part the feeling of losing links to Argentina. Moreover, Micro usually provides information on living conditions in Argentina, and attends--as far as possible--to questions of readers wishing to come back.

To subscribe to MicroSemanario send e-mail to majordomo ccc.ar and in the body of the message put subscribe micro.

Technical Notes

MicroSemanario is sent out entirely in low-end ASCII (32-127) to avoid e-mail translation problems. The University of Buenos Aires's Gopher and WWW server runs on a Sparc Sparcstation.
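The two mailing precautions mentioned for MicroSemanario (staying within ASCII codes 32-127 and splitting a large edition into parts) are easy to check mechanically. The sketch below is not the publishers' tooling, which the text does not describe; it is a minimal Python illustration. The function names are invented for this example, line breaks are assumed to be permitted in addition to the 32-127 range, and the size limit passed to the splitter is whatever the sender chooses.

```python
def non_ascii_positions(text):
    """List (line, column, character) for characters outside codes 32..127.

    Line breaks are assumed to be allowed and are not reported.
    """
    problems = []
    for line_no, line in enumerate(text.split("\n"), start=1):
        for col, ch in enumerate(line, start=1):
            if not 32 <= ord(ch) <= 127:
                problems.append((line_no, col, ch))
    return problems


def split_edition(text, max_chars):
    """Split text into parts of at most max_chars, breaking only between lines.

    A single line longer than max_chars is kept whole in its own part.
    """
    parts, current, size = [], [], 0
    for line in text.splitlines(keepends=True):
        if current and size + len(line) > max_chars:
            parts.append("".join(current))
            current, size = [], 0
        current.append(line)
        size += len(line)
    if current:
        parts.append("".join(current))
    return parts


# Illustrative input only: the accented letter is flagged, plain text is not.
print(non_ascii_positions("Hola\nma\u00f1ana"))
print(split_edition("aa\nbb\ncc\n", 6))
```

Breaking only at line boundaries keeps each part readable on its own, which matters when the parts arrive as separate messages.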
Midnight Special Bookstore http: msbooks.com msbooks

Midnight Special is an independent bookstore in Santa Monica, California, that has its own WWW site. The store rents Web space for $75 per month and is busy converting store sections into Web pages with resident experts, e-mail discussions, book reviews, and employee recommendations. The store's motto is taken from Bertolt Brecht: "Hungry man, reach for the book: it is a weapon."

Midnight Special's Web site advertises its calendar of events and offers weekly lists of best-sellers. Its inventory of 90,000 titles is searchable by title, author, and ISBN, and the search results lead to order forms. Tony Cappelli, the bookstore's Webmaster, has even started a Web column called Tools of Dissent in which he comments on the latest Internet technology and how it might be used for social and political action. The bookstore used the Internet to expand its market after several book superstores moved in down the block from Midnight Special.

Technical Notes

Midnight Special's WWW server is running NCSA HTTPD on a Pentium computer with 24MB of RAM and BSD UNIX.

Monster Board http://www.monster.com

The Monster Board is a career development site and a center for human resources information on the World-Wide Web. In May 1995 the Monster Board had more than 10,000 résumés and expected that number to skyrocket with its increased investment in equipment, advertising, and a faster Internet feed. This combination Web-WAIS site is a branch of ADION Human Resource Communications, a New England recruiting and advertising company. Posting a résumé is a matter of filling in online forms, with questions that include all the traditional résumé subjects like education, work experience, technical skills, willingness to move, and salary requirements.
Once users fill out the form, they receive a password good for updating at any time during the next 12 months. The Monster Board provides companies with two separate résumé-related services: A company that posts jobs to the Monster Board may review all résumés posted for that particular job.

The colorful monster images, meant to symbolize big and unique ideas, were actually in use by ADION in other projects but carried over well to the Web. ADION also claims that the monsters represent the characteristics its subscriber companies want in their employees: creativity, innovation, communication, energy, and positive results. John Kirby, project manager for the Monster Board, claims, "The Monster Board will literally change the face of human resources because jobs will be posted to the world, and you'll be collecting applicants from all over the world as well."

Technical Notes

The Monster Board runs on two Sun Sparc 20s (it started with a Sparc 2) running CERN HTTPD and a custom WAIS gateway based on freeware.

Advice

Kirby says, Make sure your site is unique. If it's a side business, don't rely on the Web to make your business for you unless you have a well-defined niche.

1990 Census Lookup http: cedr.lbl.gov cdrom doc cdrom.html and http: cedr.lbl.gov cdrom lookup

The Lawrence Berkeley Laboratory and University of California at Berkeley are working together to create the world's largest online database of federal government statistics.S. Although not all levels of census data are available, the Web site offers cross-tabulations of answers to various combinations of census questions by national, state, and urbanized metropolitan regions.

OncoLink http://cancer.med.upenn.edu, gopher://cancer.med.upenn.edu

OncoLink is a patient-physician cancer information resource sponsored by the University of Pennsylvania Medical Center and available via Gopher, WWW, and telnet.
OncoLink attempts to provide one-stop shopping for the patient, family member, health care provider, researcher, or browser searching for cancer-related information. Since its inauguration on March 7, 1994, OncoLink has been accessed 350,000 times from more than 75 countries and averages 4,000 accesses a day.

According to a paper presented to the American Medical Informatics Association in the fall of 1994, the OncoLink Web and Gopher sites were originally ordered around the academic specialties of the faculty contributors. This made sense to the faculty, but users (primarily laypeople who either knew someone with cancer or had it themselves) soon e-mailed urging a different arrangement. They requested, and the center quickly added, a menu of various cancer types, so the layperson can easily find all the relevant material for a particular cancer. The paper also discusses the center's study of server logs, particularly in terms of what links were being followed and what patterns they revealed.med.upenn.edu manuscripts amia.html . Statistical reports through April 30, 1995, show that 83.4 percent of all transactions are from World-Wide Web clients and 16.6 percent from Gopher clients. OncoLink also has found that whether a link is text or a small image makes a difference in how often that link is picked.

Technical Notes

OncoLink runs on a DECstation 5000/25 with 40MB of RAM.

Advice

According to Dr.m.

On-line Books Page http: www.cs.cmu.edu Web books.html

The On-line Books Page Web site, based at Carnegie Mellon University in Pittsburgh, is an index of hundreds of online books. It also points to some common repositories of online books and other documents, including the Project Gutenberg texts, as well as specialty or foreign-language repositories and book catalogs and retailers. According to site administrator John Ockerbloom, The On-Line Books Page started in 1993, when the Web was still a novelty in most places.
A staff member had made up some nice HTML versions of some Project Gutenberg texts, and as part of our initial departmental Web, I made up a page that pointed to his texts and included pointers to a few major book repositories like Gutenberg. Later I noticed that there was no overall list of titles and that this would be useful to avoid having to check each archive individually.

Special exhibits include Banned Books On-line and Celebration of Women Writers.

Technical Notes

The Web servers are Sparcstations running NCSA HTTPD 1.3 (with local modifications). The On-Line Books Page only indexes the books; all texts are stored elsewhere, so it uses relatively little space.

Advice

Ockerbloom lists the following steps for publishing information: Decide whether you should work on putting up new material, work on indexing or explaining existing material, move into a different or more specialized niche, or some combination of these things. If you can't do it through your school or workplace, many public access providers now give people the opportunity to publish Web pages. Find one that is reasonable (in price and in administration), seems to have an easily reachable Web server, and can provide you the space you need. Note that if you're indexing information, like us, rather than supplying a lot of new material yourself, you don't necessarily need a lot of space. Word of mouth will do much of the rest; once word gets around people will add their own links to you and suggest additional information you can put on your page.

Palo Alto Real Estate http: none.coolware.com real realestate.html

This Web site offers some real estate listings in the Palo Alto area of northern California as well as a map and information about the community.

Technical Notes

Palo Alto Real Estate runs its Web site from a Sun Sparcstation.

Advice

Keith Cooley of Coolware, the sponsor of the Palo Alto site, says, "Spend two hours learning HTML and then do it--break the mold and do what feels right."
PhoNETic http: www.soc.qc.edu phonetic

PhoNETic is a Web service that converts phone numbers to letters and vice versa. Either people just have to see it to believe it, or converting phone numbers to letters is more useful than you might think. This is a textbook case of using CGI scripts or programs behind a Web server to perform a task for whoever wants it.

PhoNETic was designed by Nick Sklavounakis and Nikolay Uglov, LAN/UNIX system administrators of the Department of Sociology at Queens College of the City University of New York. Sklavounakis recounts the origin of PhoNETic: It was a rare (very rare) day when my assistant and I didn't have much to do. I thought it would be a good idea to experiment with writing CGI scripts and implementing interactive forms on our Web pages.

Technical Notes

PhoNETic runs NCSA HTTPD from a DEC Alpha 3000/300X (175MHz) with 96MB of RAM.0 and a Connectix QuickCam. The day PhoNETic was Cool Site of the Day (awarded by Glen Davis at InfiNet), it had more than 70,000 hits.

Advice

Nikolay Uglov: Make sure your server can handle the load in case of a huge success; be friends with your network administrator.

Nick Sklavounakis: I think most important is to determine how much activity you think your server must deal with on a daily basis. Based on that, the appropriate hardware and software can be determined, as well as the type of link to the Internet you will have. The users will enjoy quick response time, and the server will have a lower overall load (people can connect and disconnect more quickly when data is transferred faster). As for hardware, there are many options available now, and for newcomers I would probably recommend looking at the alternatives to operating in a UNIX environment, like using a Macintosh Internet Server, Windows NT, or OS/2.
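The conversion PhoNETic performs is simple enough to sketch. The code below is not PhoNETic's own CGI program, which the text does not show; it is a small Python illustration of the two directions of the conversion. It assumes the current international keypad layout (with Q on 7 and Z on 9), which may differ from the layout PhoNETic used in 1995, and the function names are invented for this example.

```python
import itertools

# Assumed keypad layout; older telephones often omitted Q and Z.
KEYPAD = {
    "2": "ABC", "3": "DEF", "4": "GHI", "5": "JKL",
    "6": "MNO", "7": "PQRS", "8": "TUV", "9": "WXYZ",
}
LETTER_TO_DIGIT = {
    letter: digit for digit, letters in KEYPAD.items() for letter in letters
}


def letters_to_digits(text):
    """Replace each letter with its keypad digit; leave other characters alone."""
    return "".join(LETTER_TO_DIGIT.get(ch.upper(), ch) for ch in text)


def digit_spellings(number):
    """Yield every letter spelling of the digits in `number`.

    Digits without letters (0 and 1) stay as digits; punctuation is dropped.
    """
    digits = [ch for ch in number if ch.isdigit()]
    choices = [KEYPAD.get(d, d) for d in digits]
    return ("".join(combo) for combo in itertools.product(*choices))


print(letters_to_digits("1-800-FLOWERS"))   # -> 1-800-3569377
print(len(list(digit_spellings("228"))))    # -> 27 (3 x 3 x 3 combinations)
```

Because the number of spellings multiplies with every digit, a seven-digit number can produce thousands of candidates. A real service would need some way to filter or page through them.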
Playboy http://www.playboy.com

Playboy Magazine's Web site home page, with its faint background pattern of the Playboy bunny logo, includes excerpts from the magazine, cartoons, Playmate images, links to Playboy news and announcements, and the full text of some interviews from Playboy archives as well as the Playboy Advisor FAQ file. In addition, the Web site accepts online questions to the Playboy Advisor and is conducting an online search for a photo feature of the Girls of the Internet.

According to Eileen Kent, who started and runs Playboy's Web site, the magazine originally got an Internet connection in order to see what was being said about it on the Playboy newsgroup and mailing list (alt.mag.playboy and playboy-request lovesexy.com). But just doing an electronic version of the magazine never appealed to her, although Playboy updates the Web site simultaneously with the newsstand version. When Hugh Hefner, founder of Playboy, saw a collection of e-mail that came in from the Web site, he said it reminded him of the letters readers sent when the magazine first started.

Kent says that to her, "The best, the really good stuff on the Internet is the homegrown stuff." Mirsky's Worst of the Net http: turnpike.net mirsky Worst.html is one site she finds particularly hilarious. One concern she has is that Net users accord copyrighted material insufficient respect. She's convinced that every year a new crop of freshmen come in with Internet accounts, and they aren't getting educated about copyright issues as they should be.S. that make application software infringement a felony. Playboy often finds illegal archives of Playboy images and contacts site administrators about these copyright infringements, Kent said.

Playboy has been using Netscape Server and is moving to Netscape Commerce Server. Playboy created its HTML files in-house and gets more than 800,000 hits per day--and not just downloading the Playmate images, Kent says.
Advice

According to Kent, "Nobody should limit the Net to just thinking of it as a marketing phenomenon."

Project Gutenberg http: jg.cso.uiuc.edu pg pg home.html , ftp: uiarchive.cso.uiuc.edu pub etext gutenberg , and ftp: ftp.etext.org

The philosophy of Project Gutenberg is to make information, books, and other materials available to the general public in forms that most computers, programs, and people can easily read, use, quote, and search. They reason that years from now, when programs and systems have changed, plain old ASCII, readable on Macintoshes, PCs, and UNIX systems, will still be usable, if not any prettier than it is today.

Project Gutenberg has been publishing works in the public domain with the help of volunteers since 1971, when Michael Hart, professor of electronic text and executive director of Project Gutenberg at Illinois Benedictine College in Lisle, Illinois, decided to earn the equivalent of $100 million worth of computer time he had been given on a mainframe at the University of Illinois. Hart figured there was no normal computing he could do that would be worth that amount of money. Starting with the U.S.

Technical Notes

Project Gutenberg runs a modified version of the NCSA HTTPD server, which runs Mach NextStep v3.0 on an original 68000 NeXT cube. The modifications were designed to provide a firewall that has serving capability similar to what is available in the CERN HTTPD. The Web site enjoys about 13,000 hits a week, well distributed among all its pages, which is considered a good sign that people are actually reading them and not just browsing.cso.uiuc.edu jg.cso .

Advice

John M. Provide good content. Courtesy. Fresh links.

Travels with Samantha by Philip Greenspun http: www-swiss.ai.mit.edu samantha

Philip Greenspun is a graduate student in computer science at MIT who has a background in advertising photography and a knack for putting words together.
In late 1993 he found the World-Wide Web and decided to put the story of his summer travels around the United States on the Internet as a Web book. In the literary tradition of John Steinbeck's Travels with Charley, Travels with Samantha is a highly readable chronicle of the places he saw and the people he met, as well as his moving reason for taking the trip. Travels with Samantha includes 250 excellently digitized photographs to go with the stories written on his Macintosh PowerBook (Samantha) while on the road. Since Samantha, Greenspun has written of his travels to Berlin, Prague, and New Zealand and put up earlier writings on photography and computer science.

Technical Notes

Greenspun also edits two online journals, Web Tools Review http: webtools.com wtr and Photo Journal http: photo.net photo .5GB per day answering user requests.

Advice

Greenspun finds that Web publishers with desktop scanners and 8-bit video boards are the biggest source of ugly images on the Internet today. He recommends using a 24-bit video card for any image work, Kodak PhotoCD as the minimum acceptable quality scan, and a T-1 connection as the minimum speed necessary for serving popular pages with many photographs.

Virtual Shareware Library http: www.fagg.uni-lj.si SHASE

Descriptions of more than 100,000 pieces of software are available on the Internet, archived at FTP sites like SimTel, CICA, Hobbes, and SunSite. The Virtual Shareware Library (VSL) acts as a catalog by using a search-and-retrieval mechanism (SHASE) behind a WWW server to provide a user-friendly way to select particular sites and then search those software descriptions. Dr. Ziga Turk (pronounced Zheegah Toork), a member of the faculty of civil engineering at the University of Ljubljana, Slovenia, developed the service out of personal frustration: the Internet opened up tons of shareware and related software, but he could not tell what was available and what wasn't.
At about the same time I started the WWW server, I found out that Perl is a language everyone uses for CGI scripts. As an exercise in Perl and for my own use I wrote the search engine pointing to CICA and SimTel only. It was rudimentary but proved popular.

Another, more serious problem was a legal matter that arose when some FTP site managers claimed their indexes should not be used in Turk's service without their explicit permission. That forced Turk to close his service for a week or so, until he and the site managers were able to reach a friendly agreement. "The lesson I learned was to ask for permission from the managers of all the other archives (most had been contacted before)," Turk said.

Technical Notes

In Ljubljana the VSL runs on an HP710 workstation running NCSA's HTTPD server. When the VSL-related load on the server in Ljubljana approached 50,000 hits a day, users at the university found they had difficulty gaining access to other services run from the same server and to mirrors pumping the database each day, so now the server in Ljubljana does not allow access for the commercial and unresolved domains.

Advice

Turk offers the following advice: First impressions are important.

It's difficult to write about the future of Internet publishing when it's changing literally day by day. But in the year that I've been thinking about and working on this book I've seen some trends that I think will continue. To state it simply, publishing on the Internet is becoming easier and more widespread much faster than I think anyone anticipated. Computers do not solve problems, but people do, and computers are linking people together in ways never before imagined. If the theory that online versions will stimulate interest in buying print versions proves true, perhaps Project Gutenberg and others like it will stimulate a renewed interest in the classics as well as less well-known literary works.
This chapter is loosely divided into several sections, but it's important to read the sections you're not interested in.

Internet Hardware and Services

A natural and realistic concern is how well the Internet will hold up under all the attention and massive growth.S.S. Although loss of federal support may lead to higher access charges, it's not clear how much that will affect end users. Whether the Internet is ready or not, the influx of new users and new hosts is tremendous. These users will affect all services currently available, and publicity can provoke wild distortions in the load on any given server.

Whether the growth comes from integrating the cable TV system and the Internet, from ATM (Asynchronous Transfer Mode), or from ISDN (Integrated Services Digital Network) access to home and office, the Internet pipeline is likely to get bigger and faster in the next few years--probably not big enough or fast enough to accommodate everyone and their multimedia dreams but considerably improved over today.

Archie, Lycos, and Veronica, among other centralized indexing services, are becoming overloaded. One alternative that's arising is commercial versions of these services, assuming that people will be willing to pay to get immediate service. We're now on the second generation of WWW indexing approaches, which are becoming more sophisticated. One approach, I think, will be for user input to become an important element in adding to the depth and accuracy of the Internet indexes and subject directories. Lycos, for example, makes a strong step in that direction by allowing users to fill out an online form to remove a URL from the Lycos index. InfoSeek (http://www.infoseek.com) and Architext (http://www.atext.com) both offer new searching technology.
The service uses statistical and analytical methods to turn up documents related to the words you used in your search, even if those words don't actually appear in the documents.

Software, Data Formats, and Protocols

When you think about how much has happened because a few people at different universities and institutes decided to write some software (Gopher at the University of Minnesota, Veronica at the University of Nevada at Reno, WWW at CERN, and Mosaic at NCSA at the University of Illinois), trying to predict what might happen in the next few years is foolish.

But first let me deal with the competition between Gopher and WWW. Even if everyone had a fast computer and speedy Internet link, Gopher would have a future. Gopher server software may not be that much easier to administer than a Web HTTPD server, but it is definitely simpler to put Gopher documents on the Internet than it is to write HTML and maintain links for Web documents. The irony here--and WWW zealots will loudly proclaim this--is that nothing prevents you from putting plain text documents on a Web server, just as you do on Gopher servers. WWW with HTML formatting can make online documents look so much nicer to those with the right equipment that it would be a waste to ignore that capability--and few do. But that's just my opinion.

Michael Regoli, administrator of the CICA Windows FTP site, says: I think that in terms of bleeding edge Internet technology, Gopher has indeed seen its day; this was predicted when I first saw the Web in action, some 12 to 18 months ago (as of April 1995). It doesn't take big hardware to run it (on the client side), and I think it will continue to have a following, especially among those using limited computing resources (low-end 386s, 286s, etc.)
and limited net access.

Paul Hoffman, president of Proper Publishing, an Internet consulting company (http://www.proper.com), believes that Gopher will never go away, in the same sense that FTP hasn't gone away. Anyone who thinks that the whole world is going to graphical browsers doesn't understand the large corporate and the academic markets.S.: it is far from dead 30 years after television reached a 90 percent penetration in the home. There are no technical advantages to Gopher versus the Web, but that is irrelevant. What will keep Gopher alive is that it is easier to install and support Gopher clients on character-based systems than it is Web clients.

The University of Minnesota Gopher Development Team is also working on the interaction of Gopher and WWW.micro.umn.edu:70 00 gopher Gopher Conference 95 Papers WebbedGopher

Both Gopher and WWW browsers will be enhanced by the development of plug-in viewers that handle documents with special formatting. One essential element of software is the formatting of the data that are being manipulated.

PDF--Adobe's Portable Document Format provides the graphic control that many Internet publishers want, but file sizes are rather large. PDF is starting to get its own add-on products, which increases the probability that it will be seen as a real graphical alternative, and perhaps a new standard, for publishing on the Internet., which allows members of a work group to share and co-edit documents in PDF. Readers can send their comments (including voice annotations, pictures, bit maps, sound files, spreadsheets, and so on) back to the author, who can then incorporate comments from several different readers in a revised version of the original document.

IIF--Interactive Image Format (used in Blue Skies Gopher) is one alternative to WWW's clickable image maps; it offers immediate feedback to users as their mouse moves over hot or warm areas of the map.
Developed under a National Science Foundation grant at the University of Michigan, its viewers are available embedded in Macintosh and Windows Gopher clients and will be added as a viewer for Web clients. sprl.umich.edu IIF

Microsoft Word for Windows Format--This program has some advantages if you think of how many people use Word for Windows.

HTML--HyperText Markup Language is fighting to maintain its place as the document format of choice for WWW servers. 0 adds significant features for greater flexibility in document formatting and control. ics.uci.edu pub ietf html and http://www.w3.org/hypertext/WWW/MarkUp/MarkUp.html for more information on where HTML is going. But the drive for a powerful formatting language still wages war with the drive for a language that clearly defines the structure of a document.

ASCII--ASCII text has been the underlying text format in Internet protocols like Gopher, WAIS, and WWW (HTML is written in ASCII), but it has some basic flaws. The main one is that it doesn't work well for languages other than English, especially ideographic languages like Chinese, Japanese, and Korean.

Unicode--Unicode is a superset of ASCII that was designed to solve the problem of representing all of the world's modern written languages (including ideographic languages), but it has not been widely adopted. Microsoft has said that Unicode is part of its long-term strategy and in fact adopted it with Windows NT and Windows 95. unicode.org for more information on Unicode.

Password Validation and Secure Systems

The combination of two different and formerly competing security systems (SSL and S-HTTP) in Terisa's cooperative venture should go a long way toward establishing a system of secure communication across the Internet.
Once a secure system is developed and becomes available for other companies to use, browsers with these new features will become available.

3-D Virtual Gopherspace

The Gopher team at the University of Minnesota is working on a three-dimensional virtual Gopherspace system that will offer a new way to browse Gopherspace. Unfortunately, it will require the browser to have a powerful machine, which runs contrary to Gopher's strength in being friendly to low-end PCs. micro.umn.edu:70 00 gopher Gopher Conference 95 Papers 3DUnixServer

3-D Web and VRML

The April 1995 WWW Conference in Darmstadt, Germany, saw the announcement of the availability of VRML, or Virtual Reality Markup Language, for WWW. One commercial example might be an auto company that simulates its new car in VRML and puts the simulation on the Internet for online suggestions and improvements. ncsa.uiuc.edu General VRML VRMLHome.html, http://www.sdsc.edu/vrml, and http://vrml.wired.com.

Better HTML Editors

One clear trend is toward improvement in the number and quality of such WWW publishing tools as HTML editors and converters. Microsoft and Novell have already brought out free HTML authoring add-on packages for their word processors, Microsoft Word for Windows 6.0 and Novell's WordPerfect. And as authoring tools become more prevalent and easy to use, the number of sites and amount of content will increase exponentially, especially if quality is maintained.

PDAs and Web Browsers

PDAs, or Personal Digital Assistants, such as Apple's Newton, are computers and therefore candidates for Internet browsers. That means it can display HTML files it has stored in memory but has no facility for easily browsing the Internet. With improvements in communications and wireless technology, it might be possible to look at the Southern California Real-Time Traffic Reporter Web site while you are driving in that traffic (http://www.scubed.com/caltrans).
New Schemes for Resource Naming

URLs, or Uniform Resource Locators, should be familiar to you by now. They simplified life on the Net immeasurably, but a significant drawback to the URL is that moving a file or changing a host name can invalidate documents around the world that link to it. If the host specified in a URL is not available, or too busy, or has changed its name, that link won't work. The results include an alphabet soup of acronyms--URI (Uniform Resource Identifier), URN (Uniform Resource Name), URC (Uniform Resource Characteristic), and URA (Uniform Resource Agent).

URI (Uniform Resource Identifier). The URI working group of the Internet Engineering Task Force (IETF) is attempting to move WWW and other Internet services to a system by which a name (URN) is assigned to a resource that is described (by author, title, keyword) in a URC and can be found at various locations (URLs). The IETF draft proposals on URIs are located at http://www.ietf.cnri.reston.va.us/ids.by.wg/uri.html. acl.lanl.gov URI archive uri-archive.index.html.

URN (Uniform Resource Name). That is, if a document proves extremely popular, many sites around the world might maintain a copy, and the URN server would list all of them.

URC (Uniform Resource Characteristic). Having a clearly defined way to store information about a document can be crucial to being able to retrieve documents that match your search criteria. acl.lanl.gov URI urc draft.txt for the URC draft specification.

URA (Uniform Resource Agent). URAs refer to agents that would be programmed to work within our computers to do certain types of searches on the Internet. For example, a URA might be defined to do a person search using existing free Internet search services, such as Netfind (http://www.rpi.edu/Internet/Guides/decemj/itools/nir-utilities-netfind.html) and Whois (http://www.rpi.edu/Internet/Guides/decemj/itools/nir-utilities-whois.html), to find a long-lost friend who might be online.
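To make the idea of a person-search agent a little more concrete, here is a minimal sketch in Python. It is only an illustration of the general shape such an agent might take, not the design in the URA draft: the class name PersonSearchAgent, the add_script method, and the two stub functions are all hypothetical, and the stubs look names up in small made-up tables instead of contacting the real Netfind or Whois services.

```python
from typing import Callable, Dict, List

# A search script takes a person's name and returns candidate e-mail addresses.
SearchScript = Callable[[str], List[str]]


def netfind_stub(name: str) -> List[str]:
    """Hypothetical stand-in for a Netfind query; no network access."""
    directory = {"pat example": ["pat@cs.example.edu"]}
    return directory.get(name.lower(), [])


def whois_stub(name: str) -> List[str]:
    """Hypothetical stand-in for a Whois query; no network access."""
    directory = {"pat example": ["pat@cs.example.edu", "pexample@example.com"]}
    return directory.get(name.lower(), [])


class PersonSearchAgent:
    """Runs every registered search script and merges the answers."""

    def __init__(self) -> None:
        self._scripts: Dict[str, SearchScript] = {}

    def add_script(self, service: str, script: SearchScript) -> None:
        # A new person-searching service is supported by registering a script.
        self._scripts[service] = script

    def search(self, name: str) -> List[str]:
        found: List[str] = []
        for script in self._scripts.values():
            for address in script(name):
                if address not in found:  # drop duplicates, keep first-seen order
                    found.append(address)
        return found


agent = PersonSearchAgent()
agent.add_script("netfind", netfind_stub)
agent.add_script("whois", whois_stub)
results = agent.search("Pat Example")
print(results)
```

The point of the design is that the agent itself knows nothing about any particular service; each service is just one more script in its table.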
If in the future another type of person-searching service develops, scripts that interact with it could be added to the person-search URA. bunyip.com:8000 products client client.html, to show that it works. bunyip.com:8000 products client draft-ietf-uri-ura-00.txt. Developing the URA is important because it could become a building block for more sophisticated tools that could make finding information on the Internet less of a hit-or-miss process.

Common Client Interface

Common Client Interface (CCI) is the converse of the already-established Common Gateway Interface (CGI) that provides for scripting or programming links behind the scenes on WWW (and now Gopher) servers. For example, you might have a program that starts your WWW browser 30 minutes before you get to work each morning and does a standard set of searches for news and information pertinent to your interests.

WWW Client-Pull and Server-Push

Two innovations from Netscape are Client-Pull and Server-Push scripting in HTML. Client-Pull allows WWW browsers to download an HTML document and then go back repeatedly, with no command from the user, to download a particular document, image, or resource from that same server. The inverse of this is Server-Push, which requires the HTTP server to keep the connection open and to send updates at certain intervals to the Web client. One application might be a stock market service that constantly updates stock prices and downloads them to the Web browser, with no action on the user's part.

Hot Java from Sun

Java is a new programming language developed over the last several years by Sun Microsystems as an improvement on the C language. Hot Java takes this language and puts it into a Web browser so that it understands or interprets programs written in the Java language.
Once you have the two together (Hot Java in the browser and Java programs on the server), a Web publisher could create a special program (or applet, as Sun calls them) in the Java language that would be immediately available to all Hot Java-compatible Web browsers. For example, if someone wants to develop a utility for users of her Web site, she can write it in Java; users with a Hot Java-compliant browser could automatically download the program and run it on their machine. sun.com.

WebForce

Silicon Graphics, Inc. (SGI), is starting to bundle Netscape's Commerce Server, WebMagic Author (WYSIWYG HTML Editor), and an enhanced Netscape Web browser with a new line of SGI workstations. This development could mean that publishing on the Web is as easy as browsing it, if you buy one of these new machines. Bundling HTTP and other Internet server software with some workstations and PCs is a trend that signals that Internet publishing has arrived.

Oracle World-Wide Web Interface Kit

Oracle, the leading database server company (for UNIX, Netware, Windows NT, and OS/2), is offering the World-Wide Web Interface Kit--at no charge.

Web Mail

The goal of the Web Mail project, partially sponsored by the Open Software Foundation Research Institute (http://www.osf.org/RI), is to seamlessly integrate mail processing and the Web browsing environment. The idea is to be able to access e-mail with all the normal e-mail functions--reply, delete, forward, and so on--while browsing the Web. In addition, the project also wants to devise a way to give you Web access to your store of archived e-mail without forcing you to convert it to HTML. One exciting result of this project might be to make it possible to control your e-mail archives through Web browsers, which are usually able only to read or browse archives. osf.org:8001 www webmail.

Speeding Up the HTTP Protocol

Latency refers to the interval between a request for a document and when you can retrieve it.
The basic complaint seems to be that because each document usually contains inline images, all of which are retrieved separately, telephone costs mount because of the delays that result from repeatedly opening new connections to the server. spyglass.com techreport six developers tech doc6.html. Another paper, by Venkata N. Padmanabhan and by Jeffrey C. Mogul of Digital Equipment Corporation's Western Research Laboratory, describes these problems, methods they devised for resolving them, and statistics showing how their methods improve speed. They suggest including in HTTP a GETALL function that would retrieve a document and all its inline images in one connection. If it works as billed, we can look forward to improvements in speed and access, even without improvements in Internet bandwidth. spyglass.com six developers tech doc5.html.

WAIS and Z39.50

According to a company representative, WAIS, Inc., is considering making a free version of its $15,000 commercial WAIS server product available for downloading from its Web site (http://www.wais.com) sometime in mid- to late 1995. It probably would be limited to indexing files of 10MB or less, but otherwise it would have all the features of the commercial WAIS server 2.1, which is Z39.50-V2 compliant. Z39.50-1995 (version 3) was approved at the end of 1994 and should become final in 1995, hence the name. 50-1995 compliant. 50 applications and WWW to Z39.50 Gateways, see Table 5-1. S., LEXIS NEXIS, and MIT Information Systems are developing or using Z39.50 in their applications.

Commerce

Commerce on the Internet is alternately seen as its salvation and its destruction. Some have expressed the viewpoint that until commercial services and content arrive on the Internet, nothing of value will be available. That point of view ignores the cooperative work that's being done and the real virtue of the Internet: access to millions of people all over the world. I think that commercial activity on the Internet will take two major forms.
One form involves items or information being bought and sold, and the other involves items or information exchanged at no cost in indirect support of a company's business. Given the public relations, advertising, focus group, and customer support aspects of modern business, the Internet provides an excellent medium in which to conduct these activities.

Recent developments in the areas of digital signatures, digital watermarks, document coding, and digital cash are going to make it easier to do business on the Internet. In March 1995 Utah passed a law making digital signatures legally binding, and the states of California and Washington are said to be considering similar legislation.

Digital watermarks consist of an image or piece of text married to a graphic file in such a way that it can be used to identify the owner of the image. Systems Research and Applications Corporation (SRA) of Arlington, Virginia, has applied for a patent on a technique it developed and calls Imprint(TM) that is being used by large online publishers of digital photos. Imprint runs on a Macintosh and marries a company logo or other image file in PICT format to a TIFF image file. com or at 703-803-1883 (voice).

AT&T has been researching the encoding of small amounts of information (four binary digits) in formatted text. Formatted in this case means that the text is in either an image file or in a format description language such as PostScript, TeX, or troff. The encoding techniques include shifting lines of text, shifting the position of words, and altering the features of certain letters or fonts. research.att.com docmark

Popularization and trust of digital cash and other electronic payment systems are essential for large-scale purchasing over the Internet. It's quite likely that 1996 will see the arrival of many more sites with something to sell and that they will accept electronic payment of some sort.
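To make the line-shifting idea for marking formatted text more concrete, here is a minimal sketch in Python. It is an illustration only and should not be read as AT&T's actual method: the 12-point line spacing, the half-point shift, the choice of every second line as a carrier, and the function names encode and decode are all assumptions made for the example. A page is modeled simply as a list of baseline positions.

```python
from typing import List

LINE_SPACING = 12.0  # assumed nominal distance between baselines, in points
SHIFT = 0.5          # assumed size of the barely visible shift, in points


def encode(bits: List[int], n_lines: int) -> List[float]:
    """Return baseline positions in which every second line carries one bit.

    Carrier lines are moved by +SHIFT for a 1 and -SHIFT for a 0; the lines
    between them stay put so they can serve as reference points.
    """
    if n_lines < 2 * len(bits) + 1:
        raise ValueError("not enough lines to carry the mark")
    baselines = [i * LINE_SPACING for i in range(n_lines)]
    for k, bit in enumerate(bits):
        carrier = 2 * k + 1
        baselines[carrier] += SHIFT if bit else -SHIFT
    return baselines


def decode(baselines: List[float], n_bits: int) -> List[int]:
    """Recover the bits by comparing each carrier line with its neighbors."""
    bits = []
    for k in range(n_bits):
        carrier = 2 * k + 1
        midpoint = (baselines[carrier - 1] + baselines[carrier + 1]) / 2
        bits.append(1 if baselines[carrier] > midpoint else 0)
    return bits


mark = [1, 0, 1, 1]              # four binary digits, as in the text
page = encode(mark, n_lines=9)   # nine lines: four carriers, five references
recovered = decode(page, n_bits=4)
print(recovered)
```

Comparing each carrier line with the midpoint of its two unshifted neighbors, rather than with an absolute position, is a design choice that lets the mark survive a uniform rescaling of the page.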
Publishing Issues

Subscribing to academic journals is becoming increasingly expensive, and many university libraries are unable to maintain the wide variety of subscriptions their faculty demands. As the financial stability of these journals weakens, online publishing will become increasingly attractive, as it has in the last few years in some disciplines. umds.ac.uk JMI ejourn.html

Seals of APproval (SOAP) is a system under discussion that online academic journals might use to give their seal of approval to particular journal articles or documents to show that they have passed a peer-review process.

Academic journals are not the only ones going online. Sometime in 1996 or shortly thereafter you will find it possible to move documents directly from your word processor to an Internet server format such as HTML, PDF, ASCII, or PostScript. Organizing the content and maintaining links will continue to be labor intensive, but that load will increasingly be spread among the actual producers of the documents as well as increasingly sophisticated data librarians.

Several U.S. com, and the San Jose Mercury News (http://www.sjmercury.com), have begun to offer online editions of some sort. And the San Jose Mercury News offers news summaries and updates throughout the day as well as all its classified ads. Journalists already have access to a wealth of online information, and newspapers may link their online stories to related stories in their archives. Because online publication means that stories could include video and sound, editors will have more options in stories they put online, although newspapers are still trying to learn how to make money this way.

Political and Social Issues

The rise of a completely new, extremely powerful, and far-reaching communication tool is bound to raise political and social issues.

Many parents, educators, and politicians are concerned about the lack of safeguards to what children may encounter on the Internet. S.
The bill as written was so broad that enforcement would be difficult and threatened a huge censoring effect on Internet publishing. Another problem inherent in such legislation is the application of moral standards from one area to another as the Internet makes geography less of a factor.

Another approach to safeguarding children is to develop restricted access systems that can reach only certain limited pre-approved sites or newsgroups. Although commercial servers will almost certainly take this approach, it requires a body that would approve sites for the rest of the Internet. The sites would be left with the problem of checking all their material--wading through an unimaginable amount of material and then filtering all new material the same way.

Someone may be able to develop a set of Internet tools for limiting access to a certain range of approved servers and refuse to link to servers that are not on the list. This might work as the inverse of the security scheme common on servers that already restrict access to certain servers or Internet subdomains. Some critics of these solutions reason that free speech brings with it risks and responsibilities. Part of the problem is what children might find accidentally; another concern is that children will actively search out forbidden material. If limiting software is developed, organizations might offer a highly controlled (censored) subset of Internet resources to their users.

The next few years will see many and varied solutions to these problems.

Another problem is who defines what is appropriate. Because almost anyone with Internet access and a computer can publish on the Internet, I think everyone will have to define his own standard. What people post and the items on other servers to which they link will reflect their standards and values and may or may not be appropriate for their audience.

The other immediate political problem on the Internet is encryption and who has access to it. S.
Although some efforts are being made to change this policy, other countries have even stricter laws concerning the use of encryption techniques.

The movement to make the Internet and its resources available to all is strong. In many cases that requires that a country first acquire sufficient technology and infrastructure to gain access and take advantage of the Internet. In the United States the debate centers on whether the government should license commercial Internet and telecommunications providers, requiring them to connect and provide a certain minimum level of service to all schools and libraries in their communities in exchange for licenses. None of these issues is easily resolved, and you should expect to hear a great deal of discussion about subsidized access.

Even as some people argue about censorship and whether to license commercial gateways to the Internet, others are worrying that the Internet, like television before it, will lead to the dumbing down of society. These critics point out that because abstracts are much easier to distribute and archive on the Internet than full text of articles and books, reading the abstract of an article will begin to replace the reading of the full text. But the popularity of Project Gutenberg and its massive effort to publish the full text of works of literature (described in Chapter 10) is eloquent testimony that the Internet is unlikely to carry television's baggage. I think the Internet will increase the level of exposure to high culture, the cultures of other countries, and different points of view. My hope is that the profusion of varied points of view on the Internet will be greeted enthusiastically by the majority.

Conclusion

No one book can cover all the technical, philosophical, social, and legal aspects of publishing on the Internet.
Given the amazing rate at which Internet publishing techniques are evolving, and that your next step should be to explore these resources on the Internet, I'd have to paraphrase her.