TECHNOLOGICAL UNDERPINNINGS: COMMUNICATIONS AND NETWORKING
Rudolf Nottrott
National Center for Ecological Analysis and Synthesis, 735 State St., Suite 300,
Santa Barbara, CA 93101
Abstract. At most biological field stations there are few formal provisions for on-going data exchange after individual investigators have returned to their home institutions. However, opportunities for such long distance collaboration and information exchange have recently increased with the development of wide area network technology. Wide area networks have the potential to change the culture of collaborative research in ecology. To familiarize ecologists working at biological field stations with the mechanics of Internet communication and data exchange, this chapter provides a brief review of the history of electronic networking; the architecture, protocols and common client/server applications of the Internet; and basic network security issues.
INTRODUCTION
Ecologists have recognized an increasing need for long-distance collaboration, rapid communication, and increased data access. Biological field stations, which may host hundreds of scientists and research studies over the course of decades, have a clear need for data archiving and access to those data by geographically and temporally dispersed researchers. Once biological field stations and other research institutions have established standards for metadata, protocols, and network software, archiving of long-term data (e.g., relating to the history of a site) for efficient retrieval will be possible. In this chapter, key aspects of network infrastructure such as network functions, hardware, client/server mechanisms, and security are reviewed.
HISTORY
1989 and before - a jungle of networks
Until 1989, there were few opportunities for ecologists to utilize wide area networks for data exchange or collaborative research. The myriad of incompatible networks with variable longevity (Frey and Adams 1989) made it impractical for scientists, both nationally and internationally, to use this technology for information exchange. However, once the potential significance of wide area networks for data exchange between geographically dispersed researchers was recognized, demand for enhanced network functions and increased network access ensued.
A 1989 survey of 18 Long-Term Ecological Research (www.lternet.edu) sites (with more than 500 widely-dispersed researchers) indicated three primary needs related to increased network capabilities (Brunt et al. 1990):
THE INTERNET
Beginning as early as 1969, but accelerating exponentially around 1989 (Rutkowski 1997, Leiner et al. 1997), the growth of the Internet and its current position as a de facto global standard have made it feasible for widely distributed researchers to use wide area network technologies. The key elements of the Internet's success were the openness and expandability of the Internet protocols, and their scalability from Local Area Networks (LAN) to global Wide Area Networks (WAN). Although Internet access is not yet truly global, it is currently widely available in the U.S., Europe, Japan, Southwestern Australia, and parts of South America and South Africa (Landweber, ftp://ftp.cs.wisc.edu/connectivity_table; Matrix Information and Directory Services, http://www.mids.org/mapsale/world/index.html).
Definition of the Internet - the foundation of TCP/IP
The Federal Networking Council (Federal Networking Council 1995, http://www.fnc.gov/) defines the term Internet as the global information system that:
(i) is logically linked together by a globally unique address space based on the Internet Protocol (IP) or its subsequent extensions/follow-ons;
(ii) is able to support communications using the Transmission Control Protocol/Internet Protocol (TCP/IP) suite or its subsequent extensions/follow-ons, and/or other IP-compatible protocols; and
(iii) provides, uses or makes accessible, either publicly or privately, high level services layered on the communications and related infrastructure described herein.
The global address space of (i) is illustrated in Figure 1, which shows five computers (Internet hosts) on four continents connected via the Internet.
Figure 1. IP global address space.
Each computer is identified by a unique address, called an IP number. For better readability, IP numbers are usually shown as four decimal numbers separated by periods (e.g., 128.85.36.9), but they simply represent 32-bit binary numbers, allowing for 2^32 = 4,294,967,296 computers. In practice the number is smaller, because blocks of IP numbers are reserved for various technical reasons (Hunt 1992). Also, organizations are allocated whole blocks of numbers (usually 256 or 65,536 numbers), which they can use at their discretion. Some argue that present trends indicate a leveling off in the number of host computers at approximately 38 million hosts around the year 2002 (Hilgemeier 1997). However, the future Internet may well have 128-bit IP numbers to avoid the bottleneck of address shortages, and thus keep growing into the foreseeable future.
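The relationship between the dotted notation and the underlying 32-bit number can be checked with a few lines of Python, using the standard ipaddress module. This is only an illustrative sketch: the address is the example from the text, while the two block prefixes are assumptions chosen to show blocks of 256 and 65,536 numbers, not actual allocations.

```python
import ipaddress

# Example address taken from the text above.
addr = ipaddress.IPv4Address("128.85.36.9")

# The dotted form is a readable spelling of a single 32-bit integer.
as_int = int(addr)
as_bits = format(as_int, "032b")

# Size of the full 32-bit address space.
total_addresses = 2 ** 32

# Blocks of 256 and 65,536 numbers correspond to 24-bit and 16-bit prefixes.
# These two prefixes are illustrative assumptions, not real allocations.
small_block = ipaddress.ip_network("128.85.36.0/24")
large_block = ipaddress.ip_network("128.85.0.0/16")

print(addr, "=", as_int, "=", as_bits)
print("address space:", total_addresses)
print("block sizes:", small_block.num_addresses, large_block.num_addresses)
```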
To further simplify use of the system, IP numbers are commonly represented in the form of hierarchical domain names, e.g., LTERnet.edu instead of 129.24.70.200, and Domain Name Service software is used to facilitate the conversion to numeric IP addresses. The basis of the IP protocol is that all information sent over the network is in small packets (e.g., 1000 characters) complete with destination and sender IP numbers plus other data (e.g., sequence number) needed during and on delivery. The packets may arrive at their destination in arbitrary order, but software on the destination computer can put them back together as needed, using the sequence number. Imagine a colleague in Australia sending you a 300-page story in sequentially numbered, one-page letters, at a rate of one per day. After nearly a year, you compile them to get the full story. Fortunately, the Net is faster than that.
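The packet idea described above can be sketched as a toy model in Python. The dictionary fields and function names are hypothetical and do not reflect the real IP or TCP header layout (in the actual protocol suite the sequence numbers are carried by TCP); the sketch only shows that sequence numbers let the receiver restore the original order.

```python
import random

def split_into_packets(message, size, sender, destination):
    """Cut a message into packets carrying addresses and a sequence number."""
    packets = []
    for seq, start in enumerate(range(0, len(message), size)):
        packets.append({
            "src": sender,
            "dst": destination,
            "seq": seq,
            "payload": message[start:start + size],
        })
    return packets

def reassemble(packets):
    """Rebuild the message by sorting the packets on their sequence number."""
    ordered = sorted(packets, key=lambda p: p["seq"])
    return "".join(p["payload"] for p in ordered)

message = "A 300-page story, sent one page at a time. " * 50
packets = split_into_packets(message, 1000, "129.24.70.200", "128.85.36.9")
random.shuffle(packets)  # simulate arrival in arbitrary order
restored = reassemble(packets)
print(len(packets), "packets; restored correctly:", restored == message)
```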
Layers and protocol stacks
Network architects conceptualize, design and implement their network software in what they call "layers." For the Internet, the layers represent protocols including IP and TCP/IP. In schematic diagrams the software layers resemble stacks of bricks. Hence, they are often called protocol stacks. Before protocol stacks came preinstalled with most computers, one would have to install them before attempting a connection to the Internet (the Trumpet Winsock stack is a well-known example for Windows® 3.1). Figure 2 illustrates the Internet Network layers and how they relate to the Internet definition described above (this is a special case of the ISO/OSI reference model as detailed in Hunt 1992).
Figure 3 illustrates the same layers with an e-mail handling program at the top application layer. The Internet Layer, IP, corresponds to (i), the Host-to-Host Transport Layer, TCP/IP, corresponds to (ii), and the Application Layer, Telnet, SMTP, FTP, HTTP, correspond to (iii). Most network users work at the level of the "high-level services" of the Application Layer, such as Telnet, FTP or HTTP (through Web browsers). By examining some examples of high-level applications, and considering how layers pass data back and forth, the mechanics of the Internet become clearer.
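How layers pass data back and forth can be illustrated with a small Python sketch in which each layer wraps the data from the layer above in its own header, and the receiver strips the headers in reverse order. The header formats here are invented text strings used purely for illustration; real TCP and IP headers are binary and carry many more fields.

```python
def application_layer(text):
    # Application data, e.g. the body of an e-mail message.
    return text.encode("ascii")

def transport_layer(data, seq):
    # Toy stand-in for TCP: prefix a sequence number.
    return ("TCP|seq=%d|" % seq).encode("ascii") + data

def internet_layer(segment, src, dst):
    # Toy stand-in for IP: prefix sender and destination addresses.
    return ("IP|%s>%s|" % (src, dst)).encode("ascii") + segment

def receive(packet):
    # Receiving side: strip the headers in reverse order.
    _, addresses, rest = packet.split(b"|", 2)
    _, seq_field, data = rest.split(b"|", 2)
    src, dst = addresses.decode("ascii").split(">")
    seq = int(seq_field.decode("ascii")[len("seq="):])
    return src, dst, seq, data.decode("ascii")

# Sending side: data moves down the stack, gaining one header per layer.
packet = internet_layer(
    transport_layer(application_layer("Hello from the field station"), 0),
    "129.24.70.200", "128.85.36.9")
print(packet)
print(receive(packet))
```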
Clients and servers - present-day workhorses of the Internet
Figure 2. Internet network layers.
Figure 3. Internet network layers with an e-mail handling program.
The example of e-mail delivery in the previous section illustrates another concept that has found widespread use in software architecture for network-distributed applications: client/server architecture. A client is generally a software program that requests a "service" from another program called a server. In the example of e-mail delivery, the client might be a program such as Eudora®, pine, Microsoft Outlook® or the original Unix program called 'mail', all available for many different kinds of operating systems and computer platforms. Its server counterpart has historically been a program called 'sendmail' running on Unix machines (alternative mail servers are now available). The client and the server communicate using standard keywords and formats, which are called a "protocol." In the case of e-mail delivery the protocol is called Simple Mail Transfer Protocol (SMTP). Mail delivery using this scheme is not unlike a Telnet session (although in practice a software mechanism called "sockets" is used, with Winsock and BSD Unix sockets being most common), and it is indeed possible to 'talk' by telnet directly to e-mail servers, as well as many other servers (such as Web, WAIS and News servers).
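The keyword exchange between a mail client and a mail server can be sketched with Python sockets. The server below is a toy written for this illustration: it runs on the local machine, handles one client, and understands only a handful of SMTP-style keywords, so it should not be mistaken for sendmail or for a complete SMTP implementation. The function names, host name, and e-mail addresses are hypothetical.

```python
import socket
import threading

def send_line(sock, text):
    # Protocol lines are plain text terminated by carriage return + line feed.
    sock.sendall((text + "\r\n").encode("ascii"))

def serve_one_client(listener, mailbox):
    """Toy server: acknowledge SMTP-style keywords and store one message."""
    conn, _ = listener.accept()
    with conn, conn.makefile("rb") as lines:
        send_line(conn, "220 toy mail server ready")
        in_data, body = False, []
        for raw in lines:
            line = raw.decode("ascii").rstrip("\r\n")
            if in_data:
                if line == ".":  # a lone period ends the message text
                    mailbox.append("\n".join(body))
                    in_data = False
                    send_line(conn, "250 message accepted")
                else:
                    body.append(line)
            elif line.upper().startswith("DATA"):
                in_data = True
                send_line(conn, "354 end data with a single period")
            elif line.upper().startswith("QUIT"):
                send_line(conn, "221 bye")
                break
            else:  # HELO, MAIL FROM and RCPT TO are simply acknowledged
                send_line(conn, "250 ok")

def send_mail(port, sender, recipient, text):
    """Toy client: issue the keywords in order and record each server reply."""
    transcript = []
    with socket.create_connection(("127.0.0.1", port)) as sock, \
            sock.makefile("rb") as replies:
        def read_reply():
            transcript.append(replies.readline().decode("ascii").strip())
        read_reply()  # server greeting
        for command in ("HELO fieldstation.example",
                        "MAIL FROM:<%s>" % sender,
                        "RCPT TO:<%s>" % recipient,
                        "DATA"):
            send_line(sock, command)
            read_reply()
        send_line(sock, text)
        send_line(sock, ".")
        read_reply()
        send_line(sock, "QUIT")
        read_reply()
    return transcript

listener = socket.socket()
listener.bind(("127.0.0.1", 0))  # port 0: let the system pick a free port
listener.listen(1)
port = listener.getsockname()[1]
mailbox = []
server = threading.Thread(target=serve_one_client, args=(listener, mailbox))
server.start()
transcript = send_mail(port, "a@example.org", "b@example.org", "Data set attached.")
server.join()
listener.close()
print("\n".join(transcript))
```

The printed transcript shows the numbered replies a person would see when typing the same keywords by hand in a telnet session. For real mail delivery, Python's standard smtplib module would normally be used instead of raw sockets.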
Naturally, people have come to expect much more user-friendliness, and consequently modern clients hide the protocol exchange behind a façade of windows and menus, as Figure 4 shows for the Eudora® mailer client. With the widespread use of desktop workstations, clients and servers for different application areas are becoming increasingly common. Table 1 gives an overview of the most common types of application protocols. Most Internet protocols are described in Requests for Comments (RFCs, http://ds.internic.net/rfc/). A comprehensive list of Winsock clients, together with reviews and source links, can be found at http://cws.internet.com.
Figure 4. Eudora windows.
NETWORK SECURITY
Security of data and other information on wide area networks is a key concern among scientists (Brunt et al. 1996). However, in most circumstances, solutions are available to ensure security of data and information, within reasonable limits. Most of the original Internet applications were developed by engineers and academics with little need for security, and only recently have commercial applications such as bank transactions and online sales necessitated development of extremely secure network applications (including features such as encryption). With the growth of the number of Internet hosts to tens of millions, the atmosphere has changed from that resembling a small town, where few residents lock their doors, to that of a big city, where some doors may need dead-bolts and chains. It is important to keep in mind that network growth is good, and security consciousness is a small price to pay for the increased services that have come with network growth.
Table 1. The most common client/server pairs at the Internet application layer
| Application Layer Protocol | Common Name | Example clients | Example servers |
| SMTP, Simple Mail Transfer Protocol | e-mail delivery | Eudora®; MS Outlook®; Pegasus®; mail (built-in on most Unix systems) | Sendmail, built-in on most Unix systems |
| POP, Post Office Protocol | e-mail pickup box | Eudora®; MS Outlook®; Pegasus® | POP3 |
| FTP, File Transfer Protocol | Ftp | ws_ftp; built-in on Unix systems | ftpd - built-in on most Unix systems |
| NNTP | News | Free Agent® | Nntpd |
| TELNET | Telnet | Ewan; built-in on most Unix systems | Telnetd - built-in on most Unix systems |
| HTTP | WWW, Web | Netscape Navigator®; MS Internet Explorer® | Httpd from NCSA |
| | Video conferencing | CU-SeeMe® | Reflector |
| Gopher | Mostly superseded by the Web | | |
| ODBC, Open DataBase Connectivity | ODBC database access | MS Access®; Excel® | Oracle®, Ingres®, MS SQL server |
Sources for security information
The two most widely used operating systems with built-in TCP/IP capabilities are Unix and Windows NT (Windows 95® was developed with far more limited network capabilities, mostly for use in proprietary LANs). Because Unix is a much older and more mature operating system, several organizations (e.g., Computer Emergency Response Team, http://www.cert.org, and the Internet Society, http://www.isoc.org) have considerable experience with Unix security issues, and a wealth of literature is available on Unix system security.
Recently, numerous books dealing with NT security issues have become available. An easy way to find the latest books is to do an online search at one of the electronic bookstores. For example, a search at http://www.amazon.com for "windows and security" retrieved 9 items, including Rutstein (1997) and Dalton et al. (1997). Similarly, a search for "unix and security" returned 7 items including Garfinkel and Spafford (1996). In addition, several newsgroups have been established to discuss Unix-related security issues (comp.security.unix), NT (comp.os.ms-windows.nt.admin.security), and miscellaneous other security issues (all in comp.security).
Simple precautions
Simple measures can help prevent most security problems:
Gated Communities - firewalls and intranets
With the exponential expansion of the Internet, some organizations with strict security needs have partially separated themselves from the Internet. They have done this by using a TCP/IP-based LAN internally, called an Intranet, which is connected to the Internet through a separate machine running "firewall" software. Firewalls can effectively protect your institutional network from the outside world and still allow your users access to the Internet. Firewalls obscure the internals of your Intranet from the outside world by refusing to provide name or address information about internal machines, by replacing internal users' login names with aliases (for email), by allowing FTP and other services only to/from the firewall and by allowing telnet or remote log-ins only to/from the firewall.
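The rule that services are allowed only to or from the firewall can be sketched as a simple address check in Python. This is a deliberately simplified model under stated assumptions: the intranet range and firewall address are hypothetical, the firewall is treated as a single host inside that range, and real firewall software also filters by service, port, and connection state.

```python
import ipaddress

INTRANET = ipaddress.ip_network("10.1.0.0/16")  # hypothetical internal LAN
FIREWALL = ipaddress.ip_address("10.1.0.1")     # hypothetical firewall host

def is_allowed(src, dst):
    """Allow internal traffic, and boundary crossings only via the firewall."""
    src = ipaddress.ip_address(src)
    dst = ipaddress.ip_address(dst)
    src_inside = src in INTRANET
    dst_inside = dst in INTRANET
    if src_inside and dst_inside:
        return True  # purely internal traffic
    if src_inside != dst_inside:
        # Traffic crossing the boundary must start or end at the firewall.
        return FIREWALL in (src, dst)
    return False  # neither end belongs to this institution

print(is_allowed("10.1.2.3", "10.1.4.5"))    # internal host to internal host
print(is_allowed("192.0.2.7", "10.1.0.1"))   # outside host to the firewall
print(is_allowed("192.0.2.7", "10.1.2.3"))   # outside host directly to an internal host
```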
LITERATURE CITED
Brunt, J., J. Porter, and R. Nottrott. 1990. Internet connectivity in the Long-Term Ecological Research Network (LTER): assessment and recommendations. LTER Network Office, University of Washington, Seattle, WA.
Dalton, W., S. Fuller, B. Kolosky, J. Millecan, Nachenberg, and C. Goggans. 1997. Windows NT server 4: security, troubleshooting, and optimization. New Riders Publishing.
Frey, D., and R. Adams. 1989. !%@:: a directory of electronic mail. O'Reilly & Associates, Cambridge, MA.
Garfinkel, S., and G. Spafford. 1996. Practical UNIX & internet security, 2nd edition. O'Reilly & Associates, Cambridge, MA.
Hilgemeier, M. 1997. Internet growth - host count turning point in June 1997. http://www.is-bremen.de/~mhi/inetgrow.htm
Hunt, C. 1992. TCP/IP network administration. O'Reilly & Associates, Cambridge, MA.
Leiner, B., V.G. Cerf, D. Clark, R. Kahn, L. Kleinrock, D. Lynch, J. Postel, S. Roberts, and S. Wolff. 1997. Brief history of the Internet. Internet Society. http://www.isoc.org/internet/history/brief.html
Porter, J., R. Nottrott, and D. Richardson. 1996. Ecological databases: new tools and technologies. In Long-Term Ecological Research. Ecological Society of America annual meeting, Providence, RI.
Porter, J., K. Baker, and R. Nottrott. 1996. Tools for managing ecological data. Eco-Informa '96 Conference. Lake Buena Vista, FL.
RFC 1244. 1991. Request for comment. Site security handbook. http://ds.internic.net/rfc/rfc1244.txt
Rutkowski, A.M. 1997. Internet Trends. General Magic, Inc., Sunnyvale, CA. http://www.genmagic.com/Internet/Trends/; and http://www.genmagic.com/Internet/Trends/slide-4html
Rutstein, C.B. 1997. Windows NT security: a practical guide to securing Windows NT servers and workstations. Computing McGraw-Hill, New York, NY.