1


Proxy Server Overview

The Internet has grown from being a relatively small network used by a few egghead types like myself into a global system of information distribution linking all types of users. Businesses, universities, government agencies, and private individuals all use the Internet today in growing numbers. The Internet has become as important to some users as their own private LANs (Local Area Networks). As the Internet grows in its number of users and its scope of use, more people are looking for simple, cost-effective ways of linking their own LANs to the Internet.

Installing a full-fledged connection to the Internet can cost quite a bit. Even though hardware costs for Internet routers and CSU/DSU (Channel Service Unit/Data Service Unit) devices are coming down, proper Internet connections are still beyond the reach of most small businesses and private users. This being the case, people have been searching for solutions that will provide LAN-wide access to the Internet through a smaller connection, which does not require all the fancy hardware that installing a large connection, such as a T1 line, requires. Modem speeds are increasing at a rate of about one jump every eight months, and ISDN (Integrated Services Digital Network) may take hold as a standard for data communication in the near future. These two forms of data communication are far cheaper than installing a T1 line and are used by millions of people.

The nature of a TCP/IP (Transmission Control Protocol/Internet Protocol) network also makes it impossible for a private, non-sanctioned LAN to have a legitimate connection to the Internet with typical dialup-type connections. The addressing scheme used requires all IP addresses to be unique. Many private networks using the TCP/IP protocol are set up using IP (Internet Protocol) addresses that may already be in use by other Internet sites. Connecting a private network with repeated IP addresses will cause serious problems. There is also the issue of routing. Unless the entire Internet is aware of a block of addresses (known as a subnet), data will not be routed correctly to a site. That's where the InterNic comes in. The InterNic governs the Internet and issues addresses to sites wanting a legitimate presence on the Internet. Once a site has a valid set of addresses, the core routers are informed of the new addresses, and data will flow correctly to and from the new site.

The process of getting a valid subnet and a connection can take many weeks, cost several thousand dollars in hardware, and cost between $1000 and $2000 a month in access charges. In contrast, a single-user dialup connection to an ISP (Internet Service Provider) through a modem takes no time to obtain, averages about $20 a month, and requires little more than a $120 28.8 Kbps modem. ISDN access to a provider (if it's available) can run in the neighborhood of $20 to $70 a month plus metered charges and about $300 in hardware. If a valid connection to the Internet is beyond your price range right now, you're in luck. Microsoft has developed Microsoft Proxy Server, an NT server application that makes it possible for an entire LAN with non-sanctioned IP addresses to have access to the Internet through a typical dialup link to an ISP. Microsoft Proxy Server will work with any valid link to the Internet: Analog Dial Up, ISDN, T1 and beyond.

In some situations, using Microsoft Proxy Server will be a better choice than actually giving workstations full access to the Internet with valid IP addresses. Microsoft Proxy Server is easier to set up and has security features that make it easy to control the type of Internet access client workstations have. Controlling access to the Internet on a LAN with a legitimate connection is tough to do because each workstation on the LAN can have its own valid presence on the Internet. When workstations connect through a Microsoft Proxy server, they rely on its presence on the Internet for their connection.

Microsoft Proxy Server Web Proxy Is an IIS Web Server Sub-Service

In order for Microsoft Proxy Server to be installed, the Microsoft IIS Server must first be installed. The only element of IIS that must be enabled is the Web server element. The Microsoft Proxy Server Web proxy runs as an aspect of the IIS Web server and operates by using the IIS Web server to listen to TCP port 80 traffic and determine if it is to be remoted to the outside or whether it is to remain locally serviced by the IIS Web server (or another Web server operating on the local LAN). HTTP, FTP, and Gopher requests can all be handled through port 80. Microsoft Proxy Server is actually two services in one. The first part, the Web proxy, handles CERN-compliant proxy requests through port 80 on the TCP/IP protocol. CERN, Conseil Européen pour la Recherche Nucléaire (European Laboratory for Particle Physics), developed many UNIX-based Internet communications standards. Among these standards was a proxy service protocol that allowed for remoting Internet requests through a dedicated connection point.

The CERN-compatible proxy server element of Microsoft Proxy Server supports communications through port 80 of HTTP, FTP, and Gopher requests. These services all obtain data through the Internet using similar means. In order for CERN-compatible proxying to function correctly, clients must be able to interact with a proxy server. This approach is OK for compliant applications, such as IE (Internet Explorer) and Netscape, but there are many other Internet applications that do not have built-in proxying capabilities.

Many Internet applications simply communicate with the Windows Socket interface (WinSock), which in turn communicates with the rest of the network, internal or external. For these applications, another proxy method must be used to give them access to the outside Internet. The second element of Microsoft Proxy Server is known as the WinSock proxy server. For those applications that do not support CERN-compliant proxying, the WinSock Proxy server can intercept their WinSock calls and remote them to Microsoft Proxy Server for external processing. This necessitates the installation of special WinSock Proxy client software, which can intercept a client's WinSock call and redirect it correctly. The difference between the Web proxy and the WinSock Proxy is covered later in this chapter.

What is a Proxy Server?

Catapult was the code name Microsoft gave to its Proxy Server. Like everything Microsoft produces, it was renamed Microsoft Proxy Server upon release. Proxy servers have been around in the UNIX world for a while now, but Microsoft Proxy Server is Microsoft's first attempt at creating a proxy server for the NT environment.

The actual definition of a proxy server is a server that performs an action for another computer which cannot perform the action for itself. A real world analogy for a proxy can be seen at high-priced art auctions. Many bidders at art auctions will not attend the auction themselves, for whatever reason. Some of the actual bidders at the auction are the proxies for the real buyers. The proxy acts for the buyer and relays the status of the proceedings to the buyer over the telephone. If you watch CNN coverage of important art auctions, you'll see proxies all over the bidding hall talking to their buyers over land line or cellular phones. The proxy acts only when instructed to do so, and anything that the buyer could do in person, the buyer can do through his proxy.

In the world of computers and the Internet, workstations behind the proxy do not have valid Internet connections and therefore cannot talk to the Internet on their own. The proxy sits at the juncture of the Internet connection and the local LAN connection (typically an NT machine with two network interface cards (NICs) or with one network card and a RAS connection to the Internet) and routes local LAN requests to the Internet as though the Microsoft Proxy Server itself were requesting the information. On LANs that do not have valid subnets issued from the InterNic, workstations cannot route data through an NT machine that does have a connection to the Internet.

The following is a small diagram of my own scenario:

Figure 1.1. The author's network.

I am working on the machine named PENTIUM. The IP addresses given to all of my workstations and server are on the 220.200.200 subnet. This selection of a subnet was a fairly random process. I selected it because it fell within the Class C subnet range and was easy to remember. Most network administrators implementing the TCP/IP protocol on a non-Internet connection LAN will select a subnet in a similar manner. When a LAN is not connected to the Internet, network administrators have few restrictions when selecting the addressing scheme to be used. If a LAN is to be legitimately connected to the Internet, the LAN must be configured with IP addresses issued by the InterNic.

There are sets of private IP subnets that private LAN administrators can use for their own internal TCP/IP networks. These private subnets were set aside by the InterNic and will never be used openly on the Internet. These addresses should be the ones given to private TCP/IP networks that will not be directly participating on the Internet. I actually chose my addresses in the 220.200.200 subnet poorly, without doing enough research. When a private LAN uses addresses for the Internet that are already taken, the Microsoft Proxy server will keep all traffic directed to those addresses local, which will cut off a large chunk of the Internet from your LAN users. The Microsoft Proxy server determines what is local and what is external. When the internal addresses overlap external addresses, the Microsoft Proxy server will keep traffic for those addresses as local-only traffic. Chapter 4, "Planning Your Installation and Configuration," will cover the reserved IP addresses in more detail.
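As a rough illustration of the point above, the following Python sketch checks whether an address falls inside one of the subnets set aside for private use. The three ranges are the ones reserved by RFC 1918; they are not listed in this chapter, and the function name is made up for the example.

```python
import ipaddress

# Private ranges reserved by RFC 1918. This chapter does not list them;
# Chapter 4 covers the reserved addresses in more detail.
PRIVATE_RANGES = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def is_reserved_private(address):
    """Return True if the address lies in one of the reserved private ranges."""
    ip = ipaddress.ip_address(address)
    return any(ip in network for network in PRIVATE_RANGES)


# The author's 220.200.200 subnet is not one of the reserved ranges.
print(is_reserved_private("220.200.200.1"))
print(is_reserved_private("192.168.0.10"))
```

A check like this, run before addresses are handed out, would have flagged the 220.200.200 subnet as a poor choice for a private LAN.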

The diagram in Figure 1.1 shows my simple 10BaseT network of three Win95 workstations and one NT server. The server is connected to the LAN via a standard network card and to the Internet through a 28.8 Kbps modem connected via RAS. When Microsoft Proxy Server is installed on the NT server and properly configured, all workstations may connect to the Internet through Microsoft Proxy Server as though they themselves were connected. Of course, each workstation must be specially configured to connect to a proxy server rather than the actual Internet. Setting up a proxy requires a little more effort than just installing and setting up the software on the server. Fear not, however: client-side configuration is very simple because all Microsoft Internet applications rely on a centralized configuration point in the control panel for proxy settings. Most non-Microsoft Internet applications can be easily configured individually to communicate through a proxy. Because the use of proxy servers is becoming common, most Internet application programmers are incorporating the ability for their software to communicate with the Internet through a proxy server. In the future, a majority of Internet applications will have the ability to use a proxy server for their connection. Even if an Internet application does not have the ability to communicate through a proxy, special client-side software can be installed that will allow nearly all Internet applications to talk properly with Microsoft Proxy Server.

How Microsoft Proxy Server Serves a Network

The primary use of the Internet these days is World Wide Web (WWW) access. Whether your network uses Microsoft Internet Explorer or Netscape Navigator, LAN users will be able to "surf the net" through Microsoft Proxy Server as though they were connected directly to the Internet via Dial Up Networking. If you have ever tried other proxy servers, such as WinGate or NetProxy, you'll find that Microsoft Proxy Server's performance exceeds their performance by quite a bit. Having tried both WinGate and NetProxy, I can attest that Microsoft Proxy Server's performance is nearly identical to actually being connected with a local workstation connection to the Internet via Dial Up Networking. Both WinGate and NetProxy, while fine applications in their own right, did not compare to Microsoft Proxy Server's performance with any client, such as IE 3.0, Netscape 3.0, Eudora, WinVN, WS_FTP32, etc.

Microsoft Proxy Server supports nearly any type of Internet client, from WWW client to FTP client to Newsgroup client. Win95 workstations work best through Microsoft Proxy Server, but almost any TCP/IP client (such as OS/2 and Windows for Workgroups) can access the Internet through an NT machine running Microsoft Proxy Server. There are some limitations to 16-bit clients using Microsoft Proxy Server, but this topic will be covered in greater detail in Chapter 11, "Proxy Server and Client Applications."

Users of a LAN can access the Internet through a personal dialup account connected by RAS on the NT machine running Microsoft Proxy Server. This means that users on a LAN will not have personal e-mail accounts on the ISP providing the connection. Most ISPs give only one e-mail account for each dial-in account. Most companies in this situation will opt to use the provided e-mail account as a company-wide account. Privacy is one of the concessions necessary when using Microsoft Proxy Server through a dial-in account.

Microsoft Proxy Server's purpose is to provide LAN access to the Internet, not vice versa. This means that outside Internet users will not be able to access LAN workstations through Microsoft Proxy Server. Microsoft Proxy Server only listens to internal network requests for outside information.

Microsoft Proxy Server can also provide a very high level of security for controlling which LAN users have access to the Internet connection and exactly what those users can access. Because Microsoft Proxy Server is an integrated NT application, it can draw on NT's internal security systems for LAN user authorization. All Internet protocols can be separately configured for different security levels. Different users or groups of users can be authorized for each type of Internet connection (such as HTTP, FTP, and NNTP). Microsoft Proxy Server also has the ability to limit the sites where LAN users can connect. If you need a secure Internet access point, Microsoft Proxy Server is your best choice.

Web Proxy vs. WinSock Proxy

As previously mentioned, Microsoft Proxy Server consists of two separate servers. The first is the Web Proxy server and the second is the WinSock Proxy server. Both are installed when Microsoft Proxy Server is installed, and each runs as a separate service under NT. Each service can be controlled and configured through the Internet Service Manager found in either the Microsoft Proxy Server folder or the Microsoft Internet Server folder in the Programs group on the Start menu.

The Basics of TCP/IP

All communication that takes place over the TCP/IP protocol is done through ports. The IP address combined with a port number is known as a socket. Servers that communicate via TCP/IP do so through predefined ports, depending on which type of server they are. For example, the WWW server communicates with the HTTP protocol. This protocol uses port 80. FTP servers listen to port 21 for their traffic. Telnet communications take place over port 23. The NNTP protocol (Newsgroup communications) uses port 119, while SMTP (Mail) uses port 25. These ports are virtual channels of communication used between servers and clients using the same protocol.
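To make the idea of a socket concrete, here is a minimal Python sketch that pairs an IP address with one of the ports named above. The table and function names are illustrative only, and the address is simply one from my own subnet.

```python
# Well-known ports named in the paragraph above.
WELL_KNOWN_PORTS = {
    "HTTP": 80,
    "FTP": 21,
    "Telnet": 23,
    "NNTP": 119,
    "SMTP": 25,
}


def socket_address(ip_address, protocol):
    """Pair an IP address with a protocol's port to form a socket address."""
    return (ip_address, WELL_KNOWN_PORTS[protocol])


# A Web server at this address would be reached at ('220.200.200.1', 80).
print(socket_address("220.200.200.1", "HTTP"))
```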

The most confusing part of this discussion is the term protocol. Protocol is used at every level of networking to describe the procedure two applications (server/client) or devices (network interfaces) use to talk to one another. Network protocols are used by two network cards to communicate, while server and client applications communicate using software protocols. Take the following arrangement of protocols, for example:

It gets very confusing at times when the word protocol is used to describe so many different levels of networking.

TCP/IP packets have a header. This header has information such as destination IP address and destination port. Two types of packets can pass over the TCP/IP protocol: TCP and UDP (User Datagram Protocol). The primary difference between TCP and UDP packets is that TCP packets contain header information in them that indicates the sequence of the packets and UDP packets do not. Most Internet protocols use TCP packets to communicate. Because sequencing (making sure the order of received packets is the same on the receiving end as it was on the sending end) requires slightly more overhead, UDP communication is used by servers requiring the highest level of efficiency, but at the slight cost of data integrity. Servers such as Real Audio and VDO Live use UDP communications because their data is transmitted in real time and requires the fastest possible communication.

TCP packet transmissions are what is known as stream-oriented while UDP packet transmissions are known as datagram-oriented. Both use the TCP/IP protocol and are nearly identical, except for the differences mentioned earlier.
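The effect of sequencing can be shown with a toy model in Python. This is only a sketch of the idea, not how a real TCP/IP stack is written: the (sequence number, data) pairs and both function names are invented for the example.

```python
def reassemble_stream(segments):
    """Stream-oriented (TCP-like): put segments back in sequence order."""
    ordered = sorted(segments, key=lambda segment: segment[0])
    return b"".join(data for _, data in ordered)


def deliver_datagrams(datagrams):
    """Datagram-oriented (UDP-like): hand data over in arrival order."""
    return list(datagrams)


# Three pieces of data that arrived out of order.
arrived = [(2, b"lo "), (1, b"hel"), (3, b"world")]

print(reassemble_stream(arrived))
print(deliver_datagrams([data for _, data in arrived]))
```

The stream version yields the bytes in the order they were sent, while the datagram version passes them along exactly as they arrived, which is the trade of integrity for efficiency described above.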

The Web Proxy Server

The Web Proxy server element of Microsoft Proxy Server is a CERN-compliant proxy server that operates by listening to TCP/IP port 80 for traffic. It must therefore use the IIS Web server as its listening mechanism because port 80 is the WWW port. The Microsoft Proxy Server Web Proxy server is incorporated into the IIS Web server once it is installed.

The Microsoft Proxy Server proxy consists of two parts: the filter and the application. Clients properly set to talk to a proxy server send their requests out on port 80 in a different request format than they would if they were communicating with the actual destination server on their own. This alteration of request format is done so the proxy server will know exactly what kind of request it is being sent: HTTP, FTP, or Gopher. When browsers are not set to talk to a proxy, they make the correct type of request to the destination server depending on the format of the user-entered request.

For example, when a non-proxy enabled browser gets a user request for:

http://www.pandy.com/index.htm

the browser knows that the protocol to be used is HTTP from the format of the command. The browser then sends the following request to the WWW server pandy.com:

GET ./index.htm

GET is a standard command for obtaining Internet objects such as files, HTML documents, and embedded objects. This command is implemented in different ways, depending on which protocol is using GET.

If a browser that is not configured for proxy use sends GET ./index.htm to a proxy server, the proxy server would not know which protocol to use nor the location of the object to GET. Therefore, when browsers are configured to send proxy formatted commands to a proxy server, the request looks like this:

GET http://www.pandy.com/index.htm

The browser assumes that the proxy server will correctly parse the command and issue it to the correct site in the correct format. Keep in mind that FTP clients configured for CERN-compliant proxy interfacing use the same format and port (80) as Web clients, even though FTP clients use port 21 when talking to servers on their own.

Once the proxy server receives the properly formatted request, it sends the request to the actual destination in the same format the browser would have, had the browser not been configured for proxy interface. If the request for data was an FTP request, the proxy server would have initiated communications to the destination FTP server on port 21. Once the proxy server receives a reply from the destination, it sends the data on to the requesting workstation on the LAN, and the cycle is complete.
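The translation just described can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Microsoft Proxy Server's actual code: the function name is invented, ftp.pandy.com is a hypothetical host, and the request lines follow the simplified form used in this chapter (a real HTTP request line also carries a version string, and the direct form names the object by its server path, such as /index.htm).

```python
from urllib.parse import urlsplit

# Destination ports for the protocols mentioned in the text.
DEFAULT_PORTS = {"http": 80, "ftp": 21}


def plan_forwarding(proxy_request):
    """Split a proxy-format request into protocol, host, port, and path."""
    command, url = proxy_request.split(" ", 1)
    parts = urlsplit(url.strip())
    port = parts.port or DEFAULT_PORTS[parts.scheme]
    return parts.scheme, parts.hostname, port, parts.path or "/"


# An HTTP request is reissued to the Web server on port 80.
print(plan_forwarding("GET http://www.pandy.com/index.htm"))

# An FTP request arrives on port 80 too, but goes out to port 21.
print(plan_forwarding("GET ftp://ftp.pandy.com/pub/readme.txt"))
```

The sketch stops at choosing the destination. Turning the path into the right commands for each protocol is the part that differs between HTTP, FTP, and Gopher.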

The job of the proxy filter is to determine whether a received HTTP request is in proxy format or in local request format. If the filter determines that the HTTP request is in standard format (GET ./index.htm), it assumes the request is to be handled by the local WWW server. If the HTTP request is in proxy format, the filter passes the request on to the proxy application, where it will be reformatted and reissued out to the correct destination.
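The filter's decision can be approximated with a very small test. The Python sketch below assumes that a proxy-format request is recognizable by the protocol prefix (such as http://) in front of the object name; the function names are invented and the real filter may apply other checks.

```python
def is_proxy_format(request_line):
    """Return True if the request names a full URL rather than a local object."""
    fields = request_line.split()
    if len(fields) < 2:
        return False
    return "://" in fields[1]


def dispatch(request_line):
    """Say which component should handle the request."""
    if is_proxy_format(request_line):
        return "proxy application"
    return "local WWW server"


print(dispatch("GET http://www.pandy.com/index.htm"))
print(dispatch("GET ./index.htm"))
```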

The proxy application performs many operations on the request before it actually sends it out. The proxy application is responsible for looking in the local proxy cache to see if the data requested is already present. If the data is present, its Time To Live (TTL) has not expired, and the object has not been changed on the destination server, the proxy application will pull out the object from the cache and send it to the requester without having to go out to the Internet to get it. This process of holding a certain amount of information in a local cache can greatly increase Microsoft Proxy Server's perceived performance, especially when a smaller connection to the Internet is being used by many people. Caching is discussed in greater detail in Chapter 12, "Controlling the Proxy Server Cache."
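A cache lookup of this kind can be sketched as follows. This is a toy model in Python, not the product's cache: the class and method names are invented, the TTL is given in seconds, and the check for whether the object has changed on the destination server is left out.

```python
import time


class ProxyCache:
    """Toy cache keyed by URL; each entry expires after its TTL."""

    def __init__(self):
        self._entries = {}  # url -> (body, expiry time)

    def store(self, url, body, ttl_seconds, now=None):
        """Keep a copy of an object for ttl_seconds."""
        now = time.time() if now is None else now
        self._entries[url] = (body, now + ttl_seconds)

    def lookup(self, url, now=None):
        """Return the cached body, or None if it is absent or expired."""
        now = time.time() if now is None else now
        entry = self._entries.get(url)
        if entry is None:
            return None
        body, expires_at = entry
        if now >= expires_at:
            del self._entries[url]  # expired: the caller must refetch
            return None
        return body


cache = ProxyCache()
url = "http://www.pandy.com/index.htm"
cache.store(url, b"<html>...</html>", ttl_seconds=60, now=0)

print(cache.lookup(url, now=30))  # within the TTL: served from the cache
print(cache.lookup(url, now=90))  # past the TTL: None, so go to the Internet
```

The optional now argument exists only so the example can be run with fixed times; in normal use the current clock time is taken.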

The proxy application is also responsible for authenticating the requester and ensuring that he has authorization to both use the protocol and obtain information from the requested destination. Client authentication takes place before any other action.

Once the requested information has been obtained, either from the local cache or from the Internet, the proxy server will send the information to the requester via the HTTP protocol on port 80.

The WinSock Proxy Server

The WinSock Proxy server works quite differently than the Web Proxy server. In order for the WinSock Proxy server to function correctly, special WinSock Proxy client software must be installed on each workstation needing WinSock Proxy support. When Microsoft Proxy Server is installed, a special network share is created that contains the WinSock Proxy client installation files. Workstations can link to this shared resource and run the SETUP file found there to correctly install the WinSock Proxy client software.

When Internet clients such as Eudora (a popular e-mail client) request data from a TCP/IP network, they make WinSock calls to the local WinSock DLLs (Dynamic Link Libraries). The WinSock DLLs then process the request via the TCP/IP protocol. In order for applications such as Eudora (which has no special proxy configurability) to work correctly in a proxy environment, the WinSock layer on a workstation must be able to forward or remote the request to the WinSock Proxy server that performs the action on behalf of the requester.

For this to happen, the client WinSock DLLs must be renamed and new WinSock Proxy DLLs must be put in their place. In a 16-bit environment, the WinSock DLL is WINSOCK.DLL. In a 32-bit environment, such as Win95, the WinSock DLL is WSOCK32.DLL. These original DLLs are not overwritten by the new DLLs, but they are renamed to something different. Once the WinSock Proxy client software is in place, all Internet clients on workstations will function as they always did for local LAN traffic.

During installation, Microsoft Proxy Server will examine your private LAN and determine which set of local addresses is used by your network. These addresses are contained in the LAT (Local Address Table). This information is kept in the file \MSP\CLIENTS\MSPLAT.TXT. This is a very important file and will be discussed momentarily.

Once the WinSock Proxy client software is installed, any Internet application requesting data does so to the new WinSock Proxy DLLs. Once the WinSock Proxy DLLs get a request, they open a control channel to the WinSock Proxy server on port 1745 and download a copy of the LAT. The address of the requested destination is compared against the contents of the LAT. If the address is found to be local, the WinSock Proxy DLLs simply relay the request to the legitimate WinSock DLLs on the workstation and the request is processed as any other local request. If the request is found to be an external request, the WinSock Proxy DLLs remote the request to the WinSock Proxy server for processing.

The WinSock Proxy server establishes a link on the client's original request port (for example, port 23 for Telnet clients) to both the client and the Internet destination. The WinSock Proxy client DLLs from that point on simply forward all data to the original WinSock DLLs for standard processing. The original DLLs simply communicate on a given port at this time and not to a specific IP address. Because the WinSock Proxy server has been initialized by the WinSock Proxy client, the WinSock Proxy server will be responding on that port for the client. The client will see the WinSock Proxy server as the final destination and the WinSock Proxy server will be talking to the originally requested destination and relaying information to the client.

Several clients, such as FTP clients, send out their local IP address in a handshaking packet so that the Internet server they are attempting to connect with will be able to open a connection back to the requester. The WinSock Proxy server is responsible for removing the local IP address used by the client and replacing it with the Internet IP the WinSock Proxy server has. This means the contacted server will open a backward connection to the WinSock Proxy server rather than attempting to open a connection with the originally indicated IP address. Keep in mind that the local IP addresses used on a private network are not valid, and the WinSock Proxy server must use the IP issued to it from the ISP giving the Internet connection.
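The LAT comparison described above comes down to a range check. The Python sketch below assumes the LAT is held as pairs of first and last addresses; that layout, the sample range (my own subnet), the outside address, and the function names are all assumptions made for illustration and are not taken from MSPLAT.TXT itself.

```python
import ipaddress

# Hypothetical LAT contents: inclusive (first, last) address ranges.
LOCAL_ADDRESS_TABLE = [
    ("220.200.200.0", "220.200.200.255"),
]


def is_local(destination, lat=LOCAL_ADDRESS_TABLE):
    """Return True if the destination falls inside any range in the LAT."""
    ip = ipaddress.ip_address(destination)
    return any(
        ipaddress.ip_address(first) <= ip <= ipaddress.ip_address(last)
        for first, last in lat
    )


def choose_handler(destination):
    """Decide where the WinSock Proxy DLLs should send a request."""
    if is_local(destination):
        return "original WinSock DLLs"
    return "WinSock Proxy server"


print(choose_handler("220.200.200.7"))  # on the LAN: handled locally
print(choose_handler("198.51.100.20"))  # not in the LAT: remoted
```

The same check shows why a badly chosen private subnet is a problem: any Internet site whose address falls inside a LAT range is treated as local and never reaches the proxy.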

Once the Internet server correctly establishes a return connection to the WinSock Proxy server, the WinSock Proxy server in turn establishes a return connection to its own client on the same port. The client lets the WinSock Proxy server know that it is expecting a return connection by listening to the proxy server's IP address for a return connection. Remember that when the client makes a connection, it actually makes it with the IP address of the WinSock Proxy server. The WinSock Proxy DLLs determine if the original request was local or remote. If local, the request was already passed to the real WinSock DLLs. If the request was remote, the WinSock Proxy DLLs would instruct the client to talk to the WinSock Proxy server IP as though it were the final destination. If the client is listening for a return connection to come from the IP of the WinSock Proxy server, the WinSock Proxy server knows to expect a return connection from the destination Internet site. Figure 1.2 shows this sequence.

Figure 1.2. The connection process between client and WinSock Proxy Server.

IP-less Listening

There are some rare situations where a client application might issue out a TCP/IP listen command to the WinSock DLL without a destination IP address. When these situations occur, and the WinSock Proxy DLL cannot determine if the listen request is to be kept local or remoted to the Internet, the request will remain local. This is done for security reasons.

Web Proxy and WinSock Proxy Working Together

The combination of the Web Proxy server and the WinSock Proxy server allows Microsoft Proxy Server to service nearly any WinSock 1.1 compliant client application. Those clients that have the internal ability to talk directly to a CERN-compatible proxy have that ability, and those clients that do not can still get Internet access via the WinSock Proxy server.

The WinSock Proxy server can handle the traffic of Web browsers, FTP clients, and Gopher clients on its own, if those clients do not have the ability or are not set to use a proxy server. There are no drawbacks to using WinSock Proxy with these types of clients. The primary benefit the Web Proxy server has over the WinSock Proxy server is that no special client software must be installed in order for it to work. This means that non-Windows operating systems can take advantage of a Windows NT Microsoft Proxy Server Web Proxy server via the TCP/IP protocol.

Using IPX/SPX

It is possible to use the IPX/SPX protocol as the only protocol installed on workstations and still be able to access the Internet with clients via the WinSock Proxy server. The WinSock Proxy client software has the ability to use IPX to transport Internet requests and responses to and from clients to the WinSock Proxy server. This ability allows a network administrator to have full control over Internet traffic.

When IPX is used as the only network protocol on a LAN, any request made by a TCP/IP client will be considered a remote request. The replacement WinSock Proxy DLLs will remote the request to the WinSock Proxy server which will use IPX to respond back to the client.

Obviously, the TCP/IP protocol must be installed on the NT server that is connected to the Internet in order for a legitimate TCP/IP connection to be established between the NT machine and the ISP.

The Microsoft Proxy Server will perform a protocol translation between TCP/IP and IPX. The IPX protocol is similar in nature to the TCP/IP protocol in that each workstation on the LAN has a unique node ID and IPX packets can be routed to specific destinations.

Summary

This introduction to Microsoft Proxy Server gives you a clear idea of Microsoft Proxy Server's purpose and abilities. As Microsoft gets deeper and deeper into the Internet market, you are going to see more and more applications like Microsoft Proxy Server coming around. Microsoft Proxy Server can be used as a cheap, efficient alternative to real connections to the Internet. Because Microsoft Proxy Server can provide all the services an actual connection to the Internet could, I have a feeling more people will be looking to Microsoft Proxy Server as the best solution.