Wednesday, February 4, 2009

TCP/IP....

Introduction to TCP/IP

(Pronounced as separate letters.) Short for Transmission Control Protocol/Internet Protocol, the suite of communications protocols used to connect hosts on the Internet. TCP/IP uses several protocols, the two main ones being TCP and IP. TCP/IP is built into the UNIX operating system and is used by the Internet, making it the de facto standard for transmitting data over networks. Even network operating systems that have their own protocols, such as NetWare, also support TCP/IP.

Summary:

TCP and IP were developed by a Department of Defense (DOD) research project to connect a number of different networks designed by different vendors into a network of networks (the "Internet"). It was initially successful because it delivered a few basic services that everyone needs (file transfer, electronic mail, remote logon) across a very large number of client and server systems. Several computers in a small department can use TCP/IP (along with other protocols) on a single LAN. The IP component provides routing from the department to the enterprise network, then to regional networks, and finally to the global Internet. On the battlefield a communications network will sustain damage, so the DOD designed TCP/IP to be robust and automatically recover from any node or phone line failure. This design allows the construction of very large networks with less central management. However, because of the automatic recovery, network problems can go undiagnosed and uncorrected for long periods of time.

As with all other communications protocols, TCP/IP is composed of layers:

  • IP - is responsible for moving packets of data from node to node. IP forwards each packet based on a four-byte destination address (the IP number). The Internet authorities assign ranges of numbers to different organizations. The organizations assign groups of their numbers to departments. IP operates on gateway machines that move data from department to organization to region and then around the world.
  • TCP - is responsible for verifying the correct delivery of data from client to server. Data can be lost in the intermediate network. TCP adds support to detect errors or lost data and to trigger retransmission until the data is correctly and completely received.
  • Sockets - is a name given to the package of subroutines that provide access to TCP/IP on most systems.

Network of Lowest Bidders

The Army puts out a bid on a computer and DEC wins the bid. The Air Force puts out a bid and IBM wins. The Navy bid is won by Unisys. Then the President decides to invade Grenada and the armed forces discover that their computers cannot talk to each other. The DOD must build a "network" out of systems each of which, by law, was delivered by the lowest bidder on a single contract.


The Internet Protocol was developed to create a Network of Networks (the "Internet"). Individual machines are first connected to a LAN (Ethernet or Token Ring). TCP/IP shares the LAN with other uses (a Novell file server, Windows for Workgroups peer systems). One device provides the TCP/IP connection between the LAN and the rest of the world.

To ensure that all types of systems from all vendors can communicate, TCP/IP is absolutely standardized on the LAN. However, larger networks based on long distances and phone lines are more volatile. In the US, many large corporations would wish to reuse large internal networks based on IBM's SNA. In Europe, the national phone companies traditionally standardize on X.25. However, the sudden explosion of high speed microprocessors, fiber optics, and digital phone systems has created a burst of new options: ISDN, frame relay, FDDI, Asynchronous Transfer Mode (ATM). New technologies arise and become obsolete within a few years. With cable TV and phone companies competing to build the National Information Superhighway, no single standard can govern citywide, nationwide, or worldwide communications.

The original design of TCP/IP as a Network of Networks fits nicely within the current technological uncertainty. TCP/IP data can be sent across a LAN, or it can be carried within an internal corporate SNA network, or it can piggyback on the cable TV service. Furthermore, machines connected to any of these networks can communicate to any other network through gateways supplied by the network vendor.

Addresses

Each technology has its own convention for transmitting messages between two machines within the same network. On a LAN, messages are sent between machines by supplying the six byte unique identifier (the "MAC" address). In an SNA network, every machine has Logical Units with their own network address. DECNET, Appletalk, and Novell IPX all have a scheme for assigning numbers to each local network and to each workstation attached to the network.

On top of these local or vendor specific network addresses, TCP/IP assigns a unique number to every workstation in the world. This "IP number" is a four byte value that, by convention, is expressed by converting each byte into a decimal number (0 to 255) and separating the bytes with a period. For example, the PC Lube and Tune server is 130.132.59.234.
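The dotted notation described above is only a way of writing the four bytes. As a minimal sketch (the function names are made up for illustration), the conversion in both directions could look like this in Python:

```python
def to_dotted_decimal(value: int) -> str:
    """Format a 32-bit IP number as four decimal bytes separated by periods."""
    if not 0 <= value <= 0xFFFFFFFF:
        raise ValueError("an IP number must fit in four bytes")
    octets = value.to_bytes(4, "big")  # most significant byte first
    return ".".join(str(b) for b in octets)


def from_dotted_decimal(text: str) -> int:
    """Parse dotted decimal notation back into the 32-bit IP number."""
    parts = [int(p) for p in text.split(".")]
    if len(parts) != 4 or any(not 0 <= p <= 255 for p in parts):
        raise ValueError("expected four decimal bytes in the range 0 to 255")
    return int.from_bytes(bytes(parts), "big")


number = from_dotted_decimal("130.132.59.234")
print(hex(number))                # 0x82843bea
print(to_dotted_decimal(number))  # 130.132.59.234
```

Each pair of hexadecimal digits in the printed number corresponds to one decimal field of the address (0x82 is 130, 0x84 is 132, and so on).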

An organization begins by sending electronic mail to Hostmaster@INTERNIC.NET requesting assignment of a network number. It is still possible for almost anyone to get assignment of a number for a small "Class C" network in which the first three bytes identify the network and the last byte identifies the individual computer. The author followed this procedure and was assigned the numbers 192.35.91.* for a network of computers at his house. Larger organizations can get a "Class B" network where the first two bytes identify the network and the last two bytes identify each of up to 64 thousand individual workstations. Yale's Class B network is 130.132, so all computers with IP address 130.132.*.* are connected through Yale.

The organization then connects to the Internet through one of a dozen regional or specialized network suppliers. The network vendor is given the subscriber network number and adds it to the routing configuration in its own machines and those of the other major network suppliers.

There is no mathematical formula that translates the numbers 192.35.91 or 130.132 into "Yale University" or "New Haven, CT." The machines that manage large regional networks or the central Internet routers managed by the National Science Foundation can only locate these networks by looking each network number up in a table. There are potentially thousands of Class B networks, and millions of Class C networks, but computer memory costs are low, so the tables are reasonable. Customers that connect to the Internet, even customers as large as IBM, do not need to maintain any information on other networks. They send all external data to the regional carrier to which they subscribe, and the regional carrier maintains the tables and does the appropriate routing.

New Haven is in a border state, split 50-50 between the Yankees and the Red Sox. In this spirit, Yale recently switched its connection from the Middle Atlantic regional network to the New England carrier. When the switch occurred, tables in the other regional areas and in the national spine had to be updated, so that traffic for 130.132 was routed through Boston instead of New Jersey. The large network carriers handle the paperwork and can perform such a switch given sufficient notice. During a conversion period, the university was connected to both networks so that messages could arrive through either path.

Subnets

Although the individual subscribers do not need to tabulate network numbers or provide explicit routing, it is convenient for most Class B networks to be internally managed as a much smaller and simpler version of the larger network organizations. It is common to subdivide the two bytes available for internal assignment into a one byte department number and a one byte workstation ID.


The enterprise network is built using commercially available TCP/IP router boxes. Each router has small tables with 255 entries to translate the one byte department number into selection of a destination Ethernet connected to one of the routers. Messages to the PC Lube and Tune server (130.132.59.234) are sent through the national and New England regional networks based on the 130.132 part of the number. Arriving at Yale, the 59 department ID selects an Ethernet connector in the C&IS building. The 234 selects a particular workstation on that LAN. The Yale network must be updated as new Ethernets and departments are added, but it is not affected by changes outside the university or the movement of machines within the department.
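The three-stage lookup can be pictured with a small Python sketch. The table contents and the function name are hypothetical; a real router holds this information in its own configuration format.

```python
# Hypothetical table for illustration: one-byte department number -> Ethernet segment.
DEPARTMENT_TABLE = {59: "Ethernet segment for department 59"}


def route_within_class_b(address, table):
    """Split a Class B address into network, department and workstation parts."""
    a, b, department, workstation = (int(part) for part in address.split("."))
    network = f"{a}.{b}"              # used by the national and regional carriers
    segment = table.get(department)   # used by the enterprise routers
    return network, segment, workstation


print(route_within_class_b("130.132.59.234", DEPARTMENT_TABLE))
# ('130.132', 'Ethernet segment for department 59', 234)
```

Only the middle step depends on the enterprise's own table, which is why changes outside the university do not touch it.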

An Uncertain Path

Every time a message arrives at an IP router, it makes an individual decision about where to send it next. There is no concept of a session with a preselected path for all traffic. Consider a company with facilities in New York, Los Angeles, Chicago and Atlanta. It could build a network from four phone lines forming a loop (NY to Chicago to LA to Atlanta to NY). A message arriving at the NY router could go to LA via either Chicago or Atlanta. The reply could come back the other way.

How does the router make a decision between routes? There is no correct answer. Traffic could be routed by the "clockwise" algorithm (go NY to Atlanta, LA to Chicago). The routers could alternate, sending one message to Atlanta and the next to Chicago. More sophisticated routing measures traffic patterns and sends data through the least busy link.

If one phone line in this network breaks down, traffic can still reach its destination through a roundabout path. After losing the NY to Chicago line, data can be sent NY to Atlanta to LA to Chicago. This provides continued service though with degraded performance. This kind of recovery is the primary design feature of IP. The loss of the line is immediately detected by the routers in NY and Chicago, but somehow this information must be sent to the other nodes. Otherwise, LA could continue to send NY messages through Chicago, where they arrive at a "dead end." Each network adopts some Router Protocol which periodically updates the routing tables throughout the network with information about changes in route status.
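The four-city loop is small enough to simulate. The sketch below is not a real Router Protocol; it simply runs a breadth-first search over whichever lines are still up, which is one simple way to find the shortest surviving path. The function name and data layout are assumptions made for the example.

```python
from collections import deque

# The four phone lines forming the loop described above.
LINKS = {("NY", "Chicago"), ("Chicago", "LA"), ("LA", "Atlanta"), ("Atlanta", "NY")}


def find_route(src, dst, links):
    """Return the shortest list of cities from src to dst, or None if unreachable."""
    neighbors = {}
    for a, b in links:  # every line carries traffic in both directions
        neighbors.setdefault(a, []).append(b)
        neighbors.setdefault(b, []).append(a)
    queue = deque([[src]])
    visited = {src}
    while queue:
        path = queue.popleft()
        if path[-1] == dst:
            return path
        for nxt in sorted(neighbors.get(path[-1], [])):
            if nxt not in visited:
                visited.add(nxt)
                queue.append(path + [nxt])
    return None


print(find_route("NY", "Chicago", LINKS))      # ['NY', 'Chicago']
surviving = LINKS - {("NY", "Chicago")}        # the NY to Chicago line fails
print(find_route("NY", "Chicago", surviving))  # ['NY', 'Atlanta', 'LA', 'Chicago']
```

The second result is the roundabout path from the text: service continues, but over three lines instead of one.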

If the size of the network grows, then the complexity of the routing updates will increase as will the cost of transmitting them. Building a single network that covers the entire US would be unreasonably complicated. Fortunately, the Internet is designed as a Network of Networks. This means that loops and redundancy are built into each regional carrier. The regional network handles its own problems and reroutes messages internally. Its Router Protocol updates the tables in its own routers, but no routing updates need to propagate from a regional carrier to the NSF spine or to the other regions (unless, of course, a subscriber switches permanently from one region to another).

Undiagnosed Problems

IBM designs its SNA networks to be centrally managed. If any error occurs, it is reported to the network authorities. By design, any error is a problem that should be corrected or repaired. IP networks, however, were designed to be robust. In battlefield conditions, the loss of a node or line is a normal circumstance. Casualties can be sorted out later on, but the network must stay up. So IP networks are robust. They automatically (and silently) reconfigure themselves when something goes wrong. If there is enough redundancy built into the system, then communication is maintained.

In 1975 when SNA was designed, such redundancy would be prohibitively expensive, or it might have been argued that only the Defense Department could afford it. Today, however, simple routers cost no more than a PC. But the TCP/IP design principle that "errors are normal and can be largely ignored" produces problems of its own.

Data traffic is frequently organized around "hubs," much like airline traffic. One could imagine an IP router in Atlanta routing messages for smaller cities throughout the Southeast. The problem is that data arrives without a reservation. Airline companies experience the problem around major events, like the Super Bowl. Just before the game, everyone wants to fly into the city. After the game, everyone wants to fly out. Imbalance occurs on the network when something new gets advertised. Adam Curry announced the server at "mtv.com" and his regional carrier was swamped with traffic the next day. The problem is that messages come in from the entire world over high speed lines, but they go out to mtv.com over what was then a slow speed phone line.

Occasionally a snow storm cancels flights and airports fill up with stranded passengers. Many go off to hotels in town. When data arrives at a congested router, there is no place to send the overflow. Excess packets are simply discarded. It becomes the responsibility of the sender to retry the data a few seconds later and to persist until it finally gets through. This recovery is provided by the TCP component of the Internet protocol.

TCP was designed to recover from node or line failures where the network propagates routing table changes to all router nodes. Since the update takes some time, TCP is slow to initiate recovery. The TCP algorithms are not tuned to optimally handle packet loss due to traffic congestion. Instead, the traditional Internet response to traffic problems has been to increase the speed of lines and equipment in order to stay ahead of growth in demand.

TCP treats the data as a stream of bytes. It logically assigns a sequence number to each byte. The TCP packet has a header that says, in effect, "This packet starts with byte 379642 and contains 200 bytes of data." The receiver can detect missing or incorrectly sequenced packets. TCP acknowledges data that has been received and retransmits data that has been lost. The TCP design means that error recovery is done end-to-end between the Client and Server machine. There is no formal standard for tracking problems in the middle of the network, though each network has adopted some ad hoc tools.
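A rough Python sketch of the receiver's side of this bookkeeping follows. It is a simplification, not TCP itself: it assumes segments never partially overlap, and the function name is invented for the example.

```python
def reassemble(segments, first_byte):
    """Deliver in-order data from (sequence number, payload) pairs.

    Returns the contiguous data received so far and the sequence number of the
    next byte expected, which is what the receiver would acknowledge.
    """
    pending = dict(segments)  # first byte number -> payload; exact duplicates collapse
    data = bytearray()
    next_expected = first_byte
    while next_expected in pending:
        payload = pending.pop(next_expected)
        data += payload
        next_expected += len(payload)
    return bytes(data), next_expected


# Three 200-byte segments arrive out of order; the one starting at byte 380042 is lost.
arrived = [(379842, b"b" * 200), (379642, b"a" * 200), (380242, b"d" * 200)]
data, ack = reassemble(arrived, 379642)
print(len(data), ack)  # 400 380042
```

The acknowledgment stops at the first gap, so the sender learns that everything before byte 380042 arrived and can retransmit from there.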


TCP/IP Protocols

The Transmission Control Protocol/Internet Protocol (TCP/IP) standards are always published as RFCs (Request For Comment), but not all RFCs specify standards.

TCP/IP protocols follow the Department of Defense (DOD) four-layer model:

Application/Process
Transport or Host-to-Host
Internet
Network Access

OSI Model        DOD Model
------------     -------------------------
Application      Application/Process
Presentation     Application/Process
Session          Application/Process
Transport        Transport or Host-to-Host
Network          Internet
Data Link        Network Access
Physical         Network Access

The TCP/IP protocol suite consists of:

Application/Process: Telnet, FTP, LPD, SNMP, TFTP, SMTP, NFS, X Window
Host-to-Host: TCP, UDP
Internet: ICMP, BootP, ARP, RARP, IP
Network Access: Ethernet, Fast Ethernet, Token Ring, FDDI

At the Network Access layer, IP uses NDIS (Network Device Interface Specification) to submit frames to the network adapter.

The four protocols at the Internet layer are:

  1. IP. The Internet Protocol is a connectionless and unreliable protocol that addresses and routes packets between hosts. Its header contains the source IP address of the sending host, the destination IP address, the transport protocol (TCP or UDP), a checksum and the Time to Live (TTL). The TTL is decremented by at least one each time the IP datagram passes through a router; the field was defined in seconds, but in practice it works as a hop count. When the TTL reaches zero, the packet is discarded. The default TTL in NT 4.0 is 128.
  2. ARP. The Address Resolution Protocol obtains the MAC address of a host on the same physical network by broadcast and maps it to the host's IP address. Once ARP obtains a hardware address, it stores both the IP and the MAC address as one entry in the ARP cache. The cache maintains both static and dynamic entries. Dynamic entries are added and deleted automatically, whereas static entries remain in the cache until the computer restarts. ARP always checks the cache before it initiates a broadcast. When the destination is on a different subnet, ARP resolves the MAC address of the default gateway instead, and the gateway forwards the packet. Each dynamic ARP cache entry can live up to 10 minutes. If it is not used within 2 minutes, it is deleted; otherwise, if used, it is deleted after 10 minutes. By adding static ARP entries you decrease the number of ARP requests. To view the ARP cache use the arp -g command.
  3. ICMP. The Internet Control Message Protocol reports errors and control messages on behalf of IP. It is carried by IP datagrams and it is unreliable.
  4. IGMP. The Internet Group Management Protocol tells routers which hosts on a network belong to which multicast group. It is carried by IP datagrams and it is unreliable.
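The ARP cache lifetimes given in item 2 (2 minutes if an entry is never used, 10 minutes at most if it is) can be modelled with a short Python sketch. The class and method names are hypothetical, the MAC address is made up, and expiry is checked lazily at lookup time, which is an assumption of this sketch and not a description of how NT implements it.

```python
import time

UNUSED_LIFETIME = 2 * 60   # seconds a dynamic entry survives if it is never used
MAX_LIFETIME = 10 * 60     # seconds a dynamic entry survives even when it is used


class ArpCache:
    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._static = {}    # ip -> mac, kept until restart
        self._dynamic = {}   # ip -> [mac, time learned, used flag]

    def add_static(self, ip, mac):
        self._static[ip] = mac

    def learn(self, ip, mac):
        """Record the answer to an ARP broadcast as a dynamic entry."""
        self._dynamic[ip] = [mac, self._clock(), False]

    def lookup(self, ip):
        """Return the cached MAC address, or None if a broadcast is needed."""
        if ip in self._static:
            return self._static[ip]
        entry = self._dynamic.get(ip)
        if entry is None:
            return None
        mac, learned_at, used = entry
        limit = MAX_LIFETIME if used else UNUSED_LIFETIME
        if self._clock() - learned_at >= limit:
            del self._dynamic[ip]   # expired
            return None
        entry[2] = True             # mark the entry as used
        return mac


now = [0.0]                               # fake clock so the demo runs instantly
cache = ArpCache(clock=lambda: now[0])
cache.learn("130.132.59.234", "00-a0-c9-14-c8-29")
now[0] = 121.0
print(cache.lookup("130.132.59.234"))     # None: never used within 2 minutes
```

Static entries bypass the timers entirely, which is why adding them reduces the number of ARP requests.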

The two protocols at the Transport or Host-to-Host layer are:

  1. TCP. The Transmission Control Protocol is a reliable, connection-oriented delivery service. It uses byte-stream communications so data is treated as a sequence of bytes. For each data segment sent, the receiving host must return an acknowledgment within a specified period. If there is no acknowledgment, the data is retransmitted. A TCP session is initialized via a three-way handshake in order to synchronize the sending and receiving of data segments. All TCP data segments have two parts: data and header. Sockets applications use a unique port number. Port numbers for well-known server-side applications are pre-assigned by IANA and do not change. Port numbers for client-side applications are dynamically assigned by the operating system. A socket is created by an application by specifying the IP address of the host, the service type (TCP or UDP) and the port the application is using.
  2. UDP. The User Datagram Protocol is an unreliable and connectionless datagram service. It is used by applications that do not require acknowledgment of data receipt, such as NetBIOS name service and SNMP. UDP ports are separate from TCP ports even though some of them use the same port number.
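To make the socket description in item 1 concrete, here is a small Python sketch that opens a TCP connection to an echo server on the loopback address. The helper names are invented for the example, and asking for port 0 (so the operating system picks a free port) is a convenience of the sketch; a well-known service would bind its pre-assigned port instead.

```python
import socket
import threading


def echo_once(server_sock):
    """Accept one connection and echo everything back until the client stops sending."""
    conn, _addr = server_sock.accept()
    with conn:
        while True:
            chunk = conn.recv(1024)
            if not chunk:
                break
            conn.sendall(chunk)


def round_trip(message: bytes) -> bytes:
    # AF_INET + SOCK_STREAM selects TCP over IPv4.
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
        server.bind(("127.0.0.1", 0))   # port 0: let the OS choose a free port
        server.listen(1)
        port = server.getsockname()[1]
        worker = threading.Thread(target=echo_once, args=(server,), daemon=True)
        worker.start()
        # Connecting performs the three-way handshake; the client's own port
        # is assigned dynamically by the operating system.
        with socket.create_connection(("127.0.0.1", port)) as client:
            client.sendall(message)
            client.shutdown(socket.SHUT_WR)   # signal that no more data will be sent
            received = b""
            while True:
                chunk = client.recv(1024)
                if not chunk:
                    break
                received += chunk
        worker.join()
    return received


print(round_trip(b"hello over TCP"))
```

The application only sees a stream of bytes; acknowledgments and retransmission happen underneath the socket calls.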

At the Application layer Microsoft TCP/IP provides two interfaces for network applications: Windows Sockets and NetBIOS. Examples of sockets applications are FTP and Telnet. Examples of NetBIOS applications are net view, net use, etc.

TCP/IP enables you to connect dissimilar systems with utilities such as FTP and Telnet. On NT all TCP/IP utilities are implemented as client software except for FTP which is both client and server. Note that NT can be a SLIP client but not SLIP server, thus NT RAS servers do not accept SLIP client connections.

  1. Data transfer utilities:

    Trivial File Transfer Protocol (TFTP) works like FTP.
    Remote Copy Protocol (RCP) copies files between NT and a Unix host.

  2. Remote execution utilities:

    Telnet provides terminal emulation.
    Remote Shell (RSH) runs commands on a Unix host.
    Remote Execution (REXEC) runs a process on a remote computer.

  3. Printing utilities:

    Line Printer Remote (LPR) prints a file to a host running the Line Printing Daemon (LPD) service.
    Line Printer Queue (LPQ) obtains status of a print queue on a host running the LPD service.

  4. Diagnostics utilities:

    PING (Packet InterNet Groper)
    IPCONFIG
    Finger
    NSLOOKUP
    HOSTNAME
    NETSTAT
    NBTSTAT
    Route
    Tracert
    ARP

TCP/IP configuration uses an IP address, subnet mask and default gateway to communicate with hosts. Each NIC in a computer that uses TCP/IP requires these parameters.

An IP address is a logical 32-bit number that identifies a host. Each IP consists of the network ID and the host ID. The network ID identifies all hosts on the same physical network and the host ID identifies a host on the network.

A subnet mask blocks out a portion of the IP address so that TCP/IP can distinguish the network ID from the host ID. The subnet mask determines if the destination host is on a local or remote network. If a duplicate IP address is configured, the IP address appears as configured, but the subnet mask appears as 0.0.0.0.

The default gateway receives all TCP/IP packets addressed to a remote network.

You can ping the loopback address 127.0.0.1 of any host to bypass the NIC and verify that TCP/IP is installed and loaded correctly.

IP Addressing

Each IP address is 32 bits long and composed of four 8-bit fields called octets. Each octet can range from 0 to 255. When all bits of an octet are 0 then the value of the octet is 0. When all bits of an octet are 1 then the value of the octet is 255. A 32-bit value has 4,294,967,296 possible combinations; once reserved ranges and the all-0 and all-1 host IDs are excluded, the Class A, B and C networks together support a total of 3,720,314,628 hosts.
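The total of 3,720,314,628 can be reproduced under one assumption: that it counts usable host IDs (all combinations except all 0s and all 1s) across the 126 Class A, 16,384 Class B and roughly 2 million (2 to the 21st power) Class C networks. A quick Python check of that reading:

```python
# (number of networks, host bits per network) for each class
CLASSES = {"A": (126, 24), "B": (2**14, 16), "C": (2**21, 8)}


def usable_hosts(host_bits: int) -> int:
    """All host-bit combinations except all 0s and all 1s."""
    return 2**host_bits - 2


total = sum(networks * usable_hosts(bits) for networks, bits in CLASSES.values())
print(total)   # 3720314628
print(2**32)   # 4294967296 raw 32-bit combinations
```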

Each IP address has a network ID part and a host ID part. All hosts on a network must have the same network ID in order to communicate. All TCP/IP hosts, including interfaces to routers, require unique host IDs.

IP addresses have five different classes. Each class defines the part of the IP which identifies the network ID and the part which identifies the host ID. You identify the class of an IP address by the number of the first octet.

Class A

High order bit = 0
Network ID = First octet
Range of network Ids = 1-126
Max networks = 126
Max hosts = approx. 17 million per network

Class B

High order bits = 10
Network ID = First two octets
Range of first octet = 128-191
Max networks = 16,384
Max hosts = approx. 65,000 per network

Class C

High order bits = 110
Network ID = First three octets
Range of first octet = 192-223
Max networks = approx. 2 million
Max hosts = 254 per network

Class D

High order bits = 1110
Used only for multicast groups. There are no network or host bits in multicast operations. WINS and Microsoft NetShow use multicast.

Class E

High order bits = 1111
Used for experimental purposes.
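Because the class is fixed by the high-order bits of the first octet, it can be determined with a few bit tests. A minimal Python sketch (the function name is illustrative):

```python
def address_class(address: str) -> str:
    """Return the class letter (A to E) implied by the first octet's high-order bits."""
    first_octet = int(address.split(".")[0])
    if not 0 <= first_octet <= 255:
        raise ValueError("first octet must be in the range 0 to 255")
    if first_octet & 0b10000000 == 0b00000000:
        return "A"   # 0xxxxxxx
    if first_octet & 0b11000000 == 0b10000000:
        return "B"   # 10xxxxxx
    if first_octet & 0b11100000 == 0b11000000:
        return "C"   # 110xxxxx
    if first_octet & 0b11110000 == 0b11100000:
        return "D"   # 1110xxxx
    return "E"       # 1111xxxx


print(address_class("130.132.59.234"))  # B
print(address_class("192.35.91.1"))     # C
```

Note that first octets 0 and 127 match the Class A bit pattern but are reserved, which is why the usable Class A range is 1-126.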

Some Addressing Rules

  • Each octet can range from 0 to 255.
  • The first octet of a network ID ranges from 1 to 223.
  • The network ID cannot be 127. This ID is reserved for loopback and diagnostic functions.
  • The network and the host ID bits cannot all be 1's (255.255.255.255). This address is interpreted as a broadcast address.
  • The network and the host ID bits cannot all be 0's (0.0.0.0). This address is interpreted to mean "this network only."
  • In any class IP address you cannot have 0 as the first octet (this network only) or 255 as the last octet (broadcast).
  • The host ID must be unique to the local network ID.
  • Networks connected by routers need unique network IDs.
  • Networks connected to the Internet need to have unique network ID portions assigned by the InterNIC.

Subnet Mask

A subnet mask is a 32-bit value used to mask a portion of the IP address to distinguish the network ID from the host ID. This way TCP/IP can determine whether an address is on a local or remote network. A default subnet mask is used on networks that are not divided into subnets.

In the subnet mask, all bits that correspond to the network ID are set to 1 (255) and all bits that correspond to the host ID are set to 0.

The host IP is ANDed with its subnet mask and the destination address of a packet is ANDed with the same subnet mask. If the result of ANDing the source and destination address match, then the packet belongs to a host on the local network. If the results do not match, the packet is sent to the default gateway (router).

To AND an IP to a subnet mask, multiply each bit in the IP with the corresponding bit in the subnet mask.
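The local-or-remote decision described above amounts to two bitwise ANDs and a comparison. A short Python sketch using the standard `ipaddress` module to parse the dotted notation (the function name and the sample addresses other than 130.132.59.234 are made up for illustration):

```python
import ipaddress


def is_local(source: str, destination: str, mask: str) -> bool:
    """True if both addresses give the same result when ANDed with the subnet mask."""
    src = int(ipaddress.IPv4Address(source))
    dst = int(ipaddress.IPv4Address(destination))
    m = int(ipaddress.IPv4Address(mask))
    return (src & m) == (dst & m)


print(is_local("130.132.59.234", "130.132.59.10", "255.255.255.0"))  # True
print(is_local("130.132.59.234", "130.132.60.10", "255.255.255.0"))  # False
print(is_local("130.132.59.234", "130.132.60.10", "255.255.0.0"))    # True
```

When the function returns False, the packet would be handed to the default gateway.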

Subnetting

A subnet is a physical segment in a TCP/IP environment that uses IP addresses derived from a single network ID. Subnetting requires that each segment use a different network ID, or subnet ID. A subnet ID is created by partitioning the bits in the host ID into two parts. One part is used to identify the segment as a unique network, and the other part to identify the hosts. Subnetting is not necessary for private networks. By using more bits for the subnet mask, more subnets are available, but fewer hosts are available per subnet.

Before subnetting you need to define:

  • One subnet mask for the entire network
  • A unique subnet ID for each physical segment
  • A range of host IDs for each subnet

To find the subnet mask:

  1. Count the number of physical segments in your network.
  2. Convert the above number to binary.
  3. Count the number of bits required to represent the above number in binary.
  4. Set that many high-order (leftmost) bits of the first host ID octet to 1 and convert the octet to decimal.

For example if you have a class B network and you want to create 6 subnets:

The binary value of 6 is 110. So 6 requires 3 bits. The third octet of a class B network is the first octet of the host ID. This octet now becomes 11100000 in order to represent the subnet mask (remember that the subnet mask portion of a network ID must have all bits equal to 1). The binary 11100000 is equal to 224 decimal. So the new subnet mask is 255.255.224.0 for your subnetted class B network.
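The four steps can be written as a short Python function. This is a sketch of the procedure exactly as stated; the function name and the `network_octets` parameter (1, 2 or 3 for class A, B or C) are introduced here for illustration.

```python
def subnet_mask_for(segments: int, network_octets: int) -> str:
    """Subnet mask for a classful network divided into the given number of segments."""
    bits = segments.bit_length()          # steps 2 and 3: bits needed to write the count in binary
    host_bits = 32 - 8 * network_octets
    if segments < 1 or bits > host_bits - 2:
        raise ValueError("segment count does not fit in the host ID portion")
    prefix = 8 * network_octets + bits    # network bits plus borrowed subnet bits
    mask = (0xFFFFFFFF << (32 - prefix)) & 0xFFFFFFFF   # step 4: leftmost bits set to 1
    return ".".join(str((mask >> shift) & 0xFF) for shift in (24, 16, 8, 0))


print(subnet_mask_for(6, 2))  # 255.255.224.0
```

One caveat: if the all-0s and all-1s subnet IDs are excluded, a count such as 7 needs one more bit than this procedure yields, since 3 bits then give only 6 usable subnets.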

You can subnet using more than one octet or more than 8 bits. This way you can create more subnets with more addressing flexibility.

Use the following table to simplify the additions:

128
+64=192
+32=224
+16=240
+ 8=248
+ 4=252
+ 2=254
+ 1=255

Formula for subnetting a class C network

If Subnet Bits = z (borrowed from the first octet of the host ID portion)
Number of possible subnets = $2^z - 2$ (all possible combinations of subnet bits, excluding the all 0 and all 1)
Hosts per Subnet = $2^{8-z} - 2$ (all possible combinations of remaining host bits, excluding the all 0 and all 1)
Total Hosts = (Number of subnets) x (Hosts per subnet)
Netmask = The decimal value of the subnet bits in high order
Valid Subnetwork IDs
= $2^{8-z}$ = net1
net1 + $2^{8-z}$ = net2
net2 + $2^{8-z}$ = net3 etc…
Valid Hosts per Subnet = (net1+1) to (net2-2) etc…

Example: You want to divide a class C network into 4 subnets.
Subnet bits = 3 (2 bits would give only $2^2 - 2$ = 2 subnets)
Number of subnets = $2^3 - 2$ = 8-2 = 6
Hosts per subnet = $2^{8-3} - 2$ = 32-2 = 30
Total hosts = 6x30 = 180
Netmask = 11111111.11111111.11111111.11100000 = 255.255.255.224
Valid subnet IDs = $2^{8-3}$ = 32
32+32= 64
64+32= 96
96+32= 128
128+32=160
160+32=192

Valid hosts per subnet = (32+1) to (64-2) = 33 to 62
65 to 94
97 to 126
129 to 158
161 to 190
193 to 222
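
The formula and the worked example can be checked with a few lines of Python. This sketch follows the rules as stated (the all-0 and all-1 subnet IDs and host IDs are excluded); the function name and return layout are assumptions made for the example.

```python
def class_c_subnets(z: int):
    """Return (last octet of the netmask, [(subnet ID, first host, last host), ...])."""
    if not 2 <= z <= 6:
        raise ValueError("a class C network can borrow between 2 and 6 subnet bits")
    block = 2 ** (8 - z)        # spacing between subnet IDs
    mask_octet = 256 - block    # the z high-order bits set to 1
    plan = []
    # Skip the all-0 subnet (0) and the all-1 subnet (256 - block).
    for subnet_id in range(block, 256 - block, block):
        first_host = subnet_id + 1
        last_host = subnet_id + block - 2   # the last address in the block is the broadcast
        plan.append((subnet_id, first_host, last_host))
    return mask_octet, plan


mask_octet, plan = class_c_subnets(3)
print(mask_octet)   # 224
for subnet_id, first_host, last_host in plan:
    print(subnet_id, first_host, "to", last_host)
```

For z = 3 this lists six subnets, from 32 (hosts 33 to 62) through 192 (hosts 193 to 222), matching the example.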
