MRECW Students E- Learning Point Computer Security Student (CSS)

Tuesday, 13 October 2015

UDP

User Datagram Protocol (UDP)

Let's start with the first protocol of the Transport Layer, the User Datagram Protocol (UDP).

UDP is one of the most important Transport Layer protocols serving the Internet. It was designed by David P. Reed in 1980. The Transport Layer uses UDP to pass messages from the Application Layer to the underlying Network Layer.

UDP is a connectionless protocol: the two hosts perform no handshake before packets are transmitted. UDP gives no guarantee that packets are delivered. The Transport Layer works over the Network Layer, so UDP runs over IP, which is itself an unreliable delivery service. With UDP, therefore, you cannot be sure that a message was delivered. UDP provides only simple multiplexing and demultiplexing at the hosts, plus a checksum for data integrity.

Multiplexing and demultiplexing are performed by Transport Layer protocols so that data passes between the Network Layer and the correct Application Layer process.

A checksum is a mathematical calculation performed at both the sending and the receiving host to check whether the data has arrived in its original form, that is, whether the received data is correct.

A question must now be arising in your mind: if UDP provides no guarantee that packets are delivered and no flow control, why should a developer use UDP for an application?

Let me tell you some uses of UDP and why an application developer would choose it.

There are various uses of UDP. These are as follows:

1. Stateless Protocol :

UDP doesn't maintain any state about its clients. This makes it very useful in applications such as streaming media, where there are millions of clients and keeping information about all of them would be difficult.

2. No Re-transmission Delay :

Because UDP doesn't provide reliable delivery, lost packets are not retransmitted, so there is no retransmission delay. This suits real-time applications such as Voice over IP.

3. Suitable for DNS:

UDP is a transaction-oriented protocol, so it plays a vital role in query-response protocols such as the Domain Name System (DNS).

4. Small Header :

The header that UDP adds to the Application Layer message is only 8 bytes, whereas the TCP header is at least 20 bytes. We will discuss headers later in the post.

Working of UDP:

As discussed above, UDP does multiplexing/demultiplexing and some error checking. It adds a UDP header containing the source and destination port numbers and passes the data to the underlying Network Layer. The Network Layer then adds its own header to the Transport Layer packet and sends it to the destination.

UDP is really helpful in real-time applications such as video conferencing and online gaming, because in such applications losing a packet is preferable to receiving it late.
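To make the "no handshake" behaviour concrete, here is a minimal sketch using Python's standard socket module. It assumes both ends run on the same machine over the loopback address; the message text and the 2-second timeout are arbitrary choices for illustration.

```python
import socket

# Receiver: bind a UDP socket; port 0 asks the OS to pick any free port.
receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind(("127.0.0.1", 0))
receiver.settimeout(2.0)          # give up instead of waiting forever
address = receiver.getsockname()  # (ip, port) the sender should target

# Sender: no connect or handshake, the datagram is simply handed to IP.
sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sender.sendto(b"hello over UDP", address)

# Each recvfrom call returns one whole datagram and the sender's address.
data, source = receiver.recvfrom(2048)
print(data, source)

sender.close()
receiver.close()
```

On loopback the datagram normally arrives, but nothing in UDP itself would tell the sender if it had been lost.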

Port Numbers:

Ports are identified by a port number. A port number is a 16-bit number, so an application can be assigned a port from 0 to 65535. Port numbers 0 to 1023 are the well-known ports, reserved by IANA for services such as HTTP, FTP and SMTP. Port numbers 1024 to 49151 can be registered with IANA for other applications, and port numbers 49152 to 65535 are dynamic, so a newly developed application can use a port number above 1023.

UDP Packet Structure:

Figure: UDP packet structure

The UDP header has 4 fields. Each field is 16 bits (2 bytes), so the UDP header is 8 bytes, or 64 bits, long.

(1 byte = 8 bits)

Source and Destination Port Number:

The source and destination port numbers allow the message to be passed to the correct process running on the end systems (hosts). Each is 16 bits long.

Length:

It gives the complete length of the UDP packet, i.e. Length = UDP header length + Application message length. The length therefore differs between UDP segments, depending on the size of the application message.

Minimum Length will be 8 bytes i.e. the size of the header.

Checksum:

The checksum is used on the receiving side to check whether the data has arrived in its original form, that is, to detect any error introduced during transmission.
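The four header fields described above can be sketched with Python's struct module. This is only an illustration: the function names and the port numbers are made up for the example, and the checksum is left at 0 rather than computed.

```python
import struct

# Four 16-bit unsigned fields in network (big-endian) byte order.
UDP_HEADER = struct.Struct("!HHHH")

def build_udp_segment(src_port, dst_port, payload, checksum=0):
    """Prefix the payload with an 8-byte UDP header (checksum not computed here)."""
    length = UDP_HEADER.size + len(payload)  # header + application message
    return UDP_HEADER.pack(src_port, dst_port, length, checksum) + payload

def parse_udp_segment(segment):
    """Split a segment back into its header fields and payload."""
    src_port, dst_port, length, checksum = UDP_HEADER.unpack_from(segment)
    return src_port, dst_port, length, checksum, segment[UDP_HEADER.size:length]

segment = build_udp_segment(5000, 53, b"hello")
print(UDP_HEADER.size)             # header size in bytes
print(parse_udp_segment(segment))  # fields recovered from the raw bytes
```

With an empty payload the Length field comes out as 8, the minimum mentioned above.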

Checksum Calculation:

Checksum is used for Error Detection.

Let us suppose that we have

Source Port Number (S) = 0001101110010101

Destination Port Number (P) = 1010100101110011

Length (L) = 0011011110010001

We will add all of these, take the 1's complement of the sum, and put the result in the Checksum field.

S + P = T

  0001101110010101
+ 1010100101110011
= 1100010100001000 (T)

T + L = A

  1100010100001000
+ 0011011110010001
= 1111110010011001 (A)

Now we take the 1's complement of A, i.e. we convert every 1 to 0 and every 0 to 1.

Thus 1's complement of 1111110010011001 is 0000001101100110. And this is our checksum.

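We put this checksum in the header and send the packet. On the receiving side, the receiver adds all 4 fields of the header, i.e. Source + Destination + Length + Checksum. (This example is simplified to the header fields. In real UDP the sum also covers the application data and a pseudo-header taken from the IP header, and any carry out of the 16th bit is added back into the sum.)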

If the data has arrived correctly, the sum will be equal to 1111111111111111, that is, all 1's. If any bit of the sum is 0, an error has occurred.

You can see that UDP provides error checking, but nothing to recover from an error. Some UDP implementations discard a packet that has an error, and some pass the packet to the application with a warning.
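The worked example above can be checked with a short Python sketch. It uses the same three 16-bit values and, like the example, sums only the header fields; the function names are chosen for this illustration. The carry wrap-around is included because 1's complement addition requires it, although these particular values never overflow 16 bits.

```python
def ones_complement_sum(words):
    """Add 16-bit words, folding any carry out of bit 16 back into the sum."""
    total = 0
    for word in words:
        total += word
        total = (total & 0xFFFF) + (total >> 16)
    return total

def checksum(words):
    """1's complement of the 1's complement sum, kept to 16 bits."""
    return ~ones_complement_sum(words) & 0xFFFF

S = 0b0001101110010101  # source port from the example
P = 0b1010100101110011  # destination port from the example
L = 0b0011011110010001  # length from the example

c = checksum([S, P, L])
print(format(c, "016b"))                                  # 0000001101100110
print(format(ones_complement_sum([S, P, L, c]), "016b"))  # 1111111111111111

# Flip one bit of the source port to imitate a transmission error.
damaged = ones_complement_sum([S ^ 1, P, L, c])
print(damaged == 0xFFFF)                                  # False
```

The receiver's check is the second print: all 1's means no error was detected, and the flipped bit in the last step breaks that pattern.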

Datagram Network Layer

Network Layer Datagram and its format

The Network Layer has 3 main components. The 1st is the routing protocols and routing algorithms. The 2nd is the IP protocol, which defines the Network Layer datagram format and the addressing conventions. The 3rd is the Internet Control Message Protocol (ICMP), which handles error reporting and router signaling.

Figure: Components of the Network Layer (routing protocols, IP protocol, ICMP protocol)

We start with the IP datagram. Remember that a Network Layer packet is called a datagram.

IP Datagram Format

There are 2 versions of IP: IPv4 and IPv6. IPv4 is widely used on the Internet. Because the number of users increases every day, IPv6 was developed as an alternative to IPv4 in order to provide many more IP addresses for hosts.

IPv4 Datagram format :

Figure: IPv4 datagram format

1. Version :

It tells the router which version of IP the datagram uses, version 4 or version 6, because different versions are processed differently. This field is 4 bits.

2. Header Length :

This field gives the size of the IP header, counted in 32-bit words. The IPv4 header has variable length because options may or may not be present, so the header length is needed to tell where the actual data starts in an IP datagram. This field is 4 bits.

3. Type of Service :

This field lets a router distinguish between types of traffic, for example real-time datagrams such as those for video calling and non-real-time datagrams such as those for SMTP or FTP. This field is 8 bits.

4. Datagram Length :

This field indicates the total size of the IP datagram, i.e. IP header + data. This field is 16 bits.

5. Header Checksum :

This is used to detect errors in the header of the IP datagram. In most cases, a router that detects an error discards the datagram. This field is 16 bits.

6. Upper Layer Protocol :

This field identifies the protocol used in the layer above, for example TCP or UDP. It is used when the datagram reaches its destination: the number 6 in this field indicates that TCP is used at the Transport Layer, and the number 17 indicates that UDP is used. This field is 8 bits.

7. Time-to-Live (TTL):

This field ensures that a datagram does not circulate in the network forever, for example in a routing loop. It is decremented by one every time the datagram is processed at a router, and when the TTL reaches 0 the datagram is dropped. This field is 8 bits.

8. Source and Destination IP address :

The source puts its own IP address in the source field and the address of the final destination in the destination field. The source often gets the IP address of the destination from a DNS lookup. Each of the source and destination address fields is 32 bits in the IPv4 header.

9. Data :

This field contains the actual data to be transmitted to the destination. In most cases it carries a Transport Layer segment (TCP or UDP). Its length is variable and is bounded by the Datagram Length field.

The IPv4 header is 20 bytes, assuming there are no options. If it carries a TCP segment, each IPv4 datagram carries a total of 40 bytes of header (20 bytes of TCP header and 20 bytes of IP header) plus the application message.

10. Identifiers, Flags and Offset :

These 3 fields are used for fragmentation, that is, to break a datagram into smaller fragments when it is larger than the maximum size the outgoing link can carry. The identifier field is 16 bits, the flags field is 3 bits, and the fragmentation offset is 13 bits.
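The field sizes listed above add up to the 20-byte header, which can be sketched with Python's struct module. The dictionary keys, the sample addresses (taken from the 192.0.2.0/24 documentation range) and the zero checksum are assumptions made for this illustration, not values from a real capture.

```python
import socket
import struct

# version+header length (8 bits), type of service (8), datagram length (16),
# identifier (16), flags+offset (16), TTL (8), protocol (8), checksum (16),
# source address (32), destination address (32) = 20 bytes.
IPV4_HEADER = struct.Struct("!BBHHHBBH4s4s")

def parse_ipv4_header(raw):
    (ver_ihl, tos, total_length, ident, flags_frag,
     ttl, protocol, checksum, src, dst) = IPV4_HEADER.unpack_from(raw)
    return {
        "version": ver_ihl >> 4,                     # top 4 bits
        "header_length_bytes": (ver_ihl & 0x0F) * 4, # counted in 32-bit words
        "type_of_service": tos,
        "datagram_length": total_length,
        "identifier": ident,
        "flags": flags_frag >> 13,                   # top 3 bits
        "fragment_offset": flags_frag & 0x1FFF,      # low 13 bits
        "ttl": ttl,
        "upper_layer_protocol": protocol,            # 6 = TCP, 17 = UDP
        "header_checksum": checksum,
        "source": socket.inet_ntoa(src),
        "destination": socket.inet_ntoa(dst),
    }

# A hand-built sample header: version 4, 5 words long, carrying UDP.
sample = IPV4_HEADER.pack(0x45, 0, 28, 1, 0, 64, 17, 0,
                          socket.inet_aton("192.0.2.1"),
                          socket.inet_aton("192.0.2.2"))
header = parse_ipv4_header(sample)
print(header["version"], header["header_length_bytes"], header["upper_layer_protocol"])
```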

You can clearly see that the source IP address field is 32 bits. That means there can be 2^32 different IP addresses on the Internet, which is roughly 4 billion.

IPv6 Datagram Format

Figure: IPv6 datagram format

1. Version :

Same as in IPv4: it gives the version of IP. In this case the value is 6. This field is 4 bits.

2. Traffic Class :

This field is 8 bits. It is similar to the Type of Service field in IPv4 and tells the router whether the traffic is real-time or non-real-time.

3. Payload Length :

It is a 16-bit field that gives the size of the Data field of the IP datagram.

4. Next header :

It tells you the type of the upper layer protocol being used. This field is of 8 bits.

5. Hop Limit :

The hop limit is decreased by one at every router. When it reaches 0, the datagram is discarded. This field is 8 bits.

6. Source and Destination IP address :

Each of the source and destination address fields is 128 bits. That means the Internet can now have 2^128 different IP addresses, which is about 3.4 × 10^38. The Internet can therefore grow far larger than it could with IPv4.

7. Data :

It contains the Transport Layer header along with the original Application message.

Because the source and destination IP addresses are larger, the header of an IPv6 datagram is 40 bytes.
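The 40-byte figure can be checked with a small Python sketch of the fixed IPv6 header. One assumption to note: the first 32-bit word also holds a 20-bit flow label, which is not covered in the field list above but is needed to make the sizes add up. The sample addresses come from the 2001:db8::/32 documentation range.

```python
import ipaddress
import struct

# First word: version (4 bits) + traffic class (8) + flow label (20),
# then payload length (16), next header (8), hop limit (8),
# source address (128) and destination address (128) = 40 bytes.
IPV6_HEADER = struct.Struct("!IHBB16s16s")

def parse_ipv6_header(raw):
    first_word, payload_length, next_header, hop_limit, src, dst = \
        IPV6_HEADER.unpack_from(raw)
    return {
        "version": first_word >> 28,
        "traffic_class": (first_word >> 20) & 0xFF,
        "flow_label": first_word & 0xFFFFF,
        "payload_length": payload_length,
        "next_header": next_header,   # 17 would mean UDP follows
        "hop_limit": hop_limit,
        "source": str(ipaddress.IPv6Address(src)),
        "destination": str(ipaddress.IPv6Address(dst)),
    }

# Hand-built sample header for illustration only.
sample6 = IPV6_HEADER.pack(6 << 28, 8, 17, 64,
                           ipaddress.IPv6Address("2001:db8::1").packed,
                           ipaddress.IPv6Address("2001:db8::2").packed)
header6 = parse_ipv6_header(sample6)
print(IPV6_HEADER.size, header6["version"], header6["hop_limit"])
```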

Advantages of IPv6 :

1. In IPv6, the size of an IP address is increased from 32 bits to 128 bits. Even if you gave an IP address to every seed on the Earth, the addresses would not run out. That means the Internet will not run out of IP addresses.

2. Several fields are removed in IPv6, such as the Options field. This gives a fixed-length header, which results in faster processing of IP datagrams.

3. Routers do not fragment IPv6 datagrams. If a datagram is too large for the outgoing link, the router simply drops it and sends a "Packet Too Big" ICMP error message back to the sender.

4. The header checksum is removed in IPv6. The designers considered the checksum at the Transport Layer sufficient and a second one at the Network Layer redundant, so they removed it from the Network Layer.

Because the IPv4 header contains a TTL field that changes at every hop, the IPv4 header checksum has to be recomputed at every router. Along with fragmentation, this is a costly process.

Transformation from IPv4 to IPv6 :

As you must know, most routers around the globe run IPv4, and many of them cannot handle IPv6 datagrams. So what should be done to make them IPv6 compatible? One proposed solution is to declare a flag day: all the networks of the world would be shut down on that day and every router converted to IPv6. But do you really think that is possible with millions of systems on the Internet? I don't think so.

Another solution is to make the new IPv6 routers capable of handling both IPv4 and IPv6 datagrams. This is known as the Dual Stack Approach, and nodes or routers that implement both are known as IPv4/IPv6 nodes. But this approach still has a problem. Let me explain with an example:

Suppose Router A wants to send an IPv6 datagram to Router F, and the datagram has to traverse the intermediate routers B, C, D and E. Routers C and D are IPv4-only. Router A sends an IPv6 datagram to B, but because C understands only IPv4, B has to send C an IPv4 datagram. So router B copies the IPv6 fields into an IPv4 datagram, applying the appropriate mappings. But some IPv6 fields have no counterpart in IPv4, and those fields are lost. Routers E and F can exchange IPv6 datagrams, but the datagram that arrives at E from D no longer contains all the fields originally sent by router A.

To overcome this problem with the dual stack approach, we have a technique known as Tunneling. Tunneling enables router E to receive the original datagram sent by router A. Let us take the example given below. Suppose IPv6-compatible Router U and Router Z want to interoperate, but the path between them passes through the intermediate routers W and X, which are IPv4-only. The intermediate IPv4 routers are referred to as a tunnel.

Figure: Tunneling an IPv6 datagram through IPv4 routers

An IPv6 router on the sending side of the tunnel, say router V, puts the complete IPv6 datagram into the data field of an IPv4 datagram. This IPv4 datagram is addressed to Router Y, on the receiving side of the tunnel, and is sent into the tunnel. The IPv4 routers inside the tunnel route this datagram among themselves without knowing that it contains a complete IPv6 datagram. Finally the datagram reaches router Y, which extracts the IPv6 datagram from the IPv4 datagram and passes it on to Z. In this way, the IPv6 datagram reaches its destination without losing any fields.
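The encapsulation step can be sketched in Python as follows. This is a simplified illustration, not a working tunnel: the IPv4 header checksum is left at 0, the tunnel endpoint addresses are placeholders from the 192.0.2.0/24 documentation range, and the "IPv6 datagram" is a stand-in byte string. Protocol number 41 is the value assigned to IPv6 carried inside IPv4.

```python
import socket
import struct

IPV4_HEADER = struct.Struct("!BBHHHBBH4s4s")
IPV6_IN_IPV4 = 41  # protocol number for an encapsulated IPv6 datagram

def encapsulate(ipv6_datagram, tunnel_src, tunnel_dst):
    """Router V: place the whole IPv6 datagram in the data field of an IPv4 datagram."""
    total_length = IPV4_HEADER.size + len(ipv6_datagram)
    header = IPV4_HEADER.pack(0x45, 0, total_length, 0, 0, 64, IPV6_IN_IPV4, 0,
                              socket.inet_aton(tunnel_src),
                              socket.inet_aton(tunnel_dst))
    return header + ipv6_datagram

def decapsulate(ipv4_datagram):
    """Router Y: strip the IPv4 header and recover the original IPv6 datagram."""
    fields = IPV4_HEADER.unpack_from(ipv4_datagram)
    header_length = (fields[0] & 0x0F) * 4
    total_length, protocol = fields[2], fields[6]
    if protocol != IPV6_IN_IPV4:
        raise ValueError("data field does not hold an IPv6 datagram")
    return ipv4_datagram[header_length:total_length]

original = b"\x60" + bytes(39) + b"payload"  # stand-in for a 40-byte header + data
in_tunnel = encapsulate(original, "192.0.2.1", "192.0.2.2")
print(decapsulate(in_tunnel) == original)    # True: no fields were lost
```

Routers W and X would look only at the outer 20 bytes, which is why the inner datagram survives unchanged.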

Network Layer 5

The Network Layer

The Network Layer is the most complex and most important layer in the TCP/IP (Internet) stack. First of all, note that the Application and Transport Layers do not reside on routers and switches. Only the Network Layer and the layers beneath it are present on the intermediate routers.

Characteristics of Network Layer:

1. Host Addressing:

Every end system or host must have a specific address so that it can be reached from the outside world. This address is known as an IP address. IPv4 addresses are 32 bits long, for example 192.145.59.28. IP addresses are hierarchical. For example: you can be "Steve Bond" to the people in your house, "Steve Bond, 54 Church Street" to people elsewhere in the USA, and "Steve Bond, 54 Church Street, USA" to people in the rest of the world. The IP address hierarchy works in the same way.

2. Connection-less Service:

IP, the Network Layer protocol that carries datagrams, is connectionless. Thus, when a datagram travels from sender to receiver, no connection is set up beforehand and the recipient doesn't need to send any acknowledgement.

The Network Layer is responsible for Packet Forwarding and Routing .

Forwarding : When a packet arrives at a router, forwarding decides which outgoing link it should be sent to next. For example, in the given figure, a packet from Host A arriving at router R1 must be forwarded to the next router on the way to Host B.

Figure: Forwarding of a packet from one router to another

Routing : Whereas forwarding is a local action at a router that moves a packet towards the next router, routing concerns the whole network. Routing determines the complete path, and therefore the routers, that a packet passes through on its way from sender to receiver. The algorithms that calculate these paths are known as Routing Algorithms.

Every router has a forwarding table. The router examines a header field of each arriving packet, uses its value as an index into the forwarding table, and forwards the packet to the outgoing link listed there. For example, in the given figure, the header field value of the incoming packet is 0101 and the corresponding outgoing link is 3, so the router transfers the packet to output link 3.


Figure: Values in Forwarding Table at Different Routers
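A forwarding table lookup of this kind can be sketched as a dictionary in Python. Only the 0101 to link 3 entry comes from the example in the text; the other rows and the function name are hypothetical, added so the table has more than one entry.

```python
# Header value -> outgoing link. Only "0101": 3 is taken from the text;
# the remaining rows are made-up entries for illustration.
FORWARDING_TABLE = {
    "0100": 1,
    "0101": 3,
    "0111": 2,
    "1001": 1,
}

def forward(header_value, table=FORWARDING_TABLE):
    """Return the outgoing link for a packet, as a router's input port would."""
    if header_value not in table:
        raise LookupError(f"no forwarding entry for header value {header_value}")
    return table[header_value]

print(forward("0101"))  # 3
```

The routing algorithm's job, described next, is to decide what goes into this table; the lookup itself stays this simple.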

A question must now be arising in your mind: how are these forwarding tables configured within a router? This is a very important concept, because it describes the relation between forwarding and routing. The routing algorithm determines the values that are inserted into each router's forwarding table. A routing algorithm may be centralized or decentralized.

Centralized Algorithm :

The algorithm runs at a single, central router, which also updates the records of all the other routers.

One disadvantage of this technique is the heavy load on a single router. Another is that if the central router goes down, routing for the whole network fails.

De-Centralized Algorithm :

The Routing algorithm is running on every router.

In both cases, a router receives routing protocol messages and uses them to update or configure its forwarding table.

You could imagine all of this forwarding and routing configuration being done by humans, with an operator physically present at every router. The operators would have to interact with each other to update the forwarding table records so that travelling packets reach the correct destination. But human configuration is more error-prone and far slower than a routing protocol. Therefore our networks use software and routing protocols to automate this work and deliver packets to end users more efficiently and quickly.

Many algorithms can run in routers to provide fast and correct delivery of packets, for example Link State (LS) and Distance Vector (DV) routing algorithms. We will discuss each of these algorithms in detail in the coming posts.

IP Address :

As we all know, humans understand words and letters very well, so domain names such as www.google.com or www.com2networks.blogspot.com are easy for us to remember. But what about routers and intermediate switches? Domain names have variable length, which makes them hard for routers to process. Therefore every domain name corresponds to a fixed-length IP address that routers understand. IPv4 addresses are 32 bits long. For example, www.google.com might correspond to 192.174.43.128. Each section separated by a dot is 8 bits, so it can hold one of 2^8 = 256 values, from 0 to 255.
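The relation between the dotted notation and the underlying 32-bit number can be sketched in Python. The function names are chosen for this illustration, and the address is simply the example value used in the text.

```python
def dotted_to_int(address):
    """Pack four 8-bit sections into one 32-bit number, leftmost section first."""
    parts = address.split(".")
    if len(parts) != 4:
        raise ValueError("an IPv4 address has exactly four sections")
    value = 0
    for part in parts:
        octet = int(part)
        if not 0 <= octet <= 255:
            raise ValueError("each section must be between 0 and 255")
        value = (value << 8) | octet
    return value

def int_to_dotted(value):
    """Split a 32-bit number back into four dot-separated sections."""
    return ".".join(str((value >> shift) & 0xFF) for shift in (24, 16, 8, 0))

number = dotted_to_int("192.174.43.128")
print(format(number, "032b"))  # the 32 bits a router actually works with
print(int_to_dotted(number))   # back to 192.174.43.128
```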

The mapping between domain names and their IP addresses is provided by the Domain Name System (DNS).

Services Provided by the Network Layer :

1. Guarantee of Delivery of Packets :

This means the packet will definitely reach its destination.

2. Guaranteed Delivery with Bounded Delay :

This means that the packet is not only delivered to its destination, but delivered within a given period of time, for example within 50 msec.

3. Delivery of Packet in-order :

The packets will reach the destination in the same order as they were sent.

4. Minimal Jitter :

Jitter is the difference between the delays of two packets of the same message arriving at a destination. The network should keep jitter minimal.

Now you must always remember that the Internet's Network Layer doesn't guarantee any of these services. It provides a single service, "Best Effort Service": it makes its best effort to deliver packets to the communicating hosts, but it guarantees nothing.

Thus you can say that Best Effort Service is almost another name for no service at all. Even a network that delivers no packets would come under Best Effort Service.

Router Architecture:

Now let's have a brief look at the parts of a router, that is, at what is inside it.


Figure : Architecture of a Router

1. Input Ports :

The input port performs several functions inside a router. The incoming link is terminated at the leftmost box of the input port. At the rightmost box of the input port, the forwarding table is consulted and the corresponding output link is determined.

2. Switching fabric :

The switching fabric connects the input ports of the router to its output ports and incorporates the processor.

3. Output Ports :

The output port transmits the packet to the correct outgoing link towards its destination.

4. Routing Processor :

The routing processor controls the consulting, updating and configuring of the router and its forwarding table. It executes the routing protocol and also performs network management functions.

DDNS

Distributed Domain Name System (DNS)

Continuing from our former post on the Domain Name System (DNS), in this post we will discuss the Distributed DNS and DNS caching.

A single DNS server cannot map every hostname to its IP address for the millions of Internet users around the globe. Thus a network of DNS servers, known as the Distributed DNS, is formed. These DNS servers are organised in a hierarchy. There are basically 3 types of DNS servers: Root Servers, Top-Level Domain Servers and Authoritative Servers. Let's have a look at this diagram.

Figure: Hierarchy of DNS servers (root, top-level domain and authoritative servers)

1. Root DNS Servers:

There are 13 root servers, named A to M. This doesn't mean that there are only 13 physical machines. The 13 named root servers are run by a small group of operator organisations, most of them based in North America, and each root server is replicated at various places to distribute the load and provide better service. At the time of writing, there are around 247 root server instances spread throughout the world.

2. Top Level Domain (TLD )Servers:

These servers are responsible for top-level domains such as .com, .org, .edu and .gov, and for country top-level domains such as .in, .us and .fr. For example, Verisign Global Registry Services maintains the TLD servers for the com top-level domain, and Educause maintains those for the edu top-level domain.

I suggest reading IANA TLD 2012 to learn more about Top Level Domain Servers.

3. Authoritative Servers:

A company or a university can maintain its own authoritative DNS servers. An organisation whose hosts are publicly accessible on the Internet can provide authoritative DNS servers for them.

Here is a Map showing all the DNS servers throughout the Globe.

Figure: DNS server locations throughout the world

There is one more type of DNS server, the local DNS server. Every Internet Service Provider (ISP) has a local DNS server. Whenever a host connects to an ISP, the ISP provides it with the IP address of its local DNS server. When a host makes a DNS query, the query is first sent to the local DNS server, which forwards it into the DNS server hierarchy.

Let us discuss an example that will make the working of the DNS server hierarchy clear.

Suppose a host ec.school.edu wants the IP address of cs.stanford.edu. The local DNS server of ec.school.edu is dns.school.edu, and the authoritative DNS server of cs.stanford.edu is dns.stanford.edu. The host ec.school.edu first sends a DNS query to its local DNS server, asking it to translate the hostname cs.stanford.edu into an IP address. The local DNS server forwards the query to a root DNS server. The root server notes the .edu suffix and returns to the local DNS server a list of IP addresses of the TLD servers responsible for .edu. The local DNS server then re-sends the query to one of these TLD servers. The TLD server notes the stanford.edu suffix and responds with the IP address of Stanford University's authoritative DNS server, dns.stanford.edu. The local server now sends the query to dns.stanford.edu, which responds with the IP address of cs.stanford.edu. Finally the local server returns this address to ec.school.edu. You can see that obtaining the IP address for 1 hostname takes 8 DNS messages: 4 queries and 4 replies. DNS caching, which I will describe later in this post, is used to reduce this number.

Let's make this clear with the help of a figure:

Figure 1: Finding the IP address of cs.stanford.edu through the local, root, TLD and authoritative DNS servers
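The exchange described above can be imitated with a small Python simulation that counts every query and every reply as one message. Nothing here touches the network: the server names follow the example where the text gives them, "tld.edu" is a made-up label for the .edu TLD server, and the IP address is a placeholder from the 192.0.2.0/24 documentation range.

```python
# Hypothetical server contents for the simulation.
SERVERS = {
    "root": {"referrals": {"edu": "tld.edu"}},
    "tld.edu": {"referrals": {"stanford.edu": "dns.stanford.edu"}},
    "dns.stanford.edu": {"records": {"cs.stanford.edu": "192.0.2.10"}},
}

def ask(server, hostname):
    """One query/reply exchange: returns ('answer', ip) or ('referral', next_server)."""
    data = SERVERS[server]
    if hostname in data.get("records", {}):
        return "answer", data["records"][hostname]
    for suffix, next_server in data.get("referrals", {}).items():
        if hostname == suffix or hostname.endswith("." + suffix):
            return "referral", next_server
    raise LookupError(f"{server} knows nothing about {hostname}")

def resolve_via_local_server(hostname):
    """Act as the local DNS server and count the messages exchanged."""
    messages = 2          # the host's query to the local server and the final reply
    server = "root"
    while True:
        kind, value = ask(server, hostname)
        messages += 2     # one query to this server plus its reply
        if kind == "answer":
            return value, messages
        server = value    # follow the referral one level down the hierarchy

ip, count = resolve_via_local_server("cs.stanford.edu")
print(ip, count)  # 192.0.2.10 8
```

Adding one more level of referral, such as a department server below dns.stanford.edu, would raise the count by 2 messages.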

Here we have assumed that the TLD server knows the address of the authoritative server for the hostname, but in the real world that might not be the case. For example, Stanford University has a DNS server, dns.stanford.edu, and the individual departments might have their own DNS servers that act as the authoritative servers for the hosts in that department. In that case the local server sends the query for cs.stanford.edu to the Stanford DNS server, dns.stanford.edu, which returns the IP address of the CS department's authoritative server, dns.cs.stanford.edu. Finally the local server sends the query directly to dns.cs.stanford.edu, which returns the desired IP address of the host. In this case a total of 10 DNS messages are sent.

A figure for this scenario:

Figure 2: Finding the IP address of cs.stanford.edu when the department has its own authoritative DNS server

There are 2 types of queries.

i) Recursive Query
ii) Iterative Query

The query sent from ec.school.edu to dns.school.edu is recursive, because it asks dns.school.edu to obtain the mapping on its behalf. The subsequent queries are iterative, since each reply is returned directly to dns.school.edu. In Figure 1 and Figure 2, only the query sent from ec.school.edu to dns.school.edu is recursive; all the other queries are iterative.

Diagram for Recursive Queries:

Figure: Obtaining the IP address of a host from the authoritative DNS server using only recursive queries

On the real Internet, queries usually follow the pattern of Figure 1 and Figure 2.

DNS Caching :

DNS caching is an important aspect of DNS. It is heavily used on the real Internet to reduce delays and to reduce the number of DNS queries travelling around the Internet.

Let me use the Stanford example above to explain DNS caching.

Here ec.school.edu queries the local DNS server for the IP address of cs.stanford.edu. After completing this request, the local DNS server saves the mapping in its own memory. If any other host from the school then queries for cs.stanford.edu, the local server can reply from its own memory much faster. This is known as DNS caching. A local DNS server can also cache the addresses of TLD servers, which lets it bypass the root servers.

But cached entries are discarded after some period of time, because the mapping between hostnames and IP addresses is not permanent.
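A cache whose entries expire can be sketched in a few lines of Python. The class name, the 60-second lifetime and the placeholder address are assumptions for this illustration; real DNS servers take the lifetime from a value carried in each reply.

```python
import time

class DnsCache:
    """Remember hostname -> address mappings until their lifetime runs out."""

    def __init__(self, clock=time.monotonic):
        self._entries = {}   # hostname -> (address, expiry time)
        self._clock = clock  # injectable so expiry can be demonstrated without waiting

    def put(self, hostname, address, ttl_seconds):
        self._entries[hostname] = (address, self._clock() + ttl_seconds)

    def get(self, hostname):
        entry = self._entries.get(hostname)
        if entry is None:
            return None      # never cached: the hierarchy must be queried
        address, expires_at = entry
        if self._clock() >= expires_at:
            del self._entries[hostname]
            return None      # too old: the mapping may have changed
        return address

# A fake clock lets the example jump forward in time.
now = [0.0]
cache = DnsCache(clock=lambda: now[0])
cache.put("cs.stanford.edu", "192.0.2.10", ttl_seconds=60)
print(cache.get("cs.stanford.edu"))  # 192.0.2.10, answered from memory
now[0] = 61.0
print(cache.get("cs.stanford.edu"))  # None, the entry has expired
```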

DNS

DOMAIN NAME SYSTEM (DNS)

After covering HTTP, FTP and SMTP, we will now discuss another Application Layer protocol, DNS. DNS stands for Domain Name System.

Before starting, let me ask you how you identify human beings. I am sure your answer will be: by their names. But there are other ways of identifying a person, such as by driving licence or passport number. For example, suppose you work in an industry with thousands of employees, and a database stores the information of every employee under that employee's serial number. For the database, your serial number is the appropriate way to identify you. But your friend will not use that serial number; he will call you by your name. So humans can be identified in different ways, each used where it is most appropriate.

Similarly, Internet hosts are identified in more than one way. One way is by hostname, for example www.google.com, yahoo.in or network.edu. Hostnames are appreciated by humans because they are easy to read. A hostname provides a little information about the host: if a hostname is www.school.edu.fr, the .fr at the end suggests that the host might be located in France. Beyond that it tells us nothing.

But hostnames can be of variable length, and it would be difficult for routers to process them. For this reason, hosts are also identified by IP addresses.

IP addresses are fixed-length numbers. An IPv4 address is 32 bits (4 bytes), such as 198.168.32.45. Each byte (8 bits), separated from the next by a dot, can hold a number from 0 to 255. The 4 bytes follow a hierarchical structure. When you read a postal address on a letter, each further line gives you a more specific idea of where the address is located. In the same way, as we scan an IP address from left to right, we get more and more specific information about where the host is located.

Importance of Domain Name System (DNS) :

Above we have discussed two ways of identifying a host: by hostname or by IP address. Humans prefer hostnames, while routers prefer IP addresses. To reconcile these preferences, we need a directory that translates hostnames into IP addresses that routers understand. This work is done by the Domain Name System.

Therefore we can say that DNS is a distributed database implemented in a hierarchy of DNS servers.

DNS is also an Application Layer protocol that allows hosts to query that database. Other Application Layer protocols, such as HTTP, use DNS to translate user-supplied hostnames into IP addresses.

Let's discuss this with an example. Say you type a URL in your browser (an HTTP client), www.com2networks.com/images.png. For the client host to send the HTTP request to the web server www.com2networks.com, it must first obtain the IP address of www.com2networks.com. These are the steps that take place when you type the URL in the browser and press Enter.

1. The user's machine runs the client side of the DNS application.

2. The browser extracts the hostname from the URL, i.e. www.com2networks.com, and passes it to the DNS client.

3. The DNS client sends a query containing the hostname to a DNS server.

4. The DNS server replies to the DNS client with the IP address of the requested hostname.

5. Once the browser receives the IP address from the DNS client, it can set up a TCP connection to the HTTP server at that IP address (a connection with the HTTP process at port 80).

You must have noticed that, in addition to the HTTP request-response, there is now a DNS request-response as well, which results in additional delay.
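The five steps above can be sketched in a few lines of Python. This is only an illustration: the function name connection_target is made up, the URL must include the http:// scheme for the parser to find the hostname, and the resolver parameter is an assumption added so that the DNS step can be replaced by a stand-in. By default it uses socket.gethostbyname from the standard library, which asks the system's DNS client.

```python
import socket
from urllib.parse import urlsplit


def connection_target(url, resolver=socket.gethostbyname):
    """Return the (ip_address, port) a browser would connect to for an http:// URL."""
    parts = urlsplit(url)
    hostname = parts.hostname          # step 2: extract the hostname from the URL
    if hostname is None:
        raise ValueError("URL has no hostname")
    ip_address = resolver(hostname)    # steps 3 and 4: query DNS, get the IP address
    port = parts.port or 80            # step 5: HTTP uses port 80 unless told otherwise
    return ip_address, port
```

Calling connection_target("http://www.com2networks.com/images.png") performs a real DNS lookup, so the result depends on the network you run it on.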

DNS servers are often UNIX machines running the Berkeley Internet Name Domain (BIND) software. The DNS protocol runs over UDP and uses port 53.

DNS also provides a few other services. I will tell you about one of the most important ones.

HOST ALIASING:

A hostname can be very complicated to remember, for example east-country.education.girls.school.com. Thus, one or more alias names can be created for it, such as school.com or www.school.com. In this scenario, east-country.education.girls.school.com is said to be the canonical hostname. An application can ask DNS for the canonical hostname of an alias as well as for the IP address of the host.

Another service provided by DNS is load distribution.

Working of DNS and Issues Related With It :

Now you know roughly how DNS works. When the browser wants to translate a hostname into an IP address, it invokes the DNS client. The DNS client in the host sends a query into the network, in a UDP datagram addressed to port 53. After some delay, the DNS client in the user's host receives a reply message, also carried in a UDP datagram, that provides the IP address for the requested hostname. You can see that, from the application's point of view, DNS is a black box that provides a simple translation service behind the scenes. But in reality, the service is quite complicated: it consists of a large number of DNS servers distributed around the globe, plus an application-layer protocol that specifies how the DNS servers and the querying hosts communicate.

Now imagine that there is a single DNS server that contains all the hostname-to-IP-address mappings. The hosts simply send their queries to this single server, and it responds directly to the requesting host. This design is simple, but in today's Internet, where millions of hosts are sending requests at the same time, a single DNS server cannot process all the queries. The problems with this centralized DNS design are:

i) DNS failure: If this single DNS server crashes or stops at some point, then the whole Internet is dead.

ii) Far-away DNS: If the single DNS server is placed in Australia, for example, then all the requests from the USA have to travel halfway around the globe and back, resulting in large delays.

iii) Traffic: There are millions of users around the globe, which makes it almost impossible for a single DNS server to process all the requests.

iv) Maintenance: Every day, a large number of new hosts are added to the Internet. The single DNS server would have to be updated with all of these records, which makes it very difficult to maintain.

You can now see that a centralized DNS is not practical in today's Internet. Thus, DNS is implemented as a distributed system, with servers all over the globe, to provide a better and faster service. We will discuss the distributed DNS in the next post. Now coming to DNS records and the message format.

DNS Records:

The DNS servers that together implement the DNS distributed database store Resource Records (RRs), including RRs that provide the mapping from hostname to IP address. Each DNS reply message contains one or more resource records.

A Resource Record(RR) has four fields:

(Name, Value, Type, TTL)

TTL= Time to Live

TTL determines when the record should be removed from a cache.

The meaning of "Name" and "Value" depends on the "Type" of the record. The four types discussed here are as follows (the TTL field is left out of the examples):

a) If Type=A, then "Name" is a hostname and "Value" is the IP address of that hostname. For example: (shop.kung.com, 127.134.87.197, A). This is a Type A record.

b) If Type=NS, then "Name" is a domain (such as kung.com) and "Value" is the hostname of an authoritative DNS server that knows how to obtain the IP addresses of hosts in that domain. For example: (kung.com, dns.kung.com, NS). This is an NS type record.

c) If Type=CNAME, then "Name" is an alias hostname and "Value" is the canonical hostname for that alias. For example: (kung.com, shop.cloth.metre.kung.com, CNAME). This is a CNAME type record.

d) If Type=MX, then "Name" is an alias hostname and "Value" is the canonical name of the mail server that has this alias. For example: (kung.com, mail.shop.kung.com, MX).

MX records allow the hostnames of mail servers to have simple alias names.
MX records also let an organisation use the same alias name for its mail server and for one of its other servers.
To get the canonical name of the mail server, a DNS client would query for the MX record, and to obtain the canonical name of the other server, the DNS client would query for the CNAME record.
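The four record types above can be sketched as simple tuples in Python. This is only a toy model of a record store, not how a real DNS server is implemented. The example records are copied from the text, while the TTL values (in seconds) and the names ResourceRecord, RECORDS and lookup are made up for illustration.

```python
from collections import namedtuple

# A resource record with the four fields described above.
ResourceRecord = namedtuple("ResourceRecord", ["name", "value", "type", "ttl"])

# Example records from the text; the TTL values are invented.
RECORDS = [
    ResourceRecord("shop.kung.com", "127.134.87.197", "A", 3600),
    ResourceRecord("kung.com", "dns.kung.com", "NS", 86400),
    ResourceRecord("kung.com", "shop.cloth.metre.kung.com", "CNAME", 3600),
    ResourceRecord("kung.com", "mail.shop.kung.com", "MX", 3600),
]


def lookup(records, name, record_type):
    """Return the values of all records that match the given name and type."""
    return [r.value for r in records if r.name == name and r.type == record_type]
```

Here lookup(RECORDS, "kung.com", "MX") gives the canonical name of the mail server, while lookup(RECORDS, "kung.com", "CNAME") gives the canonical name of the other server, even though both queries use the same alias.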

DNS Message Format :

There are two types of DNS messages: the DNS query and the DNS reply. Both have the same format. Let's have a look at the message format of DNS.

1. The first 12 bytes (96 bits) are called the header section, which has 6 fields. The Identifier field is a 16-bit number that identifies the query; it is copied into the reply so that the client can match a reply to the query it sent. The Flags field contains several flags. One of them is a 1-bit query/reply flag: it is set to 0 if the message is a query and to 1 if the message is a reply.

2. The next 4 fields of the header, i.e. No. of Questions, No. of Answers, No. of Authority RRs and No. of Additional Information RRs, give the number of entries in each of the four sections that follow the header.

3. The Question section contains information about the query. This section includes two things: (i) a Name field that contains the name being queried, and (ii) a Type field that indicates the type of question being asked about that name. For example, Type A asks for the host address associated with the name.

4. In a reply from a DNS server, the Answer section contains the resource records for the name that was queried.

5. The Authority section contains records of the authoritative servers.
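To see how the header and the Question section fit together, here is a minimal Python sketch that builds the bytes of a Type A query. The function names build_query and is_reply are made up for illustration. Two details come from the DNS specification (RFC 1035) rather than from this post: the name is sent as length-prefixed labels ending with a zero byte, and the Question section also carries a 16-bit class field (1 for the Internet) after the type.

```python
import struct


def build_query(identifier, hostname):
    """Build a DNS query message asking for the Type A record of hostname."""
    flags = 0x0000            # the query/reply bit (the top bit) is 0 for a query
    header = struct.pack(
        "!HHHHHH",            # six 16-bit fields in network byte order = 12 bytes
        identifier,           # 16-bit identifier
        flags,                # 16-bit flags field
        1,                    # number of questions
        0,                    # number of answer RRs
        0,                    # number of authority RRs
        0,                    # number of additional RRs
    )
    # The name is sent as length-prefixed labels, ended by a zero byte.
    name = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in hostname.split(".")
    ) + b"\x00"
    question = name + struct.pack("!HH", 1, 1)   # type A = 1, class IN = 1
    return header + question


def is_reply(message):
    """Return True if the query/reply bit in the flags field is set."""
    (flags,) = struct.unpack("!H", message[2:4])
    return bool(flags & 0x8000)
```

A real client would send these bytes in a UDP datagram to port 53 of a DNS server and then parse the reply; that part is left out to keep the sketch short.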

This was all I had on the introduction, basics and message format of the Domain Name System. In the next post, continuing with DNS, I will discuss the distributed structure of DNS and DNS caching.

SMTP

Simple Mail Transfer Protocol (SMTP)

After covering the Hyper Text Transfer Protocol (HTTP) and the File Transfer Protocol (FTP), we are now going to have a look at another application-layer protocol, i.e. the Simple Mail Transfer Protocol (SMTP).

Introduction to Electronic Mail (E-Mail)

Electronic mail, or e-mail, has been among the most popular applications on the Internet for many years. It has become more and more powerful and secure over the years, and it is widely used throughout the globe.

E-mail is an asynchronous service, i.e. you can send and read mails whenever it is convenient for you. You don't need to coordinate with other people's schedules. Nowadays, e-mail has become so powerful that you can attach photos, videos, HTML files or files in any other format and send them.

An Internet mail system has basically 3 components: 1. User agent 2. Mail server 3. Simple Mail Transfer Protocol (SMTP). This diagram will give you an overview of how a mail system works.

[Figure: Transfer of mails between two hosts, showing the user agents, the mail servers with their outgoing message queues, and SMTP]

Now we will discuss all of the above 3 components. To describe them, I will take a sender, James, who is sending an e-mail to a receiver, Steve.

If we want to read, reply to, send or retrieve an e-mail, we do all this with the help of a user agent. For example, Microsoft Outlook is a user agent for e-mail. After James is done writing the mail, his user agent sends the message to his mail server, where it is placed in the mail server's outgoing message queue. When Steve wants to read that mail, his user agent retrieves it from his mailbox in his mail server.

Every recipient has a mailbox located in one of the mail servers. A mail starts from the sender's user agent, travels to the sender's mail server, and from there to the receiver's mail server, where it is stored in the receiver's mailbox. When Steve comes to read the mail in his mailbox, the mail server containing his mailbox authenticates Steve with his username and password.

Mail servers are the core of the e-mail infrastructure. James's mail server must also deal with failures in Steve's mail server. If the sender's mail server is unable to deliver the mail, it holds the mail in its message queue and attempts to deliver it later. The reattempts are mostly made every 30 or 40 minutes. But if the message still cannot be delivered after a few days, the mail server removes it from the queue and informs the sender with an e-mail.
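The hold-and-retry behaviour of the sending mail server can be sketched roughly as follows. This is a simplified model, not the code of any real mail server: the function name process_queue, the try_send callback (a stand-in for the real SMTP delivery) and the max_attempts limit are all assumptions made for illustration. A real server would run such a round on a timer, for example every 30 minutes, and would give up based on elapsed time rather than on a simple counter.

```python
from collections import deque


def process_queue(queue, try_send, max_attempts):
    """Make one delivery attempt for every message currently in the outgoing queue.

    queue holds (message, attempts_so_far) pairs. try_send returns True when
    the message was delivered. Returns the list of messages that were given
    up on, so that the sender can be informed.
    """
    bounced = []
    for _ in range(len(queue)):
        message, attempts = queue.popleft()
        if try_send(message):
            continue                              # delivered, so it leaves the queue
        attempts += 1
        if attempts >= max_attempts:
            bounced.append(message)               # give up and inform the sender
        else:
            queue.append((message, attempts))     # keep it for the next round
    return bounced
```

The queue is a deque so that messages are retried in the order in which they arrived.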

Like HTTP and FTP, SMTP has a client side and a server side. The mail sender behaves as a client, whereas the mail receiver behaves as a server. When a mail server sends mail to another mail server, it acts as an SMTP client. When a mail server receives mail from another mail server, it acts as an SMTP server.

Introduction to Simple Mail transfer Protocol (SMTP)

SMTP is used to transfer mail from the sender's mail server to the recipient's mail server. Along with its many advantages, SMTP also has a disadvantage, or you can say an old-fashioned characteristic. Every message sent over SMTP must be in 7-bit ASCII format. In today's world of multimedia, where a large number of photos and videos are sent by mail, the 7-bit ASCII restriction is a pain. Before binary multimedia data can be sent over SMTP, it has to be encoded into 7-bit ASCII, and on the receiver side it has to be decoded back to binary after transport.
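As an illustration of this encode-and-decode step, here is a small Python sketch using Base64, one common way of turning binary data into 7-bit ASCII text. The post does not say which encoding is used, so treat Base64 as an assumption; the function names are also made up.

```python
import base64


def to_7bit_ascii(binary_data):
    """Encode arbitrary bytes as Base64 text, which uses only 7-bit ASCII characters."""
    return base64.b64encode(binary_data).decode("ascii")


def from_7bit_ascii(text):
    """Decode the Base64 text back into the original bytes on the receiver side."""
    return base64.b64decode(text)
```

The price of this conversion is size: Base64 turns every 3 bytes of binary data into 4 ASCII characters.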

Now let's look at how James sends a mail to Steve:

1. James opens his user agent to send an e-mail. He provides Steve's e-mail address, composes a message and tells the user agent to send it (by clicking on the send button).

2. James's user agent sends this mail to James's mail server, where the message is placed in the message queue.

3. The SMTP client side, which is running on James's mail server, sees the message in the queue. It then opens a TCP connection to the SMTP server, which is running on Steve's mail server.

4. After the SMTP handshaking, the SMTP client sends the message into the TCP connection.

5. The SMTP server side receives this message. Steve's mail server then places the mail in his mailbox.

6. Steve, at his convenience, opens his user agent, authenticates himself to the mail server and reads the mail.

Let me describe these 6 steps with the help of a figure:

[Figure: Sending of a mail from James to Steve through their user agents and mail servers]

An interesting feature of SMTP is that it normally doesn't use any intermediate mail server to deliver the mail. For example, if James's mail server is located in the USA and Steve's server is located in India, then the TCP connection is made directly between the US and the Indian server. No intermediate server is involved. You can say that, if Steve's server is down, the message remains in James's mail server and waits for the next attempt.

A typical exchange between an SMTP client and an SMTP server looks as follows:

Suppose the server (S) is named India.com and the client (C) is named USA.com.

S : 220 India.com

C : HELO USA.com

S : 250 Hello USA.com, Nice to meet you.

C : MAIL FROM:<James@USA.com>

S : 250 James@USA.com.... Sender OK.

C : RCPT TO:<Steve@India.com>

S : 250 Steve@India.com... Recipient OK.

C : DATA

S : 354 Enter Mail, end with "." on a line by itself.

C : Do you have a Grammar Book?

C : What about Atlas?

C : .

S : 250 Message accepted for delivery.

C : QUIT

S : 221 India.com closing connection.

SMTP uses a persistent connection. Therefore, if the sending mail server has several messages for the same receiving mail server, it can send all of them over the same TCP connection, starting each one with a new MAIL FROM command and issuing QUIT only after the last one. In the example above, the two lines "Do you have a Grammar Book?" and "What about Atlas?" are both part of the body of a single message.

You can try the above dialogue in the "Command prompt" of your system.

Before starting, give the command <telnet servername 25>

Here servername is the name of your local mail server, and 25 is the default port number for SMTP. With this command, you are establishing a TCP connection between your local host and the mail server. If the connection is established, you should get a 220 reply from the server. After that, type the client commands shown above to send a mail.
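If you want to prepare the client side of the dialogue before typing it, a small Python sketch like the following can generate the lines in the right order. The function name smtp_client_lines is made up for illustration. One detail is taken from RFC 5321 rather than from this post: a body line that itself starts with "." is sent with an extra "." in front, so that the server does not confuse it with the end of the message.

```python
def smtp_client_lines(client_name, sender, recipient, body_lines):
    """Return the lines the SMTP client sends, in order, for one message."""
    lines = [
        "HELO " + client_name,
        "MAIL FROM:<" + sender + ">",
        "RCPT TO:<" + recipient + ">",
        "DATA",
    ]
    for line in body_lines:
        # A body line starting with "." gets an extra "." in front.
        lines.append("." + line if line.startswith(".") else line)
    lines.append(".")       # a single "." on a line by itself ends the message
    lines.append("QUIT")
    return lines
```

With the names from the example, smtp_client_lines("USA.com", "James@USA.com", "Steve@India.com", ["Do you have a Grammar Book?", "What about Atlas?"]) gives exactly the C lines of the dialogue above. Remember that in a real session you must wait for the server's reply after each command before sending the next one.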

SMTP and HTTP:

HTTP and SMTP are both used to transfer files from one host to another. Both persistent HTTP and SMTP use persistent connections and can send a number of files over the same TCP connection. HTTP transfers files from a web server to a web client, while SMTP transfers mails from one mail server to another mail server.

There are some differences between HTTP and SMTP. These are:

1. HTTP is a pull protocol, whereas SMTP is a push protocol.

HTTP is used by the client to pull information from the server. SMTP is used by the sender's mail server to push information to the receiver's mail server.

2. HTTP transfers the data as it is, in its original form. But SMTP requires every message to be in 7-bit ASCII format. If the message contains binary data or characters that are not 7-bit ASCII, it has to be encoded into 7-bit ASCII before it is sent over SMTP.

****** If you want to know more about SMTP, I suggest you read RFC 5321. ******

I am done with the Simple Mail Transfer Protocol (SMTP). In the coming post, I will tell you about one more application-layer protocol, i.e. the Domain Name System (DNS).