Tuesday, 13 October 2015

Crypt Key


"Crypto" comes from the Greek word for "hidden" and "graphy" means "writing". Cryptography is the study of writing messages in a hidden (encrypted) form.

Security in Computer Networking

Consider today's Internet, which has more than 2 billion users around the globe. People interact with each other, exchange mail, purchase items and carry out many other activities online.
Thus there is a great need to secure users' data from malicious parties. We will call such a malicious party an Intruder.


Suppose there are 2 persons, James and Steve, who want to communicate over the Internet. Let's consider what their basic needs are when they communicate. First of all, they both want the data they exchange to be readable by no third person; it should stay between the two of them. Secondly, the message should not be deleted or modified while it is transmitted; it should reach the destination in its original form. Thirdly, each of them wants to verify that the person he is communicating with is really the person he intends to communicate with. Steve wants to make sure that the person on the other side is James, and James wants to make sure that he is communicating with Steve.

From these scenarios we can identify the basic needs of the communicating parties, along with a 4th need that concerns organisations. These are :

1. Confidentiality :
The term confidentiality means that the content of the data must be understood by the sender and the intended receiver only. An Intruder can steal the data during its transmission, so the message must be encrypted in such a way that the Intruder is unable to understand it.

2. Integrity :
James and Steve want to ensure that the data is not altered during its transmission, neither by an Intruder nor accidentally. The Transport Layer and the Network Layer use checksums to detect accidental errors in a message, but a plain checksum does not protect against deliberate modification. Data Integrity techniques are used for that purpose.

3. Authentication :
In the real world, when you meet a person and talk to him, you can see each other and simply tell each other your names in order to verify each other's identity. But when you communicate over the Internet and cannot see the other party, it is very difficult to establish that person's identity. Authentication means that both parties verify each other's identity over the communication medium.

4. Organisation Security :

As you must be aware, nowadays almost every organisation's network is connected to the public Internet. Thus an Intruder can gain access to the organisation's network, plant harmful malware or worms on its end systems, and read the organisation's secret information. To secure organisations against such attacks, we have Firewalls and Intrusion Detection Systems, which we will cover in later posts.



Now let's have a look at what an Intruder can do and how he can harm the network. Suppose James and Steve are communicating and they want to ensure Confidentiality, Data Integrity and Authentication. An Intruder can do the following things:
i)   Record the messages on the channel.
ii)   Delete or modify the original message during its transmission.
iii)   Impersonate someone else.
If proper measures are not taken, an Intruder can attack in numerous ways. For example, if your username and password are not properly encrypted, an Intruder can steal them. He can mount a Denial of Service (DoS) attack by overloading the network resources so that other users are unable to communicate. There are various other attacks as well. We will discuss each of them and their countermeasures in detail later in this post and in the coming posts.
By now you must have understood that there is a great need to secure our data while transferring it over a communication medium. The techniques used to safeguard our data are known as Cryptographic Techniques.
Cryptographic techniques have developed so much in the past 30-40 years that, between them, they provide Confidentiality, Integrity and Authentication. So there is no need to apply unrelated security algorithms to provide each of these services individually.
Cryptography allows a sender to fool an Intruder by disguising the message with a certain technique or algorithm, so that the Intruder cannot get any information from the data even if he is able to intercept it. The intended receiver, of course, must be able to recover the original data from the disguised data.
Let us suppose James wants to send a message to Steve. For example, James wants to ask "How are you Steve". James's message in its original form is known as Plain Text. James will use an encryption algorithm to encrypt his original message, to protect it from the Intruder. The encrypted message is known as Cipher Text. Cipher Text is not understandable by the Intruder.
But as you know, nowadays almost all encryption and decryption algorithms are published and open to every person on the Internet. Even the Intruder knows these algorithms. So if knowing the encryption method were enough, he could easily decrypt the message. What prevents the Intruder from decrypting and extracting the transmitted message is a secret piece of information known as the key.
A key can be anything, such as a string of characters or numbers. Say in this case the encryption algorithm takes a key A and the message m as input and produces the cipher text as output. The cipher text here will be denoted as A(m).
Cipher Text (C) = A(m)
This means that the plain text m is encrypted using the key A. On the receiving side, Steve will provide a key B and the cipher text to the decryption algorithm, which will generate the plain text.
Plain Text (m) = B(C)
or, substituting C = A(m), we can also write it as
Plain Text (m) = B(A(m))
There are two types of encryption algorithms.
i) Symmetric Key Algorithm :
In these algorithms, both sender and receiver have identical keys, which they share with each other secretly.
ii) Public Key Algorithm :
In these algorithms, a pair of keys is used. One of the keys is known to both sender and receiver, and in fact to the rest of the world. The other key is known only to one of them, either the sender or the receiver, but not to both.
Let us start our discussion of the different encryption algorithms. First we will go through Symmetric Key Algorithms and then move on to public key encryption techniques.

Symmetric Key Algorithm in Cryptography


What have you understood about Cryptography so far? It must be that Cryptography is just putting one thing in place of another using certain techniques, so that the message cannot be understood by the wrong person. We shall now look at some symmetric key algorithms that are 500 years old or more. For Symmetric Key cryptography algorithms, we will denote the key as K.

1. Caesar Cipher :

For example, suppose the following message is to be sent:
       "James, Meet me at University, Steve". 

Caesar Cipher replaces every letter by the letter that comes K places later in the alphabet, wrapping around to 'a' after 'z'. Say the key is K=5. Then every letter in the plain message will be replaced by the letter 5 places later. Therefore in this case, the cipher text will be as follows.
m = James, Meet me at University, Steve
K = 5
Cipher Text (C) = K(m), i.e. the message m encrypted with the key K

'J' will be replaced by 'O', 'a' will be replaced by 'f', and similarly for all the other letters. So our generated cipher text would be,
Cipher Text = Ofrjx, Rjjy rj fy Zsnajwxnyd, Xyjaj
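
The substitution above can be tried out with a few lines of Python. This is only a minimal sketch: the function names `caesar_encrypt` and `caesar_decrypt` are made up for illustration, and it assumes that spaces and punctuation are left unchanged, as in the example.

```python
import string

def caesar_encrypt(plain_text: str, k: int) -> str:
    """Shift every letter k places forward, wrapping around after 'z'."""
    result = []
    for ch in plain_text:
        if ch in string.ascii_lowercase:
            result.append(chr((ord(ch) - ord("a") + k) % 26 + ord("a")))
        elif ch in string.ascii_uppercase:
            result.append(chr((ord(ch) - ord("A") + k) % 26 + ord("A")))
        else:
            result.append(ch)  # spaces and punctuation are left unchanged
    return "".join(result)

def caesar_decrypt(cipher_text: str, k: int) -> str:
    """Decryption is the same shift in the opposite direction."""
    return caesar_encrypt(cipher_text, -k)

message = "James, Meet me at University, Steve"
cipher = caesar_encrypt(message, 5)
print(cipher)                     # Ofrjx, Rjjy rj fy Zsnajwxnyd, Xyjaj
print(caesar_decrypt(cipher, 5))  # James, Meet me at University, Steve
```

Note that the same key K is used on both sides, which is what makes this a symmetric key algorithm.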

Disadvantage of Caesar Cipher


The above cipher text may look very difficult to read, or unintelligible. But as you know, there are only 26 letters in the English alphabet. Thus if the Intruder comes to know that Caesar Cipher is used, it is very easy for him to break the code, as there are only 25 possible key values. He can simply try all 25 keys one by one (the hit and trial method) and will surely obtain the original message in one of his tries.
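
To see how little work the hit and trial method is, here is a rough sketch of it in Python. The helper names `shift` and `brute_force` are invented for this illustration, and the Intruder is assumed to read the 25 candidates himself to spot the one that is proper English.

```python
def shift(text: str, k: int) -> str:
    """Shift each letter k places forward (a negative k shifts backward)."""
    out = []
    for ch in text:
        if ch.isascii() and ch.isalpha():
            base = ord("A") if ch.isupper() else ord("a")
            out.append(chr((ord(ch) - base + k) % 26 + base))
        else:
            out.append(ch)
    return "".join(out)

def brute_force(cipher_text: str) -> dict[int, str]:
    """Return the candidate plain text for every possible key 1..25."""
    return {k: shift(cipher_text, -k) for k in range(1, 26)}

candidates = brute_force("Ofrjx, Rjjy rj fy Zsnajwxnyd, Xyjaj")
for k, text in candidates.items():
    print(k, text)
# Among the 25 printed lines, only the one for k = 5 is readable English.
```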
2. Mono-Alphabetic Cipher :

Mono-alphabetic Cipher is an improvement over Caesar Cipher. In Caesar Cipher, we were only able to replace the letters according to a fixed pattern (shifting by K). But in Mono-alphabetic Cipher, we can replace the letters arbitrarily. This means any letter can be substituted for any other letter, as long as each letter has a unique substitute and the same substitute is used for that letter throughout the message.
Let's take an example of a mono-alphabetic cipher.
The plain message, "James, Meet me at University, Steve" will be encrypted as,
Cipher Text = Pqdtl, Dttz dt qz Xfoctklozn, Lztct
A mono-alphabetic cipher can pair up the letters in 26! different ways, which is about 4 x 10^26. Thus, even if the Intruder knows that you have used a mono-alphabetic cipher, cracking the code with a Brute Force Approach (Hit and Trial Approach) would be an extremely laborious task for him.
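
Here is a small sketch of a mono-alphabetic cipher in Python. The substitution table `KEY` is an assumption: it was worked out from the cipher text in the example (a becomes q, b becomes w, c becomes e, and so on in keyboard order) and makes the same letter substitutions as that example. Any other arrangement of the 26 letters would work equally well as a key.

```python
import math
import string

# Assumed substitution table: the i-th letter of the alphabet is replaced
# by the i-th letter of KEY (a->q, b->w, c->e, ...).
KEY = "qwertyuiopasdfghjklzxcvbnm"

ALPHABET = string.ascii_lowercase + string.ascii_uppercase
ENCRYPT = str.maketrans(ALPHABET, KEY + KEY.upper())
DECRYPT = str.maketrans(KEY + KEY.upper(), ALPHABET)

message = "James, Meet me at University, Steve"
cipher = message.translate(ENCRYPT)
print(cipher)                     # Pqdtl, Dttz dt qz Xfoctklozn, Lztct
print(cipher.translate(DECRYPT))  # James, Meet me at University, Steve

# Number of possible substitution tables: 26! (roughly 4 x 10^26).
print(math.factorial(26))
```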
3. Poly-alphabetic Cipher :


One more improvement over Caesar Cipher is the Poly-alphabetic Cipher. In a poly-alphabetic cipher, we use several Caesar ciphers together; here we use 2. There are 2 keys, say C1 with K=3 and C2 with K=5. Now we choose a pattern in which these 2 keys are used, for example: C1, C1, C2, C2, C1. The 1st letter will be encrypted using the 1st key, the 2nd letter also using the 1st key, the 3rd and 4th letters using the 2nd key, and the 5th letter using the 1st key again. The pattern then repeats for the following letters. Let's take our example:

Plain Text (m) = "James, Meet me at the University, Steve".
Keys: C1 with K=3, C2 with K=5

Cipher Text (C) = Mdrjv, Phjy ph dy ykh Xsnyhuxnwb, Vyjyh
In Poly-alphabetic Cipher, the encryption and decryption key is the knowledge of the two Caesar ciphers (K=3 and K=5) together with the pattern C1, C1, C2, C2, C1.
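
The following sketch reproduces this example. The function name `poly_cipher` and its parameters are made up for illustration. One assumption is worth stating: the pattern advances only on letters, so spaces and commas do not use up a position in the pattern. With that assumption the output matches the cipher text above.

```python
from itertools import cycle

def shift_letter(ch: str, k: int) -> str:
    """Caesar-shift a single letter by k places, keeping its case."""
    base = ord("A") if ch.isupper() else ord("a")
    return chr((ord(ch) - base + k) % 26 + base)

def poly_cipher(text, keys=(3, 5), pattern=(0, 0, 1, 1, 0), decrypt=False):
    """Apply two Caesar keys in the repeating pattern C1, C1, C2, C2, C1.

    pattern holds indexes into keys: 0 stands for C1 and 1 for C2.
    """
    positions = cycle(pattern)
    out = []
    for ch in text:
        if ch.isascii() and ch.isalpha():
            k = keys[next(positions)]
            out.append(shift_letter(ch, -k if decrypt else k))
        else:
            out.append(ch)  # non-letters do not advance the pattern
    return "".join(out)

message = "James, Meet me at the University, Steve"
cipher = poly_cipher(message)
print(cipher)                             # Mdrjv, Phjy ph dy ykh Xsnyhuxnwb, Vyjyh
print(poly_cipher(cipher, decrypt=True))  # James, Meet me at the University, Steve
```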


  • These classical ciphers are far too easy to break to protect communication over today's Internet. Symmetric Key Cryptography in general also has a practical difficulty: the 2 communicating parties need to share the key secretly before they can communicate, which is hard to arrange when they have never met. Public Key Encryption, as used for example in PGP, was developed to overcome this problem.

Public Key Cryptography



Securing Computer Networks using Public Key

As the number of users on the Internet grew, sharing a secret key in advance between every pair of users became impractical. Public key cryptographic techniques came into existence to solve this problem and play a vital role in today's computer networks.

In Public Key Cryptography, instead of a single key shared between sender and receiver, there are 2 keys.
Let us suppose James is the sender and Steve is the receiver.
Steve will have 2 keys: a Public Key, which is available to the whole world, and a Private Key, which is known only to Steve. We will use Us for the public key of Steve and Ks for the private key of Steve.


In order to send a message to Steve securely, James will 1st fetch the public key of Steve (which is known to everyone). James will then encrypt the message m using an encryption algorithm and the public key of Steve. That means James will compute Us(m). On receiving the encrypted message from James, Steve will use a decryption algorithm and his private key to get the plain text. Steve will compute Ks(Us(m)) = m. You can clearly see that James and Steve can send messages to each other securely without sharing or distributing any secret key (which has to be done in Symmetric Key Cryptography).


But a problem arises in such cases. As the public key of Steve is known to everyone around the globe, anyone can encrypt a message using his public key and send it to Steve, pretending to be James. In Symmetric Key Cryptography, as only the sender and the receiver share the key, a correctly encrypted message tells each of them who the other party is. This assurance is no longer available in Public Key Cryptography, as the public key is known to everyone. To overcome this problem of authentication, we have a mechanism known as the Digital Signature, which I will discuss with you later.


Let us now start with some Public Key Encryption Algorithms.


RSA Encryption Algorithm


RSA is named after the scientists who developed it, i.e. Ron Rivest, Adi Shamir and Leonard Adleman. Let's now see the working of RSA. But before coming to the implementation of RSA, let's have a look at some mathematical calculations, as RSA makes heavy use of modulo-n arithmetic for encrypting data. If you compute A modulo n, the result is the remainder left after dividing A by n. For example, if you compute 24 mod 5, the result is 4. Modulo arithmetic obeys the following rules.


  • [ (a mod n) + (b mod n) ] mod n = (a + b) mod n
  • [ (a mod n) - (b mod n) ] mod n = (a - b) mod n
  • [ (a mod n) * (b mod n) ] mod n = (a * b) mod n
  • ( (a mod n)^d ) mod n = (a^d) mod n
When you send a message to someone, it is nothing but a stream of bytes, i.e. a bit pattern. And a bit pattern can be uniquely represented by an integer. For example, the bit pattern 0110 can be written as 6, 1001 can be written as 9, and so on for every bit pattern. Therefore, when you are encrypting a message using RSA, you are actually encrypting an integer.
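
The modulo rules and the bit-pattern idea above can be checked quickly in Python. This is just a sketch with arbitrarily chosen sample values; `check_mod_rules` and `bits_to_int` are names invented for it.

```python
def check_mod_rules(a: int, b: int, n: int, d: int) -> bool:
    """Return True if all four modulo rules hold for the given values."""
    return (
        ((a % n) + (b % n)) % n == (a + b) % n
        and ((a % n) - (b % n)) % n == (a - b) % n
        and ((a % n) * (b % n)) % n == (a * b) % n
        and ((a % n) ** d) % n == (a ** d) % n
    )

def bits_to_int(bits: str) -> int:
    """Interpret a string of 0s and 1s as an unsigned integer."""
    return int(bits, 2)

print(24 % 5)                         # 4
print(check_mod_rules(24, 17, 5, 3))  # True

# The built-in pow(base, exponent, modulus) computes (base^exponent) mod
# modulus without building the huge intermediate power.
print(pow(24, 3, 5) == (24 ** 3) % 5)  # True

print(bits_to_int("0110"))  # 6
print(bits_to_int("1001"))  # 9

# A whole message is also just a (large) integer.
print(int.from_bytes(b"Hi", "big"))  # 18537
```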

Now to generate a public key and a private key in RSA algorithm, Steve will perform the following steps :

1. Choose two large prime numbers  a and b.

Now a question arises: how large must the numbers be? The answer is that the larger the numbers are, the harder it is to break the RSA algorithm. But larger numbers also take more time to encrypt and decrypt. Thus it is recommended that you use prime numbers of the order of 1024 bits each.

2. Calculate n = a * b and  y = ( a-1) * ( b-1 )

3. Select a number e, less than n ( e < n ), such that e and y don't have any common factor other than 1. Such e and y are said to be relatively prime.

e will be used in encrypting the message.

  • For example, 5 and 11 are relatively prime, as they have only 1 as a common factor.
4. Find a number z such that ez-1 is exactly divisible by y. We can also write this as: select z such that
                                                         ez mod y = 1.

z will be used in decrypting the message.
5.  Now the public and the private key of Steve are ready. The public key Us, which will be available to the whole world, is ( n, e ) and his private key Ks is ( n, z ).
James is sending a message m to Steve. The message will be represented by an integer m smaller than n, so that it can be computed on mathematically. James will encrypt the message in this way.
Cipher text or Encrypted Message ( c ) = ( m^e ) mod n
The bit pattern corresponding to c will be sent to Steve.
On the receiving side, to get the original message, Steve will decrypt it in the following way.
Plain text ( m ) = ( c^z ) mod n.
Let us take an example that will make clear how the RSA algorithm works.
Suppose James has a message, m=5, that he wants to send to Steve. Steve first generates his keys with the RSA algorithm.
1. He chooses two prime numbers, say a = 17 and b = 7. (These are tiny values for illustration only.)
2. Calculate n = a * b = 17 * 7 = 119 and y = ( 17-1 ) * ( 7-1 ) = 16 * 6 = 96.
3. Select e=5. 5 and 96 are relatively prime.
4. Calculate z so that ez mod y = 1. We get z = 77, since 5 * 77 = 385 = 4 * 96 + 1.
5. Therefore Steve's public key = ( 119, 5 ) and private key = ( 119, 77 ).
Thus while encrypting the message before sending it, James will do the following computation. 
Cipher Text ( c ) = ( m^e) mod n  = ( 5^5) mod 119 = 3125 mod 119 = 31
On the receiving side, while decrypting the cipher text, Steve will do the following computation.
Plain Text ( m ) = (c^z) mod n = ( 31^77) mod 119 = 5
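
The whole worked example can be replayed in Python. Treat this as a toy sketch only: the function names are invented, the primes are far too small to be secure, and real RSA libraries add padding and other safeguards that are not shown here. The check that e shares no factor with y = (a-1)(b-1) is the standard RSA requirement, and `pow(e, -1, y)` (available from Python 3.8) finds the z with ez mod y = 1.

```python
from math import gcd

def make_keys(a: int, b: int, e: int):
    """Build the (public, private) key pair from primes a, b and exponent e."""
    n = a * b
    y = (a - 1) * (b - 1)
    if gcd(e, y) != 1:
        raise ValueError("e must be relatively prime to y")
    z = pow(e, -1, y)  # modular inverse: (e * z) mod y == 1
    return (n, e), (n, z)

def encrypt(m: int, public_key) -> int:
    n, e = public_key
    return pow(m, e, n)  # c = (m^e) mod n

def decrypt(c: int, private_key) -> int:
    n, z = private_key
    return pow(c, z, n)  # m = (c^z) mod n

public_key, private_key = make_keys(17, 7, 5)
print(public_key, private_key)  # (119, 5) (119, 77)

c = encrypt(5, public_key)
print(c)                        # 31
print(decrypt(c, private_key))  # 5
```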


As you can see, the arithmetic itself is simple. RSA is secure because n is public but its prime factors a and b are not, and no fast method is known for factoring a very large n. Without a and b, an Intruder cannot compute y and hence z, so breaking the code and getting the plain text is an extremely laborious task for him.

Transport Layer 4

Introduction to Transport Layer

Now, after covering almost every aspect of the Application Layer and the Application Layer protocols, we start our discussion of the next layer down in the Internet stack, the Transport Layer.

Before starting our discussion of the Transport Layer, I want to give you a real-life scenario that will make clear the working of the Transport Layer and the other layers of the Internet stack.

Let us suppose that there are two bungalows, one in India and the other in America. In the bungalow in India lives a person, James, along with his 5 children. In the bungalow in America lives a person, Steve, along with his 4 children. Every Sunday, each of James's 5 children writes a letter to each of Steve's 4 children, so the total number of letters is 20. The children write the letters, put them in envelopes and hand them over to James. James writes the source house address and the destination house address on each envelope and gives it to the postal service of India. The postal service of India adds further addressing information for the country and delivers the envelopes to the American postal service. The American postal service looks at the destination address on the envelopes and delivers the 20 letters to Steve's house. Steve collects the letters from the postman and, going by the child's name on each envelope, gives each letter to the right child.
This example contains both the processes and the layers. Let me explain.
Processes = children
Application Layer messages = envelopes
Hosts = The two Bungalows
Transport Layer Protocol = James and Steve
Network Layer protocol = Postal Service
  • Here you can see that, from the children's point of view, James and Steve are the mail service, but in reality they are just one part of the delivery process.
Suppose James and Steve go on a holiday for 10 days. As their substitutes, Angelina takes the place of James and Ashley takes the place of Steve. Unfortunately, Angelina and Ashley are not able to provide the same service as James and Steve. They often drop letters or lose them, and some letters get eaten by the dogs. In the same way, the services provided by the Transport Layer depend on the type of protocol you use in your network.
  • In this scenario, if the postal service cannot guarantee the maximum time taken to deliver the envelopes, there is no way that James and Steve can guarantee the maximum delay in the delivery of mail between the children. In the same way, the services provided by the Transport Layer are constrained by the services provided by the underlying Network Layer. If the Network Layer cannot guarantee the delay in delivery of messages, then the Transport Layer cannot guarantee it either.
But still, a Transport Layer protocol can guarantee certain services that are not guaranteed by the underlying Network Layer.
Thus, in our coming posts on the Transport Layer and the other layers, always remember this example; it will make the concept of layers easier to understand.
Overview of Transport Layer Protocol:

The Transport Layer sits between the Application Layer and the Network Layer. It provides logical communication between the processes running on different end systems. What does logical communication mean?

Logical communication means that, from an Application's perspective, it is as if the hosts were directly connected to each other, whereas in reality the hosts may be on opposite sides of the earth, connected by many routers and links. The physical infrastructure of the two hosts and of the network in between can differ, so the Application processes use the logical communication provided by the Transport Layer to send messages from one host to the other without worrying about the physical infrastructure used.


Different Protocols of Transport Layer:

There are basically two protocols that the Transport Layer provides to give services to the Application Layer. These are:

1. Transmission Control Protocol (TCP)

2. User Datagram Protocol (UDP)

TCP provides a reliable, connection-oriented service to the Application. On the other hand, UDP provides an unreliable, connection-less service to the Application. It is up to the Application Developer to choose which Transport Layer protocol to use. For example, HTTP is built over TCP.

 Transmission Control Protocol (TCP):


1. TCP provides reliable delivery of data. Even though the Network Layer doesn't provide reliable data transfer, TCP guarantees it. But TCP doesn't guarantee anything about the time taken to deliver a packet.

2. TCP delivers the packets in order. This means the packets reach the Application at the destination in the same order in which they were sent from the source.
3. As TCP provides reliable data transfer and connection-oriented service, it is a comparatively heavy and complex protocol.

4. TCP provides congestion control and flow control.


User Datagram Protocol (UDP):

1. UDP provides unreliable data delivery. It is a connection-less service.

2. UDP doesn't guarantee the delivery of packets in the order they are sent.

3. It's a lightweight protocol.

Relationship Between Network and Transport layer:

On one hand, the Transport Layer provides communication between processes, whereas on the other hand, the Network Layer provides communication between hosts. Recall the above example of the two bungalows and compare: James and Steve deliver letters between children, while the postal service delivers envelopes between houses.
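
What Steve does with the envelopes, handing each one to the right child, can be sketched in a few lines of Python. This is a simplified illustration and not how any operating system actually does it: the function `demultiplex`, the port numbers and the process names are all made up, and only the destination port is used to pick the process.

```python
def demultiplex(segments, bound_ports):
    """Hand each (destination_port, payload) segment to the matching process.

    bound_ports maps a port number to the name of the process listening on
    it. Segments for a port nobody listens on are collected separately.
    """
    delivered = {name: [] for name in bound_ports.values()}
    undeliverable = []
    for dest_port, payload in segments:
        process = bound_ports.get(dest_port)
        if process is None:
            undeliverable.append((dest_port, payload))
        else:
            delivered[process].append(payload)
    return delivered, undeliverable

# Hypothetical processes on one host (the "children" in one bungalow).
bound_ports = {80: "web_server", 53: "dns_server"}

# Segments handed up by the Network Layer (the "postal service").
segments = [(80, b"GET /"), (53, b"query"), (80, b"GET /about"), (9999, b"?")]

delivered, undeliverable = demultiplex(segments, bound_ports)
print(delivered)      # {'web_server': [b'GET /', b'GET /about'], 'dns_server': [b'query']}
print(undeliverable)  # [(9999, b'?')]
```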

TCP

Transmission Control Protocol (TCP)

Now we come to the 2nd Transport Layer protocol, the Transmission Control Protocol (TCP). TCP is such an important protocol in the Internet protocol model that the whole Internet stack is called the TCP/IP stack. TCP was 1st introduced in 1974.

A number of application protocols run over TCP, such as HTTP, FTP and SMTP. A Web Browser uses TCP to connect to the World Wide Web (WWW). TCP is used to deliver mail and to transfer files from one host to another.
TCP is called a Connection-Oriented Protocol because there is a 3-way handshake between the communicating hosts before they exchange data. TCP provides many services that are not available with UDP. It provides reliable delivery of data, detects errors in the data and recovers from them by re-transmission, and delivers the packets in the order in which they were sent by the sender. TCP also provides congestion control and flow control.

TCP provides a full-duplex mode of transmission, i.e. if there is a connection between a Process Y on one host and a Process Z on another host, then Application messages can be sent from Process Y to Process Z and from Process Z to Process Y within the same TCP connection.

A TCP connection is between a single sender and a single receiver. That means TCP is a point-to-point protocol. Multi-casting is not possible with TCP. Multi-casting is the sending of data from a single sender to several receivers in a single send operation.


Problems with TCP:


Due to heavy traffic and congestion in the network, packets might get lost or corrupted. TCP, being a reliable data delivery protocol, detects all these problems and resolves them: the sender re-transmits the lost or corrupt packets, and the TCP receiver arranges the packets in order before passing them to the Application.

All this re-transmission and in-order delivery means that TCP sometimes results in long delays.


How TCP provides Reliable Delivery of Data:

TCP does it by a process of positive acknowledgement. The receiver has to acknowledge the sender by sending it a message if it has received the data correctly. The sender keeps a record of every packet it sends and maintains a timer for it. If the acknowledgement doesn't arrive before the timer expires, the sender re-transmits that packet, assuming that the packet was either lost or corrupted.
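
The idea of "send, wait for the acknowledgement, re-transmit on timeout" can be simulated without any real network. The sketch below is a deliberately simplified model: it sends one packet at a time and waits for its acknowledgement, whereas real TCP keeps many packets in flight. The function `send_reliably` and the `lost` set describing which attempts fail are invented for the illustration.

```python
def send_reliably(packets, lost):
    """Simulate positive acknowledgement with re-transmission.

    lost is a set of (packet_number, attempt_number) pairs. For those
    attempts the packet (or its acknowledgement) is lost, so the sender's
    timer expires and it has to send the packet again.
    """
    delivered = []
    log = []
    for number, packet in enumerate(packets):
        attempt = 1
        while (number, attempt) in lost:
            log.append(f"packet {number}, attempt {attempt}: timer expired, re-transmit")
            attempt += 1
        log.append(f"packet {number}, attempt {attempt}: acknowledged")
        delivered.append(packet)
    return delivered, log

# Packet 1 is lost on its first two attempts.
delivered, log = send_reliably(["a", "b", "c"], lost={(1, 1), (1, 2)})
for line in log:
    print(line)
print(delivered)  # ['a', 'b', 'c']
```

Even though two attempts fail, the Application on the other side still receives every packet, in order.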


TCP Packet Structure:


[Figure: TCP packet structure, showing the source port number, destination port number, sequence number, acknowledgement number, header length, checksum and application message]


1. Source Port Number: It identifies the port of the sending process.

2. Destination Port Number: It identifies the port of the receiving process.

3. Sequence Number and Acknowledgement Number : These are used for the reliable delivery of data. Each of these fields is 32 bits long.

4. Header Length : This field is 4 bits long. It gives the length of the TCP header, counted in 32-bit words. The TCP header can be of variable length due to the presence of the Options field. (In most cases the Options field is empty, and then the TCP header length is 20 bytes.)

5. Unused or Reserved : This is reserved for future use. It is 3 bits long.

6. Flags: 

  • ACK: This bit indicates whether the value in the Acknowledgement Number field is valid, i.e. whether it contains an acknowledgement for data that has been successfully received.
  • RST, SYN and FIN : These 3 bits are used for setting up and tearing down a connection.
  • PSH : If PSH=1, the receiver should pass the data to the upper layer immediately.
  • URG: This bit indicates whether the sender has marked any data as urgent. If URG=1, some data has been marked as urgent, and the 16-bit Urgent Data Pointer field tells where that urgent data ends.


7. Checksum: It is 16 bits long. It is used to check whether the packet has arrived correctly, i.e. to detect errors in it.


8. Application Message: This field is of variable length; its size is limited by the Maximum Segment Size (MSS). It contains the actual Application data that the user wants to transmit.

9. Options : This field is optional and of variable length, a multiple of 32 bits. It is used, for example, when the sender and receiver negotiate the Maximum Segment Size (MSS) at connection setup.


  • It is recommended that you send small packets, because if a packet is too large, fragmentation will take place at the Network Layer, and the loss of any one fragment means the whole packet is lost.
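
To make the header fields concrete, here is a sketch that unpacks the fixed 20-byte part of a TCP header with Python's `struct` module. The layout follows the standard TCP header, which also contains a 16-bit receive window field that is not covered in the list above. The function name `parse_tcp_header` and the sample values are made up for illustration.

```python
import struct

# Fixed 20-byte part of the TCP header in network (big-endian) byte order:
# source port, destination port, sequence number, acknowledgement number,
# header length + reserved bits, flag bits, receive window, checksum,
# urgent data pointer.
TCP_HEADER = struct.Struct("!HHIIBBHHH")

FLAG_BITS = {"FIN": 0x01, "SYN": 0x02, "RST": 0x04,
             "PSH": 0x08, "ACK": 0x10, "URG": 0x20}

def parse_tcp_header(segment: bytes) -> dict:
    """Split a TCP packet into its header fields and its data."""
    (src_port, dst_port, seq, ack, offset_byte, flag_byte,
     window, checksum, urgent_pointer) = TCP_HEADER.unpack(segment[:TCP_HEADER.size])
    header_length = (offset_byte >> 4) * 4  # the field counts 32-bit words
    return {
        "source_port": src_port,
        "destination_port": dst_port,
        "sequence_number": seq,
        "acknowledgement_number": ack,
        "header_length": header_length,
        "flags": [name for name, bit in FLAG_BITS.items() if flag_byte & bit],
        "checksum": checksum,
        "urgent_pointer": urgent_pointer,
        "data": segment[header_length:],
    }

# A hand-made sample packet: no options (header length 5 words = 20 bytes),
# PSH and ACK set, checksum left at 0 for simplicity.
sample = TCP_HEADER.pack(50000, 80, 1000, 500, 5 << 4, 0x18, 65535, 0, 0) + b"hello"
header = parse_tcp_header(sample)
print(header["header_length"])  # 20
print(header["flags"])          # ['PSH', 'ACK']
print(header["data"])           # b'hello'
```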

TCP Implementation:

TCP is implemented in different ways. These are:
iii) TCP VEGAS
iv) TCP SACK
We will discuss all these in details in our coming posts. 

Concept of Sequence Number and Acknowledgement Number

These two numbers are of great importance in TCP, as they enable TCP's reliable data delivery and in-order delivery of packets. Let's take a look and try to understand what these two fields are.

First of all, I would like to tell you that the TCP Sequence and Acknowledgement Numbers do not count packets; they count bytes in the stream of data being transmitted. The Sequence Number of a packet is the byte-stream number of the 1st byte of data in that packet.

The sender and the receiver settle the starting sequence number of the 1st packet when the connection is set up.

Sequence Numbers:

For example, if you have 50,000 bytes of data to send and the Maximum Segment Size (MSS) is 1000 bytes, you have to send 50 packets. Suppose the starting sequence number of the 1st packet is 0. The 2nd packet will have sequence number 1000, the 3rd will have 2000, and so on up to the 50th packet, which has sequence number 49000.


  • MSS refers to the maximum amount of Application data that can be sent in a single TCP packet, not counting the TCP header.
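
The sequence numbers in this example are easy to compute. In the sketch below (the function name is invented), each packet's sequence number is simply the number of the 1st byte it carries. The last of the 50 packets carries bytes 49000 to 49999, so its sequence number is 49000.

```python
def segment_sequence_numbers(total_bytes: int, mss: int, initial_seq: int = 0):
    """Return the sequence number of each packet needed to send total_bytes."""
    return [initial_seq + offset for offset in range(0, total_bytes, mss)]

seqs = segment_sequence_numbers(50_000, 1000)
print(len(seqs))  # 50
print(seqs[:3])   # [0, 1000, 2000]
print(seqs[-1])   # 49000
```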

Acknowledgement Number:

As we have already discussed, TCP is full-duplex, so both sides can send packets to each other at the same time. Let us suppose that James is the sender and Steve is the receiver.
Every packet arriving from James carries a sequence number for the data flowing from James to Steve. The Acknowledgement Number that Steve puts in his packet is the sequence number of the next byte that Steve is expecting from James.
For example, if Steve has received all bytes numbered from 0 to 499 from James, then Steve will put the acknowledgement number 500 in the packet he sends to James.

Cumulative Acknowledgement:


For example, suppose Steve has received bytes numbered from 0 to 499 and bytes from 1000 to 1099. Due to some failure or packet loss, Steve has not received bytes 500 to 999, and he is still waiting for byte 500 in order to rebuild the whole message from James in the proper order. Thus Steve's next packet will contain the acknowledgement number 500. Since TCP acknowledges bytes only up to the 1st missing byte, this is known as Cumulative Acknowledgement.
This brings up one more concern. Steve received the packet with bytes 0 to 499 and then the packet with bytes 1000 to 1099, while the packet with bytes 500 to 999 is still missing, so the 3rd packet has arrived out of order. Now a question arises: what should the receiver do when it receives a packet out of order?
The TCP specification doesn't impose any rule on this. It depends entirely on the people programming the TCP implementation. They can choose either of these:
1. The receiver can discard the out-of-order packet as soon as it receives it.
or
2. The receiver can save the out-of-order packet in its buffer and wait for the missing packet to arrive.
  • Most implementers choose the 2nd option, as it uses the network bandwidth more efficiently and shortens delays. In the Internet, the 2nd choice is used in practice.
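
Steve's choice of acknowledgement number can be written as a small function. This is a sketch with an invented name, `cumulative_ack`; it assumes the receiver follows the 2nd option and keeps out-of-order bytes in its buffer.

```python
def cumulative_ack(received_ranges, initial_seq: int = 0) -> int:
    """Return the acknowledgement number: the 1st byte not yet received.

    received_ranges is a list of (first_byte, last_byte) pairs, inclusive,
    describing everything the receiver has in its buffer so far.
    """
    expected = initial_seq
    for first, last in sorted(received_ranges):
        if first > expected:
            break  # there is a gap, so nothing beyond it can be acknowledged
        expected = max(expected, last + 1)
    return expected

print(cumulative_ack([(0, 499)]))                            # 500
print(cumulative_ack([(0, 499), (1000, 1099)]))              # 500
print(cumulative_ack([(0, 499), (1000, 1099), (500, 999)]))  # 1100
```

The last call shows the benefit of buffering: as soon as the missing bytes 500 to 999 arrive, the acknowledgement number jumps straight to 1100 and the buffered packet does not have to be sent again.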


There are various interesting cases in which a packet sometimes has to be re-transmitted, and sometimes need not be, depending on what happens to its acknowledgement. Let's have a look at these scenarios.



Scene 1: Re-transmission due to lost Acknowledgment :


[Figure: timeline between sender and receiver, showing sequence numbers, acknowledgement numbers and the time taken by each packet to be delivered]




Scene 2: Cumulative Acknowledgment avoids re-transmission of the 1st packet.

[Figure: cumulative acknowledgement timeline between sender and receiver, showing sequence numbers, acknowledgement numbers and the time taken by each packet to be delivered]
Suppose the sender sends a packet of 15 bytes and a packet of 20 bytes back to back. Both are received at the receiver's side, and the receiver sends an acknowledgement for each. The acknowledgement for the 1st packet is lost in the network, but before the timeout occurs the sender receives the acknowledgement for the 2nd packet. Because acknowledgements are cumulative, the sender understands that the receiver has received everything up to sequence number 54, which includes the 1st packet. Hence it will not resend the 1st packet.
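
From the sender's side, this scenario comes down to one comparison per packet. In the sketch below, the starting sequence number 20 is an assumption, chosen so that the last byte of the 2nd packet is byte 54 as in the text; the function name is also made up.

```python
def still_unacknowledged(sent, ack_number: int):
    """Return the sent packets that the given cumulative ack does not cover.

    sent is a list of (sequence_number, length) pairs. A packet is covered
    when all of its bytes lie below the acknowledgement number.
    """
    return [(seq, length) for seq, length in sent if seq + length > ack_number]

# Assumed numbering: a 15-byte packet starting at byte 20 (bytes 20..34),
# followed by a 20-byte packet starting at byte 35 (bytes 35..54).
sent = [(20, 15), (35, 20)]

# Only the acknowledgement for the 2nd packet (ack number 55) arrives.
print(still_unacknowledged(sent, 55))  # [] -> nothing needs to be resent

# If instead only the ack for the 1st packet (ack number 35) had arrived:
print(still_unacknowledged(sent, 35))  # [(35, 20)]
```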

UDP

User Datagram Protocol (UDP)

Now let's start with the very 1st protocol of the Transport Layer, i.e. the User Datagram Protocol (UDP).
UDP is one of the most important protocols in the Transport Layer. It was designed by David P. Reed in 1980. UDP is used by the Transport Layer to pass messages from the Application Layer to the underlying Network Layer.
UDP is said to be a Connection-Less protocol. This means that there is no handshaking between the two hosts before the transmission of packets. UDP provides no guarantee of the delivery of packets. The Transport Layer works over the Network Layer, and UDP works over IP, which itself provides unreliable data delivery. Therefore with UDP you cannot be sure that a message is delivered. UDP just provides simple multiplexing and demultiplexing at the hosts and a checksum for error detection.
Multiplexing/De-multiplexing is performed by the Transport Layer Protocols in order to send the data between the Network Layer and the correct-Application Layer process.
Checksum is a procedure or you can say that it is a mathematical calculation, that is done at both the sending and the receiving host, in order to check that, whether the data has arrived in its original form or not. It checks the correctness of the received data.
**** Now a question must be arising in your mind. That, if UDP doesn't provides any guarantee of transfer of packets, it doesn't provide any flow control, then why should a developer use UDP for his Application?? 
Let me tell you some uses of UDP and why should a Application Developer uses UDP for his Application.
There are various uses of UDP. These are as follows:
1. Stateless Protocol :

UDP doesn't maintain any state about its clients. This is very useful in applications where there are millions of clients and maintaining information about all of them would be a difficult task, such as streaming media.
2. No Re-transmission Delay :

As UDP doesn't provide reliable delivery of data, there is no re-transmission of lost packets. Therefore, it is good for real-time applications such as Voice over IP.

3. Suitable for DNS:

It is a transaction-oriented protocol, so it plays a vital role in query-response protocols such as the Domain Name System (DNS).

4. Small Header : 
The header that UDP adds to the Application Layer message is only 8 bytes, whereas the TCP header is at least 20 bytes. We will discuss headers later in the post.


Working of UDP:

As discussed above, UDP does multiplexing/demultiplexing and some error checking. It just adds a UDP header containing the source port number and the destination port number, and passes the data to the underlying Network Layer. Then the Network Layer adds its own header to the Transport Layer packet and sends it to the destination.
UDP is really helpful in real-time applications such as video conferencing and online gaming, because in such applications losing a packet is preferable to receiving it late.
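To make the demultiplexing step concrete, here is a minimal sketch of how a receiving host could hand a payload to the right process using only the destination port number. The handlers table and the two handler functions are hypothetical; a real operating system does this inside its socket layer.

```python
from typing import Callable, Dict

# Hypothetical table: destination port -> handler of the process bound to it.
handlers: Dict[int, Callable[[bytes], str]] = {
    53: lambda payload: f"DNS process got {payload!r}",
    5004: lambda payload: f"media process got {payload!r}",
}


def demultiplex(dest_port: int, payload: bytes) -> str:
    """Deliver the payload to whichever process owns dest_port."""
    handler = handlers.get(dest_port)
    if handler is None:
        # No process is listening on this port, so the datagram is dropped.
        return "no process on this port; datagram dropped"
    return handler(payload)


print(demultiplex(53, b"query"))
print(demultiplex(9999, b"stray"))
```

Multiplexing is the reverse step on the sending host: each process's data is tagged with its source and destination port before being passed down to the Network Layer.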


Port Numbers:

Ports are identified by a port number. A port number is a 16-bit number, i.e. an application can be assigned a port number from 0 to 65535. Port numbers from 0 to 1023 are the well-known ports, reserved by IANA for standard services such as HTTP, FTP and SMTP. Port numbers from 1024 to 49151 can be registered with IANA for other applications, and port numbers from 49152 to 65535 are dynamic, which means that any newly developed application can use a port number from this range.
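As a small illustration, the function below classifies a port number using the ranges that IANA currently publishes: 0 to 1023 well-known, 1024 to 49151 registered, 49152 to 65535 dynamic. The function name classify_port is made up for this sketch.

```python
def classify_port(port: int) -> str:
    """Return the IANA range that a 16-bit port number falls into."""
    if not 0 <= port <= 0xFFFF:
        raise ValueError("a port number must fit in 16 bits (0-65535)")
    if port <= 1023:
        return "well-known"
    if port <= 49151:
        return "registered"
    return "dynamic"


for port in (80, 8080, 50000):
    print(port, classify_port(port))
```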

UDP Packet Structure:

Figure: UDP packet structure
The UDP header has 4 fields. Each field is 16 bits, or 2 bytes, long. Thus the UDP header length is 8 bytes, or 64 bits (1 byte = 8 bits).


Source and Destination Port Number: 

The source and destination port numbers allow the message to be passed to the correct process running on the end systems or hosts. They are 16 bits long each.


Length:

It tells you the complete length of the UDP packet, i.e. Length = UDP header length + Application Message length. Thus the length will differ between UDP segments, depending upon the size of the Application Message.
The minimum length is 8 bytes, i.e. the size of the header.

Checksum:

The checksum is used on the receiving side to check whether the data has arrived in its original form, i.e. to detect any error introduced during transmission.
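The four header fields described above map directly onto four 16-bit integers, so the header can be sketched with Python's struct module. The helper names build_udp_header and parse_udp_header are made up for this example, and the checksum is left at 0 as a placeholder.

```python
import struct

# "!" = network byte order, "H" = unsigned 16-bit field, four of them.
UDP_HEADER = struct.Struct("!HHHH")


def build_udp_header(src_port: int, dst_port: int, payload: bytes,
                     checksum: int = 0) -> bytes:
    """Pack source port, destination port, length and checksum."""
    length = UDP_HEADER.size + len(payload)  # header + application message
    return UDP_HEADER.pack(src_port, dst_port, length, checksum)


def parse_udp_header(data: bytes) -> dict:
    """Unpack the first 8 bytes of a UDP packet into named fields."""
    src, dst, length, checksum = UDP_HEADER.unpack(data[:UDP_HEADER.size])
    return {"source_port": src, "destination_port": dst,
            "length": length, "checksum": checksum}


header = build_udp_header(40000, 53, b"hello")
print(len(header), parse_udp_header(header)["length"])  # 8 13
```

With a 5-byte message the Length field is 8 + 5 = 13, and with an empty message it would be the minimum value, 8.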


Checksum Calculation:

Checksum is used for Error Detection.

Let us suppose that we have
Source Port Number (S) = 0001101110010101
Destination Port Number (P) = 1010100101110011
Length (L) = 0011011110010001
We will add all these, and the result will be used to compute the Checksum field.
S + P = T
                      0001101110010101
                    + 1010100101110011
                      ----------------
                      1100010100001000   = T
T + L = A
                      1100010100001000
                    + 0011011110010001
                      ----------------
                      1111110010011001   = A


Now we will take the 1's complement of A, i.e. convert all 1's to 0 and all 0's to 1.

Thus the 1's complement of 1111110010011001 is 0000001101100110, and this is our checksum.

We will put this checksum in the header and send the packet. On the receiving side, the receiver will add all 4 fields of the header, i.e. Source + Destination + Length + Checksum.


If the data has arrived correctly, the sum will be equal to 1111111111111111. That is, the receiver will get all 1's after adding the 4 fields. If any bit of the sum is 0, that indicates an error.

  • You can see that UDP provides error checking, but it doesn't provide anything to recover from the error. Some UDP implementations discard a packet with an error, and some pass the packet to the application with a warning.
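The worked example above can be reproduced with a short Python sketch. It follows the simplified header-only calculation of this post; the real UDP checksum also covers a pseudo-header and the application data. The end-around carry (adding an overflow bit back into the low 16 bits) is part of 1's complement addition, although no carry happens with these particular values. The function names are made up for this sketch.

```python
MASK = 0xFFFF  # keep results within 16 bits


def ones_complement_sum(words):
    """Add 16-bit words, wrapping any carry back into the low bits."""
    total = 0
    for word in words:
        total += word
        total = (total & MASK) + (total >> 16)  # end-around carry
    return total


def checksum(words):
    """1's complement of the 1's complement sum of the words."""
    return ~ones_complement_sum(words) & MASK


S = 0b0001101110010101  # source port from the example
P = 0b1010100101110011  # destination port from the example
L = 0b0011011110010001  # length from the example

c = checksum([S, P, L])
print(f"{c:016b}")       # 0000001101100110

# Receiver side: adding all four fields must give all 1's.
check = ones_complement_sum([S, P, L, c])
print(f"{check:016b}")   # 1111111111111111
```

If even one bit of S, P, L or the checksum is flipped in transit, the receiver's sum will contain a 0 somewhere and the error is detected.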

Datagram Network Layer

Network Layer Datagram and its format

There are basically 3 main components of the Network Layer. The 1st component is the routing protocols and routing algorithms. The 2nd component is the IP protocol, which covers the Network Layer datagram format and addressing conventions. The 3rd component is the Internet Control Message Protocol (ICMP), which handles error reporting and router signaling.
Figure: Components of the Network Layer
Let us start with the IP datagram. Remember that a Network Layer packet is called a datagram.

IP Datagram Format

There are 2 versions of IP: IPv4 and IPv6. IPv4 is widely used in the Internet. As the number of users is increasing every day, an alternative to IPv4 (IPv6) was developed in order to provide a larger number of IP addresses to the hosts.

IPv4 Datagram format :



Figure: IPv4 datagram format
1. Version :

It tells the router which version of IP the datagram uses, version 4 or version 6, because different versions are processed differently. This field is of 4 bits.


2. Header Length :

This field tells you the size of the IP header. The IPv4 header has a variable length, because the options field may or may not be present. So the Header Length is used to tell where the actual data starts in an IP datagram. This field is of 4 bits.


3. Type of Service :

This field allows a router to distinguish between different types of datagram, for example real-time datagrams such as those for video calling and non-real-time datagrams such as those for SMTP or FTP. This field is of 8 bits.


4. Datagram Length :

This field indicates the total size of the IP datagram, i.e. IP header + data. This field is of 16 bits.


5.  Header Checksum :

This is used to detect errors in the header of the IP datagram. In most cases, if a router detects an error in a datagram, it discards that datagram. This field is of 16 bits.


6. Upper Layer Protocol :

This field identifies the protocol that is being used in the layer above, for example TCP or UDP. It is used when the datagram reaches its destination: the value 6 in this field indicates that TCP is used at the Transport Layer, and the value 17 indicates that UDP is used. This field is of 8 bits.


7. Time-to-Live (TTL):

This field ensures that a datagram does not circulate in the network for an unlimited period of time, for example in a routing loop. The field is decremented by one every time the datagram is processed by a router. When the TTL becomes 0, the datagram is dropped. This field is of 8 bits.

8. Source and Destination IP address :

The source puts its own IP address in the source field and the address of the final destination in the destination field. The source often gets the IP address of the destination from a DNS lookup. Each of the source and destination address fields is of 32 bits in the IPv4 header.


9. Data :

This field contains the actual data to be transmitted to the destination. In most cases it carries the Transport Layer segment (TCP or UDP). Unlike the header fields, this field has no fixed size; its length is the datagram length minus the header length.


  • The IPv4 header is of 20 bytes, assuming that there are no options. If IP is carrying TCP, then each IPv4 datagram carries a total of 40 bytes of header (a TCP header of 20 bytes and an IP header of 20 bytes) plus the Application Message.

10. Identifiers, Flags and Offset :

These 3 fields are used to break a datagram into smaller fragments when it is larger than the maximum size that a link can carry. The Identifier field is of 16 bits, the Flags field is of 3 bits, and the Fragmentation Offset field is of 13 bits.


You can clearly see that the source IP address field is of 32 bits. That means there can exist 2^32 different IP addresses in the Internet, which is a little over 4 billion IP addresses.
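The field widths listed above add up to exactly 20 bytes, which can be checked by unpacking a header with Python's struct module. This is a rough sketch for a header without options. The sample addresses come from the 192.0.2.0/24 documentation range, the checksum is left at 0 as a placeholder, and parse_ipv4_header is a made-up name. One detail not stated above: the Header Length field counts 32-bit words, so the value 5 means 20 bytes.

```python
import ipaddress
import struct

# version+header length, type of service, datagram length, identifier,
# flags+fragment offset, TTL, upper layer protocol, checksum, source, dest.
IPV4_HEADER = struct.Struct("!BBHHHBBH4s4s")  # 20 bytes without options


def parse_ipv4_header(raw: bytes) -> dict:
    (ver_ihl, tos, total_len, ident, flags_frag,
     ttl, proto, checksum, src, dst) = IPV4_HEADER.unpack(raw[:20])
    return {
        "version": ver_ihl >> 4,                     # upper 4 bits
        "header_length_bytes": (ver_ihl & 0x0F) * 4,  # lower 4 bits, in words
        "type_of_service": tos,
        "datagram_length": total_len,
        "identifier": ident,
        "flags": flags_frag >> 13,                   # upper 3 bits
        "fragment_offset": flags_frag & 0x1FFF,      # lower 13 bits
        "ttl": ttl,
        "upper_layer_protocol": proto,               # 6 = TCP, 17 = UDP
        "header_checksum": checksum,
        "source": str(ipaddress.IPv4Address(src)),
        "destination": str(ipaddress.IPv4Address(dst)),
    }


# A made-up header: version 4, 5 words, 40 bytes in total, TTL 64, TCP.
sample = IPV4_HEADER.pack(
    (4 << 4) | 5, 0, 40, 1, 0, 64, 6, 0,
    ipaddress.IPv4Address("192.0.2.1").packed,
    ipaddress.IPv4Address("192.0.2.2").packed,
)
fields = parse_ipv4_header(sample)
print(fields["version"], fields["header_length_bytes"], fields["ttl"])  # 4 20 64
```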


IPv6 Datagram Format

Figure: IPv6 datagram format
1. Version : 

Same as in IPv4, it tells you the version of IP. In this case the value will be 6. This field is of 4 bits.


2. Traffic Class :

This field is of 8 bits. It is similar to the Type of Service field in IPv4: it lets the router distinguish real-time datagrams from non-real-time ones.


3. Payload Length :

It is a 16 bit field, describing the size of data in IP datagram. It gives you the size of the Data field.

4. Next header :

It tells you the type of the upper layer protocol being used. This field is of 8 bits.


5. Hop Limit :

The hop limit is decreased by one at every router. When it reaches 0, the datagram is discarded. This field is of 8 bits.

6. Source and Destination IP address :

Each of the source and destination address fields is of 128 bits. That means the Internet can now have 2^128 different IP addresses, which is about 3.4 x 10^38, far beyond trillions. Thus the Internet can now grow much bigger than it could with IPv4.

7. Data :

It contains the Transport Layer header along with the original Application message. 

  • As the size of the source and destination IP addresses has increased, the header of an IPv6 datagram is 40 bytes long.
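To get a feeling for the difference between 32-bit and 128-bit addresses, the two address spaces can be compared with Python's standard ipaddress module. The networks 0.0.0.0/0 and ::/0 simply mean "every IPv4 address" and "every IPv6 address".

```python
import ipaddress

ipv4_total = ipaddress.ip_network("0.0.0.0/0").num_addresses  # 2**32
ipv6_total = ipaddress.ip_network("::/0").num_addresses       # 2**128

print(f"IPv4: {ipv4_total:,} addresses")  # IPv4: 4,294,967,296 addresses
print(f"IPv6: {ipv6_total:.2e} addresses")
print(f"IPv6 addresses per IPv4 address: {ipv6_total // ipv4_total:.2e}")
```

The last line shows that there are 2^96 IPv6 addresses for every single IPv4 address.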

Advantages of IPv6 :
1. In IPv6, the size of IP addresses is increased from 32 bits to 128 bits. Therefore, even if you give an IP address to every seed on the Earth, the IP addresses will not run out. That means the Internet world will not run out of IP addresses.
2. A lot of fields are removed in IPv6, such as the Options field. This makes the header fixed-length, resulting in faster processing of IP datagrams.

3. Routers do not fragment IPv6 datagrams. If a datagram is too large for the outgoing link, the router simply drops it and sends a "Packet Too Big" ICMP error message back to the sender.

4. The checksum is removed from the IPv6 header. The designers felt that the checksum at the Transport Layer is sufficient and that a checksum at the Network Layer is redundant, so they decided to remove it.

The IPv4 header contains a TTL field that changes at every hop, so the IPv4 header checksum has to be recomputed at every router. Along with fragmentation, this is a very costly process.


Transformation from IPv4 to IPv6 :

As you must know, most of the routers around the globe work on IPv4, and routers that only support IPv4 cannot handle IPv6 datagrams. So what should be done to make them IPv6 compatible? One solution that has been suggested is to declare a flag day: all the networks of the world would be shut down on that day, and in that time every router would be converted to IPv6. But do you really think this is a feasible solution with millions of systems in the Internet? I don't think so.

Another solution is to make the new IPv6 routers capable of handling both IPv4 and IPv6 datagrams. This is known as the Dual Stack Approach. Nodes or routers that implement both are known as IPv4/IPv6 nodes. But there is still a problem with this approach. Let me explain with an example :

Suppose Router A wants to send an IPv6 datagram to Router F. To reach F, the datagram has to traverse the intermediate routers B, C, D and E, but routers C and D are IPv4-only. Router A sends an IPv6 datagram to B. Since C is IPv4-only, B has to send an IPv4 datagram to C. So router B copies the fields of the IPv6 datagram into an IPv4 datagram, with the appropriate mappings. But some fields in IPv6 don't have a counterpart in IPv4, so those fields are lost. Routers E and F can exchange IPv6 datagrams, but the datagram arriving at E from D no longer contains all the fields originally sent by router A.
To overcome this problem of the Dual Stack Approach, we have a technique known as Tunneling. Tunneling would enable router E to receive the original datagram sent by router A. Let us take the example given below. Suppose the IPv6-compatible routers U and Z want to inter-operate, but the path between them passes through the intermediate routers W and X, which are IPv4-only. These intermediate IPv4 routers are referred to as a Tunnel.
Figure: Tunneling an IPv6 datagram through IPv4 routers
The IPv6 router on the sending side of the tunnel, say router V, puts the complete IPv6 datagram into the data field of an IPv4 datagram. This IPv4 datagram is addressed to router Y, the router on the receiving side of the tunnel, and is sent into the tunnel. The IPv4 routers inside the tunnel route this datagram among themselves without knowing that it contains a complete IPv6 datagram. Finally the datagram reaches router Y, which extracts the IPv6 datagram from the IPv4 datagram and passes it on to Z. In this way, the IPv6 datagram reaches its destination without losing any fields.
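The tunneling idea can be sketched with two small Python classes. This is a toy model, not a real packet format: router names stand in for addresses, and the class and function names are made up. The value 41 is the IANA protocol number that marks an IPv4 payload as an encapsulated IPv6 datagram.

```python
from dataclasses import dataclass

IPV6_IN_IPV4 = 41  # protocol number for IPv6 carried inside IPv4


@dataclass(frozen=True)
class IPv6Datagram:
    source: str
    destination: str
    hop_limit: int
    payload: bytes


@dataclass(frozen=True)
class IPv4Datagram:
    source: str
    destination: str
    protocol: int
    data: object  # here: a whole IPv6 datagram


def enter_tunnel(inner: IPv6Datagram, entry: str, exit_: str) -> IPv4Datagram:
    """Router V: wrap the complete IPv6 datagram in an IPv4 datagram."""
    return IPv4Datagram(entry, exit_, IPV6_IN_IPV4, inner)


def leave_tunnel(outer: IPv4Datagram) -> IPv6Datagram:
    """Router Y: take the IPv6 datagram back out, untouched."""
    if outer.protocol != IPV6_IN_IPV4:
        raise ValueError("this IPv4 datagram does not carry IPv6")
    return outer.data


original = IPv6Datagram("U", "Z", hop_limit=64, payload=b"hello")
in_tunnel = enter_tunnel(original, entry="V", exit_="Y")  # crosses W and X
recovered = leave_tunnel(in_tunnel)
print(recovered == original)  # True
```

Routers W and X only ever look at the outer IPv4 datagram (V to Y), which is why no IPv6 field can be lost on the way.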

Network Layer 5

The Network Layer

The Network Layer is the most complex and most important layer in the TCP/IP or Internet stack. First of all, I would like to tell you that the Application Layer and the Transport Layer don't reside on routers and switches. Only the Network Layer and the layers below it are present on the intermediate routers.

Characteristics of Network Layer:

1. Host Addressing:

Every end system or host must have a specific address, so that it can be reached by the outside world. This address is known as an IP address. IPv4 addresses are of 32 bits, for example 192.145.59.28. IP addresses are hierarchical, much like a postal address: you can be "Steve Bond" for the people of your house, "Steve Bond, 54-Church Street" for the people of the USA, and "Steve Bond, 54-Church Street, USA" for the people of the whole world. The IP address hierarchy works in the same way.

2. Connection-less Service: 

IP is the protocol that carries datagrams at the Network Layer, and it is connection-less. Thus when a datagram travels from sender to receiver, the recipient doesn't send any acknowledgement at this layer.
  • The Network Layer is responsible for Packet Forwarding and Routing . 
Forwarding : When a packet arrives at a router, forwarding decides to which outgoing link the packet should be sent. For example, in the given figure, a packet from Host A arriving at router R1 must be forwarded to the next router on the way to Host B. 

Figure: Forwarding of a packet from one router to another
Routing : Whereas forwarding is about moving a packet from one router to the next, routing involves the whole network. Routing determines the complete path, and therefore the routers, that a packet goes through on its way from sender to receiver. The algorithms that calculate these paths are known as Routing Algorithms. 
There is a forwarding table in every router. Every router examines a header field of the arriving packet, uses its value as an index into the forwarding table, and forwards the packet to the corresponding outgoing link. For example, in the given figure, the header field value of the incoming packet is 0101 and the corresponding outgoing link is 3. Thus the router looks up 0101 in its forwarding table and transfers the packet to output link 3.
          Figure: Values in Forwarding Table at Different Routers
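In its simplest form, a forwarding table is just a lookup from header value to outgoing link. In the sketch below only the pair 0101 -> 3 comes from the example above; the other rows are made up to fill the table, and the function name forward is hypothetical.

```python
# Header value -> outgoing link. Only "0101": 3 is taken from the text.
forwarding_table = {
    "0100": 2,
    "0101": 3,
    "0111": 2,
    "1001": 1,
}


def forward(header_value: str):
    """Return the outgoing link for a packet, or None if no entry matches."""
    return forwarding_table.get(header_value)


print(forward("0101"))  # 3
print(forward("1111"))  # None -> the router has no entry for this value
```

Real routers match on address prefixes rather than exact values, but the idea of "look up the header, pick the link" is the same.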


Now a question must be arising in your mind: how are these forwarding tables configured within a router? This is a very important concept, which describes the relation between forwarding and routing. The routing algorithm determines the values that are inserted into the router's forwarding table. A routing algorithm may be centralized or de-centralized.

Centralized Algorithm :

The algorithm runs on a single, central router, which also updates the records of each of the other routers.

The disadvantage of this technique is that a single router carries a lot of the burden. The other disadvantage is that if the central router goes down, the whole network will fail.

De-Centralized Algorithm :

The routing algorithm runs on every router.

In both cases, the router receives routing protocol messages and updates or configures its forwarding table accordingly.

  • You could imagine all of this forwarding and routing configuration being done by humans, with persons physically present at every router. Every human operator would then have to interact with the others in order to update the forwarding table records, so that the travelling packets reach the correct destination. But as you know, human configuration is more prone to errors and would be very slow in comparison to a routing protocol. Therefore we have software and routing protocols in our networks to automate this work and to provide a much more efficient and faster delivery of packets to the end users.
  • There are a number of algorithms that run in the routers to provide fast and correct delivery of packets, for example the Link State (LS) routing algorithm and the Distance Vector (DV) routing algorithm. We will discuss each of these algorithms in detail in the coming posts. 
IP Address :


As we all know, humans understand words and letters very well, so domain names such as www.google.com or www.com2networks.blogspot.com are easy for us to remember. But what about routers and intermediate switches? Domain names are of variable length, which makes them difficult for routers to process. Therefore, every domain name has a corresponding fixed-length IP address that routers understand. IPv4 addresses are of 32 bits. For example, www.google.com might correspond to an address such as 192.174.43.128. Every section separated by a dot is of 8 bits, so every section can hold one of 2^8 = 256 values, from 0 to 255.

The mapping of these domain names to the corresponding IP addresses is done by a protocol known as the Domain Name System (DNS).
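The dotted notation is just a readable way of writing one 32-bit number, 8 bits per section. The two helper functions below (names made up for this sketch) convert in both directions, using the example address 192.145.59.28 from earlier in the post.

```python
def dotted_to_int(address: str) -> int:
    """Turn 'a.b.c.d' into the 32-bit number it represents."""
    parts = [int(p) for p in address.split(".")]
    if len(parts) != 4 or any(not 0 <= p <= 255 for p in parts):
        raise ValueError("expected four sections, each from 0 to 255")
    value = 0
    for part in parts:
        value = (value << 8) | part  # shift in 8 bits per section
    return value


def int_to_dotted(value: int) -> str:
    """Turn a 32-bit number back into dotted notation."""
    return ".".join(str((value >> shift) & 0xFF) for shift in (24, 16, 8, 0))


number = dotted_to_int("192.145.59.28")
print(f"{number:032b}")       # the 32 bits, 8 per section
print(int_to_dotted(number))  # 192.145.59.28
```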


Services Provided by the Network Layer :

1. Guarantee of Delivery of Packets :

This means the packet will definitely reach its destination.

2. Guaranteed Delivery with Bounded Delay :

This means not only delivery of packet to its destination, but also within the given period of time. For Example : Delivery of packet in 50 msec.

3. Delivery of Packet in-order : 

The packets will reach the destination in the same order as they were sent.

4. Minimal Jitter : 

Jitter is the difference between the delays of two packets of the same message arriving at a destination. Thus the network should provide minimal jitter.


  • Now you must always remember that the Internet's Network Layer doesn't guarantee any of these services. It provides just a single service, known as "Best Effort Service". The Network Layer makes its best effort to deliver packets to the communicating hosts, but it doesn't guarantee anything. 
Thus you can say that Best Effort Service can amount to no service at all. Even a network that delivers no packets would still satisfy the definition of Best Effort Service.


Router Architecture:

Now let's have a brief look at the parts of a router, i.e. what is inside a router. 
Figure : Architecture of a Router
1. Input Ports :

The input port performs various functions inside a router. The incoming link is terminated at the left-most box of the input port. The right-most box of the input port is where the forwarding table is consulted and the corresponding output link is determined.
2. Switching fabric :
The switching fabric connects the input ports of the router to its output ports and to the routing processor.
3. Output Ports :
The output port transmits the packet on the correct outgoing link towards its destination.
4. Routing Processor : 
The routing processor executes the routing protocol, and maintains and updates the forwarding table. It also performs network management functions. 

DDNS

Distributed Domain Name System (DNS)

Continuing from our former post on the Domain Name System (DNS), we will now discuss the Distributed DNS and DNS caching.

With millions of Internet users throughout the globe, a single DNS server is not capable of mapping each and every hostname to its IP address. Thus, a network of DNS servers known as the Distributed DNS is formed. These DNS servers are structured in a hierarchy. There are basically 3 types of DNS servers: Root Servers, Top-Level Domain Servers and Authoritative Servers. Let's have a look at this diagram.

Figure: Hierarchy of DNS servers
1. Root DNS Servers:

There are 13 root servers throughout the globe, named from A to M. Most of them are operated from North America. This doesn't mean that there are only 13 physical root DNS machines. The 13 names correspond to the organisations that look after these root DNS servers, and most of these organisations are in North America. Each root DNS server is replicated at various places to distribute the load and provide better service. The total number of root DNS server instances is around 247, spread throughout the world.


2. Top Level Domain (TLD )Servers:

These servers are responsible for the top-level domain names such as .com, .org, .edu and .gov, and for the country top-level domains such as .in, .us and .fr. For example, Verisign Global Registry Services maintains the TLD servers for the com top-level domain, and Educause maintains the TLD servers for the edu top-level domain.

I recommend reading IANA TLD 2012 to learn more about Top Level Domain Servers.



3. Authoritative Servers:

A company or a university can maintain its own authoritative DNS servers. An organisation whose hosts are publicly accessible on the Internet can provide its own authoritative DNS servers for them.
  • Here is a map showing the DNS servers throughout the globe.

Figure: Locations of the root DNS servers throughout the world

There is also one more type of DNS server, known as the Local DNS server. Every Internet Service Provider (ISP) has a local DNS server. Whenever a host connects to an ISP, the ISP provides it with the IP address of its local DNS server. When a host makes a DNS query, the query is 1st sent to the local DNS server, which forwards it into the DNS server hierarchy.

Let us discuss an example that will make the working of the DNS server hierarchy clear.

Let us suppose that the host ec.school.edu wants the IP address of cs.stanford.edu. The local DNS server of ec.school.edu is dns.school.edu, and the authoritative DNS server of cs.stanford.edu is dns.stanford.edu. The host ec.school.edu 1st sends a DNS query to its local DNS server, asking it to translate the hostname cs.stanford.edu into its IP address. The local DNS server forwards the query to a root DNS server. The root DNS server notes the .edu suffix and returns to the local DNS server a list of IP addresses of the TLD servers responsible for .edu. The local DNS server then re-sends the query to one of these TLD servers. The TLD server notes the stanford.edu suffix and responds with the IP address of the authoritative DNS server of Stanford University, named dns.stanford.edu. The local DNS server now sends the final query to dns.stanford.edu, which responds with the IP address of cs.stanford.edu. You can see that to obtain the IP address for 1 hostname, 8 DNS messages are sent: 4 queries and 4 replies. To reduce the number of messages, DNS caching is used, which I will explain later in this post.

Let's make this clear with the help of a figure :


Figure 1: Finding the IP address of a host through the root, TLD and authoritative DNS servers
Here we have assumed that the TLD server knows the address of the authoritative server, but in the real world this might not be the case. For example, Stanford University has a DNS server dns.stanford.edu, and the individual departments of the university might have their own DNS servers, which act as the authoritative servers for the hosts in that department. In that case the TLD server only knows dns.stanford.edu. The local server sends the query for cs.stanford.edu to the Stanford DNS server, dns.stanford.edu, which returns the IP address of the authoritative server of the CS department, dns.cs.stanford.edu. Finally the local server sends the query directly to the authoritative DNS server of the CS department, which returns the desired IP address of the host. In this case, a total of 10 DNS messages are sent.

A figure for this scenario:

Figure 2: Finding the IP address of a host through a department's authoritative DNS server

There are 2 types of queries.

i) Recursive Query
ii) Iterative Query

The query sent from ec.school.edu to dns.school.edu is recursive, as it asks dns.school.edu to obtain the mapping on its behalf. The subsequent queries are iterative, since each reply is returned directly to dns.school.edu, which then sends the next query itself. In Figure 1 and Figure 2, only the query sent from ec.school.edu to dns.school.edu is recursive; all the other queries are iterative.

Diagram for Recursive Queries:

Figure: Resolving a hostname using only recursive queries

  • In the real Internet, queries typically follow the pattern of Figure 1 and Figure 2.
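The Figure 1 walk-through can be imitated with a few dictionaries. This is a toy simulation, not real DNS: the SERVERS table, the server labels "root" and "edu-tld", and the address 192.0.2.10 (from the documentation address range) are all made up for illustration. It counts one query and one reply for every exchange.

```python
# Hypothetical stand-ins for the servers of the walk-through.
# Each server maps a name suffix to ("referral", next server) or ("answer", ip).
SERVERS = {
    "root": {"edu": ("referral", "edu-tld")},
    "edu-tld": {"stanford.edu": ("referral", "dns.stanford.edu")},
    "dns.stanford.edu": {"cs.stanford.edu": ("answer", "192.0.2.10")},
}


def ask(server: str, hostname: str):
    """One query/reply exchange with a server."""
    for suffix, reply in SERVERS[server].items():
        if hostname == suffix or hostname.endswith("." + suffix):
            return reply
    raise LookupError(f"{server} knows nothing about {hostname}")


def local_dns_resolve(hostname: str):
    """What dns.school.edu does for its host; returns (ip, message count)."""
    messages = 2  # recursive query from the host, and the final reply to it
    server = "root"
    while True:
        kind, value = ask(server, hostname)
        messages += 2  # one iterative query plus its reply
        if kind == "answer":
            return value, messages
        server = value  # follow the referral and ask again


print(local_dns_resolve("cs.stanford.edu"))  # ('192.0.2.10', 8)
```

The host sends a single recursive query, while the local DNS server does all the iterative work; adding a department-level server to the table would add one more exchange and give the 10 messages of Figure 2.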
DNS Caching :


DNS caching is an important aspect of DNS. It is heavily used in the real Internet to reduce delays and to reduce the number of DNS queries travelling around the Internet.

Let me take the above Stanford example, and you will understand DNS caching very well.

Here ec.school.edu queries the local DNS server to get the IP address of cs.stanford.edu. After completing this request, the local DNS server saves the mapping in its own memory. Therefore, if any other host from the school queries for cs.stanford.edu again, the local server can reply from its own memory at a much faster pace. This is known as DNS caching. The local DNS server can also cache the addresses of the TLD servers, in order to bypass the root servers.

But a cached entry is removed after some period of time, as the mapping between a hostname and an IP address is not permanent.
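A cache with expiry can be sketched as a small Python class. The class name DnsCache, the 60-second lifetime and the address 192.0.2.10 are assumptions made for this illustration; in real DNS the lifetime comes from the TTL value carried in each reply.

```python
import time


class DnsCache:
    """Remember hostname -> IP mappings until they expire."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock   # injectable so expiry can be demonstrated
        self._entries = {}    # hostname -> (ip, expiry time)

    def put(self, hostname: str, ip: str, ttl_seconds: float) -> None:
        self._entries[hostname] = (ip, self._clock() + ttl_seconds)

    def get(self, hostname: str):
        entry = self._entries.get(hostname)
        if entry is None:
            return None                   # never cached: ask the hierarchy
        ip, expires_at = entry
        if self._clock() >= expires_at:
            del self._entries[hostname]   # mapping is too old: forget it
            return None
        return ip


# A fake clock, so the example does not have to wait for real time to pass.
now = [0.0]
cache = DnsCache(clock=lambda: now[0])

cache.put("cs.stanford.edu", "192.0.2.10", ttl_seconds=60)
print(cache.get("cs.stanford.edu"))  # 192.0.2.10 (answered from memory)

now[0] = 61.0
print(cache.get("cs.stanford.edu"))  # None (entry expired, must query again)
```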