Monday, April 1, 2019
Security Issues in Peer-to-peer Networking
ACKNOWLEDGEMENTS

My interest in the field of networking drove me to take computer networking as my course in M.Sc. There are many different types of networks. Among them, the most popular and upcoming trend is peer-to-peer networks. This report, my final dissertation for the partial fulfilment of my M.Sc. in computer networking, would not have been possible without the support of my supervisor, Mr. Harry Benetatos. He helped me a lot by guiding me and pinpointing the key mistakes I made during my research. My course leader, Mr. Nicholas Ioannides, also helped me a lot to complete this dissertation. His advice and suggestions gave me a lot of encouragement and support, which made me do this research and finish it in time. I am very thankful to my university, LONDON METROPOLITAN UNIVERSITY, which provided me free access to the IEEE library, where I found the key papers that were very useful for my research. I also thank my parents for the support they have given me in all walks of my life.

DEDICATION

I dedicate this report to my parents and my well-wisher Sakshi for their constant support and encouragement throughout my education and life.

CHAPTER 1
PROJECT INTRODUCTION

1.1 INTRODUCTION TO THE PROJECT

This dissertation is about the security issues in peer-to-peer networks. There are many security issues in peer-to-peer networks. I have chosen to do research on worm intrusions in peer-to-peer networks.
In this document I describe how a worm propagates in the network from one peer to another, how the worm can be detected, and how the detected worm can be attacked to save the network from getting infected.

1.2 AIM

Security issues in peer-to-peer networks
Securing the peer-to-peer network from worms

1.3 OBJECTIVES

To understand how the peers communicate with each other in the peer-to-peer network
To analyse the propagation of worms in the network
To detect the worms near the nodes of the network
To defend the network against the worms

1.4 RESEARCH QUESTION

This document briefly discusses how worms propagate in the network and how they can be detected and attacked in order to save the peer-to-peer network.

1.5 APPROACH

My approach for this dissertation is as follows:
Understanding peer-to-peer networks
Defining the problem
Data collection and analysis
Studying and understanding the existing solutions for the problem
Comparing different solutions
Conclusion

1.6 METHODOLOGY

This section describes the important steps to be followed in order to achieve the objectives mentioned. It also helps to schedule how to develop and complete the different parts of the dissertation.

In this dissertation I will first study peer-to-peer networks and how the peers in the network communicate and share information with the remaining peers. Then I research how the worm propagates in the network, how the worm can be detected, and how the detected worm can be attacked so that the network is restored.
In pictorial form, the different stages of my dissertation are:

1.7 PREVIEW OF THE COMING CHAPTERS IN THE REPORT

The rest of the report is organised as follows. Chapter 2 gives a brief discussion of peer-to-peer networks, the different types of peer-to-peer networks, and their advantages and disadvantages. There is also some information about worms, their nature and the different types of worms. Chapter 3 discusses the methods proposed by different people to detect the worm in the network by matching the characteristic string of the worm. Chapter 4 gives a solution for this issue, that is, a mathematical method of detecting the worm in the network and defending against it. Chapter 5 consists of a critical appraisal and suggestions for further work. Finally, I conclude in chapter 6.

CHAPTER 2
OVERVIEW OF THE GENERIC AREA AND IDENTIFICATION OF PROBLEM

2.1 NETWORK

A network is a group of electronic devices which are connected to each other in order to communicate with each other. The devices can be computers, laptops, printers etc. Networks can be wired or wireless. Wired networks are networks in which the devices are connected with the help of wires. Wireless networks are networks in which the devices are connected without wires. There are many different types of networks, and peer-to-peer is one of the important and special types.

2.2 PEER-TO-PEER NETWORKS

Peer-to-peer networks emerged in the late 1990s with the development of peer-to-peer file sharing applications such as Napster [1]. Peer-to-peer networks, abbreviated as p2p networks, are networks in which all the nodes or peers act as servers as well as clients on demand. This is unlike the typical client-server model, in which the clients request the services and the server supplies the resources.
In peer-to-peer networks, however, every node requests services like a client and every node supplies resources like a server on demand. A peer-to-peer network does not need any centralized server coordination. A peer-to-peer network is scalable. Adding new nodes to the network or removing existing nodes does not affect the network. That means addition or removal of nodes can be done dynamically. All the nodes connected in a peer-to-peer network run the same network protocol and software. Resources available on one node are available to the remaining nodes of the network, and they can access this information easily. Peer-to-peer networks provide robustness and scalability. Both wired and wireless networks can be configured as peer-to-peer networks. Home networks and small enterprise networks are well suited to a peer-to-peer configuration. Most networks are not pure peer-to-peer networks because they use some network interface devices. In the beginning, the information was stored at all the nodes by making a copy of it, but this increased the flow of traffic in the network. Now a centralised system is maintained by the network, and the requests are directed to the nodes which contain the relevant information. This saves time and reduces the traffic flow in the network.

2.3 WIRELESS NETWORKS

Devices connected to each other without any wires can also be configured as peer-to-peer networks. For a small number of devices it is preferable to configure the network as a wireless peer-to-peer network, because it is easy to share data in both directions.
It is also cheaper to connect the network as wireless peer-to-peer because we do not need to spend on wires.

Peer-to-peer networks are divided into three types [2]. They are:
Instant messaging networks
Collaborative networks
Affinity community networks

Instant messaging networks
In this type of peer-to-peer network, the users can chat with each other in real time by installing software such as MSN Messenger, AOL Instant Messenger etc.

Collaborative networks
This type of peer-to-peer network is also called distributed computing. It is widely used in the fields of science and biotechnology, where intense computer processing is needed.

Affinity community peer-to-peer networks
This is a type of p2p network where a group of devices is connected only for the purpose of sharing data among them.

Peer-to-peer networks are basically classified into two types. They are:
Structured peer-to-peer networks
Unstructured peer-to-peer networks

2.4 STRUCTURED PEER-TO-PEER NETWORKS

In structured peer-to-peer networks, the nodes connected in the network are fixed. They use a distributed hash table (DHT) for indexing [4]. In a DHT, data is stored in the form of a hash table as (key, value) pairs. Any node that wants to retrieve data can easily do so using the keys. The mapping of values to keys is maintained by all the nodes present in the network, so that there is very little disruption when the set of participants changes. DHT-based networks are very efficient in retrieving resources.

2.5 UNSTRUCTURED PEER-TO-PEER NETWORKS

In unstructured p2p networks, nodes are established arbitrarily. There are three types of unstructured p2p networks. They are:
Pure peer-to-peer
Hybrid peer-to-peer
Centralized peer-to-peer

In pure p2p networks all the nodes in the network are equal.
There won't be any preferred node with a special infrastructure function.

In hybrid p2p networks there are special nodes called supernodes [3]. A supernode can be any node in the network, depending on the momentary need of the network.

A centralized p2p network is a type of hybrid network in which one central system manages the network. The network cannot work without this centralized system.

Basically, every node in a peer-to-peer network holds information about its neighbours in its routing table. The rate of propagation of worms in peer-to-peer networks is higher than in other networks. This is because information about the neighbouring peers can easily be obtained from the routing table of the infected node.

Different types of files are shared between the nodes in peer-to-peer networks. These can be audio files, video files, music files, text documents, books, articles etc. A lot of peer-to-peer software is available these days for sharing files. Some examples are BitTorrent, LimeWire, Shareaza, Kazaa, iMesh, BearShare Lite, eMule, KCeasy, Ares Galaxy, Soulseek, WinMX, Piolet, Gnutella, Overnet, Azureus (Vuze), FrostWire, uTorrent, Morpheus, ANts, Acquisition [5]. There are many more file sharing programs on the market, but these are the top 20 for peer-to-peer networks.

Basically, all the nodes connected together in the network should be configured with the same network protocol, and the same software should be installed on all the nodes, in order to communicate with each other.
Otherwise the nodes in the network cannot communicate, because they are configured with different software or protocols.

2.6 ADVANTAGES OF PEER-TO-PEER NETWORKS [6]

It is most useful for a small business network comprising a very small number of computer systems or devices.
Computers in this network can be configured easily.
A full-time network administrator is not required for p2p networks.
The network is easy to maintain.
Only a single operating system and fewer cables are needed to get connected.
It can be installed easily.
Users can control the shared resources.
The distributed nature of the network increases its robustness.

2.7 DISADVANTAGES OF PEER-TO-PEER NETWORKS [12]

There is no centralised administration.
Back-up has to be performed on each computer individually.
Peer-to-peer networks are not secure.
Every computer in the network behaves as both server and client, which can slow down the performance of the system.
There is legal controversy over copyrights.

2.8 WORM

A worm is a computer malware program, or malicious code, which can multiply itself into several replicas or copies. In general, a worm can be called an autonomous intrusion agent [19]. It does not actually alter the function of the system but passes through it; in this respect a worm is unlike a virus. It intrudes into the network without the intervention of the user.

The first widely known worm was released by Robert T. Morris in 1988 [18]. Today we have billions of systems connected to the internet, but in 1988 there were only 60,000. During that period 10% of the systems on the internet, i.e., 6,000 systems, were infected and almost clogged because of the worm [8].

When a worm enters the system, it hides in the operating system where it cannot be noticed [18]. It drastically slows down the system and affects the other programs in the system.
In the worst cases it could even affect the entire network and slow down the internet across the whole world.

As stated earlier, it replicates itself into multiple copies, sends itself to emails and corrupts them, and sometimes deletes files without user interaction. If it enters our email, it is able to send itself to all the contacts in our address book, and then to all the contacts of those contacts. In this way it propagates, grows and spreads at a high rate.

Worms can even create a backdoor into the computer [11]. This allows attackers to send spam easily.

Some famous worms discovered in 2003 and 2004 are Mydoom, Sobig and Sasser [7]. The Sasser worm affected computers running the Windows 2000 or Windows XP operating system. It restarts the system automatically and crashes it. It spreads to all the nodes in the network.

There are some worms which are unlike the normal worms. These worms are sometimes very useful to the user and are therefore called helpful worms [9]. Sometimes they help users without any interaction with the user. But most of the known worms are harmful and will always try to infect the nodes in the network and affect the performance of the network.

When peer-to-peer networks are attacked by worms, the efficiency of the network drops. So there is a need to prevent worms from entering the network and spreading all over it. The worms should be detected and defended against. If we delay in defending against these worms, they replicate, make many copies of themselves and spread all through the network. This is very dangerous to the network, as it affects the performance and efficiency of the network [10].

CHAPTER 3
RELEVANT WORK DONE BY OTHERS IN ORDER TO SOLVE THE PROBLEM

Many people have proposed solutions to this problem.
First, Zhou L. gave a solution to the p2p worm problem and found that the propagation of worms in p2p networks is very fast compared to other networks [13]. Jayanthkumar performed some simulations of worm propagation from an infected node to other nodes [10]. Wei Yu researched the behaviour of worms in p2p networks [14]. In my research I found one more interesting method of detecting worms in the peer-to-peer network. It is a special method: the authors Yu Yao, Yong Li, Fu-xiang Gao and Ge Yu, in their paper titled A Signature-behaviour-based P2P worm detection approach, proposed a mechanism for detecting known worms in peer-to-peer networks based on characteristic string matching. A worm makes use of vulnerabilities in the network and spreads [15]. They also proposed a detection mechanism for unknown worms based on their behaviour. Their technique mainly consists of characteristic string matching, identifying the application, and unknown worm detection. They give the algorithm for matching the characteristic string of the worm, called the suffix-tree algorithm or suffix-array algorithm. It is efficient and simple, with very low time complexity. As peer-to-peer networks use a fragment transfer technique, there is a chance that the characteristic string of the worm is spread across different blocks of data. During the reorganisation process this characteristic string can again identify the worm. The authors validated their results by simulation and concluded that their method is one of the efficient methods of p2p worm detection.

As mentioned above, this method detects known worms based on characteristic string matching and unknown worms based on their behaviour. In this method they first capture the network packets using the library called LibPcap.
LibPcap is a library that captures network packets on UNIX and Linux platforms. It contains many functions that are useful for capturing network packets. After the data packets are captured with the help of these functions, the non-P2P packets are filtered out, so that only the P2P packets remain. In these P2P packets the known worms are detected using characteristic string matching. This is implemented by two algorithms: the suffix-array algorithm and the dichotomy algorithm. These algorithms are very accurate and are capable of detecting worms in very little time. As mentioned above, peer-to-peer networks follow a fragment transfer mechanism. Hence the characteristic string of the worm can be split across different blocks of data. In this situation it is difficult to detect the worm if the matching is based on a single packet. But if the characteristic string is present in the block, there is still a chance of detecting the worm, because the string is then split across two packets. In that case the parts of the worm characteristic string present in the two data packets need to be restructured. After restructuring, the worm can be detected using the matching mechanism. In this way the known worms in the network are detected using characteristic string matching. The unknown worms in the p2p network can be detected with the help of the behaviour characteristics of the worm at the initial stage of its propagation. This is called behaviour-based detection of unknown p2p worms. In this way both the known and the unknown worms in the network are detected.

3.1 P2P KNOWN WORM DETECTION

There are four steps in detecting known p2p worms. They are:
Deal flow
Technology of identifying the application
Characteristic string matching
Reorganising the characteristic string

3.1.1 DEAL FLOW

In the deal flow step, the flow of data is divided into four steps [16].
Step 1: Extract the p2p data stream from the original data stream.
Step 2: Check the extracted p2p data stream for worms, using characteristic string matching against the worms already present in the feature library.
Step 3: Reorganise the data flow. It now contains the worm characteristic string as well. Go to step 2.
Step 4: Check the data flow for unknown worms using the unknown worm detection techniques.

After the four steps are done, the feature library is updated. The four steps are represented pictorially in Figure 4.

Figure 4: Flow chart representing the four steps to detect worms

3.1.2 TECHNOLOGY OF IDENTIFYING THE APPLICATION

As said earlier, this paper captures the data packets and scans them for known worms with the help of a function library called LibPcap [17]. For this, some rules must already be assigned in the network interface devices. Assigning these rules to the devices is done in the following stepwise procedure:
Find the available network interface devices
Open the network interface device
Compile the rules that we want to attach to the device
Set up the filtering rules on the device
Operate the equipment
Start the process of capturing the packets

There are some rules for identifying a p2p application. They are:
(1) The characteristic information of known p2p applications is used.
(2) If source-destination IP pairs do not use a known P2P application but use TCP and UDP at the same time, then they are p2p.
(3) At a particular time the source pairs (srcIP, srcport) [27] and the destination pairs (dstIP, dstport) [27] are checked to identify whether the traffic is p2p or not. If the number of connection ports is equal to the number of connection IPs, then we can say that it is p2p.

There are situations where these rules are applied wrongly, so some amendments were made to them.
The amendments are as follows. Rule (2) can identify even the mazes that are present, and rule (3) is modified so that, in the detection cycle, the (srcIP, srcport) [27] pairs at the source and the (dstIP, dstport) [27] pairs at the destination are checked. From this they derived that if the number of connection ports is equal to the number of connection IPs, the protocols used are the same. If they are different, then the protocols are different.

3.1.3 CHARACTERISTIC STRING MATCHING

This is the most important section of the paper. Here the authors give definitions of the terms and the algorithms that are used to detect the worm. Two algorithms are mentioned: the suffix-array algorithm and the dichotomy algorithm. The entire process of detecting the worm depends on the efficiency and accuracy of these algorithms.

Before using the suffix-array algorithm, we first need to understand some keywords and rules.

Suffix: a suffix is the part of a string that starts at a particular location and runs to the end of the string. If a suffix of the string S starts at location i, it is written as Suffix(i) = S[i..Len(S)] [27].

Let us understand how strings are compared. The comparison in this paper follows dictionary order. If u and v are two strings, comparing them means comparing u[i] and v[i], starting with i = 1:
If u[i] = v[i], the comparison continues with i + 1.
If u[i] > v[i], string u is greater than string v, i.e., u > v.
If u[i] < v[i], string u is less than string v, i.e., u < v.
If no result has been obtained when i > len(u) or i > len(v), the lengths decide: if len(u) > len(v) then u > v, if len(u) < len(v) then u < v, and if len(u) = len(v) then u = v.

Suffix-array: the suffix-array is denoted by SA. It is a one-dimensional array SA[1], SA[2], SA[3], and so on, holding the starting positions of the suffixes in sorted order, so that Suffix(SA[i]) < Suffix(SA[i+1]).

Rank-array: the rank-array is simply the inverse of SA, written SA^-1. If SA[i] = j, then Rank[j] = i.
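The suffix-array and rank-array definitions can be sketched in Python as follows. This is a minimal illustration on the string "science", not the paper's implementation: the function names `suffix_array` and `rank_array` are hypothetical, positions are 1-based to match the notation in the text, and the suffixes are sorted directly rather than with the faster doubling construction.

```python
def suffix_array(s):
    """Return the 1-based suffix array SA of s.

    SA[i] is the starting position of the i-th smallest suffix.
    Naive construction: sort all suffixes in dictionary order.
    """
    return sorted(range(1, len(s) + 1), key=lambda i: s[i - 1:])


def rank_array(sa):
    """Return Rank as a dict so that Rank[SA[i]] = i (1-based)."""
    return {start: pos for pos, start in enumerate(sa, start=1)}


sa = suffix_array("science")
rank = rank_array(sa)

print(sa)                             # [6, 2, 7, 4, 3, 5, 1]
print([rank[i] for i in range(1, 8)]) # [7, 2, 5, 4, 6, 1, 3]
```

Reading the first output from left to right gives the suffixes in dictionary order: Suffix(6) = "ce" comes first and Suffix(1) = "science" comes last.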
In other words, Rank[i] stores the rank of Suffix(i) among all the suffixes in ascending order.

In this paper the authors take the string "science" as an example and explain everything clearly. The string "science" generates seven suffixes. They are:
Suffix(1) = science
Suffix(2) = cience
Suffix(3) = ience
Suffix(4) = ence
Suffix(5) = nce
Suffix(6) = ce
Suffix(7) = e

When we sort them in dictionary order, the order is as follows:
Suffix(6) = ce
Suffix(2) = cience
Suffix(7) = e
Suffix(4) = ence
Suffix(3) = ience
Suffix(5) = nce
Suffix(1) = science

The suffix-array algorithm follows the doubling idea. First SA_1 and Rank_1 are obtained by comparing characters. Comparing strings amounts to comparing their characters sequentially. From SA_1 and Rank_1 we can derive SA_2 and Rank_2. From SA_2 and Rank_2 we derive SA_4 and Rank_4, and from these SA_8 and Rank_8. The final suffix-array and rank-array are derived by this process. The main steps of the suffix-array algorithm are:

Calculating SA_1 and Rank_1. First all the suffixes are arranged in order of their first letter, SA_1 is generated using the quick sort algorithm, and then Rank_1 is generated.

Comparing the 2k-prefixes of Suffix(i) and Suffix(j) using SA_k and Rank_k. The 2k-prefix of Suffix(i) equals the 2k-prefix of Suffix(j) exactly when Rank_k[SA_k[i]] = Rank_k[SA_k[j]] and Rank_k[SA_k[i] + k] = Rank_k[SA_k[j] + k]. Whether one 2k-prefix is smaller than the other is decided from the same rank values.

The suffix-array algorithm is a sorting algorithm that sorts the suffixes of the characteristic string, so the matching can use the binary search algorithm. The algorithm is as follows:
Step 1: assign the initial values left = 1, right = n and max_match = 0.
Step 2: compute the middle value, mid = (left + right) / 2.
Step 3: compare the characters of Suffix(SA[mid]) and P. The longest common prefix r is helpful in the implementation and the comparison.
If r > max_match, then max_match = r.
Step 4: if Suffix(SA[mid]) < P, then left = mid + 1. If Suffix(SA[mid]) > P, then right = mid - 1. If Suffix(SA[mid]) = P, then go to step 6.
Step 5: if left <= right, go to step 2.
Step 6: if max_match = m, then print that the match is successful.

3.1.4 REORGANISING THE CHARACTERISTIC STRING

In this step the characteristic string is reorganised. If the characteristic string is divided between two different data blocks, then the data block with the partial characteristic string is stored. Basically, all the information about a data block, such as its index, beginning offset, length and so on, is contained in the head of each block. Here a structure Piece is defined which consists of the index of the block, the beginning offset of the block, the length of the character array head and the length of the character array end [18]. Initially each data packet is compared with the characteristic string for matching. If it matches, a warning or alert about the worm is sent to all the users. If the tail of the characteristic string of the worm matches the head of the data block, it is stored in the character array end. If the head of the characteristic string of the worm matches the tail of the data block, it is stored in the corresponding character array head. If the neighbouring data block contains a partial characteristic string of the worm, then the neighbouring strings in the array head and in the array end are reorganised. The reorganised string then goes through characteristic string matching again, and if a worm is detected a warning is again sent to all users saying that a worm has been found. If it does not match, no operation is performed. If the characteristic string is present in the block but is divided between two different data packets, a special term called the character array is introduced. First the matching mechanism is performed on both data packets.
If the matching characteristic string is found, a warning is sent to the users that a worm is present. If only part of the characteristic string is found, it is enough that one of the following requirements is met: the head of the data packet matches the tail of the characteristic string, or the tail of the data packet matches the head of the characteristic string. If these conditions are not satisfied, no operation is performed. If the tail of the data packet contains the partial characteristic string, the data packet is stored in the array. If the length of the characteristic string is m, an array Array[m] is set up. If the head of the data packet contains a part of the characteristic string, that data packet is stored in n consecutive units of the array. Finally, characteristic string matching is performed on this array, and if the worm is detected a warning is sent to all the users. If there is no match, nothing is done.

3.2 DETECTING UNKNOWN P2P WORMS

In the section above we have seen how known worms are detected. But that mechanism is not meant to detect unknown p2p worms. In this section we will see how unknown worms can be detected and the network protected. As we know, in p2p networks a node is able to send information to multiple hosts at the same time. Moreover, the same protocol is used by all the nodes in the network [27]. These characteristics of the network help worms to propagate easily. As discussed above, only known worms can be detected using the characteristic string matching method. The unknown worms are detected based on the behaviour of the node. Some of the detection rules are:
Files with the same content are transferred to multiple hosts in a very short time.
The same protocol is used and the destination port is the same.
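The two behaviour rules can be sketched as a simple check over flow records. This is only an illustration under stated assumptions: the record layout `(time, src, dst, protocol, dst_port, content_id)`, the function name `suspicious_sources`, and the thresholds `min_hosts` and `window` are all hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def suspicious_sources(flows, min_hosts=3, window=10.0):
    """Flag sources that send the same content to many hosts quickly.

    flows: iterable of (time, src, dst, protocol, dst_port, content_id).
    A source is flagged when, for one (protocol, dst_port, content_id)
    combination, it reaches at least min_hosts distinct destinations
    within `window` seconds. Both thresholds are assumed values.
    """
    groups = defaultdict(list)
    for t, src, dst, proto, port, content in flows:
        # Rule 2: same protocol and same destination port.
        # Rule 1: same content, keyed per source.
        groups[(src, proto, port, content)].append((t, dst))

    flagged = set()
    for (src, _proto, _port, _content), events in groups.items():
        events.sort()
        start = 0
        for end in range(len(events)):
            # Shrink the window until it spans at most `window` seconds.
            while events[end][0] - events[start][0] > window:
                start += 1
            hosts = {dst for _, dst in events[start:end + 1]}
            if len(hosts) >= min_hosts:
                flagged.add(src)
                break
    return flagged


flows = [
    (0.0, "A", "h1", "tcp", 6881, "x"),
    (1.0, "A", "h2", "tcp", 6881, "x"),
    (2.0, "A", "h3", "tcp", 6881, "x"),
    (0.0, "B", "h1", "tcp", 6881, "y"),
    (50.0, "B", "h2", "tcp", 6881, "y"),
    (100.0, "B", "h3", "tcp", 6881, "y"),
]
print(suspicious_sources(flows))  # {'A'}
```

Node A is flagged because it sends the same content to three hosts within two seconds, while node B reaches the same number of hosts too slowly to satisfy the first rule.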
If these rules are satisfies by the source port then it allows the p2p worm to propagate. Now, it is necessary to extract the characteristics of worm near the worm propagation nodes. When these characteristics are extracted, they are added to the give library. This data similarity comparison and extracting the characteristics are done using the LCSeq algorithm. But the LCSeq algorithm based on generalized su ffix tree (GST) is the more efficient. The overall idea is that all the suffixes are represented as a tree.And this tree will have some characteristics like each node in a tree is a string and root is the empty string Every suffix can be represented as a running from the root. Every substring can be considered as a prefix of a suffix. To achieve the searching public sub sequence, every node should be set the information of its pendant source string.3.3 EXPERIMENTWe know that the worm body tries to infect the other nodes in the network by send the worm to the specific ports of p2p node. So here the author tried to prove the efficiency of his method by performing an try out. In this experiment he prepared a multiple group worm body and sent it repeatedly at regular(a) intervals of time. Then he captured these packets and extracted their characteristics and compared it with the one that already exist in the feature library.P2p worm is detected separately using different algorithms like BF algorithm, KMP algorithm and suffix-array algorithm and compared their results doing three experiments. In the experiment 1, worm characteristics are in the same packet.. in the experimentSecurity Issues in Peer-to-peer NetworkingSecurity Issues in Peer-to-peer NetworkingACKNOWLEDGEMENTSThe interest in the field of networking, driven me to take the computer networking as my course in M.Sc. there are many different types of networks. Out of them the more popularized and upcoming trend of networks are peer-to-peer networks. 
This report of my final dissertation for the partial fulfilment of my M.Sc, computer networking, would not have been possible without the support of my supervisor, Mr. Harry Benetatos. He helped me a lot by guiding me and pin-pointing the key mistakes which I have done during my research. My course leader Mr. Nicholas Ioannides also helped me a lot to complete this dissertation. His advises and suggestions gave me a lot of encouragement and support which m ade me do this research and finish it in time. I am very thankful to my university, LONDON METROPOLITAN UNIVERSITY which provided me the free access to the IEEE library which helped me to find the key papers which are very useful for my research. I also thank my parents for their support given to me in all walks of my life.DEDICATIONI dedicate this report to my parents and my well wisher Sakshi for their constant support and encouragement throughout my education and life.CHAPTER 1PROJECT INTRODUCTION1.1 INTRODUCTION TO THE PROJECTThis dissertation is all about the security issues in the peer-to-peer networks. There are many security issues in peer-to-peer networks. I have chosen to do research on worm intrusions in peer-to-peer networks. In this document I have mentioned how the worm propagates in the network from one peer to another peer, how the worm can be detected and how the detected worm can be attacked and save the network from getting infected.1.2 AIMSecurity issue in Peer-t o-peer networksSecuring the peer-to-peer network from worms.1.3 OBJECTIVES To understand how the peers communicate with each other in the peer-to-peer network To analyse the propagation of worms in the network. 
To detect the worms near the nodes of the network To defence the worms in the network.1.4 RESEARCH QUESTIONThis document briefly discusses about how the worms propagates in the network and how can it be detected and attacked in order to save the peer-to-peer network1.5 APPROACHMy approach for this dissertation is as follows Understanding peer-to-peer networks Defining the problem Data collection and analysis Study and understanding the existing solutions for the problem Comparing different solutions conclusion1.6 METHODOLOGYThis section of my document contains what important steps to be followed in order to achieve the mentioned objectives. It also helps to schedule how to develop and complete different parts of the dissertation.In this dissertation firstly I will study and u nderstand about the peer-to-peer networks and how the peers in the networks communicate and share information with the remaining peer in the network. Then I do research on how the worm propagates in the network, how can the worm be detected and how the detected worm can be attacked and restore the network. In the pictorial form the different stages of my dissertation are1.7 PREVIEW ABOUT THE COMING CHAPTERS IN THE REPORTThe rest of the report is organised as follows in the chapter 2, there is brief discussion about the peer-to-peer networks, different types of peer-to-peer networks, advantages and disadvantages of the peer-to-peer networks. There is also some information about the worms, its nature and different types of worms. In chapter 3, there is a discussion about the methods given by the different person to detect the worm in the network by the method of matching the characteristic string of the worm. In section 4, there is a solution for this issue. That is mathematical metho d of detecting the worm in the network and defending it. Chapter 5 consists of a critical appraisal and suggestions for the further work. 
Finally, I conclude in chapter 6.

CHAPTER 2
OVERVIEW OF THE GENERIC AREA AND IDENTIFICATION OF PROBLEM

2.1 NETWORK

A network is a group of electronic devices that are connected to each other in order to communicate with each other. The devices can be computers, laptops, printers and so on. Networks can be wired or wireless. Wired networks are networks in which the devices are connected with the help of wires. Wireless networks are networks in which the devices are connected without wires. There are many different types of networks, and peer-to-peer is one of the important and special types.

2.2 PEER-TO-PEER NETWORKS

Peer-to-peer networks emerged in the 1990s with the development of peer-to-peer file sharing systems such as Napster [1]. Peer-to-peer networks, abbreviated as p2p networks, are networks in which every node (peer) acts as a server as well as a client on demand. This is unlike the typical client-server model, in which the clients request services and the server supplies the resources. In a peer-to-peer network, every node requests services like a client and supplies resources like a server on demand. A peer-to-peer network does not need any centralized server coordination. It is also scalable: adding new nodes to the network or removing existing nodes does not affect the network, so nodes can be added or removed dynamically. All the nodes connected in a peer-to-peer network run the same network protocol and software. Resources available on one node are available to the remaining nodes of the network, which can access them easily. Peer-to-peer networks therefore provide robustness and scalability. Both wired and wireless networks can be configured as peer-to-peer networks. Home networks and small enterprise networks are well suited to a peer-to-peer configuration.
Most networks are not pure peer-to-peer networks, because they use some network interface devices. In the beginning, information was stored at every node by making a copy of it, but this increases the traffic in the network. Now a centralised system is maintained by the network, and requests are directed to the nodes that hold the relevant information. This saves time and reduces the traffic flow in the network.

2.3 WIRELESS NETWORKS

Devices connected to each other without any wires can also be configured as a peer-to-peer network. For a small number of devices it is preferable to configure a wireless peer-to-peer network, because it is easy to share data in both directions. It is also cheaper, because there is no need to spend on wires.

Peer-to-peer networks are divided into three types [2]. They are:
Instant messaging networks
Collaborative networks
Affinity community networks

Instant messaging networks
In this type of peer-to-peer network, the users can chat with each other in real time by installing software such as MSN Messenger or AOL Instant Messenger.

Collaborative networks
This type of peer-to-peer network is also called distributed computing. It is widely used in science and biotechnology, where intensive computer processing is needed.

Affinity community peer-to-peer networks
This is a type of p2p network in which a group of devices is connected only for the purpose of sharing data among its members.

Peer-to-peer networks are basically classified into two types. They are:
Structured peer-to-peer networks
Unstructured peer-to-peer networks

2.4 STRUCTURED PEER-TO-PEER NETWORKS

In structured peer-to-peer networks, the connections between the nodes are fixed. They use a distributed hash table (DHT) for indexing [4]. In a DHT, data is stored as (key, value) pairs, as in a hash table.
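As a rough illustration of a (key, value) table spread over several peers, the sketch below places nodes and keys on a hash ring and lets each key belong to the first node at or after its hash. The ring layout, the class name `ToyDHT` and the peer names are assumptions made for this example; the text does not say which DHT scheme is meant.

```python
import bisect
import hashlib


def ring_position(text):
    """Map a string to a point on the ring (its SHA-1 digest as an integer)."""
    return int(hashlib.sha1(text.encode()).hexdigest(), 16)


class ToyDHT:
    """Hypothetical in-memory DHT: every node stores the keys it owns."""

    def __init__(self, node_names):
        # Sorted (position, name) pairs form the ring.
        self.ring = sorted((ring_position(n), n) for n in node_names)
        self.store = {n: {} for n in node_names}

    def owner(self, key):
        """Return the first node clockwise from the key's position."""
        points = [p for p, _ in self.ring]
        i = bisect.bisect_left(points, ring_position(key)) % len(self.ring)
        return self.ring[i][1]

    def put(self, key, value):
        self.store[self.owner(key)][key] = value

    def get(self, key):
        return self.store[self.owner(key)].get(key)


dht = ToyDHT(["peer-a", "peer-b", "peer-c"])
dht.put("song.mp3", "example metadata")
print(dht.owner("song.mp3"), dht.get("song.mp3"))
```

With this layout, adding a peer only moves the keys that fall to the new peer; every other key keeps its owner.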
Any node that wants to retrieve data can easily do so using the keys. The mapping of keys to values is maintained by all the nodes in the network in such a way that a change in the set of participants causes very little disruption. DHT-based networks are very efficient at retrieving resources.

2.5 UNSTRUCTURED PEER-TO-PEER NETWORKS

In an unstructured p2p network, the connections between nodes are established arbitrarily. There are three types of unstructured p2p networks. They are:
Pure peer-to-peer
Hybrid peer-to-peer
Centralized peer-to-peer

In pure p2p networks all the nodes in the network are equal. There is no preferred node with a special infrastructure function.

In hybrid p2p networks there are special nodes called supernodes [3]. A supernode can be any node in the network, depending on the momentary need of the network.

A centralized p2p network is a type of hybrid network in which one central system manages the network. The network cannot work without this centralized system.

Basically, every node in a peer-to-peer network keeps information about its neighbours in its routing table. The rate of propagation of worms in peer-to-peer networks is higher than in other networks, because the information about neighbouring peers can easily be obtained from the routing table of an infected node.

Different types of files are shared between the nodes in peer-to-peer networks. These can be audio files, video files, music files, text documents, books, articles and so on. A lot of peer-to-peer file sharing software is available these days. Some examples are BitTorrent, LimeWire, Shareaza, Kazaa, iMesh, BearShare Lite, eMule, KCeasy, Ares Galaxy, Soulseek, WinMX, Piolet, Gnutella, Overnet, Azureus (Vuze), FrostWire, uTorrent, Morpheus, Ants and Acquisition [5].
There is a lot more file sharing software available, but these are the top 20 file sharing programs for peer-to-peer networks.

Basically, all the nodes connected together in the network should be configured with the same network protocol, and the same software should be installed on all of them, in order to communicate with each other. Nodes that are configured with different software or protocols cannot communicate.

2.6 ADVANTAGES OF PEER-TO-PEER NETWORKS [6]

It is most useful for a small business network comprising a very small number of computer systems or devices.
Computers in this network can be configured easily.
A full-time network administrator is not required for p2p networks.
The network is easy to maintain.
Only a single operating system and fewer cables are needed to get connected.
It can be installed easily.
Users can control the shared resources.
The distributed nature of the network increases its robustness.

2.7 DISADVANTAGES OF THE PEER-TO-PEER NETWORKS [12]

There is no centralised administration.
Back-ups must be performed on each computer individually.
Peer-to-peer networks are not secure.
Every computer in the network behaves as both server and client, which can slow down the performance of the system.
There is legal controversy over copyright.

2.8 WORM

A worm is a computer malware program, or mischievous code, that can replicate itself into several copies. In simple terms, a worm can be called an autonomous intrusion agent [19]. Unlike a virus, it does not actually alter the function of the system but passes through it. It intrudes into the network without any mediation by the user.

The first widely known worm was released by Robert T. Morris in 1988 [18]. Today there are billions of systems connected to the internet, but in 1988 there were only 60,000 systems connected to the internet.
During that period 10% of the internet systems, i.e. 6,000 systems, were infected and almost clogged because of the worm [8].

When a worm enters a system, it hides in the operating system where it cannot be noticed [18]. It drastically slows down the system and affects the other programs on it. In the worst cases it could even affect the entire network and slow down the internet across the whole world.

As said earlier, a worm replicates itself into multiple copies. It can attach itself to emails and corrupt them, sometimes deleting files without user interaction. If it enters an email account, it can send itself to all the contacts in the address book, then to all the contacts of those contacts, and in this way it propagates, grows and spreads at a high rate.

Worms can even create a backdoor into the computer [11], which makes it easy for attackers to send spam.

Some famous worms discovered in 2003 and 2004 are Mydoom, Sobig and Sasser [7]. The Sasser worm affected computers running the Windows 2000 or Windows XP operating systems. It restarts the system automatically and crashes it, and it spreads to all the nodes in the network.

There are some worms which are unlike the normal worms. These worms are sometimes useful to the user and are therefore called helpful worms [9]. Sometimes they help users without any interaction with the user. But most of the known worms are harmful and always try to infect the nodes in the network and degrade its performance.

When peer-to-peer networks are attacked by worms, the efficiency of the network drops. So there is a need to prevent worms from entering the network and spreading all over it. The worms should be detected and defended against. If this is delayed, a worm replicates itself, makes many copies of itself and spreads all through the network.
This is very dangerous to the network, as it affects the performance and efficiency of the network [10].

CHAPTER 3
RELEVANT WORK DONE BY OTHERS IN ORDER TO SOLVE THE PROBLEM

Many people have proposed solutions to this problem. Zhou L. first gave a solution for p2p worms and observed that a worm propagates much faster in a p2p network than in other networks [13]. Jayanthkumar performed simulations of worm propagation from an infected node to other nodes [10]. Wei Yu researched the behaviour of worms in p2p networks [14]. In my research I found one more interesting method of detecting worms in a peer-to-peer network. In their paper titled "A Signature-behaviour-based P2P worm detection approach", Yu Yao, Yong Li, Fu-xiang Gao and Ge Yu propose a mechanism for detecting known worms in peer-to-peer networks based on characteristic string matching. Worms make use of vulnerabilities in the network in order to spread [15]. The authors also propose a detection mechanism for unknown worms based on their behaviour. Their technique mainly consists of characteristic string matching, application identification and unknown worm detection. For matching the characteristic string of the worm they give an algorithm called the suffix-tree algorithm (suffix-array algorithm), which is efficient and simple and has very low time complexity. Because a peer-to-peer network transfers files in fragments, the characteristic string of the worm may be split across different blocks of data. During the reorganisation process, the characteristic string can again be used to identify the worm. The authors also validated their results by simulation.
They showed that their method is one of the efficient methods of p2p worm detection.

As mentioned above, this method detects known worms by characteristic string matching and unknown worms by their behaviour. The network packets are first captured using a function library called LibPcap, which captures network packets on UNIX and Linux platforms and contains many functions that are useful for this purpose. After the data packets have been captured with the help of these functions, the non-P2P packets are filtered out, so that only the P2P packets remain. In these P2P packets the known worms are detected using characteristic string matching. This is implemented by a couple of algorithms: the suffix-array algorithm and the dichotomy (binary search) algorithm. These algorithms are very accurate and can detect worms in very little time.

As mentioned above, peer-to-peer networks follow a fragment transfer mechanism, so the characteristic string of the worm can be split across different blocks of data. In this situation it is difficult to detect the worm if matching is based on a single packet. But if the characteristic string is present in the block, there is still a chance of detecting the worm, because the string will have been split across two packets. The parts of the characteristic string present in the two different data packets then need to be restructured. After restructuring, the worm can be detected using the matching mechanism. In this way a known worm in the network is detected using characteristic string matching.

Unknown worms in the p2p network can be detected from the behaviour characteristics of the worm at the initial stage of its propagation. This can be called behaviour-based detection of unknown p2p worms.
In this way all the known and unknown worms in the network are detected.

3.1 P2P KNOWN WORM DETECTION

There are four steps in detecting known p2p worms. They are:
Deal flow
Technology of identifying the application
Characteristic string matching
Reorganising the characteristic string

3.1.1 DEAL FLOW

In the deal flow, the data flow is processed in four steps [16].
Step 1: Extract the p2p data stream from the original data stream.
Step 2: Check the extracted p2p data stream for worms, using characteristic string matching against the worms already stored in the library.
Step 3: Reorganise the data flow, so that it now contains the worm characteristic string as well. Go to step 2.
Step 4: Check the data flow for unknown worms using unknown worm detection techniques.
After performing the four steps, update the library.

The four steps are represented pictorially in Figure 4.

Figure 4: Flow chart representing the four steps to detect worms

3.1.2 TECHNOLOGY OF IDENTIFYING THE APPLICATION

As said earlier, this paper captures the data packets and scans them for known worms with the help of a function library called LibPcap [17]. For this, filtering rules must already be assigned to the network interface devices. The rules are assigned to those devices in the following stepwise procedure:
Identify the available network interface devices
Open the network interface device
Compile the rules that are to be attached to the device
Set up the filtering rules on the device
Operate the equipment
Start the process of capturing the packets

There are some rules for identifying a p2p application. They are:
(1) Characteristic information of the known p2p applications is used.
(2) If source-destination IP pairs do not use a known P2P application but use TCP and UDP at the same time, then they are p2p.
(3) At a particular time, the source pairs {srcIP, srcport} and the destination pairs {dstIP, dstport} are checked [27]. From these it can be decided whether the traffic is p2p or not: if the number of connection ports is equal to the number of connection IPs, then it is p2p.

There are situations in which these rules do not work reliably, so some amendments were made to them. Rule (2) is amended so that it can identify even Maze, and rule (3) is modified so that the {srcIP, srcport} pairs at the source and the {dstIP, dstport} pairs at the destination are checked within the detection cycle [27]. From this the authors derived that if the number of connection ports is equal to the number of connection IPs, the protocols in use are the same; if the numbers differ, the protocols are different.

3.1.3 CHARACTERISTIC STRING MATCHING

This is the most important section of the paper. Here the authors give definitions of the terms used and the algorithms used to detect the worm. A couple of algorithms are mentioned: the suffix-array algorithm and the dichotomy (binary search) algorithm. The entire process of detecting the worm depends on the efficiency and accuracy of these algorithms.

Before using the suffix-array algorithm, we need to understand some keywords and rules.

Suffix: a suffix is the part of a string that runs from a particular position to the end of the string. If a suffix of the string S starts at position i and runs to the end of S, it is written Suffix(i) = S[i..Len(S)] [27].

Next, let us see how strings are compared. The comparison in this paper follows dictionary (lexicographic) order. Let u and v be two different strings.
Comparing the strings u and v means comparing the characters u[i] and v[i] in turn, starting with i = 1:
If u[i] = v[i], the comparison continues with i + 1.
If u[i] > v[i], then string u is greater than string v, i.e. u > v.
If u[i] < v[i], then string u is less than string v, i.e. u < v.
If no result has been obtained by the time i > len(u) or i > len(v), then u > v if len(u) > len(v), u < v if len(u) < len(v), and u = v if len(u) = len(v).

Suffix-array: the suffix array is denoted by SA. It is a one-dimensional array SA[1], SA[2], SA[3], and so on up to SA[n], holding the starting positions of the suffixes in sorted order, so that Suffix(SA[i]) < Suffix(SA[i+1]) for 1 <= i < n.

Rank-array: the rank array is the inverse of SA. If SA[i] = j, then Rank[j] = i. In other words, Rank[i] stores the rank of Suffix(i) among all the suffixes in ascending order.

The authors take the string "science" as an example and explain everything clearly. The string "science" generates seven suffixes. They are:
Suffix(1) = science
Suffix(2) = cience
Suffix(3) = ience
Suffix(4) = ence
Suffix(5) = nce
Suffix(6) = ce
Suffix(7) = e

Sorted in dictionary order, they are:
Suffix(6) = ce
Suffix(2) = cience
Suffix(7) = e
Suffix(4) = ence
Suffix(3) = ience
Suffix(5) = nce
Suffix(1) = science

The suffix-array algorithm follows the doubling idea. First, SA_1 and Rank_1 are obtained by comparing the first character of every suffix. Comparing strings is the same as comparing their characters sequentially, so SA_2 and Rank_2 can be derived from SA_1 and Rank_1, SA_4 and Rank_4 from SA_2 and Rank_2, SA_8 and Rank_8 from those, and so on, until the final suffix array and rank array are obtained.

The main process of the suffix-array algorithm begins with calculating SA_1 and Rank_1: all the suffixes are arranged in order of their first letter, SA_1 is generated using the quicksort algorithm, and Rank_1 is then generated from it.
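The doubling idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: indices are 0-based (the text counts from 1), Python's built-in sort stands in for quicksort, and the function name `suffix_array` is made up for the example. Each round ranks every suffix by a pair of ranks from the previous round, which doubles the prefix length that the ranks describe.

```python
def suffix_array(s):
    """Return (sa, rank) for string s, 0-based, built by prefix doubling."""
    n = len(s)
    rank = [ord(c) for c in s]   # initial ranks: order by first character
    sa = list(range(n))
    k = 1
    while True:
        # Sort key for the 2k-prefix of the suffix at i: the rank of its
        # first k characters, then the rank of the next k (-1 if past the end).
        def key(i):
            return (rank[i], rank[i + k] if i + k < n else -1)

        sa.sort(key=key)
        new_rank = [0] * n
        for j in range(1, n):
            same = key(sa[j]) == key(sa[j - 1])
            new_rank[sa[j]] = new_rank[sa[j - 1]] + (0 if same else 1)
        rank = new_rank
        if n == 0 or rank[sa[-1]] == n - 1:   # all ranks distinct: done
            return sa, rank
        k *= 2


sa, rank = suffix_array("science")
print([i + 1 for i in sa])    # 1-based starting positions in sorted order
```

For "science" the 1-based order printed is 6, 2, 7, 4, 3, 5, 1, which is the dictionary order of the suffixes listed above.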
The next step compares the 2k-prefixes of Suffix(i) and Suffix(j) using SA_k and Rank_k:
The 2k-prefixes are equal exactly when Rank_k[i] = Rank_k[j] and Rank_k[i + k] = Rank_k[j + k].
The 2k-prefix of Suffix(i) is less than that of Suffix(j) exactly when Rank_k[i] < Rank_k[j], or Rank_k[i] = Rank_k[j] and Rank_k[i + k] < Rank_k[j + k].

The suffix-array algorithm is a sorting algorithm that sorts the suffixes of the string. Because the suffix array is sorted, the characteristic string P, of length m, can be located with a binary search. The algorithm is as follows:
Step 1: Assign the initial values left = 1, right = n and max_match = 0.
Step 2: Compute the middle value mid = (left + right) / 2.
Step 3: Compare Suffix(SA[mid]) and P character by character. The length r of their longest common prefix is useful for implementation and comparison. If r > max_match, then max_match = r.
Step 4: If Suffix(SA[mid]) < P, then left = mid + 1. If Suffix(SA[mid]) > P, then right = mid - 1. If Suffix(SA[mid]) = P, then go to step 6.
Step 5: If left <= right, go to step 2.
Step 6: If max_match = m, then report that the match is successful.

3.1.4 REORGANISING THE CHARACTERISTIC STRING

In this step the characteristic string is reorganised. If the characteristic string is divided between two different data blocks, then the data block holding the partial characteristic string is stored. Basically, all the information about a data block, such as its index, beginning offset and length, is contained in the head of the block. A structure "piece" is defined, consisting of the index of the block, the beginning offset of the block, the length of the character array head and the length of the character array end [18]. Initially every data packet is compared with the characteristic string. If it matches, a warning or alert about the worm is sent to all the users. If the tail of the characteristic string of the worm matches the head of the data block, the matching part is stored in the character array end. If the head of the characteristic string of the worm matches the tail of the data block, it is stored in the corresponding character array head.
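A minimal sketch of this head/end bookkeeping is given below. It assumes that the characteristic string is split across at most two neighbouring blocks and that blocks arrive in order. The function names and the sample signature `WORMSIG` are hypothetical, and plain substring search stands in for the suffix-array matcher.

```python
def head_overlap(block, signature):
    """Length of the longest proper head of the signature that ends the block."""
    for k in range(min(len(block), len(signature) - 1), 0, -1):
        if block.endswith(signature[:k]):
            return k
    return 0


def end_overlap(block, signature):
    """Length of the longest proper tail of the signature that starts the block."""
    for k in range(min(len(block), len(signature) - 1), 0, -1):
        if block.startswith(signature[-k:]):
            return k
    return 0


def detect(blocks, signature):
    """True if the signature lies inside one block or across two neighbours."""
    for i, block in enumerate(blocks):
        if signature in block:
            return True                      # whole signature in one block
        if i + 1 < len(blocks):
            head = head_overlap(block, signature)
            end = end_overlap(blocks[i + 1], signature)
            if head and end:
                # Reorganise: stored head bytes followed by stored end bytes.
                joined = block[-head:] + blocks[i + 1][:end]
                if signature in joined:
                    return True
    return False


print(detect([b"xxxxWOR", b"MSIGyyyy"], b"WORMSIG"))
```

Because the joined bytes are a real contiguous stretch of the stream around the block boundary, a match on them cannot be a false alarm caused by the stitching itself.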
If the neighbouring data block contains a partial characteristic string of the worm, then the neighbouring strings in the array head and in the array end are reorganised. Characteristic string matching is then performed again on the reorganised string, and if a worm is detected, a warning is again sent to all users saying that a worm has been found. If there is no match, no operation is performed.

For the case in which the characteristic string is present in the block but is divided between two different data packets, a special term called the character array is introduced. First the matching mechanism is performed on both data packets. If the matching characteristic string is found, a warning is sent to the users that a worm is present. If only part of the characteristic string is found, it is enough that one of the following requirements is met: the head of the data packet matches the tail of the characteristic string, or the tail of the data packet matches the head of the characteristic string. If neither condition is satisfied, no operation is performed. If the tail of the data packet contains the partial characteristic string, then the data packet is stored in the array. If the length of the characteristic string is m, then Array[m] is set as … . If the head of the data packet contains a part of the characteristic string, then that data packet is stored in the n consecutive units of the array. Finally, characteristic string matching is performed on this array, and if the worm is detected, a warning is sent to all the users. If there is no match, nothing is done.

3.2 DETECTING UNKNOWN P2P WORM

In the section above we have seen how a known worm is detected. But that mechanism is not meant to detect unknown p2p worms. In this section we will see how unknown worms can be detected and the network protected.
As we know, in p2p networks a node can send information to multiple hosts at the same time, and the same protocol is used by all the nodes in the network [27]. These characteristics of the network help a worm to propagate easily. As discussed above, only known worms can be detected by the characteristic string matching method. Unknown worms are detected based on the behaviour of the node. The detection rules are:
Files with the same content are transferred to multiple hosts in a very short time.
The same protocol is used and the destination port is the same.
If a source node satisfies these rules, it is likely to be propagating a p2p worm. It is then necessary to extract the characteristics of the worm near the worm propagation nodes. When these characteristics have been extracted, they are added to the feature library. The data similarity comparison and the extraction of the characteristics are done using the LCSeq algorithm. The LCSeq algorithm based on the generalized suffix tree (GST) is the more efficient one. The overall idea is that all the suffixes are represented as a tree.

This tree has the following characteristics:
Every node in the tree is a string, and the root is the empty string.
Every suffix can be represented as a path from the root.
Every substring can be considered as a prefix of a suffix.
To search for the common subsequence, every node must record which source string it belongs to.

3.3 EXPERIMENT

We know that the worm body tries to infect the other nodes in the network by sending the worm to specific ports of a p2p node. The authors tried to demonstrate the efficiency of their method by performing an experiment. In this experiment they prepared multiple groups of worm bodies and sent them repeatedly at regular intervals of time.
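The behaviour rules and the feature extraction of section 3.2 can be sketched roughly as follows. Everything concrete here is an assumption for illustration: the thresholds, the flow tuple layout, the function names and the sample payloads are invented, and a brute-force common-substring search is swapped in for the generalized-suffix-tree LCSeq method that the paper uses.

```python
from collections import defaultdict

# Hypothetical thresholds; the text only says "multiple hosts" and
# "a very short time".
MIN_HOSTS = 3
WINDOW_SECONDS = 5.0


def suspicious_sources(flows, min_hosts=MIN_HOSTS, window=WINDOW_SECONDS):
    """Flag sources sending identical content to many hosts in a short time.

    flows: iterable of (time, src, dst, protocol, dst_port, payload) tuples.
    """
    groups = defaultdict(list)
    for t, src, dst, proto, port, payload in flows:
        # Same source, same protocol, same destination port, same content.
        groups[(src, proto, port, payload)].append((t, dst))

    flagged = set()
    for (src, _proto, _port, _payload), events in groups.items():
        events.sort()
        for i, (start, _dst) in enumerate(events):
            hosts = {dst for t, dst in events[i:] if t - start <= window}
            if len(hosts) >= min_hosts:
                flagged.add(src)
                break
    return flagged


def common_signature(payloads):
    """Longest byte string contained in every payload (brute force)."""
    if not payloads:
        return b""
    shortest = min(payloads, key=len)
    for length in range(len(shortest), 0, -1):
        for start in range(len(shortest) - length + 1):
            candidate = shortest[start:start + length]
            if all(candidate in p for p in payloads):
                return candidate
    return b""
```

A string returned by `common_signature` for payloads captured near a flagged node would be a candidate entry for the feature library. The brute-force search is only practical for short payloads, which is why a suffix-tree approach is preferred in the paper.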
They then captured these packets, extracted their characteristics and compared them with those that already exist in the feature library.

The p2p worm was detected separately using different algorithms, namely the BF algorithm, the KMP algorithm and the suffix-array algorithm, and the results were compared across three experiments. In experiment 1, the worm characteristics are in the same packet. In the experiment