DSpace at EWHA: Load Balancing of a Distributed Processing System for Packet Mining

Title: Load Balancing of a Distributed Processing System for Packet Mining (패킷마이닝을 위한 분산처리시스템의 부하균형)

Authors: 옥지혜

Issue Date: 2003

Department/Major: Graduate School of Science and Technology, Department of Computer Science

Publisher: Ewha Womans University, Graduate School of Science and Technology

Degree: Master

Abstract (Korean, translated): Information used on a network is sent and received as a large number of packets, and much can be learned from the information these packets carry. Statistical analysis of packet information can help prevent intrusion from outside and leakage of information, and identifying network problems allows the system to be managed safely. Analyzing the various protocols also reveals network load, user behavior patterns, and user requirements. However, when a single server processes and analyzes the packets arriving from users in real time, the load becomes heavy, so distributed processing is needed. A distributed system can raise reliability by reducing the impact a single computer has on the overall system, and can deliver greater performance at low cost. To address this efficiently, this study proposes a load balancing system for packet mining that distributes and processes packet data by IP. To this end, user packets arriving from the server were separated and stored by IP Group. IP addresses follow regular rules, and this property was used to develop a distributed processing algorithm, the IP Splitting Rule. The IP Splitting Rule hashes the IP and, according to the resulting value h, separates and stores packets by IP Group. The packet information is then distributed and stored by protocol: tcp, udp, icmp, telnet, smtp, and http. This relieves the load of packet analysis and improves its efficiency. An algorithm that evaluates the performance of the previous system and of the proposed system was also implemented to verify the efficiency of the proposed system.

Abstract (English): Most data transmitted over a network is sent and received in the form of packets, and a great deal of information can be obtained by reorganizing the data in these packets. Statistical analysis of packet traffic helps prevent data theft and intrusion from outside. It also helps identify network overhead problems and stabilize network traffic. Analyzing the various protocols reveals users' behavior patterns and demands. However, if a single server processes packets from user requests in real time, the load of handling packet data can become heavy. To solve this problem, this thesis proposes a system that distributes packets by IP address group and processes them. The system separates all packets from a server by IP address group using a specific hashing algorithm called the IP Splitting Rule; the separated packets are saved and later processed according to their protocol. The load distribution algorithm is based on an estimation value obtained from the difference between the new system and the previously used system. In generating the optimal value of the packet hashing criteria, many heuristic approaches were considered. Finally, an adjustment value for known network systems is obtained.
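
The abstract does not give the hash function or the number of groups used by the IP Splitting Rule, so the following is only a minimal sketch of the general idea: hash an IP address to a value h, use h to pick an IP group, and store records per protocol within that group. The names `ip_group`, `split_packets`, `group_loads`, and `NUM_GROUPS`, the choice of four groups, and the modulo hash are all assumptions made for illustration, not the thesis's actual method.

```python
import ipaddress
from collections import defaultdict

NUM_GROUPS = 4  # hypothetical number of IP groups (e.g. worker nodes)


def ip_group(ip: str, num_groups: int = NUM_GROUPS) -> int:
    """Map an IPv4 address to a group index h in [0, num_groups)."""
    # Assumption: a plain modulo over the 32-bit address value stands in
    # for the thesis's hash function, which the abstract does not specify.
    return int(ipaddress.IPv4Address(ip)) % num_groups


def split_packets(packets, num_groups: int = NUM_GROUPS):
    """Bucket (source_ip, protocol) records by IP group, then by protocol."""
    store = defaultdict(lambda: defaultdict(list))
    for src_ip, protocol in packets:
        h = ip_group(src_ip, num_groups)
        store[h][protocol].append(src_ip)
    return store


def group_loads(store, num_groups: int = NUM_GROUPS):
    """Count stored records per group, to see how evenly the load is split."""
    return [
        sum(len(records) for records in store.get(h, {}).values())
        for h in range(num_groups)
    ]


if __name__ == "__main__":
    # Made-up sample records, not data from the thesis.
    sample = [
        ("192.168.0.1", "tcp"),
        ("192.168.0.2", "udp"),
        ("192.168.0.5", "http"),
        ("10.0.0.4", "icmp"),
    ]
    store = split_packets(sample)
    print(group_loads(store))  # [1, 2, 1, 0]
```

With a modulo hash, addresses that share low-order bits land in the same group, so real traffic can split unevenly. The per-group counts from `group_loads` are one simple way to compare candidate hashing criteria, in the spirit of the heuristic tuning the abstract mentions.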