Understanding and Mitigating the Impact of Web Robot and IoT Traffic on Web Systems

Revision as of 18:24, 23 October 2015

Introduction & Motivation

Overview

Approach

Resource Request Types

Ideally, a Web cache would predict the exact resource that a Web robot session will request next. This is not feasible because of the large number of resources available on a Web server. Even predicting only the extension of the next resource may require a model to choose one type out of hundreds, a task that is challenging for a lightweight classifier to perform in real time. Instead, we follow previous work <ref name="robotAnalysis" /> and cluster resources into the types listed in Table 1. Predicting the type of the next resource may be a smarter alternative: because the popularity of robot requests exhibits a power tail <ref name="detectingRobots" />, the most popular resources of the predicted type are the ones most likely to be requested next.
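The idea of serving the most popular resources of a predicted type can be sketched as follows. This is a minimal illustration, not the method used in this work: the function names `build_popularity` and `prefetch_candidates`, the `(resource, type)` input format, and the cutoff `k` are all assumptions made for the example.

```python
from collections import Counter

def build_popularity(requests):
    """Count past requests per resource, grouped by resource type.

    `requests` is an iterable of (resource, resource_type) pairs taken from
    previously observed robot traffic (hypothetical input format).
    """
    popularity = {}
    for resource, rtype in requests:
        popularity.setdefault(rtype, Counter())[resource] += 1
    return popularity

def prefetch_candidates(popularity, predicted_type, k=3):
    """Return up to k of the most requested resources of the predicted type."""
    counts = popularity.get(predicted_type, Counter())
    return [resource for resource, _ in counts.most_common(k)]

# Toy history of robot requests (made-up data for illustration only).
history = [
    ("/a.html", "web"), ("/b.html", "web"), ("/a.html", "web"),
    ("/x.png", "img"), ("/c.css", "web"), ("/b.html", "web"),
    ("/a.html", "web"),
]
popularity = build_popularity(history)
candidates = prefetch_candidates(popularity, "web", k=2)
```

With a power-tailed popularity distribution, a small `k` would be expected to cover a large share of the requests of a given type, which is why a type-level prediction can still be useful to a cache.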

{| class="wikitable" style="text-align: center"
|+ Table 1: Breakdown of Resource Types
|-
! Class
! Extensions
|-
| text
| txt, xml, sty, tex, cpp, java
|-
| web
| asp, jsp, cgi, php, html, htm, css, js
|-
| img
| tiff, ico, raw, pgm, gif, bmp, png, jpeg, jpg
|-
| doc
| xls, xlsx, doc, docx, ppt, pptx, pdf, ps, dvi
|-
| av
| avi, mp3, wvm, mpg, wmv, wav
|-
| prog
| exe, dll, dat, msi, jar
|-
| compressed
| zip, rar, gzip, tar, gz, 7z
|-
| malformed
| request strings that are not well-formed
|-
| noExtention
| request for directory contents
|}
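The mapping in Table 1 can be expressed as a simple lookup. The sketch below copies the extension lists and class labels (including their spelling) from the table; everything else is an assumption made for illustration: the function name `resource_type`, the rule that a target is malformed when it does not start with `/` or contains whitespace, the treatment of any extensionless path as `noExtention`, and returning `None` for extensions the table does not list.

```python
import posixpath
from urllib.parse import unquote

# Extension lists copied from Table 1; the dictionary layout is an assumption.
RESOURCE_TYPES = {
    "text": {"txt", "xml", "sty", "tex", "cpp", "java"},
    "web": {"asp", "jsp", "cgi", "php", "html", "htm", "css", "js"},
    "img": {"tiff", "ico", "raw", "pgm", "gif", "bmp", "png", "jpeg", "jpg"},
    "doc": {"xls", "xlsx", "doc", "docx", "ppt", "pptx", "pdf", "ps", "dvi"},
    "av": {"avi", "mp3", "wvm", "mpg", "wmv", "wav"},
    "prog": {"exe", "dll", "dat", "msi", "jar"},
    "compressed": {"zip", "rar", "gzip", "tar", "gz", "7z"},
}
# Invert the table so each extension points at its class.
EXT_TO_CLASS = {ext: cls for cls, exts in RESOURCE_TYPES.items() for ext in exts}

def resource_type(request_target):
    """Map a request target (path plus optional query) to a Table 1 class."""
    # Assumed well-formedness check: a non-empty string starting with "/"
    # and containing no whitespace.
    if (not isinstance(request_target, str)
            or not request_target.startswith("/")
            or any(ch.isspace() for ch in request_target)):
        return "malformed"
    # Drop the query string and fragment, then decode percent-escapes.
    path = unquote(request_target.split("?", 1)[0].split("#", 1)[0])
    _, ext = posixpath.splitext(posixpath.basename(path))
    if not ext:
        # Directory requests such as "/papers/" end up here (assumption:
        # so does any other path without an extension).
        return "noExtention"
    # Extensions not listed in Table 1 yield None (assumption).
    return EXT_TO_CLASS.get(ext[1:].lower())
```

For example, `resource_type("/index.php?id=3")` falls in the `web` class and `resource_type("/papers/")` in `noExtention`. A lookup of this kind is cheap enough to run per request, which fits the goal of keeping the classifier lightweight.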

Classification Algorithms

[[File:PredData.png|300px|center]]

Datasets

Results & Analysis

Acknowledgement

This paper is based on work supported by the National Science Foundation (NSF) under Grant No. 1464104. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the NSF.

References

<ref name="robotAnalysis">D. Doran, “Detection, classification, and workload analysis of web robots,” Ph.D. dissertation, University of Connecticut, 2014.</ref>
<ref name="detectingRobots">D. Doran and S. Gokhale, “Detecting Web Robots Using Resource Request Patterns,” in Proc. of Intl. Conference on Machine Learning and Applications, 2012, pp. 7–12.</ref>
<references/>