ResearchIndex

Automated Worm Fingerprinting

by Sumeet Singh, Cristian Estan, George Varghese, Stefan Savage

Public comments

#1 posted on Apr 20 2008, 12:53 in collection CMU 15-744: Computer Networks -- Spring 08
This is a good idea, and the design and implementation seem promising. However, even though the authors analyze false positives and their sources, they don't give a comprehensive solution to the problem. Since the best practice against worms is to contain (block) the worm payloads, the false positive rate is a crucial factor in evaluating this kind of system. It would also be better if they considered smarter worms that try to evade fingerprinting systems. For example, a worm could deliberately change its prevalent pattern at different stages of its propagation.

#2 posted on Apr 20 2008, 15:13 in collection CMU 15-744: Computer Networks -- Spring 08
Earlybird seems like a very promising system for detecting worms, and has already proven successful in several cases. One question I have is whether it has a large potential for false positives. If a lot of traffic other than worms contains common substrings, couldn't Earlybird flag a lot of things as worms, causing a lot of resources to be spent investigating these false positives? Despite the inefficiency of the high-level algorithm, it seems to be feasible in practice -- the effective mean per-byte processing time recorded was only 0.042 microseconds.

#3 posted on Apr 20 2008, 16:29 in collection CMU 15-744: Computer Networks -- Spring 08
I liked the ideas in the paper and thought they were well described and evaluated. I was, however, hoping to see a more comprehensive analysis of the effect of each of the techniques they use to identify worm signatures -- an ROC curve or something similar, as Samir suggests, would have been good.

#4 posted on Apr 20 2008, 16:47 in collection CMU 15-744: Computer Networks -- Spring 08
A neat idea on fast identification of worm propagation. I'm less concerned about how "heavyweight" the monitoring code is since in the future we're likely to have many idle/spare cores in the background that can do this kind of off-the-critical-path processing.

Unfortunately, this technique breaks down once you consider metamorphic worms. All it takes is to insert extra NOPs randomly into the code or to change the control flow of the worm. Perhaps it makes sense to generate fingerprints at a higher "semantic" level rather than at the assembly level, for example by fingerprinting higher-level events (e.g., memory access patterns).
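
To make the NOP-insertion concern in the comment above concrete, here is a minimal sketch. It is not taken from the paper: the helper names (`kgrams`, `insert_nops`, `shared_kgrams`), the fixed insertion interval, and the substring lengths are all assumptions chosen for illustration. It shows that a fingerprint built from fixed-length substrings stops matching once padding bytes are interleaved more densely than the substring length.

```python
def kgrams(data: bytes, k: int) -> set:
    """Return the set of all length-k substrings of data."""
    return {data[i:i + k] for i in range(len(data) - k + 1)}


def insert_nops(payload: bytes, every: int = 4, nop: bytes = b"\x90") -> bytes:
    """Hypothetical metamorphic step: append one NOP byte after every
    `every` payload bytes (deterministic here so the result is repeatable)."""
    out = bytearray()
    for i in range(0, len(payload), every):
        out += payload[i:i + every] + nop
    return bytes(out)


def shared_kgrams(a: bytes, b: bytes, k: int) -> set:
    """Length-k substrings that appear in both a and b."""
    return kgrams(a, k) & kgrams(b, k)


# Stand-in for a worm body: 32 distinct bytes, none equal to 0x90.
original = bytes(range(32))
variant = insert_nops(original, every=4)

# With 8-byte substrings, every window of the variant contains a NOP,
# so no substring of the original survives.
long_matches = shared_kgrams(original, variant, k=8)

# With 4-byte substrings, only the aligned 4-byte chunks survive.
short_matches = shared_kgrams(original, variant, k=4)
```

Shorter substrings survive the padding, but they are also far more likely to occur in benign traffic, which ties this evasion concern back to the false positive question raised in earlier comments.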

#5 posted on Apr 20 2008, 16:48 in collection CMU 15-744: Computer Networks -- Spring 08
I found this article very interesting. The problem, namely how to identify worms automatically and rapidly, is interesting both from a practical point of view and as a challenging question. The approach they follow, of identifying address dispersion and signatures, is well motivated. And finally the authors back up their claims that such a system can work well by implementing it and running it while observing the results.
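
As a rough illustration of the approach summarized above (flag content that is both prevalent and widely dispersed across addresses), here is a minimal sketch. It assumes exact counting in in-memory dictionaries, fixed-length substrings, and arbitrary thresholds; the function name `sift` and all parameter names are hypothetical. It is not the authors' implementation, which relies on more memory-efficient approximations.

```python
from collections import defaultdict


def sift(packets, k=8, prevalence_threshold=3, dispersion_threshold=2):
    """Return substrings that are both prevalent and address-dispersed.

    packets: iterable of (src, dst, payload) with payload as bytes.
    All thresholds are illustrative assumptions, not values from the paper.
    """
    count = defaultdict(int)      # number of packets containing the substring
    sources = defaultdict(set)    # distinct source addresses seen with it
    dests = defaultdict(set)      # distinct destination addresses seen with it

    for src, dst, payload in packets:
        # Count each distinct substring at most once per packet.
        seen = {payload[i:i + k] for i in range(len(payload) - k + 1)}
        for s in seen:
            count[s] += 1
            sources[s].add(src)
            dests[s].add(dst)

    return {
        s for s in count
        if count[s] >= prevalence_threshold
        and len(sources[s]) >= dispersion_threshold
        and len(dests[s]) >= dispersion_threshold
    }


# Worm-like traffic: the same content moving between many distinct hosts.
worm = [
    ("10.0.0.1", "10.0.1.1", b"aaEXPLOIT!CODEbb"),
    ("10.0.0.2", "10.0.1.2", b"ccEXPLOIT!CODEdd"),
    ("10.0.0.3", "10.0.1.3", b"eeEXPLOIT!CODEff"),
]
# Benign traffic: repeated content, but always the same client and server.
benign = [("10.0.0.9", "10.0.2.1", b"GET /index.html")] * 5

suspects = sift(worm + benign)
```

In this toy run the repeated request is prevalent but not dispersed, so it is not flagged, while the substring shared by the three worm-like packets is. Traffic that is legitimately both prevalent and dispersed would still be flagged, which is the false positive concern discussed in this thread.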

As others have mentioned above, there are still some open questions about the false positive rate of their approach. There are even more open questions about false negatives. In Section 7.2, the authors mention a couple of techniques that worm designers could use to try to avoid detection, but maybe there are other methods that could do so effectively?

#6 posted on Apr 20 2008, 16:49 in collection CMU 15-744: Computer Networks -- Spring 08
They claim to be able to automatically identify all worms known prior to the publication of this paper -- what I am curious about is how they perform on subsequent worms. Given knowledge of the methods used to detect worms, how hard is it to circumvent them? Of course, this is more relevant if this system is actually used anywhere...

#7 posted on Apr 20 2008, 16:54 in collection CMU 15-744: Computer Networks -- Spring 08
Interesting idea. One key assumption is that the exploit code has invariant sections by which it can be automatically recognized. So, an obvious question is: can the attacker obfuscate the exploit code by automatically generating variants of it, so that it can't be automatically recognized? IIRC, most exploits require highly peculiar machine-code sequences and magic numbers (jump offsets, etc.) to be hard-coded directly into the attack packets. If so, then attackers won't be able to hide the worm from Earlybird by obfuscation.

#8 posted on Apr 20 2008, 16:54 in collection CMU 15-744: Computer Networks -- Spring 08
I think the approach -- called content sifting -- is very interesting and neat.

I am not familiar with this area, but I suspect that more machine learning techniques could be applied.

#9 posted on Apr 20 2008, 16:56 in collection CMU 15-744: Computer Networks -- Spring 08
This is a well-organized paper. The authors did well in describing and evaluating their system, called "Earlybird", which performs real-time detection of unknown worms and automated extraction of unique content signatures. I think one contribution of the paper is that it implements all the tasks required for recognizing and detecting worms with moderate memory and computational requirements. Also, I like the section in which the authors address the limitations and how their system could be improved. In addition, I agree with Samir that an ROC curve might be a useful tool for evaluating the performance of this worm detection system.

#10 posted on Apr 20 2008, 16:57 in collection CMU 15-744: Computer Networks -- Spring 08
The use of signature extraction and gathering for identifying current and future worms seems quite interesting. Moreover, the content sifting technique used by Earlybird greatly reduces the system requirements for implementing such a traffic monitoring tool.

The authors propose the use of white-lists to deal with false positives, and they also mention that the BitTorrent P2P system causes traffic patterns that very closely resemble those of a worm. Given that P2P systems are becoming increasingly popular and might even be used for distributing OS patches (e.g., Microsoft was considering a P2P system for distributing Windows updates), I wonder if the number of false positives would increase to such an extent that systems such as Earlybird become unusable.

On a side note, here is an interesting paper on efficient string matching for intrusion detection

#11 posted on Apr 20 2008, 16:59 in collection CMU 15-744: Computer Networks -- Spring 08
I think this paper gives a very nice idea of fingerprint

#12 posted on Apr 20 2008, 17:05 in collection CMU 15-744: Computer Networks -- Spring 08
I think this paper presents a very nice idea. I am especially interested in the early detection of worms because it will be very useful in practice. I am wondering whether there are any limitations to this idea, or whether a worm could be designed more cleverly to avoid being detected by the system.