Network forensics of P2P with Unsniff

Network forensics of P2P networks with Unsniff. Software used was Limewire on a Gnutella network.

Network-based forensics of P2P traffic has many applications, particularly in law enforcement. I recently added preliminary Gnutella support to Unsniff Network Analyzer as a proof of concept. Wireshark has only rudimentary support for this protocol.

We would like to:

  1. Monitor what various subjects are searching for on the P2P network
  2. Monitor query responses including the filenames and their respective SHA1 hash
  3. Identify the file actually downloaded
  4. Reconstruct the file actually downloaded

This blog post is a summary of what I found and how you can put this feature to use.

Setup

This was the test setup.

  • Limewire 5.5.9 which connects using the Gnutella Protocol 0.6
  • Connected to the Gnutella network using the default settings
  • Searched for “flotilla” videos
  • Downloaded one of them

All traffic from startup to shutdown of Limewire was captured using Unsniff. You can convert these files to libpcap format by opening them in Unsniff and selecting File > Export > TCPDUMP.

  • Limewire_Complete.usnf: the complete traffic dump, including bootstrapping, punching a hole via UPnP, handshakes, downloading the file, and shutdown
  • Limewire_Handshake.usnf: a single TCP session that contains a Gnutella handshake. Start here to understand how the compression works

Notes on protocol

A quick note about the P2P protocol elements viewed from a forensics viewpoint.

Bootstrapping

When Limewire first starts up, it uses plain HTTP to download a list of peers from one of many well-known hosts. Next, it attempts to punch a hole in your firewall using UPnP. It then selects a suitable peer based on your geolocation and starts a handshake with it. More details about the bootstrapping process are available at the Gnutella Development Forum. You can observe the bootstrapping process in sessions 1, 2, and 3 of the capture file Limewire_Complete.usnf.

Bootstrap + Punch hole via UPnP + Direct peer handshakes

Use of compression

The handshake takes place over a single TCP connection. The initial part of the handshake is in plaintext, but if both sides signal support for compression, the same TCP connection switches over to stream compression. Unsniff creates a “synthetic” uncompressed TCP stream corresponding to each compressed TCP handshake stream. You can view the uncompressed stream by opening the second capture file Limewire_Handshake.usnf .
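To make the idea of a synthetic stream concrete, here is a minimal Python sketch. It is not Unsniff's implementation. It assumes one direction of the connection has already been reassembled into a byte string, that this direction carries a single plaintext header block ending in a blank line, and that compression was signalled with the `Content-Encoding: deflate` header used by Gnutella 0.6. The function names are made up for illustration.

```python
import zlib

def split_handshake(stream: bytes):
    """Split one direction of a reassembled stream into (headers, remainder)."""
    head, _, rest = stream.partition(b"\r\n\r\n")
    return head.decode("latin-1"), rest

def announces_deflate(headers: str) -> bool:
    """True if the header block contains 'Content-Encoding: deflate'."""
    for line in headers.split("\r\n")[1:]:      # skip the status line
        name, _, value = line.partition(":")
        if name.strip().lower() == "content-encoding":
            return "deflate" in value.lower()
    return False

def synthetic_stream(stream: bytes) -> bytes:
    """Return the uncompressed bytes that follow the plaintext handshake."""
    headers, rest = split_handshake(stream)
    if not announces_deflate(headers):
        return rest                              # nothing to inflate
    # decompressobj copes with a stream that has not been terminated yet
    return zlib.decompressobj().decompress(rest)

# Demo with made-up bytes standing in for Gnutella messages.
messages = b"example gnutella message bytes " * 8
deflater = zlib.compressobj()
wire = (b"GNUTELLA/0.6 200 OK\r\nContent-Encoding: deflate\r\n\r\n"
        + deflater.compress(messages) + deflater.flush(zlib.Z_SYNC_FLUSH))
recovered = synthetic_stream(wire)
print(len(wire), "bytes on the wire ->", len(recovered), "bytes of messages")
```

A streaming decompressor is used because a live capture rarely contains the end of the deflate stream.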

Compressed original stream + uncompressed synthetic stream created by Unsniff

Use of encryption

Many messages, including the actual file transfers, are encrypted using AES-128, with keys exchanged using anonymous Diffie-Hellman. This is bad news because our forensic ability effectively ends here. Traffic analysis can still give us some clues about what might have been transferred across the encrypted session.

Actual content encrypted using AES128

Here is the complete stream that was used to transfer the selected file. You can see the TLS handshake settling on the Anon-DH key exchange and AES-128 encryption.

Content encrypted via TLS: Anon-DH and AES-128

Messages

Gnutella defines a handful of messages and extensions. The relevant protocol messages for this exercise are QUERY and QUERY HIT. To view these messages:

  • switch to the PDU sheet (we are far removed from link layer packets)
  • click on a message to view its breakout
QUERY HIT message decoded
QUERY HIT response for "flotilla" videos
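For readers who want to see what such a breakout involves, the following Python sketch decodes the 23-byte Gnutella message header and the basic fields of QUERY and QUERY HIT payloads. The field layout follows my reading of the Gnutella protocol specification. The function names are illustrative, and any trailer that a real QUERY HIT carries between the result set and the final 16-byte servent ID is ignored.

```python
import struct
from typing import NamedTuple

HEADER = struct.Struct("<16sBBBI")   # GUID, type, TTL, hops, payload length
QUERY, QUERY_HIT = 0x80, 0x81

class Header(NamedTuple):
    guid: bytes
    payload_type: int
    ttl: int
    hops: int
    payload_len: int

def parse_header(data: bytes) -> Header:
    return Header(*HEADER.unpack_from(data))

def _cstring(buf: bytes, pos: int):
    """Read a NUL-terminated field; return (bytes, position after the NUL)."""
    end = buf.index(b"\x00", pos)
    return buf[pos:end], end + 1

def parse_query(payload: bytes):
    """Return (minimum speed, search criteria) from a QUERY payload."""
    (min_speed,) = struct.unpack_from("<H", payload)
    criteria, _ = _cstring(payload, 2)
    return min_speed, criteria.decode("utf-8", "replace")

def parse_query_hit(payload: bytes):
    """Return (peer, [(filename, size, extension block), ...])."""
    count, port, ip, _speed = struct.unpack_from("<BH4sI", payload)
    pos, hits = 11, []
    for _ in range(count):
        _index, size = struct.unpack_from("<II", payload, pos)
        name, pos = _cstring(payload, pos + 8)
        ext, pos = _cstring(payload, pos)        # e.g. a urn:sha1 value
        hits.append((name.decode("utf-8", "replace"), size,
                     ext.decode("ascii", "replace")))
    # The IP bytes are read in network order; the port is little-endian.
    peer = ".".join(str(b) for b in ip) + f":{port}"
    return peer, hits

# Demo on a hand-built QUERY HIT; every value here is made up.
sha1_urn = "urn:sha1:" + "A" * 32
hit_payload = (struct.pack("<BH4sI", 1, 6346, bytes([192, 0, 2, 10]), 350)
               + struct.pack("<II", 7, 1048576) + b"flotilla.avi\x00"
               + sha1_urn.encode() + b"\x00" + bytes(16))
message = HEADER.pack(bytes(16), QUERY_HIT, 3, 1, len(hit_payload)) + hit_payload
header = parse_header(message)
hit = parse_query_hit(message[HEADER.size:HEADER.size + header.payload_len])
print(header.payload_type == QUERY_HIT, hit)
```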

Analyzing Gnutella with Unsniff

Viewing Gnutella messages

1. With USNF files, you just have to open the capture file.

2. With PCAP files or live captures, you need to set up Unsniff to recognize Gnutella.

Since Gnutella uses random TCP ports, just assign port 65535 to Gnutella. This is the special ‘catch all’ port in Unsniff: all unrecognized TCP traffic will be treated as Gnutella. To do this, select Manage > Access Points > click on TCP > Add Access Point > enter 65535 + Gnutella.

3. Import the PCAP file or start a live capture

4. Look at the PDU sheet; it contains all the Gnutella messages.
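If you post-process exported streams yourself, a cheap signature check helps separate real Gnutella sessions from whatever else the catch-all port picked up. The Python sketch below is my own illustration and is not something Unsniff does. It relies only on the fact that a Gnutella 0.6 connection opens with a `GNUTELLA CONNECT/` line and is answered with a `GNUTELLA/` status line. TLS-wrapped sessions will not match.

```python
HANDSHAKE_PREFIXES = (b"GNUTELLA CONNECT/", b"GNUTELLA/")

def looks_like_gnutella(first_payload: bytes) -> bool:
    """True if the first bytes of a TCP stream look like a Gnutella handshake."""
    return first_payload.startswith(HANDSHAKE_PREFIXES)

print(looks_like_gnutella(b"GNUTELLA CONNECT/0.6\r\nUser-Agent: LimeWire\r\n"))
print(looks_like_gnutella(b"GET / HTTP/1.1\r\n"))
```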

Scripting to automate this task

To really use this in forensics applications you need to be able to script this stuff. Luckily Unsniff’s scripting API allows us to ditch the GUI altogether and run scripts to extract just what we want.

Here is a useful sample script in Ruby.

  • This prints a list of  peers, the filenames, and the SHA1 hash.

Running this code as

Produces this report containing the peer, files available, and their SHA-1 hash.

The SHA1 hash is in Base32.
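Most hashing tools print SHA-1 digests in hex, so comparing a captured hash against a file on disk usually needs a conversion. This is a small Python sketch (Python rather than Ruby, purely for illustration); the function name is mine.

```python
import base64
import hashlib

def sha1_base32_to_hex(value: str) -> str:
    """Convert a Base32 SHA-1 (optionally prefixed 'urn:sha1:') to hex."""
    b32 = value.strip().split(":")[-1].upper()
    raw = base64.b32decode(b32)
    if len(raw) != 20:                       # a SHA-1 digest is 160 bits
        raise ValueError("not a SHA-1 digest")
    return raw.hex()

# Round trip on a digest computed locally, so no captured data is needed.
digest = hashlib.sha1(b"example file contents").digest()
as_seen_on_wire = "urn:sha1:" + base64.b32encode(digest).decode()
print(as_seen_on_wire, "->", sha1_base32_to_hex(as_seen_on_wire))
```

A 20-byte digest encodes to exactly 32 Base32 characters, so no padding is involved.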

Conclusions

It is possible to conduct network based forensics on P2P traffic, but it requires tools with advanced reconstruction abilities.

  • You CAN conduct surveillance of location of content
  • You CAN conduct surveillance of content being searched for
  • You CANNOT reconstruct the file actually transferred due to encryption and use of Anon-Diffie-Hellman
  • You CAN, however, get a fair idea of what was transferred, though not of forensic evidence quality
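One way to form that "fair idea" is to compare the number of bytes seen in the encrypted session with the file sizes advertised in earlier QUERY HIT messages. The Python sketch below is only an illustration of that reasoning. The overhead allowance is a guess that would need calibrating, and the file names and sizes are hypothetical.

```python
def candidate_files(observed_bytes, advertised, max_overhead=0.10):
    """Return advertised (name, size) pairs that could fit the encrypted flow.

    observed_bytes: payload bytes sent toward the downloader in the session.
    advertised:     (filename, size) pairs taken from QUERY HIT messages.
    max_overhead:   assumed ceiling for TLS and protocol overhead (a guess).
    """
    matches = [(name, size) for name, size in advertised
               if size <= observed_bytes <= size * (1 + max_overhead)]
    # Closest size first: least unexplained overhead.
    return sorted(matches, key=lambda item: observed_bytes - item[1])

# Hypothetical values for illustration only.
hits = [("a.avi", 1_000_000), ("b.avi", 1_040_000),
        ("c.avi", 2_000_000), ("d.avi", 500_000)]
print(candidate_files(1_050_000, hits))
```

A match of this kind narrows the field but proves nothing by itself, which is why it falls short of evidence quality.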

—-


Download Unsniff Network Analyzer


Reconstruction can aid network forensics

I’d like to illustrate how reassembly/reconstruction can give an investigator the much needed “entry points” into analyzing a dump of packets.

The demo test case is Forensics Challenge #4, published by the good guys over at Honeynet.org: https://www.honeynet.org/challenges/2010_4_voip. You have to answer the challenge yourself, of course.

1. Just got a dump of packets, now what ?

Most of us would load the thing into Wireshark and see if we can get some clues. While this gives you the best available protocol breakout of every link layer packet, it can be overwhelming. An alternative and potentially more efficient approach is to look at higher layer data first and identify “entry points”. You can then follow these entry points all the way down to bit level protocol decodes. Two such higher layers are content and flows.

2. Content and Flows

Content like images, video, voice calls, HTML pages, are meant for humans while fragmented TCP packets are not. So, a good place for a human analyst to start is to see if the network analysis tool was able to reconstruct content. If you can see the images, web pages viewed, files transferred, hear voice calls, you can get a fair idea of what the dump contains. You should keep in mind the limitations of the tool and the lurking party pooper (encryption).

Another key starting point is flows. They help you pick out the high-volume conversations in a noisy environment. For example, a typical YouTube video view involves dozens of tiny flows and one huge flow that contains the video being viewed.
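The flow view boils down to grouping packets by their addresses and ports and ranking the groups by volume. The Python sketch below assumes the packets have already been reduced to simple tuples. The tuple layout and function names are mine, not an Unsniff API.

```python
from collections import Counter

def flow_key(src, sport, dst, dport, proto):
    """Direction-independent key so both halves of a conversation merge."""
    a, b = (src, sport), (dst, dport)
    return (proto,) + (a + b if a <= b else b + a)

def top_flows(packets, n=3):
    """packets: iterable of (src, sport, dst, dport, proto, length)."""
    volume = Counter()
    for src, sport, dst, dport, proto, length in packets:
        volume[flow_key(src, sport, dst, dport, proto)] += length
    return volume.most_common(n)

# Made-up packets: one two-way conversation and one stray packet.
packets = [
    ("10.0.0.1", 5000, "10.0.0.2", 80, "tcp", 100),
    ("10.0.0.2", 80, "10.0.0.1", 5000, "tcp", 1500),
    ("10.0.0.1", 5001, "10.0.0.2", 80, "tcp", 60),
]
for key, nbytes in top_flows(packets):
    print(nbytes, key)
```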

3. Example : Forensics Challenge #4

Let's import the Forensics Challenge pcap into Unsniff Network Analyzer. Unsniff gives you the following levels of reassembly as first class objects (just like packets).

  1. Raw link layer packets (like Wireshark)
  2. Reassembled PDUs
  3. Flows
  4. Content (known as User Objects)
  5. Basic Statistics

Import the pcap and switch over to the User Objects sheet.

You will see a bunch of HTML pages, images, and VoIP calls. Clicking on an HTML object renders the page as it was viewed. To reconstruct the HTML you have to enable the option as shown here.

User Objects
Clicking on HTML renders it - completely offline

Right click the VoIP calls to check if the call is worthy of packet level analysis.

Right click to play each leg

View form being edited

Someone editing form

Conclusion

By starting off at the top layer (content), we know right off the bat that we are dealing with a Trixbox admin session and a VoIP conversation. We know all the pages on the Trixbox server which were visited / edited and the actual VoIP call that took place.  With this context firmly in place we can dive down to the flow level and then to the packet level with a much better understanding of what we are dealing with.

Getting this context would have been time consuming had you started your analysis from the bottom, at the link layer packets.

——-

You can download and try Unsniff Network Analyzer. It can be your perfect Wireshark complement.

You may also want to try out our upcoming product, Trisul Network Metering and Forensics, a Linux-based 24×7 surveillance system designed for Network Security Monitoring.


Trisul tidbit – multicore ready uses Intel TBB

I re-architected Trisul after months of intense coding to be able to take advantage of multiple cores. I just want to share the approach I took for this project.

The options I evaluated were:

  • Flow pinning (like in Suricata, the new IDS engine)
    • Packets mapped to hardware thread  based on tuples
  • Work stealing
    • Idle hardware threads steal work from busy ones (see Cilk)

Flow pinning turned in disappointing results, largely because:

  • While Trisul does flow tracking and reassembly,  the main chunk of code deals with metering (counting hundreds of data points based on payload content)
  • Hard to balance work based only on tuples
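The balancing problem is easy to see in a toy model. The Python sketch below pins each flow to a thread by hashing its tuple and then adds up a per-packet "metering cost". The CRC-based hash, the cost numbers, and the addresses are made up for illustration and do not come from Trisul or Suricata.

```python
import zlib

def pin_to_thread(flow_tuple, n_threads):
    """Map a flow tuple to a thread index with a stable hash."""
    key = "|".join(map(str, flow_tuple)).encode()
    return zlib.crc32(key) % n_threads

def load_per_thread(packets, n_threads):
    """packets: iterable of (flow_tuple, cost). Returns total cost per thread."""
    load = {t: 0 for t in range(n_threads)}
    for flow_tuple, cost in packets:
        load[pin_to_thread(flow_tuple, n_threads)] += cost
    return load

# One expensive flow plus a handful of cheap ones (hypothetical costs).
heavy = (("10.0.0.1", 5000, "10.0.0.2", 80, "tcp"), 1000)
light = [((f"10.0.0.{i}", 4000 + i, "10.0.0.2", 80, "tcp"), 1)
         for i in range(3, 11)]
load = load_per_thread([heavy] + light, n_threads=4)
print(load)   # whichever thread owns the heavy flow does nearly all the work
```

However the hash falls, the thread that owns the expensive flow cannot hand any of that work to its idle neighbours.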

Intel’s Threading Building Blocks are the way to go if you want to build on the Cilk-style work stealing model. What’s more, you get a lot of extra goodies like concurrent containers, atomics, and native threading wrappers.

Armed with TBB, Trisul is completely implemented as a pipeline with a few serial filters and dozens of parallel filters. The advantage of the pipeline pattern is that you can run a lot of code on caches that are still “hot”.
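The shape of such a pipeline can be sketched in a few lines. This is a Python illustration of the pattern only, not TBB and not Trisul code: a serial source hands out tokens in order, each token runs through all the parallel filters back to back on one worker, and a serial sink consumes the results in the original order. The filter names are invented, and Python threads will not give the multicore speedup that TBB does.

```python
from concurrent.futures import ThreadPoolExecutor

def source(packets):
    """Serial input filter: yields tokens in arrival order."""
    for seq, payload in enumerate(packets):
        yield {"seq": seq, "payload": payload}

def count_bytes(token):            # parallel filter
    token["bytes"] = len(token["payload"])
    return token

def count_zero_bytes(token):       # parallel filter
    token["zeros"] = token["payload"].count(0)
    return token

PARALLEL_FILTERS = (count_bytes, count_zero_bytes)

def run_parallel_filters(token):
    """Run every parallel filter on one token while its data is still warm."""
    for stage in PARALLEL_FILTERS:
        token = stage(token)
    return token

def run_pipeline(packets, workers=4):
    totals = {"bytes": 0, "zeros": 0, "order": []}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() returns results in input order, so the sink stays serial.
        for token in pool.map(run_parallel_filters, source(packets)):
            totals["bytes"] += token["bytes"]
            totals["zeros"] += token["zeros"]
            totals["order"].append(token["seq"])
    return totals

result = run_pipeline([b"\x00\x01", b"abc", b"\x00\x00"])
print(result)
```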

The end results are very encouraging.

Here is a screenshot of Trisul chewing through the 11GB of packet traces from the LBL-ICSI Enterprise Tracing Project.

340.7% balanced CPU utilization and almost 3.2 times the speed of a single hardware thread!