Remote Forensics

Tommy Skaug, 2024-08-06

Like everything else in information security, forensics is constantly evolving. One matter of special interest to practitioners is doing forensics on remote computers, although the idea is not entirely new.

The use case is self-explanatory to those working in the field, but for beginners I'll give a brief introduction.

When a case lands on your desk and lights up as something interesting, what do you do? Probably your first step is searching for known malicious indicators in network logs. Having found something interesting on some of the clients, let's say ten in this case, you decide to put more effort into explaining the nature of the activity. None of the clients are nearby, and several of them are at locations with 1 Mbps upload speeds.
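As a toy illustration of that first triage step (not tied to any particular product; the indicator values and log lines below are made up), matching network logs against a set of known-bad indicators can be as simple as:

```python
# Toy example: flag log lines containing known malicious indicators.
# The indicators and log lines are invented for this sketch.
KNOWN_BAD = {"evil.example.com", "198.51.100.23"}

log_lines = [
    "2014-08-06 12:00:01 client7 GET http://evil.example.com/beacon",
    "2014-08-06 12:00:05 client3 GET http://intranet.local/index.html",
]

def find_hits(lines, indicators):
    """Return (line, indicator) pairs for every indicator match."""
    return [(line, ioc) for line in lines
            for ioc in indicators if ioc in line]

for line, ioc in find_hits(log_lines, KNOWN_BAD):
    print("HIT %s: %s" % (ioc, line))
```

In practice you would of course feed this from your log pipeline rather than a hard-coded list, but the principle is the same.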

The next phase would probably be a search in open sources, which perhaps lends support to the suspicion that something fishy is going on. Now you'd like to examine some of the client logs for the known hashes and strings you found, and the traditional way to go is acquiring disk and memory images physically. Or is it? That could easily take weeks for ten clients. In this case you are lucky: you have a tool for performing remote forensics at hand, rolled out across your organization after a larger breach.
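The hash-checking part of that workflow is worth making concrete. A minimal sketch, independent of any specific tool (the known-bad set here is a placeholder computed inline so the example is self-contained):

```python
import hashlib

# Hypothetical known-bad SHA-256 set gathered from open sources.
# The single entry is just the hash of the bytes b"malware",
# computed here so the sketch runs on its own.
KNOWN_BAD_SHA256 = {hashlib.sha256(b"malware").hexdigest()}

def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large files don't exhaust memory."""
    h = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def is_known_bad(path):
    return sha256_of(path) in KNOWN_BAD_SHA256
```

The point of remote forensics tooling is precisely to run this kind of check on ten clients at once, without shipping disk images over a 1 Mbps uplink.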

What's new in remote forensics is that the tools are beginning to mature, and on that note I would like to introduce the two products I find most relevant for the purpose:

  • Google Rapid Response (GRR) [1]
  • Mandiant for Incident Response (MIR) [2]

I haven't actually put the latter option to the test (though MIR supports OpenIOC, which is an advantage), but I have been taking GRR for a spin for some time now. There are also other tools which may be of interest to you, such as Sourcefire FireAMP, which I've heard performs well for endpoint protection. I've chosen to leave that out of this post since it is a different concept. Unsurprisingly, the following will use GRR as a basis.

For this post there are two prerequisites, which I highly recommend following to get a feel for GRR:

  • Set up a GRR server [3]. In this post I've used the current beta, 3.0-2, running all services on the same machine, including the web server and the client roll-in interface. There is an install script for the beloved Ubuntu, but I couldn't get it working easily on other systems. One exception is Debian, which only needed minor changes. If you have difficulties with the latter, please give me a heads-up.
  • Sacrifice one client to be monitored (as far as I can tell, it won't brick a production system). You will find binaries after packing the clients in the GRR server setup; see the screenshot below for details. The client will automatically report in to the server.

You can find the binaries by browsing from the home screen in the GRR web GUI. Download and install the one of your choice.

A word of warning before you read the rest of this post: the GRR website was a little messy and not entirely intuitive. I found, after a lot of searching, that the best way to go about it is reading the code usage examples in the web GUI, especially when it comes to what Google has named flows. Flows are small plugins in GRR that can, for instance, task GRR to fetch a file at a specific path.
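As a mental model only (not GRR's actual implementation, whose flows are asynchronous state machines; all names below are invented), a flow can be thought of as a server-side task that issues a request to a client and collects the response:

```python
# Toy model of a "flow": a server-side task bound to one client.
# GRR's real flows run asynchronously and survive restarts; this
# sketch only illustrates the request/response idea.
class FakeClient:
    def __init__(self, files):
        self.files = files  # path -> contents

    def get_file(self, path):
        return self.files.get(path)

class GetFileFlow:
    def __init__(self, client, path):
        self.client = client
        self.path = path
        self.result = None
        self.done = False

    def run(self):
        # In GRR the server would queue a request and the client
        # would answer on its next poll; here we call it directly.
        self.result = self.client.get_file(self.path)
        self.done = True

client = FakeClient({"/etc/hostname": b"webserver01\n"})
get_file = GetFileFlow(client, "/etc/hostname")
get_file.run()
```

The real thing adds queuing, retries and access control on top, but the shape of the interaction is the same.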

Notice the call spec. This can be transferred directly to the iPython console. Before I started off I watched a couple of presentations that Google has delivered at LISA. I think you should too if you'd like to see where GRR is going and why it came to be. The one here gives a thorough introduction to how Google makes sure they are able to respond to breaches in their infrastructure [4].

I would also like to recommend a presentation by Greg Castle at Black Hat for reference [5]. For usage and examples, Marley Jaffe at Champlain College has put up a great paper [6]. Have a look at the exercises at the end of it.

What is good about GRR is that it supports the most relevant platforms: Linux, Windows and OS X. These are also fully supported platforms at Google, so expect development to have a practical and long-term perspective.

While GRR is relevant, it is also fully open source and extensible. It's written in Python, with all the niceness that comes with that. GRR has direct memory access through custom-built drivers, and you will find support for Volatility in there. Well, they forked it into a new project named Rekall, which is more suited for scale. Either way, it provides support for plugins such as Yara.

If you, like me, got introduced to forensics through academia, you will like that GRR builds on The Sleuth Kit through pytsk for disk forensics (you may actually choose which layer you'd like to stay on). When you've retrieved an item, it gets placed in a virtual file system in GRR with complete versioning, which I just love.

The virtual file system is where everything you've retrieved or queried the client about is stored, with versioning for your pleasure. In addition to a capable console application, GRR provides a good web GUI which gives an intuitive way of browsing just about everything you can do in the console. I think the console is where Google would like you to live, though.
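To make the versioning idea concrete, here is a toy in-memory model (GRR's AFF4 store is far more capable; this only illustrates what "every acquisition is kept, reads default to the latest" means):

```python
# Toy model of a versioned virtual file system: every write is kept,
# and reads default to the latest version.
class VersionedVFS:
    def __init__(self):
        self._store = {}  # path -> list of (version, data)

    def write(self, path, data):
        versions = self._store.setdefault(path, [])
        versions.append((len(versions), data))

    def read(self, path, version=None):
        versions = self._store[path]
        if version is None:
            return versions[-1][1]  # latest version
        return versions[version][1]

vfs = VersionedVFS()
vfs.write("/fs/os/home/someone/nohup.out", b"first acquisition")
vfs.write("/fs/os/home/someone/nohup.out", b"second acquisition")
```

If you fetch the same file from a client twice, nothing is overwritten; you can always go back and diff against what you collected last week.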

And so I ended up in the grr_console, a purpose-built iPython shell, writing scripts to do what I needed. Remember the call spec I mentioned initially? Here is where it comes into play. Below is an example using the GetFile flow (notice that the pathspec in the flow statement says OS; this might as well have been REGISTRY or TSK):

token = access_control.ACLToken(username="someone", reason="Why")

flows = []
path = "/home/someone/nohup.out"

for client in SearchClients('host:Webserver'):
    client_id = client[0].client_id
    flow_urn = flow.GRRFlow.StartFlow(
        client_id=str(client_id),
        flow_name="GetFile",
        pathspec=rdfvalue.PathSpec(
            path=path,
            pathtype=rdfvalue.PathSpec.PathType.OS))
    # Keep the client id with the flow so we read the right
    # file back when the flow finishes
    flows.append((flow_urn, client_id))

files = []
while flows:
    # Iterate over a copy, since we remove finished flows as we go
    for flow_urn, client_id in list(flows):
        f = aff4.FACTORY.Open(flow_urn)
        r = f.GetRunner()
        if not r.IsRunning():
            fd = aff4.FACTORY.Open(str(client_id) + "/fs/os%s" % path,
                                   token=token)
            files.append(str(fd.Read(10000)))
            flows.remove((flow_urn, client_id))

If you are interested in Mandiant for Incident Response (MIR) and its concept, I'd recommend another YouTube video, by Douglas Wilson, which is quite awesome as well [7].

Update 2020: Today I wouldn't recommend MIR/FireEye HX, but rather something like LimaCharlie [8] due to the lack of hunting capabilities in the HX platform.

[1] https://github.com/google/grr

[2] http://www.fireeye.com/products-and-solutions/endpoint-forensics.html

[3] https://grr-doc.readthedocs.io/en/latest/installing-grr-server/index.html

[4] https://2459d6dc103cb5933875-c0245c5c937c5dedcca3f1764ecc9b2f.ssl.cf2.rackcdn.com/lisa13/castle.mp4

[5] GRR: Find All The Badness - https://docs.google.com/file/d/0B1wsLqFoT7i2Z2pxM0wycS1lcjg/edit?pli=1

[6] Jaffe, Marley. GRR Capstone Final Paper

[7] NoVA Hackers Doug Wilson - Lessons Learned from using OpenIOC: https://www.youtube.com/watch?v=L-J5DDG_SQ8

[8] https://www.limacharlie.io/