The Basics of osquery: Fleet Monitoring

Tommy
August 17 2018

In my post about telemetry I covered the concept of expressing complex indicators, and how systems should be able to interpret them and support the analyst. In another post I wrote about how telemetry is challenged by a changing, more diverse and modern landscape. Recently I have reviewed some device inventory and endpoint detection tools that add to the solution. I will get back to my view on Mozilla InvestiGator (MIG) in a future post, but this one focuses on a telemetry collection tool that I have grown fond of: osquery.

osquery was originally developed by Facebook for the purpose of:

Maintaining real-time insight into the current state of your infrastructure[…]

With osquery, the state of the operating system the agent runs on is abstracted to a SQL-based interface. It exposes a near-infinite amount of data, which is perfect for a network defender. osquery can even parse native SQLite databases, of which there are plenty on macOS. Like GRR and MIG it works in a distributed mode, meaning queries are pushed out to the endpoints. Events can also be streamed continuously, which matters when considering operational security.
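To get a feel for the query style, here is a sketch using Python's built-in sqlite3, with a made-up users table standing in for osquery's real table of the same name (in a real deployment the same SQL would run through osqueryi or a distributed query):

```python
import sqlite3

# osquery exposes OS state as virtual SQL tables. This sketch fakes a tiny
# "users" table in an in-memory SQLite database to show the query style only;
# the rows below are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (uid INTEGER, username TEXT, shell TEXT)")
conn.executemany("INSERT INTO users VALUES (?, ?, ?)", [
    (0, "root", "/bin/sh"),
    (501, "tommy", "/bin/zsh"),
])

# The kind of question a network defender might ask: which accounts have
# an interactive shell?
rows = conn.execute(
    "SELECT uid, username FROM users WHERE shell != '/usr/bin/false'"
).fetchall()
print(rows)  # [(0, 'root'), (501, 'tommy')]
```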

Example of the hardware_events table when plugging in and then detaching a Yubikey

osquery was open sourced in 2014 and now has a large community developing nearly every aspect of the tool. According to briefs available online, several major institutions, including Facebook, now use osquery in their service networks.

osquery is cross-platform and currently supports Linux, FreeBSD, Windows and macOS. That breadth is part of what separates it from alternatives like Sysmon.

Posts about osquery that you should review before moving on:

  • Doug Wilson's presentation during FIRST 2018 is a good technical intro to osquery.

That should get you started. The next section shows you how to quickly get a lab environment up and running.

Setup and Configuration

Prerequisites

There are only two things you need set up for the rest of this article if you are on macOS, both of which can easily be installed using Homebrew:

brew install go yarn

You also need to configure your Go path, which can be as simple as:

echo "export GOPATH=$HOME/go" >> ~/.bash_profile

Server Setup

Setup the Docker image of Kolide Fleet:

 mkdir -p $GOPATH/src/github.com/kolide
 cd $GOPATH/src/github.com/kolide
 git clone git@github.com:kolide/fleet.git
 cd fleet
 make deps && make generate && make
 docker-compose up

Populate the database:

./build/fleet prepare db

You are now ready to boot up the web UI and API server:

./build/fleet serve --auth_jwt_key=3zqHl2cPa0tMmaCa9vPSEq6dcwN7oLbP
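The --auth_jwt_key is just a random secret Fleet uses to sign session tokens, so any unpredictable string will do. One way to generate one (a sketch using Python's standard library):

```python
import secrets

# Generate a random value suitable for --auth_jwt_key. Fleet only needs an
# unpredictable string, so token_urlsafe from the standard library is fine.
key = secrets.token_urlsafe(24)  # 24 random bytes -> 32 URL-safe characters
print(key)
```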

Get the enrollment secret and certificate from the Kolide UI at https://localhost:8080 after completing the registration process.

Kolide enrollment

Client Setup

Make the API token (enrollment secret) persistent on the endpoint:

echo {enrollment-secret} > /etc/osquery/enrollment.secret

Define a flags file in /private/var/osquery/osquery.flags. The client uses this to apply the centralised TLS logging method, which is the API Kolide has implemented. The server certificate is also pinned via --tls_server_certs, so transport security is covered.

 --enroll_secret_path=/etc/osquery/enrollment.secret
 --tls_server_certs=/etc/osquery/kolide.crt
 --tls_hostname=localhost:8080
 --host_identifier=uuid
 --enroll_tls_endpoint=/api/v1/osquery/enroll
 --config_plugin=tls
 --config_tls_endpoint=/api/v1/osquery/config
 --config_tls_refresh=10
 --disable_distributed=false
 --distributed_plugin=tls
 --distributed_interval=10
 --distributed_tls_max_attempts=3
 --distributed_tls_read_endpoint=/api/v1/osquery/distributed/read
 --distributed_tls_write_endpoint=/api/v1/osquery/distributed/write
 --logger_plugin=tls
 --logger_tls_endpoint=/api/v1/osquery/log
 --logger_tls_period=10
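Since these flags are just --key=value lines, a quick standalone sanity check is easy to sketch (a Python approximation; osqueryd does its own validation, and the REQUIRED set here is my own minimal pick, not an official list):

```python
# A sample of the flags file above, inlined for illustration.
SAMPLE = """\
--enroll_secret_path=/etc/osquery/enrollment.secret
--tls_hostname=localhost:8080
--config_plugin=tls
--logger_plugin=tls
--disable_distributed=false
"""

def parse_flags(text):
    """Parse flag lines of the form --key=value (value may be empty)."""
    flags = {}
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("--"):
            key, _, value = line[2:].partition("=")
            flags[key] = value
    return flags

# Flags that the TLS enrollment/logging setup above cannot work without.
REQUIRED = {"enroll_secret_path", "tls_hostname", "config_plugin", "logger_plugin"}

flags = parse_flags(SAMPLE)
missing = sorted(REQUIRED - flags.keys())
print("missing:", missing)  # missing: []
```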

You can start the osquery daemon on the client with the following command. At this point you should also start thinking about packaging, which is detailed in the osquery docs.

/usr/local/bin/osqueryd --disable_events=false --flagfile=/private/var/osquery/osquery.flags

osquery also has an interactive mode if you would like to test the local instance, based on a local configuration file:

sudo osqueryi --disable_events=false --config_path=/etc/osquery/osquery.conf

To make the client persistent on macOS, follow the osquery documentation.

Managing the Kolide Configuration

For this part I found what worked best was using the Kolide CLI client:

./build/fleetctl config set --address https://localhost:8080
./build/fleetctl login
./build/fleetctl apply -f ./options.yaml

The options.yaml I used for testing follows. This setup also enables osquery File Integrity Monitoring (FIM), which I wasn't able to get working via the patched curl command in the docs. The config monitors changes to files under /etc and a test directory at /var/tmp/filetest.

apiVersion: v1
kind: options
spec:
  config:
    decorators:
      load:
      - SELECT uuid AS host_uuid FROM system_info;
      - SELECT hostname AS hostname FROM system_info;
    file_paths:
      etc:
        - /etc/%%
      test:
        - /var/tmp/filetest/%%
    options:
      disable_distributed: false
      distributed_interval: 10
      distributed_plugin: tls
      distributed_tls_max_attempts: 3
      distributed_tls_read_endpoint: /api/v1/osquery/distributed/read
      distributed_tls_write_endpoint: /api/v1/osquery/distributed/write
      logger_plugin: tls
      logger_tls_endpoint: /api/v1/osquery/log
      logger_tls_period: 10
      pack_delimiter: /
  overrides: {}
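In the file_paths patterns, % matches a single directory level while %% matches recursively. The toy matcher below mimics that behaviour for illustration (an approximation; osquery's own globbing is authoritative):

```python
# Toy approximation of osquery file_paths wildcard semantics:
#   /dir/%   -> direct children of /dir only
#   /dir/%%  -> everything under /dir, recursively

def fim_match(pattern, path):
    if pattern.endswith("/%%"):
        # Recursive: match anything under the prefix.
        return path.startswith(pattern[:-2])
    if pattern.endswith("/%"):
        # Single level: match direct children only (no further slashes).
        prefix = pattern[:-1]
        return path.startswith(prefix) and "/" not in path[len(prefix):]
    return path == pattern

print(fim_match("/var/tmp/filetest/%%", "/var/tmp/filetest/a/b.txt"))  # True
print(fim_match("/etc/%", "/etc/hosts"))                               # True
print(fim_match("/etc/%", "/etc/ssh/sshd_config"))                     # False
```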

Next Steps

In this article we've reviewed some of the basic capabilities of osquery and walked through a compact lab setup demonstrating centralised logging to Kolide Fleet using osquery's TLS API.

Platform support beyond Linux, Windows, macOS and FreeBSD remains unexplored and needs porting. I would have liked to see support for OpenBSD, Android and iOS, but the OpenBSD ticket was closed in 2018 and Android and iOS are low priority.

The local setup obviously does not scale beyond your own computer. I briefly toyed with the idea that this would be a perfect fit for ingestion into a Hadoop environment, and not surprisingly there's a nice starting point over at the Hortonworks forums.

There’s a lot of open source information on osquery. I also found the Uptycs blog useful.

Tags: #forensics #remote #endpoint #logging #monitoring

Remote Forensics is the New Black

Tommy
September 20 2014

Like everything else in information security, forensics is constantly evolving. One matter of special interest for practitioners is doing forensics on remote computers, not that it’s entirely new.

The use-case is self-explanatory to those working in the field, but for the beginners I’ll give a brief introduction.

When you get a case on your desk and it lights up as something interesting, what do you do? Your first step is probably searching for known malicious indicators in network logs. Having found something interesting on some of the clients, let's say ten in this case, you decide to put more effort into explaining the nature of the activity. None of the clients is nearby, and several of them are at locations with 1 Mbps upload speeds.

The next phase would probably be a search in open sources, which perhaps supports the suspicion that something fishy is going on. Now you'd like to examine some of the client logs for known hashes and strings you found, and the traditional way to go is acquiring disk and memory images physically. Or is it? That could easily take weeks for ten clients. In this case you are lucky and have a tool for performing remote forensics at hand, rolled out across your organization after a larger breach.

What's new in remote forensics is that the tools are beginning to mature. With that in mind, I would like to introduce the two products I find most relevant to the purpose:

  • Google Rapid Response (GRR) [1]
  • Mandiant for Incident Response (MIR) [2]

I actually haven't put the latter option to the test (though MIR supports OpenIOC, which is an advantage), but I have been taking GRR for a spin for some time now. There are other tools which may interest you as well, such as Sourcefire FireAMP, which I've heard performs well for endpoint protection. I've chosen to leave that out of this presentation since it is a different concept. Unsurprisingly, the following will use GRR as a basis.

For this post there are two prerequisites, which I highly recommend completing to get a feel for GRR:

  • Set up a GRR server [3]. In this post I've used the current beta, 3.0-2, running all services on the same machine, including the web server and client roll-in interface. There is an install script for the beloved Ubuntu here, but I couldn't easily get it working on other systems. One exception is Debian, which only needed minor changes. If you have difficulties with the latter, please give me a heads-up.
  • Sacrifice one client to be monitored (as far as I can tell it won't brick a production system, though). You will find binaries after packing the clients in the GRR server setup. See the screenshot below for details. The client will automatically report in to the server.

You can find the binaries by browsing from the home screen in the GRR web GUI. Download and install the one of choice.

A word of warning before you read the rest of this post: the GRR website is a little messy and not entirely intuitive. I found, after a lot of searching, that the best way to go about it is reading the code usage examples in the web GUI, especially when it comes to what Google has named flows. Flows are small plugins in GRR that may, for instance, let you task GRR with fetching a file from a specific path.

Notice the call spec. This can be transferred directly to the iPython console. Before I started off I watched a couple of presentations that Google has delivered at LISA. I think you should too if you'd like to see where GRR is going and why it came to be. The one here gives a thorough introduction to how Google makes sure they are able to respond to breaches in their infrastructure [4].

I would also like to recommend a presentation by Greg Castle at BlackHat for reference [5]. For usage and examples, Marley Jaffe at Champlain College has put up a great paper [6]. Have a look at the exercises at the end of it.

A good thing about GRR is that it supports the most relevant platforms: Linux, Windows and OS X. These are also the platforms fully supported internally at Google, so expect development to have a practical and long-term perspective.

GRR is not only relevant, it is also fully open source and extensible. It's written in Python, with all the niceness that comes with that. GRR has direct memory access through custom-built drivers. You will also find support for Volatility in there; well, Google forked it into a new project named Rekall, which is more suited for scale. Either way it provides support for plugins such as Yara.

If you, like me, got introduced to forensics through academia, you will like that GRR builds on Sleuthkit through pytsk for disk forensics (you may actually choose which layer you'd like to stay on). When you've retrieved an item, I just love that it gets placed in a virtual file system in GRR with complete versioning.

The virtual filesystem is where everything you've retrieved or queried the client about is stored, versioned for your pleasure. In addition to a solid console application, GRR provides a good web GUI offering an intuitive way of browsing nearly everything you can do in the console. I think the console is where Google would like you to live, though.

And so I ended up in the grr_console, a purpose-built iPython shell, writing scripts for what I needed it to do. Remember the call spec I mentioned initially? Here is where it comes into play. Below is an example using the GetFile flow (notice that the pathspec in the flow statement says OS; this might as well have been REGISTRY or TSK):

# Issue an access token for the session.
token = access_control.ACLToken(username="someone", reason="Why")

path = "/home/someone/nohup.out"
flows = []

# Start a GetFile flow on every client matching the hostname search.
for client in SearchClients('host:Webserver'):
    client_id = str(client[0].client_id)
    o = flow.GRRFlow.StartFlow(
        client_id=client_id,
        flow_name="GetFile",
        pathspec=rdfvalue.PathSpec(
            path=path, pathtype=rdfvalue.PathSpec.PathType.OS))
    flows.append((client_id, o))

# Poll until every flow has finished, then read each retrieved file out of
# the client's virtual filesystem. Iterate over a copy so we can remove
# finished flows safely, and keep the client id paired with its flow.
files = []
while flows:
    for client_id, o in list(flows):
        f = aff4.FACTORY.Open(o)
        if not f.GetRunner().IsRunning():
            fd = aff4.FACTORY.Open(client_id + "/fs/os%s" % path, token=token)
            files.append(str(fd.Read(10000)))
            flows.remove((client_id, o))

If you are interested in Mandiant IR (MIR) and its concept, I'd recommend another YouTube video by Douglas Wilson, which is quite awesome as well [7].

Update 2020: Today I wouldn’t recommend MIR/FireEye HX, but rather something like LimaCharlie [8] due to the lack of hunting capabilities in the HX platform.

[1] https://github.com/google/grr

[2] http://www.fireeye.com/products-and-solutions/endpoint-forensics.html

[3] https://grr-doc.readthedocs.io/en/latest/installing-grr-server/index.html

[4] https://2459d6dc103cb5933875-c0245c5c937c5dedcca3f1764ecc9b2f.ssl.cf2.rackcdn.com/lisa13/castle.mp4

[5] GRR: Find All The Badness - https://docs.google.com/file/d/0B1wsLqFoT7i2Z2pxM0wycS1lcjg/edit?pli=1

[6] Jaffe, Marley. GRR Capstone Final Paper

[7] NoVA Hackers Doug Wilson - Lessons Learned from using OpenIOC: https://www.youtube.com/watch?v=L-J5DDG_SQ8

[8] https://www.limacharlie.io/

Tags: #forensics #remote

This blog is powered by cl-yag and Tufte CSS!