Velociraptor
Hunting for evil - what Velociraptors do best!

Agentless hunting with Velociraptor
Sat, 02 Mar 2019

There has been a lot of interest lately in “agentless hunting”, especially using PowerShell. There are many reasons why agentless hunting is appealing - endpoints often already run a ton of agents, and yet another one may not be welcome. Sometimes we need to deploy endpoint agents as part of a DFIR engagement, and we may not want to permanently install yet another agent on endpoints.

This blog post explores an agentless deployment scenario, where we do not want to install Velociraptor permanently on the endpoint, but rather push it to endpoints temporarily to collect specific artifacts. The advantage of this method is that there are no permanent changes to the endpoint, as nothing is actually installed. However, we still get the full power of Velociraptor to collect artifacts, hunt for evil and more…

Agentless Velociraptor

Normally when deploying Velociraptor as a service, the binary is copied to the system and a service is installed. The service ensures that the binary is restarted when the system reboots, and so Velociraptor is installed on a permanent basis.

However, in the agentless deployment scenario we simply run the binary from a network share using group policy settings. The downside to this approach is that the endpoint needs to be on the domain network to receive the group policy update (and have the network share accessible) before it can run Velociraptor. When we run in agentless mode we are really after collecting a set of artifacts via hunts and then exiting - the agent will not restart after a reboot. This method is therefore suitable for quick hunts on corporate (non-roaming) assets.

In this post I will use Windows Server 2019, but this should also work on older versions.

Creating a network share

The first step is to create a network share with the Velociraptor binary and its configuration file. We will run the binary from the share in this example, but for more reliability you may want to copy the binary into e.g. a temp folder on the endpoint, in case the system becomes disconnected from the domain. For quick hunts though, running from the share should be fine.

We create a directory on the server (I will create it on the domain controller but you should probably not do that - find another machine to host the share).


I created a directory C:\Users\Deployment and ensured that it is read only. I have shared the directory as the name Deployment.

I now place the Velociraptor executable and client config file in that directory and verify that I can run the binary from the network share. The binary should be accessible via \\DC\Deployment\velociraptor.exe:


Creating the group policy object

Next we create the group policy object which forces all domain connected machines to run the Velociraptor client. We use the Group Policy Management Console:


Select the OU or the entire domain and click “Create New GPO”:


Now right click the GPO object and select “Edit”:


We will create a new scheduled task. Rather than schedule it at a particular time, we will select to run it immediately. This will force the command to run as soon as the endpoint updates its group policy settings (i.e. we do not want to wait for the next reboot of the endpoint).


Next we give the task a name and a description. In order to allow Velociraptor to access raw devices (e.g. to collect memory or NTFS artifacts) we specify that the client will run as NT_AUTHORITY\SYSTEM, and run without any user being logged on. It is also worth ticking the “hidden” checkbox here to prevent a console box from appearing.


Next click the Actions tab and add a new action. This is where we launch the Velociraptor client. The program is simply launched from the share (i.e. \\DC\Deployment\velociraptor.exe) and we give it the arguments that point it at the provided configuration file (i.e. --config \\DC\Deployment\client.config.yaml client -v).


In the Settings tab we can control how long we want the client to run. For a quick hunt this may be an hour or two, but for a DFIR engagement it might be a few days. The GPO will ensure the client is killed after the allotted time.


Once the GPO is installed it becomes active for all domain machines. You can now schedule any hunts you wish using the Velociraptor GUI. When a domain machine refreshes its group policy it will run the client, which will enroll and immediately participate in any outstanding hunts - thus collecting and delivering its artifacts to the server. After the allotted time has passed, the client will shut down without having installed anything on the endpoint.

You can force a group policy update by running gpupdate /force on the endpoint. Now you can verify that Velociraptor is running:



Note that when running Velociraptor in agentless mode you probably want to configure it so that the writeback file is written to the temp directory. The writeback file is how the client keeps track of its key material (and identity). The default is to store it in the client’s installation folder, but you should change this in the client’s config file:

  writeback_windows: $TEMP\\velociraptor.writeback.yaml

The file will remain in the client’s temp directory so if you ever decide to run the agentless client again (by pushing another group policy) the client id remains the same.

Alerting on event patterns
Sat, 02 Mar 2019

We have shown in earlier posts how Velociraptor uses VQL to define event queries that can detect specific conditions. These conditions can be used to create alerts and escalation actions.

One of the most useful types of alerts is detecting a pattern of activity. For example, we can detect failed and successful login attempts separately, but it is the specific pattern of events (say 5 failed login attempts followed by a successful one) that is interesting from a detection point of view.

This post illustrates how this kind of temporal correlation can be expressed in a VQL query. We then use it to create alerts for attack patterns commonly seen by intrusions.

Event Queries

Velociraptor executes queries written in the Velociraptor Query Language (VQL). The queries can be executed on the client, and their results streamed to the server. Alternatively the queries may be executed on the server and process the result of other queries which collected information from the client.

A VQL query does not have to terminate at all. VQL queries draw their data from a VQL plugin which may simply return data rows at different times. For example, consider the following query:

SELECT EventData as FailedEventData,
       System as FailedSystem
FROM watch_evtx(filename=securityLogFile)
WHERE System.EventID = 4625

This query sets up a watcher on a windows event log file. As new events are written to the log file, the query will produce those events as new rows. The rows will then be filtered so we only see event id 4625 (Failed logon event).
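To make the streaming semantics concrete, here is a rough Python analogue of such a query, with the watcher modeled as a generator. This is purely an illustration - the sample events and their field layout are invented, not the real evtx row schema:

```python
def watch_evtx(events):
    # Stand-in for the watch_evtx() VQL plugin: yields event rows
    # one at a time as they are written to the log.
    for event in events:
        yield event

def failed_logons(events):
    # Equivalent of the WHERE clause: keep only event ID 4625.
    for row in watch_evtx(events):
        if row["System"]["EventID"] == 4625:
            yield {"FailedEventData": row["EventData"],
                   "FailedSystem": row["System"]}

sample = [
    {"System": {"EventID": 4624}, "EventData": {"User": "bob"}},
    {"System": {"EventID": 4625}, "EventData": {"User": "eve"}},
]
matches = list(failed_logons(sample))   # only the failed logon remains
```

Because the source is a generator, the "query" never has to terminate - it simply produces rows whenever the underlying source does.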

Velociraptor can implement event queries on the client or on the server. For example, say we wanted to collect all failed event logs with the query above. We would write an artifact that encapsulates this query:

name: Windows.System.FailedLoginAttempts
parameters:
  - name: securityLogFile
    default: C:/Windows/System32/Winevt/Logs/Security.evtx
sources:
  - queries:
     - SELECT EventData as FailedEventData,
           System as FailedSystem
       FROM watch_evtx(filename=securityLogFile)
       WHERE System.EventID.Value = 4625

Then we simply add that artifact to the monitored artifact list in the config file:

Events:
  artifacts:
  - Generic.Client.Stats
  - Windows.System.FailedLoginAttempts
  version: 2
  ops_per_second: 10

The monitored artifacts are run on all clients connected to the server. The output from these queries is streamed to the server and stored in the client’s monitoring VFS directory.

Let’s test this artifact by trying to run a command using the Windows runas command. We will be prompted for a password, and failing to give the correct password will result in a logon failure event:


After a few seconds the event will be written to the windows event log and the watch_evtx() VQL plugin will emit the row - which will be streamed to the VFS monitoring directory on the server, where it can be viewed in the GUI:


The above screenshot shows that the monitoring directory now contains a subdirectory named after the artifact we created. Inside this directory are CSV files for each day and every failed logon attempt is detailed there.
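Since each day’s results land in a CSV file, they are easy to post-process with ordinary tools. A small sketch, assuming illustrative column names rather than the exact server schema:

```python
import csv
import io

# Sample rows as one day's monitoring CSV might contain them
# (column names here are illustrative).
sample_csv = """EventID,TargetUserName
4625,administrator
4625,administrator
4625,backup
"""

# Count failed logon attempts per account for the day.
attempts_per_user = {}
for row in csv.DictReader(io.StringIO(sample_csv)):
    user = row["TargetUserName"]
    attempts_per_user[user] = attempts_per_user.get(user, 0) + 1
```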

Time correlation

While it is interesting to see all failed logon attempts, in many cases these events are just noise. If you put any server on the internet (e.g. an RDP or SSH server) you will experience thousands of brute force attempts to break in. This is just the nature of the internet. If your password policy is strong enough it should not be a big problem.

However, what if someone guesses the password for one of your accounts? Then the activity pattern is more like a bunch of failed logons followed by a successful logon for the same account.

This pattern is way more interesting than just watching for a series of failed logons (although that is also good to know).

But how do we write a query to detect this? Essentially the query needs to look back in time to see how many failed logon attempts preceded each successful logon.

This is a typical problem which may be generalized as follows:


We want to detect an event A preceded by a specified number of events B within a defined time window.

This general pattern covers many detections, for example:

  1. Detect a user account created and deleted within a short time window.
  2. A beacon to a specific DNS name preceded by at least 5 beacons to the same name within the last 5 hours (events A and B are the same).

The fifo() plugin

How shall we write the VQL query to achieve this? This is made possible by use of the fifo() plugin. As its name suggests, the FIFO plugin acts as a First In First Out cache for event queries.


The plugin is given a subquery which is also a VQL query generating its own events. As the subquery generates events, each event is kept in the fifo plugin’s cache in a first in first out manner. Events are also expired if they are too old.
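The cache behaviour described above can be sketched in Python. The max_rows and max_age names mirror the plugin’s parameters, but this is an illustration, not the actual implementation:

```python
import time
from collections import deque

class FifoCache:
    """Sketch of the fifo() plugin's cache: keep at most max_rows
    recent events and drop events older than max_age seconds."""

    def __init__(self, max_rows=5, max_age=3600):
        self.max_rows = max_rows
        self.max_age = max_age
        self.rows = deque()            # (timestamp, event) pairs

    def push(self, event, now=None):
        now = time.time() if now is None else now
        self.rows.append((now, event))
        while len(self.rows) > self.max_rows:
            self.rows.popleft()        # first in, first out

    def snapshot(self, now=None):
        # Return cached events, expiring anything too old.
        now = time.time() if now is None else now
        return [e for (t, e) in self.rows if now - t <= self.max_age]

cache = FifoCache(max_rows=5, max_age=3600)
for i in range(7):
    cache.push({"event": i}, now=i)
recent = cache.snapshot(now=7)         # only the last 5 events survive
```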

We typically store the query in a variable. Each time the variable is queried the cache is returned at once. To illustrate how this works consider the following query:

LET fifo_events = SELECT * FROM fifo(
     query={
        SELECT * FROM watch_evtx(filename=securityLogFile)
        WHERE System.EventID.Value = 4625
     },
     max_rows=5, max_age=3600)

SELECT * FROM foreach(
     row={
        SELECT * FROM clock(period=60)
     },
     query={
        SELECT * FROM fifo_events
     })

The first query is stored into the fifo_events variable. When it is first defined, the fifo() VQL plugin launches its subquery and simply collects its output into its local cache in a fifo manner. This will essentially keep the last 5 rows in its cache.

The second query runs the clock() plugin to receive a clock event every 60 seconds. For each of these events, we select from the fifo_events variable - that is we select the last 5 failed events.

You can see that this allows us to query the last 5 events in the fifo cache for every clock event. If we now replace the clock event with a successful logon event this query will do exactly what we want:

# This query will generate failed logon events - one per row, as
# they occur.
- LET failed_logon = SELECT EventData as FailedEventData,
     System as FailedSystem
  FROM watch_evtx(filename=securityLogFile)
  WHERE System.EventID.Value = 4625

# This query will create a fifo() to contain the last 5 failed
# logon events.
- LET last_5_events = SELECT FailedEventData, FailedSystem
      FROM fifo(query=failed_logon, max_rows=5, max_age=3600)

# This query simply generates successful logon events.
- LET success_logon = SELECT EventData as SuccessEventData,
     System as SuccessSystem
  FROM watch_evtx(filename=securityLogFile)
  WHERE System.EventID.Value = 4624

# For each successful event, we select the last 5 failed events
# and count them (using the group by). If the count is greater
# than 3 then we emit the row as an event.
- SELECT * FROM foreach(
    row=success_logon,
    query={
     SELECT SuccessSystem.TimeCreated.SystemTime AS LogonTime,
            SuccessSystem, SuccessEventData, FailedEventData,
            FailedSystem, count(items=SuccessSystem) as Count
     FROM last_5_events
     WHERE FailedEventData.SubjectUserName = SuccessEventData.SubjectUserName
     GROUP BY LogonTime
    })  WHERE Count > 3

The above query simply watches the event log for failed logins and populates a fifo() with the last 5 failed events. At the same time we monitor the event log for successful logon events. If we see a successful event, we go back and check the last 5 failed events and count them.

If the failed events are for the same user and there are more than 3 then we report this as an event. We now have a high value event.
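The logic of the whole detection can be sketched in Python, assuming a merged stream of simplified event dicts (the EventID/User field names are invented for illustration):

```python
from collections import deque

def detect_bruteforce(events, max_failed=5, threshold=3):
    # Keep the last max_failed failed logons (like the fifo() cache).
    failed = deque(maxlen=max_failed)
    alerts = []
    for ev in events:
        if ev["EventID"] == 4625:                  # failed logon
            failed.append(ev)
        elif ev["EventID"] == 4624:                # successful logon
            matching = [f for f in failed if f["User"] == ev["User"]]
            if len(matching) > threshold:          # more than 3 recent failures
                alerts.append({"User": ev["User"],
                               "FailedCount": len(matching)})
    return alerts

# 4 failed attempts for "admin" followed by a success -> one alert.
stream = [{"EventID": 4625, "User": "admin"}] * 4 + \
         [{"EventID": 4624, "User": "admin"}]
alerts = detect_bruteforce(stream)
```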

Let’s see what it looks like when such an event is triggered:


Just like before, the events are written to a daily CSV log, one event per CSV row. It is a bit hard to see in the GUI since there is a lot of data, (We probably need some GUI work to improve this) but there is a single row emitted for each event, and the FailedEventData column contains a list of all the failed login attempts stored in the fifo().

Server side queries

We have seen how the fifo() plugin can be used in the monitoring artifact itself to have the client detect its own events. However, the endpoint is usually only able to see its own events in isolation. It would be nice to detect patterns that are only evident when observing concerted behaviour from multiple endpoints at the same time.

For example, consider the pattern of an attacker who compromised domain credentials running multiple PowerShell Remoting commands across the entire domain. A command like:

PS C:\WINDOWS\system32> Invoke-Command -ComputerName testcomputer -ScriptBlock {Hostname}

This command will generate multiple event log entries, including event 4624 (logon) on each host. While this event is not suspicious in isolation on any single endpoint, seeing it repeated across the domain within a short time might well be.

To set that up we would run the following artifact as a monitoring artifact on all endpoints:

name: Windows.Event.SuccessfulLogon
parameters:
 - name: securityLogFile
   default: C:/Windows/System32/Winevt/Logs/Security.evtx
sources:
 - queries:
   - SELECT EventData as SuccessEventData,
        System as SuccessSystem
     FROM watch_evtx(filename=securityLogFile)
     WHERE System.EventID.Value = 4624

On the server we simply install a watcher on all monitoring events from this artifact and feed the result to the fifo(). This fills the fifo() with the last 500 successful logon events from all clients within the last 60 seconds:

LET last_successful_logons = SELECT * FROM fifo(
     query={
        SELECT * FROM watch_monitoring(
           artifact='Windows.Event.SuccessfulLogon')
     },
     max_rows=500, max_age=60)

By counting the number of such unique events we can determine if there were too many successful logon events from different hosts within the last minute. This might indicate scripted use of PowerShell Remoting across the domain.
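A rough Python sketch of this server side counting, assuming a merged stream of (timestamp, hostname) logon events and illustrative thresholds:

```python
def suspicious_logon_burst(events, window=60, min_hosts=10):
    """Flag any `window`-second interval containing successful
    logons on at least min_hosts distinct hosts."""
    events = sorted(events)
    alerts = []
    for (t0, _) in events:
        # Distinct hosts with a logon in [t0, t0 + window).
        hosts = {h for (t, h) in events if t0 <= t < t0 + window}
        if len(hosts) >= min_hosts:
            alerts.append((t0, len(hosts)))
    return alerts

# 12 different hosts logging on within 12 seconds -> suspicious.
sample = [(i, "host%d" % i) for i in range(12)]
alerts = suspicious_logon_burst(sample, window=60, min_hosts=10)
```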


In this post we have seen how to write artifacts which capture a time ordered pattern of behavior. This technique is useful to codify common attack techniques. The technique is general and we can use the same idea on server side queries to correlate events from many hosts at the same time.

Velociraptor Performance
Thu, 14 Feb 2019

We are often asked how many resources a Velociraptor deployment uses and how one should spec a machine for a Velociraptor deployment. We have previously said that one of the reasons we developed Velociraptor was to improve on the performance of GRR, which was not scalable for our use case.

We’ve been working with the team at Klein & Co. on several intrusions over the past several months, which are providing valuable opportunities to deploy and test Velociraptor in a range of real world investigation scenarios. Through this process, we’ve been able to extend Velociraptor’s functionality and prove its performance on real client networks.

I thought I would write a short blog post to show how Velociraptor performed on such a recent engagement. In this engagement we deployed Velociraptor on AWS and selectively pushed the client to around 600 machines running a mix of MacOS and Windows.

This post will hopefully give readers some idea of how scalable the tool is and the typical workloads we run with it.

The Server

Since this is a smallish deployment we used a single VM with 32GB of RAM and 8 cores. This was definitely over-specced for this job, as most of the time the server consumed less than 10% of one core:

top - 06:26:13 up 29 days,  2:31,  5 users,  load average: 0.00, 0.01, 0.05
Tasks: 214 total,   1 running, 213 sleeping,   0 stopped,   0 zombie
%Cpu(s):  0.5 us,  0.1 sy,  0.0 ni, 99.4 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
KiB Mem:  32948060 total, 14877988 used, 18070072 free,   411192 buffers
KiB Swap:        0 total,        0 used,        0 free. 13381224 cached Mem

19334 root      20   0 1277924  94592  12616 S   3.0  0.3   9:11.03 ./velociraptor --config server.config.yaml frontend
    8 root      20   0       0      0      0 S   0.3  0.0   7:16.30 [rcuos/0]

You can see that the server consumed about 95MB of memory when operating normally, and CPU usage was around 3% of one core.

For this engagement we knew that we would be collecting a lot of data and so we specified a large 500GB volume.

Hunt performance

Velociraptor works by collecting “Artifacts” from clients. Artifacts are simply encapsulated VQL queries which specify something to search for on the endpoint. Without going into the details of this engagement, we can say that we collected typical artifacts for a DFIR/Forensic investigation engagement. In the following I want to explore how well hunts performed for the following typical artifacts in order of complexity:

  1. Search the filesystem for a file glob.
  2. Download the $MFT from the root filesystem.
  3. Run a Yara scan over every file on all mounted filesystems.

We ran these artifact collections over a large number of hosts (between 400-500) that fell within the scope of our engagement. Although the number of hosts is not huge, we hope to demonstrate Velociraptor’s scalability.

Searching for a file glob

One of the simplest and most common tasks in DFIR is to search the filesystem for a glob based on filename. This requires traversing directories and matching the filename based on the user specified expression - for example, find all files with the extension *.exe within the C:\Users directory.

Velociraptor can glob over the entire filesystem or over a limited set of files. Typically a full filesystem glob takes some minutes on the endpoint (it is equivalent to running the Unix find command) and touches every file. We try to limit the scope of the glob as much as possible (e.g. only search system directories), but sometimes it is nice to run a glob over all mounted filesystems to make sure we don’t miss anything. In this case we opted for a full filesystem scan.
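Conceptually a filename glob is just a recursive directory walk with pattern matching, as in this Python sketch (a simplification of what the artifact actually does):

```python
import fnmatch
import os

def find_files(root, pattern):
    # Walk every directory under root and match filenames against
    # the pattern - roughly `find root -name pattern`.
    for dirpath, dirnames, filenames in os.walk(root):
        for name in filenames:
            if fnmatch.fnmatch(name.lower(), pattern.lower()):
                yield os.path.join(dirpath, name)

# Example: hits = list(find_files(r"C:\Users", "*.exe"))
```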

We searched the entire deployment using a hunt (constructed using the File Finder flow in the GUI) which just launches the artifact collection. The horizontal distance between the red and blue dots in the graph below therefore represents the total time taken by each host to collect the artifact.


The graph shows on the Y axis how many hosts were recruited into this hunt. The X axis shows the number of seconds since the hunt launch. The red points indicate the time when clients started their collection, while the blue dots indicate the time when the client completed the artifact collection and the server saved its results.

The inset shows the same data but zoomed into the time origin.

Velociraptor improves and builds on the initial ideas implemented within the GRR DFIR tool, and so it is interesting to compare this graph to a typical graph produced by GRR’s hunt (reproduced from this paper).


The first noticeable difference is that Velociraptor clients complete their collection much faster than GRR’s (the horizontal distance between the red and blue dots represents the time between when the collection is issued and the time it completes).

The main reason for this is that GRR’s communication protocol relies on polling (by default every 10 minutes). Also, since hunting is so resource intensive in GRR, the clients actually poll the hunt foreman task every 30 minutes by default. This means that GRR clients typically have to wait up to 30 minutes to run a hunt!

The second difference is the slope of the line around the start of the hunt. GRR implements a hunt client rate - clients are recruited into the hunt slowly (by default 20 per minute) in order to limit the load on the frontends. Unlike GRR, Velociraptor does not implement a hunt rate since the Velociraptor frontend load is controlled by limiting concurrency instead (more on this below).

This means that Velociraptor can deliver useful results within seconds of the hunt starting. We see that this particular filename search typically takes 25-30 seconds and we see about 200 clients completing the hunt within this time consistently. The remaining clients are probably not online and they receive the hunt as they join the network. This makes Velociraptor hunts far more responsive and useful.

You might also notice a few outliers which spend a long time collecting this artifact - these machines have probably been shut down or suspended while collecting it.

MFT Download

A common technique is to examine the Master File Table (MFT) of an NTFS volume. By forensically analyzing the MFT it is possible to detect deleted files, detect time stomping, and build a timeline of the system using tools such as ntfswalker.

In this case we decided to collect the $MFT from all the Windows hosts and post-process them offline. Typically the MFT is around 300-400MB and can be larger. This artifact collection is therefore a test of how quickly large quantities of data can be downloaded from multiple hosts.

Velociraptor can read the raw NTFS partition and therefore read the $MFT file. We wrote the following artifact to just fetch the $MFT file:

name: Artifact.NTFS.MFT_puller
description: |
   Uses an NTFS accessor to pull the $MFT

parameters:
- name: path
  default: \\.\C:\$MFT

sources:
- precondition:
    SELECT OS From info() where OS = 'windows'
  queries:
  - SELECT upload(file=path, accessor="ntfs") as Upload from scope()

Here is the timing graph for this artifact collection:


This collection takes a lot longer on each host, as clients are uploading around 400MB each to the server, but our server was in the cloud so it had fast bandwidth. Again we see the hosts that are currently up being tasked within seconds, while hosts that come online later receive the hunt and upload their $MFT file a few minutes afterwards.

Was the frontend loaded at the time? I took a screenshot of top on the server seconds after launching the hunt:


We can see that the CPU load is trivial (4.7%) but the major impact of a heavy upload collection is the memory used (about 4.7GB - up from about 100MB). The reason is that each client is posting a large buffer of data (several MB) simultaneously. The server needs to buffer the data before it can decrypt and process it, which takes memory.

In order to limit the amount of memory used, Velociraptor limits the total number of connections it is actively processing to 8-10 concurrent connections. By carefully managing concurrency we are able to keep a limit on server memory use. We may lower the total memory use by reducing the concurrency (and therefore maybe fit into a smaller VM). Clients simply wait until the server is able to process their uploaded buffers. If the server takes too long, the clients automatically back off and retry sending the same buffer.
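The concurrency limiting idea can be sketched with a semaphore (a simplification of the real frontend, which also handles back-off and retries):

```python
import threading

class ConcurrencyLimiter:
    """Only max_concurrent uploads are processed at once; the rest
    block (clients effectively wait) until a slot frees up, which
    bounds the amount of buffered upload data in memory."""

    def __init__(self, max_concurrent=8):
        self._slots = threading.Semaphore(max_concurrent)

    def process(self, handler, payload):
        with self._slots:      # blocks while 8 uploads are in flight
            return handler(payload)

limiter = ConcurrencyLimiter(max_concurrent=8)
result = limiter.process(lambda data: len(data), b"x" * 1024)
```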

Yara scan over the entire filesystem

The final example of a very intense artifact is to scan the entire filesystem with a YARA rule. This not only requires traversing the entire filesystem, but also opening each file and searching it.

One of the dangers with such a scan is that users will be negatively impacted as their workstations start to read every file on disk! The main resources a YARA scan consumes are disk IO and CPU. Users might complain and blame Velociraptor for their machine being slow (disk IO may degrade perceived performance much more than CPU load!).

However in this case, we don’t care how long we take to scan the user’s system, as long as every file was scanned, and as long as the endpoint is not overloaded and the user’s work is not affected. Luckily Velociraptor allows us to specify the trade-off between collection time and collection intensity.

Velociraptor rate limiting

Velociraptor controls client side load by rate limiting the client’s VQL query. Each VQL plugin consumes an “operation” from the throttler. We define an “operation” as a notional unit of work - the heavier the VQL plugin’s work, the more operations are consumed. For example, for YARA scanning an operation is defined as 1MB of scanned data, or a single file if the file is smaller.

When a user issues an artifact collection task, they may limit the rate at which operations are performed by the client. The Velociraptor agent then limits the operations to the specified rate. For example, if the rate is 20 ops/sec then the client will scan at most 20MB per second.
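A simple token bucket captures the idea: ops are charged as work is done, and the caller sleeps once the per-second budget is exhausted. A Python sketch, not the actual client implementation:

```python
import time

class Throttler:
    """Each unit of work consumes one "op"; charging ops faster
    than ops_per_second makes the caller sleep."""

    def __init__(self, ops_per_second):
        self.rate = float(ops_per_second)
        self.allowance = self.rate
        self.last = time.monotonic()

    def charge(self, ops=1):
        now = time.monotonic()
        # Refill the budget for the time elapsed, capped at one second's worth.
        self.allowance = min(self.rate,
                             self.allowance + (now - self.last) * self.rate)
        self.last = now
        self.allowance -= ops
        if self.allowance < 0:
            time.sleep(-self.allowance / self.rate)   # pay off the deficit
            self.last = time.monotonic()
            self.allowance = 0.0

throttle = Throttler(ops_per_second=20)
start = time.monotonic()
for _ in range(40):        # e.g. 40MB of scanning at 1 op per MB
    throttle.charge()
elapsed = time.monotonic() - start    # roughly 1 second
```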

Other collections may run concurrently at different rates, though; the client is not blocked while performing a single artifact collection. This makes sense since we often need to collect a low priority artifact slowly, but we do not want this to compromise rapid response on that host.

For example, one of our targets was a server with large attached storage. We ran the YARA scan over this system, scanning the first 100MB of each file, at a rate of 50 ops/sec. In total we scanned 1.5TB of files and the scan took 14 hours (for a total scan rate of around 30MB/sec).
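The quoted scan rate follows directly from those numbers:

```python
total_bytes = 1.5e12              # 1.5TB of files scanned
duration_s = 14 * 3600            # 14 hour scan
rate_mb_s = total_bytes / duration_s / 1e6
# rate_mb_s is about 29.8, i.e. the ~30MB/sec figure quoted above
```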

Velociraptor by default collects the Generic.Client.Stats artifact, which samples the client’s CPU utilization and memory usage every 10 seconds. These samples are streamed to the server and form a record of the client’s footprint on the endpoint. We can use this data to visualize the effects of performing the YARA scan on this host:


Above is the CPU usage on that particular server over the course of a full day (24 hours). The 14 hour YARA scan is clearly visible, but at no time does CPU utilization exceed 30% of one core. With endpoint disk IO at around 30MB/sec we have achieved a balance between performance and endpoint load that we are happy with.


We can see that most endpoints take approximately an hour to perform this yara scan, but server load is minimal since the server simply stores the results of the scans while doing minimal processing.


This post provides some typical numbers for Velociraptor performance in typical DFIR engagements. We also covered some considerations and trade-offs we must think about when issuing large artifact collections. Readers can use these as a guideline in their own deployments - please comment below about your experiences. Velociraptor is under very active development and this feedback is important to ensure we put in place the mechanisms to account for more use cases.


We would like to thank the folks at Klein & Co. for their wonderful support and assistance in Velociraptor’s development.

Velociraptor Python API
Sun, 10 Feb 2019

Velociraptor is very good at collecting artifacts from endpoints. However, in modern DFIR work the actual collection is only the first step of a much more involved process. Typically we want to post-process data using more advanced data mining tools (such as data stacking). Velociraptor is usually only one part of a wider solution which might include a SIEM and SOC integration.

In order to facilitate interoperability with other tools, Velociraptor now offers an external API. The API is offered via gRPC so it can be used in any language which gRPC supports (e.g. Java, C++, Python etc). In this blog post we illustrate the Python API but any language should work.

The Velociraptor API Server

The API server exposes an endpoint ready to accept gRPC connections. By default the API server listens only on the loopback interface (, but it is easy to make it externally accessible if you need to by changing the server.config.yaml file:

API:
  bind_address:
  bind_port: 8001

Client programs simply connect directly to this API and call gRPC methods on it.


The connection is encrypted using TLS and authenticated using mutual certificates. When we initially created the Velociraptor configuration file, we created a CA certificate and embedded it in the server.config.yaml file. It is this CA certificate which is used to verify that the certificate each end presents was issued by the Velociraptor CA.


If you need extra security in your environment you should keep the original server.config.yaml file in an offline location, then deploy a redacted file (without the CA.private_key value) on the server. This way API client certificates can only be issued offline.

Before the client may connect to the API server they must have a certificate issued by the Velociraptor CA. This is easy to generate:

$ velociraptor --config server.config.yaml \
     config api_client --name Fred > api_client.yaml

Will generate something like:

ca_certificate: |
client_cert: |
client_private_key: |
name: Fred

The certificate generated has a common name as specified by the --name flag. This name will be logged in the server’s audit logs, so you can use it to keep track of which programs have access. For convenience, this single file holds the private key and certificate, as well as the CA certificate which must be used to authenticate the server.

Using the API from Python

Although the API exposes a bunch of functions used by the GUI, the main function (which is not exposed through the GUI) is the Query() method. This function simply executes one or more VQL queries, and streams their results back to the caller.

The function requires an argument which is a protobuf of type VQLCollectorArgs:

     env:  list of VQLEnv(string key, string value)
     Query: list of VQLRequest(string Name, string VQL)
     max_row: int
     max_wait: int
     ops_per_second: float

This very simple structure allows the caller to specify one or more VQL queries to run. The caller can also set up environment variables prior to the query execution. The max_row and max_wait parameters indicate how many rows to return in a single result set and how long to wait for additional rows before returning a result set.

The call simply executes the VQL queries and returns result sets as VQLResponse protobufs:

   Response: json encoded string
   Columns: list of string
   total_rows: total number of rows in this packet

The VQL query may return many responses - each represents a set of rows. These responses may be returned over a long time, the API call will simply wait until new responses are available. For example, the VQL may represent an event query - i.e. watch for the occurrence of some event in the system - in this case it will never actually terminate, but keep streaming response packets.

What does this look like in code?

The following covers an example implementation in Python. The first step is to prepare credentials for making the gRPC call. We parse the api_client.yaml file and prepare a credentials object:

import grpc
import json
import yaml

# api_pb2 and api_pb2_grpc below are generated from the Velociraptor .proto files.
config = yaml.safe_load(open("api_client.yaml").read())
creds = grpc.ssl_channel_credentials(
    root_certificates=config["ca_certificate"].encode("utf8"),
    private_key=config["client_private_key"].encode("utf8"),
    certificate_chain=config["client_cert"].encode("utf8"))

options = (('grpc.ssl_target_name_override', "VelociraptorServer",),)

Next we connect the channel to the API server:

with grpc.secure_channel(config["api_connection_string"],
                         creds, options) as channel:
    stub = api_pb2_grpc.APIStub(channel)

The stub is the object we use to make calls with. We can then issue our call:

request = api_pb2.VQLCollectorArgs(
    max_wait=1,
    Query=[api_pb2.VQLRequest(
        Name="Test",
        VQL="SELECT Pid, Exe FROM pslist()")])

for response in stub.Query(request):
    rows = json.loads(response.Response)
    for row in rows:
        print(row)   # process each row as needed
We issue the query and then just wait for the call to generate response packets. Each packet may contain several rows which will all be encoded as JSON in the Response field. Each row is simply a dict with keys being the column names, and the values being possibly nested dicts or simple data depending on the query.
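That decoding step can be seen in isolation with a hand-written JSON packet standing in for a real server response (the packet contents here are fabricated for illustration):

```python
import json

# A fabricated example of what one VQLResponse packet might carry;
# a real packet comes back from stub.Query().
packet = ('[{"Pid": 13207, "Exe": "/usr/bin/gimp-2.10",'
          ' "os_info": {"system": "linux"}}]')

rows = json.loads(packet)
for row in rows:
    # Each row is a dict keyed by column name; values may be nested
    # dicts or simple data depending on the query.
    print(row["Pid"], row["os_info"]["system"])
```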

What can we do with this?

The Velociraptor API is deliberately open ended - meaning we do not impose any limitations on what can be done with it. It is conceptually a very simple API - just issue the query and look at the results - yet this makes it extremely powerful.

We already have a number of very useful server side VQL plugins you can use, and we plan to add more in the future - this means the Velociraptor API can easily be extended in a backwards compatible way simply by adding new VQL plugins. New queries can do more, without breaking existing queries.

Post process artifacts

This is the most common use case for the API. Velociraptor deliberately does not do any post processing on the server - we don’t want to slow the server down by making it do more work than necessary.

But sometimes users need to do some more with the results - for example upload to an external system, check hashes against Virus Total, and even initiate an active response like escalation or disruption when something is detected.

In a recent engagement we needed to collect a large number of $MFT files from many endpoints. We wanted to analyze these using external tools like

We wrote a simple artifact to collect the MFT:

name: Windows.Upload.MFT
description: |
   Uses an NTFS accessor to pull the $MFT

parameters:
  - name: path
    default: \\.\C:\$MFT

sources:
  - precondition:
      SELECT OS From info() where OS = 'windows'

    queries:
      - select upload(file=path, accessor="ntfs") as Upload from scope()

We then created a hunt to collect this artifact from the machines of interest. Once each $MFT file is uploaded, we need to run a parser over it:

  SELECT Flow,
         file_store(path=Flow.FlowContext.uploaded_files) as Files
  FROM  watch_monitoring(artifact='System.Flow.Completion')
  WHERE 'Windows.Upload.MFT' in Flow.FlowContext.artifacts

with grpc.secure_channel(config["api_connection_string"],
                         creds, options) as channel:
    stub = api_pb2_grpc.APIStub(channel)
    request = api_pb2.VQLCollectorArgs(
        Query=[api_pb2.VQLRequest(
            Name="Watcher",
            VQL=query)])   # query holds the watch VQL shown above

    for response in stub.Query(request):
        rows = json.loads(response.Response)
        for row in rows:
            for file_name in row["Files"]:
                # The parser binary's path was elided in the original post.
                subprocess.call(
                    ["", "-f", file_name,
                     "-o", file_name+".analyzed"])

The previous code sets up a watcher query which will receive every completed flow on the server which collected the artifact “Windows.Upload.MFT” (i.e. each completed flow will appear as a row to the query).

We can have this program running in the background. We can then launch a hunt collecting the artifact, and the program will automatically process all the results from the hunt as soon as they occur. When new machines are turned on they will receive the hunt, have their $MFT collected and this program will immediately process that.

Each flow contains a list of files that were uploaded to it. The file_store() VQL function reveals the server’s filesystem path where the files actually reside. The server simply stores the uploaded files on its filesystem since Velociraptor does not use a database (everything is a file!).

The python code then proceeds to launch the script to parse the $MFT.


The nice thing about this scheme is that the post-processing script runs in its own process and can be managed separately from the main Velociraptor server (e.g. we can set its execution priority or even run it on a separate machine). The Velociraptor server does not actually need to wait for post processing, nor will the post processing affect its performance in any way. If the script takes a long time, it will just fall behind but it will eventually catch up. In the meantime, the Velociraptor server will continue receiving the uploads regardless.
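This decoupling can be illustrated with a toy producer/consumer sketch - the "server" keeps producing results while a separate post-processor drains its own queue at its own pace (illustrative only, not Velociraptor code):

```python
import queue
import threading

uploads = queue.Queue()   # stands in for the flow-completion event stream
processed = []

def post_processor():
    # Runs independently; if it is slow it just falls behind,
    # but the producer is never blocked.
    while True:
        item = uploads.get()
        if item is None:   # sentinel: no more work
            break
        processed.append(item.upper())   # stand-in for parsing an $MFT

worker = threading.Thread(target=post_processor)
worker.start()

# The "server" produces uploads without waiting on the consumer.
for name in ["mft-host1", "mft-host2", "mft-host3"]:
    uploads.put(name)

uploads.put(None)
worker.join()
print(processed)
```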

The above example sets up a watcher query to receive flow results in real time, but you can also just process the results of a specific hunt completely using a query like:

SELECT Flow, file_store(path=Flow.FlowContext.uploaded_files) as Files
FROM hunt_flows(hunt_id=huntId)


The Velociraptor python API opens up enormous possibilities for automating Velociraptor and interfacing it with other systems. Combining the power of VQL and the flexibility (and user familiarity) of Python allows users to build upon Velociraptor in a flexible and creative way. I am very excited to see what the community will do with this feature - I can see integration with ELK, BigQuery and other data analytic engines being a valuable use case.

Please share your experiences in the comments or on the mailing list at

Sat, 09 Feb 2019 00:00:00 +1000 <![CDATA[Deploying Velociraptor with OAuth SSO]]> Deploying Velociraptor with OAuth SSO

In the previous post we saw how to set up Velociraptor's GUI over SSL. This is great, but we still need to create users and assign them passwords manually. The trouble with manual user account management is that we cannot enforce two-factor authentication, password policies or any of the usual enterprise requirements. It is also difficult for users to remember yet another password for a separate system, which encourages weak, easily guessed passwords.

Most enterprise systems require an SSO mechanism to manage user accounts and passwords. Manual user account management simply does not scale!

In this post we discuss how to enable Google’s SSO authentication for Velociraptor identity management.

OAuth Identity management

Velociraptor can use Google's OAuth mechanism to verify a user's identity. This requires the user to authenticate to Google via their usual mechanism - if their account requires two-factor authentication, they will log in that way.

Once the user authenticates to Google, they are redirected back into the Velociraptor application with a token that allows the application to request information about the user (for example, the username or email address).


OAuth is an authentication protocol. This means Velociraptor can be pretty confident that users are who they claim to be. This does not automatically grant them access to the application! A Velociraptor administrator must still manually grant access before a user may log in.

Before we can use Google for authentication, we need to register our Velociraptor deployment as an OAuth app with Google. Unfortunately Google is not known for intuitive, easy to follow processes, so actually doing this is complicated and bounces through many seemingly unrelated Google products and services. This post attempts to document the process as it exists at this time.

For our example we assume that our server is located at as we continue on from our example in the last post (i.e. it is already configured to use SSL).

Registering Velociraptor as an OAuth application

The first step is to register Velociraptor as an OAuth app. We do this by accessing the Google cloud console at . You will need to set up a cloud account first and create a cloud project. Although in this example we do not necessarily need to host our application on Google cloud or have anything to do with Google cloud, OAuth seems to exist within the Google cloud product.

Our ultimate goal is to obtain OAuth credentials to give our Velociraptor app, but we have to have a few things set up first. The cloud console is fairly confusing so I usually use the search feature to find exactly what I need. Searching for “oauth” at the search bar indicates that it is under “APIs and Services”.

We need to set up the OAuth consent screen first - in which we give our application a name to be presented to the user by the OAuth flow:


Further down we need to provide an authorized domain


In order to add an Authorized domain we need to verify it. Google’s help pages explain it further:

Authorized domains

To protect you and your users, Google restricts your OAuth 2.0 application to using Authorized Domains. If you have verified the domain with Google, you can use any Top Private Domain as an Authorized Domain.

And this links to a page which again seems completely unrelated to OAuth, Velociraptor or even a web app (the Webmasters product is supposed to help sites increase their search presence).

Within this product we now need to “Add a property”:


Hidden within the settings menu there is an option “Verification Details” which allows you to verify that you own the domain. If you purchased your domain from Google Domains then it should already be verified - otherwise you can set some TXT records to prove you own the domain.


After all this we can go back to the cloud console and Create Credentials/OAuth client ID:


Now select “Web App” and we must set the “Authorized redirect URIs” to - This is the URL that successful OAuth authentication will direct to. Velociraptor accepts this redirect and uses it to log the user on.



The UI is a bit confusing here - you must press enter after typing the redirect URL to have it registered before you hit Create otherwise it misses that you typed it completely. I spent some time stumped on this UI bug.

If all goes well the Google cloud console will give us a client ID and a client secret. We can then copy those into the Velociraptor configuration file under the GUI section:

  google_oauth_client_secret: qsadlkjhdaslkjasd

Logging:
  output_directory: /var/log/velociraptor/
  separate_logs_per_component: true

In the above config we also enabled logging (which is important for a secure application!). The separate_logs_per_component option will create a separate log file for the GUI, Frontend as well as important Audit related events.

Now we can start the Velociraptor frontend:

$ velociraptor --config server.config.yaml frontend

Connecting using the browser goes through the familiar OAuth flow and arrives at this Velociraptor screen:


The OAuth flow ensures the user’s identity is correct but does not give them permission to log into Velociraptor. Note that having an OAuth enabled application on the web allows anyone with a Google identity to authenticate to the application but the user is still required to be authorized. We can see the following in the Audit logs:

  "level": "error",
  "method": "GET",
  "msg": "User rejected by GUI",
  "remote": "",
  "time": "2018-12-21T18:17:47+10:00",
  "user": ""

In order to authorize the user we must explicitly add them using the velociraptor admin tool:

$ velociraptor --config ~/server.config.yaml user add
Authentication will occur via Google - therefore no password needs to be set.

Note that this time, Velociraptor does not ask for a password at all, since authentication occurs using Google’s SSO. If we hit refresh in the browser we can now see the Velociraptor application:


We can see that the logged in user is authenticated by Google, and we can also see their Google avatar at the top right for some more eye candy :-).


Shouts to the folks from Klein & Co who sponsored this exciting feature!

Sun, 23 Dec 2018 00:00:00 +1000 <![CDATA[Configuring Velociraptor for SSL]]> Configuring Velociraptor for SSL

We have previously seen how to deploy a new Velociraptor server. For a simple deployment we can have Velociraptor server and clients provisioned in minutes.

Usually we deploy a specific Velociraptor deployment on our DFIR engagements. We use cloud resources to provision the server and have the clients connect to this cloud VM. A proper secure deployment of Velociraptor will use SSL for securing both client communication and protecting the web GUI.

In the past, provisioning an SSL enabled web application was complex and expensive - you had to create certificate signing requests, interact with a CA, pay for the certificates and then configure the server. In particular, you had to remember to renew the cert in 2 years or your website suddenly broke!

Those days are over with the emergence of Let's Encrypt and autocert. These days applications can provision their own certificates automatically. Velociraptor can manage its own certificates fully automatically - and renew them when the time comes, with no user intervention required.

In this blog post we will see how to configure a new Velociraptor server in a cloud VM.

Setting up a domain

The first step in deploying an SSL enabled web application is to have a domain name. SSL verifies the authenticity of a web site by its DNS name.

We go over to Google Domains and buy a domain. In this post I will be using the domain

Provisioning a Virtual Machine

Next we provision an Ubuntu VM from any cloud provider. Size the VM according to your deployment: an 8 or 16GB VM should be sufficient for around 5-10k clients. Additionally we will need sufficient disk space to hold the data we will collect. We recommend starting with a modest amount of storage and then either backing up data as it gets collected or increasing the storage volume as needed.

Our virtual machine will receive connections over ports 80 and 443.


When using SSL both the client communication and the GUI are served over the same ports to benefit from SSL transport encryption.

When we deploy our Virtual Machine we may choose either a static IP address or allow the cloud provider to assign a dynamic IP address. We typically choose a dynamic IP address and so we need to configure Dynamic DNS.

Go to the Google Domains dashboard and create a new dynamic DNS for your domain. In our example we will use as our endpoint address.


After the dynamic address is created, we can get the credentials for updating the IP address.


Next we install ddclient on our VM. This will update our dynamic IP address whenever the external interface changes. Configure the file /etc/ddclient.conf:


Next configure the service to start:

# Configuration for ddclient scripts
# generated from debconf on Tue Oct 23 20:25:23 AEST 2018
# /etc/default/ddclient

# Set to "true" if ddclient should be run every time DHCP client ('dhclient'
# from package isc-dhcp-client) updates the systems IP address.

# Set to "true" if ddclient should be run every time a new ppp connection is
# established. This might be useful, if you are using dial-on-demand.

# Set to "true" if ddclient should run in daemon mode
# If this is changed to true, run_ipup and run_dhclient must be set to false.

# Set the time interval between the updates of the dynamic DNS name in seconds.
# This option only takes effect if the ddclient runs in daemon mode.

Run ddclient and check that it updates the address correctly.

Configuring Velociraptor for SSL

Now comes the hard part! We need to configure Velociraptor to use SSL. Edit the following in your server.config.yaml file (if you do not have one yet you can generate one using velociraptor config generate > server.config.yaml ):


autocert_cert_cache: /etc/velociraptor_cache/

The autocert_domain parameter tells Velociraptor to provision its own cert for this domain automatically. The certificates will be stored in the directory specified by autocert_cert_cache. You don’t have to worry about rotating the certs, Velociraptor will automatically renew them.

Obviously now the clients need to connect to the control channel over SSL so we also need to direct the client’s server_urls parameter to the SSL port.

Let's start the frontend (we need to start Velociraptor as root because it must be able to bind to ports 80 and 443):

$ sudo velociraptor --config server.config.yaml frontend -v

[INFO] 2018-12-22T17:12:42+10:00 Loaded 43 built in artifacts
[INFO] 2018-12-22T17:12:42+10:00 Increased open file limit to 999999
[INFO] 2018-12-22T17:12:42+10:00 Launched gRPC API server on
[INFO] 2018-12-22T17:12:42+10:00 Autocert specified - will listen on ports 443 and 80. I will ignore specified GUI port at 8889
[INFO] 2018-12-22T17:12:42+10:00 Autocert specified - will listen on ports 443 and 80. I will ignore specified Frontend port at 8889
[INFO] 2018-12-22T17:12:42+10:00 Frontend is ready to handle client requests using HTTPS

If all goes well we now can point our browser to and it should just work. Don’t forget to provision a user and password using:

$ velociraptor --config server.config.yaml user add mic


The autocert configuration is very easy to do but there are a few caveats:

  1. Both ports 80 and 443 must be accessible over the web. This is needed because Letsencrypt’s servers need to connect to our domain name in order to verify our domain ownership.
  2. It is not possible to change the ports from port 80 and 443 due to limitations in Letsencrypt’s ACME protocol. This is why we can not have more than one Velociraptor deployment on the same IP currently.

We have seen how easy it is to deploy secure Velociraptor servers. In the next post we will discuss how to enhance security further by deploying two factor authentication with Google’s Single Sign On (SSO).


This feature will be available in the upcoming 0.27 release. You can try it now by building from git head.

Sat, 22 Dec 2018 00:00:00 +1000 <![CDATA[Velociraptor Interactive Shell]]> Velociraptor Interactive Shell

One of the interesting new features in the latest release of Velociraptor is an interactive shell. One can interact with the end point over the standard Velociraptor communication mechanism - an encrypted and authenticated channel.

This feature is implemented by utilizing the Velociraptor event monitoring, server side VQL queries. This post explores how these components come together to deliver a responsive, interactive workflow.

Endpoint shell access

Although we generally try to avoid it, sometimes the easiest way to extract certain information is to run a command and parse its output. For example, consider the Windows ipconfig command. It is possible to extract this information using Win32 APIs, but this requires additional code to be written in the client, while the ipconfig command is guaranteed to be available. Sometimes running a command and parsing its output is simply the easiest option.

The GRR client has a client action which can run a command. However, that client action is restricted to a whitelist of commands, since GRR chose to prevent running arbitrary commands on the endpoint. In practice, though, it is difficult to add new commands to the whitelist (and rebuild and deploy new clients with the updated whitelist), yet users need to run arbitrary commands (including their own third party tools) anyway. So in the GRR world, most people routinely use “python hacks” to run arbitrary commands.

When we came to redesign Velociraptor we pondered whether arbitrary command execution should be included or not. To be sure, this is a dangerous capability - effectively giving Velociraptor root level access on the endpoint. In our experience, restricting it in an arbitrary way (as was done in GRR) is not useful because it is harder to adapt to real incident response needs (you hardly ever know in advance what will be needed at 2am when triaging an incident!).

Other endpoint monitoring tools also have a shell interface (For example Carbon Black). It is understood that this feature is extremely powerful, but it is necessary sometimes.

Velociraptor mitigates this risk in a few ways:

  1. If an organization deems the ability to run arbitrary commands too dangerous, they can completely disable this feature in the client’s configuration.
  2. Every shell command run by the client is audited and its output is archived. Misuse can be easily detected and investigated.
  3. This feature is considered high risk and it is not available via the GUI. One must use the velociraptor binary on the server itself to run the interactive shell.

Interactive Shell

The interactive shell feature is accessed by issuing the shell command to the velociraptor binary:

$ velociraptor --config ~/server.config.yaml shell C.7403676ab8664b2b
C.7403676ab8664b2b (trek) >ls /
Running ls / on C.7403676ab8664b2b
Received response at 2018-12-11 13:12:35 +1000 AEST - Return code 0


C.7403676ab8664b2b (trek) >id
Running id on C.7403676ab8664b2b
Received response at 2018-12-11 13:13:05 +1000 AEST - Return code 0

uid=1000(mic) gid=1000(mic) groups=1000(mic),4(adm),24(cdrom),27(sudo)

C.7403676ab8664b2b (trek) >whoami
Running whoami on C.7403676ab8664b2b
Received response at 2018-12-11 13:13:10 +1000 AEST - Return code 0


As you can see it is pretty straight forward - type a command, the command is sent to the client, and the client responds with the output.

How does it work?

The main components are shown in the figure below. Note that the shell process is a different process from the frontend:


The workflow starts when a user issues a command (for example “ls -l /”) on the terminal. The shell process schedules a VQL query for the client:

SELECT now() as Timestamp, Argv, Stdout,
     Stderr, ReturnCode FROM execve(argv=['ls', '-l', '/'])

However, this query is scheduled as part of the monitoring flow - which means its response will be sent and stored with the monitoring logs. As soon as the shell process schedules the VQL query, the frontend is notified and the client is woken. Note that due to Velociraptor's near instantaneous communication protocol this causes the client to run the command almost immediately.

The client executes the query which returns one or more rows containing the Stdout of the process. The client will then send the response to the server as a monitoring event. The frontend will then append the event to a CSV file.

After sending the initial client query, the interactive shell process will issue a watch VQL query to watch for the shell response:

SELECT ReturnCode, Stdout, Stderr, Timestamp, Argv
FROM watch_monitoring(client_id=ClientId, artifact='Shell')

The process now blocks until this second query detects the response arrived on the monitoring queue. Now we simply display the result and go back to the interactive prompt.
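The schedule-then-block workflow can be mimicked with a condition variable: one thread stands in for the client delivering a monitoring event, while the "shell" thread blocks until the response row appears. This is a sketch of the control flow only, not Velociraptor's implementation:

```python
import threading
import time

responses = []                 # stands in for the monitoring event queue
cond = threading.Condition()

def client(command):
    # The client runs the command and posts the result as a
    # monitoring event a little later.
    time.sleep(0.1)
    with cond:
        responses.append({"Argv": command, "Stdout": "ok", "ReturnCode": 0})
        cond.notify()

threading.Thread(target=client, args=(["ls", "-l", "/"],)).start()

# The shell process blocks until the watch query sees the response.
with cond:
    while not responses:
        cond.wait()
    row = responses[0]
print(row["ReturnCode"], row["Stdout"])
```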

Note that the interactive shell is implemented using the same basic building blocks that Velociraptor offers:

  1. Issuing client VQL queries.
  2. Waking the client immediately gives instant results (no need for polling).
  3. Utilizing the event monitoring flow to receive results from queries immediately.
  4. Writing server side event queries to watch for new events, such as responses from the client.

Note that the frontend is very simple and does no specific processing of the interactive shell; the feature is implemented completely within the interactive shell process itself. This design lowers the load on the frontends since their job is very simple, but enables complex post processing and interaction to be implemented by other processes.


We mentioned previously that running shell commands on endpoints is a powerful feature and we need to audit its use closely. Since shell command output is implemented via the monitored event queues it should be obvious that we can monitor all such commands by simply watching the Shell artifact event queue:

$ velociraptor query "select * from watch_monitoring(artifact='Shell')"
  "Argv": "\"{\\\"Argv\\\":[\\\"id\\\"]}\"",
  "Artifact": "Shell",
  "ClientId": "C.7403676ab8664b2b",
  "ReturnCode": "0",
  "Stderr": "\"\"",
  "Stdout": "\"uid=1000(mic) gid=1000(mic) groups=1000(mic)\\n\"",
  "Timestamp": "1544499929"

We can easily write an artifact that escalates any use of the interactive shell by sending the admin an email (see the previous blog post). This way we can see if someone misused the feature. Alternatively, we may simply archive the event queue CSV file for long term auditing of any interactive shell use.

Tue, 11 Dec 2018 00:00:00 +1000 <![CDATA[Server side VQL queries and Escalation Events]]> Server side VQL queries and Escalation Events

Previously we have seen how Velociraptor collects information from end points using Velociraptor artifacts. These artifacts encapsulate user created queries using the Velociraptor Query Language (VQL). The power of VQL is that it provides for a very flexible way of specifying exactly what should be collected from the client and how - without needing to modify client code or deploy new clients!

This is not the whole story though! It is also possible to run VQL queries on the server side! Similarly server side Velociraptor artifacts can be used to customize the operation of the server - without modifying any code or redeploying the server components.

Server Side VQL Queries.

By now you are probably familiar with Velociraptor and VQL. We have seen that it is possible to run a VQL query interactively from the command line. For example, to find all processes matching 'gimp':

$ velociraptor query \
   "SELECT Pid, Exe, Cmdline FROM pslist() WHERE Exe =~ 'gimp'"
  "Cmdline": "gimp-2.10",
  "Exe": "/usr/bin/gimp-2.10",
  "Pid": 13207

We have used this feature previously in order to perfect and test our queries by interactively building the query as we go along.

However it is also possible to run queries on the server itself in order to collect information about the server. There is nothing special about this as such - it is simply that some VQL plugins are able to operate on the server’s internal data store and therefore provide a way to interact with the server via VQL queries.


Other endpoint monitoring tools export a rich API and even an API client library to enable users to customize and control their installation. For example, GRR expects users write python scripts using the GRR client API library.

Velociraptor’s approach is different - the functionality typically available via APIs is made available to VQL queries via VQL plugins (e.g. client information, flow information and results collected). In this way the VQL itself forms an API with which one controls the server and deployment. There is no need to write any code - simply use existing VQL plugins in any combination that makes sense to create new functionality, then encapsulate these queries inside Velociraptor artifacts for reuse and sharing.

For example, to see all the clients and their hostnames:

$ velociraptor query \
   "SELECT os_info.fqdn as Hostname, client_id from clients()" --format text
+-----------------+--------------------+
|    Hostname     |     client_id      |
+-----------------+--------------------+
| mic-Inspiron    | C.772d16449719317f |
| TestComputer    | C.11a3013cca8f826e |
| trek            | C.952156a4b022ddee |
| DESKTOP-IOME2K5 | C.c916a7e445eb0868 |
+-----------------+--------------------+
SELECT os_info.fqdn AS Hostname,
client_id FROM clients()

To inspect what flows were run on a client:

$ velociraptor query \
   "SELECT runner_args.creator, runner_args.flow_name, \
    runner_args.start_time FROM \
  "runner_args.creator": "",
  "runner_args.flow_name": "MonitoringFlow",
  "runner_args.start_time": 1544338661236625
  "runner_args.creator": "mic",
  "runner_args.flow_name": "VFSDownloadFile",
  "runner_args.start_time": 1544087705756469

Client Event Monitoring

We have also previously seen that Velociraptor can collect event streams from clients. For example, the client’s process execution logs can be streamed to the server. Clients can also receive event queries which forward selected events from the windows event logs.

When we covered those features in earlier blog posts, we stressed that the Velociraptor server does not actually do anything with the client events other than save them to a file. The server just writes the client's events into simple comma separated value (CSV) files.

We mentioned that it is possible to import this file into another tool (e.g. a spreadsheet or database) for post-processing. An alternative is to perform post-processing with Velociraptor itself using server side VQL queries.

For example, we can filter a client’s process execution log using a VQL query:

$ velociraptor query "SELECT * from monitoring(
     client_id='C.87b19dba006fcddb',
     artifact='Windows.Events.ProcessCreation')
     WHERE Name =~ '(?i)psexesvc' "
{
  "CommandLine": "\"C:\\\\Windows\\\\PSEXESVC.exe\"",
  "Name": "\"PSEXESVC.exe\"",
  "PID": "452",
  "PPID": "512",
  "Timestamp": "\"2018-12-09T23:30:42-08:00\"",
  "artifact": "Windows.Events.ProcessCreation",
  "client_id": "C.87b19dba006fcddb"
}

The above query finds running instances of psexec’s service component - a popular method of lateral movement and privilege escalation.

This query uses the monitoring() VQL plugin which opens each of the CSV event monitoring logs for the specified artifact on the server, decodes the CSV file and emits all the rows within it into the VQL Query. The rows are then filtered by applying the regular expression to the name.
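The decode-and-filter step can be sketched in plain Python against a hand-written CSV snippet (the real file lives in the server's datastore; the rows here are fabricated for illustration):

```python
import csv
import io
import re

# Fabricated monitoring log content standing in for the real
# Windows.Events.ProcessCreation CSV file on the server.
log = io.StringIO(
    "Timestamp,Name,PID,CommandLine\n"
    "2018-12-09T23:30:42,explorer.exe,1000,explorer.exe\n"
    "2018-12-09T23:30:43,PSEXESVC.exe,452,C:\\Windows\\PSEXESVC.exe\n"
)

# Same effect as the VQL filter: Name =~ '(?i)psexesvc'
pattern = re.compile("psexesvc", re.IGNORECASE)
hits = [row for row in csv.DictReader(log) if pattern.search(row["Name"])]
print(hits[0]["PID"])
```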

Server side event queries

VQL queries do not have to terminate at all. Some VQL plugins can run indefinitely, emitting rows at random times - usually in response to some events. These are called Event Queries since they never terminate. We saw this property when monitoring the client - the above Windows.Events.ProcessCreation artifact uses an event query which emits a single row for each process execution on the end point.

However, we can also have Event Queries on the server. When used in this way, the query triggers in response to data collected by the server from various clients.

For example, consider the above query to detect instances of psexec executions. While we can detect this by filtering existing monitoring event logs, it would be nice to be able to respond to such an event dynamically.

One way is to repeatedly run the same query (say every minute) and look for newly reported instances of psexec executions. But this approach is not terribly efficient. A better approach is to install a watcher on the monitoring event log:

$ velociraptor query "SELECT * from watch_monitoring(
     artifact='Windows.Events.ProcessCreation') where Name =~ '(?i)psexesvc' "
  "CommandLine": "\"C:\\\\Windows\\\\PSEXESVC.exe\"",
  "Name": "\"PSEXESVC.exe\"",
  "PID": "4592",
  "PPID": "512",
  "Timestamp": "\"2018-12-10T01:18:06-08:00\"",
  "artifact": "Windows.Events.ProcessCreation",
  "client_id": "C.87b19dba006fcddb"

The watcher efficiently follows the monitoring CSV file to detect new events. These events are emitted into the VQL query and subsequently filtered. When the query has processed all rows in the file, the plugin simply sleeps and waits for the file to grow again. The watch_monitoring() plugin essentially tails the CSV file as it is being written. Because log files are never truncated and only ever grow, and because CSV is a simple one-row-per-line format, it is possible to read and write the same file without locking. This makes following a growing log file extremely efficient and safe - even from another process.
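A minimal sketch of that tailing behaviour: remember the read offset, consume any new complete lines, and come back later when the file has not grown. This assumes append-only writes, as the text describes, and is not the actual plugin implementation:

```python
import os

def follow(path, offset=0):
    """Return new complete lines appended to path since offset, plus the new offset."""
    lines = []
    with open(path, "r") as f:
        f.seek(offset)
        while True:
            line = f.readline()
            if not line or not line.endswith("\n"):
                break   # partial row still being written - retry later
            lines.append(line.rstrip("\n"))
            offset = f.tell()
    return lines, offset

# Demo: the "frontend" appends rows between our polls.
path = "events.csv"
with open(path, "w") as f:
    f.write("Name,PID\n")

rows, pos = follow(path)              # picks up the header row
with open(path, "a") as f:
    f.write("PSEXESVC.exe,452\n")     # a new event arrives

new_rows, pos = follow(path, pos)     # picks up only the new row
print(new_rows)
os.remove(path)
```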

Responding to server side events

The previous query will return a row when psexec is run on the client. This is a very suspicious event in our environment and we would like to escalate this by sending us an email.

We can modify the above query to send an email for each event:

SELECT * FROM foreach(
  row={
     SELECT * from watch_monitoring(
        artifact='Windows.Events.ProcessCreation')
     WHERE Name =~ '(?i)psexesvc'
  },
  query={
     SELECT * FROM mail(
       subject='PsExec launched on host',
       body=format(format='PsExec execution detected at %v: %v',
                   args=[Timestamp, CommandLine]))
  })

The query sends an email for each event emitted. The message body is formatted using the format() VQL function and includes important information from the generated event. Note that the mail() plugin restricts the frequency of mails to prevent triggering the mail server's spam filters, so if two psexec executions occur within 60 seconds we will only get one email.
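The throttling behaviour can be sketched as a simple time-based gate. This illustrates the described 60-second suppression window only - it is not the actual mail() plugin implementation:

```python
import time

class MailThrottle:
    """Suppress messages sent within `period` seconds of the last one."""
    def __init__(self, period=60, clock=time.monotonic):
        self.period = period
        self.clock = clock
        self.last_sent = None

    def try_send(self, message):
        now = self.clock()
        if self.last_sent is not None and now - self.last_sent < self.period:
            return False          # dropped: too soon after the last mail
        self.last_sent = now
        print("sending:", message)
        return True

# Fake clock so the example is deterministic.
fake_now = [0.0]
throttle = MailThrottle(period=60, clock=lambda: fake_now[0])

sent = [throttle.try_send("first psexec event")]    # sent
fake_now[0] = 30.0
sent.append(throttle.try_send("second event"))      # suppressed, within 60s
fake_now[0] = 90.0
sent.append(throttle.try_send("third event"))       # sent again
print(sent)
```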

In order for Velociraptor to be able to send mail you must configure SMTP parameters in the server’s configuration file. The following example uses Gmail to send mail (other mail providers will have similar authentication requirements).

  server: ""
  auth_password: zldifhjsdflkjfsdlie

The password in the configuration is an application specific password obtained from your mail provider (for Gmail, an app password generated in the account’s security settings).


Tying it all together: Server Side Event Artifacts

As always we really want to encapsulate VQL queries in artifact definitions. This way we can design specific alerts, document them and invoke them by name. Let us encapsulate the above queries in a new artifact:

name: Server.Alerts.PsExec
description:  |
   Send an email if execution of the psexec service was detected on any client.

   Note this requires that the Windows.Events.ProcessCreation
   monitoring artifact be collected.

parameters:
  - name: EmailAddress
  - name: MessageTemplate
    default: |
      PsExec execution detected at %v: %v for client %v

sources:
  - queries:
      - |
        SELECT * FROM foreach(
          row={
            SELECT * from watch_monitoring(artifact='Windows.Events.ProcessCreation')
            WHERE Name =~ '(?i)psexesvc'
          },
          query={
            SELECT * FROM mail(
              to=EmailAddress,
              period=60,
              subject='PsExec launched on host',
              body=format(format=MessageTemplate,
                          args=[Timestamp, CommandLine, ClientId]))
          })

We create a new directory called my_artifact_directory and store that file inside it as psexesvc.yaml. Now, on the server, we invoke the artifact collector and instruct it to also load our private artifacts:

$ velociraptor --definitions my_artifact_directory/ \
    --config ~/server.config.yaml \
    --format json \
    artifacts collect Server.Alerts.PsExec
INFO:2018/12/10 21:36:27 Loaded 40 built in artifacts
INFO:2018/12/10 21:36:27 Loading artifacts my_artifact_directory/
  "To": [
  "CC": null,
  "Subject": "PsExec launched on host",
  "Body": "PsExec execution detected at \"2018-12-10T03:36:49-08:00\": \"C:\\\\Windows\\\\PSEXESVC.exe\"",
  "Period": 60


This blog post demonstrates how VQL can be used on the server to create a full featured incident response framework. Velociraptor does not dictate a particular workflow, since all its actions are governed by VQL queries and artifacts. Using the same basic building blocks, users can fashion their own highly customized incident response workflow. Here is a brainstorm of possible actions:

  1. An artifact can be written to automatically collect a memory capture if a certain event is detected.
  2. Using the http_client() VQL plugin, when certain events are detected on the server open a ticket automatically (using a SOAP or JSON API).
  3. If a particular event is detected, immediately shut the machine down or quarantine it (by running shell commands on the compromised host).

The possibilities are truly endless. Comment below if you have more interesting ideas and do not hesitate to contribute artifact definitions to address your real world use cases.

Mon, 10 Dec 2018 00:00:00 +1000

More on client event collection

Previously we have seen that Velociraptor can monitor client events using Event Artifacts. To recap, Event Artifacts are simply artifacts which contain event VQL queries. Velociraptor’s VQL queries do not have to terminate by themselves - instead VQL queries may run indefinitely, trickling results over time.

This post takes another look at event queries and demonstrates how these can be used to implement some interesting features.

Periodic Event queries

The simplest kind of events are periodically generated events. These are created using the clock() VQL plugin. This is a simple event plugin which just emits a new row periodically.

$ velociraptor query "select Unix from clock(period=5)" --max_wait 1
   "Unix": 1544339715
   "Unix": 1544339720

The query never terminates; instead the clock() plugin emits a new timestamp every 5 seconds. Note the --max_wait flag, which tells Velociraptor to wait at least 1 second in order to batch rows before reporting them.

This query is not very interesting by itself, so let’s do something more useful. GRR has a feature where each client sends its own CPU use and memory footprint, sampled every minute, to the server. This is a really useful feature because it can be used to make sure the client’s impact on the host’s performance is minimal.

Let us implement the same feature with a VQL query. What we want is to measure the client’s footprint every minute and send that to the server:

SELECT * from foreach(
  row={
    SELECT UnixNano FROM clock(period=60)
  },
  query={
    SELECT UnixNano / 1000000000 as Timestamp,
           Times.user + Times.system as CPU,
           MemoryInfo.RSS as RSS
    FROM pslist(pid=getpid())
  })

This query runs the clock() VQL plugin and for each row it emits, we run the pslist() plugin, extracting the total CPU time (system + user) used by our own pid (i.e. the Velociraptor client).
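The query amounts to periodic self-sampling. The same idea can be sketched in Python using only the standard library (an illustrative sketch: os.times() stands in for pslist()'s Times field, and RSS is omitted because reading it portably requires a platform-specific API):

```python
import os
import time

def self_stats(period=60):
    """Periodically sample this process's own CPU consumption,
    in the spirit of the Generic.Client.Stats artifact."""
    while True:
        t = os.times()
        yield {
            "Timestamp": time.time(),
            "CPU": t.user + t.system,  # total CPU seconds consumed so far
        }
        time.sleep(period)
```

Like the VQL event query, this generator never terminates on its own; each iteration produces one sample row.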

We can now encapsulate this query in an artifact and collect it:

$ velociraptor artifacts collect Generic.Client.Stats --max_wait 1 --format json
  "CPU": 0.06999999999999999,
  "RSS": 18866176,
  "Timestamp": 1544340582.9939497
  "CPU": 0.09,
  "RSS": 18866176,
  "Timestamp": 1544340602.9944408


You must specify --format json to see the results of event queries on the command line. Otherwise Velociraptor will wait for all the results so it can format them into a table, and will therefore never display anything.

Installing the event collector.

In order to have clients collect this event, we need to add the artifact to the server. Simply add the YAML file into a directory on the server and start the server with the --definitions flag. Then add the event name to the Events clause of the server configuration. When clients connect to the server they will automatically start collecting these events and sending them to the server:

$ velociraptor --definitions path/to/my/artifacts/ frontend

Then in the server configuration file:

Events:
  artifacts:
    - Generic.Client.Stats
  version: 2

Note that we do not need to redeploy any clients, modify any code or recompile anything. We simply add the new artifact definition and clients will automatically start monitoring and feeding back our information.

The data is sent to the server where it is stored in a file (Events are stored in a unique file for each day).

For example, the path /var/lib/velociraptor/clients/C.772d16449719317f/monitoring/Artifact%20Generic.Client.Stats/2018-12-10 stores all events collected from client id C.772d16449719317f for the Generic.Client.Stats artifact on the day of 2018-12-10.
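The layout of that path can be reproduced with a small helper. This is a hypothetical sketch for illustration (the real server computes the path internally); the URL-quoting is why the directory on disk contains "Artifact%20Generic.Client.Stats":

```python
import urllib.parse

def monitoring_path(base, client_id, artifact, day):
    """Build the daily monitoring log path for a client/artifact pair.

    `day` is any object with strftime(), e.g. a datetime.date.
    """
    quoted = urllib.parse.quote("Artifact " + artifact)
    return "/".join([base, "clients", client_id, "monitoring",
                     quoted, day.strftime("%Y-%m-%d")])
```

One file per day keeps each log bounded and makes it trivial to archive or post process a single day's worth of events.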

In the next blog post we will demonstrate how these events can be post processed and acted on. It is important to note that the Velociraptor server does not interpret the collected monitoring events at all - they are simply appended to the daily log file (which is a CSV file).

The CSV file can then be imported into basically any tool designed to work with tabular data (e.g. spreadsheets, databases, BigQuery etc). CSV is almost universally supported by all major systems.

Sun, 09 Dec 2018 00:00:00 +1000

Velociraptor training at NZITF

We are very excited to run this full day training workshop at the New Zealand Internet Task Force (NZITF) conference.

The training material can be downloaded here “Velociraptor NZITF training”.

Tue, 13 Nov 2018 00:00:00 +1000

Event Queries and Endpoint Monitoring

In previous posts we have seen how Velociraptor can run artifacts to collect information from hosts. For example, we can collect WMI queries, user accounts and files.

However, it would be super awesome to do this collection in real time: as soon as an event of interest appears on the host, we would like it collected on the server. This post describes the new event monitoring framework and shows how Velociraptor can collect things such as event logs and process executions in real time.

Why monitor endpoint events? Recording endpoint event information on the server gives a number of advantages. For one, the server keeps a record of historical events, which makes it easy to go back and search for them as part of an incident response activity.

For example, Velociraptor can keep a running log of process execution events for all clients, on the server. If a particular executable is suspected to be malicious, we can go back and search for past executions of that process on the infected machine (establishing the time of infection), as well as search the entire deployment for executions of the same binary, to identify lateral movement and wider compromises.

How are events monitored?

Velociraptor relies heavily on VQL queries. A VQL query typically produces a single table of multiple rows. For example, the query:

SELECT Name, CommandLine FROM pslist()

Returns a single table with one row for each running process, and then terminates.

However, VQL queries do not have to terminate at all. If the VQL plugin they are calling does not terminate, the VQL query will continue to run and pass events in partial results to the VQL caller.

Event queries are just regular VQL queries which do not terminate (unless cancelled) returning rows whenever an event is generated.


Consider the parse_evtx() plugin. This plugin parses an event log file and returns all events in it. We can then filter events and return specific events of interest. The following query returns all the service installation events and terminates:

F:\>velociraptor.exe query "SELECT EventData, System.TimeCreated.SystemTime from
   parse_evtx(filename='c:/windows/system32/winevt/logs/system.evtx') where
   System.EventId.value = '7045'"
  "EventData": {
   "AccountName": "",
   "ImagePath": "system32\\DRIVERS\\VBoxGuest.sys",
   "ServiceName": "VirtualBox Guest Driver",
   "ServiceType": "kernel mode driver",
   "StartType": "boot start"
  "System.TimeCreated.SystemTime": "2018-11-10T06:32:34Z"

The query specifically looks at event ID 7045, “A service was installed in the system”.

Let’s turn this query into an event query:

F:\>velociraptor.exe query "SELECT EventData, System.TimeCreated.SystemTime from
   watch_evtx(filename='c:/windows/system32/winevt/logs/system.evtx') where
   System.EventId.value = '7045'" --max_wait 1
  "EventData": {
    "AccountName": "",
    "ImagePath": "C:\\Users\\test\\AppData\\Local\\Temp\\pmeFF0E.tmp",
    "ServiceName": "pmem",
    "ServiceType": "kernel mode driver",
    "StartType": "demand start"
  "System.TimeCreated.SystemTime": "2018-11-10T04:57:35Z"

The watch_evtx() plugin is the event watcher equivalent of the parse_evtx() plugin. If you ran the above query, you will notice that Velociraptor does not terminate. Instead it will show all existing service installation events in the log file, and then just wait in the console.

If you then install a new service (in another terminal), for example using winpmem.exe -L, a short time later you should see the event reported by Velociraptor as in the above example. You will notice that the watch_evtx() plugin emits event logs as they occur, but Velociraptor will try to group the events into batches. The max_wait flag controls how long to wait before releasing a partial result set.
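The batching behaviour controlled by max_wait can be sketched as a small loop over an event source. This is an illustrative Python sketch of the idea (assuming events arrive via a thread-safe queue; it is not Velociraptor's actual code):

```python
import queue
import time

def batch_events(q, max_wait=1.0):
    """Group incoming events into batches, releasing whatever has
    accumulated once max_wait seconds have elapsed - the same idea
    as the --max_wait flag."""
    while True:
        batch = []
        deadline = time.monotonic() + max_wait
        while True:
            timeout = deadline - time.monotonic()
            if timeout <= 0:
                break                      # window closed: release the batch
            try:
                batch.append(q.get(timeout=timeout))
            except queue.Empty:
                break                      # nothing more arrived in the window
        if batch:
            yield batch
```

Batching amortizes the per-message overhead of reporting events to the caller while keeping latency bounded by max_wait.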

Employing event queries for client monitoring

The above illustrates how event queries work, but to actually be able to use these we had to implement the Velociraptor event monitoring framework.

Normally, when we launch a CollectVQL flow, the client executes the query and returns the result to the flow. Clearly, since event queries never terminate, we cannot run them in series (the client would never be able to do anything else). Instead, the Velociraptor client keeps a table of executing event queries, which run in a separate thread. As these queries produce results, they are sent back to the server.

We also wanted to be able to update the events the clients are monitoring on the fly (without a client restart). Therefore we needed a way to be able to update the client’s event table. This simply cancels current event queries, and installs new queries in their place.


As events are generated by the Event Table, they are sent back to the server into the Monitoring flow. This flow is automatically created for each client, and simply writes events into the client’s VFS. Therefore, events are currently simply recorded for each client. In future there will be a mechanism to post process events and produce alerts based on them.

Process Execution logs

One of the most interesting event plugins is the WMI eventing plugin. This allows Velociraptor to install a temporary WMI event listener. For example, we can install a listener for new process creation:

// Convert the timestamp from WinFileTime to Epoch.
SELECT timestamp(epoch=atoi(
  string=Parse.TIME_CREATED) / 10000000 - 11644473600 ) as Timestamp,
  Parse.ParentProcessID as PPID,
  Parse.ProcessID as PID,
  Parse.ProcessName as Name, {
    SELECT CommandLine
    FROM wmi(
      query="SELECT * FROM Win32_Process WHERE ProcessID = " +
          format(format="%v", args=Parse.ProcessID))
  } AS CommandLine
  FROM wmi_events(
       query="SELECT * FROM __InstanceCreationEvent WITHIN 1 WHERE
              TargetInstance ISA 'Win32_Process'",
       wait=5000000)   // Do not time out.
The wmi_events() plugin installs an event listener into WMI and therefore receives events from the OS about new process creation. Unfortunately these events do not contain a lot of information about the process - they only provide the ProcessID, not the full command line. The above query therefore executes a second subquery to retrieve the command line for each process. We also parse the timestamp and convert it into a more standard epoch based timestamp.
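The arithmetic in the timestamp() call is the standard WinFileTime conversion, which is easy to sanity check in Python:

```python
# WinFileTime counts 100ns ticks since 1601-01-01, so divide by 10**7 to get
# seconds, then subtract the offset between the 1601 and 1970 epochs.
EPOCH_DELTA = 11644473600  # seconds between 1601-01-01 and 1970-01-01

def filetime_to_unix(filetime):
    """Convert a WMI TIME_CREATED value to Unix epoch seconds."""
    return filetime / 10000000 - EPOCH_DELTA
```

The Unix epoch itself, expressed as a WinFileTime, is 116444736000000000, which converts to 0 as expected.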

Specifying what the client should monitor

We have seen how Event VQL queries can generate events for the server. However, this is difficult for Velociraptor’s end users to directly use. Who can really remember the full query?

As we have shown previously, Velociraptor’s Artifacts are specifically designed to solve this issue. Artifacts encapsulate a VQL query so it can be called by name alone.

For example, the Windows.Events.ProcessCreation artifact encapsulates the above query in one easy to remember name.

To specify what clients should collect, users simply need to name the event artifacts that should be monitored. Currently this is done in the server configuration (in future this may be done via the GUI).

Events:
  artifacts:
    - Windows.Events.ServiceCreation
    - Windows.Events.ProcessCreation
  version: 1

The event table version should be incremented each time the monitored event list is updated. This forces all clients to refresh their event tables.

What does it look like in the GUI?

The Monitoring flow simply writes files into the client’s VFS. This allows these to be downloaded and post processed outside of Velociraptor.



Adding event monitoring to Velociraptor is a great step forward. Even just keeping the logs around is extremely helpful for incident response. There is a lot of value in things like process execution logging, and remote event log forwarding. We will cover some more examples of event log monitoring in future blog posts. Until then, have a play and provide feedback as usual by filing issues and feature requests.

Fri, 09 Nov 2018 00:00:00 +1000

Detecting powershell persistence with Velociraptor and Yara

I was watching the SANS DFIR Summit 2018 videos on YouTube and came across Mari DeGrazia’s talk titled “Finding and Decoding Malicious Powershell Scripts”. This is an excellent talk and it contains a wealth of information. It seems that PowerShell is really popular these days, allowing attackers to “live off the land” by installing fully functional reverse shells and backdoors in a few lines of obfuscated script.

Mari went through a number of examples and also expanded on some in her blog post Malicious PowerShell in the Registry: Persistence, where she documents persistence through an autorun key launching powershell to execute a payload within another registry key.

A similar persistence mechanism is documented by David Kennedy from Binary Defense in his post PowerShell Injection with Fileless Payload Persistence and Bypass Techniques. In that case an mshta.exe link was stored in the user’s Run key, which executed a payload from another registry key.

I was eager to write a Velociraptor artifact to attempt to detect such keys using a YARA signature. Of course signature based detection is not as robust as behavioural analysis but it is quick and usually quite effective.

I thought it was still quite instructive to document how one can develop the VQL queries for a simple Velociraptor artifact. We will be developing the artifact interactively on a Windows system.


Our artifact will attempt to detect the persistence mechanism detailed in the above posts. We start by adding a value to our test user account under the key

Key: "HKEY_CURRENT_USER\Software\Microsoft\Windows\CurrentVersion\Run"
Value: "C:\Windows\system32\mshta.exe"

Defining the Artifact.

We create a directory called “artifacts”, then create a new file inside it called powershell_persistence.yaml. Velociraptor artifacts are just YAML files that can be loaded at runtime using the --definitions flag.

Every artifact has a name, by convention the name is separated into its major categories. We will call ours Windows.Persistence.Powershell:

name: Windows.Persistence.Powershell

This is the minimum required for Velociraptor to identify it. We can see a listing of all artifacts Velociraptor knows about using the “artifacts list” command:

F:\>velociraptor.exe --definitions artifacts artifacts list
INFO:2018/09/28 07:59:40 Loaded 34 built in artifacts


We can collect the artifact simply by using the “artifacts collect” command:

F:\>velociraptor.exe --definitions artifacts artifacts collect Windows.Persistence.Powershell
INFO:2018/09/28 20:01:32 Loaded 34 built in artifacts

OK, so Velociraptor can load and collect this new artifact, but as yet it does nothing! We need to think about what exactly we want to collect.

We know we want to search for all values in the Run/RunOnce hive of all the users. Let’s first see if we can retrieve all the values using a glob:

name: Windows.Persistence.Powershell

parameters:
  - name: keyGlob
    default: "HKEY_USERS\\*\\Software\\Microsoft\\Windows\\CurrentVersion\\Run*\\*"

sources:
  - precondition:
      SELECT OS from info() where OS = "windows"

    queries:
      - |
        SELECT FullPath from glob(globs=keyGlob, accessor='reg')

This artifact demonstrates a few concepts:

  1. We can define parameters by name, and reference them from within the VQL query. This keeps the VQL query clean and more readable.
  2. We can define a precondition on the artifact. If the precondition is not met, the VQL query will not be run.

Let’s run this artifact:

F:\>velociraptor.exe --definitions artifacts artifacts collect Windows.Persistence.Powershell
INFO:2018/09/28 20:51:47 Loaded 34 built in artifacts
|            FullPath            |
| HKEY_USERS\S-1-5-19\Software\M |
| icrosoft\Windows\CurrentVersio |
| n\Run\OneDriveSetup            |
| HKEY_USERS\S-1-5-20\Software\M |
| icrosoft\Windows\CurrentVersio |
| n\Run\OneDriveSetup            |
| HKEY_USERS\S-1-5-21-546003962- |
| 2713609280-610790815-1001\Soft |
| ware\Microsoft\Windows\Current |
| Version\Run\"C:\Windows\system |
| 32\mshta.exe"                  |

It returns a few results, showing the Run/RunOnce values defined for each user. For this artifact, we only want to return those entries which match a specific yara signature. We can work on improving the yara signature later, but for now let’s just detect uses of the eval() function within 500 characters of an ActiveXObject instantiation. We will try to match each value returned from the Run keys against this rule:

name: Windows.Persistence.Powershell

parameters:
  - name: keyGlob
    default: "HKEY_USERS\\*\\Software\\Microsoft\\Windows\\CurrentVersion\\Run*\\*"
  - name: yaraRule
    default: |
      rule Powershell {
        strings:
          $ = /ActiveXObject.{,500}eval/ nocase
          $ = /ActiveXObject.{,500}eval/ wide nocase
        condition:
          any of them
      }

sources:
  - precondition:
      SELECT OS from info() where OS = "windows"

    queries:
      - |
        // This is a stored query
        LET file = SELECT FullPath from glob(globs=keyGlob, accessor='reg')
      - |
        SELECT * FROM yara(
          rules=yaraRule,
          files=file.FullPath,   // This will expand to a list of paths.
          accessor='reg')

This version recovers the FullPath of all the Run/RunOnce values and stores them in a stored query. We then issue another query that applies the yara rule on these values:

F:\>velociraptor.exe --definitions artifacts artifacts collect Windows.Persistence.Powershell
INFO:2018/09/28 21:29:10 Loaded 34 built in artifacts
|    Rule    | Meta | Tags |            Strings             |              File              |
| Powershell |      |      | {"Name":"$","Offset":40,"HexDa | {"FullPath":"HKEY_USERS\\S-1-5 |
|            |      |      | ta":["00000000  41 63 74 69 76 | -21-546003962-2713609280-61079 |
|            |      |      |  65 58 4f  62 6a 65 63 74 28 2 | 0815-1001\\Software\\Microsoft |
|            |      |      | 2 57  |ActiveXObject(\"W|","00 | \\Windows\\CurrentVersion\\Run |
|            |      |      | 000010  53 63 72 69 70 74 2e 5 | \\\"C:\\Windows\\system32\\msh |
|            |      |      | 3  68 65 6c 6c 22 29 3b 51  |S | ta.exe\"","Type":"SZ","Data":{ |
|            |      |      | cript.Shell\");Q|","00000020   | "type":"SZ","value":"about:\u0 |
|            |      |      | 52 33 69 72 6f 55 66 3d  22 49 | 03cscript\u003ec1hop=\"X642N10 |
|            |      |      |  37 70 4c 37 22 3b  |R3iroUf=\ | \";R3I=new%20ActiveXObject(\"W |
|            |      |      | "I7pL7\";|","00000030  6b 39 5 | Script.Shell\");QR3iroUf=\"I7p |
|            |      |      | 4 6f 37 50 3d 52  33 49 2e 52  | L7\";k9To7P=R3I.RegRead(\"HKCU |
|            |      |      | 65 67 52 65  |k9To7P=R3I.RegRe | \\\\software\\\\bkzlq\\\\zsdnh |
|            |      |      | |","00000040  61 64 28 22 48 4 | epyzs\");J7UuF1n=\"Q2LnLxas\"; |
|            |      |      | b 43 55  5c 5c 73 6f 66 74 77  | eval(k9To7P);JUe5wz3O=\"zSfmLo |
|            |      |      | 61  |ad(\"HKCU\\\\softwa|","00 | d\";\u003c/script\u003e"},"Mti |
|            |      |      | 000050  72 65 5c 5c 62 6b 7a 6 | me":{"sec":1538191253,"usec":1 |
|            |      |      | c  71 5c 5c 7a 73 64 6e 68  |r | 538191253231489700},"Ctime":{" |
|            |      |      | e\\\\bkzlq\\\\zsdnh|","0000006 | sec":1538191253,"usec":1538191 |
|            |      |      | 0  65 70 79 7a 73 22 29 3b  4a | 253231489700},"Atime":{"sec":1 |
|            |      |      |  37 55 75 46 31 6e 3d  |epyzs\ | 538191253,"usec":1538191253231 |
|            |      |      | ");J7UuF1n=|","00000070  22 51 | 489700}}                       |

We can see that the last query returns 5 columns, but each column actually contains objects with quite a lot of additional information. For example, the File column returns information about the file that matched the yara rule (its filename, timestamps etc). The output is a bit confusing, so let’s return only the relevant columns by replacing the * in the last query with a curated column list:

SELECT File.FullPath as ValueName, File.Data.value as Contents,
  timestamp(epoch=File.Mtime.Sec) as ModTime
FROM yara(rules=yaraRule, files=file.FullPath, accessor='reg')

Which results in the quite readable:

F:\>velociraptor.exe --definitions artifacts artifacts collect Windows.Persistence.Powershell
INFO:2018/09/28 21:42:18 Loaded 34 built in artifacts
|           ValueName            |            Contents            |          ModTime          |
| HKEY_USERS\S-1-5-21-546003962- | about:<script>c1hop="X642N10"; | 2018-09-28T20:20:53-07:00 |
| 2713609280-610790815-1001\Soft | R3I=new%20ActiveXObject("WScri |                           |
| ware\Microsoft\Windows\Current | pt.Shell");QR3iroUf="I7pL7";k9 |                           |
| Version\Run\"C:\Windows\system | To7P=R3I.RegRead("HKCU\\softwa |                           |
| 32\mshta.exe"                  | re\\bkzlq\\zsdnhepyzs");J7UuF1 |                           |
|                                | n="Q2LnLxas";eval(k9To7P);JUe5 |                           |
|                                | wz3O="zSfmLod";</script>       |                           |
Artifact: Windows.Persistence.Powershell

Great! This works and only returns values that match the yara signature we developed.
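The regular expression at the heart of the rule can be exercised directly: it fires whenever eval appears within 500 characters after an ActiveXObject instantiation. Below is a Python approximation of the rule's ascii nocase variant (real YARA also applies the wide variant to UTF-16 encoded data):

```python
import re

# Case-insensitive; DOTALL so the 500-character window may span newlines.
POWERSHELL_RE = re.compile(rb"ActiveXObject.{,500}eval",
                           re.IGNORECASE | re.DOTALL)

def matches(data: bytes) -> bool:
    return POWERSHELL_RE.search(data) is not None

# The decoded payload recovered from the Run key above:
payload = (b'about:<script>R3I=new%20ActiveXObject("WScript.Shell");'
           b'k9To7P=R3I.RegRead("HKCU\\\\software\\\\bkzlq\\\\zsdnhepyzs");'
           b'eval(k9To7P);</script>')
```

An eval() alone, with no preceding ActiveXObject, does not trigger the rule, which keeps the false positive rate manageable.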

Testing the artifact

Let’s test this artifact for real now. We restart the frontend with the --definitions flag, which makes the new artifact available in the GUI under the Artifact Collector flow. The GUI also shows the entire artifact definition so we can see what VQL will be run:


Launching the flow appears to work and shows exactly the same result as we collected on the command line:


But wait! There is a problem!

When we log out of the machine and then rerun the artifact, it returns no results!


Why is that? Experienced incident responders will recognize that any artifact based on the HKEY_USERS registry hive is inherently unreliable. This is because HKEY_USERS is not a real hive - it is a place where Windows mounts the user’s hive when the user logs in.

How does HKEY_USERS hive work?

Windows implements the concept of user profiles. Each user has a personal registry hive that stores user specific settings. It is actually a file in their home directory called ntuser.dat. When a user logs into the workstation, the file may be synced from the domain controller and is then mounted under the HKEY_USERS\<sid> registry hive.

This means that when the user logs out, their user registry hive is unmounted and does not appear in HKEY_USERS any longer. Any artifacts based around the HKEY_USERS hive will work only if the collection is run when a user is logged in.

This is obviously not what we want when we hunt for persistence! We want to make sure that none of the users on the system have this persistence mechanism installed. You can imagine a case where a system has been cleaned up but then a user logs into the machine, thereby reinfecting it!

How to fix this?

Yara is a very powerful tool because it allows us to search for patterns in amorphous data (such as process memory and structured files) without having to fully understand the structure of the data we are searching for. Of course this has its limitations, but yara can raise a red flag if the signature matches the file, and we can analyse this file more carefully later.

In this case we cannot rely on globbing the HKEY_USERS registry hive, so maybe we can just search the files that back these hives. We know that each user on the system has an NTUSER.DAT file in their home directory (usually C:\Users\<username>), so let’s write an artifact to find these files. We can reuse the artifact Windows.Sys.Users, which reports all user accounts on a system (we display the output as JSON to enhance readability):

F:\>velociraptor.exe artifacts collect Windows.Sys.Users --format json
INFO:2018/09/28 22:44:26 Loaded 34 built in artifacts
[
 {
  "Description": "",
  "Directory": "C:\\Users\\test",
  "Gid": 513,
  "Name": "test",
  "Type": "local",
  "UUID": "S-1-5-21-546003962-2713609280-610790815-1001",
  "Uid": 1001
 },
 {
  "Description": "",
  "Directory": "C:\\Users\\user1",
  "Gid": 513,
  "Name": "user1",
  "Type": "local",
  "UUID": "S-1-5-21-546003962-2713609280-610790815-1003",
  "Uid": 1003
 }
]

So we just want to YARA scan the NTUSER.DAT file in each home directory:

SELECT * from foreach(
  row={
    SELECT Name, Directory as HomeDir
    FROM Artifact.Windows.Sys.Users()
    WHERE Directory.value and Gid
  },
  query={
    SELECT File.FullPath As FullPath,
           Strings.Offset AS Off,
           Strings.HexData As Hex,
           upload(file=File.FullPath, accessor="ntfs") AS Upload
    FROM yara(
      files="\\\\.\\" + HomeDir + "\\ntuser.dat",
      rules=yaraRule, context=10,
      accessor="ntfs")
  })

This query:

  1. Selects all the usernames and their home directory from the Windows.Sys.Users artifact.
  2. For each directory, prepends \\.\ and appends "ntuser.dat". For example c:\Users\test becomes \\.\c:\Users\test\NTUSER.dat
  3. The file is accessed using the NTFS filesystem accessor. This is necessary because the registry hive is locked if the user is logged in. Therefore we must access it using raw NTFS parsing to bypass the OS locking.
  4. For each file that matches the yara expression, we upload the file to the server for further analysis.
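Step 2's path construction is simple string assembly. The helper below is a hypothetical illustration (note that the doubled backslashes in the VQL string literal collapse to single backslashes in the resulting raw device path):

```python
def raw_ntuser_path(home_dir: str) -> str:
    r"""Prefix a profile directory with \\.\ so that the raw NTFS parser
    can open the volume directly, bypassing OS locks on the mounted hive."""
    return "\\\\.\\" + home_dir + "\\ntuser.dat"
```

The resulting path, such as \\.\C:\Users\test\ntuser.dat, is what the NTFS accessor opens.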

Let’s run this new artifact on the server:


Unlike the previous artifact, this one simply returns the YARA hit, without any context about which value contained the signature - or even whether it had since been deleted. Luckily we uploaded the raw registry hive for further analysis, and we can use a tool such as RegRipper to extract more information from the hive:

$ wine rip.exe -p user_run -r
Launching user_run v.20140115
user_run v.20140115
(NTUSER.DAT) [Autostart] Get autostart key contents from NTUSER.DAT hive

LastWrite Time Thu Sep 27 01:19:08 2018 (UTC)
 OneDrive: "C:\Users\user1\AppData\Local\Microsoft\OneDrive\OneDrive.exe"
 c:\windows\system32\mshta.exe: about:<script>c1hop="X642N10";

Note above how we can simply retrieve the uploaded file from Velociraptor’s filestore. Velociraptor stores uploaded files on the filesystem within the flow’s directory.


In this blog post we saw how to utilize YARA to find suspicious powershell persistence mechanisms. YARA is a powerful tool and using Velociraptor’s artifacts we can apply it to files, registry values, and raw NTFS files such as locked registry hives and the pagefile.

We also saw some of the inherent problems with relying on the HKEY_USERS registry hive for detection - the hive is only present when a user is logged in so when we hunt, we might miss those users who are currently logged out. We saw how YARA can be used to detect suspicious patterns in raw registry hive files and how artifacts may retrieve those files for further analysis.

Sat, 29 Sep 2018 00:00:00 +1000

Velociraptor’s filesystem accessors

The latest release of Velociraptor introduces the ability to access raw NTFS volumes, allowing users to read files which are normally locked by the operating system, such as registry hives, the pagefile and other locked files. In addition, Velociraptor can now also read Volume Shadow Copy snapshots. This gives a kind of time-machine ability, allowing the investigator to look through the drive’s content at a previous point in time.

This blog post introduces the new features and describes how Velociraptor’s filesystem accessors provide data from multiple sources to VQL queries.

We have previously seen that Velociraptor can list and download files from the client’s filesystem, as well as registry keys and values. The client’s filesystem is made available to VQL plugins such as glob() allowing many Artifacts to be written that work on files, registry keys and raw NTFS volumes.

While Velociraptor is a great remote response tool, everything that it can do remotely, it can also do locally using a command line interface. This gives the user an opportunity to interactively test their VQL queries while writing artifacts.

The latest release adds a couple of convenient command line options which allow the user to interact with the filesystem accessors. For example, to list the files in a directory we can use the “velociraptor fs ls” command:

F:\>velociraptor.exe fs ls
| Name | Size |    Mode    |           mtime           |              Data               |
| C:   |    0 | d--------- | 1969-12-31T16:00:00-08:00 | Description: Local Fixed Disk   |
|      |      |            |                           | DeviceID: C:                    |
|      |      |            |                           | FreeSpace: 12686422016          |
|      |      |            |                           | Size: 33833349120               |
|      |      |            |                           | SystemName: DESKTOP-IOME2K5     |
|      |      |            |                           | VolumeName:                     |
|      |      |            |                           | VolumeSerialNumber: 9459F443    |
| D:   |    0 | d--------- | 1969-12-31T16:00:00-08:00 | Description: CD-ROM Disc        |
|      |      |            |                           | DeviceID: D:                    |
|      |      |            |                           | FreeSpace: 0                    |
|      |      |            |                           | Size: 57970688                  |
|      |      |            |                           | SystemName: DESKTOP-IOME2K5     |
|      |      |            |                           | VolumeName: VBox_GAs_5.2.11     |
|      |      |            |                           | VolumeSerialNumber: A993F576    |
SELECT Name, Size, Mode.String AS Mode, timestamp(epoch=Mtime.Sec) AS mtime,
   Data FROM glob(globs=path, accessor=accessor)

The “fs ls” command instructs Velociraptor to list directories using its internal filesystem accessors. By default it will use the “file” accessor, which simply uses the usual Win32 API filesystem calls (i.e. CreateFile, FindFirstFile etc.).

On windows, the file accessor lists the drive letters at the root of the filesystem, then allows subdirectories to be listed under each letter. The above output shows some metadata for each drive letter (like its size etc) and below the table we can see the VQL query that was used to generate the table. To be clear, the “fs ls” command is simply a shortcut for producing a VQL query that ultimately uses the filesystem accessor in the glob() VQL plugin. Therefore, we can enter any glob expression to find files:

F:\>velociraptor.exe fs ls -v "c:\program files\**\*.exe"
|            FullPath            |   Size   |    Mode    |           mtime           | Data |
| C:\Program Files\Windows Defen |  4737448 | -rw-rw-rw- | 2018-07-14T17:56:49-07:00 |      |
| der Advanced Threat Protection |          |            |                           |      |
| \MsSense.exe                   |          |            |                           |      |
| C:\Program Files\Windows Defen |   791384 | -rw-rw-rw- | 2018-07-14T17:56:43-07:00 |      |
| der Advanced Threat Protection |          |            |                           |      |
| \SenseCncProxy.exe             |          |            |                           |      |
| C:\Program Files\Windows Defen |  3832016 | -rw-rw-rw- | 2018-07-14T17:56:50-07:00 |      |
| der Advanced Threat Protection |          |            |                           |      |
| \SenseIR.exe                   |          |            |                           |      |
| C:\Program Files\Windows Defen |  2147192 | -rw-rw-rw- | 2018-07-14T18:05:00-07:00 |      |
| der Advanced Threat Protection |          |            |                           |      |
| \SenseSampleUploader.exe       |          |            |                           |      |
SELECT FullPath, Size, Mode.String AS Mode, timestamp(epoch=Mtime.Sec) AS mtime, Data FROM
glob(globs=path, accessor=accessor)

When using the registry filesystem accessor, the registry appears like a filesystem, allowing us to run glob expressions against registry keys and values (Note that the registry accessor provides the value in the metadata):

F:\>velociraptor.exe fs --accessor reg ls "HKEY_USERS\*\Software\Microsoft\Windows\CurrentVersion\{Run,RunOnce}\*"
|     Name      | Size |    Mode    |           mtime           |             Data                |
| OneDriveSetup |  104 | -rwxr-xr-x | 2018-09-03T02:48:53-07:00 | type: SZ                        |
|               |      |            |                           | value: C:\Windows\SysWOW64\     |
|               |      |            |                           | OneDriveSetup.exe /thfirstsetup |
| OneDriveSetup |  104 | -rwxr-xr-x | 2018-09-03T02:48:47-07:00 | type: SZ                        |
|               |      |            |                           | value:   C:\Windows\SysWOW64\   |
|               |      |            |                           | OneDriveSetup.exe /thfirstsetup |
SELECT Name, Size, Mode.String AS Mode, timestamp(epoch=Mtime.Sec) AS mtime,
Data FROM glob(globs=path, accessor=accessor)
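The {Run,RunOnce} alternation in the glob above can be understood as expanding into several concrete patterns, each of which is then matched. The following Python sketch illustrates that idea only - it is not Velociraptor's actual implementation (Velociraptor is written in Go), and the function names are mine:

```python
import fnmatch
import re

def expand_braces(pattern):
    """Expand the first {a,b,...} alternation group into separate
    patterns, recursing until no groups remain."""
    m = re.search(r"\{([^{}]*)\}", pattern)
    if not m:
        return [pattern]
    results = []
    for alt in m.group(1).split(","):
        results.extend(
            expand_braces(pattern[:m.start()] + alt + pattern[m.end():]))
    return results

def glob_match(pattern, path):
    """A path matches if any brace-expanded pattern matches it."""
    return any(fnmatch.fnmatch(path, p) for p in expand_braces(pattern))

# The registry glob from above, reduced to a single hive path:
pattern = r"Software\Microsoft\Windows\CurrentVersion\{Run,RunOnce}\*"
print(glob_match(pattern, r"Software\Microsoft\Windows\CurrentVersion\Run\OneDriveSetup"))      # True
print(glob_match(pattern, r"Software\Microsoft\Windows\CurrentVersion\RunOnce\OneDriveSetup"))  # True
print(glob_match(pattern, r"Software\Microsoft\Windows\CurrentVersion\Explorer\Foo"))           # False
```

The same expansion applies to any accessor, which is why the brace syntax works identically against files, registry keys and raw NTFS paths.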

Finally, the NTFS accessor can access files by parsing the NTFS filesystem directly. At the top level, the accessor shows all NTFS formatted partitions. These include regular drives as well as Volume Shadow Copies:

F:\>velociraptor.exe fs --accessor ntfs ls
|              Name              | Size |    Mode    |                             Data                        |
| \\.\C:                         |    0 | d--------- | Description: Local Fixed Disk                           |
|                                |      |            | DeviceID: C:                                            |
|                                |      |            | FreeSpace: 11802157056                                  |
|                                |      |            | Size: 33833349120                                       |
|                                |      |            | SystemName: DESKTOP-IOME2K5                             |
|                                |      |            | VolumeName:                                             |
|                                |      |            | VolumeSerialNumber: 9459F443                            |
| \\?\GLOBALROOT\Device\Harddisk |    0 | d--------- | DeviceObject: \\?\GLOBALROOT\Device\                    |
|                                |      |            |             HarddiskVolumeShadowCopy1                   |
| VolumeShadowCopy1              |      |            | ID: {CAF25144-8B70-4F9E-B4A9-5CC702281FA1}              |
|                                |      |            | InstallDate: 20180926154712.490617-420                  |
|                                |      |            | OriginatingMachine: DESKTOP-IOME2K5                     |
|                                |      |            | VolumeName: \\?\Volume{3dc4b590-0000-000-501f00000000}\ |
| \\?\GLOBALROOT\Device\Harddisk |    0 | d--------- | DeviceObject: \\?\GLOBALROOT\Device\                    |
|                                |      |            |            HarddiskVolumeShadowCopy2                    |
| VolumeShadowCopy2              |      |            | ID: {E48BFDD7-7D1D-40AE-918C-36FCBB009941}              |
|                                |      |            | InstallDate: 20180927174025.893104-420                  |
|                                |      |            | OriginatingMachine: DESKTOP-IOME2K5                     |
|                                |      |            | VolumeName: \\?\Volume{3dc4b590-0000-000-501f00000000}\ |
SELECT Name, Size, Mode.String AS Mode, timestamp(epoch=Mtime.Sec) AS mtime, Data FROM glob(globs=path, accessor=accessor) WHERE Sys.name_type != 'DOS'

The above example shows two volume shadow copies that Windows has taken on two different dates (highlighted above). We can browse these snapshots just as if they were another drive (we can also apply any glob expressions to this path):

F:\>velociraptor.exe fs --accessor ntfs ls "\\?\GLOBALROOT\Device\HarddiskVolumeShadowCopy1\"
|       Name       |   Size   |    Mode    |           mtime           |       Data       |
| velociraptor.exe | 12521472 | -rwxr-xr-x | 2018-08-19T23:37:01-07:00 | mft: 39504-128-0 |
|                  |          |            |                           | name_type: Win32 |
| winpmem.exe      |  3619260 | -rwxr-xr-x | 2017-12-28T21:17:50-08:00 | mft: 39063-128-1 |
|                  |          |            |                           | name_type: POSIX |
SELECT Name, Size, Mode.String AS Mode, timestamp(epoch=Mtime.Sec) AS mtime, Data FROM
glob(globs=path, accessor=accessor) WHERE Sys.name_type != 'DOS'

Volume shadow copies are like a time machine - they can reveal data that was stored on the drive days or weeks before we inspected it, which makes them very useful for some investigations.

Using filesystem accessors remotely - The Velociraptor VFS

The above description shows how Velociraptor’s command line interface can be used to interact with the various filesystem accessors. This is important for writing and collecting artifacts for triage and general system state exploration.

However, how do filesystem accessors appear in the Velociraptor GUI?


The nice thing about Velociraptor’s GUI is that it is just a way to present the same information that the “fs ls” command is getting by using the same VQL queries. Therefore the view is very familiar:

  1. The top level of the Velociraptor VFS represents all the filesystem accessors implemented in the client.
  2. Each of these accessors shows its own view:
    1. The file accessor uses the OS APIs to list files and directories. Its top level is a list of mounted drives (which may be CD-ROMs or even network shares).
    2. The NTFS accessor shows all NTFS volumes accessible, including local drives and Volume Shadow Copies.
    3. The registry accessor uses Win32 APIs to access the registry and shows at the top level a list of all system hives currently attached.
  3. For each file listed, the accessor also includes a Data attribute. This contains accessor specific metadata about the file (for example the MFT entry).
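Conceptually, every accessor exposes the same small interface, which is why VQL plugins like glob() can operate over files, the registry and raw NTFS alike. The following Python sketch is purely illustrative - Velociraptor is implemented in Go, and these class and method names are hypothetical:

```python
import abc
import os

class Accessor(abc.ABC):
    """The minimal surface every accessor exposes: list a directory
    and open a file for reading."""

    @abc.abstractmethod
    def ls(self, path):
        """Return the child names under path."""

    @abc.abstractmethod
    def open(self, path):
        """Return a readable file-like object for path."""

class OSFileAccessor(Accessor):
    """The equivalent of the 'file' accessor: defers to the OS APIs."""

    def ls(self, path):
        return sorted(os.listdir(path))

    def open(self, path):
        return open(path, "rb")

# A registry or raw-NTFS accessor would implement the same two methods
# over a different backing store; plugins such as glob(), hash() and
# yara() only ever program against this interface.
def walk_one_level(accessor, path):
    """List one directory level through any accessor."""
    return [os.path.join(path, name) for name in accessor.ls(path)]
```

Because the plugins see only this interface, adding a new accessor immediately makes it globbable, hashable and scannable with no changes to the plugins themselves.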

In the below screenshot we can see how the user may navigate into the Volume Shadow Copy and retrieve files from it:


A note about filenames.

NTFS can store several different names for the same file. Typically there is a short DOS 8.3 style filename (e.g. PROGRA~1) as well as a Win32 long filename (e.g. Program Files). You can see the short name for a file using the GetShortPathName() API (or the command dir /x), but a program needs to deliberately ask for it. Most programs do not explicitly collect or show a file's short filename.

This can cause problems for DFIR applications. For example, imagine we discovered a Run key pointing to C:\Users\test\runme.exe. If we only considered the long filename (as returned, for example, by the Win32 API FindFirstFile() or the output of the dir command), then we would assume the file has been removed and the run key is not active. In reality, however, the file may be called “This is some long filename.exe” with a DOS name of “runme.exe”. Explorer (and most tools) will only show the long filename by default, but the run key will still execute by referring to the DOS filename!

Usually the short filename is some variation of the long filename with a ~1 or ~2 at the end, but in reality it can be anything. In the snippet below, I set the short filename for the velociraptor.exe binary to something completely unrelated, then run the binary using that unrelated name:

C:\Users\test>fsutil file setshortname velociraptor.exe runme.exe
C:\Users\test>dir /x *.exe
 Volume in drive C has no label.
 Volume Serial Number is 9459-F443

 Directory of C:\Users\test

08/19/2018  11:37 PM        12,521,472 RUNME.EXE    velociraptor.exe
               2 File(s)     16,140,732 bytes
               0 Dir(s)  11,783,704,576 bytes free
C:\Users\test>runme.exe -h
usage: velociraptor [<flags>] <command> [<args> ...]

An advanced incident response and monitoring agent.

You can see that Windows Explorer shows no trace of the runme.exe file since it only displays the Win32 long filename:


It is important for DFIR investigators to be aware of this and to test their tools! You can see that Sysinternals' autoruns is not fooled by these shenanigans when I add a run key pointing to “runme.exe”. It shows the real filename velociraptor.exe even though the run key refers to runme.exe:


Velociraptor treats a file's DOS name and Win32 name as distinct entries in the NTFS directory listing. This allows us to find any references to the file by its DOS name as well as its Win32 name.
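One way to picture this is that each NTFS file record carries several $FILE_NAME attributes, and a lookup matches against all of them. The following is a toy Python model of that idea - these data structures and field names are mine, not Velociraptor's:

```python
from dataclasses import dataclass, field

@dataclass
class MFTRecord:
    """A raw NTFS file record can carry several $FILE_NAME attributes:
    typically a Win32 long name and a DOS 8.3 name."""
    mft_id: int
    names: dict = field(default_factory=dict)  # name_type -> name

def find_by_any_name(records, wanted):
    """Match a file by any of its names, as a raw NTFS parser can.
    A Win32-long-name-only lookup would miss the DOS alias."""
    wanted = wanted.lower()
    return [r for r in records
            if any(n.lower() == wanted for n in r.names.values())]

records = [
    MFTRecord(39504, {"Win32": "velociraptor.exe", "DOS": "RUNME.EXE"}),
    MFTRecord(39063, {"POSIX": "winpmem.exe"}),
]

# The run key refers to runme.exe; a long-name-only lookup finds
# nothing, but matching every name attribute finds the real binary:
hits = find_by_any_name(records, "runme.exe")
print([r.names["Win32"] for r in hits])  # ['velociraptor.exe']
```

This is why listing both name entries in the directory output (rather than hiding the DOS name, as most tools do) lets an investigator connect a suspicious run key back to the file it actually launches.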


As Velociraptor gains more functionality, we envision more filesystem accessors becoming available. The nice thing about these accessors is that they slot straight into the rest of the VQL plugins. By providing a new accessor, we are able to glob, hash, yara scan etc. the new abstraction. For example, to yara scan a registry key one simply calls the yara() VQL plugin with the reg accessor: yara(rules=myRules, files=my_reg_keys, accessor="reg")

Sun, 30 Sep 2018 00:00:00 +1000 <![CDATA[Velociraptor walk through and demo]]> Velociraptor walk through and demo

I just uploaded a screencast of the latest Velociraptor - check it out and play with it, and please provide feedback at