thoughts/data/telemetry.md
Tommy Skaug 805a34f937
2024-08-05 20:24:56 +02:00


Telemetry for cyber security is currently at a crossroads. Past methods were effective because they were based on network monitoring, but the current revolution in encryption and the distributed workspace makes it insufficient to rely solely on network monitoring. This post focuses on the current challenges.

Telemetry is an electrical apparatus for measuring a quantity (such as pressure, speed, or temperature) and transmitting the result especially by radio to a distant station
Merriam-Webster

Telemetry, a term mostly used by AV vendors, has become more broadly applied as services change from centralised to decentralised and geographically spread. Yesterday an employee would work at their desk from 9 to 5 and then go home, while today's worker moves around the office area and can work from basically anywhere in the world whenever they feel like it.

In cyber security, telemetry can generally be categorised as 1) network-centric or 2) endpoint-based. A complete telemetry profile is essential for monitoring security events and for executing retrospective analysis. In my recent article on indicators [1] I proposed a structure for indicators organised in three levels of abstraction. In this article, a telemetry profile means something that covers, to some degree, these three levels.

| Level of abstraction  |    | Formats
|-----------------------|----|-------------
| Behavior              |    | MITRE (PRE-)ATT&CK
|-----------------------|--->|-------------
| Derived               |    | Suricata+Lua, Yara
|-----------------------|--->|-------------
| Atomic                |    | OpenIOC 1.1

The Challenges

There are generally two problems that need to be fully solved when collecting data for cyber security:

  • The use of encryption from end-to-end
  • Workers, and thereby the defended environment, are or will be distributed

As of February 2017 the web was 50% encrypted [2]. Today that number [3] is growing close to 70%.

For defense purposes, it is possible to identify malicious traffic, such as beaconing, through metadata analysis. There have been some developments in detecting anomalies in encrypted content lately, namely the fingerprinting of programs using SSL/TLS. In the future I believe this will be the primary role of network-based detection. This is actually a flashback to the pre-2010 monitoring environment, when full content was rarely stored and inspected by security teams.
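To make the metadata point concrete, here is a minimal sketch of beacon detection that only looks at connection start times and never at content. The idea is that implants tend to check in at regular intervals, so the gaps between connections vary little. The function names, the thresholds and the sample timestamps are my own assumptions for illustration, not taken from any particular product.

```python
from statistics import mean, pstdev


def beacon_score(timestamps):
    """Coefficient of variation of the gaps between connections.

    Lower means more regular. Returns None when there is too little
    data to say anything.
    """
    ordered = sorted(timestamps)
    gaps = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    if len(gaps) < 2:
        return None
    average = mean(gaps)
    if average == 0:
        return None
    return pstdev(gaps) / average


def looks_like_beacon(timestamps, max_cv=0.1, min_events=5):
    """Flag a flow series whose timing is suspiciously regular.

    max_cv and min_events are hypothetical thresholds that would need
    tuning against real traffic.
    """
    if len(timestamps) < min_events:
        return False
    score = beacon_score(timestamps)
    return score is not None and score <= max_cv


# Hypothetical flow metadata: connection start times in seconds for
# one source/destination pair.
regular = [0, 60, 121, 180, 241, 300]   # roughly every minute
human = [0, 5, 7, 90, 400, 410]         # bursty, irregular

print(looks_like_beacon(regular))  # True
print(looks_like_beacon(human))    # False
```

A real implant will often add jitter to its check-in interval, so a fixed threshold like this is a starting point for triage rather than a detection on its own.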

An additional element to consider is the previous debate about public key pinning, which has now evolved into Expect-CT [4]. This means that man-in-the-middle (MitM) techniques are going to be a no-no at some point. Yes, that includes your corporate proxy as well.

There is one drawback, and dealbreaker, with the above for security teams: it requires access to the data stream used by the endpoints to be fully effective.

VPNs are going away as more resilient and modern network architectures become dominant. The most promising challenger at the moment is the BeyondCorp [5] architecture (based on zero trust), proposed by Google more than six years ago. A zero trust architecture means that clients only check in to the corporate environment at the points where they need to, or when they are in the vicinity of corporate resources. Other activity, such as browsing external websites, no longer goes via the corporate infrastructure or its monitored links. Additionally, the endpoint is easily the most common infiltration vector.

To be honest, the BeyondCorp model reflects to a larger extent how humans actually interact with computers. Humans have never been confined to the perimeter of the enterprise network. This may be part of the reason why organisations are currently in a defeatable state as well. Ironically, the only ones to confine themselves to the enterprise network are the network defenders.

The battle to control the technology evolution is not completely lost though; it is a matter of changing the mindset about where data or telemetry is collected. Yesterday it was at the corporate proxy or in the corporate environment - today it is on the endpoint and during the connections to valuable resources.

For endpoints, the primary challenges currently faced are:

  • Maintaining the integrity of locally stored and buffered data
  • The availability and transport of data to a centralised logging instance
  • Confidentiality of the data in transport or at rest
  • Data source consistency for central correlation of information from several host sources
  • Raising the stakes on operational security in a cat and mouse chase between intruders and defenders
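As an illustration of the first point, integrity of locally buffered data, one common approach is to chain the records together so that editing or removing one breaks every tag after it. Below is a minimal sketch using an HMAC chain. The record layout, the function names and the demo key are assumptions of mine; how the key is protected on the endpoint is deliberately left out.

```python
import hashlib
import hmac
import json

GENESIS = b"\x00" * 32  # fixed starting value for the first record


def _tag(key, prev_tag, event):
    """HMAC over the previous tag plus the canonical JSON of the event."""
    payload = json.dumps(event, sort_keys=True).encode("utf-8")
    return hmac.new(key, prev_tag + payload, hashlib.sha256).hexdigest()


def append_record(chain, key, event):
    """Append an event to the local buffer, bound to the previous record."""
    prev_tag = bytes.fromhex(chain[-1]["tag"]) if chain else GENESIS
    chain.append({"event": event, "tag": _tag(key, prev_tag, event)})


def verify_chain(chain, key):
    """Return True only if every record still matches its tag in order."""
    prev_tag = GENESIS
    for record in chain:
        expected = _tag(key, prev_tag, record["event"])
        if not hmac.compare_digest(expected, record["tag"]):
            return False
        prev_tag = bytes.fromhex(record["tag"])
    return True


# Hypothetical buffered events waiting to be shipped to the central log.
key = b"demo-key"  # placeholder; a real key must not live next to the log
chain = []
for event in (
    {"pid": 100, "image": "winword.exe"},
    {"pid": 101, "image": "powershell.exe"},
    {"pid": 102, "image": "rundll32.exe"},
):
    append_record(chain, key, event)

print(verify_chain(chain, key))  # True
```

This only detects modification or removal of records in the middle of the buffer. An intruder who obtains the key can forge the chain, and truncation at the tail goes unnoticed unless the central side also tracks the last tag it received, which is one more reason to ship events off the host quickly.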

Remote logging is a subject that has received much attention previously, so we are not going into depth about it here.

Existing Tooling For Endpoints

This section was not originally a part of the scope of this article, but I'd like to establish a baseline of parts of the available tooling to handle the above issues. I also believe it touches some of the endpoint challenges.

For the purpose of this article, we define the following well-known computer abstraction stack:

  1. Hardware
  2. Operating System
  3. Application

Hardware verification and logging is currently a more or less unexplored field, with only one tool available to my knowledge. That tool is Chipsec [6], which has been of interest to, and integrated into, the Google Rapid Response (GRR) [7] project for some time.

Operating system logs are well understood today, and many organisations manage logging from the host operating system properly.

There are increasingly good event streaming and agent-based systems available, such as LimaCharlie [8], Sysmon [9] and Carbon Black [10]. The media focus on these platforms is on the more trendy term "hunting", but their real purpose is OS-level logging and pattern matching.

Further, distributed forensic platforms are available from FireEye (HX), with an open source equivalent from Google named GRR. GRR has been featured extensively on this site previously. Common to these is that they do not stream events, but rather store information on the endpoint.

Application layer logging is extremely challenging. The logging mechanism needs to be connected to the structure of the application itself, and there are a lot of applications. Further, many application developers do not focus on logging.

Application logging is important and could be seen as the technical contextual information provided by the endpoint. Exposed applications that are important in terms of coverage:

  • Browsers
  • Email Readers
  • Application Firewalls (if you have one)
  • Instant Messaging Clients
  • Rich document editors, such as Excel, Word and PowerPoint

These applications are important since they are the first point of contact for almost any technical threat. Done right, application logs will be at a central location before the intruder manages to get a foothold on the client. Thus, the risk of data being misrepresented in the central system is greatly reduced (integrity).

Taking browsers and Microsoft Office as an example, there are some options readily available:

  • Firefox HTTP and DNS logging: mozilla.org [11]
  • Office Telemetry logging: Office Telemetry Log [12]

The above examples are not security focused as far as I could tell; more often they are debug oriented. However, the same data is often what we are after as well (such as: did the document have a macro? What is the HTTP header?).
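The macro question can be answered without any help from the application for the modern Office formats, since those files are ZIP containers and macro-enabled documents carry a `vbaProject.bin` part. The sketch below is my own illustration and not part of the Office telemetry log. The function names are hypothetical, and legacy binary formats such as `.doc` are out of scope.

```python
import io
import zipfile


def has_vba_project(data: bytes) -> bool:
    """Return True if an OOXML file contains a vbaProject.bin part."""
    try:
        with zipfile.ZipFile(io.BytesIO(data)) as archive:
            return any(
                name.lower().endswith("vbaproject.bin")
                for name in archive.namelist()
            )
    except zipfile.BadZipFile:
        # Not a ZIP container, for example a legacy .doc file.
        return False


def build_sample(part_names):
    """Build a tiny in-memory ZIP with the given part names (demo only)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for name in part_names:
            archive.writestr(name, b"placeholder")
    return buffer.getvalue()


with_macro = build_sample(["word/document.xml", "word/vbaProject.bin"])
without_macro = build_sample(["word/document.xml"])

print(has_vba_project(with_macro))     # True
print(has_vba_project(without_macro))  # False
```

A check like this says nothing about what the macro does. It is the kind of contextual field that is cheap to log centrally and useful in retrospective analysis.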

The dependency on application developers to create logging mechanisms is quite a challenge in this arena. However, I believe the solution in cases where applications do not log sufficiently is to take advantage of plugins. Most modern applications support plugins to some extent.

To summarise the tooling discussion, we can populate the computer abstraction layers with the mentioned tools.

| Level of abstraction  |    | Tools
|-----------------------|----|-------------
| Application           |    | Browser, Email and so on
|-----------------------|--->|-------------
| Operating System      |    | LC, CB, Sysmon
|-----------------------|--->|-------------
| Hardware              |    | Chipsec
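The table can also be expressed as a small data structure for checking which layers a given host actually covers. This is a sketch under my own assumptions: the class, its method names and the host name are hypothetical, and only the layer and tool names come from the table above.

```python
from dataclasses import dataclass, field

# Layers from the abstraction stack used in this article.
LAYERS = ("Hardware", "Operating System", "Application")


@dataclass
class TelemetryProfile:
    """Which tools report telemetry for a host, grouped by layer."""

    host: str
    sources: dict = field(default_factory=dict)  # layer -> set of tool names

    def add_source(self, layer, tool):
        if layer not in LAYERS:
            raise ValueError(f"unknown layer: {layer}")
        self.sources.setdefault(layer, set()).add(tool)

    def missing_layers(self):
        """Layers with no telemetry source at all."""
        return [layer for layer in LAYERS if not self.sources.get(layer)]


profile = TelemetryProfile(host="laptop-01")  # hypothetical host name
profile.add_source("Operating System", "Sysmon")
profile.add_source("Hardware", "Chipsec")

print(profile.missing_layers())  # ['Application']
```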

Conclusions: How Do We Defend in The Future?

In this article we have defined a structure and discussed in short one of the most prominent challenges faced by enterprise defenders today: how do we defend in the future?

Technology. This is the point where technology is no longer the sole solution to defending a network. Modern network architectures mean that defenders need to be able to fully comprehend and use human nature as a sensor. It is also about building intuitive systems that make the necessary data and information available to the defenders. In my mind technology has never been the sole solution anyway, so the technology evolution is for the greater good.

It seems obvious and unavoidable to me that network defenders must start looking outside the perimeter, just as intruders have done for many years already. This means adapting the available toolsets and lobbying for an architecture that reflects how humans actually use technology resources. Most people have owned private equipment for many years (surprise), and the line between employee and enterprise is blurred and confusing now that reality sinks in.

This means, in the technology aspect, that an emphasis must be put on the endpoints - and that network monitoring must again be about the metadata of the activity. In short: collect metadata from networks and content from endpoints.

Only this way will we, in the future, be able to create a full telemetry profile from each device under our responsibility.

[1] Article on indicators: /indicators/
[2] 50% encrypted: https://www.eff.org/deeplinks/2017/02/were-halfway-encrypting-entire-web
[3] that number: https://letsencrypt.org/stats/
[4] Expect-CT: https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Expect-CT
[5] Beyondcorp: https://cloud.google.com/beyondcorp/
[6] Chipsec: https://github.com/chipsec/chipsec
[7] Google Rapid Response (GRR): https://github.com/google/grr-doc/blob/master/publications.adoc
[8] LimaCharlie: https://github.com/refractionPOINT/lce_doc/blob/master/README.md
[9] Sysmon: https://www.rsaconference.com/writable/presentations/file_upload/hta-w05-tracking_hackers_on_your_network_with_sysinternals_sysmon.pdf
[10] Carbon Black: http://the.report/assets/Advanced-Threat-Hunting-with-Carbon-Black.pdf
[11] mozilla.org: https://developer.mozilla.org/en-US/docs/Mozilla/Debugging/HTTP_logging
[12] Office Telemetry Log: https://msdn.microsoft.com/en-us/library/office/jj230106.aspx