thoughts/data/cognitive-automation.md
Tommy Skaug 805a34f937
2024-08-05 20:24:56 +02:00


There is a lot of hype around many things in cyber security. One concept that is not hyped is Cognitive Automation (CA). CA is best explained by comparing it to traditional automation, that is, how tasks such as alert correlation are automated. Cognitive automation additionally takes into account the way the human mind works. I believe many security professionals will recognise the practical aspects of Schulte's model for "Complexity of automation vs effectiveness/safety" [1].

I wrote a post on this topic years ago ("The Role of Cognitive Automation in Information Security"), but unfortunately it was lost in a migration. It probably needed an update anyway, and I believe the cyber security field is now more mature and better prepared to receive this input than it was at that point.

Cognitive automation is widely applied in the aerospace industry, for instance. Aerospace realised long ago that the strengths of the human being include the ability to learn, instinct, problem reduction, abstraction and several others. The machine's strengths are parallel processing, objectivity, long-term monitoring, complex planning and decision making, and so on. Schulte describes this concept in detail in his Man-Machine Cooperation model [1].

In order to benefit from a similar model in cyber security, there is a need to evolve the way data is extracted, preprocessed and prepared for human-machine interaction. As may be recognised at this point, technology already exists to provide parallel processing on the machine side. The open question is how a computing cluster should be applied to such a problem. In that regard, machine learning is the most promising technique for structuring and classifying the data, and it seems to scale really well. Efficiently ingesting, storing and preprocessing the data is the first stage of that challenge.

Another detail that I would like to point out here, from the great book "The Multitasking Mind" by Salvucci and Taatgen, is how the human mind works with buffers (the aural, visual, declarative, goal, manual and problem buffers). A human can actually only handle one thing at a time. So when analysts are given several simultaneous tasks or roles, the quality of the results will definitely suffer. This is really important for all cyber security seniors and designers to understand, so read the book.

Back to how this applies in practical terms: when analysts manually analyse and decide based on expert knowledge, classifying the attributes of full content data and e.g. creating Yara and Snort signatures, it is a reasonable assumption that a number of relevant attributes are never evaluated as potential anomalies. This greatly widens the opportunities available to threat groups. In aerospace cognitive automation there is a concept called Mission Management, which deals with a problem similar to the one described here.

Now for a practical example of how cognitive automation can work, this time paralleled with the approach Netflix takes to movie recommenders. Let's say that you have stored the PDFiD [2] vector of every PDF document that has passed through a network over the last ten years. The vector structure will look like:

obj,endobj,stream,endstream,xref,trailer,startxref,/Page,/Encrypt,/JS,/JavaScript,/AA,/OpenAction,/JBIG2Decode

or:

1. 7,7,1,1,1,1,1,1,0,1,1,0,1,0
[...]
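As a minimal sketch of how such a row could be turned into something a program can work with, the snippet below pairs the field names with the counts. The function name `parse_pdfid_row` and the constant `PDFID_FIELDS` are hypothetical, and the input is assumed to be a plain comma-separated row like the one above rather than PDFiD's native output format.

```python
# Field order copied from the vector structure shown above.
PDFID_FIELDS = [
    "obj", "endobj", "stream", "endstream", "xref", "trailer", "startxref",
    "/Page", "/Encrypt", "/JS", "/JavaScript", "/AA", "/OpenAction",
    "/JBIG2Decode",
]


def parse_pdfid_row(row: str) -> dict[str, int]:
    """Turn one comma-separated row of counts into a name -> count mapping."""
    counts = [int(value) for value in row.strip().split(",")]
    if len(counts) != len(PDFID_FIELDS):
        raise ValueError(
            f"expected {len(PDFID_FIELDS)} counts, got {len(counts)}"
        )
    return dict(zip(PDFID_FIELDS, counts))


example = parse_pdfid_row("7,7,1,1,1,1,1,1,0,1,1,0,1,0")
# Fields that are often interesting when triaging a PDF.
print(example["/JS"], example["/JavaScript"], example["/OpenAction"])
```

Keeping the names attached to the counts makes it easier to pivot on a single attribute later, for example all documents where `/OpenAction` is non-zero.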

If 500 PDF files pass through the systems each day on average, that adds up to about 1,825,000 documents over those ten years. In addition, qtime is a significant part of that vector, and other parameters could be file names and so on.
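The volume estimate can be checked with a few lines of arithmetic. This is a rough sketch that assumes 365-day years (leap days ignored) and counts only the 14 fields listed earlier, not the time or file name attributes.

```python
PDFS_PER_DAY = 500        # average from the text
DAYS_PER_YEAR = 365       # assumption: leap days ignored
YEARS = 10
FIELDS_PER_VECTOR = 14    # number of fields in the vector listed above

total_documents = PDFS_PER_DAY * DAYS_PER_YEAR * YEARS
total_counts = total_documents * FIELDS_PER_VECTOR

print(f"{total_documents:,} documents")      # 1,825,000 documents
print(f"{total_counts:,} stored counts")     # 25,550,000 stored counts
```

Even with extra attributes per document, this is a data set that a machine can scan in full, while no analyst could keep it in mind.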

Suppose an analyst receives a suspicious PDF file that is initially hard to classify. In such a case the system should propose other related files to look at. Practically speaking, this frees up the analyst's cognitive capacity to use instinct, pattern recognition and creativity to classify the document. The machine, on the other hand, maintains objectivity, has great stress resistance, can retrieve a lot more information, and, unlike the analyst, can process and pivot on all ten years of documents.
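One simple way to "propose other related files" is a nearest-neighbour lookup over the stored vectors. The sketch below uses cosine similarity, which is my own choice for illustration and not something prescribed by Schulte, PDFiD or Netflix. The archive contents, document names and the `related_files` helper are all made up for the example.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equally long count vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an all-zero vector is treated as unrelated to everything
    return dot / (norm_a * norm_b)


def related_files(query, archive, top_n=3):
    """Return the top_n archived documents most similar to the query vector."""
    scored = [(cosine_similarity(query, vec), name) for name, vec in archive.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [(name, round(score, 3)) for score, name in scored[:top_n]]


# Hypothetical archive: 14 counts per document, same field order as above.
archive = {
    "doc_a.pdf": [7, 7, 1, 1, 1, 1, 1, 1, 0, 1, 1, 0, 1, 0],
    "doc_b.pdf": [40, 40, 12, 12, 1, 1, 1, 9, 0, 0, 0, 0, 0, 0],
    "doc_c.pdf": [9, 9, 2, 2, 1, 1, 1, 1, 0, 1, 1, 0, 1, 0],
}
suspicious = [8, 8, 1, 1, 1, 1, 1, 1, 0, 1, 1, 0, 1, 0]

for name, score in related_files(suspicious, archive, top_n=2):
    print(name, score)
```

A linear scan like this is fine for a toy archive. For millions of documents, and with raw counts dominated by `obj` and `endobj`, some form of scaling and indexing would likely be needed, and the machine's ranking is still only a suggestion for the analyst to judge.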

Now that you have had an introduction to the world of cognitive automation, I hope this will drive a discussion on how we can take our field to the next level. I am confident that this means understanding and solving problems before attempting to buy our way out of them.

[1] Schulte, D. A. 2002. Mission management and crew assistance for military aircraft: cognitive concepts and prototype evaluation.
[2] PDFiD: https://blog.didierstevens.com/2009/03/31/pdfid/