There is a lot of hype around many things in cyber security.
One concept that is not hyped is Cognitive Automation (CA).
CA is best explained by comparing it to traditional
automation, which is about how tasks such as alert
correlation are automated. Cognitive automation also takes
into account the way the human mind works. I believe many
security professionals will recognise the practical aspects
of Schulte's model for "Complexity of automation vs
effectiveness/safety" [1].

I wrote a post on this topic years ago ("The Role of
Cognitive Automation in Information Security"), but
unfortunately it was lost in a migration. It probably needed
an update anyway, and I believe the cyber security field is
more ready to receive this input now than it was back then.

Cognitive automation is widely applied in the aerospace
industry, for instance. In aerospace, there was a realisation
long ago that the strengths of the human being are the
ability to learn, instinct, problem reduction, abstraction
and several others. The machine's strengths are parallel
processing, objectivity, long-term monitoring, complex
planning and decision making, and so on. Schulte describes
this concept in detail in his Man-Machine Cooperation
model [1].

To benefit from a similar model in cyber security, we need
to evolve the way data is extracted, preprocessed and
prepared for human-machine interaction. As you may have
recognised by now, technology that provides parallel
processing on the machine side is already available. The
open question is how a computing cluster should be put to
work on the problem. In that regard, machine learning is the
most promising technique for structuring and classifying the
data, and it seems to scale really well. Efficiently
ingesting, storing and preprocessing the data is the first
stage of that challenge.

Another detail that I would like to point out here, from the
great book "The Multitasking Mind" by Salvucci and Taatgen,
is how the human mind works with buffers (the aural, visual,
declarative, goal, manual and problem buffers). A human can
actually only handle one thing at a time. So when analysts
are given several simultaneous tasks or roles, the quality
of the results will definitely suffer. This is really
important for all senior cyber security professionals and
designers to understand, so read the book.

Back to how this applies in practical terms: when analysts
analyse data manually and decide based on expert knowledge,
classifying the attributes of full content data and, for
example, creating Yara and Snort signatures, it is
reasonable to assume that a number of relevant attributes
are never evaluated as potential anomalies. This leaves
threat groups considerably more room to operate. In
aerospace cognitive automation there is a concept called
Mission Management, which deals with a problem similar to
the one described here.

Now for a practical example of how cognitive automation can
work, this time drawing a parallel to Netflix's approach to
movie recommendations. Let's say that you have stored the
PDFiD [2] vector of every PDF document that passed through a
network over the last ten years. The vector structure will
look like:

```
obj,endobj,stream,endstream,xref,trailer,startxref,/Page,/Encrypt,/JS,/JavaScript,/AA,/OpenAction,/JBIG2Decode
```

or:

```
1. 7,7,1,1,1,1,1,1,0,1,1,0,1,0
[...]
```

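As a minimal sketch of how such a row could be handled in code, the snippet below maps the example counts onto the field names from the header row. The function name `parse_pdfid_row` and the choice of which keywords to single out are assumptions made for illustration. They are not part of PDFiD itself, and the row is parsed without its leading row number.

```python
# Field order copied from the header row shown in the post.
PDFID_FIELDS = [
    "obj", "endobj", "stream", "endstream", "xref", "trailer",
    "startxref", "/Page", "/Encrypt", "/JS", "/JavaScript", "/AA",
    "/OpenAction", "/JBIG2Decode",
]


def parse_pdfid_row(row: str) -> dict[str, int]:
    """Map one comma-separated row of counts onto the field names.

    Hypothetical helper for illustration; not a PDFiD function.
    """
    counts = [int(value) for value in row.strip().split(",")]
    if len(counts) != len(PDFID_FIELDS):
        raise ValueError(
            f"expected {len(PDFID_FIELDS)} values, got {len(counts)}"
        )
    return dict(zip(PDFID_FIELDS, counts))


# The example row from the post, without its leading row number.
example = parse_pdfid_row("7,7,1,1,1,1,1,1,0,1,1,0,1,0")

# Assumption: these keywords hint at active content worth a closer look.
active_content = [
    name
    for name in ("/JS", "/JavaScript", "/AA", "/OpenAction")
    if example[name] > 0
]
```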
If 500 PDF files pass through the systems each day on
average, that adds up to 1,825,000 documents (500 × 365 × 10)
over those ten years. In addition, qtime is a significant
part of that vector, and other parameters could be file
names and so on.

Say an analyst receives a suspicious PDF file that is
initially hard to classify. In such a case the system should
propose other, related files to look at. Practically
speaking, this frees up the analyst's cognitive capacity to
use instinct, pattern recognition and creativity to classify
the document. The machine, on the other hand, maintains
objectivity, has great stress resistance, can retrieve a lot
more information, and, unlike the analyst, can process and
pivot on all ten years of documents.

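One way to picture "propose other related files" is a nearest-neighbour lookup over the stored vectors. The sketch below ranks stored documents by cosine similarity to the suspicious one. Cosine similarity is an assumption chosen for illustration, not necessarily what a production recommender would use. The names `most_similar` and `stored` are hypothetical, and the `doc-b` and `doc-c` vectors are invented sample data.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length count vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an all-zero vector has no direction to compare
    return dot / (norm_a * norm_b)


def most_similar(query, stored, top_n=3):
    """Return the top_n (doc_id, score) pairs, most similar first."""
    scored = [
        (cosine_similarity(query, vector), doc_id)
        for doc_id, vector in stored.items()
    ]
    scored.sort(reverse=True)
    return [(doc_id, score) for score, doc_id in scored[:top_n]]


# Hypothetical archive; a real one would hold millions of vectors.
stored = {
    "doc-a": [7, 7, 1, 1, 1, 1, 1, 1, 0, 1, 1, 0, 1, 0],
    "doc-b": [40, 40, 12, 12, 1, 1, 1, 8, 0, 0, 0, 0, 0, 0],
    "doc-c": [8, 8, 1, 1, 1, 1, 1, 1, 0, 1, 1, 0, 1, 0],
}

# The suspicious file's vector (same counts as the example row).
suspicious = [7, 7, 1, 1, 1, 1, 1, 1, 0, 1, 1, 0, 1, 0]
related = most_similar(suspicious, stored, top_n=2)
```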
Now that you have had an introduction to the world of
cognitive automation, I hope this will drive a discussion on
how we can take our field to the next level. I am confident
that this means understanding and solving problems before
attempting to buy our way out of them.

[1] Schulte, D. A. 2002. Mission management and crew assistance for military aircraft: cognitive concepts and prototype evaluation.

[2] PDFiD: https://blog.didierstevens.com/2009/03/31/pdfid/