thoughts/data/cognitive-automation.md

There is a lot of hype around many things in cyber
security. One concept that is not hyped is Cognitive
Automation (CA). CA is best explained by comparing it to
traditional automation, that is, the way tasks such as alert
correlation are automated today. Cognitive automation
additionally takes into account the way the human mind
works. I believe many security professionals will recognise
the practical aspects of Schulte's model for "Complexity of
automation vs effectiveness/safety" [1].

I wrote a post on this topic years ago ("The Role of
Cognitive Automation in Information Security"), but
unfortunately it was lost in a migration. It probably needed
an update anyway, and I believe the cyber security field is
more ready to receive this input now than it was then.

Cognitive automation is widely applied in the aerospace
industry, for instance. Aerospace realised long ago that the
strengths of the human being are the ability to learn,
instinct, problem reduction, abstraction and several
others. The machine's strengths are parallel processing,
objectivity, long-term monitoring, complex planning and
decision making, and so on. Schulte describes this concept
in detail in his Man-Machine Cooperation model [1].

In order to benefit from a similar model in cyber security
there is a need to evolve the way data is extracted,
preprocessed and prepared for human-machine interaction. As
may be recognised at this point, technology is already
available to provide parallel processing on the machine
side. The open question is how a computing cluster should be
put to work on the problem. In that regard, machine learning
is the most promising technique for structuring and
classifying the data, and it seems to scale well.
Efficiently ingesting, storing and preprocessing the data is
the first stage of that challenge.

Another detail that I would like to point out here, from the
great book "The Multitasking Mind" by Salvucci and Taatgen,
is how the human mind works with buffers (the aural, visual,
declarative, goal, manual and problem buffers). A human can
actually only handle one thing at a time. So when analysts
are given several simultaneous tasks or roles, the quality
of the results will definitely suffer. This is really
important for all cyber security seniors and designers to
understand, so read the book.

Back to how this applies in practical terms: when analysts
manually analyse and decide by expert knowledge, classifying
the attributes of full content data and e.g. creating Yara
and Snort signatures, it is a reasonable assumption that a
number of relevant attributes are never evaluated as
potential anomalies. This leaves threat groups considerably
more room to operate undetected. In aerospace cognitive
automation there is a concept called Mission Management,
which deals with a problem similar to the one described
here.

Now for a practical example of how cognitive automation can
work, this time in parallel with the approach Netflix takes
to movie recommenders. Let's say that you have stored the
PDFiD [2] vector of every PDF document that passed through a
network over the last ten years. The vector structure will
look like:

```
obj,endobj,stream,endstream,xref,trailer,startxref,/Page,/Encrypt,/JS,/JavaScript,/AA,/OpenAction,/JBIG2Decode
```

or, with the counts for one document filled in:

```
1. 7,7,1,1,1,1,1,1,0,1,1,0,1,0
[...]
```
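As a minimal sketch of how such a row could be handled in code, the snippet below maps one comma-separated row of counts onto the field names from the header line. The function name `parse_pdfid_row` and the plain-dictionary representation are assumptions made for illustration; they are not part of PDFiD itself.

```python
# Field order copied from the header line shown above.
PDFID_FIELDS = [
    "obj", "endobj", "stream", "endstream", "xref", "trailer", "startxref",
    "/Page", "/Encrypt", "/JS", "/JavaScript", "/AA", "/OpenAction",
    "/JBIG2Decode",
]


def parse_pdfid_row(row: str) -> dict[str, int]:
    """Turn one comma-separated row of counts into a {field: count} mapping.

    Hypothetical helper: assumes the row holds exactly one integer per field,
    in the same order as PDFID_FIELDS.
    """
    values = [int(v) for v in row.strip().split(",")]
    if len(values) != len(PDFID_FIELDS):
        raise ValueError(
            f"expected {len(PDFID_FIELDS)} values, got {len(values)}"
        )
    return dict(zip(PDFID_FIELDS, values))


# The example row from the text, without its leading row number.
example = parse_pdfid_row("7,7,1,1,1,1,1,1,0,1,1,0,1,0")
print(example["/JS"], example["/OpenAction"])
```

Keeping the field names attached to the counts makes it easier to pivot on a single attribute later, for instance on every document where `/JS` and `/OpenAction` are both non-zero.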

If 500 PDF files pass through the systems each day on
average, that will be about 1,825,000 documents over those
ten years. In addition, qtime is a significant part of that
vector, and other parameters could be file names and so on.

Say an analyst receives a suspicious PDF file that is
initially hard to classify. In such a case the system should
propose other related files to look at. Practically
speaking, this frees the analyst's cognitive capacity for
instinct, pattern recognition and creativity when
classifying the document. The machine, on the other hand,
maintains objectivity, has great stress resistance, can
retrieve a lot more information, and, unlike the analyst,
can process and pivot on all ten years of documents.
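The proposal step described above can be sketched as a nearest-neighbour lookup over the stored vectors. The example below ranks archived documents by cosine similarity to the suspicious one. Cosine similarity is my choice for illustration, not something prescribed by Schulte or by PDFiD, and the archive contents, document IDs and function names are all hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equally long count vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an all-zero vector is treated as unrelated to everything
    return dot / (norm_a * norm_b)


def related_documents(query, archive, top_n=3):
    """Return the top_n (doc_id, score) pairs most similar to the query."""
    scored = [
        (cosine_similarity(query, vector), doc_id)
        for doc_id, vector in archive.items()
    ]
    scored.sort(reverse=True)  # highest similarity first
    return [(doc_id, score) for score, doc_id in scored[:top_n]]


# Hypothetical archive: document ID -> PDFiD count vector (14 fields).
archive = {
    "doc-a": [7, 7, 1, 1, 1, 1, 1, 1, 0, 1, 1, 0, 1, 0],
    "doc-b": [20, 20, 8, 8, 1, 1, 1, 5, 0, 0, 0, 0, 0, 0],
    "doc-c": [6, 6, 1, 1, 1, 1, 1, 1, 0, 1, 1, 0, 1, 0],
}

suspicious = [7, 7, 1, 1, 1, 1, 1, 1, 0, 1, 1, 0, 1, 0]
for doc_id, score in related_documents(suspicious, archive):
    print(doc_id, round(score, 3))
```

A linear scan like this is only meant to show the idea. With raw counts, large fields such as `obj` dominate the score, so some scaling of the features would likely be needed, and an archive of well over a million vectors would call for a proper index rather than a loop.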

Now that you have gotten an introduction to the world of
cognitive automation, I hope this will drive a discussion on
how we can take our field to the next level. I am confident
that this means understanding and solving problems before
attempting to buy our way out of them.


[1] Schulte, D. A. 2002. Mission management and crew assistance for military aircraft: cognitive concepts and prototype evaluation.
[2] PDFiD: https://blog.didierstevens.com/2009/03/31/pdfid/