thoughts/data/microsoft-dominating-email.md
Tommy Skaug 3860d96672
2024-08-05 21:33:47 +02:00


## Key Takeaways
* While market dominance was formerly an issue discussed for
operating systems, the modern equivalent occurs in the form of
cloud services, primarily from Microsoft, Amazon and Google.
* Data from the Norwegian business registry mapped to email
records shows that Microsoft Office 365 has become a dominant
force among Norwegian private businesses (38%) and is used by
61% of government organisations.
* Microsoft being a significant actor for email indicates that
Norwegian organisations are putting a lot more faith in
Microsoft, since email as a service is today bundled with direct
messaging and wikis.
## Introduction
In 2003 Dan Geer, Bruce Schneier and others wrote a paper named
"How the Dominance of Microsoft's Products Poses a Risk to
Security". It eventually cost Geer his job at @stake.
[^realpolitik]: [Cybersecurity as Realpolitik](https://www.youtube.com/watch?v=nT-TGvYOBpI) by Dan Geer,
presented at Black Hat USA 2014. In the presentation he gives
Microsoft some credit for improving the security situation in
Windows. At the same time, have a look at his presentation
form without PowerPoint.
The paper revolves around Microsoft's dominance in operating
systems and some solutions to market dominance as a problem.
In this article I am not going to reiterate the points made by
Geer et al. I think these are perfectly valid and easily
transferable to the current landscape. The whole paper is
worth reading, but I'd like to highlight one part:
> Governments, and perhaps only governments, are in leadership
> positions to affect how infrastructures develop. By enforcing
> diversity of platform to thereby blunt the monoculture risk,
> governments will reap a side benefit of increased market
> reliance on interoperability, which is the only foundation for
> effective incremental competition and the only weapon against
> end-user lock-in. A requirement that no operating system be more
> than 50% of the installed based in a critical industry or in a
> government would moot monoculture risk. Other branches to the
> risk diversification tree can be foliated to a considerable
> degree, but the trunk of that tree on which they hang is a total
> prohibition of monoculture coupled to a requirement of
> standards-based interoperability.
[^eu_interoperability]: The European Union has [long been a
proponent](https://www.openforumeurope.org/wp-content/uploads/2020/11/Ian_Brown_Interoperability_for_competition_regulation.pdf) of interoperability in markets.
Azure is the Windows of 2021. The walled gardens are somewhat
redefined, but they are there in much the same way as Windows was
in 2003. The Microsoft monopoly is technically broken, and there
are now options from Amazon, Google and even Apple, but I would
argue the monoculture is still present in shared approaches,
infrastructure and concepts.
I decided to have a closer look at the distribution in a
representative dataset provided by an authoritative source in
Norway: the business registry.
## Taking a Close Look at The Data
In Norway we have a public registry of organisations. This
registry is categorised by standardised sector codes (typically
"government", "private" and so on). Using the JSON data provided
by brreg.no, a list of websites can be extracted.
Retrieve the organisation list from brreg.no:
```
curl https://data.brreg.no/enhetsregisteret/api/enheter/lastned > enheter.gz
gzip -d enheter.gz
```
Reshape the JSON data by website URL, sector and business code.
```
# Keep only entities with a website; emit URL, business code and sector code
jq '[.[] | select(.hjemmeside != null)
     | {url: .hjemmeside, code: .naeringskode1.kode, sector: .institusjonellSektorkode.kode}]' enheter > webpages.txt
```
Based on the URL, add the primary domain to each JSON entity,
then resolve that domain's MX record and add the primary domain
of the MX host as well.
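The step above could be sketched roughly as follows. This is a minimal illustration, not the script used for this article: `primary_domain` uses a naive "last two labels" heuristic (a public suffix list would be more accurate), the output keys `domain`, `mx` and `mx_domain` are names chosen here, and `resolve_mx` is a hypothetical caller-supplied function that performs the actual DNS lookup.

```python
from urllib.parse import urlparse


def primary_domain(host_or_url: str) -> str:
    """Return the last two DNS labels of a URL or hostname (naive heuristic)."""
    value = host_or_url.strip().lower()
    # Registry URLs often lack a scheme; add one so urlparse finds the host.
    if "://" not in value:
        value = "http://" + value
    host = (urlparse(value).hostname or "").rstrip(".")
    labels = host.split(".")
    return ".".join(labels[-2:]) if len(labels) >= 2 else host


def annotate(entity: dict, resolve_mx) -> dict:
    """Copy `entity`, adding its primary domain and the MX host's primary domain.

    `resolve_mx` is a hypothetical caller-supplied function that takes a
    domain and returns the preferred MX hostname, or None if nothing resolves.
    """
    domain = primary_domain(entity["url"])
    mx_host = resolve_mx(domain) if domain else None
    return {
        **entity,
        "domain": domain,
        "mx": mx_host,
        "mx_domain": primary_domain(mx_host) if mx_host else None,
    }
```

Keeping the DNS lookup behind `resolve_mx` means the reshaping logic can be exercised without network access; any resolver (a DNS library or a wrapper around a command-line tool) can be plugged in.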
[^ssb]: A rough categorisation based on the Norwegian
[standard](https://www.ssb.no/klass/klassifikasjoner/39) provided by Statistics Norway (I'm sure it could
be improved).
Using the JSON file generated above, populate the following JSON
dictionary, where each group lists the sector codes it covers.
```
{
"government":{"codes": [6100,6500,1110,1120], "total":0, "counts":{}},
"municipals":{"codes": [1510,4900,1520], "total":0, "counts":{}},
"finance":{"codes": [3200,3500,3600,4300,3900,4100,4500,4900,5500,5700,4900,7000], "total":0, "counts":{}},
"private":{"codes": [4500,4900,2100,2300,2500], "total":0, "counts":{}}
}
```
Generate CSV output based on each sector grouping above.
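One way to populate the dictionary and produce the CSV is sketched below. It assumes each entity carries a `sector` code and an `mx_domain` field (names chosen for this illustration), and that only entities with a resolved MX record count towards a group's total. The column names in the CSV are likewise assumptions.

```python
import csv
import io
from collections import Counter


def tally(entities, groups):
    """Count MX domains per sector group.

    `entities` are dicts with "sector" and "mx_domain" keys (assumed names);
    `groups` maps a group name to {"codes": [...]} as in the dictionary above.
    """
    result = {name: {"total": 0, "counts": Counter()} for name in groups}
    for entity in entities:
        if not entity.get("mx_domain"):
            continue  # skip entities without a resolved MX record
        try:
            code = int(entity["sector"])
        except (KeyError, TypeError, ValueError):
            continue  # missing or non-numeric sector code
        for name, group in groups.items():
            # A code listed under several groups is counted once in each.
            if code in group["codes"]:
                result[name]["total"] += 1
                result[name]["counts"][entity["mx_domain"]] += 1
    return result


def to_csv(result) -> str:
    """Render one CSV row per (group, MX domain), most common first."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["group", "mx_domain", "count", "total", "share_pct"])
    for name, data in result.items():
        for domain, count in data["counts"].most_common():
            share = round(100 * count / data["total"], 1)
            writer.writerow([name, domain, count, data["total"], share])
    return buffer.getvalue()
```

Note that some sector codes appear under more than one group in the dictionary, so with this sketch an organisation with such a code contributes to every group that lists it.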
## The Result
The top vendor was, not surprisingly, Microsoft's outlook.com. Of
the 120k sites, 98k resolved an MX record. I summarise the
outlook.com figures below, as it appears to be the dominant actor
in all categories:
* In government, 61% are O365 users (1420/2317)
* For municipalities, the share is 55% (688/1247)
* For the diverse financial grouping, 21% use O365 (4836/23125)
* For the diverse private companies, 38% use O365 (14615/38129)
Of the 98k sites, Microsoft runs the email service for 21559
organisations. For comparison, Google MX domains account for about
5500.
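The percentages follow directly from the counts in the list above, and the four O365 counts add up to the 21559 total. A quick check, using only the figures quoted in this section (the group names are just labels):

```python
# (O365 users, organisations in group), copied from the list above
figures = {
    "government": (1420, 2317),
    "municipals": (688, 1247),
    "finance": (4836, 23125),
    "private": (14615, 38129),
}

# Share of O365 users per group, rounded to whole percent
shares = {name: round(100 * hits / total) for name, (hits, total) in figures.items()}
# Sum of O365 users across the four groups
o365_total = sum(hits for hits, _ in figures.values())

print(shares)      # {'government': 61, 'municipals': 55, 'finance': 21, 'private': 38}
print(o365_total)  # 21559
```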
While the above directly measures who delivers email services, it
also indicates that these organisations rely on other bundled
services, such as internal wikis and direct messaging.
An overview of the top 10 vendors is shown below.
![](static/img/data/mx_domains.png)
## Sources of Errors
Even though I believe the statistics above are representative,
they have some possible sources of error:
1. The organisation isn't listed with a URL in the organisation
registry, or it uses an email domain not associated with the
primary domain of its web address
2. The organisation uses an SMTP proxy
3. The organisation has an inactive SMTP record
I found that there are more than 1 million listed organisations in
the brreg.no registry and 120k websites in the JSON data
provided. This means the dataset represents at most 12% of the
organisations listed.
Also, email alone doesn't represent a diverse infrastructure, but
I believe it is an indicator of current trends for other cloud
services as well, e.g. Azure, Google Compute Engine and so on.