2024-08-05 18:24:56 +00:00
## Key Takeaways
* While market dominance was formerly an issue discussed for
operating systems, the modern equivalent occurs in the form of
cloud services, primarily from Microsoft, Amazon and Google.
* Data from the Norwegian business registry mapped to email
records shows that Microsoft Office 365 has become the dominant
email provider: it serves 38% of the Norwegian private businesses
and 61% of the government organisations in the dataset.
* Microsoft being a significant actor for email indicates that
Norwegian organisations are putting a lot more faith in
Microsoft than email alone, since email as a service is today
bundled with direct messaging and wikis.
## Introduction
In 2003 Dan Geer, Bruce Schneier and others wrote a paper named
"How the Dominance of Microsoft's Products Poses a Risk to
Security" [1]. It eventually cost Geer his job at AtStake.
The paper revolves around Microsoft's dominance in operating
systems, and Geer has since given Microsoft credit for a better
approach to security [2].
In this article I am not going to reiterate the points made by
Geer et al. I think these are perfectly valid and easily
transferable to the current landscape. The whole paper is worth
reading, but I'd like to highlight one part:
> Governments, and perhaps only governments, are in leadership
> positions to affect how infrastructures develop. By enforcing
> diversity of platform to thereby blunt the monoculture risk,
> governments will reap a side benefit of increased market
> reliance on interoperability, which is the only foundation for
> effective incremental competition and the only weapon against
> end-user lock-in. A requirement that no operating system be more
> than 50% of the installed based in a critical industry or in a
> government would moot monoculture risk. Other branches to the
> risk diversification tree can be foliated to a considerable
> degree, but the trunk of that tree on which they hang is a total
> prohibition of monoculture coupled to a requirement of
> standards-based interoperability.
Azure is Windows in 2021. The walled gardens are somewhat
redefined, but they are there much as Windows was
in 2003. The Microsoft monopoly is technically broken, and there
are now options from Amazon, Google and even Apple, but I would
argue the monoculture is still present in shared approaches,
infrastructure and concepts.
I decided to have a closer look at the distribution from a
representative dataset provided by an authoritative source in
Norway: the business registry.
## Taking a Close Look at The Data
In Norway we have a public registry of organisations. This
registry is categorised by standardised sector codes (typically
"government", "private" and so on). Using the JSON data provided
by brreg.no, a list of websites can be extracted:
1. Retrieve the organisation list from brreg.no [3]
```
curl https://data.brreg.no/enhetsregisteret/api/enheter/lastned > enheter.gz
gzip -d enheter.gz
```
2. Reshape the JSON data by website URL, sector and business code.
```
cat enheter |
jq '[.[] | select(.hjemmeside != null) | {url:.hjemmeside, code:.naeringskode1.kode, sector:.institusjonellSektorkode.kode}]' > webpages.txt
```
3. Based on the URL, add the primary domain and resolve its MX
record and the MX primary domain to each JSON entity
4. Using the JSON-file generated above, populate the following
JSON dictionary. This is also a rough categorisation based on
the standard provided by Statistics Norway (I'm sure it could
be improved) [4]:
```
{
"government":{"codes": [6100,6500,1110,1120], "total":0, "counts":{}},
"municipals":{"codes": [1510,4900,1520], "total":0, "counts":{}},
"finance":{"codes": [3200,3500,3600,4300,3900,4100,4500,4900,5500,5700,4900,7000], "total":0, "counts":{}},
"private":{"codes": [4500,4900,2100,2300,2500], "total":0, "counts":{}}
}
```
5. Generate CSV output based on each sector grouping above.
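
Steps 3 to 5 are only described in words above, so here is a minimal Python sketch of how they could fit together. It is not the script used for this article. The function names are made up for illustration, the "primary domain" is naively taken as the last two labels of the host name (no public suffix list), the group codes are assumed to be matched against the `sector` field from step 2, and `resolve_mx` assumes the third-party dnspython package is installed.

```python
import csv
from urllib.parse import urlparse


def primary_domain(value):
    """Naive primary domain: the last two labels of a URL's or host's name."""
    value = (value or "").strip()
    if not value:
        return None
    # Registry URLs often lack a scheme; urlparse needs one to find the host.
    if "://" not in value:
        value = "http://" + value
    try:
        host = urlparse(value).hostname or ""
    except ValueError:
        return None
    labels = [label for label in host.lower().split(".") if label]
    return ".".join(labels[-2:]) if len(labels) >= 2 else None


def resolve_mx(domain):
    """Return the most preferred MX host for a domain, or None on failure."""
    # Assumption: dnspython (>= 2.0) is available. Imported lazily so the
    # rest of the sketch works without it.
    import dns.exception
    import dns.resolver
    try:
        answers = dns.resolver.resolve(domain, "MX")
    except dns.exception.DNSException:
        return None
    best = min(answers, key=lambda record: record.preference)
    return str(best.exchange)


def tally(sites, groups, resolve=resolve_mx):
    """Count MX primary domains per group (steps 3 and 4)."""
    for site in sites:
        domain = primary_domain(site.get("url"))
        mx_host = resolve(domain) if domain else None
        vendor = primary_domain(mx_host)
        if vendor is None:
            continue  # no usable URL or no MX record
        try:
            sector = int(site.get("sector"))
        except (TypeError, ValueError):
            continue  # entry without a sector code
        # A sector code listed in several groups is counted in each of them.
        for group in groups.values():
            if sector in group["codes"]:
                group["total"] += 1
                group["counts"][vendor] = group["counts"].get(vendor, 0) + 1
    return groups


def write_csv(groups, out):
    """Write one row per group and MX domain, largest first (step 5)."""
    writer = csv.writer(out)
    writer.writerow(["group", "mx_domain", "count", "total"])
    for name, group in groups.items():
        ranked = sorted(group["counts"].items(), key=lambda item: -item[1])
        for vendor, count in ranked:
            writer.writerow([name, vendor, count, group["total"]])
```

With the dictionary from step 4 loaded as `groups` and `webpages.txt` from step 2 loaded as `sites`, calling `tally(sites, groups)` and then `write_csv(groups, sys.stdout)` would print the per-group counts. Passing a different `resolve` function makes it possible to try the logic without touching DNS.
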
## The Result
The top vendor was, not surprisingly, Microsoft's outlook.com. Of
the 120k sites, 98k resolved an MX record. Since outlook.com
appears to be the dominating actor in all categories, I summarise
its share per category below:
* In government, 61% are O365 users (1420/2317)
* For municipals, the share is 55% (688/1247)
* For the diverse financial grouping, 21% use O365 (4836/23125)
* For the diverse private companies, 38% use O365 (14615/38129)
Of the 98k sites, Microsoft runs the email service for 21559
organisations. For comparison, Google MX domains account
for about 5500.
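
As a quick sanity check, the percentages can be recomputed from the counts quoted above. This is only a sketch: the variable names are illustrative, and the overall share uses the rounded 98k figure, so it is approximate.

```python
# (O365 users, organisations with an MX record) per group, as quoted above
groups = {
    "government": (1420, 2317),
    "municipals": (688, 1247),
    "finance": (4836, 23125),
    "private": (14615, 38129),
}

# Share of O365 users per group, rounded to whole percent
shares = {
    name: round(100 * hits / total) for name, (hits, total) in groups.items()
}

# The four group counts add up to the overall Microsoft figure
o365_total = sum(hits for hits, _ in groups.values())

# Approximate overall share, since 98k is itself a rounded number
overall = round(100 * o365_total / 98_000)

for name, share in shares.items():
    print(f"{name}: {share}%")
print(f"total O365: {o365_total} (about {overall}% of 98k)")
```

The four group counts sum to 21559, which matches the total given for Microsoft, or roughly 22% of the sites that resolved an MX record.
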
While the above directly measures who delivers email services, it
also indicates that these organisations rely on other bundled
services, such as internal wikis and direct messaging.
An overview of the top 10 vendors is shown below.
![](static/img/data/mx_domains.png)
## Sources of Errors
Even though I believe the statistics above are representative,
they have some possible sources of error:
1. The organisation isn't listed with a URL in the organisation
registry, or its email uses a domain not associated with the
primary domain of its web address
2. The organisation uses an SMTP proxy
3. The organisation has an inactive MX record
I found that there are more than 1 million listed organisations in
the brreg.no registry and 120k websites in the JSON data
provided. This means this dataset represents at most 12% of the
companies listed.
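
The coverage figure follows from simple arithmetic on the rounded counts in the text. A small sketch, with illustrative variable names:

```python
listed = 1_000_000   # "more than 1 million" organisations, so a lower bound
with_url = 120_000   # entries with a website in the JSON data
with_mx = 98_000     # websites whose domain resolved an MX record

# Upper bound on coverage, because `listed` is a lower bound
url_share = 100 * with_url / listed
# Share of the listed websites that could be mapped to an email vendor
mx_share = 100 * with_mx / with_url

print(f"at most {url_share:.0f}% of organisations are listed with a URL")
print(f"about {mx_share:.0f}% of those resolved an MX record")
```

So the vendor statistics rest on roughly 82% of the 120k websites, which in turn cover at most 12% of the registry.
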
Also, email alone doesn't represent a diverse infrastructure, but
I believe it is an indicator of current trends for other cloud
services as well, such as Azure and Google Compute Engine.
[1] CyberInsecurity: The Cost of Monopoly, Geer et al., 2003:
https://cryptome.org/cyberinsecurity.htm
[2] Cybersecurity as Realpolitik by Dan Geer presented at Black
Hat USA 2014: https://www.youtube.com/watch?v=nT-TGvYOBpI
[3] https://data.brreg.no/enhetsregisteret/api/enheter/lastned
[4] https://www.ssb.no/klass/klassifikasjoner/39