## Key Takeaways
* While market dominance was formerly an issue discussed for
  operating systems, the modern equivalent occurs in the form of
  cloud services, primarily from Microsoft, Amazon and Google.
* Data from the Norwegian business registry mapped to email
  records shows that Microsoft Office 365 has become a dominant
  force among Norwegian private businesses and is used by 61% of
  government organisations.
* Microsoft being a significant actor for email indicates that
  Norwegian organisations are putting a lot more faith in
  Microsoft, since email as a service is today bundled with
  direct messaging and wikis.

## Introduction
In 2003 Dan Geer, Bruce Schneier and others wrote a paper named
"How the Dominance of Microsoft's Products Poses a Risk to
Security". It eventually cost Geer his job at @stake.

[^realpolitik]: [Cybersecurity as Realpolitik](https://www.youtube.com/watch?v=nT-TGvYOBpI) by Dan Geer,
presented at Black Hat USA 2014. In the presentation he gives
Microsoft some credit for improving the security situation in
Windows. It is also worth watching for its form: he presents
without PowerPoint.

The paper revolves around Microsoft's dominance in operating
systems and some solutions to market dominance as a problem.
In this article I am not going to reiterate the points made by
Geer et al. I think these are perfectly valid and easily
transferable to the current landscape. The whole paper is
worth reading, but I'd like to highlight one part:

> Governments, and perhaps only governments, are in leadership
> positions to affect how infrastructures develop. By enforcing
> diversity of platform to thereby blunt the monoculture risk,
> governments will reap a side benefit of increased market
> reliance on interoperability, which is the only foundation for
> effective incremental competition and the only weapon against
> end-user lock-in. A requirement that no operating system be more
> than 50% of the installed based in a critical industry or in a
> government would moot monoculture risk. Other branches to the
> risk diversification tree can be foliated to a considerable
> degree, but the trunk of that tree on which they hang is a total
> prohibition of monoculture coupled to a requirement of
> standards-based interoperability.

[^eu_interoperability]: The European Union has [long been a
proponent](https://www.openforumeurope.org/wp-content/uploads/2020/11/Ian_Brown_Interoperability_for_competition_regulation.pdf) of interoperability in markets.

Azure is Windows in 2021. The walled gardens are somewhat
redefined, but they are there in a similar fashion as Windows was
in 2003. The Microsoft monopoly is technically broken, and there
are now options from Amazon, Google and even Apple, but I would
argue the monoculture is still present in shared approaches,
infrastructure and concepts.

I decided to have a closer look at the distribution in a
representative dataset provided by an authoritative source in
Norway: the business registry.

## Taking a Close Look at The Data
In Norway we have a public registry of organisations. This
registry is categorised by standardised sector codes (typically
"government", "private" and so on). Using the JSON data provided
by brreg.no, a list of websites can be extracted.

Retrieve the organisation list from brreg.no:

```
curl https://data.brreg.no/enhetsregisteret/api/enheter/lastned > enheter.gz
gzip -d enheter.gz
```

Reshape the JSON data by website URL, sector and business code:

```
# Keep only units that list a website; emit URL, business code and sector code
jq '[.[] | select(.hjemmeside != null) | {url:.hjemmeside, code:.naeringskode1.kode, sector:.institusjonellSektorkode.kode}]' enheter > webpages.txt
```
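
For readers who prefer to stay in one language for the later steps, the jq filter above can be mirrored in Python. This is only a sketch of the same reshaping: the function name `reshape` is my own, and it assumes the registry file has already been loaded into a list of dictionaries with `json.load`.

```python
def reshape(units: list[dict]) -> list[dict]:
    """Mirror the jq filter: keep units with a website, emit url/code/sector."""
    rows = []
    for unit in units:
        url = unit.get("hjemmeside")
        if url is None:
            continue  # same effect as select(.hjemmeside != null)
        rows.append({
            "url": url,
            # Missing nested objects become None, like null in jq
            "code": (unit.get("naeringskode1") or {}).get("kode"),
            "sector": (unit.get("institusjonellSektorkode") or {}).get("kode"),
        })
    return rows
```

Units without a business code or sector code are kept with `None` in those fields, which matches how jq emits `null` for a missing path.
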

Based on the URL, add the primary domain to each JSON entity,
then resolve its MX record and add the primary domain of that MX
host as well.

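
A minimal Python sketch of this enrichment step could look like the following. It is not the script used for the article. All function names are hypothetical, the MX lookup assumes the third-party `dnspython` package, and the "primary domain" is naively taken as the last two labels of the host name, which ignores the Public Suffix List (so `oslo.kommune.no` would collapse to `kommune.no`).

```python
from urllib.parse import urlparse


def primary_domain(host: str) -> str:
    """Naively keep the last two labels, e.g. mail.example.no -> example.no."""
    labels = host.strip().rstrip(".").lower().split(".")
    return ".".join(labels[-2:])


def host_from_url(url: str) -> str:
    """Registry URLs often lack a scheme, so add one before parsing."""
    if "://" not in url:
        url = "http://" + url
    return urlparse(url).hostname or ""


def lookup_mx(domain: str) -> list[str]:
    """Return MX hosts ordered by preference (assumes dnspython is installed)."""
    import dns.resolver  # imported lazily so the pure helpers work without it

    answers = dns.resolver.resolve(domain, "MX")
    ranked = sorted(answers, key=lambda record: record.preference)
    return [str(record.exchange).rstrip(".") for record in ranked]


def enrich(entry: dict, lookup=lookup_mx) -> dict:
    """Add domain, mx and mx_domain keys to one reshaped registry entry."""
    domain = primary_domain(host_from_url(entry["url"]))
    try:
        mx_hosts = lookup(domain)
    except Exception:
        mx_hosts = []  # unresolved domains are kept, with no MX data
    mx = mx_hosts[0] if mx_hosts else None
    return {**entry, "domain": domain, "mx": mx,
            "mx_domain": primary_domain(mx) if mx else None}
```

Passing the lookup function as a parameter makes it possible to try the logic on a few entries with a stub before running real DNS queries against the full list.
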

[^ssb]: A rough categorisation based on the Norwegian
[standard](https://www.ssb.no/klass/klassifikasjoner/39) provided by Statistics Norway (I'm sure it could
be improved).

Using the JSON file generated above, populate the following JSON
dictionary:

```
{
"government":{"codes": [6100,6500,1110,1120], "total":0, "counts":{}},
"municipals":{"codes": [1510,4900,1520], "total":0, "counts":{}},
"finance":{"codes": [3200,3500,3600,4300,3900,4100,4500,4900,5500,5700,4900,7000], "total":0, "counts":{}},
"private":{"codes": [4500,4900,2100,2300,2500], "total":0, "counts":{}}
}
```
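
One way to populate the dictionary, sketched in Python under a few assumptions: each entry carries a `sector` code and an `mx_domain` key (the latter is a hypothetical name for the primary domain of the resolved MX host), sector codes are compared as strings, and `total` counts only entries whose MX record resolved. The article does not state how `total` was defined, so treat that as a guess.

```python
from collections import Counter


def populate(groups: dict, entries: list[dict]) -> dict:
    """Fill "total" and "counts" for every sector group, in place."""
    for group in groups.values():
        # Registry codes are strings; the dictionary above lists integers
        codes = {str(code) for code in group["codes"]}
        counts = Counter()
        for entry in entries:
            if str(entry.get("sector")) in codes and entry.get("mx_domain"):
                counts[entry["mx_domain"]] += 1
        group["total"] = sum(counts.values())
        group["counts"] = dict(counts)
    return groups
```

Because some codes appear in more than one group in the dictionary above, an organisation with such a code is counted once in each group that lists it.
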

Generate CSV output based on each sector grouping above.

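
As an illustration of the CSV step, the sketch below turns one populated sector group into CSV text, ordered by the number of organisations per MX domain. The column names and the `sector_csv` function are my own choices, not taken from the original script.

```python
import csv
import io


def sector_csv(name: str, group: dict) -> str:
    """Render one populated sector group as CSV, most common MX domain first."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["sector", "mx_domain", "count", "share"])
    ranked = sorted(group["counts"].items(), key=lambda item: item[1], reverse=True)
    for domain, count in ranked:
        # Guard against empty groups to avoid dividing by zero
        share = count / group["total"] if group["total"] else 0.0
        writer.writerow([name, domain, count, f"{share:.3f}"])
    return buffer.getvalue()
```

Writing to a string buffer keeps the function easy to check; the result can be written to a file per sector afterwards.
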
## The Result
The top vendor was, not surprisingly, Microsoft's outlook.com. Of
the 120k sites, 98k resolved an MX record. I will summarise the
outlook.com numbers below, as it seems to be the dominant actor
in all categories:

* In government, 61% are O365 users (1420/2317)
* For municipals, the share is 55% (688/1247)
* For the diverse financial grouping, 21% use O365 (4836/23125)
* For the diverse private companies, 38% use O365 (14615/38129)

Of the 98k sites, Microsoft runs the email service for 21559
organisations. For comparison, Google MX domains account for
about 5500.

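
The percentages above can be re-derived from the raw counts with a few lines of Python. The snippet uses only the figures quoted in this section; the variable names are illustrative.

```python
# (O365 users, organisations in the group), as quoted in the list above
o365 = {
    "government": (1420, 2317),
    "municipals": (688, 1247),
    "finance": (4836, 23125),
    "private": (14615, 38129),
}

# Share of O365 users per group, rounded to whole percent
shares = {name: round(100 * hits / total) for name, (hits, total) in o365.items()}

# Sum of O365 users across the four groups
microsoft_total = sum(hits for hits, _ in o365.values())

print(shares)
print(microsoft_total)
```

The four group counts add up to the 21559 organisations quoted for Microsoft, which is roughly 22% of the approximately 98k sites that resolved an MX record.
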

While the above is a direct measurement of who delivers email
services, it also indicates that these organisations rely on the
same vendor for other bundled services, such as internal wikis
and direct messaging.

An overview of the top 10 vendors is shown below.

![](static/img/data/mx_domains.png)

## Sources of Errors
Even though I believe the statistics above are representative,
they have some possible sources of error:

1. The organisation isn't listed with a URL in the organisation
   registry, or it uses an email domain not associated with the
   primary domain of its web address
2. The organisation uses an SMTP proxy
3. The organisation has an inactive MX record

I found that there are more than 1 million listed organisations in
the brreg.no registry and 120k websites in the JSON data
provided. This means the dataset represents at most 12% of the
companies listed.

Also, email doesn't represent a diverse infrastructure, but I
believe it is an indicator of the current trends for other cloud
services as well, e.g. Azure, Google Compute Engine and so on.