postoffice.co.za downtime and DNS issues

So, last week, the South African Post Office website went down.

The media has covered it as follows:

The main cause was suspension due to non-payment of the domain name renewal fee for “postoffice.co.za”. This puts them in august company. Major organisations such as Microsoft and SAA, to name just two, have had downtime in the past due to unpaid domain name fees.

In this post, I look at the process involved in the suspension, and at the technical reasons why the domain took so long to come back up. I also examine the publicly available DNS records to show factors contributing to the unreliability of “postoffice.co.za”.

But first, it is worth briefly mentioning how the co.za domain registration system works to provide some background.

co.za domains are registered and managed in one of two ways – through a legacy email system and via a newer EPP system:

  • The EPP system needs formal accreditation, along with an indication, but not guarantee, of technical competence. It uses a documented API, and allows a registrar (and thus the registrant through the registrar) to do things like set domains to auto-renew. Billing is handled periodically, and through an accredited registrar.
  • The legacy email system is basically an email parser. It takes an email from almost anyone, decodes it, checks the listed nameservers for the records in the email, and registers the domain. These days, once payment is made and allocated to the registration, the domain is made active. Each year, upon invoice, the payment again needs to be made and allocated. It’s clunky, has manual processes, and is priced accordingly (over 200% more).

“postoffice.co.za” was registered on the 16th of January 1997, when only the legacy system existed, and has been maintained on it ever since. The full history is available at the legacy whois.

One can also see the various updates through the ages, along with accounting information, which is important for determining what happened next.

On the accounting information linked to above, the line

2018-02-01| R | 125.40|domainmaster@postoffice.co.za|2018-03-20| 2 | 2847135|SA Post Office

tells us that on 2018-02-01, an invoice was issued for the domain, and sent to the current billing address of the domain: “domainmaster@postoffice.co.za”.
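To make the line above easier to read, here is a minimal Python sketch that splits it into fields. The field names are my own guesses for illustration; only the invoice date and the billing address are confirmed by the record itself, and the legacy system's real column definitions may differ.

```python
from datetime import date
from decimal import Decimal

# The accounting line quoted above, copied verbatim.
LINE = ("2018-02-01| R | 125.40|domainmaster@postoffice.co.za|"
        "2018-03-20| 2 | 2847135|SA Post Office")

# Hypothetical column names: apart from the invoice date and the billing
# address, these are assumptions, not the registry's documented schema.
FIELDS = ["invoice_date", "type", "amount", "billing_email",
          "second_date", "count", "reference", "account_name"]


def parse_accounting_line(line):
    """Split a pipe-delimited accounting line into a dict of typed fields."""
    values = [part.strip() for part in line.split("|")]
    if len(values) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(values)}")
    record = dict(zip(FIELDS, values))
    record["invoice_date"] = date.fromisoformat(record["invoice_date"])
    record["second_date"] = date.fromisoformat(record["second_date"])
    record["amount"] = Decimal(record["amount"])
    return record


record = parse_accounting_line(LINE)
print(record["invoice_date"], record["billing_email"], record["amount"])
```

The second date in the line (2018-03-20) happens to match the day the payment was processed, but that reading is an inference rather than something the record states.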

When I check the email logs for the legacy billing email system, I can see that the invoice was successfully delivered to Mimecast’s servers, which handle email for “postoffice.co.za”.

As per the accounts schedule, a month later on the 1st of March, an additional statement was sent, detailing the outstanding amount, and successfully delivered according to the logs I examined.

Then, on the 12th of March, a final warning was emailed to all the contacts in the whois records, along with the SOA email address (which in this case is domainmaster@postoffice.co.za). As per the public whois records, the recipients were “david.maseko@postoffice.co.za”, “domainmaster@postoffice.co.za” and “sphiwe.nkosi@postoffice.co.za”. A check of the email logs indicates that these were all delivered successfully.

It does appear from his LinkedIn profile that David Maseko is no longer with the Post Office. My Google-fu on Sphiwe Nkosi reveals nothing.

A week later, on the 19th of March, the domain is suspended due to non-payment, by removing it from the co.za zone. As caching nameservers expire the record according to the TTLs, the website becomes unreachable, and email starts bouncing.

A payment is processed the next day on the 20th, and the domain is returned to the co.za zone.
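To put the schedule in perspective, the sketch below counts the days between each step and the original invoice. It assumes that all of the March dates mentioned above fall in 2018, the year shown on the invoice line.

```python
from datetime import date

# Dates taken from the narrative above; the year 2018 is assumed for all.
EVENTS = [
    ("invoice issued", date(2018, 2, 1)),
    ("statement sent", date(2018, 3, 1)),
    ("final warning sent", date(2018, 3, 12)),
    ("domain suspended", date(2018, 3, 19)),
    ("payment processed", date(2018, 3, 20)),
]


def days_since_first(events):
    """Return (label, days elapsed since the first event) for each event."""
    start = events[0][1]
    return [(label, (when - start).days) for label, when in events]


timeline = days_since_first(EVENTS)
for label, days in timeline:
    print(f"day {days:2d}: {label}")
```

On these dates, the suspension came 46 days after the invoice, and a week after the final warning.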

But all is not well. There are several errors with the “postoffice.co.za” zone setup, and it’s quite remarkable that anything works at all. This almost certainly contributed to the extended downtime reported.

It helps to understand the nature of the DNS. It’s a hierarchical distributed database, consisting of two types of nameservers – caching/resolving nameservers that go and answer the questions, and authoritative nameservers that reply with the answers. The resolving nameservers will go and ask the nameserver tree, starting at the root. They then walk down the delegated hierarchy, remembering the answers they get for a pre-determined time. If the answers provided by the authoritative servers are inconsistent, incorrect or incongruent, the resolving nameservers will take longer, or not be able to do their job, or time out, depending on the error. The authoritative nameservers typically consist of a primary nameserver, which is replicated to secondary nameservers.
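The “remembering the answers for a pre-determined time” part is what makes a suspension (and a re-instatement) roll out gradually rather than instantly. Here is a toy Python model of that behaviour. The `TtlCache` class is purely illustrative and is not how any real resolver is implemented; time is passed in explicitly as seconds so the example is deterministic.

```python
class TtlCache:
    """Toy model of a resolver cache: answers are kept until their TTL expires."""

    def __init__(self):
        self._entries = {}  # (name, rtype) -> (expires_at, answer)

    def put(self, name, rtype, answer, ttl, now):
        """Store an answer that stays valid for `ttl` seconds from `now`."""
        self._entries[(name, rtype)] = (now + ttl, answer)

    def get(self, name, rtype, now):
        """Return the cached answer, or None if absent or expired."""
        entry = self._entries.get((name, rtype))
        if entry is None:
            return None
        expires_at, answer = entry
        if now >= expires_at:
            del self._entries[(name, rtype)]
            return None
        return answer


cache = TtlCache()
# A delegation with a one-day TTL (86400 seconds), cached at time 0.
cache.put("postoffice.co.za.", "NS", ["waterbok.postoffice.co.za."],
          ttl=86400, now=0)
still_cached = cache.get("postoffice.co.za.", "NS", now=86399)
after_expiry = cache.get("postoffice.co.za.", "NS", now=86400)
print(still_cached, after_expiry)
```

A resolver that cached the delegation just before the domain was removed could keep answering for up to a day, while one whose entry had just expired would fail immediately.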

With this in mind, let’s take a technical look at the “postoffice.co.za” DNS setup.

First, let’s look at the registration again – the name servers are listed as:

6a. primnsfqdn : waterbok.postoffice.co.za
6b. primnsip : 165.8.13.171
6c. primnsipv6 :
6e. secns1fqdn : demeter.is.co.za
6f. secns1ip :
6g. secns1ipv6 :
6i. secns2fqdn : titan.is.co.za
6j. secns2ip :
6k. secns2ipv6 :
6m. secns3fqdn : sangoma.saix.net
6n. secns3ip :
6o. secns3ipv6 :
6q. secns4fqdn : sabela.saix.net

A dig on the co.za nameservers shows the records match:

calvin@calvin-office:~$ dig ns postoffice.co.za @ns.coza.net.za.

<SNIP>;; AUTHORITY SECTION:
postoffice.co.za. 86400 IN NS sabela.saix.net.
postoffice.co.za. 86400 IN NS titan.is.co.za.
postoffice.co.za. 86400 IN NS waterbok.postoffice.co.za.
postoffice.co.za. 86400 IN NS sangoma.saix.net.
postoffice.co.za. 86400 IN NS demeter.is.co.za.

But things start going pear-shaped when we query the primary nameserver:

calvin@calvin-office:~$ dig ns postoffice.co.za @waterbok.postoffice.co.za.

; <<>> DiG 9.10.3-P4-Ubuntu <<>> ns postoffice.co.za @waterbok.postoffice.co.za.
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 58115
;; flags: qr aa rd ra; QUERY: 1, ANSWER: 7, AUTHORITY: 0, ADDITIONAL: 6

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4000
;; QUESTION SECTION:
;postoffice.co.za. IN NS

;; ANSWER SECTION:
postoffice.co.za. 600 IN NS sabela.saix.net.
postoffice.co.za. 600 IN NS demeter.is.co.za.
postoffice.co.za. 600 IN NS waterbok.postoffice.co.za.
postoffice.co.za. 600 IN NS sangoma.saix.net.
postoffice.co.za. 600 IN NS gemsbok.postoffice.co.za.
postoffice.co.za. 600 IN NS waterbok.
postoffice.co.za. 600 IN NS titan.is.co.za.

There are seven nameservers, instead of the five listed in the registration. The bare “waterbok.” entry is clearly not valid. Any resolving nameserver that selects it should fail or time out, and hopefully try one of the others.
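The mismatch is easy to spot mechanically. The following Python sketch compares the NS set delegated in the co.za zone with the set the primary returned, using the names copied from the two dig outputs above. The function name and the result keys are my own, for illustration only.

```python
# NS names delegated in the co.za zone (from the first dig output).
DELEGATED = {
    "waterbok.postoffice.co.za.", "demeter.is.co.za.", "titan.is.co.za.",
    "sangoma.saix.net.", "sabela.saix.net.",
}

# NS names returned by waterbok.postoffice.co.za (from the second dig output).
SERVED_BY_PRIMARY = {
    "sabela.saix.net.", "demeter.is.co.za.", "waterbok.postoffice.co.za.",
    "sangoma.saix.net.", "gemsbok.postoffice.co.za.", "waterbok.",
    "titan.is.co.za.",
}


def compare_ns_sets(parent, child):
    """Report differences between the parent's and the child's NS sets."""
    return {
        "only_in_child": sorted(child - parent),
        "only_in_parent": sorted(parent - child),
        # A name with no dot before the trailing root dot is a single label.
        "single_label": sorted(n for n in child if "." not in n.rstrip(".")),
    }


report = compare_ns_sets(DELEGATED, SERVED_BY_PRIMARY)
print(report)
```

The two extra names account for the jump from five to seven, and the single-label check flags “waterbok.” as the entry that cannot be a usable hostname in this context.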

The nameserver is also a recursive nameserver (in the flags section, the “ra” bit is set, indicating recursion is available). This means the “postoffice.co.za” domain is susceptible to DNS cache poisoning and is vulnerable to being hacked to give out incorrect entries.
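For anyone who wants to check this across many servers, a small Python sketch that pulls the flags out of a dig header line is shown below. It only parses text already pasted in this post; the helper name `parse_flags` is made up for the example.

```python
import re

# Header lines copied from the dig output above and from a server that
# does not offer recursion.
RECURSIVE_HEADER = ";; flags: qr aa rd ra; QUERY: 1, ANSWER: 7, AUTHORITY: 0, ADDITIONAL: 6"
NON_RECURSIVE_HEADER = ";; flags: qr rd; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1"


def parse_flags(header_line):
    """Return the set of flag mnemonics from a dig ';; flags:' header line."""
    match = re.search(r"flags:\s*([a-z ]*);", header_line)
    if match is None:
        raise ValueError("no flags field found")
    return set(match.group(1).split())


flags = parse_flags(RECURSIVE_HEADER)
offers_recursion = "ra" in flags   # recursion available
is_authoritative = "aa" in flags   # authoritative answer
print(sorted(flags), offers_recursion, is_authoritative)
```

A server that is both authoritative (“aa”) and offers recursion (“ra”) to arbitrary clients is the combination being flagged here.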

“gemsbok.postoffice.co.za” is not listed in the co.za zone. When we query it:

calvin@calvin-office:~$ dig ns postoffice.co.za @gemsbok.postoffice.co.za.

; <<>> DiG 9.10.3-P4-Ubuntu <<>> ns postoffice.co.za @gemsbok.postoffice.co.za.
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 37205
;; flags: qr aa rd ra; QUERY: 1, ANSWER: 6, AUTHORITY: 0, ADDITIONAL: 6

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4000
;; QUESTION SECTION:
;postoffice.co.za. IN NS

;; ANSWER SECTION:
postoffice.co.za. 86400 IN NS demeter.is.co.za.
postoffice.co.za. 86400 IN NS waterbok.postoffice.co.za.
postoffice.co.za. 86400 IN NS gemsbok.
postoffice.co.za. 86400 IN NS sangoma.saix.net.
postoffice.co.za. 86400 IN NS titan.is.co.za.
postoffice.co.za. 86400 IN NS sabela.saix.net.

Here I get multiple oddities. Firstly, the nameservers differ from those listed on the primary. “waterbok” has been replaced by “gemsbok”, and “gemsbok.postoffice.co.za” is gone. Again, resolving nameservers that get this record will time out on “gemsbok” and hopefully go on to another. Additionally, the TTLs (Time To Live values) are different – on “gemsbok.postoffice.co.za” they are set to expire in one day (86400 seconds), and on “waterbok.postoffice.co.za” in ten minutes (600 seconds).

These differences indicate that “gemsbok.postoffice.co.za” is not a secondary of “waterbok.postoffice.co.za”, and indeed, a comparison of SOA serial numbers further reinforces this:

calvin@calvin-office:~$ dig soa postoffice.co.za @waterbok.postoffice.co.za +short
waterbok.postoffice.co.za. domainmaster.postoffice.co.za. 2013062691 21600 3600 604800 600
calvin@calvin-office:~$ dig soa postoffice.co.za @gemsbok.postoffice.co.za +short
waterbok.postoffice.co.za. domainmaster.postoffice.co.za. 2013062543 21600 3600 604800 86400

See the serial numbers 2013062691 vs 2013062543 (and the serial numbers seem to indicate these are records from 2013?). Again, the nameserver says it is recursive (the “ra” bit in the flags section), raising that pesky cache poisoning issue again.
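The two SOA records can be compared field by field with a few lines of Python. This sketch assumes the serials follow the common YYYYMMDDnn convention, which is a guess based on how they look; nothing in the zone itself confirms it.

```python
from typing import NamedTuple


class Soa(NamedTuple):
    mname: str    # primary nameserver
    rname: str    # responsible mailbox, with the @ written as a dot
    serial: int
    refresh: int
    retry: int
    expire: int
    minimum: int  # negative-caching TTL


def parse_soa(text):
    """Parse the output of `dig soa ... +short` into an Soa tuple."""
    mname, rname, *numbers = text.split()
    if len(numbers) != 5:
        raise ValueError("expected five numeric SOA fields")
    return Soa(mname, rname, *map(int, numbers))


def split_dated_serial(serial):
    """Split a serial into (year, month, day, revision), assuming YYYYMMDDnn."""
    digits = f"{serial:010d}"
    return digits[:4], digits[4:6], digits[6:8], digits[8:]


waterbok = parse_soa("waterbok.postoffice.co.za. domainmaster.postoffice.co.za. "
                     "2013062691 21600 3600 604800 600")
gemsbok = parse_soa("waterbok.postoffice.co.za. domainmaster.postoffice.co.za. "
                    "2013062543 21600 3600 604800 86400")

in_sync = waterbok.serial == gemsbok.serial
print(in_sync, split_dated_serial(waterbok.serial), split_dated_serial(gemsbok.serial))
```

Under that convention, the two servers hold zone versions dated one day apart in June 2013, and the last SOA field differs in the same way as the NS TTLs (600 vs 86400).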

Now let’s move on to the other nameservers – first the IS ones:

calvin@calvin-office:~$ dig soa postoffice.co.za @demeter.is.co.za

; <<>> DiG 9.10.3-P4-Ubuntu <<>> soa postoffice.co.za @demeter.is.co.za
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 56176
;; flags: qr rd; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1
;; WARNING: recursion requested but not available

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;postoffice.co.za. IN SOA

;; Query time: 18 msec
;; SERVER: 196.26.5.8#53(196.26.5.8)
;; WHEN: Mon Mar 26 13:23:27 SAST 2018
;; MSG SIZE rcvd: 45

calvin@calvin-office:~$ dig soa postoffice.co.za @titan.is.co.za

; <<>> DiG 9.10.3-P4-Ubuntu <<>> soa postoffice.co.za @titan.is.co.za
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 23702
;; flags: qr rd; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1
;; WARNING: recursion requested but not available

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;postoffice.co.za. IN SOA

;; Query time: 178 msec
;; SERVER: 196.33.171.36#53(196.33.171.36)
;; WHEN: Mon Mar 26 13:23:29 SAST 2018
;; MSG SIZE rcvd: 45

Both of these respond with “SERVFAIL”, which basically means they know nothing about “postoffice.co.za”. Any recursive nameserver should move on to another listed nameserver when encountering this reply. This should not be fatal, but will delay things. Those nameservers should either be removed from the zone (assuming no relationship between IS and the Post Office), or be made secondaries for “postoffice.co.za” (assuming a relationship between IS and the Post Office).
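A delegated nameserver that does not answer authoritatively for the zone is often called “lame”. The Python sketch below classifies the five registered nameservers using the status and “aa” flag seen in the dig outputs in this post. The table is transcribed by hand, and the `is_lame` rule is a simplification of my own rather than a formal definition.

```python
# (status, authoritative-answer flag) per registered nameserver, transcribed
# from the dig headers shown in this post.
OBSERVED = {
    "waterbok.postoffice.co.za.": ("NOERROR", True),
    "demeter.is.co.za.": ("SERVFAIL", False),
    "titan.is.co.za.": ("SERVFAIL", False),
    "sangoma.saix.net.": ("NOERROR", True),
    "sabela.saix.net.": ("NOERROR", True),
}


def is_lame(status, authoritative):
    """Treat a server as lame if it fails or does not answer authoritatively."""
    return status != "NOERROR" or not authoritative


lame = sorted(name for name, (status, aa) in OBSERVED.items()
              if is_lame(status, aa))
print(f"{len(lame)} of {len(OBSERVED)} registered nameservers are lame: {lame}")
```

A resolver that happens to pick one of the lame servers first has to retry against another, which is where the extra delay comes from.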

Both of SAIX’s listed nameservers appear to offer relatively sane replies (ignoring the invalid “waterbok” entry and the unregistered “gemsbok.postoffice.co.za” entry):

calvin@calvin-office:~$ dig postoffice.co.za @sangoma.saix.net ns

; <<>> DiG 9.10.3-P4-Ubuntu <<>> postoffice.co.za @sangoma.saix.net ns
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 49603
;; flags: qr aa rd; QUERY: 1, ANSWER: 7, AUTHORITY: 0, ADDITIONAL: 1
;; WARNING: recursion requested but not available

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;postoffice.co.za. IN NS

;; ANSWER SECTION:
postoffice.co.za. 600 IN NS sangoma.saix.net.
postoffice.co.za. 600 IN NS waterbok.postoffice.co.za.
postoffice.co.za. 600 IN NS sabela.saix.net.
postoffice.co.za. 600 IN NS gemsbok.postoffice.co.za.
postoffice.co.za. 600 IN NS titan.is.co.za.
postoffice.co.za. 600 IN NS waterbok.
postoffice.co.za. 600 IN NS demeter.is.co.za.

;; Query time: 318 msec
;; SERVER: 2c0e:2001:4000:1::c419:109#53(2c0e:2001:4000:1::c419:109)
;; WHEN: Mon Mar 26 13:30:15 SAST 2018
;; MSG SIZE rcvd: 208

calvin@calvin-office:~$ dig postoffice.co.za @sabela.saix.net ns

; <<>> DiG 9.10.3-P4-Ubuntu <<>> postoffice.co.za @sabela.saix.net ns
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 7368
;; flags: qr aa rd; QUERY: 1, ANSWER: 7, AUTHORITY: 0, ADDITIONAL: 1
;; WARNING: recursion requested but not available

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;postoffice.co.za. IN NS

;; ANSWER SECTION:
postoffice.co.za. 600 IN NS waterbok.postoffice.co.za.
postoffice.co.za. 600 IN NS waterbok.
postoffice.co.za. 600 IN NS sabela.saix.net.
postoffice.co.za. 600 IN NS sangoma.saix.net.
postoffice.co.za. 600 IN NS gemsbok.postoffice.co.za.
postoffice.co.za. 600 IN NS titan.is.co.za.
postoffice.co.za. 600 IN NS demeter.is.co.za.

;; Query time: 345 msec
;; SERVER: 2c0e:2001:0:1::c42b:109#53(2c0e:2001:0:1::c42b:109)
;; WHEN: Mon Mar 26 13:30:21 SAST 2018
;; MSG SIZE rcvd: 208

and they appear to be serving the zone available from “waterbok.postoffice.co.za.”

So – basically, you have a whole bunch of misconfigurations in the “postoffice.co.za” domain, with errors across all of the listed nameservers. Two of five registered nameservers do not even know about the domain. All of the ones that do answer contain weird nonsensical nameservers. And two of the nameservers cannot be trusted to serve the correct information.

It is no wonder things took so long to come back up once the domain was re-instated.

These errors appear to be replicated on “postbank.co.za” and other Post Office domains. Additionally, “speedservices.co.za” is currently up for renewal as per “postoffice.co.za”.
