Sunday, 16 May 2010

May 12: DNS blackout in Germany

On May 12, 2010, the Internet in Germany experienced serious problems. According to German media (Focus, Spiegel, Chip, Frankfurter Allgemeine, etc.), most web pages with addresses ending in .DE were unavailable between 11:30 UTC and 15:45 UTC. During the "blackout", DNS queries returned NXDOMAIN (the Name Error RCODE), which suggests that something went wrong while the zone file was being uploaded to the DNS servers. This was briefly confirmed by Peter Koch from DENIC eG. It is quite likely that defective or incomplete zone files were uploaded to the DNS servers.


Usually, domain registration and maintenance systems are kept separate from the Domain Name System (DNS resolves queries sent by Internet users, translating domain names into IP addresses). Normally there is no "direct" connection between the registration system and the live DNS. The two "worlds" meet when the registry system exports* the zone file(s) and the zone file is uploaded to the primary DNS server (and then distributed to the secondary name servers).

The German incident is similar to the cases in Spain (in 2006 an empty zone file was uploaded) and Sweden (in 2009 an incorrect DNSSEC-signed zone file was uploaded). Unfortunately, in both of those cases this critical part of the process was not protected by automatic checks verifying the newly generated zone file.


To avoid such incidents, automatic verification mechanisms can be used to compare the newly generated zone file with the previous one (the last known-good zone file). If the number of changes is much higher than usual, an error has probably occurred and the upload should be stopped. It is common practice in many industries to check that changes in databases, physical stock, resources, energy consumption, etc. do not deviate from what statistical estimates predict. The idea behind such statistical checks is to catch major errors automatically. Of course, such (statistical) methods do not catch minor problems, which require more sophisticated controls.
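The check described above could be sketched roughly as follows. This is only an illustration and not the method of any particular registry. It assumes a simplified one-record-per-line zone file format, and the function names and the 5% threshold are hypothetical choices made for the example.

```python
def delegated_names(zone_text):
    """Collect owner names of NS records from simplified zone file text.

    Assumed (simplified) line layout: <owner> [ttl] [class] NS <target>
    Real zone files allow more syntax than this sketch handles.
    """
    names = set()
    for line in zone_text.splitlines():
        line = line.split(";", 1)[0].strip()  # drop comments
        if not line:
            continue
        fields = line.split()
        if "NS" in fields[1:-1]:
            names.add(fields[0].lower())
    return names


def zone_change_ratio(old_names, new_names):
    """Share of names added or removed, relative to the old zone size."""
    if not old_names:
        return 1.0 if new_names else 0.0
    changed = len(old_names ^ new_names)  # symmetric difference
    return changed / len(old_names)


def is_safe_to_publish(old_text, new_text, max_ratio=0.05):
    """Return False when the new zone differs too much from the last good one.

    max_ratio is a hypothetical threshold; in practice it would be derived
    from the usual volume of changes between two exports.
    """
    old = delegated_names(old_text)
    new = delegated_names(new_text)
    return zone_change_ratio(old, new) <= max_ratio
```

With a check like this, an empty or truncated zone file is rejected because almost every name counts as removed, while an ordinary update with a handful of new or deleted domains passes.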

*There is also another way of updating zone data, called "dynamic updates", which allows small sets of domains to be updated more frequently than full zone file reloads. The German registry does not use this method to update DNS.

