
DNS Performance and the Effectiveness of Caching
by Jaeyeon Jung, Emil Sit, Hari Balakrishnan, Robert Morris
Public comments
#1 posted on Mar 28 2008, 17:24 in collection CMU 15-744: Computer Networks -- Spring 08
The paper quantifies the performance of DNS with relatively recent datasets,
which reflect changes in usage patterns since the system's initial deployment.
The DNS query logs are matched against TCP connection logs to see how DNS
traffic interacts with TCP flows.

The paper makes many interesting observations.
The first half studies failures and latency from the DNS client's
perspective. It answers questions such as "what types of queries are popular?",
"how many failures do users see?", "how many DNS requests go unanswered?",
"what does the latency distribution look like?", and "how many round trips
does a client have to make to get the final answer?"

The paper shows that most queries are of type "A" (~60%) and
"PTR" (~30%). It is surprising to me that so many queries are of
type "PTR" (reverse lookups), and that the percentages are relatively
stable across the different datasets.

The latency distribution of DNS responses shows that the number of referrals
is a crucial factor: as the number of referrals goes up, latency increases.
This suggests that NS caching is very effective at reducing latency, which
is corroborated by the fact that queries that go all the way to a root
server have higher latency than others.
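To make the referral effect concrete, here is a crude latency model of my own (the 80 ms per-hop round-trip time is an assumed number, not from the paper): each referral costs roughly one extra round trip to a new server, so a cached NS record that lets the resolver skip the root and gTLD hops saves two round trips.

```python
RTT = 0.080  # assumed round-trip time per server contacted, in seconds

def lookup_latency(n_referrals: int) -> float:
    """One query to the first server, plus one more per referral."""
    return (n_referrals + 1) * RTT

# Cold cache: root -> gTLD -> authoritative server (2 referrals).
cold = lookup_latency(2)
# Warm NS cache: query the authoritative server directly (0 referrals).
warm = lookup_latency(0)
print(cold - warm)  # the two round trips saved by the cached NS record
```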

There are some pathological cases, especially when a query receives no answer.
Not only is the percentage of unanswered queries high (~20%), but clients
retransmit so many times that the query packets generated in these cases
amount to around 60% of the total.
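As a sanity check on those two percentages, a little arithmetic (my own back-of-the-envelope model, assuming each answered lookup averages a single query packet) shows how many packets each unanswered lookup would have to generate:

```python
# If 20% of lookups go unanswered and each answered lookup averages one
# query packet, how many packets per unanswered lookup would make them
# 60% of all query packets? (Assumed round numbers, not the paper's.)
unanswered_share = 0.20
answered_share = 1.0 - unanswered_share

def packet_fraction(pkts_per_unanswered: float) -> float:
    """Fraction of all query packets caused by unanswered lookups."""
    unanswered_pkts = unanswered_share * pkts_per_unanswered
    total_pkts = answered_share * 1.0 + unanswered_pkts
    return unanswered_pkts / total_pkts

# Solving packet_fraction(r) = 0.6 gives r = 6: about six packets,
# i.e. five retransmissions, per unanswered lookup.
print(packet_fraction(6.0))  # 0.6
```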

The second half studies the effectiveness of caching under a real workload,
which I think is far more interesting than the first half.
The impact of the relatively recent use of DNS for load balancing is well
studied in this section. The heavy-tailed distribution of DNS lookups and
its effect on caching is one of the most important contributions of this paper.
To summarize the results:
- DNS lookups follow a Zipf distribution, which limits the effectiveness of
caching, including caches shared by any number of users grouped together.
- The interarrival time of TCP connections, which is closely tied to the user
activity that triggers DNS lookups, fits a Pareto distribution with infinite
mean. Thus, increasing the TTL of A records does not help much in terms of
performance.
- Conversely, the small TTL values used intentionally for purposes such as
load balancing do not significantly harm DNS performance.
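A small simulation of my own (the Zipf exponent, name count, and TTL below are assumptions for illustration, not the paper's trace parameters) shows why Zipf-distributed popularity caps the hit rate of an A-record cache even with a generous TTL:

```python
import random
from itertools import accumulate

random.seed(0)

def simulate_hit_rate(n_names: int, n_lookups: int, ttl: float,
                      mean_gap: float = 1.0) -> float:
    """Toy trace: draw Zipf-popular names at random times and count how
    often a still-fresh cached A record answers the lookup."""
    # Zipf (alpha = 1) popularity; cumulative weights make draws O(log n).
    cum = list(accumulate(1.0 / rank for rank in range(1, n_names + 1)))
    expires = {}              # name -> expiry time of its cached A record
    t, hits = 0.0, 0
    for _ in range(n_lookups):
        t += random.expovariate(1.0 / mean_gap)   # simplistic arrivals
        name = random.choices(range(n_names), cum_weights=cum)[0]
        if expires.get(name, -1.0) >= t:
            hits += 1
        expires[name] = t + ttl
    return hits / n_lookups

# The long tail of one-off names keeps the hit rate well below 100%,
# even with a TTL far longer than the typical gap between lookups.
print(simulate_hit_rate(n_names=10_000, n_lookups=50_000, ttl=1_000.0))
```

Raising the TTL further mostly extends the lifetime of entries in the tail that are never looked up again, which is the intuition behind the paper's TTL result.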


Some thoughts:
One surprising thing is that the percentage of unanswered DNS lookups is very high (more than 20%) in all three traces. The paper breaks down the types of unanswered lookups, but does not explain the cause.
Is it that some hosts are misconfigured? But lookups with a non-zero number of referrals were the most common case, so it is not clear what the reason might be.
#2 posted on Mar 29 2008, 18:59 in collection CMU 15-744: Computer Networks -- Spring 08
Based on three separate traces of TCP traffic, together with the corresponding DNS traffic, collected during 2000-2001, the paper analyzes the performance and scalability of DNS, focusing on:
1) Performance perceived by DNS clients, in terms of latency, failures (i.e., retransmissions, negative responses, and negative caching), and interactions with root servers
2) Effectiveness of caching, in terms of the impact of TTL values and the degree of cache sharing

Summary of Results and Conclusions
- Latency is adversely affected by the number of referrals; hence, caching NS records can substantially reduce DNS lookup latency.
- DNS name servers appear to retransmit too aggressively; it would be better for them to stop retransmitting after two or three tries.
- A large portion (over one-third) of lookups goes unanswered, and these unanswered lookups cause a large amount of DNS traffic.
- NS-record caching can eliminate many queries to the root and generic top-level domain (gTLD) servers, so good NS-record caching is critical to DNS scalability.
- Caching A records with small TTLs (as low as a few hundred seconds) has little adverse effect on DNS performance.
- Little benefit is obtained from sharing a forwarding DNS cache among more than 10 or 20 clients.
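That last point can be illustrated with a toy model of my own (the Zipf exponent, TTL, and per-client lookup rate are assumed, not taken from the paper): because name popularity is heavy-tailed, a handful of clients already keeps the popular names warm in a shared cache, and additional clients mostly contribute tail names that no one else looks up.

```python
import random
from itertools import accumulate

random.seed(1)

N_NAMES = 10_000
CUM = list(accumulate(1.0 / r for r in range(1, N_NAMES + 1)))  # Zipf, alpha=1
TTL = 60.0          # assumed cache lifetime, in arbitrary time units
CLIENT_RATE = 1.0   # assumed lookups per client per time unit

def shared_hit_rate(n_clients: int, n_lookups: int = 30_000) -> float:
    """Hit rate of one cache shared by n_clients, with lookups arriving
    as a Poisson process whose rate grows with the client count."""
    expires = {}      # name -> expiry time of its cached record
    t, hits = 0.0, 0
    for _ in range(n_lookups):
        t += random.expovariate(n_clients * CLIENT_RATE)
        name = random.choices(range(N_NAMES), cum_weights=CUM)[0]
        if expires.get(name, -1.0) >= t:
            hits += 1
        expires[name] = t + TTL
    return hits / n_lookups

# The hit rate climbs steeply for the first few clients and with
# diminishing returns thereafter, as extra lookups fall in the tail.
for n in (1, 10, 20, 100):
    random.seed(1)
    print(n, shared_hit_rate(n))
```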

Questions:
- According to the results and conclusions from the paper, how should we improve the performance of DNS and DNS caching?
- How could heavy-tail nature of domain name popularity render caching less effective?
#3 posted on Mar 30 2008, 09:27 in collection CMU 15-744: Computer Networks -- Spring 08
One surprising conclusion of this paper is that the reasons behind DNS performance are not the ones we would expect. Caching A records, for instance, is not really critical, as long as there are cached NS records pointing to the right DNS servers (thus decreasing lookup latency and avoiding excessive load on the root servers).
Only a small set of names is popular enough that the probability of finding its A record in the cache is significant, and those names probably have small TTL values anyway. This will most likely become more pronounced with the popularity of sites like Google and Digg, which means that cached A records will probably never significantly improve DNS performance.
#4 posted on Mar 30 2008, 15:24 in collection CMU 15-744: Computer Networks -- Spring 08
This paper starts by thoroughly analyzing several sets of collected DNS traffic traces and then uses simulation to evaluate the effect of caching and of different TTL values on DNS performance. The authors study how DNS performance is affected by latency, the number of retransmissions, negative responses, and caching, and provide explanations for the observed results.

The number of loops due to misconfigured servers was surprisingly large. I was also surprised to see that most of the queries (63%) in the collected traces were generated by lookups that obtained no answer. The authors attribute this behavior in large part to the persistent retry strategies of DNS name servers. Overall it seems that small TTL values do not have a negative impact on DNS performance or scalability and that caching in small groups (e.g. 10-20 clients) can slightly boost the hit rate of DNS queries.

In subsection E of section IV the authors explain that the load on the root servers could be reduced if the rest of the DNS servers refused to forward queries for unqualified host names (e.g. loopback). I wonder if today’s DNS servers follow this strategy.
#5 posted on Mar 30 2008, 16:27 in collection CMU 15-744: Computer Networks -- Spring 08
The paper pointed out something that seems rather obvious after reflection: it is unlikely that adding more users to a shared cache will improve the hit rate since most individuals are accessing items of their own interest.

I'm curious if there's any benefit to prefetching DNS entries ahead of time in order to improve the cache hit rate.
#6 posted on Mar 30 2008, 16:34 in collection CMU 15-744: Computer Networks -- Spring 08
It is surprising that a considerable fraction of DNS queries goes unanswered, generating packets that cross the wide-area network, and also that raising the TTLs of A records is not as useful as one would expect. Overall, it was a very enjoyable read.
#7 posted on Mar 30 2008, 16:39 in collection CMU 15-744: Computer Networks -- Spring 08
It makes sense that there is a long-tailed, roughly power-law distribution of the hostnames inquired about. There are a few names, e.g., "www.google.com", that the vast majority of people need to resolve every day, and then there is a vast number of names that only a few users need to resolve. So it is very reasonable to cache only the relatively few heavily used hostnames.
#8 posted on Mar 30 2008, 16:57 in collection CMU 15-744: Computer Networks -- Spring 08
The paper presents a set of performance studies of DNS over the Internet around 2001. A noteworthy result is that aggressive retransmission for unanswered lookups (23%) accounted for more than 50% of DNS packets in the wide area. Also, worst-case DNS lookup latency has grown longer, partly due to the exponential growth of the Internet. The power-law distribution of name popularity is not surprising. The paper also suggests that lowering TTLs is not very harmful.
#9 posted on Mar 30 2008, 23:45 in collection CMU 15-744: Computer Networks -- Spring 08
I wish the authors had explained in the paper the reason for the large fraction of DNS lookups receiving no answer. Is it because the DNS servers are too busy to answer the queries?

In the paper, the authors point out that most of the negative responses are due to mistyped names, bare host names, or other mistakes. Local networks could do something to stop queries such as "index.htm" or "loopback" from leaving the network.