[Yeti DNS Discuss] Thoughts on Mirroring Yeti Traffic

Shane Kerr shane at biigroup.cn
Fri May 6 20:19:35 UTC 2016


All,

A long-standing problem with Yeti is getting enough traffic. One
approach to solving this is to mirror other production traffic.

The idea here is to send queries that have been sent by real users to
real resolvers, but to send a COPY of these queries to another resolver,
which uses the Yeti roots to perform the resolution. This was pioneered
by the folks at the MSK-IX, and recently Daniel Stirnimann has given us
a 1-line config option to dnsdist to perform this as well.

This helps us a little, because it gives us extra real-world traffic
which exercises the Yeti platform. However, it has limitations, because
we don't know whether the answers that Yeti gives are actually the ones
that the users want.

I suggest that there are four levels of traffic that we can send to
Yeti:

1. "Real" traffic, meaning traffic from real users and systems using
   resolvers configured with the Yeti roots.

2. Artificial traffic, for example from measurements or load tests or
   the like.

3. Mirrored traffic, as described above. We take real traffic and send
   a copy of it to a resolver configured with the Yeti roots.

We already have all three of those. Perhaps we can also add:

4. Mirrored & checked traffic. This would be mirrored traffic, but we
   would also compare the results with the IANA root server results.

To properly do mirrored & checked traffic, we need to know what the
IANA answers are for any given query. Robert Edmonds suggested that we
can use dnstap to capture this data. I don't know too much about it,
but it seems like a good solution to me.

My understanding is that dnstap is logging at the application level,
a bit like "query logging done right". So rather than having to try to
match answers and queries in pcap files or even worse reassemble TCP
streams, the DNS server itself simply logs what messages it sees.

If we don't want to use dnstap, we could instead mirror only the
queries that go to the IANA root servers. The program performing the
mirroring could send its own query to IANA and get the results that
way. The advantage of this approach is that it generates less traffic,
and that we can do things like send the query to *all* of the Yeti root
servers and confirm that their answers are correct. The disadvantage is
that the mirroring program would not do the caching that a resolver
does, which might result in slightly different TTL behaviors.
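
As a rough sketch of what the comparison step might look like, the
following ignores TTLs (since they may legitimately differ) and skips
RRSIG records, on the assumption that signatures will never match
because Yeti signs the zone with its own keys. The Record type and the
function names are hypothetical, and the records are assumed to have
already been parsed out of the responses by some other code.

```python
from typing import Iterable, NamedTuple


class Record(NamedTuple):
    """One resource record, already parsed from a DNS response (hypothetical type)."""
    name: str
    rtype: str
    ttl: int
    rdata: str


def normalize(records: Iterable[Record], ignore_types=frozenset({"RRSIG"})):
    """Reduce records to a set of (name, type, rdata) tuples.

    TTLs are dropped, owner names are lower-cased and given a trailing
    dot, and record types listed in ignore_types are skipped.
    """
    return {
        (r.name.lower().rstrip(".") + ".", r.rtype.upper(), r.rdata)
        for r in records
        if r.rtype.upper() not in ignore_types
    }


def diff_answers(iana: Iterable[Record], yeti: Iterable[Record]):
    """Return the records that appear on only one side of the comparison."""
    a, b = normalize(iana), normalize(yeti)
    return {"only_iana": a - b, "only_yeti": b - a}
```

Two empty sets mean the answers agree under these rules; anything else
is a difference worth looking at.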

Another issue is that we will definitely get differences between
servers as the serial of the root zone changes. The IANA servers
typically have about 20 minutes between when the first IANA server gets
a new zone and when the last one does, so even if all of the Yeti
servers changed at exactly the same time we would see slight
differences. (The Yeti servers do not all change at exactly the same
time!) I think we'd need to look at the differences that actually
occur before deciding whether to account for this in the comparison.
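
One possible way to keep serial-related noise apart from real
mismatches is to record the SOA serial seen on each side and tag every
comparison with it. This is only a sketch: the function name and the
three labels are made up for illustration, and it assumes the serials
have been obtained separately (for example with an SOA query to the
same server at about the same time).

```python
from collections import Counter


def classify(iana_serial: int, yeti_serial: int, answers_match: bool) -> str:
    """Label one comparison (hypothetical labels).

    "match"        the answers agree
    "serial-skew"  the answers differ, but so do the zone serials
    "mismatch"     the answers differ although the serials are equal
    """
    if answers_match:
        return "match"
    if iana_serial != yeti_serial:
        return "serial-skew"
    return "mismatch"


def tally(results):
    """Count the labels over an iterable of (iana_serial, yeti_serial, match) tuples."""
    return Counter(classify(i, y, m) for i, y, m in results)
```

Only the "mismatch" bucket would then need a closer look; the size of
the "serial-skew" bucket would tell us how much this matters in
practice.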

Yet another issue would be to validate the replies from the Yeti root
servers using DNSSEC. It shouldn't be too hard for a tool to perform
such a check, but I am not sure any library actually supports such a
check today, so it might involve some non-trivial hacking.

My proposal would be to develop a stand-alone tool which:

* accepts either dnstap or pcap input,
* extracts only the queries sent to root servers,
* sends these queries to the Yeti servers, and
* checks the results against the IANA results.
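
To make the shape of such a tool a bit more concrete, here is a minimal
skeleton of the steps listed above. It is a sketch, not a design:
the Query type, root_queries() and check() are hypothetical names, the
input is assumed to have been decoded from dnstap or pcap already, and
the actual lookups are passed in as functions so that no particular DNS
library is assumed. ROOT_ADDRS is deliberately a partial list for
illustration; a real tool would load the full set from a root hints
file.

```python
import ipaddress
from typing import Callable, Iterable, Iterator, List, NamedTuple

# Partial, illustrative list of IANA root server addresses (a, f, k).
ROOT_ADDRS = frozenset(
    ipaddress.ip_address(a)
    for a in ("198.41.0.4", "2001:503:ba3e::2:30", "192.5.5.241", "193.0.14.129")
)


class Query(NamedTuple):
    """One captured query (hypothetical type), already decoded from the input."""
    dst: str    # destination address the resolver sent the query to
    qname: str
    qtype: str


def root_queries(queries: Iterable[Query], root_addrs=ROOT_ADDRS) -> Iterator[Query]:
    """Keep only the queries whose destination is a known root server address."""
    for q in queries:
        if ipaddress.ip_address(q.dst) in root_addrs:
            yield q


def check(
    queries: Iterable[Query],
    ask_iana: Callable[[Query], object],
    ask_yeti: Callable[[Query], object],
) -> List[Query]:
    """Replay root-bound queries to both systems and return those that disagree.

    ask_iana and ask_yeti are supplied by the caller and should return
    answers in a form that can be compared with ==.
    """
    return [q for q in root_queries(queries) if ask_iana(q) != ask_yeti(q)]
```

Passing the lookups in as functions also makes it easy to test the
filtering and comparison logic without touching the network.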

Please let me know if any of this makes sense. :)

Cheers,

--
Shane




