[Opendnssec-user] NSEC3 failure?
Havard Eidnes
2016-04-01 07:42:36 UTC
Hi,

our zones are set up to use NSEC3 for authenticated denial of
existence. In our setup, we let OpenDNSSEC do zone transfers in
and out (as explained before), but on the public distribution
master we run periodic checks of all the zones using both
ldns-verify-zone and BIND's dnssec-verify program.

This morning, dnssec-verify flagged a problem for one of our
zones, where all the problems are related to NSEC3 records which
dnssec-verify thinks are missing:

Loading zone '255.39.128.in-addr.arpa' from file 'zones/255.39.128.in-addr.arpa'
Verifying the zone using the following algorithms: RSASHA256.
Missing NSEC3 record for 255.39.128.in-addr.arpa (NAKEP4OF03QEFOD18FBGE5GTKBLV4BHK.255.39.128.in-addr.arpa)
Missing NSEC3 record for 10.255.39.128.in-addr.arpa (6U9IB2FVPQS353THQ1SJ2UGN32KFDNDB.255.39.128.in-addr.arpa)
...

It does this for all the records in the zone.

The checker script preserves a copy of the zone which is flagged
with errors. All the "bad" zones do have NSEC3 records in
appropriate quantities.

The zone has been automatically signed three times, and each time
the resulting zone transferred to the slave (or "public master")
failed the check:

Apr 1 02:50:06 hugin ods-signerd: [STATS] 255.39.128.in-addr.arpa 2016040100 RR[count=0 time=0(sec)] NSEC3[count=0 time=0(sec)] RRSIG[new=2 reused=237 time=0(sec) avg=0(sig/sec)] TOTAL[time=0(sec)]
Apr 1 04:50:07 hugin ods-signerd: [STATS] 255.39.128.in-addr.arpa 2016040101 RR[count=0 time=0(sec)] NSEC3[count=0 time=0(sec)] RRSIG[new=5 reused=234 time=1(sec) avg=5(sig/sec)] TOTAL[time=1(sec)]
Apr 1 06:50:06 hugin ods-signerd: [STATS] 255.39.128.in-addr.arpa 2016040102 RR[count=0 time=0(sec)] NSEC3[count=0 time=0(sec)] RRSIG[new=5 reused=234 time=0(sec) avg=0(sig/sec)] TOTAL[time=0(sec)]

When I realized this was happening, I manually initiated a
signing via "ods-signer sign 255.39.128.in-addr.arpa", and this
has apparently cured the problem:

Apr 1 07:41:47 hugin ods-signerd: [STATS] 255.39.128.in-addr.arpa 2016040103 RR[count=0 time=0(sec)] NSEC3[count=0 time=0(sec)] RRSIG[new=2 reused=237 time=0(sec) avg=0(sig/sec)] TOTAL[time=0(sec)]

Now, manually verifying whether the NSEC3 records are OK is
currently beyond what I can do...
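
For anyone who wants to check an NSEC3 owner name by hand, here is a minimal sketch of the hash computation as I understand it from RFC 5155: SHA-1 over the lowercase wire-format name plus salt, repeated for the configured number of additional iterations, then base32hex-encoded. The function names are my own (hypothetical), and the salt and iteration count have to be read from the zone's NSEC3PARAM record.

```python
import base64
import hashlib

# Standard base32 alphabet -> "base32hex" alphabet used for NSEC3 owner names.
_B32_TO_B32HEX = str.maketrans(
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ234567",
    "0123456789ABCDEFGHIJKLMNOPQRSTUV",
)


def name_to_wire(name):
    """Encode a domain name as lowercase, uncompressed DNS wire format."""
    wire = b""
    for label in name.rstrip(".").lower().split("."):
        if label:
            wire += bytes([len(label)]) + label.encode("ascii")
    return wire + b"\x00"  # terminating root label


def nsec3_hash(name, salt_hex, iterations):
    """Return the NSEC3 hash label of `name` for hash algorithm 1 (SHA-1)."""
    # A salt written as "-" in presentation format means "no salt".
    salt = b"" if salt_hex == "-" else bytes.fromhex(salt_hex)
    digest = hashlib.sha1(name_to_wire(name) + salt).digest()
    for _ in range(iterations):  # `iterations` counts the additional rounds
        digest = hashlib.sha1(digest + salt).digest()
    # A SHA-1 digest is 20 bytes, so the base32 output has no padding.
    return base64.b32encode(digest).decode("ascii").translate(_B32_TO_B32HEX)
```

Calling nsec3_hash("255.39.128.in-addr.arpa", salt, iterations) with the values from the NSEC3PARAM record should give the first label of the NSEC3 owner name that a verifier looks for, which can then be compared against the names reported by dnssec-verify.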

Does anyone have an idea what more needs to be done to zero in on
this problem?

Regards,

- Håvard
Yuri Schaeffer
2016-04-01 08:19:11 UTC
Hi Håvard,
Post by Havard Eidnes
Apr 1 02:50:06 hugin ods-signerd: [STATS] 255.39.128.in-addr.arpa 2016040100 RR[count=0 time=0(sec)] NSEC3[count=0 time=0(sec)] RRSIG[new=2 reused=237 time=0(sec) avg=0(sig/sec)] TOTAL[time=0(sec)]
Apr 1 04:50:07 hugin ods-signerd: [STATS] 255.39.128.in-addr.arpa 2016040101 RR[count=0 time=0(sec)] NSEC3[count=0 time=0(sec)] RRSIG[new=5 reused=234 time=1(sec) avg=5(sig/sec)] TOTAL[time=1(sec)]
Apr 1 06:50:06 hugin ods-signerd: [STATS] 255.39.128.in-addr.arpa 2016040102 RR[count=0 time=0(sec)] NSEC3[count=0 time=0(sec)] RRSIG[new=5 reused=234 time=0(sec) avg=0(sig/sec)] TOTAL[time=0(sec)]
When I realized this was happening, I manually initiated a
signing via "ods-signer sign 255.39.128.in-addr.arpa", and this
has apparently cured the problem:
Apr 1 07:41:47 hugin ods-signerd: [STATS] 255.39.128.in-addr.arpa 2016040103 RR[count=0 time=0(sec)] NSEC3[count=0 time=0(sec)] RRSIG[new=2 reused=237 time=0(sec) avg=0(sig/sec)] TOTAL[time=0(sec)]
Now, manually verifying whether the NSEC3 records are OK is
currently beyond what I can do...
Does anyone have an idea what more needs to be done to zero in on
this problem?
Hmm. My first guess would be that it involves a resalt. Your log lines
seem to indicate that no new NSEC3s are being generated. Yet a resign
solves the problem. Could you compare the NSEC3PARAM from the failing
zone to the one after the manual resign?
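
To make the counters in those [STATS] lines easier to compare across runs, something like the following could pull them out. This is only a sketch: the regular expression is derived from the log lines quoted in this thread, not from any documented OpenDNSSEC log format, and the function name is made up for illustration.

```python
import re

# Pattern inferred from the ods-signerd [STATS] lines quoted in this thread;
# the format is an assumption and other versions may log differently.
STATS_RE = re.compile(
    r"\[STATS\]\s+(?P<zone>\S+)\s+(?P<serial>\d+)\s+"
    r"RR\[count=(?P<rr>\d+)[^\]]*\]\s+"
    r"NSEC3?\[count=(?P<nsec>\d+)[^\]]*\]\s+"
    r"RRSIG\[new=(?P<new>\d+)\s+reused=(?P<reused>\d+)"
)


def parse_stats(line):
    """Return the counters from one [STATS] line, or None if it does not match."""
    match = STATS_RE.search(line)
    if match is None:
        return None
    return {
        "zone": match.group("zone"),
        "serial": int(match.group("serial")),
        "rr_count": int(match.group("rr")),
        "nsec_count": int(match.group("nsec")),    # newly generated NSEC(3)s
        "rrsig_new": int(match.group("new")),
        "rrsig_reused": int(match.group("reused")),
    }
```

Feeding the signer log through parse_stats and printing nsec_count per serial shows at a glance whether any run regenerated the NSEC3 chain.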

//Yuri
Havard Eidnes
2016-04-01 08:34:43 UTC
Post by Yuri Schaeffer
Post by Havard Eidnes
Does anyone have an idea what more needs to be done to zero in on
this problem?
Hmm. My first guess would be that it involves a resalt. Your log lines
seem to indicate that no new NSEC3s are being generated. Yet a resign
solves the problem. Could you compare the NSEC3PARAM from the failing
zone to the one after the manual resign?
It seems this was a correct hunch, ref. my other posting.

Regards,

- Håvard
Havard Eidnes
2016-04-01 15:09:33 UTC
Post by Havard Eidnes
Post by Yuri Schaeffer
Post by Havard Eidnes
Does anyone have an idea what more needs to be done to zero in on
this problem?
Hmm. My first guess would be that it involves a resalt. Your log lines
seem to indicate that no new NSEC3s are being generated. Yet a resign
solves the problem. Could you compare the NSEC3PARAM from the failing
zone to the one after the manual resign?
It seems this was a correct hunch, ref. my other posting.
...and if I'm not terribly mistaken, the three zones which have been
flagged in this way (yep, two more popped up) so far have all been
added to our OpenDNSSEC setup after we upgraded to 1.4.9.

Regards,

- Håvard
Yuri Schaeffer
2016-04-01 18:30:48 UTC
Post by Havard Eidnes
...and if I'm not terribly mistaken, the three zones which have been
flagged in this way (yep, two more popped up) so far have all been
added to our OpenDNSSEC setup after we upgraded to 1.4.9.
I think there is a correlation but no causation in this case. They were
probably added around the same time and thus resalted at the same time.
Though I'm not entirely sure about that.

I do have a possible fix ready.
https://github.com/yschaeff/opendnssec/tree/double_nsec3param if you are
feeling adventurous. It passes our regression tests but I wasn't able to
reproduce the problem yet so I'm not 100 percent sure it is a fix.

I think there is a window between a resalt and a manual resign or
incoming zone transfer where this double nsec3param can occur. I hope
that I can reproduce it soon with this new insight.

//Yuri
Havard Eidnes
2016-04-03 22:12:39 UTC
Post by Yuri Schaeffer
Post by Havard Eidnes
...and if I'm not terribly mistaken, the three zones which have been
flagged in this way (yep, two more popped up) so far have all been
added to our OpenDNSSEC setup after we upgraded to 1.4.9.
I think there is a correlation but no causation in this case. They were
probably added around the same time and thus resalted at the same time.
Though I'm not entirely sure about that.
Could very well be. A couple of new zones came up with this
problem, and they didn't share the commonality with the others.
Post by Yuri Schaeffer
I do have a possible fix ready.
https://github.com/yschaeff/opendnssec/tree/double_nsec3param if you are
feeling adventurous. It passes our regression tests but I wasn't able to
reproduce the problem yet so I'm not 100 percent sure it is a fix.
Thanks! I'm now running with this patch, we'll see in a while if
it's helped.

Regards,

- Håvard

Havard Eidnes
2016-04-01 08:31:26 UTC
Hm,

seems I need to follow up on my own posting, as I see that all
the three "bad" zones have *two* NSEC3PARAM records:

255.39.128.in-addr.arpa. 0 IN NSEC3PARAM 1 0 5 45F39B9A60C14581
255.39.128.in-addr.arpa. 0 IN NSEC3PARAM 1 0 5 D9E0ED2449E3721D

while the good one only has one:

255.39.128.in-addr.arpa. 0 IN NSEC3PARAM 1 0 5 45F39B9A60C14581

I bet that's what's causing BIND's dnssec-verify to balk at the
"bad" zones.
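
A periodic check for this condition could look roughly like the sketch below. It assumes a zone file with one record per line and fully written owner names, as in the excerpt above; it does not handle multi-line records or $ORIGIN-relative names, and the function name is only illustrative.

```python
def find_nsec3params(zone_text):
    """Return (owner, hash_alg, flags, iterations, salt) for each NSEC3PARAM RR.

    Assumes one record per line in the form
    "owner [ttl] [class] NSEC3PARAM alg flags iterations salt".
    """
    found = []
    for line in zone_text.splitlines():
        fields = line.split(";", 1)[0].split()  # drop trailing comments
        if "NSEC3PARAM" not in fields:
            continue
        idx = fields.index("NSEC3PARAM")
        # Only TTL and class may sit between the owner and the type field.
        # This skips RRSIGs covering NSEC3PARAM and NSEC3 type bitmaps.
        if idx < 1 or any(not (f.isdigit() or f == "IN") for f in fields[1:idx]):
            continue
        if len(fields) < idx + 5:
            continue  # incomplete rdata
        alg, flags, iterations, salt = fields[idx + 1:idx + 5]
        found.append((fields[0], int(alg), int(flags), int(iterations), salt))
    return found
```

If find_nsec3params returns more than one entry for a zone, the zone is in the state shown above and is worth flagging before it is distributed.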

Regards,

- Håvard