Discussion:
[Opendnssec-user] CRITICAL: failed to sign zone example.com: General error
Michael Grimm
2017-01-16 18:49:06 UTC
Hi --

This is opendnssec 1.4.12 and FreeBSD 11-STABLE.

Today I found the following error message in my logs:

| ods-signerd: [worker[4]] CRITICAL: failed to sign zone example.com: General error

After removing all files in /usr/local/var/opendnssec/signconf and
/usr/local/var/opendnssec/tmp, and restarting OpenDNSSEC afterwards,
I end up with:

| ods-enforcerd: Zone example.com found.
| ods-enforcerd: Policy for example.com set to default.
| ods-enforcerd: Config will be output to /usr/local/var/opendnssec/signconf/example.com.xml.
| ods-enforcerd: Not enough keys to satisfy zsk policy for zone: example.com. keys_to_allocate(1) = keys_needed(1) - (keys_available(1) - keys_pending_retirement(1))
| ods-enforcerd: Tried to allocate 1 keys, failed on allocating key number 1
| ods-enforcerd: ods-enforcerd will create some more keys on its next run
| ods-enforcerd: Error allocating zsks to zone example.com

and

| ods-signerd: [worker[4]] CRITICAL: failed to sign zone example.com: General error

dns> ods-ksmutil key list -all --zone example.com
Keys:
Zone: Keytype: State: Date of next transition:
example.com KSK active 2026-01-20 12:59:25
example.com ZSK active 2017-01-16 14:00:07
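
The enforcer's "Not enough keys" message can be read as plain arithmetic. A minimal sketch, assuming the formula printed in the log line is what the enforcer evaluates; the function name keys_to_allocate is hypothetical and not part of OpenDNSSEC:

```shell
# Hypothetical helper mirroring the formula from the ods-enforcerd log line:
#   keys_to_allocate = keys_needed - (keys_available - keys_pending_retirement)
keys_to_allocate() {
    needed=$1
    available=$2
    pending_retirement=$3
    echo $(( needed - (available - pending_retirement) ))
}

# Values taken from the log line: needed=1, available=1, pending_retirement=1
keys_to_allocate 1 1 1
```

With the logged values the result is 1: the single available ZSK is already pending retirement, so one fresh key is required, and the allocation of that key is the step that fails.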

Hmm, what do I need to do in order to recover from that error? Any input
is highly appreciated.

Thanks and regards,
Michael
Berry A.W. van Halderen
2017-01-16 19:34:46 UTC
Post by Michael Grimm
Hmm, what do I need to do in order to recover from that error? Any input
is highly appreciated.
The enforcer will try to allocate more keys upon the next run. When
that happens depends (in 1.4) on the Interval setting in conf.xml,
normally a number of minutes (at 14:00 your time).
But my assumption is that this already was tried a number of times.

I don't know which HSM you are using. If you are using SoftHSM, it
could be due to permissions problems on the files where the keys
are stored, or to a full filesystem. Check /var/lib/softhsm,
the default location (set in /etc/softhsm.conf).

If you are using a real HSM, it might be a connection problem,
a problem with the library or even that the HSM is full.

Clearing tmp in this case makes no difference.

You can also increase the verbosity in conf.xml and restart
to get a bit more information. You just need to look at
the ods-enforcerd lines in the log. The signer doesn't seem
to be the real problem, though I am puzzled why it got the
initial problem. Did you keep the original
/usr/local/var/opendnssec/signconf/example.com.xml
by any chance? The current state I can explain, but not
the original.

\Berry
Michael Grimm
2017-01-16 20:07:47 UTC
Post by Berry A.W. van Halderen
Post by Michael Grimm
Hmm, what do I need to do in order to recover from that error? Any input
is highly appreciated.
The enforcer will try to allocate more keys upon the next run. The time
when this is depends (in 1.4), upon the Interval setting in the
conf.xml. Normally a number of minutes (at 14:00 your time).
But my assumption is that this already was tried a number of times.
Indeed. In the meantime I do find many of those errors in the logfile.
Post by Berry A.W. van Halderen
I don't know which HSM you are using.
softhsm 1.3.8
Post by Berry A.W. van Halderen
If you are using SoftHSM, it
could be due to permissions problems on the files where the keys
are stored, or to a full filesystem. Check /var/lib/softhsm,
the default location (set in /etc/softhsm.conf).
-rw-r--r-- 1 root wheel uarch 44032 Jan 16 20:48 /usr/local/var/opendnssec/kasp.db

I have to note, that 8 other domains are kept in that database. None of
the other domains triggered a similar error (yet).
Post by Berry A.W. van Halderen
You can also increase the verbosity in conf.xml and restart
to get a bit more information.
I had <Verbosity>3</Verbosity>. I increased it to 4, 5, and 10, but
to no avail. The very same log messages are reported, no additional
ones. Is this the verbosity you were referring to?
Post by Berry A.W. van Halderen
Did you keep the original
/usr/local/var/opendnssec/signconf/example.com.xml
by any change?
Yes, I saved it before my rescue attempts:

-rw-r--r-- root/opendnssec 990 2017-01-06 21:02 opendnssec/signconf/example.com.xml

What do you want me to do with that?

I do have to admit that I am pretty helpless in understanding the
details of the software I am using. Sad to say :-(

So, what should I do next?

Create a new key for example.com and import it into softhsm?
Export kasp.db and re-import? (how?)
Anything else?

Thanks and regards,
Michael
Michael Grimm
2017-01-16 20:23:39 UTC
Post by Michael Grimm
So, what should I do next?
Create a new key for example.com and import it into softhsm?
Export kaps.db and re-import? (how?)
Anything else?
Or: would it be wise to upgrade both opendnssec and softhsm under these
circumstances?

Regards,
Michael
Berry A.W. van Halderen
2017-01-16 20:44:05 UTC
Post by Michael Grimm
Post by Berry A.W. van Halderen
Post by Michael Grimm
Hmm, what do I need to do in order to recover from that error? Any input
is highly appreciated.
The enforcer will try to allocate more keys upon the next run. The time
when this is depends (in 1.4), upon the Interval setting in the
conf.xml. Normally a number of minutes (at 14:00 your time).
But my assumption is that this already was tried a number of times.
Indeed. In the meantime I do find many of those errors in the logfile.
Post by Berry A.W. van Halderen
I don't know which HSM you are using.
softhsm 1.3.8
Post by Berry A.W. van Halderen
If you are using SoftHSM, it
could be due to permissions problems on the files where the keys
are stored, or to a full filesystem. Check /var/lib/softhsm,
the default location (set in /etc/softhsm.conf).
-rw-r--r-- 1 root wheel uarch 44032 Jan 16 20:48
/usr/local/var/opendnssec/kasp.db
I'm afraid that is the enforcer database, it has no storage of
the keys.
Given SoftHSM, the proper location can be seen in /etc/softhsm.conf
or /usr/local/etc/softhsm.conf. But given FreeBSD I'm pretty sure it is
in /usr/local/var/lib. You can check with:
ls -ld /usr/local/var/lib/softhsm
df -k /usr/local/var/lib/softhsm
to see whether there are any filesystem problems.

Also check if there is a <Capacity> specified in your
/usr/local/etc/opendnssec/conf.xml
This is also a limit on the maximum keys possible.
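
The storage check described above can be scripted. A minimal sketch, assuming a SoftHSM 1.x style softhsm.conf in which each token is a "slot:path" line; the function names are hypothetical and not part of SoftHSM or OpenDNSSEC:

```shell
# Print the token database paths from a SoftHSM 1.x style config.
# Assumption: tokens are listed as "<slot>:<path>" lines; comment lines
# start with '#' and are skipped because they do not match the pattern.
softhsm_db_paths() {
    sed -n 's/^[[:space:]]*[0-9][0-9]*:[[:space:]]*//p' "$1"
}

# For each database, report whether the file and its directory are writable
# by the current user, then show free space on that filesystem.
check_softhsm_storage() {
    softhsm_db_paths "$1" | while IFS= read -r db; do
        dir=$(dirname "$db")
        if [ -w "$db" ] && [ -w "$dir" ]; then
            echo "ok: $db"
        else
            echo "not writable: $db"
        fi
        df -k "$dir"
    done
}
```

It would be called as `check_softhsm_storage /usr/local/etc/softhsm.conf`. The writability test always succeeds for root, so it is only meaningful when run as the user the daemons actually run as.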
Post by Michael Grimm
I have to note, that 8 other domains are kept in that database. None of
the other domains triggered a similar error (yet).
Post by Berry A.W. van Halderen
You can also increase the verbosity in conf.xml and restart
to get a bit more information.
I had had <Verbosity>3</Verbosity>. I did increase to 4,5, and 10, but
to no avail. The very same log messages are reported, no additional
ones. Is this the verbosity you were refering to?
Yes, you did restart the daemons right? Otherwise the change isn't
picked up. An increase to 6 or 7 often is very verbose.
Post by Michael Grimm
Post by Berry A.W. van Halderen
Did you keep the original
/usr/local/var/opendnssec/signconf/example.com.xml
by any change?
-rw-r--r-- root/opendnssec 990 2017-01-06 21:02
opendnssec/signconf/example.com.xml
What do you want me to do with that?
Can you send it to me privately? Me or one of my co-workers can
have a look at it. There are only references to keys placed
there so no serious security concerns.
Post by Michael Grimm
I do have to admit that I am pretty helpless in understanding the
details of the software I am using. Sad to say :-(
So, what should I do next?
Create a new key for example.com and import it into softhsm?
Export kaps.db and re-import? (how?)
Anything else?
I don't see how that would help. Quick repairs for the signer
are often repairs to the signconf, such that the cause of the
failure is seen.
_______________________________________________
Opendnssec-user mailing list
https://lists.opendnssec.org/mailman/listinfo/opendnssec-user
Michael Grimm
2017-01-16 21:09:33 UTC
Post by Berry A.W. van Halderen
Post by Michael Grimm
Post by Berry A.W. van Halderen
If you are using SoftHSM, it
could be due to permissions problems on the files where the keys
are stored, or to a full filesystem. Check /var/lib/softhsm,
the default location (set in /etc/softhsm.conf).
-rw-r--r-- 1 root wheel uarch 44032 Jan 16 20:48
/usr/local/var/opendnssec/kasp.db
I'm afraid that is the enforcer database, it has no storage of
the keys.
Given SoftHSM, the proper location is can be seen in /etc/softhsm.conf
or /usr/local/etc/softhsm.conf.
Sorry my fault. Here is the information you asked for:

MW-dns2|root> ls -al /usr/local/var/softhsm/slot0.db
-rw------- 1 root wheel uarch 150528 Jan 4 03:01 /usr/local/var/softhsm/slot0.db
Post by Berry A.W. van Halderen
Also check if there is a <Capacity> specified in your
/usr/local/etc/opendnssec/conf.xml
This is also a limit on the maximum keys possible.
No, there is no such Capacity limitation defined.
Post by Berry A.W. van Halderen
Post by Michael Grimm
Post by Berry A.W. van Halderen
You can also increase the verbosity in conf.xml and restart
to get a bit more information.
I had had <Verbosity>3</Verbosity>. I did increase to 4,5, and 10, but
to no avail. The very same log messages are reported, no additional
ones. Is this the verbosity you were refering to?
Yes, you did restart the daemons right?
Yes :-)
Post by Berry A.W. van Halderen
An increase to 6 or 7 often is very verbose.
Not here :-( Still no increase observable.
Post by Berry A.W. van Halderen
Post by Michael Grimm
Post by Berry A.W. van Halderen
Did you keep the original
/usr/local/var/opendnssec/signconf/example.com.xml
by any change?
-rw-r--r-- root/opendnssec 990 2017-01-06 21:02
opendnssec/signconf/example.com.xml
What do you want me to do with that?
Can you send it to me privately? Me or one of my co-workers can
have a look at it. There are only references to keys placed
there so no serious security concerns.
Sure, I will send it in private mail.

Regards,
Michael
Yuri Schaeffer
2017-01-18 13:34:13 UTC
Hi Michael,

Please check for the availability of the key in the hsm:
ods-hsmutil -c /etc/opendnssec/conf.xml list

It may have trouble finding one of the keys from your signconf:
0347526dbd7d57ff891f017c26a30846
a55ae0ef264253145c8f29c491829d29

Also make sure you pass the correct conf.xml file. I'm a little worried
you may have one in multiple locations, since increasing the verbosity
doesn't seem to work for you.

//Yuri
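
The lookup can also be done for every key a signconf references. A minimal sketch, assuming the signconf stores each key reference as a `<Locator>` element on a single line and that the output of `ods-hsmutil list` has been saved to a file first; the function names are hypothetical:

```shell
# Print the key locators referenced by a signconf file, one per line.
# Assumption: each <Locator>...</Locator> element sits on its own line.
signconf_locators() {
    sed -n 's/.*<Locator>\([0-9A-Fa-f]*\)<\/Locator>.*/\1/p' "$1"
}

# Report every locator that does not appear in the saved HSM key listing.
#   $1 = signconf file
#   $2 = file holding the saved output of `ods-hsmutil ... list`
missing_locators() {
    signconf_locators "$1" | while IFS= read -r loc; do
        grep -qF -- "$loc" "$2" || echo "missing in HSM: $loc"
    done
}
```

Usage would look like `ods-hsmutil -c /etc/opendnssec/conf.xml list > hsm-keys.txt` followed by `missing_locators example.com.xml hsm-keys.txt`, with the conf.xml path adjusted to the local installation. No output means every referenced key was found.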
Michael Grimm
2017-01-18 16:21:47 UTC
Hi Yuri —
Post by Yuri Schaeffer
ods-hsmutil -c /etc/opendnssec/conf.xml list
0347526dbd7d57ff891f017c26a30846
a55ae0ef264253145c8f29c491829d29
Nope. Both keys are found:

dns> ods-hsmutil -c /usr/local/etc/opendnssec/conf.xml list | egrep -i '(0347526dbd7d57ff891f017c26a30846|a55ae0ef264253145c8f29c491829d29)'
SoftHSM a55ae0ef264253145c8f29c491829d29 RSA/2048
SoftHSM 0347526dbd7d57ff891f017c26a30846 RSA/2048
Post by Yuri Schaeffer
Also make sure you pass the correct conf.xml file. I'm a little worried
you may have one on multiple locations.
Hmm. This is a FreeBSD port that I installed, but I double-checked, and no, there is only one conf.xml available.
Post by Yuri Schaeffer
Since increasing the verbosity doesn't seem to work for you?
I do have the following section in my conf.xml file regarding verbosity:

<Logging>
<Verbosity>7</Verbosity>
<Syslog><Facility>local0</Facility></Syslog>
</Logging>

Opendnssec runs in a FreeBSD jail, and all log messages are forwarded to the host's syslogd. But that shouldn't be the reason for a "not working verbosity setting", correct? Is there a way to get error messages into a file?


Well, coming back to my issue. As I mentioned before, I am not that well informed about all the details of DNSSEC. Could the current lack of a key rollover cause major issues for that domain? I am willing to upgrade opendnssec, but that would need some time for testing, because I do not want to break my current setup. Would the current issue lead to a disaster if I performed an upgrade under these circumstances? Would it be worth a try?

I really do appreciate your help,
Michael
Michael Grimm
2017-01-18 17:14:17 UTC
Post by Michael Grimm
Post by Yuri Schaeffer
Since increasing the verbosity doesn't seem to work for you?
Is there a way to fetch error massages into a file?
My bad. Sure, there is also a local logfile (ods.log), which holds much more information than syslog.

Here's a snippet for example.com:

Jan 18 17:10:17 dns2 ods-enforcerd: Zone example.com found.
Jan 18 17:10:17 dns2 ods-enforcerd: Policy for example.com set to default.
Jan 18 17:10:17 dns2 ods-enforcerd: Config will be output to /usr/local/var/opendnssec/signconf/example.com.xml.
Jan 18 17:10:17 dns2 ods-enforcerd: Not enough keys to satisfy zsk policy for zone: example.com. keys_to_allocate(1) = keys_needed(1) - (keys_available(1) - keys_pending_retirement(1))
Jan 18 17:10:17 dns2 ods-enforcerd: Tried to allocate 1 keys, failed on allocating key number 1
Jan 18 17:10:17 dns2 ods-enforcerd: ods-enforcerd will create some more keys on its next run
Jan 18 17:10:17 dns2 ods-enforcerd: Error allocating zsks to zone example.com
Jan 18 17:10:17 dns2 ods-enforcerd: Disconnecting from Database...
Jan 18 17:10:17 dns2 ods-enforcerd: Sleeping for 3600 seconds.

Jan 18 17:11:26 dns2 ods-signerd: [worker[4]] report for duty
Jan 18 17:11:26 dns2 ods-signerd: [scheduler] pop task for zone example.com
Jan 18 17:11:26 dns2 ods-signerd: [scheduler] unschedule task [configure] for zone example.com
Jan 18 17:11:26 dns2 ods-signerd: [worker[4]] start working on zone example.com
Jan 18 17:11:26 dns2 ods-signerd: [worker[4]] perform task [configure] for zone example.com at 1484755886
Jan 18 17:11:26 dns2 ods-signerd: [worker[4]] configure zone example.com
Jan 18 17:11:26 dns2 ods-signerd: [file] open file file=/usr/local/var/opendnssec/signconf/example.com.xml mode=reading
Jan 18 17:11:26 dns2 ods-signerd: [file] unable to open file /usr/local/var/opendnssec/signconf/example.com.xml for reading: No such file or directory
Jan 18 17:11:26 dns2 ods-signerd: [file] unable to stat file /usr/local/var/opendnssec/signconf/example.com.xml: ods_fopen() failed
Jan 18 17:11:26 dns2 ods-signerd: [zone] zone example.com signconf file /usr/local/var/opendnssec/signconf/example.com.xml is unchanged since 2017-01-18 17:11:26
Jan 18 17:11:26 dns2 ods-signerd: [worker[4]] no signconf.xml for zone example.com yet
Jan 18 17:11:26 dns2 ods-signerd: [worker[4]] CRITICAL: failed to sign zone example.com: General error
Jan 18 17:11:26 dns2 ods-signerd: [worker[4]] backoff task [configure] for zone example.com with 3600 seconds
Jan 18 17:11:26 dns2 ods-signerd: [worker[4]] finished working on zone example.com
Jan 18 17:11:26 dns2 ods-signerd: [scheduler] schedule task [configure] for zone example.com
Jan 18 17:11:26 dns2 ods-signerd: [task] On Wed Jan 18 18:11:26 2017 I will [configure] zone example.com
Jan 18 17:11:26 dns2 ods-signerd: [worker[4]] report for duty
Jan 18 17:11:26 dns2 ods-signerd: [worker[4]] nothing to do

Please let me know if I do need to look for other entries.

Regards,
Michael
PGNet Dev
2017-01-18 17:42:09 UTC
I haven't followed this thread, sry if this Q's already been asked/answered.

I've seen this before
Post by Michael Grimm
Jan 18 17:11:26 dns2 ods-signerd: [file] open file file=/usr/local/var/opendnssec/signconf/example.com.xml mode=reading
Jan 18 17:11:26 dns2 ods-signerd: [file] unable to open file /usr/local/var/opendnssec/signconf/example.com.xml for reading: No such file or directory
albeit with ods 2.1x ... here, it was perms.

Do user/group shown in

ps aux | grep ods

match perms on

/usr/local/var/opendnssec/signconf
/usr/local/var/opendnssec/signconf/example.com.xml

?

here, e.g.,

ps aux | grep ods
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
ods-signe 14141 opendnssec 7u IPv4 55794 0t0 UDP dns.example.net:12345
^^^^^^^^^^
...

cd /var/opendnssec
tree -ug
.
├── [opendnssec opendnssec 4096] enforcer
│ └── [opendnssec opendnssec 2032] zones.xml
├── [opendnssec opendnssec 98304] kasp.db
├── [opendnssec opendnssec 4096] raw
├── [opendnssec opendnssec 4096] signconf
│ ├── [opendnssec opendnssec 1517] example.com.xml
│ ├── [opendnssec opendnssec 1172] example.com.xml.ZONE_DELETED
...
├── [opendnssec opendnssec 4096] signed
├── [opendnssec opendnssec 4096] signer
│ ├── [opendnssec opendnssec 8242] example.com.axfr
│ ├── [opendnssec opendnssec 10186] example.com.backup2
│ ├── [opendnssec opendnssec 21442] example.com.ixfr
│ ├── [opendnssec opendnssec 345] example.com.xfrd-state
...
└── [opendnssec opendnssec 4096] unsigned
Michael Grimm
2017-01-18 18:53:23 UTC
Post by PGNet Dev
I've seen this before
Post by Michael Grimm
Jan 18 17:11:26 dns2 ods-signerd: [file] open file file=/usr/local/var/opendnssec/signconf/example.com.xml mode=reading
Jan 18 17:11:26 dns2 ods-signerd: [file] unable to open file /usr/local/var/opendnssec/signconf/example.com.xml for reading: No such file or directory
albeit with ods 2.1x ... here, it was perms.
Do user/group shown in
ps aux | grep ods
match perms on
/usr/local/var/opendnssec/signconf
/usr/local/var/opendnssec/signconf/example.com.xml
?
ods-signerd/enforcerd both run uid root; permissions of directories in /usr/local/var/opendnssec are opendnssec:opendnssec, though.

BUT: /usr/local/var/opendnssec/signconf/example.com.xml is missing because I cleaned that directory on purpose in order to recover from my issue. If I am not mistaken, the files in /usr/local/var/opendnssec/signconf are rebuilt after restarting OpenDNSSEC's daemons. My issue is that ods-enforcerd isn't able to allocate a ZSK although such a key is available in the HSM:

Jan 18 07:10:13 dns2 ods-enforcerd: Error allocating zsks to zone example.com
Jan 18 07:11:26 dns2 ods-signerd: [worker[3]] CRITICAL: failed to sign zone example.com: General error

Regards,
Michael
PGNet Dev
2017-01-18 19:12:51 UTC
Post by Michael Grimm
If I am not mistaken are those files in /usr/local/var/opendnssec/signconf rebuild after restarting opendnssec's deamons.
here, with ods2, starting with a clean tree

tree /var/opendnssec
/var/opendnssec
├── [opendnssec 4096] enforcer
├── [opendnssec 4096] raw
├── [opendnssec 4096] signconf
├── [opendnssec 4096] signed
├── [opendnssec 4096] signer
└── [opendnssec 4096] unsigned

after

ods-enforcer-db-setup -f
Database setup successfully.
systemctl start ods-signerd
systemctl start ods-enforcerd
ods-enforcer policy import
Created policy default successfully
Created policy lab successfully
tree /var/opendnssec
/var/opendnssec
├── [opendnssec 4096] enforcer
├── [opendnssec 98304] kasp.db
├── [opendnssec 4096] raw
├── [opendnssec 4096] signconf
├── [opendnssec 4096] signed
├── [opendnssec 4096] signer
└── [opendnssec 4096] unsigned

it's the add zone step that initially populates the signconf/ dir

ods-enforcer zone add \
--zone example.com \
--xml \
--policy lab \
--input /usr/local/etc/opendnssec/addns.xml \
--output /usr/local/etc/opendnssec/addns.xml \
--in-type DNS \
--out-type DNS

tree /var/opendnssec
/var/opendnssec
├── [opendnssec 4096] enforcer
│ └── [opendnssec 2032] zones.xml
├── [opendnssec 98304] kasp.db
├── [opendnssec 4096] raw
├── [opendnssec 4096] signconf
│ └── [opendnssec 1168] example.com.xml
├── [opendnssec 4096] signed
├── [opendnssec 4096] signer
...

If I

rm -f /var/opendnssec/signconf/*
systemctl restart ods-signerd
systemctl restart ods-enforcerd

that's NOT sufficient to recreate the signconf/*

tree /var/opendnssec
/var/opendnssec
...
├── [opendnssec 4096] raw
├── [opendnssec 4096] signconf
├── [opendnssec 4096] signed
├── [opendnssec 4096] signer
...
Yuri Schaeffer
2017-01-18 21:12:21 UTC
Please note that Michael is running 1.4 which has an entirely different
enforcer than 2.0. It is clear now that the signer can't sign the zone
because you removed the signconf. And the enforcer isn't generating a
signconf because it is stuck generating a new key.

It is hard to imagine anything else than permissions to be the problem
here. Please check if ods-signerd actually runs as root and doesn't drop
permissions. Also share your conf.xml with us/me if you can. Check the
permissions on /etc/softhsm/softhsm.conf and the path mentioned in that
file. It really seems like something is missing write permissions.

Updating OpenDNSSEC will therefore not resolve your problems. After
fixing this issue I would encourage you to update, but not right now.


//Yuri
Berry A.W. van Halderen
2017-01-18 23:27:46 UTC
Post by Yuri Schaeffer
Please note that Michael is running 1.4 which has an entirely different
enforcer than 2.0. It is clear now that the signer can't sign the zone
because you removed the signconf. And the enforcer isn't generating a
signconf because it is stuck generating a new key.
It is hard to imagine anything else than permissions to be the problem
here. Please check if ods-signerd actually runs as root and doesn't drop
permissions. Also share your conf.xml with us/me if you can. Check the
permissions on /etc/softhsm/softhsm.conf and the path mentioned in that
file. It really seems like something is missing write permissions.
Updating OpenDNSSEC will therefore not resolve your problems. After
fixing this issue I would encourage you to update, but not right now.
To chip in, Yuri indicates there might be a section:
<Enforcer>
...
<Privileges>
<User>...</User>
<Group>...</Group>

The Privileges section is optional, as are both the User and Group elements.
The <Signer> also has such a section. If you have such a section,
then the enforcer requires write permissions on the file mentioned
Post by Yuri Schaeffer
-rw------- 1 root wheel uarch 150528 Jan 4 03:01 /usr/local/var/softhsm/slot0.db

This file isn't accessed by OpenDNSSEC directly, but by the SoftHSM
library, which isn't subject to the verbosity setting, and its logging
might be somewhere else.

Based on an earlier response, this could very well be set, as earlier
Post by Yuri Schaeffer
-rw-r--r-- root/opendnssec 990 2017-01-06 21:02 opendnssec/signconf/example.com.xml

Though that one only had the group set differently.

It is perfectly possible to run OpenDNSSEC as a different user or
drop permissions, but it needs write/read permissions to several
files.

\Berry
Michael Grimm
2017-01-18 22:08:40 UTC
It is clear now that the signer can't sign the zone because you removed
the signconf. And the enforcer isn't generating a signconf because it is
stuck generating a new key.
Understood.

I do have daily backups of all files involved over the last 6 months (thanks to ZFS snapshots). I haven't tried to re-install example.com.xml yet.

[What doesn't make sense to me is: there are eight other domains involved that do not show this issue.]
It is hard to imagine anything else than permissions to be the problem
here. Please check if ods-signerd actually runs as root and doesn't drop
permissions.
dns> ps aux
USER PID %CPU %MEM VSZ RSS TT STAT STARTED TIME COMMAND
root 71041 0.0 0.0 53060 10416 - IsJ 22:30 0:00.06 /usr/local/sbin/ods-enforcerd
root 71050 0.0 0.0 79672 11160 - IsJ 22:30 0:00.10 /usr/local/sbin/ods-signerd -c /usr/local/etc/opendnsse
Also share your conf.xml with us/me if you can.
dns> cat conf.xml
<?xml version="1.0" encoding="UTF-8"?>
<Configuration>
<RepositoryList>
<Repository name="SoftHSM">
<Module>/usr/local/lib/softhsm/libsofthsm.so</Module>
<TokenLabel>OpenDNSSEC</TokenLabel>
<PIN>__SECRET__</PIN>
<SkipPublicKey/>
</Repository>
</RepositoryList>
<Common>
<Logging>
<Verbosity>5</Verbosity>
<Syslog><Facility>local0</Facility></Syslog>
</Logging>
<PolicyFile>/usr/local/etc/opendnssec/kasp.xml</PolicyFile>
<ZoneListFile>/usr/local/etc/opendnssec/zonelist.xml</ZoneListFile>
</Common>
<Enforcer>
<Datastore>
<SQLite>/usr/local/var/opendnssec/kasp.db</SQLite>
</Datastore>
</Enforcer>
<Signer>
<WorkingDirectory>/usr/local/var/opendnssec/tmp</WorkingDirectory>
<WorkerThreads>4</WorkerThreads>
<Listener>
<Interface>
<Address>10.x.x.x</Address>
<Port>53</Port>
</Interface>
</Listener>
</Signer>
</Configuration>

[IP address obscured]

dns> ls -al /usr/local/etc/opendnssec/*.xml
-rw-r--r-- 1 root wheel 1662 Dec 28 12:17 /usr/local/etc/opendnssec/addns.xml
-rw-r----- 1 root wheel 1867 Jan 18 21:17 /usr/local/etc/opendnssec/conf.xml
-rw-r--r-- 1 root wheel 5881 Apr 12 2015 /usr/local/etc/opendnssec/kasp.xml
-rw-r--r-- 1 root wheel 3541 Jan 16 18:30 /usr/local/etc/opendnssec/zonelist.xml

dns> ls -al /usr/local/var/opendnssec/
-rw-r--r-- 1 root wheel 44032 Jan 18 22:30 kasp.db
-rw-r--r-- 1 root wheel 0 Jan 18 22:44 kasp.db.our_lock
drwxr-xr-x 2 opendnssec opendnssec 10 Jan 18 22:30 signconf
drwxr-xr-x 2 opendnssec opendnssec 2 May 18 2016 signed
drwxr-xr-x 2 opendnssec opendnssec 37 Jan 18 22:19 tmp
drwxr-xr-x 2 opendnssec opendnssec 2 May 18 2016 unsigned
Check the permissions on /etc/softhsm/softhsm.conf and the path mentioned
in that file.
dns> ls -al /usr/local/etc/softhsm.conf
-rw-r--r-- 1 root wheel 293 Feb 3 2015 /usr/local/etc/softhsm.conf

dsn> cat /usr/local/etc/softhsm.conf
0:/usr/local/var/softhsm/slot0.db

dns> ls -al /usr/local/var/softhsm/slot0.db
-rw------- 1 root wheel 150528 Jan 18 20:26 /usr/local/var/softhsm/slot0.db

[I did change that to 666 for testing purposes, to no avail]
It really seems like something is missing write permissions.
At first glance, I cannot see an issue here. But I will continue investigating.
Updating OpenDNSSEC will therefore not resolve your problems. After
fixing this issue I would encourage you to update, but not right now.
Thanks for that info. So I will solve my issue first.

Thank you very much for your help,
Michael
Berry A.W. van Halderen
2017-01-19 08:48:26 UTC
Post by Michael Grimm
Post by Yuri Schaeffer
It is hard to imagine anything else than permissions to be the problem
here. Please check if ods-signerd actually runs as root and doesn't drop
permissions.
dns> ps aux
USER PID %CPU %MEM VSZ RSS TT STAT STARTED TIME COMMAND
root 71041 0.0 0.0 53060 10416 - IsJ 22:30 0:00.06 /usr/local/sbin/ods-enforcerd
root 71050 0.0 0.0 79672 11160 - IsJ 22:30 0:00.10 /usr/local/sbin/ods-signerd -c /usr/local/etc/opendnsse
Post by Yuri Schaeffer
Also share your conf.xml with us/me if you can.
dns> cat conf.xml
<?xml version="1.0" encoding="UTF-8"?>
<Configuration>
<RepositoryList>
<Repository name="SoftHSM">
<Module>/usr/local/lib/softhsm/libsofthsm.so</Module>
<TokenLabel>OpenDNSSEC</TokenLabel>
<PIN>__SECRET__</PIN>
<SkipPublicKey/>
</Repository>
</RepositoryList>
[deleted]
Post by Michael Grimm
<Enforcer>
<Datastore>
<SQLite>/usr/local/var/opendnssec/kasp.db</SQLite>
</Datastore>
</Enforcer>
That all looks very sound, and indeed the processes should be able to
read everything. Let's also check:

ls -l /usr/local/lib/softhsm/libsofthsm.so

and in case it is a link:

ls -lL /usr/local/lib/softhsm/libsofthsm.so
Post by Michael Grimm
I do have daily backups of all files, involved over the last 6 month
(thanks to ZFS snapshots). I haven't tried to re-install
example.com.xml, yet.
Post by Michael Grimm
[What doesn't make sense to me is: there are eight other domains
involved that do not show this issue.]

After that I think we have exhausted all possible access permissions.
And we are left with the puzzling question why the other domains
aren't seeing the same issue. It would mean that just the generation
of keys isn't working.
@Yuri also: could there be a change in the policy/kasp which prevents
generation of keys?

\Berry
Post by Michael Grimm
[IP address obscured]
dns> ls -al /usr/local/etc/opendnssec/*.xml
-rw-r--r-- 1 root wheel 1662 Dec 28 12:17 /usr/local/etc/opendnssec/addns.xml
-rw-r----- 1 root wheel 1867 Jan 18 21:17 /usr/local/etc/opendnssec/conf.xml
-rw-r--r-- 1 root wheel 5881 Apr 12 2015 /usr/local/etc/opendnssec/kasp.xml
-rw-r--r-- 1 root wheel 3541 Jan 16 18:30 /usr/local/etc/opendnssec/zonelist.xml
dns> ls -al /usr/local/var/opendnssec/
-rw-r--r-- 1 root wheel 44032 Jan 18 22:30 kasp.db
-rw-r--r-- 1 root wheel 0 Jan 18 22:44 kasp.db.our_lock
drwxr-xr-x 2 opendnssec opendnssec 10 Jan 18 22:30 signconf
drwxr-xr-x 2 opendnssec opendnssec 2 May 18 2016 signed
drwxr-xr-x 2 opendnssec opendnssec 37 Jan 18 22:19 tmp
drwxr-xr-x 2 opendnssec opendnssec 2 May 18 2016 unsigned
Post by Yuri Schaeffer
Check the permissions on /etc/softhsm/softhsm.conf and the path mentioned
in that file.
dns> ls -al /usr/local/etc/softhsm.conf
-rw-r--r-- 1 root wheel 293 Feb 3 2015 /usr/local/etc/softhsm.conf
dsn> cat /usr/local/etc/softhsm.conf
0:/usr/local/var/softhsm/slot0.db
dns> ls -al /usr/local/var/softhsm/slot0.db
-rw------- 1 root wheel 150528 Jan 18 20:26 /usr/local/var/softhsm/slot0.db
[I did change that to 666 for testing purposes, to no avail]
Post by Yuri Schaeffer
It really seems like something is missing write permissions.
At first glance, I cannot see an issue here. But I will do continue investigating.
Post by Yuri Schaeffer
Updating OpenDNSSEC will therefore not resolve your problems. After
fixing this issue I would encourage you to update, but not right now.
Thanks for that info. So I will solve my issue first.
Thank you very much for your help,
Michael
Yuri Schaeffer
2017-01-19 09:10:18 UTC
Post by Berry A.W. van Halderen
After that I think we have exhausted all possible access permissions.
And we are left with the puzzling question why the other domains
aren't seeing the same issue. It would mean that just the generation
of keys isn't working.
It could be that they simply haven't initiated a rollover yet so no
writing necessary. And they still have their signconf so the signer will
keep running.
Post by Berry A.W. van Halderen
@Yuri also: could there be a change in the policy/kasp which prevents
generation of keys?
Yes, you can set <ManualRollover/> in the <KSK> and <ZSK> sections. In
1.4, for the ZSK this means no ZSK will be generated at all. A KSK might be
generated, but it will not be rolled unless the user issues the rollover.

//Yuri
Berry A.W. van Halderen
2017-01-19 09:18:43 UTC
Post by Yuri Schaeffer
Post by Berry A.W. van Halderen
@Yuri also: could there be a change in the policy/kasp which prevents
generation of keys?
Yes, you can set <ManualRollover/> in the <KSK> and <ZSK> sections. In
1.4, for the ZSK this means no ZSK will be generated at all. A KSK might be
generated, but it will not be rolled unless the user issues the rollover.
I don't mean that, perhaps the policy has been changed such that now
an algorithm or key length is being requested that isn't supported?

\Berry
Yuri Schaeffer
2017-01-19 09:24:48 UTC
Post by Berry A.W. van Halderen
I don't mean that, perhaps the policy has been changed such that now
an algorithm or key length is being requested that isn't supported?
Ah. I wondered why you asked. :)

Yes, exactly that: an unsupported algorithm or key length, or a bad
combination of the two, might spur similar errors on 1.4, I think.
Michael Grimm
2017-01-19 17:03:00 UTC
Hi --

@Berry: you asked for ...
dns> ls -al /usr/local/lib/softhsm/libsofthsm.so
-rwxr-xr-x 1 root wheel 149136 Jan 13 22:03 /usr/local/lib/softhsm/libsofthsm.so
Post by Yuri Schaeffer
Post by Berry A.W. van Halderen
I don't mean that, perhaps the policy has been changed such that now
an algorithm or key length is being requested that isn't supported?
Ah. I wondered why you asked. :)
Yes, exactly that: an unsupported algorithm or key length, or a bad
combination of the two, might spur similar errors on 1.4, I think.
Hmm. I came across "ods-hsmutil test" and tried it on a copy of

dns> ods-hsmutil info
Repository: SoftHSM
Module: /usr/local/lib/softhsm/libsofthsm.so
Slot: 0
Token Label: OpenDNSSEC
Manufacturer: SoftHSM
Model: SoftHSM
Serial: 1

dns|root> ods-hsmutil -v test SoftHSM
Testing repository: SoftHSM

Generating 512-bit RSA key... OK
Extracting key identifier... OK, 0c912e61825b94cd1508dc2759990d81
Signing (RSA/SHA1) with key... OK
Signing (RSA/SHA256) with key... OK
Deleting key... OK

Generating 768-bit RSA key... OK
Extracting key identifier... OK, deec6a16dab536014f97e9d7fb2425d2
Signing (RSA/SHA1) with key... OK
Signing (RSA/SHA256) with key... OK
Deleting key... OK

Generating 1024-bit RSA key... OK
Extracting key identifier... OK, 4c811b6400962ac1d2315c6f04e9b9b6
Signing (RSA/SHA1) with key... OK
Signing (RSA/SHA256) with key... OK
Signing (RSA/SHA512) with key... OK
Deleting key... OK

Generating 1536-bit RSA key... OK
Extracting key identifier... OK, 1c9d249bf36560a2a98d3adf35107344
Signing (RSA/SHA1) with key... OK
Signing (RSA/SHA256) with key... OK
Signing (RSA/SHA512) with key... OK
Deleting key... OK

Generating 2048-bit RSA key... OK
Extracting key identifier... OK, 7752b3962e79f9bdc7c51639d8645715
Signing (RSA/SHA1) with key... OK
Signing (RSA/SHA256) with key... OK
Signing (RSA/SHA512) with key... OK
Deleting key... OK

Generating 4096-bit RSA key... OK
Extracting key identifier... OK, 264f708cb68c8618100f0e5503da6d42
Signing (RSA/SHA1) with key... OK
Signing (RSA/SHA256) with key... OK
Signing (RSA/SHA512) with key... OK
Deleting key... OK

Generating 512-bit DSA key... Failed
generate domain parameters: CKR_FUNCTION_NOT_SUPPORTED

Generating 768-bit DSA key... Failed
generate domain parameters: CKR_FUNCTION_NOT_SUPPORTED

Generating 1024-bit DSA key... Failed
generate domain parameters: CKR_FUNCTION_NOT_SUPPORTED

Generating 512-bit GOST key... Failed
generate key pair: CKR_MECHANISM_INVALID

Segmentation fault (core dumped)



Hmmm!? What does that mean? I guess I should be worried.

What to do next:

#) would such a database be possible to migrate to softhsm2? Either by the migration script or manually (export, import)?
#) should I try to trigger a manual ZSK rollover for the erratic domain?
#) anything else?

#) I am already thinking about a worst case scenario: Restarting from scratch (only 9 domains involved). I have read that it should be possible to run two opendnssec versions in parallel. Can you confirm this?

Thank you very much for still trying to help me,
Michael
PGNet Dev
2017-01-19 17:27:33 UTC
Post by Michael Grimm
Generating 512-bit DSA key... Failed
generate domain parameters: CKR_FUNCTION_NOT_SUPPORTED
Generating 768-bit DSA key... Failed
generate domain parameters: CKR_FUNCTION_NOT_SUPPORTED
Generating 1024-bit DSA key... Failed
generate domain parameters: CKR_FUNCTION_NOT_SUPPORTED
Generating 512-bit GOST key... Failed
generate key pair: CKR_MECHANISM_INVALID
Segmentation fault (core dumped)
Hmmm!? What does that mean? I guess I should be worried.
Without seeing a trace, my 1st *guess* would be that the linked Botan or
OpenSSL (DID softhsm1 even support OpenSSL?) crypto backend doesn't have
DSA enabled, or is somehow busted.

Just curious -- where are you getting your Softhsm/ODS installs?

DIY?
Distro pkgs?
Post by Michael Grimm
#) would such a database be possible to migrate to softhsm2? Either by the migration script or manually (export, import)?
#) should I try to trigger a manual ZSK rollover for the erratic domain?
#) anything else?
#) I am already thinking about a worst case scenario: Restarting from scratch (only 9 domains involved). I have read that it should be possible to run two opendnssec versions in parallel. Can you confirm this?
Just my $0.02 ... and, I'm certainly not one of the devs.

I'd had zero luck getting softhsm1x and ods1x working on my system; if
it wasn't one thing it was another.

Yes, I know, others obviously have it working.

I moved instead to building from source:

ldns 1.7.x
softhsm 2.3.x, backed by openssl 1.0.2j
ods 2.1.x

and run under systemd.

Since then, I've had a much more reliable system.

IIUC from a previous post, ods 2.1 is targeted for _release_ end of Jan.

Apart from the fact that it all works (so far) it's also, inevitably,
where new development will be.

YMMV.
Michael Grimm
2017-01-19 20:07:39 UTC
Post by PGNet Dev
Just curious -- where are you getting your Softhsm/ODS installs?
DIY?
Distro pkgs?
I am running FreeBSD 11-STABLE. In this OS universe, add-on software is installed either via the ports system or from pre-compiled packages. I opt for ports, because there all software is compiled from source by definition:

https://www.freebsd.org/doc/en_US.ISO8859-1/books/handbook/ports.html
https://www.freebsd.org/cgi/ports.cgi?query=softhsm&stype=all&sektion=all
https://www.freebsd.org/cgi/ports.cgi?query=opendnssec&stype=all&sektion=all

SCNR ;-) and with kind regards,
Michael
PGNet Dev
2017-01-19 20:35:34 UTC
[Post hidden by the archive.]
Yuri Schaeffer
2017-01-20 09:27:01 UTC
Post by Michael Grimm
dns|root> ods-hsmutil -v test SoftHSM
Hmm, this shows that generating new keys is not a problem per se. Can you
send me your kasp.xml?
Post by Michael Grimm
Segmentation fault (core dumped)
Hmmm!? What does that mean? I guess I should be worried.
That is a crash in ods-hsmutil. It should have created a core dump file
(likely named something like ods-hsmutil.core). Perhaps I can extract some
info from it if you send it to me together with the ods-hsmutil executable
from your system.
Post by Michael Grimm
#) would such a database be possible to migrate to softhsm2? Either
by the migration script or manually (export, import)?
If this is indeed a softhsm issue it might work. I'm not involved in the
SoftHSM development but as far as I know SoftHSMv2 includes a
softhsm2-migrate program to do this import for you.
Post by Michael Grimm
#) should I try to trigger a manual ZSK rollover for the erratic domain?
It seems to have trouble generating new keys from the enforcer. So I
don't think that would help you.
Post by Michael Grimm
#) I am already thinking about a worst case scenario: Restarting from
scratch (only 9 domains involved). I have read that it should be
possible to run two opendnssec versions in parallel. Can you confirm
this?
It is perfectly possible to run two instances in parallel. Though you
have to make sure you set all the paths correctly so that config files,
PID files, tmp files etc don't mix.

//Yuri
Michael Grimm
2017-01-20 11:54:00 UTC
Post by Yuri Schaeffer
Post by Michael Grimm
dns|root> ods-hsmutil -v test SoftHSM
Hmm, this shows that generating new keys is not a problem per se. Can you
send me your kasp.xml?
[see separate mail]
Post by Yuri Schaeffer
Post by Michael Grimm
#) would such a database be possible to migrate to softhsm2? Either
by the migration script or manually (export, import)?
If this is indeed a softhsm issue it might work. I'm not involved in the
SoftHSM development but as far as I know SoftHSMv2 includes a
softhsm2-migrate program to do this import for you.
Yes, there is such a tool. I was referring to: wouldn't it be wiser to
export/import every non-problematic domain manually instead?
Post by Yuri Schaeffer
Post by Michael Grimm
#) should I try to trigger a manual ZSK rollover for the erratic domain?
It seems to have trouble generating new keys from the enforcer. So I
don't think that would help you.
Thanks for your clarification.
Post by Yuri Schaeffer
Post by Michael Grimm
#) I am already thinking about a worst case scenario: Restarting from
scratch (only 9 domains involved). I have read that it should be
possible to run two opendnssec versions in parallel. Can you confirm
this?
It is perfectly possible to run two instances in parallel. Though you
have to make sure you set all the paths correctly so that config files,
PID files, tmp files etc don't mix.
Ok, then I will give that a try.

Again, I do really appreciate the help from all of you,
Michael
Havard Eidnes
2017-01-21 11:39:32 UTC
Post by Michael Grimm
This is opendnssec 1.4.12 and FreeBSD 11-STABLE.
| General error
After removing all files in /usr/local/var/opendnssec/signconf and
/usr/local/var/opendnssec/tmp, and restarting opendnssec afterwards,
| ods-enforcerd: Zone example.com found.
| ods-enforcerd: Policy for example.com set to default.
| ods-enforcerd: Config will be output to /usr/local/var/opendnssec/signconf/example.com.xml.
| example.com. keys_to_allocate(1) = keys_needed(1) - (keys_available(1) - keys_pending_retirement(1))
| ods-enforcerd: Tried to allocate 1 keys, failed on allocating key number 1
| ods-enforcerd: ods-enforcerd will create some more keys on its next run
| ods-enforcerd: Error allocating zsks to zone example.com
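The allocation arithmetic in the log line above can be replayed by hand. This is a minimal sketch that only restates the formula printed by ods-enforcerd; the function name is made up for illustration and is not part of OpenDNSSEC.

```python
def keys_to_allocate(keys_needed, keys_available, keys_pending_retirement):
    """Restate the formula from the ods-enforcerd log line."""
    return keys_needed - (keys_available - keys_pending_retirement)


# Values taken from the log line: every count is 1.
print(keys_to_allocate(1, 1, 1))
```

With the values from the log, the one available ZSK is itself pending retirement, so it does not count and one new key has to be allocated. That allocation is the step the enforcer reports as failing.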
I think I've seen a similar problem sometime before. When I was
debugging that problem I added the patch below, which will give
you some more information (but not fix the problem).

If I recall correctly, the problem turned out to be that there
was a key stuck in a "funny state". Ah, yes, found my message
from January 25 last year which started me on this, message-id
is <***@uninett.no>.

This is also related to

https://issues.opendnssec.org/browse/OPENDNSSEC-752

In the middle there I also found that the SoftHSM database had
somehow become owned by root, and since I run OpenDNSSEC as user
"ods" that can't work properly, so that was mended, but
OpenDNSSEC could not fully recover by itself.

In my case, one problematic zone had a key stuck in "generate"
state (only visible with "--all" given to ods-ksmutil, as in
"ods-ksmutil key list -v --all --zone <zone>"), and I deleted it
with

ods-ksmutil key delete --cka_id 15e81adbc4a30ced30cf1bab8cb2b212

At least that worked for one of the two zones I had -- the other
one had a different state.

What I think finally solved this for me was that "ods-ksmutil key
list --verbose --all" found two keys marked "NOT ALLOCATED" and
apparently with no "key tag". Stopping OpenDNSSEC, removing
those two keys with

ods-ksmutil key delete --cka_id 3b929d0ab308b4e1e8bf81abf1e6dafe
ods-ksmutil key delete --cka_id b3c5b3d619c086f41f3f2ed440419f23

and restarting OpenDNSSEC made it work better again.
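
Spotting such stuck keys in a long listing can be scripted. This is a minimal sketch under stated assumptions, not an OpenDNSSEC tool: it assumes each key line of the "ods-ksmutil key list --verbose --all" output starts with three whitespace-separated fields (zone, key type, state), and the function name is made up for illustration. It would not match keys listed as "NOT ALLOCATED", since that label contains a space.

```python
def keys_in_state(listing, state="generate"):
    """Return (zone, keytype) pairs for key lines whose third field is `state`."""
    found = []
    for line in listing.splitlines():
        fields = line.split()
        # Skip headers, blank lines and anything that is not a KSK/ZSK row.
        if len(fields) < 3 or fields[1] not in ("KSK", "ZSK"):
            continue
        if fields[2] == state:
            found.append((fields[0], fields[1]))
    return found
```

The text to feed in would be the captured output of the key list command; the keys reported still have to be reviewed and deleted by hand with "ods-ksmutil key delete --cka_id ...".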

Regards,

- Håvard
Michael Grimm
2017-01-21 12:34:35 UTC
Hi Havard --

Congratulations, I do believe that you solved my problem! Thank you very, very much.

JFTR: I migrated to softhsm2 in the meantime, and that worked out fine running:
dns> ./softhsm2-migrate --db /usr/local/var/softhsm/slot0.db --token OpenDNSSEC
[I had to use "--token" instead of "--slot", dunno why]

But that did not solve my reported issue with example.com.
Post by Havard Eidnes
Post by Michael Grimm
| ods-enforcerd: Zone example.com found.
| ods-enforcerd: Policy for example.com set to default.
| ods-enforcerd: Config will be output to /usr/local/var/opendnssec/signconf/example.com.xml.
| example.com. keys_to_allocate(1) = keys_needed(1) - (keys_available(1) - keys_pending_retirement(1))
| ods-enforcerd: Tried to allocate 1 keys, failed on allocating key number 1
| ods-enforcerd: ods-enforcerd will create some more keys on its next run
| ods-enforcerd: Error allocating zsks to zone example.com
I think I've seen a similar problem sometime before.
[…]
Post by Havard Eidnes
If I recall correctly, the problem turned out to be that there
was a key stuck in a "funny state". Ah, yes, found my message
from January 25 last year which started me on this, message-id
This is also related to
https://issues.opendnssec.org/browse/OPENDNSSEC-752
I did read this thread, and ...
Post by Havard Eidnes
In my case, one problematic zone had a key stuck in "generate"
state (only visible with "--all" given to ods-ksmutil, as in
"ods-ksmutil key list -v --all --zone <zone>"), and I deleted it
with
ods-ksmutil key delete --cka_id 15e81adbc4a30ced30cf1bab8cb2b212
… bingo! I did find two keys in a "generate" state as well.

In my case it turned out to be two KSKs of two different domains, not example.com:

dns> ods-ksmutil key list --verbose --all
Keys:
Zone: Keytype: State: Date of next transition (to): ...
example.tld1 KSK active 2025-12-09 09:21:53 (retire) ...
example.tld1 ZSK active 2017-03-03 18:15:51 (retire) ...
example.tld1 KSK generate (not scheduled) (publish) ...
example.tld2 KSK active 2025-12-10 15:07:05 (retire) ...
example.tld2 ZSK active 2017-03-06 12:16:21 (retire) ...
example.tld2 KSK generate (not scheduled) (publish) ...
example.com KSK active 2026-01-20 12:59:25 (retire) ...
example.com ZSK active 2017-01-16 14:00:07 (retire) ...

Thus, I did remove those two keys as well ...
Post by Havard Eidnes
Stopping OpenDNSSEC, removing those two keys with
ods-ksmutil key delete --cka_id 3b929d0ab308b4e1e8bf81abf1e6dafe
ods-ksmutil key delete --cka_id b3c5b3d619c086f41f3f2ed440419f23
and restarting OpenDNSSEC made it work better again.
… and restarting opendnssec left me with promising log entries ...

| ods-enforcerd: Zone example.com found.
| ods-enforcerd: Policy for example.com set to default.
| ods-enforcerd: Config will be output to /usr/local/var/opendnssec/signconf/example.com.xml.
| ods-enforcerd: WARNING: ZSK rollover for zone 'example.com' not completed as there are no keys in the 'ready' state; ods-enforcerd will try again when it runs next
| ods-enforcerd: Could not call signer engine
| ods-enforcerd: Will continue: call '/usr/local/sbin/ods-signer update example.com' to manually update the zone
| ods-enforcerd: Disconnecting from Database...
| ods-enforcerd: Sleeping for 3600 seconds.

… and promising key list:

dns> ods-ksmutil key list --verbose
example.com KSK active 2026-01-20 12:59:25
example.com ZSK active 2017-01-16 14:00:07
example.com ZSK publish 2017-01-22 02:56:24

If I am not mistaken you did solve my problem. Tomorrow morning I should know, correct?

Many thanks to you and to all the others who helped me solve this issue and taught me so much about the software I am using.

Regards,
Michael
