Discussion:
[Opendnssec-user] RE :About Broken pipe (version 1.4.7 & 1.4.9)
yaohongyuan
2016-04-18 06:35:10 UTC
Hi all,
Last week I reported an issue about "ods-signerd thread abnormal running". After Yuri's reply I upgraded OpenDNSSEC in my test environment to 1.4.9, but after 3 days of testing it still does not work.
The signerd thread still disappears, and I tend to think this is a major issue.
CPU : 14
Mem : 128G
General load average: 5.50, 4.43, 4.04
Zones : 20
Per zone RR count : 660,000
Total zone RR count : 13,200,000
Per zone RRset increasing speed : 1000/1h/zone
opendnssec version : 1.4.9 (1.4.7 last week)
This machine runs only 2 BIND instances and OpenDNSSEC. Total memory usage is less than 30G.
I don't know why I always get the error "wire/notify.c:477: notify_handle_zone: assertion notify->handler.fd == -1 failed".
Has anybody else run into this? How did you solve it?
Mar 30 13:51:39 p01-test-devops-9-81 ods-signerd: [socket] unable to handle outgoing tcp response: write() failed (Broken pipe)
Mar 30 13:53:23 p01-test-devops-9-81 ods-signerd: [socket] unable to handle outgoing tcp response: write() failed (Broken pipe)
Mar 30 13:53:40 p01-test-devops-9-81 ods-signerd: [socket] unable to handle outgoing tcp response: write() failed (Broken pipe)
Mar 30 13:54:41 p01-test-devops-9-81 ods-signerd: [socket] unable to handle outgoing tcp response: write() failed (Broken pipe)
Mar 30 13:54:54 p01-test-devops-9-81 ods-signerd: [xfrd] zone testzone9 cannot tcp write to 192.168.1.110: Broken pipe
Mar 30 13:54:54 p01-test-devops-9-81 ods-signerd: [xfrd] zone testzone8 cannot tcp write to 192.168.1.110: Broken pipe
Mar 30 13:54:54 p01-test-devops-9-81 ods-signerd: [xfrd] zone testzone6 cannot tcp write to 192.168.1.110: Broken pipe
Mar 30 13:54:54 p01-test-devops-9-81 ods-signerd: [xfrd] zone testzone2 cannot tcp write to 192.168.1.110: Broken pipe
...
Mar 30 19:03:14 p01-test-devops-9-81 ods-signerd: [xfrd] zone testzone7 cannot tcp write to 192.168.1.110: Broken pipe
Mar 30 19:03:14 p01-test-devops-9-81 ods-signerd: [xfrd] zone testzone6 cannot tcp write to 192.168.1.110: Broken pipe
Mar 30 19:25:55 p01-test-devops-9-81 ods-signerd: [STATS] testzone20 2015126051 RR[count=44 time=0(sec)] NSEC3[count=6 time=0(sec)] RRSIG[new=10 reused=172846 time=2(sec) avg=5(sig/sec)] TOTAL[time=8(sec)]
Mar 30 19:25:55 p01-test-devops-9-81 ods-signerd: [worker[4]] read zone testzone8
Mar 30 19:25:55 p01-test-devops-9-81 ods-signerd: [xfrd] zone testzone8 transfer done [notify acquired 1459337138, serial on disk 2015112767, notify serial 2015112767]
Mar 30 19:25:55 p01-test-devops-9-81 ods-signerd: [xfrd] zone testzone8 reset notify acquired
Mar 30 19:25:55 p01-test-devops-9-81 ods-signerd: [xfrd] tcp read xfr: release connection
Mar 30 19:25:55 p01-test-devops-9-81 ods-signerd: wire/notify.c:477: notify_handle_zone: assertion notify->handler.fd == -1 failed
From the above messages we can see that the signerd thread only ran for 6.5 hours.
Could anybody please help me fix this issue?
With kind regards.
Dean
Hi all ,
Last week we made some changes to the source at wire/notify.c:477, and they solved the above problem. The change is as follows:
Base source version : 1.4.8
Before :
    if (notify->is_waiting) {
        ods_log_debug("[%s] already waiting, skipping notify for zone %s", notify_str, zone->name);
        ods_log_assert(notify->handler.fd == -1);
        return;
    }
After :
    if (notify->is_waiting) {
        ods_log_debug("[%s] already waiting, skipping notify for zone %s", notify_str, zone->name);
        if (notify->handler.fd > 0) {
            close(notify->handler.fd);
            notify->handler.fd = -1;
        }
        return;
    }


I monitored the handle count under the ods-signerd thread for a week and did not find anything abnormal.
The total handle count remained at around 1500.
I hope to get some suggestions about the change.


With kind regards.
Dean
Yuri Schaeffer
2016-04-18 07:58:54 UTC
Hi Dean,
Post by yaohongyuan
Last week we do some changes with source wire/notify.c:477 and
Thanks for your patch. We will take a look at it and if we agree with
you we'll merge it in. Meanwhile be advised that 0 is also a valid file
descriptor.

So you will probably want to do

- if (notify->handler.fd > 0) {
+ if (notify->handler.fd >= 0) {
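
To illustrate why the comparison should be ">= 0", here is a minimal standalone sketch. It is not OpenDNSSEC code: close_if_open() and lowest_fd_after_closing_stdin() are hypothetical helpers made up for this example, and /dev/null is assumed to exist. POSIX open() returns the lowest unused descriptor, so once descriptor 0 has been closed the next open() can hand it out again, and a "> 0" check would then leak it.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of OpenDNSSEC): close *fd if it holds a
 * valid descriptor and mark it as closed with -1.
 * Returns 1 if a descriptor was closed, 0 if there was nothing to close.
 */
static int close_if_open(int *fd)
{
    assert(fd != NULL);
    if (*fd >= 0) {          /* 0 is a valid descriptor, so use >= 0 */
        close(*fd);
        *fd = -1;
        return 1;
    }
    return 0;
}

/*
 * Close descriptor 0 and open /dev/null. Because open() returns the
 * lowest unused descriptor, the result is 0 when the open succeeds.
 */
static int lowest_fd_after_closing_stdin(void)
{
    close(STDIN_FILENO);
    return open("/dev/null", O_RDONLY);
}
```

In a daemon that closes its standard descriptors at startup, a socket created later could plausibly end up as descriptor 0, which is the case the ">= 0" comparison covers.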

Best regards,
Yuri
