p2p: Pass proper wait-in-line timeout.
Use TimeDelta::FromSeconds(), not TimeDelta::FromHours(), when passing
a number of seconds. With this change we get the 6 hour timeout that
we want instead of a 6*3600 = 21600 hours = 900 days = roughly 2.5
years timeout. (Ouch.)
Also add a unit test that checks that the right value is passed.
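The unit mix-up can be sketched as follows. This is an illustration
only: it uses std::chrono as a stand-in for TimeDelta, and the names
kMaxWaitInLineSeconds, BuggyTimeout() and FixedTimeout() are
hypothetical, not identifiers from this change.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical constant: a raw count of seconds (6 hours).
constexpr int kMaxWaitInLineSeconds = 6 * 60 * 60;  // 21600

// Buggy interpretation: the count of seconds is treated as hours.
constexpr std::chrono::seconds BuggyTimeout() {
  return std::chrono::hours(kMaxWaitInLineSeconds);
}

// Fixed interpretation: the count of seconds is treated as seconds.
constexpr std::chrono::seconds FixedTimeout() {
  return std::chrono::seconds(kMaxWaitInLineSeconds);
}

// The fixed timeout is the intended 6 hours.
static_assert(FixedTimeout() == std::chrono::hours(6),
              "fixed timeout should be 6 hours");

// The buggy timeout is 21600 hours, i.e. 900 days.
static_assert(BuggyTimeout() == std::chrono::hours(24 * 900),
              "buggy timeout should be 900 days");
```

A check of this shape compares the resulting duration rather than the
raw integer, which is what catches a wrong unit.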
Curiously enough, even with this bug we were still seeing 4.78%
cancellations in the P2P.Client.LookupResult metric, along with
corresponding samples in the P2P.Client.Canceled.WaitingTimeSeconds
metric. For the latter metric, 60+% of all observations are at the 5
second mark, 98% of all observations are within 30 minutes, and
the max observation is roughly one hour.
Since this can only happen when the p2p-client process is handling the
SIGTERM signal and update_engine never sends it, one guess is that it
happens when the system is shutting down and the init system sends
SIGTERM to all processes (and then SIGKILL after a while if they're
still around).
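As a rough illustration of the cancellation path described above, a
wait loop that gives up on SIGTERM could look like the sketch below.
This is not the p2p-client code: WaitInLine(), LookupResult,
WaitOutcome and the polling approach are all hypothetical, and the
clock used here is chosen for illustration only.

```cpp
#include <cassert>
#include <chrono>
#include <csignal>
#include <thread>

// Set from the signal handler; sig_atomic_t is safe to write there.
volatile std::sig_atomic_t g_got_sigterm = 0;

extern "C" void OnSigterm(int) { g_got_sigterm = 1; }

// Hypothetical result type for the sketch.
enum class LookupResult { kFound, kCanceled };

struct WaitOutcome {
  LookupResult result;
  long long waited_seconds;  // Time spent waiting before returning.
};

// Polls ready() until it returns true or SIGTERM has been received.
template <typename Ready>
WaitOutcome WaitInLine(Ready ready) {
  std::signal(SIGTERM, OnSigterm);
  const auto start = std::chrono::steady_clock::now();
  auto elapsed = [&start]() {
    return std::chrono::duration_cast<std::chrono::seconds>(
               std::chrono::steady_clock::now() - start)
        .count();
  };
  while (!ready()) {
    if (g_got_sigterm) {
      // Canceled: report how long we had been waiting.
      return {LookupResult::kCanceled, elapsed()};
    }
    std::this_thread::sleep_for(std::chrono::milliseconds(10));
  }
  return {LookupResult::kFound, elapsed()};
}
```

In this sketch a canceled result can only be produced after the
handler has run, which mirrors the reasoning that cancellations imply
something sent SIGTERM to the process.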
This hypothesis, however, does not directly explain the spike at the 5
second mark, nor the distribution. Another bug in p2p-client, namely
that we report monotonic time, not wall-clock time, might explain
this. See CL:182700 for further information.
BUG=None
TEST=New unit test + Unit tests pass.
Change-Id: I29ff16c5434ab68cb9a5a314f29f5154982fe0e1
Reviewed-on: https://chromium-review.googlesource.com/182710
Reviewed-by: Don Garrett <dgarrett@chromium.org>
Commit-Queue: David Zeuthen <zeuthen@chromium.org>
Tested-by: David Zeuthen <zeuthen@chromium.org>
2 files changed