Guys, I'd like to have SIM continuously monitor my servers' ping response. To do so, I do the following:
- Create a dynamic group containing all devices identified as servers
- Create a hardware status monitoring job, using only the ping test, that runs every 15 mins
Now, this works fine, but it will not fire a new alert on each poll interval while the system stays unreachable. The reason is as follows:
- The polling task sends its pings on every run, regardless of the target's status in SIM. When the server is unreachable, the ping fails, an event is logged against the server (system is unreachable), and both the overall status of the server and the ping sub-status are set to critical. However, the next time the polling task runs, another 'system is not reachable' event will not be logged against the system unless the ping status is healthy at that point. In other words, until the polling task successfully pings the server, logs a 'system is reachable' event and sets the status back to healthy, the 'system is not reachable' event will not re-fire.
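To make the behaviour above concrete, here is a toy model of it in Python. This is only a sketch of what I observe, not SIM's actual code: the class name PingMonitor, its poll() method and the reset() method are all made up for illustration, and reset() stands in for the manual reset I am asking about.

```python
from dataclasses import dataclass, field

HEALTHY = "healthy"
CRITICAL = "critical"


@dataclass
class PingMonitor:
    """Toy model of the observed polling behaviour (hypothetical, not SIM code)."""
    status: str = HEALTHY
    events: list = field(default_factory=list)

    def poll(self, ping_ok: bool) -> None:
        """Record one polling run; an event is logged only when the status changes."""
        new_status = HEALTHY if ping_ok else CRITICAL
        if new_status != self.status:
            self.events.append(
                "system is reachable" if ping_ok else "system is not reachable"
            )
        self.status = new_status

    def reset(self) -> None:
        """Hypothetical manual reset: force the stored ping status back to healthy."""
        self.status = HEALTHY


monitor = PingMonitor()
for ping_ok in (True, False, False, False):
    monitor.poll(ping_ok)

# Three failed polls in a row, but the event list holds a single entry.
print(monitor.events)
```

In this model, calling monitor.reset() before the next failed poll would make the 'system is not reachable' event fire again, which is the effect I'm after.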
Now, for me, this is a failing of the software. With batch runs, AV, etc. running at night, it's easy for a server to miss a ping. Also, with so many alerts from our various software each night, it's also easy to miss a server that genuinely went offline overnight. So, is there any way to manually reset the health status of a server, or at least its ping health status? That way the polling would re-fire the 'system is not reachable' event and I could alert on it ;)