Mail Delivery Background Jobs

Only 8 years into running this product and I still learn something new about it.

On Monday there was an event. Two nodes became unresponsive at about the same time. The other ten nodes did their jobs and transferred session information to the nodes taking on the sessions. Most were so busy they did not respond to monitor requests. There was lots of slowness. But we did not lose sessions. Nor did we lose the cluster.

Somehow we did lose the Mail tool. (Think internal email, but it can forward messages to email.)

In WebCT Vista 3 we diagnosed this by going to WebLogic, finding the email queues, and restarting some things so email would start flowing again. I was not able to find it that way this time. Apparently now we go to Background Jobs as a server administrator. The waiting mail jobs show up in the Pending Jobs view.

Once I restarted the cluster, the blocking mail job was changed to Retried as soon as the JMS node came online. Retried only shows up in the All Jobs view; none of the other views show it. That makes sense, because each view shows only the status it is named for. So the Cancelled Jobs view only shows jobs with the Cancelled status. Any jobs with a Retried status would only show in the (non-existent) Retried Jobs and (existing) All Jobs views. It was a bad assumption on my part that all potential statuses have a view.

Hindsight being 20/20, what we need is a Nagios monitor to detect if Pending jobs exceed maybe 20-50 jobs. Normally this table appears empty. But I could see cases where it normally grows fast and then quickly clears.
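
Roughly what I have in mind, as a sketch and not something we run today. It follows the usual Nagios plugin convention of exit codes 0 (OK), 1 (WARNING), 2 (CRITICAL), and 3 (UNKNOWN). Treating 20 as the warning threshold and 50 as the critical one is my assumption, as are the function names. How the pending count gets pulled out of the product is left out entirely; here it is simply handed in as an argument.

```python
# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def evaluate(pending, warn=20, crit=50):
    """Map a pending-job count to an (exit_code, message) pair.

    The 20/50 defaults are assumed thresholds, not tested values.
    """
    if pending < 0:
        return UNKNOWN, f"UNKNOWN - invalid pending job count: {pending}"
    if pending >= crit:
        state, label = CRITICAL, "CRITICAL"
    elif pending >= warn:
        state, label = WARNING, "WARNING"
    else:
        state, label = OK, "OK"
    # Text after '|' is Nagios performance data: label=value;warn;crit
    message = f"{label} - {pending} pending jobs | pending={pending};{warn};{crit}"
    return state, message


def main(argv):
    """Hypothetical entry point: argv[1] carries the pending-job count."""
    try:
        pending = int(argv[1])
    except (IndexError, ValueError):
        print("UNKNOWN - usage: check_pending_jobs <pending_count>")
        return UNKNOWN
    state, message = evaluate(pending)
    print(message)
    return state
```

To run it as an actual plugin, the script would end with sys.exit(main(sys.argv)) so Nagios sees the exit code. For the case where the table grows fast and then quickly clears, the max_check_attempts and retry_interval settings on the Nagios service should keep a short spike from paging anyone.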

But then again, we have less than a year left on this product. What are the odds this will happen again?

from Rants, Raves, and Rhetoric v4
