This week has been full of excitement here at cbtr.net. We had ssis packages that were being executed via a sql jobs. These jobs were taking forever to finish. For all our other servers, there’s a Long Running Job setup via DPA to monitor but since many developer uses SSIS and DPA can’t send alerts to specific group of people, the alert was disabled.
Today, I got a message from a dev telling me that some of his packages were running for a long time, 18 hours long (that’s what she said), and it should of taken < 30 seconds (that’s not what she said).
He tells me that he has stopped the sql jobs but SSIS execution report was still showing [running]. 5 instances of them for that matter with different parameters. I looked at the execution message and nothing was happening. No output of any kind. I ran the following query to see what is active:
select * FROM catalog.executions where status = 2
there were about 15 packages in there, running from 2016/10/10 @ 2:20p yesterday. I tried to kill the package using:
exec catalog.stop_operation @operation_id = 3472251
but all that did was sit there with an sos_handler waittype and one time, it switched to writelog. For an hour, it tired to kill just that one operation. I went ahead and cycled ssis and sql server. Afterward, those operations were still present. After a few more attempts to trying to get to the operations to stop, I ended up having to reboot the entire VM. When the VM came online, all [running] operations were gone and new ones were going thru.
I went and talked to the VM guy. We looked at the guest and looked at the host. the Host was using 50% of memory. Very little CPU activity. The guest itself didn’t have much memory usage nor CPU usage.
Still don’t know why it happened by rebooting the entire server fixed it.