Jobs stuck in queue

mikedhanson · March 28, 2024, 7:18pm

PSU: 4.2.13
Db: sql

I had a process that created a bunch of jobs, and I found that a good number of them got stuck in “queued” status.

mikedhanson · March 28, 2024, 7:20pm

At this point I’m not sure what to do, or why these jobs are showing as queued but not in Hangfire. Could it be that the jobs never got to the Hangfire queue? Could we add a button in the UI to requeue these?

Edit: or even show a queued date/time

alexrgreenwood · April 3, 2024, 2:38pm

Hi Mike - we had the same issue for a while on 4.2.x versions. We are currently on 4.2.7 and the issue does not happen much any more.

It seemed to get worse as the Job table filled up well beyond 50k rows. Pointing the server at a fresh, empty database “fixed” the problem (at the cost of losing job history, the saved licence, secrets, etc.), so I think it is related to the time it takes for the query to execute, and it might improve if you reduce the data in the backend (adjusting the history, for example).

The status value for queued is 0, so you can tidy things up a bit with this SQL:

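-- Preview jobs that have been queued (status = 0) for more than 50 hours
SELECT * FROM [dbo].[Job]
WHERE status = 0 AND CreatedTime < DATEADD(hour, -50, GETDATE());
-- Anything still queued after that long is set to failed (status = 3)
UPDATE [dbo].[Job] SET status = 3
WHERE status = 0 AND CreatedTime < DATEADD(hour, -50, GETDATE());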
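
A slightly more cautious variant of the same cleanup, as a sketch only. It assumes nothing beyond what is in this thread (the [dbo].[Job] table, the status and CreatedTime columns, 0 = queued, 3 = failed) and keeps the same arbitrary 50-hour cutoff. It first reports how full the table is per status, then does the update inside a transaction so the affected row count can be checked before anything is committed. If CreatedTime is stored in UTC on your install, GETUTCDATE() may be the better reference point.

```sql
-- Sketch: table, columns, and status values are the ones quoted in this thread.
DECLARE @cutoff datetime = DATEADD(hour, -50, GETDATE());

-- Rows per status, how many are older than the cutoff, and the oldest row.
SELECT status,
       COUNT(*) AS JobCount,
       SUM(CASE WHEN CreatedTime < @cutoff THEN 1 ELSE 0 END) AS OlderThanCutoff,
       MIN(CreatedTime) AS OldestCreated
FROM [dbo].[Job]
GROUP BY status
ORDER BY status;

BEGIN TRANSACTION;

-- Mark long-queued jobs as failed.
UPDATE [dbo].[Job]
SET status = 3
WHERE status = 0 AND CreatedTime < @cutoff;

SELECT @@ROWCOUNT AS RowsMarkedFailed;

-- Check the count above, then replace ROLLBACK with COMMIT to keep the change.
ROLLBACK TRANSACTION;
```

As written, the script rolls back, so running it unchanged is a dry run.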

PorreKaj · June 21, 2024, 9:21am

Not fixed in 4.3.0 sadly

@adam

mikedhanson · January 28, 2025, 2:36pm

We are still seeing this issue in version 4.5.1.

When you look in hangfire it shows that the job was deleted?

@adam we need you

adam · January 29, 2025, 7:47pm

Are you running these against any specific computers\computer groups or just the default queue?

mikedhanson · January 29, 2025, 8:03pm

Default queue.

One thing I’ve noticed is if I have a node in maintenance, then the jobs go to queued and show up under deleted in hangfire.

Below is a picture of Hangfire showing the queued job in the deleted state and trying to process on the node that is in “maintenance” mode.

adam · January 30, 2025, 9:37pm

Ok. We need to fix this. The problem is that the job shouldn’t be sent at all to a node that is in maintenance mode. And really, if the job is sent to that node, it should be marked failed rather than queued indefinitely. We have a check in place, right before starting a job, that should be doing that.

If Hangfire deletes the job without PSU realizing it, the job will stay queued indefinitely: because it never runs, it never transitions out of that state.
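
One rough way to look for that mismatch from the database side is to compare Hangfire's own job states with the PSU job table. The query below is a sketch that assumes Hangfire is using its default SQL Server storage schema ([HangFire].[Job] with a StateName column) in the same database. That is an assumption about the install, not something confirmed in this thread, so adjust the schema name if yours differs.

```sql
-- Assumption: Hangfire's default SQL Server schema ([HangFire]) is in this database.
-- Counts Hangfire jobs by state, e.g. Deleted, Succeeded, Failed, Enqueued.
SELECT StateName,
       COUNT(*) AS JobCount
FROM [HangFire].[Job]
GROUP BY StateName
ORDER BY JobCount DESC;
```

A large number of rows in the Deleted state alongside PSU jobs that still show as queued would be consistent with the scenario described above.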

Do you see this same behavior when you don’t have machines in maintenance mode?

mikedhanson · January 30, 2025, 10:15pm

Yes

I suspect it has something to do with schedules as well. Did you want to open a support case and do a screen share on our setup or do you have enough information to replicate this?

Here is a sample from schedules.ps1


$Parameters = @{
    Cron       = "0 8 31 JUN,DEC *"
    Script     = "report\myscript.ps1"
    TimeZone   = "America/Chicago"
    Credential = "Default"
    Name       = "Run some script"
    Condition = {
        $Environment -eq 'production'
    }
    Computer   = "ProdPSUNode"
}
New-PSUSchedule @Parameters

adam · January 30, 2025, 10:30pm

Please open a ticket. I likely won’t be able to replicate this easily.
