Job Stuck in "Canceling" Status (with stack trace!)

OK had a job today that was waiting on user feedback because I accidentally made a parameter mandatory when it should’ve been optional.

I provided a value to the parameter when I logged in and saw the problem, and the script finished running but the job stayed in a “running” state.

So I tried cancelling the job, and now it’s in a “canceling” state.

Hoping this stack trace from the Hangfire dashboard helps diagnose the issue! I’m not too concerned about the “stuck” job - I’m assuming it will be groomed after a while anyway.

System.InvalidOperationException
The instance of entity type 'Job' cannot be tracked because another instance with the same key value for {'Id'} is already being tracked. When attaching existing entities, ensure that only one entity instance with a given key value is attached. Consider using 'DbContextOptionsBuilder.EnableSensitiveDataLogging' to see the conflicting key values.

System.InvalidOperationException: The instance of entity type 'Job' cannot be tracked because another instance with the same key value for {'Id'} is already being tracked. When attaching existing entities, ensure that only one entity instance with a given key value is attached. Consider using 'DbContextOptionsBuilder.EnableSensitiveDataLogging' to see the conflicting key values.
   at Microsoft.EntityFrameworkCore.ChangeTracking.Internal.IdentityMap`1.ThrowIdentityConflict(InternalEntityEntry entry)
   at Microsoft.EntityFrameworkCore.ChangeTracking.Internal.IdentityMap`1.Add(TKey key, InternalEntityEntry entry, Boolean updateDuplicate)
   at Microsoft.EntityFrameworkCore.ChangeTracking.Internal.StateManager.StartTracking(InternalEntityEntry entry)
   at Microsoft.EntityFrameworkCore.ChangeTracking.Internal.InternalEntityEntry.SetEntityState(EntityState oldState, EntityState newState, Boolean acceptChanges, Boolean modifyProperties)
   at Microsoft.EntityFrameworkCore.ChangeTracking.EntityEntry.set_State(EntityState value)
   at PowerShellUniversal.SQL.SqlTable`1.Update(T item) in C:\actions-runner\_work\universal\universal\src\PowerShellUniversal.SQL\Database.cs:line 473
   at UniversalAutomation.ExecutionService.Execute(Job job, Schedule schedule) in C:\actions-runner\_work\universal\universal\src\Universal.Server\Services\Automation\ExecutionService.cs:line 268
   at UniversalAutomation.ExecutionService.Execute(Job job, Schedule schedule) in C:\actions-runner\_work\universal\universal\src\Universal.Server\Services\Automation\ExecutionService.cs:line 268
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
Product: PowerShell Universal
Version: 3.2.0
2 Likes

Possibly related, @adam:
Got in this morning to a new job still in “running” status, and its output has this error:

[error] At least one element in the specified array was null. (Parameter 'waitHandles[1]')

There’s no output after that, but I can tell that the job has finished running, so this seems to be something internal to PSU logging this error.

I can’t find anything in the log at the time the script ran. The log is full of unrelated groom service exceptions though!

2022-08-22 06:48:03.397 +10:00 [INF] Starting groom job.
2022-08-22 06:48:03.397 +10:00 [INF] Script Test-Psu.ps1 has a max history of 20
2022-08-22 06:48:03.414 +10:00 [INF] Script has 23 jobs to delete.
2022-08-22 06:48:03.519 +10:00 [ERR] Failed to groom.
Microsoft.EntityFrameworkCore.DbUpdateConcurrencyException: The database operation was expected to affect 1 row(s), but actually affected 0 row(s); data may have been modified or deleted since entities were loaded. See http://go.microsoft.com/fwlink/?LinkId=527962 for information on understanding and handling optimistic concurrency exceptions.
   at Microsoft.EntityFrameworkCore.Update.AffectedCountModificationCommandBatch.ThrowAggregateUpdateConcurrencyException(Int32 commandIndex, Int32 expectedRowsAffected, Int32 rowsAffected)
   at Microsoft.EntityFrameworkCore.Update.AffectedCountModificationCommandBatch.ConsumeResultSetWithoutPropagation(Int32 commandIndex, RelationalDataReader reader)
   at Microsoft.EntityFrameworkCore.Update.AffectedCountModificationCommandBatch.Consume(RelationalDataReader reader)
   at Microsoft.EntityFrameworkCore.Update.ReaderModificationCommandBatch.Execute(IRelationalConnection connection)
   at Microsoft.EntityFrameworkCore.Update.Internal.BatchExecutor.Execute(IEnumerable`1 commandBatches, IRelationalConnection connection)
   at Microsoft.EntityFrameworkCore.ChangeTracking.Internal.StateManager.SaveChanges(IList`1 entriesToSave)
   at Microsoft.EntityFrameworkCore.ChangeTracking.Internal.StateManager.SaveChanges(StateManager stateManager, Boolean acceptAllChangesOnSuccess)
   at Microsoft.EntityFrameworkCore.Storage.ExecutionStrategy.<>c__DisplayClass31_0`2.<Execute>b__0(DbContext context, TState state)
   at Microsoft.EntityFrameworkCore.Storage.ExecutionStrategy.ExecuteImplementation[TState,TResult](Func`3 operation, Func`3 verifySucceeded, TState state)
   at Microsoft.EntityFrameworkCore.Storage.ExecutionStrategy.Execute[TState,TResult](TState state, Func`3 operation, Func`3 verifySucceeded)
   at Microsoft.EntityFrameworkCore.ChangeTracking.Internal.StateManager.SaveChanges(Boolean acceptAllChangesOnSuccess)
   at Microsoft.EntityFrameworkCore.DbContext.SaveChanges(Boolean acceptAllChangesOnSuccess)
   at PowerShellUniversal.SQL.SqlTable`1.Delete(T item) in C:\actions-runner\_work\universal\universal\src\PowerShellUniversal.SQL\Database.cs:line 449
   at UniversalAutomation.GroomService.DeleteJob(Job job) in C:\actions-runner\_work\universal\universal\src\Universal.Server\Services\GroomService.cs:line 164
   at UniversalAutomation.GroomService.ScriptLevelGroom() in C:\actions-runner\_work\universal\universal\src\Universal.Server\Services\GroomService.cs:line 122
   at UniversalAutomation.GroomService.Groom() in C:\actions-runner\_work\universal\universal\src\Universal.Server\Services\GroomService.cs:line 199

Thanks for the update. We’ve seen a couple of these groom errors and trying to knock them down. The problem is some of kinda data specific so it’s been a bit hard to reproduce at times. I’ll open an issue for this.

As for this error, I’ll dig around and see if I can see what is causing that:

[error] At least one element in the specified array was null. (Parameter 'waitHandles[1]')

It certainly seems internal to PSU. It seems like it may be some sort of parallel operation since it’s involving wait handles.

1 Like

I seem to be able to get the “at least one element in the specified array was null” error to appear consistently with a script I just wrote.

The script in question has a begin { }, process { } and end {} block, and the process block is returning objects to the pipeline.

I’ll try refactoring the script so that it batches up the return values in an array and returns them all in one hit in the end block … see if that makes the error go away.

EDIT: Nope, still gets the error.

The script calls Graph so it’s probably not something you can easily run yourself, Adam, but if you want a copy let me know.

It looks similar to the issue I’ve been experiencing

I think I recall having a very simple script in the end, almost literally just Read-Host, and still was having issues, I wiped my database in connection with another issue though, so I don’t have the original runtime properties.

Can you send me the script? I can try to replicate.

1 Like

Still seeing the waitHandles exception every once in a while. Sometimes a job will include the error in its output but still complete successfully, and sometimes the job will be stuck in a “running” status once it happens. (And then when I try to cancel it, it stays in “cancelling” status.)