PSU v4 to v5 issues

I’ve been running PSU v4 for a while in my current org, and it’s been an awesome success story that has helped us meet our audit and compliance requirements for access reviews, something we had failed to manage before. What I’ve been able to achieve with this tool has been amazing, and an extremely expensive off-the-shelf IAM solution that was purchased before I joined has not been able to compete, given the bespoke nature of our environment and all the technical debt there is to unpick. Honestly, it’s got us out of some pretty hot water!

I’ve been running my solution on a single server until recently; now I’m looking to get a dev box up and running (we’re putting in a request for 2 more licenses).

I figured I may as well get the dev server on v5.
I installed v5 and used the same repository codebase from v4. Here’s my initial feedback and struggles:

  • Errors were thrown for my variables because the -Secret parameter no longer exists. (Maybe it’s worth keeping a deprecated, non-functional parameter for backward compatibility here?)
  • I noticed that when adding new secrets, variables.ps1 was updated with the -Secret parameter, which I’d then also have to remove to prevent the error.
  • Variables saved to the database: I expected these to show as $null after a fresh install of v5 (only the repository folder was cloned from my v4 instance, and the local DB was freshly generated). However, none of the variable values persist. If I edit them and update the value, then restart the service, they show as $null again each time. I figured maybe I just needed to delete the variable and re-add it via the GUI, but I tried this too and the value still shows as $null after restarting the service.
  • My dashboard won’t load. My first action was to click Edit in the app and go to the Log tab. It seems to take quite a long time to load compared to v4, which was pretty much instant. In fact, I thought it hadn’t worked at first and tried clicking other tabs, none of which responded. It just sort of stalled, then threw a JavaScript error. The second time through I was more patient and the Log tab eventually loaded. (Maybe a pre-loader or some feedback to show it’s still doing something would help.) Navigating away from this page also seemed to stall, but once loaded it was responsive again between other pages.
    At the moment I figure the reason my dashboard won’t load is mainly to do with my variable issues, so I’ll wait to get this sorted before troubleshooting further.
  • I’ve had similar lag when trying to update values in variables: I hit OK and nothing happens. After about 20 seconds it eventually closes the modal and updates the variable with a success toast. (It doesn’t always do this. I noticed it after restarting the PSU service; when I went on to edit another variable afterwards, it was responsive again.)
  • Similar lag when updating Settings\Platform and hitting Save (I added proxy details): no feedback from the button, then around 10 to 20 seconds later I got a ‘saved’ toast.

Any ideas? I think the GUI delays and the dashboard not loading can be ignored for now until I figure out what’s happening with the variables.

TLDR & some more info on my environment:
Clean Windows Server VM, with the v5.0.0-rc5 MSI installed as a service.
Repository folder copied from my v4 instance.
OIDC auth works perfectly, no issues there!
Attempted to manually update variable values in the database but can’t get this to work and can’t progress further; database variable values are just set to $null each time the PSU service restarts.

Really appreciate the effort you put in to try out v5.

The particular issue with the -Secret is resolved and will be in the final release.

As for the database-saved variables, I’m not sure, but we may need to open an issue for this.

I’d agree on holding off on debugging the dashboard until we have the variables issue resolved. I did have another user mention they had a 100+ page dashboard load in v5 without issue, so your mileage may vary. We changed little in the dashboard framework in terms of breaking changes, only some additions here and there.

The lag you are seeing is likely due to how we save items in Blazor, and we need to address that. In v4, certain things were asynchronous, so that 20-second delay was likely happening on the server without affecting the UI. This was a better user experience, obviously, but it also meant that we could run into race conditions if you changed things too quickly. We should certainly open an issue for this and look at both speeding up the saving process and avoiding blocking the UI.

Thanks for the response, and I’m happy to perform any more testing as required or if you need anything else, please let me know!

Hey @adam
I can confirm this issue on 5.0.0 GA as well.
Variables saved in the DB show with value $null (also in the DB, so it’s not just a visual thing in the GUI), immediately after hitting F5 or changing Function in the Admin GUI and going back to the variables section.

Do we need to open an issue for that? This is a major problem I think…

Another related issue is that even deleting the variables or secrets from the UI does not really delete them in the DB table. This should also be addressed to keep things clean.

Thanks for your efforts!!! :smiley:

Best regards,
Don

This has been resolved today and will be in the nightly 5.0.1 builds. We will be doing a patch release next week sometime while we fix a couple more issues and collect any other feedback.

This actually was an issue with the configuration system itself so you may see some other oddities until you have the patch installed.

I’ve been testing this out briefly; upgrading went without a hitch.

After the upgrade we got errors related to Published Folders, with a mandatory -Name parameter missing. I added that parameter manually, which solved the issue.

On the automation side we’re running into issues with our scripts. Firstly, we have scripts that are failing without any messages or output (I need to investigate those). We also have a trigger script that collects metadata about our scripts and reports it to our ITSM system, and it is having issues with cmdlets like Get-PSUJob, Get-PSUJobOutput, and Get-PSUJobPipelineOutput. (The script is running in the integrated environment.)

The error message suggests I need to specify a computer name or use Connect-PSUServer. Specifying the URL of the server (specifying just the computer name causes it to attempt to connect to port 5000 over HTTP) causes an SSLException:

[8/22/2024 8:51:44 AM] Status(StatusCode="Internal", Detail="Error starting gRPC call. HttpRequestException: The SSL connection could not be established, see inner exception. AuthenticationException: The remote certificate is invalid according to the validation procedure: RemoteCertificateNameMismatch", DebugException="System.Net.Http.HttpRequestException: The SSL connection could not be established, see inner exception.")
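
Since the message above points at nested inner exceptions, a small helper can print the whole chain from a catch block. This is only a minimal sketch: Get-ExceptionChain is a hypothetical helper name (not a PSU cmdlet), and the exception thrown here merely stands in for a failing call such as Get-PSUJob.

```powershell
# Hypothetical helper (not part of PSU): walks the InnerException links
# and emits one "TypeName: Message" line per level.
function Get-ExceptionChain {
    param(
        [Parameter(Mandatory)]
        [System.Exception] $Exception
    )
    $current = $Exception
    while ($null -ne $current) {
        '{0}: {1}' -f $current.GetType().Name, $current.Message
        $current = $current.InnerException
    }
}

try {
    # Stand-in for the real failing call; swap in e.g. Get-PSUJob when testing.
    $inner = [System.Security.Authentication.AuthenticationException]::new('RemoteCertificateNameMismatch')
    throw [System.InvalidOperationException]::new('The SSL connection could not be established', $inner)
}
catch {
    # $_.Exception is the outermost exception of the caught error record.
    $chain = @(Get-ExceptionChain -Exception $_.Exception)
    $chain
}
```

Printing every level makes it easier to tell a certificate problem apart from an authentication problem when the outer message is generic.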

Oh, and the log output in jobs seems to be sorted (by timestamp) causing the messages within the same timestamp to appear out of order/jumbled.
Edit: Actually, looking closer, maybe it just shows the newest message first? That fooled me for a second :slight_smile:
Edit 2: Actually, it does appear to be ordered in a way that causes the output to be shown in the wrong order. The entire script basically executes within the same second, and each time I run it the messages can appear in a different order (the order for a given job does seem to be persistent though, so maybe it’s an issue with the SQL?)
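
To illustrate the suspicion above: when every record shares the same second-resolution timestamp, sorting by timestamp alone cannot recover the original order, so some tie-breaker is needed. The sketch below uses made-up records and a hypothetical Sequence field; it says nothing about how PSU actually stores or queries job output.

```powershell
# Made-up log records that all share one second-resolution timestamp.
# "Sequence" is a hypothetical tie-breaker field, not a known PSU column.
$stamp = [datetime]'2024-08-22T08:51:44'
$records = @(
    [pscustomobject]@{ Timestamp = $stamp; Sequence = 3; Message = 'Done' }
    [pscustomobject]@{ Timestamp = $stamp; Sequence = 1; Message = 'Starting' }
    [pscustomobject]@{ Timestamp = $stamp; Sequence = 2; Message = 'Working' }
)

# Timestamp alone leaves these three records tied;
# adding Sequence as a second sort key restores the write order.
$ordered = $records | Sort-Object -Property Timestamp, Sequence
$ordered.Message
```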

Edit 3: The scripts that were failing without output seem to run once the credential is cleared (in scripts.ps1; the UI doesn’t allow a blank credential). There are a bunch of other problems I need to suss out, because the scripts need that security context to run :slightly_smiling_face:

Do you have HTTPS configured for your PSU server? The cmdlets now use the forward facing gRPC channel for all communication and it seems like it’s not trusting the cert for some reason. We could add some configuration to work around this but would probably need to determine if that’s the actual issue.

I’ll open an issue for this. I’ve noticed this as well. Not sure if it’s a UI thing or something else.

Yeah, we configured HTTPS.

Notably, our certificates are issued to the computer name (i.e. CN=FQDN), but Universal is hosted on a different FQDN, which is also in the certificate’s SAN, so our configuration for Kestrel looks a bit like this:

{
  "Kestrel": {
    "Endpoints": {
      "HTTP": {
        "Url": "http://fqdn.example.org:80"
      },
      "HTTPS": {
        "Url": "https://fqdn.example.org:443",
        "Certificate": {
          "Subject": "computername.example.com",
          "Store": "My",
          "Location": "LocalMachine",
          "AllowInvalid": "false"
        }
      }
    },
    "RedirectToHttps": "true"
  }

  ....
}

The certificate looks something like this:

Subject: CN=computername.example.com
SAN: 
  DNS-Name=computername.example.com
  DNS-Name=fqdn.example.org

I’m off work for today, but tomorrow I’ll see if I can issue a different certificate, or address Universal using the computer name FQDN instead, and report back.

An update on the above: changing the certificate to one whose CN matches the URL used to access Universal solved the remote certificate mismatch.
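
For anyone comparing names by hand, here is a rough sketch of checking whether the host in a URL appears among a certificate’s DNS names. Test-HostInCertNames is a hypothetical helper, and the names are the example values from the post above. It only compares name lists (including simple single-label wildcards); it does not reproduce whatever validation the PSU gRPC client performs, which evidently rejected a name that was present in the SAN.

```powershell
# Hypothetical helper (not part of PSU): returns $true when $HostName matches
# one of the supplied DNS names, either exactly or via a "*.domain" wildcard
# that covers exactly one extra label.
function Test-HostInCertNames {
    param(
        [Parameter(Mandatory)] [string]   $HostName,
        [Parameter(Mandatory)] [string[]] $DnsNames
    )
    foreach ($name in $DnsNames) {
        # -eq compares strings case-insensitively in PowerShell.
        if ($name -eq $HostName) { return $true }
        if ($name.StartsWith('*.')) {
            $suffix = $name.Substring(1)   # e.g. ".example.org"
            if ($HostName.EndsWith($suffix, [System.StringComparison]::OrdinalIgnoreCase)) {
                $label = $HostName.Substring(0, $HostName.Length - $suffix.Length)
                if ($label.Length -gt 0 -and -not $label.Contains('.')) { return $true }
            }
        }
    }
    return $false
}

# Example values taken from the certificate described earlier in the thread.
$names = 'computername.example.com', 'fqdn.example.org'
Test-HostInCertNames -HostName ([uri]'https://fqdn.example.org').Host -DnsNames $names
```

On Windows, the names of an installed certificate should be readable from the DnsNameList property of the items under Cert:\LocalMachine\My, which could be fed into -DnsNames instead of the hard-coded list.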

I’m still unable to get any output, either using Get-PSUJob on its own or by specifying the URL. I assume I could use Connect-PSUServer with an app token.

I also noticed you had already uploaded 5.0.1, which has a change related to gRPC, so I upgraded to it just to test whether that solved it. Maybe it solved the Subject vs. SAN issue, but I had already changed the certificate.

Is there a different error you are seeing when you call Get-PSUJob?

Yes, the error is just that it’s unauthenticated:

Status(StatusCode="Unauthenticated", Detail="Bad gRPC response. HTTP status code: 401")

(Same after updating to 5.0.1, I didn’t test with Connect-PSUServer yet and supplying an AppToken)

The cmdlet authorization is more rigid in v5. You could be running into that if they are scheduled.

This is for a trigger script for handling Failed/Completed jobs. But I can definitely start using an App Token if it’s necessary.

@adam I’m running into the exact same gRPC errors about a lack of authentication. Our PSU instance is also configured to use HTTPS, and its certificate is a public cert issued by Let’s Encrypt. Please advise so I don’t need to roll back to v4.

I’ve tried using the API settings below inside the appsettings.json file:

  "Api": {
    "OpenAPI": {
      "Name": "Endpoints",
      "Description": "Endpoints defined within the PowerShell Universal admin console.",
      "Url": "v2",
      "Version": "v2"
    },
    "Url": "https://<our-psu-fqdn>/psu",
    "GrpcPort": 0
  },

but this made no difference. I’m not sure if I’m setting the parameter(s) correctly, though.

Edit: I had to restore my snapshot of the host to get 4.3.4 working again, as something 5.0.4 did prevents it from working. I’m not sure what, but I was unable to authenticate even with local creds after reinstalling 4.3.4, so I’m guessing it touched the DB and did something 4.3.4 doesn’t work with.

I ended up getting it working by creating an app token for a user (i.e. admin with the Administrator role; less might work, but I haven’t had much time to test) and using the command Connect-PSUServer -AppToken '...' -ComputerName 'https://psufqdn.example.org', with the computer name being a URL to your instance of PSU.

It works fine after that, but I still think it’s a bit weird that you need to assume a specific identity when it’s for the server itself.

Where are you referencing the token at? I got the error about gRPC when running a script that launches another script via Invoke-PSUScript and passes some parameters to it.

Our instance of PSU has a Let’s Encrypt cert, so I’m under the impression that part of the error is due to the processes now trying to access the server on its internal hostname and getting a cert error due to the name not matching what the cert says it’s for.

The App Token can be generated in the application tokens section (Security → Tokens), I used admin as the identity and administrator as the role, and set the expiration to never. (Again, just for testing, actual production values could probably be stricter)

The token is a JSON Web Token, a base64url-encoded string that starts like ey[... and a bunch of gibberish]

I use -ComputerName followed by a URL with the external hostname; as long as it’s resolvable by the server itself, that works just fine here. We do use an internal CA instead of Let’s Encrypt, but that should have no bearing on it as long as the certificates used are trusted in your environment.

So when I put that Connect-PSUServer cmdlet with those 2 parameters in before any cmdlets that require the new authentication, in my case Get-PSUJob, Get-PSUJobOutput and Get-PSUJobPipelineOutput, then those cmdlets simply work as before without any changes to syntax.

I hope that helps :slightly_smiling_face:

Edit: To give an example, my scripts used to look like this:

$JobOutput = Get-PSUJobOutput -JobId $Job.Id
$JobPipelineOutput = Get-PSUJobPipelineOutput -JobId $Job.Id
$PSUJob = Get-PSUJob -Id $Job.Id

Now the script looks like this:

Connect-PSUServer -AppToken 'ey...' -ComputerName 'https://mypsu.example.org'
$JobOutput = Get-PSUJobOutput -JobId $Job.Id
$JobPipelineOutput = Get-PSUJobPipelineOutput -JobId $Job.Id
$PSUJob = Get-PSUJob -Id $Job.Id
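
To avoid pasting the token into every script, the same calls could be wrapped in a small function that takes the token and URL as parameters. This is a sketch under assumptions: Get-PSUJobDetail is a hypothetical name, and the cmdlets and parameters are used exactly as in the example above, with nothing else assumed about their behaviour.

```powershell
# Hypothetical wrapper around the cmdlets shown above. It connects first,
# then gathers the job, its output, and its pipeline output in one object.
function Get-PSUJobDetail {
    param(
        [Parameter(Mandatory)] [string] $AppToken,
        [Parameter(Mandatory)] [string] $ServerUrl,
        [Parameter(Mandatory)] $JobId
    )

    # Connect before calling any cmdlet that needs authentication in v5.
    # Out-Null keeps anything the cmdlet may emit out of the function's result.
    Connect-PSUServer -AppToken $AppToken -ComputerName $ServerUrl | Out-Null

    [pscustomobject]@{
        Job            = Get-PSUJob -Id $JobId
        Output         = Get-PSUJobOutput -JobId $JobId
        PipelineOutput = Get-PSUJobPipelineOutput -JobId $JobId
    }
}
```

Inside a trigger script this might be called as Get-PSUJobDetail -AppToken $Secret:PsuAppToken -ServerUrl 'https://mypsu.example.org' -JobId $Job.Id, assuming the token is stored in a PSU secret variable (PsuAppToken is a made-up name) and that secrets are exposed through the $Secret: scope in your version.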

I am having the same issue.

I simply do:
$script=Invoke-PSUScript -Script scriptvars.ps1 -Wait

This worked in v4.

In v5 it throws an error:

This is my dev instance and I am using a self-signed certificate with FQDN of the server.

What version of PSU did you install? This was fixed for me in the 5.0.2 release

Thanks. I have lots of tokens already but wasn’t sure where to use one here. It seems like more of a workaround or hack than having the PSU processes and agents pin their own trusted certs for internal communications.