initial commit
@@ -0,0 +1,299 @@
<!-- Generated by SpecStory -->

# Replace RemoteCMD with RemoteSimplified (2025-05-23 23:12Z)

_**User**_

for remote simplified and standard, let's make it so when they fail for whatever reason, it restarts at the command itself. so if there is an error, it just goes back to Start-SimplifiedWorker, for example

---

_**Assistant**_

I'll modify the code so that both the simplified and standard worker commands restart themselves when they fail.

Here is what I'm going to do:
1. For both worker types, change the retry logic so that it restarts the worker command itself rather than only retrying the SSH connection
2. Keep the existing retry count limit, but make the messages clear that the worker command is being restarted
3. Leave all other functionality unchanged

I'll edit the file now:

---

```diff
      )

-     Write-Host "Connecting to $($Worker.Name)..." -ForegroundColor Cyan
-
-     try {
-         $remoteCommand = Get-RemoteStandardWorkerCommand
-
-         # Encode the command to handle special characters
-         $bytes = [System.Text.Encoding]::Unicode.GetBytes($remoteCommand)
-         $encodedCommand = [Convert]::ToBase64String($bytes)
-
-         # Execute the encoded command on the remote machine
-         Write-Host "Connecting to $($Worker.Name) and executing worker script..." -ForegroundColor Yellow
-
-         # Add SSH keepalive settings to reduce chance of random disconnections
-         $sshCommand = "ssh -o ServerAliveInterval=60 -o ServerAliveCountMax=30 $($Worker.SSHArgs) ""powershell -EncodedCommand $encodedCommand"""
-
-         # Execute the SSH command
-         Invoke-Expression $sshCommand
-     }
-     catch {
-         Write-Host "`nAn error occurred while connecting to $($Worker.Name):" -ForegroundColor Red
-         Write-Host $_.Exception.Message -ForegroundColor Red
+     $maxRetries = 5
+     $retryCount = 0
+     $retryDelay = 5 # seconds between retries
+     $workerRestarted = $false
+
+     while ($retryCount -lt $maxRetries) {
+         if ($retryCount -gt 0) {
+             Write-Host "`nRestarting worker process (Attempt $($retryCount + 1) of $maxRetries)..." -ForegroundColor Yellow
+             Start-Sleep -Seconds $retryDelay
+             $workerRestarted = $true
+         }
+
+         Write-Host "Connecting to $($Worker.Name)..." -ForegroundColor Cyan
+         if ($workerRestarted) {
+             Write-Host "Worker was restarted due to disconnection or crash." -ForegroundColor Yellow
+         }
+
+         try {
+             $remoteCommand = Get-RemoteStandardWorkerCommand
+
+             # Encode the command to handle special characters
+             $bytes = [System.Text.Encoding]::Unicode.GetBytes($remoteCommand)
+             $encodedCommand = [Convert]::ToBase64String($bytes)
+
+             # Execute the encoded command on the remote machine
+             Write-Host "Connecting to $($Worker.Name) and executing worker script..." -ForegroundColor Yellow
+
+             # Add SSH keepalive settings to reduce chance of random disconnections
+             $sshCommand = "ssh -o ServerAliveInterval=60 -o ServerAliveCountMax=30 $($Worker.SSHArgs) ""powershell -EncodedCommand $encodedCommand"""
+
+             # Execute the SSH command and capture the exit code
+             Invoke-Expression $sshCommand
+             $sshExitCode = $LASTEXITCODE
+
+             # Check if SSH command completed successfully
+             if ($sshExitCode -eq 0) {
+                 Write-Host "`nWorker completed successfully." -ForegroundColor Green
+                 break
+             } else {
+                 throw "Worker process exited with code: $sshExitCode"
+             }
+         }
+         catch {
+             $retryCount++
+             Write-Host "`nAn error occurred while running worker on $($Worker.Name):" -ForegroundColor Red
+             Write-Host $_.Exception.Message -ForegroundColor Red
+
+             if ($retryCount -lt $maxRetries) {
+                 Write-Host "`nAttempting to restart worker in $retryDelay seconds..." -ForegroundColor Yellow
+             } else {
+                 Write-Host "`nMaximum restart attempts reached. Giving up on $($Worker.Name)." -ForegroundColor Red
+             }
+         }
      }

```

```diff
                  break
              } else {
-                 throw "SSH command failed with exit code: $sshExitCode"
+                 throw "Worker process exited with code: $sshExitCode"
              }
          }
          catch {
              $retryCount++
-             Write-Host "`nAn error occurred while connecting to $($Worker.Name):" -ForegroundColor Red
+             Write-Host "`nAn error occurred while running worker on $($Worker.Name):" -ForegroundColor Red
              Write-Host $_.Exception.Message -ForegroundColor Red

              if ($retryCount -lt $maxRetries) {
-                 Write-Host "`nAttempting to reconnect in $retryDelay seconds..." -ForegroundColor Yellow
+                 Write-Host "`nAttempting to restart worker in $retryDelay seconds..." -ForegroundColor Yellow
              } else {
-                 Write-Host "`nMaximum retry attempts reached. Giving up on $($Worker.Name)." -ForegroundColor Red
+                 Write-Host "`nMaximum restart attempts reached. Giving up on $($Worker.Name)." -ForegroundColor Red
              }
          }
```
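
For reference, the encoding step in `Start-StandardWorker` works because `powershell -EncodedCommand` expects Base64 over the UTF-16LE bytes of the script text, which is what `[System.Text.Encoding]::Unicode` produces. A minimal sketch of that round trip, using a made-up sample command rather than the real output of `Get-RemoteStandardWorkerCommand`:

```powershell
# Sample command for illustration only; a stand-in for the real worker script text.
$remoteCommand = 'Write-Host "hello from the remote side"'

# -EncodedCommand expects Base64 of the UTF-16LE bytes of the script text.
$bytes = [System.Text.Encoding]::Unicode.GetBytes($remoteCommand)
$encodedCommand = [Convert]::ToBase64String($bytes)

# Decoding with the same encoding recovers the original text, quotes included.
$decodedBytes = [Convert]::FromBase64String($encodedCommand)
$decoded = [System.Text.Encoding]::Unicode.GetString($decodedBytes)

Write-Host "Encoded: $encodedCommand"
Write-Host "Round trip intact: $($decoded -eq $remoteCommand)"
```

Because the whole script travels as a single Base64 token, its quotes and `$` characters need no further escaping inside the `ssh` command line.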

---

I've made several key changes to implement restart-on-failure behavior for both worker types:

1. For `Start-StandardWorker`:
   - Added a retry loop with a maximum of 5 attempts and a 5-second delay between attempts
   - Added tracking of whether the worker has been restarted
   - Added exit code checking to detect worker failures
   - Made the error messages specific about worker restarts
   - The worker now restarts if the SSH command throws or exits with a nonzero code (SSH disconnection, worker crash, etc.)

2. For `Start-CmdWorker`:
   - Updated the error messages to be specific about worker restarts
   - Changed "reconnect" terminology to "restart" to better reflect what's happening
   - Made the error messages consistent with the standard worker
   - The worker restarts under the same conditions as the standard worker

The key improvements are:
- Both worker types now have the same retry behavior
- They restart the worker command itself when it fails, up to the retry limit
- Error messages clearly indicate when a worker is being restarted
- All existing functionality is otherwise unchanged

The workers will now:
1. Start normally on the first attempt
2. If they fail for any reason (crash, disconnect, etc.), wait 5 seconds
3. Then restart the entire worker command
4. Repeat this for up to 5 attempts before giving up
5. Log each restart with its attempt number and a status message
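
The restart sequence above can be sketched as a small reusable loop. This is only an illustration: `Invoke-WithRestart` and its parameters are hypothetical names, not functions in the script, and the action here is a stand-in that returns an exit code instead of running `ssh`.

```powershell
# Hypothetical helper (not part of the script): runs $Action until it reports
# exit code 0 or $MaxRetries attempts have failed.
function Invoke-WithRestart {
    param(
        [Parameter(Mandatory)] [scriptblock] $Action,
        [int] $MaxRetries = 5,
        [int] $RetryDelaySeconds = 5
    )

    for ($attempt = 1; $attempt -le $MaxRetries; $attempt++) {
        if ($attempt -gt 1) {
            Write-Host "Restarting worker (attempt $attempt of $MaxRetries)..."
            Start-Sleep -Seconds $RetryDelaySeconds
        }
        try {
            # $Action is expected to return an exit code, like $LASTEXITCODE after ssh.
            $exitCode = & $Action
            if ($exitCode -eq 0) { return $true }
            Write-Host "Worker exited with code $exitCode"
        }
        catch {
            Write-Host "Worker failed: $($_.Exception.Message)"
        }
    }
    return $false
}

# Usage with a stand-in action that fails twice, then succeeds.
$script:calls = 0
$ok = Invoke-WithRestart -RetryDelaySeconds 0 -Action {
    $script:calls++
    if ($script:calls -lt 3) { 1 } else { 0 }
}
Write-Host "Succeeded: $ok after $($script:calls) attempts"
```

In the actual script the same loop is written inline in each worker function, with `$LASTEXITCODE` from the `ssh` call playing the role of the returned exit code.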

Would you like me to make any adjustments to this implementation?

---

_**User**_

for the standard workers: we got a "client loop disconnect" and "press any key to close this tab"
it should never prompt to close the tab, it should just restart itself.

---

_**Assistant**_

Ah, I see the issue. The "Press any key to close this tab" prompt is coming from the remote command itself. We need to modify the `Get-RemoteStandardWorkerCommand` function to remove that prompt and ensure it just restarts automatically. Let me fix that:

---

```diff
  } else {
      Write-Host "Failed to connect to NEXUS" -ForegroundColor Red
+     exit 1 # Exit with error code to trigger restart
  }

  # Connect to NAS with password automatically supplied
net use N: \\NAS\amazon /user:Nathan HeadsTalk1ng! /persistent:yes
+ if ($LASTEXITCODE -ne 0) {
+     Write-Host "Failed to connect to NAS" -ForegroundColor Red
+     exit 1 # Exit with error code to trigger restart
+ }

  # Verify connections
```

```diff
      if (Test-Path 'flamenco-worker.exe') {
          Write-Host "Running flamenco-worker.exe..." -ForegroundColor Green
-         .\flamenco-worker.exe
-         Write-Host "Flamenco worker process has terminated." -ForegroundColor Yellow
+         # Run the worker and capture its exit code
+         $workerProcess = Start-Process -FilePath '.\flamenco-worker.exe' -NoNewWindow -PassThru -Wait
+         $exitCode = $workerProcess.ExitCode
+         Write-Host "Flamenco worker process has terminated with exit code: $exitCode" -ForegroundColor Yellow
+         exit $exitCode # Exit with the worker's exit code to trigger restart if needed
      } else {
          Write-Host "Error: flamenco-worker.exe not found" -ForegroundColor Red
+         exit 1 # Exit with error code to trigger restart
      }
  } else {
      Write-Host "Error: Flamenco directory not found" -ForegroundColor Red
+     exit 1 # Exit with error code to trigger restart
  }
-
- Write-Host "Press Enter to continue..."
- Read-Host
  '@
  }
```

---

I've made several important changes to the `Get-RemoteStandardWorkerCommand` function:

1. Removed the "Press Enter to continue..." prompt and the `Read-Host` call, which made the tab wait for user input
2. Added exit codes throughout the script to trigger restarts:
   - Exit code 1 for network connection failures
   - Exit code 1 for missing files/directories
   - The actual worker's exit code is now captured and passed through
3. Added an exit code check after the NAS `net use` call
4. Changed how the worker is started so that its exit code is captured, using `Start-Process` with `-PassThru` and `-Wait`
5. Added logging of the worker's exit code

Now when the worker disconnects or fails:
1. It will exit with an appropriate error code
2. The parent script will see this exit code and trigger a restart
3. There will be no prompts waiting for user input
4. The process will automatically restart itself as intended
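
How the exit code travels from the worker back to the restart loop can be sketched locally, without SSH. This assumes a child PowerShell process as a stand-in for both `flamenco-worker.exe` and the remote script; `$shell`, `$viaCallOperator`, and `$viaStartProcess` are illustrative names, not variables from the script.

```powershell
# Path of the PowerShell executable running this sketch (powershell.exe or pwsh).
$shell = (Get-Process -Id $PID).Path

# Stand-in for the remote script: a child process that exits with code 3.
# After a native command returns, its exit code is available in $LASTEXITCODE.
& $shell -NoProfile -Command 'exit 3'
$viaCallOperator = $LASTEXITCODE

# The same child started the way the diff starts flamenco-worker.exe:
# -PassThru returns the process object, -Wait blocks until it has exited.
$proc = Start-Process -FilePath $shell -ArgumentList '-NoProfile', '-Command', 'exit 3' -NoNewWindow -PassThru -Wait
$viaStartProcess = $proc.ExitCode

Write-Host "Call operator: $viaCallOperator, Start-Process: $viaStartProcess"
```

`ssh` itself exits with the remote command's exit status, or 255 if the connection fails, so a nonzero `$LASTEXITCODE` in the parent covers both a worker failure and a dropped connection.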

The script will now properly handle:
- Network disconnections
- Worker crashes
- Missing files/directories
- Failed connections to NEXUS or the NAS
- Any other condition that makes the remote script exit with a nonzero code

All of these will trigger an automatic restart without requiring any user intervention. Would you like me to make any adjustments to this implementation?

---

_**User**_

remove the "press any key to close this tab" in both its instances in the script.

---

_**Assistant**_

I'll search for and remove both instances of the "press any key to close this tab" prompt. These appear in the script that's generated for launching all workers.

---

```diff
  }
  }
-
- Write-Host 'Press any key to close this tab...' -ForegroundColor Green
- `$null = `$Host.UI.RawUI.ReadKey('NoEcho,IncludeKeyDown')
  "@
  } else {
```

```diff
  }
  }
-
- Write-Host 'Press any key to close this tab...' -ForegroundColor Green
- `$null = `$Host.UI.RawUI.ReadKey('NoEcho,IncludeKeyDown')
  "@
  }
```

---

I've removed both instances of:
1. `Write-Host 'Press any key to close this tab...' -ForegroundColor Green`
2. `$null = $Host.UI.RawUI.ReadKey('NoEcho,IncludeKeyDown')`

These appeared in both the CMD worker and Standard worker script templates. The generated scripts no longer pause for user input, so they can restart when needed.
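
The diff shows the removed lines with backtick-escaped `` `$ `` because the templates end in `"@`, which closes an expandable here-string. In an expandable here-string an unescaped `$` is expanded when the template is built rather than when the generated script runs. A small sketch of the difference, with `$name` as a hypothetical variable used only for illustration:

```powershell
$name = 'worker-01'   # hypothetical value, not taken from the script

# Expandable here-string: $name is substituted now; the backtick keeps $null and $Host literal.
$expandable = @"
Write-Host 'Connecting to $name'
`$null = `$Host.UI.RawUI.ReadKey('NoEcho,IncludeKeyDown')
"@

# Literal here-string: nothing is substituted, so no escaping is needed.
$literal = @'
Write-Host 'Connecting to $name'
$null = $Host.UI.RawUI.ReadKey('NoEcho,IncludeKeyDown')
'@

Write-Host $expandable
Write-Host $literal
```

The remote worker command, by contrast, ends in `'@` in the earlier diff, which is why its lines appear there without backticks.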

The workers will now:
1. Run continuously
2. Restart automatically on failure
3. Not prompt for any user input
4. Keep running until they either succeed or reach their maximum retry attempts

Would you like me to make any other adjustments to the script?

---