Compare commits

2 Commits

Author SHA1 Message Date
Nathan
b61baac09b attempting fix 2025-11-26 18:18:13 -07:00
Nathan
2aa9061114 begin persistence/attachment overhaul 2025-11-25 14:51:22 -07:00
13 changed files with 13379 additions and 28843 deletions

File diff suppressed because one or more lines are too long

@@ -1,556 +0,0 @@
<!-- Generated by SpecStory, Markdown v2.1.0 -->
<!-- cursor Session ce0482b1-b455-4f1b-bdb4-685ed183c431 (2026-01-09 18:37Z) -->
# Nexus Flamenco manager batch script (2026-01-09 18:37Z)
_**User**_
context:@.specstory/history/2026-01-08_17-28Z-shaman-symlinking-for-windows-samba.md @.specstory/history/2026-01-08_23-42Z-windows-symlinking-error.md @.specstory/history/2026-01-09_18-22Z-nexus-manager-access-issue.md
@ssh (82-117)
i need you to create a batch that:
1. ssh nexus
2. cd to /mnt/PHOENIX/flamenco/software/Flamenco
3. run the flamenco manager (linux executable)
---
_**Agent (model default, mode Agent)**_
Creating a batch file to SSH to nexus, change to the Flamenco directory, and run the manager. Checking existing batch files for style:
---
<tool-use data-tool-type="read" data-tool-name="read_file_v2">
<details><summary>Tool use: **read_file_v2** • Read file: c:\Users\Nathan\Desktop\flamenco\unified_flamenco_launcher.bat</summary>
</details>
</tool-use>
---
<tool-use data-tool-type="read" data-tool-name="read_file_v2">
<details><summary>Tool use: **read_file_v2** • Read file: c:\Users\Nathan\Desktop\flamenco\unified_sheepit_launcher.bat</summary>
</details>
</tool-use>
---
Creating a batch file to SSH to nexus and run the Flamenco manager:
---
<tool-use data-tool-type="write" data-tool-name="edit_file_v2">
<details><summary>Tool use: **edit_file_v2** • Edit file: c:\Users\Nathan\Desktop\flamenco\nexus_flamenco_manager.bat</summary>
</details>
</tool-use>
---
Created `nexus_flamenco_manager.bat`. It:
1. Connects to nexus via SSH
2. Changes to `/mnt/PHOENIX/flamenco/software/Flamenco`
3. Runs `./flamenco-manager`
The batch file uses a single SSH command that chains the directory change and execution. Double-click the batch file to run it, or run it from the command line.
---
_**User**_
@unified_flamenco_launcher.ps1:59-68 currently, we have A, F, N, and P drives mapped. Can we also map \\NEXUS\glacier to G:\ ?
---
_**Agent (model default, mode Agent)**_
Reviewing the script to see how drive mappings are handled:
---
<tool-use data-tool-type="read" data-tool-name="read_file_v2">
<details><summary>Tool use: **read_file_v2** • Read file: c:\Users\Nathan\Desktop\flamenco\unified_flamenco_launcher.ps1</summary>
</details>
</tool-use>
---
Adding the G: drive mapping for `\\NEXUS\glacier` in all three remote command functions. Updating each:
---
<tool-use data-tool-type="write" data-tool-name="edit_file_v2">
<details><summary>Tool use: **edit_file_v2** • Edit file: c:\Users\Nathan\Desktop\flamenco\unified_flamenco_launcher.ps1</summary>
</details>
</tool-use>
---
<tool-use data-tool-type="write" data-tool-name="edit_file_v2">
<details><summary>Tool use: **edit_file_v2** • Edit file: c:\Users\Nathan\Desktop\flamenco\unified_flamenco_launcher.ps1</summary>
</details>
</tool-use>
---
Updated the script to map `\\NEXUS\glacier` to `G:\`. Changes:
1. Added `'G:'` to the `$drives` array in all three remote command functions
2. Added `'\\NEXUS\glacier'` to the `$networkPaths` array in all three functions
3. Added the mapping command `net use G: \\NEXUS\glacier /persistent:yes` after the NEXUS connection is established
The G: drive will be:
- Disconnected during cleanup (along with A:, F:, N:, and P:)
- Mapped when connecting to NEXUS (along with A:, F:, and P:)
All three worker command functions (`Get-RemoteStandardWorkerCommand`, `Get-RemoteCmdWorkerCommand`, and `Get-RemoteSimplifiedCmdWorkerCommand`) now include the G: drive mapping.
---
_**User**_
good, but we also need to apply this to CMD workers, too.
---
_**Agent (model default, mode Agent)**_
Checking the current state of the CMD worker functions to ensure the G: drive mapping is applied:
---
<tool-use data-tool-type="read" data-tool-name="read_file_v2">
<details><summary>Tool use: **read_file_v2** • Read file: c:\Users\Nathan\Desktop\flamenco\unified_flamenco_launcher.ps1</summary>
</details>
</tool-use>
---
Checking the simplified CMD worker command function:
---
<tool-use data-tool-type="read" data-tool-name="read_file_v2">
<details><summary>Tool use: **read_file_v2** • Read file: c:\Users\Nathan\Desktop\flamenco\unified_flamenco_launcher.ps1</summary>
</details>
</tool-use>
---
Verifying the standard worker function for consistency:
---
<tool-use data-tool-type="read" data-tool-name="read_file_v2">
<details><summary>Tool use: **read_file_v2** • Read file: c:\Users\Nathan\Desktop\flamenco\unified_flamenco_launcher.ps1</summary>
</details>
</tool-use>
---
All three worker command functions already include the G: drive mapping:
1. **Standard Worker** (`Get-RemoteStandardWorkerCommand`) - line 94
2. **CMD Worker** (`Get-RemoteCmdWorkerCommand`) - line 159
3. **Simplified CMD Worker** (`Get-RemoteSimplifiedCmdWorkerCommand`) - line 230
All three also include:
- `'G:'` in the `$drives` array
- `'\\NEXUS\glacier'` in the `$networkPaths` array
- The mapping command `net use G: \\NEXUS\glacier /persistent:yes`
The changes were applied to all functions, including both CMD worker functions. If you see something missing, point me to the specific location and I'll fix it.
---
_**User**_
ah, most of them are working, but echo and i9-13ks are not.
Launching CMD worker for echo
Connecting to echo (CMD mode)...
Setting up network connections...
Disconnecting existing network connections...
A: was deleted successfully.
F: was deleted successfully.
N: was deleted successfully.
P: was deleted successfully.
All network connections cleared.
Establishing network connections...
The command completed successfully.
The command completed successfully.
The command completed successfully.
System error 85 has occurred.
The local device name is already in use.
The command completed successfully.
The command completed successfully.
Current network connections:
New connections will be remembered.
Status Local Remote Network
-------------------------------------------------------------------------------
OK A: \\NEXUS\amazon Microsoft Windows Network
OK F: \\NEXUS\flamenco Microsoft Windows Network
OK N: \\NAS\amazon Microsoft Windows Network
OK P: \\NEXUS\proj Microsoft Windows Network
OK \\NEXUS\amazon Microsoft Windows Network
The command completed successfully.
Launching CMD worker for i9-13ks
Connecting to i9-13ks (CMD mode)...
Setting up network connections...
Disconnecting existing network connections...
A: was deleted successfully.
F: was deleted successfully.
N: was deleted successfully.
P: was deleted successfully.
All network connections cleared.
Establishing network connections...
The command completed successfully.
The command completed successfully.
The command completed successfully.
System error 85 has occurred.
The local device name is already in use.
The command completed successfully.
The command completed successfully.
Current network connections:
New connections will be remembered.
Status Local Remote Network
-------------------------------------------------------------------------------
OK A: \\NEXUS\amazon Microsoft Windows Network
OK F: \\NEXUS\flamenco Microsoft Windows Network
OK N: \\NAS\amazon Microsoft Windows Network
OK P: \\NEXUS\proj Microsoft Windows Network
Y: \\sshfs.r\talmage@10.1.10.3!1923\mnt\BACKUP01
WinFsp.Np
Z: \\sshfs.r\talmage@10.1.10.3!1923\mnt\ARCHIVE01
WinFsp.Np
OK \\NEXUS\amazon Microsoft Windows Network
The command completed successfully.
Running Flamenco worker...
---
_**Agent (model default, mode Agent)**_
G: isn't being deleted during cleanup, and the mapping fails with "System error 85" (already in use). Checking the script to ensure G: is included in the cleanup and mapping:
---
<tool-use data-tool-type="unknown" data-tool-name="ripgrep_raw_search">
<details>
<summary>Tool use: **ripgrep_raw_search**</summary>
Parameters:
```json
{
"pattern": "\\$drives = @",
"path": "c:\\Users\\Nathan\\Desktop\\flamenco\\unified_flamenco_launcher.ps1",
"caseInsensitive": false
}
```
Result:
```json
{
"success": {
"pattern": "\\$drives = @",
"path": "c:\\Users\\Nathan\\Desktop\\flamenco\\unified_flamenco_launcher.ps1",
"outputMode": "content",
"workspaceResults": {
"C:\\Users\\Nathan\\Desktop\\flamenco": {
"content": {
"matches": [
{
"file": "unified_flamenco_launcher.ps1",
"matches": [
{
"lineNumber": 61,
"content": "$drives = @('A:', 'F:', 'G:', 'N:', 'P:')"
},
{
"lineNumber": 135,
"content": "$drives = @('A:', 'F:', 'G:', 'N:', 'P:')"
},
{
"lineNumber": 206,
"content": "$drives = @('A:', 'F:', 'G:', 'N:', 'P:')"
}
]
}
],
"totalLines": 3,
"totalMatchedLines": 3
}
}
}
}
}
```
Status: **completed**
</details>
</tool-use>
---
<tool-use data-tool-type="unknown" data-tool-name="ripgrep_raw_search">
<details>
<summary>Tool use: **ripgrep_raw_search**</summary>
Parameters:
```json
{
"pattern": "net use G:",
"path": "c:\\Users\\Nathan\\Desktop\\flamenco\\unified_flamenco_launcher.ps1",
"caseInsensitive": false
}
```
Result:
```json
{
"success": {
"pattern": "net use G:",
"path": "c:\\Users\\Nathan\\Desktop\\flamenco\\unified_flamenco_launcher.ps1",
"outputMode": "content",
"workspaceResults": {
"C:\\Users\\Nathan\\Desktop\\flamenco": {
"content": {
"matches": [
{
"file": "unified_flamenco_launcher.ps1",
"matches": [
{
"lineNumber": 94,
"content": " net use G: \\\\NEXUS\\glacier /persistent:yes"
},
{
"lineNumber": 159,
"content": " net use G: \\\\NEXUS\\glacier /persistent:yes"
},
{
"lineNumber": 230,
"content": " net use G: \\\\NEXUS\\glacier /persistent:yes"
}
]
}
],
"totalLines": 3,
"totalMatchedLines": 3
}
}
}
}
}
```
Status: **completed**
</details>
</tool-use>
---
G: is in the arrays, but it's not being deleted on those machines, likely because it's already mapped. Adding explicit G: deletion before mapping and improving error handling:
---
<tool-use data-tool-type="read" data-tool-name="read_file_v2">
<details><summary>Tool use: **read_file_v2** • Read file: c:\Users\Nathan\Desktop\flamenco\unified_flamenco_launcher.ps1</summary>
</details>
</tool-use>
---
Adding explicit G: deletion before mapping in all three functions to handle cases where G: is already mapped:
---
<tool-use data-tool-type="write" data-tool-name="edit_file_v2">
<details><summary>Tool use: **edit_file_v2** • Edit file: c:\Users\Nathan\Desktop\flamenco\unified_flamenco_launcher.ps1</summary>
</details>
</tool-use>
---
Added explicit deletion of G: and the `\\NEXUS\glacier` path right before mapping in all three worker command functions. This should resolve the "System error 85" (already in use) on echo and i9-13ks.
The script now:
1. Deletes G: explicitly before mapping (even if it's already mapped)
2. Deletes the `\\NEXUS\glacier` network path before mapping
3. Then maps G: to `\\NEXUS\glacier`
This ensures any existing G: mapping is cleared before creating the new one. The `2>$null` suppresses errors if G: isn't already mapped, so it works whether G: is mapped or not.
---
_**User**_
I think it's because there are local external usb devices mapped to G:\. I probably need to ask these users to remap or temporarily remove those drives
Launching CMD worker for echo
Connecting to echo (CMD mode)...
Setting up network connections...
Disconnecting existing network connections...
A: was deleted successfully.
F: was deleted successfully.
N: was deleted successfully.
P: was deleted successfully.
All network connections cleared.
Establishing network connections...
The command completed successfully.
The command completed successfully.
The command completed successfully.
System error 85 has occurred.
The local device name is already in use.
The command completed successfully.
The command completed successfully.
Launching CMD worker for i9-13ks
Connecting to i9-13ks (CMD mode)...
Setting up network connections...
Disconnecting existing network connections...
A: was deleted successfully.
F: was deleted successfully.
N: was deleted successfully.
P: was deleted successfully.
All network connections cleared.
Establishing network connections...
The command completed successfully.
The command completed successfully.
The command completed successfully.
System error 85 has occurred.
The local device name is already in use.
The command completed successfully.
The command completed successfully.
---
_**Agent (model default, mode Agent)**_
G: is mapped to a local USB device on those machines, so `net use G: /delete` won't work (it only deletes network drives). The mapping then fails because G: is already in use.
Options:
1. Use a different drive letter for glacier (if available)
2. Add error handling to make the G: mapping non-fatal so the worker continues even if it fails
3. Check if G: is a network drive before attempting deletion/mapping
Since you'll coordinate with those users, I can add error handling so the script continues if G: mapping fails. Should I add that, or do you prefer to resolve it with the users first?
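
Option 3 above can be sketched as a small decision helper. This is a minimal illustration, not code from the launcher: `Get-DriveMapAction` is a hypothetical name, and it assumes the drive type is read with .NET's `System.IO.DriveInfo`, whose `DriveType` is expected to be `NoRootDirectory` for an unassigned letter and `Network` for a mapped share.

```powershell
# Hypothetical helper: decide what to do with a drive letter before calling `net use`.
function Get-DriveMapAction {
    param([string]$DriveType)
    switch ($DriveType) {
        'NoRootDirectory' { 'map' }    # letter is free: map it directly
        'Network'         { 'remap' }  # existing network mapping: delete it, then map
        default           { 'skip' }   # Fixed, Removable, CDRom, ...: a local device owns the letter
    }
}

# On a Windows worker the drive type could be read like this (assumption, shown as a comment):
# $type = [System.IO.DriveInfo]::new('G').DriveType.ToString()
$actionForUsb = Get-DriveMapAction -DriveType 'Removable'
$actionForFree = Get-DriveMapAction -DriveType 'NoRootDirectory'
$actionForShare = Get-DriveMapAction -DriveType 'Network'
```

With a check like this, hosts such as echo and i9-13ks, where a USB device holds G:, would skip the glacier mapping instead of hitting system error 85.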
---
<tool-use data-tool-type="read" data-tool-name="read_file_v2">
<details><summary>Tool use: **read_file_v2** • Read file: c:\Users\Nathan\Desktop\flamenco\unified_flamenco_launcher.ps1</summary>
</details>
</tool-use>
---
Adding error handling so G: mapping failures are non-fatal and the script continues:
---
Cancelled
---

@@ -1,4 +0,0 @@
@echo off
title Nexus Flamenco Manager
echo Connecting to nexus and starting Flamenco Manager...
ssh nexus "cd /mnt/PHOENIX/flamenco/software/Flamenco && ./flamenco-manager"

29
persistent_workers.md Normal file

@@ -0,0 +1,29 @@
## Persistent Remote Worker Architecture
This repository now includes a small controller/attach framework that keeps both Flamenco and SheepIt workers running on each remote Windows host even after the SSH session closes.
### Components
| File | Purpose |
| --- | --- |
| `remote_worker_controller.ps1` | Runs on each remote host. Launches the real worker process (Flamenco or SheepIt), redirects its stdout/stderr to a log file, listens for commands, and restarts the worker if it crashes. |
| `remote_worker_attach.ps1` | Also runs remotely. Streams the controller's log file back to the local SSH session and forwards commands (`pause`, `resume`, `quit`, etc.) to the worker's stdin. It can also be invoked in a `-CommandOnly` mode to send a single command without attaching. |
| `unified_flamenco_launcher.ps1` / `unified_sheepit_launcher.ps1` | Local launchers that copy the helper scripts to each host, start controllers in the background, and open attach sessions as requested. |
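Helper scripts live under `%LOCALAPPDATA%\UnifiedWorkers\` on each host (the `DataRoot` default in both remote scripts). Per-worker metadata and logs live under `%LOCALAPPDATA%\UnifiedWorkers\<worker-type>\<worker-name>\`.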
### Typical Flow
1. **Start/attach**: The launcher ensures the controller + payload are present on the target host, starts the controller through `Start-Process`, then launches `remote_worker_attach.ps1` via SSH so you can see live output and type commands. Closing the SSH window only ends the attach session—the worker keeps running.
2. **Reattach**: Run the launcher again and choose the same worker. If a controller reports that a worker is already running, the launcher simply opens another attach session.
3. **Pause/Resume/Quit all (SheepIt)**: The SheepIt launcher now exposes menu options that iterate over every enabled worker and invoke the attach helper in `-CommandOnly` mode to send the requested command.
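
The command path used in step 3 behaves like a small file-based queue: the attach helper appends a line to `state\commands.txt`, and the controller drains that file on its next poll. The following is a minimal sketch of that idea, assuming hypothetical helper names (`Add-QueuedCommand`, `Read-QueuedCommands`) and a throwaway temp file in place of the real data root. The shipped scripts use `Send-WorkerCommand` and `Get-PendingCommands` instead.

```powershell
# Hypothetical helpers illustrating the file-based command queue; not the shipped code.
function Add-QueuedCommand {
    param([string]$QueuePath, [string]$Value)
    $clean = $Value.Trim()
    if ($clean) {
        # Producer side: append one command per line.
        Add-Content -Path $QueuePath -Value $clean -Encoding UTF8
    }
}

function Read-QueuedCommands {
    param([string]$QueuePath)
    if (-not (Test-Path -LiteralPath $QueuePath)) { return @() }
    # Consumer side: read everything, then delete the file so each command is seen once.
    $lines = @(Get-Content -LiteralPath $QueuePath)
    Remove-Item -LiteralPath $QueuePath -Force
    return @($lines | Where-Object { -not [string]::IsNullOrWhiteSpace($_) })
}

# Demo against a temp file (illustrative path, not the real commands.txt).
$queue = Join-Path ([IO.Path]::GetTempPath()) ("commands-{0}.txt" -f [guid]::NewGuid())
Add-QueuedCommand -QueuePath $queue -Value ' pause '
Add-QueuedCommand -QueuePath $queue -Value 'resume'
$pending = @(Read-QueuedCommands -QueuePath $queue)
$again = @(Read-QueuedCommands -QueuePath $queue)
```

A command appended between the read and the delete would be lost. With the controller polling every 500 ms that window is small, but it is not zero.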
### Manual Verification Checklist
1. **Controller deployment**: From the launcher, start a worker once and confirm `%LOCALAPPDATA%\UnifiedWorkers` appears on the remote host with `controller.ps1` and `attach-helper.ps1` at its root, plus `logs\worker.log` and `state\worker-info.json` under `<worker-type>\<worker-name>\`.
2. **Persistence**: Attach to a worker, close the SSH window, then reattach. The log stream should resume and the worker PID reported in metadata should remain unchanged.
3. **Command channel**: While attached, type `pause`, `resume`, and `quit`. The controller should log each command and the worker should react accordingly. Repeat the same commands using the SheepIt “Pause/Resume/Quit All” menu entries to validate the automation path.
4. **Failure recovery**: Manually terminate the worker process on a remote host. The controller should log the exit, update metadata, and restart the worker up to the configured retry limit.
These steps ensure both Flamenco and SheepIt workflows operate as intended with the new persistent worker model. Adjust the retry/backoff values inside `remote_worker_controller.ps1` if you need more aggressive or conservative restart behavior.
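
The persistence check (item 2) can also be scripted. The sketch below is a rough illustration: `Test-WorkerRunning` is a hypothetical helper, and the `Status` and `WorkerPid` field names are taken from what the controller writes to `worker-info.json`. The demo uses a temp file and the current process ID as a stand-in for a real worker.

```powershell
# Hypothetical verification helper; field names follow worker-info.json as written by the controller.
function Test-WorkerRunning {
    param([string]$MetadataPath)
    if (-not (Test-Path -LiteralPath $MetadataPath)) { return $false }
    $meta = Get-Content -LiteralPath $MetadataPath -Raw | ConvertFrom-Json
    if ($meta.Status -ne 'running' -or -not $meta.WorkerPid) { return $false }
    # The recorded PID must still belong to a live process for the worker to count as running.
    return [bool](Get-Process -Id $meta.WorkerPid -ErrorAction SilentlyContinue)
}

# Demo: fake metadata pointing at the current PowerShell process (stand-in for a worker).
$metaFile = Join-Path ([IO.Path]::GetTempPath()) ("worker-info-{0}.json" -f [guid]::NewGuid())
[pscustomobject]@{ Status = 'running'; WorkerPid = $PID } | ConvertTo-Json | Set-Content -LiteralPath $metaFile
$isRunning = Test-WorkerRunning -MetadataPath $metaFile

[pscustomobject]@{ Status = 'stopped'; WorkerPid = $null } | ConvertTo-Json | Set-Content -LiteralPath $metaFile
$isStopped = -not (Test-WorkerRunning -MetadataPath $metaFile)
Remove-Item -LiteralPath $metaFile -Force
```

Running the helper before and after a detach/reattach cycle and comparing `WorkerPid` gives a quick confirmation that the worker was not restarted.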

107
remote_worker_attach.ps1 Normal file

@@ -0,0 +1,107 @@
param(
[Parameter(Mandatory = $true)]
[string]$WorkerName,
[Parameter(Mandatory = $true)]
[string]$WorkerType,
[string]$DataRoot = (Join-Path ([Environment]::GetFolderPath('LocalApplicationData')) 'UnifiedWorkers'),
[switch]$CommandOnly,
[string]$Command
)
$ErrorActionPreference = 'Stop'
function Get-WorkerPaths {
param([string]$Root, [string]$Type, [string]$Name)
$instanceRoot = Join-Path -Path (Join-Path -Path $Root -ChildPath $Type) -ChildPath $Name
return [pscustomobject]@{
Metadata = Join-Path -Path $instanceRoot -ChildPath 'state\worker-info.json'
Command = Join-Path -Path $instanceRoot -ChildPath 'state\commands.txt'
Log = Join-Path -Path $instanceRoot -ChildPath 'logs\worker.log'
}
}
$paths = Get-WorkerPaths -Root $DataRoot -Type $WorkerType -Name $WorkerName
if (-not (Test-Path $paths.Metadata)) {
Write-Host "No worker metadata found for $WorkerName ($WorkerType)." -ForegroundColor Red
exit 1
}
try {
$metadata = Get-Content -Path $paths.Metadata -Raw | ConvertFrom-Json
}
catch {
Write-Host "Unable to read worker metadata: $($_.Exception.Message)" -ForegroundColor Red
exit 1
}
# Create the log file if it is missing; an existing log is left untouched.
if (-not (Test-Path $paths.Log)) {
New-Item -Path $paths.Log -ItemType File -Force | Out-Null
}
if (-not (Test-Path $paths.Command)) {
New-Item -Path $paths.Command -ItemType File -Force | Out-Null
}
function Send-WorkerCommand {
param([string]$Value)
$clean = $Value.Trim()
if (-not $clean) {
return
}
Add-Content -Path $paths.Command -Value $clean -Encoding UTF8
Write-Host "Sent command '$clean' to $WorkerName." -ForegroundColor DarkGray
}
if ($CommandOnly) {
if (-not $Command) {
Write-Host "CommandOnly flag set but no command provided." -ForegroundColor Red
exit 1
}
Send-WorkerCommand -Value $Command
exit 0
}
Write-Host "Attaching to $WorkerName ($WorkerType) logs." -ForegroundColor Cyan
Write-Host "Type commands and press Enter. Type 'detach' to exit session." -ForegroundColor Yellow
$logJob = Start-Job -ScriptBlock {
param($LogPath)
Get-Content -Path $LogPath -Tail 50 -Wait
} -ArgumentList $paths.Log
try {
while ($true) {
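# $input is a PowerShell automatic variable, so the typed line is kept in $userInput instead.
$userInput = Read-Host "> "
if ($null -eq $userInput) {
continue
}
if ($userInput.Trim().ToLower() -eq 'detach') {
break
}
Send-WorkerCommand -Value $userInput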
}
}
finally {
if ($logJob) {
Stop-Job -Job $logJob -ErrorAction SilentlyContinue
Receive-Job -Job $logJob -ErrorAction SilentlyContinue | Out-Null
Remove-Job -Job $logJob -Force -ErrorAction SilentlyContinue
}
Write-Host "Detached from worker $WorkerName." -ForegroundColor Cyan
}

@@ -0,0 +1,198 @@
param(
[Parameter(Mandatory = $true)]
[string]$WorkerName,
[Parameter(Mandatory = $true)]
[string]$WorkerType,
[Parameter(Mandatory = $true)]
[string]$PayloadBase64,
[string]$DataRoot = (Join-Path ([Environment]::GetFolderPath('LocalApplicationData')) 'UnifiedWorkers'),
[int]$MaxRestarts = 5,
[int]$RestartDelaySeconds = 10
)
$ErrorActionPreference = 'Stop'
# region Path setup
$workerRoot = Join-Path -Path $DataRoot -ChildPath $WorkerType
$instanceRoot = Join-Path -Path $workerRoot -ChildPath $WorkerName
New-Item -ItemType Directory -Path $instanceRoot -Force | Out-Null
$logsRoot = Join-Path -Path $instanceRoot -ChildPath 'logs'
$stateRoot = Join-Path -Path $instanceRoot -ChildPath 'state'
New-Item -ItemType Directory -Path $logsRoot -Force | Out-Null
New-Item -ItemType Directory -Path $stateRoot -Force | Out-Null
$logPath = Join-Path -Path $logsRoot -ChildPath 'worker.log'
$metaPath = Join-Path -Path $stateRoot -ChildPath 'worker-info.json'
$commandPath = Join-Path -Path $stateRoot -ChildPath 'commands.txt'
$payloadPath = Join-Path -Path $stateRoot -ChildPath "payload-$([Guid]::NewGuid().ToString()).ps1"
# endregion
# region Helpers
function Write-ControllerLog {
param([string]$Message)
$timestamp = (Get-Date).ToString('u')
Add-Content -Path $logPath -Value "[CTRL $timestamp] $Message" -Encoding UTF8
}
function Write-Metadata {
param(
[string]$Status,
[nullable[int]]$WorkerPid = $null,
[nullable[int]]$ControllerPid = $PID,
[int]$Restarts = 0,
[nullable[int]]$LastExitCode = $null
)
$payload = [pscustomobject]@{
WorkerName = $WorkerName
WorkerType = $WorkerType
Status = $Status
ControllerPid = $ControllerPid
WorkerPid = $WorkerPid
Restarts = $Restarts
LastExitCode = $LastExitCode
LogPath = $logPath
CommandPath = $commandPath
PayloadPath = $payloadPath
UpdatedAtUtc = (Get-Date).ToUniversalTime()
}
$payload | ConvertTo-Json -Depth 5 | Set-Content -Path $metaPath -Encoding UTF8
}
function Get-PendingCommands {
if (-not (Test-Path $commandPath)) {
return @()
}
try {
$lines = Get-Content -Path $commandPath -ErrorAction Stop
Remove-Item -Path $commandPath -Force -ErrorAction SilentlyContinue
return $lines | Where-Object { -not [string]::IsNullOrWhiteSpace($_) }
}
catch {
return @()
}
}
# endregion
try {
# Write payload script to disk
$payloadBytes = [Convert]::FromBase64String($PayloadBase64)
[IO.File]::WriteAllBytes($payloadPath, $payloadBytes)
Write-ControllerLog "Payload written to $payloadPath"
}
catch {
Write-Error "Unable to write payload: $($_.Exception.Message)"
exit 1
}
$restartCount = 0
$controllerPid = $PID
while ($restartCount -le $MaxRestarts) {
try {
# Initialize worker process
$psi = [System.Diagnostics.ProcessStartInfo]::new()
$pwsh = Get-Command pwsh -ErrorAction SilentlyContinue
if ($pwsh) {
$psi.FileName = $pwsh.Source
}
else {
$psi.FileName = (Get-Command powershell -ErrorAction Stop).Source
}
$psi.Arguments = "-NoLogo -NoProfile -ExecutionPolicy Bypass -File `"$payloadPath`""
$psi.UseShellExecute = $false
$psi.RedirectStandardInput = $true
$psi.RedirectStandardOutput = $true
$psi.RedirectStandardError = $true
$psi.CreateNoWindow = $true
$workerProcess = New-Object System.Diagnostics.Process
$workerProcess.StartInfo = $psi
if (-not $workerProcess.Start()) {
throw "Failed to start worker process."
}
Write-ControllerLog "Worker process started with PID $($workerProcess.Id)"
Write-Metadata -Status 'running' -WorkerPid $workerProcess.Id -ControllerPid $controllerPid -Restarts $restartCount
$logWriter = [System.IO.StreamWriter]::new($logPath, $true, [System.Text.Encoding]::UTF8)
$logWriter.AutoFlush = $true
$outputHandler = [System.Diagnostics.DataReceivedEventHandler]{
param($s, $e)
if ($e.Data) {
$ts = (Get-Date).ToString('u')
$logWriter.WriteLine("[OUT $ts] $($e.Data)")
}
}
$errorHandler = [System.Diagnostics.DataReceivedEventHandler]{
param($s, $e)
if ($e.Data) {
$ts = (Get-Date).ToString('u')
$logWriter.WriteLine("[ERR $ts] $($e.Data)")
}
}
$workerProcess.add_OutputDataReceived($outputHandler)
$workerProcess.add_ErrorDataReceived($errorHandler)
$workerProcess.BeginOutputReadLine()
$workerProcess.BeginErrorReadLine()
while (-not $workerProcess.HasExited) {
$commands = Get-PendingCommands
foreach ($command in $commands) {
$trimmed = $command.Trim()
if (-not $trimmed) { continue }
Write-ControllerLog "Received command '$trimmed'"
try {
$workerProcess.StandardInput.WriteLine($trimmed)
$workerProcess.StandardInput.Flush()
}
catch {
Write-ControllerLog "Failed to forward command '$trimmed': $($_.Exception.Message)"
}
if ($trimmed -ieq 'quit') {
Write-ControllerLog "Quit command issued. Waiting for worker to exit."
}
}
Start-Sleep -Milliseconds 500
}
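# Wait for the async output readers to drain, then close the log writer so the log file is not left open across restarts.
$workerProcess.WaitForExit()
$logWriter.Dispose()
$exitCode = $workerProcess.ExitCode
Write-ControllerLog "Worker exited with code $exitCode"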
Write-Metadata -Status 'stopped' -WorkerPid $null -ControllerPid $controllerPid -Restarts $restartCount -LastExitCode $exitCode
if ($exitCode -eq 0) {
break
}
}
catch {
Write-ControllerLog "Controller error: $($_.Exception.Message)"
}
$restartCount++
if ($restartCount -gt $MaxRestarts) {
Write-ControllerLog "Maximum restart attempts reached. Controller stopping."
break
}
Write-ControllerLog "Restarting worker in $RestartDelaySeconds seconds (attempt $restartCount of $MaxRestarts)."
Start-Sleep -Seconds $RestartDelaySeconds
}
Write-Metadata -Status 'inactive' -WorkerPid $null -ControllerPid $controllerPid -Restarts $restartCount
Write-ControllerLog "Controller exiting."

@@ -50,6 +50,372 @@ $workers = @(
}
)
$script:ControllerScriptBase64 = $null
$script:AttachHelperScriptBase64 = $null
$script:WorkerBasePathCache = @{}
function Build-SshArgsFromParts {
param(
[pscustomobject]$Parts,
[switch]$Interactive
)
# $args is a PowerShell automatic variable, so the argument list is built in $sshArgs instead.
$sshArgs = @('-o','ServerAliveInterval=60','-o','ServerAliveCountMax=30')
if ($Interactive -and $Parts.RequestPty) {
$sshArgs += '-t'
}
elseif (-not $Interactive) {
$sshArgs += '-T'
}
$sshArgs += $Parts.Options
if ($Parts.Port) {
$sshArgs += '-p'
$sshArgs += $Parts.Port
}
$sshArgs += $Parts.Host
return $sshArgs
}
function Build-ScpArgsFromParts {
param(
[pscustomobject]$Parts
)
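# $args is a PowerShell automatic variable, so the argument list is built in $scpArgs instead.
$scpArgs = @('-o','ServerAliveInterval=60','-o','ServerAliveCountMax=30')
$scpArgs += $Parts.Options
if ($Parts.Port) {
$scpArgs += '-P'
$scpArgs += $Parts.Port
}
return $scpArgs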
}
function Get-SshArgs {
param(
[object]$Worker,
[switch]$Interactive
)
$parts = Get-WorkerConnectionParts -RawArgs $Worker.SSHArgs -DefaultHost $Worker.Name
return Build-SshArgsFromParts -Parts $parts -Interactive:$Interactive
}
function Get-WorkerBasePath {
param(
[object]$Worker,
[pscustomobject]$ConnectionParts = $null
)
if ($script:WorkerBasePathCache.ContainsKey($Worker.Name)) {
return $script:WorkerBasePathCache[$Worker.Name]
}
if (-not $ConnectionParts) {
$ConnectionParts = Get-WorkerConnectionParts -RawArgs $Worker.SSHArgs -DefaultHost $Worker.Name
}
$sshArgs = Build-SshArgsFromParts -Parts $ConnectionParts -Interactive:$false
$scriptBlock = "`$ProgressPreference='SilentlyContinue'; [Environment]::GetFolderPath('LocalApplicationData')"
$encoded = [Convert]::ToBase64String([Text.Encoding]::Unicode.GetBytes($scriptBlock))
$remoteCmd = "powershell -NoLogo -NoProfile -ExecutionPolicy Bypass -EncodedCommand $encoded"
$output = & ssh @sshArgs $remoteCmd
if ($LASTEXITCODE -ne 0) {
throw "Unable to determine LocalAppData on $($Worker.Name)."
}
$base = ($output | Where-Object { -not [string]::IsNullOrWhiteSpace($_) } | Select-Object -Last 1).Trim()
if (-not $base) {
throw "Unable to read LocalAppData path on $($Worker.Name)."
}
$final = Join-Path $base 'UnifiedWorkers'
$script:WorkerBasePathCache[$Worker.Name] = $final
return $final
}
function Get-FileBase64 {
param([string]$Path)
[Convert]::ToBase64String([IO.File]::ReadAllBytes($Path))
}
function Get-ControllerScriptBase64 {
if (-not $script:ControllerScriptBase64) {
$controllerPath = Join-Path $PSScriptRoot 'remote_worker_controller.ps1'
$script:ControllerScriptBase64 = Get-FileBase64 -Path $controllerPath
}
return $script:ControllerScriptBase64
}
function Get-AttachHelperScriptBase64 {
if (-not $script:AttachHelperScriptBase64) {
$helperPath = Join-Path $PSScriptRoot 'remote_worker_attach.ps1'
$script:AttachHelperScriptBase64 = Get-FileBase64 -Path $helperPath
}
return $script:AttachHelperScriptBase64
}
function ConvertTo-Base64Unicode {
param([string]$Content)
[Convert]::ToBase64String([Text.Encoding]::Unicode.GetBytes($Content))
}
function Get-WorkerSshArgs {
param([string]$RawArgs)
if ([string]::IsNullOrWhiteSpace($RawArgs)) {
return @()
}
return $RawArgs -split '\s+' | Where-Object { $_.Trim().Length -gt 0 }
}
function Get-WorkerConnectionParts {
param(
[string]$RawArgs,
[string]$DefaultHost
)
$tokens = Get-WorkerSshArgs -RawArgs $RawArgs
$options = New-Object System.Collections.Generic.List[string]
$targetHost = $null
$port = $null
$requestPty = $false
$optionsWithArgs = @('-i','-o','-c','-D','-E','-F','-I','-J','-L','-l','-m','-O','-Q','-R','-S','-W','-w')
for ($i = 0; $i -lt $tokens.Count; $i++) {
$token = $tokens[$i]
if ($token -eq '-t' -or $token -eq '-tt') {
$requestPty = $true
continue
}
if ($token -eq '-p' -and ($i + 1) -lt $tokens.Count) {
$port = $tokens[$i + 1]
$i++
continue
}
if ($token.StartsWith('-')) {
$options.Add($token)
if ($optionsWithArgs -contains $token -and ($i + 1) -lt $tokens.Count) {
$options.Add($tokens[$i + 1])
$i++
}
continue
}
if (-not $targetHost) {
$targetHost = $token
continue
}
$options.Add($token)
}
if (-not $targetHost) {
$targetHost = $DefaultHost
}
return [pscustomobject]@{
Host = $targetHost
Options = $options.ToArray()
Port = $port
RequestPty = $requestPty
}
}
function Invoke-RemotePowerShell {
param(
[Parameter(Mandatory = $true)][object]$Worker,
[Parameter(Mandatory = $true)][string]$Script,
[switch]$Interactive
)
$parts = Get-WorkerConnectionParts -RawArgs $Worker.SSHArgs -DefaultHost $Worker.Name
if (-not $parts.Host) {
throw "Unable to determine SSH host for $($Worker.Name)"
}
$sshBaseArgs = Build-SshArgsFromParts -Parts $parts -Interactive:$Interactive
$remoteBasePath = Get-WorkerBasePath -Worker $Worker -ConnectionParts $parts
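# Build a unique temp path directly; GetTempFileName() would also create a stray zero-byte .tmp file.
$localTemp = Join-Path ([System.IO.Path]::GetTempPath()) ("script-{0}.ps1" -f ([guid]::NewGuid().ToString()))
Set-Content -Path $localTemp -Value $Script -Encoding UTF8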
$remoteTmpDir = Join-Path $remoteBasePath 'tmp'
$remoteScriptWin = Join-Path $remoteTmpDir ("script-{0}.ps1" -f ([guid]::NewGuid().ToString()))
$remoteScriptScp = $remoteScriptWin -replace '\\','/'
$remoteTarget = "{0}:{1}" -f $parts.Host, ('"'+$remoteScriptScp+'"')
$ensureScript = "New-Item -ItemType Directory -Path '$remoteTmpDir' -Force | Out-Null"
$ensureCmd = "powershell -NoLogo -NoProfile -ExecutionPolicy Bypass -EncodedCommand " + [Convert]::ToBase64String([Text.Encoding]::Unicode.GetBytes($ensureScript))
& ssh @sshBaseArgs $ensureCmd
if ($LASTEXITCODE -ne 0) {
Remove-Item $localTemp -ErrorAction SilentlyContinue
return $LASTEXITCODE
}
$scpArgs = Build-ScpArgsFromParts -Parts $parts
$scpArgs += $localTemp
$scpArgs += $remoteTarget
& scp @scpArgs
$scpExit = $LASTEXITCODE
Remove-Item $localTemp -ErrorAction SilentlyContinue
if ($scpExit -ne 0) {
return $scpExit
}
$execCmd = "powershell -NoLogo -NoProfile -ExecutionPolicy Bypass -File `"$remoteScriptWin`""
& ssh @sshBaseArgs $execCmd
$execExit = $LASTEXITCODE
$cleanupScript = "Remove-Item -LiteralPath '$remoteScriptWin' -ErrorAction SilentlyContinue"
$cleanupCmd = "powershell -NoLogo -NoProfile -ExecutionPolicy Bypass -EncodedCommand " + [Convert]::ToBase64String([Text.Encoding]::Unicode.GetBytes($cleanupScript))
& ssh @sshBaseArgs $cleanupCmd | Out-Null
return $execExit
}
function Ensure-ControllerDeployed {
param([object]$Worker)
$controllerBase64 = Get-ControllerScriptBase64
$script = @"
`$ProgressPreference = 'SilentlyContinue'
`$dataRoot = Join-Path ([Environment]::GetFolderPath('LocalApplicationData')) 'UnifiedWorkers'
New-Item -ItemType Directory -Path `$dataRoot -Force | Out-Null
`$controllerPath = Join-Path `$dataRoot 'controller.ps1'
[IO.File]::WriteAllBytes(`$controllerPath, [Convert]::FromBase64String('$controllerBase64'))
"@
Invoke-RemotePowerShell -Worker $Worker -Script $script | Out-Null
}
function Ensure-AttachHelperDeployed {
param([object]$Worker)
$helperBase64 = Get-AttachHelperScriptBase64
$script = @"
`$ProgressPreference = 'SilentlyContinue'
`$dataRoot = Join-Path ([Environment]::GetFolderPath('LocalApplicationData')) 'UnifiedWorkers'
New-Item -ItemType Directory -Path `$dataRoot -Force | Out-Null
`$attachPath = Join-Path `$dataRoot 'attach-helper.ps1'
[IO.File]::WriteAllBytes(`$attachPath, [Convert]::FromBase64String('$helperBase64'))
"@
Invoke-RemotePowerShell -Worker $Worker -Script $script | Out-Null
}
function Get-EnsureWorkerScript {
param(
[string]$WorkerName,
[string]$WorkerType,
[string]$PayloadBase64
)
$payload = @{
WorkerName = $WorkerName
WorkerType = $WorkerType
PayloadBase64 = $PayloadBase64
} | ConvertTo-Json -Compress
$payloadBase64 = ConvertTo-Base64Unicode -Content $payload
return @"
`$ProgressPreference = 'SilentlyContinue'
`$params = ConvertFrom-Json ([Text.Encoding]::Unicode.GetString([Convert]::FromBase64String('$payloadBase64')))
`$workerName = `$params.WorkerName
`$workerType = `$params.WorkerType
`$payloadBase64 = `$params.PayloadBase64
`$dataRoot = Join-Path ([Environment]::GetFolderPath('LocalApplicationData')) 'UnifiedWorkers'
`$instanceRoot = Join-Path (Join-Path `$dataRoot `$workerType) `$workerName
`$metaPath = Join-Path `$instanceRoot 'state\worker-info.json'
`$controllerPath = Join-Path `$dataRoot 'controller.ps1'
if (-not (Test-Path `$controllerPath)) {
throw "Controller missing at `$controllerPath"
}
`$shouldStart = `$true
if (Test-Path `$metaPath) {
try {
`$meta = Get-Content `$metaPath -Raw | ConvertFrom-Json
if (`$meta.Status -eq 'running' -and `$meta.WorkerPid) {
if (Get-Process -Id `$meta.WorkerPid -ErrorAction SilentlyContinue) {
Write-Host "Worker `$workerName already running (PID `$($meta.WorkerPid))."
`$shouldStart = `$false
}
}
} catch {
Write-Host "Unable to parse metadata for `$workerName. Starting new controller." -ForegroundColor Yellow
}
}
if (`$shouldStart) {
`$pwsh = Get-Command pwsh -ErrorAction SilentlyContinue
if (`$pwsh) {
`$psExe = `$pwsh.Source
}
else {
`$psExe = (Get-Command powershell -ErrorAction Stop).Source
}
`$controllerArgs = @(
'-NoLogo','-NoProfile','-ExecutionPolicy','Bypass',
'-File',"`$controllerPath",
'-WorkerName',"`$workerName",
'-WorkerType',"`$workerType",
'-PayloadBase64',"`$payloadBase64"
)
Start-Process -FilePath `$psExe -ArgumentList `$controllerArgs -WindowStyle Hidden | Out-Null
Write-Host "Worker `$workerName started under controller." -ForegroundColor Green
}
"@
}
# This function deploys the controller and ensures the worker is running under it
function Ensure-PersistentWorker {
param(
[object]$Worker,
[string]$WorkerType,
[string]$PayloadScript
)
Write-Host "[$($Worker.Name)] Ensuring worker is running under controller..." -ForegroundColor Cyan
Ensure-ControllerDeployed -Worker $Worker
$payloadBase64 = ConvertTo-Base64Unicode -Content $PayloadScript
$ensureScript = Get-EnsureWorkerScript -WorkerName $Worker.Name -WorkerType $WorkerType -PayloadBase64 $payloadBase64
$result = Invoke-RemotePowerShell -Worker $Worker -Script $ensureScript
if ($result -ne 0) {
Write-Host "[$($Worker.Name)] Remote ensure command exited with code $result." -ForegroundColor Yellow
}
}
# This function deploys the attach helper and runs it on the worker over SSH
function Invoke-WorkerAttach {
param(
[object]$Worker,
[string]$WorkerType,
[switch]$CommandOnly,
[string]$Command
)
Ensure-AttachHelperDeployed -Worker $Worker
$paramsBlock = @("-WorkerName","$($Worker.Name)","-WorkerType","$WorkerType")
if ($CommandOnly) {
$paramsBlock += "-CommandOnly"
}
if ($Command) {
$paramsBlock += "-Command"
$paramsBlock += $Command
}
$parts = Get-WorkerConnectionParts -RawArgs $Worker.SSHArgs -DefaultHost $Worker.Name
$sshArgs = Build-SshArgsFromParts -Parts $parts -Interactive:(!$CommandOnly)
$remoteBasePath = Get-WorkerBasePath -Worker $Worker -ConnectionParts $parts
$remoteHelper = Join-Path $remoteBasePath 'attach-helper.ps1'
$quotedArgs = ($paramsBlock | ForEach-Object { '"' + ($_ -replace '"','""') + '"' }) -join ' '
$remoteCmd = "powershell -NoLogo -NoProfile -ExecutionPolicy Bypass -File `"$remoteHelper`" $quotedArgs"
& ssh @sshArgs $remoteCmd
}
# FUNCTIONS
# This function generates the standard PowerShell remote command
@@ -58,11 +424,10 @@ function Get-RemoteStandardWorkerCommand {
Write-Host "Setting up network connections..." -ForegroundColor Cyan
# Define arrays of drives and network paths
$drives = @('A:', 'F:', 'G:', 'N:', 'P:')
$drives = @('A:', 'F:', 'N:', 'P:')
$networkPaths = @(
'\\NEXUS\amazon',
'\\NEXUS\flamenco',
'\\NEXUS\glacier',
'\\NEXUS\proj',
'\\NAS\amazon'
)
@@ -91,7 +456,6 @@ if ($LASTEXITCODE -eq 0) {
# Map all NEXUS drives
net use A: \\NEXUS\amazon /persistent:yes
net use F: \\NEXUS\flamenco /persistent:yes
net use G: \\NEXUS\glacier /persistent:yes
net use P: \\NEXUS\proj /persistent:yes
} else {
Write-Host "Failed to connect to NEXUS" -ForegroundColor Red
@@ -111,7 +475,7 @@ net use
# Start worker
Write-Host "Starting Flamenco worker..." -ForegroundColor Cyan
Set-Location 'F:\software\Flamenco'
Set-Location 'F:\software\Flamenco 3.7'
if (Test-Path 'flamenco-worker.exe') {
Write-Host "Running flamenco-worker.exe..." -ForegroundColor Green
# Run the worker and capture its exit code
@@ -120,7 +484,7 @@ if (Test-Path 'flamenco-worker.exe') {
Write-Host "Flamenco worker process has terminated with exit code: $exitCode" -ForegroundColor Yellow
exit $exitCode # Exit with the worker's exit code to trigger restart if needed
} else {
Write-Host "Error: flamenco-worker.exe not found in F:\software\Flamenco" -ForegroundColor Red
Write-Host "Error: flamenco-worker.exe not found in F:\software\Flamenco 3.7" -ForegroundColor Red
exit 1 # Exit with error code to trigger restart
}
'@
@@ -132,11 +496,10 @@ function Get-RemoteCmdWorkerCommand {
Write-Host "Setting up network connections..." -ForegroundColor Cyan
# Define arrays of drives and network paths
$drives = @('A:', 'F:', 'G:', 'N:', 'P:')
$drives = @('A:', 'F:', 'N:', 'P:')
$networkPaths = @(
'\\NEXUS\amazon',
'\\NEXUS\flamenco',
'\\NEXUS\glacier',
'\\NEXUS\proj',
'\\NAS\amazon'
)
@@ -156,7 +519,6 @@ if ($LASTEXITCODE -eq 0) {
# Map all NEXUS drives
net use A: \\NEXUS\amazon /persistent:yes
net use F: \\NEXUS\flamenco /persistent:yes
net use G: \\NEXUS\glacier /persistent:yes
net use P: \\NEXUS\proj /persistent:yes
} else {
Write-Host "Failed to connect to NEXUS" -ForegroundColor Red
@@ -171,13 +533,13 @@ net use
# Start worker via CMD - hardcoded paths
Write-Host "Running command file..." -ForegroundColor Cyan
$defaultCmdPath = "F:\software\Flamenco\run-flamenco-worker.cmd"
$defaultCmdPath = "F:\software\Flamenco 3.7\run-flamenco-worker.cmd"
if (Test-Path $defaultCmdPath) {
Set-Location "F:\software\Flamenco"
Set-Location "F:\software\Flamenco 3.7"
Write-Host "Starting worker..." -ForegroundColor Green
# Use hardcoded path to avoid variable expansion issues
cmd.exe /c "F:\software\Flamenco\run-flamenco-worker.cmd"
cmd.exe /c "F:\software\Flamenco 3.7\run-flamenco-worker.cmd"
Write-Host "Worker process has terminated." -ForegroundColor Yellow
} else {
Write-Host "Command file not found at default location." -ForegroundColor Red
@@ -203,11 +565,10 @@ function Get-RemoteSimplifiedCmdWorkerCommand {
Write-Host "Setting up network connections..." -ForegroundColor Cyan
# Define arrays of drives and network paths
$drives = @('A:', 'F:', 'G:', 'N:', 'P:')
$drives = @('A:', 'F:', 'N:', 'P:')
$networkPaths = @(
'\\NEXUS\amazon',
'\\NEXUS\flamenco',
'\\NEXUS\glacier',
'\\NEXUS\proj',
'\\NAS\amazon'
)
@@ -227,7 +588,6 @@ if ($LASTEXITCODE -eq 0) {
# Map all NEXUS drives
net use A: \\NEXUS\amazon /persistent:yes
net use F: \\NEXUS\flamenco /persistent:yes
net use G: \\NEXUS\glacier /persistent:yes
net use P: \\NEXUS\proj /persistent:yes
} else {
Write-Host "Failed to connect to NEXUS" -ForegroundColor Red
@@ -242,7 +602,7 @@ net use
# Simple direct command execution with automatic "2" input
Write-Host "Running Flamenco worker..." -ForegroundColor Cyan
Set-Location -Path "F:\software\Flamenco"
Set-Location -Path "F:\software\Flamenco 3.7"
if (Test-Path -Path "run-flamenco-worker.cmd") {
# Create a temporary file to store the "2" input
$tempInputFile = [System.IO.Path]::GetTempFileName()
@@ -265,56 +625,9 @@ function Start-StandardWorker {
[object]$Worker
)
$retryCount = 0
$retryDelay = 15 # seconds between retries
$workerRestarted = $false
while ($true) { # Changed to infinite loop
if ($retryCount -gt 0) {
Write-Host "`nRestarting worker process (Attempt $($retryCount + 1))..." -ForegroundColor Yellow
Start-Sleep -Seconds $retryDelay
$workerRestarted = $true
}
Write-Host "Connecting to $($Worker.Name)..." -ForegroundColor Cyan
if ($workerRestarted) {
Write-Host "Worker was restarted due to disconnection or crash." -ForegroundColor Yellow
}
try {
$remoteCommand = Get-RemoteStandardWorkerCommand
# Encode the command to handle special characters
$bytes = [System.Text.Encoding]::Unicode.GetBytes($remoteCommand)
$encodedCommand = [Convert]::ToBase64String($bytes)
# Execute the encoded command on the remote machine
Write-Host "Connecting to $($Worker.Name) and executing worker script..." -ForegroundColor Yellow
# Add SSH keepalive settings to reduce chance of random disconnections
$sshCommand = "ssh -o ServerAliveInterval=60 -o ServerAliveCountMax=30 $($Worker.SSHArgs) ""powershell -EncodedCommand $encodedCommand"""
# Execute the SSH command and capture the exit code
Invoke-Expression $sshCommand
$sshExitCode = $LASTEXITCODE
# Check if SSH command completed successfully
if ($sshExitCode -eq 0) {
Write-Host "`nWorker completed successfully. Restarting automatically..." -ForegroundColor Green
Start-Sleep -Seconds 2 # Brief pause before restarting
$retryCount = 0 # Reset counter for successful completion
continue # Continue the loop instead of breaking
} else {
throw "Worker process exited with code: $sshExitCode"
}
}
catch {
$retryCount++
Write-Host "`nAn error occurred while running worker on $($Worker.Name):" -ForegroundColor Red
Write-Host $_.Exception.Message -ForegroundColor Red
Write-Host "`nAttempting to restart worker in $retryDelay seconds..." -ForegroundColor Yellow
}
}
$payloadScript = Get-RemoteStandardWorkerCommand
Ensure-PersistentWorker -Worker $Worker -WorkerType 'flamenco' -PayloadScript $payloadScript
Invoke-WorkerAttach -Worker $Worker -WorkerType 'flamenco'
}
# This function launches the CMD worker
@@ -324,56 +637,9 @@ function Start-CmdWorker {
[object]$Worker
)
$retryCount = 0
$retryDelay = 5 # seconds between retries
$workerRestarted = $false
while ($true) { # Changed to infinite loop
if ($retryCount -gt 0) {
Write-Host "`nRestarting worker process (Attempt $($retryCount + 1))..." -ForegroundColor Yellow
Start-Sleep -Seconds $retryDelay
$workerRestarted = $true
}
Write-Host "Connecting to $($Worker.Name) (CMD mode)..." -ForegroundColor Cyan
if ($workerRestarted) {
Write-Host "Worker was restarted due to disconnection or crash." -ForegroundColor Yellow
}
try {
$remoteCommand = Get-RemoteCmdWorkerCommand
# Encode the command to handle special characters
$bytes = [System.Text.Encoding]::Unicode.GetBytes($remoteCommand)
$encodedCommand = [Convert]::ToBase64String($bytes)
# Execute the encoded command on the remote machine
Write-Host "Connecting to $($Worker.Name) and executing CMD worker script..." -ForegroundColor Yellow
# Add SSH keepalive settings to reduce chance of random disconnections
$sshCommand = "ssh -o ServerAliveInterval=60 -o ServerAliveCountMax=30 $($Worker.SSHArgs) ""powershell -EncodedCommand $encodedCommand"""
# Execute the SSH command and capture the exit code
Invoke-Expression $sshCommand
$sshExitCode = $LASTEXITCODE
# Check if SSH command completed successfully
if ($sshExitCode -eq 0) {
Write-Host "`nWorker completed successfully. Restarting automatically..." -ForegroundColor Green
Start-Sleep -Seconds 2 # Brief pause before restarting
$retryCount = 0 # Reset counter for successful completion
continue # Continue the loop instead of breaking
} else {
throw "Worker process exited with code: $sshExitCode"
}
}
catch {
$retryCount++
Write-Host "`nAn error occurred while running worker on $($Worker.Name):" -ForegroundColor Red
Write-Host $_.Exception.Message -ForegroundColor Red
Write-Host "`nAttempting to restart worker in $retryDelay seconds..." -ForegroundColor Yellow
}
}
$payloadScript = Get-RemoteCmdWorkerCommand
Ensure-PersistentWorker -Worker $Worker -WorkerType 'flamenco' -PayloadScript $payloadScript
Invoke-WorkerAttach -Worker $Worker -WorkerType 'flamenco'
}
# This function launches ALL workers in Windows Terminal tabs
@@ -383,173 +649,21 @@ function Start-AllWorkers {
[string]$WorkerType
)
Write-Host "Launching ALL $WorkerType workers in Windows Terminal tabs..." -ForegroundColor Cyan
Write-Host "Ensuring ALL $WorkerType workers are running under controllers..." -ForegroundColor Cyan
try {
# First, check if Windows Terminal is available
if (-not (Get-Command wt.exe -ErrorAction SilentlyContinue)) {
Write-Host "Windows Terminal (wt.exe) not found. Falling back to separate windows." -ForegroundColor Yellow
$useTerminal = $false
} else {
$useTerminal = $true
}
foreach ($worker in $workers) {
# Create a new PowerShell script with a unique name for this worker
$tempScriptPath = [System.IO.Path]::GetTempFileName() + ".ps1"
# Create different script content based on worker type
if ($WorkerType -eq "CMD") {
# CMD workers get retry logic at the local level
$scriptContent = @"
# Wrap everything in a try-catch to prevent script termination
try {
Write-Host 'Launching $WorkerType worker for $($worker.Name)' -ForegroundColor Cyan
`$retryCount = 0
`$retryDelay = 5 # seconds between retries
`$workerRestarted = `$false
while (`$true) { # Changed to infinite loop
try {
if (`$retryCount -gt 0) {
Write-Host "`nRestarting worker process (Attempt `$(`$retryCount + 1))..." -ForegroundColor Yellow
Start-Sleep -Seconds `$retryDelay
`$workerRestarted = `$true
}
Write-Host "Connecting to $($worker.Name) ($WorkerType mode)..." -ForegroundColor Cyan
if (`$workerRestarted) {
Write-Host "Worker was restarted due to disconnection or crash." -ForegroundColor Yellow
}
# Get remote command
`$remoteCommand = @'
$(Get-RemoteSimplifiedCmdWorkerCommand)
'@
# Encode the command
`$bytes = [System.Text.Encoding]::Unicode.GetBytes(`$remoteCommand)
`$encodedCommand = [Convert]::ToBase64String(`$bytes)
# Execute SSH command with keepalive settings and capture exit code
ssh -o ServerAliveInterval=60 -o ServerAliveCountMax=30 $($worker.SSHArgs) "powershell -EncodedCommand `$encodedCommand"
`$sshExitCode = `$LASTEXITCODE
# Check if SSH command completed successfully
if (`$sshExitCode -eq 0) {
Write-Host "`nWorker completed successfully. Restarting automatically..." -ForegroundColor Green
Start-Sleep -Seconds 2 # Brief pause before restarting
`$retryCount = 0 # Reset counter for successful completion
continue # Continue the loop instead of breaking
} else {
throw "SSH command failed with exit code: `$sshExitCode"
}
}
catch {
`$retryCount++
Write-Host "An error occurred while connecting to $($worker.Name):" -ForegroundColor Red
Write-Host `$_.Exception.Message -ForegroundColor Red
Write-Host "Attempting to reconnect in `$retryDelay seconds..." -ForegroundColor Yellow
# Don't rethrow - we want to continue the retry loop
}
}
}
catch {
# This outer catch block is for any unexpected errors that might terminate the script
Write-Host "`nCRITICAL ERROR: Script encountered an unexpected error:" -ForegroundColor Red
Write-Host `$_.Exception.Message -ForegroundColor Red
Write-Host "`nRestarting the entire worker process in 5 seconds..." -ForegroundColor Yellow
Start-Sleep -Seconds 5
# Restart the script by calling itself
& `$MyInvocation.MyCommand.Path
}
"@
} else {
# Standard workers keep the original retry logic
$scriptContent = @"
# Wrap everything in a try-catch to prevent script termination
try {
Write-Host 'Launching $WorkerType worker for $($worker.Name)' -ForegroundColor Cyan
`$retryCount = 0
`$retryDelay = 5 # seconds between retries
while (`$true) { # Changed to infinite loop
try {
if (`$retryCount -gt 0) {
Write-Host "Retry attempt `$retryCount..." -ForegroundColor Yellow
Start-Sleep -Seconds `$retryDelay
}
# Get remote command
`$remoteCommand = @'
$(Get-RemoteStandardWorkerCommand)
'@
# Encode the command
`$bytes = [System.Text.Encoding]::Unicode.GetBytes(`$remoteCommand)
`$encodedCommand = [Convert]::ToBase64String(`$bytes)
# Execute SSH command with keepalive settings
ssh -o ServerAliveInterval=60 -o ServerAliveCountMax=30 $($worker.SSHArgs) "powershell -EncodedCommand `$encodedCommand"
`$sshExitCode = `$LASTEXITCODE
# Check SSH exit code and handle accordingly
if (`$sshExitCode -eq 0) {
Write-Host "`nWorker completed successfully. Restarting automatically..." -ForegroundColor Green
Start-Sleep -Seconds 2 # Brief pause before restarting
`$retryCount = 0 # Reset counter for successful completion
continue # Continue the loop instead of breaking
} else {
throw "SSH command failed with exit code: `$sshExitCode"
}
}
catch {
`$retryCount++
Write-Host "An error occurred while connecting to $($worker.Name):" -ForegroundColor Red
Write-Host `$_.Exception.Message -ForegroundColor Red
Write-Host "Attempting to reconnect in `$retryDelay seconds..." -ForegroundColor Yellow
# Don't rethrow - we want to continue the retry loop
}
}
}
catch {
# This outer catch block is for any unexpected errors that might terminate the script
Write-Host "`nCRITICAL ERROR: Script encountered an unexpected error:" -ForegroundColor Red
Write-Host `$_.Exception.Message -ForegroundColor Red
Write-Host "`nRestarting the entire worker process in 5 seconds..." -ForegroundColor Yellow
Start-Sleep -Seconds 5
# Restart the script by calling itself
& `$MyInvocation.MyCommand.Path
}
"@
}
# Write the script to file
Set-Content -Path $tempScriptPath -Value $scriptContent
if ($useTerminal) {
# Launch in a new Windows Terminal tab
$tabTitle = "$($worker.Name) - $WorkerType Worker"
Start-Process wt.exe -ArgumentList "-w 0 new-tab --title `"$tabTitle`" powershell -NoLogo -NoProfile -ExecutionPolicy Bypass -File `"$tempScriptPath`""
} else {
# Fallback to separate window if Windows Terminal is not available
Start-Process powershell -ArgumentList "-NoLogo -NoProfile -ExecutionPolicy Bypass -File `"$tempScriptPath`""
}
Write-Host "Started $($worker.Name) ($WorkerType) worker in a new tab." -ForegroundColor Green
Start-Sleep -Milliseconds 300 # Small delay between launches
}
Write-Host "`nAll $WorkerType worker scripts have been launched in Windows Terminal tabs." -ForegroundColor Cyan
}
catch {
Write-Host "Error launching workers: $($_.Exception.Message)" -ForegroundColor Red
$payloadScript = if ($WorkerType -eq 'CMD') {
Get-RemoteCmdWorkerCommand
} else {
Get-RemoteStandardWorkerCommand
}
foreach ($worker in $workers) {
Ensure-PersistentWorker -Worker $worker -WorkerType 'flamenco' -PayloadScript $payloadScript
}
Write-Host "All workers processed. Attach to any worker from the menu to monitor output." -ForegroundColor Green
Write-Host "Press Enter to return to the menu..." -ForegroundColor Green
Read-Host
Read-Host | Out-Null
}
# Main menu loop

View File

@@ -32,6 +32,414 @@ $workers = @(
@{ ID = 6; Name = "i9-13ks"; SSHArgs = "-t -p 22146 i9-13ks"; Enabled = $true }
)
$script:ControllerScriptBase64 = $null
$script:AttachHelperScriptBase64 = $null
$script:WorkerBasePathCache = @{}
function Build-SshArgsFromParts {
param(
[pscustomobject]$Parts,
[switch]$Interactive
)
# Use a dedicated variable instead of assigning to the automatic variable $args
$sshArgs = @('-o','ServerAliveInterval=60','-o','ServerAliveCountMax=30')
if ($Interactive -and $Parts.RequestPty) {
$sshArgs += '-t'
}
elseif (-not $Interactive) {
$sshArgs += '-T'
}
$sshArgs += $Parts.Options
if ($Parts.Port) {
$sshArgs += '-p'
$sshArgs += $Parts.Port
}
$sshArgs += $Parts.Host
return $sshArgs
}
function Build-ScpArgsFromParts {
param(
[pscustomobject]$Parts
)
# Use a dedicated variable instead of assigning to the automatic variable $args
$scpArgs = @('-o','ServerAliveInterval=60','-o','ServerAliveCountMax=30')
$scpArgs += $Parts.Options
if ($Parts.Port) {
$scpArgs += '-P'
$scpArgs += $Parts.Port
}
return $scpArgs
}
function Get-SshArgs {
param(
[object]$Worker,
[switch]$Interactive
)
$parts = Get-WorkerConnectionParts -RawArgs $Worker.SSHArgs -DefaultHost $Worker.Name
return Build-SshArgsFromParts -Parts $parts -Interactive:$Interactive
}
# Resolves the worker's LocalAppData\UnifiedWorkers path over SSH and caches it per worker
function Get-WorkerBasePath {
param(
[object]$Worker,
[pscustomobject]$ConnectionParts = $null
)
if ($script:WorkerBasePathCache.ContainsKey($Worker.Name)) {
return $script:WorkerBasePathCache[$Worker.Name]
}
if (-not $ConnectionParts) {
$ConnectionParts = Get-WorkerConnectionParts -RawArgs $Worker.SSHArgs -DefaultHost $Worker.Name
}
$sshArgs = Build-SshArgsFromParts -Parts $ConnectionParts -Interactive:$false
$scriptBlock = "`$ProgressPreference='SilentlyContinue'; [Environment]::GetFolderPath('LocalApplicationData')"
$encoded = [Convert]::ToBase64String([Text.Encoding]::Unicode.GetBytes($scriptBlock))
$remoteCmd = "powershell -NoLogo -NoProfile -ExecutionPolicy Bypass -EncodedCommand $encoded"
$output = & ssh @sshArgs $remoteCmd
if ($LASTEXITCODE -ne 0) {
throw "Unable to determine LocalAppData on $($Worker.Name)."
}
$base = ($output | Where-Object { -not [string]::IsNullOrWhiteSpace($_) } | Select-Object -Last 1).Trim()
if (-not $base) {
throw "Unable to read LocalAppData path on $($Worker.Name)."
}
$final = Join-Path $base 'UnifiedWorkers'
$script:WorkerBasePathCache[$Worker.Name] = $final
return $final
}
function Get-FileBase64 {
param([string]$Path)
[Convert]::ToBase64String([IO.File]::ReadAllBytes($Path))
}
function Get-ControllerScriptBase64 {
if (-not $script:ControllerScriptBase64) {
$controllerPath = Join-Path $PSScriptRoot 'remote_worker_controller.ps1'
$script:ControllerScriptBase64 = Get-FileBase64 -Path $controllerPath
}
return $script:ControllerScriptBase64
}
function Get-AttachHelperScriptBase64 {
if (-not $script:AttachHelperScriptBase64) {
$helperPath = Join-Path $PSScriptRoot 'remote_worker_attach.ps1'
$script:AttachHelperScriptBase64 = Get-FileBase64 -Path $helperPath
}
return $script:AttachHelperScriptBase64
}
function ConvertTo-Base64Unicode {
param([string]$Content)
[Convert]::ToBase64String([Text.Encoding]::Unicode.GetBytes($Content))
}
function Get-WorkerSshArgs {
param([string]$RawArgs)
if ([string]::IsNullOrWhiteSpace($RawArgs)) {
return @()
}
return $RawArgs -split '\s+' | Where-Object { $_.Trim().Length -gt 0 }
}
# Splits a raw SSH argument string into host, port, PTY flag, and remaining options
function Get-WorkerConnectionParts {
param(
[string]$RawArgs,
[string]$DefaultHost
)
$tokens = Get-WorkerSshArgs -RawArgs $RawArgs
$options = New-Object System.Collections.Generic.List[string]
$targetHost = $null
$port = $null
$requestPty = $false
$optionsWithArgs = @('-i','-o','-c','-D','-E','-F','-I','-J','-L','-l','-m','-O','-Q','-R','-S','-W','-w')
for ($i = 0; $i -lt $tokens.Count; $i++) {
$token = $tokens[$i]
if ($token -eq '-t' -or $token -eq '-tt') {
$requestPty = $true
continue
}
if ($token -eq '-p' -and ($i + 1) -lt $tokens.Count) {
$port = $tokens[$i + 1]
$i++
continue
}
if ($token.StartsWith('-')) {
$options.Add($token)
if ($optionsWithArgs -contains $token -and ($i + 1) -lt $tokens.Count) {
$options.Add($tokens[$i + 1])
$i++
}
continue
}
if (-not $targetHost) {
$targetHost = $token
continue
}
$options.Add($token)
}
if (-not $targetHost) {
$targetHost = $DefaultHost
}
return [pscustomobject]@{
Host = $targetHost
Options = $options.ToArray()
Port = $port
RequestPty = $requestPty
}
}
function Invoke-RemotePowerShell {
param(
[object]$Worker,
[string]$Script,
[switch]$Interactive
)
$parts = Get-WorkerConnectionParts -RawArgs $Worker.SSHArgs -DefaultHost $Worker.Name
if (-not $parts.Host) {
throw "Unable to determine SSH host for $($Worker.Name)"
}
$sshBaseArgs = Build-SshArgsFromParts -Parts $parts -Interactive:$Interactive
$remoteBasePath = Get-WorkerBasePath -Worker $Worker -ConnectionParts $parts
$localTemp = [System.IO.Path]::GetTempFileName() + '.ps1'
Set-Content -Path $localTemp -Value $Script -Encoding UTF8
$remoteTmpDir = Join-Path $remoteBasePath 'tmp'
$remoteScriptWin = Join-Path $remoteTmpDir ("script-{0}.ps1" -f ([guid]::NewGuid().ToString()))
$remoteScriptScp = $remoteScriptWin -replace '\\','/'
$remoteTarget = "{0}:{1}" -f $parts.Host, ('"'+$remoteScriptScp+'"')
$ensureScript = "New-Item -ItemType Directory -Path '$remoteTmpDir' -Force | Out-Null"
$ensureCmd = "powershell -NoLogo -NoProfile -ExecutionPolicy Bypass -EncodedCommand " + [Convert]::ToBase64String([Text.Encoding]::Unicode.GetBytes($ensureScript))
# Route remote output to the host so callers receive only the exit code
& ssh @sshBaseArgs $ensureCmd | Out-Host
if ($LASTEXITCODE -ne 0) {
Remove-Item $localTemp -ErrorAction SilentlyContinue
return $LASTEXITCODE
}
$scpArgs = Build-ScpArgsFromParts -Parts $parts
$scpArgs += $localTemp
$scpArgs += $remoteTarget
& scp @scpArgs | Out-Host
$scpExit = $LASTEXITCODE
Remove-Item $localTemp -ErrorAction SilentlyContinue
if ($scpExit -ne 0) {
return $scpExit
}
$execCmd = "powershell -NoLogo -NoProfile -ExecutionPolicy Bypass -File `"$remoteScriptWin`""
& ssh @sshBaseArgs $execCmd | Out-Host
$execExit = $LASTEXITCODE
$cleanupScript = "Remove-Item -LiteralPath '$remoteScriptWin' -ErrorAction SilentlyContinue"
$cleanupCmd = "powershell -NoLogo -NoProfile -ExecutionPolicy Bypass -EncodedCommand " + [Convert]::ToBase64String([Text.Encoding]::Unicode.GetBytes($cleanupScript))
& ssh @sshBaseArgs $cleanupCmd | Out-Null
return $execExit
}
function Ensure-ControllerDeployed {
param([object]$Worker)
$controllerBase64 = Get-ControllerScriptBase64
$script = @"
`$ProgressPreference = 'SilentlyContinue'
`$dataRoot = Join-Path ([Environment]::GetFolderPath('LocalApplicationData')) 'UnifiedWorkers'
New-Item -ItemType Directory -Path `$dataRoot -Force | Out-Null
`$controllerPath = Join-Path `$dataRoot 'controller.ps1'
[IO.File]::WriteAllBytes(`$controllerPath, [Convert]::FromBase64String('$controllerBase64'))
"@
Invoke-RemotePowerShell -Worker $Worker -Script $script | Out-Null
}
function Ensure-AttachHelperDeployed {
param([object]$Worker)
$helperBase64 = Get-AttachHelperScriptBase64
$script = @"
`$ProgressPreference = 'SilentlyContinue'
`$dataRoot = Join-Path ([Environment]::GetFolderPath('LocalApplicationData')) 'UnifiedWorkers'
New-Item -ItemType Directory -Path `$dataRoot -Force | Out-Null
`$attachPath = Join-Path `$dataRoot 'attach-helper.ps1'
[IO.File]::WriteAllBytes(`$attachPath, [Convert]::FromBase64String('$helperBase64'))
"@
Invoke-RemotePowerShell -Worker $Worker -Script $script | Out-Null
}
function Get-EnsureWorkerScript {
param(
[string]$WorkerName,
[string]$WorkerType,
[string]$PayloadBase64
)
$payload = @{
WorkerName = $WorkerName
WorkerType = $WorkerType
PayloadBase64 = $PayloadBase64
} | ConvertTo-Json -Compress
$payloadBase64 = ConvertTo-Base64Unicode -Content $payload
return @"
`$ProgressPreference = 'SilentlyContinue'
`$params = ConvertFrom-Json ([Text.Encoding]::Unicode.GetString([Convert]::FromBase64String('$payloadBase64')))
`$workerName = `$params.WorkerName
`$workerType = `$params.WorkerType
`$payloadBase64 = `$params.PayloadBase64
`$dataRoot = Join-Path ([Environment]::GetFolderPath('LocalApplicationData')) 'UnifiedWorkers'
`$instanceRoot = Join-Path (Join-Path `$dataRoot `$workerType) `$workerName
`$metaPath = Join-Path `$instanceRoot 'state\worker-info.json'
`$controllerPath = Join-Path `$dataRoot 'controller.ps1'
if (-not (Test-Path `$controllerPath)) {
throw "Controller missing at `$controllerPath"
}
`$shouldStart = `$true
if (Test-Path `$metaPath) {
try {
`$meta = Get-Content `$metaPath -Raw | ConvertFrom-Json
if (`$meta.Status -eq 'running' -and `$meta.WorkerPid) {
if (Get-Process -Id `$meta.WorkerPid -ErrorAction SilentlyContinue) {
Write-Host "Worker `$workerName already running (PID `$($meta.WorkerPid))."
`$shouldStart = `$false
}
}
} catch {
Write-Host "Failed to read metadata. Controller will restart worker." -ForegroundColor Yellow
}
}
if (`$shouldStart) {
`$pwsh = Get-Command pwsh -ErrorAction SilentlyContinue
if (`$pwsh) {
`$psExe = `$pwsh.Source
}
else {
`$psExe = (Get-Command powershell -ErrorAction Stop).Source
}
`$controllerArgs = @(
'-NoLogo','-NoProfile','-ExecutionPolicy','Bypass',
'-File',"`$controllerPath",
'-WorkerName',"`$workerName",
'-WorkerType',"`$workerType",
'-PayloadBase64',"`$payloadBase64"
)
Start-Process -FilePath `$psExe -ArgumentList `$controllerArgs -WindowStyle Hidden | Out-Null
Write-Host "Worker `$workerName started under controller." -ForegroundColor Green
}
"@
}
function Ensure-PersistentWorker {
param(
[object]$Worker,
[string]$WorkerType,
[string]$PayloadScript
)
Ensure-ControllerDeployed -Worker $Worker
$payloadBase64 = ConvertTo-Base64Unicode -Content $PayloadScript
$ensureScript = Get-EnsureWorkerScript -WorkerName $Worker.Name -WorkerType $WorkerType -PayloadBase64 $payloadBase64
Invoke-RemotePowerShell -Worker $Worker -Script $ensureScript | Out-Null
}
# Returns $true when the worker's state\worker-info.json exists on the remote machine
function Test-WorkerMetadataExists {
param(
[object]$Worker,
[string]$WorkerType
)
$payload = @{
WorkerName = $Worker.Name
WorkerType = $WorkerType
} | ConvertTo-Json -Compress
$payloadBase64 = ConvertTo-Base64Unicode -Content $payload
$script = @"
`$ProgressPreference = 'SilentlyContinue'
`$params = ConvertFrom-Json ([Text.Encoding]::Unicode.GetString([Convert]::FromBase64String('$payloadBase64')))
`$dataRoot = Join-Path ([Environment]::GetFolderPath('LocalApplicationData')) 'UnifiedWorkers'
`$instanceRoot = Join-Path (Join-Path `$dataRoot `$params.WorkerType) `$params.WorkerName
`$metaPath = Join-Path `$instanceRoot 'state\worker-info.json'
if (Test-Path `$metaPath) {
exit 0
}
exit 1
"@
$result = Invoke-RemotePowerShell -Worker $Worker -Script $script
return ($result -eq 0)
}
# Polls every 500 ms until the worker metadata file exists or the timeout expires
function Wait-WorkerMetadata {
param(
[object]$Worker,
[string]$WorkerType,
[int]$TimeoutSeconds = 30
)
$deadline = [DateTime]::UtcNow.AddSeconds($TimeoutSeconds)
while ([DateTime]::UtcNow -lt $deadline) {
if (Test-WorkerMetadataExists -Worker $Worker -WorkerType $WorkerType) {
return $true
}
Start-Sleep -Milliseconds 500
}
return $false
}
function Invoke-WorkerAttach {
param(
[object]$Worker,
[string]$WorkerType,
[switch]$CommandOnly,
[string]$Command
)
Ensure-AttachHelperDeployed -Worker $Worker
$paramsBlock = @("-WorkerName","$($Worker.Name)","-WorkerType","$WorkerType")
if ($CommandOnly) {
$paramsBlock += "-CommandOnly"
}
if ($Command) {
$paramsBlock += "-Command"
$paramsBlock += $Command
}
$parts = Get-WorkerConnectionParts -RawArgs $Worker.SSHArgs -DefaultHost $Worker.Name
$sshArgs = Build-SshArgsFromParts -Parts $parts -Interactive:(!$CommandOnly)
$remoteBase = Get-WorkerBasePath -Worker $Worker -ConnectionParts $parts
$remoteHelper = Join-Path $remoteBase 'attach-helper.ps1'
$quotedArgs = ($paramsBlock | ForEach-Object { '"' + ($_ -replace '"','""') + '"' }) -join ' '
$remoteCmd = "powershell -NoLogo -NoProfile -ExecutionPolicy Bypass -File `"$remoteHelper`" $quotedArgs"
& ssh @sshArgs $remoteCmd
}
function Get-RemoteSheepItCommand {
param(
[string]$RenderKey,
@@ -44,6 +452,7 @@ function Get-RemoteSheepItCommand {
$urlLiteral = '@(' + (($SheepItJarUrls | ForEach-Object { "'$_'" }) -join ', ') + ')'
@"
`$ProgressPreference = 'SilentlyContinue'
`$ErrorActionPreference = 'Stop'
try {
@@ -135,6 +544,7 @@ try {
throw
}
}
catch {
Write-Host ('Error: {0}' -f `$_.Exception.Message) -ForegroundColor Red
Write-Host ('Stack trace: {0}' -f `$_.ScriptStackTrace) -ForegroundColor DarkRed
@@ -142,128 +552,46 @@ catch {
"@
}
function New-SheepItSessionScript {
param(
[object]$Worker,
[string]$RemoteCommand
)
$rawArgs = @($Worker.SSHArgs -split '\s+' | Where-Object { $_ -and $_.Trim().Length -gt 0 })
$sshArgsJson = if ($rawArgs) { ($rawArgs | ConvertTo-Json -Compress) } else { '[]' }
$targetHost = ($rawArgs | Where-Object { $_ -notmatch '^-{1,2}' } | Select-Object -Last 1)
$hostLiteral = if ($targetHost) { "'" + ($targetHost -replace "'", "''") + "'" } else { '$null' }
$portValue = $null
for ($i = 0; $i -lt $rawArgs.Count; $i++) {
if ($rawArgs[$i] -eq '-p' -and $i -lt ($rawArgs.Count - 1)) {
$portValue = $rawArgs[$i + 1]
break
}
}
$portLiteral = if ($portValue) { "'" + ($portValue -replace "'", "''") + "'" } else { '$null' }
$remoteScriptBase64 = [Convert]::ToBase64String([System.Text.Encoding]::Unicode.GetBytes($RemoteCommand))
$scriptContent = @"
Write-Host 'Connecting to $($Worker.Name)...' -ForegroundColor Cyan
`$sshArgs = ConvertFrom-Json '$sshArgsJson'
`$targetHost = $hostLiteral
`$port = $portLiteral
`$scriptBase64 = '$remoteScriptBase64'
if (-not `$targetHost) {
Write-Host "Unable to determine SSH host for $($Worker.Name)." -ForegroundColor Red
return
}
`$localTemp = [System.IO.Path]::GetTempFileName() + '.ps1'
`$remoteScript = [System.Text.Encoding]::Unicode.GetString([Convert]::FromBase64String(`$scriptBase64))
Set-Content -Path `$localTemp -Value `$remoteScript -Encoding UTF8
`$remoteFile = "sheepit-$([System.Guid]::NewGuid().ToString()).ps1"
`$scpArgs = @('-o','ServerAliveInterval=60','-o','ServerAliveCountMax=30')
if (`$port) {
`$scpArgs += '-P'
`$scpArgs += `$port
}
`$scpArgs += `$localTemp
`$scpArgs += ("${targetHost}:`$remoteFile")
Write-Host "Transferring SheepIt script to `$targetHost..." -ForegroundColor Gray
& scp @scpArgs
if (`$LASTEXITCODE -ne 0) {
Write-Host "Failed to copy script to `$targetHost." -ForegroundColor Red
Remove-Item `$localTemp -ErrorAction SilentlyContinue
return
}
`$sshBase = @('-o','ServerAliveInterval=60','-o','ServerAliveCountMax=30') + `$sshArgs
`$execArgs = `$sshBase + @('powershell','-NoLogo','-NoProfile','-ExecutionPolicy','Bypass','-File',"`$remoteFile")
Write-Host "Starting SheepIt client on `$targetHost..." -ForegroundColor Cyan
& ssh @execArgs
Write-Host "Cleaning up remote script..." -ForegroundColor Gray
`$cleanupArgs = `$sshBase + @('powershell','-NoLogo','-NoProfile','-ExecutionPolicy','Bypass','-Command',"`$ErrorActionPreference='SilentlyContinue'; Remove-Item -LiteralPath ``"$remoteFile``" -ErrorAction SilentlyContinue")
& ssh @cleanupArgs | Out-Null
Remove-Item `$localTemp -ErrorAction SilentlyContinue
Write-Host "`nSSH session ended." -ForegroundColor Yellow
Read-Host "Press Enter to close"
"@
return $scriptContent
}
function Start-SheepItTab {
param(
[string]$Title,
[string]$Content
)
$tempScript = [System.IO.Path]::GetTempFileName() + '.ps1'
Set-Content -Path $tempScript -Value $Content -Encoding UTF8
if (Get-Command wt.exe -ErrorAction SilentlyContinue) {
Start-Process wt.exe -ArgumentList "-w 0 new-tab --title `"$Title`" powershell -NoLogo -NoProfile -ExecutionPolicy Bypass -File `"$tempScript`""
} else {
Start-Process powershell -ArgumentList "-NoLogo -NoProfile -ExecutionPolicy Bypass -File `"$tempScript`""
}
function Ensure-SheepItWorkerController {
param([object]$Worker)
Initialize-SheepItCredentials
$payloadScript = Get-RemoteSheepItCommand -RenderKey $script:SheepItRenderKey -Username $script:SheepItUsername
Ensure-PersistentWorker -Worker $Worker -WorkerType 'sheepit' -PayloadScript $payloadScript
}
function Start-SheepItWorker {
param([object]$Worker)
if (-not $Worker.Enabled) {
Write-Host "$($Worker.Name) is not enabled for SheepIt." -ForegroundColor Yellow
Ensure-SheepItWorkerController -Worker $Worker
if (-not (Wait-WorkerMetadata -Worker $Worker -WorkerType 'sheepit' -TimeoutSeconds 20)) {
Write-Host "Worker metadata did not appear on $($Worker.Name). Check controller logs." -ForegroundColor Red
return
}
Initialize-SheepItCredentials
$remoteCommand = Get-RemoteSheepItCommand -RenderKey $script:SheepItRenderKey -Username $script:SheepItUsername
$sessionScript = New-SheepItSessionScript -Worker $Worker -RemoteCommand $remoteCommand
$title = "$($Worker.Name) - SheepIt"
Start-SheepItTab -Title $title -Content $sessionScript
Write-Host "Opened SheepIt session for $($Worker.Name) in a new terminal tab." -ForegroundColor Green
Invoke-WorkerAttach -Worker $Worker -WorkerType 'sheepit'
}
function Start-AllSheepIt {
Initialize-SheepItCredentials
$targets = $workers | Where-Object { $_.Enabled }
if (-not $targets) {
Write-Host "No systems are ready for SheepIt." -ForegroundColor Yellow
return
}
foreach ($worker in $targets) {
Start-SheepItWorker -Worker $worker
Start-Sleep -Milliseconds 200
foreach ($worker in $workers | Where-Object { $_.Enabled }) {
Ensure-SheepItWorkerController -Worker $worker
}
Write-Host "All enabled SheepIt workers ensured running under controllers." -ForegroundColor Green
Write-Host "Use the attach option to monitor any worker." -ForegroundColor Cyan
Read-Host "Press Enter to continue" | Out-Null
}
function Send-SheepItCommandAll {
param([string]$CommandText)
foreach ($worker in $workers | Where-Object { $_.Enabled }) {
Write-Host "[$($worker.Name)] Sending '$CommandText'..." -ForegroundColor Gray
Invoke-WorkerAttach -Worker $worker -WorkerType 'sheepit' -CommandOnly -Command $CommandText
}
Write-Host "Command '$CommandText' dispatched to all enabled workers." -ForegroundColor Green
Read-Host "Press Enter to continue" | Out-Null
}
function Select-SheepItWorker {
while ($true) {
Show-Header
@@ -308,20 +636,22 @@ Initialize-SheepItCredentials
while ($true) {
Show-Header
Write-Host "Main Menu:" -ForegroundColor Magenta
Write-Host "1. Launch SheepIt on a single system" -ForegroundColor Yellow
Write-Host "2. Launch SheepIt on all ready systems" -ForegroundColor Yellow
Write-Host "3. Exit" -ForegroundColor Yellow
Write-Host "1. Launch/Attach SheepIt on a single system" -ForegroundColor Yellow
Write-Host "2. Ensure all ready systems are running" -ForegroundColor Yellow
Write-Host "3. Pause all workers" -ForegroundColor Yellow
Write-Host "4. Resume all workers" -ForegroundColor Yellow
Write-Host "5. Quit all workers" -ForegroundColor Yellow
Write-Host "6. Exit" -ForegroundColor Yellow
$choice = Read-Host "Select option (1-6)"
switch ($choice) {
'1' { Select-SheepItWorker }
'2' {
Start-AllSheepIt
Write-Host
Read-Host "Press Enter to return to the main menu" | Out-Null
}
'3' { break }
'2' { Start-AllSheepIt }
'3' { Send-SheepItCommandAll -CommandText 'pause' }
'4' { Send-SheepItCommandAll -CommandText 'resume' }
'5' { Send-SheepItCommandAll -CommandText 'quit' }
'6' { break }
default {
Write-Host "Invalid selection." -ForegroundColor Red
Start-Sleep -Seconds 1