Grid
The easiest way to distribute pipelines and pipeline batches is to use the Simplygon Grid. There is no need to set up coordinators or network details: just run the grid agent executable on any number of machines on the local network and everything is set up for distribution.
Running a distribution agent
To start a grid agent, run the SimplygonGridAgent.exe tool on the machine. No coordinator or network setup is needed; the pipeline execution locates available processing nodes using mDNS/DNS-SD.
As of version 10.2.5200 the agent must run the same Simplygon version as the client. An agent running a different version is still discoverable, but the -Discover query will report errors asking you to upgrade the agent to the compatible version displayed in the error message:
[2023-09-29 10:14:47] [INFO] -------------------------------------------
[2023-09-29 10:14:47] [INFO] 9.9.9.132:57644 (10.2.5600)
[2023-09-29 10:14:47] [INFO] Load | Vulkan | Service
[2023-09-29 10:14:47] [INFO] 0 | Yes | No
[2023-09-29 10:14:47] [ERROR] - This agent is unable to process any Grid jobs: agent (10.2.5600) is not running the same version as client (10.2.5803)
[2023-09-29 10:14:47] [ERROR] Please upgrade agent to the same version as client SDK: 10.2.5803
[2023-09-29 10:14:47] [INFO] -------------------------------------------
Check Grid capabilities
Grid agent availability is currently based on compatible version, Vulkan capabilities and load. To check the status of your grid agents, run SimplygonBatch.exe -Discover. This lists all discoverable agents and prints information about each one. Here is an example output:
❯ & "C:\Program Files\Simplygon\10\SimplygonBatch.exe" -Discover
SimplygonBatch discover remote agents
...
[2023-09-29 10:14:47] [INFO] Found 3 agents:
[2023-09-29 10:14:47] [INFO] -------------------------------------------
[2023-09-29 10:14:47] [INFO] 9.9.9.120:49157 (10.2.5803)
[2023-09-29 10:14:47] [INFO] Load | Vulkan | Service
[2023-09-29 10:14:47] [INFO] 0 | Yes | No
[2023-09-29 10:14:47] [INFO] -------------------------------------------
[2023-09-29 10:14:47] [INFO] 9.9.9.248:49157 (10.2.5600)
[2023-09-29 10:14:47] [INFO] Load | Vulkan | Service
[2023-09-29 10:14:47] [INFO] 10 | Yes | No
[2023-09-29 10:14:47] [ERROR] - This agent is unable to process any Grid jobs: agent (10.2.5600) is not running the same version as client (10.2.5803)
[2023-09-29 10:14:47] [ERROR] Please upgrade agent to the same version as client SDK: 10.2.5803
[2023-09-29 10:14:47] [INFO] -------------------------------------------
[2023-09-29 10:14:47] [INFO] 9.9.9.203:49157 (10.2.5803)
[2023-09-29 10:14:47] [INFO] Load | Vulkan | Service
[2023-09-29 10:14:47] [INFO] 65 | Yes | Yes
[2023-09-29 10:14:47] [WARNING] - No Vulkan capabilities. Pipelines requiring Vulkan cannot be distributed to this agent.
[2023-09-29 10:14:47] [INFO] -------------------------------------------
In the above example, one of the agents (9.9.9.203) is reported as missing Vulkan capabilities and won't pick up pipelines that require Vulkan. Also, one of the agents (9.9.9.248) is running an older version and needs an upgrade in order to be available and pick up distributed jobs.
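If you want to act on the -Discover output from a script, the log lines can be parsed into one record per agent. The following is a minimal sketch that assumes the log layout is exactly as in the example above; the function name `parse_discover_output` and the record fields are hypothetical and not part of Simplygon.

```python
import re

# "[timestamp] [LEVEL] message"
LINE_RE = re.compile(r"^\[[^\]]*\] \[(\w+)\] (.*)$")
# Agent header such as "9.9.9.120:49157 (10.2.5803)"
AGENT_RE = re.compile(r"(\d+\.\d+\.\d+\.\d+):(\d+) \((\d+\.\d+\.\d+)\)")
# Value row such as "65 | Yes | Yes" (Load | Vulkan | Service)
ROW_RE = re.compile(r"(\d+) \| (Yes|No) \| (Yes|No)")


def parse_discover_output(text):
    """Return one dict per agent found in -Discover log text."""
    agents = []
    for raw in text.splitlines():
        line = LINE_RE.match(raw.strip())
        if not line:
            continue
        level, message = line.group(1), line.group(2).strip()
        header = AGENT_RE.fullmatch(message)
        if header:
            agents.append({
                "address": header.group(1),
                "port": int(header.group(2)),
                "version": header.group(3),
                "load": None,
                "errors": [],
                "warnings": [],
            })
            continue
        if not agents:
            continue  # lines before the first agent header
        if level == "ERROR":
            agents[-1]["errors"].append(message)
        elif level == "WARNING":
            agents[-1]["warnings"].append(message)
        else:
            row = ROW_RE.fullmatch(message)
            if row:
                agents[-1]["load"] = int(row.group(1))
    return agents


SAMPLE = """\
[2023-09-29 10:14:47] [INFO] Found 2 agents:
[2023-09-29 10:14:47] [INFO] -------------------------------------------
[2023-09-29 10:14:47] [INFO] 9.9.9.120:49157 (10.2.5803)
[2023-09-29 10:14:47] [INFO] Load | Vulkan | Service
[2023-09-29 10:14:47] [INFO] 0 | Yes | No
[2023-09-29 10:14:47] [INFO] -------------------------------------------
[2023-09-29 10:14:47] [INFO] 9.9.9.248:49157 (10.2.5600)
[2023-09-29 10:14:47] [INFO] Load | Vulkan | Service
[2023-09-29 10:14:47] [INFO] 10 | Yes | No
[2023-09-29 10:14:47] [ERROR] - This agent is unable to process any Grid jobs: agent (10.2.5600) is not running the same version as client (10.2.5803)
[2023-09-29 10:14:47] [ERROR] Please upgrade agent to the same version as client SDK: 10.2.5803
[2023-09-29 10:14:47] [INFO] -------------------------------------------
"""

agents = parse_discover_output(SAMPLE)
usable = [a["address"] for a in agents if not a["errors"]]
print(usable)
```

Agents with a non-empty `errors` list cannot process Grid jobs until they are upgraded, so a script can use that list to flag machines that need attention.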
Bridging subnets
If your network is segmented and multicast UDP packets cannot traverse subnet boundaries, you can bridge the subnets. The batch tool connects to a Grid agent on a different subnet over TCP/IP, and discovery of hosts on that subnet is routed back to the batch tool over that connection. To enable this, set the environment variable SIMPLYGON_<MajorVersion>_GRID_HOSTS to a semicolon-separated list of machines (either IP address or FQDN) running Grid agents on different subnets. You do not need to list every machine, only one on each subnet that can act as the bridge during discovery. The bridge connection uses TCP/IP port 55001, which must be open in any firewalls. The environment variable needs to be set only on the machine that is the distributor.
Replace <MajorVersion> with the major version of Simplygon, e.g. SIMPLYGON_10_GRID_HOSTS.
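As an illustration, the variable name and its semicolon-separated value can be assembled in a script before launching the batch tool on the distributor machine. This is a minimal sketch; the helper names and the host names below are hypothetical.

```python
import os


def grid_hosts_variable(major_version):
    """Name of the bridge variable, e.g. SIMPLYGON_10_GRID_HOSTS."""
    return f"SIMPLYGON_{major_version}_GRID_HOSTS"


def grid_hosts_value(bridge_hosts):
    """Join bridge hosts (IP addresses or FQDNs) with semicolons."""
    hosts = [h.strip() for h in bridge_hosts if h.strip()]
    return ";".join(hosts)


def environment_with_bridges(major_version, bridge_hosts, base_env=None):
    """Copy an environment and add the bridge variable to the copy."""
    env = dict(os.environ if base_env is None else base_env)
    env[grid_hosts_variable(major_version)] = grid_hosts_value(bridge_hosts)
    return env


# Hypothetical hosts: one Grid agent per remote subnet.
env = environment_with_bridges(
    10, ["192.168.2.68", "agent01.studio.example"], base_env={}
)
print(env)
```

The resulting dictionary can be passed as the `env` argument of `subprocess.run` when starting SimplygonBatch.exe, which keeps the setting local to that process instead of changing it system-wide.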
Running as a Windows service
The grid agent can be started as a Windows service. To install the service, run the grid agent executable with the -InstallService argument. To uninstall the service, run the grid agent executable with the -UninstallService argument. Errors are posted to the Windows event log. Note that services run in session 0 and cannot display any UI to the logged-in user.
Log to file
If you want the grid agent log output written to a file, add the -LogFile <path-to-file> argument to the command line.
Distributed pipeline execution
Distribution is enabled by passing the appropriate run mode to the pipeline run scene API:
// C++
spReductionPipeline reductionPipeline = sg->CreateReductionPipeline();
// Process a scene from file directly using distribution
reductionPipeline->RunSceneFromFile( "input.obj", "output.obj", EPipelineRunMode::RunDistributedUsingSimplygonGrid );
// Or if you have a scene object, process it using distribution
reductionPipeline->RunScene( scene, EPipelineRunMode::RunDistributedUsingSimplygonGrid );
// C#
spReductionPipeline reductionPipeline = sg.CreateReductionPipeline();
// Process a scene from file directly using distribution
reductionPipeline.RunSceneFromFile( "input.obj", "output.obj", EPipelineRunMode.RunDistributedUsingSimplygonGrid );
// Or if you have a scene object, process it using distribution
reductionPipeline.RunScene( scene, EPipelineRunMode.RunDistributedUsingSimplygonGrid );
# Python
reductionPipeline = sg.CreateReductionPipeline()
# Process a scene from file directly using distribution
reductionPipeline.RunSceneFromFile( "input.obj", "output.obj", EPipelineRunMode_RunDistributedUsingSimplygonGrid )
# Or if you have a scene object, process it using distribution
reductionPipeline.RunScene( scene, EPipelineRunMode_RunDistributedUsingSimplygonGrid )
Distributed batch tool
Distribution can be enabled when invoking the batch tool (SimplygonBatch.exe) by passing the -Distribute parameter on the command line:
SimplygonBatch.exe -Distribute <path/to/pipeline.json> <input.scene> <output.scene>
SimplygonBatch.exe -Distribute <path/to/pipelinebatch.json>
If you wish to integrate the batch tool in an asset processing tool and read progress for a UI, you can pass an additional parameter to the batch tool executable, -Progress. It suppresses the command line progress bar and instead prints progress as a number between 0 and 100, one line at a time, to the standard output.
SimplygonBatch.exe -Distribute -Progress <path/to/pipeline.json> <input.scene> <output.scene>
SimplygonBatch.exe -Distribute -Progress <path/to/pipelinebatch.json>
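A host tool could consume the -Progress output along these lines. This is a sketch under the assumption that each progress line contains only a whole number; lines that do not match are ignored. The function names `parse_progress` and `run_with_progress` are hypothetical.

```python
import re
import subprocess

PROGRESS_RE = re.compile(r"[0-9]{1,3}")


def parse_progress(lines):
    """Yield integer progress values (0-100) from lines of batch tool output."""
    for line in lines:
        text = line.strip()
        if PROGRESS_RE.fullmatch(text) and int(text) <= 100:
            yield int(text)


def run_with_progress(command, on_progress):
    """Run the batch tool and call on_progress for every progress line."""
    with subprocess.Popen(command, stdout=subprocess.PIPE, text=True) as process:
        for value in parse_progress(process.stdout):
            on_progress(value)
    return process.returncode


# Parsing can be tried without running the batch tool:
print(list(parse_progress(["0\n", "37\n", "some other output\n", "100\n"])))
```

To drive a real run, call `run_with_progress` with the same arguments as on the command line, for example `["SimplygonBatch.exe", "-Distribute", "-Progress", pipeline_path, input_path, output_path]`, and update the UI from the `on_progress` callback.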
Here is an example how to do batch processing using Simplygon Grid.
How it works
The pipeline execution uses the batch tool (SimplygonBatch.exe) to do the distribution. The batch tool finds suitable nodes using mDNS/DNS-SD (multicast UDP port 5353), picks the node with the lowest overall load and sends/receives files using TCP/IP (by default using an ephemeral port). Progress and errors are propagated back to the originating host.
Cascaded distribution
If a pipeline execution is distributed, and the pipeline is cascaded with multiple child pipelines, the execution of the child pipelines will be re-distributed to parallelize work as much as possible. This is handled internally by the pipeline execution.
Fallback
If the pipeline execution is unable to locate a suitable node for distributed execution it will fall back to local processing.
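The selection and fallback behavior described above can be pictured with a small model. This is only an illustration of the stated rules (matching version, Vulkan support when needed, lowest load, local fallback), not the batch tool's actual implementation. The `Agent` type, the `pick_agent` function and the `load_limit` parameter are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Agent:
    address: str
    version: str
    load: int      # load percentage reported by the agent
    vulkan: bool   # whether the agent reports Vulkan capabilities


def pick_agent(
    agents: List[Agent],
    client_version: str,
    needs_vulkan: bool = False,
    load_limit: int = 100,
) -> Optional[Agent]:
    """Return the least loaded suitable agent, or None if there is none."""
    candidates = [
        a for a in agents
        if a.version == client_version
        and (a.vulkan or not needs_vulkan)
        and a.load <= load_limit
    ]
    if not candidates:
        return None  # the caller would fall back to local processing
    return min(candidates, key=lambda a: a.load)


agents = [
    Agent("9.9.9.120", "10.2.5803", load=0, vulkan=True),
    Agent("9.9.9.248", "10.2.5600", load=10, vulkan=True),
    Agent("9.9.9.203", "10.2.5803", load=65, vulkan=False),
]

chosen = pick_agent(agents, client_version="10.2.5803", needs_vulkan=True)
print(chosen.address if chosen else "process locally")
```

With the agents from the -Discover example, the agent on the older version is skipped and the idle, version-matched agent is chosen. If every agent is filtered out, the function returns None, which corresponds to falling back to local processing.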
Settings
Global setting | Description |
---|---|
BatchLogPath | Output SimplygonBatch.exe's log to file. This setting can be very useful when you want to troubleshoot pipelines executed using RunInNewProcess or RunDistributedUsingSimplygonGrid. |
GridDiscoverTimeout | The time in milliseconds SimplygonBatch waits before presenting agents found during -Discover. |
GridAgentLoadLimit | The maximum allowed agent load, between 1 and 100%. |
GridAgentDiscoveryDelay | How long, in milliseconds, SimplygonBatch pauses before searching for agents on the network again. |
GridRetryLimitForFailedJob | Number of retries for a failed grid job. |
GridAgentBalancerJobTolerance | Specify the number of jobs that the balancer can skip before it runs. Increasing this number will reduce the efficiency of the agent usage. |
GridEnableAgentIpSorting | Set to True if you want the discovered agents to also be sorted by IP address. |
GridOverloadedTimeout | Specify how long grid should wait for busy agents to be available for distribution again. |
GridEnableVerboseLogging | Control whether grid should output detailed information about its internal operations and events to the console or a log file. If set to true, SimplygonBatch will produce verbose logs; otherwise, it will only produce minimal logs. |
Simplygon Grid Agent arguments
Example usage: SimplygonGridAgent.exe -LogFile "C:\GridLogs\agent.log" -LogVerbose -LogClient -MaxJobs 2
Grid Agent Arguments | Description |
---|---|
-LogFile PATH | Output SimplygonGridAgent.exe's log to file by specifying a PATH. This setting can be very useful when you want to troubleshoot the agent. |
-LogVerbose | If set, SimplygonGridAgent will produce verbose logs; otherwise, it will only produce minimal logs. This setting can be very useful when you want to troubleshoot the agent. |
-LogClient | If set, logging from each client thread will be written to the log. This setting can be very useful when you want to troubleshoot the agent. |
-MaxJobs N | Limit the number of jobs that this agent can process. The default limit is 50% of the logical cores. For example, a CPU with 16 logical cores will have a limit of 8 jobs. |
-CreateConsoleWindow | Use this flag to open the agent console window at startup. |
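When launching agents from a deployment script, the command line can be built from the arguments in the table above. The sketch below also mirrors the documented default job limit of 50% of the logical cores. Rounding down with a minimum of one job is an assumption, and the helper names are hypothetical.

```python
def default_max_jobs(logical_cores):
    """Half the logical cores (assumed: rounded down, never below one)."""
    return max(1, logical_cores // 2)


def agent_command(executable, max_jobs=None, log_file=None, verbose=False):
    """Build an argument list for the Grid agent from the documented flags."""
    command = [executable]
    if log_file is not None:
        command += ["-LogFile", log_file]
    if verbose:
        command.append("-LogVerbose")
    if max_jobs is not None:
        command += ["-MaxJobs", str(max_jobs)]
    return command


print(default_max_jobs(16))
print(agent_command(
    "SimplygonGridAgent.exe",
    max_jobs=2,
    log_file=r"C:\GridLogs\agent.log",
    verbose=True,
))
```

For a CPU with 16 logical cores, `default_max_jobs` returns 8, matching the example in the table. The list returned by `agent_command` can be passed directly to `subprocess.Popen`.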
Troubleshoot
Network troubleshooting
In case the distribution batch tool has trouble distributing processing work, you can run the SimplygonBatch.exe tool with the -Discover command line argument to make it list the remote grid agents it is able to find. If you are running agents on your local network and this call does not list any remote hosts, check that your network is not blocking multicast UDP packets between machines.
If the Grid agent reports the wrong IP address, it is possible to select which network interface the agent should listen on by setting the SIMPLYGON_<MajorVersion>_GRID_IFINDEX environment variable before starting the Grid agent.
See more information under Installation -> Environment variables.
Agents table
If the grid agents are fixed and known in advance, you can provide a table listing the agents to use for processing, thereby sidestepping the mDNS discovery process. This can be beneficial in environments with a number of dedicated processing machines, where discovery adds a bit of overhead to each processing job. It is especially useful if most jobs are quick, so that the overhead becomes a significant part of the processing time.
Creating an agents table
The agents table is an .xml file which lists all agents available for processing. The easiest way to create the agents table is to use the -DiscoverToFile argument:
.\SimplygonBatch.exe -DiscoverToFile CurrentAgents.xml
This call creates the .xml file CurrentAgents.xml, which contains the agents currently available through the mDNS discovery path.
The file will look like the example below, where each Agent element lists the version, address, port and capabilities of the available agent.
<?xml version="1.0" encoding="UTF-8"?>
<SimplygonGrid>
<GridAgentsInfo Version="10.2">
<AgentsTable>
<Agent VersionMajor="10" VersionMinor="3" Protocol="ip4" Address="192.168.1.12" Port="46938" JobsMax="16" VulkanSupported="1"/>
<Agent VersionMajor="10" VersionMinor="3" Protocol="ip4" Address="192.168.1.16" Port="44345" JobsMax="16" VulkanSupported="1"/>
<Agent VersionMajor="10" VersionMinor="3" Protocol="ip4" Address="192.168.1.24" Port="44847" JobsMax="16" VulkanSupported="1"/>
<Agent VersionMajor="10" VersionMinor="3" Protocol="ip4" Address="192.168.2.68" Port="40227" JobsMax="8" VulkanSupported="0"/>
<Agent VersionMajor="10" VersionMinor="3" Protocol="ip4" Address="192.168.2.38" Port="49094" JobsMax="16" VulkanSupported="1"/>
<Agent VersionMajor="10" VersionMinor="3" Protocol="ip4" Address="192.168.2.66" Port="30125" JobsMax="8" VulkanSupported="0"/>
<Agent VersionMajor="10" VersionMinor="3" Protocol="ip4" Address="192.168.2.67" Port="30219" JobsMax="8" VulkanSupported="0"/>
<Agent VersionMajor="10" VersionMinor="2" Protocol="ip4" Address="192.168.2.61" Port="43434" JobsMax="4" VulkanSupported="0"/>
</AgentsTable>
</GridAgentsInfo>
</SimplygonGrid>
Please note that the agents table should be as up to date as possible. If a table is provided, the batch tool will only try to use agents from the table (see the section below on how to provide the table when running grid processing). Also, although the batch tool will continue to contact agents until one is found, there is an overhead if an agent is listed but not available, as the batch tool will wait for a connection timeout.
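To review a table before using it, the file can be read with a standard XML parser. The sketch below assumes the element and attribute names shown in the example file; the function name `read_agents` and the shortened sample are for illustration only.

```python
import xml.etree.ElementTree as ET

SAMPLE_TABLE = """<?xml version="1.0" encoding="UTF-8"?>
<SimplygonGrid>
<GridAgentsInfo Version="10.2">
<AgentsTable>
<Agent VersionMajor="10" VersionMinor="3" Protocol="ip4" Address="192.168.1.12" Port="46938" JobsMax="16" VulkanSupported="1"/>
<Agent VersionMajor="10" VersionMinor="3" Protocol="ip4" Address="192.168.2.68" Port="40227" JobsMax="8" VulkanSupported="0"/>
<Agent VersionMajor="10" VersionMinor="2" Protocol="ip4" Address="192.168.2.61" Port="43434" JobsMax="4" VulkanSupported="0"/>
</AgentsTable>
</GridAgentsInfo>
</SimplygonGrid>
"""


def read_agents(xml_text):
    """Return one dict per Agent element in an agents table."""
    root = ET.fromstring(xml_text)
    agents = []
    for node in root.iter("Agent"):
        agents.append({
            "version": (int(node.get("VersionMajor")), int(node.get("VersionMinor"))),
            "address": node.get("Address"),
            "port": int(node.get("Port")),
            "jobs_max": int(node.get("JobsMax")),
            "vulkan": node.get("VulkanSupported") == "1",
        })
    return agents


table = read_agents(SAMPLE_TABLE)
vulkan_agents = [a["address"] for a in table if a["vulkan"]]
total_jobs = sum(a["jobs_max"] for a in table)
print(vulkan_agents, total_jobs)
```

A check like this makes it easy to spot entries with an unexpected version or without Vulkan support before the table is handed to the batch tool.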
Using the agents table
To run grid processing from the API, provide the path to the agents table in the global setting GridAgentsTableFilePath before calling the RunScene or RunSceneFromFile pipeline processing calls:
// To disable discovery via mDNS, use this setting: "GridAgentsTableFilePath"
// which provides the path to the agents table XML file.
auto agentsTablePath = std::filesystem::absolute( "./AgentsTable.xml" ).u8string();
sg->SetGlobalGridAgentsTableFilePathSetting(agentsTablePath.c_str());
// Run the processing. Since GridAgentsTableFilePath is set, this will be used when running
// with the EPipelineRunMode::RunDistributedUsingSimplygonGrid setting
auto result = pipeline->RunScene(sgScene, Simplygon::EPipelineRunMode::RunDistributedUsingSimplygonGrid);
if( CheckLog(sg) || result != Simplygon::EErrorCodes::NoError )
std::cerr << "The processing failed with an error. Details in the log above. Error code: " << result << std::endl;
Disable discovery and fix the grid agent processing port
To avoid mDNS chatter on the grid agents, and to fix the processing port to a specific port, use these additional arguments when starting the agent:
.\SimplygonGridAgent.exe [...] -NoDiscovery -Port 43434
-NoDiscovery turns off the mDNS discovery support, as it is not used. -Port 43434 fixes the processing to the specified port (in this example, port 43434). A fixed port is useful because otherwise any restart of the grid agent results in a random port being assigned, which means the agents table needs an update.
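Once an agent has been moved to a fixed port, its entry in an existing agents table has to reflect that port. The following sketch updates the Port attribute for one address, assuming the table layout shown earlier on this page; the function name `set_agent_port` is hypothetical.

```python
import xml.etree.ElementTree as ET


def set_agent_port(xml_text, address, port):
    """Return the table XML with the Port of the agent at `address` replaced."""
    root = ET.fromstring(xml_text)
    updated = 0
    for node in root.iter("Agent"):
        if node.get("Address") == address:
            node.set("Port", str(port))
            updated += 1
    if updated == 0:
        raise ValueError(f"no agent with address {address} in the table")
    # Note: serializing to a string drops the XML declaration.
    return ET.tostring(root, encoding="unicode")


TABLE = (
    '<SimplygonGrid><GridAgentsInfo Version="10.2"><AgentsTable>'
    '<Agent VersionMajor="10" VersionMinor="3" Protocol="ip4" '
    'Address="192.168.1.12" Port="46938" JobsMax="16" VulkanSupported="1"/>'
    '</AgentsTable></GridAgentsInfo></SimplygonGrid>'
)

updated_table = set_agent_port(TABLE, "192.168.1.12", 43434)
print(updated_table)
```

To write the result back to disk with an XML declaration, `ET.ElementTree(root).write(path, encoding="UTF-8", xml_declaration=True)` can be used instead of returning a string.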