High Runtime Experiments for Executing Sample Codes in the 'examples' Folder

34 Views Asked by FahimSh87 At 07 June 2025 at 03:50

I am new to Flow and Ray (RLlib). Could you share the runtimes you have observed for the example scripts in Flow's 'examples' folder, along with the specifications of the system you ran them on?

For instance, when I run 'python examples/train.py multiagent_highway' with 'N_CPUS=39', I get the following result at one of the intermediate iterations:

Result for PPO_MultiAgentHighwayPOEnv-v1_a2ecd650:
custom_metrics: {}
date: 2023-06-16_17-23-55
done: false
episode_len_mean: 840.0
episode_reward_max: 10575.141053844392
episode_reward_mean: 5036.832631980259
episode_reward_min: 40.51603171896423
episodes_this_iter: 40
episodes_total: 12124
experiment_id: 3dba0c81a4784739a13832d4ece952a3
experiment_tag: '0'
hostname: fahim-System-Product-Name
info:
grad_time_ms: 166143.367
learner:
av:
cur_kl_coeff: 0.0
cur_lr: 4.999999873689376e-05
entropy: -2.3118395805358887
entropy_coeff: 0.0
kl: 0.015078149735927582
policy_loss: 0.0045335013419389725
total_loss: 154.9982147216797
vf_explained_var: 0.7336021661758423
vf_loss: 154.99366760253906
load_time_ms: 293.187
num_steps_sampled: 10200000
num_steps_trained: 266994816
sample_time_ms: 67658.344
update_time_ms: 11.643
iterations_since_restore: 340
node_ip: 172.17.33.76
num_healthy_workers: 39
off_policy_estimator: {}
perf:
cpu_util_percent: 30.821290322580648
ram_util_percent: 8.600322580645159
pid: 3108
policy_reward_max:
av: 459.162328689778
policy_reward_mean:
av: 149.32797604447833
policy_reward_min:
av: 0.1245569206950437
sampler_perf:
mean_env_wait_ms: 63.503262993884285
mean_inference_ms: 1.317287123312144
mean_processing_ms: 5.036362789052089
time_since_restore: 77110.67197370529
time_this_iter_s: 217.05030941963196
time_total_s: 77110.67197370529
timestamp: 1686923635
timesteps_since_restore: 10200000
timesteps_this_iter: 30000
timesteps_total: 10200000
training_iteration: 340
trial_id: a2ecd650

== Status ==
Memory usage on this node: 16.2/188.8 GiB
Using FIFO scheduling algorithm.
Resources requested: 40/40 CPUs, 0/0 GPUs, 0.0/177.0 GiB heap, 0.0/0.1 GiB objects
Result logdir: /home/fahim/ray_results/multiagent_highway
Number of trials: 1 (1 RUNNING)
+----------------------------------------+----------+-------------------+--------+------------------+-------------+----------+
| Trial name                             | status   | loc               |   iter |   total time (s) |   timesteps |   reward |
|----------------------------------------+----------+-------------------+--------+------------------+-------------+----------|
| PPO_MultiAgentHighwayPOEnv-v1_a2ecd650 | RUNNING  | 172.17.33.76:3108 |    340 |          77110.7 |    10200000 |  5036.83 |
+----------------------------------------+----------+-------------------+--------+------------------+-------------+----------+

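Is this runtime reasonable? The iteration shown took about 217 seconds (time_this_iter_s), and the 340 iterations so far add up to about 77,110 seconds, or roughly 21.4 hours.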
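To make the runtime easier to compare across machines, the figures in the log can be reduced to a few summary numbers. The sketch below is only an illustration: the `summarize` helper and the `result` dict are hypothetical names, and the values are copied by hand from the result above. The four timers may be averaged over recent iterations, so they need not add up to time_this_iter_s, and the shares are approximate.

```python
# Values copied by hand from the training result posted above.
result = {
    "time_this_iter_s": 217.05030941963196,
    "time_total_s": 77110.67197370529,
    "training_iteration": 340,
    "timesteps_this_iter": 30000,
    "sample_time_ms": 67658.344,
    "grad_time_ms": 166143.367,
    "load_time_ms": 293.187,
    "update_time_ms": 11.643,
}

TIMER_KEYS = ("sample_time_ms", "grad_time_ms", "load_time_ms", "update_time_ms")


def summarize(r):
    """Hypothetical helper: reduce one result dict to a few runtime figures."""
    timers = {k: r[k] for k in TIMER_KEYS}
    timer_sum = sum(timers.values())
    return {
        # Wall-clock time of the whole run so far, in hours.
        "total_hours": r["time_total_s"] / 3600,
        # Average duration of one training iteration, in seconds.
        "mean_iter_s": r["time_total_s"] / r["training_iteration"],
        # Environment steps processed per second in this iteration.
        "steps_per_s": r["timesteps_this_iter"] / r["time_this_iter_s"],
        # Fraction of the summed timers spent in each phase (approximate).
        "shares": {k: v / timer_sum for k, v in timers.items()},
    }


summary = summarize(result)
print(f"total: {summary['total_hours']:.1f} h")
print(f"mean iteration: {summary['mean_iter_s']:.1f} s")
print(f"throughput: {summary['steps_per_s']:.1f} steps/s")
for name, share in summary["shares"].items():
    print(f"{name}: {share:.1%}")
```

With these numbers the run comes to roughly 21.4 hours and about 138 environment steps per second, and the gradient timer accounts for roughly 71% of the summed timers against roughly 29% for sampling.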

Additionally, I would like to use GPU resources alongside the CPUs. Which part of the code needs to be modified to do this? I noticed that a 'num_cpu' setting is used in some of the scripts, but I am unsure how to add a 'num_gpu' parameter. Any insight would be greatly appreciated.
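As a starting point for the GPU question, here is a minimal sketch of the resource-related keys that RLlib's trainer config accepts: `num_workers`, `num_gpus` and `num_gpus_per_worker`. The `build_resource_config` helper is a hypothetical name, and how these keys would be merged into the config built by examples/train.py is an assumption that depends on the Flow and Ray versions in use. Note also that the status output above reports "0/0 GPUs", which suggests Ray does not currently see any GPU on this node.

```python
def build_resource_config(n_cpus, n_gpus=0, n_gpus_per_worker=0):
    """Hypothetical helper: return RLlib-style resource settings as a dict.

    Assumption: the returned keys are merged into the trainer config that
    the training script already builds before it launches the experiment.
    """
    if n_cpus < 1:
        raise ValueError("n_cpus must be at least 1")
    if n_gpus < 0 or n_gpus_per_worker < 0:
        raise ValueError("GPU counts must be non-negative")
    return {
        # Number of rollout workers; each one samples from the simulator.
        "num_workers": n_cpus,
        # GPUs given to the trainer process, which runs the gradient updates.
        "num_gpus": n_gpus,
        # GPUs given to each rollout worker; usually 0 for CPU-bound simulators.
        "num_gpus_per_worker": n_gpus_per_worker,
    }


# Example: keep the 39 rollout workers and give the trainer one GPU.
resource_config = build_resource_config(n_cpus=39, n_gpus=1)
print(resource_config)
```

Only the trainer process is given a GPU here because the rollout workers spend their time in the traffic simulator rather than in the neural network. Whether this shortens an iteration would need to be measured on the actual machine.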

Thank you all in advance for your help.
