Initially, I tried running two or more jobs at once on my multi-GPU workstation. What I found is that running one simulation at a time is much faster than running them in parallel. When jobs run side by side, don't expect each one to match its standalone performance, or even to run at exactly half of it.
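If you want to check this on your own machine, a rough timing script along these lines may help. It is only a sketch: `run_sequential` and `run_parallel` are names I made up for illustration, and the `sleep` commands are placeholders standing in for your real simulation commands, which are not given here.

```python
import subprocess
import sys
import time


def run_sequential(commands):
    """Run each command to completion before starting the next.

    Returns the total wall-clock time in seconds.
    """
    start = time.perf_counter()
    for cmd in commands:
        subprocess.run(cmd, check=True)
    return time.perf_counter() - start


def run_parallel(commands):
    """Start all commands at once, then wait for every one to finish.

    Returns the total wall-clock time in seconds.
    """
    start = time.perf_counter()
    procs = [subprocess.Popen(cmd) for cmd in commands]
    for proc in procs:
        if proc.wait() != 0:
            raise subprocess.CalledProcessError(proc.returncode, proc.args)
    return time.perf_counter() - start


if __name__ == "__main__":
    # Placeholder jobs (assumption): swap in your actual simulation commands.
    jobs = [[sys.executable, "-c", "import time; time.sleep(0.5)"] for _ in range(2)]

    sequential = run_sequential(jobs)
    parallel = run_parallel(jobs)
    print(f"one at a time: {sequential:.2f} s")
    print(f"in parallel:   {parallel:.2f} s")
```

Compare the two totals for the same set of jobs. The placeholder jobs only sleep, so they will not show the slowdown; real simulations that compete for the same hardware are what make the difference visible.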