[gmx-developers] rvec structure
Nathan Moore
nmoore at physics.umn.edu
Thu Jul 7 16:31:23 CEST 2005
Berk,
I've rewritten move_x to use MPI_Allgatherv rather than the ring. I know
you guys think this will be slower, but Blue Gene has a really fast
interconnect, and at np~64 MPI time (particularly the sendrecv used in
move_x and move_f) accounts for half of the walltime. To be successful on
this architecture, GROMACS has to scale to larger np.
All that to say, I've got a segmentation fault from the code I've written
that I don't understand. MPI_Allgatherv collects the data from each
processor's x[] array into another temp_x[] array. After the allgather,
the temp_x arrays are identical across processors. For the simulation to
progress, I need to copy the temp_x array to x. I used:
    /* update the x array with the gathered temp_x array */
    for (i = 0; i < array_length; i++) {
        x[i][0] = temp_x[i][0];
        x[i][1] = temp_x[i][1];
        x[i][2] = temp_x[i][2];
    }
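
As a side note on the copy itself: if rvec is a plain array of three reals (assumed here; the typedefs below are stand-ins, and real may be double in a double-precision build), the rvec array is contiguous in memory and the component-wise loop can be expressed as a single memcpy. This is only a sketch; copy_rvecs is a hypothetical helper, not a GROMACS function.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumed stand-ins for the GROMACS types: real may be double in a
   double-precision build. */
typedef float real;
typedef real rvec[3];

/* Hypothetical helper: copy the coordinates of n atoms from src to dst.
   Both arrays must hold at least n rvecs; an n larger than either
   allocation reads or writes out of bounds. */
static void copy_rvecs(rvec *dst, rvec *src, size_t n)
{
    assert(dst != NULL && src != NULL);
    memcpy(dst, src, n * sizeof(rvec));
}
```

A call such as copy_rvecs(x, temp_x, array_length) would replace the loop above, under the same requirement that array_length does not exceed the number of atoms allocated in either array.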
I also wrote in a check before this copy loop so that I could see whether
the data was being moved properly:
    FILE *NTM_OUT;
    char file_out_NTM[150];
    sprintf(file_out_NTM, "check_file.move_x.node.%d.print.%d.txt",
            nsb->nodeid, print_count_NTM);
    NTM_OUT = fopen(file_out_NTM, "w");
    fprintf(NTM_OUT, "nodeid, nnodes i, j, x[i][j], temp_x[i][j]\n");
    int start, limit;
    start = nsb->index[nsb->nodeid];
    limit = nsb->homenr[nsb->nodeid];
    for (i = start; i < (start + limit); i++) {
        fprintf(NTM_OUT, "(%dof%d) at [%d|%d], %f,%f\n",
                nsb->nodeid, nsb->nnodes, i, 0, x[i][0], temp_x[i][0]);
        fprintf(NTM_OUT, "(%dof%d) at [%d|%d], %f,%f\n",
                nsb->nodeid, nsb->nnodes, i, 1, x[i][1], temp_x[i][1]);
        fprintf(NTM_OUT, "(%dof%d) at [%d|%d], %f,%f\n",
                nsb->nodeid, nsb->nnodes, i, 2, x[i][2], temp_x[i][2]);
    }
    fclose(NTM_OUT);
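
One thing that may be worth checking alongside this output is the receive layout handed to MPI_Allgatherv. The sketch below assumes the gather uses a scalar datatype matching real (for example MPI_FLOAT), so that recvcounts and displs are counted in reals rather than atoms. It also assumes index[] and homenr[] hold each node's first atom and atom count, as in the check above. fill_gather_layout is a hypothetical helper, not part of GROMACS or MPI.

```c
#include <assert.h>

enum { DIM = 3 };  /* Cartesian components per atom */

/* Hypothetical helper: fill recvcounts[] and displs[] (each of length
   nnodes) in units of real, given per-node atom offsets (index[]) and
   atom counts (homenr[]). Returns the minimum number of reals the
   receive buffer must hold for this layout. */
static int fill_gather_layout(const int *index, const int *homenr,
                              int nnodes, int *recvcounts, int *displs)
{
    int needed = 0;
    assert(nnodes > 0);
    for (int n = 0; n < nnodes; n++) {
        recvcounts[n] = DIM * homenr[n];  /* reals sent by node n */
        displs[n] = DIM * index[n];       /* offset of node n's block */
        if (displs[n] + recvcounts[n] > needed) {
            needed = displs[n] + recvcounts[n];
        }
    }
    return needed;
}
```

The returned value divided by DIM gives the number of atoms the layout covers, which can be compared against both the allocated length of temp_x and the bound used in the copy loop. If a derived datatype of three reals were used instead, the counts and displacements would be in atoms and the factor DIM would drop out.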
At present, I get file output, MPI segfaults, and the cores seem to
indicate that the failure is in the (do_pbc_first -> mk_mshift -> mk_grey)
function sequence. (I'm using the water tpr from /tutor as a simple
benchmark system.)
Any ideas? (I'd rather not give up)
NT Moore
Before the simulation progresses, I've been trying the printf below in
order to see that within each node the .
> Nathan Moore wrote:
>
>>I'm looking into the move_x and move_f functions.
>>
>>Do I understand the rvec structure properly: Atom i is described by rvec
>>(x[i],v[i],f[i]), and the cartesian components are accessible with the
>>following statement,
>>
>>printf("atom i=%d is at X=(%f, %f, %f)\n",
>> i, x[i][0], x[i][0], x[i][0]);
>>
>>
>>
> No,
> I suppose you meant x[i][0], x[i][1], x[i][2].
>
> But as David said before, you can pass the whole array using x[0]
> to an MPI function which expects a real *, no conversion is required.
>
> Berk.