omp compiler directives
add -fopenmp to the CFLAGS line in the Makefile
tid and ncores in main()
find out how many cores there are at your
disposal via this piece of code:
#pragma omp parallel private(tid)
{
  if((tid = omp_get_thread_num())==0)
    ncores = omp_get_num_threads();
}
set chunk to the number of rows (the height of the image to be
ray traced) divided by the number of cores; this stipulates how many
rows of the image each core will process. We parallelize the ray
tracer by chopping up the image into horizontal bands and having
each core process its own band of pixels simultaneously
for loops iterating over the image pixels (x,y):
#pragma omp parallel for \
shared(...fill this in...) \
private(...fill this in...) \
schedule(static,chunk)
one important aspect of this assignment is to properly
figure out which variables are meant to be shared among
parallel processes and which are meant to only be manipulated
privately
img, an array of (pointer to) type rgb_t<unsigned char>,
allocated to size w*h, and that the code inside the loops does
not write to std::cout directly; provide a second doubly-nested
loop to output the pixels after the parallelized nested
loops finish
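The band decomposition and the deferred output pass described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the assignment's solution: pixel_t, shade(), render() and write_ppm() are hypothetical stand-ins (the real code uses rgb_t<unsigned char> and the ray tracer itself), and the binary PPM output format is assumed. The pragma is guarded so the file also builds as a serial program when -fopenmp is absent.

```cpp
#include <cstddef>
#include <iostream>
#include <vector>

// Hypothetical stand-in for the assignment's rgb_t<unsigned char>.
struct pixel_t { unsigned char r, g, b; };

// Hypothetical stand-in for tracing one ray: derives a color from (x,y) only.
pixel_t shade(int x, int y, int w, int h)
{
  pixel_t p;
  p.r = static_cast<unsigned char>(255 * x / (w > 1 ? w - 1 : 1));
  p.g = static_cast<unsigned char>(255 * y / (h > 1 ? h - 1 : 1));
  p.b = 0;
  return p;
}

// Fill a w*h image in parallel; each thread gets bands of `chunk` rows.
std::vector<pixel_t> render(int w, int h, int ncores)
{
  std::vector<pixel_t> img(static_cast<std::size_t>(w) * h);
  pixel_t *buf = img.data();          // one shared image buffer
  if (ncores < 1) ncores = 1;
  int chunk = h / ncores;             // rows per thread
  if (chunk < 1) chunk = 1;           // more cores than rows
  int x, y;

#ifdef _OPENMP
#pragma omp parallel for shared(buf, w, h) private(x, y) schedule(static, chunk)
#endif
  for (y = 0; y < h; y++) {
    for (x = 0; x < w; x++) {
      // Each (x,y) maps to a distinct slot, so no two threads write the same pixel.
      buf[y * w + x] = shade(x, y, w, h);
    }
  }
  return img;
}

// Serial output pass, run only after the parallel loops have finished.
// Binary PPM is an assumed output format.
void write_ppm(std::ostream &os, const std::vector<pixel_t> &img, int w, int h)
{
  os << "P6\n" << w << " " << h << "\n255\n";
  for (int y = 0; y < h; y++) {
    for (int x = 0; x < w; x++) {
      const pixel_t &p = img[static_cast<std::size_t>(y) * w + x];
      os << p.r << p.g << p.b;
    }
  }
}
```

A caller would do something like write_ppm(std::cout, render(w, h, ncores), w, h). Keeping all stream output in the second, serial pass means the order of bytes on std::cout does not depend on thread scheduling.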
obj->getlast_hit()
This suggests that an object stores the surface point (and normal)
of the last ray that hit it. When there was only one ray at a time,
querying the object for this piece of info made sense. Now, however,
this design strategy leads to a race condition: between the time
that some ray hits an object and the time it queries the object for
the hit point, another ray may come along and hit the object again.
The first ray then gets the second ray's location. This leads to
rather nasty color banding problems. To fix this:
modify ray_t::trace() so that it no longer needs
an object_t *last_hit argument
modify model::find_closest() so that it updates the vec_t hit
point and the normal at that point (for the plane, this will just be
the plane's normal; for the sphere, this gets calculated on the fly);
these variables are handled in a similar manner to dist, the distance
to the closest hit point from the ray's current position
modify the hits() routines so that they update the hit point and
normal passed in by the caller; an object's own normal must not be
altered, and no last_hit should be stored (this should be removed
from the object_t class)
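One way to picture the fix is a hits() routine that writes the hit point and normal into variables owned by the caller instead of storing them in the object. The sketch below is an assumption-laden illustration, not the assignment's actual interface: vec3, sphere_t and the helper functions are hypothetical stand-ins for vec_t and the object classes, and the signature is invented for the example. The returned value is the ray parameter t, which equals the distance when dir is a unit vector.

```cpp
#include <cmath>

// Hypothetical minimal vector type standing in for the assignment's vec_t.
struct vec3 { double x, y, z; };

static vec3 sub(const vec3 &a, const vec3 &b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
static double dot(const vec3 &a, const vec3 &b) { return a.x * b.x + a.y * b.y + a.z * b.z; }
static vec3 scale(const vec3 &a, double s) { return {a.x * s, a.y * s, a.z * s}; }

// Hypothetical sphere object; note that it holds no last_hit state.
struct sphere_t {
  vec3 center;
  double radius;

  // Returns the ray parameter of the nearest hit in front of pos, or -1 on a miss.
  // hit and normal belong to the caller, so concurrent rays cannot overwrite
  // each other's results; the method is const and never modifies the object.
  double hits(const vec3 &pos, const vec3 &dir, vec3 &hit, vec3 &normal) const
  {
    vec3 oc = sub(pos, center);
    double a = dot(dir, dir);
    double b = 2.0 * dot(oc, dir);
    double c = dot(oc, oc) - radius * radius;
    double disc = b * b - 4.0 * a * c;
    if (disc < 0.0) return -1.0;                 // ray misses the sphere

    double t = (-b - std::sqrt(disc)) / (2.0 * a);
    if (t <= 0.0) return -1.0;                   // nearest root is behind the ray origin

    hit = {pos.x + t * dir.x, pos.y + t * dir.y, pos.z + t * dir.z};
    normal = scale(sub(hit, center), 1.0 / radius);  // computed on the fly
    return t;
  }
};
```

With this shape, a find_closest()-style loop can keep the hit and normal of the nearest object alongside dist, all as local variables of the calling thread.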
disable the #pragma omp
directive to run the serial version (once your parallelized version
works); on my dual-core laptop, the parallel version takes 1.517
seconds, while the serial version takes 2.934 seconds: not quite
a factor of 2 speedup, but about 1.9, which is noticeable. On a
quad-core machine, the speedup should be even better.
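The comparison above boils down to speedup = serial time / parallel time. A small sketch of how the two runs might be timed and compared follows; time_seconds(), speedup() and efficiency() are hypothetical helper names, and std::chrono is used here only as one possible wall-clock timer (the timings quoted above may well have been taken differently, e.g. with the shell's time command).

```cpp
#include <chrono>

// Wall-clock time, in seconds, taken by one call to work().
template <typename F>
double time_seconds(F work)
{
  auto start = std::chrono::steady_clock::now();
  work();
  auto stop = std::chrono::steady_clock::now();
  return std::chrono::duration<double>(stop - start).count();
}

// Speedup of the parallel run relative to the serial run.
double speedup(double t_serial, double t_parallel)
{
  return t_serial / t_parallel;
}

// Fraction of the ideal (linear) speedup achieved on ncores cores.
double efficiency(double t_serial, double t_parallel, int ncores)
{
  return speedup(t_serial, t_parallel) / ncores;
}
```

Plugging in the numbers quoted above, speedup(2.934, 1.517) is roughly 1.93, which on two cores corresponds to an efficiency of about 0.97.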
ray and color should be private as each
independent process has its own version of these variables.
What about the model? Should there be many copies
of it, or just one? Just the one...so it should be shared.
Is there one img or many? Just one, so it too should
be shared. What about w and h, the image
dimensions? Do these change per process or do they stay the same?
In general, any time you need mutual exclusivity, you need privacy.
But the img is shared because access to its pixels
is (must be) made exclusive, i.e., no two processes should write to
the same pixel.
Here are all the relevant (to ray tracing) variable declarations:
model_t       model;                   // model (the world)
int           tid,ncores=1;            // thread id, no. of cores
int           w=model.getpixel_w();    // image width (screen coords)
int           h=model.getpixel_h();    // image height (screen coords)
int           chunk;                   // no. rows per thread (core)
double        wx,wy,wz=0.0;            // pixel in world coords
double        ww=model.getworld_w();   // world width (world coords)
double        wh=model.getworld_h();   // world height (world coords)
vec_t         pos=model.getviewpoint();// camera (ray) position
vec_t         pix,dir;                 // pixel pos (world), ray dir
ray_t         *ray=NULL;               // a ray
rgb_t<double> color;                   // color set by ray
rgb_t<uchar>  *imgloc,*img=NULL;       // image and image location ptr
tar.gz archive of your asg##/ directory, including:
README file containing
Makefile
(.h headers and .cpp source)
(make clean before tar)
handin notes