Asg 1: Ray Tracer OMP

Objectives

To parallelize the ray tracer with OpenMP (see also: wikipedia) using as many cores as there are available on any given system

Assignment

Parallelize the C++ ray tracer from the previous assignment using omp compiler directives
1. add -fopenmp to the CFLAGS line in the Makefile
2. using integer variables tid and ncores in main() find out how many cores there at your disposal via this piece of code: #pragma omp parallel private(tid) { if((tid = omp_get_thread_num())==0) ncores = omp_get_num_threads(); }
3. set integer variable chunk to the number of rows (height of the image to be ray traced) divided by the number of cores—this stipulates how many rows of the image each core will process—we parallelize the ray tracer by chopping up the image into horizontal bands, and having each core process each band of pixels simultaneously
4. insert this compiler directive immediately above the nested for loops iterating over the image pixels (x,y): #pragma omp parallel for \ shared(...fill this in...) \ private(...fill this in...) \ schedule(static,chunk) one important aspect of this assignment is to properly figure out which variables are meant to be shared among parallel processes and which are meant to only be manipulated privately
5. make sure that the nested loops write to img, an array of (pointer to) type rgb_t<unsigned char>, allocated to size w*h and that the code inside the loops does not write to std::cout directly—provide a second doubly-nested loop to ouptut the pixels after the parallelized nested loops finish
The above is almost all there is too it...if the ray tracer had been designed with multiple (simultaneous) rays in mind, the above would be all that is needed. Unfortunately, the ray tracer suffers from a serial design decision, one that ever only expected a single ray at any given time. The problem shows up in the call obj->getlast_hit() This suggests that an object stores the surface point (and normal) of the last ray that hit it. When there was only one ray at a time, querying the object for this piece of info made sense. Now, however, this design strategy leads to a race condition: in between the time that some ray hit an object, another ray came along and hit it again. The first ray, when querying the object for the hit point, gets the second ray's location. This leads to rather nasty color banding problems. To fix this:
1. rewrite ray_t::trace() so that it no longer needs and object_t *last_hit argument
2. rewrite model::find_closest() so that it updates the vec_t hit point and the normal at the point (for the plane, this will just be the plane's normal, for the sphere, this gets calculated on the fly); these variables are handled in a similar manner to dist, the distance to the closest hit point from the ray's current position
3. rewrite all the objects' hits() routines so that they update the hit point and normal—the objects' normal must not be altered, and no last_hit should be stored (this should be removed from the object_t class)
Compare the timing of the old serial approach vs. the parallel one—is there a benefit? (Simply comment out the #pragma omp directive to run the serial version (once your parallelized version works); on my dual-core laptop, the parallel version takes 1.517 seconds, while the serial version takes 2.934 seconds, not quite a factor of 2 reduction, but 1.9, which is noticeable. On a quad-core machine, the speedup should be even better.
As with the previous assignment, the image generated should be identical:

Hints

Shared or private? The idea is variables that are meant to change indepentenly should be made private. As an example, the ray and color should be private as each independent process has its own version of these variables. What about the model? Should there be many copies of these, or just one? Just the one...so it should be shared. Is there one img or many? Just one, so it too should be shared. What about w and h, the image dimensions? Do these change per process or do they stay the same?

In general, any time you need mutual exclusivity, you need privacy. But, the img is shared because access to its pixels is (must be) made exclusive, i.e., no two processes should write to the same pixel.

Here are all the relevant (to ray tracing) variable declarations:

        model_t        model;                   // model (the world)
        int            tid,ncores=1;            // thread id, no. of cores
        int            w=model.getpixel_w();    // image width (screen coords)
	int	       h=model.getpixel_h();    // image height (screen coords)
        int            chunk;                   // no. rows per thread (core)
        double         wx,wy,wz=0.0;            // pixel in world coords
        double         ww=model.getworld_w();   // world width (world coords)
	double	       wh=model.getworld_h();   // world height (world coords)
        vec_t          pos=model.getviewpoint();// camara (ray) position
        vec_t          pix,dir;                 // pixel pos (world), ray dir
        ray_t          *ray=NULL;               // a ray
        rgb_t<double>  color;                   // color set by ray
        rgb_t<uchar>   *imgloc,*img=NULL;       // image and image location ptr

Turn in

Turn in all of your code, in one tar.gz archive of your asg##/ directory, including:

A README file containing
1. Course id--section no
2. Name
3. Lab description
4. Brief solution description (e.g., program design, description of algorithm, etc., however appropriate).
5. Lessons learned, identified interesting features of your program
6. Any special usage instructions
Makefile
source code (.h headers and .cpp source)
~~object code~~ (do a make clean before tar)

How to hand in

See handin notes