From walt@parl.ces.clemson.edu Thu Jul 13 19:15:36 2000
Date: Thu, 13 Jul 2000 15:31:23 -0400
From: Walter B. Ligon III
To: Robert G. Brown
Subject: Re: Paper draft (somewhat more advanced)

--------

Robert,

I haven't had time to look at all of it, but I have gotten through the
first few pages. I find some of the Amdahl's Law description confusing.
I think it would help to stick to the classical definitions, as follows:

   T(n) is the time to execute a program on n processors
   S(n) is the speedup when executing on n processors: S(n) = T(1) / T(n)

Amdahl notes that most programs have some portion of the code that can
be divided among the processors (executed in parallel) and some that
must be executed sequentially on one processor.

   Tp is the time taken to execute the parallel fraction on one processor
   Ts is the time taken to execute the serial fraction on one processor

   T(1) = Ts + Tp
   T(n) = Ts + Tp/n

so

   S(n) = (Ts + Tp) / (Ts + Tp/n)

which is Amdahl's Law. If you prefer the rate expression, then define
R(n) = 1/T(n), so that S(n) = T(1)/T(n) = R(n)/R(1).

The description for Tis and Tip is fine, but rather than calling it "the
(time) cost of parallelizing the code", maybe something like "overhead
incurred due to parallelizing the code" or "extra work required because
of parallelizing the code." When I read the original phrase I was
thinking of the programmer's time spent crafting the code, and that is
not what you meant.

You should probably define Tis as Tis(P), because at one point you say
its contribution is P * Tis, and later P^2 * Tis. If you combine Tip and
Tis into To(P) (overhead time), then you can set To(P) = Tip + P * Tis
for one system and To(P) = Tip + P^2 * Tis for another. The speedup then
becomes

   S(n) = (Ts + Tp) / (Ts + Tp/n + To(n))

This is in keeping with more advanced parallel performance theory, where
the overhead associated with a hardware/software combination is
paramount in determining scalability. Then, when you describe your
graphs, you can argue that the value of To(P) controls the shape of the
curve, and that, in turn, is determined by the details of the
application and architecture.

One of the problems I found is that when you start to discuss the
graphs, you indicate that the graph shapes are affected by IPC overhead
and communications, but this isn't well justified in the text (though I
realize it is completely true). So I'm suggesting that by putting the
focus on the overhead To, you can THEN justify how that overhead is
affected by IPC setup, communication, etc.

I think at the end of section 2 you should emphasize the benefit of
scaling up the computation to get better speedups, as this is probably
the most important avenue for building effective Beowulf systems. You DO
mention it, but in passing, AFTER you have said there is no point in
having hundreds of processors. You could tone that differently by
showing that simply adding processors doesn't do it - you have to scale
the problem up as well.

I'm digging into section 3 now. I'll send more comments later. Overall
it looks like it's coming along. Glad to see it!

Walt

--
Dr. Walter B. Ligon III
Associate Professor
ECE Department
Clemson University
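
P.S. In case it helps with the graphs, here is a minimal sketch in
Python of how the linear and quadratic forms of To(P) bend the speedup
curve. The values of Ts, Tp, Tis, and Tip are made up purely for
illustration - they are assumptions, not measurements from any real
system.

# Illustrative sketch of S(n) = (Ts + Tp) / (Ts + Tp/n + To(n))
# with the two overhead models discussed above. All constants below
# are assumed values, chosen only to show the curve shapes.

Ts, Tp = 1.0, 99.0     # serial and parallel time on one processor (assumed)
Tis, Tip = 0.01, 0.5   # per-processor and fixed overhead terms (assumed)

def speedup(n, overhead):
    """Amdahl-style speedup with an overhead term To(n) added to T(n)."""
    return (Ts + Tp) / (Ts + Tp / n + overhead(n))

linear = lambda n: Tip + n * Tis          # To(n) = Tip + n * Tis
quadratic = lambda n: Tip + n**2 * Tis    # To(n) = Tip + n^2 * Tis

for n in (1, 2, 4, 8, 16, 32, 64, 128):
    print(f"n={n:4d}  S_linear={speedup(n, linear):7.2f}  "
          f"S_quadratic={speedup(n, quadratic):7.2f}")

With these (made-up) numbers the linear-overhead curve keeps climbing
out past a hundred processors, while the quadratic one peaks around
n = 16 and then falls off - exactly the shape argument I'd like to see
tied to To(P) in the text.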