===== SPD Project Proposals for academic year 2013-2014 =====

This SPD project for the current academic year can be selected by any student. As explained during the course, project development and testing remain individual work.

**The description of the project and the support code can also be downloaded as an RTF file {{:magistraleinformaticanetworking:spd:spd-13-14-project_description.rtf|here}}, and as a zipped text file {{:magistraleinformaticanetworking:spd:spd-13-14-snqueens.c.zip|here}}.**

**Note (29/07/2014)** -- Additional explanations about some aspects of the project and its algorithm were given to the students in private talks in June and July. They are reported at the end of this web page, [[magistraleinformaticanetworking:spd:13-14-projectproposals#Additional explanations|here]].

==== Rules ====

Starting from the description of the project and from the fragments of code provided, a parallel application has to be developed and run, and its results and performance must be analyzed.

  * The parallelization has to exploit either MPI, TBB, or OpenCL.
  * The student is required to propose an analytical model of the program's behavior (e.g. expected speedup and performance) and evaluate it against the actual application.
  * The result of the project to be delivered is the program code and a short report about the program structure and its results. The report is discussed with the teacher.

Note that the problem output will be checked when you deliver your project, before looking at the actual code and the report.

**Additional Points**: more parallelization strategies can usually be developed with the same framework. Comparing several solutions and studying their behavior will improve the project evaluation. Some effort in this direction is expected from all students, at least in order to motivate the choice of which solution will be implemented and run.
**Bonus points**: more than one framework can be combined in the same project, or the frameworks can be compared with each other. This is not a requirement to pass the course or achieve full grades.

**Exception**: as reported in the course wiki main page, students can still propose their own project topic to be parallelized using one or more of the frameworks, according to the previous year's rules. Please read the rules that apply in the main page; this page does not deal with that case.

==== Problem description ====

Design and implement a parallel application solving a variation of the N-Queen problem.

The N-Queen problem requires you to place exactly N queen pieces on an N by N chessboard in such a way that no queen attacks any other. It is a specific set-cover problem, with both heuristic and combinatorial solutions, which has already been studied (see D.E. Knuth's paper for a quick start on set coverage problems) and previously used as a test case for parallel and distributed programming frameworks (see Denis Caromel et al.). The sequential algorithm for generating solutions is provided as a starting point.

1) In our project we want to generate all ordinary solutions to the N-Queen problem given a specific value of N as a parameter, provided that N<=24. Since we are interested in the combinatorial exploration of the solution space, a problem whose complexity is exponential, we have an obvious case for parallel solutions. No student is expected to reach or exceed the upper limit N=24 with a complete computation, but the parallel code should cope with all N values up to 24, and test results should show the gains and limits of the implemented parallelization.

2) When generating all simple solutions, we want to record them in a data structure so that they can be reused.
In this phase we also want to check whether a newfound solution is a derivation of a previous one (by a 180-degree reflection along an axis and/or by a 90-degree rotation), and record how many derivative solutions we find in each group. Note that for moderately high values of N the data structure may become large enough to need its own parallel implementation (e.g. splitting across resources).

3) The set of unique solutions generated must also be scrutinized for solutions of the amazon problem, that is, placing N pieces on an N by N chessboard where each piece can move like an amazon. An amazon is a non-standard chess piece that can move both as a queen and as a knight (see http://en.wikipedia.org/wiki/Amazon_%28chess%29). This means that: (a) each solution of the N-amazon problem must also be a solution of the N-Queen problem; (b) each unique N-Queen solution in which at least two queens are a knight's move apart (one step along a row or column and two steps along the other) is not a solution of the amazon problem.

==== Some References ====

Donald E. Knuth, Dancing links (2000), downloadable from http://arxiv.org/abs/cs/0011047 and from http://www-cs-faculty.stanford.edu/~uno/preprints.html

http://en.wikipedia.org/wiki/Queen_%28chess%29

http://en.wikipedia.org/wiki/Eight_queens_puzzle

//Other Bibliography on the problem//

Rodolfo Toledo, Eric Tanter, José Piquer, Denis Caromel, Mario Leyton, "Using Reflexd For A Grid Solution To The N-Queens Problem: A Case Study", in Achievements in European Research on Grid Systems, Springer, 2008.
downloadable from http://pleiad.dcc.uchile.cl/papers/2006/toledoAl-cgiw2006.pdf

Wirth, Niklaus (1976), Algorithms + Data Structures = Programs, Prentice-Hall, ISBN 0-13-022418-9

http://www.npluskqueens.info/background.html

==== Detailed Description of the Application to build ====

=== Problem Input ===

  * the value of N, 1

<code c>
#include <stdio.h>

/*********
 * reference sequential code for the N Queen problem, SPD course
 * 2013-2014, Computer Science and Networking, University of Pisa.
 *
 * The code is a derivative work of material from wikibooks.org, and
 * as such the Creative Commons Attribution-ShareAlike License applies
 * to this code.   M. Coppola
 */

/** global definitions ******************************************************/

// largest board size
#define MAXBOARDSIZE 24

/*********
 * number of levels we expand the problem by before calling the solver
 * (i.e. before spawning parallel work!)
 */
#define EXPANDLEVELS 3

/********
 * this type can be used if we need to reliably copy information about
 * a set of subproblems across parallel instances, regardless of the
 * program parameters; you will likely need to define functions that
 * copy/transfer this kind of structure across processes, threads or
 * larger arrays in order to match your parallelization framework of
 * choice
 */
typedef struct boardplus{
  int size;                // board size
  int x;                   // this is where we need to restart
  int y;                   // idem
  int board[MAXBOARDSIZE]; // the actual board, padded to largest instance
} subproblem;

/********
 * static variable defining the board size; you may want to make it an
 * explicit parameter in your code like it's done in function main()
 */
static int board_size = 8;

/********
 * static variable counting solutions found; in a parallel setting,
 * you will need to devise a different solution
 */
static int solution_count = 0;

/* count the subproblems spawned from the expander function */
static int spawned_tasks=0;

/** utility functions ******************************************************/

/********
 * human-readable output; you may need to rewrite it in order to make
 * it really flexible
 */
void putboard(int rows[board_size])
{
  int x, y;

  printf("\nSolution #%d:\n------------------------------------------------------------------\n", ++solution_count);
  for (y=0; y < board_size; ++y) {
    for (x=0; x < board_size; ++x)
      printf(x == rows[y] ? "| Q " : "|   ");
    printf("|\n------------------------------------------------------------------\n");
  }
}

// simpler printout suitable for automatic analysis
void dump_board(int rows[board_size])
{
  int y;

  printf("%d", ++solution_count);
  for (y=0; yEXPANDLEVELS && temp <=MAXBOARDSIZE)
      board_size = temp;
    else {
      printf("Wrong Size parameter: %d\n",temp);
      return -1;
    }
  }

  // n_queens_solver(rows, 0);

  /******
   * expand three levels, then call the full solver routine; note
   * that this value is one of the parameters to be tuned when
   * parallelizing
   */
  n_queens_expander(rows, 0, 3);
  printf("Found %d solutions over %d tasks/subproblems\n",solution_count,spawned_tasks);
  return 0;
}
</code>

==== Additional explanations ====

The following should be obvious, but it was asked recently.

  - Students may optimize or replace the sequential code, but are not strictly required to do that. On the one hand, a strongly optimized sequential code will dwarf the parallel speedup of your solution, and spending most of your time on this will not address the key issue and will not raise your evaluation. On the other hand, depending on the parallel solution and technology, some optimizations may be unavoidable in order to achieve a good speedup or obey the limits of your computing platform (memory size and/or bandwidth, parallelism degree, acceptable computation time, degree of software lockout or computation/communication overlap). Recognizing which optimizations are necessary and which ones are not will simplify the development work.
  - The problem is naturally divided into three stages, from an abstract point of view. It is up to the student to decide if a different decomposition of the work is actually better suited to their technology. The student is expected to be able to explain, discuss and motivate his/her choices, both when they match the abstract problem description and when they depart from it. The key question to ask yourself should sound like "would it be possible, sticking to the same platform and framework, to enhance the parallelisation of the problem by changing the parallel structure adopted?"
  - Unique solutions (also called fundamental solutions) are those which remain after we identify those which happen to be mirrored or rotated versions of each other. Of course you can choose a representative of each equivalence class in the set of solutions, and then store only that representative in the dictionary structure of stage 2 of the problem. Choosing wisely what the representative is and how to find it greatly simplifies the implementation.
  - Explaining your choices in terms of measured features of the evolving code DOES give extra points. That is, tracing code behaviour in different conditions and for specific subproblems, in terms of observable features like memory size, number of recursive calls, achieved load balancing and so on, will help you achieve a better solution and be able to analyze it.
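The choice of a representative for each equivalence class, and the knight test needed for the amazon stage, can be sketched in C as follows. This is a minimal sketch and not part of the provided support code. The helper names ''transform'', ''cmp_boards'', ''canonical'' and ''knight_safe'' are hypothetical, boards are assumed to use the same encoding as ''putboard()'' (''rows[y]'' is the column of the queen on row ''y''), and taking the lexicographically smallest of the eight mirrored/rotated images is only one possible choice of representative.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define MAXBOARDSIZE 24

/* Apply one of the 8 board symmetries (hypothetical helper).
 * k in 0..7: an optional left-right mirror (bit 2 of k) followed by
 * (k & 3) rotations of 90 degrees; k == 0 is the identity. */
void transform(const int *in, int *out, int n, int k)
{
    for (int y = 0; y < n; ++y) {
        int r = y, c = in[y];
        if (k & 4)
            c = n - 1 - c;                  /* mirror left-right */
        for (int i = 0; i < (k & 3); ++i) { /* rotate by 90 degrees */
            int t = r;
            r = c;
            c = n - 1 - t;
        }
        out[r] = c;
    }
}

/* Lexicographic comparison of two boards of size n. */
int cmp_boards(const int *a, const int *b, int n)
{
    for (int i = 0; i < n; ++i)
        if (a[i] != b[i])
            return a[i] < b[i] ? -1 : 1;
    return 0;
}

/* Store in best[] the smallest of the 8 images of in[]; all solutions
 * of the same equivalence class yield the same best[]. */
void canonical(const int *in, int *best, int n)
{
    int tmp[MAXBOARDSIZE];

    assert(n <= MAXBOARDSIZE);
    memcpy(best, in, n * sizeof(int));
    for (int k = 1; k < 8; ++k) {
        transform(in, tmp, n, k);
        if (cmp_boards(tmp, best, n) < 0)
            memcpy(best, tmp, n * sizeof(int));
    }
}

/* Return 1 if no two queens are a knight's move apart, 0 otherwise.
 * The input is assumed to already be a valid N-Queen solution. */
int knight_safe(const int *rows, int n)
{
    for (int y = 0; y < n; ++y)
        for (int d = 1; d <= 2 && y + d < n; ++d)
            if (abs(rows[y + d] - rows[y]) == 3 - d)
                return 0;
    return 1;
}
```

With this sketch, a newfound solution would be passed to ''canonical()'' and only the resulting board looked up or stored in the stage 2 structure, while ''knight_safe()'' would be applied to the stored representatives in stage 3. Whether the symmetry work is better done at insertion time or in a separate pass depends on the parallel structure chosen.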