CMPEN 431 - Project 3

In this project, you will use SimpleScalar as the evaluation engine to perform a design space exploration, using the provided framework code, over an 18-dimensional processor pipeline and memory hierarchy design space (some of these dimensions are not independent). You will use a 5-benchmark suite as the workload. Note that you can help other students with the setup (e.g., help them compile or run the code) or have a high-level discussion, but this assistance and discussion cannot include sharing access to any code produced as a solution to the project assignments.
You MUST use the 135 CSE machines (e5-cse-135-01.cse.psu.edu through e5-cse-135-28.cse.psu.edu), because the project uses a simulator and a set of benchmarks that are pre-installed on these machines. The username is your PSU id (abc1234) and the password is your PSU password.
This YouTube video from the previous semester explains how to run the simulator on the CSE machines.

1. Project Goal

Your assignment is to explore the design space, with an evaluation count limit of 1000 design points, in order to select the best performing design under each of two different optimization functions:
- The “best” performing overall design (in terms of the geometric mean of normalized execution time across all benchmarks)
- The most energy-efficient design (as measured by the lowest geometric mean of normalized energy-delay product across all benchmarks; units of energy-delay product are joule-seconds)
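Both objectives are geometric means of per-benchmark values normalized against a baseline. The following is a minimal sketch of that calculation, assuming the per-benchmark values for a design and for the baseline are available as two vectors in the same benchmark order. The function name geomeanNormalized is hypothetical and is not part of the provided framework.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of the provided framework): geometric mean of
// per-benchmark values, each normalized against the same benchmark's baseline.
// Works for execution time or EDP; lower results are better for both.
double geomeanNormalized(const std::vector<double>& design,
                         const std::vector<double>& baseline) {
    assert(!design.empty() && design.size() == baseline.size());
    double logSum = 0.0;
    for (std::size_t i = 0; i < design.size(); ++i) {
        assert(design[i] > 0.0 && baseline[i] > 0.0);
        // Summing logarithms avoids overflow or underflow of a long product.
        logSum += std::log(design[i] / baseline[i]);
    }
    return std::exp(logSum / static_cast<double>(design.size()));
}
```

A result below 1.0 would mean the design beats the baseline on average across the benchmarks.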

2. Background

2.1. SimpleScalar

SimpleScalar is an architectural simulator which enables a study of how different processor and memory system parameters affect performance and energy efficiency. The simulator accepts a set of system design parameters and an executable (workload) to run on the described system. A wide range of system statistics are recorded by the simulator as the executable runs on the simulated system. Once the framework in this project is set up, interested readers can have a look at one of the log files in the rawProjectOutputData folder to view SimpleScalar output.

This project heavily uses SimpleScalar, but most of the interface is abstracted out by a simpler framework interface. Nevertheless, you can refer to this SimpleScalar guide for details about parameters passed to SimpleScalar.

2.2. Design Space Exploration

Given a set of design parameters, Design Space Exploration (DSE) involves probing various design points to find the most suitable design to meet required goals. Follow this quick reading about DSE before moving ahead.

DSE can be performed for different design goals. For example, one DSE may want to find the best performing design whereas another DSE may be aimed at finding the most energy efficient design. A more complex DSE may look for the best performing design given a fixed energy budget.

An exhaustive DSE simply tries out all possible combinations of parameter values to find the absolute best design. However, as the size of the design space increases, this approach quickly becomes infeasible. Consider a 10-dimensional design space with 5 possible values for each parameter and 2 minutes of simulation time to evaluate a given design point; an exhaustive search will take 5^10 × 2 min ≈ 37 years.

A more intelligent DSE employs heuristics to prune the design space and to prioritize evaluation of more reasonable design points first. If the assumptions employed by the heuristics are correct, the DSE will still result in the best design. On the other hand, with a set of reasonably justified assumptions, a heuristic can result in a “good enough” design point.
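The exhaustive-search estimate above can be checked with a few lines of arithmetic. This is only an illustrative sketch; the function name exhaustiveSearchYears is hypothetical, and a 365-day year is assumed.

```cpp
#include <cassert>
#include <cmath>

// Years needed to simulate every point of a design space with `dims`
// dimensions and `valuesPerDim` options each, at `minutesPerPoint` minutes
// per simulated design point. Assumes a 365-day year.
double exhaustiveSearchYears(int dims, int valuesPerDim, double minutesPerPoint) {
    assert(dims >= 0 && valuesPerDim > 0 && minutesPerPoint >= 0.0);
    const double points = std::pow(static_cast<double>(valuesPerDim), dims);
    const double minutesPerYear = 60.0 * 24.0 * 365.0;
    return points * minutesPerPoint / minutesPerYear;
}
```

Calling exhaustiveSearchYears(10, 5, 2.0) gives roughly 37 years, which is why the project caps the search at 1000 evaluated design points.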

2.3. Energy-Delay Product

Energy-Delay Product (EDP) is a metric which consolidates both performance and energy efficiency.

EDP = total execution energy * execution time

Design A takes 100pJ to process an image in 100ms, EDP = 10000 units. Design B takes 80pJ to process an image in 2000ms, EDP = 160000 units. Design B is clearly more energy efficient, but it performs poorly as it incurs more execution time. EDP enables a more holistic design comparison.
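As a small illustration, the two EDP values in the example can be reproduced as follows. The function name energyDelayProduct is hypothetical; the framework computes EDP internally, and this sketch only mirrors the formula above.

```cpp
#include <cassert>

// EDP = total execution energy * execution time.
// The result is in whatever units the inputs imply (pJ*ms in the example;
// joule-seconds if energy is in joules and time is in seconds).
double energyDelayProduct(double energy, double time) {
    assert(energy >= 0.0 && time >= 0.0);
    return energy * time;
}
```

With the example numbers, energyDelayProduct(100.0, 100.0) is 10000 and energyDelayProduct(80.0, 2000.0) is 160000, so the design with the lower EDP is Design A.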

3. Our Heuristic

We define OurHeuristic as follows:
1. Design space dimensions can be labelled as either explored or unexplored
2. Initially all dimensions are unexplored
3. Choose an unexplored dimension, go to 4 if all dimensions are explored
   (a) Evaluate all possible design points by changing the value of this dimension only
   (b) Fix value of this dimension by selecting the best design so far (consider DSE goal)
   (c) Mark this dimension as explored
4. Set all dimensions to unexplored and go to step

You should choose an unexplored dimension in step 3 based on two PSU ID numbers, as follows. DSE dimensions can be categorized in four major classes:
- Branch predictor (BP) configurations (i.e. branchsettings, ras, btb)
- Cache configurations (i.e. {l1, ul2} block, {dl1, il1, ul2} sets, {dl1, il1, ul2} assoc)
- Core configurations (i.e. width, scheduling)
- Floating Point Unit (FPU) configuration (i.e. fpwidth)

Based on your PSU ID number, you should calculate
(PSU ID Number) mod 24
and then look at Table 1 and start from the first category in the corresponding row, then the second category, and so on.
For example, if your PSU ID number is 9123456789, the remainder of its division by 24 is 21, and you should explore Core configs first, then BP configs, then Cache configs, and finally FPU configs.
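The remainder in this example can be computed as sketched below. The function name explorationOrderRow is hypothetical. Note that a 10-digit ID does not fit in a 32-bit integer, so the sketch uses a 64-bit type.

```cpp
#include <cassert>
#include <cstdint>

// Row to look up in Table 1: (PSU ID Number) mod 24.
// A 64-bit type is used because 9123456789 exceeds the 32-bit range.
int explorationOrderRow(std::uint64_t psuIdNumber) {
    return static_cast<int>(psuIdNumber % 24);
}
```

For the ID in the example, explorationOrderRow(9123456789ULL) evaluates to 21.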
Please note that the heuristic currently implemented in the generateNextConfigurationProposal function is a simple one, described below, and you should extend it as explained above. The current implementation starts from the leftmost dimension, explores all possible options for this dimension, and then moves to the next dimension until it reaches the rightmost dimension.
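The one-dimension-at-a-time idea can be sketched independently of the framework as follows. This is an illustrative sketch only, not the framework's generateNextConfigurationProposal: the names Config and sweepOneDimensionAtATime, the cost callback, and the explicit dimension order are all assumptions made for the example. A real implementation would also have to skip invalid configurations and respect the evaluation limit, neither of which is modeled here.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

using Config = std::vector<int>;  // one option index per dimension

// Hypothetical sketch: visit dimensions in the given order, sweep every
// option of the current dimension while all others stay fixed, and keep the
// option with the lowest cost before moving on (lower cost is better).
Config sweepOneDimensionAtATime(Config current,
                                const std::vector<int>& optionsPerDim,
                                const std::vector<std::size_t>& order,
                                const std::function<double(const Config&)>& cost) {
    assert(current.size() == optionsPerDim.size());
    double bestCost = cost(current);
    for (std::size_t dim : order) {
        assert(dim < current.size());
        Config candidate = current;
        for (int v = 0; v < optionsPerDim[dim]; ++v) {
            candidate[dim] = v;  // change the value of this dimension only
            const double c = cost(candidate);
            if (c < bestCost) {
                bestCost = c;
                current[dim] = v;  // fix the best value seen so far
            }
        }
        // Leaving the inner loop corresponds to marking `dim` as explored.
    }
    return current;
}
```

In this sketch the order vector plays the role of the category ordering derived from the PSU ID, and the cost callback stands in for either optimization goal (normalized execution time or normalized EDP).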

4. Logistics

The set of possible points within the design space to be considered is constrained by the provided shell script wrapper runprojectsuite.sh. All allowed configuration parameters for each dimension of the design space are briefly described in the provided shell script. The runprojectsuite.sh shell script takes 18 integer arguments, one for each configuration dimension, each expressing the index of the selected parameter for that dimension. All reported results should be normalized against a baseline design whose configuration parameters are already hard-coded in the framework.

Note that not all possible parameter settings represent a valid combination. One of your tasks will be to write a configuration validation function based upon restrictions described later in this document. Further, note that this design space is too large to efficiently search in an exhaustive manner. Hence, a heuristic will be developed to specify an order in which the design space will be explored.

The framework code will evaluate a fixed number of design points per run. This parameter cannot be changed. The key part of your task in this project is to implement a heuristic search function that selects the next design point to consider, given either a performance or an energy efficiency goal. Note that the framework code must be run once for each of the optimization function options.

The framework, as given, provides functionality to enforce several, but by no means all, of the validation constraints. It is your job to implement validation functions to enforce constraints described throughout this document.
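To make the "18 integer arguments" interface concrete, the sketch below formats a configuration as a space-separated list of indices. The helper name formatConfigArgs is hypothetical, and the framework's own representation of a configuration may differ; this is only an assumption about how such an argument list could be built.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: joins the 18 dimension indices into the
// space-separated argument list a script taking 18 integers would expect.
std::string formatConfigArgs(const std::vector<int>& indices) {
    assert(indices.size() == 18);  // one index per design space dimension
    std::ostringstream out;
    for (std::size_t i = 0; i < indices.size(); ++i) {
        assert(indices[i] >= 0);   // indices select an option, so never negative
        if (i > 0) out << ' ';
        out << indices[i];
    }
    return out.str();
}
```

The index values themselves must still come from the allowed options documented in runprojectsuite.sh and pass the validation constraints.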

5. Framework

A sample run of the provided framework can look something like this:

    # Extract project files archive and navigate to project directory.
    make clean
    make
    ./DSE performance

Different components of the framework are invoked in the following order: DSE (project binary) → runprojectsuite.sh (shell script) → SimpleScalar

The DSE binary invokes the runprojectsuite.sh script, which in turn invokes the SimpleScalar simulator with appropriate arguments. Several log files are generated in the project directory on every invocation.

6. Anticipated Steps

These steps can serve as a high-level guideline to aid you during the project:
- Enter MY PSU ID in c.
- Set up the provided framework to get a set of results using the provided “unintelligent
- Implement validateConfiguration and generateCacheLatencyParams
- Implement OurHeuristic in generateNextConfigurationProposal for both optimization goals (a well performing design and an energy efficient design)
- Complete Report

7. Submission Requirements

Submitted artifacts should include:
- Project report
- Code implementations of missing or stub functions within the provided framework

7.1. Project Report
Your report must conform to the requirements listed in Appendix A. This report, the data contained within it, and their analysis will be the primary means of assessing this project. Your report must be submitted via Canvas (PDF only).
7.2. Code Implementations
You will submit the source files (Makefile, runprojectsuite.sh, .cpp and .h) of your implementation as a single tar archive for an audit of your implementation efforts. Ensure that your code compiles on the CSE machines without errors. You can make changes to the framework if you conclude that they are required. The following commands will be used to compile and execute your code (followed by analysis of generated log files):
# Extract project files archive and navigate to project directory.
Please note that running each of the exploration modes could take more than one hour, so please start as soon as possible so that you can finish by the deadline.

8. Modeling Considerations

The Instruction Count (IC) for each benchmark is a constant. Thus, for performance, you will be trying to optimize Instructions Per Cycle (IPC) and the Clock Cycle (CC) time. Unless specified otherwise, the following modeling considerations have already been implemented in the framework to calculate EDP. However, the provided information may be used for explaining design space exploration results.
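The relationship between these three quantities is the standard performance equation: execution time = (IC / IPC) × CC. A minimal sketch follows, with a hypothetical function name and with units (seconds per cycle) chosen for the example rather than taken from the framework.

```cpp
#include <cassert>

// Execution time = (instruction count / instructions per cycle) * cycle time.
// With IC constant per benchmark, only a higher IPC or a shorter clock cycle
// can reduce the result.
double executionTimeSeconds(double instructionCount, double ipc,
                            double clockCycleSeconds) {
    assert(instructionCount >= 0.0 && ipc > 0.0 && clockCycleSeconds > 0.0);
    return instructionCount / ipc * clockCycleSeconds;
}
```

For instance, 1e9 instructions at an IPC of 2.0 with a 1 ns cycle take about 0.5 s. A design change that raises IPC but lengthens the clock cycle can therefore end up slower overall.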