Page 2 - Contents
Contents
1   Introduction            2
1.1 What is Sniper?         2
1.2 Features                2
2   Getting Started         3
2.1 Downloading Sniper      3
2.2 Compiling Sniper ...
Page 3 - Comprehensive Option List; Introduction; This extends the use for Sniper to; Features; Interval core model
9   Comprehensive Option List                            26
9.1 Base Sniper Options                                  26
9.2 Options used to configure the Nehalem core           33
9.3 Options used to configure the Gainestown processor   35
9.4 Sniper Prefetcher Options ...
Page 4 - Getting Started; Downloading Sniper
• CPI Stack generation
• Parallel, multi-threaded simulator
• Multi-threaded application support
• Multi-program workload support with the SIFT trace format
• Validated against the Core2 microarchitecture
• Shared and private caches
• Heterogeneous core configuration
• Modern, Pentium-M style branch...
Page 5 - Listing 1: Running an included application; Running Sniper; Listing 2: Integrated Benchmarks Quickstart; Simulation Output
2.2 Compiling Sniper
Compile Sniper with the following command: cd sniper && make. Starting with Sniper version 3.0, machines with multiple processors can take advantage of a parallel make build. In that case, run make -j N, where N is the number of make processes to start...
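As a minimal sketch of choosing N for make -j N, the standalone Python helper below builds the make command line. The helper name make_command is hypothetical, and defaulting N to the number of processors reported by the operating system is an assumption, not a recommendation from this manual.

```python
import os


def make_command(jobs=None):
    """Return the argument list for a parallel build, e.g. ['make', '-j', '4'].

    jobs: number of make processes (the N in `make -j N`).
          Assumption: when omitted, use the processor count reported by the OS.
    """
    if jobs is None:
        jobs = os.cpu_count() or 1  # cpu_count() may return None
    if jobs < 1:
        raise ValueError("jobs must be at least 1")
    return ["make", "-j", str(jobs)]


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch has no side effects.
    print(" ".join(make_command()))
```

To actually start the build, the returned list could be passed to subprocess.run with cwd set to the sniper directory.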
Page 6 - Using the Integrated Benchmarks; Listing 3: Integrated Benchmarks Quickstart; Automatically Running Multiple Multi-threaded Workloads; Simulation modes
In addition to viewing the sim.out file, we encourage the use of the sniper/tools/sniper_lib.py:get_results() function to parse and process results. The sim.stats files store the raw counter information that has been recorded by the different components of Sniper. Since Sniper 4.0, the new SQLite3...
Page 8 - Using Your Own Benchmarks; option to; Manually Running Multi-program Workloads; Collecting Traces; BBVs suitable for SimPoint processing.
3.5 Using Your Own Benchmarks
Sniper can run most applications out of the box. Both static and dynamic binaries are supported, and no special recompilation of your binary is necessary. Nevertheless, we find that most people will want to define a region of interest (ROI) in their application to focus ...
Page 9 - Collecting and Playing Back Traces Simultaneously; Running MPI applications; Running
    $ ./record-trace -o fft -b 1000000 -- test/fft/fft -p 1 -m 20

This command generates a number of SIFT files with names in the format <name>.N.sift, where <name> is defined by the -o option and N is the block number. BBVs will be stored in &l...
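The <name>.N.sift naming scheme can be handled with a short standalone Python sketch like the one below, which parses trace file names and orders them by block number. The function names are hypothetical, and treating N as a plain decimal integer is an assumption.

```python
import re

# <name>.N.sift, where N is the block number (assumed to be a decimal integer).
SIFT_NAME = re.compile(r"^(?P<name>.+)\.(?P<block>\d+)\.sift$")


def parse_sift_name(filename):
    """Return (name, block_number) for a SIFT trace file name, or None."""
    match = SIFT_NAME.match(filename)
    if match is None:
        return None
    return match.group("name"), int(match.group("block"))


def sort_trace_blocks(filenames, name):
    """Return the trace files belonging to `name`, ordered by block number."""
    blocks = []
    for filename in filenames:
        parsed = parse_sift_name(filename)
        if parsed is not None and parsed[0] == name:
            blocks.append((parsed[1], filename))
    # Sort numerically so that block 10 comes after block 2.
    return [filename for _, filename in sorted(blocks)]
```

Sorting numerically matters because a plain string sort would place fft.10.sift before fft.2.sift.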
Page 10 - Logical versus physical addresses; Scripted simulator control with Python
run-sniper generates an mpiexec command line using mpiexec and the number of MPI ranks (-np). By default, one MPI rank is run per core. This can be overridden using the --mpi-ranks=N option. The mpiexec command can be overridden with the --mpi-exec= option. For example, this command will use mpirun ...
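The defaults described above (mpiexec as the launcher, one rank per core) can be illustrated with a standalone Python sketch. This is not run-sniper's actual code: the function build_mpi_command and its parameter names are hypothetical and only mirror the behavior of the --mpi-ranks and --mpi-exec options as described in the text.

```python
import shlex


def build_mpi_command(num_cores, mpi_ranks=None, mpi_exec="mpiexec"):
    """Build the launcher part of an MPI command line.

    num_cores: number of simulated cores.
    mpi_ranks: explicit rank count (models --mpi-ranks=N); defaults to one per core.
    mpi_exec:  launcher command (models --mpi-exec=); may contain extra arguments.
    """
    ranks = num_cores if mpi_ranks is None else mpi_ranks
    if ranks < 1:
        raise ValueError("need at least one MPI rank")
    # shlex.split lets the launcher string carry its own arguments.
    return shlex.split(mpi_exec) + ["-np", str(ranks)]
```

For example, build_mpi_command(4) keeps the default launcher and rank count, while build_mpi_command(4, mpi_ranks=2, mpi_exec="mpirun") overrides both.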
Page 11 - Listing 12: High Level DVFS Periodic Callback Example; Class
advancing too quickly with respect to the others. When using barrier synchronization via the clock_skew_minimization/scheme=barrier configuration option, the clock_skew_minimization/barrier/quantum=100 variable sets how often the periodic callback, or sim_hooks.HOOK_PERIODIC Python hook, in nanosecon...
Page 12 - Runtime Configuration Support with the SimAPI; Configuring Sniper; Listing 14: Key Value and Section
    HOOK_SYSCALL_ENTER
    HOOK_SYSCALL_EXIT
    HOOK_APPLICATION_START
    HOOK_APPLICATION_EXIT
    HOOK_APPLICATION_ROI_BEGIN
    HOOK_APPLICATION_ROI_END

4.1 Runtime Configuration Support with the SimAPI...
Page 13 - Listing 17: Hierarchical Section Config File; Configuration Files; config; Command Line Configuration; Listing 20: Passing Options via the Command Line; Heterogeneous Configuration
Listing 17: Hierarchical Section Config File
    [perf_model/core/interval_timer]
    window_size = 96

5.1 Configuration Files
The method we most often use is to pass an entire configuration file to Sniper from the command line. In the example below, we pass the opti...
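A hierarchical section such as [perf_model/core/interval_timer] with the key window_size corresponds to the full option path perf_model/core/interval_timer/window_size=96. The standalone Python sketch below converts between the two forms. The function names split_option and to_config_text are hypothetical and are not part of Sniper.

```python
def split_option(option):
    """Split 'section/sub/key=value' into (section, key, value)."""
    path, equals, value = option.partition("=")
    if not equals:
        raise ValueError("option must have the form section/key=value")
    # The last path component is the key; everything before it is the section.
    section, slash, key = path.strip().rpartition("/")
    if not slash or not section or not key:
        raise ValueError("option must name a section and a key")
    return section, key, value.strip()


def to_config_text(options):
    """Render a list of full option paths as sectioned config-file text."""
    grouped = {}
    for option in options:
        section, key, value = split_option(option)
        grouped.setdefault(section, []).append((key, value))
    lines = []
    for section, items in grouped.items():
        lines.append("[%s]" % section)
        for key, value in items:
            lines.append("%s = %s" % (key, value))
    return "\n".join(lines)
```

Options that share a section are grouped under a single header, in the order in which the section first appears.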
Page 14 - Listing 21: Example configuration file; Heterogeneous Options; Listing 22: A Selection of Heterogeneous Options; Configuration Parameters; Basic architectural options; Processor core
Listing 21: Example configuration file
    [perf_model/core]
    frequency = 2.66            # Set the default value
    frequency[] = 1.0,,,1.0     # Cores 1 and 2 use the default above, 2.66

In the example above, we first set a default frequency ...
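The way the per-core list in Listing 21 falls back to the default value can be sketched in standalone Python as follows. The function resolve_per_core is hypothetical, and requiring exactly one list entry per core is an assumption made for this illustration.

```python
def resolve_per_core(default, per_core, num_cores):
    """Resolve a heterogeneous option to one value per core.

    default:   the default value, e.g. '2.66' from `frequency = 2.66`.
    per_core:  the comma-separated list, e.g. '1.0,,,1.0' from `frequency[]`.
    num_cores: number of cores (assumption: the list has one entry per core).
    Empty entries fall back to the default value.
    """
    entries = per_core.split(",")
    if len(entries) != num_cores:
        raise ValueError("expected %d entries, got %d" % (num_cores, len(entries)))
    return [float(entry) if entry.strip() else float(default) for entry in entries]


if __name__ == "__main__":
    # Cores 0 and 3 are set explicitly; cores 1 and 2 use the default.
    print(resolve_per_core("2.66", "1.0,,,1.0", 4))
```

With four cores numbered from 0, the list 1.0,,,1.0 sets cores 0 and 3 to 1.0 and leaves cores 1 and 2 at the default of 2.66.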
Page 15 - Caches; Reschedule Cost; Configuring the DVFS Architecture
6.1.2 Caches
Description               Example Option
Number of cache levels    perf_model/cache/levels=3
L1-I options              perf_model/l1_icache/*
L1-D options              perf_model/l1_dcache/*
L2 options                perf_model/l2_cache/*
L* options                perf_model/l*_cache/*
Total cache size (kB)     perf_model/l*_cache/cache_size=256
Cache associativ...
Page 16 - Configuring DVFS at Startup
    # Setup the core DVFS transition latency
    [dvfs]
    transition_latency = 10000 # In ns

    # Configure 4-core DVFS granularity
    [dvfs/simple]
    cores_per_socket = 4

    # Place the L2 (and L1's ...
Page 17 - Understanding your Software with Sniper; CPI Stacks; To generate CPI Stacks, run the; Power Stacks; file in the directory where
" " " i m p o r t s y s , o s , s i m c l a s s D v f s : d e f s e t u p ( s e l f , a r g s ) : s e l f . e v e n t s = [ ]a r g s = a r g s . s p l i t ( ’ : ’ ) f o r i i n r a n g e ( 0 , l e n ( a r g s ) , 3 ) : s e l f . e v e n t s . a p p e n d ( ( l o n g ( a r g s [ i ] ) * s...
Page 18 - Loop Tracer; Listing 25: Loop Tracer Setup; Listing 26: Loop Tracer Output; Visualization; Cycle stacks plotted over time
7.3 Loop Tracer
The loop tracer allows one to determine the steady-state performance of an application loop. To use it, configure Sniper with the parameters from Listing 25. The output should look similar to Listing 26.

Listing 25: Loop Tracer Setup
    [general]
    syntax = att # Opti...
Page 19 - McPAT visualizations plotted over time
Figure 1: CPI stack over time for Splash-2 FFT with 2 threads in the detailed, normalized view. The application was run in Sniper with the gainestown configuration using the --viz and --power options.

detailed view. In the simple view, the used cycles are grouped in four main components: co...
Page 21 - Command Listing; Main Commands
Figure 3: Topology of the gainestown microarchitecture with a sparkline showing the misses per 1000 instructions (MPKI) of the first L1 data cache. The sparkline shows the MPKI of the Splash-2 FFT application running on two cores.

8 Command Listing
8.1 Main Commands
8.1.1 run-sniper
run-sniper [-n <...
Page 23 - Sniper Utilities; bt
--save-patch: Save a patch (to sim.patch) with the current Sniper code differences
--pin-stats: Enable basic Pin statistics. Normally saves to pin.log
--mpi: Enable single-node (shared-memory) MPI simulation support. Works with MPICH2 and Intel MPI (Requires version 4.0+)
--mpi-ranks: Specify t...
Page 24 - SIFT Utilities
-o <file>: Save gnuplot plotted data to <file>.png
--simplified: Create a CPI stack merging all items into the following categories: compute, communicate, synchronize
--no-collapse: Show all items, even if they are zero or below the threshold for merging them into the category other.
--ti...
Page 27 - Base Sniper Options
9 Comprehensive Option List
9.1 Base Sniper Options
Listing 27: Base options (base.cfg)
    # Configuration file for the Sniper simulator
    # This file is organized into sections defined in [] brackets as in [section].
    # S ...
Page 34 - Options used to configure the Nehalem core
    [scheduler]
    type = pinned

    [scheduler/pinned]
    quantum = 1000000 # Scheduler quantum (round-robin for active threads on each core), in nanoseconds
    core_mask = 1 # Mask of cores on ...
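The effect of a round-robin quantum on the threads sharing one core can be illustrated with a generic, standalone Python sketch. This is a textbook round-robin model, not Sniper's scheduler implementation. The function name round_robin and the per-thread work amounts are hypothetical.

```python
from collections import deque


def round_robin(work_ns, quantum_ns):
    """Simulate round-robin time slicing of threads on a single core.

    work_ns:    dict mapping thread name to its remaining run time in nanoseconds.
    quantum_ns: maximum time a thread runs before the next one is scheduled.
    Returns a list of (thread, start_ns, duration_ns) slices.
    """
    if quantum_ns <= 0:
        raise ValueError("quantum must be positive")
    queue = deque((t, w) for t, w in work_ns.items() if w > 0)
    now = 0
    timeline = []
    while queue:
        thread, remaining = queue.popleft()
        run = min(remaining, quantum_ns)
        timeline.append((thread, now, run))
        now += run
        if remaining > run:
            # Unfinished threads go to the back of the queue.
            queue.append((thread, remaining - run))
    return timeline


if __name__ == "__main__":
    for entry in round_robin({"A": 2500000, "B": 1000000}, quantum_ns=1000000):
        print(entry)
```

With quantum = 1000000 as in the listing above, each active thread in this model runs for at most 1 ms of simulated time before the core moves on to the next one.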
Page 36 - Options used to configure the Gainestown processor
    tags_access_time = 1
    perf_model_type = parallel
    writethrough = 0
    shared_cores = 1

    [perf_model/l2_cache]
    perfect = false
    cache_size = 256
    associativity = 8
    address_hash = mask
    repl...
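To see how cache_size and associativity translate into cache geometry, the standalone Python sketch below computes the number of sets and a mask-style set index. Two things here are assumptions that the listing above does not state: a 64-byte cache block, and that address_hash = mask means selecting the set from the low bits of the block address. The function names are hypothetical.

```python
def cache_num_sets(cache_size_kb, associativity, block_size=64):
    """Number of sets for a cache of `cache_size_kb` kB.

    block_size is in bytes; 64 is an assumed value, not taken from the listing.
    """
    total_bytes = cache_size_kb * 1024
    bytes_per_set = associativity * block_size
    if total_bytes % bytes_per_set != 0:
        raise ValueError("cache size is not a multiple of associativity * block size")
    return total_bytes // bytes_per_set


def set_index(address, num_sets, block_size=64):
    """Mask-style set selection (assumed meaning of address_hash = mask)."""
    if num_sets < 1 or num_sets & (num_sets - 1):
        raise ValueError("mask indexing needs a power-of-two number of sets")
    block_number = address // block_size
    return block_number & (num_sets - 1)


if __name__ == "__main__":
    sets = cache_num_sets(256, 8)  # values from [perf_model/l2_cache]
    print(sets, set_index(0x12345, sets))
```

Under these assumptions, the 256 kB, 8-way L2 configuration from the listing has 256 * 1024 / (8 * 64) = 512 sets.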
Page 38 - Sniper Prefetcher Options; DRAM Cache Options; SimAPI Commands
9.4 Sniper Prefetcher Options
Listing 30: Prefetcher options (prefetcher.cfg)
    [perf_model/l2_cache]
    # prefetcher = simple
    prefetcher = ghb

    [perf_model/l2_cache/prefetcher]
    prefetch_on_prefetch_hit = t ...