Commit graph

769 commits

Author SHA1 Message Date
Dam Sunwoo
809134a2b1 config: Add options to take/resume from SimPoint checkpoints
More documentation at http://gem5.org/Simpoints

Steps to profile, generate, and use SimPoints with gem5:

1. To profile workload and generate SimPoint BBV file, use the
following option:

--simpoint-profile --simpoint-interval <interval length>

Requires single Atomic CPU and fastmem.
<interval length> is in number of instructions.

2. Generate SimPoint analysis using SimPoint 3.2 from UCSD.
(SimPoint 3.2 not included with this flow.)

3. To take gem5 checkpoints based on SimPoint analysis, use the
following option:

--take-simpoint-checkpoint=<simpoint file path>,<weight file
path>,<interval length>,<warmup length>

<simpoint file> and <weight file> is generated by SimPoint analysis
tool from UCSD. SimPoint 3.2 format expected. <interval length> and
<warmup length> are in number of instructions.

4. To resume from gem5 SimPoint checkpoints, use the following option:

--restore-simpoint-checkpoint -r <N> --checkpoint-dir <simpoint
checkpoint path>

<N> is (SimPoint index + 1). E.g., "-r 1" will resume from SimPoint
#0.
2014-12-23 09:31:17 -05:00
Gabe Black
7540656fc5 config: Add two options for setting the kernel command line.
Both options accept template which will, through python string formatting,
have "mem", "disk", and "script" values substituted in from the mdesc.
Additional values can be used on a case by case basis by passing them as
keyword arguments to the fillInCmdLine function. That makes it possible to
have specialized parameters for a particular ISA, for instance.

The first option lets you specify the template directly, and the other lets
you specify a file which has the template in it.
2014-12-04 16:42:07 -08:00
Nilay Vaish
cca1608bd5 config: ruby: mi protocol: correct master slave setting for dma
In the MI protocol, the master slave connection between the dma controller
and network was being set incorrectly.  This patch corrects it.
2014-12-04 08:59:44 -06:00
Gabe Black
b7dc4ba516 config: Get rid of some extra spaces around default arguments. 2014-12-03 03:11:00 -08:00
Alexandru Dutu
a19cf6943b config, kvm: Enabling KvmCPU in SE mode
This patch modifies se.py such that it can now use kvm cpu model.
2014-11-23 18:01:08 -08:00
Steve Reinhardt
252a463b6b Backed out prior changeset f9fb64a72259
Back out use of importlib to avoid implicitly creating
dependency on Python 2.7.
2014-11-23 18:00:47 -08:00
Gabe Black
12243a3835 config: ruby: Get rid of an "eval" and an "exec" operating on generated code.
We can get the same result using importlib.
2014-11-23 05:55:26 -08:00
Nilay Vaish
708e80d9bb configs: small fix to ruby portion of fs.py and se.py
In fs.py the io port controller was being attached to the iobus multiple
times.  This should be done only once.  In se.py, the the option use_map
was being set which no longer exists.
2014-11-18 19:17:29 -06:00
Marc Orr
bf80734b2c x86 isa: This patch attempts an implementation at mwait.
Mwait works as follows:
1. A cpu monitors an address of interest (monitor instruction)
2. A cpu calls mwait - this loads the cache line into that cpu's cache.
3. The cpu goes to sleep.
4. When another processor requests write permission for the line, it is
   evicted from the sleeping cpu's cache. This eviction is forwarded to the
   sleeping cpu, which then wakes up.

Committed by: Nilay Vaish <nilay@cs.wisc.edu>
2014-11-06 05:42:22 -06:00
Nilay Vaish
0811f21f67 ruby: provide a backing store
Ruby's functional accesses are not guaranteed to succeed as of now.  While
this is not a problem for the protocols that are currently in the mainline
repo, it seems that coherence protocols for gpus rely on a backing store to
supply the correct data.  The aim of this patch is to make this backing store
configurable i.e. it comes into play only when a particular option:
--access-backing-store is invoked.

The backing store has been there since M5 and GEMS were integrated.  The only
difference is that earlier the system used to maintain the backing store and
ruby's copy was write-only.  Sometime last year, we moved to data being
supplied supplied by ruby in SE mode simulations.  And now we have patches on
the reviewboard, which remove ruby's copy of memory altogether and rely
completely on the system's memory to supply data.  This patch adds back a
SimpleMemory member to RubySystem.  This member is used only if the option:
access-backing-store is set to true.  By default, the memory would not be
accessed.
2014-11-06 05:42:21 -06:00
Nilay Vaish
3022d463fb ruby: interface with classic memory controller
This patch is the final in the series.  The whole series and this patch in
particular were written with the aim of interfacing ruby's directory controller
with the memory controller in the classic memory system.  This is being done
since ruby's memory controller has not being kept up to date with the changes
going on in DRAMs.  Classic's memory controller is more up to date and
supports multiple different types of DRAM.  This also brings classic and
ruby ever more close.  The patch also changes ruby's memory controller to
expose the same interface.
2014-11-06 05:42:21 -06:00
Nilay Vaish
95a0b18431 ruby: single physical memory in fs mode
Both ruby and the system used to maintain memory copies.  With the changes
carried for programmed io accesses, only one single memory is required for
fs simulations.  This patch sets the copy of memory that used to reside
with the system to null, so that no space is allocated, but address checks
can still be carried out.  All the memory accesses now source and sink values
to the memory maintained by ruby.
2014-11-06 05:41:44 -06:00
Ali Saidi
f2db2a96d1 arm, tests: Update config files to more recent kernels and create 64-bit regressions.
This changes the default ARM system to a Versatile Express-like system that supports
2GB of memory and PCI devices and updates the default kernels/file-systems for
AArch64 ARM systems (64-bit) to support up to 32GB of memory and PCI devices. Some
platforms that are no longer supported have been pruned from the configuration files.

In addition a set of 64-bit ARM regressions have been added to the regression system.
2014-10-29 23:18:27 -05:00
Ali Saidi
3a5c975fd7 arm: fix bare-metal memory setup.
The bare-metal configuration option still configured memory with the old scheme
that no-longer works. This change unifies the code so there aren't any differences.
2014-10-29 23:18:26 -05:00
Andreas Hansson
66df7b7fd4 config: Add the ability to read a config file using C++ and Python
This patch adds the ability to load in config.ini files generated from
gem5 into another instance of gem5 built without Python configuration
support. The intended use case is for configuring gem5 when it is a
library embedded in another simulation system.

A parallel config file reader is also provided purely in Python to
demonstrate the approach taken and to provided similar functionality
for as-yet-unknown use models. The Python configuration file reader
can read both .ini and .json files.

C++ configuration file reading:

A command line option has been added for scons to enable C++ configuration
file reading: --with-cxx-config

There is an example in util/cxx_config that shows C++ configuration in action.
util/cxx_config/README explains how to build the example.

Configuration is achieved by the object CxxConfigManager. It handles
reading object descriptions from a CxxConfigFileBase object which
wraps a config file reader. The wrapper class CxxIniFile is provided
which wraps an IniFile for reading .ini files. Reading .json files
from C++ would be possible with a similar wrapper and a JSON parser.

After reading object descriptions, CxxConfigManager creates
SimObjectParam-derived objects from the classes in the (generated with this
patch) directory build/ARCH/cxx_config

CxxConfigManager can then build SimObjects from those SimObjectParams (in an
order dictated by the SimObject-value parameters on other objects) and bind
ports of the produced SimObjects.

A minimal set of instantiate-replacing member functions are provided by
CxxConfigManager and few of the member functions of SimObject (such as drain)
are extended onto CxxConfigManager.

Python configuration file reading (configs/example/read_config.py):

A Python version of the reader is also supplied with a similar interface to
CxxConfigFileBase (In Python: ConfigFile) to config file readers.

The Python config file reading will handle both .ini and .json files.

The object construction strategy is slightly different in Python from the C++
reader as you need to avoid objects prematurely becoming the children of other
objects when setting parameters.

Port binding also needs to be strictly in the same port-index order as the
original instantiation.
2014-10-16 05:49:37 -04:00
Nilay Vaish
b80e574d01 config: separate function for instantiating a memory controller
This patch moves code for instantiating a single memory controller from
the function config_mem() to a separate function.  This is being done
so that memory controllers can be instantiated without assuming that
they will be attached to the system in a particular fashion.
2014-10-11 15:02:23 -05:00
Nilay Vaish
0f28d63272 ruby: moesi hammer: correct typo in master-slave assignment 2014-10-11 15:02:22 -05:00
Jiuyue Ma
e1a5522a89 config, x86: Ensure that PCI devs get bridged to the memory bus
This patch force IO device to be mapped to 0xC0000000-0xFFFF0000 by
reserve anything between the end of memory and 3GB if memory is less
than 3GB. It also statically bridge these address range to the IO bus,
which guaranty access to pci address space will pass though bridge to
iobus.

Committed by: Nilay Vaish <nilay@cs.wisc.edu>
2014-07-17 12:05:41 +08:00
Jiuyue Ma
7d03bf4d6b config, x86: swap bus_id of ISA/PCI in X86 IntelMPTable
This patch assign bus_id=0 to PCI bus and bus_id=1 to ISA bus for
X86 platform. Because PCI device get config space address using
Pc::calcPciConfigAddr() which requires "assert(bus==0)".
This fixes PCI interrupt routing and discovery on Linux.

Committed by: Nilay Vaish <nilay@cs.wisc.edu>
2014-07-17 11:00:12 +08:00
Andreas Hansson
1f6d5f8f84 mem: Rename Bus to XBar to better reflect its behaviour
This patch changes the name of the Bus classes to XBar to better
reflect the actual timing behaviour. The actual instances in the
config scripts are not renamed, and remain as e.g. iobus or membus.

As part of this renaming, the code has also been clean up slightly,
making use of range-based for loops and tidying up some comments. The
only changes outside the bus/crossbar code is due to the delay
variables in the packet.

--HG--
rename : src/mem/Bus.py => src/mem/XBar.py
rename : src/mem/coherent_bus.cc => src/mem/coherent_xbar.cc
rename : src/mem/coherent_bus.hh => src/mem/coherent_xbar.hh
rename : src/mem/noncoherent_bus.cc => src/mem/noncoherent_xbar.cc
rename : src/mem/noncoherent_bus.hh => src/mem/noncoherent_xbar.hh
rename : src/mem/bus.cc => src/mem/xbar.cc
rename : src/mem/bus.hh => src/mem/xbar.hh
2014-09-20 17:18:32 -04:00
Wendy Elsasser
a384525355 cpu: Update DRAM traffic gen
Add new DRAM_ROTATE mode to traffic generator.
This mode will generate DRAM traffic that rotates across
banks per rank, command types, and ranks per channel

The looping order is illustrated below:
for (ranks per channel)
   for (command types)
      for (banks per rank)
         // Generate DRAM Command Series

This patch also adds the read percentage as an input argument to the
DRAM sweep script. If the simulated read percentage is 0 or 100, the
middle for loop does not generate additional commands.  This loop is
used only when the read percentage is set to 50, in which case the
middle loop will toggle between read and write commands.

Modified sweep.py script, which generates DRAM traffic.
Added input arguments and support for new DRAM_ROTATE mode.
The script now has input arguments for:
 1) Read percentage
 2) Number of ranks
 3) Address mapping
 4) Traffic generator mode  (DRAM or DRAM_ROTATE)

The default values are:
 100% reads, 1 rank, RoRaBaCoCh address mapping, and DRAM traffic gen mode

For the DRAM traffic mode, added multi-rank support.
2014-09-20 17:17:55 -04:00
Dam Sunwoo
ca3513d630 cpu: use probes infrastructure to do simpoint profiling
Instead of having code embedded in cpu model to do simpoint profiling use
the probes infrastructure to do it.
2014-09-20 17:17:43 -04:00
Ali Saidi
1c0ae90027 arm: Support >2GB of memory for AArch64 systems 2014-09-03 07:43:05 -04:00
Ali Saidi
16262a8fc3 arm: Assume we have a kernel that supports pci devices
Change the default kernel for AArch64 and since it supports PCI devices
remove the hack that made it use CF. Unfortunately, there isn't really
a half-way here and we need to switch. Current users will get an error
message that the kernel isn't found and hopefully go download a new
kernel that supports PCI.
2014-09-03 07:43:04 -04:00
Geoffrey Blake
845e199934 config: Refactor RealviewEMM to fit into new config system
This eliminates some default devices and adds in helper functions
to connect the devices defined here to associate with the proper
clock domains.
2014-09-03 07:43:01 -04:00
Mitch Hayenga
976f27487b cpu: Change writeback modeling for outstanding instructions
As highlighed on the mailing list gem5's writeback modeling can impact
performance.  This patch removes the limitation on maximum outstanding issued
instructions, however the number that can writeback in a single cycle is still
respected in instToCommit().
2014-09-03 07:42:33 -04:00
Andreas Hansson
0756406739 mem: Add utility script to plot DRAM efficiency sweep
This patch adds basic functionality to quickly visualise the output
from the DRAM efficiency script. There are some unfortunate hacks
needed to communicate the needed information from one script to the
other, and we fall back on (ab)using the simout to do this.

As part of this patch we also trim the efficiency sweep to stop at 512
bytes as this should be sufficient for all forseeable DRAMs.
2014-09-03 07:42:29 -04:00
Nilay Vaish
7a0d5aafe4 ruby: message buffers: significant changes
This patch is the final patch in a series of patches.  The aim of the series
is to make ruby more configurable than it was.  More specifically, the
connections between controllers are not at all possible (unless one is ready
to make significant changes to the coherence protocol).  Moreover the buffers
themselves are magically connected to the network inside the slicc code.
These connections are not part of the configuration file.

This patch makes changes so that these connections will now be made in the
python configuration files associated with the protocols.  This requires
each state machine to expose the message buffers it uses for input and output.
So, the patch makes these buffers configurable members of the machines.

The patch drops the slicc code that usd to connect these buffers to the
network.  Now these buffers are exposed to the python configuration system
as Master and Slave ports.  In the configuration files, any master port
can be connected any slave port.  The file pyobject.cc has been modified to
take care of allocating the actual message buffer.  This is inline with how
other port connections work.
2014-09-01 16:55:47 -05:00
Emilio Castillo ext:(%2C%20Nilay%20Vaish%20%3Cnilay%40cs.wisc.edu%3E)
01f792a367 ruby: Fixes clock domains in configuration files
This patch fixes scripts related to ruby by adding the ruby clock domain.
Now the L1 controllers and  the Sequencer shares the cpu clock domain,
while the rest of the components use the ruby clock domain.

Before this patch, running simulations with the cpu clock set at 2GHz or
1GHz will output the same time results and could distort power measurements.

Committed by: Nilay Vaish <nilay@cs.wisc.edu>
2014-09-01 16:55:30 -05:00
Radhika Jagtap
860f00228b config: Fix cache latency param in mem test
This patch fixes the cache latency in mem test which is split into two params,
hit and response latency as per BaseCache.
2014-08-10 05:39:40 -04:00
Anthony Gutierrez
0ac4624595 arm: make the PseudoLRU tags the default for the O3_ARM_v7aL2
the Cortex-A15 has a random replacement policy for its L2 cache. see the
Cortex-A15 Technical Reference Manual 1.7 About the L2 memory system. this
patch makes the PseudoLRU tags the default for the ARM O3 CPU's L2 cache.
2014-07-28 12:22:00 -04:00
Andrew Bardsley
0e8a90f06b cpu: `Minor' in-order CPU model
This patch contains a new CPU model named `Minor'. Minor models a four
stage in-order execution pipeline (fetch lines, decompose into
macroops, decompose macroops into microops, execute).

The model was developed to support the ARM ISA but should be fixable
to support all the remaining gem5 ISAs. It currently also works for
Alpha, and regressions are included for ARM and Alpha (including Linux
boot).

Documentation for the model can be found in src/doc/inside-minor.doxygen and
its internal operations can be visualised using the Minorview tool
utils/minorview.py.

Minor was designed to be fairly simple and not to engage in a lot of
instruction annotation. As such, it currently has very few gathered
stats and may lack other gem5 features.

Minor is faster than the o3 model. Sample results:

     Benchmark     |   Stat host_seconds (s)
    ---------------+--------v--------v--------
     (on ARM, opt) | simple | o3     | minor
                   | timing | timing | timing
    ---------------+--------+--------+--------
    10.linux-boot  |   169  |  1883  |  1075
    10.mcf         |   117  |   967  |   491
    20.parser      |   668  |  6315  |  3146
    30.eon         |   542  |  3413  |  2414
    40.perlbmk     |  2339  | 20905  | 11532
    50.vortex      |   122  |  1094  |   588
    60.bzip2       |  2045  | 18061  |  9662
    70.twolf       |   207  |  2736  |  1036
2014-07-23 16:09:04 -05:00
Anthony Gutierrez
db267da822 arm: make the bi-mode predictor the default for O3_ARM_v7a_BP
the branch predictor used in the Cortex-A15 is a bi-mode style predictor,
see:

http://arm.com/files/pdf/at-exploring_the_design_of_the_cortex-a15.pdf
and
http://nvidia.com/docs/IO/116757/NVIDIA_Quad_a15_whitepaper_FINALv2.pdf

this patch makes the bi-mode predictor the default for the ARM O3 CPU.
2014-06-30 13:50:01 -04:00
Anthony Gutierrez
53dd4497b3 config: remove unecessary assignment of etherlink interfaces
in makeDualRoot() the etherlink interfaces are set using the tsunami interface
however, they are set again a few lines later based on whether or not the system
is a realview or tsunami system; the original assignment is always overwritten
or there will be a fatal. this seems like an artifact from when tsunami was the
only type of system capable of running with the dual option.
2014-05-15 13:26:31 -04:00
Andreas Hansson
aa329f4757 config: Bump DRAM sweep bus speed to match DDR4 config
This patch bumps the bus clock speed such that the interconnect does
not become a bottleneck with a DDR4-2400-x64 DRAM delivering 19.2
GByte/s theoretical max.
2014-05-09 18:58:49 -04:00
Nilay Vaish
097aadc2cd config: ruby: remove memory controller from network test
It is not in use and not required as such.
2014-04-19 09:00:30 -05:00
Anthony Gutierrez
5ab6bdc1ec arm: set default kernels for VExpress_EMM and VExpress_EMM64 2014-04-14 19:30:24 -04:00
Gedare Bloom
ca90a54476 config: add num-work-ids command line option
Adds the parameter --num-work-ids to Options.py and reads the parameter
into the System params in Simulation.py. This parameter enables setting
the number of possible work items to different than 16. Support for this
parameter already exists in src/sim/System.py, so this changeset only
affects the Python config files.

Committed by: Nilay Vaish <nilay@cs.wisc.edu>
2014-04-10 13:43:33 -05:00
Nilay Vaish
d6542d7758 configs: use SimpleMemory when using ruby in se mode
A recent changeset altered the default memory class to DRAMCtrl.  In se mode,
ruby uses the physical memory to check if a given address is within the bounds
of the physical memory.  SimpleMemory is enough for this.  Moreover,
SimpleMemory does not check whether it is connected or not, something which
DRAMCtrl does.
2014-04-01 11:17:46 -05:00
Andreas Hansson
7c18691db1 mem: Rename SimpleDRAM to a more suitable DRAMCtrl
This patch renames the not-so-simple SimpleDRAM to a more suitable
DRAMCtrl. The name change is intended to ensure that we do not send
the wrong message (although the "simple" in SimpleDRAM was originally
intended as in cleverly simple, or elegant).

As the DRAM controller modelling work is being presented at ISPASS'14
our hope is that a broader audience will use the model in the future.

--HG--
rename : src/mem/SimpleDRAM.py => src/mem/DRAMCtrl.py
rename : src/mem/simple_dram.cc => src/mem/dram_ctrl.cc
rename : src/mem/simple_dram.hh => src/mem/dram_ctrl.hh
2014-03-23 11:12:12 -04:00
Andreas Hansson
3dd1587afc mem: Change memory defaults to be more representative
Make the default memory type DDR3-1600 x64, and use the open-adaptive
page policy. This change is aiming to ensure that users by default are
using a realistic memory system.
2014-03-23 11:12:10 -04:00
Andreas Hansson
7d883df7e5 config: Add a DRAM efficiency-sweep script
This patch adds a configuration that simplifies evaluation of DRAM
controller configurations by automating a sweep of stride size and
bank parallelism. It works in a rather unconventional way, as it needs
to print the traffic generator stimuli based on the memory
organisation. Hence, it starts by configuring the memory, then it
prints a traffic-generator config file, and loads it.

The resulting stats have one period per data point, identified by the
stride size, and the number of banks being used.
2014-03-23 11:12:00 -04:00
Andreas Hansson
7e7b67472a mem: More descriptive address-mapping scheme names
This patch adds the row bits to the name of the address mapping
schemes to make it more clear that all the current schemes places the
row bits as the most significant bits.
2014-03-23 11:11:53 -04:00
Nilay Vaish
4b67ada89e ruby: garnet: convert network interfaces into clocked objects
This helps in configuring the network interfaces from the python script and
these objects no longer rely on the network object for the timing information.
2014-03-20 09:14:14 -05:00
Nilay Vaish
b5cc4c7604 config: ruby: rename _cpu_ruby_ports to _cpu_ports 2014-03-20 09:14:14 -05:00
Nilay Vaish
f2059f8399 config: fs.py: move creating of test/drive systems to functions
The code that creates test and drive systems is being moved to separate
functions so as to make the code more readable.  Ultimately the two
functions would be combined so that the replicated code is eliminated.
2014-03-20 09:14:08 -05:00
Nilay Vaish
d5b5d89b34 config: remove ruby_fs.py
The patch removes the ruby_fs.py file.  The functionality is being moved to
fs.py.  This would being ruby fs simulations in line with how ruby se
simulations are started (using --ruby option).  The alpha fs config functions
are being combined for classing and ruby memory systems.  This required
renaming the piobus in ruby to iobus.  So, we will have stats being renamed
in the stats file for ruby fs regression.
2014-03-20 08:03:09 -05:00
Nilay Vaish
9b3418d163 ruby: no piobus in se mode
Piobus was recently added to se scripts for ruby so that the interrupt
controller can be connected to something (required since the interrupt
controller sends address range messages).  This patch removes the piobus
and instead, the pio port of ruby port will now ignore the range change
messages in se mode.
2014-03-20 08:03:09 -05:00
Nilay Vaish
a20fbdfc23 config: ruby: remove piobus from protocols
This patch removes the piobus from the protocol config files.  The ports
are now connected to the piobus in the Ruby.py file.
2014-03-17 17:40:15 -05:00
Nilay Vaish
8504b079b8 ruby: correct errors in changeset 4eec7bdde5b0
Couple of errors were discovered in 4eec7bdde5b0 which necessitated this patch.
Firstly, we create interrupt controllers in the se mode, but no piobus was
being created.  RubyPort, which earlier used to ignore range changes now
forwards those to the piobus.  The lack of piobus resulted in segmentation
fault.  This patch creates a piobus even in se mode.  It is not created only
when some tester is running.  Secondly,  I had missed out on modifying port
connections for other coherence protocols.
2014-02-24 20:50:05 -06:00
Nilay Vaish
7e27860ef4 ruby: route all packets through ruby port
Currently, the interrupt controller in x86 is connected to the io bus
directly.  Therefore the packets between the io devices and the interrupt
controller do not go through ruby.  This patch changes ruby port so that
these packets arrive at the ruby port first, which then routes them to their
destination.  Note that the patch does not make these packets go through the
ruby network.  That would happen in a subsequent patch.
2014-02-23 19:16:16 -06:00
Nilay Vaish
6aafd5cb3f config: topologies: slight code refactor 2014-02-23 19:16:15 -06:00
Nilay Vaish
385e542c5a config: ruby_random_test: updates due to recent unrelated changes 2014-02-21 08:02:06 -06:00
Anthony Gutierrez
6b765ba8b7 arm: armv8 boot options to enable v8
Modifies FSConfig.py to enable ARMv8 compatibility.
To boot gem5 with ARMv8:
   Download the v8 kernel, .dtb file, and root FS from: http://gem5.org/Download
   Download the ARMv8 toolchain, and add the bin dir to your path:
       http://www.linaro.org/engineering/engineering-projects/armv8
   Build gem5 for ARM
   Build the v8 bootloader (in gem5/system/arm/aarch64_bootloader)
   Make script in gem5/system/arm/aarch64_bootloader will require v8 toolchain,
   drop the produced boot_emm.arm64 in $(M5_PATH)/binaries/
Run:
   $ build/ARM/gem5.fast configs/example/fs.py --machine-type=VExpress_EMM64 \
     --kernel=/path/to/kernel/vmlinux-linaro-tracking \
     --dtb-filename=/path/to/dtb/rtsm_ve-aemv8a.dtb \
     --disk-image=/path/to/img/linaro-minimal-armv8.img
2014-02-18 17:20:56 -05:00
Andreas Hansson
bf2f178f85 mem: Add a wrapped DRAMSim2 memory controller
This patch adds DRAMSim2 as a memory controller by wrapping the
external library and creating a sublass of AbstractMemory that bridges
between the semantics of gem5 and the DRAMSim2 interface.

The DRAMSim2 wrapper extracts the clock period from the config
file. There is no way of extracting this information from DRAMSim2
itself, so we simply read the same config file and get it from there.

To properly model the response queue, the wrapper keeps track of how
many transactions are in the actual controller, and how many are
stacking up waiting to be sent back as responses (in the wrapper). The
latter requires us to move away from the queued port and manage the
packets ourselves. This is due to DRAMSim2 not having any flow control
on the response path.

DRAMSim2 assumes that the transactions it is given are matching the
burst size of the choosen memory. The wrapper checks to ensure the
cache line size of the system matches the burst size of DRAMSim2 as
there are currently no provisions to split the system requests. In
theory we could allow a cache line size smaller than the burst size,
but that would lead to inefficient use of the DRAM, so for not we
fatal also in this case.
2014-02-18 05:50:53 -05:00
Nilay Vaish
3526676165 config: correct bug in x86 drive sys instantiation 2014-01-31 15:35:45 -06:00
Nilay Vaish
7792dedfdd x86: add a warning about the number of memory controllers
When memory size > 3GB, print a warning that twice the number of memory
controllers would be created.
2014-01-28 07:15:53 -06:00
Nilay Vaish
95b782f600 config: allow more than 3GB of memory for x86 simulations
This patch edits the configuration files so that x86 simulations can have
more than 3GB of memory.  It also corrects a bug in the MemConfig.py script.
2014-01-27 18:50:51 -06:00
ARM gem5 Developers
612f8f074f arm: Add support for ARMv8 (AArch64 & AArch32)
Note: AArch64 and AArch32 interworking is not supported. If you use an AArch64
kernel you are restricted to AArch64 user-mode binaries. This will be addressed
in a later patch.

Note: Virtualization is only supported in AArch32 mode. This will also be fixed
in a later patch.

Contributors:
Giacomo Gabrielli    (TrustZone, LPAE, system-level AArch64, AArch64 NEON, validation)
Thomas Grocutt       (AArch32 Virtualization, AArch64 FP, validation)
Mbou Eyole           (AArch64 NEON, validation)
Ali Saidi            (AArch64 Linux support, code integration, validation)
Edmund Grimley-Evans (AArch64 FP)
William Wang         (AArch64 Linux support)
Rene De Jong         (AArch64 Linux support, performance opt.)
Matt Horsnell        (AArch64 MP, validation)
Matt Evans           (device models, code integration, validation)
Chris Adeniyi-Jones  (AArch64 syscall-emulation)
Prakash Ramrakhyani  (validation)
Dam Sunwoo           (validation)
Chander Sudanthi     (validation)
Stephan Diestelhorst (validation)
Andreas Hansson      (code integration, performance opt.)
Eric Van Hensbergen  (performance opt.)
Gabe Black
2014-01-24 15:29:34 -06:00
Nilay Vaish
407f37e15f ruby: move all statistics to stats.txt, eliminate ruby.stats 2014-01-10 16:19:47 -06:00
Nilay Vaish
4070b00875 ruby: add a three level MESI protocol.
The first two levels (L0, L1) are private to the core, the third level (L2)is
possibly shared. The protocol supports clustered designs.  For example, one
can have two sets of two cores. Each core has an L0 and L1 cache. There are
two L2 controllers where each set accesses only one of the L2 controllers.
2014-01-04 00:03:34 -06:00
Nilay Vaish
bb6d7d402b ruby: rename MESI_CMP_directory to MESI_Two_Level
This is because the next patch introduces a three level hierarchy.

--HG--
rename : build_opts/ALPHA_MESI_CMP_directory => build_opts/ALPHA_MESI_Two_Level
rename : build_opts/X86_MESI_CMP_directory => build_opts/X86_MESI_Two_Level
rename : configs/ruby/MESI_CMP_directory.py => configs/ruby/MESI_Two_Level.py
rename : src/mem/protocol/MESI_CMP_directory-L1cache.sm => src/mem/protocol/MESI_Two_Level-L1cache.sm
rename : src/mem/protocol/MESI_CMP_directory-L2cache.sm => src/mem/protocol/MESI_Two_Level-L2cache.sm
rename : src/mem/protocol/MESI_CMP_directory-dir.sm => src/mem/protocol/MESI_Two_Level-dir.sm
rename : src/mem/protocol/MESI_CMP_directory-dma.sm => src/mem/protocol/MESI_Two_Level-dma.sm
rename : src/mem/protocol/MESI_CMP_directory-msg.sm => src/mem/protocol/MESI_Two_Level-msg.sm
rename : src/mem/protocol/MESI_CMP_directory.slicc => src/mem/protocol/MESI_Two_Level.slicc
rename : tests/long/fs/10.linux-boot/ref/x86/linux/pc-simple-timing-ruby-MESI_CMP_directory/config.ini => tests/long/fs/10.linux-boot/ref/x86/linux/pc-simple-timing-ruby-MESI_Two_Level/config.ini
rename : tests/long/fs/10.linux-boot/ref/x86/linux/pc-simple-timing-ruby-MESI_CMP_directory/ruby.stats => tests/long/fs/10.linux-boot/ref/x86/linux/pc-simple-timing-ruby-MESI_Two_Level/ruby.stats
rename : tests/long/fs/10.linux-boot/ref/x86/linux/pc-simple-timing-ruby-MESI_CMP_directory/simerr => tests/long/fs/10.linux-boot/ref/x86/linux/pc-simple-timing-ruby-MESI_Two_Level/simerr
rename : tests/long/fs/10.linux-boot/ref/x86/linux/pc-simple-timing-ruby-MESI_CMP_directory/simout => tests/long/fs/10.linux-boot/ref/x86/linux/pc-simple-timing-ruby-MESI_Two_Level/simout
rename : tests/long/fs/10.linux-boot/ref/x86/linux/pc-simple-timing-ruby-MESI_CMP_directory/stats.txt => tests/long/fs/10.linux-boot/ref/x86/linux/pc-simple-timing-ruby-MESI_Two_Level/stats.txt
rename : tests/long/fs/10.linux-boot/ref/x86/linux/pc-simple-timing-ruby-MESI_CMP_directory/system.pc.com_1.terminal => tests/long/fs/10.linux-boot/ref/x86/linux/pc-simple-timing-ruby-MESI_Two_Level/system.pc.com_1.terminal
rename : tests/quick/se/00.hello/ref/alpha/linux/simple-timing-ruby-MESI_CMP_directory/config.ini => tests/quick/se/00.hello/ref/alpha/linux/simple-timing-ruby-MESI_Two_Level/config.ini
rename : tests/quick/se/00.hello/ref/alpha/linux/simple-timing-ruby-MESI_CMP_directory/ruby.stats => tests/quick/se/00.hello/ref/alpha/linux/simple-timing-ruby-MESI_Two_Level/ruby.stats
rename : tests/quick/se/00.hello/ref/alpha/linux/simple-timing-ruby-MESI_CMP_directory/simerr => tests/quick/se/00.hello/ref/alpha/linux/simple-timing-ruby-MESI_Two_Level/simerr
rename : tests/quick/se/00.hello/ref/alpha/linux/simple-timing-ruby-MESI_CMP_directory/simout => tests/quick/se/00.hello/ref/alpha/linux/simple-timing-ruby-MESI_Two_Level/simout
rename : tests/quick/se/00.hello/ref/alpha/linux/simple-timing-ruby-MESI_CMP_directory/stats.txt => tests/quick/se/00.hello/ref/alpha/linux/simple-timing-ruby-MESI_Two_Level/stats.txt
rename : tests/quick/se/00.hello/ref/alpha/tru64/simple-timing-ruby-MESI_CMP_directory/config.ini => tests/quick/se/00.hello/ref/alpha/tru64/simple-timing-ruby-MESI_Two_Level/config.ini
rename : tests/quick/se/00.hello/ref/alpha/tru64/simple-timing-ruby-MESI_CMP_directory/ruby.stats => tests/quick/se/00.hello/ref/alpha/tru64/simple-timing-ruby-MESI_Two_Level/ruby.stats
rename : tests/quick/se/00.hello/ref/alpha/tru64/simple-timing-ruby-MESI_CMP_directory/simerr => tests/quick/se/00.hello/ref/alpha/tru64/simple-timing-ruby-MESI_Two_Level/simerr
rename : tests/quick/se/00.hello/ref/alpha/tru64/simple-timing-ruby-MESI_CMP_directory/simout => tests/quick/se/00.hello/ref/alpha/tru64/simple-timing-ruby-MESI_Two_Level/simout
rename : tests/quick/se/00.hello/ref/alpha/tru64/simple-timing-ruby-MESI_CMP_directory/stats.txt => tests/quick/se/00.hello/ref/alpha/tru64/simple-timing-ruby-MESI_Two_Level/stats.txt
rename : tests/quick/se/50.memtest/ref/alpha/linux/memtest-ruby-MESI_CMP_directory/config.ini => tests/quick/se/50.memtest/ref/alpha/linux/memtest-ruby-MESI_Two_Level/config.ini
rename : tests/quick/se/50.memtest/ref/alpha/linux/memtest-ruby-MESI_CMP_directory/ruby.stats => tests/quick/se/50.memtest/ref/alpha/linux/memtest-ruby-MESI_Two_Level/ruby.stats
rename : tests/quick/se/50.memtest/ref/alpha/linux/memtest-ruby-MESI_CMP_directory/simerr => tests/quick/se/50.memtest/ref/alpha/linux/memtest-ruby-MESI_Two_Level/simerr
rename : tests/quick/se/50.memtest/ref/alpha/linux/memtest-ruby-MESI_CMP_directory/simout => tests/quick/se/50.memtest/ref/alpha/linux/memtest-ruby-MESI_Two_Level/simout
rename : tests/quick/se/50.memtest/ref/alpha/linux/memtest-ruby-MESI_CMP_directory/stats.txt => tests/quick/se/50.memtest/ref/alpha/linux/memtest-ruby-MESI_Two_Level/stats.txt
rename : tests/quick/se/60.rubytest/ref/alpha/linux/rubytest-ruby-MESI_CMP_directory/config.ini => tests/quick/se/60.rubytest/ref/alpha/linux/rubytest-ruby-MESI_Two_Level/config.ini
rename : tests/quick/se/60.rubytest/ref/alpha/linux/rubytest-ruby-MESI_CMP_directory/ruby.stats => tests/quick/se/60.rubytest/ref/alpha/linux/rubytest-ruby-MESI_Two_Level/ruby.stats
rename : tests/quick/se/60.rubytest/ref/alpha/linux/rubytest-ruby-MESI_CMP_directory/simerr => tests/quick/se/60.rubytest/ref/alpha/linux/rubytest-ruby-MESI_Two_Level/simerr
rename : tests/quick/se/60.rubytest/ref/alpha/linux/rubytest-ruby-MESI_CMP_directory/simout => tests/quick/se/60.rubytest/ref/alpha/linux/rubytest-ruby-MESI_Two_Level/simout
rename : tests/quick/se/60.rubytest/ref/alpha/linux/rubytest-ruby-MESI_CMP_directory/stats.txt => tests/quick/se/60.rubytest/ref/alpha/linux/rubytest-ruby-MESI_Two_Level/stats.txt
2014-01-04 00:03:33 -06:00
Nilay Vaish
9ec59e8b69 ruby: remove cntrl_id from python config scripts. 2014-01-04 00:03:32 -06:00
Nilay Vaish
9853ef6651 ruby: some small changes 2014-01-04 00:03:30 -06:00
Steve Reinhardt
a212844f67 config, x86: move kernel specification from tests to FSConfig.py
For some reason, the default x86 kernel is specified in
tests/configs/x86_generic.py and not in configs/common/FSConfig.py,
where the kernels for all the other ISAs are.  This means that
running configs/example/fs.py for x86 fails because no kernel
is specified.  Moving the specification over fixes this problem.

There is another problem that this uncovers, which is that going
past the init stage (i.e., past where the regression test stops)
fails because the fsck test on the disk device fails, but that's
a separate issue.
2014-01-03 17:08:44 -08:00
Nilay Vaish
f5b52a265a ruby: mesi: remove owner and sharer fields from directory tags
The directory controller should not have the sharer field since there is
only one level 2 cache. Anyway the field was not in use.  The owner field
was being used to track the l2 cache version (in case of distributed l2) that
has the cache block under consideration.  The information is not required
since the version of the level 2 cache can be obtained from a subset of the
address bits.
2013-12-20 20:34:03 -06:00
Anthony Gutierrez
8a53da22c2 cpu: allow the fetch buffer to be smaller than a cache line
the current implementation of the fetch buffer in the o3 cpu
is only allowed to be the size of a cache line. some
architectures, e.g., ARM, have fetch buffers smaller than a cache
line, see slide 22 at:
http://www.arm.com/files/pdf/at-exploring_the_design_of_the_cortex-a15.pdf

this patch allows the fetch buffer to be set to values smaller
than a cache line.
2013-11-15 13:21:15 -05:00
Dam Sunwoo
1e2a455a23 util: Streamline .apc project convertsion script
This Python script generates an ARM DS-5 Streamline .apc project based
on gem5 run. To successfully convert, the gem5 runs needs to be run
with the context-switch-based stats dump option enabled (The guest
kernel also needs to be patched to allow gem5 interrogate its task
information.) See help for more information.
2013-10-17 10:20:45 -05:00
Ali Saidi
735847179d arm, config: Fix a small issue with the dtb file being specified 2013-10-17 10:20:45 -05:00
Ali Saidi
21f1e16763 config: Fix memtest example script 2013-10-17 10:20:45 -05:00
Nilay Vaish
dbc3b2e86a config: correct example ruby scripts
A couple of recent changesets added/deleted/edited some variables
that are needed for running the example ruby scripts. This changeset
edits these scripts to bring them to a working state.
2013-10-09 17:28:14 -05:00
Nilay Vaish
2dec06a57b config: set cwd for processes in se.py 2013-10-07 18:05:50 -05:00
Andreas Sandberg
fec2dea5c3 x86: Add support for m5ops through a memory mapped interface
In order to support m5ops in virtualized environments, we need to use
a memory mapped interface. This changeset adds support for that by
reserving 0xFFFF0000-0xFFFFFFFF and mapping those to the generic IPR
interface for m5ops. The mapping is done in the
X86ISA::TLB::finalizePhysical() which means that it just works for all
of the CPU models, including virtualized ones.
2013-09-30 12:20:53 +02:00
Andreas Sandberg
f62119c77a config: Add a 'kvm' CPU alias
Add a CPU alias, 'kvm', for the first available KVM-accelerated CPU
model.
2013-09-30 09:45:43 +02:00
Joel Hestness
30c588a483 configs: Fix ruby_fs.py cache line size
Recent changes added setting of system-wide cache line size and these settings
occur in the top-level configs (se.py and fs.py). This setting also needs to
take place in ruby_fs.py. This change sets the cache line size as appropriate.
2013-09-17 19:39:11 -05:00
Andreas Hansson
c9e45f01e4 config: Add voltage domain to Ruby example scripts
This patch adds the minimum required voltage domain configuration to
the Ruby example scripts.
2013-09-12 17:49:12 -04:00
Joel Hestness
073b27c257 config: Initialize and check cpt_starttick
The previous changeset (9816) that fixes the use of max ticks introduced the
variable cpt_starttick, which is used for setting the relative max tick.
Unfortunately, with checkpointing at an instruction count or with simpoints,
the checkpoint tick is not stored conveniently, so to ensure that cpt_starttick
is initialized, set it to 0. Also, if using --rel-max-tick, check the use of
instruction counts or simpoints to warn the user that the max tick setting does
not include the checkpoint ticks.
2013-09-11 15:34:21 -05:00
Nilay Vaish
e9ae8b7d29 ruby: network: correct naming of routers
The routers are created before the network class. This results in the routers
becoming children of the first link they are connected to and they get generic
names like int_node and node_b. This patch creates the network object first
and passes it to the topology creation function. Now the routers are children
of the network object and names are much more sensible.
2013-09-06 16:21:33 -05:00
Ali Saidi
e1b9c8c169 ARM: Fix configuration files for bare-metal binaries. 2013-08-26 10:58:06 -05:00
Nilay Vaish
c4e7e18eeb ruby: add option for number of transitions per cycle
The number of transitions per cycle that a controller can carry out is
a proxy for the number of ports that a controller has. This value is
currently 32 which is way too high. The patch introduces an option
for the number of ports and uses this option in the protocol files
to set the number of transitions. The default value is being set to
4. None of the se regressions change. Ruby stats for the fs regression
change and are being updated.
2013-08-20 11:32:31 -05:00
Andreas Hansson
c26911013c config: Command line support for multi-channel memory
This patch adds support for specifying multi-channel memory
configurations on the command line, e.g. 'se/fs.py
--mem-type=ddr3_1600_x64 --mem-channels=4'. To enable this, it
enhances the functionality of MemConfig and moves the existing
makeMultiChannel class method from SimpleDRAM to the support scripts.

The se/fs.py example scripts are updated to make use of the new
feature.
2013-08-19 03:52:34 -04:00
Andreas Hansson
49d88f08b0 mem: Change AbstractMemory defaults to match the common case
This patch changes the default parameter value of conf_table_reported
to match the common case. It also simplifies the regression and config
scripts to reflect this change.
2013-08-19 03:52:33 -04:00
Akash Bagdia
e7e17f92db power: Add voltage domains to the clock domains
This patch adds the notion of voltage domains, and groups clock
domains that operate under the same voltage (i.e. power supply) into
domains. Each clock domain is required to be associated with a voltage
domain, and the latter requires the voltage to be explicitly set.

A voltage domain is an independently controllable voltage supply being
provided to section of the design. Thus, if you wish to perform
dynamic voltage scaling on a CPU, its clock domain should be
associated with a separate voltage domain.

The current implementation of the voltage domain does not take into
consideration cases where there are derived voltage domains running at
ratio of native voltage domains, as with the case where there can be
on-chip buck/boost (charge pumps) voltage regulation logic.

The regression and configuration scripts are updated with a generic
voltage domain for the system, and one for the CPUs.
2013-08-19 03:52:28 -04:00
Andreas Hansson
a8480fe1c3 config: Move the memory instantiation outside FSConfig
This patch moves the instantiation of the memory controller outside
FSConfig and instead relies on the mem_ranges to pass the information
to the caller (e.g. fs.py or one of the regression scripts). The main
motivation for this change is to expose the structural composition of
the memory system and allow more tuning and configuration without
adding a large number of options to the makeSystem functions.

The patch updates the relevant example scripts to maintain the current
functionality. As the order that ports are connected to the memory bus
changes (in certain regresisons), some bus stats are shuffled
around. For example, what used to be layer 0 is now layer 1.

Going forward, options will be added to support the addition of
multi-channel memory controllers.
2013-08-19 03:52:27 -04:00
Joel Hestness
7aa67f42cb Configs: Fix up maxtick and maxtime
This patch contains three fixes to max tick options handling in Options.py and
Simulation.py:

 1) Since the global simulator frequency isn't bound until m5.instantiate()
is called, the maxtick resolution needs to happen after this call, since
changes to the global frequency will cause m5.simulate() to misinterpret the
maxtick value. Shuffling this also requires tweaking the checkpoint directory
handling to signal the checkpoint restore tick back to run().  Fixing this
completely and correctly will require storing the simulation frequency into
checkpoints, which is beyond the scope of this patch.

 2) The maxtick option in Options.py was defaulted to MaxTicks, so the old code
would always skip over the maxtime part of the conditionals at the beginning
of run(). Change the maxtick default to None, and set the maxtick local
variable in run() appropriately.

 3) To clarify whether max ticks settings are relative or absolute, split the
maxtick option into separate options, for relative and absolute. Ensure that
these two options and the maxtime option are handled appropriately to set the
maxtick variable in Simulation.py.
2013-07-18 14:46:54 -05:00
Andreas Hansson
c20105c2ff config: Update script to set cache line size on system
This patch changes the config scripts such that they do not set the
cache line size per cache instance, but rather for the system as a
whole.
2013-07-18 08:31:19 -04:00
Nilay Vaish
516b7849e8 configs: rearrange the available options in Options.py
It also changes the instantiation of physmem in se.py so as to make
use of the memory size supplied by the mem_size option.
2013-06-28 21:42:26 -05:00
Nilay Vaish
62a93f0bf0 ruby: check for compatibility between mem size and num dirs
The configuration scripts provided for ruby assume that the available
physical memory is equally distributed amongst the directory controllers.
But there is no check to ensure this assumption has been adhered to. This
patch adds the required check.
2013-06-28 21:36:11 -05:00
Akash Bagdia
7d7ab73862 sim: Add the notion of clock domains to all ClockedObjects
This patch adds the notion of source- and derived-clock domains to the
ClockedObjects. As such, all clock information is moved to the clock
domain, and the ClockedObjects are grouped into domains.

The clock domains are either source domains, with a specific clock
period, or derived domains that have a parent domain and a divider
(potentially chained). For piece of logic that runs at a derived clock
(a ratio of the clock its parent is running at) the necessary derived
clock domain is created from its corresponding parent clock
domain. For now, the derived clock domain only supports a divider,
thus ensuring a lower speed compared to its parent. Multiplier
functionality implies a PLL logic that has not been modelled yet
(create a separate clock instead).

The clock domains should be used as a mechanism to provide a
controllable clock source that affects clock for every clocked object
lying beneath it. The clock of the domain can (in a future patch) be
controlled by a handler responsible for dynamic frequency scaling of
the respective clock domains.

All the config scripts have been retro-fitted with clock domains. For
the System a default SrcClockDomain is created. For CPUs that run at a
different speed than the system, there is a seperate clock domain
created. This domain incorporates the CPU and the associated
caches. As before, Ruby runs under its own clock domain.

The clock period of all domains are pre-computed, such that no virtual
functions or multiplications are needed when calling
clockPeriod. Instead, the clock period is pre-computed when any
changes occur. For this to be possible, each clock domain tracks its
children.
2013-06-27 05:49:49 -04:00
Akash Bagdia
597d2aa3a6 config: Rename clock option to Ruby clock
This patch changes the 'clock' option to 'ruby-clock' as it is only
used by Ruby.
2013-06-27 05:49:49 -04:00
Akash Bagdia
076d04a653 config: Add a system clock command-line option
This patch adds a 'sys_clock' command-line option and use it to assign
clocks to the system during instantiation.

As part of this change, the default clock in the System class is
removed and whenever a system is instantiated a system clock value
must be set. A default value is provided for the command-line option.

The configs and tests are updated accordingly.
2013-06-27 05:49:49 -04:00
Akash Bagdia
4459b30525 config: Add a CPU clock command-line option
This patch adds a 'cpu_clock' command-line option and uses the value
to assign clocks to components running at the CPU speed (L1 and L2
including the L2-bus). The configuration scripts are updated
accordingly.

The 'clock' option is left unchanged in this patch as it is still used
by a number of components. In follow-on patches the latter will be
disambiguated further.
2013-06-27 05:49:49 -04:00
Akash Bagdia
7eccb1b779 config: Remove redundant explicit setting of default clocks
This patch removes the explicit setting of the clock period for
certain instances of CoherentBus, NonCoherentBus and IOCache where the
specified clock is same as the default value of the system clock. As
all the values used are the defaults, there are no performance
changes. There are similar cases where the toL2Bus is set to use the
parent CPU clock which is already the default behaviour.

The main motivation for these simplifications is to ease the
introduction of clock domains.
2013-06-27 05:49:49 -04:00
Nilay Vaish
be981772b9 config: Do not instantiate membus when using ruby
This patch moves the instantiation of system.membus in se.py to the area of
code where classic memory system has been dealt with. Ruby does not require
this bus and hence it should not be instantiated.
2013-06-13 07:24:25 -05:00
Andreas Sandberg
d989a3ad50 config: Add missing CPUs to --restore-with-cpu
The --restore-with-cpu option didn't use CpuConfig.cpu_names() to
determine which CPU names are valid, instead it used a static list of
known CPU names. This changeset makes the option parsing code use the
CPU list from the CpuConfig module instead.
2013-06-03 13:40:05 +02:00
Andreas Hansson
3bc4ecdcb4 mem: More descriptive DRAM config names
This patch changes the class names of the variuos DRAM configurations
to better reflect what memory they are based on. The speed and
interface width is now part of the name, and also the alias that is
used to select them on the command line.

Some minor changes are done to the actual parameters, to better
reflect the named configurations. As a result of these changes the
regressions change slightly and the stats will be bumped in a separate
patch.
2013-05-30 12:54:14 -04:00
Andreas Hansson
bf6291460d mem: Add a LPDDR3-1600 configuration
This patch adds a typical (leaning towards fast) LPDDR3 configuration
based on publically available data. As expected, it looks very similar
to the LPDDR2-S4 configuration, only with a slightly lower burst time.
2013-05-30 12:53:56 -04:00
Andreas Hansson
88aa7755f4 mem: Avoid explicitly zeroing the memory backing store
This patch removes the explicit memset as it is redundant and causes
the simulator to touch the entire space, forcing the host system to
allocate the pages.

Anonymous pages are mapped on the first access, and the page-fault
handler is responsible for zeroing them. Thus, the pages are still
zeroed, but we avoid touching the entire allocated space which enables
us to use much larger memory sizes as long as not all the memory is
actually used.
2013-05-30 12:53:54 -04:00
Nilay Vaish
4ef466cc8a ruby: moesi hammer: cosmetic changes
Updates copyright years, removes space at the end of lines, shortens
variable names.
2013-05-21 11:32:45 -05:00
Nilay Vaish
09d5bc7e6f ruby: mesi cmp directory: cosmetic changes
Updates copyright years, removes space at the end of lines, shortens
variable names.
2013-05-21 11:32:38 -05:00
Nilay Vaish
bd3d1955da ruby: moesi cmp token: cosmetic changes
Updates copyright years, removes space at the end of lines, shortens
variable names.
2013-05-21 11:32:24 -05:00
Nilay Vaish
e7ce518168 ruby: moesi cmp directory: cosmetic changes
Updates copyright years, removes space at the end of lines, shortens
variable names.
2013-05-21 11:32:15 -05:00
Nilay Vaish
9bc75e3c58 configs: ruby: pass the option use_map to directory controller
The option was not being passed to directory controllers for the protocols
MOESI_CMP_token and MOESI_CMP_directory. This was resulting in an error
while instantiating the directory controller as it tries to access the
wrong type of memory.
2013-05-21 11:32:08 -05:00
Anthony Gutierrez
d3c33d91b6 cpu: remove local/globalHistoryBits params from branch pred
having separate params for the local/globalHistoryBits and the
local/globalPredictorSize can lead to inconsistencies if they
are not carefully set. this patch dervies the number of bits
necessary to index into the local/global predictors based on
their size.

the value of the localHistoryTableSize for the ARM O3 CPU has been
increased to 1024 from 64, which is more accurate for an A15 based
on some correlation against A15 hardware.
2013-05-14 18:39:47 -04:00
Marco Elver
0c57194a09 config: Fix mem-type option not used in ruby_fs script
This fixes missing mem-type arguments to makeLinuxAlphaRubySystem and
makeLinuxX86System after a recent changeset allowing mem-type to be
configured via options missed fixing these calls.

Committed by: Nilay Vaish <nilay@cs.wisc.edu>
2013-04-23 11:56:48 -05:00
Andreas Hansson
3477d60d5c config: Add a mem-type config option to se/fs scripts
This patch enables selection of the memory controller class through a
mem-type command-line option. Behind the scenes, this option is
treated much like the cpu-type, and a similar framework is used to
resolve the valid options, and translate the short-hand description to
a valid class.

The regression scripts are updated with a hardcoded memory class for
the moment. The best solution going forward is probably to get the
memory out of the makeSystem functions, but Ruby complicates things as
it does not connect the memory controller to the membus.

--HG--
rename : configs/common/CpuConfig.py => configs/common/MemConfig.py
2013-04-22 13:20:33 -04:00
Andreas Sandberg
7865d6e838 config: Add a KVM VM to systems with KVM CPUs
KVM-based CPUs need a KVM VM object in the system to manage
system-global KVM stuff (VM creation, interrupt delivery, memory
managment, etc.). This changeset adds a VM to the system if KVM has
been enabled at compile time (the BaseKvmCPU object exists) and a
KVM-based CPU has been selected at runtime.
2013-04-22 13:20:32 -04:00
Dam Sunwoo
2c1e344313 cpu: generate SimPoint basic block vector profiles
This patch is based on http://reviews.m5sim.org/r/1474/ originally written by
Mitch Hayenga. Basic block vectors are generated (simpoint.bb.gz in simout
folder) based on start and end addresses of basic blocks.

Some comments to the original patch are addressed and hooks are added to create
and resume from checkpoints based on instruction counts dictated by external
SimPoint analysis tools.

SimPoint creation/resuming options will be implemented as a separate patch.
2013-04-22 13:20:31 -04:00
Nilay Vaish
407b1a77c8 config: ruby network test: remove piobus check 2013-04-17 16:06:24 -05:00
Joel Hestness
82c6734f6b Configs: Fix handling of maxtick and take_checkpoints
In Simulation.py, calls to m5.simulate(num_ticks) will run the simulated system
for num_ticks after the current tick. Fix calls to m5.simulate in
scriptCheckpoints() and benchCheckpoints() to appropriately handle the maxticks
variable.
2013-04-09 16:25:30 -05:00
Anthony Gutierrez
7fb55b98cc rcs scripts: remove bbench.rcS
this run script shouldn't be used; bbench-ics.rcS or bbench-gb.rcS
should be used instead.
2013-04-02 12:46:49 -04:00
Nilay Vaish
433cab9d95 x86: create space in bios memory map
As of now, we mark the top 1MB of memory space as unusable. Part of
it is actually usable and is required to be marked so by some of the
newer versions of linux kernel. This patch marks the top 639KB as usable.
This value was chosen by looking at QEMU's output for bios memory map.
2013-03-28 09:34:15 -05:00
Nilay Vaish
546ffb2d6d config: return exit event instead of cause
changeset: a4739b6f799d made some changes that where an exit event
should have been returned in place of exit cause. This patch corrects
the error.
2013-03-22 17:31:24 -05:00
Nilay Vaish
5aa43e130a ruby: convert Topology to regular class
The Topology class in Ruby does not need to inherit from SimObject class.
This patch turns it into a regular class. The topology object is now created
in the constructor of the Network class. All the parameters for the topology
class have been moved to the network class.
2013-03-22 15:53:23 -05:00
Nilay Vaish
2d50127642 ruby: network: move routers from topology to network 2013-03-22 15:53:22 -05:00
Nilay Vaish
c061819890 ruby: remove the functional copy of memory in se mode
This patch removes the functional copy of the memory that was maintained in
the se mode. Now ruby itself will provide the data.
2013-03-06 21:53:57 -06:00
Nilay Vaish
e8802fa127 ruby: garnet: fixed: implement functional access 2013-03-06 21:53:16 -06:00
Ali Saidi
82cf1565d0 config: Fix --prog-interval command line option 2013-02-20 08:18:22 -05:00
Anthony Gutierrez
21aa950318 options: add command line option for dtb file 2013-02-15 18:48:59 -05:00
Andreas Sandberg
1c7aa665bf config: Remove O3 dependencies
The default cache configuration script currently import the O3_ARM_v7a
model configuration, which depends on the O3 CPU. This breaks if gem5
has been compiled without O3 support. This changeset removes the
dependency by only importing the model if it is requested by the
user. As a bonus, it actually removes some code duplication in the
configuration scripts.
2013-02-15 17:40:08 -05:00
Andreas Sandberg
e5dca84c3f config: Move CPU handover logic to m5.switchCpus()
CPU switching consists of the following steps:
 1. Drain the system
 2. Switch out old CPUs (cpu.switchOut())
 3. Change the system timing mode to the mode the new CPUs require
 4. Flush caches if switching to hardware virtualization
 5. Inform new CPUs of the handover (cpu.takeOverFrom())
 6. Resume the system

m5.switchCpus() previously only did step 2 & 5. Since information
about the new processors' memory system requirements is now exposed,
do all of the steps above.

This patch adds automatic memory system switching and flush (if
needed) to switchCpus(). Additionally, it adds optional draining to
switchCpus(). This has the following implications:

* changeToTiming and changeToAtomic are no longer needed, so they have
  been removed.

* changeMemoryMode is only used internally, so it is has been renamed
  to be private.

* switchCpus requires a reference to the system containing the CPUs as
  its first parameter.

WARNING: This changeset breaks compatibility with existing
configuration scripts since it changes the signature of
m5.switchCpus().
2013-02-15 17:40:08 -05:00
Andreas Sandberg
e9f66dceac config: Cleanup CPU configuration
The CPUs supported by the configuration scripts used to be
hard-coded. This was not ideal for several reasons. For example, the
configuration scripts depend on all CPU models even though only a
subset might have been compiled.

This changeset adds a new module to the configuration scripts that
automatically discovers the available CPU models from the compiled
SimObjects. As a nice bonus, the use of introspection allows us to
automatically generate a list of available CPU models suitable for
printing. This list is augmented with the Python doc string from the
underlying class if available.
2013-02-15 17:40:08 -05:00
Andreas Sandberg
7cd1fd4324 cpu: Add CPU metadata om the Python classes
The configuration scripts currently hard-code the requirements of each
CPU. This is clearly not optimal as it makes writing new configuration
scripts painful and adding new CPU models requires existing scripts to
be updated. This patch adds the following class methods to the base
CPU and all relevant CPUs:

 * memory_mode -- Return a string describing the current memory mode
                  (invalid/atomic/timing).

 * require_caches -- Does the CPU model require caches?

 * support_take_over -- Does the CPU support CPU handover?
2013-02-15 17:40:08 -05:00
Andreas Sandberg
6155400421 config: Don't call sys.exit in interactive mode in run()
The run() method in Simulation.py used to call sys.exit() when the
simulator exits. This is undesirable when user has requested the
simulator to be run in interactive mode since it causes the simulator
to exit rather than entering the interactive Python environment.
2013-02-10 13:23:54 +01:00
Andreas Hansson
c4898b15bc mem: Add DDR3 and LPDDR2 DRAM controller configurations
This patch moves the default DRAM parameters from the SimpleDRAM class
to two different subclasses, one for DDR3 and one for LPDDR2. More can
be added as we go forward.

The regressions that previously used the SimpleDRAM are now using
SimpleDDR3 as this is the most similar configuration.
2013-01-31 07:49:14 -05:00
Nilay Vaish ext:(%2C%20Timothy%20Jones%20%3Ctimothy.jones%40cl.cam.ac.uk%3E)
dbeabedaf0 branch predictor: move out of o3 and inorder cpus
This patch moves the branch predictor files in the o3 and inorder directories
to src/cpu/pred. This allows sharing the branch predictor across different
cpu models.

This patch was originally posted by Timothy Jones in July 2010
but never made it to the repository.

--HG--
rename : src/cpu/o3/bpred_unit.cc => src/cpu/pred/bpred_unit.cc
rename : src/cpu/o3/bpred_unit.hh => src/cpu/pred/bpred_unit.hh
rename : src/cpu/o3/bpred_unit_impl.hh => src/cpu/pred/bpred_unit_impl.hh
rename : src/cpu/o3/sat_counter.hh => src/cpu/pred/sat_counter.hh
2013-01-24 12:28:51 -06:00
Malek Musleh
3137557cad config: move ruby objects under ruby_system in obj hierarchy
This patch moves the contollers to be children of the ruby_system instead of
'system' under the python object hierarchy. This is so that these objects
can inherit some of the ruby_system's parameter values without resorting to
calling a global system pointer during run-time.

Committed by: Nilay Vaish <nilay@cs.wisc.edu>
2013-01-14 10:05:14 -06:00
Ali Saidi
fe3fbe624e config: Fix issue with changeset: a4739b6f799d. 2013-01-08 17:12:22 -05:00
Lluís Vilanova
807168a1de util: add m5_fail op.
Used as a command in full-system scripts helps the user ensure the benchmarks have finished successfully.

For example, one can use:

    /path/to/benchmark args || /sbin/m5 fail 1

and thus ensure gem5 will exit with an error if the benchmark fails.
2013-01-08 08:54:12 -05:00
Andreas Sandberg
2cfe62adc4 cpu: Rename defer_registration->switched_out
The defer_registration parameter is used to prevent a CPU from
initializing at startup, leaving it in the "switched out" mode. The
name of this parameter (and the help string) is confusing. This patch
renames it to switched_out, which should be more descriptive.
2013-01-07 13:05:45 -05:00
Andreas Hansson
e65de3f5ca config: Do not use hardcoded physmem in fs script
This patch generalises the address range resolution for the I/O cache
and I/O bridge such that they do not assume a single memory. The patch
involves adding a parameter to the system which is then defined based
on the memories that are to be visible from the I/O subsystem, whether
behind a cache or a bridge.

The change is needed to allow interleaved memory controllers in the
system.
2013-01-07 13:05:38 -05:00
Andreas Sandberg
3db3f83a5e arch: Make the ISA class inherit from SimObject
The ISA class on stores the contents of ID registers on many
architectures. In order to make reset values of such registers
configurable, we make the class inherit from SimObject, which allows
us to use the normal generated parameter headers.

This patch introduces a Python helper method, BaseCPU.createThreads(),
which creates a set of ISAs for each of the threads in an SMT
system. Although it is currently only needed when creating
multi-threaded CPUs, it should always be called before instantiating
the system as this is an obvious place to configure ID registers
identifying a thread/CPU.
2013-01-07 13:05:35 -05:00
Nilay Vaish
f3d0be210f ruby: add support for prefetching to MESI protocol 2012-12-11 10:05:56 -06:00
Nilay Vaish
c120273708 ruby: modify the directed tester to read/write streams
The directed tester supports only generating only read or only write accesses. The
patch modifies the tester to support streams that have both read and write accesses.
2012-12-11 10:05:55 -06:00
Erik Tomusk
3dc7e4f496 TournamentBP: Fix some bugs with table sizes and counters
globalHistoryBits, globalPredictorSize, and choicePredictorSize are decoupled.
globalHistoryBits controls how much history is kept, global and choice
predictor sizes control how much of that history is used when accessing
predictor tables. This way, global and choice predictors can actually be
different sizes, and it is no longer possible to walk off the predictor arrays
and cause a seg fault.

There are now individual thresholds for choice, global, and local saturating
counters, so that taken/not taken decisions are correct even when the
predictors' counters' sizes are different.

The interface for localPredictorSize has been removed from TournamentBP because
the value can be calculated from localHistoryBits.

Committed by: Nilay Vaish <nilay@cs.wisc.edu>
2012-12-06 09:31:06 -06:00
Andreas Hansson
13f6d29a76 config: Fix description of checkpoint option from cycle to tick
This patch merely updates the description of the "take-checkpoints"
option to reflect that it is specified in ticks and not in cycles.
2012-11-19 11:21:09 -05:00
Andreas Sandberg
dc01535c7e python: Rename doDrain()->drain() and make it do the right thing
There is no point in exporting the old drain() method in
Simulate.py. It should only be used internally by doDrain(). This
patch moves the old drain() method into doDrain() and renames
doDrain() to drain().
2012-11-02 11:32:02 -05:00
Andreas Sandberg
7e25052fee Partly revert [4f54b0f229b5] and move draining to m5.changeToTiming
Changeset 4f54b0f229b5 removed the call to doDrain in changeToTiming
based on the assumption that the system does not need draining when
running in atomic mode. This is a false assumption since at least the
System class requires the system to be drained before it allows
switching of memory modes. This patch reverts that part of the
changeset.
2012-11-02 11:32:00 -05:00
Andreas Hansson
9cbe1cb653 config: Unify caches used in regressions and adjust L2 MSHRs
This patch unified the L1 and L2 caches used throughout the
regressions instead of declaring different, but very similar,
configurations in the different scripts.

The patch also changes the default L2 configuration to match what it
used to be for the fs and se scripts (until the last patch that
updated the regressions to also make use of the cache config). The
MSHRs and targets per MSHR are now set to a more realistic default of
20 and 12, respectively.

As a result of both the aforementioned changes, many of the regression
stats are changed. A follow-on patch will bump the stats.
2012-10-30 07:44:08 -04:00
Malek Musleh
d2d431f439 ruby: set the is_icache param for caches
This patch sets the is_icache param for the L1 caches used in
the MESI and the MOESI CMP directory protocols.
2012-10-27 16:04:30 -05:00
Jason Power ext:(%2C%20Joel%20Hestness%20%3Chestness%40cs.wisc.edu%3E)
931ec6b7cc Ruby: Use block size in configuring directory bits in address
This patch replaces hard coded values used in Ruby's configuration files
for setting directory bits with values based on the block size in use.
2012-10-27 16:01:09 -05:00
Andreas Hansson
a4d8996fd9 config: Add a check for fastmem only used with Atomic CPU
This patch adds an additional check to ensure that the fastmem option
is only used if the system is using the Atomic CPU.
2012-10-26 06:42:45 -04:00
Andreas Hansson
7cd01cf769 config: Remove unused mem_size in fs.py
This patch removes a segment of dead code that is never used.
2012-10-26 06:42:43 -04:00
Andreas Hansson
651de2d9af config: Fix the cache class naming in regression scripts
This patch unifies the naming of the default L1 and L2 caches in the
regression configs to be in line with what is used in the se and fs
scripts.
2012-10-26 06:42:42 -04:00
Andreas Hansson
66e331c7bb config: Use SimpleDRAM in full-system, and with o3 and inorder
This patch favours using SimpleDRAM with the default timing instead of
SimpleMemory for all regressions that involve the o3 or inorder CPU,
or are full system (in other words, where the actual performance of
the memory is important for the overall performance).

Moving forward, the solution for FSConfig and the users of fs.py and
se.py is probably something similar to what we use to choose the CPU
type. I envision a few pre-set configurations SimpleLPDDR2,
SimpleDDR3, etc that can be choosen by a dram_type option. Feedback on
this part is welcome.

This patch changes plenty stats and adds all the DRAM controller
related stats. A follow-on patch updates the relevant statistics. The
total run-time for the entire regression goes up with ~5% with this
patch due to the added complexity of the SimpleDRAM model. This is a
concious trade-off to ensure that the model is properly tested.
2012-10-25 13:14:38 -04:00
Andreas Hansson
d22796c03c config: Use shared cache config for regressions
This patch uses the common L1, L2 and IOCache configuration for the
regressions that all share the same cache parameters. There are a few
regressions that use a slightly different configuration (memtest,
o3-timing=mp, simple-atomic-mp and simple-timing-mp), and the latter
are not changed in this patch. They will be updated in a future patch.

The common cache configurations are changed to match the ones used in
the regressions, and are slightly changed with respect to what they
were. Hopefully this means we can converge on a common base
configuration, used both in the normal user configurations and
regressions.

As only regressions that shared the same cache configuration are
updated, no regressions are affected.
2012-10-25 04:32:44 -04:00
Nilay Vaish
5ffc165939 ruby: improved support for functional accesses
This patch adds support to different entities in the ruby memory system
for more reliable functional read/write accesses. Only the simple network
has been augmented as of now. Later on Garnet will also support functional
accesses.
The patch adds functional access code to all the different types of messages
that protocols can send around. These messages are functionally accessed
by going through the buffers maintained by the network entities.
The patch also rectifies some of the bugs found in coherence protocols while
testing the patch.

With this patch applied, functional writes always succeed. But functional
reads can still fail.
2012-10-15 17:51:57 -05:00
Andreas Hansson
88554790c3 Mem: Use cycles to express cache-related latencies
This patch changes the cache-related latencies from an absolute time
expressed in Ticks, to a number of cycles that can be scaled with the
clock period of the caches. Ultimately this patch serves to enable
future work that involves dynamic frequency scaling. As an immediate
benefit it also makes it more convenient to specify cache performance
without implicitly assuming a specific CPU core operating frequency.

The stat blocked_cycles that actually counter in ticks is now updated
to count in cycles.

As the timing is now rounded to the clock edges of the cache, there
are some regressions that change. Plenty of them have very minor
changes, whereas some regressions with a short run-time are perturbed
quite significantly. A follow-on patch updates all the statistics for
the regressions.
2012-10-15 08:10:54 -04:00
Andreas Hansson
1c321b8847 Regression: Use CPU clock and 32-byte width for L1-L2 bus
This patch changes the CoherentBus between the L1s and L2 to use the
CPU clock and also four times the width compared to the default
bus. The parameters are not intending to fit every single scenario,
but rather serve as a better startingpoint than what we previously
had.

Note that the scripts that do not use the addTwoLevelCacheHiearchy are
not affected by this change.

A separate patch will update the stats.
2012-10-15 08:08:08 -04:00
Nilay Vaish
4488379244 ruby: changes to simple network
This patch makes the Switch structure inherit from BasicRouter, as is
done in two other networks.
2012-10-02 14:35:45 -05:00