Commit graph

682 commits

Author SHA1 Message Date
Andreas Hansson
9738f34411 config: Revamp memtest to allow testers on any level
This patch revamps the memtest example script and allows for the
insertion of testers at any level in the cache hierarchy. Previously
all created topologies placed testers only at the very top, and the
tree was thus entirely symmetric. With the changes made, it is possible
to not only place testers at the leaf caches (L1), but also to connect
testers at the L2, L3 etc.

As part of the changes the object hierarchy is also simplified to
ensure that the visual representation from the DOT printing looks
sensible. Using SubSystems to group the objects is one of the key
features.
2015-02-11 10:23:31 -05:00
Andreas Hansson
6563ec8634 cpu: Tidy up the MemTest and make false sharing more obvious
The MemTest class really only tests false sharing, and as such there
was a lot of old cruft that could be removed. This patch cleans up the
tester, and also makes it more clear what the assumptions are. As part
of this simplification the reference functional memory is also
removed.

The regression configs using MemTest are updated to reflect the
changes, and the stats will be bumped in a separate patch. The example
config will be updated in a separate patch due to more extensive
re-work.

In a follow-on patch a new tester will be introduced that uses the
MemChecker to implement true sharing.
2015-02-11 10:23:28 -05:00
Steve Reinhardt
774922895b config: rename 'file' var
Rename uses of 'file' as a local variable to avoid conflict
with the built-in type of the same name.
2015-02-05 16:45:12 -08:00
Steve Reinhardt
634d923751 config: make M5_PATH a real search path
Although you can put a list of colon-separated directory names
in M5_PATH, the current code just takes the first one that
exists and assumes all files must live there.  This change
makes the code search the specified list of directories
for each individual binary or disk image that's requested.

The main motivation is that the x86/Alpha binaries and the
ARM binaries are in separate downloads, and thus naturally
end up in separate directories.  With this change, you can
have M5_PATH point to those two directories, then run any
FS regression test without changing M5_PATH.  Currently,
you either have to merge the two download directories
or change M5_PATH (or do something else I haven't figured out).
2015-02-05 16:45:06 -08:00
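The per-file lookup described in 634d923751 can be sketched roughly as follows. This is a minimal illustration, not the code from the patch: the helper name `find_in_search_path` and the choice of `IOError` are assumptions made for the example.

```python
import os


def find_in_search_path(filename, search_path):
    """Return the first existing match for filename.

    search_path is a colon-separated list of directories, as M5_PATH is
    described in the commit message. Every directory is tried for each
    requested file, instead of committing to the first directory that exists.
    (Hypothetical helper; the name and error type are illustrative.)
    """
    for directory in search_path.split(":"):
        if not directory:
            continue  # skip empty entries such as a trailing colon
        candidate = os.path.join(directory, filename)
        if os.path.exists(candidate):
            return candidate
    raise IOError("Can't find '%s' in any of: %s" % (filename, search_path))
```

With two download directories in the path, a kernel that only exists in the second one is still found without editing the variable.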
Andreas Hansson
28a7cea2b3 config: Add XOR hashing to the DRAM channel interleaving
This patch uses the recently added XOR hashing capabilities for the
DRAM channel interleaving. This avoids channel biasing due to strided
access patterns.
2015-02-03 14:25:55 -05:00
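To see why XOR hashing helps with strided accesses (28a7cea2b3), here is a small sketch. It assumes a power-of-two channel count and 128-byte interleaving. The function names and the `xor_shift` position of the higher-order bits are illustrative choices, not values taken from the patch.

```python
def plain_channel(addr, num_channels, granularity=128):
    """Channel chosen from the address bits just above the interleaving granularity."""
    return (addr // granularity) & (num_channels - 1)


def xor_channel(addr, num_channels, granularity=128, xor_shift=13):
    """Same channel bits, XORed with an equally wide slice of higher address bits.

    xor_shift is an assumed bit position, used only for this illustration.
    """
    mask = num_channels - 1
    return plain_channel(addr, num_channels, granularity) ^ ((addr >> xor_shift) & mask)


# A 512-byte stride over 4 channels always hits channel 0 without hashing,
# but is spread over all four channels once the higher bits are folded in.
stride_addrs = [k * 512 for k in range(64)]
plain_hits = {plain_channel(a, 4) for a in stride_addrs}
hashed_hits = {xor_channel(a, 4) for a in stride_addrs}
```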
Andreas Hansson
5ea60a95b3 config: Adjust DRAM channel interleaving defaults
This patch changes the DRAM channel interleaving default behaviour to
be more representative. The default address mapping (RoRaBaCoCh) moves
the channel bits towards the least significant bits, and uses 128 byte
as the default channel interleaving granularity.

These defaults can be overridden if desired, but should serve as a
sensible starting point for most use-cases.
2015-02-03 14:25:52 -05:00
Malek Musleh
ca131a4196 config: arm: fix os_flags
Fix the makeArmSystem routine to reflect recent changes that support kernel
commandline option when running android. Without this fix, trying to run
android encounters a 'reference before assignment' error.

Committed by: Nilay Vaish <nilay@cs.wisc.edu>
2015-01-30 15:49:34 -06:00
Malek Musleh
be3a952394 config, ruby: connect dma to network
DMA Controller was not being connected to the network for the MESI_Three_Level
protocol as was being done in the other protocol config files. Without this
patch, this protocol segfaults during startup.

Committed by: Nilay Vaish <nilay@cs.wisc.edu>
2015-01-20 14:15:28 -06:00
Andreas Hansson
3cb9c361e2 scons: Do not build the InOrderCPU
One step closer to shifting focus to the MinorCPU.
2015-01-20 08:12:45 -05:00
Anthony Gutierrez
0d8d6e4441 arm: fix build_drive_system when not using default options
when trying to dual boot on arm build_drive_system will only use the default
values for the dtb file, number of processors, and disk image. if you are using
the non-default files by passing values on the command line for example, or by
making a new entry in Benchmarks.py, the build config scripts will still look
for the default files. this will lead to the wrong system files being used, or
the simulator will fail if you do not have them.

Committed by: Nilay Vaish <nilay@cs.wisc.edu>
2015-01-03 17:51:48 -06:00
Nilay Vaish
1ee70e9d84 configs: ruby: removes bug introduced by 05b5a6cf3521
2015-01-03 17:51:48 -06:00
Andreas Hansson
59460b91f3 config: Expose the DRAM ranks as a command-line option
This patch gives the user direct influence over the number of DRAM
ranks to make it easier to tune the memory density without affecting
the bandwidth (previously the only means of scaling the device count
was through the number of channels).

The patch also adds some basic sanity checks to ensure that the number
of ranks is a power of two (since we rely on bit slices in the address
decoding).
2014-12-23 09:31:18 -05:00
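The power-of-two sanity check mentioned in 59460b91f3 could look something like the sketch below. The function names are hypothetical; the point is only that a bit slice of the address can index ranks exactly when the rank count is a power of two.

```python
def is_power_of_two(n):
    """True for 1, 2, 4, 8, ...: exactly one bit is set."""
    return n > 0 and (n & (n - 1)) == 0


def rank_bits(num_ranks):
    """Number of address bits needed to select a rank (hypothetical helper)."""
    if not is_power_of_two(num_ranks):
        raise ValueError("number of ranks must be a power of two, got %d" % num_ranks)
    return num_ranks.bit_length() - 1
```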
Marco Elver
177682ead4 config: Add --memchecker option
This patch adds the --memchecker option, to denote that a MemChecker
should be instantiated for the system. The exact usage of the MemChecker
depends on the system configuration.

For now CacheConfig.py makes use of the option, adding MemCheckerMonitor
instances between CPUs and D-Caches.

Note, however, that currently this only provides limited checking on a
running system; other parts of the system, such as I/O devices are not
monitored, and may cause warnings to be issued by the monitor.
2014-12-23 09:31:18 -05:00
Dam Sunwoo
809134a2b1 config: Add options to take/resume from SimPoint checkpoints
More documentation at http://gem5.org/Simpoints

Steps to profile, generate, and use SimPoints with gem5:

1. To profile workload and generate SimPoint BBV file, use the
following option:

--simpoint-profile --simpoint-interval <interval length>

Requires single Atomic CPU and fastmem.
<interval length> is in number of instructions.

2. Generate SimPoint analysis using SimPoint 3.2 from UCSD.
(SimPoint 3.2 not included with this flow.)

3. To take gem5 checkpoints based on SimPoint analysis, use the
following option:

--take-simpoint-checkpoint=<simpoint file path>,<weight file
path>,<interval length>,<warmup length>

<simpoint file> and <weight file> are generated by the SimPoint analysis
tool from UCSD. SimPoint 3.2 format expected. <interval length> and
<warmup length> are in number of instructions.

4. To resume from gem5 SimPoint checkpoints, use the following option:

--restore-simpoint-checkpoint -r <N> --checkpoint-dir <simpoint
checkpoint path>

<N> is (SimPoint index + 1). E.g., "-r 1" will resume from SimPoint
#0.
2014-12-23 09:31:17 -05:00
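As a rough illustration of the option formats in 809134a2b1, the sketch below splits the four comma-separated fields of --take-simpoint-checkpoint and computes the -r value for a given SimPoint. Both helper names are made up for this example and are not the functions used by the config scripts.

```python
def parse_take_simpoint_checkpoint(value):
    """Split '<simpoint file>,<weight file>,<interval length>,<warmup length>'.

    Hypothetical helper; the lengths are in number of instructions.
    """
    fields = value.split(",")
    if len(fields) != 4:
        raise ValueError("expected 4 comma-separated fields, got %d" % len(fields))
    simpoint_path, weight_path, interval, warmup = fields
    return simpoint_path, weight_path, int(interval), int(warmup)


def restore_argument(simpoint_index):
    """Value to pass to -r: SimPoint index + 1, so SimPoint #0 uses '-r 1'."""
    return simpoint_index + 1
```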
Gabe Black
7540656fc5 config: Add two options for setting the kernel command line.
Both options accept a template which will, through python string formatting,
have "mem", "disk", and "script" values substituted in from the mdesc.
Additional values can be used on a case by case basis by passing them as
keyword arguments to the fillInCmdLine function. That makes it possible to
have specialized parameters for a particular ISA, for instance.

The first option lets you specify the template directly, and the other lets
you specify a file which has the template in it.
2014-12-04 16:42:07 -08:00
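The template substitution described in 7540656fc5 can be pictured with the sketch below. It assumes `%(name)s`-style formatting and takes plain values rather than an mdesc object. The function `fill_in_cmdline` here is a stand-in written for this example, not the fillInCmdLine function from the patch.

```python
def fill_in_cmdline(mem, disk, script, template, **kwargs):
    """Substitute "mem", "disk" and "script", plus any extra keyword values.

    Hypothetical stand-in: the real function reads these values from the mdesc.
    """
    values = {"mem": mem, "disk": disk, "script": script}
    values.update(kwargs)  # ISA-specific extras passed on a case by case basis
    return template % values


# Example template with one extra, ISA-specific value (names are illustrative).
cmdline = fill_in_cmdline(
    "512MB", "/dev/sda1", "",
    "console=%(console)s root=%(disk)s mem=%(mem)s",
    console="ttyS0",
)
```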
Nilay Vaish
cca1608bd5 config: ruby: mi protocol: correct master slave setting for dma
In the MI protocol, the master slave connection between the dma controller
and network was being set incorrectly.  This patch corrects it.
2014-12-04 08:59:44 -06:00
Gabe Black
b7dc4ba516 config: Get rid of some extra spaces around default arguments.
2014-12-03 03:11:00 -08:00
Alexandru Dutu
a19cf6943b config, kvm: Enabling KvmCPU in SE mode
This patch modifies se.py such that it can now use kvm cpu model.
2014-11-23 18:01:08 -08:00
Steve Reinhardt
252a463b6b Backed out prior changeset f9fb64a72259
Back out use of importlib to avoid implicitly creating
dependency on Python 2.7.
2014-11-23 18:00:47 -08:00
Gabe Black
12243a3835 config: ruby: Get rid of an "eval" and an "exec" operating on generated code.
We can get the same result using importlib.
2014-11-23 05:55:26 -08:00
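The general idea behind 12243a3835, looking a module up by name instead of building and executing a code string, can be sketched as below. The helper is illustrative and uses a standard-library module as the target; it is not the code from the patch.

```python
import importlib


def load_attribute(module_name, attr_name):
    """Import a module chosen at run time and fetch one attribute from it.

    Roughly what exec("import %s" % name) followed by eval("%s.%s" % ...)
    would achieve, without executing generated code.
    """
    module = importlib.import_module(module_name)
    return getattr(module, attr_name)


sqrt = load_attribute("math", "sqrt")
```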
Nilay Vaish
708e80d9bb configs: small fix to ruby portion of fs.py and se.py
In fs.py the io port controller was being attached to the iobus multiple
times.  This should be done only once.  In se.py, the option use_map
was being set which no longer exists.
2014-11-18 19:17:29 -06:00
Marc Orr
bf80734b2c x86 isa: This patch attempts an implementation at mwait.
Mwait works as follows:
1. A cpu monitors an address of interest (monitor instruction)
2. A cpu calls mwait - this loads the cache line into that cpu's cache.
3. The cpu goes to sleep.
4. When another processor requests write permission for the line, it is
   evicted from the sleeping cpu's cache. This eviction is forwarded to the
   sleeping cpu, which then wakes up.

Committed by: Nilay Vaish <nilay@cs.wisc.edu>
2014-11-06 05:42:22 -06:00
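The four mwait steps listed in bf80734b2c can be mimicked with a toy model. This sketch only tracks which cache line is monitored and whether the CPU is asleep. The class, its method names, and the 64-byte line size are assumptions for illustration and say nothing about how the simulator implements the instruction.

```python
class ToyCpu:
    """Toy model of the monitor/mwait handshake (not gem5 code)."""

    LINE_SIZE = 64  # assumed cache line size in bytes

    def __init__(self):
        self.monitored_line = None
        self.asleep = False

    def monitor(self, addr):
        # Step 1: remember the cache line containing the address of interest.
        self.monitored_line = addr // self.LINE_SIZE

    def mwait(self):
        # Steps 2-3: with a line armed, the CPU goes to sleep.
        if self.monitored_line is not None:
            self.asleep = True

    def on_eviction(self, addr):
        # Step 4: an eviction of the monitored line wakes the CPU up.
        if self.asleep and addr // self.LINE_SIZE == self.monitored_line:
            self.asleep = False
            self.monitored_line = None
```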
Nilay Vaish
0811f21f67 ruby: provide a backing store
Ruby's functional accesses are not guaranteed to succeed as of now.  While
this is not a problem for the protocols that are currently in the mainline
repo, it seems that coherence protocols for gpus rely on a backing store to
supply the correct data.  The aim of this patch is to make this backing store
configurable i.e. it comes into play only when a particular option:
--access-backing-store is invoked.

The backing store has been there since M5 and GEMS were integrated.  The only
difference is that earlier the system used to maintain the backing store and
ruby's copy was write-only.  Sometime last year, we moved to data being
supplied supplied by ruby in SE mode simulations.  And now we have patches on
the reviewboard, which remove ruby's copy of memory altogether and rely
completely on the system's memory to supply data.  This patch adds back a
SimpleMemory member to RubySystem.  This member is used only if the option:
access-backing-store is set to true.  By default, the memory would not be
accessed.
2014-11-06 05:42:21 -06:00
Nilay Vaish
3022d463fb ruby: interface with classic memory controller
This patch is the final in the series.  The whole series and this patch in
particular were written with the aim of interfacing ruby's directory controller
with the memory controller in the classic memory system.  This is being done
since ruby's memory controller has not been kept up to date with the changes
going on in DRAMs.  Classic's memory controller is more up to date and
supports multiple different types of DRAM.  This also brings classic and
ruby ever more close.  The patch also changes ruby's memory controller to
expose the same interface.
2014-11-06 05:42:21 -06:00
Nilay Vaish
95a0b18431 ruby: single physical memory in fs mode
Both ruby and the system used to maintain memory copies.  With the changes
carried for programmed io accesses, only one single memory is required for
fs simulations.  This patch sets the copy of memory that used to reside
with the system to null, so that no space is allocated, but address checks
can still be carried out.  All the memory accesses now source and sink values
to the memory maintained by ruby.
2014-11-06 05:41:44 -06:00
Ali Saidi
f2db2a96d1 arm, tests: Update config files to more recent kernels and create 64-bit regressions.
This changes the default ARM system to a Versatile Express-like system that supports
2GB of memory and PCI devices and updates the default kernels/file-systems for
AArch64 ARM systems (64-bit) to support up to 32GB of memory and PCI devices. Some
platforms that are no longer supported have been pruned from the configuration files.

In addition a set of 64-bit ARM regressions have been added to the regression system.
2014-10-29 23:18:27 -05:00
Ali Saidi
3a5c975fd7 arm: fix bare-metal memory setup.
The bare-metal configuration option still configured memory with the old scheme
that no longer works. This change unifies the code so there aren't any differences.
2014-10-29 23:18:26 -05:00
Andreas Hansson
66df7b7fd4 config: Add the ability to read a config file using C++ and Python
This patch adds the ability to load in config.ini files generated from
gem5 into another instance of gem5 built without Python configuration
support. The intended use case is for configuring gem5 when it is a
library embedded in another simulation system.

A parallel config file reader is also provided purely in Python to
demonstrate the approach taken and to provide similar functionality
for as-yet-unknown use models. The Python configuration file reader
can read both .ini and .json files.

C++ configuration file reading:

A command line option has been added for scons to enable C++ configuration
file reading: --with-cxx-config

There is an example in util/cxx_config that shows C++ configuration in action.
util/cxx_config/README explains how to build the example.

Configuration is achieved by the object CxxConfigManager. It handles
reading object descriptions from a CxxConfigFileBase object which
wraps a config file reader. The wrapper class CxxIniFile is provided
which wraps an IniFile for reading .ini files. Reading .json files
from C++ would be possible with a similar wrapper and a JSON parser.

After reading object descriptions, CxxConfigManager creates
SimObjectParam-derived objects from the classes in the (generated with this
patch) directory build/ARCH/cxx_config

CxxConfigManager can then build SimObjects from those SimObjectParams (in an
order dictated by the SimObject-value parameters on other objects) and bind
ports of the produced SimObjects.

A minimal set of instantiate-replacing member functions are provided by
CxxConfigManager and few of the member functions of SimObject (such as drain)
are extended onto CxxConfigManager.

Python configuration file reading (configs/example/read_config.py):

A Python version of the reader is also supplied with a similar interface to
CxxConfigFileBase (In Python: ConfigFile) to config file readers.

The Python config file reading will handle both .ini and .json files.

The object construction strategy is slightly different in Python from the C++
reader as you need to avoid objects prematurely becoming the children of other
objects when setting parameters.

Port binding also needs to be strictly in the same port-index order as the
original instantiation.
2014-10-16 05:49:37 -04:00
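To give a feel for what reading object descriptions from a .ini file involves (66df7b7fd4), here is a small sketch using Python 3's standard configparser. It is not configs/example/read_config.py. The sample text is made up, and the assumption that each section is named after an object's path, with a space-separated `children` key, is stated here for illustration only.

```python
import configparser

# Made-up sample in the assumed config.ini layout: one section per object.
SAMPLE = """
[root]
type=Root
children=system

[system]
type=System
children=membus
mem_mode=timing

[system.membus]
type=CoherentXBar
"""


def read_objects(text):
    """Return {section name: {parameter: value}} for every object section."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return {name: dict(parser[name]) for name in parser.sections()}


def child_paths(objects, name):
    """Full section names of an object's children (root's children are top-level)."""
    children = objects[name].get("children", "").split()
    prefix = "" if name == "root" else name + "."
    return [prefix + child for child in children]


objects = read_objects(SAMPLE)
```

Walking `child_paths` from `root` visits parents before children. The ordering constraints the commit message describes for parameters and port binding are not modelled here.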
Nilay Vaish
b80e574d01 config: separate function for instantiating a memory controller
This patch moves code for instantiating a single memory controller from
the function config_mem() to a separate function.  This is being done
so that memory controllers can be instantiated without assuming that
they will be attached to the system in a particular fashion.
2014-10-11 15:02:23 -05:00
Nilay Vaish
0f28d63272 ruby: moesi hammer: correct typo in master-slave assignment
2014-10-11 15:02:22 -05:00
Jiuyue Ma
e1a5522a89 config, x86: Ensure that PCI devs get bridged to the memory bus
This patch forces IO devices to be mapped to 0xC0000000-0xFFFF0000 by
reserving everything between the end of memory and 3GB if memory is less
than 3GB. It also statically bridges this address range to the IO bus,
which guarantees that accesses to the PCI address space pass through the
bridge to the iobus.

Committed by: Nilay Vaish <nilay@cs.wisc.edu>
2014-07-17 12:05:41 +08:00
Jiuyue Ma
7d03bf4d6b config, x86: swap bus_id of ISA/PCI in X86 IntelMPTable
This patch assigns bus_id=0 to the PCI bus and bus_id=1 to the ISA bus for
the X86 platform, because PCI devices get their config space address using
Pc::calcPciConfigAddr(), which requires "assert(bus==0)".
This fixes PCI interrupt routing and discovery on Linux.

Committed by: Nilay Vaish <nilay@cs.wisc.edu>
2014-07-17 11:00:12 +08:00
Andreas Hansson
1f6d5f8f84 mem: Rename Bus to XBar to better reflect its behaviour
This patch changes the name of the Bus classes to XBar to better
reflect the actual timing behaviour. The actual instances in the
config scripts are not renamed, and remain as e.g. iobus or membus.

As part of this renaming, the code has also been cleaned up slightly,
making use of range-based for loops and tidying up some comments. The
only changes outside the bus/crossbar code are due to the delay
variables in the packet.

--HG--
rename : src/mem/Bus.py => src/mem/XBar.py
rename : src/mem/coherent_bus.cc => src/mem/coherent_xbar.cc
rename : src/mem/coherent_bus.hh => src/mem/coherent_xbar.hh
rename : src/mem/noncoherent_bus.cc => src/mem/noncoherent_xbar.cc
rename : src/mem/noncoherent_bus.hh => src/mem/noncoherent_xbar.hh
rename : src/mem/bus.cc => src/mem/xbar.cc
rename : src/mem/bus.hh => src/mem/xbar.hh
2014-09-20 17:18:32 -04:00
Wendy Elsasser
a384525355 cpu: Update DRAM traffic gen
Add new DRAM_ROTATE mode to traffic generator.
This mode will generate DRAM traffic that rotates across
banks per rank, command types, and ranks per channel

The looping order is illustrated below:
for (ranks per channel)
   for (command types)
      for (banks per rank)
         // Generate DRAM Command Series

This patch also adds the read percentage as an input argument to the
DRAM sweep script. If the simulated read percentage is 0 or 100, the
middle for loop does not generate additional commands.  This loop is
used only when the read percentage is set to 50, in which case the
middle loop will toggle between read and write commands.

Modified sweep.py script, which generates DRAM traffic.
Added input arguments and support for new DRAM_ROTATE mode.
The script now has input arguments for:
 1) Read percentage
 2) Number of ranks
 3) Address mapping
 4) Traffic generator mode  (DRAM or DRAM_ROTATE)

The default values are:
 100% reads, 1 rank, RoRaBaCoCh address mapping, and DRAM traffic gen mode

For the DRAM traffic mode, added multi-rank support.
2014-09-20 17:17:55 -04:00
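The DRAM_ROTATE looping order from a384525355 can be written out as a small generator. This is a sketch of the ordering only. The function name, the choice to issue reads before writes at 50%, and rejecting other percentages are assumptions not stated in the commit message.

```python
def rotate_order(ranks_per_channel, banks_per_rank, read_percent):
    """Yield (rank, command, bank) in the order ranks -> command types -> banks."""
    if read_percent == 100:
        commands = ["read"]
    elif read_percent == 0:
        commands = ["write"]
    elif read_percent == 50:
        commands = ["read", "write"]  # assumed order of the toggle
    else:
        # The message only describes 0, 50 and 100; other values are rejected here.
        raise ValueError("unsupported read percentage: %r" % read_percent)
    for rank in range(ranks_per_channel):
        for command in commands:
            for bank in range(banks_per_rank):
                yield rank, command, bank
```

At 0% or 100% the middle loop has a single iteration, which matches the statement that it generates no additional commands in those cases.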
Dam Sunwoo
ca3513d630 cpu: use probes infrastructure to do simpoint profiling
Instead of having code embedded in cpu model to do simpoint profiling use
the probes infrastructure to do it.
2014-09-20 17:17:43 -04:00
Ali Saidi
1c0ae90027 arm: Support >2GB of memory for AArch64 systems
2014-09-03 07:43:05 -04:00
Ali Saidi
16262a8fc3 arm: Assume we have a kernel that supports pci devices
Change the default kernel for AArch64 and since it supports PCI devices
remove the hack that made it use CF. Unfortunately, there isn't really
a half-way here and we need to switch. Current users will get an error
message that the kernel isn't found and hopefully go download a new
kernel that supports PCI.
2014-09-03 07:43:04 -04:00
Geoffrey Blake
845e199934 config: Refactor RealviewEMM to fit into new config system
This eliminates some default devices and adds in helper functions
to connect the devices defined here to associate with the proper
clock domains.
2014-09-03 07:43:01 -04:00
Mitch Hayenga
976f27487b cpu: Change writeback modeling for outstanding instructions
As highlighted on the mailing list, gem5's writeback modeling can impact
performance.  This patch removes the limitation on maximum outstanding issued
instructions, however the number that can writeback in a single cycle is still
respected in instToCommit().
2014-09-03 07:42:33 -04:00
Andreas Hansson
0756406739 mem: Add utility script to plot DRAM efficiency sweep
This patch adds basic functionality to quickly visualise the output
from the DRAM efficiency script. There are some unfortunate hacks
needed to communicate the needed information from one script to the
other, and we fall back on (ab)using the simout to do this.

As part of this patch we also trim the efficiency sweep to stop at 512
bytes as this should be sufficient for all foreseeable DRAMs.
2014-09-03 07:42:29 -04:00
Nilay Vaish
7a0d5aafe4 ruby: message buffers: significant changes
This patch is the final patch in a series of patches.  The aim of the series
is to make ruby more configurable than it was.  More specifically, the
connections between controllers are not at all possible (unless one is ready
to make significant changes to the coherence protocol).  Moreover the buffers
themselves are magically connected to the network inside the slicc code.
These connections are not part of the configuration file.

This patch makes changes so that these connections will now be made in the
python configuration files associated with the protocols.  This requires
each state machine to expose the message buffers it uses for input and output.
So, the patch makes these buffers configurable members of the machines.

The patch drops the slicc code that used to connect these buffers to the
network.  Now these buffers are exposed to the python configuration system
as Master and Slave ports.  In the configuration files, any master port
can be connected to any slave port.  The file pyobject.cc has been modified
to take care of allocating the actual message buffer.  This is in line with
how other port connections work.
2014-09-01 16:55:47 -05:00
Emilio Castillo
01f792a367 ruby: Fixes clock domains in configuration files
This patch fixes scripts related to ruby by adding the ruby clock domain.
Now the L1 controllers and the Sequencer share the cpu clock domain,
while the rest of the components use the ruby clock domain.

Before this patch, running simulations with the cpu clock set at 2GHz or
1GHz will output the same time results and could distort power measurements.

Committed by: Nilay Vaish <nilay@cs.wisc.edu>
2014-09-01 16:55:30 -05:00
Radhika Jagtap
860f00228b config: Fix cache latency param in mem test
This patch fixes the cache latency in mem test which is split into two params,
hit and response latency as per BaseCache.
2014-08-10 05:39:40 -04:00
Anthony Gutierrez
0ac4624595 arm: make the PseudoLRU tags the default for the O3_ARM_v7aL2
the Cortex-A15 has a random replacement policy for its L2 cache. see the
Cortex-A15 Technical Reference Manual 1.7 About the L2 memory system. this
patch makes the PseudoLRU tags the default for the ARM O3 CPU's L2 cache.
2014-07-28 12:22:00 -04:00
Andrew Bardsley
0e8a90f06b cpu: `Minor' in-order CPU model
This patch contains a new CPU model named `Minor'. Minor models a four
stage in-order execution pipeline (fetch lines, decompose into
macroops, decompose macroops into microops, execute).

The model was developed to support the ARM ISA but should be fixable
to support all the remaining gem5 ISAs. It currently also works for
Alpha, and regressions are included for ARM and Alpha (including Linux
boot).

Documentation for the model can be found in src/doc/inside-minor.doxygen and
its internal operations can be visualised using the Minorview tool
utils/minorview.py.

Minor was designed to be fairly simple and not to engage in a lot of
instruction annotation. As such, it currently has very few gathered
stats and may lack other gem5 features.

Minor is faster than the o3 model. Sample results:

     Benchmark     |   Stat host_seconds (s)
    ---------------+--------v--------v--------
     (on ARM, opt) | simple | o3     | minor
                   | timing | timing | timing
    ---------------+--------+--------+--------
    10.linux-boot  |   169  |  1883  |  1075
    10.mcf         |   117  |   967  |   491
    20.parser      |   668  |  6315  |  3146
    30.eon         |   542  |  3413  |  2414
    40.perlbmk     |  2339  | 20905  | 11532
    50.vortex      |   122  |  1094  |   588
    60.bzip2       |  2045  | 18061  |  9662
    70.twolf       |   207  |  2736  |  1036
2014-07-23 16:09:04 -05:00
Anthony Gutierrez
db267da822 arm: make the bi-mode predictor the default for O3_ARM_v7a_BP
the branch predictor used in the Cortex-A15 is a bi-mode style predictor,
see:

http://arm.com/files/pdf/at-exploring_the_design_of_the_cortex-a15.pdf
and
http://nvidia.com/docs/IO/116757/NVIDIA_Quad_a15_whitepaper_FINALv2.pdf

this patch makes the bi-mode predictor the default for the ARM O3 CPU.
2014-06-30 13:50:01 -04:00
Anthony Gutierrez
53dd4497b3 config: remove unnecessary assignment of etherlink interfaces
in makeDualRoot() the etherlink interfaces are set using the tsunami interface
however, they are set again a few lines later based on whether or not the system
is a realview or tsunami system; the original assignment is always overwritten
or there will be a fatal. this seems like an artifact from when tsunami was the
only type of system capable of running with the dual option.
2014-05-15 13:26:31 -04:00
Andreas Hansson
aa329f4757 config: Bump DRAM sweep bus speed to match DDR4 config
This patch bumps the bus clock speed such that the interconnect does
not become a bottleneck with a DDR4-2400-x64 DRAM delivering 19.2
GByte/s theoretical max.
2014-05-09 18:58:49 -04:00
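The 19.2 GByte/s figure in aa329f4757 follows from the device name: 2400 mega-transfers per second over a 64-bit (8-byte) interface. A quick check, with a helper name chosen for this example:

```python
def peak_bandwidth_gbytes(mega_transfers_per_sec, bus_width_bits):
    """Theoretical peak bandwidth in GByte/s (1 GByte = 1000 MByte here)."""
    bytes_per_transfer = bus_width_bits // 8
    return mega_transfers_per_sec * bytes_per_transfer / 1000.0


ddr4_2400_x64 = peak_bandwidth_gbytes(2400, 64)  # 2400 * 8 / 1000
```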
Nilay Vaish
097aadc2cd config: ruby: remove memory controller from network test
It is not in use and not required as such.
2014-04-19 09:00:30 -05:00
Anthony Gutierrez
5ab6bdc1ec arm: set default kernels for VExpress_EMM and VExpress_EMM64
2014-04-14 19:30:24 -04:00