Commit graph

294 commits

Author SHA1 Message Date
Andreas Sandberg
9a13acaa36 config, arm: Add multi-core KVM support to bL config
Add support for KVM in the big.LITTLE(tm) example configuration. This
replaces the --atomic option with a --cpu-type option that can be used
to switch between atomic, kvm, and timing simulation.

When running in KVM mode, the simulation script automatically assigns
separate event queues (threads) to each of the simulated CPUs. All
simulated devices, including CPU child devices (e.g., interrupt
controllers and caches), are assigned to event queue 0.

Change-Id: Ic9a3f564db91f5a3d3cb754c5a02fdd5c17d5fdf
Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-by: Curtis Dunham <curtis.dunham@arm.com>
Reviewed-by: Sascha Bischoff <sascha.bischoff@arm.com>
Reviewed-by: Gabor Dozsa <gabor.dozsa@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/2561
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
Reviewed-by: Weiping Liao <weipingliao@google.com>
2017-04-03 16:37:55 +00:00
Andreas Sandberg
3547af6e44 config, arm: Unify checkpoint path handling in bL configs
The vanilla bL configuration file and the dist-gem5 configuration file
use slightly different code paths when restoring from
checkpoints. Unify this by passing the parsed options to the
instantiate() method and adding an optional checkpoint keyword
argument for checkpoint directories (only used by the dist-gem5
script).

Change-Id: I9943ec10bd7a256465e29c8de571142ec3fbaa0e
Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-by: Curtis Dunham <curtis.dunham@arm.com>
Reviewed-by: Sascha Bischoff <sascha.bischoff@arm.com>
Reviewed-by: Gabor Dozsa <gabor.dozsa@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/2560
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
Reviewed-by: Weiping Liao <weipingliao@google.com>
2017-04-03 16:37:55 +00:00
Brandon Potter
3886c4a8f2 syscall_emul: [patch 5/22] remove LiveProcess class and use Process instead
The EIOProcess class was removed recently and it was the only other class
which derived from Process. Since every Process invocation is also a
LiveProcess invocation, it makes sense to simplify the organization by
combining the fields from LiveProcess into Process.
2016-11-09 14:27:40 -06:00
Gabor Dozsa
4b8b9c0585 arm,config: Add dist-gem5 support to the big.LITTLE(tm) config
This patch extends the example big.LITTLE configuration to enable
dist-gem5 simulations of big.LITTLE systems.

Change-Id: I49c095ab3c737b6a082f7c6f15f514c269217756
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
2017-02-14 15:36:15 -06:00
Gabor Dozsa
54c478c0b8 arm,config: Refactor the example big.LITTLE(tm) configuration
This patch prepares future extensions and customisation of the example
big.LITTLE configuration script. It breaks out the major phases into
functions so they can be called from other python scripts.

Change-Id: I2cb7c207c410fe14602cf17af7482719abba6c24
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
2017-02-14 15:09:18 -06:00
Curtis Dunham
41beacce08 sim, kvm: make KvmVM a System parameter
A KVM VM is typically a child of the System object already, but for
solving future issues with configuration graph resolution, the most
logical way to keep track of this object is for it to be an actual
parameter of the System object.

Change-Id: I965ded22203ff8667db9ca02de0042ff1c772220
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
2017-02-14 15:09:18 -06:00
Wendy Elsasser
ca0fd665dc mem: Update DRAM configuration names
Names of DRAM configurations were updated to reflect both
the channel and device data width.

Previous naming format was:
	<DEVICE_TYPE>_<DATA_RATE>_<CHANNEL_WIDTH>

The following nomenclature is now used:
	<DEVICE_TYPE>_<DATA_RATE>_<n>x<w>
where n = The number of devices per rank on the channel
      x = Device width

Total channel width can be calculated by n*w

Example:
A 64-bit DDR4, 2400 channel consisting of 4-bit devices:
	n = 16
	w = 4
The resulting configuration name is:
	DDR4_2400_16x4

Updated scripts to match new naming convention.

Added unique configurations for DDR4 for:
1) 16x4
2) 8x8
3) 4x16

Change-Id: Ibd7f763b7248835c624309143cb9fc29d56a69d1
Reviewed-by: Radhika Jagtap <radhika.jagtap@arm.com>
Reviewed-by: Curtis Dunham <curtis.dunham@arm.com>
2017-02-14 15:09:18 -06:00
Matthias Jung
b682366b30 config: Fix missing include in fs.py
Bugfix for Elastic Traces

This patch fixes the bug when elastic traces are used:


    build/ARM/gem5.opt \
    configs/example/fs.py \
    --cpu-type=arm_detailed \
    --num-cpu=1 \
    --mem-type=SimpleMemory \
    --mem-size=512MB \
    --mem-channels=1 \
    --caches \
    --elastic-trace-en \
    --data-trace-file=data.proto.gz \
    --inst-trace-file=inst.proto.gz \
    --machine-type=VExpress_EMM \
    --dtb-filename=vexpress.aarch32.ll_20131205.0-gem5.1cpu.dtb \
    --kernel=vmlinux.aarch32.ll_20131205.0-gem5 \
    --disk-image=linux-aarch32-ael.img


NameError: global name 'CpuConfig' is not defined

Signed-off by: Jason Lowe-Power <jason@lowepower.com>
2017-01-09 09:32:13 -06:00
Gabor Dozsa
3ef797623a arm, config: Add missing IOCache in bL config
This patch adds an IOCache to the example bigLITTLE
configuration. An IOCache is required for correct DMA
transfers when we have caches in the system.

Change-Id: Ifeddc1b360aacbb16b1393f361dd98873c834012
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
2016-12-06 17:10:36 +00:00
Nikos Nikoleris
9e57e4e89d config: Add an option to generate a random topology in memcheck
This change adds the option to use the memcheck with random memory
hierarchies at the moment limited to a maximum depth of 3 allowing
testing with uncommon topologies.

Change-Id: Id2c2fe82a8175d9a67eb4cd7f3d2e2720a809b60
Reviewed-by: Andreas Hansson <andreas.hansson@arm.com>
2016-12-05 16:48:31 -05:00
Nikos Nikoleris
c1a40f9e44 config: Add whole line accesses to improve memchecker's coverage
Change-Id: Ie1a047139e350ce7400f3a20be644eaff1e21428
Reviewed-by: Andreas Hansson <andreas.hansson@arm.com>
2016-12-05 16:48:30 -05:00
Sophiane Senni
ce2722cdd9 mem: Split the hit_latency into tag_latency and data_latency
If the cache access mode is parallel, i.e. "sequential_access" parameter
is set to "False", tags and data are accessed in parallel. Therefore,
the hit_latency is the maximum latency between tag_latency and
data_latency. On the other hand, if the cache access mode is
sequential, i.e. "sequential_access" parameter is set to "True",
tags and data are accessed sequentially. Therefore, the hit_latency
is the sum of tag_latency plus data_latency.

Signed-off-by: Jason Lowe-Power <jason@lowepower.com>
2016-11-30 17:10:27 -05:00
Tony Gutierrez
de72e36619 gpu-compute: support in-order data delivery in GM pipe
this patch adds an ordered response buffer to the GM pipeline
to ensure in-order data delivery. the buffer is implemented as
a stl ordered map, which sorts the request in program order by
using their sequence ID. when requests return to the GM pipeline
they are marked as done. only the oldest request may be serviced
from the ordered buffer, and only if is marked as done.

the FIFO response buffers are kept and used in OoO delivery mode
2016-10-26 22:48:28 -04:00
Andreas Hansson
90b087171b config: Break out base options for usage with NULL ISA
This patch breaks out the most basic configuration options into a set
of base options, to allow them to be used also by scripts that do not
involve any ISA, and thus no actual CPUs or devices.

The patch also fixes a few modules so that they can be imported in a
NULL build, and avoid dragging in FSConfig every time Options is
imported.
2016-10-26 14:50:54 -04:00
Andreas Hansson
2f5262eb67 config: Make configs/common a Python package
Continue along the same line as the recent patch that made the
Ruby-related config scripts Python packages and make also the
configs/common directory a package.

All affected config scripts are updated (hopefully).

Note that this change makes it apparent that the current organisation
and naming of the config directory and its subdirectories is rather
chaotic. We mix scripts that are directly invoked with scripts that
merely contain convenience functions. While it is not addressed in
this patch we should follow up with a re-organisation of the
config structure, and renaming of some of the packages.
2016-10-14 10:37:38 -04:00
Andreas Hansson
68fdccb30b ruby: Fix regressions and make Ruby configs Python packages
This patch moves the addition of network options into the Ruby module
to avoid the regressions all having to add it explicitly. Doing this
exposes an issue in our current config system though, namely the fact
that addtoPath is relative to the Python script being executed. Since
both example and regression scripts use the Ruby module we would end
up with two different (relative) paths being added. Instead we take a
first step at turning the config modules into Python packages, simply
by adding a __init__.py in the configs/ruby, configs/topologies and
configs/network subdirectories.

As a result, we can now add the top-level configs directory to the
Python search path, and then use the package names in the various
modules. The example scripts are also updated, and the messy
path-deducing variations in the scripts are unified.
2016-10-13 03:17:19 -04:00
Tushar Krishna
b9e23a6d74 config: add a separate config file for the network.
This patch adds a new file configs/network/Network.py to setup the network,
instead of doing that within Ruby.py.
2016-10-06 14:35:17 -04:00
Tushar Krishna
0f68b50ff1 ruby: rename networktest to garnet_synthetic_traffic.
networktest is essentially a collection of synthetic traffic patterns
for the network. The protocol name and the tester having the same name
led to multiple python configuration files with the same name, adding
confusion. This patch renames networktest to garnet_synthetic_traffic,
and also adds more synthetic traffic patterns.
2016-10-06 14:35:16 -04:00
Gabor Dozsa
fb349aa984 arm, config: Fixups for the example big.LITTLE(tm) configuration
This patch refactors the configuration file to use a more
object-oriented design.

Change-Id: I44ac2d063c2b5901f385544fb6ce3f259459cb05
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-by: Gabor Dozsa <gabor.dozsa@arm.com>
2016-09-15 18:00:59 +01:00
David Hashe
d1abc287f6 config: KVM acceleration for apu_se.py
Add support for using KVM to accelerate APU simulations. The intended use
case is to fast-forward through runtime initialization until the first
kernel launch.
2016-08-22 11:43:44 -04:00
Andreas Sandberg
eb87ed8e74 arm, config: Add initial support for Ruby
Add initial support for creating an ARM system with a Ruby-based
memory system. This support is currently experimental and limited to
the new VExpress_GEM5_V1 platform.

Change-Id: I36baeb68b0d891e34ea46aafe17b5e55217b4bfa
Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
Reviewed-by: Brad Beckmann <brad.beckmann@amd.com>
2016-08-10 16:26:34 +01:00
Gabor Dozsa
a288c94387 arm, config: Add an example ARM big.LITTLE(tm) configuration script
An ARM big.LITTLE system consists of two cpu clusters: the big
CPUs are typically complex out-of-order cores and the little
CPUs are simpler in-order ones. The fs_bigLITTLE.py script
can run a full system simulation with various number of big
and little cores and cache hierarchy. The commit also includes
two example device tree files for booting Linux on the
bigLITTLE system.

Change-Id: I6396fb3b2d8f27049ccae49d8666d643b66c088b
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
2016-07-21 17:19:16 +01:00
Abdul Mutaal Ahmad
9c20880fb5 mem: tester for new HMC configuration
This patch provides the example test script to configure different HMC
architecture and run traffic through traffic generator.

Committed by Jason Lowe-Power <jason@lowepower.com>
2016-07-01 09:48:43 -05:00
jkalamat
3724fb15fa gpu-compute: parametrize Wavefront size
Eliminate the VSZ constant that defined the Wavefront size (in numbers of work
items); replaced it with a parameter in the GPU.py configuration script.
Changed all data structures dependent on the Wavefront size to be dynamically
sized. Legal values of Wavefront size are 16, 32, 64 for now and checked at
initialization time.
2016-06-09 11:24:55 -04:00
Andreas Hansson
53d735b17e config: Add missing point of coherency to memcheck script
Bring in line with changes to the XBar class.
2016-04-21 04:48:04 -04:00
Andreas Hansson
92f021cbbe mem: Move the point of coherency to the coherent crossbar
This patch introduces the ability of making the coherent crossbar the
point of coherency. If so, the crossbar does not forward packets where
a cache with ownership has already committed to responding, and also
does not forward any coherency-related packets that are not intended
for a downstream memory controller. Thus, invalidations and upgrades
are turned around in the crossbar, and the memory controller only sees
normal reads and writes.

In addition this patch moves the express snoop promotion of a packet
to the crossbar, thus allowing the downstream cache to check the
express snoop flag (as it should) for bypassing any blocking, rather
than relying on whether a cache is responding or not.
2016-02-10 04:08:25 -05:00
Steve Reinhardt
dc8018a5c3 style: remove trailing whitespace
Result of running 'hg m5style --skip-all --fix-white -a'.
2016-02-06 17:21:18 -08:00
Brad Beckmann
97a5e5b25e ruby: changed all references to numCPs to num-cp 2016-01-22 10:42:12 -05:00
Tony Gutierrez
1a7d3f9fcb gpu-compute: AMD's baseline GPU model 2016-01-19 14:28:22 -05:00
Gabor Dozsa
64ca31976f config: Updates for distributed gem5 simulations 2016-01-07 16:33:47 -06:00
Andreas Hansson
219f1c1a7f configs: Make the default memtest behaviour more complex
Add functional and uncacheable accesses by default.
2015-12-17 17:07:22 -05:00
Brad Beckmann
173a786921 ruby: more flexible ruby tester support
This patch allows the ruby random tester to use ruby ports that may only
support instr or data requests.  This patch is similar to a previous changeset
(8932:1b2c17565ac8) that was unfortunately broken by subsequent changesets.
This current patch implements the support in a more straight-forward way.
Since retries are now tested when running the ruby random tester, this patch
splits up the retry and drain check behavior so that RubyPort children, such
as the GPUCoalescer, can perform those operations correctly without having to
duplicate code.  Finally, the patch also includes better DPRINTFs for
debugging the tester.
2015-07-20 09:15:18 -05:00
Radhika Jagtap
9bd5051b60 config: Enable elastic trace capture and replay in se/fs
This patch adds changes to the configuration scripts to support elastic
tracing and replay.

The patch adds a command line option to enable elastic tracing in SE mode
and FS mode. When enabled the Elastic Trace cpu probe is attached to O3CPU
and a few O3 CPU parameters are tuned. The Elastic Trace probe writes out
both instruction fetch and data dependency traces. The patch also enables
configuring the TraceCPU to replay traces using the SE and FS script.

The replay run is designed to resume from checkpoint using atomic cpu to
restore state keeping it consistent with FS run flow. It then switches to
TraceCPU to replay the input traces.
2015-12-07 16:42:16 -06:00
Andrew Bardsley
4375678a0d config: Added missing types to JSON/INI Python reader
Added the missing types EthernetAddr and Current to the JSON/INI file
reader example configs/example/read_config.py.

Also added __str__ to EthernetAddr to make values appear in the same form
in JSON an INI files.
2015-11-22 05:10:21 -05:00
Andreas Hansson
337774e192 config: Update memtest to stress test clean writebacks
This patch adds yet another twist to the memtest cache hierarchy, in that
the writeback_clean option is toggled at every level to match the
clusivity of the downstream cache.
2015-11-06 03:26:44 -05:00
Andreas Hansson
afa252b0b9 config: Update memtest to stress test cache clusivity
This patch adds an new twist to the memtest cache hierarchy, in that
it switches from mostly inclusive to mostly exclusive at every level
in the tree. This has helped weed out plenty issues, and serves as a
good stress tests.
2015-11-06 03:26:42 -05:00
Erfan Azarkhish
100cbc9cf6 mem: hmc: top level design
This patch enables modeling a complete Hybrid Memory Cube (HMC) device. It
highly reuses the existing components in gem5's general memory system with some
small modifications. This changeset requires additional patches to model a
complete HMC device.

Committed by: Nilay Vaish <nilay@cs.wisc.edu>
2015-11-03 12:17:56 -06:00
Mitch Hayenga
a5c4eb3de9 isa,cpu: Add support for FS SMT Interrupts
Adds per-thread interrupt controllers and thread/context logic
so that interrupts properly get routed in SMT systems.
2015-09-30 11:14:19 -05:00
Mitch Hayenga
582a0148b4 config,cpu: Add SMT support to Atomic and Timing CPUs
Adds SMT support to the "simple" CPU models so that they can be
used with other SMT-supported CPUs. Example usage: this enables
the TimingSimpleCPU to be used to warmup caches before swapping to
detailed mode with the in-order or out-of-order based CPU models.
2015-09-30 11:14:19 -05:00
Nilay Vaish
bc5be9ac43 config: allow ruby to be used with Minor CPU 2015-09-06 23:11:11 -05:00
Andreas Hansson
ddfa96cf45 mem: Add explicit Cache subclass and make BaseCache abstract
Open up for other subclasses to BaseCache and transition to using the
explicit Cache subclass.

--HG--
rename : src/mem/cache/BaseCache.py => src/mem/cache/Cache.py
2015-08-21 07:03:23 -04:00
Nilay Vaish
0b163ea707 configs: network test: remove redundant physical memory 2015-07-21 10:08:25 -05:00
Andreas Hansson
b93c912013 mem: Remove redundant is_top_level cache parameter
This patch takes the final step in removing the is_top_level parameter
from the cache. With the recent changes to read requests and write
invalidations, the parameter is no longer needed, and consequently
removed.

This also means that asymmetric cache hierarchies are now fully
supported (and we are actually using them already with L1 caches, but
no table-walker caches, connected to a shared L2).
2015-07-03 10:14:43 -04:00
bpotter
936768c8f4 config: enable setting SE-mode environment variables from file 2015-04-23 13:40:18 -07:00
Curtis Dunham
c3268f8820 config: Support full-system with SST's memory system
This patch adds an example configuration in ext/sst/tests/ that allows
an SST/gem5 instance to simulate a 4-core AArch64 system with SST's
memHierarchy components providing all the caches and memories.
2015-04-08 15:56:06 -05:00
Andreas Hansson
ecd4bad351 config: Add soak test for memtest.py
This patch adds a random option to memtest.py which allows the user to
easily test valid random tree topologies. The patch also adds a
wrapper script to run soak tests using the newly introduced option.

We also adjust the progress interval and progress limit check to make
the output less noisy, and avoid false positives.

Bring on the pain.
2015-03-19 04:06:18 -04:00
Chris Emmons
142ab40c4b config: Specify OS type and release on command line
This patch enables users to speficy --os-type on the command
line. This option is used to take specific actions for an OS type,
such as changing the kernel command line. This patch is part of the
Android KitKat enablement.
2015-03-19 04:06:14 -04:00
Andreas Hansson
36dc93a5fa mem: Move crossbar default latencies to subclasses
This patch introduces a few subclasses to the CoherentXBar and
NoncoherentXBar to distinguish the different uses in the system. We
use the crossbar in a wide range of places: interfacing cores to the
L2, as a system interconnect, connecting I/O and peripherals,
etc. Needless to say, these crossbars have very different performance,
and the clock frequency alone is not enough to distinguish these
scenarios.

Instead of trying to capture every possible case, this patch
introduces dedicated subclasses for the three primary use-cases:
L2XBar, SystemXBar and IOXbar. More can be added if needed, and the
defaults can be overridden.
2015-03-02 04:00:47 -05:00
Andreas Hansson
f18d2120fa config: Add memcheck stress test
This is a rather unfortunate copy of the memtest.py example script,
that actually stresses the system with true sharing as opposed to the
false sharing of the MemTest. To do so it uses TrafficGen instances to
generate the reads/writes, and MemCheckerMonitor combined with the
MemChecker to check the validity of the read/written values.

As a bonus, this script also enables the addition of prefetchers, and
the traffic is created to have a mix of random addresses and linear
strides. We use the TaggedPrefetcher since the packets do not have a
request with a PC.

At the moment the code is almost identical to the memtest.py script,
and no effort has been made to factor out the construction of the
tree. The challenge is that the instantiation and connection of the
testers and monitors is done as part of the tree building.
2015-02-16 03:35:23 -05:00
Curtis Dunham
07ce60bdfa config: add --root-device machine parameter
In case /dev/sda1 is not actually the boot partition for an image,
we can override it on the command line or in a benchmark definition.
2015-01-16 14:12:03 -06:00