Commit graph

11799 commits

Author SHA1 Message Date
Tony Gutierrez
1961a942f3 ruby: guard usage of GPUCoalescer code in Profiler
the GPUCoalescer code is used in the ruby profiler regardless of
whether or not the coalescer code has been compiled, which can
lead to link/run time errors. here we add #ifdefs to guard the
usage of GPUCoalescer code. eventually we should refactor this
code to use probe points.
2017-01-19 11:59:34 -05:00
Matthew Poremba
42044645b9 ruby: Check MessageBuffer space in garnet NetworkInterface
Garnet's NetworkInterface does not consider the size of MessageBuffers when
ejecting a Message from the network. Add a size check for the MessageBuffer
and only enqueue if space is available. If space is not available, the
message if placed in a queue and the credit is held. A callback from the
MessageBuffer is implemented to wake the NetworkInterface. If there are
messages in the stalled queue, they are processed first, in a FIFO manner
and if succesfully ejected, the credit is finally sent back upstream. The
maximum size of the stall queue is equal to the number of valid VNETs
with MessageBuffers attached.
2017-01-19 11:59:10 -05:00
Matthew Poremba
a4b546c3a1 ruby: Add occupancy stats to MessageBuffers
This patch is an updated version of /r/3297.

"The most important statistic for measuring memory hierarchy performance is
throughput, which is affected by independent variables, buffer sizing and
communication latency. It is difficult/impossible to debug performance issues
through series buffers without knowing which are the bottlenecks. For finite
buffers, this patch adds statistics for the average number of messages in the
buffer, the occupancy of the buffer slots, and number of message stalls."
2017-01-19 11:58:59 -05:00
Matthew Poremba
501f170924 ruby: Check all VNETs for injection in garnet NetworkInterface
The NetworkInterface wakeup currently iterates over all VNETs and breaks the
loop if a VNET is unable to allocate a VC. This can cause a deadlock if a
lower numbered VNET is unable to allocate a VC while a higher numbered VNET
has idle VCs. This seems like a bug as Garnet 1.0 uses a while loop over an
if-statement, suggesting the break was intended for this while loop. This
patch removes the break statement, which allows up to one message to be
dequeued from a VNET and injected into the network.
2017-01-19 11:58:49 -05:00
Brandon Potter
1ced08c850 syscall_emul: [patch 2/22] move SyscallDesc into its own .hh and .cc
The class was crammed into syscall_emul.hh which has tons of forward
declarations and template definitions. To clean it up a bit, moved the
class into separate files and commented the class with doxygen style
comments. Also, provided some encapsulation by adding some accessors and
a mutator.

The syscallreturn.hh file was renamed syscall_return.hh to make it consistent
with other similarly named files in the src/sim directory.

The DPRINTF_SYSCALL macro was moved into its own header file with the
include the Base and Verbose flags as well.

--HG--
rename : src/sim/syscallreturn.hh => src/sim/syscall_return.hh
2016-11-09 14:27:40 -06:00
Brandon Potter
7a8dda49a4 style: [patch 1/22] use /r/3648/ to reorganize includes 2016-11-09 14:27:37 -06:00
Matthias Jung
63bb17e4bd misc: fixes deprecated sc_time function for SystemC 2.3.1
The non-standard sc_time constructors

- sc_time( uint64, bool scale )
- sc_time( double, bool scale )

have been deprecated in SystemC 2.3.1 and a warning is issued when being
used. Insted the new 'sc_time::from_value' function is used to omit the
warning message.

Signed-off-by: Jason Lowe-Power <jason@lowepower.com>
2017-01-09 09:34:36 -06:00
Matthias Jung
5b08ae2372 misc: Documentation Update
Updates for READMEs of /util/cxx_config, /util/systemc, /util/tlm.
Some minor corrections, mostly with respect to MAC/OSX

Signed-off-by: Jason Lowe-Power <jason@lowepower.com>
2017-01-09 09:33:42 -06:00
Matthias Jung
b682366b30 config: Fix missing include in fs.py
Bugfix for Elastic Traces

This patch fixes the bug when elastic traces are used:


    build/ARM/gem5.opt \
    configs/example/fs.py \
    --cpu-type=arm_detailed \
    --num-cpu=1 \
    --mem-type=SimpleMemory \
    --mem-size=512MB \
    --mem-channels=1 \
    --caches \
    --elastic-trace-en \
    --data-trace-file=data.proto.gz \
    --inst-trace-file=inst.proto.gz \
    --machine-type=VExpress_EMM \
    --dtb-filename=vexpress.aarch32.ll_20131205.0-gem5.1cpu.dtb \
    --kernel=vmlinux.aarch32.ll_20131205.0-gem5 \
    --disk-image=linux-aarch32-ael.img


NameError: global name 'CpuConfig' is not defined

Signed-off by: Jason Lowe-Power <jason@lowepower.com>
2017-01-09 09:32:13 -06:00
Andreas Sandberg
1738a7d260 sim: Remove declaration of unused CountedDrainEvent
The CountedDrainEvent event was used to keep track of objects that
required additional simulation to drain. It was removed as a part of
the great drain rewrite, but the declaration remained.

Change-Id: I767a3213669040d3f27e2afafa2e4a5bb997e325
Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-by: Curtis Dunham <curtis.dunham@arm.com>
2017-01-03 17:31:39 +00:00
Andreas Sandberg
c8b1e8f1cf python: Don't use Swig to cast stats
Call the stat visitor from the stat itself rather than casting stats
in Python. This reduces the number of ways visitors are called.

Change-Id: Ic4d0b7b32e3ab9897b9a34cd22d353f4da62d738
Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-by: Sascha Bischoff <sascha.bischoff@arm.com>
Reviewed-by: Curtis Dunham <curtis.dunham@arm.com>
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
Reviewed-by: Joe Gross <joseph.gross@amd.com>
2017-01-03 12:03:45 +00:00
Andreas Sandberg
abe7ef95cb sim: Remove redundant export_method_cxx_predecls
The headers declared in export_method_cxx_predecls are redundant since a
SimObject's main header is automatically included.

Change-Id: Ied9e84630b36960e54efe91d16f8c66fba7e0da0
Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-by: Curtis Dunham <curtis.dunham@arm.com>
Reviewed-by: Joe Gross <joseph.gross@amd.com>
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
2017-01-03 12:03:06 +00:00
Andreas Sandberg
f835378bea util: Add maintainer tools to create upstream patches
This changeset adds a maintainer script, create_patches.sh, that can
be used to prepare for upstream from a git repository. The script can
be used to generate patches in Mercurial or git format. The commit
messages in the exported patches are all filtered, see
upstream_msg_filter.sed, to ensure that irrelevant meta data isn't
included in the upstream commit.

Kudos to Curtis Dunham and Nikos Nikoleris for reviews and usability
enhancements for earlier versions of this patch.

Change-Id: Ia4cd089a32834b5e046ef58c0a173ca285b77bca
Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-by: Curtis Dunham <curtis.dunham@arm.com>
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
2017-01-03 11:31:46 +00:00
Joel Hestness
6a49dee3f3 sim: Fix SE mode checkpoint restore file handling
When restoring from a checkpoint, the simulation used to use file handles from
the checkpoint. This disallows multiple separate restore simulations from using
separate input and output files and directories, and plays havoc when the
checkpointed file locations may have changed. Add handling to allow the command
line specified files to be used as input/output for the restored simulation
(Note: this is the similar functionality to FS mode for output and error).
2016-12-23 08:43:18 -06:00
Arthur Perais
c9d933efb0 cpu: implement an L-TAGE branch predictor
This patch implements an L-TAGE predictor, based on André Seznec's code
available from CBP-2
(http://hpca23.cse.tamu.edu/taco/camino/cbp2/cbp-src/realistic-seznec.h).

Signed-off-by Jason Lowe-Power <jason@lowepower.com>
2016-12-21 15:25:13 -06:00
Arthur Perais
497cc2d373 cpu: disallow speculative update of branch predictor tables (o3)
The Minor and o3 cpu models share the branch prediction
code. Minor relies on the BPredUnit::squash() function
to update the branch predictor tables on a branch mispre-
diction. This is fine because Minor executes in-order, so
the update is on the correct path. However, this causes the
branch predictor to be updated on out-of-order branch
mispredictions when using the o3 model, which should not
be the case.

This patch guards against speculative update of the branch
prediction tables. On a branch misprediction, BPredUnit::squash()
calls BpredUnit::update(..., squashed = true). The underlying
branch predictor tests against the value of squashed. If it is
true, it restores any speculatively updated internal state
it might have (e.g., global/local branch history), then returns.
If false, it updates its prediction tables. Previously, exist-
ing predictors did not test against the "squashed" parameter.

To accomodate for this change, the Minor model must now call
BPredUnit::squash() then BPredUnit::update(..., squashed = false)
on branch mispredictions. Before, calling BpredUnit::squash()
performed the prediction tables update.

The effect is a slight MPKI improvement when using the o3
model. A further patch should perform the same modifications
for the indirect target predictor and BTB (less critical).

Signed-off-by: Jason Lowe-Power <jason@lowepower.com>
2016-12-21 15:07:16 -06:00
Arthur Perais
34065f8d5f cpu: correct comments in tournament branch predictor
The tournament predictor is presented as doing speculative
update of the global history and non-speculative update
of the local history used to generate the branch prediction.
However, the code does speculative update of both histories.

Signed-off-by: Jason Lowe-Power <jason@lowepower.com>
2016-12-21 15:06:13 -06:00
Arthur Perais
1664625db8 cpu: Resolve targets of predicted 'taken' decode for O3
The target of taken conditional direct branches does not
need to be resolved in IEW: the target can be computed at
decode, usually using the decoded instruction word and the PC.

The higher-than-necessary penalty is taken only on conditional
branches that are predicted taken but miss in the BTB. Thus,
this is mostly inconsequential on IPC if the BTB is big/associative
enough (fewer capacity/conflict misses). Nonetheless, what gem5
simulates is not representative of how conditional branch targets
can be handled.

Signed-off-by: Jason Lowe-Power <jason@lowepower.com>
2016-12-21 15:05:24 -06:00
Arthur Perais
e5fb6752d6 cpu: Clarify meaning of cachePorts variable in lsq_unit.hh of O3
cachePorts currently constrains the number of store packets written to the
D-Cache each cycle), but loads currently affect this variable. This leads
to unexpected congestion (e.g., setting cachePorts to a realistic 1 will
in fact allow a store to WB only if no loads have accessed the D-Cache
this cycle). In the absence of arbitration, this patch decouples how many
loads can be done per cycle from how many stores can be done per cycle.

Signed-off-by: Jason Lowe-Power <jason@lowepower.com>
2016-12-21 15:04:06 -06:00
Joel Hestness
3a656da1a6 ruby: Make MessageBuffers actually finite sized
When Ruby controllers stall messages in MessageBuffers, the buffer moves those
messages off the priority heap and into a per-address stall map. When buffers
are finite-sized, the test areNSlotsAvailable() only checks the size of the
priority heap, but ignores the stall map, so the map is allowed to grow
unbounded if the controller stalls numerous messages. This patch fixes the
problem by tracking the stall map size and testing the total number of messages
in the buffer appropriately.
2016-12-20 11:38:24 -06:00
Tony Gutierrez
3eb979a8ce ruby: fix typo in DMASequencer::ackCallback() 2016-12-20 11:53:36 -05:00
Tony Gutierrez
02cb6b19a7 ruby: fix issue with unused var in DMASequencer
the iterator declared in DMASequencer::ackCallback() is only used in an
assert, this causes clang to fail when building fast. here we move
the find call on the request table directly into the assert.
2016-12-20 11:47:30 -05:00
Curtis Dunham
b7d072b235 dist, dev: fix etherswitch upgrade script
The aforementioned upgrader in [1] assumes every option in [system]
has a delimiting '.', and also seems to do its rewriting work a bit too
unconditionally.  Most checkpoints in the wild don't have this device,
in which case this script should be a safe no-op.

[1] 2aa4d7b  dist, dev: Fixed the packet ordering in etherswitch

Change-Id: Icfd0350985109df1628eb9ab864cda42c54060a8
Reviewed-by: Gabor Dozsa <gabor.dozsa@arm.com>
2016-12-19 12:12:28 -06:00
Curtis Dunham
bc7ab6970a stats: update references 2016-12-19 11:03:28 -06:00
Curtis Dunham
f04d81163c arm: provide correct timer availability in ID_PFR1 register
Change-Id: Id4cd839c12b70616017a5830e3f9bbb59b0f97ba
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
2016-12-19 11:03:28 -06:00
Curtis Dunham
ae2e0ca3d0 arm: compute ID_AA64PFR{0,1}_EL1 registers
Compute the proper values of the aforementioned registers from
the system configuration rather than configuring the values themselves.

Change-Id: If9774b6610a29568b80ae4866107b9a6a5b5be0f
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
2016-12-19 11:03:28 -06:00
Curtis Dunham
a73937b60c arm: compute ID_PFR{0,1} registers
Compute the proper values of the aforementioned registers from
the system configuration rather than configuring the values themselves.

Change-Id: Ie7685b5d8b5f2dd9d6380b4af74f16d596b2bfd1
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
2016-12-19 11:03:27 -06:00
Curtis Dunham
282cf5807d arm: miscreg refactoring
Change-Id: I4e9e8f264a4a4239dd135a6c7a1c8da213b6d345
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
2016-12-19 11:03:27 -06:00
Curtis Dunham
9cf6bc444b arm: audit SCTLR
Change-Id: I814f1431a5f754f75721c9ac51171f860a714d24
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
2016-12-19 11:03:27 -06:00
Curtis Dunham
7ddb55a5f2 arm: remove SCTLR.FI
Removed from ARMARM.

Change-Id: Ie8f28e4fa6e1b46dfd9c8c4b379e5b42fe25421d
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
2016-12-19 11:03:27 -06:00
Curtis Dunham
19d90956eb arm: update AArch{64,32} register mappings
Change-Id: Idaaaeb3f7b1a0bdbf18d8e2d46686c78bb411317
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
2016-12-19 11:03:27 -06:00
Andreas Sandberg
bbd3703fbb mem: Make the BaseXBar public to not confuse Python wrappers
The Python wrappers generally assume that destructors are public. Make
the BaseXBar destructor public to avoid confusing the Python wrapper.

Change-Id: If958802409c0be74e875dd6e279742abfdb3ede1
Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Reviewed-by: Curtis Dunham <curtis.dunham@arm.com>
2016-12-19 16:25:40 +00:00
Andreas Sandberg
8702208f3f python: Export periodicStatDump
Some configuration scripts need periodic stat dumps. One of the ways
this can be achieved is by using the pariodicStatDump helper
function. This function was previously only exported in the internal
name space. Export it as a normal function in m5.stat instead.

Change-Id: Ic88bf1fd33042a62ab436d5944d8ed778264ac98
Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-by: Sascha Bischoff <sascha.bischoff@arm.com>
2016-12-19 16:25:39 +00:00
Andreas Sandberg
73627fa007 dev: Include DmaDevice in NULL builds
Builds for the NULL ISA include Device.py, which contains the Python
declaration of DmaDevice, but don't include the actual C++
implementation. Add dma_device.cc to the NULL build to the Python and
C++ worlds consistent again.

Change-Id: I47a57181a1f4d5a7276467678bf16fbc7f161681
Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-by: Sascha Bischoff <sascha.bischoff@arm.com>
2016-12-19 16:25:38 +00:00
Andreas Sandberg
d113153b52 python: Fix incorrect header in the DmaDevice wrapper
The header declared in the DmaDevice wrapper doesn't actually contain
the DmaDevice class. This can potentially lead to incorrect type cases
in Swig.

Change-Id: If2266d4180d1d6fd13359a81067068854c5e96fe
Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-by: Sascha Bischoff <sascha.bischoff@arm.com>
2016-12-19 16:25:38 +00:00
Andreas Sandberg
ac8e73565a sim: Remove redundant buildEnv import
Change-Id: Id6bdbc0c988aa92b96e292cabc913e6b974f14bb
Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-by: Curtis Dunham <curtis.dunham@arm.com>
2016-12-19 16:25:37 +00:00
Jieming Yin
b9c7b8190c ruby: Detect garnet network-level deadlock.
This patch detects garnet network deadlock by monitoring
network interfaces. If a network interface continuously
fails to allocate virtual channels for a message, a
possible deadlock is detected.
2016-12-15 16:59:17 -05:00
Brandon Potter
cc1f5a4d16 base: remove header file to prevent a macro name collision 2016-11-09 14:27:37 -06:00
Brandon Potter
cc84eb813c syscall_emul: implement fallocate 2016-12-15 13:16:25 -05:00
Brandon Potter
68e9c0e73b syscall_emul: add support for x86 statfs system calls 2016-12-15 13:16:03 -05:00
Brandon Potter
4ff1b165d0 syscall_emul: extend sysinfo system call to include mem_unit 2016-12-15 13:14:41 -05:00
Gabor Dozsa
ecf68fac40 dev: Fix race conditions at terminating dist-gem5 simulations
Two problems may arise when a distributed gem5 simulation terminates:
(i) simulation thread(s) may get stuck in an incomplete synchronisation
event which prohibits processing  the simulation exit event; and (ii) a
stale receiver thread may try to access objects that have already been
deleted while exiting gem5. This patch terminates receive threads properly
and aborts the processing of any incomplete synchronisation event.

Change-Id: I72337aa12c7926cece00309640d478b61e55a429
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
2016-12-06 17:33:06 +00:00
Gabor Dozsa
3ef797623a arm, config: Add missing IOCache in bL config
This patch adds an IOCache to the example bigLITTLE
configuration. An IOCache is required for correct DMA
transfers when we have caches in the system.

Change-Id: Ifeddc1b360aacbb16b1393f361dd98873c834012
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
2016-12-06 17:10:36 +00:00
Andreas Hansson
c642d6fc37 ruby: Remove RubyMemoryControl and associated files
This patch removes the deprecated RubyMemoryControl. The DRAMCtrl
module should be used instead.
2016-12-05 16:49:07 -05:00
Andreas Hansson
ebd9018a13 stats: Update stats to reflect cache changes 2016-12-05 16:48:34 -05:00
Nikos Nikoleris
9e57e4e89d config: Add an option to generate a random topology in memcheck
This change adds the option to use the memcheck with random memory
hierarchies at the moment limited to a maximum depth of 3 allowing
testing with uncommon topologies.

Change-Id: Id2c2fe82a8175d9a67eb4cd7f3d2e2720a809b60
Reviewed-by: Andreas Hansson <andreas.hansson@arm.com>
2016-12-05 16:48:31 -05:00
Nikos Nikoleris
c1a40f9e44 config: Add whole line accesses to improve memchecker's coverage
Change-Id: Ie1a047139e350ce7400f3a20be644eaff1e21428
Reviewed-by: Andreas Hansson <andreas.hansson@arm.com>
2016-12-05 16:48:30 -05:00
Nikos Nikoleris
0054f1ad53 mem: Respond to InvalidateReq when the block is (pending) dirty
Previously when an InvalidateReq snooped a cache with a dirty block or
a pending modified MSHR, it would invalidate the block or set the
postInv flag. The cache would not send an InvalidateResp. though,
causing memory order violations. This patches changes this behavior,
making the cache with the dirty block or pending modified MSHR the
ordering point.

Change-Id: Ib4c31012f4f6693ffb137cd77258b160fbc239ca
Reviewed-by: Andreas Hansson <andreas.hansson@arm.com>
2016-12-05 16:48:29 -05:00
Nikos Nikoleris
9916e4276c mem: Invalidate a blk when servicing the 1st invalidating target
Previously an MSHR with one or more invalidating targets would first
service all targets in the MSHR TargetList and then invalidate the
block. As a result any service snooping targets would lookup in the
cache and incorrectly find the block. This patch forces the
invalidation to happen when the first invalidating target is
encountered.

Change-Id: I9df15de24e1d351cd96f5a2c424d9a03d81c2cce
Reviewed-by: Andreas Hansson <andreas.hansson@arm.com>
2016-12-05 16:48:28 -05:00
Nikos Nikoleris
77dfeb8c09 mem: Allow non invalidating snoops on an InvalidateReq MSHR
This patch changes an assertion that previously assumed that a non
invalidating snoop request should never be serviced by an
InvalidateReq MSHR. The MSHR serves as the ordering point for the
snooping packet. When the InvalidateResp reaches the cache the
snooping packet snoops the caches above to find the requested
block. One or more of the caches above will have the block since
earlier it has seen a WriteLineReq.

Change-Id: I0c147c8b5d5019e18bd34adf9af0fccfe431ae07
Reviewed-by: Andreas Hansson <andreas.hansson@arm.com>
2016-12-05 16:48:27 -05:00