Commit graph

7211 commits

Author SHA1 Message Date
Andreas Sandberg
c661cc75ec arm: Enable LPAE support by default
LPAE has been tested with Linux 4.4 and seems to work just fine. Let's
enable it by default.

Change-Id: Id88c6e3c91ae9c353279d42f2aa1f8a78485bd32
Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-by: Gabor Dozsa <gabor.dozsa@arm.com>
2016-05-31 12:14:40 +01:00
Andreas Sandberg
9d4a42e8c5 arm: Correctly check translation mode (aarch64/aarch32)
According to the ARM ARM (see AArch32.TranslateAddress in the
pseudocode library), the TLB should be operating in aarch64 mode if
the EL0 is aarch32 and EL1 is aarch64. This is currently not the case
in gem5, which breaks 64/32 interprocessing. Update the check to match
the reference manual.

Change-Id: I6f1444d57c0e2eb5f8880f513f33a9197b7cb2ce
Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-by: Mitch Hayenga <mitch.hayenga@arm.com>
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
2016-05-31 12:14:37 +01:00
Andreas Hansson
be014d4338 scons: Bump minimum gcc version to 4.8
After reaching consensus on the mailing list, this patch officially
makes gcc 4.8 the minimum.

A few checks in the SConstruct are cleaned up as a result. This patch
also adds "-fno-omit-frame-pointer" when using ASAN (which is part of
the gcc/clang  recommended flags).
2016-05-30 02:10:48 -04:00
Ilias Vougioukas
7c8d6e3660 cpu: fix lastStopped unserialisation
MinorCPU fix for corrupt numCycles when resuming from a previous simulation.
---
 src/cpu/minor/cpu.cc | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)
2016-05-27 16:55:01 +01:00
Akash Bagdia
07452aebf5 power: Allow voltage to be configured via cmd line
---
 src/python/m5/params.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
2016-05-27 16:54:59 +01:00
Andreas Sandberg
4a6bb82123 arm: Use the target EL state when determining fault format
We currently check the current state instead of the state of the
target EL when determining how we report a fault. This breaks
interprocessing since EL0 in aarch32 would report its fault status
using the aarch32 registers even if EL1 is in aarch64. Fix this to
report the fault using the format of the target EL.

Change-Id: Ic080267ac210783d1e01c722a4ddaa687dce280e
Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-by: Mitch Hayenga <mitch.hayenga@arm.com>
2016-05-27 15:02:01 +01:00
Andreas Sandberg
2ace05044c arm: Fix incorrect TLB permission check in aarch32
The TLB currently assumes that the pxn bit in an LPAE page descriptor
disables execution from unprivileged mode. However, according to the
architecture manual, this bit should disable execution from privileged
modes. Update the TLB implementation to reflect this behavior.

Change-Id: I7f1bb232d7a94a93fd601a9230223195ac952947
Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
2016-05-26 17:38:15 +01:00
Andreas Sandberg
e360f4db07 arm: Make EL checks available in SE mode
A lot of code assumes that it is possible to test what the highest EL
is and if it is 64 bit. These calls currently don't work in SE mode
since they rely on an instance of an ArmSystem.

Change-Id: I0d1f261926a66ce3dc4fa116845ffb2a081446f2
Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Reviewed-by: Nathanael Premillieu <nathanael.premillieu@arm.com>
2016-05-26 17:33:38 +01:00
Andreas Hansson
e3e808416f mem: Fix memory leak in handling of deferred snoops
This patch fixes a memory leak where deferred snoop packets never got
deallocated. On the call to MSHR::handleSnoop these snoops were
treated as if a response will be sent, as the MSHR was
pendingModified. Consequently, a copy of the packet was created and
added to the MSHR targets. However, an preceeding target to the same
MSHR, originally from a CPU, was serviced before the snoop, and caused
the block to be invalidated. This happens for ReadExReq and
UpgradeReq.

Note that the original snoop will receive a response, just not from
the cache in question, but instead from the cache upstream that issued
the ReadExReq or UpgradeReq.

Change-Id: I4ac012fbc8a46cf693ca390fe9476105d444e6f4
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
2016-05-26 11:56:24 +01:00
Andreas Sandberg
4d577ac8f1 dev, arm: Add a flag to enable/disable gem5 GIC extensions
Make it possible to disable gem5 gic extensions by setting the
gem5_extensions param to False from Python.

Change-Id: Icb255105925ef49891d69cc9fe5cc55578ca066d
Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-by: Geoffrey Blake <geoffrey.blake@arm.com>
2016-05-26 11:56:24 +01:00
Andreas Hansson
d023b7e8db cpu: Add a basic progress check to the TrafficGen
This patch adds a progress check to the TrafficGen so that it is
easier to detect deadlock scenarios where the generator gets stuck
waiting for a retry, and makes no further progress.

Change-Id: Ifb8779ad0939f52c0518d0e867bac73f99b82e2b
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-by: Radhika Jagtap <radhika.jagtap@arm.com>
2016-05-26 11:56:24 +01:00
Andreas Hansson
4ff4f9c531 mem: Do not set cacheResponding on MSHR snoop if not responding
This patch changes the flow control for HSHR::handleSnoop to ensure
that we only set cacheResponding on the snoop packet if we are
actually responding. This avoids situations where a responder is
stalling indefinitely on a response that never arrives.

Change-Id: I691dd01755b614b30203581aa74fc743b350eacc
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
2016-05-26 11:56:24 +01:00
Andreas Hansson
90de9be2ef mem: Fix MemChecker unique_ptr type mismatch
This patch fixes the type of the unique_ptr instances, to ensure that
the data that is allocated with new[] is also deleted with
delete[]. The issue was highlighted by ASAN.

Change-Id: I2c5510424959d862a9954d83e728d901bb18d309
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Reviewed-by: Curtis Dunham <curtis.dunham@arm.com>
Reviewed-by: Stephan Diestelhorst <stephan.diestelhorst@arm.com>
2016-05-26 11:56:24 +01:00
Andreas Hansson
7dc5034ff2 arm: Fix heap overflow issue in Neon64Load operation
This patch fixes an issue identified by ASAN where the Neon64Load
operation assumes the packet always contains 16 bytes.

Change-Id: If24a7e461d60cb80970dfbe61d923d7d56926698
Reviewed-by: Giacomo Gabrielli <giacomo.gabrielli@arm.com>
Reviewed-by: Curtis Dunham <curtis.dunham@arm.com>
2016-05-26 11:56:24 +01:00
Andreas Hansson
4a3e2156ac arm, dev: Remove superfluous loop increment in flash device
As identified by clang-3.8, there was a superfluous loop increment in
the flash device which is now removed.

Change-Id: If46a1c4f72d3d4c9f219124030894ca433c790af
Reviewed-by: Rene De Jong <rene.dejong@arm.com>
2016-05-26 11:56:24 +01:00
Nikos Nikoleris
a69a0f33cb mem: fix headers include order in the cache related classes
Change-Id: Ia57cc104978861ab342720654e408dbbfcbe4b69
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
2016-05-26 11:56:24 +01:00
Nikos Nikoleris
f9d62b63e1 mem: remove redudant check whether the cache forwards snoops
Change-Id: I57b56771086e1e2f512977fb7248d93c171ab925
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
2016-05-26 11:56:24 +01:00
Nikos Nikoleris
d68f3577d6 mem: change NULL to nullptr in the cache related classes
Change-Id: I5042410be54935650b7d05c84d8d9efbfcc06e70
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
2016-05-26 11:56:24 +01:00
Nikos Nikoleris
90bf50b4c7 mem: fix the line length in the cache related classes
Change-Id: I6d1feb164a958dde0da87a1cd2698096112c4a82
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
2016-05-26 11:56:24 +01:00
Bjoern A. Zeeb
7fc668fae9 config, x86: Properly space pad the X86IntelMPBus Entry descriptions
According to the Intel Multi Processor Specification rev 1.4 (-006) (*),
section 4.3.2 Bus Entries, Bus type strings are >>6-character ASCII
(blank-filled) strings<<.
This patch properly pads the entries with the missing spaces at the end.

(*) http://www.intel.com/design/pentium/datashts/24201606.pdf

Committed by Jason Lowe-Power <power.jg@gmail.com>
2016-05-19 15:19:35 -05:00
Bjoern A. Zeeb
a6b00c07f6 arm,dev: PL011 UART_FR read status enhancement
Given we do not simulate a FIFO currently there are only two states
we can be in upon read: empty or full.  Properly signal the latter.

Add and sort constants for states in the header file.

Committed by Jason Lowe-Power <power.jg@gmail.com>
2016-05-19 15:19:35 -05:00
Bjoern A. Zeeb
5fa6b68981 x86, dev: properly space the APIC registers
Registers are 0x10 and not 0x8 apart.  The latter leads to invalid
calculations of index in array which in turn means that we will not
find the interrupt we were looking (been notified) for in the OS.

Committed by Jason Lowe-Power <power.jg@gmail.com>
2016-05-19 15:19:35 -05:00
Bjoern A. Zeeb
ce610dcab1 dev, virtio: properly set PCI address space to use IOREG
VirtIO spec < 1.0 demands IOREG to be used on PCI and not memory mapped.
Set the correct bit on the PCI address accordingly.

Committed by Jason Lowe-Power <power.jg@gmail.com>
2016-05-19 15:19:34 -05:00
Tony Gutierrez
7dad4377ec gpu-compute: fix bug in GPUDynInst::isScalarRegister() 2016-05-16 15:36:24 -04:00
Tony Gutierrez
bb83fa2051 gpu-compute: fix spacing in GPUDynInst ctor 2016-05-06 17:00:54 -04:00
Tony Gutierrez
4f3139e696 gpu-compute: fix uninitialized member bug in GPUDynInst
the n_reg field in the GPUDynInst is not currently set in the constructor.
if it is not set externally, there are assertion failures that may occur
if the random value it gets is just right. here we set it to 0 by default.
2016-05-06 16:44:38 -04:00
Andreas Sandberg
84cfa10b15 dev, arm: Update GIC to use GICv2 register naming
The GICv2 has a new and slightly more consistent register
naming. Update gem5's GIC register names to match the new
documentation.

Change-Id: I8ef114eee8a95bf0b88b37c18a18e137be78675a
Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-by: Curtis Dunham <curtis.dunham@arm.com>
2016-05-06 15:52:34 +01:00
Andreas Sandberg
53f58b5fc1 arm: Remove BreakPCEvent on guest kernel panic
The LinuxArmSystem class normally provides support for panicing gem5
if the simulated kernel panics. When this is turned off (default),
gem5 uses a BreakPCEvent to provide a debugger hook into the simulator
when the kernel crashes. This hook unconditionally kills gem5 with a
SIGTRAP unless gem5 is compiled in fast mode. This is undesirable
since the panic_on_panic param already provides similar functionality.

Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
2016-04-27 15:34:58 +01:00
Andreas Sandberg
f1575fdc4a kvm, arm: Make GIC interrupt lines configurable
Add support for overriding the number of interrupt lines in the ARM
KvmGic.

Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-by: Radhika Jagtap <radhika.jagtap@arm.com>
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
2016-04-27 15:34:48 +01:00
Andreas Sandberg
d5e7892350 kvm, arm: Refactor KVM GIC device
Factor out the kernel device wrapper from the KvmGIC and put it in a
separate class. This will simplify a future kernel/gem5 hybrid GIC.

Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-by: Radhika Jagtap <radhika.jagtap@arm.com>
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
2016-04-27 15:34:31 +01:00
Andreas Sandberg
6d74892b38 dev: Fix incorrect terminal backlog handling
The Terminal device currently uses the peek functionality in gem5's
circular buffer implementation to send existing buffered content on
the terminal when a new client attaches. This functionallity is
however not implemented correctly and re-sends the same block multiple
time.

Add the required functionality to peek with an offset into the
circular buffer and change the Terminal::accept() implementation to
send the buffered contents.

Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-by: Radhika Jagtap <radhika.jagtap@arm.com>
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
2016-04-27 15:33:58 +01:00
Matthew Poremba
67e93a5846 ruby: Rename pkt to m_pkt so it may be accessed via SLICC
Allow usage of packet class in ruby for convenience purposes. This may be
used to access members of the packet/request class (e.g., via helper
functions) and/or push protocol specific information to the packets
SenderState without needing to modify SLICC types and protocols in multiple
locations.
2016-04-26 12:07:51 -04:00
Andreas Hansson
5a1dea51d2 mem: Include WriteLineReq in cache demand stats
Somehow the WriteLineReq were never added to the list of commands
considered demand.
2016-04-21 04:48:20 -04:00
Andreas Hansson
a7c94f6e69 mem: Remove unused cache stats
Prune cache stats that are never actually used.
2016-04-21 04:48:19 -04:00
Andreas Hansson
13b9d4215d mem: Deallocate all write-queue entries when sent
This patch removes the write-queue entry tracking previously used for
uncacheable writes. The write-queue entry is now deallocated as soon
as the packet is sent. As a result we also forego the stats for
uncacheable writes. Additionally, there is no longer a need to attach
the write-queue entry to the packet.
2016-04-21 04:48:07 -04:00
Andreas Hansson
6c92ee49f1 mem: Align downstream cache packet creation in atomic and timing
This patch makes the control flow more uniform in atomic and timing,
ultimately making the code easier to understand.
2016-04-21 04:48:06 -04:00
Joel Hestness
39e10ced03 ruby: Fix block_on behavior
Ruby's controller block_on behavior aimed to block MessageBuffer requests into
SLICC controllers when a Locked_RMW was in flight. Unfortunately, this
functionality only partially works: When non-Locked_RMW memory accesses are
issued to the sequencer to an address with an in-flight Locked_RMW, the
sequencer may pass those accesses through to the controller. At the controller,
a number of incorrect activities can occur depending on the protocol. In
MOESI_hammer, for example, an intermediate IFETCH will cause an L1D to L2
transfer, which cannot be serviced, because the block_on functionality blocks
the trigger queue, resulting in a deadlock. Further, if an intermediate store
arrives (e.g. from a separate SMT thread), the sequencer allows the request
through to the controller, and the atomicity of the Locked_RMW may be broken.

To avoid these problems, disallow the Sequencer from passing any memory
accesses to the controller besides Locked_RMW_Write when a Locked_RMW is in-
flight.
2016-04-15 12:34:02 -05:00
Bjoern A. Zeeb
edbf748181 arm,dev: remove PMU assertion hit on reset
Remve the assertion that we always need to add a delta larger than
zero as that does not seem to be true when we hit it in the
'PMU reset cycle counter to zero' case.

Committed by Jason Lowe-Power <power.jg@gmail.com>
2016-04-15 10:03:03 -05:00
Bjoern A. Zeeb
bc45e930e4 mem: FreeBSD does not provide MAP_NORESERVE either
Like OS X, FreeBSD does not support MAP_NORESERVE.
Handle accordingly and update comment.

Committed by Jason Lowe-Power <power.jg@gmail.com>
2016-04-15 10:02:58 -05:00
Andreas Hansson
8127c4e7bf misc: Fix issues flagged by gcc 6
A few warnings (and thus errors) pop up after being added to -Wall:

1. -Wmisleading-indentation

In the auto-generated code there were instances of if/else blocks that
were not indented to gcc's liking. This is addressed by adding braces.

2. -Wshift-negative-value

gcc is clever enougn to consider ~0 a negative constant, and
rightfully complains. This is addressed by using mask() which
explicitly casts to unsigned before shifting.

That is all. Porting done.
2016-04-13 12:13:44 -04:00
Andreas Hansson
4b802a09c5 misc: Appease clang...again
Once again, clang is having issues with recently committed code.

Unfortunately HSAIL_X86 is still broken.
2016-04-12 05:28:39 -04:00
Rekai Gonzalez Alberquilla
af27586fbc mem: Add priority to QueuedPrefetcher
Queued prefetcher entries now count with a priority field. The idea is to
add packets ordered by priority and then by age.

For the existing algorithms in which priority doesn't make sense, it is set
to 0 for all deferred packets in the queue.
2016-04-07 11:32:38 -05:00
Rekai Gonzalez Alberquilla
dad7d9277b mem: Handful extra features for BasePrefetcher
Some common functionality added to the base prefetcher, mainly dealing with
extracting the block address, page address, block index inside the page and
some other information that can be inferred from the block address. This is
used for some prefetching algorithms, and having the methods in the base,
as well as the block size and other information is the sensible way.
2016-04-07 11:32:38 -05:00
Victor Garcia
df5a811833 mem: Add Program Counter to MemTraceProbe 2016-04-07 11:32:38 -05:00
Rekai Gonzalez Alberquilla
a3bf4aa6ec mem: Add unused prefetch counter in caches
Added stat to the cache to account for HardPF'ed blocks that are evicted
before being referenced (over-prefetching).
2015-05-27 13:50:01 +01:00
Mitch Hayenga
c75ff71139 mem: Remove threadId from memory request class
In general, the ThreadID parameter is unnecessary in the memory system
as the ContextID is what is used for the purposes of locks/wakeups.
Since we allocate sequential ContextIDs for each thread on MT-enabled
CPUs, ThreadID is unnecessary as the CPUs can identify the requesting
thread through sideband info (SenderState / LSQ entries) or ContextID
offset from the base ContextID for a cpu.

This is a re-spin of 20264eb after the revert (bd1c6789) and includes
some fixes of that commit.
2016-04-07 09:30:20 -05:00
Mitch Hayenga
d99deff8ea cpu: Implement per-thread GHRs
Branch predictors that use GHRs should index them on a
per-thread basis.  This makes that so.

This is a re-spin of fb51231 after the revert (bd1c6789).
2016-04-05 12:20:19 -05:00
Mitch Hayenga
0fd4bb7f12 cpu: Add an indirect branch target predictor
This patch adds a configurable indirect branch predictor that can be indexed
by a combination of GHR and path history hashes. Implements the functionality
described in:

"Target prediction for indirect jumps" by Chang, Hao, and Patt
http://dl.acm.org/citation.cfm?id=264209

This is a re-spin of fb9d142 after the revert (bd1c6789).
2016-04-05 11:48:37 -05:00
Mitch Hayenga
3f6874cb29 cpu: Fix BTB threading oversight
The extant BTB code doesn't hash on the thread id but does check the
thread id for 'btb hits'.  This results in 1-thread of a multi-threaded
workload taking a BTB entry, and all other threads missing for the same branch
missing.
2016-04-05 11:44:27 -05:00
Sascha Bischoff
1097aa1638 misc: Bail out of DVFS dot if we cannot resolve the domains
This changeset updates the dot output to bail out if it is unable to
resolve the voltage or clock domains (which will cause it to raise an
AttributeError). Additionally, the DVFS dot output is disabled by
default for speed purposes.

Minor fixup for 0aeca8f.
2016-04-06 17:55:17 +01:00