Commit graph

61 commits

Author SHA1 Message Date
Fernando Endo 6c72c35519 cpu, arm: Distinguish Float* and SimdFloat*, create FloatMem* opClass
Modify the opClass assigned to AArch64 FP instructions from SimdFloat* to
Float*. Also create the FloatMemRead and FloatMemWrite opClasses, which
distinguishes writes to the INT and FP register banks.
Change the latency of (Simd)FloatMultAcc to 5, based on the Cortex-A72,
where the "latency" of FMADD is 3 if the next instruction is a FMADD and
has only the augend to destination dependency, otherwise it's 7 cycles.

Signed-off-by: Jason Lowe-Power <jason@lowepower.com>
2016-10-15 14:58:45 -05:00
Andreas Sandberg 2c05f5207d cpu: Add missing override in Minor's exec context
Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com>
2016-08-15 12:00:37 +01:00
Reiley Jeapaul ff8257c7c2 cpu: Fixed clang errors. Added 'override' keyword for virtual functions.
Change-Id: Ic37311443ca11ee6d95bceffea599e054e7aa110
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
2016-08-15 12:00:36 +01:00
Nikos Nikoleris 698767e538 cpu, arch: fix the type used for the request flags
Change-Id: I183b9942929c873c3272ce6d1abd4ebc472c7132
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
2016-08-15 12:00:35 +01:00
Mitch Hayenga 752f1c1fe9 cpu: Fix Minor SMT WFI/drain interaction issues
The behavior of WFI is to cause minor to cease evaluating
pipeline logic until an interrupt is observed, however
a user may wish to drain the system while a core is sleeping
due to a WFI.  This patch makes WFI drain.  If an actual
drain occurs during a WFI, the CPU is already drained and will
immediately be ready for swapping, checkpointing, etc.  This
should not negatively impact performance as WFI instructions
are 'stream-changing' (treated like unpredicted branches), so
all remaining instructions are wrong-path and will be squashed
rapidly.

Change-Id: I63833d5acb53d8dde78f9f0c9611de0ece385e45
2016-07-21 17:19:16 +01:00
Mitch Hayenga ff4009ac00 cpu: Add SMT support to MinorCPU
This patch adds SMT support to the MinorCPU.  Currently
RoundRobin or Random thread scheduling are supported.

Change-Id: I91faf39ff881af5918cca05051829fc6261f20e3
2016-07-21 17:19:16 +01:00
David Guillen Fandos fb5fc11da4 pwr: Low-power idle power state for idle CPUs
Add functionality to the BaseCPU that will put the entire CPU
into a low-power idle state whenever all threads in it are idle.

Change-Id: I984d1656eb0a4863c87ceacd773d2d10de5cfd2b
2016-06-06 17:16:43 +01:00
Ilias Vougioukas 7c8d6e3660 cpu: fix lastStopped unserialisation
MinorCPU fix for corrupt numCycles when resuming from a previous simulation.
---
 src/cpu/minor/cpu.cc | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)
2016-05-27 16:55:01 +01:00
Mitch Hayenga c75ff71139 mem: Remove threadId from memory request class
In general, the ThreadID parameter is unnecessary in the memory system
as the ContextID is what is used for the purposes of locks/wakeups.
Since we allocate sequential ContextIDs for each thread on MT-enabled
CPUs, ThreadID is unnecessary as the CPUs can identify the requesting
thread through sideband info (SenderState / LSQ entries) or ContextID
offset from the base ContextID for a cpu.

This is a re-spin of 20264eb after the revert (bd1c6789) and includes
some fixes of that commit.
2016-04-07 09:30:20 -05:00
Andreas Sandberg be28d96510 Revert power patch sets with unexpected interactions
The following patches had unexpected interactions with the current
upstream code and have been reverted for now:

e07fd01651f3: power: Add support for power models
831c7f2f9e39: power: Low-power idle power state for idle CPUs
4f749e00b667: power: Add power states to ClockedObject

Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com>

--HG--
extra : amend_source : 0b6fb073c6bbc24be533ec431eb51fbf1b269508
2016-04-06 19:43:31 +01:00
Mitch Hayenga 8615b27174 mem: Remove threadId from memory request class
In general, the ThreadID parameter is unnecessary in the memory system
as the ContextID is what is used for the purposes of locks/wakeups.
Since we allocate sequential ContextIDs for each thread on MT-enabled
CPUs, ThreadID is unnecessary as the CPUs can identify the requesting
thread through sideband info (SenderState / LSQ entries) or ContextID
offset from the base ContextID for a cpu.
2016-04-05 12:39:21 -05:00
Akash Bagdia 1c34ee20df power: Low-power idle power state for idle CPUs
Add functionality to the BaseCPU that will put the entire CPU into a low-power
idle state whenever all threads in it are idle.
2014-12-09 10:42:08 +00:00
Mitch Hayenga 85dadcd381 cpu: Add instruction opclass histogram to minor 2016-04-05 08:08:12 -05:00
Krishnendra Nathella cabd4768c7 cpu: Fix LLSC atomic CPU wakeup
Writes to locked memory addresses (LLSC) did not wake up the locking
CPU. This can lead to deadlocks on multi-core runs. In AtomicSimpleCPU,
recvAtomicSnoop was checking if the incoming packet was an invalidation
(isInvalidate) and only then handled a locked snoop. But, writes are
seen instead of invalidates when running without caches (fast-forward
configurations). As as simple fix, now handleLockedSnoop is also called
even if the incoming snoop packet are from writes.
2015-07-19 15:03:30 -05:00
Andreas Hansson 0d50979888 misc: Add missing overrides to appease clang
Since the last round of fixes a few new issues have snuck in. We
should consider switching the regression runs to clang.
2016-02-15 03:40:32 -05:00
Andreas Hansson fbdeb60316 mem: Deduce if cache should forward snoops
This patch changes how the cache determines if snoops should be
forwarded from the memory side to the CPU side. Instead of having a
parameter, the cache now looks at the port connected on the CPU side,
and if it is a snooping port, then snoops are forwarded. Less error
prone, and less parameters to worry about.

The patch also tidies up the CPU classes to ensure that their I-side
port is not snooping by removing overrides to the snoop request
handler, such that snoop requests will panic via the default
MasterPort implement
2016-02-10 04:08:24 -05:00
Steve Reinhardt 5592798865 style: fix missing spaces in control statements
Result of running 'hg m5style --skip-all --fix-control -a'.
2016-02-06 17:21:19 -08:00
Steve Reinhardt 1b6355c895 cpu. arch: add initiateMemRead() to ExecContext interface
For historical reasons, the ExecContext interface had a single
function, readMem(), that did two different things depending on
whether the ExecContext supported atomic memory mode (i.e.,
AtomicSimpleCPU) or timing memory mode (all the other models).
In the former case, it actually performed a memory read; in the
latter case, it merely initiated a read access, and the read
completion did not happen until later when a response packet
arrived from the memory system.

This led to some confusing things, including timing accesses
being required to provide a pointer for the return data even
though that pointer was only used in atomic mode.

This patch splits this interface, adding a new initiateMemRead()
function to the ExecContext interface to replace the timing-mode
use of readMem().

For consistency and clarity, the readMemTiming() helper function
in the ISA definitions is renamed to initiateMemRead() as well.
For x86, where the access size is passed in explicitly, we can
also get rid of the data parameter at this level.  For other ISAs,
where the access size is determined from the type of the data
parameter, we have to keep the parameter for that purpose.
2016-01-17 18:27:46 -08:00
Andreas Hansson 2ac04c11ac misc: Add explicit overrides and fix other clang >= 3.5 issues
This patch adds explicit overrides as this is now required when using
"-Wall" with clang >= 3.5, the latter now part of the most recent
XCode. The patch consequently removes "virtual" for those methods
where "override" is added. The latter should be enough of an
indication.

As part of this patch, a few minor issues that clang >= 3.5 complains
about are also resolved (unused methods and variables).
2015-10-12 04:08:01 -04:00
Andreas Hansson 22c04190c6 misc: Remove redundant compiler-specific defines
This patch moves away from using M5_ATTR_OVERRIDE and the m5::hashmap
(and similar) abstractions, as these are no longer needed with gcc 4.7
and clang 3.1 as minimum compiler versions.
2015-10-12 04:07:59 -04:00
Mitch Hayenga 9e07a7504c cpu,isa,mem: Add per-thread wakeup logic
Changes wakeup functionality so that only specific threads on SMT
capable cpus are woken.
2015-09-30 11:14:19 -05:00
Mitch Hayenga a5c4eb3de9 isa,cpu: Add support for FS SMT Interrupts
Adds per-thread interrupt controllers and thread/context logic
so that interrupts properly get routed in SMT systems.
2015-09-30 11:14:19 -05:00
Mitch Hayenga fafa83ed32 cpu: Add per-thread monitors
Adds per-thread address monitors to support FullSystem SMT.
2015-09-30 11:14:19 -05:00
Andreas Hansson daaae3744d mem: Reflect that packet address and size are always valid
This patch simplifies the packet, and removes the possibility of
creating a packet without a valid address and/or size. Under no
circumstances are these fields set at a later point, and thus they
really have to be provided at construction time.

The patch also fixes a case there the MinorCPU creates a packet
without a valid address and size, only to later delete it.
2015-08-21 07:03:27 -04:00
Andreas Sandberg 53e777d683 base: Declare a type for context IDs
Context IDs used to be declared as ad hoc (usually as int). This
changeset introduces a typedef for ContextIDs and a constant for
invalid context IDs.
2015-08-07 09:59:13 +01:00
Andreas Sandberg f789d729b5 cpu: Update debug message from Fetch1 isDrained() in Minor
Fix a spurious %s and include the state of the Fetch1 stage in the
debug printout.
2015-07-31 17:04:59 +01:00
Andreas Sandberg f73b05431a cpu: Fix Minor drain issues when switched out
The Minor CPU currently doesn't drain properly when it is switched
out. This happens because Fetch 1 expects to be in the FetchHalted
state when it is drained. However, because the CPU is switched out, it
is stuck in the FetchWaitingForPC state. Fix this by ignoring drain
requests and returning DrainState::Drained from MinorCPU::drain() if
the CPU is switched out. This is always safe since a switched out CPU,
by definition, doesn't have any instructions in flight.
2015-07-31 17:04:59 +01:00
Andreas Sandberg ff8195235e cpu: Only activate thread 0 in Minor if the CPU is active
Minor currently activates thread 0 in startup() to work around an
issue where activateContext() is called from LiveProcess before the
process entry point is known. When activateContext() is called, Minor
creates a branch instruction to the process's entry point. The first
time it is called, the branch points to an undefined location (0). The
call in startup() updates the branch to point to the actual entry
point.

When instantiating a switched out Minor CPU, it still tries to
activate thread 0. This is clearly incorrect since a switched out CPU
can't have any active threads. This changeset adds a check to ensure
that the thread is active before reactivating it.
2015-07-30 10:15:50 +01:00
Andreas Sandberg 473a0dcc63 cpu: Fix drain issues in the Minor CPU
The drain refactor patches introduced a couple of bugs in the way
Minor handles draining. This patch fixes an incorrect assert and a
case of infinite recursion when the CPU signals drain done.
2015-07-30 10:15:50 +01:00
Nilay Vaish aafa5c3f86 revert 5af8f40d8f2c 2015-07-28 01:58:04 -05:00
Nilay Vaish 608641e23c cpu: implements vector registers
This adds a vector register type.  The type is defined as a std::array of a
fixed number of uint64_ts.  The isa_parser.py has been modified to parse vector
register operands and generate the required code.  Different cpus have vector
register files now.
2015-07-26 10:21:20 -05:00
Andreas Sandberg ed38e3432c sim: Refactor and simplify the drain API
The drain() call currently passes around a DrainManager pointer, which
is now completely pointless since there is only ever one global
DrainManager in the system. It also contains vestiges from the time
when SimObjects had to keep track of their child objects that needed
draining.

This changeset moves all of the DrainState handling to the Drainable
base class and changes the drain() and drainResume() calls to reflect
this. Particularly, the drain() call has been updated to take no
parameters (the DrainManager argument isn't needed) and return a
DrainState instead of an unsigned integer (there is no point returning
anything other than 0 or 1 any more). Drainable objects should return
either DrainState::Draining (equivalent to returning 1 in the old
system) if they need more time to drain or DrainState::Drained
(equivalent to returning 0 in the old system) if they are already in a
consistent state. Returning DrainState::Running is considered an
error.

Drain done signalling is now done through the signalDrainDone() method
in the Drainable class instead of using the DrainManager directly. The
new call checks if the state of the object is DrainState::Draining
before notifying the drain manager. This means that it is safe to call
signalDrainDone() without first checking if the simulator has
requested draining. The intention here is to reduce the code needed to
implement draining in simple objects.
2015-07-07 09:51:05 +01:00
Andreas Sandberg e9c3d59aae sim: Make the drain state a global typed enum
The drain state enum is currently a part of the Drainable
interface. The same state machine will be used by the DrainManager to
identify the global state of the simulator. Make the drain state a
global typed enum to better cater for this usage scenario.
2015-07-07 09:51:04 +01:00
Andreas Sandberg 76cd4393c0 sim: Refactor the serialization base class
Objects that are can be serialized are supposed to inherit from the
Serializable class. This class is meant to provide a unified API for
such objects. However, so far it has mainly been used by SimObjects
due to some fundamental design limitations. This changeset redesigns
to the serialization interface to make it more generic and hide the
underlying checkpoint storage. Specifically:

  * Add a set of APIs to serialize into a subsection of the current
    object. Previously, objects that needed this functionality would
    use ad-hoc solutions using nameOut() and section name
    generation. In the new world, an object that implements the
    interface has the methods serializeSection() and
    unserializeSection() that serialize into a named /subsection/ of
    the current object. Calling serialize() serializes an object into
    the current section.

  * Move the name() method from Serializable to SimObject as it is no
    longer needed for serialization. The fully qualified section name
    is generated by the main serialization code on the fly as objects
    serialize sub-objects.

  * Add a scoped ScopedCheckpointSection helper class. Some objects
    need to serialize data structures, that are not deriving from
    Serializable, into subsections. Previously, this was done using
    nameOut() and manual section name generation. To simplify this,
    this changeset introduces a ScopedCheckpointSection() helper
    class. When this class is instantiated, it adds a new /subsection/
    and subsequent serialization calls during the lifetime of this
    helper class happen inside this section (or a subsection in case
    of nested sections).

  * The serialize() call is now const which prevents accidental state
    manipulation during serialization. Objects that rely on modifying
    state can use the serializeOld() call instead. The default
    implementation simply calls serialize(). Note: The old-style calls
    need to be explicitly called using the
    serializeOld()/serializeSectionOld() style APIs. These are used by
    default when serializing SimObjects.

  * Both the input and output checkpoints now use their own named
    types. This hides underlying checkpoint implementation from
    objects that need checkpointing and makes it easier to change the
    underlying checkpoint storage code.
2015-07-07 09:51:03 +01:00
Andrew Bardsley cea1d14a93 cpu: Fix a bug in counting issued instructions in MinorCPU
The MinorCPU would count bubbles in Execute::issue as part of
the num_insts_issued and so sometimes reach the instruction
issue limit incorrectly.

Fixed by checking for a bubble in one new place.
2015-05-26 03:21:37 -04:00
Andreas Sandberg 48281375ee mem, cpu: Add a separate flag for strictly ordered memory
The Request::UNCACHEABLE flag currently has two different
functions. The first, and obvious, function is to prevent the memory
system from caching data in the request. The second function is to
prevent reordering and speculation in CPU models.

This changeset gives the order/speculation requirement a separate flag
(Request::STRICT_ORDER). This flag prevents CPU models from doing the
following optimizations:

    * Speculation: CPU models are not allowed to issue speculative
      loads.

    * Write combining: CPU models and caches are not allowed to merge
      writes to the same cache line.

Note: The memory system may still reorder accesses unless the
UNCACHEABLE flag is set. It is therefore expected that the
STRICT_ORDER flag is combined with the UNCACHEABLE flag to prevent
this behavior.
2015-05-05 03:22:33 -04:00
Andreas Hansson d0d933facc cpu: Work around gcc 4.9 issues with Num_OpClasses
This patch fixes a recent issue with gcc 4.9 (and possibly more) being
convinced that indices outside the array bounds are used when
initialising the FUPool members.
2015-05-05 03:22:19 -04:00
Dibakar Gope 34ad1123ee cpu: re-organizes the branch predictor structure.
Committed by: Nilay Vaish <nilay@cs.wisc.edu>
2015-04-13 17:33:57 -05:00
Nikos Nikoleris 305e29b98e cpu: fix system total instructions accounting
The totalInstructions counter is only incremented when the whole instruction is
commited and not on every microop. It was incorrectly reset in atomic and
timing cpus.

Committed by: Nilay Vaish <nilay@cs.wisc.edu>"
2015-04-03 11:42:10 -05:00
Steve Reinhardt ee0b52404c mem: restructure Packet cmd initialization a bit more
Refactor the way that specific MemCmd values are generated for packets.
The new approach is a little more elegant in that we assign the right
value up front, and it's also more amenable to non-heap-allocated
Packet objects.

Also replaced the code in the Minor model that was still doing it the
ad-hoc way.

This is basically a refinement of http://repo.gem5.org/gem5/rev/711eb0e64249.
2015-02-11 10:48:50 -08:00
Andreas Hansson f26a289295 mem: Split port retry for all different packet classes
This patch fixes a long-standing isue with the port flow
control. Before this patch the retry mechanism was shared between all
different packet classes. As a result, a snoop response could get
stuck behind a request waiting for a retry, even if the send/recv
functions were split. This caused message-dependent deadlocks in
stress-test scenarios.

The patch splits the retry into one per packet (message) class. Thus,
sendTimingReq has a corresponding recvReqRetry, sendTimingResp has
recvRespRetry etc. Most of the changes to the code involve simply
clarifying what type of request a specific object was accepting.

The biggest change in functionality is in the cache downstream packet
queue, facing the memory. This queue was shared by requests and snoop
responses, and it is now split into two queues, each with their own
flow control, but the same physical MasterPort. These changes fixes
the previously seen deadlocks.
2015-03-02 04:00:35 -05:00
Andreas Hansson d0e1b8a19c arch: Make readMiscRegNoEffect const throughout
Finally took the plunge and made this apply to all ISAs, not just ARM.
2015-02-16 03:33:28 -05:00
Ali Saidi 9d8ddd92dc sim: Clean up InstRecord
Track memory size and flags as well as add some comments and consts.
2015-01-25 07:22:44 -05:00
Andreas Hansson da0c770943 cpu: Fix retry bug in MinorCPU LSQ 2015-01-20 08:11:58 -05:00
Andrew Lukefahr 6d32004407 minor: fixed LSQ MasterPortID
Minor was reporting the data cache access as ".inst" accesses.
This just switches the MasterPortID to dataMasterPortId.

Committed by: Nilay Vaish <nilay@cs.wisc.edu>
2015-01-03 17:51:48 -06:00
Andrew Bardsley df37cad0fd cpu: Fix retries on barrier/store in Minor's store buffer
This patch fixes a case where a store in Minor's store buffer never
leaves the store buffer as it is pre-maturely counted as having been
issued, leading to the store buffer idling.

LSQ::StoreBuffer::numUnissuedAccesses should count the number of accesses
either in memory, or still in the store buffer after being completed.

For stores which are also barriers, the store will stay in the store
buffer for a cycle after it is completed and will be cleaned up by the
barrier clearing code (to ensure that barriers are completed in-order).
To acheive this, numUnissuedAccesses is not decremented when a store-barrier
is issued to memory, but when its barrier effect is cleared.

Without this patch, the correct behaviour happens when a memory transaction
is immediately accepted, but not if it needs a retry.
2014-12-02 06:08:15 -05:00
Andrew Bardsley 98f3e7a310 cpu: Fix memoryIssueLimit checking in Minor
This patch fixes the checking of the number of memory instructions issued
per cycles in the Minor CPU.
2014-12-02 06:08:13 -05:00
Andreas Hansson 41846cb61b mem: Assume all dynamic packet data is array allocated
This patch simplifies how we deal with dynamically allocated data in
the packet, always assuming that it is array allocated, and hence
should be array deallocated (delete[] as opposed to delete). The only
uses of dataDynamic was in the Ruby testers.

The ARRAY_DATA flag in the packet is removed accordingly. No
defragmentation of the flags is done at this point, leaving a gap in
the bit masks.

As the last part the patch, it renames dataDynamicArray to dataDynamic.
2014-12-02 06:07:43 -05:00
Andreas Hansson 9779ba2e37 mem: Add const getters for write packet data
This patch takes a first step in tightening up how we use the data
pointer in write packets. A const getter is added for the pointer
itself (getConstPtr), and a number of member functions are also made
const accordingly. In a range of places throughout the memory system
the new member is used.

The patch also removes the unused isReadWrite function.
2014-12-02 06:07:36 -05:00
Andreas Hansson 481eb6ae80 arm: Fixes based on UBSan and static analysis
Another churn to clean up undefined behaviour, mostly ARM, but some
parts also touching the generic part of the code base.

Most of the fixes are simply ensuring that proper intialisation. One
of the more subtle changes is the return type of the sign-extension,
which is changed to uint64_t. This is to avoid shifting negative
values (undefined behaviour) in the ISA code.
2014-11-14 03:53:51 -05:00