Commit graph

8 commits

Author SHA1 Message Date
Mitch Hayenga ff4009ac00 cpu: Add SMT support to MinorCPU
This patch adds SMT support to the MinorCPU.  Currently
RoundRobin or Random thread scheduling are supported.

Change-Id: I91faf39ff881af5918cca05051829fc6261f20e3
2016-07-21 17:19:16 +01:00
Andreas Hansson 0d50979888 misc: Add missing overrides to appease clang
Since the last round of fixes a few new issues have snuck in. We
should consider switching the regression runs to clang.
2016-02-15 03:40:32 -05:00
Andreas Hansson fbdeb60316 mem: Deduce if cache should forward snoops
This patch changes how the cache determines if snoops should be
forwarded from the memory side to the CPU side. Instead of having a
parameter, the cache now looks at the port connected on the CPU side,
and if it is a snooping port, then snoops are forwarded. Less error
prone, and less parameters to worry about.

The patch also tidies up the CPU classes to ensure that their I-side
port is not snooping by removing overrides to the snoop request
handler, such that snoop requests will panic via the default
MasterPort implement
2016-02-10 04:08:24 -05:00
Andreas Hansson f26a289295 mem: Split port retry for all different packet classes
This patch fixes a long-standing isue with the port flow
control. Before this patch the retry mechanism was shared between all
different packet classes. As a result, a snoop response could get
stuck behind a request waiting for a retry, even if the send/recv
functions were split. This caused message-dependent deadlocks in
stress-test scenarios.

The patch splits the retry into one per packet (message) class. Thus,
sendTimingReq has a corresponding recvReqRetry, sendTimingResp has
recvRespRetry etc. Most of the changes to the code involve simply
clarifying what type of request a specific object was accepting.

The biggest change in functionality is in the cache downstream packet
queue, facing the memory. This queue was shared by requests and snoop
responses, and it is now split into two queues, each with their own
flow control, but the same physical MasterPort. These changes fixes
the previously seen deadlocks.
2015-03-02 04:00:35 -05:00
Andrew Bardsley df37cad0fd cpu: Fix retries on barrier/store in Minor's store buffer
This patch fixes a case where a store in Minor's store buffer never
leaves the store buffer as it is pre-maturely counted as having been
issued, leading to the store buffer idling.

LSQ::StoreBuffer::numUnissuedAccesses should count the number of accesses
either in memory, or still in the store buffer after being completed.

For stores which are also barriers, the store will stay in the store
buffer for a cycle after it is completed and will be cleaned up by the
barrier clearing code (to ensure that barriers are completed in-order).
To acheive this, numUnissuedAccesses is not decremented when a store-barrier
is issued to memory, but when its barrier effect is cleared.

Without this patch, the correct behaviour happens when a memory transaction
is immediately accepted, but not if it needs a retry.
2014-12-02 06:08:15 -05:00
Andreas Hansson 41fc8a573e arch: Pass faults by const reference where possible
This patch changes how faults are passed between methods in an attempt
to copy as few reference-counting pointer instances as possible. This
should avoid unecessary copies being created, contributing to the
increment/decrement of the reference counters.
2014-09-19 10:35:18 -04:00
Andrew Bardsley 1a45a8c5d3 cpu: Fix memory access in Minor not setting parent Request flags
This patch fixes cases where uncacheable/memory type flags are not set
correctly on a memory op which is split in the LSQ.  Without this
patch, request->request if freely used to check flags where the flags
should actually come from the accumulation of request fragment flags.

This patch also fixes a bug where an uncacheable access which passes
through tryToSendRequest more than once can increment
LSQ::numAccessesInMemorySystem more than once.
2014-09-12 10:22:49 -04:00
Andrew Bardsley 0e8a90f06b cpu: `Minor' in-order CPU model
This patch contains a new CPU model named `Minor'. Minor models a four
stage in-order execution pipeline (fetch lines, decompose into
macroops, decompose macroops into microops, execute).

The model was developed to support the ARM ISA but should be fixable
to support all the remaining gem5 ISAs. It currently also works for
Alpha, and regressions are included for ARM and Alpha (including Linux
boot).

Documentation for the model can be found in src/doc/inside-minor.doxygen and
its internal operations can be visualised using the Minorview tool
utils/minorview.py.

Minor was designed to be fairly simple and not to engage in a lot of
instruction annotation. As such, it currently has very few gathered
stats and may lack other gem5 features.

Minor is faster than the o3 model. Sample results:

     Benchmark     |   Stat host_seconds (s)
    ---------------+--------v--------v--------
     (on ARM, opt) | simple | o3     | minor
                   | timing | timing | timing
    ---------------+--------+--------+--------
    10.linux-boot  |   169  |  1883  |  1075
    10.mcf         |   117  |   967  |   491
    20.parser      |   668  |  6315  |  3146
    30.eon         |   542  |  3413  |  2414
    40.perlbmk     |  2339  | 20905  | 11532
    50.vortex      |   122  |  1094  |   588
    60.bzip2       |  2045  | 18061  |  9662
    70.twolf       |   207  |  2736  |  1036
2014-07-23 16:09:04 -05:00