Commit graph

56 commits

Author SHA1 Message Date
Nikos Nikoleris 78a97b1847 mem: Always use InvalidateReq to service WriteLineReq misses
Previously, a WriteLineReq that missed in a cache would send out an
InvalidateReq if the block lookup failed or an UpgradeReq if the
block lookup succeeded but the block had sharers. This changes ensures
that a WriteLineReq always sends an InvalidateReq to invalidate all
copies of the block and satisfy the WriteLineReq.

Change-Id: I207ff5b267663abf02bc0b08aeadde69ad81be61
Reviewed-by: Andreas Hansson <andreas.hansson@arm.com>
2016-12-05 16:48:25 -05:00
Andreas Hansson 50812a20f1 mem: Ensure InvalidateReq is considered isForward by MSHRs
This patch fixes an issue where an MSHR would incorrectly be perceived
to provide data to targets arriving after an InvalidateReq. To address
this the InvalidateReq is now treated as isForward, much like an
UpgradeReq that did not hit in the cache.

Change-Id: Ia878444d949539b5c33fd19f3e12b0b8a872275e
Reviewed-by: Andreas Hansson <andreas.hansson@arm.com>
Reviewed-by: Stephan Diestelhorst <stephan.diestelhorst@arm.com>
2016-12-05 16:48:23 -05:00
Nikos Nikoleris e16967941b mem: Make packet debug printing more uniform
Previously DPRINTFs printing information about a packet would use ad hoc
formats. This patch changes all DPRINTFs to use the print function
defined by the packet class, making the packet printing format more
uniform and easier to change.

Change-Id: Idd436a9758d4bf70c86a574d524648b2a2580970
Reviewed-by: Andreas Hansson <andreas.hansson@arm.com>
Reviewed-by: Stephan Diestelhorst <stephan.diestelhorst@arm.com>
2016-12-05 16:48:21 -05:00
Nikos Nikoleris 0bd9dfb8de mem: Service only the 1st FromCPU MSHR target on ReadRespWithInv
A response to a ReadReq can either be a ReadResp or a
ReadRespWithInvalidate. As we add targets to an MSHR for a ReadReq we
assume that the response will be a ReadResp. When the response is
invalidating (ReadRespWithInvalidate) servicing more than one targets
can potentially violate the memory ordering. This change fixes the way
we handle a ReadRespWithInvalidate. When a cache receives a
ReadRespWithInvalidate we service only the first FromCPU target and
all the FromSnoop targets from the MSHR target list. The rest of the
FromCPU targets are deferred and serviced by a new request.

Change-Id: I75c30c268851987ee5f8644acb46f440b4eeeec2
Reviewed-by: Andreas Hansson <andreas.hansson@arm.com>
Reviewed-by: Stephan Diestelhorst <stephan.diestelhorst@arm.com>
2016-12-05 16:48:19 -05:00
Nikos Nikoleris d28c2906f4 mem: Keep track of allocOnFill in the TargetList
Previously the information of whether a response was allocating or not
was a property of the MSHR. This change makes this flag a property of
the TargetList. Differernt TargetLists, e.g. the targets and the
deferred targets lists might have different values. Additionally, the
information about whether each of the target expects an allocating
response is stored inside the TargetList container. This allows for
repopulating the flag in case some of the targets are removed.

Change-Id: If3ec2516992f42a6d9da907009ffe3ab8d0d2021
Reviewed-by: Andreas Hansson <andreas.hansson@arm.com>
Reviewed-by: Stephan Diestelhorst <stephan.diestelhorst@arm.com>
2016-12-05 16:48:18 -05:00
Andreas Hansson 721efa4d09 mem: Update mostly exclusive policy even further
This patch takes yet another step in maintaining the clusivity, in
that it allows a mostly-inclusive cache to hold on to blocks even when
responding to a ReadExReq or UpgradeReq. Previously the cache simply
invalidated these blocks, but there is no strict need to do so.

The most important part of this patch is that we simply mark the block
clean when satisfying the upstream request where the cache is allowed
to keep the block. The only tricky part of the patch is in the memory
management of deferred snoops, where we need to distinguish the cases
where only the packet was copied (we expected to respond), and the
cases where we created an entirely new packet and request (we kept it
only to replay later).

The code in satisfyRequest is definitely ready for some refactoring
after this.

Change-Id: I201ddc7b2582eaa46fb8cff0c7ad09e02d64b0fc
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Reviewed-by: Tony Gutierrez <anthony.gutierrez@amd.com>
2016-08-12 14:11:45 +01:00
Andreas Hansson 94f94fbc55 mem: Update mostly exclusive cache policy to cover more cases
This patch changes how the mostly exclusive policy is enforced to
ensure that we drop blocks when we should. As part of this change, the
actual invalidation due to the clusivity enforcement is moved outside
the hit handling, to a separate method maintainClusivity. For the
timing mode that means we can deal with all MSHR targets before taking
any action and possibly dropping the block. The method
satisfyCpuSideRequest is also renamed satisfyRequest as part of this
change (since we only ever see requests from the cpu-side port).

Change-Id: If6f3d1e0c3e7be9a67b72a55e4fc2ec4a90fd3d2
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Reviewed-by: Tony Gutierrez <anthony.gutierrez@amd.com>
2016-08-12 14:11:45 +01:00
Andreas Hansson 2509553403 mem: Add a FromCache packet attribute
This patch adds a FromCache attribute to the packet, and updates a
number of the existing request commands to reflect that the request
originates from a cache. The attribute simplifies checking if a
requests came from a cache or not, and this is used by both the cache
and snoop filter in follow-on patches.

Change-Id: Ib0a7a080bbe4d6036ddd84b46fd45bc7eb41cd8f
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
Reviewed-by: Tony Gutierrez <anthony.gutierrez@amd.com>
Reviewed-by: Steve Reinhardt <stever@gmail.com>
2016-08-12 14:11:45 +01:00
Nikos Nikoleris f4cc3a4d20 mem: Remove stale argument from a DPRINTF in the cache code
Change-Id: I70dd11c23b45dfc606ef08233d2e50fcc0817505
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
2016-07-11 10:39:22 +01:00
Andreas Hansson e3e808416f mem: Fix memory leak in handling of deferred snoops
This patch fixes a memory leak where deferred snoop packets never got
deallocated. On the call to MSHR::handleSnoop these snoops were
treated as if a response will be sent, as the MSHR was
pendingModified. Consequently, a copy of the packet was created and
added to the MSHR targets. However, an preceeding target to the same
MSHR, originally from a CPU, was serviced before the snoop, and caused
the block to be invalidated. This happens for ReadExReq and
UpgradeReq.

Note that the original snoop will receive a response, just not from
the cache in question, but instead from the cache upstream that issued
the ReadExReq or UpgradeReq.

Change-Id: I4ac012fbc8a46cf693ca390fe9476105d444e6f4
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
2016-05-26 11:56:24 +01:00
Nikos Nikoleris f9d62b63e1 mem: remove redudant check whether the cache forwards snoops
Change-Id: I57b56771086e1e2f512977fb7248d93c171ab925
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
2016-05-26 11:56:24 +01:00
Nikos Nikoleris d68f3577d6 mem: change NULL to nullptr in the cache related classes
Change-Id: I5042410be54935650b7d05c84d8d9efbfcc06e70
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
2016-05-26 11:56:24 +01:00
Nikos Nikoleris 90bf50b4c7 mem: fix the line length in the cache related classes
Change-Id: I6d1feb164a958dde0da87a1cd2698096112c4a82
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
2016-05-26 11:56:24 +01:00
Andreas Hansson 13b9d4215d mem: Deallocate all write-queue entries when sent
This patch removes the write-queue entry tracking previously used for
uncacheable writes. The write-queue entry is now deallocated as soon
as the packet is sent. As a result we also forego the stats for
uncacheable writes. Additionally, there is no longer a need to attach
the write-queue entry to the packet.
2016-04-21 04:48:07 -04:00
Andreas Hansson 6c92ee49f1 mem: Align downstream cache packet creation in atomic and timing
This patch makes the control flow more uniform in atomic and timing,
ultimately making the code easier to understand.
2016-04-21 04:48:06 -04:00
Rekai Gonzalez Alberquilla a3bf4aa6ec mem: Add unused prefetch counter in caches
Added stat to the cache to account for HardPF'ed blocks that are evicted
before being referenced (over-prefetching).
2015-05-27 13:50:01 +01:00
Andreas Hansson 041ea8107e mem: Create a separate class for the cache write buffer
This patch breaks out the cache write buffer into a separate class,
without affecting any stats. The goal of the patch is to avoid
encumbering the much-simpler write queue with the complex MSHR
handling. In a follow on patch this simplification allows us to
implement write combining.

The WriteQueue gets its own class, but shares a common ancestor, the
generic Queue, with the MSHRQueue.
2016-03-17 09:51:18 -04:00
Andreas Hansson 7958f34797 mem: Ensure that InvalidateReq is not forwarded as ReadExReq
This patch fixes an issue where an InvalidationReq only traversed one
level of the cache hierarchy, and was subsequently turned into a
ReadExReq due to it needing writable, and the command not being
checked for explicitly.
2016-02-24 04:16:57 -05:00
Andreas Hansson 92f021cbbe mem: Move the point of coherency to the coherent crossbar
This patch introduces the ability of making the coherent crossbar the
point of coherency. If so, the crossbar does not forward packets where
a cache with ownership has already committed to responding, and also
does not forward any coherency-related packets that are not intended
for a downstream memory controller. Thus, invalidations and upgrades
are turned around in the crossbar, and the memory controller only sees
normal reads and writes.

In addition this patch moves the express snoop promotion of a packet
to the crossbar, thus allowing the downstream cache to check the
express snoop flag (as it should) for bypassing any blocking, rather
than relying on whether a cache is responding or not.
2016-02-10 04:08:25 -05:00
Andreas Hansson f84ee031cc mem: Align cache behaviour in atomic when upstream is responding
Adopt the same flow as in timing mode, where the caches on the path to
memory get to keep the line (if present), and we use the
responderHadWritable flag to determine if we need to forward the
(invalidating) packet or not.
2016-02-10 04:08:24 -05:00
Andreas Hansson 986214f181 mem: Align how snoops are handled when hitting writebacks
This patch unifies the snoop handling in case of hitting writebacks
with how we handle snoops hitting in the tags. As a result, we end up
using the same optimisation as the normal snoops, where we inform the
downstream cache if we encounter a line in Modified (writable and
dirty) state, which enables us to avoid sending out express snoops to
invalidate any Shared copies of the line. A few regressions
consequently change, as some transactions are sunk higher up in the
cache hierarchy.
2016-02-10 04:08:24 -05:00
Steve Reinhardt 6caa2c9b4e mem: add CacheVerbose debug flag, filter noisy DPRINTFs
Some of the DPRINTFs added to the classic cache in cset 45df88079f04,
while useful to those unfamiliar with the cache code, end up being
noise when you're familiar with the code but are trying to debug tricky
protocol issues.  (Particularly getting two messages from each cache
as it receives a snoop request then declares that there was no match.)

This patch introduces a CacheVerbose debug flag, and moves a subset of
the added DPRINTFs into that category, so that Cache by itself returns
to being a more succinct summary of cache activity.

Also added a CacheAll compound flag to turn on all the cache-related
debug flags (other than CacheTags, which you *really* have to want badly
to turn it on, IMO).
2015-12-31 09:32:09 -08:00
Andreas Hansson 7fca994d04 mem: Do not allocate space for packet data if not needed
This patch looks at the request and response command to determine if
either actually has any data payload, and if not, we do not allocate
any space for packet data.

The only tricky case is where the command type is changed as part of
the MSHR functionality. In these cases where the original packet had
no data, but the new packet does, we need to explicitly call
allocate().
2015-12-31 09:33:39 -05:00
Andreas Hansson f1ec326be5 mem: Do not alter cache block state on uncacheable snoops
This patch ensures we do not respond with a Modified (dirty and
writable) line if the request is uncacheable, and that the cache
responding retains the line without modifying the state (even if
responding).
2015-12-31 09:33:25 -05:00
Andreas Hansson 0fcb376e5f mem: Make cache terminology easier to understand
This patch changes the name of a bunch of packet flags and MSHR member
functions and variables to make the coherency protocol easier to
understand. In addition the patch adds and updates lots of
descriptions, explicitly spelling out assumptions.

The following name changes are made:

* the packet memInhibit flag is renamed to cacheResponding

* the packet sharedAsserted flag is renamed to hasSharers

* the packet NeedsExclusive attribute is renamed to NeedsWritable

* the packet isSupplyExclusive is renamed responderHadWritable

* the MSHR pendingDirty is renamed to pendingModified

The cache states, Modified, Owned, Exclusive, Shared are also called
out in the cache and MSHR code to make it easier to understand.
2015-12-31 09:32:58 -05:00
Andreas Hansson f6525ff221 mem: Remove unused cache squash functionality
This patch removes the unused squash function from the MSHR queue, and
the associated (and also unused) threadNum member from the MSHR.
2015-12-28 11:14:16 -05:00
Andreas Hansson fbf3987c7b mem: Avoid unecessary checks when creating HardPFReq in cache
The checks made before sending out a HardPFReq were unecessarily
complex, and checked for cases that never occur. This patch
tidies it up.
2015-12-28 11:14:15 -05:00
Andreas Hansson b93a9d0d51 mem: Do not use sender state to track forwarded snoops in cache
This patch changes how the cache tracks which snoops are forwarded,
and which ones are created locally. Previously the identification was
based on an empty sender state of a specific class, but this method
fails to distinguish which cache actually attached the sender
state. Instead we use the same mechanism as the crossbar, and keep
track of the requests that have outstanding snoops.
2015-12-28 11:14:14 -05:00
Andreas Hansson 036263e280 mem: Fix cache sender state handling and add clarification
This patch addresses a bug in how the cache attached the MSHR as a
sender state. Rather than overwriting any existing sender state it now
pushes a new one. The handling of upward snoops is also clarified.
2015-12-28 11:14:10 -05:00
Andreas Hansson 97887eb6dc mem: Fix memory allocation bug in deferred snoop handling
This patch fixes a corner case in the deferred snoop handling, where
requests ended up being used by multiple packets with different
lifetimes, and inadvertently got deleted while they were still in use.
2015-12-17 17:07:11 -05:00
Andreas Hansson 7433d77fcf mem: Add an option to perform clean writebacks from caches
This patch adds the necessary commands and cache functionality to
allow clean writebacks. This functionality is crucial, especially when
having exclusive (victim) caches. For example, if read-only L1
instruction caches are not sending clean writebacks, there will never
be any spills from the L1 to the L2. At the moment the cache model
defaults to not sending clean writebacks, and this should possibly be
re-evaluated.

The implementation of clean writebacks relies on a new packet command
WritebackClean, which acts much like a Writeback (renamed
WritebackDirty), and also much like a CleanEvict. On eviction of a
clean block the cache either sends a clean evict, or a clean
writeback, and if any copies are still cached upstream the clean
evict/writeback is dropped. Similarly, if a clean evict/writeback
reaches a cache where there are outstanding MSHRs for the block, the
packet is dropped. In the typical case though, the clean writeback
allocates a block in the downstream cache, and marks it writable if
the evicted block was writable.

The patch changes the O3_ARM_v7a L1 cache configuration and the
default L1 caches in config/common/Caches.py
2015-11-06 03:26:43 -05:00
Andreas Hansson 654266f39c mem: Add cache clusivity
This patch adds a parameter to control the cache clusivity, that is if
the cache is mostly inclusive or exclusive. At the moment there is no
intention to support strict policies, and thus the options are: 1)
mostly inclusive, or 2) mostly exclusive.

The choice of policy guides the behaviuor on a cache fill, and a new
helper function, allocOnFill, is created to encapsulate the decision
making process. For the timing mode, the decision is annotated on the
MSHR on sending out the downstream packet, and in atomic we directly
pass the decision to handleFill. We (ab)use the tempBlock in cases
where we are not allocating on fill, leaving the rest of the cache
unaffected. Simple and effective.

This patch also makes it more explicit that multiple caches are
allowed to consider a block writable (this is the case
also before this patch). That is, for a mostly inclusive cache,
multiple caches upstream may also consider the block exclusive. The
caches considering the block writable/exclusive all appear along the
same path to memory, and from a coherency protocol point of view it
works due to the fact that we always snoop upwards in zero time before
querying any downstream cache.

Note that this patch does not introduce clean writebacks. Thus, for
clean lines we are essentially removing a cache level if it is made
mostly exclusive. For example, lines from the read-only L1 instruction
cache or table-walker cache are always clean, and simply get dropped
rather than being passed to the L2. If the L2 is mostly exclusive and
does not allocate on fill it will thus never hold the line. A follow
on patch adds the clean writebacks.

The patch changes the L2 of the O3_ARM_v7a CPU configuration to be
mostly exclusive (and stats are affected accordingly).
2015-11-06 03:26:41 -05:00
Ali Jafri 52c8ae5187 mem: Enforce insertion order on the cache response path
This patch enforces insertion order transmission of packets on the
response path in the cache. Note that the logic to enforce order is
already present in the packet queue, this patch simply turns it on for
queues in the response path.

Without this patch, there are corner cases where a request-response is
faster than a response-response forwarded through the cache. This
violation of queuing order causes problems in the snoop filter leaving
it with inaccurate information. This causes assert failures in the
snoop filter later on.

A follow on patch relaxes the order enforcement in the packet queue to
limit the performance impact.
2015-11-06 03:26:37 -05:00
Andreas Hansson ac1368df50 mem: Unify delayed packet deletion
This patch unifies how we deal with delayed packet deletion, where the
receiving slave is responsible for deleting the packet, but the
sending agent (e.g. a cache) is still relying on the pointer until the
call to sendTimingReq completes. Previously we used a mix of a
deletion vector and a construct using unique_ptr. With this patch we
ensure all slaves use the latter approach.
2015-11-06 03:26:21 -05:00
Andreas Hansson d8b7a652e1 mem: Clarify cache MSHR handling on fill
This patch addresses the upgrading of deferred targets in the MSHR,
and makes it clearer by explicitly calling out what is happening
(deferred targets are promoted if we get exclusivity without asking
for it).
2015-10-29 08:48:20 -04:00
Andreas Hansson a9a7002a3b mem: Add check for block status on WriteLineReq fill
More checks to help with understanding of functionality.
2015-09-25 07:26:58 -04:00
Andreas Hansson 012dd52dc2 mem: Fix WriteLineReq fill behaviour
This patch fixes issues in the interactions between deferred snoops
and WriteLineReq. More specifically, the patch addresses an issue
where deferred snoops caused assertion failures when being serviced on
the arrival of an InvalidateResp. The response packet was perceived to
be invalidating, when actually it is not for the cache that sent out
the original invalidation request.
2015-09-25 07:26:58 -04:00
Ali Jafri ceec2bb02c mem: Add snoops for CleanEvicts and Writebacks in atomic mode
This patch mirrors the logic in timing mode which sends up snoops to
check for cached copies before sending CleanEvicts and Writebacks down
the memory hierarchy. In case there is a copy in a cache above,
discard CleanEvicts and set the BLOCK_CACHED flag in Writebacks so
that writebacks do not reset the cache residency bit in the snoop
filter below.
2015-09-25 07:26:57 -04:00
Andreas Hansson 462c288a75 mem: Make the coherent crossbar account for timing snoops
This patch introduces the concept of a snoop latency. Given the
requirement to snoop and forward packets in zero time (due to the
coherency mechanism), the latency is accounted for later.

On a snoop, we establish the latency, and later add it to the header
delay of the packet. To allow multiple caches to contribute to the
snoop latency, we use a separate variable in the packet, and then take
the maximum before adding it to the header delay.
2015-09-25 07:13:54 -04:00
Andreas Hansson 76088fb9ca mem: Tidy up the snoop state-transition logic
Remove broken and unused option to pass dirty data on non-exclusive
snoops. Also beef up the comments a bit.
2015-09-04 13:13:58 -04:00
Andreas Hansson 6eb434c8a2 arm, mem: Remove unused CLEAR_LL request flag
Cleaning up dead code. The CLREX stores zero directly to
MISCREG_LOCKFLAG and so the request flag is no longer needed. The
corresponding functionality in the cache tags is also removed.
2015-08-21 07:03:25 -04:00
Andreas Hansson bda79817c8 mem: Remove unused cache squash functionality
Tidying up.
2015-08-21 07:03:24 -04:00
Andreas Hansson ddfa96cf45 mem: Add explicit Cache subclass and make BaseCache abstract
Open up for other subclasses to BaseCache and transition to using the
explicit Cache subclass.

--HG--
rename : src/mem/cache/BaseCache.py => src/mem/cache/Cache.py
2015-08-21 07:03:23 -04:00
Andreas Hansson 1bf389a2bf mem: Move cache_impl.hh to cache.cc
There is no longer any need to keep the implementation in a header.
2015-08-21 07:03:20 -04:00
David Guillen 5287945a8b mem: Remove templates in cache model
This patch changes the cache implementation to rely on virtual methods
rather than using the replacement policy as a template argument.

There is no impact on the simulation performance, and overall the
changes make it easier to modify (and subclass) the cache and/or
replacement policy.
2015-05-05 03:22:21 -04:00
Anthony Gutierrez a628afedad mem: refactor LRU cache tags and add random replacement tags
this patch implements a new tags class that uses a random replacement policy.
these tags prefer to evict invalid blocks first, if none are available a
replacement candidate is chosen at random.

this patch factors out the common code in the LRU class and creates a new
abstract class: the BaseSetAssoc class. any set associative tag class must
implement the functionality related to the actual replacement policy in the
following methods:

accessBlock()
findVictim()
insertBlock()
invalidate()
2014-07-28 12:23:23 -04:00
Andreas Hansson 0d68d36b9d mem: Remove the cache builder
This patch removes the redundant cache builder class.
2013-06-27 05:49:50 -04:00
Andreas Sandberg d44f2f611f mem: Remove the IIC replacement policy
The IIC replacement policy seems to be unused and has probably
gathered too much bit rot to be useful. This patch removes the IIC and
its associated cache parameters.
2013-01-07 13:05:39 -05:00
Lisa Hsu 7a28ab2d18 remove the totally obsolete split cache 2008-10-23 16:11:28 -04:00
Steve Reinhardt 9117c94f9c Get rid of coherence protocol object.
--HG--
extra : convert_revision : 4ff144342dca23af9a12a2169ca318a002654b42
2007-06-27 20:54:13 -07:00