If we write back an exclusive copy, we now mark it
as such, so the cache receiving the writeback can
mark its copy as exclusive. This avoids some
unnecessary upgrade requests when a cache later
tries to re-acquire exclusive access to the block.
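
A minimal sketch of the effect described above, not the simulator's code. ToyBlock, ToyCache, receive_writeback and acquire_exclusive are hypothetical names invented for illustration; the only point is that a writeback carrying an "exclusive" marker lets the receiver skip a later upgrade.

```python
from dataclasses import dataclass, field


@dataclass
class ToyBlock:
    """Hypothetical per-block state; not the simulator's block class."""
    valid: bool = False
    exclusive: bool = False  # writable without an upgrade request
    dirty: bool = False


@dataclass
class ToyCache:
    """Hypothetical cache that only tracks coherence state per address."""
    blocks: dict = field(default_factory=dict)
    upgrades_sent: int = 0

    def receive_writeback(self, addr, exclusive):
        # The writeback carries the sender's exclusivity, so the receiver
        # can mark its own copy exclusive instead of merely valid.
        self.blocks[addr] = ToyBlock(valid=True, exclusive=exclusive, dirty=True)

    def acquire_exclusive(self, addr):
        blk = self.blocks.setdefault(addr, ToyBlock())
        if not (blk.valid and blk.exclusive):
            # No exclusive copy yet: an upgrade request would be needed.
            self.upgrades_sent += 1
            blk.valid = True
            blk.exclusive = True
        return blk


# Writeback marked exclusive: re-acquiring needs no upgrade.
l2 = ToyCache()
l2.receive_writeback(0x40, exclusive=True)
l2.acquire_exclusive(0x40)

# Writeback not marked exclusive: re-acquiring costs one upgrade.
l2_old = ToyCache()
l2_old.receive_writeback(0x40, exclusive=False)
l2_old.acquire_exclusive(0x40)
```

In this toy model l2.upgrades_sent stays at zero, while l2_old.upgrades_sent counts the one upgrade the marker avoids.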
Corrects an oversight in cset f97b62be544f. The fix there only
failed queued SCUpgradeReq packets that encountered an
invalidation, which meant that the upgrade had to reach the L2
cache. To handle pending requests in the L1 we must similarly
fail StoreCondReq packets too.
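
As a rough illustration only: the sketch below fails both kinds of queued store-conditional packets when an invalidation for the same address arrives. The dict-based packet representation and the apply_invalidation helper are assumptions made for this example; only the two command names come from the message above.

```python
from collections import deque

# Command names taken from the message above; everything else is assumed.
SC_COMMANDS = {"SCUpgradeReq", "StoreCondReq"}


def apply_invalidation(queued, addr):
    """Fail every queued store-conditional packet that targets addr.

    queued is a deque of dicts with 'cmd', 'addr' and 'failed' keys
    (an assumed representation, not the simulator's packet type).
    Returns the number of packets that were failed.
    """
    failed = 0
    for pkt in queued:
        is_sc = pkt["cmd"] in SC_COMMANDS
        if pkt["addr"] == addr and is_sc and not pkt["failed"]:
            pkt["failed"] = True
            failed += 1
    return failed


queue = deque([
    {"cmd": "SCUpgradeReq", "addr": 0x80, "failed": False},
    {"cmd": "StoreCondReq", "addr": 0x80, "failed": False},
    {"cmd": "ReadReq", "addr": 0x80, "failed": False},
    {"cmd": "StoreCondReq", "addr": 0xC0, "failed": False},
])
num_failed = apply_invalidation(queue, 0x80)
```

Only the two store-conditional packets for the invalidated address are failed; the plain read and the packet for a different address are left alone.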
Allow lower-level caches (e.g., L2 or L3) to pass exclusive
copies to higher levels (e.g., L1). This eliminates a lot
of unnecessary upgrade transactions on read-write sequences
to non-shared data.
Also some cleanup of MSHR coherence handling and multiple
bug fixes.
Requires new "SCUpgradeReq" message that marks upgrades
for store conditionals, so downstream caches can fail
these when they run into invalidations.
See http://www.m5sim.org/flyspray/task/197
Only set the dirty bit when we actually write to a block
(not if we thought we might but didn't, as in a failed
SC or CAS). This requires making sure the dirty bit
stays set when we get an exclusive (writable) copy
in a cache-to-cache transfer from another owner, which
in turn requires copying the mem-inhibit flag from
timing-mode requests to their associated responses.
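
A small sketch of the dirty-bit rule, under the assumption of a toy line with a load-linked/store-conditional lock flag. ToyLine and its methods are hypothetical and do not model the cache-to-cache transfer part of the change.

```python
class ToyLine:
    """Hypothetical cache line with a lock flag for LL/SC."""

    def __init__(self):
        self.data = 0
        self.dirty = False
        self.locked = False  # set by a load-linked

    def load_linked(self):
        self.locked = True
        return self.data

    def store_conditional(self, value):
        if not self.locked:
            # Failed SC: nothing is written, so dirty is left untouched.
            return False
        self.data = value
        self.dirty = True  # dirty only because data was actually written
        self.locked = False
        return True


line = ToyLine()
first_result = line.store_conditional(1)  # no prior load-linked, so it fails
dirty_after_failure = line.dirty

line.load_linked()
second_result = line.store_conditional(2)  # succeeds and writes
```

The failed store conditional leaves the line clean; only the successful one sets the dirty bit.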
On the config end, if a shared L2 is created for the system, it is
parameterized to have n sharers as defined by option.num_cpus. In addition to
making the cache sharing aware so that discriminating tag policies can make use
of context_ids to make decisions, I added an occupancy AverageStat and an occ %
stat to each cache so that you could know which contexts are occupying how much
cache on average, both in terms of blocks and percentage. Note that since
devices have context_id -1, having an array of occ stats that correspond to
each context_id will break here, so in FS mode I add an extra bucket for device
blocks. This bucket is explicitly not added in SE mode, both to avoid
clutter in the stats.txt file and to avoid broken stats (some formulas
break when a bucket is 0).
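
The bucket layout can be sketched as follows. This is an illustration under stated assumptions, not the stats code itself: the occupancy function is hypothetical, and percentages here are taken relative to the blocks passed in rather than to the cache size.

```python
def occupancy(block_owners, num_cpus, full_system):
    """Count blocks per context and return (counts, percentages).

    block_owners: context_id of each valid block; -1 means a device.
    In full-system mode an extra last bucket collects device blocks.
    """
    buckets = num_cpus + (1 if full_system else 0)
    counts = [0] * buckets
    for ctx in block_owners:
        if ctx == -1:
            if not full_system:
                raise ValueError("device block seen outside full-system mode")
            counts[num_cpus] += 1  # the extra device bucket
        else:
            counts[ctx] += 1
    total = len(block_owners)
    percents = [100.0 * c / total if total else 0.0 for c in counts]
    return counts, percents


counts, percents = occupancy([0, 0, 1, -1], num_cpus=2, full_system=True)
```

Handling -1 explicitly matters: indexing an array directly with a device's context_id would land in the wrong bucket or out of bounds.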
This prevents redundant prefetches from being issued, solving the
occasional 'needsExclusive && !blk->isWritable()' assertion failure
in cache_impl.hh that several people have run into.
Eliminates "prefetch_cache_check_push" flag, neither setting of
which really solved the problem.
Previously there was one per bus, which caused some coherence problems
when more than one decided to respond. Now there is just one on
the main memory bus. The default bus responder on all other buses
is now the downstream cache's cpu_side port. Caches no longer need
to do address range filtering; instead, we just have a simple flag
to prevent snoops from propagating to the I/O bus.
Apparently we broke it with the cache rewrite and never noticed.
Thanks to Bao Yungang <baoyungang@gmail.com> for a significant part
of these changes (and for inspiring me to work on the rest).
Some other overdue cleanup on the prefetch code too.
I think readData() and writeData() were used for Erik's compression
work, but that code is gone; these aren't called anymore, and they
don't even really do what their names imply.
the primary identifier for a hardware context should be contextId(). The
concept of threads within a CPU remains, in the form of threadId() because
sometimes you need to know which context within a cpu to manipulate.
I was asserting that the only reason you would defer targets is if
a write came in while you had an outstanding read miss, but there's
another case where you could get a read access after you've snooped
an invalidation and buffered it because it applies to a prior
outstanding miss.
Make OutputDirectory::resolve() private and change the functions using
resolve() to instead use create().
--HG--
extra : convert_revision : 36d4be629764d0c4c708cec8aa712cd15f966453
if a prior write miss arrived while an even earlier
read miss was still outstanding.
--HG--
extra : convert_revision : 4924e145829b2ecf4610b88d33f4773510c6801a
where we defer a response to a read from a far-away cache A, then later
defer a ReadExcl from a cache B on the same bus as us. We'll assert
MemInhibit in both cases, but in the latter case MemInhibit will keep
the invalidation from reaching cache A. This special response tells
cache A that it gets the block to satisfy its read, but must immediately
invalidate it.
--HG--
extra : convert_revision : f85c8b47bb30232da37ac861b50a6539dc81161b
Don't mark upstream MSHR as pending if downstream MSHR is already in service.
--HG--
extra : convert_revision : e1c135ff00217291db58ce8a06ccde34c403d37f
Not so much noise on failed sends, and more complete
info when grepping a trace using an address.
--HG--
extra : convert_revision : 05a8261c9452072ca08b906200c6322b33e2b9f1
SimObjects not yet updated:
- Process and subclasses
- BaseCPU and subclasses
The SimObject(const std::string &name) constructor was removed. Subclasses
that still rely on that behavior must call the parent initializer as
: SimObject(makeParams(name))
--HG--
extra : convert_revision : d6faddde76e7c3361ebdbd0a7b372a40941c12ed
Make sure not to keep processing functional accesses
after they've been responded to.
Also use checkFunctional() return value instead of checking
packet command field where possible, mostly just for consistency.
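
A sketch of the "stop once responded" behaviour, assuming a convention where each checker returns True when it has satisfied the access. The functional_access helper and the handler names are hypothetical; the real return-value convention of checkFunctional() is not restated here.

```python
def functional_access(pkt, handlers):
    """Offer pkt to each handler in order and stop once one responds.

    handlers is a list of (name, check) pairs; check(pkt) returns True
    when it satisfied the access (an assumed convention for this sketch).
    Returns the names of the handlers that were actually consulted.
    """
    visited = []
    for name, check in handlers:
        visited.append(name)
        if check(pkt):
            break  # already responded: do not keep processing
    return visited


handlers = [
    ("write_buffer", lambda pkt: False),
    ("mshr_queue", lambda pkt: True),   # responds to the access
    ("memory_side", lambda pkt: True),  # must not be reached
]
visited = functional_access({"addr": 0x100}, handlers)
```

Using the return value to drive the loop keeps the decision in one place instead of re-inspecting the packet's command after each step.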
--HG--
extra : convert_revision : 29fc76bc18731bd93a4ed05a281297827028ef75
creation and initialization now happens in python. Parameter objects
are generated and initialized by python. The .ini file is now solely for
debugging purposes and is not used in construction of the objects in any
way.
--HG--
extra : convert_revision : 7e722873e417cb3d696f2e34c35ff488b7bff4ed
Turns out DeferredSnoop isn't quite the right bit of info
we needed... see new comment in cache_impl.hh.
--HG--
extra : convert_revision : a38de8c1677a37acafb743b7074ef88b21d3b7be
If the invalidation beats the upgrade at a lower level
then the upgrade must be converted to a read exclusive
"in the field".
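
The conversion can be pictured with a tiny sketch. The resolve_upgrade function and the command-name strings are illustrative assumptions, not the simulator's interface.

```python
def resolve_upgrade(cmd, has_valid_copy):
    """Return the command to issue once pending invalidations are known.

    An upgrade only makes sense while the requester still holds a valid
    copy. If an invalidation got there first, the data must be fetched
    as well, so the request becomes a read exclusive.
    """
    if cmd == "UpgradeReq" and not has_valid_copy:
        return "ReadExReq"
    return cmd


converted = resolve_upgrade("UpgradeReq", has_valid_copy=False)
unchanged = resolve_upgrade("UpgradeReq", has_valid_copy=True)
```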
Restructure target list & deferred target list to
factor out some common code.
--HG--
extra : convert_revision : 7bab4482dd6c48efdb619610f0d3778c60ff777a
- Add "deferred snoop" flag to Packet so upper-level caches
can distinguish whether lower-level cache request was
in-service or not at the time of the original snoop.
- Revamp response handling to properly handle deferred snoops
on non-cache-fill requests (i.e. upgrades).
- Make sure forwarded writebacks are kept in write buffer at
lower-level caches so they get snooped properly.
--HG--
extra : convert_revision : 17f8a3772a1ae31a16991a53f8225ddf54d31fc9