The TBE pointer in the MESI CMP implementation was not being set to NULL
when the TBE is deallocated. This resulted in segmentation fault on testing
the protocol when the ProtocolTrace was switched on.
By stalling and waiting the mandatory queue instead of recycling it, one can
ensure that no incoming messages are starved when the mandatory queue puts
signficant of pressure on the L1 cache controller (i.e. the ruby memtester).
--HG--
rename : src/mem/slicc/ast/WakeUpDependentsStatementAST.py => src/mem/slicc/ast/WakeUpAllDependentsStatementAST.py
The purpose of this patch is to change the way CacheMemory interfaces with
coherence protocols. Currently, whenever a cache controller (defined in the
protocol under consideration) needs to carry out any operation on a cache
block, it looks up the tag hash map and figures out whether or not the block
exists in the cache. In case it does exist, the operation is carried out
(which requires another lookup). As observed through profiling of different
protocols, multiple such lookups take place for a given cache block. It was
noted that the tag lookup takes anything from 10% to 20% of the simulation
time. In order to reduce this time, this patch is being posted.
I have to acknowledge that the many of the thoughts that went in to this
patch belong to Brad.
Changes to CacheMemory, TBETable and AbstractCacheEntry classes:
1. The lookup function belonging to CacheMemory class now returns a pointer
to a cache block entry, instead of a reference. The pointer is NULL in case
the block being looked up is not present in the cache. Similar change has
been carried out in the lookup function of the TBETable class.
2. Function for setting and getting access permission of a cache block have
been moved from CacheMemory class to AbstractCacheEntry class.
3. The allocate function in CacheMemory class now returns pointer to the
allocated cache entry.
Changes to SLICC:
1. Each action now has implicit variables - cache_entry and tbe. cache_entry,
if != NULL, must point to the cache entry for the address on which the action
is being carried out. Similarly, tbe should also point to the transaction
buffer entry of the address on which the action is being carried out.
2. If a cache entry or a transaction buffer entry is passed on as an
argument to a function, it is presumed that a pointer is being passed on.
3. The cache entry and the tbe pointers received __implicitly__ by the
actions, are passed __explicitly__ to the trigger function.
4. While performing an action, set/unset_cache_entry, set/unset_tbe are to
be used for setting / unsetting cache entry and tbe pointers respectively.
5. is_valid() and is_invalid() has been made available for testing whether
a given pointer 'is not NULL' and 'is NULL' respectively.
6. Local variables are now available, but they are assumed to be pointers
always.
7. It is now possible for an object of the derieved class to make calls to
a function defined in the interface.
8. An OOD token has been introduced in SLICC. It is same as the NULL token
used in C/C++. If you are wondering, OOD stands for Out Of Domain.
9. static_cast can now taken an optional parameter that asks for casting the
given variable to a pointer of the given type.
10. Functions can be annotated with 'return_by_pointer=yes' to return a
pointer.
11. StateMachine has two new variables, EntryType and TBEType. EntryType is
set to the type which inherits from 'AbstractCacheEntry'. There can only be
one such type in the machine. TBEType is set to the type for which 'TBE' is
used as the name.
All the protocols have been modified to conform with the new interface.
This patch changes the manner in which data is copied from L1 to L2 cache in
the implementation of the Hammer's cache coherence protocol. Earlier, data was
copied directly from one cache entry to another. This has been broken in to
two parts. First, the data is copied from the source cache entry to a
transaction buffer entry. Then, data is copied from the transaction buffer
entry to the destination cache entry.
This has been done to maintain the invariant - at any given instant, multiple
caches under a controller are exclusive with respect to each other.
This patch allows messages to be stalled in their input buffers and wait
until a corresponding address changes state. In order to make this work,
all in_ports must be ranked in order of dependence and those in_ports that
may unblock an address, must wake up the stalled messages. Alot of this
complexity is handled in slicc and the specification files simply
annotate the in_ports.
--HG--
rename : src/mem/slicc/ast/CheckAllocateStatementAST.py => src/mem/slicc/ast/StallAndWaitStatementAST.py
rename : src/mem/slicc/ast/CheckAllocateStatementAST.py => src/mem/slicc/ast/WakeUpDependentsStatementAST.py
Patch allows each individual message buffer to have different recycle latencies
and allows the overall recycle latency to be specified at the cmd line. The
patch also adds profiling info to make sure no one processor's requests are
recycled too much.
This patch allows one to disable migratory sharing for those cache blocks that
are accessed by atomic requests. While the implementations are different
between the token and hammer protocols, the motivation is the same. For
Alpha, LLSC semantics expect that normal loads do not unlock cache blocks that
have been locked by LL accesses. Therefore, locked blocks should not transfer
write permissions when responding to these load requests. Instead, only they
only transfer read permissions so that the subsequent SC access can possibly
succeed.
This patch fixes several bugs related to previous inconsistent assumptions on
how many tokens the Owner had. Mike Marty should have fixes these bugs years
ago. :)
Previously, the MOESI_hammer protocol calculated the same latency for L1 and
L2 hits. This was because the protocol was written using the old ruby
assumption that L1 hits used the sequencer fast path. Since ruby no longer
uses the fast-path, the protocol delays L2 hits by placing them on the
trigger queue.
The previous slower ruby latencies created a mismatch between the faster M5
cpu models and the much slower ruby memory system. Specifically smp
interrupts were much slower and infrequent, as well as cpus moving in and out
of spin locks. The result was many cpus were idle for large periods of time.
These changes fix the latency mismatch.
This patch adds back to ruby the capability to understand the response time
for messages that hit in different levels of the cache heirarchy.
Specifically add support for the MI_example, MOESI_hammer, and MOESI_CMP_token
protocols.
In addition to obvious changes, this required a slight change to the slicc
grammar to allow types with :: in them. Otherwise slicc barfs on std::string
which we need for the headers that slicc generates.
This patch includes the necessary regression updates to test the new ruby
configuration system. The patch includes support for multiple ruby protocols
and adds the ruby random tester. The patch removes atomic mode test for
ruby since ruby does not support atomic mode acceses. These tests can be
added back in when ruby supports atomic mode for real.
--HG--
rename : tests/quick/50.memtest/test.py => tests/quick/60.rubytest/test.py
Renamed the MESI directory file to be consistent with all other protocols.
--HG--
rename : src/mem/protocol/MESI_CMP_directory-mem.sm => src/mem/protocol/MESI_CMP_directory-dir.sm