This patch adds a predecessor field to the SenderState base class to
make the process of linking them up more uniform, and enable a
traversal of the stack without knowing the specific type of the
subclasses.
There are a number of simplifications done as part of changing the
SenderState, particularly in the RubyTest.
This patch removes the NACK frrom the packet as there is no longer any
module in the system that issues them (the bridge was the only one and
the previous patch removes that).
The handling of NACKs was mostly avoided throughout the code base, by
using e.g. panic or assert false, but in a few locations the NACKs
were actually dealt with (although NACKs never occured in any of the
regressions). Most notably, the DMA port will now never receive a NACK
and the backoff time is thus never changed. As a consequence, the
entire backoff mechanism (similar to a PCI bus) is now removed and the
DMA port entirely relies on the bus performing the arbitration and
issuing a retry when appropriate. This is more in line with e.g. PCIe.
Surprisingly, this patch has no impact on any of the regressions. As
mentioned in the patch that removes the NACK from the bridge, a
follow-up patch should change the request and response buffer size for
at least one regression to also verify that the system behaves as
expected when the bridge fills up.
While FastAlloc provides a small performance increase (~1.5%) over regular malloc it isn't thread safe.
After removing FastAlloc and using tcmalloc I've seen a performance increase of 12% over libc malloc
when running twolf for ARM.
This patch removes the Packet::NodeID typedef and unifies it with the
Port::PortId. The src and dest fields in the packet are used to hold a
port id (e.g. in the bus), and thus the two should actually be the
same.
The typedef PortID is now global (in base/types.hh) and aligned with
the ThreadID in terms of capitalisation and naming of the
InvalidPortID constant.
Before this patch, two flags were used for valid destination and
source, rather than relying on a named value (InvalidPortID), and
this is now redundant, as the src and dest field themselves are
sufficient to tell whether the current value is a valid port
identifier or not. Consequently, the VALID_SRC and VALID_DST are
removed.
As part of the cleaning up, a number of int parameters and local
variables are updated to use PortID.
Note that Ruby still has its own NodeID typedef. Furthermore, the
MemObject getMaster/SlavePort still has an int idx parameter with a
default value of -1 which should eventually change to PortID idx =
InvalidPortID.
This patch updates the comments for the src and dest fields to reflect
their actual use. Due to a number of patches (e.g. removing the
Broadcast flag), the old comments are no longer indicative of the
current usage.
This patch removes unused commands and attributes from the packet to
avoid any confusion. It is part of an effort to clear up how and where
different commands and attributes are used.
This patch simplifies the packet by removing the broadcast flag and
instead more firmly relying on (and enforcing) the semantics of
transactions in the classic memory system, i.e. request packets are
routed from a master to a slave based on the address, and when they
are created they have neither a valid source, nor destination. On
their way to the slave, the request packet is updated with a source
field for all modules that multiplex packets from multiple master
(e.g. a bus). When a request packet is turned into a response packet
(at the final slave), it moves the potentially populated source field
to the destination field, and the response packet is routed through
any multiplexing components back to the master based on the
destination field.
Modules that connect multiplexing components, such as caches and
bridges store any existing source and destination field in the sender
state as a stack (just as before).
The packet constructor is simplified in that there is no longer a need
to pass the Packet::Broadcast as the destination (this was always the
case for the classic memory system). In the case of Ruby, rather than
using the parameter to the constructor we now rely on setDest, as
there is already another three-argument constructor in the packet
class.
In many places where the packet information was printed as part of
DPRINTFs, request packets would be printed with a numeric "dest" that
would always be -1 (Broadcast) and that field is now removed from the
printing.
This patch adds the necessary flags to the SConstruct and SConscript
files for compiling using clang 2.9 and later (on Ubuntu et al and OSX
XCode 4.2), and also cleans up a bunch of compiler warnings found by
clang. Most of the warnings are related to hidden virtual functions,
comparisons with unsigneds >= 0, and if-statements with empty
bodies. A number of mismatches between struct and class are also
fixed. clang 2.8 is not working as it has problems with class names
that occur in multiple namespaces (e.g. Statistics in
kernel_stats.hh).
clang has a bug (http://llvm.org/bugs/show_bug.cgi?id=7247) which
causes confusion between the container std::set and the function
Packet::set, and this is currently addressed by not including the
entire namespace std, but rather selecting e.g. "using std::vector" in
the appropriate places.
This command will be sent from the memory system (Ruby) to the LSQ of
an O3 CPU so that the LSQ, if it needs to, invalidates the address in
the request packet.
This adds the derived class FunctionalPacket to fix a long standing
deficiency in the Packet class where it was unable to handle finding data to
partially satisfy a functional access. Made this a derived class as
functional accesses are used only in certain contexts and to not add any
additional overhead to the existing Packet class.
This patch rpovides functional access support in Ruby. Currently only
the M5Port of RubyPort supports functional accesses. The support for
functional through the PioPort will be added as a separate patch.
The packet now identifies whether static or dynamic data has been allocated and
is used by Ruby to determine whehter to copy the data pointer into the ruby
request. Subsequently, Ruby can be told not to update phys memory when
receiving packets.
If we write back an exclusive copy, we now mark it
as such, so the cache receiving the writeback can
mark its copy as exclusive. This avoids some
unnecessary upgrade requests when a cache later
tries to re-acquire exclusive access to the block.
Corrects an oversight in cset f97b62be544f. The fix there only
failed queued SCUpgradeReq packets that encountered an
invalidation, which meant that the upgrade had to reach the L2
cache. To handle pending requests in the L1 we must similarly
fail StoreCondReq packets too.
Requires new "SCUpgradeReq" message that marks upgrades
for store conditionals, so downstream caches can fail
these when they run into invalidations.
See http://www.m5sim.org/flyspray/task/197
Only set the dirty bit when we actually write to a block
(not if we thought we might but didn't, as in a failed
SC or CAS). This requires makeing sure the dirty bit
stays set when we get an exclusive (writable) copy
in a cache-to-cache transfer from another owner, which
n turn requires copying the mem-inhibit flag from
timing-mode requests to their associated responses.
This frees up needed space for more public flags. Also:
- remove unused Request accessor methods
- make Packet use public Request accessors, so it need not be a friend
I did some of the flags and assertions wrong. Thanks to Brad Beckmann
for pointing this out. I should have run the opt regressions instead
of the fast. I also screwed up some of the logical functions in the Flags
class.
where we defer a response to a read from a far-away cache A, then later
defer a ReadExcl from a cache B on the same bus as us. We'll assert
MemInhibit in both cases, but in the latter case MemInhibit will keep
the invalidation from reaching cache A. This special response tells
cache A that it gets the block to satisfy its read, but must immediately
invalidate it.
--HG--
extra : convert_revision : f85c8b47bb30232da37ac861b50a6539dc81161b
Turns out DeferredSnoop isn't quite the right bit of info
we needed... see new comment in cache_impl.hh.
--HG--
extra : convert_revision : a38de8c1677a37acafb743b7074ef88b21d3b7be
- Add "deferred snoop" flag to Packet so upper-level caches
can distinguish whether lower-level cache request was
in-service or not at the time of the original snoop.
- Revamp response handling to properly handle deferred snoops
on non-cache-fill requests (i.e. upgrades).
- Make sure forwarded writebacks are kept in write buffer at
lower-level caches so they get snooped properly.
--HG--
extra : convert_revision : 17f8a3772a1ae31a16991a53f8225ddf54d31fc9
src/cpu/memtest/memtest.cc:
Need to set packet source field so that response from cache
doesn't run into assertion failure when copying source to dest.
src/mem/packet.hh:
Copy source field when copying packets.
Assert that source is valid before copying it to dest
when turning packets around.
--HG--
extra : convert_revision : 09e3cfda424aa89fe170e21e955b295746832bf8
timing mode still broken.
configs/example/memtest.py:
Revamp options.
src/cpu/memtest/memtest.cc:
No need for memory initialization.
No need to make atomic response... memory system should do that now.
src/cpu/memtest/memtest.hh:
MemTest really doesn't want to snoop.
src/mem/bridge.cc:
checkFunctional() cleanup.
src/mem/bus.cc:
src/mem/bus.hh:
src/mem/cache/base_cache.cc:
src/mem/cache/base_cache.hh:
src/mem/cache/cache.cc:
src/mem/cache/cache.hh:
src/mem/cache/cache_blk.hh:
src/mem/cache/cache_builder.cc:
src/mem/cache/cache_impl.hh:
src/mem/cache/coherence/coherence_protocol.cc:
src/mem/cache/coherence/coherence_protocol.hh:
src/mem/cache/coherence/simple_coherence.hh:
src/mem/cache/miss/SConscript:
src/mem/cache/miss/mshr.cc:
src/mem/cache/miss/mshr.hh:
src/mem/cache/miss/mshr_queue.cc:
src/mem/cache/miss/mshr_queue.hh:
src/mem/cache/prefetch/base_prefetcher.cc:
src/mem/cache/tags/fa_lru.cc:
src/mem/cache/tags/fa_lru.hh:
src/mem/cache/tags/iic.cc:
src/mem/cache/tags/iic.hh:
src/mem/cache/tags/lru.cc:
src/mem/cache/tags/lru.hh:
src/mem/cache/tags/split.cc:
src/mem/cache/tags/split.hh:
src/mem/cache/tags/split_lifo.cc:
src/mem/cache/tags/split_lifo.hh:
src/mem/cache/tags/split_lru.cc:
src/mem/cache/tags/split_lru.hh:
src/mem/packet.cc:
src/mem/packet.hh:
src/mem/physical.cc:
src/mem/physical.hh:
src/mem/tport.cc:
More major reorg. Seems to work for atomic mode now,
timing mode still broken.
--HG--
extra : convert_revision : 7e70dfc4a752393b911880ff028271433855ae87
src/mem/cache/cache_impl.hh:
src/mem/cache/coherence/simple_coherence.hh:
Get rid of old invalidate propagation logic in preparation
for new multilevel snoop protocol.
src/mem/cache/coherence/coherence_protocol.cc:
L2 cache now has protocol, so protocol must handle ReadExReq
coming in from the CPU side.
src/mem/cache/miss/mshr_queue.cc:
Assertion is failing, so let's take it out for now.
src/mem/packet.cc:
src/mem/packet.hh:
Add WritebackAck command.
Reorganize enum to put responses next to corresponding requests.
Get rid of unused WriteReqNoAck.
--HG--
extra : convert_revision : 24c519846d161978123f9aa029ae358a41546c73
Compiles but doesn't work... committing just so I can merge
(stupid bk!).
src/mem/bridge.cc:
Get rid of SNOOP_COMMIT.
src/mem/bus.cc:
src/mem/packet.hh:
Get rid of SNOOP_COMMIT & two-pass snoop.
First bits of EXPRESS_SNOOP support.
src/mem/cache/base_cache.cc:
src/mem/cache/base_cache.hh:
src/mem/cache/cache.hh:
src/mem/cache/cache_impl.hh:
src/mem/cache/miss/blocking_buffer.cc:
src/mem/cache/miss/miss_queue.cc:
src/mem/cache/prefetch/base_prefetcher.cc:
Big reorg of ports and port-related functions & events.
src/mem/cache/cache.cc:
src/mem/cache/cache_builder.cc:
src/mem/cache/coherence/SConscript:
Get rid of UniCoherence object.
--HG--
extra : convert_revision : 7672434fa3115c9b1c94686f497e57e90413b7c3
directly configured by python. Move stuff from root.(cc|hh) to
core.(cc|hh) since it really belogs there now.
In the process, simplify how ticks are used in the python code.
--HG--
extra : convert_revision : cf82ee1ea20f9343924f30bacc2a38d4edee8df3
Add support for a twin 64 bit int load
Add Memory barrier and write barrier flags as appropriate
Make atomic memory ops atomic
src/arch/alpha/isa/mem.isa:
src/arch/alpha/locked_mem.hh:
src/cpu/base_dyn_inst.hh:
src/mem/cache/cache_blk.hh:
src/mem/cache/cache_impl.hh:
rename store conditional stuff as extra data so it can be used for conditional swaps as well
src/arch/alpha/types.hh:
src/arch/mips/types.hh:
src/arch/sparc/types.hh:
add a largest read data type for statically allocating read buffers in atomic simple cpu
src/arch/isa_parser.py:
Add support for a twin 64 bit int load
src/arch/sparc/isa/decoder.isa:
Make atomic memory ops atomic
Add Memory barrier and write barrier flags as appropriate
src/arch/sparc/isa/formats/mem/basicmem.isa:
add post access code block and define a twinload format for twin loads
src/arch/sparc/isa/formats/mem/blockmem.isa:
remove old microcoded twin load coad
src/arch/sparc/isa/formats/mem/mem.isa:
swap.isa replaces the code in loadstore.isa
src/arch/sparc/isa/formats/mem/util.isa:
add a post access code block
src/arch/sparc/isa/includes.isa:
need bigint.hh for Twin64_t
src/arch/sparc/isa/operands.isa:
add a twin 64 int type
src/cpu/simple/atomic.cc:
src/cpu/simple/atomic.hh:
src/cpu/simple/base.hh:
src/cpu/simple/timing.cc:
add support for twinloads
add support for swap and conditional swap instructions
rename store conditional stuff as extra data so it can be used for conditional swaps as well
src/mem/packet.cc:
src/mem/packet.hh:
Add support for atomic swap memory commands
src/mem/packet_access.hh:
Add endian conversion function for Twin64_t type
src/mem/physical.cc:
src/mem/physical.hh:
src/mem/request.hh:
Add support for atomic swap memory commands
Rename sc code to extradata
--HG--
extra : convert_revision : 69d908512fb34a4e28b29a6e58b807fb1a6b1656
Created MemCmd class to wrap enum and provide handy methods to
check attributes, convert to string/int, etc.
--HG--
extra : convert_revision : 57f147ad893443e3a2040c6d5b4cdb1a8033930b
SConstruct:
src/SConscript:
Add flags for Intel CC while i'm at it
src/base/compiler.hh:
the _Pragma stuff needst to be called this way unless someone happens to have a cleaner way
src/base/cprintf_formats.hh:
add std:: where appropriate
src/base/statistics.hh:
use this->map since icc was getting confused about std::map vs the locally defined map
src/cpu/static_inst.hh:
Add some more dummy returns where needed
src/mem/packet.hh:
add more dummy returns where needed
src/sim/host.hh:
use limits to come up with max tick
--HG--
extra : convert_revision : 08e9f7898b29fb9d063136529afb9b6abceab60c
src/mem/bus.cc:
Make it so that invalidates being sent from the responder up don't call the responder
but they should also not Panic.
src/mem/packet.hh:
If we don't have data in the packet, don't call deleteData:
Example: InvalidateRequests never have data.
--HG--
extra : convert_revision : 18766bc9f3bb4d852ac651d094254d347abd1634
src/mem/cache/base_cache.cc:
On a delayed response, be sure to call the fixPacket wrapper to toggle hasData flag.
src/mem/packet.cc:
src/mem/packet.hh:
Create a wrapper to toggle the hasData flag on delayed responses
--HG--
extra : convert_revision : 1ced8d4e3dc12a059fb7636d59e429cd3dd46901
not necessarily 100% there yet.
src/mem/cache/cache_impl.hh:
Generate response packet on failed store conditional.
src/mem/packet.hh:
Clear packet flags when reinitializing.
(SATISFIED in particular is one we don't want to leave set.)
--HG--
extra : convert_revision : 29207c8a09afcbce43f41c480ad0c1b21d47454f
Fix CSHR's for flow control.
Fix for Bus Bridges reusing packets (clean flags up)
Now both timing/atomic caches with MOESI in UP fail at same point.
src/dev/io_device.hh:
DMA's should send WriteInvalidates
src/mem/bridge.cc:
Reusing packet, clean flags in the packet set by bus.
src/mem/cache/base_cache.cc:
src/mem/cache/base_cache.hh:
src/mem/cache/cache.hh:
src/mem/cache/cache_impl.hh:
src/mem/cache/coherence/simple_coherence.hh:
src/mem/cache/coherence/uni_coherence.cc:
src/mem/cache/coherence/uni_coherence.hh:
Fix CSHR's for flow control.
src/mem/packet.hh:
Make a writeInvalidateResp, since the DMA expects responses to it's writes
--HG--
extra : convert_revision : 59fd6658bcc0d076f4b143169caca946472a86cd
implement fix packet and add the ability to print a packet to a ostream
remove tabs in packet.hh (Could people stop inserting them??!?!?!)
mark const functions in packet.hh as such
src/base/traceflags.py:
add a traceflag for functional accesses
src/mem/packet.cc:
implement fix packet and add the ability to print a packet to a ostream
src/mem/packet.hh:
add the ability to print a packet to an ostream
remove tabs in file
mark const functions as such
--HG--
extra : convert_revision : 4297bce5e1d3abbab48be5bd9eb9e982b751fc7c
into zazzer.eecs.umich.edu:/z/rdreslin/m5bk/newmemcleanest
src/mem/packet.hh:
Hand merge code
--HG--
extra : convert_revision : d659418f24f4f4bf9867fec8573a5d227c0dfcea
src/mem/cache/base_cache.cc:
Fix a bug about not having a request to send
src/mem/cache/base_cache.hh:
Fix a bug with the blocking code
src/mem/cache/cache.hh:
AFix a bug with snoop hits in WB buffer
src/mem/cache/cache_impl.hh:
Fix a bug with snoop hits in WB buffer
Also, add better DPRINTF's
src/mem/cache/miss/miss_queue.cc:
Fix a bug with upgrades (Need to clean it up later)
src/mem/cache/miss/mshr.cc:
Fix a memory leak bug, still some outstanding with writebacks not being deleted
src/mem/cache/miss/mshr_queue.cc:
Fix a bug about upgrades (need to clean up later)
src/mem/packet.hh:
Fix for newly added cmd attribute for upgrades
tests/configs/memtest.py:
More interesting testcase
--HG--
extra : convert_revision : fcb4f17dd58b537bb4f67a8c835f50e455e8c688
src/mem/bus.cc:
Put back the check to see if the bus is busy. Also, populate the fields in the packet to indicate when the first word and the entire packet will be delivered.
src/mem/bus.hh:
Remove the occupyBus function.
src/mem/packet.hh:
Added fields to the packet to indicate when the first chunk of a packet arrives, and when the entire packet arrives.
--HG--
extra : convert_revision : cfc7670a33913d48a04d02c6d2448290a51f2d3c
Not fully implemented yet, but good enough for single level cache coherence
src/mem/packet.hh:
Add a bit to distinguish invalidates and upgrades
--HG--
extra : convert_revision : 5bf50d535857cea37fbdaf7993915d1332cb757e
Remove some dead code.
src/mem/cache/cache_impl.hh:
Upgrades don't need a response.
Moved satisfied check into bus so removed some dead code.
src/mem/cache/coherence/coherence_protocol.cc:
src/mem/packet.hh:
Upgrades don't require a response
--HG--
extra : convert_revision : dee0440ff19ba4c9e51bf9a47a5b0991265cfc1d
And small other tweaks to snooping coherence.
src/mem/cache/base_cache.hh:
Make timing response at the time of send.
src/mem/cache/cache.hh:
src/mem/cache/cache_impl.hh:
Update probe interface to be bi-directional for functional accesses
src/mem/packet.hh:
Add the function to create an atomic response to a given request
--HG--
extra : convert_revision : 04075a117cf30a7df16e6d3ce485543cc77d4ca6
Working on pulling out the changes in the cache so that it remains working.
src/mem/bus.cc:
Changes for multi-phase snoop
Some code for registering snoop ranges (a version that compiles and runs, but does nothing)
src/mem/bus.hh:
Changes for multi-phase snoop
src/mem/packet.hh:
Flag for multi-phase snoop
src/mem/port.hh:
Status for multi-phase snoop
--HG--
extra : convert_revision : 4c2e5263bba16e3bcf03aabe36ff45ec36de4720
src/mem/cache/base_cache.cc:
Add in retry path for blocking with multi-level caches
src/mem/cache/base_cache.hh:
Pull more of the blocking fixes into head
src/mem/packet.hh:
Fix typo
--HG--
extra : convert_revision : d4d149adfa414136ebd2c4789b739bb065710f7a
Nate needs to fix sinic builder stuff
Gabe needs to verify my fixes to decoder.isa
OPT/DEBUG compiles for ALPHA_FS, ALPHA_SE, MIPS_SE, SPARC_SE with this changeset
README:
Fix the swig version in the readme
src/SConscript:
remove sinic until nate fixes the builder crap for it
src/arch/alpha/system.hh:
src/arch/mips/isa/includes.isa:
src/arch/sparc/isa/decoder.isa:
src/base/stats/visit.cc:
src/base/timebuf.hh:
src/dev/ide_disk.cc:
src/dev/sinic.cc:
src/mem/cache/miss/mshr.cc:
src/mem/cache/miss/mshr_queue.cc:
src/mem/packet.hh:
src/mem/request.hh:
src/sim/builder.hh:
src/sim/system.hh:
fixes for gcc 4.1
--HG--
extra : convert_revision : 3775427c0047b282574d4831dd602c96cac3ba17