This constant is currently in System.hh, but is only used in Set.hh. It
is being moved to Set.hh to remove this artificial dependence of Set.hh
on System.hh.
--HG--
extra : rebase_source : 683c43a5eeaec4f5f523b3ea32953a07f65cfee7
This patch adds and removes included files from some of the files so as to
organize remove some false dependencies and include some files directly
instead of transitively.
--HG--
extra : rebase_source : 09b482ee9ae00b3a204ace0c63550bc3ca220134
SLICC uses pointers for cache and TBE entries but not for directory entries.
This patch changes the protocols, SLICC and Ruby memory system so that even
directory entries are referenced using pointers.
--HG--
extra : rebase_source : abeb4ac78033d003153751f216fd1948251fcfad
This patch changes the implementation of Ruby's recvTiming() function so
that it pushes a packet in to the Sequencer instead of a RubyRequest. This
requires changes in the Sequencer's makeRequest() and issueRequest()
functions, as they also need to operate on a Packet instead of RubyRequest.
This patch adds a fault model, which provides the probability of a number of
architectural faults in the interconnection network (e.g., data corruption,
misrouting). These probabilities can be used to realistically inject faults
in GARNET and faithfully evaluate the effectiveness of novel resilient NoC
architectures.
This patch removes some of the unused typedefs. It also moves
some of the typedefs from Global.hh to TypeDefines.hh. The patch
also eliminates the file NodeID.hh.
In RubySlicc_ComponentMapping.hh, certain '#define's have been used for
mapping MachineType to GenericMachineType. These '#define's are being
eliminated and the code will now be generated by SLICC instead. Also
are being eliminated some of the unused functions from
RubySlicc_ComponentMapping.sm.
Initialize flags via the Event constructor instead of calling
setFlags() in the body of the derived class's constructor. I
forget exactly why, but this made life easier when implementing
multi-queue support.
Also rename Event::getFlags() to isFlagSet() to better match
common usage, and get rid of some unused Event methods.
In the current implementation of Functional Accesses, it's very hard to
implement broadcast or snooping protocols where the memory has no idea if it
has exclusive access to a cache block or not. Without this knowledge, making
sure the RW vs. RO permissions are right are next to impossible. So we add a
new state called Backing_Store to enable the conveyance that this is the backup
storage for a block, so that it can be written if it is the only possibly RW
block in the system, or written even if there is another RW block in the
system, without causing problems.
Also, a small change to actually set the m_name field for each Controller so
that debugging can be easier. Now you can access a controller's name just by
controller->getName().
This patch replaces RUBY with PROTOCOL in all the SConscript files as
the environment variable that decides whether or not certain components
of the simulator are compiled.
This patch rpovides functional access support in Ruby. Currently only
the M5Port of RubyPort supports functional accesses. The support for
functional through the PioPort will be added as a separate patch.
The code for Set class was written under the assumption that
std::numeric_limits<long>::digits returns the number of bits used for
data type long, which was presumed to be either 32 or 64. But return value
is actually one less, that is, it is either 31 or 63. The value is now
being incremented by 1 so as to correctly set it.
The access permissions for the directory entries are not being set correctly.
This is because pointers are not used for handling directory entries.
function. get and set functions for access permissions have been added to the
Controller state machine. The changePermission() function provided by the
AbstractEntry and AbstractCacheEntry classes has been exposed to SLICC
code once again. The set_permission() functionality has been removed.
NOTE: Each protocol will have to define these get and set functions in order
to compile successfully.
Re-enabling implicit parenting (see previous patch) causes current
Ruby config scripts to create some strange hierarchies and generate
several warnings. This patch makes three general changes to address
these issues.
1. The order of object creation in the ruby config files makes the L1
caches children of the sequencer rather than the controller; these
config ciles are rewritten to assign the L1 caches to the
controller first.
2. The assignment of the sequencer list to system.ruby.cpu_ruby_ports
causes the sequencers to be children of system.ruby, generating
warnings because they are already parented to their respective
controllers. Changing this attribute to _cpu_ruby_ports fixes this
because the leading underscore means this is now treated as a plain
Python attribute rather than a child assignment. As a result, the
configuration hierarchy changes such that, e.g.,
system.ruby.cpu_ruby_ports0 becomes system.l1_cntrl0.sequencer.
3. In the topology classes, the routers become children of some random
internal link node rather than direct children of the topology.
The topology classes are rewritten to assign the routers to the
topology object first.
The virtual channels within "response" vnets are made buffers_per_data_vc
deep (default=4), while virtual channels within other vnets are made
buffers_per_ctrl_vc deep (default = 1). This is for accurate power estimates.
Identifying response vnets versus other vnets will allow garnet to
determine which vnets will carry data packets, and which will carry
ctrl packets, and use appropriate buffer sizes (since data packets are larger
than ctrl packets). This in turn allows the orion power model to accurately
estimate buffer power.
Renamed (message) class to vnet for consistency with rest of ruby.
Moved some parameters specific to fixed/flexible garnet networks into their
corresponding py files.
The RubyMemory flag wasnt used in the code, creating large gaps in trace output. Replace cprintfs w/dprintfs
using RubyMemory in memory controller. DPRINTF also deprecate the usage of the setDebug() pure virtual
function in the AbstractMemoryOrCache Class as well the m_debug/cprintf functions in MemoryControl.hh/cc
The simple network's endpoint bandwidth value is used to adjust the overall
bandwidth of the network. Specifically, the ration between endpoint bandwidth
and the MESSAGE_SIZE_MULTIPLIER determines the increase. By setting the value
to 1000, that means the bandwdith factor specified in the links translates to
the link bandwidth in bytes. Previously, it was increasing that value by 10.
This patch will likely require a reset of the ruby regression tester stats.
Moved the buffer_size, endpoint_bandwidth, and adaptive_routing params out of
the top-level parent network object and to only those networks that actually
use those parameters.
This patch ensures that both Garnet and the simple networks use the bw value
specified in the topology. To do so, the patch generalizes the specification
of bw for basic links. This value is then translated to the specific value
used by the simple and Garnet networks. Since Garent does not support
non-uniformed link bandwidth, the patch also adds a check to ensure all bws are
equal.
--HG--
rename : src/mem/ruby/network/BasicLink.cc => src/mem/ruby/network/simple/SimpleLink.cc
rename : src/mem/ruby/network/BasicLink.hh => src/mem/ruby/network/simple/SimpleLink.hh
rename : src/mem/ruby/network/BasicLink.py => src/mem/ruby/network/simple/SimpleLink.py
This patch converts links and switches from second class simobjects that were
virtually ignored by the networks (both simple and Garnet) to first class
simobjects that directly correspond to c++ ojbects manipulated by the
topology and network classes. This is especially true for Garnet, where the
links and switches directly correspond to specific C++ objects.
By making this change, many aspects of the Topology class were simplified.
--HG--
rename : src/mem/ruby/network/Network.cc => src/mem/ruby/network/BasicLink.cc
rename : src/mem/ruby/network/Network.hh => src/mem/ruby/network/BasicLink.hh
rename : src/mem/ruby/network/Network.cc => src/mem/ruby/network/garnet/fixed-pipeline/GarnetLink_d.cc
rename : src/mem/ruby/network/Network.hh => src/mem/ruby/network/garnet/fixed-pipeline/GarnetLink_d.hh
rename : src/mem/ruby/network/garnet/fixed-pipeline/GarnetNetwork_d.py => src/mem/ruby/network/garnet/fixed-pipeline/GarnetLink_d.py
rename : src/mem/ruby/network/garnet/fixed-pipeline/GarnetNetwork_d.py => src/mem/ruby/network/garnet/fixed-pipeline/GarnetRouter_d.py
rename : src/mem/ruby/network/Network.cc => src/mem/ruby/network/garnet/flexible-pipeline/GarnetLink.cc
rename : src/mem/ruby/network/Network.hh => src/mem/ruby/network/garnet/flexible-pipeline/GarnetLink.hh
rename : src/mem/ruby/network/garnet/fixed-pipeline/GarnetNetwork_d.py => src/mem/ruby/network/garnet/flexible-pipeline/GarnetLink.py
rename : src/mem/ruby/network/garnet/fixed-pipeline/GarnetNetwork_d.py => src/mem/ruby/network/garnet/flexible-pipeline/GarnetRouter.py
Moved the Topology class to the top network directory because it is shared by
both the simple and Garnet networks.
--HG--
rename : src/mem/ruby/network/simple/Topology.cc => src/mem/ruby/network/Topology.cc
rename : src/mem/ruby/network/simple/Topology.hh => src/mem/ruby/network/Topology.hh
At the same time, rename the trace flags to debug flags since they
have broader usage than simply tracing. This means that
--trace-flags is now --debug-flags and --trace-help is now --debug-help
This function duplicates the functionality of allocate() exactly, except that it does not return
a return value. In protocols where you just want to allocate a block
but do not want that block to be your implicitly passed cache_entry, use this function.
Otherwise, SLICC will complain if you do not consume the pointer returned by allocate(),
and if you do a dummy assignment Entry foo := cache.allocate(address), the C++
compiler will complain of an unused variable. This is kind of a hack to get around
those issues, but suggestions welcome.
This is a substitute for MessageBuffers between controllers where you don't
want messages to actually go through the Network, because requests/responses can
always get reordered wrt to one another (even if you turn off Randomization and turn on Ordered)
because you are, after all, going through a network with contention. For systems where you model
multiple controllers that are very tightly coupled and do not actually go through a network,
it is a pain to have to write a coherence protocol to account for mixed up request/response orderings
despite the fact that it's completely unrealistic. This is *not* meant as a substitute for real
MessageBuffers when messages do in fact go over a network.
It is useful for Ruby to understand from whence request packets came.
This has all request packets going into Ruby pass the contextId value, if
it exists. This supplants the old libruby proc_id value passed around in
all the Messages, so I've also removed the unused unsigned proc_id; member
generated by SLICC for all Message types.
The goal of the patch is to do away with the CacheMsg class currently in use
in coherence protocols. In place of CacheMsg, the RubyRequest class will used.
This class is already present in slicc_interface/RubyRequest.hh. In fact,
objects of class CacheMsg are generated by copying values from a RubyRequest
object.
The tester code is in testers/networktest.
The tester can be invoked by configs/example/ruby_network_test.py.
A dummy coherence protocol called Network_test is also addded for network-only simulations and testing. The protocol takes in messages from the tester and just pushes them into the network in the appropriate vnet, without storing any state.
This patch fixes the problem where Ruby would fail to call sendRetry on ports
after it nacked the port. This patch is particularly helpful for bursty dma
requests which often include several packets.
None of the code in the ruby tester directory is compiled or referred to
outside of that directory. This change eliminates it. If it's needed in the
future, it can be revived from the history. In the mean time, this removes
clutter and the only use of the GEMS_ROOT scons variable.
At a couple of places in PerfectSwitch.cc and MessageBuffer.cc, DPRINTF()
has not been provided with correct number of arguments. The patch fixes these
bugs.
This patch removes the store buffer from Ruby. It is not in use currently.
Since libruby is being and store buffer makes calls to libruby, it is not
possible to maintain it until substantial changes are made.
This patch changes Address.hh so that it is not dependent on RubySystem.
This dependence seems unecessary. All those functions that depend on
RubySystem have been moved to Address.cc file.
This patch changes DataBlock.hh so that it is not dependent on RubySystem.
This dependence seems unecessary. All those functions that depende on
RubySystem have been moved to DataBlock.cc file.
This patch integrates permissions with cache and memory states, and then
automates the setting of permissions within the generated code. No longer
does one need to manually set the permissions within the setState funciton.
This patch will faciliate easier functional access support by always correctly
setting permissions for both cache and memory states.
--HG--
rename : src/mem/slicc/ast/EnumDeclAST.py => src/mem/slicc/ast/StateDeclAST.py
rename : src/mem/slicc/ast/TypeFieldEnumAST.py => src/mem/slicc/ast/TypeFieldStateAST.py
Overall, continue to progress Ruby debug messages to more of the normal M5
debug message style
- add a name() to the Ruby Throttle & PerfectSwitch objects so that the debug output
isn't littered w/"global:" everywhere.
- clean up messages that print over multiple lines when possible
- clean up duplicate prints in the message buffer
Currently the wakeup function for the PerfectSwitch contains three loops -
loop on number of virtual networks
loop on number of incoming links
loop till all messages for this (link, network) have been routed
With an 8 processor mesh network and Hammer protocol, about 11-12% of the
was observed to have been spent in this function, which is the highest
amongst all the functions. It was found that the innermost loop is executed
about 45 times per invocation of the wakeup function, when each invocation
of the wakeup function processes just about one message.
The patch tries to do away with the redundant executions of the innermost
loop. Counters have been added for each virtual network that record the
number of messages that need to be routed for that virtual network. The
inner loops are only executed when the number of messages for that particular
virtual network > 0. This does away with almost 80% of the executions of the
innermost loop. The function now consumes about 5-6% of the total execution
time.
The code for Orion 2.0 makes use of printf() at several places where there as
an error in configuration of the model. These have been replaced with fatal().
By stalling and waiting the mandatory queue instead of recycling it, one can
ensure that no incoming messages are starved when the mandatory queue puts
signficant of pressure on the L1 cache controller (i.e. the ruby memtester).
--HG--
rename : src/mem/slicc/ast/WakeUpDependentsStatementAST.py => src/mem/slicc/ast/WakeUpAllDependentsStatementAST.py
The packet now identifies whether static or dynamic data has been allocated and
is used by Ruby to determine whehter to copy the data pointer into the ruby
request. Subsequently, Ruby can be told not to update phys memory when
receiving packets.
Separate data VCs and ctrl VCs in garnet, as ctrl VCs have 1 buffer per VC,
while data VCs have > 1 buffers per VC. This is for correct power estimations.
The purpose of this patch is to change the way CacheMemory interfaces with
coherence protocols. Currently, whenever a cache controller (defined in the
protocol under consideration) needs to carry out any operation on a cache
block, it looks up the tag hash map and figures out whether or not the block
exists in the cache. In case it does exist, the operation is carried out
(which requires another lookup). As observed through profiling of different
protocols, multiple such lookups take place for a given cache block. It was
noted that the tag lookup takes anything from 10% to 20% of the simulation
time. In order to reduce this time, this patch is being posted.
I have to acknowledge that the many of the thoughts that went in to this
patch belong to Brad.
Changes to CacheMemory, TBETable and AbstractCacheEntry classes:
1. The lookup function belonging to CacheMemory class now returns a pointer
to a cache block entry, instead of a reference. The pointer is NULL in case
the block being looked up is not present in the cache. Similar change has
been carried out in the lookup function of the TBETable class.
2. Function for setting and getting access permission of a cache block have
been moved from CacheMemory class to AbstractCacheEntry class.
3. The allocate function in CacheMemory class now returns pointer to the
allocated cache entry.
Changes to SLICC:
1. Each action now has implicit variables - cache_entry and tbe. cache_entry,
if != NULL, must point to the cache entry for the address on which the action
is being carried out. Similarly, tbe should also point to the transaction
buffer entry of the address on which the action is being carried out.
2. If a cache entry or a transaction buffer entry is passed on as an
argument to a function, it is presumed that a pointer is being passed on.
3. The cache entry and the tbe pointers received __implicitly__ by the
actions, are passed __explicitly__ to the trigger function.
4. While performing an action, set/unset_cache_entry, set/unset_tbe are to
be used for setting / unsetting cache entry and tbe pointers respectively.
5. is_valid() and is_invalid() has been made available for testing whether
a given pointer 'is not NULL' and 'is NULL' respectively.
6. Local variables are now available, but they are assumed to be pointers
always.
7. It is now possible for an object of the derieved class to make calls to
a function defined in the interface.
8. An OOD token has been introduced in SLICC. It is same as the NULL token
used in C/C++. If you are wondering, OOD stands for Out Of Domain.
9. static_cast can now taken an optional parameter that asks for casting the
given variable to a pointer of the given type.
10. Functions can be annotated with 'return_by_pointer=yes' to return a
pointer.
11. StateMachine has two new variables, EntryType and TBEType. EntryType is
set to the type which inherits from 'AbstractCacheEntry'. There can only be
one such type in the machine. TBEType is set to the type for which 'TBE' is
used as the name.
All the protocols have been modified to conform with the new interface.
Ran all the source files through 'perl -pi' with this script:
s|\s*(};?\s*)?/\*\s*(end\s*)?namespace\s*(\S+)\s*\*/(\s*})?|} // namespace $3|;
s|\s*};?\s*//\s*(end\s*)?namespace\s*(\S+)\s*|} // namespace $2\n|;
s|\s*};?\s*//\s*(\S+)\s*namespace\s*|} // namespace $1\n|;
Also did a little manual editing on some of the arch/*/isa_traits.hh files
and src/SConscript.
Two functions in src/mem/ruby/system/PerfectCacheMemory.hh, tryCacheAccess()
and cacheProbe(), end with calls to panic(). Both of these functions have
return type other than void. Any file that includes this header file fails
to compile because of the missing return statement. This patch adds dummy
values so as to avoid the compiler warnings.
This diff is for changing the way ASSERT is handled in Ruby. m5.fast
compiles out the assert statements by using the macro NDEBUG. Ruby uses the
macro RUBY_NO_ASSERT to do so. This macro has been removed and NDEBUG has
been put in its place.
This patch allows messages to be stalled in their input buffers and wait
until a corresponding address changes state. In order to make this work,
all in_ports must be ranked in order of dependence and those in_ports that
may unblock an address, must wake up the stalled messages. Alot of this
complexity is handled in slicc and the specification files simply
annotate the in_ports.
--HG--
rename : src/mem/slicc/ast/CheckAllocateStatementAST.py => src/mem/slicc/ast/StallAndWaitStatementAST.py
rename : src/mem/slicc/ast/CheckAllocateStatementAST.py => src/mem/slicc/ast/WakeUpDependentsStatementAST.py
Patch allows each individual message buffer to have different recycle latencies
and allows the overall recycle latency to be specified at the cmd line. The
patch also adds profiling info to make sure no one processor's requests are
recycled too much.
The main purpose for clearing stats in the unserialize process is so
that the profiler can correctly set its start time to the unserialized
value of curTick.
This patch adds back to ruby the capability to understand the response time
for messages that hit in different levels of the cache heirarchy.
Specifically add support for the MI_example, MOESI_hammer, and MOESI_CMP_token
protocols.
This patch adds DMA testing to the Memtester and is inherits many changes from
Polina's old tester_dma_extension patch. Since Ruby does not work in atomic
mode, the atomic mode options are removed.
One big difference is that PrioHeap puts the smallest element at the
top of the heap, whereas stl puts the largest element on top, so I
changed all comparisons so they did the right thing.
Some usage of PrioHeap was simply changed to a std::vector, using sort
at the right time, other usage had me just use the various heap functions
in the stl.
This was somewhat tricky because the RefCnt API was somewhat odd. The
biggest confusion was that the the RefCnt object's constructor that
took a TYPE& cloned the object. I created an explicit virtual clone()
function for things that took advantage of this version of the
constructor. I was conservative and used clone() when I was in doubt
of whether or not it was necessary. I still think that there are
probably too many instances of clone(), but hopefully not too many.
I converted several instances of const MsgPtr & to a simple MsgPtr.
If the function wants to avoid the overhead of creating another
reference, then it should just use a regular pointer instead of a ref
counting ptr.
There were a couple of instances where refcounted objects were created
on the stack. This seems pretty dangerous since if you ever
accidentally make a reference to that object with a ref counting
pointer, bad things are bound to happen.
Further cleanup should probably be done to make this class be non-Ruby
specific and put it in src/base.
There are probably several cases where this class is used, std::bitset
could be used instead.
In addition to obvious changes, this required a slight change to the slicc
grammar to allow types with :: in them. Otherwise slicc barfs on std::string
which we need for the headers that slicc generates.
Previously, the set size was set to 4. This was mostly do to the fact that a
crazy graduate student use to create networks with 256 l2 cache banks. Now it
is far more likely that users will create systems with less than 64 of any
particular controller type. Therefore Ruby should be optimized for a set size
of 1.