Change all occurrances of Address as a variable name to instead use Addr.
Address is an allowed name in slicc even when Address is also being used as a
type, leading to declarations of "Address Address". While this works, it
prevents adding another field of type Address because the compiler then thinks
Address is a variable name, not type.
Committed by: Nilay Vaish <nilay@cs.wisc.edu>
This moves event and transition count statistics for cache controllers to
gem5's statistics. It does the same for the statistics associated with the
memory controller in ruby.
All the cache/directory/dma controllers individually collect the event and
transition counts. A callback function, collateStats(), has been added that
is invoked on the controller version 0 of each controller class. This
function adds all the individual controller statistics to a vector
variables. All the code for registering the statistical variables and
collating them is generated by SLICC. The patch removes the files
*_Profiler.{cc,hh} and *_ProfileDumper.{cc,hh} which were earlier used for
collecting and dumping statistics respectively.
This patch changes the way cache statistics are collected in ruby.
As of now, there is separate entity called CacheProfiler which holds
statistical variables for caches. The CacheMemory class defines different
functions for accessing the CacheProfiler. These functions are then invoked
in the .sm files. I find this approach opaque and prone to error. Secondly,
we probably should not be paying the cost of a function call for recording
statistics.
Instead, this patch allows for accessing statistical variables in the
.sm files. The collection would become transparent. Secondly, it would happen
in place, so no function calls. The patch also removes the CacheProfiler class.
--HG--
rename : src/mem/slicc/ast/InfixOperatorExprAST.py => src/mem/slicc/ast/OperatorExprAST.py
Due to recent changes to clocking system in Ruby and the way Ruby restores
state from a checkpoint, garnet was failing to run from a checkpointed state.
The problem is that Ruby resets the time to zero while warming up the caches.
If any component records a local copy of the time (read calls curCycle())
before the simulation has started, then that component will not operate until
that time is reached. In the context of this particular patch, the Garnet
Network class calls curCycle() at multiple places. Any non-operational
component can block in requests in the memory system, which the system
interprets as a deadlock. This patch makes changes so that Garnet can
successfully run from checkpointed state.
It adds a globally visible time at which the actual execution started. This
time is initialized in RubySystem::startup() function. This variable is only
meant for components with in Ruby. This replaces the private variable that
was maintained within Garnet since it is not possible to figure out the
correct time when the value of this variable can be set.
The patch also does away with all cases where curCycle() is called with in
some Ruby component before the system has actually started executing. This
is required due to the quirky manner in which ruby restores from a checkpoint.
This patch changes the SimpleTimingPort and RubyPort to panic on
inhibited requests as this should never happen in either of the
cases. The SimpleTimingPort is only used for the I/O devices PIO port
and the DMA devices config port and should thus never see an inhibited
request. Similarly, the SimpleTimingPort is also used for the
MessagePort in x86, and there should also not be any cases where the
port sees an inhibited request.
Previously, nextCycle() could return the *current* cycle if the current tick was
already aligned with the clock edge. This behavior is not only confusing (not
quite what the function name implies), but also caused problems in the
drainResume() function. When exiting/re-entering the sim loop (e.g., to take
checkpoints), the CPUs will drain and resume. Due to the previous behavior of
nextCycle(), the CPU tick events were being rescheduled in the same ticks that
were already processed before draining. This caused divergence from runs that
did not exit/re-entered the sim loop. (Initially a cycle difference, but a
significant impact later on.)
This patch separates out the two behaviors (nextCycle() and clockEdge()),
uses nextCycle() in drainResume, and uses clockEdge() everywhere else.
Nothing (other than name) should change except for the drainResume timing.
When using the o3 or inorder CPUs with many Ruby protocols, the caches may
need to forward invalidations to the CPUs. The RubyPort was instantiating a
packet to be sent to the CPUs to signal the eviction, but the packets were
not being freed by the CPUs. Consistent with the classic memory model, stack
allocate the packet and heap allocate the request so on
ruby_eviction_callback() completion, the packet deconstructor is called, and
deletes the request (*Note: stack allocating the request causes double
deletion, since it will be deleted in the packet destructor). This results in
the least memory allocations without memory errors.
When warming up caches in Ruby, the CacheRecorder sends fetch requests into
Ruby Sequencers with packet types that require responses. Since responses are
never generated for these CacheRecorder requests, the requests are not deleted
in the packet destructor called from the Ruby hit callback. Free the request.
The cache trace variables are array allocated uint8_t* in the RubySystem and
the Ruby CacheRecorder, but the code used delete to free the memory, resulting
in Valgrind memory errors. Change these deletes to delete [] to get rid of the
errors.
A set of patches was recently committed to allow multiple clock domains
in ruby. In those patches, I had inadvertently made an incorrect use of
the clocks. Suppose object A needs to schedule an event on object B. It
was possible that A accesses B's clock to schedule the event. This is not
possible in actual system. Hence, changes are being to the Consumer class
so as to avoid such happenings. Note that in a multi eventq simulation,
this can possibly lead to an incorrect simulation.
There are two functions in the Consumer class that are used for scheduling
events. The first function takes in the relative delay over the current time
as the argument and adds the current time to it for scheduling the event.
The second function takes in the absolute time (in ticks) for scheduling the
event. The first function is now being moved to protected section of the
class so that only objects of the derived classes can use it. All other
objects will have to specify absolute time while scheduling an event
for some consumer.
The histogram for tracking outstanding counts per cycle is maintained
in the profiler. For a parallel implementation of the memory system, we
need that this histogram is maintained locally. Hence it will now be
kept in the sequencer itself. The resulting histograms will be merged
when the stats are printed.
The functional write code was assuming that all writes are block sized,
which may not be true for Ruby Requests. This bug can lead to a buffer
overflow.
Committed by: Nilay Vaish <nilay@cs.wisc.edu>
The MESI CMP directory coherence protocol, while transitioning from SM to IM,
did not invalidate the lock that it might have taken on a cache line. This
patch adds an action for doing so.
The problem was found by Dibakar, but I was not happy with his proposed
solution. So I implemented a different solution.
Committed by: Nilay Vaish <nilay@cs.wisc.edu>
This patch fixes the warnings that clang3.2svn emit due to the "-Wall"
flag. There is one case of an uninitialised value in the ARM neon ISA
description, and then a whole range of unused private fields that are
pruned.
This patch address the most important name shadowing warnings (as
produced when using gcc/clang with -Wshadow). There are many
locations where constructor parameters and function parameters shadow
local variables, but these are left unchanged.
This patch makes the clock member private to the ClockedObject and
forces all children to access it using clockPeriod(). This makes it
impossible to inadvertently change the clock, and also makes it easier
to transition to a situation where the clock is derived from e.g. a
clock domain, or through a multiplier.
This patch adds a predecessor field to the SenderState base class to
make the process of linking them up more uniform, and enable a
traversal of the stack without knowing the specific type of the
subclasses.
There are a number of simplifications done as part of changing the
SenderState, particularly in the RubyTest.
This patch allows ruby to have multiple clock domains. As I understand
with this patch, controllers can have different frequencies. The entire
network needs to run at a single frequency.
The idea is that with in an object, time is treated in terms of cycles.
But the messages that are passed from one entity to another should contain
the time in Ticks. As of now, this is only true for the message buffers,
but not for the links in the network. As I understand the code, all the
entities in different networks (simple, garnet-fixed, garnet-flexible) should
be clocked at the same frequency.
Another problem is that the directory controller has to operate at the same
frequency as the ruby system. This is because the memory controller does
not make use of the Message Buffer, and instead implements a buffer of its
own. So, it has no idea of the frequency at which the directory controller
is operating and uses ruby system's frequency for scheduling events.
This patch is as of now the final patch in the series of patches that replace
Time with Cycles.This patch further replaces Time with Cycles in Sequencer,
Profiler, different protocols and related entities.
Though Time has not been completely removed, the places where it is in use
seem benign as of now.
The patch started of with replacing Time with Cycles in the Consumer class.
But to get ruby to compile, the rest of the changes had to be carried out.
Subsequent patches will further this process, till we completely replace
Time with Cycles.
This patch does several things. First, the counter for fully busy cycles for a
controller is now kept with in the controller, instead of being part of the profiler.
Second, the topology class no longer keeps an array of controllers which was only
used for printing stats. Instead, ruby system will now ask each controller to print
the stats. Thirdly, the statistical variable for recording how many different types
were created is being moved in to the controller from the profiler. Note that for
printing, the profiler will collate results from different controllers.
The number of bits required for an address was set to floorLog2(memory size).
This is correct under the assumption that the memory size is a power of 2,
which is not always true. Hence, floorLog2 is being replaced with ceilLog2.
This patch converts the panic() print outs in the Sequencer::wakeup()
call from ruby cycles to Ticks(). This makes it easier to debug deadlocks
with the ProtocolTrace flag so the issue time indicated in the panic message
can be quickly searched for.
Committed by: Nilay Vaish <nilay@cs.wisc.edu>
This patch was initiated so as to remove reference to g_system_ptr,
the pointer to Ruby System that is used for getting the current time.
That simple change actual requires changing a lot many things in slicc and
garnet. All these changes are related to how time is handled.
In most of the places, g_system_ptr has been replaced by another clock
object. The changes have been done under the assumption that all the
components in the memory system are on the same clock frequency, but the
actual clocks might be distributed.
Many Ruby structures inherit from the Consumer, which is used for scheduling
events. The Consumer used to relay on an Event Manager for scheduling events
and on g_system_ptr for time. With this patch, the Consumer will now use a
ClockedObject to schedule events and to query for current time. This resulted
in several structures being converted from SimObjects to ClockedObjects. Also,
the MessageBuffer class now requires a pointer to a ClockedObject so as to
query for time.
This patch adds a _curTick variable to an eventq. This variable is updated
whenever an event is serviced in function serviceOne(), or all events upto
a particular time are processed in function serviceEvents(). This change
helps when there are eventqs that do not make use of curTick for scheduling
events.
Recent changes to functionalRead() in the memory system was not correct.
The change allowed for returning data from the first message found in
the buffers of the memory system. This is not correct since it is possible
that a timing message has data from an older state of the block.
The changes are being reverted.
This patch adds support to ruby so that the statistics maintained by ruby
are reset/dumped when the statistics for the rest of the system are
reset/dumped. For resetting the statistics, ruby now provides the
resetStats() function that a sim object can provide. As a consequence, the
clearStats() function has been removed from RubySystem. For dumping stats,
Ruby now adds a callback event to the dumpStatsQueue. The exit callback that
ruby used to add earlier is being removed.
Created by: Hamid Reza Khaleghzadeh.
Improved by: Lluc Alvarez, Nilay Vaish
Committed by: Nilay Vaish
This patch moves the draining interface from SimObject to a separate
class that can be used by any object needing draining. However,
objects not visible to the Python code (i.e., objects not deriving
from SimObject) still depend on their parents informing them when to
drain. This patch also gets rid of the CountedDrainEvent (which isn't
really an event) and replaces it with a DrainManager.
When casting objects in the generated SWIG interfaces, SWIG uses
classical C-style casts ( (Foo *)bar; ). In some cases, this can
degenerate into the equivalent of a reinterpret_cast (mainly if only a
forward declaration of the type is available). This usually works for
most compilers, but it is known to break if multiple inheritance is
used anywhere in the object hierarchy.
This patch introduces the cxx_header attribute to Python SimObject
definitions, which should be used to specify a header to include in
the SWIG interface. The header should include the declaration of the
wrapped object. We currently don't enforce header the use of the
header attribute, but a warning will be generated for objects that do
not use it.
This patch adds support to different entities in the ruby memory system
for more reliable functional read/write accesses. Only the simple network
has been augmented as of now. Later on Garnet will also support functional
accesses.
The patch adds functional access code to all the different types of messages
that protocols can send around. These messages are functionally accessed
by going through the buffers maintained by the network entities.
The patch also rectifies some of the bugs found in coherence protocols while
testing the patch.
With this patch applied, functional writes always succeed. But functional
reads can still fail.
Currently the Ruby System maintains pointer to only one of the memory
controllers. But there can be multiple controllers in the system. This
patch adds a vector of memory controllers.
It seems unecessary that the BankedArray class needs to schedule an event
to figure out when the access ends. Instead only the time for the end of access
needs to be tracked.
Ruby system was recently converted to a clocked object. Such objects maintain
state related to the time that has passed so far. During the cache warmup, Ruby
system changes its own time and the global time. Later on, the global time is
restored. So Ruby system also needs to reset its own time.
This patch adds an additional level of ports in the inheritance
hierarchy, separating out the protocol-specific and protocl-agnostic
parts. All the functionality related to the binding of ports is now
confined to use BaseMaster/BaseSlavePorts, and all the
protocol-specific parts stay in the Master/SlavePort. In the future it
will be possible to add other protocol-specific implementations.
The functions used in the binding of ports, i.e. getMaster/SlavePort
now use the base classes, and the index parameter is updated to use
the PortID typedef with the symbolic InvalidPortID as the default.
This patch moves the code for functional accesses to ruby system. This is
because the subsequent patches add support for making functional accesses
to the messages in the interconnect. Making those accesses from the ruby port
would be cumbersome.
Fix the drain functionality of the RubyPort to only call drain on child ports
during a system-wide drain process, instead of calling each time that a
ruby_hit_callback is executed.
This fixes the issue of the RubyPort ports being reawakened during the drain
simulation, possibly with work they didn't previously have to complete. If
they have new work, they may call process on the drain event that they had
not registered work for, causing an assertion failure when completing the
drain event.
Also, in RubyPort, set the drainEvent to NULL when there are no events
to be drained. If not set to NULL, the drain loop can result in stale
drainEvents used.
This patch removes the use of g_system_ptr for event scheduling. Each consumer
object now needs to specify upfront an EventManager object it would use for
scheduling events. This makes the ruby memory system more amenable for a
multi-threaded simulation.
This patch shifts the version of gcc for which we enable c++0x from
4.6 to 4.4 The more long term plan is to see what the c++0x features
can bring and what level of support would be enabled simply by bumping
the required version of gcc from 4.3 to 4.4.
A few minor things had to be fixed in the code base, most notably the
choice of a hashmap implementation. In the Ruby Sequencer there were
also a few minor issues that gcc 4.4 was not too happy about.
This patch addresses a few minor issues reported by the clang static
analyzer.
The analysis was run with:
scan-build -disable-checker deadcode \
-enable-checker experimental.core \
-disable-checker experimental.core.CastToStruct \
-enable-checker experimental.cpluscplus
This patch is a first step to using Cycles as a parameter type. The
main affected modules are the CPUs and the Ruby caches. There are
definitely plenty more places that are affected, but this patch serves
as a starting point to making the transition.
An important part of this patch is to actually enable parameters to be
specified as Param.Cycles which involves some changes to params.py.
The memory size variable was a 32-bit int. This meant that the size of the
memory was limited to 4GB. This patch changes the type of the variable to
64-bit to support larger memory sizes. Thanks to Raghuraman Balasubramanian
for bringing this to notice.
This patch extends the queued port interfaces with methods for
scheduling the transmission of a timing request/response. The methods
are named similar to the corresponding sendTiming(Snoop)Req/Resp,
replacing the "send" with "sched". As the queues are currently
unbounded, the methods always succeed and hence do not return a value.
This functionality was previously provided in the subclasses by
calling PacketQueue::schedSendTiming with the appropriate
parameters. With this change, there is no need to introduce these
extra methods in the subclasses, and the use of the queued interface
is more uniform and explicit.
This patch fixes some problems with the drain/switchout functionality
for the O3 cpu and for the ARM ISA and adds some useful debug print
statements.
This is an incremental fix as there are still a few bugs/mem leaks with the
switchout code. Particularly when switching from an O3CPU to a
TimingSimpleCPU. However, when switching from O3 to O3 cores with the ARM ISA
I haven't encountered any more assertion failures; now the kernel will
typically panic inside of simulation.
This patch removes printConfig() functions from all structures in Ruby.
Most of the information is already part of config.ini, and where ever it
is not, it would become in due course.
This patch models a cache as separate tag and data arrays. The patch exposes
the banked array as another resource that is checked by SLICC before a
transition is allowed to execute. This is similar to how TBE entries and slots
in output ports are modeled.
Updates to Ruby to support statistics counting of cache accesses. This feature
serves multiple purposes beyond simple stats collection. It provides the
foundation for ruby to model the cache tag and data arrays as physical
resources, as well as provide the necessary input data for McPAT power
modeling.
This patch makes getAddrRanges const throughout the code base. There
is no reason why it should not be, and making it const prevents adding
any unintentional side-effects.
This patch adds isSnooping to the slave port, and thus avoids going
through getMasterPort to be able to ask the master. Over the course of
the next few patches, all getMasterPort/getSlavePort in Port and
MemObject are to be protocol agnostic, and the snooping is part of the
protocol layer.
The function is already present on the master port, where it is
implemented by the module itself, e.g. a cache. On the slave side, it
is merely asking the connected master port. The same name is used by
both functions despite their difference in behaviour. The initial
design used isMasterSnooping on the slave port side, but the more
verbose function name was later changed.
The scheduling of the deadlock check event was being done incorrectly as the
clock was not being multiplied, so as to convert the time into ticks. This
patch removes that bug.
This patch fixes a bug that caused snoop requests to be placed in a
packet queue. Instead, the packet is now sent immediately using
sendTimingSnoopReq, thus bypassing the packet queue and any normal
responses waiting to be sent.
This patch moves send/recvTiming and send/recvTimingSnoop from the
Port base class to the MasterPort and SlavePort, and also splits them
into separate member functions for requests and responses:
send/recvTimingReq, send/recvTimingResp, and send/recvTimingSnoopReq,
send/recvTimingSnoopResp. A master port sends requests and receives
responses, and also receives snoop requests and sends snoop
responses. A slave port has the reciprocal behaviour as it receives
requests and sends responses, and sends snoop requests and receives
snoop responses.
For all MemObjects that have only master ports or slave ports (but not
both), e.g. a CPU, or a PIO device, this patch merely adds more
clarity to what kind of access is taking place. For example, a CPU
port used to call sendTiming, and will now call
sendTimingReq. Similarly, a response previously came back through
recvTiming, which is now recvTimingResp. For the modules that have
both master and slave ports, e.g. the bus, the behaviour was
previously relying on branches based on pkt->isRequest(), and this is
now replaced with a direct call to the apprioriate member function
depending on the type of access. Please note that send/recvRetry is
still shared by all the timing accessors and remains in the Port base
class for now (to maintain the current bus functionality and avoid
changing the statistics of all regressions).
The packet queue is split into a MasterPort and SlavePort version to
facilitate the use of the new timing accessors. All uses of the
PacketQueue are updated accordingly.
With this patch, the type of packet (request or response) is now well
defined for each type of access, and asserts on pkt->isRequest() and
pkt->isResponse() are now moved to the appropriate send member
functions. It is also worth noting that sendTimingSnoopReq no longer
returns a boolean, as the semantics do not alow snoop requests to be
rejected or stalled. All these assumptions are now excplicitly part of
the port interface itself.
This patch simplifies the packet by removing the broadcast flag and
instead more firmly relying on (and enforcing) the semantics of
transactions in the classic memory system, i.e. request packets are
routed from a master to a slave based on the address, and when they
are created they have neither a valid source, nor destination. On
their way to the slave, the request packet is updated with a source
field for all modules that multiplex packets from multiple master
(e.g. a bus). When a request packet is turned into a response packet
(at the final slave), it moves the potentially populated source field
to the destination field, and the response packet is routed through
any multiplexing components back to the master based on the
destination field.
Modules that connect multiplexing components, such as caches and
bridges store any existing source and destination field in the sender
state as a stack (just as before).
The packet constructor is simplified in that there is no longer a need
to pass the Packet::Broadcast as the destination (this was always the
case for the classic memory system). In the case of Ruby, rather than
using the parameter to the constructor we now rely on setDest, as
there is already another three-argument constructor in the packet
class.
In many places where the packet information was printed as part of
DPRINTFs, request packets would be printed with a numeric "dest" that
would always be -1 (Broadcast) and that field is now removed from the
printing.
This patch introduces port access methods that separates snoop
request/responses from normal memory request/responses. The
differentiation is made for functional, atomic and timing accesses and
builds on the introduction of master and slave ports.
Before the introduction of this patch, the packets belonging to the
different phases of the protocol (request -> [forwarded snoop request
-> snoop response]* -> response) all use the same port access
functions, even though the snoop packets flow in the opposite
direction to the normal packet. That is, a coherent master sends
normal request and receives responses, but receives snoop requests and
sends snoop responses (vice versa for the slave). These two distinct
phases now use different access functions, as described below.
Starting with the functional access, a master sends a request to a
slave through sendFunctional, and the request packet is turned into a
response before the call returns. In a system without cache coherence,
this is all that is needed from the functional interface. For the
cache-coherent scenario, a slave also sends snoop requests to coherent
masters through sendFunctionalSnoop, with responses returned within
the same packet pointer. This is currently used by the bus and caches,
and the LSQ of the O3 CPU. The send/recvFunctional and
send/recvFunctionalSnoop are moved from the Port super class to the
appropriate subclass.
Atomic accesses follow the same flow as functional accesses, with
request being sent from master to slave through sendAtomic. In the
case of cache-coherent ports, a slave can send snoop requests to a
master through sendAtomicSnoop. Just as for the functional access
methods, the atomic send and receive member functions are moved to the
appropriate subclasses.
The timing access methods are different from the functional and atomic
in that requests and responses are separated in time and
send/recvTiming are used for both directions. Hence, a master uses
sendTiming to send a request to a slave, and a slave uses sendTiming
to send a response back to a master, at a later point in time. Snoop
requests and responses travel in the opposite direction, similar to
what happens in functional and atomic accesses. With the introduction
of this patch, it is possible to determine the direction of packets in
the bus, and no longer necessary to look for both a master and a slave
port with the requested port id.
In contrast to the normal recvFunctional, recvAtomic and recvTiming
that are pure virtual functions, the recvFunctionalSnoop,
recvAtomicSnoop and recvTimingSnoop have a default implementation that
calls panic. This is to allow non-coherent master and slave ports to
not implement these functions.
This patch addresses a number of minor issues that cause problems when
compiling with clang >= 3.0 and gcc >= 4.6. Most importantly, it
avoids using the deprecated ext/hash_map and instead uses
unordered_map (and similarly so for the hash_set). To make use of the
new STL containers, g++ and clang has to be invoked with "-std=c++0x",
and this is now added for all gcc versions >= 4.6, and for clang >=
3.0. For gcc >= 4.3 and <= 4.5 and clang <= 3.0 we use the tr1
unordered_map to avoid the deprecation warning.
The addition of c++0x in turn causes a few problems, as the
compiler is more stringent and adds a number of new warnings. Below,
the most important issues are enumerated:
1) the use of namespaces is more strict, e.g. for isnan, and all
headers opening the entire namespace std are now fixed.
2) another other issue caused by the more stringent compiler is the
narrowing of the embedded python, which used to be a char array,
and is now unsigned char since there were values larger than 128.
3) a particularly odd issue that arose with the new c++0x behaviour is
found in range.hh, where the operator< causes gcc to complain about
the template type parsing (the "<" is interpreted as the beginning
of a template argument), and the problem seems to be related to the
begin/end members introduced for the range-type iteration, which is
a new feature in c++11.
As a minor update, this patch also fixes the build flags for the clang
debug target that used to be shared with gcc and incorrectly use
"-ggdb".
This patch fixes a bug in Ruby that caused non-deterministic
simulation when changing the underlying hash map implementation. The
reason is order-dependent behaviour in combination with iteration over
the hash map contents. The two locations where a sorted container is
assumed are now changed to make use of a std::map instead of the
unordered hash map.
With this change, the stats changes slightly and the follow-on
changeset will update the relevant statistics.
Fixes checkpointing with respect to lost events after swapping event queues.
Also adds DPRINTFs to better understand what's going on when Ruby serializes
and unserializes.
This patch removes the assumption on having on single instance of
PhysicalMemory, and enables a distributed memory where the individual
memories in the system are each responsible for a single contiguous
address range.
All memories inherit from an AbstractMemory that encompasses the basic
behaviuor of a random access memory, and provides untimed access
methods. What was previously called PhysicalMemory is now
SimpleMemory, and a subclass of AbstractMemory. All future types of
memory controllers should inherit from AbstractMemory.
To enable e.g. the atomic CPU and RubyPort to access the now
distributed memory, the system has a wrapper class, called
PhysicalMemory that is aware of all the memories in the system and
their associated address ranges. This class thus acts as an
infinitely-fast bus and performs address decoding for these "shortcut"
accesses. Each memory can specify that it should not be part of the
global address map (used e.g. by the functional memories by some
testers). Moreover, each memory can be configured to be reported to
the OS configuration table, useful for populating ATAG structures, and
any potential ACPI tables.
Checkpointing support currently assumes that all memories have the
same size and organisation when creating and resuming from the
checkpoint. A future patch will enable a more flexible
re-organisation.
--HG--
rename : src/mem/PhysicalMemory.py => src/mem/AbstractMemory.py
rename : src/mem/PhysicalMemory.py => src/mem/SimpleMemory.py
rename : src/mem/physical.cc => src/mem/abstract_mem.cc
rename : src/mem/physical.hh => src/mem/abstract_mem.hh
rename : src/mem/physical.cc => src/mem/simple_mem.cc
rename : src/mem/physical.hh => src/mem/simple_mem.hh
This patch removes the physMemPort from the RubySequencer and instead
uses the system pointer to access the physmem. The system already
keeps track of the physmem and the valid memory address ranges, and
with this patch we merely make use of that existing functionality. The
memory is modified so that it is possible to call the access functions
(atomic and functional) without going through the port, and the memory
is allowed to be unconnected, i.e. have no ports (since Ruby does not
attach it like the conventional memory system).
This patch introduces the notion of a master and slave port in the C++
code, thus bringing the previous classification from the Python
classes into the corresponding simulation objects and memory objects.
The patch enables us to classify behaviours into the two bins and add
assumptions and enfore compliance, also simplifying the two
interfaces. As a starting point, isSnooping is confined to a master
port, and getAddrRanges to slave ports. More of these specilisations
are to come in later patches.
The getPort function is not getMasterPort and getSlavePort, and
returns a port reference rather than a pointer as NULL would never be
a valid return value. The default implementation of these two
functions is placed in MemObject, and calls fatal.
The one drawback with this specific patch is that it requires some
code duplication, e.g. QueuedPort becomes QueuedMasterPort and
QueuedSlavePort, and BusPort becomes BusMasterPort and BusSlavePort
(avoiding multiple inheritance). With the later introduction of the
port interfaces, moving the functionality outside the port itself, a
lot of the duplicated code will disappear again.
This patch decouples the queueing and the port interactions to
simplify the introduction of the master and slave ports. By separating
the queueing functionality from the port itself, it becomes much
easier to distinguish between master and slave ports, and still retain
the queueing ability for both (without code duplication).
As part of the split into a PacketQueue and a port, there is now also
a hierarchy of two port classes, QueuedPort and SimpleTimingPort. The
QueuedPort is useful for ports that want to leave the packet
transmission of outgoing packets to the queue and is used by both
master and slave ports. The SimpleTimingPort inherits from the
QueuedPort and adds the implemention of recvTiming and recvFunctional
through recvAtomic.
The PioPort and MessagePort are cleaned up as part of the changes.
--HG--
rename : src/mem/tport.cc => src/mem/packet_queue.cc
rename : src/mem/tport.hh => src/mem/packet_queue.hh
This patch renames the sendTiming member function in the RubyPort to
avoid inadvertently hiding Port::sendTiming (discovered through some
rather painful debugging). The RubyPort does, in fact, rely on the
functionality of the queued port and the implementation merely
schedules a send the next cycle. The new name for the member function
is sendNextCycle to better reflect this behaviour.
In the unlikely event that we ever shift to using C++11 the member
functions in Port should have a "final" identifier to prevent any
overriding in derived classes.
This patch moves all port creation from the getPort method to be
consistently done in the MemObject's constructor. This is possible
thanks to the Swig interface passing the length of the vector ports.
Previously there was a mix of: 1) creating the ports as members (at
object construction time) and using getPort for the name resolution,
or 2) dynamically creating the ports in the getPort call. This is now
uniform. Furthermore, objects that would not be complete without a
port have these ports as members rather than having pointers to
dynamically allocated ports.
This patch also enables an elaboration-time enumeration of all the
ports in the system which can be used to determine the masterId.
This patch classifies all ports in Python as either Master or Slave
and enforces a binding of master to slave. Conceptually, a master (such
as a CPU or DMA port) issues requests, and receives responses, and
conversely, a slave (such as a memory or a PIO device) receives
requests and sends back responses. Currently there is no
differentiation between coherent and non-coherent masters and slaves.
The classification as master/slave also involves splitting the dual
role port of the bus into a master and slave port and updating all the
system assembly scripts to use the appropriate port. Similarly, the
interrupt devices have to have their int_port split into a master and
slave port. The intdev and its children have minimal changes to
facilitate the extra port.
Note that this patch does not enforce any port typing in the C++
world, it merely ensures that the Python objects have a notion of the
port roles and are connected in an appropriate manner. This check is
carried when two ports are connected, e.g. bus.master =
memory.port. The following patches will make use of the
classifications and specialise the C++ ports into masters and slaves.
This change adds a master id to each request object which can be
used identify every device in the system that is capable of issuing a request.
This is part of the way to removing the numCpus+1 stats in the cache and
replacing them with the master ids. This is one of a series of changes
that make way for the stats output to be changed to python.
This patch removes the calls to isTagPresent() from Sequencer.cc. These
calls are made just for setting the cache block to have been most recently
used. The calls have been folded in to the function setMRU().
This patch adds the necessary flags to the SConstruct and SConscript
files for compiling using clang 2.9 and later (on Ubuntu et al and OSX
XCode 4.2), and also cleans up a bunch of compiler warnings found by
clang. Most of the warnings are related to hidden virtual functions,
comparisons with unsigneds >= 0, and if-statements with empty
bodies. A number of mismatches between struct and class are also
fixed. clang 2.8 is not working as it has problems with class names
that occur in multiple namespaces (e.g. Statistics in
kernel_stats.hh).
clang has a bug (http://llvm.org/bugs/show_bug.cgi?id=7247) which
causes confusion between the container std::set and the function
Packet::set, and this is currently addressed by not including the
entire namespace std, but rather selecting e.g. "using std::vector" in
the appropriate places.
This patch makes the physMemPort of the RubyPort a PioPort rather than
an M5Port. This reflects the fact that the M5Port and PioPort have
different roles. The M5Port is really a coherent slave that is
connected to the CPUs and other coherent masters of the system,
e.g. DMA ports. The PioPort, on the other hand, is a master port that
is connected to the memory and other slaves, for example the pio
devices.
This simplifies future changes into master/slave ports and is
consistent with the port roles throughout the system.
This patch implements the functionality for forwarding invalidations and
replacements from the L1 cache of the Ruby memory system to the O3 CPU. The
implementation adds a list of ports to RubyPort. Whenever a replacement or an
invalidation is performed, the L1 cache forwards this to all the ports, which
is the LSQ in case of the O3 CPU.
The functional ports are no longer used and this patch cleans up the
legacy that is still present in buses, memories, CPUs etc. Note that
this does not refer to the class FunctionalPort (already removed), but
rather ports with the name (and use) functional.
This patch simplifies the address-range determination mechanism and
also unifies the naming across ports and devices. It further splits
the queries for determining if a port is snooping and what address
ranges it responds to (aiming towards a separation of
cache-maintenance ports and pure memory-mapped ports). Default
behaviours are such that most ports do not have to define isSnooping,
and master ports need not implement getAddrRanges.
Port proxies are used to replace non-structural ports, and thus enable
all ports in the system to correspond to a structural entity. This has
the advantage of accessing memory through the normal memory subsystem
and thus allowing any constellation of distributed memories, address
maps, etc. Most accesses are done through the "system port" that is
used for loading binaries, debugging etc. For the entities that belong
to the CPU, e.g. threads and thread contexts, they wrap the CPU data
port in a port proxy.
The following replacements are made:
FunctionalPort > PortProxy
TranslatingPort > SETranslatingPortProxy
VirtualPort > FSTranslatingPortProxy
--HG--
rename : src/mem/vport.cc => src/mem/fs_translating_port_proxy.cc
rename : src/mem/vport.hh => src/mem/fs_translating_port_proxy.hh
rename : src/mem/translating_port.cc => src/mem/se_translating_port_proxy.cc
rename : src/mem/translating_port.hh => src/mem/se_translating_port_proxy.hh
The definition for the class CacheMsg was removed long back. Some declaration
had still survived, which was recently removed. Since the PerfectCacheMemory
class relied on this particular declaration, its absence let to compilation
breaking down. Hence this patch.
This patch resurrects ruby's cache warmup capability. It essentially
makes use of all the infrastructure that was added to the controllers,
memories and the cache recorder.
This patch adds function to the Sparse Memory so that the blocks can be
recorded in a cache trace. The blocks are added to the cache recorder
which can later write them into a file.
This patch adds functions to the memory vector class that can be used for
collating memory pages to raw trace and for populating pages from a raw
trace.