Commit graph

1396 commits

Author SHA1 Message Date
Andreas Hansson b7153e2a64 mem: Separate out the different cases for DRAM bus busy time
This patch changes how the data bus busy time is calculated such that
it is delayed to the actual scheduling time of the request as opposed
to being done as soon as possible.

This patch changes a bunch of statistics, and the stats update is
bundled together with the introruction of tFAW/tTAW and the named DRAM
configurations like DDR3 and LPDDR2.
2013-01-31 07:49:13 -05:00
Anthony Gutierrez af0f8b31db cache: remove drainManager because it's not used
the cache drainManager is set but never cleared, this is because
the cache itself does not need to be drained and thus never
triggers a signalDrainDone(). because the drainManager variable
is not used properly and does not appear to be necessary it has
been removed with this patch.
2013-01-28 20:19:42 -05:00
Nilay Vaish a8eb5b18e0 ruby: remove get_time()
This patch replaces get_time() in *.sm files with curCycle() which
is now possible since controllers are clocked objects.
2013-01-28 06:14:18 -06:00
Nilay Vaish 31659e83fb ruby: remove call to curCycle in panic()
The panic() function already prints the current tick value. This call to
curCycle() is as such redundant. Since we are trying to move towards multiple
clock domains, this call will print misleading time.
2013-01-28 06:11:42 -06:00
Nilay Vaish 5b6f972750 ruby: remove calls to g_system_ptr->getTime()
This patch further removes calls to g_system_ptr->getTime() where ever other
clocked objects are available for providing current time.
2013-01-17 13:10:12 -06:00
Malek Musleh 1abf950f3c ruby sequencer: converts cycles to ticks in deadlock panic()
This patch converts the panic() print outs in the Sequencer::wakeup()
call from ruby cycles to Ticks(). This makes it easier to debug deadlocks
with the ProtocolTrace flag so the issue time indicated in the panic message
can be quickly searched for.

Committed by: Nilay Vaish <nilay@cs.wisc.edu>
2013-01-14 10:05:12 -06:00
Nilay Vaish 2012983718 Ruby: remove reference to g_system_ptr from class Message
This patch was initiated so as to remove reference to g_system_ptr,
the pointer to Ruby System that is used for getting the current time.
That simple change actual requires changing a lot many things in slicc and
garnet. All these changes are related to how time is handled.

In most of the places, g_system_ptr has been replaced by another clock
object. The changes have been done under the assumption that all the
components in the memory system are on the same clock frequency, but the
actual clocks might be distributed.
2013-01-14 10:05:10 -06:00
Nilay Vaish cf232de461 Ruby: use ClockedObject in Consumer class
Many Ruby structures inherit from the Consumer, which is used for scheduling
events. The Consumer used to relay on an Event Manager for scheduling events
and on g_system_ptr for time. With this patch, the Consumer will now use a
ClockedObject to schedule events and to query for current time. This resulted
in several structures being converted from SimObjects to ClockedObjects. Also,
the MessageBuffer class now requires a pointer to a ClockedObject so as to
query for time.
2013-01-14 10:04:21 -06:00
Mitch Hayenga c7dbd5e768 mem: Make LL/SC locks fine grained
The current implementation in gem5 just keeps a list of locks per cacheline.
Due to this, a store to a non-overlapping portion of the cacheline can cause an
LL/SC pair to fail.  This patch simply adds an address range to the lock
structure, so that the lock is only invalidated if the store overlaps the lock
range.
2013-01-08 08:54:07 -05:00
Mitch Hayenga dc4a0aa2fa mem: Fix use-after-free bug
Running with valgrind I noticed a use after free originating from
simple_mem.cc.  It looks like this is a known issue and this additional call
site was missed in an earlier patch.
2013-01-08 08:54:06 -05:00
Andreas Sandberg 964aa49d15 mem: Fix guest corruption when caches handle uncacheable accesses
When the classic gem5 cache sees an uncacheable memory access, it used
to ignore it or silently drop the cache line in case of a
write. Normally, there shouldn't be any data in the cache belonging to
an uncacheable address range. However, since some architecture models
don't implement cache maintenance instructions, there might be some
dirty data in the cache that is discarded when this happens. The
reason it has mostly worked before is because such cache lines were
most likely evicted by normal memory activity before a TLB flush was
requested by the OS.

Previously, the cache model would invalidate cache lines when they
were accessed by an uncacheable write. This changeset alters this
behavior so all uncacheable memory accesses cause a cache flush with
an associated writeback if necessary. This is implemented by reusing
the cache flushing machinery used when draining the cache, which
implies that writebacks are performed using functional accesses.
2013-01-07 13:05:47 -05:00
Andreas Sandberg d44f2f611f mem: Remove the IIC replacement policy
The IIC replacement policy seems to be unused and has probably
gathered too much bit rot to be useful. This patch removes the IIC and
its associated cache parameters.
2013-01-07 13:05:39 -05:00
Andreas Hansson 921490a060 sim: Fatal if a clocked object is set to have a clock of 0
This patch adds a check to the clocked object constructor to ensure it
is not configured to have a clock period of 0.
2013-01-07 13:05:39 -05:00
Andreas Hansson 18b147acef mem: Merge ranges that are part of the conf table
This patch adds basic merging of address ranges when determining which
address ranges should be reported in the configuration table. By
performing this merging it is possible to distribute an address range
across many memory channels (controllers). This is essential to enable
address interleaving.
2013-01-07 13:05:38 -05:00
Andreas Hansson 01c5598373 mem: Add interleaving bits to the address ranges
This patch adds support for interleaving bits for the address
ranges. What was previously just a start and end address, now has an
additional three fields, for the high bit, and number of bits to use
for interleaving, and a match value to compare against. If the number
of interleaving bits is set to zero it is effectively disabled.

A number of convenience functions are added to the range to enquire
about the interleaving, its granularity and the number of stripes it
is part of.
2013-01-07 13:05:38 -05:00
Andreas Hansson e0d93fde99 base: Simplify the AddrRangeMap by removing unused code
This patch cleans up the AddrRangeMap in preparation for the addition
of interleaving by removing unused code. The non-const editions of
find are never used, and hence the duplication is not needed.
2013-01-07 13:05:38 -05:00
Andreas Hansson 15a979c6be mem: Tidy up bus addr range debug messages
This patch tidies up a number of the bus DPRINTFs related to range
manipulation. In particular, it shifts the message about range changes
to the start of the member function, and also adds information about
when all ranges are received.
2013-01-07 13:05:38 -05:00
Andreas Hansson caf6786ad5 mem: Skip address mapper range checks to allow more flexibility
This patch makes the address mapper less stringent about checking the
before and after ranges, i.e. the original and remapped ranges. The
checks were not really necessary, and there are situations when the
previous checks were too strict.
2013-01-07 13:05:38 -05:00
Andreas Hansson 71da1d2157 base: Encapsulate the underlying fields in AddrRange
This patch makes the start and end address private in a move to
prevent direct manipulation and matching of ranges based on these
fields. This is done so that a transition to ranges with interleaving
support is possible.

As a result of hiding the start and end, a number of member functions
are needed to perform the comparisons and manipulations that
previously took place directly on the members. An accessor function is
provided for the start address, and a function is added to test if an
address is within a range. As a result of the latter the != and ==
operator is also removed in favour of the member function. A member
function that returns a string representation is also created to allow
debug printing.

In general, this patch does not add any functionality, but it does
take us closer to a situation where interleaving (and more cleverness)
can be added under the bonnet without exposing it to the user. More on
that in a later patch.
2013-01-07 13:05:38 -05:00
Andreas Hansson cfdaf53104 mem: Remove the joining of neighbouring ranges
This patch temporarily removes the joining of ranges when creating the
backing store, to reserve this functionality for the interleaved
ranges that are about to be introduced.

When creating the mmaps for the backing store, there is no point in
creating larger contigous chunks that what is necessary. The larger
chunks will only make life more difficult for the host.

Merging will be re-added later, but then only for interleaved ranges.
2013-01-07 13:05:38 -05:00
Andreas Hansson f456c7983d mem: Add tracing support in the communication monitor
This patch adds packet tracing to the communication monitor using a
protobuf as the mechanism for creating the trace.

If no file is specified, then the tracing is disabled. If a file is
specified, then for every packet that is successfully sent, a protobuf
message is serialized to the file.
2013-01-07 13:05:37 -05:00
Andreas Hansson 852a7bcf92 mem: Add sanity check to packet queue size
This patch adds a basic check to ensure that the packet queue does not
grow absurdly large. The queue should only be used to store packets
that were delayed due to blocking from the neighbouring port, and not
for actual storage. Thus, a limit of 100 has been chosen for now
(which is already quite substantial).
2013-01-07 13:05:35 -05:00
Andreas Hansson ce5fc494e3 ruby: Fix missing cxx_header in Switch
This patch addresses a warning related to the swig interface
generation for the Switch class. The cxx_header is now specified
correctly, and the header in question has got a few includes added to
make it all compile.
2013-01-07 13:05:35 -05:00
Andreas Hansson 174269978a mem: Fix a bug in the memory serialization file naming
This patch fixes a bug that caused multiple systems to overwrite each
other physical memory. The system name is now included in the filename
such that this is avoided.
2013-01-07 13:05:35 -05:00
Ali Saidi 9a645d6e9b cache: add note about where conflicts are handled 2013-01-07 13:05:32 -05:00
Nilay Vaish f3d0be210f ruby: add support for prefetching to MESI protocol 2012-12-11 10:05:56 -06:00
Nilay Vaish 9b72a0f627 ruby: change slicc to allow for constructor args
The patch adds support to slicc for recognizing arguments that should be
passed to the constructor of a class. I did not like the fact that an explicit
check was being carried on the type 'TBETable' to figure out the arguments to
be passed to the constructor.
The patch also moves some of the member variables that are declared for all
the controllers to the base class AbstractController.
2012-12-11 10:05:55 -06:00
Nilay Vaish 93e283abb3 ruby: add a prefetcher
This patch adds a prefetcher for the ruby memory system. The prefetcher
is based on a prefetcher implemented by others (well, I don't know
who wrote the original). The prefetcher does stride-based prefetching,
both unit and non-unit. It obseves the misses in the cache and trains on
these. After the training period is over, the prefetcher starts issuing
prefetch requests to the controller.
2012-12-11 10:05:54 -06:00
Nilay Vaish d502384795 ruby: add functions for computing next stride/page address 2012-12-11 10:05:53 -06:00
Nilay Vaish 2d6470936c sim: have a curTick per eventq
This patch adds a _curTick variable to an eventq. This variable is updated
whenever an event is serviced in function serviceOne(), or all events upto
a particular time are processed in function serviceEvents(). This change
helps when there are eventqs that do not make use of curTick for scheduling
events.
2012-11-16 10:27:47 -06:00
Nilay Vaish 90c45c29fe ruby: support functional accesses in garnet flexible network 2012-11-10 17:18:01 -06:00
Nilay Vaish 1492ab066d ruby: bug in functionalRead, revert recent changes
Recent changes to functionalRead() in the memory system was not correct.
The change allowed for returning data from the first message found in
the buffers of the memory system. This is not correct since it is possible
that a timing message has data from an older state of the block.

The changes are being reverted.
2012-11-10 17:18:00 -06:00
Andreas Hansson c4b36901d0 mem: Fix DRAM draining to ensure write queue is empty
This patch fixes the draining of the SimpleDRAM controller model. The
controller performs buffering of writes and normally there is no need
to ever empty the write buffer (if you have a fast on-chip memory,
then use it). The patch adds checks to ensure the write buffer is
drained when the controller is asked to do so.
2012-11-08 04:25:06 -05:00
Hamid Reza Khaleghzadeh ext:(%2C%20Lluc%20Alvarez%20%3Clluc.alvarez%40bsc.es%3E%2C%20Nilay%20Vaish%20%3Cnilay%40cs.wisc.edu%3E) 8cd475d58e ruby: reset and dump stats along with reset of the system
This patch adds support to ruby so that the statistics maintained by ruby
are reset/dumped when the statistics for the rest of the system are
reset/dumped. For resetting the statistics, ruby now provides the
resetStats() function that a sim object can provide. As a consequence, the
clearStats() function has been removed from RubySystem. For dumping stats,
Ruby now adds a callback event to the dumpStatsQueue. The exit callback that
ruby used to add earlier is being removed.

Created by: Hamid Reza Khaleghzadeh.
Improved by: Lluc Alvarez, Nilay Vaish
Committed by: Nilay Vaish
2012-11-02 12:18:25 -05:00
Ali Saidi ce5766c409 mem: fix use after free issue in memories until 4-phase work complete. 2012-11-02 11:50:16 -05:00
Andreas Sandberg ddd6af414c mem: Add support for writing back and flushing caches
This patch adds support for the following optional drain methods in
the classical memory system's cache model:

memWriteback() - Write back all dirty cache lines to memory using
functional accesses.

memInvalidate() - Invalidate all cache lines. Dirty cache lines
are lost unless a writeback is requested.

Since memWriteback() is called when checkpointing systems, this patch
adds support for checkpointing systems with caches. The serialization
code now checks whether there are any dirty lines in the cache. If
there are dirty lines in the cache, the checkpoint is flagged as bad
and a warning is printed.
2012-11-02 11:32:02 -05:00
Andreas Sandberg b81a977e6a sim: Move the draining interface into a separate base class
This patch moves the draining interface from SimObject to a separate
class that can be used by any object needing draining. However,
objects not visible to the Python code (i.e., objects not deriving
from SimObject) still depend on their parents informing them when to
drain. This patch also gets rid of the CountedDrainEvent (which isn't
really an event) and replaces it with a DrainManager.
2012-11-02 11:32:01 -05:00
Andreas Sandberg c0ab52799c sim: Include object header files in SWIG interfaces
When casting objects in the generated SWIG interfaces, SWIG uses
classical C-style casts ( (Foo *)bar; ). In some cases, this can
degenerate into the equivalent of a reinterpret_cast (mainly if only a
forward declaration of the type is available). This usually works for
most compilers, but it is known to break if multiple inheritance is
used anywhere in the object hierarchy.

This patch introduces the cxx_header attribute to Python SimObject
definitions, which should be used to specify a header to include in
the SWIG interface. The header should include the declaration of the
wrapped object. We currently don't enforce header the use of the
header attribute, but a warning will be generated for objects that do
not use it.
2012-11-02 11:32:01 -05:00
Dam Sunwoo 81406018b0 ARM: dump stats and process info on context switches
This patch enables dumping statistics and Linux process information on
context switch boundaries (__switch_to() calls) that are used for
Streamline integration (a graphical statistics viewer from ARM).
2012-11-02 11:32:01 -05:00
Andreas Hansson 3d98119717 mem: Fix typo in port comments
This patch merely fixes a few typos in the port comments.
2012-10-31 09:28:23 -04:00
Andreas Hansson 6f6adbf0f6 dev: Make default clock more reasonable for system and devices
This patch changes the default system clock from 1THz to 1GHz. This
clock is used by all modules that do not override the default (parent
clock), and primarily affects the IO subsystem. Every DMA device uses
its clock to schedule the next transfer, and the change will thus
cause this inter-transfer delay to be longer.

The default clock of the bus is removed, as the clock inherited from
the system provides exactly the same value.

A follow-on patch will bump the stats.
2012-10-25 13:14:44 -04:00
Nilay Vaish 52d8693677 ruby: functional access updates to network test protocol
I had forgotten to change the network test protocol while making changes to
ruby for supporting functional accesses. This patch updates the protocol so
that it can compile correctly.
2012-10-18 18:35:42 -05:00
Nilay Vaish 5ffc165939 ruby: improved support for functional accesses
This patch adds support to different entities in the ruby memory system
for more reliable functional read/write accesses. Only the simple network
has been augmented as of now. Later on Garnet will also support functional
accesses.
The patch adds functional access code to all the different types of messages
that protocols can send around. These messages are functionally accessed
by going through the buffers maintained by the network entities.
The patch also rectifies some of the bugs found in coherence protocols while
testing the patch.

With this patch applied, functional writes always succeed. But functional
reads can still fail.
2012-10-15 17:51:57 -05:00
Nilay Vaish 61434a9943 ruby: register multiple memory controllers
Currently the Ruby System maintains pointer to only one of the memory
controllers. But there can be multiple controllers in the system. This
patch adds a vector of memory controllers.
2012-10-15 17:27:17 -05:00
Nilay Vaish c14e6cfc4e ruby: remove AbstractMemOrCache
The only place where this abstract class is in use is the memory controller,
which it self is an abstract class. Does not seem useful at all.
2012-10-15 17:27:16 -05:00
Nilay Vaish 3e607f146f ruby: allow function definition in slicc structs
This patch adds support for function definitions to appear in slicc structs.
This is required for supporting functional accesses for different types of
messages. Subsequent patches will use this to development.
2012-10-15 17:27:16 -05:00
Nilay Vaish c7b0901b97 ruby banked array: do away with event scheduling
It seems unecessary that the BankedArray class needs to schedule an event
to figure out when the access ends. Instead only the time for the end of access
needs to be tracked.
2012-10-15 17:27:15 -05:00
Nilay Vaish 6a65fafa52 ruby: reset timing after cache warm up
Ruby system was recently converted to a clocked object. Such objects maintain
state related to the time that has passed so far. During the cache warmup, Ruby
system changes its own time and the global time. Later on, the global time is
restored. So Ruby system also needs to reset its own time.
2012-10-15 17:27:15 -05:00
Andreas Hansson b6bd4f34b4 Mem: Fix incorrect logic in bus blocksize check
This patch fixes the logic in the blocksize check such that the
warning is printed if the size is not 16, 32, 64 or 128.
2012-10-15 12:51:21 -04:00
Andreas Hansson 2a740aa096 Port: Add protocol-agnostic ports in the port hierarchy
This patch adds an additional level of ports in the inheritance
hierarchy, separating out the protocol-specific and protocl-agnostic
parts. All the functionality related to the binding of ports is now
confined to use BaseMaster/BaseSlavePorts, and all the
protocol-specific parts stay in the Master/SlavePort. In the future it
will be possible to add other protocol-specific implementations.

The functions used in the binding of ports, i.e. getMaster/SlavePort
now use the base classes, and the index parameter is updated to use
the PortID typedef with the symbolic InvalidPortID as the default.
2012-10-15 08:12:35 -04:00
Andreas Hansson 9baa35ba80 Mem: Separate the host and guest views of memory backing store
This patch moves all the memory backing store operations from the
independent memory controllers to the global physical memory. The main
reason for this patch is to allow address striping in a future set of
patches, but at this point it already provides some useful
functionality in that it is now possible to change the number of
memory controllers and their address mapping in combination with
checkpointing. Thus, the host and guest view of the memory backing
store are now completely separate.

With this patch, the individual memory controllers are far simpler as
all responsibility for serializing/unserializing is moved to the
physical memory. Currently, the functionality is more or less moved
from AbstractMemory to PhysicalMemory without any major
changes. However, in a future patch the physical memory will also
resolve any ranges that are interleaved and properly assign the
backing store to the memory controllers, and keep the host memory as a
single contigous chunk per address range.

Functionality for future extensions which involve CPU virtualization
also enable the host to get pointers to the backing store.
2012-10-15 08:12:32 -04:00
Andreas Hansson 0c58106b6e Mem: Use deque instead of list for bus retries
This patch changes the data structure used to keep track of ports that
should be told to retry. As the bus is doing this in an FCFS way,
there is no point having a list. A deque is a better match (and is at
least in theory a better choice from a performance point of view).
2012-10-15 08:12:25 -04:00
Andreas Hansson 93a159875a Fix: Address a few minor issues identified by cppcheck
This patch addresses a number of smaller issues identified by the code
inspection utility cppcheck. There are a number of identified leaks in
the arm/linux/system.cc (although the function only get's called once
so it is not a major problem), a few deletes in dev/x86/i8042.cc that
were not array deletes, and sprintfs where the character array had one
element less than needed. In the IIC tags there was a function
allocating an array of longs which is in fact never used.
2012-10-15 08:12:23 -04:00
Andreas Hansson 88554790c3 Mem: Use cycles to express cache-related latencies
This patch changes the cache-related latencies from an absolute time
expressed in Ticks, to a number of cycles that can be scaled with the
clock period of the caches. Ultimately this patch serves to enable
future work that involves dynamic frequency scaling. As an immediate
benefit it also makes it more convenient to specify cache performance
without implicitly assuming a specific CPU core operating frequency.

The stat blocked_cycles that actually counter in ticks is now updated
to count in cycles.

As the timing is now rounded to the clock edges of the cache, there
are some regressions that change. Plenty of them have very minor
changes, whereas some regressions with a short run-time are perturbed
quite significantly. A follow-on patch updates all the statistics for
the regressions.
2012-10-15 08:10:54 -04:00
Andreas Hansson 36d199b9a9 Mem: Use range operations in bus in preparation for striping
This patch transitions the bus to use the AddrRange operations instead
of directly accessing the start and end. The change facilitates the
move to a more elaborate AddrRange class that also supports address
striping in the bus by specifying interleaving bits in the ranges.

Two new functions are added to the AddrRange to determine if two
ranges intersect, and if one is a subset of another. The bus
propagation of address ranges is also tweaked such that an update is
only propagated if the bus received information from all the
downstream slave modules. This avoids the iteration and need for the
cycle-breaking scheme that was previously used.
2012-10-15 08:07:04 -04:00
Andreas Hansson 43ca8415e8 Mem: Determine bus block size during initialisation
This patch moves the block size computation from findBlockSize to
initialisation time, once all the neighbouring ports are connected.

There is no need to dynamically update the block size, and the caching
of the value effectively avoided that anyhow. This is very similar to
what was already in place, just with a slightly leaner implementation.
2012-10-11 06:38:43 -04:00
Nilay Vaish 88ba1c452b ruby: makes some members non-static
This patch makes some of the members (profiler, network, memory vector)
of ruby system non-static.
2012-10-02 14:35:45 -05:00
Nilay Vaish 4488379244 ruby: changes to simple network
This patch makes the Switch structure inherit from BasicRouter, as is
done in two other networks.
2012-10-02 14:35:45 -05:00
Nilay Vaish b370f6a7b2 ruby: rename template_hack to template
I don't like using the word hack. Hence, the patch.
2012-10-02 14:35:44 -05:00
Nilay Vaish d58f84c481 ruby: remove unused code in protocols 2012-10-02 14:35:44 -05:00
Nilay Vaish 73eafe4849 ruby: remove some unused things in slicc
This patch removes the parts of slicc that were required for multi-chip
protocols. Going ahead, it seems multi-chip protocols would be implemented
by playing with the network itself.
2012-10-02 14:35:43 -05:00
Nilay Vaish 3c9d3b16d8 ruby: move functional access to ruby system
This patch moves the code for functional accesses to ruby system. This is
because the subsequent patches add support for making functional accesses
to the messages in the interconnect. Making those accesses from the ruby port
would be cumbersome.
2012-10-02 14:35:42 -05:00
Nilay Vaish 95664da097 MI coherence protocol: add copyright notice 2012-09-30 13:20:53 -05:00
Djordje Kovacevic 80a26a3e39 MEM: Put memory system document into doxygen 2012-09-25 11:49:41 -05:00
Mrinmoy Ghosh 6fc0094337 Cache: add a response latency to the caches
In the current caches the hit latency is paid twice on a miss. This patch lets
a configurable response latency be set of the cache for the backward path.
2012-09-25 11:49:41 -05:00
Ali Saidi 396600de10 mem: Add a gasket that allows memory ranges to be re-mapped.
For example if DRAM is at two locations and mirrored this patch allows the
mirroring to occur.
2012-09-25 11:49:40 -05:00
Joel Hestness 4095af5fd6 RubyPort and Sequencer: Fix draining
Fix the drain functionality of the RubyPort to only call drain on child ports
during a system-wide drain process, instead of calling each time that a
ruby_hit_callback is executed.

This fixes the issue of the RubyPort ports being reawakened during the drain
simulation, possibly with work they didn't previously have to complete. If
they have new work, they may call process on the drain event that they had
not registered work for, causing an assertion failure when completing the
drain event.

Also, in RubyPort, set the drainEvent to NULL when there are no events
to be drained. If not set to NULL, the drain loop can result in stale
drainEvents used.
2012-09-23 13:57:08 -05:00
Andreas Hansson 3b6a143ec5 DRAM: Introduce SimpleDRAM to capture a high-level controller
This patch introduces a high-level model of a DRAM controller, with a
basic read/write buffer structure, a selectable and customisable
arbiter, a few address mapping options, and the basic DRAM timing
constraints. The parameters make it possible to turn this model into
any desired DDRx/LPDDRx/WideIOx memory controller.

The intention is not to be cycle accurate or capture every aspect of a
DDR DRAM interface, but rather to enable exploring of the high-level
knobs with a good simulation speed. Thus, contrary to e.g. DRAMSim
this module emphasizes simulation speed with a good-enough accuracy.

This module is merely a starting point, and there are plenty additions
and improvements to come. A notable addition is the support for
address-striping in the bus to enable a multi-channel DRAM
controller. Also note that there are still a few "todo's" in the code
base that will be addressed as we go along.

A follow-up patch will add basic performance regressions that use the
traffic generator to exercise a few well-defined corner cases.
2012-09-21 11:48:13 -04:00
Andreas Hansson 4aee3aa073 Mem: Tidy up bus member variables types
This patch merely tidies up the types used for the bus member
variables. It also makes the constant ones const.
2012-09-21 10:11:24 -04:00
Anthony Gutierrez 9cd0c5ecc8 bus: removed outdated warn regarding 64 B block sizes
this warn is outdated as 64 B blocks are very common, and even
the default size for some CPU types. E.g., arm_detailed.
2012-09-20 17:25:52 -04:00
Andreas Hansson a731f8f9dd Mem: Remove the file parameter from AbstractMemory
This patch removes the unused file parameter from the
AbstractMemory. The patch serves to make it easier to transition to a
separation of the actual contigious host memory backing store, and the
gem5 memory controllers.

Without the file parameter it becomes easier to hide the creation of
the mmap in the PhysicalMemory, as there are no longer any reasons to
expose the actual contigious ranges to the user.

To the best of my knowledge there is no use of the parameter, so the
change should not affect anyone.
2012-09-19 06:15:46 -04:00
Andreas Hansson ffb6aec603 AddrRange: Transition from Range<T> to AddrRange
This patch takes the final plunge and transitions from the templated
Range class to the more specific AddrRange. In doing so it changes the
obvious Range<Addr> to AddrRange, and also bumps the range_map to be
AddrRangeMap.

In addition to the obvious changes, including the removal of redundant
includes, this patch also does some house keeping in preparing for the
introduction of address interleaving support in the ranges. The Range
class is also stripped of all the functionality that is never used.

--HG--
rename : src/base/range.hh => src/base/addr_range.hh
rename : src/base/range_map.hh => src/base/addr_range_map.hh
2012-09-19 06:15:44 -04:00
Nilay Vaish 33c904e0a5 ruby: eliminate typedef integer_t 2012-09-18 22:49:12 -05:00
Nilay Vaish 86b1c0fd54 ruby: avoid using g_system_ptr for event scheduling
This patch removes the use of g_system_ptr for event scheduling. Each consumer
object now needs to specify upfront an EventManager object it would use for
scheduling events. This makes the ruby memory system more amenable for a
multi-threaded simulation.
2012-09-18 22:46:34 -05:00
Andreas Hansson 7c55464aac Mem: Add a maximum bandwidth to SimpleMemory
This patch makes a minor addition to the SimpleMemory by enforcing a
maximum data rate. The bandwidth is configurable, and a reasonable
value (12.8GB/s) has been choosen as the default.

The changes do add some complexity to the SimpleMemory, but they
should definitely be justifiable as this enables a far more realistic
setup using even this simple memory controller.

The rate regulation is done for reads and writes combined to reflect
the bidirectional data busses used by most (if not all) relevant
memories. Moreover, the regulation is done per packet as opposed to
long term, as it is the short term data rate (data bus width times
frequency) that is the limiting factor.

A follow-up patch bumps the stats for the regressions.
2012-09-18 10:30:02 -04:00
Andreas Hansson 806a1144ce scons: Use c++0x with gcc >= 4.4 instead of 4.6
This patch shifts the version of gcc for which we enable c++0x from
4.6 to 4.4 The more long term plan is to see what the c++0x features
can bring and what level of support would be enabled simply by bumping
the required version of gcc from 4.3 to 4.4.

A few minor things had to be fixed in the code base, most notably the
choice of a hashmap implementation. In the Ruby Sequencer there were
also a few minor issues that gcc 4.4 was not too happy about.
2012-09-14 12:13:18 -04:00
Jason Power aa8bcd15ec Ruby: Modify Scons so that we can put .sm files in extras
Also allows for header files which are required in slicc generated
code to be in a directory other than src/mem/ruby/slicc_interface.
2012-09-12 14:52:04 -05:00
Andreas Hansson 292d8252a4 clang: Fix issues identified by the clang static analyzer
This patch addresses a few minor issues reported by the clang static
analyzer.

The analysis was run with:

scan-build -disable-checker deadcode \
           -enable-checker experimental.core \
           -disable-checker experimental.core.CastToStruct \
           -enable-checker experimental.cpluscplus
2012-09-11 14:15:47 -04:00
Lena Olson 584eba3ab6 Cache: Split invalidateBlk up to seperate block vs. tags
This seperates the functionality to clear the state in a block into
blk.hh and the functionality to udpate the tag information into the
tags.  This gets rid of the case where calling invalidateBlk on an
already-invalid block does something different than calling it on a
valid block, which was confusing.
2012-09-11 14:14:49 -04:00
Nilay Vaish 637c6c7e32 Ruby: Use uint32_t instead of uint32 everywhere 2012-09-11 09:24:45 -05:00
Nilay Vaish f00347a20f Ruby: Use uint8_t instead of uint8 everywhere 2012-09-11 09:23:56 -05:00
Nilay Vaish c5bf1390aa Ruby System: Convert to Clocked Object
This patch moves Ruby System from being a SimObject to recently introduced
ClockedObject.
2012-09-10 12:21:01 -05:00
Nilay Vaish 4e6f048ef0 Ruby Slicc: remove the call to cin.get() function
If I understand correctly, this was put in place so that a debugger can be
attached when the protocol aborts. While this sounds useful, it is a problem
when the simulation is not being actively monitored. I think it is better to
remove this.
2012-09-10 12:20:34 -05:00
Marco Elver 9e0edbcea8 Mem: Allow serializing of more than INT_MAX bytes
Despite gzwrite taking an unsigned for length, it returns an int for
bytes written; gzwrite fails if (int)len < 0.  Because of this, call
gzwrite with len no larger than INT_MAX: write in blocks of INT_MAX if
data to be written is larger than INT_MAX.
2012-09-10 11:57:43 -04:00
Andreas Hansson 287ea1a081 Param: Transition to Cycles for relevant parameters
This patch is a first step to using Cycles as a parameter type. The
main affected modules are the CPUs and the Ruby caches. There are
definitely plenty more places that are affected, but this patch serves
as a starting point to making the transition.

An important part of this patch is to actually enable parameters to be
specified as Param.Cycles which involves some changes to params.py.
2012-09-07 12:34:38 -04:00
Joel Hestness 6924e10978 Ruby Memory Controller: Fix clocking 2012-09-05 20:51:41 -05:00
Jason Power 494f6a858e Ruby: Correct DataBlock =operator
The =operator for the DataBlock class was incorrectly interpreting the class
member m_alloc. This variable stands for whether the assigned memory for the
data block needs to be freed or not by the class itself. It seems that the
=operator interpreted the variable as whether the memory is assigned to the
data block. This wrong interpretation was causing values not to propagate
to RubySystem::m_mem_vec_ptr. This caused major issues with restoring from
checkpoints when using a protocol which verified that the cache data was
consistent with the backing store (i.e. MOESI-hammer).
2012-08-28 17:57:51 -05:00
Andreas Hansson 0cacf7e817 Clock: Add a Cycles wrapper class and use where applicable
This patch addresses the comments and feedback on the preceding patch
that reworks the clocks and now more clearly shows where cycles
(relative cycle counts) are used to express time.

Instead of bumping the existing patch I chose to make this a separate
patch, merely to try and focus the discussion around a smaller set of
changes. The two patches will be pushed together though.

This changes done as part of this patch are mostly following directly
from the introduction of the wrapper class, and change enough code to
make things compile and run again. There are definitely more places
where int/uint/Tick is still used to represent cycles, and it will
take some time to chase them all down. Similarly, a lot of parameters
should be changed from Param.Tick and Param.Unsigned to
Param.Cycles.

In addition, the use of curTick is questionable as there should not be
an absolute cycle. Potential solutions can be built on top of this
patch. There is a similar situation in the o3 CPU where
lastRunningCycle is currently counting in Cycles, and is still an
absolute time. More discussion to be had in other words.

An additional change that would be appropriate in the future is to
perform a similar wrapping of Tick and probably also introduce a
Ticks class along with suitable operators for all these classes.
2012-08-28 14:30:33 -04:00
Andreas Hansson d14e5857c7 Port: Stricter port bind/unbind semantics
This patch tightens up the semantics around port binding and checks
that the ports that are being bound are currently not connected, and
similarly connected before unbind is called.

The patch consequently also changes the order of the unbind and bind
for the switching of CPUs to ensure that the rules are adhered
to. Previously the ports would be "over-written" without any check.

There are no changes in behaviour due to this patch, and the only
place where the unbind functionality is used is in the CPU.
2012-08-28 14:30:27 -04:00
Nilay Vaish 85c7352462 Ruby: remove README.debugging and Decommissioning_note
These files were relevant when Ruby was part of GEMS. They are not required
any longer.
2012-08-27 14:57:46 -05:00
Nilay Vaish 9190940511 Ruby: Remove RubyEventQueue
This patch removes RubyEventQueue. Consumer objects now rely on RubySystem
or themselves for scheduling events.
2012-08-27 01:00:55 -05:00
Nilay Vaish 7122b83d8f Ruby Memory Vector: Allow more than 4GB of memory
The memory size variable was a 32-bit int. This meant that the size of the
memory was limited to 4GB. This patch changes the type of the variable to
64-bit to support larger memory sizes. Thanks to Raghuraman Balasubramanian
for bringing this to notice.
2012-08-27 01:00:54 -05:00
Nilay Vaish b422994fea MESI Protocol: Correct the virtual network in profile functions
The virtual network in a couple of places was incorrectly mentioned
as 3 in place of 1. This is being corrected.
2012-08-25 15:49:06 -05:00
Nilay Vaish 01f1430833 MESI Coherence Protocol: Add copyright notice 2012-08-25 13:16:45 -05:00
Andreas Hansson c60db56741 Packet: Remove NACKs from packet and its use in endpoints
This patch removes the NACK frrom the packet as there is no longer any
module in the system that issues them (the bridge was the only one and
the previous patch removes that).

The handling of NACKs was mostly avoided throughout the code base, by
using e.g. panic or assert false, but in a few locations the NACKs
were actually dealt with (although NACKs never occured in any of the
regressions). Most notably, the DMA port will now never receive a NACK
and the backoff time is thus never changed. As a consequence, the
entire backoff mechanism (similar to a PCI bus) is now removed and the
DMA port entirely relies on the bus performing the arbitration and
issuing a retry when appropriate. This is more in line with e.g. PCIe.

Surprisingly, this patch has no impact on any of the regressions. As
mentioned in the patch that removes the NACK from the bridge, a
follow-up patch should change the request and response buffer size for
at least one regression to also verify that the system behaves as
expected when the bridge fills up.
2012-08-22 11:39:59 -04:00
Andreas Hansson a6074016e2 Bridge: Remove NACKs in the bridge and unify with packet queue
This patch removes the NACKing in the bridge, as the split
request/response busses now ensure that protocol deadlocks do not
occur, i.e. the message-dependency chain is broken by always allowing
responses to make progress without being stalled by requests. The
NACKs had limited support in the system with most components ignoring
their use (with a suitable call to panic), and as the NACKs are no
longer needed to avoid protocol deadlocks, the cleanest way is to
simply remove them.

The bridge is the starting point as this is the only place where the
NACKs are created. A follow-up patch will remove the code that deals
with NACKs in the endpoints, e.g. the X86 table walker and DMA
port. Ultimately the type of packet can be complete removed (until
someone sees a need for modelling more complex protocols, which can
now be done in parts of the system since the port and interface is
split).

As a consequence of the NACK removal, the bridge now has to send a
retry to a master if the request or response queue was full on the
first attempt. This change also makes the bridge ports very similar to
QueuedPorts, and a later patch will change the bridge to use these. A
first step in this direction is taken by aligning the name of the
member functions, as done by this patch.

A bit of tidying up has also been done as part of the simplifications.

Surprisingly, this patch has no impact on any of the
regressions. Hence, there was never any NACKs issued. In a follow-up
patch I would suggest changing the size of the bridge buffers set in
FSConfig.py to also test the situation where the bridge fills up.
2012-08-22 11:39:58 -04:00
Andreas Hansson e317d8b9ff Port: Extend the QueuedPort interface and use where appropriate
This patch extends the queued port interfaces with methods for
scheduling the transmission of a timing request/response. The methods
are named similar to the corresponding sendTiming(Snoop)Req/Resp,
replacing the "send" with "sched". As the queues are currently
unbounded, the methods always succeed and hence do not return a value.

This functionality was previously provided in the subclasses by
calling PacketQueue::schedSendTiming with the appropriate
parameters. With this change, there is no need to introduce these
extra methods in the subclasses, and the use of the queued interface
is more uniform and explicit.
2012-08-22 11:39:56 -04:00
Andreas Hansson 5803309574 PacketQueue: Allow queuing in the same tick as desired send tick
This patch allows packets to be enqueued in the same tick as they are
intended to be sent. This does not imply they actually are sent that
tick, although that is possible.

This change is useful for module that use the queued ports primarly to
avoid handling the flow control involved in sending and retrying
packets.
2012-08-21 05:49:24 -04:00
Andreas Hansson 452217817f Clock: Move the clock and related functions to ClockedObject
This patch moves the clock of the CPU, bus, and numerous devices to
the new class ClockedObject, that sits in between the SimObject and
MemObject in the class hierarchy. Although there are currently a fair
amount of MemObjects that do not make use of the clock, they
potentially should do so, e.g. the caches should at some point have
the same clock as the CPU, potentially with a 1:n ratio. This patch
does not introduce any new clock objects or object hierarchies
(clusters, clock domains etc), but is still a step in the direction of
having a more structured approach clock domains.

The most contentious part of this patch is the serialisation of clocks
that some of the modules (but not all) did previously. This
serialisation should not be needed as the clock is set through the
parameters even when restoring from the checkpoint. In other words,
the state is "stored" in the Python code that creates the modules.

The nextCycle methods are also simplified and the clock phase
parameter of the CPU is removed (this could be part of a clock object
once they are introduced).
2012-08-21 05:49:01 -04:00
Nilay Vaish 0160d51483 Ruby Banked Array: add copyrights 2012-08-19 13:05:53 -05:00
Jason Power 44b4c96253 Ruby: Add RubySystem parameter to MemoryControl
This guarantees that RubySystem object is created before the MemoryController
object is created.
2012-08-16 23:39:36 -05:00
Anthony Gutierrez 0b3897fc90 O3,ARM: fix some problems with drain/switchout functionality and add Drain DPRINTFs
This patch fixes some problems with the drain/switchout functionality
for the O3 cpu and for the ARM ISA and adds some useful debug print
statements.

This is an incremental fix as there are still a few bugs/mem leaks with the
switchout code. Particularly when switching from an O3CPU to a
TimingSimpleCPU. However, when switching from O3 to O3 cores with the ARM ISA
I haven't encountered any more assertion failures; now the kernel will
typically panic inside of simulation.
2012-08-15 10:38:08 -04:00
Jason Power 11411cc9c7 Ruby: Clean up topology changes
This patch moves instantiateTopology into Ruby.py and removes the
mem/ruby/network/topologies directory. It also adds some extra inheritance to
the topologies to clean up some issues in the existing topologies.
2012-08-10 13:50:42 -05:00
Steve Reinhardt f4b424cd53 SETranslatingPortProxy: fix bug in tryReadString()
Off-by-one loop termination meant that we were stuffing
the terminating '\0' into the std::string value, which
makes for difficult-to-debug string comparison failures.
2012-08-06 16:57:11 -07:00
Jason Power 6721b3e325 Ruby NetDest: add assert for bad element in netdest 2012-08-01 17:07:34 -05:00
Anthony Gutierrez 7bf14aedbf cache: don't allow dirty data in the i-cache
removes the optimization that forwards an exclusive copy to a requester on a
read, only for the i-cache. this optimization isn't necessary because we
typically won't be writing to the i-cache.
2012-07-27 16:08:04 -04:00
Andreas Hansson 66f5124e2b Bridge: Use EventWrapper instead of Event subclass for sendEvent
This class simply cleans up the code by making use of the EventWrapper
convenience class to schedule the sendEvent in the bridge ports.
2012-07-23 09:32:19 -04:00
Andreas Hansson f00cba34eb Mem: Make SimpleMemory single ported
This patch changes the simple memory to have a single slave port
rather than a vector port. The simple memory makes no attempts at
modelling the contention between multiple ports, and any such
multiplexing and demultiplexing could be done in a bus (or crossbar)
outside the memory controller. This scenario also matches with the
ongoing work on a SimpleDRAM model, which will be a single-ported
single-channel controller that can be used in conjunction with a bus
(or crossbar) to create a multi-port multi-channel controller.

There are only very few regressions that make use of the vector port,
and these are all for functional accesses only. To facilitate these
cases, memtest and memtest-ruby have been updated to also have a
"functional" bus to perform the (de)multiplexing of the functional
memory accesses.
2012-07-12 12:56:13 -04:00
Nilay Vaish b913af440b Ruby: remove config information from ruby.stats
This patch removes printConfig() functions from all structures in Ruby.
Most of the information is already part of config.ini, and where ever it
is not, it would become in due course.
2012-07-12 08:39:19 -05:00
Nilay Vaish ce4e9a9a50 Ruby: remove some unused stuff from SLICC files 2012-07-12 08:39:18 -05:00
Brad Beckmann 5931087dcd ruby: improved DRAM reset comment 2012-07-11 09:44:34 -07:00
Brad Beckmann 645fa9c262 # User Brad Beckmann <Brad.Beckmann@amd.com>
ruby: fixed fatal print statement
2012-07-10 22:51:54 -07:00
Brad Beckmann a22918dd41 # User Brad Beckmann <Brad.Beckmann@amd.com>
ruby: fixed msgptr print call
2012-07-10 22:51:54 -07:00
Brad Beckmann 884cd6f752 imported patch jason/slicc-external-structure-fix 2012-07-10 22:51:54 -07:00
Brad Beckmann 86d6b788f6 ruby: banked cache array resource model
This patch models a cache as separate tag and data arrays.  The patch exposes
the banked array as another resource that is checked by SLICC before a
transition is allowed to execute.  This is similar to how TBE entries and slots
in output ports are modeled.
2012-07-10 22:51:54 -07:00
Joel Hestness 467093ebf2 ruby: tag and data cache access support
Updates to Ruby to support statistics counting of cache accesses.  This feature
serves multiple purposes beyond simple stats collection.  It provides the
foundation for ruby to model the cache tag and data arrays as physical
resources, as well as provide the necessary input data for McPAT power
modeling.
2012-07-10 22:51:54 -07:00
Nuwan Jayasena c10f348120 ruby: adds reset function to Ruby memory controllers 2012-07-10 22:51:54 -07:00
Nuwan Jayasena 1740c4c448 ruby: memory controllers now inherit from an abstract "MemoryControl" class 2012-07-10 22:51:53 -07:00
Brad Beckmann 11b725c19d ruby: changes how Topologies are created
Instead of just passing a list of controllers to the makeTopology function
in src/mem/ruby/network/topologies/<Topo>.py we pass in a function pointer
which knows how to make the topology, possibly with some extra state set
in the configs/ruby/<protocol>.py file. Thus, we can move all of the files
from network/topologies to configs/topologies. A new class BaseTopology
is added which all topologies in configs/topologies must inheirit from and
follow its API.

--HG--
rename : src/mem/ruby/network/topologies/Crossbar.py => configs/topologies/Crossbar.py
rename : src/mem/ruby/network/topologies/Mesh.py => configs/topologies/Mesh.py
rename : src/mem/ruby/network/topologies/MeshDirCorners.py => configs/topologies/MeshDirCorners.py
rename : src/mem/ruby/network/topologies/Pt2Pt.py => configs/topologies/Pt2Pt.py
rename : src/mem/ruby/network/topologies/Torus.py => configs/topologies/Torus.py
2012-07-10 22:51:53 -07:00
Andreas Hansson d2f458e7b5 Mem: Make members relating to range and size constant
This patch makes the address-range related members const. The change
is trivial and merely ensures that they can be called on a const
memory.
2012-07-09 12:35:44 -04:00
Andreas Hansson 67e257f442 Port: Hide the queue implementation in SimpleTimingPort
This patch makes the queue implementation in the SimpleTimingPort
private to avoid confusion with the protected member queue in the
QueuedSlavePort. The SimpleTimingPort provides the queue_impl to the
QueuedSlavePort and it can be accessed via the reference in the base
class. The use of the member name queue is thus no longer overloaded.
2012-07-09 12:35:42 -04:00
Andreas Hansson b265d9925c Port: Align port names in C++ and Python
This patch is a first step to align the port names used in the Python
world and the C++ world. Ultimately it serves to make the use of
config.json together with output from the simulation easier, including
post-processing of statistics.

Most notably, the CPU, cache, and bus is addressed in this patch, and
there might be other ports that should be updated accordingly. The
dash name separator has also been replaced with a "." which is what is
used to concatenate the names in python, and a separation is made
between the master and slave port in the bus.
2012-07-09 12:35:39 -04:00
Andreas Hansson 1c2ee987f3 Bus: Make the default bus width 8 bytes instead of 64
This patch changes the default bus width to a more sensible 8 bytes
(64 bits), which is in line with most on-chip buses. Although there
are cases where a wider or narrower bus is useful, the 8 bytes is a
good compromise to serve as the default.

This patch changes essentially all statistics, and will be bundled
with the outstanding changes to the bus.
2012-07-09 12:35:38 -04:00
Andreas Hansson 8caaac048a Bus: Split the bus into separate request/response layers
This patch splits the existing buses into multiple layers. The
non-coherent bus is split into a request and a response layer, and the
coherent bus adds an additional layer for the snoop responses. The
layer is modified to be templatised on the port type, such that the
different layers can have retryLists with either master or slave
ports. This patch also removes the dynamic cast from the retry, as
previously promised when moving the recvRetry from the port base class
to the master/slave port respectively.

Overall, the split bus more closely reflects any modern on-chip bus
and should be at step in the right direction. From this point, it
would be reasonable straight forward to add separate layers (and thus
contention points and arbitration) for each port and thus create a
true crossbar.

The regressions all produce the correct output, but have varying
degrees of changes to their statistics. A separate patch will be
pushed with the updates to the reference statistics.
2012-07-09 12:35:37 -04:00
Andreas Hansson 995e6e4670 Bus: Add a notion of layers to the buses
This patch moves all flow control, arbitration and state information
into a bus layer. The layer is thus responsible for all the state
transitions, and for keeping hold of the retry list. Consequently the
layer is also responsible for the draining.

With this change, the non-coherent and coherent bus are given a single
layer to avoid changing any temporal behaviour, but the patch opens up
for adding more layers.
2012-07-09 12:35:36 -04:00
Andreas Hansson 14f9c77dd3 Bus: Replace tickNextIdle and inRetry with a state variable
This patch adds a state enum and member variable in the bus, tracking
the bus state, thus eliminating the need for tickNextIdle and inRetry,
and fixing an issue that allowed the bus to be occupied by multiple
packets at once (hopefully it also makes it easier to understand the
code).

The bus, in its current form, uses tickNextIdle and inRetry to keep
track of the state of the bus. However, it only updates tickNextIdle
_after_ forwarding a packet using sendTiming, and the result is that
the bus is still seen as idle, and a module that receives the packet
and starts transmitting new packets in zero time will still see the
bus as idle (and this is done by a number of DMA devices). The issue
can also be seen in isOccupied where the bus calls reschedule on an
event instead of schedule.

This patch addresses the problem by marking the bus as _not_ idle
already by the time we conclude that the bus is not occupied and we
will deal with the packet.

As a result of not allowing multiple packets to occupy the bus, some
regressions have slight changes in their statistics. A separate patch
updates these accordingly.

Further ahead, a follow-on patch will introduce a separate state
variable for request/responses/snoop responses, and thus implement a
split request/response bus with separate flow control for the
different message types (even further ahead it will introduce a
multi-layer bus).
2012-07-09 12:35:35 -04:00
Andreas Hansson 46d9adb68c Port: Make getAddrRanges const
This patch makes getAddrRanges const throughout the code base. There
is no reason why it should not be, and making it const prevents adding
any unintentional side-effects.
2012-07-09 12:35:34 -04:00
Andreas Hansson 830391cad9 Port: Add getAddrRanges to master port (asking slave port)
This patch adds getAddrRanges to the master port, and thus avoids
going through getSlavePort to be able to ask the slave. Similar to the
previous patch that added isSnooping to the SlavePort, this patch aims
to introduce an additional level of hierarchy in the ports (base port
being protocol-agnostic) and getSlave/MasterPort will return port
pointers to these base classes.

The function is named getAddrRanges also on the master port, but does
nothing besides asking the connected slave port. The slave port, as
before, has to provide an implementation and actually produce a list
of address ranges. The initial design used the name getSlaveAddrRanges
for the new function, but the more verbose name was later changed.
2012-07-09 12:35:33 -04:00
Andreas Hansson 49407d76aa Port: Add isSnooping to slave port (asking master port)
This patch adds isSnooping to the slave port, and thus avoids going
through getMasterPort to be able to ask the master. Over the course of
the next few patches, all getMasterPort/getSlavePort in Port and
MemObject are to be protocol agnostic, and the snooping is part of the
protocol layer.

The function is already present on the master port, where it is
implemented by the module itself, e.g. a cache. On the slave side, it
is merely asking the connected master port. The same name is used by
both functions despite their difference in behaviour. The initial
design used isMasterSnooping on the slave port side, but the more
verbose function name was later changed.
2012-07-09 12:35:32 -04:00
Andreas Hansson 17f9270dad Port: Move retry from port base class to Master/SlavePort
This patch is the last part of moving all protocol-related
functionality out of the Port base class. All the send/recv functions
are already moved, and the retry (which still governs all the timing
transport functions) is the only part that remained in the base class.

The only point where this currently causes a bit of inconvenience is
in the bus where the retry list is global and holds Port pointers (not
Master/SlavePort). This is about to change with the split into a
request/response bus and will soon be removed anyway.

The patch has no impact on any regressions.
2012-07-09 12:35:31 -04:00
Andreas Hansson ff5718f042 Fix: Address a few benign memory leaks
This patch is the result of static analysis identifying a number of
memory leaks. The leaks are all benign as they are a result of not
deallocating memory in the desctructor. The fix still has value as it
removes false positives in the static analysis.
2012-07-09 12:35:30 -04:00
Lena Olson d2ebade5a5 Cache: Fix the LRU policy for classic memory hierarchy
The LRU policy always evicted the least recently touched way, even if it
contained valid data and another way was invalid, as can happen if a block has
been invalidated by coherance.  This can result in caches never warming up even
though they are replacing blocks.  This modifies the LRU policy to move blocks
to LRU position on invalidation.
2012-06-29 11:21:58 -04:00
Uri Wiener fcccab0dcd Bus: enable non/coherent buses sub-classes
This patch merely changes several methods to be virtual in order to enable
non/coherent buses sub-classes.
2012-06-29 11:19:08 -04:00
Dam Sunwoo 7cbe0cf564 Mem: fix master id assertion in cache_impl.hh
The assertion was applied to the wrong packet.
This patch fixes the issue rerported by Xiang Jiang on the gem5-dev mailing list.
2012-06-29 11:19:07 -04:00
Matt Evans 579047c76d Mem: Fix a livelock resulting in LLSC/locked memory access implementation.
Currently when multiple CPUs perform a load-linked/store-conditional sequence,
the loads all create a list of reservations which is then scanned when the
stores occur.  A reservation matching the context and address of the store is
sought, BUT all reservations matching the address are also erased at this point.

The upshot is that a store-conditional will remove all reservations even if the
store itself does not succeed.  A livelock was observed using 7-8 CPUs where a
thread would erase the reservations of other threads, not succeed, loop and put
its own reservation in again only to have it blown by another thread that
unsuccessfully now tries to store-conditional -- no forward progress was made,
hanging the system.

The correct way to do this is to only blow a reservation when a store
(conditional or not) actually /occurs/ to its address.  One thread always wins
(the one that does the store-conditional first).
2012-06-29 11:19:05 -04:00
Ali Saidi 8d1e56bdcd Cache: Only invalidate a line in the cache when an uncacheable write is seen. 2012-06-29 11:18:29 -04:00
Ali Saidi c80cd4136e mem: Delay deleting of incoming packets by one call.
This patch is a temporary fix until Andreas' four-phase patches
get reviewed and committed. Removing FastAlloc seems to have exposed
an issue which previously was reasonable rare in which packets are freed
before the sending cache is done with them. This change puts incoming packets
no a pendingDelete queue which are deleted at the start of the next call and
thus breaks the dependency between when the caller returns true and when the
packet is actually used by the sending cache.

Running valgrind on a multi-core linux boot and the memtester results in no
valgrind warnings.
2012-06-07 10:59:03 -04:00
Dam Sunwoo 14539ccae1 Mem: add per-master stats to physmem
Added per-master stats (similar to cache stats) to physmem.
2012-06-05 01:23:11 -04:00
Ali Saidi 1b370431d0 sim: Remove FastAlloc
While FastAlloc provides a small performance increase (~1.5%) over regular malloc it isn't thread safe.
After removing FastAlloc and using tcmalloc I've seen a performance increase of 12% over libc malloc
when running twolf for ARM.
2012-06-05 01:23:08 -04:00
Andreas Hansson 0d32940711 Bus: Split the bus into a non-coherent and coherent bus
This patch introduces a class hierarchy of buses, a non-coherent one,
and a coherent one, splitting the existing bus functionality. By doing
so it also enables further specialisation of the two types of buses.

A non-coherent bus connects a number of non-snooping masters and
slaves, and routes the request and response packets based on the
address. The request packets issued by the master connected to a
non-coherent bus could still snoop in caches attached to a coherent
bus, as is the case with the I/O bus and memory bus in most system
configurations. No snoops will, however, reach any master on the
non-coherent bus itself. The non-coherent bus can be used as a
template for modelling PCI, PCIe, and non-coherent AMBA and OCP buses,
and is typically used for the I/O buses.

A coherent bus connects a number of (potentially) snooping masters and
slaves, and routes the request and response packets based on the
address, and also forwards all requests to the snoopers and deals with
the snoop responses. The coherent bus can be used as a template for
modelling QPI, HyperTransport, ACE and coherent OCP buses, and is
typically used for the L1-to-L2 buses and as the main system
interconnect.

The configuration scripts are updated to use a NoncoherentBus for all
peripheral and I/O buses.

A bit of minor tidying up has also been done.

--HG--
rename : src/mem/bus.cc => src/mem/coherent_bus.cc
rename : src/mem/bus.hh => src/mem/coherent_bus.hh
rename : src/mem/bus.cc => src/mem/noncoherent_bus.cc
rename : src/mem/bus.hh => src/mem/noncoherent_bus.hh
2012-05-31 13:30:04 -04:00
Andreas Hansson b8cf48accc Bus: Remove redundant packet parameter from isOccupied
This patch merely remove the Packet* from the isOccupied member
function. Historically this was used to check if the packet was an
express snoop, but this is now done outside this function (where
relevant).
2012-05-30 05:31:11 -04:00
Andreas Hansson 5880fbe96d Bus: Turn the PortId into a transport function parameter
The main aim of this patch is to arrive at a suitable port interface
for vector ports, including both the packet and the port id. This
patch changes the bus transport functions
(recvFunctional/Atomic/Timing) to require a PortId parameter
indicating the source port. Previously this information was passed by
setting the source field of the packet, and this is only required in
the case of a timing request.

With this patch, the use of the source and destination field is also
more restrictive, as they are only needed for timing accesses. The
modifications to these fields for atomic snoops is now removed
entirely, also making minor modifications to the cache.
2012-05-30 05:30:24 -04:00
Andreas Hansson cad802761a Packet: Unify the use of PortID in packet and port
This patch removes the Packet::NodeID typedef and unifies it with the
Port::PortId. The src and dest fields in the packet are used to hold a
port id (e.g. in the bus), and thus the two should actually be the
same.

The typedef PortID is now global (in base/types.hh) and aligned with
the ThreadID in terms of capitalisation and naming of the
InvalidPortID constant.

Before this patch, two flags were used for valid destination and
source, rather than relying on a named value (InvalidPortID), and
this is now redundant, as the src and dest field themselves are
sufficient to tell whether the current value is a valid port
identifier or not. Consequently, the VALID_SRC and VALID_DST are
removed.

As part of the cleaning up, a number of int parameters and local
variables are updated to use PortID.

Note that Ruby still has its own NodeID typedef. Furthermore, the
MemObject getMaster/SlavePort still has an int idx parameter with a
default value of -1 which should eventually change to PortID idx =
InvalidPortID.
2012-05-30 05:29:42 -04:00
Andreas Hansson 6a54f7fc5f Packet: Updated comments for src and dest fields
This patch updates the comments for the src and dest fields to reflect
their actual use. Due to a number of patches (e.g. removing the
Broadcast flag), the old comments are no longer indicative of the
current usage.
2012-05-30 05:29:07 -04:00
Andreas Hansson 3b367db42c Bridge: Split deferred request, response and sender state
This patch splits the PacketBuffer class into a RequestState and a
DeferredRequest and DeferredResponse. Only the requests need a
SenderState, and the deferred requests and responses only need an
associated point in time for the request and the response queue.

Besides the cleaning up, the goal is to simplify the transition to a
new port handshake, and with these changes, the two packet queues are
starting to look very similar to the generic packet queue, but
currently they do a few unique things relating to the NACK and
counting of requests/responses that the packet queue cannot be
conveniently used. This will be addressed in a later patch.
2012-05-30 05:28:06 -04:00
Andreas Hansson 49da0497d3 Cache: Remove dangling doWriteback declaration
This patch removes the declaration of doWriteback as there is no
implementation for this member function.
2012-05-24 04:09:19 -04:00
Andreas Hansson 3e0ed08706 Packet: Cleaning up packet command and attribute
This patch removes unused commands and attributes from the packet to
avoid any confusion. It is part of an effort to clear up how and where
different commands and attributes are used.
2012-05-23 09:18:04 -04:00
Nilay Vaish 1031fe7b6f Ruby: Remove the unused src/mem/ruby/common/Driver.* files. 2012-05-22 11:35:58 -05:00
Nilay Vaish 6a966d5eeb Ruby Sequencer: Schedule deadlock check event at correct time
The scheduling of the deadlock check event was being done incorrectly as the
clock was not being multiplied, so as to convert the time into ticks. This
patch removes that bug.
2012-05-22 11:32:57 -05:00
Ali Saidi 041b932428 mem: fix bug with CopyStringOut and null string termination. 2012-05-10 18:04:27 -05:00
Ali Saidi c02dc07424 Cache: restructure code that actually isn't a loop 2012-05-10 18:04:27 -05:00
Ali Saidi 5745665509 gem5: assert before indexing intro arrays to verify bounds 2012-05-10 18:04:27 -05:00
Ali Saidi 4f66bcdd2e gem5: fix some iterator use and erase bugs 2012-05-10 18:04:27 -05:00
Ali Saidi 8cee4dacc8 gem5: Fix a number of incorrect case statements 2012-05-10 18:04:26 -05:00
Ali Saidi f6895e8bd4 Cache: Panic if you attempt to create a checkpoint with a cache in the system 2012-05-10 18:04:26 -05:00
Andreas Hansson ab23e29487 MEM: Add the communication monitor
This patch adds a communication monitor MemObject that can be inserted
between a master and slave port to provide a range of statistics about
the communication passing through it. The communication monitor is
non-invasive and does not change any properties or timing of the
packets, with the exception of adding a sender state to be able to
track latency. The statistics are only collected in timing mode (not
atomic) to avoid slowing down any fast forwarding.

An example of the statistics captured by the monitor are: read/write
burst lengths, bandwidth, request-response latency, outstanding
transactions, inter transaction time, transaction count, and address
distribution. The monitor can be used in combination with periodic
resetting and dumping of stats (through schedStatEvent) to study the
behaviour over time.

In future patches, a selection of convenience scripts will be added to
aid in visualising the statistics collected by the monitor.
2012-05-09 04:37:45 -04:00
Andreas Hansson 692351ea34 MEM: Do not forward uncacheable to bus snoopers
This patch adds a guarding if-statement to avoid forwarding
uncacheable requests (or rather their corresponding request packets)
to bus snoopers. These packets should never have any effect on the
caches, and thus there is no need to forward them to the snoopers.
2012-05-08 05:15:52 -04:00
Andreas Hansson 15e28c5ba6 Ruby: Ensure snoop requests are sent using sendTimingSnoopReq
This patch fixes a bug that caused snoop requests to be placed in a
packet queue. Instead, the packet is now sent immediately using
sendTimingSnoopReq, thus bypassing the packet queue and any normal
responses waiting to be sent.
2012-05-04 03:30:02 -04:00
Andreas Hansson 3fea59e162 MEM: Separate requests and responses for timing accesses
This patch moves send/recvTiming and send/recvTimingSnoop from the
Port base class to the MasterPort and SlavePort, and also splits them
into separate member functions for requests and responses:
send/recvTimingReq, send/recvTimingResp, and send/recvTimingSnoopReq,
send/recvTimingSnoopResp. A master port sends requests and receives
responses, and also receives snoop requests and sends snoop
responses. A slave port has the reciprocal behaviour as it receives
requests and sends responses, and sends snoop requests and receives
snoop responses.

For all MemObjects that have only master ports or slave ports (but not
both), e.g. a CPU, or a PIO device, this patch merely adds more
clarity to what kind of access is taking place. For example, a CPU
port used to call sendTiming, and will now call
sendTimingReq. Similarly, a response previously came back through
recvTiming, which is now recvTimingResp. For the modules that have
both master and slave ports, e.g. the bus, the behaviour was
previously relying on branches based on pkt->isRequest(), and this is
now replaced with a direct call to the apprioriate member function
depending on the type of access. Please note that send/recvRetry is
still shared by all the timing accessors and remains in the Port base
class for now (to maintain the current bus functionality and avoid
changing the statistics of all regressions).

The packet queue is split into a MasterPort and SlavePort version to
facilitate the use of the new timing accessors. All uses of the
PacketQueue are updated accordingly.

With this patch, the type of packet (request or response) is now well
defined for each type of access, and asserts on pkt->isRequest() and
pkt->isResponse() are now moved to the appropriate send member
functions. It is also worth noting that sendTimingSnoopReq no longer
returns a boolean, as the semantics do not alow snoop requests to be
rejected or stalled. All these assumptions are now excplicitly part of
the port interface itself.
2012-05-01 13:40:42 -04:00
Nilay Vaish 04a558bb41 Garnet: Correct computation of link utilization
The computation for link utilization was incorrect for the flexible network.
The utilization was being divided twice by the total time.
2012-04-28 16:57:31 -05:00
Nilay Vaish c3dad222e3 Ruby: Remove extra statements from Sequencer 2012-04-25 17:52:03 -05:00
Andreas Hansson beed20d7bc MEM: Use base class Master/SlavePort pointers in the bus
This patch makes some rather trivial simplifications to the bus in
that it changes the use of BusMasterPort and BusSlavePort pointers to
simply use MasterPort and SlavePort (iterators are also updated
accordingly).

This change is a step towards a future patch that introduces a
separation of the interface and the structural port itself.
2012-04-25 10:45:23 -04:00
Andreas Hansson 4c92708b48 MEM: Add the PortId type and a corresponding id field to Port
This patch introduces the PortId type, moves the definition of
INVALID_PORT_ID to the Port class, and also gives every port an id to
reflect the fact that each element in a vector port has an
identifier/index.

Previously the bus and Ruby testers (and potentially other users of
the vector ports) added the id field in their port subclasses, and now
this functionality is always present as it is moved to the base class.
2012-04-25 10:41:23 -04:00
Andreas Hansson 750f33a901 MEM: Remove the Broadcast destination from the packet
This patch simplifies the packet by removing the broadcast flag and
instead more firmly relying on (and enforcing) the semantics of
transactions in the classic memory system, i.e. request packets are
routed from a master to a slave based on the address, and when they
are created they have neither a valid source, nor destination. On
their way to the slave, the request packet is updated with a source
field for all modules that multiplex packets from multiple master
(e.g. a bus). When a request packet is turned into a response packet
(at the final slave), it moves the potentially populated source field
to the destination field, and the response packet is routed through
any multiplexing components back to the master based on the
destination field.

Modules that connect multiplexing components, such as caches and
bridges store any existing source and destination field in the sender
state as a stack (just as before).

The packet constructor is simplified in that there is no longer a need
to pass the Packet::Broadcast as the destination (this was always the
case for the classic memory system). In the case of Ruby, rather than
using the parameter to the constructor we now rely on setDest, as
there is already another three-argument constructor in the packet
class.

In many places where the packet information was printed as part of
DPRINTFs, request packets would be printed with a numeric "dest" that
would always be -1 (Broadcast) and that field is now removed from the
printing.
2012-04-14 05:45:55 -04:00
Andreas Hansson dccca0d3a9 MEM: Separate snoops and normal memory requests/responses
This patch introduces port access methods that separates snoop
request/responses from normal memory request/responses. The
differentiation is made for functional, atomic and timing accesses and
builds on the introduction of master and slave ports.

Before the introduction of this patch, the packets belonging to the
different phases of the protocol (request -> [forwarded snoop request
-> snoop response]* -> response) all use the same port access
functions, even though the snoop packets flow in the opposite
direction to the normal packet. That is, a coherent master sends
normal request and receives responses, but receives snoop requests and
sends snoop responses (vice versa for the slave). These two distinct
phases now use different access functions, as described below.

Starting with the functional access, a master sends a request to a
slave through sendFunctional, and the request packet is turned into a
response before the call returns. In a system without cache coherence,
this is all that is needed from the functional interface. For the
cache-coherent scenario, a slave also sends snoop requests to coherent
masters through sendFunctionalSnoop, with responses returned within
the same packet pointer. This is currently used by the bus and caches,
and the LSQ of the O3 CPU. The send/recvFunctional and
send/recvFunctionalSnoop are moved from the Port super class to the
appropriate subclass.

Atomic accesses follow the same flow as functional accesses, with
request being sent from master to slave through sendAtomic. In the
case of cache-coherent ports, a slave can send snoop requests to a
master through sendAtomicSnoop. Just as for the functional access
methods, the atomic send and receive member functions are moved to the
appropriate subclasses.

The timing access methods are different from the functional and atomic
in that requests and responses are separated in time and
send/recvTiming are used for both directions. Hence, a master uses
sendTiming to send a request to a slave, and a slave uses sendTiming
to send a response back to a master, at a later point in time. Snoop
requests and responses travel in the opposite direction, similar to
what happens in functional and atomic accesses. With the introduction
of this patch, it is possible to determine the direction of packets in
the bus, and no longer necessary to look for both a master and a slave
port with the requested port id.

In contrast to the normal recvFunctional, recvAtomic and recvTiming
that are pure virtual functions, the recvFunctionalSnoop,
recvAtomicSnoop and recvTimingSnoop have a default implementation that
calls panic. This is to allow non-coherent master and slave ports to
not implement these functions.
2012-04-14 05:45:07 -04:00
Andreas Hansson b6aa6d55eb clang/gcc: Fix compilation issues with clang 3.0 and gcc 4.6
This patch addresses a number of minor issues that cause problems when
compiling with clang >= 3.0 and gcc >= 4.6. Most importantly, it
avoids using the deprecated ext/hash_map and instead uses
unordered_map (and similarly so for the hash_set). To make use of the
new STL containers, g++ and clang has to be invoked with "-std=c++0x",
and this is now added for all gcc versions >= 4.6, and for clang >=
3.0. For gcc >= 4.3 and <= 4.5 and clang <= 3.0 we use the tr1
unordered_map to avoid the deprecation warning.

The addition of c++0x in turn causes a few problems, as the
compiler is more stringent and adds a number of new warnings. Below,
the most important issues are enumerated:

1) the use of namespaces is more strict, e.g. for isnan, and all
   headers opening the entire namespace std are now fixed.

2) another other issue caused by the more stringent compiler is the
   narrowing of the embedded python, which used to be a char array,
   and is now unsigned char since there were values larger than 128.

3) a particularly odd issue that arose with the new c++0x behaviour is
   found in range.hh, where the operator< causes gcc to complain about
   the template type parsing (the "<" is interpreted as the beginning
   of a template argument), and the problem seems to be related to the
   begin/end members introduced for the range-type iteration, which is
   a new feature in c++11.

As a minor update, this patch also fixes the build flags for the clang
debug target that used to be shared with gcc and incorrectly use
"-ggdb".
2012-04-14 05:43:31 -04:00
Andreas Hansson c9634d9b38 Ruby: Ensure order-dependent iteration uses an ordered map
This patch fixes a bug in Ruby that caused non-deterministic
simulation when changing the underlying hash map implementation. The
reason is order-dependent behaviour in combination with iteration over
the hash map contents. The two locations where a sorted container is
assumed are now changed to make use of a std::map instead of the
unordered hash map.

With this change, the stats changes slightly and the follow-on
changeset will update the relevant statistics.
2012-04-12 08:35:49 -04:00
Lisa Hsu a5287efc58 slicc: Controllers attached to Sequencers no longer have to be named L1Cache. 2012-04-06 13:47:08 -07:00
Brad Beckmann 5dfa4cd3f5 sim-ruby: checkpointing fixes and dependent eventq improvements
Fixes checkpointing with respect to lost events after swapping event queues.
Also adds DPRINTFs to better understand what's going on when Ruby serializes
and unserializes.
2012-04-06 13:47:07 -07:00
Brad Beckmann 70682e36dd slicc: fixed error message when the type has no inheritance 2012-04-06 13:47:07 -07:00
Brad Beckmann 5838ed7290 MOESI_hammer: tbe allocation and dependent wakeup fixes 2012-04-06 13:47:07 -07:00
Brad Beckmann f050ebe3a8 MOESI_hammer: fixed bug with single cpu + flushes, then modified the regression tester to check this functionality 2012-04-06 13:47:06 -07:00
Brad Beckmann 0a9f4b950f rubytest: seperated read and write ports.
This patch allows the ruby tester to support protocols where the i-cache and d-cache
are managed by seperate controllers.
2012-04-06 13:47:06 -07:00
Andreas Hansson b00949d88b MEM: Enable multiple distributed generalized memories
This patch removes the assumption on having on single instance of
PhysicalMemory, and enables a distributed memory where the individual
memories in the system are each responsible for a single contiguous
address range.

All memories inherit from an AbstractMemory that encompasses the basic
behaviuor of a random access memory, and provides untimed access
methods. What was previously called PhysicalMemory is now
SimpleMemory, and a subclass of AbstractMemory. All future types of
memory controllers should inherit from AbstractMemory.

To enable e.g. the atomic CPU and RubyPort to access the now
distributed memory, the system has a wrapper class, called
PhysicalMemory that is aware of all the memories in the system and
their associated address ranges. This class thus acts as an
infinitely-fast bus and performs address decoding for these "shortcut"
accesses. Each memory can specify that it should not be part of the
global address map (used e.g. by the functional memories by some
testers). Moreover, each memory can be configured to be reported to
the OS configuration table, useful for populating ATAG structures, and
any potential ACPI tables.

Checkpointing support currently assumes that all memories have the
same size and organisation when creating and resuming from the
checkpoint. A future patch will enable a more flexible
re-organisation.

--HG--
rename : src/mem/PhysicalMemory.py => src/mem/AbstractMemory.py
rename : src/mem/PhysicalMemory.py => src/mem/SimpleMemory.py
rename : src/mem/physical.cc => src/mem/abstract_mem.cc
rename : src/mem/physical.hh => src/mem/abstract_mem.hh
rename : src/mem/physical.cc => src/mem/simple_mem.cc
rename : src/mem/physical.hh => src/mem/simple_mem.hh
2012-04-06 13:46:31 -04:00
Andreas Hansson 74043c4f5c MEM: Remove legacy DRAM in preparation for memory updates
This patch removes the DRAM memory class in preparation for updates to
the memory system, with the first one introducing an abstract memory
class, and removing the assumption of a single physical memory.
2012-03-30 12:57:48 -04:00
Andreas Hansson a128ba7cd1 Ruby: Remove the physMemPort and instead access memory directly
This patch removes the physMemPort from the RubySequencer and instead
uses the system pointer to access the physmem. The system already
keeps track of the physmem and the valid memory address ranges, and
with this patch we merely make use of that existing functionality. The
memory is modified so that it is possible to call the access functions
(atomic and functional) without going through the port, and the memory
is allowed to be unconnected, i.e. have no ports (since Ruby does not
attach it like the conventional memory system).
2012-03-30 09:42:36 -04:00
William Wang f9d403a7b9 MEM: Introduce the master/slave port sub-classes in C++
This patch introduces the notion of a master and slave port in the C++
code, thus bringing the previous classification from the Python
classes into the corresponding simulation objects and memory objects.

The patch enables us to classify behaviours into the two bins and add
assumptions and enfore compliance, also simplifying the two
interfaces. As a starting point, isSnooping is confined to a master
port, and getAddrRanges to slave ports. More of these specilisations
are to come in later patches.

The getPort function is not getMasterPort and getSlavePort, and
returns a port reference rather than a pointer as NULL would never be
a valid return value. The default implementation of these two
functions is placed in MemObject, and calls fatal.

The one drawback with this specific patch is that it requires some
code duplication, e.g. QueuedPort becomes QueuedMasterPort and
QueuedSlavePort, and BusPort becomes BusMasterPort and BusSlavePort
(avoiding multiple inheritance). With the later introduction of the
port interfaces, moving the functionality outside the port itself, a
lot of the duplicated code will disappear again.
2012-03-30 09:40:11 -04:00
Andreas Hansson ca9790a2db Ruby: Fix Set::print for 32-bit hosts
This patch fixes a compilation error caused by a length mismatch on
32-bit hosts. The ifdef and sprintf is replaced by a csprintf.
2012-03-23 06:54:25 -04:00
Andreas Hansson 9727b1be18 MEM: Unify bus access methods and prepare for master/slave split
This patch unifies the recvFunctional, recvAtomic and recvTiming to
all be based on a similar structure: 1) extract information about the
incoming packet, 2) send it out to the appropriate snoopers, 3)
determine where it is going, and 4) forward it to the right
destination. The naming of variables across the different access
functions is now consistent as well.

Additionally, the patch introduces the member functions releaseBus and
retryWaiting to better distinguish between the two cases when we
should tell a sender to retry. The first case is when the bus goes
from busy to idle, and the second case is when it receives a retry
from a destination that did not immediatelly accept a packet.

As a very minor change, the MMU debug flag is no longer used in the bus.
2012-03-22 06:37:21 -04:00
Andreas Hansson c2d2ea99e3 MEM: Split SimpleTimingPort into PacketQueue and ports
This patch decouples the queueing and the port interactions to
simplify the introduction of the master and slave ports. By separating
the queueing functionality from the port itself, it becomes much
easier to distinguish between master and slave ports, and still retain
the queueing ability for both (without code duplication).

As part of the split into a PacketQueue and a port, there is now also
a hierarchy of two port classes, QueuedPort and SimpleTimingPort. The
QueuedPort is useful for ports that want to leave the packet
transmission of outgoing packets to the queue and is used by both
master and slave ports. The SimpleTimingPort inherits from the
QueuedPort and adds the implemention of recvTiming and recvFunctional
through recvAtomic.

The PioPort and MessagePort are cleaned up as part of the changes.

--HG--
rename : src/mem/tport.cc => src/mem/packet_queue.cc
rename : src/mem/tport.hh => src/mem/packet_queue.hh
2012-03-22 06:36:27 -04:00
Andreas Hansson fb395b56dd Scons: Remove Werror=False in SConscript files
This patch removes the overriding of "-Werror" in a handful of
cases. The code compiles with gcc 4.6.3 and clang 3.0 without any
warnings, and thus without any errors. There are no functional changes
introduced by this patch. In the future, rather than ypassing
"-Werror", address the warnings.
2012-03-22 06:34:50 -04:00
Tushar Krishna c9e4bca8d8 Garnet: Stats at vnet granularity + code cleanup
This patch
(1) Moves redundant code from fixed and flexible networks to BaseGarnetNetwork.
(2) Prints network stats at vnet granularity.
2012-03-19 17:34:17 -04:00
Ali Saidi eaa994e7f6 cache: Allow main memory to be at disjoint address ranges. 2012-03-09 09:59:25 -05:00
Marc Orr eb43883bef build scripts: Made minor modifications to reduce build overhead time.
1. --implicit-cache behavior is default.
2. makeEnv in src/SConscript is conditionally called.
3. decider set to MD5-timestamp
4. NO_HTML build option changed to SLICC_HTML (defaults to False)
2012-03-06 19:07:41 -08:00
Andreas Hansson adc419a13a Ruby: Rename RubyPort::sendTiming to avoid overriding base class
This patch renames the sendTiming member function in the RubyPort to
avoid inadvertently hiding Port::sendTiming (discovered through some
rather painful debugging). The RubyPort does, in fact, rely on the
functionality of the queued port and the implementation merely
schedules a send the next cycle. The new name for the member function
is sendNextCycle to better reflect this behaviour.

In the unlikely event that we ever shift to using C++11 the member
functions in Port should have a "final" identifier to prevent any
overriding in derived classes.
2012-03-02 09:16:50 -05:00
Ali Saidi d907d0ec72 Cache: Fix an issue with LRU when bonus block is used to complete transaction.
The block is never inserted because it's the one extra block in the cache, but
it can be invalidated twice in a row. In that case the block doesn't have a
new master id (beacuse it was never inserted), however it is valid and
the accounting goes wrong at that point.
2012-03-01 17:26:31 -06:00
Andreas Hansson e5ac647fc9 MEM: Make all the port proxy members const
This is a trivial patch that merely makes all the member functions of
the port proxies const. There is no good reason why they should not
be, and this change only serves to make it explicit that they are not
modified through their use.
2012-02-29 04:47:51 -05:00
Andreas Hansson 0cd0a8fdd3 MEM: Simplify cache ports preparing for master/slave split
This patch splits the two cache ports into a master (memory-side) and
slave (cpu-side) subclass of port with slightly different
functionality. For example, it is only the CPU-side port that blocks
incoming requests, and only the memory-side port that schedules send
events outside of what the transmit list dictates.

This patch simplifies the two classes by relying further on
SimpleTimingPort and also generalises the latter to better accommodate
the changes (introducing trySendTiming and scheduleSend). The
memory-side cache port overrides sendDeferredPacket to be able to not
only send responses from the transmit list, but also send requests
based on the MSHRs.

A follow on patch further simplifies the SimpleTimingPort and the
cache ports.
2012-02-24 11:52:49 -05:00
Andreas Hansson 77878d0a87 MEM: Prepare mport for master/slave split
This patch simplifies the mport in preparation for a split into a
master and slave role for the message ports. In particular,
sendMessageAtomic was only used in a single location and similarly so
sendMessageTiming. The affected interrupt device is updated
accordingly.
2012-02-24 11:50:15 -05:00
Andreas Hansson 485d103255 MEM: Move all read/write blob functions from Port to PortProxy
This patch moves the readBlob/writeBlob/memsetBlob from the Port class
to the PortProxy class, thus making a clear separation of the basic
port functionality (recv/send functional/atomic/timing), and the
higher-level functional accessors available on the port proxies.

There are only a few places in the code base where the blob functions
were used on ports, and they are all for peeking into the memory
system without making a normal memory access (in the memtest, and the
malta and tsunami pchip). The memtest also exemplifies how easy it is
to create a non-translating proxy if desired. The malta and tsunami
pchip used a slave port to perform a functional read, and this is now
changed to rely on the physProxy of the system (to which they already
have a pointer).
2012-02-24 11:46:39 -05:00
Andreas Hansson 9e3c8de30b MEM: Make port proxies use references rather than pointers
This patch is adding a clearer design intent to all objects that would
not be complete without a port proxy by making the proxies members
rathen than dynamically allocated. In essence, if NULL would not be a
valid value for the proxy, then we avoid using a pointer to make this
clear.

The same approach is used for the methods using these proxies, such as
loadSections, that now use references rather than pointers to better
reflect the fact that NULL would not be an acceptable value (in fact
the code would break and that is how this patch started out).

Overall the concept of "using a reference to express unconditional
composition where a NULL pointer is never valid" could be done on a
much broader scale throughout the code base, but for now it is only
done in the locations affected by the proxies.
2012-02-24 11:45:30 -05:00
Andreas Hansson 1031b824b9 MEM: Move port creation to the memory object(s) construction
This patch moves all port creation from the getPort method to be
consistently done in the MemObject's constructor. This is possible
thanks to the Swig interface passing the length of the vector ports.
Previously there was a mix of: 1) creating the ports as members (at
object construction time) and using getPort for the name resolution,
or 2) dynamically creating the ports in the getPort call. This is now
uniform. Furthermore, objects that would not be complete without a
port have these ports as members rather than having pointers to
dynamically allocated ports.

This patch also enables an elaboration-time enumeration of all the
ports in the system which can be used to determine the masterId.
2012-02-24 11:43:53 -05:00
Andreas Hansson 9f07d2ce7e CPU: Round-two unifying instr/data CPU ports across models
This patch continues the unification of how the different CPU models
create and share their instruction and data ports. Most importantly,
it forces every CPU to have an instruction and a data port, and gives
these ports explicit getters in the BaseCPU (getDataPort and
getInstPort). The patch helps in simplifying the code, make
assumptions more explicit, andfurther ease future patches related to
the CPU ports.

The biggest changes are in the in-order model (that was not modified
in the previous unification patch), which now moves the ports from the
CacheUnit to the CPU. It also distinguishes the instruction fetch and
load-store unit from the rest of the resources, and avoids the use of
indices and casting in favour of keeping track of these two units
explicitly (since they are always there anyways). The atomic, timing
and O3 model simply return references to their already existing ports.
2012-02-24 11:42:00 -05:00
Andreas Hansson ef4af8cec8 MEM: Fatal when no port can be found for an address
This patch adds a check in the findPort method to ensure that an
invalid port id is never returned. Previously this could happen if no
default port was set, and no address matched the request, in which
case -1 was returned causing a SEGFAULT when using the id to index in
the port array. To clean things up further a symbolic name is added
for the invalid port id.
2012-02-24 11:40:29 -05:00
Andreas Hansson 5a9a743cfc MEM: Introduce the master/slave port roles in the Python classes
This patch classifies all ports in Python as either Master or Slave
and enforces a binding of master to slave. Conceptually, a master (such
as a CPU or DMA port) issues requests, and receives responses, and
conversely, a slave (such as a memory or a PIO device) receives
requests and sends back responses. Currently there is no
differentiation between coherent and non-coherent masters and slaves.

The classification as master/slave also involves splitting the dual
role port of the bus into a master and slave port and updating all the
system assembly scripts to use the appropriate port. Similarly, the
interrupt devices have to have their int_port split into a master and
slave port. The intdev and its children have minimal changes to
facilitate the extra port.

Note that this patch does not enforce any port typing in the C++
world, it merely ensures that the Python objects have a notion of the
port roles and are connected in an appropriate manner. This check is
carried when two ports are connected, e.g. bus.master =
memory.port. The following patches will make use of the
classifications and specialise the C++ ports into masters and slaves.
2012-02-13 06:43:09 -05:00
Dam Sunwoo 230540e655 mem: fix cache stats to use request ids correctly
This patch fixes the cache stats to use the new request ids.
Cache stats also display the requestor names in the vector subnames.
Most cache stats now include "nozero" and "nonan" flags to reduce the
amount of excessive cache stat dump. Also, simplified
incMissCount()/incHitCount() functions.
2012-02-12 16:07:39 -06:00
Ali Saidi 8aaa39e93d mem: Add a master ID to each request object.
This change adds a master id to each request object which can be
used identify every device in the system that is capable of issuing a request.
This is part of the way to removing the numCpus+1 stats in the cache and
replacing them with the master ids. This is one of a series of changes
that make way for the stats output to be changed to python.
2012-02-12 16:07:38 -06:00
Mrinmoy Ghosh 7e104a1af2 prefetcher: Make prefetcher a sim object instead of it being a parameter on cache 2012-02-12 16:07:38 -06:00
Nilay Vaish aa513a4a99 Ruby: Remove isTagPresent() calls from Sequencer.cc
This patch removes the calls to isTagPresent() from Sequencer.cc. These
calls are made just for setting the cache block to have been most recently
used. The calls have been folded in to the function setMRU().
2012-02-10 11:29:02 -06:00
Nilay Vaish 69d8600bf8 MESI: Add queues for stalled requests
This patch adds support for stalling the requests queued up at different
controllers for the MESI CMP directory protocol. Earlier the controllers
would recycle the requests using some fixed latency. This results in
younger requests getting serviced first at times, and can result in
starvation. Instead all the requests that need a particular block to be
in a stable state are moved to a separate queue, where they wait till
that block returns to a stable state and then they are processed.
2012-02-10 11:05:24 -06:00