Commit graph

1406 commits

Author SHA1 Message Date
Emilio Castillo
80fa6a0edc ruby: Fixed a deadlock when restoring a checkpoint with garnet
This patch fixes a problem in Garnet where the enqueue time in the
VCallocator and the SWallocator, which is of type Cycles, was being stored
in a variable of type int.

This led to a known problem when restoring checkpoints with garnet & the
fixed pipeline enabled. The value was too large to fit in the variable and
overflowed it, so some conditions in the VC allocation stage & the SW
allocation stage were not met and the packets did not advance through the
network, leading to a deadlock panic right after the checkpoint was restored.

Committed by: Nilay Vaish <nilay@cs.wisc.edu>
2013-10-30 10:35:05 -05:00
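The overflow described in 80fa6a0edc can be illustrated with a minimal sketch. It assumes a 32-bit signed int and uses a hypothetical stage condition (`waited_one_cycle`), not Garnet's actual check; the helper names are illustrative only.

```python
INT_BITS = 32  # assumption: width of the C++ int that held the enqueue time


def store_in_int(value, bits=INT_BITS):
    """Model storing an unsigned count in a signed two's-complement int."""
    value &= (1 << bits) - 1
    if value >= 1 << (bits - 1):
        value -= 1 << bits
    return value


def waited_one_cycle(enqueue_time, current_cycle):
    """Hypothetical stage condition, not Garnet's actual check."""
    return current_cycle - enqueue_time == 1


# A large cycle count, such as one seen after restoring a checkpoint.
enqueue_time = 2**31 + 5
current_cycle = enqueue_time + 1

truncated = store_in_int(enqueue_time)

print(truncated)                                       # wrapped, negative value
print(waited_one_cycle(enqueue_time, current_cycle))   # full-width value
print(waited_one_cycle(truncated, current_cycle))      # truncated value
```

With the full-width value the condition holds; with the truncated value it does not, which is the kind of unmet condition that can stall packets.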
Stephan Diestelhorst
4e9d91016a mem: De-virtualise interfaces in the CoherentBus
The CoherentBus eventually got virtual methods for its interface. The
"virtuality" of the CoherentBus, however, comes already from the virtual
interface of the bus' ports. There is no need to add another layer of virtual
functions, here.
2013-10-17 10:20:45 -05:00
Matt Horsnell
6decd70bfb cpu: add consistent guarding to *_impl.hh files. 2013-10-17 10:20:45 -05:00
Sascha Bischoff
52f90890a3 mem: Add PortID to QueuedMasterPort constructor
This patch adds the PortID to the QueuedMasterPort. This allows a PortID to be
specified, as it was previously set to the default value of -1.
2013-10-17 10:20:45 -05:00
Ali Saidi
60ce2b34fe mem: Make MemoryAccess flag more verbose
This patch extends the MemoryAccess debug flag to report who sent the
requests and the cacheability.
2013-10-17 10:20:45 -05:00
Steve Reinhardt
b10ff075b1 ruby: eliminate non-determinism from ruby.stats output
Get rid of non-deterministic "stats" in ruby.stats output
such as time & date of run, elapsed & CPU time used,
and memory usage.  These values cause spurious
miscomparisons when looking at output diffs (though
they don't affect regressions, since the regressions
pass/fail status currently ignores ruby.stats entirely).

Most of this information is already captured in other
places (time & date in stdout, elapsed time & mem usage
in stats.txt), where the regression script is smart
enough to filter it out.  It seems easier to get rid of
the redundant output rather than teaching the
regression tester to ignore the same information in
two different places.
2013-10-15 18:22:49 -04:00
Andreas Sandberg
4f5775df64 mem: Rename the ASI_BITS flag field in Request
ASI_BITS in the Request object were originally used to store a memory
request's ASI on SPARC. This is not the case any more since other ISAs
use the ASI bits to store architecture-dependent information. This
changeset renames the ASI_BITS to ARCH_BITS which better describes
their use. Additionally, the getAsi() accessor is renamed to
getArchFlags().
2013-10-15 13:26:34 +02:00
Andreas Sandberg
5e7738467b mem: Use a flag instead of address bit 63 for generic IPRs
Using address bit 63 to identify generic IPRs caused problems on
SPARC, where IPRs are heavily used. This changeset redefines how
generic IPRs are identified. Instead of using bit 63, we now use a
separate flag (GENERIC_IPR) in a memory request.
2013-10-15 13:24:35 +02:00
Andreas Hansson
9aa939891f mem: Fix scheduling bug in SimpleMemory
This patch ensures that a dequeue event is not scheduled if the memory
controller is waiting for a retry already. Without this check it is
possible for the controller to attempt sending something whilst
already having one packet that is in retry, thus causing the bus to
have an assertion failure.
2013-09-18 08:46:33 -04:00
Joel Hestness
cc155ffa0d ruby: Fix Topology throttle connections
The Topology source sets up input and output buffers for each of the external
nodes of a topology by indexing on Ruby's generated controller unique IDs.
These unique IDs are found by adding the MachineType_base_number to the version
number of each controller (see any generated *_Controller.cc - init() calls
getToNetQueue and getFromNetQueue using m_version + base). However, the
Topology object used the cntrl_id - which is required to be unique across all
controllers - to index the controllers list as they are being connected to
their input and output buffers. If the cntrl_ids did not match the Ruby unique
ID, the throttles end up connected to incorrectly indexed nodes in the network,
resulting in packets traversing incorrect network paths. This patch fixes the
Topology indexing scheme by using the Ruby unique ID to match that of the
SimpleNetwork buffer vectors.
2013-09-11 15:35:18 -05:00
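The indexing mismatch described in cc155ffa0d can be sketched as follows. The base numbers, machine types and controller list are hypothetical values chosen for illustration; only the rule "unique ID = machine type base number + version" comes from the commit message.

```python
# Hypothetical base numbers: two L1 caches come first, then directories.
MACHINE_BASE = {"L1Cache": 0, "Directory": 2}


def ruby_unique_id(machine_type, version):
    """Ruby unique ID: machine type base number plus controller version."""
    return MACHINE_BASE[machine_type] + version


# (cntrl_id, machine_type, version); cntrl_ids are unique but were assigned
# in creation order, which need not match the Ruby unique ID order.
controllers = [
    (0, "L1Cache", 0),
    (1, "Directory", 0),
    (2, "L1Cache", 1),
]

# Controllers whose cntrl_id would index a different buffer than their
# Ruby unique ID does.
mismatches = [
    (cntrl_id, ruby_unique_id(mtype, version))
    for cntrl_id, mtype, version in controllers
    if cntrl_id != ruby_unique_id(mtype, version)
]

print(mismatches)
```

Any controller that appears in `mismatches` would have been connected to the wrong buffers when indexed by cntrl_id.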
Joel Hestness
c1cf55c738 ruby: Statically allocate stats in SimpleNetwork, Switch, Throttle
The previous changeset (9863:9483739f83ee) used STL vector containers to
dynamically allocate stats in the Ruby SimpleNetwork, Switch and Throttle. For
gcc versions before at least 4.6.3, this causes the standard vector allocator
to call Stats copy constructors (a no-no, since stats should be allocated in
the body of each SimObject instance). Since the size of these stats arrays is
known at compile time (NOTE: after code generation), this patch changes their
allocation to be static rather than using an STL vector.
2013-09-11 15:33:27 -05:00
Nilay Vaish
90bfbd9793 ruby: network: convert to gem5 style stats 2013-09-06 16:21:35 -05:00
Nilay Vaish
24dc914d87 ruby: profiler: removes function resourceUsage() 2013-09-06 16:21:32 -05:00
Nilay Vaish
79b5ea9d19 ruby: remove undefined message size type
This message size type does not work well with one of the statistical
variables. It also seems unnecessary.
2013-09-06 16:21:30 -05:00
Nilay Vaish
0280997fbf ruby: network: removes reset functionality 2013-09-06 16:21:30 -05:00
Nilay Vaish
e7bd70e079 ruby: network: shorten variable names 2013-09-06 16:21:29 -05:00
Nilay Vaish
c0a8ad0a35 ruby: converts sparse memory stats to gem5 style 2013-09-06 16:21:28 -05:00
Andreas Hansson
19a5b68db7 arch: Resurrect the NOISA build target and rename it NULL
This patch makes it possible to once again build gem5 without any
ISA. The main purpose is to enable work around the interconnect and
memory system without having to build any CPU models or device models.

The regress script is updated to include the NULL ISA target. Currently
no regressions make use of it, but all the testers could (and perhaps
should) transition to it.

--HG--
rename : build_opts/NOISA => build_opts/NULL
rename : src/arch/noisa/SConsopts => src/arch/null/SConsopts
rename : src/arch/noisa/cpu_dummy.hh => src/arch/null/cpu_dummy.hh
rename : src/cpu/intr_control.cc => src/cpu/intr_control_noisa.cc
2013-09-04 13:22:57 -04:00
Andreas Hansson
b63631536d stats: Cumulative stats update
This patch updates the stats to reflect the: 1) addition of the
internal queue in SimpleMemory, 2) moving of the memory class outside
FSConfig, 3) fixing up of the 2D vector printing format, 4) specifying
burst size and interface width for the DRAM instead of relying on
cache-line size, 5) performing merging in the DRAM controller write
buffer, and 6) fixing how idle cycles are counted in the atomic and
timing CPU models.

The main reason for bundling them up is to minimise the changeset
size.
2013-08-19 03:52:36 -04:00
Andreas Hansson
c26911013c config: Command line support for multi-channel memory
This patch adds support for specifying multi-channel memory
configurations on the command line, e.g. 'se/fs.py
--mem-type=ddr3_1600_x64 --mem-channels=4'. To enable this, it
enhances the functionality of MemConfig and moves the existing
makeMultiChannel class method from SimpleDRAM to the support scripts.

The se/fs.py example scripts are updated to make use of the new
feature.
2013-08-19 03:52:34 -04:00
Andreas Hansson
49d88f08b0 mem: Change AbstractMemory defaults to match the common case
This patch changes the default parameter value of conf_table_reported
to match the common case. It also simplifies the regression and config
scripts to reflect this change.
2013-08-19 03:52:33 -04:00
Andreas Hansson
6279eaf1f7 mem: Use STL deque in favour of list for DRAM queues
This patch changes the data structure used for the DRAM read, write
and response queues from an STL list to deque. This optimisation is
based on the observation that the size is small (and fixed), and that
the structures are frequently iterated over in a linear fashion.
2013-08-19 03:52:32 -04:00
Andreas Hansson
ac42db8134 mem: Perform write merging in the DRAM write queue
This patch implements basic write merging in the DRAM to avoid
redundant bursts. When a new access is added to the queue it is
compared against the existing entries, and if it is either
intersecting or immediately succeeding/preceding an existing item it
is merged.

There is currently no attempt made at avoiding iterating over the
existing items in determining whether merging is possible or not.
2013-08-19 03:52:31 -04:00
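The merging rule in ac42db8134 can be sketched as below. This is a simplified model under stated assumptions: queue entries are plain `[start, end)` byte ranges, the function name `try_merge` is hypothetical, and burst boundaries are ignored. As in the commit message, the queue is scanned linearly.

```python
def try_merge(queue, addr, size):
    """Merge a write into the first entry it overlaps or abuts.

    queue holds [start, end) ranges as two-element lists. Returns True if
    the write was merged, False if it was appended as a new entry.
    """
    new_start, new_end = addr, addr + size
    for entry in queue:
        start, end = entry
        # Overlapping, or immediately before/after the existing entry.
        if new_start <= end and new_end >= start:
            entry[0] = min(start, new_start)
            entry[1] = max(end, new_end)
            return True
    queue.append([new_start, new_end])
    return False


write_queue = []
try_merge(write_queue, 0, 16)    # new entry
try_merge(write_queue, 16, 16)   # immediately succeeds the first entry
try_merge(write_queue, 8, 8)     # lies inside the merged entry
try_merge(write_queue, 64, 16)   # unrelated, new entry
print(write_queue)
```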
Amin Farmahini
243f135e5f mem: Replacing bytesPerCacheLine with DRAM burstLength in SimpleDRAM
This patch gets rid of bytesPerCacheLine parameter and makes the DRAM
configuration separate from cache line size. Instead of
bytesPerCacheLine, we define a parameter for the DRAM called
burst_length. The burst_length parameter shows the length of a DRAM
device burst in bits. Also, lines_per_rowbuffer is replaced with
device_rowbuffer_size to improve code portability.

This patch adds a burst length in beats for each memory type, an
interface width for each memory type, and the memory controller model
is extended to reason about "system" packets vs "dram" packets and
assemble the responses properly. It means that system packets larger
than a full burst are split into multiple dram packets.
2013-08-19 03:52:30 -04:00
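The splitting of system packets into DRAM packets described in 243f135e5f can be sketched as follows. The sketch assumes the burst size in bytes is the burst length in beats times the interface width in bits, divided by 8, and that chunks are cut at burst-aligned addresses; the function name and default values are illustrative, not taken from the patch.

```python
def split_into_bursts(addr, size, burst_length=8, interface_width_bits=64):
    """Split a system packet into (address, size) chunks, one per DRAM burst."""
    burst_bytes = burst_length * interface_width_bits // 8
    chunks = []
    end = addr + size
    while addr < end:
        # Next burst-aligned address strictly above addr.
        next_boundary = (addr // burst_bytes + 1) * burst_bytes
        chunk_end = min(end, next_boundary)
        chunks.append((addr, chunk_end - addr))
        addr = chunk_end
    return chunks


print(split_into_bursts(0, 128))   # larger than one burst
print(split_into_bursts(32, 64))   # straddles a burst boundary
print(split_into_bursts(0, 32))    # fits in one burst
```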
Andreas Hansson
d5593f3c75 mem: Warn instead of panic for tXAW violation
Until the performance bug is fixed, avoid killing simulations.
2013-08-19 03:52:26 -04:00
Andreas Hansson
7bc3eaec7a mem: Allow disabling of tXAW through a 0 activation limit
This patch fixes an issue where an activation limit of 0 was not
allowed. With this patch, setting the limit to 0 simply disables the
tXAW constraint.
2013-08-19 03:52:26 -04:00
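The behaviour described in 7bc3eaec7a can be sketched as below. It models tXAW as "at most `activation_limit` activates within any window of `t_xaw` ticks"; the function name and the list of past activation times are assumptions made for illustration.

```python
def earliest_activate(act_times, now, t_xaw, activation_limit):
    """Return the earliest tick at which a new activate may be issued.

    act_times is an ascending list of past activation ticks.
    """
    if activation_limit == 0:
        return now  # a limit of 0 disables the tXAW constraint
    if len(act_times) < activation_limit:
        return now  # window not yet full
    # The new activate must fall outside the window opened by the
    # activation_limit-th most recent activate.
    oldest_in_window = act_times[-activation_limit]
    return max(now, oldest_in_window + t_xaw)


history = [0, 10, 20, 30]
print(earliest_activate(history, now=35, t_xaw=50, activation_limit=4))
print(earliest_activate(history, now=35, t_xaw=50, activation_limit=0))
```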
Andreas Hansson
2a675aecb9 mem: Add an internal packet queue in SimpleMemory
This patch adds a packet queue in SimpleMemory to avoid using the
packet queue in the port (and thus have no involvement in the flow
control). The port queue was bound to 100 packets, and as the
SimpleMemory is modelling both a controller and an actual RAM, it
potentially has a large number of packets in flight. There is
currently no limit on the number of packets in the memory controller,
but this could easily be added in a follow-on patch.

As a result of the added internal storage, the functional access and
draining is updated. Some minor cleaning up and renaming has also been
done.

The memtest regression changes as a result of this patch and the stats
will be updated.
2013-08-19 03:52:25 -04:00
Nilay Vaish
95381f8a99 ruby: slicc: remove double trigger, continueProcessing
These constructs are not in use and are not being maintained by anyone.
In addition, it is not known if doubleTrigger works correctly with Ruby now.
2013-08-07 14:51:18 -05:00
Nilay Vaish
f1b17bf157 ruby: slicc: move some code to AbstractController
Some of the code in StateMachine.py file is added to all the controllers and
is independent of the controller definition. This code is being moved to the
AbstractController class which is the parent class of all controllers.
2013-08-07 14:51:18 -05:00
Andreas Hansson
d4273cc9a6 mem: Set the cache line size on a system level
This patch removes the notion of a peer block size and instead sets
the cache line size on the system level.

Previously the size was set per cache, and communicated through the
interconnect. There were plenty of checks to ensure that everyone had the
same size specified, and these checks are now removed. Another benefit
that is not yet harnessed is that the cache line size is now known at
construction time, rather than after the port binding. Hence, the
block size can be locally stored and does not have to be queried every
time it is used.

A follow-on patch updates the configuration scripts accordingly.
2013-07-18 08:31:16 -04:00
Xiangyu Dong
4e8ecd7c6f mem: Add cache class destructor to avoid memory leaks
Make valgrind a little bit happier
2013-07-18 08:29:47 -04:00
Brad Beckmann
8e54c93222 ruby: removed the very old double trigger hack
Committed by: Nilay Vaish <nilay@cs.wisc.edu>
2013-07-11 13:56:05 -05:00
Nilay Vaish
1be0098c0b ruby: append transition comment only when in opt/debug 2013-06-28 21:42:27 -05:00
Nilay Vaish
b3980cdb9a ruby: network: remove reconfiguration code
This code seems not to be of any use now. There is no path in the simulator
that allows for reconfiguring the network. A better approach would be to
take a checkpoint and start the simulation from the checkpoint with the new
configuration.
2013-06-28 21:36:37 -05:00
Prakash Ramrakhyani
ac515d7a9b mem: Reorganize cache tags and make them a SimObject
This patch reorganizes the cache tags to allow more flexibility to
implement new replacement policies. The base tags class is now a
clocked object so that derived classes can use a clock if they need
one. Also, deriving from SimObject allows specialized Tag
classes to be swapped in/out in .py files.

The cache set is now templatized to allow it to contain customized
cache blocks with additional information. This involved moving code to
the .hh file and removing cacheset.cc.

The statistics belonging to the cache tags now include ".tags"
in their name. Hence, the stats need an update to reflect the change
in naming.
2013-06-27 05:49:50 -04:00
Andreas Hansson
0d68d36b9d mem: Remove the cache builder
This patch removes the redundant cache builder class.
2013-06-27 05:49:50 -04:00
Akash Bagdia
7d7ab73862 sim: Add the notion of clock domains to all ClockedObjects
This patch adds the notion of source- and derived-clock domains to the
ClockedObjects. As such, all clock information is moved to the clock
domain, and the ClockedObjects are grouped into domains.

The clock domains are either source domains, with a specific clock
period, or derived domains that have a parent domain and a divider
(potentially chained). For a piece of logic that runs at a derived clock
(a ratio of the clock its parent is running at) the necessary derived
clock domain is created from its corresponding parent clock
domain. For now, the derived clock domain only supports a divider,
thus ensuring a lower speed compared to its parent. Multiplier
functionality implies a PLL logic that has not been modelled yet
(create a separate clock instead).

The clock domains should be used as a mechanism to provide a
controllable clock source that affects clock for every clocked object
lying beneath it. The clock of the domain can (in a future patch) be
controlled by a handler responsible for dynamic frequency scaling of
the respective clock domains.

All the config scripts have been retro-fitted with clock domains. For
the System a default SrcClockDomain is created. For CPUs that run at a
different speed than the system, there is a separate clock domain
created. This domain incorporates the CPU and the associated
caches. As before, Ruby runs under its own clock domain.

The clock periods of all domains are pre-computed, such that no virtual
functions or multiplications are needed when calling
clockPeriod. Instead, the clock period is pre-computed when any
changes occur. For this to be possible, each clock domain tracks its
children.
2013-06-27 05:49:49 -04:00
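The pre-computation scheme described in 7d7ab73862 can be sketched as follows. The two classes are simplified stand-ins for source and derived clock domains, with hypothetical method names; they only show how tracking children lets a period change propagate so that reading the period needs no multiplication.

```python
class SrcClockDomain:
    """Source domain with an explicitly set clock period (in ticks)."""

    def __init__(self, period):
        self.children = []
        self._period = period

    def clock_period(self):
        return self._period

    def set_period(self, period):
        self._period = period
        for child in self.children:
            child.update()  # re-compute derived periods once, on change


class DerivedClockDomain:
    """Derived domain: parent period multiplied by an integer divider."""

    def __init__(self, parent, divider=1):
        self.parent = parent
        self.divider = divider
        self.children = []
        parent.children.append(self)
        self.update()

    def update(self):
        self._period = self.parent.clock_period() * self.divider
        for child in self.children:
            child.update()

    def clock_period(self):
        return self._period  # stored value, no computation on read


system_clk = SrcClockDomain(1000)
half = DerivedClockDomain(system_clk, divider=2)
quarter = DerivedClockDomain(half, divider=2)  # chained derived domains
print(half.clock_period(), quarter.clock_period())

system_clk.set_period(500)
print(half.clock_period(), quarter.clock_period())
```

A divider can only lengthen the period, which matches the statement that a derived domain runs no faster than its parent.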
Akash Bagdia
7eccb1b779 config: Remove redundant explicit setting of default clocks
This patch removes the explicit setting of the clock period for
certain instances of CoherentBus, NonCoherentBus and IOCache where the
specified clock is same as the default value of the system clock. As
all the values used are the defaults, there are no performance
changes. There are similar cases where the toL2Bus is set to use the
parent CPU clock which is already the default behaviour.

The main motivation for these simplifications is to ease the
introduction of clock domains.
2013-06-27 05:49:49 -04:00
Andreas Hansson
3b92748937 mem: Tidy up the bridge with const and additional checks
This patch does a bit of tidying up in the bridge code, adding const
where appropriate and also removing redundant checks and adding a few
new ones.

There are no changes to the behaviour of any regressions.
2013-06-27 05:49:49 -04:00
Andreas Hansson
f25ea3fd56 mem: Fix CommMonitor style and response check
This patch fixes the CommMonitor local variable names, and also
introduces a variable to capture if it expects to see a response. The
latter check considers both needsResponse and memInhibitAsserted.
2013-06-27 05:49:49 -04:00
Andreas Hansson
33a8d777ad mem: Align cache timing to clock edges
This patch changes the cache timing calculations such that the results
are aligned to clock edges.

Plenty of stats change as a result of this patch.
2013-06-27 05:49:49 -04:00
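Aligning a time to a clock edge, as 33a8d777ad describes, amounts to rounding a tick count up to the next multiple of the clock period. A minimal sketch, assuming edges occur at multiples of the period and using a hypothetical helper name:

```python
def next_clock_edge(tick, period):
    """Round tick up to the nearest multiple of period (ceiling division)."""
    return -(-tick // period) * period


print(next_clock_edge(1000, 500))  # already on an edge
print(next_clock_edge(1001, 500))  # just past an edge
```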
Andreas Hansson
368f50a0a1 mem: Cycles converted to Ticks in atomic cache accesses
This patch fixes an outstanding issue in the cache timing calculations
where an atomic access returned a time in Cycles, but the port
forwarded it on as if it was in Ticks.

A separate patch will update the regression stats.
2013-06-27 05:49:49 -04:00
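The unit mismatch fixed in 368f50a0a1 can be illustrated with a small sketch. It assumes that one cycle lasts `clock_period` ticks; the function name and the numbers are illustrative.

```python
def cycles_to_ticks(cycles, clock_period):
    """Convert a latency in cycles to ticks for a given clock period in ticks."""
    return cycles * clock_period


latency_cycles = 2
clock_period = 500  # ticks per cycle, an assumed value

# Forwarding the cycle count as if it were ticks under-reports the latency.
wrong_ticks = latency_cycles
right_ticks = cycles_to_ticks(latency_cycles, clock_period)
print(wrong_ticks, right_ticks)
```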
Andreas Hansson
f330b3c28d mem: Remove a redundant heap allocation for a snoop packet
This patch changes the upwards snoop packet to avoid allocating and
later deleting it. As the code executes in 0 time and the lifetime of
the packet does not extend beyond the block there is no reason to heap
allocate it.
2013-06-27 05:49:49 -04:00
Andreas Hansson
9a1169f3d7 mem: Remove CoherentBus snoop port unused private member
This patch removes an unused member to avoid getting compiler warnings
when using clang.
2013-06-27 05:49:49 -04:00
Nilay Vaish
d8ed1d1a2c ruby: moesi cmp directory: separate actions for external hits
This patch adds separate actions for requests that missed in the local cache
and messages were sent out to get the requested line. These separate actions
are required for differentiating between the hit and miss latencies in the
statistics collected.
2013-06-25 00:32:04 -05:00
Nilay Vaish
128ab50c47 ruby: mesi cmp directory: separate actions for external hits
This patch adds separate actions for requests that missed in the local cache
and messages were sent out to get the requested line. These separate actions
are required for differentiating between the hit and miss latencies in the
statistics collected.
2013-06-25 00:32:03 -05:00
Nilay Vaish
beb6e57c6f ruby: profiler: lots of inter-related changes
The patch started off by removing the global variables from the profiler for
profiling the miss latency of requests made to the cache. The corresponding
histograms have been moved to the Sequencer. These are combined together when
the histograms are printed. Separate histograms are now maintained for
tracking latency of all requests together, of hits only and of misses only.

A particular set of histograms used to use the type GenericMachineType defined
in one of the protocol files. This patch removes this type. Now, everything
that relied on this type would use MachineType instead. To do this, SLICC has
been changed so that multiple machine types can be declared by a controller
in its preamble.
2013-06-25 00:32:03 -05:00
Nilay Vaish
b3db882dee ruby: remove the three files related to profiling
This patch removes the following three files: RubySlicc_Profiler.sm,
RubySlicc_Profiler_interface.cc and RubySlicc_Profiler_interface.hh.
Only one function prototyped in the file RubySlicc_Profiler.sm is in use. The
rest of the code appearing in any of these files is not in use. Therefore,
these files are being removed.

That one single function, profileMsgDelay(), is being moved to the protocol
files where it is in use. If we need any of these deleted functions, I think
the right way to make them visible is to have the AbstractController class in
a .sm and let the controller state machine inherit from this class. The
AbstractController class can then have the prototypes of these profiling
functions in its definition.
2013-06-24 08:59:08 -05:00
Joel Hestness
71c6c43110 ruby: MessageBuffer: Remove unused m_size variable
The m_size variable attempted to track m_prio_heap.size(), but it did so
incorrectly due to the functions reanalyzeMessages and reanalyzeAllMessages().
Since this variable is intended to track m_prio_heap.size(), we can simply
replace instances where m_size is referenced with m_prio_heap.size(), which
has the added bonus of removing the need for m_size.

Note: This patch also removes an extraneous DPRINTF format string designator
from reanalyzeAllMessages()

Committed by: Nilay Vaish <nilay@cs.wisc.edu>
2013-06-24 06:57:06 -05:00
Lena Olson
94280c7e51 ruby: fix typo in MOESI_CMP_token protocol 2013-06-20 16:20:38 -05:00