Commit graph

1519 commits

Author SHA1 Message Date
Nilay Vaish
67cd04b6fe ruby: make the max_size variable of the MessageBuffer unsigned 2014-03-01 23:59:57 -06:00
Nilay Vaish
a533f3f983 ruby: profiler: statically allocate stats variable
Couple of users observed segmentation fault when the simulator tries to
register the statistical variable m_IncompleteTimes.  It seems that there
is some problem with the initialization of these variables when allocated
in the constructor.
2014-03-01 23:35:21 -06:00
Nilay Vaish
7e27860ef4 ruby: route all packets through ruby port
Currently, the interrupt controller in x86 is connected to the io bus
directly.  Therefore the packets between the io devices and the interrupt
controller do not go through ruby.  This patch changes ruby port so that
these packets arrive at the ruby port first, which then routes them to their
destination.  Note that the patch does not make these packets go through the
ruby network.  That would happen in a subsequent patch.
2014-02-23 19:16:16 -06:00
Andreas Hansson
5755fff998 ruby: Simplify RubyPort flow control and routing
This patch simplfies the retry logic in the RubyPort, avoiding
redundant attributes, and enforcing more stringent checks on the
interactions with the normal ports. The patch also simplifies the
routing done by the RubyPort, using the port identifiers instead of a
heavy-weight sender state.

The patch also fixes a bug in the sending of responses from PIO
ports. Previously these responses bypassed the queue in the queued
port, and ignored the return value, potentially leading to response
packets being lost.

Committed by: Nilay Vaish <nilay@cs.wisc.edu>
2014-02-23 19:16:16 -06:00
Nilay Vaish
7572ab71b5 ruby: message buffer: refactor code
Code in two of the functions was exactly the same.  This patch moves
this code to a new function which is called from the two functions
mentioned initially.
2014-02-23 19:16:15 -06:00
Nilay Vaish
cde20fd476 ruby: remove few not required #includes 2014-02-23 19:16:15 -06:00
Nilay Vaish
82378f7301 ruby: slicc: remove unused COPY_HEAD functionality 2014-02-23 19:16:15 -06:00
Nilay Vaish
13ad07601b ruby: protocols: remove unused action z_stall 2014-02-23 19:16:15 -06:00
Nilay Vaish
cd33f9bc42 ruby: network: move message buffers to base network class. 2014-02-21 08:02:05 -06:00
Nilay Vaish
bd8f954526 ruby: network: garnet: fixed: removes net_ptr from links 2014-02-21 08:02:04 -06:00
Nilay Vaish
307f53e164 ruby: cache: remove not required variable m_cache_name 2014-02-21 08:02:02 -06:00
Nilay Vaish
f8f8b7e5c2 ruby: network: garnet: fixed: removes next cycle functions
At several places, there are functions that take a cycle value as input
and performs some computation.  Along with each such function, another
function was being defined that simply added one more cycle to input and
computed the same function.  This patch removes this second copy of the
function.  Places where these functions were being called have been updated
to use the original function with argument being current cycle + 1.
2014-02-20 17:28:01 -06:00
Nilay Vaish
896654746a ruby: controller: slight code refactoring 2014-02-20 17:27:45 -06:00
Nilay Vaish
0ce8c25919 ruby: mesi three level: rename incorrectly named files
Two files had been incorrectly named with a .cache suffix.

--HG--
rename : src/mem/protocol/MESI_Three_Level-L0.cache => src/mem/protocol/MESI_Three_Level-L0cache.sm
rename : src/mem/protocol/MESI_Three_Level-L1.cache => src/mem/protocol/MESI_Three_Level-L1cache.sm
2014-02-20 17:27:17 -06:00
Nilay Vaish
db5b3d37fe ruby: network: removes unused code. 2014-02-20 17:27:07 -06:00
Nilay Vaish
dd5c72e5a7 ruby: slicc: slight code refactoring 2014-02-20 17:26:49 -06:00
Nilay Vaish
b312a41f21 ruby: message buffer: removes some unecessary functions. 2014-02-20 17:26:41 -06:00
Andreas Hansson
4b81585c49 mem: Fix bug in PhysicalMemory use of mmap and munmap
This patch fixes a bug in how physical memory used to be mapped and
unmapped. Previously we unmapped and re-mapped if restoring from a
checkpoint. However, we never checked that the new mapping was
actually the same, it was just magically working as the OS seems to
fairly reliably give us the same chunk back. This patch fixes this
issue by relying entirely on the mmap call in the constructor.
2014-02-18 05:51:01 -05:00
Andreas Hansson
969b436243 mem: Filter cache snoops based on address ranges
This patch adds a filter to the cache to drop snoop requests that are
not for a range covered by the cache. This fixes an issue observed
when multiple caches are placed in parallel, covering different
address ranges. Without this patch, all the caches will forward the
snoop upwards, when only one should do so.
2014-02-18 05:50:58 -05:00
Andreas Hansson
bf2f178f85 mem: Add a wrapped DRAMSim2 memory controller
This patch adds DRAMSim2 as a memory controller by wrapping the
external library and creating a sublass of AbstractMemory that bridges
between the semantics of gem5 and the DRAMSim2 interface.

The DRAMSim2 wrapper extracts the clock period from the config
file. There is no way of extracting this information from DRAMSim2
itself, so we simply read the same config file and get it from there.

To properly model the response queue, the wrapper keeps track of how
many transactions are in the actual controller, and how many are
stacking up waiting to be sent back as responses (in the wrapper). The
latter requires us to move away from the queued port and manage the
packets ourselves. This is due to DRAMSim2 not having any flow control
on the response path.

DRAMSim2 assumes that the transactions it is given are matching the
burst size of the choosen memory. The wrapper checks to ensure the
cache line size of the system matches the burst size of DRAMSim2 as
there are currently no provisions to split the system requests. In
theory we could allow a cache line size smaller than the burst size,
but that would lead to inefficient use of the DRAM, so for not we
fatal also in this case.
2014-02-18 05:50:53 -05:00
Andreas Hansson
c9cb492e1c mem: Fix input to DPRINTF in CommMonitor
Minor fix of the debug message parameters.
2014-02-18 05:50:51 -05:00
Nilay Vaish
bb0e9119e7 ruby: memory controller: use MemoryNode * 2014-02-06 16:30:12 -06:00
Mitch Hayenga
96317d466e mem: Add additional tolerance to stride prefetcher
Forces the prefetcher to mispredict twice in a row before resetting the
confidence of prefetching.  This helps cases where a load PC strides by a
constant factor, however it may operate on different arrays at times.
Avoids the cost of retraining.  Primarily helps with small iteration loops.

Committed by: Nilay Vaish <nilay@cs.wisc.edu>
2014-01-29 23:21:26 -06:00
Mitch Hayenga
771c864bf4 mem: Allowed tagged instruction prefetching in stride prefetcher
For systems with a tightly coupled L2, a stride-based prefetcher may observe
access requests from both instruction and data L1 caches.  However, the PC
address of an instruction miss gives no relevant training information to the
stride based prefetcher(there is no stride to train).  In theses cases, its
better if the L2 stride prefetcher simply reverted back to a simple N-block
ahead prefetcher.  This patch enables this option.

Committed by: Nilay Vaish <nilay@cs.wisc.edu>
2014-01-29 23:21:26 -06:00
Mitch Hayenga ext:(%2C%20Amin%20Farmahini%20%3Caminfar%40gmail.com%3E)
95735e10e7 mem: prefetcher: add options, support for unaligned addresses
This patch extends the classic prefetcher to work on non-block aligned
addresses.  Because the existing prefetchers in gem5 mask off the lower
address bits of cache accesses, many predictable strides fail to be
detected.  For example, if a load were to stride by 48 bytes, with 64 byte
cachelines, the current stride based prefetcher would see an access pattern
of 0, 64, 64, 128, 192.... Thus not detecting a constant stride pattern.  This
patch fixes this, by training the prefetcher on access and not masking off the
lower address bits.

It also adds the following configuration options:
1) Training/prefetching only on cache misses,
2) Training/prefetching only on data acceses,
3) Optionally tagging prefetches with a PC address.
#3 allows prefetchers to train off of prefetch requests in systems with
multiple cache levels and PC-based prefetchers present at multiple levels.
It also effectively allows a pipelining of prefetch requests (like in POWER4)
across multiple levels of cache hierarchy.

Improves performance on my gem5 configuration by 4.3% for SPECINT and 4.7%  for SPECFP (geomean).
2014-01-29 23:21:25 -06:00
Amin Farmahini
ffbdaa7cce mem: Remove redundant findVictim() input argument
The patch
(1) removes the redundant writeback argument from findVictim()
(2) fixes the description of access() function

Committed by: Nilay Vaish <nilay@cs.wisc.edu>
2014-01-28 18:00:50 -06:00
Amin Farmahini
575a73f4a1 mem: Fixes a bug in simple_dram write merging
Fixes updating the value of size in the write merge function.

Committed by: Nilay Vaish <nilay@cs.wisc.edu>
2014-01-28 18:00:49 -06:00
Ali Saidi
90b1775a8f cpu: Add support for instructions that zero cache lines. 2014-01-24 15:29:30 -06:00
Giacomo Gabrielli
d3444c6603 mem: Add flag to request if it was generated by a page table walk 2014-01-24 15:29:30 -06:00
Giacomo Gabrielli
aefe9cc624 mem: Add support for a security bit in the memory system
This patch adds the basic building blocks required to support e.g. ARM
TrustZone by discerning secure and non-secure memory accesses.
2014-01-24 15:29:30 -06:00
Timothy M. Jones
427ceb57a9 Cache: Collect very basic stats on tag and data accesses
Adds very basic statistics on the number of tag and data accesses within the
cache, which is important for power modelling.  For the tags, simply count
the associativity of the cache each time.  For the data, this depends on
whether tags and data are accessed sequentially, which is given by a new
parameter.  In the parallel case, all data blocks are accessed each time, but
with sequential accesses, a single data block is accessed only on a hit.
2014-01-24 15:29:30 -06:00
Dam Sunwoo
85e8779de7 mem: per-thread cache occupancy and per-block ages
This patch enables tracking of cache occupancy per thread along with
ages (in buckets) per cache blocks.  Cache occupancy stats are
recalculated on each stat dump.
2014-01-24 15:29:30 -06:00
Matt Horsnell
ca89eba79e mem: track per-request latencies and access depths in the cache hierarchy
Add some values and methods to the request object to track the translation
and access latency for a request and which level of the cache hierarchy responded
to the request.
2014-01-24 15:29:30 -06:00
Nilay Vaish
37433d91a3 ruby: remove unused label no_vector 2014-01-17 11:02:15 -06:00
Nilay Vaish
407f37e15f ruby: move all statistics to stats.txt, eliminate ruby.stats 2014-01-10 16:19:47 -06:00
Nilay Vaish
0387281e2a ruby: fix bug introduced to revision 8523754f8885 2014-01-09 10:45:50 -06:00
Nilay Vaish
8559081648 ruby: slicc: remove variable 'addr' used in calls to doTransition
This variable causes trouble if a variable of same name is declared in a
protocol file. Hence it is being eliminated.
2014-01-08 04:26:25 -06:00
Nilay Vaish
4070b00875 ruby: add a three level MESI protocol.
The first two levels (L0, L1) are private to the core, the third level (L2)is
possibly shared. The protocol supports clustered designs.  For example, one
can have two sets of two cores. Each core has an L0 and L1 cache. There are
two L2 controllers where each set accesses only one of the L2 controllers.
2014-01-04 00:03:34 -06:00
Nilay Vaish
bb6d7d402b ruby: rename MESI_CMP_directory to MESI_Two_Level
This is because the next patch introduces a three level hierarchy.

--HG--
rename : build_opts/ALPHA_MESI_CMP_directory => build_opts/ALPHA_MESI_Two_Level
rename : build_opts/X86_MESI_CMP_directory => build_opts/X86_MESI_Two_Level
rename : configs/ruby/MESI_CMP_directory.py => configs/ruby/MESI_Two_Level.py
rename : src/mem/protocol/MESI_CMP_directory-L1cache.sm => src/mem/protocol/MESI_Two_Level-L1cache.sm
rename : src/mem/protocol/MESI_CMP_directory-L2cache.sm => src/mem/protocol/MESI_Two_Level-L2cache.sm
rename : src/mem/protocol/MESI_CMP_directory-dir.sm => src/mem/protocol/MESI_Two_Level-dir.sm
rename : src/mem/protocol/MESI_CMP_directory-dma.sm => src/mem/protocol/MESI_Two_Level-dma.sm
rename : src/mem/protocol/MESI_CMP_directory-msg.sm => src/mem/protocol/MESI_Two_Level-msg.sm
rename : src/mem/protocol/MESI_CMP_directory.slicc => src/mem/protocol/MESI_Two_Level.slicc
rename : tests/long/fs/10.linux-boot/ref/x86/linux/pc-simple-timing-ruby-MESI_CMP_directory/config.ini => tests/long/fs/10.linux-boot/ref/x86/linux/pc-simple-timing-ruby-MESI_Two_Level/config.ini
rename : tests/long/fs/10.linux-boot/ref/x86/linux/pc-simple-timing-ruby-MESI_CMP_directory/ruby.stats => tests/long/fs/10.linux-boot/ref/x86/linux/pc-simple-timing-ruby-MESI_Two_Level/ruby.stats
rename : tests/long/fs/10.linux-boot/ref/x86/linux/pc-simple-timing-ruby-MESI_CMP_directory/simerr => tests/long/fs/10.linux-boot/ref/x86/linux/pc-simple-timing-ruby-MESI_Two_Level/simerr
rename : tests/long/fs/10.linux-boot/ref/x86/linux/pc-simple-timing-ruby-MESI_CMP_directory/simout => tests/long/fs/10.linux-boot/ref/x86/linux/pc-simple-timing-ruby-MESI_Two_Level/simout
rename : tests/long/fs/10.linux-boot/ref/x86/linux/pc-simple-timing-ruby-MESI_CMP_directory/stats.txt => tests/long/fs/10.linux-boot/ref/x86/linux/pc-simple-timing-ruby-MESI_Two_Level/stats.txt
rename : tests/long/fs/10.linux-boot/ref/x86/linux/pc-simple-timing-ruby-MESI_CMP_directory/system.pc.com_1.terminal => tests/long/fs/10.linux-boot/ref/x86/linux/pc-simple-timing-ruby-MESI_Two_Level/system.pc.com_1.terminal
rename : tests/quick/se/00.hello/ref/alpha/linux/simple-timing-ruby-MESI_CMP_directory/config.ini => tests/quick/se/00.hello/ref/alpha/linux/simple-timing-ruby-MESI_Two_Level/config.ini
rename : tests/quick/se/00.hello/ref/alpha/linux/simple-timing-ruby-MESI_CMP_directory/ruby.stats => tests/quick/se/00.hello/ref/alpha/linux/simple-timing-ruby-MESI_Two_Level/ruby.stats
rename : tests/quick/se/00.hello/ref/alpha/linux/simple-timing-ruby-MESI_CMP_directory/simerr => tests/quick/se/00.hello/ref/alpha/linux/simple-timing-ruby-MESI_Two_Level/simerr
rename : tests/quick/se/00.hello/ref/alpha/linux/simple-timing-ruby-MESI_CMP_directory/simout => tests/quick/se/00.hello/ref/alpha/linux/simple-timing-ruby-MESI_Two_Level/simout
rename : tests/quick/se/00.hello/ref/alpha/linux/simple-timing-ruby-MESI_CMP_directory/stats.txt => tests/quick/se/00.hello/ref/alpha/linux/simple-timing-ruby-MESI_Two_Level/stats.txt
rename : tests/quick/se/00.hello/ref/alpha/tru64/simple-timing-ruby-MESI_CMP_directory/config.ini => tests/quick/se/00.hello/ref/alpha/tru64/simple-timing-ruby-MESI_Two_Level/config.ini
rename : tests/quick/se/00.hello/ref/alpha/tru64/simple-timing-ruby-MESI_CMP_directory/ruby.stats => tests/quick/se/00.hello/ref/alpha/tru64/simple-timing-ruby-MESI_Two_Level/ruby.stats
rename : tests/quick/se/00.hello/ref/alpha/tru64/simple-timing-ruby-MESI_CMP_directory/simerr => tests/quick/se/00.hello/ref/alpha/tru64/simple-timing-ruby-MESI_Two_Level/simerr
rename : tests/quick/se/00.hello/ref/alpha/tru64/simple-timing-ruby-MESI_CMP_directory/simout => tests/quick/se/00.hello/ref/alpha/tru64/simple-timing-ruby-MESI_Two_Level/simout
rename : tests/quick/se/00.hello/ref/alpha/tru64/simple-timing-ruby-MESI_CMP_directory/stats.txt => tests/quick/se/00.hello/ref/alpha/tru64/simple-timing-ruby-MESI_Two_Level/stats.txt
rename : tests/quick/se/50.memtest/ref/alpha/linux/memtest-ruby-MESI_CMP_directory/config.ini => tests/quick/se/50.memtest/ref/alpha/linux/memtest-ruby-MESI_Two_Level/config.ini
rename : tests/quick/se/50.memtest/ref/alpha/linux/memtest-ruby-MESI_CMP_directory/ruby.stats => tests/quick/se/50.memtest/ref/alpha/linux/memtest-ruby-MESI_Two_Level/ruby.stats
rename : tests/quick/se/50.memtest/ref/alpha/linux/memtest-ruby-MESI_CMP_directory/simerr => tests/quick/se/50.memtest/ref/alpha/linux/memtest-ruby-MESI_Two_Level/simerr
rename : tests/quick/se/50.memtest/ref/alpha/linux/memtest-ruby-MESI_CMP_directory/simout => tests/quick/se/50.memtest/ref/alpha/linux/memtest-ruby-MESI_Two_Level/simout
rename : tests/quick/se/50.memtest/ref/alpha/linux/memtest-ruby-MESI_CMP_directory/stats.txt => tests/quick/se/50.memtest/ref/alpha/linux/memtest-ruby-MESI_Two_Level/stats.txt
rename : tests/quick/se/60.rubytest/ref/alpha/linux/rubytest-ruby-MESI_CMP_directory/config.ini => tests/quick/se/60.rubytest/ref/alpha/linux/rubytest-ruby-MESI_Two_Level/config.ini
rename : tests/quick/se/60.rubytest/ref/alpha/linux/rubytest-ruby-MESI_CMP_directory/ruby.stats => tests/quick/se/60.rubytest/ref/alpha/linux/rubytest-ruby-MESI_Two_Level/ruby.stats
rename : tests/quick/se/60.rubytest/ref/alpha/linux/rubytest-ruby-MESI_CMP_directory/simerr => tests/quick/se/60.rubytest/ref/alpha/linux/rubytest-ruby-MESI_Two_Level/simerr
rename : tests/quick/se/60.rubytest/ref/alpha/linux/rubytest-ruby-MESI_CMP_directory/simout => tests/quick/se/60.rubytest/ref/alpha/linux/rubytest-ruby-MESI_Two_Level/simout
rename : tests/quick/se/60.rubytest/ref/alpha/linux/rubytest-ruby-MESI_CMP_directory/stats.txt => tests/quick/se/60.rubytest/ref/alpha/linux/rubytest-ruby-MESI_Two_Level/stats.txt
2014-01-04 00:03:33 -06:00
Nilay Vaish
5b1804e3bd ruby: add support for clusters
A cluster over here means a set of controllers that can be accessed only by a
certain set of cores.  For example,  consider a two level hierarchy. Assume
there are 4 L1 controllers (private) and 2 L2 controllers.  We can have two
different hierarchies here:

a. the address space is partitioned between the two L2 controllers.  Each L1
controller accesses both the L2 controllers.  In this case, each L1 controller
is a cluster initself.

b. both the L2 controllers can cache any address.  An L1 controller has access
to only one of the L2 controllers.  In this case, each L2 controller
along with the L1 controllers that access it, form a cluster.

This patch allows for each controller to have a cluster ID, which is 0 by
default.  By setting the cluster ID properly,  one can instantiate hierarchies
with clusters.  Note that the coherence protocol might have to be changed as
well.
2014-01-04 00:03:31 -06:00
Nilay Vaish
9853ef6651 ruby: some small changes 2014-01-04 00:03:30 -06:00
Nilay Vaish
d71311b1cf ruby: fix bugs in mesi cmp directory protocol
This patch fixes couple of bugs in the L2 controller of the mesi cmp
directory protocol.

1. The state MT_I was transitioning to NP on receiving a clean writeback
from the L1 controller.  This patch makes it inform the directory controller
about the writeback.

2. The L2 controller was sending the dirty bit to the L1 controller and the
L2 controller used writeback from the L1 controller to update the dirty bit
unconditionally.  Now, the L1 controller always assumes that the incoming
data is clean.  The L2 controller updates the dirty bit only when the L1
controller writes to the block.

3. Certain unused functions and events are being removed.
2013-12-26 15:18:55 -06:00
Nilay Vaish
fc53f9ffcc ruby: slicc: replace max_in_port_rank with number of inports
This patch replaces max_in_port_rank with the number of inports.  The use of
max_in_port_rank was causing spurious re-builds and incorrect initialization
of variables in ruby related regression tests.  This was due to the variable
value being used across threads while compiling when it was not meant to be.

Since the number of inports is state machine specific value, this problem
should get solved.
2013-12-20 20:34:04 -06:00
Nilay Vaish
30b259a31e ruby: declare variables to be unsigned in Address.hh 2013-12-20 20:34:03 -06:00
Nilay Vaish
f5b52a265a ruby: mesi: remove owner and sharer fields from directory tags
The directory controller should not have the sharer field since there is
only one level 2 cache. Anyway the field was not in use.  The owner field
was being used to track the l2 cache version (in case of distributed l2) that
has the cache block under consideration.  The information is not required
since the version of the level 2 cache can be obtained from a subset of the
address bits.
2013-12-20 20:34:03 -06:00
Andreas Hansson
460cc77d6d mem: Fixes for DRAM stats accounting
This patch fixes a number of stats accounting issues in the DRAM
controller. Most importantly, it separates the system interface and
DRAM interface so that it is clearer what the actual DRAM bandwidth
(and consequently utilisation) is.
2013-11-01 11:56:31 -04:00
Andreas Hansson
ce93982cc6 mem: Fix the LPDDR3 page size
This patch corrects the LPDDR3 page size, which was set too low.
2013-11-01 11:56:30 -04:00
Neha Agarwal
5c486908d7 mem: Adding stats for DRAM power calculation
This patch adds stats which are used for offline power calculation
from the 'Micron Power Calculator' spreadsheet.
2013-11-01 11:56:28 -04:00
Neha Agarwal
77fce1ce0e mem: Unify request selection for read and write queues
This patch unifies the request selection across read and write queues
for FR-FCFS scheduling policy. It also fixes the request selection
code to prioritize the row hits present in the request queues over the
selection based on earliest bank availability.
2013-11-01 11:56:27 -04:00
Andreas Hansson
bb572663cf mem: Add a simple adaptive version of the open-page policy
This patch adds a basic adaptive version of the open-page policy that
guides the decision to keep open or close by looking at the contents
of the controller queues. If no row hits are found, and bank conflicts
are present, then the row is closed by means of an auto
precharge. This is a well-known technique that should improve
performance in most use-cases.
2013-11-01 11:56:26 -04:00
Neha Agarwal
da6fd72f62 mem: Just-in-time write scheduling in DRAM controller
This patch removes the untimed while loop in the write scheduling
mechanism and now schedule commands taking into account the minimum
timing constraint. It also introduces an optimization to track write
queue size and switch from writes to reads if the number of write
requests fall below write low threshold.
2013-11-01 11:56:25 -04:00
Andreas Hansson
ee6b41a1e4 mem: Add tRRD as a timing parameter for the DRAM controller
This patch adds the tRRD parameter to the DRAM controller. With the
recent addition of the actAllowedAt member for each bank, this
addition is trivial.
2013-11-01 11:56:24 -04:00
Andreas Hansson
491d3a77cf mem: Less conservative tRAS in DRAM configurations
This patch changes the default values of the tRAS timing parameter to
be less conservative, and closer in line with existing parts.
2013-11-01 11:56:23 -04:00
Ani Udipi
8bc855fa15 mem: Make tXAW enforcement less conservative and per rank
This patch changes the tXAW constraint so that it is enforced per rank
rather than globally for all ranks in the channel. It also avoids
using the bank freeAt to enforce the activation limit, as doing so
also precludes performing any column or row command to the
DRAM. Instead the patch introduces a new variable actAllowedAt for the
banks and use this to track when a potential activation can occur.
2013-11-01 11:56:22 -04:00
Neha Agarwal
7645c8e611 mem: Fix for 100% write threshold in DRAM controller
This patch fixes the controller when a write threshold of 100% is
used.  Earlier for 100% write threshold no data is written to memory
as writes never get triggered since this corner case is not
considered.
2013-11-01 11:56:21 -04:00
Andreas Hansson
10e8978ec0 mem: Pick the next DRAM request based on bank availability
This patch changes the FCFS bit of FR-FCFS such that requests that
target the earliest available bank are picked first (as suggested in
the original work on FR-FCFS by Rixner et al). To accommodate this we
add functionality to identify a bank through a one-dimensional
identifier (bank id). The member names of the DRAMPacket are also
update to match the style guide.
2013-11-01 11:56:20 -04:00
Ani Udipi
ea76f97576 mem: Use the same timing calculation for DRAM read and write
This patch simplifies the DRAM model by re-using the function that
computes the busy and access time for both reads and writes.
2013-11-01 11:56:19 -04:00
Ani Udipi
655bf86828 mem: Fix DRAM bank occupancy for streaming access
This patch fixes an issue that allowed more than 100% bus utilisation
in certain cases.
2013-11-01 11:56:18 -04:00
Ani Udipi
be62a142cf mem: Schedule time for DRAM event taking tRAS into account
This patch changes the time the controller is woken up to take the
next scheduling decisions. tRAS is now handled in estimateLatency and
doDRAMAccess and we do not need to worry about it at scheduling
time. The earliest we need to wake up is to do a pre-charge, row
access and column access before the bus becomes free for use.
2013-11-01 11:56:17 -04:00
Ani Udipi
d4cf009b95 mem: Add tRAS parameter to the DRAM controller model
This patch adds an explicit tRAS parameter to the DRAM controller
model. Previously tRAS was, rather conservatively, assumed to be tRCD
+ tCL + tRP. The default values for tRAS are chosen to match the
previous behaviour and will be updated later.
2013-11-01 11:56:16 -04:00
Stephan Diestelhorst
19c2a606fa mem: Add "const" attribute to Packet getters
Add a "const" keywords to the getters in the Packet class so these can be
invoked on const Packet objects.
2013-10-31 13:41:13 -05:00
Prakash Ramrakhyani
885656f2ed mem: Add privilege info to request class
This patch adds a flag in the request class that indicates if the request
was made in privileged mode.
2013-10-31 13:41:13 -05:00
Lluc Alvarez
2b9b245fb3 ruby: set SenderMachine in messages of MOESI_CMP_directory
This patch adds missing initializations of the SenderMachine field of
out_msg's when thery are created in the L2 cache controller of the
MOESI_CMP_directory coherence protocol. When an out_msg is created and this
field is left uninitialized, it is set to the default value MachineType_NUM.
This causes a panic in the MachineType_to_string function when gem5 is
executed with the Ruby debug flag on and it tries to print the message.

Committed by: Nilay Vaish <nilay@cs.wisc.edu>
2013-10-30 10:35:06 -05:00
Emilio Castillo
80fa6a0edc ruby: Fixed a deadlock when restoring a checkpoint with garnet
This patch fixes a problem where in Garnet, the enqueue time in the
VCallocator and the SWallocator which is of type Cycles was being stored
inside a variable with int type.

This lead to a known problem restoring checkpoints with garnet & the fixed
pipeline enabled. That value was really big and didn't fit in the variable
overflowing it, therefore some conditions on the VC allocation stage & the
SW allocation stage were not met and the packets didn't advance through the
network, leading to a deadlock panic right after the checkpoint was restored.

Committed by: Nilay Vaish <nilay@cs.wisc.edu>
2013-10-30 10:35:05 -05:00
Stephan Diestelhorst
4e9d91016a mem: De-virtualise interfaces in the CoherentBus
The CoherentBus eventually got virtual methods for its interface. The
"virtuality" of the CoherentBus, however, comes already from the virtual
interface of the bus' ports. There is no need to add another layer of virtual
functions, here.
2013-10-17 10:20:45 -05:00
Matt Horsnell
6decd70bfb cpu: add consistent guarding to *_impl.hh files. 2013-10-17 10:20:45 -05:00
Sascha Bischoff
52f90890a3 mem: Add PortID to QueuedMasterPort constructor
This patch adds the PortID to the QueuedMasterPort. This allows a PortID to be
specified as it previously was set to the detault value of -1.
2013-10-17 10:20:45 -05:00
Ali Saidi
60ce2b34fe mem: Make MemoryAccess flag more verbose
This patch extends the MemoryAccess debug flag to report who sent the
requests and the cacheability.
2013-10-17 10:20:45 -05:00
Steve Reinhardt
b10ff075b1 ruby: eliminate non-determinism from ruby.stats output
Get rid of non-deterministic "stats" in ruby.stats output
such as time & date of run, elapsed & CPU time used,
and memory usage.  These values cause spurious
miscomparisons when looking at output diffs (though
they don't affect regressions, since the regressions
pass/fail status currently ignores ruby.stats entirely).

Most of this information is already captured in other
places (time & date in stdout, elapsed time & mem usage
in stats.txt), where the regression script is smart
enough to filter it out.  It seems easier to get rid of
the redundant output rather than teaching the
regression tester to ignore the same information in
two different places.
2013-10-15 18:22:49 -04:00
Andreas Sandberg
4f5775df64 mem: Rename the ASI_BITS flag field in Request
ASI_BITS in the Request object were originally used to store a memory
request's ASI on SPARC. This is not the case any more since other ISAs
use the ASI bits to store architecture-dependent information. This
changeset renames the ASI_BITS to ARCH_BITS which better describes
their use. Additionally, the getAsi() accessor is renamed to
getArchFlags().
2013-10-15 13:26:34 +02:00
Andreas Sandberg
5e7738467b mem: Use a flag instead of address bit 63 for generic IPRs
Using address bit 63 to identify generic IPRs caused problems on
SPARC, where IPRs are heavily used. This changeset redefines how
generic IPRs are identified. Instead of using bit 63, we now use a
separate flag (GENERIC_IPR) a memory request.
2013-10-15 13:24:35 +02:00
Andreas Hansson
9aa939891f mem: Fix scheduling bug in SimpleMemory
This patch ensures that a dequeue event is not scheduled if the memory
controller is waiting for a retry already. Without this check it is
possible for the controller to attempt sending something whilst
already having one packet that is in retry, thus causing the bus to
have an assertion failure.
2013-09-18 08:46:33 -04:00
Joel Hestness
cc155ffa0d ruby: Fix Topology throttle connections
The Topology source sets up input and output buffers for each of the external
nodes of a topology by indexing on Ruby's generated controller unique IDs.
These unique IDs are found by adding the MachineType_base_number to the version
number of each controller (see any generated *_Controller.cc - init() calls
getToNetQueue and getFromNetQueue using m_version + base). However, the
Topology object used the cntrl_id - which is required to be unique across all
controllers - to index the controllers list as they are being connected to
their input and output buffers. If the cntrl_ids did not match the Ruby unique
ID, the throttles end up connected to incorrectly indexed nodes in the network,
resulting in packets traversing incorrect network paths. This patch fixes the
Topology indexing scheme by using the Ruby unique ID to match that of the
SimpleNetwork buffer vectors.
2013-09-11 15:35:18 -05:00
Joel Hestness
c1cf55c738 ruby: Statically allocate stats in SimpleNetwork, Switch, Throttle
The previous changeset (9863:9483739f83ee) used STL vector containers to
dynamically allocate stats in the Ruby SimpleNetwork, Switch and Throttle. For
gcc versions before at least 4.6.3, this causes the standard vector allocator
to call Stats copy constructors (a no-no, since stats should be allocated in
the body of each SimObject instance). Since the size of these stats arrays is
known at compile time (NOTE: after code generation), this patch changes their
allocation to be static rather than using an STL vector.
2013-09-11 15:33:27 -05:00
Nilay Vaish
90bfbd9793 ruby: network: convert to gem5 style stats 2013-09-06 16:21:35 -05:00
Nilay Vaish
24dc914d87 ruby: profiler: removes function resourceUsage() 2013-09-06 16:21:32 -05:00
Nilay Vaish
79b5ea9d19 ruby: remove undefined message size type
This message size type does not work well with one of the statistical
variables. It also seems unnecessary.
2013-09-06 16:21:30 -05:00
Nilay Vaish
0280997fbf ruby: network: removes reset functionality 2013-09-06 16:21:30 -05:00
Nilay Vaish
e7bd70e079 ruby: network: shorten variable names 2013-09-06 16:21:29 -05:00
Nilay Vaish
c0a8ad0a35 ruby: converts sparse memory stats to gem5 style 2013-09-06 16:21:28 -05:00
Andreas Hansson
19a5b68db7 arch: Resurrect the NOISA build target and rename it NULL
This patch makes it possible to once again build gem5 without any
ISA. The main purpose is to enable work around the interconnect and
memory system without having to build any CPU models or device models.

The regress script is updated to include the NULL ISA target. Currently
no regressions make use of it, but all the testers could (and perhaps
should) transition to it.

--HG--
rename : build_opts/NOISA => build_opts/NULL
rename : src/arch/noisa/SConsopts => src/arch/null/SConsopts
rename : src/arch/noisa/cpu_dummy.hh => src/arch/null/cpu_dummy.hh
rename : src/cpu/intr_control.cc => src/cpu/intr_control_noisa.cc
2013-09-04 13:22:57 -04:00
Andreas Hansson
b63631536d stats: Cumulative stats update
This patch updates the stats to reflect the: 1) addition of the
internal queue in SimpleMemory, 2) moving of the memory class outside
FSConfig, 3) fixing up of the 2D vector printing format, 4) specifying
burst size and interface width for the DRAM instead of relying on
cache-line size, 5) performing merging in the DRAM controller write
buffer, and 6) fixing how idle cycles are counted in the atomic and
timing CPU models.

The main reason for bundling them up is to minimise the changeset
size.
2013-08-19 03:52:36 -04:00
Andreas Hansson
c26911013c config: Command line support for multi-channel memory
This patch adds support for specifying multi-channel memory
configurations on the command line, e.g. 'se/fs.py
--mem-type=ddr3_1600_x64 --mem-channels=4'. To enable this, it
enhances the functionality of MemConfig and moves the existing
makeMultiChannel class method from SimpleDRAM to the support scripts.

The se/fs.py example scripts are updated to make use of the new
feature.
2013-08-19 03:52:34 -04:00
Andreas Hansson
49d88f08b0 mem: Change AbstractMemory defaults to match the common case
This patch changes the default parameter value of conf_table_reported
to match the common case. It also simplifies the regression and config
scripts to reflect this change.
2013-08-19 03:52:33 -04:00
Andreas Hansson
6279eaf1f7 mem: Use STL deque in favour of list for DRAM queues
This patch changes the data structure used for the DRAM read, write
and response queues from an STL list to deque. This optimisation is
based on the observation that the size is small (and fixed), and that
the structures are frequently iterated over in a linear fashion.
2013-08-19 03:52:32 -04:00
Andreas Hansson
ac42db8134 mem: Perform write merging in the DRAM write queue
This patch implements basic write merging in the DRAM to avoid
redundant bursts. When a new access is added to the queue it is
compared against the existing entries, and if it is either
intersecting or immediately succeeding/preceeding an existing item it
is merged.

There is currently no attempt made at avoiding iterating over the
existing items in determining whether merging is possible or not.
2013-08-19 03:52:31 -04:00
Amin Farmahini
243f135e5f mem: Replacing bytesPerCacheLine with DRAM burstLength in SimpleDRAM
This patch gets rid of bytesPerCacheLine parameter and makes the DRAM
configuration separate from cache line size. Instead of
bytesPerCacheLine, we define a parameter for the DRAM called
burst_length. The burst_length parameter shows the length of a DRAM
device burst in bits. Also, lines_per_rowbuffer is replaced with
device_rowbuffer_size to improve code portablity.

This patch adds a burst length in beats for each memory type, an
interface width for each memory type, and the memory controller model
is extended to reason about "system" packets vs "dram" packets and
assemble the responses properly. It means that system packets larger
than a full burst are split into multiple dram packets.
2013-08-19 03:52:30 -04:00
Andreas Hansson
d5593f3c75 mem: Warn instead of panic for tXAW violation
Until the performance bug is fixed, avoid killing simulations.
2013-08-19 03:52:26 -04:00
Andreas Hansson
7bc3eaec7a mem: Allow disabling of tXAW through a 0 activation limit
This patch fixes an issue where an activation limit of 0 was not
allowed. With this patch, setting the limit to 0 simply disables the
tXAW constraint.
2013-08-19 03:52:26 -04:00
Andreas Hansson
2a675aecb9 mem: Add an internal packet queue in SimpleMemory
This patch adds a packet queue in SimpleMemory to avoid using the
packet queue in the port (and thus have no involvement in the flow
control). The port queue was bound to 100 packets, and as the
SimpleMemory is modelling both a controller and an actual RAM, it
potentially has a large number of packets in flight. There is
currently no limit on the number of packets in the memory controller,
but this could easily be added in a follow-on patch.

As a result of the added internal storage, the functional access and
draining is updated. Some minor cleaning up and renaming has also been
done.

The memtest regression changes as a result of this patch and the stats
will be updated.
2013-08-19 03:52:25 -04:00
Nilay Vaish
95381f8a99 ruby: slicc: remove double trigger, continueProcessing
These constructs are not in use and are not being maintained by any one.
In addition, it is not known if doubleTrigger works correctly with Ruby now.
2013-08-07 14:51:18 -05:00
Nilay Vaish
f1b17bf157 ruby: slicc: move some code to AbstractController
Some of the code in StateMachine.py file is added to all the controllers and
is independent of the controller definition. This code is being moved to the
AbstractController class which is the parent class of all controllers.
2013-08-07 14:51:18 -05:00
Andreas Hansson
d4273cc9a6 mem: Set the cache line size on a system level
This patch removes the notion of a peer block size and instead sets
the cache line size on the system level.

Previously the size was set per cache, and communicated through the
interconnect. There were plenty checks to ensure that everyone had the
same size specified, and these checks are now removed. Another benefit
that is not yet harnessed is that the cache line size is now known at
construction time, rather than after the port binding. Hence, the
block size can be locally stored and does not have to be queried every
time it is used.

A follow-on patch updates the configuration scripts accordingly.
2013-07-18 08:31:16 -04:00
Xiangyu Dong
4e8ecd7c6f mem: Add cache class destructor to avoid memory leaks
Make valgrind a little bit happier
2013-07-18 08:29:47 -04:00
Brad Beckmann
8e54c93222 ruby: removed the very old double trigger hack
Committed by: Nilay Vaish <nilay@cs.wisc.edu>
2013-07-11 13:56:05 -05:00
Nilay Vaish
1be0098c0b ruby: append transition comment only when in opt/debug 2013-06-28 21:42:27 -05:00
Nilay Vaish
b3980cdb9a ruby: network: remove reconfiguration code
This code seems not to be of any use now. There is no path in the simulator
that allows for reconfiguring the network. A better approach would be to
take a checkpoint and start the simulation from the checkpoint with the new
configuration.
2013-06-28 21:36:37 -05:00
Prakash Ramrakhyani
ac515d7a9b mem: Reorganize cache tags and make them a SimObject
This patch reorganizes the cache tags to allow more flexibility to
implement new replacement policies. The base tags class is now a
clocked object so that derived classes can use a clock if they need
one. Also having deriving from SimObject allows specialized Tag
classes to be swapped in/out in .py files.

The cache set is now templatized to allow it to contain customized
cache blocks with additional informaiton. This involved moving code to
the .hh file and removing cacheset.cc.

The statistics belonging to the cache tags are now including ".tags"
in their name. Hence, the stats need an update to reflect the change
in naming.
2013-06-27 05:49:50 -04:00
Andreas Hansson
0d68d36b9d mem: Remove the cache builder
This patch removes the redundant cache builder class.
2013-06-27 05:49:50 -04:00
Akash Bagdia
7d7ab73862 sim: Add the notion of clock domains to all ClockedObjects
This patch adds the notion of source- and derived-clock domains to the
ClockedObjects. As such, all clock information is moved to the clock
domain, and the ClockedObjects are grouped into domains.

The clock domains are either source domains, with a specific clock
period, or derived domains that have a parent domain and a divider
(potentially chained). For piece of logic that runs at a derived clock
(a ratio of the clock its parent is running at) the necessary derived
clock domain is created from its corresponding parent clock
domain. For now, the derived clock domain only supports a divider,
thus ensuring a lower speed compared to its parent. Multiplier
functionality implies a PLL logic that has not been modelled yet
(create a separate clock instead).

The clock domains should be used as a mechanism to provide a
controllable clock source that affects clock for every clocked object
lying beneath it. The clock of the domain can (in a future patch) be
controlled by a handler responsible for dynamic frequency scaling of
the respective clock domains.

All the config scripts have been retro-fitted with clock domains. For
the System a default SrcClockDomain is created. For CPUs that run at a
different speed than the system, there is a seperate clock domain
created. This domain incorporates the CPU and the associated
caches. As before, Ruby runs under its own clock domain.

The clock period of all domains are pre-computed, such that no virtual
functions or multiplications are needed when calling
clockPeriod. Instead, the clock period is pre-computed when any
changes occur. For this to be possible, each clock domain tracks its
children.
2013-06-27 05:49:49 -04:00