Commit graph

22 commits

Author SHA1 Message Date
Andreas Hansson
9cbe1cb653 config: Unify caches used in regressions and adjust L2 MSHRs
This patch unified the L1 and L2 caches used throughout the
regressions instead of declaring different, but very similar,
configurations in the different scripts.

The patch also changes the default L2 configuration to match what it
used to be for the fs and se scripts (until the last patch that
updated the regressions to also make use of the cache config). The
MSHRs and targets per MSHR are now set to a more realistic default of
20 and 12, respectively.

As a result of both the aforementioned changes, many of the regression
stats are changed. A follow-on patch will bump the stats.
2012-10-30 07:44:08 -04:00
Andreas Hansson
88554790c3 Mem: Use cycles to express cache-related latencies
This patch changes the cache-related latencies from an absolute time
expressed in Ticks, to a number of cycles that can be scaled with the
clock period of the caches. Ultimately this patch serves to enable
future work that involves dynamic frequency scaling. As an immediate
benefit it also makes it more convenient to specify cache performance
without implicitly assuming a specific CPU core operating frequency.

The stat blocked_cycles that actually counter in ticks is now updated
to count in cycles.

As the timing is now rounded to the clock edges of the cache, there
are some regressions that change. Plenty of them have very minor
changes, whereas some regressions with a short run-time are perturbed
quite significantly. A follow-on patch updates all the statistics for
the regressions.
2012-10-15 08:10:54 -04:00
Andreas Hansson
072a91ee51 Configs: Set the memtest clock to a reasonable value
This patch changes the memtest clock from 1THz (the default) to 2GHz,
similar to the CPUs in the other regressions. This is useful as the
caches will adopt the same clock as the CPU. The bus clock rate is
scaled accordingly, and the L1-L2 bus is kept at the CPU clock while
the memory bus is at half that frequency.

A separate patch updates the affected stats.
2012-10-15 08:09:57 -04:00
Mrinmoy Ghosh
6fc0094337 Cache: add a response latency to the caches
In the current caches the hit latency is paid twice on a miss. This patch lets
a configurable response latency be set of the cache for the backward path.
2012-09-25 11:49:41 -05:00
Andreas Hansson
f00cba34eb Mem: Make SimpleMemory single ported
This patch changes the simple memory to have a single slave port
rather than a vector port. The simple memory makes no attempts at
modelling the contention between multiple ports, and any such
multiplexing and demultiplexing could be done in a bus (or crossbar)
outside the memory controller. This scenario also matches with the
ongoing work on a SimpleDRAM model, which will be a single-ported
single-channel controller that can be used in conjunction with a bus
(or crossbar) to create a multi-port multi-channel controller.

There are only very few regressions that make use of the vector port,
and these are all for functional accesses only. To facilitate these
cases, memtest and memtest-ruby have been updated to also have a
"functional" bus to perform the (de)multiplexing of the functional
memory accesses.
2012-07-12 12:56:13 -04:00
Andreas Hansson
0d32940711 Bus: Split the bus into a non-coherent and coherent bus
This patch introduces a class hierarchy of buses, a non-coherent one,
and a coherent one, splitting the existing bus functionality. By doing
so it also enables further specialisation of the two types of buses.

A non-coherent bus connects a number of non-snooping masters and
slaves, and routes the request and response packets based on the
address. The request packets issued by the master connected to a
non-coherent bus could still snoop in caches attached to a coherent
bus, as is the case with the I/O bus and memory bus in most system
configurations. No snoops will, however, reach any master on the
non-coherent bus itself. The non-coherent bus can be used as a
template for modelling PCI, PCIe, and non-coherent AMBA and OCP buses,
and is typically used for the I/O buses.

A coherent bus connects a number of (potentially) snooping masters and
slaves, and routes the request and response packets based on the
address, and also forwards all requests to the snoopers and deals with
the snoop responses. The coherent bus can be used as a template for
modelling QPI, HyperTransport, ACE and coherent OCP buses, and is
typically used for the L1-to-L2 buses and as the main system
interconnect.

The configuration scripts are updated to use a NoncoherentBus for all
peripheral and I/O buses.

A bit of minor tidying up has also been done.

--HG--
rename : src/mem/bus.cc => src/mem/coherent_bus.cc
rename : src/mem/bus.hh => src/mem/coherent_bus.hh
rename : src/mem/bus.cc => src/mem/noncoherent_bus.cc
rename : src/mem/bus.hh => src/mem/noncoherent_bus.hh
2012-05-31 13:30:04 -04:00
Andreas Hansson
b00949d88b MEM: Enable multiple distributed generalized memories
This patch removes the assumption on having on single instance of
PhysicalMemory, and enables a distributed memory where the individual
memories in the system are each responsible for a single contiguous
address range.

All memories inherit from an AbstractMemory that encompasses the basic
behaviuor of a random access memory, and provides untimed access
methods. What was previously called PhysicalMemory is now
SimpleMemory, and a subclass of AbstractMemory. All future types of
memory controllers should inherit from AbstractMemory.

To enable e.g. the atomic CPU and RubyPort to access the now
distributed memory, the system has a wrapper class, called
PhysicalMemory that is aware of all the memories in the system and
their associated address ranges. This class thus acts as an
infinitely-fast bus and performs address decoding for these "shortcut"
accesses. Each memory can specify that it should not be part of the
global address map (used e.g. by the functional memories by some
testers). Moreover, each memory can be configured to be reported to
the OS configuration table, useful for populating ATAG structures, and
any potential ACPI tables.

Checkpointing support currently assumes that all memories have the
same size and organisation when creating and resuming from the
checkpoint. A future patch will enable a more flexible
re-organisation.

--HG--
rename : src/mem/PhysicalMemory.py => src/mem/AbstractMemory.py
rename : src/mem/PhysicalMemory.py => src/mem/SimpleMemory.py
rename : src/mem/physical.cc => src/mem/abstract_mem.cc
rename : src/mem/physical.hh => src/mem/abstract_mem.hh
rename : src/mem/physical.cc => src/mem/simple_mem.cc
rename : src/mem/physical.hh => src/mem/simple_mem.hh
2012-04-06 13:46:31 -04:00
Andreas Hansson
5a9a743cfc MEM: Introduce the master/slave port roles in the Python classes
This patch classifies all ports in Python as either Master or Slave
and enforces a binding of master to slave. Conceptually, a master (such
as a CPU or DMA port) issues requests, and receives responses, and
conversely, a slave (such as a memory or a PIO device) receives
requests and sends back responses. Currently there is no
differentiation between coherent and non-coherent masters and slaves.

The classification as master/slave also involves splitting the dual
role port of the bus into a master and slave port and updating all the
system assembly scripts to use the appropriate port. Similarly, the
interrupt devices have to have their int_port split into a master and
slave port. The intdev and its children have minimal changes to
facilitate the extra port.

Note that this patch does not enforce any port typing in the C++
world, it merely ensures that the Python objects have a notion of the
port roles and are connected in an appropriate manner. This check is
carried when two ports are connected, e.g. bus.master =
memory.port. The following patches will make use of the
classifications and specialise the C++ ports into masters and slaves.
2012-02-13 06:43:09 -05:00
Dam Sunwoo
230540e655 mem: fix cache stats to use request ids correctly
This patch fixes the cache stats to use the new request ids.
Cache stats also display the requestor names in the vector subnames.
Most cache stats now include "nozero" and "nonan" flags to reduce the
amount of excessive cache stat dump. Also, simplified
incMissCount()/incHitCount() functions.
2012-02-12 16:07:39 -06:00
Gabe Black
ec20ee2f7c SE/FS: Make SE vs. FS mode a runtime parameter. 2012-01-28 07:24:34 -08:00
Andreas Hansson
f85286b3de MEM: Add port proxies instead of non-structural ports
Port proxies are used to replace non-structural ports, and thus enable
all ports in the system to correspond to a structural entity. This has
the advantage of accessing memory through the normal memory subsystem
and thus allowing any constellation of distributed memories, address
maps, etc. Most accesses are done through the "system port" that is
used for loading binaries, debugging etc. For the entities that belong
to the CPU, e.g. threads and thread contexts, they wrap the CPU data
port in a port proxy.

The following replacements are made:
FunctionalPort      > PortProxy
TranslatingPort     > SETranslatingPortProxy
VirtualPort         > FSTranslatingPortProxy

--HG--
rename : src/mem/vport.cc => src/mem/fs_translating_port_proxy.cc
rename : src/mem/vport.hh => src/mem/fs_translating_port_proxy.hh
rename : src/mem/translating_port.cc => src/mem/se_translating_port_proxy.cc
rename : src/mem/translating_port.hh => src/mem/se_translating_port_proxy.hh
2012-01-17 12:55:08 -06:00
Ali Saidi
a432d8e085 Mem: Fix issue with dirty block being lost when entire block transferred to non-cache.
This change fixes the problem for all the cases we actively use. If you want to try
more creative I/O device attachments (E.g. sharing an L2), this won't work. You
would need another level of caching between the I/O device and the cache
(which you actually need anyway with our current code to make sure writes
propagate). This is required so that you can mark the cache in between as
top level and it won't try to send ownership of a block to the I/O device.
Asserts have been added that should catch any issues.
2011-03-17 19:20:19 -05:00
Lisa Hsu
1d3228481f cache: Make caches sharing aware and add occupancy stats.
On the config end, if a shared L2 is created for the system, it is
parameterized to have n sharers as defined by option.num_cpus. In addition to
making the cache sharing aware so that discriminating tag policies can make use
of context_ids to make decisions, I added an occupancy AverageStat and an occ %
stat to each cache so that you could know which contexts are occupying how much
cache on average, both in terms of blocks and percentage. Note that since
devices have context_id -1, having an array of occ stats that correspond to
each context_id will break here, so in FS mode I add an extra bucket for device
blocks. This bucket is explicitly not added in SE mode in order to not only
avoid ugliness in the stats.txt file, but to avoid broken stats (some formulas
break when a bucket is 0).
2010-02-23 09:34:22 -08:00
Steve Reinhardt
07f091d6ed Get rid of remaining traces of obsolete CoherenceProtocol object.
--HG--
extra : convert_revision : c5555b00bef1b304a84886188ad2c0dcb4d7c5b9
2007-06-30 17:59:45 -07:00
Steve Reinhardt
0305159abf PhysicalMemory has vector of uniform ports instead of one special one.
configs/example/memtest.py:
    PhysicalMemory has vector of uniform ports instead of one special one.
    Other updates to fix obsolete brokenness.
src/mem/physical.cc:
src/mem/physical.hh:
src/python/m5/objects/PhysicalMemory.py:
    Have vector of uniform ports instead of one special one.
src/python/swig/pyobject.cc:
    Add comment.

--HG--
extra : convert_revision : a4a764dcdcd9720bcd07c979d0ece311fc8cb4f1
2007-05-19 00:24:34 -04:00
Ali Saidi
634d2e9d83 remove hit_latency and make latency do the right thing
set the latency parameter in terms of a latency
add caches to tsunami-simple configs

configs/common/Caches.py:
tests/configs/memtest.py:
tests/configs/o3-timing-mp.py:
tests/configs/o3-timing.py:
tests/configs/simple-atomic-mp.py:
tests/configs/simple-timing-mp.py:
tests/configs/simple-timing.py:
    set the latency parameter in terms of a latency
configs/common/FSConfig.py:
    give the bridge a default latency too
src/mem/cache/cache_builder.cc:
src/python/m5/objects/BaseCache.py:
    remove hit_latency and make latency do the right thing
tests/configs/tsunami-simple-atomic-dual.py:
tests/configs/tsunami-simple-atomic.py:
tests/configs/tsunami-simple-timing-dual.py:
tests/configs/tsunami-simple-timing.py:
    add caches to tsunami-simple configs

--HG--
extra : convert_revision : 37bef7c652e97c8cdb91f471fba62978f89019f1
2007-05-10 18:24:48 -04:00
Steve Reinhardt
d486700149 Add short memtest run to quick regressions.
Caveats:
- Even though memtest is ISA-independent, it will only
run for the Alpha builds, since there's no way to specify
ISA-independent reference files and I didn't want to commit
3 copies since I'm not sure we want to run it for all the
different ISAs anyway.
- Reference outputs were generated on my laptop,
so performance numbers will be low compared to zizzer.

--HG--
extra : convert_revision : 210fe4abecc19fbab0b15402c6cb4863650bab66
2007-02-06 21:16:33 -08:00
Ron Dreslinski
780aa0a0eb Fix corner case on assertion.
I need to move over to using the fixPacket function so I don't have to make the same changes everywhere.
Still a functional access bug someplace I need to track down in timing mode.

src/mem/cache/base_cache.cc:
src/mem/cache/cache_impl.hh:
    Fix corner case on assertion
tests/configs/memtest.py:
    Updated memtester with uncacheable addresses and functional accesses

--HG--
extra : convert_revision : e6fa851621700ff9227b83cc5cac20af4fc8444f
2006-10-19 21:26:46 -04:00
Ron Dreslinski
60252f8e63 Interesting memtest finally.
Get over 500,000 reads on each of 8 testers before memory leak becomes large.

tests/configs/memtest.py:
    Update test to be more interesting

--HG--
extra : convert_revision : 4258b798fbeeed2a376f1bfac100a109eb05620e
2006-10-11 01:18:20 -04:00
Ron Dreslinski
cc78d86661 Fix several bugs pertaining to upgrades/mem leaks.
src/mem/cache/base_cache.cc:
    Fix a bug about not having a request to send
src/mem/cache/base_cache.hh:
    Fix a bug with the blocking code
src/mem/cache/cache.hh:
    AFix a bug with snoop hits in WB buffer
src/mem/cache/cache_impl.hh:
    Fix a bug with snoop hits in WB buffer
    Also, add better DPRINTF's
src/mem/cache/miss/miss_queue.cc:
    Fix a bug with upgrades (Need to clean it up later)
src/mem/cache/miss/mshr.cc:
    Fix a memory leak bug, still some outstanding with writebacks not being deleted
src/mem/cache/miss/mshr_queue.cc:
    Fix a bug about upgrades (need to clean up later)
src/mem/packet.hh:
    Fix for newly added cmd attribute for upgrades
tests/configs/memtest.py:
    More interesting testcase

--HG--
extra : convert_revision : fcb4f17dd58b537bb4f67a8c835f50e455e8c688
2006-10-10 01:32:18 -04:00
Ron Dreslinski
b9fb4d4870 Make memtest work with 8 memtesters
src/mem/physical.cc:
    Update comment to match memtest use
src/python/m5/objects/PhysicalMemory.py:
    Make memtester have a way to connect functionally
tests/configs/memtest.py:
    Properly create 8 memtesters and connect them to the memory system

--HG--
extra : convert_revision : e5a2dd9c8918d58051b553b5c6a14785d48b34ca
2006-10-09 17:13:50 -04:00
Ron Dreslinski
6c7ab02682 Update the Memtester, commit a config file/test for it.
src/cpu/SConscript:
    Add memtester to the compilation environment.
    Someone who knows this better should make the MemTest a cpu model parameter.

    For now attached with the build of o3 cpu.
src/cpu/memtest/memtest.cc:
src/cpu/memtest/memtest.hh:
    Update Memtest for new mem system
src/python/m5/objects/MemTest.py:
    Update memtest python description

--HG--
extra : convert_revision : d6a63e08fda0975a7abfb23814a86a0caf53e482
2006-10-09 00:26:10 -04:00