This patch cleans up a number of remaining uses of bus.port which
is now split into bus.master and bus.slave. The only non-trivial change
is the memtest where the level building now has to be aware of the role
of the ports used in the previous level.
This patch merely removes the use of the num_cpus cache parameter
which no longer exists after the introduction of the masterIds. The
affected scripts fail when trying to set the parameter. Note that this
patch does not update the regression stats.
This patch classifies all ports in Python as either Master or Slave
and enforces a binding of master to slave. Conceptually, a master (such
as a CPU or DMA port) issues requests, and receives responses, and
conversely, a slave (such as a memory or a PIO device) receives
requests and sends back responses. Currently there is no
differentiation between coherent and non-coherent masters and slaves.
The classification as master/slave also involves splitting the dual
role port of the bus into a master and slave port and updating all the
system assembly scripts to use the appropriate port. Similarly, the
interrupt devices have to have their int_port split into a master and
slave port. The intdev and its children have minimal changes to
facilitate the extra port.
Note that this patch does not enforce any port typing in the C++
world, it merely ensures that the Python objects have a notion of the
port roles and are connected in an appropriate manner. This check is
carried when two ports are connected, e.g. bus.master =
memory.port. The following patches will make use of the
classifications and specialise the C++ ports into masters and slaves.
This patch moves the connection of the system port to create_system in
Ruby.py. Thereby it allows the failing Ruby test (and other Ruby
systems) to run again.
This patch fixes the currently broken fs.py by specifying the size of
the bridge range rather than the end address. This effectively
subtracts one when determining the address range for the IO bridge
(from IO bus to membus), and thus avoids the overlapping ranges.
This patch implements the functionality for forwarding invalidations and
replacements from the L1 cache of the Ruby memory system to the O3 CPU. The
implementation adds a list of ports to RubyPort. Whenever a replacement or an
invalidation is performed, the L1 cache forwards this to all the ports, which
is the LSQ in case of the O3 CPU.
In preparation for the introduction of Master and Slave ports, this
patch removes the default port parameter in the Python port and thus
forces the argument list of the Port to contain only the
description. The drawback at this point is that the config port and
dma port of PCI and DMA devices have to be connected explicitly. This
is key for future diversification as the pio and config port are
slaves, but the dma port is a master.
This patch makes the bus bridge uni-directional and specialises the
bus ports to be a master port and a slave port. This greatly
simplifies the assumptions on both sides as either port only has to
deal with requests or responses. The following patches introduce the
notion of master and slave ports, and would not be possible without
this split of responsibilities.
In making the bridge unidirectional, the address range mechanism of
the bridge is also changed. For the cases where communication is
taking place both ways, an additional bridge is needed. This causes
issues with the existing mechanism, as the busses cannot determine
when to stop iterating the address updates from the two bridges. To
avoid this issue, and also greatly simplify the specification, the
bridge now has a fixed set of address ranges, specified at creation
time.
Port proxies are used to replace non-structural ports, and thus enable
all ports in the system to correspond to a structural entity. This has
the advantage of accessing memory through the normal memory subsystem
and thus allowing any constellation of distributed memories, address
maps, etc. Most accesses are done through the "system port" that is
used for loading binaries, debugging etc. For the entities that belong
to the CPU, e.g. threads and thread contexts, they wrap the CPU data
port in a port proxy.
The following replacements are made:
FunctionalPort > PortProxy
TranslatingPort > SETranslatingPortProxy
VirtualPort > FSTranslatingPortProxy
--HG--
rename : src/mem/vport.cc => src/mem/fs_translating_port_proxy.cc
rename : src/mem/vport.hh => src/mem/fs_translating_port_proxy.hh
rename : src/mem/translating_port.cc => src/mem/se_translating_port_proxy.cc
rename : src/mem/translating_port.hh => src/mem/se_translating_port_proxy.hh
Currently there is an assumption that restoration from a checkpoint will
happen by first restoring to an atomic CPU and then switching to a timing
CPU. This patch adds support for directly restoring to a timing CPU. It
adds a new option '--restore-with-cpu' which is used to specify the type
of CPU to which the checkpoint should be restored to. It defaults to
'atomic' which was the case before.
This patch adds a new option for cpu type. This option is of type 'choice'
which is similar to a C++ enum, except that it takes string values as
possible choices. Following options are being removed -- detailed, timing,
inorder.
--HG--
extra : rebase_source : 58885e2e8a88b6af8e6ff884a5922059dbb1a6cb
When a change in the frame buffer from the VNC server is detected, the new
frame is stored out to the m5out/frames_*/ directory. Specifiy the flag
"--frame-capture" when running configs/example/fs.py to enable this behavior.
--HG--
extra : rebase_source : d4e08e83f4fa6ff79f3dc9c433fc1f0487e057fc
There are two lines in O3CPU.py that set the dcache and icache
tgts_per_mshr to 20, ignoring any pre-configured value of tgts_per_mshr.
This patch removes these hardcoded lines from O3CPU.py and sets the default
L1 cache mshr targets to 20.
--HG--
extra : rebase_source : 6f92d950e90496a3102967442814e97dc84db08b
This patch adds a fault model, which provides the probability of a number of
architectural faults in the interconnection network (e.g., data corruption,
misrouting). These probabilities can be used to realistically inject faults
in GARNET and faithfully evaluate the effectiveness of novel resilient NoC
architectures.
This patch drops RUBY as a compile time option. Instead the PROTOCOL option
is used to figure out whether or not to build Ruby. If the specified protocol
is 'None', then Ruby is not compiled.
The patch on Ruby functional accesses made changes to the process of
instantiating controllers and sequencers. The DMA controller and
sequencer was not updated, hence this patch.
Addition of functional access support to Ruby necessitated some changes to
the way coherence protocols are written. I had forgotten to update the
Network_test protocol. This patch makes those updates.
This patch rpovides functional access support in Ruby. Currently only
the M5Port of RubyPort supports functional accesses. The support for
functional through the PioPort will be added as a separate patch.
A significant contributor to the need for adoptOrphanParams()
is the practice of appending to SimObjectVectors which have
already been assigned as children. This practice sidesteps the
assignment operation for those appended SimObjects, which is
where parent/child relationships are typically established.
This patch reworks the config scripts that use append() on
SimObjectVectors, which all happen to be in the x86 system
configuration. At some point in the future, I hope to make
SimObjectVectors immutable (by deriving from tuple rather than
list), at which time this patch will be necessary for correct
operation. For now, it just avoids some of the warning
messages that get printed in adoptOrphanParams().
Re-enabling implicit parenting (see previous patch) causes current
Ruby config scripts to create some strange hierarchies and generate
several warnings. This patch makes three general changes to address
these issues.
1. The order of object creation in the ruby config files makes the L1
caches children of the sequencer rather than the controller; these
config ciles are rewritten to assign the L1 caches to the
controller first.
2. The assignment of the sequencer list to system.ruby.cpu_ruby_ports
causes the sequencers to be children of system.ruby, generating
warnings because they are already parented to their respective
controllers. Changing this attribute to _cpu_ruby_ports fixes this
because the leading underscore means this is now treated as a plain
Python attribute rather than a child assignment. As a result, the
configuration hierarchy changes such that, e.g.,
system.ruby.cpu_ruby_ports0 becomes system.l1_cntrl0.sequencer.
3. In the topology classes, the routers become children of some random
internal link node rather than direct children of the topology.
The topology classes are rewritten to assign the routers to the
topology object first.
A recent patch broke the ruby network tester by adding -p inside Options.py
which conflicts with the -p inside ruby_network_test.py.
Have removed -p from ruby_network_test.py
The network tester terminates after injecting for sim_cycles
(default=1000), instead of having to explicitly pass --maxticks from the
command line as before. If fixed_pkts is enabled, the tester only
injects maxpackets number of packets, else it keeps injecting till sim_cycles.
The tester also works with zero command line arguments now.
This patch ensures that both Garnet and the simple networks use the bw value
specified in the topology. To do so, the patch generalizes the specification
of bw for basic links. This value is then translated to the specific value
used by the simple and Garnet networks. Since Garent does not support
non-uniformed link bandwidth, the patch also adds a check to ensure all bws are
equal.
--HG--
rename : src/mem/ruby/network/BasicLink.cc => src/mem/ruby/network/simple/SimpleLink.cc
rename : src/mem/ruby/network/BasicLink.hh => src/mem/ruby/network/simple/SimpleLink.hh
rename : src/mem/ruby/network/BasicLink.py => src/mem/ruby/network/simple/SimpleLink.py
This patch converts links and switches from second class simobjects that were
virtually ignored by the networks (both simple and Garnet) to first class
simobjects that directly correspond to c++ ojbects manipulated by the
topology and network classes. This is especially true for Garnet, where the
links and switches directly correspond to specific C++ objects.
By making this change, many aspects of the Topology class were simplified.
--HG--
rename : src/mem/ruby/network/Network.cc => src/mem/ruby/network/BasicLink.cc
rename : src/mem/ruby/network/Network.hh => src/mem/ruby/network/BasicLink.hh
rename : src/mem/ruby/network/Network.cc => src/mem/ruby/network/garnet/fixed-pipeline/GarnetLink_d.cc
rename : src/mem/ruby/network/Network.hh => src/mem/ruby/network/garnet/fixed-pipeline/GarnetLink_d.hh
rename : src/mem/ruby/network/garnet/fixed-pipeline/GarnetNetwork_d.py => src/mem/ruby/network/garnet/fixed-pipeline/GarnetLink_d.py
rename : src/mem/ruby/network/garnet/fixed-pipeline/GarnetNetwork_d.py => src/mem/ruby/network/garnet/fixed-pipeline/GarnetRouter_d.py
rename : src/mem/ruby/network/Network.cc => src/mem/ruby/network/garnet/flexible-pipeline/GarnetLink.cc
rename : src/mem/ruby/network/Network.hh => src/mem/ruby/network/garnet/flexible-pipeline/GarnetLink.hh
rename : src/mem/ruby/network/garnet/fixed-pipeline/GarnetNetwork_d.py => src/mem/ruby/network/garnet/flexible-pipeline/GarnetLink.py
rename : src/mem/ruby/network/garnet/fixed-pipeline/GarnetNetwork_d.py => src/mem/ruby/network/garnet/flexible-pipeline/GarnetRouter.py
Frame buffer and boot linux:
./build/ARM_FS/m5.opt configs/example/fs.py --benchmark=ArmLinuxFrameBuf --kernel=vmlinux.touchkit
Linux from a CF card:
./build/ARM_FS/m5.opt configs/example/fs.py --benchmark=ArmLinuxCflash --kernel=vmlinux.touchkit
Run Android
./build/ARM_FS/m5.opt configs/example/fs.py --benchmark=ArmAndroid --kernel=vmlinux.android
Run MP
./build/ARM_FS/m5.opt configs/example/fs.py --benchmark=ArmLinuxCflash --kernel=vmlinux.mp-2.6.38
This patch moves the assignment of testsys.switch_cpus, testsys.switch_cpus_1,
switch_cpu_list, and switch_cpu_list1 outside of the for loop so they are
assigned only once, after switch_cpus and switch_cpus_1 are constructed.
The tester code is in testers/networktest.
The tester can be invoked by configs/example/ruby_network_test.py.
A dummy coherence protocol called Network_test is also addded for network-only simulations and testing. The protocol takes in messages from the tester and just pushes them into the network in the appropriate vnet, without storing any state.
Now, instead of --bench benchname, you can do --bench bench1-bench2-bench3 and it will
set up a simulation that instantiates those three workloads. Only caveat is that now,
for sanity checking, your -n X must match the number of benches in the list.
This change fixes the problem for all the cases we actively use. If you want to try
more creative I/O device attachments (E.g. sharing an L2), this won't work. You
would need another level of caching between the I/O device and the cache
(which you actually need anyway with our current code to make sure writes
propagate). This is required so that you can mark the cache in between as
top level and it won't try to send ownership of a block to the I/O device.
Asserts have been added that should catch any issues.
makeArmSystem creates both bare-metal and Linux systems more cleanly.
machine_type was never optional though listed as an optional argument; a system
such as "RealView_PBX" must now be explicitly specified. Now that it is a
required argument, the placement of the arguments has changed slightly
requiring some changes to calls that create ARM systems.
It's confusing (especially to new users), when you are setting some standard
parameters (as defined in Options.py) and they aren't reflected in the simulations
so we might as well link the settings in CacheConfig.py to those in Options.py
This way things that don't care about work count options and/or aren't called
by something that has those command line options set up doesn't have to build
a fake object to carry in inert values.
This makes sure that the address ranges requested for caches and uncached ports
don't conflict with each other, and that accesses which are always uncached
(message signaled interrupts for instance) don't waste time passing through
caches.
The disk image to use was always being forced to a particular value. This
change changes what disk image is selected as the default based on the
architecture being built. In the future, a more sophisticated system might be
used that selected a path based on certain rules instead of relying on one off
file names.
M5 skips over any simulated time where it doesn't have any work to do. When
the simulation is active, the time skipped is short and the work done at any
point in time is relatively substantial. If the time between events is long
and/or the work to do at each event is small, it's possible for simulated time
to pass faster than real time. When running a benchmark that can be good
because it means the simulation will finish sooner in real time. When
interacting with the real world through, for instance, a serial terminal or
bridge to a real network, this can be a problem. Human or network response time
could be greatly exagerated from the perspective of the simulation and make
simulated events happen "too soon" from an external perspective.
This change adds the capability to force the simulation to run no faster than
real time. It does so by scheduling a periodic event that checks to see if
its simulated period is shorter than its real period. If it is, it stalls the
simulation until they're equal. This is called time syncing.
A future change could add pseudo instructions which turn time syncing on and
off from within the simulation. That would allow time syncing to be used for
the interactive parts of a session but then turned off when running a
benchmark using the m5 utility program inside a script. Time syncing would
probably not happen anyway while running a benchmark because there would be
plenty of work for M5 to do, but the event overhead could be avoided.
This patch adds an option to the script Ruby.py for setting the parameter
m_random_seed used for randomizing delays in the memory system. The option
can be specified as "--random_seed <seed value>".
Most of the messages in the config scripts that report a time value already
print "@ tick" followed by the current tick value, but a few were printing
"@ cycle". Since this is a distinction that's frequently confusing to new
users, this changes those message to the more accurate and consistent "@ tick".