Commit graph

7885 commits

Author SHA1 Message Date
Korey Sewell f80508de65 inorder: cache port blocking
set the request to false when the cache port blocks so we dont deadlock.
also, comment out the outstanding address list sanity check for now.
2011-02-04 00:08:19 -05:00
Korey Sewell 0c6a679359 inorder: stage width as a python parameter
allow the user to specify how many instructions a pipeline stage can process
on any given cycle (stageWidth...i.e.bandwidth) by setting the parameter through
the python interface rather than compile the code after changing the *.cc file.
(we always had the parameter there, but still used the static 'ThePipeline::StageWidth'
instead)
-
Since StageWidth is now dynamically defined, change the interstage communication
structure to use a vector and get rid of array and array handling index (toNextStageIndex)
since we can just make calls to the list for the same information
2011-02-04 00:08:18 -05:00
Korey Sewell 8ac717ef4c inorder: multi-issue branch resolution
Only execute (resolve) one branch per cycle because handling more than one is
a little more complicated
2011-02-04 00:08:17 -05:00
Korey Sewell be17617990 inorder: pipe. stage inst. buffering
use skidbuffer as only location for instructions between stages. before,
we had the insts queue from the prior stage and the skidbuffer for the
current stage, but that gets confusing and this consolidation helps
when handling squash cases
2011-02-04 00:08:16 -05:00
Korey Sewell 050944dd73 inorder: change skidBuffer to list instead of queue
manage insertion and deletion like a queue but will need
access to internal elements for future changes
Currently, skidbuffer manages any instruction that was
in a stage but could not complete processing, however
we will want to manage all blocked instructions (from prev stage
and from cur. stage) in just one buffer.
2011-02-04 00:08:15 -05:00
Korey Sewell 7f937e11e2 inorder: activity tracking bug
Previous code was marking CPU activity on almost every cycle due to a bug in
tracking the status of pipeline stages. This disables the CPU from sleeping
on long latency stalls and increases simulation time
2011-02-04 00:08:13 -05:00
Gabe Black 091a3e6cc0 Fault: Rename sim/fault.hh to fault_fwd.hh to distinguish it from faults.hh.
--HG--
rename : src/sim/fault.hh => src/sim/fault_fwd.hh
2011-02-03 21:47:58 -08:00
Gabe Black fd26707731 Mem,X86: Make the IO bridge pass APIC messages back towards the CPU. 2011-02-03 20:56:27 -08:00
Gabe Black 00f24ae92c Config: Keep track of uncached and cached ports separately.
This makes sure that the address ranges requested for caches and uncached ports
don't conflict with each other, and that accesses which are always uncached
(message signaled interrupts for instance) don't waste time passing through
caches.
2011-02-03 20:23:00 -08:00
Gabe Black 869a046e41 O3: Fix a style bug in O3. 2011-02-02 23:34:14 -08:00
Gabe Black cb22bead7d X86: Get rid of the stupd microop. 2011-02-02 19:57:12 -08:00
Gabe Black 54f88d84c2 Stats: Update the x86 stats to reflect changing stupd to a store and update. 2011-02-02 19:56:49 -08:00
Gabe Black eabbdbee63 X86: Replace the stupd microop with a store/update sequence. 2011-02-02 19:56:38 -08:00
Gabe Black a7d4fa45b4 X86: Build O3 by default in SE. 2011-02-02 18:17:16 -08:00
Gabe Black 75d34c14fc Time: Add serialization functions to the Time class. 2011-02-02 18:05:03 -08:00
Gabe Black c4b81d311e X86: Change how the default disk image gets set up.
The disk image to use was always being forced to a particular value. This
change changes what disk image is selected as the default based on the
architecture being built. In the future, a more sophisticated system might be
used that selected a path based on certain rules instead of relying on one off
file names.
2011-02-02 18:03:58 -08:00
Gabe Black 119f5f8e94 X86: Add L1 caches for the TLB walkers.
Small L1 caches are connected to the TLB walkers when caches are used. This
allows them to participate in the coherence protocol properly.
2011-02-01 18:28:41 -08:00
Gabe Black 4b4cd0303e Fault: Move the definition of NoFault from faults.hh to fault.hh.
Moving the definition of NoFault into fault.hh doesn't bring any new
dependencies with it, and allows some files to include just fault.hh which has
less baggage. NoFault will still be available to everything that includes
faults.hh because it includes fault.hh.
2011-01-31 13:13:00 -08:00
Nathan Binkert 048b1e5843 refcnt: Change things around so that we handle constness correctly.
To use a non const pointer:
typedef RefCountingPtr<Foo> FooPtr;

To use a const pointer:
typedef RefCountingPtr<const Foo> ConstFooPtr;
2011-01-22 21:48:06 -08:00
Gabe Black 0e64e1be0f SConstruct: Fix the librt check in SConstruct. 2011-01-21 17:51:22 -08:00
Steve Reinhardt 5c99ae60b8 checkpointing: fix bug from curTick accessor conversion.
Regex replacement of curTick with curTick() accidentally
changed checkpoint key string for serialization but not
for unserialization.
2011-01-20 22:13:33 -08:00
Gabe Black ddeaf1252f TimeSync: Use the new setTick and getTick functions. 2011-01-19 16:22:23 -08:00
Gabe Black 23bab6783b Time: Add setTick and getTick functions to the Time class. 2011-01-19 16:22:15 -08:00
Gabe Black a368fba7d4 Time: Add a mechanism to prevent M5 from running faster than real time.
M5 skips over any simulated time where it doesn't have any work to do. When
the simulation is active, the time skipped is short and the work done at any
point in time is relatively substantial. If the time between events is long
and/or the work to do at each event is small, it's possible for simulated time
to pass faster than real time. When running a benchmark that can be good
because it means the simulation will finish sooner in real time. When
interacting with the real world through, for instance, a serial terminal or
bridge to a real network, this can be a problem. Human or network response time
could be greatly exagerated from the perspective of the simulation and make
simulated events happen "too soon" from an external perspective.

This change adds the capability to force the simulation to run no faster than
real time. It does so by scheduling a periodic event that checks to see if
its simulated period is shorter than its real period. If it is, it stalls the
simulation until they're equal. This is called time syncing.

A future change could add pseudo instructions which turn time syncing on and
off from within the simulation. That would allow time syncing to be used for
the interactive parts of a session but then turned off when running a
benchmark using the m5 utility program inside a script. Time syncing would
probably not happen anyway while running a benchmark because there would be
plenty of work for M5 to do, but the event overhead could be avoided.
2011-01-19 11:48:00 -08:00
Ali Saidi f7885b8f26 ARM/O3: Add regressions for ARM w/ O3 CPU. 2011-01-18 16:30:06 -06:00
Ali Saidi 9b67f3723e Stats: Update stats for previous set of patches. 2011-01-18 16:30:06 -06:00
Matt Horsnell 77853b9f52 O3: Fix itstate prediction and recovery.
Any change of control flow now resets the itstate to 0 mask and 0 condition,
except where the control flow alteration write into the cpsr register. These
case, for example return from an iterrupt, require the predecoder to recover
the itstate.

As there is a window of opportunity between the return from an interrupt
changing the control flow at the head of the pipe and the commit of the update
to the CPSR, the predecoder needs to be able to grab the ITstate early. This
is now handled by setting the forcedItState inside a PCstate for the control
flow altering instruction.

That instruction will have the correct mask/cond, but will not have a valid
itstate until advancePC is called (note this happens to advance the execution).
When the new PCstate is copy constructed it gets the itstate cond/mask, and
upon advancing the PC the itstate becomes valid.

Subsequent advancing invalidates the state and zeroes the cond/mask. This is
handled in isolation for the ARM ISA and should have no impact on other ISAs.

Refer arch/arm/types.hh and arch/arm/predecoder.cc for the details.
2011-01-18 16:30:05 -06:00
Matt Horsnell b13a79ee71 O3: Fix some variable length instruction issues with the O3 CPU and ARM ISA. 2011-01-18 16:30:05 -06:00
Matt Horsnell c98df6f8c2 O3: Don't test misprediction on load instructions until executed. 2011-01-18 16:30:05 -06:00
Ali Saidi 1167ef19cf O3: Keep around the last committed instruction and use for squashing.
Without this change 0 is always used for the youngest sequence number if
a squash occured and the ROB was empty (E.g. an instruction is marked
serializeAfter or a fetch stall prevents other instructions from issuing).
Using 0 there is a race to rename where an instruction that committed the
same cycle as the squashing instruction can have it's renamed state undone
by the squash using sequence number 0.
2011-01-18 16:30:05 -06:00
Ali Saidi ea058b14da O3: Don't try to scoreboard misc registers.
I'm not positive this is the correct fix, but it's working right now.
Either we need to do something like this, prevent the misc reg from being renamed at all,
or there something else going on. We need to find the root cause as to why
this is only a problem sometimes.
2011-01-18 16:30:05 -06:00
Matt Horsnell adbd84ab9f ARM: The ARM decoder should not panic when decoding undefined holes is arch.
This can abort simulations when the fetch unit runs ahead and speculatively
decodes instructions that are off the execution path.
2011-01-18 16:30:05 -06:00
Matt Horsnell 11bef2ab38 O3: Fix corner cases where multiple squashes/fetch redirects overwrite timebuf. 2011-01-18 16:30:05 -06:00
Matt Horsnell 62f2097917 O3: Fix mispredicts from non control instructions.
The squash inside the fetch unit should not attempt to remove them from the
branch predictor as non-control instructions are not pushed into the predictor.
2011-01-18 16:30:05 -06:00
Matt Horsnell 5ebf3b2808 O3: Fixes the way prefetches are handled inside the iew unit.
This patch prevents the prefetch being added to the instCommit queue twice.
2011-01-18 16:30:02 -06:00
Ali Saidi ee9a331fe5 O3: Support timing translations for O3 CPU fetch. 2011-01-18 16:30:02 -06:00
Ali Saidi 0f9a3671b6 ARM: Add support for moving predicated false dest operands from sources. 2011-01-18 16:30:02 -06:00
Min Kyu Jeong 96375409ea O3: Fixes fetch deadlock when the interrupt clears before CPU handles it.
When this condition occurs the cpu should restart the fetch stage to fetch from
the original execution path. Fault handling in the commit stage is cleaned up a
little bit so the control flow is simplier. Finally, if an instruction is being
used to carry a fault it isn't executed, so the fault propagates appropriately.
2011-01-18 16:30:01 -06:00
Ali Saidi 965a01d913 ARM: Use an actual NOP instead of a instruction that happens to do nothing 2011-01-18 16:30:01 -06:00
Ali Saidi a3232b534b ARM: fix mismatched new/delete. 2011-01-18 16:30:01 -06:00
Ali Saidi 25822ee655 mkblankimage: bash != sh on many systems and this script needs bash 2011-01-18 16:30:00 -06:00
Ali Saidi e91a6b7558 ARM: Add code for a simple bootloader for MP boot. 2011-01-18 16:29:59 -06:00
Gabe Black a39096a8c3 Unit tests: Convert the refcnttest unit test to use the new EXPECT macros. 2011-01-18 01:27:04 -08:00
Gabe Black c04571d601 Unit tests: Define a header file for common unit testing functions/macros. 2011-01-18 01:26:55 -08:00
Nathan Binkert 318bfe9d4f time: improve time datastructure
Use posix clock functions (and librt) if it is available.
Inline a bunch of functions and implement more operators.
* * *
time: more cleanup
2011-01-15 07:48:25 -08:00
Nilay Vaish c82a8979a3 Change interface between coherence protocols and CacheMemory
The purpose of this patch is to change the way CacheMemory interfaces with
coherence protocols. Currently, whenever a cache controller (defined in the
protocol under consideration) needs to carry out any operation on a cache
block, it looks up the tag hash map and figures out whether or not the block
exists in the cache. In case it does exist, the operation is carried out
(which requires another lookup). As observed through profiling of different
protocols, multiple such lookups take place for a given cache block. It was
noted that the tag lookup takes anything from 10% to 20% of the simulation
time. In order to reduce this time, this patch is being posted.

I have to acknowledge that the many of the thoughts that went in to this
patch belong to Brad.

Changes to CacheMemory, TBETable and AbstractCacheEntry classes:
1. The lookup function belonging to CacheMemory class now returns a pointer
to a cache block entry, instead of a reference. The pointer is NULL in case
the block being looked up is not present in the cache. Similar change has
been carried out in the lookup function of the TBETable class.
2. Function for setting and getting access permission of a cache block have
been moved from CacheMemory class to AbstractCacheEntry class.
3. The allocate function in CacheMemory class now returns pointer to the
allocated cache entry.

Changes to SLICC:
1. Each action now has implicit variables - cache_entry and tbe. cache_entry,
if != NULL, must point to the cache entry for the address on which the action
is being carried out. Similarly, tbe should also point to the transaction
buffer entry of the address on which the action is being carried out.
2. If a cache entry or a transaction buffer entry is passed on as an
argument to a function, it is presumed that a pointer is being passed on.
3. The cache entry and the tbe pointers received __implicitly__ by the
actions, are passed __explicitly__ to the trigger function.
4. While performing an action, set/unset_cache_entry, set/unset_tbe are to
be used for setting / unsetting cache entry and tbe pointers respectively.
5. is_valid() and is_invalid() has been made available for testing whether
a given pointer 'is not NULL' and 'is NULL' respectively.
6. Local variables are now available, but they are assumed to be pointers
always.
7. It is now possible for an object of the derieved class to make calls to
a function defined in the interface.
8. An OOD token has been introduced in SLICC. It is same as the NULL token
used in C/C++. If you are wondering, OOD stands for Out Of Domain.
9. static_cast can now taken an optional parameter that asks for casting the
given variable to a pointer of the given type.
10. Functions can be annotated with 'return_by_pointer=yes' to return a
pointer.
11. StateMachine has two new variables, EntryType and TBEType. EntryType is
set to the type which inherits from 'AbstractCacheEntry'. There can only be
one such type in the machine. TBEType is set to the type for which 'TBE' is
used as the name.

All the protocols have been modified to conform with the new interface.
2011-01-17 18:46:16 -06:00
Gabe Black 6fb521faba SPARC: Update stats for the call r15 as source change. 2011-01-15 15:30:34 -08:00
Gabe Black 371603f12c SPARC: Adjust the "call" instruction so R15 doesn't get marked as a source. 2011-01-15 15:30:17 -08:00
Nilay Vaish bec0103bb4 Regression Tests: Update the output for MESI_CMP_directory
This patch updates the output for regression tests that are carried out on
MESI_CMP_directory protocol. The changes made to the protocol in order to
remove the bugs present result in regression failure for the 60.rubytest.
Since the earlier protocol was incorrect, so we certainly cannot relay on the
earlier reference output. Hence, the update.
2011-01-13 22:48:03 -06:00
Nilay Vaish 47ba26f6b3 Ruby: Fixes MESI CMP directory protocol
The current implementation of MESI CMP directory protocol is broken.
This patch, from Arkaprava Basu, fixes the protocol.
2011-01-13 22:17:11 -06:00