Commit graph

1551 commits

Author SHA1 Message Date
Gabe Black
8ad2b8c559 SE/FS: Make the functions available from the TC consistent between SE and FS. 2011-10-31 02:58:22 -07:00
Gabe Black
d735abe5da GCC: Get everything working with gcc 4.6.1.
And by "everything" I mean all the quick regressions.
2011-10-31 01:09:44 -07:00
Gabe Black
facb40f3ff SE/FS: Make getProcessPtr available in both modes, and get rid of FULL_SYSTEMs. 2011-10-30 00:33:02 -07:00
Gabe Black
5b433568f0 SE/FS: Build the base process class in FS. 2011-10-30 00:32:54 -07:00
Gabe Black
464c485d0c SE/FS: Include getMemPort in FS. 2011-10-16 05:06:40 -07:00
Gabe Black
3595b0c5a1 SE/FS: Build/expose vport in SE mode. 2011-10-16 05:06:39 -07:00
Gabe Black
b2af015b97 ARM: Turn on the page table walker on ARM in SE mode. 2011-10-16 05:06:38 -07:00
Gabe Black
e8e9f97312 CPU: Make physPort and getPhysPort available in SE mode. 2011-10-16 02:59:53 -07:00
Gabe Black
8adc6781bf X86: Turn on the page table walker in SE mode. 2011-10-13 02:22:23 -07:00
Gabe Black
f338d60930 SE/FS: Build the Interrupt objects in SE mode. 2011-10-09 00:15:50 -07:00
Gabe Black
51f7a66660 SE/FS: Build the devices in SE mode. 2011-09-30 00:28:33 -07:00
Gabe Black
4fcf8e9959 O3: Tidy up some DPRINTFs in the LSQ. 2011-09-27 00:25:26 -07:00
Gabe Black
44ed4849d4 Faults: Replace calls to genMachineCheckFault with M5PanicFault. 2011-09-27 00:24:43 -07:00
Nilay Vaish
56bddab189 LSQ: Moved a couple of lines to enable O3 + Ruby
This patch makes O3 CPU work along with the Ruby memory model. Ruby
overwrites the senderState pointer with another pointer. The pointer
is restored only when Ruby gets done with the packet. LSQ makes use of
senderState just after sendTiming() returns. But the dynamic_cast returns
a NULL pointer since Ruby's senderState pointer is from a different class.
Storing the senderState pointer before calling sendTiming() does away with
the problem.
2011-09-26 12:18:32 -05:00
Steve Reinhardt
84f0a1bd91 event: minor cleanup
Initialize flags via the Event constructor instead of calling
setFlags() in the body of the derived class's constructor.  I
forget exactly why, but this made life easier when implementing
multi-queue support.

Also rename Event::getFlags() to isFlagSet() to better match
common usage, and get rid of some unused Event methods.
2011-09-22 18:59:55 -07:00
Gabe Black
10c2e37f60 Syscall: Make the syscall function available in both SE and FS modes.
In FS mode the syscall function will panic, but the interface will be
consistent and code which calls syscall can be compiled in. This will allow,
for instance, instructions that use syscall to be built unconditionally but
then not returned by the decoder.
2011-09-19 02:46:48 -07:00
Ali Saidi
649c239cee LSQ: Only trigger a memory violation with a load/load if the value changes.
Only create a memory ordering violation when the value could have changed
between two subsequent loads, instead of just when loads go out-of-order
to the same address. While not very common in the case of Alpha, with
an architecture with a hardware table walker this can happen reasonably
frequently beacuse a translation will miss and start a table walk and
before the CPU re-schedules the faulting instruction another one will
pass it to the same address (or cache block depending on the dendency
checking).

This patch has been tested with a couple of self-checking hand crafted
programs to stress ordering between two cores.

The performance improvement on SPEC benchmarks can be substantial (2-10%).
2011-09-13 12:58:08 -04:00
Gabe Black
49a7ed0397 StaticInst: Merge StaticInst and StaticInstBase.
Having two StaticInst classes, one nominally ISA dependent and the other ISA
dependent, has not been historically useful and makes the StaticInst class
more complicated that it needs to be. This change merges StaticInstBase into
StaticInst.
2011-09-09 02:40:11 -07:00
Gabe Black
b7b545bc38 Decode: Pull instruction decoding out of the StaticInst class into its own.
This change pulls the instruction decoding machinery (including caches) out of
the StaticInst class and puts it into its own class. This has a few intrinsic
benefits. First, the StaticInst code, which has gotten to be quite large, gets
simpler. Second, the code that handles decode caching is now separated out
into its own component and can be looked at in isolation, making it easier to
understand. I took the opportunity to restructure the code a bit which will
hopefully also help.

Beyond that, this change also lays some ground work for each ISA to have its
own, potentially stateful decode object. We'd be able to include less
contextualizing information in the ExtMachInst objects since that context
would be applied at the decoder. Also, the decoder could "know" ahead of time
that all the instructions it's going to see are going to be, for instance, 64
bit mode, and it will have one less thing to check when it decodes them.
Because the decode caching mechanism has been separated out, it's now possible
to have multiple caches which correspond to different types of decoding
context. Having one cache for each element of the cross product of different
configurations may become prohibitive, so it may be desirable to clear out the
cache when relatively static state changes and not to have one for each
setting.

Because the decode function is no longer universally accessible as a static
member of the StaticInst class, a new function was added to the ThreadContexts
that returns the applicable decode object.
2011-09-09 02:30:01 -07:00
Ali Saidi
b6203360ef LSQ: Set store predictor to periodically clear itself as recommended in the storesets paper.
This patch improves performance by as much as 10% on some spec benchmarks.
2011-08-19 15:08:07 -05:00
Geoffrey Blake
5f425b8bd1 Fix bugs due to interaction between SEV instructions and O3 pipeline
SEV instructions were originally implemented to cause asynchronous squashes
via the generateTCSquash() function in the O3 pipeline when updating the
SEV_MAILBOX miscReg. This caused race conditions between CPUs in an MP system
that would lead to a pipeline either going inactive indefinitely or not being
able to commit squashed instructions. Fixed SEV instructions to behave like
interrupts and cause synchronous sqaushes inside the pipeline, eliminating
the race conditions. Also fixed up the semantics of the WFE instruction to
behave as documented in the ARMv7 ISA description to not sleep if SEV_MAILBOX=1
or unmasked interrupts are pending.
2011-08-19 15:08:07 -05:00
Mrinmoy Ghosh
d0e0485902 LSQ: Add some better dprintfs for storeset predictor. 2011-08-19 15:08:05 -05:00
Mrinmoy Ghosh
0db95030fc LSQ: Fix a few issues with the storeset predictor.
Two issues are fixed in this patch:
1. The load and store pc passed to the predictor are passed in reverse order.
2. The flag indicating that a barrier is inflight was never cleared when
   the barrier was squashed instead of committed. This made all load insts
   dependent on a non-existent barrier in-flight.
2011-08-19 15:08:05 -05:00
Giacomo Gabrielli
676a530b77 O3: Squash the violator and younger instructions instead not all insts.
Change the way instructions are squashed on memory ordering violations
to squash the violator and younger instructions, not all instructions
that are younger than the instruction they violated (no reason to throw
away valid work).
2011-08-19 15:08:05 -05:00
Gabe Black
f2c89a01d1 InOrder: Make cache_unit.hh include hashmap.hh explicitly, not transitively. 2011-08-16 02:47:15 -07:00
Gabe Black
78a4636a13 O3: Make lsq_unit.hh include arch/isa_traits.hh directly, not transitively. 2011-08-16 02:46:57 -07:00
Gabe Black
0e6dc00497 O3: When squashing, restore the macroop that should be used for fetching. 2011-08-14 17:41:34 -07:00
Gabe Black
ec204f003c O3: Add a pointer to the macroop for a microop in the dyninst. 2011-08-14 04:08:14 -07:00
Gabe Black
e0043f8dbe O3: At the end of an instruction, force fetchAddr to something sensible.
It's possible (though until now very unlikely) for fetchAddr to get out of
sync with the actual PC of the current instruction. This change forcefull
resets fetchAddr at the end of every instruction.
2011-08-13 13:36:37 -07:00
Gabe Black
96df6bedb7 O3: Stop using the current macroop no matter why you're leaving it.
Until now, the only reason a macroop would be left was because it ended at a
microop marked as the last microop. In O3 with branch prediction, it's
possible for the branch predictor to have entries which originally came from
different instructions which happened to have the same RIP. This could
theoretically happen in many ways, but it was encountered specifically when
different programs in different address spaces ran one after the other in
X86_FS.

What would happen in that case was that the macroop would continue to be
looped over and microops fetched from it until it reached the last microop
even though the macropc had moved out from under it. If things lined up
properly, this could mean that the end bytes of an instruction actually fell
into the instruction sized block of memory after the one in the predecoder.
The fetch loop implicitly assumes that the last instruction sized chunk of
memory processed was the last one needed for the instruction it just finished
executing. It would then tell the predecoder to move to an offset within the
bytes it was given that is larger than those bytes, and that would trip an
assert in the x86 predecoder.

This change fixes this problem by making fetch stop processing the current
macroop if the address it should be fetching from changed when the PC is
updated. That happens when the last microop was reached because the instruction
handled it properly, and it also catches the case where the branch predictor
makes fetch do a macro level branch when it shouldn't.

The check of isLastMicroop is retained because otherwise, a macroop that
branches back to itself would act like a single, long macroop instead of
multiple instances of the same microop. There may be situations (which may
turn out to be purely hypothetical) where that matters.

This also fixes a relatively minor issue where the curMacroop variable would
be set to NULL immediately after seeing that a microop was the last one before
curMacroop was used to build the dyninst. The traceData structure would have a
NULL pointer to the macroop for that microop.
2011-08-09 11:30:43 -07:00
Gabe Black
3989f41261 O3: When waiting to handle an interrupt, let everything drain out.
Before this change, the commit stage would wait until the ROB and store queue
were empty before recognizing an interrupt. The fetch stage would stop
generating instructions at an appropriate point, so commit would then wait
until a valid time to interrupt the instruction stream. Instructions might be
in flight after fetch but not the in the ROB or store queue (in rename, for
instance), so this change makes commit wait until all in flight instructions
are finished.
2011-08-09 03:37:43 -07:00
Nilay Vaish
821dfc1289 BuildEnv: Eliminate RUBY as build environment variable
This patch replaces RUBY with PROTOCOL in all the SConscript files as
the environment variable that decides whether or not certain components
of the simulator are compiled.
2011-08-08 10:50:13 -05:00
Gabe Black
5c0e6e6092 O3: Get rid of the unused addToRemoveList function. 2011-08-07 15:41:10 -07:00
Gabe Black
a9b7931156 O3: Let squashed and deferred instructions issue.
Let squahsed and deferred instructions issue so they don't accumulate and clog
up the CPU.
2011-08-07 15:41:07 -07:00
Ali Saidi
4d83b8a799 O3: Fix uninitialized variable in the tournament branch predictor. 2011-08-07 09:21:49 -07:00
Gabe Black
16882b0483 Translation: Use a pointer type as the template argument.
This allows regular pointers and reference counted pointers without having to
use any shim structures or other tricks.
2011-08-07 09:21:48 -07:00
Gabe Black
6230668f5e O3: Get rid of the raw ExtMachInst constructor on DynInsts.
This constructor assumes that the ExtMachInst can be decoded directly into a
StaticInst that's useful to execute. With the advent of microcoded
instructions that's no longer true.
2011-08-02 11:51:16 -07:00
Gabe Black
206c2e9a0e O3: Implement memory mapped IPRs for O3. 2011-07-31 19:21:17 -07:00
Gabe Black
a42c6ae48d O3: Fix corner case squashing into the microcode ROM.
When fetching from the microcode ROM, if the PC is set so that it isn't in the
cache block that's been fetched the CPU will get stuck. The fetch stage
notices that it's in the ROM so it doesn't try to fetch from the current PC.
It then later notices that it's outside of the current cache block so it skips
generating instructions expecting to continue once the right bytes have been
fetched. This change lets the fetch stage attempt to generate instructions,
and only checks if the bytes it's going to use are valid if it's really going
to use them.
2011-07-30 23:22:53 -07:00
Giacomo Gabrielli
69ef57fd0f O3: Create a pipeline activity viewer for the O3 CPU model.
Implemented a pipeline activity viewer as a python script (util/o3-pipeview.py)
and modified O3 code base to support an extra trace flag (O3PipeView) for
generating traces to be used as inputs by the tool.
2011-07-15 11:53:35 -05:00
Mrinmoy Ghosh
3396fd9e84 Branch predictor: Fixes the tournament branch predictor.
Branch predictor could not predict a branch in a nested loop because:
 1. The global history was not updated after a mispredict squash.
 2. The global history was updated in the fetch stage. The choice predictors
    that were updated  used the changed global history. This is incorrect, as
    it incorporates the state of global history after the branch in
    encountered. Fixed update to choice predictor using the global history
    state before the branch happened.
 3. The global predictor table was also updated using the global history state
    before the branch happened as above.

Additionally, parameters to initialize ctr and history size were reversed.
2011-07-10 12:56:08 -05:00
Geoffrey Blake
c7e7b89058 O3: Fix up pipelining icache accesses in fetch stage to function properly
Fixed up the patch from Yasuko Watanabe that enabled pipelining of fetch accessess to
icache to work with recent changes to main repository.
Also added in ability for fetch stage to delay issuing the fault carrying
nop when a pipeline fetch causes a fault and no fetch bandwidth is available
until the next cycle.
2011-07-10 12:56:08 -05:00
Ali Saidi
60579e8d74 O3: Make sure fetch doesn't go off into the weeds during speculation. 2011-07-10 12:56:08 -05:00
Gabe Black
3a1428365a ExecContext: Rename the readBytes/writeBytes functions to readMem and writeMem.
readBytes and writeBytes had the word "bytes" in their names because they
accessed blobs of bytes. This distinguished them from the read and write
functions which handled higher level data types. Because those functions don't
exist any more, this change renames readBytes and writeBytes to more general
names, readMem and writeMem, which reflect the fact that they are how you read
and write memory. This also makes their names more consistent with the
register reading/writing functions, although those are still read and set for
some reason.
2011-07-02 22:35:04 -07:00
Gabe Black
2e7426664a ExecContext: Get rid of the now unused read/write templated functions. 2011-07-02 22:34:58 -07:00
Brad Beckmann ext:(%2C%20Nilay%20Vaish%20%3Cnilay%40cs.wisc.edu%3E)
c86f849d5a Ruby: Add support for functional accesses
This patch rpovides functional access support in Ruby. Currently only
the M5Port of RubyPort supports functional accesses. The support for
functional through the PioPort will be added as a separate patch.
2011-06-30 19:49:26 -05:00
Gabe Black
affad29932 InOder: Fix a compile error. 2011-06-20 02:29:14 -07:00
Korey Sewell
477e7039b3 inorder: clear reg. dep entry after removing from list
this will safeguard future code from trying to remove
from the list twice. That code wouldnt break but would
waste time.
2011-06-19 21:43:42 -04:00
Korey Sewell
b963b339b9 inorder: se: squash after syscalls 2011-06-19 21:43:42 -04:00
Korey Sewell
eedd04e894 inorder: cleanup dprintfs in cache unit 2011-06-19 21:43:42 -04:00
Korey Sewell
078f914e69 inorder: SE mode TLB faults
handle them like we do in FS mode, by blocking the TLB until the fault
is handled by the fault->invoke()
2011-06-19 21:43:42 -04:00
Korey Sewell
3cb23bd3a2 inorder:tracing: fix fault tracing bug 2011-06-19 21:43:42 -04:00
Korey Sewell
fe3a2aa4a3 inorder: se compile fixes 2011-06-19 21:43:42 -04:00
Korey Sewell
e572c01120 inorder: add necessary debug flag header files 2011-06-19 21:43:41 -04:00
Korey Sewell
91a88ae8ce inorder: clear fetchbuffer on traps
implement clearfetchbufferfunction
extend predecoder to use multiple threads and clear those on trap
2011-06-19 21:43:41 -04:00
Korey Sewell
2dae0e8735 inorder: use separate float-reg bits function in dyninst
this will make sure we get the correct view of a FP register
2011-06-19 21:43:41 -04:00
Korey Sewell
8c0def8d03 inorder: use trapPending flag to manage traps 2011-06-19 21:43:41 -04:00
Korey Sewell
5ef0b7a9db inorder/dtb: make sure DTB translate correct address
The DTB expects the correct PC in the ThreadContext
but how if the memory accesses are speculative? Shouldn't
we send along the requestor's PC to the translate functions?
2011-06-19 21:43:41 -04:00
Korey Sewell
716e447da8 inorder: handle serializing instructions
including IPR accesses and store-conditionals. These class of instructions will not
execute correctly in a superscalar machine
2011-06-19 21:43:41 -04:00
Korey Sewell
561c33f082 inorder: dont handle multiple faults on same cycle
if a faulting instruction reaches an execution unit,
then ignore it and pass it through the pipeline.

Once we recognize the fault in the graduation unit,
dont allow a second fault to creep in on the same cycle.
2011-06-19 21:43:40 -04:00
Korey Sewell
c4deabfb97 inorder: register ports for FS mode
handle "snoop" port registration as well as functional
port setup for FS mode
2011-06-19 21:43:40 -04:00
Korey Sewell
f1c3691356 inorder: check for interrupts each tick
use a dummy instruction to facilitate the squash after
the interrupts trap
2011-06-19 21:43:40 -04:00
Korey Sewell
0bfdf342da inorder: explicit fault check
Before graduating an instruction, explicitly check fault
by making the fault check it's own separate command
that can be put on an instruction schedule.
2011-06-19 21:43:40 -04:00
Korey Sewell
5f608dd2e9 inorder: squash and trap behind a tlb fault 2011-06-19 21:43:39 -04:00
Korey Sewell
e0e387c2a9 inorder: stall stores on store conditionals & compare/swaps 2011-06-19 21:43:39 -04:00
Korey Sewell
e8b7df072b inorder: make InOrder CPU FS compilable/visible
make syscall a SE mode only functionality
copy over basic FS functions (hwrei) to make FS compile
2011-06-19 21:43:39 -04:00
Korey Sewell
d71b95d84d inorder: remove memdep tracking for default pipeline
speculative load/store pipelines can reenable this
2011-06-19 21:43:39 -04:00
Korey Sewell
b72bdcf4f8 inorder: fetchBuffer tracking
calculate blocks in use for the fetch buffer to figure out how many total blocks
are pending
2011-06-19 21:43:39 -04:00
Korey Sewell
4d4c7d79d0 inorder: redefine DynInst FP result type
Sharing the FP value w/the integer values was giving inconsistent results esp. when
their is a 32-bit integer register matched w/a 64-bit float value
2011-06-19 21:43:38 -04:00
Korey Sewell
db8b1e4b78 inorder: treat SE mode syscalls as a trapping instruction
define a syscallContext to schedule the syscall and then use syscall() to actually perform the action
2011-06-19 21:43:38 -04:00
Korey Sewell
c95fe261ab inorder: bug in mdu
segfault was caused by squashed multiply thats in the process of an event.
use isProcessing flag to handle this and cleanup the MDU code
2011-06-19 21:43:38 -04:00
Korey Sewell
4c979f9325 inorder: optionally track faulting instructions 2011-06-19 21:43:38 -04:00
Korey Sewell
22ba1718c4 inorder: cleanup events in resource pool
remove events in the resource pool that can be called from the CPU event, since the CPU
event is scheduled at the same time at the resource pool event.
----
Also, match the resPool event function names to the cpu event function names
----
2011-06-19 21:43:38 -04:00
Korey Sewell
e8082a28c8 inorder: don't stall after stores
once a ST is sent off, it's OK to keep processing, however it's a little more
complicated to handle the packet acknowledging the store is completed
2011-06-19 21:43:38 -04:00
Korey Sewell
379c23199e inorder: don't stall after stores
once a ST is sent off, it's OK to keep processing, however it's a little more
complicated to handle the packet acknowledging the store is completed
2011-06-19 21:43:37 -04:00
Korey Sewell
4c9ad53cc5 inorder: remove decode squash
also, cleanup comments for gem5.fast compilation
2011-06-19 21:43:37 -04:00
Korey Sewell
a444133e73 inorder: support for compare and swap insts
dont treat read() and write() fields as mut. exclusive
2011-06-19 21:43:37 -04:00
Korey Sewell
89d0f95bf0 inorder: branch predictor update
only update BTB on a taken branch and update branch predictor w/pcstate from instruction
---
only pay attention to branch predictor updates if the the inst. is in fact a branch
2011-06-19 21:43:37 -04:00
Korey Sewell
479195d4cf inorder: priority for grad/squash events
define separate priority resource pool squash and graduate events
2011-06-19 21:43:37 -04:00
Korey Sewell
71018f5e8b inorder: remove stalls on trap squash 2011-06-19 21:43:37 -04:00
Korey Sewell
34b2500f09 inorder: no dep. tracking for zero reg
this causes forwarding a bad value register value
2011-06-19 21:43:37 -04:00
Korey Sewell
d02fa0f6b6 imported patch recoverPCfromTrap 2011-06-19 21:43:37 -04:00
Korey Sewell
264e8178ff imported patch squash_from_next_stage 2011-06-19 21:43:36 -04:00
Korey Sewell
f0f33ae2b9 inorder: add flatDestReg member to dyninst
use it in reg. dep. tracking
2011-06-19 21:43:36 -04:00
Korey Sewell
555bd4d842 inorder: update event priorities
dont use offset to calculate this but rather an enum
that can be updated
2011-06-19 21:43:36 -04:00
Korey Sewell
7dea79535c inorder: implement trap handling 2011-06-19 21:43:36 -04:00
Korey Sewell
061b369d28 inorder: cleanup intercomm. structs/squash info 2011-06-19 21:43:35 -04:00
Korey Sewell
b195da9345 inorder: use setupSquash for misspeculation
implement a clean interface to handle branch misprediction and eventually all pipeline
flushing
2011-06-19 21:43:35 -04:00
Korey Sewell
73cfab8b23 inorder: DynInst handling of stores for big-endian ISAs
The DynInst was not performing the host-to-guest translation
which ended up breaking stores for SPARC
2011-06-19 21:43:35 -04:00
Korey Sewell
4f34bc8b7b inorder: make marking of dest. regs an explicit request
formerly, this was implicit when you accessed the execution unit
or the use-def unit but it's better that this just be something
that a user can specify.
2011-06-19 21:43:35 -04:00
Korey Sewell
946b0ed4f4 inorder: simplify handling of split accesses 2011-06-19 21:43:35 -04:00
Korey Sewell
1a6d25dc47 inorder: addtl functionaly for inst. skeds
add find and end functions for inst. schedules
that can search by stage number
2011-06-19 21:43:35 -04:00
Korey Sewell
8b54858831 inorder: register file stats
keep stats for int/float reg file usage instead
of aggregating across reg file types
2011-06-19 21:43:34 -04:00
Korey Sewell
085f30ff9c inorder: scheduling for nonspec insts
make handling of speculative and nonspeculative insts
more explicit
2011-06-19 21:43:34 -04:00
Korey Sewell
3c417ea23a inorder: find register dependencies "lazily"
Architectures like SPARC need to read the window pointer
in order to figure out it's register dependence. However,
this may not get updated until after an instruction gets
executed, so now we lazily detect the register dependence
in the EXE stage (execution unit or use_def). This
makes sure we get the mapping after the most current change.
2011-06-19 21:43:34 -04:00
Korey Sewell
bd67ee9852 inorder: assert on macro-ops
provide a sanity check for someone coding
a new architecture
2011-06-19 21:43:34 -04:00
Korey Sewell
ee7062d94d inorder: handle faults at writeback stage
call trap function when a fault is received
2011-06-19 21:43:34 -04:00
Korey Sewell
17f5749dbb inorder: ISA-zero reg handling
ignore writes to the ISA zero register
2011-06-19 21:43:34 -04:00
Korey Sewell
2a59fcfbe9 inorder: update support for branch delay slots 2011-06-19 21:43:34 -04:00
Korey Sewell
d4b4ef1324 inorder: inst. iterator cleanup
get rid of accessing iterators (for instructions) by reference
2011-06-19 21:43:34 -04:00
Korey Sewell
e2f9266dbf inorder: update bpred code
clean up control flow to make it easier to understand
2011-06-19 21:43:33 -04:00
Korey Sewell
6df6365095 inorder: add types for dependency checks 2011-06-19 21:43:33 -04:00
Korey Sewell
19e3eb2915 inorder: use flattenIdx for reg indexing
- also use "threadId()" instead of readTid() everywhere
- this will help support more complex ISA indexing
2011-06-19 21:43:33 -04:00
Korey Sewell
b2e5152e16 simple-thread: give a name() function for debugging w/the SimpleThread object 2011-06-19 21:43:33 -04:00
Korey Sewell
76c60c5f93 inorder: use m5_hash_map for skedCache
since we dont care about if the cache of instruction schedules is sorted or not,
then the hash map should be faster
2011-06-19 21:43:33 -04:00
Korey Sewell
c8b43641fd o3: missing newlines on some dprintfs 2011-06-10 22:15:32 -04:00
Korey Sewell
1a451cd2c5 sparc: compilation fixes for inorder
Add a few constants and functions that the InOrder model wants for SPARC.
* * *
sparc: add eaComp function
InOrder separates the address generation from the actual access so give
Sparc that functionality
* * *
sparc: add control flags for branches
branch predictors and other cpu model functions need to know specific information
about branches, so add the necessary flags here
2011-06-09 01:34:06 -04:00
Gabe Black
a59a143a25 gcc 4.0: Add some virtual destructors to make gcc 4.0 happy. 2011-06-07 00:24:49 -07:00
Nathan Binkert
2b1aa35e20 scons: rename TraceFlags to DebugFlags 2011-06-02 17:36:21 -07:00
Geoffrey Blake
d0b0a55515 O3: Fix offset calculation into storeQueue buffer for store->load forwarding
Calculation of offset to copy from storeQueue[idx].data structure for load to
store forwarding fixed to be difference in bytes between store and load virtual
addresses.  Previous method would induce bug where a load would index into
buffer at the wrong location.
2011-05-23 10:40:21 -05:00
Geoffrey Blake
c223b887fe O3: Fix issue w/wbOutstading being decremented multiple times on blocked cache.
If a split load fails on a blocked cache wbOutstanding can be decremented
twice if the first part of the split load succeeds and the second part fails.
Condition the decrementing on not having completed the first part of the load.
2011-05-23 10:40:19 -05:00
Geoffrey Blake
6dd996aabb O3: Fix issue with interrupts/faults occuring in the middle of a macro-op
This patch fixes two problems with the O3 cpu model. The first is an issue
with an instruction fetch causing a fault on the next address while the
current macro-op is being issued. This happens when the micro-ops exceed
the fetch bandwdith and then on the next cycle the fetch stage attempts
to issue a request to the next line while it still has micro-ops to issue
if the next line faults a fault is attached to a micro-op in the currently
executing macro-op rather than a "nop" from the next instruction block.
This leads to an instruction incorrectly faulting when on fetch when
it had no reason to fault.

A similar problem occurs with interrupts. When an interrupt occurs the
fetch stage nominally stops issuing instructions immediately. This is incorrect
in the case of a macro-op as the current location might not be interruptable.
2011-05-23 10:40:18 -05:00
Chander Sudanthi
4bf48a11ef Trace: Allow printing ASIDs and selectively tracing based on user/kernel code.
Debug flags are ExecUser, ExecKernel, and ExecAsid. ExecUser and
ExecKernel are set by default when Exec is specified.  Use minus
sign with ExecUser or ExecKernel to remove user or kernel tracing
respectively.
2011-05-13 17:27:00 -05:00
Geoffrey Blake
b79650ceaa O3: Fix an issue with a load & branch instruction and mem dep squashing
Instructions that load an address and are control instructions can
execute down the wrong path if they were predicted correctly and then
instructions following them are squashed. If an instruction is a
memory and control op use the predicted address for the next PC instead
of just advancing the PC. Without this change NPC is used for the next
instruction, but predPC is used to verify that the branch was successful
so the wrong path is silently executed.
2011-05-13 17:27:00 -05:00
Nathan Binkert
9c4c1419a7 work around gcc 4.5 warning 2011-05-09 16:34:11 -04:00
Tushar Krishna
1267ff5949 NetworkTest: added sim_cycles parameter to the network tester.
The network tester terminates after injecting for sim_cycles
(default=1000), instead of having to explicitly pass --maxticks from the
command line as before. If fixed_pkts is enabled, the tester only
injects maxpackets number of packets, else it keeps injecting till sim_cycles.
The tester also works with zero command line arguments now.
2011-05-07 17:43:30 -04:00
Ali Saidi
77bea2fb42 CPU: Add some useful debug message to the timing simple cpu. 2011-05-04 20:38:27 -05:00
Ali Saidi
6e634beb8a CPU: Fix a case where timing simple cpu faults can nest.
If we fault, change the state to faulting so that we don't fault again in the same cycle.
2011-05-04 20:38:27 -05:00
Ali Saidi
89e7bcca82 O3: Remove assertion for case that is actually handled in code.
If an nonspeculative instruction has a fault it might not be in the
nonSpecInsts map.
2011-05-04 20:38:27 -05:00
Ali Saidi
09a2be0c39 O3: Fix a small corner case with the lsq hazard detection logic. 2011-05-04 20:38:26 -05:00
Nathan Binkert
6e9143d36d stats: one more name violation 2011-04-20 19:07:45 -07:00
Nathan Binkert
63371c8664 stats: rename stats so they can be used as python expressions 2011-04-19 18:45:21 -07:00
Nathan Binkert
eddac53ff6 trace: reimplement the DTRACE function so it doesn't use a vector
At the same time, rename the trace flags to debug flags since they
have broader usage than simply tracing.  This means that
--trace-flags is now --debug-flags and --trace-help is now --debug-help
2011-04-15 10:44:32 -07:00
Nathan Binkert
f946d7bcdb debug: create a Debug namespace 2011-04-15 10:44:15 -07:00
Nathan Binkert
bbb1392c08 includes: fix up code after sorting 2011-04-15 10:44:14 -07:00
Nathan Binkert
39a055645f includes: sort all includes 2011-04-15 10:44:06 -07:00
Ali Saidi
6b69890493 ARM: Fix checkpoint restoration into O3 CPU and the way O3 switchCpu works.
This change fixes a small bug in the arm copyRegs() code where some registers
wouldn't be copied if the processor was in a mode other than MODE_USER.
Additionally, this change simplifies the way the O3 switchCpu code works by
utilizing TheISA::copyRegs() to copy the required context information
rather than the adhoc copying that goes on in the CPU model. The current code
makes assumptions about the visibility of int and float registers that aren't
true for all architectures in FS mode.
2011-04-04 11:42:28 -05:00
Ali Saidi
a679cd917a ARM: Cleanup implementation of ITSTATE and put important code in PCState.
Consolidate all code to handle ITSTATE in the PCState object rather than
touching a variety of structures/objects.
2011-04-04 11:42:28 -05:00
Ali Saidi
5962fecc1d CPU: Remove references to memory copy operations 2011-04-04 11:42:26 -05:00
Ali Saidi
7dde557fdc O3: Tighten memory order violation checking to 16 bytes.
The comment in the code suggests that the checking granularity should be 16
bytes, however in reality the shift by 8 is 256 bytes which seems much
larger than required.
2011-04-04 11:42:23 -05:00
Lisa Hsu
06fcaf9104 Ruby: have the rubytester pass contextId to Ruby. 2011-03-31 17:17:51 -07:00
Somayeh Sardashti
c8bbfed937 This patch supports cache flushing in MOESI_hammer 2011-03-28 10:49:45 -05:00
Korey Sewell
e0fdd86fd9 mips: cleanup ISA-specific code
***
(1): get rid of expandForMT function
MIPS is the only ISA that cares about having a piece of ISA state integrate
multiple threads so add constants for MIPS and relieve the other ISAs from having
to define this. Also, InOrder was the only core that was actively calling
this function
* * *
(2): get rid of corespecific type
The CoreSpecific type was used as a proxy to pass in HW specific params to
a MIPS CPU, but since MIPS FS hasnt been touched for awhile, it makes sense
to not force every other ISA to use CoreSpecific as well use a special
reset function to set it. That probably should go in a PowerOn reset fault
 anyway.
2011-03-26 09:23:52 -04:00
Tushar Krishna
531f54fb51 This patch fixes a build error in networktest.cc that occurs with gcc4.2 2011-03-22 23:38:09 -04:00
Tushar Krishna
09c3a97a4c This patch adds the network tester for simple and garnet networks.
The tester code is in testers/networktest.
The tester can be invoked by configs/example/ruby_network_test.py.
A dummy coherence protocol called Network_test is also addded for network-only simulations and testing. The protocol takes in messages from the tester and just pushes them into the network in the appropriate vnet, without storing any state.
2011-03-21 22:51:58 -04:00
Nilay Vaish
2f4276448b Ruby: Convert AccessModeType to RubyAccessMode
This patch converts AccessModeType to RubyAccessMode so that both the
protocol dependent and independent code uses the same access mode.
2011-03-19 18:34:37 -05:00
Ali Saidi
53ab306acc ARM: Fix subtle bug in LDM.
If the instruction faults mid-op the base register shouldn't be written back.
2011-03-17 19:20:20 -05:00
Ali Saidi
b78be240cf ARM: Detect and skip udelay() functions in linux kernel.
This change speeds up booting, especially in MP cases, by not executing
udelay() on the core but instead skipping ahead tha amount of time that is being
delayed.
2011-03-17 19:20:20 -05:00
Ali Saidi
799c3da8d0 O3: Send instruction back to fetch on squash to seed predecoder correctly. 2011-03-17 19:20:19 -05:00
Ali Saidi
30143baf7e O3: Cleanup the commitInfo comm struct.
Get rid of unused members and use base types rather than derrived values
where possible to limit amount of state.
2011-03-17 19:20:19 -05:00
Ali Saidi
a432d8e085 Mem: Fix issue with dirty block being lost when entire block transferred to non-cache.
This change fixes the problem for all the cases we actively use. If you want to try
more creative I/O device attachments (E.g. sharing an L2), this won't work. You
would need another level of caching between the I/O device and the cache
(which you actually need anyway with our current code to make sure writes
propagate). This is required so that you can mark the cache in between as
top level and it won't try to send ownership of a block to the I/O device.
Asserts have been added that should catch any issues.
2011-03-17 19:20:19 -05:00
Ali Saidi
2f40b3b8ae O3: Fix unaligned stores when cache blocked
Without this change the a store can be issued to the cache multiple times.
If this case occurs when the l1 cache is out of mshrs (and thus blocked)
the processor will never make forward progress because each cycle it will
send a single request using the recently freed mshr and not completing the
multipart store. This will continue forever.
2011-03-17 19:20:19 -05:00
Gabe Black
579c5f0b65 Spelling: Fix the a spelling error by changing mmaped to mmapped.
There may not be a formally correct spelling for the past tense of mmap, but
mmapped is the spelling Google doesn't try to autocorrect. This makes sense
because it mirrors the past tense of map->mapped and not the past tense of
cape->caped.

--HG--
rename : src/arch/alpha/mmaped_ipr.hh => src/arch/alpha/mmapped_ipr.hh
rename : src/arch/arm/mmaped_ipr.hh => src/arch/arm/mmapped_ipr.hh
rename : src/arch/mips/mmaped_ipr.hh => src/arch/mips/mmapped_ipr.hh
rename : src/arch/power/mmaped_ipr.hh => src/arch/power/mmapped_ipr.hh
rename : src/arch/sparc/mmaped_ipr.hh => src/arch/sparc/mmapped_ipr.hh
rename : src/arch/x86/mmaped_ipr.hh => src/arch/x86/mmapped_ipr.hh
2011-03-01 23:18:47 -08:00
Nilay Vaish
80b3886475 Ruby: Make DataBlock.hh independent of RubySystem
This patch changes DataBlock.hh so that it is not dependent on RubySystem.
This dependence seems unecessary. All those functions that depende on
RubySystem have been moved to DataBlock.cc file.
2011-02-25 17:51:02 -06:00
Timothy M. Jones
a10685ad1e O3CPU: Fix iqCount and lsqCount SMT fetch policies.
Fixes two of the SMT fetch policies in O3CPU that were returning the count
of instructions in the IQ or LSQ rather than the thread ID to fetch from.
2011-02-25 13:50:29 +00:00
Korey Sewell
0a74246fb9 inorder: InstSeqNum bug
Because int and not InstSeqNum was used in a couple of places, you can
overflow the int type and thus get wierd bugs when the sequence number
is negative (or some wierd value)
2011-02-23 16:35:18 -05:00
Korey Sewell
3e1ad73d08 inorder: dyn inst initialization
remove constructors that werent being used (it just gets confusing)
use initialization list for all the variables instead of relying on initVars()
function
2011-02-23 16:35:04 -05:00
Korey Sewell
e0a021005d inorder: cache packet handling
-use a pointer to CacheReqPacket instead of PacketPtr so correct destructors
get called on packet deletion
- make sure to delete the packet if the cache blocks the sendTiming request
or for some reason we dont use the packet
- dont overwrite memory requests since in the worst case an instruction will
be replaying a request so no need to keep allocating a new request
- we dont use retryPkt so delete it
- fetch code was split out already, so just assert that this is a memory
reference inst. and that the staticInst is available
2011-02-23 16:30:45 -05:00
Ali Saidi
f9d4d9df1b O3: When a prefetch causes a fault, don't record it in the inst 2011-02-23 15:10:50 -06:00
Ali Saidi
3de8e0a0d4 O3: If there is an outstanding table walk don't let the inst queue sleep.
If there is an outstanding table walk and no other activity in the CPU
it can go to sleep and never wake up. This change makes the instruction
queue always active if the CPU is waiting for a store to translate.

If Gabe changes the way this code works then the below should be removed
as indicated by the todo.
2011-02-23 15:10:49 -06:00
Ali Saidi
7391ea6de6 ARM: Do something for ISB, DSB, DMB 2011-02-23 15:10:49 -06:00
Ali Saidi
ae3d456855 ARM: Fix bug that let two table walks occur in parallel. 2011-02-23 15:10:49 -06:00
Ali Saidi
68bd80794c O3: Fix bug when a squash occurs right before TLB miss returns.
In this case we need to throw away the TLB miss, not assume it was the
one we were waiting for.
2011-02-23 15:10:49 -06:00
Korey Sewell
66bb732c04 m5: merge inorder/release-notes/make_release changes 2011-02-18 14:35:15 -05:00
Korey Sewell
bc16bbc158 inorder: add names and slot #s to res. dprints 2011-02-18 14:31:31 -05:00
Korey Sewell
64d31e75b9 inorder: ignore nops in execution unit 2011-02-18 14:30:38 -05:00
Korey Sewell
0fe19836c7 inorder: update graduation unit
make sure instructions are able to commit before writing back to the RF
do not commit more than 1 non-speculative instruction per cycle
2011-02-18 14:30:05 -05:00
Korey Sewell
89335118a5 inorder: recognize isSerializeAfter flag
keep track of when an instruction needs the execution
behind it to be serialized. Without this, in SE Mode
instructions can execute behind a system call exit().
2011-02-18 14:29:48 -05:00
Korey Sewell
bbffd9419d inorder: update default thread size(=1)
a lot of structures get allocated based off that MaxThreads parameter so this is an
effort to not abuse it
2011-02-18 14:29:44 -05:00
Korey Sewell
a278df0b95 inorder: don't overuse getLatency()
resources don't need to call getLatency because the latency is already a member
in the class. If there is some type of special case where different instructions
impose a different latency inside a resource then we can revisit this and
add getLatency() back in
2011-02-18 14:29:40 -05:00
Korey Sewell
37df925953 inorder: update max. resource bandwidths
each resource has a certain # of requests it can take per cycle. update the #s here
to be more realistic based off of the pipeline width and if the resource needs to
be accessed on multiple cycles
2011-02-18 14:29:31 -05:00
Korey Sewell
91c48b1c3b inorder: cleanup in destructors
cleanup hanging pointers and other cruft in the destructors
2011-02-18 14:29:26 -05:00
Korey Sewell
8b4b4a1ba5 inorder: fix cache/fetch unit memory leaks
---
need to delete the cache request's data on clearRequest() now that we are recycling
requests
---
fetch unit needs to deallocate the fetch buffer blocks when they are replaced or
squashed.
2011-02-18 14:29:17 -05:00
Korey Sewell
72b5233112 inorder: remove events for zero-cycle resources
if a resource has a zero cycle latency (e.g. RegFile write), then dont allocate an event
for it to use
2011-02-18 14:29:02 -05:00
Korey Sewell
d5961b2b20 inorder: update pipeline interface for handling finished resource reqs
formerly, to free up bandwidth in a resource, we could just change the pointer in that resource
but at the same time the pipeline stages had visibility to see what happened to a resource request.
Now that we are recycling these requests (to avoid too much dynamic allocation), we can't throw
away the request too early or the pipeline stage gets bad information. Instead, mark when a request
is done with the resource all together and then let the pipeline stage call back to the resource
that it's time to free up the bandwidth for more instructions
*** inteface notes ***
- When an instruction completes and is done in a resource for that cycle, call done()
- When an instruction fails and is done with a resource for that cycle, call done(false)
- When an instruction completes, but isnt finished with a resource, call completed()
- When an instruction fails, but isnt finished with a resource, call completed(false)
* * *
inorder: tlbmiss wakeup bug fix
2011-02-18 14:28:37 -05:00
Korey Sewell
d64226750e inorder: remove request map, use request vector
take away all instances of reqMap in the code and make all references use the built-in
request vectors inside of each resource. The request map was dynamically allocating
a request per instruction. The request vector just allocates N number of requests
during instantiation and then the surrounding code is fixed up to reuse those N requests
***
setRequest() and clearRequest() are the new accessors needed to define a new
request in a resource
2011-02-18 14:28:30 -05:00
Korey Sewell
c883729025 inorder: add valid bit for resource requests
this will allow us to reuse resource requests within a resource instead
of always dynamically allocating
2011-02-18 14:28:22 -05:00
Korey Sewell
ff48afcf4f inorder: remove reqRemoveList
we are going to be getting away from creating new resource requests for every
instruction so no more need to keep track of a reqRemoveList and clean it up
every tick
2011-02-18 14:28:10 -05:00
Korey Sewell
991d0185c6 inorder: initialize res. req. vectors based on resource bandwidth
first change in an optimization that will stop InOrder from allocating new memory for every instruction's
request to a resource. This gets expensive since every instruction needs to access ~10 requests before
graduation. Instead, the plan is to allocate just enough resource request objects to satisfy each resource's
bandwidth (e.g. the execution unit would need to allocate 3 resource request objects for a 1-issue pipeline
since on any given cycle it could have 2 read requests and 1 write request) and then let the instructions
contend and reuse those allocated requests. The end result is a smaller memory footprint for the InOrder model
and increased simulation performance
2011-02-18 14:27:52 -05:00
Gabe Black
f036fd9748 O3: Fetch from the microcode ROM when needed. 2011-02-13 17:40:07 -08:00
Ali Saidi
7c763b34c9 O3: Fix GCC 4.2.4 complaint 2011-02-13 16:51:15 -05:00
Korey Sewell
470aa289da inorder: clean up the old way of inst. scheduling
remove remnants of old way of instruction scheduling which dynamically allocated
a new resource schedule for every instruction
2011-02-12 10:14:48 -05:00
Korey Sewell
e26aee514d inorder: utilize cached skeds in pipeline
allow the pipeline and resources to use the cached instruction schedule and resource
sked iterator
2011-02-12 10:14:45 -05:00
Korey Sewell
516b611462 inorder: define iterator for resource schedules
resource skeds are divided into two parts: front end (all insts) and back end (inst. specific)
each of those are implemented as separate lists,  so this iterator wraps around
the traditional list iterator so that an instruction can walk it's schedule but seamlessly
transfer from front end to back end when necessary
2011-02-12 10:14:43 -05:00
Korey Sewell
ec9b2ec251 inorder: stage scheduler for front/back end schedule creation
add a stage scheduler class to replace InstStage in pipeline_traits.cc
use that class to define a default front-end, resource schedule that all
instructions will follow. This will also replace the back end schedule in
pipeline_traits.cc. The reason for adding this is so that we can cache
instruction schedules in the future instead of calling the same function
over/over again as well as constantly dynamically alllocating memory on
every instruction to try to figure out it's schedule
2011-02-12 10:14:40 -05:00
Korey Sewell
6713dbfe08 inorder: cache instruction schedules
first step in a optimization to not dynamically allocate an instruction schedule
for every instruction but rather used cached schedules
2011-02-12 10:14:36 -05:00
Korey Sewell
af67631790 inorder: comments for resource sked class 2011-02-12 10:14:34 -05:00
Korey Sewell
800e93f358 inorder: remove unused file
inst_buffer file isn't used , so remove it
2011-02-12 10:14:32 -05:00
Giacomo Gabrielli
a05032f4df O3: Fix pipeline restart when a table walk completes in the fetch stage.
When a table walk is initiated by the fetch stage, the CPU can
potentially move to the idle state and never wake up.

The fetch stage must call cpu->wakeCPU() when a translation completes
(in finishTranslation()).
2011-02-11 18:29:35 -06:00
Ali Saidi
1411cb0b0f SimpleCPU: Fix a case where a DTLB fault redirects fetch and an I-side walk occurs.
This change fixes an issue where a DTLB fault occurs and redirects fetch to
handle the fault and the ITLB requires a walk which delays translation. In this
case the status of the cpu isn't updated appropriately, and an additional
instruction fetch occurs. Eventually this hits an assert as multiple instruction
fetches are occuring in the system and when the second one returns the
processor is in the wrong state.

Some asserts below are removed because it was always true (typo) and the state
after the initiateAcc() the processor could be in any valid state when a
d-side fault occurs.
2011-02-11 18:29:35 -06:00
Giacomo Gabrielli
e2507407b1 O3: Enhance data address translation by supporting hardware page table walkers.
Some ISAs (like ARM) relies on hardware page table walkers.  For those ISAs,
when a TLB miss occurs, initiateTranslation() can return with NoFault but with
the translation unfinished.

Instructions experiencing a delayed translation due to a hardware page table
walk are deferred until the translation completes and kept into the IQ.  In
order to keep track of them, the IQ has been augmented with a queue of the
outstanding delayed memory instructions.  When their translation completes,
instructions are re-executed (only their initiateAccess() was already
executed; their DTB translation is now skipped).  The IEW stage has been
modified to support such a 2-pass execution.
2011-02-11 18:29:35 -06:00
Brad Beckmann
dfa8cbeb06 m5: added work completed monitoring support 2011-02-06 22:14:19 -08:00
Joel Hestness
52b6119228 TimingSimpleCPU: split data sender state fix
In sendSplitData, keep a pointer to the senderState that may be updated after
the call to handle*Packet. This way, if the receiver updates the packet
senderState, it can still be accessed in sendSplitData.
2011-02-06 22:14:18 -08:00
Joel Hestness
b4c10bd680 mcpat: Adds McPAT performance counters
Updated patches from Rick Strong's set that modify performance counters for
McPAT
2011-02-06 22:14:17 -08:00
Korey Sewell
e396a34b01 inorder: fault handling
Maintain all information about an instruction's fault in the DynInst object rather
than any cpu-request object. Also, if there is a fault during the execution stage
then just save the fault inside the instruction and trap once the instruction
tries to graduate
2011-02-04 00:09:20 -05:00
Korey Sewell
e57613588b inorder: pcstate and delay slots bug
not taken delay slots were not being advanced correctly to pc+8, so for those ISAs
we 'advance()' the pcstate one more time for the desired effect
2011-02-04 00:09:19 -05:00
Korey Sewell
68d962f8af inorder: add a fetch buffer to fetch unit
Give fetch unit it's own parameterizable fetch buffer to read from. Very inefficient
(architecturally and in simulation) to continually fetch at the granularity of the
wordsize. As expected, the number of fetch memory requests drops dramatically
2011-02-04 00:08:22 -05:00
Korey Sewell
56ce8acd41 inorder: overload find-req fn
no need to have separate function name findSplitRequest, just overload the function
2011-02-04 00:08:21 -05:00
Korey Sewell
ab3d37d398 inorder: implement separate fetch unit
instead of having one cache-unit class be responsible for both data and code
accesses, separate code that is just for fetch in it's own derived class off the
original base class. This makes the code easier to manage as well as handle
future cases of special fetch handling
2011-02-04 00:08:20 -05:00
Korey Sewell
f80508de65 inorder: cache port blocking
set the request to false when the cache port blocks so we dont deadlock.
also, comment out the outstanding address list sanity check for now.
2011-02-04 00:08:19 -05:00
Korey Sewell
0c6a679359 inorder: stage width as a python parameter
allow the user to specify how many instructions a pipeline stage can process
on any given cycle (stageWidth...i.e.bandwidth) by setting the parameter through
the python interface rather than compile the code after changing the *.cc file.
(we always had the parameter there, but still used the static 'ThePipeline::StageWidth'
instead)
-
Since StageWidth is now dynamically defined, change the interstage communication
structure to use a vector and get rid of array and array handling index (toNextStageIndex)
since we can just make calls to the list for the same information
2011-02-04 00:08:18 -05:00
Korey Sewell
8ac717ef4c inorder: multi-issue branch resolution
Only execute (resolve) one branch per cycle because handling more than one is
a little more complicated
2011-02-04 00:08:17 -05:00
Korey Sewell
be17617990 inorder: pipe. stage inst. buffering
use skidbuffer as only location for instructions between stages. before,
we had the insts queue from the prior stage and the skidbuffer for the
current stage, but that gets confusing and this consolidation helps
when handling squash cases
2011-02-04 00:08:16 -05:00
Korey Sewell
050944dd73 inorder: change skidBuffer to list instead of queue
manage insertion and deletion like a queue but will need
access to internal elements for future changes
Currently, skidbuffer manages any instruction that was
in a stage but could not complete processing, however
we will want to manage all blocked instructions (from prev stage
and from cur. stage) in just one buffer.
2011-02-04 00:08:15 -05:00
Korey Sewell
7f937e11e2 inorder: activity tracking bug
Previous code was marking CPU activity on almost every cycle due to a bug in
tracking the status of pipeline stages. This disables the CPU from sleeping
on long latency stalls and increases simulation time
2011-02-04 00:08:13 -05:00
Gabe Black
091a3e6cc0 Fault: Rename sim/fault.hh to fault_fwd.hh to distinguish it from faults.hh.
--HG--
rename : src/sim/fault.hh => src/sim/fault_fwd.hh
2011-02-03 21:47:58 -08:00
Gabe Black
00f24ae92c Config: Keep track of uncached and cached ports separately.
This makes sure that the address ranges requested for caches and uncached ports
don't conflict with each other, and that accesses which are always uncached
(message signaled interrupts for instance) don't waste time passing through
caches.
2011-02-03 20:23:00 -08:00
Gabe Black
869a046e41 O3: Fix a style bug in O3. 2011-02-02 23:34:14 -08:00
Gabe Black
119f5f8e94 X86: Add L1 caches for the TLB walkers.
Small L1 caches are connected to the TLB walkers when caches are used. This
allows them to participate in the coherence protocol properly.
2011-02-01 18:28:41 -08:00
Matt Horsnell
b13a79ee71 O3: Fix some variable length instruction issues with the O3 CPU and ARM ISA. 2011-01-18 16:30:05 -06:00
Matt Horsnell
c98df6f8c2 O3: Don't test misprediction on load instructions until executed. 2011-01-18 16:30:05 -06:00
Ali Saidi
1167ef19cf O3: Keep around the last committed instruction and use for squashing.
Without this change 0 is always used for the youngest sequence number if
a squash occured and the ROB was empty (E.g. an instruction is marked
serializeAfter or a fetch stall prevents other instructions from issuing).
Using 0 there is a race to rename where an instruction that committed the
same cycle as the squashing instruction can have it's renamed state undone
by the squash using sequence number 0.
2011-01-18 16:30:05 -06:00
Ali Saidi
ea058b14da O3: Don't try to scoreboard misc registers.
I'm not positive this is the correct fix, but it's working right now.
Either we need to do something like this, prevent the misc reg from being renamed at all,
or there something else going on. We need to find the root cause as to why
this is only a problem sometimes.
2011-01-18 16:30:05 -06:00
Matt Horsnell
11bef2ab38 O3: Fix corner cases where multiple squashes/fetch redirects overwrite timebuf. 2011-01-18 16:30:05 -06:00
Matt Horsnell
62f2097917 O3: Fix mispredicts from non control instructions.
The squash inside the fetch unit should not attempt to remove them from the
branch predictor as non-control instructions are not pushed into the predictor.
2011-01-18 16:30:05 -06:00
Matt Horsnell
5ebf3b2808 O3: Fixes the way prefetches are handled inside the iew unit.
This patch prevents the prefetch being added to the instCommit queue twice.
2011-01-18 16:30:02 -06:00
Ali Saidi
ee9a331fe5 O3: Support timing translations for O3 CPU fetch. 2011-01-18 16:30:02 -06:00
Ali Saidi
0f9a3671b6 ARM: Add support for moving predicated false dest operands from sources. 2011-01-18 16:30:02 -06:00
Min Kyu Jeong
96375409ea O3: Fixes fetch deadlock when the interrupt clears before CPU handles it.
When this condition occurs the cpu should restart the fetch stage to fetch from
the original execution path. Fault handling in the commit stage is cleaned up a
little bit so the control flow is simplier. Finally, if an instruction is being
used to carry a fault it isn't executed, so the fault propagates appropriately.
2011-01-18 16:30:01 -06:00
Korey Sewell
cd5a7f7221 inorder: fix RUBY_FS build
the current code was using incorrect dummy instruction in interrupts function
2011-01-12 11:52:29 -05:00
Steve Reinhardt
6f1187943c Replace curTick global variable with accessor functions.
This step makes it easy to replace the accessor functions
(which still access a global variable) with ones that access
per-thread curTick values.
2011-01-07 21:50:29 -08:00
Steve Reinhardt
d60c293bbc inorder: replace schedEvent() code with reschedule().
There were several copies of similar functions that looked
like they all replicated reschedule(), so I replaced them
with direct calls.  Keeping this separate from the previous
cset since there may be some subtle functional differences
if the code ever reschedules an event that is scheduled but
not squashed (though none were detected in the regressions).
2011-01-07 21:50:29 -08:00
Steve Reinhardt
214cc0fafc inorder: get rid of references to mainEventQueue.
Events need to be scheduled on the queue assigned
to the SimObject, not on the global queue (which
should be going away).
Also cleaned up a number of redundant expressions
that made the code unnecessarily verbose.
2011-01-07 21:50:29 -08:00
Steve Reinhardt
89cf3f6e85 Move sched_list.hh and timebuf.hh from src/base to src/cpu.
These files really aren't general enough to belong in src/base.
This patch doesn't reorder include lines, leaving them unsorted
in many cases, but Nate's magic script will fix that up shortly.

--HG--
rename : src/base/sched_list.hh => src/cpu/sched_list.hh
rename : src/base/timebuf.hh => src/cpu/timebuf.hh
2011-01-03 14:35:47 -08:00
Steve Reinhardt
c69d48f007 Make commenting on close namespace brackets consistent.
Ran all the source files through 'perl -pi' with this script:

s|\s*(};?\s*)?/\*\s*(end\s*)?namespace\s*(\S+)\s*\*/(\s*})?|} // namespace $3|;
s|\s*};?\s*//\s*(end\s*)?namespace\s*(\S+)\s*|} // namespace $2\n|;
s|\s*};?\s*//\s*(\S+)\s*namespace\s*|} // namespace $1\n|;

Also did a little manual editing on some of the arch/*/isa_traits.hh files
and src/SConscript.
2011-01-03 14:35:43 -08:00
Nilay Vaish
58fa2857e1 This patch removes the WARN_* and ERROR_* from src/mem/ruby/common/Debug.hh file. These statements have been replaced with warn(), panic() and fatal() defined in src/base/misc.hh 2010-12-22 23:15:24 -06:00
Steve Reinhardt
2c0e80f96b memtest: delete some crufty dead code 2010-12-21 22:57:29 -08:00
Gabe Black
672d6a4b98 Style: Replace some tabs with spaces. 2010-12-20 16:24:40 -05:00
Ali Saidi
42ba158479 O3: Allow a store entry to store up to 16 bytes (instead of TheISA::IntReg).
The store queue doesn't need to be ISA specific and architectures can
frequently store more than an int registers worth of data. A 128 bits seems
more common, but even 256 bits may be appropriate. Pretty much anything less
than a cache line size is buildable.
2010-12-07 16:19:57 -08:00
Ali Saidi
e681c0f7b3 O3: Support squashing all state after special instruction
For SPARC ASIs are added to the ExtMachInst. If the ASI is changed simply
marking the instruction as Serializing isn't enough beacuse that only
stops rename. This provides a mechanism to squash all the instructions
and refetch them
2010-12-07 16:19:57 -08:00
Giacomo Gabrielli
719f9a6d4f O3: Make all instructions that write a misc. register not perform the write until commit.
ARM instructions updating cumulative flags (ARM FP exceptions and saturation
flags) are not serialized.

Added aliases for ARM FP exceptions and saturation flags in FPSCR.  Removed
write accesses to the FP condition codes for most ARM VFP instructions: only
VCMP and VCMPE instructions update the FP condition codes.  Removed a potential
cause of seg. faults in the O3 model for NEON memory macro-ops (ARM).
2010-12-07 16:19:57 -08:00
Min Kyu Jeong
4bbdd6ceb2 O3: Support SWAP and predicated loads/store in ARM. 2010-12-07 16:19:57 -08:00
Ali Saidi
21bfbd422c ARM: Support switchover with hardware table walkers 2010-12-07 16:19:57 -08:00
Nilay Vaish
658849d101 ruby: Converted old ruby debug calls to M5 debug calls
This patch developed by Nilay Vaish converts all the old GEMS-style ruby
debug calls to the appropriate M5 debug calls.
2010-12-01 11:30:04 -08:00
Gabe Black
40d434d551 X86: Loosen an assert for x86 and connect the APIC ports when caches are used. 2010-11-23 06:11:50 -05:00
Ali Saidi
e1b9a815dd SCons: Support building without an ISA 2010-11-19 18:00:39 -06:00
Gabe Black
92655b6399 O3: Fix fp destination register flattening, and index offset adjusting.
This change makes O3 flatten floating point destination registers, and also
fixes misc register flattening so that it's correctly repositioned relative to
the resized regions for integer and floating point indices.

It also fixes some overly long lines.
2010-11-18 13:11:36 -05:00
Gabe Black
8b9b85e92c O3: Make O3 support variably lengthed instructions. 2010-11-15 19:37:03 -08:00
Ali Saidi
776c075917 O3: reset architetural state by calling clear() 2010-11-15 14:04:05 -06:00
Giacomo Gabrielli
0058927190 CPU/ARM: Add SIMD op classes to CPU models and ARM ISA. 2010-11-15 14:04:04 -06:00
Min Kyu Jeong
745df74fe0 O3: prevent a squash when completeAcc() modifies misc reg through TC.
This happens on ARM instructions when they update the IT state bits.
Code and associated comment was copied from execute() and initiateAcc() methods
2010-11-15 14:04:04 -06:00
Ali Saidi
d4767f440a SCons: Cleanup SCons output during compile 2010-11-15 14:04:04 -06:00
Ali Saidi
16f210da37 CPU: Fix bug when a split transaction is issued to a faster cache
In the case of a split transaction and a cache that is faster than a CPU we
could get two responses before next_tick expires. Add an event that is
scheduled in this case and return false rather than asserting.
2010-11-15 14:04:03 -06:00
Ali Saidi
cdacbe734a ARM/Alpha/Cpu: Change prefetchs to be more like normal loads.
This change modifies the way prefetches work. They are now like normal loads
that don't writeback a register. Previously prefetches were supposed to call
prefetch() on the exection context, so they executed with execute() methods
instead of initiateAcc() completeAcc(). The prefetch() methods for all the CPUs
are blank, meaning that they get executed, but don't actually do anything.

On Alpha dead cache copy code was removed and prefetches are now normal ops.
They count as executed operations, but still don't do anything and IsMemRef is
not longer set on them.

On ARM IsDataPrefetch or IsInstructionPreftech is now set on all prefetch
instructions. The timing simple CPU doesn't try to do anything special for
prefetches now and they execute with the normal memory code path.
2010-11-08 13:58:22 -06:00
Ali Saidi
f4f5d03ed2 ARM: Make all ARM uops delayed commit. 2010-11-08 13:58:22 -06:00
Ali Saidi
0ea794bcf4 sim: Use forward declarations for ports.
Virtual ports need TLB data which means anything touching a file in the arch
directory rebuilds any file that includes system.hh which in everything.
2010-11-08 13:58:22 -06:00
Gabe Black
6f4bd2c1da ISA,CPU,etc: Create an ISA defined PC type that abstracts out ISA behaviors.
This change is a low level and pervasive reorganization of how PCs are managed
in M5. Back when Alpha was the only ISA, there were only 2 PCs to worry about,
the PC and the NPC, and the lsb of the PC signaled whether or not you were in
PAL mode. As other ISAs were added, we had to add an NNPC, micro PC and next
micropc, x86 and ARM introduced variable length instruction sets, and ARM
started to keep track of mode bits in the PC. Each CPU model handled PCs in
its own custom way that needed to be updated individually to handle the new
dimensions of variability, or, in the case of ARMs mode-bit-in-the-pc hack,
the complexity could be hidden in the ISA at the ISA implementation's expense.
Areas like the branch predictor hadn't been updated to handle branch delay
slots or micropcs, and it turns out that had introduced a significant (10s of
percent) performance bug in SPARC and to a lesser extend MIPS. Rather than
perpetuate the problem by reworking O3 again to handle the PC features needed
by x86, this change was introduced to rework PC handling in a more modular,
transparent, and hopefully efficient way.


PC type:

Rather than having the superset of all possible elements of PC state declared
in each of the CPU models, each ISA defines its own PCState type which has
exactly the elements it needs. A cross product of canned PCState classes are
defined in the new "generic" ISA directory for ISAs with/without delay slots
and microcode. These are either typedef-ed or subclassed by each ISA. To read
or write this structure through a *Context, you use the new pcState() accessor
which reads or writes depending on whether it has an argument. If you just
want the address of the current or next instruction or the current micro PC,
you can get those through read-only accessors on either the PCState type or
the *Contexts. These are instAddr(), nextInstAddr(), and microPC(). Note the
move away from readPC. That name is ambiguous since it's not clear whether or
not it should be the actual address to fetch from, or if it should have extra
bits in it like the PAL mode bit. Each class is free to define its own
functions to get at whatever values it needs however it needs to to be used in
ISA specific code. Eventually Alpha's PAL mode bit could be moved out of the
PC and into a separate field like ARM.

These types can be reset to a particular pc (where npc = pc +
sizeof(MachInst), nnpc = npc + sizeof(MachInst), upc = 0, nupc = 1 as
appropriate), printed, serialized, and compared. There is a branching()
function which encapsulates code in the CPU models that checked if an
instruction branched or not. Exactly what that means in the context of branch
delay slots which can skip an instruction when not taken is ambiguous, and
ideally this function and its uses can be eliminated. PCStates also generally
know how to advance themselves in various ways depending on if they point at
an instruction, a microop, or the last microop of a macroop. More on that
later.

Ideally, accessing all the PCs at once when setting them will improve
performance of M5 even though more data needs to be moved around. This is
because often all the PCs need to be manipulated together, and by getting them
all at once you avoid multiple function calls. Also, the PCs of a particular
thread will have spatial locality in the cache. Previously they were grouped
by element in arrays which spread out accesses.


Advancing the PC:

The PCs were previously managed entirely by the CPU which had to know about PC
semantics, try to figure out which dimension to increment the PC in, what to
set NPC/NNPC, etc. These decisions are best left to the ISA in conjunction
with the PC type itself. Because most of the information about how to
increment the PC (mainly what type of instruction it refers to) is contained
in the instruction object, a new advancePC virtual function was added to the
StaticInst class. Subclasses provide an implementation that moves around the
right element of the PC with a minimal amount of decision making. In ISAs like
Alpha, the instructions always simply assign NPC to PC without having to worry
about micropcs, nnpcs, etc. The added cost of a virtual function call should
be outweighed by not having to figure out as much about what to do with the
PCs and mucking around with the extra elements.

One drawback of making the StaticInsts advance the PC is that you have to
actually have one to advance the PC. This would, superficially, seem to
require decoding an instruction before fetch could advance. This is, as far as
I can tell, realistic. fetch would advance through memory addresses, not PCs,
perhaps predicting new memory addresses using existing ones. More
sophisticated decisions about control flow would be made later on, after the
instruction was decoded, and handed back to fetch. If branching needs to
happen, some amount of decoding needs to happen to see that it's a branch,
what the target is, etc. This could get a little more complicated if that gets
done by the predecoder, but I'm choosing to ignore that for now.


Variable length instructions:

To handle variable length instructions in x86 and ARM, the predecoder now
takes in the current PC by reference to the getExtMachInst function. It can
modify the PC however it needs to (by setting NPC to be the PC + instruction
length, for instance). This could be improved since the CPU doesn't know if
the PC was modified and always has to write it back.


ISA parser:

To support the new API, all PC related operand types were removed from the
parser and replaced with a PCState type. There are two warts on this
implementation. First, as with all the other operand types, the PCState still
has to have a valid operand type even though it doesn't use it. Second, using
syntax like PCS.npc(target) doesn't work for two reasons, this looks like the
syntax for operand type overriding, and the parser can't figure out if you're
reading or writing. Instructions that use the PCS operand (which I've
consistently called it) need to first read it into a local variable,
manipulate it, and then write it back out.


Return address stack:

The return address stack needed a little extra help because, in the presence
of branch delay slots, it has to merge together elements of the return PC and
the call PC. To handle that, a buildRetPC utility function was added. There
are basically only two versions in all the ISAs, but it didn't seem short
enough to put into the generic ISA directory. Also, the branch predictor code
in O3 and InOrder were adjusted so that they always store the PC of the actual
call instruction in the RAS, not the next PC. If the call instruction is a
microop, the next PC refers to the next microop in the same macroop which is
probably not desirable. The buildRetPC function advances the PC intelligently
to the next macroop (in an ISA specific way) so that that case works.


Change in stats:

There were no change in stats except in MIPS and SPARC in the O3 model. MIPS
runs in about 9% fewer ticks. SPARC runs with 30%-50% fewer ticks, which could
likely be improved further by setting call/return instruction flags and taking
advantage of the RAS.


TODO:

Add != operators to the PCState classes, defined trivially to be !(a==b).
Smooth out places where PCs are split apart, passed around, and put back
together later. I think this might happen in SPARC's fault code. Add ISA
specific constructors that allow setting PC elements without calling a bunch
of accessors. Try to eliminate the need for the branching() function. Factor
out Alpha's PAL mode pc bit into a separate flag field, and eliminate places
where it's blindly masked out or tested in the PC.
2010-10-31 00:07:20 -07:00
Gabe Black
d5dbd91f3d O3: Get rid of a bunch of commented out lines. 2010-10-24 00:43:32 -07:00
Gabe Black
d4492190e6 Alpha: Fix Alpha NumMiscArchRegs constant.
Also add asserts in O3's Scoreboard class to catch bad indexes.
2010-10-04 11:58:06 -07:00
Ali Saidi
aef4a9904e CPU/Cache: Fix some errors exposed by valgrind 2010-09-30 09:35:19 -05:00
Gabe Black
ab8d7eee76 CPU: Fix O3 and possible InOrder segfaults in FS. 2010-09-20 02:46:42 -07:00
Gabe Black
0dd1f7f01a CPU: Trim unnecessary includes from some common files.
This reduces the scope of those includes and makes it less likely for there to
be a dependency loop. This also moves the hashing functions associated with
ExtMachInst objects to be with the ExtMachInst definitions and out of
utility.hh.
2010-09-14 00:29:38 -07:00
Gabe Black
8f3fbd2d13 CPU: Get rid of the now unnecessary getInst/setInst family of functions.
This code is no longer needed because of the preceeding change which adds a
StaticInstPtr parameter to the fault's invoke method, obviating the only use
for this pair of functions.
2010-09-13 21:58:34 -07:00
Gabe Black
6833ca7eed Faults: Pass the StaticInst involved, if any, to a Fault's invoke method.
Also move the "Fault" reference counted pointer type into a separate file,
sim/fault.hh. It would be better to name this less similarly to sim/faults.hh
to reduce confusion, but fault.hh matches the name of the type. We could change
Fault to FaultPtr to match other pointer types, and then changing the name of
the file would make more sense.
2010-09-13 19:26:03 -07:00
Nathan Binkert
afafaf1dcb style: fix sorting of includes and whitespace in some files 2010-09-10 14:58:04 -07:00
Gabe Black
c9d01c6557 CPU: Get rid of the unused ev5_trap function on the simple and checker CPUs. 2010-08-31 09:47:29 -07:00
Steve Reinhardt
ee6a92863a memtest: fix/cleanup functional access testing
Don't assert that the response packet is marked as a response
since it won't always be so for functional accesses.

Also cleanup code to refer to functional accesses rather
than "probes" (old terminology), and mention in the
DPRINTF which type of access we're doing.
2010-08-25 21:55:44 -07:00
Ali Saidi
546eaa6109 CPU: Print out traces for faluting inst when the flag ExecFaulting is set 2010-08-25 19:10:43 -05:00
Min Kyu Jeong
e1168e72ca ARM: Fixed register flattening logic (FP_Base_DepTag was set too low)
When decoding a srs instruction, invalid mode encoding returns invalid instruction.
This can happen when garbage instructions are fetched from mispredicted path
2010-08-25 19:10:43 -05:00
Brad Beckmann
e983ef9e8c testers: move testers to a new directory
This patch moves the testers to a new subdirectory under src/cpu and includes
the necessary fixes to work with latest m5 initialization patches.

--HG--
rename : configs/example/determ_test.py => configs/example/ruby_direct_test.py
rename : src/cpu/directedtest/DirectedGenerator.cc => src/cpu/testers/directedtest/DirectedGenerator.cc
rename : src/cpu/directedtest/DirectedGenerator.hh => src/cpu/testers/directedtest/DirectedGenerator.hh
rename : src/cpu/directedtest/InvalidateGenerator.cc => src/cpu/testers/directedtest/InvalidateGenerator.cc
rename : src/cpu/directedtest/InvalidateGenerator.hh => src/cpu/testers/directedtest/InvalidateGenerator.hh
rename : src/cpu/directedtest/RubyDirectedTester.cc => src/cpu/testers/directedtest/RubyDirectedTester.cc
rename : src/cpu/directedtest/RubyDirectedTester.hh => src/cpu/testers/directedtest/RubyDirectedTester.hh
rename : src/cpu/directedtest/RubyDirectedTester.py => src/cpu/testers/directedtest/RubyDirectedTester.py
rename : src/cpu/directedtest/SConscript => src/cpu/testers/directedtest/SConscript
rename : src/cpu/directedtest/SeriesRequestGenerator.cc => src/cpu/testers/directedtest/SeriesRequestGenerator.cc
rename : src/cpu/directedtest/SeriesRequestGenerator.hh => src/cpu/testers/directedtest/SeriesRequestGenerator.hh
rename : src/cpu/memtest/MemTest.py => src/cpu/testers/memtest/MemTest.py
rename : src/cpu/memtest/SConscript => src/cpu/testers/memtest/SConscript
rename : src/cpu/memtest/memtest.cc => src/cpu/testers/memtest/memtest.cc
rename : src/cpu/memtest/memtest.hh => src/cpu/testers/memtest/memtest.hh
rename : src/cpu/rubytest/Check.cc => src/cpu/testers/rubytest/Check.cc
rename : src/cpu/rubytest/Check.hh => src/cpu/testers/rubytest/Check.hh
rename : src/cpu/rubytest/CheckTable.cc => src/cpu/testers/rubytest/CheckTable.cc
rename : src/cpu/rubytest/CheckTable.hh => src/cpu/testers/rubytest/CheckTable.hh
rename : src/cpu/rubytest/RubyTester.cc => src/cpu/testers/rubytest/RubyTester.cc
rename : src/cpu/rubytest/RubyTester.hh => src/cpu/testers/rubytest/RubyTester.hh
rename : src/cpu/rubytest/RubyTester.py => src/cpu/testers/rubytest/RubyTester.py
rename : src/cpu/rubytest/SConscript => src/cpu/testers/rubytest/SConscript
2010-08-24 12:07:22 -07:00
Gabe Black
943c171480 ISA: Get rid of old, unused utility functions cluttering up the ISAs. 2010-08-23 16:14:20 -07:00
Gabe Black
b187e7c9cc CPU: Make the constants for StaticInst flags visible outside the class. 2010-08-23 09:44:19 -07:00
Min Kyu Jeong
d8d6b869a2 O3: Skipping mem-order violation check for uncachable loads.
Uncachable load is not executed until it reaches the head of the ROB,
hence cannot cause one.
2010-08-23 11:18:42 -05:00
Min Kyu Jeong
e6a0be648e ARM: Improve printing of uop disassembly. 2010-08-23 11:18:42 -05:00
Min Kyu Jeong
ad2c3b008d CPU: Print out flatten-out register index as with IntRegs/FloatRegs traceflag 2010-08-23 11:18:41 -05:00
Min Kyu Jeong
03286e9d4e CPU: Make Exec trace to print predication result (if false) for memory instructions 2010-08-23 11:18:41 -05:00
Min Kyu Jeong
92ae620be8 ARM: mark msr/mrs instructions as SerializeBefore/After
Since miscellaneous registers bypass wakeup logic, force serialization
to resolve data dependencies through them
* * *
ARM: adding non-speculative/serialize flags for instructions change CPSR
2010-08-23 11:18:41 -05:00
Min Kyu Jeong
43c938d23e O3: Handle loads when the destination is the PC.
For loads that PC is the destination, check if the load
was mispredicted again when the value being loaded returns from memory
2010-08-23 11:18:40 -05:00
Min Kyu Jeong
5f91ec3f46 ARM/O3: store the result of the predicate evaluation in DynInst or Threadstate.
THis allows the CPU to handle predicated-false instructions accordingly.
This particular patch makes loads that are predicated-false to be sent
straight to the commit stage directly, not waiting for return of the data
that was never requested since it was predicated-false.
2010-08-23 11:18:40 -05:00
Ali Saidi
1d1837ee98 CPU: Set a default value when readBytes faults.
This was being done in read(), but if readBytes was called directly it
wouldn't happen. Also, instead of setting the memory blob being read to -1
which would (I believe) require using memset with -1 as a parameter, this now
uses bzero. It's hoped that it's more specialized behavior will make it
slightly faster.
2010-08-23 11:18:39 -05:00
Brad Beckmann
908364a1c9 ruby: Fixed minor bug in ruby test for setting the request type 2010-08-20 11:46:14 -07:00
Brad Beckmann
6a4f99899b ruby: Resurrected Ruby's deterministic tests
Added the request series and invalidate deterministic tests as new cpu models
and removed the no longer needed ruby tests

--HG--
rename : configs/example/rubytest.py => configs/example/determ_test.py
rename : src/mem/ruby/tester/DetermGETXGenerator.cc => src/cpu/directedtest/DirectedGenerator.cc
rename : src/mem/ruby/tester/DetermGETXGenerator.hh => src/cpu/directedtest/DirectedGenerator.hh
rename : src/mem/ruby/tester/DetermGETXGenerator.cc => src/cpu/directedtest/InvalidateGenerator.cc
rename : src/mem/ruby/tester/DetermGETXGenerator.hh => src/cpu/directedtest/InvalidateGenerator.hh
rename : src/cpu/rubytest/RubyTester.cc => src/cpu/directedtest/RubyDirectedTester.cc
rename : src/cpu/rubytest/RubyTester.hh => src/cpu/directedtest/RubyDirectedTester.hh
rename : src/mem/ruby/tester/DetermGETXGenerator.cc => src/cpu/directedtest/SeriesRequestGenerator.cc
rename : src/mem/ruby/tester/DetermGETXGenerator.hh => src/cpu/directedtest/SeriesRequestGenerator.hh
2010-08-20 11:46:13 -07:00
Brad Beckmann
808701a10c memtest: Memtester support for DMA
This patch adds DMA testing to the Memtester and is inherits many changes from
Polina's old tester_dma_extension patch.  Since Ruby does not work in atomic
mode, the atomic mode options are removed.
2010-08-20 11:46:12 -07:00
Gabe Black
c4ba6967a5 Inorder: Fix compilation of m5.fast.
printMemData is only used in DPRINTFs. If those are removed by compiling
m5.fast, that function is unused, gcc generates a warning, that gets turned
into an error, and the build fails. This change surrounds the function
definition with #if TRACING_ON so it only gets compiled in if the DPRINTFs do
to.
2010-08-14 01:00:45 -07:00
Gabe Black
961aafc044 Merge with head. 2010-08-13 06:16:30 -07:00
Gabe Black
aa8c6e9c95 CPU: Add readBytes and writeBytes functions to the exec contexts. 2010-08-13 06:16:02 -07:00
Gabe Black
65dbcc6ea1 InOrder: Clean up some DPRINTFs that print data sent to/from the cache. 2010-08-13 06:16:00 -07:00
Gabe Black
52a90a5998 CPU: Tidy up endianness handling for mmapped "IPR"s. 2010-08-13 06:10:45 -07:00
Joel Hestness
53c241fc16 TimingSimpleCPU: fix NO_ACCESS memory op handling
When a request is NO_ACCESS (x86 CDA microinstruction), the memory op
doesn't go to the cache, so TimingSimpleCPU::completeDataAccess needs
to handle the case where the current status of the CPU is Running
and not DcacheWaitResponse or DTBWaitResponse
2010-08-12 17:16:02 -07:00
Timothy M. Jones
607f519800 LSQ Unit: After deleting part of a split request, set it to NULL so that it
isn't accidentally deleted again later (causing a segmentation fault).
2010-07-22 18:54:37 +01:00
Timothy M. Jones
e50a880297 O3CPU: Fix a bug where stores in the cpu where never marked as split. 2010-07-22 18:52:02 +01:00
Timothy M. Jones
9a3533ec84 O3CPU: O3's tick event gets squashed when it is switched out. When repeatedly
switching between O3 and another CPU, O3's tick event might still be scheduled
in the event queue (as squashed).  Therefore, check for a squashed tick event
as well as a non-scheduled event when taking over from another CPU and deal
with it accordingly.
2010-07-22 18:47:43 +01:00
Korey Sewell
84489c5874 inorder: remove another debug stat 2010-06-28 07:33:33 -04:00
Korey Sewell
792c18a1fc inorder: remove debugging stat
m5 doesnt do stats specific to binary and this resource request stat is probably only
useful for people who really know the ins/outs of the model anyway
2010-06-26 09:41:39 -04:00
Korey Sewell
868181f24d inorder: Return Address Stack bug
the nextPC was getting sent to the branch predictor not the current PC, so
the RAS was returning the wrong PC and mispredicting everything.
2010-06-25 17:42:35 -04:00
Korey Sewell
6bfd766f2c inorder: resource scheduling backend
replace priority queue with vector of lists(1 list per stage) and place inside a class
so that we have more control of when an instruction uses a particular schedule entry
...
also, this is the 1st step toward making the InOrderCPU fully parameterizable. See the
wiki for details on this process
2010-06-25 17:42:34 -04:00
Korey Sewell
71b67d408b inorder: cleanup virtual functions
remove the annotation 'virtual' from  function declaration that isnt being derived from
2010-06-24 15:34:19 -04:00
Korey Sewell
f95430d97e inorder: enforce 78-character rule 2010-06-24 15:34:12 -04:00
Korey Sewell
ecba3074c2 inorder: exe_unit_stats for resolved branches 2010-06-24 13:58:27 -04:00
Korey Sewell
1a73764403 inorder: squash from memory stall
this applies to multithreading models which would like to squash a thread on memory stall
2010-06-23 22:09:49 -04:00
Korey Sewell
1f778b3583 inorder: record load/store trace data 2010-06-23 18:21:12 -04:00
Korey Sewell
defab3ffd5 inorder: update branch predictor
- use InOrderBPred instead of Resource for DPRINTFs
- account for DELAY SLOT in updating RAS and in squashing
- don't let squashed instructions update the predictor
- the BTB needs to use the ASID not the TID to work for multithreaded programs
- add stats for BTB hits
2010-06-23 18:19:18 -04:00
Korey Sewell
9f0d8f252c inorder-stats: add instruction type stats
also, remove inst-req stats as default.good for debugging
but in terms of pure processor stats they aren't useful
2010-06-23 18:18:20 -04:00
Korey Sewell
39ac4dce04 inorder: stall signal handling
remove stall only when necessary
add debugging printfs
2010-06-23 18:15:23 -04:00
Korey Sewell
7695d4c63f inorder: tick scheduling
use nextCycle to calculate ticks after addition
2010-06-23 18:14:59 -04:00
Timothy M. Jones
96767fc721 O3ThreadContext: When taking over from a previous context, only assert that
the system pointers match in Full System mode.
2010-06-23 00:53:17 +01:00
Nathan Binkert
54d813adca stats: get rid of the never-really-used event stuff 2010-06-14 23:24:46 -07:00
Nathan Binkert
3df84fd8a0 ruby: get rid of the Map class 2010-06-10 23:17:07 -07:00
Nathan Binkert
006818aeea ruby: get rid of Vector and use STL
add a couple of helper functions to base for deleteing all pointers in
a container and outputting containers to a stream
2010-06-10 23:17:07 -07:00
Steve Reinhardt
f92e91e853 Minor remote GDB cleanup.
Expand the help text on the --remote-gdb-port option so
people know you can use it to disable remote gdb without
reading the source code, and thus don't waste any time
trying to add a separate option to do that.
Clean up some gdb-related cruft I found while looking
for where one would add a gdb disable option, before
I found the comment that told me that I didn't need
to do that.
2010-06-03 16:54:26 -07:00
Gabe Black
05bd3eb4ec ARM: Implement support for the IT instruction and the ITSTATE bits of CPSR. 2010-06-02 12:58:16 -05:00
Ali Saidi
cb9936cfde ARM: Implement the ARM TLB/Tablewalker. Needs performance improvements. 2010-06-02 12:58:16 -05:00
Ali Saidi
b8ec214553 ARM: Implement ARM CPU interrupts 2010-06-02 12:58:16 -05:00
Ali Saidi
5e6d28996a ARM: Move PC mode bits around so they can be used for exectrace 2010-06-02 12:58:13 -05:00
Gabe Black
d149e43c41 Simple CPU: Make the FloatRegs trace flag do something. 2010-06-02 12:58:12 -05:00
Ali Saidi
b504b44b2f CPU: Reset fetch offset after a exception 2010-06-02 12:58:12 -05:00
Gabe Black
96be7e16c1 ARM: Make the predecoder handle Thumb instructions. 2010-06-02 12:58:00 -05:00
Maximilien Breughe
fc746c2268 BPRED: Fixed the treshold-bug in the tournament predictor.
Suppose the saturating counters of a branch predictor contain n bits.  When the
counter is between 0 and (2^(n-1) - 1), boundaries included, the branch is
predicted as not taken.  When the counter is between 2^(n-1) and (2^n - 1),
boundaries included, the branch is predicted as taken.
2010-05-13 23:45:57 -04:00
Nathan Binkert
e99828b06a tick: rename Clock namespace to SimClock 2010-04-15 16:24:12 -07:00
Korey Sewell
b49511ae48 inorder: timing for inst forwarding
when insts execute, they mark the time they finish to be used for subsequent isnts
they may need forwarding of data. However, the regdepmap was using the wrong
value to index into the destination operands of the instruction to be forwarded.
Thus, in some cases, we are checking to see if the 3rd destination register
for an instruction is executed at a certain time, when there is only 1 dest. register
valid. Thus, we get a bad, uninitialized time value that will stall forwarding
causing performance loss but still the correct execution.
2010-04-10 23:31:36 -04:00
Nathan Binkert
141f61d83a ruby: get rid of gems_common/util.hh and .cc and use stuff in src/base 2010-04-02 11:20:32 -07:00
Nathan Binkert
f1c3f3044b ruby: get "using namespace" out of headers
In addition to obvious changes, this required a slight change to the slicc
grammar to allow types with :: in them.  Otherwise slicc barfs on std::string
which we need for the headers that slicc generates.
2010-04-02 11:20:32 -07:00
Nathan Binkert
60ae1d2b10 style: cleanup the Ruby Tester 2010-03-29 20:39:02 -04:00
Korey Sewell
1c98bc5a56 m5: merge inorder updates 2010-03-27 02:23:00 -04:00
Korey Sewell
ac316d45e8 inorder: write-hints bug fix
make sure to only read 1 src reg. for write-hint and any other similar
'store' instruction. Reading the source reg when its not necessary
can cause the simulator to read from uninitialized values
2010-03-27 01:40:05 -04:00
Timothy M. Jones
6b293c73fd CPU: Added comments to address translation classes. 2010-03-25 12:43:52 +00:00
Steve Reinhardt
f066bfc2f5 cpu: get rid of uncached access "events"
These recordEvent() calls could cause crashes since they
access the req pointer after it's potentially been
deleted during a failed translation call.  (Similar
problem to the traceData bug fixed in the previous cset.)

Moving them above the translation call (as was done
recentlyi in cset 8b2b8e5e7d35) avoids the crash
but doesn't work, since at that point we don't know if
the access is uncached or not.

It's not clear why these calls are there, and no one
seems to use them, so we'll just delete them.  If they
are needed, they should be moved to somewhere that's
guaranteed to be after the translation completes but
before the request is possibly deleted, e.g., in
finishTranslation().
2010-03-23 08:50:59 -07:00
Steve Reinhardt
4d77ea7a57 cpu: fix exec tracing memory corruption bug
Accessing traceData (to call setAddress() and/or setData())
after initiating a timing translation was causing crashes,
since a failed translation could delete the traceData
object before returning.

It turns out that there was never a need to access traceData
after initiating the translation, as the traced data was
always available earlier; this ordering was merely
historical.  Furthermore, traceData->setAddress() and
traceData->setData() were being called both from the CPU
model and the ISA definition, often redundantly.

This patch standardizes all setAddress and setData calls
for memory instructions to be in the CPU models and not
in the ISA definition.  It also moves those calls above
the translation calls to eliminate the crashes.
2010-03-23 08:50:57 -07:00
Korey Sewell
2620e08722 inorder: import name for addtl. bpred stats 2010-03-22 17:19:48 -04:00
Maximilien Breughe
0170e851de inorder: fix squash bug in branch predictor 2010-03-22 16:59:12 -04:00
Korey Sewell
4ac245737d inorder: fix address list bug 2010-03-22 15:38:28 -04:00
Brad Beckmann
4ee3b0da45 TimingSimpleCPU: Fixed uncacacheable request read bug
Previously the recording of an uncached read occurred after the request was
possibly deleted within the translateTiming function.
2010-03-21 21:22:20 -07:00
Nathan Binkert
140785d24c ruby: get rid of std-includes.hh
Do not use "using namespace std;" in headers
Include header files as needed
2010-03-10 18:33:11 -08:00
Nathan Binkert
f0b4259e98 cpu_models: get rid of cpu_models.py and move the stuff into SCons 2010-02-26 18:14:48 -08:00
Timothy M. Jones
a5feaa6a69 BaseDynInst: Preserve the faults returned from read and write.
When implementing timing address translations instead of atomic, I
forgot to preserve the faults that are returned from the read and
write calls.  This patch reinstates them.
2010-02-20 20:11:58 +00:00
Timothy M. Jones
29e8bcead5 O3PCU: Split loads and stores that cross cache line boundaries.
When each load or store is sent to the LSQ, we check whether it will cross a
cache line boundary and, if so, split it in two. This creates two TLB
translations and two memory requests. Care has to be taken if the first
packet of a split load is sent but the second blocks the cache. Similarly,
for a store, if the first packet cannot be sent, we must store the second
one somewhere to retry later.

This modifies the LSQSenderState class to record both packets in a split
load or store.

Finally, a new const variable, HasUnalignedMemAcc, is added to each ISA
to indicate whether unaligned memory accesses are allowed. This is used
throughout the changed code so that compiler can optimise away code dealing
with split requests for ISAs that don't need them.
2010-02-12 19:53:20 +00:00
Timothy M. Jones
7fe9f92cfc BaseDynInst: Make the TLB translation timing instead of atomic.
This initiates a timing translation and passes the read or write on to the
processor before waiting for it to finish. Once the translation is finished,
the instruction's state is updated via the 'finish' function. A new
DataTranslation class is created to handle this.

The idea is taken from the implementation of timing translations in
TimingSimpleCPU by Gabe Black. This patch also separates out the timing
translations from this CPU and uses the new DataTranslation class.
2010-02-12 19:53:19 +00:00
Korey Sewell
c7f6e2661c inorder: double delete inst bug
Make sure that instructions are dereferenced/deleted twice by marking they are
on the remove list
2010-01-31 18:30:59 -05:00
Korey Sewell
9357e353fc inorder: inst count mgmt 2010-01-31 18:30:48 -05:00
Korey Sewell
be6724f7e7 inorder: implement split stores 2010-01-31 18:30:43 -05:00
Korey Sewell
6939482c49 inorder: implement split loads 2010-01-31 18:30:35 -05:00
Korey Sewell
ea8909925f inorder: add activity stats 2010-01-31 18:30:24 -05:00
Korey Sewell
f3bc2df663 inorder: object cleanup in destructors 2010-01-31 18:30:08 -05:00
Korey Sewell
1a89e8f4cb inorder: user per-thread dummy insts/reqs 2010-01-31 18:29:59 -05:00
Korey Sewell
002f1b8b7e inorder: add execution unit stats 2010-01-31 18:29:49 -05:00
Korey Sewell
82c5a754e6 inorder: recvRetry bug fix
- on certain retry requests you can get an assertion failure
- fix by allowing the request to literally "Retry" itself
  if it wasnt successful before, and then block any requests
  through cache port while waiting for the cache to be
  made available for access
2010-01-31 18:29:18 -05:00
Korey Sewell
349d86c0e4 inorder-stats: add prereq to basic stat
only show requests processed when the resource is actually in use
2010-01-31 18:29:06 -05:00
Korey Sewell
0b29c2d057 inorder: ctxt switch stats
- m5 line enforcement on use_def.cc,hh
2010-01-31 18:28:59 -05:00
Korey Sewell
ffa9ecb1fa inorder: pipeline stage stats
add idle/run/utilization stats for each pipeline stage
2010-01-31 18:28:51 -05:00
Korey Sewell
4d749472e3 inorder: enforce stage bandwidth
each stage keeps track of insts_processed on a per_thread basis but we should
be keeping that on a total basis inorder to enforce stage width limits
2010-01-31 18:28:31 -05:00
Korey Sewell
b4e0ef7837 inorder: set thread status'
set Active/Suspended/Halted status for threads.  useful for system when determining
if/when to exit simulation
2010-01-31 18:28:12 -05:00
Korey Sewell
5e0b8337ed inorder: add/remove halt/deallocate context respectively
Halt is called from the exit() system call while
deallocate is unused. So to clear up things, just
use halt and remove deallocate.
2010-01-31 18:28:05 -05:00
Korey Sewell
069b38c0d5 inorder: track last branch committed
when threads are switching in/out the CPU, we need to keep
track of special cases like branches. Add appropriate
variables in ThreadState t track this and then use
these variables when updating pc after context switch
2010-01-31 18:27:58 -05:00
Korey Sewell
aacc5cb205 inorder: add updatePC event to resPool
this will be used for when a thread comes back from a cache miss, it needs to update the PCs
because the inst might of been a branch or delayslot in which the next PC isnt always
a straight addition
2010-01-31 18:27:49 -05:00
Korey Sewell
90d3b45a56 inorder: ready thread wakeup
allow a thread to wakeup and be activated after
it has been in suspended state and another
thread is switched out. Need to give
pipeline stages a "activateThread" function
so that can get to their suspended instruction
when the time is right.
2010-01-31 18:27:38 -05:00
Korey Sewell
3eb04b4ad7 inorder: add threadmodel flag
this prints out messages relative to what
threading model is being used (smt, switch-on-miss, single, etc.)
2010-01-31 18:27:25 -05:00
Korey Sewell
611a8642c2 inorder: mem. mgmt. update
update address List and address Map to take
into account multiple threads
2010-01-31 18:27:12 -05:00
Korey Sewell
4dbc2f1718 inorder: suspend in respool
give resources their own specific
activity to do for a "suspend" event
instead of defaulting to deactivating the thread for a
suspend thread event. This really matters
for the fetch sequence unit which wants to remove the
thread from fetching while other units want to
ignore a thread suspension. If you deactivate a thread
in a resource then you may lose some of the allotted
bandwidth that the thread is taking up...
2010-01-31 18:27:02 -05:00
Korey Sewell
4ea296e296 inorder: fetch thread bug
dont check total # of threads but instead all
active threads
2010-01-31 18:26:54 -05:00
Korey Sewell
96b493d315 inorder: ready/suspend status fns
update/add in the use of isThreadReady & isThreadSuspended
functions.Check in activateThread what list a thread is
on so it can be managed accordingly.
2010-01-31 18:26:47 -05:00
Korey Sewell
d9eaa2fe21 inorder-cleanup: remove unused thread functions 2010-01-31 18:26:40 -05:00
Korey Sewell
e1fcc64980 inorder: activate thread on cache miss
-Support ability to activate next ready thread after a cache miss
through the activateNextReadyContext/Thread() functions
-To support this a "readyList" of thread ids is added
-After a cache miss, thread will suspend and then call
activitynextreadythread
2010-01-31 18:26:32 -05:00
Korey Sewell
4a945aab19 inorder: add event priority offset
allow for events to schedule themselves later if desired. this is important
because of cases like where you need to activate a thread only after the previous
thread has been deactivated. The ordering there has to be enforced
2010-01-31 18:26:26 -05:00
Korey Sewell
eac5eac67a inorder: squash on memory stall
add code to recognize memory stalls in resources and the pipeline as well
as squash a thread if there is a stall and we are in the switch on cache miss
model
2010-01-31 18:26:13 -05:00
Korey Sewell
d8e0935af2 inorder: add insts to cpu event
some events are going to need instruction data when they process, so just
include the instruction in the event construction
2010-01-31 18:26:03 -05:00
Korey Sewell
e8312ab6f7 inorder: switch out buffer
add buffer for instructions to switch out to in a pipeline stage
can't squash the instruction and remove the pipeline so we kind of need
to 'suspend' an instruction at the stage while the memory stall resolves
for the switch on cache miss model
2010-01-31 18:25:48 -05:00
Korey Sewell
a892af7b26 inorder: dont allow early loads
- loads were happening on same cycle as the address was generated which is slightly
unrealistic. Instead, force address generation to be on separate cycle from load
initiation
- also, mark the stages in a more traditional way (F-D-X-M-W)
2010-01-31 18:25:27 -05:00
Korey Sewell
0e96798fe0 configs/inorder: add options for switch-on-miss to inorder cpu 2010-01-31 18:25:13 -05:00
Korey Sewell
7b3b362ba5 inorder: init internal debug cpu counters
- cpuEventNum
- resReqCount
2010-01-31 17:18:15 -05:00
Brad Beckmann
45230a4f6b ruby: added the GEMS ruby tester 2010-01-29 20:29:23 -08:00
Lisa Hsu
9f63548478 since totalInstructions() is impl'ed by all the cpus, make it an abstract base class. 2010-01-12 10:22:46 -08:00
Brad Beckmann
b5d2052fa0 m5: Fixed bug in atomic cpu destructor 2009-11-18 13:55:58 -08:00
Gabe Black
b8120f6c38 Mem: Eliminate the NO_FAULT request flag. 2009-11-10 21:10:18 -08:00
Nathan Binkert
2c5fe6f95e build: fix compile problems pointed out by gcc 4.4 2009-11-04 16:57:01 -08:00
Steve Reinhardt
fbfe92b5b8 o3: get rid of unused physmem pointer 2009-11-04 14:23:25 -08:00
Timothy M. Jones
835a55e7f3 POWER: Add support for the Power ISA
This adds support for the 32-bit, big endian Power ISA. This supports both
integer and floating point instructions based on the Power ISA Book I v2.06.
2009-10-27 09:24:39 -07:00
Gabe Black
010b13c937 ISA: Fix compilation. 2009-10-17 01:13:41 -07:00
Brad Beckmann
28204b2a96 fixed MC146818 checkpointing bug and added isa serialization calls to simple_thread 2009-10-15 15:15:24 -07:00
Korey Sewell
f09f84da6e inorder-debug: print out workload 2009-10-01 09:35:06 -04:00
Lisa Hsu
1290a5f340 commit Soumyaroop's bug catch about max_insts_all_threads 2009-09-29 18:03:10 -04:00
Steve Reinhardt
4bec4702e9 O3: Add flag to control whether faulting instructions are traced.
When enabled, faulting instructions appear in the trace twice
(once when they fault and again when they're re-executed).
This flag is set by the Exec compound flag for backwards compatibility.
2009-09-26 10:50:50 -07:00
Steve Reinhardt
f28ea7a6c9 O3: Mark fetch stage as active if it faults.
Otherwise if the rest of the pipeline is idle then
fault will never propagate to commit to be handled,
causing CPU to deadlock.
2009-09-26 10:50:50 -07:00
Korey Sewell
25d1f2728a inorder-debug: fix cpu tick debug message 2009-09-25 11:18:55 -04:00
Nathan Binkert
d9f39c8ce7 arch: nuke arch/isa_specific.hh and move stuff to generated config/the_isa.hh 2009-09-23 08:34:21 -07:00
Nathan Binkert
9a8cb7db7e python: Move more code into m5.util allow SCons to use that code.
Get rid of misc.py and just stick misc things in __init__.py
Move utility functions out of SCons files and into m5.util
Move utility type stuff from m5/__init__.py to m5/util/__init__.py
Remove buildEnv from m5 and allow access only from m5.defines
Rename AddToPath to addToPath while we're moving it to m5.util
Rename read_command to readCommand while we're moving it
Rename compare_versions to compareVersions while we're moving it.

--HG--
rename : src/python/m5/convert.py => src/python/m5/util/convert.py
rename : src/python/m5/smartdict.py => src/python/m5/util/smartdict.py
2009-09-22 15:24:16 -07:00
Korey Sewell
6f7e196113 inorder-mdu: multiplier latency fix
mdu was workign incorrectly for 4+ latency due to incorrectly assuming
multiply was finished the next stage
2009-09-17 15:45:27 -04:00
Soumyaroop Roy
83eebe0464 inorder-smt: remove hardcoded values
allows for the 2T hello world example to work in inorder model
2009-09-16 09:47:38 -04:00
Korey Sewell
badb2382a8 inorder-alpha-fs: edit inorder model to compile FS mode 2009-09-15 01:44:48 -04:00
Polina Dudnik
ca0e0c3683 SCons fix to always make MemTest object 2009-09-01 10:38:25 -05:00
Gabe Black
ce63e50364 Atomic CPU: Respect the NO_ACCESS request flag. 2009-08-23 14:15:15 -07:00
Steve Reinhardt
a13a706a20 Fix setting of INST_FETCH flag for O3 CPU.
It's still broken in inorder.
Also enhance DPRINTFs in cache and physical memory so we
can see more easily whether it's getting set or not.
2009-08-01 22:50:14 -07:00
Gabe Black
2871a13ab3 Simple CPU: Make the simple CPU handle the IntRegs trace flag. 2009-07-29 00:15:26 -07:00
Gabe Black
8ec235c7b1 ARM: Make native trace print out what instruction caused an error. 2009-07-27 00:54:09 -07:00
Korey Sewell
44f80e7ca5 o3-smt: enforce numThreads parameter for SMT SE mode 2009-07-25 00:50:27 -04:00
Gabe Black
3e8e813218 CPU: Separate out native trace into ISA (in)dependent code and SimObjects.
--HG--
rename : src/cpu/nativetrace.cc => src/arch/sparc/nativetrace.cc
rename : src/cpu/nativetrace.hh => src/arch/sparc/nativetrace.hh
rename : src/cpu/NativeTrace.py => src/arch/x86/X86NativeTrace.py
2009-07-19 23:54:56 -07:00
Gabe Black
c9a27d85b9 Get rid of the unused get(Data|Inst)Asid and (inst|data)Asid functions. 2009-07-08 23:02:22 -07:00
Gabe Black
b398b8ff1b Registers: Add a registers.hh file as an ISA switched header.
This file is for register indices, Num* constants, and register types.
copyRegs and copyMiscRegs were moved to utility.hh and utility.cc.

--HG--
rename : src/arch/alpha/regfile.hh => src/arch/alpha/registers.hh
rename : src/arch/arm/regfile.hh => src/arch/arm/registers.hh
rename : src/arch/mips/regfile.hh => src/arch/mips/registers.hh
rename : src/arch/sparc/regfile.hh => src/arch/sparc/registers.hh
rename : src/arch/x86/regfile.hh => src/arch/x86/registers.hh
2009-07-08 23:02:21 -07:00
Gabe Black
5c37d10624 Registers: Eliminate the ISA defined RegFile class. 2009-07-08 23:02:21 -07:00
Gabe Black
43345bff6c Registers: Move the PCs out of the ISAs and into the CPUs. 2009-07-08 23:02:21 -07:00
Gabe Black
1b29f1621d ARM, Simple CPU: Fix an index and add assert checks. 2009-07-08 23:02:21 -07:00
Gabe Black
a480ba00b9 Registers: Eliminate the ISA defined integer register file. 2009-07-08 23:02:20 -07:00
Gabe Black
0cb180ea0d Registers: Eliminate the ISA defined floating point register file. 2009-07-08 23:02:20 -07:00
Gabe Black
25884a8773 Registers: Get rid of the float register width parameter. 2009-07-08 23:02:20 -07:00
Gabe Black
32daf6fc3f Registers: Add an ISA object which replaces the MiscRegFile.
This object encapsulates (or will eventually) the identity and characteristics
of the ISA in the CPU.
2009-07-08 23:02:20 -07:00
Nathan Binkert
6faf377b53 types: clean up types, especially signed vs unsigned 2009-06-04 23:21:12 -07:00
Nathan Binkert
4e34266245 move: put predictor includes and cc files into the same place
--HG--
rename : src/cpu/2bit_local_pred.cc => src/cpu/pred/2bit_local.cc
rename : src/cpu/o3/2bit_local_pred.hh => src/cpu/pred/2bit_local.hh
rename : src/cpu/btb.cc => src/cpu/pred/btb.cc
rename : src/cpu/o3/btb.hh => src/cpu/pred/btb.hh
rename : src/cpu/ras.cc => src/cpu/pred/ras.cc
rename : src/cpu/o3/ras.hh => src/cpu/pred/ras.hh
rename : src/cpu/tournament_pred.cc => src/cpu/pred/tournament.cc
rename : src/cpu/o3/tournament_pred.hh => src/cpu/pred/tournament.hh
2009-06-04 21:50:20 -07:00
Nathan Binkert
47877cf2db types: add a type for thread IDs and try to use it everywhere 2009-05-26 09:23:13 -07:00
Nathan Binkert
8d2e51c7f5 includes: sort includes again 2009-05-17 14:34:52 -07:00
Nathan Binkert
eef3a2e142 types: Move stuff for global types into src/base/types.hh
--HG--
rename : src/sim/host.hh => src/base/types.hh
2009-05-17 14:34:50 -07:00
Korey Sewell
a032d91016 cpus: add InOrderCPU to default build
regressions need this so they build the model
2009-05-12 20:55:21 -04:00
Korey Sewell
6c88730540 inorder-resources: delete events
make sure unrecognized events in the resource pool are deleted and also delete resource events in destructor
2009-05-12 15:01:16 -04:00
Korey Sewell
db2b721380 inorder-tlb-cunit: merge the TLB as implicit to any memory access
TLBUnit no longer used and we also get rid of memAccSize and memAccFlags functions added to ISA and StaticInst
since TLB is not a separate resource to acquire. Instead, TLB access is done before any read/write to memory
and the result is checked before it's sent out to memory.
* * *
2009-05-12 15:01:16 -04:00
Korey Sewell
3a057bdbb1 inorder-tlb: squash insts in TLB correctly
TLB had a bug where if it was stalled and waiting , it would not squash all instructions older than squashed instruction correctly
* * *
2009-05-12 15:01:16 -04:00
Korey Sewell
f1c97e830b inorder-faults: ignore unalign translation faults for prefetches 2009-05-12 15:01:16 -04:00
Korey Sewell
fe4cd9847d inorder-stc: update interface to handle store conditionals 2009-05-12 15:01:15 -04:00
Korey Sewell
6211fe5d2e inorder-float: Fix storage of FP results
inorder was incorrectly storing FP values and confusing the integer/fp storage view of floating point operations. A big issue was knowing trying to infer when were doing single or double precision access
because this lets you know the size of value to store (32-64 bits). This isnt exactly straightforward since alpha uses all 64-bit regs while mips/sparc uses a dual-reg view. by getting this value from
the actual floating point register file, the model can figure out what it needs to store
2009-05-12 15:01:15 -04:00
Korey Sewell
3603dd25ef inorder-fetch: update model to use predecoder 2009-05-12 15:01:15 -04:00
Korey Sewell
c9a03f549b inorder-mem: clean up allocation/deletion of requests/packets
* * *
2009-05-12 15:01:15 -04:00
Korey Sewell
1c7e988272 inorder-mem: skeleton support for prefetch/writehints 2009-05-12 15:01:15 -04:00
Korey Sewell
f41df0ee08 inorder-o3: allow both to compile together
allow InOrder and O3CPU to be compiled at the same time: need to make branch prediction filed shared by both models
2009-05-12 15:01:14 -04:00
Korey Sewell
5127ea226a inorder-unified-tlb: use unified TLB instead of old TLB model 2009-05-12 15:01:14 -04:00
Korey Sewell
98b1452058 inorder-miscregs: Fix indexing for misc. reg operands and update result-types for better tracing of these types of values 2009-05-12 15:01:14 -04:00
Korey Sewell
2012202b06 inorder/alpha-isa: create eaComp object visible to StaticInst through ISA
Remove subinstructions eaComp/memAcc since unused in CPU Models. Instead, create eaComp that is visible from StaticInst object. Gives InOrder model capability of generating address without actually initiating access
* * *
2009-05-12 15:01:14 -04:00
Korey Sewell
b569f8f0ed inorder-bpred: edits to handle non-delay-slot ISAs
Changes so that InOrder can work for a non-delay-slot ISA like Alpha. Typically, changes have to do with handling misspeculated branches at different points in pipeline
2009-05-12 15:01:14 -04:00
Korey Sewell
1c8dfd9254 inorder-alpha-port: initial inorder support of ALPHA
Edit AlphaISA to support the inorder model. Mostly alternate constructor functions and also a few skeleton multithreaded support functions
* * *
Remove namespace from header file. Causes compiler issues that are hard to find
* * *
Separate the TLB from the CPU and allow it to live in the TLBUnit resource. Give CPU accessor functions for access and also bind at construction time
* * *
Expose memory access size and flags through instruction object
(temporarily memAccSize and memFlags to get TLB stuff working.)
2009-05-12 15:01:13 -04:00
Korey Sewell
9f90291c54 cpus: fix cpu progress event
this was double scheduling itself (once in constructor and once in cpu code). also add support for stopping / starting
progress events through repeatEvent flag and also changing the interval of the progress event as well
2009-05-05 02:39:05 -04:00
Nathan Binkert
50f1570352 arm: Unify the ARM tlb. We forgot about this when we did the rest.
This code compiles, but there are no tests still
2009-04-21 15:40:25 -07:00
Steve Reinhardt
3083268d60 request: rename INST_READ to INST_FETCH. 2009-04-20 18:54:02 -07:00
Gabe Black
bd6f2bb538 Mem: Change isLlsc to isLLSC. 2009-04-19 21:44:15 -07:00
Gabe Black
1a8a765a5c CPUs: Make the atomic CPU support locked memory accesses. 2009-04-19 04:50:07 -07:00
Gabe Black
3e5f487663 Memory: Rename LOCKED for load locked store conditional to LLSC. 2009-04-19 04:25:01 -07:00
Gabe Black
d10195b1a4 CPU: If the simple CPU is already idle, just return from suspendContext, don't assert. 2009-04-19 02:23:29 -07:00
Korey Sewell
5c1742b822 o3-delay-slot-bpred: fix decode stage handling of uncdtl. branches.\n decode stage was not setting the predicted PC correctly or passing that information back to fetch correctly 2009-04-18 10:42:29 -04:00
Steve Reinhardt
14808ecac9 o3, inorder: fix FS bug due to initializing ThreadState to Halted.
For some reason o3 FS init() only called initCPU if the thread state
was Suspended, which was no longer the case.  There's no apparent
reason to check, so I whacked the test completely rather than
changing the check to Halted.
The inorder init() was also updated to be symmetric, though the
previous code was just a fancy no-op.
2009-04-17 16:54:58 -07:00
Steve Reinhardt
b146131d18 o3: handle fetch with no active threads correctly.
This situation can arise now on the first fetch cycle after
the last active thread is halted.  It seems easy enough to
deal with when it happens rather than trying to avoid it.
2009-04-15 23:12:00 -07:00
Steve Reinhardt
bb974d5a47 o3: fix {read,set}ArchFloatReg* functions.
Register indices were not being calculated properly.
2009-04-15 23:10:43 -07:00
Steve Reinhardt
7617dcf736 ThreadState: initialize status to Halted in constructor.
This provides a common initial status for all threads independent
of CPU model (unlike the prior situation where CPUs initialized
threads to inconsistent states).
This mostly matters for SE mode; in FS mode, ISA-specific startupCPU()
methods generally handle boot-time initialization of thread contexts
(since the right thing to do is ISA-dependent).
2009-04-15 13:18:24 -07:00
Steve Reinhardt
8882dc1283 Get rid of the Unallocated thread context state.
Basically merge it in with Halted.
Also had to get rid of a few other functions that
called ThreadContext::deallocate(), including:
 - InOrderCPU's setThreadRescheduleCondition.
 - ThreadContext::exit().  This function was there to avoid terminating
   simulation when one thread out of a multi-thread workload exits, but we
   need to find a better (non-cpu-centric) way.
2009-04-15 13:13:47 -07:00
Nathan Binkert
e0de2c3443 tlb: More fixing of unified TLB 2009-04-08 22:21:27 -07:00
Gabe Black
7b5a96f06b tlb: Don't separate the TLB classes into an instruction TLB and a data TLB 2009-04-08 22:21:27 -07:00
Steve Reinhardt
61ff48a1f8 cpu: fix minor endian issue with trace output
(no functional change)
2009-03-11 23:05:24 -07:00
Nathan Binkert
ac7bda0212 stats: fix duplicate statistics names.
This generally requires providing a more meaningful name() function for a
class.
2009-03-07 14:30:54 -08:00
Nathan Binkert
cc95b57390 stats: Fix all stats usages to deal with template fixes 2009-03-05 19:09:53 -08:00
Steve Reinhardt
e3d6e8882e Get rid of 'using namespace' declarations in headers. 2009-03-05 17:15:31 -08:00
Korey Sewell
9e1dc7f205 InOrderCPU: Clean up Constructors to initialize variables correctly (i.e. in a way for the compiler to play *nice*) 2009-03-04 22:37:45 -05:00
Korey Sewell
7c8d544216 Give each resource in InOrder it's own TraceFlag instead of just standard 'Resource' flag 2009-03-04 13:17:09 -05:00
Korey Sewell
30cd2d21fa Remove unused functions/comments cluttering up the code. 2009-03-04 13:17:08 -05:00
Korey Sewell
f69b018571 make handling of interstage buffers (i.e. StageQueues) more consistent: (1)number from 0-n, not 1-n+1, (2) always check nextStageValid before a stageNum+1 and prevStageValid for a stageNum-1 reference (3) add skidSize() to get StageQueue size for all threads 2009-03-04 13:17:07 -05:00
Korey Sewell
f98e9161a8 InOrder didnt have all it's params set to a default value, which is now required for M5 objects; Also, a # of values need to be reset to 0 (or the appropriate value) before we assume they are OK for use. 2009-03-04 13:17:05 -05:00
Korey Sewell
846f953c2b Give TimeBuffer an ID that can be set. Necessary because InOrder uses generic stages so w/o an ID there is no way to differentiate buffers when debugging 2009-03-04 13:16:49 -05:00
Korey Sewell
e4aa4ca40c use numCycles instead of simTicks to determine CPI stat in InOrder 2009-03-04 13:16:48 -05:00
Steve Reinhardt
9ee8e685a4 O3: Make numThreads error message more helpful. 2009-03-04 09:25:53 -05:00
Gabe Black
9a000c5173 Processes: Make getting and setting system call arguments part of a process object. 2009-02-27 09:22:14 -08:00
Ali Saidi
d447ccb2c6 CPA: Add code to automatically record function symbols as CPU executes. 2009-02-26 19:29:17 -05:00
Gabe Black
5c546e3504 CPU: Only look up the nearest symbol in the kernel if you're actually in kernel code. 2009-02-25 10:22:36 -08:00
Gabe Black
9940e21fa9 CPU: Add a flag to identify a read barrier to the static inst class. 2009-02-25 10:19:33 -08:00
Gabe Black
da61c4b3ee CPU: Don't fetch when executing a macroop.
If the CPL changes mid macroop, the end of the instruction might not be
priveleged enough to execute the beginning.
2009-02-25 10:18:36 -08:00
Gabe Black
6ed47e9464 CPU: Implement translateTiming which defers to translateAtomic, and convert the timing simple CPU to use it. 2009-02-25 10:16:15 -08:00
Gabe Black
5605079b1f ISA: Replace the translate functions in the TLBs with translateAtomic. 2009-02-25 10:15:44 -08:00
Gabe Black
a1aba01a02 CPU: Get rid of translate... functions from various interface classes. 2009-02-25 10:15:34 -08:00
Nathan Binkert
3fa9812e1d debug: Move debug_break into src/base 2009-02-23 11:48:40 -08:00
Korey Sewell
6c5afe6346 Remove unnecessary building of FreeList/RenameMap in InOrder. Clean-up comments and O3 extensions InOrder Thread Context 2009-02-20 11:02:48 -05:00
Steve Reinhardt
89a7fb0393 Fixes to get prefetching working again.
Apparently we broke it with the cache rewrite and never noticed.
Thanks to Bao Yungang <baoyungang@gmail.com> for a significant part
of these changes (and for inspiring me to work on the rest).
Some other overdue cleanup on the prefetch code too.
2009-02-16 08:56:40 -08:00
Nathan Binkert
f255957b90 style 2009-02-10 22:19:27 -08:00
Korey Sewell
cf4a00ca41 Configs: Add support for the InOrder CPU model 2009-02-10 15:49:29 -08:00
Korey Sewell
973d8b8b13 InOrder: Import new inorder CPU model from MIPS.
This model currently only works in MIPS_SE mode, so it will take some effort
to clean it up and make it generally useful. Hopefully people are willing to
help make that happen!
2009-02-10 15:49:29 -08:00
Korey Sewell
34a5cd8870 ExeTrace: Allow subclasses of the tracer to define their own prefix to dump 2009-02-10 15:49:29 -08:00
Korey Sewell
2d0a66cbc1 CPU: Prepare CPU models for the new in-order CPU model.
Some new functions and forward declarations are necessary to make things work
2009-02-10 15:49:29 -08:00
Gabe Black
7b58511470 CPU: Don't always reset the micro pc on faults. Let the faults handle it. 2009-02-01 00:30:54 -08:00
Gabe Black
7720968949 X86: Make sure the predecoder is cleared out for interrupts. 2009-02-01 00:04:34 -08:00