Makes page table cache scheme actually work
src/mem/page_table.cc:
src/mem/page_table.hh:
fix caching scheme to actually work and improve performance
--HG--
extra : convert_revision : 443a8d8acbee540b26affcfdfbf107b8e735d1bd
into zeep.pool:/z/saidi/work/m5.newmem
src/cpu/simple/base.cc:
hand merge vincent/gabe/my changes to cast sizeof() to a 64bit int
--HG--
extra : convert_revision : eb989b4d65d08057df1777c04b8ee2cfa75a2695
into ahchoo.blinky.homelinux.org:/home/gblack/m5/newmem-x86
src/cpu/simple/base.cc:
Hand merge
--HG--
extra : convert_revision : a2902ef9d917d22ffb9c7dfa2fd444694a65240d
Initialize a temporary variable for thread->readPC() at setupFetchRequest() to reduce function calls.
exec tracing isn't needed for m5.fast binaries
Moved MISCREG_GL, MISCREG_CWP, and MISCREG_TLB_DATA out of switch statement and use if blocks instead.
src/arch/sparc/miscregfile.cc:
Moved MISCREG_GL, MISCREG_CWP, and MISCREG_TLB_DATA out of switch statement and use if blocks instead.
src/cpu/simple/base.cc:
Assign traceData to be NULL at BaseSimpleCPU constructor.
Initialize a temporary variable for thread->readPC() at setupFetchRequest() to reduce function calls.
exec tracing isn't needed for m5.fast binaries
--HG--
extra : convert_revision : 5dc92fff05c9bde994f1e0f1bb40e11c44eb72c6
src/arch/micro_asm.py:
Micro assembler
src/arch/micro_asm_test.py:
Test script for the micro assembler. This probably should go somewhere else eventually.
--HG--
extra : convert_revision : 277fdadec94763ae657f55f501704693b81e0015
src/arch/x86/isa/decoder/one_byte_opcodes.isa:
Give the "MOV" instruction the format of it's arguments. This will likely need to be completely overhauled in the near future.
src/arch/x86/predecoder.cc:
src/arch/x86/predecoder.hh:
Make the predecoder explicitly reset itself rather than counting on it happening naturally.
src/arch/x86/predecoder_tables.cc:
Fix the immediate size table
src/arch/x86/regfile.cc:
nextnpc is bogus
--HG--
extra : convert_revision : 0926701fedaab41817e64bb05410a25174484a5a
Oops... forgot to update call site after changing
function argument semantics.
src/mem/tport.cc:
Oops... forgot to update call site after changing
function argument semantics.
--HG--
extra : convert_revision : 9234b991dc678f062d268ace73c71b3d13dd17dc
- factor out checkFunctional() code so it can be
called from derived classes
- use EventWrapper for sendEvent, move event handling
code from event to port where it belongs
- make sendEvent a pointer so derived classes can
override it
- replace std::pair with new class for readability
--HG--
extra : convert_revision : 5709de2daacfb751a440144ecaab5f9fc02e6b7a
port. It would be better to move this to python IMO but for
now I'll stick in a compatibility hack.
--HG--
extra : convert_revision : a81a29cbd43becd0e485559eb7b2a31f7a0b082d
configs/example/memtest.py:
PhysicalMemory has vector of uniform ports instead of one special one.
Other updates to fix obsolete brokenness.
src/mem/physical.cc:
src/mem/physical.hh:
src/python/m5/objects/PhysicalMemory.py:
Have vector of uniform ports instead of one special one.
src/python/swig/pyobject.cc:
Add comment.
--HG--
extra : convert_revision : a4a764dcdcd9720bcd07c979d0ece311fc8cb4f1
cache blocks that get dmaed ARE NOT marked invalid in the caches so it's a performance issue here
src/mem/bridge.cc:
src/mem/bridge.hh:
hopefully the final hacky change to make the bus bridge work ok
--HG--
extra : convert_revision : 62cbc65c74d1a84199f0a376546ec19994c5899c
src/dev/io_device.cc:
extra printing and assertions
src/mem/bridge.hh:
deal with packets only satisfying part of a request by making many requests
src/mem/cache/cache_impl.hh:
make the cache try to satisfy a functional request from the cache above it before checking itself
--HG--
extra : convert_revision : 1df52ab61d7967e14cc377c560495430a6af266a
set the latency parameter in terms of a latency
add caches to tsunami-simple configs
configs/common/Caches.py:
tests/configs/memtest.py:
tests/configs/o3-timing-mp.py:
tests/configs/o3-timing.py:
tests/configs/simple-atomic-mp.py:
tests/configs/simple-timing-mp.py:
tests/configs/simple-timing.py:
set the latency parameter in terms of a latency
configs/common/FSConfig.py:
give the bridge a default latency too
src/mem/cache/cache_builder.cc:
src/python/m5/objects/BaseCache.py:
remove hit_latency and make latency do the right thing
tests/configs/tsunami-simple-atomic-dual.py:
tests/configs/tsunami-simple-atomic.py:
tests/configs/tsunami-simple-timing-dual.py:
tests/configs/tsunami-simple-timing.py:
add caches to tsunami-simple configs
--HG--
extra : convert_revision : 37bef7c652e97c8cdb91f471fba62978f89019f1
into pb15.local:/Users/ali/work/m5.newmem.zeep
tests/quick/10.linux-boot/ref/alpha/linux/tsunami-simple-timing-dual/m5stats.txt:
tests/quick/10.linux-boot/ref/alpha/linux/tsunami-simple-timing/m5stats.txt:
the new version of this is what we want
--HG--
extra : convert_revision : 204df6f8181df81e423def4695cd81544c485c47
add seperate response buffers and request queue sizes in bus bridge
add delay to respond to a nack in the bus bridge
src/dev/i8254xGBe.cc:
src/dev/ide_ctrl.cc:
src/dev/ns_gige.cc:
src/dev/pcidev.hh:
src/dev/sinic.cc:
add backoff delay parameters
src/dev/io_device.cc:
src/dev/io_device.hh:
add a backoff algorithm when nacks are received.
src/mem/bridge.cc:
src/mem/bridge.hh:
add seperate response buffers and request queue sizes
add a new parameters to specify how long before a nack in ready to go after a packet that needs to be nacked is received
src/mem/cache/cache_impl.hh:
assert on the
src/mem/tport.cc:
add a friendly assert to make sure the packet was inserted into the list
--HG--
extra : convert_revision : 3595ad932015a4ce2bb72772da7850ad91bd09b1
src/base/bitfield.hh:
bit_val was being used directly in the statement in
return. If type B had fewer bits than last, bit_val << last would get
the wrong answer.
--HG--
extra : convert_revision : cbc43ccd139f82ebbd65f30af5d05b87c4edac64
fix the timing cpu to handle receiving a nacked packet
src/cpu/simple/timing.cc:
make the timing cpu handle receiving a nacked packet
src/mem/bridge.cc:
src/mem/bridge.hh:
the bridge never returns false when recvTiming() is called on its ports now, it always returns true and nacks the packet if there isn't sufficient buffer space
--HG--
extra : convert_revision : 5e12d0cf6ce985a5f72bcb7ce26c83a76c34c50a
figure out the block size from devices attached to the bus otherwise use a default block size when no devices that care are attached
configs/common/FSConfig.py:
src/mem/bridge.cc:
src/mem/bridge.hh:
src/python/m5/objects/Bridge.py:
fix partial writes with a functional memory hack
src/mem/bus.cc:
src/mem/bus.hh:
src/python/m5/objects/Bus.py:
figure out the block size from devices attached to the bus otherwise use a default block size when no devices that care are attached
src/mem/packet.cc:
fix WriteInvalidateResp to not be a request that needs a response since it isn't
src/mem/port.hh:
by default return 0 for deviceBlockSize instead of panicing. This makes finding the block size the bus should use easier
--HG--
extra : convert_revision : 3fcfe95f9f392ef76f324ee8bd1d7f6de95c1a64
(which defines fenv) doesn't necessarily extend to c++ and it is a problem with solaris. If really
desired this could wrap the ieeefp interface found in bsd* as well, but I see no need at the moment.
src/arch/alpha/isa/fp.isa:
src/arch/sparc/isa/formats/basic.isa:
use m5_fesetround()/m5_fegetround() istead of fenv interface directly
src/arch/sparc/isa/includes.isa:
use base/fenv instead of fenv directly
src/base/SConscript:
add fenv to sconscript
src/base/fenv.hh:
src/base/random.cc:
m5 implementation to standerdize fenv across platforms.
--HG--
extra : convert_revision : 38d2629affd964dcd1a5ab0db4ac3cb21438e72c
src/cpu/simple/base.cc:
Cpu's should start as unallocated, not suspended
src/cpu/simple_thread.cc:
Wait for a thread to be assigned to activate the cpu
src/kern/tru64/tru64.hh:
When looking for a open cpu to assign threads, look for an unallocated one, not a suspended one.
--HG--
extra : convert_revision : 5e3ad2e96b4a715ed38293ceaccff5b9f4ea7985
and python code into m5 to allow swig an python code to
easily added by any SConscript instead of just the one in
src/python. This provides SwigSource and PySource for
adding new files to m5 (similar to Source for C++). Also
provides SimObject for including files that contain SimObject
information and build the m5.objects __init__.py file.
--HG--
extra : convert_revision : 38b50a0629846ef451ed02f96fe3633947df23eb
or automatically do Split(). It isn't used anywhere, and
isn't very consistent with the python features that are
about to be added. Do accept SCons.Node.FS.File arguments
though.
--HG--
extra : convert_revision : 0f3bb0e9e6806490330eea59a104013042b4dd49
unproxy() needs to return a new object otherwise all
instances will use the same value. This fix is more
or less unique to NextEthernetAddr because its use of
the proxy stuff is a bit different than everything else.
--HG--
extra : convert_revision : 2ce452e37d00b9ba76b6abfaec0ad2e0073920d7
1. Microops are created. These are StaticInsts use templates to provide a basic form of polymorphism without having to make the microassembler smarter.
2. An instruction class is created which has a "templated" microcode program as it's docstring. The template parameters are refernced with ^ following by a number.
3. An instruction in the decoder references an instruction template using it's mnemonic. The parameters to it's format end up replacing the placeholders. These parameters describe a source for an operand which could be memory, a register, or an immediate. It it's a register, the register index is used. If it's memory, eventually a load/store will be pre/postpended to the instruction template and it's destination register will be used in place of the ^. If it's an immediate, the immediate is used. Some operand types, specifically those that come from the ModRM byte, need to be decoded further into memory vs. register versions. This is accomplished by making the decode_block text for these instructions another case statement based off ModRM.
4. Once all of the template parameters have been handled, the instruction goes throw the microcode assembler which resolves labels and creates a list of python op objects. If an operand is a register, it uses a % prefix, an immediate uses $, and a label uses @. If the operand is just letters, numbers, and underscores, it can appear immediately after the prefix. If it's not, it can be encolsed in non nested {}s.
5. If there is a single "op" object (which corresponds to a single microop) the decoder is set up to return it directly. If not, a macroop wrapper is created around it.
In the future, I'm considering seperating the operand type specialization from the template substitution step. A problem this introduces is that either the template arguments need to be kept around for the specialization step, or they need to be re-extracted. Re-extraction might be the way to go so that the operand formats can be coded directly into the micro assembler template without having to pass them in as parameters. I don't know if that's actually useful, though.
src/arch/x86/isa/decoder/one_byte_opcodes.isa:
src/arch/x86/isa/microasm.isa:
src/arch/x86/isa/microops/microops.isa:
src/arch/x86/isa/operands.isa:
src/arch/x86/isa/microops/base.isa:
Implemented polymorphic microops and changed around the microcode assembler syntax.
--HG--
extra : convert_revision : e341f7b8ea9350a31e586a3d33250137e5954f43
src/cpu/o3/alpha/cpu_impl.hh:
Pass ISA-specific O3 CPU to FullO3CPU as a constructor parameter instead of using setCPU functions.
--HG--
extra : convert_revision : 74f4b1f5fb6f95a56081f367cce7ff44acb5688a
In this way a MemoryObject can keep a functional port around and give it to anyone who wants to do functional accesses rather
than creating a new one each time.
src/mem/bus.cc:
src/mem/bus.hh:
src/mem/cache/cache_impl.hh:
only keep around one func port we give to anyone who wants it. Otherwise we can run out of port ids reasonably quickly if
a lot of functional accesses are happening (e.g. remote debugging, dprintk, etc)
--HG--
extra : convert_revision : 6a9e3e96f51cedaab6de1b36cf317203899a3716
MicroOp: A single operation actually implemented in hardware.
MacroOp: A collection of microops which are executed as a unit.
Instruction: An architected instruction which can be implemented with a macroop or a microop.
--HG--
extra : convert_revision : 1cfc8409cc686c75220767839f55a30551aa6f13
The removed ones were unnecessary. The commented out ones could be useful in the future, should this problem get fixed. See flyspray task #243.
src/cpu/o3/commit_impl.hh:
src/cpu/o3/decode_impl.hh:
src/cpu/o3/fetch_impl.hh:
src/cpu/o3/iew_impl.hh:
src/cpu/o3/inst_queue_impl.hh:
src/cpu/o3/lsq_impl.hh:
src/cpu/o3/lsq_unit_impl.hh:
src/cpu/o3/rename_impl.hh:
src/cpu/o3/rob_impl.hh:
Remove/comment out DPRINTFs that were causing a segfault.
--HG--
extra : convert_revision : b5aeda1c6300dfde5e0a3e9b8c4c5f6fa00b9862
src/cpu/o3/fetch.hh:
src/cpu/o3/fetch_impl.hh:
Update code so that the O3 CPU can handle not initially having anything hooked up to its ports.
--HG--
extra : convert_revision : 04bcef44e754735d821509ebd69b0ef9c8ef8e2c
into zamp.eecs.umich.edu:/z/ktlim2/clean/tmp/clean2
src/cpu/base_dyn_inst.hh:
Hand merge. Line is no longer needed because it's handled in the ISA.
--HG--
extra : convert_revision : 0be4067aa38759a5631c6940f0167d48fde2b680
1. Move interrupt handling to a separate function to clean up main commit() function a bit. Also gate the function call off properly based on whether or not there are outstanding interrupts, and the system is not in PAL mode.
2. Better handling of updating instruction's status bits. Instructions are not marked "atCommit" until other stages view it (pushed off to IEW/IQ), and they have been properly handled (faults).
3. Don't consider the ROB "empty" for the purpose of other stages until the ROB is empty, all stores have written back, and there was no store commits this cycle. The last is necessary in case a store committed, in which case it would look like all stores have written back but in actuality have not.
src/cpu/o3/commit.hh:
Slightly modify how interrupts are handled. Also include some extra bools to keep track of state properly.
src/cpu/o3/commit_impl.hh:
Slightly modify how interrupts are handled. Also include some extra bools to keep track of state.
General correctness updates, most specifically for when commit broadcasts to other stages that the ROB is empty.
--HG--
extra : convert_revision : 682ec6ccf4ee6ed0c8a030ceaba1c90a3619d102
1. Update packet's flags properly when a snoop happens
2. Don't allow accesses to read a block's data if the block has outstanding MSHRs. This avoids a RAW hazard in MP systems that the memory system was not detecting properly earlier (a write required a block to upgrade, and while the upgrade was outstanding, a read came along and read old data).
3. Update MSHR's request upon a response being handled. If the MSHR has more targets than it can respond to in one cycle, then its request must be properly updated to the new head of the targets list.
src/mem/bus.cc:
Update packet's flags properly upon snoop.
src/mem/cache/cache_impl.hh:
Be sure to not allow accesses to a block with outstanding MSHRs.
src/mem/cache/miss/miss_queue.cc:
Update MSHR's request upon a response being handled.
--HG--
extra : convert_revision : 76a9abc610ca3f1904f075ad21637148a41982d6
src/cpu/o3/iew_impl.hh:
Allow for slightly more flexible handling of non-speculative instructions. They can be other classes now, such as loads or stores.
Also be sure to clear the state associated with squashes that are not used. i.e. if a squash due to a memory ordering violation happens on the same cycle as an older branch squashing, clear the state associated with the memory ordering violation.
Lastly don't consider uncached loads to officially be "at commit" until IEW receives the signal back from commit about the load.
src/cpu/o3/inst_queue_impl.hh:
Don't consider non-speculative instructions to be "at commit" until the IQ has received a signal from commit about the instruction. This prevents non-speculative instructions from being issued too early.
src/cpu/o3/mem_dep_unit_impl.hh:
Clear instruction's ability to issue if it's replayed.
--HG--
extra : convert_revision : d69dae878a30821222885485f4dee87170d56eb3
1. Requests are handled more properly now. They assume the memory system takes control of the request upon sending out an access.
2. load-load ordering is maintained.
src/cpu/base_dyn_inst.hh:
Update how requests are handled. The BaseDynInst should not be able to hold a pointer to the request because the request becomes owned by the memory system once it is sent out.
Also include some functions to allow certain status bits to be cleared.
src/cpu/base_dyn_inst_impl.hh:
Update how requests are handled. The BaseDynInst should not be able to hold a pointer to the request because the request becomes owned by the memory system once it is sent out.
src/cpu/o3/fetch_impl.hh:
General correctness fixes. retryPkt is not necessarily always set, so handle it properly. Also consider the cache unblocked only when recvRetry is called.
src/cpu/o3/lsq_unit.hh:
Handle requests a little more correctly. Now that the requests aren't pointed to by the DynInst, be sure to delete the request if it's not being used by the memory system.
Also be sure to not store-load forward from an uncacheable store.
src/cpu/o3/lsq_unit_impl.hh:
Check to make sure load-load ordering was maintained.
Also handle requests a little more correctly.
--HG--
extra : convert_revision : e86bead2886d02443cf77bf7a7a1492845e1690f
1. Set CPU ID in all modes for the O3 CPU.
2. Use nextCycle() function to prevent phase drift in O3 CPU.
3. Remove assertion in rename map that is no longer true.
src/cpu/o3/alpha/cpu_builder.cc:
Allow for CPU id in all modes, not just full system. Also include a parameter that was left out by accident.
src/cpu/o3/alpha/cpu_impl.hh:
Set the CPU ID properly.
src/cpu/o3/cpu.cc:
src/cpu/o3/cpu.hh:
Use nextCycle() function so that the CPU does not get out of phase when starting up from quiesces.
src/cpu/o3/rename_map.cc:
Remove assertion that is no longer true.
tests/configs/o3-timing.py:
Set CPU's id to 0.
--HG--
extra : convert_revision : 2b69c19adfce2adcc2d1939e89d702bd6674d5d5