sanchayanmaity/gem5 - Sanchayan Maity's repositories

Author	SHA1	Message	Date
Andreas Hansson	5a9a743cfc	MEM: Introduce the master/slave port roles in the Python classes This patch classifies all ports in Python as either Master or Slave and enforces a binding of master to slave. Conceptually, a master (such as a CPU or DMA port) issues requests, and receives responses, and conversely, a slave (such as a memory or a PIO device) receives requests and sends back responses. Currently there is no differentiation between coherent and non-coherent masters and slaves. The classification as master/slave also involves splitting the dual role port of the bus into a master and slave port and updating all the system assembly scripts to use the appropriate port. Similarly, the interrupt devices have to have their int_port split into a master and slave port. The intdev and its children have minimal changes to facilitate the extra port. Note that this patch does not enforce any port typing in the C++ world, it merely ensures that the Python objects have a notion of the port roles and are connected in an appropriate manner. This check is carried out when two ports are connected, e.g. bus.master = memory.port. The following patches will make use of the classifications and specialise the C++ ports into masters and slaves.	2012-02-13 06:43:09 -05:00
Anthony Gutierrez	542d0ceebc	cpu: add separate stats for insts/ops both globally and per cpu model	2012-02-12 16:07:39 -06:00
Ali Saidi	8aaa39e93d	mem: Add a master ID to each request object. This change adds a master id to each request object which can be used to identify every device in the system that is capable of issuing a request. This is part of the way to removing the numCpus+1 stats in the cache and replacing them with the master ids. This is one of a series of changes that make way for the stats output to be changed to python.	2012-02-12 16:07:38 -06:00
Nilay Vaish	6a7a6263e1	O3 CPU: Improve handling of delayed commit flag The delayed commit flag is used in conjunction with interrupt pending flag to figure out whether or not fetch stage should get more instructions. This patch clears this flag when instructions are squashed. Also, in case an interrupt is pending, currently it is not possible to access the instruction cache. This patch allows accessing the cache in case this flag is set.	2012-02-10 08:37:31 -06:00
Nilay Vaish	cd765c23a2	O3 CPU: Strengthen condition for handling interrupts The condition for handling interrupts is to check whether or not the cpu's instruction list is empty. As observed, this can lead to cases in which even though the instruction list is empty, interrupts are handled when they should not be. The condition is being strengthened so that interrupts get handled only when the last committed microop did not have IsDelayedCommit set.	2012-02-10 08:37:30 -06:00
Nilay Vaish	8f7e03d4cf	O3 CPU: Provide the squashing instruction This patch adds a function to the ROB that will get the squashing instruction from the ROB's list of instructions. This squashing instruction is used for figuring out the macroop from which the fetch stage should fetch the microops. Further, a check has been added that if the instructions are to be fetched from the cache maintained by the fetch stage, then the data in the cache should be valid and the PC of the thread being fetched from is same as the address of the cache block.	2012-02-10 08:37:28 -06:00
Nilay Vaish	0e597e944a	O3 Fetch: Check if PC is pointing to Microcode ROM	2012-02-10 08:37:26 -06:00
Gabe Black	e80ebc308f	SE/FS: Record the system pointer all the time for the simple CPU. This pointer was only being stored in code that came from SE mode. The system pointer is always meaningful and available, so it should always be stored.	2012-02-10 02:05:31 -08:00
Gabe Black	a6246bb047	Checker: Access workload element 0 only if there is an element 0.	2012-02-07 04:44:01 -08:00
Gabe Black	f2b46fdb85	Faults: Turn off arch/faults.hh Because there are no longer architecture independent but specialized functions in arch/XXX/faults.hh, code that isn't using the faults from a particular ISA no longer needs to be able to include them through the switching header file arch/faults.hh. By removing that header file (arch/faults.hh), the potential interface between ISA code and non ISA code is narrowed.	2012-02-07 04:43:21 -08:00
Gabe Black	ea8b347dc5	Merge with head, hopefully the last time for this batch.	2012-01-31 22:40:08 -08:00
Koan-Sin Tan	7d4f187700	clang: Enable compiling gem5 using clang 2.9 and 3.0 This patch adds the necessary flags to the SConstruct and SConscript files for compiling using clang 2.9 and later (on Ubuntu et al and OSX XCode 4.2), and also cleans up a bunch of compiler warnings found by clang. Most of the warnings are related to hidden virtual functions, comparisons with unsigneds >= 0, and if-statements with empty bodies. A number of mismatches between struct and class are also fixed. clang 2.8 is not working as it has problems with class names that occur in multiple namespaces (e.g. Statistics in kernel_stats.hh). clang has a bug (http://llvm.org/bugs/show_bug.cgi?id=7247) which causes confusion between the container std::set and the function Packet::set, and this is currently addressed by not including the entire namespace std, but rather selecting e.g. "using std::vector" in the appropriate places.	2012-01-31 12:05:52 -05:00
Andreas Hansson	4fdecae443	Thread: Use inherited baseCpu rather than cpu in SimpleThread This patch is a trivial simplification, removing the cpu pointer from SimpleThread and relying on the baseCpu pointer in ThreadState. The patch does not add or change any functionality, it merely cleans up the code.	2012-01-31 11:50:07 -05:00
Geoffrey Blake	af6aaf2581	CheckerCPU: Re-factor CheckerCPU to be compatible with current gem5 Brings the CheckerCPU back to life to allow FS and SE checking of the O3CPU. These changes have only been tested with the ARM ISA. Other ISAs potentially require modification.	2012-01-31 07:46:03 -08:00
Gabe Black	e88165a431	Merge with main repository.	2012-01-30 21:07:57 -08:00
Andreas Hansson	ef9fc01073	MEM: Clean-up of Functional/Virtual/TranslatingPort remnants This patch cleans up forward declarations and a member-function prototype that still referred to the old FunctionalPort, VirtualPort and TranslatingPort. There is no change in functionality.	2012-01-30 03:44:25 -05:00
Gabe Black	39f314cc15	Yet another merge with the main repository. --HG-- rename : tests/long/10.linux-boot/ref/x86/linux/pc-o3-timing/config.ini => tests/long/fs/10.linux-boot/ref/x86/linux/pc-o3-timing/config.ini rename : tests/long/10.linux-boot/ref/x86/linux/pc-o3-timing/simout => tests/long/fs/10.linux-boot/ref/x86/linux/pc-o3-timing/simout rename : tests/long/10.linux-boot/ref/x86/linux/pc-o3-timing/stats.txt => tests/long/fs/10.linux-boot/ref/x86/linux/pc-o3-timing/stats.txt rename : tests/long/10.linux-boot/ref/x86/linux/pc-o3-timing/system.pc.com_1.terminal => tests/long/fs/10.linux-boot/ref/x86/linux/pc-o3-timing/system.pc.com_1.terminal rename : tests/long/00.gzip/ref/x86/linux/o3-timing/config.ini => tests/long/se/00.gzip/ref/x86/linux/o3-timing/config.ini rename : tests/long/00.gzip/ref/x86/linux/o3-timing/simout => tests/long/se/00.gzip/ref/x86/linux/o3-timing/simout rename : tests/long/00.gzip/ref/x86/linux/o3-timing/stats.txt => tests/long/se/00.gzip/ref/x86/linux/o3-timing/stats.txt rename : tests/long/10.mcf/ref/x86/linux/o3-timing/config.ini => tests/long/se/10.mcf/ref/x86/linux/o3-timing/config.ini rename : tests/long/10.mcf/ref/x86/linux/o3-timing/simout => tests/long/se/10.mcf/ref/x86/linux/o3-timing/simout rename : tests/long/10.mcf/ref/x86/linux/o3-timing/stats.txt => tests/long/se/10.mcf/ref/x86/linux/o3-timing/stats.txt rename : tests/long/20.parser/ref/x86/linux/o3-timing/config.ini => tests/long/se/20.parser/ref/x86/linux/o3-timing/config.ini rename : tests/long/20.parser/ref/x86/linux/o3-timing/simout => tests/long/se/20.parser/ref/x86/linux/o3-timing/simout rename : tests/long/20.parser/ref/x86/linux/o3-timing/stats.txt => tests/long/se/20.parser/ref/x86/linux/o3-timing/stats.txt rename : tests/long/70.twolf/ref/x86/linux/o3-timing/config.ini => tests/long/se/70.twolf/ref/x86/linux/o3-timing/config.ini rename : tests/long/70.twolf/ref/x86/linux/o3-timing/simout => tests/long/se/70.twolf/ref/x86/linux/o3-timing/simout rename : tests/long/70.twolf/ref/x86/linux/o3-timing/stats.txt => tests/long/se/70.twolf/ref/x86/linux/o3-timing/stats.txt rename : tests/quick/00.hello/ref/x86/linux/o3-timing/config.ini => tests/quick/se/00.hello/ref/x86/linux/o3-timing/config.ini rename : tests/quick/00.hello/ref/x86/linux/o3-timing/simout => tests/quick/se/00.hello/ref/x86/linux/o3-timing/simout rename : tests/quick/00.hello/ref/x86/linux/o3-timing/stats.txt => tests/quick/se/00.hello/ref/x86/linux/o3-timing/stats.txt	2012-01-29 03:27:15 -08:00
Gabe Black	dc0e629ea1	Implement Ali's review feedback. Try to decrease indentation, and remove some redundant FullSystem checks.	2012-01-29 02:04:34 -08:00
Nilay Vaish	5c2fc35e02	O3 CPU LSQ: Implement TSO This patch makes O3's LSQ maintain total order between stores. Essentially only the store at the head of the store buffer is allowed to be in flight. Only after that store completes, the next store is issued to the memory system. By default, the x86 architecture will have TSO.	2012-01-28 19:09:04 -06:00
Gabe Black	c3d41a2def	Merge with the main repo. --HG-- rename : src/mem/vport.hh => src/mem/fs_translating_port_proxy.hh rename : src/mem/translating_port.cc => src/mem/se_translating_port_proxy.cc rename : src/mem/translating_port.hh => src/mem/se_translating_port_proxy.hh	2012-01-28 07:24:01 -08:00
Gabe Black	da2a4acc26	Merge yet again with the main repository.	2012-01-16 04:27:10 -08:00
Andreas Hansson	07cf9d914b	MEM: Separate queries for snooping and address ranges This patch simplifies the address-range determination mechanism and also unifies the naming across ports and devices. It further splits the queries for determining if a port is snooping and what address ranges it responds to (aiming towards a separation of cache-maintenance ports and pure memory-mapped ports). Default behaviours are such that most ports do not have to define isSnooping, and master ports need not implement getAddrRanges.	2012-01-17 12:55:09 -06:00
Andreas Hansson	de34e49d15	MEM: Simplify ports by removing EventManager This patch removes the inheritance of EventManager from the ports and moves all responsibility for event queues to the owner. Eventually the event manager should be the interface block, which could either be the structural owner or a subblock like a LSQ in the O3 CPU for example.	2012-01-17 12:55:09 -06:00
Andreas Hansson	b3f930c884	CPU: Moving towards a more general port across CPU models This patch performs minimal changes to move the instruction and data ports from specialised subclasses to the base CPU (to the largest degree possible). Ultimately it serves to make the CPU(s) have a well-defined interface to the memory sub-system.	2012-01-17 12:55:08 -06:00
Andreas Hansson	f85286b3de	MEM: Add port proxies instead of non-structural ports Port proxies are used to replace non-structural ports, and thus enable all ports in the system to correspond to a structural entity. This has the advantage of accessing memory through the normal memory subsystem and thus allowing any constellation of distributed memories, address maps, etc. Most accesses are done through the "system port" that is used for loading binaries, debugging etc. For the entities that belong to the CPU, e.g. threads and thread contexts, they wrap the CPU data port in a port proxy. The following replacements are made: FunctionalPort > PortProxy TranslatingPort > SETranslatingPortProxy VirtualPort > FSTranslatingPortProxy --HG-- rename : src/mem/vport.cc => src/mem/fs_translating_port_proxy.cc rename : src/mem/vport.hh => src/mem/fs_translating_port_proxy.hh rename : src/mem/translating_port.cc => src/mem/se_translating_port_proxy.cc rename : src/mem/translating_port.hh => src/mem/se_translating_port_proxy.hh	2012-01-17 12:55:08 -06:00
Maximilien Breughe	a7394ad680	inorder: MDU deadlock fix	2012-01-12 10:15:00 -05:00
Nilay Vaish	9957035a42	DPRINTF: Improve some dprintf messages.	2012-01-10 10:15:02 -06:00
Anders Handler	b587d511c3	CPU: Remove Alpha-specific PC alignment check.	2012-01-09 20:05:07 -05:00
Ali Saidi	525d1e46dc	O3: Remove some asserts that no longer seem to be valid.	2012-01-09 18:08:20 -06:00
Ali Saidi	d2c26f402c	O3: Add support of function tracing with O3 CPU.	2012-01-09 18:08:20 -06:00
Andreas Hansson	c2dbfc1d6c	MAC: Make gem5 compile and run on MacOSX 10.7.2 Adaptations to make gem5 compile and run on OSX 10.7.2, with a stock gcc 4.2.1 and the remaining dependencies from macports, i.e. python 2.7.2, swig 2.0.4, mercurial 2.0. The changes include an adaptation of the SConstruct to handle non-library linker flags, and Darwin-specific code to find the memory usage of gem5. A number of Ruby files relied on ambiguous uint (without the 32 suffix) which caused compilation errors.	2012-01-09 18:08:20 -06:00
Gabe Black	241cc0c840	Another merge with the main repository.	2012-01-07 02:16:37 -08:00
Gabe Black	ec936364b7	Merge with the main repository again.	2012-01-07 02:15:35 -08:00
Gabe Black	36a822f08e	Merge with main repository.	2012-01-07 02:10:34 -08:00
Nathan Binkert	6ef9691035	gcc: fix unused variable warnings from GCC 4.6.1 --HG-- extra : rebase_source : f9e22de341493a25ac6106c16ac35c61c128a080	2011-12-13 11:49:27 -08:00
Chris Emmons	5bde1d359f	Output: Add hierarchical output support and cleanup existing codebase. --HG-- extra : rebase_source : 3301137733cdf5fdb471d56ef7990e7a3a865442	2011-12-01 00:15:25 -08:00
Chander Sudanthi	61c14da751	O3: Remove hardcoded tgts_per_mshr in O3CPU.py. There are two lines in O3CPU.py that set the dcache and icache tgts_per_mshr to 20, ignoring any pre-configured value of tgts_per_mshr. This patch removes these hardcoded lines from O3CPU.py and sets the default L1 cache mshr targets to 20. --HG-- extra : rebase_source : 6f92d950e90496a3102967442814e97dc84db08b	2011-12-01 00:15:22 -08:00
Ali Saidi	946f7f0f55	ARM: Add support for having a TLB cache. --HG-- extra : rebase_source : 7a5780ab74d7c294682738c7ccb3ce8d56c6fd63	2011-12-01 00:15:22 -08:00
Ali Saidi	1444103998	O3: Add stat that counts how many cycles the O3 cpu was quiesced. --HG-- extra : rebase_source : 043b9307eef3c5b87f8e6370765641e016ed1fa7	2011-12-01 00:15:22 -08:00
Gabe Black	85424bef19	SE/FS: Get rid of includes of config/full_system.hh.	2011-11-18 02:20:22 -08:00
Gabe Black	de21bb93ea	SE/FS: Get rid of FULL_SYSTEM in the CPU directory.	2011-11-18 01:33:28 -08:00
Nilay Vaish	a547cf34b9	Ruby: Remove some unused typedefs This patch removes some of the unused typedefs. It also moves some of the typedefs from Global.hh to TypeDefines.hh. The patch also eliminates the file NodeID.hh.	2011-11-03 22:46:45 -05:00
Gabe Black	8b4a3f4070	SE/FS: Get rid of FULL_SYSTEM in sim.	2011-11-02 02:11:14 -07:00
Gabe Black	b6da5e2086	SE/FS: Get rid of uses of FULL_SYSTEM in Alpha.	2011-11-01 04:01:14 -07:00
Gabe Black	1268e0df1f	SE/FS: Expose the same methods on the CPUs in SE and FS modes.	2011-11-01 04:01:13 -07:00
Gabe Black	8ad2b8c559	SE/FS: Make the functions available from the TC consistent between SE and FS.	2011-10-31 02:58:22 -07:00
Gabe Black	d735abe5da	GCC: Get everything working with gcc 4.6.1. And by "everything" I mean all the quick regressions.	2011-10-31 01:09:44 -07:00
Gabe Black	facb40f3ff	SE/FS: Make getProcessPtr available in both modes, and get rid of FULL_SYSTEMs.	2011-10-30 00:33:02 -07:00
Gabe Black	5b433568f0	SE/FS: Build the base process class in FS.	2011-10-30 00:32:54 -07:00
Gabe Black	464c485d0c	SE/FS: Include getMemPort in FS.	2011-10-16 05:06:40 -07:00
Gabe Black	3595b0c5a1	SE/FS: Build/expose vport in SE mode.	2011-10-16 05:06:39 -07:00
Gabe Black	b2af015b97	ARM: Turn on the page table walker on ARM in SE mode.	2011-10-16 05:06:38 -07:00
Gabe Black	e8e9f97312	CPU: Make physPort and getPhysPort available in SE mode.	2011-10-16 02:59:53 -07:00
Gabe Black	8adc6781bf	X86: Turn on the page table walker in SE mode.	2011-10-13 02:22:23 -07:00
Gabe Black	f338d60930	SE/FS: Build the Interrupt objects in SE mode.	2011-10-09 00:15:50 -07:00
Gabe Black	51f7a66660	SE/FS: Build the devices in SE mode.	2011-09-30 00:28:33 -07:00
Gabe Black	4fcf8e9959	O3: Tidy up some DPRINTFs in the LSQ.	2011-09-27 00:25:26 -07:00
Gabe Black	44ed4849d4	Faults: Replace calls to genMachineCheckFault with M5PanicFault.	2011-09-27 00:24:43 -07:00
Nilay Vaish	56bddab189	LSQ: Moved a couple of lines to enable O3 + Ruby This patch makes O3 CPU work along with the Ruby memory model. Ruby overwrites the senderState pointer with another pointer. The pointer is restored only when Ruby gets done with the packet. LSQ makes use of senderState just after sendTiming() returns. But the dynamic_cast returns a NULL pointer since Ruby's senderState pointer is from a different class. Storing the senderState pointer before calling sendTiming() does away with the problem.	2011-09-26 12:18:32 -05:00
Steve Reinhardt	84f0a1bd91	event: minor cleanup Initialize flags via the Event constructor instead of calling setFlags() in the body of the derived class's constructor. I forget exactly why, but this made life easier when implementing multi-queue support. Also rename Event::getFlags() to isFlagSet() to better match common usage, and get rid of some unused Event methods.	2011-09-22 18:59:55 -07:00
Gabe Black	10c2e37f60	Syscall: Make the syscall function available in both SE and FS modes. In FS mode the syscall function will panic, but the interface will be consistent and code which calls syscall can be compiled in. This will allow, for instance, instructions that use syscall to be built unconditionally but then not returned by the decoder.	2011-09-19 02:46:48 -07:00
Ali Saidi	649c239cee	LSQ: Only trigger a memory violation with a load/load if the value changes. Only create a memory ordering violation when the value could have changed between two subsequent loads, instead of just when loads go out-of-order to the same address. While not very common in the case of Alpha, with an architecture with a hardware table walker this can happen reasonably frequently because a translation will miss and start a table walk and before the CPU re-schedules the faulting instruction another one will pass it to the same address (or cache block depending on the dependency checking). This patch has been tested with a couple of self-checking hand crafted programs to stress ordering between two cores. The performance improvement on SPEC benchmarks can be substantial (2-10%).	2011-09-13 12:58:08 -04:00
Gabe Black	49a7ed0397	StaticInst: Merge StaticInst and StaticInstBase. Having two StaticInst classes, one nominally ISA independent and the other ISA dependent, has not been historically useful and makes the StaticInst class more complicated than it needs to be. This change merges StaticInstBase into StaticInst.	2011-09-09 02:40:11 -07:00
Gabe Black	b7b545bc38	Decode: Pull instruction decoding out of the StaticInst class into its own. This change pulls the instruction decoding machinery (including caches) out of the StaticInst class and puts it into its own class. This has a few intrinsic benefits. First, the StaticInst code, which has gotten to be quite large, gets simpler. Second, the code that handles decode caching is now separated out into its own component and can be looked at in isolation, making it easier to understand. I took the opportunity to restructure the code a bit which will hopefully also help. Beyond that, this change also lays some ground work for each ISA to have its own, potentially stateful decode object. We'd be able to include less contextualizing information in the ExtMachInst objects since that context would be applied at the decoder. Also, the decoder could "know" ahead of time that all the instructions it's going to see are going to be, for instance, 64 bit mode, and it will have one less thing to check when it decodes them. Because the decode caching mechanism has been separated out, it's now possible to have multiple caches which correspond to different types of decoding context. Having one cache for each element of the cross product of different configurations may become prohibitive, so it may be desirable to clear out the cache when relatively static state changes and not to have one for each setting. Because the decode function is no longer universally accessible as a static member of the StaticInst class, a new function was added to the ThreadContexts that returns the applicable decode object.	2011-09-09 02:30:01 -07:00
Ali Saidi	b6203360ef	LSQ: Set store predictor to periodically clear itself as recommended in the storesets paper. This patch improves performance by as much as 10% on some spec benchmarks.	2011-08-19 15:08:07 -05:00
Geoffrey Blake	5f425b8bd1	Fix bugs due to interaction between SEV instructions and O3 pipeline SEV instructions were originally implemented to cause asynchronous squashes via the generateTCSquash() function in the O3 pipeline when updating the SEV_MAILBOX miscReg. This caused race conditions between CPUs in an MP system that would lead to a pipeline either going inactive indefinitely or not being able to commit squashed instructions. Fixed SEV instructions to behave like interrupts and cause synchronous squashes inside the pipeline, eliminating the race conditions. Also fixed up the semantics of the WFE instruction to behave as documented in the ARMv7 ISA description to not sleep if SEV_MAILBOX=1 or unmasked interrupts are pending.	2011-08-19 15:08:07 -05:00
Mrinmoy Ghosh	d0e0485902	LSQ: Add some better dprintfs for storeset predictor.	2011-08-19 15:08:05 -05:00
Mrinmoy Ghosh	0db95030fc	LSQ: Fix a few issues with the storeset predictor. Two issues are fixed in this patch: 1. The load and store pc passed to the predictor are passed in reverse order. 2. The flag indicating that a barrier is inflight was never cleared when the barrier was squashed instead of committed. This made all load insts dependent on a non-existent barrier in-flight.	2011-08-19 15:08:05 -05:00
Giacomo Gabrielli	676a530b77	O3: Squash the violator and younger instructions instead of all insts. Change the way instructions are squashed on memory ordering violations to squash the violator and younger instructions, not all instructions that are younger than the instruction they violated (no reason to throw away valid work).	2011-08-19 15:08:05 -05:00
Gabe Black	f2c89a01d1	InOrder: Make cache_unit.hh include hashmap.hh explicitly, not transitively.	2011-08-16 02:47:15 -07:00
Gabe Black	78a4636a13	O3: Make lsq_unit.hh include arch/isa_traits.hh directly, not transitively.	2011-08-16 02:46:57 -07:00
Gabe Black	0e6dc00497	O3: When squashing, restore the macroop that should be used for fetching.	2011-08-14 17:41:34 -07:00
Gabe Black	ec204f003c	O3: Add a pointer to the macroop for a microop in the dyninst.	2011-08-14 04:08:14 -07:00
Gabe Black	e0043f8dbe	O3: At the end of an instruction, force fetchAddr to something sensible. It's possible (though until now very unlikely) for fetchAddr to get out of sync with the actual PC of the current instruction. This change forcefully resets fetchAddr at the end of every instruction.	2011-08-13 13:36:37 -07:00
Gabe Black	96df6bedb7	O3: Stop using the current macroop no matter why you're leaving it. Until now, the only reason a macroop would be left was because it ended at a microop marked as the last microop. In O3 with branch prediction, it's possible for the branch predictor to have entries which originally came from different instructions which happened to have the same RIP. This could theoretically happen in many ways, but it was encountered specifically when different programs in different address spaces ran one after the other in X86_FS. What would happen in that case was that the macroop would continue to be looped over and microops fetched from it until it reached the last microop even though the macropc had moved out from under it. If things lined up properly, this could mean that the end bytes of an instruction actually fell into the instruction sized block of memory after the one in the predecoder. The fetch loop implicitly assumes that the last instruction sized chunk of memory processed was the last one needed for the instruction it just finished executing. It would then tell the predecoder to move to an offset within the bytes it was given that is larger than those bytes, and that would trip an assert in the x86 predecoder. This change fixes this problem by making fetch stop processing the current macroop if the address it should be fetching from changed when the PC is updated. That happens when the last microop was reached because the instruction handled it properly, and it also catches the case where the branch predictor makes fetch do a macro level branch when it shouldn't. The check of isLastMicroop is retained because otherwise, a macroop that branches back to itself would act like a single, long macroop instead of multiple instances of the same microop. There may be situations (which may turn out to be purely hypothetical) where that matters. This also fixes a relatively minor issue where the curMacroop variable would be set to NULL immediately after seeing that a microop was the last one before curMacroop was used to build the dyninst. The traceData structure would have a NULL pointer to the macroop for that microop.	2011-08-09 11:30:43 -07:00
Gabe Black	3989f41261	O3: When waiting to handle an interrupt, let everything drain out. Before this change, the commit stage would wait until the ROB and store queue were empty before recognizing an interrupt. The fetch stage would stop generating instructions at an appropriate point, so commit would then wait until a valid time to interrupt the instruction stream. Instructions might be in flight after fetch but not in the ROB or store queue (in rename, for instance), so this change makes commit wait until all in flight instructions are finished.	2011-08-09 03:37:43 -07:00
Nilay Vaish	821dfc1289	BuildEnv: Eliminate RUBY as build environment variable This patch replaces RUBY with PROTOCOL in all the SConscript files as the environment variable that decides whether or not certain components of the simulator are compiled.	2011-08-08 10:50:13 -05:00
Gabe Black	5c0e6e6092	O3: Get rid of the unused addToRemoveList function.	2011-08-07 15:41:10 -07:00
Gabe Black	a9b7931156	O3: Let squashed and deferred instructions issue. Let squashed and deferred instructions issue so they don't accumulate and clog up the CPU.	2011-08-07 15:41:07 -07:00
Ali Saidi	4d83b8a799	O3: Fix uninitialized variable in the tournament branch predictor.	2011-08-07 09:21:49 -07:00
Gabe Black	16882b0483	Translation: Use a pointer type as the template argument. This allows regular pointers and reference counted pointers without having to use any shim structures or other tricks.	2011-08-07 09:21:48 -07:00
Gabe Black	6230668f5e	O3: Get rid of the raw ExtMachInst constructor on DynInsts. This constructor assumes that the ExtMachInst can be decoded directly into a StaticInst that's useful to execute. With the advent of microcoded instructions that's no longer true.	2011-08-02 11:51:16 -07:00
Gabe Black	206c2e9a0e	O3: Implement memory mapped IPRs for O3.	2011-07-31 19:21:17 -07:00
Gabe Black	a42c6ae48d	O3: Fix corner case squashing into the microcode ROM. When fetching from the microcode ROM, if the PC is set so that it isn't in the cache block that's been fetched the CPU will get stuck. The fetch stage notices that it's in the ROM so it doesn't try to fetch from the current PC. It then later notices that it's outside of the current cache block so it skips generating instructions expecting to continue once the right bytes have been fetched. This change lets the fetch stage attempt to generate instructions, and only checks if the bytes it's going to use are valid if it's really going to use them.	2011-07-30 23:22:53 -07:00
Giacomo Gabrielli	69ef57fd0f	O3: Create a pipeline activity viewer for the O3 CPU model. Implemented a pipeline activity viewer as a python script (util/o3-pipeview.py) and modified O3 code base to support an extra trace flag (O3PipeView) for generating traces to be used as inputs by the tool.	2011-07-15 11:53:35 -05:00
Mrinmoy Ghosh	3396fd9e84	Branch predictor: Fixes the tournament branch predictor. Branch predictor could not predict a branch in a nested loop because: 1. The global history was not updated after a mispredict squash. 2. The global history was updated in the fetch stage. The choice predictors that were updated used the changed global history. This is incorrect, as it incorporates the state of global history after the branch is encountered. Fixed update to choice predictor using the global history state before the branch happened. 3. The global predictor table was also updated using the global history state before the branch happened as above. Additionally, parameters to initialize ctr and history size were reversed.	2011-07-10 12:56:08 -05:00
Geoffrey Blake	c7e7b89058	O3: Fix up pipelining icache accesses in fetch stage to function properly Fixed up the patch from Yasuko Watanabe that enabled pipelining of fetch accesses to icache to work with recent changes to main repository. Also added in ability for fetch stage to delay issuing the fault carrying nop when a pipeline fetch causes a fault and no fetch bandwidth is available until the next cycle.	2011-07-10 12:56:08 -05:00
Ali Saidi	60579e8d74	O3: Make sure fetch doesn't go off into the weeds during speculation.	2011-07-10 12:56:08 -05:00
Gabe Black	3a1428365a	ExecContext: Rename the readBytes/writeBytes functions to readMem and writeMem. readBytes and writeBytes had the word "bytes" in their names because they accessed blobs of bytes. This distinguished them from the read and write functions which handled higher level data types. Because those functions don't exist any more, this change renames readBytes and writeBytes to more general names, readMem and writeMem, which reflect the fact that they are how you read and write memory. This also makes their names more consistent with the register reading/writing functions, although those are still read and set for some reason.	2011-07-02 22:35:04 -07:00
Gabe Black	2e7426664a	ExecContext: Get rid of the now unused read/write templated functions.	2011-07-02 22:34:58 -07:00
Brad Beckmann, Nilay Vaish <nilay@cs.wisc.edu>	c86f849d5a	Ruby: Add support for functional accesses This patch provides functional access support in Ruby. Currently only the M5Port of RubyPort supports functional accesses. The support for functional through the PioPort will be added as a separate patch.	2011-06-30 19:49:26 -05:00
Gabe Black	affad29932	InOrder: Fix a compile error.	2011-06-20 02:29:14 -07:00
Korey Sewell	477e7039b3	inorder: clear reg. dep entry after removing from list this will safeguard future code from trying to remove from the list twice. That code wouldn't break but would waste time.	2011-06-19 21:43:42 -04:00
Korey Sewell	b963b339b9	inorder: se: squash after syscalls	2011-06-19 21:43:42 -04:00
Korey Sewell	eedd04e894	inorder: cleanup dprintfs in cache unit	2011-06-19 21:43:42 -04:00
Korey Sewell	078f914e69	inorder: SE mode TLB faults handle them like we do in FS mode, by blocking the TLB until the fault is handled by the fault->invoke()	2011-06-19 21:43:42 -04:00
Korey Sewell	3cb23bd3a2	inorder:tracing: fix fault tracing bug	2011-06-19 21:43:42 -04:00
Korey Sewell	fe3a2aa4a3	inorder: se compile fixes	2011-06-19 21:43:42 -04:00
Korey Sewell	e572c01120	inorder: add necessary debug flag header files	2011-06-19 21:43:41 -04:00
Korey Sewell	91a88ae8ce	inorder: clear fetchbuffer on traps implement clearfetchbufferfunction extend predecoder to use multiple threads and clear those on trap	2011-06-19 21:43:41 -04:00
Korey Sewell	2dae0e8735	inorder: use separate float-reg bits function in dyninst this will make sure we get the correct view of a FP register	2011-06-19 21:43:41 -04:00
Korey Sewell	8c0def8d03	inorder: use trapPending flag to manage traps	2011-06-19 21:43:41 -04:00
Korey Sewell	5ef0b7a9db	inorder/dtb: make sure DTB translate correct address The DTB expects the correct PC in the ThreadContext but how if the memory accesses are speculative? Shouldn't we send along the requestor's PC to the translate functions?	2011-06-19 21:43:41 -04:00
Korey Sewell	716e447da8	inorder: handle serializing instructions including IPR accesses and store-conditionals. This class of instructions will not execute correctly in a superscalar machine	2011-06-19 21:43:41 -04:00
Korey Sewell	561c33f082	inorder: dont handle multiple faults on same cycle if a faulting instruction reaches an execution unit, then ignore it and pass it through the pipeline. Once we recognize the fault in the graduation unit, dont allow a second fault to creep in on the same cycle.	2011-06-19 21:43:40 -04:00
Korey Sewell	c4deabfb97	inorder: register ports for FS mode handle "snoop" port registration as well as functional port setup for FS mode	2011-06-19 21:43:40 -04:00
Korey Sewell	f1c3691356	inorder: check for interrupts each tick use a dummy instruction to facilitate the squash after the interrupts trap	2011-06-19 21:43:40 -04:00
Korey Sewell	0bfdf342da	inorder: explicit fault check Before graduating an instruction, explicitly check fault by making the fault check its own separate command that can be put on an instruction schedule.	2011-06-19 21:43:40 -04:00
Korey Sewell	5f608dd2e9	inorder: squash and trap behind a tlb fault	2011-06-19 21:43:39 -04:00
Korey Sewell	e0e387c2a9	inorder: stall stores on store conditionals & compare/swaps	2011-06-19 21:43:39 -04:00
Korey Sewell	e8b7df072b	inorder: make InOrder CPU FS compilable/visible make syscall a SE mode only functionality copy over basic FS functions (hwrei) to make FS compile	2011-06-19 21:43:39 -04:00
Korey Sewell	d71b95d84d	inorder: remove memdep tracking for default pipeline speculative load/store pipelines can reenable this	2011-06-19 21:43:39 -04:00
Korey Sewell	b72bdcf4f8	inorder: fetchBuffer tracking calculate blocks in use for the fetch buffer to figure out how many total blocks are pending	2011-06-19 21:43:39 -04:00
Korey Sewell	4d4c7d79d0	inorder: redefine DynInst FP result type Sharing the FP value w/the integer values was giving inconsistent results esp. when there is a 32-bit integer register matched w/a 64-bit float value	2011-06-19 21:43:38 -04:00
Korey Sewell	db8b1e4b78	inorder: treat SE mode syscalls as a trapping instruction define a syscallContext to schedule the syscall and then use syscall() to actually perform the action	2011-06-19 21:43:38 -04:00
Korey Sewell	c95fe261ab	inorder: bug in mdu segfault was caused by squashed multiply thats in the process of an event. use isProcessing flag to handle this and cleanup the MDU code	2011-06-19 21:43:38 -04:00
Korey Sewell	4c979f9325	inorder: optionally track faulting instructions	2011-06-19 21:43:38 -04:00
Korey Sewell	22ba1718c4	inorder: cleanup events in resource pool remove events in the resource pool that can be called from the CPU event, since the CPU event is scheduled at the same time at the resource pool event. ---- Also, match the resPool event function names to the cpu event function names ----	2011-06-19 21:43:38 -04:00
Korey Sewell	e8082a28c8	inorder: don't stall after stores once a ST is sent off, it's OK to keep processing, however it's a little more complicated to handle the packet acknowledging the store is completed	2011-06-19 21:43:38 -04:00
Korey Sewell	379c23199e	inorder: don't stall after stores once a ST is sent off, it's OK to keep processing, however it's a little more complicated to handle the packet acknowledging the store is completed	2011-06-19 21:43:37 -04:00
Korey Sewell	4c9ad53cc5	inorder: remove decode squash also, cleanup comments for gem5.fast compilation	2011-06-19 21:43:37 -04:00
Korey Sewell	a444133e73	inorder: support for compare and swap insts dont treat read() and write() fields as mut. exclusive	2011-06-19 21:43:37 -04:00
Korey Sewell	89d0f95bf0	inorder: branch predictor update only update BTB on a taken branch and update branch predictor w/pcstate from instruction --- only pay attention to branch predictor updates if the inst. is in fact a branch	2011-06-19 21:43:37 -04:00
Korey Sewell	479195d4cf	inorder: priority for grad/squash events define separate priority resource pool squash and graduate events	2011-06-19 21:43:37 -04:00
Korey Sewell	71018f5e8b	inorder: remove stalls on trap squash	2011-06-19 21:43:37 -04:00
Korey Sewell	34b2500f09	inorder: no dep. tracking for zero reg this causes forwarding of a bad register value	2011-06-19 21:43:37 -04:00
Korey Sewell	d02fa0f6b6	imported patch recoverPCfromTrap	2011-06-19 21:43:37 -04:00
Korey Sewell	264e8178ff	imported patch squash_from_next_stage	2011-06-19 21:43:36 -04:00
Korey Sewell	f0f33ae2b9	inorder: add flatDestReg member to dyninst use it in reg. dep. tracking	2011-06-19 21:43:36 -04:00
Korey Sewell	555bd4d842	inorder: update event priorities dont use offset to calculate this but rather an enum that can be updated	2011-06-19 21:43:36 -04:00
Korey Sewell	7dea79535c	inorder: implement trap handling	2011-06-19 21:43:36 -04:00
Korey Sewell	061b369d28	inorder: cleanup intercomm. structs/squash info	2011-06-19 21:43:35 -04:00
Korey Sewell	b195da9345	inorder: use setupSquash for misspeculation implement a clean interface to handle branch misprediction and eventually all pipeline flushing	2011-06-19 21:43:35 -04:00
Korey Sewell	73cfab8b23	inorder: DynInst handling of stores for big-endian ISAs The DynInst was not performing the host-to-guest translation which ended up breaking stores for SPARC	2011-06-19 21:43:35 -04:00
Korey Sewell	4f34bc8b7b	inorder: make marking of dest. regs an explicit request formerly, this was implicit when you accessed the execution unit or the use-def unit but it's better that this just be something that a user can specify.	2011-06-19 21:43:35 -04:00
Korey Sewell	946b0ed4f4	inorder: simplify handling of split accesses	2011-06-19 21:43:35 -04:00
Korey Sewell	1a6d25dc47	inorder: addtl functionality for inst. skeds add find and end functions for inst. schedules that can search by stage number	2011-06-19 21:43:35 -04:00
Korey Sewell	8b54858831	inorder: register file stats keep stats for int/float reg file usage instead of aggregating across reg file types	2011-06-19 21:43:34 -04:00
Korey Sewell	085f30ff9c	inorder: scheduling for nonspec insts make handling of speculative and nonspeculative insts more explicit	2011-06-19 21:43:34 -04:00
Korey Sewell	3c417ea23a	inorder: find register dependencies "lazily" Architectures like SPARC need to read the window pointer in order to figure out its register dependence. However, this may not get updated until after an instruction gets executed, so now we lazily detect the register dependence in the EXE stage (execution unit or use_def). This makes sure we get the mapping after the most current change.	2011-06-19 21:43:34 -04:00
Korey Sewell	bd67ee9852	inorder: assert on macro-ops provide a sanity check for someone coding a new architecture	2011-06-19 21:43:34 -04:00
Korey Sewell	ee7062d94d	inorder: handle faults at writeback stage call trap function when a fault is received	2011-06-19 21:43:34 -04:00
Korey Sewell	17f5749dbb	inorder: ISA-zero reg handling ignore writes to the ISA zero register	2011-06-19 21:43:34 -04:00
Korey Sewell	2a59fcfbe9	inorder: update support for branch delay slots	2011-06-19 21:43:34 -04:00
Korey Sewell	d4b4ef1324	inorder: inst. iterator cleanup get rid of accessing iterators (for instructions) by reference	2011-06-19 21:43:34 -04:00
Korey Sewell	e2f9266dbf	inorder: update bpred code clean up control flow to make it easier to understand	2011-06-19 21:43:33 -04:00
Korey Sewell	6df6365095	inorder: add types for dependency checks	2011-06-19 21:43:33 -04:00
Korey Sewell	19e3eb2915	inorder: use flattenIdx for reg indexing - also use "threadId()" instead of readTid() everywhere - this will help support more complex ISA indexing	2011-06-19 21:43:33 -04:00
Korey Sewell	b2e5152e16	simple-thread: give a name() function for debugging w/the SimpleThread object	2011-06-19 21:43:33 -04:00
Korey Sewell	76c60c5f93	inorder: use m5_hash_map for skedCache since we dont care about if the cache of instruction schedules is sorted or not, then the hash map should be faster	2011-06-19 21:43:33 -04:00
Korey Sewell	c8b43641fd	o3: missing newlines on some dprintfs	2011-06-10 22:15:32 -04:00
Korey Sewell	1a451cd2c5	sparc: compilation fixes for inorder Add a few constants and functions that the InOrder model wants for SPARC. * * * sparc: add eaComp function InOrder separates the address generation from the actual access so give Sparc that functionality * * * sparc: add control flags for branches branch predictors and other cpu model functions need to know specific information about branches, so add the necessary flags here	2011-06-09 01:34:06 -04:00
Gabe Black	a59a143a25	gcc 4.0: Add some virtual destructors to make gcc 4.0 happy.	2011-06-07 00:24:49 -07:00
Nathan Binkert	2b1aa35e20	scons: rename TraceFlags to DebugFlags	2011-06-02 17:36:21 -07:00
Geoffrey Blake	d0b0a55515	O3: Fix offset calculation into storeQueue buffer for store->load forwarding Calculation of offset to copy from storeQueue[idx].data structure for load to store forwarding fixed to be difference in bytes between store and load virtual addresses. Previous method would induce bug where a load would index into buffer at the wrong location.	2011-05-23 10:40:21 -05:00
Geoffrey Blake	c223b887fe	O3: Fix issue w/wbOutstanding being decremented multiple times on blocked cache. If a split load fails on a blocked cache wbOutstanding can be decremented twice if the first part of the split load succeeds and the second part fails. Condition the decrementing on not having completed the first part of the load.	2011-05-23 10:40:19 -05:00
Geoffrey Blake	6dd996aabb	O3: Fix issue with interrupts/faults occurring in the middle of a macro-op This patch fixes two problems with the O3 cpu model. The first is an issue with an instruction fetch causing a fault on the next address while the current macro-op is being issued. This happens when the micro-ops exceed the fetch bandwidth and then on the next cycle the fetch stage attempts to issue a request to the next line while it still has micro-ops to issue. If the next line faults, a fault is attached to a micro-op in the currently executing macro-op rather than a "nop" from the next instruction block. This leads to an instruction incorrectly faulting on fetch when it had no reason to fault. A similar problem occurs with interrupts. When an interrupt occurs the fetch stage nominally stops issuing instructions immediately. This is incorrect in the case of a macro-op as the current location might not be interruptible.	2011-05-23 10:40:18 -05:00
Chander Sudanthi	4bf48a11ef	Trace: Allow printing ASIDs and selectively tracing based on user/kernel code. Debug flags are ExecUser, ExecKernel, and ExecAsid. ExecUser and ExecKernel are set by default when Exec is specified. Use minus sign with ExecUser or ExecKernel to remove user or kernel tracing respectively.	2011-05-13 17:27:00 -05:00
Geoffrey Blake	b79650ceaa	O3: Fix an issue with a load & branch instruction and mem dep squashing Instructions that load an address and are control instructions can execute down the wrong path if they were predicted correctly and then instructions following them are squashed. If an instruction is a memory and control op use the predicted address for the next PC instead of just advancing the PC. Without this change NPC is used for the next instruction, but predPC is used to verify that the branch was successful so the wrong path is silently executed.	2011-05-13 17:27:00 -05:00
Nathan Binkert	9c4c1419a7	work around gcc 4.5 warning	2011-05-09 16:34:11 -04:00
Tushar Krishna	1267ff5949	NetworkTest: added sim_cycles parameter to the network tester. The network tester terminates after injecting for sim_cycles (default=1000), instead of having to explicitly pass --maxticks from the command line as before. If fixed_pkts is enabled, the tester only injects maxpackets number of packets, else it keeps injecting till sim_cycles. The tester also works with zero command line arguments now.	2011-05-07 17:43:30 -04:00
Ali Saidi	77bea2fb42	CPU: Add some useful debug message to the timing simple cpu.	2011-05-04 20:38:27 -05:00
Ali Saidi	6e634beb8a	CPU: Fix a case where timing simple cpu faults can nest. If we fault, change the state to faulting so that we don't fault again in the same cycle.	2011-05-04 20:38:27 -05:00
Ali Saidi	89e7bcca82	O3: Remove assertion for case that is actually handled in code. If a nonspeculative instruction has a fault it might not be in the nonSpecInsts map.	2011-05-04 20:38:27 -05:00
Ali Saidi	09a2be0c39	O3: Fix a small corner case with the lsq hazard detection logic.	2011-05-04 20:38:26 -05:00
Nathan Binkert	6e9143d36d	stats: one more name violation	2011-04-20 19:07:45 -07:00
Nathan Binkert	63371c8664	stats: rename stats so they can be used as python expressions	2011-04-19 18:45:21 -07:00
Nathan Binkert	eddac53ff6	trace: reimplement the DTRACE function so it doesn't use a vector At the same time, rename the trace flags to debug flags since they have broader usage than simply tracing. This means that --trace-flags is now --debug-flags and --trace-help is now --debug-help	2011-04-15 10:44:32 -07:00
Nathan Binkert	f946d7bcdb	debug: create a Debug namespace	2011-04-15 10:44:15 -07:00
Nathan Binkert	bbb1392c08	includes: fix up code after sorting	2011-04-15 10:44:14 -07:00
Nathan Binkert	39a055645f	includes: sort all includes	2011-04-15 10:44:06 -07:00
Ali Saidi	6b69890493	ARM: Fix checkpoint restoration into O3 CPU and the way O3 switchCpu works. This change fixes a small bug in the arm copyRegs() code where some registers wouldn't be copied if the processor was in a mode other than MODE_USER. Additionally, this change simplifies the way the O3 switchCpu code works by utilizing TheISA::copyRegs() to copy the required context information rather than the adhoc copying that goes on in the CPU model. The current code makes assumptions about the visibility of int and float registers that aren't true for all architectures in FS mode.	2011-04-04 11:42:28 -05:00
Ali Saidi	a679cd917a	ARM: Cleanup implementation of ITSTATE and put important code in PCState. Consolidate all code to handle ITSTATE in the PCState object rather than touching a variety of structures/objects.	2011-04-04 11:42:28 -05:00
Ali Saidi	5962fecc1d	CPU: Remove references to memory copy operations	2011-04-04 11:42:26 -05:00
Ali Saidi	7dde557fdc	O3: Tighten memory order violation checking to 16 bytes. The comment in the code suggests that the checking granularity should be 16 bytes, however in reality the shift by 8 is 256 bytes which seems much larger than required.	2011-04-04 11:42:23 -05:00
Lisa Hsu	06fcaf9104	Ruby: have the rubytester pass contextId to Ruby.	2011-03-31 17:17:51 -07:00
Somayeh Sardashti	c8bbfed937	This patch supports cache flushing in MOESI_hammer	2011-03-28 10:49:45 -05:00
Korey Sewell	e0fdd86fd9	mips: cleanup ISA-specific code *** (1): get rid of expandForMT function MIPS is the only ISA that cares about having a piece of ISA state integrate multiple threads so add constants for MIPS and relieve the other ISAs from having to define this. Also, InOrder was the only core that was actively calling this function * * * (2): get rid of corespecific type The CoreSpecific type was used as a proxy to pass in HW specific params to a MIPS CPU, but since MIPS FS hasnt been touched for awhile, it makes sense to not force every other ISA to use CoreSpecific as well use a special reset function to set it. That probably should go in a PowerOn reset fault anyway.	2011-03-26 09:23:52 -04:00
Tushar Krishna	531f54fb51	This patch fixes a build error in networktest.cc that occurs with gcc4.2	2011-03-22 23:38:09 -04:00
Tushar Krishna	09c3a97a4c	This patch adds the network tester for simple and garnet networks. The tester code is in testers/networktest. The tester can be invoked by configs/example/ruby_network_test.py. A dummy coherence protocol called Network_test is also added for network-only simulations and testing. The protocol takes in messages from the tester and just pushes them into the network in the appropriate vnet, without storing any state.	2011-03-21 22:51:58 -04:00
Nilay Vaish	2f4276448b	Ruby: Convert AccessModeType to RubyAccessMode This patch converts AccessModeType to RubyAccessMode so that both the protocol dependent and independent code uses the same access mode.	2011-03-19 18:34:37 -05:00
Ali Saidi	53ab306acc	ARM: Fix subtle bug in LDM. If the instruction faults mid-op the base register shouldn't be written back.	2011-03-17 19:20:20 -05:00
Ali Saidi	b78be240cf	ARM: Detect and skip udelay() functions in linux kernel. This change speeds up booting, especially in MP cases, by not executing udelay() on the core but instead skipping ahead the amount of time that is being delayed.	2011-03-17 19:20:20 -05:00
Ali Saidi	799c3da8d0	O3: Send instruction back to fetch on squash to seed predecoder correctly.	2011-03-17 19:20:19 -05:00
Ali Saidi	30143baf7e	O3: Cleanup the commitInfo comm struct. Get rid of unused members and use base types rather than derived values where possible to limit amount of state.	2011-03-17 19:20:19 -05:00
Ali Saidi	a432d8e085	Mem: Fix issue with dirty block being lost when entire block transferred to non-cache. This change fixes the problem for all the cases we actively use. If you want to try more creative I/O device attachments (E.g. sharing an L2), this won't work. You would need another level of caching between the I/O device and the cache (which you actually need anyway with our current code to make sure writes propagate). This is required so that you can mark the cache in between as top level and it won't try to send ownership of a block to the I/O device. Asserts have been added that should catch any issues.	2011-03-17 19:20:19 -05:00
Ali Saidi	2f40b3b8ae	O3: Fix unaligned stores when cache blocked Without this change a store can be issued to the cache multiple times. If this case occurs when the l1 cache is out of mshrs (and thus blocked) the processor will never make forward progress because each cycle it will send a single request using the recently freed mshr and not completing the multipart store. This will continue forever.	2011-03-17 19:20:19 -05:00
Gabe Black	579c5f0b65	Spelling: Fix a spelling error by changing mmaped to mmapped. There may not be a formally correct spelling for the past tense of mmap, but mmapped is the spelling Google doesn't try to autocorrect. This makes sense because it mirrors the past tense of map->mapped and not the past tense of cape->caped. --HG-- rename : src/arch/alpha/mmaped_ipr.hh => src/arch/alpha/mmapped_ipr.hh rename : src/arch/arm/mmaped_ipr.hh => src/arch/arm/mmapped_ipr.hh rename : src/arch/mips/mmaped_ipr.hh => src/arch/mips/mmapped_ipr.hh rename : src/arch/power/mmaped_ipr.hh => src/arch/power/mmapped_ipr.hh rename : src/arch/sparc/mmaped_ipr.hh => src/arch/sparc/mmapped_ipr.hh rename : src/arch/x86/mmaped_ipr.hh => src/arch/x86/mmapped_ipr.hh	2011-03-01 23:18:47 -08:00
Nilay Vaish	80b3886475	Ruby: Make DataBlock.hh independent of RubySystem This patch changes DataBlock.hh so that it is not dependent on RubySystem. This dependence seems unnecessary. All those functions that depend on RubySystem have been moved to DataBlock.cc file.	2011-02-25 17:51:02 -06:00
Timothy M. Jones	a10685ad1e	O3CPU: Fix iqCount and lsqCount SMT fetch policies. Fixes two of the SMT fetch policies in O3CPU that were returning the count of instructions in the IQ or LSQ rather than the thread ID to fetch from.	2011-02-25 13:50:29 +00:00
Korey Sewell	0a74246fb9	inorder: InstSeqNum bug Because int and not InstSeqNum was used in a couple of places, you can overflow the int type and thus get weird bugs when the sequence number is negative (or some weird value)	2011-02-23 16:35:18 -05:00
Korey Sewell	3e1ad73d08	inorder: dyn inst initialization remove constructors that werent being used (it just gets confusing) use initialization list for all the variables instead of relying on initVars() function	2011-02-23 16:35:04 -05:00
Korey Sewell	e0a021005d	inorder: cache packet handling -use a pointer to CacheReqPacket instead of PacketPtr so correct destructors get called on packet deletion - make sure to delete the packet if the cache blocks the sendTiming request or for some reason we dont use the packet - dont overwrite memory requests since in the worst case an instruction will be replaying a request so no need to keep allocating a new request - we dont use retryPkt so delete it - fetch code was split out already, so just assert that this is a memory reference inst. and that the staticInst is available	2011-02-23 16:30:45 -05:00
Ali Saidi	f9d4d9df1b	O3: When a prefetch causes a fault, don't record it in the inst	2011-02-23 15:10:50 -06:00
Ali Saidi	3de8e0a0d4	O3: If there is an outstanding table walk don't let the inst queue sleep. If there is an outstanding table walk and no other activity in the CPU it can go to sleep and never wake up. This change makes the instruction queue always active if the CPU is waiting for a store to translate. If Gabe changes the way this code works then the below should be removed as indicated by the todo.	2011-02-23 15:10:49 -06:00
Ali Saidi	7391ea6de6	ARM: Do something for ISB, DSB, DMB	2011-02-23 15:10:49 -06:00
Ali Saidi	ae3d456855	ARM: Fix bug that let two table walks occur in parallel.	2011-02-23 15:10:49 -06:00
Ali Saidi	68bd80794c	O3: Fix bug when a squash occurs right before TLB miss returns. In this case we need to throw away the TLB miss, not assume it was the one we were waiting for.	2011-02-23 15:10:49 -06:00
Korey Sewell	66bb732c04	m5: merge inorder/release-notes/make_release changes	2011-02-18 14:35:15 -05:00
Korey Sewell	bc16bbc158	inorder: add names and slot #s to res. dprints	2011-02-18 14:31:31 -05:00
Korey Sewell	64d31e75b9	inorder: ignore nops in execution unit	2011-02-18 14:30:38 -05:00
Korey Sewell	0fe19836c7	inorder: update graduation unit make sure instructions are able to commit before writing back to the RF do not commit more than 1 non-speculative instruction per cycle	2011-02-18 14:30:05 -05:00
Korey Sewell	89335118a5	inorder: recognize isSerializeAfter flag keep track of when an instruction needs the execution behind it to be serialized. Without this, in SE Mode instructions can execute behind a system call exit().	2011-02-18 14:29:48 -05:00
Korey Sewell	bbffd9419d	inorder: update default thread size(=1) a lot of structures get allocated based off that MaxThreads parameter so this is an effort to not abuse it	2011-02-18 14:29:44 -05:00
Korey Sewell	a278df0b95	inorder: don't overuse getLatency() resources don't need to call getLatency because the latency is already a member in the class. If there is some type of special case where different instructions impose a different latency inside a resource then we can revisit this and add getLatency() back in	2011-02-18 14:29:40 -05:00
Korey Sewell	37df925953	inorder: update max. resource bandwidths each resource has a certain # of requests it can take per cycle. update the #s here to be more realistic based off of the pipeline width and if the resource needs to be accessed on multiple cycles	2011-02-18 14:29:31 -05:00
Korey Sewell	91c48b1c3b	inorder: cleanup in destructors cleanup hanging pointers and other cruft in the destructors	2011-02-18 14:29:26 -05:00
Korey Sewell	8b4b4a1ba5	inorder: fix cache/fetch unit memory leaks --- need to delete the cache request's data on clearRequest() now that we are recycling requests --- fetch unit needs to deallocate the fetch buffer blocks when they are replaced or squashed.	2011-02-18 14:29:17 -05:00
Korey Sewell	72b5233112	inorder: remove events for zero-cycle resources if a resource has a zero cycle latency (e.g. RegFile write), then dont allocate an event for it to use	2011-02-18 14:29:02 -05:00
Korey Sewell	d5961b2b20	inorder: update pipeline interface for handling finished resource reqs formerly, to free up bandwidth in a resource, we could just change the pointer in that resource but at the same time the pipeline stages had visibility to see what happened to a resource request. Now that we are recycling these requests (to avoid too much dynamic allocation), we can't throw away the request too early or the pipeline stage gets bad information. Instead, mark when a request is done with the resource altogether and then let the pipeline stage call back to the resource that it's time to free up the bandwidth for more instructions * interface notes * - When an instruction completes and is done in a resource for that cycle, call done() - When an instruction fails and is done with a resource for that cycle, call done(false) - When an instruction completes, but isnt finished with a resource, call completed() - When an instruction fails, but isnt finished with a resource, call completed(false) * * * inorder: tlbmiss wakeup bug fix	2011-02-18 14:28:37 -05:00
Korey Sewell	d64226750e	inorder: remove request map, use request vector take away all instances of reqMap in the code and make all references use the built-in request vectors inside of each resource. The request map was dynamically allocating a request per instruction. The request vector just allocates N number of requests during instantiation and then the surrounding code is fixed up to reuse those N requests *** setRequest() and clearRequest() are the new accessors needed to define a new request in a resource	2011-02-18 14:28:30 -05:00
Korey Sewell	c883729025	inorder: add valid bit for resource requests this will allow us to reuse resource requests within a resource instead of always dynamically allocating	2011-02-18 14:28:22 -05:00
Korey Sewell	ff48afcf4f	inorder: remove reqRemoveList we are going to be getting away from creating new resource requests for every instruction so no more need to keep track of a reqRemoveList and clean it up every tick	2011-02-18 14:28:10 -05:00
Korey Sewell	991d0185c6	inorder: initialize res. req. vectors based on resource bandwidth first change in an optimization that will stop InOrder from allocating new memory for every instruction's request to a resource. This gets expensive since every instruction needs to access ~10 requests before graduation. Instead, the plan is to allocate just enough resource request objects to satisfy each resource's bandwidth (e.g. the execution unit would need to allocate 3 resource request objects for a 1-issue pipeline since on any given cycle it could have 2 read requests and 1 write request) and then let the instructions contend and reuse those allocated requests. The end result is a smaller memory footprint for the InOrder model and increased simulation performance	2011-02-18 14:27:52 -05:00
Gabe Black	f036fd9748	O3: Fetch from the microcode ROM when needed.	2011-02-13 17:40:07 -08:00
Ali Saidi	7c763b34c9	O3: Fix GCC 4.2.4 complaint	2011-02-13 16:51:15 -05:00
Korey Sewell	470aa289da	inorder: clean up the old way of inst. scheduling remove remnants of old way of instruction scheduling which dynamically allocated a new resource schedule for every instruction	2011-02-12 10:14:48 -05:00
Korey Sewell	e26aee514d	inorder: utilize cached skeds in pipeline allow the pipeline and resources to use the cached instruction schedule and resource sked iterator	2011-02-12 10:14:45 -05:00
Korey Sewell	516b611462	inorder: define iterator for resource schedules resource skeds are divided into two parts: front end (all insts) and back end (inst. specific) each of those are implemented as separate lists, so this iterator wraps around the traditional list iterator so that an instruction can walk its schedule but seamlessly transfer from front end to back end when necessary	2011-02-12 10:14:43 -05:00
Korey Sewell	ec9b2ec251	inorder: stage scheduler for front/back end schedule creation add a stage scheduler class to replace InstStage in pipeline_traits.cc use that class to define a default front-end, resource schedule that all instructions will follow. This will also replace the back end schedule in pipeline_traits.cc. The reason for adding this is so that we can cache instruction schedules in the future instead of calling the same function over/over again as well as constantly dynamically allocating memory on every instruction to try to figure out its schedule	2011-02-12 10:14:40 -05:00
Korey Sewell	6713dbfe08	inorder: cache instruction schedules first step in an optimization to not dynamically allocate an instruction schedule for every instruction but rather use cached schedules	2011-02-12 10:14:36 -05:00
Korey Sewell	af67631790	inorder: comments for resource sked class	2011-02-12 10:14:34 -05:00
Korey Sewell	800e93f358	inorder: remove unused file inst_buffer file isn't used , so remove it	2011-02-12 10:14:32 -05:00
Giacomo Gabrielli	a05032f4df	O3: Fix pipeline restart when a table walk completes in the fetch stage. When a table walk is initiated by the fetch stage, the CPU can potentially move to the idle state and never wake up. The fetch stage must call cpu->wakeCPU() when a translation completes (in finishTranslation()).	2011-02-11 18:29:35 -06:00
Ali Saidi	1411cb0b0f	SimpleCPU: Fix a case where a DTLB fault redirects fetch and an I-side walk occurs. This change fixes an issue where a DTLB fault occurs and redirects fetch to handle the fault and the ITLB requires a walk which delays translation. In this case the status of the cpu isn't updated appropriately, and an additional instruction fetch occurs. Eventually this hits an assert as multiple instruction fetches are occurring in the system and when the second one returns the processor is in the wrong state. Some asserts below are removed because it was always true (typo) and because, after the initiateAcc(), the processor could be in any valid state when a d-side fault occurs.	2011-02-11 18:29:35 -06:00
Giacomo Gabrielli	e2507407b1	O3: Enhance data address translation by supporting hardware page table walkers. Some ISAs (like ARM) rely on hardware page table walkers. For those ISAs, when a TLB miss occurs, initiateTranslation() can return with NoFault but with the translation unfinished. Instructions experiencing a delayed translation due to a hardware page table walk are deferred until the translation completes and kept in the IQ. In order to keep track of them, the IQ has been augmented with a queue of the outstanding delayed memory instructions. When their translation completes, instructions are re-executed (only their initiateAccess() was already executed; their DTB translation is now skipped). The IEW stage has been modified to support such a 2-pass execution.	2011-02-11 18:29:35 -06:00
Brad Beckmann	dfa8cbeb06	m5: added work completed monitoring support	2011-02-06 22:14:19 -08:00
Joel Hestness	52b6119228	TimingSimpleCPU: split data sender state fix In sendSplitData, keep a pointer to the senderState that may be updated after the call to handle*Packet. This way, if the receiver updates the packet senderState, it can still be accessed in sendSplitData.	2011-02-06 22:14:18 -08:00
Joel Hestness	b4c10bd680	mcpat: Adds McPAT performance counters Updated patches from Rick Strong's set that modify performance counters for McPAT	2011-02-06 22:14:17 -08:00
Korey Sewell	e396a34b01	inorder: fault handling Maintain all information about an instruction's fault in the DynInst object rather than any cpu-request object. Also, if there is a fault during the execution stage then just save the fault inside the instruction and trap once the instruction tries to graduate	2011-02-04 00:09:20 -05:00
Korey Sewell	e57613588b	inorder: pcstate and delay slots bug not taken delay slots were not being advanced correctly to pc+8, so for those ISAs we 'advance()' the pcstate one more time for the desired effect	2011-02-04 00:09:19 -05:00
Korey Sewell	68d962f8af	inorder: add a fetch buffer to fetch unit Give fetch unit its own parameterizable fetch buffer to read from. Very inefficient (architecturally and in simulation) to continually fetch at the granularity of the wordsize. As expected, the number of fetch memory requests drops dramatically	2011-02-04 00:08:22 -05:00
Korey Sewell	56ce8acd41	inorder: overload find-req fn no need to have separate function name findSplitRequest, just overload the function	2011-02-04 00:08:21 -05:00
Korey Sewell	ab3d37d398	inorder: implement separate fetch unit instead of having one cache-unit class be responsible for both data and code accesses, separate code that is just for fetch in its own derived class off the original base class. This makes the code easier to manage as well as handle future cases of special fetch handling	2011-02-04 00:08:20 -05:00
Korey Sewell	f80508de65	inorder: cache port blocking set the request to false when the cache port blocks so we don't deadlock. also, comment out the outstanding address list sanity check for now.	2011-02-04 00:08:19 -05:00
Korey Sewell	0c6a679359	inorder: stage width as a python parameter allow the user to specify how many instructions a pipeline stage can process on any given cycle (stageWidth...i.e.bandwidth) by setting the parameter through the python interface rather than compile the code after changing the *.cc file. (we always had the parameter there, but still used the static 'ThePipeline::StageWidth' instead) - Since StageWidth is now dynamically defined, change the interstage communication structure to use a vector and get rid of array and array handling index (toNextStageIndex) since we can just make calls to the list for the same information	2011-02-04 00:08:18 -05:00
Korey Sewell	8ac717ef4c	inorder: multi-issue branch resolution Only execute (resolve) one branch per cycle because handling more than one is a little more complicated	2011-02-04 00:08:17 -05:00
Korey Sewell	be17617990	inorder: pipe. stage inst. buffering use skidbuffer as only location for instructions between stages. before, we had the insts queue from the prior stage and the skidbuffer for the current stage, but that gets confusing and this consolidation helps when handling squash cases	2011-02-04 00:08:16 -05:00
Korey Sewell	050944dd73	inorder: change skidBuffer to list instead of queue manage insertion and deletion like a queue but will need access to internal elements for future changes Currently, skidbuffer manages any instruction that was in a stage but could not complete processing, however we will want to manage all blocked instructions (from prev stage and from cur. stage) in just one buffer.	2011-02-04 00:08:15 -05:00
Korey Sewell	7f937e11e2	inorder: activity tracking bug Previous code was marking CPU activity on almost every cycle due to a bug in tracking the status of pipeline stages. This prevented the CPU from sleeping on long latency stalls and increased simulation time	2011-02-04 00:08:13 -05:00
Gabe Black	091a3e6cc0	Fault: Rename sim/fault.hh to fault_fwd.hh to distinguish it from faults.hh. --HG-- rename : src/sim/fault.hh => src/sim/fault_fwd.hh	2011-02-03 21:47:58 -08:00
Gabe Black	00f24ae92c	Config: Keep track of uncached and cached ports separately. This makes sure that the address ranges requested for caches and uncached ports don't conflict with each other, and that accesses which are always uncached (message signaled interrupts for instance) don't waste time passing through caches.	2011-02-03 20:23:00 -08:00
Gabe Black	869a046e41	O3: Fix a style bug in O3.	2011-02-02 23:34:14 -08:00
Gabe Black	119f5f8e94	X86: Add L1 caches for the TLB walkers. Small L1 caches are connected to the TLB walkers when caches are used. This allows them to participate in the coherence protocol properly.	2011-02-01 18:28:41 -08:00
Matt Horsnell	b13a79ee71	O3: Fix some variable length instruction issues with the O3 CPU and ARM ISA.	2011-01-18 16:30:05 -06:00
Matt Horsnell	c98df6f8c2	O3: Don't test misprediction on load instructions until executed.	2011-01-18 16:30:05 -06:00
Ali Saidi	1167ef19cf	O3: Keep around the last committed instruction and use for squashing. Without this change 0 is always used for the youngest sequence number if a squash occurred and the ROB was empty (E.g. an instruction is marked serializeAfter or a fetch stall prevents other instructions from issuing). Using 0, there is a race to rename where an instruction that committed the same cycle as the squashing instruction can have its renamed state undone by the squash using sequence number 0.	2011-01-18 16:30:05 -06:00
Ali Saidi	ea058b14da	O3: Don't try to scoreboard misc registers. I'm not positive this is the correct fix, but it's working right now. Either we need to do something like this, prevent the misc reg from being renamed at all, or there is something else going on. We need to find the root cause as to why this is only a problem sometimes.	2011-01-18 16:30:05 -06:00
Matt Horsnell	11bef2ab38	O3: Fix corner cases where multiple squashes/fetch redirects overwrite timebuf.	2011-01-18 16:30:05 -06:00
Matt Horsnell	62f2097917	O3: Fix mispredicts from non control instructions. The squash inside the fetch unit should not attempt to remove them from the branch predictor as non-control instructions are not pushed into the predictor.	2011-01-18 16:30:05 -06:00

1396 commits