sanchayanmaity/gem5 - Sanchayan Maity's repositories

Author	SHA1	Message	Date
Ali Saidi	4412046041	cpu: include set in o3/commit_impl. While the majority of compilers seemed to pick up set from elsewhere, one version of gcc 4.7 complains, so explicitly add it.	2013-02-15 17:40:08 -05:00
Ali Saidi	7ae06a3b3b	cpu: fix case with o3 cpu blocking and unblocking decode in cycle Fix a case in the O3 CPU where the decode stage blocks and unblocks in a single cycle sending both signals to fetch which causes an assert or worse. The previous check could never work before since the status was set to Blocked before a test for the status being Unblocking was executed.	2013-02-15 17:40:08 -05:00
Ali Saidi	b84bd3028c	cpu: Fix a livelock in the o3 cpu. Check if an instruction just enabled interrupts and we've previously had an interrupt pending that was not handled because interrupts were subsequently disabled before the pipeline reached a place to handle the interrupt. In that case squash now to make sure the interrupt is handled.	2013-02-15 17:40:07 -05:00
Nilay Vaish, Timothy Jones <timothy.jones@cl.cam.ac.uk>	dbeabedaf0	branch predictor: move out of o3 and inorder cpus This patch moves the branch predictor files in the o3 and inorder directories to src/cpu/pred. This allows sharing the branch predictor across different cpu models. This patch was originally posted by Timothy Jones in July 2010 but never made it to the repository. --HG-- rename : src/cpu/o3/bpred_unit.cc => src/cpu/pred/bpred_unit.cc rename : src/cpu/o3/bpred_unit.hh => src/cpu/pred/bpred_unit.hh rename : src/cpu/o3/bpred_unit_impl.hh => src/cpu/pred/bpred_unit_impl.hh rename : src/cpu/o3/sat_counter.hh => src/cpu/pred/sat_counter.hh	2013-01-24 12:28:51 -06:00
Andrea Pellegrini	11d5ffa108	o3 cpu: fix zero reg problem There was an issue w/ the rename logic, which would assign a previous physical register to the ZeroReg architectural register in x86. This issue was giving problems for instructions squashed in threads w/ ID different from 0, sometimes allowing non-mispredicted instructions to obtain a value different from zero when reading the zeroReg.	2013-01-22 00:13:28 -06:00
Nilay Vaish	fc57ae6401	x86, cpu: corrects 270c9a75e91f, take over decoder on cpu switch The changes made by the changeset 270c9a75e91f do not work well with switching of cpus. The problem is that decoder for the old thread context holds state that is not taken over by the new decoder. This patch adds a takeOverFrom() function to Decoder class in each ISA. Except for x86, functions in other ISAs are blank. For x86, the function copies state from the old decoder to the new decoder.	2013-01-22 00:10:10 -06:00
Joel Hestness	1429d21244	O3 IEW: Make incrWb and decrWb clearer Move the increment/decrement of wbOutstanding outside of the comparison in incrWb and decrWb in the IEW. This also fixes a compiler bug with gcc 4.4.7, which incorrectly optimizes "-- ==" as "-=".	2013-01-19 15:14:54 -06:00
Nilay Vaish	25ec278a0b	x86: Changes to decoder, corrects 9376 The changes made by the changeset 9376 were not quite correct. The patch made changes to the code which resulted in decoder not getting initialized correctly when the state was restored from a checkpoint. This patch adds a startup function to each ISA object. For x86, this function sets the required state in the decoder. For other ISAs, the function is empty right now.	2013-01-12 22:09:48 -06:00
Andreas Sandberg	009970f59b	cpu: Unify the serialization code for all of the CPU models Cleanup the serialization code for the simple CPUs and the O3 CPU. The CPU-specific code has been replaced with a (un)serializeThread that serializes the thread state / context of a specific thread. Assuming that the CPU-specific thread state class uses the base thread state serialization code, this allows us to restore a checkpoint with any of the CPU models.	2013-01-07 13:05:52 -05:00
Andreas Sandberg	1814a85a05	cpu: Rewrite O3 draining to avoid stopping in microcode Previously, the O3 CPU could stop in the middle of a microcode sequence. This patch makes sure that the pipeline stops when it has committed a normal instruction or exited from a microcode sequence. Additionally, it makes sure that the pipeline has no instructions in flight when it is drained, which should make draining more robust. Draining is controlled in the commit stage, which checks if the next PC after a committed instruction is in microcode. If this isn't the case, it requests a squash of all instructions after the instruction that just committed and immediately signals a drain stall to the fetch stage. The CPU then continues to execute until the pipeline and all associated buffers are empty.	2013-01-07 13:05:46 -05:00
Andreas Sandberg	52ff37caa3	cpu: Fix broken thread context handover The thread context handover code used to break when multiple handovers were performed during the same quiesce period. Previously, the thread contexts would assign the TC pointer in the old quiesce event to the new TC. This obviously broke in cases where multiple switches were performed within the same quiesce period, in which case the TC pointer in the quiesce event would point to an old CPU. The new implementation deschedules pending quiesce events in the old TC and schedules a new quiesce event in the new TC. The code has been refactored to remove most of the code duplication.	2013-01-07 13:05:46 -05:00
Andreas Sandberg	fca4fea769	cpu: Fix O3 LSQ debug dumping constness and formatting	2013-01-07 13:05:46 -05:00
Andreas Sandberg	8db27aa230	cpu: Fix broken squashAfter implementation in O3 CPU Commit can currently both commit and squash in the same cycle. This confuses other stages since the signals coming from the commit stage can only signal either a squash or a commit in a cycle. This changeset changes the behavior of squashAfter so that it commits all instructions, including the instruction that requested the squash, in the first cycle and then starts to squash in the next cycle.	2013-01-07 13:05:45 -05:00
Andreas Sandberg	a2077ccf02	o3 cpu: Remove unused variables	2013-01-07 13:05:45 -05:00
Andreas Sandberg	2cfe62adc4	cpu: Rename defer_registration->switched_out The defer_registration parameter is used to prevent a CPU from initializing at startup, leaving it in the "switched out" mode. The name of this parameter (and the help string) is confusing. This patch renames it to switched_out, which should be more descriptive.	2013-01-07 13:05:45 -05:00
Andreas Sandberg	901258c22b	cpu: Correctly call parent on switchOut() and takeOverFrom() This patch cleans up the CPU switching functionality by making sure that CPU models consistently call the parent on switchOut() and takeOverFrom(). This has the following implications that might alter current functionality: * The call to BaseCPU::switchout() in the O3 CPU is moved from signalDrained() (!) to switchOut(). * A call to BaseSimpleCPU::switchOut() is introduced in the simple CPUs.	2013-01-07 13:05:44 -05:00
Andreas Sandberg	4ae02295d5	cpu: Unify SimpleCPU and O3 CPU serialization code The O3 CPU used to copy its thread context to a SimpleThread in order to do serialization. This was a bit of a hack involving two static SimpleThread instances and a magic constructor that was only used by the O3 CPU. This patch moves the ThreadContext serialization code into two global procedures that, in addition to the normal serialization parameters, take a ThreadContext reference as a parameter. This allows us to reuse the serialization code in all ThreadContext implementations.	2013-01-07 13:05:44 -05:00
Andreas Sandberg	6daada2701	cpu: Initialize the O3 pipeline from startup() The entire O3 pipeline used to be initialized from init(), which is called before initState() or unserialize(). This causes the pipeline to be initialized from an incorrect thread context. This doesn't currently lead to correctness problems as instructions fetched from the incorrect start PC will be squashed a few cycles after initialization. This patch will affect the regressions since the O3 CPU now issues its first instruction fetch to the correct PC instead of 0x0.	2013-01-07 13:05:44 -05:00
Andreas Sandberg	e2dad8236a	cpu: Implement a flat register interface in thread contexts Some architectures map registers differently depending on their mode of operations. There is currently no architecture independent way of accessing all registers. This patch introduces a flat register interface to the ThreadContext class. This interface is useful, for example, when serializing or copying thread contexts.	2013-01-07 13:05:44 -05:00
Andreas Sandberg	7eb0fb8b6e	cpu: Check that the memory system is in the correct mode This patch adds checks to all CPU models to make sure that the memory system is in the correct mode at startup and when resuming after a drain. Previously, we only checked that the memory system was in the right mode when resuming. This is inadequate since this is a configuration error that should be detected at startup as well as when resuming. Additionally, since the check was done using an assert, it wasn't performed when NDEBUG was set (e.g., the fast target).	2013-01-07 13:05:41 -05:00
Andreas Sandberg	3db3f83a5e	arch: Make the ISA class inherit from SimObject The ISA class stores the contents of ID registers on many architectures. In order to make reset values of such registers configurable, we make the class inherit from SimObject, which allows us to use the normal generated parameter headers. This patch introduces a Python helper method, BaseCPU.createThreads(), which creates a set of ISAs for each of the threads in an SMT system. Although it is currently only needed when creating multi-threaded CPUs, it should always be called before instantiating the system as this is an obvious place to configure ID registers identifying a thread/CPU.	2013-01-07 13:05:35 -05:00
Ali Saidi	69d419f313	o3: Fix issue with LLSC ordering and speculation This patch unlocks the cpu-local monitor when the CPU sees a snoop to a locked address. Previously we relied on the cache to handle the locking for us, however some users on the gem5 mailing list reported a case where the cpu speculatively executes a ll operation after a pending sc operation in the pipeline and that makes the cache monitor valid. This should handle that case by invalidating the local monitor.	2013-01-07 13:05:33 -05:00
Ali Saidi	5146a69835	cpu: rename the misleading inSyscall to noSquashFromTC inSyscall was originally created because during handling of a syscall in SE mode the threadcontext had to be updated. However, in many places this is used in FS mode (e.g. fault handlers) and the name doesn't make much sense. The boolean actually stops gem5 from squashing speculative and non-committed state when a write to a threadcontext happens, so re-name the variable to something more appropriate	2013-01-07 13:05:33 -05:00
Gabe Black	e17c375ddd	Decoder: Remove the thread context get/set from the decoder. This interface is no longer used, and getting rid of it simplifies the decoders and code that sets up the decoders. The thread context had been used to read architectural state which was used to contextualize the instruction memory as it came in. That was changed so that the state is now sent to the decoders to keep locally if/when it changes. That's significantly more efficient. Committed by: Nilay Vaish <nilay@cs.wisc.edu>	2013-01-04 19:00:45 -06:00
Erik Tomusk	3dc7e4f496	TournamentBP: Fix some bugs with table sizes and counters globalHistoryBits, globalPredictorSize, and choicePredictorSize are decoupled. globalHistoryBits controls how much history is kept, global and choice predictor sizes control how much of that history is used when accessing predictor tables. This way, global and choice predictors can actually be different sizes, and it is no longer possible to walk off the predictor arrays and cause a seg fault. There are now individual thresholds for choice, global, and local saturating counters, so that taken/not taken decisions are correct even when the predictors' counters' sizes are different. The interface for localPredictorSize has been removed from TournamentBP because the value can be calculated from localHistoryBits. Committed by: Nilay Vaish <nilay@cs.wisc.edu>	2012-12-06 09:31:06 -06:00
Nathanael Premillieu	eb899407c5	o3 cpu: remove some unused buggy functions in the lsq Committed by: Nilay Vaish <nilay@cs.wisc.edu>	2012-12-06 04:36:51 -06:00
Andreas Sandberg	b81a977e6a	sim: Move the draining interface into a separate base class This patch moves the draining interface from SimObject to a separate class that can be used by any object needing draining. However, objects not visible to the Python code (i.e., objects not deriving from SimObject) still depend on their parents informing them when to drain. This patch also gets rid of the CountedDrainEvent (which isn't really an event) and replaces it with a DrainManager.	2012-11-02 11:32:01 -05:00
Andreas Sandberg	eb703a4b4e	cpu: O3 add a header declaring the DerivO3CPU SWIG needs a complete declaration of all wrapped objects. This patch adds a header file with the DerivO3CPU class and includes it in the SWIG interface. --HG-- rename : src/cpu/o3/cpu_builder.cc => src/cpu/o3/deriv.cc	2012-11-02 11:32:01 -05:00
Andreas Sandberg	ebe65a394b	cpu: Add header files for checker CPUs In order to create reliable SWIG wrappers, we need to include the declaration of the wrapped class in the SWIG file. Previously, we didn't expose the declaration of checker CPUs. This patch adds header files for such CPUs and include them in the SWIG wrapper. --HG-- rename : src/cpu/dummy_checker_builder.cc => src/cpu/dummy_checker.cc rename : src/cpu/o3/checker_builder.cc => src/cpu/o3/checker.cc	2012-11-02 11:32:01 -05:00
Andreas Sandberg	c0ab52799c	sim: Include object header files in SWIG interfaces When casting objects in the generated SWIG interfaces, SWIG uses classical C-style casts ( (Foo *)bar; ). In some cases, this can degenerate into the equivalent of a reinterpret_cast (mainly if only a forward declaration of the type is available). This usually works for most compilers, but it is known to break if multiple inheritance is used anywhere in the object hierarchy. This patch introduces the cxx_header attribute to Python SimObject definitions, which should be used to specify a header to include in the SWIG interface. The header should include the declaration of the wrapped object. We currently don't enforce the use of the header attribute, but a warning will be generated for objects that do not use it.	2012-11-02 11:32:01 -05:00
Ali Saidi	5adb4ddc12	O3: Pack the comm structures a bit better to reduce their size.	2012-09-25 11:49:40 -05:00
Djordje Kovacevic	d060a28a29	CPU: Add abandoned instructions to O3 Pipe Viewer	2012-09-25 11:49:40 -05:00
Anthony Gutierrez	c6927ed138	stats: remove duplicate instruction stats from the commit stage these stats are duplicates of insts/opsCommitted, cause confusion, and are poorly named.	2012-09-12 11:35:52 -04:00
Ali Saidi	03ff612054	O3: Get rid of incorrect assert in RAS.	2012-09-07 14:20:53 -05:00
Andreas Hansson	287ea1a081	Param: Transition to Cycles for relevant parameters This patch is a first step to using Cycles as a parameter type. The main affected modules are the CPUs and the Ruby caches. There are definitely plenty more places that are affected, but this patch serves as a starting point to making the transition. An important part of this patch is to actually enable parameters to be specified as Param.Cycles which involves some changes to params.py.	2012-09-07 12:34:38 -04:00
Andreas Hansson	0cacf7e817	Clock: Add a Cycles wrapper class and use where applicable This patch addresses the comments and feedback on the preceding patch that reworks the clocks and now more clearly shows where cycles (relative cycle counts) are used to express time. Instead of bumping the existing patch I chose to make this a separate patch, merely to try and focus the discussion around a smaller set of changes. The two patches will be pushed together though. The changes done as part of this patch are mostly following directly from the introduction of the wrapper class, and change enough code to make things compile and run again. There are definitely more places where int/uint/Tick is still used to represent cycles, and it will take some time to chase them all down. Similarly, a lot of parameters should be changed from Param.Tick and Param.Unsigned to Param.Cycles. In addition, the use of curTick is questionable as there should not be an absolute cycle. Potential solutions can be built on top of this patch. There is a similar situation in the o3 CPU where lastRunningCycle is currently counting in Cycles, and is still an absolute time. More discussion to be had in other words. An additional change that would be appropriate in the future is to perform a similar wrapping of Tick and probably also introduce a Ticks class along with suitable operators for all these classes.	2012-08-28 14:30:33 -04:00
Andreas Hansson	d53d04473e	Clock: Rework clocks to avoid tick-to-cycle transformations This patch introduces the notion of a clock update function that aims to avoid costly divisions when turning the current tick into a cycle. Each clocked object advances a private (hidden) cycle member and a tick member and uses these to implement functions for getting the tick of the next cycle, or the tick of a cycle some time in the future. In the different modules using the clocks, changes are made to avoid counting in ticks only to later translate to cycles. There are a few oddities in how the O3 and inorder CPU count idle cycles, as seen by a few locations where a cycle is subtracted in the calculation. This is done such that the regression does not change any stats, but should be revisited in a future patch. Another, much needed, change that is not done as part of this patch is to introduce a new typedef uint64_t Cycle to be able to at least hint at the unit of the variables counting Ticks vs Cycles. This will be done as a follow-up patch. As an additional follow up, the thread context still uses ticks for the book keeping of last activate and last suspend and this should probably also be changed into cycles as well.	2012-08-28 14:30:31 -04:00
Andreas Hansson	c60db56741	Packet: Remove NACKs from packet and its use in endpoints This patch removes the NACK from the packet as there is no longer any module in the system that issues them (the bridge was the only one and the previous patch removes that). The handling of NACKs was mostly avoided throughout the code base, by using e.g. panic or assert false, but in a few locations the NACKs were actually dealt with (although NACKs never occurred in any of the regressions). Most notably, the DMA port will now never receive a NACK and the backoff time is thus never changed. As a consequence, the entire backoff mechanism (similar to a PCI bus) is now removed and the DMA port entirely relies on the bus performing the arbitration and issuing a retry when appropriate. This is more in line with e.g. PCIe. Surprisingly, this patch has no impact on any of the regressions. As mentioned in the patch that removes the NACK from the bridge, a follow-up patch should change the request and response buffer size for at least one regression to also verify that the system behaves as expected when the bridge fills up.	2012-08-22 11:39:59 -04:00
Andreas Hansson	a81c969529	CPU: Remove overloaded function_trace_start parameter This patch removes the overloading of the parameter, which seems both redundant, and possibly incorrect. The inorder CPU is particularly interesting as it uses a different name for the parameter, and never makes any use of it internally.	2012-08-21 05:49:43 -04:00
Andreas Hansson	016593f2e9	Clock: Make Tick unsigned and remove UTick This patch makes the Tick unsigned and removes the UTick typedef. The ticks should never be negative, and there was only one major issue with removing it, caused by the o3 CPU using a -1 as an initial value. The patch has no impact on any regressions.	2012-08-21 05:49:09 -04:00
Anthony Gutierrez	0b3897fc90	O3,ARM: fix some problems with drain/switchout functionality and add Drain DPRINTFs This patch fixes some problems with the drain/switchout functionality for the O3 cpu and for the ARM ISA and adds some useful debug print statements. This is an incremental fix as there are still a few bugs/mem leaks with the switchout code. Particularly when switching from an O3CPU to a TimingSimpleCPU. However, when switching from O3 to O3 cores with the ARM ISA I haven't encountered any more assertion failures; now the kernel will typically panic inside of simulation.	2012-08-15 10:38:08 -04:00
Anthony Gutierrez	8133f2460f	checker: make checker cpu id match its host's cpu id when using the checker I ran into problems where an instruction reading the cpu id register failed because the ids did not match, and hence, the result of the instruction did not match. this patch ensures that the ids match so this instruction does not fail. this problem only seemed to manifest itself when multiple cores were in the system, either multi-core, or extra switched-out cores present in the system.	2012-07-27 16:08:04 -04:00
Andreas Hansson	b265d9925c	Port: Align port names in C++ and Python This patch is a first step to align the port names used in the Python world and the C++ world. Ultimately it serves to make the use of config.json together with output from the simulation easier, including post-processing of statistics. Most notably, the CPU, cache, and bus are addressed in this patch, and there might be other ports that should be updated accordingly. The dash name separator has also been replaced with a "." which is what is used to concatenate the names in python, and a separation is made between the master and slave port in the bus.	2012-07-09 12:35:39 -04:00
Andreas Hansson	ff5718f042	Fix: Address a few benign memory leaks This patch is the result of static analysis identifying a number of memory leaks. The leaks are all benign as they are a result of not deallocating memory in the destructor. The fix still has value as it removes false positives in the static analysis.	2012-07-09 12:35:30 -04:00
Nathanael Premillieu	af2b14a362	O3: Track if the RAS has been pushed or not to pop the RAS if necessary. Add new flag (named pushedRAS) in the PredictorHistory structure. This flag tracks whether the RAS has been pushed or not during a prediction. Then, in the squash function it is used to pop the RAS if necessary.	2012-06-29 11:18:29 -04:00
Ali Saidi	20d25b9da7	ISA: Back-out NoopMachInst as a StaticInstPtr change.	2012-06-05 13:52:30 -04:00
Ali Saidi	6df196b71e	O3: Clean up the O3 structures and try to pack them a bit better. DynInst is extremely large; the hope is that this re-organization will put the most used members close to each other.	2012-06-05 01:23:09 -04:00
Ali Saidi	1b370431d0	sim: Remove FastAlloc While FastAlloc provides a small performance increase (~1.5%) over regular malloc it isn't thread safe. After removing FastAlloc and using tcmalloc I've seen a performance increase of 12% over libc malloc when running twolf for ARM.	2012-06-05 01:23:08 -04:00
Gabe Black	008b17d816	ISA: Turn the ExtMachInst NoopMachinst into the StaticInstPtr NoopStaticInst. This eliminates a use of the ExtMachInst type outside of the ISAs.	2012-06-04 10:57:23 -07:00
Gabe Black	0cba96ba6a	CPU: Merge the predecoder and decoder. These classes are always used together, and merging them will give the ISAs more flexibility in how they cache things and manage the process. --HG-- rename : src/arch/x86/predecoder_tables.cc => src/arch/x86/decoder_tables.cc	2012-05-26 13:44:46 -07:00
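
Commit 1429d21244 above describes moving the increment/decrement of wbOutstanding out of the comparison in incrWb and decrWb. The following is a minimal C++ sketch of that pattern only, not the gem5 source: the WbTracker struct, the wbMax limit, and the ableToIssue flag are hypothetical names introduced for illustration; only wbOutstanding, incrWb, and decrWb are taken from the commit message.

```cpp
// Hypothetical stand-in for writeback bookkeeping; not the gem5 IEW class.
struct WbTracker {
    int wbOutstanding = 0;   // writebacks currently in flight
    int wbMax;               // assumed limit on outstanding writebacks
    bool ableToIssue = true; // assumed flag cleared when the limit is hit

    explicit WbTracker(int max) : wbMax(max) {}

    void incrWb() {
        // Update first, then compare, instead of `if (++wbOutstanding == wbMax)`.
        ++wbOutstanding;
        if (wbOutstanding == wbMax)
            ableToIssue = false;
    }

    void decrWb() {
        // Compare first, then update, instead of `if (wbOutstanding-- == wbMax)`.
        if (wbOutstanding == wbMax)
            ableToIssue = true;
        --wbOutstanding;
    }
};
```

Keeping the side effect and the comparison in separate statements makes the order of operations explicit to readers, and does not depend on how a compiler handles a combined expression.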

591 commits