sanchayanmaity/gem5 - Sanchayan Maity's repositories

Author	SHA1	Message	Date
Korey Sewell	d64226750e	inorder: remove request map, use request vector. Take away all instances of reqMap in the code and make all references use the built-in request vectors inside each resource. The request map dynamically allocated a request per instruction. The request vector allocates N requests during instantiation, and the surrounding code is fixed up to reuse those N requests. setRequest() and clearRequest() are the new accessors needed to define a new request in a resource.	2011-02-18 14:28:30 -05:00
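The allocation strategy described in d64226750e can be sketched roughly as follows. This is a minimal illustration and not the gem5 code: `Resource`, `ResourceRequest`, `slotsInUse()` and all member names are hypothetical. Only the accessor names `setRequest()` and `clearRequest()` come from the commit message, and the signatures used here are assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical request record; a real request would carry far more state.
struct ResourceRequest {
    bool valid = false;   // is this slot currently in use?
    int instSeqNum = -1;  // instruction that owns the slot
};

// Hypothetical resource that owns a fixed pool of request slots.
class Resource {
  public:
    // All slots are allocated once, at construction.
    explicit Resource(std::size_t numSlots) : reqs(numSlots) {}

    // Claim the first free slot; returns its index, or -1 if none is free.
    int setRequest(int instSeqNum) {
        for (std::size_t i = 0; i < reqs.size(); ++i) {
            if (!reqs[i].valid) {
                reqs[i].valid = true;
                reqs[i].instSeqNum = instSeqNum;
                return static_cast<int>(i);
            }
        }
        return -1;
    }

    // Release a slot so that a later instruction can reuse it.
    void clearRequest(int slot) {
        assert(slot >= 0 && static_cast<std::size_t>(slot) < reqs.size());
        reqs[slot] = ResourceRequest{};
    }

    // Number of slots currently claimed.
    std::size_t slotsInUse() const {
        std::size_t n = 0;
        for (const auto &r : reqs)
            n += r.valid ? 1 : 0;
        return n;
    }

  private:
    std::vector<ResourceRequest> reqs;
};
```

With a pool like this, no heap allocation happens per instruction; a full pool simply reports failure and the caller has to stall or retry.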
Korey Sewell	ff48afcf4f	inorder: remove reqRemoveList. We are getting away from creating new resource requests for every instruction, so there is no longer any need to keep track of a reqRemoveList and clean it up every tick.	2011-02-18 14:28:10 -05:00
Korey Sewell	470aa289da	inorder: clean up the old way of inst. scheduling. Remove remnants of the old way of instruction scheduling, which dynamically allocated a new resource schedule for every instruction.	2011-02-12 10:14:48 -05:00
Korey Sewell	e26aee514d	inorder: utilize cached skeds in pipeline. Allow the pipeline and resources to use the cached instruction schedule and resource sked iterator.	2011-02-12 10:14:45 -05:00
Korey Sewell	ec9b2ec251	inorder: stage scheduler for front/back end schedule creation. Add a stage scheduler class to replace InstStage in pipeline_traits.cc, and use that class to define a default front-end resource schedule that all instructions will follow. This will also replace the back end schedule in pipeline_traits.cc. The reason for adding this is so that we can cache instruction schedules in the future instead of calling the same function over and over again, as well as constantly dynamically allocating memory on every instruction to try to figure out its schedule.	2011-02-12 10:14:40 -05:00
Korey Sewell	6713dbfe08	inorder: cache instruction schedules. First step in an optimization to not dynamically allocate an instruction schedule for every instruction but rather use cached schedules.	2011-02-12 10:14:36 -05:00
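The idea of reusing a cached schedule instead of building one per instruction can be sketched as below. This is only an illustration under stated assumptions: `ScheduleCache`, `ScheduleEntry`, the integer instruction-class key and the placeholder `build()` are all hypothetical and are not taken from the gem5 sources.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <vector>

// Hypothetical schedule entry: which resource to use in which stage.
struct ScheduleEntry {
    int stageNum;
    int resourceId;
};
using Schedule = std::vector<ScheduleEntry>;

// Hypothetical cache keyed by an instruction-class id.
class ScheduleCache {
  public:
    // Return the shared schedule for a class, building it on first use only.
    std::shared_ptr<const Schedule> get(int instClass) {
        assert(instClass >= 0);
        auto it = cache.find(instClass);
        if (it != cache.end())
            return it->second;  // cache hit: no new allocation
        ++builds;
        std::shared_ptr<const Schedule> sked =
            std::make_shared<Schedule>(build(instClass));
        cache.emplace(instClass, sked);
        return sked;
    }

    // How many schedules have actually been constructed.
    int buildCount() const { return builds; }

  private:
    // Placeholder builder; a real one would follow the pipeline description.
    static Schedule build(int instClass) {
        return Schedule{{0, 0}, {1, instClass}};
    }

    std::map<int, std::shared_ptr<const Schedule>> cache;
    int builds = 0;
};
```

Every instruction of the same class then shares one immutable schedule object, so the per-instruction cost is a lookup rather than an allocation.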
Korey Sewell	0c6a679359	inorder: stage width as a python parameter. Allow the user to specify how many instructions a pipeline stage can process on any given cycle (stageWidth, i.e. bandwidth) by setting the parameter through the python interface rather than recompiling the code after changing the *.cc file. (We always had the parameter there, but still used the static 'ThePipeline::StageWidth' instead.) Since StageWidth is now dynamically defined, change the interstage communication structure to use a vector and get rid of the array and array handling index (toNextStageIndex), since we can just make calls to the list for the same information.	2011-02-04 00:08:18 -05:00
Korey Sewell	cd5a7f7221	inorder: fix RUBY_FS build. The current code was using an incorrect dummy instruction in the interrupts function.	2011-01-12 11:52:29 -05:00
Steve Reinhardt	6f1187943c	Replace curTick global variable with accessor functions. This step makes it easy to replace the accessor functions (which still access a global variable) with ones that access per-thread curTick values.	2011-01-07 21:50:29 -08:00
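The accessor pattern described in 6f1187943c might look roughly like this. It is a sketch only: `Tick` as a 64-bit integer, the `detail::tickStorage()` helper and the `setCurTick()` name are assumptions made for illustration, not the names used in the changeset.

```cpp
#include <cassert>
#include <cstdint>

using Tick = std::uint64_t;  // assumed representation of simulated time

namespace detail {
// Still a single shared value in this sketch; callers never name it directly,
// so it could later be swapped for per-thread storage without touching them.
inline Tick &tickStorage() {
    static Tick t = 0;
    return t;
}
}  // namespace detail

// Read accessor that replaces direct reads of the global variable.
inline Tick curTick() { return detail::tickStorage(); }

// Hypothetical write accessor.
inline void setCurTick(Tick t) { detail::tickStorage() = t; }
```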
Steve Reinhardt	d60c293bbc	inorder: replace schedEvent() code with reschedule(). There were several copies of similar functions that looked like they all replicated reschedule(), so I replaced them with direct calls. Keeping this separate from the previous cset since there may be some subtle functional differences if the code ever reschedules an event that is scheduled but not squashed (though none were detected in the regressions).	2011-01-07 21:50:29 -08:00
Steve Reinhardt	214cc0fafc	inorder: get rid of references to mainEventQueue. Events need to be scheduled on the queue assigned to the SimObject, not on the global queue (which should be going away). Also cleaned up a number of redundant expressions that made the code unnecessarily verbose.	2011-01-07 21:50:29 -08:00
Ali Saidi	cdacbe734a	ARM/Alpha/Cpu: Change prefetches to be more like normal loads. This change modifies the way prefetches work. They are now like normal loads that don't write back a register. Previously prefetches were supposed to call prefetch() on the execution context, so they executed with execute() methods instead of initiateAcc() completeAcc(). The prefetch() methods for all the CPUs are blank, meaning that they get executed, but don't actually do anything. On Alpha dead cache copy code was removed and prefetches are now normal ops. They count as executed operations, but still don't do anything and IsMemRef is no longer set on them. On ARM IsDataPrefetch or IsInstructionPreftech is now set on all prefetch instructions. The timing simple CPU doesn't try to do anything special for prefetches now and they execute with the normal memory code path.	2010-11-08 13:58:22 -06:00
Gabe Black	6f4bd2c1da	ISA,CPU,etc: Create an ISA defined PC type that abstracts out ISA behaviors. This change is a low level and pervasive reorganization of how PCs are managed in M5. Back when Alpha was the only ISA, there were only 2 PCs to worry about, the PC and the NPC, and the lsb of the PC signaled whether or not you were in PAL mode. As other ISAs were added, we had to add an NNPC, micro PC and next micropc, x86 and ARM introduced variable length instruction sets, and ARM started to keep track of mode bits in the PC. Each CPU model handled PCs in its own custom way that needed to be updated individually to handle the new dimensions of variability, or, in the case of ARM's mode-bit-in-the-pc hack, the complexity could be hidden in the ISA at the ISA implementation's expense. Areas like the branch predictor hadn't been updated to handle branch delay slots or micropcs, and it turns out that had introduced a significant (10s of percent) performance bug in SPARC and to a lesser extent MIPS. Rather than perpetuate the problem by reworking O3 again to handle the PC features needed by x86, this change was introduced to rework PC handling in a more modular, transparent, and hopefully efficient way. PC type: Rather than having the superset of all possible elements of PC state declared in each of the CPU models, each ISA defines its own PCState type which has exactly the elements it needs. A cross product of canned PCState classes is defined in the new "generic" ISA directory for ISAs with/without delay slots and microcode. These are either typedef-ed or subclassed by each ISA. To read or write this structure through a Context, you use the new pcState() accessor which reads or writes depending on whether it has an argument. If you just want the address of the current or next instruction or the current micro PC, you can get those through read-only accessors on either the PCState type or the Contexts. These are instAddr(), nextInstAddr(), and microPC().
Note the move away from readPC. That name is ambiguous since it's not clear whether or not it should be the actual address to fetch from, or if it should have extra bits in it like the PAL mode bit. Each class is free to define its own functions to get at whatever values it needs, however it needs to, for use in ISA specific code. Eventually Alpha's PAL mode bit could be moved out of the PC and into a separate field like ARM. These types can be reset to a particular pc (where npc = pc + sizeof(MachInst), nnpc = npc + sizeof(MachInst), upc = 0, nupc = 1 as appropriate), printed, serialized, and compared. There is a branching() function which encapsulates code in the CPU models that checked if an instruction branched or not. Exactly what that means in the context of branch delay slots which can skip an instruction when not taken is ambiguous, and ideally this function and its uses can be eliminated. PCStates also generally know how to advance themselves in various ways depending on if they point at an instruction, a microop, or the last microop of a macroop. More on that later. Ideally, accessing all the PCs at once when setting them will improve performance of M5 even though more data needs to be moved around. This is because often all the PCs need to be manipulated together, and by getting them all at once you avoid multiple function calls. Also, the PCs of a particular thread will have spatial locality in the cache. Previously they were grouped by element in arrays which spread out accesses. Advancing the PC: The PCs were previously managed entirely by the CPU which had to know about PC semantics, try to figure out which dimension to increment the PC in, what to set NPC/NNPC, etc. These decisions are best left to the ISA in conjunction with the PC type itself.
Because most of the information about how to increment the PC (mainly what type of instruction it refers to) is contained in the instruction object, a new advancePC virtual function was added to the StaticInst class. Subclasses provide an implementation that moves around the right element of the PC with a minimal amount of decision making. In ISAs like Alpha, the instructions always simply assign NPC to PC without having to worry about micropcs, nnpcs, etc. The added cost of a virtual function call should be outweighed by not having to figure out as much about what to do with the PCs and mucking around with the extra elements. One drawback of making the StaticInsts advance the PC is that you have to actually have one to advance the PC. This would, superficially, seem to require decoding an instruction before fetch could advance. This is, as far as I can tell, realistic. fetch would advance through memory addresses, not PCs, perhaps predicting new memory addresses using existing ones. More sophisticated decisions about control flow would be made later on, after the instruction was decoded, and handed back to fetch. If branching needs to happen, some amount of decoding needs to happen to see that it's a branch, what the target is, etc. This could get a little more complicated if that gets done by the predecoder, but I'm choosing to ignore that for now. Variable length instructions: To handle variable length instructions in x86 and ARM, the predecoder now takes in the current PC by reference to the getExtMachInst function. It can modify the PC however it needs to (by setting NPC to be the PC + instruction length, for instance). This could be improved since the CPU doesn't know if the PC was modified and always has to write it back. ISA parser: To support the new API, all PC related operand types were removed from the parser and replaced with a PCState type. There are two warts on this implementation. 
First, as with all the other operand types, the PCState still has to have a valid operand type even though it doesn't use it. Second, using syntax like PCS.npc(target) doesn't work for two reasons: it looks like the syntax for operand type overriding, and the parser can't figure out if you're reading or writing. Instructions that use the PCS operand (which I've consistently called it) need to first read it into a local variable, manipulate it, and then write it back out. Return address stack: The return address stack needed a little extra help because, in the presence of branch delay slots, it has to merge together elements of the return PC and the call PC. To handle that, a buildRetPC utility function was added. There are basically only two versions in all the ISAs, but it didn't seem short enough to put into the generic ISA directory. Also, the branch predictor code in O3 and InOrder was adjusted so that it always stores the PC of the actual call instruction in the RAS, not the next PC. If the call instruction is a microop, the next PC refers to the next microop in the same macroop which is probably not desirable. The buildRetPC function advances the PC intelligently to the next macroop (in an ISA specific way) so that that case works. Change in stats: There was no change in stats except in MIPS and SPARC in the O3 model. MIPS runs in about 9% fewer ticks. SPARC runs with 30%-50% fewer ticks, which could likely be improved further by setting call/return instruction flags and taking advantage of the RAS. TODO: Add != operators to the PCState classes, defined trivially to be !(a==b). Smooth out places where PCs are split apart, passed around, and put back together later. I think this might happen in SPARC's fault code. Add ISA specific constructors that allow setting PC elements without calling a bunch of accessors. Try to eliminate the need for the branching() function.
Factor out Alpha's PAL mode pc bit into a separate flag field, and eliminate places where it's blindly masked out or tested in the PC.	2010-10-31 00:07:20 -07:00
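The PC type and advancePC mechanism described in 6f4bd2c1da can be illustrated with a small sketch for an ISA without delay slots or microcode. The accessor names `instAddr()`, `nextInstAddr()`, `microPC()`, `branching()` and `advancePC` come from the commit message; everything else (`SimplePCState`, `StaticInstSketch`, the fixed 4-byte instruction size, the `set()`, `npc()` and `advance()` helpers) is assumed for illustration and is not the actual gem5 implementation.

```cpp
#include <cassert>
#include <cstdint>

using Addr = std::uint64_t;

// Hypothetical PC state holding only the elements this simple ISA needs.
class SimplePCState {
  public:
    static constexpr Addr InstSize = 4;  // assumed fixed instruction width

    explicit SimplePCState(Addr pc = 0) { set(pc); }

    // Reset to a particular pc, with npc = pc + instruction size.
    void set(Addr pc) { _pc = pc; _npc = pc + InstSize; }

    // Read-only accessors named in the commit message.
    Addr instAddr() const { return _pc; }
    Addr nextInstAddr() const { return _npc; }
    Addr microPC() const { return 0; }  // no microcode in this sketch

    // Redirect control flow, as a branch instruction would.
    void npc(Addr target) { _npc = target; }

    // True if the next PC is not the fall-through address.
    bool branching() const { return _npc != _pc + InstSize; }

    // Assign NPC to PC, then compute the new fall-through NPC.
    void advance() { _pc = _npc; _npc += InstSize; }

    bool operator==(const SimplePCState &o) const {
        return _pc == o._pc && _npc == o._npc;
    }
    // Defined trivially as !(a == b), as the TODO list suggests.
    bool operator!=(const SimplePCState &o) const { return !(*this == o); }

  private:
    Addr _pc = 0;
    Addr _npc = 0;
};

// Hypothetical instruction base class: the instruction, not the CPU model,
// decides how the PC moves.
struct StaticInstSketch {
    virtual ~StaticInstSketch() = default;
    virtual void advancePC(SimplePCState &pc) const { pc.advance(); }
};
```

An ISA with microcode would override `advancePC` to step the micro PC instead, which is the kind of decision the commit moves out of the CPU models.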
Gabe Black	ab8d7eee76	CPU: Fix O3 and possible InOrder segfaults in FS.	2010-09-20 02:46:42 -07:00
Gabe Black	6833ca7eed	Faults: Pass the StaticInst involved, if any, to a Fault's invoke method. Also move the "Fault" reference counted pointer type into a separate file, sim/fault.hh. It would be better to name this less similarly to sim/faults.hh to reduce confusion, but fault.hh matches the name of the type. We could change Fault to FaultPtr to match other pointer types, and then changing the name of the file would make more sense.	2010-09-13 19:26:03 -07:00
Gabe Black	aa8c6e9c95	CPU: Add readBytes and writeBytes functions to the exec contexts.	2010-08-13 06:16:02 -07:00
Korey Sewell	84489c5874	inorder: remove another debug stat	2010-06-28 07:33:33 -04:00
Korey Sewell	6bfd766f2c	inorder: resource scheduling backend. Replace the priority queue with a vector of lists (1 list per stage) and place it inside a class so that we have more control over when an instruction uses a particular schedule entry. Also, this is the 1st step toward making the InOrderCPU fully parameterizable. See the wiki for details on this process.	2010-06-25 17:42:34 -04:00
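The "vector of lists, one list per stage" structure from 6bfd766f2c can be sketched as follows. The class and member names (`ResourceSchedule`, `SchedEntry`, `push`, `pop`, `stageDone`) are hypothetical and chosen only to illustrate the data layout; they are not taken from the gem5 sources.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <vector>

// Hypothetical schedule entry: a resource and the command to issue to it.
struct SchedEntry {
    int resourceId;
    int cmd;
};

// Hypothetical schedule: one ordered list of entries per pipeline stage.
class ResourceSchedule {
  public:
    explicit ResourceSchedule(std::size_t numStages) : stages(numStages) {}

    // Append an entry to the given stage's list.
    void push(std::size_t stage, SchedEntry e) {
        assert(stage < stages.size());
        stages[stage].push_back(e);
    }

    // True when the stage has no entries left to process.
    bool stageDone(std::size_t stage) const {
        return stages.at(stage).empty();
    }

    // Remove and return the next entry for a stage.
    // The caller is expected to check stageDone() first.
    SchedEntry pop(std::size_t stage) {
        assert(!stageDone(stage));
        SchedEntry e = stages[stage].front();
        stages[stage].pop_front();
        return e;
    }

  private:
    std::vector<std::list<SchedEntry>> stages;
};
```

Compared with a single priority queue, entries are addressed directly by stage, so a stage only ever inspects its own list.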
Korey Sewell	f95430d97e	inorder: enforce 78-character rule	2010-06-24 15:34:12 -04:00
Korey Sewell	9f0d8f252c	inorder-stats: add instruction type stats. Also, remove inst-req stats as default. They are good for debugging, but in terms of pure processor stats they aren't useful.	2010-06-23 18:18:20 -04:00
Korey Sewell	7695d4c63f	inorder: tick scheduling. Use nextCycle to calculate ticks after addition.	2010-06-23 18:14:59 -04:00
Korey Sewell	4ac245737d	inorder: fix address list bug	2010-03-22 15:38:28 -04:00
Korey Sewell	c7f6e2661c	inorder: double delete inst bug. Make sure that instructions are not dereferenced/deleted twice by marking that they are on the remove list.	2010-01-31 18:30:59 -05:00
Korey Sewell	9357e353fc	inorder: inst count mgmt	2010-01-31 18:30:48 -05:00
Korey Sewell	ea8909925f	inorder: add activity stats	2010-01-31 18:30:24 -05:00
Korey Sewell	f3bc2df663	inorder: object cleanup in destructors	2010-01-31 18:30:08 -05:00
Korey Sewell	1a89e8f4cb	inorder: use per-thread dummy insts/reqs	2010-01-31 18:29:59 -05:00
Korey Sewell	0b29c2d057	inorder: ctxt switch stats - m5 line enforcement on use_def.cc,hh	2010-01-31 18:28:59 -05:00
Korey Sewell	ffa9ecb1fa	inorder: pipeline stage stats add idle/run/utilization stats for each pipeline stage	2010-01-31 18:28:51 -05:00
Korey Sewell	b4e0ef7837	inorder: set thread status. Set Active/Suspended/Halted status for threads. Useful for the system when determining if/when to exit simulation.	2010-01-31 18:28:12 -05:00
Korey Sewell	5e0b8337ed	inorder: add/remove halt/deallocate context respectively. Halt is called from the exit() system call while deallocate is unused. So to clear things up, just use halt and remove deallocate.	2010-01-31 18:28:05 -05:00
Korey Sewell	069b38c0d5	inorder: track last branch committed. When threads are switching in/out of the CPU, we need to keep track of special cases like branches. Add appropriate variables in ThreadState to track this and then use these variables when updating the pc after a context switch.	2010-01-31 18:27:58 -05:00
Korey Sewell	90d3b45a56	inorder: ready thread wakeup. Allow a thread to wake up and be activated after it has been in suspended state and another thread is switched out. Need to give pipeline stages an "activateThread" function so that they can get to their suspended instruction when the time is right.	2010-01-31 18:27:38 -05:00
Korey Sewell	96b493d315	inorder: ready/suspend status fns. Update/add in the use of isThreadReady & isThreadSuspended functions. Check in activateThread what list a thread is on so it can be managed accordingly.	2010-01-31 18:26:47 -05:00
Korey Sewell	d9eaa2fe21	inorder-cleanup: remove unused thread functions	2010-01-31 18:26:40 -05:00
Korey Sewell	e1fcc64980	inorder: activate thread on cache miss. Support the ability to activate the next ready thread after a cache miss through the activateNextReadyContext/Thread() functions. To support this, a "readyList" of thread ids is added. After a cache miss, a thread will suspend and then call activateNextReadyThread.	2010-01-31 18:26:32 -05:00
Korey Sewell	4a945aab19	inorder: add event priority offset. Allow events to schedule themselves later if desired. This is important in cases where you need to activate a thread only after the previous thread has been deactivated. The ordering there has to be enforced.	2010-01-31 18:26:26 -05:00
Korey Sewell	eac5eac67a	inorder: squash on memory stall. Add code to recognize memory stalls in resources and the pipeline, as well as squash a thread if there is a stall and we are in the switch-on-cache-miss model.	2010-01-31 18:26:13 -05:00
Korey Sewell	d8e0935af2	inorder: add insts to cpu event. Some events are going to need instruction data when they process, so just include the instruction in the event construction.	2010-01-31 18:26:03 -05:00
Korey Sewell	0e96798fe0	configs/inorder: add options for switch-on-miss to inorder cpu	2010-01-31 18:25:13 -05:00
Korey Sewell	7b3b362ba5	inorder: init internal debug cpu counters - cpuEventNum - resReqCount	2010-01-31 17:18:15 -05:00
Korey Sewell	f09f84da6e	inorder-debug: print out workload	2009-10-01 09:35:06 -04:00
Korey Sewell	25d1f2728a	inorder-debug: fix cpu tick debug message	2009-09-25 11:18:55 -04:00
Nathan Binkert	d9f39c8ce7	arch: nuke arch/isa_specific.hh and move stuff to generated config/the_isa.hh	2009-09-23 08:34:21 -07:00
Korey Sewell	badb2382a8	inorder-alpha-fs: edit inorder model to compile FS mode	2009-09-15 01:44:48 -04:00
Gabe Black	c9a27d85b9	Get rid of the unused get(Data|Inst)Asid and (inst|data)Asid functions.	2009-07-08 23:02:22 -07:00
Gabe Black	a480ba00b9	Registers: Eliminate the ISA defined integer register file.	2009-07-08 23:02:20 -07:00
Gabe Black	0cb180ea0d	Registers: Eliminate the ISA defined floating point register file.	2009-07-08 23:02:20 -07:00
Gabe Black	25884a8773	Registers: Get rid of the float register width parameter.	2009-07-08 23:02:20 -07:00
Gabe Black	32daf6fc3f	Registers: Add an ISA object which replaces the MiscRegFile. This object encapsulates (or will eventually) the identity and characteristics of the ISA in the CPU.	2009-07-08 23:02:20 -07:00
Nathan Binkert	6faf377b53	types: clean up types, especially signed vs unsigned	2009-06-04 23:21:12 -07:00
Nathan Binkert	47877cf2db	types: add a type for thread IDs and try to use it everywhere	2009-05-26 09:23:13 -07:00
Korey Sewell	db2b721380	inorder-tlb-cunit: merge the TLB as implicit to any memory access. TLBUnit is no longer used, and we also get rid of the memAccSize and memAccFlags functions added to ISA and StaticInst, since the TLB is not a separate resource to acquire. Instead, TLB access is done before any read/write to memory and the result is checked before it's sent out to memory.	2009-05-12 15:01:16 -04:00
Korey Sewell	3a057bdbb1	inorder-tlb: squash insts in TLB correctly. The TLB had a bug where, if it was stalled and waiting, it would not correctly squash all instructions older than the squashed instruction.	2009-05-12 15:01:16 -04:00
Korey Sewell	fe4cd9847d	inorder-stc: update interface to handle store conditionals	2009-05-12 15:01:15 -04:00
Korey Sewell	1c7e988272	inorder-mem: skeleton support for prefetch/writehints	2009-05-12 15:01:15 -04:00
Korey Sewell	5127ea226a	inorder-unified-tlb: use unified TLB instead of old TLB model	2009-05-12 15:01:14 -04:00
Korey Sewell	1c8dfd9254	inorder-alpha-port: initial inorder support of ALPHA. Edit AlphaISA to support the inorder model. Mostly alternate constructor functions and also a few skeleton multithreaded support functions. Remove namespace from header file; it causes compiler issues that are hard to find. Separate the TLB from the CPU and allow it to live in the TLBUnit resource. Give the CPU accessor functions for access and also bind at construction time. Expose memory access size and flags through the instruction object (temporarily memAccSize and memFlags, to get TLB stuff working).	2009-05-12 15:01:13 -04:00
Steve Reinhardt	14808ecac9	o3, inorder: fix FS bug due to initializing ThreadState to Halted. For some reason o3 FS init() only called initCPU if the thread state was Suspended, which was no longer the case. There's no apparent reason to check, so I whacked the test completely rather than changing the check to Halted. The inorder init() was also updated to be symmetric, though the previous code was just a fancy no-op.	2009-04-17 16:54:58 -07:00
Steve Reinhardt	7617dcf736	ThreadState: initialize status to Halted in constructor. This provides a common initial status for all threads independent of CPU model (unlike the prior situation where CPUs initialized threads to inconsistent states). This mostly matters for SE mode; in FS mode, ISA-specific startupCPU() methods generally handle boot-time initialization of thread contexts (since the right thing to do is ISA-dependent).	2009-04-15 13:18:24 -07:00
Korey Sewell	9e1dc7f205	InOrderCPU: Clean up Constructors to initialize variables correctly (i.e. in a way for the compiler to play nice)	2009-03-04 22:37:45 -05:00
Korey Sewell	30cd2d21fa	Remove unused functions/comments cluttering up the code.	2009-03-04 13:17:08 -05:00
Korey Sewell	f69b018571	Make handling of interstage buffers (i.e. StageQueues) more consistent: (1) number from 0-n, not 1-n+1, (2) always check nextStageValid before a stageNum+1 reference and prevStageValid before a stageNum-1 reference, (3) add skidSize() to get the StageQueue size for all threads.	2009-03-04 13:17:07 -05:00
Korey Sewell	846f953c2b	Give TimeBuffer an ID that can be set. Necessary because InOrder uses generic stages so w/o an ID there is no way to differentiate buffers when debugging	2009-03-04 13:16:49 -05:00
Korey Sewell	e4aa4ca40c	use numCycles instead of simTicks to determine CPI stat in InOrder	2009-03-04 13:16:48 -05:00
Gabe Black	9a000c5173	Processes: Make getting and setting system call arguments part of a process object.	2009-02-27 09:22:14 -08:00
Korey Sewell	973d8b8b13	InOrder: Import new inorder CPU model from MIPS. This model currently only works in MIPS_SE mode, so it will take some effort to clean it up and make it generally useful. Hopefully people are willing to help make that happen!	2009-02-10 15:49:29 -08:00


117 commits