sanchayanmaity/gem5 - Sanchayan Maity's repositories

Author	SHA1	Message	Date
Ali Saidi	e572cf93ee	ARM: Delete OABI syscall handling. We only support EABI binaries, so there is no reason to support OABI syscalls. The loader detects OABI calls and fatal() so there is no reason to even check here.	2011-02-23 15:10:48 -06:00
Ali Saidi	511c637ab0	CLCD: Fix some serialization bugs with the clcd controller.	2011-02-23 15:10:48 -06:00
Ali Saidi	79dac89552	ARM: Clarifies creation of Linux and baremetal ARM systems. makeArmSystem creates both bare-metal and Linux systems more cleanly. machine_type was never optional though listed as an optional argument; a system such as "RealView_PBX" must now be explicitly specified. Now that it is a required argument, the placement of the arguments has changed slightly requiring some changes to calls that create ARM systems.	2011-02-23 15:10:48 -06:00
Ali Saidi	e2a6275c03	ARM: Add support for read of 100MHz clock in system controller.	2011-02-23 15:10:48 -06:00
Ali Saidi	2157b9976b	ARM: Reset simulation statistics when perf counters are reset. The ARM performance counters are not currently supported by the model. This patch interprets a 'reset performance counters' command to mean 'reset the simulator statistics' instead.	2011-02-23 15:10:48 -06:00
Ali Saidi	d63020717c	ARM: Adds dummy support for a L2 latency miscreg.	2011-02-23 15:10:48 -06:00
Korey Sewell	981e1dd7ee	configs: cache: add cache line size option	2011-02-23 14:26:55 -05:00
Korey Sewell	fb92578415	configs: set default cache params It's confusing (especially to new users), when you are setting some standard parameters (as defined in Options.py) and they aren't reflected in the simulations so we might as well link the settings in CacheConfig.py to those in Options.py	2011-02-23 01:01:46 -05:00
Korey Sewell	78c37b8048	ruby: extend dprintfs for RubyGenerated TraceFlag "executing" isn't a very descriptive debug message, and in going through the output you get multiple messages that say "executing" but nothing to help you parse through the code/execution. So instead, at least print out the name of the action that is taking place in these functions.	2011-02-23 00:58:42 -05:00
Korey Sewell	67cc52a605	ruby: cleaning up RubyQueue and RubyNetwork dprintfs Overall, continue to progress Ruby debug messages to more of the normal M5 debug message style - add a name() to the Ruby Throttle & PerfectSwitch objects so that the debug output isn't littered w/"global:" everywhere. - clean up messages that print over multiple lines when possible - clean up duplicate prints in the message buffer	2011-02-23 00:58:40 -05:00
Brad Beckmann	63a25a56cc	m5: merged in hammer fix	2011-02-22 11:16:40 -08:00
Nilay Vaish	77eed184f5	Ruby: Machine Type missing in MOESI CMP directory protocol In certain actions of the L1 cache controller, while creating an outgoing message, the machine type was not being set. This results in a segmentation fault when trace is collected. Joseph Pusudesris provided his patch for fixing this issue.	2011-02-19 17:32:43 -06:00
Nilay Vaish	293ccb7037	Ruby: clean MOESI CMP directory protocol The L1 cache controller file contains references to foo and goo queues, which are not in use at all. These have been removed.	2011-02-19 17:32:00 -06:00
Korey Sewell	66bb732c04	m5: merge inorder/release-notes/make_release changes	2011-02-18 14:35:15 -05:00
Korey Sewell	ab9c20cc78	inorder: regr-update: reduce dynamic mem. use to speedup sims previous changesets took a closer look at memory mgmt in the inorder model and sought to avoid dynamic memory mgmt (for access to pipeline resources) as much as possible. For the regressions that were run, the sims are about 2x speedup from changeset 7726 which is the last change since the recent commits in Feb. (note: these regressions now are 4-issue CPUs instead of just 1-issue)	2011-02-18 14:31:37 -05:00
Korey Sewell	bc16bbc158	inorder: add names and slot #s to res. dprints	2011-02-18 14:31:31 -05:00
Korey Sewell	64d31e75b9	inorder: ignore nops in execution unit	2011-02-18 14:30:38 -05:00
Korey Sewell	0fe19836c7	inorder: update graduation unit make sure instructions are able to commit before writing back to the RF; do not commit more than 1 non-speculative instruction per cycle	2011-02-18 14:30:05 -05:00
Korey Sewell	89335118a5	inorder: recognize isSerializeAfter flag keep track of when an instruction needs the execution behind it to be serialized. Without this, in SE Mode instructions can execute behind a system call exit().	2011-02-18 14:29:48 -05:00
Korey Sewell	bbffd9419d	inorder: update default thread size(=1) a lot of structures get allocated based off that MaxThreads parameter so this is an effort to not abuse it	2011-02-18 14:29:44 -05:00
Korey Sewell	a278df0b95	inorder: don't overuse getLatency() resources don't need to call getLatency because the latency is already a member in the class. If there is some type of special case where different instructions impose a different latency inside a resource then we can revisit this and add getLatency() back in	2011-02-18 14:29:40 -05:00
Korey Sewell	37df925953	inorder: update max. resource bandwidths each resource has a certain # of requests it can take per cycle. update the #s here to be more realistic based off of the pipeline width and if the resource needs to be accessed on multiple cycles	2011-02-18 14:29:31 -05:00
Korey Sewell	91c48b1c3b	inorder: cleanup in destructors cleanup hanging pointers and other cruft in the destructors	2011-02-18 14:29:26 -05:00
Korey Sewell	8b4b4a1ba5	inorder: fix cache/fetch unit memory leaks --- need to delete the cache request's data on clearRequest() now that we are recycling requests --- fetch unit needs to deallocate the fetch buffer blocks when they are replaced or squashed.	2011-02-18 14:29:17 -05:00
Korey Sewell	72b5233112	inorder: remove events for zero-cycle resources if a resource has a zero cycle latency (e.g. RegFile write), then don't allocate an event for it to use	2011-02-18 14:29:02 -05:00
Korey Sewell	d5961b2b20	inorder: update pipeline interface for handling finished resource reqs formerly, to free up bandwidth in a resource, we could just change the pointer in that resource but at the same time the pipeline stages had visibility to see what happened to a resource request. Now that we are recycling these requests (to avoid too much dynamic allocation), we can't throw away the request too early or the pipeline stage gets bad information. Instead, mark when a request is done with the resource altogether and then let the pipeline stage call back to the resource that it's time to free up the bandwidth for more instructions * interface notes * - When an instruction completes and is done in a resource for that cycle, call done() - When an instruction fails and is done with a resource for that cycle, call done(false) - When an instruction completes, but isn't finished with a resource, call completed() - When an instruction fails, but isn't finished with a resource, call completed(false) * * * inorder: tlbmiss wakeup bug fix	2011-02-18 14:28:37 -05:00
Korey Sewell	d64226750e	inorder: remove request map, use request vector take away all instances of reqMap in the code and make all references use the built-in request vectors inside of each resource. The request map was dynamically allocating a request per instruction. The request vector just allocates N number of requests during instantiation and then the surrounding code is fixed up to reuse those N requests *** setRequest() and clearRequest() are the new accessors needed to define a new request in a resource	2011-02-18 14:28:30 -05:00
Korey Sewell	c883729025	inorder: add valid bit for resource requests this will allow us to reuse resource requests within a resource instead of always dynamically allocating	2011-02-18 14:28:22 -05:00
Korey Sewell	ff48afcf4f	inorder: remove reqRemoveList we are going to be getting away from creating new resource requests for every instruction so no more need to keep track of a reqRemoveList and clean it up every tick	2011-02-18 14:28:10 -05:00
Korey Sewell	991d0185c6	inorder: initialize res. req. vectors based on resource bandwidth first change in an optimization that will stop InOrder from allocating new memory for every instruction's request to a resource. This gets expensive since every instruction needs to access ~10 requests before graduation. Instead, the plan is to allocate just enough resource request objects to satisfy each resource's bandwidth (e.g. the execution unit would need to allocate 3 resource request objects for a 1-issue pipeline since on any given cycle it could have 2 read requests and 1 write request) and then let the instructions contend and reuse those allocated requests. The end result is a smaller memory footprint for the InOrder model and increased simulation performance	2011-02-18 14:27:52 -05:00
Nathan Binkert	e3d8d43b17	merge alpha system files into tree	2011-02-16 10:57:04 -05:00
Gabe Black	9836972a13	Util: Get rid of the make_release.py script. Since we're not doing releases any more we don't really need this script. If we need it in the future, we can resurrect it from the history.	2011-02-15 23:22:32 -08:00
Nathan Binkert	dfd4f6ad93	Cleanup system directory to fit into modern M5 tree	2011-02-16 00:34:02 -06:00
Nathan Binkert	99e7e5e7ef	copyright: update copyright on alpha system files	2011-02-16 00:34:01 -06:00
Gabe Black	fde8b5c387	X86: Get rid of "inline" on the MicroPanic constructor in decoder.cc. This was making certain versions of gcc omit the function from the object file which would break the build.	2011-02-15 15:58:16 -08:00
Gabe Black	989138970e	Info: Clean up some info files. Get rid of RELEASE_NOTES since we no longer do releases, update some of the information in README, and update the date in LICENSE.	2011-02-14 21:36:37 -08:00
Nilay Vaish	343e94a257	Ruby: Improve PerfectSwitch's wakeup function Currently the wakeup function for the PerfectSwitch contains three loops: a loop over the virtual networks, a loop over the incoming links, and a loop that runs until all messages for this (link, network) have been routed. With an 8 processor mesh network and Hammer protocol, about 11-12% of the total execution time was observed to have been spent in this function, which is the highest amongst all the functions. It was found that the innermost loop is executed about 45 times per invocation of the wakeup function, when each invocation of the wakeup function processes just about one message. The patch tries to do away with the redundant executions of the innermost loop. Counters have been added for each virtual network that record the number of messages that need to be routed for that virtual network. The inner loops are only executed when the number of messages for that particular virtual network > 0. This does away with almost 80% of the executions of the innermost loop. The function now consumes about 5-6% of the total execution time.	2011-02-14 16:14:54 -06:00
Gabe Black	5ec5794456	X86: Update stats for the improved branch detection/prediction.	2011-02-13 17:46:04 -08:00
Gabe Black	77b4a37067	X86: Detect branches taking into account instruction size. The size of the current instruction determines what the npc should be if there's no branching.	2011-02-13 17:45:47 -08:00
Gabe Black	44306e8114	X86: Update stats now that the dest reg isn't read unnecessarily to set flags.	2011-02-13 17:45:30 -08:00
Gabe Black	bce2be525d	X86: Put the result used for flags in an intermediate variable. Using the destination register directly causes the ISA parser to treat it as a source even if none of the original bits are used.	2011-02-13 17:45:12 -08:00
Gabe Black	b046f3feb6	X86: Update stats for the reduced register reads.	2011-02-13 17:44:32 -08:00
Gabe Black	4e1adf85f7	X86: Don't read in dest regs if all bits are replaced. In x86, 32 and 64 bit writes to registers in which registers appear to be 32 or 64 bits wide overwrite all bits of the destination register. This change removes false dependencies in these cases where the previous value of a register doesn't need to be read to write a new value. New versions of most microops are created that have a "Big" suffix which simply overwrite their destination, and the right version to use is selected during microop allocation based on the selected data size. This does not change the performance of the O3 CPU model significantly, I assume because there are other false dependencies from the condition code bits in the flags register.	2011-02-13 17:44:24 -08:00
Gabe Black	399e095510	X86: On a bad micropc, return a microop that returns a fault that panics. This way a bad micropc will have to get all the way to commit before killing the simulation. This accounts for misspeculated branches.	2011-02-13 17:42:56 -08:00
Gabe Black	1aa9698fa0	X86: Define fault objects to carry debug messages. These faults can panic/warn/warn_once, etc., instead of instructions doing that themselves directly. That way, instructions can be speculatively executed, and only if they're actually going to commit will their fault be invoked and the panic, etc., happen.	2011-02-13 17:42:05 -08:00
Gabe Black	5ee94f4a3d	X86: Only reset npc to reflect instruction length once. When redirecting fetch to handle branches, the npc of the current pc state needs to be left alone. This change makes the pc state record whether or not the npc already reflects a real value by making it keep track of the current instruction size, or if no size has been set.	2011-02-13 17:41:10 -08:00
Gabe Black	f036fd9748	O3: Fetch from the microcode ROM when needed.	2011-02-13 17:40:07 -08:00
Ali Saidi	7c763b34c9	O3: Fix GCC 4.2.4 complaint	2011-02-13 16:51:15 -05:00
Nilay Vaish	0cede15d6c	Ruby: Reorder Cache Lookup in Protocol Files The patch changes the order in which L1 dcache and icache are looked up when a request comes in. Earlier, if a request came in for instruction fetch, the dcache was looked up before the icache, to correctly handle self-modifying code. But, in the common case, dcache is going to report a miss and the subsequent icache lookup is going to report a hit. Given the invariant that caches under the same controller keep track of disjoint sets of cache blocks, we can move the icache lookup before the dcache lookup. In case of a hit in the icache, using our invariant, we know that the dcache would have reported a miss. In case of a miss in the icache, we know that icache would have missed even if the dcache was looked up before looking up the icache. Effectively, we are doing the same thing as before, though in the common case, we expect reduction in the number of lookups. This was empirically confirmed for MOESI hammer. The ratio of lookups to access requests is now about 1.1 to 1.	2011-02-12 11:41:20 -06:00
Korey Sewell	2971b8401a	inorder:regress: host-inst-rate improved ~58% there are still only a few inorder benchmarks, but for the lengthier benchmarks (twolf and vortex) the latest changes to instruction scheduling (how instructions figure out what they want to do on each pipeline stage in the inorder model) were able to improve performance by a nice amount... The latest results for the inorder model process about 100k insts/second (note: 58% is over the last time run on 64-bit pool machines at UM)	2011-02-12 10:14:52 -05:00
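
Commit 343e94a257 in the log above describes skipping the inner loops of the PerfectSwitch wakeup function for any virtual network that has no messages waiting. The following is a minimal Python sketch of that idea only, not the actual Ruby C++ code. The class name, method names, and the link_visits counter are hypothetical and exist purely for illustration.

```python
from collections import deque


class PerfectSwitchSketch:
    """Toy switch model (hypothetical, not the gem5 PerfectSwitch class)."""

    def __init__(self, num_vnets, num_links):
        # queues[vnet][link] holds messages waiting to be routed.
        self.queues = [[deque() for _ in range(num_links)]
                       for _ in range(num_vnets)]
        # pending[vnet] counts messages queued on that virtual network.
        self.pending = [0] * num_vnets
        # Illustration only: how many (vnet, link) pairs wakeup() visited.
        self.link_visits = 0

    def enqueue(self, vnet, link, msg):
        self.queues[vnet][link].append(msg)
        self.pending[vnet] += 1

    def wakeup(self):
        routed = []
        for vnet, links in enumerate(self.queues):
            # Skip the inner loops when this virtual network is empty.
            if self.pending[vnet] == 0:
                continue
            for link, queue in enumerate(links):
                self.link_visits += 1
                # Drain every message for this (link, network) pair.
                while queue:
                    routed.append((vnet, link, queue.popleft()))
                    self.pending[vnet] -= 1
        return routed
```

With one message queued on a single virtual network, wakeup() in this sketch visits only that network's links instead of every (network, link) pair.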


8264 commits
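
Commits 991d0185c6, c883729025 and d64226750e in the log above describe replacing per-instruction request allocation in the InOrder model with a fixed vector of reusable requests, sized by each resource's bandwidth and guarded by a valid bit. The following is a rough Python sketch of that pattern under stated assumptions, not the InOrder C++ implementation. The names set_request and clear_request echo the setRequest() and clearRequest() accessors named in the commit message, but their signatures and the first-free-slot policy here are assumptions.

```python
class ResourceRequest:
    """One reusable request slot (hypothetical layout)."""

    def __init__(self):
        self.valid = False  # the valid bit: is this slot in use?
        self.inst = None    # instruction currently holding the slot


class Resource:
    """Resource that preallocates one request per unit of bandwidth."""

    def __init__(self, bandwidth):
        # Allocate all requests once, at instantiation time.
        self.reqs = [ResourceRequest() for _ in range(bandwidth)]

    def set_request(self, inst):
        """Claim the first free slot; return its index, or None if full."""
        for slot, req in enumerate(self.reqs):
            if not req.valid:
                req.valid = True
                req.inst = inst
                return slot
        return None

    def clear_request(self, slot):
        """Release a slot so a later instruction can reuse it."""
        req = self.reqs[slot]
        req.valid = False
        req.inst = None
```

In this sketch a resource with bandwidth 3 accepts three instructions, refuses a fourth until a slot is cleared, and never allocates a new request object after construction.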