sanchayanmaity/gem5 - Sanchayan Maity's repositories

Author	SHA1	Message	Date
Ali Saidi	0bd986015b	cpu: Put all CPU instruction tracers in a single file	2015-01-25 07:22:17 -05:00
Ali Saidi	6c4a23c1c6	cpu: remove legion tracer If someone wants to debug with legion again they can restore the code from the repository, but no need to have it hang around indefinitely.	2015-01-25 07:22:05 -05:00
Curtis Dunham	10b5e5431d	sim: fix reference counting of PythonEvent When gem5 is a slave to another simulator and the Python is only used to initialize the configuration (and not perform actual simulation), a "debug start" (--debug-start) event will get freed during or immediately after the initial Python frame's execution rather than remaining in the event queue. This tricky patch fixes the GC issue causing this.	2014-12-23 11:51:40 -06:00
Andreas Hansson	10c69bb168	mem: Remove unused Packet src and dest fields This patch takes the final step in removing the src and dest fields in the packet. These fields were rather confusing in that they only remember a single multiplexing component, and pushed the responsibility to the bridge and caches to store the fields in a senderstate, thus effectively creating a stack. With the recent changes to the crossbar response routing the crossbar is now responsible without relying on the packet fields. Thus, these variables are now unused and can be removed.	2015-01-22 05:01:31 -05:00
Andreas Hansson	15c64035ed	mem: Remove Packet source from ForwardResponseRecord This patch removes the source field from the ForwardResponseRecord, but keeps the class as it is part of how the cache identifies responses to hardware prefetches that are snooped upwards.	2015-01-22 05:01:30 -05:00
Andreas Hansson	0c2ffd2daa	mem: Remove unused RequestState in the bridge This patch removes the bridge sender state as the Crossbar now takes care of remembering its own routing decisions.	2015-01-22 05:01:27 -05:00
Andreas Hansson	00536b0efc	mem: Always use SenderState for response routing in RubyPort This patch aligns how the response routing is done in the RubyPort, using the SenderState for both memory and I/O accesses. Before this patch, only the I/O used the SenderState, whereas the memory accesses relied on the src field in the packet. With this patch we shift to using SenderState in both cases, thus not relying on the src field any longer.	2015-01-22 05:01:24 -05:00
Andreas Hansson	072f78471d	mem: Make the XBar responsible for tracking response routing This patch removes the need for a source and destination field in the packet by shifting the onus of the tracking to the crossbar, much like a real implementation. This change in behaviour also means we no longer need a SenderState to remember the source/dest whenever we have multiple crossbars in the system. Thus, the stack that was created by the SenderState is not needed, and each crossbar locally tracks the response routing. The fields in the packet are still left behind as the RubyPort (which also acts as a crossbar) does routing based on them. In the succeeding patches the uses of the src and dest field will be removed. Combined, these patches improve the simulation performance by roughly 2%.	2015-01-22 05:01:14 -05:00
Andreas Hansson	fc8cb1fa76	stats: Update stats to reflect x86 table walker changes	2015-01-22 05:00:57 -05:00
Andreas Hansson	ce12d4bc63	x86: Delay X86 table walk on receiving walker response This patch fixes a minor issue in the X86 page table walker where it ended up sending new request packets to the crossbar before the response processing was finished (recvTimingResp is directly calling sendTimingReq). Under certain conditions this caused the crossbar to see illegal combinations of request/response overlap, in turn causing problems with a slightly modified crossbar implementation.	2015-01-22 05:00:54 -05:00
Andreas Hansson	f49830ce0b	mem: Clean up Request initialisation This patch tidies up how we create and set the fields of a Request. In essence it tries to use the constructor where possible (as opposed to setPhys and setVirt), thus avoiding spreading the information across a number of locations. In fact, setPhys is made private as part of this patch, and a number of places where we called setVirt instead use the appropriate constructor.	2015-01-22 05:00:53 -05:00
Malek Musleh	be3a952394	config, ruby: connect dma to network DMA Controller was not being connected to the network for the MESI_Three_Level protocol as was being done in the other protocol config files. Without this patch, this protocol segfaults during startup. Committed by: Nilay Vaish <nilay@cs.wisc.edu>	2015-01-20 14:15:28 -06:00
Nikos Nikoleris	a35283ac65	cpu: commit probe notification on every microop or macroop The ppCommit should notify the attached listener every time the cpu commits a microop or non-microcoded instruction. The listener can then decide whether it will process only the last microop (e.g. SimPoint probe). Committed by: Nilay Vaish <nilay@cs.wisc.edu>	2015-01-20 14:15:27 -06:00
Andreas Hansson	3cb9c361e2	scons: Do not build the InOrderCPU One step closer to shifting focus to the MinorCPU.	2015-01-20 08:12:45 -05:00
Andreas Hansson	de162ad968	tests: Remove deprecated InOrderCPU tests This patch removes the three MIPS and SPARC regressions that use the deprecated InOrderCPU. This is the first step in completely removing the code from the tree, avoiding confusion, and focusing all development efforts on the MinorCPU. Brave new world.	2015-01-20 08:12:02 -05:00
Andreas Hansson	6096e2f9c1	mem: Fix bug in cache request retry mechanism This patch ensures that inhibited packets that are about to be turned into express snoops do not update the retry flag in the cache.	2015-01-20 08:12:01 -05:00
Andreas Hansson	da0c770943	cpu: Fix retry bug in MinorCPU LSQ	2015-01-20 08:11:58 -05:00
Andreas Hansson	92585d60c9	mem: Move DRAM interleaving check to init This patch fixes a bug where the DRAM controller tried to access the system cacheline size before the system pointer was initialised. It also fixes a bug where the granularity is 0 (no interleaving).	2015-01-20 08:11:55 -05:00
Nilay Vaish	e76442e203	stats: changes due to recent changesets.	2015-01-10 18:06:43 -06:00
Emilio Castillo	7bb65dd434	x86: fxsave and fxrestore missing template code This patch corrects the FXSAVE and FXRSTOR Macroops. The actual code used for saving/restoring the FP registers is in the file but it was not used. The FXSAVE and FXRSTOR instructions are used in the kernel for saving and loading the state of the mmx, xmm and fpu registers. This operation is triggered in FS by issuing a Device Not Available Fault. The cr0 register has a TS flag that is set upon each context change. Every time a task accesses any FP related register (SIMD as well), if the TS flag is set to one, the Device Not Available Fault is issued. The kernel saves the current state of the registers, and restores the previous state of the currently running task. Right now gem5 lacks this capability. The Device Not Available Fault is never issued, leading to several problems when different threads share the same CPU and SMT is not used. The PARSEC Ferret benchmark is an example of this behavior. In order to test this, a hack in the atomic cpu code was done to detect if a static instruction has any FP operands and the cr0 reg TS bit is set. This check must be done in the ISA dependent code, but it seems to be tricky to access the cr0 register while executing an instruction. Committed by: Nilay Vaish <nilay@cs.wisc.edu>	2015-01-10 14:30:53 -06:00
Nikos Nikoleris	ec64b81a9d	cpu: fix RetiredStores probe point Committed by: Nilay Vaish <nilay@cs.wisc.edu>	2015-01-10 14:30:53 -06:00
cdirik	1693e526d0	dev: prevent intel 8254 timer counter events firing before startup This change includes edits to Intel8254Timer to prevent counter events firing before startup to comply with SimObject initialization call sequence. Committed by: Nilay Vaish <nilay@cs.wisc.edu>	2015-01-06 15:10:22 -07:00
Gabe Black	1c1fb2c988	test: Add a unittest for the BitUnion types.	2015-01-07 00:34:40 -08:00
Gabe Black	86dea86987	base: Fix assigning between identical bitfields. If two bitfields are of the same type, also implying that they have the same first and last bit positions, the existing implementation would copy the entire bitfield. That includes the __data member which is shared among all the bitfields, effectively overwriting the entire bitunion. This change also adjusts the write only signed bitfield assignment operator to be like the unsigned version, using "using" instead of implementing it again and calling down to the underlying implementation.	2015-01-07 00:31:46 -08:00
Gabe Black	d0284544ec	stats: x86: Update stats for the CPUID change.	2015-01-07 00:31:09 -08:00
Gabe Black	cd6380605c	x86: Enable three bits in the FamilyModelStepping ECX CPUID bitfield. These are for the monitor/mwait instructions, SSSE3, and XSAVE.	2015-01-06 22:15:00 -08:00
Gabe Black	cb181d6f91	cpuid, x86: Revert "Enabling more features in CPUid" That change enables CPUID bits for features that aren't implemented in gem5. If a simulated system tries to use those features because it was told it could, bad things can happen.	2015-01-06 22:13:56 -08:00
Nilay Vaish	e979e8d75e	stats: changes due to recent changesets.	2015-01-04 13:02:12 -06:00
Anthony Gutierrez	0d8d6e4441	arm: fix build_drive_system when not using default options When trying to dual boot on ARM, build_drive_system will only use the default values for the dtb file, number of processors, and disk image. If you are using non-default files, for example by passing values on the command line or by making a new entry in Benchmarks.py, the build config scripts will still look for the default files. This will lead to the wrong system files being used, or the simulator will fail if you do not have them. Committed by: Nilay Vaish <nilay@cs.wisc.edu>	2015-01-03 17:51:48 -06:00
Andrew Lukefahr	6d32004407	minor: fixed LSQ MasterPortID Minor was reporting the data cache access as ".inst" accesses. This just switches the MasterPortID to dataMasterPortId. Committed by: Nilay Vaish <nilay@cs.wisc.edu>	2015-01-03 17:51:48 -06:00
mike upton	cb911559dc	arm: Add unlinkat syscall implementation added ARM aarch64 unlinkat syscall support, modeled on other <xxx>at syscalls. This gets all of the cpu2006 int workloads passing in SE mode on aarch64. Committed by: Nilay Vaish <nilay@cs.wisc.edu>	2015-01-03 17:51:48 -06:00
Maxime Martinasso	5a5416d575	x86: implements the simd128 ADDSUBPD instruction This patch implements the simd128 ADDSUBPD instruction for the x86 architecture. Tested with a simple program in assembly language which executes the instruction. Checked that different versions of the instruction are executed by using the execution tracing option. Committed by: Nilay Vaish <nilay@cs.wisc.edu>	2015-01-03 17:51:48 -06:00
Cagdas Dirik	02c376ac44	dev: prevent RTC events firing before startup This change includes edits to MC146818 timer to prevent RTC events firing before startup to comply with SimObject initialization call sequence. Committed by: Nilay Vaish <nilay@cs.wisc.edu>	2015-01-03 17:51:48 -06:00
Nilay Vaish	1ee70e9d84	configs: ruby: removes bug introduced by 05b5a6cf3521	2015-01-03 17:51:48 -06:00
Joel Hestness	642b9b4fab	syscall_emul: Return correct writev value According to Linux man pages, if writev is successful, it returns the total number of bytes written. Otherwise, it returns an error code. Instead of returning 0, return the result from the actual call to writev in the system call.	2014-12-27 13:48:40 -06:00
Andreas Hansson	df8df4fd0a	stats: Bump stats for decoder, TLB, prefetcher and DRAM changes Changes due to speculative execution of an unaligned PC, introduction of TLB stats, changes and re-work of the prefetcher, and the introduction of rank-wise refresh in the DRAM controller.	2014-12-23 09:31:20 -05:00
Mitch Hayenga	b2342c5d9a	mem: Change prefetcher to use random_mt Prefetchers previously used rand() to generate random numbers.	2014-12-23 09:31:19 -05:00
Curtis Dunham	516e6046ae	mem: Hide WriteInvalidate requests from prefetchers Without this tweak, a prefetcher will happily prefetch data that will promptly be invalidated and overwritten by a WriteInvalidate.	2014-12-23 09:31:19 -05:00
Mitch Hayenga	bd4f901c77	mem: Fix event scheduling issue for prefetches The cache's MemSidePacketQueue schedules a sendEvent based upon nextMSHRReadyTime() which is the time when the next MSHR is ready or whenever a future prefetch is ready. However, a prefetch being ready does not guarantee that it can obtain an MSHR. So, when all MSHRs are full, the simulation ends up unnecessarily scheduling a sendEvent every picosecond until an MSHR is finally freed and the prefetch can happen. This patch fixes this by not signaling the prefetch ready time if the prefetch could not be generated. The event is rescheduled as soon as an MSHR becomes available.	2014-12-23 09:31:18 -05:00
Mitch Hayenga	4acd4a2055	mem: Fix bug relating to writebacks and prefetches Previously the code commented about an unhandled case where it might be possible for a writeback to arrive after a prefetch was generated but before it was sent to the memory system. I hit that case. Luckily the prefetchSquash() logic already in the code handles dropping prefetch requests in certain circumstances.	2014-12-23 09:31:18 -05:00
Mitch Hayenga	df82a2d003	mem: Rework the structuring of the prefetchers Re-organizes the prefetcher class structure. Previously the BasePrefetcher forced multiple assumptions on the prefetchers that inherited from it. This patch makes the BasePrefetcher class truly representative of base functionality. For example, the base class no longer enforces FIFO order. Instead, prefetchers with FIFO requests (like the existing stride and tagged prefetchers) now inherit from a new QueuedPrefetcher base class. Finally, the stride-based prefetcher now assumes a customizable lookup table (sets/ways) rather than the previous fully associative structure.	2014-12-23 09:31:18 -05:00
Mitch Hayenga	6cb58b2bd2	mem: Add parameter to reserve MSHR entries for demand access Adds a new parameter that reserves some number of MSHR entries for demand accesses. This helps prevent prefetchers from taking all MSHRs, forcing demand requests from the CPU to stall.	2014-12-23 09:31:18 -05:00
Curtis Dunham	4d88978913	arm: Add stats to table walker This patch adds table walker stats for: - Walk events - Instruction vs Data - Page size histogram - Wait time and service time histograms - Pending requests histogram (per cycle) - measures dist. of L (p(1..) = how often busy, p(0) = how often idle) - Squashes, before starting and after completion	2014-12-23 09:31:18 -05:00
Andreas Hansson	59460b91f3	config: Expose the DRAM ranks as a command-line option This patch gives the user direct influence over the number of DRAM ranks to make it easier to tune the memory density without affecting the bandwidth (previously the only means of scaling the device count was through the number of channels). The patch also adds some basic sanity checks to ensure that the number of ranks is a power of two (since we rely on bit slices in the address decoding).	2014-12-23 09:31:18 -05:00
Andreas Hansson	2f7baf9dbe	mem: Ensure DRAM controller is idle when in atomic mode This patch addresses an issue seen with the KVM CPU where the refresh events scheduled by the DRAM controller force the simulator to switch out of the KVM mode, thus killing performance. The current patch works around the fact that we currently have no proper API to inform a SimObject of the mode switches. Instead we rely on drainResume being called after any switch, and cache the previous mode locally to be able to decide on appropriate actions. The switcheroo regressions require a minor stats bump as a result.	2014-12-23 09:31:18 -05:00
Omar Naji	381d1da791	mem: Add rank-wise refresh to the DRAM controller This patch adds rank-wise refresh to the controller, as opposed to the channel-wide refresh currently in place. In essence each rank can be refreshed independently, and for this to be possible the controller is extended with a state machine per rank. Without this patch the data bus is always idle during a refresh, as all the ranks are refreshing at the same time. With the rank-wise refresh it is possible to use one rank while another one is refreshing, and thus the data bus can be kept busy. The patch introduces a Rank class to encapsulate the state per rank, and also shifts all the relevant banks, activation tracking etc to the rank. The arbitration is also updated to consider the state of the rank.	2014-12-23 09:31:18 -05:00
Omar Naji	152c02354e	mem: Fix a bug in the DRAM controller arbitration Fix a minor issue that affects multi-rank systems.	2014-12-23 09:31:18 -05:00
Andreas Hansson	e76e8e28a3	tests: Add a regression for the stack distance calculator Re-use the existing traffic generator regression, and enable the stack distance calculation in the comm monitor, along with the verification stack. The traffic generator config is also tuned to not increase the run-time too much (and actually have some address re-use).	2014-12-23 09:31:18 -05:00
Kanishk Sugand	7a25b1a0e0	mem: Add stack distance statistics to the CommMonitor This patch adds the stack distance calculator to the CommMonitor. The stats are disabled by default.	2014-12-23 09:31:18 -05:00
Kanishk Sugand	888975b29d	mem: Add a stack distance calculator This patch adds a stand-alone stack distance calculator. The stack distance calculator is a passive SimObject that observes the addresses passed to it. It calculates stack distances (LRU Distances) of incoming addresses based on the partial sum hierarchy tree algorithm described by Almási et al. http://doi.acm.org/10.1145/773039.773043. For each transaction a hashtable look-up is performed. At every non-unique transaction the tree is traversed from the leaf at the returned index to the root, the old node is deleted from the tree, and the sums (to the right) are collected and decremented. The collected sum represents the stack distance of the found node. At every unique transaction the stack distance is returned as numeric_limits<uint64>::max(). In addition to the basic stack distance calculation, a feature to mark an old node in the tree is added. This is useful if it is required to see the reuse pattern. For example, Writebacks to the lower level (e.g. membus from L2), can be marked instead of being removed from the stack (isMarked flag of Node set to True). And then later if this same address is accessed (by L1), the value of the isMarked flag would be True. This gives some insight on how the Writeback policy of the lower level affects the read/write accesses in an application. Debugging is enabled by setting the verify flag to true. Debugging is implemented using a dummy stack that behaves in a naive way, using STL vectors. Note that this has a large impact on run time.	2014-12-23 09:31:18 -05:00
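
The stack distance commit (888975b29d) mentions a naive verification stack alongside the tree-based algorithm. As a rough illustration only, and not the gem5 code, the naive approach can be sketched in Python. The function name `stack_distances`, the constant `INFINITE`, and the convention that an immediate re-access has distance 0 are assumptions made for this sketch; only the use of the uint64 maximum for first-time accesses comes from the commit message.

```python
# Naive LRU stack distance sketch (illustrative; not taken from gem5).
# Assumed convention: distance = number of distinct addresses touched
# since the previous access to the same address.

INFINITE = 2**64 - 1  # mirrors numeric_limits<uint64>::max() for unique accesses


def stack_distances(addresses):
    """Return the stack distance of each access in order."""
    stack = []  # least recently used first, most recently used last
    distances = []
    for addr in addresses:
        if addr in stack:
            idx = stack.index(addr)
            # Everything above idx was touched since the last access to addr.
            distances.append(len(stack) - 1 - idx)
            stack.pop(idx)
        else:
            # First access to this address: no finite reuse distance.
            distances.append(INFINITE)
        stack.append(addr)  # addr is now the most recently used entry
    return distances


if __name__ == "__main__":
    print(stack_distances(["a", "b", "c", "a", "b", "b"]))
```

Each look-up here is linear in the number of distinct addresses, which is consistent with the commit's note that the verification stack has a large impact on run time; the partial sum tree exists to avoid that cost.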

10914 commits