sanchayanmaity/gem5 - Sanchayan Maity's repositories

Author	SHA1	Message	Date
Andreas Hansson	0c2ffd2daa	mem: Remove unused RequestState in the bridge This patch removes the bridge sender state as the Crossbar now takes care of remembering its own routing decisions.	2015-01-22 05:01:27 -05:00
Andreas Hansson	00536b0efc	mem: Always use SenderState for response routing in RubyPort This patch aligns how the response routing is done in the RubyPort, using the SenderState for both memory and I/O accesses. Before this patch, only the I/O used the SenderState, whereas the memory accesses relied on the src field in the packet. With this patch we shift to using SenderState in both cases, thus not relying on the src field any longer.	2015-01-22 05:01:24 -05:00
Andreas Hansson	072f78471d	mem: Make the XBar responsible for tracking response routing This patch removes the need for a source and destination field in the packet by shifting the onus of the tracking to the crossbar, much like a real implementation. This change in behaviour also means we no longer need a SenderState to remember the source/dest whenever we have multiple crossbars in the system. Thus, the stack that was created by the SenderState is not needed, and each crossbar locally tracks the response routing. The fields in the packet are still left behind as the RubyPort (which also acts as a crossbar) does routing based on them. In the succeeding patches the uses of the src and dest field will be removed. Combined, these patches improve the simulation performance by roughly 2%.	2015-01-22 05:01:14 -05:00
Andreas Hansson	ce12d4bc63	x86: Delay X86 table walk on receiving walker response This patch fixes a minor issue in the X86 page table walker where it ended up sending new request packets to the crossbar before the response processing was finished (recvTimingResp is directly calling sendTimingReq). Under certain conditions this caused the crossbar to see illegal combinations of request/response overlap, in turn causing problems with a slightly modified crossbar implementation.	2015-01-22 05:00:54 -05:00
Andreas Hansson	f49830ce0b	mem: Clean up Request initialisation This patch tidies up how we create and set the fields of a Request. In essence it tries to use the constructor where possible (as opposed to setPhys and setVirt), thus avoiding spreading the information across a number of locations. In fact, setPhys is made private as part of this patch, and a number of places where we called setVirt instead use the appropriate constructor.	2015-01-22 05:00:53 -05:00
Nikos Nikoleris	a35283ac65	cpu: commit probe notification on every microop or macroop The ppCommit should notify the attached listener every time the cpu commits a microop or non microcoded instruction. The listener can then decide whether it will process only the last microop (eg. SimPoint probe). Committed by: Nilay Vaish <nilay@cs.wisc.edu>	2015-01-20 14:15:27 -06:00
Andreas Hansson	6096e2f9c1	mem: Fix bug in cache request retry mechanism This patch ensures that inhibited packets that are about to be turned into express snoops do not update the retry flag in the cache.	2015-01-20 08:12:01 -05:00
Andreas Hansson	da0c770943	cpu: Fix retry bug in MinorCPU LSQ	2015-01-20 08:11:58 -05:00
Andreas Hansson	92585d60c9	mem: Move DRAM interleaving check to init This patch fixes a bug where the DRAM controller tried to access the system cacheline size before the system pointer was initialised. It also fixes a bug where the granularity is 0 (no interleaving).	2015-01-20 08:11:55 -05:00
Emilio Castillo	7bb65dd434	x86: fxsave and fxrestore missing template code This patch corrects the FXSAVE and FXRSTOR Macroops. The actual code used for saving/restoring the FP registers is in the file but was not used. The FXSAVE and FXRSTOR instructions are used in the kernel for saving and loading the state of the MMX, XMM and FPU registers. This operation is triggered in FS by issuing a Device Not Available Fault. The cr0 register has a TS flag that is set upon each context change. Every time a task accesses any FP-related register (SIMD as well), if the TS flag is set to one, the Device Not Available Fault is issued. The kernel saves the current state of the registers and restores the previous state of the currently running task. Right now gem5 lacks this capability: the Device Not Available Fault is never issued, leading to several problems when different threads share the same CPU and SMT is not used. The PARSEC Ferret benchmark is an example of this behavior. In order to test this, a hack in the atomic CPU code was done to detect if a static instruction has any FP operands and the cr0 reg TS bit is set. This check must be done in the ISA dependent code. But it seems to be tricky to access the cr0 register while executing an instruction. Committed by: Nilay Vaish <nilay@cs.wisc.edu>	2015-01-10 14:30:53 -06:00
Nikos Nikoleris	ec64b81a9d	cpu: fix RetiredStores probe point Committed by: Nilay Vaish <nilay@cs.wisc.edu>	2015-01-10 14:30:53 -06:00
cdirik	1693e526d0	dev: prevent intel 8254 timer counter events firing before startup This change includes edits to Intel8254Timer to prevent counter events firing before startup to comply with SimObject initialization call sequence. Committed by: Nilay Vaish <nilay@cs.wisc.edu>	2015-01-06 15:10:22 -07:00
Gabe Black	1c1fb2c988	test: Add a unittest for the BitUnion types.	2015-01-07 00:34:40 -08:00
Gabe Black	86dea86987	base: Fix assigning between identical bitfields. If two bitfields are of the same type, also implying that they have the same first and last bit positions, the existing implementation would copy the entire bitfield. That includes the __data member which is shared among all the bitfields, effectively overwriting the entire bitunion. This change also adjusts the write only signed bitfield assignment operator to be like the unsigned version, using "using" instead of implementing it again and calling down to the underlying implementation.	2015-01-07 00:31:46 -08:00
Gabe Black	cd6380605c	x86: Enable three bits in the FamilyModelStepping ECX CPUID bitfield. These are for the monitor/mwait instructions, SSSE3, and XSAVE.	2015-01-06 22:15:00 -08:00
Gabe Black	cb181d6f91	cpuid, x86: Revert "Enabling more features in CPUid" That change enables CPUID bits for features that aren't implemented in gem5. If a simulated system tries to use those features because it was told it could, bad things can happen.	2015-01-06 22:13:56 -08:00
Andrew Lukefahr	6d32004407	minor: fixed LSQ MasterPortID Minor was reporting the data cache access as ".inst" accesses. This just switches the MasterPortID to dataMasterPortId. Committed by: Nilay Vaish <nilay@cs.wisc.edu>	2015-01-03 17:51:48 -06:00
mike upton	cb911559dc	arm: Add unlinkat syscall implementation added ARM aarch64 unlinkat syscall support, modeled on other <xxx>at syscalls. This gets all of the cpu2006 int workloads passing in SE mode on aarch64. Committed by: Nilay Vaish <nilay@cs.wisc.edu>	2015-01-03 17:51:48 -06:00
Maxime Martinasso	5a5416d575	x86: implements the simd128 ADDSUBPD instruction This patch implements the simd128 ADDSUBPD instruction for the x86 architecture. Tested with a simple program in assembly language which executes the instruction. Checked that different versions of the instruction are executed by using the execution tracing option. Committed by: Nilay Vaish <nilay@cs.wisc.edu>	2015-01-03 17:51:48 -06:00
Cagdas Dirik	02c376ac44	dev: prevent RTC events firing before startup This change includes edits to MC146818 timer to prevent RTC events firing before startup to comply with SimObject initialization call sequence. Committed by: Nilay Vaish <nilay@cs.wisc.edu>	2015-01-03 17:51:48 -06:00
Joel Hestness	642b9b4fab	syscall_emul: Return correct writev value According to Linux man pages, if writev is successful, it returns the total number of bytes written. Otherwise, it returns an error code. Instead of returning 0, return the result from the actual call to writev in the system call.	2014-12-27 13:48:40 -06:00
Mitch Hayenga	b2342c5d9a	mem: Change prefetcher to use random_mt Prefetchers previously used rand() to generate random numbers.	2014-12-23 09:31:19 -05:00
Curtis Dunham	516e6046ae	mem: Hide WriteInvalidate requests from prefetchers Without this tweak, a prefetcher will happily prefetch data that will promptly be invalidated and overwritten by a WriteInvalidate.	2014-12-23 09:31:19 -05:00
Mitch Hayenga	bd4f901c77	mem: Fix event scheduling issue for prefetches The cache's MemSidePacketQueue schedules a sendEvent based upon nextMSHRReadyTime() which is the time when the next MSHR is ready or whenever a future prefetch is ready. However, a prefetch being ready does not guarantee that it can obtain an MSHR. So, when all MSHRs are full, the simulation ends up unnecessarily scheduling a sendEvent every picosecond until an MSHR is finally freed and the prefetch can happen. This patch fixes this by not signaling the prefetch ready time if the prefetch could not be generated. The event is rescheduled as soon as a MSHR becomes available.	2014-12-23 09:31:18 -05:00
Mitch Hayenga	4acd4a2055	mem: Fix bug relating to writebacks and prefetches Previously the code commented about an unhandled case where it might be possible for a writeback to arrive after a prefetch was generated but before it was sent to the memory system. I hit that case. Luckily the prefetchSquash() logic already in the code handles dropping prefetch requests in certain circumstances.	2014-12-23 09:31:18 -05:00
Mitch Hayenga	df82a2d003	mem: Rework the structuring of the prefetchers Re-organizes the prefetcher class structure. Previously the BasePrefetcher forced multiple assumptions on the prefetchers that inherited from it. This patch makes the BasePrefetcher class truly representative of base functionality. For example, the base class no longer enforces FIFO order. Instead, prefetchers with FIFO requests (like the existing stride and tagged prefetchers) now inherit from a new QueuedPrefetcher base class. Finally, the stride-based prefetcher now assumes a customizable lookup table (sets/ways) rather than the previous fully associative structure.	2014-12-23 09:31:18 -05:00
Mitch Hayenga	6cb58b2bd2	mem: Add parameter to reserve MSHR entries for demand access Adds a new parameter that reserves some number of MSHR entries for demand accesses. This helps prevent prefetchers from taking all MSHRs, forcing demand requests from the CPU to stall.	2014-12-23 09:31:18 -05:00
Curtis Dunham	4d88978913	arm: Add stats to table walker This patch adds table walker stats for: - Walk events - Instruction vs Data - Page size histogram - Wait time and service time histograms - Pending requests histogram (per cycle) - measures dist. of L (p(1..) = how often busy, p(0) = how often idle) - Squashes, before starting and after completion	2014-12-23 09:31:18 -05:00
Andreas Hansson	59460b91f3	config: Expose the DRAM ranks as a command-line option This patch gives the user direct influence over the number of DRAM ranks to make it easier to tune the memory density without affecting the bandwidth (previously the only means of scaling the device count was through the number of channels). The patch also adds some basic sanity checks to ensure that the number of ranks is a power of two (since we rely on bit slices in the address decoding).	2014-12-23 09:31:18 -05:00
Andreas Hansson	2f7baf9dbe	mem: Ensure DRAM controller is idle when in atomic mode This patch addresses an issue seen with the KVM CPU where the refresh events scheduled by the DRAM controller force the simulator to switch out of the KVM mode, thus killing performance. The current patch works around the fact that we currently have no proper API to inform a SimObject of the mode switches. Instead we rely on drainResume being called after any switch, and cache the previous mode locally to be able to decide on appropriate actions. The switcheroo regression requires a minor stats bump as a result.	2014-12-23 09:31:18 -05:00
Omar Naji	381d1da791	mem: Add rank-wise refresh to the DRAM controller This patch adds rank-wise refresh to the controller, as opposed to the channel-wide refresh currently in place. In essence each rank can be refreshed independently, and for this to be possible the controller is extended with a state machine per rank. Without this patch the data bus is always idle during a refresh, as all the ranks are refreshing at the same time. With the rank-wise refresh it is possible to use one rank while another one is refreshing, and thus the data bus can be kept busy. The patch introduces a Rank class to encapsulate the state per rank, and also shifts all the relevant banks, activation tracking etc to the rank. The arbitration is also updated to consider the state of the rank.	2014-12-23 09:31:18 -05:00
Omar Naji	152c02354e	mem: Fix a bug in the DRAM controller arbitration Fix a minor issue that affects multi-rank systems.	2014-12-23 09:31:18 -05:00
Kanishk Sugand	7a25b1a0e0	mem: Add stack distance statistics to the CommMonitor This patch adds the stack distance calculator to the CommMonitor. The stats are disabled by default.	2014-12-23 09:31:18 -05:00
Kanishk Sugand	888975b29d	mem: Add a stack distance calculator This patch adds a stand-alone stack distance calculator. The stack distance calculator is a passive SimObject that observes the addresses passed to it. It calculates stack distances (LRU Distances) of incoming addresses based on the partial sum hierarchy tree algorithm described by Almasi et al. http://doi.acm.org/10.1145/773039.773043. For each transaction a hashtable look-up is performed. At every non-unique transaction the tree is traversed from the leaf at the returned index to the root, the old node is deleted from the tree, and the sums (to the right) are collected and decremented. The collected sum represents the stack distance of the found node. At every unique transaction the stack distance is returned as numeric_limits<uint64>::max(). In addition to the basic stack distance calculation, a feature to mark an old node in the tree is added. This is useful if it is required to see the reuse pattern. For example, Writebacks to the lower level (e.g. membus from L2), can be marked instead of being removed from the stack (isMarked flag of Node set to True). And then later if this same address is accessed (by L1), the value of the isMarked flag would be True. This gives some insight on how the Writeback policy of the lower level affects the read/write accesses in an application. Debugging is enabled by setting the verify flag to true. Debugging is implemented using a dummy stack that behaves in a naive way, using STL vectors. Note that this has a large impact on run time.	2014-12-23 09:31:18 -05:00
Marco Elver	dd0f3943e2	mem: Add MemChecker and MemCheckerMonitor This patch adds the MemChecker and MemCheckerMonitor classes. While MemChecker can be integrated anywhere in the system and is independent, the most convenient usage is through the MemCheckerMonitor -- this however, puts limitations on where the MemChecker is able to observe read/write transactions.	2014-12-23 09:31:17 -05:00
Andreas Sandberg	184fefbb3b	arm: Raise an alignment fault if a PC has illegal alignment We currently don't handle unaligned PCs correctly. There is one check for unaligned PCs in the TLB when running in aarch64 mode, but this check does not cover cases where the CPU does not do a TLB lookup when decoding an instruction (e.g., a branch stays within the same cache line). Additionally, the Decoder class sometimes throws an assertion for unaligned PCs which breaks speculation. This changeset introduces a decoder fault bit field in the ExtMachInst structure. This field can be used to signal a decoder failure. If set, the decoder generates an internal gem5fault instruction instead of a normal instruction. This instruction in turn either panics (fault type PANIC), returns a PCAlignmentFault (fault type UNALIGNED, aarch64) or PrefetchAbort (fault type UNALIGNED, aarch32). The patch causes minor changes to the realview64 regressions, and a stats bump will follow.	2014-12-23 09:31:17 -05:00
Andreas Sandberg	b33812ba43	arm: Clean up and document decoder API This changeset adds more documentation to the ArmISA::Decoder class and restructures it slightly to make API groups more obvious.	2014-12-23 09:31:17 -05:00
Andreas Sandberg	070b4a81db	arm: Add support for filtering in the PMU This patch adds support for filtering events in the PMU. In order to do so, it updates the ISADevice base class to forward an ISA pointer to ISA devices. This enables such devices to access the MiscReg file to determine the current execution level.	2014-12-23 09:31:17 -05:00
Gabe Black	70eb68beae	Let other objects set up memory like regions in a KVM VM.	2014-12-09 21:53:44 -08:00
Andreas Sandberg	9b7578d8c7	arm: Fix decoding of PMXEVTYPER_EL0 and PMCCFILTR_EL0 The aarch64 system register decoder is currently not decoding PMXEVTYPER_EL0 and PMCCFILTR_EL0 correctly. This changeset updates the decoder so that they are decoded using the values in table C5-6 in ARM DDI 0478A.c.	2014-12-08 04:49:53 -05:00
Andreas Sandberg	6a9fbd295d	dev: Add response sanity checks in PioPort Add an assert in the PioPort that checks if a response packet from a device has the right flags set before passing it to the rest of the memory system.	2014-12-08 04:49:52 -05:00
Andreas Sandberg	1ccc4e0e21	dev: Correctly transform packets into responses The VirtIO devices didn't correctly set the response flags in memory packets. This changeset adds the required Packet::makeResponse() calls.	2014-12-08 04:49:51 -05:00
Gabe Black	4a8a0a0798	misc: Generalize GDB single stepping. The new single stepping implementation for x86 doesn't rely on any ISA specific properties or functionality. This change pulls out the per ISA implementation of those functions and promotes the X86 implementation to the base class. One drawback of that implementation is that the CPU might stop on an instruction twice if it's affected by both breakpoints and single stepping. While that might be a little surprising, it's harmless and would only happen under somewhat unlikely circumstances.	2014-12-05 22:37:03 -08:00
Gabe Black	fb07d43b1a	x86: Implement a remote GDB stub. This stub should allow remote debugging of 32 bit and 64 bit targets. Single stepping seems to work, as do breakpoints. If both breakpoints and single stepping affect an instruction, gdb will stop at the instruction twice before continuing. That's a little surprising, but is generally harmless.	2014-12-05 22:36:16 -08:00
Gabe Black	16c9b41616	misc: Add some utility functions for schedule inst commit events. These can be used to simplify the implementation of single step in derived classes.	2014-12-05 22:35:47 -08:00
Gabe Black	cddf988bfd	misc: Rename the GDB "Event" event class to InputEvent. The "Event" name is the same as the base event class. That's a bit confusing, and makes it a little awkward to add other event types.	2014-12-05 22:34:42 -08:00
Gabe Black	f9f46b8fa9	sim: Ensure GDB interrupts the simulation at an instruction boundary. Use the comInstEventQueue to ensure GDB interrupts the simulation at an instruction boundary and not in the middle of a macroop, memory access, etc.	2014-12-05 01:51:49 -08:00
Gabe Black	bacbb8ecbc	cpu: Only check for PC events on instruction boundaries. Only the instruction address is actually checked, so there's no need to check repeatedly while we're working through the microops of a macroop and that's not changing.	2014-12-05 01:47:35 -08:00
Gabe Black	fe48c0a32b	misc: Make the GDB register cache accessible in various sized chunks. Not all ISAs have 64 bit sized registers, so it's not always very convenient to access the GDB register cache in 64 bit sized chunks. This change makes it accessible in 8, 16, 32, or 64 bit chunks. The MIPS and ARM implementations were working around that limitation by bundling and unbundling 32 bit values into 64 bit values. That code has been removed.	2014-12-05 01:44:24 -08:00
Gabe Black	22aaa5867f	x86: Rework opcode parsing to support 3 byte opcodes properly. Instead of counting the number of opcode bytes in an instruction and recording each byte before the actual opcode, we can represent the path we took to get to the actual opcode byte by using a type code. That has a couple of advantages. First, we can disambiguate the properties of opcodes of the same length which have different properties. Second, it reduces the amount of data stored in an ExtMachInst, making them slightly easier/faster to create and process. This also adds some flexibility as far as how different types of opcodes are handled, which might come in handy if we decide to support VEX or XOP instructions. This change also adds tables to support properly decoding 3 byte opcodes. Before we would fall off the end of some arrays, on top of the ambiguity described above. This change doesn't measurably affect performance on the twolf benchmark. --HG-- rename : src/arch/x86/isa/decoder/three_byte_opcodes.isa => src/arch/x86/isa/decoder/three_byte_0f38_opcodes.isa rename : src/arch/x86/isa/decoder/three_byte_opcodes.isa => src/arch/x86/isa/decoder/three_byte_0f3a_opcodes.isa	2014-12-04 15:53:54 -08:00
Gabe Black	3069c28a02	arch: Allow named constants as decode case values. The values in a "bitfield" or in an ExtMachInst structure member may not be a literal value, it might select from an arbitrary collection of options. Instead of using the raw value of those constants in the decoder, it's easier to tell what's going on if they can be referred to as a symbolic constant/enum. To support that, the ISA description language is extended slightly so that in addition to integer literals, the case value for decode blobs can also be a string literal. It's up to the ISA author to ensure that the string evaluates to a legal constant value when interpreted as C++.	2014-12-04 15:52:48 -08:00
Gabe Black	d67cf81f5d	x86: Clean up style in process.cc.	2014-12-02 22:01:51 -08:00
Gabe Black	2d9dae01fb	sim: Make it possible to override the breakpoint length check. The check which makes sure the length of the breakpoint being written is the same as a MachInst is only correct on fixed instruction width ISAs. Instead of incorrectly applying that check to all ISAs, this change makes that the default check and lets ISA specific GDB classes override it.	2014-12-03 03:27:19 -08:00
Gabe Black	ecec8cde63	ide: Accept the IDLE (0xe3) ATA command. This command is supposed to set up a timer which will put the drive into a standby mode if it isn't sent a command within a given time out. Since most of the timeouts are generally significantly longer than a simulation would run anyway, and we don't have an implementation for standby mode to begin with, we can accept the command, do nothing, and report success.	2014-12-03 03:07:35 -08:00
Gabe Black	bce58726f3	dev: Support translating left and right ALT keys. This is used primarily for VNC.	2014-12-03 03:06:03 -08:00
Andreas Hansson	966c3f4bc5	scons: Ensure dictionary iteration is sorted by key This patch adds sorting based on the SimObject name or parameter name for all situations where we iterate over dictionaries. This should ensure a deterministic and consistent order across the host systems and hopefully avoid regression results differing across python versions.	2014-12-02 06:08:22 -05:00
Curtis Dunham	5d22250845	mem: Support WriteInvalidate (again) This patch takes a clean-slate approach to providing WriteInvalidate (write streaming, full cache line writes without first reading) support. Unlike the prior attempt, which took an aggressive approach of directly writing into the cache before handling the coherence actions, this approach follows the existing cache flows as closely as possible.	2014-12-02 06:08:19 -05:00
Curtis Dunham	7ca27dd3cc	mem: Remove WriteInvalidate support Prepare for a different implementation following in the next patch	2014-12-02 06:08:17 -05:00
Andrew Bardsley	df37cad0fd	cpu: Fix retries on barrier/store in Minor's store buffer This patch fixes a case where a store in Minor's store buffer never leaves the store buffer as it is prematurely counted as having been issued, leading to the store buffer idling. LSQ::StoreBuffer::numUnissuedAccesses should count the number of accesses either in memory, or still in the store buffer after being completed. For stores which are also barriers, the store will stay in the store buffer for a cycle after it is completed and will be cleaned up by the barrier clearing code (to ensure that barriers are completed in-order). To achieve this, numUnissuedAccesses is not decremented when a store-barrier is issued to memory, but when its barrier effect is cleared. Without this patch, the correct behaviour happens when a memory transaction is immediately accepted, but not if it needs a retry.	2014-12-02 06:08:15 -05:00
Andrew Bardsley	98f3e7a310	cpu: Fix memoryIssueLimit checking in Minor This patch fixes the checking of the number of memory instructions issued per cycles in the Minor CPU.	2014-12-02 06:08:13 -05:00
Andrew Bardsley	3cd0b1f6a6	arm: Fix TLB ignoring faults when table walking This patch fixes a case where the Minor CPU can deadlock due to the lack of a response to TLB request because of a bug in fault handling in the ARM table walker. TableWalker::processWalkWrapper is the scheduler-called wrapper which handles deferred walks which calls to TableWalker::wait cannot immediately process. The handling of faults generated by processWalk{AArch64,LPAE,} calls in those two functions is different. processWalkWrapper ignores fault returns from processWalk... which can lead to ::finish not being called on a translation. This fix provides fault handling in processWalkWrapper similar to that found in the leaf functions which call BaseTLB::Translation::finish.	2014-12-02 06:08:11 -05:00
Marco Elver	9649395f85	cpu, o3: Ignored invalidate causing same-address load reordering In case the memory subsystem sends a combined response with invalidate (e.g. ReadRespWithInvalidate), we cannot ignore the invalidate part of the response. If we were to ignore the invalidate part, under certain circumstances this effectively leads to reordering of loads to the same address which is not permitted under any memory consistency model implemented in gem5. Consider the case where a later load's address is computed before an earlier load in program order, and is therefore sent to the memory subsystem first. At some point the earlier load's address is computed and in doing so correctly marks the later load as a possibleLoadViolation. In the meantime some other node writes and sends invalidations to all other nodes. The invalidation races with the later load's ReadResp, and arrives before ReadResp and is deferred. Upon receipt of the ReadResp, the response is changed to ReadRespWithInvalidate, and sent to the CPU. If we ignore the invalidate part of the packet, we let the later load read the old value of the address. Eventually the earlier load's ReadResp arrives, but with new data. As there was no invalidate snoop (sunk into the ReadRespWithInvalidate), and if we did not process the invalidate of the ReadRespWithInvalidate, we obtain a load reordering. A similar scenario can be constructed where the earlier load's address is computed after ReadRespWithInvalidate arrives for the younger load. In this case hitExternalSnoop needs to be set to true on the ReadRespWithInvalidate, so that upon knowing the address of the earlier load, checkViolations will cause the later load to be squashed. Finally we must account for the case where both loads are sent to the memory subsystem (reordered), a snoop invalidate arrives and correctly sets the later load's fault to ReExec. However, before the CPU processes the fault, the later load's ReadResp arrives and the writeback discards the outstanding fault. We must add a check to ensure that we do not skip any unprocessed faults.	2014-12-02 06:08:03 -05:00
Andreas Hansson	74bbe20141	cpu: Always mask the snoop address when performing lock check Ensure the snoop address check is always using a cache-block aligned address. This patch updates Alpha and Mips to match the other ISAs.	2014-12-02 06:08:00 -05:00
Stephan Diestelhorst	810349a8a7	cpu: Move packet deallocation to recvTimingResp in the O3 CPU Move the packet deallocations in the O3 CPU so that the completeDataAccess deals only with the LSQ specific parts and the generic recvTimingResp frees the packet in all other cases.	2014-12-02 06:07:58 -05:00
Andreas Hansson	5c84157c29	mem: Relax packet src/dest check and shift onus to crossbar This patch allows objects to get the src/dest of a packet even if it is not set to a valid port id. This simplifies (ab)using the bridge as a buffer and latency adapter in situations where the neighbouring MemObjects are not crossbars. The checks that were done in the packet are now shifted to the crossbar where the fields are used to index into the port arrays. Thus, the carrier of the information is not burdened with checking, and the crossbar can check not only that the destination is set, but also that the port index is within limits.	2014-12-02 06:07:56 -05:00
Andreas Hansson	ea5ccc7041	mem: Clean up packet data allocation This patch attempts to make the rules for data allocation in the packet explicit, understandable, and easy to verify. The constructor that copies a packet is extended with an additional flag "alloc_data" to enable the call site to explicitly say whether the newly created packet is short-lived (a zero-time snoop), or has an unknown life-time and therefore should allocate its own data (or copy a static pointer in the case of static data). The tricky case is the static data. In essence this is a copy-avoidance scheme where the original source of the request (DMA, CPU etc) does not ask the memory system to return data as part of the packet, but instead provides a pointer, and then the memory system carries this pointer around, and copies the appropriate data to the location itself. Thus any derived packet actually never copies any data. As the original source does not copy any data from the response packet when arriving back at the source, we must maintain the copy of the original pointer to not break the system. We might want to revisit this one day and pay the price for a few extra memcpy invocations. All in all this patch should make it easier to grok what is going on in the memory system and how data is actually copied (or not).	2014-12-02 06:07:54 -05:00
Andreas Hansson	f012166bb6	mem: Cleanup Packet::checkFunctional and hasData usage This patch cleans up the use of hasData and checkFunctional in the packet. The hasData function is unfortunately suggesting that it checks if the packet has a valid data pointer, when it does in fact only check if the specific packet type is specified to have a data payload. The confusion led to a bug in checkFunctional. The latter function is also tidied up to avoid name overloading.	2014-12-02 06:07:52 -05:00
Andreas Hansson	a2ee51f631	mem: Make the requests carried by packets const This adds a basic level of sanity checking to the packet by ensuring that a request is not modified once the packet is created. The only issue that had to be worked around is the relaying of software-prefetches in the cache. The specific situation is now solved by first copying the request, and then creating a new packet accordingly.	2014-12-02 06:07:50 -05:00
Andreas Hansson	fa60d5cf27	mem: Make Request getters const This patch tidies up the Request class, making all getters const. The odd one out is incAccessDepth which is called by the memory system as packets carry the request around. This is also const to enable the packet to hold on to a const Request.	2014-12-02 06:07:48 -05:00
Andreas Hansson	3d6ec81e66	mem: Add checks and explanation for assertMemInhibit usage	2014-12-02 06:07:46 -05:00
Andreas Hansson	41846cb61b	mem: Assume all dynamic packet data is array allocated This patch simplifies how we deal with dynamically allocated data in the packet, always assuming that it is array allocated, and hence should be array deallocated (delete[] as opposed to delete). The only uses of dataDynamic was in the Ruby testers. The ARRAY_DATA flag in the packet is removed accordingly. No defragmentation of the flags is done at this point, leaving a gap in the bit masks. As the last part of the patch, it renames dataDynamicArray to dataDynamic.	2014-12-02 06:07:43 -05:00
Andreas Hansson	5df96cb690	mem: Remove redundant Packet::allocate calls This patch cleans up the packet memory allocation confusion. The data is always allocated at the requesting side, when a packet is created (or copied), and there is never a need for any device to allocate any space if it is merely responding to a packet. This behaviour is in line with how SystemC and TLM work as well, thus increasing interoperability, and matching established conventions. The redundant calls to Packet::allocate are removed, and the checks in the function are tightened up to make sure data is only ever allocated once. There are still some oddities in the packet copy constructor where we copy the data pointer if it is static (without ownership), and allocate new space if the data is dynamic (with ownership). The latter is being worked on further in a follow-on patch.	2014-12-02 06:07:41 -05:00
Andreas Hansson	0706a25203	mem: Use const pointers for port proxy write functions This patch changes the various write functions in the port proxies to use const pointers for all sources (similar to how memcpy works). The one unfortunate aspect is the need for a const_cast in the packet, to avoid having to juggle a const and a non-const data pointer. This design decision can always be re-evaluated at a later stage.	2014-12-02 06:07:38 -05:00
Andreas Hansson	9779ba2e37	mem: Add const getters for write packet data This patch takes a first step in tightening up how we use the data pointer in write packets. A const getter is added for the pointer itself (getConstPtr), and a number of member functions are also made const accordingly. In a range of places throughout the memory system the new member is used. The patch also removes the unused isReadWrite function.	2014-12-02 06:07:36 -05:00
Andreas Hansson	25bfc24999	mem: Remove null-check bypassing in Packet::getPtr This patch removes the parameter that enables bypassing the null check in the Packet::getPtr method. A number of call sites assume the value to be non-null. The one odd case is the RubyTester, which issues zero-sized prefetches(!), and despite being reads they had no valid data pointer. This is now fixed, but the size oddity remains (unless anyone objects or has any good suggestions). Finally, in the Ruby Sequencer, appropriate checks are made for flush packets as they have no valid data pointer.	2014-12-02 06:07:34 -05:00
Omar Naji	0e63d2cd62	mem: Add a GDDR5 DRAM config This patch adds a first cut GDDR5 config to accommodate the users combining gem5 and GPUSim. The config is based on a SK Hynix datasheet, and the Nvidia GTX580 specification. Someone from the GPUSim user-camp should tweak the default page-policy and static frontend and backend latencies.	2014-12-02 06:07:32 -05:00
Andreas Hansson	d66b14ca61	misc: Another round of static analysis fixups Mostly addressing uninitialised members.	2014-11-24 09:03:38 -05:00
Alexandru Dutu	1f539f13c3	mem: Page Table map api modification This patch adds uncacheable/cacheable and read-only/read-write attributes to the map method of PageTableBase. It also modifies the constructor of TlbEntry structs for all architectures to consider the new attributes.	2014-11-23 18:01:09 -08:00
Alexandru Dutu	c11bcb8119	mem: Multi Level Page Table bug fix The multi level page table was giving false positives for already mapped translations. This patch fixes the bogus behavior.	2014-11-23 18:01:09 -08:00
Alexandru Dutu	e4859fae5b	mem: Page Table long lines Trimmed down all the lines greater than 78 characters.	2014-11-23 18:01:09 -08:00
Alexandru Dutu	f743bdcb69	x86: Segment initialization to support KvmCPU in SE This patch sets up low and high privilege code and data segments and places them in the following order: cs low, ds low, ds, cs, in the GDT. Additionally, a syscall and page fault handler for KvmCPU in SE mode are defined. The order of the segment selectors in GDT is required in this manner for interrupt handling to work properly. Segment initialization is done for all the thread contexts.	2014-11-23 18:01:08 -08:00
Alexandru Dutu	adbaa4dfde	kvm, x86: Adding support for SE mode execution This patch adds methods in KvmCPU model to handle KVM exits caused by syscall instructions and page faults. These types of exits will be encountered if KvmCPU is run in SE mode.	2014-11-23 18:01:08 -08:00
Alexandru Dutu	335514dfdc	cpuid, x86: Enabling more features in CPUid Adding more features in the CPUid with the purpose of supporting running the KvmCPU in SE mode.	2014-11-23 18:01:08 -08:00
Gabe Black	8bbfb1b39d	x86: pc: Put a stub IO device at port 0xed which the kernel can use for delays. There was already a stub device at 0x80, the port traditionally used for an IO delay. 0x80 is also the port used for POST codes sent by firmware, and that may have prompted adding this port as a second option.	2014-11-21 17:22:02 -08:00
Gabe Black	b5fd6050a2	dev: Use fixed size member variables to describe fixed size PL111 registers.	2014-11-18 02:38:23 -08:00
Gabe Black	a08cfd797b	vnc: Add a conversion function for bgr888.	2014-11-17 01:45:42 -08:00
Gabe Black	aceeecb192	x86: Fix setting segment bases in real mode. The data size used for actually writing the base value for the segment was the default size, but really it should set the entire value without any possible truncation.	2014-11-17 01:00:53 -08:00
Gabe Black	f8603fa120	x86: Fix some bugs in the real mode far jmp instruction. The far pointer should be shifted right to get the selector value, not left. Also, when calculating the width of the offset, the wrong register was used in one spot.	2014-11-17 00:20:01 -08:00
Gabe Black	7739c24fbe	x86: APIC: Only set deliveryStatus if our IPI is going somewhere. Otherwise the IPI which isn't sent will never arrive, and the deliveryStatus bit will never be cleared.	2014-11-17 00:19:07 -08:00
Gabe Black	79e7ca307e	x86: APIC: Fix the getRegArrayBit function. The getRegArrayBit function extracts a bit from a series of registers which are treated as a single large bit array. A previous change had modified the logic which figured out which bit to extract from ">> 5" to "% 5" which seems wrong, especially when other, similar functions were changed to use "% 32".	2014-11-17 00:17:06 -08:00
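The indexing that the getRegArrayBit row above describes can be sketched as follows. This is a minimal illustration and not gem5's actual code: the names `regIndex`, `bitOffset`, and `getRegArrayBitSketch` are hypothetical, and 32-bit registers are assumed because the message mentions `>> 5` and `% 32`.

```cpp
#include <cstdint>

// Assumption: every register in the array holds 32 bits.
constexpr unsigned BitsPerReg = 32;

// Index of the register that holds the requested bit (same as bit >> 5).
constexpr unsigned regIndex(unsigned bit) { return bit / BitsPerReg; }

// Position of the requested bit inside that register.
// Using "% 5" here would be wrong: it can only ever select bits 0 to 4.
constexpr unsigned bitOffset(unsigned bit) { return bit % BitsPerReg; }

// Hypothetical helper: treat regs[] as one large bit array and read one bit.
inline bool getRegArrayBitSketch(const std::uint32_t *regs, unsigned bit)
{
    return ((regs[regIndex(bit)] >> bitOffset(bit)) & 1u) != 0;
}
```

For example, bit 40 lives in register 1 at offset 8, whereas `40 % 5` would select offset 0.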
Gabe Black	d228db1143	x86: Fix the CPUID Long Mode Address Size function. The value in EAX has an 8 bit field for the linear address size and one for the physical address size when calling that function. A recent change implemented it but returned 0xff for both of those fields. That implies that linear and physical addresses are 255 bits wide which is wrong. When using the KVM CPU model this causes an error, presumably because some of those bits are actually reserved, or the CPU or kernel realizes 255 bits is a bad value. This change makes those values 48.	2014-11-16 23:12:42 -08:00
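The two 8-bit fields mentioned in the CPUID row above can be illustrated with a short sketch. It assumes the usual layout of CPUID function 0x80000008, with the physical address size in EAX bits 7:0 and the linear address size in bits 15:8; the function names are hypothetical and this is not the code from the commit.

```cpp
#include <cstdint>

// Pack the two address-size fields into an EAX value.
// Assumed layout: bits 7:0 physical address bits, bits 15:8 linear address bits.
constexpr std::uint32_t packAddrSizes(std::uint8_t physBits,
                                      std::uint8_t linearBits)
{
    return (static_cast<std::uint32_t>(linearBits) << 8) | physBits;
}

// Extract the fields again, e.g. to sanity-check a value before returning it.
constexpr unsigned physAddrBits(std::uint32_t eax) { return eax & 0xff; }
constexpr unsigned linearAddrBits(std::uint32_t eax) { return (eax >> 8) & 0xff; }
```

With 48 in both fields the low 16 bits of EAX are 0x3030; returning 0xff in both fields would instead claim 255-bit addresses, which is the problem the commit message describes.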
Andreas Hansson	481eb6ae80	arm: Fixes based on UBSan and static analysis Another churn to clean up undefined behaviour, mostly ARM, but some parts also touching the generic part of the code base. Most of the fixes simply ensure proper initialisation. One of the more subtle changes is the return type of the sign-extension, which is changed to uint64_t. This is to avoid shifting negative values (undefined behaviour) in the ISA code.	2014-11-14 03:53:51 -05:00
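The point about the sign-extension return type in the row above can be sketched as follows. This is an illustrative helper under stated assumptions, not gem5's own sign-extension function: the name `signExtend` is hypothetical, and it uses the mask/xor/subtract idiom so that every intermediate value stays unsigned.

```cpp
#include <cstdint>

// Sign-extend the low N bits of val to 64 bits, returning an unsigned value.
// All arithmetic is done on uint64_t, so no negative value is ever shifted.
template <int N>
constexpr std::uint64_t signExtend(std::uint64_t val)
{
    static_assert(N > 0 && N < 64, "N must be in the range [1, 63]");
    return ((val & ((static_cast<std::uint64_t>(1) << N) - 1))
            ^ (static_cast<std::uint64_t>(1) << (N - 1)))
           - (static_cast<std::uint64_t>(1) << (N - 1));
}
```

A caller that later shifts the result left, as ISA code often does when forming offsets, then shifts an unsigned value, which is well defined.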
Andreas Hansson	9ffe0e7ba6	mem: Clarify unit of DRAM controller buffer size	2014-11-14 03:53:48 -05:00
Mitch Hayenga	9d6d8e02aa	mem: Delete unused variable in Garnet NetworkLink With recent changes OSX clang compilation fails due to an unused variable.	2014-11-12 09:05:23 -05:00
Ali Saidi	b6f32253dd	arm: Fix timing wakeup with LLSC	2014-11-12 09:05:22 -05:00
Andreas Hansson	7d05895120	sim: Sort SimObject descendants and ports This patch fixes a number of occurrences where the sorting order of the objects was implementation defined.	2014-11-12 09:05:21 -05:00
Andreas Hansson	cc336ecb5e	base: Revert 9277177eccff and use getenv/setenv for UTC time This patch reverts changeset 9277177eccff which does not do what it was intended to do. In essence, we go back to implementing mkutctime much like the non-standard timegm extension.	2014-11-12 09:05:20 -05:00
Marc Orr	bf80734b2c	x86 isa: This patch attempts an implementation at mwait. Mwait works as follows: 1. A cpu monitors an address of interest (monitor instruction) 2. A cpu calls mwait - this loads the cache line into that cpu's cache. 3. The cpu goes to sleep. 4. When another processor requests write permission for the line, it is evicted from the sleeping cpu's cache. This eviction is forwarded to the sleeping cpu, which then wakes up. Committed by: Nilay Vaish <nilay@cs.wisc.edu>	2014-11-06 05:42:22 -06:00
Andrew Lukefahr	bd32d55a2c	cpu: Minor Draining Bug Fixes a bug where Minor drains in the midst of committing a conditional store. While committing a conditional store, lastCommitWasEndOfMacroop is true (from the previous instruction) as we still haven't finished the conditional store. If a drain occurs before the cache response, Minor would check just lastCommitWasEndOfMacroop, which was true, and set drainState=DrainHaltFetch, which increases the streamSeqNum. This caused the conditional store to be squashed when the memory responded and it completed. However, to the memory the store succeeded, while to the instruction sequence it never occurred. In the case of an LLSC, the instruction sequence will replay the squashed STREX, which will fail as the cache is no longer in LLSC. Then the instruction sequence will loop back to a LDREX, which receives the updated (incorrect) value. Committed by: Nilay Vaish <nilay@cs.wisc.edu>	2014-11-06 05:42:21 -06:00
Nilay Vaish	0811f21f67	ruby: provide a backing store Ruby's functional accesses are not guaranteed to succeed as of now. While this is not a problem for the protocols that are currently in the mainline repo, it seems that coherence protocols for gpus rely on a backing store to supply the correct data. The aim of this patch is to make this backing store configurable i.e. it comes into play only when a particular option: --access-backing-store is invoked. The backing store has been there since M5 and GEMS were integrated. The only difference is that earlier the system used to maintain the backing store and ruby's copy was write-only. Sometime last year, we moved to data being supplied by ruby in SE mode simulations. And now we have patches on the reviewboard, which remove ruby's copy of memory altogether and rely completely on the system's memory to supply data. This patch adds back a SimpleMemory member to RubySystem. This member is used only if the option: access-backing-store is set to true. By default, the memory would not be accessed.	2014-11-06 05:42:21 -06:00
Nilay Vaish	3022d463fb	ruby: interface with classic memory controller This patch is the final in the series. The whole series and this patch in particular were written with the aim of interfacing ruby's directory controller with the memory controller in the classic memory system. This is being done since ruby's memory controller has not been kept up to date with the changes going on in DRAMs. Classic's memory controller is more up to date and supports multiple different types of DRAM. This also brings classic and ruby ever closer. The patch also changes ruby's memory controller to expose the same interface.	2014-11-06 05:42:21 -06:00
Nilay Vaish	68ddfab8a4	ruby: remove the function functionalReadBuffers() This function was added when I had incorrectly arrived at the conclusion that such a function can improve the chances of a functional read succeeding. As was later realized, this is not possible in the current setup. While the code using this function was dropped long back, this function was not. Hence the patch.	2014-11-06 05:42:20 -06:00
Nilay Vaish	d25b722e4a	ruby: coherence protocols: remove data block from directory entry This patch removes the data block present in the directory entry structure of each protocol in gem5's mainline. Firstly, this is required for moving towards a common set of memory controllers for classic and ruby memory systems. Secondly, the data block was being misused in several places. It was being used for having free access to the physical memory instead of calling on the memory controller. From now on, the directory controller will not have direct visibility into the physical memory. The Memory Vector object now resides in the Memory Controller class. This also means that some significant changes are being made to the functional accesses in ruby.	2014-11-06 05:42:20 -06:00
Nilay Vaish	0baaed60ab	ruby: slicc: allow adding a bool to an int, like C++.	2014-11-06 05:42:20 -06:00
Nilay Vaish	85c29973a3	ruby: remove sparse memory. In my opinion, it creates needless complications in rest of the code. Also, this structure hinders the move towards common set of code for physical memory controllers.	2014-11-06 05:42:20 -06:00
Nilay Vaish	95a0b18431	ruby: single physical memory in fs mode Both ruby and the system used to maintain memory copies. With the changes carried for programmed io accesses, only one single memory is required for fs simulations. This patch sets the copy of memory that used to reside with the system to null, so that no space is allocated, but address checks can still be carried out. All the memory accesses now source and sink values to the memory maintained by ruby.	2014-11-06 05:41:44 -06:00
Nilay Vaish	8ccfd9defa	ruby: dma sequencer: remove RubyPort as parent class As of now DMASequencer inherits from the RubyPort class. But the code in RubyPort class is heavily tailored for the CPU Sequencer. There are parts of the code that are not required at all for the DMA sequencer. Moreover, the next patch uses the dma sequencer for carrying out memory accesses for all the io devices. Hence, it is better to have a leaner dma sequencer.	2014-11-06 00:55:09 -06:00
Ali Saidi	7a0bf814b6	automated merge	2014-10-29 23:22:26 -05:00
Ali Saidi	f2db2a96d1	arm, tests: Update config files to more recent kernels and create 64-bit regressions. This changes the default ARM system to a Versatile Express-like system that supports 2GB of memory and PCI devices and updates the default kernels/file-systems for AArch64 ARM systems (64-bit) to support up to 32GB of memory and PCI devices. Some platforms that are no longer supported have been pruned from the configuration files. In addition a set of 64-bit ARM regressions have been added to the regression system.	2014-10-29 23:18:27 -05:00
Mitch Hayenga	5bfa521c46	cpu: Add writeback modeling for drain functionality It is possible for the O3 CPU to consider itself drained and later have a squashed instruction perform a writeback. This patch re-adds tracking of in-flight instructions to prevent falsely signaling a drained event.	2014-10-29 23:18:27 -05:00
Mitch Hayenga	6847bbf7ce	cpu: Add drain check functionality to IEW IEW did not check the instQueue and memDepUnit to ensure they were drained. This caused issues when drainSanityCheck() did check those structures after asserting IEW was drained.	2014-10-29 23:18:26 -05:00
Ali Saidi	b31d9e93e2	arm, mem: Fix drain bug and provide drain prints for more components.	2014-10-29 23:18:26 -05:00
Ali Saidi	baf88e908d	arm: Fix multi-system AArch64 boot w/caches. Automatically extract cpu release address from DTB file. Check SCTLR_EL1 to verify all caches are enabled.	2014-10-29 23:18:26 -05:00
Ali Saidi	9900629f83	arm: Mark some miscregs (timer counter) registers as unverifiable. The checker can't verify timer registers, so it should just grab the version from the executing CPU, otherwise it could get a larger value and diverge execution.	2014-10-29 23:18:24 -05:00
Ali Saidi	e3ee27c7b4	cpu: Add support to checker for CACHE_BLOCK_ZERO commands. The checker didn't know how to properly validate these new commands.	2014-10-29 23:18:24 -05:00
Andrew Bardsley	536c72333f	cpu: Fix barrier push to store buffer when full bug in Minor This patch fixes a bug where a completing load or store which is also a barrier can push a barrier into the store buffer without first checking that there is a free slot. The bug was not fatal but would print a warning that the store buffer was full when inserting.	2014-10-29 23:18:24 -05:00
Curtis Dunham	4024fab7fc	mem: don't inhibit WriteInv's or defer snoops on their MSHRs WriteInvalidate semantics depend on the unconditional writeback or they won't complete. Also, there's no point in deferring snoops on their MSHRs, as they don't get new data at the end of their life cycle the way other transactions do. Add comment in the cache about a minor inefficiency re: WriteInvalidate.	2014-10-21 17:04:41 -05:00
Curtis Dunham	46f9f11a55	mem: have WriteInvalidate obsolete MSHRs Since WriteInvalidate directly writes into the cache, it can create tricky timing interleavings with reads and writes to the same cache line that haven't yet completed. This patch ensures that these requests, when completed, don't overwrite the newer data from the WriteInvalidate.	2014-10-29 23:18:24 -05:00
Steve Reinhardt	6ab4eddb9f	syscall_emul: add retry flag to SyscallReturn This hook allows blocking emulated system calls to indicate that they would block, but return control to the simulator so that the simulation does not hang. The actual retry functionality requires additional support, to be provided in a future changeset.	2014-09-02 16:07:50 -05:00
Steve Reinhardt	9ac7f14fc0	syscall_emul: minor style fix to LiveProcess constructor	2014-10-22 15:53:34 -07:00
Steve Reinhardt	df7f0892ed	syscall_emul: devirtualize BaseBufferArg methods Not clear why they were marked virtual to begin with, but that doesn't appear to be necessary.	2014-10-22 15:53:34 -07:00
Steve Reinhardt	44af2c6a69	syscall_emul: Put BufferArg classes in a separate header. Move the BufferArg classes that support syscall buffer args (i.e., pointers into simulated user space) out of syscall_emul.hh and into a new header syscall_emul_buf.hh so they are accessible to emulated driver implementations. Take the opportunity to add some comments as well.	2014-10-22 15:53:34 -07:00
Steve Reinhardt	44ec1d2124	syscall_emul: add EmulatedDriver object Fake SE-mode device drivers can now be added by deriving from this abstract object.	2014-10-22 15:53:34 -07:00
Nilay Vaish	6523aad25c	sim: revert 6709bbcf564d The identifier SYS_getdents is not available on Mac OS X. Therefore, its use results in compilation failure. It seems there is no straight forward way to implement the system call getdents using readdir() or similar C functions. Hence the commit 6709bbcf564d is being rolled back.	2014-10-22 15:59:57 -05:00
Andreas Hansson	d6f1c6ce89	x86: Fixes to avoid LTO warnings This patch fixes a few minor issues that caused link-time warnings when using LTO, mainly for x86. The most important change is how the syscall array is created. Previously gcc and clang would complain that the declaration and definition types did not match. The organisation is now changed to match how it is done for ARM, moving the code that was previously in syscalls.cc into process.cc, and having a class variable pointing to the static array. With these changes, there are no longer any warnings using gcc 4.6.3 with LTO.	2014-10-20 18:03:56 -04:00
Andreas Hansson	6290f98194	misc: Use gmtime for conversion to UTC to avoid getenv/setenv This patch changes how we turn time into UTC. Previously we manipulated the TZ environment variable, but this has issues as the strings that are manipulated could be tainted (see e.g. CERT ENV34-C). Now we simply rely on the built-in gmtime function and avoid touching getenv/setenv all together.	2014-10-20 18:03:55 -04:00
Omar Naji	a4a8568bd2	mem: Fix DRAM activationLimit bug Ensure that we do the proper event scheduling also when the activation limit is disabled.	2014-10-20 18:03:55 -04:00
Andreas Hansson	77f8f5d94c	base: Fix for stats node on gcc < 4.6.3 This patch adds an explicit function to get the underlying node as gcc 4.6.1 and 4.6.2 have issues otherwise.	2014-10-20 18:03:54 -04:00
Omar Naji	29dd2887f4	mem: Add DRAM device size and check against config This patch adds the size of the DRAM device to the DRAM config. It also compares the actual DRAM size (calculated using information from the config) to the size defined in the system. If these two values do not match gem5 will print a warning. In order to do correct DRAM research the size of the memory defined in the system should match the size of the DRAM in the config. The timing and current parameters found in the DRAM configs are defined for a DRAM device with a specific size and would differ for another device with a different size.	2014-10-20 18:03:52 -04:00
Nilay Vaish	922a9d8ed2	cpu: o3: corrects base FP and CC register index in removeThread()	2014-10-20 16:47:55 -05:00
Tom Jablin	c6731e331a	sim: invalid alignment checks in mmap and mremap Presently, the alignment checks in the mmap and mremap implementations in syscall_emul.hh are wrong. The checks are implemented as: if ((start % TheISA::PageBytes) != 0 || (length % TheISA::PageBytes) != 0) { warn("mmap failing: arguments not page-aligned: " "start 0x%x length 0x%x", start, length); return -EINVAL; } This checks both the start and the length arguments of the mmap syscall for page-alignment. However, the POSIX specification says: The off argument is constrained to be aligned and sized according to the value returned by sysconf() when passed _SC_PAGESIZE or _SC_PAGE_SIZE. When MAP_FIXED is specified, the application shall ensure that the argument addr also meets these constraints. The implementation performs mapping operations over whole pages. Thus, while the argument len need not meet a size or alignment constraint, the implementation shall include, in any mapping operation, any partial page specified by the range [pa,pa+len). So the length parameter should not be checked for page-alignment. By contrast, the current implementation fails to check the offset argument, which must be page aligned. Committed by: Nilay Vaish <nilay@cs.wisc.edu>	2014-10-20 16:45:25 -05:00
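The POSIX rules quoted in the row above can be sketched as a small set of checks. This is an illustration of the quoted text only, not the code that the commit actually introduced: the 4096-byte page size, the constant `PageBytesSketch`, and the function names are all assumptions.

```cpp
#include <cstdint>

// Assumption: a 4 KiB page. The real value is ISA-specific (TheISA::PageBytes).
constexpr std::uint64_t PageBytesSketch = 4096;

constexpr bool pageAligned(std::uint64_t value)
{
    return value % PageBytesSketch == 0;
}

// The length need not be aligned; the mapping simply covers whole pages.
constexpr std::uint64_t roundUpToPage(std::uint64_t length)
{
    return (length + PageBytesSketch - 1) / PageBytesSketch * PageBytesSketch;
}

// The offset must always be aligned; the start address only with MAP_FIXED.
constexpr bool mmapArgsAcceptable(std::uint64_t start, std::uint64_t offset,
                                  bool mapFixed)
{
    return pageAligned(offset) && (!mapFixed || pageAligned(start));
}
```

Under these rules a request with an unaligned length is accepted and rounded up, while an unaligned offset is rejected regardless of the other arguments.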
Michael Adler	7254d5742a	sim: mmap: correct behavior for fixed address Change mmap fixed address request to return an error if the mapping is impossible due to conflict instead of what I believe used to be silent corruption. Committed by: Nilay Vaish <nilay@cs.wisc.edu>	2014-10-20 16:45:08 -05:00
Michael Adler	a3fe4c0662	sim: implement getdents/getdents64 in user mode Has been tested only for alpha. Committed by: Nilay Vaish <nilay@cs.wisc.edu>	2014-10-20 16:44:53 -05:00
Severin Wischmann, Ioannis Ilkos <ioannis.ilkos09@imperial.ac.uk>	e72736aaf0	x86: syscall: implementation of exit_group On exit_group syscall, we used to exit the simulator. But now we will only halt the execution of threads that belong to the group. Committed by: Nilay Vaish <nilay@cs.wisc.edu>	2014-10-20 16:43:48 -05:00
Andreas Hansson	6d4866383f	mem: Modernise PhysicalMemory with C++11 features Bring the PhysicalMemory up-to-date by making use of range-based for loops and vector initialisation where possible.	2014-10-16 05:50:01 -04:00
Andreas Hansson	edc77fc03c	misc: Move AddrRangeList from port.hh to addr_range.hh The new location seems like a better fit. The iterator typedefs are removed in favour of using C++11 auto.	2014-10-16 05:49:59 -04:00
Geoffrey Blake	2d2006ddb3	dev: refactor pci config space for sysfs scanning Sysfs on ubuntu scrapes the entire PCI config space when it discovers a device using 4 byte accesses. This was not supported by our devices, in particular the NIC that implemented the extended PCI config space. This change allows the extended PCI config space to be accessed by sysfs properly.	2014-10-16 05:49:57 -04:00
Andrew Bardsley	d6732895a5	mem: Add ExternalMaster and ExternalSlave ports This patch adds two MemoryObject's: ExternalMaster and ExternalSlave. Each object has a single port which can be bound to an externally- provided bridge to a port of another simulation system at initialisation.	2014-10-16 05:49:56 -04:00
Andreas Hansson	e2a13386e5	sim: EventQueue wakeup on events scheduled outside the event loop This patch adds a 'wakeup' member function to EventQueue which should be called on an event queue whenever an event is scheduled on the event queue from outside code within the call tree of the gem5 event loop. This clearly isn't necessary for normal gem5 EventQueue operation but becomes the minimum necessary interface to allow hosting gem5's event loop onto other schedulers where there may be calls into gem5 from external code which schedules events onto an EventQueue between the current time and the time of the next scheduled event. The use case I have in mind is a SystemC hosting where the event loop is: while (more events) { wait(time_to_next_event or wakeup) setCurTick service events at this time } where the 'wait' needs to be woken up if time_to_next_event becomes shorter due to a scheduled event from SystemC arriving in a gem5 object. Requiring 'wakeup' to be called is a more efficient interface than requiring all gem5 event scheduling actions to affect the host scheduler. This interface could be located elsewhere, say on another global object, or by being passed by the host scheduler to objects which will schedule such events, but it seems cleanest to put it on EventQueue as it is actually a signal to the queue. EventQueue::wakeup is called for async_event events on event queue 0 as it's only important that some queue be triggered for such events.	2014-10-16 05:49:53 -04:00
Andrew Bardsley	960935a5bd	base: Reimplement the DPRINTF mechanism in a Logger class This patch adds a Logger class encapsulating dprintf. This allows variants of DPRINTF logging to be constructed and substituted in place of the default behaviour. The Logger provides a logMessage(when, name, format, ...) member function like Trace::dprintf and a getOstream member function to use a raw ostream for logging. A class OstreamLogger is provided which generates the customary debugging output with Trace::OstreamLogger::logMessage being the old Trace::dprintf.	2014-10-16 05:49:53 -04:00
Andreas Hansson	a2d246b6b8	arch: Use shared_ptr for all Faults This patch takes quite a large step in transitioning from the ad-hoc RefCountingPtr to the c++11 shared_ptr by adopting its use for all Faults. There are no changes in behaviour, and the code modifications are mostly just replacing "new" with "make_shared".	2014-10-16 05:49:51 -04:00
Andreas Hansson	a769963d16	o3: Use shared_ptr for MemDepEntry This patch transitions the o3 MemDepEntry from the ad-hoc RefCountingPtr to the c++11 shared_ptr. There are no changes in behaviour, and the code modifications are mainly replacing "new" with "make_shared".	2014-10-16 05:49:49 -04:00
Andreas Hansson	db3739682d	mem: Use shared_ptr for Ruby Message classes This patch transitions the Ruby Message and its derived classes from the ad-hoc RefCountingPtr to the c++11 shared_ptr. There are no changes in behaviour, and the code modifications are mainly replacing "new" with "make_shared". The cloning of derived messages is slightly changed as they previously relied on overriding the base-class through covariant return types.	2014-10-16 05:49:49 -04:00
Andreas Hansson	acdfcad30d	base: Use shared_ptr for stat Node This patch transitions the stat Node and its derived classes from the ad-hoc RefCountingPtr to the c++11 shared_ptr. There are no changes in behaviour, and the code modifications are mainly replacing "new" with "make_shared".	2014-10-16 05:49:48 -04:00
Andreas Hansson	8b789ae451	base: Transition CP annotate to use shared_ptr	2014-10-16 05:49:47 -04:00
Andreas Hansson	ad3f75dc81	dev: Use shared_ptr for EthPacketData This patch transitions the EthPacketData from the ad-hoc RefCountingPtr to the c++11 shared_ptr. There are no changes in behaviour, and the code modifications are mainly replacing "new" with "make_shared". The bool casting operator for the shared_ptr is explicit, and we must therefore either cast it, compare it to NULL (p != nullptr), double negate it (!!p) or do a (p ? true : false).	2014-10-16 05:49:46 -04:00
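The four workarounds listed in the EthPacketData row above can be sketched as follows. `PacketDataSketch` is a hypothetical stand-in for the real class, and the function names are invented for illustration; only the `std::shared_ptr` behaviour itself is standard C++11.

```cpp
#include <memory>

// Hypothetical stand-in for a reference-counted packet payload.
struct PacketDataSketch
{
    int length = 0;
};

using PacketPtrSketch = std::shared_ptr<PacketDataSketch>;

// "return p;" would not compile here: shared_ptr's operator bool is explicit,
// so a bool return value needs one of the following forms.
inline bool validByCompare(const PacketPtrSketch &p) { return p != nullptr; }
inline bool validByCast(const PacketPtrSketch &p) { return static_cast<bool>(p); }
inline bool validByNegate(const PacketPtrSketch &p) { return !!p; }
inline bool validByTernary(const PacketPtrSketch &p) { return p ? true : false; }
```

Contexts such as `if (p)` still work unchanged, because a condition is a contextual conversion to bool and may use the explicit operator.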
Andreas Hansson	4e67ab6663	dev: Use shared_ptr for Arguments::Data This patch takes a first few steps in transitioning from the ad-hoc RefCountingPtr to the c++11 shared_ptr. There are no changes in behaviour, and the code modifications are mainly introducing the use of make_shared. Note that the class could use unique_ptr rather than shared_ptr, were it not for the postfix increment and decrement operators.	2014-10-16 05:49:45 -04:00
Andreas Hansson	2475862747	arch,x86,mem: Dynamically determine the ISA for Ruby store check This patch makes the memory system ISA-agnostic by enabling the Ruby Sequencer to dynamically determine if it has to do a store check. To enable this check, the ISA is encoded as an enum, and the system is able to provide the ISA to the Sequencer at run time. --HG-- rename : src/arch/x86/insts/microldstop.hh => src/arch/x86/ldstflags.hh	2014-10-16 05:49:44 -04:00
Andreas Hansson	df973abef3	mem: Dynamically determine page bytes in memory components This patch takes a step towards an ISA-agnostic memory system by enabling the components to establish the page size after instantiation. The swap operation in the memory is now also allowing any granularity to avoid depending on the IntReg of the ISA.	2014-10-16 05:49:43 -04:00
Andreas Sandberg	37908d62a4	arm: Add helper methods to setup architected PMU events	2014-10-16 05:49:42 -04:00
Andreas Sandberg	e0074324ba	cpu: Probe points for basic PMU stats This changeset adds probe points that can be used to implement PMU counters for CPU stats. The following probes are supported: * BaseCPU::ppCycles / Cycles * BaseCPU::ppRetiredInsts / RetiredInsts * BaseCPU::ppRetiredLoads / RetiredLoads * BaseCPU::ppRetiredStores / RetiredStores * BaseCPU::ppRetiredBranches / RetiredBranches	2014-10-16 05:49:41 -04:00
Andreas Sandberg	9d35d48e84	arm: Add TLB PMU probes This changeset adds probe points that can be used to implement PMU counters for TLB stats. The following probes are supported: * ArmISA::TLB::ppRefills / TLB Refills (TLB insertions)	2014-10-16 05:49:41 -04:00
Andreas Sandberg	76b0ff9ecd	cpu: Add branch predictor PMU probe points This changeset adds probe points that can be used to implement PMU counters for branch predictor stats. The following probes are supported: * BPRedUnit::ppBranches / Branches * BPRedUnit::ppMisses / Misses	2014-10-16 05:49:40 -04:00
Andreas Sandberg	3697990c27	arm: Add a model of an ARM PMUv3 This class implements a subset of the ARM PMU v3 specification as described in the ARMv8 reference manual. It supports most of the features of the PMU, however the following features are known to be missing: * Event filtering (e.g., from different privilege levels). * Access controls (the PMU currently ignores the execution level). * The chain counter (event no. 0x1E) is unimplemented. The PMU itself does not implement any events, it merely provides an interface for the configuration scripts to hook up probes that drive events. Configuration scripts should call addEventProbe() to configure custom events or high-level methods to configure architected events. The Python implementation of addEventProbe() automatically delays event type registration until after instantiation. In order to support CPU switching and some combined counters (e.g., memory references synthesized from loads and stores), the PMU allows multiple probes per event type. When creating a system that switches between CPU models that share the same PMU, PMU events for all of the CPU models can be registered with the PMU. Kudos to Matt Horsnell for the initial gem5 implementation of the PMU.	2014-10-16 05:49:39 -04:00
Andreas Sandberg	132ea6319a	sim: Add typedefs for PMU probe points In order to show make PMU probe points usable across different PMU implementations, we want a common probe interface. This patch the namespace ProbePoins that contains typedefs for probe points that are shared between multiple SimObjects. It also adds typedefs for the PMU probe interface.	2014-10-16 05:49:38 -04:00
Andreas Sandberg	804ed4b418	sim: Add support for serializing BitUnionXX BitUnion instances can normally not be used with the SERIALIZE_SCALAR and UNSERIALIZE_SCALAR macros due to the way they are converted between their storage type and their actual type. This changeset adds a set of parm(In|Out) functions specifically for gem5 bit unions to work around the issue.	2014-10-16 05:49:37 -04:00
Andreas Hansson	66df7b7fd4	config: Add the ability to read a config file using C++ and Python This patch adds the ability to load in config.ini files generated from gem5 into another instance of gem5 built without Python configuration support. The intended use case is for configuring gem5 when it is a library embedded in another simulation system. A parallel config file reader is also provided purely in Python to demonstrate the approach taken and to provide similar functionality for as-yet-unknown use models. The Python configuration file reader can read both .ini and .json files. C++ configuration file reading: A command line option has been added for scons to enable C++ configuration file reading: --with-cxx-config There is an example in util/cxx_config that shows C++ configuration in action. util/cxx_config/README explains how to build the example. Configuration is achieved by the object CxxConfigManager. It handles reading object descriptions from a CxxConfigFileBase object which wraps a config file reader. The wrapper class CxxIniFile is provided which wraps an IniFile for reading .ini files. Reading .json files from C++ would be possible with a similar wrapper and a JSON parser. After reading object descriptions, CxxConfigManager creates SimObjectParam-derived objects from the classes in the (generated with this patch) directory build/ARCH/cxx_config. CxxConfigManager can then build SimObjects from those SimObjectParams (in an order dictated by the SimObject-value parameters on other objects) and bind ports of the produced SimObjects. A minimal set of instantiate-replacing member functions are provided by CxxConfigManager and a few of the member functions of SimObject (such as drain) are extended onto CxxConfigManager. Python configuration file reading (configs/example/read_config.py): A Python version of the reader is also supplied with a similar interface to CxxConfigFileBase (In Python: ConfigFile) to config file readers.
The Python config file reading will handle both .ini and .json files. The object construction strategy is slightly different in Python from the C++ reader as you need to avoid objects prematurely becoming the children of other objects when setting parameters. Port binding also needs to be strictly in the same port-index order as the original instantiation.	2014-10-16 05:49:37 -04:00
Andreas Hansson	b14f521e5f	scons: Add Undefined Behavior Sanitizer (UBSan) option This patch adds the Undefined Behavior Sanitizer (UBSan) for clang and gcc >= 4.9. Due to the performance impact, the usage is guarded by a command-line option.	2014-10-16 05:49:36 -04:00
Akash Bagdia	8b7724d04c	arm: Don't speculatively access most miscregisters. Speculative execution can cause panics in detailed execution mode that shouldn't happen.	2014-09-02 11:26:32 +01:00
Curtis Dunham	f7c6a2cbed	scons: Generate a single debug flag C++ file Reduces target count/compiler invocations by ~180.	2014-08-12 17:35:28 -05:00
Curtis Dunham	f780e85dc3	scons: create dummy target to have SWIG generate C++ classes scons build/<arch>/swig	2014-10-16 05:49:33 -04:00
Andrew Bardsley	d8502ee46d	config: Add a --without-python option to build process Add the ability to build libgem5 without embedded Python or the ability to configure with Python. This is a prelude to a patch to allow config.ini files to be loaded into libgem5 using only C++ which would make embedding gem5 within other simulation systems easier. This adds a few registration interfaces to things which cross between Python and C++. Namely: stats dumping and SimObject resolving	2014-10-16 05:49:32 -04:00
Andrew Lukefahr	8e07b36d2b	cpu: Fix o3 SMT IQCount bug Committed by: Nilay Vaish <nilay@cs.wisc.edu>	2014-10-11 16:16:02 -05:00
Nilay Vaish	a098fad174	ruby: network: garnet: add statistics for different activities This patch adds some statistics to garnet that record the activity of certain structures in the on-chip network. These statistics, in a later patch, will be used for computing the energy consumed by the on-chip network.	2014-10-11 15:02:23 -05:00
Nilay Vaish	25bb18f12b	ruby: network: garnet: remove functions for computing power	2014-10-11 15:02:23 -05:00
Nilay Vaish	9321a41c62	ruby: drop Orion network power model Orion is being dropped from ruby. It would be replaced with DSENT which has better models. Note that the power / energy numbers reported after this patch has been applied are not for use.	2014-10-11 15:02:23 -05:00
Nilay Vaish	b6d804a1e6	ruby: mesi: slight renaming	2014-10-11 15:02:23 -05:00
Nilay Vaish	e7f918d8cd	ruby: structures: correct #ifndef macros in header files	2014-10-11 15:02:22 -05:00
Jiuyue Ma	9fb8b8515b	x86: add LongModeAddressSize function to cpuid LongModeAddressSize is used by kernel 2.6.28.4 for physical address validation. If it is not properly implemented, PCI resource allocation may fail because ioremap fails: - linux-2.6.28.4/arch/x86/mm/ioremap.c:27-30 27 static inline int phys_addr_valid(unsigned long addr) 28 { 29 return addr < (1UL << boot_cpu_data.x86_phys_bits); 30 } - linux-2.6.28.4/arch/x86/kernel/cpu/common.c:475-482 475 #ifdef CONFIG_X86_64 476 if (c->extended_cpuid_level >= 0x80000008) { 477 u32 eax = cpuid_eax(0x80000008); 478 479 c->x86_virt_bits = (eax >> 8) & 0xff; 480 c->x86_phys_bits = eax & 0xff; 481 } 482 #endif - linux-2.6.28.4/arch/x86/mm/ioremap.c:209-214 209 if (!phys_addr_valid(phys_addr)) { 210 printk(KERN_WARNING "ioremap: invalid physical address %llx\n", 211 (unsigned long long)phys_addr); 212 WARN_ON_ONCE(1); 213 return NULL; 214 } This patch returns 0x0000ffff for LongModeAddressSize, which guarantees that phys_addr_valid never fails. Committed by: Nilay Vaish <nilay@cs.wisc.edu>	2014-06-13 16:48:47 +08:00
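The field extraction in the kernel code quoted above can be sketched in Python. This is only an illustration of how the two address-size fields are pulled out of EAX for CPUID leaf 0x80000008; the helper name `decode_address_sizes` is hypothetical, and the value 0x00003028 is an assumed example rather than something gem5 returns.

```python
def decode_address_sizes(eax):
    """Split EAX of CPUID leaf 0x80000008 into (phys_bits, virt_bits)."""
    phys_bits = eax & 0xFF         # bits 7:0, as in c->x86_phys_bits
    virt_bits = (eax >> 8) & 0xFF  # bits 15:8, as in c->x86_virt_bits
    return phys_bits, virt_bits


# Value returned by this patch: both fields decode to 0xff.
patched = decode_address_sizes(0x0000FFFF)

# Assumed example: 40 physical and 48 virtual address bits.
example = decode_address_sizes(0x00003028)
```

Python integers are unbounded, so this sketch deliberately stops at the decode step and does not try to reproduce the C shift in phys_addr_valid.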
Andrew Lukefahr	f94fd44991	sim: draining bug for fast-forwarding multiple cores fix draining bug where multiple cores hit max_insts_any_thread simultaneously Committed by: Nilay Vaish <nilay@cs.wisc.edu>	2014-10-11 15:02:22 -05:00
Nilay Vaish	2816521f0d	base: addr range: slight change to validity check The validity check is being changed from < to <= since the end of the range is considered to be a part of it.	2014-10-11 15:02:22 -05:00
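A minimal sketch of why an inclusive range needs `<=` rather than `<` in its validity check. The class below is a hypothetical stand-in written for illustration, not gem5's AddrRange; the only thing taken from the commit message is that the end address is part of the range.

```python
class InclusiveRange:
    """Hypothetical address range whose end address belongs to the range."""

    def __init__(self, start, end):
        self.start = start
        self.end = end

    def valid(self):
        # With an inclusive end, start == end describes a one-address range,
        # so the check must be <= rather than <.
        return self.start <= self.end

    def size(self):
        return self.end - self.start + 1

    def contains(self, addr):
        return self.start <= addr <= self.end


single = InclusiveRange(0x1000, 0x1000)
backwards = InclusiveRange(0x2000, 0x1000)
```

Under a strict `<` check, `single` would be rejected even though it covers exactly one address.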
Nilay Vaish	a9bfea5a35	base: misc: Add missing header file.	2014-10-11 15:02:22 -05:00
Omar Naji	cd8023a1ee	mem: DRAMPower integration for on-line DRAM power stats This patch takes the final step in integrating DRAMPower and adds the appropriate calls in the DRAM controller to provide the command trace and extract the power and energy stats. The debug printouts are still left in place, but will eventually be removed. At the moment the DRAM power calculation is always on when using the DRAM controller model. The run-time impact of this addition is around 1.5% when looking at the total host seconds of the regressions. We deem this a sensible trade-off to avoid the complication of adding an enable/disable mechanism.	2014-07-29 17:22:44 +01:00
Omar Naji	afc6ce6228	mem: Add DRAMPower wrapping class This patch adds a class to wrap DRAMPower Library in gem5. This class initiates an object of class MemorySpecification of the DRAMPower Library, passes the parameters from DRAMCtrl.py to this object and creates an object of drampower library using the memory specification.	2014-07-29 17:29:36 +01:00
Omar Naji	00b37ffe50	mem: Add missing timing and current parameters to DRAM configs This patch adds missing timing and current parameters to the existing DRAM configs. These missing timing and current parameters are required by DRAMPower for the DRAM power calculations. The missing values are datasheet values of the specified DRAMs, and the appropriate references are added for the various configs.	2014-07-25 10:05:59 +01:00
Omar Naji	f9fce9ba07	mem: Remove DRAMSim2 DDR3 configuration This patch prunes the DDR3 config that was initially created to match the default config of DRAMSim2. The config is not complete as it is, and to avoid having to maintain it, the easiest way forward is to simply prune it. Going forward we are adding power number etc to the other configurations.	2014-10-09 17:52:04 -04:00
Andreas Hansson	c81517c293	config: Add Current as a parameter type This patch adds the Python parameter type Current, which is used for the DRAM power modelling (to start with). With this addition we avoid implicit unit assumptions.	2014-10-09 17:52:00 -04:00
Mitch Hayenga	06f4b521aa	cpu: Remove Ozone CPU from the source tree The Ozone CPU is now very much out of date and completely non-functional, with no one actively working on restoring it. It is a source of confusion for new users who attempt to use it before realizing its current state. RIP	2014-10-09 17:51:58 -04:00
Andreas Hansson	f4a538f862	mem: Add packet sanity checks to cache and MSHRs This patch adds a number of asserts to the cache, checking basic assumptions about packets being requests or responses.	2014-10-09 17:51:56 -04:00
Andreas Hansson	4a453e8c95	mem: Allow packet queue to move next send event forward This patch changes the packet queue such that when scheduling a send, the queue is allowed to move the event forward.	2014-10-09 17:51:52 -04:00
Andreas Hansson	6498ccddb2	misc: Fix issues identified by static analysis Another bunch of issues addressed.	2014-10-01 08:05:54 -04:00
Andreas Hansson	b520223699	arm: Use MiscRegIndex rather than int when flattening Some additional type checking to avoid future issues.	2014-10-01 08:05:52 -04:00
Andreas Hansson	10f82934be	arm: More UBSan cleanups after additional full-system runs Some incorrect casting to IntRegIndex, and a few uninitialized members in the i8254xGBe device.	2014-10-01 08:05:51 -04:00
Andreas Hansson	ec41000dad	arm: Fixed undefined behaviours identified by gcc This patch fixes the runtime errors highlighted by the undefined behaviour sanitizer. In the end there were two issues. First, when rotating an immediate, we ended up shifting an uint32_t by 32 in some cases. This case is fixed by checking for a rotation by 0 positions. Second, the Mrc15 and Mcr15 are operating on an IntReg and a MiscReg, but we used the type RegRegImmOp and passed a MiscRegIndex as an IntRegIndex. This issue is resolved by introducing a MiscRegRegImmOp and RegMiscRegImmOp with the appropriate types. With these fixes there are no runtime errors identified for the full ARM regressions.	2014-09-27 09:08:37 -04:00
Andreas Hansson	341dbf2662	arch: Use const StaticInstPtr references where possible This patch optimises the passing of StaticInstPtr by avoiding copying the reference-counting pointer. This avoids first incrementing and then decrementing the reference-counting pointer.	2014-09-27 09:08:36 -04:00
Andreas Hansson	deb2200671	scons: Address issues related to gcc 4.9.1 Fix a few minor issues to please gcc 4.9.1. Removing the '-fuse-linker-plugin' flag means no libraries are part of the LTO process, but hopefully this is an acceptable loss, as the flag causes issues on a lot of systems (only certain combinations of gcc, ld and ar work).	2014-09-27 09:08:34 -04:00
Curtis Dunham	4836aef1e4	dev: Output invalid access size in IsaFake panic	2014-09-27 09:08:33 -04:00
Curtis Dunham	b7f1d675da	mem: Output precise range when XBar has conflicts	2014-09-27 09:08:32 -04:00
Curtis Dunham	725be98fe8	mem: Provide better diagnostic for unconnected port When _masterPort is null, a message to that effect is more helpful than a segfault.	2014-09-27 09:08:30 -04:00
Andreas Hansson	de62aedabc	misc: Fix a bunch of minor issues identified by static analysis Add some missing initialisation, and fix a handful of benign resource leaks (including some false positives).	2014-09-27 09:08:29 -04:00
Mitch Hayenga	cc6523e2d6	cpu: Remove unused deallocateContext calls The call paths for de-scheduling a thread are halt() and suspend(), from the thread context. There is no call to deallocateContext() in general, though some CPUs chose to define it. This patch removes the function from BaseCPU and the cores which do not require it.	2014-09-20 17:18:36 -04:00
Mitch Hayenga	e1403fc2af	alpha,arm,mips,power,x86,cpu,sim: Cleanup activate/deactivate activate(), suspend(), and halt() used on thread contexts had an optional delay parameter. However this parameter was often ignored. Also, when used, the delay was seemingly arbitrarily set to 0 or 1 cycle (no other delays were ever specified). This patch removes the delay parameter and 'Events' associated with them across all ISAs and cores. Unused activate logic is also removed.	2014-09-20 17:18:35 -04:00
Andreas Hansson	1f6d5f8f84	mem: Rename Bus to XBar to better reflect its behaviour This patch changes the name of the Bus classes to XBar to better reflect the actual timing behaviour. The actual instances in the config scripts are not renamed, and remain as e.g. iobus or membus. As part of this renaming, the code has also been cleaned up slightly, making use of range-based for loops and tidying up some comments. The only changes outside the bus/crossbar code are due to the delay variables in the packet. --HG-- rename : src/mem/Bus.py => src/mem/XBar.py rename : src/mem/coherent_bus.cc => src/mem/coherent_xbar.cc rename : src/mem/coherent_bus.hh => src/mem/coherent_xbar.hh rename : src/mem/noncoherent_bus.cc => src/mem/noncoherent_xbar.cc rename : src/mem/noncoherent_bus.hh => src/mem/noncoherent_xbar.hh rename : src/mem/bus.cc => src/mem/xbar.cc rename : src/mem/bus.hh => src/mem/xbar.hh	2014-09-20 17:18:32 -04:00
Stephan Diestelhorst	435f4aec3d	mem: Add access statistics for the snoop filter Adds a simple access counter for requests and snoops for the snoop filter and also classifies hits based on whether a single other holder existed or whether multiple sharers held the line.	2014-04-25 12:36:16 +01:00
Stephan Diestelhorst	afa2428eca	mem: Tie in the snoop filter in the coherent bus	2014-09-20 17:18:29 -04:00
Stephan Diestelhorst	7d488cc66f	mem: Add a simple snoop counter per bus This patch adds a simple counter for both total messages and a histogram for the fan-out of snoop messages. The fan-out describes to how many ports snoops had to be sent per incoming request / snoop-from-below. Without any cleverness, this usually means to either all, or all but the requesting port.	2014-04-24 13:28:47 +01:00
Stephan Diestelhorst	fe98cb6be4	misc: Add functions for doing popcount and power-of-two checking Adds two public domain algorithms for determining number of set bits and also whether a value is a power of two, uses the builtin that is available in GCC and clang for popcount.	2014-04-24 17:41:26 +01:00
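The two bit tricks named in this commit can be sketched in Python. The commit does not say which public domain algorithms it uses, so the versions below (clearing the lowest set bit in a loop, and the `x & (x - 1)` test) are swapped-in textbook techniques, and the function names are hypothetical. Both assume non-negative integers.

```python
def pop_count(value):
    """Count set bits by repeatedly clearing the lowest set bit."""
    count = 0
    while value:
        value &= value - 1  # drops the least significant 1 bit
        count += 1
    return count


def is_power_of_2(value):
    """True when exactly one bit is set (zero is not a power of two)."""
    return value != 0 and (value & (value - 1)) == 0
```

For example, `pop_count(0b1011)` walks three iterations, one per set bit, and `is_power_of_2(64)` holds because 64 has a single set bit.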
Stephan Diestelhorst	ba98d598ae	mem: Simple Snoop Filter This is a first cut at a simple snoop filter that tracks presence of lines in the caches "above" it. The snoop filter can be applied at any given cache hierarchy and will then handle the caches above it appropriately; there is no need to use this only in the last-level bus. This design currently has some limitations: missing stats, no notion of clean evictions (these will not update the underlying snoop filter, because they are not sent from the evicting cache down), no notion of capacity for the snoop filter and thus no need for invalidations caused by capacity pressure in the snoop filter. These are planned to be added on top with future change sets.	2014-09-20 17:18:26 -04:00
Stephan Diestelhorst	16351ba8d6	energy: Tighter checking of levels for DFS systems There are cases where users might by accident / intention specify fewer voltage operating points than frequency points. We consider one of these cases special: giving only a single voltage to a voltage domain effectively renders it as a static domain. This patch adds additional logic in the auxiliary parts of the functionality to handle these cases properly (simple driver asking for N>1 operating levels, we should return the same voltage for all of them) and adds error checking code in the voltage domain.	2014-08-12 19:00:44 +01:00
Stephan Diestelhorst	65aaf62714	energy: Add the Energy Controller in the right configs Tie in the newly created energy controller components in the default configurations.	2014-07-25 13:36:23 +01:00
Akash Bagdia	04e51e5e3e	energy: Memory-mapped Energy Controller component This patch provides an Energy Controller device that provides software (driver) access to a DVFS handler. The device is currently residing in the dev/arm tree, but there is nothing inherently ARM specific in the behaviour. It is currently only tested and supported for ARM Linux, hence the location.	2014-09-20 17:18:23 -04:00
Stephan Diestelhorst	4422d1322a	energy: Small extensions and fixes for DVFS handler These additions allow easier interoperability with and querying from an additional controller which will be in a separate patch. Also adding warnings for changing the enabled state of the handler across checkpoint / resume and deviating from the state in the configuration. Contributed-by: Akash Bagdia <akash.bagdia@arm.com>	2014-06-16 14:59:44 +01:00
Wendy Elsasser	bf23847072	mem: Add DDR4 bank group timing Added the following parameter to the DRAMCtrl class: - bank_groups_per_rank This defaults to 1. For the DDR4 case, the default is overridden to indicate bank group architecture, with multiple bank groups per rank. Added the following delays to the DRAMCtrl class: - tCCD_L : CAS-to-CAS, same bank group delay - tRRD_L : RAS-to-RAS, same bank group delay These parameters are only applied when bank group timing is enabled. Bank group timing is currently enabled only for DDR4 memories. For all other memories, these delays will default to '0 ns' In the DRAM controller model, applied the bank group timing to the per bank parameters actAllowedAt and colAllowedAt. The actAllowedAt will be updated based on bank group when an ACT is issued. The colAllowedAt will be updated based on bank group when a RD/WR burst is issued. At the moment no modifications are made to the scheduling.	2014-09-20 17:18:21 -04:00
Wendy Elsasser	b6ecfe9183	mem: Add memory rank-to-rank delay Add the following delay to the DRAM controller: - tCS : Different rank bus turnaround delay This will be applied for 1) read-to-read, 2) write-to-write, 3) write-to-read, and 4) read-to-write command sequences, where the new command accesses a different rank than the previous burst. The delay defaults to 2*tCK for each defined memory class. Note that this does not correspond to one particular timing constraint, but is a way of modelling all the associated constraints. The DRAM controller has some minor changes to prioritize commands to the same rank. This prioritization will only occur when the command stream is not switching from a read to write or vice versa (in the case of switching we have a gap in any case). To prioritize commands to the same rank, the model will determine if there are any commands queued (same type) to the same rank as the previous command. This check will ensure that the 'same rank' command will be able to execute without adding bubbles to the command flow, e.g. any ACT delay requirements can be done under the hood, allowing the burst to issue seamlessly.	2014-09-20 17:17:57 -04:00
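A rough sketch of the rule described above, assuming the simplest reading of the message: the tCS penalty applies only when consecutive bursts target different ranks, and a queued command to the previous rank is preferred. The function names and the queue representation (a list of rank numbers) are hypothetical and far simpler than the DRAM controller model.

```python
def rank_switch_delay(prev_rank, next_rank, t_ck):
    """Extra bus turnaround delay, using the default tCS of 2 * tCK."""
    t_cs = 2 * t_ck
    return t_cs if next_rank != prev_rank else 0


def pick_next(queued_ranks, prev_rank):
    """Prefer the first queued command to the previous rank, else the oldest."""
    for index, rank in enumerate(queued_ranks):
        if rank == prev_rank:
            return index
    return 0


# With tCK = 1.25 ns, switching ranks costs 2.5 ns; staying costs nothing.
switch_cost = rank_switch_delay(prev_rank=0, next_rank=1, t_ck=1.25)
stay_cost = rank_switch_delay(prev_rank=1, next_rank=1, t_ck=1.25)

# Queue holds commands to ranks 1, 0, 1; the previous burst went to rank 0.
chosen = pick_next([1, 0, 1], prev_rank=0)
```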
Wendy Elsasser	a384525355	cpu: Update DRAM traffic gen Add new DRAM_ROTATE mode to traffic generator. This mode will generate DRAM traffic that rotates across banks per rank, command types, and ranks per channel The looping order is illustrated below: for (ranks per channel) for (command types) for (banks per rank) // Generate DRAM Command Series This patch also adds the read percentage as an input argument to the DRAM sweep script. If the simulated read percentage is 0 or 100, the middle for loop does not generate additional commands. This loop is used only when the read percentage is set to 50, in which case the middle loop will toggle between read and write commands. Modified sweep.py script, which generates DRAM traffic. Added input arguments and support for new DRAM_ROTATE mode. The script now has input arguments for: 1) Read percentage 2) Number of ranks 3) Address mapping 4) Traffic generator mode (DRAM or DRAM_ROTATE) The default values are: 100% reads, 1 rank, RoRaBaCoCh address mapping, and DRAM traffic gen mode For the DRAM traffic mode, added multi-rank support.	2014-09-20 17:17:55 -04:00
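The looping order described for DRAM_ROTATE can be sketched as a Python generator. This is an illustration of the nesting only: the function name is hypothetical, and it assumes the read percentage is one of 0, 50 or 100, since the message does not describe other values.

```python
def dram_rotate_order(ranks_per_channel, banks_per_rank, read_percent):
    """Yield (rank, command, bank) in the order the loops are nested."""
    if read_percent == 50:
        commands = ["read", "write"]  # middle loop toggles read/write
    elif read_percent == 100:
        commands = ["read"]           # middle loop adds nothing
    elif read_percent == 0:
        commands = ["write"]          # middle loop adds nothing
    else:
        raise ValueError("sketch only covers 0, 50 and 100 percent reads")

    for rank in range(ranks_per_channel):
        for command in commands:
            for bank in range(banks_per_rank):
                yield rank, command, bank


mixed = list(dram_rotate_order(2, 2, 50))
reads_only = list(dram_rotate_order(1, 4, 100))
```

With two ranks, two banks and 50% reads, the sequence visits every bank of rank 0 with reads, then with writes, before moving to rank 1.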
Andreas Sandberg	3f7a9348dd	dev: Add support for 9p proxying over VirtIO This patch adds support for 9p filesystem proxying over VirtIO. It can currently operate by connecting to a 9p server over a socket (VirtIO9PSocket) or by starting the diod 9p server and connecting over pipe (VirtIO9PDiod). WARNING: Checkpoints are currently not supported for systems with 9p proxies!	2014-09-20 17:17:54 -04:00
Andreas Sandberg	8c070c8f1b	dev: Add a VirtIO block device model	2014-09-20 17:17:53 -04:00
Andreas Sandberg	b8c9b04bd6	dev: Add a VirtIO console device model	2014-09-20 17:17:52 -04:00
Andreas Sandberg	bf2c2183c6	dev, pci: Implement basic VirtIO support This patch adds support for VirtIO over the PCI bus. It does so by providing the following new SimObjects: * VirtIODeviceBase - Abstract base class for VirtIO devices. * PciVirtIO - VirtIO PCI transport interface. A VirtIO device is hooked up to the guest system by adding a PciVirtIO device to the PCI bus and connecting it to a VirtIO device using the vio parameter. New VirtIO devices should inherit from VirtIODeviceBase and implement one or more VirtQueues. The VirtQueues are usually device-specific and all derive from the VirtQueue class. Queues must be registered with the base class from the constructor since the device assumes that the number of queues stays constant.	2014-09-20 17:17:51 -04:00
Andreas Sandberg	0c5139310d	dev: Refactor terminal<->UART interface to make it more generic The terminal currently assumes that the transport to the guest always inherits from the Uart class. This assumption breaks when implementing, for example, a VirtIO console. This patch removes this assumption by removing the pointer from the terminal to the uart and replacing it with a more general callback interface. The Uart, or any other class using the terminal, implements an instance of the callbacks class and registers it with the terminal.	2014-09-20 17:17:50 -04:00
Andreas Hansson	0fa128bbd0	base: Clean up redundant string functions and use C++11 This patch does a bit of housekeeping on the string helper functions and relies on the C++11 standard library where possible. It also does away with our custom string hash as an implementation is already part of the standard library.	2014-09-20 17:17:49 -04:00
Andrew Bardsley	b2c2e67468	base: Add getSectionNames to IniFile Add an accessor to IniFile to list all the sections in the file.	2014-09-20 17:17:47 -04:00
Mitch Hayenga	4f0e3cd4d7	cpu: Add ExecFlags debug flag Adds a debug flag to print out the flags an instruction is tagged with.	2014-09-20 17:17:45 -04:00
Mitch Hayenga	3e5bf0c922	mem: Remove the GHB prefetcher from the source tree There are two primary issues with this code which make it deserving of deletion. 1) GHB is a way to structure a prefetcher, not a definitive type of prefetcher 2) This prefetcher isn't even structured like a GHB prefetcher. It's basically a worse version of the stride prefetcher. It primarily serves to confuse new gem5 users and most functionality is already present in the stride prefetcher.	2014-09-20 17:17:44 -04:00
Dam Sunwoo	ca3513d630	cpu: use probes infrastructure to do simpoint profiling Instead of having code embedded in cpu model to do simpoint profiling use the probes infrastructure to do it.	2014-09-20 17:17:43 -04:00
Andrew Bardsley	7329c0e20b	config: Cleanup .json config file generation This patch 'completes' .json config files generation by adding in the SimObject references and String-valued parameters not currently printed. TickParamValues are also changed to print in the same tick-value format as in .ini files. This allows .json files to describe a system as fully as the .ini files currently do. This patch adds a new function config_value (which mirrors ini_str) to each ParamValue and to SimObject. This function can then be explicitly changed to give different .json and .ini printing behaviour rather than being written in terms of ini_str.	2014-09-20 17:17:42 -04:00
Andreas Hansson	41fc8a573e	arch: Pass faults by const reference where possible This patch changes how faults are passed between methods in an attempt to copy as few reference-counting pointer instances as possible. This should avoid unnecessary copies being created, contributing to the increment/decrement of the reference counters.	2014-09-19 10:35:18 -04:00
Andreas Hansson	619c5519fe	cpu: Use a deque in o3 rename instruction queue Switch from a list to a data structure with better data layout.	2014-09-19 10:35:14 -04:00
Andreas Hansson	586a219d11	base: Ensure the CP annotation compiles again A bit of revamping to get the CP annotate functionality to compile.	2014-09-19 10:35:12 -04:00
Andreas Hansson	efd5cf323a	misc: Use safe_cast when assumptions are made about return value This patch changes two dynamic_cast to safe_cast as we assume the return value is not NULL (without checking).	2014-09-19 10:35:11 -04:00
Andreas Hansson	32c111eda4	misc: Restore ostream flags where needed This patch ensures we adhere to the normal ostream usage rules, and restore the flags after modifying them.	2014-09-19 10:35:09 -04:00
Andreas Hansson	addfd89dce	stats: Fix flow-control bug in Vector2D printing	2014-09-19 10:35:08 -04:00
Andreas Hansson	f615c4aeb0	misc: Remove assertions ensuring unsigned values >= 0	2014-09-19 10:35:07 -04:00
Andreas Hansson	377f081251	mem: Check return value of checkFunctional in SimpleMemory Simple fix to ensure we only iterate until we are done.	2014-09-19 10:35:06 -04:00
Andreas Hansson	38646d48eb	mem: Add checks to sendTimingReq in cache A small fix to ensure the return value is not ignored.	2014-09-19 10:35:04 -04:00
Nilay Vaish	2ccdfc547d	ruby: network: revert some of the changes from ad9c042dce54 The changeset ad9c042dce54 made changes to the structures under the network directory to use a map of buffers instead of vector of buffers. The reasoning was that not all vnets that are created are used and we needlessly allocate more buffers than required and then iterate over them while processing network messages. But the move to map resulted in a slow down which was pointed out by Andreas Hansson. This patch moves things back to using vector of message buffers.	2014-09-15 16:19:38 -05:00
Andrew Bardsley	1a45a8c5d3	cpu: Fix memory access in Minor not setting parent Request flags This patch fixes cases where uncacheable/memory type flags are not set correctly on a memory op which is split in the LSQ. Without this patch, request->request is freely used to check flags where the flags should actually come from the accumulation of request fragment flags. This patch also fixes a bug where an uncacheable access which passes through tryToSendRequest more than once can increment LSQ::numAccessesInMemorySystem more than once.	2014-09-12 10:22:49 -04:00
Andrew Bardsley	c8b919aba2	style: Fix line continuation, especially in debug messages This patch closes a number of space gaps in debug messages caused by the incorrect use of line continuation within strings. (There's also one consistency change to a similar, but correct, use of line continuation)	2014-09-12 10:22:47 -04:00
Andreas Hansson	2b4906fc64	minor: Fix typo in DPRINTF for Minor branch prediction	2014-09-12 10:22:46 -04:00
Andreas Sandberg	53a24b01ab	sim: Automatically unregister probe listeners The ProbeListener base class automatically registers itself with a probe manager. Currently, the class does not unregister itself when it is destroyed, which makes removing probe listeners somewhat cumbersome. This patch adds an automatic call to manager->removeListener in the ProbeListener destructor, which solves the problem.	2014-09-09 04:36:43 -04:00
Geoffrey Blake	b0e4de667a	config: Fix vectorparam command line parsing Parsing vectorparams from the command line was slightly broken in that it wouldn't accept the input that the help message provided to the user and it didn't do the conversion on the second code path used to convert the string input to the actual internal representation. This patch fixes these bugs.	2014-09-09 04:36:34 -04:00
Mitch Hayenga	cd1bd7572a	cpu: Only iterate over possible threads on the o3 cpu Some places in O3 always iterated over "Impl::MaxThreads" even if a CPU had fewer threads. This removes a few of those instances.	2014-09-09 04:36:34 -04:00
Mitch Hayenga	9a595fac74	mem: Add accessor function for vaddr Determine if a request has an associated virtual address.	2014-09-09 04:36:33 -04:00
Andreas Sandberg	11494c4345	sim: Fix resource leak in BaseGlobalEvent Static analysis revealed that BaseGlobalEvent::barrier was never deallocated. This changeset solves this leak by making the barrier allocation a part of the BaseGlobalEvent instead of storing a pointer to a separate heap-allocated barrier.	2014-09-09 04:36:32 -04:00
Andreas Hansson	da4539dc74	misc: Fix a number of uninitialised variables and members Static analysis unearthed a bunch of uninitialised variables and members, and this patch addresses the problem. In all cases these omissions seem benign in the end, but at least fixing them means fewer false positives next time round.	2014-09-09 04:36:31 -04:00
Ali Saidi	346fe73370	dev: separate legacy io offsets from PCI offset The PC platform has a single IO range that is used for both legacy IO and PCI IO while other platforms may use separate regions. Provide another mechanism to configure the legacy IO base address range and set it to the PCI IO address range for x86.	2014-09-03 07:43:06 -04:00
Ali Saidi	1c0ae90027	arm: Support >2GB of memory for AArch64 systems	2014-09-03 07:43:05 -04:00
Ali Saidi	1e13f1b074	dev, arm: Add support for linux generic pci host driver This change adds support for a generic pci host bus driver that has been included in recent Linux kernel instead of the more bespoke one we've been using to date. It also works with aarch64 so it provides PCI support for 64-bit ARM Linux. To make this work a new configuration option pci_io_base is added to the RealView platform that should be set to the start of the memory used as memory mapped IO ports (IO ports that are memory mapped, not regular memory mapped IO). And a parameter pci_cfg_gen_offsets which specifies if the config space offsets should be used that the generic driver expects. To use the pci-host-generic device you need to: pci_io_base = 0x2f000000 (Valid for VExpress EMM) pci_cfg_gen_offsets = True and add the following to your device tree: pci { compatible = "pci-host-ecam-generic"; device_type = "pci"; #address-cells = <0x3>; #size-cells = <0x2>; #interrupt-cells = <0x1>; //bus-range = <0x0 0x1>; // CPU_PHYSICAL(2) SIZE(2) // Note, some DTS blobs only support 1 size reg = <0x0 0x30000000 0x0 0x10000000>; // IO (1), no bus address (2), cpu address (2), size (2) // MMIO (1), at address (2), cpu address (2), size (2) ranges = <0x01000000 0x0 0x00000000 0x0 0x2f000000 0x0 0x10000>, <0x02000000 0x0 0x40000000 0x0 0x40000000 0x0 0x10000000>; // With gem5 we typically use INTA/B/C/D one per device interrupt-map = <0x0000 0x0 0x0 0x1 0x1 0x0 0x11 0x1 0x0000 0x0 0x0 0x2 0x1 0x0 0x12 0x1 0x0000 0x0 0x0 0x3 0x1 0x0 0x13 0x1 0x0000 0x0 0x0 0x4 0x1 0x0 0x14 0x1>; // Only match INTA/B/C/D and not BDF interrupt-map-mask = <0x0000 0x0 0x0 0x7>; };	2014-09-03 07:43:04 -04:00
Geoffrey Blake	31e4e475d9	config: Add port splicing capability to PortRef class The new configuration scripts need the ability to splice a simobject between a pair of ports that are already connected. The primary use case is when a CommMonitor needs to be created after the system is configured and then spliced between the pair of ports it will monitor.	2014-09-03 07:43:03 -04:00
Geoffrey Blake	845e199934	config: Refactor RealviewEMM to fit into new config system This eliminates some default devices and adds in helper functions to connect the devices defined here to associate with the proper clock domains.	2014-09-03 07:43:01 -04:00
Andreas Hansson	83a46bfc09	base: Use STL C++11 random number generation This patch changes the random number generator from the in-house Mersenne twister to an implementation relying entirely on C++11 STL. The format for the checkpointing of the twister is simplified. As the functionality was never used this should not matter. Note that this patch does not actually make use of the checkpointing functionality. As the random number generator is not thread safe, it may be sensible to create one generator per thread, system, or even object. Until this is decided the status quo is maintained in that no generator state is part of the checkpoint.	2014-09-03 07:42:55 -04:00
Andreas Hansson	2698e73966	base: Use the global Mersenne twister throughout This patch tidies up random number generation to ensure that it is done consistently throughout the code base. In essence this involves a clean-up of Ruby, and some code simplifications in the traffic generator. As part of this patch a bunch of skewed distributions (off-by-one etc) have been fixed. Note that a single global random number generator is used, and that the object instantiation order will impact the behaviour (the sequence of numbers will be unaffected, but if module A calls random before module B then they would obviously see a different outcome). The dependency on the instantiation order is true in any case due to the execution-model of gem5, so we leave it as is. Also note that the global random generator is not thread safe at this point. Regressions using the memtest, TrafficGen or any Ruby tester are affected and will be updated accordingly.	2014-09-03 07:42:54 -04:00
Andreas Hansson	1ff4c45bbb	mem: Avoid unnecessary retries when bus peer is not ready This patch removes unnecessary retries that happened when the bus layer itself was no longer busy, but the peer was not yet ready. Instead of sending a retry that will inevitably not succeed, the bus now silently waits until the peer sends a retry.	2014-09-03 07:42:53 -04:00
Mitch Hayenga	8f95144e16	arm: Make memory ops work on 64bit/128-bit quantities Multiple instructions assume only 32-bit load operations are available; this patch increases load sizes to 64-bit or 128-bit for many load pair and load multiple instructions.	2014-09-03 07:42:52 -04:00
Curtis Dunham	f6f63ec0aa	mem: write streaming support via WriteInvalidate promotion Support full-block writes directly rather than requiring RMW: * a cache line is allocated in the cache upon receipt of a WriteInvalidateReq, not the WriteInvalidateResp. * only top-level caches allocate the line; the others just pass the request along and invalidate as necessary. * to close a timing window between the Req and the Resp, a new metadata bit tracks whether another cache has read a copy of the new line before the writeback to memory.	2014-06-27 12:29:00 -05:00
Andreas Hansson	3be4f4b846	mem: Fix a bug in the cache port flow control This patch fixes a bug in the cache port where the retry flag was reset too early, allowing new requests to arrive before the retry was actually sent, but with the event already scheduled. This caused a deadlock in the interactions with the O3 LSQ. The patch fixes the underlying issue by shifting the resetting of the flag to be done by the event that also calls sendRetry(). The patch also tidies up the flow control in recvTimingReq and ensures that we also check if we already have a retry outstanding.	2014-09-03 07:42:50 -04:00
Curtis Dunham	5d029463ee	cpu, mem: Make software prefetches non-blocking Previously, they were treated so much like loads that they could stall at the head of the ROB. Now they are always treated like L1 hits. If they actually miss, a new request is created at the L1 and tracked from the MSHRs there if necessary (i.e. if it didn't coalesce with an existing outstanding load).	2014-05-13 12:20:49 -05:00
Curtis Dunham	e3b19cb294	mem: Refactor assignment of Packet types Put the packet type swizzling (that is currently done in a lot of places) into a refineCommand() member function.	2014-05-13 12:20:48 -05:00
Mitch Hayenga	afbae1ec95	x86: Flag instructions that call suspend as IsQuiesce The o3 cpu relies upon instructions that suspend a thread context being flagged as "IsQuiesce". If they are not, unpredictable behavior can occur. This patch fixes that for the x86 ISA.	2014-09-03 07:42:46 -04:00
Mitch Hayenga	659bdc1a6b	cpu: Fix o3 drain bug For X86, the o3 CPU would get stuck with the commit stage not being drained if an interrupt arrived while drain was pending. isDrained() makes sure that pcState.microPC() == 0, thus ensuring that we are at an instruction boundary. However, when we take an interrupt we execute: pcState.upc(romMicroPC(entry)); pcState.nupc(romMicroPC(entry) + 1); tc->pcState(pcState); As a result, the MicroPC is no longer zero. This patch ensures the drain is delayed until no interrupts are present. Once draining, non-synchronous interrupts are deferred until after the switch.	2014-09-03 07:42:45 -04:00
Mitch Hayenga	bb1e6cf7c4	arm: Fix v8 neon latency issue for loads/stores Neon memory ops that operate on multiple registers currently have very poor performance because of interleave/deinterleave micro-ops. This patch marks the deinterleave/interleave micro-ops as "No_OpClass" such that they take minimum cycles to execute and are never resource constrained. Additionally, the micro-ops over-read registers. Although one form may need to read up to 20 sources, not all do. This adds in new forms so false dependencies are not modeled. Instructions read their minimum number of sources.	2014-09-03 07:42:44 -04:00
Curtis Dunham	4a3f11149d	arm: use condition code registers for ARM ISA Analogous to ee049bf (for x86). Requires a bump of the checkpoint version and corresponding upgrader code to move the condition code register values to the new register file.	2014-04-29 16:05:02 -05:00
Andrew Bardsley	035a82ee2c	arm: ISA X31 destination register fix This patch substituted the zero register for X31 used as a destination register. This prevents false dependencies based on X31.	2014-09-03 07:42:43 -04:00
Dam Sunwoo	5008a20aa4	cpu: fix bimodal predictor to use correct global history reg A small bug in the bimodal predictor caused significant degradation in performance on some benchmarks. This was caused by using the wrong globalHistoryReg during the update phase. This patch fixes the bug and brings the performance to normal level.	2014-09-03 07:42:41 -04:00
Mitch Hayenga	476c6fe368	arm: Mark v7 cbz instructions as direct branches v7 cbz/cbnz instructions were improperly marked as indirect branches.	2014-09-03 07:42:40 -04:00
Mitch Hayenga	4f13f676aa	cpu: Fix cache blocked load behavior in o3 cpu This patch fixes the load blocked/replay mechanism in the o3 cpu. Rather than flushing the entire pipeline, this patch replays loads once the cache becomes unblocked. Additionally, deferred memory instructions (loads which had conflicting stores), when replayed would not respect the number of functional units (only respected issue width). This patch also corrects that. Improvements over 20% have been observed on a microbenchmark designed to exercise this behavior.	2014-09-03 07:42:39 -04:00
Mitch Hayenga	283935a6f0	cpu: Fix o3 quiesce fetch bug O3 is supposed to stop fetching instructions once a quiesce is encountered. However due to a bug, it would continue fetching instructions from the current fetch buffer. This is because of a break statement that only broke out of the first of 2 nested loops. It should have broken out of both.	2014-09-03 07:42:38 -04:00
Mitch Hayenga	4f26bedc18	cpu: Fix SMT scheduling issue with the O3 cpu The o3 cpu could attempt to schedule inactive threads under round-robin SMT mode. This is because it maintained an independent priority list of threads from the active thread list. This priority list could become stale once threads were inactive, leading to the cpu trying to fetch/commit from inactive threads. Additionally the fetch queue is now forcibly flushed of instructions from the de-scheduled thread. Relevant output: 24557000: system.cpu: [tid:1]: Calling deactivate thread. 24557000: system.cpu: [tid:1]: Removing from active threads list 24557500: system.cpu: FullO3CPU: Ticking main, FullO3CPU. 24557500: system.cpu.fetch: Running stage. 24557500: system.cpu.fetch: Attempting to fetch from [tid:1]	2014-09-03 07:42:37 -04:00
Mitch Hayenga	daedc5a491	cpu: Fix incorrect speculative branch predictor behavior When a branch mispredicted, gem5 would squash all history after and including the mispredicted branch. However, the mispredicted branch is still speculative and its history is required to roll back state if another, older, branch mispredicts. This leads to things like RAS corruption.	2014-09-03 07:42:36 -04:00
Mitch Hayenga	ecd5300971	cpu: Add a fetch queue to the o3 cpu This patch adds a fetch queue that sits between fetch and decode to the o3 cpu. This effectively decouples fetch from decode stalls allowing it to be more aggressive, running further ahead in the instruction stream.	2014-09-03 07:42:35 -04:00
Mitch Hayenga	1716749c8c	cpu: Fix o3 front-end pipeline interlock behavior The o3 pipeline interlock/stall logic is incorrect. o3 unnecessarily stalled fetch and decode due to later stages in the pipeline. In general, a stage should usually only consider if it is stalled by the adjacent, downstream stage. Forcing stalls due to later stages creates bubbles in the pipeline. Additionally, o3 stalled the entire frontend (fetch, decode, rename) on a branch mispredict while the ROB is being serially walked to update the RAT (robSquashing). It should only have stalled at rename.	2014-09-03 07:42:34 -04:00
Mitch Hayenga	976f27487b	cpu: Change writeback modeling for outstanding instructions As highlighted on the mailing list, gem5's writeback modeling can impact performance. This patch removes the limitation on maximum outstanding issued instructions, however the number that can writeback in a single cycle is still respected in instToCommit().	2014-09-03 07:42:33 -04:00
Mitch Hayenga	fd722946dd	arch: Properly guess OpClass from optional StaticInst flags isa_parser.py guesses the OpClass if none were given based upon the StaticInst flags. The existing code does not take into account optionally set flags. This code hoists the setting of optional flags so OpClass is properly assigned.	2014-09-03 07:42:32 -04:00
Geoffrey Blake	b404ffde60	cache: Fix handling of LL/SC requests under contention If a set of LL/SC requests contend on the same cache block we can get into a situation where CPUs will deadlock if they expect a failed SC to supply them data. This case happens where 3 or more cores are contending for a cache block using LL/SC and the system is configured where 2 cores are connected to a local bus and the third is connected to a remote bus. If a core on the local bus sends an SCUpgrade and the core on the remote bus sends an SCUpgrade they will race to see who will win the SC access. In the meantime if the other core appends a read to one of the SCUpgrades it will expect to be supplied data by that SCUpgrade transaction. If it happens that the SCUpgrade that was picked to supply the data is failed, it will drop the appended request for data and never respond, leaving the requesting core to deadlock. This patch makes all SC's behave as normal stores to prevent this case but still makes sure to check whether it can perform the update.	2014-09-03 07:42:31 -04:00
Curtis Dunham	12210ada54	arm: support 16kb vm granules	2014-05-27 11:00:56 -05:00
Andreas Hansson	77c28cc395	mem: Packet queue clean up No change in functionality, just a bit of tidying up.	2014-09-03 07:42:28 -04:00
Mitch Hayenga	71769d2d7b	dev: Avoid invalid sized reads in PL390 with DPRINTF enabled The first DPRINTF() in PL390::writeDistributor always read a uint32_t, though a packet may have only been 1 or 2 bytes. This caused an assertion in packet->get().	2014-09-03 07:42:27 -04:00
Andrew Bardsley	87f6034462	sim: Fix checkpoint restore for Ticked This patch makes restoring the 'lastStopped' value for Ticked-containing objects (including MinorCPU) optional so that Ticked-containing objects can be restored from non-Ticked-containing objects (such as AtomicSimpleCPU).	2014-09-03 07:42:25 -04:00
Andreas Sandberg	326662b01b	arch, cpu: Factor out the ExecContext into a proper base class We currently generate and compile one version of the ISA code per CPU model. This is obviously wasting a lot of resources at compile time. This changeset factors out the interface into a separate ExecContext class, which also serves as documentation for the interface between CPUs and the ISA code. While doing so, this changeset also fixes up interface inconsistencies between the different CPU models. The main argument for using one set of ISA code per CPU model has always been performance as this avoids indirect branches in the generated code. However, this argument does not hold water. Booting Linux on a simulated ARM system running in atomic mode (opt/10.linux-boot/realview-simple-atomic) is actually 2% faster (compiled using clang 3.4) after applying this patch. Additionally, compilation time is decreased by 35%.	2014-09-03 07:42:22 -04:00
Andreas Hansson	e1ac962939	arch: Cleanup unused ISA traits constants This patch prunes unused values, and also unifies how the values are defined (not using an enum for ALPHA), aligning the use of int vs Addr etc. The patch also removes the duplication of PageBytes/PageShift and VMPageSize/LogVMPageSize. For all ISAs the two pairs had identical values and the latter has been removed.	2014-09-03 07:42:21 -04:00
Mitch Hayenga	23c8540756	config: Change parsing of Addr so hex values work from scripts When passed from a configuration script with a hexadecimal value (like "0x80000000"), gem5 would error out. This is because it would call "toMemorySize" which requires the argument to end with a size specifier (like 1MB, etc). This modification makes it so raw hex values can be passed through Addr parameters from the configuration scripts.	2014-09-03 07:42:20 -04:00
Andreas Hansson	1046b8d6e5	arm: Fix ExtMachInst hash operator underlying type This patch fixes the hash operator used for ARM ExtMachInst, which incorrectly was still using uint32_t. Instead of changing it to uint64_t it is now using the underlying data type of the BitUnion.	2014-09-03 07:42:19 -04:00
Nilay Vaish	2cbe7c705b	ruby: remove typedef of Index as int64 The Index type defined as typedef int64 does not really provide any help since in most places we use primitive types instead of Index. Also, the name Index is so generic that it does not merit being used as a typename.	2014-09-01 16:55:50 -05:00
Nilay Vaish	4ccdf8fb81	x86: set op class of two fp instructions This patch sets op class of two fp instructions: movfp and pop x87 stack as IntAluOp since these instructions do not make use of the fp alu.	2014-09-01 16:55:49 -05:00
Nilay Vaish	b4dade6fb2	ruby: PerfectSwitch: moves code to a per vnet helper function This patch moves code from the wakeup() function to an operateVnet() helper function. The aim is to improve the readability of the code.	2014-09-01 16:55:48 -05:00
Nilay Vaish	7a0d5aafe4	ruby: message buffers: significant changes This patch is the final patch in a series of patches. The aim of the series is to make ruby more configurable than it was. More specifically, the connections between controllers are not at all possible (unless one is ready to make significant changes to the coherence protocol). Moreover the buffers themselves are magically connected to the network inside the slicc code. These connections are not part of the configuration file. This patch makes changes so that these connections will now be made in the python configuration files associated with the protocols. This requires each state machine to expose the message buffers it uses for input and output. So, the patch makes these buffers configurable members of the machines. The patch drops the slicc code that used to connect these buffers to the network. Now these buffers are exposed to the python configuration system as Master and Slave ports. In the configuration files, any master port can be connected to any slave port. The file pyobject.cc has been modified to take care of allocating the actual message buffer. This is in line with how other port connections work.	2014-09-01 16:55:47 -05:00
Nilay Vaish	00286fc5cb	build opts: add MI_example to NULL ISA A later changeset changes the file src/python/swig/pyobject.cc to include a header file that includes a header file generated at build time depending on the PROTOCOL in use. Since NULL ISA was not specifying any protocol, this resulted in compilation problems. Hence, the changeset.	2014-09-01 16:55:46 -05:00
Nilay Vaish	d07abd9b5b	mem: change the namespace Message to ProtoMessage The namespace Message conflicts with the Message data type used extensively in Ruby. Since Ruby is being moved to the same Master/Slave ports based configuration style as the rest of gem5, this conflict needs to be resolved. Hence, the namespace is being renamed to ProtoMessage.	2014-09-01 16:55:46 -05:00
Nilay Vaish	cee8faaad0	ruby: slicc: change the way configurable members are specified There are two changes this patch makes to the way configurable members of a state machine are specified in SLICC. The first change is that the data member declarations will need to be separated by a semi-colon instead of a comma. Secondly, the default value to be assigned would now use SLICC's assignment operator i.e. ':='.	2014-09-01 16:55:45 -05:00
Nilay Vaish	b1d3873ec5	ruby: slicc: improve the grammar This patch changes the grammar for SLICC so as to remove some of the redundant / duplicate rules. In particular rules for object/variable declaration and class member declaration have been unified. Similarly, the rules for a general function and a class method have been unified. One more change is in the priority of two rules. The first rule is on declaring a function with all the params typed and named. The second rule is on declaring a function with all the params only typed. Earlier the second rule had a higher priority. Now the first rule has a higher priority.	2014-09-01 16:55:44 -05:00
Nilay Vaish	3202ec98e7	ruby: mesi three level: slight naming changes.	2014-09-01 16:55:44 -05:00
Nilay Vaish	557200725c	ruby: slicc: do not prefix machine name to variables This changeset does away with prefixing of member variables of state machines with the identity of the machine itself.	2014-09-01 16:55:43 -05:00
Nilay Vaish	6ceb1aadc2	ruby: remove unused toString() from AbstractController	2014-09-01 16:55:42 -05:00
Nilay Vaish	00dbadcbb0	ruby: network: move getNumNodes() to base class All the implementations were doing the same things.	2014-09-01 16:55:42 -05:00
Nilay Vaish	cc2cc58869	ruby: eliminate type Time There is another type Time in src/base class which results in a conflict.	2014-09-01 16:55:41 -05:00
Nilay Vaish	82d136285d	ruby: move files from ruby/system to ruby/structures The directory ruby/system is crowded and unorganized. Hence, the files that hold actual physical structures are being moved to the directory ruby/structures. This includes Cache Memory, Directory Memory, Memory Controller, Wire Buffer, TBE Table, Perfect Cache Memory, Timer Table, Bank Array. The directory ruby/system has the glue code that holds these structures together. --HG-- rename : src/mem/ruby/system/MachineID.hh => src/mem/ruby/common/MachineID.hh rename : src/mem/ruby/buffers/MessageBuffer.cc => src/mem/ruby/network/MessageBuffer.cc rename : src/mem/ruby/buffers/MessageBuffer.hh => src/mem/ruby/network/MessageBuffer.hh rename : src/mem/ruby/buffers/MessageBufferNode.cc => src/mem/ruby/network/MessageBufferNode.cc rename : src/mem/ruby/buffers/MessageBufferNode.hh => src/mem/ruby/network/MessageBufferNode.hh rename : src/mem/ruby/system/AbstractReplacementPolicy.hh => src/mem/ruby/structures/AbstractReplacementPolicy.hh rename : src/mem/ruby/system/BankedArray.cc => src/mem/ruby/structures/BankedArray.cc rename : src/mem/ruby/system/BankedArray.hh => src/mem/ruby/structures/BankedArray.hh rename : src/mem/ruby/system/Cache.py => src/mem/ruby/structures/Cache.py rename : src/mem/ruby/system/CacheMemory.cc => src/mem/ruby/structures/CacheMemory.cc rename : src/mem/ruby/system/CacheMemory.hh => src/mem/ruby/structures/CacheMemory.hh rename : src/mem/ruby/system/DirectoryMemory.cc => src/mem/ruby/structures/DirectoryMemory.cc rename : src/mem/ruby/system/DirectoryMemory.hh => src/mem/ruby/structures/DirectoryMemory.hh rename : src/mem/ruby/system/DirectoryMemory.py => src/mem/ruby/structures/DirectoryMemory.py rename : src/mem/ruby/system/LRUPolicy.hh => src/mem/ruby/structures/LRUPolicy.hh rename : src/mem/ruby/system/MemoryControl.cc => src/mem/ruby/structures/MemoryControl.cc rename : src/mem/ruby/system/MemoryControl.hh => src/mem/ruby/structures/MemoryControl.hh rename : 
src/mem/ruby/system/MemoryControl.py => src/mem/ruby/structures/MemoryControl.py rename : src/mem/ruby/system/MemoryNode.cc => src/mem/ruby/structures/MemoryNode.cc rename : src/mem/ruby/system/MemoryNode.hh => src/mem/ruby/structures/MemoryNode.hh rename : src/mem/ruby/system/MemoryVector.hh => src/mem/ruby/structures/MemoryVector.hh rename : src/mem/ruby/system/PerfectCacheMemory.hh => src/mem/ruby/structures/PerfectCacheMemory.hh rename : src/mem/ruby/system/PersistentTable.cc => src/mem/ruby/structures/PersistentTable.cc rename : src/mem/ruby/system/PersistentTable.hh => src/mem/ruby/structures/PersistentTable.hh rename : src/mem/ruby/system/PseudoLRUPolicy.hh => src/mem/ruby/structures/PseudoLRUPolicy.hh rename : src/mem/ruby/system/RubyMemoryControl.cc => src/mem/ruby/structures/RubyMemoryControl.cc rename : src/mem/ruby/system/RubyMemoryControl.hh => src/mem/ruby/structures/RubyMemoryControl.hh rename : src/mem/ruby/system/RubyMemoryControl.py => src/mem/ruby/structures/RubyMemoryControl.py rename : src/mem/ruby/system/SparseMemory.cc => src/mem/ruby/structures/SparseMemory.cc rename : src/mem/ruby/system/SparseMemory.hh => src/mem/ruby/structures/SparseMemory.hh rename : src/mem/ruby/system/TBETable.hh => src/mem/ruby/structures/TBETable.hh rename : src/mem/ruby/system/TimerTable.cc => src/mem/ruby/structures/TimerTable.cc rename : src/mem/ruby/system/TimerTable.hh => src/mem/ruby/structures/TimerTable.hh rename : src/mem/ruby/system/WireBuffer.cc => src/mem/ruby/structures/WireBuffer.cc rename : src/mem/ruby/system/WireBuffer.hh => src/mem/ruby/structures/WireBuffer.hh rename : src/mem/ruby/system/WireBuffer.py => src/mem/ruby/structures/WireBuffer.py rename : src/mem/ruby/recorder/CacheRecorder.cc => src/mem/ruby/system/CacheRecorder.cc rename : src/mem/ruby/recorder/CacheRecorder.hh => src/mem/ruby/system/CacheRecorder.hh	2014-09-01 16:55:40 -05:00
Alexandru	5efbb4442a	mem: adding architectural page table support for SE mode This patch enables the use of page tables that are stored in system memory and respect the x86 specification, in SE mode. It defines an architectural page table for x86 as a MultiLevelPageTable class and puts a placeholder class for other ISAs' page tables, giving the possibility for future implementation.	2014-08-28 10:11:44 -05:00
Alexandru	26ac28dec2	mem: adding a multi-level page table class This patch defines a multi-level page table class that stores the page table in system memory, consistent with ISA specifications. In this way, cpu models that use the actual hardware to execute (e.g. KvmCPU), are able to traverse the page table.	2014-04-01 12:18:12 -05:00
Andreas Hansson	9e4cd5bf1e	mem: Fix DRAMSim2 cycle check when restoring from checkpoint This patch ensures the cycle check is still valid even when restoring from a checkpoint. In this case the DRAMSim2 cycle count is relative to the startTick rather than 0.	2014-08-26 10:14:38 -04:00
Andreas Hansson	6fa8015b7f	base: Add const to intmath and be more flexible with typing This patch ensures the functions can be used on const variables.	2014-08-26 10:14:32 -04:00
Andreas Sandberg	70176fecd1	base: Replace the internal varargs stuff with C++11 constructs We currently use our own home-baked support for type-safe variadic functions. This is confusing and somewhat limited (e.g., cprintf only supports a limited number of arguments). This changeset converts all uses of our internal varargs support to use C++11 variadic macros.	2014-08-26 10:13:45 -04:00
Andreas Sandberg	f3e5fee743	base: Add compiler macros for C++11 final/override Add the macros M5_ATTR_FINAL and M5_ATTR_OVERRIDE which are defined to final and override respectively if supported by the compiler. This is done to allow a smooth transition to gcc >= 4.7.	2014-08-26 10:13:33 -04:00
Mitch Hayenga	0da99b7e0c	mips: Fix RLIMIT_RSS naming MIPS defined RLIMIT_RSS in a way that could cause a naming conflict with RLIMIT_RSS from the host system. Broke clang+MacOS build.	2014-08-26 10:13:31 -04:00
Andreas Sandberg	61b8d5e4e4	base: Add a static assert to check bit union ranges If a bit field in a bit union is specified as Bitfield<LSB, MSB> instead of Bitfield<MSB, LSB> the code silently fails and the field is read as zero. This changeset introduces a static assert that tests, at compile time, that the bit order is correct.	2014-08-26 10:13:28 -04:00
Andreas Sandberg	a3d3eb0ff7	sparc: Fixup bit ordering in the PSTATE bit union The order of the MSB and LSB bit of the mm field in the PSTATE union is wrong. Any access to this field will currently be ignored and reads will always return zero. This patch fixes the ordering so it is <MSB, LSB> instead of <LSB, MSB>.	2014-08-26 10:13:23 -04:00
Andreas Hansson	3efabb4b2f	mem: Update DRAM controller comments Update comments and add a reference for more information.	2014-08-26 10:13:03 -04:00
Andreas Hansson	56b7796e0d	mem: Fix address interleaving bug in DRAM controller This patch fixes a bug in the DRAM controller address decoding. In cases where the DRAM burst size (e.g. 32 bytes in a rank with a single LPDDR3 x32) was smaller than the channel interleaving size (e.g. systems with a 64-byte cache line) one address bit effectively got used as a channel bit when it should have been a low-order column bit. This patch adds a notion of "columns per stripe", and more clearly deals with the low-order column bits and high-order column bits. The patch also relaxes the granularity check such that it is possible to use interleaving granularities other than the cache line size. The patch also adds a missing M5_CLASS_VAR_USED to the tCK member as it is only used in the debug build for now.	2014-08-26 10:12:45 -04:00
Curtis Dunham	04d1f61ae8	sim: bump checkpoint version for multiple event queues This patch adds a fix for older checkpoints before support for multiple event queues was added in changeset 2cce74fe359e. The change in checkpoint version should really have been part of the aforementioned changeset.	2014-02-05 16:17:41 -06:00
Dam Sunwoo	b04d6c7c33	arm: change MISCREG_L2ERRSR to warn not fail Some newer binaries compiled for Versatile Express TC2 contain access to implementation specific L2MERRSR registers. This causes an infinite loop of undefined exceptions. This patch changes the behavior to "warn not fail" to keep the workloads going.	2014-08-13 06:57:36 -04:00
Dam Sunwoo	74a4926fe0	sim: remove kernel mapping check for baremetal workloads Baremetal workloads are specified using the "kernel" parameter, but don't always have the correct address mappings. This patch adds a boolean flag to the system and bypasses the kernel addr mapping checks when running in baremetal mode.	2014-08-13 06:57:35 -04:00
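
The Addr parsing change in 23c8540756 above can be illustrated with a minimal sketch. This is not gem5's actual code: the names `to_memory_size` and `parse_addr` and the set of suffixes are assumptions made for illustration. It only shows the idea described in the commit message, namely trying the size-suffix conversion first and falling back to a plain integer parse so that raw hex strings such as "0x80000000" are accepted.

```python
# Hypothetical sketch; function names and suffix table are assumptions,
# not gem5's real implementation.
_UNITS = {"kB": 1 << 10, "MB": 1 << 20, "GB": 1 << 30}


def to_memory_size(text):
    """Convert a string such as '1MB' to bytes; a size suffix is required."""
    for suffix, scale in _UNITS.items():
        if text.endswith(suffix):
            return int(text[:-len(suffix)], 0) * scale
    raise ValueError("no size suffix in %r" % text)


def parse_addr(value):
    """Accept ints, size strings like '1MB', and raw strings like '0x80000000'."""
    if isinstance(value, int):
        return value
    try:
        return to_memory_size(value)
    except ValueError:
        # Base 0 lets int() honour a 0x prefix, so raw hex values pass through.
        return int(value, 0)
```

With this fallback, a configuration script could pass either "512kB" or "0x80000000" to the same parameter; a string that is neither still raises ValueError.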


6802 commits