sanchayanmaity/gem5 - Sanchayan Maity's repositories

Author	SHA1	Message	Date
Tony Gutierrez	1a7d3f9fcb	gpu-compute: AMD's baseline GPU model	2016-01-19 14:28:22 -05:00
Andreas Sandberg	745f8229f6	dev, arm: Add a platform with support for both aarch32 and aarch64 Add a platform with support for both aarch32 and aarch64. This platform implements a subset of the devices in a real Versatile Express and extends it with some gem5-specific functionality. It is in many ways similar to the old VExpress_EMM64 platform, but supports the following new features: * Automatic PCI interrupt assignment * PCI interrupts allocated in a contiguous range. * Automatic boot loader selection (32-bit / 64-bit) * Cleaner memory map where gem5-specific devices live in CS5 which isn't used by current Versatile Express platforms. * No fake devices. Devices that were previously faked will be removed from the device tree instead. * Support for 510 GiB contiguous memory	2016-01-15 11:30:13 +00:00
Andreas Hansson	c965ca96cc	configs: Fix inheritance of HMCSystem and cleanup spacing Minor fix to ensure the HMCSystem can actually be instantiated (SimObject cannot be created). Also address some spacing issues.	2016-01-11 05:52:17 -05:00
Gabor Dozsa	64ca31976f	config: Updates for distributed gem5 simulations	2016-01-07 16:33:47 -06:00
Radhika Jagtap	9bd5051b60	config: Enable elastic trace capture and replay in se/fs This patch adds changes to the configuration scripts to support elastic tracing and replay. The patch adds a command line option to enable elastic tracing in SE mode and FS mode. When enabled the Elastic Trace cpu probe is attached to O3CPU and a few O3 CPU parameters are tuned. The Elastic Trace probe writes out both instruction fetch and data dependency traces. The patch also enables configuring the TraceCPU to replay traces using the SE and FS script. The replay run is designed to resume from checkpoint using atomic cpu to restore state keeping it consistent with FS run flow. It then switches to TraceCPU to replay the input traces.	2015-12-07 16:42:16 -06:00
Andreas Sandberg	78275c9d2f	dev: Rewrite PCI host functionality gem5's current PCI host functionality is very ad hoc. The current implementations require PCI devices to be hooked up to the configuration space via a separate configuration port. Devices query the platform to get their config-space address range. Un-mapped parts of the config space are intercepted using the XBar's default port mechanism and a magic catch-all device (PciConfigAll). This changeset redesigns the PCI host functionality to improve code reuse and make config-space and interrupt mapping more transparent. Existing platform code has been updated to use the new PCI host and configured to stay backwards compatible (i.e., no guest-side visible changes). The current implementation does not expose any new functionality, but it can easily be extended with features such as automatic interrupt mapping. PCI devices now register themselves with a PCI host controller. The host controller interface is defined in the abstract base class PciHost. Registration is done by PciHost::registerDevice() which takes the device, its bus position (bus/dev/func tuple), and its interrupt pin (INTA-INTC) as parameters. The registration interface returns a PciHost::DeviceInterface that the PCI device can use to query memory mappings and signal interrupts. The host device manages the entire PCI configuration space. Accesses to devices are decoded into the device's bus position and then forwarded to the correct device. Basic PCI host functionality is implemented in the GenericPciHost base class. Most platforms can use this class as a basic PCI controller. It provides the following functionality: * Configurable configuration space decoding. The number of bits dedicated to a device is a parameter, making it possible to support CAM, ECAM, and legacy mappings. * Basic interrupt mapping using the interruptLine value from a device's configuration space. This behavior is the same as in the old implementation. More advanced controllers can override the interrupt mapping method to dynamically assign host interrupts to PCI devices. * Simple (base + addr) remapping from the PCI bus's address space to physical addresses for PIO, memory, and DMA.	2015-12-05 00:11:24 +00:00
Andreas Sandberg	6a05179e13	arm, config: Automatically discover available platforms Add support for automatically discovering available platforms. The Python side uses functionality similar to what we use when auto-detecting available CPU models. The machine IDs have been updated to match the platform configurations. If there isn't a matching machine ID, the configuration scripts default to -1 which Linux uses for device tree only platforms.	2015-12-04 00:19:05 +00:00
Andreas Hansson	7433d77fcf	mem: Add an option to perform clean writebacks from caches This patch adds the necessary commands and cache functionality to allow clean writebacks. This functionality is crucial, especially when having exclusive (victim) caches. For example, if read-only L1 instruction caches are not sending clean writebacks, there will never be any spills from the L1 to the L2. At the moment the cache model defaults to not sending clean writebacks, and this should possibly be re-evaluated. The implementation of clean writebacks relies on a new packet command WritebackClean, which acts much like a Writeback (renamed WritebackDirty), and also much like a CleanEvict. On eviction of a clean block the cache either sends a clean evict, or a clean writeback, and if any copies are still cached upstream the clean evict/writeback is dropped. Similarly, if a clean evict/writeback reaches a cache where there are outstanding MSHRs for the block, the packet is dropped. In the typical case though, the clean writeback allocates a block in the downstream cache, and marks it writable if the evicted block was writable. The patch changes the O3_ARM_v7a L1 cache configuration and the default L1 caches in config/common/Caches.py	2015-11-06 03:26:43 -05:00
Andreas Hansson	654266f39c	mem: Add cache clusivity This patch adds a parameter to control the cache clusivity, that is if the cache is mostly inclusive or exclusive. At the moment there is no intention to support strict policies, and thus the options are: 1) mostly inclusive, or 2) mostly exclusive. The choice of policy guides the behaviour on a cache fill, and a new helper function, allocOnFill, is created to encapsulate the decision making process. For the timing mode, the decision is annotated on the MSHR on sending out the downstream packet, and in atomic we directly pass the decision to handleFill. We (ab)use the tempBlock in cases where we are not allocating on fill, leaving the rest of the cache unaffected. Simple and effective. This patch also makes it more explicit that multiple caches are allowed to consider a block writable (this is the case also before this patch). That is, for a mostly inclusive cache, multiple caches upstream may also consider the block exclusive. The caches considering the block writable/exclusive all appear along the same path to memory, and from a coherency protocol point of view it works due to the fact that we always snoop upwards in zero time before querying any downstream cache. Note that this patch does not introduce clean writebacks. Thus, for clean lines we are essentially removing a cache level if it is made mostly exclusive. For example, lines from the read-only L1 instruction cache or table-walker cache are always clean, and simply get dropped rather than being passed to the L2. If the L2 is mostly exclusive and does not allocate on fill it will thus never hold the line. A follow-on patch adds the clean writebacks. The patch changes the L2 of the O3_ARM_v7a CPU configuration to be mostly exclusive (and stats are affected accordingly).	2015-11-06 03:26:41 -05:00
Nilay Vaish	6433a10749	configs: fix bug introduced due to 276ad9121192 I had made a typo in changeset 276ad9121192. This changeset fixes it.	2015-11-04 12:36:28 -06:00
Erfan Azarkhish	100cbc9cf6	mem: hmc: top level design This patch enables modeling a complete Hybrid Memory Cube (HMC) device. It highly reuses the existing components in gem5's general memory system with some small modifications. This changeset requires additional patches to model a complete HMC device. Committed by: Nilay Vaish <nilay@cs.wisc.edu>	2015-11-03 12:17:56 -06:00
Palle Lyckegaard	2cb491379b	sparc: add missing parameter to makeSparcSystem() makeSparcSystem() in configs/common/FSConfig.py is missing the cmdLine parameter. Without the parameter the simulation fails to start. With the parameter the simulation starts properly.	2015-11-03 12:17:55 -06:00
Jason Lowe-Power	f065f9941b	config: Add configs scripts used in Learning gem5 Added a new directory in configs (learning_gem5) to hold the scripts that are used in the book. See http://lowepower.com/jason/learning_gem5/ for a working copy. For now, only the scripts in Part 1: Getting started with gem5 have been added. A separate patch adds tests for these scripts. Committed by: Nilay Vaish <nilay@cs.wisc.edu>	2015-09-16 09:35:36 -05:00
Andreas Hansson	ddfa96cf45	mem: Add explicit Cache subclass and make BaseCache abstract Open up for other subclasses to BaseCache and transition to using the explicit Cache subclass. --HG-- rename : src/mem/cache/BaseCache.py => src/mem/cache/Cache.py	2015-08-21 07:03:23 -04:00
Matthias Jung	8723b08dbf	misc: Coupling gem5 with SystemC TLM2.0 Transaction Level Modeling (TLM2.0) is widely used in industry for creating virtual platforms (IEEE 1666 SystemC). This patch contains a standard-compliant implementation of an external gem5 port that enables the usage of gem5 as a TLM initiator component in SystemC-based virtual platforms. Both TLM coding paradigms, loosely timed (b_transport) and approximately timed (nb_transport), are supported. Compared to the original patch a TLM memory manager was added. Furthermore, the transaction object was removed and for each TLM payload a PacketPointer that points to the original gem5 packet is added as a TLM extension. For event handling single events are now created. Committed by: Nilay Vaish <nilay@cs.wisc.edu>	2015-08-03 23:08:40 -05:00
Andreas Hansson	b93c912013	mem: Remove redundant is_top_level cache parameter This patch takes the final step in removing the is_top_level parameter from the cache. With the recent changes to read requests and write invalidations, the parameter is no longer needed, and consequently removed. This also means that asymmetric cache hierarchies are now fully supported (and we are actually using them already with L1 caches, but no table-walker caches, connected to a shared L2).	2015-07-03 10:14:43 -04:00
Andreas Hansson	893533a126	mem: Allow read-only caches and check compliance This patch adds a parameter to the BaseCache to enable a read-only cache, for example for the instruction cache, or table-walker cache (not for x86). A number of checks are put in place in the code to ensure a read-only cache does not end up with dirty data. A follow-on patch adds suitable read requests to allow a read-only cache to explicitly ask for clean data.	2015-07-03 10:14:39 -04:00
Andreas Sandberg	7c4eb3b4d8	kvm, arm: Add support for aarch64 This changeset adds support for aarch64 in kvm. The CPU module supports both checkpointing and online CPU model switching as long as no devices are simulated by the host kernel. It currently has the following limitations: * The system register based generic timer can only be simulated by the host kernel. Workaround: Use a memory mapped timer instead to simulate the timer in gem5. * Simulating devices (e.g., the generic timer) in the host kernel requires that the host kernel also simulates the GIC. * ID registers in the host and in gem5 must match for switching between simulated CPUs and KVM. This is particularly important for ID registers describing memory system capabilities (e.g., ASID size, physical address size). * Switching between a virtualized CPU and a simulated CPU is currently not supported if in-kernel device emulation is used. This could be worked around by adding support for switching to the gem5 (e.g., the KvmGic) side of the device models. A simpler workaround is to avoid in-kernel device models altogether.	2015-06-01 19:44:19 +01:00
Andreas Hansson	554ddc7c07	arch, cpu: Do not forward snoops to table walker This patch simplifies the overall CPU by changing the TLB caches such that they do not forward snoops to the table walker port(s). Note that only ARM and X86 are affected. There is no reason for the ports to snoop as they do not actually take any action, and from a performance point of view we are better off not snooping more than we have to. Should it at a later point be required to snoop for a particular TLB design it is easy enough to add it back.	2015-05-05 03:22:27 -04:00
Nilay Vaish	4333549575	cpu: o3: replace issueLatency with bool pipelined Currently, each op class has a parameter issueLat that denotes the cycles after which another op of the same class can be issued. As of now, this latency can either be one cycle (fully pipelined) or same as execution latency of the op (not at all pipelined). The fact that issueLat is a parameter of type Cycles makes one believe that it can be set to any value. To avoid the confusion, the parameter is being renamed as 'pipelined' with type boolean. If set to true, the op would execute in a fully pipelined fashion. Otherwise, it would execute in an unpipelined fashion.	2015-04-29 22:35:22 -05:00
bpotter	936768c8f4	config: enable setting SE-mode environment variables from file	2015-04-23 13:40:18 -07:00
Andreas Hansson	076ea249ae	config: Remove memory aliases and rely on class name Instead of maintaining two lists, rely entirely on the class name. There is really no point in causing unnecessary confusion.	2015-04-20 12:46:29 -04:00
Malek Musleh	826f69b470	config, cpu: fix progress interval for switched CPUs This patch ensures that the CPU progress Event is triggered for the new set of switched_cpus that get scheduled (e.g. during fast-forwarding). It also avoids printing the interval state if the CPU is currently switched out. Committed by: Nilay Vaish <nilay@cs.wisc.edu>	2015-04-14 11:01:10 -05:00
Dibakar Gope	34ad1123ee	cpu: re-organizes the branch predictor structure. Committed by: Nilay Vaish <nilay@cs.wisc.edu>	2015-04-13 17:33:57 -05:00
Curtis Dunham	c3268f8820	config: Support full-system with SST's memory system This patch adds an example configuration in ext/sst/tests/ that allows an SST/gem5 instance to simulate a 4-core AArch64 system with SST's memHierarchy components providing all the caches and memories.	2015-04-08 15:56:06 -05:00
Andreas Hansson	aeffde5ed5	arm, configs: Do not forward snoops from I cache This fix simply tells the I cache to not forward snoops to the fetch unit (since there is really no reason to do so).	2015-03-27 04:56:10 -04:00
Steve Reinhardt	c55749d998	config: expand '~' and '~user' in paths	2015-03-23 16:14:19 -07:00
Curtis Dunham	bcea57afc3	config: Add ability to exit simulation after initialization When using gem5 as a slave simulator, it will not advance the clock on its own and depends on the master simulator calling simulate(). This new option lets us use the Python scripts to do all the configuration while stopping short of actually simulating anything.	2015-03-23 06:57:38 -04:00
Chris Emmons	142ab40c4b	config: Specify OS type and release on command line This patch enables users to specify --os-type on the command line. This option is used to take specific actions for an OS type, such as changing the kernel command line. This patch is part of the Android KitKat enablement.	2015-03-19 04:06:14 -04:00
Rizwana Begum	0c8e025c3b	config: Fix for 'android' lookup in disk name This patch modifies FSConfig.py to look for 'android' only in disk image name. Before this patch, 'android' was searched in full disk path. Committed by: Nilay Vaish <nilay@cs.wisc.edu>	2015-03-09 09:39:08 -05:00
Andreas Hansson	36dc93a5fa	mem: Move crossbar default latencies to subclasses This patch introduces a few subclasses to the CoherentXBar and NoncoherentXBar to distinguish the different uses in the system. We use the crossbar in a wide range of places: interfacing cores to the L2, as a system interconnect, connecting I/O and peripherals, etc. Needless to say, these crossbars have very different performance, and the clock frequency alone is not enough to distinguish these scenarios. Instead of trying to capture every possible case, this patch introduces dedicated subclasses for the three primary use-cases: L2XBar, SystemXBar and IOXbar. More can be added if needed, and the defaults can be overridden.	2015-03-02 04:00:47 -05:00
Curtis Dunham	07ce60bdfa	config: add --root-device machine parameter In case /dev/sda1 is not actually the boot partition for an image, we can override it on the command line or in a benchmark definition.	2015-01-16 14:12:03 -06:00
Steve Reinhardt	774922895b	config: rename 'file' var Rename uses of 'file' as a local variable to avoid conflict with the built-in type of the same name.	2015-02-05 16:45:12 -08:00
Steve Reinhardt	634d923751	config: make M5_PATH a real search path Although you can put a list of colon-separated directory names in M5_PATH, the current code just takes the first one that exists and assumes all files must live there. This change makes the code search the specified list of directories for each individual binary or disk image that's requested. The main motivation is that the x86/Alpha binaries and the ARM binaries are in separate downloads, and thus naturally end up in separate directories. With this change, you can have M5_PATH point to those two directories, then run any FS regression test without changing M5_PATH. Currently, you either have to merge the two download directories or change M5_PATH (or do something else I haven't figured out).	2015-02-05 16:45:06 -08:00
Andreas Hansson	28a7cea2b3	config: Add XOR hashing to the DRAM channel interleaving This patch uses the recently added XOR hashing capabilities for the DRAM channel interleaving. This avoids channel biasing due to strided access patterns.	2015-02-03 14:25:55 -05:00
Andreas Hansson	5ea60a95b3	config: Adjust DRAM channel interleaving defaults This patch changes the DRAM channel interleaving default behaviour to be more representative. The default address mapping (RoRaBaCoCh) moves the channel bits towards the least significant bits, and uses 128 byte as the default channel interleaving granularity. These defaults can be overridden if desired, but should serve as a sensible starting point for most use-cases.	2015-02-03 14:25:52 -05:00
Malek Musleh	ca131a4196	config: arm: fix os_flags Fix the makeArmSystem routine to reflect recent changes that support kernel commandline option when running android. Without this fix, trying to run android encounters a 'reference before assignment' error. Committed by: Nilay Vaish <nilay@cs.wisc.edu>	2015-01-30 15:49:34 -06:00
Andreas Hansson	3cb9c361e2	scons: Do not build the InOrderCPU One step closer to shifting focus to the MinorCPU.	2015-01-20 08:12:45 -05:00
Andreas Hansson	59460b91f3	config: Expose the DRAM ranks as a command-line option This patch gives the user direct influence over the number of DRAM ranks to make it easier to tune the memory density without affecting the bandwidth (previously the only means of scaling the device count was through the number of channels). The patch also adds some basic sanity checks to ensure that the number of ranks is a power of two (since we rely on bit slices in the address decoding).	2014-12-23 09:31:18 -05:00
Marco Elver	177682ead4	config: Add --memchecker option This patch adds the --memchecker option, to denote that a MemChecker should be instantiated for the system. The exact usage of the MemChecker depends on the system configuration. For now CacheConfig.py makes use of the option, adding MemCheckerMonitor instances between CPUs and D-Caches. Note, however, that currently this only provides limited checking on a running system; other parts of the system, such as I/O devices are not monitored, and may cause warnings to be issued by the monitor.	2014-12-23 09:31:18 -05:00
Dam Sunwoo	809134a2b1	config: Add options to take/resume from SimPoint checkpoints More documentation at http://gem5.org/Simpoints Steps to profile, generate, and use SimPoints with gem5: 1. To profile workload and generate SimPoint BBV file, use the following option: --simpoint-profile --simpoint-interval <interval length> Requires single Atomic CPU and fastmem. <interval length> is in number of instructions. 2. Generate SimPoint analysis using SimPoint 3.2 from UCSD. (SimPoint 3.2 not included with this flow.) 3. To take gem5 checkpoints based on SimPoint analysis, use the following option: --take-simpoint-checkpoint=<simpoint file path>,<weight file path>,<interval length>,<warmup length> <simpoint file> and <weight file> are generated by the SimPoint analysis tool from UCSD. SimPoint 3.2 format expected. <interval length> and <warmup length> are in number of instructions. 4. To resume from gem5 SimPoint checkpoints, use the following option: --restore-simpoint-checkpoint -r <N> --checkpoint-dir <simpoint checkpoint path> <N> is (SimPoint index + 1). E.g., "-r 1" will resume from SimPoint #0.	2014-12-23 09:31:17 -05:00
Gabe Black	7540656fc5	config: Add two options for setting the kernel command line. Both options accept a template which will, through python string formatting, have "mem", "disk", and "script" values substituted in from the mdesc. Additional values can be used on a case by case basis by passing them as keyword arguments to the fillInCmdLine function. That makes it possible to have specialized parameters for a particular ISA, for instance. The first option lets you specify the template directly, and the other lets you specify a file which has the template in it.	2014-12-04 16:42:07 -08:00
Gabe Black	b7dc4ba516	config: Get rid of some extra spaces around default arguments.	2014-12-03 03:11:00 -08:00
Nilay Vaish	3022d463fb	ruby: interface with classic memory controller This patch is the final in the series. The whole series and this patch in particular were written with the aim of interfacing ruby's directory controller with the memory controller in the classic memory system. This is being done since ruby's memory controller has not been kept up to date with the changes going on in DRAMs. Classic's memory controller is more up to date and supports multiple different types of DRAM. This also brings classic and ruby ever closer. The patch also changes ruby's memory controller to expose the same interface.	2014-11-06 05:42:21 -06:00
Ali Saidi	f2db2a96d1	arm, tests: Update config files to more recent kernels and create 64-bit regressions. This changes the default ARM system to a Versatile Express-like system that supports 2GB of memory and PCI devices and updates the default kernels/file-systems for AArch64 ARM systems (64-bit) to support up to 32GB of memory and PCI devices. Some platforms that are no longer supported have been pruned from the configuration files. In addition a set of 64-bit ARM regressions have been added to the regression system.	2014-10-29 23:18:27 -05:00
Ali Saidi	3a5c975fd7	arm: fix bare-metal memory setup. The bare-metal configuration option still configured memory with the old scheme that no-longer works. This change unifies the code so there aren't any differences.	2014-10-29 23:18:26 -05:00
Nilay Vaish	b80e574d01	config: separate function for instantiating a memory controller This patch moves code for instantiating a single memory controller from the function config_mem() to a separate function. This is being done so that memory controllers can be instantiated without assuming that they will be attached to the system in a particular fashion.	2014-10-11 15:02:23 -05:00
Jiuyue Ma	e1a5522a89	config, x86: Ensure that PCI devs get bridged to the memory bus This patch forces IO devices to be mapped to 0xC0000000-0xFFFF0000 by reserving anything between the end of memory and 3GB if memory is less than 3GB. It also statically bridges this address range to the IO bus, which guarantees that accesses to the PCI address space pass through the bridge to the iobus. Committed by: Nilay Vaish <nilay@cs.wisc.edu>	2014-07-17 12:05:41 +08:00
Jiuyue Ma	7d03bf4d6b	config, x86: swap bus_id of ISA/PCI in X86 IntelMPTable This patch assigns bus_id=0 to the PCI bus and bus_id=1 to the ISA bus for the X86 platform, because PCI devices get their config space address using Pc::calcPciConfigAddr(), which requires "assert(bus==0)". This fixes PCI interrupt routing and discovery on Linux. Committed by: Nilay Vaish <nilay@cs.wisc.edu>	2014-07-17 11:00:12 +08:00
Andreas Hansson	1f6d5f8f84	mem: Rename Bus to XBar to better reflect its behaviour This patch changes the name of the Bus classes to XBar to better reflect the actual timing behaviour. The actual instances in the config scripts are not renamed, and remain as e.g. iobus or membus. As part of this renaming, the code has also been cleaned up slightly, making use of range-based for loops and tidying up some comments. The only changes outside the bus/crossbar code are due to the delay variables in the packet. --HG-- rename : src/mem/Bus.py => src/mem/XBar.py rename : src/mem/coherent_bus.cc => src/mem/coherent_xbar.cc rename : src/mem/coherent_bus.hh => src/mem/coherent_xbar.hh rename : src/mem/noncoherent_bus.cc => src/mem/noncoherent_xbar.cc rename : src/mem/noncoherent_bus.hh => src/mem/noncoherent_xbar.hh rename : src/mem/bus.cc => src/mem/xbar.cc rename : src/mem/bus.hh => src/mem/xbar.hh	2014-09-20 17:18:32 -04:00
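
Commit 634d923751 describes turning M5_PATH into a real search path: each requested binary or disk image is looked up in every listed directory rather than only in the first directory that exists. The following is a minimal sketch of that idea, not the code from the commit. The function name `find_in_search_path`, its arguments, and the `binaries` subdirectory used in the usage note are assumptions made for illustration. The `~` expansion mirrors what commit c55749d998 describes.

```python
import os


def find_in_search_path(search_path, subdir, name):
    """Return the first existing file <dir>/<subdir>/<name>.

    search_path is a colon-separated list of directories, in the style
    of the M5_PATH value described in the commit message. This helper
    is hypothetical and is not the gem5 implementation.
    """
    for base in search_path.split(":"):
        if not base:
            continue  # skip empty entries such as a trailing ':'
        candidate = os.path.join(os.path.expanduser(base), subdir, name)
        if os.path.isfile(candidate):
            return candidate
    raise IOError("Can't find '%s' under '%s' in any of: %s"
                  % (name, subdir, search_path))
```

With two separate download directories listed, for example `find_in_search_path("/dist/x86:/dist/arm", "binaries", "vmlinux")` (hypothetical paths), the lookup succeeds as long as either directory holds the file, which is the behaviour the commit message motivates.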

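Commit 7540656fc5 says the kernel command line template is filled in through Python string formatting, with "mem", "disk", and "script" taken from the mdesc and extra values passed as keyword arguments. A sketch of how such a helper could work is shown below. The class `FakeMachineDesc`, the function name `fill_in_cmdline`, and the `rootdev` keyword are hypothetical stand-ins, and the real function in the config scripts may differ.

```python
class FakeMachineDesc(object):
    """Hypothetical stand-in for the machine description (mdesc)."""

    def __init__(self, mem, disk, script):
        self._mem, self._disk, self._script = mem, disk, script

    def mem(self):
        return self._mem

    def disk(self):
        return self._disk

    def script(self):
        return self._script


def fill_in_cmdline(mdesc, template, **kwargs):
    """Substitute mdesc values and any extra keyword values into template.

    The template uses %(name)s placeholders. Values passed as keyword
    arguments take precedence over the ones read from mdesc.
    """
    values = {
        "mem": mdesc.mem(),
        "disk": mdesc.disk(),
        "script": mdesc.script(),
    }
    values.update(kwargs)
    return template % values


mdesc = FakeMachineDesc("512MB", "linux.img", "boot.rcS")
cmdline = fill_in_cmdline(mdesc,
                          "console=ttyS0 root=%(rootdev)s mem=%(mem)s",
                          rootdev="/dev/sda1")
print(cmdline)
```

Passing ISA-specific values as keyword arguments keeps the template itself generic, which matches the motivation given in the commit message.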

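Commit 59460b91f3 mentions a sanity check that the number of DRAM ranks is a power of two, because the address decoding relies on bit slices. The check can be sketched as follows. The function names are hypothetical, and the commit's actual check may be written differently.

```python
def is_power_of_two(n):
    """True if n is a positive integer with exactly one bit set."""
    return n > 0 and (n & (n - 1)) == 0


def rank_address_bits(ranks):
    """Return how many address bits are needed to select a rank.

    Raises ValueError if ranks is not a power of two, since a clean
    bit slice of the address can only select 2**k ranks.
    """
    if not is_power_of_two(ranks):
        raise ValueError("Number of ranks must be a power of two, got %d"
                         % ranks)
    return ranks.bit_length() - 1


print(rank_address_bits(4))  # 4 ranks need a 2-bit slice
```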
344 commits