I would like to model the behavior of caches in Intel architectures (LRU, inclusive, K-way associative, etc.). I've read Wikipedia, Ulrich Drepper's great paper on memory, and the Intel Manual Volume 3A: System Programming Guide (chapter 11, which is not very helpful because it only explains what can be manipulated at the software level). I've also read a bunch of academic papers but, as usual, the authors do not make their code available for replication, even after being asked. My question is: is there already a publicly available framework to model cache behavior? If not, is there a document detailing the behavior of Intel caches at the deepest levels? I could not find one.
Implementing a cache modeling framework
1k Views Asked by Dervin Thunk At
There are 1 best solutions below
There are plenty of cache simulators out there. Dinero, for example (pun obviously intended), should be fairly simple and is often used for educational purposes.
Note that this simulator is trace-driven: it feeds on a list of memory access addresses and doesn't know how to run a binary. You can produce such traces by running the program under binary instrumentation tools, for example
etc. Note that some of these already include internal cache simulators, which you may be able to experiment with.
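To make "trace-driven" concrete, here is a minimal sketch of reading a text trace. It assumes the classic Dinero "din" layout, where each line holds an access label followed by a hexadecimal address, and where labels 0, 1 and 2 mean data read, data write and instruction fetch. The function name `read_din_trace` and the sample addresses are made up for illustration; check the documentation of the tool that produced your trace before relying on this layout.

```python
import io

# Assumed labels of the classic "din" format: 0 = data read,
# 1 = data write, 2 = instruction fetch.
ACCESS_KINDS = {0: "read", 1: "write", 2: "ifetch"}


def read_din_trace(stream):
    """Yield (kind, address) pairs from a din-style text trace."""
    for line in stream:
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or truncated lines
        label = int(fields[0])
        if label not in ACCESS_KINDS:
            continue  # ignore record types this sketch does not model
        yield ACCESS_KINDS[label], int(fields[1], 16)


# Hypothetical three-line trace, held in memory instead of a file.
sample = io.StringIO("2 400100\n0 7ffe0010\n1 7ffe0018\n")
trace = list(read_din_trace(sample))
print(trace)
```

A generator keeps memory use flat, which matters because real traces easily run to hundreds of millions of lines.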
Other simulators model full CPU/system behavior, not just caches, and can therefore run a binary. Most of them include a simulated cache system. For example:
and many others.
On the other hand, writing your own cache simulator is fairly simple if you can work from a memory trace (writing an actual front end is far more complicated). You won't be able to get a very detailed spec of the actual caches in Intel/AMD products, but the basic functionality is described in any computer architecture textbook or even on Wikipedia. The parameters (size, associativity, coherency policies) are mostly documented in the published guides, though they often change between product generations. You can always ask here if you run into any specific question :)
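As a rough illustration of how little code the core needs, here is a minimal sketch of a trace-driven K-way set-associative cache with true LRU replacement. The class name, the 64-byte default line size, and the 32 KiB / 8-way configuration in the usage lines are assumptions picked for the example, not the parameters of any specific Intel part. It models a single level only, with no write policy, coherence, or inclusion.

```python
from collections import OrderedDict


class SetAssociativeCache:
    """Trace-driven model of a K-way set-associative cache with true LRU."""

    def __init__(self, size_bytes, ways, line_bytes=64):
        if size_bytes % (ways * line_bytes) != 0:
            raise ValueError("size must be a multiple of ways * line_bytes")
        self.ways = ways
        self.line_bytes = line_bytes
        self.num_sets = size_bytes // (ways * line_bytes)
        # One OrderedDict per set: keys are tags, order is LRU -> MRU.
        self.sets = [OrderedDict() for _ in range(self.num_sets)]
        self.hits = 0
        self.misses = 0

    def access(self, address):
        """Simulate one access; return True on a hit, False on a miss."""
        line = address // self.line_bytes
        index = line % self.num_sets
        tag = line // self.num_sets
        lines = self.sets[index]
        if tag in lines:
            lines.move_to_end(tag)      # mark as most recently used
            self.hits += 1
            return True
        if len(lines) >= self.ways:
            lines.popitem(last=False)   # evict the least recently used tag
        lines[tag] = True
        self.misses += 1
        return False


# Hypothetical configuration: 32 KiB, 8-way, 64-byte lines.
cache = SetAssociativeCache(size_bytes=32 * 1024, ways=8)
for addr in [0x1000, 0x1008, 0x2000, 0x1000]:
    cache.access(addr)
print(cache.hits, cache.misses)  # prints "2 2"
```

Swapping in the size and associativity from the published guides only changes the constructor arguments. A multi-level or inclusive hierarchy can be built by forwarding misses (and evictions) from one instance to the next.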
Edit:
Regarding the second part of the question - there's no publicly available documentation of the exact cache implementation of Intel CPUs, but the dry "specs" (size, associativity, policies) are in the optimization guide:
Now, modeling these caches should be straightforward, but there may be some hidden caveats, such as power-down features or specialized LRU behaviors. One such reported example can be found at http://blog.stuffedcow.net/2013/01/ivb-cache-replacement/ (if it is accurate, it might be worth implementing for fidelity). Aside from that, I believe these details shouldn't affect the overall behavior much for any practical use.
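To give a feel for how a non-standard replacement behavior can be slotted into a model, here is a sketch of a single cache set using bimodal insertion, a policy from the cache replacement literature: new lines usually enter at the LRU position and only occasionally at the MRU position. This is an assumption chosen purely for illustration, not a description of what any Intel CPU actually does. The class name and the 1/32 default probability are hypothetical.

```python
import random
from collections import OrderedDict


class BimodalSet:
    """One cache set with LRU eviction and bimodal insertion."""

    def __init__(self, ways, mru_probability=1 / 32, rng=None):
        self.ways = ways
        self.mru_probability = mru_probability
        # Seeded generator so that runs are repeatable.
        self.rng = rng if rng is not None else random.Random(0)
        self.lines = OrderedDict()  # order is LRU -> MRU

    def access(self, tag):
        """Return True on a hit, False on a miss."""
        if tag in self.lines:
            self.lines.move_to_end(tag)     # hits are promoted to MRU
            return True
        if len(self.lines) >= self.ways:
            self.lines.popitem(last=False)  # evict the LRU line
        self.lines[tag] = True              # appended at the MRU end
        if self.rng.random() >= self.mru_probability:
            # Usual case: demote the new line to the LRU position.
            self.lines.move_to_end(tag, last=False)
        return False


# Cyclic access to 5 tags in a 4-way set, a pattern on which plain LRU
# never hits. Probability 0.0 makes this demo deterministic.
demo = BimodalSet(ways=4, mru_probability=0.0)
hits = sum(demo.access(tag) for _ in range(3) for tag in range(5))
print(hits)  # prints 6
```

Comparing the hit counts of a model like this against measurements on real hardware is one way to check whether such a detail matters for your workload.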