Looking for a bit of advice if I may. I have a method in my PlayStation emulator (Java based for a university thesis which is since finished). It takes an integer memory address, and then returns the byte at that address - redirecting the read to RAM, BIOS ROM, a given I/O port, etc. depending on the address. At the moment this is implemented using a huge bundle of if-else cases which check the range of an address and read from the right place accordingly, returning the byte.
This gives a performance hit of around 9% of overall runtime for me. I figured I may be able to improve this using a dispatch table - essentially a HashMap with autoboxed Integer keys representing the memory addresses and a lambda value to handle the return of the byte depending on the address. Now bearing in mind there are approximately 2.6 million different possible addresses taking into account the memory map of the PS1, this uses a lot more memory - fine with that.
What is puzzling me is that this gives slightly worse performance than the bundle of if-else statements - around 12% of overall runtime. Is there a better way to do what I'm doing? I can't use an array solution (address as a primitive int index and lambda stored at that index) as there are gaps in the address space that this wouldn't handle without an order of magnitude too much memory usage.
I appreciate any other ideas that might get this number down a bit - I realise Java is not a great language for emulation, but part of my thesis was proving it would work (it does). Many thanks.
Regards, Phil
EDIT:
Below is the entire code of the readByte method (the address is converted to long to allow comparison of lower addresses to higher ones at values considered negative for a normal int):
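/**
* Reads one byte from the memory region that the address maps to.
* @param address the 32-bit address to read from (treated as unsigned)
* @return the byte at that address, or 0 if the address is unmapped or not yet handled
*/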
public byte readByte(int address) {
long tempAddress = address & 0xFFFFFFFFL;
byte retVal = 0;
if (tempAddress >= 0L && tempAddress < 0x200000L) { // RAM
retVal = ram[(int)tempAddress];
} else if (tempAddress >= 0x1F000000L && tempAddress < 0x1F800000L) { // Expansion Region 1
// do nothing for now
;
} else if (tempAddress >= 0x1F800000L && tempAddress < 0x1F800400L) { // Scratchpad
// read from data cache scratchpad if enabled
if (scratchpadEnabled()) {
tempAddress -= 0x1F800000L;
retVal = scratchpad[(int)tempAddress];
}
} else if (tempAddress >= 0x1F801000L && tempAddress < 0x1F802000L) { // I/O Ports
if (tempAddress >= 0x1F801000L && tempAddress < 0x1F801004L) {
int shift = (int)(tempAddress & 0x3L);
switch (shift) {
case 0:
retVal = (byte)(expansion1BaseAddress >>> 24);
break;
case 1:
retVal = (byte)(expansion1BaseAddress >>> 16);
break;
case 2:
retVal = (byte)(expansion1BaseAddress >>> 8);
break;
case 3:
retVal = (byte)expansion1BaseAddress;
break;
}
}
else if (tempAddress >= 0x1F801004L && tempAddress < 0x1F801008L) {
int shift = (int)(tempAddress & 0x3L);
switch (shift) {
case 0:
retVal = (byte)(expansion2BaseAddress >>> 24);
break;
case 1:
retVal = (byte)(expansion2BaseAddress >>> 16);
break;
case 2:
retVal = (byte)(expansion2BaseAddress >>> 8);
break;
case 3:
retVal = (byte)expansion2BaseAddress;
break;
}
} else if (tempAddress >= 0x1F801008L && tempAddress < 0x1F80100CL) {
int shift = (int)(tempAddress & 0x3L);
switch (shift) {
case 0:
retVal = (byte)(expansion1DelaySize >>> 24);
break;
case 1:
retVal = (byte)(expansion1DelaySize >>> 16);
break;
case 2:
retVal = (byte)(expansion1DelaySize >>> 8);
break;
case 3:
retVal = (byte)expansion1DelaySize;
break;
}
} else if (tempAddress >= 0x1F80100CL && tempAddress < 0x1F801010L) {
int shift = (int)(tempAddress & 0x3L);
switch (shift) {
case 0:
retVal = (byte)(expansion3DelaySize >>> 24);
break;
case 1:
retVal = (byte)(expansion3DelaySize >>> 16);
break;
case 2:
retVal = (byte)(expansion3DelaySize >>> 8);
break;
case 3:
retVal = (byte)expansion3DelaySize;
break;
}
} else if (tempAddress >= 0x1F801010L && tempAddress < 0x1F801014L) {
int shift = (int)(tempAddress & 0x3L);
switch (shift) {
case 0:
retVal = (byte)(biosRomDelaySize >>> 24);
break;
case 1:
retVal = (byte)(biosRomDelaySize >>> 16);
break;
case 2:
retVal = (byte)(biosRomDelaySize >>> 8);
break;
case 3:
retVal = (byte)biosRomDelaySize;
break;
}
} else if (tempAddress >= 0x1F801014L && tempAddress < 0x1F801018L) {
int shift = (int)(tempAddress & 0x3L);
switch (shift) {
case 0:
retVal = (byte)(spuDelaySize >>> 24);
break;
case 1:
retVal = (byte)(spuDelaySize >>> 16);
break;
case 2:
retVal = (byte)(spuDelaySize >>> 8);
break;
case 3:
retVal = (byte)spuDelaySize;
break;
}
} else if (tempAddress >= 0x1F801018L && tempAddress < 0x1F80101CL) {
int shift = (int)(tempAddress & 0x3L);
switch (shift) {
case 0:
retVal = (byte)(cdromDelaySize >>> 24);
break;
case 1:
retVal = (byte)(cdromDelaySize >>> 16);
break;
case 2:
retVal = (byte)(cdromDelaySize >>> 8);
break;
case 3:
retVal = (byte)cdromDelaySize;
break;
}
} else if (tempAddress >= 0x1F80101CL && tempAddress < 0x1F801020L) {
int shift = (int)(tempAddress & 0x3L);
switch (shift) {
case 0:
retVal = (byte)(expansion2DelaySize >>> 24);
break;
case 1:
retVal = (byte)(expansion2DelaySize >>> 16);
break;
case 2:
retVal = (byte)(expansion2DelaySize >>> 8);
break;
case 3:
retVal = (byte)expansion2DelaySize;
break;
}
} else if (tempAddress >= 0x1F801020L && tempAddress < 0x1F801024L) {
int shift = (int)(tempAddress & 0x3L);
switch (shift) {
case 0:
retVal = (byte)(commonDelay >>> 24);
break;
case 1:
retVal = (byte)(commonDelay >>> 16);
break;
case 2:
retVal = (byte)(commonDelay >>> 8);
break;
case 3:
retVal = (byte)commonDelay;
break;
}
} else if (tempAddress >= 0x1F801040L && tempAddress < 0x1F801050L) {
// read from ControllerIO object
retVal = cio.readByte((int)tempAddress);
} else if (tempAddress >= 0x1F801060L && tempAddress < 0x1F801064L) {
int shift = (int)(tempAddress & 0x3L);
switch (shift) {
case 0:
retVal = (byte)(ramSize >>> 24);
break;
case 1:
retVal = (byte)(ramSize >>> 16);
break;
case 2:
retVal = (byte)(ramSize >>> 8);
break;
case 3:
retVal = (byte)ramSize;
break;
}
}
else if (tempAddress >= 0x1F801070L && tempAddress < 0x1F801074L) { // Interrupt Status Register
int shift = (int)(tempAddress & 0x3L);
switch (shift) {
case 0:
retVal = (byte)(interruptStatusReg >>> 24);
break;
case 1:
retVal = (byte)(interruptStatusReg >>> 16);
break;
case 2:
retVal = (byte)(interruptStatusReg >>> 8);
break;
case 3:
retVal = (byte)interruptStatusReg;
break;
}
}
else if (tempAddress >= 0x1F801074L && tempAddress < 0x1F801078L) { // Interrupt Mask Register
int shift = (int)(tempAddress & 0x3L);
switch (shift) {
case 0:
retVal = (byte)(interruptMaskReg >>> 24);
break;
case 1:
retVal = (byte)(interruptMaskReg >>> 16);
break;
case 2:
retVal = (byte)(interruptMaskReg >>> 8);
break;
case 3:
retVal = (byte)interruptMaskReg;
break;
}
}
else if (tempAddress >= 0x1F801080L && tempAddress < 0x1F801100L) {
retVal = dma.readByte(address);
}
else if (tempAddress >= 0x1F801100L && tempAddress < 0x1F801104L) {
int shift = (int)(tempAddress & 0x3L);
switch (shift) {
case 0:
retVal = (byte)(timer0.counterValueRead() >>> 24);
break;
case 1:
retVal = (byte)(timer0.counterValueRead() >>> 16);
break;
case 2:
retVal = (byte)(timer0.counterValueRead() >>> 8);
break;
case 3:
retVal = (byte)timer0.counterValueRead();
break;
}
}
else if (tempAddress >= 0x1F801104L && tempAddress < 0x1F801108L) {
int shift = (int)(tempAddress & 0x3L);
switch (shift) {
case 0:
retVal = (byte)(timer0.counterModeRead(false) >>> 24);
break;
case 1:
retVal = (byte)(timer0.counterModeRead(false) >>> 16);
break;
case 2:
retVal = (byte)(timer0.counterModeRead(false) >>> 8);
break;
case 3:
retVal = (byte)timer0.counterModeRead(false);
break;
}
}
else if (tempAddress >= 0x1F801108L && tempAddress < 0x1F80110CL) {
int shift = (int)(tempAddress & 0x3L);
switch (shift) {
case 0:
retVal = (byte)(timer0.counterTargetRead() >>> 24);
break;
case 1:
retVal = (byte)(timer0.counterTargetRead() >>> 16);
break;
case 2:
retVal = (byte)(timer0.counterTargetRead() >>> 8);
break;
case 3:
retVal = (byte)timer0.counterTargetRead();
break;
}
}
else if (tempAddress >= 0x1F801110L && tempAddress < 0x1F801114L) {
int shift = (int)(tempAddress & 0x3L);
switch (shift) {
case 0:
retVal = (byte)(timer1.counterValueRead() >>> 24);
break;
case 1:
retVal = (byte)(timer1.counterValueRead() >>> 16);
break;
case 2:
retVal = (byte)(timer1.counterValueRead() >>> 8);
break;
case 3:
retVal = (byte)timer1.counterValueRead();
break;
}
}
else if (tempAddress >= 0x1F801114L && tempAddress < 0x1F801118L) {
int shift = (int)(tempAddress & 0x3L);
switch (shift) {
case 0:
retVal = (byte)(timer1.counterModeRead(false) >>> 24);
break;
case 1:
retVal = (byte)(timer1.counterModeRead(false) >>> 16);
break;
case 2:
retVal = (byte)(timer1.counterModeRead(false) >>> 8);
break;
case 3:
retVal = (byte)timer1.counterModeRead(false);
break;
}
}
else if (tempAddress >= 0x1F801118L && tempAddress < 0x1F80111CL) {
int shift = (int)(tempAddress & 0x3L);
switch (shift) {
case 0:
retVal = (byte)(timer1.counterTargetRead() >>> 24);
break;
case 1:
retVal = (byte)(timer1.counterTargetRead() >>> 16);
break;
case 2:
retVal = (byte)(timer1.counterTargetRead() >>> 8);
break;
case 3:
retVal = (byte)timer1.counterTargetRead();
break;
}
}
else if (tempAddress >= 0x1F801120L && tempAddress < 0x1F801124L) {
int shift = (int)(tempAddress & 0x3L);
switch (shift) {
case 0:
retVal = (byte)(timer2.counterValueRead() >>> 24);
break;
case 1:
retVal = (byte)(timer2.counterValueRead() >>> 16);
break;
case 2:
retVal = (byte)(timer2.counterValueRead() >>> 8);
break;
case 3:
retVal = (byte)timer2.counterValueRead();
break;
}
}
else if (tempAddress >= 0x1F801124L && tempAddress < 0x1F801128L) {
int shift = (int)(tempAddress & 0x3L);
switch (shift) {
case 0:
retVal = (byte)(timer2.counterModeRead(false) >>> 24);
break;
case 1:
retVal = (byte)(timer2.counterModeRead(false) >>> 16);
break;
case 2:
retVal = (byte)(timer2.counterModeRead(false) >>> 8);
break;
case 3:
retVal = (byte)timer2.counterModeRead(false);
break;
}
}
else if (tempAddress >= 0x1F801128L && tempAddress < 0x1F80112CL) {
int shift = (int)(tempAddress & 0x3L);
switch (shift) {
case 0:
retVal = (byte)(timer2.counterTargetRead() >>> 24);
break;
case 1:
retVal = (byte)(timer2.counterTargetRead() >>> 16);
break;
case 2:
retVal = (byte)(timer2.counterTargetRead() >>> 8);
break;
case 3:
retVal = (byte)timer2.counterTargetRead();
break;
}
}
else if (tempAddress >= 0x1F801810L && tempAddress < 0x1F801814L) {
int shift = (int)(tempAddress & 0x3L);
switch (shift) {
case 0:
retVal = (byte)(gpu.readResponse() >>> 24);
break;
case 1:
retVal = (byte)(gpu.readResponse() >>> 16);
break;
case 2:
retVal = (byte)(gpu.readResponse() >>> 8);
break;
case 3:
retVal = (byte)gpu.readResponse();
break;
}
}
else if (tempAddress >= 0x1F801814L && tempAddress < 0x1F801818L) {
int shift = (int)(tempAddress & 0x3L);
switch (shift) {
case 0:
retVal = (byte)(gpu.readStatus() >>> 24);
break;
case 1:
retVal = (byte)(gpu.readStatus() >>> 16);
break;
case 2:
retVal = (byte)(gpu.readStatus() >>> 8);
break;
case 3:
retVal = (byte)gpu.readStatus();
break;
}
}
else if (tempAddress >= 0x1F801800L && tempAddress < 0x1F801804L) { // CDROM
switch ((int)tempAddress & 0xF) {
case 0:
retVal = cdrom.read1800();
break;
case 1:
retVal = cdrom.read1801();
break;
case 2:
retVal = cdrom.read1802();
break;
case 3:
retVal = cdrom.read1803();
break;
}
}
else if (tempAddress >= 0x1F801C00L && tempAddress < 0x1F802000L) {
// fake SPU read
retVal = spu.readByte(address);
}
} else if (tempAddress >= 0x1F802000L && tempAddress < 0x1F803000L) { // Expansion Region 2 (I/O Ports)
// read from BIOS post register
if (tempAddress == 0x1F802041L) {
retVal = biosPost;
}
} else if (tempAddress >= 0x1FA00000L && tempAddress < 0x1FC00000L) { // Expansion Region 3 (Multipurpose)
// do nothing for now
;
} else if (tempAddress >= 0x1FC00000L && tempAddress < 0x1FC80000L) { // BIOS ROM
// read from memory mapped BIOS file
tempAddress -= 0x1FC00000L;
retVal = biosBuffer.get((int)tempAddress);
} else if (tempAddress >= 0xFFFE0000L && tempAddress < 0xFFFE0200L) { // I/O Ports (Cache Control)
if (tempAddress >= 0xFFFE0130L && tempAddress < 0xFFFE0134L) { // Cache Control Register
int shift = (int)(tempAddress & 0x3L);
switch (shift) {
case 0:
retVal = (byte)(cacheControlReg >>> 24);
break;
case 1:
retVal = (byte)(cacheControlReg >>> 16);
break;
case 2:
retVal = (byte)(cacheControlReg >>> 8);
break;
case 3:
retVal = (byte)cacheControlReg;
break;
}
}
}
return retVal;
}
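As an aside, every four-case switch above extracts one byte of a 32-bit register value, so each could be collapsed into a single shift. A minimal sketch, assuming a hypothetical helper named byteOf (not part of the emulator) and the same byte order as the switches, where offset 0 yields the top byte:

```java
final class RegisterBytes {
    // Hypothetical helper: returns the byte of `value` selected by the low two
    // address bits, matching the switch blocks (offset 0 -> bits 31..24,
    // offset 3 -> bits 7..0).
    static byte byteOf(int value, int address) {
        int shift = 24 - 8 * (address & 0x3);
        return (byte) (value >>> shift);
    }
}
```

With this, a branch such as the Interrupt Status Register one would reduce to something like retVal = RegisterBytes.byteOf(interruptStatusReg, address). Whether this changes the measured runtime would need profiling; it mainly shortens the method.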
The best approach depends on your implementation under the hood. I see that the address space of the PSX is 32-bit but, as with many consoles, zones are mirrored. Without seeing your actual implementation this is just guesswork, but here are some considerations.
I'll start by considering the PSX memory map table.
For the I/O ports there is not much you can do: they are scattered and must be handled individually. What we can do is study how to improve the addressing of everything else.
We can see that mirrored regions differ only in the 4 most significant bits, which means that we can do
address &= 0x0FFFFFFF
so that we ignore the region and consider only the significant part of the address. So now we have 3 kinds of addresses:
- starting at 0x0000000, mapped to main RAM
- starting at 0xF000000 and ending at 0xFC00000 (+ BIOS ROM)
- 0xFFFF0000
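To illustrate the masking step, here is a minimal sketch. The class and method names are hypothetical, and the example addresses assume the usual MIPS segment bases (0x80000000 and 0xA0000000) as the mirrors in question:

```java
final class MirrorMask {
    // Clears the top 4 bits so that mirrored addresses collapse to one value.
    // The result is always non-negative, so plain int comparisons work and
    // the conversion to long is no longer needed for these regions.
    static int stripRegion(int address) {
        return address & 0x0FFFFFFF;
    }
}
```

For example, 0x00001234, 0x80001234 and 0xA0001234 all map to 0x00001234. Note that the cache control range at 0xFFFE0000 in the question's code is not a mirror of anything, so it would have to be tested before the mask is applied.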
This could lead to a hybrid approach in which you use both if/else and a cache.
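A minimal sketch of what such a hybrid might look like, under stated assumptions: the class name, readMapped and readCacheControl are hypothetical placeholders (here they just return fixed marker bytes), and only the 2 MiB RAM size and the address ranges come from the question's code.

```java
final class HybridBus {
    final byte[] ram = new byte[0x200000]; // 2 MiB main RAM, as in the question

    byte readByte(int address) {
        // The 0xFFFE0000 range is not a mirror, so test it before masking.
        if ((address & 0xFFFF0000) == 0xFFFE0000) {
            return readCacheControl(address);
        }
        int a = address & 0x0FFFFFFF;         // strip region bits; non-negative
        if (a < 0x200000) {                   // hot path: main RAM
            return ram[a];
        }
        if ((a & 0x0F000000) == 0x0F000000) { // the 0xF000000 group
            return readMapped(a);             // e.g. served from a lookup table
        }
        return 0;                             // unmapped
    }

    // Placeholder for the table-driven dispatch of the 0xF000000 group.
    byte readMapped(int maskedAddress) { return (byte) 0x11; }

    // Placeholder for the cache control register handling.
    byte readCacheControl(int address) { return (byte) 0x22; }
}
```

The idea is that the common case (RAM) costs one mask and one comparison, and only the rarer accesses pay for a more general dispatch.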
Now we have the address space between 0xF000000 and 0xFC00000, which must be split into multiple parts. Taking the region bases from the readByte method above and applying the mask, we have this:
0xF000000 Expansion Region 1
0xF800000 Scratchpad
0xF801000 I/O Ports
0xF802000 Expansion Region 2
0xFA00000 Expansion Region 3
0xFC00000 BIOS ROM
Notice that the first 4 bits are always 0xF while the last 12 bits are always 0, so we don't need them to decide where to dispatch the call. This means that the interesting part of the address is selected by the mask 0x0FFF000, so we could translate the address by applying that mask and shifting right by 12 bits. That leaves only 4096 possible values, which could fit in a tight lookup table (LUT).
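A minimal sketch of such a table follows. Everything named here (PageTable, ByteReader, map, read) is hypothetical and only illustrates the idea; it assumes the caller has already masked the address and established that it belongs to the 0xF000000 group.

```java
final class PageTable {
    // Hypothetical handler type: reads one byte at an already-masked address.
    interface ByteReader {
        byte read(int maskedAddress);
    }

    private static final ByteReader UNMAPPED = a -> (byte) 0;
    private final ByteReader[] pages = new ByteReader[4096];

    PageTable() {
        for (int i = 0; i < pages.length; i++) {
            pages[i] = UNMAPPED; // every page starts out unmapped
        }
    }

    // Keeps the 12 "interesting" bits and turns them into an index 0..4095.
    static int pageOf(int maskedAddress) {
        return (maskedAddress & 0x0FFF000) >>> 12;
    }

    // Registers one handler for every 4 KiB page touched by [base, base + size).
    void map(int base, int size, ByteReader reader) {
        int first = pageOf(base);
        int count = (size + 0xFFF) >>> 12; // round up to whole pages
        for (int i = 0; i < count; i++) {
            pages[first + i] = reader;
        }
    }

    byte read(int maskedAddress) {
        return pages[pageOf(maskedAddress)].read(maskedAddress);
    }
}
```

Compared with a HashMap keyed by boxed Integer addresses, this indexes a plain array with a primitive int, so there is no boxing and no hashing, and the table holds 4096 references rather than millions of entries. A handler registered for a region smaller than a page (the 1 KiB scratchpad, for instance) would still need its own bounds check.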