Similar to my previous questions, here and here, I am trying to extract dylibs from the dyld shared cache. However, I'm running into a problem, referenced here, where the rebase information, as well as other data in the LINKEDIT segment, is being read from data_offset + 0x1000 rather than from data_offset.
For example, when I dlopen the dylib with DYLD_PRINT_REBASINGS=1 set, I get the following:
TestMachoRunner: pre-load: VoiceShortcutsMod
dyld: rebase: voiceShortcutsMod:*0x103000000 += 0xFFFFFFFF5D8B9000
TestMachoRunner: dlopen(/var/root/voiceShortcutsMod, 1): bad rebase type 0 in /var/root/voiceShortcutsMod
compared to when I add 0x1000 bytes in front of the data:
TestMachoRunner: pre-load: VoiceShortcutsMod
dyld: rebase: voiceShortcutsMod:*0x121F862F8 += 0xFFFFFFFF5AF3D000
dyld: rebase: voiceShortcutsMod:*0x121F86300 += 0xFFFFFFFF5AF3D000
dyld: rebase: voiceShortcutsMod:*0x121F86308 += 0xFFFFFFFF5AF3D000
...
and, for reference, a normal library:
TestMachoRunner: pre-load: TestMacho
dyld: rebase: TestMacho:*0x1003D0018 += 0x1003C8000
dyld: rebase: TestMacho:*0x1003D0038 += 0x1003C8000
dyld: rebase: TestMacho:*0x1003D0048 += 0x1003C8000
...
Possible Solutions
1. Add 0x1000 bytes of data to everything
While I could pad everything else the same way I padded the rebase info, this doesn't feel like a good solution: it would increase the size of the final dylib considerably, and there is no guarantee that the offset will be 0x1000 bytes every time.
2. Shift the segment addresses.
In the dyld shared cache, the segments of every image are combined and sorted (more info here). As a result, when a dylib is extracted, its segments have very high vmaddrs with large gaps between them, e.g.,
1A5747000 __TEXT
1C70492F8 __DATA_CONST
1CB44F160 __DATA
1CD79F3C8 __DATA_DIRTY
1D0014000 __LINKEDIT
To test my theory, I set the vmaddrs to be the same as the file offsets, which makes the layout more like that of a regular library, e.g.,
5B000 __TEXT
5C000 __DATA_CONST
70000 __DATA
78000 __DATA_DIRTY
7C000 __LINKEDIT
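The remapping above can be summarised as a per-segment delta. The following is a minimal Python sketch using only the addresses from the two listings; the names `segment_deltas` and `gaps` are made up for illustration. It shows that each segment moves by a different amount, so the gaps between segments change.

```python
# vmaddrs of the extracted dylib, as listed above
CACHE_VMADDRS = {
    "__TEXT": 0x1A5747000,
    "__DATA_CONST": 0x1C70492F8,
    "__DATA": 0x1CB44F160,
    "__DATA_DIRTY": 0x1CD79F3C8,
    "__LINKEDIT": 0x1D0014000,
}

# vmaddrs after setting them equal to the file offsets, as listed above
PACKED_VMADDRS = {
    "__TEXT": 0x5B000,
    "__DATA_CONST": 0x5C000,
    "__DATA": 0x70000,
    "__DATA_DIRTY": 0x78000,
    "__LINKEDIT": 0x7C000,
}


def segment_deltas(old, new):
    """Return how far each segment moves (new vmaddr minus old vmaddr)."""
    return {name: new[name] - old[name] for name in old}


def gaps(vmaddrs):
    """Return the distance between the starts of consecutive segments."""
    names = list(vmaddrs)
    return {
        f"{a}->{b}": vmaddrs[b] - vmaddrs[a]
        for a, b in zip(names, names[1:])
    }


if __name__ == "__main__":
    for name, delta in segment_deltas(CACHE_VMADDRS, PACKED_VMADDRS).items():
        print(f"{name:<14} moves by {delta:#x}")
    print(gaps(CACHE_VMADDRS))
    print(gaps(PACKED_VMADDRS))
```

Because the deltas differ per segment, this is not equivalent to an ordinary slide, where every segment moves by the same amount.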
To my surprise, this worked, and the data was read correctly. However, it brings up another problem: if I were to do this, all the code in the __text section would break, because the segments would be mapped differently in memory. I then did a bit more digging and found that when the dyld cache is built, the builder adjusts the code for exactly this kind of change (Source code). Unfortunately, to do so it relies on LC_SEGMENT_SPLIT_INFO, which is not present in the dylib.
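To illustrate why the code breaks, here is a minimal Python sketch of a PC-relative reference from __TEXT into __DATA. It assumes the reference encodes the distance between the instruction and its target; the segment addresses come from the listings above, while the 0x100 and 0x20 offsets and the function names are hypothetical.

```python
def pc_relative_distance(source_addr, target_addr):
    """Distance a PC-relative reference has to encode."""
    return target_addr - source_addr


def distance_after_shift(source_addr, target_addr, source_delta, target_delta):
    """Real distance once each segment has moved by its own delta."""
    return (target_addr + target_delta) - (source_addr + source_delta)


# Segment start addresses before and after the change (from the listings)
TEXT_OLD, TEXT_NEW = 0x1A5747000, 0x5B000
DATA_OLD, DATA_NEW = 0x1CB44F160, 0x70000

# Hypothetical instruction 0x100 bytes into __TEXT that refers to
# a variable 0x20 bytes into __DATA.
insn = TEXT_OLD + 0x100
target = DATA_OLD + 0x20

encoded = pc_relative_distance(insn, target)

# An ordinary slide moves both segments equally: the encoding stays valid.
slide = 0x4000
assert distance_after_shift(insn, target, slide, slide) == encoded

# Moving the segments independently changes the real distance.
actual = distance_after_shift(
    insn, target, TEXT_NEW - TEXT_OLD, DATA_NEW - DATA_OLD
)

# Amount by which the encoded value would have to be patched.
fixup = actual - encoded

if __name__ == "__main__":
    print(f"encoded distance: {encoded:#x}")
    print(f"actual distance:  {actual:#x}")
    print(f"required fixup:   {fixup:#x}")
```

The required fixup depends only on the difference between the two segment deltas, but applying it means knowing where every such reference sits in the code, which is the information the split info would normally provide.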
Conclusion
I believe that shifting the segments has the best chance of working, but I'm stuck without the segment split info. If you have any ideas, I would love to hear them. Thanks!
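For anyone who wants to check an extracted image for the missing load command, here is a minimal Python sketch that walks the load commands and looks for LC_SEGMENT_SPLIT_INFO. It assumes a thin, 64-bit, little-endian Mach-O file; the constants are taken from mach-o/loader.h, and the function names are made up for illustration.

```python
import struct

MH_MAGIC_64 = 0xFEEDFACF
LC_SEGMENT_SPLIT_INFO = 0x1E

# mach_header_64: magic, cputype, cpusubtype, filetype,
#                 ncmds, sizeofcmds, flags, reserved
MACH_HEADER_64 = struct.Struct("<IiiIIIII")
# Every load command starts with: cmd, cmdsize
LOAD_COMMAND = struct.Struct("<II")


def iter_load_commands(data):
    """Yield (cmd, file_offset, cmdsize) for each load command."""
    magic, _cpu, _sub, _ftype, ncmds, sizeofcmds, _flags, _rsv = (
        MACH_HEADER_64.unpack_from(data, 0)
    )
    if magic != MH_MAGIC_64:
        raise ValueError("not a thin 64-bit little-endian Mach-O")
    offset = MACH_HEADER_64.size
    end = min(offset + sizeofcmds, len(data))
    for _ in range(ncmds):
        if offset + LOAD_COMMAND.size > end:
            raise ValueError("load commands are truncated")
        cmd, cmdsize = LOAD_COMMAND.unpack_from(data, offset)
        if cmdsize < LOAD_COMMAND.size:
            raise ValueError("malformed load command size")
        yield cmd, offset, cmdsize
        offset += cmdsize


def has_split_info(data):
    """Return True if the image carries LC_SEGMENT_SPLIT_INFO."""
    return any(
        cmd == LC_SEGMENT_SPLIT_INFO for cmd, _, _ in iter_load_commands(data)
    )
```

It could be used as `has_split_info(open(path, "rb").read())`, where `path` points at the extracted dylib.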