I developed a procedure that receives a TStream; but the basic type, to allow the sending of all the types of stream heirs.
This procedure is intended to create one thread to each core, or multiple threads. Each thread will perform detailed analysis of stream data (read-only), and as Pascal classes are assigned by reference, and never by value, there will be a collision of threads, since the reading position is intercalará.
To fix this, I want the procedure do all the work to double the last TStream in memory, allocating it a new variable. This way I can duplicate the TStream in sufficient numbers so that each thread has a unique TStream. After the end of the very thread library memory.
Note: the procedure is within a DLL, the thread works.
Note 2: The goal is that the procedure to do all the necessary service, ie without the intervention of code that calls; You could easily pass an Array of TStream, rather than just a TStream. But I do not want it! The aim is that the service is provided entirely by the procedure.
Do you have any idea how to do this?
Thank you.
Addition:
I had a low-level idea, but my knowledge in Pascal is limited.
- Identify the object's address in memory, and its size.
- create a new address in memory with the same size as the original object.
- copy the entire contents (raw) object to this new address.
- I create a pointer to TStream that point to this new address in memory.
This would work, or is stupid?? If yes, how to operate? Example Please!
2º Addition:
Just as an example, suppose the program perform brute force attacks on encrypted streams (just an example, because it is not applicable):
Scene: A 30GB file in a CPU with 8 cores:
1º - TMemoryStream:
Create 8 TMemoryStream and copy the entire contents of the file for each of TMemoryStreams. This will result in 240GB RAM in use simultaneously. I consider this broken idea. In addition it would increase the processing time to the point of fastest not use multithreading. I would have to read the entire file into memory, and then loaded, begin to analyze it. Broke!
* A bad alternative to TMemoryStream is to copy the file slowly to TMemoryStream in lots of 100MB / core (800MB), not to occupy the memory. So each thread looks only 100MB, frees the memory until you complete the entire file. But the problem is that it would require Synchronize() function in DLL, which we know does not work out as I open question in Synchronize () DLL freezes without errors and crashes
2º - TFileStream:
This is worse in my opinion. See, I get a TStream, create 8 TFileStream and copy all the 30GB for each TFileStream. That sucks because occupy 240GB on disk, which is a high value, even to HDD. The read and write time (copy) in HD will make the implementation of multithreaded turns out to be more time consuming than a single thread. Broke!
Conclusion: The two approaches above require use synchronize() to queue each thread to read the file. Therefore, the threads are not operating simultaneously, even on a multicore CPU. I know that even if he could simultaneous access to the file (directly creating several TFileStream), the operating system still enfileiraria threads to read the file one at a time, because the HDD is not truly thread-safe, he can not read two data at the same time . This is a physical limitation of the HDD! However, the queuing management of OS is much more effective and decrease the latent bottleneck efficiently, unlike if I implement manually synchronize(). This justifies my idea to clone TStream, would leave with S.O. all the working to manage file access queue; without any intervention - and I know he will do it better than me.
Example
In the above example, I want 8 Threads analyze differently and simultaneously the same Stream, knowing that the threads do not know what kind of Stream provided, it can be a file Stream, a stream from the Internet, or even a small TStringStream . The main program will create only one Strean, and will with configuration parameters. A simple example:
TModeForceBrute = (M1, M2, M3, M4, M5...)
TModesFB = set of TModeForceBrute;
TService = record
stream: TStream;
modes: array of TModesFB;
end;
For example, it should be possible to analyze only the Stream M1, M2 only, or both [M1, M2]. The TModesFB composition changes the way the stream is analyzed. Each item in the array "modes", which functions as a task list, will be processed by a different thread. An example of a task list (JSON representation):
{
Stream: MyTstream,
modes: [
[M1, m5],
[M1],
[M5, m2],
[M5, m2, m4, m3],
[M1, m1, m3]
]
}
Note: In analyzer [m1] + [m2] <> [m1, m2].
In Program:
function analysis(Task: TService; maxCores: integer): TMyResultType; external 'mydll.dll';
In DLL:
// Basic, simple and fasted Exemple! May contain syntax errors or logical.
function analysis(Task: TService; maxCores: integer): TMyResultType;
var
i, processors : integer;
begin
processors := getCPUCount();
if (maxCores < processors) and (maxCores > 0) then
processors := maxCores;
setlength (globalThreads, processors);
for i := 0 to processors - 1 do
// It is obvious that the counter modes in the original is not the same counter processors.
if i < length(Task.modes) then begin
globalThreads[i] := TAnalusysThread.create(true, Task.stream, Task.modes[i])
globalThreads[i].start();
end;
[...]
end;
Note: With a single thread the program works beautifully, with no known errors.
I want each thread to take care of a type of analysis, and I can not use Synchronize() in DLL. Understand? There is adequate and clean solution?
Cloning a stream is code like this:
However doing things over DLL borders is hard, since the DLL has an own copy of libraries and library state. This is currently not recommended.