I have a huge text file that I need to split into several files. The text file contains an identifier that marks where each split should happen. Here is what part of the text file looks like:
Comp MOFVersion 10.1
Copyright 1997-2006. All rights reserved.
--------------------------------------------------
Mon 11/19/2022 8:34:22.35 - Starting The Process...
--------------------------------------------------
There are a lot of content here
...
exit
---------------------
list volume
list partition
exit
---------------------
Volume 0 is the selected volume.
Disk ### Status Size Free Dyn Gpt
-------- ------------- ------- ------- --- ---
* Disk 0 Online 238 GB 136 GB *
--------------------------------------------------
Tue 11/20/2022 8:34:22.35 - Starting The Process...
--------------------------------------------------
There are a lot of content here
....
SERVICE_NAME: vds
TYPE : 10 WIN32_OWN_PROCESS
STATE : 1 STOPPED
WIN32_EXIT_CODE : 0 (0x0)
SERVICE_EXIT_CODE : 0 (0x0)
CHECKPOINT : 0x0
WAIT_HINT : 0x0
---------------------
*exit /b 0
File not found - *.*
0 File(s) copied
--------------------------------------------------
Wed 11/21/2022 8:34:22.35 - Starting The Process...
--------------------------------------------------
There are a lot of content here
==========================================
Computer: .
==========================================
Active: True
DmiRevision: 0
list disk
exit
---------------------
*exit /b 0
11/19/2021 08:34 AM <DIR> .
11/19/2021 08:34 AM <DIR> ..
11/19/2021 08:34 AM 0 SL
1 File(s) 0 bytes
2 Dir(s) 80,160,923,648 bytes free
I expect the file to be split wherever the string "Starting The Process" appears. So for a text file like the example above, the result would be 3 files, each with different content. For example:
file1
--------------------------------------------------
Mon 11/19/2022 8:34:22.35 - Starting The Process...
--------------------------------------------------
There are a lot of content here
...
exit
---------------------
list volume
list partition
exit
---------------------
Volume 0 is the selected volume.
Disk ### Status Size Free Dyn Gpt
-------- ------------- ------- ------- --- ---
* Disk 0 Online 238 GB 136 GB *
file2
--------------------------------------------------
Tue 11/20/2022 8:34:22.35 - Starting The Process...
--------------------------------------------------
There are a lot of content here
....
SERVICE_NAME: vds
TYPE : 10 WIN32_OWN_PROCESS
STATE : 1 STOPPED
WIN32_EXIT_CODE : 0 (0x0)
SERVICE_EXIT_CODE : 0 (0x0)
CHECKPOINT : 0x0
WAIT_HINT : 0x0
---------------------
*exit /b 0
File not found - *.*
0 File(s) copied
file3
--------------------------------------------------
Wed 11/21/2022 8:34:22.35 - Starting The Process...
--------------------------------------------------
There are a lot of content here
==========================================
Computer: .
==========================================
Active: True
DmiRevision: 0
list disk
exit
---------------------
*exit /b 0
11/19/2021 08:34 AM <DIR> .
11/19/2021 08:34 AM <DIR> ..
11/19/2021 08:34 AM 0 SL
1 File(s) 0 bytes
2 Dir(s) 80,160,923,648 bytes free
Here is what I've tried:
logfile = "E:/DATA/result.txt"
with open(logfile, 'r') as text_file:
    lines = text_file.readlines()
    for line in lines:
        if "Starting The Process..." in line:
            print(line)
I am only able to find the lines that contain the string, but I don't know how to collect the content that follows each one, split it into 3 parts, and write each part to a new file.
Is it possible to do it in Python? Thank you for any advice.
Well, if the file is small enough to fit comfortably into memory (say, 1 GB or less), you could read the entire file into a string and then use re.findall:
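The code that originally followed is not available here, so what follows is only a minimal sketch of one way the re.findall idea could work, not necessarily the pattern the answer used. It assumes each section starts with a line made up only of dashes, followed by a line containing "Starting The Process" and another dashed line, as in the sample log. The names SECTION_RE, split_sections, and write_sections, and the output naming scheme (file1.txt, file2.txt, ...), are made up for illustration.

```python
import re
from pathlib import Path

# One section = dashed rule, marker line, dashed rule, then everything up to
# (but not including) the next such header, or the end of the text.
# Assumption: headers look like the ones in the sample log.
SECTION_RE = re.compile(
    r"^-+\n"                                 # dashed rule above the marker
    r"[^\n]*Starting The Process[^\n]*\n"    # the marker line itself
    r"-+\n"                                  # dashed rule below the marker
    r".*?"                                   # section body, as short as possible
    r"(?=^-+\n[^\n]*Starting The Process|\Z)",
    re.MULTILINE | re.DOTALL,
)


def split_sections(text):
    """Return a list of sections, each starting at its dashed header rule."""
    # The pattern has no capturing groups, so findall returns whole matches.
    return SECTION_RE.findall(text)


def write_sections(text, out_dir, prefix="file"):
    """Write each section to <out_dir>/<prefix><n>.txt and return the paths."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for number, section in enumerate(split_sections(text), start=1):
        path = out_dir / f"{prefix}{number}.txt"
        path.write_text(section)
        paths.append(path)
    return paths
```

With the log from the question, something like write_sections(Path("E:/DATA/result.txt").read_text(), "E:/DATA/parts") should produce one file per "Starting The Process" header (the output folder name here is just an example). Anything before the first header, such as the version and copyright lines, is not part of any match and is therefore dropped, which is consistent with the expected output shown in the question.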