Remove first line using seek in Python


I need to remove the first line of a large file (around 70 GB) using seek in Python. I can't write the data to another file; the line has to be removed from the existing file. Is there a way to do this?

I tried using seek to move the file pointer to the end of the first line, but I'm not sure how to remove the line from there.


Bachagha Mousaab

Unfortunately, there is no way to delete the first line instantly, but you can try this code. It rewrites the contents under the same file name, skipping the first line:

import fileinput

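with fileinput.input(files=('text.txt',), inplace=True) as f:
    for line_number, line in enumerate(f):
        # Skip the first line; every other line is written back unchanged.
        if line_number == 0:
            continue
        # With inplace=True, print() writes to the file instead of the console.
        print(line, end='')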

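The inplace=True argument redirects standard output to the input file, so print() writes into text.txt. Note that fileinput does this by moving the original to a backup file and writing a new file under the original name, so it temporarily needs disk space for a second copy. The backup is deleted when the output file is closed.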

Mark Tolonen

You can memory-map the file so that its contents appear in memory, then move the data starting from the 2nd line to the beginning of the file. Then truncate the file to the new, shorter length.

This likely won't be fast for a 70 GB file, because the shifted data still has to be flushed back to disk. That's just the way files work. It does not, however, require an additional 70 GB of disk space, unlike the usual process of writing a new file and deleting the old one.

import mmap

# Create test file for demonstration (about 50MB)
#
# The quick brown fox jumped over 1 lazy dogs
# The quick brown fox jumped over 2 lazy dogs
# ...
# The quick brown fox jumped over 1,000,000 lazy dogs

with open('test.txt', 'w') as f:
    for i in range(1, 1_000_001):
        print(f'The quick brown fox jumped over {i:,} lazy dogs', file=f)

# Create memory-mapped file, read first line, shift file memory
# starting from offset of the 2nd line back to the beginning of the file.
# This removes the first line.
with open('test.txt', 'r+b') as f:
    with mmap.mmap(f.fileno(), 0) as mm:
        size = mm.size()
        line = mm.readline()
        linelen = len(line)
        mm.move(0, linelen, size - linelen)
        mm.flush()

    # Truncate the file to the shorter length.
    f.truncate(size - linelen)

# Read the first line of the new file.
with open('test.txt') as f:
    print(f.readline())

Output:

The quick brown fox jumped over 2 lazy dogs
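
Since the question asks specifically about seek, the same shift-and-truncate idea can also be sketched without mmap, using plain seek/read/write in fixed-size chunks. This is only a minimal sketch: the function name remove_first_line and the chunk_size parameter are made up for illustration, and it assumes the first line is short enough to be read into memory. Like the mmap version, it modifies the file in place, so an interruption partway through leaves the file corrupted.

```python
def remove_first_line(path, chunk_size=1024 * 1024):
    """Shift everything after the first line to the start of the file, in place.

    Hypothetical helper for illustration; chunk_size is an arbitrary default.
    """
    with open(path, 'r+b') as f:
        f.readline()             # consume the first line, including its newline
        read_pos = f.tell()      # offset where the remaining data begins
        write_pos = 0            # offset where that data should end up

        while True:
            # Read the next chunk from the "old" position...
            f.seek(read_pos)
            chunk = f.read(chunk_size)
            if not chunk:
                break
            # ...and write it back earlier in the file.
            f.seek(write_pos)
            f.write(chunk)
            read_pos += len(chunk)
            write_pos += len(chunk)

        # Cut off the leftover tail, which now duplicates already-moved data.
        f.truncate(write_pos)
```

Calling remove_first_line('test.txt') on the test file above should leave it starting with the "2 lazy dogs" line. Because the write position always trails the read position, each chunk is read before anything overwrites it, and memory use stays bounded by chunk_size rather than by the file size.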