Remove a character followed by whitespace from each line of a string


I am writing a program to edit an RTF file. The RTF file will always come in the same format:

Q     XXXXXXXXXXXX
A     YYYYYYYYYYYY
Q     XXXXXXXXXXXX
A     YYYYYYYYYYYY

I want to remove the Q / A + whitespace and leave just the X's and Y's on each line. My first idea is to split the string into a new string for each line and edit it from there using str.split like so:

private void countLines(String str){
    String[] lines = str.split("\r\n|\r|\n");
    linesInDoc = lines;
}

From here my idea is to take each even array element and strip the Q + whitespace, and take each odd array element and strip the A + whitespace. Is there a better way to do this? Note: the first line sometimes contains a ~6-character alphanumeric code. I think an if statement that checks for 2 non-whitespace chars at the start of the line would solve this.

Here is the rest of the code:

import java.io.File;
import java.io.FileNotFoundException;
import java.io.FileReader;
import java.io.IOException;
import java.io.StringWriter;
import java.io.Writer;

import javax.swing.JEditorPane;
import javax.swing.text.BadLocationException;
import javax.swing.text.EditorKit;


public class StringEditing {
    String[] linesInDoc;

    private String readRTF(File file){
        String documentText = "";
        try{
            JEditorPane p = new JEditorPane();
            p.setContentType("text/rtf");
            EditorKit rtfKit = p.getEditorKitForContentType("text/rtf");
            rtfKit.read(new FileReader(file), p.getDocument(), 0);
            rtfKit = null;  
            EditorKit txtKit = p.getEditorKitForContentType("text/plain");
            Writer writer = new StringWriter();
            txtKit.write(writer, p.getDocument(), 0, p.getDocument().getLength());
            documentText = writer.toString();
        }
        catch( FileNotFoundException e )
        {
            System.out.println( "File not found" );
        }
        catch( IOException e )
        {
            System.out.println( "I/O error" );
        }
        catch( BadLocationException e )
        {
            System.out.println( "Bad document location" );
        }
        return documentText;
    }
    public void editDocument(File file){
        String plaintext = readRTF(file);
        System.out.println(plaintext);
        fixString(plaintext);
        System.out.println(plaintext);
    }

There are 3 best solutions below

0
On

Unless I'm missing something, you could use String.substring(int) like

String lines = "Q     XXXXXXXXXXXX\n" //
        + "A     YYYYYYYYYYYY\n" //
        + "Q     XXXXXXXXXXXX\n" //
        + "A     YYYYYYYYYYYY\n";
for (String line : lines.split("\n")) {
    System.out.println(line.substring(6));
}

Output is

XXXXXXXXXXXX
YYYYYYYYYYYY
XXXXXXXXXXXX
YYYYYYYYYYYY

If your format should be more general, you might prefer

System.out.println(line.substring(1).trim());
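
For the occasional first line with the alphanumeric code, a slightly more defensive variant of the same idea might look like the sketch below. The class and method names (PrefixStripper, stripMarker, stripAll) are made up for illustration, and it assumes that a marker line always starts with Q or A followed directly by whitespace, so something like "AB1234" is left alone.

```java
import java.util.ArrayList;
import java.util.List;

class PrefixStripper {
    // Hypothetical helper: drops a leading Q or A plus the whitespace after it.
    // Any line that does not start with "Q<whitespace>" or "A<whitespace>"
    // (for example an ID such as "AB1234") is returned unchanged.
    static String stripMarker(String line) {
        boolean marked = line.length() >= 2
                && (line.charAt(0) == 'Q' || line.charAt(0) == 'A')
                && Character.isWhitespace(line.charAt(1));
        // trim() removes the padding after the marker (and any trailing whitespace)
        return marked ? line.substring(1).trim() : line;
    }

    // Splits on \r\n, \r or \n, as in the question, and strips every line.
    static List<String> stripAll(String text) {
        List<String> out = new ArrayList<>();
        for (String line : text.split("\r\n|\r|\n")) {
            out.add(stripMarker(line));
        }
        return out;
    }

    public static void main(String[] args) {
        String text = "AB1234\nQ     XXXXXXXXXXXX\nA     YYYYYYYYYYYY\n";
        stripAll(text).forEach(System.out::println);
    }
}
```

Unlike a bare substring(6), this does not throw on lines shorter than six characters, and it does not depend on the lines strictly alternating between Q and A.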
3
On

This is easily done with a regex (assuming 'fileText' holds the whole file's content):

removedPrefix = fileText.replaceAll("(?m)^[AQ][ \\t]+", "");

The regex means: at the start of a line ((?m) makes ^ match at every line start), a Q or an A, followed by one or more spaces or tabs. Each match is replaced with nothing, so the rest of the line and its line ending are left untouched, whether the file uses \r\n or \n. Because whitespace is required right after the letter, the first line with the alphanumeric code is not modified. There are simpler ways if you know the exact number of spaces before the text you need, but this works for any amount of whitespace.

If you process line by line it's

removedPrefix = currentLine.replaceFirst("^[AQ][ \\t]+", "");

As simple as that
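
Here is a small runnable sketch of the whole-file variant, using an anchored pattern so that only a Q or A at the very start of a line is touched. The class name RegexPrefixDemo and the sample text are assumptions for the demo, not something from the question.

```java
import java.util.regex.Pattern;

class RegexPrefixDemo {
    // (?m) makes ^ match at the start of every line, not only at the start of the input.
    // [ \t]+ requires at least one space or tab after the letter, so an ID such as
    // "AB1234" is not treated as a marker.
    private static final Pattern MARKER = Pattern.compile("(?m)^[AQ][ \\t]+");

    static String removeMarkers(String fileText) {
        // Only the marker and its padding are removed; line endings are not touched.
        return MARKER.matcher(fileText).replaceAll("");
    }

    public static void main(String[] args) {
        String fileText = "AB1234\r\nQ     XXXXXXXXXXXX\r\nA     YYYYYYYYYYYY\r\n";
        System.out.print(removeMarkers(fileText));
    }
}
```

Compiling the pattern once is only a small optimisation; String.replaceAll with the same regex should behave the same way.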

0
On

BufferedReader.readLine() handles the line terminators (\n, \r or \r\n) for you. You can use String.matches to validate that the line is in the expected format. If the prefix has a fixed length, simply use substring.

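import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// one marker (Q or A), five spaces, then the text
final String bodyPattern = "[AQ] {5}.+";
List<String> bodies = new ArrayList<>();

try (BufferedReader br = new BufferedReader(new FileReader(fileName))) {
    String line;
    while ((line = br.readLine()) != null) {
        if (line.matches(bodyPattern)) {
            // skip the marker and the five spaces after it
            bodies.add(line.substring(6));
        }
    }
} catch (IOException e) {
    System.out.println("I/O error");
}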

You can adjust the regex pattern to your specific requirements
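
Since the question already has the document as a String (the result of readRTF), the same loop can read from a StringReader instead of a file. The following is only a sketch under that assumption: the names LineFilterDemo and extractBodies are invented here, and the pattern assumes exactly five spaces after the marker, as in the sample.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

class LineFilterDemo {
    // Assumed layout: one marker (Q or A), exactly five spaces, then the text.
    private static final String BODY_PATTERN = "[AQ] {5}.+";

    static List<String> extractBodies(String plaintext) {
        List<String> bodies = new ArrayList<>();
        try (BufferedReader br = new BufferedReader(new StringReader(plaintext))) {
            String line;
            while ((line = br.readLine()) != null) {
                if (line.matches(BODY_PATTERN)) {
                    // marker + five spaces = six characters to skip
                    bodies.add(line.substring(6));
                }
                // lines that do not match (e.g. the ID line) are skipped
            }
        } catch (IOException e) {
            // not expected when reading from a String, but readLine declares it
            throw new UncheckedIOException(e);
        }
        return bodies;
    }

    public static void main(String[] args) {
        String plaintext = "AB1234\nQ     XXXXXXXXXXXX\nA     YYYYYYYYYYYY\n";
        extractBodies(plaintext).forEach(System.out::println);
    }
}
```

Note that this version drops any line that does not match the pattern; if the ID line should be kept, add an else branch that stores the line unchanged.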