I am trying to learn how Lucene works and what is inside a Lucene index. Basically, I want to see how the data is represented inside the index.
I am using lucene-core 8.6.0 as a dependency.
Below is my very basic Lucene code:
    private Document create(File file) throws IOException {
        Document document = new Document();
        Field field = new Field("contents", new FileReader(file), TextField.TYPE_NOT_STORED);
        Field fieldPath = new Field("path", file.getAbsolutePath(), TextField.TYPE_STORED);
        Field fieldName = new Field("name", file.getName(), TextField.TYPE_STORED);
        document.add(field);
        document.add(fieldPath);
        document.add(fieldName);
        //Create analyzer
        Analyzer analyzer = new StandardAnalyzer();
        //Create IndexWriter pass the analyzer
        Path indexPath = Files.createTempDirectory("tempIndex");
        Directory directory = FSDirectory.open(indexPath);
        IndexWriterConfig indexWriterConfig = new IndexWriterConfig(analyzer);
        IndexWriter iwriter = new IndexWriter(directory, indexWriterConfig);
        iwriter.addDocument(document);
        iwriter.close();
        return document;
    }
Note: I understand the concept behind Lucene, the inverted index, but I do not understand how the Lucene library uses this concept or how the index files are laid out so that search is easy and feasible.
I tried Limo, but it was of no use. It just did not work, even though I gave the index location in the web.xml.
 
                        
If the index is large (e.g. hundreds of GBs), Luke sometimes fails to open it. There is a command-line alternative to Luke called
I-Rex. It was developed for research in Information Retrieval. It is available at https://github.com/souravsaha/I-REX/tree/shell-lucene8 and you are welcome to add to or edit the code.
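
Before reaching for any tool, it can also help to simply list the index directory and see which files Lucene wrote. The sketch below is not part of Luke or I-Rex; it is a small stand-alone Java program that uses only the JDK. The class name `IndexFileKinds` and the `describe` helper are hypothetical names made up for this example, and the extension table covers only the commonly documented file types, so the exact set you see may differ with the codec and Lucene version. With a default `IndexWriterConfig`, a small index is usually packed into compound files, so you will likely see only `segments_N`, `write.lock`, and `.si`, `.cfs`, `.cfe` files.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.LinkedHashMap;
import java.util.Map;

public class IndexFileKinds {

    // Extension -> rough meaning. Partial table, assumed from the commonly
    // documented Lucene file formats; newer codecs add or rename some files.
    private static final Map<String, String> KINDS = new LinkedHashMap<>();
    static {
        KINDS.put("si",  "segment info (metadata about one segment)");
        KINDS.put("cfs", "compound file (other segment files packed into one)");
        KINDS.put("cfe", "compound file entry table");
        KINDS.put("fnm", "field infos (field names and index options)");
        KINDS.put("fdx", "stored fields index");
        KINDS.put("fdt", "stored fields data");
        KINDS.put("tim", "term dictionary");
        KINDS.put("tip", "term index (pointers into the .tim file)");
        KINDS.put("doc", "postings: document ids and term frequencies");
        KINDS.put("pos", "postings: term positions");
        KINDS.put("pay", "postings: payloads and offsets");
        KINDS.put("nvd", "norms data");
        KINDS.put("nvm", "norms metadata");
        KINDS.put("dvd", "doc values data");
        KINDS.put("dvm", "doc values metadata");
        KINDS.put("tvx", "term vectors index");
        KINDS.put("tvd", "term vectors data");
        KINDS.put("liv", "live documents (which docs are not deleted)");
    }

    // Returns a short description for one file name found in an index directory.
    public static String describe(String fileName) {
        if (fileName.startsWith("segments_")) {
            return "commit point: lists the segments that make up the index";
        }
        if (fileName.equals("write.lock")) {
            return "write lock that keeps two IndexWriters off the same directory";
        }
        int dot = fileName.lastIndexOf('.');
        String ext = dot < 0 ? "" : fileName.substring(dot + 1);
        return KINDS.getOrDefault(ext, "unknown (codec-specific or not a Lucene file)");
    }

    public static void main(String[] args) throws IOException {
        // First argument: path to the index directory (defaults to the current directory).
        Path indexDir = Paths.get(args.length > 0 ? args[0] : ".");
        try (DirectoryStream<Path> files = Files.newDirectoryStream(indexDir)) {
            for (Path f : files) {
                if (!Files.isRegularFile(f)) {
                    continue; // skip sub-directories
                }
                String name = f.getFileName().toString();
                System.out.printf("%-20s %10d bytes  %s%n", name, Files.size(f), describe(name));
            }
        }
    }
}
```

To try it on the code in the question, print `indexPath` after the writer is closed and pass that path as the first argument. If you want to see the individual files (`.tim`, `.doc`, `.pos` and so on) instead of one `.cfs`, calling `setUseCompoundFile(false)` on the `IndexWriterConfig` before creating the writer should keep them separate.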