Deeplearning4J Java code to predict price of Bitcoin is giving strange values


I put together some code that, given an array containing the full time series of daily Bitcoin closing prices in dollars, should produce a prediction for tomorrow's price. I would expect the result to be close to the current price, but I get values below 20 dollars.

Here's my code:

public static void main(String[] args) {
    
    // Time series array
    double[] data = readCsv();

    // Hyperparameters
    int sequenceLength = 100; // The length of your sequences
    int numHiddenNodes = 256; // The number of hidden nodes
    int numEpochs = 10; // The number of epochs

    // Create input and output arrays
    INDArray input = Nd4j.create(1, 1, data.length - sequenceLength);
    INDArray output = Nd4j.create(1, 1, data.length - sequenceLength);
    for (int i = 0; i < data.length - sequenceLength; i++) {
        input.putScalar(new int[]{0, i % sequenceLength, 0}, data[i]);
        output.putScalar(new int[]{0, 0, i % sequenceLength}, data[i + sequenceLength]);
    }

    // Configure network
    MultiLayerConfiguration conf = new NeuralNetConfiguration.Builder()
        .weightInit(WeightInit.XAVIER)
        .updater(new Nadam())
        .list()
        .layer(0, new LSTM.Builder().nIn(1).nOut(numHiddenNodes)
            .activation(Activation.TANH).build())
        .layer(1, new RnnOutputLayer.Builder(LossFunctions.LossFunction.MSE)
            .activation(Activation.IDENTITY).nIn(numHiddenNodes).nOut(1).build())
        .build();

    MultiLayerNetwork net = new MultiLayerNetwork(conf);
    net.init();
    net.setListeners(new ScoreIterationListener(100));
    
    // Train the network
    for (int epoch = 0; epoch < numEpochs; epoch++) {
        net.fit(input, output);
    }

    // Initialize the input for next prediction with the last sequenceLength number of values from the input INDArray
    INDArray nextInput = Nd4j.create(new double[] {data[data.length - sequenceLength]}, new int[]{1, 1, 1});

    // Use trained model to predict the next value
    INDArray predicted = net.rnnTimeStep(nextInput);

    // Print out predicted value
    System.out.println("Predicted: " + predicted.getDouble(0));
}

Here are my pom dependencies:

<dependency>
    <groupId>org.deeplearning4j</groupId>
    <artifactId>deeplearning4j-core</artifactId>
    <version>1.0.0-M2.1</version>
</dependency>
<dependency>
    <groupId>org.nd4j</groupId>
    <artifactId>nd4j-native-platform</artifactId>
    <version>1.0.0-M2.1</version>
</dependency>
<dependency>
    <groupId>org.apache.commons</groupId>
    <artifactId>commons-csv</artifactId>
    <version>1.8</version>
</dependency>
<dependency>
    <groupId>ch.qos.logback</groupId>
    <artifactId>logback-classic</artifactId>
    <version>1.2.3</version>
</dependency>

What can be wrong? Why doesn't it give meaningful results?

EDIT

Some more information. Here's where I downloaded the time series from: https://uk.investing.com/crypto/bitcoin/historical-data

And here's how I read the csv:

private static double[] readCsv() {
    String csvFile = "/BTCDaily.csv"; // File in the resources folder
    List<Double> priceList = new ArrayList<>();

    try {
        URL url = App.class.getResource(csvFile);
        Reader in = new FileReader(url.toURI().getPath());
        Iterable<CSVRecord> records = CSVFormat.DEFAULT.withFirstRecordAsHeader().parse(in);
        for (CSVRecord record : records) {
            String price = record.get("Price");
            price = price.replace(",", "");  // remove commas
            priceList.add(Double.parseDouble(price));
        }
    } catch (Exception e) {
        e.printStackTrace();
    }

    return reverse(priceList.stream().mapToDouble(Double::doubleValue).toArray());
}

public static double[] reverse(double[] array) {
    int length = array.length;
    double[] reversed = new double[length];
    for (int i = 0; i < length; i++) {
        reversed[i] = array[length - i - 1];
    }
    return reversed;
}

There are 2 best solutions below

Answer by Paul Verest:

Most likely you are reading the CSV values incorrectly, taking 30,448.4 as 30.

To check, log the last value read from the CSV.
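
A minimal sketch of that check, assuming the same comma-stripping approach as the question's readCsv(). The class name PriceParseCheck and the helper parsePrice are hypothetical names used only for illustration; the sample string is the value quoted above.

```java
public class PriceParseCheck {

    // Hypothetical helper: strip thousands separators, then parse,
    // mirroring what readCsv() in the question does for each record.
    static double parsePrice(String raw) {
        return Double.parseDouble(raw.replace(",", "").trim());
    }

    public static void main(String[] args) {
        String raw = "30,448.4"; // sample value quoted in this answer
        double parsed = parsePrice(raw);
        // Log the raw and parsed forms side by side so a mismatch is obvious.
        System.out.println("raw=" + raw + " parsed=" + parsed);
    }
}
```

If the parsed value logged for the last CSV row matches the price in the file (30448.4 here, not 30), the CSV reading is probably not the cause of the small predictions.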

Additionally, I would not be sure about anything when using version 1.0.0-beta7.

Have you checked for the latest version, e.g. at https://mvnrepository.com/artifact/org.deeplearning4j/deeplearning4j-core ?

At a glance, it does not look promising.

Answer by Adam Gibson:

Maintainer here. As Paul mentioned, stick to the latest version when possible. Make sure to normalize your values as well as your labels.

Paul could be right that you're reading your values incorrectly. You would normally use the CSVSequenceRecordReader for that.

Beyond that, you probably need to scale your data. Have you looked at doing that?
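
As a rough illustration of the scaling mentioned above, here is a plain-Java min-max sketch that maps prices into [0, 1] before training and maps a prediction back to dollars afterwards. The class MinMaxScaleSketch, the methods scale and unscale, and the sample prices (other than 30448.4, quoted in the other answer) are assumptions made for this example. DL4J also ships preprocessors such as NormalizerMinMaxScaler that may be a better fit in a real pipeline.

```java
import java.util.Arrays;

public class MinMaxScaleSketch {

    // Map each value into [0, 1] using the given min and max.
    // A constant series (max == min) is mapped to all zeros.
    static double[] scale(double[] values, double min, double max) {
        double range = max - min;
        double[] scaled = new double[values.length];
        for (int i = 0; i < values.length; i++) {
            scaled[i] = range == 0.0 ? 0.0 : (values[i] - min) / range;
        }
        return scaled;
    }

    // Invert the scaling so a network output can be read as a price again.
    static double unscale(double scaled, double min, double max) {
        return scaled * (max - min) + min;
    }

    public static void main(String[] args) {
        double[] prices = {100.0, 20000.0, 30448.4}; // illustrative values only
        double min = Arrays.stream(prices).min().getAsDouble();
        double max = Arrays.stream(prices).max().getAsDouble();

        double[] scaled = scale(prices, min, max);
        System.out.println("scaled: " + Arrays.toString(scaled));
        System.out.println("round trip of last value: " + unscale(scaled[2], min, max));
    }
}
```

The same min and max would need to be applied to both the inputs and the labels, and the value returned by the network would go through unscale before being compared with the current price.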