FileChannel.tryLock works locally but hangs on remote machine, accessed via SSH

72 Views Asked by At

Tl;DR code with FileChannel.tryLock works on a local file but hangs indefinitely when run on a remote machine, I had SSHed into. Most of the length of this question is about showing outputs and debugging attempts, but the essential part is in the first para.

So I have a simple sample code that tries to get a file lock via FileChannel.tryLock. (scroll below for the Java version)

import java.io.RandomAccessFile
import java.util.logging.Logger

val LOGGER_FORMAT ="[%1\$tF %1\$tT] %4$-7s %5\$s %6\$s%n".also { System.setProperty("java.util.logging.SimpleFormatter.format", it) }
val logger = Logger.getGlobal()
fun main(args:Array<String>) {
    val path= if (args.isEmpty())"localFile.txt" else args[0]
    val channel=RandomAccessFile(path,"rw").channel
    logger.info("created channel $channel")
    logger.info("trying for lock on file $path")
    val lock=channel.tryLock()
    logger.info("got this lock $lock")
}

When I execute this code on my local Ubuntu machine against localFile.txt, the code works just fine and gives a lock.

[2023-01-13 14:44:03] INFO    created channel sun.nio.ch.FileChannelImpl@7d4991ad 
[2023-01-13 14:44:04] INFO    trying for lock on file localFile.txt
[2023-01-13 14:44:04] INFO    got this lock: sun.nio.ch.FileLockImpl[0:9223372036854775807 exclusive valid]

I then packaged the code as an executable jar (you need not do so, just run as a .class file if you wish) and ran it on a remote machine I had SSHed into. The file it tried to get a lock on was also a file on that remote machine, namely remoteFile.txt. There the command writes the following and hangs indefinitely.

[2023-01-13 14:40:41] INFO    created channel sun.nio.ch.FileChannelImpl@7d4991ad 
[2023-01-13 14:40:41] INFO    trying for lock on file remoteFile.txt

Why does this happen?

Especially, since tryLock is non-blocking.


My debugging attempts

  1. As a first step to debugging I tried a java port of the code (scroll below) - but it breaks in exactly the same way.

  2. I then checked the file permissions. Here are the ls -l for the files

#localFile.txt
-rw-rw-r-- 1 giraffe giraffe 8 Jan 13 02:22 localFile.txt

#remoteFIle.txt
-rw-r--r-- 1 zebra ox 6 Jan 13 19:37 remoteFile.txt

They are different, so I changed the permissions for the remote to be same as those for the local one, but to no avail - locking still hangs on remote.

  1. Google returned someone facing an issue with tryLock but that's on some particular NFS deployment. I don't know how that ties into all this.

I am confused and mesmerizied by all this. Any and all help is appreciated.


Env:
local machine

#lsb_release -a
Ubuntu 20.04.5 LTS
#java -version
openjdk version "17" 2021-09-14
OpenJDK Runtime Environment (build 17+35-2724)
OpenJDK 64-Bit Server VM (build 17+35-2724, mixed mode, sharing)
#filesystem via cat /etc/fstab
ext4

remote machine

#lsb_release -a
Fedora release 23 (Twenty Three)
#java -version
openjdk version "17" 2021-09-14
OpenJDK Runtime Environment (build 17+35-2724)
OpenJDK 64-Bit Server VM (build 17+35-2724, mixed mode, sharing)
#filesystem via cat /etc/fstab
ext4

SSH command

ssh -X -t [email protected] bash -l

Java version

import java.io.IOException;
import java.io.RandomAccessFile;
import java.util.logging.Logger;

public class Test_Java {
    static {
        System.setProperty("java.util.logging.SimpleFormatter.format", "[%1$tF %1$tT] %4$-7s %5$s %6$s%n");
    }
    private final static Logger logger=Logger.getGlobal();


    public static void main(String[] args) throws IOException {
        var path="";
        if(args==null || args.length==0)
            path="localFile.txt";
        else
            path=args[0];

        try(var file= new RandomAccessFile(path,"rw")){
            var channel=file.getChannel();
            logger.info("created channel "+channel);
            logger.info("trying for lock on file "+path);
            var lock=channel.tryLock();
            logger.info("got this lock: "+lock);
        }
        catch (Exception e){
            e.printStackTrace();
        }
    }
}

Further data requested in comments

  1. ulimit -a on remoteMachine
core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 63929
max locked memory       (kbytes, -l) 64
max memory size         (kbytes, -m) unlimited
open files                      (-n) 1024
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 8192
cpu time               (seconds, -t) unlimited
max user processes              (-u) 63929
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited
  1. thread dump on remoteMachine while the process was hung, using jstack pid
2023-01-14 15:39:16
Full thread dump OpenJDK 64-Bit Server VM (17+35-2724 mixed mode, sharing):

Threads class SMR info:
_java_thread_list=0x00007fddc0001ba0, length=12, elements={
0x00007fde10024840, 0x00007fde1015d100, 0x00007fde1015e5e0, 0x00007fde10163a80,
0x00007fde10164d10, 0x00007fde10166100, 0x00007fde10167ae0, 0x00007fde10169010,
0x00007fde1016a480, 0x00007fde10179aa0, 0x00007fde1017cb00, 0x00007fddc0000bd0
}

"main" #1 prio=5 os_prio=0 cpu=135.19ms elapsed=121.26s tid=0x00007fde10024840 nid=0x3fd7 runnable  [0x00007fde18b9f000]
   java.lang.Thread.State: RUNNABLE
    at sun.nio.ch.FileDispatcherImpl.lock0(java.base@17/Native Method)
    at sun.nio.ch.FileDispatcherImpl.lock(java.base@17/FileDispatcherImpl.java:96)
    at sun.nio.ch.FileChannelImpl.tryLock(java.base@17/FileChannelImpl.java:1304)
    at java.nio.channels.FileChannel.tryLock(java.base@17/FileChannel.java:1191)
    at Test_serc_2_Java.main(Test_serc_2_Java.java:23)

"Reference Handler" #2 daemon prio=10 os_prio=0 cpu=0.19ms elapsed=121.24s tid=0x00007fde1015d100 nid=0x3fde waiting on condition  [0x00007fddf0afd000]
   java.lang.Thread.State: RUNNABLE
    at java.lang.ref.Reference.waitForReferencePendingList(java.base@17/Native Method)
    at java.lang.ref.Reference.processPendingReferences(java.base@17/Reference.java:253)
    at java.lang.ref.Reference$ReferenceHandler.run(java.base@17/Reference.java:215)

"Finalizer" #3 daemon prio=8 os_prio=0 cpu=0.38ms elapsed=121.24s tid=0x00007fde1015e5e0 nid=0x3fdf in Object.wait()  [0x00007fddf09fc000]
   java.lang.Thread.State: WAITING (on object monitor)
    at java.lang.Object.wait(java.base@17/Native Method)
    - waiting on <0x0000000715802f40> (a java.lang.ref.ReferenceQueue$Lock)
    at java.lang.ref.ReferenceQueue.remove(java.base@17/ReferenceQueue.java:155)
    - locked <0x0000000715802f40> (a java.lang.ref.ReferenceQueue$Lock)
    at java.lang.ref.ReferenceQueue.remove(java.base@17/ReferenceQueue.java:176)
    at java.lang.ref.Finalizer$FinalizerThread.run(java.base@17/Finalizer.java:172)

"Signal Dispatcher" #4 daemon prio=9 os_prio=0 cpu=0.43ms elapsed=121.24s tid=0x00007fde10163a80 nid=0x3fe0 waiting on condition  [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"Service Thread" #5 daemon prio=9 os_prio=0 cpu=0.16ms elapsed=121.24s tid=0x00007fde10164d10 nid=0x3fe1 runnable  [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"Monitor Deflation Thread" #6 daemon prio=9 os_prio=0 cpu=1.56ms elapsed=121.24s tid=0x00007fde10166100 nid=0x3fe2 runnable  [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"C2 CompilerThread0" #7 daemon prio=9 os_prio=0 cpu=33.90ms elapsed=121.24s tid=0x00007fde10167ae0 nid=0x3fe3 waiting on condition  [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE
   No compile task

"C1 CompilerThread0" #9 daemon prio=9 os_prio=0 cpu=59.73ms elapsed=121.24s tid=0x00007fde10169010 nid=0x3fe4 waiting on condition  [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE
   No compile task

"Sweeper thread" #10 daemon prio=9 os_prio=0 cpu=0.18ms elapsed=121.24s tid=0x00007fde1016a480 nid=0x3fe5 runnable  [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"Notification Thread" #11 daemon prio=9 os_prio=0 cpu=0.10ms elapsed=121.23s tid=0x00007fde10179aa0 nid=0x3fe6 runnable  [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"Common-Cleaner" #12 daemon prio=8 os_prio=0 cpu=0.27ms elapsed=121.23s tid=0x00007fde1017cb00 nid=0x3fe8 in Object.wait()  [0x00007fdde1ffe000]
   java.lang.Thread.State: TIMED_WAITING (on object monitor)
    at java.lang.Object.wait(java.base@17/Native Method)
    - waiting on <0x0000000715817018> (a java.lang.ref.ReferenceQueue$Lock)
    at java.lang.ref.ReferenceQueue.remove(java.base@17/ReferenceQueue.java:155)
    - locked <0x0000000715817018> (a java.lang.ref.ReferenceQueue$Lock)
    at jdk.internal.ref.CleanerImpl.run(java.base@17/CleanerImpl.java:140)
    at java.lang.Thread.run(java.base@17/Thread.java:833)
    at jdk.internal.misc.InnocuousThread.run(java.base@17/InnocuousThread.java:162)

"Attach Listener" #14 daemon prio=9 os_prio=0 cpu=0.57ms elapsed=72.87s tid=0x00007fddc0000bd0 nid=0x3fff waiting on condition  [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"VM Thread" os_prio=0 cpu=2.32ms elapsed=121.25s tid=0x00007fde10159340 nid=0x3fdd runnable  

"GC Thread#0" os_prio=0 cpu=0.26ms elapsed=121.26s tid=0x00007fde10072e50 nid=0x3fd8 runnable  

"G1 Main Marker" os_prio=0 cpu=0.16ms elapsed=121.26s tid=0x00007fde10083e40 nid=0x3fd9 runnable  

"G1 Conc#0" os_prio=0 cpu=0.09ms elapsed=121.26s tid=0x00007fde10084da0 nid=0x3fda runnable  

"G1 Refine#0" os_prio=0 cpu=0.08ms elapsed=121.26s tid=0x00007fde101299d0 nid=0x3fdb runnable  

"G1 Service" os_prio=0 cpu=8.82ms elapsed=121.26s tid=0x00007fde1012a8c0 nid=0x3fdc runnable  

"VM Periodic Task Thread" os_prio=0 cpu=31.31ms elapsed=121.23s tid=0x00007fde1017b3e0 nid=0x3fe7 waiting on condition  

JNI global refs: 8, weak refs: 0


0

There are 0 best solutions below