Tl;DR code with FileChannel.tryLock works on a local file but hangs indefinitely when run on a remote machine, I had SSHed into. Most of the length of this question is about showing outputs and debugging attempts, but the essential part is in the first para.
So I have a simple sample code that tries to get a file lock via FileChannel.tryLock. (scroll below for the Java version)
import java.io.RandomAccessFile
import java.util.logging.Logger
val LOGGER_FORMAT ="[%1\$tF %1\$tT] %4$-7s %5\$s %6\$s%n".also { System.setProperty("java.util.logging.SimpleFormatter.format", it) }
val logger = Logger.getGlobal()
fun main(args:Array<String>) {
val path= if (args.isEmpty())"localFile.txt" else args[0]
val channel=RandomAccessFile(path,"rw").channel
logger.info("created channel $channel")
logger.info("trying for lock on file $path")
val lock=channel.tryLock()
logger.info("got this lock $lock")
}
When I execute this code on my local Ubuntu machine against localFile.txt, the code works just fine and gives a lock.
[2023-01-13 14:44:03] INFO created channel sun.nio.ch.FileChannelImpl@7d4991ad
[2023-01-13 14:44:04] INFO trying for lock on file localFile.txt
[2023-01-13 14:44:04] INFO got this lock: sun.nio.ch.FileLockImpl[0:9223372036854775807 exclusive valid]
I then packaged the code as an executable jar (you need not do so, just run as a .class file if you wish) and ran it on a remote machine I had SSHed into. The file it tried to get a lock on was also a file on that remote machine, namely remoteFile.txt. There the command writes the following and hangs indefinitely.
[2023-01-13 14:40:41] INFO created channel sun.nio.ch.FileChannelImpl@7d4991ad
[2023-01-13 14:40:41] INFO trying for lock on file remoteFile.txt
Why does this happen?
Especially, since tryLock is non-blocking.
My debugging attempts
As a first step to debugging I tried a java port of the code (scroll below) - but it breaks in exactly the same way.
I then checked the file permissions. Here are the
ls -lfor the files
#localFile.txt
-rw-rw-r-- 1 giraffe giraffe 8 Jan 13 02:22 localFile.txt
#remoteFIle.txt
-rw-r--r-- 1 zebra ox 6 Jan 13 19:37 remoteFile.txt
They are different, so I changed the permissions for the remote to be same as those for the local one, but to no avail - locking still hangs on remote.
- Google returned someone facing an issue with
tryLockbut that's on some particular NFS deployment. I don't know how that ties into all this.
I am confused and mesmerizied by all this. Any and all help is appreciated.
Env:
local machine
#lsb_release -a
Ubuntu 20.04.5 LTS
#java -version
openjdk version "17" 2021-09-14
OpenJDK Runtime Environment (build 17+35-2724)
OpenJDK 64-Bit Server VM (build 17+35-2724, mixed mode, sharing)
#filesystem via cat /etc/fstab
ext4
remote machine
#lsb_release -a
Fedora release 23 (Twenty Three)
#java -version
openjdk version "17" 2021-09-14
OpenJDK Runtime Environment (build 17+35-2724)
OpenJDK 64-Bit Server VM (build 17+35-2724, mixed mode, sharing)
#filesystem via cat /etc/fstab
ext4
SSH command
ssh -X -t [email protected] bash -l
Java version
import java.io.IOException;
import java.io.RandomAccessFile;
import java.util.logging.Logger;
public class Test_Java {
static {
System.setProperty("java.util.logging.SimpleFormatter.format", "[%1$tF %1$tT] %4$-7s %5$s %6$s%n");
}
private final static Logger logger=Logger.getGlobal();
public static void main(String[] args) throws IOException {
var path="";
if(args==null || args.length==0)
path="localFile.txt";
else
path=args[0];
try(var file= new RandomAccessFile(path,"rw")){
var channel=file.getChannel();
logger.info("created channel "+channel);
logger.info("trying for lock on file "+path);
var lock=channel.tryLock();
logger.info("got this lock: "+lock);
}
catch (Exception e){
e.printStackTrace();
}
}
}
Further data requested in comments
ulimit -aon remoteMachine
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 63929
max locked memory (kbytes, -l) 64
max memory size (kbytes, -m) unlimited
open files (-n) 1024
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 8192
cpu time (seconds, -t) unlimited
max user processes (-u) 63929
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
- thread dump on remoteMachine while the process was hung, using
jstack pid
2023-01-14 15:39:16
Full thread dump OpenJDK 64-Bit Server VM (17+35-2724 mixed mode, sharing):
Threads class SMR info:
_java_thread_list=0x00007fddc0001ba0, length=12, elements={
0x00007fde10024840, 0x00007fde1015d100, 0x00007fde1015e5e0, 0x00007fde10163a80,
0x00007fde10164d10, 0x00007fde10166100, 0x00007fde10167ae0, 0x00007fde10169010,
0x00007fde1016a480, 0x00007fde10179aa0, 0x00007fde1017cb00, 0x00007fddc0000bd0
}
"main" #1 prio=5 os_prio=0 cpu=135.19ms elapsed=121.26s tid=0x00007fde10024840 nid=0x3fd7 runnable [0x00007fde18b9f000]
java.lang.Thread.State: RUNNABLE
at sun.nio.ch.FileDispatcherImpl.lock0(java.base@17/Native Method)
at sun.nio.ch.FileDispatcherImpl.lock(java.base@17/FileDispatcherImpl.java:96)
at sun.nio.ch.FileChannelImpl.tryLock(java.base@17/FileChannelImpl.java:1304)
at java.nio.channels.FileChannel.tryLock(java.base@17/FileChannel.java:1191)
at Test_serc_2_Java.main(Test_serc_2_Java.java:23)
"Reference Handler" #2 daemon prio=10 os_prio=0 cpu=0.19ms elapsed=121.24s tid=0x00007fde1015d100 nid=0x3fde waiting on condition [0x00007fddf0afd000]
java.lang.Thread.State: RUNNABLE
at java.lang.ref.Reference.waitForReferencePendingList(java.base@17/Native Method)
at java.lang.ref.Reference.processPendingReferences(java.base@17/Reference.java:253)
at java.lang.ref.Reference$ReferenceHandler.run(java.base@17/Reference.java:215)
"Finalizer" #3 daemon prio=8 os_prio=0 cpu=0.38ms elapsed=121.24s tid=0x00007fde1015e5e0 nid=0x3fdf in Object.wait() [0x00007fddf09fc000]
java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(java.base@17/Native Method)
- waiting on <0x0000000715802f40> (a java.lang.ref.ReferenceQueue$Lock)
at java.lang.ref.ReferenceQueue.remove(java.base@17/ReferenceQueue.java:155)
- locked <0x0000000715802f40> (a java.lang.ref.ReferenceQueue$Lock)
at java.lang.ref.ReferenceQueue.remove(java.base@17/ReferenceQueue.java:176)
at java.lang.ref.Finalizer$FinalizerThread.run(java.base@17/Finalizer.java:172)
"Signal Dispatcher" #4 daemon prio=9 os_prio=0 cpu=0.43ms elapsed=121.24s tid=0x00007fde10163a80 nid=0x3fe0 waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"Service Thread" #5 daemon prio=9 os_prio=0 cpu=0.16ms elapsed=121.24s tid=0x00007fde10164d10 nid=0x3fe1 runnable [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"Monitor Deflation Thread" #6 daemon prio=9 os_prio=0 cpu=1.56ms elapsed=121.24s tid=0x00007fde10166100 nid=0x3fe2 runnable [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"C2 CompilerThread0" #7 daemon prio=9 os_prio=0 cpu=33.90ms elapsed=121.24s tid=0x00007fde10167ae0 nid=0x3fe3 waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
No compile task
"C1 CompilerThread0" #9 daemon prio=9 os_prio=0 cpu=59.73ms elapsed=121.24s tid=0x00007fde10169010 nid=0x3fe4 waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
No compile task
"Sweeper thread" #10 daemon prio=9 os_prio=0 cpu=0.18ms elapsed=121.24s tid=0x00007fde1016a480 nid=0x3fe5 runnable [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"Notification Thread" #11 daemon prio=9 os_prio=0 cpu=0.10ms elapsed=121.23s tid=0x00007fde10179aa0 nid=0x3fe6 runnable [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"Common-Cleaner" #12 daemon prio=8 os_prio=0 cpu=0.27ms elapsed=121.23s tid=0x00007fde1017cb00 nid=0x3fe8 in Object.wait() [0x00007fdde1ffe000]
java.lang.Thread.State: TIMED_WAITING (on object monitor)
at java.lang.Object.wait(java.base@17/Native Method)
- waiting on <0x0000000715817018> (a java.lang.ref.ReferenceQueue$Lock)
at java.lang.ref.ReferenceQueue.remove(java.base@17/ReferenceQueue.java:155)
- locked <0x0000000715817018> (a java.lang.ref.ReferenceQueue$Lock)
at jdk.internal.ref.CleanerImpl.run(java.base@17/CleanerImpl.java:140)
at java.lang.Thread.run(java.base@17/Thread.java:833)
at jdk.internal.misc.InnocuousThread.run(java.base@17/InnocuousThread.java:162)
"Attach Listener" #14 daemon prio=9 os_prio=0 cpu=0.57ms elapsed=72.87s tid=0x00007fddc0000bd0 nid=0x3fff waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"VM Thread" os_prio=0 cpu=2.32ms elapsed=121.25s tid=0x00007fde10159340 nid=0x3fdd runnable
"GC Thread#0" os_prio=0 cpu=0.26ms elapsed=121.26s tid=0x00007fde10072e50 nid=0x3fd8 runnable
"G1 Main Marker" os_prio=0 cpu=0.16ms elapsed=121.26s tid=0x00007fde10083e40 nid=0x3fd9 runnable
"G1 Conc#0" os_prio=0 cpu=0.09ms elapsed=121.26s tid=0x00007fde10084da0 nid=0x3fda runnable
"G1 Refine#0" os_prio=0 cpu=0.08ms elapsed=121.26s tid=0x00007fde101299d0 nid=0x3fdb runnable
"G1 Service" os_prio=0 cpu=8.82ms elapsed=121.26s tid=0x00007fde1012a8c0 nid=0x3fdc runnable
"VM Periodic Task Thread" os_prio=0 cpu=31.31ms elapsed=121.23s tid=0x00007fde1017b3e0 nid=0x3fe7 waiting on condition
JNI global refs: 8, weak refs: 0