After investigating, I found out that the most performant way to read the files in a directory and its sub-directories is the Files.walk() method.

private void selectPrimaryFolder(ResourceBundle resourceBundle, ListView<String> lvFileList) {
        DirectoryChooser chooser = new DirectoryChooser();
        chooser.setTitle(resourceBundle.getString("selectPrimaryFolder"));

        File selectedDirectory = chooser.showDialog((Stage) btnSelectPrimaryFolder.getScene().getWindow());

        System.out.println("SELECTED DIRECTORY: " + selectedDirectory.toPath());

        // TODO
        try (Stream<Path> stream = Files.walk(selectedDirectory.toPath()).sorted()) {
            stream.forEach(path -> addFileToList(path.toFile(), lvFileList));
        } catch (IOException ioException) {
            // Print the failure instead of silently discarding the message
            ioException.printStackTrace();
        }

        /*System.out.println(listFileTree(selectedDirectory));
        for (File f: listFileTree(selectedDirectory)){
            lvFileList.getItems().add(f.getName());
        }*/


    }

private static void addFileToList(File file, ListView<String> lvFileList) {
        if (file.isDirectory()) {
            System.out.println("Directory: " + file.getAbsolutePath());
        } else {
            System.out.println("File: " + file.getAbsolutePath());
            lvFileList.getItems().add(file.getName());
        }
    }

public static Collection<File> listFileTree(File dir) {
        Set<File> fileTree = new HashSet<File>();
        if(dir==null||dir.listFiles()==null){
            return fileTree;
        }
        for (File entry : dir.listFiles()) {
            if (entry.isFile()) fileTree.add(entry);
            else fileTree.addAll(listFileTree(entry));
        }
        return fileTree;
    }

Unfortunately, in the IntelliJ console, for any Cyrillic characters, it outputs ????, while in JavaFX's ListView, I get �����.

The listFileTree() method, which I found here (Recursively list files in Java), is another way I tried to get the list of files, but weirdly enough, it straight up ignores files that contain any Cyrillic characters.

According to a post here (both File.isFile() and File.isDirectory() is returning false), isFile() might be having an issue with the encoding.

The Encoding

From what I found, this might be an encoding issue.

I checked IntelliJ's file encoding settings and they're all set to UTF-8 (it's also set to create UTF-8 files with no BOM), including the console's default encoding.

I've set both of these in the VMOptions:

-Dconsole.encoding=UTF-8
-Dfile.encoding=UTF-8
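
To see which settings the running JVM actually picked up, a small diagnostic along the following lines may help. It is only a sketch: the class name EncodingReport is made up for illustration, and sun.jnu.encoding is an implementation-specific property of OpenJDK-based JVMs (it may be null elsewhere). On those JVMs, file names appear to be decoded according to the process locale rather than file.encoding, so printing both next to the LANG and LC_ALL environment variables shows whether the IDE launched the program with the locale you expect.

```java
import java.nio.charset.Charset;

class EncodingReport {
    /** Collects the charset-related settings the running JVM actually sees. */
    static String report() {
        StringBuilder sb = new StringBuilder();
        sb.append("default charset : ").append(Charset.defaultCharset()).append('\n');
        sb.append("file.encoding   : ").append(System.getProperty("file.encoding")).append('\n');
        // Assumption: sun.jnu.encoding is implementation-specific (OpenJDK); it may be null on other JVMs.
        sb.append("sun.jnu.encoding: ").append(System.getProperty("sun.jnu.encoding")).append('\n');
        // Environment variables are null when they are not set for this process.
        sb.append("LANG            : ").append(System.getenv("LANG")).append('\n');
        sb.append("LC_ALL          : ").append(System.getenv("LC_ALL")).append('\n');
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(report());
    }
}
```

Running it once from the IDE and once from a plain terminal makes any difference between the two environments visible.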

Byte vs Character Streams

Something interesting I ran into is that if I use FileChooser and select a single file, the characters are properly displayed in the ListView (albeit in the console they still show as question marks).

As per the documentation, Files.walk() returns a stream (of Path), which I'm assuming is done via a single-byte stream [See EDIT 4] (I admit, I got a bit lost in the documentation), because according to Ted Hopp's comment in Byte Stream vs Character Stream in Java, it should be able to read the Cyrillic if it were a "Character Stream" (assuming the original file is encoded in UTF-8, of course).

...To test this, try a file that contains something that requires more than one byte to represent (such as Greek, Cyrillic, or Arabic characters). With a byte-oriented stream, these won't work. With a character-oriented stream, the characters will be preserved as long as both the streams are using encodings that supports those characters (such as UTF-8) and the input file was stored in the encoding used for the input stream...

Figuring out the original file encoding

I thought I'd just look for a way to detect the encoding and set up a condition, but after reading many, many, many answers, it turns out it's not something that can be done. The best option one has is to try and "guess" it. Which I did.

Using new String(bytes, charset) / String.getBytes() mentioned in Encoding conversion in java, I tested the most common ones, as per this question (What is the most common encoding of each language?), via Files.walk().

NONE OF THEM WORKED! Every single one resulted in question marks, in the console and in the ListView.
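
One possible reading of the two different symptoms (???? in the console, ����� in the ListView) is that an ASCII-only charset is involved somewhere. The sketch below is not taken from the project; the class and method names are hypothetical. It only demonstrates two documented behaviours of String: encoding unmappable characters to US-ASCII yields '?', while decoding UTF-8 bytes as US-ASCII yields one U+FFFD replacement character per non-ASCII byte.

```java
import java.nio.charset.StandardCharsets;

class CharsetSymptoms {
    /** Encodes text to ASCII bytes and decodes it again: unmappable characters become '?'. */
    static String encodeThroughAscii(String text) {
        byte[] ascii = text.getBytes(StandardCharsets.US_ASCII);
        return new String(ascii, StandardCharsets.US_ASCII);
    }

    /** Decodes the UTF-8 bytes of text as ASCII: every non-ASCII byte becomes U+FFFD. */
    static String decodeUtf8AsAscii(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        // A four-letter Cyrillic name written with Unicode escapes,
        // so the encoding of this source file does not matter.
        String name = "\u041E\u0434\u0438\u043D.mp3";
        System.out.println(encodeThroughAscii(name)); // four question marks, then .mp3
        System.out.println(decodeUtf8AsAscii(name));  // eight replacement characters, then .mp3
    }
}
```

Once either substitution has happened, the original characters are gone, which would explain why re-decoding the result with other charsets cannot bring them back.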

I thought my files might be bugged but VLC doesn't seem to have any issues with encoding. All the text is presented as it should.

Question

How do I go about this? Is there a recursive character stream alternative to Files.walk()? Have I missed something else?

P.S. I have not tried any non-recursive solutions as recursion is a requirement.

P.S. 2. I've also tried the ones mentioned in io-recurse-tests, but alas, nothing.
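
For reference, Files.walk() yields Path objects rather than raw bytes, and each name is available as a Java String through getFileName().toString(), so there is no separate "character stream" variant to look for. A minimal sketch of a recursive listing that stays with Path throughout (the class and method names are hypothetical, and this is not a fix for the display problem by itself):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class WalkNames {
    /** Returns the names of all regular files under root, recursively, sorted by path. */
    static List<String> listFileNames(Path root) throws IOException {
        // try-with-resources closes the directory handles held by the stream
        try (Stream<Path> paths = Files.walk(root)) {
            return paths
                    .filter(Files::isRegularFile)                 // skip directories
                    .sorted()                                     // order by full path
                    .map(path -> path.getFileName().toString())   // keep only the file name
                    .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        // Walks the directory given as the first argument, or the current directory.
        Path root = Paths.get(args.length > 0 ? args[0] : ".");
        listFileNames(root).forEach(System.out::println);
    }
}
```

The returned list could be passed to lvFileList.getItems().addAll(...) in place of the per-file addFileToList() call.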

EDIT 1 - Additional information

@Basil Bourque

Post an example of a file name with the problematic Cyrillic characters.

Cyrillic file name examples:

Ъпсурт - Колега.mp3

Kingsize - Оставам себе си.mp3

@Basil Bourque

What is “VLC” you mentioned?

I was referring to VLC Media Player.

@Basil Bourque @g00se

What file system? What is the host operating system?

I'm using Fedora, which uses Btrfs: the b-tree filesystem with UTF-8 as default system encoding.

@g00se

Not quite sure why a) you're mixing Path with File (perhaps because of your positive results with FileChooser or something?)...

ListView<String> lvPrimaryList = new ListView<>();
Button btn = new Button("Select file");
btn.setOnMouseClicked(new EventHandler<MouseEvent>() {
    @Override
    public void handle(MouseEvent mouseEvent) {

        FileChooser chooser = new FileChooser();
        chooser.setTitle(resourceBundle.getString("selectPrimaryFolder"));
        File selectedFile = chooser.showOpenDialog((Stage) btn.getScene().getWindow());
        System.out.println("SELECTED FILE: " + selectedFile.getAbsolutePath());

        lvPrimaryList.getItems().add(selectedFile.getAbsolutePath());

    }
});

The above code successfully adds /home/user/music/Kingsize - Оставам себе си.mp3 to the ListView.

But if I try to get a list of all of the files in a directory using the Files.walk() code mentioned above, it ends up as /home/user/music/Kingsize - ������� ���� ��.mp3.

@g00se

...b) why you think recursion/non-recursion has any bearing on this...

I mentioned recursion in case there are solutions that work but only get a list of files in the current directory and I need to be able to get any possible files within any sub-directories.

@jewelsea

Create an App with a label that displays the hardcoded text of one of the problem file names: new Scene(new Label("problem text")). That is all it needs to do. Nothing else. Does the text display correctly?

Using Label does work.

ListView<Label> lvPrimaryList = new ListView<>();
Button btn = new Button("Select file");
btn.setOnMouseClicked(new EventHandler<MouseEvent>() {
    @Override
    public void handle(MouseEvent mouseEvent) {
    
        lvPrimaryList.getItems().add(new Label("Ъпсурт - Колега.mp3"));
    
    }
});

Hard coding the text via the following code also works:

ListView<String> lvPrimaryList = new ListView<>();
Button btn = new Button("Select file");
btn.setOnMouseClicked(new EventHandler<MouseEvent>() {
    @Override
    public void handle(MouseEvent mouseEvent) {

        lvPrimaryList.getItems().add("Ъпсурт - Колега.mp3");

    }
});

@g00se

new Scene(new Label("\u041E\u0434\u0438\u043D"));

ListView<Label> lvPrimaryList = new ListView<>();
Button btn = new Button("Select file");
btn.setOnMouseClicked(new EventHandler<MouseEvent>() {
    @Override
    public void handle(MouseEvent mouseEvent) {
        
        lvPrimaryList.getItems().add(new Label("\u041E\u0434\u0438\u043D"));
        
    }
});

This does work. I get Один.

@Basil Bourque

You may need to specify a font you know to have glyphs for your desired characters. Other Questions have pointed to JavaFX failing to run through fonts properly to find one with needed glyphs.

I'm using IBM Plex Sans font.

In my Main.java file, I've added it like so:

Font.loadFont(Objects.requireNonNull(getClass().getResource("fonts/IBMPlexSans-Light.ttf")).toExternalForm(), 14);
Font.loadFont(Objects.requireNonNull(getClass().getResource("fonts/IBMPlexSans-Medium.ttf")).toExternalForm(), 14);
Font.loadFont(Objects.requireNonNull(getClass().getResource("fonts/IBMPlexSans-Regular.ttf")).toExternalForm(), 14);
Font.loadFont(Objects.requireNonNull(getClass().getResource("fonts/IBMPlexSans-SemiBold.ttf")).toExternalForm(), 14);

In my style.css file, I've loaded it via:

.root {
    -fx-font-family: "IBM Plex Sans";
}

I reviewed the font glyphs and it does include Cyrillic, as mentioned in the specifications.

I did remove it, just in case, but I still get ����� in the ListView when using Files.walk().

EDIT 2 - Further testing

Using FileChooser and selecting multiple files produces the desired results in the ListView, but the code lacks recursion.

ListView<String> lvPrimaryList = new ListView<>();
Button btn = new Button("Select file");
btn.setOnMouseClicked(new EventHandler<MouseEvent>() {
    @Override
    public void handle(MouseEvent mouseEvent) {

        FileChooser chooser = new FileChooser();
        chooser.setTitle("Select files");
        List<File> selectedFiles = chooser.showOpenMultipleDialog((Stage) btn.getScene().getWindow());

        for (File file : selectedFiles) {
            System.out.println("SELECTED FILE: " + file.getAbsolutePath());
            lvPrimaryList.getItems().add(file.getAbsolutePath());
        }
    }
});

It's also a bit less user friendly when there are hundreds of files and the user would have to select all of them. It'd be much better to prompt the user to select a single directory and leave the program to do the "heavy lifting".

EDIT 3 - Minimal reproducible example

In the process of creating a full MRE, I ran into an issue where, after creating the files myself, their names were all: ?????? - ??????.mp3

Main.java

public class Main extends Application {
    @Override
    public void start(Stage stage) throws IOException {

        FXMLLoader fxmlLoader = new FXMLLoader(Main.class.getResource("main.fxml"));
        Scene scene = new Scene(fxmlLoader.load(), 800, 600);
        scene.getStylesheets().add(getClass().getResource("style/style.css").toExternalForm());
        stage.setTitle("My Program");
        stage.setScene(scene);
        stage.show();
    }

    public static void main(String[] args) {
        launch();
    }
}

MainController.java

public class MainController implements Initializable {

    @FXML
    private ListView<String> lvPrimaryList;
    @FXML
    private Button btnSelectFile;

    @Override
    public void initialize(URL url, ResourceBundle resourceBundle) {

        // Select primary list
        btnSelectFile.setOnMouseClicked(new EventHandler<MouseEvent>() {
            @Override
            public void handle(MouseEvent mouseEvent) {

                initTest();
            }
        });
    }

    private void initTest() {

        System.out.println(java.nio.charset.Charset.defaultCharset());

        // Edit these to match your operating system
        String osDelimiter = "/";
        String rootString = "/home/user/Documents/init-test" + osDelimiter;

        Path rootPath = Paths.get(rootString);


        try {
            Files.createDirectory(rootPath);

            File songOne = new File(rootString + "Ъпсурт - Колега.mp3");
            songOne.createNewFile();
            File songTwo = new File(rootString + "Kingsize - Оставам себе си.mp3");
            songTwo.createNewFile();

            //System.setOut(new PrintStream(new FileOutputStream(rootString + "Ъпсурт - Колега.mp3"), true, StandardCharsets.UTF_8));
            //System.setOut(new PrintStream(new FileOutputStream(rootString + "Kingsize - Оставам себе си.mp3"), true, StandardCharsets.UTF_8));

            //System.setOut(new PrintStream(new FileOutputStream(rootString + "Ъпсурт - Колега.mp3"), true, "Cp1252"));
            //System.setOut(new PrintStream(new FileOutputStream(rootString + "Kingsize - Оставам себе си.mp3"), true, "Cp1252"));

            //System.setOut(new PrintStream(new FileOutputStream(rootString + "Ъпсурт - Колега.mp3"), true, "windows-1251"));
            //System.setOut(new PrintStream(new FileOutputStream(rootString + "Kingsize - Оставам себе си.mp3"), true, "windows-1251"));

            //System.setOut(new PrintStream(new FileOutputStream(rootString + "Ъпсурт - Колега.mp3"), true, "UTF-16"));
            //System.setOut(new PrintStream(new FileOutputStream(rootString + "Kingsize - Оставам себе си.mp3"), true, "UTF-16"));

            //Writer writerOne = new OutputStreamWriter(new FileOutputStream(rootString + "Ъпсурт - Колега.mp3"));
            //writerOne.close();
            //Writer writerTwo = new OutputStreamWriter(new FileOutputStream(rootString + "Kingsize - Оставам себе си.mp3"));
            //writerTwo.close();
        
            //Writer writerOne = new OutputStreamWriter(new FileOutputStream(rootString + "Ъпсурт - Колега.mp3"), StandardCharsets.UTF_8);
            //writerOne.close();
            //Writer writerTwo = new OutputStreamWriter(new FileOutputStream(rootString + "Kingsize - Оставам себе си.mp3"), StandardCharsets.UTF_8);
            //writerTwo.close();

            //Writer writerOne = new OutputStreamWriter(new FileOutputStream(rootString + "Ъпсурт - Колега.mp3"), StandardCharsets.UTF_16);
            //writerOne.close();
            //Writer writerTwo = new OutputStreamWriter(new FileOutputStream(rootString + "Kingsize - Оставам себе си.mp3"), StandardCharsets.UTF_16);
            //writerTwo.close();

            try (Stream<Path> stream = Files.walk(rootPath.toFile().toPath()).sorted()) {
                stream.forEach(path -> addFileToList(path.toFile(), lvPrimaryList));
            } catch (IOException ioException) {
                // Print the failure instead of silently discarding the message
                ioException.printStackTrace();
            }

        } catch (IOException exception) {
            exception.printStackTrace();
        }
    }
}

init-test.fxml

<?xml version="1.0" encoding="UTF-8"?>

<?import javafx.geometry.Insets?>
<?import javafx.scene.control.Button?>
<?import javafx.scene.control.Label?>
<?import javafx.scene.control.ListView?>
<?import javafx.scene.layout.VBox?>
<?import javafx.scene.text.Font?>

<VBox alignment="CENTER" maxHeight="-Infinity" maxWidth="-Infinity" minHeight="-Infinity" minWidth="-Infinity" spacing="5.0" xmlns="http://javafx.com/javafx/21" xmlns:fx="http://javafx.com/fxml/1" fx:controller="com.project.test.MainController">
   <children>
      <Label alignment="CENTER" maxWidth="1.7976931348623157E308" text="Init Test">
         <font>
            <Font name="System Bold" size="13.0" />
         </font>
      </Label>
      <ListView fx:id="lvPrimaryList" />
      <Button fx:id="btnSelectFile" mnemonicParsing="false" text="Select file" />
   </children>
   <padding>
      <Insets bottom="5.0" left="5.0" right="5.0" top="5.0" />
   </padding>
</VBox>

I also ran into this post (How to support Cyrillic alphabet in Eclipse?).

It explains that the symbol doesn't have anything to do with the encoding but the font not having the necessary glyphs to support Cyrillic, as mentioned by Basil Bourque - which is weird because of the above experiments resulting in a success when manually selecting the files.

I'll keep updating as I keep investigating.

EDIT 4 - Source code

Digging through the GitHub source code of Java/JavaFX, I found the following:

  • FileChooser - uses a List<File>
  • File - uses a normalized pathname String.
  • Files - uses an InputStream of Path
  • Path - uses a URI
  • URI - is a String

Theoretically, whether I use Files.walk() or manually select files via FileChooser, both should be getting their data in the form of a String.

Yet, the same String seems to be interpreted in different ways.

Asked Nov 22, 2024 at 22:33 by Doombringer; edited Nov 29, 2024 at 13:41 by AmigoJack.

Comments

  • Post an example of a file name with the problematic Cyrillic characters. We need a minimal reproducible example. – Basil Bourque, Nov 22, 2024 at 22:51
  • What is “VLC” you mentioned? – Basil Bourque, Nov 22, 2024 at 22:52
  • Not quite sure why a) you're mixing Path with File (perhaps because of your positive results with FileChooser or something?) or b) why you think recursion/non-recursion has any bearing on this. Personally I would begin with something much simpler and test your gui component with a single problematic file using Path. I'm going to take a wild guess you're using Windows – g00se, Nov 23, 2024 at 0:12
  • Create an App with a label that displays the hardcoded text of one of the problem file names: new Scene(new Label("problem text")). That is all it needs to do. Nothing else. Does the text display correctly? – jewelsea, Nov 23, 2024 at 0:20
  • You may need to specify a font you know to have glyphs for your desired characters. Other Questions have pointed to JavaFX failing to run through fonts properly to find one with needed glyphs. – Basil Bourque, Nov 23, 2024 at 1:49

Answer

It was the God darn Flatpak! (ノ °益°)ノ 彡 ┻━┻

Thanks to the answer here (Java error creating Path from String, does linux limit filenames to 8bit charset), I managed to figure it out.

To test it, I created a basic Java project in IntelliJ with the following code:

import java.io.File;

public class Test {
    public static void main(String[] args) {
        File f = new File("\u2026");
        f.toPath();
    }
}

Whenever I ran it via IntelliJ, I'd get an error: Malformed input or input contains unmappable characters: ?
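
A slightly more talkative variant of the same test is sketched below. It assumes nothing beyond the standard library; the class and method names are hypothetical, and it uses Paths.get() instead of File.toPath(), which is expected to fail in the same way. It catches the InvalidPathException and reports it rather than crashing, so the same class can be run from the IDE and from a terminal to compare the two environments.

```java
import java.nio.file.InvalidPathException;
import java.nio.file.Paths;

class PathProbe {
    /** Returns true if the JVM can turn the given name into a Path on this platform. */
    static boolean canBuildPath(String name) {
        try {
            Paths.get(name);
            return true;
        } catch (InvalidPathException e) {
            // Thrown, for example, when the name cannot be mapped to the platform's file name encoding.
            System.err.println("Cannot build path: " + e.getMessage());
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("ASCII name    : " + canBuildPath("plain.mp3"));
        // A Cyrillic name written with Unicode escapes, so the source file's encoding does not matter.
        System.out.println("Cyrillic name : " + canBuildPath("\u041E\u0434\u0438\u043D.mp3"));
    }
}
```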

But if I were to run it via terminal using java Main (after compiling it with javac Main.java), it ran fine. Without even the LANG variables.

In the Flatpak IntelliJ version, I was getting this message in the console (which I kept missing because it was always buried under the rest of my System.out.println() output or the errors, with which it blends real nice):

"Gtk-WARNING **: Locale not supported by C library. Using the fallback 'C' locale."

So I downloaded the .tar.gz for Linux from IntelliJ's website, ran it via terminal ($./idea) and what do you know? The warning wasn't there.

I tested the sample above by running it in IntelliJ and it worked fine. Didn't even need the VMOptions or anything.

So I opened my project and all of the code described in the original post worked as expected.

No ?????, no ����� and it even created the files described in the MainController.java file properly.

I spent 2 days dealing with this issue...

EDIT 1 - For Flatpak use

For people that want to use the Flatpak version of IntelliJ, I found this (How to give VSCode Flatpak package access to system SDK for Java?) interesting question.

From what I understand, Java can't seem to access the system's LANG variable when run via Flatpak. Perhaps installing the SDK via Flatpak to match the IDE would fix it?

SIDE NOTE: I noticed the FileChooser and DirectoryChooser weren't using my system's theme colors when using the Flatpak IntelliJ. I don't know if it's important, but I figured I'd mention it just in case.
