I have a program that reads from a file using O_DIRECT. The file is being continuously written to by another process. The read loop works fine until it reaches the point where the write is happening. At that point, read() fails with EINVAL (Invalid argument). If I don't use O_DIRECT, the issue doesn't occur and read() returns valid data. I have verified that the block size is 4096.

Here is my code:

#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>
#include <stdlib.h>
#include <errno.h>
#include <string.h>
#include <malloc.h>
#include <iostream>

#define BUFFER_SIZE 524288  // 512 KB

int main() {
    int fd;
    char *buffer;
    ssize_t bytesRead;

    // Open file with O_DIRECT
    fd = open("/data/file.lz4", O_RDONLY | O_DIRECT);
    if (fd == -1) {
        perror("Error opening file");
        return EXIT_FAILURE;
    }

    // Allocate aligned memory for O_DIRECT
    buffer = (char *) aligned_alloc(4096, BUFFER_SIZE);

    // Read in a loop
    while (1) {
        bytesRead = read(fd, buffer, BUFFER_SIZE);
        if (bytesRead == -1) {
            std::cerr << "Error reading file: " << strerror(errno) << std::endl;
            std::cerr << "fd: " << fd << " buffer: " << (void*)buffer << " BUFFER_SIZE: " << BUFFER_SIZE << std::endl;
            break;
        }
        std::cout << bytesRead << std::endl;
    }

    // Free the aligned buffer
    free(buffer);

    // Close file
    close(fd);

    return 0;
}

Observed Behavior

The program reads correctly while the file has data. When it reaches the writing point, the output is:

524288
524288
524288
524288
524288
172020
0
0
Error reading file: Invalid argument
fd: 3 buffer: 0x7f4140e88000 BUFFER_SIZE: 524288

This means read() is failing when trying to read data that is being written at the same time.

What I've Checked:

The buffer is correctly aligned using aligned_alloc(4096, BUFFER_SIZE).

BUFFER_SIZE is a multiple of 4096, so it meets the O_DIRECT alignment requirements.

fd is valid before the error occurs.

The file exists and is being continuously updated by another process.

Details

filesystem: ext4 (local SSD)
cpu model: Intel(R) Core(TM) i9-14900KS
Linux: 5.15.77-1-lts

Questions

Why does read() return EINVAL when it reaches the writing point?

Is there a way to wait for more data instead of failing when read() reaches unwritten portions?

Any insights into how O_DIRECT interacts with concurrent writes would be greatly appreciated!

Asked Jan 31 at 6:36 by rishi jain; edited Jan 31 at 6:51.

Comments (4):
  • 4 Your reads always have to start from an offset which is a multiple of the block size. After a read that didn't return a multiple of the block size, you'll need to seek back to the start of a block – Alan Birtles Commented Jan 31 at 6:49
  • 1 While I'm a big fan of direct IO, reading the tail end of a file that is currently being written sounds like a pessimization. Caching recently accessed data in the page cache is precisely what normal IO is good at. Do you actually get a performance benefit from this? As for waiting for more data: You can use inotify and wait for an IN_MODIFY event, plus IN_CLOSE_WRITE, probably – Homer512 Commented Jan 31 at 8:30
  • 1 Instead of using O_DIRECT, you could use posix_fadvise with POSIX_FADV_DONTNEED to mark page-sized regions that have already been read and can be removed from the cache. – Ian Abbott Commented Jan 31 at 12:19
  • 1 Is the writing process using O_SYNC or calling fsync? If not, writes are cached and may not be written as you expect. – stark Commented Jan 31 at 13:56
Answer

As suggested in Alan's comment:

reads always have to start from an offset which is a multiple of the block size. After a read that didn't return a multiple of the block size, you'll need to seek back to the start of a block
