

I am playing around with image classification using the tensorflow and keras packages for R. I have built and trained a model that does well on the validation dataset. I now want to use that model to predict image classes for a large number of images stored online (I have all the URLs in a dataframe in R).

I can write a for loop that downloads each image, classifies it, records the model prediction, and then deletes the downloaded image, but this takes a long time, and it would be faster to read each image into memory instead of writing it to disk. I cannot for the life of me figure out how to load an image into memory in R and convert it to a datatype that works with the rest of my tensorflow image standardization pipeline.

Here is my for loop:

data$score <- NA
for (i in 1:nrow(data)) {

  img_tensor <-
    get_file("t", data$image_url[i]) %>%  # download to a temp file
    tf$io$read_file() %>%
    tf$io$decode_image() %>%
    tf$image$resize(as.integer(image_size)) %>%
    tf$expand_dims(0L)

  # delete the temp file
  file.remove("/Users/me/.keras/datasets/t")

  data$score[i] <- model %>% predict(img_tensor, verbose = 0)

}

Here is an example image URL: https://inaturalist-open-data.s3.amazonaws.com/photos/451526093/medium.jpeg

All I want is to load that image into R directly from the URL (without writing the file to disk) and then use the tensorflow workflow (decode_image, resize, expand_dims). Any help is appreciated!

To replicate the code, just replace data$image_url[i] with the URL I provided. No need to worry about my model's predictions; that part is working fine. I just need the image to successfully feed into the rest of the pipe.

asked Nov 22, 2024 at 16:45 by icyeye; edited Nov 22, 2024 at 16:45

1 Answer


A few notes:

  • Writing to a temporary directory on macOS and Linux usually has performance identical to keeping everything in memory, since /tmp is usually mounted as a RAM-backed filesystem and never actually touches the disk. (If you're on Windows, or if you're swapping, the story is different.)

  • As far as I know, TensorFlow doesn't have any graph ops that will fetch content from an HTTP URL, so you'll need to do that step in R or Python. If the op needs to live inside a tf.data pipeline, you'll need to wrap it in tf.py_function.

  • To fetch a url directly into memory in R, without writing to the filesystem, you can do:

    url <- "https://inaturalist-open-data.s3.amazonaws.com/photos/451526093/medium.jpeg"
    bytes <- readBin(url, raw(), 200000)
    as_py_bytes <- reticulate::import_builtins(convert = FALSE)$bytes
    bytes_tensor <- tf$constant(as_py_bytes(bytes), tf$string)
    
  • The bottleneck is most likely the download step, not the "write to a file" step. You'll probably see the biggest speedup from rewriting your loop to process batches of images instead of one image at a time (e.g., fetching a batch with curl::multi_download() and passing the batch to predict()).
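Putting the in-memory pieces together: once the bytes are in a string tensor, the rest of the question's pipe (decode_image, resize, expand_dims) should work unchanged. A sketch, assuming `image_size` and `model` are defined as in the question, and 5e6 is just an assumed upper bound on file size:

```r
library(keras)        # provides %>% and the tf object
library(tensorflow)

url <- "https://inaturalist-open-data.s3.amazonaws.com/photos/451526093/medium.jpeg"

# Fetch the raw bytes straight into memory (no file on disk)
bytes <- readBin(url, raw(), 5e6)

# Wrap the R raw vector as a Python `bytes` object, then as a string tensor
as_py_bytes <- reticulate::import_builtins(convert = FALSE)$bytes
bytes_tensor <- tf$constant(as_py_bytes(bytes), tf$string)

# Same pipeline as the question, minus the read_file() step
img_tensor <- bytes_tensor %>%
  tf$io$decode_image() %>%
  tf$image$resize(as.integer(image_size)) %>%
  tf$expand_dims(0L)

# score <- model %>% predict(img_tensor, verbose = 0)
```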

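To illustrate the batching suggestion, here is a hedged sketch of a single batch iteration, again assuming `image_size`, `model`, and `data` from the question; the batch size and temp-file naming are arbitrary choices, and curl::multi_download() requires curl >= 5.0:

```r
library(curl)
library(keras)
library(tensorflow)

# Download one batch of images in parallel (a batch size of 32 is arbitrary)
batch_urls <- data$image_url[1:32]
dest <- file.path(tempdir(), paste0(seq_along(batch_urls), ".jpeg"))
curl::multi_download(batch_urls, dest)

# Decode and resize each image, then stack into one (batch, h, w, c) tensor
tensors <- lapply(dest, function(path) {
  tf$io$read_file(path) %>%
    tf$io$decode_image(channels = 3L, expand_animations = FALSE) %>%
    tf$image$resize(as.integer(image_size))
})
batch_tensor <- tf$stack(tensors)

# One predict() call per batch instead of one per image
scores <- model %>% predict(batch_tensor, verbose = 0)
file.remove(dest)
```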
Tags: How can I load an image from a URL into a tensorflow pipeline in R, Stack Overflow