I'm running a Flask (Python) web application that zips up a number of files and serves them to the user based on the user's filters.
The issue: after the user clicks download, the backend pulls all the files and starts creating the zip, but this can take minutes. The user won't know whether it has hung.
I decided that streaming the zip file as it's being created gets the file to the user quicker, and it also lets the user know that the web app is working on it. The problem is that to get a progress bar in the browser's download section (the little pop-up or the download page with progress bars), you need to provide the Content-Length header, but we don't know the size of the zip file because it hasn't finished being created yet. I've tried my best to estimate the final size of the zip file, and I thought it would be easy since my zip is just ZIP_STORED, but there is internal zip structure that I'm not able to measure accurately. The browser just ends up rejecting the download with ERR_CONTENT_LENGTH_MISMATCH.
I can provide a Server-Sent Events (SSE) route to make my own progress bar by tracking the number of bytes sent and exposing it through a separate /progress route, but I really had my heart set on using the browser's download section, and it's a point of pride for me at this point. I could also just not stream it: use SSE to provide updates while the zip is being created, then send it with a Content-Length header once it's finished. That's not quite as nice as I'd like it to be, though.
def calculate_total_size(filenames):
    total_size = 0
    for file in filenames:
        matching_filepath, _ = get_file_info(file)
        if matching_filepath:
            total_size += os.path.getsize(matching_filepath)
    # Add overhead for ZIP file structure (22 bytes per file + 22 bytes for the central directory)
    total_size += 22 * (len(filenames) + 1)
    return total_size
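For reference, the fixed overhead of a plain stored archive is larger than the 22 bytes per file assumed in the estimate above. Each entry has a 30-byte local file header plus its name, and a 46-byte central directory header plus its name again. The archive then ends with a 22-byte end-of-central-directory record. The sketch below checks that arithmetic against the standard library's `zipfile` writing to a seekable buffer. The names `stored_zip_size` and `build_stored_zip` are made up for illustration. It assumes no Zip64 fields, data descriptors, extra fields or comments, so it would not match an archive whose entries are written as ZIP_64 or streamed with data descriptors; those formats add bytes per entry.

```python
import io
import zipfile

# Fixed record sizes from the ZIP format, excluding the variable-length name.
LOCAL_HEADER = 30         # local file header, written before each entry's data
CENTRAL_HEADER = 46       # central directory header, one per entry
END_OF_CENTRAL_DIR = 22   # end-of-central-directory record, once per archive


def stored_zip_size(entries):
    """Predict the archive size for (name, data_size) pairs stored uncompressed.

    Assumes no Zip64 fields, data descriptors, extra fields, or comments.
    """
    total = END_OF_CENTRAL_DIR
    for name, data_size in entries:
        name_len = len(name.encode("utf-8"))
        total += LOCAL_HEADER + name_len + data_size
        total += CENTRAL_HEADER + name_len
    return total


def build_stored_zip(files):
    """Build a real archive in memory so the prediction can be checked."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_STORED) as zf:
        for name, data in files.items():
            zf.writestr(name, data)
    return buffer.getvalue()


files = {"a.txt": b"hello", "reports/b.bin": b"\x00" * 1000}
predicted = stored_zip_size((name, len(data)) for name, data in files.items())
actual = len(build_stored_zip(files))
print(predicted, actual)
```

If the library that actually writes the archive uses a different layout, the same check (build a small archive, compare its length with the prediction) is a quick way to find the per-entry constants it really uses.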
def generate_file_entries(filenames):
    for file in filenames:
        matching_filepath, filename = get_file_info(file)
        if matching_filepath:
            file_stat = os.stat(matching_filepath)
            modified_at = datetime.utcfromtimestamp(file_stat.st_mtime)
            with open(matching_filepath, 'rb') as f:
                chunk = f.read()
                if isinstance(chunk, bytes):  # Ensure only bytes are yielded
                    yield filename, modified_at, 0o600, ZIP_64, [chunk]
                else:
                    print(f"Unexpected data type for file contents: {type(chunk)}")
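One side note on the generator above: `f.read()` loads each file into memory in full before anything is yielded, which works against streaming when the files are large. A minimal sketch of reading in fixed-size pieces follows. The helper name `read_in_chunks` and the 64 KiB default are illustrative choices, not something from the question.

```python
import os
import tempfile


def read_in_chunks(path, chunk_size=64 * 1024):
    """Yield the file at `path` as a sequence of bytes objects."""
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:  # empty bytes means end of file
                break
            yield chunk


# Small self-check using a temporary file.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"x" * 150_000)
    tmp_path = tmp.name

chunks = list(read_in_chunks(tmp_path))
os.remove(tmp_path)
print(len(chunks), sum(len(c) for c in chunks))
```

Since the question passes a one-element list as the last item of each yielded tuple, `read_in_chunks(matching_filepath)` could presumably take the place of `[chunk]`. That assumes the zip library accepts any iterable of bytes there.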
asked Jan 30 at 18:03 by john stamos
1 Answer
Have you tried using a transfer method that does not require the full content length up front? Something like this:
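from flask import Flask, Response
import zipfile
import io

app = Flask(__name__)

class ChunkSink(io.RawIOBase):
    # Write-only and non-seekable, so ZipFile tracks entry offsets itself
    def __init__(self):
        super().__init__()
        self.chunks = []
    def writable(self):
        return True
    def write(self, data):
        self.chunks.append(bytes(data))
        return len(data)
    def drain(self):
        data, self.chunks = b"".join(self.chunks), []
        return data

def stream_zip():
    files = {"file1.txt": "Hello, World!", "file2.txt": "Flask Streaming!"}
    sink = ChunkSink()
    with zipfile.ZipFile(sink, "w", zipfile.ZIP_STORED) as zip_file:
        for filename, content in files.items():
            zip_file.writestr(filename, content)
            yield sink.drain()  # Yield the bytes written for this entry
    yield sink.drain()  # Central directory, written when the ZipFile closes

@app.route('/download')
def download():
    # No Content-Length is sent, so the browser shows bytes received but no total
    return Response(stream_zip(), mimetype="application/zip", headers={
        "Content-Disposition": "attachment; filename=download.zip"
    })

if __name__ == "__main__":
    app.run(debug=True)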
Below is a more minimal example of streaming data:
def stream_data():
    for i in range(10):
        yield f"Chunk {i}\n".encode()  # Simulate file content

@app.route('/download')
def download():
    return Response(stream_data(), mimetype="application/octet-stream", headers={
        "Content-Disposition": "attachment; filename=streamed.txt"
    })
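If a custom progress indicator turns out to be acceptable, as the question considers, the streamed generator can be wrapped so that it counts bytes as they pass through. The sketch below uses only the standard library. `ProgressCounter`, `count_bytes` and `sse_message` are hypothetical names, and the `estimated_total` value is just a placeholder. The count reflects bytes handed to the server for sending, which is not necessarily what the client has received so far.

```python
import json
import threading


class ProgressCounter:
    """Thread-safe running total of bytes handed to the response."""

    def __init__(self, estimated_total=None):
        self._lock = threading.Lock()
        self.sent = 0
        self.estimated_total = estimated_total

    def add(self, n):
        with self._lock:
            self.sent += n

    def snapshot(self):
        with self._lock:
            return {"sent": self.sent, "estimated_total": self.estimated_total}


def count_bytes(chunks, counter):
    """Pass chunks through unchanged while recording how many bytes went by."""
    for chunk in chunks:
        counter.add(len(chunk))
        yield chunk


def sse_message(payload):
    """Format a dict as one Server-Sent Events message."""
    return f"data: {json.dumps(payload)}\n\n"


counter = ProgressCounter(estimated_total=20)
body = b"".join(count_bytes([b"Chunk 0\n", b"Chunk 1\n"], counter))
print(sse_message(counter.snapshot()), end="")
```

In the Flask routes above, this would mean returning `Response(count_bytes(stream_data(), counter), ...)` from the download route and serving `sse_message(counter.snapshot())` from a separate route with `mimetype="text/event-stream"`. How a counter is associated with a particular download (for example by a per-request ID) is left open here.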