
I have been stuck on this problem for some time now. I am trying to upload a BAM file (0.7 GB to 2.3 GB) from my website via Flask to the back-end filesystem, and I have tried a couple of things already. I started by uploading the whole file in one request and, to no one's surprise, instantly got a disconnection error. Some posts suggested this could be due to timeouts in either Docker or gunicorn, so I increased the timeout to 10 minutes just to rule that out. Since I still got timeout errors, I switched to uploading the file in chunks. This went better, except that one chunk keeps failing; it is never the same chunk, but it is always the last chunk to finish, which makes me think the problem is in merging the chunks back together. I hope a fresh pair of trained eyes can help me out here. Below are the relevant parts of the code; unfortunately it is not the cleanest, because multiple interns have worked on this project. The HTML:

<div class="gene_box">
    <div class="box_header">
        <h3 class="box_header_title">> Data Import</h3>
        <a href="/static/html/help.html" target="_blank">
            <img class="tooltip_img" src="../static/images/help_icon.png" />
        </a>
    </div>
    <div class="content">
        Data upload<br />
        <select id="file_type" name="file_type" onchange="enableFormElements()">
            <option value="default">Select Data Type</option>
            <option value="genomic_positions">Genomic elements</option>
            <option value="read_map">Read Counts</option>
            <option value="splice_variants">Splice Variants</option>
        </select>
        <input type="file" name="uploaded_file" id="uploaded_file" style="width:210px;"><br />
        <button id="upload_button" class="button">Upload</button>
        <div id="loader"></div>
    </div>
    <div id="read_map" class="checkboxes hidden">
        <div>
            <input type="checkbox" id="read_map_scale" value="forward" name="genomic_element_include"
                class="checkbox" />
            <label id="linear_label" for="read_map_scale" class="checkbox_label">Normal destribution</label>
        </div>
        <div>
            <input type="checkbox" id="read_map_scale" value="reverse" name="genomic_element_include"
                class="checkbox" />
            <label id="log2_label" for="read_map_scale" class="checkbox_label">Log2 destribution</label>
        </div>
        <div>
            <input type="checkbox" id="read_map_scale" value="both" name="genomic_element_include"
                class="checkbox" />
            <label id="log10_label" for="read_map_scale" class="checkbox_label">Log10 destribution</label>
    </div>

    </div>
    <div id="splice_variants" class="checkboxes hidden">
        <h1>Splice Variants</h1>
    </div>
</div>

The JavaScript:

document.addEventListener('DOMContentLoaded', function () {
    document.getElementById("upload_button").addEventListener("click", function() {
        event.preventDefault();
        const uploadedFile = document.getElementById("uploaded_file").files[0];
        const mode = document.getElementById("read_map_scale").value; // Get mode
        const fileType = document.getElementById("file_type").value; // Get file type
        var accessionCode = document.querySelector('#gi_codes').value.trim();

        if (!uploadedFile) {
            alert("No file selected.");
            return;
        }
        if (!mode || mode === "default") {
            alert("Please select a mode.");
            return;
        }


        if (fileType === "genomic_positions") {
            poscalc(uploadedFile, globalresults_json, accessionCode);
        } else if (fileType === "read_map") {
            // Track uploaded chunks
            file_to_readdata(uploadedFile, mode, fileType);

            
        } else if (fileType === "splice_variantes"){
            alert("Work in progress not funcitonal yet");

        } else{
            alert("Invalid file type selected. (there is no functionality for this option yet)");
        }
    });
});

function file_to_readdata(uploadedFile, mode, fileType) {
    const chunkSize = 10 * 1024 * 1024; // 10 MB chunk size
    const totalChunks = Math.ceil(uploadedFile.size / chunkSize);
    let uploadedChunks = 0;
    for (let i = 0; i < totalChunks; i++) {
        const start = i * chunkSize;
        const end = Math.min(start + chunkSize, uploadedFile.size);
        const chunk = uploadedFile.slice(start, end);

        console.log(`Uploading chunk ${i + 1}/${totalChunks}`);
        console.log("Chunk size:", chunk.size);

        const formData = new FormData();
        formData.append("chunk", chunk);
        formData.append("chunkIndex", i);
        formData.append("totalChunks", totalChunks);
        formData.append("fileName", uploadedFile.name);
        formData.append("mode", mode); // the Flask route reads "mode" from the form; without this it gets None

        fetch(`${$SCRIPT_ROOT}/_upload_file`, {
            method: "POST",
            body: formData,
        })
            .then((response) => response.json())
            .then((data) => {
                console.log(data);
                if (data.status === "success") {
                    console.log(`Chunk ${i + 1}/${totalChunks} uploaded successfully`);
                    uploadedChunks++;

                    // Check if all chunks are uploaded
                    if (uploadedChunks === totalChunks) {
                        console.log("File upload complete");
                        
                    }
                } else {
                    console.error(`Error uploading chunk ${i + 1}:`, data.message);
                }
            })
            .catch((error) => {
                console.error("Error uploading chunk:", error);
            });
    }
}
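One thing worth flagging: the loop above starts every fetch at once, so dozens of uploads run concurrently and the "last chunk to finish" is simply whichever response happens to arrive last. A sequential variant keeps the server-side completion check deterministic. This is only a sketch against the same endpoint, with the boundary arithmetic factored out into a helper; names like `chunkRanges` and `uploadSequentially` are mine, not part of the existing code:

```javascript
// Compute [start, end) byte ranges for slicing a file into chunks.
function chunkRanges(fileSize, chunkSize) {
    const ranges = [];
    for (let start = 0; start < fileSize; start += chunkSize) {
        ranges.push([start, Math.min(start + chunkSize, fileSize)]);
    }
    return ranges;
}

// Upload one chunk at a time; await each response before sending the next.
async function uploadSequentially(file, url, mode) {
    const ranges = chunkRanges(file.size, 10 * 1024 * 1024); // 10 MB chunks
    for (let i = 0; i < ranges.length; i++) {
        const [start, end] = ranges[i];
        const formData = new FormData();
        formData.append("chunk", file.slice(start, end));
        formData.append("chunkIndex", i);
        formData.append("totalChunks", ranges.length);
        formData.append("fileName", file.name);
        formData.append("mode", mode);

        const response = await fetch(url, { method: "POST", body: formData });
        const data = await response.json();
        if (data.status !== "success") {
            // Fail fast with the exact chunk that went wrong
            throw new Error(`Chunk ${i + 1}/${ranges.length}: ${data.message}`);
        }
        console.log(`Chunk ${i + 1}/${ranges.length} uploaded`);
    }
}
```

Sequential uploads are slower than firing everything at once, but for a single large file the bottleneck is bandwidth anyway, and any error now surfaces with the exact failing chunk index.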

The Flask script:

@app.route('/_upload_file', methods=['GET', 'POST'])
def process_file():
    """
    When the user uploads a BAM or SAM file, the file is processed and the data
    is returned in a dictionary-like structure to the client.
    """
    UPLOAD_DIR = '../uploads'
    try:
        logging.debug("Request Files: %s", request.files)
        logging.debug("Request Form: %s", request.form)

        # Extract form data
        chunk = request.files.get('chunk')
        mode = request.form.get('mode')
        chunk_index = int(request.form.get('chunkIndex', -1))  # Default -1 for debugging
        total_chunks = int(request.form.get('totalChunks', -1))
        file_name = request.form.get('fileName')

        logging.debug("Chunk: %s", chunk)
        logging.debug("Mode: %s", mode)
        logging.debug("Chunk Index: %d", chunk_index)
        logging.debug("Total Chunks: %d", total_chunks)
        logging.debug("File Name: %s", file_name)

        # Store this chunk as a numbered part file
        temp_file_path = os.path.join(UPLOAD_DIR, f"{file_name}.part{chunk_index}")
        chunk.save(temp_file_path)

        # Check if all chunks are uploaded
        uploaded_chunks = [f for f in os.listdir(UPLOAD_DIR) if f.startswith(file_name)]
        if len(uploaded_chunks) == total_chunks:
            # Combine chunks
            complete_file_path = os.path.join(UPLOAD_DIR, file_name)
            with open(complete_file_path, 'wb') as output_file:
                for i in range(total_chunks):
                    part_path = os.path.join(UPLOAD_DIR, f"{file_name}.part{i}")
                    with open(part_path, 'rb') as part_file:
                        output_file.write(part_file.read())
                    os.remove(part_path)  # Clean up chunk file

            # Pass the *path* to the helpers, not the (closed) file object
            if rcg.is_valid_bam(complete_file_path):
                sorted_bam = rcg.sort_and_index_bam(complete_file_path)
                coverage = rcg.calculate_full_coverage(sorted_bam, mode)
            else:
                coverage = rcg.calculate_full_coverage(complete_file_path, mode)
            return jsonify({"status": "success", "message": "File uploaded successfully", "filePath": complete_file_path})

        return jsonify({"status": "success", "message": "Chunk uploaded successfully"})

    except Exception as ex:
        return jsonify({"status": "error", "message": str(ex)}), 500

The error:

Uploading chunk 72/73 handle_context_submission.js:296:17
Chunk size: 10485760 handle_context_submission.js:297:17
Uploading chunk 73/73 handle_context_submission.js:296:17
Chunk size: 2228747

POST :443/_upload_file   [HTTP/1.1 500 INTERNAL SERVER ERROR 4248ms]
Chunk 72/73 uploaded successfully handle_context_submission.js:313:29
Object { message: "Argument must be string, bytes or unicode.", status: "error" }
Error uploading chunk 73: Argument must be string, bytes or unicode.
