admin管理员组

文章数量:1122846

I'm building a web-based video editor where users can:

- Add multiple videos
- Add images
- Add text overlays with background color

The frontend sends coordinates where each element's (x, y) represents its center position. On clicking the Export button I want all the data to be exported as one final video; on click I send the data to the backend like this:

 const exportAllVideos = async () => {
    try {
      const formData = new FormData();

      // Normalise timing fields to numbers and order videos by start time
      // so the backend receives them in timeline order.
      const normalizedVideos = videos.map(video => ({
          ...video,
          startTime: parseFloat(video.startTime),
          endTime: parseFloat(video.endTime),
          duration: parseFloat(video.duration)
      })).sort((a, b) => a.startTime - b.startTime);

      // Fetch each blob URL back into a File so it can be sent as multipart
      // form data.  Append order matches normalizedVideos, which is the same
      // order the backend zips against metadata.videos.
      for (const video of normalizedVideos) {
          const response = await fetch(video.src);
          const blobData = await response.blob();
          const file = new File([blobData], `${video.id}.mp4`, { type: "video/mp4" });
          formData.append("videos", file);
      }

      // Coerce image geometry/timing to numbers; (x, y) is the element's
      // centre in preview-container pixels.
      const normalizedImages = images.map(image => ({
          ...image,
          startTime: parseFloat(image.startTime),
          endTime: parseFloat(image.endTime),
          x: parseInt(image.x),
          y: parseInt(image.y),
          width: parseInt(image.width),
          height: parseInt(image.height),
          opacity: parseInt(image.opacity)
      }));

      for (const image of normalizedImages) {
          const response = await fetch(image.src);
          const blobData = await response.blob();
          const file = new File([blobData], `${image.id}.png`, { type: "image/png" });
          formData.append("images", file);
      }

      const normalizedTexts = texts.map(text => ({
          ...text,
          startTime: parseFloat(text.startTime),
          endTime: parseFloat(text.endTime),
          x: parseInt(text.x),
          y: parseInt(text.y),
          fontSize: parseInt(text.fontSize),
          opacity: parseInt(text.opacity)
      }));

      // Single JSON blob describing every element; the binaries travel in
      // the "videos"/"images" parts appended above.
      formData.append("metadata", JSON.stringify({
          videos: normalizedVideos,
          images: normalizedImages,
          texts: normalizedTexts
      }));

      const response = await fetch("my_flask_endpoint", {
          method: "POST",
          body: formData
      });

      // BUG FIX: the original only logged the failure and then fell through,
      // downloading the server's error body as "final_video.mp4".  Abort
      // instead and surface the server's message.
      if (!response.ok) {
          const errBody = await response.text();
          throw new Error(`Export failed (${response.status}): ${errBody}`);
      }

      // Turn the returned MP4 into a one-off download link, then release
      // the object URL to avoid leaking the blob.
      const finalVideo = await response.blob();
      const url = URL.createObjectURL(finalVideo);
      const a = document.createElement("a");
      a.href = url;
      a.download = "final_video.mp4";
      a.click();
      URL.revokeObjectURL(url);

    } catch (e) {
      console.log(e, "err");
    }
  };

On the frontend, each object type — text, image, and video — is stored as an array of objects. Below is the data structure for each object:
  const newVideo = {
      id: uuidv4(),
      src: URL.createObjectURL(videoData.videoBlob),
      originalDuration: videoData.duration,
      duration: videoData.duration,
      startTime: 0,
      playbackOffset: 0,
      endTime: videoData.endTime || videoData.duration,
      isPlaying: false,
      isDragging: false,
      speed: 1,
      volume: 100,
      // (x, y) is the element's CENTRE position.
      // BUG FIX: x previously used window.innerHeight, which mis-centres the
      // video horizontally on any non-square window.
      x: window.innerWidth / 2,
      y: window.innerHeight / 2,
      width: videoData.width,
      height: videoData.height,
    };
    // Text overlay element.  (x, y) is the element's CENTRE, expressed in
    // preview-container pixels (containerWidth.width/.height) — NOTE(review):
    // the backend renders at 1920x1080, so these coordinates must be scaled
    // server-side or sent along with the container size; verify.
    const newTextObject = {
      id: uuidv4(),
      description: text,
      opacity: 100,          // percent, 0-100
      x: containerWidth.width / 2,
      y: containerWidth.height / 2,
      fontSize: 18,          // px in preview space
      duration: 20,          // seconds
      endTime: 20,
      startTime: 0,
      color: "#ffffff",
      backgroundColor: hasBG, // falsy = no box behind the text
      padding: 8,
      fontWeight: "normal",
      width: 200,
      height: 40,
    };

    // Image overlay element.  (x, y) is the CENTRE in preview-container
    // pixels — same scaling caveat as text/video elements: the backend's
    // 1920x1080 canvas uses a different coordinate space.
    const newImage = {
      id: uuidv4(),
      src: URL.createObjectURL(imageData),
      x: containerWidth.width / 2,
      y: containerWidth.height / 2,
      width: 200,
      height: 200,
      borderRadius: 0,       // NOTE(review): not consumed by the backend
      startTime: 0,          // seconds on the shared timeline
      endTime: 20,
      duration: 20,
      opacity: 100,          // percent, 0-100
    };

BACKEND CODE -

import os
import shutil
import subprocess
from flask import Flask, request, send_file
import ffmpeg
import json
from werkzeug.utils import secure_filename
import uuid
from flask_cors import CORS


app = Flask(__name__)
# NOTE(review): wildcard CORS is fine for local development but should be
# restricted to the frontend's origin before deploying.
CORS(app, resources={r"/*": {"origins": "*"}})


# Scratch area for per-request working directories.
UPLOAD_FOLDER = 'temp_uploads'
# exist_ok replaces the racy exists()-then-makedirs check.
os.makedirs(UPLOAD_FOLDER, exist_ok=True)


@app.route('/')
def home():
    """Trivial health-check endpoint confirming the server is running."""
    greeting = 'Hello World'
    return greeting


# Fixed render resolution of the exported video.  Frontend coordinates are
# expressed in preview-container pixels, so they must be mapped into this
# space before being used in the ffmpeg filter graph.
OUTPUT_WIDTH = 1920
OUTPUT_HEIGHT = 1080



@app.route('/process', methods=['POST'])
def process_video():
    """Composite uploaded videos, images and text overlays into one MP4.

    Expects multipart/form-data with:
      * ``metadata`` -- JSON object with ``videos``, ``images`` and ``texts``
        lists; every (x, y) is the element's *centre*.  Optionally a
        ``canvas`` object ``{"width": w, "height": h}`` giving the preview
        container size; when present, all coordinates and sizes are scaled
        from preview space into the OUTPUT_WIDTH x OUTPUT_HEIGHT render
        target (this mismatch is what makes exported positions look wrong).
      * ``videos`` / ``images`` -- the binary files, in the same order as
        their metadata entries.

    Returns the rendered MP4 as an attachment, or ``{'error': ...}`` with
    HTTP 500 on failure.
    """
    work_dir = None
    try:
        # Per-request scratch directory so concurrent exports cannot clash.
        work_dir = os.path.abspath(os.path.join(UPLOAD_FOLDER, str(uuid.uuid4())))
        os.makedirs(work_dir)
        print(f"Created working directory: {work_dir}")

        metadata = json.loads(request.form['metadata'])
        print("Received metadata:", json.dumps(metadata, indent=2))

        # Preview-space -> output-space scale factors.  Backward compatible:
        # without a 'canvas' entry this degrades to the original 1:1 mapping.
        canvas = metadata.get('canvas') or {}
        scale_x = OUTPUT_WIDTH / canvas['width'] if canvas.get('width') else 1.0
        scale_y = OUTPUT_HEIGHT / canvas['height'] if canvas.get('height') else 1.0

        # Persist uploads; filenames are server-generated, so no user input
        # reaches the filesystem path.
        video_paths = []
        videos = request.files.getlist('videos')
        for idx, video in enumerate(videos):
            filename = f"video_{idx}.mp4"
            filepath = os.path.join(work_dir, filename)
            video.save(filepath)
            if os.path.exists(filepath) and os.path.getsize(filepath) > 0:
                video_paths.append(filepath)
                print(f"Saved video to: {filepath} Size: {os.path.getsize(filepath)}")
            else:
                raise Exception(f"Failed to save video {idx}")

        image_paths = []
        images = request.files.getlist('images')
        for idx, image in enumerate(images):
            filename = f"image_{idx}.png"
            filepath = os.path.join(work_dir, filename)
            image.save(filepath)
            if os.path.exists(filepath):
                image_paths.append(filepath)
                print(f"Saved image to: {filepath}")

        output_path = os.path.join(work_dir, 'output.mp4')

        filter_parts = []

        # NOTE(review): the total duration comes from the FIRST video only;
        # overlays whose endTime is later get cut off.  Presumably the
        # earliest-starting video defines the timeline -- confirm intent.
        base_duration = metadata["videos"][0]["duration"] if metadata["videos"] else 10
        filter_parts.append(f'color=c=black:s={OUTPUT_WIDTH}x{OUTPUT_HEIGHT}:d={base_duration}[canvas];')

        for idx, (path, meta) in enumerate(zip(video_paths, metadata['videos'])):
            raw_w = meta.get("width") or 0
            raw_h = meta.get("height") or 0
            # (x, y) is the element's centre; ffmpeg overlay wants top-left.
            # (x - w/2) * scale == centre*scale - scaled_w/2.
            x_pos = int((meta.get("x", 0) - raw_w / 2) * scale_x)
            y_pos = int((meta.get("y", 0) - raw_h / 2) * scale_y)
            # -1 lets ffmpeg keep the aspect ratio when no size was sent.
            scaled_w = int(raw_w * scale_x) if raw_w else -1
            scaled_h = int(raw_h * scale_y) if raw_h else -1

            filter_parts.extend([
                f'[{idx}:v]setpts=PTS-STARTPTS,scale={scaled_w}:{scaled_h}[v{idx}];',
                f'[{idx}:a]asetpts=PTS-STARTPTS[a{idx}];'
            ])

            if idx == 0:
                # First video sits directly on the black canvas.
                filter_parts.append(
                    f'[canvas][v{idx}]overlay=x={x_pos}:y={y_pos}:eval=init[temp{idx}];'
                )
            else:
                # Later videos only show during their [startTime, endTime].
                filter_parts.append(
                    f'[temp{idx-1}][v{idx}]overlay=x={x_pos}:y={y_pos}:'
                    f'enable=\'between(t,{meta["startTime"]},{meta["endTime"]})\':eval=init'
                    f'[temp{idx}];'
                )

        # BUG FIX: with no videos the original produced the label 'temp-1',
        # which never exists in the graph; the bare canvas is the base layer.
        last_video_temp = f'temp{len(video_paths) - 1}' if video_paths else 'canvas'

        if video_paths:
            # Mix all video audio tracks into a single [aout] stream.
            audio_mix_parts = []
            for idx in range(len(video_paths)):
                audio_mix_parts.append(f'[a{idx}]')
            filter_parts.append(f'{"".join(audio_mix_parts)}amix=inputs={len(video_paths)}[aout];')

        if image_paths:
            for idx, (img_path, img_meta) in enumerate(zip(image_paths, metadata['images'])):
                # Images follow the videos in the ffmpeg -i input order.
                input_idx = len(video_paths) + idx

                # Centre -> top-left, scaled into output space.
                x_pos = int((img_meta["x"] - img_meta["width"] / 2) * scale_x)
                y_pos = int((img_meta["y"] - img_meta["height"] / 2) * scale_y)
                img_w = int(img_meta["width"] * scale_x)
                img_h = int(img_meta["height"] * scale_y)

                # NOTE(review): overlay's 'alpha' option selects the alpha
                # FORMAT (straight/premultiplied), not an opacity level; a
                # fractional value may be rejected or ignored.  For opacity,
                # use format=rgba plus colorchannelmixer -- confirm.
                filter_parts.extend([
                    f'[{input_idx}:v]scale={img_w}:{img_h}[img{idx}];',
                    f'[{last_video_temp}][img{idx}]overlay=x={x_pos}:y={y_pos}:'
                    f'enable=\'between(t,{img_meta["startTime"]},{img_meta["endTime"]})\':'
                    f'alpha={img_meta["opacity"]/100}[imgout{idx}];'
                ])
                last_video_temp = f'imgout{idx}'

        if metadata.get('texts'):
            for idx, text in enumerate(metadata['texts']):
                # Last text in the chain writes the final [vout] label.
                next_output = f'text{idx}' if idx < len(metadata['texts']) - 1 else 'vout'

                # Escape drawtext metacharacters: backslash first, then the
                # quote, then ':' (an unescaped ':' ends the option value).
                escaped_text = (text["description"]
                                .replace("\\", "\\\\")
                                .replace("'", "\\'")
                                .replace(":", "\\:"))

                x_pos = int((text["x"] - text["width"] / 2) * scale_x)
                y_pos = int((text["y"] - text["height"] / 2) * scale_y)
                # Scale the font with the canvas so text keeps its relative size.
                font_size = max(1, int(text["fontSize"] * scale_y))

                text_filter = (
                    f'[{last_video_temp}]drawtext=text=\'{escaped_text}\':'
                    f'x={x_pos}:y={y_pos}:'
                    f'fontsize={font_size}:'
                    f'fontcolor={text["color"]}'
                )

                if text.get('backgroundColor'):
                    text_filter += f':box=1:boxcolor={text["backgroundColor"]}:boxborderw=5'

                # NOTE(review): drawtext's 'font' option requires an ffmpeg
                # built with fontconfig; otherwise 'fontfile=' is needed.
                if text.get('fontWeight') == 'bold':
                    text_filter += ':font=Arial-Bold'

                text_filter += (
                    f':enable=\'between(t,{text["startTime"]},{text["endTime"]})\''
                    f'[{next_output}];'
                )

                filter_parts.append(text_filter)
                last_video_temp = next_output
        else:
            # No texts: just rename the current chain head to [vout].
            filter_parts.append(f'[{last_video_temp}]null[vout];')

        filter_complex = ''.join(filter_parts)

        # Input order is videos-then-images, matching the [N:v] indices used
        # while building the filter graph above.  List form keeps
        # subprocess.run shell-free.
        cmd = [
            'ffmpeg',
            *sum([['-i', path] for path in video_paths], []),
            *sum([['-i', path] for path in image_paths], []),
            '-filter_complex', filter_complex,
            '-map', '[vout]'
        ]

        if video_paths:
            cmd.extend(['-map', '[aout]'])

        cmd.extend(['-y', output_path])

        print(f"Running ffmpeg command: {' '.join(cmd)}")
        result = subprocess.run(cmd, capture_output=True, text=True)

        if result.returncode != 0:
            print(f"FFmpeg error output: {result.stderr}")
            raise Exception(f"FFmpeg processing failed: {result.stderr}")

        # NOTE(review): the finally-block cleanup runs before the response
        # body is fully streamed; deleting an open file is fine on POSIX but
        # will break on Windows -- confirm the deployment target.
        return send_file(
            output_path,
            mimetype='video/mp4',
            as_attachment=True,
            download_name='final_video.mp4'
        )

    except Exception as e:
        print(f"Error in video processing: {str(e)}")
        return {'error': str(e)}, 500

    finally:
        # Best-effort cleanup; the directory is kept under FLASK_DEBUG so
        # intermediate files can be inspected.
        if work_dir and os.path.exists(work_dir):
            try:
                print(f"Directory contents before cleanup: {os.listdir(work_dir)}")
                if not os.environ.get('FLASK_DEBUG'):
                    shutil.rmtree(work_dir)
                else:
                    print(f"Keeping directory for debugging: {work_dir}")
            except Exception as e:
                print(f"Cleanup error: {str(e)}")

                
if __name__ == '__main__':
    # Development server only; debug=True must not be used in production.
    app.run(debug=True, port=8000)

I'm also attaching what the final result looks like in the frontend web app vs. in the downloaded video, and as you can see the downloaded video has all the coordinates and positions messed up — for the texts and images as well as the videos.

Can somebody please help me figure this out? :)

本文标签: