I have a VTT file with captions for JW Player and I am trying to create an interactive transcript. For that to happen I need to read the VTT file into an array and then interact with the data.

Here is a snippet from the VTT file:

1
00:00:00 --> 00:00:05
65 MEP we have twitter handles for both of us on screen as well so if you want

2
00:00:05 --> 00:00:08
to interact with those afterwards that's the best way to do so now been going to

3
00:00:08,051 --> 00:00:12,310
be talking about a topic that's extremely important topic in my mind and

Here is my JavaScript so far:

$.get('http://dev.sharepoint-videos./test.vtt', function(data) {
     
     // Read all captions into an array
     var items = data.split('\n\r');
     
     console.log(items);
    
     //Loop through all captions
     $.each(items, function( index, value ) {
      
      var item = items[index].split('\n');
      console.log(item);    

      });

         
});

Here is what my console.log calls return:

0: "1
↵00:00:00 --> 00:00:05
↵65 MEP we have twitter handles for both of us on screen as well so if you want
"
1: "↵2
↵00:00:05 --> 00:00:08
↵to interact with those afterwards that's the best way to do so now been going to
"
2: "↵3
↵00:00:08,051 --> 00:00:12,310
↵be talking about a topic that's extremely important topic in my mind and
"

This is not the desired result. I am still new to JavaScript. What I am trying to do is read each caption into the array, then loop through it and grab the start time, the end time, and the caption text so that I can use them with the JW Player JS API.

asked Sep 18, 2015 at 22:41 by Requin Creative

3 Answers

This is what finally worked for me.

$.get('http://dev.sharepoint-videos./test.vtt', function(data) {
     
     // Read all captions into an array
     var items = data.split('\n\r\n');
     
     console.log(items);
    
     //Loop through all captions
     $.each(items, function( index, value ) {
      
      var item = items[index].split('\n');
      console.log(item);    

      });
 });
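The reason '\n\r\n' works where '\n\r' did not is most likely that the file is saved with Windows (CRLF) line endings, so the blank line between captions is really \r\n\r\n. Here is a minimal sketch that does not depend on the line-ending style. The name splitCueBlocks is made up for this example, and it assumes captions are separated by at least one blank line.

```javascript
// Hypothetical helper (not part of jQuery or JW Player): split caption text
// into blocks of lines, whatever line endings the file uses.
function splitCueBlocks(text) {
  return text
    .replace(/\r\n?/g, "\n") // normalize CRLF and lone CR to LF
    .trim()                  // drop leading and trailing blank lines
    .split(/\n{2,}/)         // one or more blank lines separate the blocks
    .map(function (block) {
      return block.split("\n"); // [number, timing line, caption line(s)]
    });
}

// Sample with CRLF endings, which is an assumption about the real file.
var sample =
  "1\r\n00:00:00 --> 00:00:05\r\nfirst caption\r\n\r\n" +
  "2\r\n00:00:05 --> 00:00:08\r\nsecond caption\r\n";

console.log(splitCueBlocks(sample));
```

Inside the $.get callback you would call splitCueBlocks(data) in place of the two split calls. Because the carriage returns are removed up front, no stray \r is left at the end of each line.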

A bit late here, but browsers now have built-in features for what you're trying to achieve.

First, make sure to format the VTT file as mentioned in the WebVTT document on MDN. Here's the formatted data per the specification.

Notice that I added the header WEBVTT and modified all the timestamps to enforce the timestamp format HH:MM:SS.TTT. The alternative allowed timestamp is MM:SS.TTT. See CueTimings on MDN for more info on this.

const data = `WEBVTT

1
00:00:00.000 --> 00:00:05.000
65 MEP we have twitter handles for both of us on screen as well so if you want

2
00:00:05.000 --> 00:00:08.000
to interact with those afterwards that's the best way to do so now been going to

3
00:00:08.000 --> 00:00:12.000
be talking about a topic that's extremely important topic in my mind and`;
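If the file has more than a handful of cues, the timestamp cleanup described above could be scripted rather than done by hand. This is only a sketch: normalizeTimestamp and normalizeTimingLine are names invented for this example, and it assumes the timing line holds nothing but the two timestamps (no cue settings after the end time).

```javascript
// Hypothetical helper: rewrite one timestamp as HH:MM:SS.TTT.
// Accepts HH:MM:SS or MM:SS, optionally followed by "." or "," and
// exactly three millisecond digits.
function normalizeTimestamp(timestamp) {
  var pattern = /^(?:(\d{2,}):)?(\d{2}):(\d{2})(?:[.,](\d{3}))?$/;
  var match = pattern.exec(timestamp.trim());
  if (!match) {
    throw new Error("Unrecognized timestamp: " + timestamp);
  }
  var hours = match[1] || "00";    // hours are optional in WebVTT
  var millis = match[4] || "000";  // missing fraction becomes .000
  return hours + ":" + match[2] + ":" + match[3] + "." + millis;
}

// Hypothetical helper: normalize both sides of a "start --> end" line.
function normalizeTimingLine(line) {
  return line.split("-->").map(normalizeTimestamp).join(" --> ");
}

console.log(normalizeTimingLine("00:00:08,051 --> 00:00:12,310"));
// prints "00:00:08.051 --> 00:00:12.310"
```

Unlike the hand-edited data above, this keeps the original milliseconds instead of rounding them down to .000.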

Now that the VTT data is valid, let's address the approach. The built-in features to use are the track element, the video element, and the TextTrack interface.

The TextTrack API needs a track element inside a video element, and the track needs a src pointing at the VTT file. Once that is set up, the track's mode has to be changed to "hidden" or "showing" so that its TextTrackCueList is populated with the cues parsed from the VTT file. Once the list is populated, we can read the id, timings, text, and other properties of each successfully parsed cue.

Note: in simpler terms a cue is a block in your VTT file. For instance, this is a cue:

1
00:00:00.000 --> 00:00:05.000
65 MEP we have twitter handles for both of us on screen as well so if you want

See the cue documentation on MDN for details.

Now, let's get to the code.

// Creating dummy video and track elements. 
const videoElement = document.createElement("video")
const trackElement = document.createElement("track")

// A dataURI for this example. Substitute correct URL here
const dataURL = "data:text/plain;base64," + btoa(data)

// variable to access TextTrack interface API
const track = trackElement.track

videoElement.append(trackElement)
trackElement.src = dataURL

// Important: set mode to hidden or showing. Default is disabled
track.mode = "hidden"

/** Replace this function with whatever you wanna do with Cues list.
  * This function takes the array-like object which is a cues list,
  * and extracts id, startTime, endTime, and text data from each cue.
*/
function processCues(cues) {
    for (const cue of cues) {
        console.log(`
            id: ${cue.id},
            startTime: ${cue.startTime},
            endTime: ${cue.endTime},
            text: ${cue.text}
        `)        
    }
}

// Could be optional for you. Apparently, track.cues was not instantly 
// populated so I'm using timeout here as a workaround. 
setTimeout(processCues, 50, track.cues)

Output:

id: 1,
startTime: 0,
endTime: 5,
text: 65 MEP we have twitter handles for both of us on screen as well so if you want

id: 2,
startTime: 5,
endTime: 8,
text: to interact with those afterwards that's the best way to do so now been going to
 
id: 3,
startTime: 8,
endTime: 12,
text: be talking about a topic that's extremely important topic in my mind and

Does this produce what you're after?

var data = `1
00:00:00 --> 00:00:05
65 MEP we have twitter handles for both of us on screen as well so if you want

2
00:00:05 --> 00:00:08
to interact with those afterwards that's the best way to do so now been going to

3
00:00:08,051 --> 00:00:12,310
be talking about a topic that's extremely important topic in my mind and`;

data.split("\n\n").map(function (item) {
  var parts = item.split("\n");
  return {
    number: parts[0],
    time: parts[1],
    text: parts[2],
  };
});

The above splits the data into groups on two consecutive newline characters, then splits each group on a single newline character.

Which results in:

[
  {
    "number": "1",
    "time": "00:00:00 --> 00:00:05",
    "text": "65 MEP we have twitter handles for both of us on screen as well so if you want"
  },
  {
    "number": "2",
    "time": "00:00:05 --> 00:00:08",
    "text": "to interact with those afterwards that's the best way to do so now been going to"
  },
  {
    "number": "3",
    "time": "00:00:08,051 --> 00:00:12,310",
    "text": "be talking about a topic that's extremely important topic in my mind and"
  }
]
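Since the goal is to get the start and end times separately, the time string can be split once more and converted to seconds. The sketch below uses two made-up helper names, toSeconds and parseTimeRange, and assumes timestamps look like the ones in the question (HH:MM:SS with an optional ",mmm" or ".mmm" part).

```javascript
// Hypothetical helper: convert "HH:MM:SS", "HH:MM:SS,mmm" or "HH:MM:SS.mmm"
// into a number of seconds.
function toSeconds(timestamp) {
  var parts = timestamp.trim().replace(",", ".").split(":");
  var seconds = 0;
  for (var i = 0; i < parts.length; i++) {
    // Each step to the right is worth 60 times less than the one before.
    seconds = seconds * 60 + parseFloat(parts[i]);
  }
  return seconds;
}

// Hypothetical helper: turn a "start --> end" string into two numbers.
function parseTimeRange(time) {
  var ends = time.split("-->");
  return { start: toSeconds(ends[0]), end: toSeconds(ends[1]) };
}

console.log(parseTimeRange("00:00:08,051 --> 00:00:12,310"));
// prints { start: 8.051, end: 12.31 }
```

Calling parseTimeRange(parts[1]) inside the map callback gives each item numeric start and end values, which are easier to compare against the player's current position than the raw string.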
