Displaying caption transcript with React
September 22, 2020
In this article, we'll learn how to display caption transcripts of your videos using React.
Video captions are typically written in a common format known as WebVTT. The format is straightforward, with only a couple of rules.
The file starts with WEBVTT, which we can use to confirm we're reading the right kind of file. This line can also contain a bit of text, which is known as the header. After a blank line, we're going to find either a note or a cue.
A note is prefixed with NOTE, and can be either single-line or multiline.
A cue might start with an identifier, but typically starts with a time index, followed by the cue body, which is the text that is displayed at the described time index.
A WebVTT file is going to look something like this:
WEBVTT Header text

NOTE A single line note

NOTE
A multiline note,
which spans multiple lines

00:00:00.000 --> 00:00:02.000
This is the first two seconds

00:00:02.001 --> 00:00:04.000
This is the second two seconds

00:00:04.001 --> 00:00:06.000
This is the last of the video
WebVTT does have many other little options that can be included, for example STYLE, but instead of discussing and implementing all of them, we're going to use a pre-built module to parse and output our captions, meaning all we have to do is implement our transcription component.
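Although we'll leave the real parsing to a module, the block structure described above can be illustrated with a rough sketch. The helper name classifyBlocks is hypothetical, and it assumes blocks are separated by blank lines and ignores everything else the format allows (identifiers, STYLE, cue settings), so treat it as an illustration rather than a parser:

```javascript
// Hypothetical helper, for illustration only: label each block of a WebVTT file.
function classifyBlocks(raw) {
  return raw
    .trim()
    .split(/\r?\n\s*\r?\n/) // assumption: blocks are separated by blank lines
    .map((block, index) => {
      if (index === 0 && block.startsWith("WEBVTT")) return { type: "header", block };
      if (block.startsWith("NOTE")) return { type: "note", block };
      if (block.includes("-->")) return { type: "cue", block };
      return { type: "unknown", block };
    });
}

const sample = [
  "WEBVTT Header text",
  "",
  "NOTE A single line note",
  "",
  "00:00:00.000 --> 00:00:02.000",
  "This is the first two seconds"
].join("\n");

const blockTypes = classifyBlocks(sample).map(({ type }) => type);
console.log(blockTypes); // [ 'header', 'note', 'cue' ]
```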
For this project we're going to use Codesandbox.io with the React starter template.
Now let's map out what we're going to build for our component. We will need:
- A component to display the time component of the captions
- A component to display the text component of the captions
- A component to display individual captions
- A component to consume the captions, and list them
- Something to take the raw captions, parse, and then display
We're going to implement our components from the smallest to the "largest", but first we need to define the model object we're going to be consuming. To parse our captions we're going to use subtitle, so our best option is to follow the same model that it returns, which will look like:
[
  {
    "start": 0,
    "end": 2000,
    "text": "This is the first two seconds"
  },
  {
    "start": 2001,
    "end": 4000,
    "text": "This is the second two seconds"
  }
]
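The start and end values are the cue timings expressed in milliseconds. The subtitle module does this conversion for us, so the following is only a sketch of how a timing line maps to the model. The names timestampToMillis and cueToModel are hypothetical and are not part of subtitle:

```javascript
// Hypothetical helper: convert "hh:mm:ss.ttt" or "mm:ss.ttt" to milliseconds.
function timestampToMillis(timestamp) {
  const parts = timestamp.trim().split(":");
  const seconds = parseFloat(parts.pop()); // "02.001" -> 2.001
  const minutes = parseInt(parts.pop() || "0", 10);
  const hours = parseInt(parts.pop() || "0", 10); // the hours section is optional
  return Math.round(((hours * 60 + minutes) * 60 + seconds) * 1000);
}

// Hypothetical helper: build one model entry from a timing line and its text.
function cueToModel(timingLine, text) {
  const [startPart, endPart] = timingLine.split("-->");
  return {
    start: timestampToMillis(startPart),
    // Ignore any cue settings that follow the end timestamp
    end: timestampToMillis(endPart.trim().split(/\s+/)[0]),
    text
  };
}

const secondCue = cueToModel(
  "00:00:02.001 --> 00:00:04.000",
  "This is the second two seconds"
);
console.log(secondCue.start, secondCue.end); // 2001 4000
```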
Let's start with our time component, which we're going to create in time.js. This component will accept three props: start, end, and format. We're going to wrap it in memo as well, which will stop any unneeded re-renders, as we only want to re-render when the props change.
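To get a feel for what "when the props change" means, here is a rough approximation of the shallow comparison memo performs by default. The function shallowEqualProps is a hypothetical stand-in written for this article, not React's own code:

```javascript
// Hypothetical helper: approximates the shallow props comparison memo does by default.
function shallowEqualProps(prevProps, nextProps) {
  const prevKeys = Object.keys(prevProps);
  const nextKeys = Object.keys(nextProps);
  if (prevKeys.length !== nextKeys.length) return false;
  return prevKeys.every(
    key =>
      Object.prototype.hasOwnProperty.call(nextProps, key) &&
      Object.is(prevProps[key], nextProps[key])
  );
}

// Same primitive values: the re-render would be skipped
const sameProps = shallowEqualProps(
  { start: 0, end: 2000, format: "mm:ss" },
  { start: 0, end: 2000, format: "mm:ss" }
);
// A changed value: the component would re-render
const changedProps = shallowEqualProps(
  { start: 0, end: 2000, format: "mm:ss" },
  { start: 0, end: 2500, format: "mm:ss" }
);
console.log(sameProps, changedProps); // true false
```

Because the comparison is shallow, passing primitives such as numbers and strings as props works well here, whereas a freshly created object or array would count as a change on every render.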
We're going to allow end to be undefined, which lets us control externally whether there is a time range, or whether we should just show start. This results in less text to read when end would display the same as the next start.
format is going to be used so we can keep a consistent format across all timestamps, and it will be calculated by our getFormat function shown below.
We're going to use luxon to format our durations, so you will need to add it as a dependency within Codesandbox.
Here is our time module, with both the format generation and the really small component that we can use in our caption component later on.
import React, { memo } from "react";
import { Duration } from "luxon";

export function getFormat(total) {
  const totalDuration = Duration.fromMillis(total);
  const format = "yy:MM:dd:hh:mm:ss";
  // Let's get our total, but with all our possible values
  const totalFormatted = totalDuration.toFormat(format);
  // We're going to split up our formatted string, convert each section to a
  // number, then get the index of the first section without a zero value
  const firstIndexWithoutZero = totalFormatted
    .split(":")
    .map(value => Number(value))
    .findIndex(numeric => numeric > 0);
  const formatSplit = format.split(":"),
    // Always show at least mm:ss, including when the total is zero or invalid
    minimumIndex = formatSplit.length - 2,
    indexToSplitFrom =
      firstIndexWithoutZero === -1
        ? minimumIndex
        : Math.min(firstIndexWithoutZero, minimumIndex);
  return formatSplit.slice(indexToSplitFrom).join(":");
}

export default memo(function CaptionTime({ start, end = undefined, format }) {
  const startDuration = Duration.fromMillis(start),
    endDuration = end && Duration.fromMillis(end);
  const startFormatted = startDuration.toFormat(format),
    endFormatted = end && endDuration.toFormat(format);
  return (
    <div className="caption-time">
      {startFormatted}
      {end ? ` - ${endFormatted}` : undefined}
    </div>
  );
});
We can test our component by adding it to the default component in index.js:
// At the top of our file
import CaptionTime, { getFormat } from "./time";
// Within `<div>` of the `App` component:
{getFormat(5000)}
<CaptionTime start={2000} format={getFormat(5000)} />
<CaptionTime start={2001} end={5000} format={getFormat(5000)} />
Within the viewer, we should see:
mm:ss
00:02
00:02 - 00:05
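If you'd like to check the format selection without installing anything, the idea can be sketched without luxon. This is a simplified stand-in that only knows about hours, minutes, and seconds; the names UNITS, pickFormat, and formatMillis are hypothetical and are not used elsewhere in this article:

```javascript
// Simplified stand-in for the luxon-based approach: hours, minutes, seconds only.
const UNITS = [
  { token: "hh", millis: 60 * 60 * 1000 },
  { token: "mm", millis: 60 * 1000 },
  { token: "ss", millis: 1000 }
];

// Start from the largest unit the total reaches, but never show less than mm:ss
function pickFormat(totalMillis) {
  const firstNonZero = UNITS.findIndex(({ millis }) => totalMillis >= millis);
  const minimumIndex = UNITS.length - 2;
  const from = firstNonZero === -1 ? minimumIndex : Math.min(firstNonZero, minimumIndex);
  return UNITS.slice(from).map(({ token }) => token).join(":");
}

// Fill each unit in the format from largest to smallest, padding to two digits
function formatMillis(millis, format) {
  const tokens = format.split(":");
  let remaining = millis;
  return UNITS.filter(({ token }) => tokens.includes(token))
    .map(({ millis: size }) => {
      const value = Math.floor(remaining / size);
      remaining -= value * size;
      return String(value).padStart(2, "0");
    })
    .join(":");
}

console.log(pickFormat(5000)); // mm:ss
console.log(formatMillis(2000, pickFormat(5000))); // 00:02
console.log(formatMillis(3723000, pickFormat(3723000))); // 01:02:03
```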
Next we're going to create our text component within text.js. This component is super small, and wraps our text in an element we're going to use for styling. An additional feature of this component is that it replaces new lines with <br/>, which allows for multi-line captions. The component accepts a single prop, text.
To know whether we need a line break (<br/>), we're going to compare the index with the length of the split array; if it's not the last element, we'll add one in.
import React, { memo, Fragment } from "react";

export default memo(function CaptionText({ text }) {
  return (
    <div className="caption-text">
      {text.split("\n").map((item, index, array) => (
        <Fragment key={index}>
          {item}
          {/* Add a line break after every line except the last */}
          {index + 1 !== array.length ? <br /> : undefined}
        </Fragment>
      ))}
    </div>
  );
});
The same as with our time component, we can test this by adding it to our App component:
// At the top of our file
import CaptionText from "./text";
// Within `<div>` of the `App` component:
<CaptionText text="This is a test" />
<CaptionText text={"This is a test\nWith multiple lines\nThis is another!"} />
Just a note: when passing \n within a string in a prop, wrap the string with {}, otherwise the backslash is kept as a literal character and the component receives \\n instead of a new line!
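The difference can be seen in plain JavaScript. The two variables below are only stand-ins for what the component would receive in each case:

```javascript
// What text={"..."} passes: a JavaScript string literal, so \n is a real new line
const fromExpression = "This is a test\nWith multiple lines";
// What text="..." passes: the backslash and the "n" are kept as two characters
const fromPlainAttribute = "This is a test\\nWith multiple lines";

console.log(fromExpression.split("\n").length); // 2
console.log(fromPlainAttribute.split("\n").length); // 1
```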
Within the viewer, we should see:
This is a test
This is a test
With multiple lines
This is another!
Now, let's create our caption component in caption.js. This will take the properties requested by CaptionTime and CaptionText, and we're going to define it as a list item as well:
import React, { memo } from "react";
import CaptionTime from "./time";
import CaptionText from "./text";

export default memo(function Caption({ start, end = undefined, format, text }) {
  return (
    <li className="caption">
      <CaptionTime start={start} end={end} format={format} />
      <CaptionText text={text} />
    </li>
  );
});
Now, as with our components before, we'll add it to our App component to see how it works:
// At the top of our file
import Caption from "./caption";
// Within `<div>` of the `App` component:
<Caption
  start={2001}
  end={5000}
  format={getFormat(5000)}
  text="This is a test with time"
/>
Within the viewer, we should see:
00:02 - 00:05
This is a test with time
Now, let's create our list component within captions.js. This takes the array we defined earlier, and it is the component we will use to find our consistent format. Beyond that, we're going to map our captions directly to caption components, and we're also going to define our captions element as a list:
import React, { memo } from "react";
import { getFormat } from "./time";
import Caption from "./caption";

function getFormatFromList(captions) {
  if (!captions.length) return "mm:ss";
  // The last caption ends the latest, so it decides the format for the whole list
  return getFormat(captions[captions.length - 1].end);
}

function shouldUseEnd(end, index, captions) {
  const nextCaption = captions[index + 1];
  if (!nextCaption) {
    return false; // The end of the video
  }
  // Captions usually follow the rule where if they
  // are continuous, the next caption starts at the next millisecond
  return end !== nextCaption.start && end + 1 !== nextCaption.start;
}

export default memo(function Captions({ captions }) {
  const format = getFormatFromList(captions);
  return (
    <ol className="captions">
      {captions.map(({ end, ...rest }, index) => {
        const endToUse = shouldUseEnd(end, index, captions) ? end : undefined;
        return <Caption key={index} end={endToUse} {...rest} format={format} />;
      })}
    </ol>
  );
});
Now, again, let's test!
// At the top of our file
import Captions from "./captions";
// Within `<div>` of the `App` component:
<Captions
  captions={[
    {
      "start": 0,
      "end": 2000,
      "text": "This is the first two seconds"
    },
    {
      "start": 2001,
      "end": 4000,
      "text": "This is the second two seconds"
    },
    {
      "start": 5000,
      "end": 6000,
      "text": "This is the third caption"
    }
  ]}
/>
Now in our viewer we should see:
00:00
This is the first two seconds
00:02 - 00:04
This is the second two seconds
00:05
This is the third caption
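To see why only the second caption shows an end time, we can run the same shouldUseEnd rule over the test data outside of React. This is just a quick check; the sampleCaptions and endsShown variables exist only for this snippet:

```javascript
// The same rule as in our list component, copied here so the snippet runs on its own
function shouldUseEnd(end, index, captions) {
  const nextCaption = captions[index + 1];
  if (!nextCaption) {
    return false; // The end of the video
  }
  return end !== nextCaption.start && end + 1 !== nextCaption.start;
}

const sampleCaptions = [
  { start: 0, end: 2000, text: "This is the first two seconds" },
  { start: 2001, end: 4000, text: "This is the second two seconds" },
  { start: 5000, end: 6000, text: "This is the third caption" }
];

const endsShown = sampleCaptions.map(({ end }, index) =>
  shouldUseEnd(end, index, sampleCaptions)
);
console.log(endsShown); // [ false, true, false ]
```

The first caption runs straight into the second, the second leaves a one second gap before the third, and the third is the last caption, so only the second one displays a range.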
This is pretty good, we're almost there!
Now let's just add a bit of styling. This couldn't be any easier: all we need to do is add the following to our styles.css file, which will auto-size our "time column" so our "text column" is nicely aligned:
.captions {
  display: grid;
  grid-template-columns: auto 1fr;
  text-align: left;
  border-bottom: 1px dashed rgb(140, 140, 140);
  list-style: none;
  margin: 0;
  padding: 0;
}
.caption {
  display: contents;
}
.caption .caption-text,
.caption .caption-time {
  border-top: 1px dashed rgb(140, 140, 140);
}
.caption .caption-text {
  padding-left: 5px;
}
We can now use this to display our captions anywhere.
Lastly we'll create a component in captions-raw.js that will take a raw captions string, which we will then parse and pass to the Captions component. We will need to add subtitle as a dependency beforehand.
import React, { useMemo } from "react";
import { parse } from "subtitle";
import Captions from "./captions";

export default function CaptionsRaw({ raw }) {
  // Only re-parse when the raw string changes
  const parsed = useMemo(() => parse(raw), [raw]);
  return <Captions captions={parsed} />;
}
We can now do our final test. For this we're going to define an additional variable outside of our App component that we will pass to our CaptionsRaw component:
// At the top of our file
import CaptionsRaw from "./captions-raw";

const captions = `
WEBVTT

00:00.000 --> 00:02.000
This is the first two seconds

00:02.000 --> 00:04.000
This is the second two seconds
This is multiple lines
Another

00:05.000 --> 00:06.000
This is the third caption
`.trim();

// Within `<div>` of the `App` component:
<CaptionsRaw raw={captions} />
We should then see in our viewer:
00:00
This is the first two seconds
00:02 - 00:04
This is the second two seconds
This is multiple lines
Another
00:05
This is the third caption
And that's that!
I hope you enjoyed this article, and that you get to use captions with your video media, along with providing your users with a full transcript of your videos.
Fabian Cook
Software Engineer @ Dovetail
JavaScript Developer.