Displaying caption transcript with React
Fabian Cook
September 22, 2020

    In this article, we'll learn how to display caption transcripts of your videos using React.

    Video captions are commonly written in a format known as WebVTT. The format is straightforward, with only a couple of rules.

    The file starts with WEBVTT, which we can use to confirm we're reading the right kind of file. This line can also contain a bit of text, known as the header. After a blank line, we're going to find either a note or a cue.

    A note is prefixed with NOTE. Notes can be either single-line or multiline.

    A cue might start with an identifier, but typically starts with a time index, followed by the cue body, which is the text displayed at the described time index.

    The WebVTT files are going to look something like this:

    WEBVTT Header text
    
    NOTE A single line note
    
    NOTE
    A multiline note,
    which spans multiple lines
    
    00:00:00.000 --> 00:00:02.000
    This is the first two seconds
    
    00:00:02.001 --> 00:00:04.000
    This is the second two seconds
    
    00:00:04.001 --> 00:00:06.000
    This is the last of the video

    WebVTT does have many other little options that can be included, for example STYLE, but instead of discussing and implementing all that, we're going to utilise a pre-built module to parse & output our captions, meaning all we have to do is implement our transcription component.
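
    To get a feel for what such a module does under the hood, here is a minimal sketch that handles only plain cues. The names timestampToMillis and parseSimpleCues are hypothetical helpers for illustration, not part of any library, and the sketch ignores identifiers, cue settings, and STYLE blocks.

```javascript
// Hypothetical helper: convert a WebVTT timestamp such as "00:00:02.001"
// or "00:02.001" (the hours part is optional) into milliseconds.
function timestampToMillis(timestamp) {
  const match = /^(?:(\d{2,}):)?(\d{2}):(\d{2})\.(\d{3})$/.exec(timestamp.trim());
  if (!match) {
    throw new Error(`Invalid WebVTT timestamp: ${timestamp}`);
  }
  const [, hours = "0", minutes, seconds, millis] = match;
  const totalSeconds = (Number(hours) * 60 + Number(minutes)) * 60 + Number(seconds);
  return totalSeconds * 1000 + Number(millis);
}

// Hypothetical helper: collect plain cues, skipping the header and notes.
// This is a sketch, not a full WebVTT parser.
function parseSimpleCues(raw) {
  const blocks = raw.replace(/\r\n?/g, "\n").trim().split(/\n\s*\n/);
  const cues = [];
  for (const block of blocks) {
    const lines = block.split("\n").map(line => line.trim());
    const timingIndex = lines.findIndex(line => line.includes("-->"));
    if (timingIndex === -1) continue; // header, NOTE, or anything without a time index
    const [startText, endAndSettings] = lines[timingIndex].split("-->");
    const endText = endAndSettings.trim().split(/\s+/)[0];
    cues.push({
      start: timestampToMillis(startText),
      end: timestampToMillis(endText),
      text: lines.slice(timingIndex + 1).join("\n")
    });
  }
  return cues;
}

const sample = [
  "WEBVTT Header text",
  "",
  "NOTE A single line note",
  "",
  "00:00:00.000 --> 00:00:02.000",
  "This is the first two seconds",
  "",
  "00:00:02.001 --> 00:00:04.000",
  "This is the second two seconds"
].join("\n");

console.log(parseSimpleCues(sample));
```

    Each cue comes out as an object with start and end in milliseconds plus the cue text. A real parser covers far more of the format, which is why we lean on a module for the rest of this article.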

    For this project we're going to use Codesandbox.io with the React starter template.

    Now let's map out what we're going to build for our component. We will need:

    • A component to display the time component of the captions
    • A component to display the text component of the captions
    • A component to display individual captions
    • A component to consume the captions, and list them
    • Something to take the raw captions, parse, and then display

    We're going to implement our components from the smallest to the "largest", but first we're going to define the model object we'll be consuming. To parse our captions we're going to utilise subtitle, so our best option is to follow the same model returned by that module, which will look like:

    [
      {
        "start": 0,
        "end": 2000,
        "text": "This is the first two seconds"
      },
      {
        "start": 2001,
        "end": 4000,
        "text": "This is the second two seconds"
      }
    ]

    Let's start with our time component, which we're going to create in time.js. This component will accept three props: start, end, and format. We're going to wrap it in memo as well, which will stop any unneeded re-renders, as we only want to re-render when the props change.

    We're going to allow end to be undefined, which lets us control externally whether to show a time range or just start. Showing only start results in less text to read when end would display the same as the next caption's start.

    format is going to be used so we can keep a consistent format across all timestamps, which will be calculated by our getFormat function shown below.

    We're going to utilise luxon to format our durations. You will need to add this as a dependency within Codesandbox.

    Here is our time module, with both the format generation and the really small component that we can use in our caption component later on.

    import React, { memo } from "react";
    import { Duration } from "luxon";
    
    export function getFormat(total) {
      const totalDuration = Duration.fromMillis(total);
      const format = "yy:MM:dd:hh:mm:ss";
    
      // Let's get our total, but with all our possible values
      const totalFormatted = totalDuration.toFormat(format);
      // We're going to split up our formatted string, convert each section
      // to a number, then get the index of the first section above zero
      const firstIndexWithoutZero = totalFormatted
        .split(":")
        .map(value => Number(value))
        .findIndex(numeric => numeric > 0);
    
      const formatSplit = format.split(":");
      // Always show at least mm:ss, even when the total is zero
      // or shorter than a minute
      const minimumIndex = formatSplit.length - 2;
      const indexToSplitFrom =
        firstIndexWithoutZero === -1
          ? minimumIndex
          : Math.min(firstIndexWithoutZero, minimumIndex);
      return formatSplit.slice(indexToSplitFrom).join(":");
    }
    
    export default memo(function CaptionTime({ start, end = undefined, format }) {
      const startDuration = Duration.fromMillis(start),
        endDuration = end && Duration.fromMillis(end);
      const startFormatted = startDuration.toFormat(format),
        endFormatted = end && endDuration.toFormat(format);
      return (
        <div className="caption-time">
          {startFormatted}
          {end ? ` - ${endFormatted}` : undefined}
        </div>
      );
    });

    We can test our component by adding it to the default component in index.js:

    // At the top of our file
    import CaptionTime, { getFormat } from "./time";
    
    // Within `<div>` of the `App` component:
    {getFormat(5000)}
    <CaptionTime start={2000} format={getFormat(5000)} />
    <CaptionTime start={2001} end={5000} format={getFormat(5000)} />

    Within the viewer, we should see:

    mm:ss
    00:02
    00:02 - 00:05
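
    To sanity-check those strings without rendering anything, here is a dependency-free sketch of the same idea. formatMillis is a hypothetical stand-in for luxon's formatting that only understands the hh, mm and ss tokens, and pickFormat is a simplified, hypothetical version of getFormat that only chooses between mm:ss and hh:mm:ss. Both assume values are truncated to whole seconds, matching the output above.

```javascript
// Hypothetical stand-in for duration formatting, limited to the
// "hh", "mm" and "ss" tokens. The first token absorbs any overflow.
function formatMillis(millis, format) {
  const unitSeconds = { hh: 3600, mm: 60, ss: 1 };
  let remaining = Math.floor(millis / 1000);
  return format
    .split(":")
    .map(token => {
      const value = Math.floor(remaining / unitSeconds[token]);
      remaining -= value * unitSeconds[token];
      return String(value).padStart(2, "0");
    })
    .join(":");
}

// Simplified format choice: include hours only when the total needs them.
function pickFormat(totalMillis) {
  return totalMillis >= 3600 * 1000 ? "hh:mm:ss" : "mm:ss";
}

const format = pickFormat(5000);
console.log(format); // mm:ss
console.log(formatMillis(2000, format)); // 00:02
console.log(`${formatMillis(2001, format)} - ${formatMillis(5000, format)}`); // 00:02 - 00:05
```

    Computing the format once from the total and reusing it for every timestamp is what keeps the column of times the same width.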

    Next we're going to create our text component within text.js. This component is super small, and wraps our text in an element we're going to use for styling. It accepts a single prop, text. An additional feature of this component is that it replaces new lines with <br/>, which allows for multi-line captions.

    To know whether we need a line break (<br/>), we're going to compare the index with the length of the split array. If it's not the last element, we'll add one in.

    import React, { memo, Fragment } from "react";
    
    export default memo(function CaptionText({ text }) {
      return (
        <div className="caption-text">
          {text.split("\n").map((item, index, array) => (
            <Fragment key={index}>
              {item}
              {/* Add a line break after every line except the last */}
              {index + 1 !== array.length ? <br /> : undefined}
            </Fragment>
          ))}
        </div>
      );
    });

    The same as with our time component, we can test this by adding it to our App component:

    // At the top of our file
    import CaptionText from "./text";
    
    // Within `<div>` of the `App` component:
    <CaptionText text="This is a test" />
    <CaptionText text={"This is a test\nWith multiple lines\nThis is another!"} />

    Just a note: when passing \n within a string, within a prop, wrap the string with {}, else the character will be converted to \\n!
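
    The difference is easy to see in plain JavaScript. This small sketch uses two hypothetical variables to mimic what the component receives in each case:

```javascript
// What a JSX string attribute such as text="a\nb" hands to the component:
// a backslash followed by the letter n, with no real line break.
const fromStringAttribute = "This is a test\\nWith multiple lines";

// What text={"a\nb"} hands to the component: a real newline character.
const fromExpression = "This is a test\nWith multiple lines";

console.log(fromStringAttribute.split("\n").length); // 1
console.log(fromExpression.split("\n").length); // 2
```

    Only the second string contains a newline for split("\n") to find, so only it produces a <br/>.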

    Within the viewer, we should see:

    This is a test
    This is a test
    With multiple lines
    This is another!

    Now, let's create our caption component in caption.js. This will take the props required by CaptionTime and CaptionText. We're going to define it as a list item as well:

    import React, { memo } from "react";
    import CaptionTime from "./time";
    import CaptionText from "./text";
    
    export default memo(function Caption({ start, end = undefined, format, text }) {
      return (
        <li className="caption">
          <CaptionTime start={start} end={end} format={format} />
          <CaptionText text={text} />
        </li>
      );
    });

    Now, as with our components before, we'll add it to our App component to see how it works:

    // At the top of our file
    import Caption from "./caption";
    
    // Within `<div>` of the `App` component:
    <Caption
      start={2001}
      end={5000}
      format={getFormat(5000)}
      text="This is a test with time"
    />

    Within the viewer, we should see:

    00:02 - 00:05
    This is a test with time

    Now, let's create our list component within captions.js. This will take the original array we defined earlier, and it is the component we will use to find our consistent format. Beyond that, we're going to map our captions directly to Caption components, and define our captions element as a list:

    import React, { memo } from "react";
    import { getFormat } from "./time";
    import Caption from "./caption";
    
    function getFormatFromList(captions) {
      if (!captions.length) return "mm:ss";
      return getFormat(captions[captions.length - 1].end);
    }
    
    function shouldUseEnd(end, index, captions) {
      const nextCaption = captions[index + 1];
      if (!nextCaption) {
        return false; // The end of the video
      }
      // Captions usually follow the rule where, if they are
      // continuous, the next caption starts at the next millisecond
      return end !== nextCaption.start && end + 1 !== nextCaption.start;
    }
    
    export default memo(function Captions({ captions }) {
      const format = getFormatFromList(captions);
    
      return (
        <ol className="captions">
          {captions.map(({ end, ...rest }, index) => {
            const endToUse = shouldUseEnd(end, index, captions) ? end : undefined;
            return <Caption key={index} end={endToUse} {...rest} format={format} />;
          })}
        </ol>
      );
    });

    Now, again, let's test!

    // At the top of our file
    import Captions from "./captions";
    
    // Within `<div>` of the `App` component:
    <Captions
      captions={
        [
          {
            "start": 0,
            "end": 2000,
            "text": "This is the first two seconds"
          },
          {
            "start": 2001,
            "end": 4000,
            "text": "This is the second two seconds"
          },
          {
            "start": 5000,
            "end": 6000,
            "text": "This is the third caption"
          }
        ]
      }
    />

    Now in our viewer we should see:

    00:00
    This is the first two seconds
    00:02 - 00:04
    This is the second two seconds
    00:05
    This is the third caption
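
    If it's not obvious why only the second caption shows an end time, we can trace shouldUseEnd over the same data in plain JavaScript. The function is copied from the list component; sampleCaptions and visibleEnds are names introduced just for this sketch.

```javascript
function shouldUseEnd(end, index, captions) {
  const nextCaption = captions[index + 1];
  if (!nextCaption) {
    return false; // The end of the video
  }
  // Hide the end when the next caption starts at the same millisecond
  // or at the very next one
  return end !== nextCaption.start && end + 1 !== nextCaption.start;
}

const sampleCaptions = [
  { start: 0, end: 2000, text: "This is the first two seconds" },
  { start: 2001, end: 4000, text: "This is the second two seconds" },
  { start: 5000, end: 6000, text: "This is the third caption" }
];

const visibleEnds = sampleCaptions.map(({ end }, index) =>
  shouldUseEnd(end, index, sampleCaptions) ? end : undefined
);

console.log(visibleEnds); // [ undefined, 4000, undefined ]
```

    The first caption ends one millisecond before the second starts, so its end is hidden. The second leaves a gap before the third, so its end is shown. The last caption has nothing after it, so its end is hidden too.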

    This is pretty good, we're almost there!

    Now let's just add a bit of styling. All we need to do is add this to our styles.css file, which will auto-size our "time column" so our "text column" is nicely aligned:

    .captions {
      display: grid;
      grid-template-columns: auto 1fr;
      text-align: left;
      border-bottom: 1px dashed rgb(140, 140, 140);
      list-style: none;
      margin: 0;
      padding: 0;
    }
    
    .caption {
      display: contents;
    }
    
    .caption .caption-text,
    .caption .caption-time {
      border-top: 1px dashed rgb(140, 140, 140);
    }
    
    .caption .caption-text {
      padding-left: 5px;
    }

    We can now use this to display our captions anywhere.

    Lastly we'll create a component in captions-raw.js that will take a raw captions string, which we will then parse and pass to the Captions component. We will need to add subtitle as a dependency beforehand.

    import React, { useMemo } from "react";
    import { parse } from "subtitle";
    import Captions from "./captions";
    
    export default function CaptionsRaw({ raw }) {
      const parsed = useMemo(() => parse(raw), [raw]);
      return <Captions captions={parsed} />;
    }

    We can now do our final test. For this we're going to define an additional variable outside of our App component that we will pass to our CaptionsRaw component:

    // At the top of our file
    import CaptionsRaw from "./captions-raw";
    
    const captions = `
    WEBVTT
    
    00:00.000 --> 00:02.000
    This is the first two seconds
    
    00:02.000 --> 00:04.000
    This is the second two seconds
    This is multiple lines
    Another
    
    00:05.000 --> 00:06.000
    This is the third caption
    `.trim();
    
    // Within `<div>` of the `App` component:
    <CaptionsRaw
      raw={captions}
    />

    We should then see in our viewer:

    00:00
    This is the first two seconds
    00:02 - 00:04
    This is the second two seconds
    This is multiple lines
    Another
    00:05
    This is the third caption

    And that's that!

    I hope you enjoyed this article, and get to using captions with your video media, along with being able to provide your users with a full transcript of videos.

    About the Author
    Fabian Cook

    JavaScript Developer.