Day 28 — A divide-and-conquer strategy for recording videos

16 September 2020 · recurse-center

This is nothing new; almost every video in the world is recorded this way! At first, I fought against this strategy by doing just one take, thinking that it would save time. I couldn't have been more wrong!

Disclaimer: I'm a new speaker, and I'm not very good at speaking. But I'm always trying to get better :) If you happen to watch any of my videos, and you have some feedback on how I could improve, then please send it to me!

Today I worked on recording my talk for JupyterCon. I used OBS Studio to record my screen and webcam. It's such an awesome piece of open-source software! I've recorded 3 videos this year, which is also the first time I've done anything like this. I still cringe when I hear my voice in a recording though. I guess I'm not alone in that, and I blame it on evolution, as it didn't account for us listening to our own recordings!

In April, I tried to finish the first video in just one take, thinking that it would save time. But it took me 8 takes over two days to finish it! I tried to record it during the day as I didn't have good lighting back when I was in Bangalore. Trying to record yourself during the day is such a bad idea (unless you have some soundproofing) because there are birds chirping, cars honking, and house bells ringing! These sounds, in addition to my own fumbles, would make me discard a whole take. Doing single (and long) takes can also become tiring when you have to say the same lines again and again.

Last month, I tried the single-take approach again for the second video, but eventually ended up with two takes. It didn't make sense to discard more than half of the video just because some dogs outside decided to start barking at 2 AM.

Just the thought of using video editing software (with its complex GUI) had discouraged me from doing multiple takes, but while recording the second video I discovered that I could use ffmpeg to crop and concatenate multiple takes!

So this time around, I pre-divided my video into scenes (there were 6 of them), and did multiple takes for each one. I was able to record the first 2 scenes in 2 takes, and the last 4 in 3 takes. I still did go through the whole talk in one go, but I would take a water break in between takes. Overall, it took less time to record the whole video compared to the first one earlier this year! Shorter scenes with a water break also helped reduce fatigue.

After I had all the scenes and takes, I picked the best take for each scene and noted down, in a file, the start and end times to crop it at.


  scenes/1/take-2.mp4 00:00:12 00:06:42
  scenes/2/take-2.mp4 00:00:05 00:04:41
  scenes/3/take-3.mp4 00:00:04 00:02:34
  scenes/4/take-3.mp4 00:00:03 00:03:30
  scenes/5/take-3.mp4 00:00:05 00:02:43
  scenes/6/take-3.mp4 00:00:04 00:02:48
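
As a quick sanity check on a file like this, here is a minimal sketch that parses each line, confirms the end time comes after the start time, and adds up the durations. The names `TAKES`, `parse_take` and `summarize` are made up for this example and are not part of my actual script.

```python
from datetime import datetime

# The takes listed above, copied verbatim: path, start time, end time.
TAKES = """\
scenes/1/take-2.mp4 00:00:12 00:06:42
scenes/2/take-2.mp4 00:00:05 00:04:41
scenes/3/take-3.mp4 00:00:04 00:02:34
scenes/4/take-3.mp4 00:00:03 00:03:30
scenes/5/take-3.mp4 00:00:05 00:02:43
scenes/6/take-3.mp4 00:00:04 00:02:48
"""

FMT = "%H:%M:%S"


def parse_take(line):
    """Turn one line into (path, duration in whole seconds)."""
    path, start, end = line.split()
    delta = datetime.strptime(end, FMT) - datetime.strptime(start, FMT)
    seconds = int(delta.total_seconds())
    if seconds <= 0:
        raise ValueError(f"end is not after start in: {line!r}")
    return path, seconds


def summarize(text):
    """Parse every non-blank line of the takes file."""
    return [parse_take(line) for line in text.splitlines() if line.strip()]


takes = summarize(TAKES)
total = sum(seconds for _, seconds in takes)

for path, seconds in takes:
    print(f"{path}: {seconds // 60}m {seconds % 60:02d}s")
print(f"total: {total // 60}m {total % 60:02d}s")  # total: 22m 25s
```

For the six takes above this adds up to 1345 seconds, so the final video should come out at roughly 22 and a half minutes.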

I guess this could be automated if I had one of those clickity-clacks (maybe a virtual one), so that it notes down the start and end times automatically when I clack. Maybe it could also be done by detecting a specific word in the audio with some audio processing. A team at RC is participating in a Kaggle competition where they have to build a model to detect bird sounds. I guess I could ask them for pointers, as detecting human clacks sounds so much simpler!
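
I haven't tried any of this, but the simplest version of the idea might look something like the sketch below: treat any sample louder than a threshold as a clack, and ignore anything that follows too closely (echoes). It assumes the audio has already been decoded into a list of amplitudes between -1.0 and 1.0; `find_clacks`, `threshold` and `min_gap` are names I made up, and a real recording would probably need something smarter than a fixed threshold.

```python
def find_clacks(samples, rate, threshold=0.8, min_gap=1.0):
    """Return the times (in seconds) at which a loud spike starts.

    samples   -- amplitudes normalized to the range [-1.0, 1.0]
    rate      -- number of samples per second
    threshold -- absolute amplitude that counts as a clack (a guess)
    min_gap   -- seconds after a clack during which spikes are ignored
    """
    times = []
    last = None
    for i, sample in enumerate(samples):
        t = i / rate
        if abs(sample) >= threshold and (last is None or t - last >= min_gap):
            times.append(t)
            last = t
    return times


# Synthetic example: 10 seconds of near-silence at 100 samples per second
rate = 100
samples = [0.01] * (10 * rate)
samples[200] = 0.95   # clack at 2.0 s
samples[205] = 0.90   # echo 0.05 s later, ignored because of min_gap
samples[750] = -0.99  # clack at 7.5 s

print(find_clacks(samples, rate))  # [2.0, 7.5]
```

The two detected times could then be written out as the start and end of a take, in the same format as the file above.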

After I had the file, I wrote some Python to call ffmpeg and concatenate all the takes.


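  import datetime as dt
  import os
  import subprocess
  import sys
  import time

  # assumption: the takes file and the output directory are passed as arguments
  inp, outpath = sys.argv[1], sys.argv[2]

  outs = []

  with open(inp, "r") as f:
      # skip blank lines so a trailing newline doesn't break the unpacking
      for i, l in enumerate(line for line in f if line.strip()):
          # split() without arguments tolerates indentation and extra spaces
          filepath, start, end = l.split()
          crop_result = os.path.join(outpath, f"crop-result-{i}.mp4")

          start_dt = dt.datetime.strptime(start, "%H:%M:%S")
          end_dt = dt.datetime.strptime(end, "%H:%M:%S")

          # calculate duration from start
          duration = time.strftime(
              "%H:%M:%S", time.gmtime((end_dt - start_dt).total_seconds())
          )

          # crop the video using a start time and duration; passing a list
          # avoids shell quoting problems, and check=True stops on failure
          subprocess.run(
              ["ffmpeg", "-i", filepath, "-ss", start, "-t", duration,
               "-async", "1", crop_result],
              check=True,
          )

          # concat.txt sits next to the cropped files, so the basename is enough
          outs.append(f"file {os.path.basename(crop_result)}")

  # create a concat.txt with cropped video filenames
  concat = os.path.join(outpath, "concat.txt")
  with open(concat, "w") as f:
      f.write("\n".join(outs))

  result = os.path.join(outpath, "result.mp4")

  # concat all cropped videos without re-encoding
  subprocess.run(
      ["ffmpeg", "-f", "concat", "-i", concat, "-c", "copy", result], check=True
  )

  print(f"Here's your video {result}")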

Video recording is such a tiring process. I don't plan to do it again anytime soon!

Vinayak Mehta
