VHS to iTunes Compatible MP4
Project Page: github/vhs-convert
Outlined here is the method I used for converting my VHS tapes to H264 files. To leave myself extra options open, I encoded them in a format that is also fully compatible with iTunes and most Apple devices.
Setup
FFMPEG is doing all of the heavy lifting in this process. This program is available on all platforms.
I used Windows in my setup, so the commands will have to be changed slightly for other platforms. The required changes are those for device selection, as the DirectShow video source is the Windows specific
Install FFMPEG and make sure to add it to your PATH.
Device
A video capture device is required. I used the August VGB100 which is available for $30 in Canada.
Install the device as normal. You won’t need the conversion software that comes with the device, only the drivers.
Prep-Work
Device Identity (Windows)
Identify the name of the USB video capture device using the following command
ffmpeg -list_devices true -f dshow -i dummy
Sample Output:
[dshow @ 0000025c7faba520] DirectShow video devices (some may be both video and audio devices)
[dshow @ 0000025c7faba520] "Conexant Polaris Video Capture"
[dshow @ 0000025c7faba520] Alternative name "@device_pnp_\\?\usb#vid_1f4d&pid_0102&mi_01#6&1c48a6c&0&0001#{65e8773d-8f56-11d0-a3b9-00a0c9223196}\{9b365890-165f-11d0-a195-0020afd156e4}"
[dshow @ 0000025c7faba520] DirectShow audio devices
[dshow @ 0000025c7faba520] "Analog Audio In (USB2.0 Video Capture)"
[dshow @ 0000025c7faba520] Alternative name "@device_cm_{33D9A762-90C8-11D0-BD43-00A0C911CE86}\wave_{B93F2950-52BB-4B0D-86C2-25E92FAE012D}"
The video device here is Conexant Polaris Video Capture
Device Options
Describe the options for the device:
ffmpeg -f dshow -list_options true -i video="Conexant Polaris Video Capture"
[dshow @ 000002a2eb64a540] DirectShow video device options (from video devices)
[dshow @ 000002a2eb64a540] Pin "Capture" (alternative pin name "2")
[dshow @ 000002a2eb64a540] pixel_format=uyvy422 min s=88x72 fps=25 max s=720x576 fps=25
[dshow @ 000002a2eb64a540] pixel_format=uyvy422 min s=80x60 fps=29.97 max s=720x480 fps=29.97
[dshow @ 000002a2eb64a540] pixel_format=uyvy422 min s=88x72 fps=25 max s=720x576 fps=25
[dshow @ 000002a2eb64a540] pixel_format=gray min s=88x72 fps=25 max s=720x576 fps=25
[dshow @ 000002a2eb64a540] pixel_format=uyvy422 min s=80x60 fps=29.97 max s=360x240 fps=29.97
[dshow @ 000002a2eb64a540] pixel_format=gray min s=80x60 fps=29.97 max s=360x240 fps=29.97
[dshow @ 000002a2eb64a540] pixel_format=uyvy422 min s=88x72 fps=25 max s=360x288 fps=25
[dshow @ 000002a2eb64a540] pixel_format=gray min s=88x72 fps=25 max s=360x288 fps=25
[dshow @ 000002a2eb64a540] Pin "Audio Out" (alternative pin name "3")
[dshow @ 000002a2eb64a540] Crossbar Switching Information for Conexant Polaris Video Capture:
[dshow @ 000002a2eb64a540] Crossbar Output pin 0: "Video Decoder" related output pin: 1 current input pin: 0 compatible input pins: 0 1
[dshow @ 000002a2eb64a540] Crossbar Output pin 1: "Audio Decoder" related output pin: 0 current input pin: 2 compatible input pins: 2
[dshow @ 000002a2eb64a540] Crossbar Input pin 0 - "Video Composite" related input pin: 2
[dshow @ 000002a2eb64a540] Crossbar Input pin 1 - "S-Video" related input pin: 2
[dshow @ 000002a2eb64a540] Crossbar Input pin 2 - "Audio Line" related input pin: 0
For this USB device, the video device is both a video and audio device. This complicates the command as we need to select the correct crossbar input pins for audio and video. If the audio and video are two separate devices then the commands below can omit the crossbar pin selection.
- Video crossbar input is
Crossbar Input pin 0 - "Video Composite"
- Audio crossbar input is
Crossbar Input pin 2 - "Audio Line"
Encoding Command
Here is the command used for converting the video
ffmpeg -framerate 25 -vsync 1 -video_size 720x576 -f dshow -rtbufsize 250000k \
-crossbar_video_input_pin_number 0 -crossbar_audio_input_pin_number 2 \
-i video="Conexant Polaris Video Capture":audio="Conexant Polaris Video Capture" \
-t 4:10:00 -vf "fps=25,yadif=0:0:0,crop=w=702:h=576,hqdn3d=6:4:14:10,scale=640x480" \
-c:v libx264 -preset medium -tune film -profile:v main -level 3.1 -pix_fmt yuv420p \
-crf 25 -maxrate 1750k -bufsize 3500k -af "aresample=async=1000,highpass=200,lowpass=3500" \
-c:a aac -b:a 96k -strict -2 -r 25 out.m4v
Python Wrapper
To simplify usage, here is a simple Python wrapper for the ffmpeg capture command above. Modify it to customize the defaults to your preference or use command line arguments to configure.
Usage:
./capture.py --out test.m4v
Command Components
The arguments are broken down as follows:
Video Input Format
-framerate 25
Video framerate as 25 FPS-vsync 1
Video synchronization method, “Frames will be duplicated and dropped to achieve exactly the requested constant frame rate.” vsync reference-video_size 720x576
Video resolution. 720x576@25fps matches one of the video device option sets from above. In this case I am using PAL type VHS, so the 720x576@25 input is appropriate.
Video Input Device
-f dshow
DirectShow input format (Windows specific)-rtbufsize 250000k
Real-Time input buffer size. Defines a 250MB buffer to be used as needed for storing a backlog of frames to process. This is needed if you are operating near the processing limit of your computer and need to handle spikes of processing work without losing frames.-crossbar_video_input_pin_number 0
Video crossbar input pin from above-crossbar_audio_input_pin_number 2
Audio crossbar input pin from above-i video="Conexant Polaris Video Capture":audio="Conexant Polaris Video Capture"
Video and audio coming from the same video input device from above
Video Configuration
-t 4:10:00
Time after which to stop recording. VHS tapes will usually be marked asE-60
,E-120
,E-240
. The number after E is the number of minutes on the tape. Do allow for an extra 10 minutes or so after that as they are often slightly longer.Set the length to stop the recording at. This can be set based on the type of VHS being converted. Generally tapes will have an indicator on them somewhere as E-60, E-120, E-240. When in doubt, set the time to 4:10:00 and stop the recording by pressingQ
when the VHS player reaches the end of the tape.
Video Filtering
-vf
Start video filter definitions (comma separated, ordered processing)fps=25
Extra pedantic conversion to 25FPS, adding here seems to help avoid spammy console warnings sayingPast duration X.XX is too large
yadif=0:0:0
Deinterlace the video input. YADIF filter referencecrop=w=702:h=576
Crop off the edges of the video to the correct 4:3 aspect ratio PAL AR. Crop filter Referencehqdn3d=6:4:14:10
Remove noise from the video (also improves compression). First two parameters are spatial strength, second two are temporal strength. Hqdn3d filter referencescale=640x480
resize down to 640x480 which is a more standard 4:3 AR resolution. Scale filter reference
Video Compressor
-c:v libx264
Use x264 to generate h264 encoded video-preset medium
Preset defines how hard the engine will work to compress video. The useful values in orderveryslow
,slower
,slow
,medium
,fast
,faster
,veryfast
. Use the slowest setting that your computer can keep up with without maxing out your processor. If your computer can’t keep up, you will start getting errors “real-time buffer […] too full or near too full (78% of size: 250000000 [rtbufsize parameter])! frame dropped!”.-tune film
Tuning for the x264 engine. “film is for live-action content: anything shot on a camera, as opposed to cell animation or computer generated text/charts” tune options-profile:v main -level 3.1
Set the H264 compatibility settings. These values will support almost all devices including: iPad (all versions), Apple TV 2 and later, and iPhone 4 and later Apple reference-pix_fmt yuv420p
Required pixel format for maximum compatibility-crf 25
Constant quality setting for video. Lower number is higher quality. Sane range is 18-28. Uses whatever bitrate is required to retain a generally constant quality to the video. encoder reference-maxrate 1750k
Set a maximum bitrate for the output video. This will override the crf for extremely noisy content to keep the file from getting too large.-bufsize 3500k
Should be used alongsidemaxrate
. Effectively determines how often themaxrate
is verified (3500k/1750kps) = every 2 seconds
Audio Filtering
-af
Start audio filter definitionsaresample=async=1000
“Stretch/squeeze samples to the given timestamps, with a maximum of 1000 samples per second compensation” aresample filter referencehighpass=200,lowpass=3500
Drop audio outside of the human voice range to remove noise.
Audio Compressor
-c:a aac
Use the AAC audio codec which is highly compatible (including Apple devices)-b:a 96k
Use 96 kbps audio bitrate. This is more than enough for the audio quality stored on VHS (especially after the frequency filtering above)-strict -2
Required on some platforms to enable the AAC codec
Output
-r 25
Force output framerate as constant 25 FPSout.m4v
Set the output file name. m4v is the expected extension for iTunes. mp4 is also an appropriate extension.
Alternate Output (With Display)
It may be helpful to see the output as it is recording. Note that this will require extra processing power and may require a faster x264 preset value.
Replace out.m4v
with
-f tee -map 0:v -map 0:a "out.m4v|[f=mpegts]udp://localhost:10001"
This will simultaneously write to out.m4v and send all video over UDP to port 10001. Then we can monitor the video output on that port with ffplay (which comes with ffmpeg)
ffplay -an -fast -framedrop -noinfbuf -vf scale=w=in_w/2:h=in_h/2 -loglevel panic \
-i udp://localhost:10001?listen
Note that -an
disables playing out audio, and the input is scaled down by half for display.
DANGER Unlike the normal out.m4v output, this command will NOT prompt before overwriting an output file. Extra care must be taken to avoid overwriting your output accidentally.
Trimming
If there is excessive content after the recorded video ends, it can be safely trimmed off without any need to re-encode the video.
Open the video in a player such as VLC and find the start and end of the video stream. Then trim the video as follows:
ffmpeg -ss 10 -i out.m4v -c copy -to 1:10:00 out-trimmed.m4v
-ss 10
Seek input to 10 seconds. Unfortunately without re-encoding the video, the start of the video cannot be trimmed correctly between key-frames,ss
should find the closest key-frame to seek to and start there. Leave this parameter off entirely if there is an acceptably small amount of wasted video at the start-i out.m4v
Read from the previously written output file-c copy
Use thecopy
codec for both audio and video, should process even large videos extremely quickly, since no re-encoding is being done-to 1:10:00
End the video copy at a specific time marker, everything after this point will be trimmed offout-trimmed.m4v
The output file to write to