Thursday 30 April 2015

Equirectangular image editing: region removal using graph cuts

I've been doing some work recently on editing equirectangular 360 panoramic images. Specifically, I've been using graph cuts (max-flow/min-cut) to find a good way to remove a slice of the sphere. This allows the removal of large or small slices, so anything from small pieces of equipment to entire crews can be removed. However, as it uses a graph cut technique to find a good seam, there are certain types of image for which it is more likely to produce good results, e.g. outdoor, natural scenes. For my tests I've been using forest scenes, like this Creative Commons one created by Peter Gawthrop:


I've written some software that allows the removal of an image slice, available on my GitHub. It's a Qt application that makes use of a number of other projects, specifically Graphcut Textures as implemented by Nghia Ho.

How it works

A 360 panorama is essentially a sphere. The viewer is at the centre of this sphere, onto which the image is mapped. If we wanted to remove an element of the panorama, we could remove a slice of the sphere and then stretch the remaining content to fill the hole. The issue here is that a hard edge will be seen at the cut point.
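In equirectangular terms, removing a slice of the sphere just means deleting a band of columns and resizing what's left back to the original width. Here's a minimal sketch of that naive approach, assuming OpenCV and NumPy (the filename and column range are arbitrary placeholders):

    import cv2
    import numpy as np

    # Load the equirectangular panorama (hypothetical filename).
    pano = cv2.imread("forest_equirectangular.png")
    height, width = pano.shape[:2]

    # Remove a wedge of the sphere by deleting a band of columns.
    cut_start, cut_end = 2000, 2800
    remaining = np.delete(pano, np.s_[cut_start:cut_end], axis=1)

    # Stretch the remaining content back to the original width. This hides
    # the missing wedge but leaves a hard seam where the two edges meet.
    stretched = cv2.resize(remaining, (width, height), interpolation=cv2.INTER_LINEAR)
    cv2.imwrite("forest_cut.png", stretched)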

For example, here's the original image, viewed as a rectilinear projection:


If we perform a simple rectangular cut on the equirectangular image, we'll end up with a very noticeable, hard line where the cut took place:


Instead, we can specify a region we want to remove and then let graph cuts find a good seam to disguise this removal. In the tool (available above), we specify a region to remove as well as an amount we're willing to lose in order to find a decent cut:


The red region indicates the section to be removed from the panorama. The "overlap size" specifies how many additional columns can be used to find a good cut. Specifying more here means losing more columns, but provides more flexibility as the algorithm looks for a good cut. Here's the cut (in green) for the overlap in the above image:

This is the 400 columns to the left of the cut, overlaid with the 400 columns to the right of the cut. Max-flow/min-cut is then used to find the best seam between them, as described in the Graphcut Textures paper. Here's the result of the above cut in the original panorama, viewed as a rectilinear projection:


This cut is fairly impressive. Some artefacts can be seen; for example, the branch at the top centre of the image ends abruptly where the cut has taken place. However, overall the effect is good.
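To make the seam-finding step more concrete, here's a minimal sketch of the Graphcut Textures idea in Python. It uses NetworkX's min-cut for clarity rather than the C++ max-flow implementation the tool actually uses, so it's illustrative only (and much slower); the find_seam helper and its inputs are hypothetical:

    import networkx as nx
    import numpy as np

    def find_seam(left, right):
        """Find a seam through two overlapping strips of shape (H, W, 3).

        Returns a boolean mask of shape (H, W): True = take the pixel from
        the left strip, False = take it from the right strip.
        """
        h, w, _ = left.shape
        diff = np.linalg.norm(left.astype(float) - right.astype(float), axis=2)

        g = nx.Graph()
        src, snk = "source", "sink"
        for y in range(h):
            for x in range(w):
                # Cost of cutting between two neighbouring pixels is the sum
                # of the colour differences at both, as in Graphcut Textures.
                if x + 1 < w:
                    g.add_edge((y, x), (y, x + 1), capacity=diff[y, x] + diff[y, x + 1])
                if y + 1 < h:
                    g.add_edge((y, x), (y + 1, x), capacity=diff[y, x] + diff[y + 1, x])
            # Constrain the outer columns: the leftmost column must come from
            # the left strip and the rightmost from the right strip.
            g.add_edge(src, (y, 0), capacity=float("inf"))
            g.add_edge((y, w - 1), snk, capacity=float("inf"))

        _, (left_side, _) = nx.minimum_cut(g, src, snk)
        mask = np.zeros((h, w), dtype=bool)
        for node in left_side:
            if node != src:
                mask[node] = True
        return mask

The two strips are then composited according to the mask and stitched back into the panorama in place of the removed slice.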

An issue is that a cut results in a stretching of the rest of the content. We've removed a fairly large amount of the sphere, so the rest of the image must stretch to avoid having a missing slice. In a forest scene, this doesn't introduce too many issues - the trees could plausibly be more squat than in the original. However, in certain circumstances, for example if a person was in the scene, this may not be acceptable. In such cases, techniques such as non-homogeneous stretching or additive seam carving could be used to avoid scaling the salient elements.

Monday 23 February 2015

Projection mapping in Unity

I recently wrote some software to allow projection mapping using Unity. The project code is available on my GitHub here. Notes on how to use the software are included in the README of the project.

The concept

Projection mapping - sometimes called Spatial Augmented Reality - is when media is projected onto a surface that is generally non-planar. The projection is warped in such a way as to align the visuals with the physical objects. There are several ways to achieve this, but here I consider a 3D mapping technique where a reasonably accurate 3D model of the projection surface has been created.

Physical projection surface (left) and a 3D model of the object (right)

A virtual camera in Unity is then calibrated to mimic the behaviour of the projector, so visuals captured in Unity can be output to the projector and align with the physical object. This technique is similar to that used by Mapamok; however, by building the system in Unity it's hoped that new content can be authored more easily.

The method

A virtual camera in Unity has the intrinsic and extrinsic matrices of the physical projector applied to it. These matrices are calculated with OpenCV's calibrateCamera function, using manually acquired point correspondences between the virtual object and the projector's view of the physical object.

One of the major hurdles was the different coordinate systems used by Unity and OpenCV. Unity uses a left-handed coordinate system, while OpenCV expects right-handed. This can be overcome by converting to right-handed before sending the point correspondences to OpenCV, and likewise flipping one of the axes in OpenCV's results.
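As a rough sketch of this step (in Python rather than the C# used in the project; the calibrate_projector helper, the seed intrinsics and the choice to flip the z axis are my own illustration), the OpenCV side looks something like this:

    import cv2
    import numpy as np

    def calibrate_projector(object_points_unity, image_points, width, height):
        """Estimate the projector's intrinsics and pose from manually acquired
        correspondences: object_points_unity is (N, 3) in Unity's left-handed
        coordinates, image_points is (N, 2) in projector pixels.
        """
        # Unity is left-handed, OpenCV expects right-handed: flip one axis
        # (here z, though any consistent choice works) before calibrating.
        object_points = np.asarray(object_points_unity, dtype=np.float32).copy()
        object_points[:, 2] *= -1.0
        image_points = np.asarray(image_points, dtype=np.float32)

        # With a single non-planar view, OpenCV needs an initial guess at the
        # intrinsics, so seed a plausible camera matrix and fix the distortion
        # terms (a projector has little lens distortion anyway).
        initial_k = np.array([[width, 0.0, width / 2.0],
                              [0.0, width, height / 2.0],
                              [0.0, 0.0, 1.0]])
        flags = (cv2.CALIB_USE_INTRINSIC_GUESS | cv2.CALIB_ZERO_TANGENT_DIST |
                 cv2.CALIB_FIX_K1 | cv2.CALIB_FIX_K2 | cv2.CALIB_FIX_K3)
        _, camera_matrix, _, rvecs, tvecs = cv2.calibrateCamera(
            [object_points.reshape(-1, 1, 3)], [image_points.reshape(-1, 1, 2)],
            (width, height), initial_k, None, flags=flags)

        # The extrinsics come back as a Rodrigues vector and a translation.
        rotation, _ = cv2.Rodrigues(rvecs[0])
        translation = tvecs[0].reshape(3)

        # Flip the same axis in the results to return to Unity's left-handed
        # frame (a similarity transform with diag(1, 1, -1)).
        flip = np.diag([1.0, 1.0, -1.0])
        return camera_matrix, flip @ rotation @ flip, flip @ translation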

The translation and rotation components of the extrinsic matrix are applied separately to the Unity camera. The translation is fairly simple; the rotation, however, must be converted into ZXY Euler angles before it can be used (the ZXY order matters because that's the order in which Unity applies Euler rotations). The intrinsic matrix is applied by swapping out the camera's projection matrix for one built from the calculated intrinsics.
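The intrinsic matrix can't be dropped into Unity's Camera.projectionMatrix as-is, since that expects a 4x4 GL-style projection. One common conversion looks like the sketch below (my own illustration, not the project's code); the signs in the third column depend on where the image origin is taken to be, so expect some tweaking when aligning with a real projector:

    import numpy as np

    def intrinsics_to_projection(camera_matrix, width, height, near=0.1, far=100.0):
        """Convert an OpenCV 3x3 intrinsic matrix into a GL/Unity-style
        4x4 projection matrix (one common convention)."""
        fx, fy = camera_matrix[0, 0], camera_matrix[1, 1]
        cx, cy = camera_matrix[0, 2], camera_matrix[1, 2]
        return np.array([
            [2 * fx / width, 0.0, 1.0 - 2 * cx / width, 0.0],
            [0.0, 2 * fy / height, 2 * cy / height - 1.0, 0.0],
            [0.0, 0.0, -(far + near) / (far - near), -2 * far * near / (far - near)],
            [0.0, 0.0, -1.0, 0.0],
        ])

In Unity the resulting matrix would be assigned to Camera.projectionMatrix, with the translation and ZXY Euler angles applied to the camera's transform.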

Results

After calibration, the model can be textured, animated and generally messed with to create some cool effects. Here's a very basic demo...

 Image displayed by projector (left) and how it appears on the surface (right)

The future

This project allows projection mapping of visuals using Unity3D. However, there are several elements missing before it could be considered a complete projection mapping solution. For example, there currently isn't support for multiple projectors. While it would be easy to calibrate more projectors using the current code, more would need to be done to blend the projections together. Additionally, there's no radiometric compensation. These are things I'm planning to look at soon.

Tuesday 17 February 2015

Shooting 360 degree video with the Ladybug3

I recently shot some panoramic content using Point Grey's Ladybug3 camera. The experience wasn't totally without issue, so I wanted to share some notes to help others who might hit similar problems.

Panoramic view of London on a grey day. View as a sphere here

The Ladybug3 uses six cameras with wide-angle lenses to capture panoramic content. It comes complete with stitching software that can output 360 degree images or video. The lack of a downward-facing camera means that the base of the sphere is not captured - this is seen as a black hole below the camera, for example if you look down when viewing the content in an Oculus Rift.

Camera setup

Ladybug3 connected to 17 inch MacBook Pro using FireWire 800 port

To capture content on the move I connected the Ladybug3 to a 17-inch MacBook Pro with a FireWire 800 port. The Ladybug software is Windows only, and VMware Fusion cannot virtualise FireWire ports, so the laptop was booted into Windows. The MacBook itself was capable of supplying enough power to the camera even when not plugged into the mains. It is also possible to use an ExpressCard to FireWire 800 adapter, powering the card via an external power supply. This method was shown to work by Paul Bourke in his setup and is the method recommended by Point Grey.

Software setup

I originally used the LadybugRecorder application as it's very simple. However, it provides very little control over the camera during filming. Instead, use the LadybugCapPro application to record data. This lets you control settings like exposure, shutter and gain.

Our camera was set to 15fps using JPEG 12-bit compression and full vertical resolution. As discussed here, using a rate of 15fps rather than the full-resolution maximum of 16fps makes it easier to upsample the video to a more standard 30fps later, since each frame can simply be doubled. Although the documentation is a little unclear on this for the Ladybug3, it is possible to stream at up to 32fps if half vertical resolution is used.

Stitching

Following advice from my colleague Richard Taylor, stitching was performed with the Point Grey software, outputting a sequence of PNG images. These images were then turned into a video file using separate software (discussed below). The quality of the video files output directly by Point Grey's software seemed poorer than with this approach; in fact, at high resolutions the video it produced was extremely lossy for me.

At high resolution and using a good colour mode (such as High Quality Linear) the stitching process can be quite lengthy. This can be improved by using a machine with a good GPU and lots of memory, and by setting in/out markers to avoid processing unnecessary frames. The in/out markers feature is quite hard to find in LadybugCapPro - you move the seeker to the desired frame and then click the icon of a blue arrow in a circle (found in the "Stream toolbar"). The "Parallel processing" option speeds up export but had a tendency to mess up some frames, so it was not used.

Setting in/out markers in LadybugCapPro to reduce number of frames processed

We used the "Panoramic" type, which creates an equirectangular panorama. This has the advantage of being supported by most playback software. However, it was pointed out by my colleague that this format doesn't make the best use of pixels and can distort content at the poles of the sphere.

After stitching to PNGs I noticed there were gaps in the image numbering produced by LadybugCapPro. I wrote a small Ruby script to identify these gaps, and then manually duplicated neighbouring frames to fill the holes. I didn't have many gaps, so this manual method was fine for me. However, the script could easily be updated to duplicate neighbouring frames automatically.
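My script was written in Ruby, but the idea is simple enough to sketch in a few lines of Python; the filename pattern below is an assumption, chosen to match the ffmpeg input pattern used later, and the gap-filling is the automatic duplication mentioned above:

    import os
    import re
    import shutil

    # Filename pattern of the stitched frames (adjust if yours differ).
    pattern = re.compile(r"ladybug_panoramic_(\d{6})\.png$")

    frames = sorted(int(m.group(1))
                    for m in map(pattern.match, os.listdir("."))
                    if m)
    present = set(frames)

    # Walk the full range of frame numbers, reporting each gap and filling
    # it by duplicating the nearest earlier frame.
    for number in range(frames[0], frames[-1] + 1):
        if number not in present:
            print(f"missing frame {number:06d}")
            previous = max(n for n in present if n < number)
            shutil.copy(f"ladybug_panoramic_{previous:06d}.png",
                        f"ladybug_panoramic_{number:06d}.png")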

Dropped frames and audio syncing

When syncing the audio with the output from the Ladybug, it became apparent that there was a problem. The frame rate was inconsistent, varying across the duration of the video. As our software assumes a constant frame rate, this resulted in erratic playback speed. Sometimes this wasn't a problem, but when syncing audio even a single dropped frame was noticeable.

As my Windows partition was small and the Ladybug generates large amounts of data, I attempted to save the data to an external hard drive using the MacBook's USB 2 port during filming. However, the transfer rate of USB 2 was too low, resulting in many dropped frames. It's important to choose settings that don't produce more data than your system can handle. In particular, you can adjust the JPEG compression ratio, the frame rate, and whether you use full or half vertical resolution. Regardless of the settings, however, frames can still sometimes be dropped.

You can test for dropped frames in a recording using LadybugCapPro, from inside the GPS menu under "Generate GPS/frame information". This provides a report of how many frames were dropped and where, which can be used to correct for the missing frames. I made a very basic script (available here) that parses the missing frame details out of this report and uses them to copy neighbouring frames into the gaps. (NB: this code is mostly for reference - it moves and copies files, so it should only be used if you know exactly what it's doing, and preferably on a copy of the data to avoid having to restitch if something goes wrong.)

Processing the PNGs

The PNG images can be turned into a video using separate software. My colleague loaded the files into Adobe After Effects, before exporting them as an Adobe Premiere project and using that software to turn them into a video file. I didn't have this software available, so I used ffmpeg instead, which is free. It's very easy to install on a Mac using Homebrew. The command I used to create the video file was:

 > ffmpeg -r 15 -i ladybug_panoramic_%06d.png -c:v libx264 -r 30 -pix_fmt yuv420p out.mp4

The first -r parameter is the capture frame rate; the second is the desired frame rate of the output. Refer to ffmpeg -help for more options.

Viewing in the Rift

To view the panoramic video in the Oculus Rift I used Kolor Eyes. This is free software that works on Windows and Mac.

References

I'd like to thank Richard Taylor for all his help - a lot of this post is based on his advice. Paul Bourke's Ladybug3 guide was also very helpful.

NB: Any code here was quickly thrown together, with few error checks and not much care. Standard caveats - use at your own peril.