Some Background on AI

Sixty years ago, the New York Times reported that an early artificial intelligence (A.I.) machine, funded by the military and created by Cornell University scientists, was “the embryo of an electronic computer that [the American Navy] expects will be able to walk, talk, see, write, reproduce itself and be conscious of its existence.” Well, I’m still waiting on the rise of a robotic butler to pour me a glass of wine, cook dinner, and respond on my behalf to trivial Instagram emoji engagement. Wait, that last one is real. WE LIVE IN THE FUTURE! THE ROBOTS ARE HERE! Sort of – with computational photography.

But before we get to consciousness, scientists are setting their sights lower (though along the same path), on something called machine learning. Machine learning is, basically, a program that can learn rules and ‘understand’ patterns by sifting through giant sets of data. In important ways, researchers are setting up software to teach itself.
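To make “learning a rule from data” a little more concrete, here is a toy sketch in Python. It is not how any product mentioned in this article works; the function names, the brightness numbers, and the day/night labels are all invented for illustration. Nobody types in the cutoff; the program picks the one that makes the fewest mistakes on the examples it is shown.

```python
def learn_threshold(examples):
    """Pick the brightness cutoff that misclassifies the fewest examples.

    examples: list of (brightness, label) pairs, label is "day" or "night".
    (Hypothetical toy data; real systems learn from millions of examples.)
    """
    best_threshold, best_errors = None, len(examples) + 1
    for candidate in sorted(b for b, _ in examples):
        # Rule being tested: brightness >= candidate means "day".
        errors = sum(
            1 for b, label in examples if (b >= candidate) != (label == "day")
        )
        if errors < best_errors:
            best_threshold, best_errors = candidate, errors
    return best_threshold


def predict(threshold, brightness):
    """Apply the learned rule to a photo the program has not seen before."""
    return "day" if brightness >= threshold else "night"


photos = [(10, "night"), (25, "night"), (40, "night"),
          (120, "day"), (180, "day"), (200, "day")]
threshold = learn_threshold(photos)
print(threshold, predict(threshold, 150), predict(threshold, 30))
```

Swap in different example photos and the rule changes without anyone rewriting the program, which is the whole trick.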

You may know it best from the rise of natural language simulation that Google Assistant and Amazon Echo are pioneering. “When I started, the Amazon machine-learning conference was just a couple hundred people; now it’s in the thousands,” says Ralf Herbrich, a leading AI researcher at Amazon.

But when is that sweet machine-learning sauce going to drizzle on photographers? It already has. The promo for Apple’s newest “portrait lighting” gimmick proclaims that they are “combining timeless lighting principles with advanced machine learning to create an iPhone that takes studio-quality portraits without the studio.” Meanwhile, Google has programmed the Snapseed app for mobile phones to analyze architectural asymmetry and automatically correct converging verticals, and then offer sliders to adjust the effect. Noice! But not much in advanced automation has appeared to benefit professional photographers in the recent past, apart from some cool selection tools in Photoshop.

Until now.

Some Background on CNN (Not the News Network)

NVIDIA (the company known best for its computer graphics cards and its market dominance in GPU circuitry) has been on a tear recently with breakthroughs in applying machine learning to practical, real-world visual effects. A few months ago, they produced results that unlock smooth-looking slow motion for ordinary videos, something that had otherwise been impossible. Their AI program studies motion clips to learn how movement unfolds between frames, then generates intermediate frames to build up a convincing appearance of slow motion. It’s kinda revolutionary. You can shoot a video at a traditional 30 frames per second, and NVIDIA shows how they can estimate the in-between motion so that it appears to have been shot at 240 frames per second.
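For a sense of scale, going from 30 to 240 frames per second means inventing seven new frames between every pair of real ones. The Python sketch below does that arithmetic and then fills the gaps the naive way, with a plain cross-fade. To be clear, this is not NVIDIA’s method: a cross-fade just ghosts one frame over the other, which is exactly the smeary look a learned approach is meant to avoid. The function names and the tiny two-pixel “frames” are assumptions for illustration.

```python
def intermediate_positions(source_fps, target_fps):
    """Fractional positions (between 0 and 1) of the frames to invent
    between two consecutive source frames."""
    if target_fps % source_fps != 0:
        raise ValueError("target_fps must be a whole multiple of source_fps")
    factor = target_fps // source_fps
    return [k / factor for k in range(1, factor)]


def blend_frames(frame_a, frame_b, t):
    """Naive linear cross-fade between two grayscale frames.

    Frames are lists of rows of pixel values. This is a stand-in for
    the hard part: a learned system estimates motion instead of fading.
    """
    return [
        [(1 - t) * a + t * b for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(frame_a, frame_b)
    ]


positions = intermediate_positions(30, 240)
print(len(positions))  # 7 new frames between each pair of real frames

first = [[0, 100]]   # hypothetical 1x2 frame
second = [[100, 0]]  # the next real frame
halfway = blend_frames(first, second, 0.5)
print(halfway)       # [[50.0, 50.0]]
```

Notice that the halfway frame is a flat grey: the bright pixel did not move from right to left, it simply faded out on one side and in on the other. Getting the pixel to actually travel is the part that needs learning.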

Now NVIDIA has released some shocking examples of how machine learning can be applied to a fundamental problem for digital photographers: noise artifacts.

Want to reduce noise in your images? Want to shoot at higher ISO levels and still edit clean images? Want to shoot with smaller, cheaper-sensor cameras and produce images with noise patterns similar to those of more expensive, larger-sensor cameras? It’s coming. Be sure to watch this video:

The example from the video of the koala is particularly striking. The image is filled with shadows and grey hues that are reconstructed from estimation. If an algorithm looks at only one pixel at a time, it will never see the koala; it needs to see everything before it can see anything. In other words,

“The human visual system efficiently recognizes and localizes objects within cluttered scenes. For artificial systems, however, this is still difficult due to viewpoint-dependent object variability, and the high in-class variability of many object types. […] The most successful hierarchical object recognition systems all extract localized features from input images, convolving image patches with filters. Filter responses are then repeatedly sub-sampled and re-filtered, resulting in a deep feed-forward network architecture whose output feature vectors are eventually classified.” 

For video and still photographs, NVIDIA is applying a class of machine learning analysis called a convolutional neural network (CNN). There are multiple ways to write a program so that it learns from data, but CNNs are a class of machine learning model that has been particularly good at image and object recognition.
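The two steps named in the quote above, “convolving image patches with filters” and then sub-sampling the responses, can be sketched in a few lines of plain Python. This is a minimal illustration under loud assumptions: the filter here is hand-picked to respond to a dark-to-bright vertical edge, whereas a real CNN learns its filter values from data and stacks many such layers. The function names and the tiny test image are made up for the example.

```python
def convolve2d(image, kernel):
    """Slide a small filter over the image and record its response at
    every position where it fits entirely inside the image.

    (Like most CNN libraries, this does not flip the kernel, so strictly
    speaking it is a cross-correlation.)
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[y + i][x + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for x in range(out_w)
        ]
        for y in range(out_h)
    ]


def max_pool(feature_map, size=2):
    """Sub-sample by keeping only the strongest response in each
    non-overlapping size x size block."""
    return [
        [
            max(feature_map[y + i][x + j]
                for i in range(size) for j in range(size))
            for x in range(0, len(feature_map[0]) - size + 1, size)
        ]
        for y in range(0, len(feature_map) - size + 1, size)
    ]


# Hypothetical 4x5 image: dark on the left, bright on the right.
image = [[0, 0, 9, 9, 9] for _ in range(4)]
edge_filter = [[-1, 1]]  # responds where brightness jumps left-to-right

responses = convolve2d(image, edge_filter)
print(responses[0])         # [0, 9, 0, 0]
print(max_pool(responses))  # [[9, 0], [9, 0]]
```

The filter only ever looks at a two-pixel patch, yet after pooling the output still says “there is an edge on the left half.” Stack enough of these filter-then-shrink layers and the network gets to see everything before it decides anything, which is what the koala needs.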

CNNs are part of the core process that has allowed Tesla’s cameras to navigate the roads, liberating the driverless experience. And CNNs are the foundation of Yitu’s camera solutions for the Chinese government, securing the surveillance state.

Some Value for Photographers

The idea of using machine learning to aggressively de-noise imagery is groundbreaking for the visual arts and other imaging industries, and the researchers who worked on this project are speculating about how it will benefit fields adjacent to their own tribe. So, they expect this to help scientific endeavors like astrophotography and magnetic resonance imaging. But they are either shortsighted or don’t want to discuss the real chess pieces on the board. There are many looming implications and unanswered questions.

Is the technology that NVIDIA pioneered something that they are patenting with the option to license to, let’s say, Adobe? Are companies like Adobe going to offer their own slightly modified version of the same de-noising process? Will we see this tech in camera bodies in the next 5 to 10 years? Or in cellphones? Sony, for example, is already effectively using artificial intelligence in autofocus modes to track eye movement. Is this the next big advantage that software offers photographers? Will crimes be easier for police to solve when security cameras can de-noise to reveal detailed likenesses of perpetrators? Will this technology make driverless cars safer at night? Will powerful de-noising solutions tempt governments to spy more closely on their people? So many questions.

This news from NVIDIA is one of those big-little accomplishments where we can actually see how the imaging industry is part of some of the central features of human existence. It’s another breakthrough in the application of machine learning techniques, and the prequel to the rise of the box-headed, sloppy robots that look just like us.