您的当前位置：首页正文

To Appear in the IEEE Transactions on Pattern Analysis and Machine Intelligence Limits on S

来源：画鸵萌宠网

ToAppearintheIEEETransactionsonPatternAnalysisandMachineIntelligence

LimitsonSuper-ResolutionandHowtoBreakThem

SimonBakerandTakeoKanadeTheRoboticsInstituteCarnegieMellonUniversityPittsburgh,PA15213

Abstract

Nearlyallsuper-resolutionalgorithmsarebasedonthefundamentalconstraintsthatthesuper-resolutionimageshouldgeneratethelowresolutioninputimageswhenap-propriatelywarpedanddown-sampledtomodeltheimageformationprocess.(Thesereconstructionconstraintsarenormallycombinedwithsomeformofsmoothnesspriortoregularizetheirsolution.)Intheﬁrstpartofthispaper,wederiveasequenceofan-alyticalresultswhichshowthatthereconstructionconstraintsprovidelessandlessusefulinformationasthemagniﬁcationfactorincreases.Wealsovalidatethesere-sultsempiricallyandshowthatforlargeenoughmagniﬁcationfactorsanysmoothnesspriorleadstooverlysmoothresultswithverylittlehigh-frequencycontent(howevermanylowresolutioninputimagesareused.)Inthesecondpartofthispaper,wepro-poseasuper-resolutionalgorithmthatusesadifferentkindofconstraint,inadditiontothereconstructionconstraints.Thealgorithmattemptstorecognizelocalfeaturesinthelowresolutionimagesandthenenhancestheirresolutioninanappropriateman-ner.Wecallsuchasuper-resolutionalgorithmahallucinationorrecogstructionalgo-rithm.Wetriedourhallucinationalgorithmontwodifferentdatasets,frontalimagesoffacesandprintedRomantext.Weobtainedsigniﬁcantlybetterresultsthanexistingreconstruction-basedalgorithms,bothqualitativelyandintermsofRMSpixelerror.Keywords:Super-resolution,analysisofreconstructionconstraints,learning,faces,text,hallucination,recogstruction.

1Introduction

Super-resolutionistheprocessofcombiningmultiplelowresolutionimagestoformahigherresolutionone.Numeroussuper-resolutionalgorithmshavebeenproposedintheliterature[39,32,51,33,29,31,53,30,34,37,13,9,35,47,49,7,38,26,21,15,23,18]datingbacktothefrequencydomainapproachofHuangandTsai[28].Usuallyitisassumedthatthereissome(small)relativemotionbetweenthecameraandthescene,howevermotionlesssuper-resolutionisindeedpossibleifotherimagingparameters(suchastheamountofdefocusblur)varyinstead[21].Ifthereisrelativemotionbetweenthecameraandthescene,thentheﬁrststeptosuper-resolutionistoregisteroraligntheimages;i.e.computethemotionofpixelsfromoneimagetotheothers.Themotionﬁeldsaretypicallyassumedtotakeasimpleparametricform,suchasatranslationoraprojectivewarp[8],butinsteadcouldbedenseopticalﬂowﬁelds[20,2].Weassumethatimageregistrationhasalreadybeenperformed,andconcentrateonthesecondhalfofsuper-resolutionwhichconsistsoffusingthemultiple(aligned)lowresolutionimagesintoahigherresolutionone.Thesecond,fusionstepisusuallybasedontheconstraintsthatthesuper-resolutionimage,whenappropriatelywarpedanddown-sampledtotakeintoaccountthealignmentandtomodeltheimageformationprocess,shouldyieldthelowresolutioninputimages.ThesereconstructionconstraintshavebeenusedbynumerousauthorssinceﬁrststudiedbyPelegetal.[39,29].TheconstraintscaneasilybeembeddedinaBayesianframeworkincorporatingaprioronthehighresolutionimage[47,26,21]1.TheirsolutioncanbeestimatedeitherinbatchmodeorrecursivelyusingaKalmanﬁlter[22,18].Severalreﬁnementshavebeenproposed,includingsimultaneouslycomputingstructure[13,49,50]andremovingotherdegradingeffectssuchasmotionblur[7].Inpracticetheresultsobtainedusingthesereconstruction-basedalgorithmsaremixed.Whilethesuper-resolutionimagesaregenerallyanimprovementovertheinputs,thehighfrequencycomponentsoftheimagesaregenerallynotreconstructedverywell.Toillustratethispoint,weconductedanexperiment,theresultsofwhichareincludedinFigure1.Wetookahighresolutionimageofaface(showninthetopleftoftheﬁgure)andsyntheticallytranslateditbyrandomsub-pixelamounts,blurreditwithaGaussian,andthendown-sampledit.Werepeatedthisprocedureforseveraldifferent(linear)down-samplingfactors;2,4,8,and16.Ineachcase,wegeneratedmultipledown-sampledimages,eachwithadifferentrandomtranslation.Wegeneratedenough

ﬁcation.Inthesecondresultweshowthateventhoughtheconstraintsaregenerallyinvertibleforotherpointspreadfunctions,theconditionnumberalsoalwaysgrowsatleastasfastasaquadratic.(ThissecondresultisprovedforalargeclassofpointspreadfunctionsincludingallpointspreadfunctionswhichreasonablymodelCCDsensors.)Thissecondresult,however,doesnotentirelyexplaintheresultsshowninFigure1.Itisfrequentlypossibletoinvertanill-conditionedproblembysimplyimposingasmoothnessprioronthesolution.Inourthirdresultweusethefactthatthepixelsintheinputimagestakevaluesinaﬁniteset(typicallyintegersintherange0–255)toshowthatthevolumeofsolutionstothediscretizedreconstructionconstraintsgrowsatanextremelyfastrate.This,then,istheunderlyingreasonfortheresultsinFigure1.Forlargemagniﬁcationfactors,thereareahugenumberofsolutionstothereconstructionconstraints,includingnumerousverysmoothsolutions.Thesmoothnesspriorthatistypicallyaddedtoresolvetheambiguityinthelargesolutionspacesimplyensuresthatitisoneoftheoverlysmoothsolutionsthatischosen.(Strictly,theﬁnalsolutiontotheoverallproblemisonlyanapproximatesolutionofthereconstruc-tionconstraintssincebothsetsofconstraintsareaddedasleastsquaresconstraints.)

How,then,canhigh-magniﬁcationsuper-resolutionbeperformed?Ouranalyticalresultsholdforanarbitrarynumberofimages.Usingmorelowresolutionimagesthereforedoesnothelp,atleastbeyondapoint(whichinpracticeisdeterminedbyawiderangeoffactors.Seethediscussionattheendofthepaperformoredetails.)Theadditional(8-bit,say)inputimagessimplydonotprovideanymoreinformationbecausetheinformationwaslostwhentheywerequantizedto8-bits.Suppose,however,thattheinputimagescontaintext.Moreover,supposethatitispossibletoperformopticalcharacterrecognition(OCR)andrecognizethetext.Ifthefontcanalsobedetermined,itwouldthenbeeasytobeperformsuper-resolution.Thetextcouldbereproducedatanyresolutionbysimplyrenderingitfromtherecognizedtextandthedeﬁnitionofthefont.Inthesecondhalfofthispaper,wedescribeasuper-resolutionalgorithmbasedonthisprinciple,whichwecallrecognition-basedsuper-resolutionorhallucination[1,3].Moregenerally,weproposethetermrecogstructionforrecognition-basedreconstructiontechniques.Ourhallucinationalgorithm,however,isbasedontherecognitionofgenericlocal“features”(ratherthanthecharactersdetectedbyOCR)sothatitcanbeappliedtootherphenomena.Therecognizedlocalfeaturesareusedtopredictarecognition-basedpriorwhichreplacesthesmoothnesspriorsusedinexistingalgorithmssuchas[13,47,26,21].Wetrainedourhallucinationalgorithmseparatelyonbothfrontalimagesoffacesandcomputergeneratedtext.Weobtainedsigniﬁcantlybetterresultsthantraditionalreconstruction-basedsuper-resolution,bothvisuallyandintermsofRMSpixelintensityerror.Ouralgorithmiscloselyrelatedtotheindependentworkof[25]inwhichalearningframework

forlow-levelvisionwasproposed,oneapplicationofwhichisimageinterpolation.Intheirpaper,Freemanetal.learnaprioronthehigherresolutionimageusingabeliefpropagationnetwork.Ouralgorithmhastheadvantageofbeingapplicabletoanarbitrarynumberofimages.Ouralgorithmisalsocloselyrelatedto[19]inwhichtheparametersofan“active-appearance”modelareusedforsuper-resolution.Thisalgorithmcanalsobeinterpretedashavingastrong,learnedfaceprior.Theremainderofthispaperisorganizedasfollows.WebegininSection2byderivingthesuper-resolutionreconstructionconstraints,beforeanalyzingtheminSection3.Wepresentourhallucinationalgorithm(withresults)inSection4.WeendwithadiscussioninSection5.

2TheSuper-ResolutionReconstructionConstraints

Denotethelowresolutioninputimagesbyin

where

and

isavector

containingthe(columnandrow)pixelcoordinates.Thestaringpointinthederivationofthe

reconstructionconstraintsisthenthecontinuousimageformationequation(seeFigure2)[27,36]:

(1)

where

isthecontinuousirradiancelight-ﬁeldthatwouldhavereachedtheimageplaneof

underthepinholemodel,

isthepointspreadfunctionofthecamera,and

arecoordinatesintheimageplaneof(overwhichtheintegrationisperformed.)AllthatEqua-tion(1)saysisthatthepixelintensity

istheresultofconvolvingtheirradiancefunction

withthepoint-spreadfunctionofthecamera

andthensamplingitatthediscrete

pixellocations

.(Inamoregeneralformulation,

mayalsobeafunction

ofbothtimeandwavelength.Equation(1)wouldthenalsocontainintegrationsoverthesetwo

variablesaswell.Wedonotmodeltheseeffectsbecausetheydonotaffectthespatialanalysis.)

2.1ModelingthePointSpreadFunction

Wedecomposethepointspreadfunctionofthecameraintotwocomponents(seeFigure2):

(2)

where

modelstheblurringcausedbytheopticsand

modelsthespatialintegration

performedbytheCCDsensor[5].Theopticalblurring

istypicallyfurthersplitintoadefocus

factorthatcanbeapproximatedbyapill-boxfunctionandadiffraction-limitedopticaltransfer

EiLens System

Optical Effects: ωi

Finite ApertureEi*ωiCCD Sensor

SiSiSpatial Integration:ai

Loi(m,n) =Ei*ωi*ai(m,n)= Ei*PSFi(m,n)

Figure2:Ourimageformationmodel.Weassumethatthelowresolutioninputimages

areformed

bytheconvolutionoftheirradiancewiththecamerapointspreadfunction.Wemodelthepointspreadfunctionitselfastheconvolutionoftwoterms:(1)modelstheopticaleffectscausedbythelensandtheﬁniteaperture,and(2)modelsthespatialintegrationperformedbytheCCDsensor.

functionthatcanbemodeledbythesquareoftheﬁrst-orderBesselfunctionoftheﬁrstkind[10].Weaimtobeasgeneralaspossibleandsoavoidmakinganyassumptionsabout(mostof)ouranalysisisperformedforarbitraryfunctionsparametricformforareais

.Instead,

.Wedo,however,assumea

.Weassumethatthethephoto-sensitiveareasoftheCCDpixelsaresquare

anduniformlysensitivetolight,asin[5,6].Ifthelengthofthesideofthesquarephotosensitive

,thespatialintegrationfunctionisthen:

rameters.Inpractice,asimpleparametricformisassumedfor

,moreoftenthannot,thatit

isGaussian.Theparameter(sigma)isthenestimatedempirically.Sincethepointspreadfunctiondescribes“theimageofanisolatedpointobjectlocatedonauniformlyblackbackground”[36],theparameter(s)canbeestimatedfromtheimageofalightplacedalargedistancefromthecamera.)

2.2WhatisSuper-ResolutionAnyway?

Wewishtoestimateasuper-resolutionimageframeof

where

arepixelcoordinates.

.Thecoordinate

Preciselywhatdoesthismean?Letusbeginwiththecoordinateframeof

istypicallydeﬁnedbythatofoneofthelowresolutioninputimages,say

Ifthelinearmagniﬁcationofthesuper-resolutionprocessisclosertoeachotherthanthoseintermsofthatfor

,thepixelsin

willbe

times

.Thecoordinateframeofcanthereforebedeﬁnedin

usingtheequation:

(5)

where

(6)

Giventhisequationwedistinguishtwoprocesses:Deblurringisestimatingarepresentationof

(thatisasopposedtoestimating

);

i.e.deblurringisremovingtheeffectsoftheconvolutionwiththepointspreadfunction

.Deblurringisindependentofwhethertherepresentationof

isonadensergrid

thanthatoftheinputimages.Theresolutionmayormaynotchangeduringdeblurring.ResolutionEnhancementconsistsofestimatingeitheroftheirradiancefunctions(

)onadensergridthanthatoftheinputimage(s).Forexample,enhancingtheresolution

bythelinearmagniﬁcationfactorconsistsofestimatingtheirradiancefunctiononthe

griddeﬁnedbyEquation(4).Ifthenumberofinputimagesis1,resolutionenhancementisknownasinterpolation.Ifthereismorethanoneinputimage,resolutionenhancementisknownassuper-resolution.Resolutionisthereforesynonymouswithpixelgriddensity.Inthispaperwestudythemostgeneralcase;i.e.thecombinationofsuper-resolutionanddeblur-ring.Weestimate

,arepresentationof

onthegriddeﬁnedbyEquation(4).

2.3RepresentingContinuousImages

Inordertoproceedweneedtospecifywhichcontinuousfunctiondiscreteimage

isrepresentedbythe

.Thesimplestcaseisthat

representsthepiecewiseconstantfunction:

(7)

forall

andwherearethecoordinatesof

apixelin

.Then,Equation(5)canberearrangedtogive:

(9),in

for

;i.e.asetoflinearconstraintsontheunknownsuper-resolutionpixels

termsoftheknownlowresolutionpixelsboththepointspreadfunction

.Theconstantcoefﬁcients

dependon

andontheregistration

.(Similarderivationscanbe

performedforotherrepresentationsof

,suchaspiecewiselinearorquadraticones[14].)

3AnalysisoftheReconstructionConstraints

Wenowanalyzethesuper-resolutionreconstructionconstraintsdeﬁnedbyEquation(9).Ascanbeseen,theequationsdependupontwoimagingproperties:(1)thepointspreadfunctionand(2)theregistration

.Withoutsomeassumptionsaboutthesefunctionsanyanalysiswould

isarbitrary,itcanbechosen(ineffect)to

bemeaningless.Ifthepointspreadfunctionisarbitrary,itcanbechosentosimulatethe“smallpixels”ofthesuper-resolutionimage.Similarlyif

movethecameratowardsthesceneandtherebydirectlycapturethesuper-resolutionimage.Wethereforehavetomakesome(reasonable)assumptionsabouttheimagingconditions.AssumptionsMadeAboutthePointSpreadFunction

Weassumethatthepointspreadfunctionisthesameforalloftheimages

andtakestheform:

(11)

where

isaconstant(whichisdifferentforeachlowresolutionimage),and

isthelinearmagniﬁcationofthesuper-resolutionproblem.

,theexactvaluesof

Evengiventheseassumptions,theperformanceofanysuper-resolutionalgorithmwilldependuponthenumberofinputimages

,and,moreover,howwellthealgorithm

increases.Wethereforeas-

canregisterthelowresolutionimagestoestimatethe.Ourgoalistoshowthatsuper-resolution

becomesfundamentallymoredifﬁcultasthelinearmagniﬁcationberofinputimages

,witharbitraryvaluesof

sumethattheconditionsareasfavorableaspossibleandperformtheanalysisforanarbitrarynum-.Moreover,weassumethatthealgorithmhas

estimatedthesevaluesperfectly.Anyresultsderivedundertheseconditionswillonlybestrongerinpractice,wherethe

mighttakedegeneratevalues,ormightbeestimatedinaccurately.

p+(0.5,0.5)Area A1D ProofM(m-ci+ S (0.5,0.5))p-(0.5,0.5)M(m-ci- S (0.5,0.5))

Figure3:Thepixel

overwhichtheintegrationisperformedinEquation(12)isindicatedbythesmall

squareatthetopleftoftheﬁgure.Thelargersquareonthebottomrightistheregioninwhichisnon-zero.Sincetakesthevalueinthisregion,theintegralinEquation(12)equals,whereistheareaoftheintersectionofthetwosquares.ThisﬁgureisusedtoillustratetheproofofTheorem1.

3.1InvertibilityAnalysisforSquarePointSpreadFunctions

Weanalyzethereconstructionconstraintsinthreedifferentways.Theﬁrstanalysisisconcernedwithwhentheconstraintsareinvertible,andwhattherankofthenullspaceiswhentheyarenotinvertible.Inordertogetaneasilyinterpretableresult,theanalysisinthissectionisperformedunderthesimpliﬁedscenariothattheopticalblurringcanbeignoredandsoanalysisisforarbitraryopticalblurringmodels

,the

Diracdeltafunction.Thisassumptionwillberemovedinthefollowingtwosections,wherethe

.Assumingasquarepointspreadfunction

(andthattheregistration

isatranslation)Equation(9)simpliﬁesto:

(12).

wheretheintegrationisperformedoverthepixel;i.e.over

Usingthedeﬁnitionof

itiseasytoseethat

isequalto

timesthearea

oftheintersectionofthetwosquaresinFigure3.Wethenhave:Theorem1If

isanintegergreaterthan1,thenforallchoicesof

thesetofEquations(12)

isnotinvertible.Moreover,theminimumachievabledimensionofthenullspaceis

.If

isnotaninteger,

’scanalwaysbechosensothatthesetofEquations(12)areinvertible.

Proof:Weprovideaprooffor1Dimages.(SeeFigure3.)Theextensionto2Disstraight-forward.

ThenullspaceofEquation(12)isdeﬁnedbytheconstraints

where

istheareaofintersectionofthetwosquaresinFigure3.For1Dwejustconsideronerow

oftheﬁgure.Anyelementofthenullspacethereforecorrespondstoanassignmentofvaluestothesmallsquaresinawaythattheirweightedsum(overthelargesquare)equalszero,wheretheweightsaretheareasofintersectionwiththelargesquare.(Tobeabletoconductthisargumentforeverypixelinthesuper-resolutionimage,weneedtoassumethatthenumberofpixelsineveryrowandeverycolumnofthesuper-resolutionimageisgreaterthanpixels.ThisfollowsfromthefactthatChanging

.Thisisaminor

assumptionsinceitcorrespondstoassumingthatthelowresolutionimagesarebiggerthan

isphysicallyconstrainedtobelessthan1.)

toslidethelargesquarealongtherowbyasmallamount,wegetasimilarcon-

straintontheelementsinthenullspace.Theonlydifferenceisintheleft-mostandright-mostsquares.Subtractingthesetwoconstraintsshowsthattheleft-mostsquareandtheright-mostsquaremusthavethesamevalue.Thismeansthatand

mustequalboth

,iftheassignmentistolieinthenullspace.

isnotaninteger(oris1),thisprovesthatneighboringvaluesof

mustbeequal

andhence.Therefore,valuescanalwaysbechosenforthe

sothatthenullspaceonlycontains

thezerovector;i.e.thelinearsystemis,ingeneral,invertible.(Theequivalenceofthenullspacebeingnon-trivialandthelinearsystembeingnotinvertiblerequirestheassumptionthatthenumberofpixelsinthesuper-resolutionimageisﬁnite;i.e.thesuper-resolutionimageisbounded.)If

isaninteger,thisconstraintplacesanupperboundof

onthedimensionthatareperiodic.Allofthese

ofthenullspace(sincethenullspaceiscontainedinthesetassignmentstowithperiod

.)Thisvaluecanalsobeshowntobealowerboundonthedimensionofthe

nullspacebythespaceofperiodassignmentsforwhich

assignmentscaneasilybeseentolieinthenullspace(foranychoiceofthetranslationsthetwocases

and

,where

Tovalidatethistheorem,wesolvedthereconstructionconstraintsusinggradientdescentfor

.TheresultsarepresentedinFigure4.

Inthisexperiment,nosmoothnesspriorisusedandgradientdecentisrunforasufﬁcientlylongtimethatthestartingimage(whichissmooth)doesnotbiastheresults.Theinputinbothcasesconsistedofmultipledown-sampledimagessimilartotheoneatthetopofthesecondcolumninFigure1.Speciﬁcally,

randomlytranslatedimageswereusedasinput.Exactlythesame

isthereforeactuallysmallerthanthatfor

inputsareusedforthetwoexperiments.Theonlydifferenceisthemagniﬁcationfactorinthesuper-resolutionalgorithm.Theoutputfor

(andwasenlargedtothesamesizeinFigure4fordisplaypurposesonly.)

(a)

(b)

(c)

,withPrior

Figure4:ValidationofTheorem1:Theresultsofsolvingthereconstructionconstraintsusinggradient

descentforasquarepointspreadfunctionwith.(a)Whenisaninteger,theequationsarenotinvertibleandsoarandomperiodicimageinthenullspaceisaddedtotheoriginalimage.(b)Whenisnotaninteger,thereconstructionconstraintsareinvertibleandsoasmoothsolutionisfound,evenwithoutaprior.(Theresultforhasbeeninterpolatedtomakeitthesamesizeasthatfor.)(c)Whenasmoothnesspriorisaddedtothereconstructionconstraintsthedifﬁcultiesseenin(a)disappear.

AscanbeseeninFigure4,forwithperiod

the(additive)errorisapproximatelyaperiodicimage

thefactthattheproblemisnotinvertible

the

pixels.For

theequationsareinvertibleandsoasmoothsolutionisfound,

eventhoughnosmoothnesspriorwasused.For

doesnothaveanypracticalsigniﬁcance.Adequatesolutionscanbeobtainedbysimplyaddingasmoothnesspriortothereconstructionconstraints,asshowninFigure4(c).For

situationisdifferent,however.Aswillbeshowninthethirdpartofouranalysis,itistherapidrateofincreaseofthedimensionofnullspacethatistherootcauseoftheproblemsforlarge

3.2ConditioningAnalysisforArbitraryPointSpreadFunctions

Anylinearsystemthatisclosetobeingnotinvertibleisusuallyill-conditioned.Itisnosurprisethenthatchangingfromasquarepointspreadfunctiontoanarbitraryfunction

resultsinanill-conditionedsystem,aswenowshowinthesecondpartofouranalysis:Theorem2Suppose

isanyfunctionforwhich

forall

and

Then,theconditionnumberofthefollowinglinearsystemgrowsatleastasfastas

(13)

where

Proof:Weﬁrstprovethetheoremforthesquarepointspreadfunctionandthengeneralize.Theconditionnumberofan

(i.e.forEquation12)

matrix

isdeﬁned[42]as:

(15)

where

istheL2norm.(ThisresultfollowsimmediatelyfromtheSVD

.The

matrices

anddonotaffecttheL2normofavectorsincetheircolumnsareorthonormal.

Equation(15)clearlyholdsfor.)Itfollowsimmediatelythatif

and

areanytwovectorsthen:

andset

Thistheoremismoregeneralthanthepreviousonebecauseitappliesto(essentially)arbi-trarypointspreadfunctions.Ontheotherhand,itisaweakerresult(insomesituations)becauseitonlypredictsthatsuper-resolutionisill-conditioned(ratherthannotinvertible.)Thistheoremonitsown,therefore,doesnotentirelyexplainthepoorperformanceofsuper-resolution.AsweshowedinFigure4,problemsthatareill-conditioned(orevennotinvertible,wheretheconditionnumberisinﬁnite)canoftenbesolvedbysimplyaddingasmoothnessprior.Thenotinvertiblesuper-resolutionprobleminFigure4(a)issolvedinFigure4(c)inthisway.Severalresearchershaveperformedconditioninganalysisofvariousformsofsuper-resolution,including[21,48,43].Althoughuseful,noneoftheseresultsfullyexplainthedrop-offinperformancewiththemag-niﬁcation

.Theweaknessofconditioninganalysisisthatanill-conditionedsystemmaybe

ill-conditionedbecauseofasingle“almostsingularvalue.”AsindicatedbytherapidgrowthinthedimensionofthenullspaceinTheorem1,super-resolutionhasalargenumberof“almostsingularvalues”forlargemagniﬁcations.ThisistherealcauseofthedifﬁcultiesseeninFigure1.Onewaytoshowthisistoderivethevolumeofsolutions,aswenowdointhethirdpartofouranalysis.

3.3VolumeofSolutionsforArbitraryPointSpreadFunctions

Ifwecouldworkwithnoiseless,real-valuedquantitiesandperformarbitraryprecisionarithmeticthenthefactthatthereconstructionconstraintsareill-conditionedmightnotbeaproblem.Inreality,however,imagesarealwaysintensitydiscretized(typicallyto8-bitvaluesintherange0–255greylevels.)Therewillthereforealwaysbenoiseinthemeasurements,evenifitisonlyplus-or-minushalfagrey-level.Supposethat

denotestheoperatorwhichtakesareal-valued

irradiancemeasurementandturnsitintoaninteger-valuedintensity.Ifweincorporatethisquanti-zationintoourimageformationmodelthenEquation(17)becomes:

(18)

Supposethat

isaﬁxedsizeimage2with

pixels.Wethenhave:

Theorem3Ifleastasfastas

isthestandardroundingoperatorwhichreplacesarealnumberwiththe

(treating

asaconstantand

and

asvariables.)

nearestinteger,thenthevolumeofthesetofsolutionsofEquation(18)growsasymptoticallyat

Proof:Firstnotethatthespaceofsolutionsisconvexsinceintegrationisalinearoperation.NextnotethatonesolutionofEquation(18)isthesolutionto:

(19)

Thedeﬁnitionofthepointspreadfunctionasgive

andthepropertiesoftheconvolution

.Therefore,adding

toanypixelinisstillasolutionsincethe

righthandsideofEquation(19)increasesbyatmost1.(Theintegrandisincreasedbylessthan

grey-levelinthepixel,whichonlyhasanareaofunit.)ThevolumeofsolutionsofEquation(18)thereforecontainsan-dimensionalsimplex,wheretheanglesatonevertexareallright-angles,

andthesidesareall

unitslong.Thevolumeofsuchasimplexgrowsasymptoticallylike

(treating

asaconstantandand

asvariables).Thedesiredresultfollows.

Thisthirdandﬁnaltheoremprovidesthebestexplanationofthesuper-resolutionresultspre-sentedinFigure1.Forlargemagniﬁcationfactors

,thereisahugevolumeofsolutionstothe

discretizedreconstructionconstraintsinEquation(18).Thesmoothnesspriorwhichisaddedtoresolvethisambiguitysimplyensuresthatitisoneoftheoverlysmoothsolutionsthatischosen.(Ofcourse,withouttheprioranysolutionmightbechosenwhichwouldgenerallybeevenworse.Asmentionedintheintroduction,theﬁnalsolutionisreallyonlyanapproximatesolutionofthereconstructionconstraintssincebothsetsofconstraintsareaddedasleastsquaresconstraints.)InFigure5wepresentquantitativeresultstoillustrateTheorems2and3.Weagainusedthereconstruction-basedalgorithm[26].Weveriﬁedourimplementationintwoways:(1)wecheckedthatforsmallmagniﬁcationfactorsandnopriorourimplementationdoesyield(essentially)perfectreconstructions,and(2)formagniﬁcationsof4wecheckedthatournumericalresultsareconsistentwiththosein[26].Wealsotriedtherelatedalgorithmof[47]andobtainedverysimilarresults.UsingthesameinputsasFigure1,weplotthereconstructionerroragainstthemagniﬁcation;i.e.thedifferencebetweenthereconstructedhighresolutionimageandtheoriginal.Wecomparethiserrorwiththeresidualerror;i.e.thedifferencebetweenthelowresolutioninputsandtheirpre-dictionsfromthereconstructedhighresolutionimage.Asexpectedforanill-conditionedsystem,thereconstructionerrorismuchhigherthantheresidual.Wealsocomparewitharoughpredic-tionofthereconstructionerrorobtainedbymultiplyingthelowerboundontheconditionnumber

100RMS Intensity Difference in Grey-Levels80

Reconstruction ErrorResidual Error

Single Image Interpolation Error

Predicted Error

Smoothness Prior40

Super-Res.20

246

810Linear Magnification

121416

Figure5:AnillustrationofTheorems2and3usingthesameinputsasinFigure1.Thereconstructionerror

ismuchhigherthantheresidual,aswouldbeexpectedforanill-conditionedsystem.Forlowmagniﬁcations,thepriorisunnecessaryandsotheresultsareworsethanpredicted.Forhighmagniﬁcations,thepriordoeshelp,butatthepriceofoverlysmoothresults.(SeeFigure1.)Aroughestimateoftheamountofinformationprovidedbythereconstructionconstraintsisgivenbytheimprovementofthereconstructionerroroverthesingleimageinterpolationerror.Similarly,theimprovementfromthepredictederrortothereconstructionerrorisanestimateoftheamountofinformationprovidedbythesmoothnessprior.Bythismeasure,thesmoothnesspriorprovidesmoreinformationthanthereconstructionconstraintsforamagniﬁcationof.

(

)byanestimateoftheexpectedresidualassumingthatthegrey-levelsarediscretizedfroma

uniformdistribution.Forlowmagniﬁcationfactors,thisestimateisanunder-estimatebecausethepriorisunnecessaryfornoisefreedata;i.e.betterresultswouldbeobtainedwithouttheprior.Forhighmagniﬁcationsthepredictionisanover-estimatebecausethelocalsmoothnessassumptiondoeshelpthereconstruction(albeitattheexpenseofoverlysmoothresults.)

WealsoplotinterpolationresultsinFigure5;i.e.justusingthereconstructionconstraintsforoneimage(aswasproposed,forexample,in[46].)Thedifferencebetweenthiscurveandthereconstructionerrorcurveisameasureofhowmuchinformationthereconstructionconstraintsprovide.Similarly,thedifferencebetweenthepredictederrorandthereconstructionerrorisameasureofhowmuchinformationthesmoothnesspriorprovides.ForamagniﬁcationofThis,then,isanalternativeinterpretationofwhytheresultsinFigure1aresosmooth.

,we

seethatthepriorprovidesmoreinformationthanthesuper-resolutionreconstructionconstraints.

4Recognition-BasedSuper-ResolutionorHallucination

Howthenisitpossibletoperformhighmagniﬁcationsuper-resolutionwithouttheresultslookingoverlysmooth?Aswehavejustshown,therequiredhigh-frequencyinformationwaslostfromthereconstructionconstraintswhentheinputimageswerediscretizedto8-bitvalues.Genericsmoothnesspriorsmayhelpregularizetheproblem,butcannotreplacethemissinginformation.Asoutlinedintheintroduction,ourgoalinthissectionistodevelopasuper-resolutionalgo-rithmthatusestheinformationcontainedinacollectionofrecognitiondecisions(inadditiontothereconstructionconstraints.)Ourapproachistoembedtheresultsoftherecognitiondecisionsinarecognition-basedprioronthesolutionofthereconstructionconstraints,therebyresolvingtheinherentambiguityintheirsolution(seeSection3.3).

4.1BayesianMAPFormulationofSuper-Resolution

Webeginwiththe(standard)Bayesianformulationofsuper-resolution[13,47,26,21].Inthisap-proach,super-resolutionisposedasﬁndingthemaximumaposteriori(orMAP)super-resolutionimage

:i.e.estimating

.Bayeslawforthisestimationproblemis:

4.2Recognition-BasedPriorsforSuper-Resolution

Thesecondtermontheright-handsideofEquation(21)is(thenegativelogarithmof)theprior

.Usuallythisprioronthesuper-resolutionimageischosentobeasimplesmoothness

prior[13,47,26,21].Insteadwewouldliketochooseitsothatitdependsuponasetofrecognitiondecisions.Supposethattheoutputsoftherecognitiondecisionspartitionthesetofinputs(i.e.thelowresolutioninputimages

)intoasetofsubclasses

Wethendeﬁnea

recognition-basedpriorasonethatcanbewritteninthefollowingform:

(23).Oncethe

then.

Essentially,thereisaseparatepriorPrlowresolutioninputimages

foreachpossiblepartition

areavailable,thevariousrecognitionalgorithmscanbeapplied,

anditcanbedeterminedwhichpartitiontheinputsliein.Therecognition-basedpriorPrreducestothemorespeciﬁcpriorPrtheoverallpriorPr

.Thispriorcanbemademorepowerfulthan

becauseitcanbetailoredtothe(smaller)subsetoftheinputdomain

4.3Multi-ScaleDerivativeFeatures:TheParentStructure

Wedecidedtotrytorecognizegenericlocalimagefeatures(ratherthanhigherlevelconceptssuchashumanfacesorASCIIcharacters)becausewewanttoapplyouralgorithmtoavarietyofphenomena.Motivatedby[16,17]wealsodecidedtousemulti-scalefeatures.Inparticular,givenanimage,weﬁrstformitsGaussianpyramid

[11].Afterwards,wealso

formitsLaplacianpyramid

[12],thehorizontal

andvertical

ﬁrstderivativesoftheGaussianpyramid,andthehorizontal

andvertical

secondderivativesoftheGaussianpyramid[1].(SeeFigure6for

examplesofthesepyramidsforanimageofaface.)Finally,weformapyramidoffeatures:

(24)

Thepyramid

isapyramidwherethereare5valuesstoredateachpixel,the

Laplacianandthe4derivatives,ratherthanthesinglevaluetypicallystoredinmostpyramids.(ThechoiceofthefeaturesinEquation(24)isaninstanceofthe“featureselection”problem.Forexample,steerableﬁlters[24]couldbeusedinstead,orthesecondderivativescouldbedroppediftheyaretoonoisy.Wefoundtheperformanceofouralgorithmstobelargelyindependentofthechoiceoffeatures.Theselectionoftheoptimalfeaturesisoutsidethescopeofthispaper.)

Parent Structure vectorPS 2G 4G 3G 2G 1G 0

L 0

L 4L 3L 2L 1

H 0

H 4H 3H 2H 1

V4V3V2V1

(a)GaussianPyramid(b)LaplacianPyramid(c)FirstDerivativePyramids

Figure6:TheGaussian,Laplacian,andﬁrstderivativepyramidsofanimageofaface.(Wealsousetwo

secondderivativesbutomitthemfromtheﬁgure.)Wecombinethesepyramidsintoasinglemulti-valuedpyramid,wherewestoreavectoroftheLaplacianandthederivativesateachpixel.TheParentStructurevectorofapixelinthelevelofthepyramidconsistsofthevectorofvaluesforthatpixel,thevectorforitsparentinthelevel,thevectorforitsparent’sparent,etc[16].TheParentStructurevectoristhereforeahigh-dimensionalvectorofderivativescomputedatvariousscales.Inouralgorithms,recognitionmeansﬁndingthetrainingsamplewiththemostsimilarParentStructurevector.

Then,givenapixelinthelowresolutionimagethatweareperformingsuper-resolutionon,wewanttoﬁnd(i.e.recognize)apixelinacollectionoftrainingdatathatislocally“similar.”Bysimilar,wemeanthatboththeLaplacianandtheimagederivativesareapproximatelythesame,atallscales.Tocapturethisnotion,wedeﬁnetheParentStructurevector[16]ofapixel

inthe

levelofthefeaturepyramid

tobe:

(25)

AsillustratedinFigure6,theParentStructurevectoratanyparticularpixelinthepyramidconsistsofthefeaturevectoratthatpixel,thefeaturevectoroftheparentofthatpixel,thefeaturevectorofitsparent,andsoon.Exactlyasin[16],ournotionoftwopixelsbeingsimilaristhenthattheirParentStructurevectorsareapproximatelythesame(asmeasuredbysomenorm.)

4.4RecognitionasFindingtheNearest-NeighborParentStructure

Supposewehaveasetofhighresolutiontrainingimagespyramids

.Wecanthenformalloftheirfeature

.Alsosupposethatwearegivenalowresolutioninputimage

Finally,supposethatthisimageisataresolutionthatis

timessmallerthanthetraining

PS 2(T ) iPS 2( )Lo iF 4(T ) jF 3(T ) jF 2(T ) jF j 1(T )F 0(T ) jHigh-Resolution Training ImagesT j

Match

FLo i 4( )FLo i 3( )FLo i = Lo i 2( )Best ImageBest Pixel

BI i

Low-Resolution Input ImageLo i

BP i

Recognition Output

Figure7:Wecomputethefeaturepyramids

forthetrainingimagesandthefeaturepyramidforthelowresolutioninputimage.Foreachpixelinthelowresolutionimage,weﬁnd(i.e.recognize)theclosestmatchingParentStructureinthehighresolutiondata.WerecordandoutputthebestmatchingimageandthepixellocationofthebestmatchingParentStructure.Notethatthesedatastructuresarebothdeﬁnedindependentlyforeachpixelintheimages.

samples.(Theimagemayhavetobeinterpolatedtomakethisratioexactlyapowerof2.Sincetheinterpolatedimageisimmediatelydownsampledtocreatethepyramiditisonlythelowestlevelofthepyramidfeaturesthatareaffectedbythisinterpolation.Theoveralleffectontheprioristhereforeverysmall.)Wecanthencomputethefeaturepyramidfortheinputimagefromlevelandupwards

.Figure7showsanillustrationofthisscenariofor

Foreachpixel

intheinputindependently,wecompareitsParentStructurevector

againstallofthetrainingParentStructurevectorsatthesamelevel;i.e.we

compareagainst

forallandforall

.Thebestmatchingimage

andthebestmatchingpixel

arestoredastheoutputoftherecognitiondecision,

independentlyforeachpixelusedaweighted

.(Wefoundtheperformancetobelargelyindependent

ofthedistancefunctionusedtodeterminethebestmatchingParentStructurevector.Weactually

-norm,givingthederivativecomponentshalfasmuchweightastheLaplacian

valuesandreducingtheweightbyafactorofforeachincreaseinthepyramidlevel.)

RecognitioninourhallucinationalgorithmthereforemeansﬁndingtheclosestmatchingpixelinthetrainingdatainthesensethattheParentStructurevectorsofthethetwopixelsarethemostsimilar.Thissearchis,ingeneral,performedoverallpixelsinalloftheimagesinthetrainingdata.Ifwehavefrontalimagesoffaces,however,werestrictthissearchtoconsideronlythecorrespondingpixelsinthetrainingdata.Inthisway,wetreateachpixelintheinputimagedifferently,dependingonitsspatiallocation,similarlytothe“class-based”approachof[44].

4.5ARecognition-basedGradientPrior

Foreachpixelimage

intheinputimage

,wehaverecognizedthepixelthatisthemostsimilar

inthetrainingdata,speciﬁcally,thepixel

inthe

levelofthepyramidfortraining

.Theserecognitiondecisionspartitiontheinputs

intoacollectionofsubclasses,

asrequiredbytherecognition-basedpriordescribedinSection4.2.Ifwedenotethesubclassesby

(i.e.usingamulti-dimensionalindexratherthan)Equation(23)canberewrittenas:

wherePr

(26),giventhat

istheprobabilitythatthesuper-resolutionimageis

theinputimages

lieinthesubclassthatwillberecognizedtohave

astheclosestmatching

pixelinthetrainingimage(inthe

levelofthepyramid.)

WenowneedtodeﬁnePr

.Wedecidedtomakethisrecognition-based

priorafunctionofthegradientbecausethebase,oraverage,intensitiesinthesuper-resolutionimagearedeﬁnedbythereconstructionconstraints.Itisthehigh-frequencygradientinformationthatismissing.Speciﬁcally,wewanttodeﬁnePrEachlowresolutioninputimagedifferentpixelsinthe(andforeach

toencouragethegradient

ofthesuper-resolutionimagetobeclosetothegradientoftheclosestmatchingtrainingsamples.

hasa(different)closestmatching(ParentStructure)train-ofthemtobeprecise.SeealsoFigure7.)We

ingsampleforeachpixel.Moreover,eachsuchParentStructurecorrespondstoanumberof

levelofthepyramid,(

thereforeimposeaseparategradientconstraintforeachpixel

inthe

levelofthepyramid

.)Now,thebestmatchingpixel

isonlydeﬁnedonthe

levelofthepyra-

mid.Fornotationalconvenience,therefore,givenapixeldeﬁnethebestmatchingpixelonthe

onthe

levelofthepyramid,

levelofthepyramidtobe:

isapixelinthe

levelofthepyramidforimage

,thecorrespondingpixelinthe

super-resolutionimagederivativesof

.Wethereforewanttoimposetheconstraintthattheﬁrst

and

atthispointshouldequalthederivativesoftheclosestmatchingpixel(Parent

Structure)inthetrainingdata.Parametricexpressionsfor

can

easilybederivedaslinearfunctionsoftheunknownpixelsinthehighresolutionimage

.We

assumethattheerrorsinthegradientvaluesbetweentherecognizedtrainingsamplesandthesuper-resolutionimageareindependentlyandidenticallydistributed(acrossboththeimagesandthepixelsPr

)andmoreoverthattheyareGaussianwithcovariance

.Therefore:

(28)

Thisexpressionenforcestheconstraintsthatthegradientofthesuper-resolutionimageeachinputimage

.)Theseconstraintsarealsolinearintheunknownpixelsof

should

beequaltothegradientofthebestmatchingtrainingimage(separatelyforeachpixel

4.6AlgorithmPracticalities

Equations(21),(22),(26),and(28)formahigh-dimensionallinearleastsquaresproblem.TheconstraintsinEquation(22)arethestandardsuper-resolutionreconstructionconstraints.ThoseinEquation(28)aretherecognition-basedprior.Therelativeweightsoftheseconstraintsaredeﬁnedbythenoisecovariancesreliableonesandsosetsuper-resolutionimage

and

.Weassumethatthereconstructionconstraintsarethemore

(typically)tomakethemalmosthardconstraints.

Thenumberofunknownsinthelinearsystemisequaltothetotalnumberofpixelsinthe

.Directlyinvertingalinearsystemofsuchsizecanproveproblematic.

Wethereforeimplementedagradientdescentalgorithm(usingadiagonalapproximationtotheHessian[42]tosetthehowstepsizeinasimilarwayto[52].)Sincetheerrorfunctionisquadratic,thealgorithmconvergestothesingleglobalminimumwithoutanyproblem.

4.7ExperimentalResultsonHumanFaces

OurexperimentsforhumanfaceswereconductedwithasubsetoftheFERETdataset[40]con-sistingof596imagesof278individuals(92womenand186men).Eachpersonappearsbetween2and4times.Mostpeopleappeartwice,withtheimagestakenonthesamedayunderapproxi-matelythesameilluminationconditions,butwithdifferentfacialexpressions(oneimageisusuallyneutral,theothertypicallyasmile.)Asmallnumberofpeopleappear4times,withtheimagestakenovertwodifferentdays,separatedbyseveralmonths.

TheimagesintheFERETdatasetare

pixels,howevertheareaoftheimageoccupied

bythefacevariesconsiderably.Mostofthefacesarearoundpixelsorlarger.Intheclass-

basedapproach[44],theinputimages(whichareallfrontal)needtobealignedsothatwecanassumethatthesamepartofthefaceappearsinroughlythesamepartoftheimageeverytime.Thisallowsustoobtainthebestresults.Thisalignmentwasperformedbyhandmarkingthelocationof3points,thecentersofthetwoeyesandthelowertipofthenose.These3pointsdeﬁneanafﬁnewarp[8],whichwasusedtowarptheimagesintoacanonicalform.Thecanonicalimageis

pixelswiththerighteyeat.These

,thelefteyeat

,andthelowertipof

thenoseat

pixelimageswerethenusedasthetrainingsamples.(In

mostofourexperiments,wealsoadded8syntheticvariationsofeachimagetothetrainingsetbytranslatingtheimage8times,eachtimebyasmallamount.Thisstepenhancestheperformanceofouralgorithmslightly,althoughitisnotvitaltoobtaingoodperformance.)

Weuseda“leave-one-out”methodologytotestouralgorithm.Totestonanyparticularperson,weremovedalloccurrencesofthatindividualfromthetrainingset.Wethentrainedthealgorithmonthereducedtrainingset,andtestedontheimagesoftheindividualthathadbeenremoved.Becausethisprocessisquitetimeconsuming,weusedatestsetof100imagesof100differentindividualsratherthantheentiretrainingset.Thetestsetwasselectedatrandomfromthetrainingset.Aswillbeseen,thetestsetspansbothsexandracereasonablywell.4.7.1ComparisonwithExistingSuper-ResolutionAlgorithmsWeinitiallyrestrictattentiontothecaseofenhancing

pixelimagesfourtimestogive

pixelimages.Laterwewillconsiderthevariationinperformancewiththemagniﬁcation

factor.Wesimulatethemultipleslightlytranslatedimagesrequiredforsuper-resolutionusingtheFERETdatabasebyrandomlytranslatingtheoriginalFERETimagesmultipletimesbysub-pixelamountsbeforedown-samplingthemtoformthelowresolutioninputimages.

InourﬁrstsetofexperimentswecompareouralgorithmwiththoseofHardieetal.[26]andSchultzandStevenson[47].InFigure8(a)weplottheRMSpixelerroragainstthenumberoflowresolutioninputs,computedoverthe100imagetestset.(WecomputetheRMSerrorusingtheoriginalhighresolutionimageusedtosynthesizetheinputsfrom.)WealsoplotresultsforcubicB-splineinterpolation[41]forcomparison.Sincethisalgorithmisaninterpolationalgorithm,onlyoneimageiseverusedandsotheperformanceisindependentofthenumberofinputs.

InFigure8(a)weseethatourhallucinationalgorithmdoesoutperformthereconstruction-basedsuper-resolutionalgorithms,fromoneinputimageto25.Theimprovementisconsistentacrossthe

30RMS Error Per Pixel in Grey-LevelsRMS Error Per Pixel in Grey-LevelsCubic B-splineHallucinationHardie et al.

Schultz and Stevenson

40353025201510

Cubic B-splineHallucinationHardie et al.

Schultz and Stevenson

Number of Input Images

1002

4681012Standard Deviation of Additive Noise

1416

(a)VariationwiththeNumberofImages(b)VariationwithAdditiveNoise

Figure8:Acomparisonofourhallucinationalgorithmwiththereconstruction-basedsuper-resolution

algorithmsofSchultzandStevenson[47]andHardieetal.[26].In(a)weplottheRMSpixelintensityerrorcomputedacrossthe100imagetestsetagainstthenumberoflowresolutioninputimages.Ouralgorithmoutperformsthethetraditionalsuper-resolutionalgorithmsacrosstheentirerange.In(b)wevarytheamountofadditivenoise.Againweﬁndthatouralgorithmdoesbetterthanthetraditionalsuper-resolutionalgorithms,especiallyasthestandarddeviationofthenoiseincreases.

(a)Input

(b)Hallucinated(c)Hardieetal.(d)Original(e)CubicB-spline

(f)Input

(g)Hallucinated(h)Hardieetal.(i)Original(j)CubicB-spline

Figure9:ThebestandworstresultsinFigure8(a)intermsoftheRMSerrorofthehallucinationalgorithm

for9inputimages.In(a)–(e)wedisplaytheresultsforthebestperformingimageinthe100imagetestset.Theresultsfortheworstimagearepresentedin(f)–(j).(TheresultsforSchultzandStevensonaresimilartothoseforHardieetal.andareomitted.)Thereislittledifferenceinimagequalitybetweenthebestandworsthallucinatedresults.ThehallucinatedresultsarealsovisiblybetterthanthoseforHardieetal.

(a)Std.Dev.1.0(b)Std.Dev.2.0(c)Std.Dev.4.0(d)Std.Dev.8.0(e)Std.Dev.16.0

Figure10:AnexamplefromFigure8(b)ofthevariationintheperformanceofthehallucinationalgorithm

withadditivezero-mean,whiteGaussiannoise.Theoutputsofthehallucinationalgorithmareshownforvariouslevelsofnoise.Ascanbeseen,theoutputishardlyaffecteduntilaround4-bitsofintensitynoisehavebeenaddedtotheinputs.Thisisbecausethehallucinationalgorithmusesthestrongrecognition-basedfacepriortogeneratesmooth,face-likeimageshowevernoisytheinputimagesare.Ataround4-bitsofnoise,therecognitiondecisionsbegintofailandtheperformanceofthealgorithmbeginstodropoff.

numberofinputimagesandisaround20%.Theimprovementisalsolargelyindependentoftheactualinput.Inparticular,Figure9containsthebestandworstresultsobtainedacrosstheentiretestsetintermsoftheRMSerrorofthehallucinationalgorithmfor9lowresolutioninputs.Ascanbeseen,thereislittledifferencebetweenthebestresultsinFigure9(a)–(e)andtheworstonesin(f)–(j).Notice,also,howthehallucinatedresultsareadramaticimprovementoverthelowresolutioninput,andmoreoverarevisiblysharperthantheresultsforHardieetal..4.7.2RobustnesstoAdditiveIntensityNoise

Figure8(b)containstheresultsofanexperimentinvestigatingtherobustnessofthe3super-resolutionalgorithmstoadditiveintensitynoise.Inthisexperiment,weaddedzero-mean,whiteGaussiannoisetothelowresolutionimagesbeforepassingthemasinputstothealgorithms.Intheﬁgure,theRMSpixelintensityerrorisplottedagainstthestandarddeviationoftheadditivenoise.Theresultsshownarefor4lowresolutioninputimages,andagain,theresultsareanaverageoverthe100imagetestset.(TheresultsforcubicB-splineinterpolationjustuseoneinputimage,ofcourse.)Aswouldbeexpected,theperformanceofall4algorithmsgetsmuchworseasthestandarddeviationofthenoiseincreases.Thehallucinationalgorithm(andcubicB-splineinterpolation),however,seemsomewhatmorerobustthanthetraditionalreconstruction-basedsuper-resolutionalgorithms.Thereasonforthisincreasedrobustnessisprobablythatthehallucinationalgorithmalwaystendstogeneratesmooth,face-likeimages(becauseofthestrongrecognition-basedprior)howevernoisytheinputsare.Solongastherecognitiondecisionare

Input

ReductioninRMSerrorvs.cubicB-spline

56%(12.4vs.22.2)

73%(33.3vs.45.4)

Figure11:Thevariationintheperformanceofourhallucinationalgorithmwiththeinputimagesize.From

theexampleinthetoptworows,weseethatthealgorithmworkswelldowntopixelimages,butnotforpixelimages.(SeealsoFigure12.)TheimprovementintheRMSerroroverthe100imagetestsetinthelastrowconﬁrmsthefactthatthealgorithmbeginstobreakdownbetweenthesetwoimagesizes.

notaffectedtoomuch,theresultsshouldlookreasonable.OneexampleofhowtheoutputofthehallucinationalgorithmdegradeswiththeamountofadditivenoiseispresentedinFigure10.4.7.3VariationinPerformancewiththeInputImageSize

Wedonotexpectourhallucinationalgorithmtoworkforallsizesofinput.Oncetheinputgetstoosmall,therecognitiondecisionswillbebasedonessentiallynoinformation.Inthelimitthattheinputimageisjustasinglepixel,thealgorithmwillalwaysgeneratethesameface(forasingleinputimage),butwithdifferentaveragegreylevels.Wethereforeinvestigatedthelowestresolutionatwhichourhallucinationalgorithmworksreasonablewell.

InFigure11weshowexampleresultsforonefaceinthetestsetfor4differentinputsizes.(Alloftheresultsusejust4inputimages.)Weseethatthealgorithmworksreasonablywelldownto

pixels,butfor

pixelimagesitproducesafacethatappearstobeapieced-together

combinationofavarietyoffaces.Thisisnottoosurprisingbecausethepixelinputimage

isnotevenclearlyanimageofaface.(Manyfacedetectorssuchas[45]useinputwindowsof

(a)Input

(1of4images)

(b)Hallucinated

(c)SchultzandStevenson

(d)Original(e)CubicB-spline

Figure12:Selectedresultsfor

pixelimages,thesmallestinputsizeforwhichourhallucination

algorithmworksreliably.(Theinputconsistsofonly4lowresolutioninputimages.)NoticehowsharpthehallucinatedresultsarecomparedtotheinputandtheresultsfortheSchultzandStevenson[47]algorithm.(TheresultsforHardieetal.[26]aresimilartothoseforSchultzandStevensonandsoareomitted.)

CroppedCubic B-splineHallucinated

Figure13:ExampleresultsonasingleimagenotintheFERETdataset.Thefacialfeatures,suchaseyes,

nose,andmouth,whichareblurredandunclearintheoriginalcroppedface,areenhancedandappearmuchsharperinthehallucinatedimage.Incomparison,cubicB-splineinterpolationgivesoverlysmoothresults.

around

pixelssoitisunlikelythatthe

pixelimagewouldbedetectedasaface.)

InthelastrowofFigure11,wegivenumericalresultsoftheaverageimprovementintheRMSerrorovercubicB-splineinterpolation(computedoverthe100imagetestset.)Weseethatfor

and

pixelimages,thereductionintheerrorisverydramatic.Itisroughlyhalved.

Fortheothersizes,theresultsarelessimpressive,withtheRMSerrorbeingcutbyabout25%.For

pixelimages,thereasonisthatthehallucinationalgorithmisbeingtobreakdown.For

pixelimageareexcellent,however.(AlsoseeFigure12which

pixelimages,thereasonisthatcubicB-splinedoessowellthatitishardtodomuchbetter.

Theresultsforthe

containsseveralmoreexamples.)Theinputimagesarebarelyrecognizableasfacesandthefacialfeaturessuchastheeyes,eye-brows,andmouthsonlyconsistofahandfulofpixels.Theoutputs,albeitslightlynoisy,areclearlyrecognizabletothehumaneye.Thefacialfeaturesarealsoclearlydiscernible.ThehallucinatedresultsarealsoahugeimprovementoverSchultzandStevenson[47].4.7.4ResultsonNon-FERETTestImages

Inourﬁnalexperimentforhumanfaces,wetriedouralgorithmonanumberofimagesnotintheFERETdataset.InFigure13wepresenthallucinationresultsjustusingasingleinputimage.Ascanbeseen,thehallucinatedresultsareabigimprovementovercubicB-splineinterpolation.The

CroppedSchultz and StevensonHallucinated

Figure14:Exampleresultsonashortvideoofeightframes.(Onlyoneoftheinputimagesandthe

croppedlowresolutionfaceregionareshown.Theotherseveninputimagesaresimilarexceptthatthecameraisslightlytranslated.)TheresultsofthehallucinationalgorithmareslightlybetterthanthoseoftheSchultzandStevensonalgorithm,forexample,aroundtheeye-brows,aroundthefacecontour,andaroundthehairline.Thisimprovementisonlymarginalbecauseoftheharshilluminationconditions.Atpresent,theperformanceofourhallucinationalgorithmisverydependentuponsucheffects.

facialfeatures,suchastheeyes,nose,andmouthareallenhancedandappearmuchsharperinthehallucinatedresultthaneitherinthelowresolutioninputorintheinterpolatedimage.

InFigure14wepresentresultsonashorteightframevideo.Thefaceregionismarkedbyhardintheﬁrstframeandthentrackedovertheremainderofthesequence.OuralgorithmiscomparedwiththatofSchultzandStevenson.(TheresultsforHardieetal.aresimilarandsoareomitted.)Ouralgorithmmarginallyoutperformsbothreconstruction-basedalgorithms.Inparticular,theeye-brows,thefacecontour,andthehairlineareallalittlesharperinthehallucinatedresult.Theimprovementisquitesmall,however.Thisisbecausethehallucinationalgorithmiscurrentlyverysensitivetoilluminationconditionsandotherphotometriceffects.Weareworkingonmakingouralgorithmmorerobusttosucheffects,aswellasonseveralotherreﬁnements.4.7.5ResultsonImagesNotContainingFaces

InFigure15webrieﬂypresentafewresultsonimagesthatdonotcontainfaces,eventhoughthealgorithmhasbeentrainedontheFERETdataset.(Figure15(a)isarandomimage,(b)isamiscel-laneousimage,and(c)isaconstantimage.)Asmightbeexpected,ouralgorithmhallucinatesanoutlineofafaceinallthreecases,eventhoughthereisnofaceintheinput.Thisisthereasonwe

(a)RandomHallucinated(b)Misc.Hallucinated(c)ConstantHallucinated

Figure15:Theresultsofapplyingourhallucinationalgorithmtoimagesnotcontainingfaces.(Wehave

omittedthelowresolutioninputandhavejustdisplayedtheoriginalhighresolutionimage.)Asisevident,afaceishallucinatedbyouralgorithmevenwhennoneispresent,hencetheterm“hallucinationalgorithm.”

calledouralgorithma“hallucinationalgorithm.”(Thehallucinationalgorithmnaturallyperformsworseonimagesthatiswasnottrainedforthanreconstruction-basedalgorithmsdo.)

4.8ExperimentalResultsonTextData

Wealsoappliedouralgorithmtotextdata.Inparticular,wegrabbedanimageofanwindowdis-playingonepageofaletterandusedthebit-mapastheinput.Theimagewassplitintodisjointtrainingandtestsamples.(Thetrainingandtestdatathereforecontainthesamefont,areatthesamescale,andthedataisnoiseless.Thetrainingandtestdataarenotregisteredinanywayhow-ever.)TheresultsarepresentedinFigures16.TheinputinFigure16(a)ishalftheresolutionoftheoriginalinFigure16(f).ThehallucinatedresultinFigure16(c)isthebestreconstructionofthetext,bothvisuallyandintermsoftheRMSintensityerror.Forexample,comparetheappearanceoftheword“was”inthesecondsentenceofthetextinFigures16(b)–(f).Thehallucinationalgo-rithmalsohasanRMSerrorofonly24.5greylevels,comparedtoover48.0forthethreeotheralgorithms,almostafactoroftwoimprovement.

5Discussion

Intheﬁrsthalfofthispaperweshowedthatthesuper-resolutionreconstructionconstraintsprovidelessandlessusefulinformationasthemagniﬁcationfactorincreases.Themajorcauseofthisphenomenonisthespatialaveragingoverthephotosensitivearea;i.e.thefactthat

isnon-zero.

Theunderlyingreasonthattherearelimitsonreconstruction-basedsuper-resolutionisthereforethesimplefactthatCCDsensorsmusthaveanon-zerophotosensitiveareainordertobeabletocaptureanon-zeronumberofphotonsoflight.

Ouranalysisassumesquantizednoiselessimages;i.e.theintensitiesare8-bitvalues,created

(a)InputImage.(Justoneimageisused.)

(b)CubicB-spline,RMSError51.3

(c)Hallucinated,RMSError24.5

(d)SchultzandStevenson,RMSError48.4

(e)Hardieetal.,RMSError48.5

(f)OriginalHighResolutionImage

Figure16:Theresultsofenhancingtheresolutionofapieceoftextbyafactorof2.(Justasingleinput

imageisused.)Ourhallucinationalgorithmproducesaclear,crispimageusingnoexplicitknowledgethattheinputcontainstext.Inparticular,lookattheword“was”inthesecondsentence.TheRMSpixelintensityerrorisalsoalmostafactorof2improvementovertheotheralgorithms.

byroundingnoiselessreal-valuednumbers.(Itisthisquantizationthatcausesthelossofinforma-tion,whichwhencombinedwithspatialaveraging,meansthathighmagniﬁcationsuper-resolutionisnotpossiblefromthereconstructionconstraints.)Withoutthisassumption,however,itmightbepossibletoincreasethenumberofbitsperpixelbyaveragingacollectionofquantizednoisyim-ages(inanintelligentway).Inpractice,takingadvantageofsuchinformationisverydifﬁcult.Thispointalsodoesnotaffectanotheroutcomeofouranalysiswhichwastoshowthatreconstruction-basedsuper-resolutioninherentlytrades-offintensityresolutionforspatialresolution.

Inthesecondhalfofthispaperweshowedthatrecognitionprocessesmayprovideanaddi-tionalsourceofinformationforsuper-resolutionalgorithms.Inparticular,wedevelopeda“hallu-cination”algorithmforsuper-resolutionanddemonstratedthatthisalgorithmcanobtainfarbetterresultsthanexistingreconstruction-basedsuper-resolutionalgorithms,bothvisuallyandintermsofRMSpixelintensityerror.Similarapproachesmayaidother(ie.3D)reconstructiontasks.Atthistime,however,ourhallucinationalgorithmisnotrobustenoughtobeusedontypicalsurveillancevideo.Besidesintegratingitwitha3Dheadtrackertoavoidtheneedformanualregistrationandtoremovetherestrictiontofrontalfaces,therobustnessofthealgorithmtoil-luminationconditionsmustbeimproved.ThislackofrobustnesstoilluminationcanbeseeninFigure14wheretheperformanceofouralgorithmonimagescapturedoutdoorsandinnovelil-luminationconditionsresultsinsigniﬁcantlylessimprovementoverexistingreconstruction-basedalgorithmsthanthatseeninsomeofourotherresults.(ThemostappropriateﬁguretocompareFigure14withisFigure9.)Wearecurrentlyworkingontheseandotherreﬁnements.

Thetwohalvesofthispaperarerelatedinthefollowingsense.Bothhalvesareconcernedwithwheretheinformationcomesfromwhensuper-resolutionisperformedandhowstrongthatinformationis.Theﬁrsthalfinvestigateshowmuchinformationiscontainedinthereconstructionconstraintsandshowsthattheinformationcontentisfundamentallylimitedbythedynamicrangeoftheimages.Thesecondhalfdemonstratesthatstrongclass-basedpriorscanprovidefarmoreinformationthanthesimplesmoothnesspriorsthatareusedinexistingsuper-resolutionalgorithms.

Acknowledgements

WewishtothankHarryShumforpointingouttheworkofFreemanandPasztor[25],IainMatthewsforpointingouttheworkofEdwardsetal.[19],andHenrySchneidermanforsug-gestingweperformtheconditioninganalysisinSection3.2.Wewouldalsoliketothanknumer-ouspeopleforcommentsandsuggestions,includingTerryBoult,PeterCheeseman,MichalIrani,ShreeNayar,SteveSeitz,SundarVedula,andeveryoneintheFaceGroupatCMU.Finallywe

wouldliketothanktheanonymousreviewersfortheircommentsandsuggestions.TheresearchdescribedinthispaperwassupportedbyUSDODGrantMDA-904-98-C-A915.Apreliminaryversionofthispaper[4]appearedinJune2000intheIEEEConferenceonComputerVisionandPatternRecognition.Additionalexperimentalresultscanbefoundinthetechnicalreport[1].

References

[1]S.BakerandT.Kanade.Hallucinatingfaces.TechnicalReportCMU-RI-TR-99-32,The

RoboticsInstitute,CarnegieMellonUniversity,1999.[2]S.BakerandT.Kanade.Super-resolutionopticalﬂow.TechnicalReportCMU-RI-TR-99-36,

TheRoboticsInstitute,CarnegieMellonUniversity,1999.[3]S.BakerandT.Kanade.Hallucinatingfaces.InProceedingsoftheFourthInternational

ConferenceonAutomaticFaceandGestureRecognition,Grenoble,France,2000.[4]S.BakerandT.Kanade.Limitsonsuper-resolutionandhowtobreakthem.InProceedingsof

the2000IEEEConferenceonComputerVisionandPatternRecognition,HiltonHead,SouthCarolina,2000.[5]S.Baker,S.K.Nayar,andH.Murase.Parametricfeaturedetection.InternationalJournalof

ComputerVision,27(1):27–50,1998.[6]D.F.Barbe.Charge-CoupledDevices.Springer-Verlag,1980.

[7]B.Bascle,A.Blake,andA.Zisserman.Motiondeblurringandsuper-resolutionfroman

imagesequence.InProceedingsoftheFourthEuropeanConferenceonComputerVision,pages573–581,Cambridge,England,1996.[8]J.R.Bergen,P.Anandan,K.J.Hanna,andR.Hingorani.Hierarchicalmodel-basedmotion

estimation.InProceedingsoftheSecondEuropeanConferenceonComputerVision,pages237–252,SantaMargheritaLiguere,Italy,1992.[9]M.Berthod,H.Shekarforoush,M.Werman,andJ.Zerubia.Reconstructionofhighresolution

3Dvisualinformation.InProceedingsofthe1994ConferenceonComputerVisionandPatternRecognition,pages654–657,Seattle,WA,1994.[10]M.BornandE.Wolf.PrinciplesofOptics.PermagonPress,1965.

[11]P.J.Burt.Fastﬁltertransformsforimageprocessing.ComputerGraphicsandImagePro-cessing,16:20–51,1980.[12]P.J.BurtandE.H.Adelson.TheLaplacianpyramidasacompactimagecode.IEEETrans-actionsonCommuniations,31(4):532–540,1983.

[13]P.Cheeseman,B.Kanefsky,R.Kraft,J.Stutz,andR.Hanson.Super-resolvedsurfacerecon-structionfrommultipleimages.TechnicalReportFIA-94-12,NASAAmesResearchCenter,MoffetField,CA,1994.[14]M.-C.ChiangandT.E.Boult.Imaging-consistentsuper-resolution.InProceedingsofthe

1997DARPAImageUnderstandingWorkshop,1997.[15]M.-C.ChiangandT.E.Boult.Localblurestimationandsuper-resolution.InProceedingsof

the1997ConferenceonComputerVisionandPatternRecognition,pages821–826,SanJuan,PuertoRico,1997.[16]J.S.DeBonet.Multiresolutionsamplingprocedureforanalysisandsynthesisoftexture

images.InComputerGraphicsProceedings,AnnualConferenceSeries,(SIGGRAPH’97),pages361–368,1997.[17]J.S.DeBonetandP.Viola.Texturerecognitionusinganon-parametricmulti-scalestatistical

model.InProceedingsofthe1998ConferenceonComputerVisionandPatternRecognition,pages641–647,SantaBarbara,CA,1998.[18]F.Dellaert,S.Thrun,andC.Thorpe.Jacobianimagesofsuper-resolvedtexturemapsfor

model-basedmotionestimationandtracking.InProceedingsoftheFourthWorkshoponApplicationsofComputerVision,pages2–7,Princeton,NJ,1998.[19]G.J.Edwards,C.J.Taylor,andT.F.Cootes.Learningtoidentifyandtrackfacesinimage

sequences.InProceedingsoftheThirdInternationalConferenceonAutomaticFaceandGestureRecognition,pages260–265,Nara,Japan,1998.[20]M.Elad.Super-ResolutionReconstructionofImageSequences-AdaptiveFilteringAp-proach.PhDthesis,TheTechnion-IsraelInstituteofTechnology,Haifa,Israel,1996.[21]M.EladandA.Feuer.Restorationofsinglesuper-resolutionimagefromseveralblurred,

noisyanddown-sampledmeasuredimages.IEEETransactionsonImageProcessing,6(12):1646–58,1997.[22]M.EladandA.Feuer.Super-resolutionreconstructionofimagesequences.IEEETransac-tionsonPatternAnalysisandMachineIntelligence,21(9):817–834,1999.[23]M.EladandA.Feuer.Super-resolutionrestorationofanimagesequence-adaptiveﬁltering

approach.IEEETransactionsonImageProcessing,8(3):387–395,1999.[24]W.T.FreemanandE.H.Adelson.Thedesignanduseofsteerableﬁlters.IEEETransactions

onPatternAnalysisandMachineIntelligence,13:891–906,1991.[25]W.T.Freeman,E.C.Pasztor,andO.T.Carmichael.Learninglow-levelvision.International

JournalofComputerVision,20(1):25–47,2000.

[26]R.C.Hardie,K.J.Barnard,andE.E.Armstrong.JointMAPregistrationandhigh-resolution

imageestimationusingasequenceofundersampledimages.IEEETransactionsonImageProcessing,6(12):1621–1633,1997.[27]B.K.P.Horn.RobotVision.McGrawHill,1996.

[28]T.S.HuangandR.Tsai.Multi-frameimagerestorationandregistration.AdvancesinCom-puterVisionandImageProcessing,1:317–339,1984.[29]M.IraniandS.Peleg.Improvingresolutionbyimagerestoration.ComputerVision,Graphics,

andImageProcessing,53:231–239,1991.[30]M.IraniandS.Peleg.Motionanalysisforimageenhancement:Resolution,occulsion,and

transparency.JournalofVisualCommunicationandImageRepresentation,4(4):324–335,1993.[31]M.Irani,B.Rousso,andS.Peleg.Imagesequenceenhancementusingmultiplemotionsanal-ysis.InProceedingsofthe1992ConferenceonComputerVisionandPatternRecognition,pages216–221,Urbana-Champaign,Illinois,1992.[32]D.Keren,S.Peleg,andR.Brada.Imagesequenceenhancementusingsub-pixeldisplace-ments.InProceedingsofthe1988ConferenceonComputerVisionandPatternRecognition,pages742–746,AnnArbor,Michigan,1988.[33]S.Kim,N.Bose,andH.Valenzuela.Recursivereconstructionofhighresolutionimage

fromnoisyundersampledmultiframes.IEEETransactionsonAcoustics,Speech,andSignalProcessing,38:1013–1027,1990.[34]S.KimandW.-Y.Su.Recursivehigh-resolutionreconstructionofblurredmultiframeimages.

IEEETransactionsonImageProcessing,2:534–539,1993.[35]S.MannandR.W.Picard.Virtualbellows:Constructinghighqualitystillsfromvideo.

InProceedingsoftheFirstInternationalConferenceonImageProcessing,pages363–367,Austin,TX,1994.[36]V.S.Nalwa.AGuidedTourofComputerVision.Addison-Wesley,1993.

[37]T.Numnonda,M.Andrews,andR.Kakarala.Highresolutionimagereconstructionbysim-ulatedannealing.ImageandVisionComputing,11(4):213–220,1993.[38]A.Patti,M.Sezan,andA.Tekalp.Superresolutionvideoreconstructionwitharbitrarysam-plinglaticesandnonzeroaperturetime.IEEETransactionsonImageProcessing,6(8):1064–1076,1997.[39]S.Peleg,D.Keren,andL.Schweitzer.Improvingimageresolutionusingsubpixelmotion.

PatternRecognitionLetters,pages223–226,1987.

[40]P.J.Philips,H.Moon,P.Rauss,andS.A.Rizvi.TheFERETevaluationmethodologyfor

face-recognitionalgorithms.InCVPR’97,1997.[41]W.K.Pratt.DigitalImageProcessing.Wiley-Interscience,1991.

[42]W.H.Press,S.A.Teukolsky,W.T.Vetterling,andB.P.Flannery.NumericalRecipesinC.

CambridgeUniversityPress,secondedition,1992.[43]H.QiandQ.Snyder.Conditioninganalysisofmissingdataestimationforlargesensorarrays.

InProceedingsofthe2000IEEEConferenceonComputerVisionandPatternRecognition,HiltonHead,SouthCarolina,2000.[44]T.Riklin-RavivandA.Shashua.TheQuotientimage:Classbasedrecognitionandsynthesis

undervaryingillumination.InProceedingsofthe1999ConferenceonComputerVisionandPatternRecognition,pages566–571,FortCollins,CO,1999.[45]H.A.Rowley,S.Baluja,andT.Kanade.Neuralnetwork-basedfacedetection.IEEETrans-actionsonPatternAnalysisandMachineIntelligence,20(1):23–38,January1998.[46]R.SchultzandR.Stevenson.ABayseianapproachtoimageexpansionforimproveddeﬁni-tion.IEEETransactionsonImageProcessing,3(3):233–242,1994.[47]R.SchultzandR.Stevenson.Extractionofhigh-resolutionframesfromvideosequences.

IEEETransactionsonImageProcessing,5(6):996–1011,1996.[48]H.Shekarforoush.Conditioningboundsformulti-framesuper-resolutionalgorithms.Tech-nicalReportCAR-TR-912,ComputerVisionLaboratory,CenterforAutomationResearch,UniversityofMaryland,1999.[49]H.Shekarforoush,M.Berthod,J.Zerubia,andM.Werman.Sub-pixelbayesianestimation

ofalbedoandheight.InternationalJournalofComputerVision,19(3):289–300,1996.[50]V.Smelyanskiy,P.Cheeseman,D.Maluf,andR.Morris.Bayesiansuper-resolvedsurface

reconstructionfromimages.InProceedingsofthe2000IEEEConferenceonComputerVisionandPatternRecognition,HiltonHead,SouthCarolina,2000.[51]H.StarkandP.Oskoui.High-resolutionimagerecoveryfromimage-planearrays,using

convexprojections.JournaloftheOpticalSocietyofAmericaA,6:1715–1726,1989.[52]R.SzeliskiandP.Golland.Stereomatchingwithtransparencyandmatting.InSixthInterna-tionalConferenceonComputerVision(ICCV’98),pages517–524,Bombay,1998.[53]H.UrandD.Gross.Improvedresolutionfromsubpixelshiftedpictures.ComputerVision,

Graphics,andImageProcessing,54(2):181–186,1992.

因篇幅问题不能全部显示，请点此查看更多更全内容

查看全文