LimitsonSuper-ResolutionandHowtoBreakThem
SimonBakerandTakeoKanadeTheRoboticsInstituteCarnegieMellonUniversityPittsburgh,PA15213
Abstract
Nearlyallsuper-resolutionalgorithmsarebasedonthefundamentalconstraintsthatthesuper-resolutionimageshouldgeneratethelowresolutioninputimageswhenap-propriatelywarpedanddown-sampledtomodeltheimageformationprocess.(Thesereconstructionconstraintsarenormallycombinedwithsomeformofsmoothnesspriortoregularizetheirsolution.)Inthefirstpartofthispaper,wederiveasequenceofan-alyticalresultswhichshowthatthereconstructionconstraintsprovidelessandlessusefulinformationasthemagnificationfactorincreases.Wealsovalidatethesere-sultsempiricallyandshowthatforlargeenoughmagnificationfactorsanysmoothnesspriorleadstooverlysmoothresultswithverylittlehigh-frequencycontent(howevermanylowresolutioninputimagesareused.)Inthesecondpartofthispaper,wepro-poseasuper-resolutionalgorithmthatusesadifferentkindofconstraint,inadditiontothereconstructionconstraints.Thealgorithmattemptstorecognizelocalfeaturesinthelowresolutionimagesandthenenhancestheirresolutioninanappropriateman-ner.Wecallsuchasuper-resolutionalgorithmahallucinationorrecogstructionalgo-rithm.Wetriedourhallucinationalgorithmontwodifferentdatasets,frontalimagesoffacesandprintedRomantext.Weobtainedsignificantlybetterresultsthanexistingreconstruction-basedalgorithms,bothqualitativelyandintermsofRMSpixelerror.Keywords:Super-resolution,analysisofreconstructionconstraints,learning,faces,text,hallucination,recogstruction.
1Introduction
Super-resolutionistheprocessofcombiningmultiplelowresolutionimagestoformahigherresolutionone.Numeroussuper-resolutionalgorithmshavebeenproposedintheliterature[39,32,51,33,29,31,53,30,34,37,13,9,35,47,49,7,38,26,21,15,23,18]datingbacktothefrequencydomainapproachofHuangandTsai[28].Usuallyitisassumedthatthereissome(small)relativemotionbetweenthecameraandthescene,howevermotionlesssuper-resolutionisindeedpossibleifotherimagingparameters(suchastheamountofdefocusblur)varyinstead[21].Ifthereisrelativemotionbetweenthecameraandthescene,thenthefirststeptosuper-resolutionistoregisteroraligntheimages;i.e.computethemotionofpixelsfromoneimagetotheothers.Themotionfieldsaretypicallyassumedtotakeasimpleparametricform,suchasatranslationoraprojectivewarp[8],butinsteadcouldbedenseopticalflowfields[20,2].Weassumethatimageregistrationhasalreadybeenperformed,andconcentrateonthesecondhalfofsuper-resolutionwhichconsistsoffusingthemultiple(aligned)lowresolutionimagesintoahigherresolutionone.Thesecond,fusionstepisusuallybasedontheconstraintsthatthesuper-resolutionimage,whenappropriatelywarpedanddown-sampledtotakeintoaccountthealignmentandtomodeltheimageformationprocess,shouldyieldthelowresolutioninputimages.ThesereconstructionconstraintshavebeenusedbynumerousauthorssincefirststudiedbyPelegetal.[39,29].TheconstraintscaneasilybeembeddedinaBayesianframeworkincorporatingaprioronthehighresolutionimage[47,26,21]1.TheirsolutioncanbeestimatedeitherinbatchmodeorrecursivelyusingaKalmanfilter[22,18].Severalrefinementshavebeenproposed,includingsimultaneouslycomputingstructure[13,49,50]andremovingotherdegradingeffectssuchasmotionblur[7].Inpracticetheresultsobtainedusingthesereconstruction-basedalgorithmsaremixed.Whilethesuper-resolutionimagesaregenerallyanimprovementovertheinputs,thehighfrequencycomponentsoftheimagesaregenerallynotreconstructedverywell.Toillustratethispoint,weconductedanexperiment,theresultsofwhichareincludedinFigure1.Wetookahighresolutionimageofaface(showninthetopleftofthefigure)andsyntheticallytranslateditbyrandomsub-pixelamounts,blurreditwithaGaussian,andthendown-sampledit.Werepeatedthisprocedureforseveraldifferent(linear)down-samplingfactors;2,4,8,and16.Ineachcase,wegeneratedmultipledown-sampledimages,eachwithadifferentrandomtranslation.Wegeneratedenough
fication.Inthesecondresultweshowthateventhoughtheconstraintsaregenerallyinvertibleforotherpointspreadfunctions,theconditionnumberalsoalwaysgrowsatleastasfastasaquadratic.(ThissecondresultisprovedforalargeclassofpointspreadfunctionsincludingallpointspreadfunctionswhichreasonablymodelCCDsensors.)Thissecondresult,however,doesnotentirelyexplaintheresultsshowninFigure1.Itisfrequentlypossibletoinvertanill-conditionedproblembysimplyimposingasmoothnessprioronthesolution.Inourthirdresultweusethefactthatthepixelsintheinputimagestakevaluesinafiniteset(typicallyintegersintherange0–255)toshowthatthevolumeofsolutionstothediscretizedreconstructionconstraintsgrowsatanextremelyfastrate.This,then,istheunderlyingreasonfortheresultsinFigure1.Forlargemagnificationfactors,thereareahugenumberofsolutionstothereconstructionconstraints,includingnumerousverysmoothsolutions.Thesmoothnesspriorthatistypicallyaddedtoresolvetheambiguityinthelargesolutionspacesimplyensuresthatitisoneoftheoverlysmoothsolutionsthatischosen.(Strictly,thefinalsolutiontotheoverallproblemisonlyanapproximatesolutionofthereconstruc-tionconstraintssincebothsetsofconstraintsareaddedasleastsquaresconstraints.)
How,then,canhigh-magnificationsuper-resolutionbeperformed?Ouranalyticalresultsholdforanarbitrarynumberofimages.Usingmorelowresolutionimagesthereforedoesnothelp,atleastbeyondapoint(whichinpracticeisdeterminedbyawiderangeoffactors.Seethediscussionattheendofthepaperformoredetails.)Theadditional(8-bit,say)inputimagessimplydonotprovideanymoreinformationbecausetheinformationwaslostwhentheywerequantizedto8-bits.Suppose,however,thattheinputimagescontaintext.Moreover,supposethatitispossibletoperformopticalcharacterrecognition(OCR)andrecognizethetext.Ifthefontcanalsobedetermined,itwouldthenbeeasytobeperformsuper-resolution.Thetextcouldbereproducedatanyresolutionbysimplyrenderingitfromtherecognizedtextandthedefinitionofthefont.Inthesecondhalfofthispaper,wedescribeasuper-resolutionalgorithmbasedonthisprinciple,whichwecallrecognition-basedsuper-resolutionorhallucination[1,3].Moregenerally,weproposethetermrecogstructionforrecognition-basedreconstructiontechniques.Ourhallucinationalgorithm,however,isbasedontherecognitionofgenericlocal“features”(ratherthanthecharactersdetectedbyOCR)sothatitcanbeappliedtootherphenomena.Therecognizedlocalfeaturesareusedtopredictarecognition-basedpriorwhichreplacesthesmoothnesspriorsusedinexistingalgorithmssuchas[13,47,26,21].Wetrainedourhallucinationalgorithmseparatelyonbothfrontalimagesoffacesandcomputergeneratedtext.Weobtainedsignificantlybetterresultsthantraditionalreconstruction-basedsuper-resolution,bothvisuallyandintermsofRMSpixelintensityerror.Ouralgorithmiscloselyrelatedtotheindependentworkof[25]inwhichalearningframework
3
forlow-levelvisionwasproposed,oneapplicationofwhichisimageinterpolation.Intheirpaper,Freemanetal.learnaprioronthehigherresolutionimageusingabeliefpropagationnetwork.Ouralgorithmhastheadvantageofbeingapplicabletoanarbitrarynumberofimages.Ouralgorithmisalsocloselyrelatedto[19]inwhichtheparametersofan“active-appearance”modelareusedforsuper-resolution.Thisalgorithmcanalsobeinterpretedashavingastrong,learnedfaceprior.Theremainderofthispaperisorganizedasfollows.WebegininSection2byderivingthesuper-resolutionreconstructionconstraints,beforeanalyzingtheminSection3.Wepresentourhallucinationalgorithm(withresults)inSection4.WeendwithadiscussioninSection5.
2TheSuper-ResolutionReconstructionConstraints
Denotethelowresolutioninputimagesbyin
where
and
isavector
containingthe(columnandrow)pixelcoordinates.Thestaringpointinthederivationofthe
reconstructionconstraintsisthenthecontinuousimageformationequation(seeFigure2)[27,36]:
(1)
where
isthecontinuousirradiancelight-fieldthatwouldhavereachedtheimageplaneof
underthepinholemodel,
isthepointspreadfunctionofthecamera,and
arecoordinatesintheimageplaneof(overwhichtheintegrationisperformed.)AllthatEqua-tion(1)saysisthatthepixelintensity
istheresultofconvolvingtheirradiancefunction
withthepoint-spreadfunctionofthecamera
andthensamplingitatthediscrete
pixellocations
.(Inamoregeneralformulation,
mayalsobeafunction
ofbothtimeandwavelength.Equation(1)wouldthenalsocontainintegrationsoverthesetwo
variablesaswell.Wedonotmodeltheseeffectsbecausetheydonotaffectthespatialanalysis.)
2.1ModelingthePointSpreadFunction
Wedecomposethepointspreadfunctionofthecameraintotwocomponents(seeFigure2):
(2)
where
modelstheblurringcausedbytheopticsand
modelsthespatialintegration
performedbytheCCDsensor[5].Theopticalblurring
istypicallyfurthersplitintoadefocus
factorthatcanbeapproximatedbyapill-boxfunctionandadiffraction-limitedopticaltransfer
4
EiLens System
Optical Effects: ωi
Finite ApertureEi*ωiCCD Sensor
SiSiSpatial Integration:ai
Loi(m,n) =Ei*ωi*ai(m,n)= Ei*PSFi(m,n)
Figure2:Ourimageformationmodel.Weassumethatthelowresolutioninputimages
areformed
bytheconvolutionoftheirradiancewiththecamerapointspreadfunction.Wemodelthepointspreadfunctionitselfastheconvolutionoftwoterms:(1)modelstheopticaleffectscausedbythelensandthefiniteaperture,and(2)modelsthespatialintegrationperformedbytheCCDsensor.
functionthatcanbemodeledbythesquareofthefirst-orderBesselfunctionofthefirstkind[10].Weaimtobeasgeneralaspossibleandsoavoidmakinganyassumptionsabout(mostof)ouranalysisisperformedforarbitraryfunctionsparametricformforareais
.Instead,
.Wedo,however,assumea
.Weassumethatthethephoto-sensitiveareasoftheCCDpixelsaresquare
anduniformlysensitivetolight,asin[5,6].Ifthelengthofthesideofthesquarephotosensitive
,thespatialintegrationfunctionisthen:
rameters.Inpractice,asimpleparametricformisassumedfor
,moreoftenthannot,thatit
isGaussian.Theparameter(sigma)isthenestimatedempirically.Sincethepointspreadfunctiondescribes“theimageofanisolatedpointobjectlocatedonauniformlyblackbackground”[36],theparameter(s)canbeestimatedfromtheimageofalightplacedalargedistancefromthecamera.)
2.2WhatisSuper-ResolutionAnyway?
Wewishtoestimateasuper-resolutionimageframeof
where
arepixelcoordinates.
.Thecoordinate
Preciselywhatdoesthismean?Letusbeginwiththecoordinateframeof
istypicallydefinedbythatofoneofthelowresolutioninputimages,say
.
Ifthelinearmagnificationofthesuper-resolutionprocessisclosertoeachotherthanthoseintermsofthatfor
,thepixelsin
willbe
times
.Thecoordinateframeofcanthereforebedefinedin
usingtheequation:
(5)
where
(6)
6
Giventhisequationwedistinguishtwoprocesses:Deblurringisestimatingarepresentationof
(thatisasopposedtoestimating
);
i.e.deblurringisremovingtheeffectsoftheconvolutionwiththepointspreadfunction
.Deblurringisindependentofwhethertherepresentationof
isonadensergrid
thanthatoftheinputimages.Theresolutionmayormaynotchangeduringdeblurring.ResolutionEnhancementconsistsofestimatingeitheroftheirradiancefunctions(
or
)onadensergridthanthatoftheinputimage(s).Forexample,enhancingtheresolution
bythelinearmagnificationfactorconsistsofestimatingtheirradiancefunctiononthe
griddefinedbyEquation(4).Ifthenumberofinputimagesis1,resolutionenhancementisknownasinterpolation.Ifthereismorethanoneinputimage,resolutionenhancementisknownassuper-resolution.Resolutionisthereforesynonymouswithpixelgriddensity.Inthispaperwestudythemostgeneralcase;i.e.thecombinationofsuper-resolutionanddeblur-ring.Weestimate
,arepresentationof
onthegriddefinedbyEquation(4).
2.3RepresentingContinuousImages
Inordertoproceedweneedtospecifywhichcontinuousfunctiondiscreteimage
isrepresentedbythe
.Thesimplestcaseisthat
representsthepiecewiseconstantfunction:
(7)
forall
andwherearethecoordinatesof
apixelin
.Then,Equation(5)canberearrangedtogive:
(9),in
for
;i.e.asetoflinearconstraintsontheunknownsuper-resolutionpixels
termsoftheknownlowresolutionpixelsboththepointspreadfunction
.Theconstantcoefficients
dependon
andontheregistration
.(Similarderivationscanbe
performedforotherrepresentationsof
,suchaspiecewiselinearorquadraticones[14].)
7
3AnalysisoftheReconstructionConstraints
Wenowanalyzethesuper-resolutionreconstructionconstraintsdefinedbyEquation(9).Ascanbeseen,theequationsdependupontwoimagingproperties:(1)thepointspreadfunctionand(2)theregistration
,
.Withoutsomeassumptionsaboutthesefunctionsanyanalysiswould
isarbitrary,itcanbechosen(ineffect)to
bemeaningless.Ifthepointspreadfunctionisarbitrary,itcanbechosentosimulatethe“smallpixels”ofthesuper-resolutionimage.Similarlyif
movethecameratowardsthesceneandtherebydirectlycapturethesuper-resolutionimage.Wethereforehavetomakesome(reasonable)assumptionsabouttheimagingconditions.AssumptionsMadeAboutthePointSpreadFunction
Weassumethatthepointspreadfunctionisthesameforalloftheimages
andtakestheform:
(11)
where
isaconstant(whichisdifferentforeachlowresolutionimage),and
isthelinearmagnificationofthesuper-resolutionproblem.
,theexactvaluesof
Evengiventheseassumptions,theperformanceofanysuper-resolutionalgorithmwilldependuponthenumberofinputimages
,and,moreover,howwellthealgorithm
increases.Wethereforeas-
canregisterthelowresolutionimagestoestimatethe.Ourgoalistoshowthatsuper-resolution
becomesfundamentallymoredifficultasthelinearmagnificationberofinputimages
,witharbitraryvaluesof
sumethattheconditionsareasfavorableaspossibleandperformtheanalysisforanarbitrarynum-.Moreover,weassumethatthealgorithmhas
estimatedthesevaluesperfectly.Anyresultsderivedundertheseconditionswillonlybestrongerinpractice,wherethe
mighttakedegeneratevalues,ormightbeestimatedinaccurately.
8
p+(0.5,0.5)Area A1D ProofM(m-ci+ S (0.5,0.5))p-(0.5,0.5)M(m-ci- S (0.5,0.5))
Figure3:Thepixel
overwhichtheintegrationisperformedinEquation(12)isindicatedbythesmall
squareatthetopleftofthefigure.Thelargersquareonthebottomrightistheregioninwhichisnon-zero.Sincetakesthevalueinthisregion,theintegralinEquation(12)equals,whereistheareaoftheintersectionofthetwosquares.ThisfigureisusedtoillustratetheproofofTheorem1.
3.1InvertibilityAnalysisforSquarePointSpreadFunctions
Weanalyzethereconstructionconstraintsinthreedifferentways.Thefirstanalysisisconcernedwithwhentheconstraintsareinvertible,andwhattherankofthenullspaceiswhentheyarenotinvertible.Inordertogetaneasilyinterpretableresult,theanalysisinthissectionisperformedunderthesimplifiedscenariothattheopticalblurringcanbeignoredandsoanalysisisforarbitraryopticalblurringmodels
,the
Diracdeltafunction.Thisassumptionwillberemovedinthefollowingtwosections,wherethe
.Assumingasquarepointspreadfunction
(andthattheregistration
isatranslation)Equation(9)simplifiesto:
(12).
wheretheintegrationisperformedoverthepixel;i.e.over
Usingthedefinitionof
itiseasytoseethat
isequalto
timesthearea
oftheintersectionofthetwosquaresinFigure3.Wethenhave:Theorem1If
isanintegergreaterthan1,thenforallchoicesof
thesetofEquations(12)
isnotinvertible.Moreover,theminimumachievabledimensionofthenullspaceis
.If
isnotaninteger,
’scanalwaysbechosensothatthesetofEquations(12)areinvertible.
Proof:Weprovideaprooffor1Dimages.(SeeFigure3.)Theextensionto2Disstraight-forward.
9
ThenullspaceofEquation(12)isdefinedbytheconstraints
where
istheareaofintersectionofthetwosquaresinFigure3.For1Dwejustconsideronerow
ofthefigure.Anyelementofthenullspacethereforecorrespondstoanassignmentofvaluestothesmallsquaresinawaythattheirweightedsum(overthelargesquare)equalszero,wheretheweightsaretheareasofintersectionwiththelargesquare.(Tobeabletoconductthisargumentforeverypixelinthesuper-resolutionimage,weneedtoassumethatthenumberofpixelsineveryrowandeverycolumnofthesuper-resolutionimageisgreaterthanpixels.ThisfollowsfromthefactthatChanging
.Thisisaminor
assumptionsinceitcorrespondstoassumingthatthelowresolutionimagesarebiggerthan
isphysicallyconstrainedtobelessthan1.)
toslidethelargesquarealongtherowbyasmallamount,wegetasimilarcon-
straintontheelementsinthenullspace.Theonlydifferenceisintheleft-mostandright-mostsquares.Subtractingthesetwoconstraintsshowsthattheleft-mostsquareandtheright-mostsquaremusthavethesamevalue.Thismeansthatand
mustequalboth
,iftheassignmentistolieinthenullspace.
If
isnotaninteger(oris1),thisprovesthatneighboringvaluesof
mustbeequal
andhence.Therefore,valuescanalwaysbechosenforthe
sothatthenullspaceonlycontains
thezerovector;i.e.thelinearsystemis,ingeneral,invertible.(Theequivalenceofthenullspacebeingnon-trivialandthelinearsystembeingnotinvertiblerequirestheassumptionthatthenumberofpixelsinthesuper-resolutionimageisfinite;i.e.thesuper-resolutionimageisbounded.)If
isaninteger,thisconstraintplacesanupperboundof
onthedimensionthatareperiodic.Allofthese
ofthenullspace(sincethenullspaceiscontainedinthesetassignmentstowithperiod
.)Thisvaluecanalsobeshowntobealowerboundonthedimensionofthe
nullspacebythespaceofperiodassignmentsforwhich
assignmentscaneasilybeseentolieinthenullspace(foranychoiceofthetranslationsthetwocases
and
,where
).
Tovalidatethistheorem,wesolvedthereconstructionconstraintsusinggradientdescentfor
.TheresultsarepresentedinFigure4.
Inthisexperiment,nosmoothnesspriorisusedandgradientdecentisrunforasufficientlylongtimethatthestartingimage(whichissmooth)doesnotbiastheresults.Theinputinbothcasesconsistedofmultipledown-sampledimagessimilartotheoneatthetopofthesecondcolumninFigure1.Specifically,
randomlytranslatedimageswereusedasinput.Exactlythesame
isthereforeactuallysmallerthanthatfor
inputsareusedforthetwoexperiments.Theonlydifferenceisthemagnificationfactorinthesuper-resolutionalgorithm.Theoutputfor
(andwasenlargedtothesamesizeinFigure4fordisplaypurposesonly.)
10
(a)
(b)
(c)
,withPrior
Figure4:ValidationofTheorem1:Theresultsofsolvingthereconstructionconstraintsusinggradient
descentforasquarepointspreadfunctionwith.(a)Whenisaninteger,theequationsarenotinvertibleandsoarandomperiodicimageinthenullspaceisaddedtotheoriginalimage.(b)Whenisnotaninteger,thereconstructionconstraintsareinvertibleandsoasmoothsolutionisfound,evenwithoutaprior.(Theresultforhasbeeninterpolatedtomakeitthesamesizeasthatfor.)(c)Whenasmoothnesspriorisaddedtothereconstructionconstraintsthedifficultiesseenin(a)disappear.
AscanbeseeninFigure4,forwithperiod
the(additive)errorisapproximatelyaperiodicimage
thefactthattheproblemisnotinvertible
the
pixels.For
theequationsareinvertibleandsoasmoothsolutionisfound,
eventhoughnosmoothnesspriorwasused.For
doesnothaveanypracticalsignificance.Adequatesolutionscanbeobtainedbysimplyaddingasmoothnesspriortothereconstructionconstraints,asshowninFigure4(c).For
situationisdifferent,however.Aswillbeshowninthethirdpartofouranalysis,itistherapidrateofincreaseofthedimensionofnullspacethatistherootcauseoftheproblemsforlarge
.
3.2ConditioningAnalysisforArbitraryPointSpreadFunctions
Anylinearsystemthatisclosetobeingnotinvertibleisusuallyill-conditioned.Itisnosurprisethenthatchangingfromasquarepointspreadfunctiontoanarbitraryfunction
resultsinanill-conditionedsystem,aswenowshowinthesecondpartofouranalysis:Theorem2Suppose
isanyfunctionforwhich
forall
and
.
Then,theconditionnumberofthefollowinglinearsystemgrowsatleastasfastas
:
(13)
where
.
11
Proof:Wefirstprovethetheoremforthesquarepointspreadfunctionandthengeneralize.Theconditionnumberofan
(i.e.forEquation12)
matrix
isdefined[42]as:
(15)
where
istheL2norm.(ThisresultfollowsimmediatelyfromtheSVD
.The
matrices
anddonotaffecttheL2normofavectorsincetheircolumnsareorthonormal.
Equation(15)clearlyholdsfor.)Itfollowsimmediatelythatif
and
areanytwovectorsthen:
andset
Thistheoremismoregeneralthanthepreviousonebecauseitappliesto(essentially)arbi-trarypointspreadfunctions.Ontheotherhand,itisaweakerresult(insomesituations)becauseitonlypredictsthatsuper-resolutionisill-conditioned(ratherthannotinvertible.)Thistheoremonitsown,therefore,doesnotentirelyexplainthepoorperformanceofsuper-resolution.AsweshowedinFigure4,problemsthatareill-conditioned(orevennotinvertible,wheretheconditionnumberisinfinite)canoftenbesolvedbysimplyaddingasmoothnessprior.Thenotinvertiblesuper-resolutionprobleminFigure4(a)issolvedinFigure4(c)inthisway.Severalresearchershaveperformedconditioninganalysisofvariousformsofsuper-resolution,including[21,48,43].Althoughuseful,noneoftheseresultsfullyexplainthedrop-offinperformancewiththemag-nification
.Theweaknessofconditioninganalysisisthatanill-conditionedsystemmaybe
ill-conditionedbecauseofasingle“almostsingularvalue.”AsindicatedbytherapidgrowthinthedimensionofthenullspaceinTheorem1,super-resolutionhasalargenumberof“almostsingularvalues”forlargemagnifications.ThisistherealcauseofthedifficultiesseeninFigure1.Onewaytoshowthisistoderivethevolumeofsolutions,aswenowdointhethirdpartofouranalysis.
3.3VolumeofSolutionsforArbitraryPointSpreadFunctions
Ifwecouldworkwithnoiseless,real-valuedquantitiesandperformarbitraryprecisionarithmeticthenthefactthatthereconstructionconstraintsareill-conditionedmightnotbeaproblem.Inreality,however,imagesarealwaysintensitydiscretized(typicallyto8-bitvaluesintherange0–255greylevels.)Therewillthereforealwaysbenoiseinthemeasurements,evenifitisonlyplus-or-minushalfagrey-level.Supposethat
denotestheoperatorwhichtakesareal-valued
irradiancemeasurementandturnsitintoaninteger-valuedintensity.Ifweincorporatethisquanti-zationintoourimageformationmodelthenEquation(17)becomes:
(18)
Supposethat
isafixedsizeimage2with
pixels.Wethenhave:
Theorem3Ifleastasfastas
isthestandardroundingoperatorwhichreplacesarealnumberwiththe
(treating
asaconstantand
and
asvariables.)
nearestinteger,thenthevolumeofthesetofsolutionsofEquation(18)growsasymptoticallyat
Proof:Firstnotethatthespaceofsolutionsisconvexsinceintegrationisalinearoperation.NextnotethatonesolutionofEquation(18)isthesolutionto:
(19)
Thedefinitionofthepointspreadfunctionasgive
andthepropertiesoftheconvolution
.Therefore,adding
toanypixelinisstillasolutionsincethe
righthandsideofEquation(19)increasesbyatmost1.(Theintegrandisincreasedbylessthan
grey-levelinthepixel,whichonlyhasanareaofunit.)ThevolumeofsolutionsofEquation(18)thereforecontainsan-dimensionalsimplex,wheretheanglesatonevertexareallright-angles,
andthesidesareall
unitslong.Thevolumeofsuchasimplexgrowsasymptoticallylike
(treating
asaconstantandand
asvariables).Thedesiredresultfollows.
Thisthirdandfinaltheoremprovidesthebestexplanationofthesuper-resolutionresultspre-sentedinFigure1.Forlargemagnificationfactors
,thereisahugevolumeofsolutionstothe
discretizedreconstructionconstraintsinEquation(18).Thesmoothnesspriorwhichisaddedtoresolvethisambiguitysimplyensuresthatitisoneoftheoverlysmoothsolutionsthatischosen.(Ofcourse,withouttheprioranysolutionmightbechosenwhichwouldgenerallybeevenworse.Asmentionedintheintroduction,thefinalsolutionisreallyonlyanapproximatesolutionofthereconstructionconstraintssincebothsetsofconstraintsareaddedasleastsquaresconstraints.)InFigure5wepresentquantitativeresultstoillustrateTheorems2and3.Weagainusedthereconstruction-basedalgorithm[26].Weverifiedourimplementationintwoways:(1)wecheckedthatforsmallmagnificationfactorsandnopriorourimplementationdoesyield(essentially)perfectreconstructions,and(2)formagnificationsof4wecheckedthatournumericalresultsareconsistentwiththosein[26].Wealsotriedtherelatedalgorithmof[47]andobtainedverysimilarresults.UsingthesameinputsasFigure1,weplotthereconstructionerroragainstthemagnification;i.e.thedifferencebetweenthereconstructedhighresolutionimageandtheoriginal.Wecomparethiserrorwiththeresidualerror;i.e.thedifferencebetweenthelowresolutioninputsandtheirpre-dictionsfromthereconstructedhighresolutionimage.Asexpectedforanill-conditionedsystem,thereconstructionerrorismuchhigherthantheresidual.Wealsocomparewitharoughpredic-tionofthereconstructionerrorobtainedbymultiplyingthelowerboundontheconditionnumber
14
100RMS Intensity Difference in Grey-Levels80
Reconstruction ErrorResidual Error
Single Image Interpolation Error
Predicted Error
60
Smoothness Prior40
Super-Res.20
0
246
810Linear Magnification
121416
Figure5:AnillustrationofTheorems2and3usingthesameinputsasinFigure1.Thereconstructionerror
ismuchhigherthantheresidual,aswouldbeexpectedforanill-conditionedsystem.Forlowmagnifications,thepriorisunnecessaryandsotheresultsareworsethanpredicted.Forhighmagnifications,thepriordoeshelp,butatthepriceofoverlysmoothresults.(SeeFigure1.)Aroughestimateoftheamountofinformationprovidedbythereconstructionconstraintsisgivenbytheimprovementofthereconstructionerroroverthesingleimageinterpolationerror.Similarly,theimprovementfromthepredictederrortothereconstructionerrorisanestimateoftheamountofinformationprovidedbythesmoothnessprior.Bythismeasure,thesmoothnesspriorprovidesmoreinformationthanthereconstructionconstraintsforamagnificationof.
(
)byanestimateoftheexpectedresidualassumingthatthegrey-levelsarediscretizedfroma
uniformdistribution.Forlowmagnificationfactors,thisestimateisanunder-estimatebecausethepriorisunnecessaryfornoisefreedata;i.e.betterresultswouldbeobtainedwithouttheprior.Forhighmagnificationsthepredictionisanover-estimatebecausethelocalsmoothnessassumptiondoeshelpthereconstruction(albeitattheexpenseofoverlysmoothresults.)
WealsoplotinterpolationresultsinFigure5;i.e.justusingthereconstructionconstraintsforoneimage(aswasproposed,forexample,in[46].)Thedifferencebetweenthiscurveandthereconstructionerrorcurveisameasureofhowmuchinformationthereconstructionconstraintsprovide.Similarly,thedifferencebetweenthepredictederrorandthereconstructionerrorisameasureofhowmuchinformationthesmoothnesspriorprovides.ForamagnificationofThis,then,isanalternativeinterpretationofwhytheresultsinFigure1aresosmooth.
,we
seethatthepriorprovidesmoreinformationthanthesuper-resolutionreconstructionconstraints.
15
4Recognition-BasedSuper-ResolutionorHallucination
Howthenisitpossibletoperformhighmagnificationsuper-resolutionwithouttheresultslookingoverlysmooth?Aswehavejustshown,therequiredhigh-frequencyinformationwaslostfromthereconstructionconstraintswhentheinputimageswerediscretizedto8-bitvalues.Genericsmoothnesspriorsmayhelpregularizetheproblem,butcannotreplacethemissinginformation.Asoutlinedintheintroduction,ourgoalinthissectionistodevelopasuper-resolutionalgo-rithmthatusestheinformationcontainedinacollectionofrecognitiondecisions(inadditiontothereconstructionconstraints.)Ourapproachistoembedtheresultsoftherecognitiondecisionsinarecognition-basedprioronthesolutionofthereconstructionconstraints,therebyresolvingtheinherentambiguityintheirsolution(seeSection3.3).
4.1BayesianMAPFormulationofSuper-Resolution
Webeginwiththe(standard)Bayesianformulationofsuper-resolution[13,47,26,21].Inthisap-proach,super-resolutionisposedasfindingthemaximumaposteriori(orMAP)super-resolutionimage
:i.e.estimating
Pr
.Bayeslawforthisestimationproblemis:
Pr
Pr
Pr
4.2Recognition-BasedPriorsforSuper-Resolution
Thesecondtermontheright-handsideofEquation(21)is(thenegativelogarithmof)theprior
Pr
.Usuallythisprioronthesuper-resolutionimageischosentobeasimplesmoothness
prior[13,47,26,21].Insteadwewouldliketochooseitsothatitdependsuponasetofrecognitiondecisions.Supposethattheoutputsoftherecognitiondecisionspartitionthesetofinputs(i.e.thelowresolutioninputimages
)intoasetofsubclasses
Wethendefinea
recognition-basedpriorasonethatcanbewritteninthefollowingform:
Pr
Pr
Pr
(23).Oncethe
then.
Essentially,thereisaseparatepriorPrlowresolutioninputimages
foreachpossiblepartition
areavailable,thevariousrecognitionalgorithmscanbeapplied,
anditcanbedeterminedwhichpartitiontheinputsliein.Therecognition-basedpriorPrreducestothemorespecificpriorPrtheoverallpriorPr
.Thispriorcanbemademorepowerfulthan
becauseitcanbetailoredtothe(smaller)subsetoftheinputdomain
4.3Multi-ScaleDerivativeFeatures:TheParentStructure
Wedecidedtotrytorecognizegenericlocalimagefeatures(ratherthanhigherlevelconceptssuchashumanfacesorASCIIcharacters)becausewewanttoapplyouralgorithmtoavarietyofphenomena.Motivatedby[16,17]wealsodecidedtousemulti-scalefeatures.Inparticular,givenanimage,wefirstformitsGaussianpyramid
[11].Afterwards,wealso
formitsLaplacianpyramid
[12],thehorizontal
andvertical
firstderivativesoftheGaussianpyramid,andthehorizontal
andvertical
secondderivativesoftheGaussianpyramid[1].(SeeFigure6for
examplesofthesepyramidsforanimageofaface.)Finally,weformapyramidoffeatures:
(24)
Thepyramid
isapyramidwherethereare5valuesstoredateachpixel,the
Laplacianandthe4derivatives,ratherthanthesinglevaluetypicallystoredinmostpyramids.(ThechoiceofthefeaturesinEquation(24)isaninstanceofthe“featureselection”problem.Forexample,steerablefilters[24]couldbeusedinstead,orthesecondderivativescouldbedroppediftheyaretoonoisy.Wefoundtheperformanceofouralgorithmstobelargelyindependentofthechoiceoffeatures.Theselectionoftheoptimalfeaturesisoutsidethescopeofthispaper.)
17
Parent Structure vectorPS 2G 4G 3G 2G 1G 0
L 0
L 4L 3L 2L 1
H 0
H 4H 3H 2H 1
V0
V4V3V2V1
(a)GaussianPyramid(b)LaplacianPyramid(c)FirstDerivativePyramids
Figure6:TheGaussian,Laplacian,andfirstderivativepyramidsofanimageofaface.(Wealsousetwo
secondderivativesbutomitthemfromthefigure.)Wecombinethesepyramidsintoasinglemulti-valuedpyramid,wherewestoreavectoroftheLaplacianandthederivativesateachpixel.TheParentStructurevectorofapixelinthelevelofthepyramidconsistsofthevectorofvaluesforthatpixel,thevectorforitsparentinthelevel,thevectorforitsparent’sparent,etc[16].TheParentStructurevectoristhereforeahigh-dimensionalvectorofderivativescomputedatvariousscales.Inouralgorithms,recognitionmeansfindingthetrainingsamplewiththemostsimilarParentStructurevector.
Then,givenapixelinthelowresolutionimagethatweareperformingsuper-resolutionon,wewanttofind(i.e.recognize)apixelinacollectionoftrainingdatathatislocally“similar.”Bysimilar,wemeanthatboththeLaplacianandtheimagederivativesareapproximatelythesame,atallscales.Tocapturethisnotion,wedefinetheParentStructurevector[16]ofapixel
inthe
levelofthefeaturepyramid
tobe:
(25)
AsillustratedinFigure6,theParentStructurevectoratanyparticularpixelinthepyramidconsistsofthefeaturevectoratthatpixel,thefeaturevectoroftheparentofthatpixel,thefeaturevectorofitsparent,andsoon.Exactlyasin[16],ournotionoftwopixelsbeingsimilaristhenthattheirParentStructurevectorsareapproximatelythesame(asmeasuredbysomenorm.)
4.4RecognitionasFindingtheNearest-NeighborParentStructure
Supposewehaveasetofhighresolutiontrainingimagespyramids
.Wecanthenformalloftheirfeature
.Alsosupposethatwearegivenalowresolutioninputimage
.
Finally,supposethatthisimageisataresolutionthatis
18
timessmallerthanthetraining
PS 2(T ) iPS 2( )Lo iF 4(T ) jF 3(T ) jF 2(T ) jF j 1(T )F 0(T ) jHigh-Resolution Training ImagesT j
Match
FLo i 4( )FLo i 3( )FLo i = Lo i 2( )Best ImageBest Pixel
BI i
Low-Resolution Input ImageLo i
BP i
Recognition Output
Figure7:Wecomputethefeaturepyramids
forthetrainingimagesandthefeaturepyramidforthelowresolutioninputimage.Foreachpixelinthelowresolutionimage,wefind(i.e.recognize)theclosestmatchingParentStructureinthehighresolutiondata.WerecordandoutputthebestmatchingimageandthepixellocationofthebestmatchingParentStructure.Notethatthesedatastructuresarebothdefinedindependentlyforeachpixelintheimages.
samples.(Theimagemayhavetobeinterpolatedtomakethisratioexactlyapowerof2.Sincetheinterpolatedimageisimmediatelydownsampledtocreatethepyramiditisonlythelowestlevelofthepyramidfeaturesthatareaffectedbythisinterpolation.Theoveralleffectontheprioristhereforeverysmall.)Wecanthencomputethefeaturepyramidfortheinputimagefromlevelandupwards
.Figure7showsanillustrationofthisscenariofor
.
Foreachpixel
intheinputindependently,wecompareitsParentStructurevector
againstallofthetrainingParentStructurevectorsatthesamelevel;i.e.we
compareagainst
forallandforall
.Thebestmatchingimage
andthebestmatchingpixel
arestoredastheoutputoftherecognitiondecision,
independentlyforeachpixelusedaweighted
in
.(Wefoundtheperformancetobelargelyindependent
ofthedistancefunctionusedtodeterminethebestmatchingParentStructurevector.Weactually
-norm,givingthederivativecomponentshalfasmuchweightastheLaplacian
valuesandreducingtheweightbyafactorofforeachincreaseinthepyramidlevel.)
RecognitioninourhallucinationalgorithmthereforemeansfindingtheclosestmatchingpixelinthetrainingdatainthesensethattheParentStructurevectorsofthethetwopixelsarethemostsimilar.Thissearchis,ingeneral,performedoverallpixelsinalloftheimagesinthetrainingdata.Ifwehavefrontalimagesoffaces,however,werestrictthissearchtoconsideronlythecorrespondingpixelsinthetrainingdata.Inthisway,wetreateachpixelintheinputimagedifferently,dependingonitsspatiallocation,similarlytothe“class-based”approachof[44].
19
4.5ARecognition-basedGradientPrior
Foreachpixelimage
intheinputimage
,wehaverecognizedthepixelthatisthemostsimilar
inthetrainingdata,specifically,thepixel
inthe
levelofthepyramidfortraining
.Theserecognitiondecisionspartitiontheinputs
intoacollectionofsubclasses,
asrequiredbytherecognition-basedpriordescribedinSection4.2.Ifwedenotethesubclassesby
(i.e.usingamulti-dimensionalindexratherthan)Equation(23)canberewrittenas:
Pr
wherePr
Pr
Pr
(26),giventhat
istheprobabilitythatthesuper-resolutionimageis
theinputimages
lieinthesubclassthatwillberecognizedtohave
astheclosestmatching
pixelinthetrainingimage(inthe
levelofthepyramid.)
WenowneedtodefinePr
.Wedecidedtomakethisrecognition-based
priorafunctionofthegradientbecausethebase,oraverage,intensitiesinthesuper-resolutionimagearedefinedbythereconstructionconstraints.Itisthehigh-frequencygradientinformationthatismissing.Specifically,wewanttodefinePrEachlowresolutioninputimagedifferentpixelsinthe(andforeach
toencouragethegradient
ofthesuper-resolutionimagetobeclosetothegradientoftheclosestmatchingtrainingsamples.
hasa(different)closestmatching(ParentStructure)train-ofthemtobeprecise.SeealsoFigure7.)We
ingsampleforeachpixel.Moreover,eachsuchParentStructurecorrespondstoanumberof
levelofthepyramid,(
thereforeimposeaseparategradientconstraintforeachpixel
inthe
levelofthepyramid
.)Now,thebestmatchingpixel
isonlydefinedonthe
levelofthepyra-
mid.Fornotationalconvenience,therefore,givenapixeldefinethebestmatchingpixelonthe
onthe
levelofthepyramid,
levelofthepyramidtobe:
.
If
isapixelinthe
levelofthepyramidforimage
,thecorrespondingpixelinthe
super-resolutionimagederivativesof
is
.Wethereforewanttoimposetheconstraintthatthefirst
and
at
atthispointshouldequalthederivativesoftheclosestmatchingpixel(Parent
Structure)inthetrainingdata.Parametricexpressionsfor
can
easilybederivedaslinearfunctionsoftheunknownpixelsinthehighresolutionimage
20
.We
assumethattheerrorsinthegradientvaluesbetweentherecognizedtrainingsamplesandthesuper-resolutionimageareindependentlyandidenticallydistributed(acrossboththeimagesandthepixelsPr
)andmoreoverthattheyareGaussianwithcovariance
.Therefore:
(28)
Thisexpressionenforcestheconstraintsthatthegradientofthesuper-resolutionimageeachinputimage
.)Theseconstraintsarealsolinearintheunknownpixelsof
should
beequaltothegradientofthebestmatchingtrainingimage(separatelyforeachpixel
in
4.6AlgorithmPracticalities
Equations(21),(22),(26),and(28)formahigh-dimensionallinearleastsquaresproblem.TheconstraintsinEquation(22)arethestandardsuper-resolutionreconstructionconstraints.ThoseinEquation(28)aretherecognition-basedprior.Therelativeweightsoftheseconstraintsaredefinedbythenoisecovariancesreliableonesandsosetsuper-resolutionimage
and
.Weassumethatthereconstructionconstraintsarethemore
(typically)tomakethemalmosthardconstraints.
Thenumberofunknownsinthelinearsystemisequaltothetotalnumberofpixelsinthe
.Directlyinvertingalinearsystemofsuchsizecanproveproblematic.
Wethereforeimplementedagradientdescentalgorithm(usingadiagonalapproximationtotheHessian[42]tosetthehowstepsizeinasimilarwayto[52].)Sincetheerrorfunctionisquadratic,thealgorithmconvergestothesingleglobalminimumwithoutanyproblem.
4.7ExperimentalResultsonHumanFaces
OurexperimentsforhumanfaceswereconductedwithasubsetoftheFERETdataset[40]con-sistingof596imagesof278individuals(92womenand186men).Eachpersonappearsbetween2and4times.Mostpeopleappeartwice,withtheimagestakenonthesamedayunderapproxi-matelythesameilluminationconditions,butwithdifferentfacialexpressions(oneimageisusuallyneutral,theothertypicallyasmile.)Asmallnumberofpeopleappear4times,withtheimagestakenovertwodifferentdays,separatedbyseveralmonths.
21
TheimagesintheFERETdatasetare
pixels,howevertheareaoftheimageoccupied
bythefacevariesconsiderably.Mostofthefacesarearoundpixelsorlarger.Intheclass-
basedapproach[44],theinputimages(whichareallfrontal)needtobealignedsothatwecanassumethatthesamepartofthefaceappearsinroughlythesamepartoftheimageeverytime.Thisallowsustoobtainthebestresults.Thisalignmentwasperformedbyhandmarkingthelocationof3points,thecentersofthetwoeyesandthelowertipofthenose.These3pointsdefineanaffinewarp[8],whichwasusedtowarptheimagesintoacanonicalform.Thecanonicalimageis
pixelswiththerighteyeat.These
,thelefteyeat
,andthelowertipof
thenoseat
pixelimageswerethenusedasthetrainingsamples.(In
mostofourexperiments,wealsoadded8syntheticvariationsofeachimagetothetrainingsetbytranslatingtheimage8times,eachtimebyasmallamount.Thisstepenhancestheperformanceofouralgorithmslightly,althoughitisnotvitaltoobtaingoodperformance.)
Weuseda“leave-one-out”methodologytotestouralgorithm.Totestonanyparticularperson,weremovedalloccurrencesofthatindividualfromthetrainingset.Wethentrainedthealgorithmonthereducedtrainingset,andtestedontheimagesoftheindividualthathadbeenremoved.Becausethisprocessisquitetimeconsuming,weusedatestsetof100imagesof100differentindividualsratherthantheentiretrainingset.Thetestsetwasselectedatrandomfromthetrainingset.Aswillbeseen,thetestsetspansbothsexandracereasonablywell.4.7.1ComparisonwithExistingSuper-ResolutionAlgorithmsWeinitiallyrestrictattentiontothecaseofenhancing
pixelimagesfourtimestogive
pixelimages.Laterwewillconsiderthevariationinperformancewiththemagnification
factor.Wesimulatethemultipleslightlytranslatedimagesrequiredforsuper-resolutionusingtheFERETdatabasebyrandomlytranslatingtheoriginalFERETimagesmultipletimesbysub-pixelamountsbeforedown-samplingthemtoformthelowresolutioninputimages.
InourfirstsetofexperimentswecompareouralgorithmwiththoseofHardieetal.[26]andSchultzandStevenson[47].InFigure8(a)weplottheRMSpixelerroragainstthenumberoflowresolutioninputs,computedoverthe100imagetestset.(WecomputetheRMSerrorusingtheoriginalhighresolutionimageusedtosynthesizetheinputsfrom.)WealsoplotresultsforcubicB-splineinterpolation[41]forcomparison.Sincethisalgorithmisaninterpolationalgorithm,onlyoneimageiseverusedandsotheperformanceisindependentofthenumberofinputs.
InFigure8(a)weseethatourhallucinationalgorithmdoesoutperformthereconstruction-basedsuper-resolutionalgorithms,fromoneinputimageto25.Theimprovementisconsistentacrossthe
22
30RMS Error Per Pixel in Grey-LevelsRMS Error Per Pixel in Grey-LevelsCubic B-splineHallucinationHardie et al.
Schultz and Stevenson
40353025201510
Cubic B-splineHallucinationHardie et al.
Schultz and Stevenson
25
20
15
10
1
Number of Input Images
1002
4681012Standard Deviation of Additive Noise
1416
(a)VariationwiththeNumberofImages(b)VariationwithAdditiveNoise
Figure8:Acomparisonofourhallucinationalgorithmwiththereconstruction-basedsuper-resolution
algorithmsofSchultzandStevenson[47]andHardieetal.[26].In(a)weplottheRMSpixelintensityerrorcomputedacrossthe100imagetestsetagainstthenumberoflowresolutioninputimages.Ouralgorithmoutperformsthethetraditionalsuper-resolutionalgorithmsacrosstheentirerange.In(b)wevarytheamountofadditivenoise.Againwefindthatouralgorithmdoesbetterthanthetraditionalsuper-resolutionalgorithms,especiallyasthestandarddeviationofthenoiseincreases.
(a)Input
(b)Hallucinated(c)Hardieetal.(d)Original(e)CubicB-spline
(f)Input
(g)Hallucinated(h)Hardieetal.(i)Original(j)CubicB-spline
Figure9:ThebestandworstresultsinFigure8(a)intermsoftheRMSerrorofthehallucinationalgorithm
for9inputimages.In(a)–(e)wedisplaytheresultsforthebestperformingimageinthe100imagetestset.Theresultsfortheworstimagearepresentedin(f)–(j).(TheresultsforSchultzandStevensonaresimilartothoseforHardieetal.andareomitted.)Thereislittledifferenceinimagequalitybetweenthebestandworsthallucinatedresults.ThehallucinatedresultsarealsovisiblybetterthanthoseforHardieetal.
23
(a)Std.Dev.1.0(b)Std.Dev.2.0(c)Std.Dev.4.0(d)Std.Dev.8.0(e)Std.Dev.16.0
Figure10:AnexamplefromFigure8(b)ofthevariationintheperformanceofthehallucinationalgorithm
withadditivezero-mean,whiteGaussiannoise.Theoutputsofthehallucinationalgorithmareshownforvariouslevelsofnoise.Ascanbeseen,theoutputishardlyaffecteduntilaround4-bitsofintensitynoisehavebeenaddedtotheinputs.Thisisbecausethehallucinationalgorithmusesthestrongrecognition-basedfacepriortogeneratesmooth,face-likeimageshowevernoisytheinputimagesare.Ataround4-bitsofnoise,therecognitiondecisionsbegintofailandtheperformanceofthealgorithmbeginstodropoff.
numberofinputimagesandisaround20%.Theimprovementisalsolargelyindependentoftheactualinput.Inparticular,Figure9containsthebestandworstresultsobtainedacrosstheentiretestsetintermsoftheRMSerrorofthehallucinationalgorithmfor9lowresolutioninputs.Ascanbeseen,thereislittledifferencebetweenthebestresultsinFigure9(a)–(e)andtheworstonesin(f)–(j).Notice,also,howthehallucinatedresultsareadramaticimprovementoverthelowresolutioninput,andmoreoverarevisiblysharperthantheresultsforHardieetal..4.7.2RobustnesstoAdditiveIntensityNoise
Figure8(b)containstheresultsofanexperimentinvestigatingtherobustnessofthe3super-resolutionalgorithmstoadditiveintensitynoise.Inthisexperiment,weaddedzero-mean,whiteGaussiannoisetothelowresolutionimagesbeforepassingthemasinputstothealgorithms.Inthefigure,theRMSpixelintensityerrorisplottedagainstthestandarddeviationoftheadditivenoise.Theresultsshownarefor4lowresolutioninputimages,andagain,theresultsareanaverageoverthe100imagetestset.(TheresultsforcubicB-splineinterpolationjustuseoneinputimage,ofcourse.)Aswouldbeexpected,theperformanceofall4algorithmsgetsmuchworseasthestandarddeviationofthenoiseincreases.Thehallucinationalgorithm(andcubicB-splineinterpolation),however,seemsomewhatmorerobustthanthetraditionalreconstruction-basedsuper-resolutionalgorithms.Thereasonforthisincreasedrobustnessisprobablythatthehallucinationalgorithmalwaystendstogeneratesmooth,face-likeimages(becauseofthestrongrecognition-basedprior)howevernoisytheinputsare.Solongastherecognitiondecisionare
24
Input
ReductioninRMSerrorvs.cubicB-spline
56%(12.4vs.22.2)
73%(33.3vs.45.4)
Figure11:Thevariationintheperformanceofourhallucinationalgorithmwiththeinputimagesize.From
theexampleinthetoptworows,weseethatthealgorithmworkswelldowntopixelimages,butnotforpixelimages.(SeealsoFigure12.)TheimprovementintheRMSerroroverthe100imagetestsetinthelastrowconfirmsthefactthatthealgorithmbeginstobreakdownbetweenthesetwoimagesizes.
notaffectedtoomuch,theresultsshouldlookreasonable.OneexampleofhowtheoutputofthehallucinationalgorithmdegradeswiththeamountofadditivenoiseispresentedinFigure10.4.7.3VariationinPerformancewiththeInputImageSize
Wedonotexpectourhallucinationalgorithmtoworkforallsizesofinput.Oncetheinputgetstoosmall,therecognitiondecisionswillbebasedonessentiallynoinformation.Inthelimitthattheinputimageisjustasinglepixel,thealgorithmwillalwaysgeneratethesameface(forasingleinputimage),butwithdifferentaveragegreylevels.Wethereforeinvestigatedthelowestresolutionatwhichourhallucinationalgorithmworksreasonablewell.
InFigure11weshowexampleresultsforonefaceinthetestsetfor4differentinputsizes.(Alloftheresultsusejust4inputimages.)Weseethatthealgorithmworksreasonablywelldownto
pixels,butfor
pixelimagesitproducesafacethatappearstobeapieced-together
combinationofavarietyoffaces.Thisisnottoosurprisingbecausethepixelinputimage
isnotevenclearlyanimageofaface.(Manyfacedetectorssuchas[45]useinputwindowsof
25
(a)Input
(1of4images)
(b)Hallucinated
(c)SchultzandStevenson
(d)Original(e)CubicB-spline
Figure12:Selectedresultsfor
pixelimages,thesmallestinputsizeforwhichourhallucination
algorithmworksreliably.(Theinputconsistsofonly4lowresolutioninputimages.)NoticehowsharpthehallucinatedresultsarecomparedtotheinputandtheresultsfortheSchultzandStevenson[47]algorithm.(TheresultsforHardieetal.[26]aresimilartothoseforSchultzandStevensonandsoareomitted.)
26
CroppedCubic B-splineHallucinated
Figure13:ExampleresultsonasingleimagenotintheFERETdataset.Thefacialfeatures,suchaseyes,
nose,andmouth,whichareblurredandunclearintheoriginalcroppedface,areenhancedandappearmuchsharperinthehallucinatedimage.Incomparison,cubicB-splineinterpolationgivesoverlysmoothresults.
around
pixelssoitisunlikelythatthe
pixelimagewouldbedetectedasaface.)
InthelastrowofFigure11,wegivenumericalresultsoftheaverageimprovementintheRMSerrorovercubicB-splineinterpolation(computedoverthe100imagetestset.)Weseethatfor
and
pixelimages,thereductionintheerrorisverydramatic.Itisroughlyhalved.
Fortheothersizes,theresultsarelessimpressive,withtheRMSerrorbeingcutbyabout25%.For
pixelimages,thereasonisthatthehallucinationalgorithmisbeingtobreakdown.For
pixelimageareexcellent,however.(AlsoseeFigure12which
pixelimages,thereasonisthatcubicB-splinedoessowellthatitishardtodomuchbetter.
Theresultsforthe
containsseveralmoreexamples.)Theinputimagesarebarelyrecognizableasfacesandthefacialfeaturessuchastheeyes,eye-brows,andmouthsonlyconsistofahandfulofpixels.Theoutputs,albeitslightlynoisy,areclearlyrecognizabletothehumaneye.Thefacialfeaturesarealsoclearlydiscernible.ThehallucinatedresultsarealsoahugeimprovementoverSchultzandStevenson[47].4.7.4ResultsonNon-FERETTestImages
Inourfinalexperimentforhumanfaces,wetriedouralgorithmonanumberofimagesnotintheFERETdataset.InFigure13wepresenthallucinationresultsjustusingasingleinputimage.Ascanbeseen,thehallucinatedresultsareabigimprovementovercubicB-splineinterpolation.The
27
CroppedSchultz and StevensonHallucinated
Figure14:Exampleresultsonashortvideoofeightframes.(Onlyoneoftheinputimagesandthe
croppedlowresolutionfaceregionareshown.Theotherseveninputimagesaresimilarexceptthatthecameraisslightlytranslated.)TheresultsofthehallucinationalgorithmareslightlybetterthanthoseoftheSchultzandStevensonalgorithm,forexample,aroundtheeye-brows,aroundthefacecontour,andaroundthehairline.Thisimprovementisonlymarginalbecauseoftheharshilluminationconditions.Atpresent,theperformanceofourhallucinationalgorithmisverydependentuponsucheffects.
facialfeatures,suchastheeyes,nose,andmouthareallenhancedandappearmuchsharperinthehallucinatedresultthaneitherinthelowresolutioninputorintheinterpolatedimage.
InFigure14wepresentresultsonashorteightframevideo.Thefaceregionismarkedbyhardinthefirstframeandthentrackedovertheremainderofthesequence.OuralgorithmiscomparedwiththatofSchultzandStevenson.(TheresultsforHardieetal.aresimilarandsoareomitted.)Ouralgorithmmarginallyoutperformsbothreconstruction-basedalgorithms.Inparticular,theeye-brows,thefacecontour,andthehairlineareallalittlesharperinthehallucinatedresult.Theimprovementisquitesmall,however.Thisisbecausethehallucinationalgorithmiscurrentlyverysensitivetoilluminationconditionsandotherphotometriceffects.Weareworkingonmakingouralgorithmmorerobusttosucheffects,aswellasonseveralotherrefinements.4.7.5ResultsonImagesNotContainingFaces
InFigure15webrieflypresentafewresultsonimagesthatdonotcontainfaces,eventhoughthealgorithmhasbeentrainedontheFERETdataset.(Figure15(a)isarandomimage,(b)isamiscel-laneousimage,and(c)isaconstantimage.)Asmightbeexpected,ouralgorithmhallucinatesanoutlineofafaceinallthreecases,eventhoughthereisnofaceintheinput.Thisisthereasonwe
28
(a)RandomHallucinated(b)Misc.Hallucinated(c)ConstantHallucinated
Figure15:Theresultsofapplyingourhallucinationalgorithmtoimagesnotcontainingfaces.(Wehave
omittedthelowresolutioninputandhavejustdisplayedtheoriginalhighresolutionimage.)Asisevident,afaceishallucinatedbyouralgorithmevenwhennoneispresent,hencetheterm“hallucinationalgorithm.”
calledouralgorithma“hallucinationalgorithm.”(Thehallucinationalgorithmnaturallyperformsworseonimagesthatiswasnottrainedforthanreconstruction-basedalgorithmsdo.)
4.8ExperimentalResultsonTextData
Wealsoappliedouralgorithmtotextdata.Inparticular,wegrabbedanimageofanwindowdis-playingonepageofaletterandusedthebit-mapastheinput.Theimagewassplitintodisjointtrainingandtestsamples.(Thetrainingandtestdatathereforecontainthesamefont,areatthesamescale,andthedataisnoiseless.Thetrainingandtestdataarenotregisteredinanywayhow-ever.)TheresultsarepresentedinFigures16.TheinputinFigure16(a)ishalftheresolutionoftheoriginalinFigure16(f).ThehallucinatedresultinFigure16(c)isthebestreconstructionofthetext,bothvisuallyandintermsoftheRMSintensityerror.Forexample,comparetheappearanceoftheword“was”inthesecondsentenceofthetextinFigures16(b)–(f).Thehallucinationalgo-rithmalsohasanRMSerrorofonly24.5greylevels,comparedtoover48.0forthethreeotheralgorithms,almostafactoroftwoimprovement.
5Discussion
Inthefirsthalfofthispaperweshowedthatthesuper-resolutionreconstructionconstraintsprovidelessandlessusefulinformationasthemagnificationfactorincreases.Themajorcauseofthisphenomenonisthespatialaveragingoverthephotosensitivearea;i.e.thefactthat
isnon-zero.
Theunderlyingreasonthattherearelimitsonreconstruction-basedsuper-resolutionisthereforethesimplefactthatCCDsensorsmusthaveanon-zerophotosensitiveareainordertobeabletocaptureanon-zeronumberofphotonsoflight.
Ouranalysisassumesquantizednoiselessimages;i.e.theintensitiesare8-bitvalues,created
29
(a)InputImage.(Justoneimageisused.)
(b)CubicB-spline,RMSError51.3
(c)Hallucinated,RMSError24.5
(d)SchultzandStevenson,RMSError48.4
(e)Hardieetal.,RMSError48.5
(f)OriginalHighResolutionImage
Figure16:Theresultsofenhancingtheresolutionofapieceoftextbyafactorof2.(Justasingleinput
imageisused.)Ourhallucinationalgorithmproducesaclear,crispimageusingnoexplicitknowledgethattheinputcontainstext.Inparticular,lookattheword“was”inthesecondsentence.TheRMSpixelintensityerrorisalsoalmostafactorof2improvementovertheotheralgorithms.
30
byroundingnoiselessreal-valuednumbers.(Itisthisquantizationthatcausesthelossofinforma-tion,whichwhencombinedwithspatialaveraging,meansthathighmagnificationsuper-resolutionisnotpossiblefromthereconstructionconstraints.)Withoutthisassumption,however,itmightbepossibletoincreasethenumberofbitsperpixelbyaveragingacollectionofquantizednoisyim-ages(inanintelligentway).Inpractice,takingadvantageofsuchinformationisverydifficult.Thispointalsodoesnotaffectanotheroutcomeofouranalysiswhichwastoshowthatreconstruction-basedsuper-resolutioninherentlytrades-offintensityresolutionforspatialresolution.
Inthesecondhalfofthispaperweshowedthatrecognitionprocessesmayprovideanaddi-tionalsourceofinformationforsuper-resolutionalgorithms.Inparticular,wedevelopeda“hallu-cination”algorithmforsuper-resolutionanddemonstratedthatthisalgorithmcanobtainfarbetterresultsthanexistingreconstruction-basedsuper-resolutionalgorithms,bothvisuallyandintermsofRMSpixelintensityerror.Similarapproachesmayaidother(ie.3D)reconstructiontasks.Atthistime,however,ourhallucinationalgorithmisnotrobustenoughtobeusedontypicalsurveillancevideo.Besidesintegratingitwitha3Dheadtrackertoavoidtheneedformanualregistrationandtoremovetherestrictiontofrontalfaces,therobustnessofthealgorithmtoil-luminationconditionsmustbeimproved.ThislackofrobustnesstoilluminationcanbeseeninFigure14wheretheperformanceofouralgorithmonimagescapturedoutdoorsandinnovelil-luminationconditionsresultsinsignificantlylessimprovementoverexistingreconstruction-basedalgorithmsthanthatseeninsomeofourotherresults.(ThemostappropriatefiguretocompareFigure14withisFigure9.)Wearecurrentlyworkingontheseandotherrefinements.
Thetwohalvesofthispaperarerelatedinthefollowingsense.Bothhalvesareconcernedwithwheretheinformationcomesfromwhensuper-resolutionisperformedandhowstrongthatinformationis.Thefirsthalfinvestigateshowmuchinformationiscontainedinthereconstructionconstraintsandshowsthattheinformationcontentisfundamentallylimitedbythedynamicrangeoftheimages.Thesecondhalfdemonstratesthatstrongclass-basedpriorscanprovidefarmoreinformationthanthesimplesmoothnesspriorsthatareusedinexistingsuper-resolutionalgorithms.
Acknowledgements
WewishtothankHarryShumforpointingouttheworkofFreemanandPasztor[25],IainMatthewsforpointingouttheworkofEdwardsetal.[19],andHenrySchneidermanforsug-gestingweperformtheconditioninganalysisinSection3.2.Wewouldalsoliketothanknumer-ouspeopleforcommentsandsuggestions,includingTerryBoult,PeterCheeseman,MichalIrani,ShreeNayar,SteveSeitz,SundarVedula,andeveryoneintheFaceGroupatCMU.Finallywe
31
wouldliketothanktheanonymousreviewersfortheircommentsandsuggestions.TheresearchdescribedinthispaperwassupportedbyUSDODGrantMDA-904-98-C-A915.Apreliminaryversionofthispaper[4]appearedinJune2000intheIEEEConferenceonComputerVisionandPatternRecognition.Additionalexperimentalresultscanbefoundinthetechnicalreport[1].
References
[1]S.BakerandT.Kanade.Hallucinatingfaces.TechnicalReportCMU-RI-TR-99-32,The
RoboticsInstitute,CarnegieMellonUniversity,1999.[2]S.BakerandT.Kanade.Super-resolutionopticalflow.TechnicalReportCMU-RI-TR-99-36,
TheRoboticsInstitute,CarnegieMellonUniversity,1999.[3]S.BakerandT.Kanade.Hallucinatingfaces.InProceedingsoftheFourthInternational
ConferenceonAutomaticFaceandGestureRecognition,Grenoble,France,2000.[4]S.BakerandT.Kanade.Limitsonsuper-resolutionandhowtobreakthem.InProceedingsof
the2000IEEEConferenceonComputerVisionandPatternRecognition,HiltonHead,SouthCarolina,2000.[5]S.Baker,S.K.Nayar,andH.Murase.Parametricfeaturedetection.InternationalJournalof
ComputerVision,27(1):27–50,1998.[6]D.F.Barbe.Charge-CoupledDevices.Springer-Verlag,1980.
[7]B.Bascle,A.Blake,andA.Zisserman.Motiondeblurringandsuper-resolutionfroman
imagesequence.InProceedingsoftheFourthEuropeanConferenceonComputerVision,pages573–581,Cambridge,England,1996.[8]J.R.Bergen,P.Anandan,K.J.Hanna,andR.Hingorani.Hierarchicalmodel-basedmotion
estimation.InProceedingsoftheSecondEuropeanConferenceonComputerVision,pages237–252,SantaMargheritaLiguere,Italy,1992.[9]M.Berthod,H.Shekarforoush,M.Werman,andJ.Zerubia.Reconstructionofhighresolution
3Dvisualinformation.InProceedingsofthe1994ConferenceonComputerVisionandPatternRecognition,pages654–657,Seattle,WA,1994.[10]M.BornandE.Wolf.PrinciplesofOptics.PermagonPress,1965.
[11]P.J.Burt.Fastfiltertransformsforimageprocessing.ComputerGraphicsandImagePro-cessing,16:20–51,1980.[12]P.J.BurtandE.H.Adelson.TheLaplacianpyramidasacompactimagecode.IEEETrans-actionsonCommuniations,31(4):532–540,1983.
32
[13]P.Cheeseman,B.Kanefsky,R.Kraft,J.Stutz,andR.Hanson.Super-resolvedsurfacerecon-structionfrommultipleimages.TechnicalReportFIA-94-12,NASAAmesResearchCenter,MoffetField,CA,1994.[14]M.-C.ChiangandT.E.Boult.Imaging-consistentsuper-resolution.InProceedingsofthe
1997DARPAImageUnderstandingWorkshop,1997.[15]M.-C.ChiangandT.E.Boult.Localblurestimationandsuper-resolution.InProceedingsof
the1997ConferenceonComputerVisionandPatternRecognition,pages821–826,SanJuan,PuertoRico,1997.[16]J.S.DeBonet.Multiresolutionsamplingprocedureforanalysisandsynthesisoftexture
images.InComputerGraphicsProceedings,AnnualConferenceSeries,(SIGGRAPH’97),pages361–368,1997.[17]J.S.DeBonetandP.Viola.Texturerecognitionusinganon-parametricmulti-scalestatistical
model.InProceedingsofthe1998ConferenceonComputerVisionandPatternRecognition,pages641–647,SantaBarbara,CA,1998.[18]F.Dellaert,S.Thrun,andC.Thorpe.Jacobianimagesofsuper-resolvedtexturemapsfor
model-basedmotionestimationandtracking.InProceedingsoftheFourthWorkshoponApplicationsofComputerVision,pages2–7,Princeton,NJ,1998.[19]G.J.Edwards,C.J.Taylor,andT.F.Cootes.Learningtoidentifyandtrackfacesinimage
sequences.InProceedingsoftheThirdInternationalConferenceonAutomaticFaceandGestureRecognition,pages260–265,Nara,Japan,1998.[20]M.Elad.Super-ResolutionReconstructionofImageSequences-AdaptiveFilteringAp-proach.PhDthesis,TheTechnion-IsraelInstituteofTechnology,Haifa,Israel,1996.[21]M.EladandA.Feuer.Restorationofsinglesuper-resolutionimagefromseveralblurred,
noisyanddown-sampledmeasuredimages.IEEETransactionsonImageProcessing,6(12):1646–58,1997.[22]M.EladandA.Feuer.Super-resolutionreconstructionofimagesequences.IEEETransac-tionsonPatternAnalysisandMachineIntelligence,21(9):817–834,1999.[23]M.EladandA.Feuer.Super-resolutionrestorationofanimagesequence-adaptivefiltering
approach.IEEETransactionsonImageProcessing,8(3):387–395,1999.[24]W.T.FreemanandE.H.Adelson.Thedesignanduseofsteerablefilters.IEEETransactions
onPatternAnalysisandMachineIntelligence,13:891–906,1991.[25]W.T.Freeman,E.C.Pasztor,andO.T.Carmichael.Learninglow-levelvision.International
JournalofComputerVision,20(1):25–47,2000.
33
[26]R.C.Hardie,K.J.Barnard,andE.E.Armstrong.JointMAPregistrationandhigh-resolution
imageestimationusingasequenceofundersampledimages.IEEETransactionsonImageProcessing,6(12):1621–1633,1997.[27]B.K.P.Horn.RobotVision.McGrawHill,1996.
[28]T.S.HuangandR.Tsai.Multi-frameimagerestorationandregistration.AdvancesinCom-puterVisionandImageProcessing,1:317–339,1984.[29]M.IraniandS.Peleg.Improvingresolutionbyimagerestoration.ComputerVision,Graphics,
andImageProcessing,53:231–239,1991.[30]M.IraniandS.Peleg.Motionanalysisforimageenhancement:Resolution,occulsion,and
transparency.JournalofVisualCommunicationandImageRepresentation,4(4):324–335,1993.[31]M.Irani,B.Rousso,andS.Peleg.Imagesequenceenhancementusingmultiplemotionsanal-ysis.InProceedingsofthe1992ConferenceonComputerVisionandPatternRecognition,pages216–221,Urbana-Champaign,Illinois,1992.[32]D.Keren,S.Peleg,andR.Brada.Imagesequenceenhancementusingsub-pixeldisplace-ments.InProceedingsofthe1988ConferenceonComputerVisionandPatternRecognition,pages742–746,AnnArbor,Michigan,1988.[33]S.Kim,N.Bose,andH.Valenzuela.Recursivereconstructionofhighresolutionimage
fromnoisyundersampledmultiframes.IEEETransactionsonAcoustics,Speech,andSignalProcessing,38:1013–1027,1990.[34]S.KimandW.-Y.Su.Recursivehigh-resolutionreconstructionofblurredmultiframeimages.
IEEETransactionsonImageProcessing,2:534–539,1993.[35]S.MannandR.W.Picard.Virtualbellows:Constructinghighqualitystillsfromvideo.
InProceedingsoftheFirstInternationalConferenceonImageProcessing,pages363–367,Austin,TX,1994.[36]V.S.Nalwa.AGuidedTourofComputerVision.Addison-Wesley,1993.
[37]T.Numnonda,M.Andrews,andR.Kakarala.Highresolutionimagereconstructionbysim-ulatedannealing.ImageandVisionComputing,11(4):213–220,1993.[38]A.Patti,M.Sezan,andA.Tekalp.Superresolutionvideoreconstructionwitharbitrarysam-plinglaticesandnonzeroaperturetime.IEEETransactionsonImageProcessing,6(8):1064–1076,1997.[39]S.Peleg,D.Keren,andL.Schweitzer.Improvingimageresolutionusingsubpixelmotion.
PatternRecognitionLetters,pages223–226,1987.
34
[40]P.J.Philips,H.Moon,P.Rauss,andS.A.Rizvi.TheFERETevaluationmethodologyfor
face-recognitionalgorithms.InCVPR’97,1997.[41]W.K.Pratt.DigitalImageProcessing.Wiley-Interscience,1991.
[42]W.H.Press,S.A.Teukolsky,W.T.Vetterling,andB.P.Flannery.NumericalRecipesinC.
CambridgeUniversityPress,secondedition,1992.[43]H.QiandQ.Snyder.Conditioninganalysisofmissingdataestimationforlargesensorarrays.
InProceedingsofthe2000IEEEConferenceonComputerVisionandPatternRecognition,HiltonHead,SouthCarolina,2000.[44]T.Riklin-RavivandA.Shashua.TheQuotientimage:Classbasedrecognitionandsynthesis
undervaryingillumination.InProceedingsofthe1999ConferenceonComputerVisionandPatternRecognition,pages566–571,FortCollins,CO,1999.[45]H.A.Rowley,S.Baluja,andT.Kanade.Neuralnetwork-basedfacedetection.IEEETrans-actionsonPatternAnalysisandMachineIntelligence,20(1):23–38,January1998.[46]R.SchultzandR.Stevenson.ABayseianapproachtoimageexpansionforimproveddefini-tion.IEEETransactionsonImageProcessing,3(3):233–242,1994.[47]R.SchultzandR.Stevenson.Extractionofhigh-resolutionframesfromvideosequences.
IEEETransactionsonImageProcessing,5(6):996–1011,1996.[48]H.Shekarforoush.Conditioningboundsformulti-framesuper-resolutionalgorithms.Tech-nicalReportCAR-TR-912,ComputerVisionLaboratory,CenterforAutomationResearch,UniversityofMaryland,1999.[49]H.Shekarforoush,M.Berthod,J.Zerubia,andM.Werman.Sub-pixelbayesianestimation
ofalbedoandheight.InternationalJournalofComputerVision,19(3):289–300,1996.[50]V.Smelyanskiy,P.Cheeseman,D.Maluf,andR.Morris.Bayesiansuper-resolvedsurface
reconstructionfromimages.InProceedingsofthe2000IEEEConferenceonComputerVisionandPatternRecognition,HiltonHead,SouthCarolina,2000.[51]H.StarkandP.Oskoui.High-resolutionimagerecoveryfromimage-planearrays,using
convexprojections.JournaloftheOpticalSocietyofAmericaA,6:1715–1726,1989.[52]R.SzeliskiandP.Golland.Stereomatchingwithtransparencyandmatting.InSixthInterna-tionalConferenceonComputerVision(ICCV’98),pages517–524,Bombay,1998.[53]H.UrandD.Gross.Improvedresolutionfromsubpixelshiftedpictures.ComputerVision,
Graphics,andImageProcessing,54(2):181–186,1992.
35
因篇幅问题不能全部显示,请点此查看更多更全内容