Shane Legg and Marcus Hutter
IDSIA, Galleria 2, Manno-Lugano 6928, Switzerland
{shane,marcus}@idsia.ch
A fundamental difficulty in artificial intelligence is that nobody really knows what intelligence is, especially for systems with senses, environments, motivations and cognitive capacities which are very different to our own. In our work we take a mainstream informal perspective on intelligence and formalise and generalise this using the reinforcement learning framework and algorithmic complexity theory. The resulting formal definition of intelligence has many interesting properties and has received attention in both the academic [4, 5] and popular press [2, 1].
Although there is no strict consensus among experts over the definition of intelligence for humans, most definitions share many key features. In all cases, intelligence is a property of an entity, which we will call the agent, that interacts with an external problem or situation, which we will call the environment. An agent’s intelligence is typically related to its ability to succeed with respect to one or more objectives, which we will call the goal. The emphasis on learning, adaptation and flexibility common to many definitions implies that the environment is not fully known to the agent. Thus true intelligence requires the ability to deal with a wide range of possibilities, not just a few specific situations. Putting these things together gives us our informal definition: Intelligence measures an agent’s general ability to achieve goals in a wide range of environments. We are confident that this definition captures the essence of many common perspectives on intelligence. It also describes what we would like to achieve in machines: A very general capacity to adapt and perform well in a wide range of situations.
To formalise this we combine the extremely flexible reinforcement learning framework with algorithmic complexity theory. In reinforcement learning the agent sends its actions to the environment and receives observations and rewards back. The agent tries to maximise the amount of reward it receives by learning about the structure of the environment and the goals it needs to accomplish in order to receive rewards. To denote symbols being sent we will use the lower case variable names o, r and a for observations, rewards and actions respectively. The process of interaction produces an increasing history of observations, rewards and actions, $o_1 r_1 a_1 o_2 r_2 a_2 o_3 r_3 a_3 o_4 \ldots$. The agent is simply a function, denoted by π, which is a probability measure over actions conditioned on the current history, for example, $\pi(a_2 \mid o_1 r_1 a_1 o_2 r_2)$. How the agent generates this distribution over actions is left completely open; for example, agents are not required to be Turing computable.
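As an illustration of an agent viewed as a conditional distribution over actions, here is a minimal Python sketch. The action set, the flat-tuple encoding of the history and the `win_stay_policy` rule are all assumptions made for this example; the text itself leaves the agent's internals completely open.

```python
import random

# Illustrative assumption: a two-action agent. The history is a flat tuple
# (o1, r1, a1, o2, r2, ...) that ends with the latest observation and reward.
ACTIONS = ("left", "right")


def win_stay_policy(history):
    """Return pi(. | history) as a dict mapping each action to its probability.

    Hypothetical rule: if the latest reward is positive, put probability 0.9
    on the previous action; otherwise choose uniformly.
    """
    if len(history) >= 5 and history[-1] > 0:
        last_action = history[-3]  # the action taken just before the latest (o, r)
        others = [a for a in ACTIONS if a != last_action]
        probs = {a: 0.1 / len(others) for a in others}
        probs[last_action] = 0.9
        return probs
    return {a: 1.0 / len(ACTIONS) for a in ACTIONS}


def sample_action(policy, history, rng):
    """Draw one action from the distribution pi(. | history)."""
    probs = policy(history)
    actions = list(probs)
    return rng.choices(actions, weights=[probs[a] for a in actions], k=1)[0]


if __name__ == "__main__":
    history = ("o1", 0.0, "left", "o2", 0.5)
    print(win_stay_policy(history))
    print(sample_action(win_stay_policy, history, random.Random(0)))
```

Any function with the same shape, history in and distribution over actions out, counts as an agent in this framework, whether or not it is a sensible one.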
The environment, denoted µ, is similarly defined: $\forall k \in \mathbb{N}$ the probability of $o_k r_k$, given the current history, is $\mu(o_k r_k \mid o_1 r_1 a_1 o_2 r_2 a_2 \ldots o_{k-1} r_{k-1} a_{k-1})$. As we desire an extremely general definition of intelligence for arbitrary systems, our space of environments should be as large as possible. An obvious choice is the space of all probability measures; however, this causes serious problems as we cannot even describe some of these measures in a finite way. The solution is to require the measures to be computable. This allows for an infinite space of possible environments with no bound on their complexity. It also permits environments which are non-deterministic as it is only their probability distributions which need to be computable. Additionally we bound the total reward by 1 to ensure that the future value $V^{\pi}_{\mu} := \mathbf{E}\left(\sum_{i=1}^{\infty} r_i\right)$ is finite. This space, denoted E, appears to be the largest useful space of environments.
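The following Python sketch shows one way the expected total reward of an agent in a single environment could be estimated by simulation. Everything concrete here is an assumption for illustration: the two-action `toy_environment`, the finite `HORIZON` standing in for the infinite interaction, and the choice to scale the reward at step k by 2^-k so that the total reward can never exceed 1.

```python
import random

HORIZON = 10  # assumed finite truncation of the infinite interaction


def toy_environment(action, step, rng):
    """Hypothetical environment: action 1 succeeds with probability 0.8,
    action 0 with probability 0.2. A success at step k pays 2**-k, so the
    rewards of a whole episode sum to less than 1."""
    p_success = 0.8 if action == 1 else 0.2
    success = rng.random() < p_success
    reward = 2.0 ** -step if success else 0.0
    return int(success), reward  # (observation, reward)


def always(action):
    """Build a trivial agent that ignores the history and repeats one action."""
    def policy(history, rng):
        return action
    return policy


def run_episode(policy, rng):
    """Run one agent-environment interaction and return the total reward."""
    history = [0, 0.0]  # initial observation o1 and reward r1
    total = 0.0
    for step in range(1, HORIZON + 1):
        action = policy(history, rng)
        observation, reward = toy_environment(action, step, rng)
        history.extend([action, observation, reward])
        total += reward
    return total


def estimate_value(policy, episodes=2000, seed=0):
    """Monte Carlo estimate of the expected total reward of the policy."""
    rng = random.Random(seed)
    return sum(run_episode(policy, rng) for _ in range(episodes)) / episodes


if __name__ == "__main__":
    print("always 1:", estimate_value(always(1)))
    print("always 0:", estimate_value(always(0)))
```

In this toy setting the exact values are 0.8 (1 - 2^-10) and 0.2 (1 - 2^-10), so the two estimates should land near 0.8 and 0.2 respectively.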
We want to compute the general performance of an agent in unknown environments. As there are an infinite number of environments, we cannot simply take an expected value with respect to a uniform distribution; we must weight some environments more heavily than others. If we consider the agent’s perspective on the problem, it is the same as asking: Given several different hypotheses which are consistent with the observations, which hypothesis should be considered the most likely? This is a fundamental problem in inductive inference for which the standard solution is to invoke Occam’s razor: Given multiple hypotheses which are consistent with the data, the simplest should be preferred. As this is generally considered the most intelligent thing to do, we should test agents in such a way that they are, at least on average, rewarded for correctly applying Occam’s razor. This means that our a priori distribution over environments should be weighted towards simpler environments.
As each environment is described by a computable measure, we can measure the complexity of these in the standard way by considering their Kolmogorov complexity. Specifically, if U is a prefix universal Turing machine then the Kolmogorov complexity of an environment µ is the length of the shortest program on U that computes µ, formally $K(\mu) := \min_p \{ l(p) : U(p) = \mu \}$. We can now define the universal intelligence of an agent π to simply be its expected performance,
$$\Upsilon(\pi) := \sum_{\mu \in E} 2^{-K(\mu)} V^{\pi}_{\mu}.$$
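To make the shape of this weighted sum concrete, here is a small Python sketch over a finite list of environments. It is only a toy: Kolmogorov complexity is not computable, so the `k_bits` description lengths below are made-up stand-ins, and the per-environment values for the two hypothetical agents ("general" and "specialist") are invented for illustration rather than measured.

```python
# Each entry pairs an ASSUMED description length in bits (a stand-in for the
# incomputable K) with invented values in [0, 1] for two hypothetical agents.
ENVIRONMENTS = [
    {"name": "constant reward", "k_bits": 2,
     "value": {"general": 0.9, "specialist": 0.1}},
    {"name": "small MDP", "k_bits": 4,
     "value": {"general": 0.8, "specialist": 0.1}},
    {"name": "complex board game", "k_bits": 12,
     "value": {"general": 0.3, "specialist": 1.0}},
]


def truncated_upsilon(agent, environments):
    """Finite analogue of the sum: each environment's value for the agent,
    weighted by 2 ** -(description length)."""
    return sum(2.0 ** -env["k_bits"] * env["value"][agent]
               for env in environments)


if __name__ == "__main__":
    for agent in ("general", "specialist"):
        print(agent, truncated_upsilon(agent, ENVIRONMENTS))
```

Under these invented numbers the specialist's perfect score in the most complex environment contributes very little, because that environment carries a weight of only 2^-12, while the general agent's moderate scores in the simple environments dominate the sum.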
It is clear by construction that universal intelligence measures the general ability of an agent to perform well in a very wide range of environments, as required by our informal definition of intelligence given earlier. The definition places no restrictions on the internal workings of the agent; it only requires that the agent is capable of generating output and receiving input which includes a reward signal. Universal intelligence also reflects Occam’s razor in a natural way, much like standard intelligence tests for humans, which define the correct answer to a question to be the simplest consistent with the given information.
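One consequence of the definition that may be worth spelling out is that universal intelligence is itself bounded. The short derivation below is a sketch under two assumptions not stated explicitly in the text: that rewards are non-negative, and that the standard Kraft inequality is applied to the prefix-free set of programs on which U halts. Distinct environments have distinct shortest programs, so the weights sum to at most 1.

```latex
\begin{align*}
0 \le V^{\pi}_{\mu} &\le 1 \quad \text{for every } \mu \in E, \\
\sum_{\mu \in E} 2^{-K(\mu)} &\le \sum_{p \,:\, U(p) \text{ halts}} 2^{-l(p)} \le 1, \\
\text{hence} \quad 0 \le \Upsilon(\pi) &= \sum_{\mu \in E} 2^{-K(\mu)} V^{\pi}_{\mu} \le 1 .
\end{align*}
```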
By considering $V^{\pi}_{\mu}$ for a number of basic environments, such as small MDPs, and agents with simple but very general optimisation strategies, it is clear that Υ correctly orders the relative intelligence of these agents in a natural way. If we consider a highly specialised agent, for example IBM’s Deep Blue chess supercomputer, then we can see that this agent will be ineffective outside of one very specific environment, and thus would have a low universal intelligence value. This is consistent with our view of intelligence as being a highly adaptable and general ability.
A very high value of Υ would imply that an agent is able to perform well in many environments. Such a machine would obviously be of large practical significance. The maximal agent with respect to Υ is the theoretical AIXI agent which has been shown to have many strong optimality properties, including being self-optimising in all environments in which this is at all possible for a general agent [3]. Such results confirm the fact that agents with high universal intelligence are very powerful and adaptable.
Universal intelligence spans simple adaptive agents right up to super intelligent agents like AIXI, unlike the pass-fail Turing test which is useful only for agents with near human intelligence. Furthermore, the Turing test cannot be fully formalised as it is based on subjective judgements. Perhaps an even bigger problem is that the Turing test is highly anthropocentric; indeed, many have suggested that it is really a test of humanness rather than intelligence. Universal intelligence does not have these problems as it is formally specified in terms of the more fundamental concept of complexity.
References
[1] C. Fiévet. Mesurer l’intelligence d’une machine. In Le Monde de l’intelligence, volume 1, pages 42–45, Paris, November 2005. Mondeo publishing.
[2] D. Graham-Rowe. Spotting the bots with brains. In New Scientist magazine, volume 2512, page 27, 13 August 2005.
[3] M. Hutter. Universal Artificial Intelligence: Sequential Decisions based on Algorithmic Probability. Springer, Berlin, 2004. 300 pages, http://www.idsia.ch/~marcus/ai/uaibook.htm.
[4] S. Legg and M. Hutter. A universal measure of intelligence for artificial agents. In Proc. 21st International Joint Conf. on Artificial Intelligence (IJCAI-2005), Edinburgh, 2005.
[5] S. Legg and M. Hutter. A formal measure of machine intelligence. In Proc. Annual machine learning conference of Belgium and The Netherlands (Benelearn-2006), Ghent, 2006.