William E. Hart, Sandia National Labs, P.O. Box 5800, MS 1110, Albuquerque, NM 87185. wehart@cs.sandia.gov
Scott Baden and Richard K. Belew, Computer Science & Eng'g. Dept., U.C. San Diego, La Jolla, CA 92093. baden,rik@cs.ucsd.edu
Abstract
This paper examines the effects of relaxed synchronization on both the numerical and parallel efficiency of parallel genetic algorithms (GAs). We describe a coarse-grain geographically structured parallel genetic algorithm. Our experiments provide preliminary evidence that asynchronous versions of these algorithms have a lower run time than synchronous GAs. Our analysis shows that this improvement is due to (1) decreased synchronization costs and (2) high numerical efficiency (e.g., fewer function evaluations) for the asynchronous GAs. This analysis includes a critique of the utility of traditional parallel performance measures for parallel GAs.
1. Introduction
Genetic algorithms (GAs) are stochastic search algorithms that have been successfully applied to a variety of optimization problems [5]. Unlike most other optimization procedures, GAs maintain a population of individuals (a set of solutions) that are competitively selected to generate new candidates for the global optima. Parallel GAs have been developed for a variety of parallel architectures [4, 7, 14]. These GAs have been used to solve a variety of difficult optimization problems, and the algorithmic modifications needed to efficiently parallelize GAs have provided insight into the dynamics of GAs' search [6].
Several arguments are commonly used to justify parallelism in GAs. First, parallelism can reduce the cost of function evaluations, which may dwarf other costs in the
Scott Kohn, Dept. of Chemistry, U.C. San Diego, La Jolla, CA 92093. skohn@chem.ucsd.edu
2. Background
2.1. Genetic Algorithms
This paper considers GAs that minimize functions of the form f : D -> R, where D is a compact subset of R^n. The aim is to find x* such that f(x*) <= f(x) for all x in D. Let x_i(t) be the i-th solution in the population at iteration t. Figure 1 describes the general process of a GA.
Competitive selection determines which solutions in the current population will be used to generate solutions for the next iteration. A standard selection mechanism is proportional selection, which stochastically selects individual x_i with probability f(x_i) / sum_j f(x_j), where x_i is the i-th individual and f(x_i) is the fitness of the individual, the value of the objective function at x_i. This selection method assumes that the GA is maximizing f and that f(x) > 0 for all x, but it can be easily modified to perform selection when minimizing or when the fitnesses are negative. A related selection mechanism is rank selection, which stochastically selects x_i with a probability that is linearly related to the rank of x_i in the population.
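As an illustrative sketch (not the implementation used in our experiments), rank selection for minimization might be realized as follows; the `bias` parameter and helper name are assumptions:

```python
import random

def rank_select(population, fitnesses, bias=1.5):
    """Select one individual with probability linearly related to its rank.
    Minimization: lower fitness is better. The best-ranked individual gets
    weight `bias`, the worst gets `2 - bias` (linear ranking)."""
    n = len(population)
    # order[0] is the index of the best (lowest-fitness) individual
    order = sorted(range(n), key=lambda i: fitnesses[i])
    # linear weights over ranks r = 0 .. n-1
    weights = [bias - (2 * bias - 2) * r / (n - 1) for r in range(n)]
    idx = random.choices(order, weights=weights, k=1)[0]
    return population[idx]
```

Because selection depends only on ranks, this mechanism is unaffected by negative fitness values, which is one reason rank selection is convenient for minimization problems like the one studied here.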
Initialize population (with randomly generated solutions)
Repeat
    1. Evaluate solutions in the population
    2. Perform competitive selection
    3. Apply genetic operators
Until convergence criteria satisfied

Figure 1. Pseudo-code for a genetic algorithm.
New individuals are generated by applying genetic operators that take one or more individuals from the population and generate one or more new individuals. The two operators commonly used in GAs are mutation and crossover. Figure 2 illustrates the types of new solutions generated by these two operators. Crossover combines pieces of two solutions to generate a new solution, and it is typically applied with high frequency. Mutation makes a small change to a solution, and it is typically used with low frequency. For example, mutation might add a normally distributed random value to a single dimension of a solution.
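For real-valued solutions like those in Figure 2, the two operators might look like the following sketch (the Gaussian step size and the one-point crossover form are assumptions, not the exact operators of our experiments):

```python
import random

def mutate(x, sigma=0.5):
    """Add a normally distributed random value to one randomly
    chosen dimension of solution `x`."""
    y = list(x)
    i = random.randrange(len(y))
    y[i] += random.gauss(0.0, sigma)
    return y

def crossover(a, b):
    """One-point crossover: combine a prefix of `a` with a suffix of `b`
    to generate a new solution."""
    point = random.randrange(1, len(a))
    return list(a[:point]) + list(b[point:])
```

Applying both operators in sequence, as in Figure 2(c), is simply `mutate(crossover(a, b))`.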
2.2. Geographically Structured Genetic Algorithms
GAs can be distinguished by the manner in which competitive selection is performed. Geographically structured GAs (GSGAs) perform a structured selection in which solutions compete against a fixed subset of the population that
Figure 2. Examples of how mutation and crossover generate new solutions. Colored bars illustrate which subsections of the solutions are pieced together and modified: (a) solution generated by mutation, (b) solution generated by crossover, and (c) solution generated by both mutation and crossover.
partially overlaps with subsets used by other individuals. Thus the crossover operator is applied to two individuals selected from a subset of the population. Panmictic GAs perform a global competitive selection that allows any pair of individuals to be selected for the crossover operator. An example of panmictic GAs are those that use proportional selection.
Parallel GSGAs are typically implemented as fine-grain algorithms in which each processor is responsible for evaluating and processing a single individual. Among fine-grain GSGAs, two classes of algorithms can be distinguished: GSGAs designed for MIMD architectures [7] and GSGAs designed for SIMD architectures [4, 11, 14]. The method of structuring the competitive selection differs between these two classes based on differences in the hardware that are exploited by these algorithms to achieve high utilization. For example, the most common method of structuring the competitive selection for SIMD fine-grain GSGAs uses a toroidal grid like the one in Figure 3 [4, 11, 14]. Every individual in the population is assigned to a location on the grid in an arbitrary manner.
The grid is used to determine subsets of solutions that compete for a location on the grid. There are distinct notions of locality with respect to the population grid and the search space; neighbors on the grid may represent solutions that are quite different. Two methods have been used to determine these subsets of solutions in GSGAs: (1) fixed-size neighborhoods have been used to define the set of individuals about a given point on the grid, and (2) random walks have been used to stochastically sample neighboring locations on the grid. For example, the dashed lines adjacent to point A in Figure 3 indicate a subset of individuals that might compete to replace the individual at point A. This subset is called the neighborhood of point A. The dashed lines adjacent to point B illustrate the toroidal nature of neighborhoods on
Figure 3. A two-dimensional grid used by GSGAs to define population subsets.
this grid. We call a neighborhood like that shown in Figure 3 a NEWS neighborhood, since it only uses neighbors to the North, East, West and South.
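The NEWS neighborhood with toroidal wraparound reduces to modular arithmetic on the grid indices, as this small sketch illustrates:

```python
def news_neighbors(i, j, rows, cols):
    """Return the North, East, West, South neighbors of cell (i, j)
    on a toroidal rows x cols population grid; indices wrap around
    the edges, which is what makes the grid toroidal."""
    return [((i - 1) % rows, j),   # north
            (i, (j + 1) % cols),   # east
            (i, (j - 1) % cols),   # west
            ((i + 1) % rows, j)]   # south
```

A cell on the edge of the grid (like point B in Figure 3) thus has neighbors on the opposite edge, so every cell sees exactly four neighbors.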
Recently, Hart [8] has described a coarse-grain design for parallel GSGAs. Like SIMD GSGAs, this parallel GA uses a two-dimensional toroidal grid that is distributed across the set of available processors. Thus each processor typically processes a set of solutions that is located on a subgrid of the entire grid. Communication of grid elements is performed to update the neighborhoods of individuals whose neighborhoods span two (or more) processors.
3. Methods
3.1. Parallel Design of Coarse-Grain GSGAs
Coarse-grain GSGAs use a toroidal, two-dimensional population grid that is distributed across processors. We examine GSGAs using HPF-style block decompositions [2]. Communication between processors is required to (1) perform selection and recombination, and (2) check for termination signals. Because the grid is toroidal, periodic boundary conditions are required. Hence, every processor communicates with the same number of processors. Each processor may terminate independently by satisfying a termination criterion (e.g., by exceeding a specified time limit or finding a solution with a specified fitness). Communication is required to terminate all of the processors when any of them satisfies the termination criterion.
Performing selection and recombination at a given location on the grid requires access to the fitness and genotypes of neighboring individuals, which may be located on other processors. Fixed-size neighborhoods are used in our coarse-grain GSGAs, which lets us predetermine the size of the border regions that need to be communicated between processors. The number of border regions that need to be communicated depends on the structure of the neighborhoods.
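With a NEWS neighborhood, the border region of each processor's subgrid is just the one-cell-wide strip along each edge. A sketch of extracting those strips (illustrative names; this is not the LPARX interface):

```python
def border_strips(subgrid):
    """Return the four width-1 border strips of a rectangular subgrid
    (a list of lists) that neighboring processors need in order to
    perform NEWS-neighborhood selection along the shared edges."""
    return {
        'north': list(subgrid[0]),            # first row
        'south': list(subgrid[-1]),           # last row
        'west':  [row[0] for row in subgrid], # first column
        'east':  [row[-1] for row in subgrid],# last column
    }
```

Because the neighborhood size is fixed, these strips have a known size, so the communication buffers can be allocated once rather than renegotiated each iteration.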
Coarse-grain GSGAs can be distinguished by the manner in which interprocess communication is performed: (globally) synchronized or asynchronous. Synchronized GSGAs guarantee that all border regions for all of the processors have been communicated before any of the processors can proceed in the next iteration. By comparison, the asynchronous algorithm does not require that all border regions be filled prior to starting the next iteration [3]. Asynchronous GSGAs will probably have a higher utilization of the parallel hardware, but processors may frequently be using border regions that are inconsistent with the state of the neighboring processors.
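The difference between the two styles can be caricatured with a message queue (a conceptual sketch, not the LPARX mechanism): the synchronous version blocks until every neighbor's border has arrived, while the asynchronous version drains only what is already there and keeps stale borders for the rest.

```python
import queue

def receive_borders(inbox, cached, n_neighbors, synchronous):
    """Update cached border regions from `inbox`, a queue.Queue of
    (neighbor_id, border) messages. Synchronous mode waits for all
    neighbors; asynchronous mode takes only what has arrived, so
    `cached` may retain stale borders from earlier iterations."""
    if synchronous:
        received = set()
        while len(received) < n_neighbors:
            nid, border = inbox.get()             # block until every neighbor reports
            cached[nid] = border
            received.add(nid)
    else:
        while True:
            try:
                nid, border = inbox.get_nowait()  # take only available messages
            except queue.Empty:
                break
            cached[nid] = border
    return cached
```

The asynchronous branch never blocks, which is the source of both the higher hardware utilization and the possibly inconsistent border regions noted above.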
3.2. Experimental Design
The GSGAs used in our experiments are similar to those described by Hart [8]. Mutation was performed by adding a value from a normally distributed random variable to a coordinate of a solution. We used a minimal neighborhood that included the two immediate neighbors along each dimension of the population grid (see Figure 3). The crossover rate was 0.8 and the mutation rate was 0.01. Rank selection was used to competitively select individuals within each neighborhood. Following Gordon and Whitley [6], we use the best individual in the neighborhood as one of the parents when performing crossover. Additional checks were added to avoid repeating a function evaluation on individuals that were not modified by the crossover or mutation operations.
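The re-evaluation check amounts to caching a fitness with each genotype and recomputing only when an operator has changed it; a minimal sketch (the helper name and cache shape are assumptions):

```python
def evaluate(f, individual, cache):
    """Evaluate `f` only for genotypes not seen before.
    `cache` maps a hashable genotype (here, a tuple of the solution's
    coordinates) to its previously computed fitness."""
    key = tuple(individual)
    if key not in cache:
        cache[key] = f(individual)  # only unmodified individuals hit the cache
    return cache[key]
```

Since function evaluations dominate the run time in our experiments (Section 4), avoiding redundant evaluations directly reduces the total cost.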
Communication in our parallel GSGAs was coded using LPARX [10]. LPARX provides distributed 2D grid classes that handle block copies and global synchronization. LPARX was extended to create labeled block copies that are used to perform asynchronous communication.
The function that is optimized in our experiments is the two-dimensional molecular conformation problem described by Judson et al. [9]. This problem concerns a molecule composed of a chain of identical atoms that are connected with rigid rods of length one. The energy of the molecule is modeled with van der Waals forces. The energy equation used by Judson et al. is

    E = sum_{j > i+1} ( 1/r_ij^12 - 2/r_ij^6 ),

where r_ij is the distance between atoms i and j. A solution with energy 700 was found.
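This energy can be sketched for a chain parameterized by bond angles (a reading of Judson et al.'s equation as a standard Lennard-Jones-style pairwise sum over nonbonded atoms; the parameterization is an assumption):

```python
import math

def chain_energy(angles):
    """Energy of a 2-D chain of atoms joined by unit-length rigid rods.
    `angles[k]` is the absolute direction of rod k. Nonbonded pairs
    (|i - j| >= 2) interact via 1/r^12 - 2/r^6."""
    # build atom positions by walking along the unit-length rods
    pos = [(0.0, 0.0)]
    for a in angles:
        x, y = pos[-1]
        pos.append((x + math.cos(a), y + math.sin(a)))
    energy = 0.0
    for i in range(len(pos)):
        for j in range(i + 2, len(pos)):  # skip bonded (adjacent) pairs
            r2 = (pos[i][0] - pos[j][0]) ** 2 + (pos[i][1] - pos[j][1]) ** 2
            energy += r2 ** -6 - 2.0 * r2 ** -3  # 1/r^12 - 2/r^6 in terms of r^2
    return energy
```

Each pair term is minimized at separation r = 1, so low-energy conformations fold the chain so that nonbonded atoms sit near unit distance from each other, which is what makes the problem a difficult global optimization.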
4. Performance Analysis
There are several properties of parallel GAs that make the evaluation of their performance particularly challenging.

Randomization. GAs are randomized algorithms that rely on random number generators to determine the steps that they take. Consequently, the results of a parallel GA for two different sets of processors are not strictly comparable, because the number of processors affects the sequence of algorithmic steps used by the algorithm. However, we conjecture that the expected performance of synchronized parallel GAs should not be affected. The expected behavior of synchronous parallel GAs does not depend upon the random number generators themselves, but on the distribution of search behaviors that they induce. This distribution is independent of the way in which the randomization is provided. It also seems reasonable that asynchronous parallel GAs are not affected by the different random number generators, but in this case the delays in interprocessor communication may influence the expected performance of parallel GAs in subtle ways.
Termination Criteria. A weakness of GAs is the lack of a stopping criterion that can reliably terminate the algorithm near a local or global optimum. Consequently, researchers typically terminate GAs after a fixed number of iterations or when the distribution of solutions in the population appears to have converged near a single point (total convergence typically never occurs due to the effects of mutation). These rules terminate GAs with solutions whose fitnesses vary. Consequently, researchers typically report the mean and standard deviation of the fitnesses of the best individuals found.
Performance measures for parallel algorithms are typically defined in terms of the cost required to find a solution with a given accuracy. Thus the performance analysis typically performed for GAs is inconsistent with the performance measures used for parallel algorithms. To analyze the performance of parallel GAs it is necessary to run the algorithm until a given accuracy is achieved, since this is the only way of comparing results when the number of iterations varies with the number of processors.
Synchronization. An interesting feature of many parallel GAs is that, strictly speaking, synchronization is not required. The reason is that the search performed by each processor can proceed independent of the searches performed by the other processors.
Error bars in all figures show 90% confidence intervals for the data.
remains fixed as the number of processors changes, these results may be comparable for different processor sizes if our conjecture concerning the effect of randomization in Section 4 is true. We tested this conjecture by comparing the distribution of the total number of function evaluations used on all processors for each of the processor sizes, using a method of multiple comparisons (the GH procedure [16]). This test confirmed that the expected performance of the synchronized parallel GAs is not affected by randomization at the 5% significance level.
Figure 4. Performance of synchronous and asynchronous coarse-grain GSGAs: (a) plot of E[T_P] and (b) plot of E[p T_P], for P = 4, 16, 256.
Figure 4b shows that as the number of processors increases, the run time decreases almost linearly for the asynchronous GSGAs and decreases sublinearly for synchronous GSGAs. Figure 5a graphs the communication overhead, the fraction of the total time spent in the communication subroutine (including load imbalance costs due to different execution times among the processors). This graph shows that synchronization costs are one factor that affects the difference in run times for parallel GSGAs.
Figure 5. Two factors affecting the relative run times of synchronous and asynchronous GSGAs: (a) communication overhead and (b) comparison of the total number of function evaluations used during optimization (across all processors).
The difference between the run times of asynchronous and synchronous GSGAs can also be attributed to differences in the numerical efficiency of the synchronous and asynchronous algorithms. Figure 5b graphs the total number of function evaluations used by all processors during the optimization runs. This figure clearly shows that the asynchronous GA terminates after fewer function evaluations than the synchronous GA. The total number of function evaluations is a reasonable metric for evaluating numerical efficiency in our experiments because the function evaluations account for 70-90% of the total time across all processors. An inspection of our experimental results shows that the reduction in the number of function evaluations shown in Figure 5b accounts for a substantial fraction of the relative improvement between the synchronous and asynchronous GSGAs.
One possible explanation of the improved performance of the asynchronous GSGAs is that the delayed updates to the border regions caused by the asynchronous communication actually improve the search performed by the parallel algorithm. Intuitively, delayed communication allows each processor to search longer with its current population before becoming influenced by the solutions communicated by neighboring processors. To evaluate this hypothesis, we examined synchronous GSGAs that were synchronized at different frequencies. Table 1 shows the results of these experiments with 256 processors.
Table 1. Synchronous GSGAs synchronized at different frequencies (256 processors). Columns include the synchronization frequency, the communication overhead, the total number of function evaluations (NumEval), and the accuracy (mean and standard deviation) of the final solutions.
prevent the performance analysis for very tight accuracy thresholds, since prohibitively long trials might be required to find near-optimal solutions. In our experiments the relative performance of the parallel GSGAs was consistent at different accuracy thresholds, which suggests that performance comparisons based on measurements with loose accuracy thresholds may apply for tighter accuracy thresholds.
Given our critique of traditional parallel performance measures, we expect that previous analyses of parallel GAs will need to be reconsidered. For example, claims of superlinear speedup for parallel GAs [1, 12, 13] have used speedup measures that do not directly relate the computation performed on P processors and the computation performed on one processor. Consequently, the meaning of this type of superlinear speedup is not clear.
Our experiments examine a single implementation issue for parallel GAs: the effect of synchronization on the parallel and numerical efficiencies of parallel GAs. There are a variety of other implementation issues whose effects need to be carefully examined. These include the topology of the population grid, the size of the population, and the type of neighborhoods used to perform competitive selection. Empirical evidence indicates that these issues can affect the numerical performance of GAs. For example, larger population sizes can sometimes improve the rate of convergence of GAs, particularly for difficult optimization problems.

While many of these issues have been independently examined by researchers, a careful analysis of their interactions in parallel GAs has not been conducted. Although we do not expect that a single formulation of parallel GAs will be best for solving all problems, this analysis is important for understanding what types of parallel GAs are likely to perform well. For example, our work in progress indicates that the shape of the population grid interacts with the synchronization method. Narrow grids appear to have better numerical performance, especially for asynchronous parallel GAs.
References
[1] T. C. Belding. The distributed genetic algorithm revisited. In L. Eshelman, editor, Proceedings of the Sixth Intl. Conf. on Genetic Algorithms, pages 114-121, San Mateo, CA, 1995. Morgan Kaufmann.
[2] B. Chapman, P. Mehrotra, and H. Zima. Extending HPF for advanced data parallel applications. Report 94-34, ICASE, May 1994.
[3] D. Chazan and W. L. Miranker. Chaotic relaxation. Lin. Algebra and Applic., 2:199-222, 1969.
[4] R. J. Collins and D. R. Jefferson. Selection in massively parallel genetic algorithms. In Proc. of the Fourth Intl. Conf. on Genetic Algorithms, pages 249-256, 1991.
[5] D. E. Goldberg. Genetic Algorithms in Search, Optimization, and Machine Learning. Addison-Wesley Publishing Co., Inc., 1989.
[6] V. S. Gordon and D. Whitley. Serial and parallel genetic algorithms as function optimizers. In S. Forrest, editor, Proc. of the Fifth Intl. Conf. on Genetic Algorithms, pages 177-183, San Mateo, CA, 1993. Morgan Kaufmann.
[7] M. Gorges-Schleuter. Explicit parallelism of genetic algorithms through population structures. In Parallel Problem Solving from Nature, pages 150-159, 1990.
[8] W. E. Hart. Adaptive Global Optimization with Local Search. PhD thesis, University of California, San Diego, May 1994.
[9] R. Judson, M. Colvin, J. Meza, A. Huffer, and D. Gutierrez. Do intelligent configuration search techniques outperform random search for large molecules? Intl. J. Quantum Chem., pages 277-290, 1992.
[10] S. R. Kohn and S. B. Baden. A robust parallel programming model for dynamic non-uniform scientific computations. In Proceedings of the 1994 Scalable High Performance Computing Conference, May 1994.
[11] J. M. McInerney. Biologically Influenced Algorithms and Parallelism in Non-Linear Optimization. PhD thesis, University of California, San Diego, 1992.
[12] H. Mühlenbein, M. Schomisch, and J. Born. The parallel genetic algorithm as function optimizer. In R. K. Belew and L. B. Booker, editors, Proc. of the Fourth Intl. Conf. on Genetic Algorithms, pages 271-278, San Mateo, CA, 1991. Morgan Kaufmann.
[13] R. Shonkwiler. Parallel genetic algorithms. In S. Forrest, editor, Proceedings of the Fifth Intl. Conf. on Genetic Algorithms, pages 199-205, San Mateo, CA, 1993. Morgan Kaufmann.
[14] P. Spiessens and B. Manderick. A massively parallel genetic algorithm: Implementation and first analysis. In Proc. of the Fourth Intl. Conf. on Genetic Algorithms, pages 279-285, 1991.
[15] T. Starkweather, D. Whitley, and K. Mathias. Optimization using distributed genetic algorithms. In Parallel Problem Solving from Nature, pages 176-185, 1990.
[16] L. E. Toothaker. Multiple Comparisons for Researchers. Sage Publications, 1991.