
ZeroMQ - The Guide


12/31/2015 MQ - The Guide - MQ - The Guide

ØMQ - The Guide
By Pieter Hintjens, CEO of iMatix

Please use the issue tracker for all comments and errata. This version covers the latest stable release of ZeroMQ (3.2). If you are using older versions of ZeroMQ then some of the examples and explanations won't be accurate.

The Guide is originally in C, but also in PHP, Python, Lua, and Haxe. We've also translated most of the examples into C++, C#, CL, Delphi, Erlang, F#, Felix, Haskell, Java, Objective-C, Ruby, Ada, Basic, Clojure, Go, Haxe, Node.js, ooc, Perl, and Scala.

Preface

ZeroMQ in a Hundred Words

ZeroMQ (also known as ØMQ, 0MQ, or zmq) looks like an embeddable networking library but acts like a concurrency framework. It gives you sockets that carry atomic messages across various transports like in-process, inter-process, TCP, and multicast. You can connect sockets N-to-N with patterns like fan-out, pub-sub, task distribution, and request-reply. It's fast enough to be the fabric for clustered products. Its asynchronous I/O model gives you scalable multicore applications, built as asynchronous message-processing tasks. It has a score of language APIs and runs on most operating systems. ZeroMQ is from iMatix and is LGPLv3 open source.

How It Began

We took a normal TCP socket, injected it with a mix of radioactive isotopes stolen from a secret Soviet atomic research project, bombarded it with 1950-era cosmic rays, and put it into the hands of a drug-addled comic book author with a badly-disguised fetish for bulging muscles clad in spandex. Yes, ZeroMQ sockets are the world-saving superheroes of the networking world.

Figure 1 - A terrible accident

http://zguide.zeromq.org/page:all 1/225

The Zen of Zero

The Ø in ZeroMQ is all about tradeoffs. On the one hand this strange name lowers ZeroMQ's visibility on Google and Twitter. On the other hand it annoys the heck out of some Danish folk who write us things like "ØMG røtfl", and "Ø is not a funny looking zero!" and "Rødgrød med fløde!", which is apparently an insult that means "may your neighbours be the direct descendants of Grendel!" Seems like a fair trade.

Originally the zero in ZeroMQ was meant as "zero broker" and (as close to) "zero latency" (as possible). Since then, it has come to encompass different goals: zero administration, zero cost, zero waste. More generally, "zero" refers to the culture of minimalism that permeates the project. We add power by removing complexity rather than by exposing new functionality.

Audience

This book is written for professional programmers who want to learn how to make the massively distributed software that will dominate the future of computing. We assume you can read C code, because most of the examples here are in C even though ZeroMQ is used in many languages. We assume you care about scale, because ZeroMQ solves that problem above all others. We assume you need the best possible results with the least possible cost, because otherwise you won't appreciate the trade-offs that ZeroMQ makes. Other than that basic background, we try to present all the concepts in networking and distributed computing you will need to use ZeroMQ.

Acknowledgements

Thanks to Andy Oram for making the O'Reilly book happen, and editing this text.

Thanks to Bill Desmarais, Brian Dorsey, Daniel Lin, Eric Desgranges, Gonzalo Diethelm, Guido Goldstein, Hunter Ford, Kamil Shakirov, Martin Sustrik, Mike Castleman, Naveen Chawla, Nicola Peduzzi, Oliver Smith, Olivier Chamoux, Peter Alexander, Pierre Rouleau, Randy Dryburgh, John Unwin, Alex Thomas, Mihail Minkov, Jeremy Avnet, Michael Compton, Kamil Kisiel, Mark Kharitonov, Guillaume Aubert, Ian Barber, Mike Sheridan, Faruk Akgul, Oleg Sidorov, Lev Givon, Allister MacLeod, Alexander D'Archangel, Andreas Hoelzlwimmer, Han Holl, Robert G. Jakabosky, Felipe Cruz, Marcus McCurdy, Mikhail Kulemin, Dr. Gergő Érdi, Pavel Zhukov, Alexander Else, Giovanni Ruggiero, Rick "Technoweenie", Daniel Lundin, Dave Hoover, Simon Jefford, Benjamin Peterson, Justin Case, Devon Weller, Richard Smith, Alexander Morland, Wadim Grasza, Michael Jakl, Uwe Dauernheim, Sebastian Nowicki, Simone Deponti, Aaron Raddon, Dan Colish, Markus Schirp, Benoit Larroque, Jonathan Palardy, Isaiah Peng, Arkadiusz Orzechowski, Umut Aydin, Matthew Horsfall, Jeremy W. Sherman, Eric Pugh, Tyler Sellon, John E. Vincent, Pavel Mitin, Min RK, Igor Wiedler, Olof Åkesson, Patrick Lucas, Heow Goodman, Senthil Palanisami, John Gallagher, Tomas Roos, Stephen McQuay, Erik Allik, Arnaud Cogoluègnes, Rob Gagnon, Dan Williams, Edward Smith, James Tucker, Kristian Kristensen, Vadim Shalts, Martin Trojer, Tom van Leeuwen, Hiten Pandya, Harm Aarts, Marc Harter, Iskren Ivov Chernev, Jay Han, Sonia Hamilton, Nathan Stocks, Naveen Palli, and Zed Shaw for their contributions to this work.

Chapter 1 - Basics

Fixing the World

How to explain ZeroMQ? Some of us start by saying all the wonderful things it does. It's sockets on steroids. It's like mailboxes with routing. It's fast! Others try to share their moment of enlightenment, that zap-pow-kaboom satori paradigm-shift moment when it all became obvious. Things just become simpler. Complexity goes away. It opens the mind. Others try to explain by comparison. It's smaller, simpler, but still looks familiar. Personally, I like to remember why we made ZeroMQ at all, because that's most likely where you, the reader, still are today.

Programming is science dressed up as art because most of us don't understand the physics of software and it's rarely, if ever, taught. The physics of software is not algorithms, data structures, languages and abstractions. These are just tools we make, use, throw away. The real physics of software is the physics of people: specifically, our limitations when it comes to complexity, and our desire to work together to solve large problems in pieces. This is the science of programming: make building blocks that people can understand and use easily, and people will work together to solve the very largest problems.

We live in a connected world, and modern software has to navigate this world. So the building blocks for tomorrow's very largest solutions are connected and massively parallel. It's not enough for code to be "strong and silent" any more. Code has to talk to code. Code has to be chatty, sociable, well-connected. Code has to run like the human brain, trillions of individual neurons firing off messages to each other, a massively parallel network with no central control, no single point of failure, yet able to solve immensely difficult problems. And it's no accident that the future of code looks like the human brain, because the endpoints of every network are, at some level, human brains.

If you've done any work with threads, protocols, or networks, you'll realize this is pretty much impossible. It's a dream. Even connecting a few programs across a few sockets is plain nasty when you start to handle real life situations. Trillions? The cost would be unimaginable. Connecting computers is so difficult that software and services to do this is a multi-billion dollar business.

So we live in a world where the wiring is years ahead of our ability to use it. We had a software crisis in the 1980s, when leading software engineers like Fred Brooks believed there was no "Silver Bullet" to "promise even one order of magnitude of improvement in productivity, reliability, or simplicity".

Brooks missed free and open source software, which solved that crisis, enabling us to share knowledge efficiently. Today we face another software crisis, but it's one we don't talk about much. Only the largest, richest firms can afford to create connected applications. There is a cloud, but it's proprietary. Our data and our knowledge is disappearing from our personal computers into clouds that we cannot access and with which we cannot compete. Who owns our social networks? It is like the mainframe-PC revolution in reverse.

We can leave the political philosophy for another book. The point is that while the Internet offers the potential of massively connected code, the reality is that this is out of reach for most of us, and so large interesting problems (in health, education, economics, transport, and so on) remain unsolved because there is no way to connect the code, and thus no way to connect the brains that could work together to solve these problems.

There have been many attempts to solve the challenge of connected code. There are thousands of IETF specifications, each solving part of the puzzle. For application developers, HTTP is perhaps the one solution to have been simple enough to work, but it arguably makes the problem worse by encouraging developers and architects to think in terms of big servers and thin, stupid clients.

So today people are still connecting applications using raw UDP and TCP, proprietary protocols, HTTP, and Websockets. It remains painful, slow, hard to scale, and essentially centralized. Distributed P2P architectures are mostly for play, not work. How many applications use Skype or Bittorrent to exchange data?

Which brings us back to the science of programming. To fix the world, we needed to do two things. One, to solve the general problem of "how to connect any code to any code, anywhere". Two, to wrap that up in the simplest possible building blocks that people could understand and use easily.

It sounds ridiculously simple. And maybe it is. That's kind of the whole point.

Starting Assumptions

We assume you are using at least version 3.2 of ZeroMQ. We assume you are using a Linux box or something similar. We assume you can read C code, more or less, as that's the default language for the examples. We assume that when we write constants like PUSH or SUBSCRIBE, you can imagine they are really called ZMQ_PUSH or ZMQ_SUBSCRIBE if the programming language needs it.

Getting the Examples

The examples live in a public GitHub repository. The simplest way to get all the examples is to clone this repository:

git clone --depth=1 https://github.com/imatix/zguide.git

Next, browse the examples subdirectory. You'll find examples by language. If there are examples missing in a language you use, you're encouraged to submit a translation. This is how this text became so useful, thanks to the work of many people. All examples are licensed under MIT/X11.

Ask and Ye Shall Receive

So let's start with some code. We start of course with a Hello World example. We'll make a client and a server. The client sends "Hello" to the server, which replies with "World". Here's the server in C, which opens a ZeroMQ socket on port 5555, reads requests on it, and replies with "World" to each request:

hwserver: Hello World server in C

C++|C#|Clojure|CL|Delphi|Erlang|F#|Felix|Go|Haskell|Haxe|Java|Lua|Node.js|ObjectiveC|Perl|PHP|Python|Q|Racket|Ruby|Scala
|Tcl|Ada|Basic|ooc

Figure 2 - Request-Reply

The REQ-REP socket pair is in lockstep. The client issues zmq_send() and then zmq_recv(), in a loop (or once if that's all it needs). Doing any other sequence (e.g., sending two messages in a row) will result in a return code of -1 from the send or recv call. Similarly, the service issues zmq_recv() and then zmq_send() in that order, as often as it needs to.

ZeroMQ uses C as its reference language and this is the main language we'll use for examples. If you're reading this online, the link below the example takes you to translations into other programming languages. Let's compare the same server in C++:

//
//  Hello World server in C++
//  Binds REP socket to tcp://*:5555
//  Expects "Hello" from client, replies with "World"
//
#include <zmq.hpp>
#include <cstring>
#include <string>
#include <iostream>
#ifndef _WIN32
#include <unistd.h>
#else
#include <windows.h>

//  Windows Sleep() takes milliseconds, POSIX sleep() takes seconds
#define sleep(n)    Sleep((n) * 1000)
#endif

int main () {
    //  Prepare our context and socket
    zmq::context_t context (1);
    zmq::socket_t socket (context, ZMQ_REP);
    socket.bind ("tcp://*:5555");

    while (true) {
        zmq::message_t request;

        //  Wait for next request from client
        socket.recv (&request);
        std::cout << "Received Hello" << std::endl;

        //  Do some 'work'
        sleep (1);

        //  Send reply back to client
        zmq::message_t reply (5);
        memcpy ((void *) reply.data (), "World", 5);
        socket.send (reply);
    }
    return 0;
}

hwserver.cpp: Hello World server

You can see that the ZeroMQ API is similar in C and C++. In a language like PHP or Java, we can hide even more and the code becomes even easier to read:

<?php
/*
 *  Hello World server
 *  Binds REP socket to tcp://*:5555
 *  Expects "Hello" from client, replies with "World"
 *  @author Ian Barber <ian(dot)barber(at)gmail(dot)com>
 */

$context = new ZMQContext(1);

//  Socket to talk to clients
$responder = new ZMQSocket($context, ZMQ::SOCKET_REP);
$responder->bind("tcp://*:5555");

while (true) {
    //  Wait for next request from client
    $request = $responder->recv();
    printf ("Received request: [%s]\n", $request);

    //  Do some 'work'
    sleep (1);

    //  Send reply back to client
    $responder->send("World");
}

hwserver.php: Hello World server

//
//  Hello World server in Java
//  Binds REP socket to tcp://*:5555
//  Expects "Hello" from client, replies with "World"
//

import org.zeromq.ZMQ;

public class hwserver {

    public static void main(String[] args) throws Exception {
        ZMQ.Context context = ZMQ.context(1);

        //  Socket to talk to clients
        ZMQ.Socket responder = context.socket(ZMQ.REP);
        responder.bind("tcp://*:5555");

        while (!Thread.currentThread().isInterrupted()) {
            //  Wait for next request from the client
            byte[] request = responder.recv(0);
            System.out.println("Received Hello");

            //  Do some 'work'
            Thread.sleep(1000);

            //  Send reply back to client
            String reply = "World";
            responder.send(reply.getBytes(), 0);
        }
        responder.close();
        context.term();
    }
}

hwserver.java: Hello World server

The server in other languages:

hwserver: Hello World server in C

C++|C#|Clojure|CL|Delphi|Erlang|F#|Felix|Go|Haskell|Haxe|Java|Lua|Node.js|ObjectiveC|Perl|PHP|Python|Q|Racket|Ruby|Scala
|Tcl|Ada|Basic|ooc

Here's the client code:

hwclient: Hello World client in C

C++|C#|Clojure|CL|Delphi|Erlang|F#|Felix|Go|Haskell|Haxe|Java|Lua|Node.js|ObjectiveC|Perl|PHP|Python|Q|Racket|Ruby|Scala
|Tcl|Ada|Basic|ooc

Now this looks too simple to be realistic, but ZeroMQ sockets have, as we already learned, superpowers. You could throw thousands of clients at this server, all at once, and it would continue to work happily and quickly. For fun, try starting the client and then starting the server, see how it all still works, then think for a second what this means.

Let us explain briefly what these two programs are actually doing. They create a ZeroMQ context to work with, and a socket. Don't worry what the words mean. You'll pick it up. The server binds its REP (reply) socket to port 5555. The server waits for a request in a loop, and responds each time with a reply. The client sends a request and reads the reply back from the server.

If you kill the server (Ctrl-C) and restart it, the client won't recover properly. Recovering from crashing processes isn't quite that easy. Making a reliable request-reply flow is complex enough that we won't cover it until Chapter 4 - Reliable Request-Reply Patterns.

There is a lot happening behind the scenes but what matters to us programmers is how short and sweet the code is, and how often it doesn't crash, even under a heavy load. This is the request-reply pattern, probably the simplest way to use ZeroMQ. It maps to RPC and the classic client/server model.

A Minor Note on Strings

ZeroMQ doesn't know anything about the data you send except its size in bytes. That means you are responsible for formatting it safely so that applications can read it back. Doing this for objects and complex data types is a job for specialized libraries like Protocol Buffers. But even for strings, you need to take care.

In C and some other languages, strings are terminated with a null byte. We could send a string like "HELLO" with that extra null byte:

zmq_send (requester, "Hello", 6, 0);

However, if you send a string from another language, it probably will not include that null byte. For example, when we send that same string in Python, we do this:

socket.send ("Hello")

Then what goes onto the wire is a length (one byte for shorter strings) and the string contents as individual characters.

Figure 3 - A ZeroMQ string

And if you read this from a C program, you will get something that looks like a string, and might by accident act like a string (if by luck the five bytes find themselves followed by an innocently lurking null), but isn't a proper string. When your client and server don't agree on the string format, you will get weird results.

When you receive string data from ZeroMQ in C, you simply cannot trust that it's safely terminated. Every single time you read a string, you should allocate a new buffer with space for an extra byte, copy the string, and terminate it properly with a null.
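
The allocate, copy, and terminate steps can be sketched in plain C, independent of any socket call. The helper name frame_to_cstring is hypothetical; it only assumes that the caller already holds the received bytes and their count.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

// Turns `size` bytes of received data into a freshly allocated C string.
// Hypothetical helper for illustration; the caller must free() the result.
// Returns NULL if the allocation fails.
char *frame_to_cstring(const void *data, size_t size)
{
    assert(data != NULL || size == 0);

    // Space for the payload plus one extra byte for the terminator
    char *string = malloc(size + 1);
    if (string == NULL)
        return NULL;

    // Copy exactly the bytes that arrived, never more
    if (size > 0)
        memcpy(string, data, size);

    // Terminate it ourselves rather than trusting the sender
    string[size] = '\0';
    return string;
}
```

Given seven bytes in memory of which only the first five belong to the message, frame_to_cstring(bytes, 5) yields a proper five-character string, whatever happens to follow the payload.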

So let's establish the rule that ZeroMQ strings are length-specified and are sent on the wire without a trailing null. In the simplest case (and we'll do this in our examples), a ZeroMQ string maps neatly to a ZeroMQ message frame, which looks like the above figure: a length and some bytes.

Here is what we need to do, in C, to receive a ZeroMQ string and deliver it to the application as a valid C string:

//  Receive ZeroMQ string from socket and convert into C string
//  Chops string at 255 chars, if it's longer
static char *
s_recv (void *socket) {
    char buffer [256];
    int size = zmq_recv (socket, buffer, 255, 0);
    if (size == -1)
        return NULL;
    if (size > 255)
        size = 255;
    buffer [size] = 0;
    return strdup (buffer);
}

This makes a handy helper function and in the spirit of making things we can reuse profitably, let's write a similar s_send function that sends strings in the correct ZeroMQ format, and package this into a header file we can reuse.

The result is zhelpers.h, which lets us write sweeter and shorter ZeroMQ applications in C. It is a fairly long source, and only fun for C developers, so read it at leisure.

Version Reporting

ZeroMQ does come in several versions and quite often, if you hit a problem, it'll be something that's been fixed in a later version. So it's a useful trick to know exactly what version of ZeroMQ you're actually linking with.

Here is a tiny program that does that:

version: ZeroMQ version reporting in C

C++|C#|CL|Delphi|Erlang|F#|Felix|Go|Haskell|Java|Lua|Node.js|ObjectiveC|Perl|PHP|Python|Q|Ruby|Scala|Tcl|Ada|Basic|
Clojure|Haxe|ooc|Racket

Getting the Message Out

The second classic pattern is one-way data distribution, in which a server pushes updates to a set of clients. Let's see an example that pushes out weather updates consisting of a zip code, temperature, and relative humidity. We'll generate random values, just like the real weather stations do.

Here's the server. We'll use port 5556 for this application:

wuserver: Weather update server in C

C++|C#|Clojure|CL|Delphi|Erlang|F#|Felix|Go|Haskell|Haxe|Java|Lua|Node.js|ObjectiveC|Perl|PHP|Python|Racket|Ruby|Scala|
Tcl|Ada|Basic|ooc|Q

There's no start and no end to this stream of updates, it's like a never ending broadcast.

Here is the client application, which listens to the stream of updates and grabs anything to do with a specified zip code, by default New York City because that's a great place to start any adventure:

wuclient: Weather update client in C

C++|C#|Clojure|CL|Delphi|Erlang|F#|Felix|Go|Haskell|Haxe|Java|Lua|Node.js|ObjectiveC|Perl|PHP|Python|Racket|Ruby|Scala|
Tcl|Ada|Basic|ooc|Q

Figure 4 - Publish-Subscribe

Note that when you use a SUB socket you must set a subscription using zmq_setsockopt() and SUBSCRIBE, as in this code. If you don't set any subscription, you won't get any messages. It's a common mistake for beginners. The subscriber can set many subscriptions, which are added together. That is, if an update matches ANY subscription, the subscriber receives it. The subscriber can also cancel specific subscriptions. A subscription is often, but not necessarily, a printable string. See zmq_setsockopt() for how this works.
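
A subscription is matched as a prefix of the message, and an empty subscription matches everything. The sketch below models that rule, together with the "ANY subscription" behaviour, in plain C. It is an illustration of the rule rather than libzmq's implementation: the function names are hypothetical, and it uses C strings for brevity although real subscriptions are arbitrary byte sequences.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

// Returns 1 if `message` begins with `subscription`, else 0.
// A zero-length subscription matches every message.
int matches_subscription(const char *message, const char *subscription)
{
    assert(message != NULL && subscription != NULL);
    return strncmp(message, subscription, strlen(subscription)) == 0;
}

// Returns 1 if the message matches ANY of the subscriptions, else 0.
// With no subscriptions at all, nothing is delivered.
int is_delivered(const char *message,
                 const char *const *subscriptions, size_t count)
{
    for (size_t i = 0; i < count; i++)
        if (matches_subscription(message, subscriptions[i]))
            return 1;
    return 0;
}
```

With the subscriptions "10001 " and "90210 ", the update "10001 28 45" is delivered and "20500 60 30" is not. With an empty subscription list nothing gets through, which is the beginner's mistake described above.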

The PUB-SUB socket pair is asynchronous. The client does zmq_recv(), in a loop (or once if that's all it needs). Trying to send a message to a SUB socket will cause an error. Similarly, the service does zmq_send() as often as it needs to, but must not do zmq_recv() on a PUB socket.

In theory with ZeroMQ sockets, it does not matter which end connects and which end binds. However, in practice there are undocumented differences that I'll come to later. For now, bind the PUB and connect the SUB, unless your network design makes that impossible.

There is one more important thing to know about PUB-SUB sockets: you do not know precisely when a subscriber starts to get messages. Even if you start a subscriber, wait a while, and then start the publisher, the subscriber will always miss the first messages that the publisher sends. This is because as the subscriber connects to the publisher (something that takes a small but non-zero time), the publisher may already be sending messages out.

This "slow joiner" symptom hits enough people often enough that we're going to explain it in detail. Remember that ZeroMQ does asynchronous I/O, i.e., in the background. Say you have two nodes doing this, in this order:

Subscriber connects to an endpoint and receives and counts messages.
Publisher binds to an endpoint and immediately sends 1,000 messages.

Then the subscriber will most likely not receive anything. You'll blink, check that you set a correct filter and try again, and the subscriber will still not receive anything.

Making a TCP connection involves to and from handshaking that takes several milliseconds depending on your network and the number of hops between peers. In that time, ZeroMQ can send many messages. For sake of argument assume it takes 5 msecs to establish a connection, and that same link can handle 1M messages per second. During the 5 msecs that the subscriber is connecting to the publisher, it takes the publisher only 1 msec to send out those 1K messages.
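
The arithmetic in that paragraph can be written out as two small functions. This is only a back-of-the-envelope model using the figures assumed above (5 msecs to connect, 1M messages per second); the function names are made up for illustration.

```c
#include <assert.h>

// How many messages the publisher can push out while a subscriber is
// still busy connecting.
long messages_during_connect(long connect_msecs, long msgs_per_second)
{
    assert(connect_msecs >= 0 && msgs_per_second >= 0);
    return connect_msecs * msgs_per_second / 1000;
}

// How many messages of a batch are already gone before the subscriber
// is connected: the whole batch if it fits into the connect window.
long messages_missed(long batch, long connect_msecs, long msgs_per_second)
{
    long window = messages_during_connect(connect_msecs, msgs_per_second);
    return batch < window ? batch : window;
}
```

With those figures the connect window has room for 5,000 messages, so messages_missed(1000, 5, 1000000) is 1,000: the entire batch has left the publisher before the subscriber is ready.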

In Chapter 2 - Sockets and Patterns we'll explain how to synchronize a publisher and subscribers so that you don't start to publish data until the subscribers really are connected and ready. There is a simple and stupid way to delay the publisher, which is to sleep. Don't do this in a real application, though, because it is extremely fragile as well as inelegant and slow. Use sleeps to prove to yourself what's happening, and then wait for Chapter 2 - Sockets and Patterns to see how to do this right.

The alternative to synchronization is to simply assume that the published data stream is infinite and has no start and no end. One also assumes that the subscriber doesn't care what transpired before it started up. This is how we built our weather client example.

So the client subscribes to its chosen zip code and collects 100 updates for that zip code. That means about ten million updates from the server, if zip codes are randomly distributed. You can start the client, and then the server, and the client will keep working. You can stop and restart the server as often as you like, and the client will keep working. When the client has collected its hundred updates, it calculates the average, prints it, and exits.
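
The collect-and-average step might look like the sketch below. It assumes each update is a text line of three space-separated integers (zip code, temperature, relative humidity); that wire format and the function name average_temperature are assumptions made for this illustration, since the example source is not shown here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

// Averages the temperature field over a set of collected updates.
// Each update is assumed to look like "10001 28 45".
// Updates that do not parse are skipped; returns 0 if none parse.
double average_temperature(const char *const *updates, size_t count)
{
    assert(updates != NULL || count == 0);

    long total = 0;
    size_t parsed = 0;
    for (size_t i = 0; i < count; i++) {
        int zipcode, temperature, humidity;
        if (sscanf(updates[i], "%d %d %d",
                   &zipcode, &temperature, &humidity) == 3) {
            total += temperature;
            parsed++;
        }
    }
    return parsed > 0 ? (double) total / (double) parsed : 0.0;
}
```

In a real client the strings would come from the SUB socket one at a time. Keeping a running total instead of an array of 100 strings would work just as well.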

Some points about the publish-subscribe (pub-sub) pattern:

A subscriber can connect to more than one publisher, using one connect call each time. Data will then arrive and be interleaved ("fair-queued") so that no single publisher drowns out the others.

If a publisher has no connected subscribers, then it will simply drop all messages.

If you're using TCP and a subscriber is slow, messages will queue up on the publisher. We'll look at how to protect publishers against this using the "high-water mark" later.

From ZeroMQ v3.x, filtering happens at the publisher side when using a connected protocol (tcp:// or ipc://). Using the epgm:// protocol, filtering happens at the subscriber side. In ZeroMQ v2.x, all filtering happened at the subscriber side.

This is how long it takes to receive and filter 10M messages on my laptop, which is a 2011-era Intel i5, decent but nothing special:

$ time wuclient
Collecting updates from weather server...
Average temperature for zipcode '10001' was 28F

real    0m4.470s
user    0m0.000s
sys     0m0.008s

Divide and Conquer

Figure 5 - Parallel Pipeline

As a final example (you are surely getting tired of juicy code and want to delve back into philological discussions about comparative abstractive norms), let's do a little supercomputing. Then coffee. Our supercomputing application is a fairly typical parallel processing model. We have:

A ventilator that produces tasks that can be done in parallel
A set of workers that process tasks
A sink that collects results back from the worker processes

In reality, workers run on superfast boxes, perhaps using GPUs (graphic processing units) to do the hard math. Here is the ventilator. It generates 100 tasks, each a message telling the worker to sleep for some number of milliseconds:

taskvent: Parallel task ventilator in C

C++|C#|Clojure|CL|Delphi|Erlang|F#|Felix|Go|Haskell|Haxe|Java|Lua|Node.js|ObjectiveC|Perl|PHP|Python|Ruby|Scala|Tcl|Ada|
Basic|ooc|Q|Racket

Here is the worker application. It receives a message, sleeps for that number of milliseconds, and then signals that it's finished:

taskwork: Parallel task worker in C

C++|C#|Clojure|CL|Delphi|Erlang|F#|Felix|Go|Haskell|Haxe|Java|Lua|Node.js|ObjectiveC|Perl|PHP|Python|Ruby|Scala|Tcl|Ada|
Basic|ooc|Q|Racket

Here is the sink application. It collects the 100 tasks, then calculates how long the overall processing took, so we can confirm that the workers really were running in parallel if there are more than one of them:

tasksink: Parallel task sink in C

C++|C#|Clojure|CL|Delphi|Erlang|F#|Felix|Go|Haskell|Haxe|Java|Lua|Node.js|ObjectiveC|Perl|PHP|Python|Ruby|Scala|Tcl|Ada|
Basic|ooc|Q|Racket

The average cost of a batch is 5 seconds. When we start 1, 2, or 4 workers we get results like this from the sink:

1 worker: total elapsed time: 5034 msecs.
2 workers: total elapsed time: 2421 msecs.
4 workers: total elapsed time: 1018 msecs.
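
Why the elapsed time drops roughly in proportion to the number of workers can be checked with a small model. The sketch assumes tasks are dealt out to workers in strict rotation and that each worker runs its own tasks one after another; it ignores network and start-up costs, and the function name round_robin_elapsed is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

// Elapsed time, in msecs, for a batch dealt out in rotation to `workers`
// workers. The batch is done when the busiest worker is done.
long round_robin_elapsed(const int *task_msecs, size_t task_count,
                         size_t workers)
{
    assert(workers > 0);

    long slowest = 0;
    for (size_t w = 0; w < workers; w++) {
        long busy = 0;
        // Worker w gets tasks w, w + workers, w + 2 * workers, ...
        for (size_t i = w; i < task_count; i += workers)
            busy += task_msecs[i];
        if (busy > slowest)
            slowest = busy;
    }
    return slowest;
}
```

For tasks of 10, 20, 30, and 40 msecs the model gives 100 msecs with one worker, 60 with two, and 40 with four. The result is set by the busiest worker, so measured times only approximate the total divided by the number of workers.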

Let's look at some aspects of this code in more detail:

The workers connect upstream to the ventilator, and downstream to the sink. This means you can add workers arbitrarily. If the workers bound to their endpoints, you would need (a) more endpoints and (b) to modify the ventilator and/or the sink each time you added a worker. We say that the ventilator and sink are stable parts of our architecture and the workers are dynamic parts of it.

We have to synchronize the start of the batch with all workers being up and running. This is a fairly common gotcha in ZeroMQ and there is no easy solution. The zmq_connect method takes a certain time. So when a set of workers connect to the ventilator, the first one to successfully connect will get a whole load of messages in that short time while the others are also connecting. If you don't synchronize the start of the batch somehow, the system won't run in parallel at all. Try removing the wait in the ventilator, and see what happens.

The ventilator's PUSH socket distributes tasks to workers (assuming they are all connected before the batch starts going out) evenly. This is called load balancing and it's something we'll look at again in more detail.

The sink's PULL socket collects results from workers evenly. This is called fair-queuing.

Figure 6 - Fair Queuing

The pipeline pattern also exhibits the "slow joiner" syndrome, leading to accusations that PUSH sockets don't load balance properly. If you are using PUSH and PULL, and one of your workers gets way more messages than the others, it's because that PULL socket has joined faster than the others, and grabs a lot of messages before the others manage to connect. If you want proper load balancing, you probably want to look at the load balancing pattern in Chapter 3 - Advanced Request-Reply Patterns.

Programming with ZeroMQ

Having seen some examples, you must be eager to start using ZeroMQ in some apps. Before you start that, take a deep breath, chillax, and reflect on some basic advice that will save you much stress and confusion.

Learn ZeroMQ step-by-step. It's just one simple API, but it hides a world of possibilities. Take the possibilities slowly and master each one.

Write nice code. Ugly code hides problems and makes it hard for others to help you. You might get used to meaningless variable names, but people reading your code won't. Use names that are real words, that say something other than "I'm too careless to tell you what this variable is really for". Use consistent indentation and clean layout. Write nice code and your world will be more comfortable.

Test what you make as you make it. When your program doesn't work, you should know what five lines are to blame. This is especially true when you do ZeroMQ magic, which just won't work the first few times you try it.

When you find that things don't work as expected, break your code into pieces, test each one, see which one is not working. ZeroMQ lets you make essentially modular code; use that to your advantage.

Make abstractions (classes, methods, whatever) as you need them. If you copy/paste a lot of code, you're going to copy/paste errors, too.

Getting the Context Right

ZeroMQ applications always start by creating a context, and then using that for creating sockets. In C, it's the zmq_ctx_new() call. You should create and use exactly one context in your process. Technically, the context is the container for all sockets in a single process, and acts as the transport for inproc sockets, which are the fastest way to connect threads in one process. If at runtime a process has two contexts, these are like separate ZeroMQ instances. If that's explicitly what you want, OK, but otherwise remember:

Call zmq_ctx_new() once at the start of a process, and zmq_ctx_destroy() once at the end.

If you're using the fork() system call, do zmq_ctx_new() after the fork and at the beginning of the child process code. In general, you want to do interesting (ZeroMQ) stuff in the children, and boring process management in the parent.

Making a Clean Exit

Classy programmers share the same motto as classy hit men: always clean up when you finish the job. When you use ZeroMQ in a language like Python, stuff gets automatically freed for you. But when using C, you have to carefully free objects when you're finished with them or else you get memory leaks, unstable applications, and generally bad karma.

Memory leaks are one thing, but ZeroMQ is quite finicky about how you exit an application. The reasons are technical and painful, but the upshot is that if you leave any sockets open, the zmq_ctx_destroy() function will hang forever. And even if you close all sockets, zmq_ctx_destroy() will by default wait forever if there are pending connects or sends unless you set the LINGER to zero on those sockets before closing them.

The ZeroMQ objects we need to worry about are messages, sockets, and contexts. Luckily it's quite simple, at least in simple programs:

Use zmq_send() and zmq_recv() when you can, as it avoids the need to work with zmq_msg_t objects.

If you do use zmq_msg_recv(), always release the received message as soon as you're done with it, by calling zmq_msg_close().

If you are opening and closing a lot of sockets, that's probably a sign that you need to redesign your application. In some cases socket handles won't be freed until you destroy the context.

When you exit the program, close your sockets and then call zmq_ctx_destroy(). This destroys the context.

This is at least the case for C development. In a language with automatic object destruction, sockets and contexts will be destroyed as you leave the scope. If you use exceptions you'll have to do the clean-up in something like a "final" block, the same as for any resource.

If you're doing multithreaded work, it gets rather more complex than this. We'll get to multithreading in the next chapter, but because some of you will, despite warnings, try to run before you can safely walk, below is the quick and dirty guide to making a clean exit in a multithreaded ZeroMQ application.

First, do not try to use the same socket from multiple threads. Please don't explain why you think this would be excellent fun, just please don't do it. Next, you need to shut down each socket that has ongoing requests. The proper way is to set a low LINGER value (1 second), and then close the socket. If your language binding doesn't do this for you automatically when you destroy a context, I'd suggest sending a patch.

Finally, destroy the context. This will cause any blocking receives or polls or sends in attached threads (i.e., which share the same context) to return with an error. Catch that error, and then set linger on, and close sockets in that thread, and exit. Do not destroy the same context twice. The zmq_ctx_destroy in the main thread will block until all sockets it knows about are safely closed.

Voila! It's complex and painful enough that any language binding author worth his or her salt will do this automatically and make the socket closing dance unnecessary.

Why We Needed ZeroMQ

Now that you've seen ZeroMQ in action, let's go back to the "why".

Many applications these days consist of components that stretch across some kind of network, either a LAN or the Internet. So many application developers end up doing some kind of messaging. Some developers use message queuing products, but most of the time they do it themselves, using TCP or UDP. These protocols are not hard to use, but there is a great difference between sending a few bytes from A to B, and doing messaging in any kind of reliable way.

Let's look at the typical problems we face when we start to connect pieces using raw TCP. Any reusable messaging layer would need to solve all or most of these:

How do we handle I/O? Does our application block, or do we handle I/O in the background? This is a key design decision. Blocking I/O creates architectures that do not scale well. But background I/O can be very hard to do right.

How do we handle dynamic components, i.e., pieces that go away temporarily? Do we formally split components into "clients" and "servers" and mandate that servers cannot disappear? What then if we want to connect servers to servers? Do we try to reconnect every few seconds?

How do we represent a message on the wire? How do we frame data so it's easy to write and read, safe from buffer overflows, efficient for small messages, yet adequate for the very largest videos of dancing cats wearing party hats?

How do we handle messages that we can't deliver immediately? Particularly, if we're waiting for a component to come back online? Do we discard messages, put them into a database, or into a memory queue?

Wheredowestoremessagequeues?Whathappensifthecomponentreadingfromaqueueisveryslowandcausesour
queuestobuildup?What'sourstrategythen?

Howdowehandlelostmessages?Dowewaitforfreshdata,requestaresend,ordowebuildsomekindofreliabilitylayer
thatensuresmessagescannotbelost?Whatifthatlayeritselfcrashes?

What if we need to use a different network transport, say, multicast instead of TCP unicast? Or IPv6? Do we need to rewrite the applications, or is the transport abstracted in some layer?

How do we route messages? Can we send the same message to multiple peers? Can we send replies back to an original requester?

How do we write an API for another language? Do we re-implement a wire-level protocol or do we repackage a library? If the former, how can we guarantee efficient and stable stacks? If the latter, how can we guarantee interoperability?

How do we represent data so that it can be read between different architectures? Do we enforce a particular encoding for data types? How far is this the job of the messaging system rather than a higher layer?

How do we handle network errors? Do we wait and retry, ignore them silently, or abort?

Take a typical open source project like Hadoop Zookeeper and read the C API code in src/c/src/zookeeper.c. When I read this code, in January 2013, it was 4,200 lines of mystery and in there is an undocumented, client/server network communication protocol. I see it's efficient because it uses poll instead of select. But really, Zookeeper should be using a generic messaging layer and an explicitly documented wire-level protocol. It is incredibly wasteful for teams to be building this particular wheel over and over.

But how to make a reusable messaging layer? Why, when so many projects need this technology, are people still doing it the hard way by driving TCP sockets in their code, and solving the problems in that long list over and over?

It turns out that building reusable messaging systems is really difficult, which is why few FOSS projects ever tried, and why commercial messaging products are complex, expensive, inflexible, and brittle. In 2006, iMatix designed AMQP which started to give FOSS developers perhaps the first reusable recipe for a messaging system. AMQP works better than many other designs, but remains relatively complex, expensive, and brittle. It takes weeks to learn to use, and months to create stable architectures that don't crash when things get hairy.

Figure 7 - Messaging as it Starts

Most messaging projects, like AMQP, that try to solve this long list of problems in a reusable way do so by inventing a new concept, the "broker", that does addressing, routing, and queuing. This results in a client/server protocol or a set of APIs on top of some undocumented protocol that allows applications to speak to this broker. Brokers are an excellent thing in reducing the complexity of large networks. But adding broker-based messaging to a product like Zookeeper would make it worse, not better. It would mean adding an additional big box, and a new single point of failure. A broker rapidly becomes a bottleneck and a new risk to manage. If the software supports it, we can add a second, third, and fourth broker and make some failover scheme. People do this. It creates more moving pieces, more complexity, and more things to break.

And a broker-centric setup needs its own operations team. You literally need to watch the brokers day and night, and beat them with a stick when they start misbehaving. You need boxes, and you need backup boxes, and you need people to manage those boxes. It is only worth doing for large applications with many moving pieces, built by several teams of people over several years.

Figure 8 - Messaging as it Becomes

So small to medium application developers are trapped. Either they avoid network programming and make monolithic applications that do not scale. Or they jump into network programming and make brittle, complex applications that are hard to maintain. Or they bet on a messaging product, and end up with scalable applications that depend on expensive, easily broken technology. There has been no really good choice, which is maybe why messaging is largely stuck in the last century and stirs strong emotions: negative ones for users, gleeful joy for those selling support and licenses.

What we need is something that does the job of messaging, but does it in such a simple and cheap way that it can work in any application, with close to zero cost. It should be a library which you just link, without any other dependencies. No additional moving pieces, so no additional risk. It should run on any OS and work with any programming language.

And this is ZeroMQ: an efficient, embeddable library that solves most of the problems an application needs to become nicely elastic across a network, without much cost.

Specifically:

It handles I/O asynchronously, in background threads. These communicate with application threads using lock-free data structures, so concurrent ZeroMQ applications need no locks, semaphores, or other wait states.

Components can come and go dynamically and ZeroMQ will automatically reconnect. This means you can start components in any order. You can create "service-oriented architectures" (SOAs) where services can join and leave the network at any time.

It queues messages automatically when needed. It does this intelligently, pushing messages as close as possible to the receiver before queuing them.

It has ways of dealing with over-full queues (called "high water mark"). When a queue is full, ZeroMQ automatically blocks senders, or throws away messages, depending on the kind of messaging you are doing (the so-called "pattern").

It lets your applications talk to each other over arbitrary transports: TCP, multicast, in-process, inter-process. You don't need to change your code to use a different transport.

It handles slow/blocked readers safely, using different strategies that depend on the messaging pattern.

It lets you route messages using a variety of patterns such as request-reply and pub-sub. These patterns are how you create the topology, the structure of your network.

It lets you create proxies to queue, forward, or capture messages with a single call. Proxies can reduce the interconnection complexity of a network.

It delivers whole messages exactly as they were sent, using a simple framing on the wire. If you write a 10k message, you will receive a 10k message.

It does not impose any format on messages. They are blobs from zero to gigabytes large. When you want to represent data you choose some other product on top, such as msgpack, Google's protocol buffers, and others.

It handles network errors intelligently, by retrying automatically in cases where it makes sense.

It reduces your carbon footprint. Doing more with less CPU means your boxes use less power, and you can keep your old boxes in use for longer. Al Gore would love ZeroMQ.
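To make the high-water mark idea from the list above a little more concrete, here is a minimal sketch of a bounded queue in plain C. It is not ZeroMQ code and does not use libzmq. The names toy_queue_t, toy_queue_push(), toy_queue_pop(), and TOY_HWM are invented for illustration, and the "block" policy is only modeled as a refusal that a real sender would wait on.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_HWM 3   /* hypothetical high-water mark: at most 3 queued items */

typedef enum { TOY_POLICY_DROP, TOY_POLICY_BLOCK } toy_policy_t;

typedef struct {
    int items[TOY_HWM];   /* ring buffer storage */
    size_t head;          /* index of the oldest item */
    size_t count;         /* number of items currently queued */
    size_t dropped;       /* items discarded under TOY_POLICY_DROP */
} toy_queue_t;

/* Try to queue one item. Returns true if it was queued.
   At the high-water mark the item is refused: under TOY_POLICY_DROP it is
   counted as discarded, under TOY_POLICY_BLOCK the caller is expected to
   wait and try again later. */
bool toy_queue_push(toy_queue_t *queue, int item, toy_policy_t policy)
{
    assert(queue != NULL);
    if (queue->count == TOY_HWM) {
        if (policy == TOY_POLICY_DROP)
            queue->dropped++;
        return false;
    }
    queue->items[(queue->head + queue->count) % TOY_HWM] = item;
    queue->count++;
    return true;
}

/* Remove the oldest item. Returns false if the queue is empty. */
bool toy_queue_pop(toy_queue_t *queue, int *item)
{
    assert(queue != NULL && item != NULL);
    if (queue->count == 0)
        return false;
    *item = queue->items[queue->head];
    queue->head = (queue->head + 1) % TOY_HWM;
    queue->count--;
    return true;
}
```

A publisher-style sender would use the drop policy and carry on, while a request-style sender would treat a refusal as "wait until the reader catches up". Which one applies is a property of the pattern, not something the application has to code by hand.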

Actually ZeroMQ does rather more than this. It has a subversive effect on how you develop network-capable applications. Superficially, it's a socket-inspired API on which you do zmq_recv() and zmq_send(). But message processing rapidly becomes the central loop, and your application soon breaks down into a set of message processing tasks. It is elegant and natural. And it scales: each of these tasks maps to a node, and the nodes talk to each other across arbitrary transports. Two nodes in one process (node is a thread), two nodes on one box (node is a process), or two nodes on one network (node is a box): it's all the same, with no application code changes.

Socket Scalability

Let's see ZeroMQ's scalability in action. Here is a shell script that starts the weather server and then a bunch of clients in parallel:

    wuserver &
    wuclient 12345 &
    wuclient 23456 &
    wuclient 34567 &
    wuclient 45678 &
    wuclient 56789 &

As the clients run, we take a look at the active processes using the top command, and we see something like (on a 4-core box):

    PID  USER  PR  NI  VIRT  RES  SHR S %CPU %MEM   TIME+  COMMAND
    7136 ph    20   0 1040m 959m 1156 R  157 12.0 16:25.47 wuserver
    7966 ph    20   0 98608 1804 1372 S   33  0.0  0:03.94 wuclient
    7963 ph    20   0 33116 1748 1372 S   14  0.0  0:00.76 wuclient
    7965 ph    20   0 33116 1784 1372 S    6  0.0  0:00.47 wuclient
    7964 ph    20   0 33116 1788 1372 S    5  0.0  0:00.25 wuclient
    7967 ph    20   0 33072 1740 1372 S    5  0.0  0:00.35 wuclient

Let's think for a second about what is happening here. The weather server has a single socket, and yet here we have it sending data to five clients in parallel. We could have thousands of concurrent clients. The server application doesn't see them, doesn't talk to them directly. So the ZeroMQ socket is acting like a little server, silently accepting client requests and shoving data out to them as fast as the network can handle it. And it's a multithreaded server, squeezing more juice out of your CPU.

Upgrading from ZeroMQ v2.2 to ZeroMQ v3.2

Compatible Changes

These changes don't impact existing application code directly:

Pub-sub filtering is now done at the publisher side instead of subscriber side. This improves performance significantly in many pub-sub use cases. You can mix v3.2 and v2.1/v2.2 publishers and subscribers safely.

ZeroMQ v3.2 has many new API methods (zmq_disconnect(), zmq_unbind(), zmq_monitor(), zmq_ctx_set(), etc.)

Incompatible Changes

These are the main areas of impact on applications and language bindings:

Changed send/recv methods: zmq_send() and zmq_recv() have a different, simpler interface, and the old functionality is now provided by zmq_msg_send() and zmq_msg_recv(). Symptom: compile errors. Solution: fix up your code.

These two methods return the number of bytes in the message (zero or more) on success, and -1 on error. In v2.x they always returned zero on success. Symptom: apparent errors when things actually work fine. Solution: test strictly for return code = -1, not non-zero.

zmq_poll() now waits for milliseconds, not microseconds. Symptom: application stops responding (in fact responds 1000 times slower). Solution: use the ZMQ_POLL_MSEC macro defined below, in all zmq_poll calls.

ZMQ_NOBLOCK is now called ZMQ_DONTWAIT. Symptom: compile failures on the ZMQ_NOBLOCK macro.

The ZMQ_HWM socket option is now broken into ZMQ_SNDHWM and ZMQ_RCVHWM. Symptom: compile failures on the ZMQ_HWM macro.

Most but not all zmq_getsockopt() options are now integer values. Symptom: runtime error returns on zmq_setsockopt and zmq_getsockopt.

The ZMQ_SWAP option has been removed. Symptom: compile failures on ZMQ_SWAP. Solution: redesign any code that uses this functionality.

Suggested Shim Macros

For applications that want to run on both v2.x and v3.2, such as language bindings, our advice is to emulate v3.2 as far as possible. Here are C macro definitions that help your C/C++ code to work across both versions (taken from CZMQ):

    #ifndef ZMQ_DONTWAIT
    #define ZMQ_DONTWAIT ZMQ_NOBLOCK
    #endif
    #if ZMQ_VERSION_MAJOR == 2
    #define zmq_msg_send(msg,sock,opt) zmq_send (sock, msg, opt)
    #define zmq_msg_recv(msg,sock,opt) zmq_recv (sock, msg, opt)
    #define zmq_ctx_destroy(context) zmq_term (context)
    #define ZMQ_POLL_MSEC 1000      //  zmq_poll is usec
    #define ZMQ_SNDHWM ZMQ_HWM
    #define ZMQ_RCVHWM ZMQ_HWM
    #elif ZMQ_VERSION_MAJOR == 3
    #define ZMQ_POLL_MSEC 1         //  zmq_poll is msec
    #endif

Warning: Unstable Paradigms!

Traditional network programming is built on the general assumption that one socket talks to one connection, one peer. There are multicast protocols, but these are exotic. When we assume "one socket = one connection", we scale our architectures in certain ways. We create threads of logic where each thread works with one socket, one peer. We place intelligence and state in these threads.

In the ZeroMQ universe, sockets are doorways to fast little background communications engines that manage a whole set of connections automagically for you. You can't see, work with, open, close, or attach state to these connections. Whether you use blocking send or receive, or poll, all you can talk to is the socket, not the connections it manages for you. The connections are private and invisible, and this is the key to ZeroMQ's scalability.

This is because your code, talking to a socket, can then handle any number of connections across whatever network protocols are around, without change. A messaging pattern sitting in ZeroMQ scales more cheaply than a messaging pattern sitting in your application code.

So the general assumption no longer applies. As you read the code examples, your brain will try to map them to what you know. You will read "socket" and think "ah, that represents a connection to another node". That is wrong. You will read "thread" and your brain will again think, "ah, a thread represents a connection to another node", and again your brain will be wrong.

If you're reading this Guide for the first time, realize that until you actually write ZeroMQ code for a day or two (and maybe three or four days), you may feel confused, especially by how simple ZeroMQ makes things for you, and you may try to impose that general assumption on ZeroMQ, and it won't work. And then you will experience your moment of enlightenment and trust, that zap-pow-kaboom satori paradigm-shift moment when it all becomes clear.

Chapter 2 - Sockets and Patterns

In Chapter 1 - Basics we took ZeroMQ for a drive, with some basic examples of the main ZeroMQ patterns: request-reply, pub-sub, and pipeline. In this chapter, we're going to get our hands dirty and start to learn how to use these tools in real programs.

We'll cover:

How to create and work with ZeroMQ sockets.
How to send and receive messages on sockets.
How to build your apps around ZeroMQ's asynchronous I/O model.
How to handle multiple sockets in one thread.
How to handle fatal and nonfatal errors properly.
How to handle interrupt signals like Ctrl-C.
How to shut down a ZeroMQ application cleanly.
How to check a ZeroMQ application for memory leaks.
How to send and receive multipart messages.
How to forward messages across networks.
How to build a simple message queuing broker.
How to write multithreaded applications with ZeroMQ.
How to use ZeroMQ to signal between threads.
How to use ZeroMQ to coordinate a network of nodes.
How to create and use message envelopes for pub-sub.
Using the HWM (high-water mark) to protect against memory overflows.

The Socket API

To be perfectly honest, ZeroMQ does a kind of switch-and-bait on you, for which we don't apologize. It's for your own good and it hurts us more than it hurts you. ZeroMQ presents a familiar socket-based API, which requires great effort for us to hide a bunch of message-processing engines. However, the result will slowly fix your world view about how to design and write distributed software.

Sockets are the de facto standard API for network programming, as well as being useful for stopping your eyes from falling onto your cheeks. One thing that makes ZeroMQ especially tasty to developers is that it uses sockets and messages instead of some other arbitrary set of concepts. Kudos to Martin Sustrik for pulling this off. It turns "Message Oriented Middleware", a phrase guaranteed to send the whole room off to Catatonia, into "Extra Spicy Sockets!", which leaves us with a strange craving for pizza and a desire to know more.

Like a favorite dish, ZeroMQ sockets are easy to digest. Sockets have a life in four parts, just like BSD sockets:

Creating and destroying sockets, which go together to form a karmic circle of socket life (see zmq_socket(), zmq_close()).

Configuring sockets by setting options on them and checking them if necessary (see zmq_setsockopt(), zmq_getsockopt()).

Plugging sockets into the network topology by creating ZeroMQ connections to and from them (see zmq_bind(), zmq_connect()).

Using the sockets to carry data by writing and receiving messages on them (see zmq_msg_send(), zmq_msg_recv()).

Note that sockets are always void pointers, and messages (which we'll come to very soon) are structures. So in C you pass sockets as-such, but you pass addresses of messages in all functions that work with messages, like zmq_msg_send() and zmq_msg_recv(). As a mnemonic, realize that "in ZeroMQ, all your sockets are belong to us", but messages are things you actually own in your code.

Creating, destroying, and configuring sockets works as you'd expect for any object. But remember that ZeroMQ is an asynchronous, elastic fabric. This has some impact on how we plug sockets into the network topology and how we use the sockets after that.

Plugging Sockets into the Topology

To create a connection between two nodes, you use zmq_bind() in one node and zmq_connect() in the other. As a general rule of thumb, the node that does zmq_bind() is a "server", sitting on a well-known network address, and the node which does zmq_connect() is a "client", with unknown or arbitrary network addresses. Thus we say that we "bind a socket to an endpoint" and "connect a socket to an endpoint", the endpoint being that well-known network address.

ZeroMQ connections are somewhat different from classic TCP connections. The main notable differences are:

They go across an arbitrary transport (inproc, ipc, tcp, pgm, or epgm). See zmq_inproc(), zmq_ipc(), zmq_tcp(), zmq_pgm(), and zmq_epgm().

One socket may have many outgoing and many incoming connections.

There is no zmq_accept() method. When a socket is bound to an endpoint it automatically starts accepting connections.

The network connection itself happens in the background, and ZeroMQ will automatically reconnect if the network connection is broken (e.g., if the peer disappears and then comes back).

Your application code cannot work with these connections directly; they are encapsulated under the socket.

Many architectures follow some kind of client/server model, where the server is the component that is most static, and the clients are the components that are most dynamic, i.e., they come and go the most. There are sometimes issues of addressing: servers will be visible to clients, but not necessarily vice versa. So mostly it's obvious which node should be doing zmq_bind() (the server) and which should be doing zmq_connect() (the client). It also depends on the kind of sockets you're using, with some exceptions for unusual network architectures. We'll look at socket types later.

Now, imagine we start the client before we start the server. In traditional networking, we get a big red Fail flag. But ZeroMQ lets us start and stop pieces arbitrarily. As soon as the client node does zmq_connect(), the connection exists and that node can start to write messages to the socket. At some stage (hopefully before messages queue up so much that they start to get discarded, or the client blocks), the server comes alive, does a zmq_bind(), and ZeroMQ starts to deliver messages.

A server node can bind to many endpoints (that is, a combination of protocol and address) and it can do this using a single socket. This means it will accept connections across different transports:

    zmq_bind (socket, "tcp://*:5555");
    zmq_bind (socket, "tcp://*:9999");
    zmq_bind (socket, "inproc://somename");

With most transports, you cannot bind to the same endpoint twice, unlike for example in UDP. The ipc transport does, however, let one process bind to an endpoint already used by a first process. It's meant to allow a process to recover after a crash.

Although ZeroMQ tries to be neutral about which side binds and which side connects, there are differences. We'll see these in more detail later. The upshot is that you should usually think in terms of "servers" as static parts of your topology that bind to more or less fixed endpoints, and "clients" as dynamic parts that come and go and connect to these endpoints. Then, design your application around this model. The chances that it will "just work" are much better like that.

Sockets have types. The socket type defines the semantics of the socket, its policies for routing messages inwards and outwards, queuing, etc. You can connect certain types of socket together, e.g., a publisher socket and a subscriber socket. Sockets work together in "messaging patterns". We'll look at this in more detail later.

It's the ability to connect sockets in these different ways that gives ZeroMQ its basic power as a message queuing system. There are layers on top of this, such as proxies, which we'll get to later. But essentially, with ZeroMQ you define your network architecture by plugging pieces together like a child's construction toy.

Sending and Receiving Messages

To send and receive messages you use the zmq_msg_send() and zmq_msg_recv() methods. The names are conventional, but ZeroMQ's I/O model is different enough from the classic TCP model that you will need time to get your head around it.

Figure 9 - TCP sockets are 1 to 1

Let's look at the main differences between TCP sockets and ZeroMQ sockets when it comes to working with data:

ZeroMQ sockets carry messages, like UDP, rather than a stream of bytes as TCP does. A ZeroMQ message is length-specified binary data. We'll come to messages shortly; their design is optimized for performance and so a little tricky.

ZeroMQ sockets do their I/O in a background thread. This means that messages arrive in local input queues and are sent from local output queues, no matter what your application is busy doing.

ZeroMQ sockets have one-to-N routing behavior built-in, according to the socket type.

The zmq_send() method does not actually send the message to the socket connection(s). It queues the message so that the I/O thread can send it asynchronously. It does not block except in some exception cases. So the message is not necessarily sent when zmq_send() returns to your application.

Unicast Transports

ZeroMQ provides a set of unicast transports (inproc, ipc, and tcp) and multicast transports (epgm, pgm). Multicast is an advanced technique that we'll come to later. Don't even start using it unless you know that your fan-out ratios will make 1-to-N unicast impossible.

For most common cases, use tcp, which is a disconnected TCP transport. It is elastic, portable, and fast enough for most cases. We call this disconnected because ZeroMQ's tcp transport doesn't require that the endpoint exists before you connect to it. Clients and servers can connect and bind at any time, can go and come back, and it remains transparent to applications.

The inter-process ipc transport is disconnected, like tcp. It has one limitation: it does not yet work on Windows. By convention we use endpoint names with an ".ipc" extension to avoid potential conflict with other file names. On UNIX systems, if you use ipc endpoints you need to create these with appropriate permissions otherwise they may not be shareable between processes running under different user IDs. You must also make sure all processes can access the files, e.g., by running in the same working directory.

The inter-thread transport, inproc, is a connected signaling transport. It is much faster than tcp or ipc. This transport has a specific limitation compared to tcp and ipc: the server must issue a bind before any client issues a connect. This is something future versions of ZeroMQ may fix, but at present this defines how you use inproc sockets. We create and bind one socket and start the child threads, which create and connect the other sockets.

ZeroMQ is Not a Neutral Carrier

A common question that newcomers to ZeroMQ ask (it's one I've asked myself) is, "how do I write an XYZ server in ZeroMQ?" For example, "how do I write an HTTP server in ZeroMQ?" The implication is that if we use normal sockets to carry HTTP requests and responses, we should be able to use ZeroMQ sockets to do the same, only much faster and better.

The answer used to be "this is not how it works". ZeroMQ is not a neutral carrier: it imposes a framing on the transport protocols it uses. This framing is not compatible with existing protocols, which tend to use their own framing. For example, compare an HTTP request and a ZeroMQ request, both over TCP/IP.

Figure 10 - HTTP on the Wire

The HTTP request uses CR-LF as its simplest framing delimiter, whereas ZeroMQ uses a length-specified frame. So you could write an HTTP-like protocol using ZeroMQ, using for example the request-reply socket pattern. But it would not be HTTP.

Figure 11 - ZeroMQ on the Wire

Since v3.3, however, ZeroMQ has a socket option called ZMQ_ROUTER_RAW that lets you read and write data without the ZeroMQ framing. You could use this to read and write proper HTTP requests and responses. Hardeep Singh contributed this change so that he could connect to Telnet servers from his ZeroMQ application. At time of writing this is still somewhat experimental, but it shows how ZeroMQ keeps evolving to solve new problems. Maybe the next patch will be yours.

I/O Threads

We said that ZeroMQ does I/O in a background thread. One I/O thread (for all sockets) is sufficient for all but the most extreme applications. When you create a new context, it starts with one I/O thread. The general rule of thumb is to allow one I/O thread per gigabyte of data in or out per second. To raise the number of I/O threads, use the zmq_ctx_set() call before creating any sockets:

    int io_threads = 4;
    void *context = zmq_ctx_new ();
    zmq_ctx_set (context, ZMQ_IO_THREADS, io_threads);
    assert (zmq_ctx_get (context, ZMQ_IO_THREADS) == io_threads);

We've seen that one socket can handle dozens, even thousands of connections at once. This has a fundamental impact on how you write applications. A traditional networked application has one process or one thread per remote connection, and that process or thread handles one socket. ZeroMQ lets you collapse this entire structure into a single process and then break it up as necessary for scaling.

If you are using ZeroMQ for inter-thread communications only (i.e., a multithreaded application that does no external socket I/O) you can set the I/O threads to zero. It's not a significant optimization though, more of a curiosity.

Messaging Patterns

Underneath the brown paper wrapping of ZeroMQ's socket API lies the world of messaging patterns. If you have a background in enterprise messaging, or know UDP well, these will be vaguely familiar. But to most ZeroMQ newcomers, they are a surprise. We're so used to the TCP paradigm where a socket maps one-to-one to another node.

Let's recap briefly what ZeroMQ does for you. It delivers blobs of data (messages) to nodes, quickly and efficiently. You can map nodes to threads, processes, or nodes. ZeroMQ gives your applications a single socket API to work with, no matter what the actual transport (like in-process, inter-process, TCP, or multicast). It automatically reconnects to peers as they come and go. It queues messages at both sender and receiver, as needed. It limits these queues to guard processes against running out of memory. It handles socket errors. It does all I/O in background threads. It uses lock-free techniques for talking between nodes, so there are never locks, waits, semaphores, or deadlocks.

But cutting through that, it routes and queues messages according to precise recipes called patterns. It is these patterns that provide ZeroMQ's intelligence. They encapsulate our hard-earned experience of the best ways to distribute data and work. ZeroMQ's patterns are hard-coded but future versions may allow user-definable patterns.

ZeroMQ patterns are implemented by pairs of sockets with matching types. In other words, to understand ZeroMQ patterns you need to understand socket types and how they work together. Mostly, this just takes study; there is little that is obvious at this level.

The built-in core ZeroMQ patterns are:

Request-reply, which connects a set of clients to a set of services. This is a remote procedure call and task distribution pattern.

Pub-sub, which connects a set of publishers to a set of subscribers. This is a data distribution pattern.

Pipeline, which connects nodes in a fan-out/fan-in pattern that can have multiple steps and loops. This is a parallel task distribution and collection pattern.

Exclusive pair, which connects two sockets exclusively. This is a pattern for connecting two threads in a process, not to be confused with "normal" pairs of sockets.

We looked at the first three of these in Chapter 1 - Basics, and we'll see the exclusive pair pattern later in this chapter. The zmq_socket() man page is fairly clear about the patterns; it's worth reading several times until it starts to make sense. These are the socket combinations that are valid for a connect-bind pair (either side can bind):

PUB and SUB
REQ and REP
REQ and ROUTER (take care, REQ inserts an extra null frame)
DEALER and REP (take care, REP assumes a null frame)
DEALER and ROUTER
DEALER and DEALER
ROUTER and ROUTER
PUSH and PULL
PAIR and PAIR

You'll also see references to XPUB and XSUB sockets, which we'll come to later (they're like raw versions of PUB and SUB). Any other combination will produce undocumented and unreliable results, and future versions of ZeroMQ will probably return errors if you try them. You can and will, of course, bridge other socket types via code, i.e., read from one socket type and write to another.
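If you want your own code to refuse a bad combination before ZeroMQ gets the chance to misbehave, the list above fits in a small lookup table. This is only a sketch in plain C: the function name is_valid_socket_pair() is invented here, it works on type names as strings rather than on libzmq's ZMQ_* constants, and it covers only the nine combinations listed above (so XPUB and XSUB are deliberately left out).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* The connect-bind combinations listed in the text, as type-name pairs. */
static const char *const VALID_PAIRS[][2] = {
    { "PUB",    "SUB"    },
    { "REQ",    "REP"    },
    { "REQ",    "ROUTER" },
    { "DEALER", "REP"    },
    { "DEALER", "ROUTER" },
    { "DEALER", "DEALER" },
    { "ROUTER", "ROUTER" },
    { "PUSH",   "PULL"   },
    { "PAIR",   "PAIR"   },
};

/* Returns true if the two socket type names form one of the listed pairs.
   Order does not matter, since either side can bind. */
bool is_valid_socket_pair(const char *first, const char *second)
{
    assert(first != NULL && second != NULL);
    size_t count = sizeof VALID_PAIRS / sizeof VALID_PAIRS[0];
    for (size_t i = 0; i < count; i++) {
        const char *a = VALID_PAIRS[i][0];
        const char *b = VALID_PAIRS[i][1];
        if ((strcmp(first, a) == 0 && strcmp(second, b) == 0)
        ||  (strcmp(first, b) == 0 && strcmp(second, a) == 0))
            return true;
    }
    return false;
}
```

For example, is_valid_socket_pair("SUB", "PUB") is true, while is_valid_socket_pair("PULL", "PUB") is false.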

High-Level Messaging Patterns

These four core patterns are cooked into ZeroMQ. They are part of the ZeroMQ API, implemented in the core C++ library, and are guaranteed to be available in all fine retail stores.

On top of those, we add high-level messaging patterns. We build these high-level patterns on top of ZeroMQ and implement them in whatever language we're using for our application. They are not part of the core library, do not come with the ZeroMQ package, and exist in their own space as part of the ZeroMQ community. For example the Majordomo pattern, which we explore in Chapter 4 - Reliable Request-Reply Patterns, sits in the GitHub Majordomo project in the ZeroMQ organization.

One of the things we aim to provide you with in this book is a set of such high-level patterns, both small (how to handle messages sanely) and large (how to make a reliable pub-sub architecture).

Working with Messages

The libzmq core library has in fact two APIs to send and receive messages. The zmq_send() and zmq_recv() methods that we've already seen and used are simple one-liners. We will use these often, but zmq_recv() is bad at dealing with arbitrary message sizes: it truncates messages to whatever buffer size you provide. So there's a second API that works with zmq_msg_t structures, with a richer but more difficult API:

Initialise a message: zmq_msg_init(), zmq_msg_init_size(), zmq_msg_init_data().
Sending and receiving a message: zmq_msg_send(), zmq_msg_recv().
Release a message: zmq_msg_close().
Access message content: zmq_msg_data(), zmq_msg_size(), zmq_msg_more().
Work with message properties: zmq_msg_get(), zmq_msg_set().
Message manipulation: zmq_msg_copy(), zmq_msg_move().

On the wire, ZeroMQ messages are blobs of any size from zero upwards that fit in memory. You do your own serialization using protocol buffers, msgpack, JSON, or whatever else your applications need to speak. It's wise to choose a data representation that is portable, but you can make your own decisions about trade-offs.

In memory, ZeroMQ messages are zmq_msg_t structures (or classes depending on your language). Here are the basic ground rules for using ZeroMQ messages in C:

You create and pass around zmq_msg_t objects, not blocks of data.

To read a message, you use zmq_msg_init() to create an empty message, and then you pass that to zmq_msg_recv().

To write a message from new data, you use zmq_msg_init_size() to create a message and at the same time allocate a block of data of some size. You then fill that data using memcpy, and pass the message to zmq_msg_send().

To release (not destroy) a message, you call zmq_msg_close(). This drops a reference, and eventually ZeroMQ will destroy the message.

To access the message content, you use zmq_msg_data(). To know how much data the message contains, use zmq_msg_size().

Do not use zmq_msg_move(), zmq_msg_copy(), or zmq_msg_init_data() unless you read the man pages and know precisely why you need these.

After you pass a message to zmq_msg_send(), ZeroMQ will clear the message, i.e., set the size to zero. You cannot send the same message twice, and you cannot access the message data after sending it.

These rules don't apply if you use zmq_send() and zmq_recv(), to which you pass byte arrays, not message structures.

If you want to send the same message more than once, and it's sizable, create a second message, initialize it using zmq_msg_init(), and then use zmq_msg_copy() to create a copy of the first message. This does not copy the data but copies a reference. You can then send the message twice (or more, if you create more copies) and the message will only be finally destroyed when the last copy is sent or closed.

ZeroMQ also supports multipart messages, which let you send or receive a list of frames as a single on-the-wire message. This is widely used in real applications and we'll look at that later in this chapter and in Chapter 3 - Advanced Request-Reply Patterns.

Frames (also called "message parts" in the ZeroMQ reference manual pages) are the basic wire format for ZeroMQ messages. A frame is a length-specified block of data. The length can be zero upwards. If you've done any TCP programming you'll appreciate why frames are a useful answer to the question "how much data am I supposed to read off this network socket now?"

There is a wire-level protocol called ZMTP that defines how ZeroMQ reads and writes frames on a TCP connection. If you're interested in how this works, the spec is quite short.

Originally, a ZeroMQ message was one frame, like UDP. We later extended this with multipart messages, which are quite simply series of frames with a "more" bit set to one, followed by one with that bit set to zero. The ZeroMQ API then lets you write messages with a "more" flag and when you read messages, it lets you check if there's "more".
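The "more" bit is easier to picture with a toy encoder. The sketch below is plain C with a made-up layout: one flags byte, then a four-byte big-endian length, then the body. That layout is an assumption for illustration and is deliberately not the ZMTP wire format, which has its own spec. The names toy_frame_write() and toy_frame_read() are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Toy frame layout (NOT the real ZMTP format):
     byte 0      flags, bit 0 set means "more frames follow"
     bytes 1-4   body length, big-endian
     bytes 5..   body                                           */
#define TOY_FLAG_MORE   0x01
#define TOY_HEADER_SIZE 5

/* Write one frame into out. Returns the number of bytes written,
   or 0 if the frame does not fit in out_cap bytes. */
size_t toy_frame_write(uint8_t *out, size_t out_cap,
                       const void *body, uint32_t body_len, int more)
{
    assert(out != NULL && (body != NULL || body_len == 0));
    if (out_cap < TOY_HEADER_SIZE || out_cap - TOY_HEADER_SIZE < body_len)
        return 0;
    out[0] = more ? TOY_FLAG_MORE : 0;
    out[1] = (uint8_t) (body_len >> 24);
    out[2] = (uint8_t) (body_len >> 16);
    out[3] = (uint8_t) (body_len >> 8);
    out[4] = (uint8_t) body_len;
    if (body_len > 0)
        memcpy(out + TOY_HEADER_SIZE, body, body_len);
    return TOY_HEADER_SIZE + (size_t) body_len;
}

/* Read one frame from in. Returns the number of bytes consumed, or 0 if
   the buffer does not yet hold a complete frame. On success *body points
   into the input buffer, so nothing is copied. */
size_t toy_frame_read(const uint8_t *in, size_t in_len,
                      const uint8_t **body, uint32_t *body_len, int *more)
{
    assert(in != NULL && body != NULL && body_len != NULL && more != NULL);
    if (in_len < TOY_HEADER_SIZE)
        return 0;
    uint32_t len = ((uint32_t) in[1] << 24) | ((uint32_t) in[2] << 16)
                 | ((uint32_t) in[3] << 8)  |  (uint32_t) in[4];
    if (in_len - TOY_HEADER_SIZE < len)
        return 0;
    *more = (in[0] & TOY_FLAG_MORE) != 0;
    *body = in + TOY_HEADER_SIZE;
    *body_len = len;
    return TOY_HEADER_SIZE + (size_t) len;
}
```

A reader loops on toy_frame_read() until it sees a frame whose "more" flag is zero, at which point it has one whole message. Because the length comes first, it always knows how many bytes to wait for.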

In the low-level ZeroMQ API and the reference manual, therefore, there's some fuzziness about messages versus frames. So here's a useful lexicon:

A message can be one or more parts.
These parts are also called "frames".
Each part is a zmq_msg_t object.
You send and receive each part separately, in the low-level API.
Higher-level APIs provide wrappers to send entire multipart messages.

Some other things that are worth knowing about messages:

You may send zero-length messages, e.g., for sending a signal from one thread to another.

ZeroMQ guarantees to deliver all the parts (one or more) for a message, or none of them.

ZeroMQ does not send the message (single or multipart) right away, but at some indeterminate later time. A multipart message must therefore fit in memory.

A message (single or multipart) must fit in memory. If you want to send files of arbitrary sizes, you should break them into pieces and send each piece as separate single-part messages. Using multipart data will not reduce memory consumption.

You must call zmq_msg_close() when finished with a received message, in languages that don't automatically destroy objects when a scope closes. You don't call this method after sending a message.

And to be repetitive, do not use zmq_msg_init_data() yet. This is a zero-copy method and is guaranteed to create trouble for you. There are far more important things to learn about ZeroMQ before you start to worry about shaving off microseconds.

This rich API can be tiresome to work with. The methods are optimized for performance, not simplicity. If you start using these you will almost definitely get them wrong until you've read the man pages with some care. So one of the main jobs of a good language binding is to wrap this API up in classes that are easier to use.

Handling Multiple Sockets

In all the examples so far, the main loop of most examples has been:

1. Wait for message on socket.
2. Process message.
3. Repeat.

What if we want to read from multiple endpoints at the same time? The simplest way is to connect one socket to all the endpoints and get ZeroMQ to do the fan-in for us. This is legal if the remote endpoints are in the same pattern, but it would be wrong to connect a PULL socket to a PUB endpoint.

To actually read from multiple sockets all at once, use zmq_poll(). An even better way might be to wrap zmq_poll() in a framework that turns it into a nice event-driven reactor, but it's significantly more work than we want to cover here.

Let's start with a dirty hack, partly for the fun of not doing it right, but mainly because it lets me show you how to do non-blocking socket reads. Here is a simple example of reading from two sockets using non-blocking reads. This rather confused program acts both as a subscriber to weather updates, and a worker for parallel tasks:

msreader: Multiple socket reader in C


The cost of this approach is some additional latency on the first message (the sleep at the end of the loop, when there are no waiting messages to process). This would be a problem in applications where submillisecond latency was vital. Also, you need to check the documentation for nanosleep() or whatever function you use to make sure it does not busy-loop.

You can treat the sockets fairly by reading first from one, then the second rather than prioritizing them as we did in this example.

Now let's see the same senseless little application done right, using zmq_poll():

mspoller: Multiple socket poller in C


The items structure has these four members:


    typedef struct {
        void *socket;       //  ZeroMQ socket to poll on
        int fd;             //  OR, native file handle to poll on
        short events;       //  Events to poll on
        short revents;      //  Events returned after poll
    } zmq_pollitem_t;

Multipart Messages

ZeroMQ lets us compose a message out of several frames, giving us a "multipart message". Realistic applications use multipart messages heavily, both for wrapping messages with address information and for simple serialization. We'll look at reply envelopes later.

What we'll learn now is simply how to blindly and safely read and write multipart messages in any application (such as a proxy) that needs to forward messages without inspecting them.

When you work with multipart messages, each part is a zmq_msg item. E.g., if you are sending a message with five parts, you must construct, send, and destroy five zmq_msg items. You can do this in advance (and store the zmq_msg items in an array or other structure), or as you send them, one-by-one.

Here is how we send the frames in a multipart message (each frame is held in its own message object):

    zmq_msg_send (&message, socket, ZMQ_SNDMORE);

    zmq_msg_send (&message, socket, ZMQ_SNDMORE);

    zmq_msg_send (&message, socket, 0);

Here is how we receive and process all the parts in a message, be it single part or multipart:

    while (1) {
        zmq_msg_t message;
        zmq_msg_init (&message);
        zmq_msg_recv (&message, socket, 0);
        //  Process the message frame

        int more = zmq_msg_more (&message);    //  Query before closing the frame
        zmq_msg_close (&message);
        if (!more)
            break;      //  Last message frame
    }

Some things to know about multipart messages:

When you send a multipart message, the first part (and all following parts) are only actually sent on the wire when you send the final part.
If you are using zmq_poll(), when you receive the first part of a message, all the rest has also arrived.
You will receive all parts of a message, or none at all.
Each part of a message is a separate zmq_msg item.
You will receive all parts of a message whether or not you check the more property.
On sending, ZeroMQ queues message frames in memory until it is given the last one, then sends them all.
There is no way to cancel a partially sent message, except by closing the socket.

Intermediaries and Proxies

ZeroMQ aims for decentralized intelligence, but that doesn't mean your network is empty space in the middle. It's filled with message-aware infrastructure and quite often, we build that infrastructure with ZeroMQ. The ZeroMQ plumbing can range from tiny pipes to full-blown service-oriented brokers. The messaging industry calls this intermediation, meaning that the stuff in the middle deals with either side. In ZeroMQ, we call these proxies, queues, forwarders, devices, or brokers, depending on the context.

This pattern is extremely common in the real world and is why our societies and economies are filled with intermediaries who have no other real function than to reduce the complexity and scaling costs of larger networks. Real-world intermediaries are typically called wholesalers, distributors, managers, and so on.

The Dynamic Discovery Problem

One of the problems you will hit as you design larger distributed architectures is discovery. That is, how do pieces know about each other? It's especially difficult if pieces come and go, so we call this the "dynamic discovery problem".

There are several solutions to dynamic discovery. The simplest is to entirely avoid it by hard-coding (or configuring) the network architecture so discovery is done by hand. That is, when you add a new piece, you reconfigure the network to know about it.

Figure 12 - Small-Scale Pub-Sub Network

In practice, this leads to increasingly fragile and unwieldy architectures. Let's say you have one publisher and a hundred subscribers. You connect each subscriber to the publisher by configuring a publisher endpoint in each subscriber. That's easy. Subscribers are dynamic; the publisher is static. Now say you add more publishers. Suddenly, it's not so easy anymore. If you continue to connect each subscriber to each publisher, the cost of avoiding dynamic discovery gets higher and higher.

Figure 13 - Pub-Sub Network with a Proxy


There are quite a few answers to this, but the very simplest answer is to add an intermediary; that is, a static point in the network to which all other nodes connect. In classic messaging, this is the job of the message broker. ZeroMQ doesn't come with a message broker as such, but it lets us build intermediaries quite easily.

You might wonder, if all networks eventually get large enough to need intermediaries, why don't we simply have a message broker in place for all applications? For beginners, it's a fair compromise. Just always use a star topology, forget about performance, and things will usually work. However, message brokers are greedy things; in their role as central intermediaries, they become too complex, too stateful, and eventually a problem.

It's better to think of intermediaries as simple stateless message switches. A good analogy is an HTTP proxy; it's there, but doesn't have any special role. Adding a pub-sub proxy solves the dynamic discovery problem in our example. We set the proxy in the "middle" of the network. The proxy opens an XSUB socket, an XPUB socket, and binds each to well-known IP addresses and ports. Then, all other processes connect to the proxy, instead of to each other. It becomes trivial to add more subscribers or publishers.

Figure 14 - Extended Pub-Sub


We need XPUB and XSUB sockets because ZeroMQ does subscription forwarding from subscribers to publishers. XSUB and XPUB are exactly like SUB and PUB except they expose subscriptions as special messages. The proxy has to forward these subscription messages from the subscriber side to the publisher side, by reading them from the XPUB socket (the one the subscribers connect to) and writing them to the XSUB socket (the one the publishers connect to). This is the main use case for XSUB and XPUB.

Shared Queue (DEALER and ROUTER sockets)

In the Hello World client/server application, we have one client that talks to one service. However, in real cases we usually need to allow multiple services as well as multiple clients. This lets us scale up the power of the service (many threads or processes or nodes rather than just one). The only constraint is that services must be stateless, all state being in the request or in some shared storage such as a database.

Figure 15 - Request Distribution

There are two ways to connect multiple clients to multiple servers. The brute force way is to connect each client socket to multiple service endpoints. One client socket can connect to multiple service sockets, and the REQ socket will then distribute requests among these services. Let's say you connect a client socket to three service endpoints: A, B, and C. The client makes requests R1, R2, R3, R4. R1 and R4 go to service A, R2 goes to B, and R3 goes to service C.
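
The distribution in that example is plain round-robin. As a toy model only (this is not ZeroMQ code, and service_for_request is a made-up name), the mapping from request number to service can be written as below, assuming all services stay connected for the whole run.

```c
#include <assert.h>

/* Toy model: maps the n-th request (counting from 1) to a service letter,
   assuming round-robin over service_count services named 'A', 'B', ... */
static char service_for_request(int request_number, int service_count)
{
    assert(request_number >= 1);
    assert(service_count >= 1 && service_count <= 26);
    return (char) ('A' + (request_number - 1) % service_count);
}
```

With three services, requests 1 through 4 map to A, B, C, and A again, matching R1 to R4 above.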

This design lets you add more clients cheaply. You can also add more services. Each client will distribute its requests to the services. But each client has to know the service topology. If you have 100 clients and then you decide to add three more services, you need to reconfigure and restart 100 clients in order for the clients to know about the three new services.

That's clearly not the kind of thing we want to be doing at 3 a.m. when our supercomputing cluster has run out of resources and we desperately need to add a couple of hundred new service nodes. Too many static pieces are like liquid concrete: knowledge is distributed and the more static pieces you have, the more effort it is to change the topology. What we want is something sitting in between clients and services that centralizes all knowledge of the topology. Ideally, we should be able to add and remove services or clients at any time without touching any other part of the topology.

So we'll write a little message queuing broker that gives us this flexibility. The broker binds to two endpoints, a frontend for clients and a backend for services. It then uses zmq_poll() to monitor these two sockets for activity and when it has some, it shuttles messages between its two sockets. It doesn't actually manage any queues explicitly; ZeroMQ does that automatically on each socket.

When you use REQ to talk to REP, you get a strictly synchronous request-reply dialog. The client sends a request. The service reads the request and sends a reply. The client then reads the reply. If either the client or the service tries to do anything else (e.g., sending two requests in a row without waiting for a response), it will get an error.

But our broker has to be nonblocking. Obviously, we can use zmq_poll() to wait for activity on either socket, but we can't use REP and REQ.

Figure 16 - Extended Request-Reply

Luckily, there are two sockets called DEALER and ROUTER that let you do nonblocking request-response. You'll see in Chapter 3 - Advanced Request-Reply Patterns how DEALER and ROUTER sockets let you build all kinds of asynchronous request-reply flows. For now, we're just going to see how DEALER and ROUTER let us extend REQ-REP across an intermediary, that is, our little broker.

In this simple extended request-reply pattern, REQ talks to ROUTER and DEALER talks to REP. In between the DEALER and ROUTER, we have to have code (like our broker) that pulls messages off the one socket and shoves them onto the other.

The request-reply broker binds to two endpoints, one for clients to connect to (the frontend socket) and one for workers to connect to (the backend). To test this broker, you will want to change your workers so they connect to the backend socket. Here is a client that shows what I mean:

rrclient: Request-reply client in C

Here is the worker:

rrworker: Request-reply worker in C

And here is the broker, which properly handles multipart messages:

rrbroker: Request-reply broker in C

Figure 17 - Request-Reply Broker

Using a request-reply broker makes your client/server architectures easier to scale because clients don't see workers, and workers don't see clients. The only static node is the broker in the middle.

ZeroMQ's Built-In Proxy Function

It turns out that the core loop in the previous section's rrbroker is very useful, and reusable. It lets us build pub-sub forwarders and shared queues and other little intermediaries with very little effort. ZeroMQ wraps this up in a single method, zmq_proxy():

    zmq_proxy (frontend, backend, capture);

The two sockets (or three, if we want to capture data) must be properly connected, bound, and configured. When we call the zmq_proxy method, it's exactly like starting the main loop of rrbroker. Let's rewrite the request-reply broker to call zmq_proxy, and re-badge this as an expensive-sounding "message queue" (people have charged houses for code that did less):

msgqueue: Message queue broker in C

If you're like most ZeroMQ users, at this stage your mind is starting to think, "What kind of evil stuff can I do if I plug random socket types into the proxy?" The short answer is: try it and work out what is happening. In practice, you would usually stick to ROUTER/DEALER, XSUB/XPUB, or PULL/PUSH.

Transport Bridging

A frequent request from ZeroMQ users is, "How do I connect my ZeroMQ network with technology X?" where X is some other networking or messaging technology.

Figure 18 - Pub-Sub Forwarder Proxy

The simple answer is to build a bridge. A bridge is a small application that speaks one protocol at one socket, and converts to/from a second protocol at another socket. A protocol interpreter, if you like. A common bridging problem in ZeroMQ is to bridge two transports or networks.

As an example, we're going to write a little proxy that sits in between a publisher and a set of subscribers, bridging two networks. The frontend socket (SUB) faces the internal network where the weather server is sitting, and the backend (PUB) faces subscribers on the external network. It subscribes to the weather service on the frontend socket, and republishes its data on the backend socket.

wuproxy: Weather update proxy in C

It looks very similar to the earlier proxy example, but the key part is that the frontend and backend sockets are on two different networks. We can use this model, for example, to connect a multicast network (pgm transport) to a tcp publisher.

Handling Errors and ETERM

ZeroMQ's error handling philosophy is a mix of fail-fast and resilience. Processes, we believe, should be as vulnerable as possible to internal errors, and as robust as possible against external attacks and errors. To give an analogy, a living cell will self-destruct if it detects a single internal error, yet it will resist attack from the outside by all means possible.

Assertions, which pepper the ZeroMQ code, are absolutely vital to robust code; they just have to be on the right side of the cellular wall. And there should be such a wall. If it is unclear whether a fault is internal or external, that is a design flaw to be fixed. In C/C++, assertions stop the application immediately with an error. In other languages, you may get exceptions or halts.

When ZeroMQ detects an external fault it returns an error to the calling code. In some rare cases, it drops messages silently if there is no obvious strategy for recovering from the error.

In most of the C examples we've seen so far there's been no error handling. Real code should do error handling on every single ZeroMQ call. If you're using a language binding other than C, the binding may handle errors for you. In C, you do need to do this yourself. There are some simple rules, starting with POSIX conventions:

Methods that create objects return NULL if they fail.
Methods that process data may return the number of bytes processed, or -1 on an error or failure.
Other methods return 0 on success and -1 on an error or failure.
The error code is provided in errno or zmq_errno().
A descriptive error text for logging is provided by zmq_strerror().

For example:

    void *context = zmq_ctx_new ();
    assert (context);
    void *socket = zmq_socket (context, ZMQ_REP);
    assert (socket);
    int rc = zmq_bind (socket, "tcp://*:5555");
    if (rc == -1) {
        printf ("E: bind failed: %s\n", strerror (errno));
        return -1;
    }

There are two main exceptional conditions that you should handle as nonfatal:

When your code receives a message with the ZMQ_DONTWAIT option and there is no waiting data, ZeroMQ will return -1 and set errno to EAGAIN.

When one thread calls zmq_ctx_destroy(), and other threads are still doing blocking work, the zmq_ctx_destroy() call closes the context and all blocking calls exit with -1, and errno set to ETERM.

In C/C++, asserts can be removed entirely in optimized builds (those compiled with NDEBUG defined), so don't make the mistake of wrapping the whole ZeroMQ call in an assert(). It looks neat; then the build strips out all the asserts along with the calls you want to make, and your application breaks in impressive ways.
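
The effect is easy to reproduce without ZeroMQ at all. In this sketch, do_work() is a hypothetical stand-in for a call such as zmq_bind(), and NDEBUG is defined by hand to simulate an optimized build; the call counter shows which form still makes the call.

```c
#include <stdio.h>

static int calls = 0;

/* Stand-in for a ZeroMQ call that must always run and returns 0 on success */
static int do_work(void)
{
    calls++;
    return 0;
}

#define NDEBUG              /* simulate an optimized build */
#include <assert.h>

/* Wrong: with NDEBUG defined, the whole expression is discarded,
   so do_work() is never called. */
static void wrong_way(void)
{
    assert(do_work() == 0);
}

/* Right: make the call unconditionally, then assert on its result. */
static void right_way(void)
{
    int rc = do_work();
    assert(rc == 0);
    (void) rc;              /* avoids an unused-variable warning */
}

#undef NDEBUG
#include <assert.h>         /* re-including restores a working assert() */
```

Calling wrong_way() leaves calls unchanged, while right_way() increments it, which is why the examples in this Guide assign the return code first and assert on it afterwards.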

Figure 19 - Parallel Pipeline with Kill Signaling

Let's see how to shut down a process cleanly. We'll take the parallel pipeline example from the previous section. If we've started a whole lot of workers in the background, we now want to kill them when the batch is finished. Let's do this by sending a kill message to the workers. The best place to do this is the sink because it really knows when the batch is done.

How do we connect the sink to the workers? The PUSH/PULL sockets are one-way only. We could switch to another socket type, or we could mix multiple socket flows. Let's try the latter: using a pub-sub model to send kill messages to the workers:

The sink creates a PUB socket on a new endpoint.
Workers connect a second input socket (a SUB socket) to this endpoint.
When the sink detects the end of the batch, it sends a kill to its PUB socket.
When a worker detects this kill message, it exits.

It doesn't take much new code in the sink:

    void *controller = zmq_socket (context, ZMQ_PUB);
    zmq_bind (controller, "tcp://*:5559");

    //  Send kill signal to workers
    s_send (controller, "KILL");

Here is the worker process, which manages two sockets (a PULL socket getting tasks, and a SUB socket getting control commands), using the zmq_poll() technique we saw earlier:

taskwork2: Parallel task worker with kill signaling in C

Here is the modified sink application. When it's finished collecting results, it broadcasts a kill message to all workers:

tasksink2: Parallel task sink with kill signaling in C

Handling Interrupt Signals

Realistic applications need to shut down cleanly when interrupted with Ctrl-C or another signal such as SIGTERM. By default, these simply kill the process, meaning messages won't be flushed, files won't be closed cleanly, and so on.

Here is how we handle a signal in various languages:

interrupt: Handling Ctrl-C cleanly in C

The program provides s_catch_signals(), which traps Ctrl-C (SIGINT) and SIGTERM. When either of these signals arrives, the s_catch_signals() handler sets the global variable s_interrupted. Thanks to your signal handler, your application will not die automatically. Instead, you have a chance to clean up and exit gracefully. You now have to check explicitly for an interrupt and handle it properly. Do this by calling s_catch_signals() (copy this from interrupt.c) at the start of your main code. This sets up the signal handling. The interrupt will affect ZeroMQ calls as follows:

If your code is blocking in a blocking call (sending a message, receiving a message, or polling), then when a signal arrives, the call will return with EINTR.
Wrappers like s_recv() return NULL if they are interrupted.

So check for an EINTR return code, a NULL return, and/or s_interrupted.

Here is a typical code fragment:

    s_catch_signals ();
    client = zmq_socket (...);
    while (!s_interrupted) {
        char *message = s_recv (client);
        if (!message)
            break;          //  Ctrl-C used
        free (message);     //  s_recv() returns a newly allocated string
    }
    zmq_close (client);

If you call s_catch_signals() and don't test for interrupts, then your application will become immune to Ctrl-C and SIGTERM, which may be useful, but is usually not.
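
For readers who don't have interrupt.c to hand, a handler along these lines would behave as described. This is a POSIX-only sketch written for illustration and is not guaranteed to match the code in interrupt.c; it assumes sigaction() is available and deliberately leaves SA_RESTART unset so that blocking calls return with EINTR.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stddef.h>

/* Set from the signal handler, polled from the main loop */
static volatile sig_atomic_t s_interrupted = 0;

static void s_signal_handler(int signal_value)
{
    (void) signal_value;
    s_interrupted = 1;      /* only set a flag; clean up in the main loop */
}

/* Installs the handler for Ctrl-C (SIGINT) and SIGTERM */
static void s_catch_signals(void)
{
    struct sigaction action;
    action.sa_handler = s_signal_handler;
    action.sa_flags = 0;    /* no SA_RESTART: blocking calls fail with EINTR */
    sigemptyset(&action.sa_mask);

    int rc = sigaction(SIGINT, &action, NULL);
    assert(rc == 0);
    rc = sigaction(SIGTERM, &action, NULL);
    assert(rc == 0);
    (void) rc;
}
```

The handler does nothing except set the flag, which keeps it async-signal-safe; all closing of sockets and contexts happens in normal code once the loop notices s_interrupted.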

Detecting Memory Leaks

Any long-running application has to manage memory correctly, or eventually it'll use up all available memory and crash. If you use a language that handles this automatically for you, congratulations. If you program in C or C++ or any other language where you're responsible for memory management, here's a short tutorial on using valgrind, which among other things will report on any leaks your programs have.

To install valgrind, e.g., on Ubuntu or Debian, issue this command:

    sudo apt-get install valgrind

By default, ZeroMQ will cause valgrind to complain a lot. To remove these warnings, create a file called vg.supp that contains this:

{
<socketcall_sendto>
Memcheck:Param
socketcall.sendto(msg)
fun:send
...
}
{
<socketcall_sendto>
Memcheck:Param
socketcall.send(msg)
fun:send
...
}

Fix your applications to exit cleanly after Ctrl-C. For any application that exits by itself, that's not needed, but for long-running applications, this is essential, otherwise valgrind will complain about all currently allocated memory.

Build your application with -DDEBUG if it's not your default setting. That ensures valgrind can tell you exactly where memory is being leaked (valgrind can only report file and line numbers for code compiled with debugging information, e.g., gcc's -g option).

Finally, run valgrind thus:

    valgrind --tool=memcheck --leak-check=full --suppressions=vg.supp someprog

And after fixing any errors it reported, you should get the pleasant message:

    ==30536== ERROR SUMMARY: 0 errors from 0 contexts...

Multithreading with ZeroMQ

ZeroMQ is perhaps the nicest way ever to write multithreaded (MT) applications. Whereas ZeroMQ sockets require some readjustment if you are used to traditional sockets, ZeroMQ multithreading will take everything you know about writing MT applications, throw it into a heap in the garden, pour gasoline over it, and set it alight. It's a rare book that deserves burning, but most books on concurrent programming do.

To make utterly perfect MT programs (and I mean that literally), we don't need mutexes, locks, or any other form of inter-thread communication except messages sent across ZeroMQ sockets.

By "perfect MT programs", I mean code that's easy to write and understand, that works with the same design approach in any programming language, and on any operating system, and that scales across any number of CPUs with zero wait states and no point of diminishing returns.

If you've spent years learning tricks to make your MT code work at all, let alone rapidly, with locks and semaphores and critical sections, you will be disgusted when you realize it was all for nothing. If there's one lesson we've learned from 30+ years of concurrent programming, it is: just don't share state. It's like two drunkards trying to share a beer. It doesn't matter if they're good buddies. Sooner or later, they're going to get into a fight. And the more drunkards you add to the table, the more they fight each other over the beer. The tragic majority of MT applications look like drunken bar fights.

The list of weird problems that you need to fight as you write classic shared-state MT code would be hilarious if it didn't translate directly into stress and risk, as code that seems to work suddenly fails under pressure. A large firm with world-beating experience in buggy code released its list of "11 Likely Problems In Your Multithreaded Code", which covers forgotten synchronization, incorrect granularity, read and write tearing, lock-free reordering, lock convoys, two-step dance, and priority inversion.

Yeah, we counted seven problems, not eleven. That's not the point though. The point is, do you really want that code running the power grid or stock market to start getting two-step lock convoys at 3 p.m. on a busy Thursday? Who cares what the terms actually mean? This is not what turned us on to programming, fighting ever more complex side effects with ever more complex hacks.

Some widely used models, despite being the basis for entire industries, are fundamentally broken, and shared state concurrency is one of them. Code that wants to scale without limit does it like the Internet does, by sending messages and sharing nothing except a common contempt for broken programming models.

You should follow some rules to write happy multithreaded code with ZeroMQ:

Isolate data privately within its thread and never share data in multiple threads. The only exception to this is ZeroMQ contexts, which are threadsafe.

Stay away from the classic concurrency mechanisms such as mutexes, critical sections, semaphores, etc. These are an anti-pattern in ZeroMQ applications.

Create one ZeroMQ context at the start of your process, and pass that to all threads that you want to connect via inproc sockets.

Use attached threads to create structure within your application, and connect these to their parent threads using PAIR sockets over inproc. The pattern is: bind parent socket, then create child thread which connects its socket.

Use detached threads to simulate independent tasks, with their own contexts. Connect these over tcp. Later you can move these to stand-alone processes without changing the code significantly.

All interaction between threads happens as ZeroMQ messages, which you can define more or less formally.

Don't share ZeroMQ sockets between threads. ZeroMQ sockets are not threadsafe. Technically it's possible to migrate a socket from one thread to another but it demands skill. The only place where it's remotely sane to share sockets between threads is in language bindings that need to do magic like garbage collection on sockets.

If you need to start more than one proxy in an application, for example, you will want to run each in its own thread. It is easy to make the error of creating the proxy frontend and backend sockets in one thread, and then passing the sockets to the proxy in another thread. This may appear to work at first but will fail randomly in real use. Remember: Do not use or close sockets except in the thread that created them.

If you follow these rules, you can quite easily build elegant multithreaded applications, and later split off threads into separate processes as you need to. Application logic can sit in threads, processes, or nodes: whatever your scale needs.

ZeroMQ uses native OS threads rather than virtual "green" threads. The advantage is that you don't need to learn any new threading API, and that ZeroMQ threads map cleanly to your operating system. You can use standard tools like Intel's ThreadChecker to see what your application is doing. The disadvantages are that native threading APIs are not always portable, and that if you have a huge number of threads (in the thousands), some operating systems will get stressed.

Let's see how this works in practice. We'll turn our old Hello World server into something more capable. The original server ran in a single thread. If the work per request is low, that's fine: one ZeroMQ thread can run at full speed on a CPU core, with no waits, doing an awful lot of work. But realistic servers have to do nontrivial work per request. A single core may not be enough when 10,000 clients hit the server all at once. So a realistic server will start multiple worker threads. It then accepts requests as fast as it can and distributes these to its worker threads. The worker threads grind through the work and eventually send their replies back.

You can, of course, do all this using a proxy broker and external worker processes, but often it's easier to start one process that gobbles up sixteen cores than sixteen processes, each gobbling up one core. Further, running workers as threads will cut out a network hop, latency, and network traffic.

The MT version of the Hello World service basically collapses the broker and workers into a single process:

mtserver: Multithreaded service in C

Figure 20 - Multithreaded Server

All the code should be recognizable to you by now. How it works:

The server starts a set of worker threads. Each worker thread creates a REP socket and then processes requests on this socket. Worker threads are just like single-threaded servers. The only differences are the transport (inproc instead of tcp), and the bind-connect direction.

The server creates a ROUTER socket to talk to clients and binds this to its external interface (over tcp).

The server creates a DEALER socket to talk to the workers and binds this to its internal interface (over inproc).

The server starts a proxy that connects the two sockets. The proxy pulls incoming requests fairly from all clients, and distributes those out to workers. It also routes replies back to their origin.

Note that creating threads is not portable in most programming languages. The POSIX library is pthreads, but on Windows you have to use a different API. In our example, the pthread_create call starts up a new thread running the worker_routine function we defined. We'll see in Chapter 3 - Advanced Request-Reply Patterns how to wrap this in a portable API.

Here the "work" is just a one-second pause. We could do anything in the workers, including talking to other nodes. This is what the MT server looks like in terms of ZeroMQ sockets and nodes. Note how the request-reply chain is REQ-ROUTER-queue-DEALER-REP.

Signaling Between Threads (PAIR Sockets)

When you start making multithreaded applications with ZeroMQ, you'll encounter the question of how to coordinate your threads. Though you might be tempted to insert "sleep" statements, or use multithreading techniques such as semaphores or mutexes, the only mechanism that you should use is ZeroMQ messages. Remember the story of The Drunkards and The Beer Bottle.

Let's make three threads that signal each other when they are ready. In this example, we use PAIR sockets over the inproc transport:

mtrelay: Multithreaded relay in C

Figure 21 - The Relay Race

This is a classic pattern for multithreading with ZeroMQ:

1. Two threads communicate over inproc, using a shared context.
2. The parent thread creates one socket, binds it to an inproc:// endpoint, and then starts the child thread, passing the context to it.
3. The child thread creates the second socket, connects it to that inproc:// endpoint, and then signals to the parent thread that it's ready.

Note that multithreading code using this pattern is not scalable out to processes. If you use inproc and socket pairs, you are building a tightly-bound application, i.e., one where your threads are structurally interdependent. Do this when low latency is really vital. The other design pattern is a loosely bound application, where threads have their own context and communicate over ipc or tcp. You can easily break loosely bound threads into separate processes.

This is the first time we've shown an example using PAIR sockets. Why use PAIR? Other socket combinations might seem to work, but they all have side effects that could interfere with signaling:

You can use PUSH for the sender and PULL for the receiver. This looks simple and will work, but remember that PUSH will distribute messages to all available receivers. If you by accident start two receivers (e.g., you already have one running and you start a second), you'll "lose" half of your signals. PAIR has the advantage of refusing more than one connection; the pair is exclusive.

You can use DEALER for the sender and ROUTER for the receiver. ROUTER, however, wraps your message in an "envelope", meaning your zero-size signal turns into a multipart message. If you don't care about the data and treat anything as a valid signal, and if you don't read more than once from the socket, that won't matter. If, however, you decide to send real data, you will suddenly find ROUTER providing you with "wrong" messages. DEALER also distributes outgoing messages, giving the same risk as PUSH.

You can use PUB for the sender and SUB for the receiver. This will correctly deliver your messages exactly as you sent them and PUB does not distribute as PUSH or DEALER do. However, you need to configure the subscriber with an empty subscription, which is annoying.

For these reasons, PAIR makes the best choice for coordination between pairs of threads.

Node Coordination

When you want to coordinate a set of nodes on a network, PAIR sockets won't work well any more. This is one of the few areas where the strategies for threads and nodes are different. Principally, nodes come and go whereas threads are usually static. PAIR sockets do not automatically reconnect if the remote node goes away and comes back.

Figure 22 - Pub-Sub Synchronization

The second significant difference between threads and nodes is that you typically have a fixed number of threads but a more variable number of nodes. Let's take one of our earlier scenarios (the weather server and clients) and use node coordination to ensure that subscribers don't lose data when starting up.

This is how the application will work:

The publisher knows in advance how many subscribers it expects. This is just a magic number it gets from somewhere.

The publisher starts up and waits for all subscribers to connect. This is the node coordination part. Each subscriber subscribes and then tells the publisher it's ready via another socket.

When the publisher has all subscribers connected, it starts to publish data.

In this case, we'll use a REQ-REP socket flow to synchronize subscribers and publisher. Here is the publisher:

syncpub: Synchronized publisher in C

And here is the subscriber:

syncsub: Synchronized subscriber in C

This Bash shell script will start ten subscribers and then the publisher:

    echo "Starting subscribers..."
    for ((a=0; a<10; a++)); do
        syncsub &
    done
    echo "Starting publisher..."
    syncpub

Which gives us this satisfying output:

    Starting subscribers...
    Starting publisher...
    Received 1000000 updates
    Received 1000000 updates
    ...
    Received 1000000 updates
    Received 1000000 updates

We can't assume that the SUB connect will be finished by the time the REQ/REP dialog is complete. There are no guarantees that outbound connects will finish in any order whatsoever, if you're using any transport except inproc. So, the example does a brute force sleep of one second between subscribing, and sending the REQ/REP synchronization.

A more robust model could be:

Publisher opens PUB socket and starts sending "Hello" messages (not data).
Subscribers connect SUB socket and when they receive a Hello message they tell the publisher via a REQ/REP socket pair.
When the publisher has had all the necessary confirmations, it starts to send real data.

Zero-Copy

ZeroMQ's message API lets you send and receive messages directly from and to application buffers without copying data. We call this zero-copy, and it can improve performance in some applications.

You should think about using zero-copy in the specific case where you are sending large blocks of memory (thousands of bytes), at a high frequency. For short messages, or for lower message rates, using zero-copy will make your code messier and more complex with no measurable benefit. Like all optimizations, use this when you know it helps, and measure before and after.

To do zero-copy, you use zmq_msg_init_data() to create a message that refers to a block of data already allocated with malloc() or some other allocator, and then you pass that to zmq_msg_send(). When you create the message, you also pass a function that ZeroMQ will call to free the block of data, when it has finished sending the message. This is the simplest example, assuming buffer is a block of 1,000 bytes allocated on the heap:

void my_free (void *data, void *hint) {
    free (data);
}
//  Send message from buffer, which we allocate and ZeroMQ will free for us
zmq_msg_t message;
zmq_msg_init_data (&message, buffer, 1000, my_free, NULL);
zmq_msg_send (&message, socket, 0);

Note that you don't call zmq_msg_close() after sending a message; libzmq will do this automatically when it's actually done
sending the message.

There is no way to do zero-copy on receive: ZeroMQ delivers you a buffer that you can store as long as you wish, but it will not
write data directly into application buffers.

On writing, ZeroMQ's multipart messages work nicely together with zero-copy. In traditional messaging, you need to marshal
different buffers together into one buffer that you can send. That means copying data. With ZeroMQ, you can send multiple
buffers coming from different sources as individual message frames. Send each field as a length-delimited frame. To the
application, it looks like a series of send and receive calls. But internally, the multiple parts get written to the network and read
back with single system calls, so it's very efficient.

Pub-Sub Message Envelopes

In the pub-sub pattern, we can split the key into a separate message frame that we call an envelope. If you want to use pub-sub
envelopes, make them yourself. It's optional, and in previous pub-sub examples we didn't do this. Using a pub-sub envelope is a
little more work for simple cases, but it's cleaner especially for real cases, where the key and the data are naturally separate
things.

Figure 23 - Pub-Sub Envelope with Separate Key

Recall that subscriptions do a prefix match. That is, they look for "all messages starting with XYZ". The obvious question is: how
to delimit keys from data so that the prefix match doesn't accidentally match data. The best answer is to use an envelope
because the match won't cross a frame boundary. Here is a minimalist example of how pub-sub envelopes look in code. This
publisher sends messages of two types, A and B.

The envelope holds the message type:

psenvpub: Pub-Sub envelope publisher in C

The subscriber wants only messages of type B:

psenvsub: Pub-Sub envelope subscriber in C

When you run the two programs, the subscriber should show you this:

[B] We would like to see this
[B] We would like to see this
[B] We would like to see this
...

This example shows that the subscription filter rejects or accepts the entire multipart message (key plus data). You won't get part
of a multipart message, ever. If you subscribe to multiple publishers and you want to know their address so that you can send
them data via another socket (and this is a typical use case), create a three-part message.

Figure 24 - Pub-Sub Envelope with Sender Address
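
As a rough illustration of the prefix rule, the sketch below models the subscriber-side decision in plain C. It is not libzmq code: subscription_matches and accept_message are hypothetical helper names, and frames are simplified to NUL-terminated strings, whereas real frames are length-delimited binary blobs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

//  Hypothetical helper (not a libzmq function): true when the subscription
//  prefix matches the start of the key frame.
static bool subscription_matches (const char *prefix, size_t prefix_len,
                                  const char *key_frame, size_t key_len)
{
    assert (prefix != NULL && key_frame != NULL);
    if (prefix_len > key_len)
        return false;           //  A prefix longer than the key cannot match
    return memcmp (prefix, key_frame, prefix_len) == 0;
}

//  A multipart message is accepted or rejected as a whole, judged only by
//  its first frame (the envelope); later frames are never examined.
static bool accept_message (const char *prefix,
                            const char *frames [], size_t frame_count)
{
    assert (frame_count > 0);
    return subscription_matches (prefix, strlen (prefix),
                                 frames [0], strlen (frames [0]));
}
```

With the subscription "B", a message whose frames are { "A", "B is in the data" } is rejected in this model, because only the envelope frame is compared. An empty subscription "" accepts every message.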

High-Water Marks

When you can send messages rapidly from process to process, you soon discover that memory is a precious resource, and one
that can be trivially filled up. A few seconds of delay somewhere in a process can turn into a backlog that blows up a server
unless you understand the problem and take precautions.

The problem is this: imagine you have process A sending messages at high frequency to process B, which is processing them.
Suddenly B gets very busy (garbage collection, CPU overload, whatever), and can't process the messages for a short period. It
could be a few seconds for some heavy garbage collection, or it could be much longer, if there's a more serious problem. What
happens to the messages that process A is still trying to send frantically? Some will sit in B's network buffers. Some will sit on the
Ethernet wire itself. Some will sit in A's network buffers. And the rest will accumulate in A's memory, as rapidly as the application
behind A sends them. If you don't take some precaution, A can easily run out of memory and crash.

It is a consistent, classic problem with message brokers. What makes it hurt more is that it's B's fault, superficially, and B is
typically a user-written application which A has no control over.

What are the answers? One is to pass the problem upstream. A is getting the messages from somewhere else. So tell that
process, "Stop!" And so on. This is called flow control. It sounds plausible, but what if you're sending out a Twitter feed? Do you
tell the whole world to stop tweeting while B gets its act together?

Flow control works in some cases, but not in others. The transport layer can't tell the application layer to "stop" any more than a
subway system can tell a large business, "please keep your staff at work for another half an hour. I'm too busy". The answer for
messaging is to set limits on the size of buffers, and then when we reach those limits, to take some sensible action. In some
cases (not for a subway system, though), the answer is to throw away messages. In others, the best strategy is to wait.

ZeroMQ uses the concept of HWM (high-water mark) to define the capacity of its internal pipes. Each connection out of a socket
or into a socket has its own pipe, and HWM for sending, and/or receiving, depending on the socket type. Some sockets (PUB,
PUSH) only have send buffers. Some (SUB, PULL, REQ, REP) only have receive buffers. Some (DEALER, ROUTER, PAIR)
have both send and receive buffers.

In ZeroMQ v2.x, the HWM was infinite by default. This was easy but also typically fatal for high-volume publishers. In ZeroMQ
v3.x, it's set to 1,000 by default, which is more sensible. If you're still using ZeroMQ v2.x, you should always set a HWM on your
sockets, be it 1,000 to match ZeroMQ v3.x or another figure that takes into account your message sizes and expected subscriber
performance.

When your socket reaches its HWM, it will either block or drop data depending on the socket type. PUB and ROUTER sockets
will drop data if they reach their HWM, while other socket types will block. Over the inproc transport, the sender and receiver
share the same buffers, so the real HWM is the sum of the HWM set by both sides.

Lastly, the HWMs are not exact; while you may get up to 1,000 messages by default, the real buffer size may be much lower (as
little as half), due to the way libzmq implements its queues.
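
To make the block-versus-drop distinction concrete, here is a toy model of a single outgoing pipe in plain C. It is only a sketch: pipe_model_t, pipe_send, and pipe_drain_one are made-up names, the model treats the HWM as an exact limit (which the text above says real libzmq does not), and "would block" is reported as a return value instead of actually blocking.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { ON_FULL_DROP, ON_FULL_BLOCK } full_policy_t;
typedef enum { SEND_QUEUED, SEND_DROPPED, SEND_WOULD_BLOCK } send_result_t;

//  Toy model of one outgoing pipe; not libzmq's real data structure
typedef struct {
    size_t hwm;             //  Maximum number of queued messages
    size_t queued;          //  Messages currently waiting in the pipe
    full_policy_t policy;   //  DROP for PUB and ROUTER, BLOCK for the others
} pipe_model_t;

//  Try to queue one message; at the HWM the policy decides what happens
static send_result_t pipe_send (pipe_model_t *pipe)
{
    assert (pipe != NULL);
    if (pipe->queued < pipe->hwm) {
        pipe->queued++;
        return SEND_QUEUED;
    }
    return pipe->policy == ON_FULL_DROP? SEND_DROPPED: SEND_WOULD_BLOCK;
}

//  The peer consumed one message, freeing a slot in the pipe
static void pipe_drain_one (pipe_model_t *pipe)
{
    assert (pipe != NULL);
    if (pipe->queued > 0)
        pipe->queued--;
}
```

In this model a PUB-style pipe with a HWM of 2 queues two messages and discards the third, while a PUSH-style pipe reports that the sender would have to wait until the peer drains a message.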

Missing Message Problem Solver

As you build applications with ZeroMQ, you will come across this problem more than once: losing messages that you expect to
receive. We have put together a diagram that walks through the most common causes for this.

Figure 25 - Missing Message Problem Solver

Here's a summary of what the graphic says:

On SUB sockets, set a subscription using zmq_setsockopt() with ZMQ_SUBSCRIBE, or you won't get messages.
Because you subscribe to messages by prefix, if you subscribe to "" (an empty subscription), you will get everything.

If you start the SUB socket (i.e., establish a connection to a PUB socket) after the PUB socket has started sending out
data, you will lose whatever it published before the connection was made. If this is a problem, set up your architecture so
the SUB socket starts first, then the PUB socket starts publishing.

Even if you synchronize a SUB and PUB socket, you may still lose messages. It's due to the fact that internal queues
aren't created until a connection is actually created. If you can switch the bind/connect direction so the SUB socket binds,
and the PUB socket connects, you may find it works more as you'd expect.

If you're using REP and REQ sockets, and you're not sticking to the synchronous send/recv/send/recv order, ZeroMQ will
report errors, which you might ignore. Then, it would look like you're losing messages. If you use REQ or REP, stick to the
send/recv order, and always, in real code, check for errors on ZeroMQ calls.

If you're using PUSH sockets, you'll find that the first PULL socket to connect will grab an unfair share of messages. The
accurate rotation of messages only happens when all PULL sockets are successfully connected, which can take some
milliseconds. As an alternative to PUSH/PULL, for lower data rates, consider using ROUTER/DEALER and the load
balancing pattern.

If you're sharing sockets across threads, don't. It will lead to random weirdness, and crashes.

If you're using inproc, make sure both sockets are in the same context. Otherwise the connecting side will in fact fail.
Also, bind first, then connect. inproc is not a disconnected transport like tcp.

If you're using ROUTER sockets, it's remarkably easy to lose messages by accident, by sending malformed identity
frames (or forgetting to send an identity frame). In general setting the ZMQ_ROUTER_MANDATORY option on ROUTER
sockets is a good idea, but do also check the return code on every send call.

Lastly, if you really can't figure out what's going wrong, make a minimal test case that reproduces the problem, and ask for
help from the ZeroMQ community.

Chapter 3 - Advanced Request-Reply Patterns

In Chapter 2 - Sockets and Patterns we worked through the basics of using ZeroMQ by developing a series of small applications,
each time exploring new aspects of ZeroMQ. We'll continue this approach in this chapter as we explore advanced patterns built
on top of ZeroMQ's core request-reply pattern.

We'll cover:

How the request-reply mechanisms work
How to combine REQ, REP, DEALER, and ROUTER sockets
How ROUTER sockets work, in detail
The load balancing pattern
Building a simple load balancing message broker
Designing a high-level API for ZeroMQ
Building an asynchronous request-reply server
A detailed inter-broker routing example

The Request-Reply Mechanisms

We already looked briefly at multipart messages. Let's now look at a major use case, which is reply message envelopes. An
envelope is a way of safely packaging up data with an address, without touching the data itself. By separating reply addresses
into an envelope we make it possible to write general-purpose intermediaries such as APIs and proxies that create, read, and
remove addresses no matter what the message payload or structure is.

In the request-reply pattern, the envelope holds the return address for replies. It is how a ZeroMQ network with no state can
create round-trip request-reply dialogs.

When you use REQ and REP sockets you don't even see envelopes; these sockets deal with them automatically. But for most of
the interesting request-reply patterns, you'll want to understand envelopes and particularly ROUTER sockets. We'll work through
this step-by-step.

The Simple Reply Envelope

A request-reply exchange consists of a request message, and an eventual reply message. In the simple request-reply pattern,
there's one reply for each request. In more advanced patterns, requests and replies can flow asynchronously. However, the reply
envelope always works the same way.

The ZeroMQ reply envelope formally consists of zero or more reply addresses, followed by an empty frame (the envelope
delimiter), followed by the message body (zero or more frames). The envelope is created by multiple sockets working together in
a chain. We'll break this down.

We'll start by sending "Hello" through a REQ socket. The REQ socket creates the simplest possible reply envelope, which has no
addresses, just an empty delimiter frame and the message frame containing the "Hello" string. This is a two-frame message.

Figure 26 - Request with Minimal Envelope

The REP socket does the matching work: it strips off the envelope, up to and including the delimiter frame, saves the whole
envelope, and passes the "Hello" string up to the application. Thus our original Hello World example used request-reply envelopes
internally, but the application never saw them.
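
The envelope layout described above (zero or more addresses, an empty delimiter, then the body) can be sketched as a small frame-splitting routine. This is an illustrative model in plain C, not what libzmq does internally: find_delimiter and split_envelope are hypothetical names, and frames are simplified to NUL-terminated strings.

```c
#include <assert.h>
#include <stddef.h>

//  Hypothetical helper: index of the first empty frame (the envelope
//  delimiter), or -1 if there is none. Frames are modeled as C strings.
static int find_delimiter (const char *frames [], size_t frame_count)
{
    assert (frames != NULL);
    for (size_t i = 0; i < frame_count; i++)
        if (frames [i][0] == '\0')
            return (int) i;
    return -1;
}

//  Splits a received message the way the text describes for REP.
//  On success stores the number of envelope frames (addresses plus the
//  delimiter) in *envelope_len and returns the index of the first body
//  frame; returns -1 if no delimiter is present.
static int split_envelope (const char *frames [], size_t frame_count,
                           size_t *envelope_len)
{
    assert (envelope_len != NULL);
    int delimiter = find_delimiter (frames, frame_count);
    if (delimiter < 0)
        return -1;
    *envelope_len = (size_t) delimiter + 1;
    return delimiter + 1;
}
```

For the two-frame request { "", "Hello" }, the envelope is one frame long (just the delimiter) and the body starts at index 1. A REP-like receiver would keep the envelope frames aside and prepend them again when it sends the reply.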

If you spy on the network data flowing between hwclient and hwserver, this is what you'll see: every request and every reply
is in fact two frames, an empty frame and then the body. It doesn't seem to make much sense for a simple REQ-REP dialog.
However you'll see the reason when we explore how ROUTER and DEALER handle envelopes.

The Extended Reply Envelope

Now let's extend the REQ-REP pair with a ROUTER-DEALER proxy in the middle and see how this affects the reply envelope.
This is the extended request-reply pattern we already saw in Chapter 2 - Sockets and Patterns. We can, in fact, insert any
number of proxy steps. The mechanics are the same.

Figure 27 - Extended Request-Reply Pattern

The proxy does this, in pseudo-code:

prepare context, frontend and backend sockets
while true:
    poll on both sockets
    if frontend had input:
        read all frames from frontend
        send to backend
    if backend had input:
        read all frames from backend
        send to frontend

The ROUTER socket, unlike other sockets, tracks every connection it has, and tells the caller about these. The way it tells the
caller is to stick the connection identity in front of each message received. An identity, sometimes called an address, is just a
binary string with no meaning except "this is a unique handle to the connection". Then, when you send a message via a ROUTER
socket, you first send an identity frame.

The zmq_socket() man page describes it thus:

When receiving messages a ZMQ_ROUTER socket shall prepend a message part containing the identity of the originating
peer to the message before passing it to the application. Messages received are fair-queued from among all connected
peers. When sending messages a ZMQ_ROUTER socket shall remove the first part of the message and use it to determine
the identity of the peer the message shall be routed to.

As a historical note, ZeroMQ v2.2 and earlier use UUIDs as identities, and ZeroMQ v3.0 and later use short integers. There's
some impact on network performance, but only when you use multiple proxy hops, which is rare. Mostly the change was to
simplify building libzmq by removing the dependency on a UUID library.

Identities are a difficult concept to understand, but it's essential if you want to become a ZeroMQ expert. The ROUTER socket
invents a random identity for each connection with which it works. If there are three REQ sockets connected to a ROUTER
socket, it will invent three random identities, one for each REQ socket.

So if we continue our worked example, let's say the REQ socket has a 3-byte identity ABC. Internally, this means the ROUTER
socket keeps a hash table where it can search for ABC and find the TCP connection for the REQ socket.

When we receive the message off the ROUTER socket, we get three frames.

Figure 28 - Request with One Address

The core of the proxy loop is "read from one socket, write to the other", so we literally send these three frames out on the
DEALER socket. If you now sniffed the network traffic, you would see these three frames flying from the DEALER socket to the
REP socket. The REP socket does as before, strips off the whole envelope including the new reply address, and once again
delivers the "Hello" to the caller.

Incidentally the REP socket can only deal with one request-reply exchange at a time, which is why if you try to read multiple
requests or send multiple replies without sticking to a strict recv-send cycle, it gives an error.

You should now be able to visualize the return path. When hwserver sends "World" back, the REP socket wraps that with the
envelope it saved, and sends a three-frame reply message across the wire to the DEALER socket.

Figure 29 - Reply with One Address

Now the DEALER reads these three frames, and sends all three out via the ROUTER socket. The ROUTER takes the first frame
for the message, which is the ABC identity, and looks up the connection for this. If it finds that, it then pumps the next two frames
out onto the wire.

Figure 30 - Reply with Minimal Envelope

The REQ socket picks this message up, and checks that the first frame is the empty delimiter, which it is. The REQ socket
discards that frame and passes "World" to the calling application, which prints it out to the amazement of the younger us looking
at ZeroMQ for the first time.

What's This Good For?

To be honest, the use cases for strict request-reply or extended request-reply are somewhat limited. For one thing, there's no
easy way to recover from common failures like the server crashing due to buggy application code. We'll see more about this in
Chapter 4 - Reliable Request-Reply Patterns. However once you grasp the way these four sockets deal with envelopes, and how
they talk to each other, you can do very useful things. We saw how ROUTER uses the reply envelope to decide which client REQ
socket to route a reply back to. Now let's express this another way:

Each time ROUTER gives you a message, it tells you what peer that came from, as an identity.
You can use this with a hash table (with the identity as key) to track new peers as they arrive.
ROUTER will route messages asynchronously to any peer connected to it, if you prefix the identity as the first frame of the message.

ROUTER sockets don't care about the whole envelope. They don't know anything about the empty delimiter. All they care about
is that one identity frame that lets them figure out which connection to send a message to.

Recap of Request-Reply Sockets

Let's recap this:

The REQ socket sends, to the network, an empty delimiter frame in front of the message data. REQ sockets are
synchronous. REQ sockets always send one request and then wait for one reply. REQ sockets talk to one peer at a time.
If you connect a REQ socket to multiple peers, requests are distributed to and replies expected from each peer one turn at
a time.

The REP socket reads and saves all identity frames up to and including the empty delimiter, then passes the following
frame or frames to the caller. REP sockets are synchronous and talk to one peer at a time. If you connect a REP socket to
multiple peers, requests are read from peers in fair fashion, and replies are always sent to the same peer that made the
last request.

The DEALER socket is oblivious to the reply envelope and handles this like any multipart message. DEALER sockets are
asynchronous and like PUSH and PULL combined. They distribute sent messages among all connections, and fair-queue
received messages from all connections.

The ROUTER socket is oblivious to the reply envelope, like DEALER. It creates identities for its connections, and passes
these identities to the caller as a first frame in any received message. Conversely, when the caller sends a message, it
uses the first message frame as an identity to look up the connection to send to. ROUTERs are asynchronous.

Request-Reply Combinations

We have four request-reply sockets, each with a certain behavior. We've seen how they connect in simple and extended request-
reply patterns. But these sockets are building blocks that you can use to solve many problems.

These are the legal combinations:

REQ to REP
DEALER to REP
REQ to ROUTER
DEALER to ROUTER
DEALER to DEALER
ROUTER to ROUTER

And these combinations are invalid (and I'll explain why):

REQ to REQ
REQ to DEALER
REP to REP
REP to ROUTER
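
The two lists above can be captured in a small lookup table, which is sometimes handy as a sanity check in configuration code or tests. This is only a sketch that transcribes the lists; sock_type_t, valid_pair, and combination_is_valid are hypothetical names and are unrelated to libzmq's own socket type constants.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum {
    SOCK_REQ, SOCK_REP, SOCK_DEALER, SOCK_ROUTER, SOCK_TYPE_COUNT
} sock_type_t;

//  Rows and columns are indexed by sock_type_t; the table is symmetric
//  because "A to B" and "B to A" describe the same pairing.
static const bool valid_pair [SOCK_TYPE_COUNT][SOCK_TYPE_COUNT] = {
    /*              REQ    REP    DEALER ROUTER */
    /* REQ    */ { false, true,  false, true  },
    /* REP    */ { true,  false, true,  false },
    /* DEALER */ { false, true,  true,  true  },
    /* ROUTER */ { true,  false, true,  true  },
};

//  True if the text lists this pairing as a legal combination
static bool combination_is_valid (sock_type_t a, sock_type_t b)
{
    assert (a < SOCK_TYPE_COUNT && b < SOCK_TYPE_COUNT);
    return valid_pair [a][b];
}
```

For example, combination_is_valid (SOCK_DEALER, SOCK_ROUTER) is true, while combination_is_valid (SOCK_REQ, SOCK_DEALER) is false.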

Here are some tips for remembering the semantics. DEALER is like an asynchronous REQ socket, and ROUTER is like an
asynchronous REP socket. Where we use a REQ socket, we can use a DEALER; we just have to read and write the envelope
ourselves. Where we use a REP socket, we can stick a ROUTER; we just need to manage the identities ourselves.

Think of REQ and DEALER sockets as "clients" and REP and ROUTER sockets as "servers". Mostly, you'll want to bind REP and
ROUTER sockets, and connect REQ and DEALER sockets to them. It's not always going to be this simple, but it is a clean and
memorable place to start.

The REQ to REP Combination

We've already covered a REQ client talking to a REP server but let's take one aspect: the REQ client must initiate the message
flow. A REP server cannot talk to a REQ client that hasn't first sent it a request. Technically, it's not even possible, and the API
also returns an EFSM error if you try it.

The DEALER to REP Combination

Now, let's replace the REQ client with a DEALER. This gives us an asynchronous client that can talk to multiple REP servers. If
we rewrote the "Hello World" client using DEALER, we'd be able to send off any number of "Hello" requests without waiting for
replies.

When we use a DEALER to talk to a REP socket, we must accurately emulate the envelope that the REQ socket would have
sent, or the REP socket will discard the message as invalid. So, to send a message, we:

Send an empty message frame with the MORE flag set; then
Send the message body.

And when we receive a message, we:

Receive the first frame and if it's not empty, discard the whole message;
Receive the next frame and pass that to the application.

The REQ to ROUTER Combination

In the same way that we can replace REQ with DEALER, we can replace REP with ROUTER. This gives us an asynchronous
server that can talk to multiple REQ clients at the same time. If we rewrote the "Hello World" server using ROUTER, we'd be able
to process any number of "Hello" requests in parallel. We saw this in the Chapter 2 - Sockets and Patterns mtserver example.

We can use ROUTER in two distinct ways:

As a proxy that switches messages between frontend and backend sockets.
As an application that reads the message and acts on it.

In the first case, the ROUTER simply reads all frames, including the artificial identity frame, and passes them on blindly. In the
second case the ROUTER must know the format of the reply envelope it's being sent. As the other peer is a REQ socket, the
ROUTER gets the identity frame, an empty frame, and then the data frame.

The DEALER to ROUTER Combination

Now we can switch out both REQ and REP with DEALER and ROUTER to get the most powerful socket combination, which is
DEALER talking to ROUTER. It gives us asynchronous clients talking to asynchronous servers, where both sides have full control
over the message formats.

Because both DEALER and ROUTER can work with arbitrary message formats, if you hope to use these safely, you have to
become a little bit of a protocol designer. At the very least you must decide whether you wish to emulate the REQ/REP reply
envelope. It depends on whether you actually need to send replies or not.

The DEALER to DEALER Combination

You can swap a REP with a ROUTER, but you can also swap a REP with a DEALER, if the DEALER is talking to one and only
one peer.

When you replace a REP with a DEALER, your worker can suddenly go full asynchronous, sending any number of replies back.
The cost is that you have to manage the reply envelopes yourself, and get them right, or nothing at all will work. We'll see a
worked example later. Let's just say for now that DEALER to DEALER is one of the trickier patterns to get right, and happily it's
rare that we need it.

The ROUTER to ROUTER Combination

This sounds perfect for N-to-N connections, but it's the most difficult combination to use. You should avoid it until you are well
advanced with ZeroMQ. We'll see one example of it in the Freelance pattern in Chapter 4 - Reliable Request-Reply Patterns, and an
alternative DEALER to ROUTER design for peer-to-peer work in Chapter 8 - A Framework for Distributed Computing.

Invalid Combinations

Mostly, trying to connect clients to clients, or servers to servers is a bad idea and won't work. However, rather than give general
vague warnings, I'll explain in detail:

REQ to REQ: both sides want to start by sending messages to each other, and this could only work if you timed things so
that both peers exchanged messages at the same time. It hurts my brain to even think about it.

REQ to DEALER: you could in theory do this, but it would break if you added a second REQ because DEALER has no
way of sending a reply to the original peer. Thus the REQ socket would get confused, and/or return messages meant for
another client.

REP to REP: both sides would wait for the other to send the first message.

REP to ROUTER: the ROUTER socket can in theory initiate the dialog and send a properly formatted request, if it knows
the REP socket has connected and it knows the identity of that connection. It's messy and adds nothing over DEALER to
ROUTER.

The common thread in this valid versus invalid breakdown is that a ZeroMQ socket connection is always biased towards one peer
that binds to an endpoint, and another that connects to that. Further, which side binds and which side connects is not
arbitrary, but follows natural patterns. The side which we expect to "be there" binds: it'll be a server, a broker, a publisher, a
collector. The side that "comes and goes" connects: it'll be clients and workers. Remembering this will help you design better
ZeroMQ architectures.

Exploring ROUTER Sockets

Let's look at ROUTER sockets a little closer. We've already seen how they work by routing individual messages to specific
connections. I'll explain in more detail how we identify those connections, and what a ROUTER socket does when it can't send a
message.

Identities and Addresses

The identity concept in ZeroMQ refers specifically to ROUTER sockets and how they identify the connections they have to other
sockets. More broadly, identities are used as addresses in the reply envelope. In most cases, the identity is arbitrary and local to
the ROUTER socket: it's a lookup key in a hash table. Independently, a peer can have an address that is physical (a network
endpoint like "tcp://192.168.55.117:5670") or logical (a UUID or email address or other unique key).

An application that uses a ROUTER socket to talk to specific peers can convert a logical address to an identity if it has built the
necessary hash table. Because ROUTER sockets only announce the identity of a connection (to a specific peer) when that peer
sends a message, you can only really reply to a message, not spontaneously talk to a peer.

This is true even if you flip the rules and make the ROUTER connect to the peer rather than wait for the peer to connect to the
ROUTER. However you can force the ROUTER socket to use a logical address in place of its identity. The zmq_setsockopt
reference page calls this setting the socket identity. It works as follows:

The peer application sets the ZMQ_IDENTITY option of its peer socket (DEALER or REQ) before binding or connecting.
Usually the peer then connects to the already-bound ROUTER socket. But the ROUTER can also connect to the peer.
At connection time, the peer socket tells the router socket, "please use this identity for this connection".
If the peer socket doesn't say that, the router generates its usual arbitrary random identity for the connection.
The ROUTER socket now provides this logical address to the application as a prefix identity frame for any messages coming in from that peer.
The ROUTER also expects the logical address as the prefix identity frame for any outgoing messages.

Here is a simple example of two peers that connect to a ROUTER socket, one that imposes a logical address "PEER2":

identity: Identity check in C

Here is what the program prints:

[005] 006B8B4567
[000]
[026] ROUTER uses a generated UUID

[005] PEER2
[000]
[038] ROUTER uses REQ's socket identity

ROUTER Error Handling

ROUTER sockets do have a somewhat brutal way of dealing with messages they can't send anywhere: they drop them silently.
It's an attitude that makes sense in working code, but it makes debugging hard. The "send identity as first frame" approach is
tricky enough that we often get this wrong when we're learning, and the ROUTER's stony silence when we mess up isn't very
constructive.

Since ZeroMQ v3.2 there's a socket option you can set to catch this error: ZMQ_ROUTER_MANDATORY. Set that on the ROUTER
socket and then when you provide an unroutable identity on a send call, the socket will signal an EHOSTUNREACH error.
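
The difference between the silent drop and the mandatory mode can be modeled without any networking. The sketch below is a toy in plain C, assuming a tiny fixed-size identity table; router_model_t, router_lookup, and router_send are invented names, and the real option is set on a real socket with zmq_setsockopt() rather than by flipping a struct field.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_PEERS 8

//  Toy model of a ROUTER socket's identity table; not libzmq internals
typedef struct {
    const char *identities [MAX_PEERS];
    size_t peer_count;
    bool mandatory;         //  Models the ZMQ_ROUTER_MANDATORY option
    size_t dropped;         //  Messages silently discarded so far
} router_model_t;

//  Returns the peer index for an identity, or -1 if there is no such peer
static int router_lookup (const router_model_t *router, const char *identity)
{
    for (size_t i = 0; i < router->peer_count; i++)
        if (strcmp (router->identities [i], identity) == 0)
            return (int) i;
    return -1;
}

//  frames [0] must be the identity frame. Returns 0 when the caller sees
//  success (including a silent drop, where *routed_to is -1), or -1 with
//  errno set to EHOSTUNREACH when mandatory mode rejects the identity.
static int router_send (router_model_t *router, const char *frames [],
                        size_t frame_count, int *routed_to)
{
    assert (router != NULL && routed_to != NULL && frame_count > 0);
    *routed_to = router_lookup (router, frames [0]);
    if (*routed_to >= 0)
        return 0;                   //  Remaining frames go to that peer
    if (router->mandatory) {
        errno = EHOSTUNREACH;       //  Caller is told about the bad identity
        return -1;
    }
    router->dropped++;              //  Default: the message just vanishes
    return 0;
}
```

In the default mode the caller of router_send cannot tell a delivered message from a dropped one by the return value alone, which is the debugging problem described above; with mandatory set, the same call fails visibly.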

The Load Balancing Pattern

Now let's look at some code. We'll see how to connect a ROUTER socket to a REQ socket, and then to a DEALER socket. These
two examples follow the same logic, which is a load balancing pattern. This pattern is our first exposure to using the ROUTER
socket for deliberate routing, rather than simply acting as a reply channel.

The load balancing pattern is very common and we'll see it several times in this book. It solves the main problem with simple
round-robin routing (as PUSH and DEALER offer) which is that round-robin becomes inefficient if tasks do not all roughly take the
same time.

It's the post office analogy. If you have one queue per counter, and you have some people buying stamps (a fast, simple
transaction), and some people opening new accounts (a very slow transaction), then you will find stamp buyers getting unfairly
stuck in queues. Just as in a post office, if your messaging architecture is unfair, people will get annoyed.

The solution in the post office is to create a single queue so that even if one or two counters get stuck with slow work, other
counters will continue to serve clients on a first-come, first-serve basis.

One reason PUSH and DEALER use the simplistic approach is sheer performance. If you arrive in any major US airport, you'll
find long queues of people waiting at immigration. The border patrol officials will send people in advance to queue up at each
counter, rather than using a single queue. Having people walk fifty yards in advance saves a minute or two per passenger. And
because every passport check takes roughly the same time, it's more or less fair. This is the strategy for PUSH and DEALER:
send work loads ahead of time so that there is less travel distance.

This is a recurring theme with ZeroMQ: the world's problems are diverse and you can benefit from solving different problems
each in the right way. The airport isn't the post office and one size fits no one, really well.

Let's return to the scenario of a worker (DEALER or REQ) connected to a broker (ROUTER). The broker has to know when the
worker is ready, and keep a list of workers so that it can take the least recently used worker each time.

The solution is really simple, in fact: workers send a "ready" message when they start, and after they finish each task. The broker
reads these messages one-by-one. Each time it reads a message, it is from the last used worker. And because we're using a
ROUTER socket, we get an identity that we can then use to send a task back to the worker.

It's a twist on request-reply because the task is sent with the reply, and any response for the task is sent as a new request. The
following code examples should make it clearer.

ROUTER Broker and REQ Workers

Here is an example of the load balancing pattern using a ROUTER broker talking to a set of REQ workers:

rtreq: ROUTER-to-REQ in C

The example runs for five seconds and then each worker prints how many tasks they handled. If the routing worked, we'd expect
a fair distribution of work:

Completed: 20 tasks
Completed: 18 tasks
Completed: 21 tasks
Completed: 23 tasks
Completed: 19 tasks
Completed: 21 tasks
Completed: 17 tasks
Completed: 17 tasks
Completed: 25 tasks
Completed: 19 tasks

To talk to the workers in this example, we have to create a REQ-friendly envelope consisting of an identity plus an empty
envelope delimiter frame.

Figure 31 - Routing Envelope for REQ

ROUTER Broker and DEALER Workers

Anywhere you can use REQ, you can use DEALER. There are two specific differences:

The REQ socket always sends an empty delimiter frame before any data frames; the DEALER does not.
The REQ socket will send only one message before it receives a reply; the DEALER is fully asynchronous.

The synchronous versus asynchronous behavior has no effect on our example because we're doing strict request-reply. It is more
relevant when we address recovering from failures, which we'll come to in Chapter 4 - Reliable Request-Reply Patterns.

Now let's look at exactly the same example but with the REQ socket replaced by a DEALER socket:

rtdealer: ROUTER-to-DEALER in C

The code is almost identical except that the worker uses a DEALER socket, and reads and writes that empty frame before the
data frame. This is the approach I use when I want to keep compatibility with REQ workers.

However, remember the reason for that empty delimiter frame: it's to allow multihop extended requests that terminate in a REP
socket, which uses that delimiter to split off the reply envelope so it can hand the data frames to its application.

If we never need to pass the message along to a REP socket, we can simply drop the empty delimiter frame at both sides, which
makes things simpler. This is usually the design I use for pure DEALER to ROUTER protocols.

A Load Balancing Message Broker

The previous example is half-complete. It can manage a set of workers with dummy requests and replies, but it has no way to talk
to clients. If we add a second frontend ROUTER socket that accepts client requests, and turn our example into a proxy that can
switch messages from frontend to backend, we get a useful and reusable tiny load balancing message broker.

Figure 32 - Load Balancing Broker

This broker does the following:

Accepts connections from a set of clients.
Accepts connections from a set of workers.
Accepts requests from clients and holds these in a single queue.
Sends these requests to workers using the load balancing pattern.
Receives replies back from workers.
Sends these replies back to the original requesting client.

The broker code is fairly long, but worth understanding:

lbbroker: Load balancing broker in C

The difficult part of this program is (a) the envelopes that each socket reads and writes, and (b) the load balancing algorithm.
We'll take these in turn, starting with the message envelope formats.

Let's walk through a full request-reply chain from client to worker and back. In this code we set the identity of client and worker
sockets to make it easier to trace the message frames. In reality, we'd allow the ROUTER sockets to invent identities for
connections. Let's assume the client's identity is "CLIENT" and the worker's identity is "WORKER". The client application sends a
single frame containing "Hello".

Figure 33 - Message that Client Sends

Because the REQ socket adds its empty delimiter frame and the ROUTER socket adds its connection identity, the proxy reads off
the frontend ROUTER socket the client address, empty delimiter frame, and the data part.

Figure 34 - Message Coming in on Frontend

The broker sends this to the worker, prefixed by the address of the chosen worker, plus an additional empty part to keep the REQ
at the other end happy.

Figure 35 - Message Sent to Backend

This complex envelope stack gets chewed up first by the backend ROUTER socket, which removes the first frame. Then
the REQ socket in the worker removes the empty part, and provides the rest to the worker application.

Figure 36 - Message Delivered to Worker

The worker has to save the envelope (which is all the parts up to and including the empty message frame) and then
it can do what's needed with the data part. Note that a REP socket would do this automatically, but we're using the
REQ-ROUTER pattern so that we can get proper load balancing.

On the return path, the messages are the same as when they come in, i.e., the backend socket gives the broker a
message in five parts, and the broker sends the frontend socket a message in three parts, and the client gets a
message in one part.
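The frame counts along this chain can be checked with a small model. The sketch below is plain C and makes no ZeroMQ calls: it treats a message as an array of C strings, and msg_t, msg_push, msg_pop, and trace_request are hypothetical names invented for this illustration. It wraps the client's "Hello" the way the REQ and ROUTER sockets do on the way in, then unwraps it as far as the worker application.

```c
#include <assert.h>
#include <string.h>

#define MAX_FRAMES 8

//  A message modeled as a small stack of frames; frame 0 is the
//  outermost envelope frame. This is an illustration, not a ZeroMQ type.
typedef struct {
    const char *frames [MAX_FRAMES];
    int count;
} msg_t;

//  Prefix a frame, as a socket does when it adds an address or delimiter
static void
msg_push (msg_t *msg, const char *frame)
{
    assert (msg->count < MAX_FRAMES);
    memmove (&msg->frames [1], &msg->frames [0],
             msg->count * sizeof (msg->frames [0]));
    msg->frames [0] = frame;
    msg->count++;
}

//  Remove and return the first frame, as a socket does when it strips one
static const char *
msg_pop (msg_t *msg)
{
    assert (msg->count > 0);
    const char *frame = msg->frames [0];
    msg->count--;
    memmove (&msg->frames [0], &msg->frames [1],
             msg->count * sizeof (msg->frames [0]));
    return frame;
}

//  Trace one request from the client application to the worker application.
//  Returns the number of frames the worker application sees.
static int
trace_request (msg_t *msg)
{
    msg->count = 0;
    msg_push (msg, "Hello");        //  Client application: 1 frame
    msg_push (msg, "");             //  Client REQ adds empty delimiter
    msg_push (msg, "CLIENT");       //  Frontend ROUTER adds client identity
    assert (msg->count == 3);       //  What the broker reads off the frontend

    msg_push (msg, "");             //  Broker adds delimiter for worker REQ
    msg_push (msg, "WORKER");       //  Broker adds chosen worker address
    assert (msg->count == 5);       //  What the broker sends to the backend

    msg_pop (msg);                  //  Backend ROUTER consumes worker address
    msg_pop (msg);                  //  Worker REQ strips empty delimiter
    return msg->count;              //  Remaining: CLIENT, "", Hello
}
```

After trace_request returns, the three remaining frames are the client address, the empty delimiter, and the data. The first two are the envelope the worker must save and replay in front of its reply.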

Now let's look at the load balancing algorithm. It requires that both clients and workers use REQ sockets, and that
workers correctly store and replay the envelope on messages they get. The algorithm is:

Create a pollset that always polls the backend, and polls the frontend only if there are one or more workers
available.

Poll for activity with infinite timeout.

If there is activity on the backend, we either have a "ready" message or a reply for a client. In either case, we
store the worker address (the first part) on our worker queue, and if the rest is a client reply, we send it back
to that client via the frontend.

If there is activity on the frontend, we take the client request, pop the next worker (which is the least recently
used), and send the request to the backend. This means sending the worker address, empty part, and then the three
parts of the client request.
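The two data-structure decisions in these steps are the worker queue and the size of the pollset. Here is a minimal sketch of both in plain C, with no ZeroMQ calls. The names worker_queue_t, worker_queue_append, worker_queue_pop, and poll_item_count are hypothetical, and the fixed capacity of ten workers is an assumption made to keep the example short.

```c
#include <assert.h>

#define MAX_WORKERS 10

//  FIFO of available worker identities (hypothetical model, no ZeroMQ calls)
typedef struct {
    const char *ids [MAX_WORKERS];
    int head;                   //  Index of the oldest queued worker
    int count;                  //  Number of workers currently available
} worker_queue_t;

//  A worker that says "ready" or returns a reply goes to the back
static void
worker_queue_append (worker_queue_t *queue, const char *id)
{
    assert (queue->count < MAX_WORKERS);
    queue->ids [(queue->head + queue->count) % MAX_WORKERS] = id;
    queue->count++;
}

//  A client request takes the worker that has waited longest
static const char *
worker_queue_pop (worker_queue_t *queue)
{
    assert (queue->count > 0);
    const char *id = queue->ids [queue->head];
    queue->head = (queue->head + 1) % MAX_WORKERS;
    queue->count--;
    return id;
}

//  Number of poll items to pass to the poll call, assuming item 0 is the
//  backend and item 1 is the frontend: skip the frontend when no worker is free
static int
poll_item_count (const worker_queue_t *queue)
{
    return queue->count > 0? 2: 1;
}
```

Leaving the frontend out of the pollset while the queue is empty means client requests simply wait in the socket's own buffer until a worker is free, so the broker needs no request queue of its own.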

You should now see that you can reuse and extend the load balancing algorithm with variations based on the
information the worker provides in its initial "ready" message. For example, workers might start up and do a
performance self test, then tell the broker how fast they are. The broker can then choose the fastest available
worker rather than the oldest.

A High-Level API for ZeroMQ

We're going to push request-reply onto the stack and open a different area, which is the ZeroMQ API itself. There's
a reason for this detour: as we write more complex examples, the low-level ZeroMQ API starts to look increasingly
clumsy. Look at the core of the worker thread from our load balancing broker:

while (true) {
    //  Get one address frame and empty delimiter
    char *address = s_recv (worker);
    char *empty = s_recv (worker);
    assert (*empty == 0);
    free (empty);

    //  Get request, send reply
    char *request = s_recv (worker);
    printf ("Worker: %s\n", request);
    free (request);

    s_sendmore (worker, address);
    s_sendmore (worker, "");
    s_send     (worker, "OK");
    free (address);
}

That code isn't even reusable because it can only handle one reply address in the envelope, and it already does
some wrapping around the ZeroMQ API. If we used the libzmq simple message API this is what we'd have to write:

while (true) {
    //  Get one address frame and empty delimiter
    char address [255];
    int address_size = zmq_recv (worker, address, 255, 0);
    if (address_size == -1)
        break;
    if (address_size > 255)
        address_size = 255;     //  zmq_recv reports the untruncated size

    char empty [1];
    int empty_size = zmq_recv (worker, empty, 1, 0);
    if (empty_size == -1)
        break;
    assert (empty_size == 0);

    //  Get request, send reply
    char request [256];
    int request_size = zmq_recv (worker, request, 255, 0);
    if (request_size == -1)
        return NULL;
    if (request_size > 255)
        request_size = 255;     //  Keep the terminator inside the buffer
    request [request_size] = 0;
    printf ("Worker: %s\n", request);

    zmq_send (worker, address, address_size, ZMQ_SNDMORE);
    zmq_send (worker, empty, 0, ZMQ_SNDMORE);
    zmq_send (worker, "OK", 2, 0);
}

And when code is too long to write quickly, it's also too long to understand. Up until now, I've stuck to the
native API because, as ZeroMQ users, we need to know that intimately. But when it gets in our way, we have to treat
it as a problem to solve.

We can't of course just change the ZeroMQ API, which is a documented public contract on which thousands of people
agree and depend. Instead, we construct a higher-level API on top based on our experience so far, and most
specifically, our experience from writing more complex request-reply patterns.

What we want is an API that lets us receive and send an entire message in one shot, including the reply envelope
with any number of reply addresses. One that lets us do what we want with the absolute least lines of code.

Making a good message API is fairly difficult. We have a problem of terminology: ZeroMQ uses "message" to describe
both multipart messages, and individual message frames. We have a problem of expectations: sometimes it's natural
to see message content as printable string data, sometimes as binary blobs. And we have technical challenges,
especially if we want to avoid copying data around too much.

The challenge of making a good API affects all languages, though my specific use case is C. Whatever language you
use, think about how you could contribute to your language binding to make it as good (or better) than the C
binding I'm going to describe.

Features of a Higher-Level API

My solution is to use three fairly natural and obvious concepts: string (already the basis for our s_send and
s_recv helpers), frame (a message frame), and message (a list of one or more frames). Here is the worker code,
rewritten onto an API using these concepts:


while (true) {
    zmsg_t *msg = zmsg_recv (worker);
    zframe_reset (zmsg_last (msg), "OK", 2);
    zmsg_send (&msg, worker);
}

Cutting the amount of code we need to read and write complex messages is great: the results are easy to read and
understand. Let's continue this process for other aspects of working with ZeroMQ. Here's a wish list of things I'd
like in a higher-level API, based on my experience with ZeroMQ so far:

Automatic handling of sockets. I find it cumbersome to have to close sockets manually, and to have to explicitly
define the linger timeout in some (but not all) cases. It'd be great to have a way to close sockets automatically
when I close the context.

Portable thread management. Every nontrivial ZeroMQ application uses threads, but POSIX threads aren't portable. So
a decent high-level API should hide this under a portable layer.

Piping from parent to child threads. It's a recurrent problem: how to signal between parent and child threads. Our
API should provide a ZeroMQ message pipe (using PAIR sockets and inproc automatically).

Portable clocks. Even getting the time to a millisecond resolution, or sleeping for some milliseconds, is not
portable. Realistic ZeroMQ applications need portable clocks, so our API should provide them.

A reactor to replace zmq_poll(). The poll loop is simple, but clumsy. Writing a lot of these, we end up doing the
same work over and over: calculating timers, and calling code when sockets are ready. A simple reactor with socket
readers and timers would save a lot of repeated work.

Proper handling of Ctrl-C. We already saw how to catch an interrupt. It would be useful if this happened in all
applications.

The CZMQ High-Level API

Turning this wish list into reality for the C language gives us CZMQ, a ZeroMQ language binding for C. This
high-level binding, in fact, developed out of earlier versions of the examples. It combines nicer semantics for
working with ZeroMQ with some portability layers, and (importantly for C, but less for other languages) containers
like hashes and lists. CZMQ also uses an elegant object model that leads to frankly lovely code.

Here is the load balancing broker rewritten to use a higher-level API (CZMQ for the C case):

lbbroker2: Load balancing broker using high-level API in C


One thing CZMQ provides is clean interrupt handling. This means that Ctrl-C will cause any blocking ZeroMQ call to
exit with a return code -1 and errno set to EINTR. The high-level recv methods will return NULL in such cases. So,
you can cleanly exit a loop like this:

while (true) {
    zstr_send (client, "Hello");
    char *reply = zstr_recv (client);
    if (!reply)
        break;              //  Interrupted
    printf ("Client: %s\n", reply);
    free (reply);
    sleep (1);
}

Or, if you're calling zmq_poll(), test on the return code:

if (zmq_poll (items, 2, 1000 * 1000) == -1)
    break;              //  Interrupted

The previous example still uses zmq_poll(). So how about reactors? The CZMQ zloop reactor is simple but functional.
It lets you:

Set a reader on any socket, i.e., code that is called whenever the socket has input.
Cancel a reader on a socket.
Set a timer that goes off once or multiple times at specific intervals.
Cancel a timer.

zloop of course uses zmq_poll() internally. It rebuilds its poll set each time you add or remove readers, and it
calculates the poll timeout to match the next timer. Then, it calls the reader and timer handlers for each socket
and timer that need attention.
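The timeout calculation is the part of a reactor that is easiest to get wrong by hand. The following is a minimal sketch of the idea in plain C, not the zloop source: loop_timer_t and next_poll_timeout are hypothetical names, and the convention that -1 means "wait forever" is assumed to match the poll call being wrapped.

```c
#include <assert.h>
#include <stdint.h>

//  A pending timer, reduced to the absolute time (in msecs) when it is due.
//  Hypothetical type for illustration; a real reactor keeps more state.
typedef struct {
    int64_t when;
} loop_timer_t;

//  Return the poll timeout in msecs: time until the earliest timer is due,
//  zero if a timer is already overdue, or -1 (wait forever) with no timers.
static long
next_poll_timeout (const loop_timer_t *timers, int count, int64_t now)
{
    assert (count >= 0);
    if (count == 0)
        return -1;
    int64_t earliest = timers [0].when;
    for (int i = 1; i < count; i++)
        if (timers [i].when < earliest)
            earliest = timers [i].when;
    return earliest > now? (long) (earliest - now): 0;
}
```

With timers due at 5000, 1200, and 3000 msecs and the clock at 1000, the function returns 200. Once the clock passes 1200 it returns 0, so the poll call returns immediately and the overdue handler runs.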

When we use a reactor pattern, our code turns inside out. The main logic looks like this:

zloop_t *reactor = zloop_new ();
zloop_reader (reactor, self->backend, s_handle_backend, self);
zloop_start (reactor);
zloop_destroy (&reactor);

The actual handling of messages sits inside dedicated functions or methods. You may not like the style; it's a
matter of taste. What it does help with is mixing timers and socket activity. In the rest of this text, we'll use
zmq_poll() in simpler cases, and zloop in more complex examples.

Here is the load balancing broker rewritten once again, this time to use zloop:

lbbroker3: Load balancing broker using zloop in C


Getting applications to properly shut down when you send them Ctrl-C can be tricky. If you use the zctx class it'll
automatically set up signal handling, but your code still has to cooperate. You must break any loop if zmq_poll
returns -1 or if any of the zstr_recv, zframe_recv, or zmsg_recv methods return NULL. If you have nested loops, it
can be useful to make the outer ones conditional on !zctx_interrupted.

If you're using child threads, they won't receive the interrupt. To tell them to shut down, you can either:

Destroy the context, if they are sharing the same context, in which case any blocking calls they are waiting on
will end with ETERM.
Send them shutdown messages, if they are using their own contexts. For this you'll need some socket plumbing.

The Asynchronous Client/Server Pattern

In the ROUTER to DEALER example, we saw a 1-to-N use case where one server talks asynchronously to multiple
workers. We can turn this upside down to get a very useful N-to-1 architecture where various clients talk to a
single server, and do this asynchronously.

Figure 37 - Asynchronous Client/Server


Here's how it works:

Clients connect to the server and send requests.
For each request, the server sends 0 or more replies.
Clients can send multiple requests without waiting for a reply.
Servers can send multiple replies without waiting for new requests.

Here's code that shows how this works:

asyncsrv: Asynchronous client/server in C


The example runs in one process, with multiple threads simulating a real multiprocess architecture. When you run
the example, you'll see three clients (each with a random ID), printing out the replies they get from the server.
Look carefully and you'll see each client task gets 0 or more replies per request.

Some comments on this code:

The clients send a request once per second, and get zero or more replies back. To make this work using zmq_poll(),
we can't simply poll with a 1-second timeout, or we'd end up sending a new request only one second after we
received the last reply. So we poll at a high frequency (100 times at 1/100th of a second per poll), which is
approximately accurate.

The server uses a pool of worker threads, each processing one request synchronously. It connects these to its
frontend socket using an internal queue. It connects the frontend and backend sockets using a zmq_proxy() call.
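The high-frequency polling in the first comment can be sketched without any sockets. In the code below, poll_fn stands in for a wrapper around zmq_poll(), and collect_replies_for_one_second and fake_poll are hypothetical names; fake_poll exists only so the loop can be exercised here.

```c
#include <assert.h>

//  Stand-in for a poll call: waits up to timeout_msecs and returns 1 if a
//  reply is ready, 0 on timeout. A real client would wrap zmq_poll() here.
typedef int (poll_fn) (int timeout_msecs, void *arg);

//  Run one client cycle: poll 100 times at 10 msecs each (about one second
//  in total), counting replies as they arrive. The caller sends its next
//  request after this returns, whether or not any replies came in.
static int
collect_replies_for_one_second (poll_fn *poll_once, void *arg)
{
    assert (poll_once);
    int replies = 0;
    for (int centitick = 0; centitick < 100; centitick++)
        if (poll_once (10, arg) > 0)
            replies++;
    return replies;
}

//  Fake poll for demonstration: reports a reply on every 25th call
static int
fake_poll (int timeout_msecs, void *arg)
{
    (void) timeout_msecs;
    int *calls = (int *) arg;
    return ++(*calls) % 25 == 0;
}
```

The cycle is only approximately one second long: a poll that finds a reply returns before its 10 msecs are up, so a busy client sends requests slightly faster than once per second.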

Figure 38 - Detail of Asynchronous Server


Note that we're doing DEALER to ROUTER dialog between client and server, but internally between the server main
thread and workers, we're doing DEALER to DEALER. If the workers were strictly synchronous, we'd use REP. However,
because we want to send multiple replies, we need an async socket. We do not want to route replies; they always go
to the single server thread that sent us the request.

Let's think about the routing envelope. The client sends a message consisting of a single frame. The server thread
receives a two-frame message (original message prefixed by client identity). We send these two frames on to the
worker, which treats it as a normal reply envelope and returns that to us as a two-frame message. We then use the
first frame as an identity to route the second frame back to the client as a reply.

It looks something like this:

     client          server       frontend       worker
   [ DEALER ]<---->[ ROUTER <----> DEALER <----> DEALER ]
             1 part         2 parts       2 parts

Now for the sockets: we could use the load balancing ROUTER to DEALER pattern to talk to workers, but it's extra
work. In this case, a DEALER to DEALER pattern is probably fine: the trade-off is lower latency for each request,
but higher risk of unbalanced work distribution. Simplicity wins in this case.

When you build servers that maintain stateful conversations with clients, you will run into a classic problem. If
the server keeps some state per client, and clients keep coming and going, eventually it will run out of resources.
Even if the same clients keep connecting, if you're using default identities, each connection will look like a new
one.

We cheat in the above example by keeping state only for a very short time (the time it takes a worker to process a
request) and then throwing away the state. But that's not practical for many cases. To properly manage client state
in a stateful asynchronous server, you have to:

Do heartbeating from client to server. In our example, we send a request once per second, which can reliably be
used as a heartbeat.

Store state using the client identity (whether generated or explicit) as key.

Detect a stopped heartbeat. If there's no request from a client within, say, two seconds, the server can detect
this and destroy any state it's holding for that client.
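These three steps amount to a table keyed by identity, with a timestamp per entry. Here is a minimal sketch in plain C, under several assumptions: identities are printable C strings (generated identities are binary, so a real server would store a length and compare with memcmp), the table has a fixed size, and client_state_t, client_seen, and clients_expire are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MAX_CLIENTS   16
#define IDENTITY_MAX  32
#define CLIENT_TTL    2000      //  Msecs of silence before we drop state

typedef struct {
    char identity [IDENTITY_MAX];   //  Client identity, used as the key
    int64_t last_seen;              //  Time of last request, in msecs
    int in_use;
} client_state_t;

//  Record a request from a client: refresh its entry or create one.
//  Returns 0 on success, -1 if the table is full.
static int
client_seen (client_state_t *table, const char *identity, int64_t now)
{
    assert (strlen (identity) < IDENTITY_MAX);
    client_state_t *free_slot = NULL;
    for (int i = 0; i < MAX_CLIENTS; i++) {
        if (table [i].in_use && strcmp (table [i].identity, identity) == 0) {
            table [i].last_seen = now;
            return 0;
        }
        if (!table [i].in_use && !free_slot)
            free_slot = &table [i];
    }
    if (!free_slot)
        return -1;
    strcpy (free_slot->identity, identity);
    free_slot->last_seen = now;
    free_slot->in_use = 1;
    return 0;
}

//  Destroy state for clients whose heartbeat stopped; returns how many
static int
clients_expire (client_state_t *table, int64_t now)
{
    int expired = 0;
    for (int i = 0; i < MAX_CLIENTS; i++)
        if (table [i].in_use && now - table [i].last_seen > CLIENT_TTL) {
            table [i].in_use = 0;
            expired++;
        }
    return expired;
}
```

A server would call client_seen for every incoming request and clients_expire once per poll cycle, after zeroing the table at startup.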

Worked Example: Inter-Broker Routing

Let's take everything we've seen so far, and scale things up to a real application. We'll build this step-by-step
over several iterations. Our best client calls us urgently and asks for a design of a large cloud computing
facility. He has this vision of a cloud that spans many data centers, each a cluster of clients and workers, and
that works together as a whole. Because we're smart enough to know that practice always beats theory, we propose to
make a working simulation using ZeroMQ. Our client, eager to lock down the budget before his own boss changes his
mind, and having read great things about ZeroMQ on Twitter, agrees.

Establishing the Details

Several espressos later, we want to jump into writing code, but a little voice tells us to get more details before
making a sensational solution to entirely the wrong problem. "What kind of work is the cloud doing?", we ask.

The client explains:

Workers run on various kinds of hardware, but they are all able to handle any task. There are several hundred
workers per cluster, and as many as a dozen clusters in total.

Clients create tasks for workers. Each task is an independent unit of work and all the client wants is to find an
available worker, and send it the task, as soon as possible. There will be a lot of clients and they'll come and go
arbitrarily.

The real difficulty is to be able to add and remove clusters at any time. A cluster can leave or join the cloud
instantly, bringing all its workers and clients with it.

If there are no workers in their own cluster, clients' tasks will go off to other available workers in the cloud.

Clients send out one task at a time, waiting for a reply. If they don't get an answer within X seconds, they'll
just send out the task again. This isn't our concern; the client API does it already.

Workers process one task at a time; they are very simple beasts. If they crash, they get restarted by whatever
script started them.

So we double-check to make sure that we understood this correctly:

"There will be some kind of super-duper network interconnect between clusters, right?", we ask. The client says,
"Yes, of course, we're not idiots."

"What kind of volumes are we talking about?", we ask. The client replies, "Up to a thousand clients per cluster,
each doing at most ten requests per second. Requests are small, and replies are also small, no more than 1K bytes
each."

So we do a little calculation and see that this will work nicely over plain TCP. 2,500 clients x 10/second x 1,000
bytes x 2 directions = 50MB/sec or 400Mb/sec, not a problem for a 1Gb network.
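The calculation is simple enough to do in your head, but writing it down makes it easy to rerun with other figures. This sketch uses two hypothetical helper functions and assumes decimal units (1 MB = 1,000,000 bytes, 1 Mb = 1,000,000 bits), which is what the figures above imply.

```c
#include <assert.h>
#include <stdint.h>

//  Peak traffic in bytes per second, counting requests and replies
static int64_t
peak_bytes_per_second (int64_t clients, int64_t requests_per_second,
                       int64_t bytes_per_message)
{
    assert (clients >= 0 && requests_per_second >= 0 && bytes_per_message >= 0);
    return clients * requests_per_second * bytes_per_message * 2;
}

//  Convert bytes per second to megabits per second (1 Mb = 1,000,000 bits)
static int64_t
megabits_per_second (int64_t bytes_per_second)
{
    return bytes_per_second * 8 / 1000000;
}
```

For 2,500 clients at 10 requests per second and 1,000 bytes per message, peak_bytes_per_second gives 50,000,000 and megabits_per_second gives 400, matching the figures in the text.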

It's a straightforward problem that requires no exotic hardware or protocols, just some clever routing algorithms
and careful design. We start by designing one cluster (one data center) and then we figure out how to connect
clusters together.

Architecture of a Single Cluster

Workers and clients are synchronous. We want to use the load balancing pattern to route tasks to workers. Workers
are all identical; our facility has no notion of different services. Workers are anonymous; clients never address
them directly. We make no attempt here to provide guaranteed delivery, retry, and so on.

For reasons we already examined, clients and workers won't speak to each other directly. It makes it impossible to
add or remove nodes dynamically. So our basic model consists of the request-reply message broker we saw earlier.

Figure 39 - Cluster Architecture

Scaling to Multiple Clusters

Now we scale this out to more than one cluster. Each cluster has a set of clients and workers, and a broker that
joins these together.

Figure 40 - Multiple Clusters


The question is: how do we get the clients of each cluster talking to the workers of the other cluster? There are a
few possibilities, each with pros and cons:

Clients could connect directly to both brokers. The advantage is that we don't need to modify brokers or workers.
But clients get more complex and become aware of the overall topology. If we want to add a third or fourth cluster,
for example, all the clients are affected. In effect we have to move routing and failover logic into the clients
and that's not nice.

Workers might connect directly to both brokers. But REQ workers can't do that; they can only reply to one broker.
We might use REPs but REPs don't give us customizable broker-to-worker routing like load balancing does, only the
built-in load balancing. That's a fail; if we want to distribute work to idle workers, we precisely need load
balancing. One solution would be to use ROUTER sockets for the worker nodes. Let's label this "Idea #1".

Brokers could connect to each other. This looks neatest because it creates the fewest additional connections. We
can't add clusters on the fly, but that is probably out of scope. Now clients and workers remain ignorant of the
real network topology, and brokers tell each other when they have spare capacity. Let's label this "Idea #2".

Let's explore Idea #1. In this model, we have workers connecting to both brokers and accepting jobs from either
one.

Figure 41 - Idea 1: Cross-connected Workers

It looks feasible. However, it doesn't provide what we wanted, which was that clients get local workers if possible
and remote workers only if it's better than waiting. Also workers will signal "ready" to both brokers and can get
two jobs at once, while other workers remain idle. It seems this design fails because again we're putting routing
logic at the edges.

So, idea #2 then. We interconnect the brokers and don't touch the clients or workers, which are REQs like we're
used to.

Figure 42 - Idea 2: Brokers Talking to Each Other

This design is appealing because the problem is solved in one place, invisible to the rest of the world. Basically,
brokers open secret channels to each other and whisper, like camel traders, "Hey, I've got some spare capacity. If
you have too many clients, give me a shout and we'll deal".

In effect it is just a more sophisticated routing algorithm: brokers become subcontractors for each other. There
are other things to like about this design, even before we play with real code:

It treats the common case (clients and workers on the same cluster) as default and does extra work for the
exceptional case (shuffling jobs between clusters).

It lets us use different message flows for the different types of work. That means we can handle them differently,
e.g., using different types of network connection.

It feels like it would scale smoothly. Interconnecting three or more brokers doesn't get overly complex. If we find
this to be a problem, it's easy to solve by adding a super-broker.

We'll now make a worked example. We'll pack an entire cluster into one process. That is obviously not realistic,
but it makes it simple to simulate, and the simulation can accurately scale to real processes. This is the beauty
of ZeroMQ: you can design at the micro-level and scale that up to the macro-level. Threads become processes, and
then become boxes, and the patterns and logic remain the same. Each of our "cluster" processes contains client
threads, worker threads, and a broker thread.

We know the basic model well by now:

The client (REQ) threads create workloads and pass them to the broker (ROUTER).
The worker (REQ) threads process workloads and return the results to the broker (ROUTER).
The broker queues and distributes workloads using the load balancing pattern.

Federation Versus Peering

There are several possible ways to interconnect brokers. What we want is to be able to tell other brokers, "we have
capacity", and then receive multiple tasks. We also need to be able to tell other brokers, "stop, we're full". It
doesn't need to be perfect; sometimes we may accept jobs we can't process immediately, then we'll do them as soon
as possible.

The simplest interconnect is federation, in which brokers simulate clients and workers for each other. We would do
this by connecting our frontend to the other broker's backend socket. Note that it is legal to both bind a socket
to an endpoint and connect it to other endpoints.

Figure 43 - Cross-connected Brokers in Federation Model


This would give us simple logic in both brokers and a reasonably good mechanism: when there are no workers, tell
the other broker "ready", and accept one job from it. The problem is that it is too simple for this problem. A
federated broker would be able to handle only one task at a time. If the broker emulates a lock-step client and
worker, it is by definition also going to be lock-step, and if it has lots of available workers they won't be used.
Our brokers need to be connected in a fully asynchronous fashion.

The federation model is perfect for other kinds of routing, especially service-oriented architectures (SOAs), which
route by service name and proximity rather than load balancing or round robin. So don't dismiss it as useless; it's
just not right for all use cases.

Instead of federation, let's look at a peering approach in which brokers are explicitly aware of each other and
talk over privileged channels. Let's break this down, assuming we want to interconnect N brokers. Each broker has
(N - 1) peers, and all brokers are using exactly the same code and logic. There are two distinct flows of
information between brokers:

Each broker needs to tell its peers how many workers it has available at any time. This can be fairly simple
information: just a quantity that is updated regularly. The obvious (and correct) socket pattern for this is
pub-sub. So every broker opens a PUB socket and publishes state information on that, and every broker also opens a
SUB socket and connects that to the PUB socket of every other broker to get state information from its peers.

Each broker needs a way to delegate tasks to a peer and get replies back, asynchronously. We'll do this using
ROUTER sockets; no other combination works. Each broker has two such sockets: one for tasks it receives and one for
tasks it delegates. If we didn't use two sockets, it would be more work to know whether we were reading a request
or a reply each time. That would mean adding more information to the message envelope.

And there is also the flow of information between a broker and its local clients and workers.

The Naming Ceremony

Three flows x two sockets for each flow = six sockets that we have to manage in the broker. Choosing good names is
vital to keeping a multisocket juggling act reasonably coherent in our minds. Sockets do something and what they do
should form the basis for their names. It's about being able to read the code several weeks later on a cold Monday
morning before coffee, and not feel any pain.

Let's do a shamanistic naming ceremony for the sockets. The three flows are:

A local request-reply flow between the broker and its clients and workers.
A cloud request-reply flow between the broker and its peer brokers.
A state flow between the broker and its peer brokers.

Finding meaningful names that are all the same length means our code will align nicely. It's not a big thing, but
attention to details helps. For each flow the broker has two sockets that we can orthogonally call the frontend and
backend. We've used these names quite often. A frontend receives information or tasks. A backend sends those out to
other peers. The conceptual flow is from front to back (with replies going in the opposite direction from back to
front).

So in all the code we write for this tutorial, we will use these socket names:

localfe and localbe for the local flow.
cloudfe and cloudbe for the cloud flow.
statefe and statebe for the state flow.

For our transport and because we're simulating the whole thing on one box, we'll use ipc for everything. This has
the advantage of working like tcp in terms of connectivity (i.e., it's a disconnected transport, unlike inproc),
yet we don't need IP addresses or DNS names, which would be a pain here. Instead, we will use ipc endpoints called
something-local, something-cloud, and something-state, where something is the name of our simulated cluster.
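Building these endpoint names is a one-line format call. The sketch below shows one way to do it with snprintf; cluster_endpoint is a hypothetical helper, and the exact "ipc://NAME-FLOW.ipc" layout is an assumption made for this illustration rather than something the text above specifies.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

//  Format the ipc endpoint for one flow ("local", "cloud", or "state") of
//  the named cluster. The "ipc://NAME-FLOW.ipc" layout is an assumption
//  made for this sketch. Returns 0 on success, -1 if the buffer is too small.
static int
cluster_endpoint (char *buffer, size_t size,
                  const char *cluster, const char *flow)
{
    assert (buffer && cluster && flow);
    int length = snprintf (buffer, size, "ipc://%s-%s.ipc", cluster, flow);
    return (length >= 0 && (size_t) length < size)? 0: -1;
}
```

For a cluster named DC1 and the state flow, this produces ipc://DC1-state.ipc. A broker would bind its own endpoint for each flow and connect to the matching endpoint of each peer.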

You might be thinking that this is a lot of work for some names. Why not call them s1, s2, s3, s4, etc.? The answer
is that if your brain is not a perfect machine, you need a lot of help when reading code, and we'll see that these
names do help. It's easier to remember "three flows, two directions" than "six different sockets".

Figure 44 - Broker Socket Arrangement

Note that we connect the cloudbe in each broker to the cloudfe in every other broker, and likewise we connect the
statebe in each broker to the statefe in every other broker.

Prototyping the State Flow

Because each socket flow has its own little traps for the unwary, we will test them in real code one-by-one, rather
than try to throw the whole lot into code in one go. When we're happy with each flow, we can put them together into
a full program. We'll start with the state flow.

Figure 45 - The State Flow

Here is how this works in code:

peering1: Prototype state flow in C


Notes about this code:

Each broker has an identity that we use to construct ipc endpoint names. A real broker would need to work with TCP
and a more sophisticated configuration scheme. We'll look at such schemes later in this book, but for now, using
generated ipc names lets us ignore the problem of where to get TCP/IP addresses or names.

We use a zmq_poll() loop as the core of the program. This processes incoming messages and sends out state messages.
We send a state message only if we did not get any incoming messages and we waited for a second. If we send out a
state message each time we get one in, we'll get message storms.

We use a two-part pub-sub message consisting of sender address and data. Note that we will need to know the address
of the publisher in order to send it tasks, and the only way is to send this explicitly as a part of the message.

We don't set identities on subscribers because if we did then we'd get outdated state information when connecting
to running brokers.

We don't set a HWM on the publisher, but if we were using ZeroMQ v2.x that would be a wise idea.

We can build this little program and run it three times to simulate three clusters. Let's call them DC1, DC2, and
DC3 (the names are arbitrary). We run these three commands, each in a separate window:

peering1 DC1 DC2 DC3  #  Start DC1 and connect to DC2 and DC3
peering1 DC2 DC1 DC3  #  Start DC2 and connect to DC1 and DC3
peering1 DC3 DC1 DC2  #  Start DC3 and connect to DC1 and DC2

You'll see each cluster report the state of its peers, and after a few seconds they will all happily be printing
random numbers once per second. Try this and satisfy yourself that the three brokers all match up and synchronize
to per-second state updates.

In real life, we'd not send out state messages at regular intervals, but rather whenever we had a state change,
i.e., whenever a worker becomes available or unavailable. That may seem like a lot of traffic, but state messages
are small and we've established that the inter-cluster connections are super fast.

If we wanted to send state messages at precise intervals, we'd create a child thread and open the statebe socket in
that thread. We'd then send irregular state updates to that child thread from our main thread and allow the child
thread to conflate them into regular outgoing messages. This is more work than we need here.

Prototyping the Local and Cloud Flows

Let's now prototype the flow of tasks via the local and cloud sockets. This code pulls requests from clients and
then distributes them to local workers and cloud peers on a random basis.

Figure 46 - The Flow of Tasks


Before we jump into the code, which is getting a little complex, let's sketch the core routing logic and break it
down into a simple yet robust design.

We need two queues, one for requests from local clients and one for requests from cloud clients. One option would
be to pull messages off the local and cloud frontends, and pump these onto their respective queues. But this is
kind of pointless because ZeroMQ sockets are queues already. So let's use the ZeroMQ socket buffers as queues.

This was the technique we used in the load balancing broker, and it worked nicely. We only read from the two
frontends when there is somewhere to send the requests. We can always read from the backends, as they give us
replies to route back. As long as the backends aren't talking to us, there's no point in even looking at the
frontends.

So our main loop becomes:

Poll the backends for activity. When we get a message, it may be "ready" from a worker or it may be a reply. If
it's a reply, route back via the local or cloud frontend.

If a worker replied, it became available, so we queue it and count it.

While there are workers available, take a request, if any, from either frontend and route to a local worker, or
randomly, to a cloud peer.

Randomly sending tasks to a peer broker rather than a worker simulates work distribution across the cluster. It's
dumb, but that is fine for this stage.

We use broker identities to route messages between brokers. Each broker has a name that we provide on the command
line in this simple prototype. As long as these names don't overlap with the ZeroMQ-generated UUIDs used for client
nodes, we can figure out whether to route a reply back to a client or to a broker.
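Deciding whether a reply goes to a client or to a broker comes down to comparing the first frame against the list of peer names. This is a minimal sketch of that comparison in plain C; address_is_peer is a hypothetical helper, and the peer list is assumed to be the broker names taken from the command line.

```c
#include <assert.h>
#include <string.h>

//  Return 1 if the address frame matches one of the peer broker names,
//  0 otherwise. The address is compared as raw bytes because identities
//  generated by ZeroMQ are binary and may contain zero bytes.
static int
address_is_peer (const void *address, size_t size,
                 const char *peers [], int peer_count)
{
    assert (address && peers);
    for (int i = 0; i < peer_count; i++)
        if (size == strlen (peers [i])
        &&  memcmp (address, peers [i], size) == 0)
            return 1;
    return 0;
}
```

A reply whose address matches a peer would be sent out through cloudfe, and any other reply through localfe. Comparing the length first keeps a short name such as "yo" from matching the peer "you".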

Here is how this works in code. The interesting part starts around the comment "Interesting part".

peering2: Prototype local and cloud flow in C


Run this by, for instance, starting two instances of the broker in two windows:

peering2 me you
peering2 you me

Some comments on this code:

In the C code at least, using the zmsg class makes life much easier, and our code much shorter. It's obviously an
abstraction that works. If you build ZeroMQ applications in C, you should use CZMQ.

Because we're not getting any state information from peers, we naively assume they are running. The code prompts
you to confirm when you've started all the brokers. In the real case, we'd not send anything to brokers who had not
told us they exist.

You can satisfy yourself that the code works by watching it run forever. If there were any misrouted messages,
clients would end up blocking, and the brokers would stop printing trace information. You can prove that by killing
either of the brokers. The other broker tries to send requests to the cloud, and one-by-one its clients block,
waiting for an answer.

Putting it All Together

Let's put this together into a single package. As before, we'll run an entire cluster as one process. We're going
to take the two previous examples and merge them into one properly working design that lets you simulate any number
of clusters.

This code is the size of both previous prototypes together, at 270 LoC. That's pretty good for a simulation of a
cluster that includes clients and workers and cloud workload distribution. Here is the code:

peering3: Full cluster simulation in C


It's a nontrivial program and took about a day to get working. These are the highlights:

The client threads detect and report a failed request. They do this by polling for a response and if none arrives
after a while (10 seconds), printing an error message.

Client threads don't print directly, but instead send a message to a monitor socket (PUSH) that the main loop
collects (PULL) and prints off. This is the first case we've seen of using ZeroMQ sockets for monitoring and
logging; this is a big use case that we'll come back to later.

Clients simulate varying loads to get the cluster 100% busy at random moments, so that tasks are shifted over to
the cloud. The number of clients and workers, and delays in the client and worker threads control this. Feel free
to play with them to see if you can make a more realistic simulation.

The main loop uses two pollsets. It could in fact use three: information, backends, and frontends. As in the
earlier prototype, there is no point in taking a frontend message if there is no backend capacity.

These are some of the problems that arose during development of this program:

Clients would freeze, due to requests or replies getting lost somewhere. Recall that the ROUTER socket drops
messages it can't route. The first tactic here was to modify the client thread to detect and report such problems.
Secondly, I put zmsg_dump() calls after every receive and before every send in the main loop, until the origin of
the problems was clear.

The main loop was mistakenly reading from more than one ready socket. This caused the first message to be lost. I
fixed that by reading only from the first ready socket.

The zmsg class was not properly encoding UUIDs as C strings. This caused UUIDs that contain 0 bytes to be
corrupted. I fixed that by modifying zmsg to encode UUIDs as printable hex strings.

This simulation does not detect disappearance of a cloud peer. If you start several peers and stop one, and it was
broadcasting capacity to the others, they will continue to send it work even if it's gone. You can try this, and
you will get clients that complain of lost requests. The solution is twofold: first, only keep the capacity
information for a short time so that if a peer does disappear, its capacity is quickly set to zero. Second, add
reliability to the request-reply chain. We'll look at reliability in the next chapter.

Chapter 4 - Reliable Request-Reply Patterns

Chapter 3 - Advanced Request-Reply Patterns covered advanced uses of ZeroMQ's request-reply pattern with working
examples. This chapter looks at the general question of reliability and builds a set of reliable messaging patterns
on top of ZeroMQ's core request-reply pattern.

In this chapter, we focus heavily on user-space request-reply patterns, reusable models that help you design your
own ZeroMQ architectures:

The Lazy Pirate pattern: reliable request-reply from the client side
The Simple Pirate pattern: reliable request-reply using load balancing
The Paranoid Pirate pattern: reliable request-reply with heartbeating
The Majordomo pattern: service-oriented reliable queuing
The Titanic pattern: disk-based/disconnected reliable queuing
The Binary Star pattern: primary-backup server failover
The Freelance pattern: brokerless reliable request-reply

What is "Reliability"?

Most people who speak of "reliability" don't really know what they mean. We can only define reliability in terms of failure. That is, if we can handle a certain set of well-defined and understood failures, then we are reliable with respect to those failures. No more, no less. So let's look at the possible causes of failure in a distributed ZeroMQ application, in roughly descending order of probability:

Application code is the worst offender. It can crash and exit, freeze and stop responding to input, run too slowly for its input, exhaust all memory, and so on.

System code such as brokers we write using ZeroMQ can die for the same reasons as application code. System code should be more reliable than application code, but it can still crash and burn, and especially run out of memory if it tries to queue messages for slow clients.

Message queues can overflow, typically in system code that has learned to deal brutally with slow clients. When a queue overflows, it starts to discard messages. So we get "lost" messages.

Networks can fail (e.g., WiFi gets switched off or goes out of range). ZeroMQ will automatically reconnect in such cases, but in the meantime, messages may get lost.

Hardware can fail and take with it all the processes running on that box.

Networks can fail in exotic ways, e.g., some ports on a switch may die and those parts of the network become inaccessible.

Entire data centers can be struck by lightning, earthquakes, fire, or more mundane power or cooling failures.

To make a software system fully reliable against all of these possible failures is an enormously difficult and expensive job and goes beyond the scope of this book.

Because the first five cases in the above list cover 99.9% of real world requirements outside large companies (according to a highly scientific study I just ran, which also told me that 78% of statistics are made up on the spot, and moreover never to trust a statistic that we didn't falsify ourselves), that's what we'll examine. If you're a large company with money to spend on the last two cases, contact my company immediately! There's a large hole behind my beach house waiting to be converted into an executive swimming pool.

Designing Reliability

So to make things brutally simple, reliability is "keeping things working properly when code freezes or crashes", a situation we'll shorten to "dies". However, the things we want to keep working properly are more complex than just messages. We need to take each core ZeroMQ messaging pattern and see how to make it work (if we can) even when code dies.

Let's take them one-by-one:

Request-reply: if the server dies (while processing a request), the client can figure that out because it won't get an answer back. Then it can give up in a huff, wait and try again later, find another server, and so on. As for the client dying, we can brush that off as "someone else's problem" for now.

Pub-sub: if the client dies (having gotten some data), the server doesn't know about it. Pub-sub doesn't send any information back from client to server. But the client can contact the server out-of-band, e.g., via request-reply, and ask, "please resend everything I missed". As for the server dying, that's out of scope for here. Subscribers can also self-verify that they're not running too slowly, and take action (e.g., warn the operator and die) if they are.

Pipeline: if a worker dies (while working), the ventilator doesn't know about it. Pipelines, like the grinding gears of time, only work in one direction. But the downstream collector can detect that one task didn't get done, and send a message back to the ventilator saying, "hey, resend task 324!" If the ventilator or collector dies, whatever upstream client originally sent the work batch can get tired of waiting and resend the whole lot. It's not elegant, but system code should really not die often enough to matter.

In this chapter we'll focus just on request-reply, which is the low-hanging fruit of reliable messaging.

The basic request-reply pattern (a REQ client socket doing a blocking send/receive to a REP server socket) scores low on handling the most common types of failure. If the server crashes while processing the request, the client just hangs forever. If the network loses the request or the reply, the client hangs forever.

Request-reply is still much better than TCP, thanks to ZeroMQ's ability to reconnect peers silently, to load balance messages, and so on. But it's still not good enough for real work. The only case where you can really trust the basic request-reply pattern is between two threads in the same process where there's no network or separate server process to die.

However, with a little extra work, this humble pattern becomes a good basis for real work across a distributed network, and we get a set of reliable request-reply (RRR) patterns that I like to call the Pirate patterns (you'll eventually get the joke, I hope).

There are, in my experience, roughly three ways to connect clients to servers. Each needs a specific approach to reliability:

Multiple clients talking directly to a single server. Use case: a single well-known server to which clients need to talk. Types of failure we aim to handle: server crashes and restarts, and network disconnects.

Multiple clients talking to a broker proxy that distributes work to multiple workers. Use case: service-oriented transaction processing. Types of failure we aim to handle: worker crashes and restarts, worker busy looping, worker overload, queue crashes and restarts, and network disconnects.

Multiple clients talking to multiple servers with no intermediary proxies. Use case: distributed services such as name resolution. Types of failure we aim to handle: service crashes and restarts, service busy looping, service overload, and network disconnects.

Each of these approaches has its trade-offs and often you'll mix them. We'll look at all three in detail.

Client-Side Reliability (Lazy Pirate Pattern)

We can get very simple reliable request-reply with some changes to the client. We call this the Lazy Pirate pattern. Rather than doing a blocking receive, we:

Poll the REQ socket and receive from it only when it's sure a reply has arrived.
Resend a request, if no reply has arrived within a timeout period.
Abandon the transaction if there is still no reply after several requests.
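
The three steps above boil down to a small decision made after each poll. Here is a minimal sketch of that decision in plain C, with no ZeroMQ calls, so that the retry logic can be read and tested on its own. The names (lazy_pirate_t, lp_start, lp_after_poll) and the retry count of 3 are assumptions made for this illustration; they are not taken from the lpclient listing.

```c
#include <assert.h>
#include <stdbool.h>

#define REQUEST_RETRIES 3       //  Attempts before we abandon (assumed value)

typedef enum {
    LP_DONE,                    //  Reply arrived; transaction complete
    LP_RESEND,                  //  Timed out; reopen the socket and resend
    LP_ABANDON                  //  Out of retries; give up on the server
} lp_action_t;

typedef struct {
    int retries_left;           //  Attempts remaining for current request
} lazy_pirate_t;

//  Call once before sending each new request
void lp_start (lazy_pirate_t *self)
{
    assert (self);
    self->retries_left = REQUEST_RETRIES;
}

//  Call after each poll on the REQ socket; reply_arrived is true when
//  the poll reported input before the timeout expired
lp_action_t lp_after_poll (lazy_pirate_t *self, bool reply_arrived)
{
    assert (self);
    if (reply_arrived)
        return LP_DONE;
    if (self->retries_left > 0)
        self->retries_left--;
    if (self->retries_left == 0)
        return LP_ABANDON;
    return LP_RESEND;
}
```

In a real client, LP_RESEND would be the point where we close the REQ socket, open a new one, reconnect, and send the same request again.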

If you try to use a REQ socket in anything other than a strict send/receive fashion, you'll get an error (technically, the REQ socket implements a small finite-state machine to enforce the send/receive ping-pong, and so the error code is called "EFSM"). This is slightly annoying when we want to use REQ in a pirate pattern, because we may send several requests before getting a reply.

The pretty good brute force solution is to close and reopen the REQ socket after an error:

lpclient: Lazy Pirate client in C

C++ | C# | Clojure | Delphi | Go | Haskell | Haxe | Java | Lua | Perl | PHP | Python | Ruby | Tcl | Ada | Basic | CL | Erlang | F# | Felix | Node.js | Objective-C | ooc | Q | Racket | Scala

Run this together with the matching server:

lpserver: Lazy Pirate server in C

C++ | C# | Clojure | Delphi | Go | Haskell | Haxe | Java | Lua | Perl | PHP | Python | Ruby | Scala | Tcl | Ada | Basic | CL | Erlang | F# | Felix | Node.js | Objective-C | ooc | Q | Racket

Figure 47 - The Lazy Pirate Pattern

To run this test case, start the client and the server in two console windows. The server will randomly misbehave after a few messages. You can check the client's response. Here is typical output from the server:

I: normal request (1)
I: normal request (2)
I: normal request (3)
I: simulating CPU overload
I: normal request (4)
I: simulating a crash

And here is the client's response:

I: connecting to server...
I: server replied OK (1)
I: server replied OK (2)
I: server replied OK (3)
W: no response from server, retrying...
I: connecting to server...
W: no response from server, retrying...
I: connecting to server...
E: server seems to be offline, abandoning

The client sequences each message and checks that replies come back exactly in order: that no requests or replies are lost, and no replies come back more than once, or out of order. Run the test a few times until you're convinced that this mechanism actually works. You don't need sequence numbers in a production application; they just help us trust our design.

The client uses a REQ socket, and does the brute force close/reopen because REQ sockets impose that strict send/receive cycle. You might be tempted to use a DEALER instead, but it would not be a good decision. First, it would mean emulating the secret sauce that REQ does with envelopes (if you've forgotten what that is, it's a good sign you don't want to have to do it). Second, it would mean potentially getting back replies that you didn't expect.

Handling failures only at the client works when we have a set of clients talking to a single server. It can handle a server crash, but only if recovery means restarting that same server. If there's a permanent error, such as a dead power supply on the server hardware, this approach won't work. Because the application code in servers is usually the biggest source of failures in any architecture, depending on a single server is not a great idea.

So, pros and cons:

Pro: simple to understand and implement.
Pro: works easily with existing client and server application code.
Pro: ZeroMQ automatically retries the actual reconnection until it works.
Con: doesn't failover to backup or alternate servers.

Basic Reliable Queuing (Simple Pirate Pattern)

Our second approach extends the Lazy Pirate pattern with a queue proxy that lets us talk, transparently, to multiple servers, which we can more accurately call "workers". We'll develop this in stages, starting with a minimal working model, the Simple Pirate pattern.

In all these Pirate patterns, workers are stateless. If the application requires some shared state, such as a shared database, we don't know about it as we design our messaging framework. Having a queue proxy means workers can come and go without clients knowing anything about it. If one worker dies, another takes over. This is a nice, simple topology with only one real weakness, namely the central queue itself, which can become a problem to manage, and a single point of failure.

Figure 48 - The Simple Pirate Pattern

The basis for the queue proxy is the load balancing broker from Chapter 3 - Advanced Request-Reply Patterns. What is the very minimum we need to do to handle dead or blocked workers? Turns out, it's surprisingly little. We already have a retry mechanism in the client. So using the load balancing pattern will work pretty well. This fits with ZeroMQ's philosophy that we can extend a peer-to-peer pattern like request-reply by plugging naive proxies in the middle.

We don't need a special client; we're still using the Lazy Pirate client. Here is the queue, which is identical to the main task of the load balancing broker:

spqueue: Simple Pirate queue in C

C++ | C# | Clojure | Delphi | Go | Haskell | Haxe | Java | Lua | PHP | Python | Tcl | Ada | Basic | CL | Erlang | F# | Felix | Node.js | Objective-C | ooc | Perl | Q | Racket | Ruby | Scala

Here is the worker, which takes the Lazy Pirate server and adapts it for the load balancing pattern (using the REQ "ready" signaling):

spworker: Simple Pirate worker in C

C++ | C# | Clojure | Delphi | Go | Haskell | Haxe | Java | Lua | PHP | Python | Tcl | Ada | Basic | CL | Erlang | F# | Felix | Node.js | Objective-C | ooc | Perl | Q | Racket | Ruby | Scala

To test this, start a handful of workers, a Lazy Pirate client, and the queue, in any order. You'll see that the workers eventually all crash and burn, and the client retries and then gives up. The queue never stops, and you can restart workers and clients ad nauseam. This model works with any number of clients and workers.

Robust Reliable Queuing (Paranoid Pirate Pattern)

Figure 49 - The Paranoid Pirate Pattern

The Simple Pirate Queue pattern works pretty well, especially because it's just a combination of two existing patterns. Still, it does have some weaknesses:

It's not robust in the face of a queue crash and restart. The client will recover, but the workers won't. While ZeroMQ will reconnect workers' sockets automatically, as far as the newly started queue is concerned, the workers haven't signaled ready, so don't exist. To fix this, we have to do heartbeating from queue to worker so that the worker can detect when the queue has gone away.

The queue does not detect worker failure, so if a worker dies while idle, the queue can't remove it from its worker queue until the queue sends it a request. The client waits and retries for nothing. It's not a critical problem, but it's not nice. To make this work properly, we do heartbeating from worker to queue, so that the queue can detect a lost worker at any stage.

We'll fix these in a properly pedantic Paranoid Pirate Pattern.

We previously used a REQ socket for the worker. For the Paranoid Pirate worker, we'll switch to a DEALER socket. This has the advantage of letting us send and receive messages at any time, rather than the lock-step send/receive that REQ imposes. The downside of DEALER is that we have to do our own envelope management (re-read Chapter 3 - Advanced Request-Reply Patterns for background on this concept).

We're still using the Lazy Pirate client. Here is the Paranoid Pirate queue proxy:

ppqueue: Paranoid Pirate queue in C

C++ | C# | Go | Haskell | Haxe | Java | Lua | PHP | Python | Tcl | Ada | Basic | Clojure | CL | Delphi | Erlang | F# | Felix | Node.js | Objective-C | ooc | Perl | Q | Racket | Ruby | Scala

The queue extends the load balancing pattern with heartbeating of workers. Heartbeating is one of those "simple" things that can be difficult to get right. I'll explain more about that in a second.

Here is the Paranoid Pirate worker:

ppworker: Paranoid Pirate worker in C

C++ | C# | Go | Haskell | Haxe | Java | Lua | PHP | Python | Tcl | Ada | Basic | Clojure | CL | Delphi | Erlang | F# | Felix | Node.js | Objective-C | ooc | Perl | Q | Racket | Ruby | Scala

Some comments about this example:

The code includes simulation of failures, as before. This makes it (a) very hard to debug, and (b) dangerous to reuse. When you want to debug this, disable the failure simulation.

The worker uses a reconnect strategy similar to the one we designed for the Lazy Pirate client, with two major differences: (a) it does an exponential back-off, and (b) it retries indefinitely (whereas the client retries a few times before reporting a failure).

Try the client, queue, and workers, such as by using a script like this:

ppqueue &
for i in 1 2 3 4; do
    ppworker &
    sleep 1
done
lpclient &

You should see the workers die one-by-one as they simulate a crash, and the client eventually give up. You can stop and restart the queue and both client and workers will reconnect and carry on. And no matter what you do to queues and workers, the client will never get an out-of-order reply: the whole chain either works, or the client abandons.

Heartbeating

Heartbeating solves the problem of knowing whether a peer is alive or dead. This is not an issue specific to ZeroMQ. TCP has a long timeout (30 minutes or so), which means that it can be impossible to know whether a peer has died, been disconnected, or gone on a weekend to Prague with a case of vodka, a redhead, and a large expense account.

It's not easy to get heartbeating right. When writing the Paranoid Pirate examples, it took about five hours to get the heartbeating working properly. The rest of the request-reply chain took perhaps ten minutes. It is especially easy to create "false failures", i.e., when peers decide that they are disconnected because the heartbeats aren't sent properly.

We'll look at the three main answers people use for heartbeating with ZeroMQ.

Shrugging It Off

The most common approach is to do no heartbeating at all and hope for the best. Many if not most ZeroMQ applications do this. ZeroMQ encourages this by hiding peers in many cases. What problems does this approach cause?

When we use a ROUTER socket in an application that tracks peers, as peers disconnect and reconnect, the application will leak memory (resources that the application holds for each peer) and get slower and slower.

When we use SUB- or DEALER-based data recipients, we can't tell the difference between good silence (there's no data) and bad silence (the other end died). When a recipient knows the other side died, it can for example switch over to a backup route.

If we use a TCP connection that stays silent for a long while, it will, in some networks, just die. Sending something (technically, a "keep-alive" more than a heartbeat) will keep the network alive.

One-Way Heartbeats

A second option is to send a heartbeat message from each node to its peers every second or so. When one node hears nothing from another within some timeout (several seconds, typically), it will treat that peer as dead. Sounds good, right? Sadly, no. This works in some cases but has nasty edge cases in others.

For pub-sub, this does work, and it's the only model you can use. SUB sockets cannot talk back to PUB sockets, but PUB sockets can happily send "I'm alive" messages to their subscribers.

As an optimization, you can send heartbeats only when there is no real data to send. Furthermore, you can send heartbeats progressively slower and slower, if network activity is an issue (e.g., on mobile networks where activity drains the battery). As long as the recipient can detect a failure (sharp stop in activity), that's fine.

Here are the typical problems with this design:

It can be inaccurate when we send large amounts of data, as heartbeats will be delayed behind that data. If heartbeats are delayed, you can get false timeouts and disconnections due to network congestion. Thus, always treat any incoming data as a heartbeat, whether or not the sender optimizes out heartbeats.

While the pub-sub pattern will drop messages for disappeared recipients, PUSH and DEALER sockets will queue them. So if you send heartbeats to a dead peer and it comes back, it will get all the heartbeats you sent, which can be thousands. Whoa, whoa!

This design assumes that heartbeat timeouts are the same across the whole network. But that won't be accurate. Some peers will want very aggressive heartbeating in order to detect faults rapidly. And some will want very relaxed heartbeating, in order to let sleeping networks lie and save power.

Ping-Pong Heartbeats

The third option is to use a ping-pong dialog. One peer sends a ping command to the other, which replies with a pong command. Neither command has any payload. Pings and pongs are not correlated. Because the roles of "client" and "server" are arbitrary in some networks, we usually specify that either peer can in fact send a ping and expect a pong in response. However, because the timeouts depend on network topologies known best to dynamic clients, it is usually the client that pings the server.

This works for all ROUTER-based brokers. The same optimizations we used in the second model make this work even better: treat any incoming data as a pong, and only send a ping when not otherwise sending data.
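
The two optimizations can be captured with two timestamps per peer. The following is a sketch only: the type and function names (pingpong_t and friends) are invented for this illustration, times are plain millisecond counters supplied by the caller, and the actual sending and receiving of ping and pong commands is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    uint64_t ping_interval;     //  Msecs of outgoing silence before a ping
    uint64_t peer_timeout;      //  Msecs of incoming silence before "dead"
    uint64_t last_sent;         //  When we last sent anything to the peer
    uint64_t last_received;     //  When we last got anything from the peer
} pingpong_t;

void pingpong_init (pingpong_t *self, uint64_t now,
                    uint64_t ping_interval, uint64_t peer_timeout)
{
    assert (self);
    self->ping_interval = ping_interval;
    self->peer_timeout = peer_timeout;
    self->last_sent = now;
    self->last_received = now;
}

//  Call after sending any message, whether ping or real data
void pingpong_sent (pingpong_t *self, uint64_t now)
{
    assert (self);
    self->last_sent = now;
}

//  Call after receiving any message; all incoming data counts as a pong
void pingpong_received (pingpong_t *self, uint64_t now)
{
    assert (self);
    self->last_received = now;
}

//  True if we have been silent long enough that a ping is due
bool pingpong_should_ping (const pingpong_t *self, uint64_t now)
{
    assert (self);
    return now - self->last_sent >= self->ping_interval;
}

//  True if the peer has been silent for the whole timeout
bool pingpong_peer_dead (const pingpong_t *self, uint64_t now)
{
    assert (self);
    return now - self->last_received >= self->peer_timeout;
}
```

Because pingpong_sent is called for real data too, a busy connection never sends a ping at all, which is the point of the optimization.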

Heartbeating for Paranoid Pirate

For Paranoid Pirate, we chose the second approach. It might not have been the simplest option: if designing this today, I'd probably try a ping-pong approach instead. However the principles are similar. The heartbeat messages flow asynchronously in both directions, and either peer can decide the other is "dead" and stop talking to it.

In the worker, this is how we handle heartbeats from the queue:

We calculate a liveness, which is how many heartbeats we can still miss before deciding the queue is dead. It starts at three and we decrement it each time we miss a heartbeat.
We wait, in the zmq_poll loop, for one second each time, which is our heartbeat interval.
If there's any message from the queue during that time, we reset our liveness to three.
If there's no message during that time, we count down our liveness.
If the liveness reaches zero, we consider the queue dead.
If the queue is dead, we destroy our socket, create a new one, and reconnect.
To avoid opening and closing too many sockets, we wait for a certain interval before reconnecting, and we double the interval each time until it reaches 32 seconds.

And this is how we handle heartbeats to the queue:

We calculate when to send the next heartbeat; this is a single variable because we're talking to one peer, the queue.
In the zmq_poll loop, whenever we pass this time, we send a heartbeat to the queue.

Here's the essential heartbeating code for the worker:

#define HEARTBEAT_LIVENESS  3       //  3-5 is reasonable
#define HEARTBEAT_INTERVAL  1000    //  msecs
#define INTERVAL_INIT       1000    //  Initial reconnect
#define INTERVAL_MAX       32000    //  After exponential backoff

//  If liveness hits zero, queue is considered disconnected
size_t liveness = HEARTBEAT_LIVENESS;
size_t interval = INTERVAL_INIT;

//  Send out heartbeats at regular intervals
uint64_t heartbeat_at = zclock_time () + HEARTBEAT_INTERVAL;

while (true) {
    zmq_pollitem_t items [] = { { worker, 0, ZMQ_POLLIN, 0 } };
    int rc = zmq_poll (items, 1, HEARTBEAT_INTERVAL * ZMQ_POLL_MSEC);

    if (items [0].revents & ZMQ_POLLIN) {
        //  Receive any message from queue
        liveness = HEARTBEAT_LIVENESS;
        interval = INTERVAL_INIT;
    }
    else
    if (--liveness == 0) {
        //  Missed too many heartbeats: back off, then start afresh
        zclock_sleep (interval);
        if (interval < INTERVAL_MAX)
            interval *= 2;
        zsocket_destroy (ctx, worker);
        //  (Creating the new socket and reconnecting is elided here)
        liveness = HEARTBEAT_LIVENESS;
    }
    //  Send heartbeat to queue if it's time
    if (zclock_time () > heartbeat_at) {
        heartbeat_at = zclock_time () + HEARTBEAT_INTERVAL;
        //  Send heartbeat message to queue
    }
}

The queue does the same, but manages an expiration time for each worker.
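
To make "an expiration time for each worker" concrete, here is a small sketch of the bookkeeping in plain C. It is not the ppqueue code: the fixed-size array, the names worker_list_t, workers_refresh, and workers_purge, and the limits of 16 workers and 32-byte identities are all assumptions chosen to keep the example self-contained. The caller supplies the current time in msecs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define HEARTBEAT_LIVENESS  3       //  Heartbeats a worker may miss
#define HEARTBEAT_INTERVAL  1000    //  msecs
#define MAX_WORKERS         16      //  Assumed limit for this sketch
#define IDENTITY_MAX        32      //  Assumed identity buffer size

typedef struct {
    char identity [IDENTITY_MAX];   //  Printable worker identity
    uint64_t expiry;                //  Worker is dead at or after this time
} worker_t;

typedef struct {
    worker_t workers [MAX_WORKERS];
    size_t count;
} worker_list_t;

//  Call on READY, HEARTBEAT, or any other message from a worker; adds
//  the worker if it is new, and pushes its expiry into the future
bool workers_refresh (worker_list_t *self, const char *identity, uint64_t now)
{
    assert (self && identity);
    uint64_t expiry = now + HEARTBEAT_INTERVAL * HEARTBEAT_LIVENESS;
    for (size_t i = 0; i < self->count; i++) {
        if (strcmp (self->workers [i].identity, identity) == 0) {
            self->workers [i].expiry = expiry;
            return true;
        }
    }
    if (self->count == MAX_WORKERS || strlen (identity) >= IDENTITY_MAX)
        return false;
    strcpy (self->workers [self->count].identity, identity);
    self->workers [self->count].expiry = expiry;
    self->count++;
    return true;
}

//  Drop every worker whose expiry has passed; returns how many were dropped
size_t workers_purge (worker_list_t *self, uint64_t now)
{
    assert (self);
    size_t kept = 0;
    for (size_t i = 0; i < self->count; i++)
        if (now < self->workers [i].expiry)
            self->workers [kept++] = self->workers [i];
    size_t purged = self->count - kept;
    self->count = kept;
    return purged;
}
```

The queue would call workers_purge once per pass through its poll loop, so a worker that goes silent is forgotten after three missed intervals.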

Here are some tips for your own heartbeating implementation:

Use zmq_poll or a reactor as the core of your application's main task.

Start by building the heartbeating between peers, test it by simulating failures, and then build the rest of the message flow. Adding heartbeating afterwards is much trickier.

Use simple tracing, i.e., print to console, to get this working. To help you trace the flow of messages between peers, use a dump method such as zmsg offers, and number your messages incrementally so you can see if there are gaps.

In a real application, heartbeating must be configurable and usually negotiated with the peer. Some peers will want aggressive heartbeating, as low as 10 msecs. Other peers will be far away and want heartbeating as high as 30 seconds.

If you have different heartbeat intervals for different peers, your poll timeout should be the lowest (shortest time) of these. Do not use an infinite timeout.

Do heartbeating on the same socket you use for messages, so your heartbeats also act as a keep-alive to stop the network connection from going stale (some firewalls can be unkind to silent connections).
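
The tip about poll timeouts is easy to get wrong when the peer list can be empty. A minimal sketch, assuming intervals are kept as msecs in a plain array; the function name poll_timeout and the fallback parameter are made up for this example:

```c
#include <assert.h>
#include <stddef.h>

//  Returns the shortest heartbeat interval in msecs, or fallback when the
//  peer list is empty, so the caller never polls with an infinite timeout
long poll_timeout (const long *intervals, size_t count, long fallback)
{
    assert (fallback > 0);
    long timeout = fallback;
    for (size_t i = 0; i < count; i++) {
        assert (intervals [i] > 0);
        if (i == 0 || intervals [i] < timeout)
            timeout = intervals [i];
    }
    return timeout;
}
```

The result would be multiplied by ZMQ_POLL_MSEC before being passed to zmq_poll, as in the worker code above.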

Contracts and Protocols

If you're paying attention, you'll realize that Paranoid Pirate is not interoperable with Simple Pirate, because of the heartbeats. But how do we define "interoperable"? To guarantee interoperability, we need a kind of contract, an agreement that lets different teams in different times and places write code that is guaranteed to work together. We call this a "protocol".

It's fun to experiment without specifications, but that's not a sensible basis for real applications. What happens if we want to write a worker in another language? Do we have to read code to see how things work? What if we want to change the protocol for some reason? Even a simple protocol will, if it's successful, evolve and become more complex.

Lack of contracts is a sure sign of a disposable application. So let's write a contract for this protocol. How do we do that?

There's a wiki at rfc.zeromq.org that we made especially as a home for public ZeroMQ contracts.
To create a new specification, register on the wiki if needed, and follow the instructions. It's fairly straightforward, though writing technical texts is not everyone's cup of tea.

It took me about fifteen minutes to draft the new Pirate Pattern Protocol. It's not a big specification, but it does capture enough to act as the basis for arguments ("your queue isn't PPP compatible; please fix it!").

Turning PPP into a real protocol would take more work:

There should be a protocol version number in the READY command so that it's possible to distinguish between different versions of PPP.

Right now, READY and HEARTBEAT are not entirely distinct from requests and replies. To make them distinct, we would need a message structure that includes a "message type" part.
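
To illustrate both points, here is one way a header frame carrying a version and a message type could look. This layout is purely hypothetical and is not part of the PPP specification: the five-byte frame, the "PPP" signature, the version value, and the MSG_* codes are all invented for this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define PROTO_NAME      "PPP"   //  Three-byte signature (hypothetical)
#define PROTO_VERSION   1       //  Hypothetical version number
#define HEADER_SIZE     5       //  Signature + version + type

typedef enum {
    MSG_READY = 1,              //  Hypothetical type codes
    MSG_HEARTBEAT = 2,
    MSG_REQUEST = 3,
    MSG_REPLY = 4
} msg_type_t;

//  Fill a HEADER_SIZE-byte frame with signature, version, and type
void header_write (unsigned char *frame, msg_type_t type)
{
    assert (frame);
    memcpy (frame, PROTO_NAME, 3);
    frame [3] = PROTO_VERSION;
    frame [4] = (unsigned char) type;
}

//  Validate a header frame; on success store its type and return true
bool header_read (const unsigned char *frame, size_t size, msg_type_t *type)
{
    assert (frame && type);
    if (size != HEADER_SIZE
    ||  memcmp (frame, PROTO_NAME, 3) != 0
    ||  frame [3] != PROTO_VERSION
    ||  frame [4] < MSG_READY
    ||  frame [4] > MSG_REPLY)
        return false;
    *type = (msg_type_t) frame [4];
    return true;
}
```

With a header like this, a peer can reject a message from a different protocol version outright instead of guessing whether a one-frame message is a command or a very short request.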

Service-Oriented Reliable Queuing (Majordomo Pattern)

Figure 50 - The Majordomo Pattern

The nice thing about progress is how fast it happens when lawyers and committees aren't involved. The one-page MDP specification turns PPP into something more solid. This is how we should design complex architectures: start by writing down the contracts, and only then write software to implement them.

The Majordomo Protocol (MDP) extends and improves on PPP in one interesting way: it adds a "service name" to requests that the client sends, and asks workers to register for specific services. Adding service names turns our Paranoid Pirate queue into a service-oriented broker. The nice thing about MDP is that it came out of working code, a simpler ancestor protocol (PPP), and a precise set of improvements that each solved a clear problem. This made it easy to draft.

To implement Majordomo, we need to write a framework for clients and workers. It's really not sane to ask every application developer to read the spec and make it work, when they could be using a simpler API that does the work for them.

So while our first contract (MDP itself) defines how the pieces of our distributed architecture talk to each other, our second contract defines how user applications talk to the technical framework we're going to design.

Majordomo has two halves, a client side and a worker side. Because we'll write both client and worker applications, we will need two APIs. Here is a sketch for the client API, using a simple object-oriented approach:

//  Majordomo Protocol client example
//  Uses the mdcli API to hide all MDP aspects

//  Lets us build this source without creating a library
#include "mdcliapi.c"

int main (int argc, char *argv [])
{
    int verbose = (argc > 1 && streq (argv [1], "-v"));
    mdcli_t *session = mdcli_new ("tcp://localhost:5555", verbose);

    int count;
    for (count = 0; count < 100000; count++) {
        zmsg_t *request = zmsg_new ();
        zmsg_pushstr (request, "Hello world");
        zmsg_t *reply = mdcli_send (session, "echo", &request);
        if (reply)
            zmsg_destroy (&reply);
        else
            break;              //  Interrupt or failure
    }
    printf ("%d requests/replies processed\n", count);
    mdcli_destroy (&session);
    return 0;
}

That's it. We open a session to the broker, send a request message, get a reply message back, and eventually close the connection. Here's a sketch for the worker API:

//  Majordomo Protocol worker example
//  Uses the mdwrk API to hide all MDP aspects

//  Lets us build this source without creating a library
#include "mdwrkapi.c"

int main (int argc, char *argv [])
{
    int verbose = (argc > 1 && streq (argv [1], "-v"));
    mdwrk_t *session = mdwrk_new (
        "tcp://localhost:5555", "echo", verbose);

    zmsg_t *reply = NULL;
    while (true) {
        zmsg_t *request = mdwrk_recv (session, &reply);
        if (request == NULL)
            break;              //  Worker was interrupted
        reply = request;        //  Echo is complex :)
    }
    mdwrk_destroy (&session);
    return 0;
}

It's more or less symmetrical, but the worker dialog is a little different. The first time a worker does a recv(), it passes a null reply. Thereafter, it passes the current reply, and gets a new request.

The client and worker APIs were fairly simple to construct because they're heavily based on the Paranoid Pirate code we already developed. Here is the client API:

mdcliapi: Majordomo client API in C

Go | Haxe | Java | Lua | PHP | Python | Tcl | Ada | Basic | C++ | C# | Clojure | CL | Delphi | Erlang | F# | Felix | Haskell | Node.js | Objective-C | ooc | Perl | Q | Racket | Ruby | Scala

Let's see how the client API looks in action, with an example test program that does 100K request-reply cycles:

mdclient: Majordomo client application in C

C++ | Go | Haskell | Haxe | Java | Lua | PHP | Python | Tcl | Ada | Basic | C# | Clojure | CL | Delphi | Erlang | F# | Felix | Node.js | Objective-C | ooc | Perl | Q | Racket | Ruby | Scala

And here is the worker API:

mdwrkapi: Majordomo worker API in C

Go | Haxe | Java | Lua | PHP | Python | Ruby | Tcl | Ada | Basic | C++ | C# | Clojure | CL | Delphi | Erlang | F# | Felix | Haskell | Node.js | Objective-C | ooc | Perl | Q | Racket | Scala

Let's see how the worker API looks in action, with an example test program that implements an echo service:

mdworker: Majordomo worker application in C

C++ | Go | Haskell | Haxe | Java | Lua | PHP | Python | Ruby | Tcl | Ada | Basic | C# | Clojure | CL | Delphi | Erlang | F# | Felix | Node.js | Objective-C | ooc | Perl | Q | Racket | Scala

Here are some things to note about the worker API code:

The APIs are single-threaded. This means, for example, that the worker won't send heartbeats in the background. Happily, this is exactly what we want: if the worker application gets stuck, heartbeats will stop and the broker will stop sending requests to the worker.

The worker API doesn't do an exponential back-off; it's not worth the extra complexity.

The APIs don't do any error reporting. If something isn't as expected, they raise an assertion (or exception depending on the language). This is ideal for a reference implementation, so any protocol errors show immediately. For real applications, the API should be robust against invalid messages.

You might wonder why the worker API is manually closing its socket and opening a new one, when ZeroMQ will automatically reconnect a socket if the peer disappears and comes back. Look back at the Simple Pirate and Paranoid Pirate workers to understand. Although ZeroMQ will automatically reconnect workers if the broker dies and comes back up, this isn't sufficient to re-register the workers with the broker. I know of at least two solutions. The simplest, which we use here, is for the worker to monitor the connection using heartbeats, and if it decides the broker is dead, to close its socket and start afresh with a new socket. The alternative is for the broker to challenge unknown workers when it gets a heartbeat from the worker and ask them to re-register. That would require protocol support.

Now let's design the Majordomo broker. Its core structure is a set of queues, one per service. We will create these queues as workers appear (we could delete them as workers disappear, but forget that for now because it gets complex). Additionally, we keep a queue of workers per service.
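
As a rough picture of that structure, here is a deliberately simplified sketch. It replaces the real message and worker queues with counters, and every name in it (broker_t, service_t, broker_require_service, service_dispatch) and both size limits are assumptions made for this illustration, not the mdbroker code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_SERVICES        8   //  Assumed limit for this sketch
#define SERVICE_NAME_MAX    32  //  Assumed name buffer size

typedef struct {
    char name [SERVICE_NAME_MAX];
    int requests;               //  Stand-in for the queue of client requests
    int waiting;                //  Stand-in for the queue of idle workers
} service_t;

typedef struct {
    service_t services [MAX_SERVICES];
    size_t count;
} broker_t;

//  Look up a service by name, creating it on first use; returns NULL
//  if the table is full or the name is too long
service_t *broker_require_service (broker_t *self, const char *name)
{
    assert (self && name);
    for (size_t i = 0; i < self->count; i++)
        if (strcmp (self->services [i].name, name) == 0)
            return &self->services [i];
    if (self->count == MAX_SERVICES || strlen (name) >= SERVICE_NAME_MAX)
        return NULL;
    service_t *service = &self->services [self->count++];
    strcpy (service->name, name);
    service->requests = 0;
    service->waiting = 0;
    return service;
}

//  Pair pending requests with idle workers; returns number dispatched
int service_dispatch (service_t *self)
{
    assert (self);
    int dispatched = 0;
    while (self->requests > 0 && self->waiting > 0) {
        self->requests--;
        self->waiting--;
        dispatched++;
    }
    return dispatched;
}
```

The broker would call service_dispatch whenever a request arrives for a service or a worker for that service becomes ready, so whichever side shows up first simply waits for the other.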

And here is the broker:

mdbroker: Majordomo broker in C

C++ | Go | Haskell | Haxe | Java | Lua | PHP | Python | Ruby | Tcl | Ada | Basic | C# | Clojure | CL | Delphi | Erlang | F# | Felix | Node.js | Objective-C | ooc | Perl | Q | Racket | Scala

This is by far the most complex example we've seen. It's almost 500 lines of code. To write this and make it somewhat robust took two days. However, this is still a short piece of code for a full service-oriented broker.

Here are some things to note about the broker code:

The Majordomo Protocol lets us handle both clients and workers on a single socket. This is nicer for those deploying and managing the broker: it just sits on one ZeroMQ endpoint rather than the two that most proxies need.

The broker implements all of MDP/0.1 properly (as far as I know), including disconnection if the broker sends invalid commands, heartbeating, and the rest.

It can be extended to run multiple threads, each managing one socket and one set of clients and workers. This could be interesting for segmenting large architectures. The C code is already organized around a broker class to make this trivial.

A primary/failover or live/live broker reliability model is easy, as the broker essentially has no state except service presence. It's up to clients and workers to choose another broker if their first choice isn't up and running.

The examples use five-second heartbeats, mainly to reduce the amount of output when you enable tracing. Realistic values would be lower for most LAN applications. However, any retry has to be slow enough to allow for a service to restart, say 10 seconds at least.

We later improved and extended the protocol and the Majordomo implementation, which now sits in its own GitHub project. If you want a properly usable Majordomo stack, use the GitHub project.

Asynchronous Majordomo Pattern

The Majordomo implementation in the previous section is simple and stupid. The client is just the original Simple Pirate, wrapped up in a sexy API. When I fire up a client, broker, and worker on a test box, it can process 100,000 requests in about 14 seconds. That is partially due to the code, which cheerfully copies message frames around as if CPU cycles were free. But the real problem is that we're doing network round-trips. ZeroMQ disables Nagle's algorithm, but round-tripping is still slow.

Theory is great in theory, but in practice, practice is better. Let's measure the actual cost of round-tripping with a simple test program. This sends a bunch of messages, first waiting for a reply to each message, and second as a batch, reading all the replies back as a batch. Both approaches do the same work, but they give very different results. We mock up a client, broker, and worker:

tripping: Round-trip demonstrator in C

C++ | Go | Haskell | Haxe | Java | Lua | PHP | Python | Tcl | Ada | Basic | C# | Clojure | CL | Delphi | Erlang | F# | Felix | Node.js | Objective-C | ooc | Perl | Q | Racket | Ruby | Scala

On my development box, this program says:

Setting up test...
Synchronous round-trip test...
 9057 calls/second
Asynchronous round-trip test...
 173010 calls/second

Note that the client thread does a small pause before starting. This is to get around one of the "features" of the router socket: if you send a message with the address of a peer that's not yet connected, the message gets discarded. In this example we don't use the load balancing mechanism, so without the sleep, if the worker thread is too slow to connect, it will lose messages, making a mess of our test.

As we see, round-tripping in the simplest case is 20 times slower than the asynchronous, "shove it down the pipe as fast as it'll go" approach. Let's see if we can apply this to Majordomo to make it faster.

First, we modify the client API to send and receive in two separate methods:

mdcli_t *mdcli_new (char *broker);
void mdcli_destroy (mdcli_t **self_p);
int mdcli_send (mdcli_t *self, char *service, zmsg_t **request_p);
zmsg_t *mdcli_recv (mdcli_t *self);

It'sliterallyafewminutes'worktorefactorthesynchronousclientAPItobecomeasynchronous:

mdcliapi2:MajordomoasynchronousclientAPIinC

Haxe|Java|Lua|PHP|Python|Ruby|Tcl|Ada|Basic|C++|C#|Clojure|CL|Delphi|Erlang|F#|Felix|Go|Haskell|Node.js|ObjectiveC|ooc|
Perl|Q|Racket|Scala

Thedifferencesare:

WeuseaDEALERsocketinsteadofREQ,soweemulateREQwithanemptydelimiterframebeforeeachrequestand
eachresponse.
Wedon'tretryrequestsiftheapplicationneedstoretry,itcandothisitself.
Webreakthesynchronoussendmethodintoseparatesendandrecvmethods.
Thesendmethodisasynchronousandreturnsimmediatelyaftersending.Thecallercanthussendanumberof
messagesbeforegettingaresponse.
Therecvmethodwaitsfor(withatimeout)oneresponseandreturnsthattothecaller.

Andhere'sthecorrespondingclienttestprogram,whichsends100,000messagesandthenreceives100,000back:

mdclient2:MajordomoclientapplicationinC

C++|Haxe|Java|Lua|PHP|Python|Ruby|Tcl|Ada|Basic|C#|Clojure|CL|Delphi|Erlang|F#|Felix|Go|Haskell|Node.js|ObjectiveC|ooc|
Perl|Q|Racket|Scala

The broker and worker are unchanged because we've not modified the protocol at all. We see an immediate improvement in
performance. Here's the synchronous client chugging through 100K request-reply cycles:

$ time mdclient
100000 requests/replies processed

real    0m14.088s
user    0m1.310s
sys     0m2.670s

And here's the asynchronous client, with a single worker:

$ time mdclient2
100000 replies received

real    0m8.730s
user    0m0.920s
sys     0m1.550s

About 1.6 times as fast. Not bad, but let's fire up 10 workers and see how it handles the traffic:

$ time mdclient2
100000 replies received

real    0m3.863s
user    0m0.730s
sys     0m0.470s

It isn't fully asynchronous because workers get their messages on a strict last-used basis. But it will scale better with more
workers. On my PC, after eight or so workers, it doesn't get any faster. Four cores only stretches so far. But we got a nearly 4x
improvement in throughput with just a few minutes' work. The broker is still unoptimized. It spends most of its time copying
message frames around, instead of doing zero-copy, which it could. But we're getting 25K reliable request/reply calls a second,
with pretty low effort.
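
The throughput figures follow directly from the wall-clock times reported by the three runs. As a small sketch of the arithmetic (the helper names calls_per_second and speedup are ours, not part of any Majordomo API):

```c
#include <assert.h>

//  Calls per second for a run of `calls` requests taking `seconds`
static double
calls_per_second (long calls, double seconds)
{
    assert (seconds > 0);
    return calls / seconds;
}

//  How many times faster the second run was than the first
static double
speedup (double before_seconds, double after_seconds)
{
    assert (after_seconds > 0);
    return before_seconds / after_seconds;
}
```

Feeding in the "real" times from the runs, calls_per_second (100000, 3.863) comes to a little under 26,000 calls per second for ten workers, and speedup (14.088, 3.863) comes to about 3.6 against the synchronous client.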

However, the asynchronous Majordomo pattern isn't all roses. It has a fundamental weakness, namely that it cannot survive a
broker crash without more work. If you look at the mdcliapi2 code you'll see it does not attempt to reconnect after a failure. A
proper reconnect would require the following:

A number on every request and a matching number on every reply, which would ideally require a change to the protocol to enforce.
Tracking and holding onto all outstanding requests in the client API, i.e., those for which no reply has yet been received.
In case of failover, for the client API to resend all outstanding requests to the broker.
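
Here is a rough sketch of the bookkeeping those three points imply, in plain C with no ZeroMQ calls. The tracker_t type and its functions are hypothetical names invented for this example, and the fixed-size table is a simplification; mdcliapi2 does none of this.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_PENDING 64

typedef struct {
    unsigned long sequence;     //  Number stamped on request and reply
    const char *body;           //  Request payload, kept until replied
    bool in_use;
} pending_t;

typedef struct {
    pending_t slots [MAX_PENDING];
    unsigned long next_sequence;
} tracker_t;

static void
tracker_init (tracker_t *self)
{
    assert (self);
    for (size_t i = 0; i < MAX_PENDING; i++)
        self->slots [i].in_use = false;
    self->next_sequence = 1;
}

//  Record a request before sending it; returns its sequence number,
//  or 0 if too many requests are already outstanding.
static unsigned long
tracker_add (tracker_t *self, const char *body)
{
    assert (self && body);
    for (size_t i = 0; i < MAX_PENDING; i++) {
        pending_t *slot = &self->slots [i];
        if (!slot->in_use) {
            slot->sequence = self->next_sequence++;
            slot->body = body;
            slot->in_use = true;
            return slot->sequence;
        }
    }
    return 0;
}

//  A reply arrived: forget the matching request. Returns false for
//  unknown or duplicate replies, which the caller should discard.
static bool
tracker_ack (tracker_t *self, unsigned long sequence)
{
    assert (self);
    for (size_t i = 0; i < MAX_PENDING; i++) {
        pending_t *slot = &self->slots [i];
        if (slot->in_use && slot->sequence == sequence) {
            slot->in_use = false;
            return true;
        }
    }
    return false;
}

//  After a failover: list the sequence numbers still awaiting a reply
//  (in slot order) so the caller can resend them. Returns the count.
static size_t
tracker_outstanding (const tracker_t *self, unsigned long *out, size_t max)
{
    assert (self && out);
    size_t count = 0;
    for (size_t i = 0; i < MAX_PENDING && count < max; i++)
        if (self->slots [i].in_use)
            out [count++] = self->slots [i].sequence;
    return count;
}
```

The sequence number would travel as an extra frame in both directions, which is the protocol change mentioned above.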

It's not a deal breaker, but it does show that performance often means complexity. Is this worth doing for Majordomo? It depends
on your use case. For a name lookup service you call once per session, no. For a web frontend serving thousands of clients,
probably yes.

Service Discovery

So, we have a nice service-oriented broker, but we have no way of knowing whether a particular service is available or not. We
know whether a request failed, but we don't know why. It is useful to be able to ask the broker, "is the echo service running?" The
most obvious way would be to modify our MDP/Client protocol to add commands to ask this. But MDP/Client has the great charm
of being simple. Adding service discovery to it would make it as complex as the MDP/Worker protocol.

Another option is to do what email does, and ask that undeliverable requests be returned. This can work well in an asynchronous
world, but it also adds complexity. We need ways to distinguish returned requests from replies and to handle these properly.

Let's try to use what we've already built, building on top of MDP instead of modifying it. Service discovery is, itself, a service. It
might indeed be one of several management services, such as "disable service X", "provide statistics", and so on. What we want
is a general, extensible solution that doesn't affect the protocol or existing applications.

So here's a small RFC that layers this on top of MDP: the Majordomo Management Interface (MMI). We already implemented it in
the broker, though unless you read the whole thing you probably missed that. I'll explain how it works in the broker:

When a client requests a service that starts with mmi., instead of routing this to a worker, we handle it internally.

We handle just one service in this broker, which is mmi.service, the service discovery service.

The payload for the request is the name of an external service (a real one, provided by a worker).

The broker returns "200" (OK) or "404" (Not found), depending on whether there are workers registered for that service or not.
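
The broker-side logic described above fits in a few lines. The following is a standalone sketch, not the mdbroker code: the service_t table and the mmi_handle function are hypothetical, and returning "501" for any other mmi. service is an assumption of this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

typedef struct {
    const char *name;       //  Service name, e.g. "echo"
    size_t workers;         //  Number of workers currently registered
} service_t;

//  True if the requested service is in the internal "mmi." namespace
static bool
is_mmi_request (const char *service)
{
    assert (service);
    return strncmp (service, "mmi.", 4) == 0;
}

//  Handle an MMI request inside the broker and return the status code.
//  The payload is the name of the external service being asked about.
static const char *
mmi_handle (const char *service, const char *payload,
            const service_t *services, size_t count)
{
    assert (is_mmi_request (service));
    assert (payload);
    if (strcmp (service, "mmi.service") != 0)
        return "501";           //  Assumed code for unimplemented services
    for (size_t i = 0; i < count; i++)
        if (strcmp (services [i].name, payload) == 0
        &&  services [i].workers > 0)
            return "200";
    return "404";
}
```

A broker would call is_mmi_request on the service frame of every client request, and only route to a worker when it returns false.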

Here's how we use the service discovery in an application:

mmiecho: Service discovery over Majordomo in C

Go|Haxe|Java|Lua|PHP|Python|Ruby|Tcl|Ada|Basic|C++|C#|Clojure|CL|Delphi|Erlang|F#|Felix|Haskell|Node.js|ObjectiveC|ooc|
Perl|Q|Racket|Scala

Try this with and without a worker running, and you should see the little program report "200" or "404" accordingly. The
implementation of MMI in our example broker is flimsy. For example, if a worker disappears, services remain "present". In
practice, a broker should remove services that have no workers after some configurable timeout.

Idempotent Services

Idempotency is not something you take a pill for. What it means is that it's safe to repeat an operation. Checking the clock is
idempotent. Lending one's credit card to one's children is not. While many client-to-server use cases are idempotent, some are
not. Examples of idempotent use cases include:

Stateless task distribution, i.e., a pipeline where the servers are stateless workers that compute a reply based purely on the state provided by a request. In such a case, it's safe (though inefficient) to execute the same request many times.

A name service that translates logical addresses into endpoints to bind or connect to. In such a case, it's safe to make the same lookup request many times.

And here are examples of non-idempotent use cases:

A logging service. One does not want the same log information recorded more than once.

Any service that has impact on downstream nodes, e.g., sends on information to other nodes. If that service gets the same request more than once, downstream nodes will get duplicate information.

Any service that modifies shared data in some non-idempotent way; e.g., a service that debits a bank account is not idempotent without extra work.

When our server applications are not idempotent, we have to think more carefully about when exactly they might crash. If an
application dies when it's idle, or while it's processing a request, that's usually fine. We can use database transactions to make
sure a debit and a credit are always done together, if at all. If the server dies while sending its reply, that's a problem, because as
far as it's concerned, it has done its work.

If the network dies just as the reply is making its way back to the client, the same problem arises. The client will think the server
died and will resend the request, and the server will do the same work twice, which is not what we want.

To handle non-idempotent operations, use the fairly standard solution of detecting and rejecting duplicate requests. This means:

The client must stamp every request with a unique client identifier and a unique message number.

The server, before sending back a reply, stores it using the combination of client ID and message number as a key.

The server, when getting a request from a given client, first checks whether it has a reply for that client ID and message number. If so, it does not process the request, but just resends the reply.
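
As a sketch of that duplicate detection, here is a toy server that debits an account at most once per request. Everything in it (server_t, server_debit, the reply format, the tiny fixed-size cache that overwrites its oldest entry) is invented for illustration, and a real service would need to persist the stored replies and decide when they can expire.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define CACHE_SIZE  16          //  Tiny on purpose, for the example only
#define ID_MAX      32
#define REPLY_MAX   64

typedef struct {
    char client_id [ID_MAX];
    unsigned long msg_number;
    char reply [REPLY_MAX];
    int in_use;
} cached_reply_t;

typedef struct {
    cached_reply_t entries [CACHE_SIZE];
    size_t next;                //  Slot to overwrite next (oldest first)
    int balance;                //  Shared state we must not debit twice
} server_t;

static void
server_init (server_t *self, int balance)
{
    assert (self);
    memset (self, 0, sizeof (*self));
    self->balance = balance;
}

//  Return the stored reply for this (client, number) key, or NULL
static const char *
s_find_reply (const server_t *self, const char *client_id,
              unsigned long msg_number)
{
    for (size_t i = 0; i < CACHE_SIZE; i++) {
        const cached_reply_t *entry = &self->entries [i];
        if (entry->in_use
        &&  entry->msg_number == msg_number
        &&  strcmp (entry->client_id, client_id) == 0)
            return entry->reply;
    }
    return NULL;
}

//  Debit the account at most once per (client, number) key.
//  A repeated request gets the stored reply and does no work.
static const char *
server_debit (server_t *self, const char *client_id,
              unsigned long msg_number, int amount)
{
    assert (self && client_id);
    const char *stored = s_find_reply (self, client_id, msg_number);
    if (stored)
        return stored;

    self->balance -= amount;

    //  Store the reply under its key before it goes back to the client
    cached_reply_t *entry = &self->entries [self->next];
    self->next = (self->next + 1) % CACHE_SIZE;
    snprintf (entry->client_id, ID_MAX, "%s", client_id);
    entry->msg_number = msg_number;
    snprintf (entry->reply, REPLY_MAX, "OK balance=%d", self->balance);
    entry->in_use = 1;
    return entry->reply;
}
```

If a client resends message number 1 because the first reply was lost on the way back, the second call returns the stored reply and the balance is left alone.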

Disconnected Reliability (Titanic Pattern)

Once you realize that Majordomo is a "reliable" message broker, you might be tempted to add some spinning rust (that is,
ferrous-based hard disk platters). After all, this works for all the enterprise messaging systems. It's such a tempting idea that it's a
little sad to have to be negative toward it. But brutal cynicism is one of my specialties. So, some reasons you don't want
rust-based brokers sitting in the center of your architecture are:

As you've seen, the Lazy Pirate client performs surprisingly well. It works across a whole range of architectures, from direct client-to-server to distributed queue proxies. It does tend to assume that workers are stateless and idempotent. But we can work around that limitation without resorting to rust.

Rust brings a whole set of problems, from slow performance to additional pieces that you have to manage, repair, and handle 6 a.m. panics from, as they inevitably break at the start of daily operations. The beauty of the Pirate patterns in general is their simplicity. They won't crash. And if you're still worried about the hardware, you can move to a peer-to-peer pattern that has no broker at all. I'll explain later in this chapter.

Having said this, however, there is one sane use case for rust-based reliability, which is an asynchronous disconnected network.
It solves a major problem with Pirate, namely that a client has to wait for an answer in real time. If clients and workers are only
sporadically connected (think of email as an analogy), we can't use a stateless network between clients and workers. We have to
put state in the middle.

So, here's the Titanic pattern, in which we write messages to disk to ensure they never get lost, no matter how sporadically
clients and workers are connected. As we did for service discovery, we're going to layer Titanic on top of MDP rather than extend
it. It's wonderfully lazy because it means we can implement our fire-and-forget reliability in a specialized worker, rather than in the
broker. This is excellent for several reasons:

It is much easier because we divide and conquer: the broker handles message routing and the worker handles reliability.
It lets us mix brokers written in one language with workers written in another.
It lets us evolve the fire-and-forget technology independently.

The only downside is that there's an extra network hop between broker and hard disk. The benefits are easily worth it.

There are many ways to make a persistent request-reply architecture. We'll aim for one that is simple and painless. The simplest
design I could come up with, after playing with this for a few hours, is a "proxy service". That is, Titanic doesn't affect workers at
all. If a client wants a reply immediately, it talks directly to a service and hopes the service is available. If a client is happy to wait
a while, it talks to Titanic instead and asks, "hey, buddy, would you take care of this for me while I go buy my groceries?"

Figure 51 - The Titanic Pattern


Titanic is thus both a worker and a client. The dialog between client and Titanic goes along these lines:

Client: Please accept this request for me. Titanic: OK, done.
Client: Do you have a reply for me? Titanic: Yes, here it is. Or, no, not yet.
Client: OK, you can wipe that request now, I'm happy. Titanic: OK, done.

Whereas the dialog between Titanic and broker and worker goes like this:

Titanic: Hey, Broker, is there a coffee service? Broker: Uhm, yeah, seems like.
Titanic: Hey, coffee service, please handle this for me.
Coffee: Sure, here you are.
Titanic: Sweeeeet!

You can work through this and the possible failure scenarios. If a worker crashes while processing a request, Titanic retries
indefinitely. If a reply gets lost somewhere, Titanic will retry. If the request gets processed but the client doesn't get the reply, it
will ask again. If Titanic crashes while processing a request or a reply, the client will try again. As long as requests are fully
committed to safe storage, work can't get lost.

The handshaking is pedantic, but can be pipelined, i.e., clients can use the asynchronous Majordomo pattern to do a lot of work
and then get the responses later.

We need some way for a client to request its replies. We'll have many clients asking for the same services, and clients disappear
and reappear with different identities. Here is a simple, reasonably secure solution:

Every request generates a universally unique ID (UUID), which Titanic returns to the client after it has queued the request.
When a client asks for a reply, it must specify the UUID for the original request.

In a realistic case, the client would want to store its request UUIDs safely, e.g., in a local database.

Before we jump off and write yet another formal specification (fun, fun!), let's consider how the client talks to Titanic. One way is
to use a single service and send it three different request types. Another way, which seems simpler, is to use three services:

titanic.request: store a request message, and return a UUID for the request.
titanic.reply: fetch a reply, if available, for a given request UUID.
titanic.close: confirm that a reply has been stored and processed.

We'll just make a multithreaded worker, which as we've seen from our multithreading experience with ZeroMQ, is trivial. However,
let's first sketch what Titanic would look like in terms of ZeroMQ messages and frames. This gives us the Titanic Service Protocol
(TSP).

Using TSP is clearly more work for client applications than accessing a service directly via MDP. Here's the shortest robust
"echo" client example:

ticlient: Titanic client example in C

Haxe|Java|PHP|Python|Ruby|Tcl|Ada|Basic|C++|C#|Clojure|CL|Delphi|Erlang|F#|Felix|Go|Haskell|Lua|Node.js|ObjectiveC|ooc|
Perl|Q|Racket|Scala

Of course this can be, and should be, wrapped up in some kind of framework or API. It's not healthy to ask average application
developers to learn the full details of messaging: it hurts their brains, costs time, and offers too many ways to make buggy
complexity. Additionally, it makes it hard to add intelligence.

For example, this client blocks on each request whereas in a real application, we'd want to be doing useful work while tasks are
executed. This requires some nontrivial plumbing to build a background thread and talk to that cleanly. It's the kind of thing you
want to wrap in a nice simple API that the average developer cannot misuse. It's the same approach that we used for Majordomo.

Here's the Titanic implementation. This server handles the three services using three threads, as proposed. It does full
persistence to disk using the most brutal approach possible: one file per message. It's so simple, it's scary. The only complex part
is that it keeps a separate queue of all requests, to avoid reading the directory over and over:

titanic: Titanic broker example in C

Haxe|Java|PHP|Python|Ruby|Tcl|Ada|Basic|C++|C#|Clojure|CL|Delphi|Erlang|F#|Felix|Go|Haskell|Lua|Node.js|ObjectiveC|ooc|
Perl|Q|Racket|Scala

To test this, start mdbroker and titanic, and then run ticlient. Now start mdworker arbitrarily, and you should see the
client getting a response and exiting happily.

Some notes about this code:

Note that some loops start by sending, others by receiving messages. This is because Titanic acts both as a client and a worker in different roles.
The Titanic broker uses the MMI service discovery protocol to send requests only to services that appear to be running. Since the MMI implementation in our little Majordomo broker is quite poor, this won't work all the time.
We use an inproc connection to send new request data from the titanic.request service through to the main dispatcher. This saves the dispatcher from having to scan the disk directory, load all request files, and sort them by date/time.

The important thing about this example is not performance (which, although I haven't tested it, is surely terrible), but how well it
implements the reliability contract. To try it, start the mdbroker and titanic programs. Then start the ticlient, and then start the
mdworker echo service. You can run all four of these using the -v option to do verbose activity tracing. You can stop and restart
any piece except the client and nothing will get lost.

If you want to use Titanic in real cases, you'll rapidly be asking "how do we make this faster?"

Here's what I'd do, starting with the example implementation:

Use a single disk file for all data, rather than multiple files. Operating systems are usually better at handling a few large files than many smaller ones.
Organize that disk file as a circular buffer so that new requests can be written contiguously (with very occasional wraparound). One thread, writing full speed to a disk file, can work rapidly.
Keep the index in memory and rebuild the index at startup time, from the disk buffer. This saves the extra disk head flutter needed to keep the index fully safe on disk. You would want an fsync after every message, or every N milliseconds if you were prepared to lose the last M messages in case of a system failure.
Use a solid-state drive rather than spinning iron oxide platters.
Pre-allocate the entire file, or allocate it in large chunks, which allows the circular buffer to grow and shrink as needed. This avoids fragmentation and ensures that most reads and writes are contiguous.
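
To illustrate the "keep the index in memory and rebuild it at startup" idea, here is a small sketch that appends length-prefixed records to a buffer and then recovers the record offsets by scanning it. It works on an in-memory byte array standing in for the disk file, it does not wrap around, and it does no fsync; log_append and index_rebuild are hypothetical names, and none of this comes from the Titanic example code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

//  Append one length-prefixed record at *tail. Returns 0 on success,
//  -1 if it does not fit (a real circular buffer would wrap here).
static int
log_append (unsigned char *log, size_t capacity, size_t *tail,
            const void *payload, uint32_t length)
{
    assert (log && tail && payload);
    if (*tail > capacity
    ||  capacity - *tail < sizeof (length) + length)
        return -1;
    memcpy (log + *tail, &length, sizeof (length));
    memcpy (log + *tail + sizeof (length), payload, length);
    *tail += sizeof (length) + length;
    return 0;
}

//  Rebuild the in-memory index by scanning records from the start.
//  Stores each record's offset in offsets [] and returns the count.
static size_t
index_rebuild (const unsigned char *log, size_t tail,
               size_t *offsets, size_t max_records)
{
    assert (log && offsets);
    size_t count = 0;
    size_t at = 0;
    while (at + sizeof (uint32_t) <= tail && count < max_records) {
        uint32_t length;
        memcpy (&length, log + at, sizeof (length));
        if (length > tail - at - sizeof (length))
            break;              //  Truncated record, e.g. crash mid-write
        offsets [count++] = at;
        at += sizeof (length) + length;
    }
    return count;
}
```

The scan stops at the first record whose length runs past the end of the data, which is one simple way to cope with a write that was cut short by a crash.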

And so on. What I'd not recommend is storing messages in a database, not even a "fast" key/value store, unless you really like a
specific database and don't have performance worries. You will pay a steep price for the abstraction, ten to a thousand times over
a raw disk file.

If you want to make Titanic even more reliable, duplicate the requests to a second server, which you'd place in a second location
just far away enough to survive a nuclear attack on your primary location, yet not so far that you get too much latency.

If you want to make Titanic much faster and less reliable, store requests and replies purely in memory. This will give you the
functionality of a disconnected network, but requests won't survive a crash of the Titanic server itself.

High-Availability Pair (Binary Star Pattern)

Figure 52 - High-Availability Pair, Normal Operation

The Binary Star pattern puts two servers in a primary-backup high-availability pair. At any given time, one of these (the active)
accepts connections from client applications. The other (the passive) does nothing, but the two servers monitor each other. If the
active disappears from the network, after a certain time the passive takes over as active.

We developed the Binary Star pattern at iMatix for our OpenAMQ server. We designed it:

To provide a straightforward high-availability solution.
To be simple enough to actually understand and use.
To fail over reliably when needed, and only when needed.

Assuming we have a Binary Star pair running, here are the different scenarios that will result in a failover:

The hardware running the primary server has a fatal problem (power supply explodes, machine catches fire, or someone simply unplugs it by mistake), and disappears. Applications see this, and reconnect to the backup server.
The network segment on which the primary server sits crashes (perhaps a router gets hit by a power spike) and applications start to reconnect to the backup server.
The primary server crashes or is killed by the operator and does not restart automatically.

Figure 53 - High-availability Pair During Failover


Recovery from failover works as follows:

The operators restart the primary server and fix whatever problems were causing it to disappear from the network.
The operators stop the backup server at a moment when it will cause minimal disruption to applications.
When applications have reconnected to the primary server, the operators restart the backup server.

Recovery (to using the primary server as active) is a manual operation. Painful experience teaches us that automatic recovery is
undesirable. There are several reasons:

Failover creates an interruption of service to applications, possibly lasting 10-30 seconds. If there is a real emergency, this is much better than total outage. But if recovery creates a further 10-30 second outage, it is better that this happens off-peak, when users have gone off the network.

When there is an emergency, the absolute first priority is certainty for those trying to fix things. Automatic recovery creates uncertainty for system administrators, who can no longer be sure which server is in charge without double-checking.

Automatic recovery can create situations where networks fail over and then recover, placing operators in the difficult position of analyzing what happened. There was an interruption of service, but the cause isn't clear.

Having said this, the Binary Star pattern will fail back to the primary server if this is running (again) and the backup server fails. In
fact, this is how we provoke recovery.

The shutdown process for a Binary Star pair is to either:

 1. Stop the passive server and then stop the active server at any later time, or
 2. Stop both servers in any order but within a few seconds of each other.

Stopping the active and then the passive server with any delay longer than the failover timeout will cause applications to
disconnect, then reconnect, and then disconnect again, which may disturb users.

Detailed Requirements

Binary Star is as simple as it can be, while still working accurately. In fact, the current design is the third complete redesign. Each
of the previous designs we found to be too complex, trying to do too much, and we stripped out functionality until we came to a
design that was understandable, easy to use, and reliable enough to be worth using.

These are our requirements for a high-availability architecture:

The failover is meant to provide insurance against catastrophic system failures, such as hardware breakdown, fire, accident, and so on. There are simpler ways to recover from ordinary server crashes and we already covered these.

Failover time should be under 60 seconds and preferably under 10 seconds.

Failover has to happen automatically, whereas recovery must happen manually. We want applications to switch over to the backup server automatically, but we do not want them to switch back to the primary server except when the operators have fixed whatever problem there was and decided that it is a good time to interrupt applications again.

The semantics for client applications should be simple and easy for developers to understand. Ideally, they should be hidden in the client API.

There should be clear instructions for network architects on how to avoid designs that could lead to split-brain syndrome, in which both servers in a Binary Star pair think they are the active server.

There should be no dependencies on the order in which the two servers are started.

It must be possible to make planned stops and restarts of either server without stopping client applications (though they may be forced to reconnect).

Operators must be able to monitor both servers at all times.

It must be possible to connect the two servers using a high-speed dedicated network connection. That is, failover synchronization must be able to use a specific IP route.

We make the following assumptions:

A single backup server provides enough insurance; we don't need multiple levels of backup.

The primary and backup servers are equally capable of carrying the application load. We do not attempt to balance load across the servers.

There is sufficient budget to cover a fully redundant backup server that does nothing almost all the time.

We don't attempt to cover the following:

The use of an active backup server or load balancing. In a Binary Star pair, the backup server is inactive and does no useful work until the primary server goes offline.

The handling of persistent messages or transactions in any way. We assume the existence of a network of unreliable (and probably untrusted) servers or Binary Star pairs.

Any automatic exploration of the network. The Binary Star pair is manually and explicitly defined in the network and is known to applications (at least in their configuration data).

Replication of state or messages between servers. All server-side state must be recreated by applications when they fail over.

Here is the key terminology that we use in Binary Star:

Primary: the server that is normally or initially active.

Backup: the server that is normally passive. It will become active if and when the primary server disappears from the network, and when client applications ask the backup server to connect.

Active: the server that accepts client connections. There is at most one active server.

Passive: the server that takes over if the active disappears. Note that when a Binary Star pair is running normally, the primary server is active, and the backup is passive. When a failover has happened, the roles are switched.

To configure a Binary Star pair, you need to:

 1. Tell the primary server where the backup server is located.
 2. Tell the backup server where the primary server is located.
 3. Optionally, tune the failover response times, which must be the same for both servers.

The main tuning concern is how frequently you want the servers to check their peering status, and how quickly you want to
activate failover. In our example, the failover timeout value defaults to 2,000 msec. If you reduce this, the backup server will take
over as active more rapidly but may take over in cases where the primary server could recover. For example, you may have
wrapped the primary server in a shell script that restarts it if it crashes. In that case, the timeout should be higher than the time
needed to restart the primary server.

For client applications to work properly with a Binary Star pair, they must:

 1. Know both server addresses.
 2. Try to connect to the primary server, and if that fails, to the backup server.
 3. Detect a failed connection, typically using heartbeating.
 4. Try to reconnect to the primary, and then backup (in that order), with a delay between retries that is at least as high as the server failover timeout.
 5. Recreate all of the state they require on a server.
 6. Retransmit messages lost during a failover, if messages need to be reliable.
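
The server-selection part of that list (steps 1, 2, and 4) can be sketched without any networking. The failover_client_t type and its functions are hypothetical, and simply alternating between the two addresses is a simplification we chose for the sketch. Heartbeating, state recreation, and retransmission are left out.

```c
#include <assert.h>
#include <stddef.h>

#define SERVER_COUNT 2

typedef struct {
    const char *servers [SERVER_COUNT];  //  [0] primary, [1] backup
    size_t current;                      //  Index of server in use
    int retry_delay_msec;                //  At least the failover timeout
} failover_client_t;

//  Start on the primary, and wait at least the server failover
//  timeout between reconnection attempts.
static void
failover_init (failover_client_t *self, const char *primary,
               const char *backup, int failover_timeout_msec)
{
    assert (self && primary && backup);
    assert (failover_timeout_msec > 0);
    self->servers [0] = primary;
    self->servers [1] = backup;
    self->current = 0;
    self->retry_delay_msec = failover_timeout_msec;
}

//  Endpoint the client should currently be connected to
static const char *
failover_current (const failover_client_t *self)
{
    assert (self);
    return self->servers [self->current];
}

//  Called when the client decides the current server is gone.
//  Switches to the other server and returns its endpoint; the caller
//  sleeps for retry_delay_msec before connecting to it.
static const char *
failover_next (failover_client_t *self)
{
    assert (self);
    self->current = (self->current + 1) % SERVER_COUNT;
    return self->servers [self->current];
}
```

Keeping the retry delay at or above the failover timeout gives the passive server time to notice that its peer is gone before the client's request arrives as a vote.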

It's not trivial work, and we'd usually wrap this in an API that hides it from real end-user applications.

These are the main limitations of the Binary Star pattern:

A server process cannot be part of more than one Binary Star pair.
A primary server can have a single backup server, and no more.
The passive server does no useful work, and is thus wasted.
The backup server must be capable of handling full application loads.
Failover configuration cannot be modified at runtime.
Client applications must do some work to benefit from failover.

Preventing Split-Brain Syndrome

Split-brain syndrome occurs when different parts of a cluster think they are active at the same time. It causes applications to stop
seeing each other. Binary Star has an algorithm for detecting and eliminating split brain, which is based on a three-way decision
mechanism (a server will not decide to become active until it gets application connection requests and it cannot see its peer
server).

However, it is still possible to (mis)design a network to fool this algorithm. A typical scenario would be a Binary Star pair that is
distributed between two buildings, where each building also has a set of applications and where there is a single network link
between both buildings. Breaking this link would create two sets of client applications, each with half of the Binary Star pair, and
each failover server would become active.

To prevent split-brain situations, we must connect a Binary Star pair using a dedicated network link, which can be as simple as
plugging them both into the same switch or, better, using a crossover cable directly between two machines.

We must not split a Binary Star architecture into two islands, each with a set of applications. While this may be a common type of
network architecture, you should use federation, not high-availability failover, in such cases.

A suitably paranoid network configuration would use two private cluster interconnects, rather than a single one. Further, the
network cards used for the cluster would be different from those used for message traffic, and possibly even on different paths on
the server hardware. The goal is to separate possible failures in the network from possible failures in the cluster. Network ports
can have a relatively high failure rate.

Binary Star Implementation

Without further ado, here is a proof-of-concept implementation of the Binary Star server. The primary and backup servers run the
same code; you choose their roles when you run the code:

bstarsrv: Binary Star server in C

Haxe|Java|Python|Tcl|Ada|Basic|C++|C#|Clojure|CL|Delphi|Erlang|F#|Felix|Go|Haskell|Lua|Node.js|ObjectiveC|ooc|Perl|PHP|
Q|Racket|Ruby|Scala

And here is the client:

bstarcli: Binary Star client in C

Haxe|Java|Python|Tcl|Ada|Basic|C++|C#|Clojure|CL|Delphi|Erlang|F#|Felix|Go|Haskell|Lua|Node.js|ObjectiveC|ooc|Perl|PHP|
Q|Racket|Ruby|Scala

To test Binary Star, start the servers and client in any order:

bstarsrv -p     # Start primary
bstarsrv -b     # Start backup
bstarcli

You can then provoke failover by killing the primary server, and recovery by restarting the primary and killing the backup. Note
how it's the client vote that triggers failover, and recovery.

Binary Star is driven by a finite state machine. Events are the peer state, so "Peer Active" means the other server has told us it's
active. "Client Request" means we've received a client request. "Client Vote" means we've received a client request AND our
peer is inactive for two heartbeats.
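
The difference between a "Client Request" and a "Client Vote" comes down to one timestamp comparison. Here is a sketch of it, using a 1,000 msec heartbeat (so that two missed heartbeats match the 2,000 msec default failover timeout mentioned earlier). The function names are ours, and time is passed in explicitly so the logic can be exercised without a clock.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define HEARTBEAT_MSEC 1000     //  Peers publish their state this often

//  Each time we hear from the peer, push its expiry two heartbeats out
static int64_t
peer_expiry_after (int64_t now_msec)
{
    return now_msec + 2 * HEARTBEAT_MSEC;
}

//  A client request counts as a vote only once the peer has been
//  silent up to or past its expiry time.
static bool
is_client_vote (int64_t now_msec, int64_t peer_expiry_msec)
{
    assert (peer_expiry_msec > 0);
    return now_msec >= peer_expiry_msec;
}
```

A passive server that sees a client request while is_client_vote is false rejects it; once it is true, the same request makes the server active.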

Note that the servers use PUB-SUB sockets for state exchange. No other socket combination will work here. PUSH and DEALER
block if there is no peer ready to receive a message. PAIR does not reconnect if the peer disappears and comes back. ROUTER
needs the address of the peer before it can send it a message.

Figure 54 - Binary Star Finite State Machine

Binary Star Reactor

Binary Star is useful and generic enough to package up as a reusable reactor class. The reactor then runs and calls our code
whenever it has a message to process. This is much nicer than copying/pasting the Binary Star code into each server where we
want that capability.

In C, we wrap the CZMQ zloop class that we saw before. zloop lets you register handlers to react on socket and timer events.
In the Binary Star reactor, we provide handlers for voters and for state changes (active to passive, and vice versa). Here is the
bstar API:

//  bstar class - Binary Star reactor

#include "bstar.h"

//  States we can be in at any point in time
typedef enum {
    STATE_PRIMARY = 1,          //  Primary, waiting for peer to connect
    STATE_BACKUP = 2,           //  Backup, waiting for peer to connect
    STATE_ACTIVE = 3,           //  Active - accepting connections
    STATE_PASSIVE = 4           //  Passive - not accepting connections
} state_t;

//  Events, which start with the states our peer can be in
typedef enum {
    PEER_PRIMARY = 1,           //  HA peer is pending primary
    PEER_BACKUP = 2,            //  HA peer is pending backup
    PEER_ACTIVE = 3,            //  HA peer is active
    PEER_PASSIVE = 4,           //  HA peer is passive
    CLIENT_REQUEST = 5          //  Client makes request
} event_t;

//  Structure of our class

struct _bstar_t {
    zctx_t *ctx;                //  Our private context
    zloop_t *loop;              //  Reactor loop
    void *statepub;             //  State publisher
    void *statesub;             //  State subscriber
    state_t state;              //  Current state
    event_t event;              //  Current event
    int64_t peer_expiry;        //  When peer is considered 'dead'
    zloop_fn *voter_fn;         //  Voting socket handler
    void *voter_arg;            //  Arguments for voting handler
    zloop_fn *active_fn;        //  Call when become active
    void *active_arg;           //  Arguments for handler
    zloop_fn *passive_fn;       //  Call when become passive
    void *passive_arg;          //  Arguments for handler
};

//  The finite-state machine is the same as in the proof-of-concept server.
//  To understand this reactor in detail, first read the CZMQ zloop class.

//  We send state information every this often
//  If peer doesn't respond in two heartbeats, it is 'dead'
#define BSTAR_HEARTBEAT     1000        //  In msecs

//  Binary Star finite state machine (applies event to state)
//  Returns -1 if there was an exception, 0 if event was valid.

static int
s_execute_fsm (bstar_t *self)
{
    int rc = 0;
    //  Primary server is waiting for peer to connect
    //  Accepts CLIENT_REQUEST events in this state
    if (self->state == STATE_PRIMARY) {
        if (self->event == PEER_BACKUP) {
            zclock_log ("I: connected to backup (passive), ready as active");
            self->state = STATE_ACTIVE;
            if (self->active_fn)
                (self->active_fn) (self->loop, NULL, self->active_arg);
        }
        else
        if (self->event == PEER_ACTIVE) {
            zclock_log ("I: connected to backup (active), ready as passive");
            self->state = STATE_PASSIVE;
            if (self->passive_fn)
                (self->passive_fn) (self->loop, NULL, self->passive_arg);
        }
        else
        if (self->event == CLIENT_REQUEST) {
            //  Allow client requests to turn us into the active if we've
            //  waited sufficiently long to believe the backup is not
            //  currently acting as active (i.e., after a failover)
            assert (self->peer_expiry > 0);
            if (zclock_time () >= self->peer_expiry) {
                zclock_log ("I: request from client, ready as active");
                self->state = STATE_ACTIVE;
                if (self->active_fn)
                    (self->active_fn) (self->loop, NULL, self->active_arg);
            } else
                //  Don't respond to clients yet - it's possible we're
                //  performing a failback and the backup is currently active
                rc = -1;
        }
    }
    else
    //  Backup server is waiting for peer to connect
    //  Rejects CLIENT_REQUEST events in this state
    if (self->state == STATE_BACKUP) {
        if (self->event == PEER_ACTIVE) {
            zclock_log ("I: connected to primary (active), ready as passive");
            self->state = STATE_PASSIVE;
            if (self->passive_fn)
                (self->passive_fn) (self->loop, NULL, self->passive_arg);
        }
        else
        if (self->event == CLIENT_REQUEST)
            rc = -1;
    }
    else
    //  Server is active
    //  Accepts CLIENT_REQUEST events in this state
    //  The only way out of ACTIVE is death
    if (self->state == STATE_ACTIVE) {
        if (self->event == PEER_ACTIVE) {
            //  Two actives would mean split-brain
            zclock_log ("E: fatal error - dual actives, aborting");
            rc = -1;
        }
    }
    else
    //  Server is passive
    //  CLIENT_REQUEST events can trigger failover if peer looks dead
    if (self->state == STATE_PASSIVE) {
        if (self->event == PEER_PRIMARY) {
            //  Peer is restarting - become active, peer will go passive
            zclock_log ("I: primary (passive) is restarting, ready as active");
            self->state = STATE_ACTIVE;
        }
        else
        if (self->event == PEER_BACKUP) {
            //  Peer is restarting - become active, peer will go passive
            zclock_log ("I: backup (passive) is restarting, ready as active");
            self->state = STATE_ACTIVE;
        }
        else
        if (self->event == PEER_PASSIVE) {
            //  Two passives would mean cluster would be non-responsive
            zclock_log ("E: fatal error - dual passives, aborting");
            rc = -1;
        }
        else
        if (self->event == CLIENT_REQUEST) {
            //  Peer becomes active if timeout has passed
            //  It's the client request that triggers the failover
            assert (self->peer_expiry > 0);
            if (zclock_time () >= self->peer_expiry) {
                //  If peer is dead, switch to the active state
                zclock_log ("I: failover successful, ready as active");
                self->state = STATE_ACTIVE;
            }
            else
                //  If peer is alive, reject connections
                rc = -1;
        }
        //  Call state change handler if necessary
        if (self->state == STATE_ACTIVE && self->active_fn)
            (self->active_fn) (self->loop, NULL, self->active_arg);
    }
    return rc;
}

static void
s_update_peer_expiry (bstar_t *self)
{
    self->peer_expiry = zclock_time () + 2 * BSTAR_HEARTBEAT;
}

//  Reactor event handlers

//  Publish our state to peer
int s_send_state (zloop_t *loop, int timer_id, void *arg)
{
    bstar_t *self = (bstar_t *) arg;
    zstr_sendf (self->statepub, "%d", self->state);
    return 0;
}

//  Receive state from peer, execute finite state machine
int s_recv_state (zloop_t *loop, zmq_pollitem_t *poller, void *arg)
{
    bstar_t *self = (bstar_t *) arg;
    char *state = zstr_recv (poller->socket);
    if (state) {
        self->event = atoi (state);
        s_update_peer_expiry (self);
        free (state);
    }
    return s_execute_fsm (self);
}

//  Application wants to speak to us, see if it's possible
int s_voter_ready (zloop_t *loop, zmq_pollitem_t *poller, void *arg)
{
    bstar_t *self = (bstar_t *) arg;
    //  If server can accept input now, call appl handler
    self->event = CLIENT_REQUEST;
    if (s_execute_fsm (self) == 0)
        (self->voter_fn) (self->loop, poller, self->voter_arg);
    else {
        //  Destroy waiting message, no-one to read it
        zmsg_t *msg = zmsg_recv (poller->socket);
        zmsg_destroy (&msg);
    }
    return 0;
}

//  This is the constructor for our bstar class. We have to tell it
//  whether we're primary or backup server, as well as our local and
//  remote endpoints to bind and connect to:

bstar_t *
bstar_new (int primary, char *local, char *remote)
{
    bstar_t
        *self;

    self = (bstar_t *) zmalloc (sizeof (bstar_t));

    //  Initialize the Binary Star
    self->ctx = zctx_new ();
    self->loop = zloop_new ();
    self->state = primary? STATE_PRIMARY: STATE_BACKUP;

    //  Create publisher for state going to peer
    self->statepub = zsocket_new (self->ctx, ZMQ_PUB);
    zsocket_bind (self->statepub, local);

    //  Create subscriber for state coming from peer
    self->statesub = zsocket_new (self->ctx, ZMQ_SUB);
    zsocket_set_subscribe (self->statesub, "");
    zsocket_connect (self->statesub, remote);

    //  Set-up basic reactor events
    zloop_timer (self->loop, BSTAR_HEARTBEAT, 0, s_send_state, self);
    zmq_pollitem_t poller = { self->statesub, 0, ZMQ_POLLIN };
    zloop_poller (self->loop, &poller, s_recv_state, self);
    return self;
}

//Thedestructorshutsdownthebstarreactor:

void
bstar_destroy(bstar_t**self_p)
{
assert(self_p)
if(*self_p){
bstar_t*self=*self_p
zloop_destroy(&self>loop)
zctx_destroy(&self>ctx)
free(self)
*self_p=NULL
}
}

//Thismethodreturnstheunderlyingzloopreactor,sowecanadd
//additionaltimersandreaders:

zloop_t*
bstar_zloop(bstar_t*self)
{
returnself>loop
}

//  This method registers a client voter socket. Messages received
//  on this socket provide the CLIENT_REQUEST events for the Binary Star
//  FSM and are passed to the provided application handler. We require
//  exactly one voter per bstar instance:

int
bstar_voter (bstar_t *self, char *endpoint, int type, zloop_fn handler,
             void *arg)
{
    //  Hold actual handler+arg so we can call this later
    void *socket = zsocket_new (self->ctx, type);
    zsocket_bind (socket, endpoint);
    assert (!self->voter_fn);
    self->voter_fn = handler;
    self->voter_arg = arg;
    zmq_pollitem_t poller = { socket, 0, ZMQ_POLLIN };
    return zloop_poller (self->loop, &poller, s_voter_ready, self);
}

//  Register handlers to be called each time there's a state change:

void
bstar_new_active (bstar_t *self, zloop_fn handler, void *arg)
{
    assert (!self->active_fn);
    self->active_fn = handler;
    self->active_arg = arg;
}

void
bstar_new_passive (bstar_t *self, zloop_fn handler, void *arg)
{
    assert (!self->passive_fn);
    self->passive_fn = handler;
    self->passive_arg = arg;
}

//  Enable/disable verbose tracing, for debugging:

void bstar_set_verbose (bstar_t *self, bool verbose)
{
    zloop_set_verbose (self->loop, verbose);
}

//  Finally, start the configured reactor. It will end if any handler
//  returns -1 to the reactor, or if the process receives SIGINT or SIGTERM:

int
bstar_start (bstar_t *self)
{
    assert (self->voter_fn);
    s_update_peer_expiry (self);
    return zloop_start (self->loop);
}

And here is the class implementation:

bstar: Binary Star core class in C

This gives us the following short main program for the server:

bstarsrv2: Binary Star server, using core class in C

Brokerless Reliability (Freelance Pattern)

It might seem ironic to focus so much on broker-based reliability, when we often explain ZeroMQ as "brokerless messaging". However, in messaging, as in real life, the middleman is both a burden and a benefit. In practice, most messaging architectures benefit from a mix of distributed and brokered messaging. You get the best results when you can decide freely what trade-offs you want to make. This is why I can drive twenty minutes to a wholesaler to buy five cases of wine for a party, but I can also walk ten minutes to a corner store to buy one bottle for a dinner. Our highly context-sensitive relative valuations of time, energy, and cost are essential to the real world economy. And they are essential to an optimal message-based architecture.

This is why ZeroMQ does not impose a broker-centric architecture, though it does give you the tools to build brokers, aka proxies, and we've built a dozen or so different ones so far, just for practice.

So we'll end this chapter by deconstructing the broker-based reliability we've built so far, and turning it back into a distributed peer-to-peer architecture I call the Freelance pattern. Our use case will be a name resolution service. This is a common problem with ZeroMQ architectures: how do we know the endpoint to connect to? Hard-coding TCP/IP addresses in code is insanely fragile. Using configuration files creates an administration nightmare. Imagine if you had to hand-configure your web browser, on every PC or mobile phone you used, to realize that "google.com" was "74.125.230.82".

A ZeroMQ name service (and we'll make a simple implementation) must do the following:

Resolve a logical name into at least a bind endpoint, and a connect endpoint. A realistic name service would provide multiple bind endpoints, and possibly multiple connect endpoints as well.

Allow us to manage multiple parallel environments, e.g., "test" versus "production", without modifying code.

Be reliable, because if it is unavailable, applications won't be able to connect to the network.

Putting a name service behind a service-oriented Majordomo broker is clever from some points of view. However, it's simpler and much less surprising to just expose the name service as a server to which clients can connect directly. If we do this right, the name service becomes the only global network endpoint we need to hard-code in our code or configuration files.

Figure 55 - The Freelance Pattern

The types of failure we aim to handle are server crashes and restarts, server busy looping, server overload, and network issues. To get reliability, we'll create a pool of name servers so if one crashes or goes away, clients can connect to another, and so on. In practice, two would be enough. But for the example, we'll assume the pool can be any size.

In this architecture, a large set of clients connect to a small set of servers directly. The servers bind to their respective addresses. It's fundamentally different from a broker-based approach like Majordomo, where workers connect to the broker. Clients have a few options:

Use REQ sockets and the Lazy Pirate pattern. Easy, but would need some additional intelligence so clients don't stupidly try to reconnect to dead servers over and over.

Use DEALER sockets and blast out requests (which will be load balanced to all connected servers) until they get a reply. Effective, but not elegant.

Use ROUTER sockets so clients can address specific servers. But how does the client know the identity of the server sockets? Either the server has to ping the client first (complex), or the server has to use a hard-coded, fixed identity known to the client (nasty).

We'll develop each of these in the following subsections.

Model One: Simple Retry and Failover

So our menu appears to offer: simple, brutal, complex, or nasty. Let's start with simple and then work out the kinks. We take Lazy Pirate and rewrite it to work with multiple server endpoints.

Start one or several servers first, specifying a bind endpoint as the argument:

flserver1: Freelance server, Model One in C

Then start the client, specifying one or more connect endpoints as arguments:

flclient1: Freelance client, Model One in C

A sample run is:

flserver1 tcp://*:5555 &
flserver1 tcp://*:5556 &
flclient1 tcp://localhost:5555 tcp://localhost:5556

Although the basic approach is Lazy Pirate, the client aims to just get one successful reply. It has two techniques, depending on whether you are running a single server or multiple servers:

With a single server, the client will retry several times, exactly as for Lazy Pirate.
With multiple servers, the client will try each server at most once until it's received a reply or has tried all servers.

This solves the main weakness of Lazy Pirate, namely that it could not fail over to backup or alternate servers.
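
The two techniques boil down to a small retry policy. Here's a minimal sketch of just that policy, with the networking left out. The names fl1_request, fl1_try_fn, and FL1_MAX_RETRIES are made up for this illustration and are not taken from the flclient1 listing. The callback stands in for "send one request to this endpoint and wait for a reply or a timeout", and the demo stub simply pretends that one named endpoint is alive:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

//  Hypothetical callback: send one request to the endpoint and wait;
//  return 1 if a reply arrived before the timeout, else 0.
typedef int (*fl1_try_fn) (const char *endpoint, void *arg);

//  Assumed retry budget for the single-server case
#define FL1_MAX_RETRIES 3

//  Returns the index of the endpoint that replied, or -1 if none did
int
fl1_request (const char **endpoints, size_t count,
             fl1_try_fn try_once, void *arg)
{
    assert (endpoints && count > 0 && try_once);
    if (count == 1) {
        //  Single server: retry several times, as for Lazy Pirate
        for (int retry = 0; retry < FL1_MAX_RETRIES; retry++)
            if (try_once (endpoints [0], arg))
                return 0;
        return -1;
    }
    //  Multiple servers: try each one at most once
    for (size_t index = 0; index < count; index++)
        if (try_once (endpoints [index], arg))
            return (int) index;
    return -1;
}

//  Demo stub: only the endpoint named by arg is "alive"; it also
//  counts attempts so we can see how often the policy tried.
int fl1_demo_attempts = 0;

int
fl1_demo_try (const char *endpoint, void *arg)
{
    fl1_demo_attempts++;
    return strcmp (endpoint, (const char *) arg) == 0;
}
```

With two endpoints and only the second one alive, this makes two attempts and returns index 1. With a single dead endpoint, it gives up after FL1_MAX_RETRIES attempts.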

However, this design won't work well in a real application. If we're connecting many sockets and our primary name server is down, we're going to experience this painful timeout each time.

Model Two: Brutal Shotgun Massacre

Let's switch our client to using a DEALER socket. Our goal here is to make sure we get a reply back within the shortest possible time, no matter whether a particular server is up or down. Our client takes this approach:

We set things up, connecting to all servers.
When we have a request, we blast it out as many times as we have servers.
We wait for the first reply, and take that.
We ignore any other replies.

What will happen in practice is that when all servers are running, ZeroMQ will distribute the requests so that each server gets one request and sends one reply. When any server is offline and disconnected, ZeroMQ will distribute the requests to the remaining servers. So a server may in some cases get the same request more than once.

What's more annoying for the client is that we'll get multiple replies back, but there's no guarantee we'll get a precise number of replies. Requests and replies can get lost (e.g., if the server crashes while processing a request).

So we have to number requests and ignore any replies that don't match the request number. Our Model One server will work because it's an echo server, but coincidence is not a great basis for understanding. So we'll make a Model Two server that chews up the message and returns a correctly numbered reply with the content "OK". We'll use messages consisting of two parts: a sequence number and a body.
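
To make the "ignore replies that don't match" rule concrete, here's a small sketch of the matching step on its own. It assumes the sequence number travels as a decimal string in the first frame, which is an assumption for this illustration. The fl2_reply_t type and fl2_first_match function are hypothetical names, not part of the flclient2 listing:

```c
#include <assert.h>
#include <stdlib.h>

//  A reply as the client sees it once the envelope is stripped:
//  the sequence number as text, then the body.
typedef struct {
    const char *sequence;
    const char *body;
} fl2_reply_t;

//  Return the body of the first reply whose sequence number matches
//  the request we're waiting on, or NULL if none matches.
const char *
fl2_first_match (const fl2_reply_t *replies, size_t count,
                 unsigned long expected)
{
    assert (replies || count == 0);
    for (size_t index = 0; index < count; index++) {
        char *end;
        unsigned long sequence =
            strtoul (replies [index].sequence, &end, 10);
        //  Skip frames that aren't a clean decimal number
        if (end == replies [index].sequence || *end != '\0')
            continue;
        if (sequence == expected)
            return replies [index].body;
    }
    return NULL;
}
```

A late reply to request 6 that shows up while we're waiting on request 7 is simply skipped, which is all the shotgun client needs.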

Start one or more servers, specifying a bind endpoint each time:

flserver2: Freelance server, Model Two in C

Then start the client, specifying the connect endpoints as arguments:

flclient2: Freelance client, Model Two in C

Here are some things to note about the client implementation:

The client is structured as a nice little class-based API that hides the dirty work of creating ZeroMQ contexts and sockets and talking to the server. That is, if a shotgun blast to the midriff can be called "talking".

The client will abandon the chase if it can't find any responsive server within a few seconds.

The client has to create a valid REP envelope, i.e., add an empty message frame to the front of the message.

The client performs 10,000 name resolution requests (fake ones, as our server does essentially nothing) and measures the average cost. On my test box, talking to one server, this requires about 60 microseconds. Talking to three servers, it takes about 80 microseconds.

The pros and cons of our shotgun approach are:

Pro: it is simple, easy to make and easy to understand.
Pro: it does the job of failover, and works rapidly, so long as there is at least one server running.
Con: it creates redundant network traffic.
Con: we can't prioritize our servers, i.e., Primary, then Secondary.
Con: the server can do at most one request at a time, period.

Model Three: Complex and Nasty

The shotgun approach seems too good to be true. Let's be scientific and work through all the alternatives. We're going to explore the complex/nasty option, even if it's only to finally realize that we preferred brutal. Ah, the story of my life.

We can solve the main problems of the client by switching to a ROUTER socket. That lets us send requests to specific servers, avoid servers we know are dead, and in general be as smart as we want to be. We can also solve the main problem of the server (single-threadedness) by switching to a ROUTER socket.

But doing ROUTER to ROUTER between two anonymous sockets (which haven't set an identity) is not possible. Both sides generate an identity (for the other peer) only when they receive a first message, and thus neither can talk to the other until it has first received a message. The only way out of this conundrum is to cheat, and use hard-coded identities in one direction. The proper way to cheat, in a client/server case, is to let the client "know" the identity of the server. Doing it the other way around would be insane, on top of complex and nasty, because any number of clients should be able to arise independently. Insane, complex, and nasty are great attributes for a genocidal dictator, but terrible ones for software.

Rather than invent yet another concept to manage, we'll use the connection endpoint as identity. This is a unique string on which both sides can agree without more prior knowledge than they already have for the shotgun model. It's a sneaky and effective way to connect two ROUTER sockets.

Remember how ZeroMQ identities work. The server ROUTER socket sets an identity before it binds its socket. When a client connects, they do a little handshake to exchange identities, before either side sends a real message. The client ROUTER socket, having not set an identity, sends a null identity to the server. The server generates a random UUID to designate the client for its own use. The server sends its identity (which we've agreed is going to be an endpoint string) to the client.

This means that our client can route a message to the server (i.e., send on its ROUTER socket, specifying the server endpoint as identity) as soon as the connection is established. That's not immediately after doing a zmq_connect(), but some random time thereafter. Herein lies one problem: we don't know when the server will actually be available and complete its connection handshake. If the server is online, it could be after a few milliseconds. If the server is down and the sysadmin is out to lunch, it could be an hour from now.

There's a small paradox here. We need to know when servers become connected and available for work. In the Freelance pattern, unlike the broker-based patterns we saw earlier in this chapter, servers are silent until spoken to. Thus we can't talk to a server until it's told us it's online, which it can't do until we've asked it.

My solution is to mix in a little of the shotgun approach from Model Two, meaning we'll fire (harmless) shots at anything we can, and if anything moves, we know it's alive. We're not going to fire real requests, but rather a kind of ping-pong heartbeat.

This brings us to the realm of protocols again, so here's a short spec that defines how a Freelance client and server exchange ping-pong commands and request-reply commands.

It is short and sweet to implement as a server. Here's our echo server, Model Three, now speaking FLP (the Freelance Protocol):

flserver3: Freelance server, Model Three in C

The Freelance client, however, has gotten large. For clarity, it's split into an example application and a class that does the hard work. Here's the top-level application:

flclient3: Freelance client, Model Three in C

And here, almost as complex and large as the Majordomo broker, is the client API class:

flcliapi: Freelance client API in C

This API implementation is fairly sophisticated and uses a couple of techniques that we've not seen before.

Multithreaded API: the client API consists of two parts, a synchronous flcliapi class that runs in the application thread, and an asynchronous agent class that runs as a background thread. Remember how ZeroMQ makes it easy to create multithreaded apps. The flcliapi and agent classes talk to each other with messages over an inproc socket. All ZeroMQ aspects (such as creating and destroying a context) are hidden in the API. The agent in effect acts like a mini-broker, talking to servers in the background, so that when we make a request, it can make a best effort to reach a server it believes is available.

Tickless poll timer: in previous poll loops we always used a fixed tick interval, e.g., 1 second, which is simple enough but not excellent on power-sensitive clients (such as notebooks or mobile phones), where waking the CPU costs power. For fun, and to help save the planet, the agent uses a tickless timer, which calculates the poll delay based on the next timeout we're expecting. A proper implementation would keep an ordered list of timeouts. We just check all timeouts and calculate the poll delay until the next one.
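
The tickless calculation is easy to sketch in isolation. The function below is a hypothetical helper (tickless_delay is not a name from the flcliapi listing) that does the "check all timeouts" version: it scans an array of absolute expiry times in milliseconds and returns the delay to hand to the poll call. It returns -1 when nothing is pending, on the assumption that the caller treats -1 as "block until a socket is ready", which is how zmq_poll() interprets it:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

//  Return how many milliseconds the next poll should block:
//  the time until the earliest pending timeout, 0 if one is
//  already due, or -1 if there are no timeouts at all.
long
tickless_delay (const int64_t *timeouts, size_t count, int64_t now)
{
    assert (timeouts || count == 0);
    if (count == 0)
        return -1;

    //  Linear scan; an ordered list would make this O(1)
    int64_t earliest = timeouts [0];
    for (size_t index = 1; index < count; index++)
        if (timeouts [index] < earliest)
            earliest = timeouts [index];

    return earliest > now? (long) (earliest - now): 0;
}
```

With timeouts at 5000, 1200, and 3000 and a clock reading of 1000, the agent would sleep for 200 milliseconds rather than waking on a fixed tick.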

Conclusion

In this chapter, we've seen a variety of reliable request-reply mechanisms, each with certain costs and benefits. The example code is largely ready for real use, though it is not optimized. Of all the different patterns, the two that stand out for production use are the Majordomo pattern, for broker-based reliability, and the Freelance pattern, for brokerless reliability.

Chapter 5 - Advanced Pub-Sub Patterns

In Chapter 3 - Advanced Request-Reply Patterns and Chapter 4 - Reliable Request-Reply Patterns we looked at advanced use of ZeroMQ's request-reply pattern. If you managed to digest all that, congratulations. In this chapter we'll focus on publish-subscribe and extend ZeroMQ's core pub-sub pattern with higher-level patterns for performance, reliability, state distribution, and monitoring.

We'll cover:

When to use publish-subscribe
How to handle too-slow subscribers (the Suicidal Snail pattern)
How to design high-speed subscribers (the Black Box pattern)
How to monitor a pub-sub network (the Espresso pattern)
How to build a shared key-value store (the Clone pattern)
How to use reactors to simplify complex servers
How to use the Binary Star pattern to add failover to a server

Pros and Cons of Pub-Sub

ZeroMQ's low-level patterns have their different characters. Pub-sub addresses an old messaging problem, which is multicast or group messaging. It has that unique mix of meticulous simplicity and brutal indifference that characterizes ZeroMQ. It's worth understanding the trade-offs that pub-sub makes, how these benefit us, and how we can work around them if needed.

First, PUB sends each message to "all of many", whereas PUSH and DEALER rotate messages to "one of many". You cannot simply replace PUSH with PUB or vice versa and hope that things will work. This bears repeating because people seem to quite often suggest doing this.

More profoundly, pub-sub is aimed at scalability. This means large volumes of data, sent rapidly to many recipients. If you need millions of messages per second sent to thousands of points, you'll appreciate pub-sub a lot more than if you need a few messages a second sent to a handful of recipients.

To get scalability, pub-sub uses the same trick as push-pull, which is to get rid of back-chatter. This means that recipients don't talk back to senders. There are some exceptions, e.g., SUB sockets will send subscriptions to PUB sockets, but it's anonymous and infrequent.

Killing back-chatter is essential to real scalability. With pub-sub, it's how the pattern can map cleanly to the PGM multicast protocol, which is handled by the network switch. In other words, subscribers don't connect to the publisher at all, they connect to a multicast group on the switch, to which the publisher sends its messages.

When we remove back-chatter, our overall message flow becomes much simpler, which lets us make simpler APIs, simpler protocols, and in general reach many more people. But we also remove any possibility to coordinate senders and receivers. What this means is:

Publishers can't tell when subscribers are successfully connected, both on initial connections, and on reconnections after network failures.

Subscribers can't tell publishers anything that would allow publishers to control the rate of messages they send. Publishers only have one setting, which is full-speed, and subscribers must either keep up or lose messages.

Publishers can't tell when subscribers have disappeared due to processes crashing, networks breaking, and so on.

The downside is that we actually need all of these if we want to do reliable multicast. The ZeroMQ pub-sub pattern will lose messages arbitrarily when a subscriber is connecting, when a network failure occurs, or just if the subscriber or network can't keep up with the publisher.

The upside is that there are many use cases where almost reliable multicast is just fine. When we need this back-chatter, we can either switch to using ROUTER-DEALER (which I tend to do for most normal volume cases), or we can add a separate channel for synchronization (we'll see an example of this later in this chapter).

Pub-sub is like a radio broadcast; you miss everything before you join, and then how much information you get depends on the quality of your reception. Surprisingly, this model is useful and widespread because it maps perfectly to real world distribution of information. Think of Facebook and Twitter, the BBC World Service, and the sports results.

As we did for request-reply, let's define reliability in terms of what can go wrong. Here are the classic failure cases for pub-sub:

Subscribers join late, so they miss messages the server already sent.
Subscribers can fetch messages too slowly, so queues build up and then overflow.
Subscribers can drop off and lose messages while they are away.
Subscribers can crash and restart, and lose whatever data they already received.
Networks can become overloaded and drop data (specifically, for PGM).
Networks can become too slow, so publisher-side queues overflow and publishers crash.

A lot more can go wrong but these are the typical failures we see in a realistic system. Since v3.x, ZeroMQ forces default limits on its internal buffers (the so-called high-water mark or HWM), so publisher crashes are rarer unless you deliberately set the HWM to infinite.

All of these failure cases have answers, though not always simple ones. Reliability requires complexity that most of us don't need, most of the time, which is why ZeroMQ doesn't attempt to provide it out of the box (even if there was one global design for reliability, which there isn't).

Pub-Sub Tracing (Espresso Pattern)

Let's start this chapter by looking at a way to trace pub-sub networks. In Chapter 2 - Sockets and Patterns we saw a simple proxy that used XSUB and XPUB sockets to do transport bridging. The zmq_proxy() method has three arguments: a frontend and backend socket that it bridges together, and a capture socket to which it will send all messages.

The code is deceptively simple:

espresso: Espresso Pattern in C

Espresso works by creating a listener thread that reads a PAIR socket and prints anything it gets. That PAIR socket is one end of a pipe; the other end (another PAIR) is the socket we pass to zmq_proxy(). In practice, you'd filter interesting messages to get the essence of what you want to track (hence the name of the pattern).

The subscriber thread subscribes to "A" and "B", receives five messages, and then destroys its socket. When you run the example, the listener prints two subscription messages, five data messages, two unsubscribe messages, and then silence:

[002] 0141
[002] 0142
[007] B-91164
[007] B-12979
[007] A-52599
[007] A-06417
[007] A-45770
[002] 0041
[002] 0042

This shows neatly how the publisher socket stops sending data when there are no subscribers for it. The publisher thread is still sending messages. The socket just drops them silently.
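
The two-byte frames in that trace are subscription messages: one byte that is 1 for subscribe or 0 for unsubscribe, followed by the topic, so 0141 is "subscribe to A" and 0042 is "unsubscribe from B". Here's a sketch of a decoder for such frames. The function name sub_frame_decode is hypothetical, and the check on the first byte is only a heuristic that assumes data topics never start with a byte value of 0 or 1, since the capture socket carries data frames as well:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

//  Decode a subscription frame as seen on an XPUB or capture socket.
//  Returns 1 for subscribe, 0 for unsubscribe, or -1 if the frame
//  doesn't look like a subscription message. On success the topic is
//  copied into the caller's buffer as a C string, truncated to fit.
int
sub_frame_decode (const unsigned char *frame, size_t size,
                  char *topic, size_t topic_size)
{
    assert (topic && topic_size > 0);
    if (!frame || size < 1 || frame [0] > 1)
        return -1;

    size_t length = size - 1;
    if (length >= topic_size)
        length = topic_size - 1;
    memcpy (topic, frame + 1, length);
    topic [length] = '\0';
    return frame [0];
}
```

A listener could use something like this to print "subscribe A" instead of raw hex, which is one way to filter for the essence of what you want to track.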

Last Value Caching

If you've used commercial pub-sub systems, you may be used to some features that are missing in the fast and cheerful ZeroMQ pub-sub model. One of these is last value caching (LVC). This solves the problem of how a new subscriber catches up when it joins the network. The theory is that publishers get notified when a new subscriber joins and subscribes to some specific topics. The publisher can then rebroadcast the last message for those topics.

I've already explained why publishers don't get notified when there are new subscribers, because in large pub-sub systems, the volumes of data make it pretty much impossible. To make really large-scale pub-sub networks, you need a protocol like PGM that exploits an upscale Ethernet switch's ability to multicast data to thousands of subscribers. Trying to do a TCP unicast from the publisher to each of thousands of subscribers just doesn't scale. You get weird spikes, unfair distribution (some subscribers getting the message before others), network congestion, and general unhappiness.

PGM is a one-way protocol: the publisher sends a message to a multicast address at the switch, which then rebroadcasts that to all interested subscribers. The publisher never sees when subscribers join or leave: this all happens in the switch, which we don't really want to start reprogramming.

However, in a lower-volume network with a few dozen subscribers and a limited number of topics, we can use TCP and then the XSUB and XPUB sockets do talk to each other as we just saw in the Espresso pattern.

Can we make an LVC using ZeroMQ? The answer is yes, if we make a proxy that sits between the publisher and subscribers, an analog for the PGM switch, but one we can program ourselves.
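
The caching half of such a proxy is just a table from topic to last value. Here's a rough sketch of that table with the sockets left out. The proxy would call lvc_store for every message arriving from the publisher, and lvc_lookup when a subscription for a topic arrives from a subscriber. All the lvc_ names and size limits are assumptions for this illustration, and the linear search is there only to keep the sketch short; a real proxy would want a hash table:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define LVC_MAX_TOPICS 1024     //  Assumed capacity
#define LVC_TOPIC_SIZE 32       //  Assumed maximum topic length + 1
#define LVC_VALUE_SIZE 64       //  Assumed maximum value length + 1

typedef struct {
    char topic [LVC_TOPIC_SIZE];
    char value [LVC_VALUE_SIZE];
} lvc_entry_t;

typedef struct {
    lvc_entry_t entries [LVC_MAX_TOPICS];
    size_t count;
} lvc_t;

//  Find the entry for a topic, or NULL if we never saw it
static lvc_entry_t *
lvc_find (lvc_t *self, const char *topic)
{
    for (size_t index = 0; index < self->count; index++)
        if (strcmp (self->entries [index].topic, topic) == 0)
            return &self->entries [index];
    return NULL;
}

//  Remember the latest value for a topic. Returns 0 if stored, or
//  -1 if the topic is too long or the table is full. Values longer
//  than the buffer are truncated.
int
lvc_store (lvc_t *self, const char *topic, const char *value)
{
    assert (self && topic && value);
    if (strlen (topic) >= LVC_TOPIC_SIZE)
        return -1;
    lvc_entry_t *entry = lvc_find (self, topic);
    if (!entry) {
        if (self->count == LVC_MAX_TOPICS)
            return -1;
        entry = &self->entries [self->count++];
        snprintf (entry->topic, LVC_TOPIC_SIZE, "%s", topic);
    }
    snprintf (entry->value, LVC_VALUE_SIZE, "%s", value);
    return 0;
}

//  Return the last value seen for a topic, or NULL if none
const char *
lvc_lookup (lvc_t *self, const char *topic)
{
    assert (self && topic);
    lvc_entry_t *entry = lvc_find (self, topic);
    return entry? entry->value: NULL;
}
```

When a new subscriber asks for a topic, the proxy can send it whatever lvc_lookup returns straight away, instead of making it wait for the publisher's next update.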

I'll start by making a publisher and subscriber that highlight the worst case scenario. This publisher is pathological. It starts by immediately sending messages to each of a thousand topics, and then it sends one update a second to a random topic. A subscriber connects, and subscribes to a topic. Without LVC, a subscriber would have to wait an average of 500 seconds to get any data. To add some drama, let's pretend there's an escaped convict called Gregor threatening to rip the head off Roger the toy bunny if we can't fix that 8.3 minutes' delay.

Here's the publisher code. Note that it has the command line option to connect to some address, but otherwise binds to an endpoint. We'll use this later to connect to our last value cache:

pathopub: Pathologic Publisher in C

And here's the subscriber:

pathosub: Pathologic Subscriber in C

Try building and running these: first the subscriber, then the publisher. You'll see the subscriber reports getting "Save Roger" as you'd expect:

./pathosub &
./pathopub

It's when you run a second subscriber that you understand Roger's predicament. You have to leave it an awful long time before it reports getting any data. So, here's our last value cache. As I promised, it's a proxy that binds to two sockets and then handles messages on both:

lvcache: Last Value Caching Proxy in C

Now, run the proxy, and then the publisher:

./lvcache &
./pathopub tcp://localhost:5557

And now run as many instances of the subscriber as you want to try, each time connecting to the proxy on port 5558:

./pathosub tcp://localhost:5558

Each subscriber happily reports "Save Roger", and Gregor the Escaped Convict slinks back to his seat for dinner and a nice cup of hot milk, which is all he really wanted in the first place.

One note: by default, the XPUB socket does not report duplicate subscriptions, which is what you want when you're naively connecting an XPUB to an XSUB. Our example sneakily gets around this by using random topics so the chance of it not working is one in a million. In a real LVC proxy, you'll want to use the ZMQ_XPUB_VERBOSE option that we implement in Chapter 6 - The ZeroMQ Community as an exercise.

Slow Subscriber Detection (Suicidal Snail Pattern)

A common problem you will hit when using the pub-sub pattern in real life is the slow subscriber. In an ideal world, we stream data at full speed from publishers to subscribers. In reality, subscriber applications are often written in interpreted languages, or just do a lot of work, or are just badly written, to the extent that they can't keep up with publishers.

How do we handle a slow subscriber? The ideal fix is to make the subscriber faster, but that might take work and time. Some of the classic strategies for handling a slow subscriber are:

Queue messages on the publisher. This is what Gmail does when I don't read my email for a couple of hours. But in high-volume messaging, pushing queues upstream has the thrilling but unprofitable result of making publishers run out of memory and crash, especially if there are lots of subscribers and it's not possible to flush to disk for performance reasons.

Queue messages on the subscriber. This is much better, and it's what ZeroMQ does by default if the network can keep up with things. If anyone's going to run out of memory and crash, it'll be the subscriber rather than the publisher, which is fair. This is perfect for "peaky" streams where a subscriber can't keep up for a while, but can catch up when the stream slows down. However, it's no answer to a subscriber that's simply too slow in general.

Stop queuing new messages after a while. This is what Gmail does when my mailbox overflows its precious gigabytes of space. New messages just get rejected or dropped. This is a great strategy from the perspective of the publisher, and it's what ZeroMQ does when the publisher sets a HWM. However, it still doesn't help us fix the slow subscriber. Now we just get gaps in our message stream.

Punish slow subscribers with disconnect. This is what Hotmail (remember that?) did when I didn't log in for two weeks, which is why I was on my fifteenth Hotmail account when it hit me that there was perhaps a better way. It's a nice brutal strategy that forces subscribers to sit up and pay attention and would be ideal, but ZeroMQ doesn't do this, and there's no way to layer it on top because subscribers are invisible to publisher applications.

None of these classic strategies fit, so we need to get creative. Rather than disconnect the publisher, let's convince the subscriber to kill itself. This is the Suicidal Snail pattern. When a subscriber detects that it's running too slowly (where "too slowly" is presumably a configured option that really means "so slowly that if you ever get here, shout really loudly because I need to know, so I can fix this!"), it croaks and dies.

How can a subscriber detect this? One way would be to sequence messages (number them in order) and use a HWM at the publisher. Now, if the subscriber detects a gap (i.e., the numbering isn't consecutive), it knows something is wrong. We then tune the HWM to the "croak and die if you hit this" level.
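
As a sketch of what gap detection could look like on the subscriber side, here's a tiny tracker that is fed each sequence number as it arrives and reports how many messages were skipped. The names snail_seq_t and snail_seq_gap are made up for this illustration, and it assumes a single publisher that numbers its messages consecutively:

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    uint64_t expected;      //  Next sequence number we expect
    int started;            //  Have we seen any message yet?
} snail_seq_t;

//  Feed each received sequence number in arrival order. Returns the
//  number of messages skipped since the previous one; zero means the
//  numbering was consecutive (or this was the first message).
uint64_t
snail_seq_gap (snail_seq_t *self, uint64_t sequence)
{
    assert (self);
    uint64_t gap = 0;
    if (self->started && sequence > self->expected)
        gap = sequence - self->expected;
    self->started = 1;
    self->expected = sequence + 1;
    return gap;
}
```

A subscriber using this would croak and die the first time the function returns anything other than zero.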

There are two problems with this solution. First, if we have many publishers, how do we sequence messages? The solution is to give each publisher a unique ID and add that to the sequencing. Second, if subscribers use ZMQ_SUBSCRIBE filters, they will get gaps by definition. Our precious sequencing will be for nothing.

Some use cases won't use filters, and sequencing will work for them. But a more general solution is that the publisher timestamps each message. When a subscriber gets a message, it checks the time, and if the difference is more than, say, one second, it does the "croak and die" thing, possibly firing off a squawk to some operator console first.
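
The timestamp check itself is a one-liner. Here's a sketch that assumes the message body is the publisher's clock in milliseconds, formatted as a decimal string, and that publisher and subscriber clocks are close enough to compare. Both are assumptions for this illustration, as are the names snail_too_late and SNAIL_MAX_DELAY_MS:

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

//  "Say, one second", as in the text
#define SNAIL_MAX_DELAY_MS 1000

//  Returns 1 if the message arrived more than max_delay_ms after the
//  publisher stamped it, meaning the subscriber should croak and die.
int
snail_too_late (const char *message, int64_t now_ms, int64_t max_delay_ms)
{
    assert (message && max_delay_ms >= 0);
    int64_t sent_ms = (int64_t) strtoll (message, NULL, 10);
    return now_ms - sent_ms > max_delay_ms;
}
```

The subscriber would call this for every message with its current clock reading, and abort (after squawking to the operator console, if it's polite) the first time it returns 1.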

The Suicidal Snail pattern works especially well when subscribers have their own clients and service-level agreements and need to guarantee certain maximum latencies. Aborting a subscriber may not seem like a constructive way to guarantee a maximum latency, but it's the assertion model. Abort today, and the problem will be fixed. Allow late data to flow downstream, and the problem may cause wider damage and take longer to appear on the radar.

Here is a minimal example of a Suicidal Snail:

suisnail: Suicidal Snail in C

Here are some things to note about the Suicidal Snail example:

The message here consists simply of the current system clock as a number of milliseconds. In a realistic application, you'd have at least a message header with the timestamp and a message body with data.

The example has subscriber and publisher in a single process as two threads. In reality, they would be separate processes. Using threads is just convenient for the demonstration.

High-Speed Subscribers (Black Box Pattern)

Now let's look at one way to make our subscribers faster. A common use case for pub-sub is distributing large data streams like market data coming from stock exchanges. A typical setup would have a publisher connected to a stock exchange, taking price quotes, and sending them out to a number of subscribers. If there are a handful of subscribers, we could use TCP. If we have a larger number of subscribers, we'd probably use reliable multicast, i.e., PGM.

Figure 56 - The Simple Black Box Pattern

Let's imagine our feed has an average of 100,000 100-byte messages a second. That's a typical rate, after filtering market data we don't need to send on to subscribers. Now we decide to record a day's data (maybe 250 GB in 8 hours), and then replay it to a simulation network, i.e., a small group of subscribers. While 100K messages a second is easy for a ZeroMQ application, we want to replay it much faster.

So we set up our architecture with a bunch of boxes: one for the publisher and one for each subscriber. These are well-specified boxes: eight cores, twelve for the publisher.

And as we pump data into our subscribers, we notice two things:

1. When we do even the slightest amount of work with a message, it slows down our subscriber to the point where it can't catch up with the publisher again.

2. We're hitting a ceiling, at both publisher and subscriber, of around 6M messages a second, even after careful optimization and TCP tuning.

The first thing we have to do is break our subscriber into a multithreaded design so that we can do work with messages in one set of threads, while reading messages in another. Typically, we don't want to process every message the same way. Rather, the subscriber will filter some messages, perhaps by prefix key. When a message matches some criteria, the subscriber will call a worker to deal with it. In ZeroMQ terms, this means sending the message to a worker thread.

So the subscriber looks something like a queue device. We could use various sockets to connect the subscriber and workers. If we assume one-way traffic and workers that are all identical, we can use PUSH and PULL and delegate all the routing work to ZeroMQ. This is the simplest and fastest approach.

The subscriber talks to the publisher over TCP or PGM. The subscriber talks to its workers, which are all in the same process, over inproc://.

Figure 57 - Mad Black Box Pattern

Now to break that ceiling. The subscriber thread hits 100% of CPU and because it is one thread, it cannot use more than one core. A single thread will always hit a ceiling, be it at 2M, 6M, or more messages per second. We want to split the work across multiple threads that can run in parallel.

The approach used by many high-performance products, which works here, is sharding. Using sharding, we split the work into parallel and independent streams, such as half of the topic keys in one stream, and half in another. We could use many streams, but performance won't scale unless we have free cores. So let's see how to shard into two streams.
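
One way to decide which stream a topic key belongs to is to hash it. This is only a sketch of one possible scheme, not what any particular product does: the publisher would send each message on the stream chosen by shard_for_key, and each subscriber thread would read exactly one stream. The function names are hypothetical, and the FNV-1a hash is an arbitrary choice that happens to be short:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

//  32-bit FNV-1a hash of a topic key
static uint32_t
shard_hash (const char *key)
{
    uint32_t hash = 2166136261u;
    for (; *key; key++) {
        hash ^= (unsigned char) *key;
        hash *= 16777619u;
    }
    return hash;
}

//  Map a topic key to a stream number in the range 0..shards-1.
//  The same key always lands on the same stream.
size_t
shard_for_key (const char *key, size_t shards)
{
    assert (key && shards > 0);
    return shard_hash (key) % shards;
}
```

With two streams, roughly half of a large set of keys should land on each, though how even the split is depends on the keys. If you need an exact split, a fixed key-to-stream table would do the job instead.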

With two streams, working at full speed, we would configure ZeroMQ as follows:

Two I/O threads, rather than one.
Two network interfaces (NIC), one per subscriber.
Each I/O thread bound to a specific NIC.
Two subscriber threads, bound to specific cores.
Two SUB sockets, one per subscriber thread.
The remaining cores assigned to worker threads.
Worker threads connected to both subscriber PUSH sockets.

Ideally, we want to match the number of fully-loaded threads in our architecture with the number of cores. When threads start to fight for cores and CPU cycles, the cost of adding more threads outweighs the benefits. There would be no benefit, for example, in creating more I/O threads.


Reliable Pub-Sub (Clone Pattern)

As a larger worked example, we'll take the problem of making a reliable pub-sub architecture. We'll develop this in stages. The goal is to allow a set of applications to share some common state. Here are our technical challenges:

We have a large set of client applications, say thousands or tens of thousands.
They will join and leave the network arbitrarily.
These applications must share a single eventually-consistent state.
Any application can update the state at any point in time.

Let's say that updates are reasonably low-volume. We don't have real-time goals. The whole state can fit into memory. Some plausible use cases are:

A configuration that is shared by a group of cloud servers.
Some game state shared by a group of players.
Exchange rate data that is updated in real time and available to applications.

Centralized Versus Decentralized

A first decision we have to make is whether we work with a central server or not. It makes a big difference in the resulting design. The trade-offs are these:

Conceptually, a central server is simpler to understand because networks are not naturally symmetrical. With a central server, we avoid all questions of discovery, bind versus connect, and so on.

Generally, a fully-distributed architecture is technically more challenging but ends up with simpler protocols. That is, each node must act as server and client in the right way, which is delicate. When done right, the results are simpler than using a central server. We saw this in the Freelance pattern in Chapter 4 - Reliable Request-Reply Patterns.

A central server will become a bottleneck in high-volume use cases. If handling scale in the order of millions of messages a second is required, we should aim for decentralization right away.

Ironically, a centralized architecture will scale to more nodes more easily than a decentralized one. That is, it's easier to connect 10,000 nodes to one server than to each other.

So, for the Clone pattern we'll work with a server that publishes state updates and a set of clients that represent applications.

Representing State as Key-Value Pairs

We'll develop Clone in stages, solving one problem at a time. First, let's look at how to update a shared state across a set of clients. We need to decide how to represent our state, as well as the updates. The simplest plausible format is a key-value store, where one key-value pair represents an atomic unit of change in the shared state.

We have a simple pub-sub example in Chapter 1 - Basics, the weather server and client. Let's change the server to send key-value pairs, and the client to store these in a hash table. This lets us send updates from one server to a set of clients using the classic pub-sub model.

An update is either a new key-value pair, a modified value for an existing key, or a deleted key. We can assume for now that the whole store fits in memory and that applications access it by key, such as by using a hash table or dictionary. For larger stores and some kind of persistence we'd probably store the state in a database, but that's not relevant here.

This is the server:

clonesrv1: Clone server, Model One in C

And here is the client:

clonecli1: Clone client, Model One in C

Figure 58 - Publishing State Updates

Here are some things to note about this first model:

All the hard work is done in a kvmsg class. This class works with key-value message objects, which are multipart ZeroMQ messages structured as three frames: a key (a ZeroMQ string), a sequence number (64-bit value, in network byte order), and a binary body (holds everything else).

The server generates messages with a randomized 4-digit key, which lets us simulate a large but not enormous hash table (10K entries).

We don't implement deletions in this version: all messages are inserts or updates.

The server does a 200 millisecond pause after binding its socket. This is to prevent slow joiner syndrome, where the subscriber loses messages as it connects to the server's socket. We'll remove that in later versions of the Clone code.

We'll use the terms publisher and subscriber in the code to refer to sockets. This will help later when we have multiple sockets doing different things.
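The sequence frame mentioned above is the only part of the message with a fixed binary layout, so it is worth a small illustration. This is a sketch of packing a 64-bit sequence number into 8 bytes in network byte order (most significant byte first) and reading it back. The function names are hypothetical and this is not the kvmsg code itself.

```c
#include <assert.h>
#include <stdint.h>

//  Store a 64-bit sequence number as 8 bytes, most significant byte first
void
sequence_encode (uint64_t sequence, uint8_t frame [8])
{
    assert (frame);
    int byte_nbr;
    for (byte_nbr = 7; byte_nbr >= 0; byte_nbr--) {
        frame [byte_nbr] = (uint8_t) (sequence & 0xFF);
        sequence >>= 8;
    }
}

//  Rebuild the sequence number from its 8-byte network order form
uint64_t
sequence_decode (const uint8_t frame [8])
{
    assert (frame);
    uint64_t sequence = 0;
    int byte_nbr;
    for (byte_nbr = 0; byte_nbr < 8; byte_nbr++)
        sequence = (sequence << 8) | frame [byte_nbr];
    return sequence;
}
```

Encoding byte by byte like this gives the same result on little-endian and big-endian machines, which is the point of fixing the byte order on the wire.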

Here is the kvmsg class, in the simplest form that works for now:

kvsimple: Key-value message class in C

Later, we'll make a more sophisticated kvmsg class that will work in real applications.

Both the server and client maintain hash tables, but this first model only works properly if we start all clients before the server and the clients never crash. That's very artificial.

Getting an Out-of-Band Snapshot

So now we have our second problem: how to deal with late-joining clients or clients that crash and then restart.

In order to allow a late (or recovering) client to catch up with a server, it has to get a snapshot of the server's state. Just as we've reduced "message" to mean "a sequenced key-value pair", we can reduce "state" to mean "a hash table". To get the server state, a client opens a DEALER socket and asks for it explicitly.

To make this work, we have to solve a problem of timing. Getting a state snapshot will take a certain time, possibly fairly long if the snapshot is large. We need to correctly apply updates to the snapshot. But the server won't know when to start sending us updates. One way would be to start subscribing, get a first update, and then ask for "state for update N". This would require the server storing one snapshot for each update, which isn't practical.

Figure 59 - State Replication

So we will do the synchronization in the client, as follows:

The client first subscribes to updates and then makes a state request. This guarantees that the state is going to be newer than the oldest update it has.

The client waits for the server to reply with state, and meanwhile queues all updates. It does this simply by not reading them: ZeroMQ keeps them queued on the socket queue.

When the client receives its state update, it begins once again to read updates. However, it discards any updates that are not newer than the state update. So if the state update includes updates up to 200, the client will discard updates up to and including 200, and start applying at update 201.

The client then applies updates to its own state snapshot.
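The discard rule in these steps is easy to get off by one, so here is a minimal sketch of it. It works on plain sequence numbers rather than real messages, and the function name is made up for this illustration. It assumes the queued updates arrive in increasing sequence order, as they do from a single publisher.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

//  Given the sequence number of the snapshot and the sequence numbers of
//  the updates that queued up while we waited, copy the ones we must
//  apply into to_apply and return how many there are. An update is
//  applied only if it is strictly newer than everything seen so far.
size_t
updates_after_snapshot (int64_t snapshot_sequence,
    const int64_t *queued, size_t queued_size, int64_t *to_apply)
{
    assert (queued_size == 0 || (queued && to_apply));
    int64_t last_sequence = snapshot_sequence;
    size_t applied = 0;
    size_t index;
    for (index = 0; index < queued_size; index++) {
        if (queued [index] > last_sequence) {
            to_apply [applied++] = queued [index];
            last_sequence = queued [index];
        }
    }
    return applied;
}
```

With a snapshot at sequence 200 and queued updates 199, 200, 201, and 202, only 201 and 202 survive.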

It's a simple model that exploits ZeroMQ's own internal queues. Here's the server:

clonesrv2: Clone server, Model Two in C

And here is the client:

clonecli2: Clone client, Model Two in C

Here are some things to note about these two programs:

The server uses two tasks. One thread produces the updates (randomly) and sends these to the main PUB socket, while the other thread handles state requests on the ROUTER socket. The two communicate across PAIR sockets over an inproc:// connection.

The client is really simple. In C, it consists of about fifty lines of code. A lot of the heavy lifting is done in the kvmsg class. Even so, the basic Clone pattern is easier to implement than it seemed at first.

We don't use anything fancy for serializing the state. The hash table holds a set of kvmsg objects, and the server sends these, as a batch of messages, to the client requesting state. If multiple clients request state at once, each will get a different snapshot.

We assume that the client has exactly one server to talk to. The server must be running; we do not try to solve the question of what happens if the server crashes.

Right now, these two programs don't do anything real, but they correctly synchronize state. It's a neat example of how to mix different patterns: PAIR-PAIR, PUB-SUB, and ROUTER-DEALER.

Republishing Updates from Clients

In our second model, changes to the key-value store came from the server itself. This is a centralized model that is useful, for example if we have a central configuration file we want to distribute, with local caching on each node. A more interesting model takes updates from clients, not the server. The server thus becomes a stateless broker. This gives us some benefits:

We're less worried about the reliability of the server. If it crashes, we can start a new instance and feed it new values.

We can use the key-value store to share knowledge between active peers.

To send updates from clients back to the server, we could use a variety of socket patterns. The simplest plausible solution is a PUSH-PULL combination.

Why don't we allow clients to publish updates directly to each other? While this would reduce latency, it would remove the guarantee of consistency. You can't get consistent shared state if you allow the order of updates to change depending on who receives them. Say we have two clients, changing different keys. This will work fine. But if the two clients try to change the same key at roughly the same time, they'll end up with different notions of its value.

There are a few strategies for obtaining consistency when changes happen in multiple places at once. We'll use the approach of centralizing all change. No matter the precise timing of the changes that clients make, they are all pushed through the server, which enforces a single sequence according to the order in which it gets updates.

Figure 60 - Republishing Updates

By mediating all changes, the server can also add a unique sequence number to all updates. With unique sequencing, clients can detect the nastier failures, including network congestion and queue overflow. If a client discovers that its incoming message stream has a hole, it can take action. It seems sensible that the client contact the server and ask for the missing messages, but in practice that isn't useful. If there are holes, they're caused by network stress, and adding more stress to the network will make things worse. All the client can do is warn its users that it is "unable to continue", stop, and not restart until someone has manually checked the cause of the problem.
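Detecting a hole is a one-line comparison once the server numbers its updates. The sketch below assumes the server increments the sequence by exactly one per update and that the client receives every update (no subtree filtering). The type and function names are invented for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    int64_t last_sequence;      //  Last sequence seen; 0 means none yet
    bool has_hole;              //  Set once a gap is seen, never cleared
} stream_check_t;

//  Record one incoming update. Returns true while the stream is still
//  intact, false once any hole has been detected.
bool
stream_check_update (stream_check_t *self, int64_t sequence)
{
    assert (self);
    if (self->last_sequence && sequence != self->last_sequence + 1)
        self->has_hole = true;
    self->last_sequence = sequence;
    return !self->has_hole;
}
```

Note that the flag is sticky on purpose: following the reasoning above, a client that has seen a hole should stop and complain rather than carry on as if nothing happened.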

We'll now generate state updates in the client. Here's the server:

clonesrv3: Clone server, Model Three in C

And here is the client:

clonecli3: Clone client, Model Three in C

Here are some things to note about this third design:

The server has collapsed to a single task. It manages a PULL socket for incoming updates, a ROUTER socket for state requests, and a PUB socket for outgoing updates.

The client uses a simple tickless timer to send a random update to the server once a second. In a real implementation, we would drive updates from application code.

Working with Subtrees

As we grow the number of clients, the size of our shared store will also grow. It stops being reasonable to send everything to every client. This is the classic story with pub-sub: when you have a very small number of clients, you can send every message to all clients. As you grow the architecture, this becomes inefficient. Clients specialize in different areas.

So even when working with a shared store, some clients will want to work only with a part of that store, which we call a subtree. The client has to request the subtree when it makes a state request, and it must specify the same subtree when it subscribes to updates.

There are a couple of common syntaxes for trees. One is the path hierarchy, and another is the topic tree. These look like this:

Path hierarchy: /some/list/of/paths
Topic tree: some.list.of.topics

We'll use the path hierarchy, and extend our client and server so that a client can work with a single subtree. Once you see how to work with a single subtree you'll be able to extend this yourself to handle multiple subtrees, if your use case demands it.
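With a path hierarchy, "this key belongs to that subtree" reduces to a prefix match, which is also how a SUB socket filters by subscription. Here is a minimal sketch of the check a server could make when it builds a snapshot for one client. The function name is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

//  Return true if key lies inside subtree. An empty subtree matches
//  every key; otherwise the key must start with the subtree path.
bool
key_in_subtree (const char *key, const char *subtree)
{
    assert (key);
    assert (subtree);
    return strncmp (key, subtree, strlen (subtree)) == 0;
}
```

Ending the subtree in a slash, as in "/client/", keeps "/client/" from also matching keys under "/clients/".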

Here's the server implementing subtrees, a small variation on Model Three:

clonesrv4: Clone server, Model Four in C

And here is the corresponding client:

clonecli4: Clone client, Model Four in C

Ephemeral Values

An ephemeral value is one that expires automatically unless regularly refreshed. If you think of Clone being used for a registration service, then ephemeral values would let you do dynamic values. A node joins the network, publishes its address, and refreshes this regularly. If the node dies, its address eventually gets removed.

The usual abstraction for ephemeral values is to attach them to a session, and delete them when the session ends. In Clone, sessions would be defined by clients, and would end if the client died. A simpler alternative is to attach a time to live (TTL) to ephemeral values, which the server uses to expire values that haven't been refreshed in time.

A good design principle that I use whenever possible is to not invent concepts that are not absolutely essential. If we have very large numbers of ephemeral values, sessions will offer better performance. If we use a handful of ephemeral values, it's fine to set a TTL on each one. If we use masses of ephemeral values, it's more efficient to attach them to sessions and expire them in bulk. This isn't a problem we face at this stage, and may never face, so sessions go out the window.

Now we will implement ephemeral values. First, we need a way to encode the TTL in the key-value message. We could add a frame. The problem with using ZeroMQ frames for properties is that each time we want to add a new property, we have to change the message structure. It breaks compatibility. So let's add a properties frame to the message, and write the code to let us get and put property values.

Next, we need a way to say, "delete this value". Up until now, servers and clients have always blindly inserted or updated new values into their hash table. We'll say that if the value is empty, that means "delete this key".

Here's a more complete version of the kvmsg class, which implements the properties frame (and adds a UUID frame, which we'll need later on). It also handles empty values by deleting the key from the hash, if necessary:

kvmsg: Key-value message class: full in C

The Model Five client is almost identical to Model Four. It uses the full kvmsg class now, and sets a randomized ttl property (measured in seconds) on each message:

kvmsg_set_prop (kvmsg, "ttl", "%d", randof (30));
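On the receiving side, a TTL in seconds has to become a decision about when to drop the value. One simple way, sketched here with invented names and a caller-supplied clock in milliseconds, is to convert the relative TTL into an absolute expiry time when the value arrives and then compare against the clock on each timer tick. A TTL of zero is taken to mean "never expires"; that convention is an assumption of this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    int64_t expires_at;     //  Absolute expiry time in msecs; 0 = never
} ephemeral_t;

//  Turn a relative TTL (seconds) into an absolute expiry time
void
ephemeral_set_ttl (ephemeral_t *self, int ttl_seconds, int64_t now_msecs)
{
    assert (self);
    self->expires_at = ttl_seconds > 0?
        now_msecs + (int64_t) ttl_seconds * 1000: 0;
}

//  Return true if the value should be deleted at time now_msecs
bool
ephemeral_expired (const ephemeral_t *self, int64_t now_msecs)
{
    assert (self);
    return self->expires_at != 0 && now_msecs >= self->expires_at;
}
```

Refreshing a value is then just another call to ephemeral_set_ttl with the current time, which pushes the expiry forward.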

Using a Reactor

Until now, we have used a poll loop in the server. In this next model of the server, we switch to using a reactor. In C, we use CZMQ's zloop class. Using a reactor makes the code more verbose, but easier to understand and build out because each piece of the server is handled by a separate reactor handler.

We use a single thread and pass a server object around to the reactor handlers. We could have organized the server as multiple threads, each handling one socket or timer, but that works better when threads don't have to share data. In this case all work is centered around the server's hashmap, so one thread is simpler.

There are three reactor handlers:

One to handle snapshot requests coming on the ROUTER socket.
One to handle incoming updates from clients, coming on the PULL socket.
One to expire ephemeral values that have passed their TTL.

clonesrv5: Clone server, Model Five in C

Adding the Binary Star Pattern for Reliability

The Clone models we've explored up to now have been relatively simple. Now we're going to get into unpleasantly complex territory, which has me getting up for another espresso. You should appreciate that making "reliable" messaging is complex enough that you always need to ask, "Do we actually need this?" before jumping into it. If you can get away with unreliable or with "good enough" reliability, you can make a huge win in terms of cost and complexity. Sure, you may lose some data now and then. It is often a good trade-off. Having said that, and (sips) because the espresso is really good, let's jump in.

As you play with the last model, you'll stop and restart the server. It might look like it recovers, but of course it's applying updates to an empty state instead of the proper current state. Any new client joining the network will only get the latest updates instead of the full historical record.

What we want is a way for the server to recover from being killed, or crashing. We also need to provide backup in case the server is out of commission for any length of time. When someone asks for "reliability", ask them to list the failures they want to handle. In our case, these are:

The server process crashes and is automatically or manually restarted. The process loses its state and has to get it back from somewhere.

The server machine dies and is offline for a significant time. Clients have to switch to an alternate server somewhere.

The server process or machine gets disconnected from the network, e.g., a switch dies or a datacenter gets knocked out. It may come back at some point, but in the meantime clients need an alternate server.

Our first step is to add a second server. We can use the Binary Star pattern from Chapter 4 - Reliable Request-Reply Patterns to organize these into primary and backup. Binary Star is a reactor, so it's useful that we already refactored the last server model into a reactor style.

We need to ensure that updates are not lost if the primary server crashes. The simplest technique is to send them to both servers. The backup server can then act as a client, and keep its state synchronized by receiving updates as all clients do. It'll also get new updates from clients. It can't yet store these in its hash table, but it can hold onto them for a while.

So, Model Six introduces the following changes over Model Five:

We use a pub-sub flow instead of a push-pull flow for client updates sent to the servers. This takes care of fanning out the updates to both servers. Otherwise we'd have to use two DEALER sockets.

We add heartbeats to server updates (to clients), so that a client can detect when the primary server has died. It can then switch over to the backup server.

We connect the two servers using the Binary Star bstar reactor class. Binary Star relies on the clients to vote by making an explicit request to the server they consider active. We'll use snapshot requests as the voting mechanism.

We make all update messages uniquely identifiable by adding a UUID field. The client generates this, and the server propagates it back on republished updates.

The passive server keeps a "pending list" of updates that it has received from clients, but not yet from the active server; or updates it's received from the active server, but not yet from the clients. The list is ordered from oldest to newest, so that it is easy to remove updates off the head.

Figure 61 - Clone Client Finite State Machine

It's useful to design the client logic as a finite state machine. The client cycles through three states:

The client opens and connects its sockets, and then requests a snapshot from the first server. To avoid request storms, it will ask any given server only twice. One request might get lost, which would be bad luck. Two getting lost would be carelessness.

The client waits for a reply (snapshot data) from the current server, and if it gets it, it stores it. If there is no reply within some timeout, it fails over to the next server.

When the client has gotten its snapshot, it waits for and processes updates. Again, if it doesn't hear anything from the server within some timeout, it fails over to the next server.
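The three states above map naturally onto an enum and a small transition function. This is a sketch only: the state and event names are chosen for this example, the "ask any given server only twice" rule is left out to keep it short, and a real client would do socket work where this code only changes state.

```c
#include <assert.h>

typedef enum {
    STATE_INITIAL,      //  Before asking the current server for state
    STATE_SYNCING,      //  Waiting for the snapshot from the server
    STATE_ACTIVE        //  Snapshot received, processing updates
} client_state_t;

typedef enum {
    EVENT_REQUEST_SENT,     //  Snapshot request went out
    EVENT_SNAPSHOT_DONE,    //  Server finished sending the snapshot
    EVENT_TIMEOUT           //  Nothing heard from the server in time
} client_event_t;

typedef struct {
    client_state_t state;
    unsigned cur_server;    //  Index of the server we're talking to
    unsigned nbr_servers;   //  How many servers we know about
} client_fsm_t;

//  Apply one event to the state machine
void
client_fsm_handle (client_fsm_t *self, client_event_t event)
{
    assert (self);
    assert (self->nbr_servers > 0);
    switch (event) {
        case EVENT_REQUEST_SENT:
            if (self->state == STATE_INITIAL)
                self->state = STATE_SYNCING;
            break;
        case EVENT_SNAPSHOT_DONE:
            if (self->state == STATE_SYNCING)
                self->state = STATE_ACTIVE;
            break;
        case EVENT_TIMEOUT:
            //  Fail over to the next server and start from scratch
            if (self->state != STATE_INITIAL) {
                self->cur_server = (self->cur_server + 1) % self->nbr_servers;
                self->state = STATE_INITIAL;
            }
            break;
    }
}
```

Keeping the transitions in one function makes it easy to see that every timeout, whether during syncing or during normal operation, ends in the same place: the initial state on the next server.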

The client loops forever. It's quite likely during startup or failover that some clients may be trying to talk to the primary server while others are trying to talk to the backup server. The Binary Star state machine handles this, hopefully accurately. It's hard to prove software correct; instead we hammer it until we can't prove it wrong.

Failover happens as follows:

The client detects that the primary server is no longer sending heartbeats, and concludes that it has died. The client connects to the backup server and requests a new state snapshot.

The backup server starts to receive snapshot requests from clients, and detects that the primary server has gone, so it takes over as primary.

The backup server applies its pending list to its own hash table, and then starts to process state snapshot requests.

When the primary server comes back online, it will:

Start up as passive server, and connect to the backup server as a Clone client.

Start to receive updates from clients, via its SUB socket.

We make a few assumptions:

At least one server will keep running. If both servers crash, we lose all server state and there's no way to recover it.

Multiple clients do not update the same hash keys at the same time. Client updates will arrive at the two servers in a different order. Therefore, the backup server may apply updates from its pending list in a different order than the primary server would or did. Updates from one client will always arrive in the same order on both servers, so that is safe.

Thus the architecture for our high-availability server pair using the Binary Star pattern has two servers and a set of clients that talk to both servers.

Figure 62 - High-availability Clone Server Pair

Here is the sixth and last model of the Clone server:

clonesrv6: Clone server, Model Six in C

This model is only a few hundred lines of code, but it took quite a while to get working. To be accurate, building Model Six took about a full week of "Sweet god, this is just too complex for an example" hacking. We've assembled pretty much everything and the kitchen sink into this small application. We have failover, ephemeral values, subtrees, and so on. What surprised me was that the up-front design was pretty accurate. Still, the details of writing and debugging so many socket flows are quite challenging.

The reactor-based design removes a lot of the grunt work from the code, and what remains is simpler and easier to understand. We reuse the bstar reactor from Chapter 4 - Reliable Request-Reply Patterns. The whole server runs as one thread, so there's no inter-thread weirdness going on; just a structure pointer (self) passed around to all handlers, which can do their thing happily. One nice side effect of using reactors is that the code, being less tightly integrated into a poll loop, is much easier to reuse. Large chunks of Model Six are taken from Model Five.

I built it piece by piece, and got each piece working properly before going onto the next one. Because there are four or five main socket flows, that meant quite a lot of debugging and testing. I debugged just by dumping messages to the console. Don't use classic debuggers to step through ZeroMQ applications; you need to see the message flows to make any sense of what is going on.

For testing, I always try to use Valgrind, which catches memory leaks and invalid memory accesses. In C, this is a major concern, as you can't delegate to a garbage collector. Using proper and consistent abstractions like kvmsg and CZMQ helps enormously.

The Clustered Hashmap Protocol

While the server is pretty much a mashup of the previous model plus the Binary Star pattern, the client is quite a lot more complex. But before we get to that, let's look at the final protocol. I've written this up as a specification on the ZeroMQ RFC website as the Clustered Hashmap Protocol.

Roughly, there are two ways to design a complex protocol such as this one. One way is to separate each flow into its own set of sockets. This is the approach we used here. The advantage is that each flow is simple and clean. The disadvantage is that managing multiple socket flows at once can be quite complex. Using a reactor makes it simpler, but still, it makes a lot of moving pieces that have to fit together correctly.

The second way to make such a protocol is to use a single socket pair for everything. In this case, I'd have used ROUTER for the server and DEALER for the clients, and then done everything over that connection. It makes for a more complex protocol but at least the complexity is all in one place. In Chapter 7 - Advanced Architecture using ZeroMQ we'll look at an example of a protocol done over a ROUTER-DEALER combination.

Let's take a look at the CHP specification. Note that "SHOULD", "MUST" and "MAY" are key words we use in protocol specifications to indicate requirement levels.

Goals

CHP is meant to provide a basis for reliable pub-sub across a cluster of clients connected over a ZeroMQ network. It defines a "hashmap" abstraction consisting of key-value pairs. Any client can modify any key-value pair at any time, and changes are propagated to all clients. A client can join the network at any time.

Architecture

CHP connects a set of client applications and a set of servers. Clients connect to the server. Clients do not see each other. Clients can come and go arbitrarily.

Ports and Connections

The server MUST open three ports as follows:

A SNAPSHOT port (ZeroMQ ROUTER socket) at port number P.
A PUBLISHER port (ZeroMQ PUB socket) at port number P + 1.
A COLLECTOR port (ZeroMQ SUB socket) at port number P + 2.

The client SHOULD open at least two connections:

A SNAPSHOT connection (ZeroMQ DEALER socket) to port number P.
A SUBSCRIBER connection (ZeroMQ SUB socket) to port number P + 1.

The client MAY open a third connection, if it wants to update the hashmap:

A PUBLISHER connection (ZeroMQ PUB socket) to port number P + 2.

This extra frame is not shown in the commands explained below.

State Synchronization

The client MUST start by sending an ICANHAZ command to its snapshot connection. This command consists of two frames as follows:

ICANHAZ command

Frame 0: "ICANHAZ?"
Frame 1: subtree specification

Both frames are ZeroMQ strings. The subtree specification MAY be empty. If not empty, it consists of a slash followed by one or more path segments, ending in a slash.

The server MUST respond to an ICANHAZ command by sending zero or more KVSYNC commands to its snapshot port, followed with a KTHXBAI command. The server MUST prefix each command with the identity of the client, as provided by ZeroMQ with the ICANHAZ command. The KVSYNC command specifies a single key-value pair as follows:

KVSYNC command

Frame 0: key, as ZeroMQ string
Frame 1: sequence number, 8 bytes in network order
Frame 2: <empty>
Frame 3: <empty>
Frame 4: value, as blob

The sequence number has no significance and may be zero.

The KTHXBAI command takes this form:

KTHXBAI command

Frame 0: "KTHXBAI"
Frame 1: sequence number, 8 bytes in network order
Frame 2: <empty>
Frame 3: <empty>
Frame 4: subtree specification

The sequence number MUST be the highest sequence number of the KVSYNC commands previously sent.

When the client has received a KTHXBAI command, it SHOULD start to receive messages from its subscriber connection and apply them.

Server-to-Client Updates

When the server has an update for its hashmap it MUST broadcast this on its publisher socket as a KVPUB command. The KVPUB command has this form:

KVPUB command

Frame 0: key, as ZeroMQ string
Frame 1: sequence number, 8 bytes in network order
Frame 2: UUID, 16 bytes
Frame 3: properties, as ZeroMQ string
Frame 4: value, as blob

The sequence number MUST be strictly incremental. The client MUST discard any KVPUB commands whose sequence numbers are not strictly greater than the last KTHXBAI or KVPUB command received.

The UUID is optional and frame 2 MAY be empty (size zero). The properties field is formatted as zero or more instances of "name=value" followed by a newline character. If the key-value pair has no properties, the properties field is empty.

If the value is empty, the client SHOULD delete its key-value entry with the specified key.
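The properties field described above is plain text, so reading one property out of it needs nothing more than the standard string functions. The following is an illustrative sketch, not part of the specification; the function name props_get and its signature are made up for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

//  Look up name in a properties string made of "name=value" instances,
//  each followed by a newline. On success, copy the value into the
//  caller's buffer (truncating if it doesn't fit) and return true.
bool
props_get (const char *props, const char *name,
    char *value, size_t value_size)
{
    assert (props && name && value);
    assert (value_size > 0);
    size_t name_len = strlen (name);
    while (*props) {
        //  Find the end of the current line
        const char *eol = strchr (props, '\n');
        size_t line_len = eol? (size_t) (eol - props): strlen (props);
        if (line_len > name_len
        &&  strncmp (props, name, name_len) == 0
        &&  props [name_len] == '=') {
            size_t value_len = line_len - name_len - 1;
            if (value_len >= value_size)
                value_len = value_size - 1;
            memcpy (value, props + name_len + 1, value_len);
            value [value_len] = 0;
            return true;
        }
        //  Skip to the start of the next line
        props += line_len;
        if (*props == '\n')
            props++;
    }
    return false;
}
```

For a properties field of "ttl=30" followed by a newline, asking for "ttl" yields the string "30", which the caller can then convert to a number.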

In the absence of other updates the server SHOULD send a HUGZ command at regular intervals, e.g., once per second. The HUGZ command has this format:

HUGZ command

Frame 0: "HUGZ"
Frame 1: 00000000
Frame 2: <empty>
Frame 3: <empty>
Frame 4: <empty>

The client MAY treat the absence of HUGZ as an indicator that the server has crashed (see Reliability below).

Client-to-Server Updates

When the client has an update for its hashmap, it MAY send this to the server via its publisher connection as a KVSET command. The KVSET command has this form:

KVSET command

Frame 0: key, as ZeroMQ string
Frame 1: sequence number, 8 bytes in network order
Frame 2: UUID, 16 bytes
Frame 3: properties, as ZeroMQ string
Frame 4: value, as blob

The sequence number has no significance and may be zero. The UUID SHOULD be a universally unique identifier, if a reliable server architecture is used.

If the value is empty, the server MUST delete its key-value entry with the specified key.

The server SHOULD accept the following properties:

ttl: specifies a time-to-live in seconds. If the KVSET command has a ttl property, the server SHOULD delete the key-value pair and broadcast a KVPUB with an empty value in order to delete this from all clients when the TTL has expired.

Reliability

CHP may be used in a dual-server configuration where a backup server takes over if the primary server fails. CHP does not specify the mechanisms used for this failover but the Binary Star pattern may be helpful.

To assist server reliability, the client MAY:

Set a UUID in every KVSET command.
Detect the lack of HUGZ over a time period and use this as an indicator that the current server has failed.
Connect to a backup server and re-request a state synchronization.

Scalability and Performance

CHP is designed to be scalable to large numbers (thousands) of clients, limited only by system resources on the broker. Because all updates pass through a single server, the overall throughput will be limited to some millions of updates per second at peak, and probably less.

Security

CHP does not implement any authentication, access control, or encryption mechanisms and should not be used in any deployment where these are required.

Building a Multithreaded Stack and API

The client stack we've used so far isn't smart enough to handle this protocol properly. As soon as we start doing heartbeats, we need a client stack that can run in a background thread. In the Freelance pattern at the end of Chapter 4 - Reliable Request-Reply Patterns we used a multithreaded API but didn't explain it in detail. It turns out that multithreaded APIs are quite useful when you start to make more complex ZeroMQ protocols like CHP.

Figure 63 - Multithreaded API

If you make a nontrivial protocol and you expect applications to implement it properly, most developers will get it wrong most of the time. You're going to be left with a lot of unhappy people complaining that your protocol is too complex, too fragile, and too hard to use. Whereas if you give them a simple API to call, you have some chance of them buying in.

Our multithreaded API consists of a frontend object and a background agent, connected by two PAIR sockets. Connecting two PAIR sockets like this is so useful that your high-level binding should probably do what CZMQ does, which is package a "create new thread with a pipe that I can use to send messages to it" method.

The multithreaded APIs that we see in this book all take the same form:

The constructor for the object (clone_new) creates a context and starts a background thread connected with a pipe. It holds onto one end of the pipe so it can send commands to the background thread.

The background thread starts an agent that is essentially a zmq_poll loop reading from the pipe socket and any other sockets (here, the DEALER and SUB sockets).

The main application thread and the background thread now communicate only via ZeroMQ messages. By convention, the frontend sends string commands so that each method on the class turns into a message sent to the backend agent, like this:

void
clone_connect (clone_t *self, char *address, char *service)
{
    assert (self);
    zmsg_t *msg = zmsg_new ();
    zmsg_addstr (msg, "CONNECT");
    zmsg_addstr (msg, address);
    zmsg_addstr (msg, service);
    zmsg_send (&msg, self->pipe);
}

If the method needs a return code, it can wait for a reply message from the agent.

If the agent needs to send asynchronous events back to the frontend, we add a recv method to the class, which waits for messages on the frontend pipe.

We may want to expose the frontend pipe socket handle to allow the class to be integrated into further poll loops. Otherwise any recv method would block the application.

The clone class has the same structure as the flcliapi class from Chapter 4 - Reliable Request-Reply Patterns and adds the logic from the last model of the Clone client. Without ZeroMQ, this kind of multithreaded API design would be weeks of really hard work. With ZeroMQ, it was a day or two of work.

The actual API methods for the clone class are quite simple:

//  Create a new clone class instance
clone_t *
clone_new (void);

//  Destroy a clone class instance
void
clone_destroy (clone_t **self_p);

//  Define the subtree, if any, for this clone class
void
clone_subtree (clone_t *self, char *subtree);

//  Connect the clone class to one server
void
clone_connect (clone_t *self, char *address, char *service);

//  Set a value in the shared hashmap
void
clone_set (clone_t *self, char *key, char *value, int ttl);

//  Get a value from the shared hashmap
char *
clone_get (clone_t *self, char *key);

So here is Model Six of the clone client, which has now become just a thin shell using the clone class:

clonecli6: Clone client, Model Six in C

Note the connect method, which specifies one server endpoint. Under the hood, we're in fact talking to three ports. However, as the CHP protocol says, the three ports are on consecutive port numbers:

The server state router (ROUTER) is at port P.
The server updates publisher (PUB) is at port P + 1.
The server updates subscriber (SUB) is at port P + 2.

So we can fold the three connections into one logical operation (which we implement as three separate ZeroMQ connect calls).
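Folding three connections into one call mostly comes down to formatting three endpoint strings from one address and one base port. Here is a hedged sketch of that step alone. The function name, the fixed 256-byte buffers, and the "address:port" endpoint format are assumptions of this example, and the actual connect calls are left out.

```c
#include <assert.h>
#include <stdio.h>

//  Build the three endpoints for one server: snapshot at port P,
//  publisher at P + 1, and collector at P + 2. Returns 0 if all three
//  fit into their buffers, -1 otherwise.
int
clone_endpoints (const char *address, int port, char endpoints [3][256])
{
    assert (address);
    assert (endpoints);
    int offset;
    for (offset = 0; offset < 3; offset++) {
        int size = snprintf (endpoints [offset], 256,
                             "%s:%d", address, port + offset);
        if (size < 0 || size >= 256)
            return -1;          //  Formatting failed or was truncated
    }
    return 0;
}
```

Called with "tcp://localhost" and 5556, this fills in tcp://localhost:5556, tcp://localhost:5557, and tcp://localhost:5558, one for each socket the client opens.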

Let'sendwiththesourcecodefortheclonestack.Thisisacomplexpieceofcode,buteasiertounderstandwhenyoubreakit
intothefrontendobjectclassandthebackendagent.Thefrontendsendsstringcommands("SUBTREE","CONNECT","SET",
"GET")totheagent,whichhandlesthesecommandsaswellastalkingtotheserver(s).Hereistheagent'slogic:

1. Start up by getting a snapshot from the first server.
2. When we get a snapshot, switch to reading from the subscriber socket.
3. If we don't get a snapshot, then fail over to the second server.
4. Poll on the pipe and the subscriber socket.
5. If we got input on the pipe, handle the control message from the frontend object.
6. If we got input on the subscriber, store or apply the update.
7. If we didn't get anything from the server within a certain time, fail over.
8. Repeat until the process is interrupted by Ctrl-C.
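The failover part of this logic can be read as a small state machine. The sketch below is one possible reading, with the sockets and polling left out entirely: the state names, event names, and the agent_t fields are assumptions made for illustration and may not match the real clone class.

```c
#include <assert.h>

//  Hypothetical agent states; names chosen for this sketch only
typedef enum {
    STATE_INITIAL,      //  Before asking a server for a snapshot
    STATE_SYNCING,      //  Snapshot requested, waiting for it
    STATE_ACTIVE        //  Snapshot received, applying updates
} agent_state_t;

//  Hypothetical events the poll loop would feed into the agent
typedef enum {
    EVENT_SNAPSHOT_REQUESTED,
    EVENT_SNAPSHOT_COMPLETE,
    EVENT_UPDATE,
    EVENT_SERVER_TIMEOUT
} agent_event_t;

typedef struct {
    agent_state_t state;
    int nbr_servers;    //  Number of known servers, at least 1
    int cur_server;     //  Index of the server we are talking to
} agent_t;

//  Apply one event to the agent and update its state
static void
agent_handle (agent_t *self, agent_event_t event)
{
    switch (event) {
        case EVENT_SNAPSHOT_REQUESTED:
            if (self->state == STATE_INITIAL)
                self->state = STATE_SYNCING;
            break;
        case EVENT_SNAPSHOT_COMPLETE:
            if (self->state == STATE_SYNCING)
                self->state = STATE_ACTIVE;
            break;
        case EVENT_UPDATE:
            //  Updates are stored or applied; the state does not change
            break;
        case EVENT_SERVER_TIMEOUT:
            //  Fail over to the next server and start again from scratch
            if (self->state != STATE_INITIAL && self->nbr_servers > 0) {
                self->cur_server = (self->cur_server + 1) % self->nbr_servers;
                self->state = STATE_INITIAL;
            }
            break;
    }
}
```

In this reading a timeout while syncing and a timeout while active are handled the same way: move to the next server and ask for a fresh snapshot.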

And here is the actual clone class implementation:

clone: Clone class in C


Chapter 6 - The ZeroMQ Community

People sometimes ask me what's so special about ZeroMQ. My standard answer is that ZeroMQ is arguably the best answer we
have to the vexing question of "How do we make the distributed software that the 21st century demands?" But more than that,
ZeroMQ is special because of its community. This is ultimately what separates the wolves from the sheep.

There are three main open source patterns. The first is the large firm dumping code to break the market for others. This is
the Apache Foundation model. The second is tiny teams or small firms building their dream. This is the most common open
source model, which can be very successful commercially. The last is aggressive and diverse communities that swarm over a
problem landscape. This is the Linux model, and the one to which we aspire with ZeroMQ.

It's hard to overemphasize the power and persistence of a working open source community. There really does not seem to be a
better way of making software for the long term. Not only does the community choose the best problems to solve, it solves
them minimally, carefully, and it then looks after these answers for years, decades, until they're no longer relevant, and
then it quietly puts them away.

To really benefit from ZeroMQ, you need to understand the community. At some point down the road you'll want to submit a
patch, an issue, or an add-on. You might want to ask someone for help. You will probably want to bet a part of your business
on ZeroMQ, and when I tell you that the community is much, much more important than the company that backs the product, even
though I'm CEO of that company, this should be significant.

In this chapter I'm going to look at our community from several angles and conclude by explaining in detail our contract for
collaboration, which we call "C4". You should find the discussion useful for your own work. We've also adapted the ZeroMQ C4
process for closed source projects with good success.

We'll cover:

The rough structure of ZeroMQ as a set of projects
What "software architecture" is really about
Why we use the LGPL and not the BSD license
How we designed and grew the ZeroMQ community
The business that backs ZeroMQ
Who owns the ZeroMQ source code
How to make and submit a patch to ZeroMQ
Who controls what patches actually go into ZeroMQ
How we guarantee compatibility with old code
Why we don't use public git branches
Who decides on the ZeroMQ road map
A worked example of a change to libzmq

Architecture of the ZeroMQ Community

You know that ZeroMQ is an LGPL-licensed project. In fact it's a collection of projects, built around the core library,
libzmq. I'll visualize these projects as an expanding galaxy:

At the core, libzmq is the ZeroMQ core library. It's written in C++, with a low-level C API. The code is nasty, mainly
because it's highly optimized but also because it's written in C++, a language that lends itself to subtle and deep
nastiness. Martin Sustrik wrote the bulk of this code. Today it has dozens of people who maintain different parts of it.

Around libzmq, there are about 50 bindings. These are individual projects that create higher-level APIs for ZeroMQ, or at
least map the low-level API into other languages. The bindings vary in quality from experimental to utterly awesome.
Probably the most impressive binding is PyZMQ, which was one of the first community projects on top of ZeroMQ. If you
are a binding author, you should really study PyZMQ and aspire to making your code and community as great.

A lot of languages have multiple bindings (Erlang, Ruby, C#, at least) written by different people over time, or taking
varying approaches. We don't regulate these in any way. There are no "official" bindings. You vote by using one or the
other, contributing to it, or ignoring it.

There is a series of reimplementations of libzmq, starting with JeroMQ, a full Java translation of the library, which is
now the basis for NetMQ, a C# stack. These native stacks offer similar or identical APIs, and speak the same protocol
(ZMTP) as libzmq.

On top of the bindings are a lot of projects that use ZeroMQ or build on it. See the "Labs" page on the wiki for a long list
of projects and proto-projects that use ZeroMQ in some way. There are frameworks, web servers like Mongrel2, brokers like
Majordomo, and enterprise open source tools like Storm.

Libzmq, most of the bindings, and some of the outer projects sit in the ZeroMQ community "organization" on GitHub. This
organization is "run" by a group consisting of the most senior binding authors. There's very little to run as it's almost
all self-managing and there's zero conflict these days.

iMatix, my firm, plays a specific role in the community. We own the trademarks and enforce them discreetly in order to make
sure that if you download a package calling itself "ZeroMQ", you can trust what you are getting. People have on rare
occasion tried to hijack the name, maybe believing that "free software" means there is no property at stake and no one
willing to defend it. One thing you'll understand from this chapter is how seriously we take the process behind our software
(and I mean "us" as a community, not a company). iMatix backs the community by enforcing that process on anything calling
itself "ZeroMQ" or "ØMQ". We also put money and time into the software and packaging for reasons I'll explain later.

It is not a charity exercise. ZeroMQ is a for-profit project, and a very profitable one. The profits are widely distributed
among all those who invest in it. It's really that simple: take the time to become an expert in ZeroMQ, or build something
useful on top of ZeroMQ, and you'll find your value as an individual, or team, or company increasing. iMatix enjoys the same
benefits as everyone else in the community. It's win-win to everyone except our competitors, who find themselves facing a
threat they can't beat and can't really escape. ZeroMQ dominates the future world of massively distributed software.

My firm doesn't just have the community's back; we also built the community. This was deliberate work: in the original
ZeroMQ white paper from 2007, there were two projects. One was technical, how to make a better messaging system. The second
was how to build a community that could take the software to dominant success. Software dies, but community survives.

How to Make Really Large Architectures

There are, it has been said (at least by people reading this sentence out loud), two ways to make really large-scale
software. Option One is to throw massive amounts of money and problems at empires of smart people, and hope that what
emerges is not yet another career killer. If you're very lucky and are building on lots of experience, have kept your teams
solid, and are not aiming for technical brilliance, and are furthermore incredibly lucky, it works.

But gambling with hundreds of millions of others' money isn't for everyone. For the rest of us who want to build large-scale
software, there's Option Two, which is open source, and more specifically, free software. If you're asking how the choice of
software license is relevant to the scale of the software you build, that's the right question.

The brilliant and visionary Eben Moglen once said, roughly, that a free software license is the contract on which a
community builds. When I heard this, about ten years ago, the idea came to me: can we deliberately grow free software
communities?

Ten years later, the answer is "yes", and there is almost a science to it. I say "almost" because we don't yet have enough
evidence of people doing this deliberately with a documented, reproducible process. It is what I'm trying to do with Social
Architecture. ZeroMQ came after Wikidot, after the Digital Standards Organization (Digistan) and after the Foundation for a
Free Information Infrastructure (aka the FFII, an NGO that fights against software patents). This all came after a lot of
less successful community projects like Xitami and Libero. My main takeaway from a long career of projects of every
conceivable format is: if you want to build truly large-scale and long-lasting software, aim to build a free software
community.

Psychology of Software Architecture

Dirkjan Ochtman pointed me to Wikipedia's definition of Software Architecture as "the set of structures needed to reason
about the system, which comprise software elements, relations among them, and properties of both". For me this vapid and
circular jargon is a good example of how miserably little we understand what actually makes a successful large-scale
software architecture.

Architecture is the art and science of making large artificial structures for human use. If there is one thing I've learned
and applied successfully in 30 years of making larger and larger software systems, it is this: software is about people.
Large structures in themselves are meaningless. It's how they function for human use that matters. And in software, human
use starts with the programmers who make the software itself.

The core problems in software architecture are driven by human psychology, not technology. There are many ways our
psychology affects our work. I could point to the way teams seem to get stupider as they get larger or when they have to
work across larger distances. Does that mean the smaller the team, the more effective? How then does a large global
community like ZeroMQ manage to work successfully?

The ZeroMQ community wasn't accidental. It was a deliberate design, my contribution to the early days when the code came out
of a cellar in Bratislava. The design was based on my pet science of "Social Architecture", which Wikipedia defines as "the
conscious design of an environment that encourages a desired range of social behaviors leading towards some goal or set of
goals." I define this more specifically as "the process, and the product, of planning, designing, and growing an online
community."

One of the tenets of Social Architecture is that how we organize is more significant than who we are. The same group,
organized differently, can produce wholly different results. We are like peers in a ZeroMQ network, and our communication
patterns have a dramatic impact on our performance. Ordinary people, well connected, can far outperform a team of experts
using poor patterns. If you're the architect of a larger ZeroMQ application, you're going to have to help others find the
right patterns for working together. Do this right, and your project can succeed. Do it wrong, and your project will fail.

The two most important psychological elements are that we're really bad at understanding complexity and that we are so good
at working together to divide and conquer large problems. We're highly social apes, and kind of smart, but only in the right
kind of crowd.

So here is my short list of the Psychological Elements of Software Architecture:

Stupidity: our mental bandwidth is limited, so we're all stupid at some point. The architecture has to be simple to
understand. This is the number one rule: simplicity beats functionality, every single time. If you can't understand an
architecture on a cold gray Monday morning before coffee, it is too complex.

Selfishness: we act only out of self-interest, so the architecture must create space and opportunity for selfish acts that
benefit the whole. Selfishness is often indirect and subtle. For example, I'll spend hours helping someone else understand
something because that could be worth days to me later.

Laziness: we make lots of assumptions, many of which are wrong. We are happiest when we can spend the least effort to
get a result or to test an assumption quickly, so the architecture has to make this possible. Specifically, that means it
must be simple.

Jealousy: we're jealous of others, which means we'll overcome our stupidity and laziness to prove others wrong and beat
them in competition. The architecture thus has to create space for public competition based on fair rules that anyone can
understand.

Fear: we're unwilling to take risks, especially if it makes us look stupid. Fear of failure is a major reason people conform
and follow the group in mass stupidity. The architecture should make silent experimentation easy and cheap, giving people
opportunity for success without punishing failure.

Reciprocity: we'll pay extra in terms of hard work, even money, to punish cheats and enforce fair rules. The architecture
should be heavily rule-based, telling people how to work together, but not what to work on.

Conformity: we're happiest to conform, out of fear and laziness, which means if the patterns are good, clearly explained
and documented, and fairly enforced, we'll naturally choose the right path every time.

Pride: we're intensely aware of our social status, and we'll work hard to avoid looking stupid or incompetent in public. The
architecture has to make sure every piece we make has our name on it, so we'll have sleepless nights stressing about
what others will say about our work.

Greed: we're ultimately economic animals (see selfishness), so the architecture has to give us economic incentive to
invest in making it happen. Maybe it's polishing our reputation as experts, maybe it's literally making money from some
skill or component. It doesn't matter what it is, but there must be economic incentive. Think of architecture as a market
place, not an engineering design.

These strategies work on a large scale but also on a small scale, within an organization or team.


The Importance of Contracts

Let me discuss a contentious but important area, which is what license to choose. I'll say "BSD" to cover MIT, X11, BSD,
Apache, and similar licenses, and "GPL" to cover GPLv3, LGPLv3, and AGPLv3. The significant difference is the obligation to
share back any forked versions, which prevents any entity from capturing the software, and thus keeps it "free".

A software license isn't technically a contract since you don't sign anything. But broadly, calling it a contract is useful
since it takes the obligations of each party, and makes them legally enforceable in court, under copyright law.

You might ask, why do we need contracts at all to make open source? Surely it's all about decency, goodwill, people working
together for selfless motives. Surely the principle of "less is more" applies here of all places? Don't more rules mean less
freedom? Do we really need lawyers to tell us how to work together? It seems cynical and even counterproductive to force a
restrictive set of rules on the happy communes of free and open source software.

But the truth about human nature is not that pretty. We're not really angels, nor devils, just self-interested winners
descended from a billion-year unbroken line of winners. In business, marriage, and collective works, sooner or later, we
either stop caring, or we fight and we argue.

Put this another way: a collective work has two extreme outcomes. Either it's a failure, irrelevant, and worthless, in which
case every sane person walks away, without a fight. Or, it's a success, relevant, and valuable, in which case we start
jockeying for power, control, and often, money.

What a well-written contract does is to protect those valuable relationships from conflict. A marriage where the terms of
divorce are clearly agreed up-front is much less likely to end in divorce. A business deal where both parties agree how to
resolve various classic conflicts, such as one party stealing the other's clients or staff, is much less likely to end in
conflict.

Similarly, a software project that has a well-written contract that defines the terms of breakup clearly is much less likely
to end in breakup. The alternative seems to be to immerse the project into a larger organization that can exert pressure on
teams to work together (or lose the backing and branding of the organization). This is for example how the Apache Foundation
works. In my experience organization building has its own costs, and ends up favoring wealthier participants (who can afford
those sometimes huge costs).

In an open source or free software project, breakup usually takes the form of a fork, where the community splits into two or
more groups, each with different visions of the future. During the honeymoon period of a project, which can last years,
there's no question of a breakup. It is as a project begins to be worth money, or as the main authors start to burn out,
that the goodwill and generosity tend to dry up.

So when discussing software licenses, for the code you write or the code you use, a little cynicism helps. Ask yourself, not
"which license will attract more contributors?" because the answer to that lies in the mission statement and contribution
process. Ask yourself, "if this project had a big fight, and split three ways, which license would save us?" Or, "if the
whole team was bought by a hostile firm that wanted to turn this code into a proprietary product, which license would save
us?"

Long-term survival means enduring the bad times, as well as enjoying the good ones.

When BSD projects fork, they cannot easily merge again. Indeed, one-way forking of BSD projects is quite systematic: every
time BSD code ends up in a commercial project, this is what's happened. When GPL projects fork, however, re-merging is
trivial.

The GPL's story is relevant here. Though communities of programmers sharing their code openly were already significant by
the 1980s, they tended to use minimal licenses that worked as long as no real money got involved. There was an important
language stack called Emacs, originally built in Lisp by Richard Stallman. Another programmer, James Gosling (who later gave
us Java), rewrote Emacs in C with the help of many contributors, on the assumption that it would be open. Stallman got that
code and used it as the basis for his own C version. Gosling then sold the code to a firm which turned around and blocked
anyone distributing a competing product. Stallman found this sale of the common work hugely unethical, and began developing
a reusable license that would protect communities from this.

What eventually emerged was the GNU General Public License, which used traditional copyright to force remixability. It was a
neat hack that spread to other domains, for instance the Creative Commons for photography and music. In 2007, we saw version
3 of the license, which was a response to belated attacks from Microsoft and others on the concept. It has become a long and
complex document but corporate copyright lawyers have become familiar with it and in my experience, few companies mind
using GPL software and libraries, so long as the boundaries are clearly defined.

Thus, a good contract (and I consider the modern GPL to be the best for software) lets programmers work together without
upfront agreements, organizations, or assumptions of decency and goodwill. It makes it cheaper to collaborate, and turns
conflict into healthy competition. GPL doesn't just define what happens with a fork, it actively encourages forks as a tool
for experimentation and learning. Whereas a fork can kill a project with a "more liberal" license, GPL projects thrive on
forks since successful experiments can, by contract, be remixed back into the mainstream.

Yes, there are many thriving BSD projects and many dead GPL ones. It's always wrong to generalize. A project will thrive or
die for many reasons. However, in a competitive sport, one needs every advantage.

The other important part of the BSD vs. GPL story is what I call "leakage", which is the effect of pouring water into a pot
with a small but real hole in the bottom.

Eat Me

Here is a story. It happened to the eldest brother-in-law of the cousin of a friend of mine's colleague at work. His name
was, and still is, Patrick.

Patrick was a computer scientist with a PhD in advanced network topologies. He spent two years and his savings building a
new product, and chose the BSD license because he believed that would get him more adoption. He worked in his attic, at
great personal cost, and proudly published his work. People applauded, for it was truly fantastic, and his mailing lists
were soon abuzz with activity and patches and happy chatter. Many companies told him how they were saving millions using his
work. Some of them even paid him for consultancy and training. He was invited to speak at conferences and started collecting
badges with his name on them. He started a small business, hired a friend to work with him, and dreamed of making it big.

Then one day, someone pointed him to a new project, GPL licensed, which had forked his work and was improving on it. He was
irritated and upset, and asked how people (fellow open sourcers, no less!) would so shamelessly steal his code. There were
long arguments on the list about whether it was even legal to relicense their BSD code as GPL code. Turned out, it was. He
tried to ignore the new project, but then he soon realized that new patches coming from that project couldn't even be merged
back into his work!

Worse, the GPL project got popular and some of his core contributors made first small, and then larger patches to it. Again,
he couldn't use those changes, and he felt abandoned. Patrick went into a depression, his girlfriend left him for an
international currency dealer called, weirdly, Patrice, and he stopped all work on the project. He felt betrayed, and
utterly miserable. He fired his friend, who took it rather badly and told everyone that Patrick was a closet banjo player.
Finally, Patrick took a job as a project manager for a cloud company, and by the age of forty, he had stopped programming
even for fun.

Poor Patrick. I almost felt sorry for him. Then I asked him, "Why didn't you choose the GPL?" "Because it's a restrictive
viral license", he replied. I told him, "You may have a PhD, and you may be the eldest brother-in-law of the cousin of a
friend of my colleague, but you are an idiot and Monique was smart to leave you. You published your work inviting people to
please steal your code as long as they kept this 'please steal my code' statement in the resulting work, and when people did
exactly that, you got upset. Worse, you were a hypocrite because when they did it in secret, you were happy, but when they
did it openly, you felt betrayed."

Seeing your hard work captured by a smarter team and then used against you is enormously painful, so why even make that
possible? Every proprietary project that uses BSD code is capturing it. A public GPL fork is perhaps more humiliating, but
it's fully self-inflicted.

BSD is like food. It literally (and I mean that metaphorically) whispers "eat me" in the little voice one imagines a cube of
cheese might use when it's sitting next to an empty bottle of the best beer in the world, which is of course Orval, brewed
by an ancient and almost extinct order of silent Belgian monks called Les Gars Labas Qui Fabrique l'Orval. The BSD license,
like its near clone MIT/X11, was designed specifically by a university (Berkeley) with no profit motive to leak work and
effort. It is a way to push subsidized technology at below its cost price, a dumping of under-priced code in the hope that
it will break the market for others. BSD is an excellent strategic tool, but only if you're a large well-funded institution
that can afford to use Option One. The Apache license is BSD in a suit.

For us small businesses who aim our investments like precious bullets, leaking work and effort is unacceptable. Breaking the
market is great, but we cannot afford to subsidize our competitors. The BSD networking stack ended up putting Windows on the
Internet. We cannot afford battles with those we should naturally be allies with. We cannot afford to make fundamental
business errors because in the end, that means we have to fire people.

It comes down to behavioral economics and game theory. The license we choose modifies the economics of those who use our
work. In the software industry, there are friends, foes, and food. BSD makes most people see us as lunch. Closed source
makes most people see us as enemies (do you like paying people for software?). GPL, however, makes most people, with the
exception of the Patricks of the world, our allies. Any fork of ZeroMQ is license compatible with ZeroMQ, to the point where
we encourage forks as a valuable tool for experimentation. Yes, it can be weird to see someone try to run off with the ball,
but here's the secret: I can get it back any time I want.


The Process

If you've accepted my thesis up to now, great! Now, I'll explain the rough process by which we actually build an open source
community. This was how we built or grew or gently steered the ZeroMQ community into existence.

Your goal as leader of a community is to motivate people to get out there and explore; to ensure they can do so safely and
without disturbing others; to reward them when they make successful discoveries; and to ensure they share their knowledge
with everyone else (and not because we ask them, not because they feel generous, but because it's The Law).

It is an iterative process. You make a small product, at your own cost, but in public view. You then build a small community
around that product. If you have a small but real hit, the community then helps design and build the next version, and grows
larger. And then that community builds the next version, and so on. It's evident that you remain part of the community,
maybe even a majority contributor, but the more control you try to assert over the material results, the fewer people will
want to participate. Plan your own retirement well before someone decides you are their next problem.

Crazy, Beautiful, and Easy

You need a goal that's crazy and simple enough to get people out of bed in the morning. Your community has to attract the
very best people and that demands something special. With ZeroMQ, we said we were going to make "the Fastest. Messaging.
Ever.", which qualifies as a good motivator. If we'd said, we're going to make "a smart transport layer that'll connect your
moving pieces cheaply and flexibly across your enterprise", we'd have failed.

Then your work must be beautiful, immediately useful, and attractive. Your contributors are users who want to explore just a
little beyond where they are now. Make it simple, elegant, and brutally clean. The experience when people run or use your
work should be an emotional one. They should feel something, and if you accurately solved even just one big problem that
until then they didn't quite realize they faced, you'll have a small part of their soul.

It must be easy to understand, use, and join. Too many projects have barriers to access: put yourself in the other person's
mind and see all the reasons they come to your site, thinking "Um, interesting project, but..." and then leave. You want
them to stay and try it, just once. Use GitHub and put the issue tracker right there.

If you do these things well, your community will be smart but more importantly, it will be intellectually and geographically
diverse. This is really important. A group of like-minded experts cannot explore the problem landscape well. They tend to
make big mistakes. Diversity beats education any time.

Stranger, Meet Stranger

How much up-front agreement do two people need to work together on something? In most organizations, a lot. But you can
bring this cost down to near-zero, and then people can collaborate without ever having met, held a phone conference or
meeting, or taken a business trip to discuss Roles and Responsibilities over way too many bottles of cheap Korean rice wine.

You need well-written rules that are designed by cynical people like me to force strangers into mutually beneficial
collaboration instead of conflict. The GPL is a good start. GitHub and its fork/merge strategy is a good follow-up. And then
you want something like our C4 rulebook to control how work actually happens.

C4 (which I now use for every new open source project) has detailed and tested answers to a lot of common mistakes people
make, such as the sin of working offline in a corner with others "because it's faster". Transparency is essential to get
trust, which is essential to get scale. By forcing every single change through a single transparent process, you build real
trust in the results.

Another cardinal sin that many open source developers make is to place themselves above others. "I founded this project thus
my intellect is superior to that of others". It's not just immodest and rude, and usually inaccurate, it's also poor
business. The rules must apply equally to everyone, without distinction. You are part of the community. Your job, as founder
of a project, is not to impose your vision of the product over others, but to make sure the rules are good, honest, and
enforced.

Infinite Property


One of the saddest myths of the knowledge business is that ideas are a sensible form of property. It's medieval nonsense
that should have been junked along with slavery, but sadly it's still making too many powerful people too much money.

Ideas are cheap. What does work sensibly as property is the hard work we do in building a market. "You eat what you kill" is
the right model for encouraging people to work hard. Whether it's moral authority over a project, money from consulting, or
the sale of a trademark to some large, rich firm: if you make it, you own it. But what you really own is "footfall",
participants in your project, which ultimately defines your power.

To do this requires infinite free space. Thankfully, GitHub solved this problem for us, for which I will die a grateful
person (there are many reasons to be grateful in life, which I won't list here because we only have a hundred or so pages
left, but this is one of them).

You cannot scale a single project with many owners like you can scale a collection of many small projects, each with fewer
owners. When we embrace forks, a person can become an "owner" with a single click. Now they just have to convince others to
join by demonstrating their unique value.

So in ZeroMQ, we aimed to make it easy to write bindings on top of the core library, and we stopped trying to make those
bindings ourselves. This created space for others to make those, become their owners, and get that credit.

Care and Feeding

I wish a community could be 100% self-steering, and perhaps one day this will work, but today it's not the case. We're very
close with ZeroMQ, but from my experience a community needs four types of care and feeding:

First, simply because most people are too nice, we need some kind of symbolic leadership or owners who provide ultimate
authority in case of conflict. Usually it's the founders of the community. I've seen it work with self-elected groups of
"elders", but old men like to talk a lot. I've seen communities split over the question "who is in charge?", and setting up
legal entities with boards and such seems to make arguments over control worse, not better. Maybe because there seems
to be more to fight over. One of the real benefits of free software is that it's always remixable, so instead of fighting
over a pie, one simply forks the pie.

Second, communities need living rules, and thus they need a lawyer able to formulate and write these down. Rules are
critical: when done right, they remove friction. When done wrong, or neglected, we see real friction and argument that can
drive away the nice majority, leaving the argumentative core in charge of the burning house. One thing I've tried to do with
the ZeroMQ and previous communities is create reusable rules, which perhaps means we don't need lawyers as much.

Third, communities need some kind of financial backing. This is the jagged rock that breaks most ships. If you starve a
community, it becomes more creative but the core contributors burn out. If you pour too much money into it, you attract the
professionals, who never say "no", and the community loses its diversity and creativity. If you create a fund for people to
share, they will fight (bitterly) over it. With ZeroMQ, we (iMatix) spend our time and money on marketing and packaging
(like this book), and the basic care, like bug fixes, releases, and websites.

Lastly, sales and commercial mediation are important. There is a natural market between expert contributors and
customers, but both are somewhat incompetent at talking to each other. Customers assume that support is free or very
cheap because the software is free. Contributors are shy about asking a fair rate for their work. It makes for a difficult
market. A growing part of my work and my firm's profits is simply connecting ZeroMQ users who want help with experts from
the community able to provide it, and ensuring both sides are happy with the results.

I've seen communities of brilliant people with noble goals dying because the founders got some or all of these four things
wrong. The core problem is that you can't expect consistently great leadership from any one company, person, or group. What
works today often won't work tomorrow, yet structures become more solid, not more flexible, over time.

The best answer I can find is a mix of two things. One, the GPL and its guarantee of remixability. No matter how bad the
authority, no matter how much they try to privatize and capture the community's work, if it's GPL licensed, that work can
walk away and find a better authority. Before you say, "all open source offers this," think it through. I can kill a
BSD-licensed project by hiring the core contributors and not releasing any new patches. But even with a billion dollars, I
cannot kill a GPL-licensed project. Two, the philosophical anarchist model of authority, which is that we choose it, it does
not own us.

The ZeroMQ Process: C4


When we say ZeroMQ we sometimes mean libzmq, the core library. In early 2012, we synthesized the libzmq process into a
formal protocol for collaboration that we called the Collective Code Construction Contract, or C4. You can see this as a
layer above the GPL. These are our rules, and I'll explain the reasoning behind each one.

C4 is an evolution of the GitHub Fork + Pull Model. You may get the feeling I'm a fan of git and GitHub. This would be
accurate: these two tools have made such a positive impact on our work over the last few years, especially when it comes to
building community.

Language

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT",
"RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119.

By starting with the RFC 2119 language, the C4 text makes very clear its intention to act as a protocol rather than a
randomly written set of recommendations. A protocol is a contract between parties that defines the rights and obligations of
each party. These can be peers in a network or they can be strangers working in the same project.

I think C4 is the first time anyone has attempted to codify a community's rulebook as a formal and reusable protocol spec.
Previously, our rules were spread out over several wiki pages, and were quite specific to libzmq in many ways. But
experience teaches us that the more formal, accurate, and reusable the rules, the easier it is for strangers to collaborate
up-front. And less friction means a more scalable community. At the time of C4, we also had some disagreement in the libzmq
project over precisely what process we were using. Not everyone felt bound by the same rules. Let's just say some people
felt they had a special status, which created friction with the rest of the community. So codification made things clear.

It's easy to use C4: just host your project on GitHub, get one other person to join, and open the floor to pull requests. In
your README, put a link to C4 and that's it. We've done this in quite a few projects and it does seem to work. I've been
pleasantly surprised a few times just applying these rules to my own work, like CZMQ. None of us is so amazing that we can
work without others.

Goals

C4 is meant to provide a reusable optimal collaboration model for open source software projects.

The short-term reason for writing C4 was to end arguments over the libzmq contribution process. The dissenters went off
elsewhere. The ZeroMQ community blossomed smoothly and easily, as I'd predicted. Most people were surprised, but gratified.
There have been no real criticisms of C4 except its branching policy, which I'll come to later as it deserves its own
discussion.

There's a reason I'm reviewing history here: as founder of a community, you are asking people to invest in your property, trademark, and branding. In return, and this is what we do with ZeroMQ, you can use that branding to set a bar for quality. When you download a product labeled "ZeroMQ", you know that it's been produced to certain standards. It's a basic rule of quality: write down your process; otherwise you cannot improve it. Our processes aren't perfect, nor can they ever be. But any flaw in them can be fixed, and tested.

Making C4 reusable is therefore really important. To learn more about the best possible process, we need to get results from the widest range of projects.

It has these specific goals:

To maximize the scale of the community around a project, by reducing the friction for new Contributors and creating a scaled participation model with strong positive feedbacks.

The number one goal is size and health of the community: not technical quality, not profits, not performance, not market share. The goal is simply the number of people who contribute to the project. The science here is simple: the larger the community, the more accurate the results.

To relieve dependencies on key individuals by separating different skill sets so that there is a larger pool of competence in any required domain.

Perhaps the worst problem we faced in libzmq was dependence on people who could understand the code, manage GitHub branches, and make clean releases, all at the same time. It's like looking for athletes who can run marathons and sprint, swim, and also lift weights. We humans are really good at specialization. Asking us to be really good at two contradictory things reduces the number of candidates sharply, which is a Bad Thing for any project. We had this problem severely in libzmq in 2009 or so, and fixed it by splitting the role of maintainer into two: one person makes patches and another makes releases.

To allow the project to develop faster and more accurately, by increasing the diversity of the decision making process.

This is theory, not fully proven, but not falsified. The greater the diversity of the community and the number of people who can weigh in on discussions, without fear of being criticized or dismissed, the faster and more accurately the software develops. Speed is quite subjective here. Going very fast in the wrong direction is not just useless, it's actively damaging (and we suffered a lot of that in libzmq before we switched to C4).

To support the natural life cycle of project versions from experimental through to stable, by allowing safe experimentation, rapid failure, and isolation of stable code.

To be honest, this goal seems to be fading into irrelevance. It's quite an interesting effect of the process: the git master is almost always perfectly stable. This has to do with the size of changes and their latency, i.e., the time between someone writing the code and someone actually using it fully. However, people still expect "stable" releases, so we'll keep this goal there for a while.

To reduce the internal complexity of project repositories, thus making it easier for Contributors to participate and reducing the scope for error.

Curious observation: people who thrive in complex situations like to create complexity because it keeps their value high. It's the Cobra Effect (Google it). Git made branches easy and left us with the all too common syndrome of "git is easy once you understand that a git branch is just a folded five-dimensional lepton space that has a detached history with no intervening cache". Developers should not be made to feel stupid by their tools. I've seen too many top-class developers confused by repository structures to accept conventional wisdom on git branches. We'll come back to dispose of git branches shortly, dear reader.

To enforce collective ownership of the project, which increases economic incentive to Contributors and reduces the risk of hijack by hostile entities.

Ultimately, we're economic creatures, and the sense that "we own this, and our work can never be used against us" makes it much easier for people to invest in an open source project like ZeroMQ. And it can't be just a feeling, it has to be real. There are a number of aspects to making collective ownership work; we'll see these one by one as we go through C4.

Preliminaries

The project SHALL use the git distributed revision control system.

Git has its faults. Its command-line API is horribly inconsistent, and it has a complex, messy internal model that it shoves in your face at the slightest provocation. But despite doing its best to make its users feel stupid, git does its job really, really well. More pragmatically, I've found that if you stay away from certain areas (branches!), people learn git rapidly and don't make many mistakes. That works for me.

The project SHALL be hosted on github.com or equivalent, herein called the "Platform".

I'm sure one day some large firm will buy GitHub and break it, and another platform will rise in its place. Until then, GitHub serves up a near-perfect set of minimal, fast, simple tools. I've thrown hundreds of people at it, and they all stick like flies stuck in a dish of honey.

The project SHALL use the Platform issue tracker.

We made the mistake in libzmq of switching to Jira because we hadn't learned yet how to properly use the GitHub issue tracker. Jira is a great example of how to turn something useful into a complex mess because the business depends on selling more "features". But even without criticizing Jira, keeping the issue tracker on the same platform means one less UI to learn, one less login, and smooth integration between issues and patches.

The project SHOULD have clearly documented guidelines for code style.

This is a protocol plug-in: insert code style guidelines here. If you don't document the code style you use, you have no basis except prejudice to reject patches.

A "Contributor" is a person who wishes to provide a patch, being a set of commits that solve some clearly identified problem.
A "Maintainer" is a person who merges patches to the project. Maintainers are not developers; their job is to enforce process.

Now we move on to definitions of the parties, and the splitting of roles that saved us from the sin of structural dependency on rare individuals. This worked well in libzmq, but as you will see it depends on the rest of the process. C4 isn't a buffet; you will need the whole process (or something very like it), or it won't hold together.

Contributors SHALL NOT have commit access to the repository unless they are also Maintainers.
Maintainers SHALL have commit access to the repository.

What we wanted to avoid was people pushing their changes directly to master. This was the biggest source of trouble in libzmq historically: large masses of raw code that took months or years to fully stabilize. We eventually followed other ZeroMQ projects like PyZMQ in using pull requests. We went further, and stipulated that all changes had to follow the same path. No exceptions for "special people".

Everyone, without distinction or discrimination, SHALL have an equal right to become a Contributor under the terms of this contract.

We had to state this explicitly. It used to be that the libzmq maintainers would reject patches simply because they didn't like them. Now, that may sound reasonable to the author of a library (though libzmq was not written by any one person), but let's remember our goal of creating a work that is owned by as many people as possible. Saying "I don't like your patch so I'm going to reject it" is equivalent to saying, "I claim to own this and I think I'm better than you, and I don't trust you". Those are toxic messages to give to others who are thinking of becoming your co-investors.

I think this fight between individual expertise and collective intelligence plays out in other areas. It defined Wikipedia, and still does, a decade after that work surpassed anything built by small groups of experts. For me, we make software by slowly synthesizing the most accurate knowledge, much as we make Wikipedia articles.

Licensing and Ownership

The project SHALL use the GPLv3 or a variant thereof (LGPL, AGPL).

I've already explained how full remixability creates better scale and why the GPL and its variants seem the optimal contract for remixable software. If you're a large business aiming to dump code on the market, you won't want C4, but then you won't really care about community either.

All contributions to the project source code ("patches") SHALL use the same license as the project.

This removes the need for any specific license or contribution agreement for patches. You fork the GPL code, you publish your remixed version on GitHub, and you or anyone else can then submit that as a patch to the original code. BSD doesn't allow this. Any work that contains BSD code may also contain unlicensed proprietary code so you need explicit action from the author of the code before you can remix it.

All patches are owned by their authors. There SHALL NOT be any copyright assignment process.

Here we come to the key reason people trust their investments in ZeroMQ: it's logistically impossible to buy the copyrights to create a closed source competitor to ZeroMQ. iMatix can't do this either. And the more people that send patches, the harder it becomes. ZeroMQ isn't just free and open today; this specific rule means it will remain so forever. Note that it's not the case in all GPL projects, many of which still ask for copyright transfer back to the maintainers.

The project SHALL be owned collectively by all its Contributors.

This is perhaps redundant, but worth saying: if everyone owns their patches, then the resulting whole is also owned by every contributor. There's no legal concept of owning lines of code: the "work" is at least a source file.

Each Contributor SHALL be responsible for identifying themselves in the project Contributor list.

In other words, the maintainers are not karma accountants. Anyone who wants credit has to claim it themselves.

Patch Requirements

In this section, we define the obligations of the contributor: specifically, what constitutes a "valid" patch, so that maintainers have rules they can use to accept or reject patches.

Maintainers and Contributors MUST have a Platform account and SHOULD use their real names or a well-known alias.

In the worst case scenario, where someone has submitted toxic code (patented, or owned by someone else), we need to be able to trace who and when, so we can remove the code. Asking for real names or a well-known alias is a theoretical strategy for reducing the risk of bogus patches. We don't know if this actually works because we haven't had the problem yet.

A patch SHOULD be a minimal and accurate answer to exactly one identified and agreed problem.

This implements the Simplicity Oriented Design process that I'll come to later in this chapter. One clear problem, one minimal solution, apply, test, repeat.

A patch MUST adhere to the code style guidelines of the project if these are defined.

This is just sanity. I've spent time cleaning up other people's patches because they insisted on putting the else beside the if instead of just below as Nature intended. Consistent code is healthier.

A patch MUST adhere to the "Evolution of Public Contracts" guidelines defined below.

Ah, the pain, the pain. I'm not speaking of the time at age eight when I stepped on a plank with a 4-inch nail protruding from it. That was relatively OK. I'm speaking of 2010-2011 when we had multiple parallel releases of ZeroMQ, each with different incompatible APIs or wire protocols. It was an exercise in bad rules, pointlessly enforced, that still hurts us today. The rule was, "If you change the API or protocol, you SHALL create a new major version". Give me the nail through the foot; that hurt less.

One of the big changes we made with C4 was simply to ban, outright, this kind of sanctioned sabotage. Amazingly, it's not even hard. We just don't allow the breaking of existing public contracts, period, unless everyone agrees, in which case no period. As Linus Torvalds famously put it on 23 December 2012, "WE DO NOT BREAK USERSPACE!"

A patch SHALL NOT include non-trivial code from other projects unless the Contributor is the original author of that code.

This rule has two effects. The first is that it forces people to make minimal solutions because they cannot simply import swathes of existing code. In the cases where I've seen this happen to projects, it's always bad unless the imported code is very cleanly separated. The second is that it avoids license arguments. You write the patch, you are allowed to publish it as LGPL, and we can merge it back in. But if you find a 200-line code fragment on the web, and try to paste that, we'll refuse.

A patch MUST compile cleanly and pass project self-tests on at least the principal target platform.

For cross-platform projects, it is fair to ask that the patch works on the development box used by the contributor.

A patch commit message SHOULD consist of a single short (less than 50 character) line summarizing the change, optionally followed by a blank line and then a more thorough description.

This is a good format for commit messages that fits into email (the first line becomes the subject, and the rest becomes the email body).
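
As a rough sketch of how this format could be checked mechanically, here is a small C function. The name commit_message_is_well_formed is hypothetical and not part of C4 or of any git tooling; it only tests the two properties the rule states: a summary line under 50 characters, and a blank line before any description.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

//  Hypothetical helper: returns 1 if msg follows the C4 commit message
//  layout (short summary line, then optionally a blank line and a
//  description), and 0 otherwise.
int commit_message_is_well_formed (const char *msg)
{
    assert (msg);
    //  Length of the summary, i.e. everything before the first newline
    size_t summary_len = strcspn (msg, "\n");
    if (summary_len == 0 || summary_len >= 50)
        return 0;               //  Empty, or not "less than 50 characters"

    const char *rest = msg + summary_len;
    if (*rest == '\0')
        return 1;               //  Summary only, no description
    rest++;                     //  Skip the newline ending the summary
    if (*rest == '\0')
        return 1;               //  Summary with a trailing newline
    //  Any description must be separated from the summary by a blank line
    return *rest == '\n';
}
```

A message such as "Fixed issue #443" passes, while a summary followed directly by body text, with no blank line between them, does not.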

A "Correct Patch" is one that satisfies the above requirements.

Just in case it wasn't clear, we're back to legalese and definitions.

Development Process

In this section, we aim to describe the actual development process, step-by-step.

Change on the project SHALL be governed by the pattern of accurately identifying problems and applying minimal, accurate solutions to these problems.

This is an unapologetic ramming through of thirty years' software design experience. It's a profoundly simple approach to design: make minimal, accurate solutions to real problems, nothing more or less. In ZeroMQ, we don't have feature requests. Treating new features the same as bugs confuses some newcomers. But this process works, and not just in open source. Enunciating the problem we're trying to solve, with every single change, is key to deciding whether the change is worth making or not.

To initiate changes, a user SHALL log an issue on the project Platform issue tracker.

This is meant to stop us from going offline and working in a ghetto, either by ourselves or with others. Although we tend to accept pull requests that have clear argumentation, this rule lets us say "stop" to confused or too-large patches.

The user SHOULD write the issue by describing the problem they face or observe.

"Problem: we need feature X. Solution: make it" is not a good issue. "Problem: user cannot do common tasks A or B except by using a complex workaround. Solution: make feature X" is a decent explanation. Because everyone I've ever worked with has needed to learn this, it seems worth restating: document the real problem first, solution second.
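
Here is a minimal sketch of the "problem first, solution second" ordering in C, assuming issues are written with literal "Problem:" and "Solution:" markers as in the examples above. The function name is made up for illustration. A check like this cannot tell a real problem from a feature request in disguise; that still takes a human reader.

```c
#include <assert.h>
#include <string.h>

//  Hypothetical helper: returns 1 if the issue text names a problem,
//  and does so before proposing any solution; returns 0 otherwise.
int issue_states_problem_first (const char *text)
{
    assert (text);
    const char *problem = strstr (text, "Problem:");
    const char *solution = strstr (text, "Solution:");
    if (!problem)
        return 0;               //  No problem statement at all
    //  A solution is optional, but must not come before the problem
    return !solution || problem < solution;
}
```

An issue that only says "Please add feature X" fails the check, as does one that leads with its solution.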

The user SHOULD seek consensus on the accuracy of their observation, and the value of solving the problem.

And because many apparent problems are illusory, by stating the problem explicitly we give others a chance to correct our logic. "You're only using A and B a lot because function C is unreliable. Solution: make function C work properly."

Users SHALL NOT log feature requests, ideas, suggestions, or any solutions to problems that are not explicitly documented and provable.

There are several reasons for not logging ideas, suggestions, or feature requests. In our experience, these just accumulate in the issue tracker until someone deletes them. But more profoundly, when we treat all change as problem solutions, we can prioritize trivially. Either the problem is real and someone wants to solve it now, or it's not on the table. Thus, wish lists are off the table.

Thus, the release history of the project SHALL be a list of meaningful issues logged and solved.

I'd love the GitHub issue tracker to simply list all the issues we solved in each release. Today we still have to write that by hand. If one puts the issue number in each commit, and if one uses the GitHub issue tracker, which we sadly don't yet do for ZeroMQ, this release history is easier to produce mechanically.
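
If commits do carry issue numbers, pulling them out is simple. This is only a sketch, assuming the number appears as "#" followed by digits (as in "Fixed issue #443"); the function name commit_issue_number is hypothetical.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

//  Hypothetical helper: returns the first issue number written as
//  "#<digits>" in a commit message, or -1 if there is none.
long commit_issue_number (const char *message)
{
    assert (message);
    const char *hash = strchr (message, '#');
    while (hash) {
        //  Only accept a '#' that is directly followed by a digit
        if (isdigit ((unsigned char) hash [1]))
            return strtol (hash + 1, NULL, 10);
        hash = strchr (hash + 1, '#');
    }
    return -1;
}
```

Running this over each commit subject between two releases would give the raw list of issues for the release notes.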

To work on an issue, a Contributor SHALL fork the project repository and then work on their forked repository.

Here we explain the GitHub fork + pull request model so that newcomers only have to learn one process (C4) in order to contribute.

To submit a patch, a Contributor SHALL create a Platform pull request back to the project.

GitHub has made this so simple that we don't need to learn git commands to do it, for which I'm deeply grateful. Sometimes, I'll tell people who I don't particularly like that command-line git is awesome and all they need to do is learn git's internal model in detail before trying to use it on real work. When I see them several months later they look changed.

A Contributor SHALL NOT commit changes directly to the project.

Anyone who submits a patch is a contributor, and all contributors follow the same rules. No special privileges to the original authors, because otherwise we're not building a community, only boosting our egos.

To discuss a patch, people MAY comment on the Platform pull request, on the commit, or elsewhere.

Randomly distributed discussions may be confusing if you're walking up for the first time, but GitHub solves this for all current participants by sending emails to those who need to follow what's going on. We had the same experience and the same solution in Wikidot, and it works. There's no evidence that discussing in different places has any negative effect.

To accept or reject a patch, a Maintainer SHALL use the Platform interface.

Working via the GitHub web user interface means pull requests are logged as issues, with workflow and discussion. I'm sure there are more complex ways to work. Complexity is easy; it's simplicity that's incredibly hard.

Maintainers SHALL NOT accept their own patches.

There was a rule we defined in the FFII years ago to stop people burning out: no less than two people on any project. One-person projects tend to end in tears, or at least bitter silence. We have quite a lot of data on burnout, why it happens, and how to prevent it (even cure it). I'll explore this later in the chapter, because if you work with or on open source you need to be aware of the risks. The "no merging your own patch" rule has two goals. First, if you want your project to be C4-certified, you have to get at least one other person to help. If no one wants to help you, perhaps you need to rethink your project. Second, having a control for every patch makes it much more satisfying, keeps us more focused, and stops us breaking the rules because we're in a hurry, or just feeling lazy.

Maintainers SHALL NOT make value judgments on correct patches.

We already said this but it's worth repeating: the role of Maintainer is not to judge a patch's substance, only its technical quality. The substantive worth of a patch only emerges over time: people use it, and like it, or they do not. And if no one is using a patch, eventually it'll annoy someone else who will remove it, and no one will complain.

Maintainers SHALL merge correct patches rapidly.

There is a criterion I call change latency, which is the round-trip time from identifying a problem to testing a solution. The faster the better. If maintainers cannot respond to pull requests as rapidly as people expect, they're not doing their job (or they need more hands).

The Contributor MAY tag an issue as "Ready" after making a pull request for the issue.

By default, GitHub offers the usual variety of issue labels, but with C4 we don't use them. Instead, we need just two labels, "Urgent" and "Ready". A contributor who wants another user to test an issue can then label it as "Ready".

The user who created an issue SHOULD close the issue after checking the patch is successful.

When one person opens an issue, and another works on it, it's best to allow the original person to close the issue. That acts as a double-check that the issue was properly resolved.

Maintainers SHOULD ask for improvements to incorrect patches and SHOULD reject incorrect patches if the Contributor does not respond constructively.

Initially, I felt it was worth merging all patches, no matter how poor. There's an element of trolling involved. Accepting even obviously bogus patches could, I felt, pull in more contributors. But people were uncomfortable with this so we defined the "correct patch" rules, and the Maintainer's role in checking for quality. On the negative side, I think we didn't take some interesting risks, which could have paid off with more participants. On the positive side, this has led to libzmq master (and master in all projects that use C4) being practically production quality, practically all the time.

Any Contributor who has value judgments on a correct patch SHOULD express these via their own patches.

In essence, the goal here is to allow users to try patches rather than to spend time arguing pros and cons. As easy as it is to make a patch, it's as easy to revert it with another patch. You might think this would lead to "patch wars", but that hasn't happened. We've had a handful of cases in libzmq where patches by one contributor were killed by another person who felt the experimentation wasn't going in the right direction. It is easier than seeking up-front consensus.

Maintainers MAY commit changes to non-source documentation directly to the project.

This exit allows maintainers who are making release notes to push those without having to create an issue which would then affect the release notes, leading to stress on the space-time fabric and possibly involuntary rerouting backwards in the fourth dimension to before the invention of cold beer. Shudder. It is simpler to agree that release notes aren't technically software.

Creating Stable Releases

We want some guarantee of stability for a production system. In the past, this meant taking unstable code and then over months hammering out the bugs and faults until it was safe to trust. iMatix's job, for years, has been to do this to libzmq, turning raw code into packages by allowing only bug fixes and no new code into a "stabilization branch". It's surprisingly not as thankless as it sounds.

Now, since we went full speed with C4, we've found that git master of libzmq is mostly perfect, most of the time. This frees our time to do more interesting things, such as building new open source layers on top of libzmq. However, people still want that guarantee: many users will simply not install except from an "official" release. So a stable release today means two things. First, a snapshot of the master taken at a time when there were no new changes for a while, and no dramatic open bugs. Second, a way to fine-tune that snapshot to fix the critical issues remaining in it.

This is the process we explain in this section.

The project SHALL have one branch ("master") that always holds the latest in-progress version and SHOULD always build.

This is redundant because every patch always builds but it's worth restating. If the master doesn't build (and pass its tests), someone needs waking up.

The project SHALL NOT use topic branches for any reason. Personal forks MAY use topic branches.

I'll come to branches soon. In short (or "tl;dr", as they say on the webs), branches make the repository too complex and fragile, and require up-front agreement, all of which are expensive and avoidable.

To make a stable release someone SHALL fork the repository by copying it and thus become maintainer of this repository.
Forking a project for stabilization MAY be done unilaterally and without agreement of project maintainers.

It's free software. No one has a monopoly on it. If you think the maintainers aren't producing stable releases right, fork the repository and do it yourself. Forking isn't a failure, it's an essential tool for competition. You can't do this with branches, which means a branch-based release policy gives the project maintainers a monopoly. And that's bad because they'll become lazier and more arrogant than if real competition is chasing their heels.

A stabilization project SHOULD be maintained by the same process as the main project.

Stabilization projects have maintainers and contributors like any project. In practice we usually cherry pick patches from the main project to the stabilization project, but that's just a convenience.

A patch to a repository declared "stable" SHALL be accompanied by a reproducible test case.

Beware of a one-size-fits-all process. New code does not need the same paranoia as code that people are trusting for production use. In the normal development process, we did not mention test cases. There's a reason for this. While I love testable patches, many changes aren't easily or at all testable. However, to stabilize a code base you want to fix only serious bugs, and you want to be 100% sure every change is accurate. This means before and after tests for every change.

Evolution of Public Contracts

By "public contracts", I mean APIs and protocols. Up until the end of 2011, libzmq's naturally happy state was marred by broken promises and broken contracts. We stopped making promises (aka "road maps") for libzmq completely, and our dominant theory of change is now that it emerges carefully and accurately over time. At a 2012 Chicago meetup, Garrett Smith and Chuck Remes called this the "drunken stumble to greatness", which is how I think of it now.

We stopped breaking public contracts simply by banning the practice. Before then it had been "OK" (as in we did it and everyone complained bitterly, and we ignored them) to break the API or protocol so long as we changed the major version number. Sounds fine, until you get ZeroMQ v2.0, v3.0, and v4.0 all in development at the same time, and not speaking to each other.

All Public Contracts (APIs or protocols) SHOULD be documented.

You'd think this was a given for professional software engineers but no, it's not. So, it's a rule. You want C4 certification for your project, you make sure your public contracts are documented. No "It's specified in the code" excuses. Code is not a contract. (Yes, I intend at some point to create a C4 certification process to act as a quality indicator for open source projects.)

All Public Contracts SHALL use Semantic Versioning.

This rule is mainly here because people asked for it. I've no real love for it, as Semantic Versioning is what led to the so-called "Why does ZeroMQ not speak to itself?!" debacle. I've never seen the problem that this solved. Something about runtime validation of library versions, or some such.
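
For what it's worth, the runtime validation in question might look something like the following sketch. The version_t type and the version_is_compatible function are hypothetical, and the rule encoded here (same major number, library not older than what the application was built against) is one common reading of Semantic Versioning rather than anything C4 specifies. It also ignores the special treatment of 0.x versions.

```c
#include <assert.h>

//  Hypothetical version triple in major.minor.patch form
typedef struct {
    int major;
    int minor;
    int patch;
} version_t;

//  Returns 1 if a library at version 'have' can serve an application
//  built against version 'need': the major numbers must match, and
//  'have' must not be older than 'need'. Returns 0 otherwise.
int version_is_compatible (version_t have, version_t need)
{
    assert (have.major >= 0 && need.major >= 0);
    if (have.major != need.major)
        return 0;               //  Major change means a broken contract
    if (have.minor != need.minor)
        return have.minor > need.minor;
    return have.patch >= need.patch;
}
```

In a real application the 'have' triple would come from the library at run time; libzmq, for example, reports its version through zmq_version().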

All Public Contracts SHOULD have space for extensibility and experimentation.

Now, the real thing is that public contracts do change. It's not about not changing them. It's about changing them safely. This means educating (especially protocol) designers to create that space up-front.

A patch that modifies a stable Public Contract SHOULD NOT break existing applications unless there is overriding consensus on the value of doing this.

Sometimes the patch is fixing a bad API that no one is using. It's a freedom we need, but it should be based on consensus, not one person's dogma. However, making random changes "just because" is not good. In ZeroMQ v3.x, did we benefit from renaming ZMQ_NOBLOCK to ZMQ_DONTWAIT? Sure, it's closer to the POSIX socket recv() call, but is that worth breaking thousands of applications? No one ever reported it as an issue. To misquote Stallman: "your freedom to create an ideal world stops one inch from my application."

A patch that introduces new features to a Public Contract SHOULD do so using new names.

We had the experience in ZeroMQ once or twice of new features using old names (or worse, using names that were still in use elsewhere). ZeroMQ v3.0 had a newly introduced "ROUTER" socket that was totally different from the existing ROUTER socket in 2.x. Dear lord, you should be face-palming, why? The reason: apparently, even smart people sometimes need regulation to stop them doing silly things.

Old names SHOULD be deprecated in a systematic fashion by marking new names as "experimental" until they are stable, then marking the old names as "deprecated".

This life cycle notation has the great benefit of actually telling users what is going on with a consistent direction. "Experimental" means "we have introduced this and intend to make it stable if it works". It does not mean, "we have introduced this and will remove it at any time if we feel like it". One assumes that code that survives more than one patch cycle is meant to be there. "Deprecated" means "we have replaced this and intend to remove it".

When sufficient time has passed, old deprecated names SHOULD be marked "legacy" and eventually removed.

In theory this gives applications time to move onto stable new contracts without risk. You can upgrade first, make sure things work, and then, over time, fix things up to remove dependencies on deprecated and legacy APIs and protocols.
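
The life cycle described in the last two rules can be read as a small state machine. The sketch below is one possible encoding in C. The type and function names are hypothetical, and the strict one-step-at-a-time ordering is an assumption made for illustration, not something the C4 text spells out.

```c
#include <assert.h>

//  Hypothetical life cycle states for a name in a public contract
typedef enum {
    NAME_EXPERIMENTAL,          //  Introduced, meant to become stable
    NAME_STABLE,                //  Safe for applications to rely on
    NAME_DEPRECATED,            //  Replaced, slated for removal
    NAME_LEGACY,                //  Deprecated for a long time
    NAME_REMOVED                //  Gone; the name is never reused
} name_state_t;

//  Returns 1 if a name may move from one state to another. In this
//  sketch names only move forward, one step at a time, and a removed
//  name stays removed.
int name_transition_is_valid (name_state_t from, name_state_t to)
{
    assert (from <= NAME_REMOVED && to <= NAME_REMOVED);
    return from != NAME_REMOVED && to == from + 1;
}
```

With this encoding, a stable name cannot jump straight to removed: it has to pass through "deprecated" and "legacy" first, which is the time applications get to move off it.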

Old names SHALL NOT be reused by new features.

Ah, yes, the joy when ZeroMQ v3.x renamed the top-used API functions (zmq_send() and zmq_recv()) and then recycled the old names for new methods that were utterly incompatible (and which I suspect few people actually use). You should be slapping yourself in confusion again, but really, this is what happened and I was as guilty as anyone. After all, we did change the version number! The only benefit of that experience was to get this rule.

When old names are removed, their implementations MUST provoke an exception (assertion) if used by applications.

I've not tested this rule to be certain it makes sense. Perhaps what it means is "if you can't provoke a compile error because the API is dynamic, provoke an assertion".
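
As a sketch of that reading, a removed function in a hypothetical C library could keep its symbol but assert as soon as it is called. Both function names below are invented for illustration and do not belong to any real API.

```c
#include <assert.h>
#include <stdio.h>

//  Current name: in this sketch it just echoes the size it was given
int mylib_send_frame (const void *data, int size)
{
    assert (data && size >= 0);
    return size;
}

//  Removed name: the symbol still links, so old binaries load, but any
//  call stops the program with a message instead of misbehaving quietly.
int mylib_send_old (const void *data, int size)
{
    (void) data;
    (void) size;
    fprintf (stderr, "mylib_send_old was removed, use mylib_send_frame\n");
    assert (0 && "mylib_send_old was removed");
    return -1;                  //  Only reached if built with NDEBUG
}
```

The point of the stub is that a stale caller fails loudly at the first call, with a pointer to the replacement, rather than silently getting different behavior.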

Project Administration

The project founders SHALL act as Administrators to manage the set of project Maintainers.

Someone needs to administer the project, and it makes sense that the original founders start this ball rolling.

The Administrators SHALL ensure their own succession over time by promoting the most effective Maintainers.

At the same time, as founder of a project you really want to get out of the way before you become over-attached to it. Promoting the most active and consistent maintainers is good for everyone.

A new Contributor who makes a correct patch SHALL be invited to become a Maintainer.

I met Felix Geisendörfer in Lyons in 2012 at the Mix-IT conference where I presented Social Architecture, and one thing that came out of this was Felix's now famous Pull Request Hack. It fits elegantly into C4 and solves the problem of maintainers dropping out over time.

Administrators MAY remove Maintainers who are inactive for an extended period of time, or who repeatedly fail to apply this process accurately.

This was Ian Barber's suggestion: we need a way to crop inactive maintainers. Originally maintainers were self-elected but that makes it hard to drop troublemakers (who are rare, but not unknown).

C4 is not perfect. Few things are. The process for changing it (Digistan's COSS) is a little outdated now: it relies on a single-editor workflow with the ability to fork, but not merge. This seems to work but it could be better to use C4 for protocols like C4.

A Real-Life Example

In this email thread, Dan Goes asks how to make a publisher that knows when a new client subscribes, and sends out previous matching messages. It's a standard pub-sub technique called "last value caching". Now over a 1-way transport like pgm (where subscribers literally send no packets back to publishers), this can't be done. But over TCP, it can, if we use an XPUB socket and if that socket didn't cleverly filter out duplicate subscriptions to reduce upstream traffic.

Though I'm not an expert contributor to libzmq, this seems like a fun problem to solve. How hard could it be? I start by forking the libzmq repository to my own GitHub account and then clone it to my laptop, where I build it:

git clone git@github.com:hintjens/libzmq.git
cd libzmq
./autogen.sh
./configure
make

Because the libzmq code is neat and well-organized, it was quite easy to find the main files to change (xpub.cpp and xpub.hpp). Each socket type has its own source file and class. They inherit from socket_base.cpp, which has this hook for socket-specific options:

//  First, check whether specific socket type overloads the option.
int rc = xsetsockopt (option_, optval_, optvallen_);
if (rc == 0 || errno != EINVAL)
    return rc;

//  If the socket type doesn't support the option, pass it to
//  the generic option parser.
return options.setsockopt (option_, optval_, optvallen_);

Then I check where the XPUB socket filters out duplicate subscriptions, in its xread_activated method:

bool unique;
if (*data == 0)
    unique = subscriptions.rm (data + 1, size - 1, pipe_);
else
    unique = subscriptions.add (data + 1, size - 1, pipe_);

//  If the subscription is not a duplicate store it so that it can be
//  passed to used on next recv call.
if (unique && options.type != ZMQ_PUB)
    pending.push_back (blob_t (data, size));

At this stage, I'm not too concerned with the details of how subscriptions.rm and subscriptions.add work. The code seems obvious except that "subscription" also includes unsubscription, which confused me for a few seconds. If there's anything else weird in the rm and add methods, that's a separate issue to fix later. Time to make an issue for this change. I head over to the zeromq.jira.com site, log in, and create a new entry.

Jira kindly offers me the traditional choice between "bug" and "new feature" and I spend thirty seconds wondering where this counterproductive historical distinction came from. Presumably, the "we'll fix bugs for free, but you pay for new features" commercial proposal, which stems from the "you tell us what you want and we'll make it for $X" model of software development, and which generally leads to "we spent three times $X and we got what?!" email Fists of Fury.

Putting such thoughts aside, I create issue #443 and describe the problem and a plausible solution:

Problem: XPUB socket filters out duplicate subscriptions (deliberate design). However this makes it impossible to do subscription-based intelligence. See http://lists.zeromq.org/pipermail/zeromq-dev/2012-October/018838.html for a use case.
Solution: make this behavior configurable with a socket option.

It's naming time. The API sits in include/zmq.h, so this is where I added the option name. When you invent a concept in an API or anywhere, please take a moment to choose a name that is explicit and short and obvious. Don't fall back on generic names that need additional context to understand. You have one chance to tell the reader what your concept is and does. A name like ZMQ_SUBSCRIPTION_FORWARDING_FLAG is terrible. It technically kind of aims in the right direction, but is miserably long and obscure. I chose ZMQ_XPUB_VERBOSE: short and explicit and clearly an on/off switch with "off" being the default setting.

So, it's time to add a private property to the xpub class definition in xpub.hpp:

//  If true, send all subscription messages upstream, not just
//  unique ones
bool verbose;

And then lift some code from router.cpp to implement the xsetsockopt method. Finally, change the xread_activated method to use this new option, and while at it, make that test on socket type more explicit too:

//  If the subscription is not a duplicate store it so that it can be
//  passed to used on next recv call.
if (options.type == ZMQ_XPUB && (unique || verbose))
    pending.push_back (blob_t (data, size));
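
To make the shape of the change concrete, here is a self-contained sketch in plain C rather than the C++ of libzmq itself. It is not the code from the patch: the struct, the function names, and the SKETCH_XPUB_VERBOSE value are stand-ins. It shows the two pieces described above. One is an option handler that reports EINVAL for anything it does not recognize, so that a hook like the one in socket_base.cpp can fall through to the generic parser. The other is the decision to queue a subscription when it is unique or when verbose is on.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

//  Placeholder value; the real ZMQ_XPUB_VERBOSE is defined in zmq.h
#define SKETCH_XPUB_VERBOSE 1

//  Cut-down stand-in for the xpub class, keeping only the new property
typedef struct {
    int verbose;                //  If true, pass duplicates upstream too
} xpub_sketch_t;

//  Socket-specific option hook. Returns 0 on success; otherwise returns
//  -1 with errno set to EINVAL, which is what lets the caller fall back
//  to the generic option parser for options it doesn't know.
int xpub_sketch_setsockopt (xpub_sketch_t *self, int option,
                            const void *optval, size_t optvallen)
{
    assert (self);
    if (option != SKETCH_XPUB_VERBOSE
    ||  optval == NULL
    ||  optvallen != sizeof (int)
    ||  *(const int *) optval < 0) {
        errno = EINVAL;
        return -1;
    }
    self->verbose = *(const int *) optval != 0;
    return 0;
}

//  Decide whether a subscription message is queued for the application
int xpub_sketch_forwards (const xpub_sketch_t *self, int unique)
{
    assert (self);
    return unique || self->verbose;
}
```

From an application, the real option is set like any other integer socket option, by passing the address and size of an int to zmq_setsockopt().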

The thing builds nicely the first time. This makes me a little suspicious, but being lazy and jet-lagged I don't immediately make a test case to actually try out the change. The process doesn't demand that, even if usually I'd do it just to catch that inevitable 10% of mistakes we all make. I do however document this new option on the doc/zmq_setsockopt.txt man page. In the worst case, I added a patch that wasn't really useful. But I certainly didn't break anything.

I don't implement a matching zmq_getsockopt because "minimal" means what it says. There's no obvious use case for getting the value of an option that you presumably just set, in code. Symmetry isn't a valid reason to double the size of a patch. I did have to document the new option because the process says, "All Public Contracts SHOULD be documented."

Committing the code, I push the patch to my forked repository (the "origin"):

git commit -a -m "Fixed issue #443"
git push origin master

Switching to the GitHub web interface, I go to my libzmq fork, and press the big "Pull Request" button at the top. GitHub asks
me for a title, so I enter "Added ZMQ_XPUB_VERBOSE option". I'm not sure why it asks this as I made a neat commit message
but hey, let's go with the flow here.

This makes a nice little pull request with two commits: the one I'd made a month ago on the release notes to prepare for the
v3.2.1 release (a month passes so quickly when you spend most of it in airports), and my fix for issue #443 (37 new lines of
code). GitHub lets you continue to make commits after you've kicked off a pull request. They get queued up and merged in one
go. That is easy, but the maintainer may refuse the whole bundle based on one patch that doesn't look valid.

Because Dan is waiting (at least in my highly optimistic imagination) for this fix, I go back to the zeromq-dev list and tell him I've
made the patch, with a link to the commit. The faster I get feedback, the better. It's 1 a.m. in South Korea as I make this patch, so
early evening in Europe, and morning in the States. You learn to count time zones when you work with people across the world.
Ian is in a conference, Mikko is getting on a plane, and Chuck is probably in the office, but three hours later, Ian merges the pull
request.

After Ian merges the pull request, I resynchronize my fork with the upstream libzmq repository. First, I add a remote that tells git
where this repository sits (I do this just once in the directory where I'm working):

    git remote add upstream git://github.com/zeromq/libzmq.git

And then I pull changes back from the upstream master and check the git log to double-check:

    git pull --rebase upstream master
    git log

And that is pretty much it, in terms of how much git one needs to learn and use to contribute patches to libzmq. Six git
commands and some clicking on web pages. Most importantly to me as a naturally lazy, stupid, and easily confused developer, I
don't have to learn git's internal models, and never have to do anything involving those infernal engines of structural complexity
we call "git branches". Next up, the attempted assassination of git branches. Let's live dangerously!

Git Branches Considered Harmful

One of git's most popular features is its branches. Almost all projects that use git use branches, and the selection of the "best"
branching strategy is like a rite of passage for an open source project. Vincent Driessen's git-flow may be the best known. It has
base branches (master, develop), feature branches, release branches, hotfix branches, and support branches. Many teams have
adopted git-flow, which even has git extensions to support it. I'm a great believer in popular wisdom, but sometimes you have to
recognize mass delusion for what it is.

Here is a section of C4 that might have shocked you when you first read it:

The project SHALL NOT use topic branches for any reason. Personal forks MAY use topic branches.

To be clear, it's public branches in shared repositories that I'm talking about. Using branches for private work, e.g., to work on
different issues, appears to work well enough, though it's more complexity than I personally enjoy. To channel Stallman again:
"your freedom to create complexity ends one inch from our shared workspace."

Like the rest of C4, the rules on branches are not accidental. They came from our experience making ZeroMQ, starting when
Martin Sustrik and I rethought how to make stable releases. We both love and appreciate simplicity (some people seem to have a
remarkable tolerance for complexity). We chatted for a while. I asked him, "I'm going to start making a stable release. Would it
be OK for me to make a branch in the git you're working in?" Martin didn't like the idea. "OK, if I fork the repository, I can move
patches from your repo to that one". That felt much better to both of us.

The response from many in the ZeroMQ community was shock and horror. People felt we were being lazy and making
contributors work harder to find the "right" repository. Still, this seemed simple, and indeed it worked smoothly. The best part was
that we each worked as we wanted to. Whereas before, the ZeroMQ repository had felt horribly complex (and it wasn't even
anything like git-flow), this felt simple. And it worked. The only downside was that we lost a single unified history. Now, perhaps
historians will feel robbed, but I honestly can't see that the historical minutiae of who changed what, when, including every branch
and experiment, are worth any significant pain or friction.

People have gotten used to the "multiple repositories" approach in ZeroMQ and we've started using that in other projects quite
successfully. My own opinion is that history will judge git branches and patterns like git-flow as a complex solution to imaginary
problems inherited from the days of Subversion and monolithic repositories.

More profoundly, and perhaps this is why the majority seems to be "wrong": I think the branches versus forks argument is really a
deeper design versus evolve argument about how to make software optimally. I'll address that deeper argument in the next
section. For now, I'll try to be scientific about my irrational hatred of branches, by looking at a number of criteria, and comparing
branches and forks in each one.

Simplicity Versus Complexity

The simpler, the better.

There is no inherent reason why branches are more complex than forks. However, git-flow uses five types of branch, whereas C4
uses two types of fork (development, and stable) and one branch (master). Circumstantial evidence is thus that branches lead to
more complexity than forks. For new users, it is definitely, and we've measured this in practice, easier to learn to work with many
repositories and no branches except master.

Change Latency

The smaller and more rapid the delivery, the better.

Development branches seem to correlate strongly with large, slow, risky deliveries. "Sorry, I have to merge this branch before we
can test the new version" signals a breakdown in process. It's certainly not how C4 works, which is by focusing tightly on
individual problems and their minimal solutions. Allowing branches in development raises change latency. Forks have a different
outcome: it's up to the forker to ensure that his changes merge cleanly, and to keep them simple so they won't be rejected.

Learning Curve

The smoother the learning curve, the better.

Evidence definitely shows that learning to use git branches is complex. For some people, this is OK. For most developers, every
cycle spent learning git is a cycle lost on more productive things. I've been told several times, by different people, that I do not like
branches because I "never properly learned git". That is fair, but it is a criticism of the tool, not the human.

Cost of Failure

The lower the cost of failure, the better.

Branches demand more perfection from developers because mistakes potentially affect others. This raises the cost of failure.
Forks make failure extremely cheap because literally nothing that happens in a fork can affect others not using that fork.

Up-front Coordination

The less need for up-front coordination, the better.

You can do a hostile fork. You cannot do a hostile branch. Branches depend on up-front coordination, which is expensive and
fragile. One person can veto the desires of a whole group. For example, in the ZeroMQ community we were unable to agree on a
git branching model for a year. We solved that by using forking instead. The problem went away.

Scalability

The more you can scale a project, the better.

The strong assumption in all branch strategies is that the repository is the project. But there is a limit to how many people you can
get to agree to work together in one repository. As I explained, the cost of up-front coordination can become fatal. A more realistic
project scales by allowing anyone to start their own repositories, and ensuring these can work together. A project like ZeroMQ
has dozens of repositories. Forking looks more scalable than branching.

Surprise and Expectations

The less surprising, the better.

People expect branches and find forks to be uncommon and thus confusing. This is the one aspect where branches win. If you
use branches, a single patch will have the same commit hash, whereas across forks the patch will have different hashes.
That makes it harder to track patches as they cross forks, true. But seriously, having to track hexadecimal hashes is not a
feature. It's a bug. Sometimes better ways of working are surprising at first.

Economics of Participation

The more tangible the rewards, the better.

People like to own their work and get credit for it. This is much easier with forks than with branches. Forks create more
competition in a healthy way, while branches suppress competition and force people to collaborate and share credit. This sounds
positive but in my experience it demotivates people. A branch isn't a product you can "own", whereas a fork can be.

Robustness in Conflict

The more a model can survive conflict, the better.

Like it or not, people fight over ego, status, beliefs, and theories of the world. Challenge is a necessary part of science. If your
organizational model depends on agreement, you won't survive the first real fight. Branches do not survive real arguments and
fights, whereas forks can be hostile, and still benefit all parties. And this is indeed how free software works.

Guarantees of Isolation

The stronger the isolation between production code and experiment, the better.

People make mistakes. I've seen experimental code pushed to mainline production by error. I've seen people make bad panic
changes under stress. But the real fault is in allowing two entirely separate generations of product to exist in the same protected
space. If you can push to random-branch-x, you can push to master. Branches do not guarantee isolation of production-critical
code. Forks do.

Visibility

The more visible our work, the better.

Forks have watchers, issues, a README, and a wiki. Branches have none of these. People try forks, build them, break them,
patch them. Branches sit there until someone remembers to work on them. Forks have downloads and tarballs. Branches do not.
When we look for self-organization, the more visible and declarative the problems, the faster and more accurately we can work.

Conclusions

In this section, I've listed a series of arguments, most of which came from fellow team members. Here's how it seems to break
down: git veterans insist that branches are the way to work, whereas newcomers tend to feel intimidated when asked to navigate
git branches. Git is not an easy tool to master. What we've discovered, accidentally, is that when you stop using branches at all,
git becomes trivial to use. It literally comes down to six commands (clone, remote, commit, log, push, and pull).
Furthermore, a branch-free process actually works: we've used it for a couple of years now, with no visible downside except
surprise to the veterans and growth of "single" projects over multiple repositories.

If you can't use forks, perhaps because your firm doesn't trust GitHub's private repositories, then you can perhaps use topic
branches, one per issue. You'll still suffer the costs of getting up-front consensus, low competitiveness, and risk of human error.

Designing for Innovation

Let's look at innovation, which Wikipedia defines as, "the development of new values through solutions that meet new
requirements, inarticulate needs, or old customer and market needs in value adding new ways." This really just means solving
problems more cheaply. It sounds straightforward, but the history of collapsed tech giants proves that it's not. I'll try to explain
how teams so often get it wrong, and suggest a way for doing innovation right.

The Tale of Two Bridges

Two old engineers were talking of their lives and boasting of their greatest projects. One of the engineers explained how he had
designed one of the greatest bridges ever made.

"We built it across a river gorge," he told his friend. "It was wide and deep. We spent two years studying the land, and choosing
designs and materials. We hired the best engineers and designed the bridge, which took another five years. We contracted the
largest engineering firms to build the structures, the towers, the tollbooths, and the roads that would connect the bridge to the
main highways. Dozens died during the construction. Under the road level we had trains, and a special path for cyclists. That
bridge represented years of my life."

The second man reflected for a while, then spoke. "One evening me and a friend got drunk on vodka, and we threw a rope across
a gorge," he said. "Just a rope, tied to two trees. There were two villages, one at each side. At first, people pulled packages
across that rope with a pulley and string. Then someone threw a second rope, and built a foot walk. It was dangerous, but the
kids loved it. A group of men then rebuilt that, made it solid, and women started to cross, every day, with their produce. A market
grew up on one side of the bridge, and slowly that became a large town, because there was a lot of space for houses. The rope
bridge got replaced with a wooden bridge, to allow horses and carts to cross. Then the town built a real stone bridge, with metal
beams. Later, they replaced the stone part with steel, and today there's a suspension bridge standing in that same spot."

The first engineer was silent. "Funny thing," he said, "my bridge was demolished about ten years after we built it. Turns out it was
built in the wrong place and no one wanted to use it. Some guys had thrown a rope across the gorge, a few miles further
downstream, and that's where everyone went."

How ZeroMQ Lost Its Road Map

Presenting ZeroMQ at the Mix-IT conference in Lyon in early 2012, I was asked several times for the "road map". My answer
was: there is no road map any longer. We had road maps, and we deleted them. Instead of a few experts trying to lay out the
next steps, we were allowing this to happen organically. The audience didn't really like my answer. So un-French.

However, the history of ZeroMQ makes it quite clear why road maps were problematic. In the beginning, we had a small team
making the library, with few contributors, and no documented road map. As ZeroMQ grew more popular and we switched to more
contributors, users asked for road maps. So we collected our plans together and tried to organize them into releases. Here, we
wrote, is what will come in the next release.

As we rolled out releases, we hit the problem that it's very easy to promise stuff, and rather harder to make it as planned. For one
thing, much of the work was voluntary, and it's not clear how you force volunteers to commit to a road map. But also, priorities
can shift dramatically over time. So we were making promises we could not keep, and the real deliveries didn't match the road
maps.

The second problem was that by defining the road map, we in effect claimed territory, making it harder for others to participate.
People do prefer to contribute to changes they believe were their idea. Writing down a list of things to do turns contribution into a
chore rather than an opportunity.

Finally, we saw changes in ZeroMQ that were quite traumatic, and the road maps didn't help with this, despite a lot of discussion
and effort to "do it right". Examples of this were incompatible changes in APIs and protocols. It was quite clear that we needed a
different approach for defining the change process.

Software engineers don't like the notion that powerful, effective solutions can come into existence without an intelligent designer
actively thinking things through. And yet no one in that room in Lyon would have questioned evolution. A strange irony, and one I
wanted to explore further as it underpins the direction the ZeroMQ community has taken since the start of 2012.

In the dominant theory of innovation, brilliant individuals reflect on large problem sets and then carefully and precisely create a
solution. Sometimes they will have "eureka" moments where they "get" brilliantly simple answers to whole large problem sets.
The inventor and the process of invention are rare, precious, and can command a monopoly. History is full of such heroic
individuals. We owe them our modern world.

Look more closely, however, and you will see that the facts don't match. History doesn't show lone inventors. It shows lucky
people who steal or claim ownership of ideas that are being worked on by many. It shows brilliant people striking lucky once, and
then spending decades on fruitless and pointless quests. The best known large-scale inventors like Thomas Edison were in fact
just very good at systematic broad research done by large teams. It's like claiming that Steve Jobs invented every device made
by Apple. It is a nice myth, good for marketing, but utterly useless as practical science.

Recent history, much better documented and less easy to manipulate, shows this well. The Internet is surely one of the most
innovative and fast-moving areas of technology, and one of the best documented. It has no inventor. Instead, it has a massive
economy of people who have carefully and progressively solved a long series of immediate problems, documented their answers,
and made those available to all. The innovative nature of the Internet comes not from a small, select band of Einsteins. It comes
from RFCs anyone can use and improve, made by hundreds and thousands of smart, but not uniquely smart, individuals. It
comes from open source software anyone can use and improve. It comes from sharing, scale of community, and the continuous
accretion of good solutions and disposal of bad ones.

Here thus is an alternative theory of innovation:

1. There is an infinite problem/solution terrain.
2. This terrain changes over time according to external conditions.
3. We can only accurately perceive problems to which we are close.
4. We can rank the cost/benefit economics of problems using a market for solutions.
5. There is an optimal solution to any solvable problem.
6. We can approach this optimal solution heuristically, and mechanically.
7. Our intelligence can make this process faster, but does not replace it.

There are a few corollaries to this:

Individual creativity matters less than process. Smarter people may work faster, but they may also work in the wrong
direction. It's the collective vision of reality that keeps us honest and relevant.

We don't need road maps if we have a good process. Functionality will emerge and evolve over time as solutions compete
for market share.

We don't invent solutions so much as discover them. All sympathies to the creative soul. It's just an information processing
machine that likes to polish its own ego and collect karma.

Intelligence is a social effect, though it feels personal. A person cut off from others eventually stops thinking. We can
neither collect problems nor measure solutions without other people.

The size and diversity of the community is a key factor. Larger, more diverse communities collect more relevant problems,
and solve them more accurately, and do this faster, than a small expert group.

So, when we trust the solitary experts, they make classic mistakes. They focus on ideas, not problems. They focus on the wrong
problems. They make misjudgments about the value of solving problems. They don't use their own work.

Can we turn the above theory into a reusable process? In late 2011, I started documenting C4 and similar contracts, and using
them both in ZeroMQ and in closed source projects. The underlying process is something I call "Simplicity-Oriented Design", or
SOD. This is a reproducible way of developing simple and elegant products. It organizes people into flexible supply chains that
are able to navigate a problem landscape rapidly and cheaply. They do this by building, testing, and keeping or discarding
minimal plausible solutions, called "patches". Living products consist of long series of patches, applied one atop the other.

SOD is relevant first because it's how we evolve ZeroMQ. It's also the basis for the design process we will use in Chapter 7 -
Advanced Architecture using ZeroMQ to develop larger-scale ZeroMQ applications. Of course, you can use any software
architecture methodology with ZeroMQ.

To best understand how we ended up with SOD, let's look at the alternatives.

Trash-Oriented Design

The most popular design process in large businesses seems to be Trash-Oriented Design, or TOD. TOD feeds off the belief that
all we need to make money are great ideas. It's tenacious nonsense, but a powerful crutch for people who lack imagination. The
theory goes that ideas are rare, so the trick is to capture them. It's like non-musicians being awed by a guitar player, not realizing
that great talent is so cheap it literally plays on the streets for coins.

The main output of TODs is expensive "ideation": concepts, design documents, and products that go straight into the trash can. It
works as follows:

The Creative People come up with long lists of "we could do X and Y". I've seen endlessly detailed lists of everything
amazing a product could do. We've all been guilty of this. Once the creative work of idea generation has happened, it's just
a matter of execution, of course.

So the managers and their consultants pass their brilliant ideas to designers who create acres of preciously refined design
documents. The designers take the tens of ideas the managers came up with, and turn them into hundreds of world-
changing designs.

These designs get given to engineers who scratch their heads and wonder who the heck came up with such nonsense.
They start to argue back, but the designs come from up high, and really, it's not up to engineers to argue with creative
people and expensive consultants.

So the engineers creep back to their cubicles, humiliated and threatened into building the gigantic but oh-so-elegant junk
heap. It is bone-breaking work because the designs take no account of practical costs. Minor whims might take weeks of
work to build. As the project gets delayed, the managers bully the engineers into giving up their evenings and weekends.

Eventually, something resembling a working product makes it out of the door. It's creaky and fragile, complex and ugly.
The designers curse the engineers for their incompetence and pay more consultants to put lipstick onto the pig, and slowly
the product starts to look a little nicer.

By this time, the managers have started to try to sell the product and they find, shockingly, that no one wants it.
Undaunted, they courageously build million-dollar web sites and ad campaigns to explain to the public why they absolutely
need this product. They do deals with other businesses to force the product on the lazy, stupid, and ungrateful market.

After twelve months of intense marketing, the product still isn't making profits. Worse, it suffers dramatic failures and gets
branded in the press as a disaster. The company quietly shelves it, fires the consultants, buys a competing product from a
small startup and rebrands that as its own Version 2. Hundreds of millions of dollars end up in the trash.

Meanwhile, another visionary manager somewhere in the organization drinks a little too much tequila with some marketing
people and has a Brilliant Idea.

Trash-Oriented Design would be a caricature if it wasn't so common. Something like 19 out of 20 market-ready products built by
large firms are failures (yes, 87% of statistics are made up on the spot). The remaining 1 in 20 probably only succeeds because
the competitors are so bad and the marketing is so aggressive.

The main lessons of TOD are quite straightforward but hard to swallow. They are:

Ideas are cheap. No exceptions. There are no brilliant ideas. Anyone who tries to start a discussion with "oooh, we can do
this too!" should be beaten down with all the passion one reserves for traveling evangelists. It is like sitting in a cafe at the
foot of a mountain, drinking a hot chocolate and telling others, "Hey, I have a great idea, we can climb that mountain! And
build a chalet on top! With two saunas! And a garden! Hey, and we can make it solar powered! Dude, that's awesome!
What color should we paint it? Green! No, blue! OK, go and make it, I'll stay here and make spreadsheets and graphics!"

The starting point for a good design process is to collect real problems that confront real people. The second step is to
evaluate these problems with the basic question, "How much is it worth to solve this problem?" Having done that, we can
collect that set of problems that are worth solving.

Good solutions to real problems will succeed as products. Their success will depend on how good and cheap the solution
is, and how important the problem is (and sadly, how big the marketing budgets are). But their success will also depend on
how much they demand in effort to use; in other words, how simple they are.

Now, after slaying the dragon of utter irrelevance, we attack the demon of complexity.

Complexity-Oriented Design

Really good engineering teams and small firms can usually build decent products. But the vast majority of products still end up
being too complex and less successful than they might be. This is because specialist teams, even the best, often stubbornly
apply a process I call Complexity-Oriented Design, or COD, which works as follows:

Management correctly identifies some interesting and difficult problem with economic value. In doing so, they already
leapfrog over any TOD team.

The team with enthusiasm starts to build prototypes and core layers. These work as designed, and thus encouraged, the
team goes off into intense design and architecture discussions, coming up with elegant schemas that look beautiful and
solid.

Management comes back and challenges the team with yet more difficult problems. We tend to equate cost with value, so
the harder and more expensive to solve, the more the solution should be worth, in their minds.

The team, being engineers and thus loving to build stuff, build stuff. They build and build and build and end up with
massive, perfectly designed complexity.

The products go to market, and the market scratches its head and asks, "Seriously, is this the best you can do?" People
do use the products, especially if they aren't spending their own money in climbing the learning curve.

Management gets positive feedback from its larger customers, who share the same idea that high cost (in training and
use) means high value, and so continues to push the process.

Meanwhile somewhere across the world, a small team is solving the same problem using a better process, and a year
later smashes the market to little pieces.

COD is characterized by a team obsessively solving the wrong problems in a form of collective delusion. COD products tend to
be large, ambitious, complex, and unpopular. Much open source software is the output of COD processes. It is insanely hard for
engineers to stop extending a design to cover more potential problems. They argue, "What if someone wants to do X?" but never
ask themselves, "What is the real value of solving X?"

A good example of COD in practice is Bluetooth, a complex, over-designed set of protocols that users hate. It continues to exist
only because in a massively patented industry there are no real alternatives. Bluetooth is perfectly secure, which is close to
pointless for a proximity protocol. At the same time, it lacks a standard API for developers, meaning it's really costly to use
Bluetooth in applications.

On the #zeromq IRC channel, Wintre once wrote of how enraged he was many years ago when he "found that XMMS2 had a
working plugin system, but could not actually play music."

COD is a form of large-scale "rabbit-holing", in which designers and engineers cannot distance themselves from the technical
details of their work. They add more and more features, utterly misreading the economics of their work.

The main lessons of COD are also simple, but hard for experts to swallow. They are:

Making stuff that you don't immediately have a need for is pointless. Doesn't matter how talented or brilliant you are, if you
just sit down and make stuff people are not actually asking for, you are most likely wasting your time.

Problems are not equal. Some are simple, and some are complex. Ironically, solving the simpler problems often has more
value to more people than solving the really hard ones. So if you allow engineers to just work on random things, they'll
mostly focus on the most interesting but least worthwhile things.

Engineers and designers love to make stuff and decoration, and this inevitably leads to complexity. It is crucial to have a
"stop mechanism", a way to set short, hard deadlines that force people to make smaller, simpler answers to just the most
crucial problems.

Simplicity-Oriented Design

Finally, we come to the rare but precious Simplicity-Oriented Design, or SOD. This process starts with a realization: we do not
know what we have to make until after we start making it. Coming up with ideas or large-scale designs isn't just wasteful, it's a
direct hindrance to designing the truly accurate solutions. The really juicy problems are hidden like far valleys, and any activity
except active scouting creates a fog that hides those distant valleys. You need to keep mobile, pack light, and move fast.

SOD works as follows:

We collect a set of interesting problems (by looking at how people use technology or other products) and we line these up
from simple to complex, looking for and identifying patterns of use.

We take the simplest, most dramatic problem and we solve this with a minimal plausible solution, or "patch". Each patch
solves exactly one genuine and agreed-upon problem in a brutally minimal fashion.

We apply one measure of quality to patches, namely "Can this be done any simpler while still solving the stated problem?"
We can measure complexity in terms of concepts and models that the user has to learn or guess in order to use the patch.
The fewer, the better. A perfect patch solves a problem with zero learning required by the user.

Our product development consists of a patch that solves the problem "we need a proof of concept" and then evolves in an
unbroken line to a mature series of products, through hundreds or thousands of patches piled on top of each other.

We do not do anything that is not a patch. We enforce this rule with formal processes that demand that every activity or
task is tied to a genuine and agreed-upon problem, explicitly enunciated and documented.

We build our projects into a supply chain where each project can provide problems to its "suppliers" and receive patches in
return. The supply chain creates the "stop mechanism" because when people are impatiently waiting for an answer, we
necessarily cut our work short.

Individuals are free to work on any projects, and provide patches at any place they feel it's worthwhile. No individuals
"own" any project, except to enforce the formal processes. A single project can have many variations, each a collection of
different, competing patches.

Projects export formal and documented interfaces so that upstream (client) projects are unaware of change happening in
supplier projects. Thus multiple supplier projects can compete for client projects, in effect creating a free and competitive
market.

We tie our supply chain to real users and external clients and we drive the whole process by rapid cycles so that a
problem received from outside users can be analyzed, evaluated, and solved with a patch in a few hours.

At every moment from the very first patch, our product is shippable. This is essential, because a large proportion of
patches will be wrong (10-30%) and only by giving the product to users can we know which patches have become
problems that need solving.

SOD is a hill-climbing algorithm, a reliable way of finding optimal solutions to the most significant problems in an unknown
landscape. You don't need to be a genius to use SOD successfully, you just need to be able to see the difference between the
fog of activity and the progress towards new real problems.

People have pointed out that hill-climbing algorithms have known limitations. One gets stuck on local peaks, mainly. But this is
nonetheless how life itself works: collecting tiny incremental improvements over long periods of time. There is no intelligent
designer. We reduce the risk of local peaks by spreading out widely across the landscape, but it is somewhat moot. The
limitations aren't optional, they are physical laws. The theory says, this is how innovation really works, so better embrace it and
work with it than try to work on the basis of magical thinking.

And in fact once you see all innovation as more or less successful hill-climbing, you realize why some teams and companies and
products get stuck in a never-never land of diminishing prospects. They simply don't have the diversity and collective intelligence
to find better hills to climb. When Nokia killed their open source projects, they cut their own throat.

A really good designer with a good team can use SOD to build world-class products, rapidly and accurately. To get the most out
of SOD the designer has to use the product continuously, from day one, and develop his or her ability to smell out problems such
as inconsistency, surprising behavior, and other forms of friction. We naturally overlook many annoyances, but a good designer
picks these up and thinks about how to patch them. Design is about removing friction in the use of a product.

In an open source setting, we do this work in public. There's no "let's open the code" moment. Projects that do this are in my view
missing the point of open source, which is to engage your users in your exploration, and to build community around the seed of
the architecture.

Burnout

The ZeroMQ community has been and still is heavily dependent on pro bono individual efforts. I'd like to think that everyone was
compensated in some way for their contributions, and I believe that with ZeroMQ, contributing means gaining expertise in an
extraordinarily valuable technology, which leads to improved professional options.

However, not all projects will be so lucky and if you work with or in open source, you should understand the risk of burnout that
volunteers face. This applies to all pro bono communities. In this section, I'll explain what causes burnout, how to recognize it,
how to prevent it, and (if it happens) how to try to treat it. Disclaimer: I'm not a psychiatrist and this article is based on my own
experiences of working in pro bono contexts for the last 20 years, including free software projects, and NGOs such as the FFII.

In a pro bono context, we're expected to work without direct or obvious economic incentive. That is, we sacrifice family life,
professional advancement, free time, and health in order to accomplish some goal we have decided to accomplish. In any
project, we need some kind of reward to make it worth continuing each day. In most pro bono projects the rewards are very
indirect, superficially not economical at all. Mostly, we do things because people say, "Hey, great!" Karma is a powerful motivator.

However, we are economic beings, and sooner or later, if a project costs us a great deal and does not bring economic rewards of
some kind (money, fame, a new job), we start to suffer. At a certain stage, it seems our subconscious simply gets disgusted and
says, "Enough is enough!" and refuses to go any further. If we try to force ourselves, we can literally get sick.

This is what I call "burnout", though the term is also used for other kinds of exhaustion. Too much investment on a project with
too little economic reward, for too long. We are great at manipulating ourselves and others, and this is often part of the process
that leads to burnout. We tell ourselves that it's for a good cause and that the other guy is doing OK, so we should be able to as
well.

When I got burned out on open source projects like Xitami, I remember clearly how I felt. I simply stopped working on it, refused
to answer any more emails, and told people to forget about it. You can tell when someone's burned out. They go offline, and
everyone starts saying, "He's acting strange... depressed, or tired..."

Diagnosis is simple. Has someone worked a lot on a project that was not paying back in any way? Did she make exceptional
sacrifices? Did he lose or abandon his job or studies to do the project? If you're answering "yes", it's burnout.

TherearethreesimpletechniquesI'vedevelopedovertheyearstoreducetheriskofburnoutintheteamsIworkwith:

No one is irreplaceable. Working solo on a critical or popular project (the concentration of responsibility on one person
who cannot set their own limits) is probably the main factor. It's a management truism: if someone in your organization is
irreplaceable, get rid of him or her.

Weneeddayjobstopaythebills.Thiscanbehard,butseemsnecessary.Gettingmoneyfromsomewhereelsemakesit
mucheasiertosustainasacrificialproject.

Teachpeopleaboutburnout.Thisshouldbeabasiccourseincollegesanduniversities,asprobonoworkbecomesa
morecommonwayforyoungpeopletoexperimentprofessionally.

When someone is working alone on a critical project, you know they are going to blow their fuses sooner or later. It's actually fairly
predictable: something like 18-36 months depending on the individual and how much economic stress they face in their private
lives. I've not seen anyone burn out after half a year, nor last five years in an unrewarding project.

Thereisasimplecureforburnoutthatworksinatleastsomecases:getpaiddecentlyforyourwork.However,thisprettymuch
destroysthefreedomofmovement(acrossthatinfiniteproblemlandscape)thatthevolunteerenjoys.

Patterns for Success

I'llendthiscodefreechapterwithaseriesofpatternsforsuccessinsoftwareengineering.Theyaimtocapturetheessenceof
whatdividesglorioussuccessfromtragicfailure.Theyweredescribedas"religiousmaniacaldogma"byamanager,and
"anythingelsewouldbeeffinginsane"byacolleague,inasingleday.Forme,theyarescience.ButtreattheLazyPerfectionist
andothersastoolstouse,sharpen,andthrowawayifsomethingbettercomesalong.

The Lazy Perfectionist

Neverdesignanythingthat'snotapreciseminimalanswertoaproblemwecanidentifyandhavetosolve.

The Lazy Perfectionist spends his idle time observing others and identifying problems that are worth solving. He looks for
agreement on those problems, always asking, "What is the real problem?" Then he moves, precisely and minimally, to build, or
get others to build, a usable answer to one problem. He uses, or gets others to use, those solutions. And he repeats this until
there are no problems left to solve, or time or money runs out.

The Benevolent Tyrant

The control of a large force is the same principle as the control of a few men: it is merely a question of dividing up their numbers.
-- Sun Tzu

TheBenevolentTyrantdivideslargeproblemsintosmalleronesandthrowsthematgroupstofocuson.Shebrokerscontracts
betweenthesegroups,intheformofAPIsandthe"unprotocols"we'llreadaboutinthenextchapter.TheBenevolentTyrant
constructsasupplychainthatstartswithproblems,andresultsinusablesolutions.Sheisruthlessabouthowthesupplychain
works,butdoesnottellpeoplewhattoworkon,norhowtodotheirwork.

The Earth and Sky

Theidealteamconsistsoftwosides:onewritingcode,andoneprovidingfeedback.

TheEarthandSkyworktogetherasawhole,incloseproximity,buttheycommunicateformallythroughissuetracking.Skyseeks
outproblemsfromothersandfromtheirownuseoftheproductandfeedsthesetoEarth.Earthrapidlyanswerswithtestable
solutions.EarthandSkycanworkthroughdozensofissuesinaday.Skytalkstootherusers,andEarthtalkstoother
developers.EarthandSkymaybetwopeople,ortwosmallgroups.

The Open Door

Theaccuracyofknowledgecomesfromdiversity.

TheOpenDooracceptscontributionsfromalmostanyone.Shedoesnotarguequalityordirection,insteadallowingothersto
arguethatandgetmoreengaged.Shecalculatesthatevenatrollwillbringmorediverseopiniontothegroup.Sheletsthegroup
formitsopinionaboutwhatgoesintostablecode,andsheenforcesthisopinionwithhelpofaBenevolentTyrant.

The Laughing Clown

Perfectionprecludesparticipation.

TheLaughingClown,oftenactingastheHappyFailure,makesnoclaimtohighcompetence.Insteadhisanticsandbumbling
attemptsprovokeothersintorescuinghimfromhisowntragedy.Somehowhowever,healwaysidentifiestherightproblemsto
solve.Peoplearesobusyprovinghimwrongtheydon'trealizethey'redoingvaluablework.

The Mindful General

Makenoplans.Setgoals,developstrategiesandtactics.

TheMindfulGeneraloperatesinunknownterritory,solvingproblemsthatarehiddenuntiltheyarenearby.Thusshemakesno
plans,butseeksopportunities,thenexploitsthemrapidlyandaccurately.Shedevelopstacticsandstrategiesinthefield,and
teachesthesetohersoldierssotheycanmoveindependently,andtogether.

The Social Engineer

If you know the enemy and know yourself, you need not fear the result of a hundred battles. -- Sun Tzu

TheSocialEngineerreadstheheartsandmindsofthoseheworkswithandfor.Heasks,ofeveryone,"Whatmakesthisperson
angry,insecure,argumentative,calm,happy?"Hestudiestheirmoodsanddispositions.Withthisknowledgehecanencourage
thosewhoareuseful,anddiscouragethosewhoarenot.TheSocialEngineerneveractsonhisownemotions.

The Constant Gardener

He will win whose army is animated by the same spirit throughout all its ranks. -- Sun Tzu

The Constant Gardener grows a process from a small seed, step-by-step as more people come into the project. She makes
every change for a precise reason, with agreement from everyone. She never imposes a process from above but lets others
come to consensus, and then she enforces that consensus. In this way, everyone owns the process together and by owning it,
they are attached to it.

The Rolling Stone

After crossing a river, you should get far away from it. -- Sun Tzu

TheRollingStoneacceptshisownmortalityandtransience.Hehasnoattachmenttohispastwork.Heacceptsthatallthatwe
makeisdestinedforthetrashcan,itisjustamatteroftime.Withprecise,minimalinvestments,hecanmoverapidlyawayfrom
thepastandstayfocusedonthepresentandnearfuture.Aboveall,hehasnoegoandnopridetobehurtbytheactionsof
others.

The Pirate Gang

Code,likeallknowledge,worksbestascollectivenotprivateproperty.

ThePirateGangorganizesfreelyaroundproblems.Itacceptsauthorityinsofarasauthorityprovidesgoalsandresources.The
PirateGangownsandsharesallitmakes:everyworkisfullyremixablebyothersinthePirateGang.Thegangmovesrapidlyas
newproblemsemerge,andisquicktoabandonoldsolutionsifthosestopbeingrelevant.Nopersonsorgroupscanmonopolize
anypartofthesupplychain.

The Flash Mob

Water shapes its course according to the nature of the ground over which it flows. -- Sun Tzu

TheFlashMobcomestogetherinspaceandtimeasneeded,thendispersesassoonastheycan.Physicalclosenessis
essentialforhighbandwidthcommunications.Butovertimeitcreatestechnicalghettos,whereEarthgetsseparatedfromSky.
TheFlashMobtendstocollectalotoffrequentfliermiles.

The Canary Watcher

Painisnot,generally,aGoodSign.

The Canary Watcher measures the quality of an organization by his own pain level, and the observed pain levels of those with
whom he works. He brings new participants into existing organizations so they can express the raw pain of the innocent. He may
use alcohol to get others to verbalize their pain points. He asks others, and himself, "Are you happy in this process, and if not,
why not?" When an organization causes pain in himself or others, he treats that as a problem to be fixed. People should feel joy
in their work.

The Hangman

Neverinterruptotherswhentheyaremakingmistakes.

TheHangmanknowsthatwelearnonlybymakingmistakes,andshegivesotherscopiousropewithwhichtolearn.Sheonly
pullstheropegently,whenit'stime.Alittletugtoremindtheotheroftheirprecariousposition.Allowingotherstolearnbyfailure
givesthegoodreasontostay,andthebadexcusetoleave.TheHangmanisendlesslypatient,becausethereisnoshortcutto
thelearningprocess.

The Historian

Keepingthepublicrecordmaybetedious,butit'stheonlywaytopreventcollusion.

The Historian forces discussion into the public view, to prevent collusion to own areas of work. The Pirate Gang depends on full
and equal communications that do not depend on momentary presence. No one really reads the archives, but the simple
possibility stops most abuses. The Historian encourages the right tool for the job: email for transient discussions, IRC for chatter,
wikis for knowledge, issue tracking for recording opportunities.

The Provocateur

When a man knows he is to be hanged in a fortnight, it concentrates his mind wonderfully. -- Samuel Johnson

TheProvocateurcreatesdeadlines,enemies,andtheoccasionalimpossibility.Teamsworkbestwhentheydon'thavetimefor
thecrap.Deadlinesbringpeopletogetherandfocusthecollectivemind.Anexternalenemycanmoveapassiveteamintoaction.
TheProvocateurnevertakesthedeadlinetooseriously.Theproductisalwaysreadytoship.Butshegentlyremindstheteamof
thestakes:fail,andwealllookforotherjobs.

The Mystic

When people argue or complain, just write them a Sun Tzu quotation. -- Mikko Koppanen

TheMysticneverarguesdirectly.Heknowsthattoarguewithanemotionalpersononlycreatesmoreemotion.Insteadheside
stepsthediscussion.It'shardtobeangryataChinesegeneral,especiallywhenhehasbeendeadfor2,400years.TheMystic
playsHangmanwhenpeopleinsistontherighttogetitwrong.

Chapter 7 - Advanced Architecture using ZeroMQ

OneoftheeffectsofusingZeroMQatlargescaleisthatbecausewecanbuilddistributedarchitecturessomuchfasterthan
before,thelimitationsofoursoftwareengineeringprocessesbecomemorevisible.Mistakesinslowmotionareoftenharderto
see(orrather,easiertorationalizeaway).

MyexperiencewhenteachingZeroMQtogroupsofengineersisthatit'srarelysufficienttojustexplainhowZeroMQworksand
thenjustexpectthemtostartbuildingsuccessfulproducts.Likeanytechnologythatremovesfriction,ZeroMQopensthedoorto
bigblunders.IfZeroMQistheACMErocketpropelledshoeofdistributedsoftwaredevelopment,alotofusarelikeWileE.
Coyote,slammingfullspeedintotheproverbialdesertcliff.

WesawinChapter6TheZeroMQCommunitythatZeroMQitselfusesaformalprocessforchanges.Onereasonwebuiltthis
process,oversomeyears,wastostoptherepeatedcliffslammingthathappenedinthelibraryitself.

Partly, it's about slowing down, and partly, it's about ensuring that when you move fast, you go (and this is essential, Dear
Reader) in the right direction. It's my standard interview riddle: what's the rarest property of any software system, the absolute
hardest thing to get right, the lack of which causes the slow or fast death of the vast majority of projects? The answer is not code
quality, funding, performance, or even (though it's a close answer) popularity. The answer is accuracy.

Accuracy is half the challenge, and applies to any engineering work. The other half is distributed computing itself, which sets up a
whole range of problems that we need to solve if we are going to create architectures. We need to encode and decode data; we
need to define protocols to connect clients and servers; we need to secure these protocols against attackers; and we need to
make stacks that are robust. Asynchronous messaging is hard to get right.

Thischapterwilltacklethesechallenges,startingwithabasicreappraisalofhowtodesignandbuildsoftwareandendingwitha
fullyformedexampleofadistributedapplicationforlargescalefiledistribution.

We'llcoverthefollowingjuicytopics:

How to go from idea to working prototype safely (the MOPED pattern)
Different ways to serialize your data as ZeroMQ messages
How to code-generate binary serialization codecs
How to build custom code generators using the GSL tool
How to write and license a protocol specification
How to build fast restartable file transfer over ZeroMQ
How to use credit-based flow control for nonblocking transfers
How to build protocol servers and clients as state machines
How to make a secure protocol over ZeroMQ
A large-scale file publishing system (FileMQ)

Message-Oriented Pattern for Elastic Design

I'll introduce Message-Oriented Pattern for Elastic Design (MOPED), a software engineering pattern for ZeroMQ architectures. It
was either "MOPED" or "BIKE", the Backronym-Induced Kinetic Effect. That's short for "BICICLE", the Backronym-Inflated See if I
Care Less Effect. In life, one learns to go with the least embarrassing choice.

Ifyou'vereadthisbookcarefully,you'llhaveseenMOPEDinactionalready.ThedevelopmentofMajordomoinChapter4
ReliableRequestReplyPatternsisanearperfectcase.Butcutenamesareworthathousandwords.

ThegoalofMOPEDistodefineaprocessbywhichwecantakearoughusecaseforanewdistributedapplication,andgofrom
"HelloWorld"tofullyworkingprototypeinanylanguageinunderaweek.

UsingMOPED,yougrow,morethanbuild,aworkingZeroMQarchitecturefromthegroundupwithminimalriskoffailure.By
focusingonthecontractsratherthantheimplementations,youavoidtheriskofprematureoptimization.Bydrivingthedesign
processthroughultrashorttestbasedcycles,youcanbemorecertainthatwhatyouhaveworksbeforeyouaddmore.

We can turn this into five real steps:

Step 1: internalize the ZeroMQ semantics.
Step 2: draw a rough architecture.
Step 3: decide on the contracts.
Step 4: make a minimal end-to-end solution.
Step 5: solve one problem and repeat.

Step 1: Internalize the Semantics

YoumustlearnanddigestZeroMQ's"language",thatis,thesocketpatternsandhowtheywork.Theonlywaytolearna
languageistouseit.There'snowaytoavoidthisinvestment,notapesyoucanplaywhileyousleep,nochipsyoucanpluginto
magicallybecomesmarter.Readthisbookfromthestart,workthroughthecodeexamplesinwhateverlanguageyouprefer,
understandwhat'sgoingon,and(mostimportantly)writesomeexamplesyourselfandthenthrowthemaway.

Atacertainpoint,you'llfeelaclickingnoiseinyourbrain.Maybeyou'llhaveaweirdchiliinduceddreamwherelittleZeroMQ
tasksrunaroundtryingtoeatyoualive.Maybeyou'lljustthink"aaahh,sothat'swhatitmeans!"Ifwedidourworkright,itshould
taketwotothreedays.Howeverlongittakes,untilyoustartthinkingintermsofZeroMQsocketsandpatterns,you'renotready
forstep2.

Step 2: Draw a Rough Architecture

Frommyexperience,it'sessentialtobeabletodrawthecoreofyourarchitecture.Ithelpsothersunderstandwhatyouare
thinking,anditalsohelpsyouthinkthroughyourideas.Thereisreallynobetterwaytodesignagoodarchitecturethantoexplain
yourideastoyourcolleagues,usingawhiteboard.

You don't need to get it right, and you don't need to make it complete. What you do need to do is break your architecture into
pieces that make sense. The nice thing about software architecture (as compared to constructing bridges) is that you really can
replace entire layers cheaply if you've isolated them.

Startbychoosingthecoreproblemthatyouaregoingtosolve.Ignoreanythingthat'snotessentialtothatproblem:youwilladdit
inlater.Theproblemshouldbeanendtoendproblem:theropeacrossthegorge.

Forexample,aclientaskedustomakeasupercomputingclusterwithZeroMQ.Clientscreatebundlesofwork,whicharesentto
abrokerthatdistributesthemtoworkers(runningonfastgraphicsprocessors),collectstheresultsback,andreturnsthemtothe
client.

Theropeacrossthegorgeisoneclienttalkingtoabrokertalkingtooneworker.Wedrawthreeboxes:client,broker,worker.We
drawarrowsfromboxtoboxshowingtherequestflowingonewayandtheresponseflowingback.It'sjustlikethemanydiagrams
wesawinearlierchapters.

Be minimalistic. Your goal is not to define a real architecture, but to throw a rope across the gorge to bootstrap your process. We
make the architecture successively more complete and realistic over time: e.g., adding multiple workers, adding client and worker
APIs, handling failures, and so on.

Step 3: Decide on the Contracts

A good software architecture depends on contracts, and the more explicit they are, the better things scale. You don't care how
things happen; you only care about the results. If I send an email, I don't care how it arrives at its destination, as long as the
contract is respected. The email contract is: it arrives within a few minutes, no one modifies it, and it doesn't get lost.

And to build a large system that works well, you must focus on the contracts before the implementations. It may sound obvious
but all too often, people forget or ignore this, or are just too shy to impose themselves. I wish I could say ZeroMQ had done this
properly, but for years our public contracts were second-rate afterthoughts instead of primary in-your-face pieces of work.

Sowhatisacontractinadistributedsystem?Thereare,inmyexperience,twotypesofcontract:

TheAPIstoclientapplications.RememberthePsychologicalElements.TheAPIsneedtobeasabsolutelysimple,
consistent,andfamiliaraspossible.Yes,youcangenerateAPIdocumentationfromcode,butyoumustfirstdesignit,and
designinganAPIisoftenhard.

Theprotocolsthatconnectthepieces.Itsoundslikerocketscience,butit'sreallyjustasimpletrick,andonethatZeroMQ
makesparticularlyeasy.Infactthey'resosimpletowrite,andneedsolittlebureaucracythatIcallthemunprotocols.

You write minimal contracts that are mostly just place markers. Most messages and most API methods will be missing or empty.
You also want to write down any known technical requirements in terms of throughput, latency, reliability, and so on. These are
the criteria on which you will accept or reject any particular piece of work.
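
As a rough illustration of what "mostly place markers" might look like for the client/broker/worker example from Step 2, here is a minimal Python sketch. Everything in it is hypothetical: the command names (REQUEST, REPLY), the ClusterClient class, and the figures in REQUIREMENTS are invented for illustration and do not come from any real project.

```python
"""First-draft contracts for a hypothetical client/broker/worker cluster."""

# Protocol contract: only the commands needed for the first end-to-end test.
# Command names and frame layouts are illustrative assumptions.
COMMANDS = {
    "REQUEST": ["command", "body"],   # client -> broker -> worker
    "REPLY": ["command", "body"],     # worker -> broker -> client
}

# Technical requirements: the acceptance criteria for any piece of work.
# The figures are placeholders, not measurements.
REQUIREMENTS = {
    "throughput_msgs_per_sec": 10_000,
    "max_latency_ms": 50,
    "lost_requests_allowed": 0,
}


class ClusterClient:
    """API contract: the methods an application will call (all still empty)."""

    def connect(self, endpoint: str) -> None:
        raise NotImplementedError("place marker")

    def request(self, body: bytes) -> bytes:
        raise NotImplementedError("place marker")


def meets_requirements(measured: dict) -> bool:
    """Accept or reject a piece of work against the written criteria."""
    return (
        measured["throughput_msgs_per_sec"] >= REQUIREMENTS["throughput_msgs_per_sec"]
        and measured["max_latency_ms"] <= REQUIREMENTS["max_latency_ms"]
        and measured["lost_requests"] <= REQUIREMENTS["lost_requests_allowed"]
    )
```

The point of the sketch is that the contract exists and is written down before anything behind it works.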

Step 4: Write a Minimal End-to-End Solution

Thegoalistotestouttheoverallarchitectureasrapidlyaspossible.MakeskeletonapplicationsthatcalltheAPIs,andskeleton
stacksthatimplementbothsidesofeveryprotocol.Youwanttogetaworkingendtoend"HelloWorld"assoonasyoucan.You
wanttobeabletotestcodeasyouwriteit,sothatyoucanweedoutthebrokenassumptionsandinevitableerrorsyoumake.Do
notgooffandspendsixmonthswritingatestsuite!Instead,makeaminimalbarebonesapplicationthatusesourstill
hypotheticalAPI.

IfyoudesignanAPIwearingthehatofthepersonwhoimplementsit,you'llstarttothinkofperformance,features,options,and
soon.You'llmakeitmorecomplex,moreirregular,andmoresurprisingthanitshouldbe.But,andhere'sthetrick(it'sacheap
one,wasbiginJapan):ifyoudesignanAPIwhilewearingthehatofthepersonwhohastoactuallywriteappsthatuseit,you
useallthatlazinessandfeartoyouradvantage.

Writedowntheprotocolsonawikiorshareddocumentinsuchawaythatyoucanexplaineverycommandclearlywithouttoo
muchdetail.Stripoffanyrealfunctionality,becauseitwillonlycreateinertiathatmakesithardertomovestuffaround.Youcan
alwaysaddweight.Don'tspendeffortdefiningformalmessagestructures:passtheminimumaroundinthesimplestpossible
fashionusingZeroMQ'smultipartframing.

Ourgoalistogetthesimplesttestcaseworking,withoutanyavoidablefunctionality.Everythingyoucanchopoffthelistofthings
todo,youchop.Ignorethegroansfromcolleaguesandbosses.I'llrepeatthisonceagain:youcanalwaysaddfunctionality,
that'srelativelyeasy.Butaimtokeeptheoverallweighttoaminimum.
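
To make the idea of a skeleton end-to-end "Hello World" concrete, here is a small sketch in Python. It is an assumption-laden stand-in, not ZeroMQ code: plain in-process queues play the role of sockets, a message is a list of byte frames (imitating multipart framing), and the REQUEST and REPLY command names are invented for the example.

```python
"""Rope across the gorge: one client, one broker, one worker, in one process.

Plain queues stand in for ZeroMQ sockets, and each message is a list of
byte frames. No real networking happens here.
"""
from queue import Queue

client_to_broker = Queue()
broker_to_worker = Queue()
worker_to_broker = Queue()
broker_to_client = Queue()


def client_send(body: bytes) -> None:
    """Skeleton client API: send one request."""
    client_to_broker.put([b"REQUEST", body])


def broker_step() -> None:
    """Forward one request to the worker, or one reply to the client."""
    if not client_to_broker.empty():
        broker_to_worker.put(client_to_broker.get())
    elif not worker_to_broker.empty():
        broker_to_client.put(worker_to_broker.get())


def worker_step() -> None:
    """Skeleton worker: no real functionality, it just echoes the body."""
    command, body = broker_to_worker.get()
    assert command == b"REQUEST"
    worker_to_broker.put([b"REPLY", body])


def hello_world() -> list:
    """Drive one request all the way across and back."""
    client_send(b"Hello")
    broker_step()   # request travels client -> broker -> worker
    worker_step()
    broker_step()   # reply travels worker -> broker -> client
    return broker_to_client.get()


if __name__ == "__main__":
    print(hello_world())
```

Once something this small runs end to end, each later change (real sockets, multiple workers, failure handling) can be checked against it.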

Step 5: Solve One Problem and Repeat

You'renowinthehappycycleofissuedrivendevelopmentwhereyoucanstarttosolvetangibleproblemsinsteadofadding
features.Writeissuesthateachstateaclearproblem,andproposeasolution.AsyoudesigntheAPI,keepinmindyour
standardsfornames,consistency,andbehavior.Writingthesedowninproseoftenhelpskeepthemsane.

Fromhere,everysinglechangeyoumaketothearchitectureandcodecanbeprovenbyrunningthetestcase,watchingitnot
work,makingthechange,andthenwatchingitwork.

Now you go through the whole cycle (extending the test case, fixing the API, updating the protocol, and extending the code, as
needed), taking problems one at a time and testing the solutions individually. It should take about 10-30 minutes for each cycle,
with the occasional spike due to random confusion.

Unprotocols

Protocols Without The Goats

Whenthismanthinksofprotocols,thismanthinksofmassivedocumentswrittenbycommittees,overyears.Thismanthinksof
theIETF,W3C,ISO,Oasis,regulatorycapture,FRANDpatentlicensedisputes,andsoonafter,thismanthinksofretirementtoa
nicelittlefarminnorthernBoliviaupinthemountainswheretheonlyotherneedlesslystubbornbeingsarethegoatschewingup
thecoffeeplants.

Now, I've nothing personal against committees. The useless folk need a place to sit out their lives with minimal risk of
reproducing; after all, that only seems fair. But most committee protocols tend towards complexity (the ones that work), or trash
(the ones we don't talk about). There are a few reasons for this. One is the amount of money at stake. More money means more
people who want their particular prejudices and assumptions expressed in prose. But two is the lack of good abstractions on
which to build. People have tried to build reusable protocol abstractions, like BEEP. Most did not stick, and those that did, like
SOAP and XMPP, are on the complex side of things.

Itusedtobe,decadesago,whentheInternetwasayoungmodestthing,thatprotocolswereshortandsweet.Theyweren'teven
"standards",but"requestsforcomments",whichisasmodestasyoucanget.It'sbeenoneofmygoalssincewestartediMatixin
1995tofindawayforordinarypeoplelikemetowritesmall,accurateprotocolswithouttheoverheadofthecommittees.

Now,ZeroMQdoesappeartoprovidealiving,successfulprotocolabstractionlayerwithits"we'llcarrymultipartmessagesover
randomtransports"wayofworking.BecauseZeroMQdealssilentlywithframing,connections,androuting,it'ssurprisinglyeasy
towritefullprotocolspecsontopofZeroMQ,andinChapter4ReliableRequestReplyPatternsandChapter5Advanced
PubSubPatternsIshowedhowtodothis.

Somewherearoundmid2007,IkickedofftheDigitalStandardsOrganizationtodefinenewsimplerwaysofproducinglittle
standards,protocols,andspecifications.Inmydefense,itwasaquietsummer.Atthetime,Iwrotethatanewspecification
shouldtake"minutestoexplain,hourstodesign,daystowrite,weekstoprove,monthstobecomemature,andyearstoreplace."

In2010,westartedcallingsuchlittlespecificationsunprotocols,whichsomepeoplemightmistakeforadastardlyplanforworld
dominationbyashadowyinternationalorganization,butwhichreallyjustmeans"protocolswithoutthegoats".

Contracts Are Hard

Writing contracts is perhaps the most difficult part of large-scale architecture. With unprotocols, we remove as much of the
unnecessary friction as possible. What remains is still a hard set of problems to solve. A good contract (be it an API, a protocol,
or a rental agreement) has to be simple, unambiguous, technically sound, and easy to enforce.

Like any technical skill, it's something you have to learn and practice. There is a series of specifications on the
ZeroMQ RFC site, which are worth reading and using as a basis for your own specifications when you find yourself in need.

I'lltrytosummarizemyexperienceasaprotocolwriter:

Startsimple,anddevelopyourspecificationsstepbystep.Don'tsolveproblemsyoudon'thaveinfrontofyou.

Use very clear and consistent language. A protocol may often break down into commands and fields; use clear short
names for these entities.

Trytoavoidinventingconcepts.Reuseanythingyoucanfromexistingspecifications.Useterminologythatisobviousand
cleartoyouraudience.

Make nothing for which you cannot demonstrate an immediate need. Your specification solves problems; it does not
provide features. Make the simplest plausible solution for each problem that you identify.

Implementyourprotocolasyoubuildit,sothatyouareawareofthetechnicalconsequencesofeachchoice.Usea
languagethatmakesithard(likeC)andnotonethatmakesiteasy(likePython).

Testyourspecificationasyoubuilditonotherpeople.Yourbestfeedbackonaspecificationiswhensomeoneelsetriesto
implementitwithouttheassumptionsandknowledgethatyouhaveinyourhead.

Crosstestrapidlyandconsistently,throwingothers'clientsagainstyourserversandviceversa.

Bepreparedtothrowitoutandstartagainasoftenasneeded.Planforthis,bylayeringyourarchitecturesothate.g.,you
cankeepanAPIbutchangetheunderlyingprotocols.

Onlyuseconstructsthatareindependentofprogramminglanguageandoperatingsystem.

Solvealargeprobleminlayers,makingeachlayeranindependentspecification.Bewareofcreatingmonolithicprotocols.
Thinkabouthowreusableeachlayeris.Thinkabouthowdifferentteamscouldbuildcompetingspecificationsateach
layer.

Andaboveall,writeitdown.Codeisnotaspecification.Thepointaboutawrittenspecificationisthatnomatterhowweakitis,it
canbesystematicallyimproved.Bywritingdownaspecification,youwillalsospotinconsistenciesandgrayareasthatare
impossibletoseeincode.

Ifthissoundshard,don'tworrytoomuch.OneofthelessobviousbenefitsofusingZeroMQisthatitcutstheeffortnecessaryto
writeaprotocolspecbyperhaps90%ormorebecauseitalreadyhandlesframing,routing,queuing,andsoon.Thismeansthat
youcanexperimentrapidly,makemistakescheaply,andthuslearnrapidly.

How to Write Unprotocols

Whenyoustarttowriteanunprotocolspecificationdocument,sticktoaconsistentstructuresothatyourreadersknowwhatto
expect.HereisthestructureIuse:

Cover section: with a 1-line summary, URL to the spec, formal name, version, who to blame.
License for the text: absolutely needed for public specifications.
The change process: i.e., how can I as a reader fix problems in the specification?
Use of language: MUST, MAY, SHOULD, and so on, with a reference to RFC 2119.
Maturity indicator: is this an experimental, draft, stable, legacy, or retired?
Goals of the protocol: what problems is it trying to solve?
Formal grammar: prevents arguments due to different interpretations of the text.
Technical explanation: semantics of each message, error handling, and so on.
Security discussion: explicitly, how secure the protocol is.
References: to other documents, protocols, and so on.

Writingclear,expressivetextishard.Doavoidtryingtodescribeimplementationsoftheprotocol.Rememberthatyou'rewritinga
contract.Youdescribeinclearlanguagetheobligationsandexpectationsofeachparty,thelevelofobligation,andthepenalties
forbreakingtherules.Youdonottrytodefinehoweachpartyhonorsitspartofthedeal.

Herearesomekeypointsaboutunprotocols:

Aslongasyourprocessisopen,thenyoudon'tneedacommittee:justmakecleanminimaldesignsandmakesure
anyoneisfreetoimprovethem.

If you use an existing license, then you don't have legal worries afterwards. I use GPLv3 for my public specifications and
advise you to do the same. For in-house work, standard copyright is perfect.

Formalityisvaluable.Thatis,learntowriteaformalgrammarsuchasABNF(AugmentedBackusNaurForm)andusethis
tofullydocumentyourmessages.

UseamarketdrivenlifecycleprocesslikeDigistan'sCOSSsothatpeopleplacetherightweightonyourspecsasthey
mature(ordon't).

Why use the GPLv3 for Public Specifications?

Thelicenseyouchooseisparticularlycrucialforpublicspecifications.Traditionally,protocolsarepublishedundercustom
licenses,wheretheauthorsownthetextandderivedworksareforbidden.Thissoundsgreat(afterall,whowantstoseea
protocolforked?),butit'sinfacthighlyrisky.Aprotocolcommitteeisvulnerabletocapture,andiftheprotocolisimportantand
valuable,theincentiveforcapturegrows.

Oncecaptured,likesomewildanimals,animportantprotocolwilloftendie.Therealproblemisthatthere'snowaytofreea
captiveprotocolpublishedunderaconventionallicense.Theword"free"isn'tjustanadjectivetodescribespeechorair,it'salso
averb,andtherighttoforkaworkagainstthewishesoftheownerisessentialtoavoidingcapture.

Letmeexplainthisinshorterwords.ImaginethatiMatixwritesaprotocoltodaythat'sreallyamazingandpopular.Wepublishthe
specandmanypeopleimplementit.Thoseimplementationsarefastandawesome,andfreeasinbeer.Theystarttothreatenan
existingbusiness.Theirexpensivecommercialproductisslowerandcan'tcompete.SoonedaytheycometoouriMatixofficein
MaetangDong,SouthKorea,andoffertobuyourfirm.Becausewe'respendingvastamountsonsushiandbeer,weaccept
gratefully.Withevillaughter,thenewownersoftheprotocolstopimprovingthepublicversion,closethespecification,andadd
patentedextensions.Theirnewproductssupportthisnewprotocolversion,buttheopensourceversionsarelegallyblockedfrom
doingso.Thecompanytakesoverthewholemarket,andcompetitionends.

Whenyoucontributetoanopensourceproject,youreallywanttoknowyourhardworkwon'tbeusedagainstyoubyaclosed
sourcecompetitor.ThisiswhytheGPLbeatsthe"morepermissive"BSD/MIT/X11licensesformostcontributors.Theselicenses
givepermissiontocheat.Thisappliesjustasmuchtoprotocolsastosourcecode.

WhenyouimplementaGPLv3specification,yourapplicationsareofcourseyours,andlicensedanywayyoulike.Butyoucanbe
certainoftwothings.One,thatspecificationwillneverbeembracedandextendedintoproprietaryforms.Anyderivedformsof
thespecificationmustalsobeGPLv3.Two,noonewhoeverimplementsorusestheprotocolwilleverlaunchapatentattackon
anythingitcovers,norcantheyaddtheirpatentedtechnologytoitwithoutgrantingtheworldafreelicense.

Using ABNF

Myadvicewhenwritingprotocolspecsistolearnanduseaformalgrammar.It'sjustlesshasslethanallowingotherstointerpret
whatyoumean,andthenrecoverfromtheinevitablefalseassumptions.Thetargetofyourgrammarisotherpeople,engineers,
notcompilers.

My favorite grammar is ABNF, as defined by RFC 2234 (since superseded by RFC 5234), because it is probably the simplest and
most widely used formal language for defining bidirectional communications protocols. Most IETF (Internet Engineering Task
Force) specifications use ABNF, which is good company to be in.

I'll give a 30-second crash course in writing ABNF. It may remind you of regular expressions. You write the grammar as rules.
Each rule takes the form "name = elements". An element can be another rule (which you define below as another rule) or a
predefined terminal like CRLF, OCTET, or a number. The RFC lists all the terminals. To define alternative elements, separate with a
slash. To define repetition, use an asterisk. To group elements, use parentheses. Read the RFC because it's not intuitive.

I'mnotsureifthisextensionisproper,butIthenprefixelementswith"C:"and"S:"toindicatewhethertheycomefromtheclient
orserver.

Here'sapieceofABNFforanunprotocolcalledNOMthatwe'llcomebacktolaterinthischapter:

nom-protocol    = open-peering *use-peering

open-peering    = C:OHAI ( S:OHAI-OK / S:WTF )

use-peering     = C:ICANHAZ
                / S:CHEEZBURGER
                / C:HUGZ S:HUGZ-OK
                / S:HUGZ C:HUGZ-OK

I'veactuallyusedthesekeywords(OHAI,WTF)incommercialprojects.Theymakedevelopersgigglyandhappy.Theyconfuse
management.They'regoodinfirstdraftsthatyouwanttothrowawaylater.
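
One way to get a feel for what the NOM grammar accepts is to check command sequences against it mechanically. The sketch below is only an illustration under a few assumptions: each command is written as a token with its C: or S: prefix, the reply commands are spelled with hyphens (OHAI-OK, HUGZ-OK), and the three ABNF rules are translated by hand into one regular expression. The function name is_valid_dialog is hypothetical.

```python
"""Check a sequence of NOM commands against the grammar (illustrative only)."""
import re

# open-peering = C:OHAI ( S:OHAI-OK / S:WTF )
OPEN_PEERING = r"C:OHAI (?:S:OHAI-OK|S:WTF) "

# use-peering = C:ICANHAZ / S:CHEEZBURGER / C:HUGZ S:HUGZ-OK / S:HUGZ C:HUGZ-OK
USE_PEERING = (
    r"(?:C:ICANHAZ |S:CHEEZBURGER |C:HUGZ S:HUGZ-OK |S:HUGZ C:HUGZ-OK )"
)

# nom-protocol = open-peering *use-peering
NOM_PROTOCOL = re.compile(OPEN_PEERING + USE_PEERING + "*")


def is_valid_dialog(commands):
    """Return True if the command tokens form a dialog the grammar allows."""
    # Each token is followed by one space so that "C:HUGZ" can never be
    # confused with the start of "C:HUGZ-OK".
    text = "".join(command + " " for command in commands)
    return NOM_PROTOCOL.fullmatch(text) is not None


if __name__ == "__main__":
    dialog = ["C:OHAI", "S:OHAI-OK", "C:ICANHAZ", "S:CHEEZBURGER"]
    print(is_valid_dialog(dialog))
```

Note that the grammar, read literally, still allows use-peering commands after S:WTF; the sketch follows the grammar rather than guessing at the intended semantics.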

The Cheap or Nasty Pattern

There is a general lesson I've learned over a couple of decades of writing protocols small and large. I call this the Cheap or Nasty
pattern: you can often split your work into two aspects or layers and solve these separately, one using a "cheap" approach, the
other using a "nasty" approach.

The key insight to making Cheap or Nasty work is to realize that many protocols mix a low-volume chatty part for control, and a
high-volume asynchronous part for data. For instance, HTTP has a chatty dialog to authenticate and get pages, and an
asynchronous dialog to stream data. FTP actually splits this over two ports: one port for control and one port for data.

Protocoldesignerswhodon'tseparatecontrolfromdatatendtomakehorridprotocols,becausethetradeoffsinthetwocases
arealmosttotallyopposed.Whatisperfectforcontrolisbadfordata,andwhat'sidealfordatajustdoesn'tworkforcontrol.It's
especiallytruewhenwewanthighperformanceatthesametimeasextensibilityandgooderrorchecking.

Let'sbreakthisdownusingaclassicclient/serverusecase.Theclientconnectstotheserverandauthenticates.Itthenasksfor
someresource.Theserverchatsback,thenstartstosenddatabacktotheclient.Eventually,theclientdisconnectsortheserver
finishes,andtheconversationisover.

Now,beforestartingtodesignthesemessages,stopandthink,andlet'scomparethecontroldialogandthedataflow:

Thecontroldialoglastsashorttimeandinvolvesveryfewmessages.Thedataflowcouldlastforhoursordays,and
involvebillionsofmessages.

Thecontroldialogiswhereallthe"normal"errorshappen,e.g.,notauthenticated,notfound,paymentrequired,censored,
andsoon.Incontrast,anyerrorsthathappenduringthedataflowareexceptional(diskfull,servercrashed).

Thecontroldialogiswherethingswillchangeovertimeasweaddmoreoptions,parameters,andsoon.Thedataflow
shouldbarelychangeovertimebecausethesemanticsofaresourcearefairlyconstantovertime.

The control dialog is essentially a synchronous request/reply dialog. The data flow is essentially a one-way asynchronous
flow.

Thesedifferencesarecritical.Whenwetalkaboutperformance,itappliesonlytodataflows.It'spathologicaltodesignaonetime
controldialogtobefast.Thuswhenwetalkaboutthecostofserialization,thisonlyappliestothedataflow.Thecostof
encoding/decodingthecontrolflowcouldbehuge,andformanycasesitwouldnotchangeathing.Soweencodecontrolusing
Cheap,andweencodedataflowsusingNasty.
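
A quick back-of-the-envelope calculation shows why the cost of the control codec barely matters. The figures below are made-up assumptions chosen only to match the orders of magnitude in the text (a handful of control messages, billions of data messages); they are not measurements.

```python
"""Where does serialization time go? All figures are illustrative assumptions."""

CONTROL_MESSAGES = 10             # a short control dialog per session
DATA_MESSAGES = 1_000_000_000     # "billions of messages" in the data flow


def session_cost(messages, cost_per_message_sec):
    """Total encode/decode time for one kind of traffic, in seconds."""
    return messages * cost_per_message_sec


def control_share(control_cost_sec, data_cost_sec=1e-6):
    """Fraction of total serialization time spent on the control dialog."""
    control = session_cost(CONTROL_MESSAGES, control_cost_sec)
    data = session_cost(DATA_MESSAGES, data_cost_sec)
    return control / (control + data)


if __name__ == "__main__":
    # A verbose control codec at 1 ms per message costs roughly 0.01 s,
    # against roughly 1000 s for data at 1 microsecond per message.
    print(control_share(1e-3))
    # Even an absurdly slow control codec (1 s per message) stays under 1%.
    print(control_share(1.0))
```

With these assumptions, making the control codec a hundred times faster saves about a hundredth of a second per session, while any saving on the data codec is multiplied by a billion.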

Cheapisessentiallysynchronous,verbose,descriptive,andflexible.ACheapmessageisfullofrichinformationthatcanchange
foreachapplication.Yourgoalasdesigneristomakethisinformationeasytoencodeandparse,trivialtoextendfor
experimentationorgrowth,andhighlyrobustagainstchangebothforwardsandbackwards.TheCheappartofaprotocollooks
likethis:

Itusesasimpleselfdescribingstructuredencodingfordata,beitXML,JSON,HTTPstyleheaders,orsomeother.Any
encodingisfineaslongastherearestandardsimpleparsersforitinyourtargetlanguages.

Itusesastraightrequestreplymodelwhereeachrequesthasasuccess/failurereply.Thismakesittrivialtowritecorrect
clientsandserversforaCheapdialog.

It doesn't try, even marginally, to be fast. Performance doesn't matter when you do something only once or a few times per
session.

ACheapparserissomethingyoutakeofftheshelfandthrowdataat.Itshouldn'tcrash,shouldn'tleakmemory,shouldbehighly
tolerant,andshouldberelativelysimpletoworkwith.That'sit.
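
As a sketch of the Cheap half, here is a tiny control dialog encoded as JSON, using only Python's standard json module as the off-the-shelf parser. The command names (FETCH, OK, ERROR), the field names, and the RESOURCES table are hypothetical; the point is only that every request gets exactly one success or failure reply, and that unknown fields are tolerated.

```python
"""A Cheap control dialog: verbose, self-describing, strict request-reply."""
import json

# Hypothetical resources the server knows about: path -> size in bytes.
RESOURCES = {"/readme": 1024}


def encode_command(name, **fields):
    """Encode a control command as self-describing JSON text."""
    return json.dumps({"command": name, **fields}).encode("utf-8")


def decode_command(data):
    """Parse a command, tolerating unknown fields; return None if malformed."""
    try:
        message = json.loads(data.decode("utf-8"))
    except ValueError:  # covers both bad UTF-8 and bad JSON
        return None
    if not isinstance(message, dict) or not isinstance(message.get("command"), str):
        return None
    return message


def handle_request(data):
    """Server side: every request gets exactly one reply, OK or ERROR."""
    message = decode_command(data)
    if message is None:
        return encode_command("ERROR", reason="malformed request")
    if message["command"] != "FETCH":
        return encode_command("ERROR", reason="unknown command")
    path = message.get("path")
    if not isinstance(path, str) or path not in RESOURCES:
        return encode_command("ERROR", reason="not found")
    return encode_command("OK", size=RESOURCES[path])
```

Adding a new optional field to FETCH requires no change to the parser, which is the kind of extensibility the Cheap side is meant to buy.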

Nasty however is essentially asynchronous, terse, silent, and inflexible. A Nasty message carries minimal information that practically never changes. Your goal as designer is to make this information ultrafast to parse, and possibly even impossible to extend and experiment with. The ideal Nasty pattern looks like this:

It uses a hand-optimized binary layout for data, where every bit is precisely crafted.

It uses a pure asynchronous model where one or both peers send data without acknowledgments (or if they do, they use sneaky asynchronous techniques like credit-based flow control).

It doesn't try, even marginally, to be friendly. Performance is all that matters when you are doing something several million times per second.

A Nasty parser is something you write by hand, which writes or reads bits, bytes, words, and integers individually and precisely. It rejects anything it doesn't like, does no memory allocations at all, and never crashes.

Cheap or Nasty isn't a universal pattern; not all protocols have this dichotomy. Also, how you use Cheap or Nasty will depend on the situation. In some cases, it can be two parts of a single protocol. In other cases, it can be two protocols, one layered on top of the other.
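To make the Nasty side concrete, here is a minimal Python sketch of a fixed-layout header codec. The layout (2-byte signature, 1-byte command, 4-byte sequence number, 8-byte body size) and the names `NASTY_HEADER`, `SIGNATURE`, `encode_header`, and `decode_header` are assumptions invented for this illustration, not part of any ZeroMQ protocol. Python cannot promise zero allocations the way handwritten C can; the sketch only shows the fixed layout and the "reject anything it doesn't like" behavior.

```python
import struct

# Hypothetical layout: signature (2 bytes), command (1 byte),
# sequence (4 bytes), body size (8 bytes), all big-endian, no padding.
NASTY_HEADER = struct.Struct(">HBIQ")
SIGNATURE = 0xAAA1          # made-up magic number for this sketch


def encode_header(command, sequence, size):
    """Pack the three fields into exactly NASTY_HEADER.size bytes."""
    return NASTY_HEADER.pack(SIGNATURE, command, sequence, size)


def decode_header(data):
    """Return (command, sequence, size), or None to reject the frame."""
    if len(data) != NASTY_HEADER.size:
        return None                      # wrong length: reject silently
    signature, command, sequence, size = NASTY_HEADER.unpack(data)
    if signature != SIGNATURE:
        return None                      # not our protocol: reject silently
    return command, sequence, size
```

A decoder like this never tries to explain what went wrong; the caller either drops the frame or closes the connection.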

Error Handling

Using Cheap or Nasty makes error handling rather simpler. You have two kinds of commands and two ways to signal errors:

Synchronous control commands: errors are normal: every request has a response that is either OK or an error response.
Asynchronous data commands: errors are exceptional: bad commands are either discarded silently, or cause the whole connection to be closed.

It's usually good to distinguish a few kinds of errors, but as always keep it minimal and add only what you need.

Serializing Your Data

When we start to design a protocol, one of the first questions we face is how we encode data on the wire. There is no universal answer. There are a half-dozen different ways to serialize data, each with pros and cons. We'll explore some of these.

Abstraction Level

Before looking at how to put data onto the wire, it's worth asking what data we actually want to exchange between applications. If we don't use any abstraction, we literally serialize and deserialize our internal state. That is, the objects and structures we use to implement our functionality.

Putting internal state onto the wire is however a really bad idea. It's like exposing internal state in an API. When you do this, you are hard-coding your implementation decisions into your protocols. You are also going to produce protocols that are significantly more complex than they need to be.

It's perhaps the main reason so many older protocols and APIs are so complex: their designers did not think about how to abstract them into simpler concepts. There is of course no guarantee that an abstraction will be simpler; that's where the hard work comes in.

A good protocol or API abstraction encapsulates natural patterns of use, and gives them names and properties that are predictable and regular. It chooses sensible defaults so that the main use cases can be specified minimally. It aims to be simple for the simple cases, and expressive for the rarer complex cases. It does not make any statements or assumptions about the internal implementation unless that is absolutely needed for interoperability.

ZeroMQ Framing

The simplest and most widely used serialization format for ZeroMQ applications is ZeroMQ's own multipart framing. For example, here is how the Majordomo Protocol defines a request:

Frame 0: Empty frame
Frame 1: "MDPW01" (six bytes, representing MDP/Worker v0.1)
Frame 2: 0x02 (one byte, representing REQUEST)
Frame 3: Client address (envelope stack)
Frame 4: Empty (zero bytes, envelope delimiter)
Frames 5+: Request body (opaque binary)

To read and write this in code is easy, but this is a classic example of a control flow (the whole of MDP is really, as it's a chatty request-reply protocol). When we came to improve MDP for the second version, we had to change this framing. Excellent, we broke all existing implementations!

Backwards compatibility is hard, but using ZeroMQ framing for control flows does not help. Here's how I should have designed this protocol if I'd followed my own advice (and I'll fix this in the next version). It's split into a Cheap part and a Nasty part, and uses the ZeroMQ framing to separate these:

Frame 0: "MDP/2.0" for protocol name and version
Frame 1: command header
Frame 2: command body

Here, we'd expect to parse the command header in the various intermediaries (client API, broker, and worker API), and pass the command body untouched from application to application.
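As a rough sketch of how an intermediary might treat that three-frame layout, the Python below parses the first two frames and hands the third back untouched. It assumes the command header is JSON, which the text does not specify; the function name `route_command` and the header keys in the usage note are likewise hypothetical.

```python
import json


def route_command(frames):
    """Split a three-frame message into (version, header dict, opaque body).

    frames is a list of bytes objects, one per ZeroMQ frame.
    """
    if len(frames) != 3:
        raise ValueError("expected exactly 3 frames")
    protocol, header, body = frames
    name, _, version = protocol.decode("ascii").partition("/")
    if name != "MDP":
        raise ValueError("unknown protocol: %r" % name)
    # Cheap part: parsed by every intermediary, tolerant of extra keys.
    command = json.loads(header.decode("utf-8"))
    # Nasty part: never inspected here, only passed along.
    return version, command, body
```

For example, `route_command([b"MDP/2.0", b'{"command": "request", "service": "echo"}', payload])` returns the version string, the header as a dict, and `payload` itself, so new header fields can appear without any change to how the body is carried.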

Serialization Languages

Serialization languages have their fashions. XML used to be big as in popular, then it got big as in over-engineered, and then it fell into the hands of "Enterprise Information Architects" and it's not been seen alive since. Today's XML is the epitome of "somewhere in that mess is a small, elegant language trying to escape".

Still XML was way, way better than its predecessors, which included such monsters as the Standard Generalized Markup Language (SGML), which in turn was a cool breeze compared to mind-torturing beasts like EDIFACT. So the history of serialization languages seems to be of gradually emerging sanity, hidden by waves of revolting EIAs doing their best to hold onto their jobs.

JSON popped out of the JavaScript world as a quick-and-dirty "I'd rather resign than use XML here" way to throw data onto the wire and get it back again. JSON is just minimal XML expressed, sneakily, as JavaScript source code.

Here's a simple example of using JSON in a Cheap protocol:

{
    "protocol": {
        "name": "MTL",
        "version": 1
    },
    "virtual-host": "test-env"
}

The same data in XML would be (XML forces us to invent a single top-level entity):

<command>
    <protocol name = "MTL" version = "1" />
    <virtual-host>test-env</virtual-host>
</command>

And here it is using plain-old HTTP-style headers:

Protocol: MTL/1.0
Virtual-host: test-env

These are all pretty equivalent as long as you don't go overboard with validating parsers, schemas, and other "trust us, this is all for your own good" nonsense. A Cheap serialization language gives you space for experimentation for free ("ignore any elements/attributes/headers that you don't recognize"), and it's simple to write generic parsers that, for example, thunk a command into a hash table, or vice versa.
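A generic parser of that kind fits in a few lines. This Python sketch turns HTTP-style headers into a dictionary and back; the function names `parse_headers` and `format_headers`, the lower-casing of keys, and the `X-New-Thing` header in the usage note are choices made for this illustration only.

```python
def parse_headers(text):
    """Parse 'Name: value' lines into a dict keyed by lower-cased name."""
    command = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        if sep:                          # skip blank or malformed lines
            command[name.strip().lower()] = value.strip()
    return command


def format_headers(command):
    """Turn a dict back into 'Name: value' lines."""
    return "".join("%s: %s\n" % (name.capitalize(), value)
                   for name, value in command.items())
```

Given `"Protocol: MTL/1.0\nVirtual-host: test-env\nX-New-Thing: 42\n"`, the parser keeps all three entries, and a peer that has never heard of `x-new-thing` simply never looks it up. That is the "ignore what you don't recognize" rule at no extra cost.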

However, it's not all roses. While modern scripting languages support JSON and XML easily enough, older languages do not. If you use XML or JSON, you create nontrivial dependencies. It's also somewhat of a pain to work with tree-structured data in a language like C.

So you can drive your choice according to the languages for which you're aiming. If your universe is a scripting language, then go for JSON. If you are aiming to build protocols for wider system use, keep things simple for C developers and stick to HTTP-style headers.

Serialization Libraries

The msgpack.org site says:

I'm going to make the perhaps unpopular claim that "fast and small" are features that solve non-problems. The only real problem that serialization libraries solve is, as far as I can tell, the need to document the message contracts and actually serialize data to and from the wire.

Let's start by debunking "fast and small". It's based on a two-part argument. First, that making your messages smaller and reducing CPU cost for encoding and decoding will make a significant difference to your application's performance. Second, that this is equally valid across the board to all messages.

But most real applications tend to fall into one of two categories. Either the speed of serialization and size of encoding is marginal compared to other costs, such as database access or application code performance. Or, network performance really is critical, and then all significant costs occur in a few specific message types.

Thus, aiming for "fast and small" across the board is a false optimization. You neither get the easy flexibility of Cheap for your infrequent control flows, nor do you get the brutal efficiency of Nasty for your high-volume data flows. Worse, the assumption that all messages are equal in some way can corrupt your protocol design. Cheap or Nasty isn't only about serialization strategies, it's also about synchronous versus asynchronous, error handling and the cost of change.

My experience is that most performance problems in message-based applications can be solved by (a) improving the application itself and (b) hand-optimizing the high-volume data flows. And to hand-optimize your most critical data flows, you need to cheat: to learn and exploit facts about your data, something general purpose serializers cannot do.

Now let's address documentation and the need to write our contracts explicitly and formally, rather than only in code. This is a valid problem to solve, indeed one of the main ones if we're to build a long-lasting, large-scale message-based architecture.

Here is how we describe a typical message using the MessagePack interface definition language (IDL):

message Person {
    1: string surname
    2: string firstname
    3: optional string email
}

Now, the same message using the Google protocol buffers IDL:

message Person {
    required string surname = 1
    required string firstname = 2
    optional string email = 3
}

It works, but in most practical cases wins you little over a serialization language backed by decent specifications written by hand or produced mechanically (we'll come to this). The price you'll pay is an extra dependency and quite probably, worse overall performance than if you used Cheap or Nasty.

Handwritten Binary Serialization

As you'll gather from this book, my preferred language for systems programming is C (upgraded to C99, with a constructor/destructor API model and generic containers). There are two reasons I like this modernized C language. First, I'm too weak-minded to learn a big language like C++. Life just seems filled with more interesting things to understand. Second, I find that this specific level of manual control lets me produce better results, faster.

The point here isn't C versus C++, but the value of manual control for high-end professional users. It's no accident that the best cars, cameras, and espresso machines in the world have manual controls. That level of on-the-spot fine tuning often makes the difference between world-class success, and being second best.

When you are really, truly concerned about the speed of serialization and/or the size of the result (often these contradict each other), you need handwritten binary serialization. In other words, let's hear it for Mr. Nasty!

Your basic process for writing an efficient Nasty encoder/decoder (codec) is:

Build representative data sets and test applications that can stress test your codec.
Write a first dumb version of the codec.
Test, measure, improve, and repeat until you run out of time and/or money.

Here are some of the techniques we use to make our codecs better:

Use a profiler. There's simply no way to know what your code is doing until you've profiled it for function counts and for CPU cost per function. When you find your hot spots, fix them.

Eliminate memory allocations. The heap is very fast on a modern Linux kernel, but it's still the bottleneck in most naive codecs. On older kernels, the heap can be tragically slow. Use local variables (the stack) instead of the heap where you can.

Test on different platforms and with different compilers and compiler options. Apart from the heap, there are many other differences. You need to learn the main ones, and allow for them.

Use state to compress better. If you are concerned about codec performance, you are almost definitely sending the same kinds of data many times. There will be redundancy between instances of data. You can detect these and use that to compress (e.g., a short value that means "same as last time").

Know your data. The best compression techniques (in terms of CPU cost for compactness) require knowing about the data. For example, the techniques used to compress a word list, a video, and a stream of stock market data are all different.

Be ready to break the rules. Do you really need to encode integers in big-endian network byte order? x86 and ARM account for almost all modern CPUs, yet use little-endian (ARM is actually bi-endian but Android, like Windows and iOS, is little-endian).
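Two of these techniques, "use state to compress" and "be ready to break the rules", can be shown together in a small sketch. The wire format here is invented for illustration: each record is a length-prefixed ASCII symbol followed by a 4-byte little-endian price, and a length byte of zero means "same symbol as last time". The names `encode_ticks` and `decode_ticks` are hypothetical, and a production codec of this kind would be written in C as the text suggests.

```python
import struct


def encode_ticks(ticks):
    """Encode (symbol, price) pairs; a zero length byte repeats the last symbol."""
    out = bytearray()
    last = None
    for symbol, price in ticks:
        if symbol == last:
            out.append(0)                    # one byte instead of the full symbol
        else:
            raw = symbol.encode("ascii")
            if not 0 < len(raw) < 256:
                raise ValueError("symbol must be 1..255 ASCII bytes")
            out.append(len(raw))
            out += raw
            last = symbol
        out += struct.pack("<I", price)      # little-endian on purpose
    return bytes(out)


def decode_ticks(data):
    """Inverse of encode_ticks; rejects a repeat marker with nothing to repeat."""
    ticks, pos, last = [], 0, None
    while pos < len(data):
        length = data[pos]
        pos += 1
        if length:
            last = data[pos:pos + length].decode("ascii")
            pos += length
        elif last is None:
            raise ValueError("repeat marker before any symbol")
        (price,) = struct.unpack_from("<I", data, pos)
        pos += 4
        ticks.append((last, price))
    return ticks
```

With three consecutive ticks for the same three-letter symbol, the second and third records shrink from 8 bytes to 5. The saving depends entirely on knowing that real tick streams repeat symbols, which is the kind of fact a general purpose serializer cannot exploit.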

Code Generation

Reading the previous two sections, you might have wondered, "could I write my own IDL generator that was better than a general purpose one?" If this thought wandered into your mind, it probably left pretty soon after, chased by dark calculations about how much work that actually involved.

What if I told you of a way to build custom IDL generators cheaply and quickly? You can have a way to get perfectly documented contracts, code that is as evil and domain-specific as you need it to be, and all you need to do is sign away your soul (who ever really used that, am I right?) just here.

At iMatix, until a few years ago, we used code generation to build ever larger and more ambitious systems until we decided the technology (GSL) was too dangerous for common use, and we sealed the archive and locked it with heavy chains in a deep dungeon. We actually posted it on GitHub. If you want to try the examples that are coming up, grab the repository and build yourself a gsl command. Typing "make" in the src subdirectory should do it (and if you're that guy who loves Windows, I'm sure you'll send a patch with project files).

This section isn't really about GSL at all, but about a little-known trick that's useful for ambitious architects who want to scale themselves, as well as their work. Once you learn the trick, you can whip up your own code generators in a short time. The code generators most software engineers know about come with a single hard-coded model. For instance, Ragel "compiles executable finite state machines from regular languages", i.e., Ragel's model is a regular language. This certainly works for a good set of problems, but it's far from universal. How do you describe an API in Ragel? Or a project makefile? Or even a finite-state machine like the one we used to design the Binary Star pattern in Chapter 4 - Reliable Request-Reply Patterns?

All these would benefit from code generation, but there's no universal model. So the trick is to design your own models as you need them, and then make code generators as cheap compilers for that model. You need some experience in how to make good models, and you need a technology that makes it cheap to build custom code generators. A scripting language, such as Perl or Python, is a good option. However, we actually built GSL specifically for this, and that's what I prefer.

Let's take a simple example that ties into what we already know. We'll see more extensive examples later, because I really do believe that code generation is crucial knowledge for large-scale work. In Chapter 4 - Reliable Request-Reply Patterns, we developed the Majordomo Protocol (MDP), and wrote clients, brokers, and workers for that. Now could we generate those pieces mechanically, by building our own interface description language and code generators?

When we write a GSL model, we can use any semantics we like, in other words we can invent domain-specific languages on the spot. I'll invent a couple; see if you can guess what they represent:

slideshow
    name = Cookery level 3
    page
        title = French Cuisine
        item = Overview
        item = The historical cuisine
        item = The nouvelle cuisine
        item = Why the French live longer
    page
        title = Overview
        item = Soups and salads
        item = Le plat principal
        item = Béchamel and other sauces
        item = Pastries, cakes, and quiches
        item = Soufflé: cheese to strawberry

How about this one:

table
    name = person
    column
        name = firstname
        type = string
    column
        name = lastname
        type = string
    column
        name = rating
        type = integer

We could compile the first into a presentation. The second, we could compile into SQL to create and work with a database table. So for this exercise, our domain language, our model, consists of "classes" that contain "messages" that contain "fields" of various types. It's deliberately familiar. Here is the MDP client protocol:

<class name = "mdp_client">
    MDP/Client
    <header>
        <field name = "empty" type = "string" value = ""
            >Empty frame</field>
        <field name = "protocol" type = "string" value = "MDPC01"
            >Protocol identifier</field>
    </header>
    <message name = "request">
        Client request to broker
        <field name = "service" type = "string">Service name</field>
        <field name = "body" type = "frame">Request body</field>
    </message>
    <message name = "reply">
        Response back to client
        <field name = "service" type = "string">Service name</field>
        <field name = "body" type = "frame">Response body</field>
    </message>
</class>

And here is the MDP worker protocol:

<class name = "mdp_worker">
    MDP/Worker
    <header>
        <field name = "empty" type = "string" value = ""
            >Empty frame</field>
        <field name = "protocol" type = "string" value = "MDPW01"
            >Protocol identifier</field>
        <field name = "id" type = "octet">Message identifier</field>
    </header>
    <message name = "ready" id = "1">
        Worker tells broker it is ready
        <field name = "service" type = "string">Service name</field>
    </message>
    <message name = "request" id = "2">
        Client request to broker
        <field name = "client" type = "frame">Client address</field>
        <field name = "body" type = "frame">Request body</field>
    </message>
    <message name = "reply" id = "3">
        Worker returns reply to broker
        <field name = "client" type = "frame">Client address</field>
        <field name = "body" type = "frame">Request body</field>
    </message>
    <message name = "hearbeat" id = "4">
        Either peer tells the other it's still alive
    </message>
    <message name = "disconnect" id = "5">
        Either peer tells other the party is over
    </message>
</class>

GSL uses XML as its modeling language. XML has a poor reputation, having been dragged through too many enterprise sewers to smell sweet, but it has some strong positives, as long as you keep it simple. Any way to write a self-describing hierarchy of items and attributes would work.

Now here is a short IDL generator written in GSL that turns our protocol models into documentation:

.#  Trivial IDL generator (specs.gsl)
.#
.output "$(class.name).md"
## The $(string.trim (class.?''):left) Protocol
.for message
.   frames = count (class->header.field) + count (field)

A $(message.NAME) command consists of a multipart message of $(frames)
frames:

.   for class->header.field
.       if name = "id"
* Frame $(item ()): 0x$(message.id:%02x) (1 byte, $(message.NAME))
.       else
* Frame $(item ()): "$(value:)" ($(string.length ("$(value)")) \
bytes, $(field.:))
.       endif
.   endfor
.   index = count (class->header.field) + 1
.   for field
* Frame $(index): $(field.?'') \
.       if type = "string"
(printable string)
.       elsif type = "frame"
(opaque binary)
.           index += 1
.       else
.           echo "E: unknown field type: $(type)"
.       endif
.       index += 1
.   endfor
.endfor

The XML models and this script are in the subdirectory examples/models. To do the code generation, I give this command:

gsl -script:specs mdp_client.xml mdp_worker.xml

Here is the Markdown text we get for the worker protocol:

## The MDP/Worker Protocol

A READY command consists of a multipart message of 4 frames:

* Frame 1: "" (0 bytes, Empty frame)
* Frame 2: "MDPW01" (6 bytes, Protocol identifier)
* Frame 3: 0x01 (1 byte, READY)
* Frame 4: Service name (printable string)

A REQUEST command consists of a multipart message of 5 frames:

* Frame 1: "" (0 bytes, Empty frame)
* Frame 2: "MDPW01" (6 bytes, Protocol identifier)
* Frame 3: 0x02 (1 byte, REQUEST)
* Frame 4: Client address (opaque binary)
* Frame 6: Request body (opaque binary)

A REPLY command consists of a multipart message of 5 frames:

* Frame 1: "" (0 bytes, Empty frame)
* Frame 2: "MDPW01" (6 bytes, Protocol identifier)
* Frame 3: 0x03 (1 byte, REPLY)
* Frame 4: Client address (opaque binary)
* Frame 6: Request body (opaque binary)

A HEARBEAT command consists of a multipart message of 3 frames:

* Frame 1: "" (0 bytes, Empty frame)
* Frame 2: "MDPW01" (6 bytes, Protocol identifier)
* Frame 3: 0x04 (1 byte, HEARBEAT)

A DISCONNECT command consists of a multipart message of 3 frames:

* Frame 1: "" (0 bytes, Empty frame)
* Frame 2: "MDPW01" (6 bytes, Protocol identifier)
* Frame 3: 0x05 (1 byte, DISCONNECT)

This, as you can see, is close to what I wrote by hand in the original spec. Now, if you have cloned the zguide repository and you are looking at the code in examples/models, you can generate the MDP client and worker codecs. We pass the same two models to a different code generator:

gsl -script:codec_c mdp_client.xml mdp_worker.xml

Which gives us mdp_client and mdp_worker classes. Actually MDP is so simple that it's barely worth the effort of writing the code generator. The profit comes when we want to change the protocol (which we did for the standalone Majordomo project). You modify the protocol, run the command, and out pops more perfect code.

The codec_c.gsl code generator is not short, but the resulting codecs are much better than the handwritten code I originally put together for Majordomo. For instance, the handwritten code had no error checking and would die if you passed it bogus messages.
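The trick itself is not tied to GSL. As a rough illustration in a scripting language, this Python sketch reads a protocol model in the same XML shape and emits Markdown documentation. It is not a port of specs.gsl: it handles only the "string" and "frame" field types, ignores the "id" header field used by the worker model, and the names `MODEL`, `KINDS`, and `generate_spec` are invented here.

```python
import xml.etree.ElementTree as ET

# A cut-down copy of the MDP client model, inlined so the sketch is self-contained.
MODEL = """
<class name="mdp_client">
    MDP/Client
    <header>
        <field name="empty" type="string" value="">Empty frame</field>
        <field name="protocol" type="string" value="MDPC01">Protocol identifier</field>
    </header>
    <message name="request">
        Client request to broker
        <field name="service" type="string">Service name</field>
        <field name="body" type="frame">Request body</field>
    </message>
</class>
"""

KINDS = {"string": "printable string", "frame": "opaque binary"}


def generate_spec(model_xml):
    """Turn a protocol model into Markdown, one paragraph and list per message."""
    root = ET.fromstring(model_xml)
    header = root.findall("header/field")
    lines = ["## The %s Protocol" % (root.text or "").strip(), ""]
    for message in root.findall("message"):
        fields = message.findall("field")
        lines.append("A %s command consists of a multipart message of %d frames:"
                     % (message.get("name").upper(), len(header) + len(fields)))
        lines.append("")
        index = 1
        for field in header:                 # fixed frames carry literal values
            value = field.get("value", "")
            lines.append('* Frame %d: "%s" (%d bytes, %s)'
                         % (index, value, len(value), field.text.strip()))
            index += 1
        for field in fields:                 # message-specific frames
            lines.append("* Frame %d: %s (%s)"
                         % (index, field.text.strip(), KINDS[field.get("type")]))
            index += 1
        lines.append("")
    return "\n".join(lines)
```

Calling `print(generate_spec(MODEL))` produces a heading and a four-frame list for the REQUEST command. Adding a message to the model changes the documentation without touching the generator, which is the whole point of keeping the model and the generator separate.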

I'm now going to explain the pros and cons of GSL-powered model-oriented code generation. Power does not come for free and one of the greatest traps in our business is the ability to invent concepts out of thin air. GSL makes this particularly easy, so it can be an equally dangerous tool.

Do not invent concepts. The job of a designer is to remove problems, not add features.

Firstly, I will lay out the advantages of model-oriented code generation:

You can create near-perfect abstractions that map to your real world. So, our protocol model maps 100% to the "real world" of Majordomo. This would be impossible without the freedom to tune and change the model in any way.
You can develop these perfect models quickly and cheaply.
You can generate any text output. From a single model, you can create documentation, code in any language, test tools: literally any output you can think of.
You can generate (and I mean this literally) perfect output because it's cheap to improve your code generators to any level you want.
You get a single source that combines specifications and semantics.
You can leverage a small team to a massive size. At iMatix, we produced the million-line OpenAMQ messaging product out of perhaps 85K lines of input models, including the code generation scripts themselves.

Now let's look at the disadvantages:

You add tool dependencies to your project.
You may get carried away and create models for the pure joy of creating them.
You may alienate newcomers, who will see "strange stuff", from your work.
You may give people a strong excuse not to invest in your project.

Cynically, model-oriented abuse works great in environments where you want to produce huge amounts of perfect code that you can maintain with little effort and which no one can ever take away from you. Personally, I like to cross my rivers and move on. But if long-term job security is your thing, this is almost perfect.

So if you do use GSL and want to create open communities around your work, here is my advice:

Use it only where you would otherwise be writing tiresome code by hand.
Design natural models that are what people would expect to see.
Write the code by hand first so you know what to generate.
Do not overuse. Keep it simple! Do not get too meta!!
Introduce gradually into a project.
Put the generated code into your repositories.

We're already using GSL in some projects around ZeroMQ. For example, the high-level C binding, CZMQ, uses GSL to generate the socket options class (zsockopt). A 300-line code generator turns 78 lines of XML model into 1,500 lines of perfect, but really boring code. That's a good win.

Transferring Files

Let's take a break from the lecturing and get back to our first love and the reason for doing all of this: code.

"How do I send a file?" is a common question on the ZeroMQ mailing lists. This should not be surprising, because file transfer is perhaps the oldest and most obvious type of messaging. Sending files around networks has lots of use cases apart from annoying the copyright cartels. ZeroMQ is very good out of the box at sending events and tasks, but less good at sending files.

I've promised, for a year or two, to write a proper explanation. Here's a gratuitous piece of information to brighten your morning: the word "proper" comes from the archaic French propre, which means "clean". The dark age English common folk, not being familiar with hot water and soap, changed the word to mean "foreign" or "upper-class", as in "that's proper food!", but later the word came to mean just "real", as in "that's a proper mess you've gotten us into!"

So, file transfer. There are several reasons you can't just pick up a random file, blindfold it, and shove it whole into a message. The most obvious reason being that despite decades of determined growth in RAM sizes (and who among us old-timers doesn't fondly remember saving up for that 1024-byte memory extension card?!), disk sizes obstinately remain much larger. Even if we could send a file with one instruction (say, using a system call like sendfile), we'd hit the reality that networks are not infinitely fast nor perfectly reliable. After trying to upload a large file several times on a slow flaky network (WiFi, anyone?), you'll realize that a proper file transfer protocol needs a way to recover from failures. That is, it needs a way to send only the part of a file that wasn't yet received.

Finally, after all this, if you build a proper file server, you'll notice that simply sending massive amounts of data to lots of clients creates that situation we like to call, in the technical parlance, "server went belly-up due to all available heap memory being eaten by a poorly designed application". A proper file transfer protocol needs to pay attention to memory use.

We'll solve these problems properly, one-by-one, which should hopefully get us to a good and proper file transfer protocol running over ZeroMQ. First, let's generate a 1GB test file with random data (real power-of-two giga like Von Neumann intended, not the fake silicon ones the memory industry likes to sell):

dd if=/dev/urandom of=testdata bs=1M count=1024

This is large enough to be troublesome when we have lots of clients asking for the same file at once, and on many machines, 1GB is going to be too large to allocate in memory anyhow. As a base reference, let's measure how long it takes to copy this file from disk back to disk. This will tell us how much our file transfer protocol adds on top (including network costs):

$ time cp testdata testdata2

real    0m7.143s
user    0m0.012s
sys     0m1.188s

The 4-figure precision is misleading; expect variations of 25% either way. This is just an "order of magnitude" measurement.

Here's our first cut at the code, where the client asks for the test data and the server just sends it, without stopping for breath, as a series of messages, where each message holds one chunk:

fileio1: File transfer test, model 1 in C

It's pretty simple, but we already run into a problem: if we send too much data to the ROUTER socket, we can easily overflow it. The simple but stupid solution is to put an infinite high-water mark on the socket. It's stupid because we now have no protection against exhausting the server's memory. Yet without an infinite HWM, we risk losing chunks of large files.

Try this: set the HWM to 1,000 (in ZeroMQ v3.x this is the default) and then reduce the chunk size to 100K so we send 10K chunks in one go. Run the test, and you'll see it never finishes. As the zmq_socket() man page says with cheerful brutality, for the ROUTER socket: "ZMQ_HWM option action: Drop".

We have to control the amount of data the server sends up-front. There's no point in it sending more than the network can handle. Let's try sending one chunk at a time. In this version of the protocol, the client will explicitly say, "Give me chunk N", and the server will fetch that specific chunk from disk and send it.
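The "give me chunk N" idea needs nothing more than a seek and a read on the server side. This Python sketch shows that core with no sockets at all: a direct function call stands in for the request-reply exchange, and the names `read_chunk`, `fetch_file`, and `CHUNK_SIZE` (and its value) are assumptions for this illustration rather than anything taken from the C examples.

```python
CHUNK_SIZE = 250_000        # bytes per chunk; an assumed value for this sketch


def read_chunk(path, chunk_number, chunk_size=CHUNK_SIZE):
    """Server side: return chunk N (0-based) of the file, b"" if past the end."""
    with open(path, "rb") as f:
        f.seek(chunk_number * chunk_size)
        return f.read(chunk_size)


def fetch_file(path, chunk_size=CHUNK_SIZE):
    """Client side: ask for chunks 0, 1, 2... until a short chunk arrives.

    Returns (chunks received, total bytes).
    """
    chunks = total = 0
    while True:
        chunk = read_chunk(path, chunks, chunk_size)   # one request, one reply
        chunks += 1
        total += len(chunk)
        if len(chunk) < chunk_size:                    # short chunk ends the file
            return chunks, total
```

Because every request names its chunk, the server keeps no per-client state, and a client that lost its connection can resume simply by asking for the next chunk number it lacks.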

Here's the improved second model, where the client asks for one chunk at a time, and the server only sends one chunk for each request it gets from the client:

fileio2: File transfer test, model 2 in C

It is much slower now, because of the to-and-fro chatting between client and server. We pay about 300 microseconds for each request-reply round-trip, on a local loop connection (client and server on the same box). It doesn't sound like much but it adds up quickly:

$ time ./fileio1
4296 chunks received, 1073741824 bytes

real    0m0.669s
user    0m0.056s
sys     0m1.048s

$ time ./fileio2
4295 chunks received, 1073741824 bytes

real    0m2.389s
user    0m0.312s
sys     0m2.136s

There are two valuable lessons here. First, while request-reply is easy, it's also too slow for high-volume data flows. Paying that 300 microseconds once would be fine. Paying it for every single chunk isn't acceptable, particularly on real networks with latencies of perhaps 1,000 times higher.
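The arithmetic behind "it adds up quickly" is worth spelling out. This back-of-envelope Python uses only the figures quoted above (4,295 chunks, roughly 300 microseconds per round-trip, and a network 1,000 times slower); the function name `chatter_cost` is made up, and the result is an estimate of pure waiting time, not a measurement.

```python
CHUNKS = 4295               # chunks in the fileio2 run
ROUND_TRIP = 300e-6         # seconds per request-reply on the local loop


def chatter_cost(chunks, round_trip):
    """Seconds spent purely waiting on one round-trip per chunk."""
    return chunks * round_trip


local = chatter_cost(CHUNKS, ROUND_TRIP)            # roughly 1.3 seconds
remote = chatter_cost(CHUNKS, ROUND_TRIP * 1000)    # roughly 21 minutes
print("local loop: %.2f s, 1,000x latency: %.1f min" % (local, remote / 60))
```

The local estimate is in the same range as the gap between the two measured runs, which suggests most of the slowdown really is round-trip latency. At 1,000 times the latency, the same 1GB transfer would spend over twenty minutes doing nothing but waiting.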

The second point is something I've said before but will repeat: it's incredibly easy to experiment, measure, and improve a protocol over ZeroMQ. And when the cost of something comes way down, you can afford a lot more of it. Do learn to develop and prove your protocols in isolation: I've seen teams waste time trying to improve poorly designed protocols that are too deeply embedded in applications to be easily testable or fixable.

Our model-two file transfer protocol isn't so bad, apart from performance:

It completely eliminates any risk of memory exhaustion. To prove that, we set the high-water mark to 1 in both sender and receiver.
It lets the client choose the chunk size, which is useful because if there's any tuning of the chunk size to be done, for network conditions, for file types, or to reduce memory consumption further, it's the client that should be doing this.
It gives us fully restartable file transfers.
It allows the client to cancel the file transfer at any point in time.

If we just didn't have to do a request for each chunk, it'd be a usable protocol. What we need is a way for the server to send multiple chunks without waiting for the client to request or acknowledge each one. What are our choices?

The server could send 10 chunks at once, then wait for a single acknowledgment. That's exactly like multiplying the chunk size by 10, so it's pointless. And yes, it's just as pointless for all values of 10.

The server could send chunks without any chatter from the client but with a slight delay between each send, so that it would send chunks only as fast as the network could handle them. This would require the server to know what's happening at the network layer, which sounds like hard work. It also breaks layering horribly. And what happens if the network is really fast, but the client itself is slow? Where are chunks queued then?

The server could try to spy on the sending queue, i.e., see how full it is, and send only when the queue isn't full. Well, ZeroMQ doesn't allow that because it doesn't work, for the same reason as throttling doesn't work. The server and network may be more than fast enough, but the client may be a slow little device.

We could modify libzmq to take some other action on reaching HWM. Perhaps it could block? That would mean that a single slow client would block the whole server, so no thank you. Maybe it could return an error to the caller? Then the server could do something smart like... well, there isn't really anything it could do that's any better than dropping the message.

Apart from being complex and variously unpleasant, none of these options would even work. What we need is a way for the client to tell the server, asynchronously and in the background, that it's ready for more. We need some kind of asynchronous flow control. If we do this right, data should flow without interruption from the server to the client, but only as long as the client is reading it. Let's review our three protocols. This was the first one:

C: fetch
S: chunk 1
S: chunk 2
S: chunk 3
....

And the second introduced a request for each chunk:

C: fetch chunk 1
S: send chunk 1
C: fetch chunk 2
S: send chunk 2
C: fetch chunk 3
S: send chunk 3
C: fetch chunk 4
....

Now (waves hands mysteriously) here's a changed protocol that fixes the performance problem:

C: fetch chunk 1
C: fetch chunk 2
C: fetch chunk 3
S: send chunk 1
C: fetch chunk 4
S: send chunk 2
S: send chunk 3
....

It looks suspiciously similar. In fact, it's identical except that we send multiple requests without waiting for a reply for each one. This is a technique called "pipelining" and it works because our DEALER and ROUTER sockets are fully asynchronous.

Here's the third model of our file transfer test bench, with pipelining. The client sends a number of requests ahead (the "credit") and then each time it processes an incoming chunk, it sends one more credit. The server will never send more chunks than the client has asked for:

fileio3: File transfer test, model 3 in C

That tweak gives us full control over the end-to-end pipeline including all network buffers and ZeroMQ queues at sender and receiver. We ensure the pipeline is always filled with data while never growing beyond a predefined limit. More than that, the client decides exactly when to send "credit" to the sender. It could be when it receives a chunk, or when it has fully processed a chunk. And this happens asynchronously, with no significant performance cost.
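The credit mechanism can be simulated without any sockets. What follows is a toy model in plain Python, not the fileio3 program: two deques stand in for the ZeroMQ queues between client and server, and `pipelined_fetch` and its parameters are names made up for this sketch.

```python
from collections import deque


def pipelined_fetch(total_chunks, pipeline=10):
    """Simulate the pipelined model; return (chunks received, peak chunks in flight)."""
    requests = deque()      # client -> server: chunk numbers asked for
    in_flight = deque()     # server -> client: chunks sent but not yet read
    next_request = received = peak = 0
    credit = pipeline
    while received < total_chunks:
        # Client: spend all available credit on new requests.
        while credit and next_request < total_chunks:
            requests.append(next_request)
            next_request += 1
            credit -= 1
        # Server: answer every request it holds, and never more than that.
        while requests:
            in_flight.append(requests.popleft())
        peak = max(peak, len(in_flight))
        # Client: process one chunk, which frees one unit of credit.
        in_flight.popleft()
        received += 1
        credit += 1
    return received, peak
```

However many chunks the file has, the peak number queued toward the client never exceeds the pipeline size, which is the bound that makes memory use predictable.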

In the third model, I chose a pipeline size of 10 messages (each message is a chunk). This will cost a maximum of 2.5MB memory per client. So with 1GB of memory we can handle at least 400 clients. We can try to calculate the ideal pipeline size. It takes about 0.7 seconds to send the 1GB file, which is about 160 microseconds for a chunk. A round trip is 300 microseconds, so the pipeline needs to be at least 3-5 chunks to keep the server busy. In practice, I still got performance spikes with a pipeline of 5 chunks, probably because the credit messages sometimes get delayed by outgoing data. So at 10 chunks, it works consistently.
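Those figures can be checked with a few lines of Python. The 250,000-byte chunk size is an inference from "10 chunks cost 2.5MB", and the function names are invented for this sketch. The bare ratio of round-trip time to per-chunk time comes out at about 2; the larger pipeline sizes discussed in the text leave headroom on top of that minimum.

```python
import math

CHUNK_SIZE = 250_000        # bytes; implied by 10 chunks costing 2.5MB
PIPELINE = 10               # chunks of credit per client


def memory_per_client(pipeline=PIPELINE, chunk_size=CHUNK_SIZE):
    """Worst-case bytes queued for one client."""
    return pipeline * chunk_size


def max_clients(memory_bytes, pipeline=PIPELINE, chunk_size=CHUNK_SIZE):
    """How many clients fit in a given amount of memory."""
    return memory_bytes // memory_per_client(pipeline, chunk_size)


def chunks_per_round_trip(transfer_seconds, chunks, round_trip_seconds):
    """Chunks the server can push out while one credit message makes the trip."""
    per_chunk = transfer_seconds / chunks
    return math.ceil(round_trip_seconds / per_chunk)
```

With these definitions, `max_clients(2**30)` gives 429, consistent with "at least 400 clients", and `chunks_per_round_trip(0.7, 4296, 300e-6)` gives 2.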

$ time ./fileio3
4291 chunks received, 1072741824 bytes

real    0m0.777s
user    0m0.096s
sys     0m1.120s

Do measure rigorously. Your calculations may be good, but the real world tends to have its own opinions.

What we've made is clearly not yet a real file transfer protocol, but it proves the pattern and I think it is the simplest plausible design. For a real working protocol, we might want to add some or all of:

Authentication and access controls, even without encryption: the point isn't to protect sensitive data, but to catch errors like sending test data to production servers.

A Cheap-style request including file path, optional compression, and other stuff we've learned is useful from HTTP (such as If-Modified-Since).

A Cheap-style response, at least for the first chunk, that provides meta data such as file size (so the client can pre-allocate, and avoid unpleasant disk-full situations).

The ability to fetch a set of files in one go, otherwise the protocol becomes inefficient for large sets of small files.

Confirmation from the client when it's fully received a file, to recover from chunks that might be lost if the client disconnects unexpectedly.

So far, our semantic has been "fetch"; that is, the recipient knows (somehow) that they need a specific file, so they ask for it. The knowledge of which files exist and where they are is then passed out-of-band (e.g., in HTTP, by links in the HTML page).

Howabouta"push"semantic?Therearetwoplausibleusecasesforthis.First,ifweadoptacentralizedarchitecturewithfileson
amain"server"(notsomethingI'madvocating,butpeopledosometimeslikethis),thenit'sveryusefultoallowclientstoupload
filestotheserver.Second,itletsusdoakindofpubsubforfiles,wheretheclientasksforallnewfilesofsometypeasthe
servergetsthese,itforwardsthemtotheclient.

Afetchsemanticissynchronous,whileapushsemanticisasynchronous.Asynchronousislesschatty,sofaster.Also,youcan
docutethingslike"subscribetothispath"thuscreatingapubsubfiletransferarchitecture.ThatissoobviouslyawesomethatI
shouldn'tneedtoexplainwhatproblemitsolves.

Still,hereistheproblemwiththefetchsemantic:thatoutofbandroutetotellclientswhatfilesexist.Nomatterhowyoudothis,it
endsupbeingcomplex.Eitherclientshavetopoll,oryouneedaseparatepubsubchanneltokeepclientsuptodate,oryou
needuserinteraction.

Thinkingthisthroughalittlemore,though,wecanseethatfetchisjustaspecialcaseofpubsub.Sowecangetthebestofboth
worlds.Hereisthegeneraldesign:

Fetchthispath
Hereiscredit(repeat)

To make this work (and we will, my dear readers), we need to be a little more explicit about how we send credit to the server. The
cute trick of treating a pipelined "fetch chunk" request as credit won't fly because the client doesn't know any longer what files
actually exist, how large they are, anything. If the client says, "I'm good for 250,000 bytes of data", this should work equally for 1
file of 250K bytes, or 100 files of 2,500 bytes.

And this gives us "credit-based flow control", which effectively removes the need for high-water marks, and any risk of memory
overflow.
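
To make the idea of credit concrete, here is a minimal sketch of the bookkeeping a sender might do. It is not taken from the FileMQ sources: the names credit_t, credit_grant, and credit_consume are hypothetical, and a real server would keep one such ledger per client next to its socket state.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

//  Hypothetical per-client credit ledger kept by the sender
typedef struct {
    uint64_t credit;        //  Bytes the client has said it can accept
} credit_t;

//  The client granted more credit, e.g., "I'm good for 250,000 bytes"
static void
credit_grant (credit_t *self, uint64_t bytes)
{
    assert (self);
    self->credit += bytes;
}

//  Decide how many bytes of the next chunk may be sent right now.
//  Returns the allowed size (possibly zero) and debits the ledger.
static size_t
credit_consume (credit_t *self, size_t wanted)
{
    assert (self);
    size_t allowed = wanted;
    if ((uint64_t) allowed > self->credit)
        allowed = (size_t) self->credit;
    self->credit -= allowed;
    return allowed;
}
```

Because the ledger counts bytes rather than chunks or files, the same grant covers one large file or many small ones, and the sender never queues more data than the client asked for.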

State Machines

Software engineers tend to think of (finite) state machines as a kind of intermediary interpreter. That is, you take a regular
language and compile that into a state machine, then execute the state machine. The state machine itself is rarely visible to the
developer: it's an internal representation, optimized, compressed, and bizarre.

However, it turns out that state machines are also valuable as a first-class modeling language for protocol handlers, e.g.,
ZeroMQ clients and servers. ZeroMQ makes it rather easy to design protocols, but we've never defined a good pattern for writing
those clients and servers properly.

A protocol has at least two levels:

How we represent individual messages on the wire.
How messages flow between peers, and the significance of each message.

We've seen in this chapter how to produce codecs that handle serialization. That's a good start. But if we leave the second job to
developers, that gives them a lot of room to interpret. As we make more ambitious protocols (file transfer + heartbeating + credit +
authentication), it becomes less and less sane to try to implement clients and servers by hand.

Yes, people do this almost systematically. But the costs are high, and they're avoidable. I'll explain how to model protocols using
state machines, and how to generate neat and solid code from those models.

My experience with using state machines as a software construction tool dates to 1985 and my first real job making tools for
application developers. In 1991, I turned that knowledge into a free software tool called Libero, which spat out executable state
machines from a simple text model.

The thing about Libero's model was that it was readable. That is, you described your program logic as named states, each
accepting a set of events, each doing some real work. The resulting state machine hooked into your application code, driving it
like a boss.

Libero was charmingly good at its job, fluent in many languages, and modestly popular given the enigmatic nature of state
machines. We used Libero in anger in dozens of large distributed applications, one of which was finally switched off in 2011 after
20 years of operation. State-machine driven code construction worked so well that it's somewhat impressive that this approach
never hit the mainstream of software engineering.

So in this section I'm going to explain Libero's model, and demonstrate how to use it to generate ZeroMQ clients and servers.
We'll use GSL again, but like I said, the principles are general and you can put together code generators using any scripting
language.

As a worked example, let's see how to carry on a stateful dialog with a peer on a ROUTER socket. We'll develop the server using
a state machine (and the client by hand). We have a simple protocol that I'll call "NOM". I'm using the oh-so-very-serious
keywords for unprotocols proposal:

nom-protocol    = open-peering *use-peering

open-peering    = C:OHAI ( S:OHAI-OK / S:WTF )

use-peering     = C:ICANHAZ
                / S:CHEEZBURGER
                / C:HUGZ S:HUGZ-OK
                / S:HUGZ C:HUGZ-OK

I've not found a quick way to explain the true nature of state machine programming. In my experience, it invariably takes a few
days of practice. After three or four days' exposure to the idea, there is a near-audible "click!" as something in the brain connects
all the pieces together. We'll make it concrete by looking at the state machine for our NOM server.

A useful thing about state machines is that you can read them state by state. Each state has a unique descriptive name and one
or more events, which we list in any order. For each event, we perform zero or more actions and we then move to a next state (or
stay in the same state).

In a ZeroMQ protocol server, we have a state machine instance per client. That sounds complex but it isn't, as we'll see. We
describe our first state, Start, as having one valid event: OHAI. We check the user's credentials and then arrive in the
Authenticated state.

Figure 64 - The Start State

The Check Credentials action produces either an ok or an error event. It's in the Authenticated state that we handle these
two possible events by sending an appropriate reply back to the client. If authentication failed, we return to the Start state where
the client can try again.

Figure 65 - The Authenticated State

When authentication has succeeded, we arrive in the Ready state. Here we have three possible events: an ICANHAZ or HUGZ
message from the client, or a heartbeat timer event.

Figure 66 - The Ready State

There are a few more things about this state machine model that are worth knowing:

Events in upper case (like "HUGZ") are external events that come from the client as messages.
Events in lower case (like "heartbeat") are internal events, produced by code in the server.
The "Send SOMETHING" actions are shorthand for sending a specific reply back to the client.
Events that aren't defined in a particular state are silently ignored.
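
To see what these three states amount to in code, here is a handwritten sketch of the same machine in plain C. It is only an illustration under stated assumptions: the client_t fields credentials_ok and reply are hypothetical stand-ins for a real credentials check and a real message send, and nothing here touches a socket.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { start_state = 1, authenticated_state, ready_state } state_t;
typedef enum {
    no_event = 0,
    OHAI_event, ICANHAZ_event, HUGZ_event,      //  External, from the client
    ok_event, error_event, heartbeat_event      //  Internal, from the server
} event_t;

typedef struct {
    state_t state;          //  Current state
    event_t next_event;     //  Event queued by an action, if any
    int credentials_ok;     //  Hypothetical stand-in for a credentials check
    const char *reply;      //  Last reply "sent", or NULL
} client_t;

//  Feed one event to the machine and run until no event is pending.
//  Events that a state does not define fall through and are ignored.
static void
client_execute (client_t *self, event_t event)
{
    assert (self);
    self->next_event = event;
    while (self->next_event) {
        event = self->next_event;
        self->next_event = no_event;
        switch (self->state) {
            case start_state:
                if (event == OHAI_event) {
                    //  The check credentials action raises ok or error
                    self->next_event = self->credentials_ok? ok_event: error_event;
                    self->state = authenticated_state;
                }
                break;
            case authenticated_state:
                if (event == ok_event) {
                    self->reply = "OHAI-OK";
                    self->state = ready_state;
                }
                else
                if (event == error_event) {
                    self->reply = "WTF";
                    self->state = start_state;
                }
                break;
            case ready_state:
                if (event == ICANHAZ_event)
                    self->reply = "CHEEZBURGER";
                else
                if (event == HUGZ_event)
                    self->reply = "HUGZ-OK";
                else
                if (event == heartbeat_event)
                    self->reply = "HUGZ";
                break;
        }
    }
}
```

Note how an action can queue an internal event (ok or error) that the loop then processes in the new state; this is how a single incoming OHAI carries the client through Authenticated and on to Ready or back to Start.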

Now, the original source for these pretty pictures is an XML model:

<class name = "nom_server" script = "server_c">

<state name = "start">
    <event name = "OHAI" next = "authenticated">
        <action name = "check credentials" />
    </event>
</state>

<state name = "authenticated">
    <event name = "ok" next = "ready">
        <action name = "send" message = "OHAI-OK" />
    </event>
    <event name = "error" next = "start">
        <action name = "send" message = "WTF" />
    </event>
</state>

<state name = "ready">
    <event name = "ICANHAZ">
        <action name = "send" message = "CHEEZBURGER" />
    </event>
    <event name = "HUGZ">
        <action name = "send" message = "HUGZ-OK" />
    </event>
    <event name = "heartbeat">
        <action name = "send" message = "HUGZ" />
    </event>
</state>
</class>

The code generator is in examples/models/server_c.gsl. It is a fairly complete tool that I'll use and expand for more serious
work later. It generates:

A server class in C (nom_server.c, nom_server.h) that implements the whole protocol flow.
A selftest method that runs the selftest steps listed in the XML file.
Documentation in the form of graphics (the pretty pictures).

Here's a simple main program that starts the generated NOM server:

#include "czmq.h"
#include "nom_server.h"

int main (int argc, char *argv [])
{
    printf ("Starting NOM protocol server on port 5670\n");
    nom_server_t *server = nom_server_new ();
    nom_server_bind (server, "tcp://*:5670");
    nom_server_wait (server);
    nom_server_destroy (&server);
    return 0;
}

The generated nom_server class is a fairly classic model. It accepts client messages on a ROUTER socket, so the first frame on
every request is the client's connection identity. The server manages a set of clients, each with state. As messages arrive, it
feeds these as events to the state machine. Here's the core of the state machine, as a mix of GSL commands and the C code we
intend to generate:

static void
client_execute (client_t *self, int event)
{
    self->next_event = event;
    while (self->next_event) {
        self->event = self->next_event;
        self->next_event = 0;
        switch (self->state) {
.for class.state
            case $(name:c)_state:
.   for event
.       if index () > 1
                else
.       endif
                if (self->event == $(name:c)_event) {
.       for action
.           if name = "send"
                    zmsg_addstr (self->reply, "$(message:)");
.           else
                    $(name:c)_action (self);
.           endif
.       endfor
.       if defined (event.next)
                    self->state = $(next:c)_state;
.       endif
                }
.   endfor
                break;
.endfor
        }
        if (zmsg_size (self->reply) > 1) {
            zmsg_send (&self->reply, self->router);
            self->reply = zmsg_new ();
            zmsg_add (self->reply, zframe_dup (self->address));
        }
    }
}

Each client is held as an object with various properties, including the variables we need to represent a state machine instance:

event_t next_event;         //  Next event
state_t state;              //  Current state
event_t event;              //  Current event

You will see by now that we are generating technically-perfect code that has the precise design and shape we want. The only
clue that the nom_server class isn't handwritten is that the code is too good. People who complain that code generators produce
poor code are accustomed to poor code generators. It is trivial to extend our model as we need it. For example, here's how we
generate the selftest code.

First, we add a "selftest" item to the state machine and write our tests. We're not using any XML grammar or validation so it really
is just a matter of opening the editor and adding half-a-dozen lines of text:

<selftest>
    <step send = "OHAI" body = "Sleepy" recv = "WTF" />
    <step send = "OHAI" body = "Joe" recv = "OHAI-OK" />
    <step send = "ICANHAZ" recv = "CHEEZBURGER" />
    <step send = "HUGZ" recv = "HUGZ-OK" />
    <step recv = "HUGZ" />
</selftest>

Designing on the fly, I decided that "send" and "recv" were a nice way to express "send this request, then expect this reply".
Here's the GSL code that turns this model into real code:

.for class->selftest.step
.   if defined (send)
    msg = zmsg_new ();
    zmsg_addstr (msg, "$(send:)");
.       if defined (body)
    zmsg_addstr (msg, "$(body:)");
.       endif
    zmsg_send (&msg, dealer);

.   endif
.   if defined (recv)
    msg = zmsg_recv (dealer);
    assert (msg);
    command = zmsg_popstr (msg);
    assert (streq (command, "$(recv:)"));
    free (command);
    zmsg_destroy (&msg);

.   endif
.endfor

Finally, one of the more tricky but absolutely essential parts of any state machine generator is: how do I plug this into my own
code? As a minimal example for this exercise I wanted to implement the "check credentials" action by accepting all OHAIs from
my friend Joe (Hi Joe!) and rejecting everyone else's OHAIs. After some thought, I decided to grab code directly from the state
machine model, i.e., embed action bodies in the XML file. So in nom_server.xml, you'll see this:

<action name = "check credentials">
    char *body = zmsg_popstr (self->request);
    if (body && streq (body, "Joe"))
        self->next_event = ok_event;
    else
        self->next_event = error_event;
    free (body);
</action>

And the code generator grabs that C code and inserts it into the generated nom_server.c file:

.for class.action
static void
$(name:c)_action (client_t *self) {
$(string.trim (.):)
}
.endfor

And now we have something quite elegant: a single source file that describes my server state machine and also contains the
native implementations for my actions. A nice mix of high-level and low-level that is about 90% smaller than the C code.

Beware, as your head spins with notions of all the amazing things you could produce with such leverage. While this approach
gives you real power, it also moves you away from your peers, and if you go too far, you'll find yourself working alone.

By the way, this simple little state machine design exposes just three variables to our custom code:

self->next_event
self->request
self->reply

In the Libero state machine model, there are a few more concepts that we've not used here, but which we will need when we
write larger state machines:

Exceptions, which let us write terser state machines. When an action raises an exception, further processing on the event
stops. The state machine can then define how to handle exception events.
The Defaults state, where we can define default handling for events (especially useful for exception events).

Authentication Using SASL

When we designed AMQP in 2007, we chose the Simple Authentication and Security Layer (SASL) for the authentication layer,
one of the ideas we took from the BEEP protocol framework. SASL looks complex at first, but it's actually simple and fits neatly
into a ZeroMQ-based protocol. What I especially like about SASL is that it's scalable. You can start with anonymous access or
plain-text authentication and no security, and grow to more secure mechanisms over time without changing your protocol.

I'm not going to give a deep explanation now because we'll see SASL in action somewhat later. But I'll explain the principle so
you're already somewhat prepared.

In the NOM protocol, the client started with an OHAI command, which the server either accepted ("Hi Joe!") or rejected. This is
simple but not scalable because server and client have to agree up-front on the type of authentication they're going to do.

What SASL introduced, which is genius, is a fully abstracted and negotiable security layer that's still easy to implement at the
protocol level. It works as follows:

The client connects.
The server challenges the client, passing a list of security "mechanisms" that it knows about.
The client chooses a security mechanism that it knows about, and answers the server's challenge with a blob of opaque
data that (and here's the neat trick) some generic security library calculates and gives to the client.
The server takes the security mechanism the client chose, and that blob of data, and passes it to its own security library.
The library either accepts the client's answer, or the server challenges again.

There are a number of free SASL libraries. When we come to real code, we'll implement just two mechanisms, ANONYMOUS
and PLAIN, which don't need any special libraries.
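
The client side of that negotiation can be sketched in a few lines of C. The function names sasl_choose and sasl_plain_response are hypothetical, and the blob layout shown is the one the PLAIN mechanism defines in RFC 4616 (authorization identity, NUL, login, NUL, password); the encoding used by any particular implementation, including FileMQ's own, may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

//  Return the first mechanism offered by the server that the client
//  also supports, or NULL if the two lists do not overlap.
static const char *
sasl_choose (const char **offered, size_t n_offered,
             const char **supported, size_t n_supported)
{
    size_t i, j;
    for (i = 0; i < n_offered; i++)
        for (j = 0; j < n_supported; j++)
            if (strcmp (offered [i], supported [j]) == 0)
                return offered [i];
    return NULL;
}

//  Build a PLAIN response with an empty authorization identity:
//  NUL login NUL password. Returns the blob size, or 0 if the blob
//  does not fit in the buffer.
static size_t
sasl_plain_response (const char *login, const char *password,
                     unsigned char *blob, size_t capacity)
{
    assert (login && password && blob);
    size_t login_len = strlen (login);
    size_t password_len = strlen (password);
    size_t size = 1 + login_len + 1 + password_len;
    if (size > capacity)
        return 0;
    blob [0] = 0;
    memcpy (blob + 1, login, login_len);
    blob [1 + login_len] = 0;
    memcpy (blob + 2 + login_len, password, password_len);
    return size;
}
```

The protocol itself never looks inside the blob. It only carries the mechanism name and the opaque bytes, which is what lets you swap in stronger mechanisms later without touching the message flow.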

To support SASL, we have to add an optional challenge/response step to our "open-peering" flow. Here is what the resulting
protocol grammar looks like (I'm modifying NOM to do this):

secure-nom      = open-peering *use-peering

open-peering    = C:OHAI *( S:ORLY C:YARLY ) ( S:OHAI-OK / S:WTF )

ORLY            = 1*mechanism challenge
mechanism       = string
challenge       = *OCTET

YARLY           = mechanism response
response        = *OCTET

Where ORLY and YARLY contain a string (a list of mechanisms in ORLY, one mechanism in YARLY) and a blob of opaque data.
Depending on the mechanism, the initial challenge from the server may be empty. We don't care: we just pass this to the security
library to deal with.

The SASL RFC goes into detail about other features (that we don't need), the kinds of ways SASL could be attacked, and so on.

Large-Scale File Publishing: FileMQ

Let's put all these techniques together into a file distribution system that I'll call FileMQ. This is going to be a real product, living
on GitHub. What we'll make here is a first version of FileMQ, as a training tool. If the concept works, the real thing may eventually
get its own book.

Why make FileMQ?

Why make a file distribution system? I already explained how to send large files over ZeroMQ, and it's really quite simple. But if
you want to make messaging accessible to a million times more people than can use ZeroMQ, you need another kind of API. An
API that my five-year-old son can understand. An API that is universal, requires no programming, and works with just about every
single application.

Yes, I'm talking about the file system. It's the DropBox pattern: chuck your files somewhere and they get magically copied
somewhere else when the network connects again.

However, what I'm aiming for is a fully decentralized architecture that looks more like git, that doesn't need any cloud services
(though we could put FileMQ in the cloud), and that does multicast, i.e., can send files to many places at once.

FileMQ must be secure(able), easily hooked into random scripting languages, and as fast as possible across our domestic and
office networks.

I want to use it to back up photos from my mobile phone to my laptop over WiFi. To share presentation slides in real time across
50 laptops in a conference. To share documents with colleagues in a meeting. To send earthquake data from sensors to central
clusters. To back up video from my phone as I take it, during protests or riots. To synchronize configuration files across a cloud of
Linux servers.

A visionary idea, isn't it? Well, ideas are cheap. The hard part is making this, and making it simple.

Initial Design Cut: the API

Here's the way I see the first design. FileMQ has to be distributed, which means that every node can be a server and a client at
the same time. But I don't want the protocol to be symmetrical, because that seems forced. We have a natural flow of files from
point A to point B, where A is the "server" and B is the "client". If files flow back the other way, then we have two flows. FileMQ is
not yet a directory synchronization protocol, but we'll bring it quite close.

Thus, I'm going to build FileMQ as two pieces: a client and a server. Then, I'll put these together in a main application (the
filemq tool) that can act both as client and server. The two pieces will look quite similar to the nom_server, with the same kind
of API:

fmq_server_t *server = fmq_server_new ();
fmq_server_bind (server, "tcp://*:5670");
fmq_server_publish (server, "/home/ph/filemq/share", "/public");
fmq_server_publish (server, "/home/ph/photos/stream", "/photostream");

fmq_client_t *client = fmq_client_new ();
fmq_client_connect (client, "tcp://pieter.filemq.org:5670");
fmq_client_subscribe (client, "/public/", "/home/ph/filemq/share");
fmq_client_subscribe (client, "/photostream/", "/home/ph/photos/stream");

while (!zctx_interrupted)
    sleep (1);

fmq_server_destroy (&server);
fmq_client_destroy (&client);

If we wrap this C API in other languages, we can easily script FileMQ, embed it in applications, port it to smartphones, and so on.

Initial Design Cut: the Protocol

The full name for the protocol is the "File Message Queuing Protocol", or FILEMQ in uppercase to distinguish it from the software.
To start with, we write down the protocol as an ABNF grammar. Our grammar starts with the flow of commands between the
client and server. You should recognize these as a combination of the various techniques we've seen already:

filemq-protocol = open-peering *use-peering [ close-peering ]

open-peering    = C:OHAI *( S:ORLY C:YARLY ) ( S:OHAI-OK / error )

use-peering     = C:ICANHAZ ( S:ICANHAZ-OK / error )
                / C:NOM
                / S:CHEEZBURGER
                / C:HUGZ S:HUGZ-OK
                / S:HUGZ C:HUGZ-OK

close-peering   = C:KTHXBAI / S:KTHXBAI

error           = S:SRSLY / S:RTFM

Here are the commands to and from the server:

;   The client opens peering to the server
OHAI            = signature %x01 protocol version
signature       = %xAA %xA3
protocol        = string        ; Must be "FILEMQ"
string          = size *VCHAR
size            = OCTET
version         = %x01

;   The server challenges the client using the SASL model
ORLY            = signature %x02 mechanisms challenge
mechanisms      = size 1*mechanism
mechanism       = string
challenge       = *OCTET        ; ZeroMQ frame

;   The client responds with SASL authentication information
YARLY           = signature %x03 mechanism response
response        = *OCTET        ; ZeroMQ frame

;   The server grants the client access
OHAI-OK         = signature %x04

;   The client subscribes to a virtual path
ICANHAZ         = signature %x05 path options cache
path            = string        ; Full path or path prefix
options         = dictionary
dictionary      = size *key-value
key-value       = string        ; Formatted as name=value
cache           = dictionary    ; File SHA-1 signatures

;   The server confirms the subscription
ICANHAZ-OK      = signature %x06

;   The client sends credit to the server
NOM             = signature %x07 credit
credit          = 8OCTET        ; 64-bit integer, network order
sequence        = 8OCTET        ; 64-bit integer, network order

;   The server sends a chunk of file data
CHEEZBURGER     = signature %x08 sequence operation filename
                  offset headers chunk
sequence        = 8OCTET        ; 64-bit integer, network order
operation       = OCTET
filename        = string
offset          = 8OCTET        ; 64-bit integer, network order
headers         = dictionary
chunk           = FRAME

;   Client or server sends a heartbeat
HUGZ            = signature %x09

;   Client or server responds to a heartbeat
HUGZ-OK         = signature %x0A

;   Client closes the peering
KTHXBAI         = signature %x0B

And here are the different ways the server can tell the client things went wrong:

;   Server error reply - refused due to access rights
S:SRSLY         = signature %x80 reason

;   Server error reply - client sent an invalid command
S:RTFM          = signature %x81 reason

FILEMQ lives on the ZeroMQ unprotocols website and has a registered TCP port with IANA (the Internet Assigned Numbers
Authority), which is port 5670.
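
To show how the grammar maps onto bytes, here is a hand-written sketch that serializes an OHAI command into a flat buffer, plus a helper for the 64-bit network-order integers used by credit, sequence, and offset. The function names encode_ohai and put_number8 are hypothetical, and the codec that codec_c.gsl generates is the authority on the real framing; this only follows the rules as written above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

//  Write a 64-bit integer in network (big-endian) order
static void
put_number8 (uint8_t *dest, uint64_t value)
{
    int index;
    for (index = 7; index >= 0; index--) {
        dest [index] = (uint8_t) (value & 0xFF);
        value >>= 8;
    }
}

//  Encode OHAI = signature %x01 protocol version into buffer.
//  Returns the number of bytes written, or 0 if the buffer is too small.
static size_t
encode_ohai (uint8_t *buffer, size_t capacity)
{
    static const char protocol [] = "FILEMQ";
    size_t protocol_len = strlen (protocol);
    size_t size = 2 + 1 + 1 + protocol_len + 1;
    assert (buffer);
    if (size > capacity)
        return 0;

    uint8_t *needle = buffer;
    *needle++ = 0xAA;                       //  signature = %xAA %xA3
    *needle++ = 0xA3;
    *needle++ = 0x01;                       //  OHAI command ID
    *needle++ = (uint8_t) protocol_len;     //  string = size *VCHAR
    memcpy (needle, protocol, protocol_len);
    needle += protocol_len;
    *needle++ = 0x01;                       //  version = %x01
    return (size_t) (needle - buffer);
}
```

A receiver can reject anything that does not start with the two signature bytes before it parses further, which is a cheap way to catch peers speaking the wrong protocol on port 5670.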

Building and Trying FileMQ

The FileMQ stack is on GitHub. It works like a classic C/C++ project:

git clone git://github.com/zeromq/filemq.git
cd filemq
./autogen.sh
./configure
make check

You want to be using the latest CZMQ master for this. Now try running the track command, which is a simple tool that uses
FileMQ to track changes in one directory in another:

cd src
./track ./fmqroot/send ./fmqroot/recv

And open two file navigator windows, one into src/fmqroot/send and one into src/fmqroot/recv. Drop files into the send
folder and you'll see them arrive in the recv folder. The server checks once per second for new files. Delete files in the send
folder, and they're deleted in the recv folder similarly.

I use track for things like updating my MP3 player mounted as a USB drive. As I add or remove files in my laptop's Music folder,
the same changes happen on the MP3 player. FILEMQ isn't a full replication protocol yet, but we'll fix that later.

Internal Architecture

To build FileMQ I used a lot of code generation, possibly too much for a tutorial. However the code generators are all reusable in
other stacks and will be important for our final project in Chapter 8 - A Framework for Distributed Computing. They are an
evolution of the set we saw earlier:

codec_c.gsl: generates a message codec for a given protocol.
server_c.gsl: generates a server class for a protocol and state machine.
client_c.gsl: generates a client class for a protocol and state machine.

The best way to learn to use GSL code generation is to translate these into a language of your choice and make your own demo
protocols and stacks. You'll find it fairly easy. FileMQ itself doesn't try to support multiple languages. It could, but it'd make things
needlessly complex.

The FileMQ architecture actually slices into two layers. There's a generic set of classes to handle chunks, directories, files,
patches, SASL security, and configuration files. Then, there's the generated stack: messages, client, and server. If I was creating
a new project I'd fork the whole FileMQ project, and go and modify the three models:

fmq_msg.xml: defines the message formats.
fmq_client.xml: defines the client state machine, API, and implementation.
fmq_server.xml: does the same for the server.

You'd want to rename things to avoid confusion. Why didn't I make the reusable classes into a separate library? The answer is
two-fold. First, no one actually needs this (yet). Second, it'd make things more complex for you as you build and play with FileMQ.
It's never worth adding complexity to solve a theoretical problem.

Although I wrote FileMQ in C, it's easy to map to other languages. It is quite amazing how nice C becomes when you add
CZMQ's generic zlist and zhash containers and class style. Let me go through the classes quickly:

fmq_sasl: encodes and decodes a SASL challenge. I only implemented the PLAIN mechanism, which is enough to prove
the concept.

fmq_chunk: works with variable sized blobs. Not as efficient as ZeroMQ's messages but they do less weirdness and so
are easier to understand. The chunk class has methods to read and write chunks from disk.

fmq_file: works with files, which may or may not exist on disk. Gives you information about a file (like size), lets you read
and write to files, remove files, check if a file exists, and check if a file is "stable" (more on that later).

fmq_dir: works with directories, reading them from disk and comparing two directories to see what changed. When there
are changes, returns a list of "patches".

fmq_patch: works with one patch, which really just says "create this file" or "delete this file" (referring to a fmq_file item
each time).

fmq_config: works with configuration data. I'll come back to client and server configuration later.

Every class has a test method, and the main development cycle is "edit, test". These are mostly simple self tests, but they make
the difference between code I can trust and code I know will still break. It's a safe bet that any code that isn't covered by a test
case will have undiscovered errors. I'm not a fan of external test harnesses. But internal test code that you write as you write your
functionality: that's like the handle on a knife.

You should, really, be able to read the source code and rapidly understand what these classes are doing. If you can't read the
code happily, tell me. If you want to port the FileMQ implementation into other languages, start by forking the whole repository
and later we'll see if it's possible to do this in one overall repo.

Public API

The public API consists of two classes (as we sketched earlier):

fmq_client: provides the client API, with methods to connect to a server, configure the client, and subscribe to paths.

fmq_server: provides the server API, with methods to bind to a port, configure the server, and publish a path.

These classes provide a multithreaded API, a model we've used a few times now. When you create an API instance (i.e.,
fmq_server_new() or fmq_client_new()), this method kicks off a background thread that does the real work, i.e., runs the
server or the client. The other API methods then talk to this thread over ZeroMQ sockets (a pipe consisting of two PAIR sockets
over inproc://).

If I was a keen young developer eager to use FileMQ in another language, I'd probably spend a happy weekend writing a binding
for this public API, then stick it in a subdirectory of the filemq project called, say, bindings/, and make a pull request.

The actual API methods come from the state machine description, like this (for the server):

<method name = "publish">
<argument name = "location" type = "string" />
<argument name = "alias" type = "string" />
mount_t *mount = mount_new (location, alias);
zlist_append (self->mounts, mount);
</method>

Which gets turned into this code:

void
fmq_server_publish (fmq_server_t *self, char *location, char *alias)
{
    assert (self);
    assert (location);
    assert (alias);
    zstr_sendm (self->pipe, "PUBLISH");
    zstr_sendfm (self->pipe, "%s", location);
    zstr_sendf (self->pipe, "%s", alias);
}


Design Notes

The hardest part of making FileMQ wasn't implementing the protocol, but maintaining accurate state internally. An FTP or HTTP
server is essentially stateless. But a publish/subscribe server has to maintain subscriptions, at least.

So I'll go through some of the design aspects:

The client detects if the server has died by the lack of heartbeats (HUGZ) coming from the server. It then restarts its dialog
by sending an OHAI. There's no timeout on the OHAI because the ZeroMQ DEALER socket will queue an outgoing
message indefinitely.

If a client stops replying with (HUGZ-OK) to the heartbeats that the server sends, the server concludes that the client has
died and deletes all state for the client including its subscriptions.

The client API holds subscriptions in memory and replays them when it has connected successfully. This means the caller
can subscribe at any time (and doesn't care when connections and authentication actually happen).

The server and client use virtual paths, much like an HTTP or FTP server. You publish one or more mount points, each
corresponding to a directory on the server. Each of these maps to some virtual path, for instance "/" if you have only one
mount point. Clients then subscribe to virtual paths, and files arrive in an inbox directory. We don't send physical file
names across the network.

There are some timing issues: if the server is creating its mount points while clients are connected and subscribing, the
subscriptions won't attach to the right mount points. So, we bind the server port as the last thing.

Clients can reconnect at any point; if the client sends OHAI, that signals the end of any previous conversation and the start
of a new one. I might one day make subscriptions durable on the server, so they survive a disconnection. The client stack,
after reconnecting, replays any subscriptions the caller application already made.

Configuration

I've built several large server products, like the Xitami web server that was popular in the late 90's, and the OpenAMQ messaging
server. Getting configuration easy and obvious was a large part of making these servers fun to use.

We typically aim to solve a number of problems:

Ship default configuration files with the product.
Allow users to add custom configuration files that are never overwritten.
Allow users to configure from the command line.

And then layer these one on the other, so command-line settings override custom settings, which override default settings. It can
be a lot of work to do this right. For FileMQ, I've taken a somewhat simpler approach: all configuration is done from the API.

This is how we start and configure the server, for example:

server = fmq_server_new ();
fmq_server_configure (server, "server_test.cfg");
fmq_server_publish (server, "./fmqroot/send", "/");
fmq_server_publish (server, "./fmqroot/logs", "/logs");
fmq_server_bind (server, "tcp://*:5670");

We do use a specific format for the config files, which is ZPL, a minimalist syntax that we started using for ZeroMQ "devices" a
few years ago, but which works well for any server:

#   Configure server for plain access
#
server
    monitor = 1             #   Check mount points
    heartbeat = 1           #   Heartbeat to clients

publish
    location = ./fmqroot/logs
    virtual = /logs

security
    echo = I: use guest/guest to login to server
    #   These are SASL mechanisms we accept
    anonymous = 0
    plain = 1
        account
            login = guest
            password = guest
            group = guest
        account
            login = super
            password = secret
            group = admin

One cute thing (which seems useful) the generated server code does is to parse this config file (when you use the
fmq_server_configure() method) and execute any section that matches an API method. Thus the publish section works
as a fmq_server_publish() method.

File Stability

It is quite common to poll a directory for changes and then do something "interesting" with new files. But as one process is writing
to a file, other processes have no idea when the file has been fully written. One solution is to add a second "indicator" file that we
create after creating the first file. This is intrusive, however.

There is a neater way, which is to detect when a file is "stable", i.e., no one is writing to it any longer. FileMQ does this by
checking the modification time of the file. If it's more than a second old, then the file is considered stable, at least stable enough
to be shipped off to clients. If a process comes along after five minutes and appends to the file, it'll be shipped off again.

For this to work, and this is a requirement for any application hoping to use FileMQ successfully, do not buffer more than a
second's worth of data in memory before writing. If you use very large block sizes, the file may look stable when it's not.
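
The stability rule is easy to sketch with the POSIX stat() call. This is an illustration rather than the fmq_file implementation: the names mtime_is_stable and file_is_stable are hypothetical, and the current time is passed in as a parameter so the rule can be exercised without waiting on a real clock.

```c
#include <assert.h>
#include <stdbool.h>
#include <sys/stat.h>
#include <time.h>

//  A file counts as stable when it was last modified more than
//  one second before "now".
static bool
mtime_is_stable (time_t modified, time_t now)
{
    return difftime (now, modified) > 1.0;
}

//  Check a file on disk. A file that cannot be examined (missing,
//  no permission) is reported as not stable.
static bool
file_is_stable (const char *path, time_t now)
{
    struct stat info;
    assert (path);
    if (stat (path, &info) != 0)
        return false;
    return mtime_is_stable (info.st_mtime, now);
}
```

With one-second timestamps, a file modified in the same or the previous second is still treated as in flux, which is why a writer that buffers for longer than that can fool the check.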

Delivery Notifications

One of the nice things about the multithreaded API model we're using is that it's essentially message based. This makes it ideal
for returning events back to the caller. A more conventional API approach would be to use callbacks. But callbacks that cross
thread boundaries are somewhat delicate. Here's how the client sends a message back when it has received a complete file:

zstr_sendm (self->pipe, "DELIVER");
zstr_sendm (self->pipe, filename);
zstr_sendf (self->pipe, "%s/%s", inbox, filename);

We can now add a _recv() method to the API that waits for events back from the client. It makes a clean style for the caller:
create the client object, configure it, and then receive and process any events it returns.

Symbolic Links

While using a staging area is a nice, simple API, it also creates costs for senders. If I already have a 2GB video file on a camera,
and want to send it via FileMQ, the current implementation asks that I copy it to a staging area before it will be sent to
subscribers.

One option is to mount the whole content directory (e.g., /home/me/Movies), but this is fragile because it means the application
can't decide to send individual files. It's everything or nothing.

A simple answer is to implement portable symbolic links. As Wikipedia explains: "A symbolic link contains a text string that is
automatically interpreted and followed by the operating system as a path to another file or directory. This other file or directory is
called the target. The symbolic link is a second file that exists independently of its target. If a symbolic link is deleted, its target
remains unaffected."

This doesn't affect the protocol in any way; it's an optimization in the server implementation. Let's make a simple portable
implementation:

A symbolic link consists of a file with the extension .ln.
The filename without .ln is the published file name.
The link file contains one line, which is the real path to the file.

Because we've collected all operations on files in a single class (fmq_file), it's a clean change. When we create a new file
object, we check if it's a symbolic link and then all read-only actions (get file size, read file) operate on the target file, not the link.
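
The three link rules can be sketched as two small string helpers. These are hypothetical functions, not part of fmq_file: link_published_name strips the .ln extension, and link_target takes the content of a link file (already read into memory) and extracts its first line as the real path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

//  If filename ends in ".ln", copy the published name (the filename
//  without the extension) into name and return true.
static bool
link_published_name (const char *filename, char *name, size_t capacity)
{
    assert (filename && name);
    size_t length = strlen (filename);
    if (length <= 3 || strcmp (filename + length - 3, ".ln") != 0)
        return false;           //  Not a link file
    if (length - 3 >= capacity)
        return false;           //  Name does not fit
    memcpy (name, filename, length - 3);
    name [length - 3] = 0;
    return true;
}

//  Extract the target path from the content of a link file, which is
//  its first line without the line ending.
static bool
link_target (const char *content, char *target, size_t capacity)
{
    assert (content && target);
    size_t length = strcspn (content, "\r\n");
    if (length == 0 || length >= capacity)
        return false;           //  Empty link, or path does not fit
    memcpy (target, content, length);
    target [length] = 0;
    return true;
}
```

So a file holiday.mp4.ln in the staging area, containing the single line /home/me/Movies/holiday.mp4, would be published as holiday.mp4 while its data is read from the Movies folder.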

Recovery and Late Joiners

As it stands now, FileMQ has one major remaining problem: it provides no way for clients to recover from failures. The scenario is
that a client, connected to a server, starts to receive files and then disconnects for some reason. The network may be too slow, or
breaks. The client may be on a laptop which is shut down, then resumed. The WiFi may be disconnected. As we move to a more
mobile world (see Chapter 8 - A Framework for Distributed Computing) this use case becomes more and more frequent. In some
ways it's becoming a dominant use case.

In the classic ZeroMQ pub-sub pattern, there are two strong underlying assumptions, both of which are usually wrong in FileMQ's
real world. First, that data expires very rapidly so that there's no interest in asking for old data. Second, that networks are stable
and rarely break (so it's better to invest more in improving the infrastructure and less in addressing recovery).

Take any FileMQ use case and you'll see that if the client disconnects and reconnects, then it should get anything it missed. A
further improvement would be to recover from partial failures, like HTTP and FTP do. But one thing at a time.

One answer to recovery is "durable subscriptions", and the first drafts of the FILEMQ protocol aimed to support this, with client
identifiers that the server could hold onto and store. So if a client reappears after a failure, the server would know what files it had
not received.

Stateful servers are, however, nasty to make and difficult to scale. How do we, for example, do failover to a secondary server?
Where does it get its subscriptions from? It's far nicer if each client connection works independently and carries all necessary
state with it.

Another nail in the coffin of durable subscriptions is that they require up-front coordination. Up-front coordination is always a red
flag, whether it's in a team of people working together, or a bunch of processes talking to each other. What about late joiners? In
the real world, clients do not neatly line up and then all say "Ready!" at the same time. In the real world, they come and go
arbitrarily, and it's valuable if we can treat a brand new client in the same way as a client that went away and came back.

To address this I will add two concepts to the protocol: a resynchronization option and a cache field (a dictionary). If the client
wants recovery, it sets the resynchronization option, and tells the server what files it already has via the cache field. We need
both, because there's no way in the protocol to distinguish between an empty field and a null field. The FILEMQ RFC describes
these fields as follows:

Theoptionsfieldprovidesadditionalinformationtotheserver.TheserverSHOULDimplementtheseoptions:RESYNC=1
iftheclientsetsthis,theserverSHALLsendthefullcontentsofthevirtualpathtotheclient,exceptfilestheclientalready
has,asidentifiedbytheirSHA1digestinthecachefield.

And:

When the client specifies the RESYNC option, the cache dictionary field tells the server which files the client already has.
Each entry in the cache dictionary is a "filename=digest" key/value pair where the digest SHALL be a SHA-1 digest in
printable hexadecimal format. If the filename starts with "/" then it SHOULD start with the path, otherwise the server MUST
ignore it. If the filename does not start with "/" then the server SHALL treat it as relative to the path.

Clients that know they are in the classic pub-sub use case just don't provide any cache data, and clients that want recovery
provide their cache data. It requires no state in the server, no up-front coordination, and works equally well for brand new clients
(which may have received files via some out-of-band means), and clients that received some files and were then disconnected for
a while.
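
The filename rule in the RFC text above can be sketched in a few lines of C. This is only an illustration under stated assumptions: resolve_cache_name is a hypothetical helper (it is not part of the FileMQ API), and it reads "start with the path" as a literal string prefix test.

```c
#include <stdio.h>
#include <string.h>

//  Resolve one cache-dictionary filename against the subscribed virtual
//  path. Returns 0 and writes the absolute virtual name into 'out', or
//  returns -1 if the entry must be ignored or does not fit in 'out'.
//  Hypothetical helper for illustration, not a FileMQ function.
int resolve_cache_name (const char *path, const char *filename,
                        char *out, size_t out_size)
{
    size_t path_len = strlen (path);
    int written;
    if (filename [0] == '/') {
        //  Absolute name: ignore it unless it starts with the path
        if (strncmp (filename, path, path_len) != 0)
            return -1;
        written = snprintf (out, out_size, "%s", filename);
    }
    else {
        //  Relative name: treat it as relative to the path
        const char *separator =
            (path_len > 0 && path [path_len - 1] == '/')? "": "/";
        written = snprintf (out, out_size, "%s%s%s", path, separator, filename);
    }
    return (written < 0 || (size_t) written >= out_size)? -1: 0;
}
```

With the path "/photos", the entry "DSCF0001.jpg" resolves to "/photos/DSCF0001.jpg", while an entry such as "/music/a.mp3" is ignored.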

I decided to use SHA-1 digests for several reasons. First, it's fast enough: 150 msec to digest a 25 MB core dump on my laptop.
Second, it's reliable: the chance of getting the same hash for different versions of one file is close enough to zero. Third, it's the
most widely supported digest algorithm. A cyclic-redundancy check (e.g., CRC-32) is faster but not reliable. More recent SHA versions
(SHA-256, SHA-512) are more secure but take 50% more CPU cycles, and are overkill for our needs.
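
The "printable hexadecimal format" the RFC asks for is easy to get subtly wrong, so here is a minimal sketch of the two ends of it: formatting a raw 20-byte digest, and checking that a received string has the right shape. Computing the SHA-1 itself is left to whatever crypto library you already use; the names digest_to_hex and is_sha1_hex are hypothetical, and the choice of uppercase digits is my assumption.

```c
#include <stddef.h>

#define SHA1_DIGEST_SIZE 20     //  SHA-1 produces 160 bits

//  Format a raw 20-byte digest as 40 uppercase hex characters plus a
//  terminating null; 'hex' must have room for 41 bytes.
void digest_to_hex (const unsigned char *digest, char *hex)
{
    static const char digits [] = "0123456789ABCDEF";
    size_t index;
    for (index = 0; index < SHA1_DIGEST_SIZE; index++) {
        hex [index * 2]     = digits [digest [index] >> 4];
        hex [index * 2 + 1] = digits [digest [index] & 0x0F];
    }
    hex [SHA1_DIGEST_SIZE * 2] = '\0';
}

//  Return 1 if 'text' is exactly 40 hexadecimal characters, else 0
int is_sha1_hex (const char *text)
{
    size_t index;
    for (index = 0; text [index]; index++) {
        char c = text [index];
        int is_hex = (c >= '0' && c <= '9')
                  || (c >= 'A' && c <= 'F')
                  || (c >= 'a' && c <= 'f');
        if (!is_hex || index >= SHA1_DIGEST_SIZE * 2)
            return 0;
    }
    return index == SHA1_DIGEST_SIZE * 2;
}
```

A server could run is_sha1_hex over each cache entry before trusting it, and simply skip entries that fail.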

Here is what a typical ICANHAZ message looks like when we use both caching and resyncing (this is output from the dump
method of the generated codec class):

ICANHAZ:
    path='/photos'
    options={
        RESYNC=1
    }
    cache={
        DSCF0001.jpg=1FABCD4259140ACA99E991E7ADD2034AC57D341D
        DSCF0006.jpg=01267C7641C5A22F2F4B0174FFB0C94DC59866F6
        DSCF0005.jpg=698E88C05B5C280E75C055444227FEA6FB60E564
        DSCF0004.jpg=F0149101DD6FEC13238E6FD9CA2F2AC62829CBD0
        DSCF0003.jpg=4A49F25E2030B60134F109ABD0AD9642C8577441
        DSCF0002.jpg=F84E4D69D854D4BF94B5873132F9892C8B5FA94E
    }

Although we don't do this in FileMQ, the server can use the cache information to help the client catch up with deletions that it
missed. To do this, it would have to log deletions, and then compare this log with the client cache when a client subscribes.

Test Use Case: The Track Tool

To properly test something like FileMQ we need a test case that plays with live data. One of my sysadmin tasks is to manage the
MP3 tracks on my music player, which is, by the way, a Sansa Clip reflashed with Rock Box, which I highly recommend. As I
download tracks into my Music folder, I want to copy these to my player, and as I find tracks that annoy me, I delete them in the
Music folder and want those gone from my player too.

This is kind of overkill for a powerful file distribution protocol. I could write this using a bash or Perl script, but to be honest the
hardest work in FileMQ was the directory comparison code and I want to benefit from that. So I put together a simple tool called
track, which calls the FileMQ API. From the command line this runs with two arguments, the sending and the receiving
directories:

    ./track /home/ph/Music /media/32306364/MUSIC

The code is a neat example of how to use the FileMQ API to do local file distribution. Here is the full program, minus the license
text (it's MIT/X11 licensed):

#include "czmq.h"
#include "../include/fmq.h"

int main (int argc, char *argv [])
{
    //  Both directories are required: track <sending> <receiving>
    if (argc < 3) {
        printf ("usage: track sending-directory receiving-directory\n");
        return 1;
    }
    //  Server publishes the sending directory as virtual path "/"
    fmq_server_t *server = fmq_server_new ();
    fmq_server_configure (server, "anonymous.cfg");
    fmq_server_publish (server, argv [1], "/");
    fmq_server_set_anonymous (server, true);
    fmq_server_bind (server, "tcp://*:5670");

    //  Client stores everything it receives in the receiving directory
    fmq_client_t *client = fmq_client_new ();
    fmq_client_connect (client, "tcp://localhost:5670");
    fmq_client_set_inbox (client, argv [2]);
    fmq_client_set_resync (client, true);
    fmq_client_subscribe (client, "/");

    while (true) {
        //  Get message from fmq_client API
        zmsg_t *msg = fmq_client_recv (client);
        if (!msg)
            break;              //  Interrupted
        char *command = zmsg_popstr (msg);
        if (streq (command, "DELIVER")) {
            char *filename = zmsg_popstr (msg);
            char *fullname = zmsg_popstr (msg);
            printf ("I: received %s (%s)\n", filename, fullname);
            free (filename);
            free (fullname);
        }
        free (command);
        zmsg_destroy (&msg);
    }
    fmq_server_destroy (&server);
    fmq_client_destroy (&client);
    return 0;
}

Note how we work with physical paths in this tool. The server publishes the physical path /home/ph/Music and maps this to the
virtual path /. The client subscribes to / and receives all files in /media/32306364/MUSIC. I could use any structure within the
server directory, and it would be copied faithfully to the client's inbox. Note the API method fmq_client_set_resync(), which
causes a server-to-client synchronization.

Getting an Official Port Number

We've been using port 5670 in the examples for FILEMQ. Unlike all the previous examples in this book, this port isn't arbitrary but
was assigned by the Internet Assigned Numbers Authority (IANA), which "is responsible for the global coordination of the DNS
Root, IP addressing, and other Internet protocol resources".

I'll explain very briefly when and how to request registered port numbers for your application protocols. The main reason is to
ensure that your applications can run in the wild without conflict with other protocols. Technically, if you ship any software that
uses port numbers between 1024 and 49151, you should be using only IANA registered port numbers. Many products don't
bother with this, however, and tend instead to use the IANA list as "ports to avoid".
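
The 1024 to 49151 range mentioned above is one of three ranges that IANA defines (RFC 6335). As a small sketch, here is how a program might classify a port number before using it; the names port_class_t and classify_port are hypothetical.

```c
//  The three IANA port ranges, plus a marker for out-of-range values
typedef enum {
    PORT_SYSTEM,        //  0-1023, "well-known" ports
    PORT_USER,          //  1024-49151, registered with IANA
    PORT_DYNAMIC,       //  49152-65535, never assigned
    PORT_INVALID
} port_class_t;

//  Classify a port number according to the IANA ranges
port_class_t classify_port (long port)
{
    if (port < 0 || port > 65535)
        return PORT_INVALID;
    if (port <= 1023)
        return PORT_SYSTEM;
    if (port <= 49151)
        return PORT_USER;
    return PORT_DYNAMIC;
}
```

Port 5670 falls in the user range, which is why it needed a registration, whereas anything from 49152 upwards is free for private or temporary use.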

If you aim to make a public protocol of any importance, such as FILEMQ, you're going to want an IANA-registered port. I'll explain
briefly how to do this:

Document your protocol clearly, as IANA will want a specification of how you intend to use the port. It does not have to be
a fully-formed protocol specification, but must be solid enough to pass expert review.

Decide what transport protocols you want: UDP, TCP, SCTP, and so on. With ZeroMQ you will usually only want TCP.

Fill in the application on iana.org, providing all the necessary information.

IANA will then continue the process by email until your application is accepted or rejected.

Note that you don't request a specific port number; IANA will assign you one. It's therefore wise to start this process before you
ship software, not afterwards.

Chapter 8 - A Framework for Distributed Computing

We've gone through a journey of understanding ZeroMQ in its many aspects. By now you may have started to build your own
products using the techniques I explained, as well as others you've figured out yourself. You will start to face questions about how
to make these products work in the real world.

But what is that "real world"? I'll argue that it is becoming a world of ever increasing numbers of moving pieces. Some people use
the phrase the "Internet of Things", suggesting that we'll see a new category of devices that are more numerous but also more
stupid than our current smart phones, tablets, laptops, and servers. However, I don't think the data points this way at all. Yes,
there are more and more devices, but they're not stupid at all. They're smart and powerful and getting more so all the time.

The mechanism at work is something I call "Cost Gravity" and it has the effect of reducing the cost of technology by half every
18-24 months. Put another way, our global computing capacity doubles every two years, over and over and over. The future is filled
with trillions of devices that are fully powerful multi-core computers: they don't run a cut-down "operating system for things" but
full operating systems and full applications.

And this is the world we're targeting with ZeroMQ. When we talk of "scale", we don't mean hundreds of computers, or even
thousands. Think of clouds of tiny smart and perhaps self-replicating machines surrounding every person, filling every space,
covering every wall, filling the cracks and eventually, becoming so much a part of us that we get them before birth and they follow
us to death.

These clouds of tiny machines talk to each other, all the time, over short-range wireless links using the Internet Protocol. They
create mesh networks, pass information and tasks around like nervous signals. They augment our memory, vision, every aspect
of our communications, and physical functions. And it's ZeroMQ that powers their conversations and events and exchanges of
work and information.

Now, to make even a thin imitation of this come true today, we need to solve a set of technical problems. These include: How do
peers discover each other? How do they talk to existing networks like the Web? How do they protect the information they carry?
How do we track and monitor them, to get some idea of what they're doing? Then we need to do what most engineers forget
about: package this solution into a framework that is dead easy for ordinary developers to use.

This is what we'll attempt in this chapter: to build a framework for distributed applications as an API, protocols, and
implementations. It's not a small challenge but I've claimed often that ZeroMQ makes such problems simple, so let's see if that's
still true.

We'll cover:

Requirements for distributed computing
The pros and cons of WiFi for proximity networking
Discovery using UDP and TCP
A message-based API
Creating a new open source project
Peer-to-peer connectivity (the Harmony pattern)
Tracking peer presence and disappearance
Group messaging without central coordination
Large-scale testing and simulation
Dealing with high-water marks and blocked peers
Distributed logging and monitoring

Design for The Real World

Whether we're connecting a roomful of mobile devices over WiFi or a cluster of virtual boxes over simulated Ethernet, we will hit
the same kinds of problems. These are:

Discovery: how do we learn about other nodes on the network? Do we use a discovery service, centralized mediation, or
some kind of broadcast beacon?

Presence: how do we track when other nodes come and go? Do we use some kind of central registration service, or
heartbeating or beacons?

Connectivity: how do we actually connect one node to another? Do we use local networking, wide-area networking, or do
we use a central message broker to do the forwarding?

Point-to-point messaging: how do we send a message from one node to another? Do we send this to the node's network
address, or do we use some indirect addressing via a centralized message broker?

Group messaging: how do we send a message from one node to a group of others? Do we work via a centralized
message broker, or do we use a pub-sub model like ZeroMQ?

Testing and simulation: how do we simulate large numbers of nodes so we can test performance properly? Do we have to
buy two dozen Android tablets, or can we use pure software simulation?

Distributed Logging: how do we track what this cloud of nodes is doing so we can detect performance problems and
failures? Do we create a main logging service, or do we allow every device to log the world around it?

Content distribution: how do we send content from one node to another? Do we use server-centric protocols like FTP or
HTTP, or do we use decentralized protocols like FileMQ?

If we can solve these problems reasonably well, and the further problems that will emerge (like security and wide-area bridging),
we get something like a framework for what I might call "Really Cool Distributed Applications", or as my grandkids call it, "the
software our world runs on".

You should have guessed from my rhetorical questions that there are two broad directions in which we can go. One is to
centralize everything. The other is to distribute everything. I'm going to bet on decentralization. If you want centralization, you
don't really need ZeroMQ; there are other options you can use.

So very roughly, here's the story. One, the number of moving pieces increases exponentially over time (doubles every 24
months). Two, these pieces stop using wires because dragging cables everywhere gets really boring. Three, future applications
run across clusters of these pieces using the Benevolent Tyrant pattern from Chapter 6 - The ZeroMQ Community. Four, today
it's really difficult, nay still rather impossible, to build such applications. Five, let's make it cheap and easy using all the techniques
and tools we've built up. Six, partay!

The Secret Life of WiFi

The future is clearly wireless, and while many big businesses live by concentrating data in their clouds, the future doesn't look
quite so centralized. The devices at the edges of our networks get smarter every year, not dumber. They're hungry for work and
information to digest and from which to profit. And they don't drag cables around, except once a night for power. It's all wireless
and more and more, it's 802.11-branded WiFi of different alphabetical flavors.

Why Mesh Isn't Here Yet

As such a vital part of our future, WiFi has a big problem that's not often discussed, but that anyone betting on it needs to be
aware of. The phone companies of the world have built themselves nice profitable mobile phone cartels in nearly every country
with a functioning government, based on convincing governments that without monopoly rights to airwaves and ideas, the world
would fall apart. Technically, we call this "regulatory capture" and "patents", but in fact it's just a form of blackmail and corruption.
If you, the state, give me, a business, the right to overcharge, tax the market, and ban all real competitors, I'll give you 5%. Not
enough? How about 10%? OK, 15% plus snacks. If you refuse, we pull service.

But WiFi snuck past this, borrowing unlicensed airspace and riding on the back of the open and unpatented and remarkably
innovative Internet Protocol stack. So today, we have the curious situation where it costs me several Euro a minute to call from
Seoul to Brussels if I use the state-backed infrastructure that we've subsidized over decades, but nothing at all if I can find an
unregulated WiFi access point. Oh, and I can do video, send files and photos, and download entire home movies all for the same
amazing price point of precisely zero point zero zero (in any currency you like). God help me if I try to send just one photo home
using the service for which I actually pay. That would cost me more than the camera I took it on.

It is the price we pay for having tolerated the "trust us, we're the experts" patent system for so long. But more than that, it's a
massive economic incentive to chunks of the technology sector, and especially chipset makers who own patents on the
anti-Internet GSM, GPRS, 3G, and LTE stacks, and who treat the telcos as prime clients, to actively throttle WiFi development. And
of course it's these firms that bulk out the IEEE committees that define WiFi.

The reason for this rant against lawyer-driven "innovation" is to steer your thinking towards "what if WiFi were really free?" This
will happen one day, not too far off, and it's worth betting on. We'll see several things happen. First, much more aggressive use of
airspace especially for near-distance communications where there is no risk of interference. Second, big capacity improvements
as we learn to use more airspace in parallel. Third, acceleration of the standardization process. Last, broader support in devices
for really interesting connectivity.

Right now, streaming a movie from your phone to your TV is considered "leading edge". This is ridiculous. Let's get truly
ambitious. How about a stadium of people watching a game, sharing photos and HD video with each other in real time, creating
an ad-hoc event that literally saturates the airspace with a digital frenzy. I should be able to collect terabytes of imagery from
those around me, in an hour. Why does this have to go through Twitter or Facebook and that tiny expensive mobile data
connection? How about a home with hundreds of devices all talking to each other over mesh, so when someone rings the
doorbell, the porch lights stream video through to your phone or TV? How about a car that can talk to your phone and play your
dubstep playlist without you plugging in wires.

To get more serious, why is our digital society in the hands of central points that are monitored, censored, logged, used to track
who we talk to, collect evidence against us, and then shut down when the authorities decide we have too much free speech? The
loss of privacy we're living through is only a problem when it's one-sided, but then the problem is calamitous. A truly wireless
world would bypass all central censorship. It's how the Internet was designed, and it's quite feasible, technically (which is the best
kind of feasible).

Some Physics

Naive developers of distributed software treat the network as infinitely fast and perfectly reliable. While this is approximately true
for simple applications over Ethernet, WiFi rapidly proves the difference between magical thinking and science. That is, WiFi
breaks so easily and dramatically under stress that I sometimes wonder how anyone would dare use it for real work. The ceiling
moves up as WiFi gets better, but never fast enough to stop us hitting it.

To understand how WiFi performs technically, you need to understand a basic law of physics: the power required to connect two
points increases according to the square of the distance. People who grow up in larger houses have exponentially louder voices,
as I learned in Dallas. For a WiFi network, this means that as two radios get further apart, they have to either use more power or
lower their signal rate.
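
To put a number on that law, here is a minimal sketch that assumes ideal free-space propagation, where received power falls off with the square of the distance. The function name power_factor is hypothetical, and real rooms with walls and furniture will not follow the ideal curve exactly.

```c
//  Factor by which transmit power must grow to keep the same received
//  signal strength when the distance between two radios changes from
//  d_near to d_far (same units), under the inverse-square assumption.
double power_factor (double d_near, double d_far)
{
    double ratio = d_far / d_near;
    return ratio * ratio;
}
```

Moving a phone from 5 meters to 20 meters away from the other radio gives a factor of 16: the radios need sixteen times the power, or have to make up the difference by dropping to a slower, more robust signal rate.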

There's only so much power you can pull out of a battery before users treat the device as hopelessly broken. Thus even though a
WiFi network may be rated at a certain speed, the real bit rate between the access point (AP) and a client depends on how far
apart the two are. As you move your WiFi-enabled phone away from the AP, the two radios trying to talk to each other will first
increase their power and then reduce their bit rate.

This effect has some consequences of which we should be aware if we want to build robust distributed applications that don't
dangle wires behind them like puppets:

If you have a group of devices talking to an AP, when the AP is talking to the slowest device, the whole network has to
wait. It's like having to repeat a joke at a party to the designated driver who has no sense of humor, is still fully and
tragically sober, and has a poor grasp of the language.

If you use unicast TCP and send a message to multiple devices, the AP must send the packets to each device separately.
Yes, and you knew this, it's also how Ethernet works. But now understand that one distant (or low-powered) device means
everything waits for that slowest device to catch up.

If you use multicast or broadcast (which work the same, in most cases), the AP will send single packets to the whole
network at once, which is awesome, but it will do it at the slowest possible bit rate (usually 1 Mbps). You can adjust this
rate manually in some APs. That just reduces the reach of your AP. You can also buy more expensive APs that have a
little more intelligence and will figure out the highest bit rate they can safely use. You can also use enterprise APs with
IGMP (Internet Group Management Protocol) support and ZeroMQ's PGM transport to send only to subscribed clients. I'd
not, however, bet on such APs being widely available, ever.

As you try to put more devices onto an AP, performance rapidly gets worse to the point where adding one more device can break
the whole network for everyone. Many APs solve this by randomly disconnecting clients when they reach some limit, such as four
to eight devices for a mobile hotspot, 30-50 devices for a consumer AP, perhaps 100 devices for an enterprise AP.

What's the Current Status?

Despite its uncomfortable role as enterprise technology that somehow escaped into the wild, WiFi is already useful for more than
getting a free Skype call. It's not ideal, but it works well enough to let us solve some interesting problems. Let me give you a rapid
status report.

First, point-to-point versus access point-to-client. Traditional WiFi is all AP-client. Every packet has to go from client A to AP, then
to client B. You cut your bandwidth by 50%, but that's only half the problem. I explained about the inverse power law. If A and B
are very close together, but both are far from the AP, they'll both be using a low bit rate. Imagine your AP is in the garage, and
you're in the living room trying to stream video from your phone to your TV. Good luck!
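
Here is a back-of-envelope sketch of where that 50% comes from, and why it gets worse when one leg is slow. It assumes both hops share one channel, so the airtime for each bit is the sum of the airtime on the two hops, and it ignores all protocol overhead. The function name relay_throughput is hypothetical.

```c
//  Effective A-to-B throughput when every packet crosses the same
//  channel twice: once from A to the AP, once from the AP to B.
//  Rates are in Mbit/s. Time per bit is 1/rate_a + 1/rate_b, so the
//  combined rate is the inverse of that sum.
double relay_throughput (double rate_a, double rate_b)
{
    if (rate_a <= 0 || rate_b <= 0)
        return 0;
    return (rate_a * rate_b) / (rate_a + rate_b);
}
```

Two clients that each talk to the AP at 54 Mbit/s get at most 27 Mbit/s between them. If both have dropped to 6 Mbit/s because the AP is far away, they are down to 3 Mbit/s, even when they sit next to each other.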

There is an old "ad-hoc" mode that lets A and B talk to each other, but it's way too slow for anything fun, and of course, it's
disabled on all mobile chipsets. Actually, it's disabled in the top secret drivers that the chipset makers kindly provide to hardware
makers. There is a new Tunneled Direct Link Setup (TDLS) protocol that lets two devices create a direct link, using an AP for
discovery but not for traffic. And there's a "5G" WiFi standard (it's a marketing term, so it goes in quotes) that boosts link speeds
to a gigabit. TDLS and 5G together make HD movie streaming from your phone to your TV a plausible reality. I assume TDLS will
be restricted in various ways so as to placate the telcos.

Lastly, we saw standardization of the 802.11s mesh protocol in 2012, after a remarkably speedy ten years or so of work. Mesh
removes the access point completely, at least in the imaginary future where it exists and is widely used. Devices talk to each
other directly, and maintain little routing tables of neighbors that let them forward packets. Imagine the AP software embedded
into every device, but smart enough (it's not as impressive as it sounds) to do multiple hops.

No one who is making money from the mobile data extortion racket wants to see 802.11s available because city-wide mesh is
such a nightmare for the bottom line, so it's happening as slowly as possible. The only large organization with the power (and, I
assume the surface-to-surface missiles) to get mesh technology into wide use is the US Army. But mesh will emerge and I'd bet
on 802.11s being widely available in consumer electronics by 2020 or so.

Second, if we don't have point-to-point, how far can we trust APs today? Well, if you go to a Starbucks in the US and try the
ZeroMQ "Hello World" example using two laptops connected via the free WiFi, you'll find they cannot connect. Why? Well, the
answer is in the name: "attwifi". AT&T is a good old incumbent telco that hates WiFi and presumably provides the service cheaply
to Starbucks and others so that independents can't get into the market. But any access point you buy will support client-to-AP-to-
client access, and outside the US I've never found a public AP locked-down the AT&T way.

Third, performance. The AP is clearly a bottleneck; you cannot get better than half of its advertised speed even if you put A and B
literally beside the AP. Worse, if there are other APs in the same airspace, they'll shout each other out. In my home, WiFi barely
works at all because the neighbors two houses down have an AP which they've amplified. Even on a different channel, it
interferes with our home WiFi. In the cafe where I'm sitting now there are over a dozen networks. Realistically, as long as we're
dependent on AP-based WiFi, we're subject to random interference and unpredictable performance.

Fourth, battery life. There's no inherent reason that WiFi, when idle, is hungrier than Bluetooth, for example. They use the same
radios and low-level framing. The main difference is in the tuning and in the protocols. For wireless power-saving to work well, devices
have to mostly sleep and beacon out to other devices only once every so often. For this to work, they need to synchronize their
clocks. This happens properly for the mobile phone part, which is why my old flip phone can run five days on a charge. When
WiFi is working, it will use more power. Current power amplifier technology is also inefficient, meaning you draw a lot more
energy from your battery than you pump into the air (the waste turns into a hot phone). Power amplifiers are improving as people
focus more on mobile WiFi.

Lastly, mobile access points. If we can't trust centralized APs, and if our devices are smart enough to run full operating systems,
can't we make them work as APs? I'm so glad you asked that question. Yes, we can, and it works quite nicely. Especially
because you can switch this on and off in software, on a modern OS like Android. Again, the villains of the piece are the US
telcos, who mostly detest this feature and kill it or cripple it on the phones they control. Smarter telcos realize that it's a way to
amplify their "last mile" and bring higher-value products to more users, but crooks don't compete on smarts.

Conclusions

WiFi is not Ethernet and although I believe future ZeroMQ applications will have a very important decentralized wireless
presence, it's not going to be an easy road. Much of the basic reliability and capacity that you expect from Ethernet is missing.
When you run a distributed application over WiFi, you must allow for frequent timeouts, random latencies, arbitrary
disconnections, whole interfaces going down and coming up, and so on.

The technological evolution of wireless networking is best described as "slow and joyless". Applications and frameworks that try
to exploit decentralized wireless are mostly absent or poor. The only existing open source framework for proximity networking is
AllJoyn from Qualcomm. But with ZeroMQ, we proved that the inertia and decrepit incompetence of existing players was no
reason for us to sit still. When we accurately understand problems, we can solve them. What we imagine, we can make real.

Discovery

Discovery is an essential part of network programming and a first-class problem for ZeroMQ developers. Every zmq_connect()
call provides an endpoint string, and that has to come from somewhere. The examples we've seen so far don't do discovery:
the endpoints they connect to are hard-coded as strings in the code. While this is fine for example code, it's not ideal for real
applications. Networks don't behave that nicely. Things change, and it's how well we handle change that defines our long-term
success.

Service Discovery

Let's start with definitions. Network discovery is finding out what other peers are on the network. Service discovery is learning
what those peers can do for us. Wikipedia defines a "network service" as "a service that is hosted on a computer network", and
"service" as "a set of related software functionalities that can be reused for different purposes, together with the policies that
should control its usage". It's not very helpful. Is Facebook a network service?

In fact the concept of "network service" has changed over time. The number of moving pieces keeps doubling every 18-24
months, breaking old conceptual models and pushing for ever simpler, more scalable ones. A service is, for me, a system-level
application that other programs can talk to. A network service is one accessible remotely (as compared to, e.g., the "grep"
command, which is a command-line service).

In the classic BSD socket model, a service maps 1-to-1 to a network port. A computer system offers a number of services like
"FTP", and "HTTP", each with assigned ports. The BSD API has functions like getservbyname to map a service name to a port
number. So a classic service maps to a network endpoint: if you know a server's IP address then you can find its FTP
service, if that is running.
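
As a small illustration of that classic model, here is a sketch of a lookup using getservbyname. The wrapper name service_port is hypothetical. The lookup reads the local services database (typically /etc/services), so the answer depends on what is installed on the machine.

```c
#include <arpa/inet.h>
#include <netdb.h>
#include <stdint.h>

//  Look up the port registered for a service name and protocol in the
//  local services database. Returns the port in host byte order, or -1
//  if the name is unknown on this system. Note that getservbyname
//  returns a pointer to static data, so this is not thread safe.
int service_port (const char *name, const char *protocol)
{
    struct servent *entry = getservbyname (name, protocol);
    if (!entry)
        return -1;
    //  s_port holds the port in network byte order
    return ntohs ((uint16_t) entry->s_port);
}
```

On a typical system service_port ("ftp", "tcp") gives 21. The mapping is static and local, which is exactly why it stops being enough once services start moving around.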

In modern messaging, however, services don't map 1-to-1 to endpoints. One endpoint can lead to many services, and services
can move around over time, between ports, or even between systems. Where is my cloud storage today? In a realistic large
distributed application, therefore, we need some kind of service discovery mechanism.

There are many ways to do this and I won't try to provide an exhaustive list. However there are a few classic patterns:

We can force the old 1-to-1 mapping from endpoint to service, and simply state up-front that a certain TCP port number
represents a certain service. Our protocol then should let us check this ("Are the first 4 bytes of the request 'HTTP'?").

We can bootstrap one service off another; connecting to a well-known endpoint and service, asking for the "real" service,
and getting an endpoint back in return. This gives us a service lookup service. If the lookup service allows it, services can
then move around as long as they update their location.

We can proxy one service through another, so that a well-known endpoint and service will provide other services indirectly
(i.e. by forwarding messages to them). This is for instance how our Majordomo service-oriented broker works.

We can exchange lists of known services and endpoints, that change over time, using a gossip approach or a centralized
approach (like the Clone pattern), so that each node in a distributed network can build up an eventually consistent map of
the whole network.

We can create further abstract layers in between network endpoints and services, e.g. assigning each node a unique
identifier, so we get a "network of nodes" where each node may offer some services, and may appear on random network
endpoints.

We can discover services opportunistically, e.g. by connecting to endpoints and then asking them what services they offer.
"Hi, do you offer a shared printer? If so, what's the maker and model?"

There's no "right answer". The range of options is huge, and changes over time as the scale of our networks grows. In some
networks the knowledge of what services run where can literally become political power. ZeroMQ imposes no specific model but
makes it easy to design and build the ones that suit us best. However, to build service discovery, we must start by solving
network discovery.

Network Discovery

Here is a list of the solutions I know for network discovery:

Use hard-coded endpoint strings, i.e., fixed IP addresses and agreed ports. This worked in internal networks a decade ago
when there were a few "big servers" and they were so important they got static IP addresses. These days however it's no
use except in examples or for in-process work (threads are the new Big Iron). You can make it hurt a little less by using
DNS but this is still painful for anyone who's not also doing system administration as a side-job.

Get endpoint strings from configuration files. This shoves name resolution into user space, which hurts less than DNS but
that's like saying a punch in the face hurts less than a kick in the groin. You now get a non-trivial management problem.
Who updates the configuration files, and when? Where do they live? Do we install a distributed management tool like Salt
Stack?

Use a message broker. You still need a hard-coded or configured endpoint string to connect to the broker, but this
approach reduces the number of different endpoints in the network to one. That makes a real impact, and broker-based
networks do scale nicely. However, brokers are single points of failure, and they bring their own set of worries about
management and performance.

Use an addressing broker. In other words use a central service to mediate address information (like a dynamic DNS setup)
but allow nodes to send each other messages directly. It's a good model but still creates a point of failure and
management costs.

Use helper libraries, like ZeroConf, that provide DNS services without any centralized infrastructure. It's a good answer for
certain applications but your mileage will vary. Helper libraries aren't zero cost: they make it more complex to build the
software, they have their own restrictions, and they aren't necessarily portable.

Build system-level discovery by sending out ARP or ICMP ECHO packets and then querying every node that responds.
You can query through a TCP connection, for example, or by sending UDP messages. Some products do this, like the
Eye-Fi wireless card.

Do user-level brute-force discovery by trying to connect to every single address in the network segment. You can do this
trivially in ZeroMQ since it handles connections in the background. You don't even need multiple threads. It's brutal but
fun, and works very well in demos and workshops. However it doesn't scale, and annoys decent-thinking engineers.

Roll your own UDP-based discovery protocol. Lots of people do this (I counted about 80 questions on this topic on
StackOverflow). UDP works well for this and it's technically clear. But it's technically tricky to get right, to the point where
any developer doing this the first few times will get it dramatically wrong.

Gossip discovery protocols. A fully-interconnected network is quite effective for smaller numbers of nodes (say, up to 100
or 200). For large numbers of nodes, we need some kind of gossip protocol. That is, where the nodes we can reasonably
discover (say, on the same segment as us), tell us about nodes that are further away. Gossip protocols go beyond what
we need these days with ZeroMQ, but will likely be more common in the future. One example of a wide-area gossip model
is mesh networking.
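
To make the brute-force option from the list above concrete, here is a minimal sketch of the only fiddly part, which is generating an endpoint string for every host in the segment. It assumes a /24 segment and a single agreed port; the function name segment_endpoint and the example prefix are hypothetical.

```c
#include <stdio.h>

//  Build the endpoint string for one host in a /24 segment, for example
//  "tcp://192.168.1.17:5670". 'prefix' is the first three octets of the
//  segment. Returns 0 on success, -1 on bad input or if 'out' is too small.
int segment_endpoint (const char *prefix, int host, int port,
                      char *out, size_t out_size)
{
    //  Skip the network (0) and broadcast (255) addresses
    if (host < 1 || host > 254 || port < 1 || port > 65535)
        return -1;
    int written = snprintf (out, out_size, "tcp://%s.%d:%d", prefix, host, port);
    return (written < 0 || (size_t) written >= out_size)? -1: 0;
}
```

A discovery loop would call this for hosts 1 through 254 and pass each string to zmq_connect() on one socket, then see which peers answer. Note that this only works because every peer listens on the same agreed port.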

The Use Case

Let's define our use case more explicitly. After all, all these different approaches have worked and still work to some extent. What
interests me as architect is the future, and finding designs that can continue to work for more than a few years. This means
identifying long-term trends. Our use case isn't here and now, it's ten or twenty years from today.

Here are the long-term trends I see in distributed applications:

The overall number of moving pieces keeps increasing. My estimate is that it doubles every 24 months, but how fast it
increases matters less than the fact that we keep adding more and more nodes to our networks. They're not just boxes but
also processes and threads. The driver here is cost, which keeps falling. In a decade, the average teenager will carry 30-50
devices, all the time.

Control shifts away from the center. Possibly data too, though we're still far from understanding how to build simple
decentralized information stores. In any case, the star topology is slowly dying and being replaced by clouds of clouds. In
the future there's going to be much more traffic within a local environment (home, office, school, bar) than between remote
nodes and the center. The maths here are simple: remote communications cost more, run more slowly and are less
natural than close-range communications. It's more accurate both technically and socially to share a holiday video with
your friend over local WiFi than via Facebook.

Networks are increasingly collaborative, less controlled. This means people bringing their own devices and expecting them
to work seamlessly. The Web showed one way to make this work but we're reaching the limits of what the Web can do, as
we start to exceed the average of one device per person.

The cost of connecting a new node to a network must fall proportionally, if the network is to scale. This means reducing
the amount of configuration a node needs: less pre-shared state, less context. Again, the Web solved this problem but at
the cost of centralization. We want the same plug and play experience but without a central agency.

In a world of trillions of nodes, the ones you talk to most are the ones closest to you. This is how it works in the real world and it's
the sanest way of scaling large-scale architectures. Groups of nodes, logically or physically close, connected by bridges to other
groups of nodes. A local group will be anything from half-a-dozen nodes to a few thousand nodes.

Sowehavetwobasicusecases:

Discoveryforproximitynetworks,thatis,asetofnodesthatfindthemselvesclosetoeachother.Wecandefine"close
toeachother"asbeing"onthesamenetworksegment".It'snotgoingtobetrueinallcasesbutit'strueenoughtobea
usefulplacetostart.

Discoveryacrosswideareanetworks,thatis,bridgingofproximitynetworkstogether.Wesometimescallthis
"federation".Therearemanywaystodofederationbutit'scomplexandsomethingtocoverelsewhere.Fornow,let's
assumewedofederationusingacentralizedbrokerorservice.

Soweareleftwiththeproblemofproximitynetworking.Iwanttojustplugthingsintothenetworkandhavethemtalkingtoeach
other.Whetherthey'retabletsinaschoolorabunchofserversinacloud,thelessupfrontagreementandcoordination,the
cheaperitistoscale.Soconfigurationfilesandbrokersandanykindofcentralizedserviceareallout.

I also want to allow any number of applications on a box, both because that's how the real world works (people download apps),
and so that I can simulate large networks on my laptop. Upfront simulation is the only way I know to be sure a system will work
when it's loaded in real life. You'd be surprised how engineers just hope things will work. "Oh, I'm sure that bridge will stay up
when we open it to traffic". If you haven't simulated and fixed the three most likely failures, they'll still be there on opening day.

Running multiple instances of a service on the same machine without upfront coordination means we have to use ephemeral
ports, i.e., ports assigned randomly for services. Ephemeral ports rule out brute-force TCP discovery and any DNS solution
including ZeroConf.

Finally, discovery has to happen in user space because the apps we're building will be running on random boxes that we do not
necessarily own and control. For example, other people's mobile devices. So any discovery that needs root permissions is
excluded. This rules out ARP and ICMP and once again ZeroConf since that also needs root permissions for the service parts.

Technical Requirements

Let's recap the requirements:

The simplest possible solution that works. There are so many edge cases in ad hoc networks that every extra feature or
functionality becomes a risk.

Supports ephemeral ports, so that we can run realistic simulations. If the only way to test is to use real devices, it becomes
impossibly expensive and slow to run tests.

No root access needed, it must run 100% in user space. We want to ship fully-packaged applications onto devices like
mobile phones that we don't own and where root access isn't available.

Invisible to system administrators, so we do not need their help to run our applications. Whatever technique we use should
be friendly to the network and available by default.

Zero configuration apart from installing the applications themselves. Asking the users to do any configuration is giving
them an excuse to not use the applications.

Fully portable to all modern operating systems. We can't assume we'll be running on any specific OS. We can't assume
any support from the operating system except standard user-space networking. We can assume ZeroMQ and CZMQ are
available.

Friendly to WiFi networks with up to 100-150 participants. This means keeping messages small and being aware of how
WiFi networks scale and how they break under pressure.

Protocol-neutral, i.e., our beaconing should not impose any specific discovery protocol. I'll explain what this means a little
later.

Easy to re-implement in any given language. Sure, we have a nice C implementation, but if it takes too long to
re-implement in another language, that excludes large chunks of the ZeroMQ community. So, again, simple.

Fast response time. By this, I mean a new node should be visible to its peers in a very short time, a second or two at most.
Networks change shape rapidly. It's OK to take longer, even 30 seconds, to realize a peer has disappeared.

From the list of possible solutions I collected, the only option that isn't disqualified for one or more reasons is to build our own
UDP-based discovery stack. It's a little disappointing that after so many decades of research into network discovery, this is where
we end up. But the history of computing does seem to go from complex to simple, so maybe it's normal.

A Self-Healing P2P Network in 30 Seconds

I mentioned brute-force discovery. Let's see how that works. One nice thing about software is to brute-force your way through the
learning experience. As long as we're happy to throw away work, we can learn rapidly simply by trying things that may seem
insane from the safety of the armchair.

I'll explain a brute-force discovery approach for ZeroMQ that emerged from a workshop in 2012. It is remarkably simple and
stupid: connect to every IP address in the room. If your network segment is 192.168.55.x, for instance, you do this:

connect to tcp://192.168.55.1:9000
connect to tcp://192.168.55.2:9000
connect to tcp://192.168.55.3:9000
...
connect to tcp://192.168.55.254:9000

Which in ZeroMQ-speak looks like this:

int address;
for (address = 1; address < 255; address++)
    zsocket_connect (listener, "tcp://192.168.55.%d:9000", address);

The stupid part is where we assume that connecting to ourselves is fine, where we assume that all peers are on the same
network segment, where we waste file handles as if they were free. Luckily these assumptions are often totally accurate. At least,
often enough to let us do fun things.

The loop works because ZeroMQ connect calls are asynchronous and opportunistic. They lie in the shadows like hungry cats,
waiting patiently to pounce on any innocent mouse that dared start up a service on port 9000. It's simple, effective, and worked
first time.

It gets better: as peers leave and join the network, they'll automatically reconnect. We've designed a self-healing peer-to-peer
network, in 30 seconds and three lines of code.

It won't work for real cases though. Poorer operating systems tend to run out of file handles, and networks tend to be more
complex than one segment. And if one node squats a couple of hundred file handles, large-scale simulations (with many nodes
on one box or in one process) are out of the question.

Still, let's see how far we can go with this approach before we throw it out. Here's a tiny decentralized chat program that lets you
talk to anyone else on the same network segment. The code has two threads: a listener and a broadcaster. The listener creates a
SUB socket and does the brute-force connection to all peers in the network. The broadcaster accepts input from the console and
sends it on a PUB socket:

dechat: Decentralized Chat in C

The dechat program needs to know the current IP address, the interface, and an alias. We could get these in code from the
operating system, but that's grunky non-portable code. So we provide this information on the command line:

dechat 192.168.55.122 eth0 Joe

Preemptive Discovery over Raw Sockets

One of the great things about short-range wireless is the proximity. WiFi maps closely to the physical space, which maps closely
to how we naturally organize. In fact, the Internet is quite abstract and this confuses a lot of people who kind of "get it" but in fact
don't really. With WiFi, we have technical connectivity that is potentially super-tangible. You see what you get and you get what
you see. Tangible means easy to understand and that should mean love from users instead of the typical frustration and seething
hatred.

Proximity is the key. We have a bunch of WiFi radios in a room, happily beaconing to each other. For lots of applications, it
makes sense that they can find each other and start chatting without any user input. After all, most real world data isn't private,
it's just highly localized.

I'm in a hotel room in Gangnam, Seoul, with a 4G wireless hotspot, a Linux laptop, and a couple of Android phones. The phones
and laptop are talking to the hotspot. The ifconfig command says my IP address is 192.168.1.2. Let me try some ping
commands. DHCP servers tend to dish out addresses in sequence, so my phones are probably close by, numerically speaking:

$ ping 192.168.1.1
PING 192.168.1.1 (192.168.1.1) 56(84) bytes of data.
64 bytes from 192.168.1.1: icmp_req=1 ttl=64 time=376 ms
64 bytes from 192.168.1.1: icmp_req=2 ttl=64 time=358 ms
64 bytes from 192.168.1.1: icmp_req=4 ttl=64 time=167 ms
^C
--- 192.168.1.1 ping statistics ---
3 packets transmitted, 2 received, 33% packet loss, time 2001ms
rtt min/avg/max/mdev = 358.077/367.522/376.967/9.445 ms

Found one! 150-300 msec round-trip latency: that's a surprisingly high figure, something to keep in mind for later. Now I ping
myself, just to try to double-check things:

$ ping 192.168.1.2
PING 192.168.1.2 (192.168.1.2) 56(84) bytes of data.
64 bytes from 192.168.1.2: icmp_req=1 ttl=64 time=0.054 ms
64 bytes from 192.168.1.2: icmp_req=2 ttl=64 time=0.055 ms
64 bytes from 192.168.1.2: icmp_req=3 ttl=64 time=0.061 ms
^C
--- 192.168.1.2 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 1998ms
rtt min/avg/max/mdev = 0.054/0.056/0.061/0.009 ms

The response time is a bit faster now, which is what we'd expect. Let's try the next couple of addresses:

$ ping 192.168.1.3
PING 192.168.1.3 (192.168.1.3) 56(84) bytes of data.
64 bytes from 192.168.1.3: icmp_req=1 ttl=64 time=291 ms
64 bytes from 192.168.1.3: icmp_req=2 ttl=64 time=271 ms
64 bytes from 192.168.1.3: icmp_req=3 ttl=64 time=132 ms
^C
--- 192.168.1.3 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2001ms
rtt min/avg/max/mdev = 132.781/231.914/291.851/70.609 ms

That's the second phone, with the same kind of latency as the first one. Let's continue and see if there are any other devices
connected to the hotspot:

$ ping 192.168.1.4
PING 192.168.1.4 (192.168.1.4) 56(84) bytes of data.
^C
--- 192.168.1.4 ping statistics ---
3 packets transmitted, 0 received, 100% packet loss, time 2016ms

And that is it. Now, ping uses raw IP sockets to send ICMP_ECHO messages. The useful thing about ICMP_ECHO is that it gets a
response from any IP stack that has not deliberately had echo switched off. That's still a common practice on corporate websites
that fear the old "ping of death" exploit where malformed messages could crash the machine.

I call this preemptive discovery because it doesn't take any cooperation from the device. We don't rely on any cooperation from
the phones to see them sitting there; as long as they're not actively ignoring us, we can see them.

You might ask why this is useful. We don't know that the peers responding to ICMP_ECHO run ZeroMQ, that they are interested in
talking to us, that they have any services we can use, or even what kind of device they are. However, knowing that there's
something on address 192.168.1.3 is already useful. We also know how far away the device is, relatively, we know how many
devices are on the network, and we know the rough state of the network (as in, good, poor, or terrible).

It isn't even hard to create ICMP_ECHO messages and send them. A few dozen lines of code, and we could use ZeroMQ
multithreading to do this in parallel for addresses stretching out above and below our own IP address. Could be kind of fun.

However, sadly, there's a fatal flaw in my idea of using ICMP_ECHO to discover devices. To open a raw IP socket requires root
privileges on a POSIX box. It stops rogue programs getting data meant for others. We can get the power to open raw sockets on
Linux by giving sudo privileges to our command (ping has the so-called setuid bit set). On a mobile OS like Android, it requires
root access, i.e., rooting the phone or tablet. That's out of the question for most people and so ICMP_ECHO is out of reach for
most devices.

Expletive deleted! Let's try something in user space. The next step most people take is UDP multicast or broadcast. Let's follow
that trail.
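
Before following it, you can see the raw socket restriction for yourself with a few lines of C. This is a minimal sketch and not code from the examples repository; the function name can_open_raw_icmp is invented for illustration. It tries to open the same kind of raw ICMP socket that ping uses and reports whether the operating system allowed it:

```c
#include <assert.h>
#include <errno.h>
#include <unistd.h>
#include <sys/socket.h>
#include <netinet/in.h>

//  Try to open a raw ICMP socket, as ping does.
//  Returns 1 if the OS allowed it, 0 if it refused for lack of
//  privilege, and -1 for any other error.
//  (Hypothetical helper, named for this sketch only.)
int
can_open_raw_icmp (void)
{
    int handle = socket (AF_INET, SOCK_RAW, IPPROTO_ICMP);
    if (handle >= 0) {
        close (handle);
        return 1;
    }
    if (errno == EPERM || errno == EACCES)
        return 0;
    return -1;
}
```

Run from an ordinary user account I would expect this to return 0 on most POSIX systems, and 1 when run as root, though the exact policy depends on how the system is configured.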

Cooperative Discovery Using UDP Broadcasts

Multicast tends to be seen as more modern and "better" than broadcast. In IPv6, broadcast doesn't work at all: you must always
use multicast. Nonetheless, all IPv4 local network discovery protocols end up using UDP broadcast anyhow. The reasons:
broadcast and multicast end up working much the same, except broadcast is simpler and less risky. Multicast is seen by network
admins as kind of dangerous, as it can leak over network segments.

If you've never used UDP, you'll discover it's quite a nice protocol. In some ways, it reminds us of ZeroMQ, sending whole
messages to peers using two different patterns: one-to-one, and one-to-many. The main problems with UDP are that (a) the
POSIX socket API was designed for universal flexibility, not simplicity, (b) UDP messages are limited for practical purposes to
about 1,500 bytes on LANs and 512 bytes on the Internet, and (c) when you start to use UDP for real data, you find that
messages get dropped, especially as infrastructure tends to favor TCP over UDP.

Here is a minimal ping program that uses UDP instead of ICMP_ECHO:

udpping1: UDP discovery, model 1 in C

This code uses a single socket to broadcast 1-byte messages and receive anything that other nodes are broadcasting. When I
run it, it shows just one node, which is itself:

Pinging peers...
Found peer 192.168.1.2:9999
Pinging peers...
Found peer 192.168.1.2:9999

If I switch off all networking and try again, sending a message fails, as I'd expect:

Pinging peers...
sendto: Network is unreachable
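
If you don't have the listing to hand, the heart of such a program can be sketched with plain POSIX sockets. This is an illustrative sketch under stated assumptions rather than the udpping1 code itself: the function names ping_socket_new, ping_send, and ping_recv are hypothetical, the polling loop is left out, and the port 9999 is taken from the output shown above.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

#define PING_PORT_NUMBER 9999

//  Create a UDP socket that may send broadcasts, bound to 'port'.
//  Returns the socket handle, or -1 on failure.
int
ping_socket_new (int port)
{
    int handle = socket (AF_INET, SOCK_DGRAM, IPPROTO_UDP);
    if (handle < 0)
        return -1;

    //  Ask the OS to let us send to a broadcast address
    int on = 1;
    if (setsockopt (handle, SOL_SOCKET, SO_BROADCAST, &on, sizeof (on)) < 0) {
        close (handle);
        return -1;
    }
    //  Bind so that we also receive what other nodes broadcast
    struct sockaddr_in addr;
    memset (&addr, 0, sizeof (addr));
    addr.sin_family = AF_INET;
    addr.sin_port = htons ((uint16_t) port);
    addr.sin_addr.s_addr = htonl (INADDR_ANY);
    if (bind (handle, (struct sockaddr *) &addr, sizeof (addr)) < 0) {
        close (handle);
        return -1;
    }
    return handle;
}

//  Broadcast a 1-byte ping; returns 0 on success, -1 on failure
int
ping_send (int handle, int port)
{
    struct sockaddr_in broadcast;
    memset (&broadcast, 0, sizeof (broadcast));
    broadcast.sin_family = AF_INET;
    broadcast.sin_port = htons ((uint16_t) port);
    broadcast.sin_addr.s_addr = htonl (INADDR_BROADCAST);
    char ping = '!';
    ssize_t sent = sendto (handle, &ping, 1, 0,
        (struct sockaddr *) &broadcast, sizeof (broadcast));
    return sent == 1? 0: -1;
}

//  Block until one ping arrives and print who sent it.
//  Returns the number of bytes received, or -1 on error.
ssize_t
ping_recv (int handle)
{
    char buffer [16];
    struct sockaddr_in peer;
    socklen_t peer_len = sizeof (peer);
    ssize_t size = recvfrom (handle, buffer, sizeof (buffer), 0,
        (struct sockaddr *) &peer, &peer_len);
    if (size > 0)
        printf ("Found peer %s:%d\n",
            inet_ntoa (peer.sin_addr), ntohs (peer.sin_port));
    return size;
}
```

A caller would create the socket once with ping_socket_new (PING_PORT_NUMBER), then alternate ping_send and ping_recv, ideally under a poll loop so that a silent network does not block the sender.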

Working on the basis of "solve the problems currently aiming at your throat", let's fix the most urgent issues in this first model.
These issues are:

Using the 255.255.255.255 broadcast address is a bit dubious. On the one hand, this broadcast address means precisely
"send to all nodes on the local network, and don't forward". However, if you have several interfaces (wired Ethernet, WiFi)
then broadcasts will go out on your default route only, and via just one interface. What we want to do is either send our
broadcast on each interface's broadcast address, or find the WiFi interface and its broadcast address.

Like many aspects of socket programming, getting information on network interfaces is not portable. Do we want to write
nonportable code in our applications? No, this is better hidden in a library.

There's no handling for errors except "abort", which is too brutal for transient problems like "your WiFi is switched off". The
code should distinguish between soft errors (ignore and retry) and hard errors (assert).

The code needs to know its own IP address and ignore beacons that it sent out. Like finding the broadcast address, this
requires inspecting the available interfaces.
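
The soft versus hard error distinction can be sketched as a small classifier over errno values. This is only an illustration: the function names are hypothetical, and which error codes count as "soft" is my assumption here, not a list taken from udplib.

```c
#include <assert.h>
#include <errno.h>

//  Returns 1 if the errno value looks transient (ignore and retry),
//  0 if it should be treated as a hard error.
//  The set of "soft" codes below is an assumption for illustration.
int
udp_error_is_soft (int error)
{
    switch (error) {
        case EAGAIN:            //  Nothing to read or write right now
        case EINTR:             //  Interrupted by a signal
        case ENETDOWN:          //  Interface is down, e.g., WiFi off
        case ENETUNREACH:       //  No route, e.g., networking disabled
        case EHOSTUNREACH:
            return 1;
        default:
            return 0;
    }
}

//  Hypothetical wrapper: call after a failed sendto() or recvfrom().
//  Soft errors return quietly so the caller retries on the next ping
//  interval; anything else trips the assertion.
void
udp_handle_error (int error)
{
    assert (udp_error_is_soft (error));
}
```

With this in place, the "Network is unreachable" failure shown earlier would be skipped and retried instead of aborting the program.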

The simplest answer to these issues is to push the UDP code into a separate library that provides a clean API, like this:

//  Constructor
static udp_t *
    udp_new (int port_nbr);

//  Destructor
static void
    udp_destroy (udp_t **self_p);

//  Returns UDP socket handle
static int
    udp_handle (udp_t *self);

//  Send message using UDP broadcast
static void
    udp_send (udp_t *self, byte *buffer, size_t length);

//  Receive message from UDP broadcast
static ssize_t
    udp_recv (udp_t *self, byte *buffer, size_t length);

Here is the refactored UDP ping program that calls this library, which is much cleaner and nicer:

udpping2: UDP discovery, model 2 in C

The library, udplib, hides a lot of the unpleasant code (which will become uglier as we make this work on more systems). I'm
not going to print that code here. You can read it in the repository.

Now, there are more problems sizing us up and wondering if they can make lunch out of us. First, IPv4 versus IPv6 and multicast
versus broadcast. In IPv6, broadcast doesn't exist at all; one uses multicast. From my experience with WiFi, IPv4 multicast and
broadcast work identically except that multicast breaks in some situations where broadcast works fine. Some access points do
not forward multicast packets. When you have a device (e.g., a tablet) that acts as a mobile AP, then it's possible it won't get
multicast packets. Meaning, it won't see other peers on the network.

The simplest plausible solution is simply to ignore IPv6 for now, and use broadcast. A perhaps smarter solution would be to use
multicast and deal with asymmetric beacons if they happen.

We'll stick with stupid and simple for now. There's always time to make it more complex.

Multiple Nodes on One Device

So we can discover nodes on the WiFi network, as long as they're sending out beacons as we expect. So I try to test with two
processes. But when I run udpping2 twice, the second instance complains "'Address already in use' on bind" and exits. Oh, right.
UDP and TCP both return an error if you try to bind two different sockets to the same port. This is right. The semantics of two
readers on one socket would be weird to say the least. Odd/even bytes? You get all the 1s, I get all the 0s?

However, a quick check of stackoverflow.com and some memory of a socket option called SO_REUSEADDR turns up gold. If I use
that, I can bind several processes to the same UDP port, and they will all receive any broadcast message arriving on that port.
It's almost as if the guys who designed this were reading my mind! (That's way more plausible than the chance that I may be
reinventing the wheel.)

A quick test shows that SO_REUSEADDR works as promised. This is great because the next thing I want to do is design an API
and then start dozens of nodes to see them discovering each other. It would be really cumbersome to have to test each node on
a separate device. And when we get to testing how real traffic behaves on a large, flaky network, the two alternatives are
simulation or temporary insanity.
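
Such a quick test can be sketched in a few lines. The helper names shared_udp_bind and udp_local_port are hypothetical, and the behavior shown is what I'd expect on Linux; some other systems (BSD and macOS, for instance) want SO_REUSEPORT for this, so treat it as an assumption to verify on your own platform.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <netinet/in.h>

//  Open a UDP socket with SO_REUSEADDR set and bind it to 'port'.
//  Port 0 asks the OS to pick a free port. Returns handle or -1.
int
shared_udp_bind (int port)
{
    int handle = socket (AF_INET, SOCK_DGRAM, IPPROTO_UDP);
    if (handle < 0)
        return -1;

    //  Must be set before bind(), on every socket sharing the port
    int on = 1;
    if (setsockopt (handle, SOL_SOCKET, SO_REUSEADDR, &on, sizeof (on)) < 0) {
        close (handle);
        return -1;
    }
    struct sockaddr_in addr;
    memset (&addr, 0, sizeof (addr));
    addr.sin_family = AF_INET;
    addr.sin_port = htons ((uint16_t) port);
    addr.sin_addr.s_addr = htonl (INADDR_ANY);
    if (bind (handle, (struct sockaddr *) &addr, sizeof (addr)) < 0) {
        close (handle);
        return -1;
    }
    return handle;
}

//  Returns the local port a bound socket is using, or -1 on error
int
udp_local_port (int handle)
{
    struct sockaddr_in addr;
    socklen_t length = sizeof (addr);
    if (getsockname (handle, (struct sockaddr *) &addr, &length) < 0)
        return -1;
    return ntohs (addr.sin_port);
}
```

Binding one socket with port 0, reading back its port with udp_local_port, and binding a second socket to that same port should succeed where the plain bind in udpping2 failed.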

And I speak from experience: we were, this summer, testing on dozens of devices at once. It takes about an hour to set up a full
test run, and you need a space shielded from WiFi interference if you want any kind of reproducibility (unless your test case is
"prove that interference kills WiFi networks faster than Orval can kill a thirst").

If I were a whiz Android developer with a free weekend, I'd immediately (as in, it would take me two days) port this code to my
phone and get it sending beacons to my PC. But sometimes lazy is more profitable. I like my Linux laptop. I like being able to
start a dozen threads from one process, and have each thread acting like an independent node. I like not having to work in a real
Faraday cage when I can simulate one on my laptop.

Designing the API

I'm going to run N nodes on a device, and they are going to have to discover each other, as well as a bunch of other nodes out
there on the local network. I can use UDP for local discovery as well as remote discovery. It's arguably not as efficient as using,
e.g., the ZeroMQ inproc:// transport, but it has the great advantage that the exact same code will work in simulation and in real
deployment.

If I have multiple nodes on one device, we clearly can't use the IP address and port number as node address. I need some logical
node identifier. Arguably, the node identifier only has to be unique within the context of the device. My mind fills with complex
stuff I could make, like supernodes that sit on real UDP ports and forward messages to internal nodes. I hit my head on the table
until the idea of inventing new concepts leaves it.

Experience tells us that WiFi does things like disappear and reappear while applications are running. Users click on things, which
does interesting things like change the IP address halfway through a session. We cannot depend on IP addresses, nor on
established connections (in the TCP fashion). We need some long-lasting addressing mechanism that survives interfaces and
connections being torn down and then recreated.

Here's the simplest solution I can see: we give every node a UUID, and specify that nodes, represented by their UUIDs, can
appear or reappear at certain IP address:port endpoints, and then disappear again. We'll deal with recovery from lost messages
later. A UUID is 16 bytes. So if I have 100 nodes on a WiFi network, each beaconing once a second, that's (double it for other
random stuff) 3,200 bytes a second of beacon data that the air has to carry just for discovery and presence. Seems acceptable.

Back to concepts. We do need some names for our API. At the least we need a way to distinguish between the node object that
is "us", and node objects that are our peers. We'll be doing things like creating an "us" and then asking it how many peers it
knows about and who they are. The term "peer" is clear enough.

From the developer point of view, a node (the application) needs a way to talk to the outside world. Let's borrow a term from
networking and call this an "interface". The interface represents us to the rest of the world and presents the rest of the world to
us, as a set of other peers. It automatically does whatever discovery it must. When we want to talk to a peer, we get the interface
to do that for us. And when a peer talks to us, it's the interface that delivers us the message.

This seems like a clean API design. How about the internals?

The interface must be multithreaded so that one thread can do I/O in the background, while the foreground API talks to the
application. We used this design in the Clone and Freelance client APIs.

The interface background thread does the discovery business: bind to the UDP port, send out UDP beacons, and receive
beacons.

We need to at least send UUIDs in the beacon message so that we can distinguish our own beacons from those of our
peers.

We need to track peers that appear, and that disappear. For this, I'll use a hash table that stores all known peers and
expire peers after some timeout.

We need a way to report peers and events to the caller. Here we get into a juicy question. How does a background I/O
thread tell a foreground API thread that stuff is happening? Callbacks maybe? Heck no. We'll use ZeroMQ messages, of
course.
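
The peer tracking and expiry can be sketched without any ZeroMQ at all. To keep the example short and self-contained, this sketch swaps the hash table for a small fixed array with a linear search; the type and function names, the table size, and passing the clock in as a parameter are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define PEER_MAX     64         //  Table capacity (illustrative)
#define PEER_EXPIRY  5000       //  Msecs of silence before expiry

typedef struct {
    uint8_t uuid [16];          //  Peer identity from its beacon
    int64_t expires_at;         //  Clock value at which peer expires
    int in_use;                 //  1 if this slot holds a known peer
} peer_t;

typedef struct {
    peer_t peers [PEER_MAX];
} peer_table_t;

//  Record a beacon from a peer at time 'now' (in msecs). Returns 1 if
//  the peer is new (we'd report JOINED), 0 if already known, and -1
//  if the table is full.
int
peer_table_seen (peer_table_t *self, const uint8_t *uuid, int64_t now)
{
    int free_slot = -1;
    for (int index = 0; index < PEER_MAX; index++) {
        peer_t *peer = &self->peers [index];
        if (peer->in_use && memcmp (peer->uuid, uuid, 16) == 0) {
            peer->expires_at = now + PEER_EXPIRY;
            return 0;
        }
        if (!peer->in_use && free_slot < 0)
            free_slot = index;
    }
    if (free_slot < 0)
        return -1;
    peer_t *peer = &self->peers [free_slot];
    memcpy (peer->uuid, uuid, 16);
    peer->expires_at = now + PEER_EXPIRY;
    peer->in_use = 1;
    return 1;
}

//  Drop every peer whose expiry time has passed. Returns how many
//  peers were removed (each would be reported as LEFT).
int
peer_table_expire (peer_table_t *self, int64_t now)
{
    int removed = 0;
    for (int index = 0; index < PEER_MAX; index++) {
        peer_t *peer = &self->peers [index];
        if (peer->in_use && now >= peer->expires_at) {
            peer->in_use = 0;
            removed++;
        }
    }
    return removed;
}
```

In a real interface the background thread would call peer_table_seen for each valid beacon and peer_table_expire on each poll cycle, turning the return values into messages for the application.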

The third iteration of the UDP ping program is even simpler and more beautiful than the second. The main body, in C, is just ten
lines of code.

udpping3: UDP discovery, model 3 in C

The interface code should be familiar if you've studied how we make multithreaded API classes:

interface: UDP ping interface in C

When I run this in two windows, it reports one peer joining the network. I kill that peer and a few seconds later, it tells me the peer
left:

[006] JOINED
[032] 418E98D4B7184844B7D5E0EE5691084C

[004] LEFT
[032] 418E98D4B7184844B7D5E0EE5691084C

What's nice about a ZeroMQ-message based API is that I can wrap this any way I like. For instance, I can turn it into callbacks if I
really want those. I can also trace all activity on the API very easily.

Some notes about tuning. On Ethernet, five seconds (the expiry time I used in this code) seems like a lot. On a badly stressed
WiFi network, you can get ping latencies of 30 seconds or more. If you use a too-aggressive value for the expiry, you'll disconnect
nodes that are still there. On the other side, end user applications expect a certain liveliness. If it takes 30 seconds to report that
a node has gone, users will get annoyed.

A decent strategy is to detect and report disappeared nodes rapidly, but only delete them after a longer interval. Visually, a node
would be green when it's alive, then gray for a while as it went out of reach, then finally disappear. We're not doing this now, but
will do it in the real implementation of the as-yet-unnamed framework we're making.

As we will also see later, we have to treat any input from a node, not just UDP beacons, as a sign of life. UDP may get squashed
when there's a lot of TCP traffic. This is perhaps the main reason we're not using an existing UDP discovery library: it's necessary
to integrate this tightly with our ZeroMQ messaging for it to work.

More About UDP

So we have discovery and presence working over UDP IPv4 broadcasts. It's not ideal, but it works for the local networks we have
today. However we can't use UDP for real work, not without additional work to make it reliable. There's a joke about UDP but
sometimes you'll get it, and sometimes you won't.

We'll stick to TCP for all one-to-one messaging. There is one more use case for UDP after discovery, which is multicast file
distribution. I'll explain why and how, then shelve that for another day. The why is simple: what we call "social networks" is just
augmented culture. We create culture by sharing, and this means more and more sharing works that we make or remix. Photos,
documents, contracts, tweets. The clouds of devices we're aiming towards do more of this, not less.

Now, there are two principal patterns for sharing content. One is the pub-sub pattern where one node sends out content to a set
of other nodes simultaneously. The second is the "late joiner" pattern, where a node arrives somewhat later and wants to catch up
to the conversation. We can deal with the late joiner using TCP unicast. But doing TCP unicast to a group of clients at the same
time has some disadvantages. First, it can be slower than multicast. Second, it's unfair because some will get the content before
others.

Before you jump off to design a UDP multicast protocol, realize that it's not a simple calculation. When you send a multicast
packet, the WiFi access point uses a low bit rate to ensure that even the furthest devices will get it safely. Most normal APs don't
do the obvious optimization, which is to measure the distance of the furthest device and use that bit rate. Instead, they just use a
fixed value. So if you have a few devices close to the AP, multicast will be insanely slow. But if you have a roomful of devices
which all want to get the next chapter of the textbook, multicast can be insanely effective.

The curves cross at about 6-12 devices depending on the network. In theory, you could measure the curves in real time and
create an adaptive protocol. That would be cool but probably too hard for even the smartest of us.

If you do sit down and sketch out a UDP multicast protocol, realize that you need a channel for recovery, to get lost packets.
You'd probably want to do this over TCP, using ZeroMQ. For now, however, we'll forget about multicast UDP and assume all
traffic goes over TCP.

Spinning Off a Library Project

At this stage, however, the code is growing larger than an example should be, so it's time to create a proper GitHub project. It's a
rule: build your projects in public view, and tell people about them as you go so your marketing and community building starts on
Day 1. I'll walk through what this involves. I explained in Chapter 6 - The ZeroMQ Community about growing communities around
projects. We need a few things:

A name
A slogan
A public GitHub repository
A README that links to the C4 process
License files
An issue tracker
Two maintainers
A first bootstrap version

The name and slogan first. The trademarks of the 21st century are domain names. So the first thing I do when spinning off a
project is to look for a domain name that might work. Quite randomly, one of our old messaging projects was called "Zyre" and I
have the domain name for it. The full name is a backronym: the ZeroMQ Realtime Exchange framework.

I'm somewhat shy about pushing new projects into the ZeroMQ community too aggressively, and normally would start a project in
either my personal account or the iMatix organization. But we've learned that moving projects after they become popular is
counterproductive. My predictions of a future filled with moving pieces are either valid or wrong. If this chapter is valid, we might
as well launch this as a ZeroMQ project from the start. If it's wrong, we can delete the repository later or let it sink to the bottom of
a long list of forgotten starts.

Start with the basics. The protocol (UDP and ZeroMQ/TCP) will be ZRE (ZeroMQ Realtime Exchange protocol) and the project
will be Zyre. I need a second maintainer, so I invite my friend Dong Min (the Korean hacker behind JeroMQ, a pure-Java ZeroMQ
stack) to join. He's been working on very similar ideas so is enthusiastic. We discuss this and we get the idea of building Zyre on
top of JeroMQ, as well as on top of CZMQ and libzmq. This would make it a lot easier to run Zyre on Android. It would also give
us two fully separate implementations from the start, which is always a good thing for a protocol.

So we take the FileMQ project I built in Chapter 7 - Advanced Architecture using ZeroMQ as a template for a new GitHub project.
The GNU autoconf tools are quite decent, but have a painful syntax. It's easiest to copy existing project files and modify them.
The FileMQ project builds a library, has test tools, license files, man pages, and so on. It's not too large so it's a good starting
point.

I put together a README to summarize the goals of the project and point to C4. The issue tracker is enabled by default on new
GitHub projects, so once we've pushed the UDP ping code as a first version, we're ready to go. However, it's always good to
recruit more maintainers, so I create an issue "Call for maintainers" that says:

If you'd like to help click that lovely green "Merge Pull Request" button and get eternal good karma, add a comment
confirming that you've read and understand the C4 process at http://rfc.zeromq.org/spec:22.

Finally, I change the issue tracker labels. By default, GitHub offers the usual variety of issue types, but with C4 we don't use
them. Instead, we need just two labels ("Urgent", in red, and "Ready", in black).

Point-to-Point Messaging

I'm going to take the last UDP ping program and build a point-to-point messaging layer on top of that. Our goal is that we can
detect peers as they join and leave the network, that we can send messages to them, and that we can get replies. It is a nontrivial
problem to solve and takes Min and me two days to get a "Hello World" version working.

We had to solve a number of issues:

What information to send in the UDP beacon, and how to format it.
What ZeroMQ socket types to use to interconnect nodes.
What ZeroMQ messages to send, and how to format them.
How to send a message to a specific node.
How to know the sender of any message so we could send a reply.
How to recover from lost UDP beacons.
How to avoid overloading the network with beacons.

I'll explain these in enough detail so that you understand why we made each choice we did, with some code fragments to
illustrate. We tagged this code as version 0.1.0 so you can look at the code: most of the hard work is done in
zre_interface.c.

UDP Beacon Framing

Sending UUIDs across the network is the bare minimum for a logical addressing scheme. However, we have a few more aspects
to get working before this will work in real use:

We need some protocol identification so that we can check for and reject invalid packets.
We need some version information so that we can change this protocol over time.
We need to tell other nodes how to reach us via TCP, i.e., a ZeroMQ port they can talk to us on.

Let's start with the beacon message format. We probably want a fixed protocol header that will never change in future versions
and a body that depends on the version.

Figure 67 - ZRE discovery message

The version can be a 1-byte counter starting at 1. The UUID is 16 bytes and the port is a 2-byte port number because UDP nicely
tells us the sender's IP address for every message we receive. Together with the 3-byte "ZRE" protocol header, this gives us a
22-byte frame.

The C language (and a few others like Erlang) makes it simple to read and write binary structures. We define the beacon frame
structure:

#define BEACON_PROTOCOL     "ZRE"
#define BEACON_VERSION      0x01

typedef struct {
    byte protocol [3];
    byte version;
    uuid_t uuid;
    uint16_t port;
} beacon_t;

This makes sending and receiving beacons quite simple. Here is how we send a beacon, using the zre_udp class to do the
nonportable network calls:

//  Beacon object
beacon_t beacon;

//  Format beacon fields
beacon.protocol [0] = 'Z';
beacon.protocol [1] = 'R';
beacon.protocol [2] = 'E';
beacon.version = BEACON_VERSION;
memcpy (beacon.uuid, self->uuid, sizeof (uuid_t));
beacon.port = htons (self->port);

//  Broadcast the beacon to anyone who is listening
zre_udp_send (self->udp, (byte *) &beacon, sizeof (beacon_t));

When we receive a beacon, we need to guard against bogus data. We're not going to be paranoid against, for example,
denial-of-service attacks. We just want to make sure that we're not going to crash when a bad ZRE implementation sends us
erroneous frames.

To validate a frame, we check its size and header. If those are OK, we assume the body is usable. When we get a UUID that isn't
ourselves (recall, we'll get our own UDP broadcasts back), we can treat this as a peer:

//  Get beacon frame from network
beacon_t beacon;
ssize_t size = zre_udp_recv (self->udp,
    (byte *) &beacon, sizeof (beacon_t));

//  Basic validation on the frame
if (size != sizeof (beacon_t)
||  beacon.protocol [0] != 'Z'
||  beacon.protocol [1] != 'R'
||  beacon.protocol [2] != 'E'
||  beacon.version != BEACON_VERSION)
    return 0;               //  Ignore invalid beacons

//  If we got a UUID and it's not our own beacon, we have a peer
if (memcmp (beacon.uuid, self->uuid, sizeof (uuid_t))) {
    char *identity = s_uuid_str (beacon.uuid);
    s_require_peer (self, identity,
        zre_udp_from (self->udp), ntohs (beacon.port));
    free (identity);
}
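
Casting a struct to bytes works as long as the compiler lays out beacon_t without padding, which is something to check on each platform. If you are re-implementing the beacon in another environment, a byte-by-byte encoding is one way to avoid that dependency. The following is an alternative sketch, not the Zyre code: the names beacon_encode and beacon_decode are hypothetical, and the layout (3-byte "ZRE", 1-byte version, 16-byte UUID, 2-byte port in network order) is taken from the description above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BEACON_SIZE     22
#define BEACON_VERSION  0x01

//  Write a 22-byte beacon into 'frame': "ZRE", version, UUID, port
void
beacon_encode (uint8_t *frame, const uint8_t *uuid, uint16_t port)
{
    memcpy (frame, "ZRE", 3);
    frame [3] = BEACON_VERSION;
    memcpy (frame + 4, uuid, 16);
    frame [20] = (uint8_t) (port >> 8);     //  Network byte order
    frame [21] = (uint8_t) (port & 0xFF);
}

//  Validate a received frame and extract its fields.
//  Returns 0 if OK, -1 if the frame should be ignored.
int
beacon_decode (const uint8_t *frame, size_t size,
               uint8_t *uuid, uint16_t *port)
{
    if (size != BEACON_SIZE
    ||  memcmp (frame, "ZRE", 3) != 0
    ||  frame [3] != BEACON_VERSION)
        return -1;
    memcpy (uuid, frame + 4, 16);
    *port = (uint16_t) ((frame [20] << 8) | frame [21]);
    return 0;
}
```

The receiver would still compare the decoded UUID against its own before treating the sender as a peer, exactly as in the fragment above.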

True Peer Connectivity (Harmony Pattern)

Because ZeroMQ is designed to make distributed messaging easy, people often ask how to interconnect a set of true peers (as
compared to obvious clients and servers). It is a thorny question and ZeroMQ doesn't really provide a single clear answer.

TCP, which is the most commonly-used transport in ZeroMQ, is not symmetric; one side must bind and one must connect, and
though ZeroMQ tries to be neutral about this, it's not. When you connect, you create an outgoing message pipe. When you bind,
you do not. When there is no pipe, you cannot write messages (ZeroMQ will return EAGAIN).

Developers who study ZeroMQ and then try to create N-to-N connections between sets of equal peers often try a
ROUTER-to-ROUTER flow. It's obvious why: each peer needs to address a set of peers, which requires ROUTER. It usually ends
with a plaintive email to the list.

Experience teaches us that ROUTER-to-ROUTER is particularly difficult to use successfully. At a minimum, one peer must bind
and one must connect, meaning the architecture is not symmetrical. It is also difficult because you simply can't tell when you are
allowed to safely send a message to a peer. It's a Catch-22: you can talk to a peer after it's talked to you, but the peer can't talk
to you until you've talked to it. One side or the other will be losing messages and thus has to retry, which means the peers cannot
be equal.

I'm going to explain the Harmony pattern, which solves this problem, and which we use in Zyre.

We want a guarantee that when a peer "appears" on our network, we can talk to it safely without ZeroMQ dropping messages.
For this, we have to use a DEALER or PUSH socket that connects out to the peer so that even if that connection takes some
non-zero time, there is immediately a pipe and ZeroMQ will accept outgoing messages.

A DEALER socket cannot address multiple peers individually. But if we have one DEALER per peer, and we connect that
DEALER to the peer, we can safely send messages to a peer as soon as we've connected to it.

Now, the next problem is to know who sent us a particular message. We need a reply address that is the UUID of the node who
sent any given message. DEALER can't do this unless we prefix every single message with that 16-byte UUID, which would be
wasteful. ROUTER does do it if we set the identity properly before connecting to the router.

And so the Harmony pattern comes down to these components:

One ROUTER socket that we bind to an ephemeral port, which we broadcast in our beacons.
One DEALER socket per peer that we connect to the peer's ROUTER socket.
Reading from our ROUTER socket.
Writing to the per-peer DEALER socket.

The next problem is that discovery isn't neatly synchronized. We can get the first beacon from a peer after we start to receive
messages from it. A message comes in on the ROUTER socket and has a nice UUID attached to it, but no physical IP address
and port. We have to force discovery over TCP. To do this, our first command to any new peer to which we connect is an OHAI
command with our IP address and port. This ensures that the receiver connects back to us before trying to send us any command.

Hereitis,brokendownintosteps:

IfwereceiveaUDPbeaconfromanewpeer,weconnecttothepeerthroughaDEALERsocket.
WereadmessagesfromourROUTERsocket,andeachmessagecomeswiththeUUIDofthesender.
Ifit'sanOHAImessage,weconnectbacktothatpeerifnotalreadyconnectedtoit.
Ifit'sanyothermessage,wemustalreadybeconnectedtothepeer(agoodplaceforanassertion).
WesendmessagestoeachpeerusingtheperpeerDEALERsocket,whichmustbeconnected.
Whenweconnecttoapeer,wealsotellourapplicationthatthepeerexists.

Every time we get a message from a peer, we treat that as a heartbeat (it's alive).
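
The steps for incoming messages can be sketched as a small decision function. This is a minimal sketch rather than Zyre's actual code: the names harmony_action_t and harmony_on_message are hypothetical, and a real implementation would also create the DEALER socket and notify the application.

```c
#include <assert.h>
#include <stdbool.h>

//  What to do with a message read from our ROUTER socket
//  (hypothetical names, for illustration only)
typedef enum {
    HARMONY_CONNECT_BACK,       //  OHAI from a peer we're not connected to
    HARMONY_PROCESS,            //  We're connected; handle the command
    HARMONY_PROTOCOL_ERROR      //  Other command from an unconnected peer
} harmony_action_t;

harmony_action_t
harmony_on_message (bool is_ohai, bool connected)
{
    if (connected)
        return HARMONY_PROCESS;
    //  Not connected: only OHAI is acceptable, and it makes us connect
    //  back. Anything else is the "good place for an assertion".
    return is_ohai? HARMONY_CONNECT_BACK: HARMONY_PROTOCOL_ERROR;
}
```

After a HARMONY_CONNECT_BACK the caller would connect a new DEALER socket to the address carried in the OHAI and then treat the peer as connected.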

If we were not using UDP but some other discovery mechanism, I'd still use the Harmony pattern for a true peer network: one
ROUTER for input from all peers, and one DEALER per peer for output. Bind the ROUTER, connect the DEALER, and start each
conversation with an OHAI equivalent that provides the return IP address and port. You would need some external mechanism to
bootstrap each connection.

Detecting Disappearances

Heartbeating sounds simple but it's not. UDP packets get dropped when there's a lot of TCP traffic, so if we depend on UDP
beacons, we'll get false disconnections. TCP traffic can be delayed for 5, 10, even 30 seconds if the network is really busy. So if
we kill peers when they go quiet, we'll have false disconnections.

Because UDP beacons aren't reliable, it's tempting to add in TCP beacons. After all, TCP will deliver them reliably. However,
there's one little problem. Imagine that you have 100 nodes on a network, and each node sends a TCP beacon to every other
node once a second. Each beacon is 22 bytes, not counting TCP's framing overhead. That is 100 * 99 * 22 bytes per second, or
217,800 bytes/second just for heartbeating. That's about 1-2% of a typical WiFi network's ideal capacity, which sounds OK. But
when a network is stressed or fighting other networks for airspace, that extra 200K a second will break what's left. UDP
broadcasts are at least low cost.
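
To make the arithmetic concrete, here is a minimal sketch of the calculation. The function name tcp_heartbeat_load is my own, and it assumes every node sends one beacon per second to every other node, ignoring TCP framing overhead as the text does.

```c
#include <assert.h>

//  Bytes per second on the network when every node sends one TCP
//  heartbeat per second to every other node (framing overhead ignored)
long
tcp_heartbeat_load (long nodes, long beacon_size)
{
    assert (nodes >= 1 && beacon_size >= 0);
    return nodes * (nodes - 1) * beacon_size;
}
```

With 100 nodes and 22-byte beacons this gives 217,800 bytes per second. The cost grows with the square of the number of nodes, which is why it hurts on a busy network.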

So what we do is switch to TCP heartbeats only when a specific peer hasn't sent us any UDP beacons in a while. And then we
send TCP heartbeats only to that one peer. If the peer continues to be silent, we conclude it's gone away. If the peer comes back
with a different IP address and/or port, we have to disconnect our DEALER socket and reconnect to the new port.

This gives us a set of states for each peer, though at this stage the code doesn't use a formal state machine:

Peer visible thanks to UDP beacon (we connect using IP address and port from beacon)
Peer visible thanks to OHAI command (we connect using IP address and port from command)
Peer seems alive (we got a UDP beacon or command over TCP recently)
Peer seems quiet (no activity in some time, so we send a HUGZ command)
Peer has disappeared (no reply to our HUGZ commands, so we destroy peer)
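
The last three states depend only on how long the peer has been silent, which can be sketched as below. The two thresholds and all the names are assumptions for illustration; the text does not fix these values at this point.

```c
#include <assert.h>
#include <stdint.h>

//  Assumed thresholds in msec (illustrative values, not from the text)
#define PEER_EVASIVE    5000    //  Silent this long: send a HUGZ
#define PEER_EXPIRED   10000    //  Silent this long: destroy the peer

typedef enum { PEER_ALIVE, PEER_QUIET, PEER_GONE } peer_state_t;

//  Classify a peer from the time we last heard anything from it,
//  whether a UDP beacon or a command over TCP
peer_state_t
peer_state (int64_t now, int64_t last_seen)
{
    assert (now >= last_seen);
    int64_t silence = now - last_seen;
    if (silence >= PEER_EXPIRED)
        return PEER_GONE;
    if (silence >= PEER_EVASIVE)
        return PEER_QUIET;
    return PEER_ALIVE;
}
```

Any message from the peer resets last_seen, so a reply to our HUGZ moves a quiet peer straight back to alive.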

There's one remaining scenario we didn't address in the code at this stage. It's possible for a peer to change IP addresses and
ports without actually triggering a disappearance event. For example, if the user switches off WiFi and then switches it back on,
the access point can assign the peer a new IP address. We'll need to handle a disappeared WiFi interface on our node by
unbinding the ROUTER socket and rebinding it when we can. Because this is not central to the design now, I decide to log an
issue on the GitHub tracker and leave it for a rainy day.

Group Messaging

Group messaging is a common and very useful pattern. The concept is simple: instead of talking to a single node, you talk to a
"group" of nodes. The group is just a name, a string that you agree on in the application. It's precisely like using the pub-sub
prefixes in PUB and SUB sockets. In fact, the only reason I say "group messaging" and not "pub-sub" is to prevent confusion,
because we're not going to use PUB-SUB sockets for this.

PUB-SUB sockets would almost work. But we've just done such a lot of work to solve the late joiner problem. Applications are
inevitably going to wait for peers to arrive before sending messages to groups, so we have to build on the Harmony pattern rather
than start again beside it.

Let's look at the operations we want to do on groups:

We want to join and leave groups.
We want to know what other nodes are in any given group.
We want to send a message to (all nodes in) a group.

These look familiar to anyone who's used Internet Relay Chat, except that we have no server. Every node will need to keep track
of what each group represents. This information will not always be fully consistent across the network, but it will be close enough.

Our interface will track a set of groups (each an object). These are all the known groups with one or more member node,
excluding ourselves. We'll track nodes as they leave and join groups. Because nodes can join the network at any time, we have
to tell new peers what groups we're in. When a peer disappears, we'll remove it from all groups we know about.

This gives us some new protocol commands:

JOIN: we send this to all peers when we join a group.
LEAVE: we send this to all peers when we leave a group.

Plus, we add a groups field to the first command we send (renamed from OHAI to HELLO at this point because I need a larger
lexicon of command verbs).

Lastly, let's add a way for peers to double-check the accuracy of their group data. The risk is that we miss one of the above
messages. Though we are using Harmony to avoid the typical message loss at startup, it's worth being paranoid. For now, all we
need is a way to detect such a failure. We'll deal with recovery later, if the problem actually happens.

I'll use the UDP beacon for this. What we want is a rolling counter that simply tells how many join and leave operations
("transitions") there have been for a node. It starts at 0 and increments for each group we join or leave. We can use a minimal
1-byte value because that will catch all failures except the astronomically rare "we lost precisely 256 messages in a row" failure
(this is the one that hits during the first demo). We will also put the transitions counter into the JOIN, LEAVE, and HELLO
commands. And to try to provoke the problem, we'll test by joining/leaving several hundred groups with a high-water mark set to
10 or so.

It's time to choose verbs for the group messaging. We need a command that means "talk to one peer" and one that means "talk
to many peers". After some attempts, my best choices are WHISPER and SHOUT, and this is what the code uses. The SHOUT
command needs to tell the user the group name, as well as the sender peer.

Because groups are like pub-sub, you might be tempted to use this to broadcast the JOIN and LEAVE commands as well,
perhaps by creating a "global" group that all nodes join. My advice is to keep groups purely as user-space concepts for two
reasons. First, how do you join the global group if you need the global group to send out a JOIN command? Second, it creates
special cases (reserved names) which are messy.

It's simpler just to send JOINs and LEAVEs explicitly to all connected peers, period.

I'm not going to work through the implementation of group messaging in detail because it's fairly pedantic and not too exciting.
The data structures for group and peer management aren't optimal, but they're workable. We use the following:

A list of groups for our interface, which we can send to new peers in a HELLO command
A hash of groups for other peers, which we update with information from HELLO, JOIN, and LEAVE commands
A hash of peers for each group, which we update with the same three commands.
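
To show the bookkeeping these commands imply, here is a minimal sketch. It is not how Zyre does it: instead of the two hashes it uses one flat, fixed-size table of (group, peer) pairs, and every name in it (roster_t, roster_join, and so on) is hypothetical. The caller must zero the roster before first use.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

#define ROSTER_MAX 64           //  Arbitrary capacity for this sketch

typedef struct {
    bool used;
    char group [32];
    char peer [40];
} membership_t;

typedef struct {
    membership_t slots [ROSTER_MAX];
} roster_t;

//  Return the slot recording that peer is in group, or NULL
static membership_t *
s_find (roster_t *self, const char *group, const char *peer)
{
    for (int index = 0; index < ROSTER_MAX; index++) {
        membership_t *slot = &self->slots [index];
        if (slot->used
        &&  strcmp (slot->group, group) == 0
        &&  strcmp (slot->peer, peer) == 0)
            return slot;
    }
    return NULL;
}

//  Peer sent JOIN, or listed the group in its HELLO
void
roster_join (roster_t *self, const char *group, const char *peer)
{
    assert (self && group && peer);
    if (s_find (self, group, peer))
        return;                 //  Already recorded
    for (int index = 0; index < ROSTER_MAX; index++) {
        membership_t *slot = &self->slots [index];
        if (!slot->used) {
            slot->used = true;
            snprintf (slot->group, sizeof (slot->group), "%s", group);
            snprintf (slot->peer, sizeof (slot->peer), "%s", peer);
            return;
        }
    }
    assert (false);             //  Sketch only: fail loudly when full
}

//  Peer sent LEAVE
void
roster_leave (roster_t *self, const char *group, const char *peer)
{
    assert (self && group && peer);
    membership_t *slot = s_find (self, group, peer);
    if (slot)
        slot->used = false;
}

//  Peer disappeared: remove it from every group we know about
void
roster_peer_gone (roster_t *self, const char *peer)
{
    assert (self && peer);
    for (int index = 0; index < ROSTER_MAX; index++)
        if (self->slots [index].used
        &&  strcmp (self->slots [index].peer, peer) == 0)
            self->slots [index].used = false;
}

//  Number of known peers in a group
int
roster_count (roster_t *self, const char *group)
{
    assert (self && group);
    int count = 0;
    for (int index = 0; index < ROSTER_MAX; index++)
        if (self->slots [index].used
        &&  strcmp (self->slots [index].group, group) == 0)
            count++;
    return count;
}
```

A flat table makes every lookup a linear scan, which is fine for a sketch; the two hashes described above trade a little more code for faster lookups in both directions.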

At this stage, I'm starting to get pretty happy with the binary serialization (our codec generator from Chapter 7 - Advanced
Architecture using ZeroMQ), which handles lists and dictionaries as well as strings and integers.

This version is tagged in the repository as v0.2.0 and you can download the tarball if you want to check what the code looked like
at this stage.

Testing and Simulation

When you build a product out of pieces, and this includes a distributed framework like Zyre, the only way to know that it will work
properly in real life is to simulate real activity on each piece.

On Assertions

The proper use of assertions is one of the hallmarks of a professional programmer.

Our confirmation bias as creators makes it hard to test our work properly. We tend to write tests to prove the code works, rather
than trying to prove it doesn't. There are many reasons for this. We pretend to ourselves and others that we can be (could be)
perfect, when in fact we consistently make mistakes. Bugs in code are seen as "bad", rather than "inevitable", so psychologically
we want to see fewer of them, not uncover more of them. "He writes perfect code" is a compliment rather than a euphemism for
"he never takes risks so his code is as boring and heavily used as cold spaghetti".

Some cultures teach us to aspire to perfection and punish mistakes in education and work, which makes this attitude worse. To
accept that we're fallible, and then to learn how to turn that into profit rather than shame is one of the hardest intellectual
exercises in any profession. We leverage our fallibilities by working with others and by challenging our own work sooner, not
later.

One trick that makes it easier is to use assertions. Assertions are not a form of error handling. They are executable theories of
fact. The code asserts, "At this point, such and such must be true" and if the assertion fails, the code kills itself.

The faster you can prove code incorrect, the faster and more accurately you can fix it. Believing that code works and proving that
it behaves as expected is less science, more magical thinking. It's far better to be able to say, "libzmq has five hundred
assertions and despite all my efforts, not one of them fails".

So the Zyre code base is scattered with assertions, and particularly a couple on the code that deals with the state of peers. This
is the hardest aspect to get right: peers need to track each other and exchange state accurately or things stop working. The
algorithms depend on asynchronous messages flying around and I'm pretty sure the initial design has flaws. It always does.

And as I test the original Zyre code by starting and stopping instances of zre_ping by hand, every so often I get an assertion
failure. Running by hand doesn't reproduce these often enough, so let's make a proper tester tool.

On Up-Front Testing

Being able to fully test the real behavior of individual components in the laboratory can make a 10x or 100x difference to the cost
of your project. That confirmation bias engineers have to their own work makes up-front testing incredibly profitable, and
late-stage testing incredibly expensive.

I'll tell you a short story about a project we worked on in the late 1990s. We provided the software and other teams provided the
hardware for a factory automation project. Three or four teams brought their experts on-site, which was a remote factory (funny
how the polluting factories are always in remote border country).

One of these teams, a firm specializing in industrial automation, built ticket machines: kiosks, and software to run on them.
Nothing unusual: swipe a badge, choose an option, receive a ticket. They assembled two of these kiosks on-site, each week
bringing some more bits and pieces. Ticket printers, monitor screens, special keypads from Israel. The stuff had to be resistant
against dust because the kiosks sat outside. Nothing worked. The screens were unreadable in the sun. The ticket printers
continually jammed and misprinted. The internals of the kiosk just sat on wooden shelving. The kiosk software crashed regularly.
It was comedic except that the project really, really had to work and so we spent weeks and then months on-site helping the other
teams debug their bits and pieces until it worked.

A year later, there was a second factory, and the same story. By this time the client was getting impatient. So when they came to
the third and largest factory, a year later, we jumped up and said, "please let us make the kiosks and the software and
everything".

We made a detailed design for the software and hardware and found suppliers for all the pieces. It took us three months to
search the Internet for each component (in those days, the Internet was a lot slower), and another two months to get them
assembled into stainless-steel bricks each weighing about twenty kilos. These bricks were two feet square and eight inches deep,
with a large flat-screen panel behind unbreakable glass, and two connectors: one for power, one for Ethernet. You loaded up the
paper bin with enough for six months, then screwed the brick into a housing, and it automatically booted, found its DNS server,
loaded its Linux OS and then application software. It connected to the real server, and showed the main menu. You got access to
the configuration screens by swiping a special badge and then entering a code.

The software was portable so we could test that as we wrote it, and as we collected the pieces from our suppliers we kept one of
each so we had a disassembled kiosk to play with. When we got our finished kiosks, they all worked immediately. We shipped
them to the client, who plugged them into their housing, switched them on, and went to business. We spent a week or so on-site,
and in ten years, one kiosk broke (the screen died, and was replaced).

Lesson is, test up-front so that when you plug the thing in, you know precisely how it's going to behave. If you haven't tested it
up-front, you're going to be spending weeks and months in the field ironing out problems that should never have been there.

The Zyre Tester

During manual testing, I did hit an assertion rarely. It then disappeared. Because I don't believe in magic, I know that meant the
code was still wrong somewhere. So, the next step was heavy-duty testing of the Zyre v0.2.0 code to try to break its assertions,
and get a good idea of how it will behave in the field.

We packaged the discovery and messaging functionality as an interface object that the main program creates, works with, and
then destroys. We don't use any global variables. This makes it easy to start large numbers of interfaces and simulate real
activity, all within one process. And if there's one thing we've learned from writing lots of examples, it's that ZeroMQ's ability to
orchestrate multiple threads in a single process is much easier to work with than multiple processes.

The first version of the tester consists of a main thread that starts and stops a set of child threads, each running one interface,
each with a ROUTER, DEALER, and UDP socket (R, D, and U in the diagram).

Figure 68 - Zyre Tester Tool

The nice thing is that when I am connected to a WiFi access point, all Zyre traffic (even between two interfaces in the same
process) goes across the AP. This means I can fully stress test any WiFi infrastructure with just a couple of PCs running in a
room. It's hard to emphasize how valuable this is: if we had built Zyre as, say, a dedicated service for Android, we'd literally need
dozens of Android tablets or phones to do any large-scale testing. Kiosks, and all that.

The focus is now on breaking the current code, trying to prove it wrong. There's no point at this stage in testing how well it runs,
how fast it is, how much memory it uses, or anything else. We'll work up to trying (and failing) to break each individual
functionality, but first, we try to break some of the core assertions I've put into the code.

These are:

The first command that any node receives from a peer MUST be HELLO. In other words, messages cannot be lost during
the peer-to-peer connection process.

The state each node calculates for its peers matches the state each peer calculates for itself. In other words, again, no
messages are lost in the network.

When my application sends a message to a peer, we have a connection to that peer. In other words, the application only
"sees" a peer after we have established a ZeroMQ connection to it.

With ZeroMQ, there are several cases where we may lose messages. One is the "late joiner" syndrome. Two is when we close
sockets without sending everything. Three is when we overflow the high-water mark on a ROUTER or PUB socket. Four is when
we use an unknown address with a ROUTER socket.

Now, I think Harmony gets around all these potential cases. But we're also adding UDP to the mix. So the first version of the
tester simulates an unstable and dynamic network, where nodes come and go randomly. It's here that things will break.

Here is the main thread of the tester, which manages a pool of 100 threads, starting and stopping each one randomly. Every
~750 msecs it either starts or stops one random thread. We randomize the timing so that threads aren't all synchronized. After a
few minutes, we have an average of 50 threads happily chatting to each other like Korean teenagers in the Gangnam subway
station:

int main (int argc, char *argv [])
{
    //  Initialize context for talking to tasks
    zctx_t *ctx = zctx_new ();
    zctx_set_linger (ctx, 100);

    //  Get number of interfaces to simulate, default 100
    int max_interface = 100;
    int nbr_interfaces = 0;
    if (argc > 1)
        max_interface = atoi (argv [1]);

    //  We address interfaces as an array of pipes
    void **pipes = zmalloc (sizeof (void *) * max_interface);

    //  We will randomly start and stop interface threads
    while (!zctx_interrupted) {
        uint index = randof (max_interface);
        //  Toggle interface thread
        if (pipes [index]) {
            zstr_send (pipes [index], "STOP");
            zsocket_destroy (ctx, pipes [index]);
            pipes [index] = NULL;
            zclock_log ("I: Stopped interface (%d running)",
                --nbr_interfaces);
        }
        else {
            pipes [index] = zthread_fork (ctx, interface_task, NULL);
            zclock_log ("I: Started interface (%d running)",
                ++nbr_interfaces);
        }
        //  Sleep ~750 msecs randomly so we smooth out activity
        zclock_sleep (randof (500) + 500);
    }
    zctx_destroy (&ctx);
    return 0;
}

Note that we maintain a pipe to each child thread (CZMQ creates the pipe automatically when we use the zthread_fork
method). It's via this pipe that we tell child threads to stop when it's time for them to leave. The child threads do the following (I'm
switching to pseudo-code for clarity):

create an interface
while true:
    poll on pipe to parent, and on interface
    if parent sent us a message:
        break
    if interface sent us a message:
        if message is ENTER:
            send a WHISPER to the new peer
        if message is EXIT:
            send a WHISPER to the departed peer
        if message is WHISPER:
            send back a WHISPER 1/2 of the time
        if message is SHOUT:
            send back a WHISPER 1/3 of the time
            send back a SHOUT 1/3 of the time
    once per second:
        join or leave one of 10 random groups
destroy interface

Test Results

Yes, we broke the code. Several times, in fact. This was satisfying. I'll work through the different things we found.

Getting nodes to agree on consistent group status was the most difficult. Every node needs to track the group membership of the
whole network, as I already explained in the section "Group Messaging". Group messaging is a pub-sub pattern. JOINs and
LEAVEs are analogous to subscribe and unsubscribe messages. It's essential that none of these ever get lost, or we'll find nodes
dropping randomly off groups.

So each node counts the total number of JOINs and LEAVEs it's ever done, and broadcasts this status (as a 1-byte rolling counter)
in its UDP beacon. Other nodes pick up the status, compare it to their own calculations, and if there's a difference, the code
asserts.

The first problem was that UDP beacons get delayed randomly, so they're useless for carrying the status. When a beacon
arrives late, the status is inaccurate and we get a false negative. To fix this, we moved the status information into the JOIN and
LEAVE commands. We also added it to the HELLO command. The logic then becomes:

Get initial status for a peer from its HELLO command.
When getting a JOIN or LEAVE from a peer, increment the status counter.
Check that the new status counter matches the value in the JOIN or LEAVE command.
If it doesn't, assert.
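
The receiving side of this logic can be sketched as follows. The type and function names are hypothetical, and the sketch returns false where the real code would assert, so that the caller decides what to do.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    uint8_t status;     //  Our running count of the peer's JOINs and LEAVEs
} peer_view_t;

//  HELLO carries the peer's current counter; take it as our baseline
void
peer_view_hello (peer_view_t *self, uint8_t announced)
{
    assert (self);
    self->status = announced;
}

//  Each JOIN or LEAVE must carry exactly the next counter value.
//  Returns false if the values differ, meaning we missed a message.
bool
peer_view_transition (peer_view_t *self, uint8_t announced)
{
    assert (self);
    self->status = (uint8_t) (self->status + 1);    //  Wraps 255 to 0
    return self->status == announced;
}
```

Because the counter is one byte, it wraps from 255 back to 0, and the comparison still works across the wrap.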

Next problem we got was that messages were arriving unexpectedly on new connections. The Harmony pattern connects, then
sends HELLO as the first command. This means the receiving peer should always get HELLO as the first command from a new
peer. We were seeing PING, JOIN, and other commands arriving.

This turned out to be due to CZMQ's ephemeral port logic. An ephemeral port is just a dynamically assigned port that a service
can get rather than asking for a fixed port number. A POSIX system usually assigns ephemeral ports in the range 0xC000 to
0xFFFF. CZMQ's logic is to look for a free port in this range, bind to that, and return the port number to the caller.

This sounds fine, until you get one node stopping and another node starting close together, and the new node getting the port
number of the old node. Remember that ZeroMQ tries to re-establish a broken connection. So when the first node stopped, its
peers would retry to connect. When the new node appears on that same port, suddenly all the peers connect to it and start
chatting like they're old buddies.

It's a general problem that affects any larger-scale dynamic ZeroMQ application. There are a number of plausible answers. One
is to not reuse ephemeral ports, which is easier said than done when you have multiple processes on one system. Another
solution would be to select a random port each time, which at least reduces the risk of hitting a just-freed port. This brings the risk
of a garbage connection down to perhaps 1/1000 but it's still there. Perhaps the best solution is to accept that this can happen,
understand the causes, and deal with it on the application level.

We have a stateful protocol that always starts with a HELLO command. We know that it's possible for peers to connect to us,
thinking we're an existing node that went away and came back, and send us other commands. Step one is when we discover a
new peer, to destroy any existing peer connected to the same endpoint. It's not a full answer but at least it's polite. Step two is to
ignore anything coming in from a new peer until that peer says HELLO.

This doesn't require any change to the protocol, but it must be specified in the protocol when we come to it: due to the way
ZeroMQ connections work, it's possible to receive unexpected commands from a well-behaving peer and there is no way to
return an error code or otherwise tell that peer to reset its connection. Thus, a peer must discard any command from a peer until
it receives HELLO.
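
The "discard until HELLO" rule amounts to a one-bit gate per peer. Here is a minimal sketch, assuming commands are identified by name; peer_gate_t and peer_gate_accept are hypothetical names, not part of Zyre.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

typedef struct {
    bool ready;         //  True once the peer has said HELLO
} peer_gate_t;

//  Returns true if the command should be processed, false if it must
//  be silently discarded because the peer has not yet said HELLO
bool
peer_gate_accept (peer_gate_t *self, const char *command)
{
    assert (self && command);
    if (strcmp (command, "HELLO") == 0)
        self->ready = true;
    return self->ready;
}
```

A stale peer that reconnects to a reused port never sends HELLO, so everything it sends falls on the wrong side of the gate.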

In fact, if you draw this on a piece of paper and think it through, you'll see that you never get a HELLO from such a connection.
The peer will send PINGs and JOINs and LEAVEs and then eventually time out and close, as it fails to get any heartbeats back
from us.

You'll also see that there's no risk of confusion, no way for commands from two peers to get mixed into a single stream on our
DEALER socket.

When you are satisfied that this works, we're ready to move on. This version is tagged in the repository as v0.3.0 and you can
download the tarball if you want to check what the code looked like at this stage.

Note that doing heavy simulation of lots of nodes will probably cause your process to run out of file handles, giving an assertion
failure in libzmq. I raised the per-process limit to 30,000 by running (on my Linux box):

ulimit -n 30000

Tracing Activity

To debug the kinds of problems we saw here, we need extensive logging. There's a lot happening in parallel, but every problem
can be traced down to a specific exchange between two nodes, consisting of a set of events that happen in strict sequence. We
know how to make very sophisticated logging, but as usual it's wiser to make just what we need and no more. We have to
capture:

Time and date for each event.
In which node the event occurred.
The peer node, if any.
What the event was (e.g., which command arrived).
Event data, if any.

The very simplest technique is to print the necessary information to the console, with a timestamp. That's the approach I used.
Then it's simple to find the nodes affected by a failure, filter the log file for only messages referring to them, and see exactly what
happened.
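
A trace line carrying those five fields could be formatted like this. It's a sketch under my own assumptions: the function name trace_format, the field order, and the key=value layout are illustrative, not what the Zyre code prints.

```c
#include <assert.h>
#include <stdio.h>
#include <time.h>

//  Format one trace line: UTC timestamp, node, peer, event, data.
//  Missing peer or data are shown as "-". Returns the number of
//  characters snprintf would have written.
int
trace_format (char *buffer, size_t size, time_t when,
              const char *node, const char *peer,
              const char *event, const char *data)
{
    assert (buffer && node && event);
    char stamp [32];
    struct tm *utc = gmtime (&when);
    assert (utc);
    strftime (stamp, sizeof (stamp), "%Y-%m-%d %H:%M:%S", utc);
    return snprintf (buffer, size, "%s node=%s peer=%s event=%s data=%s",
                     stamp, node, peer? peer: "-", event, data? data: "-");
}
```

Keeping each field as key=value on a single line means a plain text filter on node= or peer= is enough to isolate one exchange between two nodes.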

Dealing with Blocked Peers

In any performance-sensitive ZeroMQ architecture, you need to solve the problem of flow control. You cannot simply send
unlimited messages to a socket and hope for the best. At the one extreme, you can exhaust memory. This is a classic failure
pattern for a message broker: one slow client stops receiving messages; the broker starts to queue them, and eventually
exhausts memory and the whole process dies. At the other extreme, the socket drops messages, or blocks, as you hit the
high-water mark.

With Zyre we want to distribute messages to a set of peers, and we want to do this fairly. Using a single ROUTER socket for
output would be problematic because any one blocked peer would block outgoing traffic to all peers. TCP does have good
algorithms for spreading the network capacity across a set of connections. And we're using a separate DEALER socket to talk to
each peer, so in theory each DEALER socket will send its queued messages in the background reasonably fairly.

The normal behavior of a DEALER socket that hits its high-water mark is to block. This is usually ideal, but it's a problem for us
here. Our current interface design uses one thread that distributes messages to all peers. If one of those send calls were to block,
all output would block.

There are a few options to avoid blocking. One is to use zmq_poll() on the whole set of DEALER sockets, and only write to
sockets that are ready. I don't like this for a few reasons. First, the DEALER socket is hidden inside the peer class, and it is
cleaner to allow each class to handle this opaquely. Second, what do we do with messages we can't yet deliver to a DEALER
socket? Where do we queue them? Third, it seems to be side-stepping the issue. If a peer is really so busy it can't read its
messages, something is wrong. Most likely, it's dead.

So no polling for output. The second option is to use one thread per peer. I quite like the idea of this because it fits into the
ZeroMQ design pattern of "do one thing in one thread". But this is going to create a lot of threads (square of the number of nodes
we start) in the simulation, and we're already running out of file handles.

A third option is to use a non-blocking send. This is nicer and it's the solution I choose. We can then provide each peer with a
reasonable outgoing queue (the HWM) and if that gets full, treat it as a fatal error on that peer. This will work for smaller
messages. If we're sending large chunks, e.g., for content distribution, we'll need a credit-based flow control on top.

Therefore the first step is to prove to ourselves that we can turn the normal blocking DEALER socket into a non-blocking socket.
This example creates a normal DEALER socket, connects it to some endpoint (so that there's an outgoing pipe and the socket
will accept messages), sets the high-water mark to four, and then sets the send timeout to zero:

eagain: Checking EAGAIN on DEALER socket in C


When we run this, we send four messages successfully (they go nowhere, the socket just queues them), and then we get a nice
EAGAIN error:

Sending message 0
Sending message 1
Sending message 2
Sending message 3
Sending message 4
Resource temporarily unavailable
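
The behavior shown in that output can be modeled without ZeroMQ at all. The sketch below is not the eagain example itself, which uses CZMQ; it's a stand-in bounded queue with hypothetical names (outbox_t, outbox_send) that fails with EAGAIN once the high-water mark is reached.

```c
#include <assert.h>
#include <errno.h>

//  Stand-in for a socket's outgoing pipe (not a ZeroMQ type)
typedef struct {
    int hwm;            //  High-water mark, in messages
    int queued;         //  Messages currently waiting to be sent
} outbox_t;

//  Non-blocking send: queue the message if there is room, otherwise
//  fail at once with EAGAIN rather than waiting for space
int
outbox_send (outbox_t *self)
{
    assert (self);
    if (self->queued >= self->hwm) {
        errno = EAGAIN;
        return -1;
    }
    self->queued++;
    return 0;
}
```

With hwm set to four, the first four sends succeed and the fifth returns -1 with errno set to EAGAIN, matching the output above.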

The next step is to decide what a reasonable high-water mark would be for a peer. Zyre is meant for human interactions; that is,
applications that chat at a low frequency, such as two games or a shared drawing program. I'd expect a hundred messages per
second to be quite a lot. Our "peer is really dead" timeout is 10 seconds. So a high-water mark of 1,000 seems fair.

Rather than set a fixed HWM or use the default (which randomly also happens to be 1,000), we calculate it as 100 * the timeout.
Here's how we configure a new DEALER socket for a peer:

//  Create new outgoing socket (drop any messages in transit)
self->mailbox = zsocket_new (self->ctx, ZMQ_DEALER);

//  Set our caller "From" identity so that receiving node knows
//  who each message came from.
zsocket_set_identity (self->mailbox, reply_to);

//  Set a high-water mark that allows for reasonable activity
zsocket_set_sndhwm (self->mailbox, PEER_EXPIRED * 100);

//  Send messages immediately or return EAGAIN
zsocket_set_sndtimeo (self->mailbox, 0);

//  Connect through to peer node
zsocket_connect (self->mailbox, "tcp://%s", endpoint);

And finally, what do we do when we get an EAGAIN on a peer? We don't need to go through all the work of destroying the peer
because the interface will do this automatically if it doesn't get any message from the peer within the expiration timeout. Just
dropping the last message seems very weak; it will give the receiving peer gaps.

I'd prefer a more brutal response. Brutal is good because it forces the design to a "good" or "bad" decision rather than a fuzzy
"should work but to be honest there are a lot of edge cases so let's worry about it later". Destroy the socket, disconnect the peer,
and stop sending anything to it. The peer will eventually have to reconnect and re-initialize any state. It's kind of an assertion that
100 messages a second is enough for anyone. So, in the zre_peer_send method:

int
zre_peer_send (zre_peer_t *self, zre_msg_t **msg_p)
{
    assert (self);
    if (self->connected) {
        if (zre_msg_send (msg_p, self->mailbox) && errno == EAGAIN) {
            zre_peer_disconnect (self);
            return -1;
        }
    }
    return 0;
}

Where the disconnect method looks like this:

void
zre_peer_disconnect (zre_peer_t *self)
{
    //  If connected, destroy socket and drop all pending messages
    assert (self);
    if (self->connected) {
        zsocket_destroy (self->ctx, self->mailbox);
        free (self->endpoint);
        self->endpoint = NULL;
        self->connected = false;
    }
}

Distributed Logging and Monitoring

Let's look at logging and monitoring. If you've ever managed a real server (like a web server), you know how vital it is to have a
capture of what is going on. There is a long list of reasons, not least:

To measure the performance of the system over time.
To see what kinds of work are done the most, to optimize performance.
To track errors and how often they occur.
To do postmortems of failures.
To provide an audit trail in case of dispute.

Let's scope this in terms of the problems we think we'll have to solve:

We want to track key events (such as nodes leaving and rejoining the network).
For each event, we want to track a consistent set of data: the date/time, node that observed the event, peer that created
the event, type of event itself, and other event data.
We want to be able to switch logging on and off at any time.
We want to be able to process log data mechanically because it will be sizable.
We want to be able to monitor a running system; that is, collect logs and analyze in real time.
We want log traffic to have minimal effect on the network.
We want to be able to collect log data at a single point on the network.

As in any design, some of these requirements are hostile to each other. For example, collecting log data in real time means
sending it over the network, which will affect network traffic to some extent. However, as in any design, these requirements are
also hypothetical until we have running code so we can't take them too seriously. We'll aim for plausibly good enough and
improve over time.

A Plausible Minimal Implementation

Arguably, just dumping log data to disk is one solution, and it's what most mobile applications do (using "debug logs"). But most
failures require correlation of events from two nodes. This means searching lots of debug logs by hand to find the ones that
matter. It's not a very clever approach.

We want to send log data somewhere central, either immediately, or opportunistically (i.e., store and forward). For now, let's
focus on immediate logging. My first idea when it comes to sending data is to use Zyre for this. Just send log data to a group
called "LOG", and hope someone collects it.

But using Zyre to log Zyre itself is a Catch-22. Who logs the logger? What if we want a verbose log of every message sent? Do
we include logging messages in that or not? It quickly gets messy. We want a logging protocol that's independent of Zyre's main
ZRE protocol. The simplest approach is a pub-sub protocol, where all nodes publish log data on a PUB socket and a collector
picks that up via a SUB socket.

Figure 69 - Distributed Log Collection

The collector can, of course, run on any node. This gives us a nice range of use cases:

A passive log collector that stores log data on disk for eventual statistical analysis; this would be a PC with sufficient hard
disk space for weeks or months of log data.

A collector that stores log data into a database where it can be used in real time by other applications. This might be
overkill for a small workgroup, but would be snazzy for tracking the performance of larger groups. The collector could
collect log data over WiFi and then forward it over Ethernet to a database somewhere.

A live meter application that joined the Zyre network and then collected log data from nodes, showing events and statistics
in real time.

The next question is how to interconnect the nodes and collector. Which side binds, and which connects? Both ways will work
here, but it's marginally better if the PUB sockets connect to the SUB socket. If you recall, ZeroMQ's internal buffers only pop into
existence when there are connections. It means as soon as a node connects to the collector, it can start sending log data without
loss.

How do we tell nodes what endpoint to connect to? We may have any number of collectors on the network, and they'll be using
arbitrary network addresses and ports. We need some kind of service announcement mechanism, and here we can use Zyre to
do the work for us. We could use group messaging, but it seems neater to build service discovery into the ZRE protocol itself. It's
nothing complex: if a node provides a service X, it can tell other nodes about that when it sends them a HELLO command.

We'll extend the HELLO command with a headers field that holds a set of name=value pairs. Let's define that the header
X-ZRELOG specifies the collector endpoint (the SUB socket). A node that acts as a collector can add a header like this (for
example):

X-ZRELOG=tcp://192.168.1.122:9992

When another node sees this header, it simply connects its PUB socket to that endpoint. Log data now gets distributed to all
collectors (zero or more) on the network.
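
Picking the endpoint out of such a header is a simple string operation. This sketch assumes a header reaches us as a single "name=value" string, which is my simplification; header_value is a hypothetical helper, not part of the Zyre API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

//  If header has the form "name=value" for the given name, return a
//  pointer to the value inside header; otherwise return NULL
const char *
header_value (const char *header, const char *name)
{
    assert (header && name);
    size_t name_len = strlen (name);
    if (strncmp (header, name, name_len) == 0 && header [name_len] == '=')
        return header + name_len + 1;
    return NULL;
}
```

A node would call this with "X-ZRELOG" on each header of an incoming HELLO and, if it gets a value back, connect its PUB socket to that endpoint.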

Making this first version was fairly simple and took half a day. Here are the pieces we had to make or change:

We made a new class zre_log that accepts log data and manages the connection to the collector, if any.
We added some basic management for peer headers, taken from the HELLO command.
When a peer has the X-ZRELOG header, we connect to the endpoint it specifies.
Where we were logging to stdout, we switched to logging via the zre_log class.
We extended the interface API with a method that lets the application set headers.
We wrote a simple logger application that manages the SUB socket and sets the X-ZRELOG header.
We send our own headers when we send a HELLO command.

This version is tagged in the Zyre repository as v0.4.0 and you can download the tarball if you want to see what the code looked
like at this stage.

At this stage, the log message is just a string. We'll make more professionally structured log data in a little while.

First, a note on dynamic ports. In the zre_tester app that we use for testing, we create and destroy interfaces aggressively.
One consequence is that a new interface can easily reuse a port that was just freed by another application. If there's a ZeroMQ
socket somewhere trying to connect to this port, the results can be hilarious.

Here's the scenario I had, which caused a few minutes' confusion. The logger was running on a dynamic port:

Start logger application
Start tester application
Stop logger
Tester receives invalid message (and asserts as designed)

As the tester created a new interface, that reused the dynamic port freed by the (just stopped) logger, and suddenly the interface
began to receive log data from nodes on its mailbox. We saw a similar situation before, where a new interface could reuse the
port freed by an old interface and start getting old data.

The lesson is, if you use dynamic ports, be prepared to receive random data from ill-informed applications that are reconnecting
to you. Switching to a static port stopped the misbehaving connection. That's not a full solution though. There are two more
weaknesses:

As I write this, libzmq doesn't check socket types when connecting. The ZMTP/2.0 protocol does announce each peer's
socket type, so this check is doable.

The ZRE protocol has no fail-fast (assertion) mechanism; we need to read and parse a whole message before realizing
that it's invalid.

Let's address the second one. Socket pair validation wouldn't solve this fully anyway.

ProtocolAssertions topprevnext

As Wikipedia puts it, "Fail-fast systems are usually designed to stop normal operation rather than attempt to continue a possibly
flawed process." A protocol like HTTP has a fail-fast mechanism in that the first bytes that a client sends to an HTTP server
must form a valid request line, starting with a method name such as "GET". If they don't, the server can close the connection
without reading anything more.

Our ROUTER socket is not connection-oriented so there's no way to "close the connection" when we get bad incoming
messages. However, we can throw out the entire message if it's not valid. The problem is going to be worse when we use
ephemeral ports, but it applies broadly to all protocols.

So let's define a protocol assertion as being a unique signature that we place at the start of each message and which identifies
the intended protocol. When we read a message, we check the signature and if it's not what we expect, we discard the message
silently. A good signature should be hard to confuse with regular data and give us enough space for a number of protocols.

I'm going to use a 16-bit signature consisting of a 12-bit pattern and a 4-bit protocol ID. The pattern %xAAA is meant to stay away
from values we might otherwise expect to see at the start of a message: %x00, %xFF, and printable characters.

Figure 70 - Protocol Signature
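
As a minimal sketch of that layout, composing and checking the two signature bytes could look like this. The names signature_make and signature_protocol are hypothetical, not part of the generated ZRE codec, and I'm assuming the pattern sits in the high 12 bits and the protocol ID in the low 4 bits, sent high byte first (which matches the %xAA %xA1 signature that ZRE uses):

```c
#include <stdint.h>
#include <stddef.h>

#define SIGNATURE_PATTERN 0xAAA0u   /* 12-bit pattern %xAAA in the top bits */

/* Compose the 16-bit signature for a 4-bit protocol ID (0..15).
   Hypothetical helper, for illustration only. */
uint16_t signature_make(unsigned protocol_id)
{
    return (uint16_t) (SIGNATURE_PATTERN | (protocol_id & 0x0Fu));
}

/* Return the protocol ID carried by the first two bytes of a frame,
   or -1 if the frame is too short or the pattern does not match. */
int signature_protocol(const uint8_t *frame, size_t size)
{
    if (size < 2)
        return -1;
    uint16_t sig = (uint16_t) ((frame[0] << 8) | frame[1]);
    if ((sig & 0xFFF0u) != SIGNATURE_PATTERN)
        return -1;
    return sig & 0x0F;
}
```

A receiver would compare the returned ID with the protocol it speaks and drop the message on any mismatch, including the -1 case.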

As our protocol codec is generated, it's relatively easy to add this assertion. The logic is:

Get first frame of message.
Check if the first two bytes are %xAAA followed by the expected 4-bit protocol ID.
If so, continue to parse rest of message.
If not, skip all "more" frames, get first frame, and repeat.
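
The steps above can be sketched over an in-memory array of frames. The frame_t structure and both function names are hypothetical stand-ins; a real implementation would read frames from a ZeroMQ socket and test its "more" flag instead:

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical stand-in for a received ZeroMQ frame */
typedef struct {
    const uint8_t *data;
    size_t size;
    int more;               /* Nonzero if further frames follow in this message */
} frame_t;

/* True if the frame starts with %xAAA followed by the expected 4-bit ID */
int frame_is_signed(const frame_t *frame, unsigned protocol_id)
{
    return frame->size >= 2
        && frame->data[0] == 0xAA
        && frame->data[1] == (0xA0 | (protocol_id & 0x0F));
}

/* Starting at frames[start], which must be the first frame of a message,
   return the index of the first frame of the next message that carries a
   valid signature, or count if there is none. */
size_t next_valid_message(const frame_t *frames, size_t count,
                          size_t start, unsigned protocol_id)
{
    size_t index = start;
    while (index < count) {
        if (frame_is_signed(&frames[index], protocol_id))
            return index;
        /* Invalid: skip all "more" frames, then the final frame */
        while (index < count && frames[index].more)
            index++;
        index++;
    }
    return count;
}
```

Only the first frame of each message is inspected, so an invalid multipart message costs one comparison plus the skipped frames.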

To test this, I switched the logger back to using an ephemeral port. The interface now properly detects and discards any
messages that don't have a valid signature. If the message has a valid signature and is still wrong, that's a proper bug.

Binary Logging Protocol

Now that we have the logging framework working properly, let's look at the protocol itself. Sending strings around the network is
simple, but when it comes to WiFi we really cannot afford to waste bandwidth. We have the tools to work with efficient binary
protocols, so let's design one for logging.

This is going to be a pub-sub protocol and in ZeroMQ v3.x we do publisher-side filtering. This means we can do multi-level
logging (errors, warnings, information) if we put the logging level at the start of the message. So our message starts with a
protocol signature (two bytes), a logging level (one byte), and an event type (one byte).

In the first version, we send UUID strings to identify each node. As text, these are 32 characters each. We can send binary
UUIDs, but it's still verbose and wasteful. We don't care about the node identifiers in the log files. All we need is some way to
correlate events. So what's the shortest identifier we can use that's going to be unique enough for logging? I say "unique enough"
because while we really want zero chance of duplicate UUIDs in the live code, log files are not so critical.

The simplest plausible answer is to hash the IP address and port into a 2-byte value. We'll get some collisions, but they'll be rare.
How rare? As a quick sanity check, I write a small program that generates a bunch of addresses and hashes them into 16-bit
values, looking for collisions. To be sure, I generate 10,000 addresses across a small number of IP addresses (matching a
simulation setup), and then across a large number of addresses (matching a real-life setup). The hashing algorithm is a modified
Bernstein:

    uint16_t hash = 0;
    while (*endpoint)
        hash = 33 * hash ^ *endpoint++;

I don't get any collisions over several runs, so this will work as identifier for the log data. This adds four bytes (two for the node
recording the event, and two for its peer in events that come from a peer).
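
A minimal sketch of such a collision check follows; it is not necessarily the original program. The names endpoint_hash and count_collisions are made up for illustration, and the cast to unsigned char is an addition so that bytes above 127 hash the same way on every platform:

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

/* Modified Bernstein hash from the text, truncated to 16 bits */
uint16_t endpoint_hash(const char *endpoint)
{
    uint16_t hash = 0;
    while (*endpoint)
        hash = (uint16_t) (33 * hash ^ (unsigned char) *endpoint++);
    return hash;
}

/* Count endpoints whose hash was already produced by an earlier endpoint.
   Exact duplicates in the input therefore count as collisions too. */
size_t count_collisions(const char **endpoints, size_t count)
{
    static uint8_t seen[65536];     /* One flag per possible 16-bit hash */
    size_t collisions = 0;
    memset(seen, 0, sizeof seen);
    for (size_t i = 0; i < count; i++) {
        uint16_t hash = endpoint_hash(endpoints[i]);
        if (seen[hash])
            collisions++;
        seen[hash] = 1;
    }
    return collisions;
}
```

Feed it the endpoints you expect to see in practice. For comparison, values spread uniformly over 16 bits would be expected to start colliding once there are a few hundred of them, so it is worth repeating the check on your own address ranges rather than relying on a single result.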

Next, we want to store the date and time of the event. The POSIX time_t type was previously 32 bits, but because this
overflows in 2038, it's a 64-bit value. We'll use this; there's no need for millisecond resolution in a log file: events are sequential,
clocks are unlikely to be that tightly synchronized, and network latencies mean that precise times aren't that meaningful.
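
To make the arithmetic concrete, the fixed-size fields described so far can be laid out by hand as below. This is only a sketch: the function name is hypothetical, big-endian (network) byte order is an assumption, protocol ID 2 is taken from the specification that follows, and the generated codec may add fields of its own, such as a message ID:

```c
#include <stdint.h>
#include <stddef.h>

#define ZRE_LOG_FIXED_SIZE 16

/* Write the fixed-size part of a log record into buffer, big-endian.
   Layout follows the prose: signature, level, event, node, peer, time.
   Returns the number of bytes written. */
size_t zre_log_encode_fixed(uint8_t buffer[ZRE_LOG_FIXED_SIZE],
                            uint8_t level, uint8_t event,
                            uint16_t node, uint16_t peer,
                            uint64_t timestamp)
{
    size_t at = 0;
    buffer[at++] = 0xAA;                    /* Pattern %xAAA ... */
    buffer[at++] = 0xA0 | 2;                /* ... plus protocol ID 2 */
    buffer[at++] = level;                   /* First in line for filtering */
    buffer[at++] = event;
    buffer[at++] = (uint8_t) (node >> 8);   /* Hashed node identifier */
    buffer[at++] = (uint8_t) (node & 0xFF);
    buffer[at++] = (uint8_t) (peer >> 8);   /* Hashed peer identifier */
    buffer[at++] = (uint8_t) (peer & 0xFF);
    for (int shift = 56; shift >= 0; shift -= 8)
        buffer[at++] = (uint8_t) (timestamp >> shift);
    return at;
}
```

Because the signature and level come first, a subscriber could in principle filter on the first three bytes to receive only one logging level.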

We're up to 16 bytes, which is decent. Finally, we want to allow some additional data, formatted as text and depending on the
type of event. Putting this all together gives the following message specification:

    <class
        name = "zre_log_msg"
        script = "codec_c.gsl"
        signature = "2"
    >
    This is the ZRE logging protocol - raw version.
    <include filename = "license.xml" />

    <!-- Protocol constants -->
    <define name = "VERSION" value = "1" />

    <define name = "LEVEL_ERROR" value = "1" />
    <define name = "LEVEL_WARNING" value = "2" />
    <define name = "LEVEL_INFO" value = "3" />

    <define name = "EVENT_JOIN" value = "1" />
    <define name = "EVENT_LEAVE" value = "2" />
    <define name = "EVENT_ENTER" value = "3" />
    <define name = "EVENT_EXIT" value = "4" />

    <message name = "LOG" id = "1">
        <field name = "level" type = "number" size = "1" />
        <field name = "event" type = "number" size = "1" />
        <field name = "node" type = "number" size = "2" />
        <field name = "peer" type = "number" size = "2" />
        <field name = "time" type = "number" size = "8" />
        <field name = "data" type = "string" />
    Log an event
    </message>

    </class>

This generates 800 lines of perfect binary codec (the zre_log_msg class). The codec does protocol assertions just like the main
ZRE protocol does. Code generation has a fairly steep starting curve, but it makes it so much easier to push your designs past
"amateur" into "professional".

Content Distribution

We now have a robust framework for creating groups of nodes, letting them chat to each other, and monitoring the resulting
network. Next step is to allow them to distribute content as files.

As usual, we'll aim for the very simplest plausible solution and then improve that step-by-step. At the very least we want the
following:

An application can tell the Zyre API, "Publish this file", and provide the path to a file that exists somewhere in the file system.
Zyre will distribute that file to all peers, both those that are on the network at that time, and those that arrive later.
Each time an interface receives a file it tells its application, "Here is this file".

We might eventually want more discrimination, e.g., publishing to specific groups. We can add that later if it's needed. In Chapter
7 - Advanced Architecture using ZeroMQ we developed a file distribution system (FileMQ) designed to be plugged into ZeroMQ
applications. So let's use that.

Each node is going to be a file publisher and a file subscriber. We bind the publisher to an ephemeral port (if we use the standard
FileMQ port 5670, we can't run multiple interfaces on one box), and we broadcast the publisher's endpoint in the HELLO
message, as we did for the log collector. This lets us interconnect all nodes so that all subscribers talk to all publishers.

We need to ensure that each node has its own directory for sending and receiving files (the outbox and the inbox). Again, it's so
we can run multiple nodes on one box. Because we already have a unique ID per node, we just use that in the directory name.

Here's how we set up the FileMQ API when we create a new interface:

    sprintf (self->fmq_outbox, ".outbox/%s", self->identity);
    mkdir (self->fmq_outbox, 0775);

    sprintf (self->fmq_inbox, ".inbox/%s", self->identity);
    mkdir (self->fmq_inbox, 0775);

    self->fmq_server = fmq_server_new ();
    self->fmq_service = fmq_server_bind (self->fmq_server, "tcp://*:*");
    fmq_server_publish (self->fmq_server, self->fmq_outbox, "/");
    fmq_server_set_anonymous (self->fmq_server, true);
    char publisher [32];
    sprintf (publisher, "tcp://%s:%d", self->host, self->fmq_service);
    zhash_update (self->headers, "X-FILEMQ", strdup (publisher));

    //  Client will connect as it discovers new nodes
    self->fmq_client = fmq_client_new ();
    fmq_client_set_inbox (self->fmq_client, self->fmq_inbox);
    fmq_client_set_resync (self->fmq_client, true);
    fmq_client_subscribe (self->fmq_client, "/");

And when we process a HELLO command, we check for the X-FILEMQ header field:

    //  If peer is a FileMQ publisher, connect to it
    char *publisher = zre_msg_headers_string (msg, "X-FILEMQ", NULL);
    if (publisher)
        fmq_client_connect (self->fmq_client, publisher);

The last thing is to expose content distribution in the Zyre API. We need two things:

A way for the application to say, "Publish this file"
A way for the interface to tell the application, "We received this file".

In theory, the application can publish a file just by creating a symbolic link in the outbox directory, but as we're using a hidden
outbox, this is a little difficult. So we add an API method publish:

    //  Publish file into virtual space
    void
    zre_interface_publish (zre_interface_t *self,
                           char *filename, char *external)
    {
        zstr_sendm (self->pipe, "PUBLISH");
        zstr_sendm (self->pipe, filename);  //  Real file name
        zstr_send (self->pipe, external);   //  Location in virtual space
    }

The API passes this to the interface thread, which creates the file in the outbox directory so that the FileMQ server will pick it up
and broadcast it. We could literally copy file data into this directory, but because FileMQ supports symbolic links, we use that
instead. The file has a ".ln" extension and contains one line, which contains the actual path name.
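
Here is a sketch of how such a link file might be written, using only the standard C library. The helper name outbox_link, the treatment of the virtual name as a path relative to the outbox, and the trailing newline are all assumptions for illustration; the actual Zyre code may well do this differently:

```c
#include <stdio.h>

/* Create <outbox>/<external>.ln containing one line: the real path.
   Hypothetical helper. Returns 0 on success, -1 on failure. */
int outbox_link(const char *outbox, const char *external, const char *filename)
{
    char path[1024];
    int length = snprintf(path, sizeof path, "%s/%s.ln", outbox, external);
    if (length < 0 || (size_t) length >= sizeof path)
        return -1;              /* Path would have been truncated */

    FILE *link = fopen(path, "w");
    if (!link)
        return -1;              /* Outbox missing or not writable */
    int rc = fprintf(link, "%s\n", filename) < 0 ? -1 : 0;
    if (fclose(link) != 0)
        rc = -1;
    return rc;
}
```

The point of the design is that publishing costs a few bytes in the outbox, whatever the size of the file being shared.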

Finally, how do we notify the recipient that a file has arrived? The FileMQ fmq_client API has a message, "DELIVER", for this,
so all we have to do in zre_interface is grab this message from the fmq_client API and pass it on to our own API:

    zmsg_t *msg = fmq_client_recv (fmq_client_handle (self->fmq_client));
    zmsg_send (&msg, self->pipe);

This is complex code that does a lot at once. But we're only at around 10K lines of code for FileMQ and Zyre together. The most
complex Zyre class, zre_interface, is 800 lines of code. This is compact. Message-based applications do keep their shape if
you're careful to organize them properly.

Writing the Unprotocol

We have all the pieces for a formal protocol specification and it's time to put the protocol on paper. There are two reasons for this.
First, to make sure that any other implementations talk to each other properly. Second, because I want to get an official port for
the UDP discovery protocol and that means doing the paperwork.

Like all the other unprotocols we developed in this book, the protocol lives on the ZeroMQ RFC site. The core of the protocol
specification is the ABNF grammar for the commands and fields:

    zre-protocol    = greeting *traffic

    greeting        = S:HELLO
    traffic         = S:WHISPER
                    / S:SHOUT
                    / S:JOIN
                    / S:LEAVE
                    / S:PING R:PING-OK

    ;   Greet a peer so it can connect back to us
    S:HELLO         = header %x01 ipaddress mailbox groups status headers
    header          = signature sequence
    signature       = %xAA %xA1
    sequence        = 2OCTET        ; Incremental sequence number
    ipaddress       = string        ; Sender IP address
    string          = size *VCHAR
    size            = OCTET
    mailbox         = 2OCTET        ; Sender mailbox port number
    groups          = strings       ; List of groups sender is in
    strings         = size *string
    status          = OCTET         ; Sender group status sequence
    headers         = dictionary    ; Sender header properties
    dictionary      = size *key-value
    key-value       = string        ; Formatted as name=value

    ;   Send a message to a peer
    S:WHISPER       = header %x02 content
    content         = FRAME         ; Message content as ZeroMQ frame

    ;   Send a message to a group
    S:SHOUT         = header %x03 group content
    group           = string        ; Name of group
    content         = FRAME         ; Message content as ZeroMQ frame

    ;   Join a group
    S:JOIN          = header %x04 group status
    status          = OCTET         ; Sender group status sequence

    ;   Leave a group
    S:LEAVE         = header %x05 group status

    ;   Ping a peer that has gone silent
    S:PING          = header %x06

    ;   Reply to a peer's ping
    R:PING-OK       = header %x07
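
To show how the grammar maps onto bytes, here is a hedged sketch that reads the header and command ID of a frame and then one length-prefixed string. The function names are hypothetical, and I'm assuming that the 2-octet sequence number is in network byte order and that the command octet follows the header exactly as the grammar is written; the generated codec remains the authority on the real wire format:

```c
#include <stdint.h>
#include <stddef.h>

/* Parse "header command-id" from the start of a ZRE frame.
   Returns 0 and fills sequence/command on success, -1 if the frame is
   too short or does not start with the signature %xAA %xA1. */
int zre_parse_header(const uint8_t *frame, size_t size,
                     uint16_t *sequence, uint8_t *command)
{
    if (size < 5 || frame[0] != 0xAA || frame[1] != 0xA1)
        return -1;
    *sequence = (uint16_t) ((frame[2] << 8) | frame[3]);
    *command = frame[4];
    return 0;
}

/* Read one "string = size *VCHAR" field at *offset into out, adding a
   terminating NUL, and advance *offset past the field.
   Returns 0 on success, -1 if the field overruns the frame or out. */
int zre_get_string(const uint8_t *frame, size_t size, size_t *offset,
                   char *out, size_t out_size)
{
    if (*offset >= size)
        return -1;
    size_t length = frame[*offset];
    if (*offset + 1 + length > size || length + 1 > out_size)
        return -1;
    for (size_t i = 0; i < length; i++)
        out[i] = (char) frame[*offset + 1 + i];
    out[length] = '\0';
    *offset += 1 + length;
    return 0;
}
```

For a HELLO command, a parser would call zre_parse_header, check that the command is 1, and then call zre_get_string with an offset of 5 to read the sender's IP address.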

Example Zyre Application

Let's now make a minimal example that uses Zyre to broadcast files around a distributed network. This example consists of two
programs:

A listener that joins the Zyre network and reports whenever it receives a file.
A sender that joins a Zyre network and broadcasts exactly one file.

The listener is quite short:

    #include <zre.h>

    int main (int argc, char *argv [])
    {
        zre_interface_t *interface = zre_interface_new ();
        while (true) {
            zmsg_t *incoming = zre_interface_recv (interface);
            if (!incoming)
                break;
            zmsg_dump (incoming);
            zmsg_destroy (&incoming);
        }
        zre_interface_destroy (&interface);
        return 0;
    }

And the sender isn't much longer:

    #include <zre.h>

    int main (int argc, char *argv [])
    {
        //  Both the real file name and the virtual name are required
        if (argc < 3) {
            puts ("Syntax: sender filename virtualname");
            return 0;
        }
        printf ("Publishing %s as %s\n", argv [1], argv [2]);
        zre_interface_t *interface = zre_interface_new ();
        zre_interface_publish (interface, argv [1], argv [2]);
        while (true) {
            zmsg_t *incoming = zre_interface_recv (interface);
            if (!incoming)
                break;
            zmsg_dump (incoming);
            zmsg_destroy (&incoming);
        }
        zre_interface_destroy (&interface);
        return 0;
    }

Conclusions

Building applications for unstable decentralized networks is one of the end games for ZeroMQ. As the cost of computing falls
every year, such networks become more and more common, be it consumer electronics or virtual boxes in the cloud. In this
chapter, we've pulled together many of the techniques from the book to build Zyre, a framework for proximity computing over a
local network. Zyre isn't unique; there are and have been many attempts to open this area for applications: ZeroConf, SLP,
SSDP, UPnP, DDS. But these all seem to end up too complex or otherwise too difficult for application developers to build on.

Zyre isn't finished. Like many of the projects in this book, it's an ice breaker for others. There are some major unfinished areas,
which we may address in later editions of this book or versions of the software.

High-level APIs: the message-based API that Zyre offers now is usable but still rather more complex than I'd like for
average developers. If there's one target we absolutely cannot miss, it's raw simplicity. This means we should build high-level
APIs, in lots of languages, which hide all the messaging, and which come down to simple methods like start,
join/leave group, get message, publish file, stop.

Security: how do we build a fully decentralized security system? We might be able to leverage public key infrastructure for
some work, but that requires that nodes have their own Internet access, which isn't guaranteed. The answer is, as far as
we can tell, to use any existing secure peer-to-peer link (TLS, BlueTooth, perhaps NFC) to exchange a session key and
use a symmetric cipher. Symmetric ciphers have their advantages and disadvantages.

Nomadic content: how do I, as a user, manage my content across multiple devices? The Zyre + FileMQ combination might
help, for local network use, but I'd like to be able to do this across the Internet as well. Are there cloud services I could
use? Is there something I could make using ZeroMQ?

Federation: how do we scale a local-area distributed application across the globe? One plausible answer is federation,
which means creating clusters of clusters. If 100 nodes can join together to create a local cluster, then perhaps 100
clusters can join together to create a wide-area cluster. The challenges are then quite similar: discovery, presence, and
group messaging.

Postface

Tales from Out There

I asked some of the contributors to this book to tell us what they were doing with ZeroMQ. Here are their stories.

Rob Gagnon's Story

"We use ZeroMQ to assist in aggregating thousands of events occurring every minute across our global network of
telecommunications servers so that we can accurately report and monitor for situations that require our attention. ZeroMQ made
the development of the system not only easier, but faster to develop and more robust and fault-tolerant than we had originally
planned in our original design.

"We're able to easily add and remove clients from the network without the loss of any message. If we need to enhance the server
portion of our system, we can stop and restart it as well without having to worry about stopping all of the clients first. The built-in
buffering of ZeroMQ makes this all possible."

Tom van Leeuwen's Story

"I was looking at creating some kind of service bus connecting all kinds of services together. There were already some products
that implemented a broker, but they did not have the functionality I needed. By accident, I stumbled upon ZeroMQ, which is
awesome. It's very lightweight, lean, simple and easy to follow because the guide is very complete and reads very well. I've
actually implemented the Titanic pattern and the Majordomo broker with some additions (client/worker authentication and workers
sending a catalog explaining what they provide and how they should be addressed).

"The beautiful thing about ZeroMQ is the fact that it is a library and not an application. You can mold it however you like and it
simply puts boring things like queuing, reconnecting, TCP sockets and such to the background, making sure you can concentrate
on what is important to you. I've implemented all kinds of workers/clients and the broker in Ruby, because that is the main
language we use for development, but also some PHP clients to connect to the bus from existing PHP webapps. We use this
service bus for cloud services, connecting all kinds of platform devices to a service bus exposing functionality for automation.

"ZeroMQ is very easy to understand and if you spend a day with the guide, you'll have good knowledge of how it works. I'm a
network engineer, not a software developer, but managed to create a very nice solution for our automation needs! ZeroMQ:
Thank you very much!"

Michael Jakl's Story

"We use ZeroMQ for distributing millions of documents per day in our distributed processing pipeline. We started out with big
message queuing brokers that had their own respective issues and problems. In the quest of simplifying our architecture, we
chose ZeroMQ to do the wiring. So far it had a huge impact in how our architecture scales and how easy it is to change and move
the components. The plethora of language bindings lets us choose the right tool for the job without sacrificing interoperability in
our system. We don't use a lot of sockets (less than 10 in our whole application), but that's all we needed to split a huge
monolithic application into small independent parts.

"All in all, ZeroMQ lets me keep my sanity and helps my customers stay within budget."

Vadim Shalts's Story

"I am team leader in the company ActForex, which develops software for financial markets. Due to the nature of our domain, we
need to process large volumes of prices quickly. In addition, it's extremely critical to minimize latency in processing orders and
prices. Achieving a high throughput is not enough. Everything must be handled in a soft real time with a predictable ultra-low
latency per price. The system consists of multiple components exchanging messages. Each price can take a lot of processing
stages, each of which increases total latency. As a consequence, low and predictable latency of messaging between components
becomes a key factor of our architecture.

"We investigated different solutions to find something suitable for our needs. We tried different message brokers (RabbitMQ,
ActiveMQ Apollo, Kafka), but failed to reach a low and predictable latency with any of them. In the end, we chose ZeroMQ used
in conjunction with ZooKeeper for service discovery. Complex coordination with ZeroMQ requires a relatively large effort and a
good understanding, as a result of the natural complexity of multi-threading. We found that an external agent like ZooKeeper is
better choice for service discovery and coordination while ZeroMQ can be used primarily for simple messaging. ZeroMQ fit
perfectly into our architecture. It allowed us to achieve the desired latency using minimal efforts. It saved us from a bottleneck in
the processing of messages and made processing time very stable and predictable.

"I can decidedly recommend ZeroMQ for solutions where low latency is important."

How This Book Happened

When I set out to write a ZeroMQ book, we were still debating the pros and cons of forks and pull requests in the ZeroMQ
community. Today, for what it's worth, this argument seems settled: the "liberal" policy that we adopted for libzmq in early 2012
broke our dependency on a single prime author, and opened the floor to dozens of new contributors. More profoundly, it allowed
us to move to a gently organic evolutionary model that was very different from the older forced-march model.

The reason I was confident this would work was that our work on the Guide had, for a year or more, shown the way. True, the
text is my own work, which is perhaps as it should be. Writing is not programming. When we write, we tell a story and one doesn't
want different voices telling one tale; it feels strange.

For me the real long-term value of the book is the repository of examples: about 65,000 lines of code in 24 different languages.
It's partly about making ZeroMQ accessible to more people. People already refer to the Python and PHP example repositories
(two of the most complete) when they want to tell others how to learn ZeroMQ. But it's also about learning programming
languages.

Here's a loop of code in Tcl:

    while {1} {
        # Process all parts of the message
        zmq message message
        frontend recv_msg message
        set more [frontend getsockopt RCVMORE]
        backend send_msg message [expr {$more?"SNDMORE":""}]
        message close
        if {!$more} {
            break ; # Last message part
        }
    }

And here's the same loop in Lua:

    while true do
        --  Process all parts of the message
        local msg = frontend:recv()
        if (frontend:getopt(zmq.RCVMORE) == 1) then
            backend:send(msg, zmq.SNDMORE)
        else
            backend:send(msg, 0)
            break       --  Last message part
        end
    end

And this particular example (rrbroker) exists in C#, C++, CL, Clojure, Erlang, F#, Go, Haskell, Haxe, Java, Lua, Node.js, Perl,
PHP, Python, Ruby, Scala, Tcl, and of course C. This code base, all provided as open source under the MIT/X11 license, may
form the basis for other books or projects.

But what this collection of translations says most profoundly is this: the language you choose is a detail, even a distraction. The
power of ZeroMQ lies in the patterns it gives you and lets you build, and these transcend the comings and goings of languages.
My goal as a software and social architect is to build structures that can last generations. There seems no point in aiming for
mere decades.

Removing Friction

I'll explain the technical tool chain we used in terms of the friction we removed. In this book we're telling a story and the goal is to
reach as many people as possible, as cheaply and smoothly as we can.

The core idea was to host the text and examples on GitHub and make it easy for anyone to contribute. It turned out to be more
complex than that, however.

Let's start with the division of labor. I'm a good writer and can produce endless amounts of decent text quickly. But what was
impossible for me was to provide the examples in other languages. Because the core ZeroMQ API is in C, it seemed logical to
write the original examples in C. Also, C is a neutral choice; it's perhaps the only language that doesn't create strong emotions.

How to encourage people to make translations of the examples? We tried a few approaches and finally what worked best was to
offer a "choose your language" link on every single example in the text, which took people either to the translation or to a page
explaining how they could contribute. The way it usually works is that as people learn ZeroMQ in their preferred language, they
contribute a handful of translations or fixes to the existing ones.

At the same time, I noticed a few people quite determinedly translating every single example. This was mainly binding authors
who realized that the examples were a great way to encourage people to use their bindings. For their efforts, I extended the
scripts to produce language-specific versions of the book. Instead of including the C code, we'd include the Python, or PHP code.
Lua and Haxe also got their dedicated versions.

Once we have an idea of who works on what, we know how to structure the work itself. It's clear that to write and test an
example, what you want to work on is source code. So we import this source code when we build the book, and that's how we
make language-specific versions.

I like to write in a plain text format. It's fast and works well with source control systems like git. Because the main platform for our
websites is Wikidot, I write using Wikidot's very readable markup format.

At least in the first chapters, it was important to draw pictures to explain the flow of messages between peers. Making diagrams
by hand is a lot of work, and when we want to get final output in different formats, image conversion becomes a chore. I started
with Ditaa, which turns text diagrams into PNGs, then later switched to asciitosvg, which produces SVG files, which are rather
better. Since the figures are text diagrams, embedded in the prose, it's remarkably easy to work with them.

By now you'll realize that the tool chain we use is highly customized, though it uses a lot of external tools. All are available on
Ubuntu, which is a mercy, and the whole custom tool chain is in the zguide repository in the bin subdirectory.

Let's walk through the editing and publishing process. Here is how we produce the online version:

    bin/buildguide

Which works as follows:

The original text sits in a series of text files (one per chapter).
The examples sit in the examples subdirectory, classified per language.
We take the text and process this using a custom Perl script, mkwikidot, into a set of Wikidot-ready files.
We do this for each of the languages that get their own version.
We extract the graphics and call asciitosvg and rasterize on each one to produce image files, which we store in the images subdirectory.
We extract inline listings (which are not translated) and store these in the listings subdirectory.
We use pygmentize on each example and listing to create a marked-up page in Wikidot format.
We upload all changed files to the online wiki using the Wikidot API.

Doing this from scratch takes a while. So we store the SHA1 signatures of every image, listing, example, and text file, and only
process and upload changes, and that makes it easy to publish a new version of the text when people make new contributions.

To produce the PDF and Epub formats, we do the following:

    bin/buildpdfs

Which works as follows:

We use the custom mkdocbook Perl program on the input files to produce a DocBook output.
We push the DocBook format through docbook2ps and ps2pdf to create clean PDFs in each language.
We push the DocBook format through db2epub to create Epub books in each language.
We upload the PDFs to the public wiki using the Wikidot API.

When creating a community project, it's important to lower the "change latency", which is the time it takes for people to see their
work live or, at least, to see that you've accepted their pull request. If that is more than a day or two, you've often lost your
contributor's interest.

Licensing

I want people to reuse this text in their own work: in presentations, articles, and even other books. However, the deal is that if
they remix my work, others can remix theirs. I'd like credit, and have no argument against others making money from their
remixes. Thus, the text is licensed under cc-by-sa.

For the examples, we started with GPL, but it rapidly became clear this wasn't workable. The point of examples is to give people
reusable code fragments so they will use ZeroMQ more widely, and if these are GPL, that won't happen. We switched to
MIT/X11, even for the larger and more complex examples that conceivably would work as LGPL.

However, when we started turning the examples into standalone projects (as with Majordomo), we used the LGPL. Again,
remixability trumps dissemination. Licenses are tools; use them with intent, not ideology.

Web site design and content is copyright (c) 2014 iMatix Corporation. Contact us for professional support. Site content licensed under cc-by-sa 3.0.
ØMQ is copyright (c) 2007-2014 iMatix Corporation and Contributors. ØMQ is free software licensed under the LGPL. ØMQ and ZEROMQ are trademarks of iMatix Corporation.