"Internet of Things Technology and Applications" course teaching materials (reference material): Publish/Subscribe Communication Systems - from Models to Applications

Università degli Studi di Roma “La Sapienza”
Dottorato di Ricerca in Ingegneria Informatica
XVI Ciclo – 2003 – V

Publish/Subscribe Communication Systems: from Models to Applications

Antonino Virgillito

Contents

1 Introduction
  1.1 The Publish/Subscribe Paradigm
    1.1.1 Research Challenges for Publish/Subscribe
  1.2 Contributions of the Thesis
  1.3 Structure of the Thesis
2 Understanding Publish/Subscribe Systems
  2.1 Basic Publish/Subscribe Specification
    2.1.1 Elements of a Publish/Subscribe System
    2.1.2 Positioning the Publish/Subscribe Paradigm
  2.2 Subscription Models
    2.2.1 Topic-based Model
    2.2.2 Content-based Model
    2.2.3 Type-based Model
  2.3 Architectural Models
    2.3.1 Network Multicasting
    2.3.2 Application-level Networks
    2.3.3 Peer-to-peer Overlay Network Infrastructures
  2.4 Behind the Scenes of a Distributed Notification Service
    2.4.1 Overview
    2.4.2 Event Matching
    2.4.3 Subscription Assignment and Routing
    2.4.4 Event and Notification Routing
    2.4.5 Classification Framework
  2.5 Surveying Publish/Subscribe Systems
    2.5.1 TIB/RV
    2.5.2 Scribe
    2.5.3 Gryphon
    2.5.4 SIENA
    2.5.5 Hermes
  2.6 Concluding Remarks

3 Modelling Publish/Subscribe Systems
  3.1 A Framework for Publish/Subscribe
    3.1.1 Process-NS Interaction
    3.1.2 Computational Model
    3.1.3 NS Implementation Parameters
    3.1.4 Liveness Property
    3.1.5 Persistent Notifications
    3.1.6 On the liveness specification in dynamic systems
  3.2 Analytical Model
    3.2.1 Measuring Notification Loss
    3.2.2 Analytical results
    3.2.3 Discussion
  3.3 Simulation Study
    3.3.1 Simulation Details
    3.3.2 Simulation Results
  3.4 Related Work
  3.5 Concluding Remarks
4 Self-Organizing Content-Based Publish/Subscribe
  4.1 Background
    4.1.1 Publish/Subscribe Model
    4.1.2 Content-based Routing Protocol
    4.1.3 Scalability of Content-Based Routing
  4.2 SOCBR: A Self-Organizing CBR Algorithm
    4.2.1 The Cost Metric: TCP hops
    4.2.2 Measuring Subscription Similarity: Associativity
    4.2.3 Algorithm Overview
  4.3 Algorithm Specification
    4.3.1 Basic Notions
    4.3.2 Triggering
    4.3.3 Tear-Up Link Discovery
    4.3.4 Tear-Down Link Selection
    4.3.5 Reconfiguration
  4.4 Addressing Network Proximity
    4.4.1 Network Awareness in Pub/Sub Systems
    4.4.2 pbSOCBR: Network-Aware Self-Organization
  4.5 Simulation Study
    4.5.1 Implementation Details
    4.5.2 Simulation Scenarios
    4.5.3 Experimental Results
  4.6 Related Work
  4.7 Concluding Remarks

5 Publish/Subscribe at Work: The DaQuinCIS Project
  5.1 Background: the DaQuinCIS project
    5.1.1 Data Quality in Cooperative Information Systems
  5.2 Managing Data Quality in CIS’s: The DaQuinCIS Architecture
    5.2.1 The D2Q model
    5.2.2 Architecture Description
  5.3 The Quality Notification Service
    5.3.1 Specification
  5.4 QNS Design
    5.4.1 Overview and Motivations
    5.4.2 Merge Subscriptions
    5.4.3 Diffusion Trees
    5.4.4 Quality Notification Service Internal Architecture
  5.5 Implementation and Simulation
  5.6 Related Work
  5.7 Concluding Remarks
6 Conclusions
  6.1 Contributions and Future Work
  6.2 Future Directions

Chapter 1

Introduction

The world-wide connectivity achieved with the explosion of the Internet is now a well-established reality. The possibilities keep growing, thanks to the widespread diffusion of high-bandwidth links and of powerful mobile devices such as wireless laptops, palm computers and new-generation mobile phones. As a result, millions of users are now potentially able to communicate, with massive loads of information exchanged from one side to another of networks spanning the whole world.

One of the biggest challenges in next-generation distributed computing is the large-scale diffusion of information. Examples of applications are stock and news tickers, traffic information, instant messaging and electronic auctions. The general model behind these applications is based on gathering information from a set of data sources and delivering it to the users according to their interests. Such applications are expected to handle a huge number of concurrent users, with frequent information publication and dynamic changes in users’ interests.

The design of this type of distributed application in such a highly demanding context still hides many issues, and the large spectrum of possibilities offered by technological advances cannot by itself be the answer. Powerful tools are still required that can effectively exploit the available computational resources without using more of them than strictly necessary. At the same time, such tools have to give both application developers and users enough flexibility to allow quick usage and easy deployment in a broad range of situations.

The classical abstractions on which distributed applications have been built until now can no longer keep pace. For example, the common RPC paradigm, which is the basis for the most popular middleware tools, has proved to be inadequate for large-scale interactions requiring frequent diffusion of information among many participants. The reason is that RPC promotes a
tight coupling among participants, in the sense that the recipients of a piece of information have to be explicitly targeted by its sender. In a scenario where the set of recipients of a piece of information can comprise a large number of entities and can change frequently over time, it is easy to see that tightly coupled communication paradigms such as RPC hit an intrinsic scalability limit.

More appropriate solutions for many-to-many wide-area diffusion are network-level technologies such as IP multicast. However, these are low-level facilities that still need high-level interfaces to be easily integrated into applications. Moreover, an actual world-wide deployment still seems a long way off [34], and dynamically changing multicast groups cannot be managed easily.

For this reason, much attention has been paid in recent years to research on distributed solutions for information diffusion targeted specifically at wide-area environments. Among the most active areas of research in this sense we cite peer-to-peer overlay network infrastructures [105, 116, 95, 89], epidemic multicast algorithms [43] and, finally, event-based systems following the publish/subscribe paradigm, which are the subject of this thesis.

Though publish/subscribe (pub/sub) is not a recent achievement [10, 103], its use in large-scale, wide-area communication has become a hot research topic only in the last few years, turning pub/sub from a simple application of multicast into a communication paradigm in its own right. This happened because the anonymous, loosely coupled communication scheme typical of the pub/sub paradigm fits well the highly dynamic nature of large-scale environments. In the following we briefly introduce the main features of the pub/sub paradigm, and then present some open research issues related to it.

1.1 The Publish/Subscribe Paradigm

Each participant in a pub/sub-based communication system can take on the role of a publisher or a subscriber of information. Publishers produce information, referred to in the literature as notifications, which is consumed by subscribers. The main semantic characterization of pub/sub lies in the way notifications flow from senders to receivers: receivers are not directly targeted by publishers, but rather are indirectly addressed according to the content of notifications. That is, a subscriber expresses its interest by issuing subscriptions for specific notifications, independently of the publishers that produce them, and is then asynchronously notified of all notifications, submitted by any publisher, that match its subscriptions. “Asynchronous” means that a subscriber does not have to be blocked waiting
for notifications to arrive, as in client/server RPC, but can keep performing concurrent operations.

To spare each publisher from having to know all the subscriptions of every possible subscriber, this propagation mechanism is realized by introducing a logical intermediary between publishers and subscribers, usually referred to in the literature as the Notification Service. Both publishers and subscribers communicate only with a single entity, the Notification Service, which (i) stores all the subscriptions associated with the respective subscribers, (ii) receives all the notifications from publishers, and (iii) dispatches each published notification to the correct subscribers. The result is that publishers and subscribers exchange information without directly knowing each other. This anonymity is one of the main features of the pub/sub paradigm and simply stems from the level of indirection provided by the Notification Service.

Pub/sub is thus an anonymous, many-to-many, asynchronous communication paradigm, where multiple producers may propagate information to multiple consumers. Anonymity is an effective way to obtain scalability at the abstraction level: participants do not have to know each other, and when the size of the system grows they still only have to contact the Notification Service.

1.1.1 Research Challenges for Publish/Subscribe

It is clear that, given such a simple yet powerful abstraction, the scalability problems move to the realization of the Notification Service. In other words, the Notification Service should be able to serve a large number of users and to span large-scale communication networks, while always maintaining an acceptable level of performance. The realization of a scalable pub/sub system hides several interesting research challenges that have made pub/sub a meeting point of different research communities, such as databases, software engineering and distributed systems.
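To make the Notification Service abstraction of Section 1.1 concrete, and to show where the scalability burden concentrates, the following is a minimal centralized sketch in Python. It is an illustration only, not an implementation taken from this thesis: the class name `NotificationService`, the methods `subscribe`, `publish` and `poll`, and the choice of modelling a subscription as a predicate over dictionary-valued notifications are all assumptions made for the example.

```python
from collections import defaultdict
from typing import Any, Callable, Dict, List

# Hypothetical types for this sketch: a notification is a plain dictionary,
# and a subscription is a predicate that says whether a notification matches.
Notification = Dict[str, Any]
Subscription = Callable[[Notification], bool]


class NotificationService:
    """Centralized, in-memory intermediary between publishers and subscribers."""

    def __init__(self) -> None:
        # (i) subscriptions are stored per subscriber identifier
        self._subscriptions: Dict[str, List[Subscription]] = defaultdict(list)
        # One inbox per subscriber, so subscribers never block waiting
        self._inboxes: Dict[str, List[Notification]] = defaultdict(list)

    def subscribe(self, subscriber_id: str, subscription: Subscription) -> None:
        self._subscriptions[subscriber_id].append(subscription)

    def publish(self, notification: Notification) -> None:
        # (ii) receive the notification; the publisher names no recipient
        # (iii) dispatch it to every subscriber with a matching subscription
        for subscriber_id, subscriptions in self._subscriptions.items():
            if any(matches(notification) for matches in subscriptions):
                self._inboxes[subscriber_id].append(notification)

    def poll(self, subscriber_id: str) -> List[Notification]:
        # Asynchronous consumption: return what has arrived so far and
        # empty the inbox; the caller is free to do other work in between.
        pending = self._inboxes[subscriber_id]
        self._inboxes[subscriber_id] = []
        return pending


ns = NotificationService()
ns.subscribe("alice", lambda n: n.get("topic") == "stocks")
ns.subscribe("bob", lambda n: n.get("topic") == "traffic")
ns.publish({"topic": "stocks", "symbol": "ACME", "price": 12.5})

alice_inbox = ns.poll("alice")  # the stock notification
bob_inbox = ns.poll("bob")      # empty: no traffic notification was published
```

In this sketch the publisher never refers to "alice" or "bob", which is the anonymity discussed above. Every call to `publish` scans all stored subscriptions in a single process, so the cost of dispatching grows with the number of subscribers and subscriptions; this is one simple way to see why the scalability problem moves into the realization of the Notification Service.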
The area of interest of this work is distributed systems, so our attention is on the realization of efficient and scalable distributed algorithms and architectures for wide-area pub/sub interactions. In particular, our interest is focused on distributed implementations of the Notification Service, made up of a set of independent processes that interact among themselves with the common aim of dispatching notifications to all interested subscribers. The lively research in this field during the last few years has stimulated many ideas that are yet to be fully explored. Moreover, the many points of view from which this problem can be attacked have led to a general confusion, with no common unifying framework for understanding and comparing the different contributions.

In our opinion, the first real research challenge in pub/sub is building such
a common vision, proposing general models and frameworks that precisely capture and describe the peculiarities of the paradigm. Though previous contributions in this direction [40, 76] already represented a great step in positioning and understanding the pub/sub paradigm, this road can be followed further.

On the algorithmic side, the main driver of pub/sub research has been the attempt to build systems that offer high flexibility to their users, for example by allowing them to characterize their interest precisely with powerful and expressive subscription languages. Such systems are referred to as content-based pub/sub systems. Content-based pub/sub requires the Notification Service to rely on complex mechanisms, and the big challenge is to build them in a scalable way [18]. Several systems and algorithms addressing these issues have been proposed, for example for efficiently matching notifications against a large number of subscriptions [1, 16, 44] or for efficiently delivering them to a large number of users [7, 20]. However, the application of such systems is still restricted to the research community and practical experience is still lacking. In other words, though research results in content-based pub/sub are now consolidated, actual deployments still rely on simpler but more efficient solutions.

Two requirements have to be satisfied: first, pushing the scalability limits of pub/sub one step forward by devising new solutions; second, addressing the aspects related to ease of deployment by providing systems with dynamic self-organization capabilities. Recent research contributions follow these guidelines, in particular by borrowing and exploiting results achieved in the research on peer-to-peer overlay network infrastructures [117, 22, 108, 84]. This mixture can produce a host of new ideas that could give this research area a completely new face. Results obtained in this direction could also provide a strong basis for coping with the new challenges posed by the application of pub/sub systems in highly dynamic scenarios, such as those comprising mobile devices.

1.2 Contributions of the Thesis

This thesis presents the results of a broad-range study of research problems related to the application of the publish/subscribe paradigm in wide-area networks. The first contribution is a general survey of the state of the art of research in the pub/sub area. The presentation ranges from a general description of the paradigm to an in-depth investigation of the internal architecture of a distributed pub/sub system, presenting the issues hidden behind the realization of a scalable pub/sub system, together with the solutions that have been proposed in the literature.

Unlike previous surveys [40, 21, 76], our proposal does not intend to characterize the pub/sub paradigm with respect to other distributed abstractions, but rather to give a wide-range "internal" classification of the pub/sub research area, also clarifying its relationships with other research problems. The other contributions of this thesis can be divided into three general areas: models, regarding the formalization of several aspects of pub/sub; algorithms, presenting novel solutions and applications for pub/sub; and applications, describing a context-specific pub/sub design.

Models: Semantics and Performance of a Pub/Sub System. Currently, only one specific research contribution [76] has been devoted to the formal specification of a pub/sub system. Nevertheless, the decoupled nature of this paradigm hides many subtleties that call for further, deeper reasoning about the precise semantics of a pub/sub interaction. We propose a computational model that characterizes the semantics of a pub/sub system in terms of the classical safety and liveness properties of a distributed system. These properties, however, are expressed in terms of two time delays, which are required to model the decoupled nature of the paradigm.

We also show how the non-determinism introduced by the decoupling may prevent information from being delivered on time to all the intended subscribers. Thus, following the computational model, an analytical model is introduced that characterizes the performance of a pub/sub system in terms of the fraction of notifications that successfully reach their destinations. A detailed analytical study has been carried out that captures the behavior of a wide class of pub/sub systems. A simulation study, realized by implementing a complete pub/sub system prototype, validates the analytical results.

Algorithms: Self-organization in Content-based Pub/Sub Systems. As pointed out above, content-based pub/sub systems have been the main inspiration for research on scalable algorithms for notification diffusion. Common content-based pub/sub systems are built over distributed application-level networks of notification brokers, acting as servers for both subscribers and publishers. The links among brokers in these networks are in practice static TCP connections, set up at system creation time (generally assuming human intervention). The idea behind our contribution is to push the scalability limit of content-based pub/sub by introducing the possibility of rearranging the application-level network topology. In particular, brokers are able to self-organize their connections according to the distribution of interest among their subscribers. This creates paths composed only of brokers serving subscribers that are interested in the same information, without involving
brokers that exclusively carry out a forwarding function.

A basic algorithm for self-organizing content-based pub/sub is presented first. This algorithm achieves performance results very close to the ideal value: after self-organization, the number of brokers involved in the diffusion of a notification is almost equal to the number of brokers interested in the notification itself, i.e. the minimum possible. Furthermore, a variant of the basic algorithm is presented that also accounts for the impact of self-organization on the performance metrics of the underlying network.

Applications: Pub/Sub for Data Quality Notification. The real-world usage of pub/sub systems has been described in several application contexts. We present a novel application of a pub/sub-based service, realized as part of a research project that involves the research areas of information systems, databases and distributed systems. The problem addressed by this project is the management of data quality [90] in systems formed by different, independent information systems that cooperate to achieve common goals (namely, Cooperative Information Systems). In particular, we present the design of a pub/sub service aimed at notifying changes in the quality of data. The design of the service includes novel solutions to tackle the scalability and integration issues arising from the cooperative scenario. The design and the implementation of the service (based on web-services technology) are presented.

1.3 Structure of the Thesis

The following is an outline of the content of the thesis.

In Chapter 2 we first give a general, high-level specification of a pub/sub communication system; we then survey the state of the art in this research field, first giving a wide-range classification of all the possible general solutions to the aforementioned problems and then presenting how such problems are solved in actual systems.

In Chapter 3, a more specific, formal description of the semantics of a pub/sub system is given. A probabilistic analytical model is also presented and validated through a simulation study.

In Chapter 4 we address the scalability issues underlying a pub/sub implementation by proposing self-organization algorithms for distributed content-based pub/sub systems.

Chapter 5 presents the design and the implementation of a distributed pub/sub system (namely, the Quality Notification Service), specifically designed for use in cooperative information systems to manage data quality issues.
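As a rough illustration of the self-organization goal summarized in Section 1.2, the sketch below counts the brokers that a notification traverses on a tree-shaped broker overlay and separates out those that only forward it. It is not the algorithm proposed in the thesis, only a toy version of the metric that algorithm is measured against. The function names (`brokers_involved`, `pure_forwarders`), the broker labels, the two example overlays, and the assumption that the overlay is a connected tree are all hypothetical choices made for this example.

```python
from collections import deque


def brokers_involved(tree, source, interested):
    """Return the brokers a notification traverses when routed from `source`
    to every broker in `interested` over a connected tree overlay (assumed).
    `tree` maps each broker to the list of its neighbors."""
    # Breadth-first search from the source records, for every broker,
    # the next hop on its unique tree path back to the source.
    parent = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in tree[node]:
            if neighbor not in parent:
                parent[neighbor] = node
                queue.append(neighbor)

    # Walk back from each interested broker until an already-counted broker
    # is reached; the union of these paths is the diffusion subtree.
    involved = {source}
    for broker in interested:
        while broker is not None and broker not in involved:
            involved.add(broker)
            broker = parent[broker]
    return involved


def pure_forwarders(tree, source, interested):
    """Brokers that relay the notification without being interested in it
    (the publisher's own broker is not counted as a forwarder)."""
    return brokers_involved(tree, source, interested) - set(interested) - {source}


# Hypothetical overlay before reorganization: a chain A - B - C - D - E.
before = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C", "E"], "E": ["D"]}

# Hypothetical overlay after reorganization: interested brokers C and E
# are adjacent to each other and C is adjacent to the source A.
after = {"A": ["C"], "C": ["A", "E", "B"], "E": ["C", "D"], "B": ["C"], "D": ["E"]}

interested = {"C", "E"}
print(sorted(pure_forwarders(before, "A", interested)))  # ['B', 'D']
print(sorted(pure_forwarders(after, "A", interested)))   # []
```

In the first overlay two brokers (B and D) take part in the diffusion only as forwarders; in the second, the brokers involved are exactly the source and the interested brokers, which is the ideal value referred to above.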