Research interests
Opportunities (tesi di laurea IT)
Publications by year:
Publications by type: Conference papers - Journals - Technical Reports - Books - Book Chapters - Phd Thesis - Master Thesis (Tesi di Laurea Magistrale) - Bachelor Thesis (Tesi di Laurea)
Publications by author:
Publications by project:
Free search: Search


G. Di Luna, R. Baldoni
Non Trivial Computations in Anonymous Dynamic Networks.

(Full Version) short version to Appear in OPODIS 15, 2015
bib - BibTeX reference

R. Baldoni, S. Bonomi, M. Platania, L. Querzoni
Efficient Notification Ordering for Geo-Distributed Pub/Sub Systems

IEEE Transactions on Computers, volume 64, num. 10, pages 796-2808, 2015

Abstract [+]

"A distributed event notification service (ENS) is at the core of modern messaging infrastructures providing applications with scalable and robust publish/subscribe communication primitives. Such ENSs can route events toward subscribers using multiple paths with different lengths and latencies. As a consequence, subscribers can receive events out of order. In this paper, we propose a novel solution for ordered notifications on top of an existing distributed topic-based ENS. Our solutions guarantees that each pair of events published in the system will be notified in the same order to all their target subscribers independently from the topics they are published in. It endows a distributed timestamping mechanism based on a multistage sequencer that produces timestamps whose size is dynamically adjusted to accommodate changing subscriptions in the system. An extensive experimental evaluation based on a prototype implementation shows that the timestamping mechanism is able to scale from several points of view (i.e., number of publisher and subscribers, event rate). Furthermore, it shows how the deployment flexibility of our solution makes it perform better in terms of timestamp size and timestamp generation latency when the system load exhibits geographic topic popularity, that is, matching subscriptions and publications are geographically clustered. This makes our solution particularly well suited to be deployed in geo-distributed infrastructures."

Downloads:pdf - Paper (accepted version) - Main
pdf - Paper (accepted version) - Supplemental Material
bib - BibTeX reference

F. Petroni, L. Querzoni, K. Daudjee, S. Kamali, G. Iacoboni
HDRF: Stream-Based Partitioning for Power-Law Graphs

Proceedings of the 24th ACM International Conference on Information and Knowledge Management (CIKM), 2015

Abstract [+]

"Balanced graph partitioning is a fundamental problem that is receiving growing attention with the emergence of large-scale graphs in a variety of real-world applications such as social networks, machine learning and the Web. Several recent distributed graph-computing (DGC) frameworks have emerged that provide tailored programming abstractions for these applications. In DGC frameworks, the partitioning strategy plays an important role since it drives the communication cost and the workload balance among computing nodes, thereby affecting system performance. However, existing partitioning algorithms either do not exploit or exploit only partially a key characteristic of natural graphs commonly found in the real-world: their highly skewed power-law degree distributions. In this paper, we propose High-Degree (are) Replicated First (HDRF), a novel streaming vertex-cut balanced graph partitioning algorithm. HDRF effectively exploits skewed degree distributions by explicitly taking into account vertex degree in the placement decision. We experimentally evaluate and compare HDRF with existing solutions on both synthetic and real-world graphs. We show that HDRF outperforms all existing algorithms in partitioning quality while guaranteeing load balance."

Downloads:pdf - Paper
bib - BibTeX reference

C. Ciccotelli, L. Aniello, F. Lombardi, L. Montanari, L. Querzoni, R. Baldoni
NIRVANA: A Non-Intrusive Black-Box Monitoring Framework for Rack-level Fault Detection

(To appear) In Proceedings of the 21st IEEE Pacific Rim International Symposium on Dependable Computing (PRDC), 2015

Abstract [+]

"Many organizations today still manage mid or large in-house data centers that require very expensive maintenance efforts, including fault detection. Common monitoring frameworks used to quickly detect faults are complex to deploy/maintain, expensive, and intrusive as they require the installation of probes on monitored hw/sw to collect raw data. Such intrusiveness can be problematic as it imposes installation/management overhead and may interfere with security/privacy policies. In this paper we introduce NIRVANA, a novel monitoring system for fault detection that works at rack-level and is (i) non-intrusive, i.e. it does not require the installation of software probes on the hosts to be monitored and (ii) black-box, i.e., agnostic with respect to monitored applications. At the core of our solution lies the observation that aggregated features that can be monitored at rack-level in a non-intrusive and black-box way, show predictable behaviors while the system works in both fault-free and faulty states; it is therefore possible to detect and identify faults by monitoring and analyzing any perturbations to these behaviors. An extensive experimental evaluation shows that non-intrusiveness does not significantly hamper the fault detection capabilities of the monitoring system, thus validating our approach."

Downloads:bib - BibTeX reference

S. Bonomi, A. Del Pozzo, M. Potop-Butucaru, S. Tixeuil
Mobile Byzantine Fault Tolerant Distributed Storage

Technical report 2 - MIDLAB - 2015
Downloads:pdf - Technical Report - MIDLAB 2/2015
bib - BibTeX reference

S. Bonomi, A. Del Pozzo, M. Potop-Butucaru
Tight Self-Stabilizing Mobile Byzantine-Tolerant Atomic Register

(to appear) In Proceedings of the 17th International Conference on Distributed Computing and Networking (ICDCN), 2015

Abstract [+]

"This paper proposes the first implementation of a self- stabi- lizing atomic register that is tolerant to both Mobile Byzan- tine Agents and transient failures. The register is main- tained by n servers and our algorithm tolerates (i) any num- ber of transient failures and (ii) up to f Mobile Byzantine Failures. In the Mobile Byzantine Failure model, faulty agents move from one server to another and when they are affecting a server, it behaves arbitrarily. Our implemen- tation is designed for the round-based synchronous model where agents are moved from round to round. The paper considers four Mobile Byzantine Failure models differing for the diagnosis capabilities at server side i.e., when servers can diagnose their failure state (that is, be aware that the mo- bile Byzantine agent has left the server), and when servers cannot self-diagnose. We first prove lower bounds on the number of servers n necessary to construct a register toler- ant to the presence of f Mobile Byzantine Failures for each of the Mobile Byzantine Failure models considered and then we propose a parametric algorithm working in all the models and matching the lower bounds."

Downloads:pdf - Technical Report version - MIDLAB 3/2015
bib - BibTeX reference

N. Rivetti, L. Querzoni, E. Anceaume, Y. Busnel, B. Sericola
Efficient Key Grouping for Near-Optimal Load Balancing in Stream Processing Systems

In Proceedings of the 9th ACM International Conference on Distributed Event-Based Systems (DEBS), 2015

Abstract [+]

"Key grouping is a technique used by stream processing frameworks to simplify the development of parallel stateful operators. Through key grouping a stream of tuples is partitioned in several disjoint sub-streams depending on the values contained in the tuples themselves. Each operator instance target of one sub-stream is guaranteed to receive all the tuples containing a specific value. A common solution to implement key grouping is through hash functions that, however, are known to cause load imbalances on the target operator instances when the input data stream is characterized by a skewed value distribution. In this paper we present DKG, a novel approach to key grouping that provides near-optimal load distribution for input streams with skewed value distribution. DKG starts from the simple observation that with such inputs the load balance is strongly driven by the most frequent values; it identifies such values and explicitly maps them to sub-streams together with groups of less frequent items to achieve a near-optimal load balance. We provide theoretical approximation bounds for the quality of the mapping derived by DKG and show, through both simulations and a running prototype, its impact on stream processing applications."

Downloads:pdf - Paper
bib - BibTeX reference

A. Anagnostopoulos, F. Petroni, M. Sorella
COLITA: Collaborative Interest-Driven Targeted Advertising

Telecom Big Data Challenge International Competition, 2015

Abstract [+]

"Compared with online advertising, offline advertising commonly fails to achieve a satisfactory level of targeting, because of the lack of specific information on customers interests and mobility patterns, being forced to rely on imprecise and aggregate demographic estimates. In this work we use the Twitter social network to gather information about users' degree of interest in given advertising categories and about the common routes that they follow for recommending physical locations for advertising. Given an advertisement belonging to one of the examined categories, we estimate the most promising areas to be selected for the placement of an ad that can maximize its targeted effectiveness. To the best of our knowledge this is the first work on offline advertising in urban areas making use of (publicly available) data from social networks."

Downloads:bib - BibTeX reference

A. Bessi et al.
Viral Misinformation: The Role of Homophily and Polarization

In Proceedings of the 24th International Conference on World Wide Web Companion Volume, 2015

Abstract [+]

"The World Economic Forum listed the diffusion of false information on online social networks as one of the main risks for our society. Indeed, the spreading of unsubstantiated rumors either intentionally or unintentionally misleading can have serious consequences such as in the case of rumors about Ebola causing disruption to health-care workers. Targeting Facebook, we characterize consumption patterns of 1.2M Italian users with respect to verified (science news) and unverified (conspiracy news) rumors. We show that users’ engagement on verified (or unverified) content correlates with the number of friends having similar consumption patterns (homophily). Finally, we measure how this social system responded to the injection of 4,709 false information. We find that the frequent exposure to unsubstantiated rumors (polarization) is a good measure to detect homophile clusters where false rumors are more likely to spread."

Downloads:bib - BibTeX reference

S. Bonomi, M. Potop-Butucaru, S. Tixeuil
Stabilizing Byzantine-Fault Tolerant Storage

Proceedings of the 29th IEEE International Parallel & Distributed Processing Symposium, 2015
Downloads:bib - BibTeX reference

L. Aniello, L. Querzoni, R. Baldoni
High Frequency Batch-oriented Computations over Large Sliding Time Windows

Future Generation Computer Systems, volume 43, pages 1-11, Elsevier, 2015

Abstract [+]

"Today’s business workflows are very likely to include batch computations that periodically analyze subsets of data within specific time ranges to provide strategic information for stakeholders and other interested parties. The frequency of these batch computations provides an effective measure of data analytics freshness available to decision makers. Nevertheless, the typical amounts of data to elaborate in a batch are so large that a computation can take very long. Considering that usually a new batch starts when the previous one has completed, the frequency of such batches can thus be very low. In this paper we propose a model for batch processing based on overlapping sliding time windows that allows to increase the frequency of batches. The model is well suited to scenarios (e.g., financial, security etc) characterized by large data volumes, observation windows in the order of hours (or days) and frequent updates (order of seconds). The model introduces multiple metrics whose aim is reducing the latency between the end of a computation time window and the availability of results, increasing thus the frequency of the batches. These metrics specifically take into account the organization of input data to minimize its impact on such latency. The model is then instantiated on the well-known Hadoop platform, a batch processing engine based on the MapReduce paradigm, and a set of strategies for efficiently arranging input data is described and evaluated."

Downloads:pdf - Paper (accepted version)
bib - BibTeX reference

S. Bonomi, A. Del Pozzo, R. Baldoni
Building Regular Registers with Rational Malicious Servers and Anonymous Clients

Technical report 1 - MIDLAB - 2015

Abstract [+]

"The paper addresses the problem of emulating a regular register in a synchronous distributed system where clients invoking ${\sf read}()$ and ${\sf write}()$ operations are anonymous while server processes maintaining the state of the register may be compromised by rational adversaries (i.e., a server might behave as \emph{rational malicious byzantine} process). We first model our problem as a Bayesian game between a client and a rational malicious server where the equilibrium depends on the decisions of the malicious server (behave correctly and not be detected by clients vs returning a wrong register value to clients with the risk of being detected and then excluded by the computation). We prove such equilibrium exists and finally we design a protocol implementing the regular register that forces the rational malicious server to behave correctly."

Downloads:pdf - Technical Report Netys
bib - BibTeX reference

L. Montanari et al.
2014 Italian Cyber Security Report

Abstract [+]

"Da qui a dieci anni la correlazione tra prosperità economica di una nazione ed il fatto che questa Nazione possegga capacità cyber avanzate sarà molto stretta. Quindi per rimanere nel gruppo delle Nazioni sviluppate si dovranno migliorare queste capacità nella società, nel tessuto industriale, nella pubblica amministrazione di un Paese. Per questo ogni Nazione si sta attrezzando attraverso l’implementazione di un piano strategico Nazionale multidimensionale che coinvolge pubblico, privato e ricerca. Alzare le difese del cyber space di un Paese significa, tra l’altro, rendere più appetibile investimenti da parte di operatori internazionali che potrebbero vedere come pericoloso costruire realtà imprenditoriali, come ad esempio nuove aziende sul territorio, in un paese dove non c’è organizzazione e capacità difensiva cyber. Queste aziende potrebbero infatti rivelarsi un punto debole del sistema informativo dell’organizzazione investitrice. Nel terzo millennio, rendere sicuro il cyber space di un Paese significa gettare le basi per l’indipendenza e la prosperità economica di una Nazione. Ad un anno dalla pubblicazione del Quadro Strategico Nazionale per la Sicurezza dello Spazio Cibernetico, il Cyber Security Report 2014 si concentra sulle problematiche legate alla consapevolezza della minaccia e alla capacità difensiva della pubblica amministrazione italiana. Un questionario è stato inviato a circa 300 pubbliche amministrazioni a livello nazionale, regionale e locale. Lo studio ha consentito di individuare i principali problemi e quindi i punti cruciali, su cui agire per ottenere un miglioramento rapido e sostanziale della nostra protezione. Ha anche purtroppo evidenziato come ci siano lacune importanti e radicate sia in termini di cultura della sicurezza che di organizzazione. Ne consegue una situazione in cui solo pochissime amministrazioni si possono ritenere consapevoli dei rischio cibernetico mentre gli errori e la quantità di best practices ignorate sottolineano la profonda arretratezza culturale, in particolare, rispetto al valore strategico (ed economico) delle informazioni che potrebbero essere trafugate dai sistemi informativi di una pubblica amministrazione. Porre in sicurezza il cyber space nazionale rappresenta quindi un dovere rispetto alle giovani generazioni di fronte al rischio che ogni minuto si corre in un mondo digitale dove le minacce sono continue ed in costante aumento. Ma questo pericolo può rappresentare un’imperdibile e gigante opportunità economica per il Paese in particolare per la crescita della nostra capacità industriale. Infatti nel dominio cyber l’Italia vanta competenze che il mondo ci invidia (e sempre più spesso purtroppo ci compra), quindi definire una appropriata strategia industriale a livello governativo per sorreggere l’ecosistema cyber italiano può rappresentare un doppio vantaggio economico, di sicurezza ed indipendenza Nazionale. Ad oggi però l’Italia investe zero in questo settore, dove altre nazioni, per dimensioni simili alla nostra, hanno budget quadriennali ormai nell’ordine del bilione di euro. D’altra parte le buone pratiche di altre Nazioni mostrano che per implementare strategie nazionali ed industriali adeguate, ci vogliono attori nazionali sparsi sul territorio. Per questo la comunità accademica italiana, ha iniziato a fare la sua parte creando nel corso del 2014 il Laboratorio Nazionale di Cyber Security dove professori e ricercatori di 33 Università Italiane offrono la loro collaborazione per rendere la nostra vita digitale più sicura. Noi speriamo che questa opportunità che abbiamo creato come accademici venga colta e supportata dalle autorità governative."

Downloads:bib - BibTeX reference

M. Baccini
Prediction of Twitter’s contents through the analysis of users’ similarities

Master thesis (Tesi di Laurea Specialistica)
Downloads:pdf - Master Thesis
bib - BibTeX reference

F. Piccione
Evaluation of data inference methods on the Google+ social network

Master thesis (Tesi di Laurea Specialistica)
Downloads:pdf - Master Thesis
bib - BibTeX reference

Top -

DISCLAIMER - The reports contained in this page are included by the contributing authors as a mechanism to ensure timely dissemination of scholarly/technical information on a non-commerical basis. Copyright and all rights therein are maintained by the authors, despite the fact they have offered this information electronically. It is understood that all individuals copying this information will adhere to the terms/ constraints invoked by each author's copyright. Reports may not be copied for commercial redistribution, republication, or dissemination without the explicit permission of the authors. Sections of some of these reports have been published by IEEE, Springer-Verlag, Kluwer etc. and have Copyright. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works, must be obtained from the publisher.