12357 result(s)
30/05/2016 Unknown
gDup: an integrated and scalable graph deduplication system.
Atzori C
In this thesis we start from the experiences and solutions for duplicate identification in Big Data collections and address the broader and more complex problem of 'Entity Deduplication over Big Graphs'. By 'Graph' we mean any digital representation of an Entity Relationship model, hence entity types (structured properties) and relationships between them. By 'Big' we mean that duplicate identification over the objects of such entity types cannot be handled with traditional backends and solutions, e.g. ranging from tens of millions of objects to any higher number. By 'entity deduplication' we mean the combined process of duplicate identification and graph disambiguation. Duplicate identification has the aim of efficiently identifying pairs of equivalent objects for the same entity type, while graph disambiguation has the goal of removing the duplication anomaly from the graph. A large number of Big Graphs are today being maintained, e.g. collections populated over time with no duplicate controls, aggregations of multiple collections, which need continuous or extemporaneous entity deduplication cleaning. Examples are person deduplication in census records, deduplication of authors in library bibliographical collections (e.g. Google Scholar graph, Thomson Reuters citation graph, OpenAIRE graph), deduplication of catalogues from multiple stores, deduplication of Linked Open Data clouds resulting from the integration of multiple clouds, any subset of the Web, etc. As things stand today, data curators can find a plethora of tools supporting duplicate identification for Big collections of objects, which they can adopt to efficiently process the objects of individual entity type collections. However, the extension of such tools to the Big Graph scenario is absent, as is the support for graph disambiguation. In order to implement a full entity deduplication workflow for Big Graphs, data curators end up realizing patchwork systems, tailored to their graph data model, often bound to their physical representation of the graph (i.e. graph storage), expensive in terms of design, development, and maintenance, and in general not reusable by other practitioners with similar problems in different domains. The first contribution of this thesis is a reference architecture for 'Big Graph Entity Deduplication Systems' (BGEDSs), which are integrated, scalable, general-purpose systems for entity deduplication over Big Graphs. BGEDSs are intended to support data curators with the out-of-the-box functionalities they need to implement all phases of duplicate identification and graph disambiguation. The architecture formally defines the challenge by providing a graph type language and a graph object language, defining the specifics of the entity deduplication phases, and explaining how such phases manipulate the initial graph to eventually return the final disambiguated graph. Most importantly, it defines the level of configuration, i.e. customization, that data curators should be able to exploit when relying on BGEDSs to implement entity deduplication. The second contribution of this thesis is GDup, an implementation of a BGEDS whose instantiation is today used in the real production environment of the OpenAIRE infrastructure, the European e-infrastructure for Open Science and Open Access. GDup can be used to operate over Big Graphs represented using standards such as RDF graphs or JSON-LD graphs and conforming to any graph schema.
The system supports highly configurable duplicate identification and graph disambiguation settings, allowing data curators to tailor object matching functions by entity type properties and to define the strategy for merging duplicate objects that will disambiguate the graph. GDup also provides functionalities to semi-automatically manage a Ground Truth, i.e. a set of trustworthy assertions of equality between objects, that can be used to preprocess objects of the same entity type and reduce computation time. The system is conceived to be extensible with other, possibly new methods in the deduplication domain (e.g. clustering functions, similarity functions) and supports scalability and performance over Big Graphs by exploiting an HBase - Hadoop MapReduce stack.
Project(s): Open Access Infrastructure for Research in Europe 2020
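To make the two customization points named above concrete - entity-type-specific matching functions and a duplicate-merging strategy - here is a minimal, purely illustrative Python sketch; the record fields, threshold and helper names are hypothetical and are not GDup's actual API.

```python
# Illustrative sketch only: a per-entity-type matching function plus a merge
# strategy, i.e. the two customization points the abstract describes for
# duplicate identification and graph disambiguation.
from difflib import SequenceMatcher

def publication_match(a, b, threshold=0.9):
    """Decide whether two 'publication' objects are duplicates
    by comparing a structured property (here, the title)."""
    sim = SequenceMatcher(None, a["title"].lower(), b["title"].lower()).ratio()
    return sim >= threshold

def merge_publications(group):
    """Merge strategy: keep the longest value of each property and
    union the relationships of all equivalent objects."""
    merged = {"relations": set()}
    for obj in group:
        for key, value in obj.items():
            if key == "relations":
                merged["relations"] |= set(value)
            elif len(str(value)) > len(str(merged.get(key, ""))):
                merged[key] = value
    return merged

records = [
    {"id": "p1", "title": "gDup: a graph deduplication system",
     "relations": {"authored_by:atzori"}},
    {"id": "p2", "title": "GDup: A Graph Deduplication System.",
     "relations": {"published_in:2016"}},
]
if publication_match(records[0], records[1]):
    print(merge_publications(records))
```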

See at: hdl.handle.net | etd.adm.unipi.it | hdl.handle.net | CNR IRIS


29/01/2016 Unknown
Device interoperability and service discovery in smart environments
Girolami M
Smart Environments (SE), and in particular Smart Homes, have attracted the attention of many researchers and industrial vendors. In such environments, according to the Ambient Intelligence paradigm, devices operate collectively using any information describing the environment (also known as context-information) in order to support users in accomplishing their tasks. SE devices are characterized by several properties: they are designed to react autonomously to specific events, they are aware of the context, they manage sensitive information concerning the users, they adopt a service-oriented model in order to interact with other devices, and they interact by means of various applications and communication protocols. Cooperation with devices in SE is thus complex. This thesis deals with two problems that still represent a barrier to the development of many SE applications. The thesis examines how to interact with low-power devices, which is referred to as device interoperability, and how to discover the functionalities that mobile devices offer, namely the service discovery problem. The first part of the thesis describes the design of ZB4O, an integration gateway for low-power devices based on the ZigBee specification. The growing market for ZigBee-ready appliances makes the ZigBee specification an important technology enabler for SE. However, accessing such devices requires an easy interaction model with the IP-based networks that are already present in most SE. Therefore, this work presents an open source platform that seamlessly integrates ZigBee devices with applications running in SE. The thesis describes the evaluation of ZB4O in various trials organized over the last year of two EU projects, as well as the integration of ZB4O with UPnP and a RESTful approach. SE devices can also export their functionalities with a service-oriented approach. In fact, every resource offered by a device can be seen as a service available to other devices. The second problem studied in this thesis is service discovery, which deals with how to advertise and query services in SE. The scenario considered for the service discovery problem is characterized by mobile devices carried by people roaming in SE. Hence, mobility and sociality are two key factors that make the service discovery problem more complex and challenging. The thesis presents two algorithms, termed SIDEMAN and CORDIAL, for service discovery in Mobile Social Networks (MSNs), which are evaluated in real and synthetic simulation scenarios.

See at: hdl.handle.net | etd.adm.unipi.it | hdl.handle.net | CNR IRIS


27/11/2013 Unknown
Nuove tecnologie per lo studio della policromia sui sarcofagi romani: proposte per una standardizzazione metodologica [New technologies for the study of polychromy on Roman sarcophagi: proposals for a methodological standardization]
Siotto E
This research aims to identify a scientific method and use digital technologies to acquire information about pigments, painting techniques, and procedures for the application of color and gilding on metropolitan Roman marble sarcophagi (2nd - 4th cent. AD). The study covers the analytical work and cataloging, performed according to standardized norms, of the polychrome sarcophagi identified in the Musei Vaticani, Museo Nazionale Romano, and Musei Capitolini collections. Moreover, it identifies, tests, and assesses a set of open-source software tools, suggesting some amendments designed to increase their effectiveness for the study of ancient polychromy. The research was divided into three main parts: 1. Identification and classification of the polychrome and gilded traces preserved on Roman marble sarcophagi of the three mentioned museum collections, following the standard of the ICCD - Central Institute for Cataloguing and Documentation of Italy. Further development of an analytical investigation method, supported by the results of scientific analyses, to identify the pigments and dyes used, their application techniques, and the gold leaf. 2. Testing of the Ministerial web-based Information System for the documentation of restoration worksites - SICaR, to create a standard method for the acquisition, comparison, and subsequent use of the information on polychromy and gilding, with particular attention to the scientific analysis results. 3. Creation of a photorealistic 3D digital model of a sarcophagus chosen as a case study, acquisition of knowledge of its polychromy through scientific analyses, and identification of digital open-source tools (e.g., MeshLab). Finally, testing of the system to assess its effectiveness and limitations in visualizing a hypothesis of the original color on the digital reconstruction.

See at: hdl.handle.net | etd.adm.unipi.it | hdl.handle.net


26/10/2001 Unknown
Extracting Type Values From Semistructured Databases
Manghi P
To date, investigations on Semi-Structured Data (SSD) have focused on query languages that operate directly on graph-based data by matching flexible path expressions against the graph-database topology. The problem with this approach is that it renounces in principle the benefits typically associated with typing information. In particular, storage and query optimisation techniques, representation of user knowledge of the data, and validation of computations cannot be based upon static typing information. On the other hand, the various attempts to reintroduce types for SSD, while effectively returning some of the benefits of static typing, compromise the irregular nature of SSD databases by allowing only mild forms of irregularity. Our investigation is motivated by the observation that, despite the inherent irregularity of their structure, many or indeed most SSD databases contain one or more subsets that present a high degree of regularity and could therefore be treated as typed values of a programming language. In this thesis we lay the formal foundations underlying a novel query methodology based on an extraction system that, given an SSD database and a type of a target language, results in: (i) a subset of the database that is semantically equivalent to a value of the given type; (ii) a measure that informs the user about the quality of the type with respect to the original database. The extracted subset can then be converted into a value of that type and injected into the language environment, where it can be computed over with all the benefits of static typing.
DOI: 10.5281/zenodo.269648
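As a purely illustrative companion to the abstract (not the thesis' formal system), the following Python sketch shows the core idea: matching a hypothetical target type against a semistructured collection, extracting the conforming subset, and reporting a quality measure of how well the type fits the data.

```python
# Toy illustration of type-driven extraction from semistructured data:
# keep the records that conform to the target type and report, as a
# quality measure, the fraction of the database they cover.
target_type = {"name": str, "year": int}   # hypothetical target-language type

database = [
    {"name": "rec1", "year": 2001, "note": "extra field"},
    {"name": "rec2", "year": "unknown"},   # irregular record
    {"title": "rec3"},                     # missing fields
]

def conforms(record, t):
    return all(k in record and isinstance(record[k], py_t) for k, py_t in t.items())

extracted = [{k: r[k] for k in target_type} for r in database if conforms(r, target_type)]
quality = len(extracted) / len(database)   # how well the type fits the database
print(extracted, quality)
```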


See at: hdl.handle.net | hdl.handle.net | CNR IRIS | CNR IRIS | CNR IRIS


26/04/2018 Unknown
Monitoring indoor human activities for Ambient Assisted Living
Crivello A
At the end of the 20th century, Ubiquitous Computing and Ambient Intelligence were introduced as a vision of the future society. In this context, the paradigm of Ambient Assisted Living (AAL) has allowed the evolution of methods, techniques and systems to improve everyday life, by supporting people in both physical and cognitive aspects, especially in the case of the so-called "fragile people". The state-of-the-art research develops means for vital data measurement, for recognizing activities and for inferring whether a self-care task has been performed. These results are obtained through the simultaneous presence of different technologies deployed in the physical environments in which people live. The monitoring of human activities is fundamental to enable the AAL paradigm. For instance, people spend several hours a day sleeping, thus monitoring this activity is fundamental for understanding and characterizing a person's sleep habits. On the other hand, at daytime, several indoor activities can be inferred by knowing the exact position of a subject. In this view, the main goal of this thesis is to propose advancements in the field of both daytime and night-time monitoring of human activities, focusing on indoor localisation and sleep monitoring as key enablers for AAL. Regarding Indoor Positioning Systems (IPSs), the lack of a standardized benchmarking tool and of a common, public dataset to test and compare IPS results is still a challenging open issue. Advancements in this direction can improve the performance evaluation of heterogeneous systems and, consequently, the IPSs themselves. Some steps have been made towards introducing benchmarking tools, for example through the introduction of the EvAAL framework, which defines tools and metrics usable for comparing both real-time and offline methods. This thesis contributes by proposing (i) some improvements to the EvAAL benchmarking framework, especially considering real-time smartphone-based positioning systems, and (ii) a common, public, multisource and multivariate dataset, gathered using both a smartwatch and a smartphone, to allow researchers to test their own results. Then, this thesis focuses on both single-device and multiple-device localisation. Concerning single-device positioning strategies, several smartphone-based systems have been recently presented, based on data gathered from smartphone built-in sensors, though with performances that are not completely satisfactory. In this view, the thesis proposes a novel approach based on deep convolutional neural networks, in order to improve the use of the pedometer (one of the main smartphone built-in sensors used in IPSs) and consequently the performance of the Pedestrian Dead Reckoning algorithm. Finally, we extend the concept of single-device localisation to several devices in indoor environments. Localising multiple devices in the same environment makes it possible to detect, for example, social behaviour and interaction. Several systems try to reach this goal in AAL scenarios, but use an intrusive and expensive ad-hoc infrastructure. Instead, we propose a novel approach for detecting the presence of people in indoor locations through a cheap technology such as Wi-Fi probes, demonstrating the feasibility of this approach. Regarding the sleep monitoring problem, recent findings show that sleep plays a critical role in reducing the risk of dementia and preserving cognitive function in older adults.
However, state-of-the-art techniques for understanding sleep characteristics are generally difficult to deploy in an AAL scenario. This suggests that more effort should be spent on finding sleep monitoring systems able to detect objective sleep patterns and, at the same time, easy to use in a home setting. In this thesis we propose a system able to perform human sleep monitoring in an unobtrusive way, using force-sensing resistor sensors placed in a rectangular grid pattern on the slats, below the mattress; it can also detect bed postures during sleep sessions and identify patient movements and sleep stages, information that is particularly useful, for instance, to support pressure ulcer prevention. The proposed advancements have been thoroughly evaluated both in the laboratory and in real-world scenarios, demonstrating their effectiveness.
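As an aside on why pedometer accuracy matters for the localisation part of the thesis, the following minimal Python sketch shows a textbook Pedestrian Dead Reckoning update, in which each detected step advances the position estimate along the current heading; the step length and headings are illustrative values, not the thesis' parameters.

```python
# Minimal Pedestrian Dead Reckoning update: each detected step moves the
# (x, y) estimate by a fixed step length along the current heading, so a
# missed or spurious step from the pedometer directly biases the trajectory.
import math

def pdr_update(position, heading_rad, step_length_m=0.7):
    """Advance the (x, y) estimate by one detected step."""
    x, y = position
    return (x + step_length_m * math.cos(heading_rad),
            y + step_length_m * math.sin(heading_rad))

pos = (0.0, 0.0)
for heading_deg in [0, 0, 90, 90, 90]:   # headings from the compass/gyroscope
    pos = pdr_update(pos, math.radians(heading_deg))
print(pos)   # estimated position after five detected steps
```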

See at: hdl.handle.net | hdl.handle.net | hdl.handle.net | CNR IRIS


25/10/2021 Unknown
Enhancing the computational representation of narrative and its extraction from text
Metilli D
Narratives are a fundamental part of human life. Every human being encounters countless stories during their life, and these stories contribute to forming a common understanding of reality. This is reflected in the current digital landscape, and especially on the Web, where narratives are published and shared every day. However, the current digital representation of narratives is limited by the fact that each narrative is generally expressed as natural language text or other media, in an unstructured way that is neither standardized nor machine-readable. These limitations hinder the manageability of narratives by automated systems. One way to solve this problem would be to create an ontology of narrative, i.e., a formal model of what a narrative is, then develop semi-automated methods to extract narratives from natural language text, and use the extracted data to populate the ontology. However, the feasibility of this approach remains an open question. This thesis attempts to investigate this research question, starting from the state of the art in the fields of Computational Narratology, Semantic Web, and Natural Language Processing. Based on this analysis, we have identified a set of requirements, and we have developed a methodology for our research work. Then, we have developed an informal conceptualization of narrative, and we have expressed it in a formal way using First-Order Logic. The result of this work is the Narrative Ontology (NOnt), a formal model of narrative that also includes a representation of its textual structure and textual semantics. To ensure interoperability, the ontology is based on the CIDOC CRM and FRBRoo standards, and it has been expressed using the OWL and SWRL languages of the Semantic Web. Based on the ontology, we have developed NarraNext, a semi-automatic tool that is able to extract the main elements of narrative from natural language text. The tool allows the user to create a complete narrative based on a text, using the extracted knowledge to populate the ontology. NarraNext is based on recent advancements in the Natural Language Processing field, including deep neural networks, and is integrated with the Wikidata knowledge base. The validation of our work is being carried out in three different scenarios: (i) a case study on biographies of historical figures found in Wikipedia; (ii) the Mingei project, which applies NOnt to the representation and preservation of Heritage Crafts; (iii) the Hypermedia Dante Network project, where NOnt has been integrated with a citation ontology to represent the content of Dante's Comedy. All three applications have served to validate the representational adequacy of NOnt and the satisfaction of the requirements we defined. The case study on biographies has also evaluated the effectiveness of the NarraNext tool.
Project(s): Representation and Preservation of Heritage Crafts
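A hedged illustration of the general idea (not NOnt's actual schema): a narrative fragment can be stored as machine-readable event records whose participants are identified by Wikidata IDs, so that the chronological ordering of events can be reconstructed automatically. The labels and dates below are illustrative.

```python
# Illustrative only: a narrative fragment as structured event records linked
# to Wikidata entities (wd:Q1067 is Dante Alighieri). The fabula, i.e. the
# chronological ordering of events, can then be derived from the data.
narrative = [
    {"event": "ev2", "label": "Dante begins writing the Comedy",
     "time": "1308", "participants": ["wd:Q1067"]},
    {"event": "ev1", "label": "Dante is exiled from Florence",
     "time": "1302", "participants": ["wd:Q1067"]},
]

fabula = sorted(narrative, key=lambda e: e["time"])   # chronological order
for ev in fabula:
    print(ev["time"], ev["label"], ev["participants"])
```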

See at: hdl.handle.net | etd.adm.unipi.it | hdl.handle.net | CNR IRIS


25/05/2007 Unknown
A Content-Addressable Network for Similarity Search in Metric Spaces
Falchi F
Because of the ongoing digital data explosion, more advanced search paradigms than the traditional exact match are needed for content-based retrieval in the huge and ever-growing collections of data produced in application areas such as multimedia, molecular biology, marketing, computer-aided design and purchasing assistance. As the variety of data types grows fast and databases are increasingly used directly by people, computer systems must be able to model fundamental human reasoning paradigms, which are naturally based on similarity. The ability to perceive similarities is crucial for recognition, classification, and learning, and it plays an important role in scientific discovery and creativity. Recently, the mathematical notion of metric space has become a useful abstraction of similarity and many similarity search indexes have been developed. In this thesis, we accept the metric space similarity paradigm and concentrate on scalability issues. By exploiting computer networks and applying Peer-to-Peer communication paradigms, we build a structured network of computers able to process similarity queries in parallel. Since no centralized entities are used, such architectures are fully scalable. Specifically, we propose a Peer-to-Peer system for similarity search in metric spaces called Metric Content-Addressable Network (MCAN), which is an extension of the well-known Content-Addressable Network (CAN) used for hash lookup. A prototype implementation of MCAN was tested on real-life datasets of image features, protein symbols, and text -- observed results are reported. We also compared the performance of MCAN with three other, recently proposed, distributed data structures for similarity search in metric spaces.
Project(s): Search on Audio-visual content using peer-to-peer Information Retrieval
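The following toy Python sketch illustrates the pivot-based intuition behind content-addressable approaches of this kind (it is not the actual MCAN implementation): each object is mapped to coordinates given by its distances to a few reference pivots, and the peer owning that region of the coordinate space stores it. The distance function, pivots and zone partition are simplified assumptions.

```python
# Toy pivot-based mapping: objects become points in a coordinate space whose
# axes are distances to fixed pivots; peers own rectangular zones of that space,
# so similar objects tend to land on the same or neighbouring peers.
def dist(a, b):
    """A simple metric on strings: mismatches plus length difference."""
    return sum(x != y for x, y in zip(a, b)) + abs(len(a) - len(b))

pivots = ["protein", "image"]

def to_can_coords(obj):
    return tuple(dist(obj, p) for p in pivots)

def responsible_peer(coords, zone_size=5, peers_per_dim=4):
    """Toy zone partition: each peer owns a hyper-rectangle of the space."""
    return tuple(min(int(c // zone_size), peers_per_dim - 1) for c in coords)

for obj in ["proteins", "imagery", "protean"]:
    c = to_can_coords(obj)
    print(obj, "->", c, "stored at peer", responsible_peer(c))
```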

See at: hdl.handle.net | etd.adm.unipi.it | hdl.handle.net | CNR IRIS


23/03/2023 Unknown
Orchestration strategies for regression testing of evolving software systems
Merlin Greca Rd
Context: Software is an important part of modern life, and in most cases it provides tremendous benefits to society. Unfortunately, software is highly susceptible to faults. Faults are often harmless, but even small errors can cause massive damage depending on the context. Thus, it is crucial for software developers to adopt testing techniques that can help locate faults and guarantee the functionality of both individual components and the system as a whole. Today, there is a trend towards continuously evolving software, in which it is desired that changes such as new features and corrections are delivered to end users as quickly as possible. To ensure correct behavior upon release, development teams rely on regression testing suites, which serve to validate previously-correct features and, when well-designed, avoid the propagation of faults to end users. However, the desire for velocity that comes with continuously evolving software places an additional burden on regression testing practices, as running complete test suites can be a costly process in large-scale software. This challenge has generated a need for novel regression testing techniques, a topic which now enjoys a robust literature within software engineering research. However, there is limited evidence of this research finding its way into practical usage by the software development community; in other words, there is a disconnect between academia and industry on the subject of software testing techniques. Objective: To improve the applicability of regression testing research, we must identify the main causes of this apparent gap between software engineering academics and practitioners. This is a multifaceted goal, involving an investigation of the literature and of the state of practice. A related goal is to provide examples of test suite orchestration strategies that draw from academic advancements and could provide benefits if implemented on real software. Method: This thesis tackles the aforementioned challenge from multiple directions. It includes a comprehensive systematic literature review covering research published between 2016 and 2022, focusing on papers that bring techniques and discussions relevant to the applicability of regression testing research. Along with data extracted from the papers themselves, this discussion of the existing literature includes information received directly from authors through a questionnaire, as well as a survey performed with practitioners, seeking to validate some of the reported findings. Test suite orchestration strategies can be a step towards bridging the so-called industry-academia knowledge gap. To that end, we propose a combined approach for regression testing, including techniques extracted from the literature that have promising qualities. This approach is an initial experiment with full test suite orchestration, and extended approaches are also discussed. To get a closer understanding of the state of regression testing in a practical sense, a series of interviews was conducted in collaboration with a large technology company. During a seven-week process, we were able to interact with the team, learn the test practices performed on a daily basis, and gain some insight into the company's long-term test strategies. The responses to the interviews are reported, edited for readability and confidentiality reasons, and these results are discussed within the larger context of the study.
The results from the above components of the study are then aggregated into two notable outputs. First, a live repository of literature is made available online, containing the current results of the literature review and offering the opportunity for expansion as more research is performed on this topic. Then, we provide a list of the most notable challenges for the implementation of regression testing techniques in practice, identified during the development of this entire study. Results: This thesis provides the following contributions: a comprehensive literature review of applicable regression testing research; additional context on the literature provided by the authors of cited papers; a preliminary test suite orchestration strategy combining robust techniques from the literature; interviews with practitioners at a major technology company that highlight the challenges faced daily by developers and testers; a live repository of papers to aggregate relevant literature in one online location; and a list of challenges that can serve as guidelines for researchers or even as research directions in their own right. Conclusion: There is still much work to be done by the software engineering research and development communities in order to completely close the gap that exists between them. To a great extent, the motivations of researchers and practitioners are not aligned -- while in academia proposing theoretically sound novel approaches is encouraged to obtain publications, in industry there is a need for techniques that are proven to reduce effort and/or costs. This can only be solved by close collaboration between the two sides, yet the question of who is willing to fund these experiments remains. The data and discussions provided in this thesis show that, although difficult, this is not an impossible problem to solve, and there are certain clear steps that can be taken by researchers and practitioners alike to begin addressing it.
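As one concrete example of the kind of technique such an orchestration strategy can draw on (a sketch under assumed data, not the thesis' exact approach), the snippet below prioritizes regression tests by recent failure history and cost, then selects a subset within a time budget.

```python
# Illustrative test prioritization and selection: favour tests that failed
# recently and are cheap to run, then fill a fixed time budget.
tests = [
    {"name": "test_login",    "recent_failures": 3, "duration_s": 12},
    {"name": "test_checkout", "recent_failures": 0, "duration_s": 45},
    {"name": "test_search",   "recent_failures": 1, "duration_s": 5},
]

def priority(t):
    # more recent failures and lower cost => run earlier
    return (t["recent_failures"] + 1) / t["duration_s"]

budget_s = 30
selected, elapsed = [], 0
for t in sorted(tests, key=priority, reverse=True):
    if elapsed + t["duration_s"] <= budget_s:
        selected.append(t["name"])
        elapsed += t["duration_s"]
print(selected)   # tests chosen for this cycle
```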

See at: hdl.handle.net | iris.gssi.it | hdl.handle.net | CNR IRIS


23/02/2022 Unknown
Computational methods for improving manufacturing processes
Alderighi T
The last two decades have seen a rapid and wide growth of digital fabrication machinery and technologies. This led to a massive diffusion of such technologies both in the industrial setting and within the hobbyists' and makers' communities. While the applications to rapid prototyping and simple download-and-print use cases can be trivial, the design space offered by these numerically controlled technologies (i.e., 3D printing, CNC milling, laser cutting, etc.) is hard to exploit without the support of appropriate computational tools and algorithms. Within this thesis, we investigate how the potential of common rapid prototyping tools, combined with sound computational methods, can be used to provide novel and alternative fabrication methods and to enhance existing ones, making them available to non-expert users. The contributions presented in this thesis are four. The first is a novel technique for the automatic design of flexible molds to cast highly complex shapes. The algorithm is based on an innovative volumetric analysis of the mold volume that defines the layout of the internal cuts needed to open the mold. We show how the method can robustly generate valid molds for shapes with high topological and geometrical complexity for which previously existing methods could not provide any solution. The second contribution is a method for the automatic volumetric decomposition of objects into parts that can be cast using two-piece reusable rigid molds. Automating the design of this kind of mold can directly impact industrial applications, where the use of two-piece, reusable, rigid molds is a de-facto standard, for example in plastic injection molding machinery. The third contribution is a pipeline for the fabrication of tangible media for the study of complex biological entities and their interactions. The method covers the whole pipeline from molecular surface preparation and editing to actual 3D model fabrication. Moreover, we investigated the use of these tangible models as a teaching aid in high school classrooms. Finally, the fourth contribution tackles another important problem related to the fabrication of parts using FDM 3D printing technologies. With this method, we present an automatic optimization algorithm for the decomposition of objects into parts that can be individually 3D printed and then assembled, with the goal of minimizing the visual impact of support artifacts.

See at: hdl.handle.net | etd.adm.unipi.it | hdl.handle.net | CNR IRIS


22/06/2015 Unknown
Social Network Dynamics
Rossetti G
This thesis focuses on the analysis of structural and topological network problems. In particular, in this work the privileged subjects of investigation will be both static and dynamic social networks. Nowadays, the constantly growing availability of Big Data describing human behaviors (i.e., the data provided by online social networks, telco companies, insurance companies, airlines, etc.) offers the chance to evaluate and validate, on large-scale realities, the performance of algorithmic approaches and the soundness of sociological theories. In this scenario, exploiting data-driven methodologies enables a more careful modeling and thorough understanding of observed phenomena. In the last decade, graph theory has lived a second youth: the scientific community has extensively adopted, and sharpened, its tools to shape the so-called Network Science. Within this highly active field of research, the need has recently emerged to extend classic network analytical methodologies in order to cope with a very important, previously underestimated, semantic information: time. Such awareness has been the linchpin for recent works that have started to redefine from scratch well-known network problems in order to better understand the evolving nature of human interactions. Indeed, social networks are highly dynamic realities: nodes and edges appear and disappear as time goes by, describing the natural lives of social ties; for this reason, it is mandatory to assess the impact that time-aware approaches have on the solution of network problems. Moving from the analysis of the strength of social ties, passing through node ranking and link prediction, until reaching community discovery, this thesis aims to discuss data-driven methodologies specifically tailored to approach social network issues in semantically enriched scenarios. To this end, both static and dynamic analytical processes will be introduced and tested on real-world data.

See at: hdl.handle.net | etd.adm.unipi.it | hdl.handle.net | CNR IRIS


22/05/2018 Unknown
A Portable, Intelligent, Customizable Device for Human Breath Analysis
Germanese D
Breath analysis allows for monitoring the metabolic processes that occur in the human body in a non-invasive way. Compared with traditional methods such as blood tests, breath analysis is harmless not only to the subjects but also to the personnel who collect the samples. However, despite its great potential, only a few breath tests are commonly used in clinical practice nowadays. Breath analysis has not yet gained wider use. One of the main reasons is related to the standard instrumentation for gas analysis. Standard instrumentation, such as gas chromatography, is expensive and time consuming. Its use, as well as the interpretation of the results, often requires specialized personnel. E-nose systems, based on gas sensor arrays, are easier to use and able to analyze gases in real time, but, although cheaper than a gas chromatograph, their cost remains high. During my research activity, carried out at the Signals and Images Laboratory (SiLab) of the Institute of Information Science and Technologies (ISTI) of the National Research Council (CNR), I designed and developed the so-called Wize Sniffer (WS), a device able to accurately analyze human breath composition and, at the same time, overcome the limitations of existing instrumentation for gas analysis. The idea of the Wize Sniffer was born in the framework of the SEMEiotic Oriented Technology for Individual's CardiOmetabolic risk self-assessmeNt and Self-monitoring (SEMEOTICONS, www.semeoticons.eu) European Project, and it was designed for detecting, in human breath, those molecules related to habits harmful for cardio-metabolic risk. The clinical assumption behind the Wize Sniffer lay in the fact that harmful habits such as alcohol consumption, smoking, and an unhealthy diet cause a variation in the concentration of a set of molecules (among which carbon monoxide, ethanol, hydrogen, hydrogen sulfide) in the exhaled breath. Therefore, the goal was to realize a portable and easy-to-use device, based on cheap electronics, to be used by anybody at home. The main contributions of my work were the following: (i) the design and development of a portable, low-cost, customizable, easy-to-use device able to be used in any context: I achieved this by using cheap commercial discrete gas sensors and an Arduino board, wrote the software and calibrated the system; (ii) the development of a method to analyze breath composition and understand an individual's cardio-metabolic risk, which I also validated with success on real people. Given such good outcomes, I wanted the Wize Sniffer to take a further step forward, towards diagnosis in particular. The application field concerned chronic liver impairment, as studies involving e-nose systems in the identification of liver disease are still few. In addition, the diagnosis of liver impairment often requires very invasive clinical tests (biopsy, for instance). In this proof-of-concept study, the Wize Sniffer showed good diagnosis-oriented properties in discriminating the severity of liver disease (absence of disease, chronic liver disease, cirrhosis, hepatic encephalopathy) on the basis of the detected ammonia.
Project(s): SEMEiotic Oriented Technology for Individual's CardiOmetabolic risk self-assessmeNt and Self-monitoring
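As a rough illustration of the host-side acquisition loop such a device could use (not the actual Wize Sniffer software), the sketch below assumes the Arduino prints one comma-separated line of gas readings per sample; the serial port, line format and thresholds are hypothetical, and the third-party pyserial package is required.

```python
# Hedged sketch of reading an Arduino-based gas sensor board over serial.
# Assumed (hypothetical) line format: 'CO,ethanol,H2,H2S' as floats.
import serial  # pip install pyserial

THRESHOLDS_PPM = {"CO": 10.0, "ethanol": 0.5, "H2": 5.0, "H2S": 0.05}  # illustrative only

def parse_line(line):
    """Parse one comma-separated reading into a gas -> value dictionary."""
    values = [float(v) for v in line.decode("ascii").strip().split(",")]
    return dict(zip(THRESHOLDS_PPM, values))

with serial.Serial("/dev/ttyACM0", 9600, timeout=2) as port:
    for _ in range(10):                       # read ten breath samples
        reading = parse_line(port.readline())
        alerts = [gas for gas, v in reading.items() if v > THRESHOLDS_PPM[gas]]
        print(reading, "above threshold:", alerts)
```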

See at: hdl.handle.net | hdl.handle.net | CNR IRIS


22/03/2019 Unknown
Mining human mobility data and social media for smart ride sharing
Monteiro De Lira V
People living in highly populated cities increasingly suffer an impoverishment of their quality of life due to pollution and traffic congestion problems caused by the huge number of circulating vehicles. Indeed, reducing the number of circulating vehicles is one of the most difficult challenges in large metropolitan areas. This PhD thesis proposes a research contribution with the final objective of reducing travelling vehicles. This is done in two different directions: on the one hand, we aim to improve the efficacy of ride sharing systems, creating a larger number of ride possibilities based on the passengers' destination activities; on the other hand, we propose a social media analysis method, based on machine learning, to identify the transportation demand for an event.

See at: hdl.handle.net | hdl.handle.net | CNR IRIS


21/12/2012 Unknown
Combining Peer-to-Peer and Cloud Computing for Large Scale On-line Games
Carlini E
This thesis investigates the combination of Peer-to-Peer (P2P) and Cloud Computing to support Massively Multiplayer Online Games (MMOGs). MMOGs are large-scale distributed applications where a large number of users concurrently share a real-time virtual environment. Commercial MMOG infrastructures are sized to support peak loads, incurring high economic costs. Cloud Computing represents an attractive solution, as it relieves MMOG operators of the burden of buying and maintaining hardware, while offering the illusion of infinite machines. However, it requires balancing the tradeoff between resource provisioning and operational costs. P2P-based solutions present several advantages, including inherent scalability, self-repair, and natural load distribution capabilities. They require additional mechanisms to suit the requirements of an MMOG, such as backup solutions to cope with peer unreliability and heterogeneity. We propose mechanisms that integrate P2P and Cloud Computing, combining their advantages. Our techniques allow operators to select the ideal tradeoff between performance and economic costs. Using realistic workloads, we show that hybrid infrastructures can reduce the economic effort of the operator, while offering a level of service comparable with centralized architectures.

See at: hdl.handle.net | hdl.handle.net | CNR IRIS


21/07/2021 Unknown
The GDPR compliance through access control systems
Daoudagh S
The GDPR is changing how Personal Data should be processed. It states, in Art. 5.1(f), that "[data] should be processed in a manner that ensures appropriate security of the personal data [...], using appropriate technical or organizational measures (integrity and confidentiality)". We identify Access Control (AC) systems as such a measure. Indeed, AC is the mechanism used to restrict access to data or systems according to Access Control Policies (ACPs), i.e., a set of rules that specify who has access to which resources and under which circumstances. In our view, the ACPs, when suitably enriched with attributes, elements and rules extracted from the GDPR provisions, can suitably specify the regulation, and AC systems can assure by-design lawful compliance with the privacy-preserving rules. Vulnerabilities, threats, inaccuracies and misinterpretations that occur during the process of ACP specification and AC system implementation may have serious consequences for the security of personal data (security perspective) and for the lawfulness of the data processing (legal perspective). To mitigate these risks, this thesis provides a systematic process for automatically deriving, testing and enforcing ACPs and AC systems in line with the GDPR. Its data-protection-by-design solution promotes the adoption of AC systems ruled by policies systematically designed to express the GDPR's provisions. Specifically, the main contributions of this thesis are: (1) the definition of an Access Control Development Life Cycle for analyzing, designing, implementing and testing AC mechanisms (systems and policies) able to guarantee compliance with the GDPR; (2) the realization of a reference architecture allowing the automatic application of the proposed Life Cycle; and (3) the use of the thesis proposal within five application examples highlighting the flexibility and feasibility of the proposal.
Project(s): Being safe around collaborative and versatile robots in shared spaces, Cyber Security Network of Competence Centres for Europe, Building Trust in Ecosystems and Ecosystem Components
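The following minimal Python sketch illustrates, in an attribute-based style, what an access control rule enriched with GDPR-inspired attributes (purpose of processing, consent of the data subject) might look like; the attributes, policy and helper names are illustrative and do not reproduce the thesis' reference architecture.

```python
# Illustrative ABAC-style check: a rule matches on role, action, resource and
# purpose, and additionally requires the data subject's consent for that purpose.
POLICY = [
    {"requester_role": "physician", "action": "read", "resource": "health_record",
     "purpose": "treatment", "requires_consent": True},
]

def is_permitted(request, consents):
    for rule in POLICY:
        if (request["requester_role"] == rule["requester_role"]
                and request["action"] == rule["action"]
                and request["resource"] == rule["resource"]
                and request["purpose"] == rule["purpose"]
                and (not rule["requires_consent"]
                     or consents.get((request["data_subject"], request["purpose"]), False))):
            return "Permit"
    return "Deny"   # deny by default, in line with data protection by design

consents = {("patient42", "treatment"): True}
request = {"requester_role": "physician", "action": "read", "resource": "health_record",
           "purpose": "treatment", "data_subject": "patient42"}
print(is_permitted(request, consents))   # -> Permit
```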

See at: hdl.handle.net | etd.adm.unipi.it | hdl.handle.net | CNR IRIS


21/06/2011 Unknown
Multiresolution Techniques for Real-Time Visualization of Urban Environments and Terrains
Di Benedetto M
In recent times we are witnessing a steep increase in the availability of data coming from real-life environments. Nowadays, virtually everyone connected to the Internet may have instant access to a tremendous amount of data coming from satellite elevation maps, airborne time-of-flight scanners, digital cameras, street-level photographs and even cadastral maps. As for other, more traditional types of media such as pictures and videos, users of digital exploration software expect commodity hardware to exhibit good performance for interactive purposes, regardless of the dataset size. In this thesis we propose novel solutions to the problem of rendering large terrain and urban models on commodity platforms, both for local and remote exploration. Our solutions build on the concept of multiresolution representation, where alternative representations of the same data with different accuracy are used to selectively distribute the computational power, and consequently the visual accuracy, where it is most needed on the basis of the user's point of view. In particular, we will introduce an efficient multiresolution data compression technique for planar and spherical surfaces applied to terrain datasets, which is able to handle huge amounts of information at a planetary scale. We will also describe a novel data structure for compact storage and rendering of urban entities such as buildings to allow real-time exploration of cityscapes from a remote online repository. Moreover, we will show how recent technologies can be exploited to transparently integrate virtual exploration and general computer graphics techniques with web applications.
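A compact sketch of the multiresolution principle the thesis builds on (not its actual algorithms): for each terrain patch, pick the coarsest level of detail whose projected screen-space error stays below a pixel tolerance for the current viewpoint. The error values and camera parameters below are illustrative.

```python
# View-dependent level-of-detail selection: spend precision only where the
# viewer can notice it, by bounding the projected geometric error in pixels.
import math

def screen_space_error(geometric_error_m, distance_m,
                       screen_height_px=1080, fov_vertical_rad=1.0):
    return geometric_error_m * screen_height_px / (
        2.0 * distance_m * math.tan(fov_vertical_rad / 2.0))

def choose_lod(patch_errors_m, distance_m, tolerance_px=1.0):
    """patch_errors_m[i] = geometric error of LOD i, coarsest first."""
    for lod, err in enumerate(patch_errors_m):
        if screen_space_error(err, distance_m) <= tolerance_px:
            return lod                        # coarsest acceptable level
    return len(patch_errors_m) - 1            # fall back to the finest level

errors = [50.0, 10.0, 2.0, 0.4]               # per-LOD geometric errors, in meters
for d in (200.0, 2000.0, 20000.0):            # viewer distance to the patch
    print(d, "m -> LOD", choose_lod(errors, d))
```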

See at: hdl.handle.net | etd.adm.unipi.it | hdl.handle.net | CNR IRIS


21/04/2017 Unknown
Personal Data Analytics: Capturing Human Behavior to Improve Self-Awareness and Personal Services through Individual and Collective Knowledge
Guidotti R
In the era of Big Data, every single user of our hyper-connected world leaves behind a myriad of digital breadcrumbs while performing her daily activities. It is sufficient to think of a simple smartphone, which enables each one of us to browse the Web, listen to music on online music services, post messages on social networks, perform online shopping sessions, acquire images and videos, and record our geographical locations. This enormous amount of personal data could be exploited to improve the lifestyle of each individual by extracting, analyzing and exploiting the user's behavioral patterns, like the items frequently purchased, the routine movements, the favorite sequences of songs listened to, etc. However, even though some user-centric models for data management, named Personal Data Stores, are emerging, there is currently still a significant lack of algorithms and models specifically designed to extract and capture knowledge from personal data. This thesis proposes an extension of the idea of the Personal Data Store through Personal Data Analytics. In practice, we describe parameter-free algorithms that do not need to be tuned by experts and are able to automatically extract patterns from the user's data. We define personal data models that characterize the user profile and are able to capture and collect the user's behavioral patterns. In addition, we propose individual and collective services exploiting the knowledge extracted with the Personal Data Analytics algorithms and models. The services are provided to users organized in a Personal Data Ecosystem, in the form of a distributed peer network, who agree to share part of their own patterns in return for the services provided. We show how sharing with the collectivity enables or improves the services analyzed. Sharing enhances the level of service for individuals, for example by providing the user with an invaluable opportunity to gain a better perception of her self-awareness. Moreover, at the same time, knowledge sharing can lead to forms of collective gain, like the reduction of the number of circulating cars. To prove the feasibility of Personal Data Analytics in terms of the algorithms, models and services proposed, we report an extensive experimentation on real-world data.
Project(s): Personal Transport Advisor: an integrated platform of mobility patterns for Smart Cities to enable demand-adaptive transportation systems, Bringing CItizens, Models and Data together in Participatory, Interactive SociaL EXploratories, SoBigData Research Infrastructure

See at: hdl.handle.net | hdl.handle.net | CNR IRIS


21/04/2017 Unknown
Improving the Efficiency and Effectiveness of Document Understanding in Web Search
Trani S
Web Search Engines (WSEs) are probably the most complex information systems nowadays, since they need to handle an ever-increasing amount of web pages and match them with the information needs expressed in short and often ambiguous queries by a multitude of heterogeneous users. In addressing this challenging task they have to deal, at an unprecedented scale, with two classic and contrasting IR problems: the satisfaction of effectiveness requirements and of efficiency constraints. While the former refers to the user-perceived quality of query results, the latter regards the time spent by the system in retrieving and presenting them to the user. Due to the importance of text data on the Web, natural language understanding techniques have acquired popularity in recent years and are profitably exploited by WSEs to overcome ambiguities in natural language queries, given for example by polysemy and synonymy. A promising approach in this direction is represented by the so-called Web of Data, a paradigm shift which originates from the Semantic Web and promotes the enrichment of Web documents with the semantic concepts they refer to. Enriching unstructured text with an entity-based representation of documents - where entities can precisely identify persons, companies, locations, etc. - in fact allows a remarkable improvement of retrieval effectiveness to be achieved. In this thesis, we argue that it is possible to improve both the efficiency and the effectiveness of document understanding in Web search by exploiting learning to rank, i.e., a supervised technique aimed at learning effective ranking functions from training data. Indeed, on the one hand, enriching documents with machine-learnt semantic annotations leads to an improvement of WSE effectiveness, since the retrieval of relevant documents can exploit a finer comprehension of the documents. On the other hand, by enhancing the efficiency of learning-to-rank techniques we can improve both WSE efficiency and effectiveness, since a faster ranking technique can reduce query processing time or, alternatively, allow a more complex and accurate ranking model to be deployed. The contributions of this thesis are manifold: (i) we discuss a novel machine-learnt measure for estimating the relatedness among entities mentioned in a document, thus enhancing the accuracy of text disambiguation techniques for document understanding; (ii) we propose a novel machine-learnt technique to label the mentioned entities according to a notion of saliency, where the most salient entities are those that have the highest utility in understanding the topics discussed; (iii) we enhance state-of-the-art ensemble-based ranking models by means of a general learning-to-rank framework that is able to iteratively prune the less useful part of the ensemble and re-weight the remaining part according to the loss function adopted. Finally, we share with the research community working in this area several open source tools to promote collaborative development and favor the reproducibility of research results.
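As a schematic illustration of contribution (iii) - pruning an additive ranking ensemble and re-weighting the remainder - the following Python sketch greedily drops the tree whose removal hurts a squared loss the least and refits the remaining weights by least squares; it uses synthetic data and is not the thesis' actual framework or loss function.

```python
# Greedy ensemble pruning with re-weighting on synthetic data.
import numpy as np

def mse(scores, labels):
    return float(np.mean((scores - labels) ** 2))

def prune_ensemble(tree_outputs, weights, labels, keep):
    """tree_outputs: (n_trees, n_docs) matrix of per-tree scores."""
    weights = np.asarray(weights, dtype=float).copy()
    active = list(range(len(weights)))
    while len(active) > keep:
        # drop the tree whose removal increases the loss the least
        losses = []
        for t in active:
            rest = [i for i in active if i != t]
            losses.append((mse(weights[rest] @ tree_outputs[rest], labels), t))
        _, worst = min(losses)
        active.remove(worst)
        # re-weight the remaining trees with a simple least-squares refit
        w, *_ = np.linalg.lstsq(tree_outputs[active].T, labels, rcond=None)
        weights[active] = w
    return active, weights[active]

rng = np.random.default_rng(0)
outputs = rng.normal(size=(6, 100))   # 6 trees scoring 100 documents
labels = outputs[:4].sum(axis=0)      # only the first 4 trees matter here
kept, new_weights = prune_ensemble(outputs, np.ones(6), labels, keep=4)
print(kept, np.round(new_weights, 2))
```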

See at: hdl.handle.net | etd.adm.unipi.it | hdl.handle.net | CNR IRIS


21/04/2017 Unknown
Enhancing digital fabrication with advanced modeling techniques
Malomo L
A few years ago there was only expensive machinery dedicated to rapid prototyping for professionals or industrial applications, while nowadays very affordable solutions are on the market and have become useful tools for experimenting, providing access to end users. Given the digital nature of these machine-controlled manufacturing processes, a clear need exists for computational tools that support this new way of thinking about production. For this reason, the ultimate target of this research is to improve the ease of use of such technologies, providing novel supporting tools and methods to ultimately sustain the concept of democratized design ("fabrication for the masses"). In this thesis we present a novel set of methods to enable, with the available manufacturing devices, new cost-effective and powerful ways of producing objects. The contributions of the thesis are three. The first one is a technique that automatically creates a tangible illustrative representation of a 3D model by interlocking together a set of planar pieces. Given an input 3D model, this technique produces the design of flat planar pieces that can be fabricated using a 2D laser cutter, using very cheap material (e.g., cardboard, acrylic, etc.). The produced pieces can then be manually assembled using automatically generated instructions. The second method allows the automatic design of flexible reusable molds, which can be used to produce many copies of an input digital object. The designs produced by this method can be directly sent to a 3D printer and used to liquid-cast multiple replicas using a wide variety of materials. The last technique is a method to fabricate, using a single-material 3D printer, objects with custom elasticity. The basic idea is to create a set of microstructures that can be 3D-printed and used to replicate desired mechanical properties (Young's modulus and Poisson's ratio). Such microstructures can be distributed inside voxelized objects to vary their mechanical behavior. We also designed an optimization strategy that, by varying the elastic properties inside the object volume, is able to design printable objects with a prescribed mechanical behavior, i.e., they exhibit a target deformation given some input forces.

See at: hdl.handle.net | etd.adm.unipi.it | hdl.handle.net | CNR IRIS


21/02/2023 Unknown
Computational design and fabrication of tileable patterns: from geometry to mechanical properties
Manolas I
With the increasing availability of CNC machines and 3D printers, the fabrication of physical artifacts and their visual appearance have become trending research topics in the Computer Graphics community. In recent years, several workflows have been developed to streamline the Digital Fabrication process, overcome material, size, and geometric limitations, and speed up the reproduction and prototyping phase. In addition to high-resolution reproductions, new approaches, which instead realize objects in an artistic manner, have attracted attention. It quickly became apparent that these techniques could also be directed towards the production of objects that look and perform in a desired way, e.g. when subject to a particular external or internal stimulus. In this context, a common theme is the design of ornamental patterns and their use as structural building blocks of complex pattern assemblies, bringing into the spotlight the interplay between aesthetics and mechanical properties. In this thesis, we investigate and propose a novel pipeline for designing and efficiently simulating complex pattern tessellations. The thesis presents three main contributions. The first one targets the scarcity of open and efficient simulation tools by proposing a computational tool for predicting the static equilibrium of general bending-active structures, accompanied by an efficient open-source implementation. Our second contribution is a novel approach for generating a wide range of flat patterns with favorable fabrication-related properties. The third is a computational method for calibrating a reduced mechanical model for each generated pattern, enabling the interactive simulation of complex pattern assemblies.
Project(s): Advanced Visual and Geometric Computing for 3D Capture, Display, and Fabrication

See at: hdl.handle.net | etd.adm.unipi.it | hdl.handle.net | CNR IRIS


21/01/2017 Unknown
Data Flow Quality Monitoring in Data Infrastructures
Mannocci A
In the last decade, a lot of attention worldwide has been brought by researchers, organizations, and funders to the realization of Data Infrastructures (DIs), namely systems supporting researchers with the broad spectrum of resources they need to perform science. DIs are here intended as ICT (eco)systems offering data and processing components which can be combined into data flows so as to enable arbitrarily complex data manipulation actions serving the consumption needs of DI customers, be they humans or machines. Data resulting from the execution of data flows represent an important asset both for the DI users, typically craving the information they need, and for the organization (or community) operating the DI, whose existence and cost sustainability depend on the adoption and usefulness of the DI. On the other hand, when operating several data processing flows over time, several issues, well known to practitioners, may arise and compromise the behaviour of the DI, and therefore undermine its reliability and generate stakeholder dissatisfaction. Such issues span a plethora of causes, such as (i) the lack of any kind of guarantees (e.g. quality, stability, findability, etc.) from integrated external data sources, typically not under the jurisdiction of the DI; (ii) the occurrence at any abstraction level of subtle, unexpected errors in the data flows; and (iii) the ever-changing nature of the DI, in terms of data flow composition and the algorithms/configurations in use. The autonomy of DI components, their use across several data flows, and the evolution of end-user requirements over time make DI data flows a critical environment, subject to the most subtle inconsistencies. Accordingly, DI users demand guarantees, and quality managers are called to provide them, on the "correctness" of the DI data flows' behaviour over time, to be somehow quantified in terms of "data quality" and "processing quality". Monitoring the quality of data flows is therefore a key activity of paramount importance to ensure the uptake and long-term existence of a DI. Indeed, monitoring can detect or anticipate misbehaviours of a DI's data flows, in order to prevent and adjust the errors, or at least "formally" justify to the stakeholders the underlying reasons, possibly not due to the DI, of such errors. Monitoring can also be vital for DI operation, as having hardware and software resources actively employed in processing low-quality data can yield inefficient resource allocation and waste of time. However, data flow quality monitoring is further hindered by the "hybrid" nature of such infrastructures, which typically consist of a patchwork of individual components ("system of systems"), possibly developed by distinct stakeholders with possibly distinct life-cycles, evolving over time, whose interactions are regulated mainly by shared policies agreed at infrastructural level.
Due to such heterogeneity, DIs are generally not equipped with built-in monitoring systems in this sense, and to date DI quality managers are therefore bound to use combinations of existing tools - with non-trivial integration efforts - or to develop and integrate ex-post their own ad-hoc solutions, at a high cost of realization and maintenance. In this thesis, we introduce MoniQ, a general-purpose Data Flow Quality Monitoring system enabling the monitoring of critical data flow components, which are routinely checked during and after every run of the data flow against a set of user-defined quality control rules to make sure the data flow meets the expected behaviour and quality criteria over time, as established upfront by the quality manager. MoniQ introduces a monitoring description language capable of (i) describing the semantics and the time ordering of the observational intents and capturing the essence of the DI data flows to be monitored; and (ii) describing monitoring intents over the monitoring flows in terms of metrics to be extracted and controls to be ensured. The novelty of the language is that it incorporates the essence of existing data quality monitoring approaches, identifies and captures process monitoring scenarios, and, above all, provides abstractions to represent monitoring scenarios that combine data and process quality monitoring within the scope of a data flow. The study is supported by an extensive analysis of two real-world use cases used as support and validation of the proposed approach, and discusses an implementation of MoniQ providing quality managers with high-level tools to integrate the solution into a DI in an easy, technology-transparent and cost-efficient way, in order to start getting insights out of data flows by visualizing the trends of the metrics defined and the outcome of the controls declared against them.
Project(s): Open Access Infrastructure for Research in Europe
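A minimal, purely illustrative Python sketch of the two ingredients described above - metrics extracted from a data flow run and user-defined controls checked against them - follows; the metric names, rules and run format are hypothetical and do not reflect MoniQ's actual monitoring description language.

```python
# Illustrative only: metrics are functions of a data flow run, controls are
# predicates over metric values, and monitoring evaluates both after each run.
metrics = {
    "records_ingested": lambda run: len(run["records"]),
    "null_title_ratio": lambda run: sum(1 for r in run["records"] if not r.get("title"))
                                     / max(len(run["records"]), 1),
}

controls = [
    ("records_ingested", lambda v: v > 0,     "the flow must ingest at least one record"),
    ("null_title_ratio", lambda v: v <= 0.05, "at most 5% of records may lack a title"),
]

def monitor(run):
    values = {name: fn(run) for name, fn in metrics.items()}
    return [(name, values[name],
             "OK" if check(values[name]) else f"VIOLATION: {msg}")
            for name, check, msg in controls]

run = {"records": [{"title": "a"}, {"title": ""}, {"title": "c"}]}
for row in monitor(run):
    print(row)
```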

See at: hdl.handle.net | etd.adm.unipi.it | hdl.handle.net | CNR IRIS