Research Topics of the Data Engineering Team

Integration, Scalability and Deployment

The IPaD operation (a French acronym for Integration, Scaling and Deployment) covers the following objectives:

  • the collection, modeling and analysis of needs, which can be strongly heterogeneous because they come from autonomous partners, as is the case in large enterprises;
  • the integration of heterogeneous data coming from dynamic and scalable sources (the Big Data case), which requires approaches for semantically resolving the heterogeneity that may exist between sources;
  • the proposal of ETL (Extraction, Transformation, Loading) techniques that exploit the presence of one or more domain ontologies;
  • the choice of the data storage architecture (mediator, materialized storage, or hybrid);
  • the choice of the data storage system (a traditional, semantic or NoSQL database management system);
  • the choice of the deployment platform;
  • the proposal of optimization structures adapted to the final system.

What distinguishes our work from existing work is that we offer a simulator covering all of these phases. It takes into account constraints related to several parameters: the data models used, storage models, deployment platforms, optimization structures, exploitation methods (data search, recommendation, etc.) and energy constraints. This lets designers test their approaches before deployment, which is a considerable contribution to the production of database management systems and high-performance models.

This work has been widely applied and validated in many fields, such as:

  • the aeronautics industry,
  • the petroleum industry,
  • the automotive industry,
  • estate and archives management.

Personalization, Intelligence and Cooperativeness

The PIC operation (Personalization, Intelligence and Cooperativeness) focuses on the problem of accessing and exploiting data. The data can be precise or incomplete, and can include instances of ontological/semantic data, Web services, process models, etc. The activities of this operation are centered on the following topics.

  • Personalization: a first aspect is to investigate advanced models for representing and handling sophisticated user preferences; a second aspect deals with modeling and handling the user's context and profile to best fit their needs and desires.
  • Intelligence: modern database systems should exhibit intelligent behavior to cope with the sheer volume of data and the complexity of user needs. We study appropriate inference mechanisms to handle, for instance, inferred predicates/constructs, and we leverage knowledge discovery and machine learning techniques to answer database queries intelligently.
  • Cooperativeness: the aim is to develop novel approaches to cooperative query answering. We study query relaxation techniques to overcome the common problem of empty or unsatisfactory answers. We are also interested in other kinds of cooperative responses, such as intensional and summarized answers (when a query returns too many results).
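As an illustration of the query relaxation idea mentioned in the list above, here is a minimal Python sketch. It is not the team's method: the function name relaxed_range_query, the fixed widening step and the hotel data are all assumptions made for the example. It shows one simple strategy, widening a numeric range predicate step by step until the answer set is no longer empty.

```python
from typing import Any


def relaxed_range_query(
    rows: list[dict[str, Any]],
    attr: str,
    low: float,
    high: float,
    step: float,
    max_rounds: int = 5,
) -> tuple[list[dict[str, Any]], int]:
    """Widen [low, high] by `step` on each side until some row matches.

    Returns the matching rows and the number of relaxation rounds used
    (0 means the original query already had answers).
    """
    for round_no in range(max_rounds + 1):
        lo = low - round_no * step
        hi = high + round_no * step
        matches = [row for row in rows if lo <= row[attr] <= hi]
        if matches:
            return matches, round_no
    # Still empty after the allowed number of relaxations.
    return [], max_rounds


# Hypothetical data: the user asks for hotels priced between 50 and 80.
hotels = [
    {"name": "A", "price": 120},
    {"name": "B", "price": 95},
    {"name": "C", "price": 210},
]
answers, rounds = relaxed_range_query(hotels, "price", 50, 80, step=10)
```

Returning the number of rounds lets the system tell the user how far the original query had to be relaxed, which is part of what makes the answer cooperative rather than silently different.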

Moreover, particular attention is paid to the scalability of the approaches developed. The theoretical tools used in this theme stem mainly from fuzzy logic theory and domain ontologies.
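To give a concrete idea of how fuzzy logic can express a user preference, the sketch below models the vague term "cheap" with a trapezoidal membership function and ranks items by their degree of satisfaction. The function names, the breakpoints (80 and 120) and the data are assumptions chosen for illustration only; they do not come from the team's work.

```python
def trapezoid(x: float, a: float, b: float, c: float, d: float) -> float:
    """Membership degree in [0, 1] for a trapezoidal fuzzy set, a <= b <= c <= d."""
    if x < a or x > d:
        return 0.0
    if b <= x <= c:
        return 1.0
    if x < b:
        return (x - a) / (b - a)  # rising edge
    return (d - x) / (d - c)      # falling edge


def cheap(price: float) -> float:
    """Assumed preference: fully cheap up to 80, not cheap at all from 120."""
    return trapezoid(price, 0, 0, 80, 120)


def rank_by_preference(items: list[dict]) -> list[tuple[str, float]]:
    """Keep items that satisfy the preference to some degree, best first."""
    scored = [(item["name"], cheap(item["price"])) for item in items]
    return sorted((s for s in scored if s[1] > 0), key=lambda s: s[1], reverse=True)


# Hypothetical data.
offers = [
    {"name": "A", "price": 120},
    {"name": "B", "price": 95},
    {"name": "C", "price": 60},
]
ranking = rank_by_preference(offers)
```

Unlike a crisp predicate such as price <= 80, the fuzzy version still returns offer B with a partial degree of satisfaction instead of rejecting it outright.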

Human-Computer Interaction

The HCI operation (Human-Computer Interaction) provides approaches and tools for validating and checking the quality of models, from requirements-collection models to interface models, through testing and experimentation. Within the theme, this activity focuses on the use of task models in the design of interactive applications. In close collaboration with Dominique Scapin (INRIA), a new formalism, K-MAD (an evolution of the original MAD formalism), has been defined.

Future work lies in human-machine interaction, and more specifically in task models. On the one hand, it aims to complete the definition of the K-MAD model; on the other hand, it aims to develop tools that exploit the model-driven approach. Regarding K-MAD, and in addition to the work done during the last four-year period, the goal is to take into account new aspects of interaction (for example, plastic systems and multimodality) and to explore the use of task models during the expression of needs (requirements analysis, workflow) and prototyping, as well as in the implementation of interface tests. As for model-driven engineering, the goal is to deepen the co-design process of task and dialog models, in particular in the area of supervision, and especially for new human-machine interfaces (post-WIMP interfaces).