Session Big Data


Schedule: November 29, 09:15am - 11:00am

Session Chairman: Cedric Thomas, OW2. 

Keynote Big data: Hadoop's Role in the Big Data Architecture

Speaker: Jim Walker, Hortonworks
Schedule: 10:00 - 10:15am
Abstract: Big Data is everywhere. We see it on television. We hear it in conversations over coffee. It is an expanding topic in the boardroom. The hype is palpable, but what is real and, better yet, how does it affect the status quo? At the center of the big data discussion is Apache Hadoop, a next-generation open source enterprise data platform that allows you to capture, process and share the enormous amounts of new, multi-structured data that doesn’t fit into traditional systems. Organizations that embrace solution architectures focused on maximizing the value from big data will put themselves in a position to improve customer engagement, enhance productivity, and discover new and lucrative business opportunities. In this session we will discuss:

  • The evolution of Apache Hadoop and Hadoop’s role within enterprise data architectures
  • The relationship between Apache Hadoop and existing data infrastructures such as the enterprise data warehouse
  • Use cases and best practices on how to incorporate Apache Hadoop into your big data strategy
  • The Future of Apache Hadoop

SpagoBI and Big Data: next Open Source Information Management suite

Speaker: Monica Franceschini, Engineering
Schedule: Thursday Nov 29, 10:15 - 10:30am
Abstract: Organizations adopt Business Intelligence tools to analyze tons of data; nonetheless, several business leaders do not have the information they actually need at their disposal. This happens because the information management scenario is evolving. New kinds of content, including information coming from social computing, are being added to the structured information that is supported by well-known processes, tools and practices. This new content will be managed by disparate processes, fragmented tools and new practices, and it will combine with the various contents of enterprise systems: documents, transactional data, databases and data warehouses, images, audio, texts, videos. This huge amount of content is called “big data”, even though the term is not just about a large amount of data. It refers to the capability of managing data that grow along three dimensions - volume, velocity and variety - while keeping the user interface simple. The talk describes SpagoBI's approach to the “big data” scenario and presents the SpagoBI suite roadmap, which is two-fold: it aims to address existing and emerging analytical areas and domains, providing the suite with new capabilities - including big data and open data support, in-memory analysis, real-time and mobile BI - and it follows a research path towards the realization of a new generation of the SpagoBI suite.

The Big Challenge of Big Data and Hadoop Integration

Speaker: Cédric Carbone, Talend
Schedule: Thursday Nov 29, 10:30 - 10:45am
Abstract: Enterprises can't close their doors just because integration tools won't cope with the volume of information that their systems produce. As each day goes by, their information will become larger and more complicated, and enterprises must constantly struggle to manage the integration of dozens (or hundreds) of systems. Apache Hadoop has quickly become the technology of choice for enterprises that need to perform complex analysis of petabytes of data, but few are aware of its potential to handle large-scale integration work. By using effective tools, integrators can process the complex transformation, synchronization, and orchestration tasks required in a high-performance, low cost, infinitely scalable way. In this talk, Cédric Carbone will discuss how Hadoop can be used to integrate disparate systems and services, and provide a demonstration of the process for designing and deploying common integration tasks.

Using Vanilla to manage Hadoop database

Speaker: Patrick Beaucamp, Bpm-Conseil
Schedule: Thursday Nov 29, 10:45 - 11:00am
Abstract: This presentation will demonstrate how to use Vanilla to read and write data in a Hadoop database, using big data databases such as HBase or Cassandra, along with the Hadoop-ready Solr/Lucene search engine - embedded into Vanilla - to run clustered searches on Hadoop data.

11:00 - 11:15 : Coffee Break