
    BIG DATA AND HADOOP

    Why Big Data? And what’s so big about it?

    At first glance, you may guess the definition from the term itself, and then think, “Hey, it definitely isn’t what it sounds like”.

    Actually, it is exactly what it sounds like.

    Let’s start with a quick reminder of what “data” is:
    Data is synonymous with information. In computer science, data is a representation of information, and it can take many forms and structures (e.g. tables, trees, graphs).

    So what exactly is Big Data?

    As we said, it is exactly what it sounds like. Big Data is a collection of data sets so large and complex that traditional data processing tools, such as database management systems and file systems, can no longer cope with them.

    How big is it, you might ask?

    Well, let’s say your hard drive holds 1 terabyte (1,000 gigabytes). If you’re like most people and use your PC just for browsing the internet, enjoying multimedia, or playing video games, you’ll be quite satisfied with how much free space you have, and you will probably never need to delete anything from it.

    Big Data, by contrast, can reach the scale of exabytes, where 1 exabyte is 1 billion gigabytes.
    As of 2012, about 2.5 exabytes of data were being created every day, a volume that even the most advanced information management systems weren’t designed to handle.
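
    To put those numbers side by side with the 1 terabyte hard drive from earlier, here is a small arithmetic sketch in Python. It assumes decimal units (1 terabyte = 1,000 gigabytes, as used above), and the function name drives_needed is just an illustrative choice.

```python
# Decimal (SI) units, matching "1 terabyte = 1,000 gigabytes" in the text.
GB_PER_TB = 1_000
GB_PER_EB = 1_000_000_000  # 1 exabyte = 1 billion gigabytes


def drives_needed(exabytes: float, drive_size_tb: float = 1.0) -> float:
    """Return how many drives of the given size would hold `exabytes` of data."""
    total_gb = exabytes * GB_PER_EB
    return total_gb / (drive_size_tb * GB_PER_TB)


daily_exabytes = 2.5  # the 2012 figure quoted in the text
print(f"{drives_needed(daily_exabytes):,.0f} one-terabyte drives per day")
# prints: 2,500,000 one-terabyte drives per day
```

    In other words, under these assumptions a single day of newly created data would fill about 2.5 million of those hard drives.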

    How to handle Big Data?

    In 2004, Google published a paper describing a programming model called MapReduce. It is a parallel processing model built on a cluster of nodes: queries are split into pieces, distributed to the nodes and processed there (Map). The results are then gathered, combined and delivered (Reduce).
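
    The Map and Reduce steps can be sketched with the classic word count example. This is a minimal single-machine illustration in Python, not Google's or Hadoop's actual API: the function names (map_phase, shuffle, reduce_phase, word_count) are hypothetical, and each string in chunks stands in for a piece of input that a real cluster would hand to a separate node.

```python
from collections import defaultdict
from typing import Iterable


def map_phase(chunk: str) -> list[tuple[str, int]]:
    """Map: turn one chunk of text into (word, 1) pairs."""
    return [(word.lower(), 1) for word in chunk.split()]


def shuffle(pairs: Iterable[tuple[str, int]]) -> dict[str, list[int]]:
    """Group all values that share the same key (word)."""
    grouped: dict[str, list[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key: str, values: list[int]) -> tuple[str, int]:
    """Reduce: combine the values for one key into a single result."""
    return key, sum(values)


def word_count(chunks: list[str]) -> dict[str, int]:
    # In a real cluster each chunk would be mapped on a different node in
    # parallel; here the chunks are simply processed one after another.
    mapped = [pair for chunk in chunks for pair in map_phase(chunk)]
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


chunks = ["big data is big", "data is everywhere"]
print(word_count(chunks))
# prints: {'big': 2, 'data': 2, 'is': 2, 'everywhere': 1}
```

    The point of the design is that map_phase only ever looks at its own chunk, so the chunks can be processed independently on as many machines as are available before the results are gathered.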

    Popular Open Source Tools for Big Data

    Following the success of the MapReduce architecture, an open source implementation of it was adopted by the Apache project Hadoop.

    1- Big Data Analysis Platforms and Tools
        Hadoop
        MapReduce
        GridGain
        Storm
    2- Databases/Data Warehouses
        Cassandra
        HBase
        MongoDB
        CouchDB
        Redis
    3- Business Intelligence
        Talend
        Jaspersoft
    4- Data Mining
        RapidMiner/RapidAnalytics
        Mahout
        Orange
    5- Big Data Search
        Lucene
        Solr
