Overall Satisfaction with Hadoop
- Used for massive data collection, storage, and analytics
- Used for MapReduce processes, Hive tables, Spark job input, and for backing up data
- Storing retail catalog and session data to enable an omnichannel experience for customers and 360-degree customer insight
- Providing a consistent data store that can be integrated with other platforms and serve as a single source of truth
Pros
- HDFS is reliable and solid; in my experience, there are very few problems using it
- Enterprise support from different vendors makes it easier to 'sell' inside an enterprise
- It provides high scalability and redundancy
- Horizontal scaling and distributed architecture
Cons
- Limited organizational support. Bugs need to be fixed, and outside help takes a long time to push updates
- Not for small data sets
- Data security needs to be ramped up
- The NameNode has no replication, so recovering from a NameNode failure takes a lot of time
- Community focus is divided across too many Hadoop projects, which slows down some bug fixes
- Requires a mindset change among business partners
- Adopting Hadoop/MapReduce has a learning curve
- For real-time streaming, use Spark; it works in stark contrast to the way MapReduce does
- Use Hive for querying purposes
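
To give a sense of the learning curve mentioned above, here is a minimal sketch of the MapReduce programming model in plain Python. It does not use Hadoop's API; the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are hypothetical and only illustrate the map, shuffle, and reduce steps on a word count, all running in a single process.

```python
from collections import defaultdict


def map_phase(line):
    # Map step: emit a (word, 1) pair for every word in one input line.
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    # Shuffle step: group values by key, which a MapReduce framework
    # does between the map and reduce stages.
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    # Reduce step: sum the counts collected for a single word.
    return key, sum(values)


def word_count(lines):
    # Illustrative driver (an assumption, not Hadoop code): runs the
    # three steps in memory on a list of strings.
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


if __name__ == "__main__":
    print(word_count(["big data", "Big cluster"]))
```

Thinking in terms of independent map and reduce functions, rather than a single loop over the data, is the main adjustment; a real job would distribute these steps across the cluster.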