An update on Apache Hadoop 1.0
Published: 2021-09-11 14:30:19 | Category: Technical articles

Some users & customers have asked about the most recent release of Apache Hadoop, v1.0: what’s in it, what it followed and what it preceded. To explain this, we should start with some basics of how Apache projects release software:

By and large, new features in Apache projects are developed on a main codeline known as “trunk.” Occasionally very large features are developed on their own branches with the expectation that they’ll later merge into trunk. New features usually land in trunk before they reach a release, so trunk itself carries little expectation of quality or stability. Periodically, candidate releases are branched from trunk. Once a candidate release is branched, it usually stops getting new features. Bugs are fixed and, after a vote, a release is declared for that particular branch. Any member of the community can create a branch for a release and name it whatever they like.

This diagram illustrates the history of the various Apache Hadoop releases and their origins. There are three occasions where community releases from the Apache Hadoop project broke with the more traditional release & branch convention. These occasions are the usual source of confusion for users.

  1. For more than a year after Apache Hadoop 0.20 branched, significant feature development continued on that branch rather than on trunk. Two major features were added to branches off 0.20.2. One was authentication, enabling strong security for core Hadoop. The other was append, enabling users to run Apache HBase without risk of data loss. The security branch was later released as 0.20.203. These branches and their subsequent releases have been the largest source of confusion for users: since that time, releases off of the 0.20 branches have had features that releases off of trunk did not have, and vice versa.
  2. Apache Hadoop 0.22 was released chronologically after Apache Hadoop 0.23. Apache Hadoop 0.23 is a strict superset of the features in 0.22, yet it was released a month before 0.22.
  3. A few weeks after 0.23 was released, the 0.20 branch release formerly known as 0.20.205 was renumbered 1.0. There is next to no functional difference between 0.20.205 and 1.0. This is just a renumbering.

Because of issue #1, there has been an 18-month period in which no single Apache release had all the committed features of Apache Hadoop. This table illustrates the point:

As members of the Apache Hadoop community, Cloudera engineers have focused their efforts on getting back to releases that are a strict superset of all the features of any past release, so that users do not have to make the unpleasant choice of picking one feature set over another. The good news is that, apart from the confusion over the 1.0 numbering, we are basically there. There have been two good recent releases off of trunk (0.22 and 0.23), one of which (0.23) does have all of the features of any past release. It’s very possible these new releases will be renumbered to 2.0 or 3.0 or some other number to indicate they are functional supersets of 1.0, but this remains to be decided.
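To make the "superset" idea concrete, here is a minimal Python sketch that treats each release as a set of features. It is only an illustration: "authentication" and "append" are the two features named above, while `trunk_only_feature` is a hypothetical placeholder for work that existed only on trunk, and the variable and function names are made up for this example.

```python
# Illustrative feature sets only; these are not real release manifests.
# "authentication" and "append" are the features named in this article.
branch_0_20 = frozenset({"authentication", "append"})

# Hypothetical placeholder for features that had landed only on trunk.
earlier_trunk = frozenset({"trunk_only_feature"})

# A release like 0.23 is modeled as carrying everything from both lines.
release_0_23 = branch_0_20 | earlier_trunk


def is_superset_of_all(candidate, others):
    """Return True if candidate contains every feature of every release in others."""
    return all(candidate >= other for other in others)


past_releases = [branch_0_20, earlier_trunk]

for name, features in [
    ("0.20 branch line", branch_0_20),
    ("earlier trunk release", earlier_trunk),
    ("0.23", release_0_23),
]:
    print(name, is_superset_of_all(features, past_releases))
```

Under these assumptions, neither the 0.20 line nor the earlier trunk release contains the other's features, which is the "unpleasant choice" described above. Only the merged set passes the check.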

Many of you are CDH users, and by now you’re wondering which Apache Hadoop you are running today and which Apache Hadoop you’ll be running in the future. This diagram shows the CDH releases and the Apache Hadoop releases they draw from.

The CDH1 distribution incorporated the 0.18.3 Apache Hadoop release. The CDH2 distribution incorporated the 0.20.1 Apache Hadoop release. The CDH3 distribution incorporated the 0.20.2 Apache Hadoop release plus the features of the 0.20.append and 0.20.security branches that collectively are now known as “1.0.” The Apache Hadoop in CDH3 has been the equivalent of the recently announced Apache Hadoop 1.0 for approximately a year now. The CDH4 distribution will likely incorporate a release from the 0.23.x series.

We also do quarterly updates for CDH releases. These updates typically include backports from trunk that fix bugs or improve performance & stability, not new component releases. In some cases, when it is neither destabilizing nor compatibility breaking, a CDH update will include an incrementally newer component version. For example, CDH3U0 uses HBase 0.90.0 whereas CDH3U2 uses HBase 0.90.4.

Cloudera’s Distribution including Apache Hadoop currently incorporates and integrates 13 different open source components to create a single open source, Apache Hadoop based data management platform. Eleven of the 13 components come from Apache projects, Apache Hadoop being one of them. All of these projects have their own branch and release quirks because each project is a different collection of individuals with different motivations and preferences. This is a feature, not a bug, of the Apache community process. By creating an environment where individuals with disparate motivations can all contribute, projects attract more contributors and more innovation.

CDH has a multi-year history of annual releases, quarterly updates, clear upgrade paths and strong policies around maintaining compatibility and stability across updates. This has only been possible because the CDH engineering team includes more than 20 engineers who are committers and PMC members of the various Apache projects and who can shape the innovation of the extended community into a single coherent system. It is why we believe demonstrated leadership in open source contribution is the only way to harness the open innovation of the Apache Hadoop ecosystem.

The most current GA release of CDH is CDH3, update 2.

Reposted from: https://my.oschina.net/unclegeek/blog/41857

Repost source: https://blog.csdn.net/weixin_34413802/article/details/92428289
