SegmentFault: latest articles by 来时做李白
2016-08-25T14:12:55+08:00
https://segmentfault.com/feeds/blogs
https://creativecommons.org/licenses/by-nc-nd/4.0/
Django static files
https://segmentfault.com/a/1190000006714878
2016-08-25T14:12:55+08:00
2016-08-25T14:12:55+08:00
timger
https://segmentfault.com/u/timger
0
<h4>MEDIA_ROOT</h4>
<p>The folder where every file uploaded through a FileField is stored.</p>
<h4>STATIC_ROOT</h4>
<p>The folder where all static files are stored after running <code>manage.py collectstatic</code>.</p>
<p>The absolute path to the directory where collectstatic will collect static files for deployment.</p>
<p>If the staticfiles contrib app is enabled (default) the collectstatic management command will collect static files into this directory. See the howto on managing static files for more details about usage.</p>
<h4>STATICFILES_DIRS</h4>
<p>The list of folders that Django searches for additional static files, on top of the static folder of each installed app.<br>This setting defines the additional locations the staticfiles app will traverse if the FileSystemFinder finder is enabled, e.g. if you use the collectstatic or findstatic management command or use the static file serving view.</p>
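<p>As a rough illustration of how the three settings fit together, here is a minimal sketch of a <code>settings.py</code> fragment. The project root <code>/srv/example_project</code> and the folder names <code>media</code>, <code>staticfiles</code> and <code>assets</code> are assumptions made for this example, not values from the original post.</p>

```python
import os

# Hypothetical project root; a real settings.py usually derives it from __file__.
BASE_DIR = "/srv/example_project"

# URL prefixes under which static and uploaded files are served.
STATIC_URL = "/static/"
MEDIA_URL = "/media/"

# Files uploaded through a FileField are written under MEDIA_ROOT.
MEDIA_ROOT = os.path.join(BASE_DIR, "media")

# `manage.py collectstatic` copies every static file it finds into STATIC_ROOT.
STATIC_ROOT = os.path.join(BASE_DIR, "staticfiles")

# Extra source folders, searched in addition to each app's own static/ folder.
STATICFILES_DIRS = [os.path.join(BASE_DIR, "assets")]

# STATIC_ROOT is a collection target, not a source: Django's system checks
# reject a configuration that lists it in STATICFILES_DIRS.
assert STATIC_ROOT not in STATICFILES_DIRS
```

<p>With a layout like this, <code>collectstatic</code> would gather files from <code>assets</code> and from each app's static folder into <code>staticfiles</code>, while uploads stay separate under <code>media</code>.</p>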
<h2>Collecting static files</h2>
<pre><code>python manage.py collectstatic</code></pre>
Drawing architecture diagrams with PlantUML
https://segmentfault.com/a/1190000005769998
2016-06-21T21:36:39+08:00
2016-06-21T21:36:39+08:00
timger
https://segmentfault.com/u/timger
0
<p><a href="https://link.segmentfault.com/?enc=RS%2BTMwcTcXQMulull%2BHlNg%3D%3D.JpX%2B0X6%2FXpFPMMeEJLo2aWbquJAGicRzPvRnc%2BMK9CUJPyKo5g1AOD1s1Fl6EHoZ" rel="nofollow">http://plantuml.com/sequence....</a></p>
Drawing flowcharts with IDEA + PlantUML
https://segmentfault.com/a/1190000005686115
2016-06-10T20:13:56+08:00
2016-06-10T20:13:56+08:00
timger
https://segmentfault.com/u/timger
0
<h4>About PlantUML</h4>
<ol><li><p><a href="https://link.segmentfault.com/?enc=4SWtNd03vhXL2OkfXFR%2FUA%3D%3D.zlqBIsvs%2FAKIPN3jsb6kuV4L8dZbUMcOeIXm5O7y70U%3D" rel="nofollow">Documentation</a></p></li></ol>
<h4>Install the plugin</h4>
<p><img src="/img/bVx1j6" alt="clipboard.png" title="clipboard.png"></p>
<p>Graphviz also needs to be installed:</p>
<p><img src="/img/bVx1kA" alt="clipboard.png" title="clipboard.png"></p>
<pre><code>git:(master) ✗ brew install graphviz
==> Installing dependencies for graphviz: libpng
==> Installing graphviz dependency: libpng
==> Downloading https://homebrew.bintray.com/bottles/libpng-1.6.21.yosemite.bottle.tar.gz</code></pre>
<pre><code>➜ ~ /usr/local/Cellar/graphviz/2.38.0/bin/
acyclic* circo@ dot* edgepaint* gml2gv* gv2gxl@ gvmap* gxl2dot@ neato@ patchwork@ sfdp@ unflatten*
bcomps* cluster* dot2gxl@ fdp@ graphml2gv* gvcolor* gvpack* gxl2gv* nop* prune* tred*
ccomps* dijkstra* dot_builtins* gc* gv2gml* gvgen* gvpr* mm2gv* osage@ sccmap* twopi@</code></pre>
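<p>Before opening the preview, it can help to confirm that the <code>dot</code> executable listed above is actually reachable. The following is a minimal sketch; the helper name <code>find_dot</code> is made up for this example, and it assumes PlantUML locates Graphviz either through the <code>GRAPHVIZ_DOT</code> environment variable or through <code>PATH</code>.</p>

```python
import os
import shutil


def find_dot():
    """Return the path of the Graphviz `dot` executable, or None if it cannot be found."""
    # An explicitly configured location takes priority when it points at a real file.
    configured = os.environ.get("GRAPHVIZ_DOT")
    if configured and os.path.isfile(configured):
        return configured
    # Otherwise search PATH the same way a shell would.
    return shutil.which("dot")


if __name__ == "__main__":
    dot_path = find_dot()
    if dot_path is None:
        print("Graphviz 'dot' was not found; diagrams that rely on it may not render.")
    else:
        print("Graphviz 'dot' found at", dot_path)
```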
<h4>Create a new file</h4>
<hr>
<p><img src="/img/bVx1kd" alt="clipboard.png" title="clipboard.png"></p>
<h4>Preview</h4>
<p><img src="/img/bVx1nB" alt="clipboard.png" title="clipboard.png"></p>
A handy service for tracking your coding
https://segmentfault.com/a/1190000002561220
2015-02-20T18:54:08+08:00
2015-02-20T18:54:08+08:00
timger
https://segmentfault.com/u/timger
0
<p>Sharing a service that records how much code you write each day:</p>
<p><a rel="nofollow" href="https://wakatime.com/dashboard">https://wakatime.com/dashboard</a></p>
A handy date and time library for Scala
https://segmentfault.com/a/1190000002531865
2015-02-03T13:16:05+08:00
2015-02-03T13:16:05+08:00
timger
https://segmentfault.com/u/timger
0
<p><a rel="nofollow" href="https://github.com/nscala-time/nscala-time">https://github.com/nscala-time/nscala-time</a></p>
<p><a rel="nofollow" href="http://www.joda.org/joda-time/apidocs/org/joda/time/DateTime.html">http://www.joda.org/joda-time/apidocs/org/joda/time/DateTime.html</a></p>
Scala resources (copied from Zhihu)
https://segmentfault.com/a/1190000002531245
2015-02-03T09:26:30+08:00
2015-02-03T09:26:30+08:00
timger
https://segmentfault.com/u/timger
2
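<p>I'm 宏江. I have been in touch with some of the Scala enthusiasts in East China, so I know a little about the Scala community.<br>
I'm not an expert. I used to write a lot of blog posts about Scala to talk my colleagues into working on Scala with me and to help the people around me get to know it, so many people found out about me through my blog (scala | 在路上).<br>
Here are some people in China whom I know to be quite good at Scala:<br>
1) 邓草原 is without doubt one of the top experts in China. The NetBeans compiler plugin I used when I first started learning Scala was written by him. He is strong not only in Scala but also has a deep understanding of Erlang. Just follow him on Weibo.<br><a rel="nofollow" href="http://weibo.com/dcaoyuan">http://weibo.com/dcaoyuan</a><br>
2) 杨博 works at a game company in either Guangzhou or Shenzhen. I haven't dealt with him directly, but judging from his GitHub he seems to be a very smart guy.<br><a rel="nofollow" href="http://www.zhihu.com/people/atry">http://www.zhihu.com/people/atry</a><br>
3) Some hobbyists from the Alibaba camp: 王福强 (千任 <a rel="nofollow" href="http://weibo.com/fujohnwang">http://weibo.com/fujohnwang</a>), 钟伦甫 (聚石 <a rel="nofollow" href="http://weibo.com/cafusic">http://weibo.com/cafusic</a>), 王定乾 (I forget his nickname at Alibaba; people usually call him 阿干 <a rel="nofollow" href="http://weibo.com/argan">http://weibo.com/argan</a>), 山行, and 鸣嵩.<br>
4) Some people in Shanghai: 老高 (Ping An, <a rel="nofollow" href="http://weibo.com/laogaome">http://weibo.com/laogaome</a>), 老猪 (running his own startup), 诺铁 (tw, <a rel="nofollow" href="http://weibo.com/notyy">http://weibo.com/notyy</a>), 章业铭 (formerly at Alibaba, now at 看处方, <a rel="nofollow" href="http://weibo.com/yeming">http://weibo.com/yeming</a>), and people at several other startups such as 乔布简历, whose names I don't remember.<br>
5) Some who haven't been very vocal but have hands-on experience. For example, a mobile game company in Hangzhou adopted Scala for its backend long ago, and there is also 19楼 and others; they probably have some skilled people.<br>
6) 王在祥 at 唯品会<br><a rel="nofollow" href="http://weibo.com/wangzaixiang">http://weibo.com/wangzaixiang</a><br>
7) Among the people working on Spark at Intel and other companies, there should also be some who are good at Scala; I haven't talked with them.<br>
(吴甘沙) (连城) (crazyjvm)<br>
8) Red Hat Beijing should have some, such as 姜宁, who takes part in many Apache open-source projects (including Camel).<br><a rel="nofollow" href="http://weibo.com/willemjiang">http://weibo.com/willemjiang</a><br>
9) eastsun (孙旭东), who was quite active in the Scala group on JavaEye early on <a rel="nofollow" href="http://www.zhihu.com/people/eastsu">http://www.zhihu.com/people/eastsu</a><br>
10) sunqihui at Huawei <a rel="nofollow" href="http://blog.csdn.net/sunqihui">http://blog.csdn.net/sunqihui</a><br>
11) hashjoin at Databricks (probably no longer counts as being in China) <a rel="nofollow" href="http://weibo.com/hashjoin">http://weibo.com/hashjoin</a><br>
12) Jin Mingjian, who works on scala-ide related development <a rel="nofollow" href="https://github.com/jinmingjian">https://github.com/jinmingjian</a><br><a rel="nofollow" href="http://jmj-eclipse.blogspot.com">http://jmj-eclipse.blogspot.com</a> (added by 杨博)<br>
Feel free to add people you know.</p>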
Big data papers
https://segmentfault.com/a/1190000002521293
2015-01-29T10:38:59+08:00
2015-01-29T10:38:59+08:00
timger
https://segmentfault.com/u/timger
0
<h2>Papers</h2>
<ul>
<li><a rel="nofollow">Published in 2014</a></li>
<li><a rel="nofollow">Published in 2013</a></li>
<li><a rel="nofollow">Published in 2012</a></li>
<li><a rel="nofollow">Published in 2011</a></li>
<li><a rel="nofollow">Published in 2010</a></li>
<li><a rel="nofollow">Published in 2009</a></li>
<li><a rel="nofollow">Published in 2008</a></li>
<li><a rel="nofollow">Published in 2007</a></li>
<li><a rel="nofollow">Published in 2006</a></li>
<li><a rel="nofollow">Published in 2005</a></li>
<li><a rel="nofollow">Published in 2004</a></li>
<li><a rel="nofollow">Published in 2003</a></li>
<li><a rel="nofollow">Published in 2002</a></li>
<li><a rel="nofollow">Published in 2001</a></li>
<li><a rel="nofollow">Published in 2000</a></li>
<li><a rel="nofollow">Published in 1999</a></li>
<li><a rel="nofollow">Published in 1998</a></li>
<li><a rel="nofollow">Published in 1997</a></li>
</ul>
<h3>2014</h3>
<ul>
<li>
<strong>2014</strong> - <a rel="nofollow" href="https://www.cs.cmu.edu/~om3d/papers/SIGGRAPH2014.pdf">3D Object Manipulation in a Single Photograph using Stock 3D Models</a>
</li>
<li>
<strong>2014</strong> - <a rel="nofollow" href="http://www.vldb.org/pvldb/vol7/p1617-sun.pdf?imm_mid=0c5589&cmp=em-strata-na-na-newsltr_20141022">A Partitioning Framework for Aggressive Data Skipping</a>
</li>
<li>
<strong>2014</strong> - <a rel="nofollow" href="https://www.usenix.org/system/files/conference/osdi14/osdi14-paper-ardekani.pdf">A Self-Configurable Geo-Replicated Cloud Storage System</a>
</li>
<li>
<strong>2014</strong> - <a rel="nofollow" href="http://www.bailis.org/papers/ca-vldb2015.pdf">Coordination Avoidance in Database Systems</a>
</li>
<li>
<strong>2014</strong> - <a rel="nofollow" href="http://www.cs.toronto.edu/~ranzato/publications/taigman_cvpr14.pdf">DeepFace: Closing the Gap to Human-Level Performance in Face Verification</a>
</li>
<li>
<strong>2014</strong> - <a rel="nofollow" href="http://www.vldb.org/pvldb/vol7/p1462-vemuri.pdf">Execution Primitives for Scalable Joins and Aggregations in Map Reduce</a>
</li>
<li>
<strong>2014</strong> - <a rel="nofollow" href="https://www.usenix.org/system/files/conference/osdi14/osdi14-paper-muralidhar.pdf">f4: Facebook's Warm BLOB Storage System</a>
</li>
<li>
<strong>2014</strong> - <a rel="nofollow" href="http://fastpass.mit.edu/Fastpass-SIGCOMM14-Perry.pdf">Fastpass: A Centralized "Zero-Queue" Datacenter Network</a>
</li>
<li>
<strong>2014</strong> - <a rel="nofollow" href="http://research.microsoft.com/en-us/um/redmond/projects/hyperlapse/paper/hyperlapse.pdf">First-person Hyper-lapse Videos</a>
</li>
<li>
<strong>2014</strong> - <a rel="nofollow" href="http://arxiv.org/pdf/1408.2055v1.pdf">Guess Who Rated This Movie: Identifying Users Through Subspace Clustering</a>
</li>
<li>
<strong>2014</strong> - <a rel="nofollow" href="https://www.usenix.org/system/files/conference/atc14/atc14-paper-ongaro.pdf">In Search of an Understandable Consensus Algorithm</a>
</li>
<li>
<strong>2014</strong> - <a rel="nofollow" href="https://www.usenix.org/system/files/conference/fast14/fast14-paper_rumble.pdf">Log-structured Memory for DRAM-based Storage</a>
</li>
<li>
<strong>2014</strong> - <a rel="nofollow" href="http://www.cse.buffalo.edu/tech-reports/2014-04.pdf">Logical Physical Clocks and Consistent Snapshots in Globally Distributed Databases</a>
</li>
<li>
<strong>2014</strong> - <a rel="nofollow" href="http://mapgraph.io/papers/MapGraph-SIGMOD-2014.pdf">MapGraph: A High Level API for Fast Development of High Performance Graph Analytics on GPUs</a>
</li>
<li>
<strong>2014</strong> - <a rel="nofollow" href="https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/42851.pdf">Mesa: Geo-Replicated, Near Real-Time, Scalable Data Warehousing</a>
</li>
<li>
<strong>2014</strong> - <a rel="nofollow" href="http://www.pivotal.io/sites/default/files/SIGMODMay2014HAWQAdvantages.pdf">Orca: A Modular Query Optimizer Architecture for Big Data</a>
</li>
<li>
<strong>2014</strong> - <a rel="nofollow" href="http://spatialhadoop.cs.umn.edu/publications/ICDE14_demo_763.pdf">Pigeon: A Spatial MapReduce Language</a>
</li>
<li>
<strong>2014</strong> - <a rel="nofollow" href="http://www.cv-foundation.org/openaccess/content_cvpr_2014/papers/Erhan_Scalable_Object_Detection_2014_CVPR_paper.pdf">Scalable Object Detection using Deep Neural Networks</a>
</li>
<li>
<strong>2014</strong> - <a rel="nofollow" href="http://arxiv.org/pdf/1409.3215v1.pdf">Sequence to Sequence Learning with Neural Networks</a>
</li>
<li>
<strong>2014</strong> - <a rel="nofollow" href="http://arxiv.org/pdf/1411.4555v1.pdf">Show and Tell: A Neural Image Caption Generator</a>
</li>
</ul>
<h3>2013</h3>
<ul>
<li>
<strong>2013</strong> - <a rel="nofollow" href="http://spatialhadoop.cs.umn.edu/publications/p744-eldawy.pdf">A Demonstration of SpatialHadoop: An Efficient MapReduce Framework for Spatial Data</a>
</li>
<li>
<strong>2013</strong> - <a rel="nofollow" href="http://spatialhadoop.cs.umn.edu/publications/p0144_Eldawy.pdf">CG_Hadoop: Computational Geometry in MapReduce</a>
</li>
<li>
<strong>2013</strong> - <a rel="nofollow" href="http://delivery.acm.org/10.1145/2530000/2522731/p309-terry.pdf">Consistency-Based Service Level Agreements for Cloud Storage</a>
</li>
<li>
<strong>2013</strong> - <a rel="nofollow" href="http://arxiv.org/pdf/1304.1467v2.pdf">Dimension Independent Matrix Square using MapReduce</a>
</li>
<li>
<strong>2013</strong> - <a rel="nofollow" href="http://static.druid.io/docs/druid.pdf">Druid: A Real-time Analytical Data Store</a>
</li>
<li>
<strong>2013</strong> - <a rel="nofollow" href="http://link.springer.com/content/pdf/10.1007%2Fs13748-013-0040-3.pdf">Event labeling combining ensemble detectors and background knowledge</a>
</li>
<li>
<strong>2013</strong> - <a rel="nofollow" href="http://sigops.org/sosp/sosp13/papers/p33-david.pdf">Everything You Always Wanted to Know About Synchronization but Were Afraid to Ask</a>
</li>
<li>
<strong>2013</strong> - <a rel="nofollow" href="http://stevereads.com/papers_to_read/f1_a_distributed_sql_database_that_scales.pdf">F1: A Distributed SQL Database That Scales</a>
</li>
<li>
<strong>2013</strong> - <a rel="nofollow" href="https://amplab.cs.berkeley.edu/wp-content/uploads/2013/05/grades-graphx_with_fonts.pdf">GraphX: A Resilient Distributed Graph System on Spark</a>
</li>
<li>
<strong>2013</strong> - <a rel="nofollow" href="http://stefanheule.com/papers/edbt2013-hyperloglog.pdf">HyperLogLog in Practice: Algorithmic Engineering of a State of The Art Cardinality Estimation Algorithm</a>
</li>
<li>
<strong>2013</strong> - <a rel="nofollow" href="http://static.googleusercontent.com/media/research.google.com/en//pubs/archive/41378.pdf">MillWheel: Fault-Tolerant Stream Processing at Internet Scale</a>
</li>
<li>
<strong>2013</strong> - <a rel="nofollow" href="http://cidrdb.org/cidr2013/Papers/CIDR13_Paper118.pdf">MLbase: A Distributed Machine-learning System</a>
</li>
<li>
<strong>2013</strong> - <a rel="nofollow" href="http://research.microsoft.com/pubs/201100/naiad_sosp2013.pdf">Naiad: A Timely Dataflow System</a>
</li>
<li>
<strong>2013</strong> - <a rel="nofollow" href="http://db.disi.unitn.eu/pages/VLDBProgram/pdf/industry/p764-rae.pdf">Online, Asynchronous Schema Change in F1</a>
</li>
<li>
<strong>2013</strong> - <a rel="nofollow" href="http://eurosys2013.tudos.org/wp-content/uploads/2013/paper/Venkataraman.pdf">Presto: Distributed Machine Learning and Graph Processing with Sparse Matrices</a>
</li>
<li>
<strong>2013</strong> - <a rel="nofollow" href="http://nlp.stanford.edu/~socherr/EMNLP2013_RNTN.pdf">Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank</a>
</li>
<li>
<strong>2013</strong> - <a rel="nofollow" href="http://arxiv.org/pdf/1311.2524v4.pdf">Rich feature hierarchies for accurate object detection and semantic segmentation</a>
</li>
<li>
<strong>2013</strong> - <a rel="nofollow" href="http://research.microsoft.com/pubs/200169/now-vldb.pdf">Scalable Progressive Analytics on Big Data in the Cloud</a>
</li>
<li>
<strong>2013</strong> - <a rel="nofollow" href="https://www.usenix.org/system/files/conference/nsdi13/nsdi13-final170_update.pdf&sa=U&ei=gWJjU97pOeqxsQSDkYDAAg&ved=0CBsQFjAA&usg=AFQjCNGMeuWne9ywncbgux_XiZW6lQWHNw">Scaling Memcache at Facebook</a>
</li>
<li>
<strong>2013</strong> - <a rel="nofollow" href="http://db.disi.unitn.eu/pages/VLDBProgram/pdf/industry/p767-wiener.pdf">Scuba: Diving into Data at Facebook</a>
</li>
<li>
<strong>2013</strong> - <a rel="nofollow" href="http://www.eecs.berkeley.edu/Pubs/TechRpts/2012/EECS-2012-214.pdf">Shark: SQL and Rich Analytics at Scale</a>
</li>
<li>
<strong>2013</strong> - <a rel="nofollow" href="http://arxiv.org/pdf/1312.5402v1.pdf">Some Improvements on Deep Convolutional Neural Network Based Image Classification</a>
</li>
<li>
<strong>2013</strong> - <a rel="nofollow" href="https://www.cs.cmu.edu/~pavlo/courses/fall2013/static/papers/11730-atc13-bronson.pdf">TAO: Facebook's Distributed Data Store for the Social Graph</a>
</li>
<li>
<strong>2013</strong> - <a rel="nofollow" href="http://www.scs.stanford.edu/~stutsman/papers/stutsman-dcft-hotos13.pdf">Toward Common Patterns for Distributed, Concurrent, Fault-Tolerant Code</a>
</li>
<li>
<strong>2013</strong> - <a rel="nofollow" href="http://db.disi.unitn.eu/pages/VLDBProgram/pdf/industry/p871-curtiss.pdf">Unicorn: A System for Searching the Social Graph</a>
</li>
<li>
<strong>2013</strong> - <a rel="nofollow" href="http://hyperdex.org/papers/warp.pdf">Warp: Lightweight Multi-Key Transactions for Key-Value Stores</a>
</li>
</ul>
<h3>2012</h3>
<ul>
<li>
<strong>2012</strong> - <a rel="nofollow" href="http://homes.cs.washington.edu/~pedrod/papers/cacm12.pdf">A Few Useful Things to Know about Machine Learning</a>
</li>
<li>
<strong>2012</strong> - <a rel="nofollow" href="http://research.microsoft.com/en-us/um/people/borgs/Papers/SublinearPR.pdf">A Sublinear Time Algorithm for PageRank Computations</a>
</li>
<li>
<strong>2012</strong> - <a rel="nofollow" href="http://www.vldb.org/pvldb/vol5/p1874_liliwu_vldb2012.pdf">Avatara: OLAP for Web-scale Analytics Products</a>
</li>
<li>
<strong>2012</strong> - <a rel="nofollow" href="http://www.cs.berkeley.edu/~sameerag/blinkdb_vldb12_demo.pdf">Blink and It's Done. Interactive Queries on Very Large Data</a>
</li>
<li>
<strong>2012</strong> - <a rel="nofollow" href="https://www.cs.berkeley.edu/~sameerag/blinkdb_eurosys13.pdf">BlinkDB: Queries with Bounded Errors and Bounded Response Times on Very Large Data</a>
</li>
<li>
<strong>2012</strong> - <a rel="nofollow" href="http://arxiv.org/pdf/1206.2082.pdf">Dimension Independent Similarity Computation</a>
</li>
<li>
<strong>2012</strong> - <a rel="nofollow" href="http://www.umiacs.umd.edu/~jimmylin/publications/Busch_etal_ICDE2012.pdf">Earlybird: Real-Time Search at Twitter</a>
</li>
<li>
<strong>2012</strong> - <a rel="nofollow" href="https://www.usenix.org/system/files/login/articles/zaharia.pdf">Fast and Interactive Analytics over Hadoop Data with Spark</a>
</li>
<li>
<strong>2012</strong> - <a rel="nofollow" href="http://hyperdex.org/papers/hyperdex.pdf">HyperDex: A Distributed, Searchable Key-Value Store</a>
</li>
<li>
<strong>2012</strong> - <a rel="nofollow" href="http://www.cs.toronto.edu/~fritz/absps/imagenet.pdf">ImageNet Classification with Deep Convolutional Neural Networks</a>
</li>
<li>
<strong>2012</strong> - <a rel="nofollow" href="http://www.umiacs.umd.edu/~jimmylin/publications/Lin_Kolcz_SIGMOD2012.pdf">Large-Scale Machine Learning at Twitter</a>
</li>
<li>
<strong>2012</strong> - <a rel="nofollow" href="http://arxiv.org/pdf/1202.2771v5.pdf">Multi-Scale Matrix Sampling and Sublinear-Time PageRank Computation</a>
</li>
<li>
<strong>2012</strong> - <a rel="nofollow" href="http://research.microsoft.com/pubs/178045/ppaoxs-paper29.pdf">Paxos Made Parallel</a>
</li>
<li>
<strong>2012</strong> - <a rel="nofollow" href="https://www.usenix.org/legacy/events/nsdi11/tech/full_papers/Bolosky.pdf">Paxos Replicated State Machines as the Basis of a High-Performance Data Store</a>
</li>
<li>
<strong>2012</strong> - <a rel="nofollow" href="http://vldb.org/pvldb/vol5/p1436_alexanderhall_vldb2012.pdf">Processing a Trillion Cells per Mouse Click</a>
</li>
<li>
<strong>2012</strong> - <a rel="nofollow" href="http://www.cs.berkeley.edu/~matei/papers/2012/sigmod_shark_demo.pdf">Shark: Fast Data Analysis Using Coarse-grained Distributed Memory</a>
</li>
<li>
<strong>2012</strong> - <a rel="nofollow" href="http://static.googleusercontent.com/media/research.google.com/en//archive/spanner-osdi2012.pdf">Spanner: Google's Globally-Distributed Database</a>
</li>
<li>
<strong>2012</strong> - <a rel="nofollow" href="http://vldb.org/pvldb/vol5/p1771_georgelee_vldb2012.pdf">The Unified Logging Infrastructure for Data Analytics at Twitter</a>
</li>
<li>
<strong>2012</strong> - <a rel="nofollow" href="http://vldb.org/pvldb/vol5/p1790_andrewlamb_vldb2012.pdf">The Vertica Analytic Database: C-Store 7 Years Later</a>
</li>
</ul>
<h3>2011</h3>
<ul>
<li>
<strong>2011</strong> - <a rel="nofollow" href="http://csce.uark.edu/~tingxiny/courses/5013spring13/readingList/crowddb_sigmod2011.pdf">CrowdDB: Answering Queries with Crowdsourcing</a>
</li>
<li>
<strong>2011</strong> - <a rel="nofollow" href="http://cs.brown.edu/~kraskat/pub/vldb11-crowddb_demo.pdf">CrowdDB: Query Processing with the VLDB Crowd</a>
</li>
<li>
<strong>2011</strong> - <a rel="nofollow" href="http://web.stanford.edu/~ouster/cgi-bin/papers/ramcloud-recovery.pdf">Fast Crash Recovery in RAMCloud</a>
</li>
<li>
<strong>2011</strong> - <a rel="nofollow" href="http://www.eecs.berkeley.edu/~brecht/papers/hogwildTR.pdf">Hogwild!: A Lock-Free Approach to Parallelizing Stochastic Gradient Descent</a>
</li>
<li>
<strong>2011</strong> - <a rel="nofollow" href="http://www.scs.stanford.edu/~rumble/papers/latency_hotos11.pdf">It's Time for Low Latency</a>
</li>
<li>
<strong>2011</strong> - <a rel="nofollow" href="http://research.microsoft.com/pubs/144534/matching_tr.pdf">Matching Unstructured Product Offers to Structured Product Specifications</a>
</li>
<li>
<strong>2011</strong> - <a rel="nofollow" href="http://www.cidrdb.org/cidr2011/Papers/CIDR11_Paper32.pdf">Megastore: Providing Scalable, Highly Available Storage for Interactive Services</a>
</li>
<li>
<strong>2011</strong> - <a rel="nofollow" href="https://www.cs.berkeley.edu/~matei/papers/2012/nsdi_spark.pdf">Resilient Distributed Datasets: A Fault-Tolerant Abstraction for In-Memory Cluster Computing</a>
</li>
<li>
<strong>2011</strong> - <a rel="nofollow" href="http://research.microsoft.com/en-us/um/people/srikanth/data/scarlett_eurosys11.pdf">Scarlett: Coping with Skewed Content Popularity in MapReduce Clusters</a>
</li>
</ul>
<h3>2010</h3>
<ul>
<li>
<strong>2010</strong> - <a rel="nofollow" href="http://static.googleusercontent.com/media/research.google.com/en//pubs/archive/36356.pdf">Dapper, a Large-Scale Distributed Systems Tracing Infrastructure</a>
</li>
<li>
<strong>2010</strong> - <a rel="nofollow" href="http://web.stanford.edu/~boyd/papers/pdf/admm_distr_stats.pdf">Distributed Optimization and Statistical Learning via the Alternating Direction Method of Multipliers</a>
</li>
<li>
<strong>2010</strong> - <a rel="nofollow" href="http://static.googleusercontent.com/media/research.google.com/en//pubs/archive/36632.pdf">Dremel: Interactive Analysis of Web-Scale Datasets</a>
</li>
<li>
<strong>2010</strong> - <a rel="nofollow" href="https://www.usenix.org/legacy/event/osdi10/tech/full_papers/Beaver.pdf">Finding a needle in Haystack: Facebook's photo storage</a>
</li>
<li>
<strong>2010</strong> - <a rel="nofollow" href="http://pages.cs.wisc.edu/~akella/CS838/F12/838-CloudPapers/FlumeJava.pdf">FlumeJava: Easy, Efficient Data-Parallel Pipelines</a>
</li>
<li>
<strong>2010</strong> - <a rel="nofollow" href="https://www.usenix.org/legacy/event/osdi10/tech/full_papers/Peng.pdf">Large-scale Incremental Processing Using Distributed Transactions and Notifications</a>
</li>
<li>
<strong>2010</strong> - <a rel="nofollow" href="http://static.usenix.org/event/nsdi11/tech/full_papers/Hindman_new.pdf">Mesos: A Platform for Fine-Grained Resource Sharing in the Data Center</a>
</li>
<li>
<strong>2010</strong> - <a rel="nofollow" href="http://kowshik.github.io/JPregel/pregel_paper.pdf">Pregel: A System for Large-Scale Graph Processing</a>
</li>
<li>
<strong>2010</strong> - <a rel="nofollow" href="http://www.4lunas.org/pub/2010-s4.pdf">S4: Distributed Stream Computing Platform</a>
</li>
<li>
<strong>2010</strong> - <a rel="nofollow" href="http://www.cs.berkeley.edu/~matei/papers/2010/hotcloud_spark.pdf">Spark: Cluster Computing with Working Sets</a>
</li>
<li>
<strong>2010</strong> - <a rel="nofollow" href="http://static.googleusercontent.com/media/research.google.com/fr//pubs/archive/36955.pdf">The Learning Behind Gmail Priority Inbox</a>
</li>
<li>
<strong>2010</strong> - <a rel="nofollow" href="https://www.usenix.org/legacy/event/usenix10/tech/full_papers/Hunt.pdf">ZooKeeper: Wait-free coordination for Internet-scale systems</a>
</li>
</ul>
<h3>2009</h3>
<ul>
<li>
<strong>2009</strong> - <a rel="nofollow" href="https://www.cs.cornell.edu/projects/ladis2009/papers/lakshman-ladis2009.pdf">Cassandra - A Decentralized Structured Storage System</a>
</li>
<li>
<strong>2009</strong> - <a rel="nofollow" href="http://www.vldb.org/pvldb/2/vldb09-861.pdf">HadoopDB: An Architectural Hybrid of MapReduce and DBMS Technologies for Analytical Workloads</a>
</li>
<li>
<strong>2009</strong> - <a rel="nofollow" href="http://research.microsoft.com/en-us/um/people/lamport/pubs/vertical-paxos.pdf">Vertical Paxos and Primary-Backup Replication</a>
</li>
</ul>
<h3>2008</h3>
<ul>
<li>
<strong>2008</strong> - <a rel="nofollow" href="http://mmm.csd.uwo.ca/courses/CS9842/papers/Paper-13-Ariel-Rabkin.pdf">Chukwa: A large-scale monitoring system</a>
</li>
<li>
<strong>2008</strong> - <a rel="nofollow" href="http://db.csail.mit.edu/projects/cstore/abadi-sigmod08.pdf">Column-Stores vs. Row-Stores: How Different Are They Really?</a>
</li>
<li>
<strong>2008</strong> - <a rel="nofollow" href="http://www.mpi-sws.org/~druschel/courses/ds/papers/cooper-pnuts.pdf">PNUTS: Yahoo!'s Hosted Data Serving Platform</a>
</li>
<li>
<strong>2008</strong> - <a rel="nofollow" href="http://www.cs.umd.edu/~samir/498/10Algorithms-08.pdf">Top 10 algorithms in data mining</a>
</li>
</ul>
<h3>2007</h3>
<ul>
<li>
<strong>2007</strong> - <a rel="nofollow" href="http://cs.brown.edu/~debrabant/cis570-website/papers/dryad.pdf">Dryad: Distributed Data-Parallel Programs from Sequential Building Blocks</a>
</li>
<li>
<strong>2007</strong> - <a rel="nofollow" href="http://www.allthingsdistributed.com/files/amazon-dynamo-sosp2007.pdf">Dynamo: Amazon's Highly Available Key-value Store</a>
</li>
<li>
<strong>2007</strong> - <a rel="nofollow" href="http://vis-www.cs.umass.edu/lfw/lfw.pdf">Labeled Faces in the Wild: A Database for Studying Face Recognition in Unconstrained Environments</a>
</li>
<li>
<strong>2007</strong> - <a rel="nofollow" href="http://www.ics.uci.edu/~cs223/papers/cidr07p15.pdf">Life beyond Distributed Transactions: an Apostate's Opinion</a>
</li>
<li>
<strong>2007</strong> - <a rel="nofollow" href="http://www.cs.utexas.edu/users/lorenzo/corsi/cs380d/papers/paper2-1.pdf">Paxos Made Live - An Engineering Perspective</a>
</li>
</ul>
<h3>2006</h3>
<ul>
<li>
<strong>2006</strong> - <a rel="nofollow" href="http://static.googleusercontent.com/media/research.google.com/en//archive/bigtable-osdi06.pdf">Bigtable: A Distributed Storage System for Structured Data</a>
</li>
<li>
<strong>2006</strong> - <a rel="nofollow" href="http://www.ssrc.ucsc.edu/Papers/weil-osdi06.pdf">Ceph: A Scalable, High-Performance Distributed File System</a>
</li>
<li>
<strong>2006</strong> - <a rel="nofollow" href="http://machinelearning.wustl.edu/mlpapers/paper_files/NIPS2006_725.pdf">Map-Reduce for Machine Learning on Multicore</a>
</li>
<li>
<strong>2006</strong> - <a rel="nofollow" href="http://static.googleusercontent.com/media/research.google.com/en//archive/chubby-osdi06.pdf">The Chubby lock service for loosely-coupled distributed systems</a>
</li>
</ul>
<h3>2005</h3>
<ul>
<li>
<strong>2005</strong> - <a rel="nofollow" href="http://research.microsoft.com/pubs/64624/tr-2005-112.pdf">Fast Paxos</a>
</li>
</ul>
<h3>2004</h3>
<ul>
<li>
<strong>2004</strong> - <a rel="nofollow" href="http://research.microsoft.com/en-us/um/people/lamport/pubs/web-dsn-submission.pdf">Cheap Paxos</a>
</li>
<li>
<strong>2004</strong> - <a rel="nofollow" href="http://static.googleusercontent.com/media/research.google.com/en//archive/mapreduce-osdi04.pdf">MapReduce: Simplified Data Processing on Large Clusters</a>
</li>
</ul>
<h3>2003</h3>
<ul>
<li>
<strong>2003</strong> - <a rel="nofollow" href="http://static.googleusercontent.com/media/research.google.com/en//archive/gfs-sosp2003.pdf">The Google File System</a>
</li>
</ul>
<h3>2002</h3>
<ul>
<li>
<strong>2002</strong> - <a rel="nofollow" href="http://lpd.epfl.ch/sgilbert/pubs/BrewersConjecture-SigAct.pdf">Brewer's Conjecture and the Feasibility of Consistent, Available, Partition-Tolerant Web Services</a>
</li>
</ul>
<h3>2001</h3>
<ul>
<li>
<strong>2001</strong> - <a rel="nofollow" href="http://pdos.csail.mit.edu/papers/chord:sigcomm01/chord_sigcomm.pdf">Chord: A Scalable Peer-to-peer Lookup Service for Internet Applications</a>
</li>
<li>
<strong>2001</strong> - <a rel="nofollow" href="http://research.microsoft.com/en-us/um/people/lamport/pubs/paxos-simple.pdf">Paxos Made Simple</a>
</li>
<li>
<strong>2001</strong> - <a rel="nofollow" href="http://oz.berkeley.edu/~breiman/randomforest2001.pdf">Random Forests</a>
</li>
</ul>
<h3>1999</h3>
<ul>
<li>
<strong>1999</strong> - <a rel="nofollow" href="http://link.springer.com/content/pdf/10.1023%2FA%3A1007563306331.pdf">Pasting Small Votes for Classification in Large Databases and On-Line</a>
</li>
<li>
<strong>1999</strong> - <a rel="nofollow" href="http://ilpubs.stanford.edu:8090/422/1/1999-66.pdf">The PageRank Citation Ranking: Bringing Order to the Web</a>
</li>
</ul>
<h3>1997</h3>
<ul>
<li>
<strong>1997</strong> - <a rel="nofollow" href="http://www.nas.nasa.gov/assets/pdf/techreports/1997/nas-97-010.pdf">Application-Controlled Demand Paging for Out-of-Core Visualization</a>
</li>
</ul>
Big Data Ecosystem Dataset
https://segmentfault.com/a/1190000002521290
2015-01-29T10:38:39+08:00
2015-01-29T10:38:39+08:00
timger
https://segmentfault.com/u/timger
6
<p>Big Data Ecosystem Dataset</p>
<h2>Data</h2>
<h3>Projects</h3>
<ul>
<li><a rel="nofollow">Frameworks</a></li>
<li><a rel="nofollow">Distributed Programming</a></li>
<li><a rel="nofollow">Distributed Filesystem</a></li>
<li><a rel="nofollow">Key-Map Data Model</a></li>
<li><a rel="nofollow">Document Data Model</a></li>
<li><a rel="nofollow">Key-value Data Model</a></li>
<li><a rel="nofollow">Graph Data Model</a></li>
<li><a rel="nofollow">NewSQL Databases</a></li>
<li><a rel="nofollow">Columnar Databases</a></li>
<li><a rel="nofollow">Time-Series Databases</a></li>
<li><a rel="nofollow">SQL-like processing</a></li>
<li><a rel="nofollow">Integrated Development Environments</a></li>
<li><a rel="nofollow">Data Ingestion</a></li>
<li><a rel="nofollow">Message-oriented middleware</a></li>
<li><a rel="nofollow">Service Programming</a></li>
<li><a rel="nofollow">Scheduling</a></li>
<li><a rel="nofollow">Machine Learning</a></li>
<li><a rel="nofollow">Benchmarking</a></li>
<li><a rel="nofollow">Security</a></li>
<li><a rel="nofollow">System Deployment</a></li>
<li><a rel="nofollow">Container Manager</a></li>
<li><a rel="nofollow">Applications</a></li>
<li><a rel="nofollow">Search engine and framework</a></li>
<li><a rel="nofollow">MySQL forks and evolutions</a></li>
<li><a rel="nofollow">PostgreSQL forks and evolutions</a></li>
<li><a rel="nofollow">Memcached forks and evolutions</a></li>
<li><a rel="nofollow">Embedded Databases</a></li>
<li><a rel="nofollow">Business Intelligence</a></li>
<li><a rel="nofollow">Data Analysis</a></li>
<li><a rel="nofollow">Data Warehouse</a></li>
<li><a rel="nofollow">Data Visualization</a></li>
<li><a rel="nofollow">Internet of Things</a></li>
</ul>
<h4>Frameworks</h4>
<ul>
<li>
<a rel="nofollow" href="http://hadoop.apache.org/">Apache Hadoop</a> - framework for distributed processing. Integrates MapReduce (parallel processing), YARN (job scheduling) and HDFS (distributed file system).</li>
</ul>
<h4>Distributed Programming</h4>
<ul>
<li>
<a rel="nofollow" href="https://github.com/addthis/hydra">AddThis Hydra</a> - distributed data processing and storage system originally developed at AddThis.</li>
<li>
<a rel="nofollow" href="https://github.com/mozilla-metrics/akela">Akela</a> - Mozilla's utility library for Hadoop, HBase, Pig, etc.</li>
<li>
<a rel="nofollow" href="http://aws.amazon.com/lambda/">Amazon Lambda</a> - a compute service that runs your code in response to events and automatically manages the compute resources for you.</li>
<li>
<a rel="nofollow" href="http://databricks.github.io/simr/">AMPLab SIMR</a> - run Spark on Hadoop MapReduce v1.</li>
<li>
<a rel="nofollow" href="http://succinct.cs.berkeley.edu/wp/wordpress/">AMPLab Succinct</a> - Enabling Queries on Compressed Data.</li>
<li>
<a rel="nofollow" href="http://crunch.apache.org/">Apache Crunch</a> - a simple Java API for tasks like joining and data aggregation that are tedious to implement on plain MapReduce.</li>
<li>
<a rel="nofollow" href="http://incubator.apache.org/projects/datafu.html">Apache DataFu</a> - collection of user-defined functions for Hadoop and Pig developed by LinkedIn.</li>
<li>
<a rel="nofollow" href="http://flink.incubator.apache.org/">Apache Flink</a> - high-performance runtime, and automatic program optimization.</li>
<li>
<a rel="nofollow" href="http://gora.apache.org/">Apache Gora</a> - framework for in-memory data model and persistence.</li>
<li>
<a rel="nofollow" href="http://hama.apache.org/">Apache Hama</a> - BSP (Bulk Synchronous Parallel) computing framework.</li>
<li>
<a rel="nofollow" href="http://wiki.apache.org/hadoop/MapReduce/">Apache MapReduce</a> - programming model for processing large data sets with a parallel, distributed algorithm on a cluster.</li>
<li>
<a rel="nofollow" href="https://pig.apache.org/">Apache Pig</a> - high level language to express data analysis programs for Hadoop.</li>
<li>
<a rel="nofollow" href="http://incubator.apache.org/s4/">Apache S4</a> - framework for stream processing, implementation of S4.</li>
<li>
<a rel="nofollow" href="http://spark.incubator.apache.org/">Apache Spark</a> - framework for in-memory cluster computing.</li>
<li>
<a rel="nofollow" href="http://spark.incubator.apache.org/docs/0.7.3/streaming-programming-guide.html">Apache Spark Streaming</a> - framework for stream processing, part of Spark.</li>
<li>
<a rel="nofollow" href="http://storm-project.net/">Apache Storm</a> - framework for stream processing by Twitter, also available on YARN.</li>
<li>
<a rel="nofollow" href="http://tez.incubator.apache.org/">Apache Tez</a> - application framework for executing a complex DAG (directed acyclic graph) of tasks, built on YARN.</li>
<li>
<a rel="nofollow" href="https://incubator.apache.org/projects/twill.html">Apache Twill</a> - abstraction over YARN that reduces the complexity of developing distributed applications.</li>
<li>
<a rel="nofollow" href="http://cascalog.org/">Cascalog</a> - data processing and querying library.</li>
<li>
<a rel="nofollow" href="http://vldbarc.org/pvldb/vldb2010/pvldb_vol3/I08.pdf">Cheetah</a> - High Performance, Custom Data Warehouse on Top of MapReduce.</li>
<li>
<a rel="nofollow" href="http://www.cascading.org/">Concurrent Cascading</a> - framework for data management/analytics on Hadoop.</li>
<li>
<a rel="nofollow" href="https://github.com/damballa/parkour">Damballa Parkour</a> - MapReduce library for Clojure.</li>
<li>
<a rel="nofollow" href="https://github.com/datasalt/pangool">Datasalt Pangool</a> - alternative MapReduce paradigm.</li>
<li>
<a rel="nofollow" href="https://www.datatorrent.com/">DataTorrent StrAM</a> - real-time engine designed to enable distributed, asynchronous, real-time, in-memory big-data computations in as unblocked a way as possible, with minimal overhead and impact on performance.</li>
<li>
<a rel="nofollow" href="http://www.vertica.com/distributedr/">DistributedR</a> - scalable high-performance platform for the R language.</li>
<li>
<a rel="nofollow" href="http://www.drools.org/">Drools</a> - a Business Rules Management System (BRMS) solution.</li>
<li>
<a rel="nofollow" href="https://github.com/eBay/oink">eBay Oink</a> - REST based interface for PIG execution.</li>
<li>
<a rel="nofollow" href="https://www.facebook.com/notes/facebook-engineering/under-the-hood-scheduling-mapreduce-jobs-more-efficiently-with-corona/10151142560538920">Facebook Corona</a> - Hadoop enhancement which removes single point of failure.</li>
<li>
<a rel="nofollow" href="http://peregrine_mapreduce.bitbucket.org/">Facebook Peregrine</a> - Map Reduce framework.</li>
<li>
<a rel="nofollow" href="https://www.facebook.com/notes/facebook-engineering/under-the-hood-data-diving-with-scuba/10150599692628920">Facebook Scuba</a> - distributed in-memory datastore.</li>
<li>
<a rel="nofollow" href="http://geotrellis.io/">Geotrellis</a> - geographic data processing engine for high performance applications.</li>
<li>
<a rel="nofollow" href="http://esri.github.io/gis-tools-for-hadoop/">GIS Tools for Hadoop</a> - Big Data Spatial Analytics for the Hadoop Framework.</li>
<li>
<a rel="nofollow" href="http://googledevelopers.blogspot.it/2014/06/cloud-platform-at-google-io-new-big.html">Google Dataflow</a> - create data pipelines to help ingest, transform and analyze data.</li>
<li>
<a rel="nofollow" href="http://research.google.com/archive/mapreduce.html">Google MapReduce</a> - map reduce framework.</li>
<li>
<a rel="nofollow" href="http://research.google.com/pubs/pub41378.html">Google MillWheel</a> - fault tolerant stream processing framework.</li>
<li>
<a rel="nofollow" href="http://hazelcast.com/products/hazelcast/">Hazelcast</a> - In-Memory Data Grid.</li>
<li>
<a rel="nofollow" href="http://www.informatica.com/us/products/big-data/hparser/">HParser</a> - data parsing transformation environment optimized for Hadoop.</li>
<li>
<a rel="nofollow" href="http://www.ibm.com/software/products/en/infosphere-streams">IBM Streams</a> - advanced analytic platform that allows user-developed applications to quickly ingest, analyze and correlate information as it arrives from thousands of real-time sources.</li>
<li>
<a rel="nofollow" href="https://code.google.com/p/jaql/">JAQL</a> - declarative programming language for working with structured, semi-structured and unstructured data.</li>
<li>
<a rel="nofollow" href="http://kitesdk.org/docs/current/">Kite</a> - a set of libraries, tools, examples, and documentation focused on making it easier to build systems on top of the Hadoop ecosystem.</li>
<li>
<a rel="nofollow" href="https://github.com/EsotericSoftware/kryo">Kryo</a> - Java serialization and cloning: fast, efficient, automatic.</li>
<li>
<a rel="nofollow" href="https://github.com/linkedin/Cubert">LinkedIn Cubert</a> - a fast and efficient batch computation engine for complex analysis and reporting of massive datasets on Hadoop.</li>
<li>
<a rel="nofollow" href="https://github.com/Netflix/Lipstick">Lipstick</a> - Pig workflow visualization tool.</li>
<li>
<a rel="nofollow" href="http://druid.io/">Metamarkets Druid</a> - framework for real-time analysis of large datasets.</li>
<li>
<a rel="nofollow" href="https://github.com/Netflix/aegisthus">Netflix Aegisthus</a> - bulk data pipeline out of Cassandra. Implements a reader for the SSTable format and provides a map/reduce program to create a compacted snapshot of the data contained in a column family.</li>
<li>
<a rel="nofollow" href="https://github.com/Netflix/Lipstick">Netflix Lipstick</a> - Pig Visualization framework.</li>
<li>
<a rel="nofollow" href="http://qconsf.com/presentation/mantis-netflixs-event-stream-processing-system">Netflix Mantis</a> - Event Stream Processing System.</li>
<li>
<a rel="nofollow" href="https://github.com/Netflix/PigPen">Netflix PigPen</a> - map-reduce for Clojure which compiles to Apache Pig.</li>
<li>
<a rel="nofollow" href="https://github.com/Netflix/staash">Netflix STAASH</a> - language-agnostic as well as storage-agnostic web interface for storing data into persistent storage systems.</li>
<li>
<a rel="nofollow" href="https://github.com/Netflix/zeno">Netflix Zeno</a> - Netflix's In-Memory Data Propagation Framework.</li>
<li>
<a rel="nofollow" href="http://www.nextflow.io">Nextflow</a> - Dataflow oriented toolkit for parallel and distributed computational pipelines.</li>
<li>
<a rel="nofollow" href="http://discoproject.org/">Nokia Disco</a> - MapReduce framework developed by Nokia.</li>
<li>
<a rel="nofollow" href="https://github.com/Netflix/PigPen">PigPen</a> - PigPen is map-reduce for Clojure, or distributed Clojure. It compiles to Apache Pig, but you don't need to know much about Pig to use it.</li>
<li>
<a rel="nofollow" href="http://engineering.pinterest.com/post/91288882494/pinlater-an-asynchronous-job-execution-system">Pinterest Pinlater</a> - asynchronous job execution system.</li>
<li>
<a rel="nofollow" href="http://www.pubnub.com/">Pubnub</a> - Data stream network.</li>
<li>
<a rel="nofollow" href="http://pydoop.sourceforge.net/docs/">Pydoop</a> - Python MapReduce and HDFS API for Hadoop.</li>
<li>
<a rel="nofollow" href="http://www.scaleoutsoftware.com/">ScaleOut hServer</a> - fast, scalable in-memory data grid for Hadoop.</li>
<li>
<a rel="nofollow" href="http://seqpig.sourceforge.net/">SeqPig</a> - simple and scalable scripting for large sequencing data sets (e.g. bioinformatics) in Hadoop.</li>
<li>
<a rel="nofollow" href="https://github.com/sigmoidanalytics/spork">SigmoidAnalytics Spork</a> - Pig on Apache Spark.</li>
<li>
<a rel="nofollow" href="http://spatialhadoop.cs.umn.edu/">SpatialHadoop</a> - MapReduce extension to Apache Hadoop designed specially to work with spatial data.</li>
<li>
<a rel="nofollow" href="http://projects.spring.io/spring-hadoop/">Spring for Apache Hadoop</a> - unified configuration model and easy to use APIs for using HDFS, MapReduce, Pig, and Hive.</li>
<li>
<a rel="nofollow" href="http://www.sqlstream.com/blaze/">SQLStream Blaze</a> - stream processing platform.</li>
<li>
<a rel="nofollow" href="http://www.openstratio.org/about/stratio-streaming/">Stratio Streaming</a> - the union of a real-time messaging bus with a complex event processing engine using Spark Streaming.</li>
<li>
<a rel="nofollow" href="http://stratosphere.eu/">Stratosphere</a> - general purpose cluster computing framework.</li>
<li>
<a rel="nofollow" href="https://streamdrill.com/">Streamdrill</a> - useful for counting activities of event streams over different time windows and finding the most active one.</li>
<li>
<a rel="nofollow" href="http://www.sumologic.com/">Sumo Logic</a> - cloud-based analyzer for machine-generated data.</li>
<li>
<a rel="nofollow" href="http://it.teradata.com/Teradata-QueryGrid/">Teradata QueryGrid</a> - data-access layer that can orchestrate multiple modes of analysis across multiple databases plus Hadoop.</li>
<li>
<a rel="nofollow" href="http://www.tibco.com/products/automation/in-memory-computing/in-memory-data-grid/activespaces-enterprise-edition">TIBCO ActiveSpaces</a> - in-memory data grid.</li>
<li>
<a rel="nofollow" href="http://cask.co/products/tigon/">Tigon</a> - a distributed framework built on Apache Hadoop and Apache HBase for real-time, high-throughput, low-latency data processing and analytics applications.</li>
<li>
<a rel="nofollow" href="http://torch.ch/">Torch</a> - Scientific computing for LuaJIT.</li>
<li>
<a rel="nofollow" href="https://github.com/twitter/scalding">Twitter Scalding</a> - Scala library for Map Reduce jobs, built on Cascading.</li>
<li>
<a rel="nofollow" href="https://github.com/twitter/summingbird">Twitter Summingbird</a> - Streaming MapReduce with Scalding and Storm, by Twitter.</li>
<li>
<a rel="nofollow" href="https://blog.twitter.com/2014/tsar-a-timeseries-aggregator">Twitter TSAR</a> - TimeSeries AggregatoR by Twitter.</li>
</ul>
<h4>Distributed Filesystem</h4>
<ul>
<li>
<a rel="nofollow" href="http://hadoop.apache.org/">Apache HDFS</a> - a way to store large files across multiple machines.</li>
<li>
<a rel="nofollow" href="http://www.fhgfs.com/cms/">BeeGFS</a> - formerly FhGFS, parallel distributed file system.</li>
<li>
<a rel="nofollow" href="http://ceph.com/ceph-storage/file-system/">Ceph Filesystem</a> - software storage platform.</li>
<li>
<a rel="nofollow" href="http://disco.readthedocs.org/en/latest/howto/ddfs.html">Disco DDFS</a> - distributed filesystem.</li>
<li>
<a rel="nofollow" href="https://www.facebook.com/note.php?note_id=76191543919">Facebook Haystack</a> - object storage system.</li>
<li>
<a rel="nofollow" href="https://google.com/">Google Colossus</a> - distributed filesystem (GFS2).</li>
<li>
<a rel="nofollow" href="https://google.com/">Google GFS</a> - distributed filesystem.</li>
<li>
<a rel="nofollow" href="http://research.google.com/pubs/pub36971.html">Google Megastore</a> - scalable, highly available storage.</li>
<li>
<a rel="nofollow" href="http://www.gridgain.org/">GridGain</a> - GGFS, Hadoop compliant in-memory file system.</li>
<li>
<a rel="nofollow" href="https://github.com/twitter/hdfs-du">HDFS-DU</a> - interactive visualization of the Hadoop distributed file system.</li>
<li>
<a rel="nofollow" href="http://wiki.lustre.org/">Lustre file system</a> - high-performance distributed filesystem.</li>
<li>
<a rel="nofollow" href="https://github.com/Netflix/s3mper">Netflix S3mper</a> - library that provides an additional layer of consistency checking on top of Amazon's S3 index through use of a consistent, secondary index.</li>
<li>
<a rel="nofollow" href="https://www.quantcast.com/engineering/qfs/">Quantcast File System QFS</a> - open-source distributed file system.</li>
<li>
<a rel="nofollow" href="http://www.gluster.org/">Red Hat GlusterFS</a> - scale-out network-attached storage file system.</li>
<li>
<a rel="nofollow" href="http://tachyon-project.org/">Tachyon</a> - reliable file sharing at memory speed across cluster frameworks.</li>
</ul>
<h4>Key-Map Data Model</h4>
<ul>
<li>
<a rel="nofollow" href="http://www.actian.com/">Actian Vector</a> - column-oriented analytic database.</li>
<li>
<a rel="nofollow" href="http://accumulo.apache.org/">Apache Accumulo</a> - distributed key/value store, built on Hadoop.</li>
<li>
<a rel="nofollow" href="http://cassandra.apache.org/">Apache Cassandra</a> - column-oriented distributed datastore, inspired by BigTable.</li>
<li>
<a rel="nofollow" href="http://hbase.apache.org/">Apache HBase</a> - column-oriented distributed datastore, inspired by BigTable.</li>
<li>
<a rel="nofollow" href="https://code.facebook.com/posts/321111638043166/hydrabase-the-evolution-of-hbase-facebook/">Facebook HydraBase</a> - evolution of HBase made by Facebook.</li>
<li>
<a rel="nofollow" href="http://static.googleusercontent.com/external_content/untrusted_dlcp/research.google.com/en//archive/bigtable-osdi06.pdf">Google BigTable</a> - column-oriented distributed datastore.</li>
<li>
<a rel="nofollow" href="https://developers.google.com/datastore/">Google Cloud Datastore</a> - fully managed, schemaless database for storing non-relational data over BigTable.</li>
<li>
<a rel="nofollow" href="http://hypertable.org/">Hypertable</a> - column-oriented distributed datastore, inspired by BigTable.</li>
<li>
<a rel="nofollow" href="http://infinidb.co/">InfiniDB</a> - accessed through a MySQL interface; uses massively parallel processing to parallelize queries.</li>
<li>
<a rel="nofollow" href="http://content.dataversity.net/rs/wilshireconferences/images/MapR-DB_Product_Preview_for_NoSQL_Now.pdf">MapR-DB</a> - fast, scalable, and enterprise-ready in-Hadoop database architected to manage big data.</li>
<li>
<a rel="nofollow" href="https://github.com/Netflix/Priam">Netflix Priam</a> - Co-Process for backup/recovery, Token Management, and Centralized Configuration management for Cassandra.</li>
<li>
<a rel="nofollow" href="http://ohmdata.com/">OhmData C5</a> - improved version of HBase.</li>
<li>
<a rel="nofollow" href="http://sqrrl.com/product/sqrrl-enterprise/">Sqrrl</a> - NoSQL databases on top of Apache Accumulo.</li>
<li>
<a rel="nofollow" href="https://github.com/continuuity/tephra">Tephra</a> - Transactions for HBase.</li>
<li>
<a rel="nofollow" href="https://blog.twitter.com/2014/manhattan-our-real-time-multi-tenant-distributed-database-for-twitter-scale">Twitter Manhattan</a> - real-time, multi-tenant distributed database for Twitter scale.</li>
</ul>
<h4>Document Data Model</h4>
<ul>
<li>
<a rel="nofollow" href="http://www.actian.com/products/operational-databases/">Actian Versant</a> - commercial object-oriented database management system.</li>
<li>
<a rel="nofollow" href="http://aws.amazon.com/simpledb/">Amazon SimpleDB</a> - a highly available and flexible non-relational data store that offloads the work of database administration.</li>
<li>
<a rel="nofollow" href="http://www.clusterpoint.com/">Clusterpoint</a> - a database software for high-speed storage and large-scale processing of XML and JSON data on clusters of commodity hardware.</li>
<li>
<a rel="nofollow" href="https://crate.io/">Crate Data</a> - open source, massively scalable data store that requires zero administration.</li>
<li>
<a rel="nofollow" href="http://www.infoq.com/news/2014/06/facebook-apollo">Facebook Apollo</a> - Facebook’s Paxos-like NoSQL database.</li>
<li>
<a rel="nofollow" href="http://comsysto.github.io/jumbodb/">jumboDB</a> - document oriented datastore over Hadoop.</li>
<li>
<a rel="nofollow" href="http://data.linkedin.com/projects/espresso">LinkedIn Espresso</a> - horizontally scalable document-oriented NoSQL data store.</li>
<li>
<a rel="nofollow" href="http://www.marklogic.com/">MarkLogic</a> - Schema-agnostic Enterprise NoSQL database technology.</li>
<li>
<a rel="nofollow" href="http://azure.microsoft.com/en-us/services/documentdb/">Microsoft DocumentDB</a> - fully-managed, highly-scalable, NoSQL document database service.</li>
<li>
<a rel="nofollow" href="http://www.mongodb.org/">MongoDB</a> - Document-oriented database system.</li>
<li>
<a rel="nofollow" href="http://www.ravendb.net/">RavenDB</a> - A transactional, open-source Document Database.</li>
<li>
<a rel="nofollow" href="http://www.rethinkdb.com/">RethinkDB</a> - document database that supports queries like table joins and group by.</li>
<li>
<a rel="nofollow" href="https://code.google.com/p/terrastore/">Terrastore</a> - a modern document store which provides advanced scalability and elasticity features without sacrificing consistency.</li>
<li>
<a rel="nofollow" href="http://www.tokutek.com/products/tokumx-for-mongodb/">TokuMX</a> - High-Performance MongoDB Distribution.</li>
</ul>
<h4>Key-value Data Model</h4>
<ul>
<li>
<a rel="nofollow" href="http://www.aerospike.com/">Aerospike</a> - NoSQL flash-optimized, in-memory database. Open source, with "server code in 'C' (not Java or Erlang) precisely tuned to avoid context switching and memory copies."</li>
<li>
<a rel="nofollow" href="http://aws.amazon.com/dynamodb/">Amazon DynamoDB</a> - distributed key/value store, implementation of Dynamo paper.</li>
<li>
<a rel="nofollow" href="https://github.com/couchbaselabs/forestdb">Couchbase ForestDB</a> - Fast Key-Value Storage Engine Based on Hierarchical B+-Tree Trie.</li>
<li>
<a rel="nofollow" href="http://inaka.github.io/edis/">Edis</a> - protocol-compatible server replacement for Redis.</li>
<li>
<a rel="nofollow" href="https://github.com/nathanmarz/elephantdb">ElephantDB</a> - Distributed database specialized in exporting data from Hadoop.</li>
<li>
<a rel="nofollow" href="http://geteventstore.com">EventStore</a> - distributed time series database.</li>
<li>
<a rel="nofollow" href="http://hyperdex.org/">HyperDex</a> - next generation key-value store.</li>
<li>
<a rel="nofollow" href="http://sourceforge.net/projects/kai/">KAI</a> - a distributed key-value datastore.</li>
<li>
<a rel="nofollow" href="https://github.com/linkedin-sna/sna-page/tree/master/krati">LinkedIn Krati</a> - simple persistent data store with very low latency and high throughput.</li>
<li>
<a rel="nofollow" href="http://www.project-voldemort.com/voldemort/">LinkedIn Voldemort</a> - distributed key/value storage system.</li>
<li>
<a rel="nofollow" href="http://memcachedb.org/">MemcacheDB</a> - a distributed key-value storage system designed for persistence.</li>
<li>
<a rel="nofollow" href="http://techblog.netflix.com/2014/03/netflixoss-season-2-episode-1.html">Netflix Dynomite</a> - thin Dynamo-based replication for cached data.</li>
<li>
<a rel="nofollow" href="http://www.oracle.com/technetwork/database/database-technologies/nosqldb/overview/index.html">Oracle NoSQL Database</a> - distributed key-value database by Oracle Corporation.</li>
<li>
<a rel="nofollow" href="https://ramcloud.atlassian.net/wiki/display/RAM/RAMCloud">RAMCloud</a> - storage system that provides large-scale low-latency storage by keeping all data in DRAM all the time and aggregating the main memories of thousands of servers.</li>
<li>
<a rel="nofollow" href="http://redis.io">Redis</a> - in memory key value datastore.</li>
<li>
<a rel="nofollow" href="http://redis.io/topics/cluster-spec">Redis Cluster</a> - distributed implementation of Redis.</li>
<li>
<a rel="nofollow" href="http://redis.io/topics/sentinel">Redis Sentinel</a> - system designed to help manage Redis instances.</li>
<li>
<a rel="nofollow" href="https://github.com/basho/riak">Riak</a> - a decentralized datastore.</li>
<li>
<a rel="nofollow" href="https://code.google.com/p/scalaris/">Scalaris</a> - a distributed transactional key-value store.</li>
<li>
<a rel="nofollow" href="https://github.com/twitter/storehaus">Storehaus</a> - library to work with asynchronous key value stores, by Twitter.</li>
<li>
<a rel="nofollow" href="https://github.com/tarantool/tarantool">Tarantool</a> - an efficient NoSQL database and a Lua application server.</li>
<li>
<a rel="nofollow" href="https://github.com/Treode/store">TreodeDB</a> - key-value store that's replicated and sharded and provides atomic multirow writes.</li>
<li>
<a rel="nofollow" href="http://en.wikipedia.org/wiki/Yahoo_Sherpa">Yahoo Sherpa</a> - hosted, distributed and geographically replicated key-value cloud storage platform.</li>
</ul>
<h4>Graph Data Model</h4>
<ul>
<li>
<a rel="nofollow" href="http://giraph.apache.org/">Apache Giraph</a> - implementation of Pregel, based on Hadoop.</li>
<li>
<a rel="nofollow" href="http://spark.incubator.apache.org/docs/0.7.3/bagel-programming-guide.html">Apache Spark Bagel</a> - implementation of Pregel, part of Spark.</li>
<li>
<a rel="nofollow" href="https://www.arangodb.org/">ArangoDB</a> - multi-model distributed database.</li>
<li>
<a rel="nofollow" href="https://www.facebook.com/notes/facebook-engineering/tao-the-power-of-the-graph/10151525983993920">Facebook TAO</a> - distributed data store widely used at Facebook to store and serve the social graph.</li>
<li>
<a rel="nofollow" href="http://thinkaurelius.github.io/faunus/">Faunus</a> - Hadoop-based graph analytics engine for analyzing graphs represented across a multi-machine compute cluster.</li>
<li>
<a rel="nofollow" href="https://github.com/google/cayley">Google Cayley</a> - open-source graph database.</li>
<li>
<a rel="nofollow" href="http://kowshik.github.io/JPregel/pregel_paper.pdf">Google Pregel</a> - graph processing framework.</li>
<li>
<a rel="nofollow" href="http://graphlab.org/projects/source.html">GraphLab PowerGraph</a> - a core C++ GraphLab API and a collection of high-performance machine learning and data mining toolkits built on top of the GraphLab API.</li>
<li>
<a rel="nofollow" href="https://amplab.cs.berkeley.edu/publication/graphx-grades/">GraphX</a> - resilient Distributed Graph System on Spark.</li>
<li>
<a rel="nofollow" href="https://github.com/tinkerpop/gremlin">Gremlin</a> - graph traversal Language.</li>
<li>
<a rel="nofollow" href="http://www.hypergraphdb.org/">HyperGraphDB</a> - general purpose, open-source data storage mechanism based on a powerful knowledge management formalism known as directed hypergraphs.</li>
<li>
<a rel="nofollow" href="http://www.objectivity.com/infinitegraph">InfiniteGraph</a> - distributed graph database.</li>
<li>
<a rel="nofollow" href="https://github.com/paulhoule/infovore">Infovore</a> - RDF-centric Map/Reduce framework.</li>
<li>
<a rel="nofollow" href="https://01.org/graphbuilder/">Intel GraphBuilder</a> - tools to construct large-scale graphs on top of Hadoop.</li>
<li>
<a rel="nofollow" href="http://mapgraph.io/">MapGraph</a> - Massively Parallel Graph processing on GPUs.</li>
<li>
<a rel="nofollow" href="http://www.neo4j.org/">Neo4j</a> - graph database written entirely in Java.</li>
<li>
<a rel="nofollow" href="http://www.orientechnologies.com/">OrientDB</a> - document and graph database.</li>
<li>
<a rel="nofollow" href="https://github.com/xslogic/phoebus">Phoebus</a> - framework for large scale graph processing.</li>
<li>
<a rel="nofollow" href="http://www.sparsity-technologies.com/">Sparksee</a> - scalable high-performance graph database.</li>
<li>
<a rel="nofollow" href="http://stardog.com/">Stardog</a> - graph database: search, query, reasoning, and constraints in a lightweight, pure Java system.</li>
<li>
<a rel="nofollow" href="http://thinkaurelius.github.io/titan/">Titan</a> - distributed graph database, built over Cassandra.</li>
<li>
<a rel="nofollow" href="https://github.com/twitter/flockdb">Twitter FlockDB</a> - distributed graph database.</li>
</ul>
<h4>NewSQL Databases</h4>
<ul>
<li>
<a rel="nofollow" href="http://www.actian.com/products/operational-databases/">Actian Ingres</a> - commercially supported, open-source SQL relational database management system.</li>
<li>
<a rel="nofollow" href="http://probcomp.csail.mit.edu/bayesdb/index.html">BayesDB</a> - statistic oriented SQL database.</li>
<li>
<a rel="nofollow" href="https://github.com/cockroachdb/cockroach">Cockroach</a> - Scalable, Geo-Replicated, Transactional Datastore.</li>
<li>
<a rel="nofollow" href="http://www.datomic.com/">Datomic</a> - distributed database designed to enable scalable, flexible and intelligent applications.</li>
<li>
<a rel="nofollow" href="https://foundationdb.com/">FoundationDB</a> - distributed database, inspired by F1.</li>
<li>
<a rel="nofollow" href="http://research.google.com/pubs/pub41344.html">Google F1</a> - distributed SQL database built on Spanner.</li>
<li>
<a rel="nofollow" href="http://research.google.com/archive/spanner.html">Google Spanner</a> - globally distributed semi-relational database.</li>
<li>
<a rel="nofollow" href="http://hstore.cs.brown.edu/">H-Store</a> - experimental main-memory, parallel database management system optimized for on-line transaction processing (OLTP) applications.</li>
<li>
<a rel="nofollow" href="http://www.percona.com/doc/percona-server/5.5/performance/handlersocket.html">HandlerSocket</a> - NoSQL plugin for MySQL/MariaDB.</li>
<li>
<a rel="nofollow" href="http://www.ibm.com/software/data/db2/">IBM DB2</a> - object-relational database management system.</li>
<li>
<a rel="nofollow" href="http://www.infinisql.org/">InfiniSQL</a> - infinitely scalable RDBMS.</li>
<li>
<a rel="nofollow" href="http://www.memsql.com/">MemSQL</a> - in-memory SQL database with optimized columnar storage on flash.</li>
<li>
<a rel="nofollow" href="http://www.nuodb.com/">NuoDB</a> - SQL/ACID compliant distributed database.</li>
<li>
<a rel="nofollow" href="http://www.oracle.com/us/corporate/features/database-12c/index.html">Oracle Database</a> - object-relational database management system.</li>
<li>
<a rel="nofollow" href="http://www.oracle.com/technetwork/database/database-technologies/timesten/overview/index.html">Oracle TimesTen in-Memory Database</a> - in-memory, relational database management system with persistence and recoverability.</li>
<li>
<a rel="nofollow" href="http://gemfirexd.docs.gopivotal.com/latest/userguide/index.html?q=about_users_guide.html/">Pivotal GemFire XD</a> - Low-latency, in-memory, distributed SQL data store. Provides SQL interface to in-memory table data, persistable in HDFS.</li>
<li>
<a rel="nofollow" href="http://www.saphana.com/welcome">SAP HANA</a> - in-memory, column-oriented, relational database management system.</li>
<li>
<a rel="nofollow" href="http://senseidb.com/">SenseiDB</a> - distributed, realtime, semi-structured database.</li>
<li>
<a rel="nofollow" href="http://skydb.io/">Sky</a> - database used for flexible, high performance analysis of behavioral data.</li>
<li>
<a rel="nofollow" href="http://www.symmetricds.org/">SymmetricDS</a> - open source software for both file and database synchronization.</li>
<li>
<a rel="nofollow" href="http://it.teradata.com/products-and-services/Teradata-Database/">Teradata Database</a> - complete relational database management system.</li>
<li>
<a rel="nofollow" href="http://voltdb.com/">VoltDB</a> - in-memory NewSQL database.</li>
</ul>
<h4>Columnar Databases</h4>
<ul>
<li>
<a rel="nofollow" href="http://aws.amazon.com/redshift/">Amazon RedShift</a> - data warehouse service, based on PostgreSQL.</li>
<li>
<a rel="nofollow" href="http://db.lcs.mit.edu/projects/cstore/">C-Store</a> - column oriented DBMS.</li>
<li>
<a rel="nofollow" href="http://research.google.com/pubs/pub36632.html">Google BigQuery</a> - framework for interactive analysis, implementation of Dremel.</li>
<li>
<a rel="nofollow" href="http://research.google.com/pubs/pub36632.html">Google Dremel</a> - framework for interactive analysis.</li>
<li>
<a rel="nofollow" href="https://www.monetdb.org/">MonetDB</a> - column store database.</li>
<li>
<a rel="nofollow" href="http://parquet.io/">Parquet</a> - columnar storage format for Hadoop.</li>
<li>
<a rel="nofollow" href="https://www.pivotal.io/big-data/pivotal-greenplum-database">Pivotal Greenplum</a> - purpose-built, dedicated analytic data warehouse.</li>
<li>
<a rel="nofollow" href="http://www.vertica.com/">Vertica</a> - designed to manage large, fast-growing volumes of data and provide very fast query performance when used for data warehouses.</li>
</ul>
<h4>Time-Series Databases</h4>
<ul>
<li>
<a rel="nofollow" href="http://square.github.io/cube/">Cube</a> - uses MongoDB to store time series data.</li>
<li>
<a rel="nofollow" href="https://github.com/etsy/statsd/">Etsy StatsD</a> - simple daemon for easy stats aggregation.</li>
<li>
<a rel="nofollow" href="http://influxdb.com/">InfluxDB</a> - distributed time series database.</li>
<li>
<a rel="nofollow" href="https://code.google.com/p/kairosdb/">KairosDB</a> - similar to OpenTSDB but allows for Cassandra.</li>
<li>
<a rel="nofollow" href="http://opentsdb.net">OpenTSDB</a> - distributed time series database on top of HBase.</li>
<li>
<a rel="nofollow" href="http://square.github.io/cube/">Square Cube</a> - system for collecting timestamped events and deriving metrics.</li>
<li>
<a rel="nofollow" href="https://tempoiq.com/">TempoIQ</a> - Cloud-based sensor analytics.</li>
</ul>
<h4>SQL-like processing</h4>
<ul>
<li>
<a rel="nofollow" href="http://www.actian.com/products/analytics-platform/">Actian SQL for Hadoop</a> - high performance interactive SQL access to all Hadoop data.</li>
<li>
<a rel="nofollow" href="https://github.com/amplab/shark/">AMPLab Shark</a> - data warehouse system for Spark.</li>
<li>
<a rel="nofollow" href="http://incubator.apache.org/drill/">Apache Drill</a> - framework for interactive analysis, inspired by Dremel.</li>
<li>
<a rel="nofollow" href="http://hive.apache.org/docs/hcat_r0.5.0/">Apache HCatalog</a> - table and storage management layer for Hadoop.</li>
<li>
<a rel="nofollow" href="http://hive.apache.org/">Apache Hive</a> - SQL-like data warehouse system for Hadoop.</li>
<li>
<a rel="nofollow" href="https://wiki.apache.org/incubator/OptiqProposal">Apache Optiq</a> - framework that allows efficient translation of queries involving heterogeneous and federated data.</li>
<li>
<a rel="nofollow" href="http://phoenix.incubator.apache.org/index.html">Apache Phoenix</a> - SQL skin over HBase.</li>
<li>
<a rel="nofollow" href="http://blinkdb.org/">BlinkDB</a> - massively parallel, approximate query engine.</li>
<li>
<a rel="nofollow" href="http://www.cloudera.com/content/cloudera/en/products-and-services/cdh/impala.html">Cloudera Impala</a> - framework for interactive analysis, inspired by Dremel.</li>
<li>
<a rel="nofollow" href="http://www.cascading.org/lingual/">Concurrent Lingual</a> - SQL-like query language for Cascading.</li>
<li>
<a rel="nofollow" href="http://www.datasalt.com/products/splout-sql/">Datasalt Splout SQL</a> - full SQL query engine for big datasets.</li>
<li>
<a rel="nofollow" href="http://www.kylin.io/">eBay Kylin</a> - Distributed Analytics Engine from eBay Inc. that provides SQL interface and multi-dimensional analysis (OLAP) on Hadoop supporting extremely large datasets.</li>
<li>
<a rel="nofollow" href="http://prestodb.io/">Facebook PrestoDB</a> - distributed SQL query engine.</li>
<li>
<a rel="nofollow" href="http://hadapt.com/">Hadapt</a> - a native implementation of SQL for the Apache Hadoop open-source project.</li>
<li>
<a rel="nofollow" href="http://jethrodata.com/product-2/product/">JethroData</a> - index-based SQL engine for Hadoop.</li>
<li>
<a rel="nofollow" href="https://metanautix.com/product/">Metanautix Quest</a> - data compute engine.</li>
<li>
<a rel="nofollow" href="http://www.gopivotal.com/pivotal-products/data/pivotal-hd">Pivotal HAWQ</a> - SQL-like data warehouse system for Hadoop.</li>
<li>
<a rel="nofollow" href="http://rainstor.com/products/rainstor-database/">RainstorDB</a> - database for storing petabyte-scale volumes of structured and semi-structured data.</li>
<li>
<a rel="nofollow" href="https://github.com/apache/spark/tree/master/sql">Spark Catalyst</a> - query optimization framework for Spark and Shark.</li>
<li>
<a rel="nofollow" href="http://databricks.com/blog/2014/03/26/Spark-SQL-manipulating-structured-data-using-Spark.html">SparkSQL</a> - Manipulating Structured Data Using Spark.</li>
<li>
<a rel="nofollow" href="http://www.splicemachine.com/">Splice Machine</a> - a full-featured SQL-on-Hadoop RDBMS with ACID transactions.</li>
<li>
<a rel="nofollow" href="http://hortonworks.com/labs/stinger/">Stinger</a> - interactive query for Hive.</li>
<li>
<a rel="nofollow" href="http://tajo.incubator.apache.org/">Tajo</a> - distributed data warehouse system on Hadoop.</li>
<li>
<a rel="nofollow" href="https://wiki.trafodion.org/wiki/index.php/Main_Page">Trafodion</a> - enterprise-class SQL-on-HBase solution targeting big data transactional or operational workloads.</li>
</ul>
<h4>Integrated Development Environments</h4>
<ul>
<li>
<a rel="nofollow" href="https://github.com/rstudio/rstudio">RStudio</a> - IDE for R.</li>
</ul>
<h4>Data Ingestion</h4>
<ul>
<li>
<a rel="nofollow" href="http://aws.amazon.com/kinesis/">Amazon Kinesis</a> - real-time processing of streaming data at massive scale.</li>
<li>
<a rel="nofollow" href="http://zookeeper.apache.org/bookkeeper/">Apache BookKeeper</a> - distributed logging service; Hedwig, a distributed publish/subscribe system, is built on top of it.</li>
<li>
<a rel="nofollow" href="http://incubator.apache.org/chukwa/">Apache Chukwa</a> - data collection system.</li>
<li>
<a rel="nofollow" href="http://flume.apache.org/">Apache Flume</a> - service to manage large amounts of log data.</li>
<li>
<a rel="nofollow" href="http://samza.incubator.apache.org/">Apache Samza</a> - stream processing framework, based on Kafka and YARN.</li>
<li>
<a rel="nofollow" href="http://sqoop.apache.org/">Apache Sqoop</a> - tool to transfer data between Hadoop and a structured datastore.</li>
<li>
<a rel="nofollow" href="https://uima.apache.org/">Apache UIMA</a> - Unstructured Information Management applications are software systems that analyze large volumes of unstructured information in order to discover knowledge that is relevant to an end user.</li>
<li>
<a rel="nofollow" href="https://github.com/cloudera/cdk/tree/master/cdk-morphlines">Cloudera Morphlines</a> - framework that helps with ETL to Solr, HBase and HDFS.</li>
<li>
<a rel="nofollow" href="https://github.com/facebook/scribe">Facebook Scribe</a> - streamed log data aggregator.</li>
<li>
<a rel="nofollow" href="http://fluentd.org/">Fluentd</a> - tool to collect events and logs.</li>
<li>
<a rel="nofollow" href="http://research.google.com/pubs/pub41318.html">Google Photon</a> - geographically distributed system for joining multiple continuously flowing streams of data in real-time with high scalability and low latency.</li>
<li>
<a rel="nofollow" href="https://github.com/mozilla-services/heka">Heka</a> - open source stream processing software system.</li>
<li>
<a rel="nofollow" href="https://github.com/sonalgoyal/hiho">HIHO</a> - framework for connecting disparate data sources with Hadoop.</li>
<li>
<a rel="nofollow" href="https://github.com/linkedin/camus">LinkedIn Camus</a> - Kafka to HDFS pipeline. It is a MapReduce job that does distributed data loads out of Kafka.</li>
<li>
<a rel="nofollow" href="http://data.linkedin.com/projects/databus">LinkedIn Databus</a> - stream of change capture events for a database.</li>
<li>
<a rel="nofollow" href="http://engineering.linkedin.com/data-ingestion/gobblin-big-data-ease">LinkedIn Gobblin</a> - framework for solving big data ingestion problems.</li>
<li>
<a rel="nofollow" href="https://github.com/linkedin/kamikaze">LinkedIn Kamikaze</a> - utility package for compressing sorted integer arrays.</li>
<li>
<a rel="nofollow" href="http://www.slideshare.net/Hadoop_Summit/th-220p230-cramachandranv1">LinkedIn Lumos</a> - bridge from OLTP to OLAP for use on Hadoop.</li>
<li>
<a rel="nofollow" href="https://github.com/linkedin/white-elephant">LinkedIn White Elephant</a> - log aggregator and dashboard.</li>
<li>
<a rel="nofollow" href="http://logstash.net">Logstash</a> - a tool for managing events and logs.</li>
<li>
<a rel="nofollow" href="https://github.com/Netflix/suro">Netflix Suro</a> - data pipeline service for collecting, aggregating, and dispatching large volume of application events including log data based on Chukwa.</li>
<li>
<a rel="nofollow" href="https://github.com/pinterest/secor">Pinterest Secor</a> - service implementing Kafka log persistence.</li>
<li>
<a rel="nofollow" href="http://cloudera.github.io/RecordBreaker/">Record Breaker</a> - Automatic structure for your text-formatted data.</li>
<li>
<a rel="nofollow" href="http://www.tibco.com/products/automation/enterprise-messaging/enterprise-message-service">TIBCO Enterprise Message Service</a> - standards-based messaging middleware.</li>
<li>
<a rel="nofollow" href="https://github.com/twitter/zipkin">Twitter Zipkin</a> - distributed tracing system that helps us gather timing data for all the disparate services at Twitter.</li>
<li>
<a rel="nofollow" href="http://www.informatica.com/us/products/big-data/vibe-data-stream/">Vibe Data Stream</a> - streaming data collection for real-time Big Data analytics.</li>
</ul>
<h4>Message-oriented middleware</h4>
<ul>
<li>
<a rel="nofollow" href="http://activemq.apache.org/">ActiveMQ</a> - open source messaging and Integration Patterns server.</li>
<li>
<a rel="nofollow" href="http://aws.amazon.com/sqs/">Amazon Simple Queue Service</a> - fast, reliable, scalable, fully managed queue service.</li>
<li>
<a rel="nofollow" href="http://kafka.apache.org/">Apache Kafka</a> - distributed publish-subscribe messaging system.</li>
<li>
<a rel="nofollow" href="http://qpid.apache.org/">Apache Qpid</a> - messaging tools that speak AMQP and support many languages and platforms.</li>
<li>
<a rel="nofollow" href="http://activemq.apache.org/apollo/">Apollo</a> - ActiveMQ's next generation of messaging.</li>
<li>
<a rel="nofollow" href="http://kr.github.io/beanstalkd/">Beanstalkd</a> - simple, fast work queue.</li>
<li>
<a rel="nofollow" href="https://github.com/bitly/nsq">Bit.ly NSQ</a> - realtime distributed message processing at scale.</li>
<li>
<a rel="nofollow" href="http://www.celeryproject.org/">Celery</a> - Distributed Task Queue.</li>
<li>
<a rel="nofollow" href="http://www.crossroads.io/">Crossroads I/O</a> - library for building scalable and high performance distributed applications.</li>
<li>
<a rel="nofollow" href="https://github.com/wavii/darner">Darner</a> - simple, lightweight message queue.</li>
<li>
<a rel="nofollow" href="https://code.facebook.com/posts/820258981365363/building-mobile-first-infrastructure-for-messenger/">Facebook Iris</a> - a totally ordered queue of messaging updates with separate pointers into the queue indicating the last update sent to your Messenger app and the traditional storage tier.</li>
<li>
<a rel="nofollow" href="http://gearman.org">Gearman</a> - Job Server.</li>
<li>
<a rel="nofollow" href="http://www.jboss.org/hornetq">HornetQ</a> - open source project to build a multi-protocol, embeddable, very high performance, clustered, asynchronous messaging system.</li>
<li>
<a rel="nofollow" href="http://www.iron.io/mq">IronMQ</a> - easy-to-use highly available message queuing service.</li>
<li>
<a rel="nofollow" href="http://robey.github.io/kestrel/">Kestrel</a> - distributed message queue system.</li>
<li>
<a rel="nofollow" href="https://wiki.openstack.org/wiki/Marconi">Marconi</a> - queuing and notification service made by and for OpenStack, but not only for it.</li>
<li>
<a rel="nofollow" href="http://www.rabbitmq.com/">RabbitMQ</a> - Robust messaging for applications.</li>
<li>
<a rel="nofollow" href="http://restmq.com/">RestMQ</a> - message queue which uses HTTP as transport, JSON to format a minimalist protocol and is organized as REST resources.</li>
<li>
<a rel="nofollow" href="http://python-rq.org/">RQ</a> - simple Python library for queueing jobs and processing them in the background with workers.</li>
<li>
<a rel="nofollow" href="http://sidekiq.org/">Sidekiq</a> - Simple, efficient background processing for Ruby.</li>
<li>
<a rel="nofollow" href="http://www.zeromq.org/">ZeroMQ</a> - The Intelligent Transport Layer.</li>
</ul>
<h4>Service Programming</h4>
<ul>
<li>
<a rel="nofollow" href="http://akka.io/">Akka Toolkit</a> - runtime for distributed, fault-tolerant, event-driven applications on the JVM.</li>
<li>
<a rel="nofollow" href="http://avro.apache.org/">Apache Avro</a> - data serialization system.</li>
<li>
<a rel="nofollow" href="http://curator.apache.org/">Apache Curator</a> - Java libraries for Apache ZooKeeper.</li>
<li>
<a rel="nofollow" href="http://karaf.apache.org/">Apache Karaf</a> - OSGi runtime that runs on top of any OSGi framework.</li>
<li>
<a rel="nofollow" href="http://thrift.apache.org//">Apache Thrift</a> - framework to build binary protocols.</li>
<li>
<a rel="nofollow" href="http://zookeeper.apache.org/">Apache ZooKeeper</a> - centralized service for process management.</li>
<li>
<a rel="nofollow" href="http://research.google.com/archive/chubby.html">Google Chubby</a> - a lock service for loosely-coupled distributed systems.</li>
<li>
<a rel="nofollow" href="http://data.linkedin.com/opensource/norbert">Linkedin Norbert</a> - cluster manager.</li>
<li>
<a rel="nofollow" href="http://www.mpich.org/">MPICH</a> - high performance and widely portable implementation of the Message Passing Interface (MPI) standard.</li>
<li>
<a rel="nofollow" href="http://www.open-mpi.org/">OpenMPI</a> - message passing framework.</li>
<li>
<a rel="nofollow" href="http://www.serfdom.io/">Serf</a> - decentralized solution for service discovery and orchestration.</li>
<li>
<a rel="nofollow" href="https://github.com/spotify/luigi">Spotify Luigi</a> - a Python package for building complex pipelines of batch jobs. It handles dependency resolution, workflow management, visualization, handling failures, command line integration, and much more.</li>
<li>
<a rel="nofollow" href="https://github.com/spring-projects/spring-xd">Spring XD</a> - distributed and extensible system for data ingestion, real time analytics, batch processing, and data export.</li>
<li>
<a rel="nofollow" href="https://github.com/kevinweil/elephant-bird">Twitter Elephant Bird</a> - libraries for working with LZOP-compressed data.</li>
<li>
<a rel="nofollow" href="https://twitter.github.io/finagle/">Twitter Finagle</a> - asynchronous network stack for the JVM.</li>
</ul>
<h4>Scheduling</h4>
<ul>
<li>
<a rel="nofollow" href="http://aurora.incubator.apache.org/">Apache Aurora</a> - service scheduler that runs on top of Apache Mesos.</li>
<li>
<a rel="nofollow" href="http://falcon.incubator.apache.org/">Apache Falcon</a> - data management framework.</li>
<li>
<a rel="nofollow" href="http://oozie.apache.org/">Apache Oozie</a> - workflow job scheduler.</li>
<li>
<a rel="nofollow" href="http://airbnb.github.io/chronos/">Chronos</a> - distributed and fault-tolerant scheduler.</li>
<li>
<a rel="nofollow" href="http://azkaban.github.io/azkaban2/">Linkedin Azkaban</a> - batch workflow job scheduler.</li>
<li>
<a rel="nofollow" href="http://engineering.pinterest.com/post/74429563460/pinball-building-workflow-management">Pinterest Pinball</a> - customizable platform for creating workflow managers.</li>
<li>
<a rel="nofollow" href="https://github.com/radlab/sparrow">Sparrow</a> - scheduling platform.</li>
</ul>
<h4>Machine Learning</h4>
<ul>
<li>
<a rel="nofollow" href="http://mahout.apache.org/">Apache Mahout</a> - machine learning library for Hadoop.</li>
<li>
<a rel="nofollow" href="http://www.ayasdi.com/">Ayasdi Core</a> - tool for topological data analysis.</li>
<li>
<a rel="nofollow" href="https://github.com/harthur/brain">brain</a> - Neural networks in JavaScript.</li>
<li>
<a rel="nofollow" href="https://github.com/cloudera/oryx">Cloudera Oryx</a> - real-time large-scale machine learning.</li>
<li>
<a rel="nofollow" href="http://www.cascading.org/pattern/">Concurrent Pattern</a> - machine learning library for Cascading.</li>
<li>
<a rel="nofollow" href="https://github.com/karpathy/convnetjs">convnetjs</a> - Deep Learning in Javascript. Train Convolutional Neural Networks (or ordinary ones) in your browser.</li>
<li>
<a rel="nofollow" href="http://devblogs.nvidia.com/parallelforall/accelerate-machine-learning-cudnn-deep-neural-network-library/">cuDNN</a> - GPU-accelerated library of primitives for deep neural networks.</li>
<li>
<a rel="nofollow" href="https://github.com/danielsdeleo/Decider">Decider</a> - Flexible and Extensible Machine Learning in Ruby.</li>
<li>
<a rel="nofollow" href="http://www.etcml.com/">etcML</a> - text classification with machine learning.</li>
<li>
<a rel="nofollow" href="https://github.com/etsy/Conjecture">Etsy Conjecture</a> - scalable Machine Learning in Scalding.</li>
<li>
<a rel="nofollow" href="http://users.soe.ucsc.edu/~niejiazhong/slides/chandra.pdf">Google Sibyl</a> - System for Large Scale Machine Learning at Google.</li>
<li>
<a rel="nofollow" href="http://0xdata.github.io/h2o/">H2O</a> - statistical, machine learning and math runtime for Hadoop.</li>
<li>
<a rel="nofollow" href="http://www.ibm.com/smarterplanet/us/en/ibmwatson/">IBM Watson</a> - cognitive computing system.</li>
<li>
<a rel="nofollow" href="https://github.com/linkedin/ml-ease">LinkedIn ml-ease</a> - ADMM based large scale logistic regression.</li>
<li>
<a rel="nofollow" href="http://www.mlbase.org/">MLbase</a> - distributed machine learning libraries for the BDAS stack.</li>
<li>
<a rel="nofollow" href="https://github.com/nikolaypavlov/MLPNeuralNet">MLPNeuralNet</a> - Fast multilayer perceptron neural network library for iOS and Mac OS X.</li>
<li>
<a rel="nofollow" href="https://github.com/numenta/nupic">nupic</a> - Numenta Platform for Intelligent Computing: a brain-inspired machine intelligence platform, and biologically accurate neural network based on cortical learning algorithms.</li>
<li>
<a rel="nofollow" href="http://prediction.io/">PredictionIO</a> - machine learning server built on Hadoop, Mahout and Cascading.</li>
<li>
<a rel="nofollow" href="https://github.com/scikit-learn/scikit-learn">scikit-learn</a> - scikit-learn: machine learning in Python.</li>
<li>
<a rel="nofollow" href="http://spark.apache.org/docs/0.9.0/mllib-guide.html">Spark MLlib</a> - a Spark implementation of some common machine learning (ML) functionality.</li>
<li>
<a rel="nofollow" href="http://databricks.com/blog/2014/06/30/sparkling-water-h20-spark.html">Sparkling Water</a> - combine H2O's Machine Learning capabilities with the power of the Spark platform.</li>
<li>
<a rel="nofollow" href="http://deeplearning.net/software/theano/">Theano</a> - Python package for deep learning that can utilize NVIDIA's CUDA toolkit to run on the GPU.</li>
<li>
<a rel="nofollow" href="http://thefreemanlab.com/thunder/">Thunder</a> - Large-scale analysis of neural data.</li>
<li>
<a rel="nofollow" href="https://github.com/Ganglion/varaha">Varaha</a> - Machine learning and natural language processing with Apache Pig.</li>
<li>
<a rel="nofollow" href="http://viv.ai/">Viv</a> - global platform that enables developers to plug into and create an intelligent, conversational interface to anything.</li>
<li>
<a rel="nofollow" href="https://github.com/JohnLangford/vowpal_wabbit/wiki">Vowpal Wabbit</a> - learning system sponsored by Microsoft and Yahoo!.</li>
<li>
<a rel="nofollow" href="http://www.cs.waikato.ac.nz/ml/weka/">WEKA</a> - suite of machine learning software.</li>
<li>
<a rel="nofollow" href="https://wit.ai/">Wit</a> - Natural Language for the Internet of Things.</li>
<li>
<a rel="nofollow" href="http://www.wolframalpha.com/">Wolfram Alpha</a> - computational knowledge engine.</li>
<li>
<a rel="nofollow" href="https://yhathq.com/products/scienceops">YHat ScienceOps</a> - platform for deploying, managing, and scaling predictive models in production applications.</li>
</ul>
<h4>Benchmarking</h4>
<ul>
<li>
<a rel="nofollow" href="https://issues.apache.org/jira/browse/MAPREDUCE-3561">Apache Hadoop Benchmarking</a> - micro-benchmarks for testing Hadoop performance.</li>
<li>
<a rel="nofollow" href="https://github.com/SWIMProjectUCB/SWIM/wiki">Berkeley SWIM Benchmark</a> - real-world big data workload benchmark.</li>
<li>
<a rel="nofollow" href="https://github.com/intel-hadoop/Big-Bench">Big-Bench</a> - Big Bench Workload Development.</li>
<li>
<a rel="nofollow" href="https://github.com/yhuai/hive-benchmarks">Hive-benchmarks</a> - some benchmarking queries for Apache Hive.</li>
<li>
<a rel="nofollow" href="https://github.com/cartershanklin/hive-testbench">Hive-testbench</a> - Testbench for experimenting with Apache Hive at any data scale.</li>
<li>
<a rel="nofollow" href="https://github.com/intel-hadoop/HiBench">Intel HiBench</a> - a Hadoop benchmark suite.</li>
<li>
<a rel="nofollow" href="https://hadoopsummit.uservoice.com/forums/242807-hadoop-deployment-operations-track/suggestions/5568461-inviso-maximizing-big-data-performance-at-netflix">Netflix Inviso</a> - performance focused Big Data tool.</li>
<li>
<a rel="nofollow" href="https://issues.apache.org/jira/browse/MAPREDUCE-5116">PUMA Benchmarking</a> - benchmark suite for MapReduce applications.</li>
<li>
<a rel="nofollow" href="https://developer.yahoo.com/blogs/hadoop/gridmix3-emulating-production-workload-apache-hadoop-450.html">Yahoo Gridmix3</a> - Hadoop cluster benchmarking from the Yahoo engineering team.</li>
</ul>
<h4>Security</h4>
<ul>
<li>
<a rel="nofollow" href="http://argus.incubator.apache.org/">Apache Argus</a> - framework to enable, monitor and manage comprehensive data security across the Hadoop platform.</li>
<li>
<a rel="nofollow" href="http://knox.apache.org/">Apache Knox Gateway</a> - single point of secure access for Hadoop clusters.</li>
<li>
<a rel="nofollow" href="http://incubator.apache.org/projects/sentry.html">Apache Sentry</a> - security module for data stored in Hadoop.</li>
<li>
<a rel="nofollow" href="https://github.com/packetloop/packetpig">PacketPig</a> - Open Source Big Data Security Analytics.</li>
<li>
<a rel="nofollow" href="http://www.voltage.com/products/securedata-enterprise/">Voltage SecureData</a> - data protection framework.</li>
</ul>
<h4>System Deployment</h4>
<ul>
<li>
<a rel="nofollow" href="https://github.com/impetus-opensource/ankush">Ankush</a> - A big data cluster management tool that creates and manages clusters of different technologies.</li>
<li>
<a rel="nofollow" href="http://ambari.apache.org/">Apache Ambari</a> - operational framework for Hadoop management.</li>
<li>
<a rel="nofollow" href="http://bigtop.apache.org//">Apache Bigtop</a> - system deployment framework for the Hadoop ecosystem.</li>
<li>
<a rel="nofollow" href="http://helix.apache.org/">Apache Helix</a> - cluster management framework.</li>
<li>
<a rel="nofollow" href="http://mesos.apache.org/">Apache Mesos</a> - cluster manager.</li>
<li>
<a rel="nofollow" href="https://github.com/hortonworks/slider">Apache Slider</a> - YARN application to deploy existing distributed applications on YARN.</li>
<li>
<a rel="nofollow" href="http://whirr.apache.org/">Apache Whirr</a> - set of libraries for running cloud services.</li>
<li>
<a rel="nofollow" href="http://hortonworks.com/hadoop/yarn/">Apache YARN</a> - Cluster manager.</li>
<li>
<a rel="nofollow" href="http://brooklyncentral.github.io/">Brooklyn</a> - library that simplifies application deployment and management.</li>
<li>
<a rel="nofollow" href="http://buildoop.github.io/">Buildoop</a> - similar to Apache Bigtop, based on the Groovy language.</li>
<li>
<a rel="nofollow" href="http://www.cloudera.com/content/cloudera/en/products-and-services/director.html">Cloudera Director</a> - a comprehensive data management platform with the flexibility and power to evolve with your business.</li>
<li>
<a rel="nofollow" href="http://gethue.com/">Cloudera HUE</a> - web application for interacting with Hadoop.</li>
<li>
<a rel="nofollow" href="https://github.com/mesosphere/deimos">Deimos</a> - Mesos containerizer hooks for Docker.</li>
<li>
<a rel="nofollow" href="http://deploop.github.io/">Deploop</a> - tool for provisioning, managing and monitoring Apache Hadoop.</li>
<li>
<a rel="nofollow" href="https://code.facebook.com/posts/816473015039157/making-facebook-s-software-infrastructure-more-energy-efficient-with-autoscale/">Facebook Autoscale</a> - the load balancer will concentrate workload to a server until it has at least a medium-level workload.</li>
<li>
<a rel="nofollow" href="http://www.wired.com/2012/08/facebook-prism/">Facebook Prism</a> - multi datacenters replication system.</li>
<li>
<a rel="nofollow" href="http://ganglia.sourceforge.net/">Ganglia Monitoring System</a> - scalable distributed monitoring system for high-performance computing systems such as clusters and Grids.</li>
<li>
<a rel="nofollow" href="https://github.com/Netflix/genie">Genie</a> - RESTful APIs to run Hadoop, Hive and Pig jobs, manage multiple Hadoop resources, and perform job submissions across them.</li>
<li>
<a rel="nofollow" href="http://www.wired.com/wiredenterprise/2013/03/google-borg-twitter-mesos/all/">Google Borg</a> - job scheduling and monitoring system.</li>
<li>
<a rel="nofollow" href="https://www.youtube.com/watch?v=0ZFMlO98Jkc">Google Omega</a> - job scheduling and monitoring system.</li>
<li>
<a rel="nofollow" href="https://github.com/sentric/hannibal">Hannibal</a> - tool to help monitor and maintain HBase clusters that are configured for manual splitting.</li>
<li>
<a rel="nofollow" href="http://hortonworks.com/blog/introducing-hoya-hbase-on-yarn/">Hortonworks HOYA</a> - application that can deploy HBase cluster on YARN.</li>
<li>
<a rel="nofollow" href="http://www.jumbune.org/">Jumbune</a> - open-source product built for analyzing Hadoop clusters and MapReduce jobs.</li>
<li>
<a rel="nofollow" href="https://github.com/mesosphere/marathon">Marathon</a> - Mesos framework for long-running services.</li>
<li>
<a rel="nofollow" href="https://github.com/mesos/myriad">Myriad</a> - a Mesos framework designed for scaling YARN clusters on Mesos. Myriad can expand or shrink one or more YARN clusters in response to events as per configured rules and policies.</li>
<li>
<a rel="nofollow" href="https://github.com/Netflix/SimianArmy">Netflix SimianArmy</a> - a suite of tools for keeping your cloud operating in top form.</li>
</ul>
<h4>Container Manager</h4>
<ul>
<li>
<a rel="nofollow" href="https://aws.amazon.com/ecs/">Amazon EC2 Container Service</a> - a highly scalable, high performance container management service that supports Docker containers.</li>
<li>
<a rel="nofollow" href="https://www.docker.com/">Docker</a> - an open platform for developers and sysadmins to build, ship, and run distributed applications.</li>
<li>
<a rel="nofollow" href="http://www.fig.sh/">Fig</a> - fast, isolated development environments using Docker.</li>
<li>
<a rel="nofollow" href="https://cloud.google.com/container-engine/">Google Container Engine</a> - Run Docker containers on Google Cloud Platform, powered by Kubernetes.</li>
<li>
<a rel="nofollow" href="https://github.com/GoogleCloudPlatform/kubernetes">Kubernetes</a> - open source implementation of container cluster management.</li>
<li>
<a rel="nofollow" href="https://coreos.com/blog/rocket/">Rocket</a> - an alternative to the Docker runtime, designed for server environments with the most rigorous security and production requirements.</li>
</ul>
<h4>Applications</h4>
<ul>
<li>
<a rel="nofollow" href="https://github.com/adobe-research/spindle">Adobe Spindle</a> - Next-generation web analytics processing with Scala, Spark, and Parquet.</li>
<li>
<a rel="nofollow" href="http://www.kiji.org/">Apache Kiji</a> - framework to collect and analyze data in real-time, based on HBase.</li>
<li>
<a rel="nofollow" href="http://nutch.apache.org/">Apache Nutch</a> - open source web crawler.</li>
<li>
<a rel="nofollow" href="http://oodt.apache.org/">Apache OODT</a> - capturing, processing and sharing of data for NASA's scientific archives.</li>
<li>
<a rel="nofollow" href="https://tika.apache.org/">Apache Tika</a> - content analysis toolkit.</li>
<li>
<a rel="nofollow" href="http://www.dominoup.com/">Domino</a> - Run, scale, share, and deploy models without any infrastructure.</li>
<li>
<a rel="nofollow" href="http://www.eclipse.org/birt/">Eclipse BIRT</a> - Eclipse-based reporting system.</li>
<li>
<a rel="nofollow" href="https://github.com/Codecademy/EventHub">Eventhub</a> - open source event analytics platform.</li>
<li>
<a rel="nofollow" href="http://hipi.cs.virginia.edu/">HIPI Library</a> - API for performing image processing tasks on Hadoop's MapReduce.</li>
<li>
<a rel="nofollow" href="http://www.splunk.com/download/hunk">Hunk</a> - Splunk analytics for Hadoop.</li>
<li>
<a rel="nofollow" href="http://madlib.net/community/">MADlib</a> - data-processing library of an RDBMS to analyze data.</li>
<li>
<a rel="nofollow" href="https://github.com/gopivotal/PivotalR">PivotalR</a> - R on Pivotal HD / HAWQ and PostgreSQL.</li>
<li>
<a rel="nofollow" href="http://www.qubole.com/">Qubole</a> - auto-scaling Hadoop cluster, built-in data connectors.</li>
<li>
<a rel="nofollow" href="https://senseplatform.com/">Sense</a> - Cloud Platform for Data Science and Big Data Analytics.</li>
<li>
<a rel="nofollow" href="https://github.com/snowplow/snowplow">Snowplow</a> - enterprise-strength web and event analytics, powered by Hadoop, Kinesis, Redshift and Postgres.</li>
<li>
<a rel="nofollow" href="http://amplab-extras.github.io/SparkR-pkg/">SparkR</a> - R frontend for Spark.</li>
<li>
<a rel="nofollow" href="http://www.splunk.com/">Splunk</a> - analyzer for machine-generated data.</li>
<li>
<a rel="nofollow" href="http://www.talend.com/products/big-data">Talend</a> - unified open source environment for YARN, Hadoop, HBase, Hive, HCatalog &amp; Pig.</li>
</ul>
<h4>Search engine and framework</h4>
<ul>
<li>
<a rel="nofollow" href="https://incubator.apache.org/blur/">Apache Blur</a> - a search engine capable of querying massive amounts of structured data at incredible speeds.</li>
<li>
<a rel="nofollow" href="http://lucene.apache.org/">Apache Lucene</a> - Search engine library.</li>
<li>
<a rel="nofollow" href="http://lucene.apache.org/solr/">Apache Solr</a> - Search platform for Apache Lucene.</li>
<li>
<a rel="nofollow" href="http://www.elasticsearch.org/">ElasticSearch</a> - Search and analytics engine based on Apache Lucene.</li>
<li>
<a rel="nofollow" href="https://github.com/elasticsearch/elasticsearch-hadoop">Elasticsearch Hadoop</a> - Elasticsearch real-time search and analytics natively integrated with Hadoop. Supports Map/Reduce, Cascading, Apache Hive and Apache Pig.</li>
<li>
<a rel="nofollow" href="http://enigma.io">Enigma.io</a> - Freemium robust web application for exploring, filtering, analyzing, searching and exporting massive datasets scraped from across the Web.</li>
<li>
<a rel="nofollow" href="https://www.facebook.com/publications/219621248185635/">Facebook Unicorn</a> - social graph search platform.</li>
<li>
<a rel="nofollow" href="http://googleblog.blogspot.it/2010/06/our-new-search-index-caffeine.html">Google Caffeine</a> - continuous indexing system.</li>
<li>
<a rel="nofollow" href="http://research.google.com/pubs/pub36726.html">Google Percolator</a> - continuous indexing system.</li>
<li>
<a rel="nofollow">TeraGoogle</a> - large search index.</li>
<li>
<a rel="nofollow" href="https://github.com/VCNC/haeinsa">Haeinsa</a> - linearly scalable multi-row, multi-table transaction library for HBase based on Percolator.</li>
<li>
<a rel="nofollow" href="https://blogs.apache.org/hbase/entry/coprocessor_introduction">HBase Coprocessor</a> - implementation of Percolator, part of HBase.</li>
<li>
<a rel="nofollow" href="https://github.com/Huawei-Hadoop/hindex">hIndex</a> - Secondary Index for HBase.</li>
<li>
<a rel="nofollow" href="http://github.com/izenecloud/sf1r-lite">SF1R Search Engine</a> - distributed search engine written in C++.</li>
<li>
<a rel="nofollow" href="http://ngdata.github.io/hbase-indexer/">Lily HBase Indexer</a> - quickly and easily search for any content stored in HBase.</li>
<li>
<a rel="nofollow" href="http://senseidb.github.io/bobo/">LinkedIn Bobo</a> - faceted search implementation written purely in Java, an extension to Apache Lucene.</li>
<li>
<a rel="nofollow" href="https://github.com/linkedin/cleo">LinkedIn Cleo</a> - flexible software library for enabling rapid development of partial, out-of-order and real-time typeahead search.</li>
<li>
<a rel="nofollow" href="http://engineering.linkedin.com/search/did-you-mean-galene">LinkedIn Galene</a> - search architecture at LinkedIn.</li>
<li>
<a rel="nofollow" href="https://github.com/senseidb/zoie">LinkedIn Zoie</a> - realtime search/indexing system written in Java.</li>
<li>
<a rel="nofollow" href="http://sphinxsearch.com/">Sphinx Search Server</a> - fulltext search engine.</li>
</ul>
<h4>MySQL forks and evolutions</h4>
<ul>
<li>
<a rel="nofollow" href="https://aws.amazon.com/rds/aurora/">Amazon Aurora</a> - a MySQL-compatible, relational database engine that combines the speed and availability of high-end commercial databases with the simplicity and cost-effectiveness of open source databases.</li>
<li>
<a rel="nofollow" href="http://aws.amazon.com/rds/">Amazon RDS</a> - MySQL databases in Amazon's cloud.</li>
<li>
<a rel="nofollow" href="http://www.drizzle
Awesome JavaScript
https://segmentfault.com/a/1190000002517896
2015-01-28T09:10:50+08:00
2015-01-28T09:10:50+08:00
timger
https://segmentfault.com/u/timger
0
<h2>Awesome JavaScript</h2>
<p>A collection of awesome browser-side JavaScript libraries, resources and shiny things.</p>
<ul>
<li>
<a rel="nofollow">Awesome JavaScript</a><br><br><ul>
<li><a rel="nofollow">Package Managers</a></li>
<li><a rel="nofollow">Loaders</a></li>
<li><a rel="nofollow">Testing Frameworks</a></li>
<li><a rel="nofollow">QA Tools</a></li>
<li><a rel="nofollow">MVC Frameworks and Libraries</a></li>
<li><a rel="nofollow">Non-MVC Frameworks</a></li>
<li><a rel="nofollow">Templating Engines</a></li>
<li><a rel="nofollow">Data Visualization</a></li>
<li><a rel="nofollow">Timeline</a></li>
<li><a rel="nofollow">Editors</a></li>
<li>Utilities</li>
<li><a rel="nofollow">Files</a></li>
<li><a rel="nofollow">Functional Programming</a></li>
<li><a rel="nofollow">Reactive Programming</a></li>
<li><a rel="nofollow">Data Structure</a></li>
<li><a rel="nofollow">Date</a></li>
<li><a rel="nofollow">String</a></li>
<li><a rel="nofollow">Number</a></li>
<li><a rel="nofollow">Storage</a></li>
<li><a rel="nofollow">Color</a></li>
<li><a rel="nofollow">I18n And L10n</a></li>
<li><a rel="nofollow">Class</a></li>
<li><a rel="nofollow">Control Flow</a></li>
<li><a rel="nofollow">Routing</a></li>
<li><a rel="nofollow">Security</a></li>
<li><a rel="nofollow">Log</a></li>
<li><a rel="nofollow">RegExp</a></li>
<li><a rel="nofollow">Media</a></li>
<li><a rel="nofollow">Voice Command</a></li>
<li><a rel="nofollow">API</a></li>
<li><a rel="nofollow">Vision Detection</a></li>
<li>UI</li>
<li><a rel="nofollow">Code Highlighting</a></li>
<li><a rel="nofollow">Loading Status</a></li>
<li><a rel="nofollow">Validation</a></li>
<li><a rel="nofollow">Keyboard Wrappers</a></li>
<li><a rel="nofollow">Tours And Guides</a></li>
<li><a rel="nofollow">Notifications</a></li>
<li><a rel="nofollow">Sliders</a></li>
<li><a rel="nofollow">Range Sliders</a></li>
<li><a rel="nofollow">Form Widgets</a></li>
<li><a rel="nofollow">Tips</a></li>
<li><a rel="nofollow">Modals and Popups</a></li>
<li><a rel="nofollow">Scroll</a></li>
<li><a rel="nofollow">Menu</a></li>
<li><a rel="nofollow">Table/Grid</a></li>
<li>Mobile</li>
<li><a rel="nofollow">Gesture</a></li>
<li><a rel="nofollow">Maps</a></li>
<li><a rel="nofollow">Animations</a></li>
<li><a rel="nofollow">Image processing</a></li>
<li><a rel="nofollow">ES6</a></li>
<li><a rel="nofollow">Misc</a></li>
</ul>
</li>
<li><a rel="nofollow">Other Awesome Lists</a></li>
<li><a rel="nofollow">Contributing</a></li>
</ul>
<hr>
<h3>Package Managers</h3>
<p><em>Host JavaScript libraries and provide tools for fetching and packaging them.</em></p>
<ul>
<li>
<a rel="nofollow" href="https://github.com/bower/bower">Bower</a> - A package manager for the web.</li>
<li>
<a rel="nofollow" href="https://github.com/component/component">component</a> - Client package management for building better web applications.</li>
<li>
<a rel="nofollow" href="https://github.com/spmjs/spm">spm</a> - Brand new static package manager.</li>
<li>
<a rel="nofollow" href="https://github.com/substack/node-browserify">browserify</a> - Browser-side require() the node.js way.</li>
<li>
<a rel="nofollow" href="https://github.com/caolan/jam">jam</a> - A package manager using a browser-focused and RequireJS compatible repository.</li>
<li>
<a rel="nofollow" href="https://github.com/jspm/jspm-cli">jspm</a> - Frictionless browser package management.</li>
<li>
<a rel="nofollow" href="https://github.com/ender-js/Ender">Ender</a> - The no-library library.</li>
<li>
<a rel="nofollow" href="https://github.com/volojs/volo">volo</a> - Create front end projects from templates, add dependencies, and automate the resulting projects.</li>
<li>
<a rel="nofollow" href="https://github.com/duojs/duo">Duo</a> - Next-generation package manager that blends the best ideas from Component, Browserify and Go to make organizing and writing front-end code quick and painless.</li>
</ul>
<h3>Loaders</h3>
<p><em>Module or loading system for JavaScript.</em></p>
<ul>
<li>
<a rel="nofollow" href="https://github.com/jrburke/requirejs">RequireJS</a> - A file and module loader for JavaScript.</li>
<li>
<a rel="nofollow" href="https://github.com/seajs/seajs">SeaJS</a> - A Module Loader for the Web.</li>
<li>
<a rel="nofollow" href="https://github.com/headjs/headjs">HeadJS</a> - The only script in your HEAD.</li>
<li>
<a rel="nofollow" href="https://github.com/cujojs/curl">curl</a> - A small, fast, extensible module loader that handles AMD, CommonJS Modules/1.1, CSS, HTML/text, and legacy scripts.</li>
<li>
<a rel="nofollow" href="https://github.com/rgrove/lazyload/">lazyload</a> - Tiny, dependency-free async JavaScript and CSS loader.</li>
<li>
<a rel="nofollow" href="https://github.com/ded/script.js">script.js</a> - Asynchronous JavaScript loader and dependency manager.</li>
<li>
<a rel="nofollow" href="https://github.com/systemjs/systemjs">systemjs</a> - AMD, CJS & ES6 spec-compliant module loader.</li>
<li>
<a rel="nofollow" href="https://github.com/webpack/webpack">webpack</a> - Module loader made for big projects. Supports AMD, CommonJS, and more.</li>
</ul>
<h3>Testing Frameworks</h3>
<h4>Frameworks</h4>
<ul>
<li>
<a rel="nofollow" href="https://github.com/visionmedia/mocha">mocha</a> - Simple, flexible, fun javascript test framework for node.js & the browser.</li>
<li>
<a rel="nofollow" href="https://github.com/pivotal/jasmine">jasmine</a> - DOM-less simple JavaScript testing framework.</li>
<li>
<a rel="nofollow" href="https://github.com/jquery/qunit">qunit</a> - An easy-to-use JavaScript Unit Testing framework.</li>
<li>
<a rel="nofollow" href="http://github.com/facebook/jest">jest</a> - Painless Javascript Unit Testing.</li>
<li>
<a rel="nofollow" href="http://github.com/azer/prova">prova</a> - Node & Browser test runner based on Tape and Browserify</li>
</ul>
<h4>Assertion</h4>
<ul>
<li>
<a rel="nofollow" href="https://github.com/chaijs/chai">chai</a> - BDD / TDD assertion framework for node.js and the browser that can be paired with any testing framework. <a rel="nofollow" href="http://spmjs.io/package/chai"><img src="http://spmjs.io/badge/chai" alt=""></a>
</li>
<li>
<a rel="nofollow" href="https://github.com/cjohansen/Sinon.JS">Sinon.JS</a> - Test spies, stubs and mocks for JavaScript. <a rel="nofollow" href="http://spmjs.io/package/sinon"><img src="http://spmjs.io/badge/sinon" alt=""></a>
</li>
<li>
<a rel="nofollow" href="https://github.com/LearnBoost/expect.js">expect.js</a> - Minimalistic BDD-style assertions for Node.JS and the browser. <a rel="nofollow" href="http://spmjs.io/package/expect.js"><img src="http://spmjs.io/badge/expect.js" alt=""></a>
</li>
</ul>
<h4>Coverage</h4>
<ul>
<li>
<a rel="nofollow" href="https://github.com/gotwarlost/istanbul">istanbul</a> - Yet another JS code coverage tool.</li>
<li>
<a rel="nofollow" href="https://github.com/alex-seville/blanket">blanket</a> - A simple code coverage library for javascript. Designed to be easy to install and use, for both browser and nodejs.</li>
<li>
<a rel="nofollow" href="https://github.com/tntim96/JSCover">JSCover</a> - JSCover is a tool that measures code coverage for JavaScript programs.</li>
</ul>
<h4>Runner</h4>
<ul>
<li>
<a rel="nofollow" href="https://github.com/ariya/phantomjs">phantomjs</a> - Scriptable Headless WebKit.</li>
<li>
<a rel="nofollow" href="https://github.com/laurentj/slimerjs">slimerjs</a> - A PhantomJS-like tool running Gecko.</li>
<li>
<a rel="nofollow" href="https://github.com/n1k0/casperjs">casperjs</a> - Navigation scripting & testing utility for PhantomJS and SlimerJS.</li>
<li>
<a rel="nofollow" href="https://github.com/assaf/zombie">zombie</a> - Insanely fast, full-stack, headless browser testing using node.js.</li>
<li>
<a rel="nofollow" href="https://github.com/totorojs/totoro">totoro</a> - A simple and stable cross-browser testing tool.</li>
<li>
<a rel="nofollow" href="https://github.com/karma-runner/karma">karma</a> - Spectacular Test Runner for JavaScript.</li>
<li>
<a rel="nofollow" href="https://github.com/beatfactor/nightwatch">nightwatch</a> - UI automated testing framework based on node.js and selenium webdriver.</li>
<li>
<a rel="nofollow" href="https://github.com/theintern/intern">intern</a> - A next-generation code testing stack for JavaScript.</li>
</ul>
<h3>QA Tools</h3>
<ul>
<li>
<a rel="nofollow" href="https://github.com/jshint/jshint/">JSHint</a> - JSHint is a tool that helps to detect errors and potential problems in your JavaScript code.</li>
<li>
<a rel="nofollow" href="https://github.com/jscs-dev/node-jscs">jscs</a> - JavaScript Code Style checker.</li>
<li>
<a rel="nofollow" href="https://github.com/rdio/jsfmt">jsfmt</a> - For formatting, searching, and rewriting JavaScript.</li>
<li>
<a rel="nofollow" href="https://github.com/danielstjules/jsinspect">jsinspect</a> - Detect copy-pasted and structurally similar code.</li>
<li>
<a rel="nofollow" href="https://github.com/danielstjules/buddy.js">buddy.js</a> - Magic number detection for JavaScript.</li>
<li>
<a rel="nofollow" href="https://github.com/eslint/eslint">ESLint</a> - A fully pluggable tool for identifying and reporting on patterns in JavaScript.</li>
</ul>
<h3>MVC Frameworks and Libraries</h3>
<ul>
<li>
<a rel="nofollow" href="https://github.com/angular/angular.js">angular.js</a> - HTML enhanced for web apps.</li>
<li>
<a rel="nofollow" href="https://github.com/jashkenas/backbone">backbone</a> - Give your JS App some Backbone with Models, Views, Collections, and Events. <a rel="nofollow" href="http://spmjs.io/package/backbone"><img src="http://spmjs.io/badge/backbone" alt=""></a>
</li>
<li>
<a rel="nofollow" href="http://batmanjs.org/">batman.js</a> - The best JavaScript framework for Rails developers.</li>
<li>
<a rel="nofollow" href="https://github.com/emberjs/ember.js">ember.js</a> - A JavaScript framework for creating ambitious web applications.</li>
<li>
<a rel="nofollow" href="https://github.com/meteor/meteor">meteor</a> - An ultra-simple, database-everywhere, data-on-the-wire, pure-Javascript web framework.</li>
<li>
<a rel="nofollow" href="https://github.com/ractivejs/ractive">ractive</a> - Next-generation DOM manipulation. <a rel="nofollow" href="http://spmjs.io/package/ractive"><img src="http://spmjs.io/badge/ractive" alt=""></a>
</li>
<li>
<a rel="nofollow" href="https://github.com/yyx990803/vue">vue</a> - Intuitive, fast & composable MVVM for building interactive interfaces. <a rel="nofollow" href="http://spmjs.io/package/vue"><img src="http://spmjs.io/badge/vue" alt=""></a>
</li>
<li>
<a rel="nofollow" href="https://github.com/knockout/knockout">knockout</a> - Knockout makes it easier to create rich, responsive UIs with JavaScript.</li>
<li>
<a rel="nofollow" href="https://github.com/spine/spine">spine</a> - Lightweight MVC library for building JavaScript applications.</li>
<li>
<a rel="nofollow" href="https://github.com/techlayer/espresso.js">espresso.js</a> - A minimal javascript library for crafting user interfaces. <a rel="nofollow" href="http://spmjs.io/package/espresso.js"><img src="http://spmjs.io/badge/espresso.js" alt=""></a>
</li>
<li>
<a rel="nofollow" href="https://github.com/bitovi/canjs">canjs</a> - Can do JS, better, faster, easier.</li>
<li>
<a rel="nofollow" href="https://facebook.github.io/react/">react</a> - A library for building user interfaces. It's declarative, efficient, and extremely flexible. Works with a Virtual DOM.</li>
<li>
<a rel="nofollow" href="https://github.com/walmartlabs/thorax">thorax</a> - Strengthening your Backbone.</li>
<li>
<a rel="nofollow" href="https://github.com/chaplinjs/chaplin">chaplin</a> - An architecture for JavaScript applications using the Backbone.js library.</li>
<li>
<a rel="nofollow" href="https://github.com/marionettejs/backbone.marionette">marionette</a> - A composite application library for Backbone.js that aims to simplify the construction of large scale JavaScript applications.</li>
<li>
<a rel="nofollow" href="https://github.com/ripplejs/ripple">ripple</a> - A tiny foundation for building reactive views.</li>
<li>
<a rel="nofollow" href="https://github.com/mikeric/rivets">rivets</a> - Lightweight and powerful data binding + templating solution.</li>
<li>
<a rel="nofollow" href="https://github.com/derbyjs/derby">derby</a> - MVC framework making it easy to write realtime, collaborative applications that run in both Node.js and browsers.<br><br><ul>
<li>
<a rel="nofollow" href="https://github.com/onerussell/awesome-derby">derby-awesome</a> - A collection of awesome derby components</li>
</ul>
</li>
<li>
<a rel="nofollow" href="https://github.com/gwendall/way.js">way.js</a> - Simple, lightweight, persistent two-way databinding. <a rel="nofollow" href="http://spmjs.io/package/way.js"><img src="http://spmjs.io/badge/way.js" alt=""></a>
</li>
<li>
<a rel="nofollow" href="https://github.com/lhorie/mithril.js">mithril.js</a> - Mithril is a client-side MVC framework (Light-weight, Robust, Fast).</li>
</ul>
<h3>Non-MVC Frameworks</h3>
<ul>
<li>
<a rel="nofollow" href="https://github.com/Famous/famous">famous</a> - A JavaScript framework for everyone who wants to build beautiful experiences on any device.</li>
</ul>
<h3>Templating Engines</h3>
<p><em>Templating engines allow you to perform string interpolation.</em></p>
<ul>
<li>
<a rel="nofollow" href="https://github.com/janl/mustache.js">mustache.js</a> - Minimal templating with {{mustaches}} in JavaScript. <a rel="nofollow" href="http://spmjs.io/package/mustache"><img src="http://spmjs.io/badge/mustache" alt=""></a>
</li>
<li>
<a rel="nofollow" href="https://github.com/wycats/handlebars.js/">handlebars.js</a> - An extension to the Mustache templating language. <a rel="nofollow" href="http://spmjs.io/package/handlebars"><img src="http://spmjs.io/badge/handlebars" alt=""></a>
</li>
<li>
<a rel="nofollow" href="https://github.com/twitter/hogan.js">hogan.js</a> - A compiler for the Mustache templating language. <a rel="nofollow" href="http://spmjs.io/package/hogan.js"><img src="http://spmjs.io/badge/hogan.js" alt=""></a>
</li>
<li>
<a rel="nofollow" href="https://github.com/olado/doT">doT</a> - The fastest + concise javascript template engine for nodejs and browsers.</li>
<li>
<a rel="nofollow" href="https://github.com/linkedin/dustjs/">dustjs</a> - Asynchronous templates for the browser and node.js.</li>
<li>
<a rel="nofollow" href="https://github.com/sstephenson/eco/">eco</a> - Embedded CoffeeScript templates.</li>
<li>
<a rel="nofollow" href="https://github.com/blueimp/JavaScript-Templates">JavaScript-Templates</a> - &lt; 1KB lightweight, fast &amp; powerful JavaScript templating engine with zero dependencies.</li>
<li>
<a rel="nofollow" href="https://github.com/jasonmoo/t.js">t.js</a> - A tiny javascript templating framework in ~400 bytes gzipped.</li>
</ul>
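<p>To make "string interpolation" concrete, here is a minimal sketch of mustache-style placeholder substitution in plain JavaScript. The <code>render</code> helper is hypothetical and written only for illustration; it is not the API of any engine listed above, and unlike those engines it does no HTML escaping and supports no sections or partials.</p>

```javascript
// Hypothetical helper for illustration only; not part of mustache.js,
// handlebars.js, or any other library in this list.
// Replaces each {{key}} placeholder with the matching property of `data`.
function render(template, data) {
  return template.replace(/\{\{\s*(\w+)\s*\}\}/g, (match, key) =>
    // Unknown keys render as an empty string instead of "undefined".
    Object.prototype.hasOwnProperty.call(data, key) ? String(data[key]) : ''
  );
}

const greeting = render('Hello, {{ name }}! You have {{count}} new messages.', {
  name: 'World',
  count: 3,
});
console.log(greeting); // prints "Hello, World! You have 3 new messages."
```

<p>The engines above typically add escaping, conditionals, loops and precompilation on top of this basic idea.</p>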
<h3>Data Visualization</h3>
<p><em>Data visualization tools for the web.</em></p>
<ul>
<li>
<a rel="nofollow" href="https://github.com/mbostock/d3">d3</a> - A JavaScript visualization library for HTML and SVG. <a rel="nofollow" href="http://spmjs.io/package/d3"><img src="http://spmjs.io/badge/d3" alt=""></a><br><br><ul>
<li>
<a rel="nofollow" href="https://github.com/mozilla/metrics-graphics">metrics-graphics</a> - A library optimized for concise, principled data graphics and layouts.</li>
</ul>
</li>
<li>
<a rel="nofollow" href="https://github.com/mrdoob/three.js">three.js</a> - JavaScript 3D library.</li>
<li>
<a rel="nofollow" href="https://github.com/nnnick/Chart.js">Chart.js</a> - Simple HTML5 Charts using the &lt;canvas&gt; tag. <a rel="nofollow" href="http://spmjs.io/package/chart.js"><img src="http://spmjs.io/badge/chart.js" alt=""></a>
</li>
<li>
<a rel="nofollow" href="https://github.com/paperjs/paper.js">paper.js</a> - The Swiss Army Knife of Vector Graphics Scripting – Scriptographer ported to JavaScript and the browser, using HTML5 Canvas.</li>
<li>
<a rel="nofollow" href="https://github.com/kangax/fabric.js">fabric.js</a> - Javascript Canvas Library, SVG-to-Canvas (& canvas-to-SVG) Parser.</li>
<li>
<a rel="nofollow" href="https://github.com/benpickles/peity">peity</a> - Progressive bar, line and pie charts.</li>
<li>
<a rel="nofollow" href="https://github.com/DmitryBaranovskiy/raphael">raphael</a> - JavaScript Vector Library.</li>
<li>
<a rel="nofollow" href="https://github.com/ecomfe/echarts">echarts</a> - Enterprise Charts.</li>
<li>
<a rel="nofollow" href="https://github.com/almende/vis">vis</a> - Dynamic, browser-based visualization library.</li>
<li>
<a rel="nofollow" href="https://github.com/jonobr1/two.js">two.js</a> - A renderer agnostic two-dimensional drawing api for the web.</li>
<li>
<a rel="nofollow" href="https://github.com/DmitryBaranovskiy/g.raphael">g.raphael</a> - Charts for Raphaël.</li>
<li>
<a rel="nofollow" href="https://github.com/jacomyal/sigma.js">sigma.js</a> - A JavaScript library dedicated to graph drawing.</li>
<li>
<a rel="nofollow" href="https://github.com/samizdatco/arbor">arbor</a> - A graph visualization library using web workers and jQuery.</li>
<li>
<a rel="nofollow" href="https://github.com/square/cubism">cubism</a> - A D3 plugin for visualizing time series.</li>
<li>
<a rel="nofollow" href="https://github.com/dc-js/dc.js">dc.js</a> - Multi-Dimensional charting built to work natively with crossfilter rendered with d3.js</li>
<li>
<a rel="nofollow" href="https://github.com/trifacta/vega">vega</a> - A visualization grammar.</li>
<li>
<a rel="nofollow" href="https://github.com/HumbleSoftware/envisionjs">envisionjs</a> - Dynamic HTML5 visualization.</li>
<li>
<a rel="nofollow" href="https://github.com/shutterstock/rickshaw">rickshaw</a> - JavaScript toolkit for creating interactive real-time graphs.</li>
<li>
<a rel="nofollow" href="https://github.com/flot/flot">flot</a> - Attractive JavaScript charts for jQuery.</li>
<li>
<a rel="nofollow" href="https://github.com/morrisjs/morris.js">morris.js</a> - Pretty time-series line graphs.</li>
<li>
<a rel="nofollow" href="https://github.com/novus/nvd3">nvd3</a> - Build re-usable charts and chart components for d3.js</li>
<li>
<a rel="nofollow" href="https://github.com/wout/svg.js">svg.js</a> - A lightweight library for manipulating and animating SVG.</li>
<li>
<a rel="nofollow" href="https://github.com/pa7/heatmap.js">heatmap.js</a> - JavaScript Library for HTML5 canvas based heatmaps.</li>
<li>
<a rel="nofollow" href="https://github.com/gwatts/jquery.sparkline">jquery.sparkline</a> - A plugin for the jQuery javascript library to generate small sparkline charts directly in the browser.</li>
<li>
<a rel="nofollow" href="https://github.com/tenxer/xCharts">xCharts</a> - A D3-based library for building custom charts and graphs.</li>
<li>
<a rel="nofollow" href="https://github.com/qrohlf/trianglify">trianglify</a> - Low poly style background generator with d3.js</li>
<li>
<a rel="nofollow" href="https://github.com/jasondavies/d3-cloud">d3-cloud</a> - Create word clouds in JavaScript.</li>
<li>
<a rel="nofollow" href="https://github.com/heavysixer/d4">d4</a> - A friendly reusable charts DSL for D3.</li>
<li>
<a rel="nofollow" href="http://dimplejs.org">dimple.js</a> - Easy charts for business analytics powered by d3</li>
<li>
<a rel="nofollow" href="https://github.com/gionkunz/chartist-js">chartist-js</a> - Simple responsive charts.</li>
<li>
<a rel="nofollow" href="https://github.com/fastly/epoch">epoch</a> - A general purpose real-time charting library.</li>
<li>
<a rel="nofollow" href="https://github.com/masayuki0812/c3">c3</a> - D3-based reusable chart library.</li>
<li>
<a rel="nofollow" href="https://github.com/BabylonJS/Babylon.js">BabylonJS</a> - A framework for building 3D games with HTML 5 and WebGL.</li>
</ul>
<p>There are also some great commercial libraries, such as <a rel="nofollow" href="http://www.amcharts.com/">amCharts</a> and <a rel="nofollow" href="http://www.highcharts.com/">Highcharts</a>.</p>
<h3>Timeline</h3>
<ul>
<li>
<a rel="nofollow" href="https://github.com/NUKnightLab/TimelineJS">TimelineJS</a> - A Storytelling Timeline built in JavaScript.</li>
<li>
<a rel="nofollow" href="https://github.com/semu/timesheet.js">timesheet.js</a> - JavaScript library for simple HTML5 & CSS3 time sheets.</li>
</ul>
<h3>Editors</h3>
<ul>
<li>
<a rel="nofollow" href="https://github.com/ajaxorg/ace">ace</a> - Ace (Ajax.org Cloud9 Editor).</li>
<li>
<a rel="nofollow" href="https://github.com/marijnh/CodeMirror">CodeMirror</a> - In-browser code editor. <a rel="nofollow" href="http://spmjs.io/package/codemirror"><img src="http://spmjs.io/badge/codemirror" alt=""></a>
</li>
<li>
<a rel="nofollow" href="https://github.com/ariya/esprima">esprima</a> - ECMAScript parsing infrastructure for multipurpose analysis.</li>
<li>
<a rel="nofollow" href="https://github.com/quilljs/quill">quill</a> - A cross browser rich text editor with an API.</li>
<li>
<a rel="nofollow" href="https://github.com/daviferreira/medium-editor">medium-editor</a> - Medium.com WYSIWYG editor clone.</li>
<li>
<a rel="nofollow" href="https://github.com/sofish/pen">pen</a> - Enjoy live editing (+markdown).</li>
<li>
<a rel="nofollow" href="https://github.com/raphaelcruzeiro/jquery-notebook">jquery-notebook</a> - A simple, clean and elegant text editor. Inspired by the awesomeness of Medium.</li>
<li>
<a rel="nofollow" href="https://github.com/mindmup/bootstrap-wysiwyg">bootstrap-wysiwyg</a> - Tiny bootstrap-compatible WYSIWYG rich text editor.</li>
<li>
<a rel="nofollow" href="https://github.com/ckeditor/ckeditor-releases">ckeditor-releases</a> - The best web text editor for everyone.</li>
<li>
<a rel="nofollow" href="https://github.com/lepture/editor">editor</a> - A Markdown editor, still in development.</li>
<li>
<a rel="nofollow" href="https://github.com/OscarGodson/EpicEditor">EpicEditor</a> - An embeddable JavaScript Markdown editor with split fullscreen editing, live previewing, automatic draft saving, offline support, and more.</li>
<li>
<a rel="nofollow" href="https://github.com/josdejong/jsoneditor">jsoneditor</a> - A web-based tool to view, edit and format JSON.</li>
<li>
<a rel="nofollow" href="https://github.com/coolwanglu/vim.js">vim.js</a> - JavaScript port of Vim with a persistent ~/.vimrc</li>
<li>
<a rel="nofollow" href="https://github.com/neilj/Squire">Squire</a> - HTML5 rich text editor.</li>
<li>
<a rel="nofollow" href="https://github.com/tinymce/tinymce">TinyMCE</a> - The JavaScript Rich Text editor.</li>
</ul>
<h3>Files</h3>
<p><em>Libraries for working with files.</em></p>
<ul>
<li>
<a rel="nofollow" href="https://github.com/mholt/PapaParse">Papa Parse</a> - A powerful CSV library that supports parsing CSV files/strings and also exporting to CSV.</li>
<li>
<a rel="nofollow" href="https://github.com/jDataView/jBinary">jBinary</a> - High-level I/O (loading, parsing, manipulating, serializing, saving) for binary files with declarative syntax for describing file types and data structures.</li>
</ul>
<h3>Functional Programming</h3>
<p><em>Functional programming libraries to extend JavaScript’s capabilities.</em></p>
<ul>
<li>
<a rel="nofollow" href="https://github.com/jashkenas/underscore">underscore</a> - JavaScript's utility _ belt. <a rel="nofollow" href="http://spmjs.io/package/underscore"><img src="http://spmjs.io/badge/underscore" alt=""></a>
</li>
<li>
<a rel="nofollow" href="https://github.com/lodash/lodash">lodash</a> - A utility library delivering consistency, customization, performance, & extras. <a rel="nofollow" href="http://spmjs.io/package/lodash"><img src="http://spmjs.io/badge/lodash" alt=""></a>
</li>
<li>
<a rel="nofollow" href="https://github.com/andrewplummer/Sugar">Sugar</a> - A Javascript library for working with native objects.</li>
<li>
<a rel="nofollow" href="https://github.com/dtao/lazy.js">lazy.js</a> - Like Underscore, but lazier. <a rel="nofollow" href="http://spmjs.io/package/lazy.js"><img src="http://spmjs.io/badge/lazy.js" alt=""></a>
</li>
<li>
<a rel="nofollow" href="https://github.com/CrossEye/ramda">ramda</a> - A practical functional library for Javascript programmers.</li>
<li>
<a rel="nofollow" href="https://github.com/mout/mout">mout</a> - Modular JavaScript Utilities.</li>
</ul>
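<p>As a rough illustration of what these utility libraries provide, the sketch below implements two common functional building blocks, function composition and currying, in plain JavaScript. The names <code>compose</code> and <code>curry</code> are hypothetical helpers written for this example; the libraries above ship their own, more complete versions with different details.</p>

```javascript
// Hypothetical helpers for illustration; not the exact API of underscore,
// lodash, ramda, or any other library in this list.

// compose(f, g)(x) is f(g(x)): functions are applied right to left.
function compose(...fns) {
  return (input) => fns.reduceRight((acc, fn) => fn(acc), input);
}

// curry(fn) lets a fixed-arity function receive its arguments a few at a time.
function curry(fn) {
  return function curried(...args) {
    return args.length >= fn.length
      ? fn(...args)
      : (...rest) => curried(...args, ...rest);
  };
}

const add = curry((a, b) => a + b);
const double = (n) => n * 2;

// Adds 1 first, then doubles the result.
const addOneThenDouble = compose(double, add(1));
console.log(addOneThenDouble(3)); // prints 8
```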
<h3>Reactive Programming</h3>
<p><em>Reactive programming libraries to extend JavaScript’s capabilities.</em></p>
<ul>
<li>
<a rel="nofollow" href="https://github.com/Reactive-Extensions/RxJS">RxJs</a> - The Reactive Extensions for JavaScript.</li>
<li>
<a rel="nofollow" href="https://github.com/baconjs/bacon.js">Bacon</a> - FRP (functional reactive programming) library for Javascript.</li>
<li>
<a rel="nofollow" href="https://github.com/pozadi/kefir">Kefir</a> - FRP library for JavaScript inspired by Bacon.js and RxJS with focus on high performance and low memory consumption.</li>
</ul>
<h3>Data Structure</h3>
<p><em>Data structure libraries to build a more sophisticated application.</em></p>
<ul>
<li>
<a rel="nofollow" href="https://github.com/facebook/immutable-js">immutable-js</a> - Immutable Data Collections including Sequence, Range, Repeat, Map, OrderedMap, Set and a sparse Vector. <a rel="nofollow" href="http://spmjs.io/package/immutable"><img src="http://spmjs.io/badge/immutable" alt=""></a>
</li>
<li>
<a rel="nofollow" href="https://github.com/swannodette/mori">mori</a> - A library for using ClojureScript's persistent data structures and supporting API from the comfort of vanilla JavaScript.</li>
<li>
<a rel="nofollow" href="https://github.com/mauriciosantos/buckets">buckets</a> - A complete, fully tested and documented data structure library written in JavaScript.</li>
<li>
<a rel="nofollow" href="https://github.com/flesler/hashmap">hashmap</a> - Simple hashmap implementation that supports any kind of keys.</li>
</ul>
<h3>Date</h3>
<p><em>Date Libraries.</em></p>
<ul>
<li>
<a rel="nofollow" href="https://github.com/moment/moment">moment</a> - Parse, validate, manipulate, and display dates in javascript. <a rel="nofollow" href="http://spmjs.io/package/moment"><img src="http://spmjs.io/badge/moment" alt=""></a>
</li>
<li>
<a rel="nofollow" href="https://github.com/moment/moment-timezone">moment-timezone</a> - Timezone support for moment.js.</li>
<li>
<a rel="nofollow" href="https://github.com/rmm5t/jquery-timeago">jquery-timeago</a> - A jQuery plugin that makes it easy to support automatically updating fuzzy timestamps (e.g. "4 minutes ago"). <a rel="nofollow" href="http://spmjs.io/package/timeago"><img src="http://spmjs.io/badge/timeago" alt=""></a>
</li>
<li>
<a rel="nofollow" href="https://github.com/mde/timezone-js">timezone-js</a> - Timezone-enabled JavaScript Date object. Uses Olson zoneinfo files for timezone data.</li>
<li>
<a rel="nofollow" href="https://github.com/MatthewMueller/date">date</a> - Date() for humans. <a rel="nofollow" href="http://spmjs.io/package/date"><img src="http://spmjs.io/badge/date" alt=""></a>
</li>
<li>
<a rel="nofollow" href="https://github.com/guille/ms.js">ms.js</a> - Tiny millisecond conversion utility. <a rel="nofollow" href="http://spmjs.io/package/ms"><img src="http://spmjs.io/badge/ms" alt=""></a>
</li>
</ul>
<h3>String</h3>
<p><em>String Libraries.</em></p>
<ul>
<li>
<a rel="nofollow" href="https://github.com/epeli/underscore.string">underscore.string</a> - String manipulation extensions for Underscore.js javascript library. <a rel="nofollow" href="http://spmjs.io/package/underscore.string"><img src="http://spmjs.io/badge/underscore.string" alt=""></a>
</li>
<li>
<a rel="nofollow" href="https://github.com/jprichardson/string.js">string.js</a> - Extra JavaScript string methods. <a rel="nofollow" href="http://spmjs.io/package/string.js"><img src="http://spmjs.io/badge/string.js" alt=""></a>
</li>
<li>
<a rel="nofollow" href="https://github.com/mathiasbynens/he">he</a> - A robust HTML entity encoder/decoder written in JavaScript. <a rel="nofollow" href="http://spmjs.io/package/he"><img src="http://spmjs.io/badge/he" alt=""></a>
</li>
<li>
<a rel="nofollow" href="https://github.com/sindresorhus/multiline">multiline</a> - Multiline strings in JavaScript. <a rel="nofollow" href="http://spmjs.io/package/multiline"><img src="http://spmjs.io/badge/multiline" alt=""></a>
</li>
<li>
<a rel="nofollow" href="https://github.com/sindresorhus/query-string">query-string</a> - Parse and stringify URL query strings. <a rel="nofollow" href="http://spmjs.io/package/query-string"><img src="http://spmjs.io/badge/query-string" alt=""></a>
</li>
<li>
<a rel="nofollow" href="https://github.com/medialize/URI.js/">URI.js</a> - Javascript URL mutation library. <a rel="nofollow" href="http://spmjs.io/package/urijs"><img src="http://spmjs.io/badge/urijs" alt=""></a>
</li>
<li>
<a rel="nofollow" href="https://github.com/Mikhus/jsurl">jsurl</a> - Lightweight URL manipulation with JavaScript.</li>
<li>
<a rel="nofollow" href="https://github.com/alexei/sprintf.js">sprintf.js</a> - A sprintf implementation.</li>
</ul>
<h3>Number</h3>
<ul>
<li>
<a rel="nofollow" href="https://github.com/adamwdraper/Numeral-js">Numeral-js</a> - A javascript library for formatting and manipulating numbers. <a rel="nofollow" href="http://spmjs.io/package/numeral"><img src="http://spmjs.io/badge/numeral" alt=""></a>
</li>
<li>
<a rel="nofollow" href="https://github.com/HubSpot/odometer">odometer</a> - Smoothly transitions numbers with ease. <a rel="nofollow" href="http://spmjs.io/package/odometer"><img src="http://spmjs.io/badge/odometer" alt=""></a>
</li>
<li>
<a rel="nofollow" href="https://github.com/josscrowcroft/accounting.js">accounting.js</a> - A lightweight JavaScript library for number, money and currency formatting - fully localisable, zero dependencies.</li>
<li>
<a rel="nofollow" href="https://github.com/josscrowcroft/money.js">money.js</a> - A tiny (1kb) javascript currency conversion library, for web & nodeJS.</li>
</ul>
<h3>Storage</h3>
<ul>
<li>
<a rel="nofollow" href="https://github.com/marcuswestin/store.js">store.js</a> - LocalStorage wrapper for all browsers without using cookies or flash. Uses localStorage, globalStorage, and userData behavior under the hood. <a rel="nofollow" href="http://spmjs.io/package/store"><img src="http://spmjs.io/badge/store" alt=""></a>
</li>
<li>
<a rel="nofollow" href="https://github.com/mozilla/localForage">localForage</a> - Offline storage, improved. Wraps IndexedDB, WebSQL, or localStorage using a simple but powerful API. <a rel="nofollow" href="http://spmjs.io/package/localforage"><img src="http://spmjs.io/badge/localforage" alt=""></a>
</li>
<li>
<a rel="nofollow" href="https://github.com/andris9/jStorage">jStorage</a> - jStorage is a simple key/value database to store data on browser side.</li>
<li>
<a rel="nofollow" href="https://github.com/zendesk/cross-storage">cross-storage</a> - Cross domain local storage, with permissions.</li>
<li>
<a rel="nofollow" href="https://github.com/addyosmani/basket.js">basket.js</a> - A script and resource loader for caching & loading scripts with localStorage.</li>
<li>
<a rel="nofollow" href="https://github.com/Wisembly/basil.js">basil.js</a> - The missing Javascript smart persistent layer. <a rel="nofollow" href="http://spmjs.io/package/basil.js"><img src="http://spmjs.io/badge/basil.js" alt=""></a>
</li>
<li>
<a rel="nofollow" href="https://github.com/carhartl/jquery-cookie">jquery-cookie</a> - A simple, lightweight jQuery plugin for reading, writing and deleting cookies.</li>
<li>
<a rel="nofollow" href="https://github.com/ScottHamper/Cookies">Cookies</a> - JavaScript Client-Side Cookie Manipulation Library.</li>
</ul>
<h3>Color</h3>
<ul>
<li>
<a rel="nofollow" href="https://github.com/davidmerfield/randomColor">randomColor</a> - A color generator for JavaScript. <a rel="nofollow" href="http://spmjs.io/package/randomcolor"><img src="http://spmjs.io/badge/randomcolor" alt=""></a>
</li>
<li>
<a rel="nofollow" href="https://github.com/gka/chroma.js">chroma.js</a> - JavaScript library for all kinds of color manipulations. <a rel="nofollow" href="http://spmjs.io/package/chroma-js"><img src="http://spmjs.io/badge/chroma-js" alt=""></a>
</li>
<li>
<a rel="nofollow" href="https://github.com/harthur/color">color</a> - JavaScript color conversion and manipulation library. <a rel="nofollow" href="http://spmjs.io/package/color"><img src="http://spmjs.io/badge/color" alt=""></a>
</li>
<li>
<a rel="nofollow" href="https://github.com/mrmrs/colors">colors</a> - Smarter defaults for colors on the web.</li>
<li>
<a rel="nofollow" href="https://github.com/Fooidge/PleaseJS">PleaseJS</a> - JavaScript Library for creating random pleasing colors and color schemes.</li>
<li>
<a rel="nofollow" href="https://github.com/bgrins/TinyColor">TinyColor</a> - Fast, small color manipulation and conversion for JavaScript. <a rel="nofollow" href="http://spmjs.io/package/tinycolor"><img src="http://spmjs.io/badge/tinycolor" alt=""></a>
</li>
</ul>
<h3>I18n And L10n</h3>
<p><em>Localization (l10n) and internationalization (i18n) JavaScript libraries.</em></p>
<ul>
<li>
<a rel="nofollow" href="https://github.com/jamuhl/i18next">i18next</a> - internationalisation (i18n) with javascript the easy way.</li>
<li>
<a rel="nofollow" href="https://github.com/airbnb/polyglot.js">polyglot</a> - tiny i18n helper library.</li>
</ul>
<h3>Class</h3>
<ul>
<li>
<a rel="nofollow" href="https://github.com/ded/klass">klass</a> - A utility for creating expressive classes in JavaScript. <a rel="nofollow" href="http://spmjs.io/package/klass"><img src="http://spmjs.io/badge/klass" alt=""></a>
</li>
<li>
<a rel="nofollow" href="https://github.com/javascript/augment">augment</a> - The world's smallest and fastest classical JavaScript inheritance pattern. <a rel="nofollow" href="http://spmjs.io/package/augment"><img src="http://spmjs.io/badge/augment" alt=""></a>
</li>
</ul>
<h3>Control Flow</h3>
<ul>
<li>
<a rel="nofollow" href="https://github.com/caolan/async">async</a> - Async utilities for node and the browser.</li>
<li>
<a rel="nofollow" href="https://github.com/kriskowal/q">q</a> - A tool for making and composing asynchronous promises in JavaScript. <a rel="nofollow" href="http://spmjs.io/package/q"><img src="http://spmjs.io/badge/q" alt=""></a>
</li>
<li>
<a rel="nofollow" href="https://github.com/creationix/step/">step</a> - An async control-flow library that makes stepping through logic easy. <a rel="nofollow" href="http://spmjs.io/package/step"><img src="http://spmjs.io/badge/step" alt=""></a>
</li>
<li>
<a rel="nofollow" href="https://github.com/bevacqua/contra/">contra</a> - Asynchronous flow control with a functional taste to it.</li>
<li>
<a rel="nofollow" href="https://github.com/petkaantonov/bluebird/">Bluebird</a> - fully featured promise library with focus on innovative features and performance.</li>
<li>
<a rel="nofollow" href="https://github.com/cujojs/when">when</a> - A solid, fast Promises/A+ and when() implementation, plus other async goodies.</li>
</ul>
<h3>Routing</h3>
<ul>
<li>
<a rel="nofollow" href="https://github.com/flatiron/director">director</a> - A tiny and isomorphic URL router for JavaScript. <a rel="nofollow" href="http://spmjs.io/package/director"><img src="http://spmjs.io/badge/director" alt=""></a>
</li>
<li>
<a rel="nofollow" href="https://github.com/visionmedia/page.js">page.js</a> - Micro client-side router inspired by the Express router (~1200 bytes). <a rel="nofollow" href="http://spmjs.io/package/page"><img src="http://spmjs.io/badge/page" alt=""></a>
</li>
<li>
<a rel="nofollow" href="https://github.com/mtrpcic/pathjs">pathjs</a> - Simple, lightweight routing for web browsers.</li>
<li>
<a rel="nofollow" href="https://github.com/millermedeiros/crossroads.js">crossroads</a> - JavaScript Routes.</li>
<li>
<a rel="nofollow" href="https://github.com/olivernn/davis.js">davis.js</a> - RESTful degradable JavaScript routing using pushState.</li>
</ul>
<h3>Security</h3>
<ul>
<li>
<a rel="nofollow" href="https://github.com/cure53/DOMPurify">DOMPurify</a> - A DOM-only, super-fast, uber-tolerant XSS sanitizer for HTML, MathML and SVG.</li>
<li>
<a rel="nofollow" href="https://github.com/leizongmin/js-xss">js-xss</a> - Sanitize untrusted HTML (to prevent XSS) with a configuration specified by a Whitelist.</li>
</ul>
<h3>Log</h3>
<ul>
<li>
<a rel="nofollow" href="https://github.com/adamschwartz/log">log</a> - Console.log with style. <a rel="nofollow" href="http://spmjs.io/package/log"><img src="http://spmjs.io/badge/log" alt=""></a>
</li>
<li>
<a rel="nofollow" href="https://github.com/Oaxoa/Conzole">Conzole</a> - A debug panel built in javascript that wraps javascript native console object methods and functionality in a panel displayed inside the page.</li>
<li>
<a rel="nofollow" href="https://github.com/patik/console.log-wrapper">console.log-wrapper</a> - Log to the console in any browser with clarity.</li>
<li>
<a rel="nofollow" href="https://github.com/pimterry/loglevel">loglevel</a> - Minimal lightweight logging for JavaScript, adding reliable log level methods to wrap any available console.log methods.</li>
</ul>
<h3>RegExp</h3>
<h3>Media</h3>
<ul>
<li>
<a rel="nofollow" href="https://github.com/IonDen/ion.sound">Ion.Sound</a> - Simple sounds on any web page. <a rel="nofollow" href="http://spmjs.io/package/ion-sound"><img src="http://spmjs.io/badge/ion-sound" alt=""></a>
</li>
</ul>
<h3>Voice Command</h3>
<ul>
<li>
<a rel="nofollow" href="https://github.com/TalAter/annyang">annyang</a> - A JavaScript library for adding voice commands to your site, using speech recognition.</li>
<li>
<a rel="nofollow" href="https://github.com/pazguille/voix">voix.js</a> - A JavaScript library to add voice commands to your sites, apps or games.</li>
</ul>
<h3>API</h3>
<ul>
<li>
<a rel="nofollow" href="https://github.com/SGrondin/bottleneck">bottleneck</a> - A powerful rate limiter that makes throttling easy.</li>
<li>
<a rel="nofollow" href="https://github.com/bettiolo/oauth-signature-js">oauth-signature-js</a> - JavaScript OAuth 1.0a signature generator for node and the browser.</li>
</ul>
<h3>Vision Detection</h3>
<ul>
<li>
<a rel="nofollow" href="https://github.com/eduardolundgren/tracking.js">tracking.js</a> - A modern approach for Computer Vision on the web.</li>
<li>
<a rel="nofollow" href="https://github.com/antimatter15/ocrad.js">ocrad.js</a> - OCR in Javascript via Emscripten.</li>
</ul>
<h3>Code highlighting</h3>
<ul>
<li>
<a rel="nofollow" href="https://github.com/isagalaev/highlight.js">Highlight.js</a> - Javascript syntax highlighter.</li>
<li>
<a rel="nofollow" href="https://github.com/LeaVerou/prism">PrismJS</a> - Lightweight, robust, elegant syntax highlighting.</li>
</ul>
<h3>Loading Status</h3>
<p><em>Libraries for indicating load status.</em></p>
<ul>
<li>
<a rel="nofollow" href="http://ricostacruz.com/nprogress/">NProgress</a> - Slim progress bars for Ajax'y applications. <a rel="nofollow" href="http://spmjs.io/package/nprogress"><img src="http://spmjs.io/badge/nprogress" alt=""></a>
</li>
<li>
<a rel="nofollow" href="https://github.com/fgnass/spin.js">Spin.js</a> - A spinning activity indicator. <a rel="nofollow" href="http://spmjs.io/package/spin.js"><img src="http://spmjs.io/badge/spin.js" alt=""></a>
</li>
<li>
<a rel="nofollow" href="https://github.com/usablica/progress.js">progress.js</a> - Create and manage progress bars for any object on the page.</li>
<li>
<a rel="nofollow" href="https://github.com/HubSpot/pace">pace</a> - Automatically add a progress bar to your site. <a rel="nofollow" href="http://spmjs.io/package/pace"><img src="http://spmjs.io/badge/pace" alt=""></a>
</li>
<li>
<a rel="nofollow" href="https://github.com/buunguyen/topbar">topbar</a> - Tiny & beautiful site-wide progress indicator. <a rel="nofollow" href="http://spmjs.io/package/topbar"><img src="http://spmjs.io/badge/topbar" alt=""></a>
</li>
<li>
<a rel="nofollow" href="https://github.com/jacoborus/nanobar">nanobar</a> - Very lightweight progress bars. No jQuery. <a rel="nofollow" href="http://spmjs.io/package/nanobar"><img src="http://spmjs.io/badge/nanobar" alt=""></a>
</li>
<li>
<a rel="nofollow" href="https://github.com/codrops/PageLoadingEffects">PageLoadingEffects</a> - Modern ways of revealing new content using SVG animations.</li>
<li>
<a rel="nofollow" href="https://github.com/tobiasahlin/SpinKit">SpinKit</a> - A collection of loading indicators animated with CSS.</li>
<li>
<a rel="nofollow" href="https://github.com/hakimel/Ladda">Ladda</a> - Buttons with built-in loading indicators.</li>
<li>
<a rel="nofollow" href="https://github.com/lukehaas/css-loaders">css-loaders</a> - A collection of loading spinners animated with CSS.</li>
</ul>
<p>Besides libraries, there is a <a rel="nofollow" href="http://codepen.io/collection/HtAne/">collection on CodePen</a>, as well as generators such as <a rel="nofollow" href="http://www.ajaxload.info/">Ajaxload</a>, <a rel="nofollow" href="http://preloaders.net/">Preloaders</a> and <a rel="nofollow" href="http://cssload.net/">CSSLoad</a>.</p>
<h3>Validation</h3>
<ul>
<li>
<a rel="nofollow" href="https://github.com/guillaumepotier/Parsley.js">Parsley.js</a> - Validate your forms, frontend, without writing a single line of javascript. <a rel="nofollow" href="http://spmjs.io/package/parsleyjs"><img src="http://spmjs.io/badge/parsleyjs" alt=""></a>
</li>
<li>
<a rel="nofollow" href="https://github.com/jzaefferer/jquery-validation">jquery-validation</a> - jQuery Validation Plugin.</li>
<li>
<a rel="nofollow" href="https://github.com/chriso/validator.js">validator.js</a> - String validation and sanitization.</li>
<li>
<a rel="nofollow" href="https://github.com/rickharrison/validate.js">validate.js</a> - Lightweight JavaScript form validation library inspired by CodeIgniter.</li>
<li>
<a rel="nofollow" href="https://github.com/jaymorrow/validatr/">validatr</a> - Cross Browser HTML5 Form Validation.</li>
<li>
<a rel="nofollow" href="https://github.com/nghuuphuoc/bootstrapvalidator">BootstrapValidator</a> - The best jQuery plugin to validate form fields. Designed to use with Bootstrap 3.</li>
</ul>
<h3>Keyboard Wrappers</h3>
<ul>
<li>
<a rel="nofollow" href="https://github.com/ccampbell/mousetrap">mousetrap</a> - Simple library for handling keyboard shortcuts in Javascript. <a rel="nofollow" href="http://spmjs.io/package/mousetrap"><img src="http://spmjs.io/badge/mousetrap" alt=""></a>
</li>
<li>
<a rel="nofollow" href="https://github.com/madrobby/keymaster">keymaster</a> - A simple micro-library for defining and dispatching keyboard shortcuts. <a rel="nofollow" href="http://spmjs.io/package/keymaster"><img src="http://spmjs.io/badge/keymaster" alt=""></a>
</li>
<li>
<a rel="nofollow" href="https://github.com/dmauro/Keypress">Keypress</a> - A keyboard input capturing utility in which any key can be a modifier key. <a rel="nofollow" href="http://spmjs.io/package/keypress"><img src="http://spmjs.io/badge/keypress" alt=""></a>
</li>
<li>
<a rel="nofollow" href="https://github.com/RobertWHurst/KeyboardJS">KeyboardJS</a> - A JavaScript library for binding keyboard combos without the pain of key codes and key combo conflicts.</li>
<li>
<a rel="nofollow" href="https://github.com/jeresig/jquery.hotkeys">jquery.hotkeys</a> - jQuery Hotkeys lets you watch for keyboard events anywhere in your code supporting almost any key combination.</li>
<li>
<a rel="nofollow" href="https://github.com/keithamus/jwerty">jwerty</a> - Awesome handling of keyboard events.</li>
</ul>
<h3>Tours And Guides</h3>
<ul>
<li>
<a rel="nofollow" href="https://github.com/usablica/intro.js">intro.js</a> - A better way to introduce new features and provide step-by-step user guides for your website or project. <a rel="nofollow" href="http://spmjs.io/package/intro.js"><img src="http://spmjs.io/badge/intro.js" alt=""></a>
</li>
<li>
<a rel="nofollow" href="https://github.com/HubSpot/shepherd">shepherd</a> - Guide your users through a tour of your app. <a rel="nofollow" href="http://spmjs.io/package/shepherd"><img src="http://spmjs.io/badge/shepherd" alt=""></a>
</li>
<li>
<a rel="nofollow" href="https://github.com/sorich87/bootstrap-tour">bootstrap-tour</a> - Quick and easy product tours with Twitter Bootstrap Popovers.</li>
<li>
<a rel="nofollow" href="https://github.com/easelinc/tourist">tourist</a> - Simple, flexible tours for your app.</li>
<li>
<a rel="nofollow" href="https://github.com/heelhook/chardin.js">chardin.js</a> - Simple overlay instructions for your apps.</li>
<li>
<a rel="nofollow" href="https://github.com/tracelytics/pageguide">pageguide</a> - An interactive guide for web page elements using jQuery and CSS3.</li>
<li>
<a rel="nofollow" href="https://github.com/linkedin/hopscotch">hopscotch</a> - A framework to make it easy for developers to add product tours to their pages.</li>
<li>
<a rel="nofollow" href="https://github.com/zurb/joyride">joyride</a> - jQuery feature tour plugin.</li>
</ul>
<h3>Notifications</h3>
<ul>
<li>
<a rel="nofollow" href="https://github.com/HubSpot/messenger">messenger</a> - Growl-style alerts and messages for your app. <a rel="nofollow" href="http://spmjs.io/package/messenger"><img src="http://spmjs.io/badge/messenger" alt=""></a>
</li>
<li>
<a rel="nofollow" href="https://github.com/needim/noty">noty</a> - jQuery notification plugin.</li>
<li>
<a rel="nofollow" href="https://github.com/sciactive/pnotify">pnotify</a> - JavaScript notifications for Bootstrap, jQuery UI, and the Web Notifications Draft.</li>
<li>
<a rel="nofollow" href="https://github.com/CodeSeven/toastr">toastr</a> - Simple javascript toast notifications.</li>
<li>
<a rel="nofollow" href="https://github.com/wavded/humane-js">humane-js</a> - A simple, modern, browser notification system.</li>
<li>
<a rel="nofollow" href="https://github.com/hxgf/smoke.js">smoke.js</a> - Framework-agnostic styled alert system for javascript.</li>
</ul>
<h3>Sliders</h3>
<ul>
<li>
<a rel="nofollow" href="https://github.com/nolimits4web/Swiper">Swiper</a> - Mobile touch slider and framework with hardware accelerated transitions. <a rel="nofollow" href="http://spmjs.io/package/swiper"><img src="http://spmjs.io/badge/swiper" alt=""></a>
</li>
<li>
<a rel="nofollow" href="https://github.com/kenwheeler/slick">slick</a> - The last carousel you'll ever need. <a rel="nofollow" href="http://spmjs.io/package/slick"><img src="http://spmjs.io/badge/slick" alt=""></a>
</li>
<li>
<a rel="nofollow" href="https://github.com/woothemes/FlexSlider">FlexSlider</a> - An awesome, fully responsive jQuery slider plugin.</li>
<li>
<a rel="nofollow" href="https://github.com/idiot/unslider">unslider</a> - The simplest jQuery slider there is.</li>
<li>
<a rel="nofollow" href="https://github.com/jackmoore/colorbox">colorbox</a> - A light-weight, customizable lightbox plugin for jQuery.</li>
<li>
<a rel="nofollow" href="https://github.com/fancyapps/fancyBox">fancyBox</a> - A tool that offers a nice and elegant way to add zooming functionality for images, html content and multi-media on your webpages.</li>
<li>
<a rel="nofollow" href="https://github.com/darsain/sly">sly</a> - JavaScript library for one-directional scrolling with item based navigation support.</li>
<li>
<a rel="nofollow" href="https://github.com/jaysalvat/vegas">vegas</a> - A jQuery plugin to add beautiful fullscreen backgrounds to your webpages. It even allows Slideshows.</li>
<li>
<a rel="nofollow" href="https://github.com/IanLunn/Sequence">Sequence</a> - CSS animation framework for creating responsive sliders, presentations, banners, and other step-based applications.</li>
<li>
<a rel="nofollow" href="https://github.com/feimosi/baguetteBox.js">baguetteBox.js</a> - Simple and easy to use lightbox script written in pure JavaScript.</li>
<li>
<a rel="nofollow" href="https://github.com/hakimel/reveal.js">reveal.js</a> - A framework for easily creating beautiful presentations using HTML.</li>
<li>
<a rel="nofollow" href="https://github.com/dimsemenov/PhotoSwipe">PhotoSwipe</a> - JavaScript image gallery for mobile and desktop, modular, framework independent.</li>
</ul>
<h3>Range Sliders</h3>
<ul>
<li>
<a rel="nofollow" href="https://github.com/IonDen/ion.rangeSlider">Ion.RangeSlider</a> - Powerful and easily customizable range slider with many options and skin support.</li>
<li>
<a rel="nofollow" href="https://github.com/ghusse/jQRangeSlider">jQRangeSlider</a> - A javascript slider selector that supports dates.</li>
<li>
<a rel="nofollow" href="https://github.com/leongersen/noUiSlider">noUiSlider</a> - A lightweight, highly customizable range slider without bloat.</li>
<li>
<a rel="nofollow" href="https://github.com/andreruffert/rangeslider.js">rangeslider.js</a> - HTML5 input range slider element polyfill.</li>
</ul>
<h3>Form Widgets</h3>
<h4>Input</h4>
<ul>
<li>
<a rel="nofollow" href="https://github.com/twitter/typeahead.js">typeahead.js</a> - A fast and fully-featured autocomplete library. <a rel="nofollow" href="http://spmjs.io/package/typeahead.js"><img src="http://spmjs.io/badge/typeahead.js" alt=""></a>
</li>
<li>
<a rel="nofollow" href="https://github.com/aehlke/tag-it">tag-it</a> - A jQuery UI plugin to handle multi-tag fields as well as tag suggestions/autocomplete.</li>
<li>
<a rel="nofollow" href="https://github.com/ichord/At.js">At.js</a> - Add Github like mentions autocomplete to your application. <a rel="nofollow" href="http://spmjs.io/package/at.js"><img src="http://spmjs.io/badge/at.js" alt=""></a>
</li>
<li>
<a rel="nofollow" href="https://github.com/jamesallardice/Placeholders.js">Placeholders.js</a> - A JavaScript polyfill for the HTML5 placeholder attribute.</li>
<li>
<a rel="nofollow" href="https://github.com/yairEO/fancyInput">fancyInput</a> - Makes typing in input fields fun with CSS3 effects.</li>
<li>
<a rel="nofollow" href="https://github.com/xoxco/jQuery-Tags-Input">jQuery-Tags-Input</a> - Magically convert a simple text input into a cool tag list with this jQuery plugin.</li>
<li>
<a rel="nofollow" href="https://github.com/BankFacil/vanilla-masker">vanilla-masker</a> - A pure javascript mask input.</li>
</ul>
<h4>Calendar</h4>
<ul>
<li>
<a rel="nofollow" href="https://github.com/amsul/pickadate.js">pickadate.js</a> - The mobile-friendly, responsive, and lightweight jQuery date & time input picker.</li>
<li>
<a rel="nofollow" href="https://github.com/eternicode/bootstrap-datepicker">bootstrap-datepicker</a> - A datepicker for @twitter bootstrap, forked from Stefan Petre's (of eyecon.ro), with improvements by @eternicode.</li>
<li>
<a rel="nofollow" href="https://github.com/dbushell/Pikaday">Pikaday</a> - A refreshing JavaScript Datepicker — lightweight, no dependencies, modular CSS.</li>
<li>
<a rel="nofollow" href="https://github.com/arshaw/fullcalendar">fullcalendar</a> - Full-sized drag & drop event calendar (jQuery plugin).</li>
<li>
<a rel="nofollow" href="https://github.com/bevacqua/rome">rome</a> - A customizable date (and time) picker. Dependency free, opt-in UI. <a rel="nofollow" href="http://spmjs.io/package/rome"><img src="http://spmjs.io/badge/rome" alt=""></a>
</li>
</ul>
<h4>Select</h4>
<ul>
<li>
<a rel="nofollow" href="https://github.com/brianreavis/selectize.js">selectize.js</a> - Selectize is the hybrid of a textbox and select box. It's jQuery based and it has autocomplete and native-feeling keyboard navigation; useful for tagging, contact lists, etc. <a rel="nofollow" href="http://spmjs.io/package/selectize"><img src="http://spmjs.io/badge/selectize" alt=""></a>
</li>
<li>
<a rel="nofollow" href="https://github.com/ivaynberg/select2">select2</a> - a jQuery based replacement for select boxes. It supports searching, remote data sets, and infinite scrolling of results. <a rel="nofollow" href="http://spmjs.io/package/select2"><img src="http://spmjs.io/badge/select2" alt=""></a>
</li>
<li>
<a rel="nofollow" href="https://github.com/harvesthq/chosen">chosen</a> - A library for making long, unwieldy select boxes more friendly.</li>
</ul>
<h4>File Uploader</h4>
<ul>
<li>
<a rel="nofollow" href="https://github.com/blueimp/jQuery-File-Upload">jQuery-File-Upload</a> - File Upload widget with multiple file selection, drag&drop support, progress bar, validation and preview images, audio and video for jQuery.</li>
<li>
<a rel="nofollow" href="https://github.com/enyo/dropzone">dropzone</a> - Dropzone is an easy to use drag'n'drop library. It supports image previews and shows nice progress bars. <a rel="nofollow" href="http://spmjs.io/package/dropzone"><img src="http://spmjs.io/badge/dropzone" alt=""></a>
</li>
<li>
<a rel="nofollow" href="https://github.com/flowjs/flow.js">flow.js</a> - A JavaScript library providing multiple simultaneous, stable, fault-tolerant and resumable/restartable file uploads via the HTML5 File API.</li>
<li>
<a rel="nofollow" href="https://github.com/Widen/fine-uploader">fine-uploader</a> - Multiple file upload plugin with progress-bar, drag-and-drop, direct-to-S3 uploading.</li>
<li>
<a rel="nofollow" href="https://github.com/mailru/FileAPI">FileAPI</a> - A set of javascript tools for working with files. Multiupload, drag'n'drop and chunked file upload. Images: crop, resize and auto orientation by EXIF.</li>
<li>
<a rel="nofollow" href="https://github.com/moxiecode/plupload">plupload</a> - A JavaScript API for dealing with file uploads. It supports features like multiple file selection, file type filtering, request chunking and client-side image scaling, and it uses different runtimes to achieve this, such as HTML5, Silverlight and Flash.</li>
</ul>
<h4>Other</h4>
<ul>
<li>
<a rel="nofollow" href="https://github.com/malsup/form">form</a> - jQuery Form Plugin.</li>
<li>
<a rel="nofollow" href="https://github.com/guillaumepotier/Garlic.js">Garlic.js</a> - Automatically persist your forms' text and select field values locally, until the form is submitted.</li>
<li>
<a rel="nofollow" href="https://github.com/RadLikeWhoa/Countable">Countable</a> - A JavaScript function to add live paragraph-, word- and character-counting to an HTML element.</li>
<li>
<a rel="nofollow" href="https://github.com/jessepollak/card">card</a> - Make your credit card form better in one line of code.</li>
</ul>
<h3>Tips</h3>
<ul>
<li>
<a rel="nofollow" href="https://github.com/jaz303/tipsy">tipsy</a> - Facebook-style tooltips plugin for jQuery.</li>
<li>
<a rel="nofollow" href="https://github.com/enyo/opentip">opentip</a> - An open source javascript tooltip based on the prototype framework. <a rel="nofollow" href="http://spmjs.io/package/opentip"><img src="http://spmjs.io/badge/opentip" alt=""></a>
</li>
<li>
<a rel="nofollow" href="https://github.com/qTip2/qTip2">qTip2</a> - Pretty powerful tooltips.</li>
<li>
<a rel="nofollow" href="https://github.com/iamceege/tooltipster">tooltipster</a> - A jQuery tooltip plugin.</li>
<li>
<a rel="nofollow" href="https://github.com/arashmanteghi/simptip">simptip</a> - A simple CSS tooltip made with Sass.</li>
</ul>
<h3>Modals and Popups</h3>
<ul>
<li>
<a rel="nofollow" href="https://github.com/dimsemenov/Magnific-Popup">Magnific-Popup</a> - Light and responsive lightbox script with focus on performance.</li>
<li>
<a rel="nofollow" href="https://github.com/gristmill/jquery-popbox">jquery-popbox</a> - jQuery PopBox UI Element.</li>
<li>
<a rel="nofollow" href="https://github.com/voronianski/jquery.avgrund.js">jquery.avgrund.js</a> - A jQuery plugin with new modal concept for popups.</li>
<li>
<a rel="nofollow" href="https://github.com/HubSpot/vex">vex</a> - A modern dialog library which is highly configurable and easy to style.</li>
<li>
<a rel="nofollow" href="https://github.com/jschr/bootstrap-modal">bootstrap-modal</a> - Extends the default Bootstrap Modal class. Responsive, stackable, ajax and more.</li>
<li>
<a rel="nofollow" href="https://github.com/drublic/css-modal">css-modal</a> - A modal built out of pure CSS.</li>
</ul>
<h3>Scroll</h3>
<ul>
<li>
<a rel="nofollow" href="https://github.com/sakabako/scrollMonitor">scrollMonitor</a> - A simple and fast API to monitor elements as you scroll.</li>
<li>
<a rel="nofollow" href="https://github.com/WickyNilliams/headroom.js">headroom</a> - Give your pages some headroom. Hide your header until you need it.</li>
<li>
<a rel="nofollow" href="https://github.com/peachananr/onepage-scroll">onepage-scroll</a> - Create an Apple-like one page scroller website (iPhone 5S website) with One Page Scroll plugin.</li>
<li>
<a rel="nofollow" href="https://github.com/cubiq/iscroll">iscroll</a> - iScroll is a high performance, small footprint, dependency free, multi-platform javascript scroller.</li>
<li>
<a rel="nofollow" href="https://github.com/Prinzhorn/skrollr">skrollr</a> - Stand-alone parallax scrolling library for mobile (Android + iOS) and desktop. No jQuery.</li>
<li>
<a rel="nofollow" href="https://github.com/wagerfield/parallax">parallax</a> - Parallax Engine that reacts to the orientation of a smart device.</li>
<li>
<a rel="nofollow" href="https://github.com/markdalgleish/stellar.js">stellar.js</a> - Parallax scrolling made easy.</li>
<li>
<a rel="nofollow" href="https://github.com/cameronmcefee/plax">plax</a> - jQuery powered parallaxing.</li>
<li>
<a rel="nofollow" href="https://github.com/stephband/jparallax">jparallax</a> - jQuery plugin for creating interactive parallax effect.</li>
<li>
<a rel="nofollow" href="https://github.com/alvarotrigo/fullPage.js">fullPage</a> - A simple and easy to use plugin to create fullscreen scrolling websites (also known as single page websites).</li>
</ul>
<h3>Menu</h3>
<ul>
<li>
<a rel="nofollow" href="https://github.com/kamens/jQuery-menu-aim">jQuery-menu-aim</a> - jQuery plugin to fire events when user's cursor aims at particular dropdown menu items. For making responsive mega dropdowns like Amazon's.</li>
</ul>
<h3>Table/Grid</h3>
<ul>
<li>
<a rel="nofollow" href="https://github.com/hikalkan/jtable">jTable</a> - A jQuery plugin to create AJAX based CRUD tables.</li>
</ul>
<h3>Gesture</h3>
<ul>
<li>
<a rel="nofollow" href="https://github.com/hammerjs/hammer.js">hammer.js</a> - A javascript library for multi-touch gestures. <a rel="nofollow" href="http://spmjs.io/package/hammerjs"><img src="http://spmjs.io/badge/hammerjs" alt=""></a>
</li>
<li>
<a rel="nofollow" href="https://github.com/hammerjs/touchemulator">touchemulator</a> - Emulate touch input on your desktop.</li>
</ul>
<h3>Maps</h3>
<ul>
<li>
<a rel="nofollow" href="https://github.com/Leaflet/Leaflet">Leaflet</a> - JavaScript library for mobile-friendly interactive maps.</li>
<li>
<a rel="nofollow" href="https://github.com/AnalyticalGraphicsInc/cesium">Cesium</a> - Open Source WebGL virtual globe and map engine.</li>
<li>
<a rel="nofollow" href="https://github.com/HPNeo/gmaps">gmaps</a> - The easiest way to use Google Maps.</li>
<li>
<a rel="nofollow" href="https://github.com/simplegeo/polymaps">polymaps</a> - A free JavaScript library for making dynamic, interactive maps in modern web browsers.</li>
<li>
<a rel="nofollow" href="https://github.com/kartograph/kartograph.js">kartograph.js</a> - Open source JavaScript renderer for Kartograph SVG maps.</li>
<li>
<a rel="nofollow" href="https://github.com/mapbox/mapbox.js">mapbox.js</a> - Mapbox JavaScript API, a Leaflet Plugin.</li>
<li>
<a rel="nofollow" href="https://github.com/manifestinteractive/jqvmap">jqvmap</a> - jQuery Vector Map Library.</li>
</ul>
<h3>Animations</h3>
<ul>
<li>
<a rel="nofollow" href="https://github.com/julianshapiro/velocity">velocity</a> - Accelerated JavaScript animation.</li>
<li>
<a rel="nofollow" href="https://github.com/rstacruz/jquery.transit">jquery.transit</a> - Super-smooth CSS3 transformations and transitions for jQuery.</li>
<li>
<a rel="nofollow" href="https://github.com/tictail/bounce.js">bounce.js</a> - Create tasty CSS3 powered animations in no time.</li>
<li>
<a rel="nofollow" href="https://github.com/greensock/GreenSock-JS">GreenSock-JS</a> - High-performance HTML5 animations that work in all major browsers.</li>
</ul>
<h3>Image Processing</h3>
<ul>
<li>
<a rel="nofollow" href="https://github.com/davidsonfellipe/lena.js">lena.js</a> - A Library for image processing with filters and util functions.</li>
</ul>
<h3>ES6</h3>
<ul>
<li>
<a rel="nofollow" href="https://github.com/lukehoban/es6features">es6features</a> - Overview of ECMAScript 6 features.</li>
<li>
<a rel="nofollow" href="http://kangax.github.io/compat-table/es6/">ECMAScript 6 compatibility table</a> - Compatibility tables for all ECMAScript 6 features on a variety of environments.</li>
<li>
<a rel="nofollow" href="https://github.com/sebmck/6to5">6to5</a> - Turn ES6+ code into vanilla ES5 with no runtime.</li>
<li>
<a rel="nofollow" href="https://github.com/google/traceur-compiler">Traceur compiler</a> - Compiles ES6 features to ES5. Includes classes, generators, promises, destructuring patterns, default parameters & more.</li>
</ul>
<h3>Misc</h3>
<ul>
<li>
<a rel="nofollow" href="https://github.com/toddmotto/echo">echo</a> - Lazy-loading images with data-* attributes. <a rel="nofollow" href="http://spmjs.io/package/echo.js"><img src="http://spmjs.io/badge/echo.js" alt=""></a>
</li>
<li>
<a rel="nofollow" href="https://github.com/bestiejs/platform.js">platform.js</a> - A platform detection library that works on nearly all JavaScript platforms. <a rel="nofollow" href="http://spmjs.io/package/platform.js"><img src="http://spmjs.io/badge/platform.js" alt=""></a>
</li>
<li>
<a rel="nofollow" href="https://github.com/bestiejs/json3">json3</a> - A modern JSON implementation compatible with nearly all JavaScript platforms. <a rel="nofollow" href="http://spmjs.io/package/json3"><img src="http://spmjs.io/badge/json3" alt=""></a>
</li>
</ul>
<h2>Other Awesome Lists</h2>
<ul>
<li><a rel="nofollow" href="https://github.com/emijrp/awesome-awesome">emijrp/awesome-awesome</a></li>
<li><a rel="nofollow" href="https://github.com/bayandin/awesome-awesomeness">bayandin/awesome-awesomeness</a></li>
<li><a rel="nofollow" href="https://github.com/sindresorhus/awesome">sindresorhus/awesome</a></li>
<li><a rel="nofollow" href="https://github.com/jnv/lists">jnv/lists</a></li>
<li><a rel="nofollow" href="https://github.com/gianarb/awesome-angularjs">gianarb/awesome-angularjs</a></li>
<li><a rel="nofollow" href="https://github.com/peterkokot/awesome-dojo">peterkokot/awesome-dojo</a></li>
<li><a rel="nofollow" href="https://github.com/addyosmani/es6-tools">addyosmani/es6-tools</a></li>
<li><a rel="nofollow" href="https://github.com/ericdouglas/ES6-Learning">ericdouglas/ES6-Learning</a></li>
</ul>
<h2>Contributing</h2>
<p>Contributions welcome! Read the <a rel="nofollow">contribution guidelines</a> first.</p>
Awesome Python II
https://segmentfault.com/a/1190000002517893
2015-01-28T09:10:12+08:00
2015-01-28T09:10:12+08:00
timger
https://segmentfault.com/u/timger
2
<h3>Caching</h3>
<p><em>Libraries for caching data.</em></p>
<ul>
<li>
<a rel="nofollow" href="http://beaker.readthedocs.org/">Beaker</a> - A library for caching and sessions for use with web applications and stand-alone Python scripts and applications.</li>
<li>
<a rel="nofollow" href="http://dogpilecache.readthedocs.org/">dogpile.cache</a> - dogpile.cache is a next-generation replacement for Beaker, made by the same authors.</li>
<li>
<a rel="nofollow" href="https://pypi.python.org/pypi/HermesCache">HermesCache</a> - Python caching library with tag-based invalidation and dogpile effect prevention.</li>
<li>
<a rel="nofollow" href="https://github.com/jbalogh/django-cache-machine">django-cache-machine</a> - Automatic caching and invalidation for Django models through the ORM.</li>
<li>
<a rel="nofollow" href="https://github.com/Suor/django-cacheops">django-cacheops</a> - A slick ORM cache with automatic granular event-driven invalidation.</li>
<li>
<a rel="nofollow" href="https://github.com/jmoiron/johnny-cache">johnny-cache</a> - A caching framework for django applications.</li>
<li>
<a rel="nofollow" href="https://github.com/5monkeys/django-viewlet">django-viewlet</a> - Render template parts with extended cache control.</li>
<li>
<a rel="nofollow" href="https://github.com/lericson/pylibmc">pylibmc</a> - A Python wrapper around the <a rel="nofollow" href="http://libmemcached.org/libMemcached.html">libmemcached</a> interface.</li>
</ul>
<h3>Email</h3>
<p><em>Libraries for sending and parsing email.</em></p>
<ul>
<li>
<a rel="nofollow" href="https://github.com/kennethreitz/inbox.py">inbox.py</a> - Python SMTP Server for Humans.</li>
<li>
<a rel="nofollow" href="https://github.com/martinrusev/imbox">imbox</a> - Python IMAP for Humans.</li>
<li>
<a rel="nofollow" href="https://github.com/inboxapp/inbox">inbox</a> - The open source email toolkit.</li>
<li>
<a rel="nofollow" href="https://github.com/zedshaw/lamson">lamson</a> - Pythonic SMTP Application Server.</li>
<li>
<a rel="nofollow" href="https://github.com/mailgun/flanker">flanker</a> - An email address and MIME parsing library.</li>
<li>
<a rel="nofollow" href="https://github.com/marrow/marrow.mailer">marrow.mailer</a> - High-performance extensible mail delivery framework.</li>
<li>
<a rel="nofollow" href="https://github.com/StreetVoice/django-celery-ses">django-celery-ses</a> - Django email backend with AWS SES and Celery.</li>
<li>
<a rel="nofollow" href="https://github.com/tonioo/modoboa">modoboa</a> - A mail hosting and management platform including a modern and simplified Web UI.</li>
<li>
<a rel="nofollow" href="http://tomekwojcik.github.io/envelopes/">envelopes</a> - Mailing for human beings.</li>
<li>
<a rel="nofollow" href="https://github.com/WoLpH/mailjet">mailjet</a> - Mailjet API implementation for batch mailing, statistics and more.</li>
<li>
<a rel="nofollow" href="https://github.com/mailgun/talon">Talon</a> - Mailgun library to extract message quotations and signatures.</li>
<li>
<a rel="nofollow" href="http://www.magiksys.net/pyzmail/">pyzmail</a> - Compose, send and parse emails.</li>
</ul>
<h3>Internationalization</h3>
<p><em>Libraries for working with i18n.</em></p>
<ul>
<li>
<a rel="nofollow" href="http://babel.pocoo.org/">Babel</a> - An internationalization library for Python.</li>
<li>
<a rel="nofollow" href="https://korean.readthedocs.org/">Korean</a> - A library for <a rel="nofollow" href="http://en.wikipedia.org/wiki/Korean_language">Korean</a> morphology.</li>
</ul>
<h3>URL Manipulation</h3>
<p><em>Libraries for parsing URLs.</em></p>
<ul>
<li>
<a rel="nofollow" href="https://github.com/gruns/furl">furl</a> - A small Python library that makes manipulating URLs simple.</li>
<li>
<a rel="nofollow" href="https://github.com/codeinthehole/purl">purl</a> - A simple, immutable URL class with a clean API for interrogation and manipulation.</li>
<li>
<a rel="nofollow" href="https://github.com/ellisonleao/pyshorteners">pyshorteners</a> - A pure Python URL shortening lib.</li>
<li>
<a rel="nofollow" href="https://github.com/Alir3z4/python-short_url">short_url</a> - Python implementation for generating Tiny URL and bit.ly-like URLs.</li>
<li>
<a rel="nofollow" href="https://github.com/sloria/webargs">webargs</a> - A friendly library for parsing HTTP request arguments, with built-in support for popular web frameworks, including Flask, Django, Bottle, Tornado, and Pyramid.</li>
</ul>
<h3>HTML Manipulation</h3>
<p><em>Libraries for working with HTML and XML.</em></p>
<ul>
<li>
<a rel="nofollow" href="http://www.crummy.com/software/BeautifulSoup/bs4/doc/">BeautifulSoup</a> - Providing Pythonic idioms for iterating, searching, and modifying HTML or XML.</li>
<li>
<a rel="nofollow" href="http://lxml.de/">lxml</a> - A very fast, easy-to-use and versatile library for handling HTML and XML.</li>
<li>
<a rel="nofollow" href="https://github.com/html5lib/html5lib-python">html5lib</a> - A standards-compliant library for parsing and serializing HTML documents and fragments.</li>
<li>
<a rel="nofollow" href="https://github.com/gawel/pyquery">pyquery</a> - A jQuery-like library for parsing HTML.</li>
<li>
<a rel="nofollow" href="https://pypi.python.org/pypi/cssutils/">cssutils</a> - A CSS library for Python.</li>
<li>
<a rel="nofollow" href="https://github.com/mitsuhiko/markupsafe">MarkupSafe</a> - Implements an XML/HTML/XHTML markup-safe string for Python.</li>
<li>
<a rel="nofollow" href="http://bleach.readthedocs.org/">bleach</a> - A whitelist-based HTML sanitization and text linkification library.</li>
<li>
<a rel="nofollow" href="https://github.com/martinblech/xmltodict">xmltodict</a> - Makes working with XML feel like working with JSON.</li>
<li>
<a rel="nofollow" href="https://github.com/chrisglass/xhtml2pdf">xhtml2pdf</a> - HTML/CSS to PDF converter.</li>
<li>
<a rel="nofollow" href="https://github.com/stchris/untangle">untangle</a> - Converts XML documents to Python objects for easy access.</li>
</ul>
<h3>Web Crawling</h3>
<p><em>Libraries for scraping websites.</em></p>
<ul>
<li>
<a rel="nofollow" href="http://scrapy.org/">Scrapy</a> - A fast high-level screen scraping and web crawling framework.</li>
<li>
<a rel="nofollow" href="https://github.com/scrapinghub/portia">portia</a> - Visual scraping for Scrapy.</li>
<li>
<a rel="nofollow" href="http://pythonhosted.org/feedparser/">feedparser</a> - Universal feed parser.</li>
<li>
<a rel="nofollow" href="https://github.com/jmcarp/robobrowser">RoboBrowser</a> - A simple, Pythonic library for browsing the web without a standalone web browser.</li>
<li>
<a rel="nofollow" href="https://github.com/hickford/MechanicalSoup">MechanicalSoup</a> - A Python library for automating interaction with websites.</li>
<li>
<a rel="nofollow" href="http://wwwsearch.sourceforge.net/mechanize/">mechanize</a> - Stateful programmatic web browsing.</li>
<li>
<a rel="nofollow" href="https://github.com/matiasb/demiurge">Demiurge</a> - PyQuery-based scraping micro-framework.</li>
<li>
<a rel="nofollow" href="https://github.com/chineking/cola">cola</a> - A distributed crawling framework.</li>
<li>
<a rel="nofollow" href="https://github.com/binux/pyspider">pyspider</a> - A powerful spider system.</li>
</ul>
<h3>Web Content Extracting</h3>
<p><em>Libraries for extracting web contents.</em></p>
<ul>
<li>
<a rel="nofollow" href="https://github.com/codelucas/newspaper">newspaper</a> - News extraction, article extraction and content curation in Python.</li>
<li>
<a rel="nofollow" href="https://github.com/Alir3z4/html2text">html2text</a> - Convert HTML to Markdown-formatted text.</li>
<li>
<a rel="nofollow" href="https://github.com/grangier/python-goose">python-goose</a> - HTML Content/Article Extractor.</li>
<li>
<a rel="nofollow" href="https://github.com/michaelhelmick/lassie">lassie</a> - Web Content Retrieval for Humans.</li>
<li>
<a rel="nofollow" href="https://github.com/coleifer/micawber">micawber</a> - A small library for extracting rich content from URLs.</li>
<li>
<a rel="nofollow" href="https://github.com/miso-belica/sumy">sumy</a> - A module for automatic summarization of text documents and HTML pages.</li>
<li>
<a rel="nofollow" href="https://github.com/vinta/Haul">Haul</a> - An Extensible Image Crawler.</li>
<li>
<a rel="nofollow" href="https://github.com/buriy/python-readability">python-readability</a> - Fast Python port of arc90's readability tool.</li>
<li>
<a rel="nofollow" href="https://github.com/erikriver/opengraph">opengraph</a> - A Python module to parse the Open Graph Protocol.</li>
<li>
<a rel="nofollow" href="https://github.com/deanmalmgren/textract">textract</a> - Extract text from any document, Word, PowerPoint, PDFs, etc.</li>
<li>
<a rel="nofollow" href="https://github.com/Alir3z4/sanitize">sanitize</a> - Bringing sanity to the world of messed-up data.</li>
</ul>
<h3>Forms</h3>
<p><em>Libraries for working with forms.</em></p>
<ul>
<li>
<a rel="nofollow" href="http://wtforms.readthedocs.org/">WTForms</a> - A flexible forms validation and rendering library.</li>
<li>
<a rel="nofollow" href="http://wtforms-json.readthedocs.org/">WTForms-JSON</a> - A WTForms extension for JSON data handling.</li>
<li>
<a rel="nofollow" href="http://deform.readthedocs.org/">Deform</a> - Python HTML form generation library influenced by the formish form generation library.</li>
<li>
<a rel="nofollow" href="https://github.com/dyve/django-bootstrap3">django-bootstrap3</a> - Bootstrap 3 integration with Django.</li>
<li>
<a rel="nofollow" href="http://django-crispy-forms.readthedocs.org/">django-crispy-forms</a> - A Django app which lets you create beautiful forms in a very elegant and DRY way.</li>
<li>
<a rel="nofollow" href="https://github.com/WiserTogether/django-remote-forms">django-remote-forms</a> - A platform independent Django form serializer.</li>
</ul>
<h3>Data Validation</h3>
<p><em>Libraries for validating data. Used for forms in many cases.</em></p>
<ul>
<li>
<a rel="nofollow" href="https://github.com/alecthomas/voluptuous">voluptuous</a> - A Python data validation library. It is primarily intended for validating data coming into Python as JSON, YAML, etc.</li>
<li>
<a rel="nofollow" href="http://docs.pylonsproject.org/projects/colander/">colander</a> - A system for validating and deserializing data obtained via XML, JSON, an HTML form post or any other equally simple data serialization.</li>
<li>
<a rel="nofollow" href="https://github.com/halst/schema">schema</a> - A library for validating Python data structures.</li>
<li>
<a rel="nofollow" href="https://github.com/schematics/schematics">Schematics</a> - Data Structure Validation.</li>
<li>
<a rel="nofollow" href="https://github.com/ambitioninc/kmatch">kmatch</a> - A language for matching/validating/filtering Python dictionaries.</li>
<li>
<a rel="nofollow" href="https://github.com/podio/valideer">valideer</a> - Lightweight extensible data validation and adaptation library.</li>
</ul>
<h3>Anti-spam</h3>
<p><em>Libraries for fighting spam.</em></p>
<ul>
<li>
<a rel="nofollow" href="https://github.com/phalt/stopspam">Stopspam</a> - Intelligent spam detection for Python.</li>
<li>
<a rel="nofollow" href="https://github.com/moqada/django-simple-spam-blocker">django-simple-spam-blocker</a> - Simple spam blocker for Django.</li>
<li>
<a rel="nofollow" href="https://github.com/mbi/django-simple-captcha">django-simple-captcha</a> - A simple and highly customizable Django app to add captcha images to any Django form.</li>
</ul>
<h3>Tagging</h3>
<p><em>Libraries for tagging items.</em></p>
<ul>
<li>
<a rel="nofollow" href="https://github.com/alex/django-taggit">django-taggit</a> - Simple tagging for Django.</li>
</ul>
<h3>Admin Panels</h3>
<p><em>Libraries for administrative interfaces.</em></p>
<ul>
<li>
<a rel="nofollow" href="https://github.com/Eugeny/ajenti">Ajenti</a> - The admin panel your servers deserve.</li>
<li>
<a rel="nofollow" href="http://grappelliproject.com">Grappelli</a> - A jazzy skin for the Django Admin-Interface.</li>
<li>
<a rel="nofollow" href="http://djangosuit.com/">django-suit</a> - Alternative Django Admin-Interface (free only for Non-commercial use).</li>
<li>
<a rel="nofollow" href="https://github.com/sshwsfc/django-xadmin">django-xadmin</a> - A drop-in replacement for Django admin that comes with lots of goodies.</li>
<li>
<a rel="nofollow" href="https://github.com/mrjoes/flask-admin">flask-admin</a> - Simple and extensible administrative interface framework for Flask.</li>
<li>
<a rel="nofollow" href="https://github.com/mher/flower">flower</a> - Real-time monitor and web admin for Celery.</li>
</ul>
<h3>Static Site Generator</h3>
<p><em>A static site generator is software that takes text and templates as input and produces HTML files as output.</em></p>
<ul>
<li>
<a rel="nofollow" href="http://blog.getpelican.com/">Pelican</a> - Uses Markdown or ReST for content and Jinja 2 for themes. Supports DVCS, Disqus. AGPL.</li>
<li>
<a rel="nofollow" href="http://github.com/koenbok/Cactus/">Cactus</a> - Static site generator for designers.</li>
<li>
<a rel="nofollow" href="https://hyde.github.com/">Hyde</a> - Jinja2-based static web site generator.</li>
<li>
<a rel="nofollow" href="http://www.getnikola.com/">Nikola</a> - A static website and blog generator.</li>
<li>
<a rel="nofollow" href="http://tags.brace.io/">Tags</a> - The simplest static site generator.</li>
<li>
<a rel="nofollow" href="http://tinkerer.me/">Tinkerer</a> - Tinkerer is a blogging engine/static website generator powered by Sphinx.</li>
</ul>
<h3>Processes and Threads</h3>
<p><em>Libraries for working with processes or threads.</em></p>
<ul>
<li>
<a rel="nofollow" href="https://docs.python.org/2/library/multiprocessing.html">multiprocessing</a> - (Python standard library) Process-based "threading" interface.</li>
<li>
<a rel="nofollow" href="https://docs.python.org/2/library/threading.html">threading</a> - (Python standard library) Higher-level threading interface.</li>
<li>
<a rel="nofollow" href="https://github.com/kennethreitz/envoy">envoy</a> - Python Subprocesses for Humans™.</li>
<li>
<a rel="nofollow" href="https://github.com/amoffat/sh">sh</a> - A full-fledged <a rel="nofollow" href="https://docs.python.org/2/library/subprocess.html">subprocess</a> replacement for Python.</li>
<li>
<a rel="nofollow" href="http://sarge.readthedocs.org/">sarge</a> - A wrapper for subprocess.</li>
</ul>
<h3>Concurrency and Networking</h3>
<p><em>Libraries for concurrency and network programming.</em></p>
<ul>
<li>
<a rel="nofollow" href="https://docs.python.org/3/library/asyncio.html">asyncio</a> - (Python standard library in Python 3.4+) Asynchronous I/O, event loop, coroutines and tasks.</li>
<li>
<a rel="nofollow" href="http://www.gevent.org/">gevent</a> - A coroutine-based Python networking library that uses <a rel="nofollow" href="https://github.com/python-greenlet/greenlet">greenlet</a>.</li>
<li>
<a rel="nofollow" href="https://twistedmatrix.com/trac/">Twisted</a> - An event-driven networking engine.</li>
<li>
<a rel="nofollow" href="http://www.tornadoweb.org/">Tornado</a> - A Web framework and asynchronous networking library.</li>
<li>
<a rel="nofollow" href="https://github.com/quantmind/pulsar">pulsar</a> - Event-driven concurrent framework for Python.</li>
<li>
<a rel="nofollow" href="https://github.com/jamwt/diesel">diesel</a> - Greenlet-based event I/O Framework for Python.</li>
<li>
<a rel="nofollow" href="http://eventlet.net/">eventlet</a> - Asynchronous framework with WSGI support.</li>
<li>
<a rel="nofollow" href="http://zeromq.github.io/pyzmq/">pyzmq</a> - A Python wrapper for the 0MQ message library.</li>
<li>
<a rel="nofollow" href="https://github.com/smira/txZMQ">txZMQ</a> - Twisted based wrapper for the 0MQ message library.</li>
<li>
<a rel="nofollow" href="http://crossbar.io">Crossbar</a> - Open-source Unified Application Router (WebSocket &amp; WAMP for Python on Autobahn).</li>
</ul>
<h3>WebSocket</h3>
<p><em>Libraries for working with WebSocket.</em></p>
<ul>
<li>
<a rel="nofollow" href="https://github.com/tavendo/AutobahnPython">AutobahnPython</a> - WebSocket & WAMP for Python on Twisted and <a rel="nofollow" href="https://docs.python.org/3/library/asyncio.html">asyncio</a>.</li>
<li>
<a rel="nofollow" href="https://github.com/Lawouach/WebSocket-for-Python">WebSocket-for-Python</a> - WebSocket client and server library for Python 2 and 3 as well as PyPy.</li>
</ul>
<h3>WSGI Servers</h3>
<p><em>WSGI-compatible web servers.</em></p>
<ul>
<li>
<a rel="nofollow" href="http://docs.python.org/library/wsgiref.html">wsgiref</a> - (Python standard library) WSGI reference implementation, single-threaded.</li>
<li>
<a rel="nofollow" href="http://werkzeug.pocoo.org/">Werkzeug</a> - A WSGI utility library for Python that powers Flask and can easily be embedded into your own projects.</li>
<li>
<a rel="nofollow" href="http://pythonpaste.org/">paste</a> - Multi-threaded, stable, tried and tested.</li>
<li>
<a rel="nofollow" href="http://pypi.python.org/pypi/rocket">rocket</a> - Multi-threaded.</li>
<li>
<a rel="nofollow" href="https://waitress.readthedocs.org/">waitress</a> - Multi-threaded, powers Pyramid.</li>
<li>
<a rel="nofollow" href="https://github.com/hivesolutions/netius">netius</a> - Asynchronous, very fast.</li>
<li>
<a rel="nofollow" href="http://pypi.python.org/pypi/gunicorn">gunicorn</a> - Pre-forked, ported from Ruby's Unicorn project.</li>
<li>
<a rel="nofollow" href="http://www.fapws.org/">fapws3</a> - Asynchronous (network side only), written in C.</li>
<li>
<a rel="nofollow" href="http://pypi.python.org/pypi/meinheld">meinheld</a> - Asynchronous, partly written in C.</li>
<li>
<a rel="nofollow" href="http://pypi.python.org/pypi/bjoern">bjoern</a> - Asynchronous, very fast and written in C.</li>
</ul>
<h3>RPC Servers</h3>
<p><em>RPC-compatible servers.</em></p>
<ul>
<li>
<a rel="nofollow" href="https://docs.python.org/2/library/simplexmlrpcserver.html">SimpleXMLRPCServer</a> - (Python standard library) Simple XML-RPC server implementation, single-threaded.</li>
<li>
<a rel="nofollow" href="https://github.com/joshmarshall/jsonrpclib/">SimpleJSONRPCServer</a> - This library is an implementation of the JSON-RPC specification.</li>
<li>
<a rel="nofollow" href="https://github.com/dotcloud/zerorpc-python">zeroRPC</a> - zerorpc is a flexible RPC implementation based on <a rel="nofollow" href="http://zeromq.org/">ZeroMQ</a> and <a rel="nofollow" href="http://msgpack.org/">MessagePack</a>.</li>
</ul>
<h3>Cryptography</h3>
<ul>
<li>
<a rel="nofollow" href="https://www.dlitz.net/software/pycrypto/">PyCrypto</a> - The Python Cryptography Toolkit.</li>
<li>
<a rel="nofollow" href="http://www.paramiko.org/">Paramiko</a> - A Python (2.6+, 3.3+) implementation of the SSHv2 protocol, providing both client and server functionality.</li>
<li>
<a rel="nofollow" href="https://cryptography.io/">cryptography</a> - A package designed to expose cryptographic primitives and recipes to Python developers.</li>
<li>
<a rel="nofollow" href="https://github.com/pyca/pynacl">PyNaCl</a> - Python binding to the Networking and Cryptography (NaCl) library.</li>
<li>
<a rel="nofollow" href="https://github.com/davidaurelio/hashids-python">hashids</a> - Implementation of <a rel="nofollow" href="http://hashids.org">hashids</a> in Python.</li>
<li>
<a rel="nofollow" href="https://pythonhosted.org/passlib/">Passlib</a> - Secure password storage/hashing library, very high level.</li>
</ul>
<h3>GUI</h3>
<p><em>Libraries for working with graphical user interface applications.</em></p>
<ul>
<li>
<a rel="nofollow" href="http://www.riverbankcomputing.co.uk/software/pyqt/intro">PyQt</a> - Python bindings for the <a rel="nofollow" href="http://qt-project.org/">Qt</a> cross-platform application and UI framework, with support for both Qt v4 and Qt v5 frameworks.</li>
<li>
<a rel="nofollow" href="http://qt-project.org/wiki/pyside">PySide</a> - Python bindings for the <a rel="nofollow" href="http://qt-project.org/">Qt</a> cross-platform application and UI framework, supporting the Qt v4 framework.</li>
<li>
<a rel="nofollow" href="http://wxpython.org/">wxPython</a> - A blending of the wxWidgets C++ class library with Python.</li>
<li>
<a rel="nofollow" href="http://kivy.org/">kivy</a> - A library for creating NUI applications, running on Windows, Linux, Mac OS X, Android and iOS.</li>
<li>
<a rel="nofollow" href="https://docs.python.org/2/library/curses.html#module-curses">curses</a> - Built-in wrapper for <a rel="nofollow" href="http://www.gnu.org/software/ncurses/">ncurses</a> used to create terminal GUI applications.</li>
<li>
<a rel="nofollow" href="http://urwid.org/">urwid</a> - A library for creating terminal GUI applications with strong support for widgets, events, rich colors, etc.</li>
<li>
<a rel="nofollow" href="http://www.pyglet.org/">pyglet</a> - A cross-platform windowing and multimedia library for Python.</li>
<li>
<a rel="nofollow" href="https://wiki.python.org/moin/TkInter">Tkinter</a> - Tkinter is Python's de-facto standard GUI package.</li>
<li>
<a rel="nofollow" href="https://github.com/nucleic/enaml">enaml</a> - Creating beautiful user interfaces with a declarative syntax like QML.</li>
<li>
<a rel="nofollow" href="https://github.com/pybee/toga">Toga</a> - A Python native, OS native GUI toolkit.</li>
</ul>
<h3>Game Development</h3>
<p><em>Awesome game development libraries.</em></p>
<ul>
<li>
<a rel="nofollow" href="http://www.pygame.org/news.html">Pygame</a> - Pygame is a set of Python modules designed for writing games.</li>
<li>
<a rel="nofollow" href="http://cocos2d.org/">Cocos2d</a> - cocos2d is a framework for building 2D games, demos, and other graphical/interactive applications. It is based on pyglet.</li>
<li>
<a rel="nofollow" href="http://pysdl2.readthedocs.org/">PySDL2</a> - A ctypes based wrapper for the SDL2 library.</li>
<li>
<a rel="nofollow" href="https://www.panda3d.org/">Panda3D</a> - 3D game engine developed by Disney and maintained by Carnegie Mellon's Entertainment Technology Center. Written in C++, completely wrapped in Python.</li>
<li>
<a rel="nofollow" href="http://www.ogre3d.org/tikiwiki/PyOgre">PyOgre</a> - Python bindings for the Ogre 3D render engine, can be used for games, simulations, anything 3D.</li>
<li>
<a rel="nofollow" href="http://pyopengl.sourceforge.net/">PyOpenGL</a> - Python ctypes bindings for OpenGL and its related APIs.</li>
<li>
<a rel="nofollow" href="http://www.python-sfml.org/">PySFML</a> - Python bindings for <a rel="nofollow" href="http://www.sfml-dev.org/">SFML</a>.
</li>
<li>
<a rel="nofollow" href="http://www.renpy.org/">RenPy</a> - A Visual Novel engine.</li>
</ul>
<h3>Logging</h3>
<p><em>Libraries for generating and working with log files.</em></p>
<ul>
<li>
<a rel="nofollow" href="https://docs.python.org/2/library/logging.html">logging</a> - (Python standard library) Logging facility for Python.</li>
<li>
<a rel="nofollow" href="http://pythonhosted.org/Logbook/">logbook</a> - Logging replacement for Python.</li>
<li>
<a rel="nofollow" href="https://pypi.python.org/pypi/sentry">Sentry</a> - A realtime logging and aggregation server.</li>
<li>
<a rel="nofollow" href="http://raven.readthedocs.org/">Raven</a> - The Python client for Sentry.</li>
</ul>
<h3>Testing</h3>
<p><em>Libraries for testing codebases and generating test data.</em></p>
<ul>
<li>Testing Frameworks<br><br><ul>
<li>
<a rel="nofollow" href="https://docs.python.org/2/library/unittest.html">unittest</a> - (Python standard library) Unit testing framework.</li>
<li>
<a rel="nofollow" href="https://nose.readthedocs.org/">nose</a> - nose extends unittest.</li>
<li>
<a rel="nofollow" href="http://pytest.org/">pytest</a> - A mature full-featured Python testing tool.</li>
<li>
<a rel="nofollow" href="https://nestorsalceda.github.io/mamba">mamba</a> - The definitive testing tool for Python. Born under the banner of BDD.</li>
<li>
<a rel="nofollow" href="https://github.com/benjamin-hodgson/Contexts">contexts</a> - A BDD framework for Python 3.3+. Inspired by C#'s <code>Machine.Specifications</code>.</li>
<li>
<a rel="nofollow" href="https://github.com/drslump/pyshould">pyshould</a> - Should style asserts based on <a rel="nofollow" href="https://github.com/hamcrest/PyHamcrest">PyHamcrest</a>.</li>
<li>
<a rel="nofollow" href="http://heynemann.github.io/pyvows/">pyvows</a> - BDD style testing for Python. Inspired by <a rel="nofollow" href="http://vowsjs.org/">Vows.js</a>.</li>
</ul>
</li>
<li>Web Testing<br><br><ul>
<li>
<a rel="nofollow" href="https://pypi.python.org/pypi/selenium">Selenium</a> - Python bindings for <a rel="nofollow" href="http://www.seleniumhq.org/">Selenium</a> WebDriver.</li>
<li>
<a rel="nofollow" href="http://splinter.cobrateam.info/">splinter</a> - Open source tool for testing web applications.</li>
<li>
<a rel="nofollow" href="https://github.com/locustio/locust">locust</a> - Scalable user load testing tool written in Python.</li>
<li>
<a rel="nofollow" href="https://github.com/seatgeek/sixpack">sixpack</a> - A language-agnostic A/B Testing framework.</li>
</ul>
</li>
<li>Mock<br><br><ul>
<li>
<a rel="nofollow" href="https://pypi.python.org/pypi/mock">mock</a> - A Python Mocking and Patching Library for Testing.</li>
<li>
<a rel="nofollow" href="https://github.com/dropbox/responses">responses</a> - A utility library for mocking out the requests Python library.</li>
<li>
<a rel="nofollow" href="https://pypi.python.org/pypi/doublex">doublex</a> - Powerful test doubles framework for Python.</li>
<li>
<a rel="nofollow" href="https://github.com/spulec/freezegun">freezegun</a> - Travel through time by mocking the datetime module.</li>
<li>
<a rel="nofollow" href="http://falcao.it/HTTPretty/">httpretty</a> - HTTP request mock tool for Python.</li>
<li>
<a rel="nofollow" href="https://github.com/patrys/httmock">httmock</a> - A mocking library for requests for Python 2.6+ and 3.2+.</li>
</ul>
</li>
<li>Code Coverage<br><br><ul>
<li>
<a rel="nofollow" href="https://pypi.python.org/pypi/coverage">coverage</a> - Code coverage measurement.</li>
</ul>
</li>
<li>Fake Data<br><br><ul>
<li>
<a rel="nofollow" href="http://www.joke2k.net/faker/">faker</a> - A Python package that generates fake data.</li>
<li>
<a rel="nofollow" href="https://github.com/emirozer/fake2db">fake2db</a> - Fake database generator.</li>
<li>
<a rel="nofollow" href="https://mixer.readthedocs.org">mixer</a> - Generating fake data and creating random fixtures for testing in Django ORM, SQLAlchemy, Peewee, MongoEngine, Pony ORM, etc.</li>
<li>
<a rel="nofollow" href="https://model-mommy.readthedocs.org/">model_mommy</a> - Creating random fixtures for testing in Django.</li>
<li>
<a rel="nofollow" href="https://pypi.python.org/pypi/ForgeryPy">ForgeryPy</a> - An easy to use forged data generator for Python. It's a port of <a rel="nofollow" href="http://rubygems.org/gems/forgery">forgery</a>.</li>
<li>
<a rel="nofollow" href="https://pypi.python.org/pypi/radar">radar</a> - Generate random datetime / time.</li>
</ul>
</li>
<li>Error Handler<br><br><ul>
<li>
<a rel="nofollow" href="https://github.com/ajalt/fuckitpy">FuckIt.py</a> - FuckIt.py uses state-of-the-art technology to make sure your Python code runs whether it has any right to or not.</li>
</ul>
</li>
</ul>
<h3>Code Analysis and Linter</h3>
<p><em>Libraries and tools for analysing, parsing and manipulating codebases.</em></p>
<ul>
<li>Code Analysis<br><br><ul>
<li>
<a rel="nofollow" href="https://github.com/yinwang0/pysonar2">pysonar2</a> - A type inferencer and indexer for Python.</li>
<li>
<a rel="nofollow" href="https://github.com/gak/pycallgraph">pycallgraph</a> - A library that visualises the flow (call graph) of your Python application.</li>
<li>
<a rel="nofollow" href="https://github.com/scottrogowski/code2flow">code2flow</a> - Turn your Python and JavaScript code into DOT flowcharts.</li>
</ul>
</li>
<li>Linter<br><br><ul>
<li>
<a rel="nofollow" href="https://pypi.python.org/pypi/flake8">Flake8</a> - The modular source code checker: pep8, pyflakes and co.</li>
<li>
<a rel="nofollow" href="https://pylama.readthedocs.org/">pylama</a> - Code audit tool for Python and JavaScript.</li>
<li>
<a rel="nofollow" href="http://www.pylint.org/">Pylint</a> - A source code analyzer.</li>
</ul>
</li>
</ul>
<h3>Debugging Tools</h3>
<p><em>Libraries for debugging code.</em></p>
<ul>
<li>
<a rel="nofollow" href="https://docs.python.org/2/library/pdb.html">pdb</a> - (Python standard library) The Python Debugger.</li>
<li>
<a rel="nofollow" href="https://pypi.python.org/pypi/ipdb">ipdb</a> - IPython-enabled pdb.</li>
<li>
<a rel="nofollow" href="http://winpdb.org/">winpdb</a> - A Platform Independent Python Debugger with GUI.</li>
<li>
<a rel="nofollow" href="https://pypi.python.org/pypi/pudb">pudb</a> - A full-screen, console-based Python debugger.</li>
<li>
<a rel="nofollow" href="https://github.com/google/pyringe">pyringe</a> - Debugger capable of attaching to and injecting code into Python processes.</li>
<li>
<a rel="nofollow" href="https://github.com/WoLpH/python-statsd">python-statsd</a> - Python Client for the <a rel="nofollow" href="https://github.com/etsy/statsd/">statsd</a> server.</li>
<li>
<a rel="nofollow" href="https://github.com/fabianp/memory_profiler">memory_profiler</a> - Monitor Memory usage of Python code.</li>
<li>
<a rel="nofollow" href="https://github.com/what-studio/profiling">profiling</a> - An interactive Python profiler.</li>
<li>
<a rel="nofollow" href="https://github.com/django-debug-toolbar/django-debug-toolbar">django-debug-toolbar</a> - Display various debug information about the current request/response.</li>
<li>
<a rel="nofollow" href="https://github.com/dcramer/django-devserver">django-devserver</a> - A drop-in replacement for Django's runserver.</li>
<li>
<a rel="nofollow" href="https://github.com/mgood/flask-debugtoolbar">flask-debugtoolbar</a> - A port of the django-debug-toolbar to Flask.</li>
<li>
<a rel="nofollow" href="https://github.com/eliben/pyelftools">pyelftools</a> - A pure-Python library for parsing and analyzing ELF files and DWARF debugging information.</li>
</ul>
<h3>Science and Data Analysis</h3>
<p><em>Libraries for scientific computing and data analysis.</em></p>
<ul>
<li>
<a rel="nofollow" href="http://www.scipy.org/">SciPy</a> - A Python-based ecosystem of open-source software for mathematics, science, and engineering.</li>
<li>
<a rel="nofollow" href="http://www.numpy.org/">NumPy</a> - A fundamental package for scientific computing with Python.</li>
<li>
<a rel="nofollow" href="http://numba.pydata.org/">Numba</a> - Python JIT (just in time) compiler to LLVM aimed at scientific Python by the developers of Cython and NumPy.</li>
<li>
<a rel="nofollow" href="https://networkx.github.io/">NetworkX</a> - High-productivity software for complex networks.</li>
<li>
<a rel="nofollow" href="http://pandas.pydata.org/">Pandas</a> - A library providing high-performance, easy-to-use data structures and data analysis tools.</li>
<li>
<a rel="nofollow" href="https://github.com/avelino/mining">Open Mining</a> - Business Intelligence (BI) in Python (Pandas web interface).</li>
<li>
<a rel="nofollow" href="https://github.com/pymc-devs/pymc">PyMC</a> - Markov Chain Monte Carlo sampling toolkit.</li>
<li>
<a rel="nofollow" href="https://github.com/quantopian/zipline">zipline</a> - A Pythonic algorithmic trading library.</li>
<li>
<a rel="nofollow" href="https://pydy.org/">PyDy</a> - Short for Python Dynamics, used to assist with workflow in the modeling of dynamic motion based around NumPy, SciPy, IPython, and matplotlib.</li>
<li>
<a rel="nofollow" href="https://github.com/sympy/sympy">SymPy</a> - A Python library for symbolic mathematics.</li>
<li>
<a rel="nofollow" href="https://github.com/statsmodels/statsmodels">statsmodels</a> - Statistical modeling and econometrics in Python.</li>
<li>
<a rel="nofollow" href="http://www.astropy.org/">astropy</a> - A community Python library for Astronomy.</li>
<li>
<a rel="nofollow" href="http://orange.biolab.si/">orange</a> - Data mining, data visualization, analysis and machine learning through visual programming or Python scripting.</li>
<li>
<a rel="nofollow" href="http://www.rdkit.org/">RDKit</a> - Cheminformatics and Machine Learning Software.</li>
<li>
<a rel="nofollow" href="http://openbabel.org/wiki/Main_Page">Open Babel</a> - A chemical toolbox designed to speak the many languages of chemical data.</li>
<li>
<a rel="nofollow" href="http://cclib.github.io/">cclib</a> - A library for parsing and interpreting the results of computational chemistry packages.</li>
<li>
<a rel="nofollow" href="http://biopython.org/wiki/Main_Page">Biopython</a> - Biopython is a set of freely available tools for biological computation.</li>
<li>
<a rel="nofollow" href="https://github.com/chapmanb/bcbb">bcbb</a> - Collection of useful code related to biological analysis.</li>
<li>
<a rel="nofollow" href="https://github.com/chapmanb/bcbio-nextgen">bcbio-nextgen</a> - A toolkit providing best-practice pipelines for fully automated high throughput sequencing analysis.</li>
<li>
<a rel="nofollow" href="http://blaze.pydata.org/docs/latest/index.html">blaze</a> - NumPy and Pandas interface to Big Data.</li>
</ul>
<h3>Data Visualization</h3>
<p><em>Libraries for visualizing data. See: <a rel="nofollow" href="https://github.com/sorrycc/awesome-javascript#data-visualization">awesome-javascript</a>.</em></p>
<ul>
<li>
<a rel="nofollow" href="http://matplotlib.org/">matplotlib</a> - A Python 2D plotting library.</li>
<li>
<a rel="nofollow" href="https://github.com/ContinuumIO/bokeh">bokeh</a> - Interactive Web Plotting for Python.</li>
<li>
<a rel="nofollow" href="https://plot.ly/python">plotly</a> - Collaborative web plotting for Python and matplotlib.</li>
<li>
<a rel="nofollow" href="https://github.com/wrobstory/vincent">vincent</a> - A Python to Vega translator.</li>
<li>
<a rel="nofollow" href="https://github.com/mikedewar/d3py">d3py</a> - A plotting library for Python, based on <a rel="nofollow" href="http://d3js.org/">D3.js</a>.</li>
<li>
<a rel="nofollow" href="https://github.com/yhat/ggplot">ggplot</a> - Same API as ggplot2 for R.</li>
<li>
<a rel="nofollow" href="https://github.com/kartograph/kartograph.py">Kartograph.py</a> - Rendering beautiful SVG maps in Python.</li>
<li>
<a rel="nofollow" href="http://pygal.org/">pygal</a> - A Python SVG Charts Creator.</li>
<li>
<a rel="nofollow" href="https://pypi.python.org/pypi/pygraphviz">pygraphviz</a> - Python interface to <a rel="nofollow" href="http://www.graphviz.org/">Graphviz</a>.</li>
<li>
<a rel="nofollow" href="http://www.pyqtgraph.org/">PyQtGraph</a> - Interactive and realtime 2D/3D/Image plotting and science/engineering widgets.</li>
</ul>
<h3>Computer Vision</h3>
<p><em>Libraries for computer vision.</em></p>
<ul>
<li>
<a rel="nofollow" href="http://opencv.org/">OpenCV</a> - Open Source Computer Vision Library.</li>
<li>
<a rel="nofollow" href="http://simplecv.org/">SimpleCV</a> - An open source framework for building computer vision applications.</li>
</ul>
<h3>Machine Learning</h3>
<p><em>Libraries for Machine Learning. See: <a rel="nofollow" href="https://github.com/josephmisiti/awesome-machine-learning#python">awesome-machine-learning</a>.</em></p>
<ul>
<li>
<a rel="nofollow" href="http://scikit-learn.org/">scikit-learn</a> - A Python module for machine learning built on top of SciPy.</li>
<li>
<a rel="nofollow" href="https://github.com/clips/pattern">pattern</a> - Web mining module for Python.</li>
<li>
<a rel="nofollow" href="https://github.com/numenta/nupic">NuPIC</a> - Numenta Platform for Intelligent Computing.</li>
<li>
<a rel="nofollow" href="https://github.com/lisa-lab/pylearn2">Pylearn2</a> - A Machine Learning library based on <a rel="nofollow" href="https://github.com/Theano/Theano">Theano</a>.</li>
<li>
<a rel="nofollow" href="https://github.com/hannes-brt/hebel">hebel</a> - GPU-Accelerated Deep Learning Library in Python.</li>
<li>
<a rel="nofollow" href="https://github.com/piskvorky/gensim">gensim</a> - Topic Modelling for Humans.</li>
<li>
<a rel="nofollow" href="https://github.com/pybrain/pybrain">PyBrain</a> - Another Python Machine Learning Library.</li>
<li>
<a rel="nofollow" href="https://github.com/muricoca/crab">Crab</a> - A flexible, fast recommender engine.</li>
<li>
<a rel="nofollow" href="https://github.com/ocelma/python-recsys">python-recsys</a> - A Python library for implementing a Recommender System.</li>
<li>
<a rel="nofollow" href="https://github.com/josephreisinger/vowpal_porpoise">vowpal_porpoise</a> - A lightweight Python wrapper for <a rel="nofollow" href="https://github.com/JohnLangford/vowpal_wabbit/">Vowpal Wabbit</a>.</li>
</ul>
<h3>MapReduce</h3>
<p><em>Frameworks and libraries for MapReduce.</em></p>
<ul>
<li>
<a rel="nofollow" href="http://spark.apache.org/docs/latest/programming-guide.html">PySpark</a> - The Spark Python API.</li>
<li>
<a rel="nofollow" href="https://github.com/douban/dpark">dpark</a> - Python clone of Spark, a MapReduce-like framework in Python.</li>
<li>
<a rel="nofollow" href="https://github.com/spotify/luigi">luigi</a> - A module that helps you build complex pipelines of batch jobs.</li>
<li>
<a rel="nofollow" href="https://github.com/Yelp/mrjob">mrjob</a> - Run MapReduce jobs on Hadoop or Amazon Web Services.</li>
<li>
<a rel="nofollow" href="https://github.com/klbostee/dumbo">dumbo</a> - Python module that allows one to easily write and run Hadoop programs.</li>
<li>
<a rel="nofollow" href="https://github.com/Parsely/streamparse">streamparse</a> - Run Python code against real-time streams of data. Integrates with <a rel="nofollow" href="https://storm.incubator.apache.org/">Apache Storm</a>.</li>
</ul>
<h3>Functional Programming</h3>
<p><em>Functional Programming with Python.</em></p>
<ul>
<li>
<a rel="nofollow" href="https://github.com/kachayev/fn.py">fn.py</a> - Functional programming in Python: implementation of missing features to enjoy FP.</li>
<li>
<a rel="nofollow" href="https://github.com/Suor/funcy">funcy</a> - Fancy and practical functional tools.</li>
<li>
<a rel="nofollow" href="https://github.com/pytoolz/toolz">Toolz</a> - A collection of functional utilities for iterators, functions, and dictionaries.</li>
<li>
<a rel="nofollow" href="https://github.com/pytoolz/cytoolz/">CyToolz</a> - Cython implementation of Toolz: High performance functional utilities.</li>
</ul>
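<p>To give a feel for the kind of helpers these libraries provide, here is a minimal sketch built only on the standard library. The names <code>compose</code> and <code>take</code> are illustrative assumptions and are not claimed to match the API of any library listed above.</p>

```python
from functools import reduce
from itertools import count, islice


def compose(*funcs):
    """Compose functions right to left: compose(f, g)(x) == f(g(x))."""
    # Hypothetical helper for illustration; starts from the identity function.
    return reduce(lambda f, g: lambda x: f(g(x)), funcs, lambda x: x)


def take(n, iterable):
    """Return the first n items of a (possibly infinite) iterable as a list."""
    return list(islice(iterable, n))


def add_one(x):
    return x + 1


def double(x):
    return x * 2


# Apply add_one first, then double.
add_then_double = compose(double, add_one)

# Lazily double an infinite counter and keep only the first three values.
first_evens = take(3, map(double, count()))
```

<p>The listed libraries typically ship richer, better-tested versions of such utilities, so a sketch like this is mainly useful for understanding the idea.</p>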
<h3>Third-party APIs</h3>
<p><em>Libraries for accessing third-party service APIs. See: <a rel="nofollow" href="https://github.com/realpython/list-of-python-api-wrappers">List of Python API Wrappers and Libraries</a>.</em></p>
<ul>
<li>
<a rel="nofollow" href="https://libcloud.apache.org/">apache-libcloud</a> - One Python library for all clouds.</li>
<li>
<a rel="nofollow" href="https://github.com/boto/boto">boto</a> - Python interface to Amazon Web Services.</li>
<li>
<a rel="nofollow" href="https://github.com/ryanmcgrath/twython">twython</a> - A Python wrapper for the Twitter API.</li>
<li>
<a rel="nofollow" href="https://github.com/google/google-api-python-client">google-api-python-client</a> - Google APIs Client Library for Python.</li>
<li>
<a rel="nofollow" href="https://github.com/burnash/gspread">gspread</a> - Google Spreadsheets Python API.</li>
<li>
<a rel="nofollow" href="https://github.com/pythonforfacebook/facebook-sdk">facebook-sdk</a> - Facebook Platform Python SDK.</li>
<li>
<a rel="nofollow" href="https://github.com/jgorset/facepy">facepy</a> - Facepy makes it really easy to interact with Facebook's Graph API.</li>
<li>
<a rel="nofollow" href="https://github.com/charlierguo/gmail">gmail</a> - A Pythonic interface for Gmail.</li>
<li>
<a rel="nofollow" href="https://github.com/sunlightlabs/django-wordpress/">django-wordpress</a> - WordPress models and views for Django.</li>
</ul>
<h3>DevOps Tools</h3>
<p><em>Software and libraries for DevOps.</em></p>
<ul>
<li>
<a rel="nofollow" href="http://www.openstack.org/">OpenStack</a> - Open source software for building private and public clouds.</li>
<li>
<a rel="nofollow" href="https://github.com/ansible/ansible">Ansible</a> - A radically simple IT automation platform.</li>
<li>
<a rel="nofollow" href="https://github.com/saltstack/salt">SaltStack</a> - Infrastructure automation and management system.</li>
<li>
<a rel="nofollow" href="http://www.fabfile.org/">Fabric</a> - A simple, Pythonic tool for remote execution and deployment.</li>
<li>
<a rel="nofollow" href="https://github.com/ronnix/fabtools">Fabtools</a> - Tools for writing awesome Fabric files.</li>
<li>
<a rel="nofollow" href="https://github.com/sebastien/cuisine">cuisine</a> - Chef-like functionality for Fabric.</li>
<li>
<a rel="nofollow" href="https://github.com/giampaolo/psutil">psutil</a> - A cross-platform process and system utilities module.</li>
<li>
<a rel="nofollow" href="https://github.com/pexpect/pexpect">pexpect</a> - Controlling interactive programs in a pseudo-terminal like GNU expect.</li>
<li>
<a rel="nofollow" href="https://github.com/python-provy/provy">provy</a> - An easy-to-use provisioning system in Python.</li>
<li>
<a rel="nofollow" href="https://github.com/nickstenning/honcho">honcho</a> - A Python port of <a rel="nofollow" href="https://github.com/ddollar/foreman">Foreman</a>, a tool for managing Procfile-based applications.</li>
<li>
<a rel="nofollow" href="https://github.com/gunnery/gunnery">gunnery</a> - Multipurpose task execution tool for distributed systems with web-based interface.</li>
<li>
<a rel="nofollow" href="http://www.fig.sh/">fig</a> - Fast, isolated development environments using <a rel="nofollow" href="https://www.docker.com/">Docker</a>.</li>
<li>
<a rel="nofollow" href="http://bitbucket.org/haard/hgapi">hgapi</a> - Pure-Python API for Mercurial.</li>
<li>
<a rel="nofollow" href="http://bitbucket.org/haard/gitapi">gitapi</a> - Pure-Python API for git.</li>
</ul>
<h3>Job Scheduler</h3>
<p><em>Libraries for scheduling jobs.</em></p>
<ul>
<li>
<a rel="nofollow" href="http://apscheduler.readthedocs.org/">APScheduler</a> - A light but powerful in-process task scheduler that lets you schedule functions.</li>
<li>
<a rel="nofollow" href="https://github.com/thauber/django-schedule">django-schedule</a> - A calendaring app for Django.</li>
<li>
<a rel="nofollow" href="http://pydoit.org/">doit</a> - A task runner/build tool.</li>
<li>
<a rel="nofollow" href="http://pythonhosted.org/joblib/index.html">Joblib</a> - A set of tools to provide lightweight pipelining in Python.</li>
<li>
<a rel="nofollow" href="https://github.com/fengsp/plan">Plan</a> - Write crontab files in Python like a charm.</li>
<li>
<a rel="nofollow" href="https://github.com/knipknap/SpiffWorkflow">Spiff</a> - A powerful workflow engine implemented in pure Python.</li>
<li>
<a rel="nofollow" href="https://github.com/dbader/schedule">schedule</a> - Python job scheduling for humans.</li>
<li>
<a rel="nofollow" href="http://docs.openstack.org/developer/taskflow/">TaskFlow</a> - A Python library that helps to make task execution easy, consistent and reliable.</li>
</ul>
<h3>Foreign Function Interface</h3>
<p><em>Libraries for providing foreign function interface.</em></p>
<ul>
<li>
<a rel="nofollow" href="https://docs.python.org/2/library/ctypes.html">ctypes</a> - (Python standard library) Foreign Function Interface for Python calling C code.</li>
<li>
<a rel="nofollow" href="https://pypi.python.org/pypi/cffi">cffi</a> - Foreign Function Interface for Python calling C code.</li>
<li>
<a rel="nofollow" href="http://www.swig.org/Doc1.3/Python.html">SWIG</a> - Simplified Wrapper and Interface Generator.</li>
<li>
<a rel="nofollow" href="http://mathema.tician.de/software/pycuda/">PyCUDA</a> - A Python wrapper for Nvidia's CUDA API.</li>
</ul>
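<p>As a rough illustration of what calling C code through <code>ctypes</code> looks like, the sketch below calls the C library's <code>strlen</code>. It assumes a POSIX system (Linux or macOS) where the C library can be loaded this way; the wrapper name <code>c_strlen</code> is hypothetical.</p>

```python
import ctypes
import ctypes.util

# Assumption: POSIX system. find_library may return None in minimal
# environments; CDLL(None) then exposes symbols already loaded in the process.
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# Declare the C signature: size_t strlen(const char *s);
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t


def c_strlen(data):
    """Return the length of a bytes object as computed by C's strlen."""
    return libc.strlen(data)
```

<p>Declaring <code>argtypes</code> and <code>restype</code> explicitly is usually worth the extra lines, since ctypes otherwise assumes <code>int</code> for return values.</p>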
<h3>High Performance</h3>
<p><em>Libraries for making Python faster.</em></p>
<ul>
<li>
<a rel="nofollow" href="http://cython.org/">Cython</a> - Optimizing Static Compiler for Python. Uses optional static type declarations to compile Python into C or C++ modules, resulting in large performance gains.</li>
<li>
<a rel="nofollow" href="http://pypy.org/">PyPy</a> - An implementation of Python in Python. The interpreter uses a just-in-time (JIT) compiler to make Python very fast without requiring additional type information.</li>
<li>
<a rel="nofollow" href="http://www.stackless.com/">Stackless Python</a> - An enhanced version of the Python programming language.</li>
<li>
<a rel="nofollow" href="https://github.com/dropbox/pyston">Pyston</a> - A Python implementation built using LLVM and modern JIT techniques with the goal of achieving good performance.</li>
</ul>
<h3>Microsoft Windows</h3>
<p><em>Python programming on Microsoft Windows.</em></p>
<ul>
<li>
<a rel="nofollow" href="http://www.lfd.uci.edu/~gohlke/pythonlibs/">pythonlibs</a> - Unofficial Windows (32/64-bit) binaries for Python extension packages.</li>
<li>
<a rel="nofollow" href="https://code.google.com/p/pythonxy/">Python(x,y)</a> - Scientific-applications-oriented Python Distribution based on Qt and Spyder.</li>
<li>
<a rel="nofollow" href="https://code.google.com/p/spyderlib/">spyder</a> - IDE for the Python language with advanced editing, interactive testing, debugging and introspection features (also comes with Anaconda).</li>
</ul>
<h3>Network Virtualization and SDN</h3>
<p><em>Tools and libraries for Virtual Networking and SDN (Software Defined Networking).</em></p>
<ul>
<li>
<a rel="nofollow" href="http://mininet.org/">Mininet</a> - A popular network emulator and API written in Python.</li>
<li>
<a rel="nofollow" href="http://www.noxrepo.org/pox/about-pox/">POX</a> - An open source development platform for Python-based Software Defined Networking (SDN) control applications, such as OpenFlow SDN controllers.</li>
<li>
<a rel="nofollow" href="http://frenetic-lang.org/pyretic/">Pyretic</a> - A member of the Frenetic family of SDN programming languages that provides powerful abstractions over network switches or emulators.</li>
<li>
<a rel="nofollow" href="https://github.com/sdn-ixp/internet2award">SDX Platform</a> - SDN based IXP implementation that leverages Mininet, POX and Pyretic.</li>
</ul>
<h3>Hardware</h3>
<p><em>Libraries for programming with hardware.</em></p>
<ul>
<li>
<a rel="nofollow" href="https://github.com/SavinaRoja/PyUserInput">PyUserInput</a> - A module for cross-platform control of the mouse and keyboard.</li>
<li>
<a rel="nofollow" href="https://wifi.readthedocs.org/">wifi</a> - A Python library and command line tool for working with WiFi on Linux.</li>
<li>
<a rel="nofollow" href="http://www.secdev.org/projects/scapy/">scapy</a> - A brilliant packet manipulation library.</li>
<li>
<a rel="nofollow" href="http://inotool.org/">ino</a> - Command line toolkit for working with <a rel="nofollow" href="http://www.arduino.cc/">Arduino</a>.</li>
<li>
<a rel="nofollow" href="http://pyrorobotics.com/">Pyro</a> - Python Robotics.</li>
</ul>
<h3>Compatibility</h3>
<p><em>Libraries for migrating from Python 2 to 3.</em></p>
<ul>
<li>
<a rel="nofollow" href="https://pypi.python.org/pypi/six">Six</a> - Python 2 and 3 compatibility utilities.</li>
<li>
<a rel="nofollow" href="http://python-future.org/index.html">Python-Future</a> - The missing compatibility layer between Python 2 and Python 3.</li>
<li>
<a rel="nofollow" href="https://github.com/mitsuhiko/python-modernize">Python-Modernize</a> - Modernizes Python code for eventual Python 3 migration.</li>
</ul>
<h3>Miscellaneous</h3>
<p><em>Useful libraries or tools that don't fit in the categories above.</em></p>
<ul>
<li>
<a rel="nofollow" href="https://github.com/mitsuhiko/pluginbase">pluginbase</a> - A simple but flexible plugin system for Python.</li>
<li>
<a rel="nofollow" href="https://github.com/mitsuhiko/itsdangerous">itsdangerous</a> - Various helpers to pass trusted data to untrusted environments.</li>
<li>
<a rel="nofollow" href="https://github.com/jek/blinker">blinker</a> - A fast Python in-process signal/event dispatching system.</li>
<li>
<a rel="nofollow" href="https://github.com/PacketPerception/pychievements">Pychievements</a> - A framework for creating and tracking achievements.</li>
</ul>
<h3>Algorithms and Design Patterns</h3>
<p><em>Python implementation of algorithms and design patterns.</em></p>
<ul>
<li>
<a rel="nofollow" href="https://github.com/faif/python-patterns">python-patterns</a> - A collection of design patterns in Python.</li>
<li>
<a rel="nofollow" href="https://github.com/nryoung/algorithms">algorithms</a> - A module of algorithms for Python.</li>
</ul>
<h3>Editor Plugins</h3>
<p><em>Plugins for editors and IDEs.</em></p>
<ul>
<li>Vim<br><br><ul>
<li>
<a rel="nofollow" href="https://github.com/klen/python-mode">Python-mode</a> - An all in one plugin for turning Vim into a Python IDE.</li>
<li>
<a rel="nofollow" href="https://github.com/davidhalter/jedi-vim">Jedi-vim</a> - Vim bindings for the <a rel="nofollow" href="https://github.com/davidhalter/jedi">Jedi</a> autocompletion library for Python.</li>
<li>
<a rel="nofollow" href="https://github.com/Valloric/YouCompleteMe">YouCompleteMe</a> - Includes a <a rel="nofollow" href="https://github.com/davidhalter/jedi">Jedi</a>-based completion engine for Python.</li>
</ul>
</li>
<li>Emacs<br><br><ul>
<li>
<a rel="nofollow" href="https://github.com/jorgenschaefer/elpy">Elpy</a> - Emacs Python Development Environment.</li>
</ul>
</li>
<li>Sublime Text<br><br><ul>
<li>
<a rel="nofollow" href="https://github.com/srusskih/SublimeJEDI">SublimeJEDI</a> - A Sublime Text plugin to the awesome autocomplete library <a rel="nofollow" href="https://github.com/davidhalter/jedi">Jedi</a>.</li>
<li>
<a rel="nofollow" href="https://github.com/DamnWidget/anaconda">Anaconda</a> - Anaconda turns your Sublime Text 3 into a full-featured Python development IDE.</li>
</ul>
</li>
<li>Atom<br><br><ul>
<li>
<a rel="nofollow" href="https://github.com/AtomLinter/Linter">Linter</a> - A static code analysis tool for Atom.</li>
<li>
<a rel="nofollow" href="https://github.com/AtomLinter/linter-flake8">Linter-flake8</a> - An addon to <code>linter</code> that acts as an interface for <code>flake8</code>.</li>
<li>
<a rel="nofollow" href="https://github.com/jhutchins/virtualenv">virtualenv</a> - Atom package for virtualenv management.</li>
</ul>
</li>
</ul>
<h2>Resources</h2>
<p>Where to discover new Python libraries.</p>
<h3>Websites</h3>
<ul>
<li>
<a rel="nofollow" href="http://www.reddit.com/r/python">r/Python</a> - News about Python.</li>
<li>
<a rel="nofollow" href="http://python3wos.appspot.com/">Python 3 Wall of Superpowers</a> - Too many popular Python packages don't support Python 3.</li>
<li>
<a rel="nofollow" href="https://github.com/trending?l=python">Trending Python repositories on GitHub today</a> - Good place to find new Python libraries.</li>
<li>
<a rel="nofollow" href="http://pythonhackers.com/open-source/">Python Hackers</a> - List of top 400 projects in GitHub.</li>
<li>
<a rel="nofollow" href="http://coolgithubprojects.com/">CoolGithubProjects</a> - Sharing cool github projects just got easier!</li>
<li>
<a rel="nofollow" href="http://www.fullstackpython.com/">Full Stack Python</a> - Plain English explanations for every layer of the Python web application stack.</li>
<li>
<a rel="nofollow" href="https://www.djangopackages.com/">Django Packages</a> - A directory of reusable apps, sites, tools, and more for Django projects.</li>
</ul>
<h3>Weekly</h3>
<ul>
<li><a rel="nofollow" href="http://pycoders.com/">Pycoder's Weekly</a></li>
<li><a rel="nofollow" href="http://www.pythonweekly.com/">Python Weekly</a></li>
<li><a rel="nofollow" href="http://importpython.com/newsletter/">Import Python Newsletter</a></li>
</ul>
<h3>Twitter</h3>
<ul>
<li><a rel="nofollow" href="https://twitter.com/pypi">@pypi</a></li>
<li><a rel="nofollow" href="https://twitter.com/planetpython">@planetpython</a></li>
<li><a rel="nofollow" href="https://twitter.com/getpy">@getpy</a></li>
<li><a rel="nofollow" href="https://twitter.com/pycoders">@pycoders</a></li>
<li><a rel="nofollow" href="https://twitter.com/PythonWeekly">@PythonWeekly</a></li>
<li><a rel="nofollow" href="https://twitter.com/pythontrending">@pythontrending</a></li>
</ul>
<h2>Other Awesome Lists</h2>
<p>List of lists.</p>
<ul>
<li>Python<br><br><ul>
<li><a rel="nofollow" href="https://github.com/kirang89/pycrumbs/blob/master/pycrumbs.md">pycrumbs</a></li>
<li><a rel="nofollow" href="https://github.com/svaksha/pythonidae">pythonidae</a></li>
<li><a rel="nofollow" href="https://github.com/checkcheckzz/python-github-projects">python-github-projects</a></li>
<li><a rel="nofollow" href="https://github.com/rasbt/python_reference">python_reference</a></li>
<li><a rel="nofollow" href="http://easy-python.readthedocs.org/">easy-python</a></li>
</ul>
</li>
<li>Monty<br><br><ul>
<li><a rel="nofollow" href="https://github.com/bayandin/awesome-awesomeness">awesome-awesomeness</a></li>
<li><a rel="nofollow" href="https://github.com/jnv/lists">lists</a></li>
</ul>
</li>
</ul>
<h2><a rel="nofollow" href="https://github.com/vinta/awesome-python/blob/master/CONTRIBUTING.md">Contributing</a></h2>
Awesome Python
https://segmentfault.com/a/1190000002517890
2015-01-28T09:09:47+08:00
2015-01-28T09:09:47+08:00
timger
https://segmentfault.com/u/timger
3
<h2>Awesome Python</h2>
<p>A curated list of awesome Python frameworks, libraries and software. Inspired by <a rel="nofollow" href="https://github.com/ziadoz/awesome-php">awesome-php</a>.</p>
<ul>
<li>
<a rel="nofollow">Awesome Python</a><br><br><ul>
<li><a rel="nofollow">Environment Management</a></li>
<li><a rel="nofollow">Package Management</a></li>
<li><a rel="nofollow">Package Repositories</a></li>
<li><a rel="nofollow">Distribution</a></li>
<li><a rel="nofollow">Build Tools</a></li>
<li><a rel="nofollow">Interactive Interpreter</a></li>
<li><a rel="nofollow">Files</a></li>
<li><a rel="nofollow">Date and Time</a></li>
<li><a rel="nofollow">Text Processing</a></li>
<li><a rel="nofollow">Specific Formats Processing</a></li>
<li><a rel="nofollow">Natural Language Processing</a></li>
<li><a rel="nofollow">Documentation</a></li>
<li><a rel="nofollow">Configuration</a></li>
<li><a rel="nofollow">Command-line Tools</a></li>
<li><a rel="nofollow">Downloader</a></li>
<li><a rel="nofollow">Imagery</a></li>
<li><a rel="nofollow">OCR</a></li>
<li><a rel="nofollow">Audio</a></li>
<li><a rel="nofollow">Video</a></li>
<li><a rel="nofollow">Geolocation</a></li>
<li><a rel="nofollow">HTTP</a></li>
<li><a rel="nofollow">Database</a></li>
<li><a rel="nofollow">Database Drivers</a></li>
<li><a rel="nofollow">ORM</a></li>
<li><a rel="nofollow">Web Frameworks</a></li>
<li><a rel="nofollow">Permissions</a></li>
<li><a rel="nofollow">CMS</a></li>
<li><a rel="nofollow">E-commerce</a></li>
<li><a rel="nofollow">RESTful API</a></li>
<li><a rel="nofollow">Authentication</a></li>
<li><a rel="nofollow">Template Engine</a></li>
<li><a rel="nofollow">Queue</a></li>
<li><a rel="nofollow">Search</a></li>
<li><a rel="nofollow">News Feed</a></li>
<li><a rel="nofollow">Asset Management</a></li>
<li><a rel="nofollow">Caching</a></li>
<li><a rel="nofollow">Email</a></li>
<li><a rel="nofollow">Internationalization</a></li>
<li><a rel="nofollow">URL Manipulation</a></li>
<li><a rel="nofollow">HTML Manipulation</a></li>
<li><a rel="nofollow">Web Crawling</a></li>
<li><a rel="nofollow">Web Content Extracting</a></li>
<li><a rel="nofollow">Forms</a></li>
<li><a rel="nofollow">Data Validation</a></li>
<li><a rel="nofollow">Anti-spam</a></li>
<li><a rel="nofollow">Tagging</a></li>
<li><a rel="nofollow">Admin Panels</a></li>
<li><a rel="nofollow">Static Site Generator</a></li>
<li><a rel="nofollow">Processes and Threads</a></li>
<li><a rel="nofollow">Concurrency and Networking</a></li>
<li><a rel="nofollow">WebSocket</a></li>
<li><a rel="nofollow">WSGI Servers</a></li>
<li><a rel="nofollow">RPC Servers</a></li>
<li><a rel="nofollow">Cryptography</a></li>
<li><a rel="nofollow">GUI</a></li>
<li><a rel="nofollow">Game Development</a></li>
<li><a rel="nofollow">Logging</a></li>
<li><a rel="nofollow">Testing</a></li>
<li><a rel="nofollow">Code Analysis and Linter</a></li>
<li><a rel="nofollow">Debugging Tools</a></li>
<li><a rel="nofollow">Science and Data Analysis</a></li>
<li><a rel="nofollow">Data Visualization</a></li>
<li><a rel="nofollow">Computer Vision</a></li>
<li><a rel="nofollow">Machine Learning</a></li>
<li><a rel="nofollow">Functional Programming</a></li>
<li><a rel="nofollow">MapReduce</a></li>
<li><a rel="nofollow">Third-party APIs</a></li>
<li><a rel="nofollow">DevOps Tools</a></li>
<li><a rel="nofollow">Job Scheduler</a></li>
<li><a rel="nofollow">Foreign Function Interface</a></li>
<li><a rel="nofollow">High Performance</a></li>
<li><a rel="nofollow">Network Virtualization and SDN</a></li>
<li><a rel="nofollow">Hardware</a></li>
<li><a rel="nofollow">Compatibility</a></li>
<li><a rel="nofollow">Miscellaneous</a></li>
<li><a rel="nofollow">Algorithms and Design Patterns</a></li>
<li><a rel="nofollow">Editor Plugins</a></li>
</ul>
</li>
<li>
<a rel="nofollow">Resources</a><br><br><ul>
<li><a rel="nofollow">Websites</a></li>
<li><a rel="nofollow">Weekly</a></li>
<li><a rel="nofollow">Twitter</a></li>
</ul>
</li>
<li><a rel="nofollow">Other Awesome Lists</a></li>
<li><a rel="nofollow">Contributing</a></li>
</ul>
<hr>
<h3>Environment Management</h3>
<p><em>Libraries for Python version and environment management.</em></p>
<ul>
<li>
<a rel="nofollow" href="https://github.com/yyuu/pyenv">pyenv</a> - Simple Python version management.</li>
<li>
<a rel="nofollow" href="https://pypi.python.org/pypi/virtualenv">virtualenv</a> - A tool to create isolated Python environments.</li>
<li>
<a rel="nofollow" href="https://pypi.python.org/pypi/virtualenvwrapper">virtualenvwrapper</a> - A set of extensions to virtualenv.</li>
<li>
<a rel="nofollow" href="https://github.com/sjkingo/virtualenv-api">virtualenv-api</a> - An API for virtualenv and pip.</li>
<li>
<a rel="nofollow" href="https://pypi.python.org/pypi/pew/">pew</a> - A set of tools to manage multiple virtual environments.</li>
<li>
<a rel="nofollow" href="https://github.com/sashahart/vex">Vex</a> - Run a command in the named virtualenv.</li>
<li>
<a rel="nofollow" href="https://www.egenix.com/products/python/PyRun/">PyRun</a> - A one-file, no-installation-needed version of Python.</li>
</ul>
<h3>Package Management</h3>
<p><em>Libraries for package and dependency management.</em></p>
<ul>
<li>
<a rel="nofollow" href="https://pip.pypa.io/">pip</a> - The Python package and dependency manager.<br><br><ul>
<li><a rel="nofollow" href="https://pypi.python.org/pypi">Python Package Index</a></li>
</ul>
</li>
<li>
<a rel="nofollow" href="https://github.com/conda/conda/">conda</a> - Cross-platform, Python-agnostic binary package manager.</li>
<li>
<a rel="nofollow" href="http://clarete.li/curdling/">Curdling</a> - Curdling is a command line tool for managing Python packages.</li>
<li>
<a rel="nofollow" href="http://pythonwheels.com/">wheel</a> - The new standard of Python distribution, intended to replace eggs.</li>
</ul>
<h3>Package Repositories</h3>
<p><em>Local PyPI repository server and proxies.</em></p>
<ul>
<li>
<a rel="nofollow" href="https://github.com/pypa/warehouse">warehouse</a> - Next generation Python Package Repository (PyPI).<br><br><ul>
<li><a rel="nofollow" href="https://warehouse.python.org/">Warehouse</a></li>
</ul>
</li>
<li>
<a rel="nofollow" href="http://doc.devpi.net/">devpi</a> - PyPI server and packaging/testing/release tool.</li>
<li>
<a rel="nofollow" href="https://github.com/mvantellingen/localshop">localshop</a> - PyPI server which mirrors official packages on-demand, and also supports local (private) package uploads.</li>
</ul>
<h3>Distribution</h3>
<p><em>Libraries to create packaged executables for release distribution.</em></p>
<ul>
<li>
<a rel="nofollow" href="http://cx-freeze.readthedocs.org/">cx-Freeze</a> - Freezes Python scripts (cross-platform).</li>
<li>
<a rel="nofollow" href="http://www.py2exe.org/">py2exe</a> - Freezes Python scripts (Windows).</li>
<li>
<a rel="nofollow" href="http://pynsist.readthedocs.org/">pynsist</a> - A tool to build Windows installers that bundle Python itself.</li>
<li>
<a rel="nofollow" href="http://pythonhosted.org/py2app/">py2app</a> - Freezes Python scripts (Mac OS X).</li>
<li>
<a rel="nofollow" href="http://www.pyinstaller.org/">PyInstaller</a> - Converts Python programs into stand-alone executables (cross-platform).</li>
<li>
<a rel="nofollow" href="http://dh-virtualenv.readthedocs.org/">dh-virtualenv</a> - Build and distribute a virtualenv as a Debian package.</li>
<li>
<a rel="nofollow" href="http://nuitka.net/">Nuitka</a> - Compile scripts, modules, packages to an executable or extension module.</li>
</ul>
<h3>Build Tools</h3>
<p><em>Compile software from source code.</em></p>
<ul>
<li>
<a rel="nofollow" href="http://www.buildout.org/">buildout</a> - A build system for creating, assembling and deploying applications from multiple parts, some of which may be non-Python-based.</li>
<li>
<a rel="nofollow" href="http://www.scons.org/">SCons</a> - A software construction tool.</li>
<li>
<a rel="nofollow" href="https://github.com/ivankravets/platformio">PlatformIO</a> - A console tool to build code with different development platforms.</li>
<li>
<a rel="nofollow" href="http://www.yoctoproject.org/docs/1.6/bitbake-user-manual/bitbake-user-manual.html">BitBake</a> - A make-like build tool with the special focus of distributions and packages for embedded Linux.</li>
<li>
<a rel="nofollow" href="https://code.google.com/p/fabricate/">fabricate</a> - A build tool that finds dependencies automatically for any language.</li>
</ul>
<h3>Interactive Interpreter</h3>
<p><em>Interactive Python interpreters.</em></p>
<ul>
<li>
<a rel="nofollow" href="https://github.com/ipython/ipython">IPython</a> - A rich toolkit to help you make the most out of using Python interactively.</li>
<li>
<a rel="nofollow" href="http://bpython-interpreter.org">bpython</a> - A fancy interface to the Python interpreter.</li>
<li>
<a rel="nofollow" href="https://github.com/jonathanslenders/python-prompt-toolkit">python-prompt-toolkit</a> - A library for building powerful interactive command lines.</li>
</ul>
<h3>Files</h3>
<p><em>Libraries for file manipulation and MIME type detection.</em></p>
<ul>
<li>
<a rel="nofollow" href="https://docs.python.org/2/library/mimetypes.html">mimetypes</a> - (Python standard library) Map filenames to MIME types.</li>
<li>
<a rel="nofollow" href="https://docs.python.org/2/library/imghdr.html">imghdr</a> - (Python standard library) Determine the type of an image.</li>
<li>
<a rel="nofollow" href="https://github.com/ahupp/python-magic">python-magic</a> - A Python interface to the libmagic file type identification library.</li>
<li>
<a rel="nofollow" href="https://github.com/jaraco/path.py">path.py</a> - A module wrapper for <a rel="nofollow" href="https://docs.python.org/2/library/os.path.html">os.path</a>.</li>
<li>
<a rel="nofollow" href="https://github.com/gorakhargosh/watchdog">watchdog</a> - API and shell utilities to monitor file system events.</li>
<li>
<a rel="nofollow" href="https://github.com/mikeorr/Unipath">Unipath</a> - An object-oriented approach to file/directory operations.</li>
<li>
<a rel="nofollow" href="https://pathlib.readthedocs.org/en/pep428/">pathlib</a> - (Python standard library in Python 3.4+) A cross-platform, object-oriented path library.</li>
</ul>
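<p>The two standard-library entries above can be combined in a few lines. The following is a minimal sketch, assuming Python 3.4+ (for <code>pathlib</code>); the function name <code>describe</code> and the example path are made up for illustration.</p>

```python
import mimetypes
from pathlib import PurePosixPath


def describe(filename):
    """Split a path into parts and guess its MIME type from the file name."""
    # PurePosixPath only manipulates the string; it never touches the disk.
    path = PurePosixPath(filename)
    # guess_type looks at the extension only, not at the file contents.
    mime_type, _encoding = mimetypes.guess_type(path.name)
    return {"stem": path.stem, "suffix": path.suffix, "mime": mime_type}


info = describe("/srv/uploads/report.pdf")
```

<p>Because <code>mimetypes</code> trusts the extension, a content-based check such as python-magic may be a better fit when file names cannot be trusted.</p>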
<h3>Date and Time</h3>
<p><em>Libraries for working with dates and times.</em></p>
<ul>
<li>
<a rel="nofollow" href="https://github.com/crsmithdev/arrow">arrow</a> - Better dates &amp; times for Python.</li>
<li>
<a rel="nofollow" href="https://github.com/KoffeinFlummi/Chronyk">Chronyk</a> - A Python 3 library for parsing human-written times and dates.</li>
<li>
<a rel="nofollow" href="https://pypi.python.org/pypi/python-dateutil">dateutil</a> - Extensions to the standard Python <a rel="nofollow" href="https://docs.python.org/2/library/datetime.html">datetime</a> module.</li>
<li>
<a rel="nofollow" href="https://github.com/myusuf3/delorean/">delorean</a> - A library for clearing up the inconvenient truths that arise dealing with datetimes.</li>
<li>
<a rel="nofollow" href="https://github.com/dirn/When.py">when.py</a> - Providing user-friendly functions to help perform common date and time actions.</li>
<li>
<a rel="nofollow" href="https://github.com/zachwill/moment">moment</a> - A Python library for dealing with dates/times. Inspired by <a rel="nofollow" href="http://momentjs.com/">Moment.js</a>.</li>
<li>
<a rel="nofollow" href="https://launchpad.net/pytz">pytz</a> - World timezone definitions, modern and historical. Brings the <a rel="nofollow" href="http://en.wikipedia.org/wiki/Tz_database">tz database</a> into Python.</li>
</ul>
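<p>For comparison with the libraries above, this is roughly what a time zone conversion looks like with only the standard <code>datetime</code> module (Python 3). The fixed UTC+8 offset and the sample timestamp are assumptions chosen for illustration; real time zones with daylight saving rules need a tz database such as the one pytz provides.</p>

```python
from datetime import datetime, timedelta, timezone

# Assumption: a fixed UTC+8 offset with no daylight saving rules.
UTC_PLUS_8 = timezone(timedelta(hours=8))


def to_utc(aware_dt):
    """Convert a timezone-aware datetime to UTC."""
    return aware_dt.astimezone(timezone.utc)


# Example value only: 09:09:47 on 2015-01-28 at UTC+8.
published = datetime(2015, 1, 28, 9, 9, 47, tzinfo=UTC_PLUS_8)
published_utc = to_utc(published)
```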
<h3>Text Processing</h3>
<p><em>Libraries for parsing and manipulating plain texts.</em></p>
<ul>
<li>General<br><br><ul>
<li>
<a rel="nofollow" href="https://docs.python.org/2/library/difflib.html">difflib</a> - (Python standard library) Helpers for computing deltas.</li>
<li>
<a rel="nofollow" href="https://github.com/ztane/python-Levenshtein/">Levenshtein</a> - Fast computation of Levenshtein distance and string similarity.</li>
<li>
<a rel="nofollow" href="https://github.com/seatgeek/fuzzywuzzy">fuzzywuzzy</a> - Fuzzy String Matching.</li>
<li>
<a rel="nofollow" href="https://code.google.com/p/esmre/">esmre</a> - Regular expression accelerator.</li>
<li>
<a rel="nofollow" href="https://github.com/stochastic-technologies/shortuuid">shortuuid</a> - A generator library for concise, unambiguous and URL-safe UUIDs.</li>
<li>
<a rel="nofollow" href="https://github.com/LuminosoInsight/python-ftfy">ftfy</a> - Makes Unicode text less broken and more consistent automagically.</li>
<li>
<a rel="nofollow" href="https://pypi.python.org/pypi/Unidecode">unidecode</a> - ASCII transliterations of Unicode text.</li>
<li>
<a rel="nofollow" href="https://github.com/chardet/chardet">chardet</a> - Python 2/3 compatible character encoding detector.</li>
<li>
<a rel="nofollow" href="https://github.com/lxneng/xpinyin">xpinyin</a> - A library to translate Chinese hanzi (漢字) to pinyin (拼音).</li>
<li>
<a rel="nofollow" href="https://github.com/vinta/pangu.py">pangu.py</a> - Spacing texts for CJK and alphanumerics.</li>
<li>
<a rel="nofollow" href="https://github.com/pwaller/pyfiglet">pyfiglet</a> - An implementation of figlet written in Python.</li>
<li>
<a rel="nofollow" href="https://github.com/moskytw/uniout">uniout</a> - Print readable chars instead of the escaped string.</li>
</ul>
</li>
<li>Slugify<br><br><ul>
<li>
<a rel="nofollow" href="https://github.com/dimka665/awesome-slugify">awesome-slugify</a> - A Python slugify library that can preserve unicode.</li>
<li>
<a rel="nofollow" href="https://github.com/un33k/python-slugify">python-slugify</a> - A Python slugify library that translates unicode to ASCII.</li>
<li>
<a rel="nofollow" href="https://github.com/mozilla/unicode-slugify">unicode-slugify</a> - A slugifier that generates unicode slugs with Django as a dependency.</li>
</ul>
</li>
<li>Parser<br><br><ul>
<li>
<a rel="nofollow" href="http://www.dabeaz.com/ply/">PLY</a> - Implementation of lex and yacc parsing tools for Python.</li>
<li>
<a rel="nofollow" href="https://github.com/daviddrysdale/python-phonenumbers">phonenumbers</a> - Parsing, formatting, storing and validating international phone numbers.</li>
<li>
<a rel="nofollow" href="https://github.com/selwin/python-user-agents">python-user-agents</a> - Browser user agent parser.</li>
<li>
<a rel="nofollow" href="https://sqlparse.readthedocs.org/">sqlparse</a> - A non-validating SQL parser.</li>
<li>
<a rel="nofollow" href="http://pygments.org/">Pygments</a> - A generic syntax highlighter.</li>
<li>
<a rel="nofollow" href="https://github.com/derek73/python-nameparser">python-nameparser</a> - Parsing human names into their individual components.</li>
<li>
<a rel="nofollow" href="http://pyparsing.wikispaces.com/">pyparsing</a> - A general purpose framework for generating parsers.</li>
</ul>
</li>
</ul>
<h3>Specific Formats Processing</h3>
<p><em>Libraries for parsing and manipulating specific text formats.</em></p>
<ul>
<li>General<br><br><ul>
<li>
<a rel="nofollow" href="https://github.com/kennethreitz/tablib">tablib</a> - A module for Tabular Datasets in XLS, CSV, JSON, YAML.</li>
</ul>
</li>
<li>Office<br><br><ul>
<li>
<a rel="nofollow" href="https://github.com/python-openxml/python-docx">python-docx</a> - Reads, queries and modifies Microsoft Word 2007/2008 docx files.</li>
<li>
<a rel="nofollow" href="https://github.com/python-excel/xlwt">xlwt</a> / <a rel="nofollow" href="https://github.com/python-excel/xlrd">xlrd</a> - Writing and reading data and formatting information from Excel files.</li>
<li>
<a rel="nofollow" href="https://xlsxwriter.readthedocs.org/">XlsxWriter</a> - A Python module for creating Excel .xlsx files.</li>
<li>
<a rel="nofollow" href="http://xlwings.org/">xlwings</a> - A BSD-licensed library that makes it easy to call Python from Excel and vice versa.</li>
<li>
<a rel="nofollow" href="https://github.com/brianray/mm">Marmir</a> - Takes Python data structures and turns them into spreadsheets.</li>
</ul>
</li>
<li>PDF<br><br><ul>
<li>
<a rel="nofollow" href="https://github.com/euske/pdfminer">PDFMiner</a> - A tool for extracting information from PDF documents.</li>
<li>
<a rel="nofollow" href="https://github.com/mstamy2/PyPDF2">PyPDF2</a> - A library capable of splitting, merging and transforming PDF pages.</li>
</ul>
</li>
<li>Markdown<br><br><ul>
<li>
<a rel="nofollow" href="https://github.com/waylan/Python-Markdown">Python-Markdown</a> - A Python implementation of John Gruber’s Markdown.</li>
<li>
<a rel="nofollow" href="https://github.com/lepture/mistune">Mistune</a> - The fastest full-featured pure Python Markdown parser.</li>
</ul>
</li>
<li>YAML<br><br><ul>
<li>
<a rel="nofollow" href="http://pyyaml.org/">PyYAML</a> - YAML implementations for Python.</li>
</ul>
</li>
<li>CSV<br><br><ul>
<li>
<a rel="nofollow" href="https://github.com/onyxfish/csvkit">csvkit</a> - Utilities for converting to and working with CSV.</li>
</ul>
</li>
<li>Archive<br><br><ul>
<li>
<a rel="nofollow" href="https://github.com/mitsuhiko/unp">unp</a> - A command line tool that can unpack archives easily.</li>
</ul>
</li>
</ul>
<h3>Natural Language Processing</h3>
<p><em>Libraries for working with human languages.</em></p>
<ul>
<li>
<a rel="nofollow" href="http://www.nltk.org/">NLTK</a> - A leading platform for building Python programs to work with human language data.</li>
<li>
<a rel="nofollow" href="http://www.clips.ua.ac.be/pattern">Pattern</a> - A web mining module for Python. It has tools for natural language processing and machine learning, among others.</li>
<li>
<a rel="nofollow" href="http://textblob.readthedocs.org/">TextBlob</a> - Providing a consistent API for diving into common NLP tasks. Stands on the giant shoulders of NLTK and Pattern.</li>
<li>
<a rel="nofollow" href="https://github.com/fxsjy/jieba">jieba</a> - Chinese word segmentation utilities.</li>
<li>
<a rel="nofollow" href="https://github.com/isnowfy/snownlp">SnowNLP</a> - A library for processing Chinese text.</li>
<li>
<a rel="nofollow" href="https://github.com/victorlin/loso">loso</a> - Another Chinese segmentation library.</li>
<li>
<a rel="nofollow" href="https://github.com/duanhongyi/genius">genius</a> - A Chinese segmentation library based on Conditional Random Fields.</li>
</ul>
<h3>Documentation</h3>
<p><em>Libraries for generating project documentation.</em></p>
<ul>
<li>
<a rel="nofollow" href="http://sphinx-doc.org/">Sphinx</a> - Python Documentation generator.<br><br><ul>
<li><a rel="nofollow" href="https://github.com/yoloseem/awesome-sphinxdoc">awesome-sphinxdoc</a></li>
</ul>
</li>
<li>
<a rel="nofollow" href="http://docutils.sourceforge.net/rst.html">reStructuredText</a> - Markup Syntax and Parser Component of Docutils.</li>
<li>
<a rel="nofollow" href="http://www.mkdocs.org/">MkDocs</a> - Markdown friendly documentation generator.</li>
<li>
<a rel="nofollow" href="http://fitzgen.github.io/pycco/">Pycco</a> - The original quick-and-dirty, hundred-line-long, literate-programming-style documentation generator.</li>
<li>
<a rel="nofollow" href="https://github.com/BurntSushi/pdoc">pdoc</a> - Epydoc replacement to auto generate API documentation for Python libraries.</li>
</ul>
<h3>Configuration</h3>
<p><em>Libraries for storing configuration options.</em></p>
<ul>
<li>
<a rel="nofollow" href="https://docs.python.org/2/library/configparser.html">ConfigParser</a> - (Python standard library) INI file parser.</li>
<li>
<a rel="nofollow" href="http://www.voidspace.org.uk/python/configobj.html">ConfigObj</a> - INI file parser with validation.</li>
<li>
<a rel="nofollow" href="http://www.red-dove.com/config-doc/">config</a> - Hierarchical config from the author of <a rel="nofollow" href="https://docs.python.org/2/library/logging.html">logging</a>.</li>
<li>
<a rel="nofollow" href="http://profig.readthedocs.org/">profig</a> - Config from multiple formats with value conversion.</li>
</ul>
<h3>Command-line Tools</h3>
<p><em>Libraries for building command-line applications.</em></p>
<ul>
<li>Command-line Application Development<br><br><ul>
<li>
<a rel="nofollow" href="http://builtoncement.com/">cement</a> - Cement provides a light-weight and fully featured foundation to build anything from single file scripts to complex and intricately designed applications.</li>
<li>
<a rel="nofollow" href="http://click.pocoo.org/">click</a> - A package for creating beautiful command line interfaces in a composable way.</li>
<li>
<a rel="nofollow" href="https://github.com/kennethreitz/clint">clint</a> - Python Command-line Application Tools.</li>
<li>
<a rel="nofollow" href="https://cliff.readthedocs.org/">cliff</a> - A framework for creating command-line programs with multi-level commands.</li>
<li>
<a rel="nofollow" href="http://clime.mosky.tw">Clime</a> - Clime lets you convert any module into a multi-command CLI program without any configuration.</li>
<li>
<a rel="nofollow" href="http://docopt.org/">docopt</a> - Pythonic command line arguments parser.</li>
<li>
<a rel="nofollow" href="https://pypi.python.org/pypi/colorama">colorama</a> - Cross-platform colored terminal text.</li>
<li>
<a rel="nofollow" href="https://pythonhosted.org/pyCLI/">pyCLI</a> - Command-line applications supporting standard command line parsing, logging, unit and functional testing.</li>
<li>
<a rel="nofollow" href="https://github.com/chriskiehl/Gooey">Gooey</a> - Turn command line programs into a full GUI application with one line.</li>
</ul>
</li>
<li>Productivity Tools<br><br><ul>
<li>
<a rel="nofollow" href="https://github.com/audreyr/cookiecutter">cookiecutter</a> - A command-line utility that creates projects from cookiecutters (project templates). E.g. Python package projects, jQuery plugin projects.</li>
<li>
<a rel="nofollow" href="https://github.com/jakubroztocil/httpie">httpie</a> - A command line HTTP client, a user-friendly cURL replacement.</li>
<li>
<a rel="nofollow" href="https://github.com/mooz/percol">percol</a> - Adds flavor of interactive selection to the traditional pipe concept on UNIX.</li>
<li>
<a rel="nofollow" href="http://www.rainbowstream.org/">RainbowStream</a> - Smart and nice Twitter client on terminal.</li>
<li>
<a rel="nofollow" href="https://github.com/brettcannon/caniusepython3">caniusepython3</a> - Determine what projects are blocking you from porting to Python 3.</li>
</ul>
</li>
</ul>
<h3>Downloader</h3>
<p><em>Libraries for downloading.</em></p>
<ul>
<li>
<a rel="nofollow" href="https://github.com/s3tools/s3cmd">s3cmd</a> - A command line tool for managing Amazon S3 and CloudFront.</li>
<li>
<a rel="nofollow" href="http://rg3.github.io/youtube-dl/">youtube-dl</a> - A small command-line program to download videos from YouTube.</li>
<li>
<a rel="nofollow" href="http://www.soimort.org/you-get/">you-get</a> - A YouTube/Youku/Niconico video downloader written in Python 3.</li>
<li>
<a rel="nofollow" href="https://github.com/coursera-dl/coursera">coursera</a> - Script for downloading Coursera.org videos and naming them.</li>
<li>
<a rel="nofollow" href="https://github.com/WikiTeam/wikiteam">WikiTeam</a> - Tools for downloading and preserving wikis.</li>
<li>
<a rel="nofollow" href="https://github.com/Diaoul/subliminal">subliminal</a> - Library and command line tool to search and download subtitles.</li>
</ul>
<h3>Imagery</h3>
<p><em>Libraries for manipulating images.</em></p>
<ul>
<li>
<a rel="nofollow" href="http://pillow.readthedocs.org/">pillow</a> - Pillow is the friendly <a rel="nofollow" href="http://www.pythonware.com/products/pil/">PIL</a> fork.</li>
<li>
<a rel="nofollow" href="https://github.com/dahlia/wand">wand</a> - Python bindings for <a rel="nofollow" href="http://www.imagemagick.org/script/magick-wand.php">MagickWand</a>, C API for ImageMagick.</li>
<li>
<a rel="nofollow" href="https://github.com/thumbor/thumbor">thumbor</a> - A smart imaging service. It enables on-demand crop, resizing and flipping of images.</li>
<li>
<a rel="nofollow" href="http://www.imgseek.net/">imgSeek</a> - A project for searching a collection of images using visual similarity.</li>
<li>
<a rel="nofollow" href="https://github.com/lincolnloop/python-qrcode">python-qrcode</a> - A pure Python QR Code generator.</li>
<li>
<a rel="nofollow" href="https://pythonhosted.org/pyBarcode/">pyBarcode</a> - Create barcodes in Python without needing PIL.</li>
<li>
<a rel="nofollow" href="https://github.com/ajkumar25/pygram">pygram</a> - Instagram-like image filters.</li>
<li>
<a rel="nofollow" href="https://github.com/fogleman/Quads">Quads</a> - Computer art based on quadtrees.</li>
<li>
<a rel="nofollow" href="https://github.com/hhatto/nude.py">nude.py</a> - Nudity detection.</li>
<li>
<a rel="nofollow" href="http://scikit-image.org/">scikit-image</a> - A Python library for (scientific) image processing.</li>
<li>
<a rel="nofollow" href="https://github.com/rossgoodwin/hmap">hmap</a> - Image histogram remapping.</li>
</ul>
<h3>OCR</h3>
<p><em>Libraries for Optical Character Recognition.</em></p>
<ul>
<li><a rel="nofollow" href="https://code.google.com/p/python-tesseract">python-tesseract</a> - A wrapper class for <a rel="nofollow" href="https://code.google.com/p/tesseract-ocr/">Google Tesseract OCR</a>.</li>
<li>
<a rel="nofollow" href="https://github.com/madmaze/pytesseract">pytesseract</a> - Another wrapper for Google Tesseract OCR.</li>
<li>
<a rel="nofollow" href="https://github.com/jflesch/pyocr">pyocr</a> - A wrapper for Tesseract and Cuneiform.</li>
</ul>
<h3>Audio</h3>
<p><em>Libraries for manipulating audio.</em></p>
<ul>
<li>
<a rel="nofollow" href="https://github.com/danilobellini/audiolazy">audiolazy</a> - Expressive Digital Signal Processing (DSP) package for Python.</li>
<li>
<a rel="nofollow" href="https://github.com/sampsyo/audioread">audioread</a> - Cross-library (GStreamer + Core Audio + MAD + FFmpeg) audio decoding.</li>
<li>
<a rel="nofollow" href="http://beets.radbox.org/">beets</a> - A music library manager and <a rel="nofollow" href="https://musicbrainz.org/">MusicBrainz</a> tagger.</li>
<li>
<a rel="nofollow" href="https://github.com/worldveil/dejavu">dejavu</a> - Audio fingerprinting and recognition.</li>
<li>
<a rel="nofollow" href="https://github.com/StreetVoice/django-elastic-transcoder">django-elastic-transcoder</a> - Django + <a rel="nofollow" href="http://aws.amazon.com/elastictranscoder/">Amazon Elastic Transcoder</a>.</li>
<li>
<a rel="nofollow" href="http://eyed3.nicfit.net/">eyeD3</a> - A tool for working with audio files, specifically MP3 files containing ID3 metadata.</li>
<li>
<a rel="nofollow" href="http://nedbatchelder.com/code/modules/id3reader.py">id3reader</a> - A Python module for reading MP3 meta data.</li>
<li>
<a rel="nofollow" href="https://code.google.com/p/mutagen/">mutagen</a> - A Python module to handle audio metadata.</li>
<li>
<a rel="nofollow" href="https://github.com/jiaaro/pydub">pydub</a> - Manipulate audio with a simple and easy high level interface.</li>
<li>
<a rel="nofollow" href="https://github.com/echonest/pyechonest">pyechonest</a> - Python client for the <a rel="nofollow" href="http://developer.echonest.com/docs/">Echo Nest</a> API.</li>
<li>
<a rel="nofollow" href="http://scikits.appspot.com/talkbox">talkbox</a> - A Python library for speech/signal processing.</li>
<li>
<a rel="nofollow" href="https://github.com/yomguy/TimeSide">TimeSide</a> - Open web audio processing framework.</li>
<li>
<a rel="nofollow" href="https://github.com/devsnd/tinytag">tinytag</a> - A library for reading music meta data of MP3, OGG, FLAC and Wave files.</li>
<li>
<a rel="nofollow" href="https://github.com/globocom/m3u8">m3u8</a> - A module for parsing m3u8 files.</li>
</ul>
<h3>Video</h3>
<p><em>Libraries for manipulating video and GIFs.</em></p>
<ul>
<li>
<a rel="nofollow" href="http://zulko.github.io/moviepy/">moviepy</a> - A module for script-based movie editing with many formats, including animated GIFs.</li>
<li>
<a rel="nofollow" href="http://www.shorten.tv/">shorten.tv</a> - Video summarization.</li>
<li>
<a rel="nofollow" href="https://github.com/aizvorski/scikit-video">scikit-video</a> - Video processing routines for SciPy.</li>
</ul>
<h3>Geolocation</h3>
<p><em>Libraries for geocoding addresses and working with latitudes and longitudes.</em></p>
<ul>
<li>
<a rel="nofollow" href="https://docs.djangoproject.com/en/dev/ref/contrib/gis/">GeoDjango</a> - A world-class geographic web framework.</li>
<li>
<a rel="nofollow" href="https://github.com/geopy/geopy">geopy</a> - Python Geocoding Toolbox.</li>
<li>
<a rel="nofollow" href="https://github.com/appliedsec/pygeoip">pygeoip</a> - Pure Python GeoIP API.</li>
<li>
<a rel="nofollow" href="https://github.com/maxmind/geoip-api-python">GeoIP</a> - Python API for MaxMind GeoIP Legacy Database.</li>
<li>
<a rel="nofollow" href="https://github.com/frewsxcv/python-geojson">geojson</a> - Python bindings and utilities for GeoJSON.</li>
<li>
<a rel="nofollow" href="https://github.com/SmileyChris/django-countries">django-countries</a> - A Django app that provides country choices for use with forms, flag icons static files, and a country field for models.</li>
</ul>
<h3>HTTP</h3>
<p><em>Libraries for working with HTTP.</em></p>
<ul>
<li>
<a rel="nofollow" href="http://docs.python-requests.org/">requests</a> - HTTP Requests for Humans™.</li>
<li>
<a rel="nofollow" href="https://github.com/kennethreitz/grequests">grequests</a> - requests + gevent for asynchronous HTTP requests.</li>
<li>
<a rel="nofollow" href="https://github.com/shazow/urllib3">urllib3</a> - An HTTP library with thread-safe connection pooling, file post support, sanity friendly.</li>
<li>
<a rel="nofollow" href="https://github.com/jcgregorio/httplib2">httplib2</a> - Comprehensive HTTP client library.</li>
<li>
<a rel="nofollow" href="https://github.com/dreid/treq">treq</a> - A Python requests-like API built on top of Twisted's HTTP client.</li>
</ul>
<h3>Database</h3>
<p><em>Databases implemented in Python.</em></p>
<ul>
<li>
<a rel="nofollow" href="http://www.zodb.org/">ZODB</a> - A native object database for Python. A key-value and object graph database.</li>
<li>
<a rel="nofollow" href="https://pythonhosted.org/pickleDB/">pickleDB</a> - A simple and lightweight key-value store for Python.</li>
<li>
<a rel="nofollow" href="https://github.com/msiemens/tinydb">TinyDB</a> - A tiny, document-oriented database.</li>
</ul>
<h3>Database Drivers</h3>
<p><em>Libraries for connecting and operating databases.</em></p>
<ul>
<li>Relational Databases<br><br><ul>
<li>
<a rel="nofollow" href="http://sourceforge.net/projects/mysql-python/">mysql-python</a> - The MySQL database connector for Python.</li>
<li>
<a rel="nofollow" href="https://github.com/PyMySQL/mysqlclient-python">mysqlclient</a> - mysql-python fork supporting Python 3.</li>
<li>
<a rel="nofollow" href="https://github.com/PyMySQL/PyMySQL">PyMySQL</a> - Pure Python MySQL driver compatible with mysql-python.</li>
<li>
<a rel="nofollow" href="https://pypi.python.org/pypi/mysql-connector-python">mysql-connector-python</a> - A pure Python MySQL driver from Oracle.</li>
<li>
<a rel="nofollow" href="https://pythonhosted.org/oursql/">oursql</a> - A better MySQL connector with support for native prepared statements and BLOBs.</li>
<li>
<a rel="nofollow" href="http://initd.org/psycopg/">psycopg2</a> - The most popular PostgreSQL adapter for Python.</li>
<li>
<a rel="nofollow" href="http://txpostgres.readthedocs.org/">txpostgres</a> - Twisted based asynchronous driver for PostgreSQL.</li>
<li>
<a rel="nofollow" href="https://github.com/gmr/queries">queries</a> - A wrapper of the psycopg2 library for interacting with PostgreSQL.</li>
<li>
<a rel="nofollow" href="https://github.com/pudo/dataset">dataset</a> - Store Python dicts in a database - works with SQLite, MySQL, and PostgreSQL.</li>
</ul>
</li>
<li>NoSQL Databases<br><br><ul>
<li>
<a rel="nofollow" href="https://github.com/datastax/python-driver">cassandra-python-driver</a> - Python driver for Cassandra.</li>
<li>
<a rel="nofollow" href="https://github.com/pycassa/pycassa">pycassa</a> - Python Thrift driver for Cassandra.</li>
<li>
<a rel="nofollow" href="http://happybase.readthedocs.org/">HappyBase</a> - A developer-friendly library for Apache HBase.</li>
<li>
<a rel="nofollow" href="http://docs.mongodb.org/ecosystem/drivers/python/">PyMongo</a> - The official Python client for MongoDB.</li>
<li>
<a rel="nofollow" href="https://plyvel.readthedocs.org/">Plyvel</a> - A fast and feature-rich Python interface to LevelDB.</li>
<li>
<a rel="nofollow" href="https://github.com/andymccurdy/redis-py">redis-py</a> - The Redis Python Client.</li>
<li>
<a rel="nofollow" href="http://book.py2neo.org/">py2neo</a> - Python wrapper client for Neo4j's restful interface.</li>
<li>
<a rel="nofollow" href="https://github.com/driftx/Telephus">telephus</a> - Twisted based client for Cassandra.</li>
<li>
<a rel="nofollow" href="https://github.com/deldotdr/txRedis">txRedis</a> - Twisted based client for Redis.</li>
</ul>
</li>
</ul>
<h3>ORM</h3>
<p><em>Libraries that implement Object-Relational Mapping or datamapping techniques.</em></p>
<ul>
<li>Relational Databases<br><br><ul>
<li>
<a rel="nofollow" href="https://docs.djangoproject.com/en/dev/topics/db/models/">Django Models</a> - A part of Django.</li>
<li>
<a rel="nofollow" href="http://www.sqlalchemy.org/">SQLAlchemy</a> - The Python SQL Toolkit and Object Relational Mapper.<br><br><ul>
<li><a rel="nofollow" href="https://github.com/dahlia/awesome-sqlalchemy">awesome-sqlalchemy</a></li>
</ul>
</li>
<li>
<a rel="nofollow" href="https://github.com/coleifer/peewee">peewee</a> - A small, expressive ORM.</li>
<li>
<a rel="nofollow" href="http://ponyorm.com">PonyORM</a> - ORM that provides a generator-oriented interface to SQL.</li>
</ul>
</li>
<li>NoSQL Databases<br><br><ul>
<li>
<a rel="nofollow" href="http://mongoengine.org/">MongoEngine</a> - A Python Object-Document-Mapper for working with MongoDB.</li>
<li>
<a rel="nofollow" href="https://github.com/django-nonrel/mongodb-engine">django-mongodb-engine</a> - Django MongoDB Backend.</li>
<li>
<a rel="nofollow" href="https://github.com/kiddouk/redisco">redisco</a> - A Python Library for Simple Models and Containers Persisted in Redis.</li>
<li>
<a rel="nofollow" href="https://github.com/mathcamp/flywheel">flywheel</a> - Object mapper for Amazon DynamoDB.</li>
</ul>
</li>
<li>Others<br><br><ul>
<li>
<a rel="nofollow" href="https://github.com/Widdershin/butterdb">butterdb</a> - A Python ORM for Google Drive Spreadsheets.</li>
</ul>
</li>
</ul>
<h3>Web Frameworks</h3>
<p><em>Full stack web frameworks.</em></p>
<ul>
<li>
<a rel="nofollow" href="https://www.djangoproject.com/">Django</a> - The most popular web framework in Python.<br><br><ul>
<li><a rel="nofollow" href="https://github.com/rosarior/awesome-django">awesome-django</a></li>
</ul>
</li>
<li>
<a rel="nofollow" href="http://flask.pocoo.org/">Flask</a> - A microframework for Python.<br><br><ul>
<li><a rel="nofollow" href="https://github.com/humiaozuzu/awesome-flask">awesome-flask</a></li>
</ul>
</li>
<li>
<a rel="nofollow" href="http://bottlepy.org/">Bottle</a> - A fast, simple and lightweight WSGI micro web-framework.</li>
<li>
<a rel="nofollow" href="http://www.pylonsproject.org/">Pyramid</a> - A small, fast, down-to-earth, open source Python web framework.</li>
<li>
<a rel="nofollow" href="http://www.web2py.com">web2py</a> - A full stack web framework and platform focused on ease of use.</li>
<li>
<a rel="nofollow" href="http://webpy.org/">web.py</a> - A web framework for Python that is as simple as it is powerful.</li>
<li>
<a rel="nofollow" href="http://www.turbogears.org/">TurboGears</a> - The Web Framework that starts as a microframework and scales up to a fullstack solution.</li>
<li>
<a rel="nofollow" href="http://www.cherrypy.org/">CherryPy</a> - A Minimalist Python Web Framework, HTTP/1.1-compliant and WSGI thread-pooled.</li>
<li>
<a rel="nofollow" href="http://grok.zope.org/">Grok</a> - A framework built on the existing Zope 3 libraries.</li>
<li>
<a rel="nofollow" href="http://bluebream.zope.org/">Bluebream</a> - An open-source web application server, framework and library, formerly known as Zope 3.</li>
<li>
<a rel="nofollow" href="https://github.com/flatpeach/guava">guava</a> - A lightweight and high performance web framework for Python written in C.</li>
</ul>
<h3>Permissions</h3>
<p><em>Libraries that allow or deny users access to data or functionality.</em></p>
<ul>
<li>
<a rel="nofollow" href="https://github.com/lukaszb/django-guardian">django-guardian</a> - Implementation of per-object permissions for Django 1.2+.</li>
<li>
<a rel="nofollow" href="http://www.github.com/neuman/python-carteblanche/">Carteblanche</a> - Module to align code with thoughts of users and designers. Also magically handles navigation and permissions.</li>
</ul>
<h3>CMS</h3>
<p><em>Content Management Systems.</em></p>
<ul>
<li>
<a rel="nofollow" href="https://www.django-cms.org/en/">django-cms</a> - An open source enterprise CMS based on Django.</li>
<li>
<a rel="nofollow" href="http://djedi-cms.org/">djedi-cms</a> - A lightweight yet powerful Django CMS with plugins, inline editing and performance in mind.</li>
<li>
<a rel="nofollow" href="http://www.feincms.org/">FeinCMS</a> - One of the most advanced Content Management Systems built on Django.</li>
<li>
<a rel="nofollow" href="http://kotti.pylonsproject.org/">Kotti</a> - A high-level, Pythonic web application framework built on Pyramid.</li>
<li>
<a rel="nofollow" href="http://mezzanine.jupo.org/">Mezzanine</a> - A powerful, consistent, and flexible content management platform.</li>
<li>
<a rel="nofollow" href="http://oppsproject.org/">Opps</a> - A Django-based CMS for high-traffic magazines, newspaper websites and portals.</li>
<li>
<a rel="nofollow" href="http://plone.org/">Plone</a> - A CMS built on top of the open source application server Zope.</li>
<li>
<a rel="nofollow" href="http://quokkaproject.org/">Quokka</a> - Flexible, extensible, small CMS powered by Flask and MongoDB.</li>
<li>
<a rel="nofollow" href="http://wagtail.io/">Wagtail</a> - A Django content management system.</li>
<li>
<a rel="nofollow" href="http://wid.gy/">Widgy</a> - Last CMS framework, based on Django.</li>
</ul>
<h3>E-commerce</h3>
<p><em>Frameworks and libraries for e-commerce and payments.</em></p>
<ul>
<li>
<a rel="nofollow" href="http://oscarcommerce.com/">django-oscar</a> - An open-source e-commerce framework for Django.</li>
<li>
<a rel="nofollow" href="https://www.django-cms.org/">django-shop</a> - A Django based shop system.</li>
<li>
<a rel="nofollow" href="https://github.com/agiliq/merchant">merchant</a> - A Django app to accept payments from various payment processors.</li>
<li>
<a rel="nofollow" href="https://github.com/carlospalol/money">money</a> - Money class with optional CLDR-backed locale-aware formatting and an extensible currency exchange solution.</li>
<li>
<a rel="nofollow" href="https://github.com/Alir3z4/python-currencies">python-currencies</a> - Display money format and its filthy currencies.</li>
</ul>
<h3>RESTful API</h3>
<p><em>Libraries for developing RESTful APIs.</em></p>
<ul>
<li>
<a rel="nofollow" href="https://cornice.readthedocs.org/">cornice</a> - A REST framework for Pyramid.</li>
<li>
<a rel="nofollow" href="http://www.django-rest-framework.org/">django-rest-framework</a> - A powerful and flexible toolkit that makes it easy to build Web APIs.</li>
<li>
<a rel="nofollow" href="http://tastypieapi.org/">django-tastypie</a> - Creating delicious APIs for Django apps.</li>
<li>
<a rel="nofollow" href="https://github.com/5monkeys/django-formapi">django-formapi</a> - Create JSON APIs with HMAC authentication and Django form-validation.</li>
<li>
<a rel="nofollow" href="http://www.flaskapi.org/">flask-api</a> - An implementation of the same web browsable APIs that django-rest-framework provides.</li>
<li>
<a rel="nofollow" href="http://flask-restful.readthedocs.org/">flask-restful</a> - An extension for Flask that adds support for quickly building REST APIs.</li>
<li>
<a rel="nofollow" href="https://flask-restless.readthedocs.org/en/latest/">flask-restless</a> - A Flask extension for generating ReSTful APIs for database models defined with SQLAlchemy (or Flask-SQLAlchemy).</li>
<li>
<a rel="nofollow" href="https://github.com/marselester/flask-api-utils">flask-api-utils</a> - Flask extension that takes care of API representation and authentication.</li>
<li>
<a rel="nofollow" href="http://falconframework.org/">falcon</a> - A high-performance Python framework for building cloud APIs and web app backends.</li>
<li>
<a rel="nofollow" href="https://github.com/nicolaiarocci/eve">eve</a> - REST API framework powered by Flask, MongoDB and good intentions.</li>
<li>
<a rel="nofollow" href="https://github.com/jeffknupp/sandman">sandman</a> - Automated REST APIs for existing database-driven systems.</li>
<li>
<a rel="nofollow" href="http://restless.readthedocs.org/en/latest/">restless</a> - Framework agnostic REST framework based on lessons learned from TastyPie.</li>
<li>
<a rel="nofollow" href="https://github.com/RueLaLa/savory-pie/">savory-pie</a> - REST API building library (Django and others).</li>
</ul>
<h3>Authentication</h3>
<p><em>Libraries for implementing authentication schemes.</em></p>
<ul>
<li>OAuth<br><br><ul>
<li>
<a rel="nofollow" href="http://peterhudec.github.io/authomatic/">Authomatic</a> - Simple but powerful framework agnostic authentication/authorization client package.</li>
<li>
<a rel="nofollow" href="https://github.com/idan/oauthlib">OAuthLib</a> - A generic, spec-compliant, thorough implementation of the OAuth request-signing logic.</li>
<li>
<a rel="nofollow" href="https://github.com/litl/rauth">rauth</a> - A Python library for OAuth 1.0/a, 2.0, and Ofly.</li>
<li>
<a rel="nofollow" href="https://github.com/simplegeo/python-oauth2">python-oauth2</a> - A fully tested, abstract interface to creating OAuth clients and servers.</li>
<li>
<a rel="nofollow" href="https://github.com/omab/python-social-auth">python-social-auth</a> - An easy-to-setup social authentication mechanism.</li>
<li>
<a rel="nofollow" href="https://github.com/evonove/django-oauth-toolkit">django-oauth-toolkit</a> - OAuth2 goodies for the Djangonauts.</li>
<li>
<a rel="nofollow" href="https://github.com/caffeinehit/django-oauth2-provider">django-oauth2-provider</a> - Provides OAuth2 access to a Django app.</li>
<li>
<a rel="nofollow" href="https://github.com/pennersr/django-allauth">django-allauth</a> - Authentication app for Django that "just works."</li>
<li>
<a rel="nofollow" href="https://github.com/lepture/flask-oauthlib">Flask-OAuthlib</a> - OAuth 1.0/a, 2.0 implementation of client and provider for Flask.</li>
<li>
<a rel="nofollow" href="https://github.com/demianbrecht/sanction">sanction</a> - A dead simple OAuth2 client implementation.</li>
</ul>
</li>
<li>Others<br><br><ul>
<li>
<a rel="nofollow" href="https://github.com/progrium/pyjwt">PyJWT</a> - Implementation of the JSON Web Token draft 01.</li>
<li>
<a rel="nofollow" href="https://github.com/davedoesdev/python-jwt">python-jwt</a> - Module for generating and verifying JSON Web Tokens.</li>
<li>
<a rel="nofollow" href="https://github.com/brianloveswords/python-jws">python-jws</a> - Implementation of JSON Web Signatures draft 02.</li>
<li>
<a rel="nofollow" href="https://github.com/demonware/jose">jose</a> - JavaScript Object Signing and Encryption draft implementation.</li>
</ul>
</li>
</ul>
<h3>Template Engine</h3>
<p><em>Libraries and tools for templating and lexing.</em></p>
<ul>
<li>
<a rel="nofollow" href="https://github.com/mitsuhiko/jinja2">Jinja2</a> - A modern and designer friendly templating language.</li>
<li>
<a rel="nofollow" href="http://genshi.edgewall.org/">Genshi</a> - Python templating toolkit for generation of web-aware output.</li>
<li>
<a rel="nofollow" href="http://www.makotemplates.org/">Mako</a> - Hyperfast and lightweight templating for the Python platform.</li>
<li>
<a rel="nofollow" href="https://chameleon.readthedocs.org/">Chameleon</a> - An HTML/XML template engine. Modeled after ZPT, optimized for speed.</li>
<li>
<a rel="nofollow" href="https://code.google.com/p/spitfire/">Spitfire</a> - A very fast Python template compiler.</li>
</ul>
<h3>Queue</h3>
<p><em>Libraries for working with event and task queues.</em></p>
<ul>
<li>
<a rel="nofollow" href="http://www.celeryproject.org/">celery</a> - An asynchronous task queue/job queue based on distributed message passing.</li>
<li>
<a rel="nofollow" href="https://github.com/coleifer/huey">huey</a> - Little multi-threaded task queue.</li>
<li>
<a rel="nofollow" href="https://github.com/pricingassistant/mrq">mrq</a> - Mr. Queue - A distributed worker task queue in Python using Redis & gevent.</li>
<li>
<a rel="nofollow" href="http://python-rq.org/">rq</a> - Simple job queues for Python.</li>
<li>
<a rel="nofollow" href="https://github.com/rdegges/simpleq">simpleq</a> - A simple, infinitely scalable, Amazon SQS based queue.</li>
</ul>
<h3>Search</h3>
<p><em>Libraries and software for indexing and performing search queries on data.</em></p>
<ul>
<li>
<a rel="nofollow" href="https://github.com/toastdriven/django-haystack">django-haystack</a> - Modular search for Django.</li>
<li>
<a rel="nofollow" href="http://www.elasticsearch.org/guide/en/elasticsearch/client/python-api/current/">elasticsearch-py</a> - The official low-level Python client for <a rel="nofollow" href="http://www.elasticsearch.org/">Elasticsearch</a>.</li>
<li>
<a rel="nofollow" href="https://code.google.com/p/solrpy/">solrpy</a> - A Python client for <a rel="nofollow" href="http://lucene.apache.org/solr/">solr</a>.</li>
<li>
<a rel="nofollow" href="http://whoosh.readthedocs.org/">Whoosh</a> - A fast, pure Python search engine library.</li>
</ul>
<h3>News Feed</h3>
<p><em>Libraries for building user activity feeds.</em></p>
<ul>
<li>
<a rel="nofollow" href="https://github.com/tschellenbach/Feedly">Feedly</a> - A library to build newsfeed and notification systems using Cassandra and Redis.</li>
<li>
<a rel="nofollow" href="https://github.com/justquick/django-activity-stream">django-activity-stream</a> - Generate generic activity streams from the actions on your site.</li>
</ul>
<h3>Asset Management</h3>
<p><em>Tools for managing, compressing and minifying website assets.</em></p>
<ul>
<li>
<a rel="nofollow" href="https://github.com/django-compressor/django-compressor">django-compressor</a> - Compresses linked and inline javascript or CSS into a single cached file.</li>
<li>
<a rel="nofollow" href="https://github.com/jaysonsantos/jinja-assets-compressor">jinja-assets-compressor</a> - A Jinja extension to compile and compress your assets.</li>
<li>
<a rel="nofollow" href="http://webassets.readthedocs.org/">webassets</a> - Bundles, optimizes, and manages unique cache-busting URLs for static resources.</li>
<li>
<a rel="nofollow" href="http://www.fanstatic.org/">fanstatic</a> - Packages, optimizes, and serves static file dependencies as Python packages.</li>
<li>
<a rel="nofollow" href="http://fileconveyor.org/">fileconveyor</a> - Monitors changes, processes, and transports assets to CDNs and file storage systems.</li>
<li>
<a rel="nofollow" href="http://code.larlet.fr/django-storages/">django-storages</a> - A collection of custom storage backends for Django.</li>
<li>
<a rel="nofollow" href="http://gluecss.com">glue</a> - Glue is a simple command line tool to generate CSS sprites.</li>
<li>
<a rel="nofollow" href="http://hongminhee.org/libsass-python/">libsass-python</a> - A Python binding of <a rel="nofollow" href="https://github.com/hcatlin/libsass">libsass</a>, the reference implementation of SASS/SCSS.</li>
<li>
<a rel="nofollow" href="http://flask-assets.readthedocs.org/">Flask-Assets</a> - Helps you integrate webassets into your Flask app.</li>
</ul>
Implementing Hadoop multiple-file output in Scala
https://segmentfault.com/a/1190000002517336
2015-01-27T22:23:47+08:00
2015-01-27T22:23:47+08:00
timger
https://segmentfault.com/u/timger
0
<pre><code>package com.timger.tools
/**
* Created by timger on 15-1-26.
*/
// import com.timger.etl.TokenizerMapper  // project-specific class, not used in this file
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.Path
import org.apache.hadoop.io.{LongWritable, NullWritable, IntWritable, Text}
//import org.apache.hadoop.mapred.lib.{MultipleOutputFormat, MultipleTextOutputFormat}
import org.apache.hadoop.mapreduce.lib.output.LazyOutputFormat
import org.apache.hadoop.mapreduce.lib.output.MultipleOutputs;
import org.apache.hadoop.mapreduce.{Job,InputFormat,OutputFormat}
import org.apache.hadoop.mapreduce.Mapper
import org.apache.hadoop.mapreduce.Reducer
import org.apache.hadoop.mapreduce.lib.input.{TextInputFormat, FileInputFormat}
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat
import org.apache.hadoop.util.GenericOptionsParser
import scala.collection.JavaConversions._
import org.apache.hadoop.mapreduce.lib.output;
import org.apache.hadoop.mapreduce.lib.output.{TextOutputFormat}
/**
* https://hadoop.apache.org/docs/r2.2.0//api/org/apache/hadoop/mapred/lib/MultipleOutputs.html
*/
// This class performs the map operation, translating raw input into the key-value
// pairs we will feed into our reduce operation.
class SplitMapper extends Mapper[Object,Text,Text,Text] {
//https://github.com/rystsov/learning-hadoop/blob/master/src/main/java/com/twitter/rystsov/mr/MultipulOutputExample.java
//var multipleOutputs :MultipleOutputs = null;
private var multipleOutputs: MultipleOutputs[Text, Text] = null
@throws(classOf[java.io.IOException])
@throws(classOf[java.lang.InterruptedException])
override
protected def setup(context:Mapper[Object, Text, Text, Text]#Context) {
//pattern = Pattern.compile("^http://([^/]+).+$");
multipleOutputs = new MultipleOutputs(context);
println("SetUp Ok ......................");
}
override
def map(key:Object, value:Text, context:Mapper[Object,Text,Text,Text]#Context) = {
//var conf :Configuration = context.getConfiguration();
//var splitkeys = conf.get("splitkeys");
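// The split keys are hardcoded here; the driver does not yet pass its key arguments through the Configuration.
var keys = "26/Jan/2015,27/Jan/2015".split(",");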
//var keys =
val word = new Text;
var line = value.toString();
for (key <- keys) {
word.set(key)
if (line.contains(key)) {
var mkey :String = key.toString().replace("/", "");
context.write(word, value);
multipleOutputs.write(mkey,
word,
value);
}
}
}
@throws(classOf[java.io.IOException])
@throws(classOf[InterruptedException])
override
protected def cleanup( context:Mapper[Object,Text,Text,Text]#Context){
multipleOutputs.close();
}
}
/***
class MutiFIleReducer extends Reducer[Text,Text,Text,Text] {
override
def reduce(key:Text, values:java.lang.Iterable[Text], context:Reducer[Text,Text,Text,Text]#Context) = {
for (value <- values) {
context.write(key, value)
}
}
}
***/
// This class configures and runs the job with the map and reduce classes we've
// specified above.
object SplitFile {
def main(args:Array[String]):Int = {
val conf = new Configuration()
val otherArgs = new GenericOptionsParser(conf, args).getRemainingArgs
var usage =
"""
|Usage: SplitFile <inputdir> <outputdir> <tmpdir> <keeporiginfile> [<key1> <key2> <key3> <key4> ...]
|inputdir is a file or directory
|outputdir is a directory, need not be blank
|tmpdir is used for temporary output
|keeporiginfile is a boolean
""".stripMargin
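print("\n\n");
// mkString prints the array contents instead of the JVM object reference.
print(args.mkString(" "));
print(otherArgs.mkString(" "));
println()
// Validate the parsed arguments, since those are the ones indexed below.
if (otherArgs.length < 5) {
print(usage)
return 2
}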
val job = new Job(conf, "com.timger.tools.SplitFile")
job.setJarByClass(classOf[SplitMapper])
job.setMapperClass(classOf[SplitMapper])
//job.setCombinerClass(classOf[IntSumReducer])
//job.setReducerClass(classOf[MutiFIleReducer])
job.setOutputKeyClass(classOf[Text])
job.setInputFormatClass(classOf[TextInputFormat])
var inputdir = new Path(otherArgs(0));
var outputdir = new Path(otherArgs(1));
var tmpdir = new Path(otherArgs(2));
var keeyoriginfile :String= otherArgs(3);
var splitkeys :Array[String]= otherArgs.slice(4, otherArgs.length);
var keys = splitkeys.mkString(",")
var test_str :String =
"""
|input: %s
|output: %s
|tmpdir: %s
|keys: %s
""".stripMargin.format(inputdir, outputdir, tmpdir, keys);
print(test_str);
//conf.set("splitkeys", keys);
print(splitkeys.mkString(","));
// val TextOutputFormatClass = classOf[TextOutputFormat].asInstanceOf[Class[T] forSome {type T <: OutputFormat[String, String]}]
for (key <- splitkeys) {
var mkey :String = key.toString().replace("/", "");
print(mkey);
MultipleOutputs.addNamedOutput(job, mkey,
classOf[TextOutputFormat[Text,Text]],
classOf[Text],
classOf[Text]);
}
//job.setNumReduceTasks(splitkeys.length);
job.setNumReduceTasks(0);
job.setOutputValueClass(classOf[Text])
FileInputFormat.addInputPath(job, inputdir)
FileOutputFormat.setOutputPath(job, tmpdir)
//job.setOutputFormatClass(classOf[MyMultipleOutputFormat]);
//job.setOutputFormatClass(classOf[CustomMultipleTextOutputFormat]);
//LazyOutputFormat.setOutputFormatClass(conf, classOf[MyMultipleOutputFormat])
if (job.waitForCompletion(true)) 0 else 1
}
}
</code></pre>
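<p>The routing rule in <code>SplitMapper.map</code> can be tried out without a cluster. The following is a minimal Python sketch of that rule only, not of the Hadoop job: every line containing a key is grouped under an output name derived from the key with "/" removed (probably because <code>MultipleOutputs</code> accepts only letters and digits in a named output). The function names <code>named_output</code> and <code>split_lines</code> and the sample log lines are made up for illustration.</p>

```python
from collections import defaultdict


def named_output(key):
    """Derive an output name from a split key by dropping "/", as the mapper does."""
    return key.replace("/", "")


def split_lines(lines, keys):
    """Group each line under every key it contains, mimicking SplitMapper.map."""
    outputs = defaultdict(list)
    for line in lines:
        for key in keys:
            if key in line:
                # The mapper writes (key, line) to the output named after the key.
                outputs[named_output(key)].append((key, line))
    return dict(outputs)


if __name__ == "__main__":
    # Hypothetical access-log lines, used only to exercise the sketch.
    sample = [
        '10.0.0.1 - - [26/Jan/2015:10:00:00 +0800] "GET / HTTP/1.1" 200',
        '10.0.0.2 - - [27/Jan/2015:11:00:00 +0800] "GET /a HTTP/1.1" 404',
        '10.0.0.3 - - [28/Jan/2015:12:00:00 +0800] "GET /b HTTP/1.1" 200',
    ]
    grouped = split_lines(sample, ["26/Jan/2015", "27/Jan/2015"])
    for name, records in sorted(grouped.items()):
        print(name, len(records))
```

<p>Lines that match no key are dropped by this sketch, whereas the Scala mapper above only ever writes matching lines, both to the default output and to the named one.</p>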
Awesome Scala
https://segmentfault.com/a/1190000002508809
2015-01-23T22:58:31+08:00
2015-01-23T22:58:31+08:00
timger
https://segmentfault.com/u/timger
3
<h2>Awesome Scala</h2>
<p>A community driven list of useful Scala libraries, frameworks and software. This is not a catalog of all the libraries, just a starting point for your explorations. Inspired by <a rel="nofollow" href="https://github.com/vinta/awesome-python">awesome-python</a>. Other amazingly awesome lists can be found in the <a rel="nofollow" href="https://github.com/bayandin/awesome-awesomeness">awesome-awesomeness</a> list.</p>
<ul>
<li>
<a rel="nofollow">Awesome Scala</a><ul>
<li><a rel="nofollow">Database</a></li>
<li><a rel="nofollow">Web Frameworks</a></li>
<li><a rel="nofollow">i18n</a></li>
<li><a rel="nofollow">Authentication</a></li>
<li><a rel="nofollow">Testing</a></li>
<li><a rel="nofollow">JSON Manipulation</a></li>
<li><a rel="nofollow">Serialization</a></li>
<li><a rel="nofollow">Science and Data Analysis</a></li>
<li><a rel="nofollow">Big Data</a></li>
<li><a rel="nofollow">Functional Reactive Programming</a></li>
<li><a rel="nofollow">Modularization and Dependency Injection</a></li>
<li><a rel="nofollow">Distributed Systems</a></li>
<li><a rel="nofollow">Extensions</a></li>
<li><a rel="nofollow">Android</a></li>
<li><a rel="nofollow">HTTP</a></li>
<li><a rel="nofollow">Semantic Web</a></li>
<li><a rel="nofollow">Metrics and Monitoring</a></li>
<li><a rel="nofollow">Parsing</a></li>
<li><a rel="nofollow">Sbt plugins</a></li>
</ul>
</li>
<li><a rel="nofollow">Contributing</a></li>
</ul>
<h3>Database</h3>
<p><em>Database access libraries in Scala.</em></p>
<ul>
<li>
<a rel="nofollow" href="https://github.com/fwbrasil/activate">Activate</a> — Pluggable object persistence in Scala.</li>
<li>
<a rel="nofollow" href="https://github.com/sksamuel/elastic4s">Elastic4s</a> - A Scala DSL / reactive client for Elasticsearch.</li>
<li>
<a rel="nofollow" href="https://github.com/websudosuk/phantom">Phantom</a> — Async type safe Scala DSL for Apache Cassandra.</li>
<li>
<a rel="nofollow" href="https://github.com/mauricio/postgresql-async">PostgreSQL and MySQL async</a> — Async database drivers to talk to PostgreSQL and MySQL in Scala.</li>
<li>
<a rel="nofollow" href="http://reactivecouchbase.org/">ReactiveCouchbase</a> — Reactive Scala Driver for Couchbase. Also includes a Play plug-in. An official plug-in is also in development.</li>
<li>
<a rel="nofollow" href="https://github.com/ReactiveMongo/ReactiveMongo">ReactiveMongo</a> — Reactive Scala Driver for MongoDB.</li>
<li>
<a rel="nofollow" href="https://github.com/novus/salat/">Salat</a> — ORM for MongoDB. A related Play-plugin is also available.</li>
<li>
<a rel="nofollow" href="https://github.com/aselab/scala-activerecord">Scala ActiveRecord</a> - ORM library for Scala, inspired by ActiveRecord from Ruby on Rails.</li>
<li>
<a rel="nofollow" href="https://github.com/scalikejdbc/scalikejdbc">ScalikeJDBC</a> — A tidy SQL-based DB access library for Scala developers. </li>
<li>
<a rel="nofollow" href="https://github.com/slick/slick">Slick</a> — Modern database query and access library for Scala.</li>
<li>
<a rel="nofollow" href="https://github.com/sorm/sorm">Sorm</a> — A functional boilerplate-free Scala ORM.</li>
<li>
<a rel="nofollow" href="https://github.com/squeryl/squeryl">Squeryl</a> — A Scala DSL for talking with databases with minimum verbosity and maximum type safety.</li>
</ul>
<h3>Web Frameworks</h3>
<p><em>Scala frameworks for web development.</em></p>
<ul>
<li>
<a rel="nofollow" href="https://github.com/playframework/playframework">Play</a> — Makes it easy to build scalable, fast and real-time web applications with Java & Scala.</li>
<li>
<a rel="nofollow" href="https://github.com/lift/framework">Lift</a> — Secure and powerful full stack web framework (<a rel="nofollow" href="https://github.com/lauris/awesome-scala/pull/19">discussion</a>).</li>
<li>
<a rel="nofollow" href="https://github.com/skinny-framework/skinny-framework">Skinny Framework</a> — A full-stack web app framework upon Scalatra for rapid Development in Scala.</li>
<li>
<a rel="nofollow" href="https://github.com/scalatra/scalatra">Scalatra</a> — Tiny Scala high-performance, async web framework, inspired by Sinatra.</li>
<li>
<a rel="nofollow" href="https://github.com/spray/spray">Spray</a> - A suite of Scala libraries for building and consuming RESTful web services on top of Akka.</li>
<li>
<a rel="nofollow" href="https://github.com/twitter/finatra">Finatra</a> - A Sinatra-inspired web framework for Scala, running on top of Finagle.</li>
<li>
<a rel="nofollow" href="https://github.com/nafg/reactive">Reactive</a> — FRP and web abstractions, which can be plugged into any web framework (currently only has bindings for Lift).</li>
<li>
<a rel="nofollow" href="https://github.com/mesosphere/chaos">Chaos</a> — A lightweight framework for writing REST services in Scala.</li>
<li>
<a rel="nofollow" href="http://xitrum-framework.github.io/">Xitrum</a> — An async and clustered Scala web framework and HTTP(S) server fusion on top of Netty, Akka, and Hazelcast.</li>
<li>
<a rel="nofollow" href="https://github.com/unfiltered/unfiltered">Unfiltered</a> — A modular set of unopinionated primitives for servicing HTTP and WebSocket requests in Scala.</li>
<li>
<a rel="nofollow" href="http://tumblr.github.io/colossus/">Colossus</a> - A lightweight framework for building high-performance applications in Scala that require non-blocking network I/O.</li>
</ul>
<h3>i18n</h3>
<p><em>Scala libraries for i18n.</em></p>
<ul>
<li>
<a rel="nofollow" href="https://github.com/xitrum-framework/scaposer">Scaposer</a> – GNU Gettext .po file loader for Scala.</li>
<li>
<a rel="nofollow" href="https://github.com/xitrum-framework/scala-xgettext">scala-xgettext</a> – A compiler plugin that acts like GNU xgettext command to extract i18n strings in Scala source code files to Gettext .po file.</li>
</ul>
<h3>Authentication</h3>
<p><em>Libraries for implementing authentication schemes.</em></p>
<ul>
<li>
<a rel="nofollow" href="https://github.com/nulab/scala-oauth2-provider">scala-oauth2-provider</a> — OAuth 2.0 server-side implementation written in Scala.</li>
<li>
<a rel="nofollow" href="https://github.com/jaliss/securesocial">SecureSocial</a> — A module that provides OAuth, OAuth2 and OpenID authentication for Play Framework applications.</li>
<li>
<a rel="nofollow" href="https://github.com/t2v/play2-auth">play2-auth</a> — Play2.x Authentication and Authorization module.</li>
<li>
<a rel="nofollow" href="https://github.com/leleuj/play-pac4j">play-pac4j</a> — Profile & Authentication Client in Scala for CAS, OAuth, OpenID, SAML & HTTP protocols and Play 2.x framework.</li>
<li>
<a rel="nofollow" href="https://github.com/mohiva/play-silhouette">play-silhouette</a> — Authentication library for Play Framework applications that supports several authentication methods, including OAuth1, OAuth2, OpenID, Credentials or custom authentication schemes.</li>
</ul>
<h3>Testing</h3>
<p><em>Libraries for code testing.</em></p>
<ul>
<li>
<a rel="nofollow" href="https://github.com/rickynils/scalacheck">ScalaCheck</a> — Property-based testing for Scala.</li>
<li>
<a rel="nofollow" href="https://github.com/scalatest/scalatest">ScalaTest</a> — A testing tool for Scala and Java developers.</li>
<li>
<a rel="nofollow" href="https://scalameter.github.io/">ScalaMeter</a> - Performance & memory footprint measuring, regression testing.</li>
<li>
<a rel="nofollow" href="https://github.com/etorreborre/specs2">Specs2</a> — Software Specifications for Scala.</li>
<li>
<a rel="nofollow" href="https://github.com/lihaoyi/utest">µTest</a> — A tiny, portable testing library for Scala.</li>
<li>
<a rel="nofollow" href="https://github.com/xitrum-framework/scalive">Scalive</a> — Connect a Scala REPL to running JVM processes without any prior setup; this library is used for inspecting systems in production mode.</li>
<li>
<a rel="nofollow" href="https://github.com/scalastyle/scalastyle">Scalastyle</a> – Scala style checker.</li>
<li>
<a rel="nofollow" href="http://gatling-tool.org/">Gatling</a> – Async Scala-Akka-Netty based Stress Tool.</li>
<li>
<a rel="nofollow" href="http://scalamock.org">ScalaMock</a> – Scala native mocking framework </li>
</ul>
<h3>JSON Manipulation</h3>
<p><em>Libraries for working with JSON.</em></p>
<ul>
<li>
<a rel="nofollow" href="https://github.com/json4s/json4s">json4s</a> - A project that aims to provide a single AST to be used by other Scala JSON libraries.</li>
<li>
<a rel="nofollow" href="https://github.com/spray/spray-json">spray-json</a> — Lightweight, clean and efficient JSON implementation in Scala.</li>
<li>
<a rel="nofollow" href="http://argonaut.io/">argonaut</a> — Purely Functional JSON in Scala.</li>
<li>
<a rel="nofollow" href="https://github.com/FasterXML/jackson-module-scala">jackson-module-scala</a> — Add-on module for Jackson to support Scala-specific datatypes.</li>
<li>
<a rel="nofollow" href="https://github.com/playframework/playframework/tree/master/framework/src/play-json">play-json</a> — Flexible and powerful JSON manipulation, validation and serialization, with no reflection at runtime.</li>
</ul>
<h3>Serialization</h3>
<p><em>Libraries for serializing and deserializing data for storage or transport.</em></p>
<ul>
<li>
<a rel="nofollow" href="https://github.com/scala/pickling">Pickling</a> — Fast, customizable, boilerplate-free pickling support.</li>
<li>
<a rel="nofollow" href="https://github.com/scodec/scodec">scodec</a> — A combinator library for working with binary data.</li>
<li>
<a rel="nofollow" href="http://twitter.github.io/scrooge/">Scrooge</a> — An Apache Thrift code generator for Scala.</li>
<li>
<a rel="nofollow" href="https://github.com/jto/validation">validation</a> — Advanced validation & serialization for JSON, HTML form data, etc, with no reflection at runtime.</li>
<li>
<a rel="nofollow" href="https://github.com/twitter/chill">Chill</a> — Extensions for the Kryo serialization library to ease configuration in systems like Hadoop and Storm.</li>
</ul>
<h3>Science and Data Analysis</h3>
<p><em>Libraries for scientific computing, data analysis and numerical processing.</em></p>
<ul>
<li>
<a rel="nofollow" href="https://github.com/twitter/algebird">Algebird</a> — Abstract Algebra for Scala.</li>
<li>
<a rel="nofollow" href="https://github.com/scalanlp/breeze">Breeze</a> — Breeze is a numerical processing library for Scala.</li>
<li>
<a rel="nofollow" href="https://github.com/scalanlp/chalk">Chalk</a> — Chalk is a natural language processing library. </li>
<li>
<a rel="nofollow" href="https://github.com/factorie/factorie">FACTORIE</a> — A toolkit for deployable probabilistic modeling, implemented as a software library in Scala.</li>
<li>
<a rel="nofollow" href="https://github.com/p2t2/figaro">Figaro</a> - Figaro is a probabilistic programming language that supports development of very rich probabilistic models.</li>
<li>
<a rel="nofollow" href="https://github.com/romainreuillon/mgo">MGO</a> — Modular multi-objective evolutionary algorithm optimization library enforcing immutability.</li>
<li>
<a rel="nofollow" href="https://spark.apache.org/mllib/">MLLib</a> — Machine Learning framework for Spark</li>
<li>
<a rel="nofollow" href="https://github.com/ISCPIF/openmole">OpenMOLE</a> — OpenMOLE (Open MOdeL Experiment) is a workflow engine designed to leverage the computing power of distributed execution environments for naturally parallel processes.</li>
<li>
<a rel="nofollow" href="https://github.com/saddle/saddle">Saddle</a> — A minimalist port of Pandas to Scala</li>
<li>
<a rel="nofollow" href="https://github.com/non/spire">Spire</a> — Powerful new number types and numeric abstractions for Scala.</li>
<li>
<a rel="nofollow" href="https://github.com/garyKeorkunian/squants">Squants</a> — The Scala API for Quantities, Units of Measure and Dimensional Analysis.</li>
</ul>
<h3>Big Data</h3>
<ul>
<li>
<a rel="nofollow" href="http://spark.apache.org/">Spark</a> — Lightning fast cluster computing — up to 100x faster than Hadoop for iterative algorithms (memory caching) and up to 10x faster than Hadoop for single-pass MapReduce jobs. Compatible with YARN-enabled Hadoop clusters, can run on Mesos and in stand-alone mode as well.</li>
<li>
<a rel="nofollow" href="https://github.com/twitter/scalding">Scalding</a> — A Scala binding for the Cascading abstraction of Hadoop MapReduce.</li>
<li>
<a rel="nofollow" href="https://github.com/twitter/summingbird">Summingbird</a> — An implementation of the “lambda architecture” as a software abstraction — a single API for Hadoop and Storm.</li>
<li>
<a rel="nofollow" href="https://github.com/adamretter/shadoop">Shadoop</a> - A Scala DSL for Hadoop MapReduce.</li>
<li>
<a rel="nofollow" href="http://crunch.apache.org/scrunch.html">Scrunch</a> — A Scala wrapper for <a rel="nofollow" href="http://crunch.apache.org/index.html">Apache Crunch</a> which provides a framework for writing, testing, and running MapReduce pipelines.</li>
<li>
<a rel="nofollow" href="https://github.com/romainreuillon/gridscale">GridScale</a> — A Scala API for computing clusters and grids.</li>
<li>
<a rel="nofollow" href="https://github.com/klout/scoozie">scoozie</a> — Scala DSL on top of Oozie XML.</li>
</ul>
<h3>Functional Reactive Programming</h3>
<p><em>Event streams, signals, observables, etc.</em></p>
<ul>
<li>
<a rel="nofollow" href="https://github.com/lihaoyi/scala.rx">Scala.Rx</a> — An experimental library for Functional Reactive Programming in Scala (reactive variables). Scala.js compatible.</li>
<li>
<a rel="nofollow" href="https://github.com/dylemma/scala.frp">scala.frp</a> — Functional Reactive Programming for Scala (event streams).</li>
<li>
<a rel="nofollow" href="https://github.com/Netflix/RxJava/tree/master/language-adaptors/rxjava-scala">RxJava-Scala</a> — Scala Adaptor for RxJava.</li>
<li>
<a rel="nofollow" href="https://github.com/storm-enroute/reactive-collections">Reactive Collections</a> – A library that incorporates event streams and signals with specialized collections called reactive containers, and expresses concurrency using isolates and channels.</li>
<li>
<a rel="nofollow" href="http://vertx.io/">Vertx.io</a> – A polyglot reactive application platform for the JVM which aims to be an alternative to node.js. Its concurrency model resembles actors. It supports <a rel="nofollow" href="http://vertx.io/core_manual_scala.html">Scala</a>, Clojure, Java, Javascript, Ruby, Groovy and Python.</li>
</ul>
<h3>Modularization and Dependency Injection</h3>
<p><em>Modularization of applications, dependency injection, etc.</em></p>
<ul>
<li>
<a rel="nofollow" href="https://github.com/helgoboss/domino">Domino</a> — Write elegant OSGi bundle activators in Scala.</li>
<li>
<a rel="nofollow" href="https://github.com/xitrum-framework/sclasner">Sclasner</a> - Scala classpath scanner.</li>
<li>
<a rel="nofollow" href="https://github.com/scaldi/scaldi">Scaldi</a> — Lightweight Scala Dependency Injection Library.</li>
<li>
<a rel="nofollow" href="https://github.com/adamw/macwire">MacWire</a> — Scala Macro to generate wiring code for class instantiation. DI container replacement.</li>
<li>
<a rel="nofollow" href="https://github.com/dickwall/subcut">SubCut</a> — Scala Uniquely Bound Classes Under Traits.</li>
</ul>
<h3>Distributed Systems</h3>
<p><em>Libraries and frameworks for writing distributed applications.</em></p>
<ul>
<li>
<a rel="nofollow" href="http://akka.io/">Akka</a> — A toolkit and runtime for building highly concurrent, distributed, and fault tolerant event-driven applications.</li>
<li>
<a rel="nofollow" href="https://twitter.github.io/finagle/">Finagle</a> — An extensible, protocol-agnostic RPC system designed for high performance and concurrency.</li>
<li>
<a rel="nofollow" href="https://github.com/xitrum-framework/glokka">Glokka</a> - Library to register and lookup actors by names in an Akka cluster.</li>
</ul>
<h3>Extensions</h3>
<p><em>Scala extensions.</em></p>
<ul>
<li>
<a rel="nofollow" href="https://github.com/scalaz/scalaz">Scalaz</a> — An extension to the core Scala library for functional programming.</li>
<li>
<a rel="nofollow" href="https://github.com/milessabin/shapeless">Shapeless</a> — A type class and dependent type based generic programming library for Scala.</li>
<li>
<a rel="nofollow" href="https://github.com/twitter/util">Twitter Util</a> — General-purpose Scala libraries, including a future implementation and other concurrency tools.</li>
<li>
<a rel="nofollow" href="https://github.com/scala/async">Scala Async</a> — An asynchronous programming facility for Scala.</li>
<li>
<a rel="nofollow" href="https://github.com/resolvable/resolvable">Resolvable</a> — A library to optimize fetching immutable data structures from several endpoints in several formats.</li>
<li>
<a rel="nofollow" href="http://scala-blitz.github.io/">Scala Blitz</a> – A library to speed up Scala collection operations by removing runtime overheads during compilation, and a custom data-parallel operation runtime.</li>
<li>
<a rel="nofollow" href="http://log4s.org">Log4s</a> - Fast, Scala-friendly logging bindings on top of <a rel="nofollow" href="http://slf4j.org/">SLF4J</a>. Uses macros for extreme performance.</li>
<li>
<a rel="nofollow" href="https://github.com/maxcellent/lamma">Lamma</a> – A Scala date library for date and schedule generation.</li>
<li>
<a rel="nofollow" href="http://www.scala-graph.org/">Scala Graph</a> – A Scala library with basic graph functionality that seamlessly fits into the Scala standard collections library</li>
<li>
<a rel="nofollow" href="https://github.com/twitter/cassovary">Cassovary</a> – A Scala library that is designed from the ground up for space efficiency, handling graphs with billions of nodes and edges.</li>
</ul>
<h3>Android</h3>
<p><em>Scala libraries and wrappers for Android development.</em></p>
<ul>
<li>
<a rel="nofollow" href="https://github.com/pocorall/scaloid">Scaloid</a> — Less painful Android development with Scala.</li>
<li>
<a rel="nofollow" href="https://github.com/macroid/macroid">Macroid</a> — A modular functional UI language for Android.</li>
<li>
<a rel="nofollow" href="https://github.com/pfn/android-sdk-plugin">Android SDK Plugin for SBT</a> — An sbt plugin that adds tasks for developing Android applications.</li>
</ul>
<h3>HTTP</h3>
<p><em>Scala libraries and wrappers for HTTP clients.</em></p>
<ul>
<li>
<a rel="nofollow" href="https://github.com/dispatch/reboot">Dispatch</a> — Library for asynchronous HTTP interaction. It provides a Scala vocabulary for Java’s <a rel="nofollow" href="https://github.com/AsyncHttpClient/async-http-client">async-http-client</a>.</li>
<li>
<a rel="nofollow" href="https://github.com/ngocdaothanh/netcaty">Netcaty</a> - Simple net test client/server for Netty and Scala lovers.</li>
<li>
<a rel="nofollow" href="https://github.com/eed3si9n/scalaxb">Scalaxb</a> — An XML data-binding tool for Scala that supports W3C XML Schema (xsd) and Web Services Description Language (wsdl) as the input file.</li>
<li>
<a rel="nofollow" href="http://spray.io/">Spray</a> - Actor-based library for HTTP interaction.</li>
<li>
<a rel="nofollow" href="https://github.com/softprops/tubesocks">Tubesocks</a> — Library supporting bi-directional communication with websocket servers.</li>
<li>
<a rel="nofollow" href="https://github.com/scalaj/scalaj-http">scalaj-http</a> - Simple Scala wrapper for HttpURLConnection (including OAuth support).</li>
<li>
<a rel="nofollow" href="https://github.com/finagle/finch">Finch.io</a> - Purely functional REST API atop <a rel="nofollow" href="https://github.com/twitter/finagle">Finagle</a>.</li>
<li>
<a rel="nofollow" href="https://github.com/stackmob/newman">Newman</a> — A REST DSL that tries to take the best from Dispatch, Finagle and Apache HttpClient. See <a rel="nofollow" href="https://www.paypal-engineering.com/2014/02/13/hello-newman-a-rest-client-for-scala/">here</a> for rationale.</li>
</ul>
<h3>Semantic Web</h3>
<p><em>Scala libraries for interactions with the Web of Data, and other RDF tools</em></p>
<ul>
<li>
<a rel="nofollow" href="https://github.com/w3c/banana-rdf">Banana-RDF</a> – Scala-friendly abstractions for RDF and Linked Data technologies. Supports Jena, Sesame and native Scala.</li>
</ul>
<h3>Metrics and Monitoring</h3>
<p><em>Scala libraries for gathering metrics and monitoring applications.</em></p>
<ul>
<li>
<a rel="nofollow" href="http://kamon.io">Kamon</a> - Gathering metrics from applications built with Akka, Spray and Play! with support for user metrics as well.</li>
</ul>
<h3>Parsing</h3>
<p><em>Scala libraries for creating parsers.</em></p>
<ul>
<li>
<a rel="nofollow" href="https://github.com/scala/scala-parser-combinators">Scala Parser Combinators</a> – Scala Standard Parser Combinator Library.</li>
<li>
<a rel="nofollow" href="https://github.com/sirthias/parboiled2">Parboiled2</a> – A Fast Parser Generator for Scala 2.10.3+.</li>
</ul>
<h3>Sbt plugins</h3>
<p><em>Sbt plugins to make your life easier.</em></p>
<ul>
<li>
<a rel="nofollow" href="https://github.com/spray/sbt-revolver">Sbt-Revolver</a> – Fork & Stop processes from sbt.</li>
<li>
<a rel="nofollow" href="https://github.com/typesafehub/sbteclipse">Sbt-Eclipse</a> – Create Eclipse project definitions from sbt builds.</li>
<li>
<a rel="nofollow" href="https://github.com/sbt/sbt-native-packager">Sbt-Native-Packager</a> - Bundle up Scala software for native packaging systems such as deb, rpm, homebrew and msi.</li>
<li>
<a rel="nofollow" href="https://github.com/jrudolph/sbt-dependency-graph">Sbt-Dependency-Graph</a> – Create a dependency graph for your project.</li>
<li>
<a rel="nofollow" href="https://github.com/sbt/sbt-onejar">Sbt-One-Jar</a> – Packages your project using One-JAR™.</li>
<li>
<a rel="nofollow" href="https://github.com/sbt/sbt-start-script">Sbt-Start-Script</a> – Create a "start" script to run the program.</li>
<li>
<a rel="nofollow" href="https://github.com/MasseGuillaume/ScalaKata">ScalaKata</a> – Scala playground & Documentation tool.</li>
<li>
<a rel="nofollow" href="https://github.com/typelevel/wartremover">WartRemover</a> – Flexible Scala code linting tool.</li>
<li>
<a rel="nofollow" href="https://github.com/earldouglas/xsbt-web-plugin">xsbt-web-plugin</a> – Build enterprise J2EE Web applications in Scala.</li>
</ul>
<h2>Contributing</h2>
<p>Your contributions are always welcome! Please submit a pull request or create an issue to add a new framework, library or software to the list. Do not submit a project that hasn’t been updated in the past 6 months or is not awesome.</p>
Scala project generation tool
https://segmentfault.com/a/1190000002506657
2015-01-23T01:28:42+08:00
2015-01-23T01:28:42+08:00
timger
https://segmentfault.com/u/timger
0
<p><a rel="nofollow" href="https://github.com/softprops/np">https://github.com/softprops/np</a></p>
Generating a Python project template for packaging
https://segmentfault.com/a/1190000002494878
2015-01-18T17:40:55+08:00
2015-01-18T17:40:55+08:00
timger
https://segmentfault.com/u/timger
0
<h4>Install the tool</h4>
<pre><code>pip install cookiecutter
</code></pre>
<h4>Get the template</h4>
<pre><code>cookiecutter https://github.com/audreyr/cookiecutter-pypackage.git
</code></pre>
<h4>Generate the project</h4>
<pre><code>timger-mac:scala_sbt_tool timger$ cookiecutter https://github.com/audreyr/cookiecutter-pypackage.git
Cloning into 'cookiecutter-pypackage'...
remote: Counting objects: 455, done.
remote: Total 455 (delta 0), reused 0 (delta 0)
Receiving objects: 100% (455/455), 68.63 KiB | 9.00 KiB/s, done.
Resolving deltas: 100% (238/238), done.
Checking connectivity... done.
full_name (default is "Audrey Roy")? timger
email (default is "audreyr@gmail.com")? admin@timger.info
github_username (default is "audreyr")? yishenggudou
project_name (default is "Python Boilerplate")? scala_sbt_tool
repo_name (default is "boilerplate")? scala_sbt_tool
project_short_description (default is "Python Boilerplate contains all the boilerplate you need to create a Python package.")? a tool for sbt usage
release_date (default is "2014-01-11")? 2015-01-18
year (default is "2014")? 2015
version (default is "0.1.0")? 0.0.1
timger-mac:scala_sbt_tool timger$ ls
scala_sbt_tool
timger-mac:scala_sbt_tool timger$ cd scala_sbt_tool/
.editorconfig .travis.yml CONTRIBUTING.rst LICENSE Makefile docs/ scala_sbt_tool/ setup.py tox.ini
.gitignore AUTHORS.rst HISTORY.rst MANIFEST.in README.rst requirements.txt setup.cfg tests/
</code></pre>
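<p>The answers typed at the prompts above end up substituted into the template's file names and file contents through placeholders such as <code>{{cookiecutter.repo_name}}</code>. The sketch below shows only that substitution idea using the standard library; cookiecutter itself renders templates with Jinja2, so the regex and the <code>render</code> helper here are simplifying assumptions, not its implementation.</p>

```python
import re

# Matches placeholders of the form {{ cookiecutter.some_name }}.
PLACEHOLDER = re.compile(r"\{\{\s*cookiecutter\.(\w+)\s*\}\}")


def render(text, context):
    """Replace each cookiecutter placeholder with its value from context."""
    return PLACEHOLDER.sub(lambda match: str(context[match.group(1)]), text)


# Values taken from the prompts in the session above.
context = {
    "full_name": "timger",
    "repo_name": "scala_sbt_tool",
    "version": "0.0.1",
}

print(render("{{cookiecutter.repo_name}}/setup.py", context))  # scala_sbt_tool/setup.py
print(render("version='{{ cookiecutter.version }}'", context))  # version='0.0.1'
```

<p>A placeholder whose name is missing from <code>context</code> raises <code>KeyError</code> in this sketch, which makes a forgotten answer easy to spot.</p>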
Packaging Python code as a JAR
https://segmentfault.com/a/1190000002494730
2015-01-18T16:26:02+08:00
2015-01-18T16:26:02+08:00
timger
https://segmentfault.com/u/timger
0
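<p>Python is quick to write,<br>
but Java has the broader ecosystem.<br>
In big data, for example, Python is pleasant to use but cannot draw on the code of the wider Java ecosystem.</p>
<p><code>scala</code> is good too, but for some libraries you still have to write a lot yourself.<br>
That is simple enough, but looking things up in the documentation is tedious.</p>
<p>So here is the question.<br>
The simplest approach is to package the Python code directly as a JAR.</p>
<p>That raises another problem: packaging Python for Java is fairly troublesome, and the official documentation is hard to follow.</p>
<p>So here is an answer.<br>
I wrote a package, <a rel="nofollow" href="https://github.com/yishenggudou/jythontools">https://github.com/yishenggudou/jythontools</a>,<br>
that does exactly this.</p>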
<pre><code>timger-mac:test timger$ python ../jytool/jytoollib.py hellojython.py main
timger-mac:test timger$ java -jar output.jython.jar
*sys-package-mgr*: processing modified jar, '/Users/timger/GitHub/jythontools/jytool/test/output.jython.jar'
hello jython
timger-mac:test timger$
</code></pre>
<p>The full code is as follows:</p>
<pre><code>timger-mac:test timger$ java -jar output.jython.jar a a s s s
hello jython
['a', 'a', 's', 's', 's']
timger-mac:test timger$ cat hellojython.py
#!/usr/bin/env python
# -*- coding: utf-8 -*-
#
# Copyright 2011 timger
# +Author timger
# +Gtalk&Email yishenggudou@gmail.com
# +Msn yishenggudou@msn.cn
# +Weibo @timger http://t.sina.com/zhanghaibo
# +twitter @yishenggudou http://twitter.com/yishenggudou
# Licensed under the MIT License, Version 2.0 (the "License");
__author__ = 'timger'
import sys
def main():
    print "hello jython"
    print sys.argv
</code></pre>
Searching for Java libraries from the command line
https://segmentfault.com/a/1190000002494226
2015-01-18T00:20:57+08:00
2015-01-18T00:20:57+08:00
timger
https://segmentfault.com/u/timger
0
<p>Install the <code>mvns</code> package:<br><code>pip install mvns</code></p>
<h3>Installation output:</h3>
<pre><code>timger-mac:bin timger$ pip install mvns
Downloading/unpacking mvns
Downloading mvns-0.1.1.tar.gz
Running setup.py egg_info for package mvns
Downloading/unpacking requests==2.4.1 (from mvns)
Downloading requests-2.4.1.tar.gz (436kB): 436kB downloaded
Running setup.py egg_info for package requests
Installing collected packages: mvns, requests
Running setup.py install for mvns
changing mode of build/scripts-2.7/mvns from 644 to 755
changing mode of /usr/local/bin/mvns to 755
Found existing installation: requests 2.5.0
Uninstalling requests:
Successfully uninstalled requests
Running setup.py install for requests
Successfully installed mvns requests
Cleaning up...
</code></pre>
<h2>Usage</h2>
<pre><code>timger-mac:bin timger$
timger-mac:bin timger$ mv
mv mvn mvnDebug mvns mvnyjp
timger-mac:bin timger$ mvn
mvn mvnDebug mvns mvnyjp
timger-mac:bin timger$ mvns json
com.proofpoint.platform:json:1.10
io.airlift:json:0.99
org.json:json:20141113
org.glassfish.main.packager:json:4.1
com.vaadin.external.json:json:0.0.20080701
org.apache.airavata:json:0.11
org.glassfish:json:1.0.4
com.wadpam.openserver:json:29
net.stepniak.api:json:0.8.8
org.webjars:json:20121008
de.twentyeleven.bundled:json:20070829
org.gaixie.json:json:1.0.0
org.apache.isis.viewer:json:0.2.0-incubating
com.unboundid.components:json:1.0.0
it.tidalwave.bluebill:json:1.0.21
org.apache.geronimo.bundles:json:20090211_1
com.twitter:json:2.1.4
com.sun.woodstock.dependlibs:json:2.0
org.jvnet.jax-ws-commons:json:1.2
org.metawidget.modules.json:json-parent:4.0
</code></pre>
<h2>help</h2>
<pre><code>timger-mac:bin timger$ mvns -h
usage: mvns [-h] [-g groupId] [-a artifactId] [-v version] [-A] [-m max]
[query]
CLI tool to search library in maven central repository
positional arguments:
query search by query
optional arguments:
-h, --help show this help message and exit
-g groupId, --groupId groupId
specify groupId
-a artifactId, --artifactId artifactId
specify artifactId
-v version, --version version
specify version
-A, --allVersions show all version
-m max, --max max limit number of result. default is 20
</code></pre>
Python-style strip string handling in Scala
https://segmentfault.com/a/1190000002492343
2015-01-16T17:45:12+08:00
2015-01-16T17:45:12+08:00
timger
https://segmentfault.com/u/timger
1
<pre><code>scala> var a = "mysql  "
a: String = "mysql  "
scala> a.trim()
res14: String = mysql
scala> a.stripSuffix("l")
res15: String = "mysql  "
scala> a.stripSuffix("l ")
res16: String = "mysql  "
scala> a.stripSuffix("l  ")
res17: String = mysq
scala> a.stripPrefix("my")
res18: String = "sql  "
</code></pre>
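<p>For comparison, here is a minimal Python sketch of the same operations. It assumes a sample string with two trailing spaces and Python 3.9 or later (for <code>str.removesuffix</code> and <code>str.removeprefix</code>). Note that Python's <code>strip</code> family takes a set of characters, not an exact suffix, so it behaves differently from Scala's <code>stripSuffix</code>.</p>

```python
# Assumed sample value: "mysql" followed by two trailing spaces.
a = "mysql  "

trimmed = a.strip()              # like Scala's trim: drops surrounding whitespace
unchanged = a.removesuffix("l")  # the string ends with spaces, not "l": no change
exact = a.removesuffix("l  ")    # exact suffix match, like stripSuffix
prefix = a.removeprefix("my")    # like stripPrefix
char_set = a.rstrip("l ")        # strips any trailing 'l' or ' ' characters

print(repr(trimmed), repr(unchanged), repr(exact), repr(prefix), repr(char_set))
```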
cytoolz, a Python speed-up library
https://segmentfault.com/a/1190000002490620
2015-01-15T23:25:01+08:00
2015-01-15T23:25:01+08:00
timger
https://segmentfault.com/u/timger
0
<p><a rel="nofollow" href="https://github.com/pytoolz/cytoolz/">https://github.com/pytoolz/cytoolz/</a></p>
Handling command-line arguments in Scala
https://segmentfault.com/a/1190000002490298
2015-01-15T20:29:06+08:00
2015-01-15T20:29:06+08:00
timger
https://segmentfault.com/u/timger
0
<p><a rel="nofollow" href="https://github.com/scopt/scopt">https://github.com/scopt/scopt</a></p>
A Python-join-like method in Scala
https://segmentfault.com/a/1190000002490244
2015-01-15T19:52:49+08:00
2015-01-15T19:52:49+08:00
timger
https://segmentfault.com/u/timger
0
<pre><code>scala> var a=0 to 10
a: scala.collection.immutable.Range.Inclusive = Range(0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
scala> a.ma
map max maxBy
scala> a.mkString
def mkString(sep: String): String def mkString(start: String, sep: String, end: String): String def mkString: String
scala> a.mkString(",")
res0: String = 0,1,2,3,4,5,6,7,8,9,10
</code></pre>
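<p>The Python side of the comparison, as a small sketch: <code>str.join</code> needs the elements converted to strings first, whereas <code>mkString</code> converts them itself. The three-argument form of <code>mkString</code> has no direct counterpart, so the sketch concatenates the wrappers by hand.</p>

```python
numbers = range(0, 11)  # same values as Scala's `0 to 10`

# Equivalent of a.mkString(","): join requires strings, so convert each element.
joined = ",".join(str(n) for n in numbers)

# Equivalent of a.mkString("[", ", ", "]"): add the start and end by hand.
wrapped = "[" + ", ".join(map(str, numbers)) + "]"

print(joined)
print(wrapped)
```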
The sbt assembly plugin
https://segmentfault.com/a/1190000002484984
2015-01-13T23:49:37+08:00
2015-01-13T23:49:37+08:00
timger
https://segmentfault.com/u/timger
0
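<p>1. First, add this line to project/plugins.sbt:</p>
<p><code>addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.11.2")</code><br>
2. Run sbt on the project first to check that it loads; make sure Git is installed on the machine.<br>
3. Create an assembly.sbt file in the project root with the following content:</p>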
<pre><code>import AssemblyKeys._ // put this at the top of the file
assemblySettings
// your assembly settings here
</code></pre>
<p>After that, <code>sbt assembly</code> builds the package, producing <code>./target/scala_x.x.x/projectname-assembly-x.x.x.jar</code>.<br>
4. For more detailed assembly configuration, you can do the following.</p>
<p>Put this in assembly.sbt:</p>
<pre><code>import AssemblyKeys._
assemblySettings
jarName in assembly := "spark_sbt.jar"
test in assembly := {}
mainClass in assembly := Some( "Spark_Test")
assemblyOption in packageDependency ~= { _.copy(appendContentHash = true) }
mergeStrategy in assembly <<= (mergeStrategy in assembly) { (old) =>
{
case PathList(ps @ _*) if ps.last endsWith "axiom.xml" => MergeStrategy.filterDistinctLines
case PathList(ps @ _*) if ps.last endsWith "Log.class" => MergeStrategy.first
case PathList(ps @ _*) if ps.last endsWith "LogConfigurationException.class" => MergeStrategy.first
case PathList(ps @ _*) if ps.last endsWith "LogFactory.class" => MergeStrategy.first
case PathList(ps @ _*) if ps.last endsWith "SimpleLog$1.class" => MergeStrategy.first
case x => old(x)
}
}
</code></pre>
Getting started with sbt builds
https://segmentfault.com/a/1190000002484978
2015-01-13T23:43:13+08:00
2015-01-13T23:43:13+08:00
timger
https://segmentfault.com/u/timger
1
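<p>Unmanaged dependencies are jar files placed in the lib directory.<br>
Managed dependencies are configured in the build definition and downloaded automatically from repositories.<br>
Unmanaged dependencies</p>
<p>Most people use managed dependencies rather than unmanaged ones, but unmanaged dependencies can be much simpler when starting out.</p>
<p>Unmanaged dependencies work like this: put jar files in the lib folder and they are added to the project classpath. That is all there is to it.</p>
<p>You can also put test-dependency jars such as ScalaCheck, Specs2, and ScalaTest in lib.</p>
<p>Dependencies in lib go on all the classpaths (for compile, test, run, and console). To change the classpath for just one of those, adjust, for example, dependencyClasspath in Compile or dependencyClasspath in Runtime.</p>
<p>Nothing needs to be added to build.sbt to use unmanaged dependencies, though you can change the unmanagedBase key if you want a directory other than lib.</p>
<p>To use custom_lib instead of lib:</p>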
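<p>unmanagedBase := baseDirectory.value / "custom_lib"<br>
baseDirectory is the project's root directory, so here unmanagedBase is changed based on the value of baseDirectory, obtained through the special value method explained in "More about settings".</p>
<p>There is also an unmanagedJars task that lists the jars in the unmanagedBase directory. To use multiple directories or do something more complex, you may need to replace the whole unmanagedJars task with one that does something else, for example emptying the list for the Compile configuration regardless of the files in lib:</p>
<p>unmanagedJars in Compile := Seq.empty[sbt.Attributed[java.io.File]]<br>
Managed dependencies</p>
<p>sbt uses Apache Ivy to implement managed dependencies, so if you are familiar with Ivy or Maven you will not have much trouble.</p>
<p>The libraryDependencies key</p>
<p>Most of the time you can simply list your dependencies in the libraryDependencies setting. It is also possible to configure dependencies in a Maven POM file or an Ivy configuration file and have sbt use those external files. See the sbt documentation for details.</p>
<p>Declare a dependency like this, where groupID, artifactID, and revision are strings:</p>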
<p>libraryDependencies += groupID % artifactID % revision<br>
or like this, where configuration can be a string or a Configuration val:</p>
<p>libraryDependencies += groupID % artifactID % revision % configuration<br>
libraryDependencies is declared in Keys like this:</p>
<p>val libraryDependencies = settingKey[Seq[ModuleID]]("Declares managed dependencies.")<br>
The % methods create ModuleID objects from strings; those ModuleIDs are then added to libraryDependencies.</p>
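<p>Of course, sbt (via Ivy) has to know where to download the module. If your module is in one of the default repositories sbt comes with, this just works. For example, Apache Derby is in the standard Maven2 repository:</p>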
<p>libraryDependencies += "org.apache.derby" % "derby" % "10.4.1.3"<br>
If you put that in build.sbt and then run update, sbt downloads Derby to ~/.ivy2/cache/org.apache.derby/. (By the way, compile depends on update, so most of the time there is no need to run update manually.)</p>
<p>You can also use ++= to add a list of dependencies all at once:</p>
<p>libraryDependencies ++= Seq(<br>
groupID % artifactID % revision,<br>
groupID % otherID % otherRevision<br>
)<br>
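In rare cases you might also need := on libraryDependencies.</p>
<p>Getting the right Scala version with %%</p>
<p>If you use groupID %% artifactID % revision rather than groupID % artifactID % revision (the difference is the double %% after groupID), sbt adds your project's Scala version to the artifact name. This is just a shortcut. You could write this without %%:</p>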
<p>libraryDependencies += "org.scala-tools" % "scala-stm_2.11.1" % "0.3"<br>
Assuming the scalaVersion for your build is 2.11.1, the following is equivalent (note the double %% after "org.scala-tools"):</p>
<p>libraryDependencies += "org.scala-tools" %% "scala-stm" % "0.3"<br>
The idea is that many dependencies are compiled for multiple Scala versions, and you want the one that is binary compatible with your project.</p>
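<p>The name rewriting that %% performs can be sketched in a few lines of Python. This is only an illustration of the rule as this text describes it (the full scalaVersion is appended); the function names are hypothetical, and newer sbt releases append the Scala binary version (for example _2.11) rather than the full version.</p>

```python
from typing import Optional, Tuple


def cross_versioned(artifact_id: str, scala_version: str) -> str:
    """Append the Scala version to the artifact name, as the text describes for %%."""
    return f"{artifact_id}_{scala_version}"


def module_id(group_id: str, artifact_id: str, revision: str,
              scala_version: Optional[str] = None) -> Tuple[str, str, str]:
    """Hypothetical helper: pass scala_version to emulate %%, omit it to emulate %."""
    if scala_version is not None:
        artifact_id = cross_versioned(artifact_id, scala_version)
    return (group_id, artifact_id, revision)


# Both declarations from the text resolve to the same coordinates.
explicit = module_id("org.scala-tools", "scala-stm_2.11.1", "0.3")
shortcut = module_id("org.scala-tools", "scala-stm", "0.3", scala_version="2.11.1")
print(explicit == shortcut)
```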
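<p>The complication in practice is that a dependency often works with a slightly different Scala version, but %% is not smart about that. So if the dependency is available for 2.10.1 but you are using scalaVersion := "2.10.4", you cannot use %% even though the 2.10.1 build would most likely work. If %% stops working, check which Scala versions the dependency was really built for and hard-code the one you think will work (assuming there is one).</p>
<p>See "Cross building" for more information.</p>
<p>Ivy revisions</p>
<p>The revision in groupID % artifactID % revision does not have to be a single fixed version. Ivy can select the latest revision of a module according to constraints you specify. Instead of a fixed revision like "1.6.1", you specify "latest.integration", "2.9.+", or "[1.0,)". See the Ivy revisions documentation for details.</p>
<p>Resolvers</p>
<p>Not all packages live on the same server. sbt uses the standard Maven2 repository by default. If your dependency is not in one of the default repositories, you need to add a resolver to help Ivy find it.</p>
<p>To add an additional repository, use:</p>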
<p>resolvers += name at location<br>
with the special at between two strings.</p>
<p>For example:</p>
<p>resolvers += "Sonatype OSS Snapshots" at "<a rel="nofollow" href="https://oss.sonatype.org/content/repositories/snapshots">https://oss.sonatype.org/content/repositories/snapshots</a>"<br>
The resolvers key is defined in Keys like this:</p>
<p>val resolvers = settingKey[Seq[Resolver]](...)<br>
The at method creates a Resolver object from two strings.</p>
<p>sbt can search your local Maven repository if you add it as a repository:</p>
<p>resolvers += "Local Maven Repository" at "file://"+Path.userHome.absolutePath+"/.m2/repository"<br>
or, for convenience:</p>
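<p>resolvers += Resolver.mavenLocal<br>
See "Resolvers" for details on defining other types of repositories.</p>
<p>Overriding default resolvers</p>
<p>resolvers does not contain the default resolvers, only the additional ones added by your build definition.</p>
<p>sbt combines resolvers with some default repositories to form externalResolvers.</p>
<p>Therefore, to change or remove the default resolvers, you need to override externalResolvers instead of resolvers.</p>
<p>Per-configuration dependencies</p>
<p>Often a dependency is used by your test code (in src/test/scala, which is compiled by the Test configuration) but not by your main code.</p>
<p>If you want a dependency to show up in the classpath only for the Test configuration and not the Compile configuration, add % "test" like this:</p>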
<p>libraryDependencies += "org.apache.derby" % "derby" % "10.4.1.3" % "test"<br>
You may also use the type-safe version of the Test configuration:</p>
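<p>libraryDependencies += "org.apache.derby" % "derby" % "10.4.1.3" % Test<br>
Now, if you type show compile:dependencyClasspath at the sbt interactive prompt, you should not see the derby jar. But if you type show test:dependencyClasspath, you should see the derby jar in the list.</p>
<p>Typically, test-related dependencies such as ScalaCheck, Specs2, and ScalaTest are declared with % "test".</p>
<p>More details and tips on library dependencies are in the sbt library management documentation.</p>
<p>Common commands</p>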
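<p>actions – show the commands available for the current project<br>
update – download dependencies<br>
compile – compile the code<br>
test – run the tests<br>
package – create a publishable jar<br>
publish-local – install the built jar into the local Ivy cache<br>
publish – publish the jar to a remote repository (if one is configured)<br>
More commands</p>
<p>test-failed – run the specs that failed<br>
test-quick – run all specs that failed and/or were affected by updated dependencies<br>
clean-cache – clear all of sbt's caches; similar to sbt's clean command<br>
clean-lib – delete everything under lib_managed</p>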
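<p>SBT stands for Simple Build Tool. If you have used Maven, you can think of SBT as the Maven of the Scala world: each has strengths and weaknesses, but they do essentially the same job.<br>
Maven can also manage dependencies and build Scala projects, but some features of SBT are especially appealing, for example:</p>
<p>build files are defined with Scala as the DSL (one language rules them all);<br>
continuous compilation and testing through triggered execution;<br>
incremental compilation;^[SBT's incremental compilation support is good enough that it has been split out as Zinc, which Eclipse, Maven, Gradle, and others can use]<br>
mixed Java and Scala projects can be built together;<br>
parallel task execution;<br>
Maven or Ivy repositories can be reused for dependency management;<br>
and so on. These are what have made SBT so popular in the Scala world.</p>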
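<p>SBT's history falls into two periods: the SBT 0.7.x era and the era from SBT 0.10.x onward.</p>
<p>SBT 0.7.x is now rarely used; most companies and projects have moved to 0.10.x or later, and the latest version is 0.12. From 0.10.x on, build definitions use a new Settings system, very different from 0.7.x, where build files were defined in pure Scala code. I resisted the migration at first (defining builds in Scala, as 0.7.x did, gave a pleasing uniformity), but I upgraded and came to accept the newer versions. I gradually realized that although they look complicated at first, once you understand the philosophy behind their design and implementation, this design makes build files easier to define. The optional build-file style also still allows definitions in Scala code, so the idea of uniformity has not been abandoned.</p>
<p>That is a brief introduction to SBT. If you are eager to get going, let us start with installing and configuring SBT.</p>
<p>Installing and configuring SBT</p>
<p>SBT can be installed and configured in two ways: one that works on every platform, and one that is platform specific. Both are described in detail below.</p>
<p>Platform-independent installation</p>
<p>The platform-independent installation takes only two steps:</p>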
<p>Download the sbt boot launcher.<br>
This book uses the latest sbt 0.12, which can be downloaded from <a rel="nofollow" href="http://typesafe.artifactoryonline.com/typesafe/ivy-releases/org.scala-sbt/sbt-launch/0.12.0/sbt-launch.jar">http://typesafe.artifactoryonline.com/typesafe/ivy-releases/org.scala-sbt/sbt-launch/0.12.0/sbt-launch.jar</a>;<br>
Create the sbt launch script (the script is platform specific).<br>
On Linux/Unix, create a script named sbt, make it executable, and add it to PATH. The script contents look like java -Xms512M -Xmx1536M -Xss1M -XX:+CMSClassUnloadingEnabled -XX:MaxPermSize=384M -jar `dirname $0`/sbt-launch.jar "$@"; adjust the JVM options as appropriate.<br>
On Windows, create an sbt.bat script and likewise add it to PATH. The script contents look like set SCRIPT_DIR=%~dp0 \n java -Xmx512M -jar "%SCRIPT_DIR%sbt-launch.jar" %*<br>
These two steps complete the installation and configuration of sbt.</p>
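<p>Platform-specific installation</p>
<p>I use a Mac, where installing sbt only takes brew install sbt (I already have the homebrew package manager installed); MacPorts makes it just as easy with sudo port install sbt.</p>
<p>On Linux, a package manager such as yum or apt is usually available, and installing sbt is equally painless: just run yum install sbt or apt-get install sbt (usually after adding a repository that carries sbt to the package manager's sources).</p>
<p>Windows users can take the easy route too and install from the MSI file, available at <a rel="nofollow" href="http://scalasbt.artifactoryonline.com/scalasbt/sbt-native-packages/org/scala-sbt/sbt/0.12.0/sbt.msi">http://scalasbt.artifactoryonline.com/scalasbt/sbt-native-packages/org/scala-sbt/sbt/0.12.0/sbt.msi</a>.</p>
<p>These cover the platform-specific installation for the three mainstream operating systems; handle other special cases as you see fit ^_^</p>
<p>SBT basics</p>
<p>Now that SBT is installed and configured, let us try building a simple Scala project.</p>
<p>Hello, SBT</p>
<p>In SBT's eyes, the simplest Scala project can be as minimal as a single .scala file in the project directory, for example HelloWorld.scala:</p>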
<p>object HelloWorld{<br>
def main(args: Array[String]) {<br>
println("Hello, SBT")<br>
}<br>
}<br>
Assuming HelloWorld.scala is placed in a hello directory, try running the following in that directory:</p>
<p>$ sbt<br>
> run<br>
[info] Running HelloWorld<br>
Hello, SBT<br>
[success] Total time: 2 s, completed Sep 2, 2012 7:54:58 PM</p>
<p>Simple, isn't it? (Voice-over: this is more than simple, it is a toy! What use is it? How about something real?)</p>
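<p>All right, I admit that was child's play, so let us get to something more substantial.</p>
<p>NOTE: Simple as this example is, do not underestimate it; I wasted quite some time early on by overlooking exactly this detail. The project directory contains no build file at all, yet the sbt command still runs correctly. In fact, running sbt even in an empty directory successfully enters the sbt console. So, knowing this minimum requirement, when an sbt command fails because you accidentally ran it outside the project root, check not only the build definition but also whether you ran sbt in the project root you had in mind.<br>
SBT project layout in detail</p>
<p>Broadly speaking, the directory layout of an SBT project is much like Maven's; if you have worked with Maven, the following will be easy to follow.</p>
<p>A typical SBT project layout is shown in the figure below:</p>
<p>The src directory</p>
<p>Maven users will find the structure of src familiar. The role of each subdirectory is described briefly below.</p>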
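<p>src/main/java holds Java source files<br>
src/main/resources holds the corresponding resource files<br>
src/main/scala holds Scala source files<br>
src/test/java holds test code written in Java<br>
src/test/resources holds resources used during testing<br>
src/test/scala holds test code written in Scala<br>
build.sbt in detail</p>
<p>You can think of build.sbt as the pom.xml of a Maven project: it is the build definition file. SBT allows two forms of build definition. One lives in the project root, namely build.sbt, and is a simplified form; the other lives in the project directory, is written in pure Scala, and is more complex but also more complete and more expressive.</p>
<p>For now we cover only the build.sbt format; the Scala-based build definition comes later.</p>
<p>A simple build.sbt looks like this:</p>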
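<p>name := "hello" // project name</p>
<p>organization := "xxx.xxx.xxx" // organization name</p>
<p>version := "0.0.1-SNAPSHOT" // version number</p>
<p>scalaVersion := "2.9.2" // Scala version to use</p>
<p>// other build definitions<br>
name and version are required here, because their values become part of the jar file name when a jar is generated.</p>
<p>The contents of build.sbt are easy to understand: each line is essentially one key-value pair, and lines are separated by blank lines.</p>
<p>The real picture is more complicated and requires understanding SBT's Settings engine; the rule above is only meant to make build.sbt easier to read.</p>
<p>Besides the project information above, dependencies can also be added in build.sbt:</p>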
<p>// dependencies used when compiling or running the main sources<br>
libraryDependencies += "ch.qos.logback" % "logback-core" % "1.0.0"</p>
<p>libraryDependencies += "ch.qos.logback" % "logback-classic" % "1.0.0"</p>
<p>// or</p>
<p>libraryDependencies ++= Seq(<br>
"ch.qos.logback" % "logback-core" % "1.0.0",<br>
"ch.qos.logback" % "logback-classic" % "1.0.0",<br>
...<br>
)</p>
<p>// dependencies used when compiling or running the tests<br>
libraryDependencies ++= Seq("org.scalatest" %% "scalatest" % "1.8" % "test")<br>
You can even use Ivy's XML format directly:</p>
<p>ivyXML := [XML definition missing]<br>
Here we excluded some unnecessary dependencies and declared a customized dependency.</p>
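<p>build.sbt can define much more, such as adding plugins, declaring additional repositories, and setting compiler options; we will not go through them all here.</p>
<p>The project directory and its files</p>
<p>None of the files under project is strictly required; add them as needed.</p>
<p>build.properties declares which version of SBT should build the current project. The latest sbt boot launcher can build projects for every SBT version from 0.10.x on. For example, if I use sbt 0.12 but want to build the current project with sbt 0.11.3, I can specify sbt.version=0.11.3 in build.properties. By default, the project is built with the version matching the sbt boot launcher in use.</p>
<p>plugins.sbt declares which plugins the project uses to extend sbt. Features such as assembly or cleaning the local Ivy cache have sbt plugins available; to use them, just declare them in plugins.sbt rather than reinventing the wheel:</p>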
<p>resolvers += Resolver.url("git://github.com/jrudolph/sbt-dependency-graph.git")</p>
<p>resolvers += "sbt-idea-repo" at "<a rel="nofollow" href="http://mpeltonen.github.com/maven/">http://mpeltonen.github.com/maven/</a>"</p>
<p>addSbtPlugin("com.github.mpeltonen" % "sbt-idea" % "1.1.0")</p>
<p>addSbtPlugin("net.virtual-void" % "sbt-dependency-graph" % "0.6.0")<br>
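In my projects, sbt-idea generates the IDEA meta directories and files so that the code can be edited in IDEA, and sbt-dependency-graph reveals the relationships among the project's dependencies.</p>
<p>For these plugins to load, we add their locations to resolvers. Resolvers are covered in detail later. One interesting point here is that sbt can use a GitHub project directly as a dependency or plugin dependency, without it first being published to a Maven or Ivy repository.</p>
<p>Miscellaneous</p>
<p>The directories and files above are usually created by us when setting up a project. SBT also generates some directories and files automatically while compiling or running; for example, it creates target directories in the project root and under project, holding compilation output and cached information. We generally do not want these in version control, so they are usually excluded.</p>
<p>For example, with git, adding a line "target/" to .gitignore excludes the target directories in the project root and under project along with their contents.</p>
<p>TIPS</p>
<p>In the sbt 0.7.x days, you only had to create the project directory and type sbt in it, and sbt generated the required directories and files for you. From sbt 0.10 on, that perk is gone, so at first it may seem that a long, tedious series of commands is needed to create them:</p>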
<pre><code>$ touch build.sbt
$ mkdir src
$ mkdir src/main
$ mkdir src/main/java
$ mkdir src/main/resources
$ mkdir src/main/scala
$ mkdir src/test
$ mkdir src/test/java
$ mkdir src/test/resources
$ mkdir src/test/scala
$ mkdir project
$ ...
</code></pre>
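<p>The command list above can be collapsed into a short script. The following is a minimal Python sketch, not part of sbt itself; the function name <code>create_skeleton</code> and the <code>LAYOUT</code> list are hypothetical, and the directories are the ones listed above.</p>

```python
from pathlib import Path

# Directories from the command list above, relative to the project root.
LAYOUT = [
    "src/main/java",
    "src/main/resources",
    "src/main/scala",
    "src/test/java",
    "src/test/resources",
    "src/test/scala",
    "project",
]


def create_skeleton(root):
    """Create the standard sbt directory layout and an empty build.sbt under root."""
    root = Path(root)
    for relative in LAYOUT:
        # parents=True creates src/ and src/main/ as needed; exist_ok makes reruns safe.
        (root / relative).mkdir(parents=True, exist_ok=True)
    (root / "build.sbt").touch()
    return root

# Example call (creates ./hello if it does not exist):
# create_skeleton("hello")
```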
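<p>Using SBT</p>
<p>SBT can be used in two ways:</p>
<p>batch mode<br>
interactive mode<br>
In batch mode, several SBT commands are run one after another directly from the command line, for example:</p>
<p>$ sbt compile test package</p>
<p>In interactive mode, sbt is run with no commands after it, which drops us straight into the sbt console, where any valid sbt command can be entered and its output seen:</p>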
<p>$ sbt<br>
> compile<br>
[success] Total time: 1 s, completed Sep 3, 2012 9:34:58 PM<br>
> test<br>
[info] No tests to run for test:test<br>
[success] Total time: 0 s, completed Sep 3, 2012 9:35:04 PM<br>
> package<br>
[info] Packaging XXX_XXX_2.9.2-0.1-SNAPSHOT.jar ...<br>
[info] Done packaging.<br>
[success] Total time: 0 s, completed Sep 3, 2012 9:35:08 PM</p>
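<p>TIPS</p>
<p>In the interactive sbt console, type help for more usage information.<br>
In the example above we ran compile, test, and package in turn. These commands actually depend on one another: if the goal is just to package, running package alone is enough, because the commands it depends on, such as compile, run before it so that the dependencies are satisfied.</p>
<p>Besides compile, test, and package, more sbt commands are listed below for reference:</p>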
<p>compile<br>
test-compile<br>
run<br>
test<br>
package<br>
These commands can also be combined with SBT's triggered execution; all it takes is a ~ prefix before the command, for example:</p>
<p>$ sbt ~compile</p>
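<p>Strictly, ~ and the command should be separated by a space, but for ordinary commands the ~ prefix can be attached directly, as in ~compile.<br>
Dependency management in SBT</p>
<p>In SBT, library dependency management falls into two categories:</p>
<p>unmanaged dependencies<br>
managed dependencies<br>
Most of the time we use managed dependencies, but in special cases, such as setting up a project environment quickly, unmanaged dependencies are used directly.</p>
<p>Unmanaged dependencies in brief</p>
<p>Using unmanaged dependencies is simple: put the jars you want on the project's classpath into the lib directory.</p>
<p>If you dislike the default lib directory, the location can be changed through configuration, for example to 3rdlibs:</p>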
<p>unmanagedBase <<= baseDirectory { base => base / "3rdlibs" }<br>
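This configuration needs some explanation. The unmanagedBase key holds the path where unmanaged dependencies keep third-party jars, lib by default. To change its value we use the <<= operator, which computes a new value from baseDirectory and assigns it to unmanagedBase. Here baseDirectory is the current project directory, and <<= (actually a method on the key) computes a new value from the values of other known keys and assigns it to the given key.</p>
<p>For unmanaged dependencies, that is generally all you need to know.</p>
<p>Managed dependencies in detail</p>
<p>sbt's managed dependencies use Apache Ivy for dependency management and can download dependencies automatically from Maven or Ivy repositories.</p>
<p>Put simply, using managed dependencies in SBT means adding the required dependencies to the libraryDependencies key. The general form is:</p>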
<p>libraryDependencies += groupID % artifactID % revision<br>
For example:</p>
<p>libraryDependencies += "org.apache.derby" % "derby" % "10.4.1.3"<br>
This is a simplified common form; finer adjustments are possible, for example:</p>
<p>(1) libraryDependencies += "org.apache.derby" % "derby" % "10.4.1.3" % "test"<br>
(2) libraryDependencies += "org.apache.derby" % "derby" % "10.4.1.3" exclude("org", "artifact")<br>
(3) libraryDependencies += "org.apache.derby" %% "derby" % "10.4.1.3"<br>
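Form (1) limits the dependency's scope to testing. Form (2) excludes selected transitive dependencies. Form (3) appends the project's Scala version to the artifactId when looking up the dependency; for example, if the project uses Scala 2.9.2, declaration (3) is equivalent to "org.apache.derby" % "derby_2.9.2" % "10.4.1.3". This mainly simplifies the case where one library is published for several Scala versions.</p>
<p>When there are many dependencies, adding them line by line is one option, but several can also be added at once:</p>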
<p>libraryDependencies ++= Seq("org.apache.derby" %% "derby" % "10.4.1.3",<br>
"org.scala-tools" %% "scala-stm" % "0.3",<br>
...)<br>
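Resolvers in brief</p>
<p>With managed dependencies we say which libraries we depend on, but how does SBT know where to fetch them and their metadata?</p>
<p>By default, SBT fetches dependencies from the default Maven2 repository. If a dependency is not found there, the resolver mechanism lets us add more repositories for SBT to search and fetch from, for example:</p>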
<p>resolvers += "Sonatype OSS Snapshots" at "<a rel="nofollow" href="https://oss.sonatype.org/content/repositories/snapshots">https://oss.sonatype.org/content/repositories/snapshots</a>"<br>
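The string before at^[at is the name of a method on the target type that String is implicitly converted to] is an arbitrary identifying name for the added repository; the string after at is the repository's location.</p>
<p>Besides remotely accessible Maven repositories, a local Maven repository can also be added to the resolvers:</p>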
<p>resolvers += "Local Maven Repository" at "file://"+Path.userHome.absolutePath+"/.m2/repository"</p>
Linux + Hadoop permission management
https://segmentfault.com/a/1190000002426225
2014-12-16T09:13:14+08:00
2014-12-16T09:13:14+08:00
timger
https://segmentfault.com/u/timger
0
<h2>Permissions in the Hadoop stack</h2>
<pre><code>[root@com ~]# hadoop fs -mkdir /tmp/tmp_authority
[root@com ~]# hadoop fs -ls -h /tmp
Found 7 items
drwxrwxrwx - hdfs supergroup 0 2014-12-15 15:45 /tmp/.cloudera_health_monitoring_canary_files
drwxr-xr-x - yarn supergroup 0 2014-10-16 15:07 /tmp/hadoop-yarn
drwxrwxrwx - hive supergroup 0 2014-10-30 15:03 /tmp/hive-hive
drwxrwxrwx - nobody supergroup 0 2014-12-15 14:39 /tmp/hive-nobody
drwxr-xr-x - root supergroup 0 2014-12-15 15:36 /tmp/hive-root
drwxrwxrwt - mapred hadoop 0 2014-10-13 16:53 /tmp/logs
drwxr-xr-x - root supergroup 0 2014-12-15 15:45 /tmp/tmp_authority
[root@com ~]# hadoop fs -chgrp -R root /tmp/tmp_authority
[root@com ~]# hadoop fs -ls -h /tmp
Found 7 items
drwxrwxrwx - hdfs supergroup 0 2014-12-15 15:46 /tmp/.cloudera_health_monitoring_canary_files
drwxr-xr-x - yarn supergroup 0 2014-10-16 15:07 /tmp/hadoop-yarn
drwxrwxrwx - hive supergroup 0 2014-10-30 15:03 /tmp/hive-hive
drwxrwxrwx - nobody supergroup 0 2014-12-15 14:39 /tmp/hive-nobody
drwxr-xr-x - root supergroup 0 2014-12-15 15:36 /tmp/hive-root
drwxrwxrwt - mapred hadoop 0 2014-10-13 16:53 /tmp/logs
drwxr-xr-x - root root 0 2014-12-15 15:45 /tmp/tmp_authority
</code></pre>
<h3>Adding groups and users on Linux</h3>
<ol>
<li>Create a group</li>
</ol>
<pre><code>groupadd query
</code></pre>
<ol>
<li>Add a new user to a group</li>
</ol>
<pre><code>useradd -g query query
</code></pre>
<ol>
<li>Add an existing user to another group</li>
</ol>
<pre><code>usermod -a -G root query
[root@com ~]# groups query
query : query root
[root@com ~]# usermod -G query query
[root@com ~]# groups query
query : query
</code></pre>
<ol>
<li>Delete a group</li>
</ol>
Checking memory usage on a Mac
https://segmentfault.com/a/1190000002422791
2014-12-13T21:20:41+08:00
2014-12-13T21:20:41+08:00
timger
https://segmentfault.com/u/timger
0
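<p>A 2011 MacBook Air<br>
with 2 GB of memory<br>
often becomes sluggish,<br>
so I needed to find out which process uses the most memory.<br>
Open the system's built-in application:</p>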
<p><img src="/img/bVkkrq" alt="clipboard.png"></p>
<p>As you can see, it is mostly Chrome.</p>
<p><img src="/img/bVkkro" alt="clipboard.png"></p>
Oozie failure retries and alerts
https://segmentfault.com/a/1190000002419658
2014-12-11T16:34:57+08:00
2014-12-11T16:34:57+08:00
timger
https://segmentfault.com/u/timger
1
<h2>Configure retries</h2>
<pre><code><workflow-app xmlns="uri:oozie:workflow:0.3" name="wf-name">
<action name="a" retry-max="3" retry-interval="1">
</action>
</code></pre>
<h2>Add a failure alert</h2>
<pre><code><action name="sdk-sendmail-failed">
<email xmlns="uri:oozie:email-action:0.1">
<to>alert@timger.info</to>
<subject>[OOZIE FAILED] ${wf:id()}</subject>
<body>
Etl daily stat failed!
Stat DATE:${jobYear}-${jobMonth}-${jobDay},
Error message:[${wf:errorMessage(wf:lastErrorNode())}].
</body>
</email>
<ok to="end"/>
<error to="fail"/>
</action>
<kill name="fail">
<message>Sdk daily stat workflow failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
</kill>
</code></pre>
<p><a rel="nofollow" href="http://blog.caiyongfu.cn/?m=201311">Reference</a></p>
Aliases for common git commands
https://segmentfault.com/a/1190000002418234
2014-12-10T23:00:08+08:00
2014-12-10T23:00:08+08:00
timger
https://segmentfault.com/u/timger
0
<p>I created some aliases to make development easier.</p>
<p><img src="/img/bVkjfV" alt="clipboard.png"></p>
Installing Python PIL on a Mac
https://segmentfault.com/a/1190000002418196
2014-12-10T22:31:24+08:00
2014-12-10T22:31:24+08:00
timger
https://segmentfault.com/u/timger
0
<pre><code>brew install freetype
ln -s /usr/local/include/freetype2 /usr/local/include/freetype
pip install pil
</code></pre>
Adding timestamps automatically when creating a MySQL table
https://segmentfault.com/a/1190000002414083
2014-12-09T11:11:18+08:00
2014-12-09T11:11:18+08:00
timger
https://segmentfault.com/u/timger
0
<pre><code>CREATE TABLE t1
(
ts TIMESTAMP DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP
);
</code></pre>
Installing plugins for td-agent
https://segmentfault.com/a/1190000002408701
2014-12-05T17:37:48+08:00
2014-12-05T17:37:48+08:00
timger
https://segmentfault.com/u/timger
0
<pre><code>td-agent-gem install fluent-plugin-mysql
</code></pre>
Installing fluentd-ui on CentOS
https://segmentfault.com/a/1190000002405189
2014-12-04T09:11:35+08:00
2014-12-04T09:11:35+08:00
timger
https://segmentfault.com/u/timger
0
<h2>Install <code>ruby</code>
</h2>
<pre><code>yum install ruby
yum install gcc gcc-c++ make automake autoconf curl-devel openssl-devel zlib-devel httpd-devel apr-devel apr-util-devel sqlite-devel
yum install ruby-rdoc ruby-devel
yum install rubygems
</code></pre>
<h2>Install <code>fluentd-ui</code>
</h2>
<pre><code>gem install -V fluentd-ui
</code></pre>
<h2>Errors and fixes</h2>
<pre><code>error failed to build gem native extension. centos
</code></pre>
<p>Ruby <code>1.9</code> is required.</p>
<h2>Installing <code>ruby1.9</code> on CentOS 6.4
</h2>
Checking progress while MySQL alters a table
https://segmentfault.com/a/1190000002405077
2014-12-04T03:29:47+08:00
2014-12-04T03:29:47+08:00
timger
https://segmentfault.com/u/timger
0
<p>To check,<br>
go into the MySQL data directory of the database:</p>
<pre><code>[root@com cubes]# ls -alh
-rw-rw---- 1 mysql mysql 8.7K Dec 3 22:21 #sql-6cb7_1240a.frm
</code></pre>
<pre><code>SELECT count(*) FROM `#sql-6cb7_1240a`;
</code></pre>
<p>Result:</p>
<pre><code>10400000
</code></pre>
<p>Comparing this with the row count of the original table gives the progress.</p>
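<p>The comparison is a simple ratio. A minimal sketch in Python: the copied row count is the one shown above, while the total row count of the original table is a made-up figure for illustration, and the function name is hypothetical.</p>

```python
def alter_progress(copied_rows: int, total_rows: int) -> float:
    """Return the ALTER TABLE copy progress as a percentage, capped at 100."""
    if total_rows <= 0:
        raise ValueError("total_rows must be positive")
    return min(100.0 * copied_rows / total_rows, 100.0)


copied = 10_400_000  # count(*) of the temporary #sql-... table, from the output above
total = 52_000_000   # hypothetical count(*) of the original table

print(f"{alter_progress(copied, total):.1f}% copied")
```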
Handling special characters in strings inserted into MySQL from Python
https://segmentfault.com/a/1190000002402361
2014-12-02T21:05:41+08:00
2014-12-02T21:05:41+08:00
timger
https://segmentfault.com/u/timger
0
<pre><code>col_ = col.strip()\
.replace(',','')\
.replace("'",'')\
.replace('"','')\
.replace('(',' ')\
.replace(')',' ')\
.replace('%', ' ')\
.replace('<', ' ')\
.replace('>', ' ')
</code></pre>
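<p>Stripping characters changes the data. An alternative, sketched below under stated assumptions, is to pass the value as a bound parameter so the driver handles quoting. The sketch uses the standard-library <code>sqlite3</code> module as a stand-in so that it runs anywhere; the table and column names are hypothetical. MySQL drivers such as MySQLdb use <code>%s</code> as the placeholder instead of <code>?</code>.</p>

```python
import sqlite3


def insert_value(conn, value):
    """Insert value through a bound parameter; the driver handles the quoting."""
    conn.execute("INSERT INTO t (col) VALUES (?)", (value,))


conn = sqlite3.connect(":memory:")           # in-memory stand-in for the real database
conn.execute("CREATE TABLE t (col TEXT)")    # hypothetical table and column

# A value containing the characters the replace chain above removes.
raw = """it's 100% <"quoted">, (ok)"""
insert_value(conn, raw)

stored = conn.execute("SELECT col FROM t").fetchone()[0]
print(stored == raw)  # the value round-trips unchanged
```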
Hive: unlocking tables and partitions
https://segmentfault.com/a/1190000002391143
2014-11-26T20:57:56+08:00
2014-11-26T20:57:56+08:00
timger
https://segmentfault.com/u/timger
0
<pre><code>unlock table some_cube partition (year='2014',month='11',day='25',hour='00',b='pc');
</code></pre>
Installing pyhs2 for Python on CentOS
https://segmentfault.com/a/1190000001715944
2014-11-17T19:06:40+08:00
2014-11-17T19:06:40+08:00
timger
https://segmentfault.com/u/timger
0
<pre><code>yum install gcc-c++ cyrus-sasl-devel
pip2.7 install pyhs2
</code></pre>
<p>Used to access <code>hive</code> over Thrift.</p>
Installing Python 2.7 on CentOS 6.x
https://segmentfault.com/a/1190000000766991
2014-11-09T22:04:51+08:00
2014-11-09T22:04:51+08:00
timger
https://segmentfault.com/u/timger
0
<h3>python2.7</h3>
<pre><code>wget http://www.python.org/ftp/python/2.7.6/Python-2.7.6.tar.xz
yum install xz-libs
xz -d Python-2.7.6.tar.xz
ls
tar -xvf Python-2.7.6.tar
cd Python-2.7.6
yum install zlib-devel bzip2-devel openssl-devel ncurses-devel sqlite-devel readline
LDFLAGS="-Wl,-rpath /usr/local/lib"
vim /etc/ld.so.conf
./configure --prefix=/usr/local --enable-unicode=ucs4 --enable-shared LDFLAGS="-Wl,-rpath /usr/local/lib"
make clean
make && make altinstall
</code></pre>
<h2>pip</h2>
<pre><code>wget --no-check-certificate https://pypi.python.org/packages/source/s/setuptools/setuptools-1.4.2.tar.gz
tar -vxf setuptools-1.4.2.tar.gz
cd setuptools-1.4.2
python2.7 setup.py install
easy_install-2.7 pip
</code></pre>
<h2>mysqldb</h2>
<pre><code>yum -y install MySQL-python
yum -y install mysql-devel
easy_install-2.7 MySQL-python
</code></pre>
Using fabric to install Python packages in a cluster without internet access
https://segmentfault.com/a/1190000000759292
2014-11-05T15:30:17+08:00
2014-11-05T15:30:17+08:00
timger
https://segmentfault.com/u/timger
0
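<p>Use fabric,<br>
with one machine that has internet access acting as a proxy.</p>
<pre><code>from fabric.api import parallel, run, task


@task
@parallel(10)
def install_redis():
    # Route traffic through the proxy host, then install from the mirror
    cmd = """
export http_proxy="http://proxy_host_ip:3128"
export NO_PROXY="localhost,127.0.0.1"
easy_install -i http://pypi.douban.com/simple/ pip
pip install -i http://pypi.douban.com/simple/ redis
"""
    print cmd
    run(cmd)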
</code></pre>
A cross-origin access decorator for Flask
https://segmentfault.com/a/1190000000753690
2014-11-02T13:09:28+08:00
2014-11-02T13:09:28+08:00
timger
https://segmentfault.com/u/timger
6
<p>Web development has moved to a separated front end and back end,<br>
and the back end often only needs to serve API data.</p>
<p>A pure API usually has to deal with cross-origin access.<br>
Below is a simple cross-origin decorator implemented for Flask.</p>
<pre><code>from functools import wraps

from flask import Flask, make_response

app = Flask(__name__)


def allow_cross_domain(fun):
    """Add permissive CORS headers to the response of the wrapped view."""
    @wraps(fun)
    def wrapper_fun(*args, **kwargs):
        rst = make_response(fun(*args, **kwargs))
        rst.headers['Access-Control-Allow-Origin'] = '*'
        rst.headers['Access-Control-Allow-Methods'] = 'PUT,GET,POST,DELETE'
        allow_headers = "Referer,Accept,Origin,User-Agent"
        rst.headers['Access-Control-Allow-Headers'] = allow_headers
        return rst
    return wrapper_fun


@app.route('/hosts/')
@allow_cross_domain
def domains():
    pass
</code></pre>
Connecting to a MySQL database from Scala
https://segmentfault.com/a/1190000000752156
2014-11-01T00:26:34+08:00
2014-11-01T00:26:34+08:00
timger
https://segmentfault.com/u/timger
1
<p>Scala code</p>
<pre><code>import java.sql.{DriverManager, Connection, ResultSet}

object hello {
  val user = "root"
  val password = "password"
  val host = "yourip"
  val database = "yourdb"
  val conn_str = "jdbc:mysql://" + host + ":3306/" + database + "?user=" + user + "&password=" + password
  println(conn_str)

  def main(args: Array[String]): Unit = {
    //classOf[com.mysql.jdbc.Driver]
    Class.forName("com.mysql.jdbc.Driver").newInstance();
    val conn = DriverManager.getConnection(conn_str)
    println("hello")
    try {
      // Configure to be Read Only
      val statement = conn.createStatement(ResultSet.TYPE_FORWARD_ONLY, ResultSet.CONCUR_READ_ONLY)
      // Execute Query
      val rs = statement.executeQuery("SHOW TABLES")
      // Iterate Over ResultSet
      while (rs.next) {
        println(rs.getRow())
      }
    }
    catch {
      case _ : Exception => println("===>")
    }
    finally {
      conn.close
    }
  }
}
</code></pre>
<h2>sbt build definition</h2>
<pre><code>name := "test"

organization := "com.timger.info"

version := "0.0.1-SNAPSHOT"

exportJars := true

libraryDependencies += "mysql" % "mysql-connector-java" % "5.1.33"

resolvers += "Local Maven Repository" at "file://"+Path.userHome.absolutePath+"/.m2/repository"
</code></pre>
<h3>Run</h3>
<pre><code>sbt run
</code></pre>
Pitfalls when building Spark 1.1 on CDH5
https://segmentfault.com/a/1190000000752131
2014-10-31T23:51:28+08:00
2014-10-31T23:51:28+08:00
timger
https://segmentfault.com/u/timger
0
<ol>
<li>MVN</li>
</ol>
<pre><code>Maven's memory settings and the Java environment need to be configured
</code></pre>
<ol>
<li>Java version<br>
The build must use Java 1.6; otherwise there will be problems.</li>
<li>protobuf version<br>
Edit the <code>pom.xml</code> file</li>
</ol>
<pre><code><protobuf.version>2.4.1</protobuf.version>
to
<protobuf.version>2.5.0</protobuf.version>
</code></pre>
Using tty.js to run a terminal in the browser
https://segmentfault.com/a/1190000000722829
2014-10-14T22:12:07+08:00
2014-10-14T22:12:07+08:00
timger
https://segmentfault.com/u/timger
0
<ol>
<li>Download <code>node.js</code> from <code>http://nodejs.org/download/</code>
</li>
<li>Install <code>node</code>
</li>
</ol>
<pre><code> [root@li637-23 data]# wget http://nodejs.org/dist/v0.10.32/node-v0.10.32-linux-x64.tar.gz
--2014-10-14 13:30:21-- http://nodejs.org/dist/v0.10.32/node-v0.10.32-linux-x64.tar.gz
Resolving nodejs.org... 165.225.133.150
Connecting to nodejs.org|165.225.133.150|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 5643126 (5.4M) [application/octet-stream]
Saving to: “node-v0.10.32-linux-x64.tar.gz”
100%[===============================================================================================================================>] 5,643,126 129K/s in 64s
2014-10-14 13:31:26 (86.1 KB/s) - “node-v0.10.32-linux-x64.tar.gz” saved [5643126/5643126]
[root@li637-23 data]# tar -vxf node-v0.10.32-linux-x64.tar.gz
</code></pre>
<pre><code>vim ~/.bashrc
export PATH="${PATH}:/root/data/node-v0.10.32-linux-x64/bin"
</code></pre>
<p>Install gcc and tty.js</p>
<pre><code>yum install gcc-c++
[root@li637-23 ~]# npm install tty.js
</code></pre>
<p>Write <code>tty.js</code></p>
Installing multiple Python versions on Linux
https://segmentfault.com/a/1190000000722751
2014-10-14T21:07:52+08:00
2014-10-14T21:07:52+08:00
timger
https://segmentfault.com/u/timger
0
<p>Use <code>pythonbrew</code></p>
<pre><code>easy_install pythonbrew
</code></pre>
<pre><code>[root@li637-23 schirm]# pythonbrew_install
Well-done! Congratulations!
The pythonbrew is installed as:
/root/.pythonbrew
Please add the following line to the end of your ~/.bashrc
[[ -s "$HOME/.pythonbrew/etc/bashrc" ]] && source "$HOME/.pythonbrew/etc/bashrc"
After that, exit this shell, start a new one, and install some fresh
pythons:
pythonbrew install 2.7.2
pythonbrew install 3.2
For further instructions, run:
pythonbrew help
The default help messages will popup and tell you what to do!
Enjoy pythonbrew at /root/.pythonbrew!!
</code></pre>
<pre><code>[root@li637-23 schirm]# vim ~/.bashrc
[root@li637-23 schirm]# . ~/.bashrc
[root@li637-23 schirm]# pythonbrew install 2.7.2
Downloading Python-2.7.2.tgz as /root/.pythonbrew/dists/Python-2.7.2.tgz
######################################################################## 100.0%
Extracting Python-2.7.2.tgz into /root/.pythonbrew/build/Python-2.7.2
This could take a while. You can run the following command on another shell to track the status:
tail -f "/root/.pythonbrew/log/build.log"
Installing Python-2.7.2 into /root/.pythonbrew/pythons/Python-2.7.2
</code></pre>
Building an AngularJS project with yo, bower, and grunt
https://segmentfault.com/a/1190000000721174
2014-10-13T23:38:23+08:00
2014-10-13T23:38:23+08:00
timger
https://segmentfault.com/u/timger
0
<h2>Install</h2>
<pre><code>npm install -g yo
npm install -g bower
npm install -g grunt-cli
</code></pre>
<pre><code>mkdir godeyeweb
cd godeyeweb/
timger-mac:godeyeweb timger$ yo
[?] What would you like to do? (Use arrow keys)
❯ Run the Angular generator (0.8.0)
Run the Angular-cms generator (0.6.0-rc.1)
Run the Angular-phonegap-seed generator (0.6.0)
Run the Angularjs-library generator (1.0.2)
Run the Chrome-extension generator (0.2.5)
Run the Chromeapp generator (0.2.5)
Run the Ink generator (0.1.1)
(Move up and down to reveal more choices)
</code></pre>
<p>Select the first option.</p>
Configuring git to push to multiple remotes at once
https://segmentfault.com/a/1190000000720996
2014-10-13T20:42:19+08:00
2014-10-13T20:42:19+08:00
timger
https://segmentfault.com/u/timger
0
<h2>Configuring git to push to multiple remotes at once</h2>
<pre><code>git remote set-url --add --push origin https://timger:password@bitbucket.org/timger/exapmle.git
git push origin master
</code></pre>
Hadoop parameter summary
https://segmentfault.com/a/1190000000709725
2014-10-05T21:58:32+08:00
2014-10-05T21:58:32+08:00
timger
https://segmentfault.com/u/timger
1
<h2>Hadoop parameter summary</h2>
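<h3>Linux parameters</h3>
<p>The following parameters are worth tuning:</p>
<ol>
<li>File descriptors: <code>ulimit -n</code>
</li>
<li>Maximum user processes, nproc (required by HBase; see the HBase book)</li>
<li>Turn off the swap partition</li>
<li>Set a reasonable read-ahead buffer size</li>
<li>The Linux kernel I/O scheduler</li>
</ol>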
<h3>JVM parameters</h3>
<p>For JVM tuning options, see the <a rel="nofollow" href="http://developer.amd.com/wordpress/media/2012/10/Hadoop_Tuning_Guide-Version5.pdf">Hadoop Performance Tuning Guide</a></p>
<h3>Hadoop parameter reference</h3>
<pre><code>Applicable version: 4.3.0
</code></pre>
<p>Main configuration files:</p>
<ul>
<li><a rel="nofollow">core</a></li>
<li><a rel="nofollow">hdfs</a></li>
<li><a rel="nofollow">yarn</a></li>
<li><a rel="nofollow">mapred</a></li>
</ul>
<p>Importance is marked as follows:</p>
<ul>
<li><strong>Important</strong></li>
<li>Normal</li>
<li><em>Unimportant</em></li>
</ul>
<h4><a>core-default.xml</a></h4>
<ul>
<li>
<p>hadoop.common.configuration.version</p>
<blockquote>
<p>Version of the configuration file.</p>
</blockquote>
</li>
<li>
<p><strong>hadoop.tmp.dir=/tmp/hadoop-${user.name}</strong></p>
<blockquote>
<p>Hadoop's base temporary directory; other directories are derived from this path. It is a local directory.</p>
<blockquote>
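<p><strong>Only one value may be set. Point it at a location with enough free space rather than the default under /tmp.</strong><br>
Server-side parameter; changing it requires a restart.</p>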
</blockquote>
</blockquote>
</li>
<li>
<p><strong>hadoop.security.authorization=false</strong></p>
<blockquote>
<p>Whether to enable service-level authorization.</p>
<blockquote>
<p><strong>Recommended: leave disabled. Authorization is complex to operate and matters less on an internal company network.</strong></p>
</blockquote>
</blockquote>
</li>
<li>
<p><strong>io.file.buffer.size=4096</strong></p>
<blockquote>
<p>Buffer size used when reading and writing files. It should be a multiple of the memory page size.</p>
<blockquote>
<p><strong>Recommended: 1 MB.</strong></p>
</blockquote>
</blockquote>
</li>
<li>
<p><strong>io.compression.codecs=null</strong></p>
<blockquote>
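<p>Comma-separated list of compression/decompression codec classes. Codec classes on the classpath are additionally discovered with the Java ServiceLoader.</p>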
</blockquote>
</li>
<li>
<p><strong>fs.defaultFS=file:///</strong></p>
<blockquote>
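<p>Name of the default file system, given as a URI. The URI's scheme selects the file system implementation class through fs.SCHEME.impl; its authority part specifies the host, port, and so on. Defaults to the local file system.</p>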
<blockquote>
<p><strong>With HA, set the nameservice name here, e.g. hdfs://mycluster1</strong><br>
HDFS clients need this parameter to access HDFS.</p>
</blockquote>
</blockquote>
</li>
<li>
<p><strong>fs.trash.interval=0</strong></p>
<blockquote>
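<p>Trash retention time in minutes; data that has been in the trash longer than this is deleted. A value of 0 disables the trash feature. It can be configured on both the server and the client. If trash is disabled on the server side, the client-side setting is checked; if it is enabled on the server side, the client-side setting is ignored.</p>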
<blockquote>
<p><strong>Recommended: enable it, with 4320 (3 days).</strong><br>
In the trash, deleted files that share a name are numbered sequentially, e.g. a.txt, a.txt(1).</p>
</blockquote>
</blockquote>
</li>
<li>
<p><strong>fs.trash.checkpoint.interval=0</strong></p>
<blockquote>
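<p>Interval between trash checkpoints, in minutes. It should be less than or equal to fs.trash.interval. If 0, the value of fs.trash.interval is used. Each time the checkpointer runs it creates a new checkpoint.</p>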
<blockquote>
<p><strong>Recommended: 60 (1 hour).</strong></p>
</blockquote>
</blockquote>
</li>
<li>
<p><strong>dfs.ha.fencing.methods=null</strong></p>
<blockquote>
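<p>Fencing methods that guard HDFS HA against split-brain. They can be built-in methods (such as shell and sshfence) or user-defined ones. <em>sshfence(hadoop:9922) is recommended; the parentheses hold the user name and port. Note that this requires passwordless SSH login between the two NN machines.</em></p>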
</blockquote>
<blockquote>
<p>Fencing prevents split-brain by ensuring that only one NN is active. If both become active, the new one forcibly kills the old one.</p>
</blockquote>
</li>
<li>
<p><strong>dfs.ha.fencing.ssh.private-key-files=null</strong></p>
<blockquote>
<p>SSH private key files used by sshfence. <strong>Must be specified when sshfence is used.</strong></p>
</blockquote>
</li>
<li>
<p><strong>ha.zookeeper.quorum=null</strong></p>
<blockquote>
<p>For HA: a comma-separated list of ZooKeeper addresses, used by the ZKFailoverController for automatic failover.</p>
</blockquote>
</li>
<li>
<p><strong>ha.zookeeper.session-timeout.ms=5000</strong></p>
<blockquote>
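<p>ZooKeeper session timeout, used when the ZKFC connects to ZK. A small value detects server crashes faster but also triggers failover more often on transmission errors or network congestion. <strong>Recommended: 10 s to 30 s.</strong></p>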
</blockquote>
</li>
<li>
<p><strong>hadoop.http.staticuser.user=dr.who</strong></p>
<blockquote>
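<p>User name used when accessing data through the web UI. <em>The default is a user that does not really exist and has very few permissions, so it cannot read other users' data, which keeps the data safe. It can also be set to a more privileged user such as hdfs or hadoop, but then anyone who can reach the web UI can see other users' data. Weigh this before changing it; without a specific need, keep the default.</em></p>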
</blockquote>
</li>
<li>
<p><strong>fs.permissions.umask-mode=022</strong></p>
<blockquote>
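<p>The umask applied when creating files and directories, like the file permission mask on Linux-like systems. It can be given in octal or symbolically, for example "022" (octal, equivalent to the symbolic u=rwx,g=r-x,o=r-x) or "u=rwx,g=rwx,o=" (symbolic, equivalent to octal 007). <em>Note that an octal mask lists the bits to clear, the opposite of the resulting permission value, so the symbolic form is recommended because it reads more clearly.</em></p>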
</blockquote>
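<p>The relationship between an octal mask and the resulting permissions can be checked with a few lines of Python. This is only an illustrative sketch: the helper name <code>apply_umask</code> is made up here, and the requested modes (0o777 for directories, 0o666 for files) are assumptions that mirror the usual POSIX convention.</p>

```python
def apply_umask(requested_mode: int, umask: int) -> int:
    """Clear every permission bit that is set in the umask."""
    return requested_mode & ~umask


# Assumed request modes: 0o777 for directories, 0o666 for files.
dir_mode = apply_umask(0o777, 0o022)          # rwxr-xr-x
file_mode = apply_umask(0o666, 0o022)         # rw-r--r--
private_dir_mode = apply_umask(0o777, 0o007)  # rwxrwx---

print(oct(dir_mode), oct(file_mode), oct(private_dir_mode))
# prints: 0o755 0o644 0o770
```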
</li>
<li>
<p>io.native.lib.available=true</p>
<blockquote>
<p>Whether to use Hadoop's native libraries; enabled by default. Native libraries speed up basic operations such as I/O and compression.</p>
</blockquote>
</li>
<li>
<p>hadoop.http.filter.initializers=org.apache.hadoop.http.lib.StaticUserWebFilter</p>
<blockquote>
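<p>A comma-separated list of filter class names for Hadoop's HTTP service; each class must extend org.apache.hadoop.http.FilterInitializer. The initialized filters are applied to all user-facing JSP and servlet pages, and they are invoked in the order listed.</p>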
</blockquote>
</li>
<li>
<p>hadoop.security.authentication</p>
<blockquote>
<p>Authentication mode: simple or kerberos. simple means no authentication.</p>
</blockquote>
</li>
<li>
<p>hadoop.security.group.mapping=org.apache.hadoop.security.JniBasedUnixGroupsMappingWithFallback</p>
<blockquote>
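<p>Class that maps a user to groups; ACLs use it to look up the groups of a given user. The default implementation, org.apache.hadoop.security.JniBasedUnixGroupsMappingWithFallback, uses JNI when available to fetch the user's group list through the Hadoop API. If JNI is unavailable, it falls back to the shell-based implementation ShellBasedUnixGroupsMapping, which relies on a Linux/Unix shell environment.</p>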
</blockquote>
</li>
<li>
<p><em>hadoop.security.groups.cache.secs=300</em></p>
<blockquote>
<p>How long the user-to-group mapping cache stays valid, in seconds. After it expires, the mapping is fetched again and cached.</p>
</blockquote>
</li>
<li>
<p><em>hadoop.security.service.user.name.key=null</em></p>
<blockquote>
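<p>If the same RPC protocol is implemented by several servers, this setting selects the principal name a client uses to contact the server when making an RPC call. <em>Not recommended.</em></p>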
</blockquote>
</li>
<li>
<p><em>hadoop.security.uid.cache.secs=14400</em></p>
<blockquote>
<p>Security option. <em>Not recommended.</em></p>
</blockquote>
</li>
<li>
<p><em>hadoop.rpc.protection=authentication</em></p>
<blockquote>
<p>RPC connection protection. Possible values are authentication, integrity, and privacy. <em>Not recommended.</em></p>
</blockquote>
</li>
<li>
<p>hadoop.work.around.non.threadsafe.getpwuid=false</p>
<blockquote>
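<p>Some systems are known to have problems with getpwuid_r and getpwgid_r, where these calls are not thread-safe. The main symptom is JVM crashes. Enable this option if your system is affected. Disabled by default.</p>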
</blockquote>
</li>
<li>
<p><em>hadoop.kerberos.kinit.command=kinit</em></p>
<blockquote>
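<p>Command used to periodically renew the Kerberos credentials supplied to Hadoop. It must be found on the PATH of the user running the Hadoop client; otherwise specify an absolute path. <em>Not recommended.</em></p>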
</blockquote>
</li>
<li>
<p><em>hadoop.security.auth_to_local=null</em></p>
<blockquote>
<p>Maps Kerberos principals to local user names.</p>
</blockquote>
</li>
<li>
<p>io.bytes.per.checksum=512</p>
<blockquote>
<p>Number of bytes per checksum. Must not be larger than io.file.buffer.size.</p>
</blockquote>
</li>
<li>
<p>io.skip.checksum.errors=FALSE</p>
<blockquote>
<p>Whether to skip checksum errors. Defaults to false, so a checksum mismatch raises an error.</p>
</blockquote>
</li>
<li>
<p>io.serializations=org.apache.hadoop.io.serializer.WritableSerialization,org.apache.hadoop.io.serializer.avro.AvroSpecificSerialization,org.apache.hadoop.io.serializer.avro.AvroReflectSerialization</p>
<blockquote>
<p>List of serialization classes that can be used to obtain serializers and deserializers.</p>
</blockquote>
</li>
<li>
<p>io.seqfile.local.dir=${hadoop.tmp.dir}/io/local</p>
<blockquote>
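<p>Local directory where sequence files store intermediate data during a merge. It may be a comma-separated list of directories, ideally on different disks to spread I/O. Directories that do not exist are ignored.</p>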
</blockquote>
</li>
<li>
<p>io.map.index.skip=0</p>
<blockquote>
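<p>Number of index entries to skip between each entry. Defaults to 0. A value greater than 0 lets large MapFiles be opened with less memory. <strong>Note: a MapFile is a sorted set of SequenceFiles with an internal index.</strong></p>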
</blockquote>
</li>
<li>
<p>io.map.index.interval=128</p>
<blockquote>
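<p>A MapFile consists of two files, a data file and an index file. For every io.map.index.interval records written to the data file, one entry (row key, position in the data file) is written to the index file.</p>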
</blockquote>
</li>
<li>
<p>fs.default.name=file:///</p>
<blockquote>
<p><strong>Deprecated.</strong> Use fs.defaultFS instead.</p>
</blockquote>
</li>
<li>
<p><em>fs.AbstractFileSystem.file.impl=org.apache.hadoop.fs.local.LocalFs</em></p>
<blockquote>
<p>File system implementation class: file</p>
</blockquote>
</li>
<li>
<p>fs.AbstractFileSystem.hdfs.impl=org.apache.hadoop.fs.Hdfs</p>
<blockquote>
<p>File system implementation class: hdfs</p>
</blockquote>
</li>
<li>
<p>fs.AbstractFileSystem.viewfs.impl=org.apache.hadoop.fs.viewfs.ViewFs</p>
<blockquote>
<p>File system implementation class: viewfs (e.g. a client-side mount table).</p>
<blockquote>
<p><strong>When federation is used, clients can deploy this file system to access several nameservices conveniently.</strong></p>
</blockquote>
</blockquote>
</li>
<li>
<p><em>fs.ftp.host=0.0.0.0</em></p>
<blockquote>
<p>Non-HDFS file system setting. <em>Not a concern for now.</em></p>
</blockquote>
</li>
<li>
<p><em>fs.ftp.host.port=21</em></p>
<blockquote>
<p>Non-HDFS file system setting. <em>Not a concern for now.</em></p>
</blockquote>
</li>
<li>
<p>fs.df.interval=60000</p>
<blockquote>
<p>Refresh interval of the disk usage statistics, in milliseconds.</p>
</blockquote>
</li>
<li>
<p><em>fs.s3.block.size=67108864</em></p>
<blockquote>
<p>Non-HDFS file system setting. <em>Not a concern for now.</em></p>
</blockquote>
</li>
<li>
<p><em>fs.s3.buffer.dir=${hadoop.tmp.dir}/s3</em></p>
<blockquote>
<p>Non-HDFS file system setting. <em>Not a concern for now.</em></p>
</blockquote>
</li>
<li>
<p><em>fs.s3.maxRetries=4</em></p>
<blockquote>
<p>Non-HDFS file system setting. <em>Not a concern for now.</em></p>
</blockquote>
</li>
<li>
<p><em>fs.s3.sleepTimeSeconds=10</em></p>
<blockquote>
<p>Non-HDFS file system setting. <em>Not a concern for now.</em></p>
</blockquote>
</li>
<li>
<p>fs.automatic.close=true</p>
<blockquote>
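<p>By default, FileSystem instances are closed automatically at program exit through a JVM shutdown hook. Setting this property to false disables that behavior. This is an advanced option; anyone who uses it must pay close attention to shutdown ordering. <em>Do not disable it.</em></p>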
</blockquote>
</li>
<li>
<p><em>fs.s3n.block.size=67108864</em></p>
<blockquote>
<p>Non-HDFS file system setting. <em>Not a concern for now.</em></p>
</blockquote>
</li>
<li>
<p>io.seqfile.compress.blocksize=1000000</p>
<blockquote>
<p>Minimum block size for compression in block-compressed SequenceFiles.</p>
</blockquote>
</li>
<li>
<p>io.seqfile.lazydecompress=TRUE</p>
<blockquote>
<p>Lazy decompression: data is decompressed only when needed. Applies only to block-compressed SequenceFiles.</p>
</blockquote>
</li>
<li>
<p>io.seqfile.sorter.recordlimit=1000000</p>
<blockquote>
<p>Maximum number of records kept in memory during a spill in SequenceFiles.Sorter.</p>
</blockquote>
</li>
<li>
<p>io.mapfile.bloom.size=1048576</p>
<blockquote>
<p>Size of the Bloom filter used by BloomMapFile.</p>
</blockquote>
</li>
<li>
<p>io.mapfile.bloom.error.rate=0.005</p>
<blockquote>
<p>False-positive rate of the Bloom filter used by BloomMapFile. As this value decreases, memory use grows exponentially.</p>
</blockquote>
</li>
<li>
<p>hadoop.util.hash.type=murmur</p>
<blockquote>
<p>Default hash implementation: 'murmur' for MurmurHash, 'jenkins' for JenkinsHash.</p>
</blockquote>
</li>
<li>
<p>ipc.client.idlethreshold=4000</p>
<blockquote>
<p>Connection-count threshold above which idle connections are checked.</p>
</blockquote>
</li>
<li>
<p>ipc.client.kill.max=10</p>
<blockquote>
<p>Maximum number of clients to disconnect in one pass.</p>
</blockquote>
</li>
<li>
<p>ipc.client.connection.maxidletime=10000</p>
<blockquote>
<p>Maximum idle time in milliseconds, after which the client closes its connection to the server.</p>
</blockquote>
</li>
<li>
<p>ipc.client.connect.max.retries=10</p>
<blockquote>
<p>Number of times a client retries establishing a connection.</p>
</blockquote>
</li>
<li>
<p>ipc.client.connect.max.retries.on.timeouts=45</p>
<blockquote>
<p>Number of times a client retries establishing a connection after a timeout.</p>
</blockquote>
</li>
<li>
<p>ipc.server.listen.queue.size=128</p>
<blockquote>
<p>Length of the server-side listen queue for accepting client connections.</p>
</blockquote>
</li>
<li>
<p>ipc.server.tcpnodelay=false</p>
<blockquote>
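<p>Turns Nagle's algorithm on or off on the server side. The algorithm delays sending small packets to use network bandwidth more efficiently, at the cost of extra latency for small packets. The default is false, meaning TCP_NODELAY is off and Nagle's algorithm is on. <em>Recommended: false, i.e. keep Nagle's algorithm enabled.</em></p>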
</blockquote>
</li>
<li>
<p>ipc.client.tcpnodelay=false</p>
<blockquote>
<p>See ipc.server.tcpnodelay; this is the client-side parameter. <em>Disabling Nagle's algorithm may be worth considering to improve client responsiveness.</em></p>
</blockquote>
</li>
<li>
<p>hadoop.rpc.socket.factory.class.default=org.apache.hadoop.net.StandardSocketFactory</p>
<blockquote>
<p>Advanced option; not considered for now.</p>
</blockquote>
</li>
<li>
<p>hadoop.rpc.socket.factory.class.ClientProtocol=null</p>
<blockquote>
<p>Advanced option; not considered for now.</p>
</blockquote>
</li>
<li>
<p>hadoop.socks.server=null</p>
<blockquote>
<p>Advanced option; not considered for now.</p>
</blockquote>
</li>
<li>
<p>net.topology.node.switch.mapping.impl=org.apache.hadoop.net.ScriptBasedMapping</p>
<blockquote>
<p>Rack-awareness implementation class.</p>
</blockquote>
</li>
<li>
<p>net.topology.script.file.name=null</p>
<blockquote>
<p>Used together with ScriptBasedMapping. This is the script file; it takes IP addresses as input and outputs rack paths.</p>
</blockquote>
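<p>A rack-awareness script of this kind might look like the sketch below. It receives addresses as command-line arguments and prints one rack path per argument. The subnet-to-rack table and the rack names are hypothetical placeholders, not values from any real cluster; only the /default-rack fallback comes from this document.</p>

```python
import sys

DEFAULT_RACK = "/default-rack"

# Hypothetical mapping from the first three octets of an IP to a rack path.
RACK_BY_SUBNET = {
    "10.0.1": "/dc1/rack1",
    "10.0.2": "/dc1/rack2",
}


def resolve_rack(address: str) -> str:
    """Return the rack path for one address, or the default rack."""
    subnet = address.rsplit(".", 1)[0]
    return RACK_BY_SUBNET.get(subnet, DEFAULT_RACK)


def main(addresses):
    # One rack path is printed for each address that was passed in.
    for address in addresses:
        print(resolve_rack(address))


if __name__ == "__main__":
    main(sys.argv[1:])
```

<p>The number of addresses passed per invocation is bounded by net.topology.script.number.args, so the script only needs to loop over its arguments.</p>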
</li>
<li>
<p>net.topology.script.number.args=100</p>
<blockquote>
<p>Maximum number of arguments passed to the rack-awareness script per invocation; each argument is an IP address.</p>
</blockquote>
</li>
<li>
<p>net.topology.table.file.name=null</p>
<blockquote>
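<p>Used when net.topology.node.switch.mapping.impl is set to org.apache.hadoop.net.TableMapping. The file is a two-column text file separated by whitespace: the first column is a DNS name or IP address, the second is the rack path. Hosts without an entry get the default rack (/default-rack).</p>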
</blockquote>
</li>
<li>
<p><em>file.stream-buffer-size=4096</em></p>
<blockquote>
<p>Non-HDFS file system; not a concern for now.</p>
</blockquote>
</li>
<li>
<p><em>s3.stream-buffer-size=4096</em></p>
<blockquote>
<p>Non-HDFS file system; not a concern for now.</p>
</blockquote>
</li>
<li>
<p><em>kfs.stream-buffer-size=4096</em></p>
<blockquote>
<p>Non-HDFS file system; not a concern for now.</p>
</blockquote>
</li>
<li>
<p><em>ftp.stream-buffer-size=4096</em></p>
<blockquote>
<p>Non-HDFS file system; not a concern for now.</p>
</blockquote>
</li>
<li>
<p><em>tfile.io.chunk.size=1048576</em></p>
<blockquote>
<p>Non-HDFS file system; not a concern for now.</p>
</blockquote>
</li>
<li>
<p>hadoop.http.authentication.type=simple</p>
<blockquote>
<p>Authentication for the Oozie HTTP endpoint. Allowed values: simple | kerberos | #AUTHENTICATION_HANDLER_CLASSNAME#</p>
<blockquote>
<p><strong>Recommended: simple, i.e. authentication off.</strong></p>
</blockquote>
</blockquote>
</li>
<li>
<p><em>hadoop.http.authentication.token.validity=36000</em></p>
<blockquote>
<p>Security option. <em>Not a concern for now.</em></p>
</blockquote>
</li>
<li>
<p><em>hadoop.http.authentication.signature.secret.file=${user.home}/hadoop-http-auth-signature-secret</em></p>
<blockquote>
<p>Security option. <em>Not a concern for now.</em></p>
</blockquote>
</li>
<li>
<p><em>hadoop.http.authentication.cookie.domain=null</em></p>
<blockquote>
<p>Security option. <em>Not a concern for now.</em></p>
</blockquote>
</li>
<li>
<p><em>hadoop.http.authentication.simple.anonymous.allowed=TRUE</em></p>
<blockquote>
<p>Security option. <em>Not a concern for now.</em></p>
</blockquote>
</li>
<li>
<p><em>hadoop.http.authentication.kerberos.principal=HTTP/_HOST@LOCALHOST</em></p>
<blockquote>
<p>Security option. <em>Not a concern for now.</em></p>
</blockquote>
</li>
<li>
<p><em>hadoop.http.authentication.kerberos.keytab=${user.home}/hadoop.keytab</em></p>
<blockquote>
<p>Security option. <em>Not a concern for now.</em></p>
</blockquote>
</li>
<li>
<p>dfs.ha.fencing.ssh.connect-timeout=30000</p>
<blockquote>
<p>SSH connection timeout in milliseconds; applies only to the built-in sshfence fencer.</p>
</blockquote>
</li>
<li>
<p>ha.zookeeper.parent-znode=/hadoop-ha</p>
<blockquote>
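<p>ZooKeeper-based failover needs znodes created in ZK; this is the name of the parent znode under which the ZKFC works. Note that the nameservice ID is written beneath this node, so a single value is enough even when federation is enabled.</p>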
</blockquote>
</li>
<li>
<p>ha.zookeeper.acl=world:anyone:rwcda</p>
<blockquote>
<p>ACLs for the znodes created by the ZKFC. Several comma-separated entries are allowed. The format is the same as in the ZK CLI.</p>
</blockquote>
</li>
<li>
<p>ha.zookeeper.auth=null</p>
<blockquote>
<p>Authentication used for ZK operations.</p>
</blockquote>
</li>
<li>
<p><em>hadoop.ssl.keystores.factory.class=org.apache.hadoop.security.ssl.FileBasedKeyStoresFactory</em></p>
<blockquote>
<p>Security option. <em>Not a concern for now.</em></p>
</blockquote>
</li>
<li>
<p><em>hadoop.ssl.require.client.cert=FALSE</em></p>
<blockquote>
<p>Security option. <em>Not a concern for now.</em></p>
</blockquote>
</li>
<li>
<p><em>hadoop.ssl.hostname.verifier=DEFAULT</em></p>
<blockquote>
<p>Security option. <em>Not a concern for now.</em></p>
</blockquote>
</li>
<li>
<p><em>hadoop.ssl.server.conf=ssl-server.xml</em></p>
<blockquote>
<p>Security option. <em>Not a concern for now.</em></p>
</blockquote>
</li>
<li>
<p><em>hadoop.ssl.client.conf=ssl-client.xml</em></p>
<blockquote>
<p>Security option. <em>Not a concern for now.</em></p>
</blockquote>
</li>
<li>
<p><em>hadoop.ssl.enabled=FALSE</em></p>
<blockquote>
<p>Security option. <em>Not a concern for now.</em></p>
</blockquote>
</li>
<li>
<p><em>hadoop.jetty.logs.serve.aliases=TRUE</em></p>
<blockquote>
<p>Whether Jetty is allowed to serve aliases.</p>
</blockquote>
</li>
<li>
<p>ha.health-monitor.connect-retry-interval.ms=1000</p>
<blockquote>
<p>Interval between connection retries of the HA health monitor.</p>
</blockquote>
</li>
<li>
<p>ha.health-monitor.check-interval.ms=1000</p>
<blockquote>
<p>Interval between health checks of the HA health monitor.</p>
</blockquote>
</li>
<li>
<p>ha.health-monitor.sleep-after-disconnect.ms=1000</p>
<blockquote>
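<p>How long the HA health monitor sleeps after losing its connection because of a network problem. <em>This avoids an immediate retry, which is pointless while the network problem persists.</em></p>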
</blockquote>
</li>
<li>
<p>ha.health-monitor.rpc-timeout.ms=45000</p>
<blockquote>
<p>Timeout of HA health-monitor calls, in milliseconds.</p>
</blockquote>
</li>
<li>
<p>ha.failover-controller.new-active.rpc-timeout.ms=60000</p>
<blockquote>
<p>Timeout for the FC waiting for the new NN to become active.</p>
</blockquote>
</li>
<li>
<p>ha.failover-controller.graceful-fence.rpc-timeout.ms=5000</p>
<blockquote>
<p>Timeout for the FC waiting for the old active NN to become standby.</p>
</blockquote>
</li>
<li>
<p>ha.failover-controller.graceful-fence.connection.retries=1</p>
<blockquote>
<p>Number of connection retries the FC makes during graceful fencing.</p>
</blockquote>
</li>
<li>
<p>ha.failover-controller.cli-check.rpc-timeout.ms=20000</p>
<blockquote>
<p>Timeout for a manually run FC (from the CLI) waiting for health checks and service state.</p>
</blockquote>
</li>
</ul>
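<p>To apply the recommendations above, overrides go into core-site.xml as name/value properties. The sketch below builds such a file body with Python's standard library. The helper name <code>build_site_xml</code> is hypothetical, and the chosen values are simply the ones recommended in this list (hdfs://mycluster1 is the example nameservice from above), not settings verified against any cluster.</p>

```python
import xml.etree.ElementTree as ET


def build_site_xml(properties: dict) -> str:
    """Render a {name: value} mapping as a Hadoop-style <configuration> document."""
    root = ET.Element("configuration")
    for name, value in properties.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = str(value)
    return ET.tostring(root, encoding="unicode")


# Values taken from the recommendations in this article.
overrides = {
    "fs.defaultFS": "hdfs://mycluster1",
    "fs.trash.interval": 3 * 24 * 60,      # 3 days, in minutes
    "fs.trash.checkpoint.interval": 60,    # 1 hour, in minutes
    "io.file.buffer.size": 1024 * 1024,    # 1 MB, in bytes
}

site_xml = build_site_xml(overrides)
print(site_xml)
```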
<h4><a>hdfs-default.xml</a></h4>
<ul>
<li>
<p>hadoop.hdfs.configuration.version=1</p>
<p>Version of the configuration file.</p>
</li>
<li>
<p><strong>dfs.datanode.address=0.0.0.0:50010</strong></p>
<p>DN service address and port, used for data transfer. Port 0 means any free port.</p>
<pre><code>xferPort  dfs.datanode.address       50010  data stream address (data transfer)
infoPort  dfs.datanode.http.address  50075
ipcPort   dfs.datanode.ipc.address   50020  commands
</code></pre>
</li>
<li>
<p><strong>dfs.datanode.http.address=0.0.0.0:50075</strong></p>
<blockquote>
<p>DN HTTP service address and port. Port 0 means any free port.</p>
</blockquote>
</li>
<li>
<p><strong>dfs.datanode.ipc.address=0.0.0.0:50020</strong></p>
<blockquote>
<p>DN IPC address and port. Port 0 means any free port.</p>
</blockquote>
</li>
<li>
<p><strong>dfs.namenode.rpc-address=0.0.0.0:50090</strong></p>
<blockquote>
<p>NN RPC address and port.</p>
</blockquote>
</li>
<li>
<p><strong>dfs.namenode.http-address=0.0.0.0:50070</strong></p>
<blockquote>
<p>NN HTTP address and port. Port 0 means any free port.</p>
</blockquote>
</li>
<li>
<p><strong>dfs.datanode.du.reserved=0</strong></p>
<blockquote>
<p>Reserved space per disk (volume), in bytes. Make sure enough room is left for non-HDFS files. <em>Recommended: reserve 5% of the disk capacity, or 50 GB or more.</em></p>
</blockquote>
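<p>Because the value is given in bytes, it helps to compute it rather than type it by hand. The sketch below reads the recommendation as "the larger of 5% of capacity and 50 GB", which is one possible interpretation; the function name and the use of binary gigabytes are assumptions.</p>

```python
GIB = 1024 ** 3
TIB = 1024 ** 4


def recommended_reserved_bytes(capacity_bytes: int) -> int:
    """Larger of 5% of the volume capacity and 50 GiB (assumed reading)."""
    five_percent = capacity_bytes * 5 // 100
    return max(five_percent, 50 * GIB)


small_disk = recommended_reserved_bytes(500 * GIB)  # the 50 GiB floor applies
large_disk = recommended_reserved_bytes(4 * TIB)    # 5% of capacity applies
print(small_disk, large_disk)
```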
</li>
<li>
<p><strong>dfs.namenode.name.dir.restore=FALSE</strong></p>
<blockquote>
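<p>When set to true, the NN attempts to restore any previously failed dfs.namenode.name.dir directory. The attempt is made when a checkpoint is created. <em>Recommended when multiple disks are configured.</em></p>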
</blockquote>
</li>
<li>
<p><strong>dfs.namenode.edits.dir=${dfs.namenode.name.dir}</strong></p>
<blockquote>
<p>Local directories where the NN stores edits files. It may be a comma-separated list; the edits files are written to every directory for redundancy.</p>
</blockquote>
</li>
<li>
<p><strong>dfs.namenode.shared.edits.dir=null</strong></p>
<blockquote>
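<p>Storage directory shared between several NNs for edits files. The active NN writes to it and the standby reads from it to keep the namespace in sync. It does not have to be listed in dfs.namenode.edits.dir, and it is not used in non-HA clusters. <em>The qjournal (QJM) approach is recommended, in which case no shared directory is needed here (the value is a qjournal:// URI instead).</em></p>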
</blockquote>
</li>
<li>
<p><strong>dfs.namenode.edits.journal-plugin.qjournal=org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager</strong></p>
<blockquote>
<p>Shares edits through qjournal. <em>This approach is recommended.</em></p>
</blockquote>
</li>
<li>
<p><strong>dfs.permissions.enabled=true</strong></p>
<blockquote>
<p>Whether permission checking is enabled in HDFS.</p>
</blockquote>
</li>
<li>
<p><strong>dfs.permissions.superusergroup=supergroup</strong></p>
<blockquote>
<p>The superuser group. Only one can be set.</p>
</blockquote>
</li>
<li>
<p><strong>dfs.datanode.data.dir=file://${hadoop.tmp.dir}/dfs/data</strong></p>
<blockquote>
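<p>Local disk directories where the DN stores HDFS blocks. It may be a comma-separated list, typically one directory per disk. The directories are used in turn: one block goes to this directory, the next block to the next directory, and so on. Each block is stored only once per machine. Directories that do not exist are ignored, so they must be created beforehand.</p>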
</blockquote>
</li>
<li>
<p><strong>dfs.replication=3</strong></p>
<blockquote>
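<p>Block replication factor. It can be specified when a file is created; clients may set it freely, and it can also be changed from the command line. Different files can have different replication factors. The default applies when none is specified.</p>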
</blockquote>
</li>
<li>
<p><strong>dfs.replication.max=512</strong></p>
<blockquote>
<p>Maximum block replication. Do not set it above the total number of nodes.</p>
</blockquote>
</li>
<li>
<p><strong>dfs.namenode.replication.min=1</strong></p>
<blockquote>
<p>Minimum block replication. <em>An upload is considered successful once the minimum replication is reached.</em></p>
</blockquote>
</li>
<li>
<p><strong>dfs.blocksize=67108864</strong></p>
<blockquote>
<p>Block size in bytes. A size suffix may be used: k (kilo), m (mega), g (giga), t (tera), p (peta), e (exa), for example 128k, 512m, 1g, and so on.</p>
</blockquote>
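<p>The suffix notation can be expanded to a byte count as in the sketch below. It assumes the suffixes are binary multiples (k = 1024), which matches the default 67108864 being 64m; <code>parse_size</code> is a hypothetical helper, not a Hadoop API.</p>

```python
# Power of 1024 for each suffix (assumed binary multiples).
SUFFIX_POWERS = {"k": 1, "m": 2, "g": 3, "t": 4, "p": 5, "e": 6}


def parse_size(value: str) -> int:
    """Convert a string such as '128m' or '67108864' to a number of bytes."""
    text = value.strip().lower()
    if text and text[-1] in SUFFIX_POWERS:
        return int(text[:-1]) * 1024 ** SUFFIX_POWERS[text[-1]]
    return int(text)


default_block_size = parse_size("67108864")
larger_block_size = parse_size("128m")
print(default_block_size == parse_size("64m"), larger_block_size)
# prints: True 134217728
```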
</li>
<li>
<p><strong>dfs.client.block.write.retries=3</strong></p>
<blockquote>
<p>Maximum number of retries when a client writes data to a DN. An error is reported once the retries are exhausted.</p>
</blockquote>
</li>
<li>
<p><strong>dfs.client.block.write.replace-datanode-on-failure.enable=true</strong></p>
<blockquote>
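<p>During a pipeline write (the way data is uploaded), if a DN or disk fails, the client tries to remove the failed DN from the pipeline and keeps writing to the remaining DNs, which leaves the pipeline with fewer DNs. This feature adds new DNs to the pipeline. It is a site-wide option. On very small clusters, for example 3 nodes or fewer, administrators may want to disable it.</p>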
</blockquote>
</li>
<li>
<p><strong>dfs.client.block.write.replace-datanode-on-failure.policy=DEFAULT</strong></p>
<blockquote>
<p>This property takes effect only when dfs.client.block.write.replace-datanode-on-failure.enable is set to true.</p>
</blockquote>
<blockquote>
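<ul>
<li>ALWAYS: always add a new DN.</li>
<li>NEVER: never add a new DN.</li>
<li>DEFAULT: let r be the replication factor and n the number of DNs being written to. A new DN is added when r>=3 and either floor(r/2)>=n, or r>n provided the file has been hflushed/appended.</li>
</ul>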
</blockquote>
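<p>The DEFAULT rule is easier to follow as a predicate. The sketch below encodes the condition exactly as stated above; the function and parameter names are illustrative and do not come from the Hadoop code base.</p>

```python
def should_add_datanode(r: int, n: int, hflushed_or_appended: bool) -> bool:
    """DEFAULT policy: r is the replication factor, n the DNs being written to."""
    if r < 3:
        return False
    return (r // 2 >= n) or (r > n and hflushed_or_appended)


# Replication 3 with one DN left in the pipeline: floor(3/2) = 1 >= 1.
one_left = should_add_datanode(3, 1, False)
# Replication 3 with two DNs left: only replaced if the file was hflushed/appended.
two_left_plain = should_add_datanode(3, 2, False)
two_left_appended = should_add_datanode(3, 2, True)
print(one_left, two_left_plain, two_left_appended)
# prints: True False True
```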
</li>
<li>
<p><strong>dfs.heartbeat.interval=3</strong></p>
<blockquote>
<p>DN heartbeat interval, in seconds.</p>
</blockquote>
</li>
<li>
<p><strong>dfs.namenode.handler.count=10</strong></p>
<blockquote>
<p>Number of NN server threads, used to handle RPC requests.</p>
</blockquote>
</li>
<li>
<p><strong>dfs.namenode.safemode.threshold-pct=0.999f</strong></p>
<blockquote>
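<p>Safe mode threshold: the fraction (a float) of blocks that must meet the minimum replication requirement (dfs.namenode.replication.min). A value of 0 or less means the NN does not wait for any particular fraction of blocks before leaving safe mode. A value greater than 1 makes it impossible to leave safe mode.</p>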
</blockquote>
</li>
<li>
<p><strong>dfs.namenode.safemode.extension=30000</strong></p>
<blockquote>
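<p>How long safe mode is extended after the required threshold has been reached, in milliseconds. <em>It can be set to 0 or to a short time such as 3 seconds.</em></p>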
</blockquote>
</li>
<li>
<p><strong>dfs.datanode.balance.bandwidthPerSec=1048576</strong></p>
<blockquote>
<p>Maximum bandwidth each DN may use for balancing, in bytes per second. The default is 1 MB. <em>Up to 10 MB is a reasonable setting.</em></p>
</blockquote>
</li>
<li>
<p><strong>dfs.hosts=null</strong></p>
<blockquote>
<p>Name of a file listing the hosts permitted to connect to the NN. The full path must be given. If empty, all hosts are permitted.</p>
</blockquote>
</li>
<li>
<p><strong>dfs.hosts.exclude=null</strong></p>
<blockquote>
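<p>Name of a file listing the hosts not permitted to connect to the NN. The full path must be given. If empty, no host is excluded. <em>If both files are set and overlap, dfs.hosts takes precedence.</em></p>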
</blockquote>
</li>
<li>
<p><strong>dfs.stream-buffer-size=4096</strong></p>
<blockquote>
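<p>File stream buffer size. It should be a multiple of the hardware page size. <em>This is the data buffer size for read and write operations. Note that it differs from the per-file-type buffers in core-default.xml; this one is shared across DFS.</em></p>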
</blockquote>
</li>
<li>
<p><strong>dfs.namenode.num.extra.edits.retained=1000000</strong></p>
<blockquote>
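<p>Number of extra edit-log entries retained beyond the minimum required. This is useful for auditing, or for an HA setup where a remote standby node may be offline at times; both need a longer backlog to be kept.</p>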
<hr>
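<p>Typically each edit is a few hundred bytes, so the default of 1 million edits is on the order of hundreds of MB to 1 GB. Note: earlier extra edits files may exceed the value set here, because other options such as dfs.namenode.max.extra.edits.segments.retained also apply.</p>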
<blockquote>
<p><strong>Recommended: 2200, roughly 3 days' worth.</strong></p>
</blockquote>
</blockquote>
</li>
<li>
<p><strong>dfs.datanode.handler.count=10</strong></p>
<blockquote>
<p>Number of DN server threads. <em>These threads only receive requests and handle control commands.</em></p>
</blockquote>
</li>
<li>
<p><strong>dfs.datanode.failed.volumes.tolerated=0</strong></p>
<blockquote>
<p>Number of volumes allowed to fail. The default of 0 means the failure of any single volume shuts the DN down.</p>
<blockquote>
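<p><strong>Setting this value is recommended so that isolated disk problems are tolerated. If it exceeds the actual number of disks, startup fails with an error.</strong></p>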
</blockquote>
</blockquote>
</li>
<li>
<p><strong>dfs.namenode.support.allow.format=true</strong></p>
<blockquote>
<p>Whether the NN may be formatted. On production systems set it to false to prevent any format operation on a running DFS.</p>
<blockquote>
<p><strong>Recommended: disable it in the configuration after the initial format.</strong></p>
</blockquote>
</blockquote>
</li>
<li>
<p><strong>dfs.client.failover.max.attempts=15</strong></p>
<blockquote>
<p>Expert setting. Number of client failover attempts.</p>
</blockquote>
</li>
<li>
<p><strong>dfs.client.failover.connection.retries=0</strong></p>
<blockquote>
<p>Expert setting. Number of retries made by the IPC client on failure. <em>Increase it on unstable networks.</em></p>
</blockquote>
</li>
<li>
<p><strong>dfs.client.failover.connection.retries.on.timeouts=0</strong></p>
<blockquote>
<p>Expert setting. Number of retries made by the IPC client on failure, counting timeout failures only. <em>Increase it on unstable networks.</em></p>
</blockquote>
</li>
<li>
<p><strong>dfs.nameservices=null</strong></p>
<blockquote>
<p>Comma-separated list of nameservices.</p>
<blockquote>
<p><strong>We usually configure only one; enabling federation requires several.</strong></p>
</blockquote>
</blockquote>
</li>
<li>
<p><strong>dfs.nameservice.id=null</strong></p>
<blockquote>
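<p>Nameservice ID. If it is unset, or if several nameservices are configured, it is determined by matching the local node's address against the configured addresses. <em>With only one NS configured, leaving this unset is recommended.</em></p>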
</blockquote>
</li>
<li>
<p><strong>dfs.ha.namenodes.EXAMPLENAMESERVICE=null</strong></p>
<blockquote>
<p>A list of NNs. EXAMPLENAMESERVICE stands for a concrete nameservice name, normally one configured in dfs.nameservices. The value is the list of IDs of the NNs to be configured.</p>
<blockquote>
<p><strong>IDs are chosen freely as long as they are unique, e.g. nn1,nn2.</strong></p>
</blockquote>
</blockquote>
</li>
<li>
<p><strong>dfs.ha.namenode.id=null</strong></p>
<blockquote>
<p>ID of this NN. If unset, the system determines it by matching the local node's address against the configured addresses.</p>
<blockquote>
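<p><strong>This is the ID of the local NN (the setting affects NNs only). Since two NNs have to be configured, leaving it unset is recommended unless there is a specific need.</strong></p>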
</blockquote>
</blockquote>
</li>
<li>
<p><strong>dfs.ha.automatic-failover.enabled=FALSE</strong></p>
<blockquote>
<p>Whether automatic failover is enabled. <em>Recommended: true.</em></p>
</blockquote>
</li>
<li>
<p><strong>dfs.namenode.avoid.write.stale.datanode=FALSE</strong></p>
<blockquote>
<p>Whether to avoid writing to stale DNs. Writes avoid stale DNs unless the proportion of stale DNs exceeds a configured ratio (dfs.namenode.write.stale.datanode.ratio).</p>
<blockquote>
<p><strong>Worth trying.</strong></p>
</blockquote>
</blockquote>
</li>
<li>
<p><strong>dfs.journalnode.rpc-address=0.0.0.0:8485</strong></p>
<blockquote>
<p>JournalNode RPC service address and port.</p>
</blockquote>
</li>
<li>
<p><strong>dfs.journalnode.http-address=0.0.0.0:8480</strong></p>
<blockquote>
<p>JournalNode HTTP address and port. Port 0 means a randomly chosen port.</p>
</blockquote>
</li>
<li>
<p><strong>dfs.namenode.audit.loggers=default</strong></p>
<blockquote>
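<p>List of audit logger implementation classes that receive audit events. They must implement the org.apache.hadoop.hdfs.server.namenode.AuditLogger interface. The value "default" refers to the default audit logger, which uses the configured logging system. Installing custom audit loggers may affect NN stability and performance.</p>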
<blockquote>
<p><strong>Recommended: default, i.e. enabled.</strong></p>
</blockquote>
</blockquote>
</li>
<li><p>dfs.client.socket-timeout=60*1000</p></li>
<li><p>dfs.datanode.socket.write.timeout=8*60*1000</p></li>
<li><p>dfs.datanode.socket.reuse.keepalive=1000</p></li>
<li>
<p>dfs.namenode.logging.level=info</p>
<blockquote>
<p>Logging level of the DFS NN. Possible values: info, dir (trace namespace changes), "block" (trace block creation, deletion, and replication changes), or "all".</p>
</blockquote>
</li>
<li>
<p>dfs.namenode.secondary.http-address=0.0.0.0:50090</p>
<blockquote>
<p>SNN HTTP service address. With port 0 the service picks a random free port. <em>Once HA is in use, the SNN is no longer used.</em></p>
</blockquote>
</li>
<li>
<p>dfs.https.enable=FALSE</p>
<blockquote>
<p>Whether HDFS supports HTTPS (SSL). <em>Recommended: disabled.</em></p>
</blockquote>
</li>
<li>
<p>dfs.client.https.need-auth=FALSE</p>
<blockquote>
<p>Security option; not a concern for now.</p>
</blockquote>
</li>
<li>
<p>dfs.https.server.keystore.resource=ssl-server.xml</p>
<blockquote>
<p>Security option; not a concern for now.</p>
</blockquote>
</li>
<li>
<p>dfs.client.https.keystore.resource=ssl-client.xml</p>
<blockquote>
<p>Security option; not a concern for now.</p>
</blockquote>
</li>
<li>
<p>dfs.datanode.https.address=0.0.0.0:50475</p>
<blockquote>
<p>Security option; not a concern for now.</p>
</blockquote>
</li>
<li>
<p>dfs.namenode.https-address=0.0.0.0:50470</p>
<blockquote>
<p>Security option; not a concern for now.</p>
</blockquote>
</li>
<li>
<p>dfs.datanode.dns.interface=default</p>
<blockquote>
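<p>Network interface from which the DN reports its IP address. <em>DNs are given addresses such as 0.0.0.0, which must be resolved to an externally reachable address. The value here is an interface name, namely the interface whose bound IP is reachable from outside. The default is usually enough.</em></p>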
</blockquote>
</li>
<li>
<p>dfs.datanode.dns.nameserver=default</p>
<blockquote>
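<p>Host name or IP address of the DNS name server. The DN uses it to determine its own host name for external communication and display. <em>The default is usually enough.</em></p>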
</blockquote>
</li>
<li>
<p>dfs.namenode.backup.address=0.0.0.0:50100</p>
<blockquote>
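<p>Address and port of the NN backup (BK) node; port 0 means a randomly chosen port. <em>With HA this option can be ignored. Using a BK node is not recommended.</em></p>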
</blockquote>
</li>
<li>
<p>dfs.namenode.backup.http-address=0.0.0.0:50105</p>
<blockquote>
<p><em>With HA this option can be ignored. Using a BK node is not recommended.</em></p>
</blockquote>
</li>
<li>
<p>dfs.namenode.replication.considerLoad=true</p>
<blockquote>
<p>Whether load is taken into account when choosing replica targets. <em>It should be.</em></p>
</blockquote>
</li>
<li>
<p>dfs.default.chunk.view.size=32768</p>
<blockquote>
<p>Number of bytes shown when viewing a file in the browser.</p>
</blockquote>
</li>
<li>
<p>dfs.namenode.name.dir=file://${hadoop.tmp.dir}/dfs/name</p>
<blockquote>
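<p>Local disk directories where the NN stores the fsimage file. It may be a comma-separated list; the fsimage is written to every directory for redundancy. <em>Multiple directories are best placed on different disks. If one disk fails, the system keeps running and skips the bad disk. Because HA is used, a single directory is recommended; set two if safety is a particular concern.</em></p>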
</blockquote>
</li>
<li>
<p>dfs.namenode.fs-limits.max-component-length=0</p>
<blockquote>
<p>Maximum length in bytes of each path component (a directory or file name). 0 disables the check. <em>Long file names hurt performance.</em></p>
</blockquote>
</li>
<li>
<p>dfs.namenode.fs-limits.max-directory-items=0</p>
<blockquote>
<p>Maximum number of subdirectories or files per directory. 0 means unlimited. <em>Many files and subdirectories in one directory hurt performance.</em></p>
</blockquote>
</li>
<li>
<p>dfs.namenode.fs-limits.min-block-size=1048576</p>
<blockquote>
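<p>Minimum block size in bytes, enforced by the NN at creation time. It prevents users from choosing a block size so small that it produces too many blocks, which badly hurts performance.</p>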
</blockquote>
</li>
<li>
<p>dfs.namenode.fs-limits.max-blocks-per-file=1048576</p>
<blockquote>
<p>Maximum number of blocks per file, enforced by the NN on write. It prevents the creation of extremely large files.</p>
</blockquote>
</li>
<li>
<p>dfs.block.access.token.enable=FALSE</p>
<blockquote>
<p>Whether access tokens are verified when accessing DNs. <em>Recommended: false, no checking.</em></p>
</blockquote>
</li>
<li>
<p>dfs.block.access.key.update.interval=600</p>
<blockquote>
<p>Security option; not a concern for now.</p>
</blockquote>
</li>
<li>
<p>dfs.block.access.token.lifetime=600</p>
<blockquote>
<p>Security option; not a concern for now.</p>
</blockquote>
</li>
<li>
<p>dfs.datanode.data.dir.perm=700</p>
<blockquote>
<p>Permissions of the local data directories. Either octal or symbolic notation works.</p>
</blockquote>
</li>
<li>
<p>dfs.blockreport.intervalMsec=21600000</p>
<blockquote>
<p>Block report interval in milliseconds. The default is 6 hours.</p>
</blockquote>
</li>
<li>
<p>dfs.blockreport.initialDelay=0</p>
<blockquote>
<p>Delay before the first block report, in seconds. <em>Presumably meant to reduce pressure on the NN.</em></p>
</blockquote>
</li>
<li>
<p>dfs.datanode.directoryscan.interval=21600</p>
<blockquote>
<p>Interval in seconds at which the DN scans its data directories and reconciles the blocks on disk with those in memory.</p>
</blockquote>
</li>
<li>
<p>dfs.datanode.directoryscan.threads=1</p>
<blockquote>
<p>Number of threads in the pool used to compile the per-volume reports in parallel.</p>
</blockquote>
</li>
<li>
<p>dfs.namenode.safemode.min.datanodes=0</p>
<blockquote>
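<p>Minimum number of DNs that must have reported to the NN; until it is reached, the NN does not leave safe mode (this applies at system startup). A value of 0 or less means the number of DNs is not considered at startup. A value larger than the actual number of DNs makes it impossible to leave safe mode. <em>Leaving this unset is recommended.</em></p>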
</blockquote>
</li>
<li>
<p>dfs.namenode.max.objects=0</p>
<blockquote>
<p>Maximum number of files, directories, and blocks DFS supports. 0 means unlimited.</p>
</blockquote>
</li>
<li>
<p>dfs.namenode.decommission.interval=30</p>
<blockquote>
<p>Interval in seconds at which the NN checks whether decommissioning is complete.</p>
</blockquote>
</li>
<li>
<p>dfs.namenode.decommission.nodes.per.interval=5</p>
<blockquote>
<p>Number of nodes the NN checks for decommission completion every dfs.namenode.decommission.interval seconds.</p>
</blockquote>
</li>
<li>
<p>dfs.namenode.replication.interval=3</p>
<blockquote>
<p>Interval, in seconds, at which the NN computes replication work for the DNs.</p>
</blockquote>
</li>
<li>
<p>dfs.namenode.accesstime.precision=3600000</p>
<blockquote>
<p>Access times of HDFS files are precise to this value, in milliseconds. The default is 1 hour. 0 disables access times.</p>
</blockquote>
</li>
<li>
<p>dfs.datanode.plugins=null</p>
<blockquote>
<p>DN上的插件列表,逗号分隔。</p>
</blockquote>
</li>
<li>
<p>dfs.namenode.plugins=null</p>
<blockquote>
<p>NN上的插件列表,逗号分隔。</p>
</blockquote>
</li>
<li>
<p>dfs.bytes-per-checksum=512</p>
<blockquote>
<p>Number of bytes per checksum. It must not be larger than dfs.stream-buffer-size.</p>
</blockquote>
</li>
<li>
<p>dfs.client-write-packet-size=65536</p>
<blockquote>
<p>Packet size, in bytes, for client writes. <em>A packet is a smaller unit of data within a block.</em></p>
</blockquote>
</li>
<li>
<p>dfs.client.write.exclude.nodes.cache.expiry.interval.millis=600000</p>
<blockquote>
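<p>Maximum time, in milliseconds, that a DN stays in the client's excluded-nodes list. After this period the previously excluded DN is removed from the cache and is again considered for block allocation. The default is 10 minutes.</p>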
</blockquote>
</li>
<li>
<p>dfs.namenode.checkpoint.dir=file://${hadoop.tmp.dir}/dfs/namesecondary</p>
<blockquote>
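<p>Where on the local file system the DFS SNN stores the temporary images to merge. If this is a comma-separated list of directories, the image is replicated in all of them for redundancy. <em>We suggest not using the SNN; ignore this setting.</em></p>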
</blockquote>
</li>
<li>
<p>dfs.namenode.checkpoint.edits.dir=${dfs.namenode.checkpoint.dir}</p>
<blockquote>
<p><em>建议不使用SNN功能,忽略此配置</em></p>
</blockquote>
</li>
<li>
<p>dfs.namenode.checkpoint.period=3600</p>
<blockquote>
<p><em>建议不使用SNN功能,忽略此配置</em></p>
</blockquote>
</li>
<li>
<p>dfs.namenode.checkpoint.txns=1000000</p>
<blockquote>
<p><em>建议不使用SNN功能,忽略此配置</em></p>
</blockquote>
</li>
<li>
<p>dfs.namenode.checkpoint.check.period=60</p>
<blockquote>
<p><em>建议不使用SNN功能,忽略此配置</em></p>
</blockquote>
</li>
<li>
<p>dfs.namenode.checkpoint.max-retries=3</p>
<blockquote>
<p><em>建议不使用SNN功能,忽略此配置</em></p>
</blockquote>
</li>
<li><p>dfs.namenode.num.checkpoints.retained=2<br><em>建议不使用SNN功能,忽略此配置</em></p></li>
<li>
<p>dfs.namenode.num.extra.edits.retained=1000000</p>
<blockquote>
<p>Number of extra edit transactions to retain beyond the minimum needed for an NN restart.</p>
</blockquote>
</li>
<li>
<p>dfs.namenode.max.extra.edits.segments.retained=10000</p>
<blockquote>
<p>Maximum number of extra edit log segments to retain, beyond the minimum edits needed for an NN restart. <em>A segment contains multiple log files.</em></p>
</blockquote>
</li>
<li>
<p>dfs.namenode.delegation.key.update-interval=86400000</p>
<blockquote>
<p>Update interval, in milliseconds, for the delegation-token master key in the NN. <em>Security option; not covered here.</em></p>
</blockquote>
</li>
<li>
<p>dfs.namenode.delegation.token.max-lifetime=604800000</p>
<blockquote>
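<p>Maximum lifetime, in milliseconds, for which a delegation token is valid (the default is 7 days). <em>Security option; not covered here.</em></p>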
</blockquote>
</li>
<li>
<p>dfs.namenode.delegation.token.renew-interval=86400000</p>
<blockquote>
<p>Renewal interval, in milliseconds, for delegation tokens (the default is 1 day). <em>Security option; not covered here.</em></p>
</blockquote>
</li>
<li>
<p>dfs.image.compress=FALSE</p>
<blockquote>
<p>Whether to compress the image file.</p>
</blockquote>
</li>
<li>
<p>dfs.image.compression.codec=org.apache.hadoop.io.compress.DefaultCodec</p>
<blockquote>
<p>Compression codec for the image file. It must be one of the codecs defined in io.compression.codecs.</p>
</blockquote>
</li>
<li>
<p>dfs.image.transfer.timeout=600000</p>
<blockquote>
<p>Timeout for image file transfers, in milliseconds. <em>Not used in an HA setup; can be ignored.</em></p>
</blockquote>
</li>
<li>
<p>dfs.image.transfer.bandwidthPerSec=0</p>
<blockquote>
<p>Maximum bandwidth for image file transfers, in bytes per second. 0 means no limit. <em>Not used in an HA setup; can be ignored.</em></p>
</blockquote>
</li>
<li>
<p>dfs.datanode.max.transfer.threads=4096</p>
<blockquote>
<p>Replaces the old parameter dfs.datanode.max.xcievers.<br>
Maximum number of threads used to transfer data into and out of the DN.</p>
</blockquote>
</li>
<li>
<p>dfs.datanode.readahead.bytes=4193404</p>
<blockquote>
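<p>Disk read-ahead. When the Hadoop native libraries are available, the DN can call posix_fadvise so that the OS pulls data into its page cache. This setting is the number of bytes ahead of the current read position that the DN tries to read. Set it to 0 to disable the feature. Without the native libraries it has no effect.</p>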
</blockquote>
</li>
<li>
<p>dfs.datanode.drop.cache.behind.reads=FALSE</p>
<blockquote>
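<p>In some workloads, especially large reads of data that will not be reused, keeping the data in the OS buffer cache is pointless. The DN can then be configured to drop the data from the cache automatically once it has been sent to the client. The behavior is switched off automatically for reads of small slices (for example HBase random I/O). Freeing the cache can improve performance in some workloads. Without the Hadoop native libraries this has no effect. <em>Looks like a feature worth trying.</em></p>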
</blockquote>
</li>
<li>
<p>dfs.datanode.drop.cache.behind.writes=FALSE</p>
<blockquote>
<p>Similar to dfs.datanode.drop.cache.behind.reads.</p>
</blockquote>
</li>
<li>
<p>dfs.datanode.sync.behind.writes=FALSE</p>
<blockquote>
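<p>If true, after data is written the DN tells the OS to flush all queued data to disk immediately. This differs from the usual OS policy, which may wait up to 30 seconds before triggering a write-back. Without the Hadoop native libraries this has no effect.</p>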
</blockquote>
</li>
<li>
<p>dfs.client.failover.sleep.base.millis=500</p>
<blockquote>
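<p>Expert setting. The wait between failover attempts, in milliseconds, grows exponentially with the number of attempts made so far, with a random factor of +/- 50%. This value is the base of that calculation. The first failover is retried immediately; the second is delayed by at least dfs.client.failover.sleep.base.millis milliseconds; and so on.</p>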
</blockquote>
</li>
<li>
<p>dfs.client.failover.sleep.max.millis=15000</p>
<blockquote>
<p>Expert setting. Upper bound, in milliseconds, on the wait between failover attempts.</p>
</blockquote>
</li>
<li>
<p>dfs.ha.log-roll.period=120</p>
<blockquote>
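<p>How often, in seconds, the StandbyNode asks the Active to roll its EditLog. Because the Standby can only read from finalized log segments, how fresh its data is depends on how often the logs are rolled. A failover also triggers a log roll, so the StandbyNode is brought fully up to date before it becomes Active. The default is 2 minutes.</p>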
</blockquote>
</li>
<li>
<p>dfs.ha.tail-edits.period=60</p>
<blockquote>
<p>How often, in seconds, the StandbyNode checks the shared directory for new edit logs.</p>
</blockquote>
</li>
<li>
<p>dfs.ha.zkfc.port=8019</p>
<blockquote>
<p>RPC port of the zkfc.</p>
</blockquote>
</li>
<li>
<p>dfs.support.append=TRUE</p>
<blockquote>
<p>Whether append is allowed.</p>
</blockquote>
</li>
<li>
<p>dfs.client.use.datanode.hostname=FALSE</p>
<blockquote>
<p>Whether clients should connect to DNs by hostname. By default the IP address is used.</p>
</blockquote>
</li>
<li>
<p>dfs.datanode.use.datanode.hostname=FALSE</p>
<blockquote>
<p>Whether DNs should connect to other DNs by hostname for data transfer. By default the IP address is used.</p>
</blockquote>
</li>
<li>
<p>dfs.client.local.interfaces=null</p>
<blockquote>
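<p>Comma-separated list of network interfaces used for data transfer between the client and DNs. When creating a connection, the client picks one at random and binds its socket to that interface's IP. Entries may be interface names (e.g. "eth0"), subinterface names (e.g. "eth0:0"), or IP addresses (which may be specified using CIDR notation to match a range of IPs).</p>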
</blockquote>
</li>
<li>
<p>dfs.namenode.kerberos.internal.spnego.principal=${dfs.web.authentication.kerberos.principal}</p>
<blockquote>
<p><em>安全选项,暂不关注</em></p>
</blockquote>
</li>
<li>
<p>dfs.secondary.namenode.kerberos.internal.spnego.principal=${dfs.web.authentication.kerberos.principal}</p>
<blockquote>
<p><em>安全选项,暂不关注</em></p>
</blockquote>
</li>
<li>
<p>dfs.namenode.avoid.read.stale.datanode=FALSE</p>
<blockquote>
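<p>Whether to avoid reading from stale DNs, that is, DNs whose heartbeats have not been received within a configured interval. Stale DNs are moved to the end of the list of nodes returned for reading. <em>Worth enabling.</em></p>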
</blockquote>
</li>
<li>
<p>dfs.namenode.stale.datanode.interval=30000</p>
<blockquote>
<p>Time, in milliseconds, without a heartbeat after which the NN marks a DN as stale. The value must not be too small, or DNs will be flagged as stale too often.</p>
<blockquote>
<p><strong>We suggest 1 minute.</strong></p>
</blockquote>
</blockquote>
</li>
<li>
<p>dfs.namenode.write.stale.datanode.ratio=0.5f</p>
<blockquote>
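<p>Once the fraction of DNs marked stale exceeds this threshold, the NN stops avoiding stale DNs for writes, so that the few remaining healthy, writable DNs do not become hot spots.</p>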
</blockquote>
</li>
<li>
<p>dfs.namenode.invalidate.work.pct.per.iteration=0.32f</p>
<blockquote>
<p>Advanced property. Change with care.</p>
</blockquote>
</li>
<li>
<p>dfs.namenode.replication.work.multiplier.per.iteration=2</p>
<blockquote>
<p>Advanced property. Change with care.</p>
</blockquote>
</li>
<li>
<p>dfs.webhdfs.enabled=FALSE</p>
<blockquote>
<p>Enables WebHDFS (REST API) on the NN and DNs.</p>
<blockquote>
<p><strong>Worth enabling to try out.</strong></p>
</blockquote>
</blockquote>
</li>
<li>
<p>hadoop.fuse.connection.timeout=300</p>
<blockquote>
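<p>Timeout, in seconds, for libhdfs connection objects cached in fuse_dfs. A small value uses less memory; a large value speeds up access by avoiding the creation of new connection objects.</p>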
</blockquote>
</li>
<li>
<p>hadoop.fuse.timer.period=5</p>
<blockquote>
<p>Interval, in seconds, between cache expiry checks in fuse_dfs.</p>
</blockquote>
</li>
<li>
<p>dfs.metrics.percentiles.intervals=null</p>
<blockquote>
<p>Comma-delimited set of integers denoting the desired rollover intervals (in seconds) for percentile latency metrics on the Namenode and Datanode. By default, percentile latency metrics are disabled.</p>
</blockquote>
</li>
<li>
<p>dfs.encrypt.data.transfer=FALSE</p>
<blockquote>
<p>Whether to encrypt data transfers. It only needs to be set on the NN and DNs; clients work it out on their own.</p>
</blockquote>
</li>
<li>
<p>dfs.encrypt.data.transfer.algorithm=null</p>
<blockquote>
<p>May be set to "3des" or "rc4". If unset, the system default is used, which is usually 3DES. 3DES is more secure; RC4 is faster.</p>
</blockquote>
</li>
<li>
<p>dfs.datanode.hdfs-blocks-metadata.enabled=TRUE</p>
<blockquote>
<p>Boolean that enables DN-side support for the DistributedFileSystem#getFileBlockStorageLocations API.</p>
</blockquote>
</li>
<li>
<p>dfs.client.file-block-storage-locations.num-threads=10</p>
<blockquote>
<p>Number of threads used for the parallel RPCs made in DistributedFileSystem#getFileBlockStorageLocations().</p>
</blockquote>
</li>
<li>
<p>dfs.client.file-block-storage-locations.timeout=60</p>
<blockquote>
<p>Timeout (in seconds) for the parallel RPCs made in DistributedFileSystem#getFileBlockStorageLocations().</p>
</blockquote>
</li>
<li>
<p>dfs.domain.socket.path=/var/run/hadoop-hdfs/dn._PORT</p>
<blockquote>
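<p>Optional. Path to a UNIX domain socket used to speed up communication between the DN and local HDFS clients. If the string "_PORT" appears in the path, it is replaced by the DN's TCP port.</p>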
</blockquote>
</li>
</ul>
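<p>The dfs.client.failover.sleep.base.millis and dfs.client.failover.sleep.max.millis entries above describe a backoff between failover attempts. The following is only a sketch of one way to read them (exponential growth from the base, a cap, and a random factor of +/- 50%), not Hadoop's actual code. The function name <code>failover_sleep_millis</code>, the doubling per attempt, and the choice to apply the cap before the random factor are assumptions made for illustration.</p>

```python
import random


def failover_sleep_millis(attempt, base_millis=500, max_millis=15000,
                          rng=random.random):
    """Illustrative backoff: attempt 0 is the first failover and is not delayed.

    Assumption: the capped exponential value is scaled by a random factor
    in [0.5, 1.5), i.e. "+/- 50%". The cap is applied before that factor.
    """
    if attempt <= 0:
        return 0
    capped = min(max_millis, base_millis * (2 ** attempt))
    return int(capped * (0.5 + rng()))


if __name__ == "__main__":
    # Defaults taken from the listing: base 500 ms, max 15000 ms.
    for attempt in range(6):
        print(attempt, failover_sleep_millis(attempt))
```

<p>With these defaults the second failover always waits at least 500 ms, which matches the "at least dfs.client.failover.sleep.base.millis" wording, and later attempts level off around the configured maximum.</p>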
<h4><a>yarn-default.xml</a></h4>
<ul>
<li>
<p>yarn.app.mapreduce.am.env=null</p>
<blockquote>
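<p>Environment variables the user adds for the MR AM. For example:</p>
<ol>
<li>A=foo sets the environment variable A to foo</li>
<li>B=$B:c inherits the TaskTracker's B variable and appends :c to it</li>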
</ol>
</blockquote>
</li>
<li>
<p><strong>yarn.app.mapreduce.am.command-opts=-Xmx1024m</strong></p>
<blockquote>
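<p>Java opts for the MR AM. The following symbol is interpolated:</p>
<ul>
<li>@taskid@ is replaced by the current TaskID. Any other occurrence of '@' is left unchanged. For example, to have GC logs written per task under /tmp, 'value' can be set to: -Xmx1024m -verbose:gc -Xloggc:/tmp/@taskid@.gc </li>
<li>Using -Djava.library.path can cause programs to stop working when the Hadoop native libraries are in use. Such paths should instead be set through LD_LIBRARY_PATH in the MR JVM environment, using the mapreduce.map.env and mapreduce.reduce.env settings.</li>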
</ul>
</blockquote>
</li>
<li>
<p><strong>yarn.app.mapreduce.am.resource.mb=1536</strong></p>
<blockquote>
<p>Memory requested by the AM, in MB.</p>
</blockquote>
</li>
<li>
<p><strong>yarn.resourcemanager.address=0.0.0.0:8032</strong></p>
<blockquote>
<p>RM address:port.</p>
</blockquote>
</li>
<li>
<p><strong>yarn.resourcemanager.scheduler.address=0.0.0.0:8030</strong></p>
<blockquote>
<p>Scheduler address:port.</p>
</blockquote>
</li>
<li>
<p><strong>yarn.admin.acl=*</strong></p>
<blockquote>
<p>ACL of who can administer the YARN cluster.</p>
</blockquote>
</li>
<li>
<p><strong>yarn.resourcemanager.admin.address=0.0.0.0:8033</strong></p>
<blockquote>
<p>Address:port of the RM admin interface.</p>
</blockquote>
</li>
<li>
<p><strong>yarn.resourcemanager.am.max-retries=1</strong></p>
<blockquote>
<p>Maximum number of AM retries. Server-side setting; takes effect after a restart.</p>
<blockquote>
<p><strong>Suggested value: 4</strong></p>
</blockquote>
</blockquote>
</li>
<li>
<p><strong>yarn.resourcemanager.nodes.include-path=null</strong></p>
<blockquote>
<p>Path to the file listing the nodes to include.</p>
</blockquote>
</li>
<li>
<p><strong>yarn.resourcemanager.nodes.exclude-path=null</strong></p>
<blockquote>
<p>Path to the file listing the nodes to exclude. <em>If it conflicts with the include file, the include file takes precedence.</em></p>
</blockquote>
</li>
<li>
<p><strong>yarn.resourcemanager.scheduler.class=org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.FifoScheduler</strong></p>
<blockquote>
<p>Scheduler implementation class.</p>
<blockquote>
<p><strong>We suggest the Fair Scheduler.</strong></p>
</blockquote>
</blockquote>
</li>
<li>
<p><strong>yarn.scheduler.minimum-allocation-mb=1024</strong></p>
<blockquote>
<p>Minimum memory, in MB, that a container can request from the RM. A request below this value is allocated this amount. <em>The default is on the large side.</em></p>
</blockquote>
</li>
<li>
<p><strong>yarn.scheduler.maximum-allocation-mb=8192</strong></p>
<blockquote>
<p>Maximum memory, in MB, that a container can request from the RM. A request above this value gets at most this amount.</p>
</blockquote>
</li>
<li>
<p><strong>yarn.resourcemanager.recovery.enabled=FALSE</strong></p>
<blockquote>
<p>Whether to enable RM state recovery. If true, yarn.resourcemanager.store.class must be specified. <em>Worth trying.</em></p>
</blockquote>
</li>
<li>
<p><strong>yarn.resourcemanager.store.class=null</strong></p>
<blockquote>
<p>Class used as the persistent store. <em>Worth trying.</em></p>
</blockquote>
</li>
<li>
<p><strong>yarn.resourcemanager.max-completed-applications=10000</strong></p>
<blockquote>
<p>Maximum number of completed apps the RM keeps. They are held in memory.</p>
</blockquote>
</li>
<li>
<p><strong>yarn.nodemanager.address=0.0.0.0:0</strong></p>
<blockquote>
<p>Address:port of the container manager in the NM.</p>
</blockquote>
</li>
<li>
<p><strong>yarn.nodemanager.env-whitelist=JAVA_HOME,HADOOP_COMMON_HOME,HADOOP_HDFS_HOME,HADOOP_CONF_DIR,YARN_HOME</strong></p>
<blockquote>
<p>Environment variables that containers may override rather than inherit from the NM's defaults.</p>
</blockquote>
</li>
<li>
<p><strong>yarn.nodemanager.delete.debug-delay-sec=0</strong></p>
<blockquote>
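<p>Number of seconds after an app finishes before the NM's deletion service removes the app's local file directory and log directory. To diagnose problems, set this to a large enough value (for example 600, i.e. 10 minutes) so the directories stay accessible. Changing it requires an NM restart. The root of the YARN apps' working directories is yarn.nodemanager.local-dirs, and the root of their log directories is yarn.nodemanager.log-dirs.</p>
<blockquote>
<p><strong>Useful when debugging.</strong></p>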
</blockquote>
</blockquote>
</li>
<li>
<p><strong>yarn.nodemanager.local-dirs=${hadoop.tmp.dir}/nm-local-dir</strong></p>
<blockquote>
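<p>List of directories for local files. An app's local file directory is found at ${yarn.nodemanager.local-dirs}/usercache/${user}/appcache/application_${appid}. Each container's working directory is a subdirectory of it named container_${contid}.</p>
<blockquote>
<p>Very important: configure directories on several disks to balance I/O.</p>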
</blockquote>
</blockquote>
</li>
<li>
<p><strong>yarn.nodemanager.log-dirs=${yarn.log.dir}/userlogs</strong></p>
<blockquote>
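<p>Where container logs are stored. An app's local log directory is ${yarn.nodemanager.log-dirs}/application_${appid}. Each container's log directory sits below it and is named container_${contid}. Each container directory holds the stderr, stdin, and syslog files produced by that container.</p>
<blockquote>
<p><strong>Very important: configure directories on several disks.</strong></p>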
</blockquote>
</blockquote>
</li>
<li>
<p><strong>yarn.log-aggregation-enable=FALSE</strong></p>
<blockquote>
<p>Whether to enable log aggregation. <em>We suggest enabling it.</em></p>
</blockquote>
</li>
<li>
<p><strong>yarn.log-aggregation.retain-seconds=-1</strong></p>
<blockquote>
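<p>How long to keep aggregated logs, in seconds. Older logs are deleted; -1 disables deletion. Note that a value that is too small will spam the NN. <em>Suggested: 3 to 7 days; 7 days = 7 * 86400 = 604800.</em></p>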
</blockquote>
</li>
<li>
<p><strong>yarn.nodemanager.log.retain-seconds=10800</strong></p>
<blockquote>
<p>How long to keep user logs, in seconds. Only applies when log aggregation is disabled.</p>
<blockquote>
<p><strong>Suggested: 7 days</strong></p>
</blockquote>
</blockquote>
</li>
<li>
<p><strong>yarn.nodemanager.remote-app-log-dir=/tmp/logs</strong></p>
<blockquote>
<p>Directory on HDFS where logs are aggregated to.</p>
<blockquote>
<p><strong>On clusters with permission checking enabled, watch the permissions of this HDFS directory.</strong></p>
</blockquote>
</blockquote>
</li>
<li>
<p><strong>yarn.nodemanager.remote-app-log-dir-suffix=logs</strong></p>
<blockquote>
<p>Suffix of the aggregated log directory. The directory is created at ${yarn.nodemanager.remote-app-log-dir}/${user}/${thisParam}.</p>
</blockquote>
</li>
<li>
<p><strong>yarn.nodemanager.resource.memory-mb=8192</strong></p>
<blockquote>
<p>Amount of physical memory on the NM, in MB, that can be allocated to containers.</p>
</blockquote>
</li>
<li>
<p><strong>yarn.nodemanager.vmem-pmem-ratio=2.1</strong></p>
<blockquote>
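<p>Ratio of virtual to physical memory used when setting container memory limits. Container allocations are expressed as physical memory, and a container's virtual memory usage may exceed its allocation by up to this ratio. <em>Consider a larger value such as 4.1 or 8.1; the "virtual memory" here is virtual address space in use, not actual RAM.</em></p>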
</blockquote>
</li>
<li>
<p><strong>yarn.nodemanager.webapp.address=0.0.0.0:8042</strong></p>
<blockquote>
<p>Address and port of the NM web UI.</p>
</blockquote>
</li>
<li>
<p><strong>yarn.nodemanager.log-aggregation.compression-type=none</strong></p>
<blockquote>
<p>Compression type for aggregated logs. Aggregated logs are TFile files (HADOOP-3315). Possible values include none, lzo, and gz.</p>
<blockquote>
<p><strong>Worth trying.</strong></p>
</blockquote>
</blockquote>
</li>
<li>
<p><strong>yarn.nodemanager.aux-services=null</strong></p>
<blockquote>
<p><em>Set this to mapreduce.shuffle; it is required for running MR on YARN.</em></p>
</blockquote>
</li>
<li>
<p><strong>yarn.nodemanager.aux-services.mapreduce.shuffle.class=org.apache.hadoop.mapred.ShuffleHandler</strong></p>
<blockquote>
<p><em>Goes together with yarn.nodemanager.aux-services.</em></p>
</blockquote>
</li>
<li>
<p><strong>mapreduce.job.jar=null</strong></p>
<blockquote>
<p>Job client parameter. The jar file of the submitted job.</p>
</blockquote>
</li>
<li>
<p><strong>mapreduce.job.hdfs-servers=${fs.defaultFS}</strong></p>
<blockquote>
<p>Job client parameter.</p>
</blockquote>
</li>
<li>
<p><strong>yarn.application.classpath=$HADOOP_CONF_DIR,$HADOOP_COMMON_HOME/share/hadoop/common/*,$HADOOP_COMMON_HOME/share/hadoop/common/lib/*,$HADOOP_HDFS_HOME/share/hadoop/hdfs/*,$HADOOP_HDFS_HOME/share/hadoop/hdfs/lib/*,$YARN_HOME/share/hadoop/yarn/*,$YARN_HOME/share/hadoop/yarn/lib/*</strong></p>
<blockquote>
<p>CLASSPATH for YARN apps, as a comma-separated list.</p>
</blockquote>
</li>
<li>
<p>yarn.app.mapreduce.am.job.task.listener.thread-count=30</p>
<blockquote>
<p>Number of threads the MR AM uses to handle RPC calls.</p>
</blockquote>
</li>
<li>
<p>yarn.app.mapreduce.am.job.client.port-range=null</p>
<blockquote>
<p>Range of ports the MR AM may bind to, for example 50000-50050,50100-50200. Leave it empty (the default, null) to allow all available ports.</p>
</blockquote>
</li>
<li>
<p>yarn.app.mapreduce.am.job.committer.cancel-timeout=60000</p>
<blockquote>
<p>Time, in milliseconds, to wait for the output committer to cancel an operation when the job is killed.</p>
</blockquote>
</li>
<li>
<p>yarn.app.mapreduce.am.scheduler.heartbeat.interval-ms=1000</p>
<blockquote>
<p>Interval, in milliseconds, at which the MR AM sends heartbeats to the RM.</p>
</blockquote>
</li>
<li>
<p>yarn.app.mapreduce.client-am.ipc.max-retries=3</p>
<blockquote>
<p>Number of times the client retries connecting to the AM before going back to the RM for the application status.</p>
</blockquote>
</li>
<li>
<p>yarn.app.mapreduce.client.max-retries=3</p>
<blockquote>
<p>Number of times the client retries connecting to the RM/HS/AM. This works at a layer above the ipc.</p>
</blockquote>
</li>
<li>
<p>yarn.ipc.client.factory.class=null</p>
<blockquote>
<p>创建客户端IPC类的工厂类</p>
</blockquote>
</li>
<li>
<p>yarn.ipc.serializer.type=protocolbuffers</p>
<blockquote>
<p>使用哪种序列化类</p>
</blockquote>
</li>
<li>
<p>yarn.ipc.server.factory.class=null</p>
<blockquote>
<p>创建IPC服务类的工厂类</p>
</blockquote>
</li>
<li>
<p>yarn.ipc.exception.factory.class=null</p>
<blockquote>
<p>创建IPC异常的工厂类</p>
</blockquote>
</li>
<li>
<p>yarn.ipc.record.factory.class=null</p>
<blockquote>
<p>创建序列化记录的工厂类</p>
</blockquote>
</li>
<li>
<p>yarn.ipc.rpc.class=org.apache.hadoop.yarn.ipc.HadoopYarnProtoRPC</p>
<blockquote>
<p>RPC类实现类</p>
</blockquote>
</li>
<li>
<p>yarn.resourcemanager.client.thread-count=50</p>
<blockquote>
<p>RM用来处理客户端请求的线程数</p>
</blockquote>
</li>
<li>
<p>yarn.am.liveness-monitor.expiry-interval-ms=600000</p>
<blockquote>
<p>Expiry interval for AM reporting, in milliseconds.</p>
</blockquote>
</li>
<li>
<p>yarn.resourcemanager.principal=null</p>
<blockquote>
<p><em>安全选项</em></p>
</blockquote>
</li>
<li>
<p>yarn.resourcemanager.scheduler.client.thread-count=50</p>
<blockquote>
<p>调度器用于处理请求的线程数</p>
</blockquote>
</li>
<li>
<p>yarn.resourcemanager.webapp.address=0.0.0.0:8088</p>
<blockquote>
<p>RM的网页接口地址:端口</p>
</blockquote>
</li>
<li>
<p>yarn.resourcemanager.resource-tracker.address=0.0.0.0:8031</p>
<blockquote>
<p>Address:port of the RM's resource tracker interface, which the NMs connect to.</p>
</blockquote>
</li>
<li>
<p>yarn.acl.enable=TRUE</p>
<blockquote>
<p>开启访问控制</p>
</blockquote>
</li>
<li>
<p>yarn.resourcemanager.admin.client.thread-count=1</p>
<blockquote>
<p>RM管理端口处理事务的线程数</p>
</blockquote>
</li>
<li>
<p>yarn.resourcemanager.amliveliness-monitor.interval-ms=1000</p>
<blockquote>
<p>RM检查AM存活的间隔</p>
</blockquote>
</li>
<li>
<p>yarn.resourcemanager.container.liveness-monitor.interval-ms=600000</p>
<blockquote>
<p>Interval, in milliseconds, at which container liveness is checked. <em>A shorter value is suggested, for example 3 minutes.</em></p>
</blockquote>
</li>
<li>
<p>yarn.resourcemanager.keytab=/etc/krb5.keytab</p>
<blockquote>
<p>安全选项</p>
</blockquote>
</li>
<li>
<p>yarn.nm.liveness-monitor.expiry-interval-ms=600000</p>
<blockquote>
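<p>How long the RM waits before declaring an NM dead.<br><em>This is a passive check: the RM waits for contact rather than probing, and measures how long the NM has been out of touch.</em><br><em>After 10 minutes without any activity the NM is considered dead.</em></p>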
</blockquote>
</li>
<li>
<p>yarn.resourcemanager.nm.liveness-monitor.interval-ms=1000</p>
<blockquote>
<p>Interval at which the RM checks NM liveness.</p>
</blockquote>
</li>
<li>
<p>yarn.resourcemanager.resource-tracker.client.thread-count=50</p>
<blockquote>
<p>Number of threads handling resource tracker calls.</p>
</blockquote>
</li>
<li>
<p>yarn.resourcemanager.delayed.delegation-token.removal-interval-ms=30000</p>
<blockquote>
<p>安全选项</p>
</blockquote>
</li>
<li>
<p>yarn.resourcemanager.application-tokens.master-key-rolling-interval-secs=86400</p>
<blockquote>
<p>安全选项</p>
</blockquote>
</li>
<li>
<p>yarn.resourcemanager.container-tokens.master-key-rolling-interval-secs=86400</p>
<blockquote>
<p>安全选项</p>
</blockquote>
</li>
<li>
<p>yarn.nodemanager.admin-env=MALLOC_ARENA_MAX=$MALLOC_ARENA_MAX</p>
<blockquote>
<p>Environment variables that should be passed from the NM to containers.</p>
</blockquote>
</li>
<li>
<p>yarn.nodemanager.container-executor.class=org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor</p>
<blockquote>
<p>启动containers的类。</p>
</blockquote>
</li>
<li>
<p>yarn.nodemanager.container-manager.thread-count=20</p>
<blockquote>
<p>用于container管理的线程数</p>
</blockquote>
</li>
<li>
<p>yarn.nodemanager.delete.thread-count=4</p>
<b
Using fabric to push your public key to servers
https://segmentfault.com/a/1190000000703983
2014-09-30T21:09:54+08:00
2014-09-30T21:09:54+08:00
timger
https://segmentfault.com/u/timger
0
<p>fabric is a handy deployment helper.<br>
The code below does what ssh-copy-id does:<br>
it uploads your <code>id_rsa.pub</code> to a group of servers in one go.</p>
<pre><code>from fabric.api import put, run, task


@task
def copy_id(file="~/.ssh/id_rsa.pub"):
    """fab push public key (ssh-copy-id)"""
    # Upload the local public key to a temporary path on the remote host.
    put(file, "/tmp/id.pub")
    try:
        run("if [ ! -d ~/.ssh ]; then mkdir -p ~/.ssh; fi")
        # No authorized_keys yet: install the key as the first entry.
        run("if [ ! -f ~/.ssh/authorized_keys ]; then cp /tmp/id.pub ~/.ssh/authorized_keys && chmod 0600 ~/.ssh/authorized_keys; fi")
        # Append the key only if it adds a line that is not already present.
        run("cat ~/.ssh/authorized_keys /tmp/id.pub | sort -u > /tmp/uniq.authorized_keys")
        run("if [ `cat ~/.ssh/authorized_keys | wc -l` -lt `cat /tmp/uniq.authorized_keys | wc -l` ]; then cat /tmp/id.pub >> ~/.ssh/authorized_keys; fi")
    finally:
        # Always clean up the temporary files.
        run("rm -f /tmp/id.pub /tmp/uniq.authorized_keys")
</code></pre>
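<p>The remote shell commands above amount to "append the key only if it is not already there". As a rough local illustration of that idea, here is a small pure-Python sketch; <code>merge_authorized_keys</code> is a hypothetical helper written for this note, not part of fabric, and the key strings are made up.</p>

```python
def merge_authorized_keys(existing: str, new_key: str) -> str:
    """Return authorized_keys content with new_key appended only if absent."""
    key = new_key.strip()
    present = [line.strip() for line in existing.splitlines() if line.strip()]
    if key in present:
        # Already authorized: leave the file content untouched.
        return existing
    # Make sure the existing content ends with a newline before appending.
    if existing and not existing.endswith("\n"):
        existing += "\n"
    return existing + key + "\n"


if __name__ == "__main__":
    content = merge_authorized_keys("", "ssh-rsa AAAA-example user@host")
    # A second call with the same key changes nothing.
    print(merge_authorized_keys(content, "ssh-rsa AAAA-example user@host") == content)
```

<p>Running the task several times against the same host is therefore harmless, which is the property that matters when pushing keys to a whole group of servers.</p>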
From Python to Scala: a concise tutorial
https://segmentfault.com/a/1190000000660764
2014-09-06T21:13:50+08:00
2014-09-06T21:13:50+08:00
timger
https://segmentfault.com/u/timger
8
<p>I spent a weekend of overtime translating the Python-to-Scala documentation.<br>
It is available at <a rel="nofollow" href="http://www.timger.info/PythonToScala/index.html">http://www.timger.info/PythonToScala/index.html</a><br>
PDF download: <a rel="nofollow" href="http://cachetiger.qiniudn.com/python2scala.pdf">http://cachetiger.qiniudn.com/python2scala.pdf</a></p>
Writing technical documentation with gitBook
https://segmentfault.com/a/1190000000660756
2014-09-06T21:00:08+08:00
2014-09-06T21:00:08+08:00
timger
https://segmentfault.com/u/timger
1
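<p>gitbook is a static-site writing tool built on git and GitHub. Its official site hosts quite a few good books;<br>
see <a rel="nofollow" href="https://www.gitbook.io/">https://www.gitbook.io/</a></p>
<p>Below are some notes on writing a book with gitbook.</p>
<h2>Editing in the GUI</h2>
<p>Download the editor: <code>https://github.com/GitbookIO/editor</code></p>
<h2>Command line</h2>
<p>Only the main command-line workflow is recorded below; the article-editing features are left out,<br>because posting images on <a rel="nofollow" href="http://segmentfault.com/">http://segmentfault.com/</a> is too much trouble.</p>
<h3>Installation</h3>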
<pre><code>npm -g install gitbook
npm -g install gitbook-plugin
npm install gitbook-plugin-disqus
npm install gitbook-plugin-ga
</code></pre>
<h3>Command options</h3>
<pre><code>timgerdeMac-mini:PythonToScala_Zh timger$ gitbook
Usage: gitbook [options] [command]
Commands:
build [options] [source_dir] Build a gitbook from a directory
serve [options] [source_dir] Build then serve a gitbook from a directory
pdf [options] [source_dir] Build a gitbook as a PDF
epub [options] [source_dir] Build a gitbook as a ePub book
mobi [options] [source_dir] Build a gitbook as a Mobi book
init [source_dir] Create files and folders based on contents of SUMMARY.md
publish [source_dir] Publish content to the associated gitbook.io book
git:remote [source_dir] [book_id] Adds a git remote to a book repository
Options:
-h, --help output usage information
-V, --version output the version number
</code></pre>
<h3>config</h3>
<pre><code>timgerdeMac-mini:PythonToScala_Zh timger$ cat book.json
{
    "plugins": ["ga", "disqus"],
    "pluginsConfig": {
        "ga": {
            "token": "UA-29124639-6"
        },
        "disqus": {
            "shortName": "yishenggudou"
        }
    }
}
</code></pre>
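<p>Since <code>book.json</code> is plain JSON, a syntax slip such as a missing comma between two keys is enough to make the file unparseable. A minimal sketch for checking the file before building, using only the standard library; <code>check_book_config</code> is a hypothetical helper and not part of gitbook, and the check that every listed plugin has a <code>pluginsConfig</code> entry is an assumption of this example rather than a gitbook requirement.</p>

```python
import json


def check_book_config(text: str) -> dict:
    """Parse book.json content and report plugins without a config entry."""
    # json.loads raises json.JSONDecodeError (a ValueError) on bad syntax.
    config = json.loads(text)
    plugins = config.get("plugins", [])
    plugin_config = config.get("pluginsConfig", {})
    missing = [name for name in plugins if name not in plugin_config]
    return {"plugins": plugins, "missing_config": missing}


SAMPLE = """
{
    "plugins": ["ga", "disqus"],
    "pluginsConfig": {
        "ga": {"token": "UA-XXXXXXXX-X"},
        "disqus": {"shortName": "example"}
    }
}
"""

if __name__ == "__main__":
    print(check_book_config(SAMPLE))
```

<p>In practice the text would come from reading <code>book.json</code> in the book directory; the sample string only keeps the sketch self-contained.</p>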
<h3>build</h3>
<pre><code>gitbook build ./ -o ./build --config=book.json
</code></pre>
<h3>Publishing</h3>
<pre><code>cp -vrf ../PythonToScala_Zh/build/* ./
git add -f ./*
. ~/.bashrc
. ~/.bash_profile
git_commit_msg "pub"
push_auto_branch
</code></pre>
<h3>Online address</h3>
<p><code>http://www.timger.info/PythonToScala/index.html</code></p>
<h3>Building the PDF</h3>
<pre><code>timgerdeMac-mini:PythonToScala_Zh timger$ gitbook pdf ./ -o ./python2scala.pdf --config=book.json
Starting build ...
Successfully built!
</code></pre>
Learn to scrape data in minutes
https://segmentfault.com/a/1190000000648301
2014-08-25T21:39:34+08:00
2014-08-25T21:39:34+08:00
timger
https://segmentfault.com/u/timger
1
<h2>Fetching the HTML</h2>
<pre><code>import requests
html=requests.get('http://sc.hkex.com.hk/TuniS/www.hkex.com.hk/chi/market/sec_tradinfo/stockcode/eisdeqty_c.htm').content
</code></pre>
<h2>Parsing the data</h2>
<pre><code>from pyquery import PyQuery as Q
q=Q(html)
tr = q('tr.tr_normal')
</code></pre>
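<p>The selector <code>tr.tr_normal</code> picks every table row carrying the class <code>tr_normal</code>. To make that concrete without pyquery, here is a sketch of the same extraction with the standard library's <code>html.parser</code>. This swaps in a different technique purely for illustration; <code>RowExtractor</code>, <code>extract_rows</code>, and the sample markup are made up for this note and are not taken from the page being scraped.</p>

```python
from html.parser import HTMLParser


class RowExtractor(HTMLParser):
    """Collect the cell texts of every <tr class="tr_normal"> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None       # cells of the row being read, or None
        self._in_cell = False  # True while inside a <td> of a matching row

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if tag == "tr" and "tr_normal" in classes:
            self._row = []
        elif tag == "td" and self._row is not None:
            self._in_cell = True
            self._row.append("")

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_cell = False
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None

    def handle_data(self, data):
        # Text inside nested tags such as <a> is still part of the cell.
        if self._in_cell and self._row:
            self._row[-1] += data.strip()


def extract_rows(html):
    parser = RowExtractor()
    parser.feed(html)
    return parser.rows


SAMPLE = (
    '<table><tr class="tr_head"><td>Code</td><td>Name</td></tr>'
    '<tr class="tr_normal"><td>00001</td><td><a href="#">Example Co</a></td></tr>'
    '</table>'
)

if __name__ == "__main__":
    print(extract_rows(SAMPLE))
```

<p>pyquery stays the shorter option for real pages; the sketch only shows what ends up in each selected row.</p>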
<h2>Importing into the db</h2>
<pre><code class="lang-ipython">db=zpool['mysql+mysqldb://root:pwd@dbhost:3306/glhdb']
sqls = ["INSERT INTO `stocks_code` (`name`, `code`) VALUES ('{0}','{1}')".format(Q(i)('td')[0].text.encode('utf8','ignore'), ((Q(Q(i)('td')[1])('a') and Q(Q(i)('td')[1])('a')[0].text) or u'').encode('utf8','ignore').strip(')').strip('\'').replace('\'',"\\'")) for i in tr[0:-3]]
[db.execute(text(i)) for i in sqls]
</code></pre>
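<p>Building each <code>INSERT</code> with <code>str.format</code> means quotes in the scraped text have to be escaped by hand. A common alternative is to pass the values as query parameters and let the driver handle quoting. The sketch below uses the standard library's <code>sqlite3</code> with an in-memory database as a stand-in for the MySQL connection, so it runs anywhere; the table name comes from the snippet above, while <code>insert_stock_codes</code> and the sample rows are made up for illustration.</p>

```python
import sqlite3


def insert_stock_codes(conn, rows):
    """Insert (name, code) pairs with bound parameters; return the row count."""
    conn.execute("CREATE TABLE IF NOT EXISTS stocks_code (name TEXT, code TEXT)")
    # The ? placeholders are filled by the driver, so no manual escaping is needed.
    conn.executemany("INSERT INTO stocks_code (name, code) VALUES (?, ?)", rows)
    conn.commit()
    return conn.execute("SELECT COUNT(*) FROM stocks_code").fetchone()[0]


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    sample_rows = [("Example Holdings", "00001"), ("O'Quote Ltd", "00002")]
    print(insert_stock_codes(connection, sample_rows))
```

<p>With a MySQL driver the placeholder syntax is typically <code>%s</code> rather than <code>?</code>, but the idea is the same: the SQL text stays fixed and the values travel separately.</p>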
Building Spark with idea and maven
https://segmentfault.com/a/1190000000640981
2014-08-19T17:16:47+08:00
2014-08-19T17:16:47+08:00
timger
https://segmentfault.com/u/timger
0
<p>References:<br><a rel="nofollow" href="http://blog.cloudera.com/blog/2014/04/how-to-run-a-simple-apache-spark-app-in-cdh-5/">http://blog.cloudera.com/blog/2014/04/how-to-run-a-simple-apache-spark-app-in-cdh-5/</a><br><a rel="nofollow" href="https://github.com/sryza/simplesparkapp">https://github.com/sryza/simplesparkapp</a><br><a rel="nofollow" href="https://github.com/mesos/spark/pull/310">https://github.com/mesos/spark/pull/310</a></p>
<p>Spark here runs on a CDH5 setup.</p>
<h2>maven configuration</h2>
<p><code>pom.xml</code> is based on the articles above, with some modifications<br>
(scala-tools).</p>
<pre><code><?xml version="1.0" encoding="UTF-8"?>
<!--
Copyright (c) 2014, Cloudera, Inc. All Rights Reserved.
Cloudera, Inc. licenses this file to you under the Apache License,
Version 2.0 (the "License"). You may not use this file except in
compliance with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
This software is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR
CONDITIONS OF ANY KIND, either express or implied. See the License for
the specific language governing permissions and limitations under the
License.
-->
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
<modelVersion>4.0.0</modelVersion>
<groupId>com.timger.sparkwordcount</groupId>
<artifactId>sparkwordcount</artifactId>
<version>0.0.1-SNAPSHOT</version>
<packaging>jar</packaging>
<name>"Spark Word Count"</name>
<repositories>
<repository>
<id>scala-tools.org</id>
<name>Scala-tools Maven2 Repository</name>
<!---
<url>http://scala-tools.org/repo-releases</url>
-->
<url>http://oss.sonatype.org/content/groups/scala-tools/</url>
</repository>
<repository>
<id>maven-hadoop</id>
<name>Hadoop Releases</name>
<url>http://repository.cloudera.com/content/repositories/releases/</url>
</repository>
<repository>
<id>cloudera-repos</id>
<name>Cloudera Repos</name>
<url>http://repository.cloudera.com/artifactory/cloudera-repos/</url>
</repository>
</repositories>
<pluginRepositories>
<pluginRepository>
<id>scala-tools.org</id>
<name>Scala-tools Maven2 Repository</name>
<!--
<url>http://scala-tools.org/repo-releases</url>
-->
<url>http://oss.sonatype.org/content/groups/scala-tools/</url>
</pluginRepository>
</pluginRepositories>
<properties>
<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
<project.reporting.outputEncoding>UTF-8</project.reporting.outputEncoding>
</properties>
<build>
<plugins>
<plugin>
<groupId>org.scala-tools</groupId>
<artifactId>maven-scala-plugin</artifactId>
<version>2.15.2</version>
<executions>
<execution>
<goals>
<goal>compile</goal>
</goals>
</execution>
</executions>
</plugin>
<plugin>
<artifactId>maven-compiler-plugin</artifactId>
<version>3.1</version>
<configuration>
<source>1.6</source>
<target>1.6</target>
</configuration>
</plugin>
</plugins>
</build>
<dependencies>
<dependency>
<groupId>org.scala-lang</groupId>
<artifactId>scala-library</artifactId>
<version>2.10.3</version>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-core_2.10</artifactId>
<version>0.9.0-cdh5.0.1</version>
</dependency>
</dependencies>
</project>
</code></pre>
<p>Build</p>
<pre><code>timgerdeMac-mini:SparkTest timger$ mvn package
[INFO] Scanning for projects...
[INFO]
[INFO] Using the builder org.apache.maven.lifecycle.internal.builder.singlethreaded.SingleThreadedBuilder with a thread count of 1
[INFO]
[INFO] ------------------------------------------------------------------------
[INFO] Building "Spark Word Count" 0.0.1-SNAPSHOT
[INFO] ------------------------------------------------------------------------
[INFO]
[INFO] --- maven-resources-plugin:2.6:resources (default-resources) @ sparkwordcount ---
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] Copying 0 resource
[INFO]
[INFO] --- maven-compiler-plugin:3.1:compile (default-compile) @ sparkwordcount ---
[INFO] Nothing to compile - all classes are up to date
[INFO]
[INFO] --- maven-scala-plugin:2.15.2:compile (default) @ sparkwordcount ---
[INFO] Checking for multiple versions of scala
[WARNING] Expected all dependencies to require Scala version: 2.10.3
[WARNING] com.timger.sparkwordcount:sparkwordcount:0.0.1-SNAPSHOT requires scala version: 2.10.3
[WARNING] com.twitter:chill_2.10:0.3.1 requires scala version: 2.10.0
[WARNING] Multiple versions of scala libraries detected!
[INFO] includes = [**/*.scala,**/*.java,]
[INFO] excludes = []
[INFO] Nothing to compile - all classes are up to date
[INFO]
[INFO] --- maven-resources-plugin:2.6:testResources (default-testResources) @ sparkwordcount ---
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] skip non existing resourceDirectory /Volumes/MACEXT/GitHub/SparkTest/src/test/resources
[INFO]
[INFO] --- maven-compiler-plugin:3.1:testCompile (default-testCompile) @ sparkwordcount ---
[INFO] Nothing to compile - all classes are up to date
[INFO]
[INFO] --- maven-surefire-plugin:2.12.4:test (default-test) @ sparkwordcount ---
[INFO] No tests to run.
[INFO]
[INFO] --- maven-jar-plugin:2.4:jar (default-jar) @ sparkwordcount ---
[INFO] Building jar: /Volumes/MACEXT/GitHub/SparkTest/target/sparkwordcount-0.0.1-SNAPSHOT.jar
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 9.283 s
[INFO] Finished at: 2014-08-19T17:15:55+08:00
[INFO] Final Memory: 13M/81M
[INFO] ------------------------------------------------------------------------
</code></pre>
<p>Source code: <a rel="nofollow" href="https://github.com/yishenggudou/SparkTest">https://github.com/yishenggudou/SparkTest</a></p>