Category: Programming Pearls

Git branching model for BBT v0.1

Git-Workflow-bbt-v0.1

1. Git & GitLab Intro

For Git, please read the “Pro Git” book first: at least the first three chapters and “6.6 Submodules”.

Our “GitLab” is a self-hosted, GitHub-like service, so just use it the way you would use GitHub.

2. Brief Introduction About Our Deployment

Our current deployment workflow is as follows: master on the Git server is our development environment. The production environment is managed by Felix, who pushes a stable version from master to the production environment (AKA the www server) himself. So we do not separate develop and master; there is only master (AKA the dev server).

3. Branching Model

This model is basically forked from “A successful Git branching model”:http://nvie.com/posts/a-successful-git-branching-model/ , also known as “Git Flow”:https://github.com/nvie/gitflow . Our Release, Feature and Hotfix branches are defined exactly as in that article, so please read it first; we simply use a simplified version of it here.

The main branch – master

The central repo holds only one main branch with an infinite lifetime:

  • master

The master branch at origin should be familiar to every Git user. We consider origin/master to be the main branch where the source code of HEAD always reflects our current development state. So anyone who starts working on a project checks out the code from the Git server.

This is where continuous integration and any automatic nightly builds are built from. For the web front-end and server side, every commit to master triggers a Git hook that makes sure the dev server reflects the current master source code.

When a new production release is ready, the system administrator pushes it to the production server.

Supporting branches

Next to the master branch, our development model uses a variety of supporting branches to aid parallel development between team members, ease tracking of features, prepare for production releases and assist in quickly fixing live production problems. Unlike the main branch, these branches always have a limited lifetime, since they will be removed eventually.

The different types of branches we may use are:

  • Feature branches
  • Release branches

Each of these branches has a specific purpose and is bound to strict rules as to which branches may be its originating branch and which branches must be its merge targets. We will walk through them in a minute.

By no means are these branches “special” from a technical perspective. The branch types are categorized by how we use them. They are of course plain old Git branches.

Feature branches

  • May branch off from: master
  • Must merge back into: master
  • Branch naming convention:
    • Recommended branch name: component-#issue-feature_name
    • The feature name must be prefixed with the platform name, such as “ios”, “android”, “backend” or “web”
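The naming convention above can be checked mechanically. The following is only a minimal sketch: the helper names `feature_branch_name` and `is_valid_feature_branch` are hypothetical, and the assumption that the feature part is lower-case with underscores is inferred from the example names in this document, not from any stated rule.

```python
import re

# Platform prefixes listed in the naming convention.
PLATFORMS = ("ios", "android", "backend", "web")

# component-#issue-feature_name; the lower-case/underscore rule is an assumption.
BRANCH_RE = re.compile(r"^(ios|android|backend|web)-#(\d+)-([a-z0-9_]+)$")


def feature_branch_name(platform: str, issue: int, feature: str) -> str:
    """Build a branch name of the form component-#issue-feature_name."""
    if platform not in PLATFORMS:
        raise ValueError(f"unknown platform: {platform!r}")
    # Lower-case the title and turn every run of other characters into "_".
    slug = re.sub(r"[^a-z0-9]+", "_", feature.lower()).strip("_")
    if not slug:
        raise ValueError("feature name is empty")
    return f"{platform}-#{issue}-{slug}"


def is_valid_feature_branch(name: str) -> bool:
    """Return True if the name follows the recommended pattern."""
    return BRANCH_RE.match(name) is not None


print(feature_branch_name("web", 2232, "User login"))  # web-#2232-user_login
```

Such a check could run in a hook or in review tooling; where it runs is left open here.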

Creating a feature branch

When starting work on a new feature, branch off from the master branch.

$ git checkout -b web-user_login origin/master
Switched to a new branch "web-user_login"

Commit on the locally created feature branch

$ git commit
(Commits the changes to branch "web-user_login")
$ git push origin --set-upstream web-user_login
(Pushes the local branch to origin; after the first push, plain 'git push' is enough)

Submit a merge request through GitLab

If you are solving an issue on Redmine, start here:

  • Open the relevant issue page on Redmine
  • Click “New merge request”
  • Select your feature branch
  • Use the branch name as the title and submit

If you are not solving an issue on Redmine, the last two steps are enough.

Project master reviews the feature branch code

  • Check out the remote branch (e.g. origin/web-#2322-web_login):
$ git checkout -t remote_name/remote_branch
  • Review the code and give feedback on GitLab
  • When the code is approved, merge and remove the feature branch

Incorporating a finished feature on master (optional; for the project master, if needed)

$ git checkout master
Switched to branch 'master'
$ git merge --no-ff web-user_login
Updating ea1b82a..05e9557
(Summary of changes)
$ git branch -d web-user_login
Deleted local branch web-user_login (was 05e9557).
$ git push origin :web-user_login
Deleted branch web-user_login from remote git server.

Release branches

  • May branch off from: master
  • Must merge back into: master
  • Branch naming convention:
    • Recommend branch name: release-component-version_name
    • The component part must be the platform name, such as “ios”, “android”, “backend” or “web”

Creating a release branch

Creating a release branch means there is a production-ready version. We do beta testing on the release branch, and the branch lasts until we push to the App Store, Google Play or any other market.

$ git checkout -b release-iOS-v1.0 master
Switched to a new branch "release-iOS-v1.0"
$ ./bump-version.sh 1.0
Files modified successfully, version bumped to 1.0
//(We don't have this script yet; it may be done by the build tool)
$ git commit -a -m "Bumped version number to 1.0"
[release-iOS-v1.0 74d9424] Bumped version number to 1.0
1 files changed, 1 insertions(+), 1 deletions(-)

Finishing a release branch

$ git checkout master
Switched to branch 'master'
$ git merge --no-ff release-iOS-v1.0
Merge made by recursive.
(Summary of changes)
$ git tag -a iOS-v1.0
$ git branch -d release-iOS-v1.0
Deleted branch release-iOS-v1.0 (was 05e9557).
$ git push origin :release-iOS-v1.0
Deleted branch release-iOS-v1.0 from remote git server.
$ git push --tags
[new tag]   iOS-v1.0 -> iOS-v1.0

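How to choose an open source license?

How to choose an open source license for your code: that is the question.

There are probably a hundred or so open source licenses in the world, and few people can tell them apart. Even choosing among the six most popular ones (GPL, BSD, MIT, Mozilla, Apache and LGPL) is complicated.

Ukrainian programmer Paul Bagwell drew a decision chart that shows how to choose. It is the simplest explanation I have seen: in just two minutes you can grasp the biggest differences between these six licenses.

Below is the Chinese version I made:

Via: 阮一峰 (Ruan Yifeng)

——————————- divider ————————————

I remember Felix once telling me about a truly outrageous “Do What Fuck You Want To Do” license; I can't recall what it is actually called...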

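A summary of character encodings

One of our juniors ran into some encoding problems over the past couple of days, so I took a look today and am writing it down before I forget:

ASCII: the basic set has 128 common characters and the extended set adds another 128, 256 in total, each stored in 1 byte.
GB2312: a little over 6,000 common Chinese characters.
GBK: roughly 21,000 Chinese characters.
GB18030: even more; it extends GBK and encodes a character in one, two or four bytes.
The three GB* encodings above are what Chinese-language Windows calls the “ANSI” encoding; in a two-byte character, the first of the 16 bits is always 1.
BIG5: a Traditional Chinese character set, used in Taiwan.

Unicode: the universal character set. The “Unicode” option offered when saving text stores two-byte units (UTF-16) and writes a two-byte header (the byte order mark).
UTF-8: a representation of Unicode built from 8-bit units; when saved as text it may carry a three-byte header (the byte order mark).
UTF-16: built from 16-bit units.

—————————————– a divider —————————————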

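The ASCII character set

1. Origin of the name

ASCII (American Standard Code for Information Interchange) is a computer encoding system based on the Latin alphabet.

2. Characteristics

It is mainly used to represent modern English; its 8-bit extensions cover other Western European languages. It is the most widely used single-byte encoding system today and is equivalent to the international standard ISO/IEC 646.

3. Contents

Control characters: carriage return, backspace, line feed and so on.

Printable characters: upper- and lower-case English letters, Arabic numerals and Western punctuation symbols.

4. Technical characteristics

Each character is represented by 7 bits, giving 128 characters in total.

5. Extended ASCII character sets

A 7-bit character set supports only 128 characters. To represent more of the characters commonly used in Europe, ASCII was extended: an extended ASCII set uses 8 bits per character, for 256 characters in total.

The extra characters that an extended set adds to ASCII include box-drawing symbols, mathematical symbols, Greek letters and special Latin characters.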
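As a small illustration of the 7-bit limit and of the fact that "extended ASCII" is not one single table, the sketch below encodes the same accented letter with two 8-bit code pages. The choice of `cp437` and `latin-1` is an assumption made for the example; the text above does not name a specific extended set.

```python
# Plain ASCII covers code points 0..127 only.
print("A".encode("ascii"))            # b'A'  (0x41)

try:
    "é".encode("ascii")               # outside the 7-bit range
except UnicodeEncodeError as exc:
    print("not ASCII:", exc.reason)

# Two different 8-bit extensions place "é" at different byte values.
print("é".encode("cp437").hex())      # 82
print("é".encode("latin-1").hex())    # e9
```

Because the upper 128 positions differ between code pages, a byte above 0x7F cannot be interpreted without knowing which extension was used.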

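The GB2312 character set

1. Origin of the name

GB2312, also called GB2312-80, has the full title 《信息交换用汉字编码字符集·基本集》 (“Code of Chinese Graphic Character Set for Information Interchange, Primary Set”). It was issued by the former China State Bureau of Standards and took effect on May 1, 1981.

2. Characteristics

GB2312 is the Chinese national standard character set for Simplified Chinese. The characters it contains cover 99.75% of usage frequency, which basically meets the needs of computer processing of Chinese. It is widely used in mainland China and Singapore.

3. Contents

GB2312 contains simplified Chinese characters plus general symbols, ordinal marks, digits, Latin letters, Japanese kana, Greek letters, Russian letters, Hanyu Pinyin symbols and Zhuyin letters: 7,445 graphic characters in total. Of these, 6,763 are Chinese characters (3,755 level-1 and 3,008 level-2), and 682 are full-width non-Chinese characters, including Latin letters, Greek letters, Japanese hiragana and katakana, and Russian Cyrillic letters.

4. Technical characteristics

(1) Row-cell representation:

GB2312 divides its characters into rows (区), each holding 94 characters or symbols (cells, 位). This representation is also called the row-cell code (区位码).

The rows are allocated as follows: rows 01-09 are special symbols; rows 16-55 are level-1 Chinese characters, ordered by pinyin; rows 56-87 are level-2 Chinese characters, ordered by radical and stroke count; rows 10-15 and 88-94 are unassigned.

(2) Two-byte representation

Of the two bytes, the one that comes first is the first byte and the one that follows is the second byte. By convention the first byte is called the “high byte” and the second the “low byte”.

The high byte uses 0xA1-0xF7 (the row number 01-87 plus 0xA0), and the low byte uses 0xA1-0xFE (the cell number 01-94 plus 0xA0).

5. Encoding example

Take “啊”, the first Chinese character in GB2312: its row number is 16 and its cell number is 01, so its row-cell code is 1601. In most programs, 0xA0 is added to each of the two bytes to obtain the internal code 0xB0A1. The calculation is: 0xB0 = 0xA0 + 16, 0xA1 = 0xA0 + 1.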
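The row-cell arithmetic can be checked with a few lines of Python. This is a minimal sketch; the function name `gb2312_bytes` is made up for the illustration, and the result is cross-checked against Python's built-in `gb2312` codec.

```python
def gb2312_bytes(row: int, cell: int) -> bytes:
    """Convert a GB2312 row-cell code to its two-byte form (each part + 0xA0)."""
    if not (1 <= row <= 94 and 1 <= cell <= 94):
        raise ValueError("row and cell must both be in 1..94")
    return bytes([0xA0 + row, 0xA0 + cell])


encoded = gb2312_bytes(16, 1)            # row 16, cell 01
print(encoded.hex().upper())             # B0A1
print(encoded.decode("gb2312"))          # 啊
print(encoded == "啊".encode("gb2312"))  # True
```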

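The BIG5 character set

1. Origin of the name

Also called 大五码 or 五大码 (“Big Five code”), it was created in 1984 by Taiwan's Institute for Information Industry together with five software companies: 宏碁 (Acer), 神通 (MiTAC), 佳佳, 零壹 (Zero One) and 大众 (FIC), hence the name.

Big5 came about because vendors in Taiwan had each introduced their own encodings, such as the ETen code (倚天码), IBM PS55 and the Wang code (王安码), which were incompatible with one another. In addition, the government in Taiwan had not yet issued an official Chinese character encoding, and mainland China's GB2312 did not include Traditional Chinese characters.

2. Characteristics

Big5 contains 13,053 Chinese characters and is used in Taiwan. Curiously, it includes two characters twice: “兀” (0xA461 and 0xC94A) and “嗀” (0xDCD1 and 0xDDFC).

3. Encoding method

Big5 is a double-byte encoding: each character is stored in two bytes. The first byte is called the “high byte” and the second the “low byte”. The high byte ranges over 0xA1-0xF9, and the low byte over 0x40-0x7E and 0xA1-0xFE.

The code ranges map to character types as follows: 0xA140-0xA3BF holds punctuation, Greek letters and special symbols, and 0xA259-0xA261 additionally holds the characters for two-syllable units of measure: 兙兛兞兝兡兣嗧瓩糎; 0xA440-0xC67E holds frequently used Chinese characters, ordered by stroke count and then by radical; 0xC940-0xF9D5 holds less frequently used Chinese characters, in the same order.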
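The byte ranges above translate directly into a validity check. The sketch below is illustrative only: `is_big5_pair` is a hypothetical helper that tests the structural ranges, not whether a code is actually assigned to a character.

```python
def is_big5_pair(high: int, low: int) -> bool:
    """Check that two bytes fall in the Big5 high-byte and low-byte ranges."""
    high_ok = 0xA1 <= high <= 0xF9
    low_ok = 0x40 <= low <= 0x7E or 0xA1 <= low <= 0xFE
    return high_ok and low_ok


# "一" sits at 0xA440, the start of the frequently used block.
first = "一".encode("big5")
print(first.hex().upper())           # A440
print(is_big5_pair(*first))          # True
print(is_big5_pair(0xA4, 0x7F))      # False: 0x7F is in the low-byte gap
```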

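4. Limitations of Big5

Although Big5 contains more than ten thousand characters, it does not account for characters used in personal names and place names in everyday circulation, dialect characters, or characters used in chemistry and biology, and it does not include Japanese hiragana and katakana.

For example, Taiwan treats “着” as a variant of “著”, so “着” is not included. Some radical characters from the Kangxi Dictionary (such as “亠”, “疒”, “辵” and “癶”) and some characters common in personal names (such as “堃”, “煊”, “栢” and “喆”) are also missing from Big5.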

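The Unicode character set

1. Origin of the name

Unicode is a character encoding system defined by the Unicode Consortium and kept aligned with ISO/IEC 10646, the Universal Multiple-Octet Coded Character Set (UCS). It supports the interchange, processing and display of written text in the world's many languages. Development began around 1990 and version 1.0 was published in 1991; when this text was written, the latest version was Unicode 4.1.0 of March 31, 2005.

2. Characteristics

Unicode is a character encoding used on computers. It assigns a single, unique code to every character in every language, to meet the needs of cross-language and cross-platform text conversion and processing.

3. Notation

The Unicode standard always writes codes in hexadecimal, with the prefix “U+”. For example, the code of the letter “A” is 0041 (hexadecimal) and the code of the character “€” is 20AC (hexadecimal), so the code of “A” is written “U+0041”.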

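4. UTF-8 encoding

The UTF-8 encoding rules are simple; there are only two:

1) For a single-byte symbol, the first bit of the byte is 0 and the remaining 7 bits hold the symbol's Unicode code. For English letters, therefore, UTF-8 and ASCII are identical.

2) For an n-byte symbol (n>1), the first n bits of the first byte are all 1 and bit n+1 is 0; the first two bits of every following byte are always 10. All the bits not mentioned so far hold the symbol's Unicode code.

The table below summarizes the rules; the letter x marks the bits available for the code.

Unicode code range | UTF-8 encoding
(hexadecimal) | (binary)
——————–+———————————————
0000 0000-0000 007F | 0xxxxxxx
0000 0080-0000 07FF | 110xxxxx 10xxxxxx
0000 0800-0000 FFFF | 1110xxxx 10xxxxxx 10xxxxxx
0001 0000-0010 FFFF | 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx

Next, take the Chinese character “严” as an example of how to produce the UTF-8 encoding.

The Unicode code of “严” is 4E25 (binary 100111000100101). According to the table, 4E25 falls in the range of the third row (0000 0800-0000 FFFF), so the UTF-8 encoding of “严” takes three bytes, in the format “1110xxxx 10xxxxxx 10xxxxxx”. Then, starting from the last binary digit of “严”, fill in the x positions from back to front and pad the leftover positions with 0. The result is that the UTF-8 encoding of “严” is “11100100 10111000 10100101”, or E4B8A5 in hexadecimal.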
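The fill-from-the-back procedure can be written out by hand and compared with Python's built-in encoder. This is a minimal sketch for illustration; `utf8_encode` is a name chosen here, and real code should simply call `str.encode("utf-8")`.

```python
def utf8_encode(code_point: int) -> bytes:
    """Encode one Unicode code point following the UTF-8 bit-layout table."""
    if not 0 <= code_point <= 0x10FFFF or 0xD800 <= code_point <= 0xDFFF:
        raise ValueError("not an encodable code point")
    if code_point <= 0x7F:
        return bytes([code_point])           # 0xxxxxxx
    if code_point <= 0x7FF:
        length, prefix = 2, 0xC0             # 110xxxxx
    elif code_point <= 0xFFFF:
        length, prefix = 3, 0xE0             # 1110xxxx
    else:
        length, prefix = 4, 0xF0             # 11110xxx
    tail = []
    for _ in range(length - 1):
        # Take the lowest 6 bits and mark them as a continuation byte (10xxxxxx).
        tail.append(0x80 | (code_point & 0x3F))
        code_point >>= 6
    # Whatever bits remain go into the leading byte after its prefix.
    return bytes([prefix | code_point] + tail[::-1])


print(utf8_encode(0x4E25).hex().upper())              # E4B8A5
print(utf8_encode(0x4E25) == "严".encode("utf-8"))     # True
```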

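5. UTF-16 and UTF-32 encoding
UTF-32, UTF-16 and UTF-8 are character encoding schemes for the coded character set of the Unicode standard. UTF-16 encodes each Unicode code point as a sequence of one or two 16-bit code units: code points up to U+FFFF take one unit, and higher ones take two (a surrogate pair). UTF-32 represents each Unicode code point as a 32-bit integer of the same value.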
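How a code point becomes one or two 16-bit units can be sketched as follows. The helper name `utf16_units` is hypothetical, and U+1F600 is just a sample code point above U+FFFF picked for the illustration.

```python
def utf16_units(code_point: int) -> list:
    """Return the 16-bit code units that UTF-16 uses for one code point."""
    if code_point < 0x10000:
        return [code_point]                     # fits in a single unit
    offset = code_point - 0x10000               # 20 bits remain
    high = 0xD800 + (offset >> 10)              # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)             # bottom 10 bits
    return [high, low]


print([f"{u:04X}" for u in utf16_units(0x4E25)])     # ['4E25']
print([f"{u:04X}" for u in utf16_units(0x1F600)])    # ['D83D', 'DE00']

# UTF-32 stores the code point value itself in four bytes.
print(chr(0x1F600).encode("utf-32-be").hex().upper())  # 0001F600
```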

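—————————————– another divider ———————————–

Example

Here is a concrete example.

Open Notepad (Notepad.exe), create a new text file whose only content is the character “严”, and save it in turn with the ANSI, Unicode, Unicode big endian and UTF-8 encodings.

Then use the hex mode of the text editor UltraEdit to inspect how the file is encoded internally.

1) ANSI: the file is the two bytes “D1 CF”, which is exactly the GB2312 encoding of “严”; the bytes are stored in their encoding order, high byte first.

2) Unicode: the file is the four bytes “FF FE 25 4E”, where “FF FE” indicates little-endian storage; the actual code is 4E25.

3) Unicode big endian: the file is the four bytes “FE FF 4E 25”, where “FE FF” indicates big-endian storage.

4) UTF-8: the file is the six bytes “EF BB BF E4 B8 A5”. The first three bytes “EF BB BF” indicate that this is UTF-8, and the last three, “E4B8A5”, are the actual encoding of “严”; its storage order is the same as its encoding order.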
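The same four byte sequences can be reproduced without Notepad. The sketch below assumes that "ANSI" on a Simplified Chinese system means GBK, and that the "Unicode" options are UTF-16 with a byte order mark; the label strings are only descriptive.

```python
import codecs

text = "严"

samples = {
    "ANSI (assumed GBK)": text.encode("gbk"),
    "Unicode (UTF-16 LE + BOM)": codecs.BOM_UTF16_LE + text.encode("utf-16-le"),
    "Unicode big endian (UTF-16 BE + BOM)": codecs.BOM_UTF16_BE + text.encode("utf-16-be"),
    "UTF-8 (with BOM)": text.encode("utf-8-sig"),
}

for label, data in samples.items():
    # Print each byte as two upper-case hex digits, separated by spaces.
    print(f"{label}: {' '.join(f'{b:02X}' for b in data)}")
```

The `utf-8-sig` codec is what writes the three-byte “EF BB BF” prefix; plain `utf-8` would produce only “E4 B8 A5”.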

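 Further reading

The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (the most basic knowledge about character sets)

谈谈Unicode编码 (“A Talk on Unicode Encoding”)

RFC3629: UTF-8, a transformation format of ISO 10646 (the specification of how UTF-8 is implemented)

————————————- yet another divider ——————————————

//TO-Do

1. Encoding support in mainstream languages

2. Encoding issues to watch out for when programming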

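URL Encoding

Abstract: RFC 1738 from the Network Working Group specifies that URLs may use only the US-ASCII character set. (RFCs, per Wikipedia, are a numbered series of documents that collect information about the Internet as well as software documents from the UNIX and Internet communities; the basic Internet communication protocols are all specified in detail in RFCs.) This post excerpts what has to be encoded in a URL and the rules for doing so, as a primer.

RFC 1738: the URL specification

RFC 1738 says this about the character set a URL may use: ”…Only alphanumerics [0-9a-zA-Z], the special characters “$-_.+!*'(),” [not including the quotes – ed], and reserved characters used for their reserved purposes may be used unencoded within a URL.”

The resources that URLs name, however, allow larger character sets: HTML allows ISO-8859-1 (ISO-Latin), and HTML 4 even covers the entire Unicode character set.

So every URL that appears in HTML and points to a specific resource (for example in the A, APPLET, AREA, BASE, BGSOUND, BODY, EMBED, FORM, FRAME, IFRAME, ILAYER, IMG, ISINDEX, INPUT, LAYER, LINK, OBJECT, SCRIPT, SOUND, TABLE, TD, TH, and TR elements) should be encoded.

Which characters should be encoded, and why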

ASCII Control characters
Why: These characters are not printable.
Characters: Includes the ISO-8859-1 (ISO-Latin) character ranges 00-1F hex (0-31 decimal) and 7F (127 decimal).

Non-ASCII characters
Why: These are by definition not legal in URLs since they are not in the ASCII set.
Characters: Includes the entire “top half” of the ISO-Latin set 80-FF hex (128-255 decimal).

“Reserved characters”
Why: URLs use some characters for special use in defining their syntax. When these characters are not used in their special role inside a URL, they need to be encoded.
Characters:

Character | Code point (hex) | Code point (dec)
——————–+——————–+——————–
Dollar (“$”) | 24 | 36
Ampersand (“&”) | 26 | 38
Plus (“+”) | 2B | 43
Comma (“,”) | 2C | 44
Forward slash/Virgule (“/”) | 2F | 47
Colon (“:”) | 3A | 58
Semi-colon (“;”) | 3B | 59
Equals (“=”) | 3D | 61
Question mark (“?”) | 3F | 63
‘At’ symbol (“@”) | 40 | 64

“Unsafe characters”
Why: Some characters present the possibility of being misunderstood within URLs for various reasons. These characters should also always be encoded.
Characters:

Character | Code point (hex) | Code point (dec) | Why encode?
——————–+——————–+——————–+——————–
Space | 20 | 32 | Significant sequences of spaces may be lost in some uses (especially multiple spaces)
Quotation marks | 22 | 34 | These characters are often used to delimit URLs in plain text.
‘Less Than’ symbol (“<”) | 3C | 60 | These characters are often used to delimit URLs in plain text.
‘Greater Than’ symbol (“>”) | 3E | 62 | These characters are often used to delimit URLs in plain text.
‘Pound’ character (“#”) | 23 | 35 | This is used in URLs to indicate where a fragment identifier (bookmarks/anchors in HTML) begins.
Percent character (“%”) | 25 | 37 | This is used to URL encode/escape other characters, so it should itself also be encoded.
Left Curly Brace (“{”) | 7B | 123 | Some systems can possibly modify these characters.
Right Curly Brace (“}”) | 7D | 125 | Some systems can possibly modify these characters.
Vertical Bar/Pipe (“|”) | 7C | 124 | Some systems can possibly modify these characters.
Backslash (“\”) | 5C | 92 | Some systems can possibly modify these characters.
Caret (“^”) | 5E | 94 | Some systems can possibly modify these characters.
Tilde (“~”) | 7E | 126 | Some systems can possibly modify these characters.
Left Square Bracket (“[”) | 5B | 91 | Some systems can possibly modify these characters.
Right Square Bracket (“]”) | 5D | 93 | Some systems can possibly modify these characters.
Grave Accent (“`”) | 60 | 96 | Some systems can possibly modify these characters.

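How a URL is encoded

An encoded character consists of a “%” symbol followed by the two-digit hexadecimal value (case-insensitive) of the character's ISO-Latin code point.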

Example
  • Space = decimal code point 32 in the ISO-Latin set.
  • 32 decimal = 20 in hexadecimal
  • The URL encoded representation will be “%20”
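Python's standard library performs this percent-encoding. The sketch below uses `urllib.parse.quote` and `unquote`; note as an assumption of this example that `quote` encodes non-ASCII text as UTF-8 bytes by default rather than as ISO-Latin, and the sample strings are made up.

```python
from urllib.parse import quote, unquote

# A space becomes %20, exactly as in the example above.
print(quote(" "))                      # %20

# safe="" also encodes "/", which quote() leaves alone by default.
print(quote("a b&c=d/e", safe=""))     # a%20b%26c%3Dd%2Fe

# Non-ASCII characters are encoded byte by byte (UTF-8 by default).
print(quote("严"))                     # %E4%B8%A5

# The hex digits are case-insensitive when decoding.
print(unquote("%7e") == unquote("%7E"))  # True
```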