On a whim today, I wrote a Markdown converter.
import os, re, webbrowser
text = '''
# TextHeader
## Header1
List
- 1
- 2
- 3
> **quote**
》 quote2
## Header2
1. *italic*
2. [@以茄之名](https://www.zhihu.com/people/e4f87c3476a926c1e2ef51b4fcd18fa3)
3、 ![](https://pic4.zhimg.com/v2-8560440c136c746730a63813ed701f52_is.jpg)
## Header3
`*[article link](https://zhuanlan.zhihu.com/p/39742445)*`
·**code1**·
- [x]liked?
'''
The program starts by handling the inline syntax, such as code, strong, and i, with plain regex substitutions:
# Raw strings keep the regex backslashes from being read as Python escapes
text = re.sub(r'[`·]([^`·]+)[`·]', r'<code>\1</code>', text)
text = re.sub(r'\*\*([^*]+)\*\*', r'<strong>\1</strong>', text)
text = re.sub(r'([^*])\*([^*]+)\*', r'\1<i>\2</i>', text)
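To try the three inline rules in isolation, here is a minimal sketch that wraps them in a function. The name `convert_inline` is hypothetical and not part of the script; the patterns and their order are the ones used above.

```python
import re

def convert_inline(text):
    """Apply the inline rules in the same order as the script (hypothetical helper)."""
    # `code` or ·code· (the middle dot is accepted in place of the backtick)
    text = re.sub(r'[`·]([^`·]+)[`·]', r'<code>\1</code>', text)
    # **bold** is handled before *italic*, as in the script
    text = re.sub(r'\*\*([^*]+)\*\*', r'<strong>\1</strong>', text)
    # *italic*: the leading ([^*]) requires one preceding non-asterisk character
    text = re.sub(r'([^*])\*([^*]+)\*', r'\1<i>\2</i>', text)
    return text

print(convert_inline('> **quote** and *italic* and `code`'))
```

Because the code rule runs first and the other rules still see its output, markup inside a code span is converted too: `·**code1**·` ends up as bold text inside `<code>`.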
Next come images and links, which are a little more involved:
# Links first: the leading ([^!]) keeps this rule away from ![alt](url)
text = re.sub(r'([^!])\[([^\]]+)\]\(([^)]+)\)',
              r'\1<a href="\3" target="_blank">\2</a>', text)
# Then images
text = re.sub(r'!\[([^\]]*)\]\(([^)]+)\)',
              r'<img src="\2" >', text)
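The interaction between the two rules is easier to see on small inputs. The sketch below uses a hypothetical helper, `convert_links`, that applies the same two substitutions; the example URLs in the test calls are placeholders.

```python
import re

def convert_links(text):
    """Convert links first, then images, as in the script (hypothetical helper)."""
    # [label](url): the character before '[' must not be '!'
    text = re.sub(r'([^!])\[([^\]]+)\]\(([^)]+)\)',
                  r'\1<a href="\3" target="_blank">\2</a>', text)
    # ![alt](url): the alt text is captured but, as in the script, not emitted
    text = re.sub(r'!\[([^\]]*)\]\(([^)]+)\)', r'<img src="\2" >', text)
    return text

print(convert_links('see [docs](http://example.com) and ![pic](http://example.com/a.jpg)'))
```

One consequence of the `([^!])` guard is that a link at the very first character of the text is not matched, since there is no preceding character. The sample text starts with a newline, so it is not affected.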
Then comes the rest of the syntax. First, split the text into lines:
lines = text.split('\n')
html = ''
list_flag = ''
Handle lists and to-do items:
for line in lines:
    line = line.strip(' ')
    if re.match(r'- \[[ x]\]', line):
        # To-do item: "- [ ]" is unchecked, "- [x]" is checked
        p_html = ''
        if re.match(r'- \[x\]', line):
            p_html = ' checked="checked"'
        # count=1 strips only the leading marker, not later occurrences
        line = re.sub(r'- \[[ x]\]', '', line, count=1)
        html += '''<label class="cssCheckbox">
<input type="checkbox" %s />
<span></span>%s
</label>''' % (p_html, line)
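The to-do branch can also be tested on its own. This is a minimal sketch with a hypothetical helper, `todo_to_html`; it produces the same label and checkbox elements as the script, but on a single line, and it reads the checked state from a capture group instead of matching twice.

```python
import re

def todo_to_html(line):
    """Return checkbox HTML for a '- [ ]' or '- [x]' line, else None (hypothetical helper)."""
    match = re.match(r'- \[([ x])\]', line)
    if match is None:
        return None
    checked = ' checked="checked"' if match.group(1) == 'x' else ''
    label = line[match.end():]  # everything after the marker
    return ('<label class="cssCheckbox">'
            '<input type="checkbox"%s /><span></span>%s</label>' % (checked, label))

print(todo_to_html('- [x]done'))
```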
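Ordered and unordered lists differ only in the opening and closing ol and ul tags, so a list_flag variable keeps track of which kind of list is currently open: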
    elif re.match(r'[+\-*] ', line):
        # Unordered item: open <ul> if no list is open yet
        if list_flag == '':
            html += '<ul>\n'
            list_flag = 'ul'
        # count=1 strips only the leading marker
        line = re.sub(r'[+\-*] ', '', line, count=1)
        html += '<li>%s</li>\n' % (line)
    elif re.match(r'\d+[.、] ', line):
        # Ordered item: open <ol> if no list is open yet
        if list_flag == '':
            list_flag = 'ol'
            html += '<ol>\n'
        line = re.sub(r'\d+[.、] ', '', line, count=1)
        html += '<li>%s</li>\n' % (line)
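The list_flag idea can be sketched as a standalone function. `lists_to_html`, `UL_ITEM` and `OL_ITEM` are hypothetical names. Unlike the script, this version also switches tags when the list type changes and closes a list that is still open at the end of the input; both are my own additions, not something the script does.

```python
import re

UL_ITEM = re.compile(r'[+\-*] ')    # "+ ", "- " or "* "
OL_ITEM = re.compile(r'\d+[.、] ')   # "1. " or "1、 "

def lists_to_html(lines):
    html = ''
    list_flag = ''  # '', 'ul' or 'ol': the list tag that is currently open
    for line in lines:
        line = line.strip(' ')
        if UL_ITEM.match(line):
            kind, item = 'ul', UL_ITEM.sub('', line, count=1)
        elif OL_ITEM.match(line):
            kind, item = 'ol', OL_ITEM.sub('', line, count=1)
        else:
            kind, item = '', line
        if kind != list_flag:
            if list_flag:
                html += '</%s>\n' % list_flag   # close the list that was open
            if kind:
                html += '<%s>\n' % kind         # open the new list
            list_flag = kind
        html += ('<li>%s</li>\n' % item) if kind else item + '\n'
    if list_flag:
        html += '</%s>\n' % list_flag           # assumption: close a trailing list
    return html

print(lists_to_html(['- 1', '- 2', 'text', '1. a', '2、 b']))
```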
With lists done, handle the remaining syntax:
    else:
        # Any other line ends the list that is currently open
        if list_flag != '':
            html += '</%s>\n' % list_flag
            list_flag = ''
        if re.match(r'#+', line):
            # Heading: the number of leading '#' characters is the level
            well = re.match(r'#+', line).group().count('#')
            # Anchor the pattern so that only the leading '#' run is removed
            line = re.sub(r'^#+', '', line)
            html += '<h%i>%s</h%i>\n' % (well, line, well)
        elif re.match(r'[>》]', line):
            line = re.sub(r'^[>》]', '', line)
            html += '<blockquote>%s</blockquote>\n' % (line)
        else:
            html += line
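I tweaked this part slightly so that both > and 》 are converted to a blockquote, mainly because switching between Chinese and English punctuation is such a hassle.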
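The heading and blockquote rules, including the 》 variant, can be sketched as one small function. `block_to_html` is a hypothetical name. Unlike the script, it also drops the whitespace after the marker, and like the script it does not cap the heading level at 6.

```python
import re

def block_to_html(line):
    """Convert one heading or blockquote line (hypothetical helper)."""
    heading = re.match(r'(#+)\s*(.*)', line)
    if heading:
        level = len(heading.group(1))   # number of leading '#' characters
        return '<h%d>%s</h%d>\n' % (level, heading.group(2), level)
    quote = re.match(r'[>》]\s*(.*)', line)   # ASCII '>' or full-width '》'
    if quote:
        return '<blockquote>%s</blockquote>\n' % quote.group(1)
    return line

print(block_to_html('## Header1'), end='')
print(block_to_html('》 quote2'), end='')
```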
Then comes the CSS. I adapted some of the styles from 马克飞象, because its blockquotes look really nice:
with open('markdown.html', 'w', encoding='utf-8') as f:
    f.write('''
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<style>body{
margin: 0 auto;
font-family: "ubuntu", "Tahoma", "Microsoft YaHei", arial,sans-serif;
color: #444444;
line-height: 1;
padding: 30px;
}
input[type='checkbox']+span::before {
content:'\u00a0';/* non-breaking space */
display: inline-block;
vertical-align: 0.2em;
width:0.8em;
height:0.8em;
margin-right: .2em;
border-radius:.2em;
background: silver;/* checkbox background color */
text-indent:0.15em;
line-height: 0.65;
}
input[type='checkbox'] {
/* Hide the real checkbox. display:none; would be simpler, but it would remove the checkbox from the keyboard Tab focus order entirely. */
position: absolute;
clip:rect(0,0,0,0);
}
input[type='checkbox']:checked+span::before {
content:'\u221a'; /* Unicode character used as the check mark; Python writes it out as √ */
background: yellowgreen;/* background color of a checked box */
}
img {
max-width: 100%;
}
@media screen and (min-width: 1000px) {
body {
width: 842px;
margin: 10px auto;
}
}
h1, h2, h3, h4 {
color: #111111;
font-weight: 400;
margin-top: 1em;
}
h1, h2, h3, h4, h5 {
font-family: Georgia, Palatino, serif;
}
h1, h2, h3, h4, h5, dl{
margin-bottom: 16px;
padding: 0;
}
p {
margin-top: 8px;
margin-bottom: 3px;
}
h1 {
font-size: 48px;
line-height: 54px;
}
h2 {
font-size: 36px;
line-height: 42px;
}
h1, h2 {
border-bottom: 1px solid #EFEAEA;
padding-bottom: 10px;
}
h3 {
font-size: 24px;
line-height: 30px;
}
h4 {
font-size: 21px;
line-height: 26px;
}
h5 {
font-size: 18px;
line-height: 23px;
}
a {
color: #0099ff;
margin: 0 2px;
padding: 0;
vertical-align: baseline;
text-decoration: none;
}
a:hover {
text-decoration: none;
color: #ff6600;
}
a:visited {
/*color: purple;*/
}
ul, ol {
padding: 0;
padding-left: 18px;
margin: 0;
}
li {
line-height: 24px;
}
p, ul, ol {
font-size: 16px;
line-height: 24px;
}
ol ol, ul ol {
list-style-type: lower-roman;
}
code, pre {
font-family: Consolas, Monaco, Andale Mono, monospace;
background-color:#f7f7f7;
color: inherit;
}
code {
font-family: Consolas, Monaco, Andale Mono, monospace;
margin: 0 2px;
}
pre {
font-family: Consolas, Monaco, Andale Mono, monospace;
line-height: 1.7em;
overflow: auto;
padding: 6px 10px;
border-left: 5px solid #6CE26C;
}
pre > code {
font-family: Consolas, Monaco, Andale Mono, monospace;
border: 0;
display: inline;
max-width: initial;
padding: 0;
margin: 0;
overflow: initial;
line-height: 1.6em;
font-size: .95em;
white-space: pre;
background: 0 0;
}
code {
color: #666555;
}
aside {
display: block;
float: right;
width: 390px;
}
blockquote {
border-left-width: 10px;
background-color: rgba(102,128,153,0.05);
border-top-right-radius: 5px;
border-bottom-right-radius: 5px;
padding: 15px 20px;
}
blockquote cite {
font-size:14px;
line-height:20px;
color:#bfbfbf;
}
blockquote cite:before {
content: '\\2014 \\00A0'; /* backslashes doubled so that Python writes the CSS escapes literally */
}
blockquote p {
color: #666;
}
hr {
text-align: left;
color: #999;
height: 2px;
padding: 0;
margin: 16px 0;
background-color: #e7e7e7;
border: 0 none;
}
dl {
padding: 0;
}
dl dt {
padding: 10px 0;
margin-top: 16px;
font-size: 1em;
font-style: italic;
font-weight: bold;
}
dl dd {
padding: 0 16px;
margin-bottom: 16px;
}
dd {
margin-left: 0;
}
table {
*border-collapse: collapse; /* IE7 and lower */
border-spacing: 0;
width: 100%;
}
table {
border: solid #ccc 1px;
}
table thead {
background: #f7f7f7;
}
table thead tr:hover {
background: #f7f7f7
}
table tr:hover {
background: #fbf8e9;
-o-transition: all 0.1s ease-in-out;
-webkit-transition: all 0.1s ease-in-out;
-moz-transition: all 0.1s ease-in-out;
-ms-transition: all 0.1s ease-in-out;
transition: all 0.1s ease-in-out;
}
table td, table th {
border-left: 1px solid #ccc;
border-top: 1px solid #ccc;
padding: 10px;
text-align: left;
}
table th {
border-top: none;
text-shadow: 0 1px 0 rgba(255,255,255,.5);
padding: 5px;
border-left: 1px solid #ccc;
}
table td:first-child, table th:first-child {
border-left: none;
}</style></head>
<body>
''')
    f.write(html)
    f.write('\n</body>\n</html>')
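If the page is easier to handle as one string, the same output step could look roughly like this. `wrap_html` and `save_html` are hypothetical helpers, and the `css` argument stands for the stylesheet text shown above.

```python
def wrap_html(body, css=''):
    """Return a complete HTML document around body (hypothetical helper)."""
    return ('<html>\n<head>\n'
            '<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />\n'
            '<style>%s</style>\n</head>\n<body>\n%s\n</body>\n</html>\n' % (css, body))

def save_html(path, body, css=''):
    """Write the wrapped document to path as UTF-8."""
    with open(path, 'w', encoding='utf-8') as f:
        f.write(wrap_html(body, css))

page = wrap_html('<h1>TextHeader</h1>', 'img { max-width: 100%; }')
print(page)
```

Passing the CSS as an argument rather than embedding it in a Python string literal also avoids the escape-sequence surprises that a backslash in the stylesheet can cause.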
Open the page in Chrome:
webbrowser.get('C:/Program Files (x86)/CentBrowser/Application/chrome.exe %s').open(
    'file:///' + os.getcwd() + '/markdown.html')
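This part turned out to be a pitfall as well: the built-in Edge kept failing to open the page, and registering Chrome through the module's register function did not work either. In the end I found the solution on a site outside China.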
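For reference, here is one way the opening step could be written with `pathlib` and an explicitly registered browser. This is only a sketch: `file_url` and `open_in_browser` are hypothetical helpers, the name 'custom' is arbitrary, and I have not checked whether it avoids the problem described above on that machine.

```python
import webbrowser
from pathlib import Path

def file_url(filename):
    """Return a file:// URL for a path relative to the current directory."""
    return Path(filename).resolve().as_uri()

def open_in_browser(filename, browser_path=None):
    """Open filename in the browser at browser_path, or in the system default."""
    url = file_url(filename)
    if browser_path is None:
        return webbrowser.open(url)
    # BackgroundBrowser takes the executable path as one string, spaces included
    webbrowser.register('custom', None, webbrowser.BackgroundBrowser(browser_path))
    return webbrowser.get('custom').open(url)

print(file_url('markdown.html'))
```

`Path.as_uri()` takes care of the slashes and percent-encoding, which the hand-built `'file:///' + os.getcwd()` string leaves to the browser.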
The final result: