软件测试学习笔记丨Linux数据处理

在这里插入图片描述

一，数据处理 grep

[](https://ceshiren.com/t/topic/29662#h-11-2)1.1 基本语法

grep [options] pattern [file…] ——[options]表示选项，具体的命令选项见下表。pattern表示要匹配的模式（包括目标字符串、变量或者正则表达式），file表示要查询的文件名，可以是一个或者多个

-c  ：只打印匹配的文本行的行数，不显示匹配的内容
-i    ：匹配时忽略字母的大小写
-h  ：当搜索多个文件时，不显示匹配文件名前缀
-n  ：列出所有的匹配的文本行，并显示行号
-l  ：只列出含有匹配的文本行的文件的文件名，而不显示具体的匹配内容
-s  ：不显示关于不存在或者无法读取文件的错误信息
-v  ：只显示不匹配的文本行
-w  ：匹配整个单词
-x  ：匹配整个文本行
-r  ：递归搜索，搜索当前目录和子目录
-q  ：禁止输出任何匹配结果，而是以退出码的形式表示搜索是否成功，其中0表示找到了匹配的文本行
-b ：打印匹配的文本行到文件头的偏移量，以字节为单位
-E ：支持扩展正则表达式
-P ：支持Perl正则表达式
-F ：不支持正则表达式，将模式按照字面意思匹配

[](https://ceshiren.com/t/topic/29662#h-12-3)1.2 内容检索

获取行 grep pattern file
获取内容 grep -o pattern file
获取上下文 grep -A -B -C pattern file

[](https://ceshiren.com/t/topic/29662#h-13-4)1.3 文件检索

递归搜索 grep pattern -r dir/
展示匹配文件名 grep -H 111 /tmp/1
只展示匹配文件名 grep -l 111 /tmp/1

[](https://ceshiren.com/t/topic/29662#h-14-5)1.4 范围约束

忽略大小写 grep -i pattern file
不显示匹配的行 grep -v pattern file
使用扩展正则表达式 grep -E pattern file
文件范围和目录范围约束 grep 111 -r /tmp/demo/ --include "11*"

[](https://ceshiren.com/t/topic/29662#h-15-6)1.5 进程检索

进程过滤场景比较特殊，需要注意
grep 本身会开启新进程，所以需要单独过滤掉 grep 进程
- ps -ef |grep ssh | grep -v grep

[](https://ceshiren.com/t/topic/29662#awk-7)二，数据处理 awk

[](https://ceshiren.com/t/topic/29662#h-21-8)2.1 基本语法

awk 是 linux 下的一个命令，同时也是一种语言解析引擎
awk 具备完整的编程特性。比如执行命令，网络请求等
精通 awk，是一个 linux 工作者的必备技能
语法 awk 'pattern{action}'

[](https://ceshiren.com/t/topic/29662#h-22-awk-9)2.2 awk上下文变量

开始 BEGIN 结束 END
行数 NR
字段与字段数 $1 $2 … $NF NF
整行 $0
字段分隔符 FS
输出数据的字段分隔符 OFS
记录分隔符 RS
输出字段的行分隔符 ORS

[](https://ceshiren.com/t/topic/29662#h-23-10)2.3 字段变量用法

-F 参数指定字段分隔符，可以用|指定多个- 多分隔符 -F ‘<|>’
BEGIN{FS=“_”} 也可以表示分隔符
$0 代表当前的记录
$1 代表第一个字段
$N 代表第 N 个字段
$NF 代表最后一个字段
$(NF-1) 代表倒数第二个字段

[](https://ceshiren.com/t/topic/29662#h-24-pattern-11)2.4 pattern 表达式

正则匹配 $1~/pattern/ /pattern/
- 开始和结束 awk 'BEGIN{}END{}'
- 正则匹配
  - 整行匹配 awk '/Running/'
  - 字段匹配 awk '$2~/xxx/'
- 行数表达式
- 取第二行 awk 'NR==2'
- 去掉第一行 awk 'NR>1'
- 区间选择
  - awk '/aa/,/bb/'
  - awk '/1/,NR==2'
比较表达式 $2>2 $1=="b"

[](https://ceshiren.com/t/topic/29662#h-25-action-12)2.5 行为表达式 `{action}`

打印 {print $0} {print $2}
赋值 {$1="abc"}
处理函数
原始内容 $0
更新后内容 {$1=$1;print $0}
实例：
- 单行转多行：echo 1:2:3 | awk ‘BEGIN{RS=“:”}{print $0}’

多行变单行：awk ‘BEGIN{RS=“”;FS=“\n”;OFS=“:”}{$1=$1;print $0}’ / awk ‘BEGIN{ORS=“:”}{$1=$1;print $0}’

计算平均数：awk ‘BEGIN{total=0;FS=“,”}{total+=$2}END{print total/NR}’

[](https://ceshiren.com/t/topic/29662#h-26-awk-array-13)2.6 awk 的词典结构 array

统计营业额：awk ‘{data[$1]+=$3}END{for(k in data) print k,data[k]}’

统计营业额平均值：awk ‘{data[$1]+=$3;count[$1]+=1;}END{for(k in data) print k,data[k]/count[k]}’

[](https://ceshiren.com/t/topic/29662#sed-14)三，数据处理 sed（常用于定位）

[](https://ceshiren.com/t/topic/29662#h-31-15)3.1 基本语法与常用参数

语法结构 sed [addr]X[options]
-e 表达式
sed -n ‘2p’ 打印第二行
sed ‘s#hello#world#’ 修改
-i 直接修改源文件
-E 扩展表达式
–debug 调试

[](https://ceshiren.com/t/topic/29662#h-32-pattern-16)3.2 pattern 表达式

行数与行数范围 20 30,35
正则匹配 /pattern/
区间匹配 //,//

[](https://ceshiren.com/t/topic/29662#h-33-action-17)3.3 action 表达式

p 打印，通常结合-n 参数：sed -n ‘2p’
s 查找替换：s/REGEXP/REPLACEMENT/[FLAGS]
d 删除，删除前两行 sed ‘1,2d’
a 追加
c 改变
i 插入内容到匹配行之前
e 执行命令
分组匹配与字段提取：sed ‘s#([0-9] *)|([a-z]* )#\1 \2#’

[](https://ceshiren.com/t/topic/29662#h-34-s-18)3.4 s 表达式

s 表示替换
s 后面的追加字符可以为任意字符
g 表示全局匹配
& 表示匹配内容

[](https://ceshiren.com/t/topic/29662#h-35-19)3.5 反向引用

使用()对数据进行分组
使用\1 \2 反向引用分组

霍格沃兹的测试管理班是专门面向测试与质量管理人员的一门课程，通过提升从业人员的团队管理、项目管理、绩效管理、沟通管理等方面的能力，使测试管理人员可以更好的带领团队、项目以及公司获得更快的成长。提供 1v1 私教指导，BAT 级别的测试管理大咖量身打造职业规划。

微信图片_20240122172740.png

软件测试学习笔记丨Linux数据处理

一，数据处理 grep

[](https://ceshiren.com/t/topic/29662#h-11-2)1.1 基本语法

[](https://ceshiren.com/t/topic/29662#h-12-3)1.2 内容检索

[](https://ceshiren.com/t/topic/29662#h-13-4)1.3 文件检索

[](https://ceshiren.com/t/topic/29662#h-14-5)1.4 范围约束

[](https://ceshiren.com/t/topic/29662#h-15-6)1.5 进程检索

[](https://ceshiren.com/t/topic/29662#awk-7)二，数据处理 awk

[](https://ceshiren.com/t/topic/29662#h-21-8)2.1 基本语法

[](https://ceshiren.com/t/topic/29662#h-22-awk-9)2.2 awk上下文变量

[](https://ceshiren.com/t/topic/29662#h-23-10)2.3 字段变量用法

[](https://ceshiren.com/t/topic/29662#h-24-pattern-11)2.4 pattern 表达式

[](https://ceshiren.com/t/topic/29662#h-25-action-12)2.5 行为表达式 `{action}`

[](https://ceshiren.com/t/topic/29662#h-26-awk-array-13)2.6 awk 的词典结构 array

[](https://ceshiren.com/t/topic/29662#sed-14)三，数据处理 sed（常用于定位）

[](https://ceshiren.com/t/topic/29662#h-31-15)3.1 基本语法与常用参数

[](https://ceshiren.com/t/topic/29662#h-32-pattern-16)3.2 pattern 表达式

[](https://ceshiren.com/t/topic/29662#h-33-action-17)3.3 action 表达式

[](https://ceshiren.com/t/topic/29662#h-34-s-18)3.4 s 表达式

[](https://ceshiren.com/t/topic/29662#h-35-19)3.5 反向引用

用户bPc5q3Z

引用和评论

软件测试岗位内推丨京东科技控股股份有限公司岗位开放

快捷键打开某个窗口(如网页chatGPT)

但是，I/O多路复用中是如何判断文件“可读”/“可写”的？

为 Emacs 配置字体，你可曾认真过？

为什么你学不会 Emacs？

麒麟系统中theia终端崩溃问题排查小记

【笔记】CentOS 7 中配置 YUM

软件测试学习笔记丨Linux数据处理

一，数据处理 grep

[](https://ceshiren.com/t/topic/29662#h-11-2)1.1 基本语法

[](https://ceshiren.com/t/topic/29662#h-12-3)1.2 内容检索

[](https://ceshiren.com/t/topic/29662#h-13-4)1.3 文件检索

[](https://ceshiren.com/t/topic/29662#h-14-5)1.4 范围约束

[](https://ceshiren.com/t/topic/29662#h-15-6)1.5 进程检索

[](https://ceshiren.com/t/topic/29662#awk-7)二，数据处理 awk

[](https://ceshiren.com/t/topic/29662#h-21-8)2.1 基本语法

[](https://ceshiren.com/t/topic/29662#h-22-awk-9)2.2 awk上下文变量

[](https://ceshiren.com/t/topic/29662#h-23-10)2.3 字段变量用法

[](https://ceshiren.com/t/topic/29662#h-24-pattern-11)2.4 pattern 表达式

[](https://ceshiren.com/t/topic/29662#h-25-action-12)2.5 行为表达式 {action}

[](https://ceshiren.com/t/topic/29662#h-26-awk-array-13)2.6 awk 的词典结构 array

[](https://ceshiren.com/t/topic/29662#sed-14)三，数据处理 sed（常用于定位）

[](https://ceshiren.com/t/topic/29662#h-31-15)3.1 基本语法与常用参数

[](https://ceshiren.com/t/topic/29662#h-32-pattern-16)3.2 pattern 表达式

[](https://ceshiren.com/t/topic/29662#h-33-action-17)3.3 action 表达式

[](https://ceshiren.com/t/topic/29662#h-34-s-18)3.4 s 表达式

[](https://ceshiren.com/t/topic/29662#h-35-19)3.5 反向引用

用户bPc5q3Z

引用和评论

软件测试岗位内推丨京东科技控股股份有限公司岗位开放

快捷键打开某个窗口(如网页chatGPT)

但是，I/O多路复用中是如何判断文件“可读”/“可写”的？

为 Emacs 配置字体，你可曾认真过？

为什么你学不会 Emacs？

麒麟系统中theia终端崩溃问题排查小记

【笔记】CentOS 7 中配置 YUM

[](https://ceshiren.com/t/topic/29662#h-25-action-12)2.5 行为表达式 `{action}`