Python: modifying only the first line of a multi-line string

I used the code below to fetch the text I need from the KEGG API. Its format is as follows:

[String 1]:

b'>hsa:10056 K01890 phenylalanyl-tRNA synthetase beta chain [EC:6.1.1.20] | (RefSeq) FARSB, FARSLB, FRSB, HSPC173, NEDBLLA, PheHB, PheRS; phenylalanyl-tRNA synthetase subunit beta (A)\nMPTVSVKRDLLFQALGRTYTDEEFDELCFEFGLELDEITSEKEIISKEQGNVKAAGASDV\nVLYKIDVPANRYDLLCLEGLVRGLQVFKERIKAPVYKRVMPDGKIQKLIITEETAKIRPF\nAVAAVLRNIKFTKDRYDSFIELQEKLHQNICRKRALVAIGTHDLDTLSGPFTYTAKRPSD\nIKFKPLNKTKEYTACELMNIYKTDNHLKHYLHIIENKPLYPVIYDSNGVVLSMPPIINGD\nHSRITVNTRNIFIECTGTDFTKAKIVLDIIVTMFSEYCENQFTVEAAEVVFPNGKSHTFP\nELAYRKEMVRADLINKKVGIRETPENLAKLLTRMYLKSEVIGDGNQIEIEIPPTRADIIH\nACDIVEDAAIAYGYNNIQMTLPKTYTIANQFPLNKLTELLRHDMAAAGFTEALTFALCSQ\nEDIADKLGVDISATKAVHISNPKTAEFQVARTTLLPGLLKTIAANRKMPLPLKLFEISDI\nVIKDSNTDVGAKNYRHLCAVYYNKNPGFEIIHGLLDRIMQLLDVPPGEDKGGYVIKASEG\nPAFFPGRCAEIFARGQSVGKLGVLHPDVITKFELTMPCSSLEINVGPFL\n'

After decoding it as UTF-8:
[String 2]:

>hsa:10056 K01890 phenylalanyl-tRNA synthetase beta chain [EC:6.1.1.20] | (RefSeq) FARSB, FARSLB, FRSB, HSPC173, NEDBLLA, PheHB, PheRS; phenylalanyl-tRNA synthetase subunit beta (A)
MPTVSVKRDLLFQALGRTYTDEEFDELCFEFGLELDEITSEKEIISKEQGNVKAAGASDV
VLYKIDVPANRYDLLCLEGLVRGLQVFKERIKAPVYKRVMPDGKIQKLIITEETAKIRPF
AVAAVLRNIKFTKDRYDSFIELQEKLHQNICRKRALVAIGTHDLDTLSGPFTYTAKRPSD
IKFKPLNKTKEYTACELMNIYKTDNHLKHYLHIIENKPLYPVIYDSNGVVLSMPPIINGD
HSRITVNTRNIFIECTGTDFTKAKIVLDIIVTMFSEYCENQFTVEAAEVVFPNGKSHTFP
ELAYRKEMVRADLINKKVGIRETPENLAKLLTRMYLKSEVIGDGNQIEIEIPPTRADIIH
ACDIVEDAAIAYGYNNIQMTLPKTYTIANQFPLNKLTELLRHDMAAAGFTEALTFALCSQ
EDIADKLGVDISATKAVHISNPKTAEFQVARTTLLPGLLKTIAANRKMPLPLKLFEISDI
VIKDSNTDVGAKNYRHLCAVYYNKNPGFEIIHGLLDRIMQLLDVPPGEDKGGYVIKASEG
PAFFPGRCAEIFARGQSVGKLGVLHPDVITKFELTMPCSSLEINVGPFL

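My goal is to delete everything after the first space on the first line and leave the remaining lines unchanged, so that the result looks like this:
[String 3]: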

>hsa:10056
MPTVSVKRDLLFQALGRTYTDEEFDELCFEFGLELDEITSEKEIISKEQGNVKAAGASDV
VLYKIDVPANRYDLLCLEGLVRGLQVFKERIKAPVYKRVMPDGKIQKLIITEETAKIRPF
AVAAVLRNIKFTKDRYDSFIELQEKLHQNICRKRALVAIGTHDLDTLSGPFTYTAKRPSD
IKFKPLNKTKEYTACELMNIYKTDNHLKHYLHIIENKPLYPVIYDSNGVVLSMPPIINGD
HSRITVNTRNIFIECTGTDFTKAKIVLDIIVTMFSEYCENQFTVEAAEVVFPNGKSHTFP
ELAYRKEMVRADLINKKVGIRETPENLAKLLTRMYLKSEVIGDGNQIEIEIPPTRADIIH
ACDIVEDAAIAYGYNNIQMTLPKTYTIANQFPLNKLTELLRHDMAAAGFTEALTFALCSQ
EDIADKLGVDISATKAVHISNPKTAEFQVARTTLLPGLLKTIAANRKMPLPLKLFEISDI
VIKDSNTDVGAKNYRHLCAVYYNKNPGFEIIHGLLDRIMQLLDVPPGEDKGGYVIKASEG
PAFFPGRCAEIFARGQSVGKLGVLHPDVITKFELTMPCSSLEINVGPFL

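My question is:

After fetching the text, how can I process the multi-line string directly, without saving it to a file first, and only then save the result to a file?
Writing the fetched text to a file and then processing that file seems like an unnecessary extra step.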

This is the code that fetches the text:

import urllib.request

def getHtml(url):  # fetch the page source
    request = urllib.request.Request(url)
    response = urllib.request.urlopen(request)
    return response.read().decode('utf-8')

url1 = "http://rest.kegg.jp/get/hsa:10056/aaseq"
text = getHtml(url1)

The content of 'text' is what is shown in [String 2] above.
I know I can use split to trim a single line:

>>>str1 = 'hsa:10056 K01890 phenylalanyl-tRNA synthetase beta chain [EC:6.1.1.20] | (RefSeq) FARSB, FARSLB, FRSB, HSPC173, NEDBLLA, PheHB, PheRS; phenylalanyl-tRNA synthetase subunit beta (A)'
>>>str2 = str1.split(' ')[:1]
>>>print(str2)
['hsa:10056']

The problem is that 'text' is a multi-line string and I only need to modify its first line. I don't know how to do that.

2 answers

Not sure whether this is what you want.
[image]

Since you know the line separator is \n, you should already know how to handle it: str1.split("\n")[0].split(" ")[:1]
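
Note that the expression above yields only a one-element list holding the first token of the first line; the remaining lines still have to be joined back on. A minimal sketch of the whole step, assuming the decoded text looks like [String 2], might be the following. The names shorten_fasta_header and save_text, the shortened sample sequence, and the output file name are hypothetical and used for illustration only.

```python
def shorten_fasta_header(text: str) -> str:
    """Keep only the first whitespace-delimited token of the first line."""
    # Split once at the first newline: the header is line 1,
    # 'rest' is everything after it, left untouched.
    header, newline, rest = text.partition("\n")
    # split() without arguments splits on any run of whitespace.
    tokens = header.split()
    short_header = tokens[0] if tokens else header
    return short_header + newline + rest


def save_text(text: str, path: str) -> None:
    """Write the processed string to a file in one step."""
    with open(path, "w", encoding="utf-8") as fh:
        fh.write(text)


# Stand-in for the value returned by getHtml(url1); sequence shortened here.
sample = (
    ">hsa:10056 K01890 phenylalanyl-tRNA synthetase beta chain\n"
    "MPTVSVKRDLLFQALGRTYTDEEFDELCFEFGLELDEITSEKEIISKEQGNVKAAGASDV\n"
    "PAFFPGRCAEIFARGQSVGKLGVLHPDVITKFELTMPCSSLEINVGPFL\n"
)

result = shorten_fasta_header(sample)
print(result, end="")
```

With the real response this would be used as save_text(shorten_fasta_header(text), "hsa_10056.fasta"), so the fetched text never needs to be written to an intermediate file.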
