Environment: Windows / Python 2
We maintain some legacy .NET projects that still run on Windows, so building them, and even releasing their code, means working around some technical debt.
Jenkins
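We build the .NET Framework project with Jenkins on Windows. The project contains many files with Chinese names. Because it has more than ten years of history and a complex structure, renaming them is impractical, so we had to find a way to stay compatible with them.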
Getting the list of changed files with git
git diff --no-renames --name-only $commit_id_1 $commit_id_2
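Before diffing two commits, run the following command first. Otherwise git prints Chinese filenames as quoted octal escape sequences instead of readable text: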
git config --global core.quotepath false
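To illustrate what that setting changes: with the default `core.quotepath=true`, git wraps a path containing non-ASCII bytes in double quotes and writes each such byte as a backslash-octal escape. The sketch below reverses that quoting, which may help when the global config cannot be changed. It is written for Python 3 and assumes the paths are UTF-8; `unquote_git_path` is a hypothetical helper name, not part of the original script.

```python
import re

# Single-character C-style escapes git may emit, mapped to their byte values
_SIMPLE = {"a": 7, "b": 8, "t": 9, "n": 10, "v": 11, "f": 12, "r": 13,
           '"': 34, "\\": 92}
# Either a three-digit octal escape or one of the simple escapes above
_ESCAPE = re.compile(r'\\([0-7]{3}|[abtnvfr"\\])')


def unquote_git_path(path):
    """Undo git's quoting of a path (hypothetical helper, assumes UTF-8)."""
    if not (len(path) >= 2 and path.startswith('"') and path.endswith('"')):
        return path  # git only quotes paths that need it
    inner = path[1:-1]
    out = bytearray()
    pos = 0
    for m in _ESCAPE.finditer(inner):
        # Copy the literal text before this escape, then the escaped byte
        out.extend(inner[pos:m.start()].encode("utf-8"))
        token = m.group(1)
        out.append(int(token, 8) if len(token) == 3 else _SIMPLE[token])
        pos = m.end()
    out.extend(inner[pos:].encode("utf-8"))
    return out.decode("utf-8")
```

For example, `unquote_git_path('"\\344\\270\\255\\346\\226\\207.txt"')` gives back the name `中文.txt`, and an unquoted path is returned unchanged.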
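Then, in the Python script, we convert the encoding of each filename. Git emits the paths as UTF-8, but since this runs on a Chinese-locale Windows machine, we re-encode them as GBK: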
filename.decode("utf-8").encode("gbk")
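The one-liner above is Python 2, where `filename` is a byte string. A minimal sketch of the same conversion, wrapped in a hypothetical helper `utf8_to_gbk` that also runs on Python 3 and accepts either bytes or text, might look like this. The `errors` parameter is an addition for illustration, since some Unicode characters have no GBK representation.

```python
def utf8_to_gbk(name, errors="strict"):
    """Re-encode a UTF-8 filename as GBK bytes (hypothetical helper).

    `name` may be UTF-8 encoded bytes (as read from git's output) or
    already-decoded text. `errors` is passed to the GBK encoder, so
    "replace" substitutes "?" for characters GBK cannot represent.
    """
    if isinstance(name, bytes):
        text = name.decode("utf-8")
    else:
        text = name
    return text.encode("gbk", errors)
```

With the default `errors="strict"`, a name that GBK cannot represent raises `UnicodeEncodeError`, which is probably preferable in a build script to silently producing a wrong path.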
Python
In addition, when handling archives with Python's zipfile module, watch out for garbled Chinese filenames there as well:
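import zipfile

# InMemoryZip (see the appendix code below) takes the archive bytes, not a path
with open('code.zip', 'rb') as f:
    zf = InMemoryZip(f.read())
p = InMemoryZip()
for name in zf.get_namelist():
    try:
        # This idiom relies on zipfile exposing names stored without the UTF-8
        # flag as cp437 text (as Python 3 does): recover the bytes, decode as GBK
        n = name.encode('cp437').decode('gbk')
    except (UnicodeEncodeError, UnicodeDecodeError):
        # The name is already proper Unicode (UTF-8 flag set); keep it as-is
        n = name
    # append() expects the member's content, so read it out first
    p.append(n, zf.read_file(name))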
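The try/except above guesses the encoding by trial. A slightly more explicit sketch, assuming Python 3, checks the ZIP entry's UTF-8 flag (general purpose bit 11) before attempting the cp437 round trip. `decode_member_name` and its `legacy_encoding` parameter are hypothetical names introduced for illustration.

```python
import zipfile

UTF8_FLAG = 0x800  # general purpose bit 11: the name is stored as UTF-8


def decode_member_name(info, legacy_encoding="gbk"):
    """Return a readable name for a zipfile.ZipInfo (hypothetical helper)."""
    if info.flag_bits & UTF8_FLAG:
        # zipfile has already decoded the name correctly
        return info.filename
    try:
        # Unflagged names were decoded as cp437; recover the raw bytes
        # and decode them with the encoding the archiver really used
        return info.filename.encode("cp437").decode(legacy_encoding)
    except (UnicodeEncodeError, UnicodeDecodeError):
        # Not valid in the legacy encoding; fall back to the name as read
        return info.filename
```

It would be applied to the entries of `ZipFile.infolist()` rather than to the plain strings from `namelist()`, since only the `ZipInfo` objects carry the flag bits.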
Working with ZIP archives in memory (appendix code)
import os
import io
import zipfile


class InMemoryZip(object):
    def __init__(self, buffer=None):
        # Create the in-memory file-like object
        self.in_memory_zip = io.BytesIO(buffer)
        if buffer:
            self.zf = zipfile.ZipFile(self.in_memory_zip, "r")
        else:
            self.zf = zipfile.ZipFile(self.in_memory_zip, "a", zipfile.ZIP_DEFLATED, False)

    def append(self, filename, file_contents):
        # Write the file to the in-memory zip
        # zf.writestr(filename_in_zip, file_contents)
        self.zf.writestr(filename, file_contents)
        # Mark the files as having been created on Windows so that
        # Unix permissions are not inferred as 0000
        for zfile in self.zf.filelist:
            zfile.create_system = 0
        return self

    def read_file(self, name):
        return self.zf.open(name).read()

    def extractall(self, directory):
        self.zf.extractall(directory)

    def get_namelist(self):
        return self.zf.namelist()

    def read(self):
        """Returns a string with the contents of the in-memory zip."""
        self.zf.close()
        self.in_memory_zip.seek(0)
        return self.in_memory_zip.read()
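As a rough illustration of the technique the class wraps, the sketch below does an in-memory round trip using only `io.BytesIO` and `zipfile`. It assumes Python 3; `build_zip` and `read_zip` are hypothetical helper names, not part of the original code.

```python
import io
import zipfile


def build_zip(files):
    """Pack a dict of {name: bytes} into a ZIP archive and return its bytes."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        for name, data in files.items():
            zf.writestr(name, data)
    # Closing the ZipFile writes the central directory but leaves buf open
    return buf.getvalue()


def read_zip(blob):
    """Unpack ZIP archive bytes into a dict of {name: bytes}."""
    with zipfile.ZipFile(io.BytesIO(blob), "r") as zf:
        return {name: zf.read(name) for name in zf.namelist()}
```

On Python 3, a non-ASCII name written this way is stored as UTF-8 with the flag set, so it reads back unchanged. The cp437/GBK workaround is only needed for archives produced by tools that store names in the local code page.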