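Environment: Windows / Python 2

We maintain some legacy .NET projects that still run on Windows, so there is a certain amount of technical debt to deal with during the build, and even during code release.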

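Jenkins

We build the .NET Framework project with Jenkins on Windows. The project contains many files with Chinese names, and because it has more than ten years of history and a complex structure, renaming them is not practical. The only option was to make the tooling cope with them.

Getting the list of changed files with git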

git diff --no-renames --name-only $commit_id_1 $commit_id_2

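Before comparing the changed files between two commits, run the following command first. Otherwise git prints non-ASCII paths as quoted octal escape sequences and the Chinese file names come out garbled.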

git config --global core.quotepath false
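
The two git commands above can be driven from a script. The following is a minimal sketch, assuming git is on the PATH and that it emits path names as UTF-8 once quoting is disabled. The helper names `parse_name_only` and `changed_files` are ours, not part of any library, and the setting is passed per invocation with `git -c` instead of changing the global config.

```python
import subprocess


def parse_name_only(output):
    """Split raw `git diff --name-only` output (bytes) into unicode paths."""
    text = output.decode('utf-8')
    # git separates entries with "\n"; drop the empty trailing entry
    return [line for line in text.split('\n') if line]


def changed_files(commit_a, commit_b, repo_dir='.'):
    """List files that differ between two commits (hypothetical helper)."""
    cmd = ['git', '-c', 'core.quotepath=false', 'diff',
           '--no-renames', '--name-only', commit_a, commit_b]
    return parse_name_only(subprocess.check_output(cmd, cwd=repo_dir))
```

Keeping the parsing separate from the subprocess call makes it possible to check the decoding step without a repository at hand.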

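Next, the Python script converts the encoding of the file names. git emits them as UTF-8, while Simplified Chinese Windows uses GBK for file names, so we re-encode them as GBK.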

filename.decode("utf-8").encode("gbk")
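
To make the two steps of that conversion explicit, here is a small sketch. The function name `utf8_to_gbk` and the sample file name are made up for illustration, and the input is assumed to be the raw UTF-8 bytes read from git.

```python
def utf8_to_gbk(raw_name):
    """Re-encode a UTF-8 byte string as GBK bytes (hypothetical helper)."""
    # decode: UTF-8 bytes -> unicode text; encode: unicode text -> GBK bytes
    return raw_name.decode('utf-8').encode('gbk')


# u'\u4e2d\u6587' is the two-character word for "Chinese"
utf8_name = u'\u4e2d\u6587.txt'.encode('utf-8')
gbk_name = utf8_to_gbk(utf8_name)
```

A name containing characters that GBK cannot represent raises UnicodeEncodeError at the encode step, which is probably preferable to silently writing a different file name.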

Python

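In addition, when handling archives with Python's zipfile module, watch out for garbled Chinese file names as well.

import zipfile

# InMemoryZip expects the archive bytes, not a path, so read the file first
with open('code.zip', 'rb') as f:
    zf = InMemoryZip(f.read())
p = InMemoryZip()
for name in zf.get_namelist():
    try:
        # Undo the cp437 decoding that zipfile on Python 3 applies to names
        # without the UTF-8 flag, then decode the raw bytes as GBK
        n = name.encode('cp437').decode('gbk')
    except (UnicodeEncodeError, UnicodeDecodeError):
        # The name was not a cp437-decoded GBK name; keep it unchanged
        n = name

    # Copy the entry's contents under the corrected name
    p.append(n, zf.read_file(name))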
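
The cp437 round trip can also be guarded by the entry's own flags. This is a sketch under the assumption of Python 3, where zipfile decodes a name as cp437 unless bit 11 (0x800) of the general-purpose flags marks it as UTF-8. `fix_zip_name` is a hypothetical helper, and GBK is assumed to be the encoding used by whoever created the archive.

```python
import zipfile


def fix_zip_name(info):
    """Return a readable name for a zipfile.ZipInfo entry."""
    # Bit 11 set: the name was stored as UTF-8 and is already correct
    if info.flag_bits & 0x800:
        return info.filename
    try:
        # Recover the raw bytes, then decode them as GBK
        return info.filename.encode('cp437').decode('gbk')
    except (UnicodeEncodeError, UnicodeDecodeError):
        # Not a cp437-decoded GBK name; leave it untouched
        return info.filename
```

It would be applied to the entries returned by `ZipFile.infolist()` rather than to the plain strings from `namelist()`.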

Working with ZIP archives in memory (supporting code)

import os
import io
import zipfile

class InMemoryZip(object):
    def __init__(self, buffer=None):
        # Create the in-memory file-like object
        self.in_memory_zip = io.BytesIO(buffer)
        if buffer:
            self.zf = zipfile.ZipFile(self.in_memory_zip, "r")
        else:
            self.zf = zipfile.ZipFile(self.in_memory_zip, "a", zipfile.ZIP_DEFLATED, False)

    def append(self, filename, file_contents):
        # Write the file to the in-memory zip
        # zf.writestr(filename_in_zip, file_contents)
        self.zf.writestr(filename, file_contents)

        # Mark the files as having been created on Windows so that
        # Unix permissions are not inferred as 0000
        for zfile in self.zf.filelist:
            zfile.create_system = 0
        return self
    
    def read_file(self, name):
        return self.zf.open(name).read()

    def extractall(self, directory):
        self.zf.extractall(directory)

    def get_namelist(self):
        return self.zf.namelist()

    def read(self):
        """Returns a string with the contents of the in-memory zip."""
        self.zf.close()
        self.in_memory_zip.seek(0)
        return self.in_memory_zip.read()
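
For comparison, the same idea can be tried with the standard library alone. The sketch below does not use the class above. It packs a dict of made-up file names into an archive held in an `io.BytesIO` buffer and reads it back. `build_zip` and `read_zip` are illustrative names.

```python
import io
import zipfile


def build_zip(files):
    """Pack a {name: bytes} mapping into ZIP bytes, entirely in memory."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, 'w', zipfile.ZIP_DEFLATED) as zf:
        for name, data in files.items():
            zf.writestr(name, data)
    # The archive is complete only after the ZipFile has been closed
    return buf.getvalue()


def read_zip(blob):
    """Unpack ZIP bytes into a {name: bytes} mapping."""
    with zipfile.ZipFile(io.BytesIO(blob), 'r') as zf:
        return {name: zf.read(name) for name in zf.namelist()}


archive = build_zip({'a.txt': b'hello', 'dir/b.txt': b'world'})
contents = read_zip(archive)
```

Nothing touches the disk here, which is the same property the `InMemoryZip` class relies on.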